Unverified commit 803dab78 authored by pkpk, committed by GitHub

test=develop (#4389)

Parent 9e12ab90
Subproject commit 5426f75073cf5bd416622dbe71b146d3dc8fffb6
Subproject commit 30b892e3c029bff706337f269e6c158b0a223f60
......@@ -10,7 +10,7 @@
- **Rich and comprehensive NLP task support:**
  - PaddleNLP offers multi-granularity, multi-scenario application support, covering fundamental NLP techniques such as [word segmentation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), [part-of-speech tagging](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), and [named entity recognition](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), as well as core NLP techniques such as [text classification](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/sentiment_classification), [text similarity](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/similarity_net), [semantic representation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK), and [text generation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleTextGEN). PaddleNLP also provides the dedicated core techniques, tool components, models, and pretrained parameters for common large-scale NLP application systems such as [reading comprehension](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMRC), [dialogue systems](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleDialogue), and [machine translation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMT), clearing your way through the NLP landscape.
  - PaddleNLP offers multi-granularity, multi-scenario application support, covering fundamental NLP techniques such as [word segmentation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), [part-of-speech tagging](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), and [named entity recognition](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), as well as core NLP techniques such as [text classification](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/sentiment_classification), [text similarity](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/similarity_net), [semantic representation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/pretrain_langauge_models), and [text generation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/seq2seq). PaddleNLP also provides the dedicated core techniques, tool components, models, and pretrained parameters for common large-scale NLP application systems such as [reading comprehension](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/machine_reading_comprehension), [dialogue systems](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/dialogue_system), and [machine translation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/machine_translation), clearing your way through the NLP landscape.
- **Stable, reliable NLP models and powerful pretrained parameters:**
......@@ -55,11 +55,11 @@ cd models/PaddleNLP/sentiment_classification
| **Language model** | [Language_model](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/language_model) | A classic neural language model based on recurrent neural networks (RNN). |
| **Sentiment classification**:fire: | [Senta](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/sentiment_classification), [EmotionDetection](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/emotion_detection) | Senta (Sentiment Classification) and EmotionDetection provide sentiment analysis models for *general scenarios* and *human-machine dialogue scenarios*, respectively. |
| **Text similarity**:fire: | [SimNet](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/similarity_net) | SimNet (Similarity Net) offers efficient and reliable text similarity tools and pretrained models. |
| **Semantic representation**:fire: | [PaddleLARK](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK) | PaddleLARK (Paddle LAnguage Representation toolKit) integrates popular Chinese and English pretrained models such as ELMo, BERT, ERNIE 1.0, ERNIE 2.0, and XLNet. |
| **Text generation** | [PaddleTextGEN](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleTextGEN) | Paddle Text Generation provides a series of classic text generation models, such as vanilla seq2seq, seq2seq with attention, and variational seq2seq. |
| **Reading comprehension** | [PaddleMRC](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMRC) | PaddleMRC (Paddle Machine Reading Comprehension) collects Baidu's models, tools, and open-source datasets for reading comprehension, including DuReader (Baidu's open-source, large-scale Chinese reading comprehension dataset built from real search behavior), KT-Net (a knowledge-enhanced reading comprehension model that ranked first on SQuAD and ReCoRD), and D-Net (a pretrain-finetune framework that took first place in the EMNLP 2019 MRQA shared task). |
| **Dialogue systems** | [PaddleDialogue](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleDialogue) | Includes: 1) DGU (Dialogue General Understanding), a general dialogue understanding model covering common dialogue tasks such as context-response matching for **retrieval-based chatbots** and **intent detection**, **slot filling**, and **dialogue state tracking** for **task-oriented dialogue systems**, with best results on 6 public international datasets.<br/> 2) knowledge-driven dialogue: Baidu's open-source knowledge-grounded open-domain dialogue dataset, published at ACL 2019.<br/>3) ADEM (Auto Dialogue Evaluation Model): an automatic dialogue evaluation model for scoring the response quality of different dialogue generation models. |
| **Machine translation** | [PaddleMT](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMT) | Paddle Machine Translation, a classic Transformer-based machine translation model. |
| **Semantic representation**:fire: | [pretrain_langauge_models](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/pretrain_langauge_models) | Integrates popular Chinese and English pretrained models such as ELMo, BERT, ERNIE 1.0, ERNIE 2.0, and XLNet. |
| **Text generation** | [seq2seq](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/seq2seq) | seq2seq provides a series of classic text generation models, such as vanilla seq2seq, seq2seq with attention, and variational seq2seq. |
| **Reading comprehension** | [machine_reading_comprehension](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/machine_reading_comprehension) | Paddle Machine Reading Comprehension collects Baidu's models, tools, and open-source datasets for reading comprehension, including DuReader (Baidu's open-source, large-scale Chinese reading comprehension dataset built from real search behavior), KT-Net (a knowledge-enhanced reading comprehension model that ranked first on SQuAD and ReCoRD), and D-Net (a pretrain-finetune framework that took first place in the EMNLP 2019 MRQA shared task). |
| **Dialogue systems** | [dialogue_system](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/dialogue_system) | Includes: 1) DGU (Dialogue General Understanding), a general dialogue understanding model covering common dialogue tasks such as context-response matching for **retrieval-based chatbots** and **intent detection**, **slot filling**, and **dialogue state tracking** for **task-oriented dialogue systems**, with best results on 6 public international datasets.<br/> 2) knowledge-driven dialogue: Baidu's open-source knowledge-grounded open-domain dialogue dataset, published at ACL 2019.<br/>3) ADEM (Auto Dialogue Evaluation Model): an automatic dialogue evaluation model for scoring the response quality of different dialogue generation models. |
| **Machine translation** | [machine_translation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/machine_translation) | Paddle Machine Translation, a classic Transformer-based machine translation model. |
| **Other frontier work** | [Research](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/Research) | Open-source releases of Baidu's latest research work. |
......@@ -70,13 +70,13 @@ cd models/PaddleNLP/sentiment_classification
```text
.
├── Research                        # Baidu NLP research work
├── PaddleMT                        # machine translation code, data, pretrained models
├── PaddleDialogue                  # dialogue system code, data, pretrained models
├── PaddleMRC                       # reading comprehension code, data, pretrained models
├── PaddleLARK                      # language representation toolkit
├── machine_translation             # machine translation code, data, pretrained models
├── dialogue_system                 # dialogue system code, data, pretrained models
├── machine_reading_comprehension   # reading comprehension code, data, pretrained models
├── pretrain_langauge_models        # language representation toolkit
├── language_model                  # language model
├── lexical_analysis                # LAC lexical analysis
├── models                          # shared networks
├── shared_modules/models           # shared networks
│ ├── __init__.py
│ ├── classification
│ ├── dialogue_model_toolkit
......@@ -87,7 +87,7 @@ cd models/PaddleNLP/sentiment_classification
│ ├── representation
│ ├── sequence_labeling
│ └── transformer_encoder.py
├── preprocess                      # shared text preprocessing tools
├── shared_modules/preprocess       # shared text preprocessing tools
│ ├── __init__.py
│ ├── ernie
│ ├── padding.py
......
......@@ -16,7 +16,6 @@
# limitations under the License.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
......@@ -40,43 +39,55 @@ import math
np.random.seed(0)
random.seed(0)
parser = argparse.ArgumentParser(__doc__)
DEV_COUNT = 1
model_g = ArgumentGroup(parser, "model", "model configuration and paths.")
model_g.add_arg("init_checkpoint", str, None, "Init checkpoint to resume training from.")
model_g.add_arg("checkpoints", str, "./checkpoints", "Path to save checkpoints.")
model_g.add_arg("init_checkpoint", str, None,
"Init checkpoint to resume training from.")
model_g.add_arg("checkpoints", str, "./checkpoints",
"Path to save checkpoints.")
model_g.add_arg("config_path", str, "./data/input/model.conf", "Model conf.")
model_g.add_arg("build_dict", bool, False, "Build dict.")
train_g = ArgumentGroup(parser, "training", "training options.")
train_g.add_arg("cpu_num", int, 3, "Number of Threads.")
train_g.add_arg("epoch", int, 100, "Number of epoches for training.")
train_g.add_arg("learning_rate", float, 0.1, "Learning rate used to train with warmup.")
train_g.add_arg("save_steps", int, 1000, "The steps interval to save checkpoints.")
train_g.add_arg("validation_steps", int, 100, "The steps interval to evaluate model performance.")
train_g.add_arg("learning_rate", float, 0.1,
"Learning rate used to train with warmup.")
train_g.add_arg("save_steps", int, 1000,
"The steps interval to save checkpoints.")
train_g.add_arg("validation_steps", int, 100,
"The steps interval to evaluate model performance.")
train_g.add_arg("random_seed", int, 7, "random seed")
train_g.add_arg("threshold", float, 0.1, "When the confidence exceeds the threshold, the corresponding label is given.")
train_g.add_arg(
"threshold", float, 0.1,
"When the confidence exceeds the threshold, the corresponding label is given."
)
log_g = ArgumentGroup(parser, "logging", "logging related.")
log_g.add_arg("skip_steps", int, 10, "The steps interval to print loss.")
data_g = ArgumentGroup(parser, "data", "Data paths, vocab paths and data processing options")
data_g = ArgumentGroup(parser, "data",
"Data paths, vocab paths and data processing options")
data_g.add_arg("data_dir", str, "./data/input/", "Path to training data.")
data_g.add_arg("save_dir", str, "./data/output/", "Path to save.")
data_g.add_arg("max_seq_len", int, 50, "Tokens' number of the longest seqence allowed.")
data_g.add_arg("batch_size", int, 64, "The total number of examples in one batch for training.")
data_g.add_arg("max_seq_len", int, 50,
"Tokens' number of the longest seqence allowed.")
data_g.add_arg("batch_size", int, 64,
"The total number of examples in one batch for training.")
run_type_g = ArgumentGroup(parser, "run_type", "running type options.")
run_type_g.add_arg("use_cuda", bool, False, "If set, use GPU for training.")
# run_type_g.add_arg("use_fast_executor", bool, False, "If set, use fast parallel executor (in experiment).")
run_type_g.add_arg("do_train", bool, True, "Whether to perform evaluation on test data set.")
run_type_g.add_arg("do_eval", bool, True, "Whether to perform evaluation on test data set.")
run_type_g.add_arg("do_test", bool, True, "Whether to perform evaluation on test data set.")
run_type_g.add_arg("do_train", bool, True,
"Whether to perform evaluation on test data set.")
run_type_g.add_arg("do_eval", bool, True,
"Whether to perform evaluation on test data set.")
run_type_g.add_arg("do_test", bool, True,
"Whether to perform evaluation on test data set.")
args = parser.parse_args()
def get_score(pred_result, label, eval_phase):
"""[get precision recall and f-score]
......@@ -139,7 +150,7 @@ def train(args, train_exe, build_res, place):
pred_label = build_res["pred_label"]
label = build_res["label"]
fetch_list = [cost.name, prediction.name, pred_label.name, label.name]
train_pyreader = build_res["train_pyreader"]
train_data_loader = build_res["train_data_loader"]
train_prog = build_res["train_prog"]
steps = 0
time_begin = time.time()
......@@ -147,22 +158,24 @@ def train(args, train_exe, build_res, place):
logger.info("Begin training")
for i in range(args.epoch):
try:
for data in train_pyreader():
for data in train_data_loader():
avg_cost_np, avg_pred_np, pred_label, label = train_exe.run(feed=data, program=compiled_prog, \
fetch_list=fetch_list)
steps += 1
if steps % int(args.skip_steps) == 0:
time_end = time.time()
used_time = time_end - time_begin
get_score(pred_label, label, eval_phase = "Train")
get_score(pred_label, label, eval_phase="Train")
logger.info('loss is {}'.format(avg_cost_np))
logger.info("epoch: %d, step: %d, speed: %f steps/s" % (i, steps, args.skip_steps / used_time))
logger.info("epoch: %d, step: %d, speed: %f steps/s" %
(i, steps, args.skip_steps / used_time))
time_begin = time.time()
if steps % args.save_steps == 0:
save_path = os.path.join(args.checkpoints,
"step_" + str(steps))
fluid.io.save_persistables(train_exe, save_path, train_prog)
logger.info("[save]step %d : save at %s" % (steps, save_path))
fluid.io.save(train_prog, save_path)
logger.info("[save]step %d : save at %s" %
(steps, save_path))
if steps % args.validation_steps == 0:
if args.do_eval:
evaluate(args, test_exe, build_res, "eval")
......@@ -173,11 +186,16 @@ def train(args, train_exe, build_res, place):
logger.error("Train error : %s" % str(e))
exit(1)
save_path = os.path.join(args.checkpoints, "step_" + str(steps))
fluid.io.save_persistables(train_exe, save_path, train_prog)
fluid.io.save(train_prog, save_path)
logger.info("[save]step %d : save at %s" % (steps, save_path))
def evaluate(args, test_exe, build_res, eval_phase, save_result=False, id2intent=None):
def evaluate(args,
test_exe,
build_res,
eval_phase,
save_result=False,
id2intent=None):
"""[evaluate on dev/test dataset]
Arguments:
......@@ -203,14 +221,14 @@ def evaluate(args, test_exe, build_res, eval_phase, save_result=False, id2intent
total_cost, total_acc, pred_prob_list, pred_label_list, label_list = [], [], [], [], []
if eval_phase == "eval":
test_prog = build_res["eval_compiled_prog"]
test_pyreader = build_res["eval_pyreader"]
test_data_loader = build_res["eval_data_loader"]
elif eval_phase == "test":
test_prog = build_res["test_compiled_prog"]
test_pyreader = build_res["test_pyreader"]
test_data_loader = build_res["test_data_loader"]
else:
exit(1)
logger.info("-----------------------------------------------------------")
for data in test_pyreader():
for data in test_data_loader():
avg_cost_np, avg_pred_np, pred_label, label= test_exe.run(program=test_prog, fetch_list=fetch_list, feed=data, \
return_numpy=True)
total_cost.append(avg_cost_np)
......@@ -219,13 +237,18 @@ def evaluate(args, test_exe, build_res, eval_phase, save_result=False, id2intent
label_list.extend(label)
if save_result:
logger.info("save result at : %s" % args.save_dir + "/" + eval_phase + ".rst")
logger.info("save result at : %s" % args.save_dir + "/" + eval_phase +
".rst")
save_dir = args.save_dir
if not os.path.exists(save_dir):
logger.warning("save dir not exists, and create it")
os.makedirs(save_dir)
fin = codecs.open(os.path.join(args.data_dir, eval_phase + ".txt"), "r", encoding="utf8")
fout = codecs.open(args.save_dir + "/" + eval_phase + ".rst", "w", encoding="utf8")
fin = codecs.open(
os.path.join(args.data_dir, eval_phase + ".txt"),
"r",
encoding="utf8")
fout = codecs.open(
args.save_dir + "/" + eval_phase + ".rst", "w", encoding="utf8")
for line in pred_prob_list:
query = fin.readline().rsplit("\t", 1)[0]
res = []
......@@ -245,9 +268,14 @@ def evaluate(args, test_exe, build_res, eval_phase, save_result=False, id2intent
logger.info("-----------------------------------------------------------")
def create_net(args, flow_data, class_dim, dict_dim, place, model_name="textcnn_net", is_infer=False):
"""[create network and pyreader]
def create_net(args,
flow_data,
class_dim,
dict_dim,
place,
model_name="textcnn_net",
is_infer=False):
"""[create network and loader]
Arguments:
flow_data {[type]} -- [description]
......@@ -266,11 +294,23 @@ def create_net(args, flow_data, class_dim, dict_dim, place, model_name="textcnn_
model = textcnn_net_multi_label
else:
return
char_list = fluid.data(name="char", shape=[None, args.max_seq_len, 1], dtype="int64", lod_level=0)
label = fluid.data(name="label", shape=[None, class_dim], dtype="float32", lod_level=0) # label data
reader = fluid.io.PyReader(feed_list=[char_list, label], capacity=args.batch_size * 10, iterable=True, \
char_list = fluid.data(
name="char",
shape=[None, args.max_seq_len, 1],
dtype="int64",
lod_level=0)
label = fluid.data(
name="label", shape=[None, class_dim], dtype="float32",
lod_level=0) # label data
data_loader = fluid.io.DataLoader.from_generator(
feed_list=[char_list, label],
capacity=args.batch_size * 10,
iterable=True,
return_list=False)
output = model(char_list, label, dict_dim,
output = model(
char_list,
label,
dict_dim,
emb_dim=flow_data["model"]["emb_dim"],
hid_dim=flow_data["model"]["hid_dim"],
hid_dim2=flow_data["model"]["hid_dim2"],
......@@ -281,14 +321,15 @@ def create_net(args, flow_data, class_dim, dict_dim, place, model_name="textcnn_
max_seq_len=args.max_seq_len)
if is_infer:
prediction = output
return [reader, prediction]
return [data_loader, prediction]
else:
avg_cost, prediction, pred_label, label = output[0], output[1], output[2], output[3]
return [reader, avg_cost, prediction, pred_label, label]
avg_cost, prediction, pred_label, label = output[0], output[1], output[
2], output[3]
return [data_loader, avg_cost, prediction, pred_label, label]
def build_data_reader(args, char_dict, intent_dict):
"""[decorate samples for pyreader]
def build_data_loader(args, char_dict, intent_dict):
"""[decorate samples for dataloader]
Arguments:
args {[type]} -- [description]
......@@ -298,20 +339,22 @@ def build_data_reader(args, char_dict, intent_dict):
Returns:
[type] -- [description]
"""
reader_res = {}
loader_res = {}
if args.do_train:
train_processor = DataReader(char_dict, intent_dict, args.max_seq_len)
train_data_generator = train_processor.prepare_data(
data_path=args.data_dir + "train.txt",
batch_size=args.batch_size,
mode='train')
reader_res["train_data_generator"] = train_data_generator
loader_res["train_data_generator"] = train_data_generator
num_train_examples = train_processor._get_num_examples()
logger.info("Num train examples: %d" % num_train_examples)
logger.info("Num train steps: %d" % (math.ceil(num_train_examples * 1.0 / args.batch_size) * \
args.epoch // DEV_COUNT))
if math.ceil(num_train_examples * 1.0 / args.batch_size) // DEV_COUNT <= 0:
logger.error("Num of train steps is less than 0 or equals to 0, exit")
if math.ceil(num_train_examples * 1.0 /
args.batch_size) // DEV_COUNT <= 0:
logger.error(
"Num of train steps is less than or equal to 0, exit")
exit(1)
if args.do_eval:
eval_processor = DataReader(char_dict, intent_dict, args.max_seq_len)
......@@ -319,7 +362,7 @@ def build_data_reader(args, char_dict, intent_dict):
data_path=args.data_dir + "eval.txt",
batch_size=args.batch_size,
mode='eval')
reader_res["eval_data_generator"] = eval_data_generator
loader_res["eval_data_generator"] = eval_data_generator
num_eval_examples = eval_processor._get_num_examples()
logger.info("Num eval examples: %d" % num_eval_examples)
if args.do_test:
......@@ -328,11 +371,12 @@ def build_data_reader(args, char_dict, intent_dict):
data_path=args.data_dir + "test.txt",
batch_size=args.batch_size,
mode='test')
reader_res["test_data_generator"] = test_data_generator
return reader_res
loader_res["test_data_generator"] = test_data_generator
return loader_res
def build_graph(args, model_config, num_labels, dict_dim, place, test_place, reader_res):
def build_graph(args, model_config, num_labels, dict_dim, place, test_place,
loader_res):
"""[build paddle graph]
Arguments:
......@@ -341,7 +385,7 @@ def build_graph(args, model_config, num_labels, dict_dim, place, test_place, rea
num_labels {[type]} -- [description]
dict_dim {[type]} -- [description]
place {[type]} -- [description]
reader_res {[type]} -- [description]
loader_res {[type]} -- [description]
Returns:
[type] -- [description]
......@@ -358,36 +402,42 @@ def build_graph(args, model_config, num_labels, dict_dim, place, test_place, rea
if args.do_train:
with fluid.program_guard(train_prog, startup_prog):
with fluid.unique_name.guard():
train_pyreader, cost, prediction, pred_label, label = create_net(args, model_config, num_labels, \
train_data_loader, cost, prediction, pred_label, label = create_net(args, model_config, num_labels, \
dict_dim, place, model_name="textcnn_net")
train_pyreader.decorate_sample_list_generator(reader_res['train_data_generator'], places=place)
res["train_pyreader"] = train_pyreader
sgd_optimizer = fluid.optimizer.SGD(learning_rate=fluid.layers.exponential_decay(
learning_rate=args.learning_rate, decay_steps=1000, decay_rate=0.5, staircase=True))
train_data_loader.set_sample_list_generator(
loader_res['train_data_generator'], places=place)
res["train_data_loader"] = train_data_loader
sgd_optimizer = fluid.optimizer.SGD(
learning_rate=fluid.layers.exponential_decay(
learning_rate=args.learning_rate,
decay_steps=1000,
decay_rate=0.5,
staircase=True))
sgd_optimizer.minimize(cost)
if args.do_eval:
with fluid.program_guard(eval_prog, startup_prog):
with fluid.unique_name.guard():
eval_pyreader, cost, prediction, pred_label, label = create_net(args, model_config, num_labels, \
eval_data_loader, cost, prediction, pred_label, label = create_net(args, model_config, num_labels, \
dict_dim, test_place, model_name="textcnn_net")
eval_pyreader.decorate_sample_list_generator(reader_res['eval_data_generator'], places=test_place)
res["eval_pyreader"] = eval_pyreader
eval_data_loader.set_sample_list_generator(
loader_res['eval_data_generator'], places=test_place)
res["eval_data_loader"] = eval_data_loader
if args.do_test:
with fluid.program_guard(test_prog, startup_prog):
with fluid.unique_name.guard():
test_pyreader, cost, prediction, pred_label, label = create_net(args, model_config, num_labels, \
test_data_loader, cost, prediction, pred_label, label = create_net(args, model_config, num_labels, \
dict_dim, test_place, model_name="textcnn_net")
test_pyreader.decorate_sample_list_generator(reader_res['test_data_generator'], places=test_place)
res["test_pyreader"] = test_pyreader
test_data_loader.set_sample_list_generator(
loader_res['test_data_generator'], places=test_place)
res["test_data_loader"] = test_data_loader
res["cost"] = cost
res["prediction"] = prediction
res["label"] = label
res["pred_label"] = pred_label
res["train_prog"] =train_prog
res["train_prog"] = train_prog
res["eval_prog"] = eval_prog
res["test_prog"] = test_prog
return res
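A side note on the optimizer wired up in `build_graph` above: plain SGD is paired with a staircase exponential-decay schedule. A minimal pure-Python sketch of the learning rate it yields (the helper name below is illustrative, not from this repo):

```python
import math

def staircase_exponential_decay(base_lr, step, decay_steps=1000, decay_rate=0.5):
    # what fluid.layers.exponential_decay(..., staircase=True) computes:
    # the rate is multiplied by decay_rate once every decay_steps steps
    return base_lr * decay_rate ** math.floor(step / decay_steps)

print(staircase_exponential_decay(0.1, 999))   # 0.1
print(staircase_exponential_decay(0.1, 1000))  # 0.05
print(staircase_exponential_decay(0.1, 2500))  # 0.025
```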
......@@ -421,8 +471,9 @@ def main(args):
id2intent[int(value)] = key
num_labels = len(intent_dict)
# build model
reader_res = build_data_reader(args, char_dict, intent_dict)
build_res = build_graph(args, model_config, num_labels, dict_dim, place, test_place, reader_res)
loader_res = build_data_loader(args, char_dict, intent_dict)
build_res = build_graph(args, model_config, num_labels, dict_dim, place,
test_place, loader_res)
build_res["place"] = place
build_res["test_place"] = test_place
if not (args.do_train or args.do_eval or args.do_test):
......@@ -432,11 +483,13 @@ def main(args):
exe.run(startup_prog)
if args.init_checkpoint and args.init_checkpoint != "None":
try:
init_checkpoint(exe, args.init_checkpoint, main_program=startup_prog)
init_checkpoint(
exe, args.init_checkpoint, main_program=startup_prog)
logger.info("Load model from %s" % args.init_checkpoint)
except Exception as e:
logger.exception(str(e))
logger.error("Faild load model from %s [%s]" % (args.init_checkpoint, str(e)))
logger.error("Faild load model from %s [%s]" %
(args.init_checkpoint, str(e)))
build_strategy = fluid.compiler.BuildStrategy()
build_strategy.fuse_all_reduce_ops = False
exec_strategy = fluid.ExecutionStrategy()
......@@ -449,10 +502,12 @@ def main(args):
exec_strategy=exec_strategy)
build_res["compiled_prog"] = compiled_prog
if args.do_test:
test_compiled_prog = fluid.compiler.CompiledProgram(build_res["test_prog"])
test_compiled_prog = fluid.compiler.CompiledProgram(build_res[
"test_prog"])
build_res["test_compiled_prog"] = test_compiled_prog
if args.do_eval:
eval_compiled_prog = fluid.compiler.CompiledProgram(build_res["eval_prog"])
eval_compiled_prog = fluid.compiler.CompiledProgram(build_res[
"eval_prog"])
build_res["eval_compiled_prog"] = eval_compiled_prog
if args.do_train:
......@@ -465,7 +520,6 @@ def main(args):
save_result=True, id2intent=id2intent)
if __name__ == "__main__":
logger.info("the paddle version is %s" % paddle.__version__)
check_version('1.6.0')
......
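The recurring change in this file is mechanical: every `fluid.io.PyReader` becomes `fluid.io.DataLoader.from_generator`, `decorate_sample_list_generator` becomes `set_sample_list_generator`, and the iteration loop stays the same. A self-contained sketch of the new pattern, assuming Paddle 1.6; the toy generator and shapes are illustrative, not taken from this repo:

```python
import numpy as np
import paddle.fluid as fluid

def sample_generator():
    # stands in for DataReader.prepare_data(...); yields lists of samples
    for _ in range(4):
        yield [(np.ones([50, 1], dtype="int64"),
                np.zeros([10], dtype="float32")) for _ in range(2)]

main_prog, startup_prog = fluid.Program(), fluid.Program()
with fluid.program_guard(main_prog, startup_prog):
    char_list = fluid.data(name="char", shape=[None, 50, 1], dtype="int64")
    label = fluid.data(name="label", shape=[None, 10], dtype="float32")
    # replaces fluid.io.PyReader(...): same feed_list/capacity/iterable
    # arguments, plus return_list=False to keep the feed-dict style
    loader = fluid.io.DataLoader.from_generator(
        feed_list=[char_list, label], capacity=16, iterable=True,
        return_list=False)
    mean_char = fluid.layers.reduce_mean(
        fluid.layers.cast(char_list, "float32"))

places = fluid.cpu_places(1)
loader.set_sample_list_generator(sample_generator, places=places)
exe = fluid.Executor(places[0])
exe.run(startup_prog)
for data in loader():  # same loop shape as `for data in train_data_loader():`
    exe.run(main_prog, feed=data, fetch_list=[mean_char.name])
```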
......@@ -32,7 +32,6 @@ try:
except ImportError:
import ConfigParser as cp
random_seed = 7
logger = logging.getLogger()
format = "%(asctime)s - %(name)s - %(levelname)s -%(filename)s-%(lineno)4d -%(message)s"
......@@ -77,6 +76,7 @@ class ArgumentGroup(object):
Arguments:
object {[type]} -- [description]
"""
def __init__(self, parser, title, des):
self._group = parser.add_argument_group(title=title, description=des)
......@@ -107,6 +107,7 @@ class DataReader(object):
Returns:
[type] -- [description]
"""
def __init__(self, char_vocab, intent_dict, max_len):
self._char_vocab = char_vocab
self._intent_dict = intent_dict
......@@ -128,12 +129,17 @@ class DataReader(object):
# word_dict_path), "The given word dictionary does not exist."
assert os.path.exists(data_path), "The given data file does not exist."
if mode == "train":
train_reader = fluid.io.batch(paddle.reader.shuffle(self.data_reader(data_path, self.max_len, shuffle=True),
buf_size=batch_size * 100), batch_size)
train_reader = fluid.io.batch(
paddle.reader.shuffle(
self.data_reader(
data_path, self.max_len, shuffle=True),
buf_size=batch_size * 100),
batch_size)
# train_reader = fluid.io.batch(self.data_reader(data_path), batch_size)
return train_reader
else:
test_reader = fluid.io.batch(self.data_reader(data_path, self.max_len), batch_size)
test_reader = fluid.io.batch(
self.data_reader(data_path, self.max_len), batch_size)
return test_reader
def data_reader(self, file_path, max_len, shuffle=False):
......@@ -150,7 +156,8 @@ class DataReader(object):
char_id_list = list(map(lambda x: 0 if x not in self._char_vocab else int(self._char_vocab[x]), \
list(query)))
if len(char_id_list) < max_len:
char_id_list.extend([self.padding_id] * (max_len - len(char_id_list)))
char_id_list.extend([self.padding_id] *
(max_len - len(char_id_list)))
char_id_list = char_id_list[:max_len]
intent_id_list = [self.padding_id] * self.intent_size
for item in intent.split('\2'):
......@@ -159,6 +166,7 @@ class DataReader(object):
if shuffle:
random.seed(random_seed)
random.shuffle(self.all_data)
def reader():
"""
reader
......@@ -166,6 +174,7 @@ class DataReader(object):
for char_id_list, intent_id_list in self.all_data:
# print char_id_list, intent_id
yield char_id_list, intent_id_list
return reader
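For context, `prepare_data` above wraps this sample-level reader in `paddle.reader.shuffle` (a moving window of `buf_size` samples) before batching with `fluid.io.batch`. A minimal sketch of that decoration with a toy reader:

```python
import paddle
import paddle.fluid as fluid

def toy_reader():
    # stands in for the reader() closure returned by data_reader
    for i in range(10):
        yield [i, i + 1], [i % 2]

# shuffle inside a buffer of 8 samples, then group into batches of 4
train_reader = fluid.io.batch(
    paddle.reader.shuffle(toy_reader, buf_size=8), batch_size=4)
for batch in train_reader():
    print(batch)  # a list of (char_id_list, intent_id_list) samples
```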
......@@ -178,6 +187,7 @@ class DataProcesser(object):
Returns:
[type] -- [description]
"""
@staticmethod
def read_dict(filename):
"""
......@@ -227,7 +237,8 @@ class DataProcesser(object):
intent_dict[intent] = 0
intent_dict[intent] += 1
# save char dict
with codecs.open("%s/char.dict" % save_dir, "w", encoding="utf8") as f_out:
with codecs.open(
"%s/char.dict" % save_dir, "w", encoding="utf8") as f_out:
f_out.write("PAD\0020\n")
f_out.write("OOV\0021\n")
char_id = 2
......@@ -238,7 +249,8 @@ class DataProcesser(object):
f_out.write("%s\002%d\n" % (key, char_id))
char_id += 1
# save intent dict
with codecs.open("%s/domain.dict" % save_dir, "w", encoding="utf8") as f_out:
with codecs.open(
"%s/domain.dict" % save_dir, "w", encoding="utf8") as f_out:
f_out.write("SYS_OTHER\0020\n")
intent_id = 1
for key, value in intent_dict.items():
......@@ -249,7 +261,6 @@ class DataProcesser(object):
intent_id += 1
class ConfigReader(object):
"""[read model config file]
......@@ -282,49 +293,13 @@ class ConfigReader(object):
return flow_data
def init_pretraining_params(exe,
pretraining_params_path,
main_program,
use_fp16=False):
"""load params of pretrained model, NOT including moment, learning_rate"""
assert os.path.exists(pretraining_params_path
), "[%s] cann't be found." % pretraining_params_path
def _existed_params(var):
if not isinstance(var, fluid.framework.Parameter):
return False
return os.path.exists(os.path.join(pretraining_params_path, var.name))
fluid.io.load_vars(
exe,
pretraining_params_path,
main_program=main_program,
predicate=_existed_params)
print("Load pretraining parameters from {}.".format(
pretraining_params_path))
def init_checkpoint(exe, init_checkpoint_path, main_program):
"""
Init CheckPoint
"""
assert os.path.exists(
init_checkpoint_path), "[%s] cann't be found." % init_checkpoint_path
def existed_persitables(var):
"""
Whether the persistable variable exists
"""
if not fluid.io.is_persistable(var):
return False
return os.path.exists(os.path.join(init_checkpoint_path, var.name))
fluid.load(main_program, init_checkpoint_path, exe)
print("Load model from {}".format(init_checkpoint_path))
fluid.io.load_vars(
exe,
init_checkpoint_path,
main_program=main_program,
predicate=existed_persitables)
print ("Load model from {}".format(init_checkpoint_path))
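The removed helpers restored checkpoints with `fluid.io.load_vars` plus an existence predicate; the new `init_checkpoint` defers to the unified `fluid.load`, which pairs with the `fluid.io.save` calls introduced in train.py. A minimal sketch of the pair, assuming Paddle 1.6 (network and paths are placeholders):

```python
import paddle.fluid as fluid

main_prog, startup_prog = fluid.Program(), fluid.Program()
with fluid.program_guard(main_prog, startup_prog):
    x = fluid.data(name="x", shape=[None, 4], dtype="float32")
    y = fluid.layers.fc(input=x, size=2)

exe = fluid.Executor(fluid.CPUPlace())
exe.run(startup_prog)

# before this commit: fluid.io.save_persistables(exe, path, main_prog) to
# save, fluid.io.load_vars(..., predicate=existed_persitables) to restore
fluid.io.save(main_prog, "./checkpoints/step_0")    # writes step_0.pdparams, ...
fluid.load(main_prog, "./checkpoints/step_0", exe)  # one-call restore
```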
def print_arguments(args):
"""
......@@ -350,5 +325,3 @@ def check_version(version='1.6.0'):
except Exception as e:
logger.error(err)
sys.exit(1)
......@@ -21,8 +21,10 @@ from kpi import DurationKpi
train_loss_card1 = CostKpi('train_loss_card1', 0.03, 0, actived=True)
train_loss_card4 = CostKpi('train_loss_card4', 0.03, 0, actived=True)
train_duration_card1 = DurationKpi('train_duration_card1', 0.01, 0, actived=True)
train_duration_card4 = DurationKpi('train_duration_card4', 0.01, 0, actived=True)
train_duration_card1 = DurationKpi(
'train_duration_card1', 0.01, 0, actived=True)
train_duration_card4 = DurationKpi(
'train_duration_card4', 0.01, 0, actived=True)
tracking_kpis = [
train_loss_card1,
......
......@@ -20,22 +20,25 @@ import sys
import io
import os
URLLIB=urllib
URLLIB = urllib
if sys.version_info >= (3, 0):
import urllib.request
URLLIB=urllib.request
URLLIB = urllib.request
DATA_MODEL_PATH = {"DATA_PATH": "https://baidu-nlp.bj.bcebos.com/auto_dialogue_evaluation_dataset-1.0.0.tar.gz",
"TRAINED_MODEL": "https://baidu-nlp.bj.bcebos.com/auto_dialogue_evaluation_models.2.0.0.tar.gz"}
DATA_MODEL_PATH = {
"DATA_PATH":
"https://baidu-nlp.bj.bcebos.com/auto_dialogue_evaluation_dataset-1.0.0.tar.gz",
"TRAINED_MODEL":
"https://baidu-nlp.bj.bcebos.com/auto_dialogue_evaluation_models.2.0.0.tar.gz"
}
PATH_MAP = {'DATA_PATH': "./data/input",
'TRAINED_MODEL': './data/saved_models'}
PATH_MAP = {'DATA_PATH': "./data/input", 'TRAINED_MODEL': './data/saved_models'}
def un_tar(tar_name, dir_name):
try:
t = tarfile.open(tar_name)
t.extractall(path = dir_name)
t.extractall(path=dir_name)
return True
except Exception as e:
print(e)
......@@ -51,7 +54,8 @@ def download_model_and_data():
shutil.rmtree(path)
for path_key in DATA_MODEL_PATH:
filename = os.path.basename(DATA_MODEL_PATH[path_key])
URLLIB.urlretrieve(DATA_MODEL_PATH[path_key], os.path.join("./", filename))
URLLIB.urlretrieve(DATA_MODEL_PATH[path_key],
os.path.join("./", filename))
state = un_tar(filename, PATH_MAP[path_key])
if not state:
print("Tar %s error....." % path_key)
......
......@@ -122,5 +122,3 @@ def save_param(args, exe, program, dirname):
print("save parameters at %s" % (os.path.join(param_dir, dirname)))
return True
......@@ -21,8 +21,7 @@ import paddle
import paddle.fluid as fluid
def create_net(
is_training,
def create_net(is_training,
model_input,
args,
clip_value=10.0,
......@@ -52,14 +51,12 @@ def create_net(
initializer=fluid.initializer.Normal(scale=0.1)))
#fc to fit dynamic LSTM
context_fc = fluid.layers.fc(
input=context_emb,
context_fc = fluid.layers.fc(input=context_emb,
size=args.hidden_size * 4,
param_attr=fluid.ParamAttr(name='fc_weight'),
bias_attr=fluid.ParamAttr(name='fc_bias'))
response_fc = fluid.layers.fc(
input=response_emb,
response_fc = fluid.layers.fc(input=response_emb,
size=args.hidden_size * 4,
param_attr=fluid.ParamAttr(name='fc_weight'),
bias_attr=fluid.ParamAttr(name='fc_bias'))
......@@ -106,7 +103,5 @@ def set_word_embedding(word_emb, place, word_emb_name="shared_word_emb"):
"""
Set word embedding
"""
word_emb_param = fluid.global_scope().find_var(
word_emb_name).get_tensor()
word_emb_param = fluid.global_scope().find_var(word_emb_name).get_tensor()
word_emb_param.set(word_emb, place)
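`set_word_embedding` works by overwriting the parameter's tensor in the global scope, which is how the pickled `word_emb_init` matrix loaded in train.py reaches the network. A runnable sketch of that mechanism; the vocabulary size, embedding width, and random matrix below are placeholders:

```python
import numpy as np
import paddle.fluid as fluid

main_prog, startup_prog = fluid.Program(), fluid.Program()
with fluid.program_guard(main_prog, startup_prog):
    ids = fluid.data(name="ids", shape=[None, 1], dtype="int64", lod_level=1)
    emb = fluid.layers.embedding(
        input=ids, size=[100, 8],  # [vocab_size, emb_dim], placeholders
        param_attr=fluid.ParamAttr(name="shared_word_emb"))

place = fluid.CPUPlace()
exe = fluid.Executor(place)
exe.run(startup_prog)

# what set_word_embedding does: overwrite the parameter tensor in place
pretrained = np.random.rand(100, 8).astype("float32")
param = fluid.global_scope().find_var("shared_word_emb").get_tensor()
param.set(pretrained, place)
```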
......@@ -42,22 +42,24 @@ def do_save_inference_model(args):
with fluid.unique_name.guard():
context_wordseq = fluid.data(
name='context_wordseq', shape=[-1, 1], dtype='int64', lod_level=1)
name='context_wordseq',
shape=[-1, 1],
dtype='int64',
lod_level=1)
response_wordseq = fluid.data(
name='response_wordseq', shape=[-1, 1], dtype='int64', lod_level=1)
labels = fluid.data(
name='labels', shape=[-1, 1], dtype='int64')
name='response_wordseq',
shape=[-1, 1],
dtype='int64',
lod_level=1)
labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')
input_inst = [context_wordseq, response_wordseq, labels]
input_field = InputField(input_inst)
data_reader = fluid.io.PyReader(feed_list=input_inst,
capacity=4, iterable=False)
data_reader = fluid.io.PyReader(
feed_list=input_inst, capacity=4, iterable=False)
logits = create_net(
is_training=False,
model_input=input_field,
args=args
)
is_training=False, model_input=input_field, args=args)
if args.use_cuda:
place = fluid.CUDAPlace(0)
......@@ -81,9 +83,7 @@ def do_save_inference_model(args):
input_field.context_wordseq.name,
input_field.response_wordseq.name,
],
target_vars=[
logits,
],
target_vars=[logits, ],
executor=exe,
main_program=test_prog,
model_filename="model.pdmodel",
......
......@@ -26,7 +26,6 @@ from inference_model import do_save_inference_model
from ade.utils.configure import PDConfig
if __name__ == "__main__":
args = PDConfig(yaml_file="./data/config/ade.yaml")
......
......@@ -46,22 +46,24 @@ def do_predict(args):
with fluid.unique_name.guard():
context_wordseq = fluid.data(
name='context_wordseq', shape=[-1, 1], dtype='int64', lod_level=1)
name='context_wordseq',
shape=[-1, 1],
dtype='int64',
lod_level=1)
response_wordseq = fluid.data(
name='response_wordseq', shape=[-1, 1], dtype='int64', lod_level=1)
labels = fluid.data(
name='labels', shape=[-1, 1], dtype='int64')
name='response_wordseq',
shape=[-1, 1],
dtype='int64',
lod_level=1)
labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')
input_inst = [context_wordseq, response_wordseq, labels]
input_field = InputField(input_inst)
data_reader = fluid.io.PyReader(feed_list=input_inst,
capacity=4, iterable=False)
data_reader = fluid.io.PyReader(
feed_list=input_inst, capacity=4, iterable=False)
logits = create_net(
is_training=False,
model_input=input_field,
args=args
)
is_training=False, model_input=input_field, args=args)
logits.persistable = True
fetch_list = [logits.name]
......@@ -89,10 +91,7 @@ def do_predict(args):
batch_size=args.batch_size)
batch_generator = processor.data_generator(
place=place,
phase="test",
shuffle=False,
sample_pro=1)
place=place, phase="test", shuffle=False, sample_pro=1)
num_test_examples = processor.get_num_examples(phase='test')
data_reader.decorate_batch_generator(batch_generator)
......@@ -107,7 +106,7 @@ def do_predict(args):
data_reader.reset()
break
scores = scores[: num_test_examples]
scores = scores[:num_test_examples]
print("Write the predicted results into the output_prediction_file")
fw = io.open(args.output_prediction_file, 'w', encoding="utf8")
for index, score in enumerate(scores):
......
......@@ -49,22 +49,24 @@ def do_train(args):
with fluid.unique_name.guard():
context_wordseq = fluid.data(
name='context_wordseq', shape=[-1, 1], dtype='int64', lod_level=1)
name='context_wordseq',
shape=[-1, 1],
dtype='int64',
lod_level=1)
response_wordseq = fluid.data(
name='response_wordseq', shape=[-1, 1], dtype='int64', lod_level=1)
labels = fluid.data(
name='labels', shape=[-1, 1], dtype='int64')
name='response_wordseq',
shape=[-1, 1],
dtype='int64',
lod_level=1)
labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')
input_inst = [context_wordseq, response_wordseq, labels]
input_field = InputField(input_inst)
data_reader = fluid.io.PyReader(feed_list=input_inst,
capacity=4, iterable=False)
data_reader = fluid.io.PyReader(
feed_list=input_inst, capacity=4, iterable=False)
loss = create_net(
is_training=True,
model_input=input_field,
args=args
)
is_training=True, model_input=input_field, args=args)
loss.persistable = True
# gradient clipping
fluid.clip.set_gradient_clip(clip=fluid.clip.GradientClipByValue(
......@@ -74,7 +76,8 @@ def do_train(args):
if args.use_cuda:
dev_count = fluid.core.get_cuda_device_count()
place = fluid.CUDAPlace(int(os.getenv('FLAGS_selected_gpus', '0')))
place = fluid.CUDAPlace(
int(os.getenv('FLAGS_selected_gpus', '0')))
else:
dev_count = int(os.environ.get('CPU_NUM', 1))
place = fluid.CPUPlace()
......@@ -114,9 +117,14 @@ def do_train(args):
if args.word_emb_init:
print("start loading word embedding init ...")
if six.PY2:
word_emb = np.array(pickle.load(io.open(args.word_emb_init, 'rb'))).astype('float32')
word_emb = np.array(
pickle.load(io.open(args.word_emb_init, 'rb'))).astype(
'float32')
else:
word_emb = np.array(pickle.load(io.open(args.word_emb_init, 'rb'), encoding="bytes")).astype('float32')
word_emb = np.array(
pickle.load(
io.open(args.word_emb_init, 'rb'),
encoding="bytes")).astype('float32')
set_word_embedding(word_emb, place)
print("finish init word embedding ...")
......@@ -147,15 +155,20 @@ def do_train(args):
used_time = time_end - time_begin
current_time = time.strftime('%Y-%m-%d %H:%M:%S',
time.localtime(time.time()))
print('%s epoch: %d, step: %s, avg loss %s, speed: %f steps/s' % (current_time, epoch_step, steps, sum_loss / args.print_steps, args.print_steps / used_time))
print(
'%s epoch: %d, step: %s, avg loss %s, speed: %f steps/s'
% (current_time, epoch_step, steps, sum_loss /
args.print_steps, args.print_steps / used_time))
sum_loss = 0.0
time_begin = time.time()
if steps % args.save_steps == 0:
if args.save_checkpoint:
save_load_io.save_checkpoint(args, exe, train_prog, "step_" + str(steps))
save_load_io.save_checkpoint(args, exe, train_prog,
"step_" + str(steps))
if args.save_param:
save_load_io.save_param(args, exe, train_prog, "step_" + str(steps))
save_load_io.save_param(args, exe, train_prog,
"step_" + str(steps))
steps += 1
except fluid.core.EOFException:
data_reader.reset()
......
......@@ -20,12 +20,18 @@ from kpi import CostKpi
from kpi import DurationKpi
from kpi import AccKpi
each_step_duration_atis_slot_card1 = DurationKpi('each_step_duration_atis_slot_card1', 0.01, 0, actived=True)
train_loss_atis_slot_card1 = CostKpi('train_loss_atis_slot_card1', 0.08, 0, actived=True)
train_acc_atis_slot_card1 = CostKpi('train_acc_atis_slot_card1', 0.01, 0, actived=True)
each_step_duration_atis_slot_card4 = DurationKpi('each_step_duration_atis_slot_card4', 0.06, 0, actived=True)
train_loss_atis_slot_card4 = CostKpi('train_loss_atis_slot_card4', 0.03, 0, actived=True)
train_acc_atis_slot_card4 = CostKpi('train_acc_atis_slot_card4', 0.01, 0, actived=True)
each_step_duration_atis_slot_card1 = DurationKpi(
'each_step_duration_atis_slot_card1', 0.01, 0, actived=True)
train_loss_atis_slot_card1 = CostKpi(
'train_loss_atis_slot_card1', 0.08, 0, actived=True)
train_acc_atis_slot_card1 = CostKpi(
'train_acc_atis_slot_card1', 0.01, 0, actived=True)
each_step_duration_atis_slot_card4 = DurationKpi(
'each_step_duration_atis_slot_card4', 0.06, 0, actived=True)
train_loss_atis_slot_card4 = CostKpi(
'train_loss_atis_slot_card4', 0.03, 0, actived=True)
train_acc_atis_slot_card4 = CostKpi(
'train_acc_atis_slot_card4', 0.01, 0, actived=True)
tracking_kpis = [
each_step_duration_atis_slot_card1,
......
......@@ -100,8 +100,12 @@ def prepare_batch_data(task_name,
if isinstance(insts[0][3], list):
if task_name == "atis_slot":
labels_list = [inst[3] + [0] * (max_len - len(inst[3])) for inst in insts]
labels_list = [np.array(labels_list).astype("int64").reshape([-1, max_len])]
labels_list = [
inst[3] + [0] * (max_len - len(inst[3])) for inst in insts
]
labels_list = [
np.array(labels_list).astype("int64").reshape([-1, max_len])
]
elif task_name == "dstc2":
labels_list = [inst[3] for inst in insts]
labels_list = [np.array(labels_list).astype("int64")]
......@@ -124,10 +128,7 @@ def prepare_batch_data(task_name,
out = batch_src_ids
# Second step: padding
src_id, self_input_mask = pad_batch_data(
out,
max_len,
pad_idx=pad_id,
return_input_mask=True)
out, max_len, pad_idx=pad_id, return_input_mask=True)
pos_id = pad_batch_data(
batch_pos_ids,
max_len,
......@@ -163,13 +164,13 @@ def pad_batch_data(insts,
corresponding position data and attention bias.
"""
return_list = []
max_len = max_len_in if max_len_in != -1 else max(len(inst) for inst in insts)
max_len = max_len_in if max_len_in != -1 else max(
len(inst) for inst in insts)
# Any token included in dict can be used to pad, since the paddings' loss
# will be masked out by weights and make no effect on parameter gradients.
inst_data = np.array(
[inst + list([pad_idx] * (max_len - len(inst))) for inst in insts
])
[inst + list([pad_idx] * (max_len - len(inst))) for inst in insts])
return_list += [inst_data.astype("int64").reshape([-1, max_len])]
# position data
......
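A numpy-only sketch of what `pad_batch_data`'s id padding and input mask amount to. Note the mask is built from sequence lengths rather than token values, which is why any in-vocabulary id can serve as `pad_idx`:

```python
import numpy as np

def pad_batch(insts, pad_idx=0):
    max_len = max(len(inst) for inst in insts)
    data = np.array(
        [inst + [pad_idx] * (max_len - len(inst)) for inst in insts],
        dtype="int64")
    # mask by position, so real tokens equal to pad_idx are not masked out
    mask = np.array(
        [[1.0] * len(inst) + [0.0] * (max_len - len(inst)) for inst in insts],
        dtype="float32")
    return data.reshape([-1, max_len]), mask

data, mask = pad_batch([[3, 7, 5], [9, 2]])
print(data)  # [[3 7 5] [9 2 0]]
print(mask)  # [[1. 1. 1.] [1. 1. 0.]]
```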
......@@ -25,18 +25,21 @@ class DefinePredict(object):
"""
Packaging Prediction Results
"""
def __init__(self):
"""
init
"""
self.task_map = {'udc': 'get_matching_res',
self.task_map = {
'udc': 'get_matching_res',
'swda': 'get_cls_res',
'mrda': 'get_cls_res',
'atis_intent': 'get_cls_res',
'atis_slot': 'get_sequence_tagging',
'dstc2': 'get_multi_cls_res',
'dstc2_asr': 'get_multi_cls_res',
'multi-woz': 'get_multi_cls_res'}
'multi-woz': 'get_multi_cls_res'
}
def get_matching_res(self, probs, params=None):
"""
......@@ -79,7 +82,3 @@ class DefinePredict(object):
label_str = " ".join([str(l) for l in sorted(labels)])
return label_str
......@@ -20,25 +20,29 @@ import sys
import io
import os
URLLIB=urllib
URLLIB = urllib
if sys.version_info >= (3, 0):
import urllib.request
URLLIB=urllib.request
URLLIB = urllib.request
DATA_MODEL_PATH = {"DATA_PATH": "https://baidu-nlp.bj.bcebos.com/dmtk_data_1.0.0.tar.gz",
"PRETRAIN_MODEL": "https://bert-models.bj.bcebos.com/uncased_L-12_H-768_A-12.tar.gz",
"TRAINED_MODEL": "https://baidu-nlp.bj.bcebos.com/dgu_models_2.0.0.tar.gz"}
DATA_MODEL_PATH = {
"DATA_PATH": "https://baidu-nlp.bj.bcebos.com/dmtk_data_1.0.0.tar.gz",
"PRETRAIN_MODEL":
"https://bert-models.bj.bcebos.com/uncased_L-12_H-768_A-12.tar.gz",
"TRAINED_MODEL": "https://baidu-nlp.bj.bcebos.com/dgu_models_2.0.0.tar.gz"
}
PATH_MAP = {'DATA_PATH': "./data/input",
PATH_MAP = {
'DATA_PATH': "./data/input",
'PRETRAIN_MODEL': './data/pretrain_model',
'TRAINED_MODEL': './data/saved_models'}
'TRAINED_MODEL': './data/saved_models'
}
def un_tar(tar_name, dir_name):
try:
t = tarfile.open(tar_name)
t.extractall(path = dir_name)
t.extractall(path=dir_name)
return True
except Exception as e:
print(e)
......@@ -48,13 +52,18 @@ def un_tar(tar_name, dir_name):
def download_model_and_data():
print("Downloading dgu data, pretrain model and trained models......")
print("This process is quite long, please wait patiently............")
for path in ['./data/input/data', './data/pretrain_model/uncased_L-12_H-768_A-12', './data/saved_models/trained_models']:
for path in [
'./data/input/data',
'./data/pretrain_model/uncased_L-12_H-768_A-12',
'./data/saved_models/trained_models'
]:
if not os.path.exists(path):
continue
shutil.rmtree(path)
for path_key in DATA_MODEL_PATH:
filename = os.path.basename(DATA_MODEL_PATH[path_key])
URLLIB.urlretrieve(DATA_MODEL_PATH[path_key], os.path.join("./", filename))
URLLIB.urlretrieve(DATA_MODEL_PATH[path_key],
os.path.join("./", filename))
state = un_tar(filename, PATH_MAP[path_key])
if not state:
print("Tar %s error....." % path_key)
......
......@@ -19,6 +19,3 @@ python run_build_data.py udc
python run_build_data.py atis
The slot filling data is generated under dialogue_general_understanding/data/input/data/atis/atis_slot
The intent detection data is generated under dialogue_general_understanding/data/input/data/atis/atis_intent
......@@ -12,7 +12,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""build swda train dev test dataset"""
import json
......@@ -27,6 +26,7 @@ class ATIS(object):
"""
nlu dataset atis data process
"""
def __init__(self):
"""
init instance
......@@ -73,7 +73,8 @@ class ATIS(object):
if example[1] not in self.intent_dict:
self.intent_dict[example[1]] = self.intent_id
self.intent_id += 1
fw.write(u"%s\t%s\n" % (self.intent_dict[example[1]], example[0].lower()))
fw.write(u"%s\t%s\n" %
(self.intent_dict[example[1]], example[0].lower()))
fw = io.open(self.map_tag_intent, 'w', encoding="utf8")
for tag in self.intent_dict:
......@@ -109,17 +110,19 @@ class ATIS(object):
tags_slot.append(str(self.slot_dict[tag]))
if i == 0:
if start not in [0, 1]:
prefix_num = len(text[: start].strip().split())
prefix_num = len(text[:start].strip().split())
tags.extend([str(self.slot_dict['O'])] * prefix_num)
tags.extend(tags_slot)
else:
prefix_num = len(text[entities[i - 1]['end']: start].strip().split())
prefix_num = len(text[entities[i - 1]['end']:start].strip()
.split())
tags.extend([str(self.slot_dict['O'])] * prefix_num)
tags.extend(tags_slot)
if entities[-1]['end'] < len(text):
suffix_num = len(text[entities[-1]['end']:].strip().split())
tags.extend([str(self.slot_dict['O'])] * suffix_num)
fw.write(u"%s\t%s\n" % (text.encode('utf8'), " ".join(tags).encode('utf8')))
fw.write(u"%s\t%s\n" %
(text.encode('utf8'), " ".join(tags).encode('utf8')))
fw = io.open(self.map_tag_slot, 'w', encoding="utf8")
for slot in self.slot_dict:
......@@ -152,7 +155,3 @@ class ATIS(object):
if __name__ == "__main__":
atis_inst = ATIS()
atis_inst.main()
......@@ -28,6 +28,7 @@ class DSTC2(object):
"""
dialogue state tracking dstc2 data process
"""
def __init__(self):
"""
init instance
......@@ -49,7 +50,8 @@ class DSTC2(object):
self.data_dict = commonlib.load_dict(self.data_list)
for data_type in self.data_dict:
for i in range(len(self.data_dict[data_type])):
self.data_dict[data_type][i] = os.path.join(self.src_dir, self.data_dict[data_type][i])
self.data_dict[data_type][i] = os.path.join(
self.src_dir, self.data_dict[data_type][i])
def _load_ontology(self):
"""
......@@ -97,15 +99,25 @@ class DSTC2(object):
log_turn = log_json["turns"][i]
label_turn = label_json["turns"][i]
assert log_turn["turn-index"] == label_turn["turn-index"]
labels = ["%s_%s" % (slot, label_turn["goal-labels"][slot]) for slot in label_turn["goal-labels"]]
labels_ids = " ".join([str(self.map_tag_dict.get(label, self.map_tag_dict["%s_none" % label.split('_')[0]])) for label in labels])
labels = [
"%s_%s" % (slot, label_turn["goal-labels"][slot])
for slot in label_turn["goal-labels"]
]
labels_ids = " ".join([
str(
self.map_tag_dict.get(label, self.map_tag_dict[
"%s_none" % label.split('_')[0]]))
for label in labels
])
mach = log_turn['output']['transcript']
user = label_turn['transcription']
if not labels_ids.strip():
labels_ids = self.map_tag_dict['none']
out = "%s\t%s\1%s\t%s" % (session_id, mach, user, labels_ids)
user_asr = log_turn['input']['live']['asr-hyps'][0]['asr-hyp'].strip()
out_asr = "%s\t%s\1%s\t%s" % (session_id, mach, user_asr, labels_ids)
user_asr = log_turn['input']['live']['asr-hyps'][0][
'asr-hyp'].strip()
out_asr = "%s\t%s\1%s\t%s" % (session_id, mach, user_asr,
labels_ids)
fw.write(u"%s\n" % out.encode('utf8'))
fw_asr.write(u"%s\n" % out_asr.encode('utf8'))
......@@ -144,10 +156,7 @@ class DSTC2(object):
self.get_test_dataset()
self.get_labels()
if __name__ == "__main__":
dstc_inst = DSTC2()
dstc_inst.main()
......@@ -27,6 +27,7 @@ class MRDA(object):
"""
dialogue act dataset mrda data process
"""
def __init__(self):
"""
init instance
......@@ -67,7 +68,7 @@ class MRDA(object):
for dadb_key in dadb_list:
dadb_file = self.dadb_dict[dadb_key]
fr = io.open(dadb_file, 'r', encoding="utf8")
row = csv.reader(fr, delimiter = ',')
row = csv.reader(fr, delimiter=',')
for line in row:
elems = line
conv_id = elems[2]
......@@ -87,7 +88,7 @@ class MRDA(object):
for trans_key in trans_list:
trans_file = self.trans_dict[trans_key]
fr = io.open(trans_file, 'r', encoding="utf8")
row = csv.reader(fr, delimiter = ',')
row = csv.reader(fr, delimiter=',')
for line in row:
elems = line
if len(elems) != 3:
......@@ -120,7 +121,8 @@ class MRDA(object):
self.tag_id += 1
caller = elem.split('_')[0].split('-')[-1]
conv_no = elem.split('_')[0].split('-')[0]
out = "%s\t%s\t%s\t%s" % (conv_no, self.map_tag_dict[tag], caller, v_trans[0])
out = "%s\t%s\t%s\t%s" % (conv_no, self.map_tag_dict[tag], caller,
v_trans[0])
fw.write(u"%s\n" % out)
def get_train_dataset(self):
......@@ -158,10 +160,7 @@ class MRDA(object):
self.get_test_dataset()
self.get_labels()
if __name__ == "__main__":
mrda_inst = MRDA()
mrda_inst.main()
......@@ -27,6 +27,7 @@ class SWDA(object):
"""
dialogue act dataset swda data process
"""
def __init__(self):
"""
init instance
......@@ -63,7 +64,7 @@ class SWDA(object):
file_path = self.file_dict[name]
fr = io.open(file_path, 'r', encoding="utf8")
idx = 0
row = csv.reader(fr, delimiter = ',')
row = csv.reader(fr, delimiter=',')
for r in row:
if idx == 0:
idx += 1
......@@ -224,10 +225,7 @@ class SWDA(object):
self.get_test_dataset()
self.get_labels()
if __name__ == "__main__":
swda_inst = SWDA()
swda_inst.main()
......@@ -20,7 +20,6 @@ from build_dstc2_dataset import DSTC2
from build_mrda_dataset import MRDA
from build_swda_dataset import SWDA
if __name__ == "__main__":
task_name = sys.argv[1]
task_name = task_name.lower()
......@@ -38,11 +37,12 @@ if __name__ == "__main__":
elif task_name == 'atis':
atis_inst = ATIS()
atis_inst.main()
shutil.copyfile("../../data/input/data/atis/atis_slot/test.txt", "../../data/input/data/atis/atis_slot/dev.txt")
shutil.copyfile("../../data/input/data/atis/atis_intent/test.txt", "../../data/input/data/atis/atis_intent/dev.txt")
shutil.copyfile("../../data/input/data/atis/atis_slot/test.txt",
"../../data/input/data/atis/atis_slot/dev.txt")
shutil.copyfile("../../data/input/data/atis/atis_intent/test.txt",
"../../data/input/data/atis/atis_intent/dev.txt")
elif task_name == 'dstc2':
dstc_inst = DSTC2()
dstc_inst.main()
else:
exit(0)
......@@ -12,7 +12,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Tokenization classes."""
from __future__ import absolute_import
......
......@@ -113,7 +113,7 @@ def multi_head_attention(queries,
"""
Scaled Dot-Product Attention
"""
scaled_q = layers.scale(x=q, scale=d_key ** -0.5)
scaled_q = layers.scale(x=q, scale=d_key**-0.5)
product = layers.matmul(x=scaled_q, y=k, transpose_y=True)
if attn_bias:
product += attn_bias
......
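The one-line change above only normalizes spacing around `**`; the computation is still standard scaled dot-product attention. A numpy sketch of what this inner block of `multi_head_attention` computes:

```python
import numpy as np

def scaled_dot_product_attention(q, k, v, attn_bias=None):
    # softmax(q @ k^T / sqrt(d_key)) @ v, as in the Fluid block above
    d_key = q.shape[-1]
    product = (q * d_key ** -0.5) @ k.transpose(0, 2, 1)
    if attn_bias is not None:
        product += attn_bias  # e.g. large negatives at padded positions
    weights = np.exp(product - product.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v

q = k = v = np.random.rand(2, 5, 16)  # [batch, seq_len, d_key]
print(scaled_dot_product_attention(q, k, v).shape)  # (2, 5, 16)
```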
......@@ -122,5 +122,3 @@ def save_param(args, exe, program, dirname):
print("save parameters at %s" % (os.path.join(param_dir, dirname)))
return True
......@@ -23,12 +23,7 @@ from dgu.bert import BertModel
from dgu.utils.configure import JsonConfig
def create_net(
is_training,
model_input,
num_labels,
paradigm_inst,
args):
def create_net(is_training, model_input, num_labels, paradigm_inst, args):
"""create dialogue task model"""
src_ids = model_input.src_ids
......@@ -48,14 +43,15 @@ def create_net(
config=bert_conf,
use_fp16=False)
params = {'num_labels': num_labels,
params = {
'num_labels': num_labels,
'src_ids': src_ids,
'pos_ids': pos_ids,
'sent_ids': sent_ids,
'input_mask': input_mask,
'labels': labels,
'is_training': is_training}
'is_training': is_training
}
results = paradigm_inst.paradigm(bert, params)
return results
......@@ -66,7 +66,9 @@ def do_save_inference_model(args):
sent_ids = fluid.data(
name='sent_ids', shape=[-1, args.max_seq_len], dtype='int64')
input_mask = fluid.data(
name='input_mask', shape=[-1, args.max_seq_len], dtype='float32')
name='input_mask',
shape=[-1, args.max_seq_len],
dtype='float32')
if args.task_name == 'atis_slot':
labels = fluid.data(
name='labels', shape=[-1, args.max_seq_len], dtype='int64')
......@@ -74,8 +76,7 @@ def do_save_inference_model(args):
labels = fluid.data(
name='labels', shape=[-1, num_labels], dtype='int64')
else:
labels = fluid.data(
name='labels', shape=[-1, 1], dtype='int64')
labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')
input_inst = [src_ids, pos_ids, sent_ids, input_mask, labels]
input_field = InputField(input_inst)
......@@ -107,14 +108,10 @@ def do_save_inference_model(args):
fluid.io.save_inference_model(
args.inference_model_dir,
feeded_var_names=[
input_field.src_ids.name,
input_field.pos_ids.name,
input_field.sent_ids.name,
input_field.input_mask.name
],
target_vars=[
probs
input_field.src_ids.name, input_field.pos_ids.name,
input_field.sent_ids.name, input_field.input_mask.name
],
target_vars=[probs],
executor=exe,
main_program=test_prog,
model_filename="model.pdmodel",
......
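The counterpart of this `save_inference_model` call is `fluid.io.load_inference_model`, which restores the frozen program together with its feed and fetch lists. A sketch under stated assumptions: the directory and the params file name are guesses (the save call's `params_filename` is elided above), and `max_seq_len` must match the exported model:

```python
import numpy as np
import paddle.fluid as fluid

exe = fluid.Executor(fluid.CPUPlace())
# model_filename mirrors the save call above; the directory and
# params_filename here are assumptions
infer_prog, feed_names, fetch_targets = fluid.io.load_inference_model(
    "./inference_model", exe,
    model_filename="model.pdmodel", params_filename="params.pdparams")

max_seq_len = 128  # must match the max_seq_len used at export time
feed = {name: np.zeros([1, max_seq_len], dtype="int64")
        for name in feed_names[:3]}        # src_ids, pos_ids, sent_ids
feed[feed_names[3]] = np.ones([1, max_seq_len], dtype="float32")  # input_mask
probs = exe.run(infer_prog, feed=feed, fetch_list=fetch_targets)[0]
print(probs.shape)
```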
......@@ -26,7 +26,6 @@ from inference_model import do_save_inference_model
from dgu.utils.configure import PDConfig
if __name__ == "__main__":
args = PDConfig(yaml_file="./data/config/dgu.yaml")
......
......@@ -66,7 +66,9 @@ def do_train(args):
sent_ids = fluid.data(
name='sent_ids', shape=[-1, args.max_seq_len], dtype='int64')
input_mask = fluid.data(
name='input_mask', shape=[-1, args.max_seq_len], dtype='float32')
name='input_mask',
shape=[-1, args.max_seq_len],
dtype='float32')
if args.task_name == 'atis_slot':
labels = fluid.data(
name='labels', shape=[-1, args.max_seq_len], dtype='int64')
......@@ -74,13 +76,12 @@ def do_train(args):
labels = fluid.data(
name='labels', shape=[-1, num_labels], dtype='int64')
else:
labels = fluid.data(
name='labels', shape=[-1, 1], dtype='int64')
labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')
input_inst = [src_ids, pos_ids, sent_ids, input_mask, labels]
input_field = InputField(input_inst)
data_reader = fluid.io.PyReader(feed_list=input_inst,
capacity=4, iterable=False)
data_reader = fluid.io.PyReader(
feed_list=input_inst, capacity=4, iterable=False)
processor = processors[task_name](data_dir=args.data_dir,
vocab_path=args.vocab_path,
max_seq_len=args.max_seq_len,
......@@ -113,9 +114,7 @@ def do_train(args):
dev_count = int(os.environ.get('CPU_NUM', 1))
batch_generator = processor.data_generator(
batch_size=args.batch_size,
phase='train',
shuffle=True)
batch_size=args.batch_size, phase='train', shuffle=True)
num_train_examples = processor.get_num_examples(phase='train')
if args.in_tokens:
......@@ -217,37 +216,32 @@ def do_train(args):
current_time = time.strftime('%Y-%m-%d %H:%M:%S',
time.localtime(time.time()))
if accuracy is not None:
print(
"%s epoch: %d, step: %d, ave loss: %f, "
print("%s epoch: %d, step: %d, ave loss: %f, "
"ave acc: %f, speed: %f steps/s" %
(current_time, epoch_step, steps,
np.mean(np_loss),
np.mean(np_acc),
np.mean(np_loss), np.mean(np_acc),
args.print_steps / used_time))
ce_info.append([
np.mean(np_loss),
np.mean(np_acc),
np.mean(np_loss), np.mean(np_acc),
args.print_steps / used_time
])
else:
print(
"%s epoch: %d, step: %d, ave loss: %f, "
print("%s epoch: %d, step: %d, ave loss: %f, "
"speed: %f steps/s" %
(current_time, epoch_step, steps,
np.mean(np_loss),
args.print_steps / used_time))
ce_info.append([
np.mean(np_loss),
args.print_steps / used_time
])
np.mean(np_loss), args.print_steps / used_time))
ce_info.append(
[np.mean(np_loss), args.print_steps / used_time])
time_begin = time.time()
if steps % args.save_steps == 0:
save_path = "step_" + str(steps)
if args.save_checkpoint:
save_load_io.save_checkpoint(args, exe, train_prog, save_path)
save_load_io.save_checkpoint(args, exe, train_prog,
save_path)
if args.save_param:
save_load_io.save_param(args, exe, train_prog, save_path)
save_load_io.save_param(args, exe, train_prog,
save_path)
except fluid.core.EOFException:
data_reader.reset()
......
......@@ -19,8 +19,7 @@ from __future__ import print_function
import os
import sys
sys.path.append("../")
sys.path.append("../shared_modules/")
import paddle
import paddle.fluid as fluid
import numpy as np
......
......@@ -11,7 +11,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Emotion Detection Task
"""
......@@ -24,7 +23,7 @@ import os
import time
import multiprocessing
import sys
sys.path.append("../")
sys.path.append("../shared_modules/")
import paddle
import paddle.fluid as fluid
......@@ -38,9 +37,7 @@ import reader
import utils
def create_model(args,
num_labels,
is_prediction=False):
def create_model(args, num_labels, is_prediction=False):
"""
Create Model for Emotion Detection
"""
......@@ -77,10 +74,17 @@ def create_model(args,
raise ValueError("Unknown network type!")
if is_prediction:
probs = network(data, seq_len, None, args.vocab_size, class_dim=num_labels, is_prediction=True)
probs = network(
data,
seq_len,
None,
args.vocab_size,
class_dim=num_labels,
is_prediction=True)
return loader, probs, [data.name, seq_len.name]
avg_loss, probs = network(data, seq_len, label, args.vocab_size, class_dim=num_labels)
avg_loss, probs = network(
data, seq_len, label, args.vocab_size, class_dim=num_labels)
num_seqs = fluid.layers.create_tensor(dtype='int64')
accuracy = fluid.layers.accuracy(input=probs, label=label, total=num_seqs)
return loader, avg_loss, accuracy, num_seqs
......@@ -142,7 +146,8 @@ def main(args):
exe = fluid.Executor(place)
task_name = args.task_name.lower()
processor = reader.EmoTectProcessor(data_dir=args.data_dir,
processor = reader.EmoTectProcessor(
data_dir=args.data_dir,
vocab_path=args.vocab_path,
random_seed=args.random_seed)
#num_labels = len(processor.get_labels())
......@@ -173,9 +178,7 @@ def main(args):
with fluid.program_guard(train_program, startup_prog):
with fluid.unique_name.guard():
train_loader, loss, accuracy, num_seqs = create_model(
args,
num_labels=num_labels,
is_prediction=False)
args, num_labels=num_labels, is_prediction=False)
sgd_optimizer = fluid.optimizer.Adagrad(learning_rate=args.lr)
sgd_optimizer.minimize(loss)
......@@ -189,37 +192,27 @@ def main(args):
if args.do_val:
if args.do_train:
test_data_generator = processor.data_generator(
batch_size=args.batch_size,
phase='dev',
epoch=1)
batch_size=args.batch_size, phase='dev', epoch=1)
else:
test_data_generator = processor.data_generator(
batch_size=args.batch_size,
phase='test',
epoch=1)
batch_size=args.batch_size, phase='test', epoch=1)
test_prog = fluid.Program()
with fluid.program_guard(test_prog, startup_prog):
with fluid.unique_name.guard():
test_loader, loss, accuracy, num_seqs = create_model(
args,
num_labels=num_labels,
is_prediction=False)
args, num_labels=num_labels, is_prediction=False)
test_prog = test_prog.clone(for_test=True)
if args.do_infer:
infer_data_generator = processor.data_generator(
batch_size=args.batch_size,
phase='infer',
epoch=1)
batch_size=args.batch_size, phase='infer', epoch=1)
test_prog = fluid.Program()
with fluid.program_guard(test_prog, startup_prog):
with fluid.unique_name.guard():
infer_loader, probs, _ = create_model(
args,
num_labels=num_labels,
is_prediction=True)
args, num_labels=num_labels, is_prediction=True)
test_prog = test_prog.clone(for_test=True)
exe.run(startup_prog)
......@@ -292,8 +285,9 @@ def main(args):
time_begin = time.time()
if steps % args.save_steps == 0:
save_path = os.path.join(args.save_checkpoint_dir, "step_" + str(steps))
fluid.io.save_persistables(exe, save_path, train_program)
save_path = os.path.join(args.save_checkpoint_dir,
"step_" + str(steps))
fluid.save(train_program, save_path)
if steps % args.validation_steps == 0:
# evaluate on dev set
......@@ -306,11 +300,11 @@ def main(args):
print("final step: %d " % steps)
if args.do_val:
evaluate(test_exe, test_prog, test_loader,
[loss.name, accuracy.name, num_seqs.name],
"dev")
[loss.name, accuracy.name, num_seqs.name], "dev")
save_path = os.path.join(args.save_checkpoint_dir, "step_" + str(steps))
fluid.io.save_persistables(exe, save_path, train_program)
save_path = os.path.join(args.save_checkpoint_dir,
"step_" + str(steps))
fluid.save(train_program, save_path)
train_loader.reset()
break
......@@ -334,15 +328,12 @@ def main(args):
if not args.do_train and args.do_val:
print("Final test result:")
evaluate(test_exe, test_prog, test_loader,
[loss.name, accuracy.name, num_seqs.name],
"test")
[loss.name, accuracy.name, num_seqs.name], "test")
# infer
if args.do_infer:
print("Final infer result:")
infer(test_exe, test_prog, infer_loader,
[probs.name],
"infer")
infer(test_exe, test_prog, infer_loader, [probs.name], "infer")
def get_cards():
......
......@@ -11,7 +11,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Emotion Detection Task, based on ERNIE
"""
......@@ -25,7 +24,7 @@ import time
import argparse
import multiprocessing
import sys
sys.path.append("../")
sys.path.append("../shared_modules/")
import paddle
import paddle.fluid as fluid
......@@ -350,7 +349,7 @@ def main(args):
if steps % args.save_steps == 0:
save_path = os.path.join(args.save_checkpoint_dir, "step_" + str(steps))
fluid.io.save_persistables(exe, save_path, train_program)
fluid.save(train_program, save_path)
if steps % args.validation_steps == 0:
# evaluate dev set
......@@ -369,7 +368,7 @@ def main(args):
except fluid.core.EOFException:
save_path = os.path.join(args.save_checkpoint_dir, "step_" + str(steps))
fluid.io.save_persistables(exe, save_path, train_program)
fluid.save(train_program, save_path)
train_pyreader.reset()
break
......
......@@ -11,7 +11,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
EmoTect utilities.
"""
......@@ -29,27 +28,13 @@ import paddle
import paddle.fluid as fluid
import numpy as np
def init_checkpoint(exe, init_checkpoint_path, main_program):
"""
Init CheckPoint
"""
assert os.path.exists(
init_checkpoint_path), "[%s] cann't be found." % init_checkpoint_path
def existed_persitables(var):
"""
Whether the persistable variable exists in the checkpoint
"""
if not fluid.io.is_persistable(var):
return False
return os.path.exists(os.path.join(init_checkpoint_path, var.name))
fluid.io.load_vars(
exe,
init_checkpoint_path,
main_program=main_program,
predicate=existed_persitables)
print("Load model from {}".format(init_checkpoint_path))
fluid.load(main_program, init_checkpoint_path, exe)
def word2id(word_dict, query):
......@@ -57,8 +42,10 @@ def word2id(word_dict, query):
Convert word sequence into id list
"""
unk_id = len(word_dict)
wids = [word_dict[w] if w in word_dict else unk_id
for w in query.strip().split(" ")]
wids = [
word_dict[w] if w in word_dict else unk_id
for w in query.strip().split(" ")
]
return wids
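For clarity, a minimal usage sketch of `word2id` (the tiny vocabulary here is hypothetical): words missing from the vocabulary map to `unk_id`, which equals `len(word_dict)`.
```python
# Hypothetical two-word vocabulary; real vocabularies come from the vocab file.
word_dict = {"百度": 0, "是": 1}
print(word2id(word_dict, "百度 是 什么"))  # -> [0, 1, 2]; "什么" is OOV, so it gets unk_id == 2
```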
......
......@@ -5,7 +5,7 @@
## 1. Task Description
This document describes an LSTM-based language model: given an input word sequence (word-segmented for Chinese, tokenized for English), it computes the ppl (language-model perplexity, which measures the fluency of a sentence). For an introduction to RNN-based language models, see [this paper](https://arxiv.org/abs/1409.2329). Compared with traditional methods, RNN-based approaches handle rare words better.
**The language model currently requires PaddlePaddle 1.6 or above, or a suitable develop build.**
**The language model currently requires PaddlePaddle 1.7 or above, or a suitable develop build.**
Users are also encouraged to consult the [IPython Notebook demo](https://aistudio.baidu.com/aistudio/projectDetail/122290).
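As a quick illustration of the ppl metric above (a minimal sketch with made-up numbers, not part of the project code): perplexity is the exponential of the average per-word cross-entropy.
```python
import numpy as np

# Hypothetical per-word cross-entropy losses (in nats) over a held-out corpus.
word_losses = np.array([3.2, 2.9, 4.1, 3.5])
ppl = np.exp(word_losses.mean())  # perplexity = exp(mean negative log-likelihood)
print("ppl: %.2f" % ppl)  # lower ppl indicates more fluent text under the model
```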
......
......@@ -36,7 +36,7 @@ import sys
if sys.version[0] == '2':
reload(sys)
sys.setdefaultencoding("utf-8")
sys.path.append('../')
sys.path.append('../shared_modules/')
import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"
......@@ -60,7 +60,7 @@ def profile_context(profile=True, profiler_path='/tmp/paddingrnn.profile'):
def get_current_model_para(train_prog, train_exe):
param_list = train_prog.block(0).all_parameters()
param_list = train_prog.all_parameters()
param_name_list = [p.name for p in param_list]
vals = {}
......@@ -73,7 +73,7 @@ def get_current_model_para(train_prog, train_exe):
def save_para_npz(train_prog, train_exe):
print("begin to save model to model_base")
param_list = train_prog.block(0).all_parameters()
param_list = train_prog.all_parameters()
param_name_list = [p.name for p in param_list]
vals = {}
......
......@@ -16,7 +16,7 @@ Lexical Analysis of Chinese (LAC) is a joint lexical analysis model
#### 1. PaddlePaddle Installation
This project requires PaddlePaddle 1.6.0 or above and PaddleHub 1.0.0 or above. To install PaddlePaddle, see the official [quick start](http://www.paddlepaddle.org/paddle#quick-start); to install PaddleHub, see [PaddleHub](https://github.com/PaddlePaddle/PaddleHub)
This project requires PaddlePaddle 1.7 or above and PaddleHub 1.0.0 or above. To install PaddlePaddle, see the official [quick start](http://www.paddlepaddle.org/paddle#quick-start); to install PaddleHub, see [PaddleHub](https://github.com/PaddlePaddle/PaddleHub)
> Warning: the GPU and CPU builds of PaddlePaddle are published as paddlepaddle-gpu and paddlepaddle respectively; mind the difference when installing.
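As a quick self-check before running (a minimal sketch; the version threshold follows the requirement above):
```python
import paddle

# Develop builds report "0.0.0", which we accept as satisfying the requirement.
version = tuple(int(x) for x in paddle.__version__.split(".")[:2])
assert version >= (1, 7) or paddle.__version__.startswith("0.0.0"), \
    "please install paddlepaddle(-gpu) >= 1.7"
```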
......
......@@ -26,7 +26,7 @@ from paddle.fluid.initializer import NormalInitializer
from reader import Dataset
from ernie_reader import SequenceLabelReader
sys.path.append("..")
sys.path.append("../shared_modules/")
from models.sequence_labeling import nets
from models.representation.ernie import ernie_encoder, ernie_pyreader
......@@ -35,9 +35,10 @@ def create_model(args, vocab_size, num_labels, mode='train'):
"""create lac model"""
# model's input data
words = fluid.data(name='words', shape=[-1, 1], dtype='int64', lod_level=1)
words = fluid.data(
name='words', shape=[None, 1], dtype='int64', lod_level=1)
targets = fluid.data(
name='targets', shape=[-1, 1], dtype='int64', lod_level=1)
name='targets', shape=[None, 1], dtype='int64', lod_level=1)
# for inference process
if mode == 'infer':
......@@ -88,9 +89,11 @@ def create_pyreader(args,
return_reader=False,
mode='train'):
# init reader
device_count = len(fluid.cuda_places()) if args.use_cuda else len(
fluid.cpu_places())
if model == 'lac':
pyreader = fluid.io.PyReader(
pyreader = fluid.io.DataLoader.from_generator(
feed_list=feed_list,
capacity=50,
use_double_buffer=True,
......@@ -101,19 +104,19 @@ def create_pyreader(args,
# create lac pyreader
if mode == 'train':
pyreader.decorate_sample_list_generator(
pyreader.set_sample_list_generator(
fluid.io.batch(
fluid.io.shuffle(
reader.file_reader(file_name),
buf_size=args.traindata_shuffle_buffer),
batch_size=args.batch_size),
batch_size=args.batch_size // device_count),  # integer division: per-device batch size
places=place)
else:
pyreader.decorate_sample_list_generator(
pyreader.set_sample_list_generator(
fluid.io.batch(
reader.file_reader(
file_name, mode=mode),
batch_size=args.batch_size),
batch_size=args.batch_size // device_count),  # integer division: per-device batch size
places=place)
elif model == 'ernie':
......@@ -162,19 +165,19 @@ def create_ernie_model(args, ernie_config):
# ERNIE's input data
src_ids = fluid.data(
name='src_ids', shape=[-1, args.max_seq_len, 1], dtype='int64')
name='src_ids', shape=[None, args.max_seq_len, 1], dtype='int64')
sent_ids = fluid.data(
name='sent_ids', shape=[-1, args.max_seq_len, 1], dtype='int64')
name='sent_ids', shape=[None, args.max_seq_len, 1], dtype='int64')
pos_ids = fluid.data(
name='pos_ids', shape=[-1, args.max_seq_len, 1], dtype='int64')
name='pos_ids', shape=[None, args.max_seq_len, 1], dtype='int64')
input_mask = fluid.data(
name='input_mask', shape=[-1, args.max_seq_len, 1], dtype='float32')
name='input_mask', shape=[None, args.max_seq_len, 1], dtype='float32')
padded_labels = fluid.data(
name='padded_labels', shape=[-1, args.max_seq_len, 1], dtype='int64')
name='padded_labels', shape=[None, args.max_seq_len, 1], dtype='int64')
seq_lens = fluid.data(
name='seq_lens', shape=[-1], dtype='int64', lod_level=0)
name='seq_lens', shape=[None], dtype='int64', lod_level=0)
squeeze_labels = fluid.layers.squeeze(padded_labels, axes=[-1])
......
......@@ -20,7 +20,7 @@ import sys
from collections import namedtuple
import numpy as np
sys.path.append("..")
sys.path.append("../shared_modules/")
from preprocess.ernie.task_reader import BaseReader, tokenization
......
......@@ -24,7 +24,7 @@ import paddle
import utils
import reader
import creator
sys.path.append('../models/')
sys.path.append('../shared_modules/models/')
from model_check import check_cuda
from model_check import check_version
......
......@@ -10,7 +10,7 @@ import paddle.fluid as fluid
import creator
import reader
import utils
sys.path.append('../models/')
sys.path.append('../shared_modules/models/')
from model_check import check_cuda
from model_check import check_version
......
......@@ -24,7 +24,7 @@ import paddle
import utils
import reader
import creator
sys.path.append('../models/')
sys.path.append('../shared_modules/models/')
from model_check import check_cuda
from model_check import check_version
......
......@@ -34,7 +34,7 @@ import paddle.fluid as fluid
import creator
import utils
sys.path.append("..")
sys.path.append("../shared_modules/")
from models.representation.ernie import ErnieConfig
from models.model_check import check_cuda
from models.model_check import check_version
......@@ -188,15 +188,16 @@ def do_train(args):
if steps % args.save_steps == 0:
save_path = os.path.join(args.model_save_dir,
"step_" + str(steps))
"step_" + str(steps), "checkpoint")
print("\tsaving model as %s" % (save_path))
fluid.io.save_persistables(exe, save_path, train_program)
fluid.save(train_program, save_path)
if steps % args.validation_steps == 0:
evaluate(exe, test_program, test_pyreader, train_ret)
save_path = os.path.join(args.model_save_dir, "step_" + str(steps))
fluid.io.save_persistables(exe, save_path, train_program)
save_path = os.path.join(args.model_save_dir, "step_" + str(steps),
"checkpoint")
fluid.save(train_program, save_path)
def do_eval(args):
......
......@@ -29,7 +29,7 @@ import reader
import utils
import creator
from eval import test_process
sys.path.append('../models/')
sys.path.append('../shared_modules/models/')
from model_check import check_cuda
from model_check import check_version
......@@ -151,8 +151,8 @@ def do_train(args):
# save checkpoints
if step % args.save_steps == 0 and step != 0:
save_path = os.path.join(args.model_save_dir,
"step_" + str(step))
fluid.io.save_persistables(exe, save_path, train_program)
"step_" + str(step), "checkpoint")
fluid.save(train_program, save_path)
step += 1
if args.enable_ce:
......
......@@ -200,19 +200,11 @@ def init_checkpoint(exe, init_checkpoint_path, main_program):
assert os.path.exists(
init_checkpoint_path), "[%s] cann't be found." % init_checkpoint_path
def existed_persitables(var):
"""
Whether the persistable variable exists in the checkpoint
"""
if not fluid.io.is_persistable(var):
return False
return os.path.exists(os.path.join(init_checkpoint_path, var.name))
fluid.io.load_vars(
exe,
init_checkpoint_path,
main_program=main_program,
predicate=existed_persitables)
try:
checkpoint_path = os.path.join(init_checkpoint_path, "checkpoint")
fluid.load(main_program, checkpoint_path, exe)
except:
fluid.load(main_program, init_checkpoint_path, exe)
print("Load model from {}".format(init_checkpoint_path))
......@@ -224,15 +216,6 @@ def init_pretraining_params(exe,
assert os.path.exists(pretraining_params_path
), "[%s] cann't be found." % pretraining_params_path
def _existed_params(var):
if not isinstance(var, fluid.framework.Parameter):
return False
return os.path.exists(os.path.join(pretraining_params_path, var.name))
fluid.io.load_vars(
exe,
pretraining_params_path,
main_program=main_program,
predicate=_existed_params)
fluid.load(main_program, pretraining_params_path, exe)
print("Load pretraining parameters from {}.".format(
pretraining_params_path))
......@@ -39,4 +39,3 @@ D-NET is a "pretrain-finetune" framework aimed at improving the **generalization ability of reading comprehension models**
- Introduces multi-task, multi-domain learning strategies at the fine-tuning stage (based on the [PALM](https://github.com/PaddlePaddle/PALM) multi-task learning framework), effectively improving the model's generalization across domains
With the D-NET framework, Baidu won first place in the EMNLP 2019 [MRQA](https://mrqa.github.io/shared) international machine reading comprehension evaluation, beating the runner-up by nearly two percentage points and ranking first on 10 of the 12 test sets.
......@@ -106,7 +106,7 @@ python -u main.py \
--prepostprocess_dropout 0.3
```
Training uses all GPUs by default; the GPUs used can be selected via the `CUDA_VISIBLE_DEVICES` environment variable. Training can also run on CPU only (set `--use_cuda False`), which is comparatively slow. If `save_param` and `save_checkpoint` are provided (trained_params and trained_ckpts by default), the current parameter values and checkpoint are saved to the corresponding directories every certain number of iterations (set via `save_step`, default 10000), and every certain number of iterations (set via `print_step`, default 100) a log like the following is printed to standard output:
Training uses all GPUs by default; the GPUs used can be selected via the `CUDA_VISIBLE_DEVICES` environment variable. Training can also run on CPU only (set `--use_cuda False`), which is comparatively slow. If `save_model_path` is provided (saved_models by default), the current training checkpoint is saved to that directory every certain number of iterations (set via `save_step`, default 10000); the checkpoint consists of two files, `transformer.pdparams` and `transformer.pdopt`, recording the model parameters and the optimizer state respectively. Every certain number of iterations (set via `print_step`, default 100) a log like the following is printed to standard output:
```txt
[2019-08-02 15:30:51,656 INFO train.py:262] step_idx: 150100, epoch: 32, batch: 1364, avg loss: 2.880427, normalized loss: 1.504687, ppl: 17.821888, speed: 3.34 step/s
......@@ -195,7 +195,7 @@ BLEU = 26.35, 57.7/32.1/20.0/13.0 (BP=1.000, ratio=1.013, hyp_len=63903, ref_len
### Pretrained Models
Model parameters for the [base model](https://transformer-res.bj.bcebos.com/base_model_params.tar.gz) and [big model](https://transformer-res.bj.bcebos.com/big_model_params.tar.gz) matching the BLEU scores above are available for download (note that the models were trained and tested on the downloadable data).
Model parameters for the [base model](https://transformer-res.bj.bcebos.com/base_model_graph.tar.gz) and [big model](https://transformer-res.bj.bcebos.com/big_model_graph.tar.gz) matching the BLEU scores above are available for download (note that the models were trained and tested on the downloadable data).
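To run prediction with the downloaded parameters, a minimal sketch (the extraction directory `base_model_graph` and the stand-in program below are assumptions; `load` is this repo's helper in `utils/load.py`, shown later, and the `transformer` prefix matches the saving convention above):
```python
import paddle.fluid as fluid
from utils.load import load  # repo helper; tolerates python2-era pickles

exe = fluid.Executor(fluid.CPUPlace())
exe.run(fluid.default_startup_program())
# In predict.py this is the fully built inference program; default_main_program()
# merely stands in for it in this sketch.
test_prog = fluid.default_main_program()
# "base_model_graph/transformer" is the assumed extraction path of the tarball,
# pointing at transformer.pdparams.
load(test_prog, "base_model_graph/transformer", exe)
```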
## Advanced Usage
......
......@@ -12,6 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
def get_input_descs(args):
"""
Generate a dict mapping data fields to the corresponding data shapes and
......@@ -42,7 +43,8 @@ def get_input_descs(args):
# encoder.
# The actual data shape of src_slf_attn_bias is:
# [batch_size, n_head, max_src_len_in_batch, max_src_len_in_batch]
"src_slf_attn_bias": [(batch_size, n_head, seq_len, seq_len), "float32"],
"src_slf_attn_bias":
[(batch_size, n_head, seq_len, seq_len), "float32"],
# The actual data shape of trg_word is:
# [batch_size, max_trg_len_in_batch, 1]
"trg_word": [(batch_size, seq_len), "int64",
......@@ -54,12 +56,14 @@ def get_input_descs(args):
# subsequent words in the decoder.
# The actual data shape of trg_slf_attn_bias is:
# [batch_size, n_head, max_trg_len_in_batch, max_trg_len_in_batch]
"trg_slf_attn_bias": [(batch_size, n_head, seq_len, seq_len), "float32"],
"trg_slf_attn_bias":
[(batch_size, n_head, seq_len, seq_len), "float32"],
# This input is used to remove attention weights on paddings of the source
# input in the encoder-decoder attention.
# The actual data shape of trg_src_attn_bias is:
# [batch_size, n_head, max_trg_len_in_batch, max_src_len_in_batch]
"trg_src_attn_bias": [(batch_size, n_head, seq_len, seq_len), "float32"],
"trg_src_attn_bias":
[(batch_size, n_head, seq_len, seq_len), "float32"],
# This input is used in independent decoder program for inference.
# The actual data shape of enc_output is:
# [batch_size, max_src_len_in_batch, d_model]
......@@ -80,6 +84,7 @@ def get_input_descs(args):
return input_descs
# Names of word embedding table which might be reused for weight sharing.
word_emb_param_names = (
"src_word_emb_table",
......
......@@ -24,6 +24,7 @@ import paddle.fluid as fluid
from utils.input_field import InputField
from utils.configure import PDConfig
from utils.load import load
# include task-specific libs
import desc
......@@ -31,51 +32,6 @@ import reader
from transformer import create_net
def init_from_pretrain_model(args, exe, program):
assert isinstance(args.init_from_pretrain_model, str)
if not os.path.exists(args.init_from_pretrain_model):
raise Warning("The pretrained params do not exist.")
return False
def existed_params(var):
if not isinstance(var, fluid.framework.Parameter):
return False
return os.path.exists(
os.path.join(args.init_from_pretrain_model, var.name))
fluid.io.load_vars(
exe,
args.init_from_pretrain_model,
main_program=program,
predicate=existed_params)
print("finish initing model from pretrained params from %s" %
(args.init_from_pretrain_model))
return True
def init_from_params(args, exe, program):
assert isinstance(args.init_from_params, str)
if not os.path.exists(args.init_from_params):
raise Warning("the params path does not exist.")
return False
fluid.io.load_params(
executor=exe,
dirname=args.init_from_params,
main_program=program,
filename="params.pdparams")
print("finish init model from params from %s" % (args.init_from_params))
return True
def do_save_inference_model(args):
if args.use_cuda:
dev_count = fluid.core.get_cuda_device_count()
......@@ -84,6 +40,11 @@ def do_save_inference_model(args):
dev_count = int(os.environ.get('CPU_NUM', 1))
place = fluid.CPUPlace()
src_vocab = reader.DataProcessor.load_dict(args.src_vocab_fpath)
trg_vocab = reader.DataProcessor.load_dict(args.trg_vocab_fpath)
args.src_vocab_size = len(src_vocab)
args.trg_vocab_size = len(trg_vocab)
test_prog = fluid.default_main_program()
startup_prog = fluid.default_startup_program()
......@@ -119,13 +80,10 @@ def do_save_inference_model(args):
exe = fluid.Executor(place)
exe.run(startup_prog)
assert (args.init_from_params) or (args.init_from_pretrain_model)
if args.init_from_params:
init_from_params(args, exe, test_prog)
elif args.init_from_pretrain_model:
init_from_pretrain_model(args, exe, test_prog)
assert (
args.init_from_params), "must set init_from_params to load parameters"
load(test_prog, os.path.join(args.init_from_params, "transformer"), exe)
print("finish initing model from params from %s" % (args.init_from_params))
# saving inference model
......
......@@ -25,7 +25,6 @@ from train import do_train
from predict import do_predict
from inference_model import do_save_inference_model
if __name__ == "__main__":
LOG_FORMAT = "[%(asctime)s %(levelname)s %(filename)s:%(lineno)d] %(message)s"
logging.basicConfig(
......
......@@ -25,6 +25,7 @@ import paddle.fluid as fluid
from utils.input_field import InputField
from utils.configure import PDConfig
from utils.check import check_gpu, check_version
from utils.load import load
# include task-specific libs
import desc
......@@ -32,51 +33,6 @@ import reader
from transformer import create_net, position_encoding_init
def init_from_pretrain_model(args, exe, program):
assert isinstance(args.init_from_pretrain_model, str)
if not os.path.exists(args.init_from_pretrain_model):
raise Warning("The pretrained params do not exist.")
return False
def existed_params(var):
if not isinstance(var, fluid.framework.Parameter):
return False
return os.path.exists(
os.path.join(args.init_from_pretrain_model, var.name))
fluid.io.load_vars(
exe,
args.init_from_pretrain_model,
main_program=program,
predicate=existed_params)
print("finish initing model from pretrained params from %s" %
(args.init_from_pretrain_model))
return True
def init_from_params(args, exe, program):
assert isinstance(args.init_from_params, str)
if not os.path.exists(args.init_from_params):
raise Warning("the params path does not exist.")
return False
fluid.io.load_params(
executor=exe,
dirname=args.init_from_params,
main_program=program,
filename="params.pdparams")
print("finish init model from params from %s" % (args.init_from_params))
return True
def post_process_seq(seq, bos_idx, eos_idx, output_bos=False, output_eos=False):
"""
Post-process the beam-search decoded sequence. Truncate from the first
......@@ -160,13 +116,10 @@ def do_predict(args):
exe = fluid.Executor(place)
exe.run(startup_prog)
assert (args.init_from_params) or (args.init_from_pretrain_model)
if args.init_from_params:
init_from_params(args, exe, test_prog)
elif args.init_from_pretrain_model:
init_from_pretrain_model(args, exe, test_prog)
assert (
args.init_from_params), "must set init_from_params to load parameters"
load(test_prog, os.path.join(args.init_from_params, "transformer"), exe)
print("finish initing model from params from %s" % (args.init_from_params))
# to avoid a longer length than training, reset the size of position encoding to max_length
for pos_enc_param_name in desc.pos_enc_param_names:
......
......@@ -27,6 +27,7 @@ import utils.dist_utils as dist_utils
from utils.input_field import InputField
from utils.configure import PDConfig
from utils.check import check_gpu, check_version
from utils.load import load
# include task-specific libs
import desc
......@@ -39,91 +40,6 @@ if os.environ.get('FLAGS_eager_delete_tensor_gb', None) is None:
num_trainers = int(os.environ.get('PADDLE_TRAINERS_NUM', 1))
def init_from_pretrain_model(args, exe, program):
assert isinstance(args.init_from_pretrain_model, str)
if not os.path.exists(args.init_from_pretrain_model):
raise Warning("The pretrained params do not exist.")
return False
def existed_params(var):
if not isinstance(var, fluid.framework.Parameter):
return False
return os.path.exists(
os.path.join(args.init_from_pretrain_model, var.name))
fluid.io.load_vars(
exe,
args.init_from_pretrain_model,
main_program=program,
predicate=existed_params)
print("finish initing model from pretrained params from %s" %
(args.init_from_pretrain_model))
return True
def init_from_checkpoint(args, exe, program):
assert isinstance(args.init_from_checkpoint, str)
if not os.path.exists(args.init_from_checkpoint):
raise Warning("the checkpoint path does not exist.")
return False
fluid.io.load_persistables(
executor=exe,
dirname=args.init_from_checkpoint,
main_program=program,
filename="checkpoint.pdckpt")
print("finish initing model from checkpoint from %s" %
(args.init_from_checkpoint))
return True
def save_checkpoint(args, exe, program, dirname):
assert isinstance(args.save_model_path, str)
checkpoint_dir = os.path.join(args.save_model_path, args.save_checkpoint)
if not os.path.exists(checkpoint_dir):
os.mkdir(checkpoint_dir)
fluid.io.save_persistables(
exe,
os.path.join(checkpoint_dir, dirname),
main_program=program,
filename="checkpoint.pdparams")
print("save checkpoint at %s" % (os.path.join(checkpoint_dir, dirname)))
return True
def save_param(args, exe, program, dirname):
assert isinstance(args.save_model_path, str)
param_dir = os.path.join(args.save_model_path, args.save_param)
if not os.path.exists(param_dir):
os.mkdir(param_dir)
fluid.io.save_params(
exe,
os.path.join(param_dir, dirname),
main_program=program,
filename="params.pdparams")
print("save parameters at %s" % (os.path.join(param_dir, dirname)))
return True
def do_train(args):
if args.use_cuda:
if num_trainers > 1: # for multi-process gpu training
......@@ -226,11 +142,17 @@ def do_train(args):
## init from some checkpoint, to resume the previous training
if args.init_from_checkpoint:
init_from_checkpoint(args, exe, train_prog)
load(train_prog,
os.path.join(args.init_from_checkpoint, "transformer"), exe)
print("finish initing model from checkpoint from %s" %
(args.init_from_checkpoint))
## init from some pretrain models, to better solve the current task
if args.init_from_pretrain_model:
init_from_pretrain_model(args, exe, train_prog)
load(train_prog,
os.path.join(args.init_from_pretrain_model, "transformer"), exe)
print("finish initing model from pretrained params from %s" %
(args.init_from_pretrain_model))
build_strategy = fluid.compiler.BuildStrategy()
build_strategy.enable_inplace = True
......@@ -293,14 +215,11 @@ def do_train(args):
avg_batch_time = time.time()
if step_idx % args.save_step == 0 and step_idx != 0:
if args.save_checkpoint:
save_checkpoint(args, exe, train_prog,
"step_" + str(step_idx))
if args.save_param:
save_param(args, exe, train_prog,
"step_" + str(step_idx))
if args.save_model_path:
model_path = os.path.join(args.save_model_path,
"step_" + str(step_idx),
"transformer")
fluid.save(train_prog, model_path)
batch_id += 1
step_idx += 1
......@@ -319,11 +238,10 @@ def do_train(args):
time_consumed = time.time() - pass_start_time
if args.save_checkpoint:
save_checkpoint(args, exe, train_prog, "step_final")
if args.save_param:
save_param(args, exe, train_prog, "step_final")
if args.save_model_path:
model_path = os.path.join(args.save_model_path, "step_final",
"transformer")
fluid.save(train_prog, model_path)
if args.enable_ce: # For CE
print("kpis\ttrain_cost_card%d\t%f" % (dev_count, total_avg_cost))
......
......@@ -17,6 +17,7 @@ import numpy as np
import paddle.fluid as fluid
import paddle.fluid.layers as layers
from paddle.fluid.layers.utils import map_structure
from desc import *
......@@ -90,7 +91,6 @@ def multi_head_attention(queries,
n_head=1,
dropout_rate=0.,
cache=None,
gather_idx=None,
static_kv=False):
"""
Multi-Head Attention. Note that attn_bias is added to the logit before
......@@ -161,30 +161,28 @@ def multi_head_attention(queries,
v = transpose_layer(x=reshaped_v, perm=[0, 2, 1, 3])
if cache is not None: # only for faster inference
cache_, i = cache
if static_kv: # For encoder-decoder attention in inference
cache_k, cache_v = cache["static_k"], cache["static_v"]
# To init the static_k and static_v in cache.
# Maybe we can use condition_op(if_else) to do these at the first
# step in while loop to replace these, however it might be less
# efficient.
cache_k, cache_v = cache_["static_k"], cache_["static_v"]
# To init the static_k and static_v in global block.
static_cache_init = wrap_layer_with_block(
layers.assign,
fluid.default_main_program().current_block().parent_idx)
static_cache_init(k, cache_k)
static_cache_init(v, cache_v)
static_cache_init(
k,
fluid.default_main_program().global_block().var(
"static_k_%d" % i))
static_cache_init(
v,
fluid.default_main_program().global_block().var(
"static_v_%d" % i))
k, v = cache_k, cache_v
else: # For decoder self-attention in inference
cache_k, cache_v = cache["k"], cache["v"]
# gather cell states corresponding to selected parent
select_k = layers.gather(cache_k, index=gather_idx)
select_v = layers.gather(cache_v, index=gather_idx)
if not static_kv:
# For self attention in inference, use cache and concat time steps.
select_k = layers.concat([select_k, k], axis=2)
select_v = layers.concat([select_v, v], axis=2)
# update cell states(caches) cached in global block
layers.assign(select_k, cache_k)
layers.assign(select_v, cache_v)
return q, select_k, select_v
# use cache and concat time steps.
cache_k, cache_v = cache_["k"], cache_["v"]
k = layers.concat([cache_k, k], axis=2)
v = layers.concat([cache_v, v], axis=2)
cache_["k"], cache_["v"] = (k, v)
return q, k, v
def __combine_heads(x):
......@@ -301,12 +299,13 @@ def prepare_encoder_decoder(src_word,
src_word,
size=[src_vocab_size, src_emb_dim],
padding_idx=bos_idx, # set embedding of bos to 0
param_attr=fluid.ParamAttr(name=word_emb_param_name,
initializer=fluid.initializer.Normal(
0., src_emb_dim**-0.5)))
param_attr=fluid.ParamAttr(
name=word_emb_param_name,
initializer=fluid.initializer.Normal(0., src_emb_dim**-0.5)))
src_word_emb = layers.scale(x=src_word_emb, scale=src_emb_dim**0.5)
src_pos_enc = fluid.embedding(src_pos,
src_pos_enc = fluid.embedding(
src_pos,
size=[src_max_len, src_emb_dim],
param_attr=fluid.ParamAttr(
name=pos_enc_param_name, trainable=False))
......@@ -405,8 +404,7 @@ def decoder_layer(dec_input,
relu_dropout,
preprocess_cmd,
postprocess_cmd,
cache=None,
gather_idx=None):
cache=None):
""" The layer to be stacked in decoder part.
The structure of this module is similar to that in the encoder part except
a multi-head attention is added to implement encoder-decoder attention.
......@@ -421,8 +419,7 @@ def decoder_layer(dec_input,
d_model,
n_head,
attention_dropout,
cache=cache,
gather_idx=gather_idx)
cache=cache)
slf_attn_output = post_process_layer(
dec_input,
slf_attn_output,
......@@ -440,7 +437,6 @@ def decoder_layer(dec_input,
n_head,
attention_dropout,
cache=cache,
gather_idx=gather_idx,
static_kv=True)
enc_attn_output = post_process_layer(
slf_attn_output,
......@@ -476,8 +472,7 @@ def decoder(dec_input,
relu_dropout,
preprocess_cmd,
postprocess_cmd,
caches=None,
gather_idx=None):
caches=None):
"""
The decoder is composed of a stack of identical decoder_layer layers.
"""
......@@ -497,8 +492,7 @@ def decoder(dec_input,
relu_dropout,
preprocess_cmd,
postprocess_cmd,
cache=None if caches is None else caches[i],
gather_idx=gather_idx)
cache=None if caches is None else (caches[i], i))
dec_input = dec_output
dec_output = pre_process_layer(dec_output, preprocess_cmd,
prepostprocess_dropout)
......@@ -536,7 +530,8 @@ def transformer(model_input,
label = model_input.lbl_word
weights = model_input.lbl_weight
enc_output = wrap_encoder(enc_inputs,
enc_output = wrap_encoder(
enc_inputs,
src_vocab_size,
max_length,
n_layer,
......@@ -553,7 +548,8 @@ def transformer(model_input,
weight_sharing,
bos_idx=bos_idx)
predict = wrap_decoder(dec_inputs,
predict = wrap_decoder(
dec_inputs,
trg_vocab_size,
max_length,
n_layer,
......@@ -575,8 +571,9 @@ def transformer(model_input,
if label_smooth_eps:
# TODO: use fluid.input.one_hot after softmax_with_cross_entropy removing
# the enforcement that the last dimension of label must be 1.
label = layers.label_smooth(label=layers.one_hot(input=label,
depth=trg_vocab_size),
label = layers.label_smooth(
label=layers.one_hot(
input=label, depth=trg_vocab_size),
epsilon=label_smooth_eps)
cost = layers.softmax_with_cross_entropy(
......@@ -654,7 +651,6 @@ def wrap_decoder(dec_inputs,
weight_sharing,
enc_output=None,
caches=None,
gather_idx=None,
bos_idx=0):
"""
The wrapper assembles together all needed layers for the decoder.
......@@ -687,8 +683,7 @@ def wrap_decoder(dec_inputs,
relu_dropout,
preprocess_cmd,
postprocess_cmd,
caches=caches,
gather_idx=gather_idx)
caches=caches)
# Reshape to 2D tensor to use GEMM instead of BatchedGEMM
dec_output = layers.reshape(
dec_output, shape=[-1, dec_output.shape[-1]], inplace=True)
......@@ -722,7 +717,8 @@ def fast_decode(model_input, src_vocab_size, trg_vocab_size, max_in_len,
dec_inputs = (model_input.trg_word, model_input.init_score,
model_input.init_idx, model_input.trg_src_attn_bias)
enc_output = wrap_encoder(enc_inputs,
enc_output = wrap_encoder(
enc_inputs,
src_vocab_size,
max_in_len,
n_layer,
......@@ -748,8 +744,6 @@ def fast_decode(model_input, src_vocab_size, trg_vocab_size, max_in_len,
force_cpu=True)
step_idx = layers.fill_constant(
shape=[1], dtype=start_tokens.dtype, value=0, force_cpu=True)
cond = layers.less_than(x=step_idx, y=max_len) # default force_cpu=True
while_op = layers.While(cond)
# array states will be stored for each step.
ids = layers.array_write(
layers.reshape(start_tokens, (-1, 1)), step_idx)
......@@ -773,21 +767,31 @@ def fast_decode(model_input, src_vocab_size, trg_vocab_size, max_in_len,
dtype=enc_output.dtype,
value=0),
"static_k": # for encoder-decoder attention
layers.create_tensor(dtype=enc_output.dtype),
fluid.data(
shape=[None, n_head, 0, d_key],
dtype=enc_output.dtype,
name=("static_k_%d" % i)),
"static_v": # for encoder-decoder attention
layers.create_tensor(dtype=enc_output.dtype)
fluid.data(
shape=[None, n_head, 0, d_value],
dtype=enc_output.dtype,
name=("static_v_%d" % i)),
} for i in range(n_layer)
]
with while_op.block():
pre_ids = layers.array_read(array=ids, i=step_idx)
# Since beam_search_op doesn't enforce pre_ids' shape, we can do
# inplace reshape here which actually change the shape of pre_ids.
# pre_ids = layers.reshape(pre_ids, (-1, 1, 1), inplace=True)
pre_scores = layers.array_read(array=scores, i=step_idx)
def cond_func(step_idx, selected_ids, selected_scores, gather_idx,
caches, trg_src_attn_bias):
length_cond = layers.less_than(x=step_idx, y=max_len)
finish_cond = layers.logical_not(layers.is_empty(x=selected_ids))
return layers.logical_and(x=length_cond, y=finish_cond)
def body_func(step_idx, pre_ids, pre_scores, gather_idx, caches,
trg_src_attn_bias):
# gather cell states corresponding to selected parent
pre_caches = map_structure(
lambda x: layers.gather(x, index=gather_idx), caches)
pre_src_attn_bias = layers.gather(
trg_src_attn_bias, index=parent_idx)
trg_src_attn_bias, index=gather_idx)
pre_pos = layers.elementwise_mul(
x=layers.fill_constant_batch_size_like(
input=pre_src_attn_bias, # cannot use lod tensor here
......@@ -796,7 +800,8 @@ def fast_decode(model_input, src_vocab_size, trg_vocab_size, max_in_len,
dtype=pre_ids.dtype),
y=step_idx,
axis=0)
logits = wrap_decoder((pre_ids, pre_pos, None, pre_src_attn_bias),
logits = wrap_decoder(
(pre_ids, pre_pos, None, pre_src_attn_bias),
trg_vocab_size,
max_in_len,
n_layer,
......@@ -812,8 +817,7 @@ def fast_decode(model_input, src_vocab_size, trg_vocab_size, max_in_len,
postprocess_cmd,
weight_sharing,
enc_output=enc_output,
caches=caches,
gather_idx=parent_idx,
caches=pre_caches,
bos_idx=bos_idx)
# intra-beam topK
topk_scores, topk_indices = layers.topk(
......@@ -832,16 +836,20 @@ def fast_decode(model_input, src_vocab_size, trg_vocab_size, max_in_len,
beam_size=beam_size,
end_id=eos_idx,
return_parent_idx=True)
layers.increment(x=step_idx, value=1.0, in_place=True)
# cell states(caches) have been updated in wrap_decoder,
# only need to update beam search states here.
step_idx = layers.increment(x=step_idx, value=1.0, in_place=False)
layers.array_write(selected_ids, i=step_idx, array=ids)
layers.array_write(selected_scores, i=step_idx, array=scores)
layers.assign(gather_idx, parent_idx)
layers.assign(pre_src_attn_bias, trg_src_attn_bias)
length_cond = layers.less_than(x=step_idx, y=max_len)
finish_cond = layers.logical_not(layers.is_empty(x=selected_ids))
layers.logical_and(x=length_cond, y=finish_cond, out=cond)
return (step_idx, selected_ids, selected_scores, gather_idx,
pre_caches, pre_src_attn_bias)
_ = layers.while_loop(
cond=cond_func,
body=body_func,
loop_vars=[
step_idx, start_tokens, init_scores, parent_idx, caches,
trg_src_attn_bias
],
is_test=True)
finished_ids, finished_scores = layers.beam_search_decode(
ids, scores, beam_size=beam_size, end_id=eos_idx)
......
......@@ -11,10 +11,11 @@ init_from_checkpoint: ""
init_from_pretrain_model: ""
# path of trained parameter, to make prediction
init_from_params: "trained_params/step_100000"
save_model_path: ""
# the directory for saving checkpoints.
# the directory for saving models.
save_model_path: "saved_models"
# deprecated, the directory for saving checkpoints.
save_checkpoint: "trained_ckpts"
# the directory for saving trained parameters.
# deprecated, the directory for saving trained parameters.
save_param: "trained_params"
# the directory for saving inference model.
inference_model_dir: "infer_model"
......
......@@ -199,9 +199,14 @@ class PDConfig(object):
"Whether to perform model saving for inference.")
# NOTE: args for profiler
self.default_g.add_arg("is_profiler", int, 0, "the switch of profiler tools. (used for benchmark)")
self.default_g.add_arg("profiler_path", str, './', "the profiler output file path. (used for benchmark)")
self.default_g.add_arg("max_iter", int, 0, "the max train batch num.(used for benchmark)")
self.default_g.add_arg(
"is_profiler", int, 0,
"the switch of profiler tools. (used for benchmark)")
self.default_g.add_arg(
"profiler_path", str, './',
"the profiler output file path. (used for benchmark)")
self.default_g.add_arg("max_iter", int, 0,
"the max train batch num.(used for benchmark)")
self.parser = parser
......
import pickle
import six
import warnings
from functools import partial
import paddle.fluid as fluid
def load(program, model_path, executor=None, var_list=None):
"""
To load python2 saved models in python3.
"""
try:
fluid.load(program, model_path, executor, var_list)
except UnicodeDecodeError:
warnings.warn(
"An UnicodeDecodeError is catched, which might be caused by loading "
"a python2 saved model. Encoding of pickle.load would be set and "
"load again automatically.")
if six.PY3:
load_bak = pickle.load
pickle.load = partial(load_bak, encoding="latin1")
fluid.load(program, model_path, executor, var_list)
pickle.load = load_bak
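A usage sketch for this helper (the checkpoint path is hypothetical): it behaves like `fluid.load`, but if unpickling fails with a UnicodeDecodeError it retries with `encoding="latin1"` so python2-era checkpoints still load under python3.
```python
import paddle.fluid as fluid

exe = fluid.Executor(fluid.CPUPlace())
exe.run(fluid.default_startup_program())
prog = fluid.default_main_program()
# Hypothetical path; prog's variables must match the saved checkpoint.
load(prog, "saved_models/step_final/transformer", exe)
```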
......@@ -22,6 +22,8 @@
| :------| :------: | :------: |:------: |:------: |
| [BERT-Large, Uncased (Whole Word Masking)](https://bert-models.bj.bcebos.com/wwm_uncased_L-24_H-1024_A-16.tar.gz)| 24 | 1024 | 16 | 340M |
| [BERT-Large, Cased (Whole Word Masking)](https://bert-models.bj.bcebos.com/wwm_cased_L-24_H-1024_A-16.tar.gz)| 24 | 1024 | 16 | 340M |
| [RoBERTa-Base, Chinese](https://bert-models.bj.bcebos.com/chinese_roberta_wwm_ext_L-12_H-768_A-12.tar.gz) | 12 | 768 |12 |110M |
| [RoBERTa-Large, Chinese](https://bert-models.bj.bcebos.com/chinese_roberta_wwm_large_ext_L-24_H-1024_A-16.tar.gz) | 24 | 1024 |16 |340M |
| [BERT-Base, Uncased](https://bert-models.bj.bcebos.com/uncased_L-12_H-768_A-12.tar.gz) | 12 | 768 |12 |110M |
| [BERT-Large, Uncased](https://bert-models.bj.bcebos.com/uncased_L-24_H-1024_A-16.tar.gz) | 24 | 1024 |16 |340M |
|[BERT-Base, Cased](https://bert-models.bj.bcebos.com/cased_L-12_H-768_A-12.tar.gz)|12|768|12|110M|
......@@ -415,5 +417,3 @@ for (size_t i = 0; i < output.front().data.length() / sizeof(float); i += 3) {
<< static_cast<float *>(output.front().data.data())[i + 2] << std::endl;
}
```
......@@ -158,7 +158,7 @@ def optimization(loss,
else:
if weight_decay > 0:
for param in train_program.global_block().all_parameters():
for param in train_program.all_parameters():
param_list[param.name] = param * 1.0
param_list[param.name].stop_gradient = True
......
......@@ -392,7 +392,7 @@ def main(args):
if steps % args.save_steps == 0:
save_path = os.path.join(args.checkpoints,
"step_" + str(steps))
fluid.io.save_persistables(exe, save_path, train_program)
fluid.save(program=train_program, model_path=save_path)
if steps % args.validation_steps == 0:
print("Average throughtput: %s" % (np.average(throughput)))
......@@ -409,7 +409,7 @@ def main(args):
"test")
except fluid.core.EOFException:
save_path = os.path.join(args.checkpoints, "step_" + str(steps))
fluid.io.save_persistables(exe, save_path, train_program)
fluid.save(program=train_program, model_path=save_path)
train_data_loader.reset()
break
if args.enable_ce:
......
......@@ -398,11 +398,11 @@ def train(args):
if steps % args.save_steps == 0 or steps == max_train_steps:
save_path = os.path.join(args.checkpoints,
"step_" + str(steps))
fluid.io.save_persistables(exe, save_path, train_program)
fluid.save(program=train_program, model_path=save_path)
except fluid.core.EOFException:
save_path = os.path.join(args.checkpoints,
"step_" + str(steps) + "_final")
fluid.io.save_persistables(exe, save_path, train_program)
fluid.save(program=train_program, model_path=save_path)
train_data_loader.reset()
break
......
......@@ -412,7 +412,7 @@ def train(args):
if steps % args.save_steps == 0:
save_path = os.path.join(args.checkpoints, "step_" + str(steps))
fluid.io.save_persistables(exe, save_path, train_program)
fluid.save(program=train_program, model_path=save_path)
if args.validation_set_dir and steps % args.validation_steps == 0:
vali_cost, vali_lm_cost, vali_acc, vali_steps, vali_speed = predict(
......
......@@ -25,7 +25,7 @@ import paddle.fluid as fluid
def cast_fp32_to_fp16(exe, main_program):
print("Cast parameters to float16 data format.")
for param in main_program.global_block().all_parameters():
for param in main_program.all_parameters():
if not param.name.endswith(".master"):
param_t = fluid.global_scope().find_var(param.name).get_tensor()
data = np.array(param_t)
......@@ -38,21 +38,9 @@ def cast_fp32_to_fp16(exe, main_program):
def init_checkpoint(exe, init_checkpoint_path, main_program, use_fp16=False):
assert os.path.exists(
init_checkpoint_path), "[%s] cann't be found." % init_checkpoint_path
fluid.load(
program=main_program, model_path=init_checkpoint_path, executor=exe)
def existed_persitables(var):
if not fluid.io.is_persistable(var):
return False
if os.path.exists(os.path.join(init_checkpoint_path, var.name)):
print("INIT {}".format(var.name))
return True
fluid.io.load_vars(
exe,
init_checkpoint_path,
main_program=main_program,
predicate=existed_persitables)
print("Load model from {}".format(init_checkpoint_path))
if use_fp16:
......@@ -63,24 +51,8 @@ def init_pretraining_params(exe,
pretraining_params_path,
main_program,
use_fp16=False):
assert os.path.exists(pretraining_params_path
), "[%s] cann't be found." % pretraining_params_path
def existed_params(var):
if not isinstance(var, fluid.framework.Parameter):
return False
if os.path.exists(os.path.join(pretraining_params_path, var.name)):
print("INIT {}".format(var.name))
return True
else:
print("SKIP {}".format(var.name))
return False
fluid.io.load_vars(
exe,
pretraining_params_path,
main_program=main_program,
predicate=existed_params)
fluid.load(
program=main_program, model_path=pretraining_params_path, executor=exe)
print("Load pretraining parameters from {}.".format(
pretraining_params_path))
......
......@@ -90,5 +90,3 @@ word_embedding=fluid.layers.concat(input=[elmo_embedding, word_embedding], axis=
### References
[Deep contextualized word representations](https://arxiv.org/abs/1802.05365)
......@@ -7,7 +7,6 @@ from kpi import CostKpi, DurationKpi, AccKpi
#### NOTE kpi.py should shared in models in some way!!!!
train_duration_sts_b_card1 = DurationKpi(
'train_duration_sts_b_card1', 0.01, 0, actived=True)
train_cost_sts_b_card1 = CostKpi(
......
......@@ -29,7 +29,7 @@
1. PaddlePaddle Installation
This project requires PaddlePaddle Fluid 1.6 or above; see the [installation guide](http://www.paddlepaddle.org/#quick-start) for setup
This project requires PaddlePaddle Fluid 1.7 or above; see the [installation guide](http://www.paddlepaddle.org/#quick-start) for setup
2. Code Installation
......
......@@ -13,6 +13,7 @@ from run_classifier import create_model
import utils
import reader
def do_save_inference_model(args):
if args.use_cuda:
dev_count = fluid.core.get_cuda_device_count()
......@@ -53,6 +54,7 @@ def do_save_inference_model(args):
print("save inference model at %s" % (args.inference_model_dir))
def inference(exe, test_program, test_pyreader, fetch_list, infer_phrase):
"""
Inference Function
......@@ -61,13 +63,16 @@ def inference(exe, test_program, test_pyreader, fetch_list, infer_phrase):
test_pyreader.start()
while True:
try:
np_props = exe.run(program=test_program, fetch_list=fetch_list, return_numpy=True)
np_props = exe.run(program=test_program,
fetch_list=fetch_list,
return_numpy=True)
for probs in np_props[0]:
print("%d\t%f\t%f" % (np.argmax(probs), probs[0], probs[1]))
except fluid.core.EOFException:
test_pyreader.reset()
break
def test_inference_model(args):
if args.use_cuda:
dev_count = fluid.core.get_cuda_device_count()
......@@ -92,7 +97,8 @@ def test_inference_model(args):
exe = fluid.Executor(place)
exe.run(startup_prog)
processor = reader.SentaProcessor(data_dir=args.data_dir,
processor = reader.SentaProcessor(
data_dir=args.data_dir,
vocab_path=args.vocab_path,
random_seed=args.random_seed,
max_seq_len=args.max_seq_len)
......@@ -107,14 +113,14 @@ def test_inference_model(args):
params_filename="params.pdparams")
infer_data_generator = processor.data_generator(
batch_size=args.batch_size,
batch_size=args.batch_size // dev_count,  # integer division: per-device batch size
phase="infer",
epoch=1,
shuffle=False)
infer_pyreader.decorate_sample_list_generator(infer_data_generator)
inference(exe, test_prog, infer_pyreader,
[probs.name], "infer")
infer_pyreader.set_sample_list_generator(infer_data_generator)
inference(exe, test_prog, infer_pyreader, [probs.name], "infer")
if __name__ == "__main__":
args = PDConfig('senta_config.json')
......
# -*- coding: utf_8 -*-
import os
import sys
sys.path.append("../")
sys.path.append("../models/classification")
sys.path.append("../shared_modules/")
sys.path.append("../shared_modules/models/classification")
import paddle
import paddle.fluid as fluid
import numpy as np
......@@ -17,6 +17,7 @@ from models.representation.ernie import ErnieConfig
from models.representation.ernie import ernie_encoder, ernie_encoder_with_paddle_hub
from preprocess.ernie import task_reader
def do_save_inference_model(args):
ernie_config = ErnieConfig(args.ernie_config_path)
......@@ -37,18 +38,17 @@ def do_save_inference_model(args):
with fluid.program_guard(test_prog, startup_prog):
with fluid.unique_name.guard():
infer_pyreader, ernie_inputs, labels = ernie_pyreader(
args,
pyreader_name="infer_reader")
args, pyreader_name="infer_reader")
if args.use_paddle_hub:
embeddings = ernie_encoder_with_paddle_hub(ernie_inputs, args.max_seq_len)
embeddings = ernie_encoder_with_paddle_hub(ernie_inputs,
args.max_seq_len)
else:
embeddings = ernie_encoder(ernie_inputs, ernie_config=ernie_config)
embeddings = ernie_encoder(
ernie_inputs, ernie_config=ernie_config)
probs = create_model(args,
embeddings,
labels=labels,
is_prediction=True)
probs = create_model(
args, embeddings, labels=labels, is_prediction=True)
test_prog = test_prog.clone(for_test=True)
exe.run(startup_prog)
......@@ -59,11 +59,11 @@ def do_save_inference_model(args):
fluid.io.save_inference_model(
args.inference_model_dir,
feeded_var_names=[ernie_inputs["src_ids"].name,
ernie_inputs["sent_ids"].name,
ernie_inputs["pos_ids"].name,
ernie_inputs["input_mask"].name,
ernie_inputs["seq_lens"].name],
feeded_var_names=[
ernie_inputs["src_ids"].name, ernie_inputs["sent_ids"].name,
ernie_inputs["pos_ids"].name, ernie_inputs["input_mask"].name,
ernie_inputs["seq_lens"].name
],
target_vars=[probs],
executor=exe,
main_program=test_prog,
......@@ -72,6 +72,7 @@ def do_save_inference_model(args):
print("save inference model at %s" % (args.inference_model_dir))
def inference(exe, test_program, test_pyreader, fetch_list, infer_phrase):
"""
Inference Function
......@@ -80,13 +81,16 @@ def inference(exe, test_program, test_pyreader, fetch_list, infer_phrase):
test_pyreader.start()
while True:
try:
np_props = exe.run(program=test_program, fetch_list=fetch_list, return_numpy=True)
np_props = exe.run(program=test_program,
fetch_list=fetch_list,
return_numpy=True)
for probs in np_props[0]:
print("%d\t%f\t%f" % (np.argmax(probs), probs[0], probs[1]))
except fluid.core.EOFException:
test_pyreader.reset()
break
def test_inference_model(args):
ernie_config = ErnieConfig(args.ernie_config_path)
ernie_config.print_config()
......@@ -113,15 +117,11 @@ def test_inference_model(args):
with fluid.program_guard(test_prog, startup_prog):
with fluid.unique_name.guard():
infer_pyreader, ernie_inputs, labels = ernie_pyreader(
args,
pyreader_name="infer_pyreader")
args, pyreader_name="infer_pyreader")
embeddings = ernie_encoder(ernie_inputs, ernie_config=ernie_config)
probs = create_model(
args,
embeddings,
labels=labels,
is_prediction=True)
args, embeddings, labels=labels, is_prediction=True)
test_prog = test_prog.clone(for_test=True)
exe.run(startup_prog)
......@@ -129,7 +129,7 @@ def test_inference_model(args):
assert (args.inference_model_dir)
infer_data_generator = reader.data_generator(
input_file=args.test_set,
batch_size=args.batch_size,
batch_size=args.batch_size // dev_count,  # integer division: per-device batch size
phase="infer",
epoch=1,
shuffle=False)
......@@ -140,9 +140,9 @@ def test_inference_model(args):
model_filename="model.pdmodel",
params_filename="params.pdparams")
infer_pyreader.decorate_batch_generator(infer_data_generator)
inference(exe, test_prog, infer_pyreader,
[probs.name], "infer")
infer_pyreader.set_batch_generator(infer_data_generator)
inference(exe, test_prog, infer_pyreader, [probs.name], "infer")
if __name__ == "__main__":
args = PDConfig()
......
......@@ -12,8 +12,8 @@ import argparse
import numpy as np
import multiprocessing
import sys
sys.path.append("../models/classification/")
sys.path.append("../")
sys.path.append("../shared_modules/models/classification/")
sys.path.append("../shared_modules/")
from nets import bow_net
from nets import lstm_net
......@@ -30,24 +30,19 @@ import paddle.fluid as fluid
import reader
from utils import init_checkpoint
def create_model(args,
pyreader_name,
num_labels,
is_prediction=False):
def create_model(args, pyreader_name, num_labels, is_prediction=False):
"""
Create Model for sentiment classification
"""
data = fluid.layers.data(
name="src_ids", shape=[-1, args.max_seq_len], dtype='int64')
label = fluid.layers.data(
name="label", shape=[-1, 1], dtype="int64")
seq_len = fluid.layers.data(
name="seq_len", shape=[-1], dtype="int64")
data = fluid.data(
name="src_ids", shape=[None, args.max_seq_len], dtype='int64')
label = fluid.data(name="label", shape=[None, 1], dtype="int64")
seq_len = fluid.data(name="seq_len", shape=[None], dtype="int64")
data_reader = fluid.io.PyReader(feed_list=[data, label, seq_len],
capacity=4, iterable=False)
data_reader = fluid.io.DataLoader.from_generator(
feed_list=[data, label, seq_len], capacity=4, iterable=False)
if args.model_type == "bilstm_net":
network = bilstm_net
......@@ -63,18 +58,19 @@ def create_model(args,
raise ValueError("Unknown network type!")
if is_prediction:
probs = network(data, seq_len, None, args.vocab_size, is_prediction=is_prediction)
probs = network(
data, seq_len, None, args.vocab_size, is_prediction=is_prediction)
print("create inference model...")
return data_reader, probs, [data.name, seq_len.name]
ce_loss, probs = network(data, seq_len, label, args.vocab_size, is_prediction=is_prediction)
ce_loss, probs = network(
data, seq_len, label, args.vocab_size, is_prediction=is_prediction)
loss = fluid.layers.mean(x=ce_loss)
num_seqs = fluid.layers.create_tensor(dtype='int64')
accuracy = fluid.layers.accuracy(input=probs, label=label, total=num_seqs)
return data_reader, loss, accuracy, num_seqs
def evaluate(exe, test_program, test_pyreader, fetch_list, eval_phase):
"""
Evaluation Function
......@@ -111,7 +107,8 @@ def inference(exe, test_program, test_pyreader, fetch_list, infer_phrase):
time_begin = time.time()
while True:
try:
np_props = exe.run(program=test_program, fetch_list=fetch_list,
np_props = exe.run(program=test_program,
fetch_list=fetch_list,
return_numpy=True)
for probs in np_props[0]:
print("%d\t%f\t%f" % (np.argmax(probs), probs[0], probs[1]))
......@@ -135,7 +132,8 @@ def main(args):
exe = fluid.Executor(place)
task_name = args.task_name.lower()
processor = reader.SentaProcessor(data_dir=args.data_dir,
processor = reader.SentaProcessor(
data_dir=args.data_dir,
vocab_path=args.vocab_path,
random_seed=args.random_seed,
max_seq_len=args.max_seq_len)
......@@ -151,7 +149,7 @@ def main(args):
if args.do_train:
train_data_generator = processor.data_generator(
batch_size=args.batch_size,
batch_size=args.batch_size // dev_count,  # integer division: per-device batch size
phase='train',
epoch=args.epoch,
shuffle=True)
......@@ -187,7 +185,7 @@ def main(args):
if args.do_val:
test_data_generator = processor.data_generator(
batch_size=args.batch_size,
batch_size=args.batch_size // dev_count,  # integer division: per-device batch size
phase='dev',
epoch=1,
shuffle=False)
......@@ -204,7 +202,7 @@ def main(args):
if args.do_infer:
infer_data_generator = processor.data_generator(
batch_size=args.batch_size,
batch_size=args.batch_size // dev_count,  # integer division: per-device batch size
phase='infer',
epoch=1,
shuffle=False)
......@@ -223,30 +221,25 @@ def main(args):
if args.do_train:
if args.init_checkpoint:
init_checkpoint(
exe,
args.init_checkpoint,
main_program=startup_prog)
exe, args.init_checkpoint, main_program=startup_prog)
elif args.do_val or args.do_infer:
if not args.init_checkpoint:
raise ValueError("args 'init_checkpoint' should be set if "
"only doing validation or testing!")
init_checkpoint(
exe,
args.init_checkpoint,
main_program=startup_prog)
init_checkpoint(exe, args.init_checkpoint, main_program=startup_prog)
if args.do_train:
train_exe = exe
train_reader.decorate_sample_list_generator(train_data_generator)
train_reader.set_sample_list_generator(train_data_generator)
else:
train_exe = None
if args.do_val:
test_exe = exe
test_reader.decorate_sample_list_generator(test_data_generator)
test_reader.set_sample_list_generator(test_data_generator)
if args.do_infer:
test_exe = exe
infer_reader.decorate_sample_list_generator(infer_data_generator)
infer_reader.set_sample_list_generator(infer_data_generator)
if args.do_train:
train_reader.start()
......@@ -262,7 +255,9 @@ def main(args):
else:
fetch_list = []
outputs = train_exe.run(program=train_program, fetch_list=fetch_list, return_numpy=False)
outputs = train_exe.run(program=train_program,
fetch_list=fetch_list,
return_numpy=False)
#print("finished one step")
if steps % args.skip_steps == 0:
np_loss, np_acc, np_num_seqs = outputs
......@@ -274,7 +269,8 @@ def main(args):
total_num_seqs.extend(np_num_seqs)
if args.verbose:
verbose = "train pyreader queue size: %d, " % train_pyreader.queue.size()
verbose = "train pyreader queue size: %d, " % train_pyreader.queue.size(
)
print(verbose)
time_end = time.time()
......@@ -289,8 +285,8 @@ def main(args):
if steps % args.save_steps == 0:
save_path = os.path.join(args.checkpoints,
"step_" + str(steps))
fluid.io.save_persistables(exe, save_path, train_program)
"step_" + str(steps), "checkpoint")
fluid.save(train_program, save_path)
if steps % args.validation_steps == 0:
# evaluate dev set
......@@ -301,8 +297,9 @@ def main(args):
"dev")
except fluid.core.EOFException:
save_path = os.path.join(args.checkpoints, "step_" + str(steps))
fluid.io.save_persistables(exe, save_path, train_program)
save_path = os.path.join(args.checkpoints, "step_" + str(steps),
"checkpoint")
fluid.save(train_program, save_path)
train_reader.reset()
break
......@@ -315,8 +312,7 @@ def main(args):
# final inference on the infer set
if args.do_infer:
print("Final test result:")
inference(exe, infer_prog, infer_reader,
[prop.name], "infer")
inference(exe, infer_prog, infer_reader, [prop.name], "infer")
if __name__ == "__main__":
......
......@@ -16,8 +16,8 @@ import sys
import paddle
import paddle.fluid as fluid
sys.path.append("../models/classification/")
sys.path.append("..")
sys.path.append("../shared_modules/models/classification/")
sys.path.append("../shared_modules/")
print(sys.path)
from nets import bow_net
......@@ -36,19 +36,18 @@ from config import PDConfig
from utils import init_checkpoint
def ernie_pyreader(args, pyreader_name):
src_ids = fluid.layers.data(
name="src_ids", shape=[-1, args.max_seq_len, 1], dtype="int64")
sent_ids = fluid.layers.data(
name="sent_ids", shape=[-1, args.max_seq_len, 1], dtype="int64")
pos_ids = fluid.layers.data(
name="pos_ids", shape=[-1, args.max_seq_len, 1], dtype="int64")
input_mask = fluid.layers.data(
name="input_mask", shape=[-1, args.max_seq_len, 1], dtype="float32")
labels = fluid.layers.data(
name="labels", shape=[-1, 1], dtype="int64")
seq_lens = fluid.layers.data(
name="seq_lens", shape=[-1], dtype="int64")
src_ids = fluid.data(
name="src_ids", shape=[None, args.max_seq_len, 1], dtype="int64")
sent_ids = fluid.data(
name="sent_ids", shape=[None, args.max_seq_len, 1], dtype="int64")
pos_ids = fluid.data(
name="pos_ids", shape=[None, args.max_seq_len, 1], dtype="int64")
input_mask = fluid.data(
name="input_mask", shape=[None, args.max_seq_len, 1], dtype="float32")
labels = fluid.data(name="labels", shape=[None, 1], dtype="int64")
seq_lens = fluid.data(name="seq_lens", shape=[None], dtype="int64")
pyreader = fluid.io.DataLoader.from_generator(
feed_list=[src_ids, sent_ids, pos_ids, input_mask, labels, seq_lens],
......@@ -61,15 +60,13 @@ def ernie_pyreader(args, pyreader_name):
"sent_ids": sent_ids,
"pos_ids": pos_ids,
"input_mask": input_mask,
"seq_lens": seq_lens}
"seq_lens": seq_lens
}
return pyreader, ernie_inputs, labels
def create_model(args,
embeddings,
labels,
is_prediction=False):
def create_model(args, embeddings, labels, is_prediction=False):
"""
Create Model for sentiment classification based on ERNIE encoder
"""
......@@ -132,7 +129,8 @@ def infer(exe, infer_program, infer_pyreader, fetch_list, infer_phase):
time_begin = time.time()
while True:
try:
batch_probs = exe.run(program=infer_program, fetch_list=fetch_list,
batch_probs = exe.run(program=infer_program,
fetch_list=fetch_list,
return_numpy=True)
for probs in batch_probs[0]:
print("%d\t%f\t%f" % (np.argmax(probs), probs[0], probs[1]))
......@@ -195,21 +193,19 @@ def main(args):
with fluid.unique_name.guard():
# create ernie_pyreader
train_pyreader, ernie_inputs, labels = ernie_pyreader(
args,
pyreader_name='train_pyreader')
args, pyreader_name='train_pyreader')
# get ernie_embeddings
if args.use_paddle_hub:
embeddings = ernie_encoder_with_paddle_hub(ernie_inputs, args.max_seq_len)
embeddings = ernie_encoder_with_paddle_hub(ernie_inputs,
args.max_seq_len)
else:
embeddings = ernie_encoder(ernie_inputs, ernie_config=ernie_config)
embeddings = ernie_encoder(
ernie_inputs, ernie_config=ernie_config)
# user defined model based on ernie embeddings
loss, accuracy, num_seqs = create_model(
args,
embeddings,
labels=labels,
is_prediction=False)
args, embeddings, labels=labels, is_prediction=False)
optimizer = fluid.optimizer.Adam(learning_rate=args.lr)
optimizer.minimize(loss)
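The `program_guard` plus `unique_name.guard()` nesting above is the standard Fluid idiom for building train, eval, and infer programs over a single set of parameters: the name guard restarts parameter numbering so the same layers in each program resolve to the same weights. A minimal sketch with a toy regression net standing in for the ERNIE classifier:

```python
import paddle.fluid as fluid

startup_prog = fluid.Program()
train_prog, test_prog = fluid.Program(), fluid.Program()

with fluid.program_guard(train_prog, startup_prog):
    with fluid.unique_name.guard():
        x = fluid.data(name="x", shape=[None, 8], dtype="float32")
        y = fluid.data(name="y", shape=[None, 1], dtype="float32")
        pred = fluid.layers.fc(input=x, size=1)
        loss = fluid.layers.reduce_mean(
            fluid.layers.square_error_cost(input=pred, label=y))
        fluid.optimizer.Adam(learning_rate=1e-3).minimize(loss)

# Rebuild the same layers under a fresh name guard: identical parameter
# names, so train and test share weights after startup_prog runs once.
with fluid.program_guard(test_prog, startup_prog):
    with fluid.unique_name.guard():
        x = fluid.data(name="x", shape=[None, 8], dtype="float32")
        pred = fluid.layers.fc(input=x, size=1)
test_prog = test_prog.clone(for_test=True)
```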
......@@ -232,21 +228,19 @@ def main(args):
with fluid.unique_name.guard():
# create ernie_pyreader
test_pyreader, ernie_inputs, labels = ernie_pyreader(
args,
pyreader_name='eval_reader')
args, pyreader_name='eval_reader')
# get ernie_embeddings
if args.use_paddle_hub:
embeddings = ernie_encoder_with_paddle_hub(ernie_inputs, args.max_seq_len)
embeddings = ernie_encoder_with_paddle_hub(ernie_inputs,
args.max_seq_len)
else:
embeddings = ernie_encoder(ernie_inputs, ernie_config=ernie_config)
embeddings = ernie_encoder(
ernie_inputs, ernie_config=ernie_config)
# user defined model based on ernie embeddings
loss, accuracy, num_seqs = create_model(
args,
embeddings,
labels=labels,
is_prediction=False)
args, embeddings, labels=labels, is_prediction=False)
test_prog = test_prog.clone(for_test=True)
......@@ -261,19 +255,18 @@ def main(args):
with fluid.program_guard(infer_prog, startup_prog):
with fluid.unique_name.guard():
infer_pyreader, ernie_inputs, labels = ernie_pyreader(
args,
pyreader_name="infer_pyreader")
args, pyreader_name="infer_pyreader")
# get ernie_embeddings
if args.use_paddle_hub:
embeddings = ernie_encoder_with_paddle_hub(ernie_inputs, args.max_seq_len)
embeddings = ernie_encoder_with_paddle_hub(ernie_inputs,
args.max_seq_len)
else:
embeddings = ernie_encoder(ernie_inputs, ernie_config=ernie_config)
embeddings = ernie_encoder(
ernie_inputs, ernie_config=ernie_config)
probs = create_model(args,
embeddings,
labels=labels,
is_prediction=True)
probs = create_model(
args, embeddings, labels=labels, is_prediction=True)
infer_prog = infer_prog.clone(for_test=True)
......@@ -282,25 +275,17 @@ def main(args):
if args.do_train:
if args.init_checkpoint:
init_checkpoint(
exe,
args.init_checkpoint,
main_program=train_program)
exe, args.init_checkpoint, main_program=train_program)
elif args.do_val:
if not args.init_checkpoint:
raise ValueError("args 'init_checkpoint' should be set if"
"only doing validation or testing!")
init_checkpoint(
exe,
args.init_checkpoint,
main_program=test_prog)
init_checkpoint(exe, args.init_checkpoint, main_program=test_prog)
elif args.do_infer:
if not args.init_checkpoint:
raise ValueError("args 'init_checkpoint' should be set if"
"only doing validation or testing!")
init_checkpoint(
exe,
args.init_checkpoint,
main_program=infer_prog)
init_checkpoint(exe, args.init_checkpoint, main_program=infer_prog)
if args.do_train:
train_exe = exe
......@@ -327,7 +312,9 @@ def main(args):
else:
fetch_list = []
outputs = train_exe.run(program=train_program, fetch_list=fetch_list, return_numpy=False)
outputs = train_exe.run(program=train_program,
fetch_list=fetch_list,
return_numpy=False)
if steps % args.skip_steps == 0:
np_loss, np_acc, np_num_seqs = outputs
np_loss = np.array(np_loss)
......@@ -338,7 +325,8 @@ def main(args):
total_num_seqs.extend(np_num_seqs)
if args.verbose:
verbose = "train pyreader queue size: %d, " % train_pyreader.queue.size()
verbose = "train pyreader queue size: %d, " % train_pyreader.queue.size(
)
print(verbose)
time_end = time.time()
......@@ -353,8 +341,8 @@ def main(args):
if steps % args.save_steps == 0:
save_path = os.path.join(args.checkpoints,
"step_" + str(steps))
fluid.io.save_persistables(exe, save_path, train_program)
"step_" + str(steps), "checkpoint")
fluid.save(train_program, save_path)
if steps % args.validation_steps == 0:
# evaluate dev set
......@@ -364,8 +352,9 @@ def main(args):
"dev")
except fluid.core.EOFException:
save_path = os.path.join(args.checkpoints, "step_" + str(steps))
fluid.io.save_persistables(exe, save_path, train_program)
save_path = os.path.join(args.checkpoints, "step_" + str(steps),
"checkpoint")
fluid.save(train_program, save_path)
train_pyreader.reset()
break
......@@ -378,8 +367,8 @@ def main(args):
# final eval on test set
if args.do_infer:
print("Final test result:")
infer(exe, infer_prog, infer_pyreader,
[probs.name], "infer")
infer(exe, infer_prog, infer_pyreader, [probs.name], "infer")
if __name__ == "__main__":
args = PDConfig()
......
......@@ -31,6 +31,7 @@ class ArgumentGroup(object):
"""
Argument Class
"""
def __init__(self, parser, title, des):
self._group = parser.add_argument_group(title=title, description=des)
......@@ -63,20 +64,11 @@ def init_checkpoint(exe, init_checkpoint_path, main_program):
"""
assert os.path.exists(
init_checkpoint_path), "[%s] cann't be found." % init_checkpoint_path
def existed_persitables(var):
"""
If existed presitabels
"""
if not fluid.io.is_persistable(var):
return False
return os.path.exists(os.path.join(init_checkpoint_path, var.name))
fluid.io.load_vars(
exe,
init_checkpoint_path,
main_program=main_program,
predicate=existed_persitables)
try:
checkpoint_path = os.path.join(init_checkpoint_path, "checkpoint")
fluid.load(main_program, checkpoint_path, exe)
except:
fluid.load(main_program, init_checkpoint_path, exe)
print("Load model from {}".format(init_checkpoint_path))
......@@ -96,8 +88,10 @@ def data_reader(file_path, word_dict, num_examples, phrase, epoch, max_seq_len):
sys.stderr.write("[NOTICE] Error Format Line!")
continue
label = int(cols[1])
wids = [word_dict[x] if x in word_dict else unk_id
for x in cols[0].split(" ")]
wids = [
word_dict[x] if x in word_dict else unk_id
for x in cols[0].split(" ")
]
seq_len = len(wids)
if seq_len < max_seq_len:
for i in range(max_seq_len - seq_len):
......@@ -119,8 +113,10 @@ def data_reader(file_path, word_dict, num_examples, phrase, epoch, max_seq_len):
for epoch_index in range(epoch):
for doc, label, seq_len in all_data:
yield doc, label, seq_len
return reader
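The reader above maps tokens to ids with an unknown-word fallback, then pads or truncates to `max_seq_len`. A small sketch of that encode step; the vocabulary, `unk_id`, and the pad id of 0 are hypothetical stand-ins (the real pad value sits in the elided lines):

```python
word_dict = {"hello": 1, "world": 2}  # hypothetical vocabulary
unk_id, pad_id, max_seq_len = 0, 0, 8  # assumed ids and length

def encode(text):
    # Same fallback as `word_dict[x] if x in word_dict else unk_id`.
    wids = [word_dict.get(x, unk_id) for x in text.split(" ")]
    seq_len = len(wids)
    if seq_len < max_seq_len:
        wids += [pad_id] * (max_seq_len - seq_len)  # right-pad short input
    else:
        wids, seq_len = wids[:max_seq_len], max_seq_len  # truncate long input
    return wids, seq_len

print(encode("hello unknown world"))  # ([1, 0, 2, 0, 0, 0, 0, 0], 3)
```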
def load_vocab(file_path):
"""
load the given vocabulary
......@@ -144,15 +140,6 @@ def init_pretraining_params(exe,
assert os.path.exists(pretraining_params_path
), "[%s] cann't be found." % pretraining_params_path
def _existed_params(var):
if not isinstance(var, fluid.framework.Parameter):
return False
return os.path.exists(os.path.join(pretraining_params_path, var.name))
fluid.io.load_vars(
exe,
pretraining_params_path,
main_program=main_program,
predicate=_existed_params)
fluid.load(main_program, pretraining_params_path, exe)
print("Load pretraining parameters from {}.".format(
pretraining_params_path))
Running the example models in this directory requires PaddlePaddle Fluid 1.6. If your installed PaddlePaddle is older than this, please update it following the instructions in the [installation guide](https://www.paddlepaddle.org.cn/#quick-start).
Running the example models in this directory requires PaddlePaddle Fluid 1.7. If your installed PaddlePaddle is older than this, please update it following the instructions in the [installation guide](https://www.paddlepaddle.org.cn/#quick-start).
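To enforce this requirement in code rather than prose, the repo's `check_version` helper (see `model_check.py` below) wraps a version probe; a sketch along those lines, assuming `fluid.require_version` is available in the installed 1.x release:

```python
import sys
import paddle.fluid as fluid

def check_version(min_version="1.7.0"):
    """Exit with a pointer to the install docs if PaddlePaddle is too old."""
    try:
        fluid.require_version(min_version)
    except Exception:
        sys.exit("This example requires PaddlePaddle >= %s; please upgrade via "
                 "https://www.paddlepaddle.org.cn/#quick-start" % min_version)

check_version()
```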
# Sequence to Sequence (Seq2Seq)
......
......@@ -93,7 +93,7 @@ def infer():
# clone from default main program and use it as the validation program
main_program = fluid.default_main_program()
main_program = main_program.clone(for_test=True)
print([param.name for param in main_program.blocks[0].all_parameters()])
print([param.name for param in main_program.all_parameters()])
place = fluid.CUDAPlace(0) if args.use_gpu else fluid.CPUPlace()
exe = Executor(place)
......@@ -127,7 +127,8 @@ def infer():
dir_name = args.reload_model
print("dir name", dir_name)
fluid.io.load_params(exe, dir_name)
dir_name = os.path.join(dir_name, "checkpoint")
fluid.load(main_program, dir_name, exe)
train_data_iter = reader.get_data_iter(infer_data, 1, mode='eval')
......
......@@ -229,10 +229,10 @@ def main():
% (epoch_id, epoch_time, sum(batch_times) / len(batch_times)))
if not args.profile:
dir_name = os.path.join(args.model_path,
"epoch_" + str(epoch_id))
print("begin to save", dir_name)
fluid.io.save_params(exe, dir_name, main_program=train_program)
save_path = os.path.join(args.model_path,
"epoch_" + str(epoch_id), "checkpoint")
print("begin to save", save_path)
fluid.save(train_program, save_path)
print("save finished")
dev_ppl = eval(valid_data)
print("dev ppl", dev_ppl)
......
......@@ -88,7 +88,8 @@ def infer():
dir_name = args.reload_model
print("dir name", dir_name)
fluid.io.load_params(exe, dir_name)
dir_name = os.path.join(dir_name, "checkpoint")
fluid.load(main_program, dir_name, exe)
vocab, tar_id2vocab = get_vocab(args.dataset_prefix)
infer_output = np.ones((batch_size, 1), dtype='int64') * BOS_ID
......
......@@ -255,10 +255,11 @@ def main():
best_nll = test_nll
best_ppl = test_ppl
best_epoch_id = epoch_id
dir_name = os.path.join(args.model_path,
"epoch_" + str(best_epoch_id))
print("save model {}".format(dir_name))
fluid.io.save_params(exe, dir_name, main_program)
save_path = os.path.join(args.model_path,
"epoch_" + str(best_epoch_id),
"checkpoint")
print("save model {}".format(save_path))
fluid.save(main_program, save_path)
else:
steps_not_improved += 1
if steps_not_improved == decay_ts:
......
......@@ -4,6 +4,7 @@ This module provide nets for text classification
import paddle.fluid as fluid
def bow_net(data,
seq_len,
label,
......
......@@ -43,8 +43,8 @@ class CNN(object):
left_emb = emb_layer.ops(left)
right_emb = emb_layer.ops(right)
# Presentation context
cnn_layer = layers.SequenceConvPoolLayer(
self.filter_size, self.num_filters, "conv")
cnn_layer = layers.SequenceConvPoolLayer(self.filter_size,
self.num_filters, "conv")
left_cnn = cnn_layer.ops(left_emb)
right_cnn = cnn_layer.ops(right_emb)
# matching layer
......
......@@ -33,6 +33,7 @@ def check_cuda(use_cuda, err = \
except Exception as e:
pass
def check_version():
"""
Log error and exit when the installed version of paddlepaddle is
......
......@@ -30,10 +30,14 @@ from models.transformer_encoder import encoder, pre_process_layer
def ernie_pyreader(args, pyreader_name):
"""define standard ernie pyreader"""
src_ids = fluid.data(name='1', shape=[-1, args.max_seq_len, 1], dtype='int64')
sent_ids = fluid.data(name='2', shape=[-1, args.max_seq_len, 1], dtype='int64')
pos_ids = fluid.data(name='3', shape=[-1, args.max_seq_len, 1], dtype='int64')
input_mask = fluid.data(name='4', shape=[-1, args.max_seq_len, 1], dtype='float32')
src_ids = fluid.data(
name='1', shape=[-1, args.max_seq_len, 1], dtype='int64')
sent_ids = fluid.data(
name='2', shape=[-1, args.max_seq_len, 1], dtype='int64')
pos_ids = fluid.data(
name='3', shape=[-1, args.max_seq_len, 1], dtype='int64')
input_mask = fluid.data(
name='4', shape=[-1, args.max_seq_len, 1], dtype='float32')
labels = fluid.data(name='5', shape=[-1, 1], dtype='int64')
seq_lens = fluid.data(name='6', shape=[-1], dtype='int64')
......
......@@ -29,6 +29,7 @@ from preprocess.ernie import tokenization
from preprocess.padding import pad_batch_data
import io
def csv_reader(fd, delimiter='\t'):
def gen():
for i in fd:
......@@ -37,8 +38,10 @@ def csv_reader(fd, delimiter='\t'):
yield slots,
else:
yield slots
return gen()
class BaseReader(object):
"""BaseReader for classify and sequence labeling task"""
......
......@@ -23,6 +23,7 @@ import unicodedata
import six
import io
def convert_to_unicode(text):
"""Converts `text` to Unicode (if it's not already), assuming utf-8 input."""
if six.PY3:
......
......@@ -30,7 +30,7 @@ if sys.getdefaultencoding() != defaultencoding:
reload(sys)
sys.setdefaultencoding(defaultencoding)
sys.path.append("..")
sys.path.append("../shared_modules/")
import paddle
import paddle.fluid as fluid
......@@ -47,14 +47,14 @@ from models.model_check import check_version
from models.model_check import check_cuda
def create_model(args, pyreader_name, is_inference = False, is_pointwise = False):
def create_model(args, pyreader_name, is_inference=False, is_pointwise=False):
"""
Create Model for simnet
"""
if is_inference:
inf_pyreader = fluid.layers.py_reader(
capacity=16,
shapes=([-1,1], [-1,1]),
shapes=([-1], [-1]),
dtypes=('int64', 'int64'),
lod_levels=(1, 1),
name=pyreader_name,
......@@ -67,7 +67,7 @@ def create_model(args, pyreader_name, is_inference = False, is_pointwise = False
if is_pointwise:
pointwise_pyreader = fluid.layers.py_reader(
capacity=16,
shapes=([-1,1], [-1,1], [-1,1]),
shapes=([-1], [-1], [-1]),
dtypes=('int64', 'int64', 'int64'),
lod_levels=(1, 1, 0),
name=pyreader_name,
......@@ -79,15 +79,17 @@ def create_model(args, pyreader_name, is_inference = False, is_pointwise = False
else:
pairwise_pyreader = fluid.layers.py_reader(
capacity=16,
shapes=([-1,1], [-1,1], [-1,1]),
shapes=([-1], [-1], [-1]),
dtypes=('int64', 'int64', 'int64'),
lod_levels=(1, 1, 1),
name=pyreader_name,
use_double_buffer=False)
left, pos_right, neg_right = fluid.layers.read_file(pairwise_pyreader)
left, pos_right, neg_right = fluid.layers.read_file(
pairwise_pyreader)
return pairwise_pyreader, left, pos_right, neg_right
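The shape change above, from `([-1, 1], ...)` to `([-1], ...)`, pairs with `lod_levels=1`: each slot becomes a variable-length LoDTensor sequence rather than a padded dense tensor, so the trailing unit dimension goes away. A minimal sketch of the pairwise reader under that convention (this file still uses the 1.x `py_reader` API):

```python
import paddle.fluid as fluid

# Two id sequences per sample; lod_levels=(1, 1) marks both slots
# as level-1 LoD sequences of variable length.
pyreader = fluid.layers.py_reader(
    capacity=16,
    shapes=([-1], [-1]),
    dtypes=("int64", "int64"),
    lod_levels=(1, 1),
    name="pairwise_reader",
    use_double_buffer=False)
left, right = fluid.layers.read_file(pyreader)
```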
def train(conf_dict, args):
"""
train processic
......@@ -97,16 +99,16 @@ def train(conf_dict, args):
# get vocab size
conf_dict['dict_size'] = len(vocab)
# Load network structure dynamically
net = utils.import_class("../models/matching",
net = utils.import_class("../shared_modules/models/matching",
conf_dict["net"]["module_name"],
conf_dict["net"]["class_name"])(conf_dict)
# Load loss function dynamically
loss = utils.import_class("../models/matching/losses",
loss = utils.import_class("../shared_modules/models/matching/losses",
conf_dict["loss"]["module_name"],
conf_dict["loss"]["class_name"])(conf_dict)
# Load Optimization method
optimizer = utils.import_class(
"../models/matching/optimizers", "paddle_optimizers",
"../shared_modules/models/matching/optimizers", "paddle_optimizers",
conf_dict["optimizer"]["class_name"])(conf_dict)
# load auc method
metric = fluid.metrics.Auc(name="auc")
......@@ -131,8 +133,7 @@ def train(conf_dict, args):
with fluid.program_guard(train_program, startup_prog):
with fluid.unique_name.guard():
train_pyreader, left, pos_right, neg_right = create_model(
args,
pyreader_name='train_reader')
args, pyreader_name='train_reader')
left_feat, pos_score = net.predict(left, pos_right)
pred = pos_score
_, neg_score = net.predict(left, neg_right)
......@@ -141,12 +142,14 @@ def train(conf_dict, args):
optimizer.ops(avg_cost)
# Get Reader
get_train_examples = simnet_process.get_reader("train",epoch=args.epoch)
get_train_examples = simnet_process.get_reader(
"train", epoch=args.epoch)
if args.do_valid:
test_prog = fluid.Program()
with fluid.program_guard(test_prog, startup_prog):
with fluid.unique_name.guard():
test_pyreader, left, pos_right= create_model(args, pyreader_name = 'test_reader',is_inference=True)
test_pyreader, left, pos_right = create_model(
args, pyreader_name='test_reader', is_inference=True)
left_feat, pos_score = net.predict(left, pos_right)
pred = pos_score
test_prog = test_prog.clone(for_test=True)
......@@ -156,40 +159,41 @@ def train(conf_dict, args):
with fluid.program_guard(train_program, startup_prog):
with fluid.unique_name.guard():
train_pyreader, left, right, label = create_model(
args,
pyreader_name='train_reader',
is_pointwise=True)
args, pyreader_name='train_reader', is_pointwise=True)
left_feat, pred = net.predict(left, right)
avg_cost = loss.compute(pred, label)
avg_cost.persistable = True
optimizer.ops(avg_cost)
# Get Feeder and Reader
get_train_examples = simnet_process.get_reader("train",epoch=args.epoch)
get_train_examples = simnet_process.get_reader(
"train", epoch=args.epoch)
if args.do_valid:
test_prog = fluid.Program()
with fluid.program_guard(test_prog, startup_prog):
with fluid.unique_name.guard():
test_pyreader, left, right= create_model(args, pyreader_name = 'test_reader',is_inference=True)
test_pyreader, left, right = create_model(
args, pyreader_name='test_reader', is_inference=True)
left_feat, pred = net.predict(left, right)
test_prog = test_prog.clone(for_test=True)
if args.init_checkpoint is not "":
utils.init_checkpoint(exe, args.init_checkpoint,
startup_prog)
utils.init_checkpoint(exe, args.init_checkpoint, startup_prog)
def valid_and_test(test_program, test_pyreader, get_valid_examples, process, mode, exe, fetch_list):
def valid_and_test(test_program, test_pyreader, get_valid_examples, process,
mode, exe, fetch_list):
"""
return auc and acc
"""
# Get Batch Data
batch_data = fluid.io.batch(get_valid_examples, args.batch_size, drop_last=False)
batch_data = fluid.io.batch(
get_valid_examples, args.batch_size, drop_last=False)
test_pyreader.decorate_paddle_reader(batch_data)
test_pyreader.start()
pred_list = []
while True:
try:
_pred = exe.run(program=test_program,fetch_list=[pred.name])
_pred = exe.run(program=test_program, fetch_list=[pred.name])
pred_list += list(_pred)
except fluid.core.EOFException:
test_pyreader.reset()
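`valid_and_test`, like the train and infer loops elsewhere in this commit, drains the reader until the feeding thread raises `fluid.core.EOFException` and then resets it for the next pass. The pattern factored out as a sketch, with `exe`, `program`, `pyreader`, and `fetch_list` assumed to be built as above:

```python
import paddle.fluid as fluid

def run_until_eof(exe, program, pyreader, fetch_list):
    """Run one full pass over a py_reader-fed program, collecting fetches."""
    results = []
    pyreader.start()
    while True:
        try:
            outs = exe.run(program=program, fetch_list=fetch_list)
            results.extend(list(outs[0]))
        except fluid.core.EOFException:
            pyreader.reset()  # rearm the reader for the next pass
            break
    return results
```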
......@@ -222,7 +226,8 @@ def train(conf_dict, args):
#for epoch_id in range(args.epoch):
# used for continuous evaluation
if args.enable_ce:
train_batch_data = fluid.io.batch(get_train_examples, args.batch_size, drop_last=False)
train_batch_data = fluid.io.batch(
get_train_examples, args.batch_size, drop_last=False)
else:
train_batch_data = fluid.io.batch(
fluid.io.shuffle(
......@@ -238,19 +243,23 @@ def train(conf_dict, args):
try:
global_step += 1
fetch_list = [avg_cost.name]
avg_loss = train_exe.run(program=train_program, fetch_list = fetch_list)
avg_loss = train_exe.run(program=train_program,
fetch_list=fetch_list)
losses.append(np.mean(avg_loss[0]))
if args.do_valid and global_step % args.validation_steps == 0:
get_valid_examples = simnet_process.get_reader("valid")
valid_result = valid_and_test(test_prog,test_pyreader,get_valid_examples,simnet_process,"valid",exe,[pred.name])
valid_result = valid_and_test(
test_prog, test_pyreader, get_valid_examples,
simnet_process, "valid", exe, [pred.name])
if args.compute_accuracy:
valid_auc, valid_acc = valid_result
logging.info(
"global_steps: %d, valid_auc: %f, valid_acc: %f, valid_loss: %f" %
(global_step, valid_auc, valid_acc, np.mean(losses)))
"global_steps: %d, valid_auc: %f, valid_acc: %f, valid_loss: %f"
% (global_step, valid_auc, valid_acc, np.mean(losses)))
else:
valid_auc = valid_result
logging.info("global_steps: %d, valid_auc: %f, valid_loss: %f" %
logging.info(
"global_steps: %d, valid_auc: %f, valid_loss: %f" %
(global_step, valid_auc, np.mean(losses)))
if global_step % args.save_steps == 0:
model_save_dir = os.path.join(args.output_dir,
......@@ -269,8 +278,7 @@ def train(conf_dict, args):
]
target_vars = [left_feat, pred]
fluid.io.save_inference_model(model_path, feed_var_names,
target_vars, exe,
test_prog)
target_vars, exe, test_prog)
logging.info("saving infer model in %s" % model_path)
except fluid.core.EOFException:
......@@ -282,8 +290,7 @@ def train(conf_dict, args):
ce_info.append([np.mean(losses), end_time - start_time])
#final save
logging.info("the final step is %s" % global_step)
model_save_dir = os.path.join(args.output_dir,
conf_dict["model_path"])
model_save_dir = os.path.join(args.output_dir, conf_dict["model_path"])
model_path = os.path.join(model_save_dir, str(global_step))
if not os.path.exists(model_save_dir):
os.makedirs(model_save_dir)
......@@ -296,8 +303,7 @@ def train(conf_dict, args):
right.name,
]
target_vars = [left_feat, pred]
fluid.io.save_inference_model(model_path, feed_var_names,
target_vars, exe,
fluid.io.save_inference_model(model_path, feed_var_names, target_vars, exe,
test_prog)
logging.info("saving infer model in %s" % model_path)
# used for continuous evaluation
......@@ -322,7 +328,9 @@ def train(conf_dict, args):
else:
# Get Feeder and Reader
get_test_examples = simnet_process.get_reader("test")
test_result = valid_and_test(test_prog,test_pyreader,get_test_examples,simnet_process,"test",exe,[pred.name])
test_result = valid_and_test(test_prog, test_pyreader,
get_test_examples, simnet_process, "test",
exe, [pred.name])
if args.compute_accuracy:
test_auc, test_acc = test_result
logging.info("AUC of test is %f, Accuracy of test is %f" %
......@@ -348,12 +356,13 @@ def test(conf_dict, args):
startup_prog = fluid.Program()
get_test_examples = simnet_process.get_reader("test")
batch_data = fluid.io.batch(get_test_examples, args.batch_size, drop_last=False)
batch_data = fluid.io.batch(
get_test_examples, args.batch_size, drop_last=False)
test_prog = fluid.Program()
conf_dict['dict_size'] = len(vocab)
net = utils.import_class("../models/matching",
net = utils.import_class("../shared_modules/models/matching",
conf_dict["net"]["module_name"],
conf_dict["net"]["class_name"])(conf_dict)
......@@ -364,9 +373,7 @@ def test(conf_dict, args):
with fluid.program_guard(test_prog, startup_prog):
with fluid.unique_name.guard():
test_pyreader, left, pos_right = create_model(
args,
pyreader_name = 'test_reader',
is_inference=True)
args, pyreader_name='test_reader', is_inference=True)
left_feat, pos_score = net.predict(left, pos_right)
pred = pos_score
test_prog = test_prog.clone(for_test=True)
......@@ -375,18 +382,13 @@ def test(conf_dict, args):
with fluid.program_guard(test_prog, startup_prog):
with fluid.unique_name.guard():
test_pyreader, left, right = create_model(
args,
pyreader_name = 'test_reader',
is_inference=True)
args, pyreader_name='test_reader', is_inference=True)
left_feat, pred = net.predict(left, right)
test_prog = test_prog.clone(for_test=True)
exe.run(startup_prog)
utils.init_checkpoint(
exe,
args.init_checkpoint,
main_program=test_prog)
utils.init_checkpoint(exe, args.init_checkpoint, main_program=test_prog)
test_exe = exe
test_pyreader.decorate_paddle_reader(batch_data)
......@@ -398,15 +400,18 @@ def test(conf_dict, args):
output = []
while True:
try:
output = test_exe.run(program=test_prog,fetch_list=fetch_list)
output = test_exe.run(program=test_prog, fetch_list=fetch_list)
if args.task_mode == "pairwise":
pred_list += list(map(lambda item: float(item[0]), output[0]))
pred_list += list(
map(lambda item: float(item[0]), output[0]))
predictions_file.write(u"\n".join(
map(lambda item: str((item[0] + 1) / 2), output[0])) + "\n")
map(lambda item: str((item[0] + 1) / 2), output[0])) +
"\n")
else:
pred_list += map(lambda item: item, output[0])
predictions_file.write(u"\n".join(
map(lambda item: str(np.argmax(item)), output[0])) + "\n")
map(lambda item: str(np.argmax(item)), output[0])) +
"\n")
except fluid.core.EOFException:
test_pyreader.reset()
break
......@@ -450,36 +455,36 @@ def infer(conf_dict, args):
startup_prog = fluid.Program()
get_infer_examples = simnet_process.get_infer_reader
batch_data = fluid.io.batch(get_infer_examples, args.batch_size, drop_last=False)
batch_data = fluid.io.batch(
get_infer_examples, args.batch_size, drop_last=False)
test_prog = fluid.Program()
conf_dict['dict_size'] = len(vocab)
net = utils.import_class("../models/matching",
net = utils.import_class("../shared_modules/models/matching",
conf_dict["net"]["module_name"],
conf_dict["net"]["class_name"])(conf_dict)
if args.task_mode == "pairwise":
with fluid.program_guard(test_prog, startup_prog):
with fluid.unique_name.guard():
infer_pyreader, left, pos_right = create_model(args, pyreader_name = 'infer_reader', is_inference = True)
infer_pyreader, left, pos_right = create_model(
args, pyreader_name='infer_reader', is_inference=True)
left_feat, pos_score = net.predict(left, pos_right)
pred = pos_score
test_prog = test_prog.clone(for_test=True)
else:
with fluid.program_guard(test_prog, startup_prog):
with fluid.unique_name.guard():
infer_pyreader, left, right = create_model(args, pyreader_name = 'infer_reader', is_inference = True)
infer_pyreader, left, right = create_model(
args, pyreader_name='infer_reader', is_inference=True)
left_feat, pred = net.predict(left, right)
test_prog = test_prog.clone(for_test=True)
exe.run(startup_prog)
utils.init_checkpoint(
exe,
args.init_checkpoint,
main_program=test_prog)
utils.init_checkpoint(exe, args.init_checkpoint, main_program=test_prog)
test_exe = exe
infer_pyreader.decorate_sample_list_generator(batch_data)
......@@ -491,7 +496,7 @@ def infer(conf_dict, args):
infer_pyreader.start()
while True:
try:
output = test_exe.run(program=test_prog,fetch_list=fetch_list)
output = test_exe.run(program=test_prog, fetch_list=fetch_list)
if args.task_mode == "pairwise":
preds_list += list(
map(lambda item: str((item[0] + 1) / 2), output[0]))
......@@ -514,6 +519,7 @@ def get_cards():
num = len(cards.split(","))
return num
if __name__ == "__main__":
args = ArgConfig()
......
......@@ -149,7 +149,7 @@ PaddlePaddle provides a rich set of computing units so that users can adopt modular
[**PaddleNLP**](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP) is an open-source collection of natural language processing (NLP) tools, algorithms, models, and data built on the PaddlePaddle deep learning framework. Baidu's more than a decade of accumulated NLP expertise gives PaddleNLP its core strength. With PaddleNLP, you get:
- **Rich and comprehensive NLP task support:**
- PaddleNLP provides multi-granularity, multi-scenario application support, covering everything from NLP fundamentals such as [word segmentation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), [part-of-speech tagging](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), and [named entity recognition](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis) to core NLP technologies such as [text classification](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/sentiment_classification), [text similarity computation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/similarity_net), [semantic representation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK), and [text generation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleTextGEN). PaddleNLP also provides the task-specific core techniques, tool components, models, and pretrained parameters for common large-scale NLP application systems (such as [reading comprehension](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMRC), [dialogue systems](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleDialogue), and [machine translation systems](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMT)), clearing your way through the NLP field.
- PaddleNLP provides multi-granularity, multi-scenario application support, covering everything from NLP fundamentals such as [word segmentation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), [part-of-speech tagging](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), and [named entity recognition](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis) to core NLP technologies such as [text classification](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/sentiment_classification), [text similarity computation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/similarity_net), [semantic representation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/pretrain_langauge_models), and [text generation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/seq2seq). PaddleNLP also provides the task-specific core techniques, tool components, models, and pretrained parameters for common large-scale NLP application systems (such as [reading comprehension](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/machine_reading_comprehension), [dialogue systems](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/dialogue_system), and [machine translation systems](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/machine_translation)), clearing your way through the NLP field.
- **Stable, reliable NLP models and strong pretrained parameters:**
- PaddleNLP integrates the NLP tool models widely used inside Baidu, giving you stable and reliable NLP algorithm solutions. Pretrained parameters learned from tens of billions of data samples and a rich set of pretrained models help you easily improve model quality and power your NLP business.
- **Continuous improvement and technical support to build NLP applications from zero:**
......@@ -167,14 +167,14 @@ PaddlePaddle provides a rich set of computing units so that users can adopt modular
#### Semantic Representation
[PaddleLARK](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK) (Paddle LAngauge Representation ToolKit) is the next step beyond traditional language models: general-purpose semantic representation models trained on large-scale corpora that benefit other NLP tasks, embodying the general pretraining plus task-specific fine-tuning paradigm. PaddleLARK integrates popular Chinese and English pretrained models such as ELMo, BERT, ERNIE 1.0, ERNIE 2.0, and XLNet.
[pretrain_langauge_models](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/pretrain_langauge_models) (Paddle LAngauge Representation ToolKit) is the next step beyond traditional language models: general-purpose semantic representation models trained on large-scale corpora that benefit other NLP tasks, embodying the general pretraining plus task-specific fine-tuning paradigm. pretrain_langauge_models integrates popular Chinese and English pretrained models such as ELMo, BERT, ERNIE 1.0, ERNIE 2.0, and XLNet.
| Model | Description |
| ------------------------------------------------------------ | ------------------------------------------------------------ |
| [ERNIE](https://github.com/PaddlePaddle/ERNIE) (Enhanced Representation from kNowledge IntEgration) | Baidu's self-developed semantic representation model. It learns real-world semantic knowledge by modeling words, entities, and entity relations in massive data. Where BERT learns from raw language signals, ERNIE directly models units of prior semantic knowledge, strengthening the model's semantic representation capability. |
| [BERT](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK/BERT) (Bidirectional Encoder Representation from Transformers) | A general semantic representation model with strong transfer ability. Built from Transformer blocks and trained with the bidirectional Masked Language Model and Next Sentence Prediction objectives, it learns general semantic representations through pretraining that, combined with a simple output layer, achieve SOTA results on many downstream NLP tasks. |
| [XLNet](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK/XLNet) (XLNet: Generalized Autoregressive Pretraining for Language Understanding) | One of the important semantic representation models. It adopts Transformer-XL as its backbone and Permutation Language Modeling as its training objective, outperforming BERT on a number of downstream tasks. |
| [ELMo](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK/ELMo) (Embeddings from Language Models) | One of the important general semantic representation models. Built on bidirectional LSTMs and trained with a language model objective, it learns general semantic representations through pretraining; transferring these representations as features to downstream NLP tasks significantly improves their performance. |
| [BERT](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/pretrain_langauge_models/BERT) (Bidirectional Encoder Representation from Transformers) | A general semantic representation model with strong transfer ability. Built from Transformer blocks and trained with the bidirectional Masked Language Model and Next Sentence Prediction objectives, it learns general semantic representations through pretraining that, combined with a simple output layer, achieve SOTA results on many downstream NLP tasks. |
| [XLNet](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/pretrain_langauge_models/XLNet) (XLNet: Generalized Autoregressive Pretraining for Language Understanding) | One of the important semantic representation models. It adopts Transformer-XL as its backbone and Permutation Language Modeling as its training objective, outperforming BERT on a number of downstream tasks. |
| [ELMo](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/pretrain_langauge_models/ELMo) (Embeddings from Language Models) | One of the important general semantic representation models. Built on bidirectional LSTMs and trained with a language model objective, it learns general semantic representations through pretraining; transferring these representations as features to downstream NLP tasks significantly improves their performance. |
#### Text Similarity Computation
......@@ -182,7 +182,7 @@ PaddlePaddle provides a rich set of computing units so that users can adopt modular
#### Text Generation
[PaddleTextGEN](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleTextGEN) (Paddle Text Generation), a text generation framework based on PaddlePaddle, provides a series of classic text generation model examples, such as vanilla seq2seq, seq2seq with attention, and variational seq2seq.
[seq2seq](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/seq2seq) (Paddle Text Generation), a text generation framework based on PaddlePaddle, provides a series of classic text generation model examples, such as vanilla seq2seq, seq2seq with attention, and variational seq2seq.
### NLP System Applications
......@@ -195,7 +195,7 @@ PaddlePaddle provides a rich set of computing units so that users can adopt modular
#### Reading Comprehension
[PaddleMRC](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMRC) (Paddle Machine Reading Comprehension) collects Baidu's work in the reading comprehension field: models, tools, open-source datasets, and more.
[machine_reading_comprehension](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/machine_reading_comprehension) (Paddle Machine Reading Comprehension) collects Baidu's work in the reading comprehension field: models, tools, open-source datasets, and more.
| Model | Description |
| ------------------------------------------------------------ | ------------------------------------------------------------ |
......@@ -205,16 +205,16 @@ PaddlePaddle provides a rich set of computing units so that users can adopt modular
#### Machine Translation
[PaddleMT](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMT), short for Paddle Machine Translation, is a classic Transformer-based machine translation model, following the paper [Attention Is All You Need](https://arxiv.org/abs/1706.03762).
[machine_translation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/machine_translation), short for Paddle Machine Translation, is a classic Transformer-based machine translation model, following the paper [Attention Is All You Need](https://arxiv.org/abs/1706.03762).
#### Dialogue Systems
[PaddleDialogue](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleDialogue) contains models, datasets, and tools for dialogue systems.
[dialogue_system](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/dialogue_system) contains models, datasets, and tools for dialogue systems.
| Model | Description |
| ------------------------------------------------------------ | ------------------------------------------------------------ |
| [DGU](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleDialogue/dialogue_general_understanding) (Dialogue General Understanding) | Covers common dialogue-system tasks, including the context-response matching task in **retrieval-based chat systems** and **intent recognition**, **slot parsing**, and **state tracking** in **task-oriented dialogue systems**, achieving the best results on 6 public international datasets. |
| [ADEM](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleDialogue/auto_dialogue_evaluation) (Auto Dialogue Evaluation Model) | Evaluates the response quality of open-domain dialogue systems, helping companies or individuals quickly assess response quality and cut the cost of manual evaluation. |
| [DGU](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/dialogue_system/dialogue_general_understanding) (Dialogue General Understanding) | Covers common dialogue-system tasks, including the context-response matching task in **retrieval-based chat systems** and **intent recognition**, **slot parsing**, and **state tracking** in **task-oriented dialogue systems**, achieving the best results on 6 public international datasets. |
| [ADEM](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/dialogue_system/auto_dialogue_evaluation) (Auto Dialogue Evaluation Model) | Evaluates the response quality of open-domain dialogue systems, helping companies or individuals quickly assess response quality and cut the cost of manual evaluation. |
| [Proactive Conversation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/Research/ACL2019-DuConv) | Contains [DuConv](https://ai.baidu.com/broad/subordinate?dataset=duconv), Baidu's open-source knowledge-driven open-domain dialogue dataset, together with baseline models. The corresponding paper, [Proactive Human-Machine Conversation with Explicit Conversation Goals](https://arxiv.org/abs/1906.05572), was published at ACL 2019. |
| [DAM](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/Research/ACL2018-DAM) (Deep Attention Matching Network) | An open-domain multi-turn response matching model; the corresponding paper, [Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network](https://aclweb.org/anthology/P18-1103/), was published at ACL 2018. |
......