Unverified commit 803dab78, authored by pkpk, committed by GitHub

test=develop (#4389)

Parent 9e12ab90
Subproject commit 5426f75073cf5bd416622dbe71b146d3dc8fffb6
Subproject commit 30b892e3c029bff706337f269e6c158b0a223f60
@@ -10,7 +10,7 @@
- **Rich and comprehensive NLP task support:**
  - PaddleNLP provides multi-granularity, multi-scenario application support, covering NLP fundamentals such as [word segmentation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), [part-of-speech tagging](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), and [named entity recognition](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), as well as core NLP technologies such as [text classification](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/sentiment_classification), [text similarity](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/similarity_net), [semantic representation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/pretrain_langauge_models), and [text generation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/seq2seq). PaddleNLP also ships the core techniques, tool components, models, and pretrained parameters behind common large-scale NLP systems such as [reading comprehension](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/machine_reading_comprehension), [dialogue systems](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/dialogue_system), and [machine translation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/machine_translation), so you can work across the NLP landscape without obstacles.
- **Stable, reliable NLP models and strong pretrained parameters:**
@@ -55,11 +55,11 @@ cd models/PaddleNLP/sentiment_classification
| **Language model** | [Language_model](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/language_model) | Classic RNN-based neural language model. |
| **Sentiment classification** :fire: | [Senta](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/sentiment_classification), [EmotionDetection](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/emotion_detection) | Senta (Sentiment Classification) and EmotionDetection provide sentiment analysis models for *general scenarios* and for *human-machine dialogue scenarios*, respectively. |
| **Text similarity** :fire: | [SimNet](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/similarity_net) | SimNet (Similarity Net) offers an efficient, reliable text-similarity toolkit and pretrained models. |
| **Semantic representation** :fire: | [pretrain_langauge_models](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/pretrain_langauge_models) | Integrates popular Chinese and English pretrained models such as ELMo, BERT, ERNIE 1.0, ERNIE 2.0, and XLNet. |
| **Text generation** | [seq2seq](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/seq2seq) | seq2seq provides a series of classic text-generation examples, such as vanilla seq2seq, seq2seq with attention, and variational seq2seq. |
| **Reading comprehension** | [machine_reading_comprehension](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/machine_reading_comprehension) | Paddle Machine Reading Comprehension collects Baidu's models, tools, and open datasets for reading comprehension, including DuReader (Baidu's open large-scale Chinese reading-comprehension dataset built from real search logs), KT-Net (a knowledge-enhanced model that once ranked first on SQuAD and ReCoRD), and D-Net (a pretraining and fine-tuning framework that took first place in the EMNLP 2019 MRQA shared task). |
| **Dialogue systems** | [dialogue_system](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/dialogue_system) | Includes: 1) DGU (Dialogue General Understanding), covering common dialogue tasks such as context-response matching for **retrieval-based chat systems** plus **intent recognition**, **slot filling**, and **state tracking** for **task-oriented dialogue systems**, with best results on six public datasets. <br/> 2) knowledge-driven dialogue: Baidu's open knowledge-grounded open-domain dialogue dataset, published at ACL 2019. <br/> 3) ADEM (Auto Dialogue Evaluation Model): automatically evaluates the response quality of different dialogue-generation models. |
| **Machine translation** | [machine_translation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/machine_translation) | Paddle Machine Translation, a classic Transformer-based machine-translation model. |
| **Other frontier work** | [Research](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/Research) | Open source releases of Baidu's latest research. |
@@ -70,13 +70,13 @@ cd models/PaddleNLP/sentiment_classification
```text
.
├── Research                          # Collection of Baidu NLP research work
├── machine_translation               # Machine translation code, data, and pretrained models
├── dialogue_system                   # Dialogue system code, data, and pretrained models
├── machine_reading_comprehension     # Reading comprehension code, data, and pretrained models
├── pretrain_langauge_models          # Language representation toolkit
├── language_model                    # Language model
├── lexical_analysis                  # LAC lexical analysis
├── shared_modules/models             # Shared networks
│   ├── __init__.py
│   ├── classification
│   ├── dialogue_model_toolkit
@@ -87,7 +87,7 @@ cd models/PaddleNLP/sentiment_classification
│   ├── representation
│   ├── sequence_labeling
│   └── transformer_encoder.py
├── shared_modules/preprocess         # Shared text preprocessing tools
│   ├── __init__.py
│   ├── ernie
│   ├── padding.py
......
@@ -16,7 +16,6 @@
# limitations under the License.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
@@ -40,43 +39,55 @@ import math
np.random.seed(0)
random.seed(0)
parser = argparse.ArgumentParser(__doc__)
DEV_COUNT = 1
model_g = ArgumentGroup(parser, "model", "model configuration and paths.")
model_g.add_arg("init_checkpoint", str, None, "Init checkpoint to resume training from.") model_g.add_arg("init_checkpoint", str, None,
model_g.add_arg("checkpoints", str, "./checkpoints", "Path to save checkpoints.") "Init checkpoint to resume training from.")
model_g.add_arg("checkpoints", str, "./checkpoints",
"Path to save checkpoints.")
model_g.add_arg("config_path", str, "./data/input/model.conf", "Model conf.") model_g.add_arg("config_path", str, "./data/input/model.conf", "Model conf.")
model_g.add_arg("build_dict", bool, False, "Build dict.") model_g.add_arg("build_dict", bool, False, "Build dict.")
train_g = ArgumentGroup(parser, "training", "training options.") train_g = ArgumentGroup(parser, "training", "training options.")
train_g.add_arg("cpu_num", int, 3, "Number of Threads.") train_g.add_arg("cpu_num", int, 3, "Number of Threads.")
train_g.add_arg("epoch", int, 100, "Number of epoches for training.") train_g.add_arg("epoch", int, 100, "Number of epoches for training.")
train_g.add_arg("learning_rate", float, 0.1, "Learning rate used to train with warmup.") train_g.add_arg("learning_rate", float, 0.1,
train_g.add_arg("save_steps", int, 1000, "The steps interval to save checkpoints.") "Learning rate used to train with warmup.")
train_g.add_arg("validation_steps", int, 100, "The steps interval to evaluate model performance.") train_g.add_arg("save_steps", int, 1000,
"The steps interval to save checkpoints.")
train_g.add_arg("validation_steps", int, 100,
"The steps interval to evaluate model performance.")
train_g.add_arg("random_seed", int, 7, "random seed") train_g.add_arg("random_seed", int, 7, "random seed")
train_g.add_arg("threshold", float, 0.1, "When the confidence exceeds the threshold, the corresponding label is given.") train_g.add_arg(
"threshold", float, 0.1,
"When the confidence exceeds the threshold, the corresponding label is given."
)
log_g = ArgumentGroup(parser, "logging", "logging related.")
log_g.add_arg("skip_steps", int, 10, "The steps interval to print loss.")
data_g = ArgumentGroup(parser, "data",
                       "Data paths, vocab paths and data processing options")
data_g.add_arg("data_dir", str, "./data/input/", "Path to training data.")
data_g.add_arg("save_dir", str, "./data/output/", "Path to save.")
data_g.add_arg("max_seq_len", int, 50, "Tokens' number of the longest seqence allowed.") data_g.add_arg("max_seq_len", int, 50,
data_g.add_arg("batch_size", int, 64, "The total number of examples in one batch for training.") "Tokens' number of the longest seqence allowed.")
data_g.add_arg("batch_size", int, 64,
"The total number of examples in one batch for training.")
run_type_g = ArgumentGroup(parser, "run_type", "running type options.")
run_type_g.add_arg("use_cuda", bool, False, "If set, use GPU for training.")
# run_type_g.add_arg("use_fast_executor", bool, False, "If set, use fast parallel executor (in experiment).")
run_type_g.add_arg("do_train", bool, True, "Whether to perform evaluation on test data set.") run_type_g.add_arg("do_train", bool, True,
run_type_g.add_arg("do_eval", bool, True, "Whether to perform evaluation on test data set.") "Whether to perform evaluation on test data set.")
run_type_g.add_arg("do_test", bool, True, "Whether to perform evaluation on test data set.") run_type_g.add_arg("do_eval", bool, True,
"Whether to perform evaluation on test data set.")
run_type_g.add_arg("do_test", bool, True,
"Whether to perform evaluation on test data set.")
args = parser.parse_args() args = parser.parse_args()
def get_score(pred_result, label, eval_phase):
    """[get precision recall and f-score]
@@ -139,7 +150,7 @@ def train(args, train_exe, build_res, place):
    pred_label = build_res["pred_label"]
    label = build_res["label"]
    fetch_list = [cost.name, prediction.name, pred_label.name, label.name]
    train_data_loader = build_res["train_data_loader"]
    train_prog = build_res["train_prog"]
    steps = 0
    time_begin = time.time()
@@ -147,22 +158,24 @@ def train(args, train_exe, build_res, place):
logger.info("Begin training") logger.info("Begin training")
for i in range(args.epoch): for i in range(args.epoch):
try: try:
for data in train_pyreader(): for data in train_data_loader():
avg_cost_np, avg_pred_np, pred_label, label = train_exe.run(feed=data, program=compiled_prog, \ avg_cost_np, avg_pred_np, pred_label, label = train_exe.run(feed=data, program=compiled_prog, \
fetch_list=fetch_list) fetch_list=fetch_list)
steps += 1 steps += 1
if steps % int(args.skip_steps) == 0: if steps % int(args.skip_steps) == 0:
time_end = time.time() time_end = time.time()
used_time = time_end - time_begin used_time = time_end - time_begin
get_score(pred_label, label, eval_phase = "Train") get_score(pred_label, label, eval_phase="Train")
logger.info('loss is {}'.format(avg_cost_np)) logger.info('loss is {}'.format(avg_cost_np))
logger.info("epoch: %d, step: %d, speed: %f steps/s" % (i, steps, args.skip_steps / used_time)) logger.info("epoch: %d, step: %d, speed: %f steps/s" %
(i, steps, args.skip_steps / used_time))
time_begin = time.time() time_begin = time.time()
if steps % args.save_steps == 0: if steps % args.save_steps == 0:
save_path = os.path.join(args.checkpoints, save_path = os.path.join(args.checkpoints,
"step_" + str(steps)) "step_" + str(steps))
fluid.io.save_persistables(train_exe, save_path, train_prog) fluid.io.save(train_prog, save_path)
logger.info("[save]step %d : save at %s" % (steps, save_path)) logger.info("[save]step %d : save at %s" %
(steps, save_path))
if steps % args.validation_steps == 0: if steps % args.validation_steps == 0:
if args.do_eval: if args.do_eval:
evaluate(args, test_exe, build_res, "eval") evaluate(args, test_exe, build_res, "eval")
@@ -173,11 +186,16 @@ def train(args, train_exe, build_res, place):
logger.error("Train error : %s" % str(e)) logger.error("Train error : %s" % str(e))
exit(1) exit(1)
save_path = os.path.join(args.checkpoints, "step_" + str(steps)) save_path = os.path.join(args.checkpoints, "step_" + str(steps))
fluid.io.save_persistables(train_exe, save_path, train_prog) fluid.io.save(train_prog, save_path)
logger.info("[save]step %d : save at %s" % (steps, save_path)) logger.info("[save]step %d : save at %s" % (steps, save_path))
def evaluate(args,
             test_exe,
             build_res,
             eval_phase,
             save_result=False,
             id2intent=None):
    """[evaluate on dev/test dataset]

    Arguments:
@@ -203,14 +221,14 @@ def evaluate(args, test_exe, build_res, eval_phase, save_result=False, id2intent
    total_cost, total_acc, pred_prob_list, pred_label_list, label_list = [], [], [], [], []
    if eval_phase == "eval":
        test_prog = build_res["eval_compiled_prog"]
        test_data_loader = build_res["eval_data_loader"]
    elif eval_phase == "test":
        test_prog = build_res["test_compiled_prog"]
        test_data_loader = build_res["test_data_loader"]
    else:
        exit(1)
    logger.info("-----------------------------------------------------------")
    for data in test_data_loader():
        avg_cost_np, avg_pred_np, pred_label, label = test_exe.run(program=test_prog, fetch_list=fetch_list, feed=data, \
                return_numpy=True)
        total_cost.append(avg_cost_np)
@@ -219,13 +237,18 @@ def evaluate(args, test_exe, build_res, eval_phase, save_result=False, id2intent
        label_list.extend(label)
    if save_result:
        logger.info("save result at : %s" % args.save_dir + "/" + eval_phase +
                    ".rst")
        save_dir = args.save_dir
        if not os.path.exists(save_dir):
            logger.warning("save dir does not exist, creating it")
            os.makedirs(save_dir)
        fin = codecs.open(
            os.path.join(args.data_dir, eval_phase + ".txt"),
            "r",
            encoding="utf8")
        fout = codecs.open(
            args.save_dir + "/" + eval_phase + ".rst", "w", encoding="utf8")
        for line in pred_prob_list:
            query = fin.readline().rsplit("\t", 1)[0]
            res = []
@@ -245,9 +268,14 @@ def evaluate(args, test_exe, build_res, eval_phase, save_result=False, id2intent
logger.info("-----------------------------------------------------------") logger.info("-----------------------------------------------------------")
def create_net(args,
               flow_data,
               class_dim,
               dict_dim,
               place,
               model_name="textcnn_net",
               is_infer=False):
    """[create network and loader]

    Arguments:
        flow_data {[type]} -- [description]
@@ -266,11 +294,23 @@ def create_net(args, flow_data, class_dim, dict_dim, place, model_name="textcnn_
        model = textcnn_net_multi_label
    else:
        return
    char_list = fluid.data(
        name="char",
        shape=[None, args.max_seq_len, 1],
        dtype="int64",
        lod_level=0)
    label = fluid.data(
        name="label", shape=[None, class_dim], dtype="float32",
        lod_level=0)  # label data
    data_loader = fluid.io.DataLoader.from_generator(
        feed_list=[char_list, label],
        capacity=args.batch_size * 10,
        iterable=True,
        return_list=False)
    output = model(
        char_list,
        label,
        dict_dim,
        emb_dim=flow_data["model"]["emb_dim"],
        hid_dim=flow_data["model"]["hid_dim"],
        hid_dim2=flow_data["model"]["hid_dim2"],
@@ -281,14 +321,15 @@ def create_net(args, flow_data, class_dim, dict_dim, place, model_name="textcnn_
        max_seq_len=args.max_seq_len)
    if is_infer:
        prediction = output
        return [data_loader, prediction]
    else:
        avg_cost, prediction, pred_label, label = output[0], output[1], output[
            2], output[3]
        return [data_loader, avg_cost, prediction, pred_label, label]
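This is the heart of the PyReader-to-DataLoader migration: `fluid.io.PyReader(feed_list=...)` becomes `fluid.io.DataLoader.from_generator(...)`, and the hookup call changes from `decorate_sample_list_generator` to `set_sample_list_generator` (see `build_graph` below). A minimal self-contained sketch of the new pattern, assuming Paddle 1.6 and a single CPU place:

```python
import numpy as np
import paddle.fluid as fluid

place = fluid.CPUPlace()
x = fluid.data(name="x", shape=[None, 1], dtype="float32")
y = fluid.layers.scale(x, scale=2.0)
loader = fluid.io.DataLoader.from_generator(
    feed_list=[x], capacity=8, iterable=True, return_list=False)

def sample_list_generator():
    # Yields one batch at a time: a list of per-sample tuples.
    for _ in range(4):
        yield [(np.random.rand(1).astype("float32"),) for _ in range(2)]

loader.set_sample_list_generator(sample_list_generator, places=place)

exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
for data in loader():  # iterable mode: yields one feed dict per place
    print(exe.run(feed=data[0], fetch_list=[y]))
```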
def build_data_loader(args, char_dict, intent_dict):
    """[decorate samples for dataloader]

    Arguments:
        args {[type]} -- [description]
@@ -298,20 +339,22 @@ def build_data_reader(args, char_dict, intent_dict):
    Returns:
        [type] -- [description]
    """
    loader_res = {}
    if args.do_train:
        train_processor = DataReader(char_dict, intent_dict, args.max_seq_len)
        train_data_generator = train_processor.prepare_data(
            data_path=args.data_dir + "train.txt",
            batch_size=args.batch_size,
            mode='train')
        loader_res["train_data_generator"] = train_data_generator
        num_train_examples = train_processor._get_num_examples()
        logger.info("Num train examples: %d" % num_train_examples)
        logger.info("Num train steps: %d" % (math.ceil(num_train_examples * 1.0 / args.batch_size) * \
                    args.epoch // DEV_COUNT))
        if math.ceil(num_train_examples * 1.0 /
                     args.batch_size) // DEV_COUNT <= 0:
            logger.error(
                "Num of train steps is less than 0 or equals to 0, exit")
            exit(1)
    if args.do_eval:
        eval_processor = DataReader(char_dict, intent_dict, args.max_seq_len)
@@ -319,7 +362,7 @@ def build_data_reader(args, char_dict, intent_dict):
data_path=args.data_dir + "eval.txt", data_path=args.data_dir + "eval.txt",
batch_size=args.batch_size, batch_size=args.batch_size,
mode='eval') mode='eval')
reader_res["eval_data_generator"] = eval_data_generator loader_res["eval_data_generator"] = eval_data_generator
num_eval_examples = eval_processor._get_num_examples() num_eval_examples = eval_processor._get_num_examples()
logger.info("Num eval examples: %d" % num_eval_examples) logger.info("Num eval examples: %d" % num_eval_examples)
if args.do_test: if args.do_test:
@@ -328,11 +371,12 @@ def build_data_reader(args, char_dict, intent_dict):
data_path=args.data_dir + "test.txt", data_path=args.data_dir + "test.txt",
batch_size=args.batch_size, batch_size=args.batch_size,
mode='test') mode='test')
reader_res["test_data_generator"] = test_data_generator loader_res["test_data_generator"] = test_data_generator
return reader_res return loader_res
def build_graph(args, model_config, num_labels, dict_dim, place, test_place,
                loader_res):
    """[build paddle graph]

    Arguments:
@@ -341,7 +385,7 @@ def build_graph(args, model_config, num_labels, dict_dim, place, test_place, rea
        num_labels {[type]} -- [description]
        dict_dim {[type]} -- [description]
        place {[type]} -- [description]
        loader_res {[type]} -- [description]

    Returns:
        [type] -- [description]
@@ -358,36 +402,42 @@ def build_graph(args, model_config, num_labels, dict_dim, place, test_place, rea
    if args.do_train:
        with fluid.program_guard(train_prog, startup_prog):
            with fluid.unique_name.guard():
                train_data_loader, cost, prediction, pred_label, label = create_net(args, model_config, num_labels, \
                        dict_dim, place, model_name="textcnn_net")
                train_data_loader.set_sample_list_generator(
                    loader_res['train_data_generator'], places=place)
                res["train_data_loader"] = train_data_loader
                sgd_optimizer = fluid.optimizer.SGD(
                    learning_rate=fluid.layers.exponential_decay(
                        learning_rate=args.learning_rate,
                        decay_steps=1000,
                        decay_rate=0.5,
                        staircase=True))
                sgd_optimizer.minimize(cost)
    if args.do_eval:
        with fluid.program_guard(eval_prog, startup_prog):
            with fluid.unique_name.guard():
                eval_data_loader, cost, prediction, pred_label, label = create_net(args, model_config, num_labels, \
                        dict_dim, test_place, model_name="textcnn_net")
                eval_data_loader.set_sample_list_generator(
                    loader_res['eval_data_generator'], places=test_place)
                res["eval_data_loader"] = eval_data_loader
    if args.do_test:
        with fluid.program_guard(test_prog, startup_prog):
            with fluid.unique_name.guard():
                test_data_loader, cost, prediction, pred_label, label = create_net(args, model_config, num_labels, \
                        dict_dim, test_place, model_name="textcnn_net")
                test_data_loader.set_sample_list_generator(
                    loader_res['test_data_generator'], places=test_place)
                res["test_data_loader"] = test_data_loader
res["cost"] = cost res["cost"] = cost
res["prediction"] = prediction res["prediction"] = prediction
res["label"] = label res["label"] = label
res["pred_label"] = pred_label res["pred_label"] = pred_label
res["train_prog"] =train_prog res["train_prog"] = train_prog
res["eval_prog"] = eval_prog res["eval_prog"] = eval_prog
res["test_prog"] = test_prog res["test_prog"] = test_prog
return res return res
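For reference, the training branch above pairs plain SGD with a staircase exponential-decay schedule, so the learning rate is halved every 1000 steps: lr(step) = learning_rate * 0.5^floor(step / 1000). A minimal sketch of just that wiring, assuming Paddle 1.6 (in the real code this runs inside the program and unique-name guards, and `cost` is the model's mean loss):

```python
import paddle.fluid as fluid

# Staircase schedule: lr drops by half at steps 1000, 2000, ...
lr = fluid.layers.exponential_decay(
    learning_rate=0.1, decay_steps=1000, decay_rate=0.5, staircase=True)
sgd = fluid.optimizer.SGD(learning_rate=lr)
# sgd.minimize(cost)  # attach to the loss of the training program
```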
@@ -421,8 +471,9 @@ def main(args):
        id2intent[int(value)] = key
    num_labels = len(intent_dict)
    # build model
    loader_res = build_data_loader(args, char_dict, intent_dict)
    build_res = build_graph(args, model_config, num_labels, dict_dim, place,
                            test_place, loader_res)
    build_res["place"] = place
    build_res["test_place"] = test_place
    if not (args.do_train or args.do_eval or args.do_test):
@@ -432,11 +483,13 @@ def main(args):
    exe.run(startup_prog)
    if args.init_checkpoint and args.init_checkpoint != "None":
        try:
            init_checkpoint(
                exe, args.init_checkpoint, main_program=startup_prog)
            logger.info("Load model from %s" % args.init_checkpoint)
        except Exception as e:
            logger.exception(str(e))
            logger.error("Failed to load model from %s [%s]" %
                         (args.init_checkpoint, str(e)))
    build_strategy = fluid.compiler.BuildStrategy()
    build_strategy.fuse_all_reduce_ops = False
    exec_strategy = fluid.ExecutionStrategy()
@@ -449,10 +502,12 @@ def main(args):
            exec_strategy=exec_strategy)
        build_res["compiled_prog"] = compiled_prog
    if args.do_test:
        test_compiled_prog = fluid.compiler.CompiledProgram(build_res[
            "test_prog"])
        build_res["test_compiled_prog"] = test_compiled_prog
    if args.do_eval:
        eval_compiled_prog = fluid.compiler.CompiledProgram(build_res[
            "eval_prog"])
        build_res["eval_compiled_prog"] = eval_compiled_prog
    if args.do_train:
@@ -465,7 +520,6 @@ def main(args):
                 save_result=True, id2intent=id2intent)
if __name__ == "__main__":
    logger.info("the paddle version is %s" % paddle.__version__)
    check_version('1.6.0')
......
@@ -32,7 +32,6 @@ try:
except ImportError:
    import ConfigParser as cp
random_seed = 7
logger = logging.getLogger()
format = "%(asctime)s - %(name)s - %(levelname)s -%(filename)s-%(lineno)4d -%(message)s"
@@ -77,6 +76,7 @@ class ArgumentGroup(object):
    Arguments:
        object {[type]} -- [description]
    """

    def __init__(self, parser, title, des):
        self._group = parser.add_argument_group(title=title, description=des)
@@ -107,6 +107,7 @@ class DataReader(object):
    Returns:
        [type] -- [description]
    """

    def __init__(self, char_vocab, intent_dict, max_len):
        self._char_vocab = char_vocab
        self._intent_dict = intent_dict
@@ -128,12 +129,17 @@ class DataReader(object):
        # word_dict_path), "The given word dictionary does not exist."
        assert os.path.exists(data_path), "The given data file does not exist."
        if mode == "train":
            train_reader = fluid.io.batch(
                paddle.reader.shuffle(
                    self.data_reader(
                        data_path, self.max_len, shuffle=True),
                    buf_size=batch_size * 100),
                batch_size)
            # train_reader = fluid.io.batch(self.data_reader(data_path), batch_size)
            return train_reader
        else:
            test_reader = fluid.io.batch(
                self.data_reader(data_path, self.max_len), batch_size)
            return test_reader
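`prepare_data` builds the train pipeline by composing reader decorators: the raw sample reader is wrapped in `paddle.reader.shuffle` (shuffles within a fixed-size buffer) and then `fluid.io.batch` (groups samples into batches). A standalone sketch of that composition, with a stand-in for `DataReader.data_reader`:

```python
import paddle
import paddle.fluid as fluid

def sample_reader():
    # Stand-in sample reader: yields one (char_ids, intent_ids) pair at a time.
    for i in range(10):
        yield [i, i + 1], [i % 2]

# Shuffle within a 100-sample buffer, then batch by 4, mirroring the
# train branch above.
train_reader = fluid.io.batch(
    paddle.reader.shuffle(sample_reader, buf_size=100), batch_size=4)

for batch in train_reader():
    print(batch)  # a list of up to 4 samples
```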
    def data_reader(self, file_path, max_len, shuffle=False):
@@ -150,7 +156,8 @@ class DataReader(object):
            char_id_list = list(map(lambda x: 0 if x not in self._char_vocab else int(self._char_vocab[x]), \
                    list(query)))
            if len(char_id_list) < max_len:
                char_id_list.extend([self.padding_id] *
                                    (max_len - len(char_id_list)))
            char_id_list = char_id_list[:max_len]
            intent_id_list = [self.padding_id] * self.intent_size
            for item in intent.split('\2'):
@@ -159,6 +166,7 @@ class DataReader(object):
        if shuffle:
            random.seed(random_seed)
            random.shuffle(self.all_data)
        def reader():
            """
            reader
@@ -166,6 +174,7 @@ class DataReader(object):
            for char_id_list, intent_id_list in self.all_data:
                # print char_id_list, intent_id
                yield char_id_list, intent_id_list

        return reader
@@ -178,6 +187,7 @@ class DataProcesser(object):
    Returns:
        [type] -- [description]
    """

    @staticmethod
    def read_dict(filename):
        """
@@ -227,7 +237,8 @@ class DataProcesser(object):
                intent_dict[intent] = 0
            intent_dict[intent] += 1
        # save char dict
        with codecs.open(
                "%s/char.dict" % save_dir, "w", encoding="utf8") as f_out:
            f_out.write("PAD\0020\n")
            f_out.write("OOV\0021\n")
            char_id = 2
@@ -238,7 +249,8 @@ class DataProcesser(object):
f_out.write("%s\002%d\n" % (key, char_id)) f_out.write("%s\002%d\n" % (key, char_id))
char_id += 1 char_id += 1
# save intent dict # save intent dict
with codecs.open("%s/domain.dict" % save_dir, "w", encoding="utf8") as f_out: with codecs.open(
"%s/domain.dict" % save_dir, "w", encoding="utf8") as f_out:
f_out.write("SYS_OTHER\0020\n") f_out.write("SYS_OTHER\0020\n")
intent_id = 1 intent_id = 1
for key, value in intent_dict.items(): for key, value in intent_dict.items():
@@ -249,7 +261,6 @@ class DataProcesser(object):
                intent_id += 1


class ConfigReader(object):
    """[read model config file]
@@ -282,49 +293,13 @@ class ConfigReader(object):
        return flow_data
def init_checkpoint(exe, init_checkpoint_path, main_program):
    """
    Init CheckPoint
    """
    fluid.load(main_program, init_checkpoint_path, exe)
    print("Load model from {}".format(init_checkpoint_path))
def print_arguments(args):
    """
@@ -350,5 +325,3 @@ def check_version(version='1.6.0'):
    except Exception as e:
        logger.error(err)
        sys.exit(1)
@@ -21,8 +21,10 @@ from kpi import DurationKpi
train_loss_card1 = CostKpi('train_loss_card1', 0.03, 0, actived=True)
train_loss_card4 = CostKpi('train_loss_card4', 0.03, 0, actived=True)
train_duration_card1 = DurationKpi(
    'train_duration_card1', 0.01, 0, actived=True)
train_duration_card4 = DurationKpi(
    'train_duration_card4', 0.01, 0, actived=True)
tracking_kpis = [
    train_loss_card1,
......
@@ -20,22 +20,25 @@ import sys
import io
import os
URLLIB = urllib
if sys.version_info >= (3, 0):
    import urllib.request
    URLLIB = urllib.request
DATA_MODEL_PATH = {"DATA_PATH": "https://baidu-nlp.bj.bcebos.com/auto_dialogue_evaluation_dataset-1.0.0.tar.gz", DATA_MODEL_PATH = {
"TRAINED_MODEL": "https://baidu-nlp.bj.bcebos.com/auto_dialogue_evaluation_models.2.0.0.tar.gz"} "DATA_PATH":
"https://baidu-nlp.bj.bcebos.com/auto_dialogue_evaluation_dataset-1.0.0.tar.gz",
"TRAINED_MODEL":
"https://baidu-nlp.bj.bcebos.com/auto_dialogue_evaluation_models.2.0.0.tar.gz"
}
PATH_MAP = {'DATA_PATH': "./data/input", PATH_MAP = {'DATA_PATH': "./data/input", 'TRAINED_MODEL': './data/saved_models'}
'TRAINED_MODEL': './data/saved_models'}
def un_tar(tar_name, dir_name):
    try:
        t = tarfile.open(tar_name)
        t.extractall(path=dir_name)
        return True
    except Exception as e:
        print(e)
@@ -51,7 +54,8 @@ def download_model_and_data():
        shutil.rmtree(path)
    for path_key in DATA_MODEL_PATH:
        filename = os.path.basename(DATA_MODEL_PATH[path_key])
        URLLIB.urlretrieve(DATA_MODEL_PATH[path_key],
                           os.path.join("./", filename))
        state = un_tar(filename, PATH_MAP[path_key])
        if not state:
            print("Tar %s error....." % path_key)
......
@@ -122,5 +122,3 @@ def save_param(args, exe, program, dirname):
print("save parameters at %s" % (os.path.join(param_dir, dirname))) print("save parameters at %s" % (os.path.join(param_dir, dirname)))
return True return True
@@ -21,8 +21,7 @@ import paddle
import paddle.fluid as fluid
def create_net(is_training,
               model_input,
               args,
               clip_value=10.0,
@@ -52,14 +51,12 @@ def create_net(
            initializer=fluid.initializer.Normal(scale=0.1)))

    # fc to fit dynamic LSTM
    context_fc = fluid.layers.fc(input=context_emb,
                                 size=args.hidden_size * 4,
                                 param_attr=fluid.ParamAttr(name='fc_weight'),
                                 bias_attr=fluid.ParamAttr(name='fc_bias'))
    response_fc = fluid.layers.fc(input=response_emb,
                                  size=args.hidden_size * 4,
                                  param_attr=fluid.ParamAttr(name='fc_weight'),
                                  bias_attr=fluid.ParamAttr(name='fc_bias'))
@@ -106,7 +103,5 @@ def set_word_embedding(word_emb, place, word_emb_name="shared_word_emb"):
""" """
Set word embedding Set word embedding
""" """
word_emb_param = fluid.global_scope().find_var( word_emb_param = fluid.global_scope().find_var(word_emb_name).get_tensor()
word_emb_name).get_tensor()
word_emb_param.set(word_emb, place) word_emb_param.set(word_emb, place)
@@ -42,22 +42,24 @@ def do_save_inference_model(args):
        with fluid.unique_name.guard():
            context_wordseq = fluid.data(
                name='context_wordseq',
                shape=[-1, 1],
                dtype='int64',
                lod_level=1)
            response_wordseq = fluid.data(
                name='response_wordseq',
                shape=[-1, 1],
                dtype='int64',
                lod_level=1)
            labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')

            input_inst = [context_wordseq, response_wordseq, labels]
            input_field = InputField(input_inst)
            data_reader = fluid.io.PyReader(
                feed_list=input_inst, capacity=4, iterable=False)
            logits = create_net(
                is_training=False, model_input=input_field, args=args)
    if args.use_cuda:
        place = fluid.CUDAPlace(0)
@@ -81,9 +83,7 @@ def do_save_inference_model(args):
            input_field.context_wordseq.name,
            input_field.response_wordseq.name,
        ],
        target_vars=[logits, ],
        executor=exe,
        main_program=test_prog,
        model_filename="model.pdmodel",
......
@@ -26,7 +26,6 @@ from inference_model import do_save_inference_model
from ade.utils.configure import PDConfig

if __name__ == "__main__":
    args = PDConfig(yaml_file="./data/config/ade.yaml")
......
@@ -46,22 +46,24 @@ def do_predict(args):
        with fluid.unique_name.guard():
            context_wordseq = fluid.data(
                name='context_wordseq',
                shape=[-1, 1],
                dtype='int64',
                lod_level=1)
            response_wordseq = fluid.data(
                name='response_wordseq',
                shape=[-1, 1],
                dtype='int64',
                lod_level=1)
            labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')

            input_inst = [context_wordseq, response_wordseq, labels]
            input_field = InputField(input_inst)
            data_reader = fluid.io.PyReader(
                feed_list=input_inst, capacity=4, iterable=False)
            logits = create_net(
                is_training=False, model_input=input_field, args=args)
            logits.persistable = True
            fetch_list = [logits.name]
@@ -89,10 +91,7 @@ def do_predict(args):
        batch_size=args.batch_size)

    batch_generator = processor.data_generator(
        place=place, phase="test", shuffle=False, sample_pro=1)
    num_test_examples = processor.get_num_examples(phase='test')
    data_reader.decorate_batch_generator(batch_generator)
@@ -107,7 +106,7 @@ def do_predict(args):
            data_reader.reset()
            break

    scores = scores[:num_test_examples]
    print("Write the predicted results into the output_prediction_file")
    fw = io.open(args.output_prediction_file, 'w', encoding="utf8")
    for index, score in enumerate(scores):
......
@@ -49,22 +49,24 @@ def do_train(args):
        with fluid.unique_name.guard():
            context_wordseq = fluid.data(
                name='context_wordseq',
                shape=[-1, 1],
                dtype='int64',
                lod_level=1)
            response_wordseq = fluid.data(
                name='response_wordseq',
                shape=[-1, 1],
                dtype='int64',
                lod_level=1)
            labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')

            input_inst = [context_wordseq, response_wordseq, labels]
            input_field = InputField(input_inst)
            data_reader = fluid.io.PyReader(
                feed_list=input_inst, capacity=4, iterable=False)
            loss = create_net(
                is_training=True, model_input=input_field, args=args)
            loss.persistable = True
            # gradient clipping
            fluid.clip.set_gradient_clip(clip=fluid.clip.GradientClipByValue(
@@ -74,7 +76,8 @@ def do_train(args):
    if args.use_cuda:
        dev_count = fluid.core.get_cuda_device_count()
        place = fluid.CUDAPlace(
            int(os.getenv('FLAGS_selected_gpus', '0')))
    else:
        dev_count = int(os.environ.get('CPU_NUM', 1))
        place = fluid.CPUPlace()
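A note on the gradient clipping opened two hunks above: `set_gradient_clip` with `GradientClipByValue` clips every gradient element into a fixed range before the optimizer applies it (the model's caller passes `clip_value=10.0`). A minimal sketch of that wiring, assuming Paddle 1.6:

```python
import paddle.fluid as fluid

# Element-wise clip of every gradient into [-10, 10] before the update,
# matching create_net's clip_value=10.0 default.
fluid.clip.set_gradient_clip(
    clip=fluid.clip.GradientClipByValue(max=10.0, min=-10.0))
# optimizer.minimize(loss) then operates on the clipped gradients.
```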
@@ -114,9 +117,14 @@ def do_train(args):
    if args.word_emb_init:
        print("start loading word embedding init ...")
        if six.PY2:
            word_emb = np.array(
                pickle.load(io.open(args.word_emb_init, 'rb'))).astype(
                    'float32')
        else:
            word_emb = np.array(
                pickle.load(
                    io.open(args.word_emb_init, 'rb'),
                    encoding="bytes")).astype('float32')
        set_word_embedding(word_emb, place)
        print("finish init word embedding ...")
@@ -147,15 +155,20 @@ def do_train(args):
                    used_time = time_end - time_begin
                    current_time = time.strftime('%Y-%m-%d %H:%M:%S',
                                                 time.localtime(time.time()))
                    print(
                        '%s epoch: %d, step: %s, avg loss %s, speed: %f steps/s'
                        % (current_time, epoch_step, steps, sum_loss /
                           args.print_steps, args.print_steps / used_time))
                    sum_loss = 0.0
                    time_begin = time.time()
                if steps % args.save_steps == 0:
                    if args.save_checkpoint:
                        save_load_io.save_checkpoint(args, exe, train_prog,
                                                     "step_" + str(steps))
                    if args.save_param:
                        save_load_io.save_param(args, exe, train_prog,
                                                "step_" + str(steps))
                steps += 1
        except fluid.core.EOFException:
            data_reader.reset()
......
@@ -20,12 +20,18 @@ from kpi import CostKpi
from kpi import DurationKpi
from kpi import AccKpi
each_step_duration_atis_slot_card1 = DurationKpi(
    'each_step_duration_atis_slot_card1', 0.01, 0, actived=True)
train_loss_atis_slot_card1 = CostKpi(
    'train_loss_atis_slot_card1', 0.08, 0, actived=True)
train_acc_atis_slot_card1 = CostKpi(
    'train_acc_atis_slot_card1', 0.01, 0, actived=True)
each_step_duration_atis_slot_card4 = DurationKpi(
    'each_step_duration_atis_slot_card4', 0.06, 0, actived=True)
train_loss_atis_slot_card4 = CostKpi(
    'train_loss_atis_slot_card4', 0.03, 0, actived=True)
train_acc_atis_slot_card4 = CostKpi(
    'train_acc_atis_slot_card4', 0.01, 0, actived=True)
tracking_kpis = [
    each_step_duration_atis_slot_card1,
......
@@ -100,8 +100,12 @@ def prepare_batch_data(task_name,
    if isinstance(insts[0][3], list):
        if task_name == "atis_slot":
            labels_list = [
                inst[3] + [0] * (max_len - len(inst[3])) for inst in insts
            ]
            labels_list = [
                np.array(labels_list).astype("int64").reshape([-1, max_len])
            ]
        elif task_name == "dstc2":
            labels_list = [inst[3] for inst in insts]
            labels_list = [np.array(labels_list).astype("int64")]
@@ -124,10 +128,7 @@ def prepare_batch_data(task_name,
    out = batch_src_ids
    # Second step: padding
    src_id, self_input_mask = pad_batch_data(
        out, max_len, pad_idx=pad_id, return_input_mask=True)
    pos_id = pad_batch_data(
        batch_pos_ids,
        max_len,
@@ -163,13 +164,13 @@ def pad_batch_data(insts,
        corresponding position data and attention bias.
    """
    return_list = []
    max_len = max_len_in if max_len_in != -1 else max(
        len(inst) for inst in insts)
    # Any token included in dict can be used to pad, since the paddings' loss
    # will be masked out by weights and make no effect on parameter gradients.
    inst_data = np.array(
        [inst + list([pad_idx] * (max_len - len(inst))) for inst in insts])
    return_list += [inst_data.astype("int64").reshape([-1, max_len])]
    # position data
......
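`pad_batch_data` right-pads every instance to the batch maximum and reshapes to [batch_size, max_len]; the same pad-with-zeros idea appears above for the atis_slot labels. A standalone numpy sketch of the core padding step:

```python
import numpy as np

insts = [[5, 2, 9], [7], [1, 3]]  # token ids with ragged lengths
pad_idx = 0
max_len = max(len(inst) for inst in insts)

# Right-pad each instance with pad_idx, then stack to [batch, max_len].
inst_data = np.array(
    [inst + [pad_idx] * (max_len - len(inst)) for inst in insts])
print(inst_data.astype("int64").reshape([-1, max_len]))
# [[5 2 9]
#  [7 0 0]
#  [1 3 0]]
```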
@@ -25,18 +25,21 @@ class DefinePredict(object):
""" """
Packaging Prediction Results Packaging Prediction Results
""" """
def __init__(self): def __init__(self):
""" """
init init
""" """
self.task_map = {'udc': 'get_matching_res', self.task_map = {
'udc': 'get_matching_res',
'swda': 'get_cls_res', 'swda': 'get_cls_res',
'mrda': 'get_cls_res', 'mrda': 'get_cls_res',
'atis_intent': 'get_cls_res', 'atis_intent': 'get_cls_res',
'atis_slot': 'get_sequence_tagging', 'atis_slot': 'get_sequence_tagging',
'dstc2': 'get_multi_cls_res', 'dstc2': 'get_multi_cls_res',
'dstc2_asr': 'get_multi_cls_res', 'dstc2_asr': 'get_multi_cls_res',
'multi-woz': 'get_multi_cls_res'} 'multi-woz': 'get_multi_cls_res'
}
    def get_matching_res(self, probs, params=None):
        """
...@@ -79,7 +82,3 @@ class DefinePredict(object): ...@@ -79,7 +82,3 @@ class DefinePredict(object):
label_str = " ".join([str(l) for l in sorted(labels)]) label_str = " ".join([str(l) for l in sorted(labels)])
return label_str return label_str
@@ -20,25 +20,29 @@ import sys
import io
import os

-URLLIB=urllib
+URLLIB = urllib
if sys.version_info >= (3, 0):
    import urllib.request
-   URLLIB=urllib.request
+   URLLIB = urllib.request

-DATA_MODEL_PATH = {"DATA_PATH": "https://baidu-nlp.bj.bcebos.com/dmtk_data_1.0.0.tar.gz",
-                   "PRETRAIN_MODEL": "https://bert-models.bj.bcebos.com/uncased_L-12_H-768_A-12.tar.gz",
-                   "TRAINED_MODEL": "https://baidu-nlp.bj.bcebos.com/dgu_models_2.0.0.tar.gz"}
+DATA_MODEL_PATH = {
+    "DATA_PATH": "https://baidu-nlp.bj.bcebos.com/dmtk_data_1.0.0.tar.gz",
+    "PRETRAIN_MODEL":
+    "https://bert-models.bj.bcebos.com/uncased_L-12_H-768_A-12.tar.gz",
+    "TRAINED_MODEL": "https://baidu-nlp.bj.bcebos.com/dgu_models_2.0.0.tar.gz"
+}

-PATH_MAP = {'DATA_PATH': "./data/input",
-            'PRETRAIN_MODEL': './data/pretrain_model',
-            'TRAINED_MODEL': './data/saved_models'}
+PATH_MAP = {
+    'DATA_PATH': "./data/input",
+    'PRETRAIN_MODEL': './data/pretrain_model',
+    'TRAINED_MODEL': './data/saved_models'
+}


def un_tar(tar_name, dir_name):
    try:
        t = tarfile.open(tar_name)
-       t.extractall(path = dir_name)
+       t.extractall(path=dir_name)
        return True
    except Exception as e:
        print(e)

@@ -48,13 +52,18 @@ def un_tar(tar_name, dir_name):
def download_model_and_data():
    print("Downloading dgu data, pretrain model and trained models......")
    print("This process is quite long, please wait patiently............")
-   for path in ['./data/input/data', './data/pretrain_model/uncased_L-12_H-768_A-12', './data/saved_models/trained_models']:
+   for path in [
+           './data/input/data',
+           './data/pretrain_model/uncased_L-12_H-768_A-12',
+           './data/saved_models/trained_models'
+   ]:
        if not os.path.exists(path):
            continue
        shutil.rmtree(path)
    for path_key in DATA_MODEL_PATH:
        filename = os.path.basename(DATA_MODEL_PATH[path_key])
-       URLLIB.urlretrieve(DATA_MODEL_PATH[path_key], os.path.join("./", filename))
+       URLLIB.urlretrieve(DATA_MODEL_PATH[path_key],
+                          os.path.join("./", filename))
        state = un_tar(filename, PATH_MAP[path_key])
        if not state:
            print("Tar %s error....." % path_key)
......
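For readers unfamiliar with the `URLLIB` indirection being reformatted above: it is a Python 2/3 shim, since `urlretrieve` lives on `urllib` in Python 2 but on `urllib.request` in Python 3, so the module binds whichever is available once and uses it unconditionally afterwards. A condensed sketch of the same download-and-unpack pattern (illustrative function name and error handling):

```python
import sys
import tarfile
import urllib

URLLIB = urllib                      # Python 2: urllib.urlretrieve
if sys.version_info >= (3, 0):
    import urllib.request
    URLLIB = urllib.request          # Python 3: urllib.request.urlretrieve

def fetch_and_untar(url, filename, dest_dir):
    """Download a tarball and unpack it; returns False on extraction errors."""
    URLLIB.urlretrieve(url, filename)
    try:
        tarfile.open(filename).extractall(path=dest_dir)
        return True
    except tarfile.TarError as e:
        print(e)
        return False
```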
@@ -19,6 +19,3 @@ python run_build_data.py udc
python run_build_data.py atis
The generated slot-filling data is placed in dialogue_general_understanding/data/input/data/atis/atis_slot
The generated intent-detection data is placed in dialogue_general_understanding/data/input/data/atis/atis_intent
@@ -12,7 +12,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""build swda train dev test dataset"""
import json
@@ -27,6 +26,7 @@ class ATIS(object):
    """
    nlu dataset atis data process
    """

    def __init__(self):
        """
        init instance
        """

@@ -73,7 +73,8 @@ class ATIS(object):
                if example[1] not in self.intent_dict:
                    self.intent_dict[example[1]] = self.intent_id
                    self.intent_id += 1
-               fw.write(u"%s\t%s\n" % (self.intent_dict[example[1]], example[0].lower()))
+               fw.write(u"%s\t%s\n" %
+                        (self.intent_dict[example[1]], example[0].lower()))

        fw = io.open(self.map_tag_intent, 'w', encoding="utf8")
        for tag in self.intent_dict:

@@ -109,17 +110,19 @@ class ATIS(object):
                    tags_slot.append(str(self.slot_dict[tag]))
                if i == 0:
                    if start not in [0, 1]:
-                       prefix_num = len(text[: start].strip().split())
+                       prefix_num = len(text[:start].strip().split())
                        tags.extend([str(self.slot_dict['O'])] * prefix_num)
                    tags.extend(tags_slot)
                else:
-                   prefix_num = len(text[entities[i - 1]['end']: start].strip().split())
+                   prefix_num = len(text[entities[i - 1]['end']:start].strip()
+                                    .split())
                    tags.extend([str(self.slot_dict['O'])] * prefix_num)
                    tags.extend(tags_slot)
            if entities[-1]['end'] < len(text):
                suffix_num = len(text[entities[-1]['end']:].strip().split())
                tags.extend([str(self.slot_dict['O'])] * suffix_num)
-           fw.write(u"%s\t%s\n" % (text.encode('utf8'), " ".join(tags).encode('utf8')))
+           fw.write(u"%s\t%s\n" %
+                    (text.encode('utf8'), " ".join(tags).encode('utf8')))

        fw = io.open(self.map_tag_slot, 'w', encoding="utf8")
        for slot in self.slot_dict:

@@ -152,7 +155,3 @@ class ATIS(object):

if __name__ == "__main__":
    atis_inst = ATIS()
    atis_inst.main()
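The slot-data writer above turns character-level entity spans into per-token tags by counting whitespace tokens before, inside, and after each span. A simplified single-span sketch of that alignment (hypothetical helper; the real loop also stitches together multiple entities and a numeric tag dictionary):

```python
def align_span_to_tags(text, start, end, slot_tag, o_tag="O"):
    """Map one character span [start, end) onto per-token slot tags."""
    prefix = len(text[:start].strip().split())      # tokens before the span
    inside = len(text[start:end].strip().split())   # tokens inside the span
    suffix = len(text[end:].strip().split())        # tokens after the span
    return [o_tag] * prefix + [slot_tag] * inside + [o_tag] * suffix

tags = align_span_to_tags("show flights to boston", 16, 22, "B-toloc")
# -> ['O', 'O', 'O', 'B-toloc']
```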
@@ -28,6 +28,7 @@ class DSTC2(object):
    """
    dialogue state tracking dstc2 data process
    """

    def __init__(self):
        """
        init instance
        """

@@ -49,7 +50,8 @@ class DSTC2(object):
        self.data_dict = commonlib.load_dict(self.data_list)
        for data_type in self.data_dict:
            for i in range(len(self.data_dict[data_type])):
-               self.data_dict[data_type][i] = os.path.join(self.src_dir, self.data_dict[data_type][i])
+               self.data_dict[data_type][i] = os.path.join(
+                   self.src_dir, self.data_dict[data_type][i])

    def _load_ontology(self):
        """

@@ -97,15 +99,25 @@ class DSTC2(object):
                log_turn = log_json["turns"][i]
                label_turn = label_json["turns"][i]
                assert log_turn["turn-index"] == label_turn["turn-index"]
-               labels = ["%s_%s" % (slot, label_turn["goal-labels"][slot]) for slot in label_turn["goal-labels"]]
-               labels_ids = " ".join([str(self.map_tag_dict.get(label, self.map_tag_dict["%s_none" % label.split('_')[0]])) for label in labels])
+               labels = [
+                   "%s_%s" % (slot, label_turn["goal-labels"][slot])
+                   for slot in label_turn["goal-labels"]
+               ]
+               labels_ids = " ".join([
+                   str(
+                       self.map_tag_dict.get(label, self.map_tag_dict[
+                           "%s_none" % label.split('_')[0]]))
+                   for label in labels
+               ])
                mach = log_turn['output']['transcript']
                user = label_turn['transcription']
                if not labels_ids.strip():
                    labels_ids = self.map_tag_dict['none']
                out = "%s\t%s\1%s\t%s" % (session_id, mach, user, labels_ids)
-               user_asr = log_turn['input']['live']['asr-hyps'][0]['asr-hyp'].strip()
-               out_asr = "%s\t%s\1%s\t%s" % (session_id, mach, user_asr, labels_ids)
+               user_asr = log_turn['input']['live']['asr-hyps'][0][
+                   'asr-hyp'].strip()
+               out_asr = "%s\t%s\1%s\t%s" % (session_id, mach, user_asr,
+                                             labels_ids)
                fw.write(u"%s\n" % out.encode('utf8'))
                fw_asr.write(u"%s\n" % out_asr.encode('utf8'))

@@ -144,10 +156,7 @@ class DSTC2(object):
        self.get_test_dataset()
        self.get_labels()


if __name__ == "__main__":
    dstc_inst = DSTC2()
    dstc_inst.main()
@@ -27,6 +27,7 @@ class MRDA(object):
    """
    dialogue act dataset mrda data process
    """

    def __init__(self):
        """
        init instance
        """

@@ -67,7 +68,7 @@ class MRDA(object):
        for dadb_key in dadb_list:
            dadb_file = self.dadb_dict[dadb_key]
            fr = io.open(dadb_file, 'r', encoding="utf8")
-           row = csv.reader(fr, delimiter = ',')
+           row = csv.reader(fr, delimiter=',')
            for line in row:
                elems = line
                conv_id = elems[2]

@@ -87,7 +88,7 @@ class MRDA(object):
        for trans_key in trans_list:
            trans_file = self.trans_dict[trans_key]
            fr = io.open(trans_file, 'r', encoding="utf8")
-           row = csv.reader(fr, delimiter = ',')
+           row = csv.reader(fr, delimiter=',')
            for line in row:
                elems = line
                if len(elems) != 3:

@@ -120,7 +121,8 @@ class MRDA(object):
            self.tag_id += 1
        caller = elem.split('_')[0].split('-')[-1]
        conv_no = elem.split('_')[0].split('-')[0]
-       out = "%s\t%s\t%s\t%s" % (conv_no, self.map_tag_dict[tag], caller, v_trans[0])
+       out = "%s\t%s\t%s\t%s" % (conv_no, self.map_tag_dict[tag], caller,
+                                 v_trans[0])
        fw.write(u"%s\n" % out)

    def get_train_dataset(self):

@@ -158,10 +160,7 @@ class MRDA(object):
        self.get_test_dataset()
        self.get_labels()


if __name__ == "__main__":
    mrda_inst = MRDA()
    mrda_inst.main()
@@ -27,6 +27,7 @@ class SWDA(object):
    """
    dialogue act dataset swda data process
    """

    def __init__(self):
        """
        init instance
        """

@@ -63,7 +64,7 @@ class SWDA(object):
        file_path = self.file_dict[name]
        fr = io.open(file_path, 'r', encoding="utf8")
        idx = 0
-       row = csv.reader(fr, delimiter = ',')
+       row = csv.reader(fr, delimiter=',')
        for r in row:
            if idx == 0:
                idx += 1

@@ -224,10 +225,7 @@ class SWDA(object):
        self.get_test_dataset()
        self.get_labels()


if __name__ == "__main__":
    swda_inst = SWDA()
    swda_inst.main()
@@ -71,6 +71,3 @@ def load_voc(conf):
        elems = line.split('\t')
        map_dict[elems[0]] = elems[1]
    return map_dict
@@ -20,7 +20,6 @@ from build_dstc2_dataset import DSTC2
from build_mrda_dataset import MRDA
from build_swda_dataset import SWDA

if __name__ == "__main__":
    task_name = sys.argv[1]
    task_name = task_name.lower()

@@ -38,11 +37,12 @@ if __name__ == "__main__":
    elif task_name == 'atis':
        atis_inst = ATIS()
        atis_inst.main()
-       shutil.copyfile("../../data/input/data/atis/atis_slot/test.txt", "../../data/input/data/atis/atis_slot/dev.txt")
-       shutil.copyfile("../../data/input/data/atis/atis_intent/test.txt", "../../data/input/data/atis/atis_intent/dev.txt")
+       shutil.copyfile("../../data/input/data/atis/atis_slot/test.txt",
+                       "../../data/input/data/atis/atis_slot/dev.txt")
+       shutil.copyfile("../../data/input/data/atis/atis_intent/test.txt",
+                       "../../data/input/data/atis/atis_intent/dev.txt")
    elif task_name == 'dstc2':
        dstc_inst = DSTC2()
        dstc_inst.main()
    else:
        exit(0)
@@ -12,7 +12,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Tokenization classes."""
from __future__ import absolute_import
......
@@ -113,7 +113,7 @@ def multi_head_attention(queries,
        """
        Scaled Dot-Product Attention
        """
-       scaled_q = layers.scale(x=q, scale=d_key ** -0.5)
+       scaled_q = layers.scale(x=q, scale=d_key**-0.5)
        product = layers.matmul(x=scaled_q, y=k, transpose_y=True)
        if attn_bias:
            product += attn_bias
......
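The one-character change above (`d_key ** -0.5` to `d_key**-0.5`) lands on the core of scaled dot-product attention: queries are scaled by 1/sqrt(d_key) before the QK^T product so the softmax logits stay well-conditioned, and `attn_bias` carries large negative values on padded keys. A plain numpy sketch with the batch and head dimensions omitted (illustrative, not the fluid implementation):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v, attn_bias=None):
    """q: [len_q, d_key], k: [len_k, d_key], v: [len_k, d_value]."""
    d_key = q.shape[-1]
    product = (q * d_key ** -0.5) @ k.T        # [len_q, len_k] logits
    if attn_bias is not None:
        product = product + attn_bias          # e.g. -1e9 on padded keys
    weights = np.exp(product - product.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row softmax
    return weights @ v                         # [len_q, d_value]
```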
@@ -122,5 +122,3 @@ def save_param(args, exe, program, dirname):
    print("save parameters at %s" % (os.path.join(param_dir, dirname)))
    return True
@@ -23,12 +23,7 @@ from dgu.bert import BertModel
from dgu.utils.configure import JsonConfig


-def create_net(
-       is_training,
-       model_input,
-       num_labels,
-       paradigm_inst,
-       args):
+def create_net(is_training, model_input, num_labels, paradigm_inst, args):
    """create dialogue task model"""

    src_ids = model_input.src_ids

@@ -48,14 +43,15 @@ def create_net(
        config=bert_conf,
        use_fp16=False)

-   params = {'num_labels': num_labels,
-             'src_ids': src_ids,
-             'pos_ids': pos_ids,
-             'sent_ids': sent_ids,
-             'input_mask': input_mask,
-             'labels': labels,
-             'is_training': is_training}
+   params = {
+       'num_labels': num_labels,
+       'src_ids': src_ids,
+       'pos_ids': pos_ids,
+       'sent_ids': sent_ids,
+       'input_mask': input_mask,
+       'labels': labels,
+       'is_training': is_training
+   }

    results = paradigm_inst.paradigm(bert, params)
    return results
@@ -66,7 +66,9 @@ def do_save_inference_model(args):
        sent_ids = fluid.data(
            name='sent_ids', shape=[-1, args.max_seq_len], dtype='int64')
        input_mask = fluid.data(
-           name='input_mask', shape=[-1, args.max_seq_len], dtype='float32')
+           name='input_mask',
+           shape=[-1, args.max_seq_len],
+           dtype='float32')
        if args.task_name == 'atis_slot':
            labels = fluid.data(
                name='labels', shape=[-1, args.max_seq_len], dtype='int64')

@@ -74,8 +76,7 @@ def do_save_inference_model(args):
            labels = fluid.data(
                name='labels', shape=[-1, num_labels], dtype='int64')
        else:
-           labels = fluid.data(
-               name='labels', shape=[-1, 1], dtype='int64')
+           labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')

        input_inst = [src_ids, pos_ids, sent_ids, input_mask, labels]
        input_field = InputField(input_inst)

@@ -107,14 +108,10 @@ def do_save_inference_model(args):
    fluid.io.save_inference_model(
        args.inference_model_dir,
        feeded_var_names=[
-           input_field.src_ids.name,
-           input_field.pos_ids.name,
-           input_field.sent_ids.name,
-           input_field.input_mask.name
+           input_field.src_ids.name, input_field.pos_ids.name,
+           input_field.sent_ids.name, input_field.input_mask.name
        ],
-       target_vars=[
-           probs
-       ],
+       target_vars=[probs],
        executor=exe,
        main_program=test_prog,
        model_filename="model.pdmodel",
......
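`fluid.io.save_inference_model`, whose call is being reformatted above, prunes the program to the listed feed and fetch variables and writes a self-contained model for deployment. A toy but complete save/load round trip under the same file names (the one-layer network and paths are illustrative only):

```python
import numpy as np
import paddle.fluid as fluid

# Build a trivial program: probs = softmax(fc(x)).
main_prog, startup_prog = fluid.Program(), fluid.Program()
with fluid.program_guard(main_prog, startup_prog):
    x = fluid.data(name="x", shape=[None, 4], dtype="float32")
    probs = fluid.layers.fc(input=x, size=2, act="softmax")

exe = fluid.Executor(fluid.CPUPlace())
exe.run(startup_prog)

# Save: the program is pruned to the feed/fetch boundary.
fluid.io.save_inference_model(
    "./inference_model", feeded_var_names=["x"], target_vars=[probs],
    executor=exe, main_program=main_prog,
    model_filename="model.pdmodel", params_filename="params.pdparams")

# Load and run without any of the build code above.
infer_prog, feed_names, fetch_targets = fluid.io.load_inference_model(
    "./inference_model", exe,
    model_filename="model.pdmodel", params_filename="params.pdparams")
out, = exe.run(infer_prog,
               feed={feed_names[0]: np.ones([1, 4], dtype="float32")},
               fetch_list=fetch_targets)
```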
@@ -26,7 +26,6 @@ from inference_model import do_save_inference_model
from dgu.utils.configure import PDConfig

if __name__ == "__main__":
    args = PDConfig(yaml_file="./data/config/dgu.yaml")
......
@@ -66,7 +66,9 @@ def do_train(args):
        sent_ids = fluid.data(
            name='sent_ids', shape=[-1, args.max_seq_len], dtype='int64')
        input_mask = fluid.data(
-           name='input_mask', shape=[-1, args.max_seq_len], dtype='float32')
+           name='input_mask',
+           shape=[-1, args.max_seq_len],
+           dtype='float32')
        if args.task_name == 'atis_slot':
            labels = fluid.data(
                name='labels', shape=[-1, args.max_seq_len], dtype='int64')

@@ -74,13 +76,12 @@ def do_train(args):
            labels = fluid.data(
                name='labels', shape=[-1, num_labels], dtype='int64')
        else:
-           labels = fluid.data(
-               name='labels', shape=[-1, 1], dtype='int64')
+           labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')

        input_inst = [src_ids, pos_ids, sent_ids, input_mask, labels]
        input_field = InputField(input_inst)
-       data_reader = fluid.io.PyReader(feed_list=input_inst,
-                                       capacity=4, iterable=False)
+       data_reader = fluid.io.PyReader(
+           feed_list=input_inst, capacity=4, iterable=False)

        processor = processors[task_name](data_dir=args.data_dir,
                                          vocab_path=args.vocab_path,
                                          max_seq_len=args.max_seq_len,

@@ -113,9 +114,7 @@ def do_train(args):
        dev_count = int(os.environ.get('CPU_NUM', 1))

    batch_generator = processor.data_generator(
-       batch_size=args.batch_size,
-       phase='train',
-       shuffle=True)
+       batch_size=args.batch_size, phase='train', shuffle=True)

    num_train_examples = processor.get_num_examples(phase='train')
    if args.in_tokens:

@@ -217,37 +216,32 @@ def do_train(args):
                    current_time = time.strftime('%Y-%m-%d %H:%M:%S',
                                                 time.localtime(time.time()))
                    if accuracy is not None:
-                       print(
-                           "%s epoch: %d, step: %d, ave loss: %f, "
-                           "ave acc: %f, speed: %f steps/s" %
-                           (current_time, epoch_step, steps,
-                            np.mean(np_loss),
-                            np.mean(np_acc),
-                            args.print_steps / used_time))
+                       print("%s epoch: %d, step: %d, ave loss: %f, "
+                             "ave acc: %f, speed: %f steps/s" %
+                             (current_time, epoch_step, steps,
+                              np.mean(np_loss), np.mean(np_acc),
+                              args.print_steps / used_time))
                        ce_info.append([
-                           np.mean(np_loss),
-                           np.mean(np_acc),
+                           np.mean(np_loss), np.mean(np_acc),
                            args.print_steps / used_time
                        ])
                    else:
-                       print(
-                           "%s epoch: %d, step: %d, ave loss: %f, "
-                           "speed: %f steps/s" %
-                           (current_time, epoch_step, steps,
-                            np.mean(np_loss),
-                            args.print_steps / used_time))
-                       ce_info.append([
-                           np.mean(np_loss),
-                           args.print_steps / used_time
-                       ])
+                       print("%s epoch: %d, step: %d, ave loss: %f, "
+                             "speed: %f steps/s" %
+                             (current_time, epoch_step, steps,
+                              np.mean(np_loss), args.print_steps / used_time))
+                       ce_info.append(
+                           [np.mean(np_loss), args.print_steps / used_time])
                    time_begin = time.time()

                if steps % args.save_steps == 0:
                    save_path = "step_" + str(steps)
                    if args.save_checkpoint:
-                       save_load_io.save_checkpoint(args, exe, train_prog, save_path)
+                       save_load_io.save_checkpoint(args, exe, train_prog,
+                                                    save_path)
                    if args.save_param:
-                       save_load_io.save_param(args, exe, train_prog, save_path)
+                       save_load_io.save_param(args, exe, train_prog,
+                                               save_path)

        except fluid.core.EOFException:
            data_reader.reset()
......
@@ -19,8 +19,7 @@ from __future__ import print_function
import os
import sys
-sys.path.append("../")
+sys.path.append("../shared_modules/")

import paddle
import paddle.fluid as fluid
import numpy as np
......
@@ -11,7 +11,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Emotion Detection Task
"""

@@ -24,7 +23,7 @@ import os
import time
import multiprocessing
import sys
-sys.path.append("../")
+sys.path.append("../shared_modules/")

import paddle
import paddle.fluid as fluid
@@ -38,9 +37,7 @@ import reader
import utils


-def create_model(args,
-                num_labels,
-                is_prediction=False):
+def create_model(args, num_labels, is_prediction=False):
    """
    Create Model for Emotion Detection
    """

@@ -77,10 +74,17 @@ def create_model(args,
        raise ValueError("Unknown network type!")

    if is_prediction:
-       probs = network(data, seq_len, None, args.vocab_size, class_dim=num_labels, is_prediction=True)
+       probs = network(
+           data,
+           seq_len,
+           None,
+           args.vocab_size,
+           class_dim=num_labels,
+           is_prediction=True)
        return loader, probs, [data.name, seq_len.name]

-   avg_loss, probs = network(data, seq_len, label, args.vocab_size, class_dim=num_labels)
+   avg_loss, probs = network(
+       data, seq_len, label, args.vocab_size, class_dim=num_labels)
    num_seqs = fluid.layers.create_tensor(dtype='int64')
    accuracy = fluid.layers.accuracy(input=probs, label=label, total=num_seqs)
    return loader, avg_loss, accuracy, num_seqs
@@ -142,7 +146,8 @@ def main(args):
    exe = fluid.Executor(place)

    task_name = args.task_name.lower()
-   processor = reader.EmoTectProcessor(data_dir=args.data_dir,
+   processor = reader.EmoTectProcessor(
+       data_dir=args.data_dir,
        vocab_path=args.vocab_path,
        random_seed=args.random_seed)
    #num_labels = len(processor.get_labels())

@@ -173,9 +178,7 @@ def main(args):
        with fluid.program_guard(train_program, startup_prog):
            with fluid.unique_name.guard():
                train_loader, loss, accuracy, num_seqs = create_model(
-                   args,
-                   num_labels=num_labels,
-                   is_prediction=False)
+                   args, num_labels=num_labels, is_prediction=False)
                sgd_optimizer = fluid.optimizer.Adagrad(learning_rate=args.lr)
                sgd_optimizer.minimize(loss)
@@ -189,37 +192,27 @@ def main(args):
    if args.do_val:
        if args.do_train:
            test_data_generator = processor.data_generator(
-               batch_size=args.batch_size,
-               phase='dev',
-               epoch=1)
+               batch_size=args.batch_size, phase='dev', epoch=1)
        else:
            test_data_generator = processor.data_generator(
-               batch_size=args.batch_size,
-               phase='test',
-               epoch=1)
+               batch_size=args.batch_size, phase='test', epoch=1)

        test_prog = fluid.Program()
        with fluid.program_guard(test_prog, startup_prog):
            with fluid.unique_name.guard():
                test_loader, loss, accuracy, num_seqs = create_model(
-                   args,
-                   num_labels=num_labels,
-                   is_prediction=False)
+                   args, num_labels=num_labels, is_prediction=False)
        test_prog = test_prog.clone(for_test=True)

    if args.do_infer:
        infer_data_generator = processor.data_generator(
-           batch_size=args.batch_size,
-           phase='infer',
-           epoch=1)
+           batch_size=args.batch_size, phase='infer', epoch=1)

        test_prog = fluid.Program()
        with fluid.program_guard(test_prog, startup_prog):
            with fluid.unique_name.guard():
                infer_loader, probs, _ = create_model(
-                   args,
-                   num_labels=num_labels,
-                   is_prediction=True)
+                   args, num_labels=num_labels, is_prediction=True)
        test_prog = test_prog.clone(for_test=True)

    exe.run(startup_prog)
@@ -292,8 +285,9 @@ def main(args):
                    time_begin = time.time()

                if steps % args.save_steps == 0:
-                   save_path = os.path.join(args.save_checkpoint_dir, "step_" + str(steps))
-                   fluid.io.save_persistables(exe, save_path, train_program)
+                   save_path = os.path.join(args.save_checkpoint_dir,
+                                            "step_" + str(steps))
+                   fluid.save(train_program, save_path)

                if steps % args.validation_steps == 0:
                    # evaluate on dev set

@@ -306,11 +300,11 @@ def main(args):
                print("final step: %d " % steps)
                if args.do_val:
                    evaluate(test_exe, test_prog, test_loader,
-                            [loss.name, accuracy.name, num_seqs.name],
-                            "dev")
+                            [loss.name, accuracy.name, num_seqs.name], "dev")

-               save_path = os.path.join(args.save_checkpoint_dir, "step_" + str(steps))
-               fluid.io.save_persistables(exe, save_path, train_program)
+               save_path = os.path.join(args.save_checkpoint_dir,
+                                        "step_" + str(steps))
+               fluid.save(train_program, save_path)
                train_loader.reset()
                break

@@ -334,15 +328,12 @@ def main(args):
    if not args.do_train and args.do_val:
        print("Final test result:")
        evaluate(test_exe, test_prog, test_loader,
-                [loss.name, accuracy.name, num_seqs.name],
-                "test")
+                [loss.name, accuracy.name, num_seqs.name], "test")

    # infer
    if args.do_infer:
        print("Final infer result:")
-       infer(test_exe, test_prog, infer_loader,
-             [probs.name],
-             "infer")
+       infer(test_exe, test_prog, infer_loader, [probs.name], "infer")


def get_cards():
......
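The substantive change buried in these hunks is the checkpointing API: `fluid.io.save_persistables(exe, dirname, program)` wrote one file per persistable variable into a directory, while `fluid.save(program, path)` writes a `path.pdparams`/`path.pdopt` pair covering model parameters and optimizer state. A sketch of the two sides of the migration (the helper names are mine, not the repo's):

```python
import paddle.fluid as fluid

def save_checkpoint(train_program, exe, save_path, legacy=False):
    """Persist training state; `legacy` shows the pre-migration style."""
    if legacy:
        # Old style: a directory of per-variable files.
        fluid.io.save_persistables(exe, save_path, train_program)
    else:
        # New style: save_path.pdparams + save_path.pdopt.
        fluid.save(train_program, save_path)

def load_checkpoint(train_program, exe, save_path):
    # Counterpart used by the init_checkpoint rewrites later in this commit.
    fluid.load(train_program, save_path, exe)
```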
@@ -11,7 +11,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Emotion Detection Task, based on ERNIE
"""

@@ -25,7 +24,7 @@ import time
import argparse
import multiprocessing
import sys
-sys.path.append("../")
+sys.path.append("../shared_modules/")

import paddle
import paddle.fluid as fluid

@@ -350,7 +349,7 @@ def main(args):
                if steps % args.save_steps == 0:
                    save_path = os.path.join(args.save_checkpoint_dir, "step_" + str(steps))
-                   fluid.io.save_persistables(exe, save_path, train_program)
+                   fluid.save(train_program, save_path)

                if steps % args.validation_steps == 0:
                    # evaluate dev set

@@ -369,7 +368,7 @@ def main(args):
        except fluid.core.EOFException:
            save_path = os.path.join(args.save_checkpoint_dir, "step_" + str(steps))
-           fluid.io.save_persistables(exe, save_path, train_program)
+           fluid.save(train_program, save_path)
            train_pyreader.reset()
            break
......
@@ -11,7 +11,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
EmoTect utilities.
"""

@@ -29,27 +28,13 @@ import paddle
import paddle.fluid as fluid
import numpy as np


def init_checkpoint(exe, init_checkpoint_path, main_program):
    """
    Init CheckPoint
    """
-   assert os.path.exists(
-       init_checkpoint_path), "[%s] cann't be found." % init_checkpoint_path
-
-   def existed_persitables(var):
-       """
-       If existed presitabels
-       """
-       if not fluid.io.is_persistable(var):
-           return False
-       return os.path.exists(os.path.join(init_checkpoint_path, var.name))
-
-   fluid.io.load_vars(
-       exe,
-       init_checkpoint_path,
-       main_program=main_program,
-       predicate=existed_persitables)
-   print("Load model from {}".format(init_checkpoint_path))
+   fluid.load(main_program, init_checkpoint_path, exe)


def word2id(word_dict, query):

@@ -57,8 +42,10 @@ def word2id(word_dict, query):
    Convert word sequence into id list
    """
    unk_id = len(word_dict)
-   wids = [word_dict[w] if w in word_dict else unk_id
-           for w in query.strip().split(" ")]
+   wids = [
+       word_dict[w] if w in word_dict else unk_id
+       for w in query.strip().split(" ")
+   ]
    return wids
......
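A small detail worth noting in `word2id`: out-of-vocabulary words map to `len(word_dict)`, one past the largest known id, so whatever consumes these ids has to budget one extra embedding row for unknowns. A toy illustration (hypothetical dictionary):

```python
word_dict = {"i": 0, "love": 1, "this": 2}
unk_id = len(word_dict)  # 3: the reserved unknown-word id

def word2id(word_dict, query):
    return [word_dict.get(w, unk_id) for w in query.strip().split(" ")]

print(word2id(word_dict, "i love this movie"))  # -> [0, 1, 2, 3]
```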
@@ -5,7 +5,7 @@
## 1. Task Description

This section introduces an LSTM-based language model: given an input word sequence (word-segmented Chinese or tokenized English), it computes the perplexity (PPL), a measure of how fluent a sentence is; see [the paper](https://arxiv.org/abs/1409.2329) for an introduction to RNN language models. Compared with traditional approaches, RNN-based models handle rare words better.

-**The language model currently requires PaddlePaddle 1.6 or later, or a suitable develop build.**
+**The language model currently requires PaddlePaddle 1.7 or later, or a suitable develop build.**

Users are also encouraged to consult the [IPython Notebook demo](https://aistudio.baidu.com/aistudio/projectDetail/122290).
......
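For reference, the perplexity this README refers to is the exponentiated average negative log-likelihood over a test sequence of N words, so lower means more fluent:

```latex
\mathrm{PPL} = \exp\left(-\frac{1}{N}\sum_{i=1}^{N}\log p\,(w_i \mid w_1,\dots,w_{i-1})\right)
```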
@@ -36,7 +36,7 @@ import sys
if sys.version[0] == '2':
    reload(sys)
    sys.setdefaultencoding("utf-8")
-sys.path.append('../')
+sys.path.append('../shared_modules/')
import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

@@ -60,7 +60,7 @@ def profile_context(profile=True, profiler_path='/tmp/paddingrnn.profile'):

def get_current_model_para(train_prog, train_exe):
-   param_list = train_prog.block(0).all_parameters()
+   param_list = train_prog.all_parameters()
    param_name_list = [p.name for p in param_list]

    vals = {}

@@ -73,7 +73,7 @@ def get_current_model_para(train_prog, train_exe):

def save_para_npz(train_prog, train_exe):
    print("begin to save model to model_base")
-   param_list = train_prog.block(0).all_parameters()
+   param_list = train_prog.all_parameters()
    param_name_list = [p.name for p in param_list]

    vals = {}
......
@@ -16,7 +16,7 @@ Lexical Analysis of Chinese (LAC) is a joint lexical analysis model
#### 1. PaddlePaddle Installation

-This project requires PaddlePaddle 1.6.0 or later and PaddleHub 1.0.0 or later. For PaddlePaddle installation see the official [quick install guide](http://www.paddlepaddle.org/paddle#quick-start); for PaddleHub see [PaddleHub](https://github.com/PaddlePaddle/PaddleHub).
+This project requires PaddlePaddle 1.7 or later and PaddleHub 1.0.0 or later. For PaddlePaddle installation see the official [quick install guide](http://www.paddlepaddle.org/paddle#quick-start); for PaddleHub see [PaddleHub](https://github.com/PaddlePaddle/PaddleHub).

> Warning: the GPU and CPU builds of PaddlePaddle are paddlepaddle-gpu and paddlepaddle respectively; take care to install the right one.
......
@@ -26,7 +26,7 @@ from paddle.fluid.initializer import NormalInitializer
from reader import Dataset
from ernie_reader import SequenceLabelReader

-sys.path.append("..")
+sys.path.append("../shared_modules/")
from models.sequence_labeling import nets
from models.representation.ernie import ernie_encoder, ernie_pyreader

@@ -35,9 +35,10 @@ def create_model(args, vocab_size, num_labels, mode='train'):
    """create lac model"""

    # model's input data
-   words = fluid.data(name='words', shape=[-1, 1], dtype='int64', lod_level=1)
+   words = fluid.data(
+       name='words', shape=[None, 1], dtype='int64', lod_level=1)
    targets = fluid.data(
-       name='targets', shape=[-1, 1], dtype='int64', lod_level=1)
+       name='targets', shape=[None, 1], dtype='int64', lod_level=1)

    # for inference process
    if mode == 'infer':

@@ -88,9 +89,11 @@ def create_pyreader(args,
                    return_reader=False,
                    mode='train'):
    # init reader
+   device_count = len(fluid.cuda_places()) if args.use_cuda else len(
+       fluid.cpu_places())

    if model == 'lac':
-       pyreader = fluid.io.PyReader(
+       pyreader = fluid.io.DataLoader.from_generator(
            feed_list=feed_list,
            capacity=50,
            use_double_buffer=True,

@@ -101,19 +104,19 @@ def create_pyreader(args,
        # create lac pyreader
        if mode == 'train':
-           pyreader.decorate_sample_list_generator(
+           pyreader.set_sample_list_generator(
                fluid.io.batch(
                    fluid.io.shuffle(
                        reader.file_reader(file_name),
                        buf_size=args.traindata_shuffle_buffer),
-                   batch_size=args.batch_size),
+                   batch_size=args.batch_size / device_count),
                places=place)
        else:
-           pyreader.decorate_sample_list_generator(
+           pyreader.set_sample_list_generator(
                fluid.io.batch(
                    reader.file_reader(
                        file_name, mode=mode),
-                   batch_size=args.batch_size),
+                   batch_size=args.batch_size / device_count),
                places=place)

    elif model == 'ernie':

@@ -162,19 +165,19 @@ def create_ernie_model(args, ernie_config):

    # ERNIE's input data
    src_ids = fluid.data(
-       name='src_ids', shape=[-1, args.max_seq_len, 1], dtype='int64')
+       name='src_ids', shape=[None, args.max_seq_len, 1], dtype='int64')
    sent_ids = fluid.data(
-       name='sent_ids', shape=[-1, args.max_seq_len, 1], dtype='int64')
+       name='sent_ids', shape=[None, args.max_seq_len, 1], dtype='int64')
    pos_ids = fluid.data(
-       name='pos_ids', shape=[-1, args.max_seq_len, 1], dtype='int64')
+       name='pos_ids', shape=[None, args.max_seq_len, 1], dtype='int64')
    input_mask = fluid.data(
-       name='input_mask', shape=[-1, args.max_seq_len, 1], dtype='float32')
+       name='input_mask', shape=[None, args.max_seq_len, 1], dtype='float32')
    padded_labels = fluid.data(
-       name='padded_labels', shape=[-1, args.max_seq_len, 1], dtype='int64')
+       name='padded_labels', shape=[None, args.max_seq_len, 1], dtype='int64')
    seq_lens = fluid.data(
-       name='seq_lens', shape=[-1], dtype='int64', lod_level=0)
+       name='seq_lens', shape=[None], dtype='int64', lod_level=0)

    squeeze_labels = fluid.layers.squeeze(padded_labels, axes=[-1])
......
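The reader change above is the stock PyReader-to-DataLoader migration: `fluid.io.DataLoader.from_generator` keeps the same feed list and capacity, `decorate_sample_list_generator` becomes `set_sample_list_generator`, and the newly computed `device_count` turns the global batch size into a per-device one. A minimal sketch of the new-style wiring (toy program and generator, assuming PaddlePaddle 1.7's static-graph API):

```python
import numpy as np
import paddle.fluid as fluid

x = fluid.data(name="x", shape=[None, 2], dtype="int64")
y = fluid.layers.reduce_sum(x)

loader = fluid.io.DataLoader.from_generator(
    feed_list=[x], capacity=50, use_double_buffer=True, iterable=True)

def sample_reader():
    for i in range(8):
        yield [np.array([i, i + 1], dtype="int64")]  # one sample per yield

# Batching happens outside the loader, which is why a multi-device setup
# can divide the global batch size by the device count, as the diff does.
loader.set_sample_list_generator(
    fluid.io.batch(sample_reader, batch_size=4), places=fluid.cpu_places(1))

exe = fluid.Executor(fluid.CPUPlace())
exe.run(fluid.default_startup_program())
for feed in loader():  # iterable mode replaces start()/reset() + EOFException
    print(exe.run(feed=feed, fetch_list=[y]))
```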
@@ -20,7 +20,7 @@ import sys
from collections import namedtuple

import numpy as np

-sys.path.append("..")
+sys.path.append("../shared_modules/")
from preprocess.ernie.task_reader import BaseReader, tokenization
......
@@ -24,7 +24,7 @@ import paddle
import utils
import reader
import creator
-sys.path.append('../models/')
+sys.path.append('../shared_modules/models/')
from model_check import check_cuda
from model_check import check_version
......
@@ -10,7 +10,7 @@ import paddle.fluid as fluid
import creator
import reader
import utils
-sys.path.append('../models/')
+sys.path.append('../shared_modules/models/')
from model_check import check_cuda
from model_check import check_version
......
@@ -24,7 +24,7 @@ import paddle
import utils
import reader
import creator
-sys.path.append('../models/')
+sys.path.append('../shared_modules/models/')
from model_check import check_cuda
from model_check import check_version
......
@@ -34,7 +34,7 @@ import paddle.fluid as fluid
import creator
import utils
-sys.path.append("..")
+sys.path.append("../shared_modules/")
from models.representation.ernie import ErnieConfig
from models.model_check import check_cuda
from models.model_check import check_version

@@ -188,15 +188,16 @@ def do_train(args):
            if steps % args.save_steps == 0:
                save_path = os.path.join(args.model_save_dir,
-                                        "step_" + str(steps))
+                                        "step_" + str(steps), "checkpoint")
                print("\tsaving model as %s" % (save_path))
-               fluid.io.save_persistables(exe, save_path, train_program)
+               fluid.save(train_program, save_path)

            if steps % args.validation_steps == 0:
                evaluate(exe, test_program, test_pyreader, train_ret)

-   save_path = os.path.join(args.model_save_dir, "step_" + str(steps))
-   fluid.io.save_persistables(exe, save_path, train_program)
+   save_path = os.path.join(args.model_save_dir, "step_" + str(steps),
+                            "checkpoint")
+   fluid.save(train_program, save_path)


def do_eval(args):
......
@@ -29,7 +29,7 @@ import reader
import utils
import creator
from eval import test_process
-sys.path.append('../models/')
+sys.path.append('../shared_modules/models/')
from model_check import check_cuda
from model_check import check_version

@@ -151,8 +151,8 @@ def do_train(args):
            # save checkpoints
            if step % args.save_steps == 0 and step != 0:
                save_path = os.path.join(args.model_save_dir,
-                                        "step_" + str(step))
-               fluid.io.save_persistables(exe, save_path, train_program)
+                                        "step_" + str(step), "checkpoint")
+               fluid.save(train_program, save_path)
            step += 1

    if args.enable_ce:
......
@@ -200,19 +200,11 @@ def init_checkpoint(exe, init_checkpoint_path, main_program):
    assert os.path.exists(
        init_checkpoint_path), "[%s] cann't be found." % init_checkpoint_path

-   def existed_persitables(var):
-       """
-       If existed presitabels
-       """
-       if not fluid.io.is_persistable(var):
-           return False
-       return os.path.exists(os.path.join(init_checkpoint_path, var.name))
-
-   fluid.io.load_vars(
-       exe,
-       init_checkpoint_path,
-       main_program=main_program,
-       predicate=existed_persitables)
+   try:
+       checkpoint_path = os.path.join(init_checkpoint_path, "checkpoint")
+       fluid.load(main_program, checkpoint_path, exe)
+   except:
+       fluid.load(main_program, init_checkpoint_path, exe)
    print("Load model from {}".format(init_checkpoint_path))

@@ -224,15 +216,6 @@ def init_pretraining_params(exe,
    assert os.path.exists(pretraining_params_path
                          ), "[%s] cann't be found." % pretraining_params_path

-   def _existed_params(var):
-       if not isinstance(var, fluid.framework.Parameter):
-           return False
-       return os.path.exists(os.path.join(pretraining_params_path, var.name))
-
-   fluid.io.load_vars(
-       exe,
-       pretraining_params_path,
-       main_program=main_program,
-       predicate=_existed_params)
+   fluid.load(main_program, pretraining_params_path, exe)
    print("Load pretraining parameters from {}.".format(
        pretraining_params_path))
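The rewritten `init_checkpoint` first assumes the new layout, in which `do_train` saves under a `checkpoint` prefix inside each step directory (yielding `<dir>/checkpoint.pdparams` and `<dir>/checkpoint.pdopt`), and falls back to treating the directory itself as the `fluid.save` prefix. A sketch of the same resolution with the bare `except:` narrowed to an explicit existence check (hypothetical standalone helper):

```python
import os
import paddle.fluid as fluid

def load_checkpoint(main_program, exe, init_checkpoint_path):
    assert os.path.exists(
        init_checkpoint_path), "[%s] can't be found." % init_checkpoint_path
    # New layout written by do_train: <dir>/checkpoint.pdparams/.pdopt.
    candidate = os.path.join(init_checkpoint_path, "checkpoint")
    if os.path.exists(candidate + ".pdparams"):
        fluid.load(main_program, candidate, exe)
    else:
        # Fall back to the path itself as the fluid.save prefix.
        fluid.load(main_program, init_checkpoint_path, exe)
    print("Load model from {}".format(init_checkpoint_path))
```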
@@ -39,4 +39,3 @@ D-NET is a "pretraining…" framework aimed at improving the **generalization ability of reading comprehension models**
- In the fine-tuning stage, multi-task and multi-domain learning strategies (built on the [PALM](https://github.com/PaddlePaddle/PALM) multi-task learning framework) effectively improve the model's generalization across domains.

Using the D-NET framework, Baidu won the EMNLP 2019 [MRQA](https://mrqa.github.io/shared) machine reading comprehension evaluation by nearly two percentage points over the runner-up, ranking first on 10 of the 12 test sets.
@@ -106,7 +106,7 @@ python -u main.py \
    --prepostprocess_dropout 0.3
```

-Training uses all available GPUs by default; set the `CUDA_VISIBLE_DEVICES` environment variable to control how many GPUs are used. CPU-only training is also possible (pass `--use_cuda False`), though comparatively slow. If `save_param` and `save_checkpoint` are given (defaults: trained_params and trained_ckpts), the current parameter values and a checkpoint are saved to the corresponding directories every `save_step` iterations (default 10000); every `print_step` iterations (default 100) a log like the following is printed to standard output:
+Training uses all available GPUs by default; set the `CUDA_VISIBLE_DEVICES` environment variable to control how many GPUs are used. CPU-only training is also possible (pass `--use_cuda False`), though comparatively slow. If `save_model_path` is given (default: saved_models), a checkpoint of the current training state is saved to that directory every `save_step` iterations (default 10000) as two files, `transformer.pdparams` and `transformer.pdopt`, holding the model parameters and the optimizer state respectively; every `print_step` iterations (default 100) a log like the following is printed to standard output:

```txt
[2019-08-02 15:30:51,656 INFO train.py:262] step_idx: 150100, epoch: 32, batch: 1364, avg loss: 2.880427, normalized loss: 1.504687, ppl: 17.821888, speed: 3.34 step/s

@@ -195,7 +195,7 @@ BLEU = 26.35, 57.7/32.1/20.0/13.0 (BP=1.000, ratio=1.013, hyp_len=63903, ref_len
### Pretrained Models

-We provide the parameters of the [base model](https://transformer-res.bj.bcebos.com/base_model_params.tar.gz) and [big model](https://transformer-res.bj.bcebos.com/big_model_params.tar.gz) that achieve the BLEU scores above for download (note that the models were trained and tested on the downloadable data).
+We provide the parameters of the [base model](https://transformer-res.bj.bcebos.com/base_model_graph.tar.gz) and [big model](https://transformer-res.bj.bcebos.com/big_model_graph.tar.gz) that achieve the BLEU scores above for download (note that the models were trained and tested on the downloadable data).

## Advanced Usage
......
@@ -12,6 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.


def get_input_descs(args):
    """
    Generate a dict mapping data fields to the corresponding data shapes and

@@ -42,7 +43,8 @@ def get_input_descs(args):
        # encoder.
        # The actual data shape of src_slf_attn_bias is:
        # [batch_size, n_head, max_src_len_in_batch, max_src_len_in_batch]
-       "src_slf_attn_bias": [(batch_size, n_head, seq_len, seq_len), "float32"],
+       "src_slf_attn_bias":
+       [(batch_size, n_head, seq_len, seq_len), "float32"],
        # The actual data shape of trg_word is:
        # [batch_size, max_trg_len_in_batch, 1]
        "trg_word": [(batch_size, seq_len), "int64",

@@ -54,12 +56,14 @@ def get_input_descs(args):
        # subsequent words in the decoder.
        # The actual data shape of trg_slf_attn_bias is:
        # [batch_size, n_head, max_trg_len_in_batch, max_trg_len_in_batch]
-       "trg_slf_attn_bias": [(batch_size, n_head, seq_len, seq_len), "float32"],
+       "trg_slf_attn_bias":
+       [(batch_size, n_head, seq_len, seq_len), "float32"],
        # This input is used to remove attention weights on paddings of the source
        # input in the encoder-decoder attention.
        # The actual data shape of trg_src_attn_bias is:
        # [batch_size, n_head, max_trg_len_in_batch, max_src_len_in_batch]
-       "trg_src_attn_bias": [(batch_size, n_head, seq_len, seq_len), "float32"],
+       "trg_src_attn_bias":
+       [(batch_size, n_head, seq_len, seq_len), "float32"],
        # This input is used in independent decoder program for inference.
        # The actual data shape of enc_output is:
        # [batch_size, max_src_len_in_batch, d_model]

@@ -80,6 +84,7 @@ def get_input_descs(args):
    return input_descs


# Names of word embedding table which might be reused for weight sharing.
word_emb_param_names = (
    "src_word_emb_table",
......
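Each `input_descs` entry pairs a shape tuple with a dtype, plus an optional LoD level, which is exactly the signature `fluid.data` wants; the model code can therefore materialize its feed variables by iterating the dict. A sketch of that consumption pattern (the two fields, their shapes, and the loop are illustrative, not the repo's exact dict or code):

```python
import paddle.fluid as fluid

# A hypothetical slice of what get_input_descs(args) returns;
# None stands for the variable batch/sequence dimensions.
input_descs = {
    "src_word": [(None, None), "int64"],
    "src_slf_attn_bias": [(None, 8, None, None), "float32"],
}

feed_vars = {
    name: fluid.data(
        name=name,
        shape=desc[0],
        dtype=desc[1],
        lod_level=desc[2] if len(desc) > 2 else 0)
    for name, desc in input_descs.items()
}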
@@ -24,6 +24,7 @@ import paddle.fluid as fluid
from utils.input_field import InputField
from utils.configure import PDConfig
+from utils.load import load

# include task-specific libs
import desc

@@ -31,51 +32,6 @@ import reader
from transformer import create_net

-def init_from_pretrain_model(args, exe, program):
-   assert isinstance(args.init_from_pretrain_model, str)
-   if not os.path.exists(args.init_from_pretrain_model):
-       raise Warning("The pretrained params do not exist.")
-       return False
-
-   def existed_params(var):
-       if not isinstance(var, fluid.framework.Parameter):
-           return False
-       return os.path.exists(
-           os.path.join(args.init_from_pretrain_model, var.name))
-
-   fluid.io.load_vars(
-       exe,
-       args.init_from_pretrain_model,
-       main_program=program,
-       predicate=existed_params)
-   print("finish initing model from pretrained params from %s" %
-         (args.init_from_pretrain_model))
-   return True
-
-
-def init_from_params(args, exe, program):
-   assert isinstance(args.init_from_params, str)
-   if not os.path.exists(args.init_from_params):
-       raise Warning("the params path does not exist.")
-       return False
-
-   fluid.io.load_params(
-       executor=exe,
-       dirname=args.init_from_params,
-       main_program=program,
-       filename="params.pdparams")
-   print("finish init model from params from %s" % (args.init_from_params))
-   return True
-

def do_save_inference_model(args):
    if args.use_cuda:
        dev_count = fluid.core.get_cuda_device_count()

@@ -84,6 +40,11 @@ def do_save_inference_model(args):
        dev_count = int(os.environ.get('CPU_NUM', 1))
        place = fluid.CPUPlace()

+   src_vocab = reader.DataProcessor.load_dict(args.src_vocab_fpath)
+   trg_vocab = reader.DataProcessor.load_dict(args.trg_vocab_fpath)
+   args.src_vocab_size = len(src_vocab)
+   args.trg_vocab_size = len(trg_vocab)

    test_prog = fluid.default_main_program()
    startup_prog = fluid.default_startup_program()

@@ -119,13 +80,10 @@ def do_save_inference_model(args):
    exe = fluid.Executor(place)
    exe.run(startup_prog)

-   assert (args.init_from_params) or (args.init_from_pretrain_model)
-
-   if args.init_from_params:
-       init_from_params(args, exe, test_prog)
-
-   elif args.init_from_pretrain_model:
-       init_from_pretrain_model(args, exe, test_prog)
+   assert (
+       args.init_from_params), "must set init_from_params to load parameters"
+   load(test_prog, os.path.join(args.init_from_params, "transformer"), exe)
+   print("finish initing model from params from %s" % (args.init_from_params))

    # saving inference model
......
@@ -25,7 +25,6 @@ from train import do_train
from predict import do_predict
from inference_model import do_save_inference_model

if __name__ == "__main__":
    LOG_FORMAT = "[%(asctime)s %(levelname)s %(filename)s:%(lineno)d] %(message)s"
    logging.basicConfig(
......
@@ -25,6 +25,7 @@ import paddle.fluid as fluid
from utils.input_field import InputField
from utils.configure import PDConfig
from utils.check import check_gpu, check_version
+from utils.load import load

# include task-specific libs
import desc

@@ -32,51 +33,6 @@ import reader
from transformer import create_net, position_encoding_init

-def init_from_pretrain_model(args, exe, program):
-   assert isinstance(args.init_from_pretrain_model, str)
-   if not os.path.exists(args.init_from_pretrain_model):
-       raise Warning("The pretrained params do not exist.")
-       return False
-
-   def existed_params(var):
-       if not isinstance(var, fluid.framework.Parameter):
-           return False
-       return os.path.exists(
-           os.path.join(args.init_from_pretrain_model, var.name))
-
-   fluid.io.load_vars(
-       exe,
-       args.init_from_pretrain_model,
-       main_program=program,
-       predicate=existed_params)
-   print("finish initing model from pretrained params from %s" %
-         (args.init_from_pretrain_model))
-   return True
-
-
-def init_from_params(args, exe, program):
-   assert isinstance(args.init_from_params, str)
-   if not os.path.exists(args.init_from_params):
-       raise Warning("the params path does not exist.")
-       return False
-
-   fluid.io.load_params(
-       executor=exe,
-       dirname=args.init_from_params,
-       main_program=program,
-       filename="params.pdparams")
-   print("finish init model from params from %s" % (args.init_from_params))
-   return True
-

def post_process_seq(seq, bos_idx, eos_idx, output_bos=False, output_eos=False):
    """
    Post-process the beam-search decoded sequence. Truncate from the first

@@ -160,13 +116,10 @@ def do_predict(args):
    exe = fluid.Executor(place)
    exe.run(startup_prog)

-   assert (args.init_from_params) or (args.init_from_pretrain_model)
-
-   if args.init_from_params:
-       init_from_params(args, exe, test_prog)
-
-   elif args.init_from_pretrain_model:
-       init_from_pretrain_model(args, exe, test_prog)
+   assert (
+       args.init_from_params), "must set init_from_params to load parameters"
+   load(test_prog, os.path.join(args.init_from_params, "transformer"), exe)
+   print("finish initing model from params from %s" % (args.init_from_params))

    # to avoid a longer length than training, reset the size of position encoding to max_length
    for pos_enc_param_name in desc.pos_enc_param_names:
......
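For reference, the helpers deleted above implement the older per-variable loading scheme. The pattern below restates it, directly mirroring the removed `existed_params` predicate, for anyone who still needs to read a directory of per-variable files:

```python
import os
import paddle.fluid as fluid

# The pre-unified-API loader this diff removes: pick parameters one by one
# from a directory of per-variable files, skipping anything not on disk.
def init_from_pretrain_dir(exe, program, pretrain_dir):
    def existed_params(var):
        if not isinstance(var, fluid.framework.Parameter):
            return False
        return os.path.exists(os.path.join(pretrain_dir, var.name))

    fluid.io.load_vars(
        exe, pretrain_dir, main_program=program, predicate=existed_params)
```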
@@ -27,6 +27,7 @@ import utils.dist_utils as dist_utils
 from utils.input_field import InputField
 from utils.configure import PDConfig
 from utils.check import check_gpu, check_version
+from utils.load import load
 # include task-specific libs
 import desc
@@ -39,91 +40,6 @@ if os.environ.get('FLAGS_eager_delete_tensor_gb', None) is None:
 num_trainers = int(os.environ.get('PADDLE_TRAINERS_NUM', 1))

-def init_from_pretrain_model(args, exe, program):
-    assert isinstance(args.init_from_pretrain_model, str)
-    if not os.path.exists(args.init_from_pretrain_model):
-        raise Warning("The pretrained params do not exist.")
-        return False
-
-    def existed_params(var):
-        if not isinstance(var, fluid.framework.Parameter):
-            return False
-        return os.path.exists(
-            os.path.join(args.init_from_pretrain_model, var.name))
-
-    fluid.io.load_vars(
-        exe,
-        args.init_from_pretrain_model,
-        main_program=program,
-        predicate=existed_params)
-    print("finish initing model from pretrained params from %s" %
-          (args.init_from_pretrain_model))
-    return True
-
-def init_from_checkpoint(args, exe, program):
-    assert isinstance(args.init_from_checkpoint, str)
-    if not os.path.exists(args.init_from_checkpoint):
-        raise Warning("the checkpoint path does not exist.")
-        return False
-
-    fluid.io.load_persistables(
-        executor=exe,
-        dirname=args.init_from_checkpoint,
-        main_program=program,
-        filename="checkpoint.pdckpt")
-    print("finish initing model from checkpoint from %s" %
-          (args.init_from_checkpoint))
-    return True
-
-def save_checkpoint(args, exe, program, dirname):
-    assert isinstance(args.save_model_path, str)
-    checkpoint_dir = os.path.join(args.save_model_path, args.save_checkpoint)
-    if not os.path.exists(checkpoint_dir):
-        os.mkdir(checkpoint_dir)
-    fluid.io.save_persistables(
-        exe,
-        os.path.join(checkpoint_dir, dirname),
-        main_program=program,
-        filename="checkpoint.pdparams")
-    print("save checkpoint at %s" % (os.path.join(checkpoint_dir, dirname)))
-    return True
-
-def save_param(args, exe, program, dirname):
-    assert isinstance(args.save_model_path, str)
-    param_dir = os.path.join(args.save_model_path, args.save_param)
-    if not os.path.exists(param_dir):
-        os.mkdir(param_dir)
-    fluid.io.save_params(
-        exe,
-        os.path.join(param_dir, dirname),
-        main_program=program,
-        filename="params.pdparams")
-    print("save parameters at %s" % (os.path.join(param_dir, dirname)))
-    return True
-
 def do_train(args):
     if args.use_cuda:
         if num_trainers > 1:  # for multi-process gpu training
@@ -226,11 +142,17 @@ def do_train(args):
     ## init from some checkpoint, to resume the previous training
     if args.init_from_checkpoint:
-        init_from_checkpoint(args, exe, train_prog)
+        load(train_prog,
+             os.path.join(args.init_from_checkpoint, "transformer"), exe)
+        print("finish initing model from checkpoint from %s" %
+              (args.init_from_checkpoint))

     ## init from some pretrain models, to better solve the current task
     if args.init_from_pretrain_model:
-        init_from_pretrain_model(args, exe, train_prog)
+        load(train_prog,
+             os.path.join(args.init_from_pretrain_model, "transformer"), exe)
+        print("finish initing model from pretrained params from %s" %
+              (args.init_from_pretrain_model))

     build_strategy = fluid.compiler.BuildStrategy()
     build_strategy.enable_inplace = True
@@ -293,14 +215,11 @@ def do_train(args):
                     avg_batch_time = time.time()

                 if step_idx % args.save_step == 0 and step_idx != 0:
-                    if args.save_checkpoint:
-                        save_checkpoint(args, exe, train_prog,
-                                        "step_" + str(step_idx))
-
-                    if args.save_param:
-                        save_param(args, exe, train_prog,
-                                   "step_" + str(step_idx))
+                    if args.save_model_path:
+                        model_path = os.path.join(args.save_model_path,
+                                                  "step_" + str(step_idx),
+                                                  "transformer")
+                        fluid.save(train_prog, model_path)

                 batch_id += 1
                 step_idx += 1
@@ -319,11 +238,10 @@ def do_train(args):
         time_consumed = time.time() - pass_start_time

-    if args.save_checkpoint:
-        save_checkpoint(args, exe, train_prog, "step_final")
-
-    if args.save_param:
-        save_param(args, exe, train_prog, "step_final")
+    if args.save_model_path:
+        model_path = os.path.join(args.save_model_path, "step_final",
+                                  "transformer")
+        fluid.save(train_prog, model_path)

     if args.enable_ce:  # For CE
         print("kpis\ttrain_cost_card%d\t%f" % (dev_count, total_avg_cost))
...
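The training loop above now funnels both the periodic and the final save through one `fluid.save` call keyed on `save_model_path`. A condensed sketch of that pattern; argument names follow the diff, the surrounding loop is elided:

```python
import os
import paddle.fluid as fluid

def maybe_save(train_prog, save_model_path, step_idx, save_step):
    # Mirrors the new branch above: one directory per step, with a fixed
    # "transformer" prefix for the .pdparams/.pdopt/.pdmodel triple.
    if save_model_path and step_idx % save_step == 0 and step_idx != 0:
        model_path = os.path.join(save_model_path,
                                  "step_" + str(step_idx), "transformer")
        fluid.save(train_prog, model_path)
```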
@@ -17,6 +17,7 @@ import numpy as np
 import paddle.fluid as fluid
 import paddle.fluid.layers as layers
+from paddle.fluid.layers.utils import map_structure

 from desc import *
@@ -90,7 +91,6 @@ def multi_head_attention(queries,
                          n_head=1,
                          dropout_rate=0.,
                          cache=None,
-                         gather_idx=None,
                          static_kv=False):
     """
     Multi-Head Attention. Note that attn_bias is added to the logit before
@@ -161,30 +161,28 @@ def multi_head_attention(queries,
         v = transpose_layer(x=reshaped_v, perm=[0, 2, 1, 3])

         if cache is not None:  # only for faster inference
+            cache_, i = cache
             if static_kv:  # For encoder-decoder attention in inference
-                cache_k, cache_v = cache["static_k"], cache["static_v"]
-                # To init the static_k and static_v in cache.
-                # Maybe we can use condition_op(if_else) to do these at the first
-                # step in while loop to replace these, however it might be less
-                # efficient.
+                cache_k, cache_v = cache_["static_k"], cache_["static_v"]
+                # To init the static_k and static_v in global block.
                 static_cache_init = wrap_layer_with_block(
                     layers.assign,
                     fluid.default_main_program().current_block().parent_idx)
-                static_cache_init(k, cache_k)
-                static_cache_init(v, cache_v)
+                static_cache_init(
+                    k,
+                    fluid.default_main_program().global_block().var(
+                        "static_k_%d" % i))
+                static_cache_init(
+                    v,
+                    fluid.default_main_program().global_block().var(
+                        "static_v_%d" % i))
+                k, v = cache_k, cache_v
             else:  # For decoder self-attention in inference
-                cache_k, cache_v = cache["k"], cache["v"]
-                # gather cell states corresponding to selected parent
-                select_k = layers.gather(cache_k, index=gather_idx)
-                select_v = layers.gather(cache_v, index=gather_idx)
-                if not static_kv:
-                    # For self attention in inference, use cache and concat time steps.
-                    select_k = layers.concat([select_k, k], axis=2)
-                    select_v = layers.concat([select_v, v], axis=2)
-                # update cell states(caches) cached in global block
-                layers.assign(select_k, cache_k)
-                layers.assign(select_v, cache_v)
-                return q, select_k, select_v
+                # use cache and concat time steps.
+                cache_k, cache_v = cache_["k"], cache_["v"]
+                k = layers.concat([cache_k, k], axis=2)
+                v = layers.concat([cache_v, v], axis=2)
+                cache_["k"], cache_["v"] = (k, v)
         return q, k, v

     def __combine_heads(x):
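The rewritten self-attention branch keeps decoding state by concatenating each step's keys and values onto the cache along the time axis, instead of gathering and assigning in place. A NumPy toy of that cache growth, with made-up shapes `[batch, n_head, time, d_key]`:

```python
import numpy as np

batch, n_head, d_key = 2, 8, 64
cache_k = np.zeros((batch, n_head, 0, d_key), dtype="float32")  # empty at step 0
for step in range(3):
    k_step = np.random.rand(batch, n_head, 1, d_key).astype("float32")
    cache_k = np.concatenate([cache_k, k_step], axis=2)  # axis=2 is time
print(cache_k.shape)  # (2, 8, 3, 64)
```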
@@ -301,12 +299,13 @@ def prepare_encoder_decoder(src_word,
         src_word,
         size=[src_vocab_size, src_emb_dim],
         padding_idx=bos_idx,  # set embedding of bos to 0
-        param_attr=fluid.ParamAttr(name=word_emb_param_name,
-                                   initializer=fluid.initializer.Normal(
-                                       0., src_emb_dim**-0.5)))
+        param_attr=fluid.ParamAttr(
+            name=word_emb_param_name,
+            initializer=fluid.initializer.Normal(0., src_emb_dim**-0.5)))
     src_word_emb = layers.scale(x=src_word_emb, scale=src_emb_dim**0.5)
-    src_pos_enc = fluid.embedding(src_pos,
+    src_pos_enc = fluid.embedding(
+        src_pos,
         size=[src_max_len, src_emb_dim],
         param_attr=fluid.ParamAttr(
             name=pos_enc_param_name, trainable=False))
@@ -405,8 +404,7 @@ def decoder_layer(dec_input,
                   relu_dropout,
                   preprocess_cmd,
                   postprocess_cmd,
-                  cache=None,
-                  gather_idx=None):
+                  cache=None):
     """ The layer to be stacked in decoder part.
     The structure of this module is similar to that in the encoder part except
     a multi-head attention is added to implement encoder-decoder attention.
@@ -421,8 +419,7 @@ def decoder_layer(dec_input,
         d_model,
         n_head,
         attention_dropout,
-        cache=cache,
-        gather_idx=gather_idx)
+        cache=cache)
     slf_attn_output = post_process_layer(
         dec_input,
         slf_attn_output,
@@ -440,7 +437,6 @@ def decoder_layer(dec_input,
         n_head,
         attention_dropout,
         cache=cache,
-        gather_idx=gather_idx,
         static_kv=True)
     enc_attn_output = post_process_layer(
         slf_attn_output,
@@ -476,8 +472,7 @@ def decoder(dec_input,
             relu_dropout,
             preprocess_cmd,
             postprocess_cmd,
-            caches=None,
-            gather_idx=None):
+            caches=None):
     """
     The decoder is composed of a stack of identical decoder_layer layers.
     """
@@ -497,8 +492,7 @@ def decoder(dec_input,
             relu_dropout,
             preprocess_cmd,
             postprocess_cmd,
-            cache=None if caches is None else caches[i],
-            gather_idx=gather_idx)
+            cache=None if caches is None else (caches[i], i))
         dec_input = dec_output
     dec_output = pre_process_layer(dec_output, preprocess_cmd,
                                    prepostprocess_dropout)
@@ -536,7 +530,8 @@ def transformer(model_input,
     label = model_input.lbl_word
     weights = model_input.lbl_weight

-    enc_output = wrap_encoder(enc_inputs,
+    enc_output = wrap_encoder(
+        enc_inputs,
         src_vocab_size,
         max_length,
         n_layer,
@@ -553,7 +548,8 @@ def transformer(model_input,
         weight_sharing,
         bos_idx=bos_idx)

-    predict = wrap_decoder(dec_inputs,
+    predict = wrap_decoder(
+        dec_inputs,
         trg_vocab_size,
         max_length,
         n_layer,
@@ -575,8 +571,9 @@ def transformer(model_input,
     if label_smooth_eps:
         # TODO: use fluid.input.one_hot after softmax_with_cross_entropy removing
         # the enforcement that the last dimension of label must be 1.
-        label = layers.label_smooth(label=layers.one_hot(input=label,
-                                                         depth=trg_vocab_size),
+        label = layers.label_smooth(
+            label=layers.one_hot(
+                input=label, depth=trg_vocab_size),
             epsilon=label_smooth_eps)

     cost = layers.softmax_with_cross_entropy(
@@ -654,7 +651,6 @@ def wrap_decoder(dec_inputs,
                  weight_sharing,
                  enc_output=None,
                  caches=None,
-                 gather_idx=None,
                  bos_idx=0):
     """
     The wrapper assembles together all needed layers for the decoder.
@@ -687,8 +683,7 @@ def wrap_decoder(dec_inputs,
         relu_dropout,
         preprocess_cmd,
         postprocess_cmd,
-        caches=caches,
-        gather_idx=gather_idx)
+        caches=caches)
     # Reshape to 2D tensor to use GEMM instead of BatchedGEMM
     dec_output = layers.reshape(
         dec_output, shape=[-1, dec_output.shape[-1]], inplace=True)
@@ -722,7 +717,8 @@ def fast_decode(model_input, src_vocab_size, trg_vocab_size, max_in_len,
     dec_inputs = (model_input.trg_word, model_input.init_score,
                   model_input.init_idx, model_input.trg_src_attn_bias)

-    enc_output = wrap_encoder(enc_inputs,
+    enc_output = wrap_encoder(
+        enc_inputs,
         src_vocab_size,
         max_in_len,
         n_layer,
@@ -748,8 +744,6 @@ def fast_decode(model_input, src_vocab_size, trg_vocab_size, max_in_len,
         force_cpu=True)
     step_idx = layers.fill_constant(
         shape=[1], dtype=start_tokens.dtype, value=0, force_cpu=True)
-    cond = layers.less_than(x=step_idx, y=max_len)  # default force_cpu=True
-    while_op = layers.While(cond)
     # array states will be stored for each step.
     ids = layers.array_write(
         layers.reshape(start_tokens, (-1, 1)), step_idx)
@@ -773,21 +767,31 @@ def fast_decode(model_input, src_vocab_size, trg_vocab_size, max_in_len,
             dtype=enc_output.dtype,
             value=0),
         "static_k":  # for encoder-decoder attention
-        layers.create_tensor(dtype=enc_output.dtype),
+        fluid.data(
+            shape=[None, n_head, 0, d_key],
+            dtype=enc_output.dtype,
+            name=("static_k_%d" % i)),
         "static_v":  # for encoder-decoder attention
-        layers.create_tensor(dtype=enc_output.dtype)
+        fluid.data(
+            shape=[None, n_head, 0, d_value],
+            dtype=enc_output.dtype,
+            name=("static_v_%d" % i)),
     } for i in range(n_layer)
     ]

-    with while_op.block():
-        pre_ids = layers.array_read(array=ids, i=step_idx)
-        # Since beam_search_op dosen't enforce pre_ids' shape, we can do
-        # inplace reshape here which actually change the shape of pre_ids.
-        # pre_ids = layers.reshape(pre_ids, (-1, 1, 1), inplace=True)
-        pre_scores = layers.array_read(array=scores, i=step_idx)
+    def cond_func(step_idx, selected_ids, selected_scores, gather_idx,
+                  caches, trg_src_attn_bias):
+        length_cond = layers.less_than(x=step_idx, y=max_len)
+        finish_cond = layers.logical_not(layers.is_empty(x=selected_ids))
+        return layers.logical_and(x=length_cond, y=finish_cond)
+
+    def body_func(step_idx, pre_ids, pre_scores, gather_idx, caches,
+                  trg_src_attn_bias):
+        # gather cell states corresponding to selected parent
+        pre_caches = map_structure(
+            lambda x: layers.gather(x, index=gather_idx), caches)
         pre_src_attn_bias = layers.gather(
-            trg_src_attn_bias, index=parent_idx)
+            trg_src_attn_bias, index=gather_idx)
         pre_pos = layers.elementwise_mul(
             x=layers.fill_constant_batch_size_like(
                 input=pre_src_attn_bias,  # cann't use lod tensor here
@@ -796,7 +800,8 @@ def fast_decode(model_input, src_vocab_size, trg_vocab_size, max_in_len,
                 dtype=pre_ids.dtype),
             y=step_idx,
             axis=0)
-        logits = wrap_decoder((pre_ids, pre_pos, None, pre_src_attn_bias),
+        logits = wrap_decoder(
+            (pre_ids, pre_pos, None, pre_src_attn_bias),
             trg_vocab_size,
             max_in_len,
             n_layer,
@@ -812,8 +817,7 @@ def fast_decode(model_input, src_vocab_size, trg_vocab_size, max_in_len,
             postprocess_cmd,
             weight_sharing,
             enc_output=enc_output,
-            caches=caches,
-            gather_idx=parent_idx,
+            caches=pre_caches,
             bos_idx=bos_idx)
         # intra-beam topK
         topk_scores, topk_indices = layers.topk(
@@ -832,16 +836,20 @@ def fast_decode(model_input, src_vocab_size, trg_vocab_size, max_in_len,
             beam_size=beam_size,
             end_id=eos_idx,
             return_parent_idx=True)
-        layers.increment(x=step_idx, value=1.0, in_place=True)
-        # cell states(caches) have been updated in wrap_decoder,
-        # only need to update beam search states here.
+        step_idx = layers.increment(x=step_idx, value=1.0, in_place=False)
         layers.array_write(selected_ids, i=step_idx, array=ids)
         layers.array_write(selected_scores, i=step_idx, array=scores)
-        layers.assign(gather_idx, parent_idx)
-        layers.assign(pre_src_attn_bias, trg_src_attn_bias)
-        length_cond = layers.less_than(x=step_idx, y=max_len)
-        finish_cond = layers.logical_not(layers.is_empty(x=selected_ids))
-        layers.logical_and(x=length_cond, y=finish_cond, out=cond)
+        return (step_idx, selected_ids, selected_scores, gather_idx,
+                pre_caches, pre_src_attn_bias)
+
+    _ = layers.while_loop(
+        cond=cond_func,
+        body=body_func,
+        loop_vars=[
+            step_idx, start_tokens, init_scores, parent_idx, caches,
+            trg_src_attn_bias
+        ],
+        is_test=True)

     finished_ids, finished_scores = layers.beam_search_decode(
         ids, scores, beam_size=beam_size, end_id=eos_idx)
...
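The imperative `layers.While` block above becomes a functional `layers.while_loop(cond, body, loop_vars)`: `cond` and `body` both receive the loop variables, and `body` returns their updated values. A self-contained counter example of the same API (not the beam search itself):

```python
import paddle.fluid as fluid
import paddle.fluid.layers as layers

i = layers.fill_constant(shape=[1], dtype="int64", value=0)
limit = layers.fill_constant(shape=[1], dtype="int64", value=10)

def cond(i):
    return layers.less_than(x=i, y=limit)

def body(i):
    # in_place=False returns a new variable, as in the diff above.
    return [layers.increment(x=i, value=1, in_place=False)]

out = layers.while_loop(cond=cond, body=body, loop_vars=[i])

exe = fluid.Executor(fluid.CPUPlace())
print(exe.run(fluid.default_main_program(), fetch_list=out)[0])  # [10]
```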
@@ -11,10 +11,11 @@ init_from_checkpoint: ""
 init_from_pretrain_model: ""
 # path of trained parameter, to make prediction
 init_from_params: "trained_params/step_100000"
-save_model_path: ""
-# the directory for saving checkpoints.
+# the directory for saving models.
+save_model_path: "saved_models"
+# deprecated, the directory for saving checkpoints.
 save_checkpoint: "trained_ckpts"
-# the directory for saving trained parameters.
+# deprecated, the directory for saving trained parameters.
 save_param: "trained_params"
 # the directory for saving inference model.
 inference_model_dir: "infer_model"
...
@@ -199,9 +199,14 @@ class PDConfig(object):
                           "Whether to perform model saving for inference.")

         # NOTE: args for profiler
-        self.default_g.add_arg("is_profiler", int, 0, "the switch of profiler tools. (used for benchmark)")
-        self.default_g.add_arg("profiler_path", str, './', "the profiler output file path. (used for benchmark)")
-        self.default_g.add_arg("max_iter", int, 0, "the max train batch num.(used for benchmark)")
+        self.default_g.add_arg(
+            "is_profiler", int, 0,
+            "the switch of profiler tools. (used for benchmark)")
+        self.default_g.add_arg(
+            "profiler_path", str, './',
+            "the profiler output file path. (used for benchmark)")
+        self.default_g.add_arg("max_iter", int, 0,
+                               "the max train batch num.(used for benchmark)")

         self.parser = parser
...
import pickle
import six
import warnings
from functools import partial

import paddle.fluid as fluid


def load(program, model_path, executor=None, var_list=None):
    """
    To load python2 saved models in python3.
    """
    try:
        fluid.load(program, model_path, executor, var_list)
    except UnicodeDecodeError:
        warnings.warn(
            "An UnicodeDecodeError is catched, which might be caused by loading "
            "a python2 saved model. Encoding of pickle.load would be set and "
            "load again automatically.")
        if six.PY3:
            load_bak = pickle.load
            pickle.load = partial(load_bak, encoding="latin1")
            fluid.load(program, model_path, executor, var_list)
            pickle.load = load_bak
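The wrapper above works by temporarily rebinding `pickle.load`, since `fluid.load` offers no hook to pass an `encoding` argument through, so python2 pickles get re-read as latin-1. The same trick in isolation, with `legacy_loader` as a hypothetical stand-in for `fluid.load`:

```python
import pickle
import warnings
from functools import partial

def legacy_loader(path):
    # Stand-in for a library call that uses pickle.load internally and
    # exposes no encoding parameter.
    with open(path, "rb") as f:
        return pickle.load(f)

def load_compat(path):
    try:
        return legacy_loader(path)
    except UnicodeDecodeError:
        warnings.warn("retrying with encoding='latin1' (python2 pickle)")
        load_bak = pickle.load
        pickle.load = partial(load_bak, encoding="latin1")
        try:
            return legacy_loader(path)
        finally:
            pickle.load = load_bak  # always restore the original binding
```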
@@ -22,6 +22,8 @@
 | :------| :------: | :------: |:------: |:------: |
 | [BERT-Large, Uncased (Whole Word Masking)](https://bert-models.bj.bcebos.com/wwm_uncased_L-24_H-1024_A-16.tar.gz)| 24 | 1024 | 16 | 340M |
 | [BERT-Large, Cased (Whole Word Masking)](https://bert-models.bj.bcebos.com/wwm_cased_L-24_H-1024_A-16.tar.gz)| 24 | 1024 | 16 | 340M |
+| [RoBERTa-Base, Chinese](https://bert-models.bj.bcebos.com/chinese_roberta_wwm_ext_L-12_H-768_A-12.tar.gz) | 12 | 768 |12 |110M |
+| [RoBERTa-Large, Chinese](https://bert-models.bj.bcebos.com/chinese_roberta_wwm_large_ext_L-24_H-1024_A-16.tar.gz) | 24 | 1024 |16 |340M |
 | [BERT-Base, Uncased](https://bert-models.bj.bcebos.com/uncased_L-12_H-768_A-12.tar.gz) | 12 | 768 |12 |110M |
 | [BERT-Large, Uncased](https://bert-models.bj.bcebos.com/uncased_L-24_H-1024_A-16.tar.gz) | 24 | 1024 |16 |340M |
 |[BERT-Base, Cased](https://bert-models.bj.bcebos.com/cased_L-12_H-768_A-12.tar.gz)|12|768|12|110M|
@@ -415,5 +417,3 @@ for (size_t i = 0; i < output.front().data.length() / sizeof(float); i += 3) {
              << static_cast<float *>(output.front().data.data())[i + 2] << std::endl;
 }
 ```
@@ -158,7 +158,7 @@ def optimization(loss,
     else:
         if weight_decay > 0:
-            for param in train_program.global_block().all_parameters():
+            for param in train_program.all_parameters():
                 param_list[param.name] = param * 1.0
                 param_list[param.name].stop_gradient = True
...
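`Program.all_parameters()`, used above in place of `global_block().all_parameters()`, collects parameters from every block of the program, which matters once control-flow ops place parameters in sub-blocks. A quick check, assuming the Fluid 1.7-era API (the toy fc layer and printed names are illustrative):

```python
import paddle.fluid as fluid

main_prog, startup_prog = fluid.Program(), fluid.Program()
with fluid.program_guard(main_prog, startup_prog):
    x = fluid.data(name="x", shape=[None, 4], dtype="float32")
    fluid.layers.fc(input=x, size=3)

# Parameters from all blocks, not just block 0.
print([p.name for p in main_prog.all_parameters()])  # e.g. ['fc_0.w_0', 'fc_0.b_0']
```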
@@ -392,7 +392,7 @@ def main(args):
                 if steps % args.save_steps == 0:
                     save_path = os.path.join(args.checkpoints,
                                              "step_" + str(steps))
-                    fluid.io.save_persistables(exe, save_path, train_program)
+                    fluid.save(program=train_program, model_path=save_path)

                 if steps % args.validation_steps == 0:
                     print("Average throughtput: %s" % (np.average(throughput)))
@@ -409,7 +409,7 @@ def main(args):
                               "test")
         except fluid.core.EOFException:
             save_path = os.path.join(args.checkpoints, "step_" + str(steps))
-            fluid.io.save_persistables(exe, save_path, train_program)
+            fluid.save(program=train_program, model_path=save_path)
             train_data_loader.reset()
             break
     if args.enable_ce:
...
@@ -398,11 +398,11 @@ def train(args):
                 if steps % args.save_steps == 0 or steps == max_train_steps:
                     save_path = os.path.join(args.checkpoints,
                                              "step_" + str(steps))
-                    fluid.io.save_persistables(exe, save_path, train_program)
+                    fluid.save(program=train_program, model_path=save_path)
         except fluid.core.EOFException:
             save_path = os.path.join(args.checkpoints,
                                      "step_" + str(steps) + "_final")
-            fluid.io.save_persistables(exe, save_path, train_program)
+            fluid.save(program=train_program, model_path=save_path)
             train_data_loader.reset()
             break
...
@@ -412,7 +412,7 @@ def train(args):
         if steps % args.save_steps == 0:
             save_path = os.path.join(args.checkpoints, "step_" + str(steps))
-            fluid.io.save_persistables(exe, save_path, train_program)
+            fluid.save(program=train_program, model_path=save_path)
         if args.validation_set_dir and steps % args.validation_steps == 0:
             vali_cost, vali_lm_cost, vali_acc, vali_steps, vali_speed = predict(
...
@@ -25,7 +25,7 @@ import paddle.fluid as fluid

 def cast_fp32_to_fp16(exe, main_program):
     print("Cast parameters to float16 data format.")
-    for param in main_program.global_block().all_parameters():
+    for param in main_program.all_parameters():
         if not param.name.endswith(".master"):
             param_t = fluid.global_scope().find_var(param.name).get_tensor()
             data = np.array(param_t)
@@ -38,21 +38,9 @@ def cast_fp32_to_fp16(exe, main_program):

 def init_checkpoint(exe, init_checkpoint_path, main_program, use_fp16=False):
-    assert os.path.exists(
-        init_checkpoint_path), "[%s] cann't be found." % init_checkpoint_path
-
-    def existed_persitables(var):
-        if not fluid.io.is_persistable(var):
-            return False
-        if os.path.exists(os.path.join(init_checkpoint_path, var.name)):
-            print("INIT {}".format(var.name))
-            return True
-
-    fluid.io.load_vars(
-        exe,
-        init_checkpoint_path,
-        main_program=main_program,
-        predicate=existed_persitables)
+    fluid.load(
+        program=main_program, model_path=init_checkpoint_path, executor=exe)
     print("Load model from {}".format(init_checkpoint_path))

     if use_fp16:
@@ -63,24 +51,8 @@ def init_pretraining_params(exe,
                             pretraining_params_path,
                             main_program,
                             use_fp16=False):
-    assert os.path.exists(pretraining_params_path
-                          ), "[%s] cann't be found." % pretraining_params_path
-
-    def existed_params(var):
-        if not isinstance(var, fluid.framework.Parameter):
-            return False
-        if os.path.exists(os.path.join(pretraining_params_path, var.name)):
-            print("INIT {}".format(var.name))
-            return True
-        else:
-            print("SKIP {}".format(var.name))
-            return False
-
-    fluid.io.load_vars(
-        exe,
-        pretraining_params_path,
-        main_program=main_program,
-        predicate=existed_params)
+    fluid.load(
+        program=main_program, model_path=pretraining_params_path, executor=exe)
     print("Load pretraining parameters from {}.".format(
         pretraining_params_path))
...
@@ -90,5 +90,3 @@ word_embedding=fluid.layers.concat(input=[elmo_embedding, word_embedding], axis=

 ### References
 [Deep contextualized word representations](https://arxiv.org/abs/1802.05365)
@@ -7,7 +7,6 @@ from kpi import CostKpi, DurationKpi, AccKpi
 #### NOTE kpi.py should shared in models in some way!!!!

 train_duration_sts_b_card1 = DurationKpi(
     'train_duration_sts_b_card1', 0.01, 0, actived=True)
 train_cost_sts_b_card1 = CostKpi(
...
@@ -29,7 +29,7 @@
 1. Installing PaddlePaddle

-   This project requires PaddlePaddle Fluid 1.6 or later; see the [installation guide](http://www.paddlepaddle.org/#quick-start) to install it.
+   This project requires PaddlePaddle Fluid 1.7 or later; see the [installation guide](http://www.paddlepaddle.org/#quick-start) to install it.

 2. Installing the code
...
@@ -13,6 +13,7 @@ from run_classifier import create_model
 import utils
 import reader

+
 def do_save_inference_model(args):
     if args.use_cuda:
         dev_count = fluid.core.get_cuda_device_count()
@@ -53,6 +54,7 @@ def do_save_inference_model(args):
     print("save inference model at %s" % (args.inference_model_dir))

+
 def inference(exe, test_program, test_pyreader, fetch_list, infer_phrase):
     """
     Inference Function
@@ -61,13 +63,16 @@ def inference(exe, test_program, test_pyreader, fetch_list, infer_phrase):
     test_pyreader.start()
     while True:
         try:
-            np_props = exe.run(program=test_program, fetch_list=fetch_list, return_numpy=True)
+            np_props = exe.run(program=test_program,
+                               fetch_list=fetch_list,
+                               return_numpy=True)
             for probs in np_props[0]:
                 print("%d\t%f\t%f" % (np.argmax(probs), probs[0], probs[1]))
         except fluid.core.EOFException:
             test_pyreader.reset()
             break

+
 def test_inference_model(args):
     if args.use_cuda:
         dev_count = fluid.core.get_cuda_device_count()
@@ -92,7 +97,8 @@ def test_inference_model(args):
     exe = fluid.Executor(place)
     exe.run(startup_prog)

-    processor = reader.SentaProcessor(data_dir=args.data_dir,
+    processor = reader.SentaProcessor(
+        data_dir=args.data_dir,
         vocab_path=args.vocab_path,
         random_seed=args.random_seed,
         max_seq_len=args.max_seq_len)
@@ -107,14 +113,14 @@ def test_inference_model(args):
         params_filename="params.pdparams")

     infer_data_generator = processor.data_generator(
-        batch_size=args.batch_size,
+        batch_size=args.batch_size / dev_count,
         phase="infer",
         epoch=1,
         shuffle=False)

-    infer_pyreader.decorate_sample_list_generator(infer_data_generator)
-    inference(exe, test_prog, infer_pyreader,
-              [probs.name], "infer")
+    infer_pyreader.set_sample_list_generator(infer_data_generator)
+    inference(exe, test_prog, infer_pyreader, [probs.name], "infer")

 if __name__ == "__main__":
     args = PDConfig('senta_config.json')
...
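Note that the generators above now receive a per-device batch size. A hedged aside, not something the diff itself does: under python3, `/` is true division and yields a float, so if `batch_size` is not guaranteed to divide evenly by `dev_count`, integer division is the safer spelling:

```python
batch_size, dev_count = 32, 4
per_device = batch_size // dev_count  # 8, stays an int under python3
assert per_device * dev_count <= batch_size
```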
 # -*- coding: utf_8 -*-
 import os
 import sys
-sys.path.append("../")
-sys.path.append("../models/classification")
+sys.path.append("../shared_modules/")
+sys.path.append("../shared_modules/models/classification")
 import paddle
 import paddle.fluid as fluid
 import numpy as np
@@ -17,6 +17,7 @@ from models.representation.ernie import ErnieConfig
 from models.representation.ernie import ernie_encoder, ernie_encoder_with_paddle_hub
 from preprocess.ernie import task_reader

+
 def do_save_inference_model(args):
     ernie_config = ErnieConfig(args.ernie_config_path)
@@ -37,18 +38,17 @@ def do_save_inference_model(args):
     with fluid.program_guard(test_prog, startup_prog):
         with fluid.unique_name.guard():
             infer_pyreader, ernie_inputs, labels = ernie_pyreader(
-                args,
-                pyreader_name="infer_reader")
+                args, pyreader_name="infer_reader")

             if args.use_paddle_hub:
-                embeddings = ernie_encoder_with_paddle_hub(ernie_inputs, args.max_seq_len)
+                embeddings = ernie_encoder_with_paddle_hub(ernie_inputs,
+                                                           args.max_seq_len)
             else:
-                embeddings = ernie_encoder(ernie_inputs, ernie_config=ernie_config)
+                embeddings = ernie_encoder(
+                    ernie_inputs, ernie_config=ernie_config)

-            probs = create_model(args,
-                                 embeddings,
-                                 labels=labels,
-                                 is_prediction=True)
+            probs = create_model(
+                args, embeddings, labels=labels, is_prediction=True)
     test_prog = test_prog.clone(for_test=True)
     exe.run(startup_prog)
@@ -59,11 +59,11 @@ def do_save_inference_model(args):
     fluid.io.save_inference_model(
         args.inference_model_dir,
-        feeded_var_names=[ernie_inputs["src_ids"].name,
-                          ernie_inputs["sent_ids"].name,
-                          ernie_inputs["pos_ids"].name,
-                          ernie_inputs["input_mask"].name,
-                          ernie_inputs["seq_lens"].name],
+        feeded_var_names=[
+            ernie_inputs["src_ids"].name, ernie_inputs["sent_ids"].name,
+            ernie_inputs["pos_ids"].name, ernie_inputs["input_mask"].name,
+            ernie_inputs["seq_lens"].name
+        ],
         target_vars=[probs],
         executor=exe,
         main_program=test_prog,
@@ -72,6 +72,7 @@ def do_save_inference_model(args):
     print("save inference model at %s" % (args.inference_model_dir))

+
 def inference(exe, test_program, test_pyreader, fetch_list, infer_phrase):
     """
     Inference Function
@@ -80,13 +81,16 @@ def inference(exe, test_program, test_pyreader, fetch_list, infer_phrase):
     test_pyreader.start()
     while True:
         try:
-            np_props = exe.run(program=test_program, fetch_list=fetch_list, return_numpy=True)
+            np_props = exe.run(program=test_program,
+                               fetch_list=fetch_list,
+                               return_numpy=True)
             for probs in np_props[0]:
                 print("%d\t%f\t%f" % (np.argmax(probs), probs[0], probs[1]))
         except fluid.core.EOFException:
             test_pyreader.reset()
             break

+
 def test_inference_model(args):
     ernie_config = ErnieConfig(args.ernie_config_path)
     ernie_config.print_config()
@@ -113,15 +117,11 @@ def test_inference_model(args):
     with fluid.program_guard(test_prog, startup_prog):
         with fluid.unique_name.guard():
             infer_pyreader, ernie_inputs, labels = ernie_pyreader(
-                args,
-                pyreader_name="infer_pyreader")
+                args, pyreader_name="infer_pyreader")
             embeddings = ernie_encoder(ernie_inputs, ernie_config=ernie_config)

             probs = create_model(
-                args,
-                embeddings,
-                labels=labels,
-                is_prediction=True)
+                args, embeddings, labels=labels, is_prediction=True)
     test_prog = test_prog.clone(for_test=True)
     exe.run(startup_prog)
@@ -129,7 +129,7 @@ def test_inference_model(args):
     assert (args.inference_model_dir)
     infer_data_generator = reader.data_generator(
         input_file=args.test_set,
-        batch_size=args.batch_size,
+        batch_size=args.batch_size / dev_count,
         phase="infer",
         epoch=1,
         shuffle=False)
@@ -140,9 +140,9 @@ def test_inference_model(args):
         model_filename="model.pdmodel",
         params_filename="params.pdparams")

-    infer_pyreader.decorate_batch_generator(infer_data_generator)
-    inference(exe, test_prog, infer_pyreader,
-              [probs.name], "infer")
+    infer_pyreader.set_batch_generator(infer_data_generator)
+    inference(exe, test_prog, infer_pyreader, [probs.name], "infer")

 if __name__ == "__main__":
     args = PDConfig()
...
@@ -12,8 +12,8 @@ import argparse
 import numpy as np
 import multiprocessing
 import sys
-sys.path.append("../models/classification/")
-sys.path.append("../")
+sys.path.append("../shared_modules/models/classification/")
+sys.path.append("../shared_modules/")

 from nets import bow_net
 from nets import lstm_net
@@ -30,24 +30,19 @@ import paddle.fluid as fluid
 import reader
 from utils import init_checkpoint

-def create_model(args,
-                 pyreader_name,
-                 num_labels,
-                 is_prediction=False):
+def create_model(args, pyreader_name, num_labels, is_prediction=False):
     """
     Create Model for sentiment classification
     """
-    data = fluid.layers.data(
-        name="src_ids", shape=[-1, args.max_seq_len], dtype='int64')
-    label = fluid.layers.data(
-        name="label", shape=[-1, 1], dtype="int64")
-    seq_len = fluid.layers.data(
-        name="seq_len", shape=[-1], dtype="int64")
+    data = fluid.data(
+        name="src_ids", shape=[None, args.max_seq_len], dtype='int64')
+    label = fluid.data(name="label", shape=[None, 1], dtype="int64")
+    seq_len = fluid.data(name="seq_len", shape=[None], dtype="int64")

-    data_reader = fluid.io.PyReader(feed_list=[data, label, seq_len],
-                                    capacity=4, iterable=False)
+    data_reader = fluid.io.DataLoader.from_generator(
+        feed_list=[data, label, seq_len], capacity=4, iterable=False)

     if args.model_type == "bilstm_net":
         network = bilstm_net
@@ -63,18 +58,19 @@ def create_model(args,
         raise ValueError("Unknown network type!")

     if is_prediction:
-        probs = network(data, seq_len, None, args.vocab_size, is_prediction=is_prediction)
+        probs = network(
+            data, seq_len, None, args.vocab_size, is_prediction=is_prediction)
         print("create inference model...")
         return data_reader, probs, [data.name, seq_len.name]

-    ce_loss, probs = network(data, seq_len, label, args.vocab_size, is_prediction=is_prediction)
+    ce_loss, probs = network(
+        data, seq_len, label, args.vocab_size, is_prediction=is_prediction)
     loss = fluid.layers.mean(x=ce_loss)
     num_seqs = fluid.layers.create_tensor(dtype='int64')
     accuracy = fluid.layers.accuracy(input=probs, label=label, total=num_seqs)

     return data_reader, loss, accuracy, num_seqs

 def evaluate(exe, test_program, test_pyreader, fetch_list, eval_phase):
     """
     Evaluation Function
@@ -111,7 +107,8 @@ def inference(exe, test_program, test_pyreader, fetch_list, infer_phrase):
     time_begin = time.time()
     while True:
         try:
-            np_props = exe.run(program=test_program, fetch_list=fetch_list,
+            np_props = exe.run(program=test_program,
+                               fetch_list=fetch_list,
                                return_numpy=True)
             for probs in np_props[0]:
                 print("%d\t%f\t%f" % (np.argmax(probs), probs[0], probs[1]))
@@ -135,7 +132,8 @@ def main(args):
     exe = fluid.Executor(place)
     task_name = args.task_name.lower()

-    processor = reader.SentaProcessor(data_dir=args.data_dir,
+    processor = reader.SentaProcessor(
+        data_dir=args.data_dir,
         vocab_path=args.vocab_path,
         random_seed=args.random_seed,
         max_seq_len=args.max_seq_len)
@@ -151,7 +149,7 @@ def main(args):
     if args.do_train:
         train_data_generator = processor.data_generator(
-            batch_size=args.batch_size,
+            batch_size=args.batch_size / dev_count,
             phase='train',
             epoch=args.epoch,
             shuffle=True)
@@ -187,7 +185,7 @@ def main(args):
     if args.do_val:
         test_data_generator = processor.data_generator(
-            batch_size=args.batch_size,
+            batch_size=args.batch_size / dev_count,
             phase='dev',
             epoch=1,
             shuffle=False)
@@ -204,7 +202,7 @@ def main(args):
     if args.do_infer:
         infer_data_generator = processor.data_generator(
-            batch_size=args.batch_size,
+            batch_size=args.batch_size / dev_count,
             phase='infer',
             epoch=1,
             shuffle=False)
@@ -223,30 +221,25 @@ def main(args):
     if args.do_train:
         if args.init_checkpoint:
             init_checkpoint(
-                exe,
-                args.init_checkpoint,
-                main_program=startup_prog)
+                exe, args.init_checkpoint, main_program=startup_prog)
     elif args.do_val or args.do_infer:
         if not args.init_checkpoint:
             raise ValueError("args 'init_checkpoint' should be set if"
                              "only doing validation or testing!")
-        init_checkpoint(
-            exe,
-            args.init_checkpoint,
-            main_program=startup_prog)
+        init_checkpoint(exe, args.init_checkpoint, main_program=startup_prog)

     if args.do_train:
         train_exe = exe
-        train_reader.decorate_sample_list_generator(train_data_generator)
+        train_reader.set_sample_list_generator(train_data_generator)
     else:
         train_exe = None
     if args.do_val:
         test_exe = exe
-        test_reader.decorate_sample_list_generator(test_data_generator)
+        test_reader.set_sample_list_generator(test_data_generator)
     if args.do_infer:
         test_exe = exe
-        infer_reader.decorate_sample_list_generator(infer_data_generator)
+        infer_reader.set_sample_list_generator(infer_data_generator)

     if args.do_train:
         train_reader.start()
@@ -262,7 +255,9 @@ def main(args):
                 else:
                     fetch_list = []

-                outputs = train_exe.run(program=train_program, fetch_list=fetch_list, return_numpy=False)
+                outputs = train_exe.run(program=train_program,
+                                        fetch_list=fetch_list,
+                                        return_numpy=False)
                 #print("finished one step")
                 if steps % args.skip_steps == 0:
                     np_loss, np_acc, np_num_seqs = outputs
@@ -274,7 +269,8 @@ def main(args):
                     total_num_seqs.extend(np_num_seqs)

                     if args.verbose:
-                        verbose = "train pyreader queue size: %d, " % train_pyreader.queue.size()
+                        verbose = "train pyreader queue size: %d, " % train_pyreader.queue.size(
+                        )
                         print(verbose)

                     time_end = time.time()
@@ -289,8 +285,8 @@ def main(args):
                 if steps % args.save_steps == 0:
                     save_path = os.path.join(args.checkpoints,
-                                             "step_" + str(steps))
-                    fluid.io.save_persistables(exe, save_path, train_program)
+                                             "step_" + str(steps), "checkpoint")
+                    fluid.save(train_program, save_path)

                 if steps % args.validation_steps == 0:
                     # evaluate dev set
@@ -301,8 +297,9 @@ def main(args):
                               "dev")
         except fluid.core.EOFException:
-            save_path = os.path.join(args.checkpoints, "step_" + str(steps))
-            fluid.io.save_persistables(exe, save_path, train_program)
+            save_path = os.path.join(args.checkpoints, "step_" + str(steps),
+                                     "checkpoint")
+            fluid.save(train_program, save_path)
             train_reader.reset()
             break
@@ -315,8 +312,7 @@ def main(args):
     # final eval on test set
     if args.do_infer:
         print("Final test result:")
-        inference(exe, infer_prog, infer_reader,
-                  [prop.name], "infer")
+        inference(exe, infer_prog, infer_reader, [prop.name], "infer")

 if __name__ == "__main__":
...
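The reader migration above pairs `fluid.data` (with `None` for the variable batch dimension) with `fluid.io.DataLoader.from_generator` and `set_sample_list_generator`. A self-contained sketch of that pairing, with toy shapes and data:

```python
import numpy as np
import paddle.fluid as fluid

data = fluid.data(name="src_ids", shape=[None, 16], dtype="int64")
label = fluid.data(name="label", shape=[None, 1], dtype="int64")
loader = fluid.io.DataLoader.from_generator(
    feed_list=[data, label], capacity=4, iterable=False)

def sample_list_generator():
    # Each yield is a list of samples; each sample is a tuple matching
    # feed_list (here a single toy sample per mini-batch).
    for _ in range(8):
        yield [(np.zeros(16, dtype="int64"), np.zeros(1, dtype="int64"))]

loader.set_sample_list_generator(sample_list_generator)
```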
...@@ -16,8 +16,8 @@ import sys ...@@ -16,8 +16,8 @@ import sys
import paddle import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
sys.path.append("../models/classification/") sys.path.append("../shared_modules/models/classification/")
sys.path.append("..") sys.path.append("../shared_modules/")
print(sys.path) print(sys.path)
from nets import bow_net from nets import bow_net
...@@ -36,19 +36,18 @@ from config import PDConfig ...@@ -36,19 +36,18 @@ from config import PDConfig
from utils import init_checkpoint from utils import init_checkpoint
def ernie_pyreader(args, pyreader_name): def ernie_pyreader(args, pyreader_name):
src_ids = fluid.layers.data( src_ids = fluid.data(
name="src_ids", shape=[-1, args.max_seq_len, 1], dtype="int64") name="src_ids", shape=[None, args.max_seq_len, 1], dtype="int64")
sent_ids = fluid.layers.data( sent_ids = fluid.data(
name="sent_ids", shape=[-1, args.max_seq_len, 1], dtype="int64") name="sent_ids", shape=[None, args.max_seq_len, 1], dtype="int64")
pos_ids = fluid.layers.data( pos_ids = fluid.data(
name="pos_ids", shape=[-1, args.max_seq_len, 1], dtype="int64") name="pos_ids", shape=[None, args.max_seq_len, 1], dtype="int64")
input_mask = fluid.layers.data( input_mask = fluid.data(
name="input_mask", shape=[-1, args.max_seq_len, 1], dtype="float32") name="input_mask", shape=[None, args.max_seq_len, 1], dtype="float32")
labels = fluid.layers.data( labels = fluid.data(name="labels", shape=[None, 1], dtype="int64")
name="labels", shape=[-1, 1], dtype="int64") seq_lens = fluid.data(name="seq_lens", shape=[None], dtype="int64")
seq_lens = fluid.layers.data(
name="seq_lens", shape=[-1], dtype="int64")
pyreader = fluid.io.DataLoader.from_generator( pyreader = fluid.io.DataLoader.from_generator(
feed_list=[src_ids, sent_ids, pos_ids, input_mask, labels, seq_lens], feed_list=[src_ids, sent_ids, pos_ids, input_mask, labels, seq_lens],
...@@ -61,15 +60,13 @@ def ernie_pyreader(args, pyreader_name): ...@@ -61,15 +60,13 @@ def ernie_pyreader(args, pyreader_name):
"sent_ids": sent_ids, "sent_ids": sent_ids,
"pos_ids": pos_ids, "pos_ids": pos_ids,
"input_mask": input_mask, "input_mask": input_mask,
"seq_lens": seq_lens} "seq_lens": seq_lens
}
return pyreader, ernie_inputs, labels return pyreader, ernie_inputs, labels
def create_model(args,
embeddings,
labels,
is_prediction=False):
def create_model(args, embeddings, labels, is_prediction=False):
""" """
Create Model for sentiment classification based on ERNIE encoder Create Model for sentiment classification based on ERNIE encoder
""" """
...@@ -132,7 +129,8 @@ def infer(exe, infer_program, infer_pyreader, fetch_list, infer_phase): ...@@ -132,7 +129,8 @@ def infer(exe, infer_program, infer_pyreader, fetch_list, infer_phase):
time_begin = time.time() time_begin = time.time()
while True: while True:
try: try:
batch_probs = exe.run(program=infer_program, fetch_list=fetch_list, batch_probs = exe.run(program=infer_program,
fetch_list=fetch_list,
return_numpy=True) return_numpy=True)
for probs in batch_probs[0]: for probs in batch_probs[0]:
print("%d\t%f\t%f" % (np.argmax(probs), probs[0], probs[1])) print("%d\t%f\t%f" % (np.argmax(probs), probs[0], probs[1]))
...@@ -195,21 +193,19 @@ def main(args): ...@@ -195,21 +193,19 @@ def main(args):
with fluid.unique_name.guard(): with fluid.unique_name.guard():
# create ernie_pyreader # create ernie_pyreader
train_pyreader, ernie_inputs, labels = ernie_pyreader( train_pyreader, ernie_inputs, labels = ernie_pyreader(
args, args, pyreader_name='train_pyreader')
pyreader_name='train_pyreader')
# get ernie_embeddings # get ernie_embeddings
if args.use_paddle_hub: if args.use_paddle_hub:
embeddings = ernie_encoder_with_paddle_hub(ernie_inputs, args.max_seq_len) embeddings = ernie_encoder_with_paddle_hub(ernie_inputs,
args.max_seq_len)
else: else:
embeddings = ernie_encoder(ernie_inputs, ernie_config=ernie_config) embeddings = ernie_encoder(
ernie_inputs, ernie_config=ernie_config)
# user defined model based on ernie embeddings # user defined model based on ernie embeddings
loss, accuracy, num_seqs = create_model( loss, accuracy, num_seqs = create_model(
args, args, embeddings, labels=labels, is_prediction=False)
embeddings,
labels=labels,
is_prediction=False)
optimizer = fluid.optimizer.Adam(learning_rate=args.lr) optimizer = fluid.optimizer.Adam(learning_rate=args.lr)
optimizer.minimize(loss) optimizer.minimize(loss)
...@@ -232,21 +228,19 @@ def main(args): ...@@ -232,21 +228,19 @@ def main(args):
with fluid.unique_name.guard(): with fluid.unique_name.guard():
# create ernie_pyreader # create ernie_pyreader
test_pyreader, ernie_inputs, labels = ernie_pyreader( test_pyreader, ernie_inputs, labels = ernie_pyreader(
args, args, pyreader_name='eval_reader')
pyreader_name='eval_reader')
# get ernie_embeddings # get ernie_embeddings
if args.use_paddle_hub: if args.use_paddle_hub:
embeddings = ernie_encoder_with_paddle_hub(ernie_inputs, args.max_seq_len) embeddings = ernie_encoder_with_paddle_hub(ernie_inputs,
args.max_seq_len)
else: else:
embeddings = ernie_encoder(ernie_inputs, ernie_config=ernie_config) embeddings = ernie_encoder(
ernie_inputs, ernie_config=ernie_config)
# user defined model based on ernie embeddings # user defined model based on ernie embeddings
loss, accuracy, num_seqs = create_model( loss, accuracy, num_seqs = create_model(
args, args, embeddings, labels=labels, is_prediction=False)
embeddings,
labels=labels,
is_prediction=False)
test_prog = test_prog.clone(for_test=True) test_prog = test_prog.clone(for_test=True)
...@@ -261,19 +255,18 @@ def main(args): ...@@ -261,19 +255,18 @@ def main(args):
with fluid.program_guard(infer_prog, startup_prog): with fluid.program_guard(infer_prog, startup_prog):
with fluid.unique_name.guard(): with fluid.unique_name.guard():
infer_pyreader, ernie_inputs, labels = ernie_pyreader( infer_pyreader, ernie_inputs, labels = ernie_pyreader(
args, args, pyreader_name="infer_pyreader")
pyreader_name="infer_pyreader")
# get ernie_embeddings # get ernie_embeddings
if args.use_paddle_hub: if args.use_paddle_hub:
embeddings = ernie_encoder_with_paddle_hub(ernie_inputs, args.max_seq_len) embeddings = ernie_encoder_with_paddle_hub(ernie_inputs,
args.max_seq_len)
else: else:
embeddings = ernie_encoder(ernie_inputs, ernie_config=ernie_config) embeddings = ernie_encoder(
ernie_inputs, ernie_config=ernie_config)
probs = create_model(args, probs = create_model(
embeddings, args, embeddings, labels=labels, is_prediction=True)
labels=labels,
is_prediction=True)
infer_prog = infer_prog.clone(for_test=True) infer_prog = infer_prog.clone(for_test=True)
@@ -282,25 +275,17 @@ def main(args):
     if args.do_train:
         if args.init_checkpoint:
             init_checkpoint(
-                exe,
-                args.init_checkpoint,
-                main_program=train_program)
+                exe, args.init_checkpoint, main_program=train_program)
     elif args.do_val:
         if not args.init_checkpoint:
             raise ValueError("args 'init_checkpoint' should be set if"
                              "only doing validation or testing!")
-        init_checkpoint(
-            exe,
-            args.init_checkpoint,
-            main_program=test_prog)
+        init_checkpoint(exe, args.init_checkpoint, main_program=test_prog)
     elif args.do_infer:
         if not args.init_checkpoint:
             raise ValueError("args 'init_checkpoint' should be set if"
                              "only doing validation or testing!")
-        init_checkpoint(
-            exe,
-            args.init_checkpoint,
-            main_program=infer_prog)
+        init_checkpoint(exe, args.init_checkpoint, main_program=infer_prog)

     if args.do_train:
         train_exe = exe
@@ -327,7 +312,9 @@ def main(args):
                 else:
                     fetch_list = []
-                outputs = train_exe.run(program=train_program, fetch_list=fetch_list, return_numpy=False)
+                outputs = train_exe.run(program=train_program,
+                                        fetch_list=fetch_list,
+                                        return_numpy=False)
                 if steps % args.skip_steps == 0:
                     np_loss, np_acc, np_num_seqs = outputs
                     np_loss = np.array(np_loss)
@@ -338,7 +325,8 @@ def main(args):
                     total_num_seqs.extend(np_num_seqs)
                     if args.verbose:
-                        verbose = "train pyreader queue size: %d, " % train_pyreader.queue.size()
+                        verbose = "train pyreader queue size: %d, " % train_pyreader.queue.size(
+                        )
                         print(verbose)
                     time_end = time.time()
@@ -353,8 +341,8 @@ def main(args):
                 if steps % args.save_steps == 0:
                     save_path = os.path.join(args.checkpoints,
-                                             "step_" + str(steps))
-                    fluid.io.save_persistables(exe, save_path, train_program)
+                                             "step_" + str(steps), "checkpoint")
+                    fluid.save(train_program, save_path)
                 if steps % args.validation_steps == 0:
                     # evaluate dev set
@@ -364,8 +352,9 @@ def main(args):
                               "dev")
         except fluid.core.EOFException:
-            save_path = os.path.join(args.checkpoints, "step_" + str(steps))
-            fluid.io.save_persistables(exe, save_path, train_program)
+            save_path = os.path.join(args.checkpoints, "step_" + str(steps),
+                                     "checkpoint")
+            fluid.save(train_program, save_path)
             train_pyreader.reset()
             break
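For readers migrating similar training loops: the hunk above swaps the directory-per-variable `fluid.io.save_persistables` call for the unified `fluid.save` API (Fluid 1.6+), which serializes the whole program state under a single file prefix. A minimal, self-contained sketch of the two styles; the tiny network and the paths are stand-ins, not code from this repo:

```python
import os
import paddle.fluid as fluid

# A tiny program so the sketch runs on its own.
main_prog, startup_prog = fluid.Program(), fluid.Program()
with fluid.program_guard(main_prog, startup_prog):
    x = fluid.data(name="x", shape=[-1, 4], dtype="float32")
    y = fluid.layers.fc(input=x, size=2)

exe = fluid.Executor(fluid.CPUPlace())
exe.run(startup_prog)

# Old style (Fluid <= 1.5): every persistable variable becomes one raw
# file inside the target directory.
fluid.io.save_persistables(exe, os.path.join("ckpt_old", "step_100"), main_prog)

# New style (Fluid 1.6+): a single prefix, yielding "checkpoint.pdparams"
# (plus optimizer state in "checkpoint.pdopt") under step_100/.
save_path = os.path.join("ckpt_new", "step_100", "checkpoint")
os.makedirs(os.path.dirname(save_path), exist_ok=True)
fluid.save(main_prog, save_path)
```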
@@ -378,8 +367,8 @@ def main(args):
     # final eval on test set
     if args.do_infer:
         print("Final test result:")
-        infer(exe, infer_prog, infer_pyreader,
-              [probs.name], "infer")
+        infer(exe, infer_prog, infer_pyreader, [probs.name], "infer")

 if __name__ == "__main__":
     args = PDConfig()
...
@@ -31,6 +31,7 @@ class ArgumentGroup(object):
     """
     Argument Class
     """
+
     def __init__(self, parser, title, des):
         self._group = parser.add_argument_group(title=title, description=des)
@@ -63,20 +64,11 @@ def init_checkpoint(exe, init_checkpoint_path, main_program):
     """
     assert os.path.exists(
         init_checkpoint_path), "[%s] cann't be found." % init_checkpoint_path
-
-    def existed_persitables(var):
-        """
-        If existed presitabels
-        """
-        if not fluid.io.is_persistable(var):
-            return False
-        return os.path.exists(os.path.join(init_checkpoint_path, var.name))
-
-    fluid.io.load_vars(
-        exe,
-        init_checkpoint_path,
-        main_program=main_program,
-        predicate=existed_persitables)
+    try:
+        checkpoint_path = os.path.join(init_checkpoint_path, "checkpoint")
+        fluid.load(main_program, checkpoint_path, exe)
+    except:
+        fluid.load(main_program, init_checkpoint_path, exe)
     print("Load model from {}".format(init_checkpoint_path))
@@ -96,8 +88,10 @@ def data_reader(file_path, word_dict, num_examples, phrase, epoch, max_seq_len):
                 sys.stderr.write("[NOTICE] Error Format Line!")
                 continue
             label = int(cols[1])
-            wids = [word_dict[x] if x in word_dict else unk_id
-                    for x in cols[0].split(" ")]
+            wids = [
+                word_dict[x] if x in word_dict else unk_id
+                for x in cols[0].split(" ")
+            ]
             seq_len = len(wids)
             if seq_len < max_seq_len:
                 for i in range(max_seq_len - seq_len):
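As context for the hunk above: `data_reader` maps each token to an id (falling back to the unknown id) and then pads or truncates to `max_seq_len`. A compact, pure-Python sketch of that conversion with a toy vocabulary (the real script loads its vocabulary from disk and takes `max_seq_len` from the command line):

```python
word_dict = {"<unk>": 0, "paddle": 1, "nlp": 2}  # toy vocab, not the real one
unk_id = word_dict["<unk>"]
max_seq_len = 5

text = "paddle nlp rocks"
wids = [word_dict[x] if x in word_dict else unk_id for x in text.split(" ")]
seq_len = len(wids)
if seq_len < max_seq_len:
    wids += [0] * (max_seq_len - seq_len)  # pad id assumed to be 0 here
else:
    wids = wids[:max_seq_len]
    seq_len = max_seq_len
print(wids, seq_len)  # [1, 2, 0, 0, 0] 3
```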
@@ -119,8 +113,10 @@ def data_reader(file_path, word_dict, num_examples, phrase, epoch, max_seq_len):
         for epoch_index in range(epoch):
             for doc, label, seq_len in all_data:
                 yield doc, label, seq_len
+
     return reader

+
 def load_vocab(file_path):
     """
     load the given vocabulary
@@ -144,15 +140,6 @@ def init_pretraining_params(exe,
     assert os.path.exists(pretraining_params_path
                           ), "[%s] cann't be found." % pretraining_params_path
-
-    def _existed_params(var):
-        if not isinstance(var, fluid.framework.Parameter):
-            return False
-        return os.path.exists(os.path.join(pretraining_params_path, var.name))
-
-    fluid.io.load_vars(
-        exe,
-        pretraining_params_path,
-        main_program=main_program,
-        predicate=_existed_params)
+    fluid.load(main_program, pretraining_params_path, exe)
     print("Load pretraining parameters from {}.".format(
         pretraining_params_path))
-Running the example models in this directory requires PaddlePaddle Fluid 1.6. If your installed PaddlePaddle version is lower than this requirement, please update it following the instructions in the [installation guide](https://www.paddlepaddle.org.cn/#quick-start).
+Running the example models in this directory requires PaddlePaddle Fluid 1.7. If your installed PaddlePaddle version is lower than this requirement, please update it following the instructions in the [installation guide](https://www.paddlepaddle.org.cn/#quick-start).

 # Sequence to Sequence (Seq2Seq)
...
@@ -93,7 +93,7 @@ def infer():
     # clone from default main program and use it as the validation program
     main_program = fluid.default_main_program()
     main_program = main_program.clone(for_test=True)
-    print([param.name for param in main_program.blocks[0].all_parameters()])
+    print([param.name for param in main_program.all_parameters()])
     place = fluid.CUDAPlace(0) if args.use_gpu else fluid.CPUPlace()
     exe = Executor(place)
@@ -127,7 +127,8 @@ def infer():
     dir_name = args.reload_model
     print("dir name", dir_name)
-    fluid.io.load_params(exe, dir_name)
+    dir_name = os.path.join(dir_name, "checkpoint")
+    fluid.load(main_program, dir_name, exe)
     train_data_iter = reader.get_data_iter(infer_data, 1, mode='eval')
...
@@ -229,10 +229,10 @@ def main():
                   % (epoch_id, epoch_time, sum(batch_times) / len(batch_times)))
             if not args.profile:
-                dir_name = os.path.join(args.model_path,
-                                        "epoch_" + str(epoch_id))
-                print("begin to save", dir_name)
-                fluid.io.save_params(exe, dir_name, main_program=train_program)
+                save_path = os.path.join(args.model_path,
+                                         "epoch_" + str(epoch_id), "checkpoint")
+                print("begin to save", save_path)
+                fluid.save(train_program, save_path)
                 print("save finished")
                 dev_ppl = eval(valid_data)
                 print("dev ppl", dev_ppl)
...
@@ -88,7 +88,8 @@ def infer():
     dir_name = args.reload_model
     print("dir name", dir_name)
-    fluid.io.load_params(exe, dir_name)
+    dir_name = os.path.join(dir_name, "checkpoint")
+    fluid.load(main_program, dir_name, exe)
     vocab, tar_id2vocab = get_vocab(args.dataset_prefix)
     infer_output = np.ones((batch_size, 1), dtype='int64') * BOS_ID
...
@@ -255,10 +255,11 @@ def main():
                 best_nll = test_nll
                 best_ppl = test_ppl
                 best_epoch_id = epoch_id
-                dir_name = os.path.join(args.model_path,
-                                        "epoch_" + str(best_epoch_id))
-                print("save model {}".format(dir_name))
-                fluid.io.save_params(exe, dir_name, main_program)
+                save_path = os.path.join(args.model_path,
+                                         "epoch_" + str(best_epoch_id),
+                                         "checkpoint")
+                print("save model {}".format(save_path))
+                fluid.save(main_program, save_path)
             else:
                 steps_not_improved += 1
                 if steps_not_improved == decay_ts:
...
@@ -4,6 +4,7 @@ This module provide nets for text classification
 import paddle.fluid as fluid

+
 def bow_net(data,
             seq_len,
             label,
...
@@ -43,8 +43,8 @@ class CNN(object):
         left_emb = emb_layer.ops(left)
         right_emb = emb_layer.ops(right)
         # Presentation context
-        cnn_layer = layers.SequenceConvPoolLayer(
-            self.filter_size, self.num_filters, "conv")
+        cnn_layer = layers.SequenceConvPoolLayer(self.filter_size,
+                                                 self.num_filters, "conv")
         left_cnn = cnn_layer.ops(left_emb)
         right_cnn = cnn_layer.ops(right_emb)
         # matching layer
...
@@ -33,6 +33,7 @@ def check_cuda(use_cuda, err = \
     except Exception as e:
         pass

+
 def check_version():
     """
     Log error and exit when the installed version of paddlepaddle is
...
@@ -30,10 +30,14 @@ from models.transformer_encoder import encoder, pre_process_layer
 def ernie_pyreader(args, pyreader_name):
     """define standard ernie pyreader"""
-    src_ids = fluid.data(name='1', shape=[-1, args.max_seq_len, 1], dtype='int64')
-    sent_ids = fluid.data(name='2', shape=[-1, args.max_seq_len, 1], dtype='int64')
-    pos_ids = fluid.data(name='3', shape=[-1, args.max_seq_len, 1], dtype='int64')
-    input_mask = fluid.data(name='4', shape=[-1, args.max_seq_len, 1], dtype='float32')
+    src_ids = fluid.data(
+        name='1', shape=[-1, args.max_seq_len, 1], dtype='int64')
+    sent_ids = fluid.data(
+        name='2', shape=[-1, args.max_seq_len, 1], dtype='int64')
+    pos_ids = fluid.data(
+        name='3', shape=[-1, args.max_seq_len, 1], dtype='int64')
+    input_mask = fluid.data(
+        name='4', shape=[-1, args.max_seq_len, 1], dtype='float32')
     labels = fluid.data(name='5', shape=[-1, 1], dtype='int64')
     seq_lens = fluid.data(name='6', shape=[-1], dtype='int64')
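`ernie_pyreader` declares its feeds with `fluid.data`, which (unlike the older `fluid.layers.data`) validates the fed shape and dtype, with -1 marking the variable batch dimension. A minimal sketch of declaring and feeding such a placeholder; the fixed `max_seq_len` and the toy head are assumptions for the sketch:

```python
import numpy as np
import paddle.fluid as fluid

max_seq_len = 4  # stand-in; the script reads args.max_seq_len
src_ids = fluid.data(
    name='src_ids', shape=[-1, max_seq_len, 1], dtype='int64')
# Any fed array must match [batch, max_seq_len, 1] and be int64.
feats = fluid.layers.cast(src_ids, 'float32')
probs = fluid.layers.softmax(fluid.layers.fc(input=feats, size=2))

exe = fluid.Executor(fluid.CPUPlace())
exe.run(fluid.default_startup_program())
out, = exe.run(
    feed={'src_ids': np.zeros([3, max_seq_len, 1], dtype='int64')},
    fetch_list=[probs])
print(out.shape)  # (3, 2)
```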
...
@@ -29,6 +29,7 @@ from preprocess.ernie import tokenization
 from preprocess.padding import pad_batch_data
 import io

+
 def csv_reader(fd, delimiter='\t'):
     def gen():
         for i in fd:
@@ -37,8 +38,10 @@ def csv_reader(fd, delimiter='\t'):
                 yield slots,
             else:
                 yield slots
+
     return gen()

+
 class BaseReader(object):
     """BaseReader for classify and sequence labeling task"""
...
@@ -23,6 +23,7 @@ import unicodedata
 import six
 import io

+
 def convert_to_unicode(text):
     """Converts `text` to Unicode (if it's not already), assuming utf-8 input."""
     if six.PY3:
...
@@ -30,7 +30,7 @@ if sys.getdefaultencoding() != defaultencoding:
     reload(sys)
     sys.setdefaultencoding(defaultencoding)

-sys.path.append("..")
+sys.path.append("../shared_modules/")

 import paddle
 import paddle.fluid as fluid
@@ -47,14 +47,14 @@ from models.model_check import check_version
 from models.model_check import check_cuda

-def create_model(args, pyreader_name, is_inference = False, is_pointwise = False):
+def create_model(args, pyreader_name, is_inference=False, is_pointwise=False):
     """
     Create Model for simnet
     """
     if is_inference:
         inf_pyreader = fluid.layers.py_reader(
             capacity=16,
-            shapes=([-1,1], [-1,1]),
+            shapes=([-1], [-1]),
             dtypes=('int64', 'int64'),
             lod_levels=(1, 1),
             name=pyreader_name,
@@ -67,7 +67,7 @@ def create_model(args, pyreader_name, is_inference = False, is_pointwise = False
     if is_pointwise:
         pointwise_pyreader = fluid.layers.py_reader(
             capacity=16,
-            shapes=([-1,1], [-1,1], [-1,1]),
+            shapes=([-1], [-1], [-1]),
             dtypes=('int64', 'int64', 'int64'),
             lod_levels=(1, 1, 0),
             name=pyreader_name,
@@ -79,15 +79,17 @@ def create_model(args, pyreader_name, is_inference = False, is_pointwise = False
     else:
         pairwise_pyreader = fluid.layers.py_reader(
             capacity=16,
-            shapes=([-1,1], [-1,1], [-1,1]),
+            shapes=([-1], [-1], [-1]),
             dtypes=('int64', 'int64', 'int64'),
             lod_levels=(1, 1, 1),
             name=pyreader_name,
             use_double_buffer=False)
-        left, pos_right, neg_right = fluid.layers.read_file(pairwise_pyreader)
+        left, pos_right, neg_right = fluid.layers.read_file(
+            pairwise_pyreader)
         return pairwise_pyreader, left, pos_right, neg_right
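The shape change from `[-1, 1]` to `[-1]` in these hunks reflects that each field is a variable-length LoDTensor of token ids (`lod_levels=1`), so the static shape describes a flattened id stream rather than fixed-size rows. A minimal sketch of the pairwise reader under that reading; the toy sample generator is invented for illustration:

```python
import paddle.fluid as fluid

# Query, positive and negative candidate: three variable-length id
# sequences per sample, hence lod_level 1 and static shape [-1] each.
pairwise_pyreader = fluid.layers.py_reader(
    capacity=16,
    shapes=([-1], [-1], [-1]),
    dtypes=('int64', 'int64', 'int64'),
    lod_levels=(1, 1, 1),
    name='sketch_reader',
    use_double_buffer=False)
left, pos_right, neg_right = fluid.layers.read_file(pairwise_pyreader)

def toy_samples():
    # Each sample: three id lists of differing lengths.
    yield [1, 2, 3], [4, 5], [6]

pairwise_pyreader.decorate_paddle_reader(
    fluid.io.batch(toy_samples, batch_size=1))
```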
+
 def train(conf_dict, args):
     """
     train processic
@@ -97,16 +99,16 @@ def train(conf_dict, args):
     # get vocab size
     conf_dict['dict_size'] = len(vocab)
     # Load network structure dynamically
-    net = utils.import_class("../models/matching",
+    net = utils.import_class("../shared_modules/models/matching",
                              conf_dict["net"]["module_name"],
                              conf_dict["net"]["class_name"])(conf_dict)
     # Load loss function dynamically
-    loss = utils.import_class("../models/matching/losses",
+    loss = utils.import_class("../shared_modules/models/matching/losses",
                               conf_dict["loss"]["module_name"],
                               conf_dict["loss"]["class_name"])(conf_dict)
     # Load Optimization method
     optimizer = utils.import_class(
-        "../models/matching/optimizers", "paddle_optimizers",
+        "../shared_modules/models/matching/optimizers", "paddle_optimizers",
         conf_dict["optimizer"]["class_name"])(conf_dict)
     # load auc method
     metric = fluid.metrics.Auc(name="auc")
@@ -131,8 +133,7 @@ def train(conf_dict, args):
     with fluid.program_guard(train_program, startup_prog):
         with fluid.unique_name.guard():
             train_pyreader, left, pos_right, neg_right = create_model(
-                args,
-                pyreader_name='train_reader')
+                args, pyreader_name='train_reader')
             left_feat, pos_score = net.predict(left, pos_right)
             pred = pos_score
             _, neg_score = net.predict(left, neg_right)
@@ -141,12 +142,14 @@ def train(conf_dict, args):
             optimizer.ops(avg_cost)
         # Get Reader
-        get_train_examples = simnet_process.get_reader("train",epoch=args.epoch)
+        get_train_examples = simnet_process.get_reader(
+            "train", epoch=args.epoch)
         if args.do_valid:
             test_prog = fluid.Program()
             with fluid.program_guard(test_prog, startup_prog):
                 with fluid.unique_name.guard():
-                    test_pyreader, left, pos_right= create_model(args, pyreader_name = 'test_reader',is_inference=True)
+                    test_pyreader, left, pos_right = create_model(
+                        args, pyreader_name='test_reader', is_inference=True)
                     left_feat, pos_score = net.predict(left, pos_right)
                     pred = pos_score
                     test_prog = test_prog.clone(for_test=True)
@@ -156,40 +159,41 @@ def train(conf_dict, args):
         with fluid.program_guard(train_program, startup_prog):
             with fluid.unique_name.guard():
                 train_pyreader, left, right, label = create_model(
-                    args,
-                    pyreader_name='train_reader',
-                    is_pointwise=True)
+                    args, pyreader_name='train_reader', is_pointwise=True)
                 left_feat, pred = net.predict(left, right)
                 avg_cost = loss.compute(pred, label)
                 avg_cost.persistable = True
                 optimizer.ops(avg_cost)
         # Get Feeder and Reader
-        get_train_examples = simnet_process.get_reader("train",epoch=args.epoch)
+        get_train_examples = simnet_process.get_reader(
+            "train", epoch=args.epoch)
         if args.do_valid:
             test_prog = fluid.Program()
             with fluid.program_guard(test_prog, startup_prog):
                 with fluid.unique_name.guard():
-                    test_pyreader, left, right= create_model(args, pyreader_name = 'test_reader',is_inference=True)
+                    test_pyreader, left, right = create_model(
+                        args, pyreader_name='test_reader', is_inference=True)
                     left_feat, pred = net.predict(left, right)
                     test_prog = test_prog.clone(for_test=True)

     if args.init_checkpoint is not "":
-        utils.init_checkpoint(exe, args.init_checkpoint,
-                              startup_prog)
+        utils.init_checkpoint(exe, args.init_checkpoint, startup_prog)

-    def valid_and_test(test_program, test_pyreader, get_valid_examples, process, mode, exe, fetch_list):
+    def valid_and_test(test_program, test_pyreader, get_valid_examples, process,
+                       mode, exe, fetch_list):
         """
         return auc and acc
         """
         # Get Batch Data
-        batch_data = fluid.io.batch(get_valid_examples, args.batch_size, drop_last=False)
+        batch_data = fluid.io.batch(
+            get_valid_examples, args.batch_size, drop_last=False)
         test_pyreader.decorate_paddle_reader(batch_data)
         test_pyreader.start()
         pred_list = []
         while True:
             try:
-                _pred = exe.run(program=test_program,fetch_list=[pred.name])
+                _pred = exe.run(program=test_program, fetch_list=[pred.name])
                 pred_list += list(_pred)
             except fluid.core.EOFException:
                 test_pyreader.reset()
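`valid_and_test` feeds the pyreader through `fluid.io.batch`, which groups consecutive samples into lists of `batch_size` and, with `drop_last=False`, keeps the shorter trailing batch. A tiny illustration:

```python
import paddle.fluid as fluid

def examples():
    for i in range(5):
        yield (i,)

batched = fluid.io.batch(examples, batch_size=2, drop_last=False)
print(list(batched()))  # [[(0,), (1,)], [(2,), (3,)], [(4,)]]
```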
@@ -222,7 +226,8 @@ def train(conf_dict, args):
     #for epoch_id in range(args.epoch):
     # used for continuous evaluation
     if args.enable_ce:
-        train_batch_data = fluid.io.batch(get_train_examples, args.batch_size, drop_last=False)
+        train_batch_data = fluid.io.batch(
+            get_train_examples, args.batch_size, drop_last=False)
     else:
         train_batch_data = fluid.io.batch(
             fluid.io.shuffle(
@@ -238,19 +243,23 @@ def train(conf_dict, args):
         try:
             global_step += 1
             fetch_list = [avg_cost.name]
-            avg_loss = train_exe.run(program=train_program, fetch_list = fetch_list)
+            avg_loss = train_exe.run(program=train_program,
+                                     fetch_list=fetch_list)
             losses.append(np.mean(avg_loss[0]))
             if args.do_valid and global_step % args.validation_steps == 0:
                 get_valid_examples = simnet_process.get_reader("valid")
-                valid_result = valid_and_test(test_prog,test_pyreader,get_valid_examples,simnet_process,"valid",exe,[pred.name])
+                valid_result = valid_and_test(
+                    test_prog, test_pyreader, get_valid_examples,
+                    simnet_process, "valid", exe, [pred.name])
                 if args.compute_accuracy:
                     valid_auc, valid_acc = valid_result
                     logging.info(
-                        "global_steps: %d, valid_auc: %f, valid_acc: %f, valid_loss: %f" %
-                        (global_step, valid_auc, valid_acc, np.mean(losses)))
+                        "global_steps: %d, valid_auc: %f, valid_acc: %f, valid_loss: %f"
+                        % (global_step, valid_auc, valid_acc, np.mean(losses)))
                 else:
                     valid_auc = valid_result
-                    logging.info("global_steps: %d, valid_auc: %f, valid_loss: %f" %
-                                 (global_step, valid_auc, np.mean(losses)))
+                    logging.info(
+                        "global_steps: %d, valid_auc: %f, valid_loss: %f" %
+                        (global_step, valid_auc, np.mean(losses)))
             if global_step % args.save_steps == 0:
                 model_save_dir = os.path.join(args.output_dir,
@@ -269,8 +278,7 @@ def train(conf_dict, args):
                 ]
                 target_vars = [left_feat, pred]
                 fluid.io.save_inference_model(model_path, feed_var_names,
-                                              target_vars, exe,
-                                              test_prog)
+                                              target_vars, exe, test_prog)
                 logging.info("saving infer model in %s" % model_path)
         except fluid.core.EOFException:
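`fluid.io.save_inference_model` prunes the program down to the named feeds and fetches and writes it alongside the parameters, so a serving process can reload everything without the training script. A minimal round trip with toy names (not the script's own variables):

```python
import paddle.fluid as fluid

main_prog, startup_prog = fluid.Program(), fluid.Program()
with fluid.program_guard(main_prog, startup_prog):
    left = fluid.data(name='left', shape=[-1, 4], dtype='float32')
    pred = fluid.layers.fc(input=left, size=1, act='sigmoid')

exe = fluid.Executor(fluid.CPUPlace())
exe.run(startup_prog)

# Prune to the feed/fetch interface and serialize program + parameters.
fluid.io.save_inference_model('infer_model', ['left'], [pred], exe,
                              main_program=main_prog)

# Later, e.g. in a serving process: reload everything in one call.
infer_prog, feed_names, fetch_targets = fluid.io.load_inference_model(
    'infer_model', exe)
```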
@@ -282,8 +290,7 @@ def train(conf_dict, args):
     ce_info.append([np.mean(losses), end_time - start_time])
     #final save
     logging.info("the final step is %s" % global_step)
-    model_save_dir = os.path.join(args.output_dir,
-                                  conf_dict["model_path"])
+    model_save_dir = os.path.join(args.output_dir, conf_dict["model_path"])
     model_path = os.path.join(model_save_dir, str(global_step))
     if not os.path.exists(model_save_dir):
         os.makedirs(model_save_dir)
@@ -296,8 +303,7 @@ def train(conf_dict, args):
         right.name,
     ]
     target_vars = [left_feat, pred]
-    fluid.io.save_inference_model(model_path, feed_var_names,
-                                  target_vars, exe,
-                                  test_prog)
+    fluid.io.save_inference_model(model_path, feed_var_names, target_vars, exe,
+                                  test_prog)
     logging.info("saving infer model in %s" % model_path)
     # used for continuous evaluation
@@ -322,7 +328,9 @@ def train(conf_dict, args):
     else:
         # Get Feeder and Reader
         get_test_examples = simnet_process.get_reader("test")
-        test_result = valid_and_test(test_prog,test_pyreader,get_test_examples,simnet_process,"test",exe,[pred.name])
+        test_result = valid_and_test(test_prog, test_pyreader,
+                                     get_test_examples, simnet_process, "test",
+                                     exe, [pred.name])
         if args.compute_accuracy:
             test_auc, test_acc = test_result
             logging.info("AUC of test is %f, Accuracy of test is %f" %
@@ -348,12 +356,13 @@ def test(conf_dict, args):
     startup_prog = fluid.Program()

     get_test_examples = simnet_process.get_reader("test")
-    batch_data = fluid.io.batch(get_test_examples, args.batch_size, drop_last=False)
+    batch_data = fluid.io.batch(
+        get_test_examples, args.batch_size, drop_last=False)
     test_prog = fluid.Program()

     conf_dict['dict_size'] = len(vocab)

-    net = utils.import_class("../models/matching",
+    net = utils.import_class("../shared_modules/models/matching",
                              conf_dict["net"]["module_name"],
                              conf_dict["net"]["class_name"])(conf_dict)
@@ -364,9 +373,7 @@ def test(conf_dict, args):
     with fluid.program_guard(test_prog, startup_prog):
         with fluid.unique_name.guard():
             test_pyreader, left, pos_right = create_model(
-                args,
-                pyreader_name = 'test_reader',
-                is_inference=True)
+                args, pyreader_name='test_reader', is_inference=True)
             left_feat, pos_score = net.predict(left, pos_right)
             pred = pos_score
             test_prog = test_prog.clone(for_test=True)
@@ -375,18 +382,13 @@ def test(conf_dict, args):
     with fluid.program_guard(test_prog, startup_prog):
         with fluid.unique_name.guard():
             test_pyreader, left, right = create_model(
-                args,
-                pyreader_name = 'test_reader',
-                is_inference=True)
+                args, pyreader_name='test_reader', is_inference=True)
             left_feat, pred = net.predict(left, right)
             test_prog = test_prog.clone(for_test=True)
     exe.run(startup_prog)

-    utils.init_checkpoint(
-        exe,
-        args.init_checkpoint,
-        main_program=test_prog)
+    utils.init_checkpoint(exe, args.init_checkpoint, main_program=test_prog)

     test_exe = exe
     test_pyreader.decorate_paddle_reader(batch_data)
@@ -398,15 +400,18 @@ def test(conf_dict, args):
     output = []
     while True:
         try:
-            output = test_exe.run(program=test_prog,fetch_list=fetch_list)
+            output = test_exe.run(program=test_prog, fetch_list=fetch_list)
             if args.task_mode == "pairwise":
-                pred_list += list(map(lambda item: float(item[0]), output[0]))
-                predictions_file.write(u"\n".join(
-                    map(lambda item: str((item[0] + 1) / 2), output[0])) + "\n")
+                pred_list += list(
+                    map(lambda item: float(item[0]), output[0]))
+                predictions_file.write(u"\n".join(
+                    map(lambda item: str((item[0] + 1) / 2), output[0])) +
+                                       "\n")
             else:
                 pred_list += map(lambda item: item, output[0])
-                predictions_file.write(u"\n".join(
-                    map(lambda item: str(np.argmax(item)), output[0])) + "\n")
+                predictions_file.write(u"\n".join(
+                    map(lambda item: str(np.argmax(item)), output[0])) +
+                                       "\n")
         except fluid.core.EOFException:
             test_pyreader.reset()
             break
@@ -450,36 +455,36 @@ def infer(conf_dict, args):
     startup_prog = fluid.Program()

     get_infer_examples = simnet_process.get_infer_reader
-    batch_data = fluid.io.batch(get_infer_examples, args.batch_size, drop_last=False)
+    batch_data = fluid.io.batch(
+        get_infer_examples, args.batch_size, drop_last=False)
     test_prog = fluid.Program()

     conf_dict['dict_size'] = len(vocab)

-    net = utils.import_class("../models/matching",
+    net = utils.import_class("../shared_modules/models/matching",
                              conf_dict["net"]["module_name"],
                              conf_dict["net"]["class_name"])(conf_dict)

     if args.task_mode == "pairwise":
         with fluid.program_guard(test_prog, startup_prog):
             with fluid.unique_name.guard():
-                infer_pyreader, left, pos_right = create_model(args, pyreader_name = 'infer_reader', is_inference = True)
+                infer_pyreader, left, pos_right = create_model(
+                    args, pyreader_name='infer_reader', is_inference=True)
                 left_feat, pos_score = net.predict(left, pos_right)
                 pred = pos_score
                 test_prog = test_prog.clone(for_test=True)
     else:
         with fluid.program_guard(test_prog, startup_prog):
             with fluid.unique_name.guard():
-                infer_pyreader, left, right = create_model(args, pyreader_name = 'infer_reader', is_inference = True)
+                infer_pyreader, left, right = create_model(
+                    args, pyreader_name='infer_reader', is_inference=True)
                 left_feat, pred = net.predict(left, right)
                 test_prog = test_prog.clone(for_test=True)
     exe.run(startup_prog)

-    utils.init_checkpoint(
-        exe,
-        args.init_checkpoint,
-        main_program=test_prog)
+    utils.init_checkpoint(exe, args.init_checkpoint, main_program=test_prog)

     test_exe = exe
     infer_pyreader.decorate_sample_list_generator(batch_data)
@@ -491,7 +496,7 @@ def infer(conf_dict, args):
     infer_pyreader.start()
     while True:
         try:
-            output = test_exe.run(program=test_prog,fetch_list=fetch_list)
+            output = test_exe.run(program=test_prog, fetch_list=fetch_list)
             if args.task_mode == "pairwise":
                 preds_list += list(
                     map(lambda item: str((item[0] + 1) / 2), output[0]))
@@ -514,6 +519,7 @@ def get_cards():
     num = len(cards.split(","))
     return num

+
 if __name__ == "__main__":

     args = ArgConfig()
...
@@ -149,7 +149,7 @@ PaddlePaddle provides a rich set of computational units, so users can build their own models in a modular way.

 [**PaddleNLP**](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP) is an open-source collection of natural language processing (NLP) tools, algorithms, models, and data built on the PaddlePaddle deep learning framework. Baidu's decade-plus accumulation in NLP provides its core strength. With PaddleNLP you get:

 - **Rich and comprehensive NLP task support:**
-  - PaddleNLP provides multi-granularity, multi-scenario application support, ranging from NLP fundamentals such as [word segmentation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), [part-of-speech tagging](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), and [named entity recognition](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis) to core NLP technologies such as [text classification](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/sentiment_classification), [text similarity](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/similarity_net), [semantic representation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK), and [text generation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleTextGEN). It also ships the specific core techniques, tool components, models, and pretrained parameters for common large-scale NLP application systems such as [reading comprehension](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMRC), [dialogue systems](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleDialogue), and [machine translation systems](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMT), clearing your path through NLP.
+  - PaddleNLP provides multi-granularity, multi-scenario application support, ranging from NLP fundamentals such as [word segmentation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), [part-of-speech tagging](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), and [named entity recognition](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis) to core NLP technologies such as [text classification](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/sentiment_classification), [text similarity](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/similarity_net), [semantic representation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/pretrain_langauge_models), and [text generation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/seq2seq). It also ships the specific core techniques, tool components, models, and pretrained parameters for common large-scale NLP application systems such as [reading comprehension](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/machine_reading_comprehension), [dialogue systems](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/dialogue_system), and [machine translation systems](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/machine_translation), clearing your path through NLP.

 - **Stable, reliable NLP models and powerful pretrained parameters:**
   - PaddleNLP integrates the NLP tool models widely used inside Baidu, giving you stable and reliable NLP algorithm solutions. Pretrained parameters learned from tens of billions of data samples and a rich family of pretrained models help you raise model quality with little effort and power your NLP business.
 - **Continuous improvement and technical support, so you can build NLP applications from scratch:**
@@ -167,14 +167,14 @@ PaddlePaddle provides a rich set of computational units, so users can build their own models in a modular way.

 #### Semantic Representation

-[PaddleLARK](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK) (Paddle LAngauge Representation ToolKit) carries traditional language models a step further: general-purpose semantic representation models trained on large corpora that benefit other NLP tasks, embodying the "general pretraining + task-specific fine-tuning" paradigm. PaddleLARK integrates popular Chinese and English pretrained models such as ELMo, BERT, ERNIE 1.0, ERNIE 2.0, and XLNet.
+[pretrain_langauge_models](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/pretrain_langauge_models) (Paddle LAngauge Representation ToolKit) carries traditional language models a step further: general-purpose semantic representation models trained on large corpora that benefit other NLP tasks, embodying the "general pretraining + task-specific fine-tuning" paradigm. pretrain_langauge_models integrates popular Chinese and English pretrained models such as ELMo, BERT, ERNIE 1.0, ERNIE 2.0, and XLNet.

 | Model | Description |
 | ------------------------------------------------------------ | ------------------------------------------------------------ |
 | [ERNIE](https://github.com/PaddlePaddle/ERNIE) (Enhanced Representation from kNowledge IntEgration) | Baidu's own semantic representation model. It learns real-world semantic knowledge by modeling the words, entities, and entity relations in massive data. Where BERT learns from the raw language signal, ERNIE models prior semantic knowledge units directly, strengthening its semantic representation ability. |
-| [BERT](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK/BERT) (Bidirectional Encoder Representation from Transformers) | A highly transferable general semantic representation model built on the Transformer, trained with bidirectional Masked Language Model and Next Sentence Prediction objectives. The pretrained representation, combined with a simple output layer, transfers to downstream NLP tasks and achieves SOTA results on many of them. |
+| [BERT](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/pretrain_langauge_models/BERT) (Bidirectional Encoder Representation from Transformers) | A highly transferable general semantic representation model built on the Transformer, trained with bidirectional Masked Language Model and Next Sentence Prediction objectives. The pretrained representation, combined with a simple output layer, transfers to downstream NLP tasks and achieves SOTA results on many of them. |
-| [XLNet](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK/XLNet) (XLNet: Generalized Autoregressive Pretraining for Language Understanding) | One of the important semantic representation models. Built on a Transformer-XL backbone and optimized with a Permutation Language Modeling objective, it outperforms BERT on several downstream tasks. |
+| [XLNet](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/pretrain_langauge_models/XLNet) (XLNet: Generalized Autoregressive Pretraining for Language Understanding) | One of the important semantic representation models. Built on a Transformer-XL backbone and optimized with a Permutation Language Modeling objective, it outperforms BERT on several downstream tasks. |
-| [ELMo](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK/ELMo) (Embeddings from Language Models) | One of the important general semantic representation models. Built from bidirectional LSTMs with a language-model training objective, its pretrained representation, transferred as features to downstream NLP tasks, significantly improves their performance. |
+| [ELMo](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/pretrain_langauge_models/ELMo) (Embeddings from Language Models) | One of the important general semantic representation models. Built from bidirectional LSTMs with a language-model training objective, its pretrained representation, transferred as features to downstream NLP tasks, significantly improves their performance. |

 #### Text Similarity
@@ -182,7 +182,7 @@ PaddlePaddle provides a rich set of computational units, so users can build their own models in a modular way.

 #### Text Generation

-[PaddleTextGEN](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleTextGEN) (Paddle Text Generation), a PaddlePaddle-based text generation framework that provides a series of classic text generation models, such as vanilla seq2seq, seq2seq with attention, and variational seq2seq.
+[seq2seq](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/seq2seq) (Paddle Text Generation), a PaddlePaddle-based text generation framework that provides a series of classic text generation models, such as vanilla seq2seq, seq2seq with attention, and variational seq2seq.

 ### NLP System Applications
@@ -195,7 +195,7 @@ PaddlePaddle provides a rich set of computational units, so users can build their own models in a modular way.

 #### Reading Comprehension

-[PaddleMRC](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMRC) (Paddle Machine Reading Comprehension) collects Baidu's work on reading comprehension: models, tools, open datasets, and more.
+[machine_reading_comprehension](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/machine_reading_comprehension) (Paddle Machine Reading Comprehension) collects Baidu's work on reading comprehension: models, tools, open datasets, and more.

 | Model | Description |
 | ------------------------------------------------------------ | ------------------------------------------------------------ |
@@ -205,16 +205,16 @@ PaddlePaddle provides a rich set of computational units, so users can build their own models in a modular way.

 #### Machine Translation

-[PaddleMT](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMT), short for Paddle Machine Translation: the classic Transformer-based machine translation model from the paper [Attention Is All You Need](https://arxiv.org/abs/1706.03762).
+[machine_translation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/machine_translation), short for Paddle Machine Translation: the classic Transformer-based machine translation model from the paper [Attention Is All You Need](https://arxiv.org/abs/1706.03762).

 #### Dialogue Systems

-[PaddleDialogue](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleDialogue) contains models, datasets, and tools for dialogue systems.
+[dialogue_system](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/dialogue_system) contains models, datasets, and tools for dialogue systems.

 | Model | Description |
 | ------------------------------------------------------------ | ------------------------------------------------------------ |
-| [DGU](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleDialogue/dialogue_general_understanding) (Dialogue General Understanding) | Covers common dialogue-system tasks, including context-response matching in **retrieval-based chat systems** and **intent recognition**, **slot parsing**, and **state tracking** in **task-oriented dialogue systems**, achieving the best results on 6 public international datasets. |
+| [DGU](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/dialogue_system/dialogue_general_understanding) (Dialogue General Understanding) | Covers common dialogue-system tasks, including context-response matching in **retrieval-based chat systems** and **intent recognition**, **slot parsing**, and **state tracking** in **task-oriented dialogue systems**, achieving the best results on 6 public international datasets. |
-| [ADEM](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleDialogue/auto_dialogue_evaluation) (Auto Dialogue Evaluation Model) | Scores the response quality of open-domain dialogue systems, helping companies and individuals evaluate them quickly and cut manual evaluation costs. |
+| [ADEM](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/dialogue_system/auto_dialogue_evaluation) (Auto Dialogue Evaluation Model) | Scores the response quality of open-domain dialogue systems, helping companies and individuals evaluate them quickly and cut manual evaluation costs. |
 | [Proactive Conversation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/Research/ACL2019-DuConv) | Contains [DuConv](https://ai.baidu.com/broad/subordinate?dataset=duconv), Baidu's open-source knowledge-driven open-domain dialogue dataset, together with baseline models. The companion paper, [Proactive Human-Machine Conversation with Explicit Conversation Goals](https://arxiv.org/abs/1906.05572), was published at ACL 2019. |
 | [DAM](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/Research/ACL2018-DAM) (Deep Attention Matching Network) | An open-domain multi-turn response matching model. The companion paper, [Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network](https://aclweb.org/anthology/P18-1103/), was published at ACL 2018. |
...