Unverified commit 803dab78, authored by pkpk, committed by GitHub

test=develop (#4389)

Parent 9e12ab90
Subproject commit 5426f75073cf5bd416622dbe71b146d3dc8fffb6
Subproject commit 30b892e3c029bff706337f269e6c158b0a223f60
@@ -10,7 +10,7 @@
- **Rich and comprehensive NLP task support:**
-  - PaddleNLP provides multi-granularity, multi-scenario application support, covering NLP fundamentals such as [word segmentation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), [part-of-speech tagging](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis) and [named entity recognition](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), through core NLP technologies such as [text classification](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/sentiment_classification), [text similarity](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/similarity_net), [semantic representation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK) and [text generation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleTextGEN). PaddleNLP also provides the core techniques, tool components, models and pretrained parameters for common large-scale NLP application systems, such as [reading comprehension](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMRC), [dialogue systems](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleDialogue) and [machine translation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMT), so nothing stands in your way in NLP.
+  - PaddleNLP provides multi-granularity, multi-scenario application support, covering NLP fundamentals such as [word segmentation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), [part-of-speech tagging](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis) and [named entity recognition](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), through core NLP technologies such as [text classification](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/sentiment_classification), [text similarity](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/similarity_net), [semantic representation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/pretrain_langauge_models) and [text generation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/seq2seq). PaddleNLP also provides the core techniques, tool components, models and pretrained parameters for common large-scale NLP application systems, such as [reading comprehension](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/machine_reading_comprehension), [dialogue systems](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/dialogue_system) and [machine translation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/machine_translation), so nothing stands in your way in NLP.
- **Stable, reliable NLP models and strong pretrained parameters:**
@@ -55,11 +55,11 @@ cd models/PaddleNLP/sentiment_classification
| **Language model** | [Language_model](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/language_model) | A classic neural language model based on recurrent neural networks (RNN). |
| **Sentiment classification** :fire: | [Senta](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/sentiment_classification), [EmotionDetection](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/emotion_detection) | The Senta (Sentiment Classification) and EmotionDetection projects provide sentiment-analysis models for *general scenarios* and for *human-machine dialogue scenarios*, respectively. |
| **Text similarity** :fire: | [SimNet](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/similarity_net) | SimNet (Similarity Net) provides efficient, reliable text-similarity tools and pretrained models. |
-| **Semantic representation** :fire: | [PaddleLARK](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK) | PaddleLARK (Paddle LAngauge Representation Toolkit) integrates popular Chinese and English pretrained models such as ELMO, BERT, ERNIE 1.0, ERNIE 2.0 and XLNet. |
+| **Semantic representation** :fire: | [pretrain_langauge_models](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/pretrain_langauge_models) | Integrates popular Chinese and English pretrained models such as ELMO, BERT, ERNIE 1.0, ERNIE 2.0 and XLNet. |
-| **Text generation** | [PaddleTextGEN](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleTextGEN) | Paddle Text Generation provides a series of classic text-generation models, such as vanilla seq2seq, seq2seq with attention and variational seq2seq. |
+| **Text generation** | [seq2seq](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleTextGEN) | seq2seq provides a series of classic text-generation models, such as vanilla seq2seq, seq2seq with attention and variational seq2seq. |
-| **Reading comprehension** | [PaddleMRC](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMRC) | PaddleMRC (Paddle Machine Reading Comprehension) collects Baidu's models, tools and open-source data for reading comprehension, including DuReader (Baidu's open-source large-scale Chinese reading-comprehension dataset built from real search-user behavior), KT-Net (a knowledge-enhanced reading-comprehension model that once ranked first on SQuAD and ReCoRD) and D-Net (a pretrain-finetune framework that took first place in the EMNLP 2019 MRQA reading-comprehension evaluation), among others. |
+| **Reading comprehension** | [machine_reading_comprehension](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/machine_reading_comprehension) | Paddle Machine Reading Comprehension collects Baidu's models, tools and open-source data for reading comprehension, including DuReader (Baidu's open-source large-scale Chinese reading-comprehension dataset built from real search-user behavior), KT-Net (a knowledge-enhanced reading-comprehension model that once ranked first on SQuAD and ReCoRD) and D-Net (a pretrain-finetune framework that took first place in the EMNLP 2019 MRQA reading-comprehension evaluation), among others. |
-| **Dialogue systems** | [PaddleDialogue](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleDialogue) | Includes: 1) DGU (Dialogue General Understanding), covering common dialogue-system tasks such as context-response matching in **retrieval-based chatbots** and **intent recognition**, **slot filling** and **dialogue state tracking** in **task-oriented dialogue systems**, with best results on 6 public international datasets. <br/> 2) knowledge-driven dialogue: Baidu's open-source knowledge-driven open-domain dialogue dataset, published at ACL 2019. <br/> 3) ADEM (Auto Dialogue Evaluation Model): an automatic dialogue-evaluation model for scoring the response quality of different dialogue-generation models. |
+| **Dialogue systems** | [dialogue_system](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/dialogue_system) | Includes: 1) DGU (Dialogue General Understanding), covering common dialogue-system tasks such as context-response matching in **retrieval-based chatbots** and **intent recognition**, **slot filling** and **dialogue state tracking** in **task-oriented dialogue systems**, with best results on 6 public international datasets. <br/> 2) knowledge-driven dialogue: Baidu's open-source knowledge-driven open-domain dialogue dataset, published at ACL 2019. <br/> 3) ADEM (Auto Dialogue Evaluation Model): an automatic dialogue-evaluation model for scoring the response quality of different dialogue-generation models. |
-| **Machine translation** | [PaddleMT](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMT) | Short for Paddle Machine Translation: a classic Transformer-based machine-translation model. |
+| **Machine translation** | [machine_translation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/machine_translation) | Short for Paddle Machine Translation: a classic Transformer-based machine-translation model. |
| **Other frontier work** | [Research](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/Research) | Open-source releases of Baidu's latest research. |
@@ -70,13 +70,13 @@ cd models/PaddleNLP/sentiment_classification
```text
.
├── Research                         # collection of Baidu NLP research work
-├── PaddleMT                         # machine translation code, data and pretrained models
+├── machine_translation              # machine translation code, data and pretrained models
-├── PaddleDialogue                   # dialogue system code, data and pretrained models
+├── dialogue_system                  # dialogue system code, data and pretrained models
-├── PaddleMRC                        # reading comprehension code, data and pretrained models
+├── machcine_reading_comprehension   # reading comprehension code, data and pretrained models
-├── PaddleLARK                       # language representation toolkit
+├── pretrain_langauge_models         # language representation toolkit
├── language_model                   # language model
├── lexical_analysis                 # LAC lexical analysis
-├── models                           # shared networks
+├── shared_modules/models            # shared networks
│   ├── __init__.py
│   ├── classification
│   ├── dialogue_model_toolkit
@@ -87,7 +87,7 @@ cd models/PaddleNLP/sentiment_classification
│   ├── representation
│   ├── sequence_labeling
│   └── transformer_encoder.py
-├── preprocess                       # shared text preprocessing tools
+├── shared_modules/preprocess        # shared text preprocessing tools
│   ├── __init__.py
│   ├── ernie
│   ├── padding.py
...
@@ -16,7 +16,6 @@
# limitations under the License.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
@@ -40,43 +39,55 @@ import math
np.random.seed(0)
random.seed(0)
parser = argparse.ArgumentParser(__doc__)
DEV_COUNT = 1
model_g = ArgumentGroup(parser, "model", "model configuration and paths.")
-model_g.add_arg("init_checkpoint", str, None, "Init checkpoint to resume training from.")
-model_g.add_arg("checkpoints", str, "./checkpoints", "Path to save checkpoints.")
+model_g.add_arg("init_checkpoint", str, None,
+                "Init checkpoint to resume training from.")
+model_g.add_arg("checkpoints", str, "./checkpoints",
+                "Path to save checkpoints.")
model_g.add_arg("config_path", str, "./data/input/model.conf", "Model conf.")
model_g.add_arg("build_dict", bool, False, "Build dict.")
train_g = ArgumentGroup(parser, "training", "training options.")
train_g.add_arg("cpu_num", int, 3, "Number of Threads.")
train_g.add_arg("epoch", int, 100, "Number of epoches for training.")
-train_g.add_arg("learning_rate", float, 0.1, "Learning rate used to train with warmup.")
-train_g.add_arg("save_steps", int, 1000, "The steps interval to save checkpoints.")
-train_g.add_arg("validation_steps", int, 100, "The steps interval to evaluate model performance.")
+train_g.add_arg("learning_rate", float, 0.1,
+                "Learning rate used to train with warmup.")
+train_g.add_arg("save_steps", int, 1000,
+                "The steps interval to save checkpoints.")
+train_g.add_arg("validation_steps", int, 100,
+                "The steps interval to evaluate model performance.")
train_g.add_arg("random_seed", int, 7, "random seed")
-train_g.add_arg("threshold", float, 0.1, "When the confidence exceeds the threshold, the corresponding label is given.")
+train_g.add_arg(
+    "threshold", float, 0.1,
+    "When the confidence exceeds the threshold, the corresponding label is given."
+)
log_g = ArgumentGroup(parser, "logging", "logging related.")
log_g.add_arg("skip_steps", int, 10, "The steps interval to print loss.")
-data_g = ArgumentGroup(parser, "data", "Data paths, vocab paths and data processing options")
+data_g = ArgumentGroup(parser, "data",
+                       "Data paths, vocab paths and data processing options")
data_g.add_arg("data_dir", str, "./data/input/", "Path to training data.")
data_g.add_arg("save_dir", str, "./data/output/", "Path to save.")
-data_g.add_arg("max_seq_len", int, 50, "Tokens' number of the longest seqence allowed.")
-data_g.add_arg("batch_size", int, 64, "The total number of examples in one batch for training.")
+data_g.add_arg("max_seq_len", int, 50,
+               "Tokens' number of the longest seqence allowed.")
+data_g.add_arg("batch_size", int, 64,
+               "The total number of examples in one batch for training.")
run_type_g = ArgumentGroup(parser, "run_type", "running type options.")
run_type_g.add_arg("use_cuda", bool, False, "If set, use GPU for training.")
# run_type_g.add_arg("use_fast_executor", bool, False, "If set, use fast parallel executor (in experiment).")
-run_type_g.add_arg("do_train", bool, True, "Whether to perform evaluation on test data set.")
-run_type_g.add_arg("do_eval", bool, True, "Whether to perform evaluation on test data set.")
-run_type_g.add_arg("do_test", bool, True, "Whether to perform evaluation on test data set.")
+run_type_g.add_arg("do_train", bool, True,
+                   "Whether to perform evaluation on test data set.")
+run_type_g.add_arg("do_eval", bool, True,
+                   "Whether to perform evaluation on test data set.")
+run_type_g.add_arg("do_test", bool, True,
+                   "Whether to perform evaluation on test data set.")
args = parser.parse_args()
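The `threshold` argument above controls multi-label decoding: every intent whose predicted confidence exceeds the threshold is emitted as a label. A minimal sketch of that decision rule in plain Python (the function name is hypothetical, not part of the script):

```python
def labels_over_threshold(probs, threshold=0.1):
    """Return the indices of all labels whose confidence exceeds the threshold."""
    return [i for i, p in enumerate(probs) if p > threshold]

# With the default threshold of 0.1, labels 1 and 3 fire here:
print(labels_over_threshold([0.05, 0.42, 0.09, 0.31]))  # [1, 3]
```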
def get_score(pred_result, label, eval_phase):
    """[get precision recall and f-score]
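`get_score` (its body is elided from this diff) reports precision, recall and F-score for the thresholded predictions. A micro-averaged sketch of those metrics over per-example label sets, as a plain-Python illustration rather than the script's exact implementation:

```python
def micro_prf(pred_sets, gold_sets):
    """Micro-averaged precision/recall/F1 over per-example label sets."""
    tp = sum(len(p & g) for p, g in zip(pred_sets, gold_sets))
    n_pred = sum(len(p) for p in pred_sets)
    n_gold = sum(len(g) for g in gold_sets)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Two examples: one spurious label and one missed label -> P = R = F1 = 2/3.
print(micro_prf([{1, 3}, {2}], [{1}, {2, 4}]))
```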
@@ -139,7 +150,7 @@ def train(args, train_exe, build_res, place):
    pred_label = build_res["pred_label"]
    label = build_res["label"]
    fetch_list = [cost.name, prediction.name, pred_label.name, label.name]
-    train_pyreader = build_res["train_pyreader"]
+    train_data_loader = build_res["train_data_loader"]
    train_prog = build_res["train_prog"]
    steps = 0
    time_begin = time.time()
@@ -147,22 +158,24 @@ def train(args, train_exe, build_res, place):
    logger.info("Begin training")
    for i in range(args.epoch):
        try:
-            for data in train_pyreader():
+            for data in train_data_loader():
                avg_cost_np, avg_pred_np, pred_label, label = train_exe.run(feed=data, program=compiled_prog, \
                                                                            fetch_list=fetch_list)
                steps += 1
                if steps % int(args.skip_steps) == 0:
                    time_end = time.time()
                    used_time = time_end - time_begin
-                    get_score(pred_label, label, eval_phase = "Train")
+                    get_score(pred_label, label, eval_phase="Train")
                    logger.info('loss is {}'.format(avg_cost_np))
-                    logger.info("epoch: %d, step: %d, speed: %f steps/s" % (i, steps, args.skip_steps / used_time))
+                    logger.info("epoch: %d, step: %d, speed: %f steps/s" %
+                                (i, steps, args.skip_steps / used_time))
                    time_begin = time.time()
                if steps % args.save_steps == 0:
                    save_path = os.path.join(args.checkpoints,
                                             "step_" + str(steps))
-                    fluid.io.save_persistables(train_exe, save_path, train_prog)
-                    logger.info("[save]step %d : save at %s" % (steps, save_path))
+                    fluid.io.save(train_prog, save_path)
+                    logger.info("[save]step %d : save at %s" %
+                                (steps, save_path))
                if steps % args.validation_steps == 0:
                    if args.do_eval:
                        evaluate(args, test_exe, build_res, "eval")
@@ -173,11 +186,16 @@ def train(args, train_exe, build_res, place):
            logger.error("Train error : %s" % str(e))
            exit(1)
    save_path = os.path.join(args.checkpoints, "step_" + str(steps))
-    fluid.io.save_persistables(train_exe, save_path, train_prog)
+    fluid.io.save(train_prog, save_path)
    logger.info("[save]step %d : save at %s" % (steps, save_path))
-def evaluate(args, test_exe, build_res, eval_phase, save_result=False, id2intent=None):
+def evaluate(args,
+             test_exe,
+             build_res,
+             eval_phase,
+             save_result=False,
+             id2intent=None):
    """[evaluate on dev/test dataset]

    Arguments:
@@ -203,14 +221,14 @@ def evaluate(args, test_exe, build_res, eval_phase, save_result=False, id2intent
    total_cost, total_acc, pred_prob_list, pred_label_list, label_list = [], [], [], [], []
    if eval_phase == "eval":
        test_prog = build_res["eval_compiled_prog"]
-        test_pyreader = build_res["eval_pyreader"]
+        test_data_loader = build_res["eval_data_loader"]
    elif eval_phase == "test":
        test_prog = build_res["test_compiled_prog"]
-        test_pyreader = build_res["test_pyreader"]
+        test_data_loader = build_res["test_data_loader"]
    else:
        exit(1)
    logger.info("-----------------------------------------------------------")
-    for data in test_pyreader():
+    for data in test_data_loader():
        avg_cost_np, avg_pred_np, pred_label, label = test_exe.run(program=test_prog, fetch_list=fetch_list, feed=data, \
                                                                   return_numpy=True)
        total_cost.append(avg_cost_np)
@@ -219,13 +237,18 @@ def evaluate(args, test_exe, build_res, eval_phase, save_result=False, id2intent
        label_list.extend(label)
    if save_result:
-        logger.info("save result at : %s" % args.save_dir + "/" + eval_phase + ".rst")
+        logger.info("save result at : %s" % args.save_dir + "/" + eval_phase +
+                    ".rst")
        save_dir = args.save_dir
        if not os.path.exists(save_dir):
            logger.warning("save dir not exists, and create it")
            os.makedirs(save_dir)
-        fin = codecs.open(os.path.join(args.data_dir, eval_phase + ".txt"), "r", encoding="utf8")
-        fout = codecs.open(args.save_dir + "/" + eval_phase + ".rst", "w", encoding="utf8")
+        fin = codecs.open(
+            os.path.join(args.data_dir, eval_phase + ".txt"),
+            "r",
+            encoding="utf8")
+        fout = codecs.open(
+            args.save_dir + "/" + eval_phase + ".rst", "w", encoding="utf8")
        for line in pred_prob_list:
            query = fin.readline().rsplit("\t", 1)[0]
            res = []
@@ -245,9 +268,14 @@ def evaluate(args, test_exe, build_res, eval_phase, save_result=False, id2intent
    logger.info("-----------------------------------------------------------")

-def create_net(args, flow_data, class_dim, dict_dim, place, model_name="textcnn_net", is_infer=False):
-    """[create network and pyreader]
+def create_net(args,
+               flow_data,
+               class_dim,
+               dict_dim,
+               place,
+               model_name="textcnn_net",
+               is_infer=False):
+    """[create network and loader]

    Arguments:
        flow_data {[type]} -- [description]
@@ -266,11 +294,23 @@ def create_net(args, flow_data, class_dim, dict_dim, place, model_name="textcnn_
        model = textcnn_net_multi_label
    else:
        return
-    char_list = fluid.data(name="char", shape=[None, args.max_seq_len, 1], dtype="int64", lod_level=0)
-    label = fluid.data(name="label", shape=[None, class_dim], dtype="float32", lod_level=0)  # label data
-    reader = fluid.io.PyReader(feed_list=[char_list, label], capacity=args.batch_size * 10, iterable=True, \
-                               return_list=False)
-    output = model(char_list, label, dict_dim,
+    char_list = fluid.data(
+        name="char",
+        shape=[None, args.max_seq_len, 1],
+        dtype="int64",
+        lod_level=0)
+    label = fluid.data(
+        name="label", shape=[None, class_dim], dtype="float32",
+        lod_level=0)  # label data
+    data_loader = fluid.io.DataLoader.from_generator(
+        feed_list=[char_list, label],
+        capacity=args.batch_size * 10,
+        iterable=True,
+        return_list=False)
+    output = model(
+        char_list,
+        label,
+        dict_dim,
        emb_dim=flow_data["model"]["emb_dim"],
        hid_dim=flow_data["model"]["hid_dim"],
        hid_dim2=flow_data["model"]["hid_dim2"],
@@ -281,14 +321,15 @@ def create_net(args, flow_data, class_dim, dict_dim, place, model_name="textcnn_
        max_seq_len=args.max_seq_len)
    if is_infer:
        prediction = output
-        return [reader, prediction]
+        return [data_loader, prediction]
    else:
-        avg_cost, prediction, pred_label, label = output[0], output[1], output[2], output[3]
-        return [reader, avg_cost, prediction, pred_label, label]
+        avg_cost, prediction, pred_label, label = output[0], output[1], output[
+            2], output[3]
+        return [data_loader, avg_cost, prediction, pred_label, label]
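The switch from `fluid.io.PyReader` to `fluid.io.DataLoader.from_generator` above keeps the same feeding contract: a sample-list generator that yields one list of `[features, label]` samples per batch. A framework-free sketch of a generator with that shape (names hypothetical):

```python
def sample_list_generator(samples, batch_size):
    """Yield lists of (char_ids, label) samples, batch_size at a time --
    the shape a set_sample_list_generator-style loader consumes."""
    batch = []
    for sample in samples:
        batch.append(sample)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # trailing partial batch
        yield batch

batches = list(sample_list_generator([([1, 2], 0), ([3], 1), ([4], 0)], 2))
print([len(b) for b in batches])  # [2, 1]
```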
-def build_data_reader(args, char_dict, intent_dict):
-    """[decorate samples for pyreader]
+def build_data_loader(args, char_dict, intent_dict):
+    """[decorate samples for dataloader]

    Arguments:
        args {[type]} -- [description]
@@ -298,20 +339,22 @@ def build_data_reader(args, char_dict, intent_dict):
    Returns:
        [type] -- [description]
    """
-    reader_res = {}
+    loader_res = {}
    if args.do_train:
        train_processor = DataReader(char_dict, intent_dict, args.max_seq_len)
        train_data_generator = train_processor.prepare_data(
            data_path=args.data_dir + "train.txt",
            batch_size=args.batch_size,
            mode='train')
-        reader_res["train_data_generator"] = train_data_generator
+        loader_res["train_data_generator"] = train_data_generator
        num_train_examples = train_processor._get_num_examples()
        logger.info("Num train examples: %d" % num_train_examples)
        logger.info("Num train steps: %d" % (math.ceil(num_train_examples * 1.0 / args.batch_size) * \
                                             args.epoch // DEV_COUNT))
-        if math.ceil(num_train_examples * 1.0 / args.batch_size) // DEV_COUNT <= 0:
-            logger.error("Num of train steps is less than 0 or equals to 0, exit")
+        if math.ceil(num_train_examples * 1.0 /
+                     args.batch_size) // DEV_COUNT <= 0:
+            logger.error(
+                "Num of train steps is less than 0 or equals to 0, exit")
            exit(1)
    if args.do_eval:
        eval_processor = DataReader(char_dict, intent_dict, args.max_seq_len)
@@ -319,7 +362,7 @@ def build_data_reader(args, char_dict, intent_dict):
            data_path=args.data_dir + "eval.txt",
            batch_size=args.batch_size,
            mode='eval')
-        reader_res["eval_data_generator"] = eval_data_generator
+        loader_res["eval_data_generator"] = eval_data_generator
        num_eval_examples = eval_processor._get_num_examples()
        logger.info("Num eval examples: %d" % num_eval_examples)
    if args.do_test:
@@ -328,11 +371,12 @@ def build_data_reader(args, char_dict, intent_dict):
            data_path=args.data_dir + "test.txt",
            batch_size=args.batch_size,
            mode='test')
-        reader_res["test_data_generator"] = test_data_generator
-    return reader_res
+        loader_res["test_data_generator"] = test_data_generator
+    return loader_res
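The training-step count logged in the function above is `ceil(num_examples / batch_size) * epoch // DEV_COUNT`; a sketch mirroring that expression in isolation:

```python
import math


def num_train_steps(num_examples, batch_size, epoch, dev_count=1):
    """Batches per epoch, times epochs, divided across devices."""
    return int(math.ceil(num_examples * 1.0 / batch_size)) * epoch // dev_count


# With the script's defaults (batch_size=64, epoch=100) and 1000 examples:
print(num_train_steps(1000, 64, 100))  # 1600
```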
-def build_graph(args, model_config, num_labels, dict_dim, place, test_place, reader_res):
+def build_graph(args, model_config, num_labels, dict_dim, place, test_place,
+                loader_res):
    """[build paddle graph]

    Arguments:
@@ -341,7 +385,7 @@ def build_graph(args, model_config, num_labels, dict_dim, place, test_place, rea
        num_labels {[type]} -- [description]
        dict_dim {[type]} -- [description]
        place {[type]} -- [description]
-        reader_res {[type]} -- [description]
+        loader_res {[type]} -- [description]

    Returns:
        [type] -- [description]
@@ -358,36 +402,42 @@ def build_graph(args, model_config, num_labels, dict_dim, place, test_place, rea
    if args.do_train:
        with fluid.program_guard(train_prog, startup_prog):
            with fluid.unique_name.guard():
-                train_pyreader, cost, prediction, pred_label, label = create_net(args, model_config, num_labels, \
+                train_data_loader, cost, prediction, pred_label, label = create_net(args, model_config, num_labels, \
                                                                                 dict_dim, place, model_name="textcnn_net")
-                train_pyreader.decorate_sample_list_generator(reader_res['train_data_generator'], places=place)
-                res["train_pyreader"] = train_pyreader
-                sgd_optimizer = fluid.optimizer.SGD(learning_rate=fluid.layers.exponential_decay(
-                    learning_rate=args.learning_rate, decay_steps=1000, decay_rate=0.5, staircase=True))
+                train_data_loader.set_sample_list_generator(
+                    loader_res['train_data_generator'], places=place)
+                res["train_data_loader"] = train_data_loader
+                sgd_optimizer = fluid.optimizer.SGD(
+                    learning_rate=fluid.layers.exponential_decay(
+                        learning_rate=args.learning_rate,
+                        decay_steps=1000,
+                        decay_rate=0.5,
+                        staircase=True))
                sgd_optimizer.minimize(cost)
    if args.do_eval:
        with fluid.program_guard(eval_prog, startup_prog):
            with fluid.unique_name.guard():
-                eval_pyreader, cost, prediction, pred_label, label = create_net(args, model_config, num_labels, \
+                eval_data_loader, cost, prediction, pred_label, label = create_net(args, model_config, num_labels, \
                                                                                dict_dim, test_place, model_name="textcnn_net")
-                eval_pyreader.decorate_sample_list_generator(reader_res['eval_data_generator'], places=test_place)
-                res["eval_pyreader"] = eval_pyreader
+                eval_data_loader.set_sample_list_generator(
+                    loader_res['eval_data_generator'], places=test_place)
+                res["eval_data_loader"] = eval_data_loader
    if args.do_test:
        with fluid.program_guard(test_prog, startup_prog):
            with fluid.unique_name.guard():
-                test_pyreader, cost, prediction, pred_label, label = create_net(args, model_config, num_labels, \
+                test_data_loader, cost, prediction, pred_label, label = create_net(args, model_config, num_labels, \
                                                                                dict_dim, test_place, model_name="textcnn_net")
-                test_pyreader.decorate_sample_list_generator(reader_res['test_data_generator'], places=test_place)
-                res["test_pyreader"] = test_pyreader
+                test_data_loader.set_sample_list_generator(
+                    loader_res['test_data_generator'], places=test_place)
+                res["test_data_loader"] = test_data_loader
    res["cost"] = cost
    res["prediction"] = prediction
    res["label"] = label
    res["pred_label"] = pred_label
-    res["train_prog"] =train_prog
+    res["train_prog"] = train_prog
    res["eval_prog"] = eval_prog
    res["test_prog"] = test_prog
    return res
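The optimizer built in `build_graph` uses `fluid.layers.exponential_decay` with `staircase=True`, so the learning rate halves every 1000 steps. Its closed form in plain Python (a sketch of the schedule, not Paddle's implementation):

```python
def staircase_decay(base_lr, step, decay_steps=1000, decay_rate=0.5):
    """lr = base_lr * decay_rate ** floor(step / decay_steps)."""
    return base_lr * decay_rate ** (step // decay_steps)


# With learning_rate=0.1: constant within each 1000-step stair, then halved.
print(staircase_decay(0.1, 999))   # 0.1
print(staircase_decay(0.1, 2500))  # 0.025
```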
@@ -421,8 +471,9 @@ def main(args):
        id2intent[int(value)] = key
    num_labels = len(intent_dict)
    # build model
-    reader_res = build_data_reader(args, char_dict, intent_dict)
-    build_res = build_graph(args, model_config, num_labels, dict_dim, place, test_place, reader_res)
+    loader_res = build_data_loader(args, char_dict, intent_dict)
+    build_res = build_graph(args, model_config, num_labels, dict_dim, place,
+                            test_place, loader_res)
    build_res["place"] = place
    build_res["test_place"] = test_place
    if not (args.do_train or args.do_eval or args.do_test):
@@ -432,11 +483,13 @@ def main(args):
    exe.run(startup_prog)
    if args.init_checkpoint and args.init_checkpoint != "None":
        try:
-            init_checkpoint(exe, args.init_checkpoint, main_program=startup_prog)
+            init_checkpoint(
+                exe, args.init_checkpoint, main_program=startup_prog)
            logger.info("Load model from %s" % args.init_checkpoint)
        except Exception as e:
            logger.exception(str(e))
-            logger.error("Faild load model from %s [%s]" % (args.init_checkpoint, str(e)))
+            logger.error("Faild load model from %s [%s]" %
+                         (args.init_checkpoint, str(e)))
    build_strategy = fluid.compiler.BuildStrategy()
    build_strategy.fuse_all_reduce_ops = False
    exec_strategy = fluid.ExecutionStrategy()
@@ -449,10 +502,12 @@ def main(args):
                                              exec_strategy=exec_strategy)
        build_res["compiled_prog"] = compiled_prog
    if args.do_test:
-        test_compiled_prog = fluid.compiler.CompiledProgram(build_res["test_prog"])
+        test_compiled_prog = fluid.compiler.CompiledProgram(build_res[
+            "test_prog"])
        build_res["test_compiled_prog"] = test_compiled_prog
    if args.do_eval:
-        eval_compiled_prog = fluid.compiler.CompiledProgram(build_res["eval_prog"])
+        eval_compiled_prog = fluid.compiler.CompiledProgram(build_res[
+            "eval_prog"])
        build_res["eval_compiled_prog"] = eval_compiled_prog
    if args.do_train:
@@ -465,7 +520,6 @@ def main(args):
                 save_result=True, id2intent=id2intent)

if __name__ == "__main__":
    logger.info("the paddle version is %s" % paddle.__version__)
    check_version('1.6.0')
...
@@ -32,7 +32,6 @@ try:
except ImportError:
    import ConfigParser as cp

random_seed = 7
logger = logging.getLogger()
format = "%(asctime)s - %(name)s - %(levelname)s -%(filename)s-%(lineno)4d -%(message)s"
@@ -77,6 +76,7 @@ class ArgumentGroup(object):
    Arguments:
        object {[type]} -- [description]
    """

    def __init__(self, parser, title, des):
        self._group = parser.add_argument_group(title=title, description=des)
@@ -107,6 +107,7 @@ class DataReader(object):
    Returns:
        [type] -- [description]
    """

    def __init__(self, char_vocab, intent_dict, max_len):
        self._char_vocab = char_vocab
        self._intent_dict = intent_dict
@@ -128,12 +129,17 @@ class DataReader(object):
        # word_dict_path), "The given word dictionary does not exist."
        assert os.path.exists(data_path), "The given data file does not exist."
        if mode == "train":
            train_reader = fluid.io.batch(
                paddle.reader.shuffle(
                    self.data_reader(
                        data_path, self.max_len, shuffle=True),
                    buf_size=batch_size * 100),
                batch_size)
            # train_reader = fluid.io.batch(self.data_reader(data_path), batch_size)
            return train_reader
        else:
            test_reader = fluid.io.batch(
                self.data_reader(data_path, self.max_len), batch_size)
            return test_reader

    def data_reader(self, file_path, max_len, shuffle=False):
@@ -150,7 +156,8 @@ class DataReader(object):
                char_id_list = list(map(lambda x: 0 if x not in self._char_vocab else int(self._char_vocab[x]), \
                    list(query)))
                if len(char_id_list) < max_len:
                    char_id_list.extend([self.padding_id] *
                                        (max_len - len(char_id_list)))
                char_id_list = char_id_list[:max_len]
                intent_id_list = [self.padding_id] * self.intent_size
                for item in intent.split('\2'):
@@ -159,6 +166,7 @@ class DataReader(object):
        if shuffle:
            random.seed(random_seed)
            random.shuffle(self.all_data)

        def reader():
            """
            reader
@@ -166,6 +174,7 @@ class DataReader(object):
            for char_id_list, intent_id_list in self.all_data:
                # print char_id_list, intent_id
                yield char_id_list, intent_id_list

        return reader
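The reader above pads short character-id sequences with `padding_id` and clips long ones at `max_len`. That pad-then-truncate step can be isolated as a minimal standalone sketch (the function name is illustrative):

```python
def pad_or_truncate(char_ids, max_len, padding_id=0):
    # pad short sequences out to max_len, then clip anything longer
    if len(char_ids) < max_len:
        char_ids = char_ids + [padding_id] * (max_len - len(char_ids))
    return char_ids[:max_len]
```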
@@ -178,6 +187,7 @@ class DataProcesser(object):
    Returns:
        [type] -- [description]
    """

    @staticmethod
    def read_dict(filename):
        """
@@ -227,7 +237,8 @@ class DataProcesser(object):
                    intent_dict[intent] = 0
                intent_dict[intent] += 1
        # save char dict
        with codecs.open(
                "%s/char.dict" % save_dir, "w", encoding="utf8") as f_out:
            f_out.write("PAD\0020\n")
            f_out.write("OOV\0021\n")
            char_id = 2
@@ -238,7 +249,8 @@ class DataProcesser(object):
                f_out.write("%s\002%d\n" % (key, char_id))
                char_id += 1
        # save intent dict
        with codecs.open(
                "%s/domain.dict" % save_dir, "w", encoding="utf8") as f_out:
            f_out.write("SYS_OTHER\0020\n")
            intent_id = 1
            for key, value in intent_dict.items():
@@ -249,7 +261,6 @@ class DataProcesser(object):
                intent_id += 1
class ConfigReader(object):
    """[read model config file]
@@ -282,49 +293,13 @@ class ConfigReader(object):
        return flow_data
def init_checkpoint(exe, init_checkpoint_path, main_program):
    """
    Init CheckPoint
    """
    fluid.load(main_program, init_checkpoint_path, exe)
    print("Load model from {}".format(init_checkpoint_path))


def print_arguments(args):
    """
@@ -350,5 +325,3 @@ def check_version(version='1.6.0'):
    except Exception as e:
        logger.error(err)
        sys.exit(1)
@@ -21,8 +21,10 @@ from kpi import DurationKpi
train_loss_card1 = CostKpi('train_loss_card1', 0.03, 0, actived=True)
train_loss_card4 = CostKpi('train_loss_card4', 0.03, 0, actived=True)
train_duration_card1 = DurationKpi(
    'train_duration_card1', 0.01, 0, actived=True)
train_duration_card4 = DurationKpi(
    'train_duration_card4', 0.01, 0, actived=True)

tracking_kpis = [
    train_loss_card1,
...
@@ -20,22 +20,25 @@ import sys
import io
import os

URLLIB = urllib
if sys.version_info >= (3, 0):
    import urllib.request
    URLLIB = urllib.request

DATA_MODEL_PATH = {
    "DATA_PATH":
    "https://baidu-nlp.bj.bcebos.com/auto_dialogue_evaluation_dataset-1.0.0.tar.gz",
    "TRAINED_MODEL":
    "https://baidu-nlp.bj.bcebos.com/auto_dialogue_evaluation_models.2.0.0.tar.gz"
}
PATH_MAP = {'DATA_PATH': "./data/input", 'TRAINED_MODEL': './data/saved_models'}


def un_tar(tar_name, dir_name):
    try:
        t = tarfile.open(tar_name)
        t.extractall(path=dir_name)
        return True
    except Exception as e:
        print(e)
@@ -51,7 +54,8 @@ def download_model_and_data():
        shutil.rmtree(path)
    for path_key in DATA_MODEL_PATH:
        filename = os.path.basename(DATA_MODEL_PATH[path_key])
        URLLIB.urlretrieve(DATA_MODEL_PATH[path_key],
                           os.path.join("./", filename))
        state = un_tar(filename, PATH_MAP[path_key])
        if not state:
            print("Tar %s error....." % path_key)
...
@@ -122,5 +122,3 @@ def save_param(args, exe, program, dirname):
        print("save parameters at %s" % (os.path.join(param_dir, dirname)))
        return True
@@ -21,8 +21,7 @@ import paddle
import paddle.fluid as fluid


def create_net(is_training,
               model_input,
               args,
               clip_value=10.0,
@@ -52,14 +51,12 @@ def create_net(
            initializer=fluid.initializer.Normal(scale=0.1)))

    #fc to fit dynamic LSTM
    context_fc = fluid.layers.fc(input=context_emb,
                                 size=args.hidden_size * 4,
                                 param_attr=fluid.ParamAttr(name='fc_weight'),
                                 bias_attr=fluid.ParamAttr(name='fc_bias'))
    response_fc = fluid.layers.fc(input=response_emb,
                                  size=args.hidden_size * 4,
                                  param_attr=fluid.ParamAttr(name='fc_weight'),
                                  bias_attr=fluid.ParamAttr(name='fc_bias'))
@@ -106,7 +103,5 @@ def set_word_embedding(word_emb, place, word_emb_name="shared_word_emb"):
    """
    Set word embedding
    """
    word_emb_param = fluid.global_scope().find_var(word_emb_name).get_tensor()
    word_emb_param.set(word_emb, place)
@@ -42,22 +42,24 @@ def do_save_inference_model(args):
        with fluid.unique_name.guard():
            context_wordseq = fluid.data(
                name='context_wordseq',
                shape=[-1, 1],
                dtype='int64',
                lod_level=1)
            response_wordseq = fluid.data(
                name='response_wordseq',
                shape=[-1, 1],
                dtype='int64',
                lod_level=1)
            labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')

            input_inst = [context_wordseq, response_wordseq, labels]
            input_field = InputField(input_inst)
            data_reader = fluid.io.PyReader(
                feed_list=input_inst, capacity=4, iterable=False)

            logits = create_net(
                is_training=False, model_input=input_field, args=args)

    if args.use_cuda:
        place = fluid.CUDAPlace(0)
@@ -81,9 +83,7 @@ def do_save_inference_model(args):
            input_field.context_wordseq.name,
            input_field.response_wordseq.name,
        ],
        target_vars=[logits, ],
        executor=exe,
        main_program=test_prog,
        model_filename="model.pdmodel",
...
@@ -26,7 +26,6 @@ from inference_model import do_save_inference_model
from ade.utils.configure import PDConfig


if __name__ == "__main__":
    args = PDConfig(yaml_file="./data/config/ade.yaml")
...
@@ -46,22 +46,24 @@ def do_predict(args):
        with fluid.unique_name.guard():
            context_wordseq = fluid.data(
                name='context_wordseq',
                shape=[-1, 1],
                dtype='int64',
                lod_level=1)
            response_wordseq = fluid.data(
                name='response_wordseq',
                shape=[-1, 1],
                dtype='int64',
                lod_level=1)
            labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')

            input_inst = [context_wordseq, response_wordseq, labels]
            input_field = InputField(input_inst)
            data_reader = fluid.io.PyReader(
                feed_list=input_inst, capacity=4, iterable=False)

            logits = create_net(
                is_training=False, model_input=input_field, args=args)

            logits.persistable = True
            fetch_list = [logits.name]
@@ -89,10 +91,7 @@ def do_predict(args):
        batch_size=args.batch_size)

    batch_generator = processor.data_generator(
        place=place, phase="test", shuffle=False, sample_pro=1)
    num_test_examples = processor.get_num_examples(phase='test')
    data_reader.decorate_batch_generator(batch_generator)
@@ -107,7 +106,7 @@ def do_predict(args):
            data_reader.reset()
            break

    scores = scores[:num_test_examples]
    print("Write the predicted results into the output_prediction_file")
    fw = io.open(args.output_prediction_file, 'w', encoding="utf8")
    for index, score in enumerate(scores):
...
@@ -49,22 +49,24 @@ def do_train(args):
        with fluid.unique_name.guard():
            context_wordseq = fluid.data(
                name='context_wordseq',
                shape=[-1, 1],
                dtype='int64',
                lod_level=1)
            response_wordseq = fluid.data(
                name='response_wordseq',
                shape=[-1, 1],
                dtype='int64',
                lod_level=1)
            labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')

            input_inst = [context_wordseq, response_wordseq, labels]
            input_field = InputField(input_inst)
            data_reader = fluid.io.PyReader(
                feed_list=input_inst, capacity=4, iterable=False)

            loss = create_net(
                is_training=True, model_input=input_field, args=args)
            loss.persistable = True
            # gradient clipping
            fluid.clip.set_gradient_clip(clip=fluid.clip.GradientClipByValue(
@@ -74,7 +76,8 @@ def do_train(args):
    if args.use_cuda:
        dev_count = fluid.core.get_cuda_device_count()
        place = fluid.CUDAPlace(
            int(os.getenv('FLAGS_selected_gpus', '0')))
    else:
        dev_count = int(os.environ.get('CPU_NUM', 1))
        place = fluid.CPUPlace()
@@ -114,9 +117,14 @@ def do_train(args):
    if args.word_emb_init:
        print("start loading word embedding init ...")
        if six.PY2:
            word_emb = np.array(
                pickle.load(io.open(args.word_emb_init, 'rb'))).astype(
                    'float32')
        else:
            word_emb = np.array(
                pickle.load(
                    io.open(args.word_emb_init, 'rb'),
                    encoding="bytes")).astype('float32')
        set_word_embedding(word_emb, place)
        print("finish init word embedding ...")
@@ -147,15 +155,20 @@ def do_train(args):
                    used_time = time_end - time_begin
                    current_time = time.strftime('%Y-%m-%d %H:%M:%S',
                                                 time.localtime(time.time()))
                    print(
                        '%s epoch: %d, step: %s, avg loss %s, speed: %f steps/s'
                        % (current_time, epoch_step, steps, sum_loss /
                           args.print_steps, args.print_steps / used_time))
                    sum_loss = 0.0
                    time_begin = time.time()
                if steps % args.save_steps == 0:
                    if args.save_checkpoint:
                        save_load_io.save_checkpoint(args, exe, train_prog,
                                                     "step_" + str(steps))
                    if args.save_param:
                        save_load_io.save_param(args, exe, train_prog,
                                                "step_" + str(steps))
                steps += 1
        except fluid.core.EOFException:
            data_reader.reset()
...
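The log line above averages the accumulated loss over the reporting window and derives throughput from wall-clock time. The same formatting can be reproduced standalone (the function name is illustrative):

```python
import time


def log_progress(epoch_step, steps, sum_loss, print_steps, used_time):
    # mirror the training log: average loss over the window, steps per second
    current_time = time.strftime('%Y-%m-%d %H:%M:%S',
                                 time.localtime(time.time()))
    return ('%s epoch: %d, step: %s, avg loss %s, speed: %f steps/s' %
            (current_time, epoch_step, steps, sum_loss / print_steps,
             print_steps / used_time))
```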
@@ -20,12 +20,18 @@ from kpi import CostKpi
from kpi import DurationKpi
from kpi import AccKpi

each_step_duration_atis_slot_card1 = DurationKpi(
    'each_step_duration_atis_slot_card1', 0.01, 0, actived=True)
train_loss_atis_slot_card1 = CostKpi(
    'train_loss_atis_slot_card1', 0.08, 0, actived=True)
train_acc_atis_slot_card1 = CostKpi(
    'train_acc_atis_slot_card1', 0.01, 0, actived=True)
each_step_duration_atis_slot_card4 = DurationKpi(
    'each_step_duration_atis_slot_card4', 0.06, 0, actived=True)
train_loss_atis_slot_card4 = CostKpi(
    'train_loss_atis_slot_card4', 0.03, 0, actived=True)
train_acc_atis_slot_card4 = CostKpi(
    'train_acc_atis_slot_card4', 0.01, 0, actived=True)

tracking_kpis = [
    each_step_duration_atis_slot_card1,
...
@@ -100,8 +100,12 @@ def prepare_batch_data(task_name,
    if isinstance(insts[0][3], list):
        if task_name == "atis_slot":
            labels_list = [
                inst[3] + [0] * (max_len - len(inst[3])) for inst in insts
            ]
            labels_list = [
                np.array(labels_list).astype("int64").reshape([-1, max_len])
            ]
        elif task_name == "dstc2":
            labels_list = [inst[3] for inst in insts]
            labels_list = [np.array(labels_list).astype("int64")]
@@ -124,10 +128,7 @@ def prepare_batch_data(task_name,
    out = batch_src_ids
    # Second step: padding
    src_id, self_input_mask = pad_batch_data(
        out, max_len, pad_idx=pad_id, return_input_mask=True)
    pos_id = pad_batch_data(
        batch_pos_ids,
        max_len,
@@ -163,13 +164,13 @@ def pad_batch_data(insts,
    corresponding position data and attention bias.
    """
    return_list = []
    max_len = max_len_in if max_len_in != -1 else max(
        len(inst) for inst in insts)
    # Any token included in dict can be used to pad, since the paddings' loss
    # will be masked out by weights and make no effect on parameter gradients.
    inst_data = np.array(
        [inst + list([pad_idx] * (max_len - len(inst))) for inst in insts])
    return_list += [inst_data.astype("int64").reshape([-1, max_len])]
    # position data
...
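`pad_batch_data` pads every instance to the longest sequence in the batch unless a fixed `max_len_in` is supplied. Stripped of the NumPy conversion and reshaping, the core padding rule looks like this (names are illustrative):

```python
def pad_batch(insts, pad_idx=0, max_len_in=-1):
    # pad each instance with pad_idx up to the batch max (or a fixed length)
    max_len = max_len_in if max_len_in != -1 else max(
        len(inst) for inst in insts)
    return [inst + [pad_idx] * (max_len - len(inst)) for inst in insts]
```

Any in-vocabulary id works as `pad_idx`, since padded positions are masked out of the loss.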
@@ -25,18 +25,21 @@ class DefinePredict(object):
    """
    Packaging Prediction Results
    """

    def __init__(self):
        """
        init
        """
        self.task_map = {
            'udc': 'get_matching_res',
            'swda': 'get_cls_res',
            'mrda': 'get_cls_res',
            'atis_intent': 'get_cls_res',
            'atis_slot': 'get_sequence_tagging',
            'dstc2': 'get_multi_cls_res',
            'dstc2_asr': 'get_multi_cls_res',
            'multi-woz': 'get_multi_cls_res'
        }

    def get_matching_res(self, probs, params=None):
        """
@@ -79,7 +82,3 @@ class DefinePredict(object):
        label_str = " ".join([str(l) for l in sorted(labels)])
        return label_str
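`task_map` maps each task name to the name of a handler method, which the caller then resolves with `getattr`. A minimal sketch of that dispatch pattern, with made-up handlers standing in for the real ones:

```python
class Dispatcher(object):
    # map task names to handler-method names, then dispatch via getattr
    def __init__(self):
        self.task_map = {'udc': 'get_matching_res', 'swda': 'get_cls_res'}

    def get_matching_res(self, probs):
        # e.g. return the matching score itself
        return max(probs)

    def get_cls_res(self, probs):
        # e.g. return the argmax class index
        return probs.index(max(probs))

    def handle(self, task_name, probs):
        return getattr(self, self.task_map[task_name])(probs)
```

The indirection keeps one predictor class serving every task: adding a task is one dict entry plus one method.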
@@ -20,25 +20,29 @@ import sys
import io
import os

URLLIB = urllib
if sys.version_info >= (3, 0):
    import urllib.request
    URLLIB = urllib.request

DATA_MODEL_PATH = {
    "DATA_PATH": "https://baidu-nlp.bj.bcebos.com/dmtk_data_1.0.0.tar.gz",
    "PRETRAIN_MODEL":
    "https://bert-models.bj.bcebos.com/uncased_L-12_H-768_A-12.tar.gz",
    "TRAINED_MODEL": "https://baidu-nlp.bj.bcebos.com/dgu_models_2.0.0.tar.gz"
}
PATH_MAP = {
    'DATA_PATH': "./data/input",
    'PRETRAIN_MODEL': './data/pretrain_model',
    'TRAINED_MODEL': './data/saved_models'
}


def un_tar(tar_name, dir_name):
    try:
        t = tarfile.open(tar_name)
        t.extractall(path=dir_name)
        return True
    except Exception as e:
        print(e)
@@ -48,13 +52,18 @@ def un_tar(tar_name, dir_name):
def download_model_and_data():
    print("Downloading dgu data, pretrain model and trained models......")
    print("This process is quite long, please wait patiently............")
    for path in [
            './data/input/data',
            './data/pretrain_model/uncased_L-12_H-768_A-12',
            './data/saved_models/trained_models'
    ]:
        if not os.path.exists(path):
            continue
        shutil.rmtree(path)
    for path_key in DATA_MODEL_PATH:
        filename = os.path.basename(DATA_MODEL_PATH[path_key])
        URLLIB.urlretrieve(DATA_MODEL_PATH[path_key],
                           os.path.join("./", filename))
        state = un_tar(filename, PATH_MAP[path_key])
        if not state:
            print("Tar %s error....." % path_key)
...
@@ -19,6 +19,3 @@ python run_build_data.py udc
python run_build_data.py atis
The generated slot filling data is at dialogue_general_understanding/data/input/data/atis/atis_slot
The generated intent detection data is at dialogue_general_understanding/data/input/data/atis/atis_intent
@@ -12,7 +12,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""build atis train dev test dataset"""

import json
@@ -27,6 +26,7 @@ class ATIS(object):
    """
    nlu dataset atis data process
    """

    def __init__(self):
        """
        init instance
@@ -73,7 +73,8 @@ class ATIS(object):
                if example[1] not in self.intent_dict:
                    self.intent_dict[example[1]] = self.intent_id
                    self.intent_id += 1
                fw.write(u"%s\t%s\n" %
                         (self.intent_dict[example[1]], example[0].lower()))
        fw = io.open(self.map_tag_intent, 'w', encoding="utf8")
        for tag in self.intent_dict:
@@ -109,17 +110,19 @@ class ATIS(object):
                    tags_slot.append(str(self.slot_dict[tag]))
                if i == 0:
                    if start not in [0, 1]:
                        prefix_num = len(text[:start].strip().split())
                        tags.extend([str(self.slot_dict['O'])] * prefix_num)
                    tags.extend(tags_slot)
                else:
                    prefix_num = len(text[entities[i - 1]['end']:start].strip()
                                     .split())
                    tags.extend([str(self.slot_dict['O'])] * prefix_num)
                    tags.extend(tags_slot)
            if entities[-1]['end'] < len(text):
                suffix_num = len(text[entities[-1]['end']:].strip().split())
                tags.extend([str(self.slot_dict['O'])] * suffix_num)
            fw.write(u"%s\t%s\n" %
                     (text.encode('utf8'), " ".join(tags).encode('utf8')))
        fw = io.open(self.map_tag_slot, 'w', encoding="utf8")
        for slot in self.slot_dict:
@@ -152,7 +155,3 @@ class ATIS(object):


if __name__ == "__main__":
    atis_inst = ATIS()
    atis_inst.main()
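Between entities, the slot builder above labels every whitespace-separated token of the intervening raw text with the outside tag 'O', counting tokens in the character slice between the previous entity's end and the next entity's start. That counting step in isolation (the function name is illustrative):

```python
def outside_tags(text, prev_end, start, o_tag='O'):
    # one 'O' tag per whitespace token between the previous entity and this one
    return [o_tag] * len(text[prev_end:start].strip().split())
```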
...@@ -28,6 +28,7 @@ class DSTC2(object):
    """
    dialogue state tracking dstc2 data process
    """

    def __init__(self):
        """
        init instance
...@@ -49,7 +50,8 @@ class DSTC2(object):
        self.data_dict = commonlib.load_dict(self.data_list)
        for data_type in self.data_dict:
            for i in range(len(self.data_dict[data_type])):
                self.data_dict[data_type][i] = os.path.join(
                    self.src_dir, self.data_dict[data_type][i])

    def _load_ontology(self):
        """
...@@ -97,15 +99,25 @@ class DSTC2(object):
                log_turn = log_json["turns"][i]
                label_turn = label_json["turns"][i]
                assert log_turn["turn-index"] == label_turn["turn-index"]
                labels = [
                    "%s_%s" % (slot, label_turn["goal-labels"][slot])
                    for slot in label_turn["goal-labels"]
                ]
                labels_ids = " ".join([
                    str(
                        self.map_tag_dict.get(label, self.map_tag_dict[
                            "%s_none" % label.split('_')[0]]))
                    for label in labels
                ])
                mach = log_turn['output']['transcript']
                user = label_turn['transcription']
                if not labels_ids.strip():
                    labels_ids = self.map_tag_dict['none']
                out = "%s\t%s\1%s\t%s" % (session_id, mach, user, labels_ids)
                user_asr = log_turn['input']['live']['asr-hyps'][0][
                    'asr-hyp'].strip()
                out_asr = "%s\t%s\1%s\t%s" % (session_id, mach, user_asr,
                                              labels_ids)
                fw.write(u"%s\n" % out.encode('utf8'))
                fw_asr.write(u"%s\n" % out_asr.encode('utf8'))
...@@ -144,10 +156,7 @@ class DSTC2(object):
        self.get_test_dataset()
        self.get_labels()


if __name__ == "__main__":
    dstc_inst = DSTC2()
    dstc_inst.main()
...@@ -27,6 +27,7 @@ class MRDA(object):
    """
    dialogue act dataset mrda data process
    """

    def __init__(self):
        """
        init instance
...@@ -67,7 +68,7 @@ class MRDA(object):
        for dadb_key in dadb_list:
            dadb_file = self.dadb_dict[dadb_key]
            fr = io.open(dadb_file, 'r', encoding="utf8")
            row = csv.reader(fr, delimiter=',')
            for line in row:
                elems = line
                conv_id = elems[2]
...@@ -87,7 +88,7 @@ class MRDA(object):
        for trans_key in trans_list:
            trans_file = self.trans_dict[trans_key]
            fr = io.open(trans_file, 'r', encoding="utf8")
            row = csv.reader(fr, delimiter=',')
            for line in row:
                elems = line
                if len(elems) != 3:
...@@ -120,7 +121,8 @@ class MRDA(object):
                self.tag_id += 1
            caller = elem.split('_')[0].split('-')[-1]
            conv_no = elem.split('_')[0].split('-')[0]
            out = "%s\t%s\t%s\t%s" % (conv_no, self.map_tag_dict[tag], caller,
                                      v_trans[0])
            fw.write(u"%s\n" % out)

    def get_train_dataset(self):
...@@ -158,10 +160,7 @@ class MRDA(object):
        self.get_test_dataset()
        self.get_labels()


if __name__ == "__main__":
    mrda_inst = MRDA()
    mrda_inst.main()
...@@ -27,6 +27,7 @@ class SWDA(object):
    """
    dialogue act dataset swda data process
    """

    def __init__(self):
        """
        init instance
...@@ -63,7 +64,7 @@ class SWDA(object):
            file_path = self.file_dict[name]
            fr = io.open(file_path, 'r', encoding="utf8")
            idx = 0
            row = csv.reader(fr, delimiter=',')
            for r in row:
                if idx == 0:
                    idx += 1
...@@ -224,10 +225,7 @@ class SWDA(object):
        self.get_test_dataset()
        self.get_labels()


if __name__ == "__main__":
    swda_inst = SWDA()
    swda_inst.main()
...@@ -71,6 +71,3 @@ def load_voc(conf):
        elems = line.split('\t')
        map_dict[elems[0]] = elems[1]
    return map_dict
...@@ -20,7 +20,6 @@ from build_dstc2_dataset import DSTC2
from build_mrda_dataset import MRDA
from build_swda_dataset import SWDA

if __name__ == "__main__":
    task_name = sys.argv[1]
    task_name = task_name.lower()
...@@ -38,11 +37,12 @@ if __name__ == "__main__":
    elif task_name == 'atis':
        atis_inst = ATIS()
        atis_inst.main()
        shutil.copyfile("../../data/input/data/atis/atis_slot/test.txt",
                        "../../data/input/data/atis/atis_slot/dev.txt")
        shutil.copyfile("../../data/input/data/atis/atis_intent/test.txt",
                        "../../data/input/data/atis/atis_intent/dev.txt")
    elif task_name == 'dstc2':
        dstc_inst = DSTC2()
        dstc_inst.main()
    else:
        exit(0)
...@@ -12,7 +12,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Tokenization classes."""

from __future__ import absolute_import
......
...@@ -113,7 +113,7 @@ def multi_head_attention(queries,
        """
        Scaled Dot-Product Attention
        """
        scaled_q = layers.scale(x=q, scale=d_key**-0.5)
        product = layers.matmul(x=scaled_q, y=k, transpose_y=True)
        if attn_bias:
            product += attn_bias
......
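The hunk above is the core of scaled dot-product attention: queries are scaled by `d_key ** -0.5`, multiplied against keys, an optional additive bias (e.g. a padding mask) is applied, and the softmaxed weights are used to mix the values. A NumPy sketch of the same computation (single head, no dropout — a simplification of the Fluid version):

```python
import numpy as np


def scaled_dot_product_attention(q, k, v, attn_bias=None):
    """softmax(q @ k^T / sqrt(d_key) + bias) @ v, batched over axis 0."""
    d_key = q.shape[-1]
    scaled_q = q * d_key ** -0.5
    product = scaled_q @ k.transpose(0, 2, 1)   # [batch, len_q, len_k]
    if attn_bias is not None:
        product = product + attn_bias
    # numerically stable softmax over the key axis
    weights = np.exp(product - product.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v


rng = np.random.default_rng(0)
q = rng.normal(size=(2, 4, 8))
k = rng.normal(size=(2, 4, 8))
v = rng.normal(size=(2, 4, 8))
out = scaled_dot_product_attention(q, k, v)
print(out.shape)
```

With all-zero queries the weights are uniform, so the output is just the mean of `v` over the key axis — a quick sanity check on the softmax.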
...@@ -122,5 +122,3 @@ def save_param(args, exe, program, dirname):
        print("save parameters at %s" % (os.path.join(param_dir, dirname)))
        return True
...@@ -23,12 +23,7 @@ from dgu.bert import BertModel
from dgu.utils.configure import JsonConfig


def create_net(is_training, model_input, num_labels, paradigm_inst, args):
    """create dialogue task model"""
    src_ids = model_input.src_ids
...@@ -48,14 +43,15 @@ def create_net(
        config=bert_conf,
        use_fp16=False)

    params = {
        'num_labels': num_labels,
        'src_ids': src_ids,
        'pos_ids': pos_ids,
        'sent_ids': sent_ids,
        'input_mask': input_mask,
        'labels': labels,
        'is_training': is_training
    }

    results = paradigm_inst.paradigm(bert, params)
    return results
...@@ -66,7 +66,9 @@ def do_save_inference_model(args):
            sent_ids = fluid.data(
                name='sent_ids', shape=[-1, args.max_seq_len], dtype='int64')
            input_mask = fluid.data(
                name='input_mask',
                shape=[-1, args.max_seq_len],
                dtype='float32')
            if args.task_name == 'atis_slot':
                labels = fluid.data(
                    name='labels', shape=[-1, args.max_seq_len], dtype='int64')
...@@ -74,8 +76,7 @@ def do_save_inference_model(args):
                labels = fluid.data(
                    name='labels', shape=[-1, num_labels], dtype='int64')
            else:
                labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')

            input_inst = [src_ids, pos_ids, sent_ids, input_mask, labels]
            input_field = InputField(input_inst)
...@@ -107,14 +108,10 @@ def do_save_inference_model(args):
    fluid.io.save_inference_model(
        args.inference_model_dir,
        feeded_var_names=[
            input_field.src_ids.name, input_field.pos_ids.name,
            input_field.sent_ids.name, input_field.input_mask.name
        ],
        target_vars=[probs],
        executor=exe,
        main_program=test_prog,
        model_filename="model.pdmodel",
......
...@@ -26,7 +26,6 @@ from inference_model import do_save_inference_model
from dgu.utils.configure import PDConfig

if __name__ == "__main__":
    args = PDConfig(yaml_file="./data/config/dgu.yaml")
......
...@@ -66,7 +66,9 @@ def do_train(args):
            sent_ids = fluid.data(
                name='sent_ids', shape=[-1, args.max_seq_len], dtype='int64')
            input_mask = fluid.data(
                name='input_mask',
                shape=[-1, args.max_seq_len],
                dtype='float32')
            if args.task_name == 'atis_slot':
                labels = fluid.data(
                    name='labels', shape=[-1, args.max_seq_len], dtype='int64')
...@@ -74,13 +76,12 @@ def do_train(args):
                labels = fluid.data(
                    name='labels', shape=[-1, num_labels], dtype='int64')
            else:
                labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')

            input_inst = [src_ids, pos_ids, sent_ids, input_mask, labels]
            input_field = InputField(input_inst)
            data_reader = fluid.io.PyReader(
                feed_list=input_inst, capacity=4, iterable=False)

            processor = processors[task_name](data_dir=args.data_dir,
                                              vocab_path=args.vocab_path,
                                              max_seq_len=args.max_seq_len,
...@@ -113,9 +114,7 @@ def do_train(args):
        dev_count = int(os.environ.get('CPU_NUM', 1))

    batch_generator = processor.data_generator(
        batch_size=args.batch_size, phase='train', shuffle=True)
    num_train_examples = processor.get_num_examples(phase='train')

    if args.in_tokens:
...@@ -217,37 +216,32 @@ def do_train(args):
                    current_time = time.strftime('%Y-%m-%d %H:%M:%S',
                                                 time.localtime(time.time()))
                    if accuracy is not None:
                        print("%s epoch: %d, step: %d, ave loss: %f, "
                              "ave acc: %f, speed: %f steps/s" %
                              (current_time, epoch_step, steps,
                               np.mean(np_loss), np.mean(np_acc),
                               args.print_steps / used_time))
                        ce_info.append([
                            np.mean(np_loss), np.mean(np_acc),
                            args.print_steps / used_time
                        ])
                    else:
                        print("%s epoch: %d, step: %d, ave loss: %f, "
                              "speed: %f steps/s" %
                              (current_time, epoch_step, steps,
                               np.mean(np_loss), args.print_steps / used_time))
                        ce_info.append(
                            [np.mean(np_loss), args.print_steps / used_time])
                    time_begin = time.time()

                if steps % args.save_steps == 0:
                    save_path = "step_" + str(steps)
                    if args.save_checkpoint:
                        save_load_io.save_checkpoint(args, exe, train_prog,
                                                     save_path)
                    if args.save_param:
                        save_load_io.save_param(args, exe, train_prog,
                                                save_path)

        except fluid.core.EOFException:
            data_reader.reset()
......
...@@ -19,8 +19,7 @@ from __future__ import print_function
import os
import sys

sys.path.append("../shared_modules/")
import paddle
import paddle.fluid as fluid
import numpy as np
......
...@@ -11,7 +11,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Emotion Detection Task
"""
...@@ -24,7 +23,7 @@ import os
import time
import multiprocessing
import sys

sys.path.append("../shared_modules/")

import paddle
import paddle.fluid as fluid
...@@ -38,9 +37,7 @@ import reader
import utils


def create_model(args, num_labels, is_prediction=False):
    """
    Create Model for Emotion Detection
    """
...@@ -77,10 +74,17 @@ def create_model(args,
        raise ValueError("Unknown network type!")

    if is_prediction:
        probs = network(
            data,
            seq_len,
            None,
            args.vocab_size,
            class_dim=num_labels,
            is_prediction=True)
        return loader, probs, [data.name, seq_len.name]

    avg_loss, probs = network(
        data, seq_len, label, args.vocab_size, class_dim=num_labels)
    num_seqs = fluid.layers.create_tensor(dtype='int64')
    accuracy = fluid.layers.accuracy(input=probs, label=label, total=num_seqs)
    return loader, avg_loss, accuracy, num_seqs
...@@ -142,7 +146,8 @@ def main(args):
    exe = fluid.Executor(place)

    task_name = args.task_name.lower()
    processor = reader.EmoTectProcessor(
        data_dir=args.data_dir,
        vocab_path=args.vocab_path,
        random_seed=args.random_seed)
    #num_labels = len(processor.get_labels())
...@@ -173,9 +178,7 @@ def main(args):
        with fluid.program_guard(train_program, startup_prog):
            with fluid.unique_name.guard():
                train_loader, loss, accuracy, num_seqs = create_model(
                    args, num_labels=num_labels, is_prediction=False)
                sgd_optimizer = fluid.optimizer.Adagrad(learning_rate=args.lr)
                sgd_optimizer.minimize(loss)
...@@ -189,37 +192,27 @@ def main(args):
    if args.do_val:
        if args.do_train:
            test_data_generator = processor.data_generator(
                batch_size=args.batch_size, phase='dev', epoch=1)
        else:
            test_data_generator = processor.data_generator(
                batch_size=args.batch_size, phase='test', epoch=1)

        test_prog = fluid.Program()
        with fluid.program_guard(test_prog, startup_prog):
            with fluid.unique_name.guard():
                test_loader, loss, accuracy, num_seqs = create_model(
                    args, num_labels=num_labels, is_prediction=False)
        test_prog = test_prog.clone(for_test=True)

    if args.do_infer:
        infer_data_generator = processor.data_generator(
            batch_size=args.batch_size, phase='infer', epoch=1)

        test_prog = fluid.Program()
        with fluid.program_guard(test_prog, startup_prog):
            with fluid.unique_name.guard():
                infer_loader, probs, _ = create_model(
                    args, num_labels=num_labels, is_prediction=True)
        test_prog = test_prog.clone(for_test=True)

    exe.run(startup_prog)
...@@ -292,8 +285,9 @@ def main(args):
                    time_begin = time.time()

                if steps % args.save_steps == 0:
                    save_path = os.path.join(args.save_checkpoint_dir,
                                             "step_" + str(steps))
                    fluid.save(train_program, save_path)

                if steps % args.validation_steps == 0:
                    # evaluate on dev set
...@@ -306,11 +300,11 @@ def main(args):
                print("final step: %d " % steps)
                if args.do_val:
                    evaluate(test_exe, test_prog, test_loader,
                             [loss.name, accuracy.name, num_seqs.name], "dev")

                save_path = os.path.join(args.save_checkpoint_dir,
                                         "step_" + str(steps))
                fluid.save(train_program, save_path)
                train_loader.reset()
                break
...@@ -334,15 +328,12 @@ def main(args):
    if not args.do_train and args.do_val:
        print("Final test result:")
        evaluate(test_exe, test_prog, test_loader,
                 [loss.name, accuracy.name, num_seqs.name], "test")

    # infer
    if args.do_infer:
        print("Final infer result:")
        infer(test_exe, test_prog, infer_loader, [probs.name], "infer")


def get_cards():
......
...@@ -11,7 +11,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Emotion Detection Task, based on ERNIE
"""
...@@ -25,7 +24,7 @@ import time
import argparse
import multiprocessing
import sys

sys.path.append("../shared_modules/")
import paddle
import paddle.fluid as fluid
...@@ -350,7 +349,7 @@ def main(args):
                if steps % args.save_steps == 0:
                    save_path = os.path.join(args.save_checkpoint_dir, "step_" + str(steps))
                    fluid.save(train_program, save_path)

                if steps % args.validation_steps == 0:
                    # evaluate dev set
...@@ -369,7 +368,7 @@ def main(args):
        except fluid.core.EOFException:
            save_path = os.path.join(args.save_checkpoint_dir, "step_" + str(steps))
            fluid.save(train_program, save_path)
            train_pyreader.reset()
            break
......
...@@ -11,7 +11,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
EmoTect utilities.
"""
...@@ -29,27 +28,13 @@ import paddle
import paddle.fluid as fluid
import numpy as np


def init_checkpoint(exe, init_checkpoint_path, main_program):
    """
    Init CheckPoint
    """
    fluid.load(main_program, init_checkpoint_path, exe)


def word2id(word_dict, query):
...@@ -57,8 +42,10 @@ def word2id(word_dict, query):
    Convert word sequence into id list
    """
    unk_id = len(word_dict)
    wids = [
        word_dict[w] if w in word_dict else unk_id
        for w in query.strip().split(" ")
    ]
    return wids
......
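The `word2id` change above is purely cosmetic, but the mapping itself is worth pinning down: out-of-vocabulary words get the id `len(word_dict)`, i.e. one past the largest known id. A dependency-free sketch with an invented toy vocabulary:

```python
def word2id(word_dict, query):
    """Map a whitespace-separated query to word ids.

    OOV words share a single unk id equal to len(word_dict), so the
    embedding table must be sized vocab_size + 1 to accommodate it.
    """
    unk_id = len(word_dict)
    return [word_dict.get(w, unk_id) for w in query.strip().split(" ")]


word_dict = {'i': 0, 'love': 1, 'this': 2}   # toy vocabulary
print(word2id(word_dict, "i love this movie"))  # "movie" is OOV
```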
...@@ -5,7 +5,7 @@
## 1. Task Description

This document describes an LSTM-based language model. Given an input token sequence (Chinese text is word-segmented, English text is tokenized), the model computes its perplexity (ppl), which measures how fluent a sentence is. For an introduction to RNN language models, see the [paper](https://arxiv.org/abs/1409.2329). Compared with traditional approaches, RNN-based methods handle rare words better.

**The language model currently requires PaddlePaddle 1.7 or later, or an appropriate develop build.**

We also recommend the [IPython Notebook demo](https://aistudio.baidu.com/aistudio/projectDetail/122290).
......
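The README above defines perplexity as the fluency score the model reports. Concretely, perplexity is the exponential of the mean per-token negative log-likelihood, so a model that assigns uniform probability over a vocabulary of size V has ppl exactly V. A small NumPy illustration of that identity:

```python
import numpy as np


def perplexity(token_nll):
    """Perplexity = exp(mean per-token negative log-likelihood)."""
    return float(np.exp(np.mean(token_nll)))


# A uniform model over a 10-word vocabulary: every token costs ln(10)
# nats, so the perplexity comes out to the vocabulary size.
uniform_nll = [np.log(10.0)] * 5
print(perplexity(uniform_nll))
```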
...@@ -36,7 +36,7 @@ import sys
if sys.version[0] == '2':
    reload(sys)
    sys.setdefaultencoding("utf-8")
sys.path.append('../shared_modules/')
import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"
...@@ -60,7 +60,7 @@ def profile_context(profile=True, profiler_path='/tmp/paddingrnn.profile'):

def get_current_model_para(train_prog, train_exe):
    param_list = train_prog.all_parameters()
    param_name_list = [p.name for p in param_list]

    vals = {}
...@@ -73,7 +73,7 @@ def get_current_model_para(train_prog, train_exe):

def save_para_npz(train_prog, train_exe):
    print("begin to save model to model_base")
    param_list = train_prog.all_parameters()
    param_name_list = [p.name for p in param_list]

    vals = {}
......
...@@ -16,7 +16,7 @@ Lexical Analysis of Chinese (LAC) is a joint lexical analysis model
#### 1. Installing PaddlePaddle

This project requires PaddlePaddle 1.7 or later and PaddleHub 1.0.0 or later. For PaddlePaddle installation, see the official [Quick Install](http://www.paddlepaddle.org/paddle#quick-start) guide; for PaddleHub, see [PaddleHub](https://github.com/PaddlePaddle/PaddleHub).

> Warning: The GPU and CPU builds of PaddlePaddle are distributed as paddlepaddle-gpu and paddlepaddle respectively; take care to install the right one.
......
...@@ -26,7 +26,7 @@ from paddle.fluid.initializer import NormalInitializer ...@@ -26,7 +26,7 @@ from paddle.fluid.initializer import NormalInitializer
from reader import Dataset from reader import Dataset
from ernie_reader import SequenceLabelReader from ernie_reader import SequenceLabelReader
sys.path.append("..") sys.path.append("../shared_modules/")
from models.sequence_labeling import nets from models.sequence_labeling import nets
from models.representation.ernie import ernie_encoder, ernie_pyreader from models.representation.ernie import ernie_encoder, ernie_pyreader
...@@ -35,9 +35,10 @@ def create_model(args, vocab_size, num_labels, mode='train'): ...@@ -35,9 +35,10 @@ def create_model(args, vocab_size, num_labels, mode='train'):
"""create lac model""" """create lac model"""
# model's input data # model's input data
words = fluid.data(name='words', shape=[-1, 1], dtype='int64', lod_level=1) words = fluid.data(
name='words', shape=[None, 1], dtype='int64', lod_level=1)
targets = fluid.data( targets = fluid.data(
name='targets', shape=[-1, 1], dtype='int64', lod_level=1) name='targets', shape=[None, 1], dtype='int64', lod_level=1)
# for inference process # for inference process
if mode == 'infer': if mode == 'infer':
...@@ -88,9 +89,11 @@ def create_pyreader(args, ...@@ -88,9 +89,11 @@ def create_pyreader(args,
return_reader=False, return_reader=False,
mode='train'): mode='train'):
# init reader # init reader
device_count = len(fluid.cuda_places()) if args.use_cuda else len(
fluid.cpu_places())
if model == 'lac': if model == 'lac':
pyreader = fluid.io.PyReader( pyreader = fluid.io.DataLoader.from_generator(
feed_list=feed_list, feed_list=feed_list,
capacity=50, capacity=50,
use_double_buffer=True, use_double_buffer=True,
...@@ -101,19 +104,19 @@ def create_pyreader(args, ...@@ -101,19 +104,19 @@ def create_pyreader(args,
# create lac pyreader # create lac pyreader
if mode == 'train': if mode == 'train':
pyreader.decorate_sample_list_generator( pyreader.set_sample_list_generator(
fluid.io.batch( fluid.io.batch(
fluid.io.shuffle( fluid.io.shuffle(
reader.file_reader(file_name), reader.file_reader(file_name),
buf_size=args.traindata_shuffle_buffer), buf_size=args.traindata_shuffle_buffer),
batch_size=args.batch_size), batch_size=args.batch_size / device_count),
places=place) places=place)
else: else:
pyreader.decorate_sample_list_generator( pyreader.set_sample_list_generator(
fluid.io.batch( fluid.io.batch(
reader.file_reader( reader.file_reader(
file_name, mode=mode), file_name, mode=mode),
batch_size=args.batch_size), batch_size=args.batch_size / device_count),
places=place) places=place)
elif model == 'ernie': elif model == 'ernie':
@@ -162,19 +165,19 @@ def create_ernie_model(args, ernie_config):
     # ERNIE's input data
     src_ids = fluid.data(
-        name='src_ids', shape=[-1, args.max_seq_len, 1], dtype='int64')
+        name='src_ids', shape=[None, args.max_seq_len, 1], dtype='int64')
     sent_ids = fluid.data(
-        name='sent_ids', shape=[-1, args.max_seq_len, 1], dtype='int64')
+        name='sent_ids', shape=[None, args.max_seq_len, 1], dtype='int64')
     pos_ids = fluid.data(
-        name='pos_ids', shape=[-1, args.max_seq_len, 1], dtype='int64')
+        name='pos_ids', shape=[None, args.max_seq_len, 1], dtype='int64')
     input_mask = fluid.data(
-        name='input_mask', shape=[-1, args.max_seq_len, 1], dtype='float32')
+        name='input_mask', shape=[None, args.max_seq_len, 1], dtype='float32')
     padded_labels = fluid.data(
-        name='padded_labels', shape=[-1, args.max_seq_len, 1], dtype='int64')
+        name='padded_labels', shape=[None, args.max_seq_len, 1], dtype='int64')
     seq_lens = fluid.data(
-        name='seq_lens', shape=[-1], dtype='int64', lod_level=0)
+        name='seq_lens', shape=[None], dtype='int64', lod_level=0)
     squeeze_labels = fluid.layers.squeeze(padded_labels, axes=[-1])
...
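The hunks above swap `fluid.io.PyReader` for `fluid.io.DataLoader.from_generator` (with `decorate_sample_list_generator` renamed to `set_sample_list_generator`) and divide the batch size by the device count, since the loader now feeds each device its own share of every batch. As a rough, Paddle-free sketch of what the `fluid.io.shuffle` / `fluid.io.batch` composition does (the helper names mirror Paddle's, but this is plain Python, not the real implementation):

```python
import random

def shuffle(reader, buf_size):
    # Buffer up to buf_size samples, shuffle the buffer, then drain it,
    # roughly what fluid.io.shuffle does to a sample generator.
    def shuffled():
        buf = []
        for sample in reader():
            buf.append(sample)
            if len(buf) >= buf_size:
                random.shuffle(buf)
                for s in buf:
                    yield s
                buf = []
        random.shuffle(buf)
        for s in buf:
            yield s
    return shuffled

def batch(reader, batch_size):
    # Group consecutive samples into lists of batch_size, like fluid.io.batch.
    def batched():
        group = []
        for sample in reader():
            group.append(sample)
            if len(group) == batch_size:
                yield group
                group = []
        if group:
            yield group
    return batched

# The diff divides the global batch size by the device count so that each
# device receives its own sub-batch; integer division keeps it well-defined.
global_batch_size, device_count = 32, 4
per_device = global_batch_size // device_count

file_reader = lambda: iter(range(100))  # stand-in for reader.file_reader(...)
pipeline = batch(shuffle(file_reader, buf_size=50), batch_size=per_device)
first = next(pipeline())  # first sub-batch of per_device samples
```

Note that with true division, as written in the hunk, a non-divisible `args.batch_size` would produce a float; splitting the batch evenly per device is the intent.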
@@ -20,7 +20,7 @@ import sys
 from collections import namedtuple
 import numpy as np
-sys.path.append("..")
+sys.path.append("../shared_modules/")
 from preprocess.ernie.task_reader import BaseReader, tokenization
...
@@ -24,7 +24,7 @@ import paddle
 import utils
 import reader
 import creator
-sys.path.append('../models/')
+sys.path.append('../shared_modules/models/')
 from model_check import check_cuda
 from model_check import check_version
...
@@ -10,7 +10,7 @@ import paddle.fluid as fluid
 import creator
 import reader
 import utils
-sys.path.append('../models/')
+sys.path.append('../shared_modules/models/')
 from model_check import check_cuda
 from model_check import check_version
...
@@ -24,7 +24,7 @@ import paddle
 import utils
 import reader
 import creator
-sys.path.append('../models/')
+sys.path.append('../shared_modules/models/')
 from model_check import check_cuda
 from model_check import check_version
...
@@ -34,7 +34,7 @@ import paddle.fluid as fluid
 import creator
 import utils
-sys.path.append("..")
+sys.path.append("../shared_modules/")
 from models.representation.ernie import ErnieConfig
 from models.model_check import check_cuda
 from models.model_check import check_version
@@ -188,15 +188,16 @@ def do_train(args):
                 if steps % args.save_steps == 0:
                     save_path = os.path.join(args.model_save_dir,
-                                             "step_" + str(steps))
+                                             "step_" + str(steps), "checkpoint")
                     print("\tsaving model as %s" % (save_path))
-                    fluid.io.save_persistables(exe, save_path, train_program)
+                    fluid.save(train_program, save_path)
                 if steps % args.validation_steps == 0:
                     evaluate(exe, test_program, test_pyreader, train_ret)
-    save_path = os.path.join(args.model_save_dir, "step_" + str(steps))
-    fluid.io.save_persistables(exe, save_path, train_program)
+    save_path = os.path.join(args.model_save_dir, "step_" + str(steps),
+                             "checkpoint")
+    fluid.save(train_program, save_path)


 def do_eval(args):
...
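The checkpointing hunks above move from `fluid.io.save_persistables`, which wrote one file per persistable variable into a directory, to `fluid.save`, which treats its path argument as a file prefix; the extra `"checkpoint"` path component keeps the per-step directory layout. A small sketch of the resulting paths (plain `os.path` only, no Paddle required; the `.pdparams`/`.pdopt` naming follows the transformer README text updated later in this PR):

```python
import os

model_save_dir, steps = "models", 1200

# Old style: fluid.io.save_persistables(exe, save_path, program) wrote one
# file per persistable variable under this directory.
old_save_path = os.path.join(model_save_dir, "step_" + str(steps))

# New style: fluid.save(program, prefix) takes a file prefix; adding a
# "checkpoint" component keeps the per-step directory layout. The prefix
# gains .pdparams (parameters) and .pdopt (optimizer state) files.
new_save_prefix = os.path.join(model_save_dir, "step_" + str(steps),
                               "checkpoint")
expected_files = [new_save_prefix + ext for ext in (".pdparams", ".pdopt")]
```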
@@ -29,7 +29,7 @@ import reader
 import utils
 import creator
 from eval import test_process
-sys.path.append('../models/')
+sys.path.append('../shared_modules/models/')
 from model_check import check_cuda
 from model_check import check_version
@@ -151,8 +151,8 @@ def do_train(args):
             # save checkpoints
             if step % args.save_steps == 0 and step != 0:
                 save_path = os.path.join(args.model_save_dir,
-                                         "step_" + str(step))
-                fluid.io.save_persistables(exe, save_path, train_program)
+                                         "step_" + str(step), "checkpoint")
+                fluid.save(train_program, save_path)
             step += 1
     if args.enable_ce:
...
@@ -200,19 +200,11 @@ def init_checkpoint(exe, init_checkpoint_path, main_program):
     assert os.path.exists(
         init_checkpoint_path), "[%s] cann't be found." % init_checkpoint_path

-    def existed_persitables(var):
-        """
-        If existed presitabels
-        """
-        if not fluid.io.is_persistable(var):
-            return False
-        return os.path.exists(os.path.join(init_checkpoint_path, var.name))
-
-    fluid.io.load_vars(
-        exe,
-        init_checkpoint_path,
-        main_program=main_program,
-        predicate=existed_persitables)
+    try:
+        checkpoint_path = os.path.join(init_checkpoint_path, "checkpoint")
+        fluid.load(main_program, checkpoint_path, exe)
+    except:
+        fluid.load(main_program, init_checkpoint_path, exe)
     print("Load model from {}".format(init_checkpoint_path))
@@ -224,15 +216,6 @@ def init_pretraining_params(exe,
     assert os.path.exists(pretraining_params_path
                           ), "[%s] cann't be found." % pretraining_params_path

-    def _existed_params(var):
-        if not isinstance(var, fluid.framework.Parameter):
-            return False
-        return os.path.exists(os.path.join(pretraining_params_path, var.name))
-
-    fluid.io.load_vars(
-        exe,
-        pretraining_params_path,
-        main_program=main_program,
-        predicate=_existed_params)
+    fluid.load(main_program, pretraining_params_path, exe)
     print("Load pretraining parameters from {}.".format(
         pretraining_params_path))
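The rewritten `init_checkpoint` first tries the new-style `<path>/checkpoint` prefix and falls back to loading the path directly, so both freshly saved and older checkpoints still load. The control flow can be sketched generically (the loader functions here are stand-ins, not Paddle APIs; a narrower `except` clause is used than the diff's bare `except`):

```python
import os

def load_with_fallback(load_fn, base_path):
    # Mirror the rewritten init_checkpoint: try the new-style
    # "<base_path>/checkpoint" prefix first, then fall back to base_path.
    try:
        return load_fn(os.path.join(base_path, "checkpoint"))
    except (IOError, OSError, ValueError):
        return load_fn(base_path)

# Stand-in loader that only recognizes old-style checkpoint directories.
def fake_load(path):
    if path.endswith("checkpoint"):
        raise IOError("no new-style checkpoint at %s" % path)
    return "loaded:" + path

result = load_with_fallback(fake_load, "init_model")
```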
@@ -39,4 +39,3 @@ D-NET is a "pre-trai…
 - In the fine-tuning stage, multi-task and multi-domain learning strategies are introduced (based on the [PALM](https://github.com/PaddlePaddle/PALM) multi-task learning framework), effectively improving the model's generalization ability across domains.

 Using the D-NET framework, Baidu won first place in the EMNLP 2019 [MRQA](https://mrqa.github.io/shared) international machine reading comprehension evaluation, outscoring the runner-up by nearly two percentage points, and ranked first on 10 of the 12 test sets.
@@ -106,7 +106,7 @@ python -u main.py \
                   --prepostprocess_dropout 0.3
 ```

-All GPUs are used for training by default; the number of GPUs used can be set via the `CUDA_VISIBLE_DEVICES` environment variable. Training can also run on CPU only (set with `--use_cuda False`), which is comparatively slow. If `save_param` and `save_checkpoint` are provided (they default to trained_params and trained_ckpts), the current parameter values and a checkpoint are saved to the corresponding directories every `save_step` iterations (default 10000), and every `print_step` iterations (default 100) a log like the following is printed to standard output:
+All GPUs are used for training by default; the number of GPUs used can be set via the `CUDA_VISIBLE_DEVICES` environment variable. Training can also run on CPU only (set with `--use_cuda False`), which is comparatively slow. If `save_model_path` is provided (it defaults to saved_models), a checkpoint of the current training is saved to the corresponding directory every `save_step` iterations (default 10000), as two files, `transformer.pdparams` and `transformer.pdopt`, recording the model parameters and the optimizer state respectively; every `print_step` iterations (default 100) a log like the following is printed to standard output:
 ```txt
 [2019-08-02 15:30:51,656 INFO train.py:262] step_idx: 150100, epoch: 32, batch: 1364, avg loss: 2.880427, normalized loss: 1.504687, ppl: 17.821888, speed: 3.34 step/s
@@ -195,7 +195,7 @@ BLEU = 26.35, 57.7/32.1/20.0/13.0 (BP=1.000, ratio=1.013, hyp_len=63903, ref_len
 ### Pretrained models

-We provide for download the parameters of the [base model](https://transformer-res.bj.bcebos.com/base_model_params.tar.gz) and [big model](https://transformer-res.bj.bcebos.com/big_model_params.tar.gz) corresponding to the BLEU scores above (note that the models were trained and tested with the downloadable data provided).
+We provide for download the parameters of the [base model](https://transformer-res.bj.bcebos.com/base_model_graph.tar.gz) and [big model](https://transformer-res.bj.bcebos.com/big_model_graph.tar.gz) corresponding to the BLEU scores above (note that the models were trained and tested with the downloadable data provided).

 ## Advanced usage
...
@@ -12,6 +12,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

+
 def get_input_descs(args):
     """
     Generate a dict mapping data fields to the corresponding data shapes and
@@ -42,7 +43,8 @@ def get_input_descs(args):
         # encoder.
         # The actual data shape of src_slf_attn_bias is:
         # [batch_size, n_head, max_src_len_in_batch, max_src_len_in_batch]
-        "src_slf_attn_bias": [(batch_size, n_head, seq_len, seq_len), "float32"],
+        "src_slf_attn_bias":
+        [(batch_size, n_head, seq_len, seq_len), "float32"],
         # The actual data shape of trg_word is:
         # [batch_size, max_trg_len_in_batch, 1]
         "trg_word": [(batch_size, seq_len), "int64",
@@ -54,12 +56,14 @@ def get_input_descs(args):
         # subsequent words in the decoder.
         # The actual data shape of trg_slf_attn_bias is:
         # [batch_size, n_head, max_trg_len_in_batch, max_trg_len_in_batch]
-        "trg_slf_attn_bias": [(batch_size, n_head, seq_len, seq_len), "float32"],
+        "trg_slf_attn_bias":
+        [(batch_size, n_head, seq_len, seq_len), "float32"],
         # This input is used to remove attention weights on paddings of the source
         # input in the encoder-decoder attention.
         # The actual data shape of trg_src_attn_bias is:
         # [batch_size, n_head, max_trg_len_in_batch, max_src_len_in_batch]
-        "trg_src_attn_bias": [(batch_size, n_head, seq_len, seq_len), "float32"],
+        "trg_src_attn_bias":
+        [(batch_size, n_head, seq_len, seq_len), "float32"],
         # This input is used in independent decoder program for inference.
         # The actual data shape of enc_output is:
         # [batch_size, max_src_len_in_batch, d_model]
@@ -80,6 +84,7 @@ def get_input_descs(args):
     return input_descs

+
 # Names of word embedding table which might be reused for weight sharing.
 word_emb_param_names = (
     "src_word_emb_table",
...
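`get_input_descs` returns a dict mapping each input field name to its shape and dtype (some fields carry extra metadata after the dtype, such as a LoD level). A minimal sketch of consuming such descs, with a hypothetical `make_placeholders` helper standing in for the `fluid.data` calls a real model would issue:

```python
# Symbolic stand-ins for dimensions resolved at runtime; None marks a
# dynamic dimension, matching the shape=[-1, ...] -> shape=[None, ...]
# changes elsewhere in this PR.
batch_size, seq_len, n_head, d_model = None, None, 8, 512

# A few of the fields shown in the diff (real descs define more fields).
input_descs = {
    "trg_word": [(batch_size, seq_len), "int64"],
    "src_slf_attn_bias": [(batch_size, n_head, seq_len, seq_len), "float32"],
    "enc_output": [(batch_size, seq_len, d_model), "float32"],
}

def make_placeholders(descs):
    # Hypothetical helper: with Paddle available, this would call
    # fluid.data(name=name, shape=list(shape), dtype=dtype) per field;
    # here the descs are just normalized for inspection.
    return {name: {"shape": list(desc[0]), "dtype": desc[1]}
            for name, desc in descs.items()}

placeholders = make_placeholders(input_descs)
```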