Unverified · Commit f492ae4f · authored by pkpk, committed by GitHub

Refactor PaddleNLP (#4351)

* Update README.md (#4267)

* test=develop (#4269)

* 3d use new api (#4275)

* PointNet++ and PointRCNN use new API

* Update Readme of Dygraph BERT (#4277)

Fix some typos.

* Update run_classifier_multi_gpu.sh (#4279)

remove the CUDA_VISIBLE_DEVICES

* Update README.md (#4280)

* 17 update api (#4294)

* update 1.7 save/load & fluid.data

* update datafeed to dataloader (see the migration sketch after the commit metadata below)

* Update resnet_acnet.py (#4297)

The bias attr of the square conv should be `False` rather than `None` in training mode (see the sketch after the commit metadata below).

* test=develop

* test=develop

* test=develop

* test=develop

* test
Co-authored-by: Kaipeng Deng <dengkaipeng@baidu.com>
Co-authored-by: zhang wenhui <frankwhzhang@126.com>
Co-authored-by: parap1uie-s <parap1uie-s@users.noreply.github.com>
Parent 8dc42c73
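The "update datafeed to dataloader" and 1.7 save/load bullets above follow one migration pattern that repeats throughout the diffs below. A minimal sketch of that pattern, assuming the Paddle 1.7 static-graph APIs; the `words` placeholder and `batch_reader` generator are illustrative, not taken from this commit:

```python
import numpy as np
import paddle.fluid as fluid

# 1.7-style typed placeholder: fluid.data replaces fluid.layers.data.
words = fluid.data(name='words', shape=[None, 1], dtype='int64')

# Toy batch generator standing in for the project's file readers.
def batch_reader():
    for _ in range(4):
        yield [np.random.randint(0, 100, size=(8, 1)).astype('int64')]

# 1.7-style feeding: DataLoader replaces the old DataFeeder/feed-dict path.
loader = fluid.io.DataLoader.from_generator(
    feed_list=[words], capacity=4, iterable=True)
loader.set_batch_generator(batch_reader, places=fluid.CPUPlace())

for batch in loader():  # each `batch` is the feed data for one step
    pass                # exe.run(feed=batch, ...) would go here

# 1.7-style checkpointing, as used in the train scripts below:
# fluid.save(train_program, save_path) / fluid.load(train_program, save_path, exe)
```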
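Likewise, the `bias_attr` fix above rests on a real behavioral difference in static-graph Paddle: `bias_attr=None` means "create a bias with default attributes", while `bias_attr=False` creates no bias parameter at all. A hedged sketch; the layer choice and shapes are illustrative:

```python
import paddle.fluid as fluid

img = fluid.data(name='img', shape=[None, 3, 32, 32], dtype='float32')

# bias_attr=None -> the default ParamAttr is used, so a bias IS created.
conv_with_bias = fluid.layers.conv2d(
    input=img, num_filters=16, filter_size=3, bias_attr=None)

# bias_attr=False -> no bias parameter at all, which is what the
# square conv in resnet_acnet.py needs in training mode.
conv_no_bias = fluid.layers.conv2d(
    input=img, num_filters=16, filter_size=3, bias_attr=False)
```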
@@ -13,6 +13,3 @@
 [submodule "PaddleSpeech/DeepSpeech"]
 	path = PaddleSpeech/DeepSpeech
 	url = https://github.com/PaddlePaddle/DeepSpeech.git
-[submodule "PaddleNLP/PALM"]
-	path = PaddleNLP/PALM
-	url = https://github.com/PaddlePaddle/PALM
...
-Subproject commit 5426f75073cf5bd416622dbe71b146d3dc8fffb6
+Subproject commit 30b892e3c029bff706337f269e6c158b0a223f60
@@ -10,7 +10,7 @@
 - **Rich, comprehensive support for NLP tasks:**
-  - PaddleNLP offers multi-granularity, multi-scenario application support, covering basic NLP techniques such as [word segmentation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), [part-of-speech tagging](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis) and [named entity recognition](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), as well as core NLP techniques such as [text classification](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/sentiment_classification), [text similarity](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/similarity_net), [semantic representation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK) and [text generation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleTextGEN). PaddleNLP also ships the specific core techniques, tool components, models and pretrained parameters for common large-scale NLP systems such as [reading comprehension](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMRC), [dialogue systems](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleDialogue) and [machine translation systems](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMT), so nothing in NLP stands in your way.
+  - PaddleNLP offers multi-granularity, multi-scenario application support, covering basic NLP techniques such as [word segmentation](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/lexical_analysis), [part-of-speech tagging](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/lexical_analysis) and [named entity recognition](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/lexical_analysis), as well as core NLP techniques such as [text classification](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/sentiment_classification), [text similarity](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/similarity_net), [semantic representation](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/pretrain_langauge_models) and [text generation](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/seq2seq). PaddleNLP also ships the specific core techniques, tool components, models and pretrained parameters for common large-scale NLP systems such as [reading comprehension](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/machine_reading_comprehension), [dialogue systems](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/dialogue_system) and [machine translation systems](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/machine_translation), so nothing in NLP stands in your way.
 - **Stable, reliable NLP models and strong pretrained parameters:**
@@ -50,17 +50,17 @@ cd models/PaddleNLP/sentiment_classification
 | Task | Project/Directory | Description |
 | :----: | :----: | :----: |
-| **Chinese word segmentation**, **POS tagging**, **NER** :fire: | [LAC](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis) | LAC (Lexical Analysis of Chinese) is a Chinese-processing tool widely used inside Baidu, covering common tasks such as Chinese word segmentation, POS tagging and named entity recognition. |
-| **Word embeddings (word2vec)** | [word2vec](https://github.com/PaddlePaddle/models/tree/develop/PaddleRec/word2vec) | Distributed training of Chinese word embeddings (single-machine multi-GPU and multi-machine), supporting mainstream embedding models (skip-gram, cbow, etc.); custom data can be used to train embeddings quickly. |
-| **Language model** | [Language_model](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/language_model) | A classic neural language model based on recurrent neural networks (RNN). |
-| **Sentiment classification** :fire: | [Senta](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/sentiment_classification), [EmotionDetection](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/emotion_detection) | The Senta (Sentiment Classification) and EmotionDetection projects provide sentiment analysis models for *general scenarios* and for *human-machine dialogue scenarios*, respectively. |
-| **Text similarity** :fire: | [SimNet](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/similarity_net) | SimNet (Similarity Net) provides efficient, reliable text-similarity tools and pretrained models. |
-| **Semantic representation** :fire: | [PaddleLARK](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK) | PaddleLARK (Paddle LAnguage Representation Toolkit) integrates popular Chinese and English pretrained models such as ELMo, BERT, ERNIE 1.0, ERNIE 2.0 and XLNet. |
-| **Text generation** | [PaddleTextGEN](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleTextGEN) | Paddle Text Generation provides a series of classic text generation models, such as vanilla seq2seq, seq2seq with attention and variational seq2seq. |
-| **Reading comprehension** | [PaddleMRC](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMRC) | PaddleMRC (Paddle Machine Reading Comprehension) collects Baidu's models, tools and open-source datasets for reading comprehension, including DuReader (Baidu's open-source large-scale Chinese reading comprehension dataset built from real search-user behavior), KT-Net (a knowledge-enhanced reading comprehension model that once ranked first on SQuAD and ReCoRD) and D-Net (a pretrain-finetune framework that won first place in the EMNLP 2019 MRQA international reading comprehension evaluation). |
-| **Dialogue systems** | [PaddleDialogue](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleDialogue) | Includes: 1) DGU (Dialogue General Understanding), covering common dialogue-system tasks such as context-response matching for **retrieval-based chat systems** and **intent detection**, **slot parsing**, **state tracking** for **task-oriented dialogue systems**, with the best results on 6 public international datasets. <br/> 2) knowledge-driven dialogue: Baidu's open-source knowledge-driven open-domain dialogue dataset, published at ACL 2019. <br/> 3) ADEM (Auto Dialogue Evaluation Model): automatically evaluates the response quality of different dialogue generation models. |
-| **Machine translation** | [PaddleMT](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMT) | PaddleMT (Paddle Machine Translation), a classic Transformer-based machine translation model. |
-| **Other frontier work** | [Research](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/Research) | Open-source releases of Baidu's latest research. |
+| **Chinese word segmentation**, **POS tagging**, **NER** :fire: | [LAC](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/lexical_analysis) | LAC (Lexical Analysis of Chinese) is a Chinese-processing tool widely used inside Baidu, covering common tasks such as Chinese word segmentation, POS tagging and named entity recognition. |
+| **Word embeddings (word2vec)** | [word2vec](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleRec/word2vec) | Distributed training of Chinese word embeddings (single-machine multi-GPU and multi-machine), supporting mainstream embedding models (skip-gram, cbow, etc.); custom data can be used to train embeddings quickly. |
+| **Language model** | [Language_model](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/language_model) | A classic neural language model based on recurrent neural networks (RNN). |
+| **Sentiment classification** :fire: | [Senta](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/sentiment_classification), [EmotionDetection](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/emotion_detection) | The Senta (Sentiment Classification) and EmotionDetection projects provide sentiment analysis models for *general scenarios* and for *human-machine dialogue scenarios*, respectively. |
+| **Text similarity** :fire: | [SimNet](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/similarity_net) | SimNet (Similarity Net) provides efficient, reliable text-similarity tools and pretrained models. |
+| **Semantic representation** :fire: | [pretrain_langauge_models](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/pretrain_langauge_models) | Integrates popular Chinese and English pretrained models such as ELMo, BERT, ERNIE 1.0, ERNIE 2.0 and XLNet. |
+| **Text generation** | [seq2seq](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/PaddleTextGEN) | seq2seq provides a series of classic text generation models, such as vanilla seq2seq, seq2seq with attention and variational seq2seq. |
+| **Reading comprehension** | [machine_reading_comprehension](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/machine_reading_comprehension) | Paddle Machine Reading Comprehension collects Baidu's models, tools and open-source datasets for reading comprehension, including DuReader (Baidu's open-source large-scale Chinese reading comprehension dataset built from real search-user behavior), KT-Net (a knowledge-enhanced reading comprehension model that once ranked first on SQuAD and ReCoRD) and D-Net (a pretrain-finetune framework that won first place in the EMNLP 2019 MRQA international reading comprehension evaluation). |
+| **Dialogue systems** | [dialogue_system](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/dialogue_system) | Includes: 1) DGU (Dialogue General Understanding), covering common dialogue-system tasks such as context-response matching for **retrieval-based chat systems** and **intent detection**, **slot parsing**, **state tracking** for **task-oriented dialogue systems**, with the best results on 6 public international datasets. <br/> 2) knowledge-driven dialogue: Baidu's open-source knowledge-driven open-domain dialogue dataset, published at ACL 2019. <br/> 3) ADEM (Auto Dialogue Evaluation Model): automatically evaluates the response quality of different dialogue generation models. |
+| **Machine translation** | [machine_translation](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/machine_translation) | Paddle Machine Translation, a classic Transformer-based machine translation model. |
+| **Other frontier work** | [Research](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/Research) | Open-source releases of Baidu's latest research. |
@@ -70,13 +70,13 @@ cd models/PaddleNLP/sentiment_classification
 ```text
 .
 ├── Research                        # collection of Baidu NLP research work
-├── PaddleMT                        # machine translation: code, data, pretrained models
-├── PaddleDialogue                  # dialogue systems: code, data, pretrained models
-├── PaddleMRC                       # reading comprehension: code, data, pretrained models
-├── PaddleLARK                      # language representation toolkit
+├── machine_translation             # machine translation: code, data, pretrained models
+├── dialogue_system                 # dialogue systems: code, data, pretrained models
+├── machine_reading_comprehension   # reading comprehension: code, data, pretrained models
+├── pretrain_langauge_models        # language representation toolkit
 ├── language_model                  # language models
 ├── lexical_analysis                # LAC lexical analysis
-├── models                          # shared networks
+├── shared_modules/models           # shared networks
 │   ├── __init__.py
 │   ├── classification
 │   ├── dialogue_model_toolkit
@@ -87,7 +87,7 @@ cd models/PaddleNLP/sentiment_classification
 │   ├── representation
 │   ├── sequence_labeling
 │   └── transformer_encoder.py
-├── preprocess                      # shared text preprocessing tools
+├── shared_modules/preprocess       # shared text preprocessing tools
 │   ├── __init__.py
 │   ├── ernie
 │   ├── padding.py
...
@@ -21,8 +21,10 @@ from kpi import DurationKpi
 train_loss_card1 = CostKpi('train_loss_card1', 0.03, 0, actived=True)
 train_loss_card4 = CostKpi('train_loss_card4', 0.03, 0, actived=True)
-train_duration_card1 = DurationKpi('train_duration_card1', 0.01, 0, actived=True)
-train_duration_card4 = DurationKpi('train_duration_card4', 0.01, 0, actived=True)
+train_duration_card1 = DurationKpi(
+    'train_duration_card1', 0.01, 0, actived=True)
+train_duration_card4 = DurationKpi(
+    'train_duration_card4', 0.01, 0, actived=True)
 tracking_kpis = [
     train_loss_card1,
...
@@ -20,22 +20,25 @@ import sys
 import io
 import os
-URLLIB=urllib
+URLLIB = urllib
 if sys.version_info >= (3, 0):
     import urllib.request
-    URLLIB=urllib.request
+    URLLIB = urllib.request
-DATA_MODEL_PATH = {"DATA_PATH": "https://baidu-nlp.bj.bcebos.com/auto_dialogue_evaluation_dataset-1.0.0.tar.gz",
-                   "TRAINED_MODEL": "https://baidu-nlp.bj.bcebos.com/auto_dialogue_evaluation_models.2.0.0.tar.gz"}
+DATA_MODEL_PATH = {
+    "DATA_PATH":
+    "https://baidu-nlp.bj.bcebos.com/auto_dialogue_evaluation_dataset-1.0.0.tar.gz",
+    "TRAINED_MODEL":
+    "https://baidu-nlp.bj.bcebos.com/auto_dialogue_evaluation_models.2.0.0.tar.gz"
+}
-PATH_MAP = {'DATA_PATH': "./data/input",
-            'TRAINED_MODEL': './data/saved_models'}
+PATH_MAP = {'DATA_PATH': "./data/input", 'TRAINED_MODEL': './data/saved_models'}
 def un_tar(tar_name, dir_name):
     try:
         t = tarfile.open(tar_name)
-        t.extractall(path = dir_name)
+        t.extractall(path=dir_name)
         return True
     except Exception as e:
         print(e)
@@ -51,7 +54,8 @@ def download_model_and_data():
         shutil.rmtree(path)
     for path_key in DATA_MODEL_PATH:
         filename = os.path.basename(DATA_MODEL_PATH[path_key])
-        URLLIB.urlretrieve(DATA_MODEL_PATH[path_key], os.path.join("./", filename))
+        URLLIB.urlretrieve(DATA_MODEL_PATH[path_key],
+                           os.path.join("./", filename))
         state = un_tar(filename, PATH_MAP[path_key])
         if not state:
             print("Tar %s error....." % path_key)
...
@@ -122,5 +122,3 @@ def save_param(args, exe, program, dirname):
     print("save parameters at %s" % (os.path.join(param_dir, dirname)))
     return True
@@ -21,8 +21,7 @@ import paddle
 import paddle.fluid as fluid
-def create_net(
-        is_training,
+def create_net(is_training,
               model_input,
               args,
               clip_value=10.0,
@@ -52,14 +51,12 @@ def create_net(
             initializer=fluid.initializer.Normal(scale=0.1)))
     #fc to fit dynamic LSTM
-    context_fc = fluid.layers.fc(
-        input=context_emb,
+    context_fc = fluid.layers.fc(input=context_emb,
                                  size=args.hidden_size * 4,
                                  param_attr=fluid.ParamAttr(name='fc_weight'),
                                  bias_attr=fluid.ParamAttr(name='fc_bias'))
-    response_fc = fluid.layers.fc(
-        input=response_emb,
+    response_fc = fluid.layers.fc(input=response_emb,
                                   size=args.hidden_size * 4,
                                   param_attr=fluid.ParamAttr(name='fc_weight'),
                                   bias_attr=fluid.ParamAttr(name='fc_bias'))
@@ -106,7 +103,5 @@ def set_word_embedding(word_emb, place, word_emb_name="shared_word_emb"):
     """
     Set word embedding
     """
-    word_emb_param = fluid.global_scope().find_var(
-        word_emb_name).get_tensor()
+    word_emb_param = fluid.global_scope().find_var(word_emb_name).get_tensor()
     word_emb_param.set(word_emb, place)
@@ -42,22 +42,24 @@ def do_save_inference_model(args):
         with fluid.unique_name.guard():
             context_wordseq = fluid.data(
-                name='context_wordseq', shape=[-1, 1], dtype='int64', lod_level=1)
+                name='context_wordseq',
+                shape=[-1, 1],
+                dtype='int64',
+                lod_level=1)
             response_wordseq = fluid.data(
-                name='response_wordseq', shape=[-1, 1], dtype='int64', lod_level=1)
-            labels = fluid.data(
-                name='labels', shape=[-1, 1], dtype='int64')
+                name='response_wordseq',
+                shape=[-1, 1],
+                dtype='int64',
+                lod_level=1)
+            labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')
             input_inst = [context_wordseq, response_wordseq, labels]
             input_field = InputField(input_inst)
-            data_reader = fluid.io.PyReader(feed_list=input_inst,
-                capacity=4, iterable=False)
+            data_reader = fluid.io.PyReader(
+                feed_list=input_inst, capacity=4, iterable=False)
             logits = create_net(
-                is_training=False,
-                model_input=input_field,
-                args=args
-            )
+                is_training=False, model_input=input_field, args=args)
     if args.use_cuda:
         place = fluid.CUDAPlace(0)
@@ -81,9 +83,7 @@ def do_save_inference_model(args):
             input_field.context_wordseq.name,
             input_field.response_wordseq.name,
         ],
-        target_vars=[
-            logits,
-        ],
+        target_vars=[logits, ],
        executor=exe,
        main_program=test_prog,
        model_filename="model.pdmodel",
...
@@ -26,7 +26,6 @@ from inference_model import do_save_inference_model
 from ade.utils.configure import PDConfig
 if __name__ == "__main__":
     args = PDConfig(yaml_file="./data/config/ade.yaml")
...
@@ -46,22 +46,24 @@ def do_predict(args):
         with fluid.unique_name.guard():
             context_wordseq = fluid.data(
-                name='context_wordseq', shape=[-1, 1], dtype='int64', lod_level=1)
+                name='context_wordseq',
+                shape=[-1, 1],
+                dtype='int64',
+                lod_level=1)
             response_wordseq = fluid.data(
-                name='response_wordseq', shape=[-1, 1], dtype='int64', lod_level=1)
-            labels = fluid.data(
-                name='labels', shape=[-1, 1], dtype='int64')
+                name='response_wordseq',
+                shape=[-1, 1],
+                dtype='int64',
+                lod_level=1)
+            labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')
             input_inst = [context_wordseq, response_wordseq, labels]
             input_field = InputField(input_inst)
-            data_reader = fluid.io.PyReader(feed_list=input_inst,
-                capacity=4, iterable=False)
+            data_reader = fluid.io.PyReader(
+                feed_list=input_inst, capacity=4, iterable=False)
             logits = create_net(
-                is_training=False,
-                model_input=input_field,
-                args=args
-            )
+                is_training=False, model_input=input_field, args=args)
             logits.persistable = True
             fetch_list = [logits.name]
@@ -89,10 +91,7 @@ def do_predict(args):
         batch_size=args.batch_size)
     batch_generator = processor.data_generator(
-        place=place,
-        phase="test",
-        shuffle=False,
-        sample_pro=1)
+        place=place, phase="test", shuffle=False, sample_pro=1)
     num_test_examples = processor.get_num_examples(phase='test')
     data_reader.decorate_batch_generator(batch_generator)
@@ -107,7 +106,7 @@ def do_predict(args):
             data_reader.reset()
             break
-    scores = scores[: num_test_examples]
+    scores = scores[:num_test_examples]
     print("Write the predicted results into the output_prediction_file")
     fw = io.open(args.output_prediction_file, 'w', encoding="utf8")
     for index, score in enumerate(scores):
...
@@ -49,22 +49,24 @@ def do_train(args):
         with fluid.unique_name.guard():
             context_wordseq = fluid.data(
-                name='context_wordseq', shape=[-1, 1], dtype='int64', lod_level=1)
+                name='context_wordseq',
+                shape=[-1, 1],
+                dtype='int64',
+                lod_level=1)
             response_wordseq = fluid.data(
-                name='response_wordseq', shape=[-1, 1], dtype='int64', lod_level=1)
-            labels = fluid.data(
-                name='labels', shape=[-1, 1], dtype='int64')
+                name='response_wordseq',
+                shape=[-1, 1],
+                dtype='int64',
+                lod_level=1)
+            labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')
             input_inst = [context_wordseq, response_wordseq, labels]
             input_field = InputField(input_inst)
-            data_reader = fluid.io.PyReader(feed_list=input_inst,
-                capacity=4, iterable=False)
+            data_reader = fluid.io.PyReader(
+                feed_list=input_inst, capacity=4, iterable=False)
             loss = create_net(
-                is_training=True,
-                model_input=input_field,
-                args=args
-            )
+                is_training=True, model_input=input_field, args=args)
             loss.persistable = True
             # gradient clipping
             fluid.clip.set_gradient_clip(clip=fluid.clip.GradientClipByValue(
@@ -74,7 +76,8 @@ def do_train(args):
     if args.use_cuda:
         dev_count = fluid.core.get_cuda_device_count()
-        place = fluid.CUDAPlace(int(os.getenv('FLAGS_selected_gpus', '0')))
+        place = fluid.CUDAPlace(
+            int(os.getenv('FLAGS_selected_gpus', '0')))
     else:
         dev_count = int(os.environ.get('CPU_NUM', 1))
         place = fluid.CPUPlace()
@@ -114,9 +117,14 @@ def do_train(args):
     if args.word_emb_init:
         print("start loading word embedding init ...")
         if six.PY2:
-            word_emb = np.array(pickle.load(io.open(args.word_emb_init, 'rb'))).astype('float32')
+            word_emb = np.array(
+                pickle.load(io.open(args.word_emb_init, 'rb'))).astype(
+                    'float32')
         else:
-            word_emb = np.array(pickle.load(io.open(args.word_emb_init, 'rb'), encoding="bytes")).astype('float32')
+            word_emb = np.array(
+                pickle.load(
+                    io.open(args.word_emb_init, 'rb'),
+                    encoding="bytes")).astype('float32')
         set_word_embedding(word_emb, place)
         print("finish init word embedding ...")
@@ -147,15 +155,20 @@ def do_train(args):
                     used_time = time_end - time_begin
                     current_time = time.strftime('%Y-%m-%d %H:%M:%S',
                                                  time.localtime(time.time()))
-                    print('%s epoch: %d, step: %s, avg loss %s, speed: %f steps/s' % (current_time, epoch_step, steps, sum_loss / args.print_steps, args.print_steps / used_time))
+                    print(
+                        '%s epoch: %d, step: %s, avg loss %s, speed: %f steps/s'
+                        % (current_time, epoch_step, steps, sum_loss /
+                           args.print_steps, args.print_steps / used_time))
                    sum_loss = 0.0
                    time_begin = time.time()
                if steps % args.save_steps == 0:
                    if args.save_checkpoint:
-                        save_load_io.save_checkpoint(args, exe, train_prog, "step_" + str(steps))
+                        save_load_io.save_checkpoint(args, exe, train_prog,
+                                                     "step_" + str(steps))
                    if args.save_param:
-                        save_load_io.save_param(args, exe, train_prog, "step_" + str(steps))
+                        save_load_io.save_param(args, exe, train_prog,
+                                                "step_" + str(steps))
                steps += 1
         except fluid.core.EOFException:
             data_reader.reset()
...
@@ -20,12 +20,18 @@ from kpi import CostKpi
 from kpi import DurationKpi
 from kpi import AccKpi
-each_step_duration_atis_slot_card1 = DurationKpi('each_step_duration_atis_slot_card1', 0.01, 0, actived=True)
-train_loss_atis_slot_card1 = CostKpi('train_loss_atis_slot_card1', 0.08, 0, actived=True)
-train_acc_atis_slot_card1 = CostKpi('train_acc_atis_slot_card1', 0.01, 0, actived=True)
-each_step_duration_atis_slot_card4 = DurationKpi('each_step_duration_atis_slot_card4', 0.06, 0, actived=True)
-train_loss_atis_slot_card4 = CostKpi('train_loss_atis_slot_card4', 0.03, 0, actived=True)
-train_acc_atis_slot_card4 = CostKpi('train_acc_atis_slot_card4', 0.01, 0, actived=True)
+each_step_duration_atis_slot_card1 = DurationKpi(
+    'each_step_duration_atis_slot_card1', 0.01, 0, actived=True)
+train_loss_atis_slot_card1 = CostKpi(
+    'train_loss_atis_slot_card1', 0.08, 0, actived=True)
+train_acc_atis_slot_card1 = CostKpi(
+    'train_acc_atis_slot_card1', 0.01, 0, actived=True)
+each_step_duration_atis_slot_card4 = DurationKpi(
+    'each_step_duration_atis_slot_card4', 0.06, 0, actived=True)
+train_loss_atis_slot_card4 = CostKpi(
+    'train_loss_atis_slot_card4', 0.03, 0, actived=True)
+train_acc_atis_slot_card4 = CostKpi(
+    'train_acc_atis_slot_card4', 0.01, 0, actived=True)
 tracking_kpis = [
     each_step_duration_atis_slot_card1,
...
@@ -100,8 +100,12 @@ def prepare_batch_data(task_name,
     if isinstance(insts[0][3], list):
         if task_name == "atis_slot":
-            labels_list = [inst[3] + [0] * (max_len - len(inst[3])) for inst in insts]
-            labels_list = [np.array(labels_list).astype("int64").reshape([-1, max_len])]
+            labels_list = [
+                inst[3] + [0] * (max_len - len(inst[3])) for inst in insts
+            ]
+            labels_list = [
+                np.array(labels_list).astype("int64").reshape([-1, max_len])
+            ]
         elif task_name == "dstc2":
             labels_list = [inst[3] for inst in insts]
             labels_list = [np.array(labels_list).astype("int64")]
@@ -124,10 +128,7 @@ def prepare_batch_data(task_name,
     out = batch_src_ids
     # Second step: padding
     src_id, self_input_mask = pad_batch_data(
-        out,
-        max_len,
-        pad_idx=pad_id,
-        return_input_mask=True)
+        out, max_len, pad_idx=pad_id, return_input_mask=True)
     pos_id = pad_batch_data(
         batch_pos_ids,
         max_len,
@@ -163,13 +164,13 @@ def pad_batch_data(insts,
     corresponding position data and attention bias.
     """
     return_list = []
-    max_len = max_len_in if max_len_in != -1 else max(len(inst) for inst in insts)
+    max_len = max_len_in if max_len_in != -1 else max(
+        len(inst) for inst in insts)
     # Any token included in dict can be used to pad, since the paddings' loss
     # will be masked out by weights and make no effect on parameter gradients.
     inst_data = np.array(
-        [inst + list([pad_idx] * (max_len - len(inst))) for inst in insts
-        ])
+        [inst + list([pad_idx] * (max_len - len(inst))) for inst in insts])
     return_list += [inst_data.astype("int64").reshape([-1, max_len])]
     # position data
...
@@ -25,18 +25,21 @@ class DefinePredict(object):
     """
     Packaging Prediction Results
     """
     def __init__(self):
         """
         init
         """
-        self.task_map = {'udc': 'get_matching_res',
-                         'swda': 'get_cls_res',
-                         'mrda': 'get_cls_res',
-                         'atis_intent': 'get_cls_res',
-                         'atis_slot': 'get_sequence_tagging',
-                         'dstc2': 'get_multi_cls_res',
-                         'dstc2_asr': 'get_multi_cls_res',
-                         'multi-woz': 'get_multi_cls_res'}
+        self.task_map = {
+            'udc': 'get_matching_res',
+            'swda': 'get_cls_res',
+            'mrda': 'get_cls_res',
+            'atis_intent': 'get_cls_res',
+            'atis_slot': 'get_sequence_tagging',
+            'dstc2': 'get_multi_cls_res',
+            'dstc2_asr': 'get_multi_cls_res',
+            'multi-woz': 'get_multi_cls_res'
+        }
     def get_matching_res(self, probs, params=None):
         """
@@ -79,7 +82,3 @@ class DefinePredict(object):
         label_str = " ".join([str(l) for l in sorted(labels)])
         return label_str
@@ -20,25 +20,29 @@ import sys
 import io
 import os
-URLLIB=urllib
+URLLIB = urllib
 if sys.version_info >= (3, 0):
     import urllib.request
-    URLLIB=urllib.request
+    URLLIB = urllib.request
-DATA_MODEL_PATH = {"DATA_PATH": "https://baidu-nlp.bj.bcebos.com/dmtk_data_1.0.0.tar.gz",
-                   "PRETRAIN_MODEL": "https://bert-models.bj.bcebos.com/uncased_L-12_H-768_A-12.tar.gz",
-                   "TRAINED_MODEL": "https://baidu-nlp.bj.bcebos.com/dgu_models_2.0.0.tar.gz"}
+DATA_MODEL_PATH = {
+    "DATA_PATH": "https://baidu-nlp.bj.bcebos.com/dmtk_data_1.0.0.tar.gz",
+    "PRETRAIN_MODEL":
+    "https://bert-models.bj.bcebos.com/uncased_L-12_H-768_A-12.tar.gz",
+    "TRAINED_MODEL": "https://baidu-nlp.bj.bcebos.com/dgu_models_2.0.0.tar.gz"
+}
-PATH_MAP = {'DATA_PATH': "./data/input",
-            'PRETRAIN_MODEL': './data/pretrain_model',
-            'TRAINED_MODEL': './data/saved_models'}
+PATH_MAP = {
+    'DATA_PATH': "./data/input",
+    'PRETRAIN_MODEL': './data/pretrain_model',
+    'TRAINED_MODEL': './data/saved_models'
+}
 def un_tar(tar_name, dir_name):
     try:
         t = tarfile.open(tar_name)
-        t.extractall(path = dir_name)
+        t.extractall(path=dir_name)
         return True
     except Exception as e:
         print(e)
@@ -48,13 +52,18 @@ def un_tar(tar_name, dir_name):
 def download_model_and_data():
     print("Downloading dgu data, pretrain model and trained models......")
     print("This process is quite long, please wait patiently............")
-    for path in ['./data/input/data', './data/pretrain_model/uncased_L-12_H-768_A-12', './data/saved_models/trained_models']:
+    for path in [
+            './data/input/data',
+            './data/pretrain_model/uncased_L-12_H-768_A-12',
+            './data/saved_models/trained_models'
+    ]:
         if not os.path.exists(path):
             continue
         shutil.rmtree(path)
     for path_key in DATA_MODEL_PATH:
         filename = os.path.basename(DATA_MODEL_PATH[path_key])
-        URLLIB.urlretrieve(DATA_MODEL_PATH[path_key], os.path.join("./", filename))
+        URLLIB.urlretrieve(DATA_MODEL_PATH[path_key],
+                           os.path.join("./", filename))
         state = un_tar(filename, PATH_MAP[path_key])
         if not state:
             print("Tar %s error....." % path_key)
...
@@ -19,6 +19,3 @@ python run_build_data.py udc
 python run_build_data.py atis
 The generated slot-filling data goes to dialogue_general_understanding/data/input/data/atis/atis_slot
 The generated intent-detection data goes to dialogue_general_understanding/data/input/data/atis/atis_intent
@@ -12,7 +12,6 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
 """build swda train dev test dataset"""
 import json
@@ -27,6 +26,7 @@ class ATIS(object):
     """
     nlu dataset atis data process
     """
     def __init__(self):
         """
         init instance
@@ -73,7 +73,8 @@ class ATIS(object):
                 if example[1] not in self.intent_dict:
                     self.intent_dict[example[1]] = self.intent_id
                     self.intent_id += 1
-                fw.write(u"%s\t%s\n" % (self.intent_dict[example[1]], example[0].lower()))
+                fw.write(u"%s\t%s\n" %
+                         (self.intent_dict[example[1]], example[0].lower()))
         fw = io.open(self.map_tag_intent, 'w', encoding="utf8")
         for tag in self.intent_dict:
@@ -109,17 +110,19 @@ class ATIS(object):
                     tags_slot.append(str(self.slot_dict[tag]))
                 if i == 0:
                     if start not in [0, 1]:
-                        prefix_num = len(text[: start].strip().split())
+                        prefix_num = len(text[:start].strip().split())
                         tags.extend([str(self.slot_dict['O'])] * prefix_num)
                     tags.extend(tags_slot)
                 else:
-                    prefix_num = len(text[entities[i - 1]['end']: start].strip().split())
+                    prefix_num = len(text[entities[i - 1]['end']:start].strip()
+                                     .split())
                     tags.extend([str(self.slot_dict['O'])] * prefix_num)
                     tags.extend(tags_slot)
             if entities[-1]['end'] < len(text):
                 suffix_num = len(text[entities[-1]['end']:].strip().split())
                 tags.extend([str(self.slot_dict['O'])] * suffix_num)
-            fw.write(u"%s\t%s\n" % (text.encode('utf8'), " ".join(tags).encode('utf8')))
+            fw.write(u"%s\t%s\n" %
+                     (text.encode('utf8'), " ".join(tags).encode('utf8')))
         fw = io.open(self.map_tag_slot, 'w', encoding="utf8")
         for slot in self.slot_dict:
@@ -152,7 +155,3 @@ class ATIS(object):
 if __name__ == "__main__":
     atis_inst = ATIS()
     atis_inst.main()
@@ -28,6 +28,7 @@ class DSTC2(object):
     """
     dialogue state tracking dstc2 data process
     """
     def __init__(self):
         """
         init instance
@@ -49,7 +50,8 @@ class DSTC2(object):
         self.data_dict = commonlib.load_dict(self.data_list)
         for data_type in self.data_dict:
             for i in range(len(self.data_dict[data_type])):
-                self.data_dict[data_type][i] = os.path.join(self.src_dir, self.data_dict[data_type][i])
+                self.data_dict[data_type][i] = os.path.join(
+                    self.src_dir, self.data_dict[data_type][i])
     def _load_ontology(self):
         """
@@ -97,15 +99,25 @@ class DSTC2(object):
                 log_turn = log_json["turns"][i]
                 label_turn = label_json["turns"][i]
                 assert log_turn["turn-index"] == label_turn["turn-index"]
-                labels = ["%s_%s" % (slot, label_turn["goal-labels"][slot]) for slot in label_turn["goal-labels"]]
-                labels_ids = " ".join([str(self.map_tag_dict.get(label, self.map_tag_dict["%s_none" % label.split('_')[0]])) for label in labels])
+                labels = [
+                    "%s_%s" % (slot, label_turn["goal-labels"][slot])
+                    for slot in label_turn["goal-labels"]
+                ]
+                labels_ids = " ".join([
+                    str(
+                        self.map_tag_dict.get(label, self.map_tag_dict[
+                            "%s_none" % label.split('_')[0]]))
+                    for label in labels
+                ])
                 mach = log_turn['output']['transcript']
                 user = label_turn['transcription']
                 if not labels_ids.strip():
                     labels_ids = self.map_tag_dict['none']
                 out = "%s\t%s\1%s\t%s" % (session_id, mach, user, labels_ids)
-                user_asr = log_turn['input']['live']['asr-hyps'][0]['asr-hyp'].strip()
-                out_asr = "%s\t%s\1%s\t%s" % (session_id, mach, user_asr, labels_ids)
+                user_asr = log_turn['input']['live']['asr-hyps'][0][
+                    'asr-hyp'].strip()
+                out_asr = "%s\t%s\1%s\t%s" % (session_id, mach, user_asr,
+                                              labels_ids)
                 fw.write(u"%s\n" % out.encode('utf8'))
                 fw_asr.write(u"%s\n" % out_asr.encode('utf8'))
@@ -144,10 +156,7 @@ class DSTC2(object):
         self.get_test_dataset()
         self.get_labels()
 if __name__ == "__main__":
     dstc_inst = DSTC2()
     dstc_inst.main()
@@ -27,6 +27,7 @@ class MRDA(object):
     """
     dialogue act dataset mrda data process
     """
     def __init__(self):
         """
         init instance
@@ -67,7 +68,7 @@ class MRDA(object):
         for dadb_key in dadb_list:
             dadb_file = self.dadb_dict[dadb_key]
             fr = io.open(dadb_file, 'r', encoding="utf8")
-            row = csv.reader(fr, delimiter = ',')
+            row = csv.reader(fr, delimiter=',')
             for line in row:
                 elems = line
                 conv_id = elems[2]
@@ -87,7 +88,7 @@ class MRDA(object):
         for trans_key in trans_list:
             trans_file = self.trans_dict[trans_key]
             fr = io.open(trans_file, 'r', encoding="utf8")
-            row = csv.reader(fr, delimiter = ',')
+            row = csv.reader(fr, delimiter=',')
             for line in row:
                 elems = line
                 if len(elems) != 3:
@@ -120,7 +121,8 @@ class MRDA(object):
                 self.tag_id += 1
             caller = elem.split('_')[0].split('-')[-1]
             conv_no = elem.split('_')[0].split('-')[0]
-            out = "%s\t%s\t%s\t%s" % (conv_no, self.map_tag_dict[tag], caller, v_trans[0])
+            out = "%s\t%s\t%s\t%s" % (conv_no, self.map_tag_dict[tag], caller,
+                                      v_trans[0])
             fw.write(u"%s\n" % out)
     def get_train_dataset(self):
@@ -158,10 +160,7 @@ class MRDA(object):
         self.get_test_dataset()
         self.get_labels()
 if __name__ == "__main__":
     mrda_inst = MRDA()
     mrda_inst.main()
@@ -27,6 +27,7 @@ class SWDA(object):
     """
     dialogue act dataset swda data process
     """
     def __init__(self):
         """
         init instance
@@ -63,7 +64,7 @@ class SWDA(object):
             file_path = self.file_dict[name]
             fr = io.open(file_path, 'r', encoding="utf8")
             idx = 0
-            row = csv.reader(fr, delimiter = ',')
+            row = csv.reader(fr, delimiter=',')
             for r in row:
                 if idx == 0:
                     idx += 1
@@ -224,10 +225,7 @@ class SWDA(object):
         self.get_test_dataset()
         self.get_labels()
 if __name__ == "__main__":
     swda_inst = SWDA()
     swda_inst.main()
@@ -71,6 +71,3 @@ def load_voc(conf):
         elems = line.split('\t')
         map_dict[elems[0]] = elems[1]
     return map_dict
@@ -20,7 +20,6 @@ from build_dstc2_dataset import DSTC2
 from build_mrda_dataset import MRDA
 from build_swda_dataset import SWDA
 if __name__ == "__main__":
     task_name = sys.argv[1]
     task_name = task_name.lower()
@@ -38,11 +37,12 @@ if __name__ == "__main__":
     elif task_name == 'atis':
         atis_inst = ATIS()
         atis_inst.main()
-        shutil.copyfile("../../data/input/data/atis/atis_slot/test.txt", "../../data/input/data/atis/atis_slot/dev.txt")
-        shutil.copyfile("../../data/input/data/atis/atis_intent/test.txt", "../../data/input/data/atis/atis_intent/dev.txt")
+        shutil.copyfile("../../data/input/data/atis/atis_slot/test.txt",
+                        "../../data/input/data/atis/atis_slot/dev.txt")
+        shutil.copyfile("../../data/input/data/atis/atis_intent/test.txt",
+                        "../../data/input/data/atis/atis_intent/dev.txt")
     elif task_name == 'dstc2':
         dstc_inst = DSTC2()
         dstc_inst.main()
     else:
         exit(0)
@@ -12,7 +12,6 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
 """Tokenization classes."""
 from __future__ import absolute_import
...
@@ -113,7 +113,7 @@ def multi_head_attention(queries,
         """
         Scaled Dot-Product Attention
         """
-        scaled_q = layers.scale(x=q, scale=d_key ** -0.5)
+        scaled_q = layers.scale(x=q, scale=d_key**-0.5)
         product = layers.matmul(x=scaled_q, y=k, transpose_y=True)
         if attn_bias:
             product += attn_bias
...
@@ -122,5 +122,3 @@ def save_param(args, exe, program, dirname):
     print("save parameters at %s" % (os.path.join(param_dir, dirname)))
     return True
@@ -23,12 +23,7 @@ from dgu.bert import BertModel
 from dgu.utils.configure import JsonConfig
-def create_net(
-        is_training,
-        model_input,
-        num_labels,
-        paradigm_inst,
-        args):
+def create_net(is_training, model_input, num_labels, paradigm_inst, args):
     """create dialogue task model"""
     src_ids = model_input.src_ids
@@ -48,14 +43,15 @@ def create_net(
         config=bert_conf,
         use_fp16=False)
-    params = {'num_labels': num_labels,
-              'src_ids': src_ids,
-              'pos_ids': pos_ids,
-              'sent_ids': sent_ids,
-              'input_mask': input_mask,
-              'labels': labels,
-              'is_training': is_training}
+    params = {
+        'num_labels': num_labels,
+        'src_ids': src_ids,
+        'pos_ids': pos_ids,
+        'sent_ids': sent_ids,
+        'input_mask': input_mask,
+        'labels': labels,
+        'is_training': is_training
+    }
     results = paradigm_inst.paradigm(bert, params)
     return results
@@ -66,7 +66,9 @@ def do_save_inference_model(args):
             sent_ids = fluid.data(
                 name='sent_ids', shape=[-1, args.max_seq_len], dtype='int64')
             input_mask = fluid.data(
-                name='input_mask', shape=[-1, args.max_seq_len], dtype='float32')
+                name='input_mask',
+                shape=[-1, args.max_seq_len],
+                dtype='float32')
             if args.task_name == 'atis_slot':
                 labels = fluid.data(
                     name='labels', shape=[-1, args.max_seq_len], dtype='int64')
@@ -74,8 +76,7 @@ def do_save_inference_model(args):
                 labels = fluid.data(
                     name='labels', shape=[-1, num_labels], dtype='int64')
             else:
-                labels = fluid.data(
-                    name='labels', shape=[-1, 1], dtype='int64')
+                labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')
             input_inst = [src_ids, pos_ids, sent_ids, input_mask, labels]
             input_field = InputField(input_inst)
@@ -107,14 +108,10 @@ def do_save_inference_model(args):
     fluid.io.save_inference_model(
         args.inference_model_dir,
         feeded_var_names=[
-            input_field.src_ids.name,
-            input_field.pos_ids.name,
-            input_field.sent_ids.name,
-            input_field.input_mask.name
+            input_field.src_ids.name, input_field.pos_ids.name,
+            input_field.sent_ids.name, input_field.input_mask.name
         ],
-        target_vars=[
-            probs
-        ],
+        target_vars=[probs],
        executor=exe,
        main_program=test_prog,
        model_filename="model.pdmodel",
...
@@ -26,7 +26,6 @@ from inference_model import do_save_inference_model
 from dgu.utils.configure import PDConfig
 if __name__ == "__main__":
     args = PDConfig(yaml_file="./data/config/dgu.yaml")
...
@@ -66,7 +66,9 @@ def do_train(args):
             sent_ids = fluid.data(
                 name='sent_ids', shape=[-1, args.max_seq_len], dtype='int64')
             input_mask = fluid.data(
-                name='input_mask', shape=[-1, args.max_seq_len], dtype='float32')
+                name='input_mask',
+                shape=[-1, args.max_seq_len],
+                dtype='float32')
             if args.task_name == 'atis_slot':
                 labels = fluid.data(
                     name='labels', shape=[-1, args.max_seq_len], dtype='int64')
@@ -74,13 +76,12 @@ def do_train(args):
                 labels = fluid.data(
                     name='labels', shape=[-1, num_labels], dtype='int64')
             else:
-                labels = fluid.data(
-                    name='labels', shape=[-1, 1], dtype='int64')
+                labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')
             input_inst = [src_ids, pos_ids, sent_ids, input_mask, labels]
             input_field = InputField(input_inst)
-            data_reader = fluid.io.PyReader(feed_list=input_inst,
-                capacity=4, iterable=False)
+            data_reader = fluid.io.PyReader(
+                feed_list=input_inst, capacity=4, iterable=False)
             processor = processors[task_name](data_dir=args.data_dir,
                                               vocab_path=args.vocab_path,
                                               max_seq_len=args.max_seq_len,
@@ -113,9 +114,7 @@ def do_train(args):
         dev_count = int(os.environ.get('CPU_NUM', 1))
     batch_generator = processor.data_generator(
-        batch_size=args.batch_size,
-        phase='train',
-        shuffle=True)
+        batch_size=args.batch_size, phase='train', shuffle=True)
     num_train_examples = processor.get_num_examples(phase='train')
     if args.in_tokens:
@@ -217,37 +216,32 @@ def do_train(args):
                     current_time = time.strftime('%Y-%m-%d %H:%M:%S',
                                                  time.localtime(time.time()))
                     if accuracy is not None:
-                        print(
-                            "%s epoch: %d, step: %d, ave loss: %f, "
-                            "ave acc: %f, speed: %f steps/s" %
-                            (current_time, epoch_step, steps,
-                             np.mean(np_loss),
-                             np.mean(np_acc),
-                             args.print_steps / used_time))
+                        print("%s epoch: %d, step: %d, ave loss: %f, "
+                              "ave acc: %f, speed: %f steps/s" %
+                              (current_time, epoch_step, steps,
+                               np.mean(np_loss), np.mean(np_acc),
+                               args.print_steps / used_time))
                        ce_info.append([
-                            np.mean(np_loss),
-                            np.mean(np_acc),
+                            np.mean(np_loss), np.mean(np_acc),
                            args.print_steps / used_time
                        ])
                    else:
-                        print(
-                            "%s epoch: %d, step: %d, ave loss: %f, "
-                            "speed: %f steps/s" %
-                            (current_time, epoch_step, steps,
-                             np.mean(np_loss),
-                             args.print_steps / used_time))
-                        ce_info.append([
-                            np.mean(np_loss),
-                            args.print_steps / used_time
-                        ])
+                        print("%s epoch: %d, step: %d, ave loss: %f, "
+                              "speed: %f steps/s" %
+                              (current_time, epoch_step, steps,
+                               np.mean(np_loss), args.print_steps / used_time))
+                        ce_info.append(
+                            [np.mean(np_loss), args.print_steps / used_time])
                    time_begin = time.time()
                if steps % args.save_steps == 0:
                    save_path = "step_" + str(steps)
                    if args.save_checkpoint:
-                        save_load_io.save_checkpoint(args, exe, train_prog, save_path)
+                        save_load_io.save_checkpoint(args, exe, train_prog,
+                                                     save_path)
                    if args.save_param:
-                        save_load_io.save_param(args, exe, train_prog, save_path)
+                        save_load_io.save_param(args, exe, train_prog,
+                                                save_path)
         except fluid.core.EOFException:
             data_reader.reset()
...
@@ -19,8 +19,7 @@ from __future__ import print_function
 import os
 import sys
-sys.path.append("../")
+sys.path.append("../shared_modules/")
 import paddle
 import paddle.fluid as fluid
 import numpy as np
...
@@ -23,7 +23,7 @@ import os
 import time
 import multiprocessing
 import sys
-sys.path.append("../")
+sys.path.append("../shared_modules/")
 import paddle
 import paddle.fluid as fluid
...
@@ -24,7 +24,7 @@ import time
 import argparse
 import multiprocessing
 import sys
-sys.path.append("../")
+sys.path.append("../shared_modules/")
 import paddle
 import paddle.fluid as fluid
...
@@ -36,7 +36,7 @@ import sys
 if sys.version[0] == '2':
     reload(sys)
     sys.setdefaultencoding("utf-8")
-sys.path.append('../')
+sys.path.append('../shared_modules/')
 import os
 os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"
...
@@ -26,7 +26,7 @@ from paddle.fluid.initializer import NormalInitializer
 from reader import Dataset
 from ernie_reader import SequenceLabelReader
-sys.path.append("..")
+sys.path.append("../shared_modules/")
 from models.sequence_labeling import nets
 from models.representation.ernie import ernie_encoder, ernie_pyreader
@@ -35,7 +35,8 @@ def create_model(args, vocab_size, num_labels, mode='train'):
     """create lac model"""
     # model's input data
-    words = fluid.data(name='words', shape=[None, 1], dtype='int64', lod_level=1)
+    words = fluid.data(
+        name='words', shape=[None, 1], dtype='int64', lod_level=1)
     targets = fluid.data(
         name='targets', shape=[None, 1], dtype='int64', lod_level=1)
@@ -88,7 +89,8 @@ def create_pyreader(args,
                     return_reader=False,
                     mode='train'):
     # init reader
-    device_count = len(fluid.cuda_places()) if args.use_cuda else len(fluid.cpu_places())
+    device_count = len(fluid.cuda_places()) if args.use_cuda else len(
+        fluid.cpu_places())
     if model == 'lac':
         pyreader = fluid.io.DataLoader.from_generator(
@@ -107,14 +109,14 @@ def create_pyreader(args,
                     fluid.io.shuffle(
                         reader.file_reader(file_name),
                         buf_size=args.traindata_shuffle_buffer),
-                    batch_size=args.batch_size/device_count),
+                    batch_size=args.batch_size / device_count),
                 places=place)
         else:
             pyreader.set_sample_list_generator(
                 fluid.io.batch(
                     reader.file_reader(
                         file_name, mode=mode),
-                    batch_size=args.batch_size/device_count),
+                    batch_size=args.batch_size / device_count),
                 places=place)
     elif model == 'ernie':
...
@@ -20,7 +20,7 @@ import sys
 from collections import namedtuple
 import numpy as np
-sys.path.append("..")
+sys.path.append("../shared_modules/")
 from preprocess.ernie.task_reader import BaseReader, tokenization
...
@@ -24,7 +24,7 @@ import paddle
 import utils
 import reader
 import creator
-sys.path.append('../models/')
+sys.path.append('../shared_modules/models/')
 from model_check import check_cuda
 from model_check import check_version
...
@@ -10,7 +10,7 @@ import paddle.fluid as fluid
 import creator
 import reader
 import utils
-sys.path.append('../models/')
+sys.path.append('../shared_modules/models/')
 from model_check import check_cuda
 from model_check import check_version
...
@@ -24,7 +24,7 @@ import paddle
 import utils
 import reader
 import creator
-sys.path.append('../models/')
+sys.path.append('../shared_modules/models/')
 from model_check import check_cuda
 from model_check import check_version
...
@@ -34,7 +34,7 @@ import paddle.fluid as fluid
 import creator
 import utils
-sys.path.append("..")
+sys.path.append("../shared_modules/")
 from models.representation.ernie import ErnieConfig
 from models.model_check import check_cuda
 from models.model_check import check_version
@@ -187,8 +187,8 @@ def do_train(args):
                 end_time - start_time, train_pyreader.queue.size()))
             if steps % args.save_steps == 0:
-                save_path = os.path.join(args.model_save_dir, "step_" + str(steps),
-                                         "checkpoint")
+                save_path = os.path.join(args.model_save_dir,
+                                         "step_" + str(steps), "checkpoint")
                 print("\tsaving model as %s" % (save_path))
                 fluid.save(train_program, save_path)
@@ -199,6 +199,7 @@ def do_train(args):
                              "checkpoint")
     fluid.save(train_program, save_path)
 def do_eval(args):
     # init executor
     if args.use_cuda:
...
...@@ -29,7 +29,7 @@ import reader ...@@ -29,7 +29,7 @@ import reader
import utils import utils
import creator import creator
from eval import test_process from eval import test_process
sys.path.append('../models/') sys.path.append('../shared_modules/models/')
from model_check import check_cuda from model_check import check_cuda
from model_check import check_version from model_check import check_version
...@@ -151,8 +151,7 @@ def do_train(args): ...@@ -151,8 +151,7 @@ def do_train(args):
# save checkpoints # save checkpoints
if step % args.save_steps == 0 and step != 0: if step % args.save_steps == 0 and step != 0:
save_path = os.path.join(args.model_save_dir, save_path = os.path.join(args.model_save_dir,
"step_" + str(step), "step_" + str(step), "checkpoint")
"checkpoint")
fluid.save(train_program, save_path) fluid.save(train_program, save_path)
step += 1 step += 1
......
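Both hunks above only reflow the checkpoint path onto one line; the pattern itself is `fluid.save` writing into a step-numbered prefix. A minimal save/resume round trip under the same convention (the toy network and paths are illustrative):

```python
import os
import paddle.fluid as fluid

train_prog = fluid.Program()
startup_prog = fluid.Program()
with fluid.program_guard(train_prog, startup_prog):
    x = fluid.data(name="x", shape=[None, 4], dtype="float32")
    y = fluid.layers.fc(input=x, size=2)

exe = fluid.Executor(fluid.CPUPlace())
exe.run(startup_prog)

step = 100  # illustrative global step
save_path = os.path.join("./checkpoints", "step_" + str(step), "checkpoint")
fluid.save(train_prog, save_path)       # writes checkpoint.pdparams / .pdopt / .pdmodel
fluid.load(train_prog, save_path, exe)  # resume later from the same prefix
```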
...@@ -39,4 +39,3 @@ D-NET是一个以提升**阅读理解模型泛化能力**为目标的“预训 ...@@ -39,4 +39,3 @@ D-NET是一个以提升**阅读理解模型泛化能力**为目标的“预训
- Introduces multi-task, multi-domain learning strategies at the fine-tuning stage (based on the [PALM](https://github.com/PaddlePaddle/PALM) multi-task learning framework), which effectively improves the model's generalization across domains
Using the D-NET framework, Baidu took first place in the EMNLP 2019 [MRQA](https://mrqa.github.io/shared) international machine reading comprehension evaluation, beating the runner-up by nearly two percentage points and ranking first on 10 of the 12 test sets.
...@@ -12,6 +12,7 @@ ...@@ -12,6 +12,7 @@
# See the License for the specific language governing permissions and # See the License for the specific language governing permissions and
# limitations under the License. # limitations under the License.
def get_input_descs(args): def get_input_descs(args):
""" """
Generate a dict mapping data fields to the corresponding data shapes and Generate a dict mapping data fields to the corresponding data shapes and
...@@ -42,7 +43,8 @@ def get_input_descs(args): ...@@ -42,7 +43,8 @@ def get_input_descs(args):
# encoder. # encoder.
# The actual data shape of src_slf_attn_bias is: # The actual data shape of src_slf_attn_bias is:
# [batch_size, n_head, max_src_len_in_batch, max_src_len_in_batch] # [batch_size, n_head, max_src_len_in_batch, max_src_len_in_batch]
"src_slf_attn_bias": [(batch_size, n_head, seq_len, seq_len), "float32"], "src_slf_attn_bias":
[(batch_size, n_head, seq_len, seq_len), "float32"],
# The actual data shape of trg_word is: # The actual data shape of trg_word is:
# [batch_size, max_trg_len_in_batch, 1] # [batch_size, max_trg_len_in_batch, 1]
"trg_word": [(batch_size, seq_len), "int64", "trg_word": [(batch_size, seq_len), "int64",
...@@ -54,12 +56,14 @@ def get_input_descs(args): ...@@ -54,12 +56,14 @@ def get_input_descs(args):
# subsequent words in the decoder. # subsequent words in the decoder.
# The actual data shape of trg_slf_attn_bias is: # The actual data shape of trg_slf_attn_bias is:
# [batch_size, n_head, max_trg_len_in_batch, max_trg_len_in_batch] # [batch_size, n_head, max_trg_len_in_batch, max_trg_len_in_batch]
"trg_slf_attn_bias": [(batch_size, n_head, seq_len, seq_len), "float32"], "trg_slf_attn_bias":
[(batch_size, n_head, seq_len, seq_len), "float32"],
# This input is used to remove attention weights on paddings of the source # This input is used to remove attention weights on paddings of the source
# input in the encoder-decoder attention. # input in the encoder-decoder attention.
# The actual data shape of trg_src_attn_bias is: # The actual data shape of trg_src_attn_bias is:
# [batch_size, n_head, max_trg_len_in_batch, max_src_len_in_batch] # [batch_size, n_head, max_trg_len_in_batch, max_src_len_in_batch]
"trg_src_attn_bias": [(batch_size, n_head, seq_len, seq_len), "float32"], "trg_src_attn_bias":
[(batch_size, n_head, seq_len, seq_len), "float32"],
# This input is used in independent decoder program for inference. # This input is used in independent decoder program for inference.
# The actual data shape of enc_output is: # The actual data shape of enc_output is:
# [batch_size, max_src_len_in_batch, d_model] # [batch_size, max_src_len_in_batch, d_model]
...@@ -80,6 +84,7 @@ def get_input_descs(args): ...@@ -80,6 +84,7 @@ def get_input_descs(args):
return input_descs return input_descs
# Names of word embedding table which might be reused for weight sharing. # Names of word embedding table which might be reused for weight sharing.
word_emb_param_names = ( word_emb_param_names = (
"src_word_emb_table", "src_word_emb_table",
......
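The comments above describe attention-bias tensors of shape `[batch_size, n_head, max_len, max_len]` whose job is to keep attention weights off padding positions. A sketch of how such a bias is typically materialized, assuming the usual convention of a large negative constant (here `-1e9`) on masked positions:

```python
import numpy as np

def build_slf_attn_bias(seq_lens, max_len, n_head):
    """Illustrative: put -1e9 on padding columns so softmax ignores them."""
    batch = len(seq_lens)
    bias = np.zeros((batch, n_head, max_len, max_len), dtype="float32")
    for i, l in enumerate(seq_lens):
        bias[i, :, :, l:] = -1e9  # attending *to* padded positions is masked
    return bias

print(build_slf_attn_bias([2, 3], max_len=4, n_head=2).shape)  # (2, 2, 4, 4)
```

Added to the pre-softmax attention logits, the `-1e9` entries drive the corresponding weights to effectively zero.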
...@@ -87,7 +87,8 @@ def do_save_inference_model(args): ...@@ -87,7 +87,8 @@ def do_save_inference_model(args):
# saving inference model # saving inference model
fluid.io.save_inference_model(args.inference_model_dir, fluid.io.save_inference_model(
args.inference_model_dir,
feeded_var_names=list(input_field_names), feeded_var_names=list(input_field_names),
target_vars=[out_ids, out_scores], target_vars=[out_ids, out_scores],
executor=exe, executor=exe,
......
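For reference, the reflowed call above follows `fluid.io.save_inference_model`'s order of directory, input names, output variables, and executor. A minimal self-contained usage (toy network and path, for illustration only):

```python
import paddle.fluid as fluid

x = fluid.data(name="x", shape=[None, 4], dtype="float32")
out = fluid.layers.fc(input=x, size=2)
exe = fluid.Executor(fluid.CPUPlace())
exe.run(fluid.default_startup_program())

# dirname, input names, output vars, executor -- same order as above
fluid.io.save_inference_model(
    "./infer_model",
    feeded_var_names=[x.name],
    target_vars=[out],
    executor=exe)
```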
...@@ -25,7 +25,6 @@ from train import do_train ...@@ -25,7 +25,6 @@ from train import do_train
from predict import do_predict from predict import do_predict
from inference_model import do_save_inference_model from inference_model import do_save_inference_model
if __name__ == "__main__": if __name__ == "__main__":
LOG_FORMAT = "[%(asctime)s %(levelname)s %(filename)s:%(lineno)d] %(message)s" LOG_FORMAT = "[%(asctime)s %(levelname)s %(filename)s:%(lineno)d] %(message)s"
logging.basicConfig( logging.basicConfig(
......
...@@ -142,8 +142,8 @@ def do_train(args): ...@@ -142,8 +142,8 @@ def do_train(args):
## init from some checkpoint, to resume the previous training ## init from some checkpoint, to resume the previous training
if args.init_from_checkpoint: if args.init_from_checkpoint:
load(train_prog, os.path.join(args.init_from_checkpoint, "transformer"), load(train_prog,
exe) os.path.join(args.init_from_checkpoint, "transformer"), exe)
print("finish initing model from checkpoint from %s" % print("finish initing model from checkpoint from %s" %
(args.init_from_checkpoint)) (args.init_from_checkpoint))
...@@ -221,7 +221,6 @@ def do_train(args): ...@@ -221,7 +221,6 @@ def do_train(args):
"transformer") "transformer")
fluid.save(train_prog, model_path) fluid.save(train_prog, model_path)
batch_id += 1 batch_id += 1
step_idx += 1 step_idx += 1
total_batch_num = total_batch_num + 1 # this is for benchmark total_batch_num = total_batch_num + 1 # this is for benchmark
......
...@@ -25,7 +25,6 @@ from desc import * ...@@ -25,7 +25,6 @@ from desc import *
dropout_seed = None dropout_seed = None
def wrap_layer_with_block(layer, block_idx): def wrap_layer_with_block(layer, block_idx):
""" """
Make layer define support indicating block, by which we can add layers Make layer define support indicating block, by which we can add layers
...@@ -300,12 +299,13 @@ def prepare_encoder_decoder(src_word, ...@@ -300,12 +299,13 @@ def prepare_encoder_decoder(src_word,
src_word, src_word,
size=[src_vocab_size, src_emb_dim], size=[src_vocab_size, src_emb_dim],
padding_idx=bos_idx, # set embedding of bos to 0 padding_idx=bos_idx, # set embedding of bos to 0
param_attr=fluid.ParamAttr(name=word_emb_param_name, param_attr=fluid.ParamAttr(
initializer=fluid.initializer.Normal( name=word_emb_param_name,
0., src_emb_dim**-0.5))) initializer=fluid.initializer.Normal(0., src_emb_dim**-0.5)))
src_word_emb = layers.scale(x=src_word_emb, scale=src_emb_dim**0.5) src_word_emb = layers.scale(x=src_word_emb, scale=src_emb_dim**0.5)
src_pos_enc = fluid.embedding(src_pos, src_pos_enc = fluid.embedding(
src_pos,
size=[src_max_len, src_emb_dim], size=[src_max_len, src_emb_dim],
param_attr=fluid.ParamAttr( param_attr=fluid.ParamAttr(
name=pos_enc_param_name, trainable=False)) name=pos_enc_param_name, trainable=False))
...@@ -477,7 +477,8 @@ def decoder(dec_input, ...@@ -477,7 +477,8 @@ def decoder(dec_input,
The decoder is composed of a stack of identical decoder_layer layers. The decoder is composed of a stack of identical decoder_layer layers.
""" """
for i in range(n_layer): for i in range(n_layer):
dec_output = decoder_layer(dec_input, dec_output = decoder_layer(
dec_input,
enc_output, enc_output,
dec_slf_attn_bias, dec_slf_attn_bias,
dec_enc_attn_bias, dec_enc_attn_bias,
...@@ -491,8 +492,7 @@ def decoder(dec_input, ...@@ -491,8 +492,7 @@ def decoder(dec_input,
relu_dropout, relu_dropout,
preprocess_cmd, preprocess_cmd,
postprocess_cmd, postprocess_cmd,
cache=None if caches is None else cache=None if caches is None else (caches[i], i))
(caches[i], i))
dec_input = dec_output dec_input = dec_output
dec_output = pre_process_layer(dec_output, preprocess_cmd, dec_output = pre_process_layer(dec_output, preprocess_cmd,
prepostprocess_dropout) prepostprocess_dropout)
...@@ -530,7 +530,8 @@ def transformer(model_input, ...@@ -530,7 +530,8 @@ def transformer(model_input,
label = model_input.lbl_word label = model_input.lbl_word
weights = model_input.lbl_weight weights = model_input.lbl_weight
enc_output = wrap_encoder(enc_inputs, enc_output = wrap_encoder(
enc_inputs,
src_vocab_size, src_vocab_size,
max_length, max_length,
n_layer, n_layer,
...@@ -547,7 +548,8 @@ def transformer(model_input, ...@@ -547,7 +548,8 @@ def transformer(model_input,
weight_sharing, weight_sharing,
bos_idx=bos_idx) bos_idx=bos_idx)
predict = wrap_decoder(dec_inputs, predict = wrap_decoder(
dec_inputs,
trg_vocab_size, trg_vocab_size,
max_length, max_length,
n_layer, n_layer,
...@@ -569,8 +571,9 @@ def transformer(model_input, ...@@ -569,8 +571,9 @@ def transformer(model_input,
if label_smooth_eps: if label_smooth_eps:
# TODO: use fluid.input.one_hot after softmax_with_cross_entropy removing # TODO: use fluid.input.one_hot after softmax_with_cross_entropy removing
# the enforcement that the last dimension of label must be 1. # the enforcement that the last dimension of label must be 1.
label = layers.label_smooth(label=layers.one_hot(input=label, label = layers.label_smooth(
depth=trg_vocab_size), label=layers.one_hot(
input=label, depth=trg_vocab_size),
epsilon=label_smooth_eps) epsilon=label_smooth_eps)
cost = layers.softmax_with_cross_entropy( cost = layers.softmax_with_cross_entropy(
...@@ -714,7 +717,8 @@ def fast_decode(model_input, src_vocab_size, trg_vocab_size, max_in_len, ...@@ -714,7 +717,8 @@ def fast_decode(model_input, src_vocab_size, trg_vocab_size, max_in_len,
dec_inputs = (model_input.trg_word, model_input.init_score, dec_inputs = (model_input.trg_word, model_input.init_score,
model_input.init_idx, model_input.trg_src_attn_bias) model_input.init_idx, model_input.trg_src_attn_bias)
enc_output = wrap_encoder(enc_inputs, enc_output = wrap_encoder(
enc_inputs,
src_vocab_size, src_vocab_size,
max_in_len, max_in_len,
n_layer, n_layer,
...@@ -763,9 +767,15 @@ def fast_decode(model_input, src_vocab_size, trg_vocab_size, max_in_len, ...@@ -763,9 +767,15 @@ def fast_decode(model_input, src_vocab_size, trg_vocab_size, max_in_len,
dtype=enc_output.dtype, dtype=enc_output.dtype,
value=0), value=0),
"static_k": # for encoder-decoder attention "static_k": # for encoder-decoder attention
fluid.data(shape=[None, n_head, 0, d_key], dtype=enc_output.dtype, name=("static_k_%d"%i)), fluid.data(
shape=[None, n_head, 0, d_key],
dtype=enc_output.dtype,
name=("static_k_%d" % i)),
"static_v": # for encoder-decoder attention "static_v": # for encoder-decoder attention
fluid.data(shape=[None, n_head, 0, d_value], dtype=enc_output.dtype, name=("static_v_%d"%i)), fluid.data(
shape=[None, n_head, 0, d_value],
dtype=enc_output.dtype,
name=("static_v_%d" % i)),
} for i in range(n_layer) } for i in range(n_layer)
] ]
...@@ -780,8 +790,8 @@ def fast_decode(model_input, src_vocab_size, trg_vocab_size, max_in_len, ...@@ -780,8 +790,8 @@ def fast_decode(model_input, src_vocab_size, trg_vocab_size, max_in_len,
# gather cell states corresponding to selected parent # gather cell states corresponding to selected parent
pre_caches = map_structure( pre_caches = map_structure(
lambda x: layers.gather(x, index=gather_idx), caches) lambda x: layers.gather(x, index=gather_idx), caches)
pre_src_attn_bias = layers.gather(trg_src_attn_bias, pre_src_attn_bias = layers.gather(
index=gather_idx) trg_src_attn_bias, index=gather_idx)
pre_pos = layers.elementwise_mul( pre_pos = layers.elementwise_mul(
x=layers.fill_constant_batch_size_like( x=layers.fill_constant_batch_size_like(
input=pre_src_attn_bias, # can't use lod tensor here input=pre_src_attn_bias, # can't use lod tensor here
...@@ -790,7 +800,8 @@ def fast_decode(model_input, src_vocab_size, trg_vocab_size, max_in_len, ...@@ -790,7 +800,8 @@ def fast_decode(model_input, src_vocab_size, trg_vocab_size, max_in_len,
dtype=pre_ids.dtype), dtype=pre_ids.dtype),
y=step_idx, y=step_idx,
axis=0) axis=0)
logits = wrap_decoder((pre_ids, pre_pos, None, pre_src_attn_bias), logits = wrap_decoder(
(pre_ids, pre_pos, None, pre_src_attn_bias),
trg_vocab_size, trg_vocab_size,
max_in_len, max_in_len,
n_layer, n_layer,
...@@ -811,9 +822,8 @@ def fast_decode(model_input, src_vocab_size, trg_vocab_size, max_in_len, ...@@ -811,9 +822,8 @@ def fast_decode(model_input, src_vocab_size, trg_vocab_size, max_in_len,
# intra-beam topK # intra-beam topK
topk_scores, topk_indices = layers.topk( topk_scores, topk_indices = layers.topk(
input=layers.softmax(logits), k=beam_size) input=layers.softmax(logits), k=beam_size)
accu_scores = layers.elementwise_add(x=layers.log(topk_scores), accu_scores = layers.elementwise_add(
y=pre_scores, x=layers.log(topk_scores), y=pre_scores, axis=0)
axis=0)
# beam_search op uses lod to differentiate branches. # beam_search op uses lod to differentiate branches.
accu_scores = layers.lod_reset(accu_scores, pre_ids) accu_scores = layers.lod_reset(accu_scores, pre_ids)
# topK reduction across beams, also contain special handle of # topK reduction across beams, also contain special handle of
...@@ -832,11 +842,12 @@ def fast_decode(model_input, src_vocab_size, trg_vocab_size, max_in_len, ...@@ -832,11 +842,12 @@ def fast_decode(model_input, src_vocab_size, trg_vocab_size, max_in_len,
return (step_idx, selected_ids, selected_scores, gather_idx, return (step_idx, selected_ids, selected_scores, gather_idx,
pre_caches, pre_src_attn_bias) pre_caches, pre_src_attn_bias)
_ = layers.while_loop(cond=cond_func, _ = layers.while_loop(
cond=cond_func,
body=body_func, body=body_func,
loop_vars=[ loop_vars=[
step_idx, start_tokens, init_scores, step_idx, start_tokens, init_scores, parent_idx, caches,
parent_idx, caches, trg_src_attn_bias trg_src_attn_bias
], ],
is_test=True) is_test=True)
......
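The TODO above exists because `softmax_with_cross_entropy` with hard labels requires the label's last dimension to be 1, so the label-smoothing path first expands labels via `one_hot` and then trains against soft labels. A minimal sketch of that recipe (vocabulary size and epsilon are illustrative):

```python
import paddle.fluid as fluid
import paddle.fluid.layers as layers

TRG_VOCAB = 100   # illustrative target vocabulary size
EPS = 0.1         # label-smoothing epsilon

label = fluid.data(name="lbl", shape=[None, 1], dtype="int64")
logits = fluid.data(name="logits", shape=[None, TRG_VOCAB], dtype="float32")

# one_hot needs the label's last dimension to be 1 (hence the TODO above);
# label_smooth then mixes (1 - EPS) * one_hot with a uniform EPS / TRG_VOCAB
soft_label = layers.label_smooth(
    label=layers.one_hot(input=label, depth=TRG_VOCAB), epsilon=EPS)
cost = layers.softmax_with_cross_entropy(
    logits=logits, label=soft_label, soft_label=True)
```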
...@@ -199,9 +199,14 @@ class PDConfig(object): ...@@ -199,9 +199,14 @@ class PDConfig(object):
"Whether to perform model saving for inference.") "Whether to perform model saving for inference.")
# NOTE: args for profiler # NOTE: args for profiler
self.default_g.add_arg("is_profiler", int, 0, "the switch of profiler tools. (used for benchmark)") self.default_g.add_arg(
self.default_g.add_arg("profiler_path", str, './', "the profiler output file path. (used for benchmark)") "is_profiler", int, 0,
self.default_g.add_arg("max_iter", int, 0, "the max train batch num.(used for benchmark)") "the switch of profiler tools. (used for benchmark)")
self.default_g.add_arg(
"profiler_path", str, './',
"the profiler output file path. (used for benchmark)")
self.default_g.add_arg("max_iter", int, 0,
"the max train batch num.(used for benchmark)")
self.parser = parser self.parser = parser
......
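`PDConfig.add_arg` above takes a name, a Python type, a default, and a help string. A hypothetical stand-in built directly on `argparse` (the helper and names below are illustrative, not the repo's implementation):

```python
import argparse

def add_arg(parser, name, dtype, default, help_str):
    """Hypothetical helper mirroring add_arg(name, type, default, help)."""
    parser.add_argument(
        "--" + name, type=dtype, default=default,
        help=help_str + " Default: %(default)s.")

parser = argparse.ArgumentParser()
add_arg(parser, "is_profiler", int, 0,
        "The switch of the profiler tool (used for benchmark).")
add_arg(parser, "max_iter", int, 0,
        "The max train batch num (used for benchmark).")
args = parser.parse_args([])  # parse an empty list so the sketch runs anywhere
print(args.is_profiler, args.max_iter)
```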
...@@ -415,5 +415,3 @@ for (size_t i = 0; i < output.front().data.length() / sizeof(float); i += 3) { ...@@ -415,5 +415,3 @@ for (size_t i = 0; i < output.front().data.length() / sizeof(float); i += 3) {
<< static_cast<float *>(output.front().data.data())[i + 2] << std::endl; << static_cast<float *>(output.front().data.data())[i + 2] << std::endl;
} }
``` ```
...@@ -90,5 +90,3 @@ word_embedding=fluid.layers.concat(input=[elmo_embedding, word_embedding], axis= ...@@ -90,5 +90,3 @@ word_embedding=fluid.layers.concat(input=[elmo_embedding, word_embedding], axis=
### References
[Deep contextualized word representations](https://arxiv.org/abs/1802.05365) [Deep contextualized word representations](https://arxiv.org/abs/1802.05365)
...@@ -7,7 +7,6 @@ from kpi import CostKpi, DurationKpi, AccKpi ...@@ -7,7 +7,6 @@ from kpi import CostKpi, DurationKpi, AccKpi
#### NOTE kpi.py should be shared in models in some way!!!! #### NOTE kpi.py should be shared in models in some way!!!!
train_duration_sts_b_card1 = DurationKpi( train_duration_sts_b_card1 = DurationKpi(
'train_duration_sts_b_card1', 0.01, 0, actived=True) 'train_duration_sts_b_card1', 0.01, 0, actived=True)
train_cost_sts_b_card1 = CostKpi( train_cost_sts_b_card1 = CostKpi(
......
# -*- coding: utf_8 -*- # -*- coding: utf_8 -*-
import os import os
import sys import sys
sys.path.append("../") sys.path.append("../shared_modules/")
sys.path.append("../models/classification") sys.path.append("../shared_modules/models/classification")
import paddle import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
import numpy as np import numpy as np
...@@ -17,6 +17,7 @@ from models.representation.ernie import ErnieConfig ...@@ -17,6 +17,7 @@ from models.representation.ernie import ErnieConfig
from models.representation.ernie import ernie_encoder, ernie_encoder_with_paddle_hub from models.representation.ernie import ernie_encoder, ernie_encoder_with_paddle_hub
from preprocess.ernie import task_reader from preprocess.ernie import task_reader
def do_save_inference_model(args): def do_save_inference_model(args):
ernie_config = ErnieConfig(args.ernie_config_path) ernie_config = ErnieConfig(args.ernie_config_path)
...@@ -37,18 +38,17 @@ def do_save_inference_model(args): ...@@ -37,18 +38,17 @@ def do_save_inference_model(args):
with fluid.program_guard(test_prog, startup_prog): with fluid.program_guard(test_prog, startup_prog):
with fluid.unique_name.guard(): with fluid.unique_name.guard():
infer_pyreader, ernie_inputs, labels = ernie_pyreader( infer_pyreader, ernie_inputs, labels = ernie_pyreader(
args, args, pyreader_name="infer_reader")
pyreader_name="infer_reader")
if args.use_paddle_hub: if args.use_paddle_hub:
embeddings = ernie_encoder_with_paddle_hub(ernie_inputs, args.max_seq_len) embeddings = ernie_encoder_with_paddle_hub(ernie_inputs,
args.max_seq_len)
else: else:
embeddings = ernie_encoder(ernie_inputs, ernie_config=ernie_config) embeddings = ernie_encoder(
ernie_inputs, ernie_config=ernie_config)
probs = create_model(args, probs = create_model(
embeddings, args, embeddings, labels=labels, is_prediction=True)
labels=labels,
is_prediction=True)
test_prog = test_prog.clone(for_test=True) test_prog = test_prog.clone(for_test=True)
exe.run(startup_prog) exe.run(startup_prog)
...@@ -59,11 +59,11 @@ def do_save_inference_model(args): ...@@ -59,11 +59,11 @@ def do_save_inference_model(args):
fluid.io.save_inference_model( fluid.io.save_inference_model(
args.inference_model_dir, args.inference_model_dir,
feeded_var_names=[ernie_inputs["src_ids"].name, feeded_var_names=[
ernie_inputs["sent_ids"].name, ernie_inputs["src_ids"].name, ernie_inputs["sent_ids"].name,
ernie_inputs["pos_ids"].name, ernie_inputs["pos_ids"].name, ernie_inputs["input_mask"].name,
ernie_inputs["input_mask"].name, ernie_inputs["seq_lens"].name
ernie_inputs["seq_lens"].name], ],
target_vars=[probs], target_vars=[probs],
executor=exe, executor=exe,
main_program=test_prog, main_program=test_prog,
...@@ -72,6 +72,7 @@ def do_save_inference_model(args): ...@@ -72,6 +72,7 @@ def do_save_inference_model(args):
print("save inference model at %s" % (args.inference_model_dir)) print("save inference model at %s" % (args.inference_model_dir))
def inference(exe, test_program, test_pyreader, fetch_list, infer_phrase): def inference(exe, test_program, test_pyreader, fetch_list, infer_phrase):
""" """
Inference Function Inference Function
...@@ -80,13 +81,16 @@ def inference(exe, test_program, test_pyreader, fetch_list, infer_phrase): ...@@ -80,13 +81,16 @@ def inference(exe, test_program, test_pyreader, fetch_list, infer_phrase):
test_pyreader.start() test_pyreader.start()
while True: while True:
try: try:
np_props = exe.run(program=test_program, fetch_list=fetch_list, return_numpy=True) np_props = exe.run(program=test_program,
fetch_list=fetch_list,
return_numpy=True)
for probs in np_props[0]: for probs in np_props[0]:
print("%d\t%f\t%f" % (np.argmax(probs), probs[0], probs[1])) print("%d\t%f\t%f" % (np.argmax(probs), probs[0], probs[1]))
except fluid.core.EOFException: except fluid.core.EOFException:
test_pyreader.reset() test_pyreader.reset()
break break
def test_inference_model(args): def test_inference_model(args):
ernie_config = ErnieConfig(args.ernie_config_path) ernie_config = ErnieConfig(args.ernie_config_path)
ernie_config.print_config() ernie_config.print_config()
...@@ -113,15 +117,11 @@ def test_inference_model(args): ...@@ -113,15 +117,11 @@ def test_inference_model(args):
with fluid.program_guard(test_prog, startup_prog): with fluid.program_guard(test_prog, startup_prog):
with fluid.unique_name.guard(): with fluid.unique_name.guard():
infer_pyreader, ernie_inputs, labels = ernie_pyreader( infer_pyreader, ernie_inputs, labels = ernie_pyreader(
args, args, pyreader_name="infer_pyreader")
pyreader_name="infer_pyreader")
embeddings = ernie_encoder(ernie_inputs, ernie_config=ernie_config) embeddings = ernie_encoder(ernie_inputs, ernie_config=ernie_config)
probs = create_model( probs = create_model(
args, args, embeddings, labels=labels, is_prediction=True)
embeddings,
labels=labels,
is_prediction=True)
test_prog = test_prog.clone(for_test=True) test_prog = test_prog.clone(for_test=True)
exe.run(startup_prog) exe.run(startup_prog)
...@@ -129,7 +129,7 @@ def test_inference_model(args): ...@@ -129,7 +129,7 @@ def test_inference_model(args):
assert (args.inference_model_dir) assert (args.inference_model_dir)
infer_data_generator = reader.data_generator( infer_data_generator = reader.data_generator(
input_file=args.test_set, input_file=args.test_set,
batch_size=args.batch_size/dev_count, batch_size=args.batch_size / dev_count,
phase="infer", phase="infer",
epoch=1, epoch=1,
shuffle=False) shuffle=False)
...@@ -141,8 +141,8 @@ def test_inference_model(args): ...@@ -141,8 +141,8 @@ def test_inference_model(args):
params_filename="params.pdparams") params_filename="params.pdparams")
infer_pyreader.set_batch_generator(infer_data_generator) infer_pyreader.set_batch_generator(infer_data_generator)
inference(exe, test_prog, infer_pyreader, inference(exe, test_prog, infer_pyreader, [probs.name], "infer")
[probs.name], "infer")
if __name__ == "__main__": if __name__ == "__main__":
args = PDConfig() args = PDConfig()
......
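The inference path above loads a model saved with explicit `model_filename`/`params_filename` and then runs it batch by batch. A minimal sketch of the loading half, assuming a model previously saved under those file names (the directory and feed shape are illustrative):

```python
import numpy as np
import paddle.fluid as fluid

exe = fluid.Executor(fluid.CPUPlace())
# load a model saved earlier with fluid.io.save_inference_model(...)
[infer_prog, feed_names, fetch_targets] = fluid.io.load_inference_model(
    dirname="./inference_model",         # illustrative directory
    executor=exe,
    model_filename="model.pdmodel",      # only if saved under these names
    params_filename="params.pdparams")

probs = exe.run(infer_prog,
                feed={feed_names[0]: np.ones([1, 4], dtype="float32")},
                fetch_list=fetch_targets)
```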
...@@ -12,8 +12,8 @@ import argparse ...@@ -12,8 +12,8 @@ import argparse
import numpy as np import numpy as np
import multiprocessing import multiprocessing
import sys import sys
sys.path.append("../models/classification/") sys.path.append("../shared_modules/models/classification/")
sys.path.append("../") sys.path.append("../shared_modules/")
from nets import bow_net from nets import bow_net
from nets import lstm_net from nets import lstm_net
...@@ -30,24 +30,19 @@ import paddle.fluid as fluid ...@@ -30,24 +30,19 @@ import paddle.fluid as fluid
import reader import reader
from utils import init_checkpoint from utils import init_checkpoint
def create_model(args,
pyreader_name,
num_labels,
is_prediction=False):
def create_model(args, pyreader_name, num_labels, is_prediction=False):
""" """
Create Model for sentiment classification Create Model for sentiment classification
""" """
data = fluid.data( data = fluid.data(
name="src_ids", shape=[None, args.max_seq_len], dtype='int64') name="src_ids", shape=[None, args.max_seq_len], dtype='int64')
label = fluid.data( label = fluid.data(name="label", shape=[None, 1], dtype="int64")
name="label", shape=[None, 1], dtype="int64") seq_len = fluid.data(name="seq_len", shape=[None], dtype="int64")
seq_len = fluid.data(
name="seq_len", shape=[None], dtype="int64")
data_reader = fluid.io.DataLoader.from_generator(feed_list=[data, label, seq_len], data_reader = fluid.io.DataLoader.from_generator(
capacity=4, iterable=False) feed_list=[data, label, seq_len], capacity=4, iterable=False)
if args.model_type == "bilstm_net": if args.model_type == "bilstm_net":
network = bilstm_net network = bilstm_net
...@@ -63,18 +58,19 @@ def create_model(args, ...@@ -63,18 +58,19 @@ def create_model(args,
raise ValueError("Unknown network type!") raise ValueError("Unknown network type!")
if is_prediction: if is_prediction:
probs = network(data, seq_len, None, args.vocab_size, is_prediction=is_prediction) probs = network(
data, seq_len, None, args.vocab_size, is_prediction=is_prediction)
print("create inference model...") print("create inference model...")
return data_reader, probs, [data.name, seq_len.name] return data_reader, probs, [data.name, seq_len.name]
ce_loss, probs = network(data, seq_len, label, args.vocab_size, is_prediction=is_prediction) ce_loss, probs = network(
data, seq_len, label, args.vocab_size, is_prediction=is_prediction)
loss = fluid.layers.mean(x=ce_loss) loss = fluid.layers.mean(x=ce_loss)
num_seqs = fluid.layers.create_tensor(dtype='int64') num_seqs = fluid.layers.create_tensor(dtype='int64')
accuracy = fluid.layers.accuracy(input=probs, label=label, total=num_seqs) accuracy = fluid.layers.accuracy(input=probs, label=label, total=num_seqs)
return data_reader, loss, accuracy, num_seqs return data_reader, loss, accuracy, num_seqs
def evaluate(exe, test_program, test_pyreader, fetch_list, eval_phase): def evaluate(exe, test_program, test_pyreader, fetch_list, eval_phase):
""" """
Evaluation Function Evaluation Function
...@@ -111,7 +107,8 @@ def inference(exe, test_program, test_pyreader, fetch_list, infer_phrase): ...@@ -111,7 +107,8 @@ def inference(exe, test_program, test_pyreader, fetch_list, infer_phrase):
time_begin = time.time() time_begin = time.time()
while True: while True:
try: try:
np_props = exe.run(program=test_program, fetch_list=fetch_list, np_props = exe.run(program=test_program,
fetch_list=fetch_list,
return_numpy=True) return_numpy=True)
for probs in np_props[0]: for probs in np_props[0]:
print("%d\t%f\t%f" % (np.argmax(probs), probs[0], probs[1])) print("%d\t%f\t%f" % (np.argmax(probs), probs[0], probs[1]))
...@@ -135,7 +132,8 @@ def main(args): ...@@ -135,7 +132,8 @@ def main(args):
exe = fluid.Executor(place) exe = fluid.Executor(place)
task_name = args.task_name.lower() task_name = args.task_name.lower()
processor = reader.SentaProcessor(data_dir=args.data_dir, processor = reader.SentaProcessor(
data_dir=args.data_dir,
vocab_path=args.vocab_path, vocab_path=args.vocab_path,
random_seed=args.random_seed, random_seed=args.random_seed,
max_seq_len=args.max_seq_len) max_seq_len=args.max_seq_len)
...@@ -151,7 +149,7 @@ def main(args): ...@@ -151,7 +149,7 @@ def main(args):
if args.do_train: if args.do_train:
train_data_generator = processor.data_generator( train_data_generator = processor.data_generator(
batch_size=args.batch_size/dev_count, batch_size=args.batch_size / dev_count,
phase='train', phase='train',
epoch=args.epoch, epoch=args.epoch,
shuffle=True) shuffle=True)
...@@ -187,7 +185,7 @@ def main(args): ...@@ -187,7 +185,7 @@ def main(args):
if args.do_val: if args.do_val:
test_data_generator = processor.data_generator( test_data_generator = processor.data_generator(
batch_size=args.batch_size/dev_count, batch_size=args.batch_size / dev_count,
phase='dev', phase='dev',
epoch=1, epoch=1,
shuffle=False) shuffle=False)
...@@ -204,7 +202,7 @@ def main(args): ...@@ -204,7 +202,7 @@ def main(args):
if args.do_infer: if args.do_infer:
infer_data_generator = processor.data_generator( infer_data_generator = processor.data_generator(
batch_size=args.batch_size/dev_count, batch_size=args.batch_size / dev_count,
phase='infer', phase='infer',
epoch=1, epoch=1,
shuffle=False) shuffle=False)
...@@ -223,18 +221,13 @@ def main(args): ...@@ -223,18 +221,13 @@ def main(args):
if args.do_train: if args.do_train:
if args.init_checkpoint: if args.init_checkpoint:
init_checkpoint( init_checkpoint(
exe, exe, args.init_checkpoint, main_program=startup_prog)
args.init_checkpoint,
main_program=startup_prog)
elif args.do_val or args.do_infer: elif args.do_val or args.do_infer:
if not args.init_checkpoint: if not args.init_checkpoint:
raise ValueError("args 'init_checkpoint' should be set if" raise ValueError("args 'init_checkpoint' should be set if"
"only doing validation or testing!") "only doing validation or testing!")
init_checkpoint( init_checkpoint(exe, args.init_checkpoint, main_program=startup_prog)
exe,
args.init_checkpoint,
main_program=startup_prog)
if args.do_train: if args.do_train:
train_exe = exe train_exe = exe
...@@ -262,7 +255,9 @@ def main(args): ...@@ -262,7 +255,9 @@ def main(args):
else: else:
fetch_list = [] fetch_list = []
outputs = train_exe.run(program=train_program, fetch_list=fetch_list, return_numpy=False) outputs = train_exe.run(program=train_program,
fetch_list=fetch_list,
return_numpy=False)
#print("finished one step") #print("finished one step")
if steps % args.skip_steps == 0: if steps % args.skip_steps == 0:
np_loss, np_acc, np_num_seqs = outputs np_loss, np_acc, np_num_seqs = outputs
...@@ -274,7 +269,8 @@ def main(args): ...@@ -274,7 +269,8 @@ def main(args):
total_num_seqs.extend(np_num_seqs) total_num_seqs.extend(np_num_seqs)
if args.verbose: if args.verbose:
verbose = "train pyreader queue size: %d, " % train_pyreader.queue.size() verbose = "train pyreader queue size: %d, " % train_pyreader.queue.size(
)
print(verbose) print(verbose)
time_end = time.time() time_end = time.time()
...@@ -289,8 +285,7 @@ def main(args): ...@@ -289,8 +285,7 @@ def main(args):
if steps % args.save_steps == 0: if steps % args.save_steps == 0:
save_path = os.path.join(args.checkpoints, save_path = os.path.join(args.checkpoints,
"step_" + str(steps), "step_" + str(steps), "checkpoint")
"checkpoint")
fluid.save(train_program, save_path) fluid.save(train_program, save_path)
if steps % args.validation_steps == 0: if steps % args.validation_steps == 0:
...@@ -317,8 +312,7 @@ def main(args): ...@@ -317,8 +312,7 @@ def main(args):
# final eval on test set # final eval on test set
if args.do_infer: if args.do_infer:
print("Final test result:") print("Final test result:")
inference(exe, infer_prog, infer_reader, inference(exe, infer_prog, infer_reader, [prop.name], "infer")
[prop.name], "infer")
if __name__ == "__main__": if __name__ == "__main__":
......
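The training and inference loops above all share one idiom: a non-iterable reader is `start()`ed, `exe.run` is called until the feeding generator is exhausted, and `fluid.core.EOFException` signals the end of the epoch. A self-contained sketch of that loop (toy data, illustrative names):

```python
import numpy as np
import paddle.fluid as fluid

x = fluid.data(name="x", shape=[None, 1], dtype="float32")
loader = fluid.io.DataLoader.from_generator(
    feed_list=[x], capacity=4, iterable=False)

def reader():
    for v in range(4):
        yield (np.array([v], dtype="float32"),)  # one sample per yield

loader.set_sample_list_generator(
    fluid.io.batch(reader, batch_size=2), places=fluid.CPUPlace())

exe = fluid.Executor(fluid.CPUPlace())
exe.run(fluid.default_startup_program())

loader.start()                          # non-iterable loaders must be started
while True:
    try:
        out = exe.run(fetch_list=[x.name])
    except fluid.core.EOFException:     # the generator ran out: epoch finished
        loader.reset()                  # rewind so the next epoch can start()
        break
```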
...@@ -16,8 +16,8 @@ import sys ...@@ -16,8 +16,8 @@ import sys
import paddle import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
sys.path.append("../models/classification/") sys.path.append("../shared_modules/models/classification/")
sys.path.append("..") sys.path.append("../shared_modules/")
print(sys.path) print(sys.path)
from nets import bow_net from nets import bow_net
...@@ -36,6 +36,7 @@ from config import PDConfig ...@@ -36,6 +36,7 @@ from config import PDConfig
from utils import init_checkpoint from utils import init_checkpoint
def ernie_pyreader(args, pyreader_name): def ernie_pyreader(args, pyreader_name):
src_ids = fluid.data( src_ids = fluid.data(
name="src_ids", shape=[None, args.max_seq_len, 1], dtype="int64") name="src_ids", shape=[None, args.max_seq_len, 1], dtype="int64")
...@@ -45,10 +46,8 @@ def ernie_pyreader(args, pyreader_name): ...@@ -45,10 +46,8 @@ def ernie_pyreader(args, pyreader_name):
name="pos_ids", shape=[None, args.max_seq_len, 1], dtype="int64") name="pos_ids", shape=[None, args.max_seq_len, 1], dtype="int64")
input_mask = fluid.data( input_mask = fluid.data(
name="input_mask", shape=[None, args.max_seq_len, 1], dtype="float32") name="input_mask", shape=[None, args.max_seq_len, 1], dtype="float32")
labels = fluid.data( labels = fluid.data(name="labels", shape=[None, 1], dtype="int64")
name="labels", shape=[None, 1], dtype="int64") seq_lens = fluid.data(name="seq_lens", shape=[None], dtype="int64")
seq_lens = fluid.data(
name="seq_lens", shape=[None], dtype="int64")
pyreader = fluid.io.DataLoader.from_generator( pyreader = fluid.io.DataLoader.from_generator(
feed_list=[src_ids, sent_ids, pos_ids, input_mask, labels, seq_lens], feed_list=[src_ids, sent_ids, pos_ids, input_mask, labels, seq_lens],
...@@ -61,15 +60,13 @@ def ernie_pyreader(args, pyreader_name): ...@@ -61,15 +60,13 @@ def ernie_pyreader(args, pyreader_name):
"sent_ids": sent_ids, "sent_ids": sent_ids,
"pos_ids": pos_ids, "pos_ids": pos_ids,
"input_mask": input_mask, "input_mask": input_mask,
"seq_lens": seq_lens} "seq_lens": seq_lens
}
return pyreader, ernie_inputs, labels return pyreader, ernie_inputs, labels
def create_model(args,
embeddings,
labels,
is_prediction=False):
def create_model(args, embeddings, labels, is_prediction=False):
""" """
Create Model for sentiment classification based on ERNIE encoder Create Model for sentiment classification based on ERNIE encoder
""" """
...@@ -132,7 +129,8 @@ def infer(exe, infer_program, infer_pyreader, fetch_list, infer_phase): ...@@ -132,7 +129,8 @@ def infer(exe, infer_program, infer_pyreader, fetch_list, infer_phase):
time_begin = time.time() time_begin = time.time()
while True: while True:
try: try:
batch_probs = exe.run(program=infer_program, fetch_list=fetch_list, batch_probs = exe.run(program=infer_program,
fetch_list=fetch_list,
return_numpy=True) return_numpy=True)
for probs in batch_probs[0]: for probs in batch_probs[0]:
print("%d\t%f\t%f" % (np.argmax(probs), probs[0], probs[1])) print("%d\t%f\t%f" % (np.argmax(probs), probs[0], probs[1]))
...@@ -195,21 +193,19 @@ def main(args): ...@@ -195,21 +193,19 @@ def main(args):
with fluid.unique_name.guard(): with fluid.unique_name.guard():
# create ernie_pyreader # create ernie_pyreader
train_pyreader, ernie_inputs, labels = ernie_pyreader( train_pyreader, ernie_inputs, labels = ernie_pyreader(
args, args, pyreader_name='train_pyreader')
pyreader_name='train_pyreader')
# get ernie_embeddings # get ernie_embeddings
if args.use_paddle_hub: if args.use_paddle_hub:
embeddings = ernie_encoder_with_paddle_hub(ernie_inputs, args.max_seq_len) embeddings = ernie_encoder_with_paddle_hub(ernie_inputs,
args.max_seq_len)
else: else:
embeddings = ernie_encoder(ernie_inputs, ernie_config=ernie_config) embeddings = ernie_encoder(
ernie_inputs, ernie_config=ernie_config)
# user defined model based on ernie embeddings # user defined model based on ernie embeddings
loss, accuracy, num_seqs = create_model( loss, accuracy, num_seqs = create_model(
args, args, embeddings, labels=labels, is_prediction=False)
embeddings,
labels=labels,
is_prediction=False)
optimizer = fluid.optimizer.Adam(learning_rate=args.lr) optimizer = fluid.optimizer.Adam(learning_rate=args.lr)
optimizer.minimize(loss) optimizer.minimize(loss)
...@@ -232,21 +228,19 @@ def main(args): ...@@ -232,21 +228,19 @@ def main(args):
with fluid.unique_name.guard(): with fluid.unique_name.guard():
# create ernie_pyreader # create ernie_pyreader
test_pyreader, ernie_inputs, labels = ernie_pyreader( test_pyreader, ernie_inputs, labels = ernie_pyreader(
args, args, pyreader_name='eval_reader')
pyreader_name='eval_reader')
# get ernie_embeddings # get ernie_embeddings
if args.use_paddle_hub: if args.use_paddle_hub:
embeddings = ernie_encoder_with_paddle_hub(ernie_inputs, args.max_seq_len) embeddings = ernie_encoder_with_paddle_hub(ernie_inputs,
args.max_seq_len)
else: else:
embeddings = ernie_encoder(ernie_inputs, ernie_config=ernie_config) embeddings = ernie_encoder(
ernie_inputs, ernie_config=ernie_config)
# user defined model based on ernie embeddings # user defined model based on ernie embeddings
loss, accuracy, num_seqs = create_model( loss, accuracy, num_seqs = create_model(
args, args, embeddings, labels=labels, is_prediction=False)
embeddings,
labels=labels,
is_prediction=False)
test_prog = test_prog.clone(for_test=True) test_prog = test_prog.clone(for_test=True)
...@@ -261,19 +255,18 @@ def main(args): ...@@ -261,19 +255,18 @@ def main(args):
with fluid.program_guard(infer_prog, startup_prog): with fluid.program_guard(infer_prog, startup_prog):
with fluid.unique_name.guard(): with fluid.unique_name.guard():
infer_pyreader, ernie_inputs, labels = ernie_pyreader( infer_pyreader, ernie_inputs, labels = ernie_pyreader(
args, args, pyreader_name="infer_pyreader")
pyreader_name="infer_pyreader")
# get ernie_embeddings # get ernie_embeddings
if args.use_paddle_hub: if args.use_paddle_hub:
embeddings = ernie_encoder_with_paddle_hub(ernie_inputs, args.max_seq_len) embeddings = ernie_encoder_with_paddle_hub(ernie_inputs,
args.max_seq_len)
else: else:
embeddings = ernie_encoder(ernie_inputs, ernie_config=ernie_config) embeddings = ernie_encoder(
ernie_inputs, ernie_config=ernie_config)
probs = create_model(args, probs = create_model(
embeddings, args, embeddings, labels=labels, is_prediction=True)
labels=labels,
is_prediction=True)
infer_prog = infer_prog.clone(for_test=True) infer_prog = infer_prog.clone(for_test=True)
...@@ -282,25 +275,17 @@ def main(args): ...@@ -282,25 +275,17 @@ def main(args):
if args.do_train: if args.do_train:
if args.init_checkpoint: if args.init_checkpoint:
init_checkpoint( init_checkpoint(
exe, exe, args.init_checkpoint, main_program=train_program)
args.init_checkpoint,
main_program=train_program)
elif args.do_val: elif args.do_val:
if not args.init_checkpoint: if not args.init_checkpoint:
raise ValueError("args 'init_checkpoint' should be set if" raise ValueError("args 'init_checkpoint' should be set if"
"only doing validation or testing!") "only doing validation or testing!")
init_checkpoint( init_checkpoint(exe, args.init_checkpoint, main_program=test_prog)
exe,
args.init_checkpoint,
main_program=test_prog)
elif args.do_infer: elif args.do_infer:
if not args.init_checkpoint: if not args.init_checkpoint:
raise ValueError("args 'init_checkpoint' should be set if" raise ValueError("args 'init_checkpoint' should be set if"
"only doing validation or testing!") "only doing validation or testing!")
init_checkpoint( init_checkpoint(exe, args.init_checkpoint, main_program=infer_prog)
exe,
args.init_checkpoint,
main_program=infer_prog)
if args.do_train: if args.do_train:
train_exe = exe train_exe = exe
...@@ -327,7 +312,9 @@ def main(args): ...@@ -327,7 +312,9 @@ def main(args):
else: else:
fetch_list = [] fetch_list = []
outputs = train_exe.run(program=train_program, fetch_list=fetch_list, return_numpy=False) outputs = train_exe.run(program=train_program,
fetch_list=fetch_list,
return_numpy=False)
if steps % args.skip_steps == 0: if steps % args.skip_steps == 0:
np_loss, np_acc, np_num_seqs = outputs np_loss, np_acc, np_num_seqs = outputs
np_loss = np.array(np_loss) np_loss = np.array(np_loss)
...@@ -338,7 +325,8 @@ def main(args): ...@@ -338,7 +325,8 @@ def main(args):
total_num_seqs.extend(np_num_seqs) total_num_seqs.extend(np_num_seqs)
if args.verbose: if args.verbose:
verbose = "train pyreader queue size: %d, " % train_pyreader.queue.size() verbose = "train pyreader queue size: %d, " % train_pyreader.queue.size(
)
print(verbose) print(verbose)
time_end = time.time() time_end = time.time()
...@@ -353,8 +341,7 @@ def main(args): ...@@ -353,8 +341,7 @@ def main(args):
if steps % args.save_steps == 0: if steps % args.save_steps == 0:
save_path = os.path.join(args.checkpoints, save_path = os.path.join(args.checkpoints,
"step_" + str(steps), "step_" + str(steps), "checkpoint")
"checkpoint")
fluid.save(train_program, save_path) fluid.save(train_program, save_path)
if steps % args.validation_steps == 0: if steps % args.validation_steps == 0:
...@@ -380,8 +367,8 @@ def main(args): ...@@ -380,8 +367,8 @@ def main(args):
# final eval on test set # final eval on test set
if args.do_infer: if args.do_infer:
print("Final test result:") print("Final test result:")
infer(exe, infer_prog, infer_pyreader, infer(exe, infer_prog, infer_pyreader, [probs.name], "infer")
[probs.name], "infer")
if __name__ == "__main__": if __name__ == "__main__":
args = PDConfig() args = PDConfig()
......
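The hunks above rebuild the network separately for train, eval, and infer under `fluid.unique_name.guard()` so the three programs share parameters by name. A compact variant that achieves the same parameter sharing by cloning instead (toy network; note the clone happens before the optimizer is added, so the test program drops backward and optimizer ops):

```python
import paddle.fluid as fluid

startup_prog = fluid.Program()
train_prog = fluid.Program()
with fluid.program_guard(train_prog, startup_prog):
    with fluid.unique_name.guard():  # keep parameter names stable
        x = fluid.data(name="x", shape=[None, 4], dtype="float32")
        y = fluid.data(name="y", shape=[None, 1], dtype="int64")
        logits = fluid.layers.fc(input=x, size=2)
        loss = fluid.layers.mean(
            fluid.layers.softmax_with_cross_entropy(logits=logits, label=y))
        # clone BEFORE minimize(): the clone shares parameters with
        # train_prog but contains no backward/optimize ops
        test_prog = train_prog.clone(for_test=True)
        fluid.optimizer.Adam(learning_rate=1e-3).minimize(loss)
```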
...@@ -230,8 +230,7 @@ def main(): ...@@ -230,8 +230,7 @@ def main():
if not args.profile: if not args.profile:
save_path = os.path.join(args.model_path, save_path = os.path.join(args.model_path,
"epoch_" + str(epoch_id), "epoch_" + str(epoch_id), "checkpoint")
"checkpoint")
print("begin to save", save_path) print("begin to save", save_path)
fluid.save(train_program, save_path) fluid.save(train_program, save_path)
print("save finished") print("save finished")
......
...@@ -4,6 +4,7 @@ This module provide nets for text classification ...@@ -4,6 +4,7 @@ This module provide nets for text classification
import paddle.fluid as fluid import paddle.fluid as fluid
def bow_net(data, def bow_net(data,
seq_len, seq_len,
label, label,
......
...@@ -43,8 +43,8 @@ class CNN(object): ...@@ -43,8 +43,8 @@ class CNN(object):
left_emb = emb_layer.ops(left) left_emb = emb_layer.ops(left)
right_emb = emb_layer.ops(right) right_emb = emb_layer.ops(right)
# Presentation context # Presentation context
cnn_layer = layers.SequenceConvPoolLayer( cnn_layer = layers.SequenceConvPoolLayer(self.filter_size,
self.filter_size, self.num_filters, "conv") self.num_filters, "conv")
left_cnn = cnn_layer.ops(left_emb) left_cnn = cnn_layer.ops(left_emb)
right_cnn = cnn_layer.ops(right_emb) right_cnn = cnn_layer.ops(right_emb)
# matching layer # matching layer
......
...@@ -33,6 +33,7 @@ def check_cuda(use_cuda, err = \ ...@@ -33,6 +33,7 @@ def check_cuda(use_cuda, err = \
except Exception as e: except Exception as e:
pass pass
def check_version(): def check_version():
""" """
Log error and exit when the installed version of paddlepaddle is Log error and exit when the installed version of paddlepaddle is
......
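A sketch of what a `check_version` helper like the one above typically does, assuming `fluid.require_version` (available since PaddlePaddle 1.6):

```python
import sys
import paddle.fluid as fluid

def check_version(min_version="1.6.0"):
    """Log an error and exit when the installed paddlepaddle is too old."""
    try:
        fluid.require_version(min_version)  # raises if the check fails
    except Exception:
        print("PaddlePaddle >= %s is required for this model; "
              "please upgrade." % min_version)
        sys.exit(1)
```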
...@@ -30,10 +30,14 @@ from models.transformer_encoder import encoder, pre_process_layer ...@@ -30,10 +30,14 @@ from models.transformer_encoder import encoder, pre_process_layer
def ernie_pyreader(args, pyreader_name): def ernie_pyreader(args, pyreader_name):
"""define standard ernie pyreader""" """define standard ernie pyreader"""
src_ids = fluid.data(name='1', shape=[-1, args.max_seq_len, 1], dtype='int64') src_ids = fluid.data(
sent_ids = fluid.data(name='2', shape=[-1, args.max_seq_len, 1], dtype='int64') name='1', shape=[-1, args.max_seq_len, 1], dtype='int64')
pos_ids = fluid.data(name='3', shape=[-1, args.max_seq_len, 1], dtype='int64') sent_ids = fluid.data(
input_mask = fluid.data(name='4', shape=[-1, args.max_seq_len, 1], dtype='float32') name='2', shape=[-1, args.max_seq_len, 1], dtype='int64')
pos_ids = fluid.data(
name='3', shape=[-1, args.max_seq_len, 1], dtype='int64')
input_mask = fluid.data(
name='4', shape=[-1, args.max_seq_len, 1], dtype='float32')
labels = fluid.data(name='5', shape=[-1, 1], dtype='int64') labels = fluid.data(name='5', shape=[-1, 1], dtype='int64')
seq_lens = fluid.data(name='6', shape=[-1], dtype='int64') seq_lens = fluid.data(name='6', shape=[-1], dtype='int64')
......
...@@ -29,6 +29,7 @@ from preprocess.ernie import tokenization ...@@ -29,6 +29,7 @@ from preprocess.ernie import tokenization
from preprocess.padding import pad_batch_data from preprocess.padding import pad_batch_data
import io import io
def csv_reader(fd, delimiter='\t'): def csv_reader(fd, delimiter='\t'):
def gen(): def gen():
for i in fd: for i in fd:
...@@ -37,8 +38,10 @@ def csv_reader(fd, delimiter='\t'): ...@@ -37,8 +38,10 @@ def csv_reader(fd, delimiter='\t'):
yield slots, yield slots,
else: else:
yield slots yield slots
return gen() return gen()
class BaseReader(object): class BaseReader(object):
"""BaseReader for classify and sequence labeling task""" """BaseReader for classify and sequence labeling task"""
......
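The `csv_reader` above wraps a file descriptor in a generator that yields one tuple of tab-separated fields per line. A simplified, self-contained version (it drops the single-field special case) with an in-memory usage example:

```python
import io

def csv_reader(fd, delimiter='\t'):
    # same shape as the generator above: one tuple of fields per line
    def gen():
        for line in fd:
            yield tuple(line.rstrip('\n').split(delimiter))
    return gen()

# illustrative usage with an in-memory file
buf = io.StringIO(u"query\ttitle\t1\nquery2\ttitle2\t0\n")
for fields in csv_reader(buf):
    print(fields)  # ('query', 'title', '1') ...
```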
...@@ -23,6 +23,7 @@ import unicodedata ...@@ -23,6 +23,7 @@ import unicodedata
import six import six
import io import io
def convert_to_unicode(text): def convert_to_unicode(text):
"""Converts `text` to Unicode (if it's not already), assuming utf-8 input.""" """Converts `text` to Unicode (if it's not already), assuming utf-8 input."""
if six.PY3: if six.PY3:
......
...@@ -30,7 +30,7 @@ if sys.getdefaultencoding() != defaultencoding: ...@@ -30,7 +30,7 @@ if sys.getdefaultencoding() != defaultencoding:
reload(sys) reload(sys)
sys.setdefaultencoding(defaultencoding) sys.setdefaultencoding(defaultencoding)
sys.path.append("..") sys.path.append("../shared_modules/")
import paddle import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
...@@ -47,14 +47,14 @@ from models.model_check import check_version ...@@ -47,14 +47,14 @@ from models.model_check import check_version
from models.model_check import check_cuda from models.model_check import check_cuda
def create_model(args, pyreader_name, is_inference = False, is_pointwise = False): def create_model(args, pyreader_name, is_inference=False, is_pointwise=False):
""" """
Create Model for simnet Create Model for simnet
""" """
if is_inference: if is_inference:
inf_pyreader = fluid.layers.py_reader( inf_pyreader = fluid.layers.py_reader(
capacity=16, capacity=16,
shapes=([-1,1], [-1,1]), shapes=([-1, 1], [-1, 1]),
dtypes=('int64', 'int64'), dtypes=('int64', 'int64'),
lod_levels=(1, 1), lod_levels=(1, 1),
name=pyreader_name, name=pyreader_name,
...@@ -67,7 +67,7 @@ def create_model(args, pyreader_name, is_inference = False, is_pointwise = False ...@@ -67,7 +67,7 @@ def create_model(args, pyreader_name, is_inference = False, is_pointwise = False
if is_pointwise: if is_pointwise:
pointwise_pyreader = fluid.layers.py_reader( pointwise_pyreader = fluid.layers.py_reader(
capacity=16, capacity=16,
shapes=([-1,1], [-1,1], [-1,1]), shapes=([-1, 1], [-1, 1], [-1, 1]),
dtypes=('int64', 'int64', 'int64'), dtypes=('int64', 'int64', 'int64'),
lod_levels=(1, 1, 0), lod_levels=(1, 1, 0),
name=pyreader_name, name=pyreader_name,
...@@ -79,15 +79,17 @@ def create_model(args, pyreader_name, is_inference = False, is_pointwise = False ...@@ -79,15 +79,17 @@ def create_model(args, pyreader_name, is_inference = False, is_pointwise = False
else: else:
pairwise_pyreader = fluid.layers.py_reader( pairwise_pyreader = fluid.layers.py_reader(
capacity=16, capacity=16,
shapes=([-1,1], [-1,1], [-1,1]), shapes=([-1, 1], [-1, 1], [-1, 1]),
dtypes=('int64', 'int64', 'int64'), dtypes=('int64', 'int64', 'int64'),
lod_levels=(1, 1, 1), lod_levels=(1, 1, 1),
name=pyreader_name, name=pyreader_name,
use_double_buffer=False) use_double_buffer=False)
left, pos_right, neg_right = fluid.layers.read_file(pairwise_pyreader) left, pos_right, neg_right = fluid.layers.read_file(
pairwise_pyreader)
return pairwise_pyreader, left, pos_right, neg_right return pairwise_pyreader, left, pos_right, neg_right
def train(conf_dict, args): def train(conf_dict, args):
""" """
training process training process
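The `create_model` variants above all build `fluid.layers.py_reader` instances that differ only in shapes and LoD levels. A minimal sketch of the pairwise case, assuming the same Paddle 1.x API (`toy_reader` is illustrative):

```python
import paddle.fluid as fluid

# pairwise reader like the one above: (left, pos_right, neg_right),
# each a lod-level-1 int64 sequence of token ids
pairwise_pyreader = fluid.layers.py_reader(
    capacity=16,
    shapes=([-1, 1], [-1, 1], [-1, 1]),
    dtypes=('int64', 'int64', 'int64'),
    lod_levels=(1, 1, 1),            # variable-length sequences -> LoD tensors
    name='sketch_train_reader',
    use_double_buffer=False)
left, pos_right, neg_right = fluid.layers.read_file(pairwise_pyreader)

# feed it from a plain python reader of (left_ids, pos_ids, neg_ids) samples
def toy_reader():
    yield [1, 2, 3], [4, 5], [6]

pairwise_pyreader.decorate_paddle_reader(
    fluid.io.batch(toy_reader, batch_size=1))
```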
...@@ -97,16 +99,16 @@ def train(conf_dict, args): ...@@ -97,16 +99,16 @@ def train(conf_dict, args):
# get vocab size # get vocab size
conf_dict['dict_size'] = len(vocab) conf_dict['dict_size'] = len(vocab)
# Load network structure dynamically # Load network structure dynamically
net = utils.import_class("../models/matching", net = utils.import_class("../shared_modules/models/matching",
conf_dict["net"]["module_name"], conf_dict["net"]["module_name"],
conf_dict["net"]["class_name"])(conf_dict) conf_dict["net"]["class_name"])(conf_dict)
# Load loss function dynamically # Load loss function dynamically
loss = utils.import_class("../models/matching/losses", loss = utils.import_class("../shared_modules/models/matching/losses",
conf_dict["loss"]["module_name"], conf_dict["loss"]["module_name"],
conf_dict["loss"]["class_name"])(conf_dict) conf_dict["loss"]["class_name"])(conf_dict)
# Load Optimization method # Load Optimization method
optimizer = utils.import_class( optimizer = utils.import_class(
"../models/matching/optimizers", "paddle_optimizers", "../shared_modules/models/matching/optimizers", "paddle_optimizers",
conf_dict["optimizer"]["class_name"])(conf_dict) conf_dict["optimizer"]["class_name"])(conf_dict)
# load auc method # load auc method
metric = fluid.metrics.Auc(name="auc") metric = fluid.metrics.Auc(name="auc")
...@@ -131,8 +133,7 @@ def train(conf_dict, args): ...@@ -131,8 +133,7 @@ def train(conf_dict, args):
with fluid.program_guard(train_program, startup_prog): with fluid.program_guard(train_program, startup_prog):
with fluid.unique_name.guard(): with fluid.unique_name.guard():
train_pyreader, left, pos_right, neg_right = create_model( train_pyreader, left, pos_right, neg_right = create_model(
args, args, pyreader_name='train_reader')
pyreader_name='train_reader')
left_feat, pos_score = net.predict(left, pos_right) left_feat, pos_score = net.predict(left, pos_right)
pred = pos_score pred = pos_score
_, neg_score = net.predict(left, neg_right) _, neg_score = net.predict(left, neg_right)
...@@ -141,12 +142,14 @@ def train(conf_dict, args): ...@@ -141,12 +142,14 @@ def train(conf_dict, args):
optimizer.ops(avg_cost) optimizer.ops(avg_cost)
# Get Reader # Get Reader
get_train_examples = simnet_process.get_reader("train",epoch=args.epoch) get_train_examples = simnet_process.get_reader(
"train", epoch=args.epoch)
if args.do_valid: if args.do_valid:
test_prog = fluid.Program() test_prog = fluid.Program()
with fluid.program_guard(test_prog, startup_prog): with fluid.program_guard(test_prog, startup_prog):
with fluid.unique_name.guard(): with fluid.unique_name.guard():
test_pyreader, left, pos_right= create_model(args, pyreader_name = 'test_reader',is_inference=True) test_pyreader, left, pos_right = create_model(
args, pyreader_name='test_reader', is_inference=True)
left_feat, pos_score = net.predict(left, pos_right) left_feat, pos_score = net.predict(left, pos_right)
pred = pos_score pred = pos_score
test_prog = test_prog.clone(for_test=True) test_prog = test_prog.clone(for_test=True)
...@@ -156,40 +159,41 @@ def train(conf_dict, args): ...@@ -156,40 +159,41 @@ def train(conf_dict, args):
with fluid.program_guard(train_program, startup_prog): with fluid.program_guard(train_program, startup_prog):
with fluid.unique_name.guard(): with fluid.unique_name.guard():
train_pyreader, left, right, label = create_model( train_pyreader, left, right, label = create_model(
args, args, pyreader_name='train_reader', is_pointwise=True)
pyreader_name='train_reader',
is_pointwise=True)
left_feat, pred = net.predict(left, right) left_feat, pred = net.predict(left, right)
avg_cost = loss.compute(pred, label) avg_cost = loss.compute(pred, label)
avg_cost.persistable = True avg_cost.persistable = True
optimizer.ops(avg_cost) optimizer.ops(avg_cost)
# Get Feeder and Reader # Get Feeder and Reader
get_train_examples = simnet_process.get_reader("train",epoch=args.epoch) get_train_examples = simnet_process.get_reader(
"train", epoch=args.epoch)
if args.do_valid: if args.do_valid:
test_prog = fluid.Program() test_prog = fluid.Program()
with fluid.program_guard(test_prog, startup_prog): with fluid.program_guard(test_prog, startup_prog):
with fluid.unique_name.guard(): with fluid.unique_name.guard():
test_pyreader, left, right= create_model(args, pyreader_name = 'test_reader',is_inference=True) test_pyreader, left, right = create_model(
args, pyreader_name='test_reader', is_inference=True)
left_feat, pred = net.predict(left, right) left_feat, pred = net.predict(left, right)
test_prog = test_prog.clone(for_test=True) test_prog = test_prog.clone(for_test=True)
if args.init_checkpoint != "": if args.init_checkpoint != "":
utils.init_checkpoint(exe, args.init_checkpoint, utils.init_checkpoint(exe, args.init_checkpoint, startup_prog)
startup_prog)
def valid_and_test(test_program, test_pyreader, get_valid_examples, process, mode, exe, fetch_list): def valid_and_test(test_program, test_pyreader, get_valid_examples, process,
mode, exe, fetch_list):
""" """
return auc and acc return auc and acc
""" """
# Get Batch Data # Get Batch Data
batch_data = fluid.io.batch(get_valid_examples, args.batch_size, drop_last=False) batch_data = fluid.io.batch(
get_valid_examples, args.batch_size, drop_last=False)
test_pyreader.decorate_paddle_reader(batch_data) test_pyreader.decorate_paddle_reader(batch_data)
test_pyreader.start() test_pyreader.start()
pred_list = [] pred_list = []
while True: while True:
try: try:
_pred = exe.run(program=test_program,fetch_list=[pred.name]) _pred = exe.run(program=test_program, fetch_list=[pred.name])
pred_list += list(_pred) pred_list += list(_pred)
except fluid.core.EOFException: except fluid.core.EOFException:
test_pyreader.reset() test_pyreader.reset()
...@@ -222,7 +226,8 @@ def train(conf_dict, args): ...@@ -222,7 +226,8 @@ def train(conf_dict, args):
#for epoch_id in range(args.epoch): #for epoch_id in range(args.epoch):
# used for continuous evaluation # used for continuous evaluation
if args.enable_ce: if args.enable_ce:
train_batch_data = fluid.io.batch(get_train_examples, args.batch_size, drop_last=False) train_batch_data = fluid.io.batch(
get_train_examples, args.batch_size, drop_last=False)
else: else:
train_batch_data = fluid.io.batch( train_batch_data = fluid.io.batch(
fluid.io.shuffle( fluid.io.shuffle(
...@@ -238,19 +243,23 @@ def train(conf_dict, args): ...@@ -238,19 +243,23 @@ def train(conf_dict, args):
try: try:
global_step += 1 global_step += 1
fetch_list = [avg_cost.name] fetch_list = [avg_cost.name]
avg_loss = train_exe.run(program=train_program, fetch_list = fetch_list) avg_loss = train_exe.run(program=train_program,
fetch_list=fetch_list)
losses.append(np.mean(avg_loss[0])) losses.append(np.mean(avg_loss[0]))
if args.do_valid and global_step % args.validation_steps == 0: if args.do_valid and global_step % args.validation_steps == 0:
get_valid_examples = simnet_process.get_reader("valid") get_valid_examples = simnet_process.get_reader("valid")
valid_result = valid_and_test(test_prog,test_pyreader,get_valid_examples,simnet_process,"valid",exe,[pred.name]) valid_result = valid_and_test(
test_prog, test_pyreader, get_valid_examples,
simnet_process, "valid", exe, [pred.name])
if args.compute_accuracy: if args.compute_accuracy:
valid_auc, valid_acc = valid_result valid_auc, valid_acc = valid_result
logging.info( logging.info(
"global_steps: %d, valid_auc: %f, valid_acc: %f, valid_loss: %f" % "global_steps: %d, valid_auc: %f, valid_acc: %f, valid_loss: %f"
(global_step, valid_auc, valid_acc, np.mean(losses))) % (global_step, valid_auc, valid_acc, np.mean(losses)))
else: else:
valid_auc = valid_result valid_auc = valid_result
logging.info("global_steps: %d, valid_auc: %f, valid_loss: %f" % logging.info(
"global_steps: %d, valid_auc: %f, valid_loss: %f" %
(global_step, valid_auc, np.mean(losses))) (global_step, valid_auc, np.mean(losses)))
if global_step % args.save_steps == 0: if global_step % args.save_steps == 0:
model_save_dir = os.path.join(args.output_dir, model_save_dir = os.path.join(args.output_dir,
...@@ -269,8 +278,7 @@ def train(conf_dict, args): ...@@ -269,8 +278,7 @@ def train(conf_dict, args):
] ]
target_vars = [left_feat, pred] target_vars = [left_feat, pred]
fluid.io.save_inference_model(model_path, feed_var_names, fluid.io.save_inference_model(model_path, feed_var_names,
target_vars, exe, target_vars, exe, test_prog)
test_prog)
logging.info("saving infer model in %s" % model_path) logging.info("saving infer model in %s" % model_path)
except fluid.core.EOFException: except fluid.core.EOFException:
...@@ -282,8 +290,7 @@ def train(conf_dict, args): ...@@ -282,8 +290,7 @@ def train(conf_dict, args):
ce_info.append([np.mean(losses), end_time - start_time]) ce_info.append([np.mean(losses), end_time - start_time])
#final save #final save
logging.info("the final step is %s" % global_step) logging.info("the final step is %s" % global_step)
model_save_dir = os.path.join(args.output_dir, model_save_dir = os.path.join(args.output_dir, conf_dict["model_path"])
conf_dict["model_path"])
model_path = os.path.join(model_save_dir, str(global_step)) model_path = os.path.join(model_save_dir, str(global_step))
if not os.path.exists(model_save_dir): if not os.path.exists(model_save_dir):
os.makedirs(model_save_dir) os.makedirs(model_save_dir)
...@@ -296,8 +303,7 @@ def train(conf_dict, args): ...@@ -296,8 +303,7 @@ def train(conf_dict, args):
right.name, right.name,
] ]
target_vars = [left_feat, pred] target_vars = [left_feat, pred]
fluid.io.save_inference_model(model_path, feed_var_names, fluid.io.save_inference_model(model_path, feed_var_names, target_vars, exe,
target_vars, exe,
test_prog) test_prog)
logging.info("saving infer model in %s" % model_path) logging.info("saving infer model in %s" % model_path)
# used for continuous evaluation # used for continuous evaluation
@@ -322,7 +328,9 @@ def train(conf_dict, args):
    else:
        # Get Feeder and Reader
        get_test_examples = simnet_process.get_reader("test")
-       test_result = valid_and_test(test_prog,test_pyreader,get_test_examples,simnet_process,"test",exe,[pred.name])
+       test_result = valid_and_test(test_prog, test_pyreader,
+                                    get_test_examples, simnet_process, "test",
+                                    exe, [pred.name])
        if args.compute_accuracy:
            test_auc, test_acc = test_result
            logging.info("AUC of test is %f, Accuracy of test is %f" %
@@ -348,12 +356,13 @@ def test(conf_dict, args):
    startup_prog = fluid.Program()
    get_test_examples = simnet_process.get_reader("test")
-   batch_data = fluid.io.batch(get_test_examples, args.batch_size, drop_last=False)
+   batch_data = fluid.io.batch(
+       get_test_examples, args.batch_size, drop_last=False)
    test_prog = fluid.Program()
    conf_dict['dict_size'] = len(vocab)
-   net = utils.import_class("../models/matching",
+   net = utils.import_class("../shared_modules/models/matching",
                             conf_dict["net"]["module_name"],
                             conf_dict["net"]["class_name"])(conf_dict)
@@ -364,9 +373,7 @@ def test(conf_dict, args):
        with fluid.program_guard(test_prog, startup_prog):
            with fluid.unique_name.guard():
                test_pyreader, left, pos_right = create_model(
-                   args,
-                   pyreader_name = 'test_reader',
-                   is_inference=True)
+                   args, pyreader_name='test_reader', is_inference=True)
                left_feat, pos_score = net.predict(left, pos_right)
                pred = pos_score
        test_prog = test_prog.clone(for_test=True)
@@ -375,18 +382,13 @@ def test(conf_dict, args):
        with fluid.program_guard(test_prog, startup_prog):
            with fluid.unique_name.guard():
                test_pyreader, left, right = create_model(
-                   args,
-                   pyreader_name = 'test_reader',
-                   is_inference=True)
+                   args, pyreader_name='test_reader', is_inference=True)
                left_feat, pred = net.predict(left, right)
        test_prog = test_prog.clone(for_test=True)
    exe.run(startup_prog)
-   utils.init_checkpoint(
-       exe,
-       args.init_checkpoint,
-       main_program=test_prog)
+   utils.init_checkpoint(exe, args.init_checkpoint, main_program=test_prog)
    test_exe = exe
    test_pyreader.decorate_paddle_reader(batch_data)
@@ -398,15 +400,18 @@ def test(conf_dict, args):
    output = []
    while True:
        try:
-           output = test_exe.run(program=test_prog,fetch_list=fetch_list)
+           output = test_exe.run(program=test_prog, fetch_list=fetch_list)
            if args.task_mode == "pairwise":
-               pred_list += list(map(lambda item: float(item[0]), output[0]))
+               pred_list += list(
+                   map(lambda item: float(item[0]), output[0]))
                predictions_file.write(u"\n".join(
-                   map(lambda item: str((item[0] + 1) / 2), output[0])) + "\n")
+                   map(lambda item: str((item[0] + 1) / 2), output[0])) +
+                   "\n")
            else:
                pred_list += map(lambda item: item, output[0])
                predictions_file.write(u"\n".join(
-                   map(lambda item: str(np.argmax(item)), output[0])) + "\n")
+                   map(lambda item: str(np.argmax(item)), output[0])) +
+                   "\n")
        except fluid.core.EOFException:
            test_pyreader.reset()
            break
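A note on the pairwise branch above: `(item[0] + 1) / 2` presumably rescales a cosine-style score from [-1, 1] into [0, 1] before it is written to the predictions file:

```python
def to_unit_interval(cos_score):
    # map a cosine similarity in [-1, 1] onto [0, 1]
    return (cos_score + 1) / 2

assert to_unit_interval(-1.0) == 0.0
assert to_unit_interval(1.0) == 1.0
```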
@@ -450,36 +455,36 @@ def infer(conf_dict, args):
    startup_prog = fluid.Program()
    get_infer_examples = simnet_process.get_infer_reader
-   batch_data = fluid.io.batch(get_infer_examples, args.batch_size, drop_last=False)
+   batch_data = fluid.io.batch(
+       get_infer_examples, args.batch_size, drop_last=False)
    test_prog = fluid.Program()
    conf_dict['dict_size'] = len(vocab)
-   net = utils.import_class("../models/matching",
+   net = utils.import_class("../shared_modules/models/matching",
                             conf_dict["net"]["module_name"],
                             conf_dict["net"]["class_name"])(conf_dict)
    if args.task_mode == "pairwise":
        with fluid.program_guard(test_prog, startup_prog):
            with fluid.unique_name.guard():
-               infer_pyreader, left, pos_right = create_model(args, pyreader_name = 'infer_reader', is_inference = True)
+               infer_pyreader, left, pos_right = create_model(
+                   args, pyreader_name='infer_reader', is_inference=True)
                left_feat, pos_score = net.predict(left, pos_right)
                pred = pos_score
        test_prog = test_prog.clone(for_test=True)
    else:
        with fluid.program_guard(test_prog, startup_prog):
            with fluid.unique_name.guard():
-               infer_pyreader, left, right = create_model(args, pyreader_name = 'infer_reader', is_inference = True)
+               infer_pyreader, left, right = create_model(
+                   args, pyreader_name='infer_reader', is_inference=True)
                left_feat, pred = net.predict(left, right)
        test_prog = test_prog.clone(for_test=True)
    exe.run(startup_prog)
-   utils.init_checkpoint(
-       exe,
-       args.init_checkpoint,
-       main_program=test_prog)
+   utils.init_checkpoint(exe, args.init_checkpoint, main_program=test_prog)
    test_exe = exe
    infer_pyreader.decorate_sample_list_generator(batch_data)
@@ -491,7 +496,7 @@ def infer(conf_dict, args):
    infer_pyreader.start()
    while True:
        try:
-           output = test_exe.run(program=test_prog,fetch_list=fetch_list)
+           output = test_exe.run(program=test_prog, fetch_list=fetch_list)
            if args.task_mode == "pairwise":
                preds_list += list(
                    map(lambda item: str((item[0] + 1) / 2), output[0]))
@@ -514,6 +519,7 @@ def get_cards():
    num = len(cards.split(","))
    return num
+
if __name__ == "__main__":
    args = ArgConfig()
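The tail of `get_cards` in the hunk above counts the entries of `CUDA_VISIBLE_DEVICES`. A self-contained sketch of the same logic (the fallback handling here is illustrative, not the script's exact code):

```python
import os

def get_cards():
    # "0,1,2,3" -> 4 visible GPUs; empty or unset -> 0
    cards = os.environ.get("CUDA_VISIBLE_DEVICES", "")
    return len(cards.split(",")) if cards else 0

print(get_cards())
```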
......
@@ -146,10 +146,10 @@ PaddlePaddle provides a rich set of computing units, so that users can build models in a modular
 ## PaddleNLP

-[**PaddleNLP**](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP) is an open-source project of natural language processing (NLP) tools, algorithms, models, and data built on the PaddlePaddle deep learning framework. More than a decade of Baidu's deep accumulation in NLP powers PaddleNLP at its core. With PaddleNLP you get:
+[**PaddleNLP**](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP) is an open-source project of natural language processing (NLP) tools, algorithms, models, and data built on the PaddlePaddle deep learning framework. More than a decade of Baidu's deep accumulation in NLP powers PaddleNLP at its core. With PaddleNLP you get:

 - **Rich and comprehensive NLP task support:**
-  - PaddleNLP offers multi-granularity, multi-scenario application support, covering NLP fundamentals from [word segmentation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), [part-of-speech tagging](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), and [named entity recognition](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis) to core NLP techniques such as [text classification](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/sentiment_classification), [text similarity](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/similarity_net), [semantic representation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK), and [text generation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleTextGEN). PaddleNLP also provides the specific core techniques and tool components, models, and pretrained parameters for common large NLP application systems (such as [reading comprehension](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMRC), [dialogue systems](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleDialogue), and [machine translation systems](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMT)), clearing your path through the NLP field.
+  - PaddleNLP offers multi-granularity, multi-scenario application support, covering NLP fundamentals from [word segmentation](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/lexical_analysis), [part-of-speech tagging](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/lexical_analysis), and [named entity recognition](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/lexical_analysis) to core NLP techniques such as [text classification](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/sentiment_classification), [text similarity](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/similarity_net), [semantic representation](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/pretrain_langauge_models), and [text generation](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/seq2seq). PaddleNLP also provides the specific core techniques and tool components, models, and pretrained parameters for common large NLP application systems (such as [reading comprehension](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/machine_reading_comprehension), [dialogue systems](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/dialogue_system), and [machine translation systems](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/machine_translation)), clearing your path through the NLP field.
 - **Stable, reliable NLP models and strong pretrained parameters:**
   - PaddleNLP integrates the NLP tools and models widely used inside Baidu, offering stable and reliable NLP algorithm solutions. Pretrained parameters learned from tens of billions of data samples and a rich set of pretrained models help you easily improve model quality and power your NLP business.
 - **Continuous improvement and technical support, to build NLP applications from scratch:**
@@ -159,30 +159,30 @@ PaddlePaddle provides a rich set of computing units, so that users can build models in a modular
 | Task type | Directory | Description |
 | ------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
-| Chinese lexical analysis | [LAC (Lexical Analysis of Chinese)](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis) | Baidu's in-house lexical analysis model tailored to Chinese, integrating Chinese word segmentation, part-of-speech tagging, and named entity recognition. The input is a string; the output is the word boundaries in the sentence together with POS tags and entity categories. |
+| Chinese lexical analysis | [LAC (Lexical Analysis of Chinese)](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/lexical_analysis) | Baidu's in-house lexical analysis model tailored to Chinese, integrating Chinese word segmentation, part-of-speech tagging, and named entity recognition. The input is a string; the output is the word boundaries in the sentence together with POS tags and entity categories. |
-| Word embeddings | [Word2vec](https://github.com/PaddlePaddle/models/tree/develop/PaddleRec/word2vec) | Distributed training of Chinese word embeddings (single-machine multi-GPU, multi-machine, etc.), supporting mainstream embedding models such as skip-gram and cbow; quickly train embedding models on custom data. |
+| Word embeddings | [Word2vec](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleRec/word2vec) | Distributed training of Chinese word embeddings (single-machine multi-GPU, multi-machine, etc.), supporting mainstream embedding models such as skip-gram and cbow; quickly train embedding models on custom data. |
-| Language model | [Language_model](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/language_model) | Given an input word sequence (Chinese must be segmented first, English tokenized first), compute its generation probability. The evaluation metric for language models is PPL (perplexity), which reflects how fluently the model generates sentences. |
+| Language model | [Language_model](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/language_model) | Given an input word sequence (Chinese must be segmented first, English tokenized first), compute its generation probability. The evaluation metric for language models is PPL (perplexity), which reflects how fluently the model generates sentences. |
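Since the language-model row measures quality with PPL: perplexity is the exponential of the average negative log-probability the model assigns to the tokens, so lower means more fluent. A tiny illustration:

```python
import math

def perplexity(token_log_probs):
    """PPL = exp(-(1/N) * sum(log p(w_i)))."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

# a model assigning probabilities 0.2, 0.5, 0.1 to three tokens
print(perplexity([math.log(p) for p in (0.2, 0.5, 0.1)]))  # ~4.64
```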
 ### Core NLP techniques

 #### Semantic representation

-[PaddleLARK](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK) (Paddle LAngauge Representation ToolKit) is a further development of the traditional language model: general semantic representation models trained on large-scale corpora can benefit other NLP tasks, embodying the generic-pretraining plus task-specific fine-tuning paradigm. PaddleLARK integrates popular Chinese and English pretrained models such as ELMo, BERT, ERNIE 1.0, ERNIE 2.0, and XLNet.
+[PaddleLARK](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/pretrain_language_models) provides general semantic representation models trained on large-scale corpora that can benefit other NLP tasks, embodying the generic-pretraining plus task-specific fine-tuning paradigm. PaddleLARK integrates popular Chinese and English pretrained models such as ELMo, BERT, ERNIE 1.0, ERNIE 2.0, and XLNet.

 | Model | Description |
 | ------------------------------------------------------------ | ------------------------------------------------------------ |
 | [ERNIE](https://github.com/PaddlePaddle/ERNIE) (Enhanced Representation from kNowledge IntEgration) | Baidu's in-house semantic representation model. By modeling the words, entities, and entity relations in massive data, it learns real-world semantic knowledge. Whereas BERT learns from raw language signals, ERNIE directly models prior semantic knowledge units, strengthening the model's semantic representation ability. |
-| [BERT](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK/BERT) (Bidirectional Encoder Representation from Transformers) | A highly transferable general semantic representation model. Built from Transformer blocks and trained with the bidirectional Masked Language Model and Next Sentence Prediction objectives, its pretrained representation plus a simple output layer is applied to downstream NLP tasks, achieving SOTA results on many of them. |
+| [BERT](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/pretrain_language_models/BERT) (Bidirectional Encoder Representation from Transformers) | A highly transferable general semantic representation model. Built from Transformer blocks and trained with the bidirectional Masked Language Model and Next Sentence Prediction objectives, its pretrained representation plus a simple output layer is applied to downstream NLP tasks, achieving SOTA results on many of them. |
-| [XLNet](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK/XLNet) (XLNet: Generalized Autoregressive Pretraining for Language Understanding) | One of the important semantic representation models. With Transformer-XL as its backbone and Permutation Language Modeling as its objective, it outperforms BERT on a number of downstream tasks. |
+| [XLNet](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/pretrain_language_models/XLNet) (XLNet: Generalized Autoregressive Pretraining for Language Understanding) | One of the important semantic representation models. With Transformer-XL as its backbone and Permutation Language Modeling as its objective, it outperforms BERT on a number of downstream tasks. |
-| [ELMo](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK/ELMo) (Embeddings from Language Models) | One of the important general semantic representation models. With bidirectional LSTMs as basic network blocks and a language-model training objective, pretraining yields a general semantic representation that, transferred as features into downstream NLP tasks, significantly improves their performance. |
+| [ELMo](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/pretrain_language_models/ELMo) (Embeddings from Language Models) | One of the important general semantic representation models. With bidirectional LSTMs as basic network blocks and a language-model training objective, pretraining yields a general semantic representation that, transferred as features into downstream NLP tasks, significantly improves their performance. |

 #### Text similarity

-[SimNet](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/similarity_net) (Similarity Net) is a framework for computing short-text similarity, with core network structures including BOW, CNN, RNN, and MMDNN. Widely used across Baidu products, SimNet provides training and prediction frameworks for semantic similarity and suits scenarios such as information retrieval, news recommendation, and intelligent customer service, helping enterprises solve semantic matching problems.
+[SimNet](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/similarity_net) (Similarity Net) is a framework for computing short-text similarity, with core network structures including BOW, CNN, RNN, and MMDNN. Widely used across Baidu products, SimNet provides training and prediction frameworks for semantic similarity and suits scenarios such as information retrieval, news recommendation, and intelligent customer service, helping enterprises solve semantic matching problems.
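To make the BOW variant concrete: the simplest SimNet-style matcher averages the word embeddings on each side and compares the two vectors with cosine similarity. A toy sketch (the embedding matrix and word ids are made up for illustration):

```python
import numpy as np

def bow_cosine(left_ids, right_ids, emb):
    # bag-of-words encoding: average the word vectors on each side
    l = emb[left_ids].mean(axis=0)
    r = emb[right_ids].mean(axis=0)
    return float(l @ r / (np.linalg.norm(l) * np.linalg.norm(r) + 1e-8))

emb = np.random.rand(1000, 128).astype("float32")  # (vocab, dim), random stand-in
print(bow_cosine([3, 17, 42], [3, 99], emb))
```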
 #### Text generation

-[PaddleTextGEN](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleTextGEN) (Paddle Text Generation) is a text generation framework based on PaddlePaddle that provides a series of classic text generation models, such as vanilla seq2seq, seq2seq with attention, and variational seq2seq.
+[seq2seq](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/seq2seq) (Paddle Text Generation) is a text generation framework based on PaddlePaddle that provides a series of classic text generation models, such as vanilla seq2seq, seq2seq with attention, and variational seq2seq.

 ### NLP system applications
@@ -190,35 +190,35 @@ PaddlePaddle provides a rich set of computing units, so that users can build models in a modular
 | Model | Description |
 | ------------------------------------------------------------ | ------------------------------------------------------------ |
-| [Senta](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/sentiment_classification) (Sentiment Classification, Senta for short) | A sentiment classification model for **general scenarios**: for Chinese text containing subjective descriptions, it automatically determines the sentiment polarity of the text. |
+| [Senta](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/sentiment_classification) (Sentiment Classification, Senta for short) | A sentiment classification model for **general scenarios**: for Chinese text containing subjective descriptions, it automatically determines the sentiment polarity of the text. |
-| [EmotionDetection](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/emotion_detection) (Emotion Detection, EmoTect for short) | Focused on recognizing user emotion in **human-machine dialogue scenarios**: for user text in intelligent dialogue, it automatically determines the emotion category of the text. |
+| [EmotionDetection](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/emotion_detection) (Emotion Detection, EmoTect for short) | Focused on recognizing user emotion in **human-machine dialogue scenarios**: for user text in intelligent dialogue, it automatically determines the emotion category of the text. |

 #### Reading comprehension

-[PaddleMRC](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMRC) (Paddle Machine Reading Comprehension) brings together Baidu's work on reading comprehension: models, tools, open-source datasets, and more.
+[machine_reading_comprehension](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/machine_reading_comprehension) (Paddle Machine Reading Comprehension) brings together Baidu's work on reading comprehension: models, tools, open-source datasets, and more.

 | Model | Description |
 | ------------------------------------------------------------ | ------------------------------------------------------------ |
-| [DuReader](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/Research/ACL2018-DuReader) | Contains Baidu's open-source large-scale Chinese reading comprehension dataset built from real search-user behavior, along with baseline models. |
+| [DuReader](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/Research/ACL2018-DuReader) | Contains Baidu's open-source large-scale Chinese reading comprehension dataset built from real search-user behavior, along with baseline models. |
-| [KT-Net](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/Research/ACL2019-KTNET) | A knowledge-enhanced reading comprehension model; once ranked first on SQuAD. |
+| [KT-Net](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/Research/ACL2019-KTNET) | A knowledge-enhanced reading comprehension model; once ranked first on SQuAD. |
-| [D-Net](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/Research/MRQA2019-D-NET) | An all-around reading comprehension model that won 10 titles at the EMNLP 2019 international reading comprehension competition. |
+| [D-Net](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/Research/MRQA2019-D-NET) | An all-around reading comprehension model that won 10 titles at the EMNLP 2019 international reading comprehension competition. |

 #### Machine translation

-[PaddleMT](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMT), short for Paddle Machine Translation, is the classic Transformer-based machine translation model from the paper [Attention Is All You Need](https://arxiv.org/abs/1706.03762).
+[machine_translation](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/machine_translation), short for Paddle Machine Translation, is the classic Transformer-based machine translation model from the paper [Attention Is All You Need](https://arxiv.org/abs/1706.03762).

 #### Dialogue systems

-[PaddleDialogue](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleDialogue) contains models, datasets, and tools for dialogue systems.
+[dialogue_system](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/dialogue_system) contains models, datasets, and tools for dialogue systems.

 | Model | Description |
 | ------------------------------------------------------------ | ------------------------------------------------------------ |
-| [DGU](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleDialogue/dialogue_general_understanding) (Dialogue General Understanding) | Covers common dialogue-system tasks, including context-response matching for **retrieval-based chat systems** and **intent recognition**, **slot parsing**, and **state tracking** for **task-oriented dialogue systems**, achieving the best results on 6 public international datasets. |
+| [DGU](https://github.com/PaddlePaddle/models/tree/develop/release/1.7/dialogue_system/dialogue_general_understanding) (Dialogue General Understanding) | Covers common dialogue-system tasks, including context-response matching for **retrieval-based chat systems** and **intent recognition**, **slot parsing**, and **state tracking** for **task-oriented dialogue systems**, achieving the best results on 6 public international datasets. |
-| [ADEM](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleDialogue/auto_dialogue_evaluation) (Auto Dialogue Evaluation Model) | Evaluates the response quality of open-domain dialogue systems, helping enterprises and individuals quickly assess response quality and reduce manual evaluation costs. |
+| [ADEM](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/dialogue_system/auto_dialogue_evaluation) (Auto Dialogue Evaluation Model) | Evaluates the response quality of open-domain dialogue systems, helping enterprises and individuals quickly assess response quality and reduce manual evaluation costs. |
-| [Proactive Conversation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/Research/ACL2019-DuConv) | Contains [DuConv](https://ai.baidu.com/broad/subordinate?dataset=duconv), Baidu's open-source knowledge-driven open-domain dialogue dataset, along with baseline models. The companion paper [Proactive Human-Machine Conversation with Explicit Conversation Goals](https://arxiv.org/abs/1906.05572) was published at ACL 2019. |
+| [Proactive Conversation](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/Research/ACL2019-DuConv) | Contains [DuConv](https://ai.baidu.com/broad/subordinate?dataset=duconv), Baidu's open-source knowledge-driven open-domain dialogue dataset, along with baseline models. The companion paper [Proactive Human-Machine Conversation with Explicit Conversation Goals](https://arxiv.org/abs/1906.05572) was published at ACL 2019. |
-| [DAM](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/Research/ACL2018-DAM) (Deep Attention Matching Network) | An open-domain multi-turn response matching model; the companion paper [Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network](https://aclweb.org/anthology/P18-1103/) was published at ACL 2018. |
+| [DAM](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/Research/ACL2018-DAM) (Deep Attention Matching Network) | An open-domain multi-turn response matching model; the companion paper [Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network](https://aclweb.org/anthology/P18-1103/) was published at ACL 2018. |

-For Baidu's latest open-sourced frontier work, see [Research](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/Research).
+For Baidu's latest open-sourced frontier work, see [Research](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/Research).
 ## PaddleRec
......