未验证 提交 f492ae4f 编写于 作者: P pkpk 提交者: GitHub

refactor the PaddleNLP (#4351)

* Update README.md (#4267)

* test=develop (#4269)

* 3d use new api (#4275)

* PointNet++ and PointRCNN use new API

* Update Readme of Dygraph BERT (#4277)

Fix some typos.

* Update run_classifier_multi_gpu.sh (#4279)

remove the CUDA_VISIBLE_DEVICES

* Update README.md (#4280)

* 17 update api (#4294)

* update1.7 save/load & fluid.data

* update datafeed to dataloader

* Update resnet_acnet.py (#4297)

Bias attr of square conv should be "False" rather than None during training mode.

* test=develop

* test=develop

* test=develop

* test=develop

* test
Co-authored-by: NKaipeng Deng <dengkaipeng@baidu.com>
Co-authored-by: Nzhang wenhui <frankwhzhang@126.com>
Co-authored-by: Nparap1uie-s <parap1uie-s@users.noreply.github.com>
上级 8dc42c73
......@@ -13,6 +13,3 @@
[submodule "PaddleSpeech/DeepSpeech"]
path = PaddleSpeech/DeepSpeech
url = https://github.com/PaddlePaddle/DeepSpeech.git
[submodule "PaddleNLP/PALM"]
path = PaddleNLP/PALM
url = https://github.com/PaddlePaddle/PALM
Subproject commit 5426f75073cf5bd416622dbe71b146d3dc8fffb6
Subproject commit 30b892e3c029bff706337f269e6c158b0a223f60
......@@ -10,7 +10,7 @@
- **丰富而全面的NLP任务支持:**
- PaddleNLP为您提供了多粒度,多场景的应用支持。涵盖了从[分词](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis)[词性标注](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis)[命名实体识别](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis)等NLP基础技术,到[文本分类](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/sentiment_classification)[文本相似度计算](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/similarity_net)[语义表示](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK)[文本生成](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleTextGEN)等NLP核心技术。同时,PaddleNLP还提供了针对常见NLP大型应用系统(如[阅读理解](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMRC)[对话系统](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleDialogue)[机器翻译系统](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMT)等)的特定核心技术和工具组件,模型和预训练参数等,让您在NLP领域畅通无阻。
- PaddleNLP为您提供了多粒度,多场景的应用支持。涵盖了从[分词](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/lexical_analysis)[词性标注](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/lexical_analysis)[命名实体识别](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/lexical_analysis)等NLP基础技术,到[文本分类](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/sentiment_classification)[文本相似度计算](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/similarity_net)[语义表示](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/pretrain_langauge_models)[文本生成](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/seq2seq)等NLP核心技术。同时,PaddleNLP还提供了针对常见NLP大型应用系统(如[阅读理解](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/machine_reading_comprehension)[对话系统](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/dialogue_system)[机器翻译系统](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/machine_translation)等)的特定核心技术和工具组件,模型和预训练参数等,让您在NLP领域畅通无阻。
- **稳定可靠的NLP模型和强大的预训练参数:**
......@@ -50,17 +50,17 @@ cd models/PaddleNLP/sentiment_classification
| 任务场景 | 对应项目/目录 | 简介 |
| :------------------------------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: |
| **中文分词****词性标注****命名实体识别**:fire: | [LAC](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis) | LAC,全称为Lexical Analysis of Chinese,是百度内部广泛使用的中文处理工具,功能涵盖从中文分词,词性标注,命名实体识别等常见中文处理任务。 |
| **词向量(word2vec)** | [word2vec](https://github.com/PaddlePaddle/models/tree/develop/PaddleRec/word2vec) | 提供单机多卡,多机等分布式训练中文词向量能力,支持主流词向量模型(skip-gram,cbow等),可以快速使用自定义数据训练词向量模型。 |
| **语言模型** | [Language_model](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/language_model) | 基于循环神经网络(RNN)的经典神经语言模型(neural language model)。 |
| **情感分类**:fire: | [Senta](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/sentiment_classification)[EmotionDetection](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/emotion_detection) | Senta(Sentiment Classification,简称Senta)和EmotionDetection两个项目分别提供了面向*通用场景**人机对话场景专用*的情感倾向性分析模型。 |
| **文本相似度计算**:fire: | [SimNet](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/similarity_net) | SimNet,又称为Similarity Net,为您提供高效可靠的文本相似度计算工具和预训练模型。 |
| **语义表示**:fire: | [PaddleLARK](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK) | PaddleLARK,全称为Paddle LAngauge Representation Toolkit,集成了ELMO,BERT,ERNIE 1.0,ERNIE 2.0,XLNet等热门中英文预训练模型。 |
| **文本生成** | [PaddleTextGEN](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleTextGEN) | Paddle Text Generation为您提供了一些列经典文本生成模型案例,如vanilla seq2seq,seq2seq with attention,variational seq2seq模型等。 |
| **阅读理解** | [PaddleMRC](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMRC) | PaddleMRC,全称为Paddle Machine Reading Comprehension,集合了百度在阅读理解领域相关的模型,工具,开源数据等一系列工作。包括DuReader (百度开源的基于真实搜索用户行为的中文大规模阅读理解数据集),KT-Net (结合知识的阅读理解模型,SQuAD以及ReCoRD曾排名第一), D-Net (预训练-微调框架,在EMNLP2019 MRQA国际阅读理解评测获得第一),等。 |
| **对话系统** | [PaddleDialogue](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleDialogue) | 包括:1)DGU(Dialogue General Understanding,通用对话理解模型)覆盖了包括**检索式聊天系统**中context-response matching任务和**任务完成型对话系统****意图识别****槽位解析****状态追踪**等常见对话系统任务,在6项国际公开数据集中都获得了最佳效果。<br/> 2) knowledge-driven dialogue:百度开源的知识驱动的开放领域对话数据集,发表于ACL2019。<br/>3)ADEM(Auto Dialogue Evaluation Model):对话自动评估模型,可用于自动评估不同对话生成模型的回复质量。 |
| **机器翻译** | [PaddleMT](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMT) | 全称为Paddle Machine Translation,基于Transformer的经典机器翻译模型。 |
| **其他前沿工作** | [Research](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/Research) | 百度最新前沿工作开源。 |
| **中文分词****词性标注****命名实体识别**:fire: | [LAC](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/lexical_analysis) | LAC,全称为Lexical Analysis of Chinese,是百度内部广泛使用的中文处理工具,功能涵盖从中文分词,词性标注,命名实体识别等常见中文处理任务。 |
| **词向量(word2vec)** | [word2vec](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleRec/word2vec) | 提供单机多卡,多机等分布式训练中文词向量能力,支持主流词向量模型(skip-gram,cbow等),可以快速使用自定义数据训练词向量模型。 |
| **语言模型** | [Language_model](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/language_model) | 基于循环神经网络(RNN)的经典神经语言模型(neural language model)。 |
| **情感分类**:fire: | [Senta](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/sentiment_classification)[EmotionDetection](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/emotion_detection) | Senta(Sentiment Classification,简称Senta)和EmotionDetection两个项目分别提供了面向*通用场景**人机对话场景专用*的情感倾向性分析模型。 |
| **文本相似度计算**:fire: | [SimNet](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/similarity_net) | SimNet,又称为Similarity Net,为您提供高效可靠的文本相似度计算工具和预训练模型。 |
| **语义表示**:fire: | [pretrain_langauge_models](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/pretrain_langauge_models) | 集成了ELMO,BERT,ERNIE 1.0,ERNIE 2.0,XLNet等热门中英文预训练模型。 |
| **文本生成** | [seq2seq](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/PaddleTextGEN) | seq2seq为您提供了一些列经典文本生成模型案例,如vanilla seq2seq,seq2seq with attention,variational seq2seq模型等。 |
| **阅读理解** | [machine_reading_comprehension](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/machine_reading_comprehension) | Paddle Machine Reading Comprehension,集合了百度在阅读理解领域相关的模型,工具,开源数据等一系列工作。包括DuReader (百度开源的基于真实搜索用户行为的中文大规模阅读理解数据集),KT-Net (结合知识的阅读理解模型,SQuAD以及ReCoRD曾排名第一), D-Net (预训练-微调框架,在EMNLP2019 MRQA国际阅读理解评测获得第一),等。 |
| **对话系统** | [dialogue_system](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/dialogue_system) | 包括:1)DGU(Dialogue General Understanding,通用对话理解模型)覆盖了包括**检索式聊天系统**中context-response matching任务和**任务完成型对话系统****意图识别****槽位解析****状态追踪**等常见对话系统任务,在6项国际公开数据集中都获得了最佳效果。<br/> 2) knowledge-driven dialogue:百度开源的知识驱动的开放领域对话数据集,发表于ACL2019。<br/>3)ADEM(Auto Dialogue Evaluation Model):对话自动评估模型,可用于自动评估不同对话生成模型的回复质量。 |
| **机器翻译** | [machine_translation](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/machine_translation) | 全称为Paddle Machine Translation,基于Transformer的经典机器翻译模型。 |
| **其他前沿工作** | [Research](https://github.com/PaddlePaddle/models/tree/release/1.7/PaddleNLP/Research) | 百度最新前沿工作开源。 |
......@@ -70,13 +70,13 @@ cd models/PaddleNLP/sentiment_classification
```text
.
├── Research # 百度NLP在research方面的工作集合
├── PaddleMT # 机器翻译相关代码,数据,预训练模型
├── PaddleDialogue # 对话系统相关代码,数据,预训练模型
├── PaddleMRC # 阅读理解相关代码,数据,预训练模型
├── PaddleLARK # 语言表示工具箱
├── machine_translation # 机器翻译相关代码,数据,预训练模型
├── dialogue_system # 对话系统相关代码,数据,预训练模型
├── machcine_reading_comprehension # 阅读理解相关代码,数据,预训练模型
├── pretrain_langauge_models # 语言表示工具箱
├── language_model # 语言模型
├── lexical_analysis # LAC词法分析
├── models # 共享网络
├── shared_modules/models # 共享网络
│ ├── __init__.py
│ ├── classification
│ ├── dialogue_model_toolkit
......@@ -87,7 +87,7 @@ cd models/PaddleNLP/sentiment_classification
│ ├── representation
│ ├── sequence_labeling
│ └── transformer_encoder.py
├── preprocess # 共享文本预处理工具
├── shared_modules/preprocess # 共享文本预处理工具
│ ├── __init__.py
│ ├── ernie
│ ├── padding.py
......
......@@ -21,8 +21,10 @@ from kpi import DurationKpi
train_loss_card1 = CostKpi('train_loss_card1', 0.03, 0, actived=True)
train_loss_card4 = CostKpi('train_loss_card4', 0.03, 0, actived=True)
train_duration_card1 = DurationKpi('train_duration_card1', 0.01, 0, actived=True)
train_duration_card4 = DurationKpi('train_duration_card4', 0.01, 0, actived=True)
train_duration_card1 = DurationKpi(
'train_duration_card1', 0.01, 0, actived=True)
train_duration_card4 = DurationKpi(
'train_duration_card4', 0.01, 0, actived=True)
tracking_kpis = [
train_loss_card1,
......
......@@ -20,22 +20,25 @@ import sys
import io
import os
URLLIB=urllib
URLLIB = urllib
if sys.version_info >= (3, 0):
import urllib.request
URLLIB=urllib.request
URLLIB = urllib.request
DATA_MODEL_PATH = {"DATA_PATH": "https://baidu-nlp.bj.bcebos.com/auto_dialogue_evaluation_dataset-1.0.0.tar.gz",
"TRAINED_MODEL": "https://baidu-nlp.bj.bcebos.com/auto_dialogue_evaluation_models.2.0.0.tar.gz"}
DATA_MODEL_PATH = {
"DATA_PATH":
"https://baidu-nlp.bj.bcebos.com/auto_dialogue_evaluation_dataset-1.0.0.tar.gz",
"TRAINED_MODEL":
"https://baidu-nlp.bj.bcebos.com/auto_dialogue_evaluation_models.2.0.0.tar.gz"
}
PATH_MAP = {'DATA_PATH': "./data/input",
'TRAINED_MODEL': './data/saved_models'}
PATH_MAP = {'DATA_PATH': "./data/input", 'TRAINED_MODEL': './data/saved_models'}
def un_tar(tar_name, dir_name):
try:
t = tarfile.open(tar_name)
t.extractall(path = dir_name)
t.extractall(path=dir_name)
return True
except Exception as e:
print(e)
......@@ -51,7 +54,8 @@ def download_model_and_data():
shutil.rmtree(path)
for path_key in DATA_MODEL_PATH:
filename = os.path.basename(DATA_MODEL_PATH[path_key])
URLLIB.urlretrieve(DATA_MODEL_PATH[path_key], os.path.join("./", filename))
URLLIB.urlretrieve(DATA_MODEL_PATH[path_key],
os.path.join("./", filename))
state = un_tar(filename, PATH_MAP[path_key])
if not state:
print("Tar %s error....." % path_key)
......
......@@ -122,5 +122,3 @@ def save_param(args, exe, program, dirname):
print("save parameters at %s" % (os.path.join(param_dir, dirname)))
return True
......@@ -21,8 +21,7 @@ import paddle
import paddle.fluid as fluid
def create_net(
is_training,
def create_net(is_training,
model_input,
args,
clip_value=10.0,
......@@ -52,14 +51,12 @@ def create_net(
initializer=fluid.initializer.Normal(scale=0.1)))
#fc to fit dynamic LSTM
context_fc = fluid.layers.fc(
input=context_emb,
context_fc = fluid.layers.fc(input=context_emb,
size=args.hidden_size * 4,
param_attr=fluid.ParamAttr(name='fc_weight'),
bias_attr=fluid.ParamAttr(name='fc_bias'))
response_fc = fluid.layers.fc(
input=response_emb,
response_fc = fluid.layers.fc(input=response_emb,
size=args.hidden_size * 4,
param_attr=fluid.ParamAttr(name='fc_weight'),
bias_attr=fluid.ParamAttr(name='fc_bias'))
......@@ -106,7 +103,5 @@ def set_word_embedding(word_emb, place, word_emb_name="shared_word_emb"):
"""
Set word embedding
"""
word_emb_param = fluid.global_scope().find_var(
word_emb_name).get_tensor()
word_emb_param = fluid.global_scope().find_var(word_emb_name).get_tensor()
word_emb_param.set(word_emb, place)
......@@ -42,22 +42,24 @@ def do_save_inference_model(args):
with fluid.unique_name.guard():
context_wordseq = fluid.data(
name='context_wordseq', shape=[-1, 1], dtype='int64', lod_level=1)
name='context_wordseq',
shape=[-1, 1],
dtype='int64',
lod_level=1)
response_wordseq = fluid.data(
name='response_wordseq', shape=[-1, 1], dtype='int64', lod_level=1)
labels = fluid.data(
name='labels', shape=[-1, 1], dtype='int64')
name='response_wordseq',
shape=[-1, 1],
dtype='int64',
lod_level=1)
labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')
input_inst = [context_wordseq, response_wordseq, labels]
input_field = InputField(input_inst)
data_reader = fluid.io.PyReader(feed_list=input_inst,
capacity=4, iterable=False)
data_reader = fluid.io.PyReader(
feed_list=input_inst, capacity=4, iterable=False)
logits = create_net(
is_training=False,
model_input=input_field,
args=args
)
is_training=False, model_input=input_field, args=args)
if args.use_cuda:
place = fluid.CUDAPlace(0)
......@@ -81,9 +83,7 @@ def do_save_inference_model(args):
input_field.context_wordseq.name,
input_field.response_wordseq.name,
],
target_vars=[
logits,
],
target_vars=[logits, ],
executor=exe,
main_program=test_prog,
model_filename="model.pdmodel",
......
......@@ -26,7 +26,6 @@ from inference_model import do_save_inference_model
from ade.utils.configure import PDConfig
if __name__ == "__main__":
args = PDConfig(yaml_file="./data/config/ade.yaml")
......
......@@ -46,22 +46,24 @@ def do_predict(args):
with fluid.unique_name.guard():
context_wordseq = fluid.data(
name='context_wordseq', shape=[-1, 1], dtype='int64', lod_level=1)
name='context_wordseq',
shape=[-1, 1],
dtype='int64',
lod_level=1)
response_wordseq = fluid.data(
name='response_wordseq', shape=[-1, 1], dtype='int64', lod_level=1)
labels = fluid.data(
name='labels', shape=[-1, 1], dtype='int64')
name='response_wordseq',
shape=[-1, 1],
dtype='int64',
lod_level=1)
labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')
input_inst = [context_wordseq, response_wordseq, labels]
input_field = InputField(input_inst)
data_reader = fluid.io.PyReader(feed_list=input_inst,
capacity=4, iterable=False)
data_reader = fluid.io.PyReader(
feed_list=input_inst, capacity=4, iterable=False)
logits = create_net(
is_training=False,
model_input=input_field,
args=args
)
is_training=False, model_input=input_field, args=args)
logits.persistable = True
fetch_list = [logits.name]
......@@ -89,10 +91,7 @@ def do_predict(args):
batch_size=args.batch_size)
batch_generator = processor.data_generator(
place=place,
phase="test",
shuffle=False,
sample_pro=1)
place=place, phase="test", shuffle=False, sample_pro=1)
num_test_examples = processor.get_num_examples(phase='test')
data_reader.decorate_batch_generator(batch_generator)
......@@ -107,7 +106,7 @@ def do_predict(args):
data_reader.reset()
break
scores = scores[: num_test_examples]
scores = scores[:num_test_examples]
print("Write the predicted results into the output_prediction_file")
fw = io.open(args.output_prediction_file, 'w', encoding="utf8")
for index, score in enumerate(scores):
......
......@@ -49,22 +49,24 @@ def do_train(args):
with fluid.unique_name.guard():
context_wordseq = fluid.data(
name='context_wordseq', shape=[-1, 1], dtype='int64', lod_level=1)
name='context_wordseq',
shape=[-1, 1],
dtype='int64',
lod_level=1)
response_wordseq = fluid.data(
name='response_wordseq', shape=[-1, 1], dtype='int64', lod_level=1)
labels = fluid.data(
name='labels', shape=[-1, 1], dtype='int64')
name='response_wordseq',
shape=[-1, 1],
dtype='int64',
lod_level=1)
labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')
input_inst = [context_wordseq, response_wordseq, labels]
input_field = InputField(input_inst)
data_reader = fluid.io.PyReader(feed_list=input_inst,
capacity=4, iterable=False)
data_reader = fluid.io.PyReader(
feed_list=input_inst, capacity=4, iterable=False)
loss = create_net(
is_training=True,
model_input=input_field,
args=args
)
is_training=True, model_input=input_field, args=args)
loss.persistable = True
# gradient clipping
fluid.clip.set_gradient_clip(clip=fluid.clip.GradientClipByValue(
......@@ -74,7 +76,8 @@ def do_train(args):
if args.use_cuda:
dev_count = fluid.core.get_cuda_device_count()
place = fluid.CUDAPlace(int(os.getenv('FLAGS_selected_gpus', '0')))
place = fluid.CUDAPlace(
int(os.getenv('FLAGS_selected_gpus', '0')))
else:
dev_count = int(os.environ.get('CPU_NUM', 1))
place = fluid.CPUPlace()
......@@ -114,9 +117,14 @@ def do_train(args):
if args.word_emb_init:
print("start loading word embedding init ...")
if six.PY2:
word_emb = np.array(pickle.load(io.open(args.word_emb_init, 'rb'))).astype('float32')
word_emb = np.array(
pickle.load(io.open(args.word_emb_init, 'rb'))).astype(
'float32')
else:
word_emb = np.array(pickle.load(io.open(args.word_emb_init, 'rb'), encoding="bytes")).astype('float32')
word_emb = np.array(
pickle.load(
io.open(args.word_emb_init, 'rb'),
encoding="bytes")).astype('float32')
set_word_embedding(word_emb, place)
print("finish init word embedding ...")
......@@ -147,15 +155,20 @@ def do_train(args):
used_time = time_end - time_begin
current_time = time.strftime('%Y-%m-%d %H:%M:%S',
time.localtime(time.time()))
print('%s epoch: %d, step: %s, avg loss %s, speed: %f steps/s' % (current_time, epoch_step, steps, sum_loss / args.print_steps, args.print_steps / used_time))
print(
'%s epoch: %d, step: %s, avg loss %s, speed: %f steps/s'
% (current_time, epoch_step, steps, sum_loss /
args.print_steps, args.print_steps / used_time))
sum_loss = 0.0
time_begin = time.time()
if steps % args.save_steps == 0:
if args.save_checkpoint:
save_load_io.save_checkpoint(args, exe, train_prog, "step_" + str(steps))
save_load_io.save_checkpoint(args, exe, train_prog,
"step_" + str(steps))
if args.save_param:
save_load_io.save_param(args, exe, train_prog, "step_" + str(steps))
save_load_io.save_param(args, exe, train_prog,
"step_" + str(steps))
steps += 1
except fluid.core.EOFException:
data_reader.reset()
......
......@@ -20,12 +20,18 @@ from kpi import CostKpi
from kpi import DurationKpi
from kpi import AccKpi
each_step_duration_atis_slot_card1 = DurationKpi('each_step_duration_atis_slot_card1', 0.01, 0, actived=True)
train_loss_atis_slot_card1 = CostKpi('train_loss_atis_slot_card1', 0.08, 0, actived=True)
train_acc_atis_slot_card1 = CostKpi('train_acc_atis_slot_card1', 0.01, 0, actived=True)
each_step_duration_atis_slot_card4 = DurationKpi('each_step_duration_atis_slot_card4', 0.06, 0, actived=True)
train_loss_atis_slot_card4 = CostKpi('train_loss_atis_slot_card4', 0.03, 0, actived=True)
train_acc_atis_slot_card4 = CostKpi('train_acc_atis_slot_card4', 0.01, 0, actived=True)
each_step_duration_atis_slot_card1 = DurationKpi(
'each_step_duration_atis_slot_card1', 0.01, 0, actived=True)
train_loss_atis_slot_card1 = CostKpi(
'train_loss_atis_slot_card1', 0.08, 0, actived=True)
train_acc_atis_slot_card1 = CostKpi(
'train_acc_atis_slot_card1', 0.01, 0, actived=True)
each_step_duration_atis_slot_card4 = DurationKpi(
'each_step_duration_atis_slot_card4', 0.06, 0, actived=True)
train_loss_atis_slot_card4 = CostKpi(
'train_loss_atis_slot_card4', 0.03, 0, actived=True)
train_acc_atis_slot_card4 = CostKpi(
'train_acc_atis_slot_card4', 0.01, 0, actived=True)
tracking_kpis = [
each_step_duration_atis_slot_card1,
......
......@@ -100,8 +100,12 @@ def prepare_batch_data(task_name,
if isinstance(insts[0][3], list):
if task_name == "atis_slot":
labels_list = [inst[3] + [0] * (max_len - len(inst[3])) for inst in insts]
labels_list = [np.array(labels_list).astype("int64").reshape([-1, max_len])]
labels_list = [
inst[3] + [0] * (max_len - len(inst[3])) for inst in insts
]
labels_list = [
np.array(labels_list).astype("int64").reshape([-1, max_len])
]
elif task_name == "dstc2":
labels_list = [inst[3] for inst in insts]
labels_list = [np.array(labels_list).astype("int64")]
......@@ -124,10 +128,7 @@ def prepare_batch_data(task_name,
out = batch_src_ids
# Second step: padding
src_id, self_input_mask = pad_batch_data(
out,
max_len,
pad_idx=pad_id,
return_input_mask=True)
out, max_len, pad_idx=pad_id, return_input_mask=True)
pos_id = pad_batch_data(
batch_pos_ids,
max_len,
......@@ -163,13 +164,13 @@ def pad_batch_data(insts,
corresponding position data and attention bias.
"""
return_list = []
max_len = max_len_in if max_len_in != -1 else max(len(inst) for inst in insts)
max_len = max_len_in if max_len_in != -1 else max(
len(inst) for inst in insts)
# Any token included in dict can be used to pad, since the paddings' loss
# will be masked out by weights and make no effect on parameter gradients.
inst_data = np.array(
[inst + list([pad_idx] * (max_len - len(inst))) for inst in insts
])
[inst + list([pad_idx] * (max_len - len(inst))) for inst in insts])
return_list += [inst_data.astype("int64").reshape([-1, max_len])]
# position data
......
......@@ -25,18 +25,21 @@ class DefinePredict(object):
"""
Packaging Prediction Results
"""
def __init__(self):
"""
init
"""
self.task_map = {'udc': 'get_matching_res',
self.task_map = {
'udc': 'get_matching_res',
'swda': 'get_cls_res',
'mrda': 'get_cls_res',
'atis_intent': 'get_cls_res',
'atis_slot': 'get_sequence_tagging',
'dstc2': 'get_multi_cls_res',
'dstc2_asr': 'get_multi_cls_res',
'multi-woz': 'get_multi_cls_res'}
'multi-woz': 'get_multi_cls_res'
}
def get_matching_res(self, probs, params=None):
"""
......@@ -79,7 +82,3 @@ class DefinePredict(object):
label_str = " ".join([str(l) for l in sorted(labels)])
return label_str
......@@ -20,25 +20,29 @@ import sys
import io
import os
URLLIB=urllib
URLLIB = urllib
if sys.version_info >= (3, 0):
import urllib.request
URLLIB=urllib.request
URLLIB = urllib.request
DATA_MODEL_PATH = {"DATA_PATH": "https://baidu-nlp.bj.bcebos.com/dmtk_data_1.0.0.tar.gz",
"PRETRAIN_MODEL": "https://bert-models.bj.bcebos.com/uncased_L-12_H-768_A-12.tar.gz",
"TRAINED_MODEL": "https://baidu-nlp.bj.bcebos.com/dgu_models_2.0.0.tar.gz"}
DATA_MODEL_PATH = {
"DATA_PATH": "https://baidu-nlp.bj.bcebos.com/dmtk_data_1.0.0.tar.gz",
"PRETRAIN_MODEL":
"https://bert-models.bj.bcebos.com/uncased_L-12_H-768_A-12.tar.gz",
"TRAINED_MODEL": "https://baidu-nlp.bj.bcebos.com/dgu_models_2.0.0.tar.gz"
}
PATH_MAP = {'DATA_PATH': "./data/input",
PATH_MAP = {
'DATA_PATH': "./data/input",
'PRETRAIN_MODEL': './data/pretrain_model',
'TRAINED_MODEL': './data/saved_models'}
'TRAINED_MODEL': './data/saved_models'
}
def un_tar(tar_name, dir_name):
try:
t = tarfile.open(tar_name)
t.extractall(path = dir_name)
t.extractall(path=dir_name)
return True
except Exception as e:
print(e)
......@@ -48,13 +52,18 @@ def un_tar(tar_name, dir_name):
def download_model_and_data():
print("Downloading dgu data, pretrain model and trained models......")
print("This process is quite long, please wait patiently............")
for path in ['./data/input/data', './data/pretrain_model/uncased_L-12_H-768_A-12', './data/saved_models/trained_models']:
for path in [
'./data/input/data',
'./data/pretrain_model/uncased_L-12_H-768_A-12',
'./data/saved_models/trained_models'
]:
if not os.path.exists(path):
continue
shutil.rmtree(path)
for path_key in DATA_MODEL_PATH:
filename = os.path.basename(DATA_MODEL_PATH[path_key])
URLLIB.urlretrieve(DATA_MODEL_PATH[path_key], os.path.join("./", filename))
URLLIB.urlretrieve(DATA_MODEL_PATH[path_key],
os.path.join("./", filename))
state = un_tar(filename, PATH_MAP[path_key])
if not state:
print("Tar %s error....." % path_key)
......
......@@ -19,6 +19,3 @@ python run_build_data.py udc
python run_build_data.py atis
生成槽位识别数据在dialogue_general_understanding/data/input/data/atis/atis_slot
生成意图识别数据在dialogue_general_understanding/data/input/data/atis/atis_intent
......@@ -12,7 +12,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""build swda train dev test dataset"""
import json
......@@ -27,6 +26,7 @@ class ATIS(object):
"""
nlu dataset atis data process
"""
def __init__(self):
"""
init instance
......@@ -73,7 +73,8 @@ class ATIS(object):
if example[1] not in self.intent_dict:
self.intent_dict[example[1]] = self.intent_id
self.intent_id += 1
fw.write(u"%s\t%s\n" % (self.intent_dict[example[1]], example[0].lower()))
fw.write(u"%s\t%s\n" %
(self.intent_dict[example[1]], example[0].lower()))
fw = io.open(self.map_tag_intent, 'w', encoding="utf8")
for tag in self.intent_dict:
......@@ -109,17 +110,19 @@ class ATIS(object):
tags_slot.append(str(self.slot_dict[tag]))
if i == 0:
if start not in [0, 1]:
prefix_num = len(text[: start].strip().split())
prefix_num = len(text[:start].strip().split())
tags.extend([str(self.slot_dict['O'])] * prefix_num)
tags.extend(tags_slot)
else:
prefix_num = len(text[entities[i - 1]['end']: start].strip().split())
prefix_num = len(text[entities[i - 1]['end']:start].strip()
.split())
tags.extend([str(self.slot_dict['O'])] * prefix_num)
tags.extend(tags_slot)
if entities[-1]['end'] < len(text):
suffix_num = len(text[entities[-1]['end']:].strip().split())
tags.extend([str(self.slot_dict['O'])] * suffix_num)
fw.write(u"%s\t%s\n" % (text.encode('utf8'), " ".join(tags).encode('utf8')))
fw.write(u"%s\t%s\n" %
(text.encode('utf8'), " ".join(tags).encode('utf8')))
fw = io.open(self.map_tag_slot, 'w', encoding="utf8")
for slot in self.slot_dict:
......@@ -152,7 +155,3 @@ class ATIS(object):
if __name__ == "__main__":
atis_inst = ATIS()
atis_inst.main()
......@@ -28,6 +28,7 @@ class DSTC2(object):
"""
dialogue state tracking dstc2 data process
"""
def __init__(self):
"""
init instance
......@@ -49,7 +50,8 @@ class DSTC2(object):
self.data_dict = commonlib.load_dict(self.data_list)
for data_type in self.data_dict:
for i in range(len(self.data_dict[data_type])):
self.data_dict[data_type][i] = os.path.join(self.src_dir, self.data_dict[data_type][i])
self.data_dict[data_type][i] = os.path.join(
self.src_dir, self.data_dict[data_type][i])
def _load_ontology(self):
"""
......@@ -97,15 +99,25 @@ class DSTC2(object):
log_turn = log_json["turns"][i]
label_turn = label_json["turns"][i]
assert log_turn["turn-index"] == label_turn["turn-index"]
labels = ["%s_%s" % (slot, label_turn["goal-labels"][slot]) for slot in label_turn["goal-labels"]]
labels_ids = " ".join([str(self.map_tag_dict.get(label, self.map_tag_dict["%s_none" % label.split('_')[0]])) for label in labels])
labels = [
"%s_%s" % (slot, label_turn["goal-labels"][slot])
for slot in label_turn["goal-labels"]
]
labels_ids = " ".join([
str(
self.map_tag_dict.get(label, self.map_tag_dict[
"%s_none" % label.split('_')[0]]))
for label in labels
])
mach = log_turn['output']['transcript']
user = label_turn['transcription']
if not labels_ids.strip():
labels_ids = self.map_tag_dict['none']
out = "%s\t%s\1%s\t%s" % (session_id, mach, user, labels_ids)
user_asr = log_turn['input']['live']['asr-hyps'][0]['asr-hyp'].strip()
out_asr = "%s\t%s\1%s\t%s" % (session_id, mach, user_asr, labels_ids)
user_asr = log_turn['input']['live']['asr-hyps'][0][
'asr-hyp'].strip()
out_asr = "%s\t%s\1%s\t%s" % (session_id, mach, user_asr,
labels_ids)
fw.write(u"%s\n" % out.encode('utf8'))
fw_asr.write(u"%s\n" % out_asr.encode('utf8'))
......@@ -144,10 +156,7 @@ class DSTC2(object):
self.get_test_dataset()
self.get_labels()
if __name__ == "__main__":
dstc_inst = DSTC2()
dstc_inst.main()
......@@ -27,6 +27,7 @@ class MRDA(object):
"""
dialogue act dataset mrda data process
"""
def __init__(self):
"""
init instance
......@@ -67,7 +68,7 @@ class MRDA(object):
for dadb_key in dadb_list:
dadb_file = self.dadb_dict[dadb_key]
fr = io.open(dadb_file, 'r', encoding="utf8")
row = csv.reader(fr, delimiter = ',')
row = csv.reader(fr, delimiter=',')
for line in row:
elems = line
conv_id = elems[2]
......@@ -87,7 +88,7 @@ class MRDA(object):
for trans_key in trans_list:
trans_file = self.trans_dict[trans_key]
fr = io.open(trans_file, 'r', encoding="utf8")
row = csv.reader(fr, delimiter = ',')
row = csv.reader(fr, delimiter=',')
for line in row:
elems = line
if len(elems) != 3:
......@@ -120,7 +121,8 @@ class MRDA(object):
self.tag_id += 1
caller = elem.split('_')[0].split('-')[-1]
conv_no = elem.split('_')[0].split('-')[0]
out = "%s\t%s\t%s\t%s" % (conv_no, self.map_tag_dict[tag], caller, v_trans[0])
out = "%s\t%s\t%s\t%s" % (conv_no, self.map_tag_dict[tag], caller,
v_trans[0])
fw.write(u"%s\n" % out)
def get_train_dataset(self):
......@@ -158,10 +160,7 @@ class MRDA(object):
self.get_test_dataset()
self.get_labels()
if __name__ == "__main__":
mrda_inst = MRDA()
mrda_inst.main()
......@@ -27,6 +27,7 @@ class SWDA(object):
"""
dialogue act dataset swda data process
"""
def __init__(self):
"""
init instance
......@@ -63,7 +64,7 @@ class SWDA(object):
file_path = self.file_dict[name]
fr = io.open(file_path, 'r', encoding="utf8")
idx = 0
row = csv.reader(fr, delimiter = ',')
row = csv.reader(fr, delimiter=',')
for r in row:
if idx == 0:
idx += 1
......@@ -224,10 +225,7 @@ class SWDA(object):
self.get_test_dataset()
self.get_labels()
if __name__ == "__main__":
swda_inst = SWDA()
swda_inst.main()
......@@ -20,7 +20,6 @@ from build_dstc2_dataset import DSTC2
from build_mrda_dataset import MRDA
from build_swda_dataset import SWDA
if __name__ == "__main__":
task_name = sys.argv[1]
task_name = task_name.lower()
......@@ -38,11 +37,12 @@ if __name__ == "__main__":
elif task_name == 'atis':
atis_inst = ATIS()
atis_inst.main()
shutil.copyfile("../../data/input/data/atis/atis_slot/test.txt", "../../data/input/data/atis/atis_slot/dev.txt")
shutil.copyfile("../../data/input/data/atis/atis_intent/test.txt", "../../data/input/data/atis/atis_intent/dev.txt")
shutil.copyfile("../../data/input/data/atis/atis_slot/test.txt",
"../../data/input/data/atis/atis_slot/dev.txt")
shutil.copyfile("../../data/input/data/atis/atis_intent/test.txt",
"../../data/input/data/atis/atis_intent/dev.txt")
elif task_name == 'dstc2':
dstc_inst = DSTC2()
dstc_inst.main()
else:
exit(0)
......@@ -12,7 +12,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Tokenization classes."""
from __future__ import absolute_import
......
......@@ -113,7 +113,7 @@ def multi_head_attention(queries,
"""
Scaled Dot-Product Attention
"""
scaled_q = layers.scale(x=q, scale=d_key ** -0.5)
scaled_q = layers.scale(x=q, scale=d_key**-0.5)
product = layers.matmul(x=scaled_q, y=k, transpose_y=True)
if attn_bias:
product += attn_bias
......
......@@ -122,5 +122,3 @@ def save_param(args, exe, program, dirname):
print("save parameters at %s" % (os.path.join(param_dir, dirname)))
return True
......@@ -23,12 +23,7 @@ from dgu.bert import BertModel
from dgu.utils.configure import JsonConfig
def create_net(
is_training,
model_input,
num_labels,
paradigm_inst,
args):
def create_net(is_training, model_input, num_labels, paradigm_inst, args):
"""create dialogue task model"""
src_ids = model_input.src_ids
......@@ -48,14 +43,15 @@ def create_net(
config=bert_conf,
use_fp16=False)
params = {'num_labels': num_labels,
params = {
'num_labels': num_labels,
'src_ids': src_ids,
'pos_ids': pos_ids,
'sent_ids': sent_ids,
'input_mask': input_mask,
'labels': labels,
'is_training': is_training}
'is_training': is_training
}
results = paradigm_inst.paradigm(bert, params)
return results
......@@ -66,7 +66,9 @@ def do_save_inference_model(args):
sent_ids = fluid.data(
name='sent_ids', shape=[-1, args.max_seq_len], dtype='int64')
input_mask = fluid.data(
name='input_mask', shape=[-1, args.max_seq_len], dtype='float32')
name='input_mask',
shape=[-1, args.max_seq_len],
dtype='float32')
if args.task_name == 'atis_slot':
labels = fluid.data(
name='labels', shape=[-1, args.max_seq_len], dtype='int64')
......@@ -74,8 +76,7 @@ def do_save_inference_model(args):
labels = fluid.data(
name='labels', shape=[-1, num_labels], dtype='int64')
else:
labels = fluid.data(
name='labels', shape=[-1, 1], dtype='int64')
labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')
input_inst = [src_ids, pos_ids, sent_ids, input_mask, labels]
input_field = InputField(input_inst)
......@@ -107,14 +108,10 @@ def do_save_inference_model(args):
fluid.io.save_inference_model(
args.inference_model_dir,
feeded_var_names=[
input_field.src_ids.name,
input_field.pos_ids.name,
input_field.sent_ids.name,
input_field.input_mask.name
],
target_vars=[
probs
input_field.src_ids.name, input_field.pos_ids.name,
input_field.sent_ids.name, input_field.input_mask.name
],
target_vars=[probs],
executor=exe,
main_program=test_prog,
model_filename="model.pdmodel",
......
......@@ -26,7 +26,6 @@ from inference_model import do_save_inference_model
from dgu.utils.configure import PDConfig
if __name__ == "__main__":
args = PDConfig(yaml_file="./data/config/dgu.yaml")
......
......@@ -66,7 +66,9 @@ def do_train(args):
sent_ids = fluid.data(
name='sent_ids', shape=[-1, args.max_seq_len], dtype='int64')
input_mask = fluid.data(
name='input_mask', shape=[-1, args.max_seq_len], dtype='float32')
name='input_mask',
shape=[-1, args.max_seq_len],
dtype='float32')
if args.task_name == 'atis_slot':
labels = fluid.data(
name='labels', shape=[-1, args.max_seq_len], dtype='int64')
......@@ -74,13 +76,12 @@ def do_train(args):
labels = fluid.data(
name='labels', shape=[-1, num_labels], dtype='int64')
else:
labels = fluid.data(
name='labels', shape=[-1, 1], dtype='int64')
labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')
input_inst = [src_ids, pos_ids, sent_ids, input_mask, labels]
input_field = InputField(input_inst)
data_reader = fluid.io.PyReader(feed_list=input_inst,
capacity=4, iterable=False)
data_reader = fluid.io.PyReader(
feed_list=input_inst, capacity=4, iterable=False)
processor = processors[task_name](data_dir=args.data_dir,
vocab_path=args.vocab_path,
max_seq_len=args.max_seq_len,
......@@ -113,9 +114,7 @@ def do_train(args):
dev_count = int(os.environ.get('CPU_NUM', 1))
batch_generator = processor.data_generator(
batch_size=args.batch_size,
phase='train',
shuffle=True)
batch_size=args.batch_size, phase='train', shuffle=True)
num_train_examples = processor.get_num_examples(phase='train')
if args.in_tokens:
......@@ -217,37 +216,32 @@ def do_train(args):
current_time = time.strftime('%Y-%m-%d %H:%M:%S',
time.localtime(time.time()))
if accuracy is not None:
print(
"%s epoch: %d, step: %d, ave loss: %f, "
print("%s epoch: %d, step: %d, ave loss: %f, "
"ave acc: %f, speed: %f steps/s" %
(current_time, epoch_step, steps,
np.mean(np_loss),
np.mean(np_acc),
np.mean(np_loss), np.mean(np_acc),
args.print_steps / used_time))
ce_info.append([
np.mean(np_loss),
np.mean(np_acc),
np.mean(np_loss), np.mean(np_acc),
args.print_steps / used_time
])
else:
print(
"%s epoch: %d, step: %d, ave loss: %f, "
print("%s epoch: %d, step: %d, ave loss: %f, "
"speed: %f steps/s" %
(current_time, epoch_step, steps,
np.mean(np_loss),
args.print_steps / used_time))
ce_info.append([
np.mean(np_loss),
args.print_steps / used_time
])
np.mean(np_loss), args.print_steps / used_time))
ce_info.append(
[np.mean(np_loss), args.print_steps / used_time])
time_begin = time.time()
if steps % args.save_steps == 0:
save_path = "step_" + str(steps)
if args.save_checkpoint:
save_load_io.save_checkpoint(args, exe, train_prog, save_path)
save_load_io.save_checkpoint(args, exe, train_prog,
save_path)
if args.save_param:
save_load_io.save_param(args, exe, train_prog, save_path)
save_load_io.save_param(args, exe, train_prog,
save_path)
except fluid.core.EOFException:
data_reader.reset()
......
......@@ -19,8 +19,7 @@ from __future__ import print_function
import os
import sys
sys.path.append("../")
sys.path.append("../shared_modules/")
import paddle
import paddle.fluid as fluid
import numpy as np
......
......@@ -23,7 +23,7 @@ import os
import time
import multiprocessing
import sys
sys.path.append("../")
sys.path.append("../shared_modules/")
import paddle
import paddle.fluid as fluid
......
......@@ -24,7 +24,7 @@ import time
import argparse
import multiprocessing
import sys
sys.path.append("../")
sys.path.append("../shared_modules/")
import paddle
import paddle.fluid as fluid
......
......@@ -36,7 +36,7 @@ import sys
if sys.version[0] == '2':
reload(sys)
sys.setdefaultencoding("utf-8")
sys.path.append('../')
sys.path.append('../shared_modules/')
import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"
......
......@@ -26,7 +26,7 @@ from paddle.fluid.initializer import NormalInitializer
from reader import Dataset
from ernie_reader import SequenceLabelReader
sys.path.append("..")
sys.path.append("../shared_modules/")
from models.sequence_labeling import nets
from models.representation.ernie import ernie_encoder, ernie_pyreader
......@@ -35,7 +35,8 @@ def create_model(args, vocab_size, num_labels, mode='train'):
"""create lac model"""
# model's input data
words = fluid.data(name='words', shape=[None, 1], dtype='int64', lod_level=1)
words = fluid.data(
name='words', shape=[None, 1], dtype='int64', lod_level=1)
targets = fluid.data(
name='targets', shape=[None, 1], dtype='int64', lod_level=1)
......@@ -88,7 +89,8 @@ def create_pyreader(args,
return_reader=False,
mode='train'):
# init reader
device_count = len(fluid.cuda_places()) if args.use_cuda else len(fluid.cpu_places())
device_count = len(fluid.cuda_places()) if args.use_cuda else len(
fluid.cpu_places())
if model == 'lac':
pyreader = fluid.io.DataLoader.from_generator(
......@@ -107,14 +109,14 @@ def create_pyreader(args,
fluid.io.shuffle(
reader.file_reader(file_name),
buf_size=args.traindata_shuffle_buffer),
batch_size=args.batch_size/device_count),
batch_size=args.batch_size / device_count),
places=place)
else:
pyreader.set_sample_list_generator(
fluid.io.batch(
reader.file_reader(
file_name, mode=mode),
batch_size=args.batch_size/device_count),
batch_size=args.batch_size / device_count),
places=place)
elif model == 'ernie':
......
......@@ -20,7 +20,7 @@ import sys
from collections import namedtuple
import numpy as np
sys.path.append("..")
sys.path.append("../shared_modules/")
from preprocess.ernie.task_reader import BaseReader, tokenization
......
......@@ -24,7 +24,7 @@ import paddle
import utils
import reader
import creator
sys.path.append('../models/')
sys.path.append('../shared_modules/models/')
from model_check import check_cuda
from model_check import check_version
......
......@@ -10,7 +10,7 @@ import paddle.fluid as fluid
import creator
import reader
import utils
sys.path.append('../models/')
sys.path.append('../shared_modules/models/')
from model_check import check_cuda
from model_check import check_version
......
......@@ -24,7 +24,7 @@ import paddle
import utils
import reader
import creator
sys.path.append('../models/')
sys.path.append('../shared_modules/models/')
from model_check import check_cuda
from model_check import check_version
......
......@@ -34,7 +34,7 @@ import paddle.fluid as fluid
import creator
import utils
sys.path.append("..")
sys.path.append("../shared_modules/")
from models.representation.ernie import ErnieConfig
from models.model_check import check_cuda
from models.model_check import check_version
......@@ -187,8 +187,8 @@ def do_train(args):
end_time - start_time, train_pyreader.queue.size()))
if steps % args.save_steps == 0:
save_path = os.path.join(args.model_save_dir, "step_" + str(steps),
"checkpoint")
save_path = os.path.join(args.model_save_dir,
"step_" + str(steps), "checkpoint")
print("\tsaving model as %s" % (save_path))
fluid.save(train_program, save_path)
......@@ -199,6 +199,7 @@ def do_train(args):
"checkpoint")
fluid.save(train_program, save_path)
def do_eval(args):
# init executor
if args.use_cuda:
......
......@@ -29,7 +29,7 @@ import reader
import utils
import creator
from eval import test_process
sys.path.append('../models/')
sys.path.append('../shared_modules/models/')
from model_check import check_cuda
from model_check import check_version
......@@ -151,8 +151,7 @@ def do_train(args):
# save checkpoints
if step % args.save_steps == 0 and step != 0:
save_path = os.path.join(args.model_save_dir,
"step_" + str(step),
"checkpoint")
"step_" + str(step), "checkpoint")
fluid.save(train_program, save_path)
step += 1
......
......@@ -39,4 +39,3 @@ D-NET是一个以提升**阅读理解模型泛化能力**为目标的“预训
- 在微调阶段引入多任务、多领域的学习策略 (基于[PALM](https://github.com/PaddlePaddle/PALM)多任务学习框架),有效的提升了模型在不同领域的泛化能力
百度利用D-NET框架在EMNLP 2019 [MRQA](https://mrqa.github.io/shared)国际阅读理解评测中以超过第二名近两个百分点的成绩夺得冠军,同时,在全部12个测试数据集中的10个排名第一。
......@@ -12,6 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
def get_input_descs(args):
"""
Generate a dict mapping data fields to the corresponding data shapes and
......@@ -42,7 +43,8 @@ def get_input_descs(args):
# encoder.
# The actual data shape of src_slf_attn_bias is:
# [batch_size, n_head, max_src_len_in_batch, max_src_len_in_batch]
"src_slf_attn_bias": [(batch_size, n_head, seq_len, seq_len), "float32"],
"src_slf_attn_bias":
[(batch_size, n_head, seq_len, seq_len), "float32"],
# The actual data shape of trg_word is:
# [batch_size, max_trg_len_in_batch, 1]
"trg_word": [(batch_size, seq_len), "int64",
......@@ -54,12 +56,14 @@ def get_input_descs(args):
# subsequent words in the decoder.
# The actual data shape of trg_slf_attn_bias is:
# [batch_size, n_head, max_trg_len_in_batch, max_trg_len_in_batch]
"trg_slf_attn_bias": [(batch_size, n_head, seq_len, seq_len), "float32"],
"trg_slf_attn_bias":
[(batch_size, n_head, seq_len, seq_len), "float32"],
# This input is used to remove attention weights on paddings of the source
# input in the encoder-decoder attention.
# The actual data shape of trg_src_attn_bias is:
# [batch_size, n_head, max_trg_len_in_batch, max_src_len_in_batch]
"trg_src_attn_bias": [(batch_size, n_head, seq_len, seq_len), "float32"],
"trg_src_attn_bias":
[(batch_size, n_head, seq_len, seq_len), "float32"],
# This input is used in independent decoder program for inference.
# The actual data shape of enc_output is:
# [batch_size, max_src_len_in_batch, d_model]
......@@ -80,6 +84,7 @@ def get_input_descs(args):
return input_descs
# Names of word embedding table which might be reused for weight sharing.
word_emb_param_names = (
"src_word_emb_table",
......
......@@ -87,7 +87,8 @@ def do_save_inference_model(args):
# saving inference model
fluid.io.save_inference_model(args.inference_model_dir,
fluid.io.save_inference_model(
args.inference_model_dir,
feeded_var_names=list(input_field_names),
target_vars=[out_ids, out_scores],
executor=exe,
......
......@@ -25,7 +25,6 @@ from train import do_train
from predict import do_predict
from inference_model import do_save_inference_model
if __name__ == "__main__":
LOG_FORMAT = "[%(asctime)s %(levelname)s %(filename)s:%(lineno)d] %(message)s"
logging.basicConfig(
......
此差异已折叠。
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册