Unverified commit 803dab78, authored by pkpk, committed by GitHub

test=develop (#4389)

Parent 9e12ab90
Subproject commit 5426f75073cf5bd416622dbe71b146d3dc8fffb6
Subproject commit 30b892e3c029bff706337f269e6c158b0a223f60
......@@ -10,7 +10,7 @@
- **Rich and comprehensive NLP task support:**
- PaddleNLP offers multi-granularity, multi-scenario application support, covering NLP fundamentals such as [word segmentation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), [part-of-speech tagging](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis) and [named entity recognition](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), as well as core NLP technologies such as [text classification](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/sentiment_classification), [text similarity](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/similarity_net), [semantic representation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK) and [text generation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleTextGEN). PaddleNLP also provides the task-specific techniques, tool components, models and pretrained parameters behind common large-scale NLP application systems, such as [reading comprehension](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMRC), [dialogue systems](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleDialogue) and [machine translation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMT), so nothing stands in your way in NLP.
- PaddleNLP offers multi-granularity, multi-scenario application support, covering NLP fundamentals such as [word segmentation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), [part-of-speech tagging](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis) and [named entity recognition](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/lexical_analysis), as well as core NLP technologies such as [text classification](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/sentiment_classification), [text similarity](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/similarity_net), [semantic representation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/pretrain_langauge_models) and [text generation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/seq2seq). PaddleNLP also provides the task-specific techniques, tool components, models and pretrained parameters behind common large-scale NLP application systems, such as [reading comprehension](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/machine_reading_comprehension), [dialogue systems](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/dialogue_system) and [machine translation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/machine_translation), so nothing stands in your way in NLP.
- **Stable, reliable NLP models and powerful pretrained parameters:**
......@@ -55,11 +55,11 @@ cd models/PaddleNLP/sentiment_classification
| **Language model** | [Language_model](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/language_model) | A classic neural language model based on recurrent neural networks (RNN). |
| **Sentiment classification**:fire: | [Senta](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/sentiment_classification), [EmotionDetection](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/emotion_detection) | The Senta (Sentiment Classification) and EmotionDetection projects provide sentiment-analysis models for *general-purpose* and *human-machine dialogue* scenarios, respectively. |
| **Text similarity**:fire: | [SimNet](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/similarity_net) | SimNet (Similarity Net) offers efficient, reliable text-similarity tools and pretrained models. |
| **Semantic representation**:fire: | [PaddleLARK](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleLARK) | PaddleLARK (Paddle LAnguage Representation toolKit) bundles popular Chinese and English pretrained models such as ELMo, BERT, ERNIE 1.0, ERNIE 2.0 and XLNet. |
| **Text generation** | [PaddleTextGEN](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleTextGEN) | Paddle Text Generation provides a series of classic text-generation models, such as vanilla seq2seq, seq2seq with attention and variational seq2seq. |
| **Reading comprehension** | [PaddleMRC](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMRC) | PaddleMRC (Paddle Machine Reading Comprehension) collects Baidu's reading-comprehension models, tools and open datasets, including DuReader (Baidu's large-scale Chinese reading-comprehension dataset built from real search behavior), KT-Net (a knowledge-enhanced reading-comprehension model that once ranked first on SQuAD and ReCoRD) and D-Net (a pretrain-finetune framework that took first place in the EMNLP 2019 MRQA shared task). |
| **Dialogue systems** | [PaddleDialogue](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleDialogue) | Includes: 1) DGU (Dialogue General Understanding), which covers common dialogue tasks such as context-response matching for **retrieval-based chat systems** and **intent recognition**, **slot filling** and **dialogue state tracking** for **task-oriented dialogue systems**, achieving the best results on six public international datasets.<br/> 2) knowledge-driven dialogue: Baidu's open knowledge-grounded open-domain dialogue dataset, published at ACL 2019.<br/>3) ADEM (Auto Dialogue Evaluation Model): automatically scores the response quality of different dialogue-generation models. |
| **Machine translation** | [PaddleMT](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleMT) | Paddle Machine Translation: a classic Transformer-based machine-translation model. |
| **Semantic representation**:fire: | [pretrain_langauge_models](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/pretrain_langauge_models) | Bundles popular Chinese and English pretrained models such as ELMo, BERT, ERNIE 1.0, ERNIE 2.0 and XLNet. |
| **Text generation** | [seq2seq](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/PaddleTextGEN) | seq2seq provides a series of classic text-generation models, such as vanilla seq2seq, seq2seq with attention and variational seq2seq. |
| **Reading comprehension** | [machine_reading_comprehension](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/machine_reading_comprehension) | Paddle Machine Reading Comprehension collects Baidu's reading-comprehension models, tools and open datasets, including DuReader (Baidu's large-scale Chinese reading-comprehension dataset built from real search behavior), KT-Net (a knowledge-enhanced reading-comprehension model that once ranked first on SQuAD and ReCoRD) and D-Net (a pretrain-finetune framework that took first place in the EMNLP 2019 MRQA shared task). |
| **Dialogue systems** | [dialogue_system](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/dialogue_system) | Includes: 1) DGU (Dialogue General Understanding), which covers common dialogue tasks such as context-response matching for **retrieval-based chat systems** and **intent recognition**, **slot filling** and **dialogue state tracking** for **task-oriented dialogue systems**, achieving the best results on six public international datasets.<br/> 2) knowledge-driven dialogue: Baidu's open knowledge-grounded open-domain dialogue dataset, published at ACL 2019.<br/>3) ADEM (Auto Dialogue Evaluation Model): automatically scores the response quality of different dialogue-generation models. |
| **Machine translation** | [machine_translation](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/machine_translation) | A classic Transformer-based machine-translation model. |
| **Other research** | [Research](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/Research) | Baidu's latest research work, open-sourced. |
......@@ -70,13 +70,13 @@ cd models/PaddleNLP/sentiment_classification
```text
.
├── Research # collection of Baidu NLP research work
├── PaddleMT # machine translation: code, data, pretrained models
├── PaddleDialogue # dialogue systems: code, data, pretrained models
├── PaddleMRC # reading comprehension: code, data, pretrained models
├── PaddleLARK # language representation toolkit
├── machine_translation # machine translation: code, data, pretrained models
├── dialogue_system # dialogue systems: code, data, pretrained models
├── machine_reading_comprehension # reading comprehension: code, data, pretrained models
├── pretrain_langauge_models # language representation toolkit
├── language_model # language model
├── lexical_analysis # LAC lexical analysis
├── models # shared networks
├── shared_modules/models # shared networks
│ ├── __init__.py
│ ├── classification
│ ├── dialogue_model_toolkit
......@@ -87,7 +87,7 @@ cd models/PaddleNLP/sentiment_classification
│ ├── representation
│ ├── sequence_labeling
│ └── transformer_encoder.py
├── preprocess # shared text preprocessing tools
├── shared_modules/preprocess # shared text preprocessing tools
│ ├── __init__.py
│ ├── ernie
│ ├── padding.py
......
......@@ -16,7 +16,6 @@
# limitations under the License.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
......@@ -40,43 +39,55 @@ import math
np.random.seed(0)
random.seed(0)
parser = argparse.ArgumentParser(__doc__)
DEV_COUNT = 1
model_g = ArgumentGroup(parser, "model", "model configuration and paths.")
model_g.add_arg("init_checkpoint", str, None, "Init checkpoint to resume training from.")
model_g.add_arg("checkpoints", str, "./checkpoints", "Path to save checkpoints.")
model_g.add_arg("init_checkpoint", str, None,
"Init checkpoint to resume training from.")
model_g.add_arg("checkpoints", str, "./checkpoints",
"Path to save checkpoints.")
model_g.add_arg("config_path", str, "./data/input/model.conf", "Model conf.")
model_g.add_arg("build_dict", bool, False, "Build dict.")
train_g = ArgumentGroup(parser, "training", "training options.")
train_g.add_arg("cpu_num", int, 3, "Number of Threads.")
train_g.add_arg("epoch", int, 100, "Number of epochs for training.")
train_g.add_arg("learning_rate", float, 0.1, "Learning rate used to train with warmup.")
train_g.add_arg("save_steps", int, 1000, "The steps interval to save checkpoints.")
train_g.add_arg("validation_steps", int, 100, "The steps interval to evaluate model performance.")
train_g.add_arg("learning_rate", float, 0.1,
"Learning rate used to train with warmup.")
train_g.add_arg("save_steps", int, 1000,
"The steps interval to save checkpoints.")
train_g.add_arg("validation_steps", int, 100,
"The steps interval to evaluate model performance.")
train_g.add_arg("random_seed", int, 7, "random seed")
train_g.add_arg("threshold", float, 0.1, "When the confidence exceeds the threshold, the corresponding label is given.")
train_g.add_arg(
"threshold", float, 0.1,
"When the confidence exceeds the threshold, the corresponding label is given."
)
log_g = ArgumentGroup(parser, "logging", "logging related.")
log_g.add_arg("skip_steps", int, 10, "The steps interval to print loss.")
data_g = ArgumentGroup(parser, "data", "Data paths, vocab paths and data processing options")
data_g = ArgumentGroup(parser, "data",
"Data paths, vocab paths and data processing options")
data_g.add_arg("data_dir", str, "./data/input/", "Path to training data.")
data_g.add_arg("save_dir", str, "./data/output/", "Path to save.")
data_g.add_arg("max_seq_len", int, 50, "Tokens' number of the longest sequence allowed.")
data_g.add_arg("batch_size", int, 64, "The total number of examples in one batch for training.")
data_g.add_arg("max_seq_len", int, 50,
"Tokens' number of the longest sequence allowed.")
data_g.add_arg("batch_size", int, 64,
"The total number of examples in one batch for training.")
run_type_g = ArgumentGroup(parser, "run_type", "running type options.")
run_type_g.add_arg("use_cuda", bool, False, "If set, use GPU for training.")
# run_type_g.add_arg("use_fast_executor", bool, False, "If set, use fast parallel executor (in experiment).")
run_type_g.add_arg("do_train", bool, True, "Whether to perform training.")
run_type_g.add_arg("do_eval", bool, True, "Whether to perform evaluation on dev data set.")
run_type_g.add_arg("do_test", bool, True, "Whether to perform evaluation on test data set.")
run_type_g.add_arg("do_train", bool, True,
"Whether to perform training.")
run_type_g.add_arg("do_eval", bool, True,
"Whether to perform evaluation on dev data set.")
run_type_g.add_arg("do_test", bool, True,
"Whether to perform evaluation on test data set.")
args = parser.parse_args()
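The `threshold` flag above drives multi-label assignment: every intent whose predicted confidence exceeds the threshold is given to the example. A minimal paddle-free sketch of that rule (the `assign_labels` helper and the sample probabilities are hypothetical, not part of this codebase):

```python
def assign_labels(probs, threshold=0.1):
    """Return the indices of all labels whose confidence exceeds the threshold."""
    return [i for i, p in enumerate(probs) if p > threshold]

# Example: a 5-way multi-label prediction; indices 1, 3 and 4 clear the 0.1 bar.
probs = [0.05, 0.42, 0.09, 0.73, 0.11]
print(assign_labels(probs))  # -> [1, 3, 4]
```

Raising the threshold trades recall for precision, which is why it is exposed as a tunable training option.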
def get_score(pred_result, label, eval_phase):
"""[get precision recall and f-score]
......@@ -139,7 +150,7 @@ def train(args, train_exe, build_res, place):
pred_label = build_res["pred_label"]
label = build_res["label"]
fetch_list = [cost.name, prediction.name, pred_label.name, label.name]
train_pyreader = build_res["train_pyreader"]
train_data_loader = build_res["train_data_loader"]
train_prog = build_res["train_prog"]
steps = 0
time_begin = time.time()
......@@ -147,22 +158,24 @@ def train(args, train_exe, build_res, place):
logger.info("Begin training")
for i in range(args.epoch):
try:
for data in train_pyreader():
for data in train_data_loader():
avg_cost_np, avg_pred_np, pred_label, label = train_exe.run(feed=data, program=compiled_prog, \
fetch_list=fetch_list)
steps += 1
if steps % int(args.skip_steps) == 0:
time_end = time.time()
used_time = time_end - time_begin
get_score(pred_label, label, eval_phase = "Train")
get_score(pred_label, label, eval_phase="Train")
logger.info('loss is {}'.format(avg_cost_np))
logger.info("epoch: %d, step: %d, speed: %f steps/s" % (i, steps, args.skip_steps / used_time))
logger.info("epoch: %d, step: %d, speed: %f steps/s" %
(i, steps, args.skip_steps / used_time))
time_begin = time.time()
if steps % args.save_steps == 0:
save_path = os.path.join(args.checkpoints,
"step_" + str(steps))
fluid.io.save_persistables(train_exe, save_path, train_prog)
logger.info("[save]step %d : save at %s" % (steps, save_path))
fluid.io.save(train_prog, save_path)
logger.info("[save]step %d : save at %s" %
(steps, save_path))
if steps % args.validation_steps == 0:
if args.do_eval:
evaluate(args, test_exe, build_res, "eval")
......@@ -173,11 +186,16 @@ def train(args, train_exe, build_res, place):
logger.error("Train error : %s" % str(e))
exit(1)
save_path = os.path.join(args.checkpoints, "step_" + str(steps))
fluid.io.save_persistables(train_exe, save_path, train_prog)
fluid.io.save(train_prog, save_path)
logger.info("[save]step %d : save at %s" % (steps, save_path))
def evaluate(args, test_exe, build_res, eval_phase, save_result=False, id2intent=None):
def evaluate(args,
test_exe,
build_res,
eval_phase,
save_result=False,
id2intent=None):
"""[evaluate on dev/test dataset]
Arguments:
......@@ -203,14 +221,14 @@ def evaluate(args, test_exe, build_res, eval_phase, save_result=False, id2intent
total_cost, total_acc, pred_prob_list, pred_label_list, label_list = [], [], [], [], []
if eval_phase == "eval":
test_prog = build_res["eval_compiled_prog"]
test_pyreader = build_res["eval_pyreader"]
test_data_loader = build_res["eval_data_loader"]
elif eval_phase == "test":
test_prog = build_res["test_compiled_prog"]
test_pyreader = build_res["test_pyreader"]
test_data_loader = build_res["test_data_loader"]
else:
exit(1)
logger.info("-----------------------------------------------------------")
for data in test_pyreader():
for data in test_data_loader():
avg_cost_np, avg_pred_np, pred_label, label= test_exe.run(program=test_prog, fetch_list=fetch_list, feed=data, \
return_numpy=True)
total_cost.append(avg_cost_np)
......@@ -219,13 +237,18 @@ def evaluate(args, test_exe, build_res, eval_phase, save_result=False, id2intent
label_list.extend(label)
if save_result:
logger.info("save result at : %s" % args.save_dir + "/" + eval_phase + ".rst")
logger.info("save result at : %s" % args.save_dir + "/" + eval_phase +
".rst")
save_dir = args.save_dir
if not os.path.exists(save_dir):
logger.warning("save dir does not exist, creating it")
os.makedirs(save_dir)
fin = codecs.open(os.path.join(args.data_dir, eval_phase + ".txt"), "r", encoding="utf8")
fout = codecs.open(args.save_dir + "/" + eval_phase + ".rst", "w", encoding="utf8")
fin = codecs.open(
os.path.join(args.data_dir, eval_phase + ".txt"),
"r",
encoding="utf8")
fout = codecs.open(
args.save_dir + "/" + eval_phase + ".rst", "w", encoding="utf8")
for line in pred_prob_list:
query = fin.readline().rsplit("\t", 1)[0]
res = []
......@@ -245,9 +268,14 @@ def evaluate(args, test_exe, build_res, eval_phase, save_result=False, id2intent
logger.info("-----------------------------------------------------------")
def create_net(args, flow_data, class_dim, dict_dim, place, model_name="textcnn_net", is_infer=False):
"""[create network and pyreader]
def create_net(args,
flow_data,
class_dim,
dict_dim,
place,
model_name="textcnn_net",
is_infer=False):
"""[create network and loader]
Arguments:
flow_data {[type]} -- [description]
......@@ -266,11 +294,23 @@ def create_net(args, flow_data, class_dim, dict_dim, place, model_name="textcnn_
model = textcnn_net_multi_label
else:
return
char_list = fluid.data(name="char", shape=[None, args.max_seq_len, 1], dtype="int64", lod_level=0)
label = fluid.data(name="label", shape=[None, class_dim], dtype="float32", lod_level=0) # label data
reader = fluid.io.PyReader(feed_list=[char_list, label], capacity=args.batch_size * 10, iterable=True, \
char_list = fluid.data(
name="char",
shape=[None, args.max_seq_len, 1],
dtype="int64",
lod_level=0)
label = fluid.data(
name="label", shape=[None, class_dim], dtype="float32",
lod_level=0) # label data
data_loader = fluid.io.DataLoader.from_generator(
feed_list=[char_list, label],
capacity=args.batch_size * 10,
iterable=True,
return_list=False)
output = model(char_list, label, dict_dim,
output = model(
char_list,
label,
dict_dim,
emb_dim=flow_data["model"]["emb_dim"],
hid_dim=flow_data["model"]["hid_dim"],
hid_dim2=flow_data["model"]["hid_dim2"],
......@@ -281,14 +321,15 @@ def create_net(args, flow_data, class_dim, dict_dim, place, model_name="textcnn_
max_seq_len=args.max_seq_len)
if is_infer:
prediction = output
return [reader, prediction]
return [data_loader, prediction]
else:
avg_cost, prediction, pred_label, label = output[0], output[1], output[2], output[3]
return [reader, avg_cost, prediction, pred_label, label]
avg_cost, prediction, pred_label, label = output[0], output[1], output[
2], output[3]
return [data_loader, avg_cost, prediction, pred_label, label]
def build_data_reader(args, char_dict, intent_dict):
"""[decorate samples for pyreader]
def build_data_loader(args, char_dict, intent_dict):
"""[decorate samples for dataloader]
Arguments:
args {[type]} -- [description]
......@@ -298,20 +339,22 @@ def build_data_reader(args, char_dict, intent_dict):
Returns:
[type] -- [description]
"""
reader_res = {}
loader_res = {}
if args.do_train:
train_processor = DataReader(char_dict, intent_dict, args.max_seq_len)
train_data_generator = train_processor.prepare_data(
data_path=args.data_dir + "train.txt",
batch_size=args.batch_size,
mode='train')
reader_res["train_data_generator"] = train_data_generator
loader_res["train_data_generator"] = train_data_generator
num_train_examples = train_processor._get_num_examples()
logger.info("Num train examples: %d" % num_train_examples)
logger.info("Num train steps: %d" % (math.ceil(num_train_examples * 1.0 / args.batch_size) * \
args.epoch // DEV_COUNT))
if math.ceil(num_train_examples * 1.0 / args.batch_size) // DEV_COUNT <= 0:
logger.error("Num of train steps is less than 0 or equals to 0, exit")
if math.ceil(num_train_examples * 1.0 /
args.batch_size) // DEV_COUNT <= 0:
logger.error(
"Num of train steps is less than 0 or equals to 0, exit")
exit(1)
if args.do_eval:
eval_processor = DataReader(char_dict, intent_dict, args.max_seq_len)
......@@ -319,7 +362,7 @@ def build_data_reader(args, char_dict, intent_dict):
data_path=args.data_dir + "eval.txt",
batch_size=args.batch_size,
mode='eval')
reader_res["eval_data_generator"] = eval_data_generator
loader_res["eval_data_generator"] = eval_data_generator
num_eval_examples = eval_processor._get_num_examples()
logger.info("Num eval examples: %d" % num_eval_examples)
if args.do_test:
......@@ -328,11 +371,12 @@ def build_data_reader(args, char_dict, intent_dict):
data_path=args.data_dir + "test.txt",
batch_size=args.batch_size,
mode='test')
reader_res["test_data_generator"] = test_data_generator
return reader_res
loader_res["test_data_generator"] = test_data_generator
return loader_res
def build_graph(args, model_config, num_labels, dict_dim, place, test_place, reader_res):
def build_graph(args, model_config, num_labels, dict_dim, place, test_place,
loader_res):
"""[build paddle graph]
Arguments:
......@@ -341,7 +385,7 @@ def build_graph(args, model_config, num_labels, dict_dim, place, test_place, rea
num_labels {[type]} -- [description]
dict_dim {[type]} -- [description]
place {[type]} -- [description]
reader_res {[type]} -- [description]
loader_res {[type]} -- [description]
Returns:
[type] -- [description]
......@@ -358,36 +402,42 @@ def build_graph(args, model_config, num_labels, dict_dim, place, test_place, rea
if args.do_train:
with fluid.program_guard(train_prog, startup_prog):
with fluid.unique_name.guard():
train_pyreader, cost, prediction, pred_label, label = create_net(args, model_config, num_labels, \
train_data_loader, cost, prediction, pred_label, label = create_net(args, model_config, num_labels, \
dict_dim, place, model_name="textcnn_net")
train_pyreader.decorate_sample_list_generator(reader_res['train_data_generator'], places=place)
res["train_pyreader"] = train_pyreader
sgd_optimizer = fluid.optimizer.SGD(learning_rate=fluid.layers.exponential_decay(
learning_rate=args.learning_rate, decay_steps=1000, decay_rate=0.5, staircase=True))
train_data_loader.set_sample_list_generator(
loader_res['train_data_generator'], places=place)
res["train_data_loader"] = train_data_loader
sgd_optimizer = fluid.optimizer.SGD(
learning_rate=fluid.layers.exponential_decay(
learning_rate=args.learning_rate,
decay_steps=1000,
decay_rate=0.5,
staircase=True))
sgd_optimizer.minimize(cost)
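The optimizer above uses staircase exponential decay: the learning rate is multiplied by `decay_rate` once every `decay_steps` steps. The schedule reduces to a one-line formula; this is a sketch of the math, not Paddle's implementation:

```python
def staircase_decay(base_lr, step, decay_steps=1000, decay_rate=0.5):
    """Staircase exponential decay: halve the rate every decay_steps steps."""
    return base_lr * decay_rate ** (step // decay_steps)

# With the defaults configured above (base lr 0.1):
print(staircase_decay(0.1, 999))   # still 0.1
print(staircase_decay(0.1, 1000))  # drops to 0.05
print(staircase_decay(0.1, 2500))  # 0.1 * 0.5**2 = 0.025
```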
if args.do_eval:
with fluid.program_guard(eval_prog, startup_prog):
with fluid.unique_name.guard():
eval_pyreader, cost, prediction, pred_label, label = create_net(args, model_config, num_labels, \
eval_data_loader, cost, prediction, pred_label, label = create_net(args, model_config, num_labels, \
dict_dim, test_place, model_name="textcnn_net")
eval_pyreader.decorate_sample_list_generator(reader_res['eval_data_generator'], places=test_place)
res["eval_pyreader"] = eval_pyreader
eval_data_loader.set_sample_list_generator(
loader_res['eval_data_generator'], places=test_place)
res["eval_data_loader"] = eval_data_loader
if args.do_test:
with fluid.program_guard(test_prog, startup_prog):
with fluid.unique_name.guard():
test_pyreader, cost, prediction, pred_label, label = create_net(args, model_config, num_labels, \
test_data_loader, cost, prediction, pred_label, label = create_net(args, model_config, num_labels, \
dict_dim, test_place, model_name="textcnn_net")
test_pyreader.decorate_sample_list_generator(reader_res['test_data_generator'], places=test_place)
res["test_pyreader"] = test_pyreader
test_data_loader.set_sample_list_generator(
loader_res['test_data_generator'], places=test_place)
res["test_data_loader"] = test_data_loader
res["cost"] = cost
res["prediction"] = prediction
res["label"] = label
res["pred_label"] = pred_label
res["train_prog"] =train_prog
res["train_prog"] = train_prog
res["eval_prog"] = eval_prog
res["test_prog"] = test_prog
return res
......@@ -421,8 +471,9 @@ def main(args):
id2intent[int(value)] = key
num_labels = len(intent_dict)
# build model
reader_res = build_data_reader(args, char_dict, intent_dict)
build_res = build_graph(args, model_config, num_labels, dict_dim, place, test_place, reader_res)
loader_res = build_data_loader(args, char_dict, intent_dict)
build_res = build_graph(args, model_config, num_labels, dict_dim, place,
test_place, loader_res)
build_res["place"] = place
build_res["test_place"] = test_place
if not (args.do_train or args.do_eval or args.do_test):
......@@ -432,11 +483,13 @@ def main(args):
exe.run(startup_prog)
if args.init_checkpoint and args.init_checkpoint != "None":
try:
init_checkpoint(exe, args.init_checkpoint, main_program=startup_prog)
init_checkpoint(
exe, args.init_checkpoint, main_program=startup_prog)
logger.info("Load model from %s" % args.init_checkpoint)
except Exception as e:
logger.exception(str(e))
logger.error("Failed to load model from %s [%s]" % (args.init_checkpoint, str(e)))
logger.error("Failed to load model from %s [%s]" %
(args.init_checkpoint, str(e)))
build_strategy = fluid.compiler.BuildStrategy()
build_strategy.fuse_all_reduce_ops = False
exec_strategy = fluid.ExecutionStrategy()
......@@ -449,10 +502,12 @@ def main(args):
exec_strategy=exec_strategy)
build_res["compiled_prog"] = compiled_prog
if args.do_test:
test_compiled_prog = fluid.compiler.CompiledProgram(build_res["test_prog"])
test_compiled_prog = fluid.compiler.CompiledProgram(build_res[
"test_prog"])
build_res["test_compiled_prog"] = test_compiled_prog
if args.do_eval:
eval_compiled_prog = fluid.compiler.CompiledProgram(build_res["eval_prog"])
eval_compiled_prog = fluid.compiler.CompiledProgram(build_res[
"eval_prog"])
build_res["eval_compiled_prog"] = eval_compiled_prog
if args.do_train:
......@@ -465,7 +520,6 @@ def main(args):
save_result=True, id2intent=id2intent)
if __name__ == "__main__":
logger.info("the paddle version is %s" % paddle.__version__)
check_version('1.6.0')
......
......@@ -32,7 +32,6 @@ try:
except ImportError:
import ConfigParser as cp
random_seed = 7
logger = logging.getLogger()
format = "%(asctime)s - %(name)s - %(levelname)s -%(filename)s-%(lineno)4d -%(message)s"
......@@ -77,6 +76,7 @@ class ArgumentGroup(object):
Arguments:
object {[type]} -- [description]
"""
def __init__(self, parser, title, des):
self._group = parser.add_argument_group(title=title, description=des)
......@@ -107,6 +107,7 @@ class DataReader(object):
Returns:
[type] -- [description]
"""
def __init__(self, char_vocab, intent_dict, max_len):
self._char_vocab = char_vocab
self._intent_dict = intent_dict
......@@ -128,12 +129,17 @@ class DataReader(object):
# word_dict_path), "The given word dictionary does not exist."
assert os.path.exists(data_path), "The given data file does not exist."
if mode == "train":
train_reader = fluid.io.batch(paddle.reader.shuffle(self.data_reader(data_path, self.max_len, shuffle=True),
buf_size=batch_size * 100), batch_size)
train_reader = fluid.io.batch(
paddle.reader.shuffle(
self.data_reader(
data_path, self.max_len, shuffle=True),
buf_size=batch_size * 100),
batch_size)
# train_reader = fluid.io.batch(self.data_reader(data_path), batch_size)
return train_reader
else:
test_reader = fluid.io.batch(self.data_reader(data_path, self.max_len), batch_size)
test_reader = fluid.io.batch(
self.data_reader(data_path, self.max_len), batch_size)
return test_reader
def data_reader(self, file_path, max_len, shuffle=False):
......@@ -150,7 +156,8 @@ class DataReader(object):
char_id_list = list(map(lambda x: 0 if x not in self._char_vocab else int(self._char_vocab[x]), \
list(query)))
if len(char_id_list) < max_len:
char_id_list.extend([self.padding_id] * (max_len - len(char_id_list)))
char_id_list.extend([self.padding_id] *
(max_len - len(char_id_list)))
char_id_list = char_id_list[:max_len]
intent_id_list = [self.padding_id] * self.intent_size
for item in intent.split('\2'):
......@@ -159,6 +166,7 @@ class DataReader(object):
if shuffle:
random.seed(random_seed)
random.shuffle(self.all_data)
def reader():
"""
reader
......@@ -166,6 +174,7 @@ class DataReader(object):
for char_id_list, intent_id_list in self.all_data:
# print char_id_list, intent_id
yield char_id_list, intent_id_list
return reader
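`data_reader` above right-pads every character-id list with the padding id up to `max_len`, then clips it. That pad-and-truncate step in isolation (the function name is mine, and padding id 0 matches the `PAD\0020` entry written by the dict builder):

```python
def pad_and_truncate(char_ids, max_len, padding_id=0):
    """Right-pad with padding_id up to max_len, then clip to max_len."""
    if len(char_ids) < max_len:
        char_ids = char_ids + [padding_id] * (max_len - len(char_ids))
    return char_ids[:max_len]

print(pad_and_truncate([7, 8, 9], 5))           # -> [7, 8, 9, 0, 0]
print(pad_and_truncate([1, 2, 3, 4, 5, 6], 5))  # -> [1, 2, 3, 4, 5]
```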
......@@ -178,6 +187,7 @@ class DataProcesser(object):
Returns:
[type] -- [description]
"""
@staticmethod
def read_dict(filename):
"""
......@@ -227,7 +237,8 @@ class DataProcesser(object):
intent_dict[intent] = 0
intent_dict[intent] += 1
# save char dict
with codecs.open("%s/char.dict" % save_dir, "w", encoding="utf8") as f_out:
with codecs.open(
"%s/char.dict" % save_dir, "w", encoding="utf8") as f_out:
f_out.write("PAD\0020\n")
f_out.write("OOV\0021\n")
char_id = 2
......@@ -238,7 +249,8 @@ class DataProcesser(object):
f_out.write("%s\002%d\n" % (key, char_id))
char_id += 1
# save intent dict
with codecs.open("%s/domain.dict" % save_dir, "w", encoding="utf8") as f_out:
with codecs.open(
"%s/domain.dict" % save_dir, "w", encoding="utf8") as f_out:
f_out.write("SYS_OTHER\0020\n")
intent_id = 1
for key, value in intent_dict.items():
......@@ -249,7 +261,6 @@ class DataProcesser(object):
intent_id += 1
class ConfigReader(object):
"""[read model config file]
......@@ -282,49 +293,13 @@ class ConfigReader(object):
return flow_data
def init_pretraining_params(exe,
pretraining_params_path,
main_program,
use_fp16=False):
"""load params of pretrained model, NOT including moment, learning_rate"""
assert os.path.exists(pretraining_params_path
), "[%s] can't be found." % pretraining_params_path
def _existed_params(var):
if not isinstance(var, fluid.framework.Parameter):
return False
return os.path.exists(os.path.join(pretraining_params_path, var.name))
fluid.io.load_vars(
exe,
pretraining_params_path,
main_program=main_program,
predicate=_existed_params)
print("Load pretraining parameters from {}.".format(
pretraining_params_path))
def init_checkpoint(exe, init_checkpoint_path, main_program):
"""
Init CheckPoint
"""
assert os.path.exists(
init_checkpoint_path), "[%s] can't be found." % init_checkpoint_path
def existed_persitables(var):
"""
Whether the persistable variable exists on disk
"""
if not fluid.io.is_persistable(var):
return False
return os.path.exists(os.path.join(init_checkpoint_path, var.name))
fluid.load(main_program, init_checkpoint_path, exe)
print("Load model from {}".format(init_checkpoint_path))
fluid.io.load_vars(
exe,
init_checkpoint_path,
main_program=main_program,
predicate=existed_persitables)
print ("Load model from {}".format(init_checkpoint_path))
def print_arguments(args):
"""
......@@ -350,5 +325,3 @@ def check_version(version='1.6.0'):
except Exception as e:
logger.error(err)
sys.exit(1)
......@@ -21,8 +21,10 @@ from kpi import DurationKpi
train_loss_card1 = CostKpi('train_loss_card1', 0.03, 0, actived=True)
train_loss_card4 = CostKpi('train_loss_card4', 0.03, 0, actived=True)
train_duration_card1 = DurationKpi('train_duration_card1', 0.01, 0, actived=True)
train_duration_card4 = DurationKpi('train_duration_card4', 0.01, 0, actived=True)
train_duration_card1 = DurationKpi(
'train_duration_card1', 0.01, 0, actived=True)
train_duration_card4 = DurationKpi(
'train_duration_card4', 0.01, 0, actived=True)
tracking_kpis = [
train_loss_card1,
......
......@@ -20,22 +20,25 @@ import sys
import io
import os
URLLIB=urllib
URLLIB = urllib
if sys.version_info >= (3, 0):
import urllib.request
URLLIB=urllib.request
URLLIB = urllib.request
DATA_MODEL_PATH = {"DATA_PATH": "https://baidu-nlp.bj.bcebos.com/auto_dialogue_evaluation_dataset-1.0.0.tar.gz",
"TRAINED_MODEL": "https://baidu-nlp.bj.bcebos.com/auto_dialogue_evaluation_models.2.0.0.tar.gz"}
DATA_MODEL_PATH = {
"DATA_PATH":
"https://baidu-nlp.bj.bcebos.com/auto_dialogue_evaluation_dataset-1.0.0.tar.gz",
"TRAINED_MODEL":
"https://baidu-nlp.bj.bcebos.com/auto_dialogue_evaluation_models.2.0.0.tar.gz"
}
PATH_MAP = {'DATA_PATH': "./data/input",
'TRAINED_MODEL': './data/saved_models'}
PATH_MAP = {'DATA_PATH': "./data/input", 'TRAINED_MODEL': './data/saved_models'}
def un_tar(tar_name, dir_name):
try:
t = tarfile.open(tar_name)
t.extractall(path = dir_name)
t.extractall(path=dir_name)
return True
except Exception as e:
print(e)
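`un_tar` above is a thin wrapper around `tarfile.extractall`. A self-contained round trip under the same logic, packing and unpacking one file in a temporary directory (paths and file names are made up for the demo):

```python
import os
import tarfile
import tempfile

def un_tar(tar_name, dir_name):
    """Extract tar_name into dir_name; return True on success (mirrors the helper above)."""
    try:
        t = tarfile.open(tar_name)
        t.extractall(path=dir_name)
        return True
    except Exception as e:
        print(e)
        return False

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "hello.txt")
    with open(src, "w") as f:
        f.write("hi")
    tar_path = os.path.join(tmp, "demo.tar.gz")
    with tarfile.open(tar_path, "w:gz") as t:
        t.add(src, arcname="hello.txt")
    out_dir = os.path.join(tmp, "out")
    print(un_tar(tar_path, out_dir))                           # -> True
    print(os.path.exists(os.path.join(out_dir, "hello.txt")))  # -> True
```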
......@@ -51,7 +54,8 @@ def download_model_and_data():
shutil.rmtree(path)
for path_key in DATA_MODEL_PATH:
filename = os.path.basename(DATA_MODEL_PATH[path_key])
URLLIB.urlretrieve(DATA_MODEL_PATH[path_key], os.path.join("./", filename))
URLLIB.urlretrieve(DATA_MODEL_PATH[path_key],
os.path.join("./", filename))
state = un_tar(filename, PATH_MAP[path_key])
if not state:
print("Tar %s error....." % path_key)
......
......@@ -122,5 +122,3 @@ def save_param(args, exe, program, dirname):
print("save parameters at %s" % (os.path.join(param_dir, dirname)))
return True
......@@ -21,8 +21,7 @@ import paddle
import paddle.fluid as fluid
def create_net(
is_training,
def create_net(is_training,
model_input,
args,
clip_value=10.0,
......@@ -52,14 +51,12 @@ def create_net(
initializer=fluid.initializer.Normal(scale=0.1)))
# fc to fit dynamic LSTM
context_fc = fluid.layers.fc(
input=context_emb,
context_fc = fluid.layers.fc(input=context_emb,
size=args.hidden_size * 4,
param_attr=fluid.ParamAttr(name='fc_weight'),
bias_attr=fluid.ParamAttr(name='fc_bias'))
response_fc = fluid.layers.fc(
input=response_emb,
response_fc = fluid.layers.fc(input=response_emb,
size=args.hidden_size * 4,
param_attr=fluid.ParamAttr(name='fc_weight'),
bias_attr=fluid.ParamAttr(name='fc_bias'))
......@@ -106,7 +103,5 @@ def set_word_embedding(word_emb, place, word_emb_name="shared_word_emb"):
"""
Set word embedding
"""
word_emb_param = fluid.global_scope().find_var(
word_emb_name).get_tensor()
word_emb_param = fluid.global_scope().find_var(word_emb_name).get_tensor()
word_emb_param.set(word_emb, place)
......@@ -42,22 +42,24 @@ def do_save_inference_model(args):
with fluid.unique_name.guard():
context_wordseq = fluid.data(
name='context_wordseq', shape=[-1, 1], dtype='int64', lod_level=1)
name='context_wordseq',
shape=[-1, 1],
dtype='int64',
lod_level=1)
response_wordseq = fluid.data(
name='response_wordseq', shape=[-1, 1], dtype='int64', lod_level=1)
labels = fluid.data(
name='labels', shape=[-1, 1], dtype='int64')
name='response_wordseq',
shape=[-1, 1],
dtype='int64',
lod_level=1)
labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')
input_inst = [context_wordseq, response_wordseq, labels]
input_field = InputField(input_inst)
data_reader = fluid.io.PyReader(feed_list=input_inst,
capacity=4, iterable=False)
data_reader = fluid.io.PyReader(
feed_list=input_inst, capacity=4, iterable=False)
logits = create_net(
is_training=False,
model_input=input_field,
args=args
)
is_training=False, model_input=input_field, args=args)
if args.use_cuda:
place = fluid.CUDAPlace(0)
......@@ -81,9 +83,7 @@ def do_save_inference_model(args):
input_field.context_wordseq.name,
input_field.response_wordseq.name,
],
target_vars=[
logits,
],
target_vars=[logits, ],
executor=exe,
main_program=test_prog,
model_filename="model.pdmodel",
......
......@@ -26,7 +26,6 @@ from inference_model import do_save_inference_model
from ade.utils.configure import PDConfig
if __name__ == "__main__":
args = PDConfig(yaml_file="./data/config/ade.yaml")
......
......@@ -46,22 +46,24 @@ def do_predict(args):
with fluid.unique_name.guard():
context_wordseq = fluid.data(
name='context_wordseq', shape=[-1, 1], dtype='int64', lod_level=1)
name='context_wordseq',
shape=[-1, 1],
dtype='int64',
lod_level=1)
response_wordseq = fluid.data(
name='response_wordseq', shape=[-1, 1], dtype='int64', lod_level=1)
labels = fluid.data(
name='labels', shape=[-1, 1], dtype='int64')
name='response_wordseq',
shape=[-1, 1],
dtype='int64',
lod_level=1)
labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')
input_inst = [context_wordseq, response_wordseq, labels]
input_field = InputField(input_inst)
data_reader = fluid.io.PyReader(feed_list=input_inst,
capacity=4, iterable=False)
data_reader = fluid.io.PyReader(
feed_list=input_inst, capacity=4, iterable=False)
logits = create_net(
is_training=False,
model_input=input_field,
args=args
)
is_training=False, model_input=input_field, args=args)
logits.persistable = True
fetch_list = [logits.name]
......@@ -89,10 +91,7 @@ def do_predict(args):
batch_size=args.batch_size)
batch_generator = processor.data_generator(
place=place,
phase="test",
shuffle=False,
sample_pro=1)
place=place, phase="test", shuffle=False, sample_pro=1)
num_test_examples = processor.get_num_examples(phase='test')
data_reader.decorate_batch_generator(batch_generator)
......@@ -107,7 +106,7 @@ def do_predict(args):
data_reader.reset()
break
scores = scores[: num_test_examples]
scores = scores[:num_test_examples]
print("Write the predicted results into the output_prediction_file")
fw = io.open(args.output_prediction_file, 'w', encoding="utf8")
for index, score in enumerate(scores):
......
......@@ -49,22 +49,24 @@ def do_train(args):
with fluid.unique_name.guard():
context_wordseq = fluid.data(
name='context_wordseq', shape=[-1, 1], dtype='int64', lod_level=1)
name='context_wordseq',
shape=[-1, 1],
dtype='int64',
lod_level=1)
response_wordseq = fluid.data(
name='response_wordseq', shape=[-1, 1], dtype='int64', lod_level=1)
labels = fluid.data(
name='labels', shape=[-1, 1], dtype='int64')
name='response_wordseq',
shape=[-1, 1],
dtype='int64',
lod_level=1)
labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')
input_inst = [context_wordseq, response_wordseq, labels]
input_field = InputField(input_inst)
data_reader = fluid.io.PyReader(feed_list=input_inst,
capacity=4, iterable=False)
data_reader = fluid.io.PyReader(
feed_list=input_inst, capacity=4, iterable=False)
loss = create_net(
is_training=True,
model_input=input_field,
args=args
)
is_training=True, model_input=input_field, args=args)
loss.persistable = True
# gradient clipping
fluid.clip.set_gradient_clip(clip=fluid.clip.GradientClipByValue(
......@@ -74,7 +76,8 @@ def do_train(args):
if args.use_cuda:
dev_count = fluid.core.get_cuda_device_count()
place = fluid.CUDAPlace(int(os.getenv('FLAGS_selected_gpus', '0')))
place = fluid.CUDAPlace(
int(os.getenv('FLAGS_selected_gpus', '0')))
else:
dev_count = int(os.environ.get('CPU_NUM', 1))
place = fluid.CPUPlace()
......@@ -114,9 +117,14 @@ def do_train(args):
if args.word_emb_init:
print("start loading word embedding init ...")
if six.PY2:
word_emb = np.array(pickle.load(io.open(args.word_emb_init, 'rb'))).astype('float32')
word_emb = np.array(
pickle.load(io.open(args.word_emb_init, 'rb'))).astype(
'float32')
else:
word_emb = np.array(pickle.load(io.open(args.word_emb_init, 'rb'), encoding="bytes")).astype('float32')
word_emb = np.array(
pickle.load(
io.open(args.word_emb_init, 'rb'),
encoding="bytes")).astype('float32')
set_word_embedding(word_emb, place)
print("finish init word embedding ...")
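The hunk above loads a pickled word-embedding matrix, using `encoding="bytes"` on Python 3 to read pickles written by Python 2. A self-contained round-trip sketch of that loading step (the matrix shape and temp path are made up for illustration):

```python
import io
import os
import pickle
import tempfile

import numpy as np

# Write a small embedding matrix the way a Python 2 producer might have.
emb = np.random.rand(5, 8).astype('float32')
emb_path = os.path.join(tempfile.mkdtemp(), 'word_emb.pkl')
with io.open(emb_path, 'wb') as fw:
    pickle.dump(emb, fw)

# Load it back as in the training script: pickle.load + astype('float32').
# encoding="bytes" is what makes Python-2-era pickles readable on Python 3.
with io.open(emb_path, 'rb') as fr:
    word_emb = np.array(pickle.load(fr, encoding="bytes")).astype('float32')
```
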
......@@ -147,15 +155,20 @@ def do_train(args):
used_time = time_end - time_begin
current_time = time.strftime('%Y-%m-%d %H:%M:%S',
time.localtime(time.time()))
print('%s epoch: %d, step: %s, avg loss %s, speed: %f steps/s' % (current_time, epoch_step, steps, sum_loss / args.print_steps, args.print_steps / used_time))
print(
'%s epoch: %d, step: %s, avg loss %s, speed: %f steps/s'
% (current_time, epoch_step, steps, sum_loss /
args.print_steps, args.print_steps / used_time))
sum_loss = 0.0
time_begin = time.time()
if steps % args.save_steps == 0:
if args.save_checkpoint:
save_load_io.save_checkpoint(args, exe, train_prog, "step_" + str(steps))
save_load_io.save_checkpoint(args, exe, train_prog,
"step_" + str(steps))
if args.save_param:
save_load_io.save_param(args, exe, train_prog, "step_" + str(steps))
save_load_io.save_param(args, exe, train_prog,
"step_" + str(steps))
steps += 1
except fluid.core.EOFException:
data_reader.reset()
......
......@@ -20,12 +20,18 @@ from kpi import CostKpi
from kpi import DurationKpi
from kpi import AccKpi
each_step_duration_atis_slot_card1 = DurationKpi('each_step_duration_atis_slot_card1', 0.01, 0, actived=True)
train_loss_atis_slot_card1 = CostKpi('train_loss_atis_slot_card1', 0.08, 0, actived=True)
train_acc_atis_slot_card1 = CostKpi('train_acc_atis_slot_card1', 0.01, 0, actived=True)
each_step_duration_atis_slot_card4 = DurationKpi('each_step_duration_atis_slot_card4', 0.06, 0, actived=True)
train_loss_atis_slot_card4 = CostKpi('train_loss_atis_slot_card4', 0.03, 0, actived=True)
train_acc_atis_slot_card4 = CostKpi('train_acc_atis_slot_card4', 0.01, 0, actived=True)
each_step_duration_atis_slot_card1 = DurationKpi(
'each_step_duration_atis_slot_card1', 0.01, 0, actived=True)
train_loss_atis_slot_card1 = CostKpi(
'train_loss_atis_slot_card1', 0.08, 0, actived=True)
train_acc_atis_slot_card1 = CostKpi(
'train_acc_atis_slot_card1', 0.01, 0, actived=True)
each_step_duration_atis_slot_card4 = DurationKpi(
'each_step_duration_atis_slot_card4', 0.06, 0, actived=True)
train_loss_atis_slot_card4 = CostKpi(
'train_loss_atis_slot_card4', 0.03, 0, actived=True)
train_acc_atis_slot_card4 = CostKpi(
'train_acc_atis_slot_card4', 0.01, 0, actived=True)
tracking_kpis = [
each_step_duration_atis_slot_card1,
......
......@@ -100,8 +100,12 @@ def prepare_batch_data(task_name,
if isinstance(insts[0][3], list):
if task_name == "atis_slot":
labels_list = [inst[3] + [0] * (max_len - len(inst[3])) for inst in insts]
labels_list = [np.array(labels_list).astype("int64").reshape([-1, max_len])]
labels_list = [
inst[3] + [0] * (max_len - len(inst[3])) for inst in insts
]
labels_list = [
np.array(labels_list).astype("int64").reshape([-1, max_len])
]
elif task_name == "dstc2":
labels_list = [inst[3] for inst in insts]
labels_list = [np.array(labels_list).astype("int64")]
......@@ -124,10 +128,7 @@ def prepare_batch_data(task_name,
out = batch_src_ids
# Second step: padding
src_id, self_input_mask = pad_batch_data(
out,
max_len,
pad_idx=pad_id,
return_input_mask=True)
out, max_len, pad_idx=pad_id, return_input_mask=True)
pos_id = pad_batch_data(
batch_pos_ids,
max_len,
......@@ -163,13 +164,13 @@ def pad_batch_data(insts,
corresponding position data and attention bias.
"""
return_list = []
max_len = max_len_in if max_len_in != -1 else max(len(inst) for inst in insts)
max_len = max_len_in if max_len_in != -1 else max(
len(inst) for inst in insts)
# Any token included in dict can be used to pad, since the paddings' loss
# will be masked out by weights and make no effect on parameter gradients.
inst_data = np.array(
[inst + list([pad_idx] * (max_len - len(inst))) for inst in insts
])
[inst + list([pad_idx] * (max_len - len(inst))) for inst in insts])
return_list += [inst_data.astype("int64").reshape([-1, max_len])]
# position data
......
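The `pad_batch_data` hunk above pads every instance to the batch max length with `pad_idx` and optionally returns an input mask; since the padding loss is masked out, the pad token choice does not affect gradients. A minimal NumPy sketch of that logic (the batch contents and `pad_idx=0` default are assumptions for illustration):

```python
import numpy as np

def pad_batch_data(insts, pad_idx=0, return_input_mask=False):
    """Pad a batch of token-id lists to the batch max length."""
    max_len = max(len(inst) for inst in insts)
    inst_data = np.array(
        [inst + [pad_idx] * (max_len - len(inst)) for inst in insts])
    return_list = [inst_data.astype("int64").reshape([-1, max_len])]
    if return_input_mask:
        # 1.0 marks real tokens, 0.0 marks padding positions.
        input_mask = np.array(
            [[1.0] * len(inst) + [0.0] * (max_len - len(inst))
             for inst in insts]).astype("float32")
        return_list.append(input_mask)
    return return_list if len(return_list) > 1 else return_list[0]

ids, mask = pad_batch_data([[3, 7, 2], [5]], return_input_mask=True)
```
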
......@@ -25,18 +25,21 @@ class DefinePredict(object):
"""
Packaging Prediction Results
"""
def __init__(self):
"""
init
"""
self.task_map = {'udc': 'get_matching_res',
self.task_map = {
'udc': 'get_matching_res',
'swda': 'get_cls_res',
'mrda': 'get_cls_res',
'atis_intent': 'get_cls_res',
'atis_slot': 'get_sequence_tagging',
'dstc2': 'get_multi_cls_res',
'dstc2_asr': 'get_multi_cls_res',
'multi-woz': 'get_multi_cls_res'}
'multi-woz': 'get_multi_cls_res'
}
def get_matching_res(self, probs, params=None):
"""
......@@ -79,7 +82,3 @@ class DefinePredict(object):
label_str = " ".join([str(l) for l in sorted(labels)])
return label_str
......@@ -20,25 +20,29 @@ import sys
import io
import os
URLLIB=urllib
URLLIB = urllib
if sys.version_info >= (3, 0):
import urllib.request
URLLIB=urllib.request
URLLIB = urllib.request
DATA_MODEL_PATH = {"DATA_PATH": "https://baidu-nlp.bj.bcebos.com/dmtk_data_1.0.0.tar.gz",
"PRETRAIN_MODEL": "https://bert-models.bj.bcebos.com/uncased_L-12_H-768_A-12.tar.gz",
"TRAINED_MODEL": "https://baidu-nlp.bj.bcebos.com/dgu_models_2.0.0.tar.gz"}
DATA_MODEL_PATH = {
"DATA_PATH": "https://baidu-nlp.bj.bcebos.com/dmtk_data_1.0.0.tar.gz",
"PRETRAIN_MODEL":
"https://bert-models.bj.bcebos.com/uncased_L-12_H-768_A-12.tar.gz",
"TRAINED_MODEL": "https://baidu-nlp.bj.bcebos.com/dgu_models_2.0.0.tar.gz"
}
PATH_MAP = {'DATA_PATH': "./data/input",
PATH_MAP = {
'DATA_PATH': "./data/input",
'PRETRAIN_MODEL': './data/pretrain_model',
'TRAINED_MODEL': './data/saved_models'}
'TRAINED_MODEL': './data/saved_models'
}
def un_tar(tar_name, dir_name):
try:
t = tarfile.open(tar_name)
t.extractall(path = dir_name)
t.extractall(path=dir_name)
return True
except Exception as e:
print(e)
......@@ -48,13 +52,18 @@ def un_tar(tar_name, dir_name):
def download_model_and_data():
print("Downloading dgu data, pretrain model and trained models......")
print("This process is quite long, please wait patiently............")
for path in ['./data/input/data', './data/pretrain_model/uncased_L-12_H-768_A-12', './data/saved_models/trained_models']:
for path in [
'./data/input/data',
'./data/pretrain_model/uncased_L-12_H-768_A-12',
'./data/saved_models/trained_models'
]:
if not os.path.exists(path):
continue
shutil.rmtree(path)
for path_key in DATA_MODEL_PATH:
filename = os.path.basename(DATA_MODEL_PATH[path_key])
URLLIB.urlretrieve(DATA_MODEL_PATH[path_key], os.path.join("./", filename))
URLLIB.urlretrieve(DATA_MODEL_PATH[path_key],
os.path.join("./", filename))
state = un_tar(filename, PATH_MAP[path_key])
if not state:
print("Tar %s error....." % path_key)
......
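The `un_tar` helper reformatted above wraps `tarfile` extraction and returns a success flag. A self-contained round-trip sketch (the archive name and contents are created on the fly in a temp directory):

```python
import io
import os
import tarfile
import tempfile

def un_tar(tar_name, dir_name):
    """Extract tar_name into dir_name; return True on success."""
    try:
        with tarfile.open(tar_name) as t:
            t.extractall(path=dir_name)
        return True
    except Exception as e:
        print(e)
        return False

# Build a tiny .tar.gz and extract it again.
tmp = tempfile.mkdtemp()
tar_path = os.path.join(tmp, "demo.tar.gz")
with tarfile.open(tar_path, "w:gz") as t:
    data = b"hello"
    info = tarfile.TarInfo(name="demo.txt")
    info.size = len(data)
    t.addfile(info, io.BytesIO(data))

ok = un_tar(tar_path, os.path.join(tmp, "out"))
```
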
......@@ -19,6 +19,3 @@ python run_build_data.py udc
python run_build_data.py atis
The generated slot-filling data is written to dialogue_general_understanding/data/input/data/atis/atis_slot
The generated intent-detection data is written to dialogue_general_understanding/data/input/data/atis/atis_intent
......@@ -12,7 +12,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""build swda train dev test dataset"""
import json
......@@ -27,6 +26,7 @@ class ATIS(object):
"""
nlu dataset atis data process
"""
def __init__(self):
"""
init instance
......@@ -73,7 +73,8 @@ class ATIS(object):
if example[1] not in self.intent_dict:
self.intent_dict[example[1]] = self.intent_id
self.intent_id += 1
fw.write(u"%s\t%s\n" % (self.intent_dict[example[1]], example[0].lower()))
fw.write(u"%s\t%s\n" %
(self.intent_dict[example[1]], example[0].lower()))
fw = io.open(self.map_tag_intent, 'w', encoding="utf8")
for tag in self.intent_dict:
......@@ -109,17 +110,19 @@ class ATIS(object):
tags_slot.append(str(self.slot_dict[tag]))
if i == 0:
if start not in [0, 1]:
prefix_num = len(text[: start].strip().split())
prefix_num = len(text[:start].strip().split())
tags.extend([str(self.slot_dict['O'])] * prefix_num)
tags.extend(tags_slot)
else:
prefix_num = len(text[entities[i - 1]['end']: start].strip().split())
prefix_num = len(text[entities[i - 1]['end']:start].strip()
.split())
tags.extend([str(self.slot_dict['O'])] * prefix_num)
tags.extend(tags_slot)
if entities[-1]['end'] < len(text):
suffix_num = len(text[entities[-1]['end']:].strip().split())
tags.extend([str(self.slot_dict['O'])] * suffix_num)
fw.write(u"%s\t%s\n" % (text.encode('utf8'), " ".join(tags).encode('utf8')))
fw.write(u"%s\t%s\n" %
(text.encode('utf8'), " ".join(tags).encode('utf8')))
fw = io.open(self.map_tag_slot, 'w', encoding="utf8")
for slot in self.slot_dict:
......@@ -152,7 +155,3 @@ class ATIS(object):
if __name__ == "__main__":
atis_inst = ATIS()
atis_inst.main()
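The slot-tagging hunk above aligns character-level entity spans to word-level tags by counting whitespace tokens between spans and filling them with the `'O'` tag. A simplified, runnable sketch of that alignment (the slot ids, example sentence, and the `align_slots` helper name are assumptions; the real class also handles the `start in [0, 1]` edge case):

```python
def align_slots(text, entities, slot_dict):
    """entities: dicts with 'start'/'end' char offsets and word-level 'tags'."""
    tags = []
    for i, ent in enumerate(entities):
        start = ent['start']
        if i == 0:
            # Tokens before the first entity are all 'O'.
            prefix_num = len(text[:start].strip().split())
        else:
            # Tokens between the previous entity and this one are 'O'.
            prefix_num = len(text[entities[i - 1]['end']:start].strip().split())
        tags.extend([slot_dict['O']] * prefix_num)
        tags.extend(ent['tags'])
    if entities and entities[-1]['end'] < len(text):
        # Tokens after the last entity are 'O' as well.
        suffix_num = len(text[entities[-1]['end']:].strip().split())
        tags.extend([slot_dict['O']] * suffix_num)
    return tags

text = "book a flight to boston tomorrow"
entities = [{'start': 17, 'end': 23, 'tags': [5]}]  # "boston" -> slot id 5
tags = align_slots(text, entities, {'O': 0})
```

The output has one tag per whitespace token, which is what the downstream sequence-labeling reader expects.
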
......@@ -28,6 +28,7 @@ class DSTC2(object):
"""
dialogue state tracking dstc2 data process
"""
def __init__(self):
"""
init instance
......@@ -49,7 +50,8 @@ class DSTC2(object):
self.data_dict = commonlib.load_dict(self.data_list)
for data_type in self.data_dict:
for i in range(len(self.data_dict[data_type])):
self.data_dict[data_type][i] = os.path.join(self.src_dir, self.data_dict[data_type][i])
self.data_dict[data_type][i] = os.path.join(
self.src_dir, self.data_dict[data_type][i])
def _load_ontology(self):
"""
......@@ -97,15 +99,25 @@ class DSTC2(object):
log_turn = log_json["turns"][i]
label_turn = label_json["turns"][i]
assert log_turn["turn-index"] == label_turn["turn-index"]
labels = ["%s_%s" % (slot, label_turn["goal-labels"][slot]) for slot in label_turn["goal-labels"]]
labels_ids = " ".join([str(self.map_tag_dict.get(label, self.map_tag_dict["%s_none" % label.split('_')[0]])) for label in labels])
labels = [
"%s_%s" % (slot, label_turn["goal-labels"][slot])
for slot in label_turn["goal-labels"]
]
labels_ids = " ".join([
str(
self.map_tag_dict.get(label, self.map_tag_dict[
"%s_none" % label.split('_')[0]]))
for label in labels
])
mach = log_turn['output']['transcript']
user = label_turn['transcription']
if not labels_ids.strip():
labels_ids = self.map_tag_dict['none']
out = "%s\t%s\1%s\t%s" % (session_id, mach, user, labels_ids)
user_asr = log_turn['input']['live']['asr-hyps'][0]['asr-hyp'].strip()
out_asr = "%s\t%s\1%s\t%s" % (session_id, mach, user_asr, labels_ids)
user_asr = log_turn['input']['live']['asr-hyps'][0][
'asr-hyp'].strip()
out_asr = "%s\t%s\1%s\t%s" % (session_id, mach, user_asr,
labels_ids)
fw.write(u"%s\n" % out.encode('utf8'))
fw_asr.write(u"%s\n" % out_asr.encode('utf8'))
......@@ -144,10 +156,7 @@ class DSTC2(object):
self.get_test_dataset()
self.get_labels()
if __name__ == "__main__":
dstc_inst = DSTC2()
dstc_inst.main()
......@@ -27,6 +27,7 @@ class MRDA(object):
"""
dialogue act dataset mrda data process
"""
def __init__(self):
"""
init instance
......@@ -67,7 +68,7 @@ class MRDA(object):
for dadb_key in dadb_list:
dadb_file = self.dadb_dict[dadb_key]
fr = io.open(dadb_file, 'r', encoding="utf8")
row = csv.reader(fr, delimiter = ',')
row = csv.reader(fr, delimiter=',')
for line in row:
elems = line
conv_id = elems[2]
......@@ -87,7 +88,7 @@ class MRDA(object):
for trans_key in trans_list:
trans_file = self.trans_dict[trans_key]
fr = io.open(trans_file, 'r', encoding="utf8")
row = csv.reader(fr, delimiter = ',')
row = csv.reader(fr, delimiter=',')
for line in row:
elems = line
if len(elems) != 3:
......@@ -120,7 +121,8 @@ class MRDA(object):
self.tag_id += 1
caller = elem.split('_')[0].split('-')[-1]
conv_no = elem.split('_')[0].split('-')[0]
out = "%s\t%s\t%s\t%s" % (conv_no, self.map_tag_dict[tag], caller, v_trans[0])
out = "%s\t%s\t%s\t%s" % (conv_no, self.map_tag_dict[tag], caller,
v_trans[0])
fw.write(u"%s\n" % out)
def get_train_dataset(self):
......@@ -158,10 +160,7 @@ class MRDA(object):
self.get_test_dataset()
self.get_labels()
if __name__ == "__main__":
mrda_inst = MRDA()
mrda_inst.main()
......@@ -27,6 +27,7 @@ class SWDA(object):
"""
dialogue act dataset swda data process
"""
def __init__(self):
"""
init instance
......@@ -63,7 +64,7 @@ class SWDA(object):
file_path = self.file_dict[name]
fr = io.open(file_path, 'r', encoding="utf8")
idx = 0
row = csv.reader(fr, delimiter = ',')
row = csv.reader(fr, delimiter=',')
for r in row:
if idx == 0:
idx += 1
......@@ -224,10 +225,7 @@ class SWDA(object):
self.get_test_dataset()
self.get_labels()
if __name__ == "__main__":
swda_inst = SWDA()
swda_inst.main()
......@@ -20,7 +20,6 @@ from build_dstc2_dataset import DSTC2
from build_mrda_dataset import MRDA
from build_swda_dataset import SWDA
if __name__ == "__main__":
task_name = sys.argv[1]
task_name = task_name.lower()
......@@ -38,11 +37,12 @@ if __name__ == "__main__":
elif task_name == 'atis':
atis_inst = ATIS()
atis_inst.main()
shutil.copyfile("../../data/input/data/atis/atis_slot/test.txt", "../../data/input/data/atis/atis_slot/dev.txt")
shutil.copyfile("../../data/input/data/atis/atis_intent/test.txt", "../../data/input/data/atis/atis_intent/dev.txt")
shutil.copyfile("../../data/input/data/atis/atis_slot/test.txt",
"../../data/input/data/atis/atis_slot/dev.txt")
shutil.copyfile("../../data/input/data/atis/atis_intent/test.txt",
"../../data/input/data/atis/atis_intent/dev.txt")
elif task_name == 'dstc2':
dstc_inst = DSTC2()
dstc_inst.main()
else:
exit(0)
......@@ -12,7 +12,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Tokenization classes."""
from __future__ import absolute_import
......
......@@ -113,7 +113,7 @@ def multi_head_attention(queries,
"""
Scaled Dot-Product Attention
"""
scaled_q = layers.scale(x=q, scale=d_key ** -0.5)
scaled_q = layers.scale(x=q, scale=d_key**-0.5)
product = layers.matmul(x=scaled_q, y=k, transpose_y=True)
if attn_bias:
product += attn_bias
......
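The hunk above is the scaled dot-product attention core: scale `q` by `d_key ** -0.5`, matmul with `k` transposed, add the attention bias, softmax, then weight `v`. A NumPy sketch of the same computation (the toy batch shapes are assumptions; dropout is omitted):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v, attn_bias=None):
    d_key = q.shape[-1]
    scaled_q = q * d_key ** -0.5              # layers.scale(x=q, scale=d_key**-0.5)
    product = scaled_q @ k.transpose(0, 2, 1)  # matmul with transpose_y=True
    if attn_bias is not None:
        product += attn_bias
    # Numerically stable row-wise softmax over the key dimension.
    weights = np.exp(product - product.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

q = np.random.rand(2, 4, 8)  # [batch, query_len, d_key]
k = np.random.rand(2, 4, 8)  # [batch, key_len, d_key]
v = np.random.rand(2, 4, 8)  # [batch, key_len, d_value]
out = scaled_dot_product_attention(q, k, v)
```
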
......@@ -122,5 +122,3 @@ def save_param(args, exe, program, dirname):
print("save parameters at %s" % (os.path.join(param_dir, dirname)))
return True
......@@ -23,12 +23,7 @@ from dgu.bert import BertModel
from dgu.utils.configure import JsonConfig
def create_net(
is_training,
model_input,
num_labels,
paradigm_inst,
args):
def create_net(is_training, model_input, num_labels, paradigm_inst, args):
"""create dialogue task model"""
src_ids = model_input.src_ids
......@@ -48,14 +43,15 @@ def create_net(
config=bert_conf,
use_fp16=False)
params = {'num_labels': num_labels,
params = {
'num_labels': num_labels,
'src_ids': src_ids,
'pos_ids': pos_ids,
'sent_ids': sent_ids,
'input_mask': input_mask,
'labels': labels,
'is_training': is_training}
'is_training': is_training
}
results = paradigm_inst.paradigm(bert, params)
return results
......@@ -66,7 +66,9 @@ def do_save_inference_model(args):
sent_ids = fluid.data(
name='sent_ids', shape=[-1, args.max_seq_len], dtype='int64')
input_mask = fluid.data(
name='input_mask', shape=[-1, args.max_seq_len], dtype='float32')
name='input_mask',
shape=[-1, args.max_seq_len],
dtype='float32')
if args.task_name == 'atis_slot':
labels = fluid.data(
name='labels', shape=[-1, args.max_seq_len], dtype='int64')
......@@ -74,8 +76,7 @@ def do_save_inference_model(args):
labels = fluid.data(
name='labels', shape=[-1, num_labels], dtype='int64')
else:
labels = fluid.data(
name='labels', shape=[-1, 1], dtype='int64')
labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')
input_inst = [src_ids, pos_ids, sent_ids, input_mask, labels]
input_field = InputField(input_inst)
......@@ -107,14 +108,10 @@ def do_save_inference_model(args):
fluid.io.save_inference_model(
args.inference_model_dir,
feeded_var_names=[
input_field.src_ids.name,
input_field.pos_ids.name,
input_field.sent_ids.name,
input_field.input_mask.name
],
target_vars=[
probs
input_field.src_ids.name, input_field.pos_ids.name,
input_field.sent_ids.name, input_field.input_mask.name
],
target_vars=[probs],
executor=exe,
main_program=test_prog,
model_filename="model.pdmodel",
......
......@@ -26,7 +26,6 @@ from inference_model import do_save_inference_model
from dgu.utils.configure import PDConfig
if __name__ == "__main__":
args = PDConfig(yaml_file="./data/config/dgu.yaml")
......
......@@ -66,7 +66,9 @@ def do_train(args):
sent_ids = fluid.data(
name='sent_ids', shape=[-1, args.max_seq_len], dtype='int64')
input_mask = fluid.data(
name='input_mask', shape=[-1, args.max_seq_len], dtype='float32')
name='input_mask',
shape=[-1, args.max_seq_len],
dtype='float32')
if args.task_name == 'atis_slot':
labels = fluid.data(
name='labels', shape=[-1, args.max_seq_len], dtype='int64')
......@@ -74,13 +76,12 @@ def do_train(args):
labels = fluid.data(
name='labels', shape=[-1, num_labels], dtype='int64')
else:
labels = fluid.data(
name='labels', shape=[-1, 1], dtype='int64')
labels = fluid.data(name='labels', shape=[-1, 1], dtype='int64')
input_inst = [src_ids, pos_ids, sent_ids, input_mask, labels]
input_field = InputField(input_inst)
data_reader = fluid.io.PyReader(feed_list=input_inst,
capacity=4, iterable=False)
data_reader = fluid.io.PyReader(
feed_list=input_inst, capacity=4, iterable=False)
processor = processors[task_name](data_dir=args.data_dir,
vocab_path=args.vocab_path,
max_seq_len=args.max_seq_len,
......@@ -113,9 +114,7 @@ def do_train(args):
dev_count = int(os.environ.get('CPU_NUM', 1))
batch_generator = processor.data_generator(
batch_size=args.batch_size,
phase='train',
shuffle=True)
batch_size=args.batch_size, phase='train', shuffle=True)
num_train_examples = processor.get_num_examples(phase='train')
if args.in_tokens:
......@@ -217,37 +216,32 @@ def do_train(args):
current_time = time.strftime('%Y-%m-%d %H:%M:%S',
time.localtime(time.time()))
if accuracy is not None:
print(
"%s epoch: %d, step: %d, ave loss: %f, "
print("%s epoch: %d, step: %d, ave loss: %f, "
"ave acc: %f, speed: %f steps/s" %
(current_time, epoch_step, steps,
np.mean(np_loss),
np.mean(np_acc),
np.mean(np_loss), np.mean(np_acc),
args.print_steps / used_time))
ce_info.append([
np.mean(np_loss),
np.mean(np_acc),
np.mean(np_loss), np.mean(np_acc),
args.print_steps / used_time
])
else:
print(
"%s epoch: %d, step: %d, ave loss: %f, "
print("%s epoch: %d, step: %d, ave loss: %f, "
"speed: %f steps/s" %
(current_time, epoch_step, steps,
np.mean(np_loss),
args.print_steps / used_time))
ce_info.append([
np.mean(np_loss),
args.print_steps / used_time
])
np.mean(np_loss), args.print_steps / used_time))
ce_info.append(
[np.mean(np_loss), args.print_steps / used_time])
time_begin = time.time()
if steps % args.save_steps == 0:
save_path = "step_" + str(steps)
if args.save_checkpoint:
save_load_io.save_checkpoint(args, exe, train_prog, save_path)
save_load_io.save_checkpoint(args, exe, train_prog,
save_path)
if args.save_param:
save_load_io.save_param(args, exe, train_prog, save_path)
save_load_io.save_param(args, exe, train_prog,
save_path)
except fluid.core.EOFException:
data_reader.reset()
......
......@@ -19,8 +19,7 @@ from __future__ import print_function
import os
import sys
sys.path.append("../")
sys.path.append("../shared_modules/")
import paddle
import paddle.fluid as fluid
import numpy as np
......
......@@ -11,7 +11,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Emotion Detection Task
"""
......@@ -24,7 +23,7 @@ import os
import time
import multiprocessing
import sys
sys.path.append("../")
sys.path.append("../shared_modules/")
import paddle
import paddle.fluid as fluid
......@@ -38,9 +37,7 @@ import reader
import utils
def create_model(args,
num_labels,
is_prediction=False):
def create_model(args, num_labels, is_prediction=False):
"""
Create Model for Emotion Detection
"""
......@@ -77,10 +74,17 @@ def create_model(args,
raise ValueError("Unknown network type!")
if is_prediction:
probs = network(data, seq_len, None, args.vocab_size, class_dim=num_labels, is_prediction=True)
probs = network(
data,
seq_len,
None,
args.vocab_size,
class_dim=num_labels,
is_prediction=True)
return loader, probs, [data.name, seq_len.name]
avg_loss, probs = network(data, seq_len, label, args.vocab_size, class_dim=num_labels)
avg_loss, probs = network(
data, seq_len, label, args.vocab_size, class_dim=num_labels)
num_seqs = fluid.layers.create_tensor(dtype='int64')
accuracy = fluid.layers.accuracy(input=probs, label=label, total=num_seqs)
return loader, avg_loss, accuracy, num_seqs
......@@ -142,7 +146,8 @@ def main(args):
exe = fluid.Executor(place)
task_name = args.task_name.lower()
processor = reader.EmoTectProcessor(data_dir=args.data_dir,
processor = reader.EmoTectProcessor(
data_dir=args.data_dir,
vocab_path=args.vocab_path,
random_seed=args.random_seed)
#num_labels = len(processor.get_labels())
......@@ -173,9 +178,7 @@ def main(args):
with fluid.program_guard(train_program, startup_prog):
with fluid.unique_name.guard():
train_loader, loss, accuracy, num_seqs = create_model(
args,
num_labels=num_labels,
is_prediction=False)
args, num_labels=num_labels, is_prediction=False)
sgd_optimizer = fluid.optimizer.Adagrad(learning_rate=args.lr)
sgd_optimizer.minimize(loss)
......@@ -189,37 +192,27 @@ def main(args):
if args.do_val:
if args.do_train:
test_data_generator = processor.data_generator(
batch_size=args.batch_size,
phase='dev',
epoch=1)
batch_size=args.batch_size, phase='dev', epoch=1)
else:
test_data_generator = processor.data_generator(
batch_size=args.batch_size,
phase='test',
epoch=1)
batch_size=args.batch_size, phase='test', epoch=1)
test_prog = fluid.Program()
with fluid.program_guard(test_prog, startup_prog):
with fluid.unique_name.guard():
test_loader, loss, accuracy, num_seqs = create_model(
args,
num_labels=num_labels,
is_prediction=False)
args, num_labels=num_labels, is_prediction=False)
test_prog = test_prog.clone(for_test=True)
if args.do_infer:
infer_data_generator = processor.data_generator(
batch_size=args.batch_size,
phase='infer',
epoch=1)
batch_size=args.batch_size, phase='infer', epoch=1)
test_prog = fluid.Program()
with fluid.program_guard(test_prog, startup_prog):
with fluid.unique_name.guard():
infer_loader, probs, _ = create_model(
args,
num_labels=num_labels,
is_prediction=True)
args, num_labels=num_labels, is_prediction=True)
test_prog = test_prog.clone(for_test=True)
exe.run(startup_prog)
......@@ -292,8 +285,9 @@ def main(args):
time_begin = time.time()
if steps % args.save_steps == 0:
save_path = os.path.join(args.save_checkpoint_dir, "step_" + str(steps))
fluid.io.save_persistables(exe, save_path, train_program)
save_path = os.path.join(args.save_checkpoint_dir,
"step_" + str(steps))
fluid.save(train_program, save_path)
if steps % args.validation_steps == 0:
# evaluate on dev set
......@@ -306,11 +300,11 @@ def main(args):
print("final step: %d " % steps)
if args.do_val:
evaluate(test_exe, test_prog, test_loader,
[loss.name, accuracy.name, num_seqs.name],
"dev")
[loss.name, accuracy.name, num_seqs.name], "dev")
save_path = os.path.join(args.save_checkpoint_dir, "step_" + str(steps))
fluid.io.save_persistables(exe, save_path, train_program)
save_path = os.path.join(args.save_checkpoint_dir,
"step_" + str(steps))
fluid.save(train_program, save_path)
train_loader.reset()
break
......@@ -334,15 +328,12 @@ def main(args):
if not args.do_train and args.do_val:
print("Final test result:")
evaluate(test_exe, test_prog, test_loader,
[loss.name, accuracy.name, num_seqs.name],
"test")
[loss.name, accuracy.name, num_seqs.name], "test")
# infer
if args.do_infer:
print("Final infer result:")
infer(test_exe, test_prog, infer_loader,
[probs.name],
"infer")
infer(test_exe, test_prog, infer_loader, [probs.name], "infer")
def get_cards():
......
......@@ -11,7 +11,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Emotion Detection Task, based on ERNIE
"""
......@@ -25,7 +24,7 @@ import time
import argparse
import multiprocessing
import sys
sys.path.append("../")
sys.path.append("../shared_modules/")
import paddle
import paddle.fluid as fluid
......@@ -350,7 +349,7 @@ def main(args):
if steps % args.save_steps == 0:
save_path = os.path.join(args.save_checkpoint_dir, "step_" + str(steps))
fluid.io.save_persistables(exe, save_path, train_program)
fluid.save(train_program, save_path)
if steps % args.validation_steps == 0:
# evaluate dev set
......@@ -369,7 +368,7 @@ def main(args):
except fluid.core.EOFException:
save_path = os.path.join(args.save_checkpoint_dir, "step_" + str(steps))
fluid.io.save_persistables(exe, save_path, train_program)
fluid.save(train_program, save_path)
train_pyreader.reset()
break
......
......@@ -11,7 +11,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
EmoTect utilities.
"""
......@@ -29,27 +28,13 @@ import paddle
import paddle.fluid as fluid
import numpy as np
def init_checkpoint(exe, init_checkpoint_path, main_program):
"""
Init CheckPoint
"""
assert os.path.exists(
init_checkpoint_path), "[%s] cann't be found." % init_checkpoint_path
def existed_persitables(var):
"""
If existed presitabels
"""
if not fluid.io.is_persistable(var):
return False
return os.path.exists(os.path.join(init_checkpoint_path, var.name))
fluid.io.load_vars(
exe,
init_checkpoint_path,
main_program=main_program,
predicate=existed_persitables)
print("Load model from {}".format(init_checkpoint_path))
fluid.load(main_program, init_checkpoint_path, exe)
def word2id(word_dict, query):
......@@ -57,8 +42,10 @@ def word2id(word_dict, query):
Convert word sequence into id list
"""
unk_id = len(word_dict)
wids = [word_dict[w] if w in word_dict else unk_id
for w in query.strip().split(" ")]
wids = [
word_dict[w] if w in word_dict else unk_id
for w in query.strip().split(" ")
]
return wids
......
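The reformatted `word2id` above maps whitespace tokens to vocabulary ids, falling back to `unk_id = len(word_dict)` for unseen words. A runnable sketch (the toy vocabulary is an assumption):

```python
def word2id(word_dict, query):
    """Convert a word sequence into an id list; unseen words map to unk_id."""
    unk_id = len(word_dict)
    return [
        word_dict[w] if w in word_dict else unk_id
        for w in query.strip().split(" ")
    ]

vocab = {"i": 0, "feel": 1, "happy": 2}
wids = word2id(vocab, "i feel great")  # "great" is out of vocabulary
```
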
......@@ -5,7 +5,7 @@
## 1. Task description
This document describes an LSTM-based language model. Given an input word sequence (Chinese word segmentation, English tokenization), it computes the perplexity (PPL), a language-model metric used to measure sentence fluency; for an introduction to RNN language models, see [this paper](https://arxiv.org/abs/1409.2329). Compared with traditional methods, RNN-based approaches handle rare words better.
**The language model currently requires PaddlePaddle 1.6 or later, or an appropriate develop build.**
**The language model currently requires PaddlePaddle 1.7 or later, or an appropriate develop build.**
We also recommend the [IPython Notebook demo](https://aistudio.baidu.com/aistudio/projectDetail/122290)
......
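Perplexity, the metric mentioned in the README section above, is the exponential of the average per-token negative log-likelihood; lower is more fluent. A minimal NumPy sketch (the token probabilities are made up for illustration):

```python
import numpy as np

def perplexity(token_probs):
    """PPL = exp(mean negative log-probability per token)."""
    nll = -np.log(np.asarray(token_probs, dtype=np.float64))
    return float(np.exp(nll.mean()))

# A uniform distribution over 4 choices gives PPL = 4.
ppl = perplexity([0.25, 0.25, 0.25, 0.25])
```
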
......@@ -36,7 +36,7 @@ import sys
if sys.version[0] == '2':
reload(sys)
sys.setdefaultencoding("utf-8")
sys.path.append('../')
sys.path.append('../shared_modules/')
import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"
......@@ -60,7 +60,7 @@ def profile_context(profile=True, profiler_path='/tmp/paddingrnn.profile'):
def get_current_model_para(train_prog, train_exe):
param_list = train_prog.block(0).all_parameters()
param_list = train_prog.all_parameters()
param_name_list = [p.name for p in param_list]
vals = {}
......@@ -73,7 +73,7 @@ def get_current_model_para(train_prog, train_exe):
def save_para_npz(train_prog, train_exe):
print("begin to save model to model_base")
param_list = train_prog.block(0).all_parameters()
param_list = train_prog.all_parameters()
param_name_list = [p.name for p in param_list]
vals = {}
......
......@@ -16,7 +16,7 @@ Lexical Analysis of Chinese (LAC) is a joint lexical analysis model
#### 1. PaddlePaddle installation
This project depends on PaddlePaddle 1.6.0 or later and PaddleHub 1.0.0 or later. For PaddlePaddle installation, see the official [quick install guide](http://www.paddlepaddle.org/paddle#quick-start); for PaddleHub installation, see [PaddleHub](https://github.com/PaddlePaddle/PaddleHub)
This project depends on PaddlePaddle 1.7 or later and PaddleHub 1.0.0 or later. For PaddlePaddle installation, see the official [quick install guide](http://www.paddlepaddle.org/paddle#quick-start); for PaddleHub installation, see [PaddleHub](https://github.com/PaddlePaddle/PaddleHub)
> Warning: the GPU and CPU builds of PaddlePaddle are paddlepaddle-gpu and paddlepaddle respectively; take care to install the right one.
......
......@@ -26,7 +26,7 @@ from paddle.fluid.initializer import NormalInitializer
from reader import Dataset
from ernie_reader import SequenceLabelReader
sys.path.append("..")
sys.path.append("../shared_modules/")
from models.sequence_labeling import nets
from models.representation.ernie import ernie_encoder, ernie_pyreader
......@@ -35,9 +35,10 @@ def create_model(args, vocab_size, num_labels, mode='train'):
"""create lac model"""
# model's input data
words = fluid.data(name='words', shape=[-1, 1], dtype='int64', lod_level=1)
words = fluid.data(
name='words', shape=[None, 1], dtype='int64', lod_level=1)
targets = fluid.data(
name='targets', shape=[-1, 1], dtype='int64', lod_level=1)
name='targets', shape=[None, 1], dtype='int64', lod_level=1)
# for inference process
if mode == 'infer':
......@@ -88,9 +89,11 @@ def create_pyreader(args,
return_reader=False,
mode='train'):
# init reader
device_count = len(fluid.cuda_places()) if args.use_cuda else len(
fluid.cpu_places())
if model == 'lac':
pyreader = fluid.io.PyReader(
pyreader = fluid.io.DataLoader.from_generator(
feed_list=feed_list,
capacity=50,
use_double_buffer=True,
......@@ -101,19 +104,19 @@ def create_pyreader(args,
# create lac pyreader
if mode == 'train':
pyreader.decorate_sample_list_generator(
pyreader.set_sample_list_generator(
fluid.io.batch(
fluid.io.shuffle(
reader.file_reader(file_name),
buf_size=args.traindata_shuffle_buffer),
batch_size=args.batch_size),
batch_size=args.batch_size / device_count),
places=place)
else:
pyreader.decorate_sample_list_generator(
pyreader.set_sample_list_generator(
fluid.io.batch(
reader.file_reader(
file_name, mode=mode),
batch_size=args.batch_size),
batch_size=args.batch_size / device_count),
places=place)
elif model == 'ernie':
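The reader hunk above divides the global batch size by the number of devices so that each replica feeds `batch_size / device_count` samples per step. A small sketch of that arithmetic with hypothetical values (the real code derives `device_count` from `fluid.cuda_places()` or `fluid.cpu_places()`); note that `/` yields a float under Python 3, while `//` keeps an integer:

```python
total_batch_size = 64   # args.batch_size (hypothetical value)
device_count = 4        # e.g. len(fluid.cuda_places())

per_device_float = total_batch_size / device_count   # 16.0 -- float in Python 3
per_device_int = total_batch_size // device_count    # 16   -- integer division

print(per_device_float, per_device_int)  # 16.0 16
```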
......@@ -162,19 +165,19 @@ def create_ernie_model(args, ernie_config):
# ERNIE's input data
src_ids = fluid.data(
name='src_ids', shape=[-1, args.max_seq_len, 1], dtype='int64')
name='src_ids', shape=[None, args.max_seq_len, 1], dtype='int64')
sent_ids = fluid.data(
name='sent_ids', shape=[-1, args.max_seq_len, 1], dtype='int64')
name='sent_ids', shape=[None, args.max_seq_len, 1], dtype='int64')
pos_ids = fluid.data(
name='pos_ids', shape=[-1, args.max_seq_len, 1], dtype='int64')
name='pos_ids', shape=[None, args.max_seq_len, 1], dtype='int64')
input_mask = fluid.data(
name='input_mask', shape=[-1, args.max_seq_len, 1], dtype='float32')
name='input_mask', shape=[None, args.max_seq_len, 1], dtype='float32')
padded_labels = fluid.data(
name='padded_labels', shape=[-1, args.max_seq_len, 1], dtype='int64')
name='padded_labels', shape=[None, args.max_seq_len, 1], dtype='int64')
seq_lens = fluid.data(
name='seq_lens', shape=[-1], dtype='int64', lod_level=0)
name='seq_lens', shape=[None], dtype='int64', lod_level=0)
squeeze_labels = fluid.layers.squeeze(padded_labels, axes=[-1])
......
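In the hunks above, the `-1` placeholders in `fluid.data` shapes become `None`; both mark a dimension whose size is only resolved at runtime, typically the batch or sequence dimension. A stand-in compatibility check illustrating the semantics (this is not Paddle code, just a sketch):

```python
def shape_compatible(declared, actual):
    """None matches any runtime size; fixed sizes must agree exactly."""
    if len(declared) != len(actual):
        return False
    return all(d is None or d == a for d, a in zip(declared, actual))

print(shape_compatible((None, 128, 1), (32, 128, 1)))  # True: batch dim is free
print(shape_compatible((None, 128, 1), (32, 64, 1)))   # False: fixed dim differs
```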
......@@ -20,7 +20,7 @@ import sys
from collections import namedtuple
import numpy as np
sys.path.append("..")
sys.path.append("../shared_modules/")
from preprocess.ernie.task_reader import BaseReader, tokenization
......
......@@ -24,7 +24,7 @@ import paddle
import utils
import reader
import creator
sys.path.append('../models/')
sys.path.append('../shared_modules/models/')
from model_check import check_cuda
from model_check import check_version
......
......@@ -10,7 +10,7 @@ import paddle.fluid as fluid
import creator
import reader
import utils
sys.path.append('../models/')
sys.path.append('../shared_modules/models/')
from model_check import check_cuda
from model_check import check_version
......
......@@ -24,7 +24,7 @@ import paddle
import utils
import reader
import creator
sys.path.append('../models/')
sys.path.append('../shared_modules/models/')
from model_check import check_cuda
from model_check import check_version
......
......@@ -34,7 +34,7 @@ import paddle.fluid as fluid
import creator
import utils
sys.path.append("..")
sys.path.append("../shared_modules/")
from models.representation.ernie import ErnieConfig
from models.model_check import check_cuda
from models.model_check import check_version
......@@ -188,15 +188,16 @@ def do_train(args):
if steps % args.save_steps == 0:
save_path = os.path.join(args.model_save_dir,
"step_" + str(steps))
"step_" + str(steps), "checkpoint")
print("\tsaving model as %s" % (save_path))
fluid.io.save_persistables(exe, save_path, train_program)
fluid.save(train_program, save_path)
if steps % args.validation_steps == 0:
evaluate(exe, test_program, test_pyreader, train_ret)
save_path = os.path.join(args.model_save_dir, "step_" + str(steps))
fluid.io.save_persistables(exe, save_path, train_program)
save_path = os.path.join(args.model_save_dir, "step_" + str(steps),
"checkpoint")
fluid.save(train_program, save_path)
def do_eval(args):
......
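The saving hunks above move from `fluid.io.save_persistables` on a bare step directory to `fluid.save` with a per-step `checkpoint` prefix. A sketch of the resulting path layout (directory names here are hypothetical):

```python
import os

model_save_dir = "models"  # args.model_save_dir (hypothetical)
steps = 300

# Old layout: persistables saved directly under the step directory.
old_path = os.path.join(model_save_dir, "step_" + str(steps))
# New layout: fluid.save writes its files under a "checkpoint" prefix.
new_path = os.path.join(model_save_dir, "step_" + str(steps), "checkpoint")

print(old_path, new_path)
```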
......@@ -29,7 +29,7 @@ import reader
import utils
import creator
from eval import test_process
sys.path.append('../models/')
sys.path.append('../shared_modules/models/')
from model_check import check_cuda
from model_check import check_version
......@@ -151,8 +151,8 @@ def do_train(args):
# save checkpoints
if step % args.save_steps == 0 and step != 0:
save_path = os.path.join(args.model_save_dir,
"step_" + str(step))
fluid.io.save_persistables(exe, save_path, train_program)
"step_" + str(step), "checkpoint")
fluid.save(train_program, save_path)
step += 1
if args.enable_ce:
......
......@@ -200,19 +200,11 @@ def init_checkpoint(exe, init_checkpoint_path, main_program):
assert os.path.exists(
init_checkpoint_path), "[%s] cann't be found." % init_checkpoint_path
def existed_persitables(var):
"""
If existed presitabels
"""
if not fluid.io.is_persistable(var):
return False
return os.path.exists(os.path.join(init_checkpoint_path, var.name))
fluid.io.load_vars(
exe,
init_checkpoint_path,
main_program=main_program,
predicate=existed_persitables)
try:
checkpoint_path = os.path.join(init_checkpoint_path, "checkpoint")
fluid.load(main_program, checkpoint_path, exe)
except:
fluid.load(main_program, init_checkpoint_path, exe)
print("Load model from {}".format(init_checkpoint_path))
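The replacement above drops the `load_vars` predicate in favor of `fluid.load` with a try/except fallback: it first tries the `checkpoint` prefix inside the directory, then the path itself. A stand-in sketch of that control flow (the loaders here are hypothetical callables, not Paddle's):

```python
import os

def load_with_fallback(init_path, load_fn):
    """Try <init_path>/checkpoint first, then fall back to init_path itself."""
    checkpoint_path = os.path.join(init_path, "checkpoint")
    try:
        load_fn(checkpoint_path)
        return checkpoint_path
    except Exception:
        load_fn(init_path)
        return init_path

def fake_load(path):
    # Stand-in loader that only accepts paths containing "checkpoint".
    if "checkpoint" not in path:
        raise IOError(path)

def strict_load(path):
    # Stand-in loader that rejects the prefix, forcing the fallback branch.
    if path.endswith("checkpoint"):
        raise IOError(path)

print(load_with_fallback("step_300", fake_load))   # step_300/checkpoint (POSIX)
print(load_with_fallback("init_dir", strict_load)) # init_dir
```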
......@@ -224,15 +216,6 @@ def init_pretraining_params(exe,
assert os.path.exists(pretraining_params_path
), "[%s] cann't be found." % pretraining_params_path
def _existed_params(var):
if not isinstance(var, fluid.framework.Parameter):
return False
return os.path.exists(os.path.join(pretraining_params_path, var.name))
fluid.io.load_vars(
exe,
pretraining_params_path,
main_program=main_program,
predicate=_existed_params)
fluid.load(main_program, pretraining_params_path, exe)
print("Load pretraining parameters from {}.".format(
pretraining_params_path))
......@@ -39,4 +39,3 @@ D-NET is a "pretrain-finetune" framework aimed at improving the **generalization ability of reading comprehension models**
- Introduces multi-task, multi-domain learning strategies at the fine-tuning stage (based on the [PALM](https://github.com/PaddlePaddle/PALM) multi-task learning framework), effectively improving the model's generalization across domains
Using the D-NET framework, Baidu won first place in the EMNLP 2019 [MRQA](https://mrqa.github.io/shared) international reading comprehension evaluation, beating the runner-up by nearly two percentage points and ranking first on 10 of the 12 test datasets.
......@@ -106,7 +106,7 @@ python -u main.py \
--prepostprocess_dropout 0.3
```
Training uses all GPUs by default; the GPUs used can be selected via the `CUDA_VISIBLE_DEVICES` environment variable. Training on CPU only is also possible (set `--use_cuda False`), though relatively slow. If `save_param` and `save_checkpoint` are provided (defaults: trained_params and trained_ckpts), the current parameter values and a checkpoint are saved to the corresponding directories every `save_step` iterations (default 10000); every `print_step` iterations (default 100), a log like the following is printed to standard output:
Training uses all GPUs by default; the GPUs used can be selected via the `CUDA_VISIBLE_DEVICES` environment variable. Training on CPU only is also possible (set `--use_cuda False`), though relatively slow. If `save_model_path` is provided (default: saved_models), a checkpoint is saved to that directory every `save_step` iterations (default 10000) as two files, `transformer.pdparams` and `transformer.pdopt`, recording the model parameters and optimizer state respectively; every `print_step` iterations (default 100), a log like the following is printed to standard output:
```txt
[2019-08-02 15:30:51,656 INFO train.py:262] step_idx: 150100, epoch: 32, batch: 1364, avg loss: 2.880427, normalized loss: 1.504687, ppl: 17.821888, speed: 3.34 step/s
......@@ -195,7 +195,7 @@ BLEU = 26.35, 57.7/32.1/20.0/13.0 (BP=1.000, ratio=1.013, hyp_len=63903, ref_len
### Pretrained Models
Here we provide for download the parameters of the [base model](https://transformer-res.bj.bcebos.com/base_model_params.tar.gz) and [big model](https://transformer-res.bj.bcebos.com/big_model_params.tar.gz) corresponding to the BLEU scores above (note that these models were trained and tested on the downloadable data provided).
Here we provide for download the parameters of the [base model](https://transformer-res.bj.bcebos.com/base_model_graph.tar.gz) and [big model](https://transformer-res.bj.bcebos.com/big_model_graph.tar.gz) corresponding to the BLEU scores above (note that these models were trained and tested on the downloadable data provided).
## 进阶使用
......
......@@ -12,6 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
def get_input_descs(args):
"""
Generate a dict mapping data fields to the corresponding data shapes and
......@@ -42,7 +43,8 @@ def get_input_descs(args):
# encoder.
# The actual data shape of src_slf_attn_bias is:
# [batch_size, n_head, max_src_len_in_batch, max_src_len_in_batch]
"src_slf_attn_bias": [(batch_size, n_head, seq_len, seq_len), "float32"],
"src_slf_attn_bias":
[(batch_size, n_head, seq_len, seq_len), "float32"],
# The actual data shape of trg_word is:
# [batch_size, max_trg_len_in_batch, 1]
"trg_word": [(batch_size, seq_len), "int64",
......@@ -54,12 +56,14 @@ def get_input_descs(args):
# subsequent words in the decoder.
# The actual data shape of trg_slf_attn_bias is:
# [batch_size, n_head, max_trg_len_in_batch, max_trg_len_in_batch]
"trg_slf_attn_bias": [(batch_size, n_head, seq_len, seq_len), "float32"],
"trg_slf_attn_bias":
[(batch_size, n_head, seq_len, seq_len), "float32"],
# This input is used to remove attention weights on paddings of the source
# input in the encoder-decoder attention.
# The actual data shape of trg_src_attn_bias is:
# [batch_size, n_head, max_trg_len_in_batch, max_src_len_in_batch]
"trg_src_attn_bias": [(batch_size, n_head, seq_len, seq_len), "float32"],
"trg_src_attn_bias":
[(batch_size, n_head, seq_len, seq_len), "float32"],
# This input is used in independent decoder program for inference.
# The actual data shape of enc_output is:
# [batch_size, max_src_len_in_batch, d_model]
......@@ -80,6 +84,7 @@ def get_input_descs(args):
return input_descs
# Names of word embedding table which might be reused for weight sharing.
word_emb_param_names = (
"src_word_emb_table",
......
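The reflowed entries above all describe attention-bias inputs of shape `[batch_size, n_head, seq_len, seq_len]`. A trimmed sketch of the `input_descs` mapping, with symbolic sizes standing in for runtime dimensions (only three of the real fields are shown; `n_head` value is hypothetical):

```python
batch_size = None  # resolved at runtime
seq_len = None     # resolved at runtime
n_head = 8         # hypothetical head count

input_descs = {
    "src_slf_attn_bias":
        [(batch_size, n_head, seq_len, seq_len), "float32"],
    "trg_slf_attn_bias":
        [(batch_size, n_head, seq_len, seq_len), "float32"],
    "trg_src_attn_bias":
        [(batch_size, n_head, seq_len, seq_len), "float32"],
}

shape, dtype = input_descs["src_slf_attn_bias"]
print(len(shape), dtype)  # 4 float32
```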