Unverified commit da820fc1, authored by Z zhanghan, committed by GitHub

Merge pull request #487 from zhanghan1992/repro

Update README
......@@ -43,11 +43,11 @@ Specifically, the span-by-span generation task and word-by-word generation task
## Pre-trained Models
We release the checkpoints for **ERNIE-GEN _base_** model and **ERNIE-GEN _large_** model which are both pre-trained on English Wikipedia and [BookCorpus](https://arxiv.org/abs/1506.06724) (totally 16GB). Besides, **ERNIE-GEN _large_** pre-trained on the 160GB corpus (used by [RoBERTa](https://arxiv.org/abs/1907.11692) and [BART](https://arxiv.org/abs/1910.13461)) is available as well.
We release the checkpoints for **ERNIE-GEN _base_** model and **ERNIE-GEN _large_** model which are both pre-trained on English Wikipedia and [BookCorpus](https://arxiv.org/abs/1506.06724) (totally 16GB). Besides, **ERNIE-GEN _large_** pre-trained on the 430GB corpus (see [ERNIE-GEN Appendix A.1](https://arxiv.org/abs/2001.11314) for the description of the corpus) is available as well.
- [**ERNIE-GEN _base_**](https://ernie.bj.bcebos.com/ernie_gen_base.tgz) (_lowercased | 12-layer, 768-hidden, 12-heads, 110M parameters_)
- [**ERNIE-GEN _large_**](https://ernie.bj.bcebos.com/ernie_gen_large.tgz) (_lowercased | 24-layer, 1024-hidden, 16-heads, 340M parameters_)
- [**ERNIE-GEN _large with 160G_**](https://ernie.bj.bcebos.com/ernie_gen_large_160g.tgz) (_lowercased | 24-layer, 1024-hidden, 16-heads, 340M parameters_)
- [**ERNIE-GEN _large with 430G_**](https://ernie.bj.bcebos.com/ernie_gen_large_430g.tgz) (_lowercased | 24-layer, 1024-hidden, 16-heads, 340M parameters_)
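Each checkpoint above can be fetched and unpacked with standard tools (a minimal sketch, not part of the release scripts; the extracted layout is inferred from the configuration files later in this diff and is not guaranteed):
```script
# Hedged example: download and unpack the released base checkpoint.
# The archive is expected to extract into a directory such as ernie_gen_base/
# containing vocab.txt, ernie_config.json and params/ (layout inferred, not guaranteed).
wget https://ernie.bj.bcebos.com/ernie_gen_base.tgz
tar -xzf ernie_gen_base.tgz
```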
## Fine-tuning on Downstream Tasks
......@@ -65,7 +65,7 @@ The results on Gigaword-10k (10K examples of Gigaword) are presented as follows:
| UniLM | 16G / 340M | 34.21 | 15.28 | 31.54 |
| **ERNIE-GEN** _base_ | 16G / 110M | 33.75 | 15.23 | 31.35 |
| **ERNIE-GEN** _large_ | 16G / 340M | 35.05 | 16.10 | 32.50 |
| **ERNIE-GEN** _large_ (160G) | 160G / 340M | **35.51** | **16.79** | **33.23** |
| **ERNIE-GEN** _large_ (430G) | 430G / 340M | **35.51** | **16.79** | **33.23** |
The results on Gigaword are presented as follows:
......@@ -78,7 +78,7 @@ The results on Gigaword are presented as follows:
| PEGASUS (_HugeNews_) | 3.8T / 568M | 39.12 | 19.86 | 36.24 |
| **ERNIE-GEN** _base_ | 16G / 110M | 38.83 | 20.04 | 36.20 |
| **ERNIE-GEN** _large_ | 16G / 340M | 39.25 | 20.25 | 36.53 |
| **ERNIE-GEN** _large_ (160G) | 160G / 340M | **39.46** | **20.34** | **36.74** |
| **ERNIE-GEN** _large_ (430G) | 430G / 340M | **39.46** | **20.34** | **36.74** |
We preprocess the raw Gigaword dataset following UniLM; the preprocessed data is available at [Gigaword](https://ernie.bj.bcebos.com/gigaword.tgz).
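The archive can be placed under the `datasets/` directory referenced by the task configuration files (a minimal sketch; the exact target path is an assumption based on the `data_path` values in the configs shown later in this diff):
```script
# Hedged example: fetch the preprocessed Gigaword data and unpack it under ./datasets/.
# The target directory is an assumption; check the data_path in the corresponding task config.
wget https://ernie.bj.bcebos.com/gigaword.tgz
mkdir -p datasets && tar -xzf gigaword.tgz -C datasets/
```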
......@@ -97,7 +97,7 @@ The results on CNN/Daily Mail are presented as follows:
| PEGASUS (_HugeNews_) | 3.8T / 568M | 44.17 | 21.47 | 41.11 |
| **ERNIE-GEN** _base_ | 16G / 110M | 42.30 | 19.92 | 39.68 |
| **ERNIE-GEN** _large_ | 16G / 340M | 44.02 | 21.17 | 41.26 |
| **ERNIE-GEN** _large_ (160G) | 160G / 340M | **44.31** | 21.35 | **41.60** |
| **ERNIE-GEN** _large_ (430G) | 430G / 340M | **44.31** | 21.35 | **41.60** |
We preprocess the raw CNN/Daily Mail dataset following UniLM; the preprocessed data is available at [CNN/Daily Mail](https://ernie.bj.bcebos.com/cnndm.tgz).
......@@ -114,7 +114,7 @@ The results on the [SQuAD 1.1](https://arxiv.org/abs/1806.03822) dataset followi
| **ERNIE-GEN** _base_ (beam size=1) | 22.28 | 25.13 | 50.38 |
| **ERNIE-GEN** _large_ (beam size=1) | 24.03 | 26.31 | 52.36 |
| **ERNIE-GEN** _large_ (beam size=5) | 25.40 | **26.92** | 52.84 |
| **ERNIE-GEN** _large_ (beam size=5) + (160G) | **25.41** | 26.77 | **52.91** |
| **ERNIE-GEN** _large_ (beam size=5) + (430G) | **25.41** | 26.77 | **52.91** |
The results following the reversed dev-test data split in [[Zhao et al., 2018]](https://www.aclweb.org/anthology/D18-1424/) are presented as follows:
......@@ -125,7 +125,7 @@ The results following the reversed dev-test data split in [[Zhao et al., 2018]](
| **ERNIE-GEN** _base_ (beam size=1) | 23.52 | 25.61 | 51.45 |
| **ERNIE-GEN** _large_ (beam size=1) | 25.57 | 26.89 | 53.31 |
| **ERNIE-GEN** _large_ (beam size=5) | 26.95 | **27.57** | 53.77 |
| **ERNIE-GEN** _large_ (beam size=5) + (160G) | **27.05** | 27.43 | **53.83** |
| **ERNIE-GEN** _large_ (beam size=5) + (430G) | **27.05** | 27.43 | **53.83** |
*_Note that we also report results with a larger beam size of 5._
......@@ -161,24 +161,6 @@ Results of development set on CoQA task is presented as follows:
We preprocess the raw [CoQA](https://arxiv.org/abs/1808.07042) dataset; the preprocessed data is available at [CoQA-preprocessed](https://ernie.bj.bcebos.com/coqa.tgz).
Finally, we also compare with the concurrent work [ProphetNet](https://arxiv.org/abs/2001.04063); the fine-tuning results on Gigaword, CNN/Daily Mail and SQuAD are reported as follows:
- _**Abstractive Summarization**_
| Model / Task | <strong>Data / Params</strong> | <strong>Gigaword</strong> |<strong>CNN/Daily Mail</strong>|
| :-------------------------------------------------------- | :----------------------------: | :----------------------: | :----------------------: |
| Metric | - | <strong>Rouge-1 / Rouge-2 / Rouge-L</strong> |<strong>Rouge-1 / Rouge-2 / Rouge-L</strong>|
| **ProphetNet** _large_ (160G) | 160G / 340M | **39.51** / **20.42** / 36.69 |44.20 / 21.17 / 41.30|
| **ERNIE-GEN** _large_ (160G) | 160G / 340M | 39.46 / 20.34 / **36.74** |**44.31** / **21.35** / **41.60**|
- _**Question Generation**_
| Model | <strong>Data / Params</strong> | <strong>BLEU-4 / METEOR / Rouge-L</strong> |<strong>BLEU-4 / METEOR / Rouge-L</strong>|
| :-------------------------------------------------------- | :----------------------------: | :----------------------: |:----------------------: |
| Data split | - | <strong>Original</strong> |<strong>Reversed dev-test</strong>|
| **ProphetNet** _large_ (16G) | 16G / 340M | 25.01 / 26.83 / 52.57 |26.72 / **27.64** / **53.79** |
| **ERNIE-GEN** _large_ (16G) | 16G / 340M | **25.40** / **26.92** / **52.84** |**26.95** / 27.57 / **53.77**|
## Usage
### Install PaddlePaddle
......@@ -191,7 +173,7 @@ pip install -r requirements.txt
### Fine-tuning
Please add the CUDA, cuDNN and NCCL2 library paths to LD_LIBRARY_PATH before running ERNIE-GEN. The parameter configurations for the above downstream tasks are provided in `config/`, so fine-tuning can be launched directly from these configuration files. For example, you can fine-tune the ERNIE-GEN base model on Gigaword by
```script
MODEL="base" # base or large or large_160g
MODEL="base" # base or large or large_430g
TASK="gigaword" # cnndm, coqa, gigaword, squad_qg or persona-chat
sh run_seq2seq.sh ./configs/${MODEL}/${TASK}_conf
```
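In practice, updating LD_LIBRARY_PATH usually amounts to exporting the library directories of the local CUDA, cuDNN and NCCL2 installations before launching the script (a minimal sketch; the paths below are placeholders and depend on your environment):
```script
# Example only: point these at your actual CUDA / cuDNN / NCCL2 installations.
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/nccl/lib:$LD_LIBRARY_PATH
```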
......
......@@ -43,11 +43,11 @@
## Pre-trained Models
We release the **ERNIE-GEN _base_** and **ERNIE-GEN _large_** models, both pre-trained on English Wikipedia and BookCorpus (16GB in total). We also release an **ERNIE-GEN _large_** model pre-trained on a 160GB corpus, the same corpus used to pre-train [RoBERTa](https://arxiv.org/abs/1907.11692) and [BART](https://arxiv.org/abs/1910.13461).
We release the **ERNIE-GEN _base_** and **ERNIE-GEN _large_** models, both pre-trained on English Wikipedia and BookCorpus (16GB in total). We also release an **ERNIE-GEN _large_** model pre-trained on a 430GB corpus (see [ERNIE-GEN Appendix A.1](https://arxiv.org/abs/2001.11314) for a description of the corpus).
- [**ERNIE-GEN _base_**](https://ernie.bj.bcebos.com/ernie_gen_base.tgz) (_lowercased | 12-layer, 768-hidden, 12-heads, 110M parameters_)
- [**ERNIE-GEN _large_**](https://ernie.bj.bcebos.com/ernie_gen_large.tgz) (_lowercased | 24-layer, 1024-hidden, 16-heads, 340M parameters_)
- [**ERNIE-GEN _large with 160G_**](https://ernie.bj.bcebos.com/ernie_gen_large_160g.tgz) (_lowercased | 24-layer, 1024-hidden, 16-heads, 340M parameters_)
- [**ERNIE-GEN _large with 430G_**](https://ernie.bj.bcebos.com/ernie_gen_large_430g.tgz) (_lowercased | 24-layer, 1024-hidden, 16-heads, 340M parameters_)
## Fine-tuning Tasks
......@@ -65,7 +65,7 @@
| UniLM | 16G / 340M | 34.21 | 15.28 | 31.54 |
| **ERNIE-GEN** _base_ | 16G / 110M | 33.75 | 15.23 | 31.35 |
| **ERNIE-GEN** _large_ | 16G / 340M | 35.05 | 16.10 | 32.50 |
| **ERNIE-GEN** _large_ (160G) | 160G / 340M | **35.51** | **16.79** | **33.23** |
| **ERNIE-GEN** _large_ (430G) | 430G / 340M | **35.51** | **16.79** | **33.23** |
Results on Gigaword:
......@@ -78,7 +78,7 @@
| PEGASUS (_HugeNews_) | 3.8T / 568M | 39.12 | 19.86 | 36.24 |
| **ERNIE-GEN** _base_ | 16G / 110M | 38.83 | 20.04 | 36.20 |
| **ERNIE-GEN** _large_ | 16G / 340M | 39.25 | 20.25 | 36.53 |
| **ERNIE-GEN** _large_ (160G) | 160G / 340M | **39.46** | **20.34** | **36.74** |
| **ERNIE-GEN** _large_ (430G) | 430G / 340M | **39.46** | **20.34** | **36.74** |
We preprocessed the data following UniLM; download link: [Gigaword](https://ernie.bj.bcebos.com/gigaword.tgz).
......@@ -97,7 +97,7 @@
| PEGASUS (_HugeNews_) | 3.8T / 568M | 44.17 | 21.47 | 41.11 |
| **ERNIE-GEN** _base_ | 16G / 110M | 42.30 | 19.92 | 39.68 |
| **ERNIE-GEN** _large_ | 16G / 340M | 44.02 | 21.17 | 41.26 |
| **ERNIE-GEN** _large_ (160G) | 160G / 340M | **44.31** | 21.35 | **41.60** |
| **ERNIE-GEN** _large_ (430G) | 430G / 340M | **44.31** | 21.35 | **41.60** |
We preprocessed the data following UniLM; download link: [CNN/Daily Mail](https://ernie.bj.bcebos.com/cnndm.tgz).
......@@ -114,7 +114,7 @@
| **ERNIE-GEN** _base_ (beam size=1) | 22.28 | 25.13 | 50.38 |
| **ERNIE-GEN** _large_ (beam size=1) | 24.03 | 26.31 | 52.36 |
| **ERNIE-GEN** _large_ (beam size=5) | 25.40 | **26.92** | 52.84 |
| **ERNIE-GEN** _large_ (beam size=5) + (160G) | **25.41** | 26.77 | **52.91** |
| **ERNIE-GEN** _large_ (beam size=5) + (430G) | **25.41** | 26.77 | **52.91** |
Results with the dev and test sets reversed, following [[Zhao et al., 2018]](https://www.aclweb.org/anthology/D18-1424/):
......@@ -125,7 +125,7 @@
| **ERNIE-GEN** _base_ (beam size=1) | 23.52 | 25.61 | 51.45 |
| **ERNIE-GEN** _large_ (beam size=1) | 25.57 | 26.89 | 53.31 |
| **ERNIE-GEN** _large_ (beam size=5) | 26.95 | **27.57** | 53.77 |
| **ERNIE-GEN** _large_ (beam size=5) + (160G) | **27.05** | 27.43 | **53.83** |
| **ERNIE-GEN** _large_ (beam size=5) + (430G) | **27.05** | 27.43 | **53.83** |
*_We also report results with the beam size increased to 5._
......@@ -159,23 +159,6 @@
We preprocessed the raw CoQA dataset; download link: [CoQA](https://ernie.bj.bcebos.com/coqa.tgz).
In addition, we compare with the concurrent work [ProphetNet](https://arxiv.org/abs/2001.04063) on the Gigaword, CNN/Daily Mail and SQuAD datasets:
- _**Abstractive Summarization**_
| Model / Task | <strong>Data / Params</strong> | <strong>Gigaword</strong> |<strong>CNN/Daily Mail</strong>|
| :-------------------------------------------------------- | :------------------------------: | :----------------------: | :----------------------: |
| Metric | - | <strong>Rouge-1 / Rouge-2 / Rouge-L</strong> |<strong>Rouge-1 / Rouge-2 / Rouge-L</strong>|
| ProphetNet _large_ (160G) | 160G / 340M | **39.51** / **20.42** / 36.69 |44.20 / 21.17 / 41.30|
| **ERNIE-GEN** _large_ (160G) | 160G / 340M | 39.46 / 20.34 / **36.74** |**44.31** / **21.35** / **41.60**|
- _**Question Generation**_
| Model | <strong>Data / Params</strong> | <strong>BLEU-4 / METEOR / Rouge-L</strong> |<strong>BLEU-4 / METEOR / Rouge-L</strong>|
| :-------------------------------------------------------- | :------------------------------: | :----------------------: |:----------------------: |
| Data split | - | <strong>Original</strong> |<strong>Reversed dev-test</strong>|
| ProphetNet _large_ (16G) | 16G / 340M | 25.01 / 26.83 / 52.57 |26.72 / **27.64** / **53.79** |
| **ERNIE-GEN** _large_ (16G) | 16G / 340M | **25.40** / **26.92** / **52.84** |**26.95** / 27.57 / **53.77**|
## Usage
......@@ -189,7 +172,7 @@ pip install -r requirements.txt
### Fine-tuning
Before running ERNIE-GEN, add the dynamic library paths of CUDA, cuDNN and NCCL2 to LD_LIBRARY_PATH. The parameter configuration files for the downstream tasks are provided in `config/`, so fine-tuning can be launched directly from these files. For example, you can fine-tune the ERNIE-GEN base model on Gigaword with:
```script
MODEL="base" # base or large or large_160g
MODEL="base" # base or large or large_430g
TASK="gigaword" # cnndm, coqa, gigaword, squad_qg or persona-chat
sh run_seq2seq.sh ./configs/${MODEL}/${TASK}_conf
```
......
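# Fine-tuning configuration for the CoQA conversational QA task with the ERNIE-GEN large checkpoint
# (task identified from the data_path and init_model values below).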
#load model
vocab_path="ernie_gen_large/vocab.txt"
config_path="ernie_gen_large/ernie_config.json"
init_model="ernie_gen_large/params"
#for multi-turn dialog/qa
task_type="dialog"
role_type_size=3
turn_type_size=16
#input
max_src_len=480
max_tgt_len=32
tokenized_input="true"
continuous_position="true"
batch_size=4
in_tokens="false"
#tgt_type_id=1
#decode
do_decode="true"
max_dec_len=30
beam_size=3
length_penalty=0.0
use_multi_gpu_test="true"
#train
epoch=10
weight_decay=0.01
label_smooth=0.1
hidden_dropout_prob=0.1
save_and_valid_by_epoch="true"
#lr
warmup_proportion=0.1
lr_scheduler="linear_warmup_decay"
learning_rate=1e-5
#noise
random_noise="false"
noise_prob=0.5
#dataset
data_path="./datasets/coqa/"
train_set="train.tsv"
dev_set="dev.tsv"
do_train="true"
do_val="true"
do_test="false"
do_pred="false"
#evaluate
eval_script="sh ./eval/tasks/coqa/eval.sh"
eval_mertrics="f1"
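# Fine-tuning configuration for the Persona-Chat dialogue task with the ERNIE-GEN large checkpoint
# (task identified from the data_path and init_model values below).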
#load model
vocab_path="ernie_gen_large/vocab.txt"
config_path="ernie_gen_large/ernie_config.json"
init_model="ernie_gen_large/params"
#for multi-turn dialog/qa
task_type="dialog"
role_type_size=3
turn_type_size=16
#input
max_src_len=472
max_tgt_len=40
tokenized_input="true"
continuous_position="true"
batch_size=8
in_tokens="false"
#decode
do_decode="true"
max_dec_len=32
beam_size=10
length_penalty=1.3
use_multi_gpu_test="true"
#train
epoch=30
weight_decay=0.01
label_smooth=0.0
hidden_dropout_prob=0.1
save_and_valid_by_epoch="true"
#lr
warmup_proportion=0.1
lr_scheduler="linear_warmup_decay"
learning_rate=1e-4
#noise
random_noise="false"
noise_prob=0.0
#dataset
data_path="./datasets/persona_chat/"
train_set="train.tsv"
dev_set="dev.2k.tsv"
pred_set="test.tsv"
do_train="true"
do_val="true"
do_test="false"
do_pred="true"
do_decode="true"
#evaluate
eval_script="sh ./eval/tasks/persona_chat/eval.sh"
eval_mertrics="bleu_1,bleu_2,distinct_1,distinct_2"
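# The configuration diffs below only update the checkpoint directory from ernie_gen_large_160g
# to ernie_gen_large_430g; the surrounding settings (e.g. max_src_len) appear as unchanged context.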
#load model
vocab_path="ernie_gen_large_160g/vocab.txt"
config_path="ernie_gen_large_160g/ernie_config.json"
init_model="ernie_gen_large_160g/params"
vocab_path="ernie_gen_large_430g/vocab.txt"
config_path="ernie_gen_large_430g/ernie_config.json"
init_model="ernie_gen_large_430g/params"
#input
max_src_len=640
......
#load model
vocab_path="ernie_gen_large_160g/vocab.txt"
config_path="ernie_gen_large_160g/ernie_config.json"
init_model="ernie_gen_large_160g/params"
vocab_path="ernie_gen_large_430g/vocab.txt"
config_path="ernie_gen_large_430g/ernie_config.json"
init_model="ernie_gen_large_430g/params"
#input
max_src_len=192
......
#load model
vocab_path="ernie_gen_large_160g/vocab.txt"
config_path="ernie_gen_large_160g/ernie_config.json"
init_model="ernie_gen_large_160g/params"
vocab_path="ernie_gen_large_430g/vocab.txt"
config_path="ernie_gen_large_430g/ernie_config.json"
init_model="ernie_gen_large_430g/params"
#input
max_src_len=192
......
#load model
vocab_path="ernie_gen_large_160g/vocab.txt"
config_path="ernie_gen_large_160g/ernie_config.json"
init_model="ernie_gen_large_160g/params"
vocab_path="ernie_gen_large_430g/vocab.txt"
config_path="ernie_gen_large_430g/ernie_config.json"
init_model="ernie_gen_large_430g/params"
#input
max_src_len=512
......