Commit 6bc8cd33 authored by liyukun01

add ernie-gram framework figure

Parent 0454118b
English|[简体中文](./README.zh.md)
`Remind`: The *ERNIE-Gram* models have been officially released [here](#3-download-pretrained-models-optional). Our reproduction code will be released to the [repro branch](https://github.com/PaddlePaddle/ERNIE/tree/repro) soon.
## _ERNIE-Gram_: Pre-Training with Explicitly N-Gram Masked Language Modeling for Natural Language Understanding
![ERNIE-Gram](.meta/ernie-gram.jpeg)
- [Framework](#ernie-gram-framework)
- [Quick Tour](#quick-tour)
- [Setup](#setup)
@@ -17,8 +19,6 @@ English|[简体中文](./README.zh.md)
### ERNIE-Gram Framework
Since **ERNIE 1.0**, Baidu researchers have introduced **knowledge-enhanced representation learning** into pre-training, masking consecutive words, phrases, named entities, and other semantic knowledge units to learn better representations. Building on this, we propose **ERNIE-Gram**, an explicitly n-gram masked language model that enhances the integration of coarse-grained information in pre-training. In **ERNIE-Gram**, **n-grams** are masked and predicted directly using **explicit** n-gram identities rather than contiguous sequences of tokens.
In downstream tasks, **ERNIE-Gram** uses a `bert-style` fine-tuning approach, and thus keeps the same parameter size and computational complexity.
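The contrast with token-level masking can be sketched in a few lines. The toy snippet below (not the released pre-training code; the tiny n-gram lexicon and masking rate are made up for illustration) shows how a contiguous span collapses into a single mask slot that is predicted as one explicit n-gram identity:

```python
import random

# Toy n-gram lexicon mapping multi-token spans to single identities;
# the real model derives its lexicon from corpus statistics.
NGRAM_VOCAB = {("natural", "language"): 0, ("deep", "learning"): 1}

def ngram_mask(tokens, mask_rate=0.15):
    """Return (masked_tokens, targets): targets maps a mask position to the
    identity of the whole n-gram, predicted in one step rather than per token."""
    masked, targets = [], {}
    i = 0
    while i < len(tokens):
        bigram = tuple(tokens[i:i + 2])
        if bigram in NGRAM_VOCAB and random.random() < mask_rate:
            targets[len(masked)] = NGRAM_VOCAB[bigram]  # one explicit n-gram id
            masked.append("[MASK]")                     # one slot for the whole span
            i += 2
        else:
            masked.append(tokens[i])
            i += 1
    return masked, targets

print(ngram_mask("ernie models natural language with deep learning".split(), 1.0))
```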
@@ -56,8 +56,8 @@ pip install -e .
| Model | Description | Abbreviation |
| :------------------------------------------------- | :----------------------------------------------------------- | :----------- |
| [ERNIE-Gram Base for Chinese](https://ernie-github.cdn.bcebos.com/model-ernie-gram-zh.1.tar.gz) | Layer:12, Hidden:768, Heads:12 | ernie-gram |
| [ERNIE-Gram Base for English](https://ernie-github.cdn.bcebos.com/model-ernie-gram-en.1.tar.gz) | Layer:12, Hidden:768, Heads:12 | ernie-gram-en |
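Once downloaded, the weights can be loaded through the repo's dygraph interface. The snippet below is a usage sketch under the assumption that `from_pretrained` accepts the abbreviations in the table above; consult the repo's Quick Tour for the authoritative API:

```python
import numpy as np
import paddle

# Module paths follow the repo's dygraph API; treat them and the
# 'ernie-gram' shortcut as assumptions to verify against the repo.
from ernie.modeling_ernie import ErnieModel
from ernie.tokenizing_ernie import ErnieTokenizer

tokenizer = ErnieTokenizer.from_pretrained('ernie-gram')
model = ErnieModel.from_pretrained('ernie-gram')
model.eval()

ids, _ = tokenizer.encode('hello world')        # token ids (+ segment ids, unused here)
ids = paddle.to_tensor(np.expand_dims(ids, 0))  # add a batch dimension
pooled, encoded = model(ids)                    # sentence-level and token-level features
print(pooled.shape, encoded.shape)
```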
##### 4. Download datasets
[English](./README.en.md)|简体中文
`Reminder`: The Chinese and English *ERNIE-Gram* models have been [officially released](#3-下载预训练模型可选), and the paper-reproduction code will soon be open-sourced to the [repro branch](https://github.com/PaddlePaddle/ERNIE/tree/repro). You can now try the open-sourced Chinese/English *ERNIE-Gram* models with the new ERNIE toolkit, fully upgraded for Paddle 2.0 and built on combined dynamic and static graphs.
## _ERNIE-Gram_: Pre-Training with Explicitly N-Gram Masked Language Modeling for Natural Language Understanding
![ERNIE-Gram](.meta/ernie-gram.jpeg)
- [Framework](#模型框架)
- [Quick Tour](#快速上手)
- [Setup & Usage](#安装)
@@ -17,8 +19,6 @@
### Framework
Since **ERNIE 1.0**, Baidu researchers have introduced **knowledge-enhanced** learning into pre-training, masking consecutive words, phrases, named entities, and other semantic knowledge units to learn better representations. The general-purpose language understanding model **ERNIE-Gram** released here goes a step further, proposing an **explicit** and **complete** n-gram masked language model that explicitly models n-gram semantic units.
#### ERNIE Multi-Granularity Pre-Training for Semantic Understanding
@@ -66,8 +66,8 @@ export PYTHONPATH=$PWD:$PYTHONPATH
| Model | Details | Abbreviation |
| :------------------------------------------------- | :------------------------------- | :----------- |
| [ERNIE-Gram Chinese](https://ernie-github.cdn.bcebos.com/model-ernie-gram-zh.1.tar.gz) | Layer:12, Hidden:768, Heads:12 | ernie-gram |
| [ERNIE-Gram English](https://ernie-github.cdn.bcebos.com/model-ernie-gram-en.1.tar.gz) | Layer:3, Hidden:1024, Heads:16 | ernie-gram-en |
##### 4. Download datasets
@@ -99,8 +99,7 @@ data/xnli
Fine-tune using the `dynamic graph` models (a condensed sketch follows this list):
- [Sentence-pair classification](./demo/finetune_classifier_distributed.py)
- [Semantic matching](./demo/finetune_classifier_distributed.py)
- [Named entity recognition (NER)](./demo/finetune_ner.py)
- [Machine reading comprehension](./demo/finetune_mrc.py)
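For orientation, here is a condensed sentence-pair classification sketch in the spirit of `finetune_classifier_distributed.py`. The class names and the `ernie-gram` shortcut mirror the repo's dygraph API but should be treated as assumptions; the demo scripts remain the authoritative reference for flags and data pipelines:

```python
import numpy as np
import paddle

from ernie.modeling_ernie import ErnieModelForSequenceClassification
from ernie.tokenizing_ernie import ErnieTokenizer

tokenizer = ErnieTokenizer.from_pretrained('ernie-gram')
model = ErnieModelForSequenceClassification.from_pretrained('ernie-gram', num_labels=2)
opt = paddle.optimizer.AdamW(learning_rate=4e-5, parameters=model.parameters())

# One toy training step on a single sentence pair; bert-style input packs both
# sentences into one sequence, distinguished by segment ids (sids).
ids, sids = tokenizer.encode('how do I repay', pair='how to pay back')
ids = paddle.to_tensor(np.expand_dims(ids, 0))
sids = paddle.to_tensor(np.expand_dims(sids, 0))
label = paddle.to_tensor([1])

loss, _ = model(ids, sids, labels=label)  # returns (loss, logits) when labels are given
loss.backward()
opt.step()
opt.clear_grad()
```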
@@ -108,13 +107,14 @@ data/xnli
| Task | Batch size | Learning rate |
| :-- | :-- | :-- |
| XNLI | 256 | 1.5e-4 |
| LCQMC | 16 | 4e-5 |
| DRCD | 64 | 5e-5 |
| CMRC2018 | 64 | 1.5e-4 |
| DuReader | 64 | 1.5e-5 |
| MSRA-NER(SIGHAN2006) | 16 | 5e-5 |
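As a rough illustration of wiring one table row into a Paddle 2.0 training setup (the warmup and decay step counts below are assumptions for the sketch, not values from the table):

```python
import paddle

LR = 1.5e-4                              # XNLI row above
TOTAL_STEPS, WARMUP_STEPS = 10000, 1000  # assumed step budget, not from the table

# Linear warmup into a linear (polynomial, power=1) decay, a common
# BERT-style fine-tuning schedule.
decay = paddle.optimizer.lr.PolynomialDecay(LR, decay_steps=TOTAL_STEPS, end_lr=0.0)
sched = paddle.optimizer.lr.LinearWarmup(decay, WARMUP_STEPS, start_lr=0.0, end_lr=LR)

model = paddle.nn.Linear(768, 3)         # stand-in for the ERNIE-Gram task model
opt = paddle.optimizer.AdamW(learning_rate=sched, parameters=model.parameters(),
                             weight_decay=0.01)
```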
To reproduce all experiments in the paper, please switch to the `repro` branch of this repo.
### Citation
@@ -127,9 +127,6 @@ data/xnli
}
```
### Discussion Group
- [ERNIE official homepage](https://wenxin.baidu.com/)
- [Github Issues](https://github.com/PaddlePaddle/ERNIE/issues): bug reports, feature requests, install issues, usage issues, etc.