diff --git a/PaddleNLP/README.md b/PaddleNLP/README.md
index a764d0d5a9b6f1ebadff3e4eb0035bab3317788a..00945a53eb9746c1c1d41e5212881936800a0db0 100644
--- a/PaddleNLP/README.md
+++ b/PaddleNLP/README.md
@@ -102,14 +102,15 @@ electra = ElectraModel.from_pretrained('chinese-electra-small')
 ## API Documentation

 - [Transformer API](./docs/transformers.md)
-
+  * Pretrained model APIs built on the Transformer architecture, covering mainstream classic structures such as ERNIE, BERT, RoBERTa and Electra, together with their downstream tasks.
 - [Data API](./docs/data.md)
-
+  * API reference for the text data pipeline.
 - [Dataset API](./docs/datasets.md)
-
+  * Dataset-related APIs, covering custom datasets, dataset contribution and quick dataset loading.
 - [Embedding API](./docs/embeddings.md)
-
+  * Word embedding APIs, supporting one-line loading of pretrained Chinese word embeddings and high-dimensional visualization with VisualDL.
 - [Metrics API](./docs/metrics.md)
+  * Evaluation metrics for NLP scenarios, compatible with the high-level API of the PaddlePaddle 2.0 framework.

 ## Interactive Notebook Tutorials
diff --git a/PaddleNLP/README_en.md b/PaddleNLP/README_en.md
index 34e8c9a533f315dc6719e5a84f39dfa44c416b06..62c7668a65bf8049c06841427a981ac7e2ddb64f 100644
--- a/PaddleNLP/README_en.md
+++ b/PaddleNLP/README_en.md
@@ -20,10 +20,13 @@ PaddleNLP aims to accelerate NLP applications through powerful model zoo, easy-t
 * **Rich and Powerful Model Zoo**
   - Our Model Zoo covers mainstream NLP applications, including Lexical Analysis, Syntactic Parsing, Machine Translation, Text Classification, Text Generation, Text Matching, General Dialogue, Question Answering, etc.
+
 * **Easy-to-use API**
   - The API is fully integrated with the PaddlePaddle high-level API system. It minimizes the number of user actions required for common use cases like data loading, text pre-processing, training and evaluation, which enables you to deal with text problems more productively.
+
 * **High Performance and Large-scale Training**
   - We provide a highly optimized distributed training implementation for BERT with the Fleet API, which can fully utilize GPU clusters for large-scale model pre-training. Please refer to our [benchmark](./benchmark/bert) for more information.
+
 * **Detailed Tutorials and Industrial Practices**
   - We offer detailed and interactive notebook tutorials to show you the best practices of PaddlePaddle 2.0.
@@ -91,13 +94,9 @@ For more pretrained model selection, please refer to [Pretrained-Models](./paddl
 ## API Usage

 - [Transformer API](./docs/transformers.md)
-
 - [Data API](./docs/data.md)
-
 - [Dataset API](./docs/datasets.md)
-
 - [Embedding API](./docs/embeddings.md)
-
 - [Metrics API](./docs/metrics.md)
diff --git a/PaddleNLP/docs/transformers.md b/PaddleNLP/docs/transformers.md
index 7735bac76b3fb00bc4d17d526a04442847192992..cea646846838fe28d22b3e8bba1f2bf92bc5dda8 100644
--- a/PaddleNLP/docs/transformers.md
+++ b/PaddleNLP/docs/transformers.md
@@ -1,9 +1,9 @@
-# PaddleNLP Transformer-style Pretrained Models
+# PaddleNLP Transformer API

-With the development of deep learning, a large number of high-quality Transformer pretrained models have emerged in NLP, repeatedly setting new SOTA results on all kinds of NLP tasks. PaddleNLP provides commonly used pretrained models such as BERT and ERNIE, so that users can conveniently and quickly apply various Transformer models to the tasks they need.
+With the development of deep learning, a large number of high-quality Transformer pretrained models have emerged in NLP, repeatedly setting new SOTA results on all kinds of NLP tasks. PaddleNLP provides classic pretrained models such as BERT, ERNIE and RoBERTa, so that developers can conveniently and quickly apply all kinds of Transformer pretrained models and their downstream tasks.

-## Summary of Transformer Models
+## Summary of Transformer Pretrained Models

 The table below summarizes the pretrained models currently supported by PaddleNLP. Users can use the models provided by PaddleNLP for tasks such as question answering, sequence classification and token classification. We also provide 22 sets of pretrained parameter weights, including the pretrained weights of 11 Chinese language models.
@@ -12,7 +12,7 @@
 | [BERT](https://arxiv.org/abs/1810.04805) | BertTokenizer|BertModel<br>BertForQuestionAnswering<br>BertForSequenceClassification<br>BertForTokenClassification| `bert-base-uncased`<br>`bert-large-uncased`<br>`bert-base-multilingual-uncased`<br>`bert-base-cased`<br>`bert-base-chinese`<br>`bert-base-multilingual-cased`<br>`bert-large-cased`<br>`bert-wwm-chinese`<br>`bert-wwm-ext-chinese` |
 |[ERNIE](https://arxiv.org/abs/1904.09223)|ErnieTokenizer<br>ErnieTinyTokenizer|ErnieModel<br>ErnieForQuestionAnswering<br>ErnieForSequenceClassification<br>ErnieForTokenClassification<br>ErnieForGeneration| `ernie-1.0`<br>`ernie-tiny`<br>`ernie-2.0-en`<br>`ernie-2.0-large-en`<br>`ernie-gen-base-en`<br>`ernie-gen-large-en`<br>`ernie-gen-large-en-430g`|
 |[RoBERTa](https://arxiv.org/abs/1907.11692)|RobertaTokenizer| RobertaModel<br>RobertaForQuestionAnswering<br>RobertaForSequenceClassification<br>RobertaForTokenClassification| `roberta-wwm-ext`<br>`roberta-wwm-ext-large`<br>`rbt3`<br>`rbtl3`|
-|[ELECTRA](https://arxiv.org/abs/2003.10555) |ElectraTokenizer| ElectraModel<br>ElectraForSequenceClassification<br>ElectraForTokenClassification<br> |`electra-small`<br>`electra-base`<br>`electra-large`<br>`chinese-electra-small`<br>`chinese-electra-base`<br> |
+|[ELECTRA](https://arxiv.org/abs/2003.10555) | ElectraTokenizer| ElectraModel<br>ElectraForSequenceClassification<br>ElectraForTokenClassification<br> |`electra-small`<br>`electra-base`<br>`electra-large`<br>`chinese-electra-small`<br>`chinese-electra-base`<br> |
 |[Transformer](https://arxiv.org/abs/1706.03762) |- | TransformerModel | - |

 Note: the Chinese pretrained models are `bert-base-chinese, bert-wwm-chinese, bert-wwm-ext-chinese, ernie-1.0, ernie-tiny, roberta-wwm-ext, roberta-wwm-ext-large, rbt3, rbtl3, chinese-electra-base, chinese-electra-small`. The generation models `ernie-gen-base-en, ernie-gen-large-en, ernie-gen-large-en-430g` only support the `ErnieForGeneration` task.
@@ -47,7 +47,7 @@ for batch in train_data_loader:
     probs = paddle.nn.functional.softmax(logits, axis=1)
     loss.backward()
     optimizer.step()
-    optimizer.clear_gradients()
+    optimizer.clear_grad()
 ```

 The code above gives a brief example of using a pretrained model; for more complete and detailed example code, see [Fine-tuning a pretrained model for Chinese text classification](https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/examples/text_classification/pretrained_models).
diff --git a/PaddleNLP/paddlenlp/transformers/converter/run_glue_pp.py b/PaddleNLP/paddlenlp/transformers/converter/run_glue_pp.py
index e11374c04ce6beb222151a359049e9635495b914..fccbc0825538830e8e059329e8ee910e1f3aa2db 100644
--- a/PaddleNLP/paddlenlp/transformers/converter/run_glue_pp.py
+++ b/PaddleNLP/paddlenlp/transformers/converter/run_glue_pp.py
@@ -369,7 +369,7 @@ def do_train(args):
             loss.backward()
             optimizer.step()
             lr_scheduler.step()
-            optimizer.clear_gradients()
+            optimizer.clear_grad()
             if global_step % args.save_steps == 0:
                 evaluate(model, loss_fct, metric, dev_data_loader)
                 if (not args.n_gpu > 1) or paddle.distributed.get_rank() == 0: