Update README.md

b28b862f · Zeyu Chen · GitHub · 94631f03 · b28b862f

Showing 1 changed file with 68 additions and 47 deletions

PaddleNLP/README.md +68 -47

--- a/PaddleNLP/README.md
+++ b/PaddleNLP/README.md
 简体中文 | [English](./README_en.md)

-# PaddleNLP
+<p align="center">
+  <img src="./docs/imgs/paddlenlp.png" width="520" height ="100"  align="middle" />
+</p>
+
+
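+PaddleNLP aims to help developers model text more efficiently. Through a rich model zoo and a concise, easy-to-use API, it provides best practices for PaddlePaddle 2.0 and speeds up the industrial deployment of NLP applications.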

 ![License](https://img.shields.io/badge/license-Apache%202-red.svg)
 ![python version](https://img.shields.io/badge/python-3.6+-orange.svg)
 ![support os](https://img.shields.io/badge/os-linux%2C%20win%2C%20mac-yellow.svg)

-## Introduction

-PaddleNLP aims to accelerate NLP applications through powerful model zoo, easy-to-use API with detailed tutorials, It's also the NLP best practice for PaddlePaddle 2.0 API system.
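+# Features
+
+- **Rich Model Zoo**
+
+  - Covers state-of-the-art models for mainstream NLP applications, including Chinese word embeddings, pre-trained models, lexical analysis, text classification, text matching, text generation, machine translation, general dialogue, and question answering.
+
+- **Concise, Easy-to-Use API**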

-**This project is still UNDER ACTIVE DEVELOPMENT.**
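+  - Deeply compatible with the PaddlePaddle 2.0 high-level API system, it provides more reusable text modeling modules that substantially reduce the code needed for data processing, network construction, and training, improving development efficiency.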

-## Features
+- **High-Performance Distributed Training**

-* **Rich and Powerful Model Zoo**
-  - Our Model Zoo covers mainstream NLP applications, including Lexical Analysis, Syntactic Parsing, Machine Translation, Text Classification, Text Generation, Text Matching, General Dialogue and Question Answering etc.
-* **Easy-to-use API**
-  - The API is fully integrated with PaddlePaddle high-level API system. It minimizes the number of user actions required for common use cases like data loading, text pre-processing, training and evaluation. which enables you to deal with text problems more productively.
-* **High Performance and Large-scale Training**
-  - We provide a highly optimized ditributed training implementation for BERT with Fleet API, it can fully utilize GPU clusters for large-scale model pre-training. Please refer to our [benchmark](./benchmark/bert) for more information.
-* **Detailed Tutorials and Industrial Practices**
-  - We offers detailed and interactable notebook tutorials to show you the best practices of PaddlePaddle 2.0.
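+  - With a highly optimized Transformer implementation, combined with mixed precision and the Fleet distributed training API, it makes full use of GPU cluster resources to train pre-trained models efficiently in a distributed setting.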

-## Installation

-### Prerequisites
+  
+# Installation

-* python >= 3.6
-* paddlepaddle >= 2.0.0-rc1
+## Prerequisites
+
+- python >= 3.6
+- paddlepaddle >= 2.0.0-rc1

 ```
-pip install paddlenlp>=2.0.0a
+pip install paddlenlp==2.0.0b 
 ```

-## Quick Start

-### Quick Dataset Loading
+# Quick Start
+
+## Quick Dataset Loading

 ```python
 from paddlenlp.datasets import ChnSentiCorp

-train_ds, test_ds = ChnSentiCorp.get_datasets(['train','test'])
+train_dataset, dev_dataset, test_dataset = ChnSentiCorp.get_datasets(['train', 'dev', 'test'])
 ```

-For more Dataset API usage, please refer to [Dataset API](./docs/datasets.md).
+See the [Dataset documentation](./docs/datasets.md) for more datasets.

-### Chinese Text Emebdding Loading
+## One-Line Chinese Word Embedding Loading

 ```python
-
 from paddlenlp.embeddings import TokenEmbedding

 wordemb = TokenEmbedding("w2v.baidu_encyclopedia.target.word-word.dim300")
@@ -59,9 +64,9 @@ wordemb.cosine_sim("艺术", "火车")
 >>> 0.14792643
 ```

-For more token embedding usage, please refer to [examples/word_embedding](./example/../examples/word_embedding/README.md).
+More than 50 Chinese word embeddings are built in. For more usage, see the [Embedding documentation](./examples/word_embedding/README.md).

-### One-Line Classical Model Building
+## One-Line Classic Model Loading

 ```python
 from paddlenlp.models import Ernie, Senta, SimNet
@@ -71,49 +76,65 @@ ernie = Ernie("ernie-1.0", num_classes=2, task="seq-cls")
 senta = Senta(network="bow", vocab_size=1024, num_classes=2)

 simnet = SimNet(network="gru", vocab_size=1024, num_classes=2)
-
 ```

-### Rich Chinsese Pre-trained Models
+For more usage, see the [Models API](./docs/models.md).
+
+## One-Line Loading of High-Quality Chinese Pre-trained Models

 ```python
 from paddlenlp.transformers import ErnieModel, BertModel, RobertaModel, ElectraModel

 ernie = ErnieModel.from_pretrained('ernie-1.0')
+
 bert = BertModel.from_pretrained('bert-wwm-chinese')
+
 roberta = RobertaModel.from_pretrained('roberta-wwm-ext')
+
 electra = ElectraModel.from_pretrained('chinese-electra-small')
 ```

-For more pretrained model selection, please refer to [Pretrained-Models](./docs/transformers.md)
+See [Pretrained-Models](./docs/transformers.md) for the currently supported pre-trained models.
+
+
+
+# API Usage
+
+- [Transformer API](./docs/transformers.md)
+
+- [Dataset API](./docs/datasets.md)
+
+- [Embedding API](./docs/embeddings.md)

-## API Usage
+- [Metrics API](./docs/embeddings.md)

-* [Transformer API](./docs/transformers.md)
-* [Dataset API](./docs/datasets.md)
-* [Embedding API](./docs/embeddings.md)
-* [Metrics API](./docs/embeddings.md)
-* [Models API](./docs/models.md)
+- [Models API](./docs/models.md)

-## Tutorials
+  

-Please refer to our official AI Studio account for more interactive tutorials: [PaddleNLP on AI Studio](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/574995)
+# Interactive Notebook Tutorials

-* [What is paddlenlp.seq2vec? See how it handles sentiment analysis](https://aistudio.baidu.com/aistudio/projectdetail/1294333) shows how to use lstm to do sentiment analysis.
+- [Sentence sentiment classification with the seq2vec module](https://aistudio.baidu.com/aistudio/projectdetail/1283423)
+- [How to fine-tune a pre-trained model on downstream tasks](https://aistudio.baidu.com/aistudio/projectdetail/1294333)
+- [Express waybill information extraction with Bi-GRU+CRF](https://aistudio.baidu.com/aistudio/projectdetail/1317771)
+- [Improving express waybill information extraction with the pre-trained ERNIE model](https://aistudio.baidu.com/aistudio/projectdetail/1329361)
+- [Automatic couplet generation with a Seq2Seq model](https://aistudio.baidu.com/aistudio/projectdetail/1321118)
+- [Poetry generation with the pre-trained ERNIE-GEN model](https://aistudio.baidu.com/aistudio/projectdetail/1339888)
+- [Forecasting COVID-19 case counts with a TCN network](https://aistudio.baidu.com/aistudio/projectdetail/1290873)

-* [Improving sentiment analysis with PaddleNLP's pre-trained semantic model ERNIE](https://aistudio.baidu.com/aistudio/projectdetail/1283423) shows how to exploit the pretrained ERNIE to make sentiment analysis better.
+For more tutorials, see [PaddleNLP on AI Studio](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/574995).

-* [Express waybill information extraction based on Bi-GRU+CRF](https://aistudio.baidu.com/aistudio/projectdetail/1317771) shows how to make use of bigru and crf to do information extraction.

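-* [Improving express waybill information extraction with PaddleNLP's pre-trained ERNIE model](https://aistudio.baidu.com/aistudio/projectdetail/1329361) shows how to exploit the pretrained ERNIE to do information extraction better.
+# Community Contributions and Technical Discussion

+- You are welcome to join the PaddleNLP SIG community and contribute high-quality model implementations, public datasets, tutorials and case studies, and auxiliary tools.
+- Join our QQ technical discussion group now to talk about NLP technology! ⬇️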

-## Community
+<div align="center">
+  <img src="./docs/imgs/qq.png"  width="200" height="200" />
+</div>  

-* SIG for Pretrained Model Contribution
-* SIG for Dataset Integration
-* SIG for Tutorial Writing

-## License
+# License

-PaddleNLP is provided under the [Apache-2.0 License](./LICENSE).
+PaddleNLP is released under the [Apache-2.0 License](./LICENSE).
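
A note on the word-embedding hunk: `wordemb.cosine_sim(...)` returns a single similarity score for two words. The diff does not show how that score is computed, so the sketch below only assumes it is the standard cosine similarity between the two embedding vectors. The function name `cosine_similarity` and the 3-dimensional vectors are hypothetical and written in plain Python; they are not part of PaddleNLP.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional vectors, invented for illustration only;
# the embedding named in the diff has 300 dimensions.
art = [0.2, 0.7, 0.1]
train = [0.9, 0.1, 0.3]
score = cosine_similarity(art, train)
print(f"similarity: {score:.4f}")
```

Under this assumption, a score near 1 means the two vectors point in almost the same direction, and a score near 0 means they are nearly orthogonal. That would be consistent with the low value the README shows for two unrelated words.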