Merge branch 'release/v2.1' into develop

41bee5c9 · wuzewu · 0f269066 · 55429fcc
27 changed files
--- a/README.md
+++ b/README.md
@@ -179,12 +179,14 @@ The release of this project is certified by the <a href="./LICENSE">Apache 2.0 l
    <a href="https://github.com/DesmonDay"><img src="https://avatars.githubusercontent.com/u/20554008?v=4" width=75 height=75></a>
    <a href="https://github.com/chunzhang-hub"><img src="https://avatars.githubusercontent.com/u/63036966?v=4" width=75 height=75></a>
    <a href="https://github.com/adaxiadaxi"><img src="https://avatars.githubusercontent.com/u/58928121?v=4" width=75 height=75></a>
+    <a href="https://github.com/rainyfly"><img src="https://avatars.githubusercontent.com/u/22424850?v=4" width=75 height=75></a>
+    <a href="https://github.com/linjieccc"><img src="https://avatars.githubusercontent.com/u/40840292?v=4" width=75 height=75></a>
    <a href="https://github.com/linshuliang"><img src="https://avatars.githubusercontent.com/u/15993091?v=4" width=75 height=75></a>
    <a href="https://github.com/eepgxxy"><img src="https://avatars.githubusercontent.com/u/15946195?v=4" width=75 height=75></a>
-    <a href="https://github.com/linjieccc"><img src="https://avatars.githubusercontent.com/u/40840292?v=4" width=75 height=75></a>
    <a href="https://github.com/paopjian"><img src="https://avatars.githubusercontent.com/u/20377352?v=4" width=75 height=75></a>
    <a href="https://github.com/zbp-xxxp"><img src="https://avatars.githubusercontent.com/u/58476312?v=4" width=75 height=75></a>
    <a href="https://github.com/houj04"><img src="https://avatars.githubusercontent.com/u/35131887?v=4" width=75 height=75></a>
+    <a href="https://github.com/Wgm-Inspur"><img src="https://avatars.githubusercontent.com/u/89008682?v=4" width=75 height=75></a>
    <a href="https://github.com/apps/dependabot"><img src="https://avatars.githubusercontent.com/in/29110?v=4" width=75 height=75></a>
    <a href="https://github.com/dxxxp"><img src="https://avatars.githubusercontent.com/u/15886898?v=4" width=75 height=75></a>
    <a href="https://github.com/jianganbai"><img src="https://avatars.githubusercontent.com/u/50263321?v=4" width=75 height=75></a>
@@ -197,12 +199,12 @@ The release of this project is certified by the <a href="./LICENSE">Apache 2.0 l
    <a href="https://github.com/Haijunlv"><img src="https://avatars.githubusercontent.com/u/28926237?v=4" width=75 height=75></a>
    <a href="https://github.com/holyseven"><img src="https://avatars.githubusercontent.com/u/13829174?v=4" width=75 height=75></a>
    <a href="https://github.com/MRXLT"><img src="https://avatars.githubusercontent.com/u/16594411?v=4" width=75 height=75></a>
-    <a href="https://github.com/Wgm-Inspur"><img src="https://avatars.githubusercontent.com/u/89008682?v=4" width=75 height=75></a>
    <a href="https://github.com/cclauss"><img src="https://avatars.githubusercontent.com/u/3709715?v=4" width=75 height=75></a>
-    <a href="https://github.com/rainyfly"><img src="https://avatars.githubusercontent.com/u/22424850?v=4" width=75 height=75></a>
    <a href="https://github.com/hu-qi"><img src="https://avatars.githubusercontent.com/u/17986122?v=4" width=75 height=75></a>
    <a href="https://github.com/jayhenry"><img src="https://avatars.githubusercontent.com/u/4285375?v=4" width=75 height=75></a>
    <a href="https://github.com/hlmu"><img src="https://avatars.githubusercontent.com/u/30133236?v=4" width=75 height=75></a>
+    <a href="https://github.com/shinichiye"><img src="https://avatars.githubusercontent.com/u/76040149?v=4" width=75 height=75></a>
+    <a href="https://github.com/will-jl944"><img src="https://avatars.githubusercontent.com/u/68210528?v=4" width=75 height=75></a>
    <a href="https://github.com/yma-admin"><img src="https://avatars.githubusercontent.com/u/40477813?v=4" width=75 height=75></a>
    <a href="https://github.com/zl1271"><img src="https://avatars.githubusercontent.com/u/22902089?v=4" width=75 height=75></a>
    <a href="https://github.com/brooklet"><img src="https://avatars.githubusercontent.com/u/1585799?v=4" width=75 height=75></a>
@@ -223,5 +225,5 @@ We welcome you to contribute code to PaddleHub, and thank you for your feedback.
 * Many thanks to [huqi](https://github.com/hu-qi) for fixing readme typo
 * Many thanks to [parano](https://github.com/parano) [cqvu](https://github.com/cqvu) [deehrlic](https://github.com/deehrlic) for contributing this feature in PaddleHub
 * Many thanks to [paopjian](https://github.com/paopjian) for correcting the wrong website address [#1424](https://github.com/PaddlePaddle/PaddleHub/issues/1424)
-* Many thanks to [Wgm-Inspur](https://github.com/Wgm-Inspur) for correcting the demo errors in readme
+* Many thanks to [Wgm-Inspur](https://github.com/Wgm-Inspur) for correcting the demo errors in readme, and updating the RNN illustration in the text classification and sequence labeling demo
 * Many thanks to [zl1271](https://github.com/zl1271) for fixing serving docs typo
--- a/README_ch.md
+++ b/README_ch.md
@@ -196,12 +196,14 @@ print(results)
    <a href="https://github.com/DesmonDay"><img src="https://avatars.githubusercontent.com/u/20554008?v=4" width=75 height=75></a>
    <a href="https://github.com/chunzhang-hub"><img src="https://avatars.githubusercontent.com/u/63036966?v=4" width=75 height=75></a>
    <a href="https://github.com/adaxiadaxi"><img src="https://avatars.githubusercontent.com/u/58928121?v=4" width=75 height=75></a>
+    <a href="https://github.com/rainyfly"><img src="https://avatars.githubusercontent.com/u/22424850?v=4" width=75 height=75></a>
+    <a href="https://github.com/linjieccc"><img src="https://avatars.githubusercontent.com/u/40840292?v=4" width=75 height=75></a>
    <a href="https://github.com/linshuliang"><img src="https://avatars.githubusercontent.com/u/15993091?v=4" width=75 height=75></a>
    <a href="https://github.com/eepgxxy"><img src="https://avatars.githubusercontent.com/u/15946195?v=4" width=75 height=75></a>
-    <a href="https://github.com/linjieccc"><img src="https://avatars.githubusercontent.com/u/40840292?v=4" width=75 height=75></a>
    <a href="https://github.com/paopjian"><img src="https://avatars.githubusercontent.com/u/20377352?v=4" width=75 height=75></a>
    <a href="https://github.com/zbp-xxxp"><img src="https://avatars.githubusercontent.com/u/58476312?v=4" width=75 height=75></a>
    <a href="https://github.com/houj04"><img src="https://avatars.githubusercontent.com/u/35131887?v=4" width=75 height=75></a>
+    <a href="https://github.com/Wgm-Inspur"><img src="https://avatars.githubusercontent.com/u/89008682?v=4" width=75 height=75></a>
    <a href="https://github.com/apps/dependabot"><img src="https://avatars.githubusercontent.com/in/29110?v=4" width=75 height=75></a>
    <a href="https://github.com/dxxxp"><img src="https://avatars.githubusercontent.com/u/15886898?v=4" width=75 height=75></a>
    <a href="https://github.com/jianganbai"><img src="https://avatars.githubusercontent.com/u/50263321?v=4" width=75 height=75></a>
@@ -214,12 +216,12 @@ print(results)
    <a href="https://github.com/Haijunlv"><img src="https://avatars.githubusercontent.com/u/28926237?v=4" width=75 height=75></a>
    <a href="https://github.com/holyseven"><img src="https://avatars.githubusercontent.com/u/13829174?v=4" width=75 height=75></a>
    <a href="https://github.com/MRXLT"><img src="https://avatars.githubusercontent.com/u/16594411?v=4" width=75 height=75></a>
-    <a href="https://github.com/Wgm-Inspur"><img src="https://avatars.githubusercontent.com/u/89008682?v=4" width=75 height=75></a>
    <a href="https://github.com/cclauss"><img src="https://avatars.githubusercontent.com/u/3709715?v=4" width=75 height=75></a>
-    <a href="https://github.com/rainyfly"><img src="https://avatars.githubusercontent.com/u/22424850?v=4" width=75 height=75></a>
    <a href="https://github.com/hu-qi"><img src="https://avatars.githubusercontent.com/u/17986122?v=4" width=75 height=75></a>
    <a href="https://github.com/jayhenry"><img src="https://avatars.githubusercontent.com/u/4285375?v=4" width=75 height=75></a>
    <a href="https://github.com/hlmu"><img src="https://avatars.githubusercontent.com/u/30133236?v=4" width=75 height=75></a>
+    <a href="https://github.com/shinichiye"><img src="https://avatars.githubusercontent.com/u/76040149?v=4" width=75 height=75></a>
+    <a href="https://github.com/will-jl944"><img src="https://avatars.githubusercontent.com/u/68210528?v=4" width=75 height=75></a>
    <a href="https://github.com/yma-admin"><img src="https://avatars.githubusercontent.com/u/40477813?v=4" width=75 height=75></a>
    <a href="https://github.com/zl1271"><img src="https://avatars.githubusercontent.com/u/22902089?v=4" width=75 height=75></a>
    <a href="https://github.com/brooklet"><img src="https://avatars.githubusercontent.com/u/1585799?v=4" width=75 height=75></a>
@@ -239,5 +241,5 @@ print(results)
 * 非常感谢[huqi](https://github.com/hu-qi)修复了readme中的错别字
 * 非常感谢[parano](https://github.com/parano)、[cqvu](https://github.com/cqvu)、[deehrlic](https://github.com/deehrlic)三位的贡献与支持
 * 非常感谢[paopjian](https://github.com/paopjian)修改了中文readme模型搜索指向的的网站地址错误[#1424](https://github.com/PaddlePaddle/PaddleHub/issues/1424)
-* 非常感谢[Wgm-Inspur](https://github.com/Wgm-Inspur)修复了readme中的代码示例问题
+* 非常感谢[Wgm-Inspur](https://github.com/Wgm-Inspur)修复了readme中的代码示例问题，并优化了文本分类、序列标注demo中的RNN示例图
 * 非常感谢[zl1271](https://github.com/zl1271)修复了serving文档中的错别字
--- a/demo/sequence_labeling/README.md
+++ b/demo/sequence_labeling/README.md
@@ -2,7 +2,7 @@
 在2017年之前，工业界和学术界对NLP文本处理依赖于序列模型[Recurrent Neural Network (RNN)](https://baike.baidu.com/item/%E5%BE%AA%E7%8E%AF%E7%A5%9E%E7%BB%8F%E7%BD%91%E7%BB%9C/23199490?fromtitle=RNN&fromid=5707183&fr=aladdin).
-![](http://colah.github.io/posts/2015-09-NN-Types-FP/img/RNN-general.png)
+![](../../docs/imgs/RNN_Sample.png)
 近年来随着深度学习的发展，模型参数数量飞速增长，为了训练这些参数，需要更大的数据集来避免过拟合。然而，对于大部分NLP任务来说，构建大规模的标注数据集成本过高，非常困难，特别是对于句法和语义相关的任务。相比之下，大规模的未标注语料库的构建则相对容易。最近的研究表明，基于大规模未标注语料库的预训练模型（Pretrained Models, PTM) 能够习得通用的语言表示，将预训练模型Fine-tune到下游任务，能够获得出色的表现。另外，预训练模型能够避免从零开始训练模型。

--- a/demo/text_classification/README.md
+++ b/demo/text_classification/README.md
@@ -2,7 +2,7 @@
 在2017年之前，工业界和学术界对NLP文本处理依赖于序列模型[Recurrent Neural Network (RNN)](https://baike.baidu.com/item/%E5%BE%AA%E7%8E%AF%E7%A5%9E%E7%BB%8F%E7%BD%91%E7%BB%9C/23199490?fromtitle=RNN&fromid=5707183&fr=aladdin).
-![](http://colah.github.io/posts/2015-09-NN-Types-FP/img/RNN-general.png)
+![](../../docs/imgs/RNN_Sample.png)
 近年来随着深度学习的发展，模型参数数量飞速增长，为了训练这些参数，需要更大的数据集来避免过拟合。然而，对于大部分NLP任务来说，构建大规模的标注数据集成本过高，非常困难，特别是对于句法和语义相关的任务。相比之下，大规模的未标注语料库的构建则相对容易。最近的研究表明，基于大规模未标注语料库的预训练模型（Pretrained Models, PTM) 能够习得通用的语言表示，将预训练模型Fine-tune到下游任务，能够获得出色的表现。另外，预训练模型能够避免从零开始训练模型。

--- a/docs/imgs/RNN_Sample.png
+++ b/docs/imgs/RNN_Sample.png
--- a/modules/text/language_model/bert-base-cased/README.md
+++ b/modules/text/language_model/bert-base-cased/README.md
@@ -130,7 +130,7 @@ text = [["今天是个好日子"], ["天气预报说今天要下雨"]]
 # 对应本地部署，则为module.get_embedding(data=text)
 data = {"data": text}
 # 发送post请求，content-type类型应指定json方式，url中的ip地址需改为对应机器的ip
-url = "http://10.12.121.132:8866/predict/bert-base-cased"
+url = "http://127.0.0.1:8866/predict/bert-base-cased"
 # 指定post请求的headers为application/json方式
 headers = {"Content-Type": "application/json"}
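
The hunk above swaps a hard-coded internal address for `127.0.0.1`, and the surrounding comment still asks readers to substitute the IP of their own serving machine. A minimal sketch of keeping host, port, and module name as parameters is shown below. The helper name `build_predict_request` is hypothetical and not part of PaddleHub; the URL layout, the `data` key, and the default port 8866 are taken from the READMEs in this diff.

```python
import json


def build_predict_request(module_name, texts, host="127.0.0.1", port=8866):
    """Build (url, headers, body) for a PaddleHub Serving predict call.

    Hypothetical helper: only the URL pattern, the "data" key and the
    default port come from the README text in this diff.
    """
    url = f"http://{host}:{port}/predict/{module_name}"
    headers = {"Content-Type": "application/json"}
    # json.dumps escapes non-ASCII characters by default, so the body is
    # plain ASCII and safe to pass as a str to an HTTP client.
    body = json.dumps({"data": texts})
    return url, headers, body


url, headers, body = build_predict_request(
    "bert-base-cased",
    [["今天是个好日子"], ["天气预报说今天要下雨"]],
)
print(url)  # http://127.0.0.1:8866/predict/bert-base-cased
```

With the third-party `requests` package installed and a server running, the tuple can be passed on as `requests.post(url=url, headers=headers, data=body)`, which mirrors the call in the READMEs.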

--- a/modules/text/language_model/bert-base-chinese/README.md
+++ b/modules/text/language_model/bert-base-chinese/README.md
@@ -156,7 +156,7 @@ for idx, text in enumerate(data):
    # 对应本地部署，则为module.get_embedding(data=text)
    data = {"data": text}
    # 发送post请求，content-type类型应指定json方式，url中的ip地址需改为对应机器的ip
-    url = "http://10.12.121.132:8866/predict/bert-base-chinese"
+    url = "http://127.0.0.1:8866/predict/bert-base-chinese"
    # 指定post请求的headers为application/json方式
    headers = {"Content-Type": "application/json"}

--- a/modules/text/language_model/bert-base-multilingual-cased/README.md
+++ b/modules/text/language_model/bert-base-multilingual-cased/README.md
@@ -129,7 +129,7 @@ text = [["今天是个好日子"], ["天气预报说今天要下雨"]]
 # 对应本地部署，则为module.get_embedding(data=text)
 data = {"data": text}
 # 发送post请求，content-type类型应指定json方式，url中的ip地址需改为对应机器的ip
-url = "http://10.12.121.132:8866/predict/bert-base-multilingual-cased"
+url = "http://127.0.0.1:8866/predict/bert-base-multilingual-cased"
 # 指定post请求的headers为application/json方式
 headers = {"Content-Type": "application/json"}

--- a/modules/text/language_model/bert-base-multilingual-uncased/README.md
+++ b/modules/text/language_model/bert-base-multilingual-uncased/README.md
@@ -129,7 +129,7 @@ text = [["今天是个好日子"], ["天气预报说今天要下雨"]]
 # 对应本地部署，则为module.get_embedding(data=text)
 data = {"data": text}
 # 发送post请求，content-type类型应指定json方式，url中的ip地址需改为对应机器的ip
-url = "http://10.12.121.132:8866/predict/bert-base-multilingual-uncased"
+url = "http://127.0.0.1:8866/predict/bert-base-multilingual-uncased"
 # 指定post请求的headers为application/json方式
 headers = {"Content-Type": "application/json"}

--- a/modules/text/language_model/bert-base-uncased/README.md
+++ b/modules/text/language_model/bert-base-uncased/README.md
@@ -130,7 +130,7 @@ text = [["今天是个好日子"], ["天气预报说今天要下雨"]]
 # 对应本地部署，则为module.get_embedding(data=text)
 data = {"data": text}
 # 发送post请求，content-type类型应指定json方式，url中的ip地址需改为对应机器的ip
-url = "http://10.12.121.132:8866/predict/bert-base-uncased"
+url = "http://127.0.0.1:8866/predict/bert-base-uncased"
 # 指定post请求的headers为application/json方式
 headers = {"Content-Type": "application/json"}

--- a/modules/text/language_model/bert-large-cased/README.md
+++ b/modules/text/language_model/bert-large-cased/README.md
@@ -130,7 +130,7 @@ text = [["今天是个好日子"], ["天气预报说今天要下雨"]]
 # 对应本地部署，则为module.get_embedding(data=text)
 data = {"data": text}
 # 发送post请求，content-type类型应指定json方式，url中的ip地址需改为对应机器的ip
-url = "http://10.12.121.132:8866/predict/bert-large-cased"
+url = "http://127.0.0.1:8866/predict/bert-large-cased"
 # 指定post请求的headers为application/json方式
 headers = {"Content-Type": "application/json"}

--- a/modules/text/language_model/bert-large-uncased/README.md
+++ b/modules/text/language_model/bert-large-uncased/README.md
@@ -130,7 +130,7 @@ text = [["今天是个好日子"], ["天气预报说今天要下雨"]]
 # 对应本地部署，则为module.get_embedding(data=text)
 data = {"data": text}
 # 发送post请求，content-type类型应指定json方式，url中的ip地址需改为对应机器的ip
-url = "http://10.12.121.132:8866/predict/bert-large-uncased"
+url = "http://127.0.0.1:8866/predict/bert-large-uncased"
 # 指定post请求的headers为application/json方式
 headers = {"Content-Type": "application/json"}

--- a/modules/text/language_model/chinese_bert_wwm/README.md
+++ b/modules/text/language_model/chinese_bert_wwm/README.md
-```shell
+# chinese-bert-wwm
-$ hub install chinese-bert-wwm==2.0.1
+|模型名称|chinese-bert-wwm|
-```
+| :--- | :---: | 
+|类别|文本-语义模型|
+|网络|chinese-bert-wwm|
+|数据集|百度自建数据集|
+|是否支持Fine-tuning|是|
+|模型大小|391MB|
+|最新更新日期|2021-03-16|
+|贡献者|[ymcui](https://github.com/ymcui)|
+|数据指标|-|
+## 一、模型基本信息
+- ### 模型介绍
 <p align="center">
 <img src="https://bj.bcebos.com/paddlehub/paddlehub-img/bert_network.png"  hspace='10'/> <br />
 </p>
 更多详情请参考[BERT论文](https://arxiv.org/abs/1810.04805), [Chinese-BERT-wwm技术报告](https://arxiv.org/abs/1906.08101)
-## API
+## 二、安装
-```python
-def __init__(
-    task=None,
-    load_checkpoint=None,
-    label_map=None,
-    num_classes=2,
-    suffix=False,
-    **kwargs,
-)
-```
-创建Module对象（动态图组网版本）。
-**参数**
-* `task`： 任务名称，可为`seq-cls`(文本分类任务，原来的`sequence_classification`在未来会被弃用)或`token-cls`(序列标注任务)。
-* `load_checkpoint`：使用PaddleHub Fine-tune api训练保存的模型参数文件路径。
-* `label_map`：预测时的类别映射表。
-* `num_classes`：分类任务的类别数，如果指定了`label_map`，此参数可不传，默认2分类。
-* `suffix`: 序列标注任务的标签格式，如果设定为`True`，标签以'-B', '-I', '-E' 或者 '-S'为结尾，此参数默认为`False`。
-* `**kwargs`：用户额外指定的关键字字典类型的参数。
-```python
-def predict(
-    data,
-    max_seq_len=128,
-    batch_size=1,
-    use_gpu=False
-)
-```
-**参数**
-* `data`： 待预测数据，格式为\[\[sample\_a\_text\_a, sample\_a\_text\_b\], \[sample\_b\_text\_a, sample\_b\_text\_b\],…,\]，其中每个元素都是一个样例，每个样例可以包含text\_a与text\_b。每个样例文本数量（1个或者2个）需和训练时保持一致。
-* `max_seq_len`：模型处理文本的最大长度
-* `batch_size`：模型批处理大小
-* `use_gpu`：是否使用gpu，默认为False。对于GPU用户，建议开启use_gpu。
-**返回**
-* `results`：list类型，不同任务类型的返回结果如下
-  * 文本分类：列表里包含每个句子的预测标签，格式为\[label\_1, label\_2, …,\]
-  * 序列标注：列表里包含每个句子每个token的预测标签，格式为\[\[token\_1, token\_2, …,\], \[token\_1, token\_2, …,\], …,\]
-```python
-def get_embedding(
-    data,
-    use_gpu=False
-)
-```
-用于获取输入文本的句子粒度特征与字粒度特征
+- ### 1、环境依赖
-**参数**
+  - paddlepaddle >= 2.0.0
-* `data`：输入文本列表，格式为\[\[sample\_a\_text\_a, sample\_a\_text\_b\], \[sample\_b\_text\_a, sample\_b\_text\_b\],…,\]，其中每个元素都是一个样例，每个样例可以包含text\_a与text\_b。
+  - paddlehub >= 2.0.0    | [如何安装PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
-* `use_gpu`：是否使用gpu，默认为False。对于GPU用户，建议开启use_gpu。
-**返回**
+- ### 2、安装
-* `results`：list类型，格式为\[\[sample\_a\_pooled\_feature, sample\_a\_seq\_feature\], \[sample\_b\_pooled\_feature, sample\_b\_seq\_feature\],…,\]，其中每个元素都是对应样例的特征输出，每个样例都有句子粒度特征pooled\_feature与字粒度特征seq\_feature。
+  - ```shell
+    $ hub install chinese-bert-wwm
+    ```
+  - 如您安装时遇到问题，可参考：[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md)
+ | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md)
+## 三、模型API预测  
-**代码示例**
+- ### 1、预测代码示例
 ```python
 import paddlehub as hub
@@ -95,62 +62,110 @@ for idx, text in enumerate(data):
 ```
 详情可参考PaddleHub示例：
- [文本分类](https://github.com/PaddlePaddle/PaddleHub/tree/release/v2.0.0-beta/demo/text_classification)
+- [文本分类](../../../../demo/text_classification)
- [序列标注](https://github.com/PaddlePaddle/PaddleHub/tree/release/v2.0.0-beta/demo/sequence_labeling)
+- [序列标注](../../../../demo/sequence_labeling)
-## 服务部署
+- ### 2、API
-PaddleHub Serving可以部署一个在线获取预训练词向量。
+  - ```python
+    def __init__(
+        task=None,
+        load_checkpoint=None,
+        label_map=None,
+        num_classes=2,
+        suffix=False,
+        **kwargs,
+    )
+    ```
-### Step1: 启动PaddleHub Serving
+    - 创建Module对象（动态图组网版本）
-运行启动命令：
+    - **参数**
-```shell
+      - `task`： 任务名称，可为`seq-cls`(文本分类任务)或`token-cls`(序列标注任务)。
-$ hub serving start -m chinese-bert-wwm
+      - `load_checkpoint`：使用PaddleHub Fine-tune api训练保存的模型参数文件路径。
-```
+      - `label_map`：预测时的类别映射表。
+      - `num_classes`：分类任务的类别数，如果指定了`label_map`，此参数可不传，默认2分类。
+      - `suffix`: 序列标注任务的标签格式，如果设定为`True`，标签以'-B', '-I', '-E' 或者 '-S'为结尾，此参数默认为`False`。
+      - `**kwargs`：用户额外指定的关键字字典类型的参数。
-这样就完成了一个获取预训练词向量服务化API的部署，默认端口号为8866。
+  - ```python
+    def predict(
+        data,
+        max_seq_len=128,
+        batch_size=1,
+        use_gpu=False
+    )
+    ```
-**NOTE:** 如使用GPU预测，则需要在启动服务之前，请设置CUDA_VISIBLE_DEVICES环境变量，否则不用设置。
+    - **参数**
-### Step2: 发送预测请求
+      - `data`： 待预测数据，格式为\[\[sample\_a\_text\_a, sample\_a\_text\_b\], \[sample\_b\_text\_a, sample\_b\_text\_b\],…,\]，其中每个元素都是一个样例，每个样例可以包含text\_a与text\_b。每个样例文本数量（1个或者2个）需和训练时保持一致。
+      - `max_seq_len`：模型处理文本的最大长度
+      - `batch_size`：模型批处理大小
+      - `use_gpu`：是否使用gpu，默认为False。对于GPU用户，建议开启use_gpu。
-配置好服务端，以下数行代码即可实现发送预测请求，获取预测结果
+    - **返回**
-```python
+      - `results`：list类型，不同任务类型的返回结果如下
-import requests
+        - 文本分类：列表里包含每个句子的预测标签，格式为\[label\_1, label\_2, …,\]
-import json
+        - 序列标注：列表里包含每个句子每个token的预测标签，格式为\[\[token\_1, token\_2, …,\], \[token\_1, token\_2, …,\], …,\]
-# 指定用于获取embedding的文本[[text_1], [text_2], ... ]}
+  - ```python
-text = [["今天是个好日子"], ["天气预报说今天要下雨"]]
+    def get_embedding(
-# 以key的方式指定text传入预测方法的时的参数，此例中为"data"
+      data,
-# 对应本地部署，则为module.get_embedding(data=text)
+      use_gpu=False
-data = {"data": text}
+    )
-# 发送post请求，content-type类型应指定json方式，url中的ip地址需改为对应机器的ip
+    ```
-url = "http://10.12.121.132:8866/predict/chinese-bert-wwm"
-# 指定post请求的headers为application/json方式
+    - 用于获取输入文本的句子粒度特征与字粒度特征
-headers = {"Content-Type": "application/json"}
+    - **参数**
-r = requests.post(url=url, headers=headers, data=json.dumps(data))
-print(r.json())
+      - `data`：输入文本列表，格式为\[\[sample\_a\_text\_a, sample\_a\_text\_b\], \[sample\_b\_text\_a, sample\_b\_text\_b\],…,\]，其中每个元素都是一个样例，每个样例可以包含text\_a与text\_b。
-```
+      - `use_gpu`：是否使用gpu，默认为False。对于GPU用户，建议开启use_gpu。
+    - **返回**
+      - `results`：list类型，格式为\[\[sample\_a\_pooled\_feature, sample\_a\_seq\_feature\], \[sample\_b\_pooled\_feature, sample\_b\_seq\_feature\],…,\]，其中每个元素都是对应样例的特征输出，每个样例都有句子粒度特征pooled\_feature与字粒度特征seq\_feature。  
+## 四、服务部署
+- PaddleHub Serving可以部署一个在线获取预训练词向量。
+- ### 第一步：启动PaddleHub Serving
+  - ```shell
+    $ hub serving start -m chinese_bert_wwm
+    ```
-## 查看代码
+  - 这样就完成了一个获取预训练词向量服务化API的部署，默认端口号为8866。
-https://github.com/ymcui/Chinese-BERT-wwm
+  - **NOTE:** 如使用GPU预测，则需要在启动服务之前，请设置CUDA_VISIBLE_DEVICES环境变量，否则不用设置。
-## 贡献者
+- ### 第二步：发送预测请求
-[ymcui](https://github.com/ymcui)
+  - 配置好服务端，以下数行代码即可实现发送预测请求，获取预测结果
-## 依赖
+  - ```python
+    import requests
+    import json
-paddlepaddle >= 2.0.0
+    # 指定用于获取embedding的文本[[text_1], [text_2], ... ]}
+    text = [["今天是个好日子"], ["天气预报说今天要下雨"]]
+    # 以key的方式指定text传入预测方法的时的参数，此例中为"data"
+    # 对应本地部署，则为module.get_embedding(data=text)
+    data = {"data": text}
+    # 发送post请求，content-type类型应指定json方式，url中的ip地址需改为对应机器的ip
+    url = "http://127.0.0.1:8866/predict/chinese_bert_wwm"
+    # 指定post请求的headers为application/json方式
+    headers = {"Content-Type": "application/json"}
-paddlehub >= 2.0.0
+    r = requests.post(url=url, headers=headers, data=json.dumps(data))
+    print(r.json())
+    ```
-## 更新历史
+## 五、更新历史
 * 1.0.0
@@ -163,3 +178,6 @@ paddlehub >= 2.0.0
 * 2.0.1
  增加文本匹配任务`text-matching`
+  ```shell
+  $ hub install chinese-bert-wwm==2.0.1
+  ```
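
The rewritten README states that `predict` takes `data` as a list of samples, each holding one or two texts, and that the number of texts per sample must match what was used in training. A small pre-flight check along those lines might look like the sketch below. The function `check_predict_data` and its `texts_per_sample` argument are hypothetical and do not exist in PaddleHub; only the shape rule comes from the README.

```python
def check_predict_data(data, texts_per_sample):
    """Validate the nested-list shape described for `predict(data=...)`.

    Hypothetical helper: `texts_per_sample` is the number of texts (1 or 2)
    each sample had during training.
    """
    if texts_per_sample not in (1, 2):
        raise ValueError("texts_per_sample must be 1 or 2")
    for index, sample in enumerate(data):
        if not isinstance(sample, (list, tuple)):
            raise ValueError(f"sample {index} is not a list of texts")
        if len(sample) != texts_per_sample:
            raise ValueError(
                f"sample {index} has {len(sample)} text(s), "
                f"expected {texts_per_sample}"
            )
        if not all(isinstance(text, str) for text in sample):
            raise ValueError(f"sample {index} contains a non-string text")
    return data


# Single-text samples, as in the serving example from the README.
checked = check_predict_data([["今天是个好日子"], ["天气预报说今天要下雨"]], 1)
print(len(checked))  # 2
```

Failing early with a clear message is a design choice for the sketch; the module itself may report malformed input differently.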
--- a/modules/text/language_model/chinese_bert_wwm_ext/README.md
+++ b/modules/text/language_model/chinese_bert_wwm_ext/README.md
-```shell
+# chinese-bert-wwm-ext
-$ hub install chinese-bert-wwm-ext==2.0.1
+|模型名称|chinese-bert-wwm-ext|
-```
+| :--- | :---: | 
+|类别|文本-语义模型|
+|网络|chinese-bert-wwm-ext|
+|数据集|百度自建数据集|
+|是否支持Fine-tuning|是|
+|模型大小|391MB|
+|最新更新日期|2021-03-16|
+|贡献者|[ymcui](https://github.com/ymcui)|
+|数据指标|-|
+## 一、模型基本信息
+- ### 模型介绍
 <p align="center">
 <img src="https://bj.bcebos.com/paddlehub/paddlehub-img/bert_network.png"  hspace='10'/> <br />
 </p>
 更多详情请参考[BERT论文](https://arxiv.org/abs/1810.04805), [Chinese-BERT-wwm技术报告](https://arxiv.org/abs/1906.08101)
-## API
+## 二、安装
-```python
-def __init__(
-    task=None,
-    load_checkpoint=None,
-    label_map=None,
-    num_classes=2,
-    suffix=False,
-    **kwargs,
-)
-```
-创建Module对象（动态图组网版本）。
-**参数**
-* `task`： 任务名称，可为`seq-cls`(文本分类任务，原来的`sequence_classification`在未来会被弃用)或`token-cls`(序列标注任务)。
-* `load_checkpoint`：使用PaddleHub Fine-tune api训练保存的模型参数文件路径。
-* `label_map`：预测时的类别映射表。
-* `num_classes`：分类任务的类别数，如果指定了`label_map`，此参数可不传，默认2分类。
-* `suffix`: 序列标注任务的标签格式，如果设定为`True`，标签以'-B', '-I', '-E' 或者 '-S'为结尾，此参数默认为`False`。
-* `**kwargs`：用户额外指定的关键字字典类型的参数。
-```python
-def predict(
-    data,
-    max_seq_len=128,
-    batch_size=1,
-    use_gpu=False
-)
-```
-**参数**
-* `data`： 待预测数据，格式为\[\[sample\_a\_text\_a, sample\_a\_text\_b\], \[sample\_b\_text\_a, sample\_b\_text\_b\],…,\]，其中每个元素都是一个样例，每个样例可以包含text\_a与text\_b。每个样例文本数量（1个或者2个）需和训练时保持一致。
-* `max_seq_len`：模型处理文本的最大长度
-* `batch_size`：模型批处理大小
-* `use_gpu`：是否使用gpu，默认为False。对于GPU用户，建议开启use_gpu。
-**返回**
-* `results`：list类型，不同任务类型的返回结果如下
-  * 文本分类：列表里包含每个句子的预测标签，格式为\[label\_1, label\_2, …,\]
-  * 序列标注：列表里包含每个句子每个token的预测标签，格式为\[\[token\_1, token\_2, …,\], \[token\_1, token\_2, …,\], …,\]
-```python
-def get_embedding(
-    data,
-    use_gpu=False
-)
-```
-用于获取输入文本的句子粒度特征与字粒度特征
+- ### 1、环境依赖
-**参数**
+  - paddlepaddle >= 2.0.0
-* `data`：输入文本列表，格式为\[\[sample\_a\_text\_a, sample\_a\_text\_b\], \[sample\_b\_text\_a, sample\_b\_text\_b\],…,\]，其中每个元素都是一个样例，每个样例可以包含text\_a与text\_b。
+  - paddlehub >= 2.0.0    | [如何安装PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
-* `use_gpu`：是否使用gpu，默认为False。对于GPU用户，建议开启use_gpu。
-**返回**
+- ### 2、安装
-* `results`：list类型，格式为\[\[sample\_a\_pooled\_feature, sample\_a\_seq\_feature\], \[sample\_b\_pooled\_feature, sample\_b\_seq\_feature\],…,\]，其中每个元素都是对应样例的特征输出，每个样例都有句子粒度特征pooled\_feature与字粒度特征seq\_feature。
+  - ```shell
+    $ hub install chinese-bert-wwm-ext
+    ```
+  - 如您安装时遇到问题，可参考：[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md)
+ | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md)
+## 三、模型API预测  
-**代码示例**
+- ### 1、预测代码示例
 ```python
 import paddlehub as hub
@@ -95,62 +62,110 @@ for idx, text in enumerate(data):
 ```
 详情可参考PaddleHub示例：
- [文本分类](https://github.com/PaddlePaddle/PaddleHub/tree/release/v2.0.0-beta/demo/text_classification)
+- [文本分类](../../../../demo/text_classification)
- [序列标注](https://github.com/PaddlePaddle/PaddleHub/tree/release/v2.0.0-beta/demo/sequence_labeling)
+- [序列标注](../../../../demo/sequence_labeling)
-## 服务部署
+- ### 2、API
-PaddleHub Serving可以部署一个在线获取预训练词向量。
+  - ```python
+    def __init__(
+        task=None,
+        load_checkpoint=None,
+        label_map=None,
+        num_classes=2,
+        suffix=False,
+        **kwargs,
+    )
+    ```
-### Step1: 启动PaddleHub Serving
+    - 创建Module对象（动态图组网版本）
-运行启动命令：
+    - **参数**
-```shell
+      - `task`： 任务名称，可为`seq-cls`(文本分类任务)或`token-cls`(序列标注任务)。
-$ hub serving start -m chinese-bert-wwm-ext
+      - `load_checkpoint`：使用PaddleHub Fine-tune api训练保存的模型参数文件路径。
-```
+      - `label_map`：预测时的类别映射表。
+      - `num_classes`：分类任务的类别数，如果指定了`label_map`，此参数可不传，默认2分类。
+      - `suffix`: 序列标注任务的标签格式，如果设定为`True`，标签以'-B', '-I', '-E' 或者 '-S'为结尾，此参数默认为`False`。
+      - `**kwargs`：用户额外指定的关键字字典类型的参数。
-这样就完成了一个获取预训练词向量服务化API的部署，默认端口号为8866。
+  - ```python
+    def predict(
+        data,
+        max_seq_len=128,
+        batch_size=1,
+        use_gpu=False
+    )
+    ```
-**NOTE:** 如使用GPU预测，则需要在启动服务之前，请设置CUDA_VISIBLE_DEVICES环境变量，否则不用设置。
+    - **参数**
-### Step2: 发送预测请求
+      - `data`： 待预测数据，格式为\[\[sample\_a\_text\_a, sample\_a\_text\_b\], \[sample\_b\_text\_a, sample\_b\_text\_b\],…,\]，其中每个元素都是一个样例，每个样例可以包含text\_a与text\_b。每个样例文本数量（1个或者2个）需和训练时保持一致。
+      - `max_seq_len`：模型处理文本的最大长度
+      - `batch_size`：模型批处理大小
+      - `use_gpu`：是否使用gpu，默认为False。对于GPU用户，建议开启use_gpu。
-配置好服务端，以下数行代码即可实现发送预测请求，获取预测结果
+    - **返回**
-```python
+      - `results`：list类型，不同任务类型的返回结果如下
-import requests
+        - 文本分类：列表里包含每个句子的预测标签，格式为\[label\_1, label\_2, …,\]
-import json
+        - 序列标注：列表里包含每个句子每个token的预测标签，格式为\[\[token\_1, token\_2, …,\], \[token\_1, token\_2, …,\], …,\]
-# 指定用于获取embedding的文本[[text_1], [text_2], ... ]}
+  - ```python
-text = [["今天是个好日子"], ["天气预报说今天要下雨"]]
+    def get_embedding(
-# 以key的方式指定text传入预测方法的时的参数，此例中为"data"
+      data,
-# 对应本地部署，则为module.get_embedding(data=text)
+      use_gpu=False
-data = {"data": text}
+    )
-# 发送post请求，content-type类型应指定json方式，url中的ip地址需改为对应机器的ip
+    ```
-url = "http://10.12.121.132:8866/predict/chinese-bert-wwm-ext"
-# 指定post请求的headers为application/json方式
+    - 用于获取输入文本的句子粒度特征与字粒度特征
-headers = {"Content-Type": "application/json"}
+    - **参数**
-r = requests.post(url=url, headers=headers, data=json.dumps(data))
-print(r.json())
+      - `data`：输入文本列表，格式为\[\[sample\_a\_text\_a, sample\_a\_text\_b\], \[sample\_b\_text\_a, sample\_b\_text\_b\],…,\]，其中每个元素都是一个样例，每个样例可以包含text\_a与text\_b。
-```
+      - `use_gpu`：是否使用gpu，默认为False。对于GPU用户，建议开启use_gpu。
+    - **返回**
+      - `results`：list类型，格式为\[\[sample\_a\_pooled\_feature, sample\_a\_seq\_feature\], \[sample\_b\_pooled\_feature, sample\_b\_seq\_feature\],…,\]，其中每个元素都是对应样例的特征输出，每个样例都有句子粒度特征pooled\_feature与字粒度特征seq\_feature。  
+## 四、服务部署
+- PaddleHub Serving可以部署一个在线获取预训练词向量。
+- ### 第一步：启动PaddleHub Serving
+  - ```shell
+    $ hub serving start -m chinese_bert_wwm_ext
+    ```
-## 查看代码
+  - 这样就完成了一个获取预训练词向量服务化API的部署，默认端口号为8866。
-https://github.com/ymcui/Chinese-BERT-wwm
+  - **NOTE:** 如使用GPU预测，则需要在启动服务之前，请设置CUDA_VISIBLE_DEVICES环境变量，否则不用设置。
-## 贡献者
+- ### 第二步：发送预测请求
-[ymcui](https://github.com/ymcui)
+  - 配置好服务端，以下数行代码即可实现发送预测请求，获取预测结果
-## 依赖
+  - ```python
+    import requests
+    import json
-paddlepaddle >= 2.0.0
+    # 指定用于获取embedding的文本[[text_1], [text_2], ... ]}
+    text = [["今天是个好日子"], ["天气预报说今天要下雨"]]
+    # 以key的方式指定text传入预测方法的时的参数，此例中为"data"
+    # 对应本地部署，则为module.get_embedding(data=text)
+    data = {"data": text}
+    # 发送post请求，content-type类型应指定json方式，url中的ip地址需改为对应机器的ip
+    url = "http://127.0.0.1:8866/predict/chinese_bert_wwm_ext"
+    # 指定post请求的headers为application/json方式
+    headers = {"Content-Type": "application/json"}
-paddlehub >= 2.0.0
+    r = requests.post(url=url, headers=headers, data=json.dumps(data))
+    print(r.json())
+    ```
-## 更新历史
+## 五、更新历史
 * 1.0.0
@@ -163,3 +178,6 @@ paddlehub >= 2.0.0
 * 2.0.1
  增加文本匹配任务`text-matching`
+  ```shell
+  $ hub install chinese-bert-wwm-ext==2.0.1
+  ```
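
The `suffix` parameter documented in this README says that, when set to `True`, sequence-labeling tags end in `-B`, `-I`, `-E`, or `-S`. To illustrate what that label format implies for downstream code, here is a sketch that splits such a tag into its entity type and position marker. The helper `split_suffix_label` and the sample tags (`PER-B`, `LOC-S`, `O`) are assumptions made for illustration; the README only specifies the four suffixes.

```python
SUFFIXES = ("-B", "-I", "-E", "-S")


def split_suffix_label(label):
    """Split a suffix-style tag such as 'PER-B' into ('PER', 'B').

    Tags without one of the four documented suffixes are returned
    unchanged with None as the position marker.
    """
    for suffix in SUFFIXES:
        if label.endswith(suffix) and len(label) > len(suffix):
            return label[: -len(suffix)], suffix[1:]
    return label, None


parsed = [split_suffix_label(tag) for tag in ["PER-B", "PER-E", "LOC-S", "O"]]
print(parsed)  # [('PER', 'B'), ('PER', 'E'), ('LOC', 'S'), ('O', None)]
```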
--- a/modules/text/language_model/chinese_electra_base/README.md
+++ b/modules/text/language_model/chinese_electra_base/README.md
@@ -129,7 +129,7 @@ text = [["今天是个好日子"], ["天气预报说今天要下雨"]]
 # 对应本地部署，则为module.get_embedding(data=text)
 data = {"data": text}
 # 发送post请求，content-type类型应指定json方式，url中的ip地址需改为对应机器的ip
-url = "http://10.12.121.132:8866/predict/chinese-electra-base"
+url = "http://127.0.0.1:8866/predict/chinese-electra-base"
 # 指定post请求的headers为application/json方式
 headers = {"Content-Type": "application/json"}

--- a/modules/text/language_model/chinese_electra_small/README.md
+++ b/modules/text/language_model/chinese_electra_small/README.md
@@ -129,7 +129,7 @@ text = [["今天是个好日子"], ["天气预报说今天要下雨"]]
 # 对应本地部署，则为module.get_embedding(data=text)
 data = {"data": text}
 # 发送post请求，content-type类型应指定json方式，url中的ip地址需改为对应机器的ip
-url = "http://10.12.121.132:8866/predict/chinese-electra-small"
+url = "http://127.0.0.1:8866/predict/chinese-electra-small"
 # 指定post请求的headers为application/json方式
 headers = {"Content-Type": "application/json"}

--- a/modules/text/language_model/electra_base/README.md
+++ b/modules/text/language_model/electra_base/README.md
@@ -129,7 +129,7 @@ text = [["今天是个好日子"], ["天气预报说今天要下雨"]]
 # 对应本地部署，则为module.get_embedding(data=text)
 data = {"data": text}
 # 发送post请求，content-type类型应指定json方式，url中的ip地址需改为对应机器的ip
-url = "http://10.12.121.132:8866/predict/electra-base"
+url = "http://127.0.0.1:8866/predict/electra-base"
 # 指定post请求的headers为application/json方式
 headers = {"Content-Type": "application/json"}

--- a/modules/text/language_model/electra_large/README.md
+++ b/modules/text/language_model/electra_large/README.md
@@ -129,7 +129,7 @@ text = [["今天是个好日子"], ["天气预报说今天要下雨"]]
 # 对应本地部署，则为module.get_embedding(data=text)
 data = {"data": text}
 # 发送post请求，content-type类型应指定json方式，url中的ip地址需改为对应机器的ip
-url = "http://10.12.121.132:8866/predict/electra-large"
+url = "http://127.0.0.1:8866/predict/electra-large"
 # 指定post请求的headers为application/json方式
 headers = {"Content-Type": "application/json"}

--- a/modules/text/language_model/electra_small/README.md
+++ b/modules/text/language_model/electra_small/README.md
@@ -129,7 +129,7 @@ text = [["今天是个好日子"], ["天气预报说今天要下雨"]]
 # 对应本地部署，则为module.get_embedding(data=text)
 data = {"data": text}
 # 发送post请求，content-type类型应指定json方式，url中的ip地址需改为对应机器的ip
-url = "http://10.12.121.132:8866/predict/electra-small"
+url = "http://127.0.0.1:8866/predict/electra-small"
 # 指定post请求的headers为application/json方式
 headers = {"Content-Type": "application/json"}

--- a/modules/text/language_model/ernie/README.md
+++ b/modules/text/language_model/ernie/README.md
@@ -165,7 +165,7 @@ for idx, text in enumerate(data):
    # 对应本地部署，则为module.get_embedding(data=text)
    data = {"data": text}
    # 发送post请求，content-type类型应指定json方式，url中的ip地址需改为对应机器的ip
-    url = "http://10.12.121.132:8866/predict/ernie"
+    url = "http://127.0.0.1:8866/predict/ernie"
    # 指定post请求的headers为application/json方式
    headers = {"Content-Type": "application/json"}

--- a/modules/text/language_model/ernie_tiny/README.md
+++ b/modules/text/language_model/ernie_tiny/README.md
@@ -167,7 +167,7 @@ for idx, text in enumerate(data):
    # For local deployment, the equivalent call is module.get_embedding(data=text)
    data = {"data": text}
    # Send the POST request; the content type must be JSON, and the IP in the url must be changed to the serving machine's IP
-    url = "http://10.12.121.132:8866/predict/ernie_tiny"
+    url = "http://127.0.0.1:8866/predict/ernie_tiny"
    # Set the POST request headers to application/json
    headers = {"Content-Type": "application/json"}

--- a/modules/text/language_model/ernie_v2_eng_base/README.md
+++ b/modules/text/language_model/ernie_v2_eng_base/README.md
@@ -162,7 +162,7 @@ for idx, text in enumerate(data):
    # For local deployment, the equivalent call is module.get_embedding(data=text)
    data = {"data": text}
    # Send the POST request; the content type must be JSON, and the IP in the url must be changed to the serving machine's IP
-    url = "http://10.12.121.132:8866/predict/ernie_v2_eng_base"
+    url = "http://127.0.0.1:8866/predict/ernie_v2_eng_base"
    # Set the POST request headers to application/json
    headers = {"Content-Type": "application/json"}

--- a/modules/text/language_model/ernie_v2_eng_large/README.md
+++ b/modules/text/language_model/ernie_v2_eng_large/README.md
@@ -162,7 +162,7 @@ for idx, text in enumerate(data):
    # For local deployment, the equivalent call is module.get_embedding(data=text)
    data = {"data": text}
    # Send the POST request; the content type must be JSON, and the IP in the url must be changed to the serving machine's IP
-    url = "http://10.12.121.132:8866/predict/ernie_v2_eng_large"
+    url = "http://127.0.0.1:8866/predict/ernie_v2_eng_large"
    # Set the POST request headers to application/json
    headers = {"Content-Type": "application/json"}

--- a/modules/text/language_model/rbt3/README.md
+++ b/modules/text/language_model/rbt3/README.md
@@ -128,7 +128,7 @@ text = [["今天是个好日子"], ["天气预报说今天要下雨"]]
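 # For local deployment, the equivalent call is module.get_embedding(data=text)
 data = {"data": text}
 # Send the POST request; the content type must be JSON, and the IP in the url must be changed to the serving machine's IP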
-url = "http://10.12.121.132:8866/predict/rtb3"
+url = "http://127.0.0.1:8866/predict/rbt3"
 # Set the POST request headers to application/json
 headers = {"Content-Type": "application/json"}

--- a/modules/text/language_model/rbtl3/README.md
+++ b/modules/text/language_model/rbtl3/README.md
@@ -128,7 +128,7 @@ text = [["今天是个好日子"], ["天气预报说今天要下雨"]]
 # For local deployment, the equivalent call is module.get_embedding(data=text)
 data = {"data": text}
 # Send the POST request; the content type must be JSON, and the IP in the url must be changed to the serving machine's IP
-url = "http://10.12.121.132:8866/predict/rbtl3"
+url = "http://127.0.0.1:8866/predict/rbtl3"
 # Set the POST request headers to application/json
 headers = {"Content-Type": "application/json"}

--- a/modules/text/language_model/roberta-wwm-ext-large/README.md
+++ b/modules/text/language_model/roberta-wwm-ext-large/README.md
-```shell
+# roberta-wwm-ext-large
-$ hub install roberta-wwm-ext-large==2.0.2
+|Module Name|roberta-wwm-ext-large|
-```
+| :--- | :---: | 
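+|Category|Text - Semantic Model|
+|Network|roberta-wwm-ext-large|
+|Dataset|Baidu in-house dataset|
+|Fine-tuning supported|Yes|
+|Module Size|1.3GB|
+|Latest update date|2021-03-16|
+|Data indicators|-|
+## I. Basic Information
+- ### Module Introduction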
 <p align="center">
 <img src="https://bj.bcebos.com/paddlehub/paddlehub-img/bert_network.png"  hspace='10'/> <br />
 </p>
 For more details, see the [RoBERTa paper](https://arxiv.org/abs/1907.11692) and the [Chinese-BERT-wwm technical report](https://arxiv.org/abs/1906.08101).
-## API
+## II. Installation
-```python
-def __init__(
-    task=None,
-    load_checkpoint=None,
-    label_map=None,
-    num_classes=2,
-    suffix=False,
-    **kwargs,
-)
-```
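-Creates a Module object (dynamic-graph version).
-**Parameters**
-* `task`: task name; either `seq-cls` (text classification; the former `sequence_classification` will be deprecated in the future) or `token-cls` (sequence labeling).
-* `load_checkpoint`: path to the model parameter file saved by training with the PaddleHub Fine-tune API.
-* `label_map`: class mapping table used at prediction time.
-* `num_classes`: number of classes for classification tasks; may be omitted if `label_map` is given; defaults to 2.
-* `suffix`: label format for sequence labeling; if `True`, labels end with '-B', '-I', '-E' or '-S'; defaults to `False`.
-* `**kwargs`: additional user-specified keyword arguments.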
-```python
-def predict(
-    data,
-    max_seq_len=128,
-    batch_size=1,
-    use_gpu=False
-)
-```
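-**Parameters**
-* `data`: data to predict, in the format \[\[sample\_a\_text\_a, sample\_a\_text\_b\], \[sample\_b\_text\_a, sample\_b\_text\_b\],…,\], where each element is one sample and each sample may contain text\_a and text\_b. The number of texts per sample (1 or 2) must match what was used in training.
-* `max_seq_len`: maximum text length the model processes
-* `batch_size`: batch size used by the model
-* `use_gpu`: whether to use the GPU; defaults to False. GPU users are advised to enable use_gpu.
-**Returns**
-* `results`: list; the contents depend on the task type:
-  * Text classification: the predicted label for each sentence, in the format \[label\_1, label\_2, …,\]
-  * Sequence labeling: the predicted label for each token of each sentence, in the format \[\[token\_1, token\_2, …,\], \[token\_1, token\_2, …,\], …,\]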
-```python
+- ### 1. Environment Dependencies
-def get_embedding(
-    data,
-    use_gpu=False
-)
-```
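-Returns sentence-level and character-level features of the input text.
-**Parameters**
+  - paddlepaddle >= 2.0.0
-* `data`: list of input texts, in the format \[\[sample\_a\_text\_a, sample\_a\_text\_b\], \[sample\_b\_text\_a, sample\_b\_text\_b\],…,\], where each element is one sample and each sample may contain text\_a and text\_b.
+  - paddlehub >= 2.0.0    | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
-* `use_gpu`: whether to use the GPU; defaults to False. GPU users are advised to enable use_gpu.
-**Returns**
+- ### 2. Installation
-* `results`: list, in the format \[\[sample\_a\_pooled\_feature, sample\_a\_seq\_feature\], \[sample\_b\_pooled\_feature, sample\_b\_seq\_feature\],…,\], where each element is the feature output for the corresponding sample; each sample has a sentence-level feature pooled\_feature and a character-level feature seq\_feature.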
+  - ```shell
+    $ hub install roberta-wwm-ext-large
+    ```
+  - If you run into problems during installation, see: [Windows quick start](../../../../docs/docs_ch/get_start/windows_quickstart.md)
+ | [Linux quick start](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [macOS quick start](../../../../docs/docs_ch/get_start/mac_quickstart.md)
+## III. Module API Prediction
-**Code example**
+- ### 1. Prediction Code Example
 ```python
 import paddlehub as hub
@@ -96,59 +61,110 @@ for idx, text in enumerate(data):
 ```
 For details, see the PaddleHub demos:
- [Text classification](https://github.com/PaddlePaddle/PaddleHub/tree/release/v2.0.0-beta/demo/text_classification)
+- [Text classification](../../../../demo/text_classification)
- [Sequence labeling](https://github.com/PaddlePaddle/PaddleHub/tree/release/v2.0.0-beta/demo/sequence_labeling)
+- [Sequence labeling](../../../../demo/sequence_labeling)
-## Serving Deployment
+- ### 2. API
-PaddleHub Serving can deploy an online service for obtaining pretrained word embeddings.
+  - ```python
+    def __init__(
+        task=None,
+        load_checkpoint=None,
+        label_map=None,
+        num_classes=2,
+        suffix=False,
+        **kwargs,
+    )
+    ```
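-### Step 1: Start PaddleHub Serving
+    - Creates a Module object (dynamic-graph version)
-Run the start command:
+    - **Parameters**
-```shell
+      - `task`: task name; either `seq-cls` (text classification) or `token-cls` (sequence labeling).
-$ hub serving start -m roberta-wwm-ext-large
+      - `load_checkpoint`: path to the model parameter file saved by training with the PaddleHub Fine-tune API.
-```
+      - `label_map`: class mapping table used at prediction time.
+      - `num_classes`: number of classes for classification tasks; may be omitted if `label_map` is given; defaults to 2.
+      - `suffix`: label format for sequence labeling; if `True`, labels end with '-B', '-I', '-E' or '-S'; defaults to `False`.
+      - `**kwargs`: additional user-specified keyword arguments.
-This deploys a serving API for obtaining pretrained word embeddings; the default port is 8866.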
+  - ```python
+    def predict(
+        data,
+        max_seq_len=128,
+        batch_size=1,
+        use_gpu=False
+    )
+    ```
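-**NOTE:** To predict on a GPU, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it does not need to be set.
+    - **Parameters**
-### Step 2: Send a Prediction Request
+      - `data`: data to predict, in the format \[\[sample\_a\_text\_a, sample\_a\_text\_b\], \[sample\_b\_text\_a, sample\_b\_text\_b\],…,\], where each element is one sample and each sample may contain text\_a and text\_b. The number of texts per sample (1 or 2) must match what was used in training.
+      - `max_seq_len`: maximum text length the model processes
+      - `batch_size`: batch size used by the model
+      - `use_gpu`: whether to use the GPU; defaults to False. GPU users are advised to enable use_gpu.
-With the server configured, the few lines of code below send a prediction request and fetch the result.
+    - **Returns**
-```python
+      - `results`: list; the contents depend on the task type:
-import requests
+        - Text classification: the predicted label for each sentence, in the format \[label\_1, label\_2, …,\]
-import json
+        - Sequence labeling: the predicted label for each token of each sentence, in the format \[\[token\_1, token\_2, …,\], \[token\_1, token\_2, …,\], …,\]
-# Texts to embed, in the form [[text_1], [text_2], ...]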
+  - ```python
-text = [["今天是个好日子"], ["天气预报说今天要下雨"]]
+    def get_embedding(
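-# The key names the predict-method parameter that receives text; here it is "data"
+      data,
-# For local deployment, the equivalent call is module.get_embedding(data=text)
+      use_gpu=False
-data = {"data": text}
+    )
-# Send the POST request; the content type must be JSON, and the IP in the url must be changed to the serving machine's IP
+    ```
-url = "http://10.12.121.132:8866/predict/roberta-wwm-ext-large"
-# Set the POST request headers to application/json
+    - Returns sentence-level and character-level features of the input text
-headers = {"Content-Type": "application/json"}
+    - **Parameters**
-r = requests.post(url=url, headers=headers, data=json.dumps(data))
-print(r.json())
+      - `data`: list of input texts, in the format \[\[sample\_a\_text\_a, sample\_a\_text\_b\], \[sample\_b\_text\_a, sample\_b\_text\_b\],…,\], where each element is one sample and each sample may contain text\_a and text\_b.
-```
+      - `use_gpu`: whether to use the GPU; defaults to False. GPU users are advised to enable use_gpu.
+    - **Returns**
+      - `results`: list, in the format \[\[sample\_a\_pooled\_feature, sample\_a\_seq\_feature\], \[sample\_b\_pooled\_feature, sample\_b\_seq\_feature\],…,\], where each element is the feature output for the corresponding sample; each sample has a sentence-level feature pooled\_feature and a character-level feature seq\_feature.
+## IV. Serving Deployment
+- PaddleHub Serving can deploy an online service for obtaining pretrained word embeddings.
+- ### Step 1: Start PaddleHub Serving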
+  - ```shell
+    $ hub serving start -m roberta-wwm-ext-large
+    ```
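+  - This deploys a serving API for obtaining pretrained word embeddings; the default port is 8866.
-##   View Code
+  - **NOTE:** To predict on a GPU, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it does not need to be set.
-https://github.com/PaddlePaddle/models/tree/develop/PaddleNLP/paddlenlp/transformers/roberta
+- ### Step 2: Send a Prediction Request
+  - With the server configured, the few lines of code below send a prediction request and fetch the result.
-## Dependencies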
+  - ```python
+    import requests
+    import json
-paddlepaddle >= 2.0.0
+    # Texts to embed, in the form [[text_1], [text_2], ...]
+    text = [["今天是个好日子"], ["天气预报说今天要下雨"]]
+    # The key names the predict-method parameter that receives text; here it is "data"
+    # For local deployment, the equivalent call is module.get_embedding(data=text)
+    data = {"data": text}
+    # Send the POST request; the content type must be JSON, and the IP in the url must be changed to the serving machine's IP
+    url = "http://127.0.0.1:8866/predict/roberta-wwm-ext-large"
+    # Set the POST request headers to application/json
+    headers = {"Content-Type": "application/json"}
-paddlehub >= 2.0.0
+    r = requests.post(url=url, headers=headers, data=json.dumps(data))
+    print(r.json())
+    ```
-## Update History
+## V. Update History
 * 1.0.0
@@ -165,3 +181,6 @@ paddlehub >= 2.0.0
 * 2.0.2
  Added the text matching task `text-matching`
+  ```shell
+  $ hub install roberta-wwm-ext-large==2.0.2
+  ```
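The READMEs touched here all hard-code the serving address in the example URL, which is why this change has to edit the same line in every module. A minimal sketch of one way to keep the host, port and module name as parameters is shown below. The helper names (`build_predict_url`, `build_request`, `fetch_embeddings`) are hypothetical and not part of PaddleHub, and the sketch uses the standard-library `urllib.request` instead of the `requests` package used in the READMEs. It assumes a server was started with `hub serving start -m <module>` and listens on the default port 8866.

```python
import json
import urllib.request


def build_predict_url(module_name, host="127.0.0.1", port=8866):
    """Return the predict URL for a module served by PaddleHub Serving."""
    return f"http://{host}:{port}/predict/{module_name}"


def build_request(module_name, texts, host="127.0.0.1", port=8866):
    """Build (but do not send) the JSON POST request for get_embedding."""
    # The "data" key matches the keyword argument of
    # module.get_embedding(data=...), as the README comments describe.
    body = json.dumps({"data": texts}).encode("utf-8")
    return urllib.request.Request(
        url=build_predict_url(module_name, host, port),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_embeddings(module_name, texts, host="127.0.0.1", port=8866, timeout=30):
    """Send the request and decode the JSON reply; needs a running server."""
    request = build_request(module_name, texts, host, port)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running locally, `fetch_embeddings("roberta-wwm-ext-large", [["今天是个好日子"], ["天气预报说今天要下雨"]])` would send the same payload as the README example; pass `host=` to target another machine.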
--- a/modules/text/language_model/roberta-wwm-ext/README.md
+++ b/modules/text/language_model/roberta-wwm-ext/README.md
@@ -156,7 +156,7 @@ for idx, text in enumerate(data):
    # For local deployment, the equivalent call is module.get_embedding(data=text)
    data = {"data": text}
    # Send the POST request; the content type must be JSON, and the IP in the url must be changed to the serving machine's IP
-    url = "http://10.12.121.132:8866/predict/roberta-wwm-ext"
+    url = "http://127.0.0.1:8866/predict/roberta-wwm-ext"
    # Set the POST request headers to application/json
    headers = {"Content-Type": "application/json"}