Commit cbe2fa8a authored by AK391

Merge branch 'release/v2.2' of https://github.com/PaddlePaddle/PaddleHub into PaddlePaddle-release/v2.2
@@ -30,3 +30,9 @@
- --show-source
- --statistics
files: \.py$
- repo: https://github.com/asottile/reorder_python_imports
rev: v2.4.0
hooks:
- id: reorder-python-imports
exclude: (?=third_party).*(\.py)$
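For context, a sketch of what this hook does under its default settings: it splits multi-module imports onto separate lines and groups them stdlib first, then third-party, then local. The import names below are illustrative only.

```python
# Before reorder-python-imports:
import sys, os
import paddle
from paddlehub.utils.log import logger

# After: one import per line, grouped and sorted.
import os
import sys

import paddle
from paddlehub.utils.log import logger
```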
@@ -4,7 +4,7 @@ English | [简体中文](README_ch.md)
<img src="./docs/imgs/paddlehub_logo.jpg" align="middle"> <img src="./docs/imgs/paddlehub_logo.jpg" align="middle">
<p align="center"> <p align="center">
<div align="center"> <div align="center">
<h3> <a href=#QuickStart> QuickStart </a> | <a href="https://paddlehub.readthedocs.io/en/release-v2.1"> Tutorial </a> | <a href="https://www.paddlepaddle.org.cn/hublist"> Models List </a> | <a href="https://www.paddlepaddle.org.cn/hub"> Demos </a> </h3> <h3> <a href=#QuickStart> QuickStart </a> | <a href="https://paddlehub.readthedocs.io/en/release-v2.1"> Tutorial </a> | <a href="./modules"> Models List </a> | <a href="https://www.paddlepaddle.org.cn/hub"> Demos </a> </h3>
</div> </div>
------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------
@@ -29,7 +29,7 @@ English | [简体中文](README_ch.md)
## Introduction and Features
- **PaddleHub** aims to provide developers with rich, high-quality, and directly usable pre-trained models.
- **Abundant Pre-trained Models**: 360+ pre-trained models covering the 5 major categories of Image, Text, Audio, Video, and Industrial application. All of them are free for download and offline usage.
- **No Need for Deep Learning Background**: you can use AI models quickly and enjoy the dividends of the artificial intelligence era.
- **Quick Model Prediction**: model prediction can be realized with a few lines of script to quickly experience the model effect.
- **Model As Service**: deploy a deep learning model as an API service with a one-line command.
@@ -38,6 +38,7 @@ English | [简体中文](README_ch.md)
### Recent updates
- **2022.02.18:** Added a Hugging Face org and added Spaces and models to it: [PaddlePaddle Huggingface](https://huggingface.co/PaddlePaddle)
- **2021.12.22:** The v2.2.0 version is released. [1] More than 100 new models released, including dialog, speech, segmentation, OCR, text processing, GANs, and many other categories. The total number of pre-trained models reaches [**【360】**](https://www.paddlepaddle.org.cn/hublist). [2] Added an [indexed file](./modules/README.md) with useful information on the pre-trained models supported by PaddleHub. [3] Refactored the READMEs of the pre-trained models.
- **2021.05.12:** Added an open-domain dialogue system, i.e., [plato-mini](https://www.paddlepaddle.org.cn/hubdetail?name=plato-mini&en_category=TextGeneration), to make it easy to build a chatbot in WeChat with the help of wechaty. [See Demo](https://github.com/KPatr1ck/paddlehub-wechaty-demo)
- **2021.04.27:** The v2.1.0 version is released. [1] Added support for five new models, including two high-precision semantic segmentation models based on the VOC dataset and three voice classification models. [2] Enforced the transfer learning capabilities for image semantic segmentation, text semantic matching, and voice classification on related datasets. [3] Added export APIs for two model formats, i.e., ONNX and PaddleInference. [4] Added support for [BentoML](https://github.com/bentoml/BentoML/), a cloud-native framework for serving deployment. Users can easily serve pre-trained models from PaddleHub by following the [Tutorial notebooks](https://github.com/PaddlePaddle/PaddleHub/blob/release/v2.1/demo/serving/bentoml/cloud-native-model-serving-with-bentoml.ipynb). Also, see this announcement and the [Release note](https://github.com/bentoml/BentoML/releases/tag/v0.12.1) from BentoML. (Many thanks to @[parano](https://github.com/parano) @[cqvu](https://github.com/cqvu) @[deehrlic](https://github.com/deehrlic) for contributing this feature to PaddleHub.) [5] The total number of pre-trained models reaches **【300】**.
- **2021.02.18:** The v2.0.0 version is released, making model development and debugging easier, and the fine-tune task more flexible and easy to use. The transfer learning capabilities for visual tasks are fully upgraded, supporting various tasks such as image classification, image colorization, and style transfer; Transformer models such as BERT, ERNIE, and RoBERTa are upgraded to dynamic graphs, supporting Fine-Tune capabilities for text classification and sequence labeling; the Serving capability is optimized to support multi-card prediction and automatic load balancing, with greatly improved performance; the new automatic data augmentation capability Auto Augment can efficiently search for data augmentation strategy combinations suitable for a dataset. 61 new word vector models were added, including 51 Chinese models and 10 English models; 4 image segmentation models, 2 depth models, 7 image generation models, and 3 text generation models were added, bringing the total number of pre-trained models to **【274】**.
@@ -46,8 +47,8 @@ English | [简体中文](README_ch.md)
## Visualization Demo [[More]](./docs/docs_en/visualization.md) [[ModelList]](./modules)
### **[Computer Vision (212 models)](./modules#Image)**
<div align="center">
<img src="./docs/imgs/Readme_Related/Image_all.gif" width = "530" height = "400" />
</div>
@@ -55,7 +56,7 @@ English | [简体中文](README_ch.md)
- Many thanks to CopyRight@[PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR), [PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection), [PaddleGAN](https://github.com/PaddlePaddle/PaddleGAN), [AnimeGAN](https://github.com/TachibanaYoshino/AnimeGANv2), [openpose](https://github.com/CMU-Perceptual-Computing-Lab/openpose), [PaddleSeg](https://github.com/PaddlePaddle/PaddleSeg), [Zhengxia Zou](https://github.com/jiupinjia/SkyAR), and [PaddleClas](https://github.com/PaddlePaddle/PaddleClas) for the pre-trained models; you can try to train your own models with them.
### **[Natural Language Processing (130 models)](./modules#Text)**
<div align="center">
<img src="./docs/imgs/Readme_Related/Text_all.gif" width = "640" height = "240" />
</div>
@@ -64,9 +65,37 @@ English | [简体中文](README_ch.md)
### [Speech (15 models)](./modules#Audio)
- ASR speech recognition, with multiple algorithms available.
- The speech recognition results are as follows:
<div align="center">
<table>
<thead>
<tr>
<th width=250> Input Audio </th>
<th width=550> Recognition Result </th>
</tr>
</thead>
<tbody>
<tr>
<td align = "center">
<a href="https://paddlespeech.bj.bcebos.com/PaddleAudio/en.wav" rel="nofollow">
<img align="center" src="./docs/imgs/Readme_Related/audio_icon.png" width=250 ></a><br>
</td>
<td >I knocked at the door on the ancient side of the building.</td>
</tr>
<tr>
<td align = "center">
<a href="https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav" rel="nofollow">
<img align="center" src="./docs/imgs/Readme_Related/audio_icon.png" width=250></a><br>
</td>
<td>我认为跑步最重要的就是给我带来了身体健康。</td>
</tr>
</tbody>
</table>
</div>
- TTS speech synthesis, with multiple algorithms available.
- Input: `Life was like a box of chocolates, you never know what you're gonna get.`
- The synthesis results are as follows:
<div align="center">
@@ -97,7 +126,9 @@ English | [简体中文](README_ch.md)
</table>
</div>

- Many thanks to CopyRight@[PaddleSpeech](https://github.com/PaddlePaddle/PaddleSpeech) for the pre-trained models; you can try to train your own models with PaddleSpeech.
### [Video (8 models)](./modules#Video)
- Short-video classification trained on large-scale video datasets, supporting prediction of 3000+ tag types for short-form videos.
- Many thanks to CopyRight@[PaddleVideo](https://github.com/PaddlePaddle/PaddleVideo) for the pre-trained model; you can try to train your own models with PaddleVideo.
- `Example: Input a short video of swimming, and the algorithm outputs "swimming".`
......
@@ -4,7 +4,7 @@
<img src="./docs/imgs/paddlehub_logo.jpg" align="middle"> <img src="./docs/imgs/paddlehub_logo.jpg" align="middle">
<p align="center"> <p align="center">
<div align="center"> <div align="center">
<h3> <a href=#QuickStart> 快速开始 </a> | <a href="https://paddlehub.readthedocs.io/zh_CN/release-v2.1//"> 教程文档 </a> | <a href="https://www.paddlepaddle.org.cn/hublist"> 模型搜索 </a> | <a href="https://www.paddlepaddle.org.cn/hub"> 演示Demo </a> <h3> <a href=#QuickStart> 快速开始 </a> | <a href="https://paddlehub.readthedocs.io/zh_CN/release-v2.1//"> 教程文档 </a> | <a href="./modules/README_ch.md"> 模型库 </a> | <a href="https://www.paddlepaddle.org.cn/hub"> 演示Demo </a>
</h3> </h3>
</div> </div>
@@ -30,7 +30,7 @@
## Introduction and Features
- PaddleHub aims to provide developers with rich, high-quality, and directly usable pre-trained models.
- **[Abundant Model Types]**: **360+** pre-trained models covering the five major categories of CV, NLP, Audio, Video, and industrial applications, all open source for download and runnable offline.
- **[Very Low Barrier to Entry]**: no deep learning background, data, or training process required; AI models can be used quickly.
- **[One-Command Model Prediction]**: invoke a model through a one-line command or a minimal Python API to quickly experience the model effect.
- **[One-Command Model Serving]**: deploy a deep learning model as an API service with a single command.
@@ -38,6 +38,7 @@
- **[Cross-Platform Compatibility]**: runs on Linux, Windows, macOS, and other operating systems.
## Recent Updates
- **2021.12.22:** Released v2.2.0. [1] Added 100+ high-quality models covering dialogue, speech processing, semantic segmentation, OCR, text processing, image generation, and more, bringing the total number of pre-trained models to [**【360+】**](https://www.paddlepaddle.org.cn/hublist); [2] added a model [index](./modules/README_ch.md) with model name, network, dataset, usage scenarios, and other information to quickly locate the model a user needs; [3] improved the layout of model documentation to present more practical information such as datasets, metrics, and model size.
- **2021.05.12:** Added the lightweight Chinese dialogue model [plato-mini](https://www.paddlepaddle.org.cn/hubdetail?name=plato-mini&en_category=TextGeneration), which can be combined with wechaty to build a WeChat chatbot. [See demo](https://github.com/KPatr1ck/paddlehub-wechaty-demo)
- **2021.04.27:** Released v2.1.0. [1] Added two high-precision semantic segmentation models based on the VOC dataset and three voice classification models. [2] Added Fine-Tune capabilities and datasets for image semantic segmentation, text semantic matching, and voice classification. Improved deployment capabilities: [3] added export of ONNX and PaddleInference model formats; [4] added [BentoML](https://github.com/bentoml/BentoML) cloud-native serving deployment, supporting a unified multi-framework model management and deployment workflow, [detailed tutorial](https://github.com/PaddlePaddle/PaddleHub/blob/release/v2.1/demo/serving/bentoml/cloud-native-model-serving-with-bentoml.ipynb); see also the latest BentoML v0.12.1 [release note](https://github.com/bentoml/BentoML/releases/tag/v0.12.1). (Thanks to @[parano](https://github.com/parano) @[cqvu](https://github.com/cqvu) @[deehrlic](https://github.com/deehrlic) for the contribution and support.) [5] The total number of pre-trained models reaches [**【300】**](https://www.paddlepaddle.org.cn/hublist).
- **2021.02.18:** Released v2.0.0. [1] Model development and debugging are simpler, and the finetune API is more flexible and easy to use; transfer learning for visual tasks is fully upgraded, supporting [image classification](./demo/image_classification/README.md), [image colorization](./demo/colorization/README.md), [style transfer](./demo/style_transfer/README.md), and other tasks; Transformer models such as BERT, ERNIE, and RoBERTa are upgraded to dynamic graphs, supporting Fine-Tune for [text classification](./demo/text_classification/README.md) and [sequence labeling](./demo/sequence_labeling/README.md); [2] optimized Serving deployment, supporting multi-card prediction and automatic load balancing, with greatly improved performance; [3] added the automatic data augmentation capability [Auto Augment](./demo/autoaug/README.md), which efficiently searches for data augmentation strategy combinations suitable for a dataset; [4] added 61 [word embedding models](./modules/text/embedding), including 51 Chinese and 10 English models; added 4 [image segmentation](./modules/thirdparty/image/semantic_segmentation) models, 2 [depth estimation](./modules/thirdparty/image/depth_estimation) models, 7 [image generation](./modules/thirdparty/image/Image_gan/style_transfer) models, and 3 [text generation](./modules/thirdparty/text/text_generation) models; [5] the total number of pre-trained models reaches [**【274】**](https://www.paddlepaddle.org.cn/hublist).
@@ -47,9 +48,9 @@
## **Featured Model Demos [[More]](./docs/docs_ch/visualization.md) [[Models List]](./modules/README_ch.md)**
### **[Image (212 models)](./modules/README_ch.md#图像)**
- Including image classification, face detection, mask detection, vehicle detection, face/human/hand keypoint detection, portrait segmentation, text recognition in 80+ languages, image super-resolution/colorization/cartoonization, and more.
<div align="center">
<img src="./docs/imgs/Readme_Related/Image_all.gif" width = "530" height = "400" />
@@ -58,7 +59,7 @@
- Many thanks to CopyRight@[PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR), [PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection), [PaddleGAN](https://github.com/PaddlePaddle/PaddleGAN), [AnimeGAN](https://github.com/TachibanaYoshino/AnimeGANv2), [openpose](https://github.com/CMU-Perceptual-Computing-Lab/openpose), [PaddleSeg](https://github.com/PaddlePaddle/PaddleSeg), [Zhengxia Zou](https://github.com/jiupinjia/SkyAR), and [PaddleClas](https://github.com/PaddlePaddle/PaddleClas) for the pre-trained models; training capability is open, and you are welcome to try them.
### **[Text (130 models)](./modules/README_ch.md#文本)**
- Including Chinese word segmentation, POS tagging and named entity recognition, dependency parsing, AI writing of poems/couplets/love notes/acrostic poems, Chinese sentiment analysis of comments, Chinese pornographic-text moderation, and more.
<div align="center">
<img src="./docs/imgs/Readme_Related/Text_all.gif" width = "640" height = "240" />
@@ -67,9 +68,37 @@
- Many thanks to CopyRight@[ERNIE](https://github.com/PaddlePaddle/ERNIE), [LAC](https://github.com/baidu/LAC), and [DDParser](https://github.com/baidu/DDParser) for the pre-trained models; training capability is open, and you are welcome to try them.
### **[Speech (15 models)](./modules/README_ch.md#语音)**
- ASR speech recognition, with multiple algorithms available.
- The speech recognition results are as follows:
<div align="center">
<table>
<thead>
<tr>
<th width=250> Input Audio </th>
<th width=550> Recognition Result </th>
</tr>
</thead>
<tbody>
<tr>
<td align = "center">
<a href="https://paddlespeech.bj.bcebos.com/PaddleAudio/en.wav" rel="nofollow">
<img align="center" src="./docs/imgs/Readme_Related/audio_icon.png" width=250 ></a><br>
</td>
<td >I knocked at the door on the ancient side of the building.</td>
</tr>
<tr>
<td align = "center">
<a href="https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav" rel="nofollow">
<img align="center" src="./docs/imgs/Readme_Related/audio_icon.png" width=250></a><br>
</td>
<td>我认为跑步最重要的就是给我带来了身体健康。</td>
</tr>
</tbody>
</table>
</div>
- TTS speech synthesis, with multiple algorithms available.
- Input: `Life was like a box of chocolates, you never know what you're gonna get.`
- The synthesis results are as follows:
<div align="center">
@@ -100,7 +129,9 @@
</table>
</div>

- Many thanks to CopyRight@[PaddleSpeech](https://github.com/PaddlePaddle/PaddleSpeech) for the pre-trained models; training capability is open, and you are welcome to try them.
### **[Video (8 models)](./modules/README_ch.md#视频)**
- Including short-video classification supporting 3000+ tag types with TOP-K tag output, with multiple algorithms available.
- Many thanks to CopyRight@[PaddleVideo](https://github.com/PaddlePaddle/PaddleVideo) for the pre-trained models; training capability is open, and you are welcome to try them.
- `Example: Input a short video of swimming, and the algorithm outputs "swimming".`
......
@@ -8,6 +8,18 @@
$ hub run resnet50_vd_imagenet_ssld --input_path "/PATH/TO/IMAGE" --top_k 5
```
## Prediction via Script

```python
import paddle
import paddlehub as hub

if __name__ == '__main__':
    model = hub.Module(name='resnet50_vd_imagenet_ssld')
    result = model.predict(['/PATH/TO/IMAGE'])
```
## How to Start Fine-tuning

After installing PaddlePaddle and PaddleHub, run `python train.py` to start fine-tuning resnet50_vd_imagenet_ssld on datasets such as [Flowers](../../docs/reference/datasets.md#class-hubdatasetsflowers).
......
@@ -91,10 +91,12 @@ train_dataset = hub.datasets.MSRA_NER(
    tokenizer=model.get_tokenizer(), max_seq_len=128, mode='train')
dev_dataset = hub.datasets.MSRA_NER(
    tokenizer=model.get_tokenizer(), max_seq_len=128, mode='dev')
test_dataset = hub.datasets.MSRA_NER(
tokenizer=model.get_tokenizer(), max_seq_len=128, mode='test')
```
* `tokenizer`: the tokenizer required by this module; it tokenizes the input text and converts it into the model input format the module expects.
* `mode`: the dataset split to use; options are `train`, `test`, and `dev`, defaulting to `train`.
* `max_seq_len`: the maximum sequence length used by the ERNIE/BERT model; lower this parameter appropriately if you run out of GPU memory.

The pre-trained ERNIE model processes Chinese data at the character level, and the tokenizer converts raw input text into the input form the model accepts. The pre-trained models in PaddleHub 2.0 come with built-in tokenizers, available via the `model.get_tokenizer` method.
@@ -106,7 +108,7 @@ dev_dataset = hub.datasets.MSRA_NER(
```python
optimizer = paddle.optimizer.AdamW(learning_rate=5e-5, parameters=model.parameters())
trainer = hub.Trainer(model, optimizer, checkpoint_dir='test_ernie_token_cls', use_gpu=True)
trainer.train(train_dataset, epochs=3, batch_size=32, eval_dataset=dev_dataset)
```
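With the `test_dataset` now constructed above, the fine-tuned model can also be scored on the test split; a minimal sketch, assuming the `Trainer.evaluate` API of PaddleHub 2.x:

```python
# Sketch (assuming hub.Trainer.evaluate from PaddleHub 2.x):
# score the fine-tuned token classifier on the held-out test split.
trainer.evaluate(test_dataset, batch_size=32)
```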
......
@@ -8,6 +8,17 @@
$ hub run msgnet --input_path "/PATH/TO/ORIGIN/IMAGE" --style_path "/PATH/TO/STYLE/IMAGE"
```
## Prediction via Script

```python
import paddle
import paddlehub as hub

if __name__ == '__main__':
    model = hub.Module(name='msgnet')
    result = model.predict(origin=["venice-boat.jpg"], style="candy.jpg", visualization=True, save_path='style_transfer')
```
## How to Start Fine-tuning

After installing PaddlePaddle and PaddleHub, run `python train.py` to start fine-tuning the msgnet model on datasets such as [MiniCOCO](../../docs/reference/datasets.md#class-hubdatasetsMiniCOCO).
......
@@ -80,10 +80,12 @@ train_dataset = hub.datasets.ChnSentiCorp(
    tokenizer=model.get_tokenizer(), max_seq_len=128, mode='train')
dev_dataset = hub.datasets.ChnSentiCorp(
    tokenizer=model.get_tokenizer(), max_seq_len=128, mode='dev')
test_dataset = hub.datasets.ChnSentiCorp(
tokenizer=model.get_tokenizer(), max_seq_len=128, mode='test')
```
* `tokenizer`: the tokenizer required by this module; it tokenizes the input text and converts it into the model input format the module expects.
* `mode`: the dataset split to use; options are `train`, `test`, and `dev`, defaulting to `train`.
* `max_seq_len`: the maximum sequence length used by the ERNIE/BERT model; lower this parameter appropriately if you run out of GPU memory.

The pre-trained ERNIE model processes Chinese data at the character level, and the tokenizer converts raw input text into the input form the model accepts. The pre-trained models in PaddleHub 2.0 come with built-in tokenizers, available via the `model.get_tokenizer` method.
@@ -95,7 +97,7 @@ dev_dataset = hub.datasets.ChnSentiCorp(
```python
optimizer = paddle.optimizer.Adam(learning_rate=5e-5, parameters=model.parameters())
trainer = hub.Trainer(model, optimizer, checkpoint_dir='test_ernie_text_cls', use_gpu=True)
trainer.train(train_dataset, epochs=3, batch_size=32, eval_dataset=dev_dataset)
```
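Beyond training, the fine-tuned classifier can be called directly on raw text; a sketch assuming the `predict` method of PaddleHub 2.x Transformer modules, with the sample sentence as a hypothetical input:

```python
# Sketch (assuming model.predict from PaddleHub 2.x Transformer modules);
# the input sentence is just a hypothetical example.
data = [['这个宾馆比较陈旧了,特价的房间也很一般。总体来说一般']]
results = model.predict(data, max_seq_len=128, batch_size=1, use_gpu=True)
print(results)
```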
......
This diff has been collapsed.
This diff has been collapsed.
# deepspeech2_aishell

|Model Name|deepspeech2_aishell|
| :--- | :---: |
|Category|Speech - Speech Recognition|
|Network|DeepSpeech2|
|Dataset|AISHELL-1|
|Fine-tuning Supported|No|
|Model Size|306MB|
|Latest Update|2021-10-20|
|Metric|Chinese CER 0.065|

## 1. Basic Model Information

### Model Introduction

DeepSpeech2 is an end-to-end speech recognition model for English and Mandarin proposed by Baidu in 2015. deepspeech2_aishell uses the offline DeepSpeech2 model structure, consisting mainly of 2 convolutional layers and 3 GRU layers. It is pre-trained on the open-source Mandarin speech dataset [AISHELL-1](http://www.aishelltech.com/kysjcp), on whose test set it reaches a CER of 0.065.
<p align="center">
<img src="https://raw.githubusercontent.com/PaddlePaddle/DeepSpeech/Hub/docs/images/ds2offlineModel.png" hspace='10'/> <br />
</p>
For more details, see [Deep Speech 2: End-to-End Speech Recognition in English and Mandarin](https://arxiv.org/abs/1512.02595).
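For reference, the reported CER is the character error rate, i.e. the character-level edit distance between hypothesis and reference normalized by reference length:

$$\mathrm{CER} = \frac{S + D + I}{N}$$

where $S$, $D$, and $I$ count substituted, deleted, and inserted characters, and $N$ is the number of characters in the reference transcript.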
## 2. Installation

- ### 1. System Dependencies

  - libsndfile, swig >= 3.0
    - Linux
      ```shell
      $ sudo apt-get install libsndfile swig
      # or
      $ sudo yum install libsndfile swig
      ```
    - macOS
      ```
      $ brew install libsndfile swig
      ```

- ### 2. Environment Dependencies

  - swig_decoder:
    ```
    git clone https://github.com/PaddlePaddle/DeepSpeech.git && cd DeepSpeech && git reset --hard b53171694e7b87abe7ea96870b2f4d8e0e2b1485 && cd deepspeech/decoders/ctcdecoder/swig && sh setup.sh
    ```
  - paddlepaddle >= 2.1.0
  - paddlehub >= 2.1.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

- ### 3. Installation

  - ```shell
    $ hub install deepspeech2_aishell
    ```
  - If you run into problems during installation, see: [Windows quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
    | [Linux quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [macOS quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## 3. Model API Prediction

- ### 1. Prediction Code Example

```python
import paddlehub as hub

# A Chinese speech audio file in wav format, sampled at 16 kHz
wav_file = '/PATH/TO/AUDIO'

model = hub.Module(
    name='deepspeech2_aishell',
    version='1.0.0')
text = model.speech_recognize(wav_file)

print(text)
```
- ### 2. API

- ```python
  def check_audio(audio_file)
  ```
  - Checks whether the format and sample rate of the input audio meet the required 16000 Hz.
  - **Parameters**
    - `audio_file`: path of a local audio file (*.wav), e.g. `/path/to/input.wav`

- ```python
  def speech_recognize(
      audio_file,
      device='cpu',
  )
  ```
  - Recognizes the input audio as text.
  - **Parameters**
    - `audio_file`: path of a local audio file (*.wav), e.g. `/path/to/input.wav`
    - `device`: device used for prediction, defaulting to `cpu`; set it to `gpu` to predict on GPU.
  - **Returns**
    - `text`: str, the recognized text of the input audio.
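For instance, GPU inference only changes the `device` argument; a minimal sketch, assuming a CUDA-enabled PaddlePaddle build and the `model` instance created in the example above:

```python
# Sketch: GPU inference; assumes a CUDA-enabled PaddlePaddle build.
text = model.speech_recognize('/PATH/TO/AUDIO', device='gpu')
print(text)
```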
## 4. Serving Deployment

- PaddleHub Serving can deploy an online speech recognition service.

- ### Step 1: Start PaddleHub Serving

- ```shell
  $ hub serving start -m deepspeech2_aishell
  ```
- This deploys a speech recognition API service, listening on port 8866 by default.
- **NOTE:** To predict on GPU, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it does not need to be set.
- ### Step 2: Send a Prediction Request

- With the server configured, the few lines of code below send a prediction request and fetch the result

- ```python
  import requests
  import json

  # Path of the audio to recognize; make sure the machine serving the model can access it
  file = '/path/to/input.wav'

  # Pass arguments to the prediction method by key; here the key is "audio_file"
  data = {"audio_file": file}
  # Send a POST request; the content type should be JSON, and the IP in the url should be that of the serving machine
  url = "http://127.0.0.1:8866/predict/deepspeech2_aishell"
  # Set the POST request headers to application/json
  headers = {"Content-Type": "application/json"}

  r = requests.post(url=url, headers=headers, data=json.dumps(data))
  print(r.json())
  ```
## 5. Release Notes

* 1.0.0

  Initial release

  ```shell
  $ hub install deepspeech2_aishell
  ```
# https://yaml.org/type/float.html
data:
train_manifest: data/manifest.train
dev_manifest: data/manifest.dev
test_manifest: data/manifest.test
min_input_len: 0.0
max_input_len: 27.0 # second
min_output_len: 0.0
max_output_len: .inf
min_output_input_ratio: 0.00
max_output_input_ratio: .inf
collator:
batch_size: 64 # one gpu
mean_std_filepath: data/mean_std.json
unit_type: char
vocab_filepath: data/vocab.txt
augmentation_config: conf/augmentation.json
random_seed: 0
spm_model_prefix:
spectrum_type: linear
feat_dim:
delta_delta: False
stride_ms: 10.0
window_ms: 20.0
n_fft: None
max_freq: None
target_sample_rate: 16000
use_dB_normalization: True
target_dB: -20
dither: 1.0
keep_transcription_text: False
sortagrad: True
shuffle_method: batch_shuffle
num_workers: 2
model:
num_conv_layers: 2
num_rnn_layers: 3
rnn_layer_size: 1024
use_gru: True
share_rnn_weights: False
blank_id: 0
ctc_grad_norm_type: instance
training:
n_epoch: 80
accum_grad: 1
lr: 2e-3
lr_decay: 0.83
weight_decay: 1e-06
global_grad_clip: 3.0
log_interval: 100
checkpoint:
kbest_n: 50
latest_n: 5
decoding:
batch_size: 128
error_rate_type: cer
decoding_method: ctc_beam_search
lang_model_path: data/lm/zh_giga.no_cna_cmn.prune01244.klm
alpha: 1.9
beta: 5.0
beam_size: 300
cutoff_prob: 0.99
cutoff_top_n: 40
num_proc_bsearch: 10
{"mean_stat": [-13505966.65209869, -12778154.889588555, -13487728.30750011, -12897344.94123812, -12472281.490772562, -12631566.475106332, -13391790.349327326, -14045382.570026815, -14159320.465516506, -14273422.438486755, -14639805.161347123, -15145380.07768254, -15612893.133258691, -15938542.05012206, -16115293.502621327, -16188225.698757892, -16317206.280373082, -16500598.476283036, -16671564.297937019, -16804599.860397574, -16916423.142814968, -17011785.59439087, -17075067.62262626, -17154580.16740178, -17257812.961825978, -17355683.228599995, -17441455.258318607, -17473199.925130684, -17488835.5763828, -17491232.15414511, -17485000.29006962, -17499471.646940477, -17551398.97122984, -17641732.10682403, -17757209.077974595, -17843801.500521667, -17935647.58641936, -18020362.347413756, -18117633.806080323, -18232427.58935143, -18316024.35215119, -18378789.145393644, -18421147.25807373, -18445805.18294822, -18460946.27810118, -18467914.04034822, -18469404.319909714, -18469606.974339806, -18470754.294192698, -18458320.91921723, -18441354.111811973, -18428332.216321833, -18422281.413955193, -18433421.585668042, -18460521.025954794, -18494800.856363494, -18539532.288011573, -18583823.79899225, -18614474.56256926, -18646872.180154275, -18661137.85367877, -18673590.719379324, -18702967.62040798, -18736434.748098046, -18777912.13098326, -18794675.486509323, -18837225.856196072, -18874872.796128694, -18927340.44407057, -18994929.076545004, -19060701.164406348, -19118006.18996682, -19175792.05766062, -19230755.996405277, -19270174.594219487, -19334788.35904946, -19401456.988906194, -19484580.095938426, -19582040.4715673, -19696598.86662636, -19810401.513227757, -19931755.37941177, -20021867.47620737, -20082298.984455004, -20114708.336475413, -20143802.72793865, -20146821.988139726, -20165613.317683898, -20189938.602584295, -20220059.08673595, -20242848.528134122, -20250859.979931064, -20267382.93048284, -20267964.544716164, -20261372.89563879, -20252878.74023849, -20247550.771284755, -20231778.31093504, -20231376.103159923, -20236926.52293088, -20248068.41488535, -20255076.901920393, -20262924.167151034, -20263926.583205637, -20263790.273742784, -20268560.080967404, -20268997.150654405, -20269810.816284582, -20267771.864327505, -20256472.703380838, -20241790.559690386, -20241865.794732895, -20244924.716114976, -20249736.631184842, -20257257.816903576, -20268027.212145977, -20277399.95533857, -20281840.8112546, -20270512.52002465, -20255938.63066214, -20242421.685443826, -20241986.654626504, -20237836.034444932, -20231458.31132546, -20218092.819713395, -20204994.19634715, -20198880.142133974, -20197376.49014031, -20198117.60450857, -20197443.473929476, -20191142.03632657, -20174428.452719454, -20159204.32090646, -20137981.294740904, -20124944.79897834, -20112774.604521394, -20109389.248600915, -20115248.61302806, -20117743.853294585, -20123076.93515528, -20132224.95454374, -20147099.26793121, -20169581.367630124, -20190957.518733896, -20215197.057997894, -20242033.589256056, -20282032.217160087, -20316778.653784916, -20360354.215504933, -20425089.908502825, -20534553.0465662, -20737928.349233944, -21091705.14104186, -21646013.197923105, -22403182.076235127, -23313516.63322832, -24244679.879594248, -25027534.00417361, -25502455.708560493, -25665136.744125813, -26602318.88405537], "var_stat": [209924783.1093623, 185218712.4577822, 209991180.89829063, 196198511.40798286, 186098265.7827955, 191905798.58923203, 214281935.29191792, 235042114.51049897, 240179456.24597096, 244657890.3963041, 
256099586.32657292, 271849135.9872555, 287174069.13527167, 298171137.28863454, 304112589.91933817, 306553976.2206335, 310813670.30674237, 316958840.3099824, 322651440.3639528, 327213725.196089, 331252123.26114285, 334856188.3081607, 337217897.6545214, 340385427.82557064, 344400488.5633641, 348086880.08086526, 351349070.53148264, 352648076.18415344, 353409462.33704513, 353598061.4967693, 353405322.74993587, 353917215.6834277, 355784796.898883, 359222461.3224974, 363671441.7428676, 366908651.69908494, 370304677.0615045, 373477194.79721, 377174088.9808273, 381531608.6574547, 384703574.426059, 387104126.9474883, 388723211.11308575, 389687817.27351815, 390351031.4418706, 390659006.3690262, 390704649.89417714, 390702370.1919126, 390731862.59274197, 390216004.4126628, 389516083.054853, 389017745.636457, 388788872.1127645, 389269311.2239042, 390401819.5968815, 391842612.97859454, 393708801.05223197, 395569598.4694, 396868892.67152405, 398210915.02133286, 398743299.4753882, 399330344.88417244, 400565940.1325846, 401901693.4656316, 403513855.43933284, 404103248.96526104, 405986814.274556, 407507145.4104169, 409598353.6517908, 412453848.0248063, 415138273.0558441, 417479272.96907294, 419785633.3276395, 422003065.1681787, 423610264.8868346, 426260552.96545905, 428973536.3620236, 432368654.40899384, 436359561.5468266, 441119512.777527, 445884989.25794005, 451037422.65838546, 454872292.24179226, 457497136.8780015, 458904066.0675219, 460155836.4432799, 460272943.80738074, 461087498.6828549, 462144907.7850926, 463483598.81228757, 464530694.44478536, 464971538.85301507, 465771535.6019992, 465936698.93801653, 465741012.7287712, 465448625.0011534, 465296363.8603534, 464718299.2207512, 464720391.25778216, 465016640.5248736, 465564374.0248998, 465982788.8695927, 466425068.01245564, 466595649.90489674, 466707658.8296169, 467015570.78026086, 467099213.08769494, 467201640.15951264, 467163862.3709329, 466727597.56313753, 466174871.71213347, 466255498.45248336, 466439062.65458614, 466693130.99620277, 467068587.1422199, 467536070.1402474, 467955819.1549621, 468187227.1069643, 467742976.2778335, 467159585.250493, 466592359.52916145, 466583195.8099961, 466424348.9572719, 466155323.6074322, 465569620.1801811, 465021642.5158305, 464757658.6383867, 464713882.60103834, 464724239.2941314, 464679163.728191, 464407007.8705965, 463660736.0136739, 463001339.2385198, 462077058.47595775, 461505071.67199403, 460946277.95973784, 460816158.9197017, 461123589.268546, 461232998.1572812, 461445601.0442877, 461803238.28569543, 462436966.22005004, 463391404.7434971, 464299608.85523456, 465319405.3931429, 466432961.70208246, 468168080.3331244, 469640808.6809098, 471501539.22440934, 474301795.1694898, 479155711.93441755, 488314271.10405815, 504537056.23994666, 530509400.5201074, 566892036.4437443, 611792826.0442055, 658913502.9004005, 699716882.9169292, 725237302.8248898, 734259159.9571886, 789267050.8287783], "frame_num": 899422}
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Evaluation for DeepSpeech2 model."""
import os
import sys
from pathlib import Path
import paddle
from deepspeech.frontend.featurizer.text_featurizer import TextFeaturizer
from deepspeech.io.collator import SpeechCollator
from deepspeech.models.ds2 import DeepSpeech2Model
from deepspeech.utils import mp_tools
from deepspeech.utils.utility import UpdateConfig
class DeepSpeech2Tester:
def __init__(self, config):
self.config = config
self.collate_fn_test = SpeechCollator.from_config(config)
self._text_featurizer = TextFeaturizer(unit_type=config.collator.unit_type, vocab_filepath=None)
def compute_result_transcripts(self, audio, audio_len, vocab_list, cfg):
result_transcripts = self.model.decode(
audio,
audio_len,
vocab_list,
decoding_method=cfg.decoding_method,
lang_model_path=cfg.lang_model_path,
beam_alpha=cfg.alpha,
beam_beta=cfg.beta,
beam_size=cfg.beam_size,
cutoff_prob=cfg.cutoff_prob,
cutoff_top_n=cfg.cutoff_top_n,
num_processes=cfg.num_proc_bsearch)
# Replace the '<space>' token with ' '
result_transcripts = [self._text_featurizer.detokenize(sentence) for sentence in result_transcripts]
return result_transcripts
@mp_tools.rank_zero_only
@paddle.no_grad()
def test(self, audio_file):
self.model.eval()
cfg = self.config
collate_fn_test = self.collate_fn_test
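# Featurize the utterance with a dummy transcript, then add a batch dimension for decoding.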
audio, _ = collate_fn_test.process_utterance(audio_file=audio_file, transcript=" ")
audio_len = audio.shape[0]
audio = paddle.to_tensor(audio, dtype='float32')
audio_len = paddle.to_tensor(audio_len)
audio = paddle.unsqueeze(audio, axis=0)
vocab_list = collate_fn_test.vocab_list
result_transcripts = self.compute_result_transcripts(audio, audio_len, vocab_list, cfg.decoding)
return result_transcripts
def setup_model(self):
config = self.config.clone()
with UpdateConfig(config):
config.model.feat_size = self.collate_fn_test.feature_size
config.model.dict_size = self.collate_fn_test.vocab_size
model = DeepSpeech2Model.from_config(config.model)
self.model = model
def resume(self, checkpoint):
"""Resume from the checkpoint at checkpoints in the output
directory or load a specified checkpoint.
"""
model_dict = paddle.load(checkpoint)
self.model.set_state_dict(model_dict)
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
from pathlib import Path
import sys
import numpy as np
from paddlehub.env import MODULE_HOME
from paddlehub.module.module import moduleinfo, serving
from paddlehub.utils.log import logger
from paddle.utils.download import get_path_from_url
try:
import swig_decoders
except ModuleNotFoundError as e:
logger.error(e)
logger.info('The module requires additional dependencies: swig_decoders. '
'please install via:\n\'git clone https://github.com/PaddlePaddle/DeepSpeech.git '
'&& cd DeepSpeech && git reset --hard b53171694e7b87abe7ea96870b2f4d8e0e2b1485 '
'&& cd deepspeech/decoders/ctcdecoder/swig && sh setup.sh\'')
sys.exit(1)
import paddle
import soundfile as sf
# TODO: Remove system path when deepspeech can be installed via pip.
sys.path.append(os.path.join(MODULE_HOME, 'deepspeech2_aishell'))
from deepspeech.exps.deepspeech2.config import get_cfg_defaults
from deepspeech.utils.utility import UpdateConfig
from .deepspeech_tester import DeepSpeech2Tester
LM_URL = 'https://deepspeech.bj.bcebos.com/zh_lm/zh_giga.no_cna_cmn.prune01244.klm'
LM_MD5 = '29e02312deb2e59b3c8686c7966d4fe3'
@moduleinfo(name="deepspeech2_aishell", version="1.0.0", summary="", author="Baidu", author_email="", type="audio/asr")
class DeepSpeech2(paddle.nn.Layer):
def __init__(self):
super(DeepSpeech2, self).__init__()
# resource
res_dir = os.path.join(MODULE_HOME, 'deepspeech2_aishell', 'assets')
conf_file = os.path.join(res_dir, 'conf/deepspeech2.yaml')
checkpoint = os.path.join(res_dir, 'checkpoints/avg_1.pdparams')
# Download the LM manually because of its large size.
lm_path = os.path.join(res_dir, 'data', 'lm')
lm_file = os.path.join(lm_path, LM_URL.split('/')[-1])
if not os.path.isfile(lm_file):
logger.info(f'Downloading lm from {LM_URL}.')
get_path_from_url(url=LM_URL, root_dir=lm_path, md5sum=LM_MD5)
# config
self.model_type = 'offline'
self.config = get_cfg_defaults(self.model_type)
self.config.merge_from_file(conf_file)
# TODO: Remove path updating snippet.
with UpdateConfig(self.config):
self.config.collator.mean_std_filepath = os.path.join(res_dir, self.config.collator.mean_std_filepath)
self.config.collator.vocab_filepath = os.path.join(res_dir, self.config.collator.vocab_filepath)
self.config.collator.augmentation_config = os.path.join(res_dir, self.config.collator.augmentation_config)
self.config.decoding.lang_model_path = os.path.join(res_dir, self.config.decoding.lang_model_path)
# model
self.tester = DeepSpeech2Tester(self.config)
self.tester.setup_model()
self.tester.resume(checkpoint)
@staticmethod
def check_audio(audio_file):
sig, sample_rate = sf.read(audio_file)
assert sample_rate == 16000, 'Expected sample rate of input audio to be 16000, but got {}'.format(sample_rate)
@serving
def speech_recognize(self, audio_file, device='cpu'):
assert os.path.isfile(audio_file), 'File not exists: {}'.format(audio_file)
self.check_audio(audio_file)
paddle.set_device(device)
return self.tester.test(audio_file)[0]
# system-level dependencies: libsndfile, swig
loguru
yacs
jsonlines
scipy==1.2.1
sentencepiece
resampy==0.2.2
SoundFile==0.9.0.post1
soxbindings
kaldiio
typeguard
editdistance
# deepspeech2_librispeech

|Model Name|deepspeech2_librispeech|
| :--- | :---: |
|Category|Speech - Speech Recognition|
|Network|DeepSpeech2|
|Dataset|LibriSpeech|
|Fine-tuning Supported|No|
|Model Size|518MB|
|Latest Update|2021-10-20|
|Metric|English WER 0.072|

## 1. Basic Model Information

### Model Introduction

DeepSpeech2 is an end-to-end speech recognition model for English and Mandarin proposed by Baidu in 2015. deepspeech2_librispeech uses the offline DeepSpeech2 model structure, consisting mainly of 2 convolutional layers and 3 GRU layers. It is pre-trained on the open-source English speech dataset [LibriSpeech ASR corpus](http://www.openslr.org/12/), on whose test set it reaches a WER of 0.072.
<p align="center">
<img src="https://raw.githubusercontent.com/PaddlePaddle/DeepSpeech/Hub/docs/images/ds2offlineModel.png" hspace='10'/> <br />
</p>
For more details, see [Deep Speech 2: End-to-End Speech Recognition in English and Mandarin](https://arxiv.org/abs/1512.02595).
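For reference, the reported WER is the word error rate, the word-level analogue of CER:

$$\mathrm{WER} = \frac{S + D + I}{N}$$

where $S$, $D$, and $I$ count substituted, deleted, and inserted words, and $N$ is the number of words in the reference transcript.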
## 2. Installation

- ### 1. System Dependencies

  - libsndfile, swig >= 3.0
    - Linux
      ```shell
      $ sudo apt-get install libsndfile swig
      # or
      $ sudo yum install libsndfile swig
      ```
    - macOS
      ```
      $ brew install libsndfile swig
      ```

- ### 2. Environment Dependencies

  - swig_decoder:
    ```
    git clone https://github.com/PaddlePaddle/DeepSpeech.git && cd DeepSpeech && git reset --hard b53171694e7b87abe7ea96870b2f4d8e0e2b1485 && cd deepspeech/decoders/ctcdecoder/swig && sh setup.sh
    ```
  - paddlepaddle >= 2.1.0
  - paddlehub >= 2.1.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

- ### 3. Installation

  - ```shell
    $ hub install deepspeech2_librispeech
    ```
  - If you run into problems during installation, see: [Windows quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
    | [Linux quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [macOS quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## 3. Model API Prediction

- ### 1. Prediction Code Example

```python
import paddlehub as hub

# An English speech audio file in wav format, sampled at 16 kHz
wav_file = '/PATH/TO/AUDIO'

model = hub.Module(
    name='deepspeech2_librispeech',
    version='1.0.0')
text = model.speech_recognize(wav_file)

print(text)
```
- ### 2. API

- ```python
  def check_audio(audio_file)
  ```
  - Checks whether the format and sample rate of the input audio meet the required 16000 Hz.
  - **Parameters**
    - `audio_file`: path of a local audio file (*.wav), e.g. `/path/to/input.wav`

- ```python
  def speech_recognize(
      audio_file,
      device='cpu',
  )
  ```
  - Recognizes the input audio as text.
  - **Parameters**
    - `audio_file`: path of a local audio file (*.wav), e.g. `/path/to/input.wav`
    - `device`: device used for prediction, defaulting to `cpu`; set it to `gpu` to predict on GPU.
  - **Returns**
    - `text`: str, the recognized text of the input audio.
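Since the module expects 16 kHz wav input, an upstream resampling step is sometimes needed; a sketch using soundfile and resampy (both appear in this module's requirements), with the file names as hypothetical examples:

```python
# Sketch: convert an arbitrary wav to the 16 kHz mono input the module expects.
import resampy
import soundfile as sf

sig, sr = sf.read('/PATH/TO/AUDIO')     # hypothetical input path
if sig.ndim > 1:
    sig = sig.mean(axis=1)              # downmix to mono
if sr != 16000:
    sig = resampy.resample(sig, sr, 16000)
sf.write('input_16k.wav', sig, 16000)   # ready for speech_recognize()
```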
## 4. Serving Deployment

- PaddleHub Serving can deploy an online speech recognition service.

- ### Step 1: Start PaddleHub Serving

- ```shell
  $ hub serving start -m deepspeech2_librispeech
  ```
- This deploys a speech recognition API service, listening on port 8866 by default.
- **NOTE:** To predict on GPU, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it does not need to be set.
- ### Step 2: Send a Prediction Request

- With the server configured, the few lines of code below send a prediction request and fetch the result

- ```python
  import requests
  import json

  # Path of the audio to recognize; make sure the machine serving the model can access it
  file = '/path/to/input.wav'

  # Pass arguments to the prediction method by key; here the key is "audio_file"
  data = {"audio_file": file}
  # Send a POST request; the content type should be JSON, and the IP in the url should be that of the serving machine
  url = "http://127.0.0.1:8866/predict/deepspeech2_librispeech"
  # Set the POST request headers to application/json
  headers = {"Content-Type": "application/json"}

  r = requests.post(url=url, headers=headers, data=json.dumps(data))
  print(r.json())
  ```
## 5. Release Notes

* 1.0.0

  Initial release

  ```shell
  $ hub install deepspeech2_librispeech
  ```
# https://yaml.org/type/float.html
data:
train_manifest: data/manifest.train
dev_manifest: data/manifest.dev-clean
test_manifest: data/manifest.test-clean
min_input_len: 0.0
max_input_len: 30.0 # second
min_output_len: 0.0
max_output_len: .inf
min_output_input_ratio: 0.00
max_output_input_ratio: .inf
collator:
batch_size: 20
mean_std_filepath: data/mean_std.json
unit_type: char
vocab_filepath: data/vocab.txt
augmentation_config: conf/augmentation.json
random_seed: 0
spm_model_prefix:
spectrum_type: linear
target_sample_rate: 16000
max_freq: None
n_fft: None
stride_ms: 10.0
window_ms: 20.0
delta_delta: False
dither: 1.0
use_dB_normalization: True
target_dB: -20
keep_transcription_text: False
sortagrad: True
shuffle_method: batch_shuffle
num_workers: 2
model:
num_conv_layers: 2
num_rnn_layers: 3
rnn_layer_size: 2048
use_gru: False
share_rnn_weights: True
blank_id: 0
ctc_grad_norm_type: instance
training:
n_epoch: 50
accum_grad: 1
lr: 1e-3
lr_decay: 0.83
weight_decay: 1e-06
global_grad_clip: 5.0
log_interval: 100
checkpoint:
kbest_n: 50
latest_n: 5
decoding:
batch_size: 128
error_rate_type: wer
decoding_method: ctc_beam_search
lang_model_path: data/lm/common_crawl_00.prune01111.trie.klm
alpha: 1.9
beta: 0.3
beam_size: 500
cutoff_prob: 1.0
cutoff_top_n: 40
num_proc_bsearch: 8
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Evaluation for DeepSpeech2 model."""
import os
import sys
from pathlib import Path
import paddle
from deepspeech.frontend.featurizer.text_featurizer import TextFeaturizer
from deepspeech.io.collator import SpeechCollator
from deepspeech.models.ds2 import DeepSpeech2Model
from deepspeech.utils import mp_tools
from deepspeech.utils.utility import UpdateConfig
class DeepSpeech2Tester:
def __init__(self, config):
self.config = config
self.collate_fn_test = SpeechCollator.from_config(config)
self._text_featurizer = TextFeaturizer(unit_type=config.collator.unit_type, vocab_filepath=None)
def compute_result_transcripts(self, audio, audio_len, vocab_list, cfg):
result_transcripts = self.model.decode(
audio,
audio_len,
vocab_list,
decoding_method=cfg.decoding_method,
lang_model_path=cfg.lang_model_path,
beam_alpha=cfg.alpha,
beam_beta=cfg.beta,
beam_size=cfg.beam_size,
cutoff_prob=cfg.cutoff_prob,
cutoff_top_n=cfg.cutoff_top_n,
num_processes=cfg.num_proc_bsearch)
# Replace the '<space>' token with ' '
result_transcripts = [self._text_featurizer.detokenize(sentence) for sentence in result_transcripts]
return result_transcripts
@mp_tools.rank_zero_only
@paddle.no_grad()
def test(self, audio_file):
self.model.eval()
cfg = self.config
collate_fn_test = self.collate_fn_test
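# Featurize the utterance with a dummy transcript, then add a batch dimension for decoding.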
audio, _ = collate_fn_test.process_utterance(audio_file=audio_file, transcript=" ")
audio_len = audio.shape[0]
audio = paddle.to_tensor(audio, dtype='float32')
audio_len = paddle.to_tensor(audio_len)
audio = paddle.unsqueeze(audio, axis=0)
vocab_list = collate_fn_test.vocab_list
result_transcripts = self.compute_result_transcripts(audio, audio_len, vocab_list, cfg.decoding)
return result_transcripts
def setup_model(self):
config = self.config.clone()
with UpdateConfig(config):
config.model.feat_size = self.collate_fn_test.feature_size
config.model.dict_size = self.collate_fn_test.vocab_size
model = DeepSpeech2Model.from_config(config.model)
self.model = model
def resume(self, checkpoint):
"""Resume from the checkpoint at checkpoints in the output
directory or load a specified checkpoint.
"""
model_dict = paddle.load(checkpoint)
self.model.set_state_dict(model_dict)
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
from pathlib import Path
import sys
import numpy as np
from paddlehub.env import MODULE_HOME
from paddlehub.module.module import moduleinfo, serving
from paddlehub.utils.log import logger
from paddle.utils.download import get_path_from_url
try:
import swig_decoders
except ModuleNotFoundError as e:
logger.error(e)
logger.info('The module requires additional dependencies: swig_decoders. '
'please install via:\n\'git clone https://github.com/PaddlePaddle/DeepSpeech.git '
'&& cd DeepSpeech && git reset --hard b53171694e7b87abe7ea96870b2f4d8e0e2b1485 '
'&& cd deepspeech/decoders/ctcdecoder/swig && sh setup.sh\'')
sys.exit(1)
import paddle
import soundfile as sf
# TODO: Remove system path when deepspeech can be installed via pip.
sys.path.append(os.path.join(MODULE_HOME, 'deepspeech2_librispeech'))
from deepspeech.exps.deepspeech2.config import get_cfg_defaults
from deepspeech.utils.utility import UpdateConfig
from .deepspeech_tester import DeepSpeech2Tester
LM_URL = 'https://deepspeech.bj.bcebos.com/en_lm/common_crawl_00.prune01111.trie.klm'
LM_MD5 = '099a601759d467cd0a8523ff939819c5'
@moduleinfo(
name="deepspeech2_librispeech", version="1.0.0", summary="", author="Baidu", author_email="", type="audio/asr")
class DeepSpeech2(paddle.nn.Layer):
def __init__(self):
super(DeepSpeech2, self).__init__()
# resource
res_dir = os.path.join(MODULE_HOME, 'deepspeech2_librispeech', 'assets')
conf_file = os.path.join(res_dir, 'conf/deepspeech2.yaml')
checkpoint = os.path.join(res_dir, 'checkpoints/avg_1.pdparams')
# Download the LM manually because of its large size.
lm_path = os.path.join(res_dir, 'data', 'lm')
lm_file = os.path.join(lm_path, LM_URL.split('/')[-1])
if not os.path.isfile(lm_file):
logger.info(f'Downloading lm from {LM_URL}.')
get_path_from_url(url=LM_URL, root_dir=lm_path, md5sum=LM_MD5)
# config
self.model_type = 'offline'
self.config = get_cfg_defaults(self.model_type)
self.config.merge_from_file(conf_file)
# TODO: Remove path updating snippet.
with UpdateConfig(self.config):
self.config.collator.mean_std_filepath = os.path.join(res_dir, self.config.collator.mean_std_filepath)
self.config.collator.vocab_filepath = os.path.join(res_dir, self.config.collator.vocab_filepath)
self.config.collator.augmentation_config = os.path.join(res_dir, self.config.collator.augmentation_config)
self.config.decoding.lang_model_path = os.path.join(res_dir, self.config.decoding.lang_model_path)
# model
self.tester = DeepSpeech2Tester(self.config)
self.tester.setup_model()
self.tester.resume(checkpoint)
@staticmethod
def check_audio(audio_file):
sig, sample_rate = sf.read(audio_file)
assert sample_rate == 16000, 'Expected sample rate of input audio to be 16000, but got {}'.format(sample_rate)
@serving
def speech_recognize(self, audio_file, device='cpu'):
assert os.path.isfile(audio_file), 'File not exists: {}'.format(audio_file)
self.check_audio(audio_file)
paddle.set_device(device)
return self.tester.test(audio_file)[0]
loguru
yacs
jsonlines
scipy==1.2.1
sentencepiece
resampy==0.2.2
SoundFile==0.9.0.post1
soxbindings
kaldiio
typeguard
editdistance
# u2_conformer_aishell

|Model Name|u2_conformer_aishell|
| :--- | :---: |
|Category|Speech - Speech Recognition|
|Network|Conformer|
|Dataset|AISHELL-1|
|Fine-tuning Supported|No|
|Model Size|284MB|
|Latest Update|2021-11-01|
|Metric|Chinese CER 0.055|

## 1. Basic Model Information

### Model Introduction

U2 Conformer is an end-to-end speech recognition model for English and Mandarin. u2_conformer_aishell combines a Conformer encoder with a Transformer decoder: a first decoding pass scores candidates with CTC prefix beam search, and a second pass rescores them with the attention decoder to produce the final result (see the sketch after the references below).

u2_conformer_aishell is pre-trained on the open-source Mandarin speech dataset [AISHELL-1](http://www.aishelltech.com/kysjcp), on whose test set it reaches a CER of 0.055257.
<p align="center">
<img src="https://paddlehub.bj.bcebos.com/paddlehub-img/conformer.png" hspace='10'/> <br />
</p>
<p align="center">
<img src="https://paddlehub.bj.bcebos.com/paddlehub-img/u2_conformer.png" hspace='10'/> <br />
</p>
For more details, see:
- [Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition](https://arxiv.org/abs/2012.05481)
- [Conformer: Convolution-augmented Transformer for Speech Recognition](https://arxiv.org/abs/2005.08100)
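To make the two-pass decoding concrete, here is a minimal, self-contained sketch of the rescoring step; the candidates and scores are hypothetical stand-ins for what CTC prefix beam search and the attention decoder actually produce (cf. `ctc_weight` in the decoding config below), not the module's real API.

```python
# Hypothetical sketch of U2-style second-pass rescoring.
from dataclasses import dataclass

@dataclass
class Candidate:
    text: str
    ctc_score: float   # first-pass score from CTC prefix beam search
    att_score: float   # second-pass score from the attention decoder

def rescore(candidates, ctc_weight=0.0):
    # Interpolate both scores; ctc_weight=0.0 ranks by the attention score alone.
    return max(candidates, key=lambda c: c.att_score + ctc_weight * c.ctc_score)

best = rescore([Candidate('今天天气不错', -1.2, -0.4),
                Candidate('今天天汽不错', -1.0, -2.1)])
print(best.text)  # '今天天气不错'
```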
## II. Installation
- ### 1. System Dependencies
- libsndfile
- Linux
```shell
$ sudo apt-get install libsndfile
or
$ sudo yum install libsndfile
```
- MacOS
```
$ brew install libsndfile
```
- ### 2. Environment Dependencies
- paddlepaddle >= 2.1.0
- paddlehub >= 2.1.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 3. Installation
- ```shell
$ hub install u2_conformer_aishell
```
- If you encounter problems during installation, please refer to: [Windows quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [Linux quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [MacOS quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Module API and Prediction
- ### 1. Prediction Code Example
- ```python
import paddlehub as hub
# Chinese speech audio in wav format, sampled at 16 kHz
wav_file = '/PATH/TO/AUDIO'
model = hub.Module(
name='u2_conformer_aishell',
version='1.0.0')
text = model.speech_recognize(wav_file)
print(text)
```
- ### 2. API
- ```python
def check_audio(audio_file)
```
- Checks that the input audio is in wav format with a sample rate of 16000.
- **Parameters**
- `audio_file`: path to a local audio file (*.wav), e.g. `/path/to/input.wav`.
- ```python
def speech_recognize(
audio_file,
device='cpu',
)
```
- Transcribes the input audio into text.
- **Parameters**
- `audio_file`: path to a local audio file (*.wav), e.g. `/path/to/input.wav`.
- `device`: device used for prediction; defaults to `cpu`. Set it to `gpu` to predict on GPU.
- **Returns**
- `text`: str, the transcription of the input audio.
## IV. Service Deployment
- PaddleHub Serving can deploy an online speech recognition service.
- ### Step 1: Start the PaddleHub Serving
- ```shell
$ hub serving start -m u2_conformer_aishell
```
- This deploys the speech recognition API service; the default port is 8866.
- **NOTE:** To predict on GPU, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it does not need to be set.
- ### Step 2: Send a prediction request
- With the server configured, the few lines of code below send a prediction request and fetch the result.
- ```python
import requests
import json
# Path of the audio to recognize; make sure the serving machine can access it
file = '/path/to/input.wav'
# Arguments are passed to the prediction method by key; here the key is "audio_file"
data = {"audio_file": file}
# Send a POST request; the content type should be JSON, and the IP in the URL should be replaced with that of the serving machine
url = "http://127.0.0.1:8866/predict/u2_conformer_aishell"
# Set the POST request headers to application/json
headers = {"Content-Type": "application/json"}
r = requests.post(url=url, headers=headers, data=json.dumps(data))
print(r.json())
```
## V. Release Note
* 1.0.0
First release
```shell
$ hub install u2_conformer_aishell
```
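For reference, the CER metric quoted above is the character error rate: the edit distance between hypothesis and reference characters divided by the reference length. A minimal illustrative sketch, using the `editdistance` package from this module's requirements:
```python
import editdistance  # listed in the module requirements

def cer(reference: str, hypothesis: str) -> float:
    # Character error rate over the reference length, ignoring spaces.
    ref = list(reference.replace(' ', ''))
    hyp = list(hypothesis.replace(' ', ''))
    return editdistance.eval(ref, hyp) / len(ref)

print(cer('今天天气不错', '今天天汽不错'))  # 1 substitution / 6 chars ≈ 0.167
```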
data:
train_manifest: data/manifest.train
dev_manifest: data/manifest.dev
test_manifest: data/manifest.test
min_input_len: 0.5
max_input_len: 20.0 # seconds
min_output_len: 0.0
max_output_len: 400.0
min_output_input_ratio: 0.05
max_output_input_ratio: 10.0
collator:
vocab_filepath: data/vocab.txt
unit_type: 'char'
spm_model_prefix: ''
augmentation_config: conf/augmentation.json
batch_size: 64
raw_wav: True # use raw_wav or kaldi feature
spectrum_type: fbank #linear, mfcc, fbank
feat_dim: 80
delta_delta: False
dither: 1.0
target_sample_rate: 16000
max_freq: None
n_fft: None
stride_ms: 10.0
window_ms: 25.0
use_dB_normalization: False
target_dB: -20
random_seed: 0
keep_transcription_text: False
sortagrad: True
shuffle_method: batch_shuffle
num_workers: 2
decoding:
alpha: 2.5
batch_size: 128
beam_size: 10
beta: 0.3
ctc_weight: 0.0
cutoff_prob: 1.0
cutoff_top_n: 0
decoding_chunk_size: -1
decoding_method: attention
error_rate_type: cer
lang_model_path: data/lm/common_crawl_00.prune01111.trie.klm
num_decoding_left_chunks: -1
num_proc_bsearch: 8
simulate_streaming: False
model:
cmvn_file: data/mean_std.json
cmvn_file_type: json
decoder: transformer
decoder_conf:
attention_heads: 4
dropout_rate: 0.1
linear_units: 2048
num_blocks: 6
positional_dropout_rate: 0.1
self_attention_dropout_rate: 0.0
src_attention_dropout_rate: 0.0
encoder: conformer
encoder_conf:
activation_type: swish
attention_dropout_rate: 0.0
attention_heads: 4
cnn_module_kernel: 15
dropout_rate: 0.1
input_layer: conv2d
linear_units: 2048
normalize_before: True
num_blocks: 12
output_size: 256
pos_enc_layer_type: rel_pos
positional_dropout_rate: 0.1
selfattention_layer_type: rel_selfattn
use_cnn_module: True
input_dim: 0
model_conf:
ctc_weight: 0.3
ctc_dropoutrate: 0.0
ctc_grad_norm_type: instance
length_normalized_loss: False
lsm_weight: 0.1
output_dim: 0
training:
accum_grad: 2
global_grad_clip: 5.0
log_interval: 100
n_epoch: 300
optim: adam
optim_conf:
lr: 0.002
weight_decay: 1e-06
scheduler: warmuplr
scheduler_conf:
lr_decay: 1.0
warmup_steps: 25000
checkpoint:
kbest_n: 50
latest_n: 5
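This configuration is consumed through yacs in the module code shown elsewhere in this dump (`get_cfg_defaults()` followed by `merge_from_file`). A minimal sketch, assuming the DeepSpeech repository is importable and the paths are placeholders:
```python
from deepspeech.exps.u2.config import get_cfg_defaults
from deepspeech.utils.utility import UpdateConfig

config = get_cfg_defaults()
config.merge_from_file('conf/conformer.yaml')  # the YAML shown above
with UpdateConfig(config):  # yacs configs are immutable by default; unlock to edit
    config.decoding.decoding_method = 'attention_rescoring'
    config.decoding.batch_size = 1
print(config.model.encoder)  # conformer
```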
{"mean_stat": [533749178.75492024, 537379151.9412827, 553560684.251823, 587164297.7995199, 631868827.5506272, 662598279.7375823, 684377628.7270963, 695391900.076011, 692470493.5234187, 679434068.1698124, 666124153.9164762, 656323498.7897255, 665750586.0282139, 678693518.7836165, 681921713.5434498, 679622373.0941861, 669891550.4909347, 656595089.7941492, 653838531.0994304, 637678601.7858486, 628412248.7348012, 644835299.462052, 638840698.1892803, 646181879.4332589, 639724189.2981818, 642757470.3933163, 637471382.8647255, 642368839.4687729, 643414999.4559816, 647384269.1630985, 649348352.9727564, 649293860.0141628, 650234047.7200857, 654485430.6703687, 660474314.9996675, 667417041.2224753, 673157601.3226709, 675674470.304284, 675124085.6890339, 668017589.4583111, 670061307.6169846, 662625614.6886193, 663144526.4351237, 662504003.7634674, 666413530.1149732, 672263295.5639057, 678483738.2530766, 685387098.3034457, 692570857.529439, 699066050.4399202, 700784878.5879861, 701201520.50868, 702666292.305144, 705443439.2278953, 706070270.9023902, 705988909.8337733, 702843339.0362502, 699318566.4701376, 696089900.3030818, 687559674.541517, 675279201.9502573, 663676352.2301354, 662963751.7464145, 664300133.8414352, 666095384.4212626, 671682092.7777623, 676652386.6696675, 680097668.2490273, 683810023.0071762, 688701544.3655603, 692082724.9923568, 695788849.6782106, 701085780.0070009, 706389529.7959046, 711492753.1344281, 717637923.73355, 719691678.2081754, 715810733.4964175, 696362890.4862831, 604649423.9932467], "var_stat": [5413314850.92017, 5559847287.933615, 6150990253.613769, 6921242242.585692, 7999776708.347419, 8789877370.390867, 9405801233.462742, 9768050110.323652, 9759783206.942099, 9430647265.679018, 9090547056.72849, 8873147345.425886, 9155912918.518642, 9542539953.84679, 9653547618.806402, 9593434792.936714, 9316633026.420147, 8959273999.588833, 8863548125.445953, 8450615911.730164, 8211598033.615433, 8587083872.162145, 8432613574.987708, 8583943640.722399, 8401731458.393406, 8439359231.367369, 8293779802.711447, 8401506934.147289, 8427506949.839874, 8525176341.071184, 8577080109.482346, 8575106681.347283, 8594987363.896849, 8701703698.13697, 8854967559.695303, 9029484499.828356, 9168774993.437275, 9221457044.693224, 9194525496.858181, 8997085233.031223, 9024585998.805922, 8819398159.92156, 8807895653.788486, 8777245867.886335, 8869681168.825321, 9017397167.041729, 9173402827.38027, 9345595113.30765, 9530638054.282673, 9701241750.610865, 9749002220.142677, 9762753891.356327, 9802020174.527405, 9874432300.977995, 9883303068.689241, 9873499335.610315, 9780680890.924107, 9672603363.913414, 9569436761.47915, 9321842521.985804, 8968140697.297707, 8646348638.918655, 8616965457.523136, 8648620220.395298, 8702086138.675117, 8859213220.99842, 8999405313.087536, 9105949447.399998, 9220413227.016796, 9358601578.269663, 9451405873.00428, 9552727080.824707, 9695443509.54488, 9836687193.669691, 9970962418.410656, 10135881535.317768, 10189390919.400673, 10070483257.345238, 9532953296.22076, 7261219636.045063], "frame_num": 54068199}
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
from pathlib import Path
import sys
import numpy as np
from paddlehub.env import MODULE_HOME
from paddlehub.module.module import moduleinfo, serving
from paddlehub.utils.log import logger
import paddle
import soundfile as sf
# TODO: Remove system path when deepspeech can be installed via pip.
sys.path.append(os.path.join(MODULE_HOME, 'u2_conformer_aishell'))
from deepspeech.exps.u2.config import get_cfg_defaults
from deepspeech.utils.utility import UpdateConfig
from .u2_conformer_tester import U2ConformerTester
@moduleinfo(name="u2_conformer_aishell", version="1.0.0", summary="", author="Baidu", author_email="", type="audio/asr")
class U2Conformer(paddle.nn.Layer):
def __init__(self):
super(U2Conformer, self).__init__()
# resource
res_dir = os.path.join(MODULE_HOME, 'u2_conformer_aishell', 'assets')
conf_file = os.path.join(res_dir, 'conf/conformer.yaml')
checkpoint = os.path.join(res_dir, 'checkpoints/avg_20.pdparams')
# config
self.config = get_cfg_defaults()
self.config.merge_from_file(conf_file)
# TODO: Remove path updating snippet.
with UpdateConfig(self.config):
self.config.collator.vocab_filepath = os.path.join(res_dir, self.config.collator.vocab_filepath)
# self.config.collator.spm_model_prefix = os.path.join(res_dir, self.config.collator.spm_model_prefix)
self.config.collator.augmentation_config = os.path.join(res_dir, self.config.collator.augmentation_config)
self.config.model.cmvn_file = os.path.join(res_dir, self.config.model.cmvn_file)
self.config.decoding.decoding_method = 'attention_rescoring'
self.config.decoding.batch_size = 1
# model
self.tester = U2ConformerTester(self.config)
self.tester.setup_model()
self.tester.resume(checkpoint)
@staticmethod
def check_audio(audio_file):
sig, sample_rate = sf.read(audio_file)
assert sample_rate == 16000, 'Expected sample rate of input audio to be 16000, but got {}'.format(sample_rate)
@serving
def speech_recognize(self, audio_file, device='cpu'):
assert os.path.isfile(audio_file), 'File does not exist: {}'.format(audio_file)
self.check_audio(audio_file)
paddle.set_device(device)
return self.tester.test(audio_file)[0][0]
loguru
yacs
jsonlines
scipy==1.2.1
sentencepiece
resampy==0.2.2
SoundFile==0.9.0.post1
soxbindings
kaldiio
typeguard
editdistance
textgrid
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Evaluation for U2 model."""
import os
import sys
import paddle
from deepspeech.frontend.featurizer.text_featurizer import TextFeaturizer
from deepspeech.io.collator import SpeechCollator
from deepspeech.models.u2 import U2Model
from deepspeech.utils import mp_tools
from deepspeech.utils.utility import UpdateConfig
class U2ConformerTester:
def __init__(self, config):
self.config = config
self.collate_fn_test = SpeechCollator.from_config(config)
self._text_featurizer = TextFeaturizer(
unit_type=config.collator.unit_type, vocab_filepath=None, spm_model_prefix=config.collator.spm_model_prefix)
@mp_tools.rank_zero_only
@paddle.no_grad()
def test(self, audio_file):
self.model.eval()
cfg = self.config.decoding
collate_fn_test = self.collate_fn_test
audio, _ = collate_fn_test.process_utterance(audio_file=audio_file, transcript="Hello")
audio_len = audio.shape[0]
audio = paddle.to_tensor(audio, dtype='float32')
audio_len = paddle.to_tensor(audio_len)
audio = paddle.unsqueeze(audio, axis=0)
vocab_list = collate_fn_test.vocab_list
text_feature = self.collate_fn_test.text_feature
result_transcripts = self.model.decode(
audio,
audio_len,
text_feature=text_feature,
decoding_method=cfg.decoding_method,
lang_model_path=cfg.lang_model_path,
beam_alpha=cfg.alpha,
beam_beta=cfg.beta,
beam_size=cfg.beam_size,
cutoff_prob=cfg.cutoff_prob,
cutoff_top_n=cfg.cutoff_top_n,
num_processes=cfg.num_proc_bsearch,
ctc_weight=cfg.ctc_weight,
decoding_chunk_size=cfg.decoding_chunk_size,
num_decoding_left_chunks=cfg.num_decoding_left_chunks,
simulate_streaming=cfg.simulate_streaming)
return result_transcripts
def setup_model(self):
config = self.config.clone()
with UpdateConfig(config):
config.model.input_dim = self.collate_fn_test.feature_size
config.model.output_dim = self.collate_fn_test.vocab_size
self.model = U2Model.from_config(config.model)
def resume(self, checkpoint):
"""Resume from the checkpoint at checkpoints in the output
directory or load a specified checkpoint.
"""
model_dict = paddle.load(checkpoint)
self.model.set_state_dict(model_dict)
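For reference, the module class above drives this tester as follows; this is a condensed restatement of the `U2Conformer.__init__` and `speech_recognize` flow, with placeholder paths:
```python
from deepspeech.exps.u2.config import get_cfg_defaults

config = get_cfg_defaults()
config.merge_from_file('assets/conf/conformer.yaml')   # placeholder path

tester = U2ConformerTester(config)
tester.setup_model()                                   # builds U2Model from the merged config
tester.resume('assets/checkpoints/avg_20.pdparams')    # placeholder checkpoint
print(tester.test('/PATH/TO/AUDIO.wav')[0][0])         # best transcript
```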
# u2_conformer_librispeech
|Model Name|u2_conformer_librispeech|
| :--- | :---: |
|Category|Speech - Speech Recognition|
|Network|Conformer|
|Dataset|LibriSpeech|
|Fine-tuning supported|No|
|Module Size|191MB|
|Latest update date|2021-11-01|
|Data metric|English WER 0.034|
## I. Basic Information
### Model Introduction
The U2 Conformer model is an end-to-end speech recognition model for both English and Chinese. u2_conformer_librispeech combines a Conformer encoder with a Transformer decoder: decoding first scores candidates with CTC prefix beam search, then rescores them with the attention decoder to obtain the final result.
u2_conformer_librispeech was pre-trained on the open-source English speech dataset [LibriSpeech ASR corpus](http://www.openslr.org/12/) and reaches a WER of 0.034655 on its test set.
<p align="center">
<img src="https://paddlehub.bj.bcebos.com/paddlehub-img/conformer.png" hspace='10'/> <br />
</p>
<p align="center">
<img src="https://paddlehub.bj.bcebos.com/paddlehub-img/u2_conformer.png" hspace='10'/> <br />
</p>
For more details, please refer to:
- [Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition](https://arxiv.org/abs/2012.05481)
- [Conformer: Convolution-augmented Transformer for Speech Recognition](https://arxiv.org/abs/2005.08100)
## II. Installation
- ### 1. System Dependencies
- libsndfile
- Linux
```shell
$ sudo apt-get install libsndfile
or
$ sudo yum install libsndfile
```
- MacOS
```
$ brew install libsndfile
```
- ### 2. Environment Dependencies
- paddlepaddle >= 2.1.0
- paddlehub >= 2.1.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 3. Installation
- ```shell
$ hub install u2_conformer_librispeech
```
- If you encounter problems during installation, please refer to: [Windows quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [Linux quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [MacOS quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Module API and Prediction
- ### 1. Prediction Code Example
- ```python
import paddlehub as hub
# English speech audio in wav format, sampled at 16 kHz
wav_file = '/PATH/TO/AUDIO'
model = hub.Module(
name='u2_conformer_librispeech',
version='1.0.0')
text = model.speech_recognize(wav_file)
print(text)
```
- ### 2. API
- ```python
def check_audio(audio_file)
```
- Checks that the input audio is in wav format with a sample rate of 16000.
- **Parameters**
- `audio_file`: path to a local audio file (*.wav), e.g. `/path/to/input.wav`.
- ```python
def speech_recognize(
audio_file,
device='cpu',
)
```
- Transcribes the input audio into text.
- **Parameters**
- `audio_file`: path to a local audio file (*.wav), e.g. `/path/to/input.wav`.
- `device`: device used for prediction; defaults to `cpu`. Set it to `gpu` to predict on GPU.
- **Returns**
- `text`: str, the transcription of the input audio.
## IV. Service Deployment
- PaddleHub Serving can deploy an online speech recognition service.
- ### Step 1: Start the PaddleHub Serving
- ```shell
$ hub serving start -m u2_conformer_librispeech
```
- This deploys the speech recognition API service; the default port is 8866.
- **NOTE:** To predict on GPU, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it does not need to be set.
- ### Step 2: Send a prediction request
- With the server configured, the few lines of code below send a prediction request and fetch the result.
- ```python
import requests
import json
# Path of the audio to recognize; make sure the serving machine can access it
file = '/path/to/input.wav'
# Arguments are passed to the prediction method by key; here the key is "audio_file"
data = {"audio_file": file}
# Send a POST request; the content type should be JSON, and the IP in the URL should be replaced with that of the serving machine
url = "http://127.0.0.1:8866/predict/u2_conformer_librispeech"
# Set the POST request headers to application/json
headers = {"Content-Type": "application/json"}
r = requests.post(url=url, headers=headers, data=json.dumps(data))
print(r.json())
```
## V. Release Note
* 1.0.0
First release
```shell
$ hub install u2_conformer_librispeech
```
# https://yaml.org/type/float.html
data:
train_manifest: data/manifest.test-clean
dev_manifest: data/manifest.test-clean
test_manifest: data/manifest.test-clean
min_input_len: 0.5 # seconds
max_input_len: 30.0 # seconds
min_output_len: 0.0 # tokens
max_output_len: 400.0 # tokens
min_output_input_ratio: 0.05
max_output_input_ratio: 100.0
collator:
vocab_filepath: data/vocab.txt
unit_type: 'spm'
spm_model_prefix: 'data/bpe_unigram_5000'
mean_std_filepath: ""
augmentation_config: conf/augmentation.json
batch_size: 16
raw_wav: True # use raw_wav or kaldi feature
spectrum_type: fbank #linear, mfcc, fbank
feat_dim: 80
delta_delta: False
dither: 1.0
target_sample_rate: 16000
max_freq: None
n_fft: None
stride_ms: 10.0
window_ms: 25.0
use_dB_normalization: True
target_dB: -20
random_seed: 0
keep_transcription_text: False
sortagrad: True
shuffle_method: batch_shuffle
num_workers: 2
# network architecture
model:
cmvn_file: "data/mean_std.json"
cmvn_file_type: "json"
# encoder related
encoder: conformer
encoder_conf:
output_size: 256 # dimension of attention
attention_heads: 4
linear_units: 2048 # the number of units of position-wise feed forward
num_blocks: 12 # the number of encoder blocks
dropout_rate: 0.1
positional_dropout_rate: 0.1
attention_dropout_rate: 0.0
input_layer: conv2d # encoder input type, you can chose conv2d, conv2d6 and conv2d8
normalize_before: True
use_cnn_module: True
cnn_module_kernel: 15
activation_type: 'swish'
pos_enc_layer_type: 'rel_pos'
selfattention_layer_type: 'rel_selfattn'
# decoder related
decoder: transformer
decoder_conf:
attention_heads: 4
linear_units: 2048
num_blocks: 6
dropout_rate: 0.1
positional_dropout_rate: 0.1
self_attention_dropout_rate: 0.0
src_attention_dropout_rate: 0.0
# hybrid CTC/attention
model_conf:
ctc_weight: 0.3
ctc_dropoutrate: 0.0
ctc_grad_norm_type: instance
lsm_weight: 0.1 # label smoothing option
length_normalized_loss: false
training:
n_epoch: 120
accum_grad: 8
global_grad_clip: 3.0
optim: adam
optim_conf:
lr: 0.004
weight_decay: 1e-06
scheduler: warmuplr # pytorch v1.1.0+ required
scheduler_conf:
warmup_steps: 25000
lr_decay: 1.0
log_interval: 100
checkpoint:
kbest_n: 50
latest_n: 5
decoding:
batch_size: 64
error_rate_type: wer
decoding_method: attention # 'attention', 'ctc_greedy_search', 'ctc_prefix_beam_search', 'attention_rescoring'
lang_model_path: data/lm/common_crawl_00.prune01111.trie.klm
alpha: 2.5
beta: 0.3
beam_size: 10
cutoff_prob: 1.0
cutoff_top_n: 0
num_proc_bsearch: 8
ctc_weight: 0.5 # ctc weight for attention rescoring decode mode.
decoding_chunk_size: -1 # decoding chunk size. Defaults to -1.
# <0: for decoding, use full chunk.
# >0: for decoding, use fixed chunk size as set.
# 0: used for training, it's prohibited here.
num_decoding_left_chunks: -1 # number of left chunks for decoding. Defaults to -1.
simulate_streaming: False # simulate streaming inference. Defaults to False.
{"mean_stat": [3419817384.9589553, 3554070049.1888413, 3818511309.9166613, 4066044518.3850017, 4291564631.2871633, 4447813845.146345, 4533096457.680424, 4535743891.989957, 4529762966.952207, 4506798370.255702, 4563810141.721841, 4621582319.277632, 4717208210.814803, 4782916961.295261, 4800534153.252695, 4816978042.979026, 4813370098.242317, 4783029495.131413, 4797780594.144404, 4697681126.278327, 4615891408.325888, 4660549391.6024275, 4576180438.146472, 4609080513.250168, 4575296489.058092, 4602504837.872262, 4568039825.650208, 4596829549.204861, 4590634987.343898, 4604371982.549804, 4623782318.317643, 4643582410.8842745, 4681460771.788484, 4759470876.31175, 4808639788.683043, 4828470941.416027, 4868984035.113543, 4906503986.801533, 4945995579.443381, 4936645225.986488, 4975902400.919519, 4960230208.656678, 4986734786.199859, 4983472199.8246765, 5002204376.162232, 5030432036.352981, 5060386169.086892, 5093482058.577236, 5118330657.308789, 5137270836.326198, 5140137363.319094, 5144296534.330122, 5158812605.654329, 5166263515.51458, 5156261604.282723, 5155820011.532965, 5154511256.8968, 5152063882.193671, 5153425524.412178, 5149000486.683038, 5154587156.35868, 5134412165.07972, 5092874838.792056, 5062281231.5140915, 5029059442.072953, 4996045017.917702, 4962203662.170533, 4928110046.282831, 4900476581.092096, 4881407033.533021, 4859626116.955097, 4851430742.3865795, 4850317443.454599, 4848197040.155383, 4837178106.464577, 4818448202.7298765, 4803345264.527405, 4765785994.104498, 4735296707.352132, 4699957946.40757], "var_stat": [39487786239.20539, 42865198005.60155, 49718916704.468704, 55953639455.490585, 62156293826.00315, 66738657819.12445, 69416921986.47835, 69657873431.17258, 69240303799.53061, 68286972351.43054, 69718367152.18843, 71405427710.7103, 74174200331.87572, 76047347951.43869, 76478048614.40665, 76810929560.19212, 76540466184.85634, 75538479521.34026, 75775624554.07217, 72775991318.16557, 70350402972.93352, 71358602366.48341, 68872845697.9878, 69552396791.49916, 68471390455.59991, 69022047288.07498, 67982260910.11236, 68656154716.71916, 68461419064.9241, 68795285460.65717, 69270474608.52791, 69754495937.76433, 70596044579.14969, 72207936275.97945, 73629619360.65047, 74746445259.57487, 75925168496.81197, 76973508692.04265, 78074337163.3413, 77765963787.96971, 78839167623.49733, 78328768943.2287, 79016127287.03778, 78922638306.99306, 79489768324.9408, 80354861037.44005, 81311991408.12526, 82368205917.26112, 83134782296.1741, 83667769421.23245, 83673751953.46239, 83806087685.62842, 84193971202.07523, 84424752763.34825, 84092846117.64104, 84039114093.08766, 83982515225.7085, 83909645482.75613, 83947278563.15077, 83800767707.19617, 83851106027.8772, 83089292432.37892, 82056425825.3622, 81138570746.92316, 80131843258.75557, 79130160837.19037, 78092166878.71533, 77104785522.79205, 76308548392.10454, 75709445890.58063, 75084778641.6033, 74795849006.19067, 74725807683.832, 74645651838.2169, 74300193368.39339, 73696619147.86806, 73212785808.97992, 72240491743.0697, 71420246227.32545, 70457076435.4593], "frame_num": 345484372}
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
from pathlib import Path
import sys
import numpy as np
from paddlehub.env import MODULE_HOME
from paddlehub.module.module import moduleinfo, serving
from paddlehub.utils.log import logger
import paddle
import soundfile as sf
# TODO: Remove system path when deepspeech can be installed via pip.
sys.path.append(os.path.join(MODULE_HOME, 'u2_conformer_librispeech'))
from deepspeech.exps.u2.config import get_cfg_defaults
from deepspeech.utils.utility import UpdateConfig
from .u2_conformer_tester import U2ConformerTester
@moduleinfo(
name="u2_conformer_librispeech", version="1.0.0", summary="", author="Baidu", author_email="", type="audio/asr")
class U2Conformer(paddle.nn.Layer):
def __init__(self):
super(U2Conformer, self).__init__()
# resource
res_dir = os.path.join(MODULE_HOME, 'u2_conformer_librispeech', 'assets')
conf_file = os.path.join(res_dir, 'conf/conformer.yaml')
checkpoint = os.path.join(res_dir, 'checkpoints/avg_30.pdparams')
# config
self.config = get_cfg_defaults()
self.config.merge_from_file(conf_file)
# TODO: Remove path updating snippet.
with UpdateConfig(self.config):
self.config.collator.vocab_filepath = os.path.join(res_dir, self.config.collator.vocab_filepath)
self.config.collator.spm_model_prefix = os.path.join(res_dir, self.config.collator.spm_model_prefix)
self.config.collator.augmentation_config = os.path.join(res_dir, self.config.collator.augmentation_config)
self.config.model.cmvn_file = os.path.join(res_dir, self.config.model.cmvn_file)
self.config.decoding.decoding_method = 'attention_rescoring'
self.config.decoding.batch_size = 1
# model
self.tester = U2ConformerTester(self.config)
self.tester.setup_model()
self.tester.resume(checkpoint)
@staticmethod
def check_audio(audio_file):
sig, sample_rate = sf.read(audio_file)
assert sample_rate == 16000, 'Expected sample rate of input audio to be 16000, but got {}'.format(sample_rate)
@serving
def speech_recognize(self, audio_file, device='cpu'):
assert os.path.isfile(audio_file), 'File does not exist: {}'.format(audio_file)
self.check_audio(audio_file)
paddle.set_device(device)
return self.tester.test(audio_file)[0][0]
loguru
yacs
jsonlines
scipy==1.2.1
sentencepiece
resampy==0.2.2
SoundFile==0.9.0.post1
soxbindings
kaldiio
typeguard
editdistance
textgrid
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Evaluation for U2 model."""
import os
import sys
import paddle
from deepspeech.frontend.featurizer.text_featurizer import TextFeaturizer
from deepspeech.io.collator import SpeechCollator
from deepspeech.models.u2 import U2Model
from deepspeech.utils import mp_tools
from deepspeech.utils.utility import UpdateConfig
class U2ConformerTester:
def __init__(self, config):
self.config = config
self.collate_fn_test = SpeechCollator.from_config(config)
self._text_featurizer = TextFeaturizer(
unit_type=config.collator.unit_type, vocab_filepath=None, spm_model_prefix=config.collator.spm_model_prefix)
@mp_tools.rank_zero_only
@paddle.no_grad()
def test(self, audio_file):
self.model.eval()
cfg = self.config.decoding
collate_fn_test = self.collate_fn_test
audio, _ = collate_fn_test.process_utterance(audio_file=audio_file, transcript="Hello")
audio_len = audio.shape[0]
audio = paddle.to_tensor(audio, dtype='float32')
audio_len = paddle.to_tensor(audio_len)
audio = paddle.unsqueeze(audio, axis=0)
vocab_list = collate_fn_test.vocab_list
text_feature = self.collate_fn_test.text_feature
result_transcripts = self.model.decode(
audio,
audio_len,
text_feature=text_feature,
decoding_method=cfg.decoding_method,
lang_model_path=cfg.lang_model_path,
beam_alpha=cfg.alpha,
beam_beta=cfg.beta,
beam_size=cfg.beam_size,
cutoff_prob=cfg.cutoff_prob,
cutoff_top_n=cfg.cutoff_top_n,
num_processes=cfg.num_proc_bsearch,
ctc_weight=cfg.ctc_weight,
decoding_chunk_size=cfg.decoding_chunk_size,
num_decoding_left_chunks=cfg.num_decoding_left_chunks,
simulate_streaming=cfg.simulate_streaming)
return result_transcripts
def setup_model(self):
config = self.config.clone()
with UpdateConfig(config):
config.model.input_dim = self.collate_fn_test.feature_size
config.model.output_dim = self.collate_fn_test.vocab_size
self.model = U2Model.from_config(config.model)
def resume(self, checkpoint):
"""Resume from the checkpoint at checkpoints in the output
directory or load a specified checkpoint.
"""
model_dict = paddle.load(checkpoint)
self.model.set_state_dict(model_dict)
# u2_conformer_wenetspeech
|Model Name|u2_conformer_wenetspeech|
| :--- | :---: |
|Category|Speech - Speech Recognition|
|Network|Conformer|
|Dataset|WenetSpeech|
|Fine-tuning supported|No|
|Module Size|494MB|
|Latest update date|2021-12-10|
|Data metric|Chinese CER 0.087|
## I. Basic Information
### Model Introduction
The U2 Conformer model is an end-to-end speech recognition model for both English and Chinese. u2_conformer_wenetspeech combines a Conformer encoder with a Transformer decoder: decoding first scores candidates with CTC prefix beam search, then rescores them with the attention decoder to obtain the final result.
u2_conformer_wenetspeech was pre-trained on [WenetSpeech](https://wenet-e2e.github.io/WenetSpeech/), an open-source Mandarin Chinese speech dataset, and reaches a CER of 0.087 on its DEV set.
<p align="center">
<img src="https://paddlehub.bj.bcebos.com/paddlehub-img/conformer.png" hspace='10'/> <br />
</p>
<p align="center">
<img src="https://paddlehub.bj.bcebos.com/paddlehub-img/u2_conformer.png" hspace='10'/> <br />
</p>
For more details, please refer to:
- [Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition](https://arxiv.org/abs/2012.05481)
- [Conformer: Convolution-augmented Transformer for Speech Recognition](https://arxiv.org/abs/2005.08100)
- [WenetSpeech: A 10000+ Hours Multi-domain Mandarin Corpus for Speech Recognition](https://arxiv.org/abs/2110.03370)
## II. Installation
- ### 1. System Dependencies
- libsndfile
- Linux
```shell
$ sudo apt-get install libsndfile
or
$ sudo yum install libsndfile
```
- MacOS
```
$ brew install libsndfile
```
- ### 2. Environment Dependencies
- paddlepaddle >= 2.2.0
- paddlehub >= 2.1.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 3. Installation
- ```shell
$ hub install u2_conformer_wenetspeech
```
- If you encounter problems during installation, please refer to: [Windows quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [Linux quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [MacOS quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Module API and Prediction
- ### 1. Prediction Code Example
- ```python
import paddlehub as hub
# Chinese speech audio in wav format, sampled at 16 kHz
wav_file = '/PATH/TO/AUDIO'
model = hub.Module(
name='u2_conformer_wenetspeech',
version='1.0.0')
text = model.speech_recognize(wav_file)
print(text)
```
- ### 2. API
- ```python
def check_audio(audio_file)
```
- Checks whether the input audio is in wav format with a sample rate of 16000; if not, the audio is resampled to 16000 and the new audio file is saved to the same directory.
- **Parameters**
- `audio_file`: path to a local audio file (*.wav), e.g. `/path/to/input.wav`.
- ```python
def speech_recognize(
audio_file,
device='cpu',
)
```
- Transcribes the input audio into text.
- **Parameters**
- `audio_file`: path to a local audio file (*.wav), e.g. `/path/to/input.wav`.
- `device`: device used for prediction; defaults to `cpu`. Set it to `gpu` to predict on GPU.
- **Returns**
- `text`: str, the transcription of the input audio.
## IV. Service Deployment
- PaddleHub Serving can deploy an online speech recognition service.
- ### Step 1: Start the PaddleHub Serving
- ```shell
$ hub serving start -m u2_conformer_wenetspeech
```
- This deploys the speech recognition API service; the default port is 8866.
- **NOTE:** To predict on GPU, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it does not need to be set.
- ### Step 2: Send a prediction request
- With the server configured, the few lines of code below send a prediction request and fetch the result.
- ```python
import requests
import json
# Path of the audio to recognize; make sure the serving machine can access it
file = '/path/to/input.wav'
# Arguments are passed to the prediction method by key; here the key is "audio_file"
data = {"audio_file": file}
# Send a POST request; the content type should be JSON, and the IP in the URL should be replaced with that of the serving machine
url = "http://127.0.0.1:8866/predict/u2_conformer_wenetspeech"
# Set the POST request headers to application/json
headers = {"Content-Type": "application/json"}
r = requests.post(url=url, headers=headers, data=json.dumps(data))
print(r.json())
```
## V. Release Note
* 1.0.0
First release
```shell
$ hub install u2_conformer_wenetspeech
```
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import paddle
from paddleaudio import load, save_wav
from paddlespeech.cli import ASRExecutor
from paddlehub.module.module import moduleinfo, serving
from paddlehub.utils.log import logger
@moduleinfo(
name="u2_conformer_wenetspeech", version="1.0.0", summary="", author="Wenet", author_email="", type="audio/asr")
class U2Conformer(paddle.nn.Layer):
def __init__(self):
super(U2Conformer, self).__init__()
self.asr_executor = ASRExecutor()
self.asr_kw_args = {
'model': 'conformer_wenetspeech',
'lang': 'zh',
'sample_rate': 16000,
'config': None, # Set `config` and `ckpt_path` to None to use pretrained model.
'ckpt_path': None,
}
@staticmethod
def check_audio(audio_file):
assert audio_file.endswith('.wav'), 'Input file must be a wave file `*.wav`.'
sig, sample_rate = load(audio_file)
if sample_rate != 16000:
sig, _ = load(audio_file, 16000)
audio_file_16k = audio_file[:audio_file.rindex('.')] + '_16k.wav'
logger.info('Resampling to a 16000 Hz sample rate and saving to a new audio file: {}'.format(audio_file_16k))
save_wav(sig, 16000, audio_file_16k)
return audio_file_16k
else:
return audio_file
@serving
def speech_recognize(self, audio_file, device='cpu'):
assert os.path.isfile(audio_file), 'File does not exist: {}'.format(audio_file)
audio_file = self.check_audio(audio_file)
text = self.asr_executor(audio_file=audio_file, device=device, **self.asr_kw_args)
return text
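Since this module is a thin wrapper over the PaddleSpeech CLI, the equivalent direct call is a sketch away; it uses only the keyword arguments the wrapper itself passes (the audio path is a placeholder):
```python
from paddlespeech.cli import ASRExecutor

asr = ASRExecutor()
# Same keyword arguments the module passes; config/ckpt_path left as None
# select the pretrained conformer_wenetspeech model.
text = asr(
    audio_file='/PATH/TO/AUDIO.wav',
    model='conformer_wenetspeech',
    lang='zh',
    sample_rate=16000,
    config=None,
    ckpt_path=None,
    device='cpu')
print(text)
```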
# panns_cnn10
|Model Name|panns_cnn10|
| :--- | :---: |
|Category|Speech - Sound Classification|
|Network|PANNs|
|Dataset|Google Audioset|
|Fine-tuning supported|Yes|
|Module Size|31MB|
|Latest update date|2021-06-15|
|Data metric|mAP 0.380|
## I. Basic Information
### Model Introduction
`panns_cnn10` is a sound classification/recognition model trained on the [Google Audioset](https://research.google.com/audioset/) dataset. The model mainly consists of 8 convolutional layers and 2 fully connected layers, with 4.9M parameters. After pre-training, it can be used to extract audio embeddings of dimension 512.
For more details, please refer to the paper: [PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition](https://arxiv.org/pdf/1912.10211.pdf)
## II. Installation
- ### 1. Environment Dependencies
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 2. Installation
- ```shell
$ hub install panns_cnn10
```
- If you encounter problems during installation, please refer to: [Windows quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [Linux quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [MacOS quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Module API and Prediction
- ### 1. Prediction Code Example
- ```python
# ESC50 sound classification prediction
import librosa
import paddlehub as hub
from paddlehub.datasets import ESC50
sr = 44100 # sample rate of the audio file
wav_file = '/PATH/TO/AUDIO' # path of the audio file to predict
checkpoint = 'model.pdparams' # model parameters used for prediction
label_map = {idx: label for idx, label in enumerate(ESC50.label_list)}
...@@ -86,8 +70,8 @@ def predict(
print('File: {}\tLabel: {}'.format(wav_file, result[0]))
```
- ```python
# Audioset Tagging
import librosa
import numpy as np
import paddlehub as hub
...@@ -105,7 +89,7 @@ def predict(
print(msg)
sr = 44100 # sample rate of the audio file
wav_file = '/PATH/TO/AUDIO' # path of the audio file to predict
label_file = './audioset_labels.txt' # audioset label text file
topk = 10 # number of top-k results to show
...@@ -130,23 +114,58 @@ def predict(
show_topk(topk, label_map, wav_file, result[0])
```
- ### 2. API
- ```python
def __init__(
task,
num_class=None,
label_map=None,
load_checkpoint=None,
**kwargs,
)
```
- Creates a Module object.
- **Parameters**
- `task`: task name, either `sound-cls` or `None`. `sound-cls` stands for the sound classification task and allows fine-tuning on sound classification datasets; with `None`, the pre-trained model can be used for audio classification/tagging directly.
- `num_classes`: number of classes for the sound classification task; must be set when fine-tuning and match the dataset in use.
- `label_map`: class mapping table used at prediction time.
- `load_checkpoint`: path to the model parameter file saved with the PaddleHub Fine-tune API.
- `**kwargs`: additional user-specified keyword arguments.
- ```python
def predict(
data,
sample_rate,
batch_size=1,
feat_type='mel',
use_gpu=False
)
```
- Model prediction: takes audio waveform data as input and outputs classification labels.
- **Parameters**
- `data`: data to predict, in the format \[waveform1, waveform2, …\], where each element is a one-dimensional numpy array of waveform samples.
- `sample_rate`: sample rate of the audio files.
- `feat_type`: type of audio feature; currently supports `'mel'` (see [Mel-frequency cepstrum](https://en.wikipedia.org/wiki/Mel-frequency_cepstrum)) and the raw waveform feature `'raw'`.
- `batch_size`: batch size for the model.
- `use_gpu`: whether to use GPU; defaults to False. GPU users are advised to enable it.
- **Returns**
- `results`: list; the content depends on the task type:
- Sound classification (task set to `sound-cls`): the list contains the predicted label of each audio file.
- Tagging (task set to `None`): the list contains, for each audio file, the scores of the 527 classes ([Audioset labels](https://research.google.com/audioset/)).
For more details, see the PaddleHub example:
- [AudioClassification](https://github.com/PaddlePaddle/PaddleHub/tree/release/v2.0/demo/audio_classification)
## IV. Release Note
* 1.0.0
First release: dynamic-graph model, supporting fine-tuning for the `sound-cls` sound classification task and Audioset tagging prediction.
```shell
$ hub install panns_cnn10
```
# panns_cnn14
|Model Name|panns_cnn14|
| :--- | :---: |
|Category|Speech - Sound Classification|
|Network|PANNs|
|Dataset|Google Audioset|
|Fine-tuning supported|Yes|
|Module Size|469MB|
|Latest update date|2021-06-15|
|Data metric|mAP 0.431|
## I. Basic Information
### Model Introduction
`panns_cnn14` is a sound classification/recognition model trained on the [Google Audioset](https://research.google.com/audioset/) dataset. The model mainly consists of 12 convolutional layers and 2 fully connected layers, with 79.6M parameters. After pre-training, it can be used to extract audio embeddings of dimension 2048.
For more details, please refer to the paper: [PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition](https://arxiv.org/pdf/1912.10211.pdf)
## II. Installation
- ### 1. Environment Dependencies
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 2. Installation
- ```shell
$ hub install panns_cnn14
```
- If you encounter problems during installation, please refer to: [Windows quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [Linux quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [MacOS quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Module API and Prediction
- ### 1. Prediction Code Example
- ```python
# ESC50 sound classification prediction
import librosa
import paddlehub as hub
from paddlehub.datasets import ESC50
sr = 44100 # sample rate of the audio file
wav_file = '/PATH/TO/AUDIO' # path of the audio file to predict
checkpoint = 'model.pdparams' # model parameters used for prediction
label_map = {idx: label for idx, label in enumerate(ESC50.label_list)}
...@@ -86,8 +70,8 @@ def predict(
print('File: {}\tLabel: {}'.format(wav_file, result[0]))
```
- ```python
# Audioset Tagging
import librosa
import numpy as np
import paddlehub as hub
...@@ -105,7 +89,7 @@ def predict(
print(msg)
sr = 44100 # sample rate of the audio file
wav_file = '/PATH/TO/AUDIO' # path of the audio file to predict
label_file = './audioset_labels.txt' # audioset label text file
topk = 10 # number of top-k results to show
...@@ -130,23 +114,58 @@ def predict(
show_topk(topk, label_map, wav_file, result[0])
```
- ### 2. API
- ```python
def __init__(
task,
num_class=None,
label_map=None,
load_checkpoint=None,
**kwargs,
)
```
- Creates a Module object.
- **Parameters**
- `task`: task name, either `sound-cls` or `None`. `sound-cls` stands for the sound classification task and allows fine-tuning on sound classification datasets; with `None`, the pre-trained model can be used for audio classification/tagging directly.
- `num_classes`: number of classes for the sound classification task; must be set when fine-tuning and match the dataset in use.
- `label_map`: class mapping table used at prediction time.
- `load_checkpoint`: path to the model parameter file saved with the PaddleHub Fine-tune API.
- `**kwargs`: additional user-specified keyword arguments.
- ```python
def predict(
data,
sample_rate,
batch_size=1,
feat_type='mel',
use_gpu=False
)
```
- Model prediction: takes audio waveform data as input and outputs classification labels.
- **Parameters**
- `data`: data to predict, in the format \[waveform1, waveform2, …\], where each element is a one-dimensional numpy array of waveform samples.
- `sample_rate`: sample rate of the audio files.
- `feat_type`: type of audio feature; currently supports `'mel'` (see [Mel-frequency cepstrum](https://en.wikipedia.org/wiki/Mel-frequency_cepstrum)) and the raw waveform feature `'raw'`.
- `batch_size`: batch size for the model.
- `use_gpu`: whether to use GPU; defaults to False. GPU users are advised to enable it.
- **Returns**
- `results`: list; the content depends on the task type:
- Sound classification (task set to `sound-cls`): the list contains the predicted label of each audio file.
- Tagging (task set to `None`): the list contains, for each audio file, the scores of the 527 classes ([Audioset labels](https://research.google.com/audioset/)).
For more details, see the PaddleHub example:
- [AudioClassification](https://github.com/PaddlePaddle/PaddleHub/tree/release/v2.0/demo/audio_classification)
## IV. Release Note
* 1.0.0
First release: dynamic-graph model, supporting fine-tuning for the `sound-cls` sound classification task and Audioset tagging prediction.
```shell
$ hub install panns_cnn14
```
# panns_cnn6
|Model Name|panns_cnn6|
| :--- | :---: |
|Category|Speech - Sound Classification|
|Network|PANNs|
|Dataset|Google Audioset|
|Fine-tuning supported|Yes|
|Module Size|29MB|
|Latest update date|2021-06-15|
|Data metric|mAP 0.343|
## I. Basic Information
### Model Introduction
`panns_cnn6` is a sound classification/recognition model trained on the [Google Audioset](https://research.google.com/audioset/) dataset. The model mainly consists of 4 convolutional layers and 2 fully connected layers, with 4.5M parameters. After pre-training, it can be used to extract audio embeddings of dimension 512.
For more details, please refer to: [PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition](https://arxiv.org/pdf/1912.10211.pdf)
## II. Installation
- ### 1. Environment Dependencies
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 2. Installation
- ```shell
$ hub install panns_cnn6
```
- If you encounter problems during installation, please refer to: [Windows quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [Linux quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [MacOS quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Module API and Prediction
- ### 1. Prediction Code Example
- ```python
# ESC50 sound classification prediction
import librosa
import paddlehub as hub
from paddlehub.datasets import ESC50
sr = 44100 # sample rate of the audio file
wav_file = '/PATH/TO/AUDIO' # path of the audio file to predict
checkpoint = 'model.pdparams' # model parameters used for prediction
label_map = {idx: label for idx, label in enumerate(ESC50.label_list)}
...@@ -86,8 +70,8 @@ def predict(
print('File: {}\tLabel: {}'.format(wav_file, result[0]))
```
- ```python
# Audioset Tagging
import librosa
import numpy as np
import paddlehub as hub
...@@ -105,7 +89,7 @@ def predict(
print(msg)
sr = 44100 # sample rate of the audio file
wav_file = '/PATH/TO/AUDIO' # path of the audio file to predict
label_file = './audioset_labels.txt' # audioset label text file
topk = 10 # number of top-k results to show
...@@ -130,23 +114,58 @@ def predict(
show_topk(topk, label_map, wav_file, result[0])
```
- ### 2. API
- ```python
def __init__(
task,
num_class=None,
label_map=None,
load_checkpoint=None,
**kwargs,
)
```
- Creates a Module object.
- **Parameters**
- `task`: task name, either `sound-cls` or `None`. `sound-cls` stands for the sound classification task and allows fine-tuning on sound classification datasets; with `None`, the pre-trained model can be used for audio classification/tagging directly.
- `num_classes`: number of classes for the sound classification task; must be set when fine-tuning and match the dataset in use.
- `label_map`: class mapping table used at prediction time.
- `load_checkpoint`: path to the model parameter file saved with the PaddleHub Fine-tune API.
- `**kwargs`: additional user-specified keyword arguments.
- ```python
def predict(
data,
sample_rate,
batch_size=1,
feat_type='mel',
use_gpu=False
)
```
- Model prediction: takes audio waveform data as input and outputs classification labels.
- **Parameters**
- `data`: data to predict, in the format \[waveform1, waveform2, …\], where each element is a one-dimensional numpy array of waveform samples.
- `sample_rate`: sample rate of the audio files.
- `feat_type`: type of audio feature; currently supports `'mel'` (see [Mel-frequency cepstrum](https://en.wikipedia.org/wiki/Mel-frequency_cepstrum)) and the raw waveform feature `'raw'`.
- `batch_size`: batch size for the model.
- `use_gpu`: whether to use GPU; defaults to False. GPU users are advised to enable it.
- **Returns**
- `results`: list; the content depends on the task type:
- Sound classification (task set to `sound-cls`): the list contains the predicted label of each audio file.
- Tagging (task set to `None`): the list contains, for each audio file, the scores of the 527 classes ([Audioset labels](https://research.google.com/audioset/)).
For more details, see the PaddleHub example:
- [AudioClassification](https://github.com/PaddlePaddle/PaddleHub/tree/release/v2.0/demo/audio_classification)
## IV. Release Note
* 1.0.0
First release: dynamic-graph model, supporting fine-tuning for the `sound-cls` sound classification task and Audioset tagging prediction.
```shell
$ hub install panns_cnn6
```
# deepvoice3_ljspeech
|Model Name|deepvoice3_ljspeech|
| :--- | :---: |
|Category|Speech - Text-to-Speech|
|Network|DeepVoice3|
|Dataset|LJSpeech-1.1|
|Fine-tuning supported|No|
|Module Size|58MB|
|Latest update date|2020-10-27|
|Data metric|-|
## I. Basic Information
### Model Introduction
Deep Voice 3 is an end-to-end TTS model released by Baidu Research in 2017 (the paper was accepted at ICLR 2018). It is a seq2seq model based on convolutional neural networks and attention; since it contains no recurrent networks, it can be trained in parallel and is far faster than RNN-based models. Deep Voice 3 can learn the characteristics of multiple speakers and can be paired with several vocoders. deepvoice3_ljspeech is an English TTS model pre-trained on the ljspeech English speech dataset; it supports prediction only.
<p align="center">
<img src="https://raw.githubusercontent.com/PaddlePaddle/Parakeet/release/v0.1/examples/deepvoice3/images/model_architecture.png" hspace='10'/> <br/>
</p>
For more details, refer to the paper [Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning](https://arxiv.org/abs/1710.07654)
## II. Installation
- ### 1. System Dependencies
For Ubuntu users, run:
```
sudo apt-get install libsndfile1
```
For CentOS users, run:
```
sudo yum install libsndfile
```
- ### 2. Environment Dependencies
- 2.0.0 > paddlepaddle >= 1.8.2
- 2.0.0 > paddlehub >= 1.7.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 3. Installation
- ```shell
$ hub install deepvoice3_ljspeech
```
- If you encounter problems during installation, please refer to: [Windows quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [Linux quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [MacOS quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Module API and Prediction
- ### 1. Command-line Prediction
- ```shell
$ hub run deepvoice3_ljspeech --input_text='Simple as this proposition is, it is necessary to be stated' --use_gpu True --vocoder griffin-lim
```
- This invokes the TTS model from the command line; for more, see [PaddleHub command-line usage](https://github.com/shinichiye/PaddleHub/blob/release/v2.1/docs/docs_ch/tutorial/cmd_usage.rst)
- ### 2. Prediction Code Example
- ```python
import paddlehub as hub
import soundfile as sf
# Load deepvoice3_ljspeech module.
module = hub.Module(name="deepvoice3_ljspeech")
# Synthesize speech for the input texts.
test_texts = ['Simple as this proposition is, it is necessary to be stated',
'Parakeet stands for Paddle PARAllel text-to-speech toolkit']
wavs, sample_rate = module.synthesize(texts=test_texts)
for index, wav in enumerate(wavs):
sf.write(f"{index}.wav", wav, sample_rate)
```
- ### 3. API
- ```python
def synthesize(texts, use_gpu=False, vocoder="griffin-lim"):
```
- Prediction API: synthesizes the audio waveform for the input texts.
- **Parameters**
- texts (list\[str\]): texts to predict;
- use\_gpu (bool): whether to use GPU; **if you use GPU, set the CUDA\_VISIBLE\_DEVICES environment variable first**;
- vocoder: the vocoder to use, either "griffin-lim" or "waveflow"
- **Returns**
- wavs (list): synthesis results; each element is the audio waveform of the corresponding input text and can be further processed or saved with `soundfile.write`.
- sample\_rate (int): sample rate of the synthesized audio.
## IV. Service Deployment
- PaddleHub Serving can deploy an online text-to-speech service, which can be used by online web applications.
- ### Step 1: Start the PaddleHub Serving
- Run the start command
- ```shell
$ hub serving start -m deepvoice3_ljspeech
```
- This deploys the API service; the default port is 8866.
- **NOTE:** To predict on GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it does not need to be set.
- ### Step 2: Send a prediction request
- With the server configured, the few lines of code below send a prediction request and fetch the result
- ```python
import requests
import json
import soundfile as sf
# Send an HTTP request
data = {'texts':['Simple as this proposition is, it is necessary to be stated',
'Parakeet stands for Paddle PARAllel text-to-speech toolkit'],
'use_gpu':False}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/deepvoice3_ljspeech"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# Save the results
result = r.json()["results"]
wavs = result["wavs"]
sample_rate = result["sample_rate"]
for index, wav in enumerate(wavs):
sf.write(f"{index}.wav", wav, sample_rate)
```
## V. Release Note
* 1.0.0
First release
```shell
$ hub install deepvoice3_ljspeech
```
# fastspeech2_baker
|Model Name|fastspeech2_baker|
| :--- | :---: |
|Category|Speech - Text-to-Speech|
|Network|FastSpeech2|
|Dataset|Chinese Standard Mandarin Speech Corpus|
|Fine-tuning supported|No|
|Module Size|621MB|
|Latest update date|2021-10-20|
|Data metric|-|
## I. Basic Information
### Model Introduction
FastSpeech2 is a text-to-speech (TTS) model proposed in 2020 by Microsoft Research Asia and the Microsoft Azure Speech team together with Zhejiang University. It improves on FastSpeech by removing its dependence on a teacher-student knowledge-distillation framework, whose training pipeline was complex and whose training target lost information relative to real speech.
As shown below, FastSpeech2 keeps the Feed-Forward Transformer (FFT) architecture introduced in FastSpeech, but inserts a variance adaptor between the phoneme encoder and the mel-spectrogram decoder. The adaptor injects additional sources of variation in speech, such as duration, pitch, and volume (spectral energy), to address the one-to-many mapping problem in speech synthesis (a toy sketch of this data flow follows the references below).
<p align="center">
<img src="https://paddlespeech.bj.bcebos.com/Parakeet/docs/images/fastspeech2.png" hspace='10'/> <br />
</p>
Parallel WaveGAN is a fast, small-footprint waveform generation method based on a distillation-free generative adversarial network. It trains a non-autoregressive WaveNet by jointly optimizing multi-resolution spectrogram and adversarial losses, which effectively captures the time-frequency distribution of real speech waveforms. Its structure is shown below:
<p align="center">
<img src="https://paddlespeech.bj.bcebos.com/Parakeet/docs/images/pwg.png" hspace='10'/> <br />
</p>
fastspeech2_baker uses FastSpeech2 as the acoustic model and Parallel WaveGAN as the vocoder, pre-trained on the [Chinese Standard Mandarin Speech Corpus](https://www.data-baker.com/open_source.html) dataset; it can be used directly to synthesize audio.
For more details, please refer to:
- [FastSpeech 2: Fast and High-Quality End-to-End Text-to-Speech](https://arxiv.org/abs/2006.04558)
- [Technical upgrade of the FastSpeech TTS system: Microsoft and Zhejiang University propose FastSpeech2](https://www.msra.cn/zh-cn/news/features/fastspeech2)
- [Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram](https://arxiv.org/abs/1910.11480)
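As a purely illustrative sketch of the variance-adaptor data flow described above (every function here is a toy stand-in invented for the sketch, not the real model):
```python
# Toy sketch of the FastSpeech2 variance-adaptor flow; illustrative only.
def variance_adaptor(phoneme_hidden, duration_predictor, pitch_predictor, energy_predictor):
    durations = duration_predictor(phoneme_hidden)  # predicted frames per phoneme
    # Length regulator: repeat each phoneme state to frame level.
    frame_hidden = [h for h, d in zip(phoneme_hidden, durations) for _ in range(d)]
    # Add predicted pitch/energy information before the mel decoder.
    return [h + pitch_predictor(h) + energy_predictor(h) for h in frame_hidden]

# Toy usage with scalar "hidden states" and constant predictors:
out = variance_adaptor([1.0, 2.0], lambda hs: [2, 3], lambda h: 0.1, lambda h: 0.01)
print(out)  # ≈ [1.11, 1.11, 2.11, 2.11, 2.11]
```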
## 二、安装
- ### 1、环境依赖
- paddlepaddle >= 2.1.0
- paddlehub >= 2.1.0 | [如何安装PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 2、安装
- ```shell
$ hub install fastspeech2_baker
```
- 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## 三、模型API预测
- ### 1、预测代码示例
```python
import paddlehub as hub
# 需要合成语音的文本
sentences = ['这是一段测试语音合成的音频。']
model = hub.Module(
name='fastspeech2_baker',
version='1.0.0')
wav_files = model.generate(sentences)
# 打印合成的音频文件的路径
print(wav_files)
```
详情可参考PaddleHub示例:
- [语音合成](../../../../demo/text_to_speech)
- ### 2. API

  - ```python
    def __init__(output_dir)
    ```

    - Create a Module object (dygraph version).

    - **Parameters**

      - `output_dir`: output directory for the synthesized audio files.

  - ```python
    def generate(
        sentences,
        device='cpu',
    )
    ```

    - Synthesize the input texts into audio files and save them to the output directory.

    - **Parameters**

      - `sentences`: texts to synthesize, of type `List[str]`.
      - `device`: device used for prediction; defaults to `cpu`. Set it to `gpu` to predict on GPU.

    - **Return**

      - `wav_files`: `List[str]`, the paths of the synthesized audio files.
## IV. Server Deployment

- PaddleHub Serving can deploy an online text-to-speech service.

- ### Step 1: Start PaddleHub Serving

  - ```shell
    $ hub serving start -m fastspeech2_baker
    ```

  - This deploys the text-to-speech API service, listening on port 8866 by default.

  - **NOTE:** To predict on GPU, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it need not be set.

- ### Step 2: Send a prediction request

  - With the server configured, the few lines of code below send a prediction request and retrieve the result:
  - ```python
    import requests
    import json

    # Text to be synthesized
    sentences = [
        '这是第一段测试语音合成的音频。',
        '这是第二段测试语音合成的音频。',
    ]

    # Pass the texts under the key expected by the prediction method, here "sentences"
    data = {"sentences": sentences}
    # Send a POST request; the content type must be JSON, and the IP in the url
    # should be replaced with the address of the serving machine
    url = "http://127.0.0.1:8866/predict/fastspeech2_baker"
    # Set the POST request headers to application/json
    headers = {"Content-Type": "application/json"}
    r = requests.post(url=url, headers=headers, data=json.dumps(data))

    print(r.json())
    ```
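  - Note that the synthesized files are written on the serving machine. A sketch of reading their paths out of the response, continuing from the request above and assuming the standard PaddleHub Serving JSON wrapper (which places the method's return value under `"results"`):

  - ```python
    # Illustrative only: list the server-side wav paths from the JSON response
    resp = r.json()
    for wav_path in resp.get("results", []):
        print("synthesized on the server:", wav_path)
    ```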
## V. Release Note

* 1.0.0

  First release

  ```shell
  $ hub install fastspeech2_baker
  ```
###########################################################
# FEATURE EXTRACTION SETTING #
###########################################################
fs: 24000                # Sampling rate.
n_fft: 2048 # FFT size.
n_shift: 300 # Hop size.
win_length: 1200 # Window length.
# If set to null, it will be the same as fft_size.
window: "hann" # Window function.
# Only used for feats_type != raw
fmin: 80 # Minimum frequency of Mel basis.
fmax: 7600 # Maximum frequency of Mel basis.
n_mels: 80 # The number of mel basis.
# Only used for the model using pitch features (e.g. FastSpeech2)
f0min: 80                # Minimum f0 for pitch extraction.
f0max: 400               # Maximum f0 for pitch extraction.
###########################################################
# DATA SETTING #
###########################################################
batch_size: 64
num_workers: 4
###########################################################
# MODEL SETTING #
###########################################################
model:
adim: 384 # attention dimension
aheads: 2 # number of attention heads
elayers: 4 # number of encoder layers
eunits: 1536 # number of encoder ff units
dlayers: 4 # number of decoder layers
dunits: 1536 # number of decoder ff units
positionwise_layer_type: conv1d # type of position-wise layer
positionwise_conv_kernel_size: 3 # kernel size of position wise conv layer
duration_predictor_layers: 2 # number of layers of duration predictor
duration_predictor_chans: 256 # number of channels of duration predictor
duration_predictor_kernel_size: 3 # filter size of duration predictor
    postnet_layers: 5                     # number of layers of postnet
postnet_filts: 5 # filter size of conv layers in postnet
postnet_chans: 256 # number of channels of conv layers in postnet
use_masking: True # whether to apply masking for padded part in loss calculation
use_scaled_pos_enc: True # whether to use scaled positional encoding
encoder_normalize_before: True # whether to perform layer normalization before the input
decoder_normalize_before: True # whether to perform layer normalization before the input
reduction_factor: 1 # reduction factor
init_type: xavier_uniform # initialization type
init_enc_alpha: 1.0 # initial value of alpha of encoder scaled position encoding
init_dec_alpha: 1.0 # initial value of alpha of decoder scaled position encoding
transformer_enc_dropout_rate: 0.2 # dropout rate for transformer encoder layer
transformer_enc_positional_dropout_rate: 0.2 # dropout rate for transformer encoder positional encoding
transformer_enc_attn_dropout_rate: 0.2 # dropout rate for transformer encoder attention layer
transformer_dec_dropout_rate: 0.2 # dropout rate for transformer decoder layer
transformer_dec_positional_dropout_rate: 0.2 # dropout rate for transformer decoder positional encoding
transformer_dec_attn_dropout_rate: 0.2 # dropout rate for transformer decoder attention layer
pitch_predictor_layers: 5 # number of conv layers in pitch predictor
pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor
    pitch_predictor_kernel_size: 5        # kernel size of conv layers in pitch predictor
pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor
pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch
pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch
stop_gradient_from_pitch_predictor: true # whether to stop the gradient from pitch predictor to encoder
energy_predictor_layers: 2 # number of conv layers in energy predictor
energy_predictor_chans: 256 # number of channels of conv layers in energy predictor
    energy_predictor_kernel_size: 3       # kernel size of conv layers in energy predictor
energy_predictor_dropout: 0.5 # dropout rate in energy predictor
energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy
energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy
stop_gradient_from_energy_predictor: false # whether to stop the gradient from energy predictor to encoder
###########################################################
# UPDATER SETTING #
###########################################################
updater:
use_masking: True # whether to apply masking for padded part in loss calculation
###########################################################
# OPTIMIZER SETTING #
###########################################################
optimizer:
optim: adam # optimizer type
learning_rate: 0.001 # learning rate
###########################################################
# TRAINING SETTING #
###########################################################
max_epoch: 1000
num_snapshots: 5
###########################################################
# OTHER SETTING #
###########################################################
seed: 10086
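The module code further down loads this file with yacs, roughly as in the minimal sketch below (the file name `default.yaml` matches the checkpoint assets; the prints are illustrative):

```python
# Load the FastSpeech2 config above into a yacs CfgNode
import yaml
from yacs.config import CfgNode

with open('default.yaml') as f:
    cfg = CfgNode(yaml.safe_load(f))

print(cfg.fs)          # 24000 -- sampling rate used when writing wav files
print(cfg.n_mels)      # 80    -- passed to FastSpeech2 as odim
print(cfg.model.adim)  # 384   -- attention dimension of the acoustic model
```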
<pad> 0
<unk> 1
a1 2
a2 3
a3 4
a4 5
a5 6
ai1 7
ai2 8
ai3 9
ai4 10
ai5 11
air2 12
air4 13
an1 14
an2 15
an3 16
an4 17
an5 18
ang1 19
ang2 20
ang3 21
ang4 22
ang5 23
anr1 24
anr3 25
anr4 26
ao1 27
ao2 28
ao3 29
ao4 30
ao5 31
aor3 32
aor4 33
ar2 34
ar3 35
ar4 36
b 37
c 38
ch 39
d 40
e1 41
e2 42
e3 43
e4 44
e5 45
ei1 46
ei2 47
ei3 48
ei4 49
ei5 50
en1 51
en2 52
en3 53
en4 54
en5 55
eng1 56
eng2 57
eng3 58
eng4 59
eng5 60
enr1 61
enr2 62
enr4 63
enr5 64
er2 65
er3 66
er4 67
er5 68
f 69
g 70
h 71
i1 72
i2 73
i3 74
i4 75
i5 76
ia1 77
ia2 78
ia3 79
ia4 80
ia5 81
ian1 82
ian2 83
ian3 84
ian4 85
ian5 86
iang1 87
iang2 88
iang3 89
iang4 90
iang5 91
iangr4 92
ianr1 93
ianr2 94
ianr3 95
iao1 96
iao2 97
iao3 98
iao4 99
iao5 100
iar1 101
iar3 102
ie1 103
ie2 104
ie3 105
ie4 106
ie5 107
ii1 108
ii2 109
ii3 110
ii4 111
ii5 112
iii1 113
iii2 114
iii3 115
iii4 116
iii5 117
iiir4 118
iir2 119
in1 120
in2 121
in3 122
in4 123
in5 124
ing1 125
ing2 126
ing3 127
ing4 128
ing5 129
ingr2 130
ingr3 131
inr4 132
io1 133
io5 134
iong1 135
iong2 136
iong3 137
iong4 138
iong5 139
iou1 140
iou2 141
iou3 142
iou4 143
iou5 144
iour1 145
ir1 146
ir2 147
ir3 148
ir4 149
ir5 150
j 151
k 152
l 153
m 154
n 155
o1 156
o2 157
o3 158
o4 159
o5 160
ong1 161
ong2 162
ong3 163
ong4 164
ong5 165
ongr4 166
ou1 167
ou2 168
ou3 169
ou4 170
ou5 171
our2 172
p 173
q 174
r 175
s 176
sh 177
sil 178
sp 179
spl 180
spn 181
t 182
u1 183
u2 184
u3 185
u4 186
u5 187
ua1 188
ua2 189
ua3 190
ua4 191
ua5 192
uai1 193
uai2 194
uai3 195
uai4 196
uai5 197
uair4 198
uan1 199
uan2 200
uan3 201
uan4 202
uan5 203
uang1 204
uang2 205
uang3 206
uang4 207
uang5 208
uanr1 209
uanr2 210
uei1 211
uei2 212
uei3 213
uei4 214
uei5 215
ueir1 216
ueir3 217
ueir4 218
uen1 219
uen2 220
uen3 221
uen4 222
uen5 223
ueng1 224
ueng2 225
ueng3 226
ueng4 227
uenr3 228
uenr4 229
uo1 230
uo2 231
uo3 232
uo4 233
uo5 234
uor2 235
uor3 236
ur3 237
ur4 238
v1 239
v2 240
v3 241
v4 242
v5 243
van1 244
van2 245
van3 246
van4 247
van5 248
vanr4 249
ve1 250
ve2 251
ve3 252
ve4 253
ve5 254
vn1 255
vn2 256
vn3 257
vn4 258
vn5 259
x 260
z 261
zh 262
, 263
。 264
? 265
! 266
<eos> 267
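Each line of this phone-to-id map is simply `<phone> <id>`; the module code below derives the model's vocabulary size from it, roughly as follows (file name per the checkpoint assets):

```python
# Parse phone_id_map.txt: one "<phone> <id>" pair per line
with open('phone_id_map.txt') as f:
    phn_id = [line.strip().split() for line in f]

vocab_size = len(phn_id)
print(vocab_size)  # 268 entries for this map, from <pad> (0) to <eos> (267)
```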
# This is the hyperparameter configuration file for Parallel WaveGAN.
# Please make sure this is adjusted for the CSMSC dataset. If you want to
# apply to the other dataset, you might need to carefully change some parameters.
# This configuration requires 12 GB GPU memory and takes ~3 days on RTX TITAN.
###########################################################
# FEATURE EXTRACTION SETTING #
###########################################################
fs: 24000 # Sampling rate.
n_fft: 2048 # FFT size. (in samples)
n_shift: 300 # Hop size. (in samples)
win_length: 1200 # Window length. (in samples)
# If set to null, it will be the same as fft_size.
window: "hann" # Window function.
n_mels: 80 # Number of mel basis.
fmin: 80 # Minimum freq in mel basis calculation.
fmax: 7600 # Maximum frequency in mel basis calculation.
# global_gain_scale: 1.0 # Will be multiplied to all of waveform.
trim_silence: false # Whether to trim the start and end of silence.
top_db: 60 # Need to tune carefully if the recording is not good.
trim_frame_length: 2048  # Frame size in trimming. (in samples)
trim_hop_length: 512     # Hop size in trimming. (in samples)
###########################################################
# GENERATOR NETWORK ARCHITECTURE SETTING #
###########################################################
generator_params:
in_channels: 1 # Number of input channels.
out_channels: 1 # Number of output channels.
kernel_size: 3 # Kernel size of dilated convolution.
layers: 30 # Number of residual block layers.
stacks: 3 # Number of stacks i.e., dilation cycles.
residual_channels: 64 # Number of channels in residual conv.
gate_channels: 128 # Number of channels in gated conv.
skip_channels: 64 # Number of channels in skip conv.
aux_channels: 80 # Number of channels for auxiliary feature conv.
# Must be the same as num_mels.
aux_context_window: 2 # Context window size for auxiliary feature.
# If set to 2, previous 2 and future 2 frames will be considered.
dropout: 0.0 # Dropout rate. 0.0 means no dropout applied.
bias: true # use bias in residual blocks
use_weight_norm: true # Whether to use weight norm.
# If set to true, it will be applied to all of the conv layers.
use_causal_conv: false # use causal conv in residual blocks and upsample layers
# upsample_net: "ConvInUpsampleNetwork" # Upsampling network architecture.
    upsample_scales: [4, 5, 3, 5]        # Upsampling scales. Product of these must be the same as hop size.
interpolate_mode: "nearest" # upsample net interpolate mode
    freq_axis_kernel_size: 1             # upsampling net: convolution kernel size in frequency axis
nonlinear_activation: null
nonlinear_activation_params: {}
###########################################################
# DISCRIMINATOR NETWORK ARCHITECTURE SETTING #
###########################################################
discriminator_params:
in_channels: 1 # Number of input channels.
out_channels: 1 # Number of output channels.
    kernel_size: 3        # Kernel size of conv layers.
    layers: 10            # Number of conv layers.
    conv_channels: 64     # Number of channels in conv layers.
bias: true # Whether to use bias parameter in conv.
use_weight_norm: true # Whether to use weight norm.
# If set to true, it will be applied to all of the conv layers.
nonlinear_activation: "LeakyReLU" # Nonlinear function after each conv.
nonlinear_activation_params: # Nonlinear function parameters
negative_slope: 0.2 # Alpha in LeakyReLU.
###########################################################
# STFT LOSS SETTING #
###########################################################
stft_loss_params:
fft_sizes: [1024, 2048, 512] # List of FFT size for STFT-based loss.
hop_sizes: [120, 240, 50] # List of hop size for STFT-based loss
win_lengths: [600, 1200, 240] # List of window length for STFT-based loss.
window: "hann" # Window function for STFT-based loss
###########################################################
# ADVERSARIAL LOSS SETTING #
###########################################################
lambda_adv: 4.0 # Loss balancing coefficient.
###########################################################
# DATA LOADER SETTING #
###########################################################
batch_size: 6 # Batch size.
batch_max_steps: 25500      # Length of each audio in batch. Make sure it is divisible by hop_size.
pin_memory: true            # Whether to pin memory in the DataLoader.
num_workers: 4              # Number of workers in the DataLoader.
remove_short_samples: true  # Whether to remove samples whose length is less than batch_max_steps.
allow_cache: true # Whether to allow cache in dataset. If true, it requires cpu memory.
###########################################################
# OPTIMIZER & SCHEDULER SETTING #
###########################################################
generator_optimizer_params:
epsilon: 1.0e-6 # Generator's epsilon.
weight_decay: 0.0 # Generator's weight decay coefficient.
generator_scheduler_params:
learning_rate: 0.0001 # Generator's learning rate.
step_size: 200000 # Generator's scheduler step size.
gamma: 0.5 # Generator's scheduler gamma.
# At each step size, lr will be multiplied by this parameter.
generator_grad_norm: 10 # Generator's gradient norm.
discriminator_optimizer_params:
epsilon: 1.0e-6 # Discriminator's epsilon.
weight_decay: 0.0 # Discriminator's weight decay coefficient.
discriminator_scheduler_params:
learning_rate: 0.00005 # Discriminator's learning rate.
step_size: 200000 # Discriminator's scheduler step size.
gamma: 0.5 # Discriminator's scheduler gamma.
# At each step size, lr will be multiplied by this parameter.
discriminator_grad_norm: 1 # Discriminator's gradient norm.
###########################################################
# INTERVAL SETTING #
###########################################################
discriminator_train_start_steps: 100000 # Number of steps to start to train discriminator.
train_max_steps: 400000 # Number of training steps.
save_interval_steps: 5000 # Interval steps to save checkpoint.
eval_interval_steps: 1000 # Interval steps to evaluate the network.
###########################################################
# OTHER SETTING #
###########################################################
num_save_intermediate_results: 4 # Number of results to be saved as intermediate results.
num_snapshots: 10 # max number of snapshots to keep while training
seed: 42 # random seed for paddle, random, and np.random
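Two consistency constraints are buried in the comments above: the product of the generator's upsampling scales must equal the hop size, and `batch_max_steps` must be divisible by it. A quick check (illustrative, not part of the repo):

```python
import numpy as np

upsample_scales = [4, 5, 3, 5]
n_shift = 300            # hop size from the feature-extraction section
batch_max_steps = 25500

assert int(np.prod(upsample_scales)) == n_shift  # 4*5*3*5 == 300
assert batch_max_steps % n_shift == 0            # clip length is a whole number of frames
print(batch_max_steps // n_shift)                # 85 mel frames per training clip
```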
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
from pathlib import Path
from typing import List
import numpy as np
import paddle
from paddlehub.env import MODULE_HOME
from paddlehub.module.module import moduleinfo, serving
from paddlehub.utils.log import logger
from parakeet.frontend.zh_frontend import Frontend
from parakeet.models.fastspeech2 import FastSpeech2
from parakeet.models.fastspeech2 import FastSpeech2Inference
from parakeet.models.parallel_wavegan import PWGGenerator
from parakeet.models.parallel_wavegan import PWGInference
from parakeet.modules.normalizer import ZScore
import soundfile as sf
from yacs.config import CfgNode
import yaml
@moduleinfo(name="fastspeech2_baker", version="1.0.0", summary="", author="Baidu", author_email="", type="audio/tts")
class FastSpeech(paddle.nn.Layer):
def __init__(self, output_dir='./wavs'):
super(FastSpeech, self).__init__()
fastspeech2_res_dir = os.path.join(MODULE_HOME, 'fastspeech2_baker', 'assets/fastspeech2_nosil_baker_ckpt_0.4')
pwg_res_dir = os.path.join(MODULE_HOME, 'fastspeech2_baker', 'assets/pwg_baker_ckpt_0.4')
phones_dict = os.path.join(fastspeech2_res_dir, 'phone_id_map.txt')
with open(phones_dict, "r") as f:
phn_id = [line.strip().split() for line in f.readlines()]
vocab_size = len(phn_id)
# fastspeech2
fastspeech2_config = os.path.join(fastspeech2_res_dir, 'default.yaml')
with open(fastspeech2_config) as f:
fastspeech2_config = CfgNode(yaml.safe_load(f))
self.samplerate = fastspeech2_config.fs
fastspeech2_checkpoint = os.path.join(fastspeech2_res_dir, 'snapshot_iter_76000.pdz')
model = FastSpeech2(idim=vocab_size, odim=fastspeech2_config.n_mels, **fastspeech2_config["model"])
model.set_state_dict(paddle.load(fastspeech2_checkpoint)["main_params"])
logger.info('Load fastspeech2 params from %s' % os.path.abspath(fastspeech2_checkpoint))
model.eval()
# vocoder
pwg_config = os.path.join(pwg_res_dir, 'pwg_default.yaml')
with open(pwg_config) as f:
pwg_config = CfgNode(yaml.safe_load(f))
pwg_checkpoint = os.path.join(pwg_res_dir, 'pwg_snapshot_iter_400000.pdz')
vocoder = PWGGenerator(**pwg_config["generator_params"])
vocoder.set_state_dict(paddle.load(pwg_checkpoint)["generator_params"])
logger.info('Load vocoder params from %s' % os.path.abspath(pwg_checkpoint))
vocoder.remove_weight_norm()
vocoder.eval()
# frontend
self.frontend = Frontend(phone_vocab_path=phones_dict)
# stat
fastspeech2_stat = os.path.join(fastspeech2_res_dir, 'speech_stats.npy')
stat = np.load(fastspeech2_stat)
mu, std = stat
mu = paddle.to_tensor(mu)
std = paddle.to_tensor(std)
fastspeech2_normalizer = ZScore(mu, std)
pwg_stat = os.path.join(pwg_res_dir, 'pwg_stats.npy')
stat = np.load(pwg_stat)
mu, std = stat
mu = paddle.to_tensor(mu)
std = paddle.to_tensor(std)
pwg_normalizer = ZScore(mu, std)
# inference
self.fastspeech2_inference = FastSpeech2Inference(fastspeech2_normalizer, model)
self.pwg_inference = PWGInference(pwg_normalizer, vocoder)
self.output_dir = Path(output_dir)
self.output_dir.mkdir(parents=True, exist_ok=True)
def forward(self, text: str):
wav = None
input_ids = self.frontend.get_input_ids(text, merge_sentences=True)
phone_ids = input_ids["phone_ids"]
for part_phone_ids in phone_ids:
with paddle.no_grad():
mel = self.fastspeech2_inference(part_phone_ids)
temp_wav = self.pwg_inference(mel)
if wav is None:
wav = temp_wav
else:
wav = paddle.concat([wav, temp_wav])
return wav
@serving
def generate(self, sentences: List[str], device='cpu'):
assert isinstance(sentences, list) and isinstance(sentences[0], str), \
'Input data should be List[str], but got {}'.format(type(sentences))
paddle.set_device(device)
wav_files = []
for i, sentence in enumerate(sentences):
wav = self(sentence)
wav_file = str(self.output_dir.absolute() / (str(i + 1) + ".wav"))
sf.write(wav_file, wav.numpy(), samplerate=self.samplerate)
wav_files.append(wav_file)
logger.info('{} wave files have been generated in {}'.format(len(sentences), self.output_dir.absolute()))
return wav_files
git+https://github.com/PaddlePaddle/Parakeet@8040cb0#egg=paddle-parakeet
# fastspeech2_ljspeech
|Model Name|fastspeech2_ljspeech|
| :--- | :---: |
|Category|Speech / Text-to-Speech|
|Network|FastSpeech2|
|Dataset|LJSpeech-1.1|
|Fine-tuning Supported|No|
|Model Size|425MB|
|Latest Update|2021-10-20|
|Data Metrics|-|
## I. Basic Information

### Model Introduction

FastSpeech2 is a text-to-speech (TTS) model proposed in 2020 by Microsoft Research Asia and the Microsoft Azure Speech team together with Zhejiang University. As an improved version of FastSpeech, it removes FastSpeech's reliance on a teacher-student knowledge-distillation framework, which made training complicated, and fixes the information loss of FastSpeech's training targets relative to real speech.

The FastSpeech2 architecture is shown below. It keeps the Feed-Forward Transformer (FFT) blocks introduced in FastSpeech, but adds a Variance Adaptor between the phoneme encoder and the mel-spectrogram decoder. The adaptor injects additional sources of variation in speech, such as duration, pitch, and energy (spectral magnitude), to ease the one-to-many mapping problem in speech synthesis.

<p align="center">
<img src="https://paddlespeech.bj.bcebos.com/Parakeet/docs/images/fastspeech2.png" hspace='10'/> <br />
</p>

Parallel WaveGAN is a fast, compact waveform-generation method built on a distillation-free generative adversarial network. It trains a non-autoregressive WaveNet by jointly optimizing multi-resolution spectrogram losses and an adversarial loss, which lets it capture the time-frequency distribution of real speech waveforms effectively. The structure of Parallel WaveGAN is shown below:

<p align="center">
<img src="https://paddlespeech.bj.bcebos.com/Parakeet/docs/images/pwg.png" hspace='10'/> <br />
</p>

fastspeech2_ljspeech uses FastSpeech2 as the acoustic model and Parallel WaveGAN as the vocoder, pre-trained on [The LJ Speech Dataset](https://keithito.com/LJ-Speech-Dataset/); it can be used directly to synthesize audio.

For more details, see:
- [FastSpeech 2: Fast and High-Quality End-to-End Text-to-Speech](https://arxiv.org/abs/2006.04558)
- [MSRA and Zhejiang University announce FastSpeech2 (Chinese)](https://www.msra.cn/zh-cn/news/features/fastspeech2)
- [Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram](https://arxiv.org/abs/1910.11480)
## II. Installation

- ### 1. Environment Dependencies

  - paddlepaddle >= 2.1.0

  - paddlehub >= 2.1.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

- ### 2. Installation

  - ```shell
    $ hub install fastspeech2_ljspeech
    ```
  - If you run into problems during installation, see: [Windows quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
    | [Linux quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [macOS quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Module API Prediction

- ### 1. Prediction Code Example

  - ```python
    import paddlehub as hub

    # Text to be synthesized
    sentences = ['The quick brown fox jumps over a lazy dog.']

    model = hub.Module(
        name='fastspeech2_ljspeech',
        version='1.0.0')
    wav_files = model.generate(sentences)

    # Print the paths of the synthesized audio files
    print(wav_files)
    ```

  - For details, see the PaddleHub demo:
    - [Text-to-speech](../../../../demo/text_to_speech)
- ### 2. API

  - ```python
    def __init__(output_dir)
    ```

    - Create a Module object (dygraph version).

    - **Parameters**

      - `output_dir`: output directory for the synthesized audio files.

  - ```python
    def generate(
        sentences,
        device='cpu',
    )
    ```

    - Synthesize the input texts into audio files and save them to the output directory.

    - **Parameters**

      - `sentences`: texts to synthesize, of type `List[str]`.
      - `device`: device used for prediction; defaults to `cpu`. Set it to `gpu` to predict on GPU.

    - **Return**

      - `wav_files`: `List[str]`, the paths of the synthesized audio files.
## IV. Server Deployment

- PaddleHub Serving can deploy an online text-to-speech service.

- ### Step 1: Start PaddleHub Serving

  - ```shell
    $ hub serving start -m fastspeech2_ljspeech
    ```

  - This deploys the text-to-speech API service, listening on port 8866 by default.

  - **NOTE:** To predict on GPU, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it need not be set.

- ### Step 2: Send a prediction request

  - With the server configured, the few lines of code below send a prediction request and retrieve the result:
  - ```python
    import requests
    import json

    # Text to be synthesized
    sentences = [
        'The quick brown fox jumps over a lazy dog.',
        'Today is a good day!',
    ]

    # Pass the texts under the key expected by the prediction method, here "sentences"
    data = {"sentences": sentences}
    # Send a POST request; the content type must be JSON, and the IP in the url
    # should be replaced with the address of the serving machine
    url = "http://127.0.0.1:8866/predict/fastspeech2_ljspeech"
    # Set the POST request headers to application/json
    headers = {"Content-Type": "application/json"}
    r = requests.post(url=url, headers=headers, data=json.dumps(data))

    print(r.json())
    ```
## V. Release Note

* 1.0.0

  First release

  ```shell
  $ hub install fastspeech2_ljspeech
  ```
git+https://github.com/PaddlePaddle/Parakeet@8040cb0#egg=paddle-parakeet
import base64
import cv2
import numpy as np
def base64_to_cv2(b64str):
    # Decode a base64-encoded string into an OpenCV BGR image array
    data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)  # np.fromstring is deprecated
    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
    return data
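The helper above decodes a base64 payload into an image; a serving client usually needs the inverse as well. A sketch of that counterpart (not part of this file):

```python
def cv2_to_base64(image):
    # Encode an OpenCV image as a base64 JPEG string (inverse of base64_to_cv2)
    ok, buf = cv2.imencode('.jpg', image)
    assert ok, 'image encoding failed'
    return base64.b64encode(buf.tobytes()).decode('utf8')
```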