Commit fdf1af8d, authored by: C chenjian

Merge branch 'develop' of https://github.com/rainyfly/PaddleHub into face_parse

@@ -231,3 +231,4 @@ We welcome you to contribute code to PaddleHub, and thank you for your feedback.
* Many thanks to [zl1271](https://github.com/zl1271) for fixing serving docs typo
* Many thanks to [AK391](https://github.com/AK391) for adding the webdemo of UGATIT and deoldify models in Hugging Face spaces
* Many thanks to [itegel](https://github.com/itegel) for fixing quick start docs typo
* Many thanks to [AK391](https://github.com/AK391) for adding the webdemo of Photo2Cartoon model in Hugging Face spaces
@@ -247,3 +247,4 @@ print(results)
* Many thanks to [zl1271](https://github.com/zl1271) for fixing a typo in the serving docs
* Many thanks to [AK391](https://github.com/AK391) for adding the web demos of the UGATIT and deoldify models in Hugging Face spaces
* Many thanks to [itegel](https://github.com/itegel) for fixing a typo in the quick start docs
* Many thanks to [AK391](https://github.com/AK391) for adding the web demo of the Photo2Cartoon model in Hugging Face spaces
@@ -50,6 +50,8 @@
**UGATIT Selfie2anime Huggingface Web Demo**: Integrated to [Huggingface Spaces](https://huggingface.co/spaces) with [Gradio](https://github.com/gradio-app/gradio). See demo: [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/akhaliq/U-GAT-IT-selfie2anime)
**Photo2Cartoon Huggingface Web Demo**: Integrated to [Huggingface Spaces](https://huggingface.co/spaces) with [Gradio](https://github.com/gradio-app/gradio). See demo: [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/akhaliq/photo2cartoon)
### Object Detection
- Pedestrian detection, vehicle detection, and more industrial-grade ultra-large-scale pretrained models are provided.
# u2_conformer_wenetspeech

|Model Name|u2_conformer_wenetspeech|
| :--- | :---: |
|Category|Speech - Automatic Speech Recognition|
|Network|Conformer|
|Dataset|WenetSpeech|
|Fine-tuning supported|No|
|Model Size|494MB|
|Latest update date|2021-12-10|
|Data metrics|Chinese CER 0.087|

## I. Basic Information

### Module Introduction

The U2 Conformer model is an end-to-end speech recognition model that supports both English and Chinese. u2_conformer_wenetspeech combines a Conformer encoder with a Transformer decoder. Decoding is done in two passes: a CTC prefix beam search produces first-pass hypotheses, which the attention decoder then rescores to obtain the final result (a minimal illustrative sketch of this two-pass idea follows the reference list below).

u2_conformer_wenetspeech is pretrained on [WenetSpeech](https://wenet-e2e.github.io/WenetSpeech/), an open-source Mandarin speech dataset, and achieves a CER of 0.087 on its DEV set.
<p align="center">
<img src="https://paddlehub.bj.bcebos.com/paddlehub-img/conformer.png" hspace='10'/> <br />
</p>
<p align="center">
<img src="https://paddlehub.bj.bcebos.com/paddlehub-img/u2_conformer.png" hspace='10'/> <br />
</p>
For more details, please refer to:
- [Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition](https://arxiv.org/abs/2012.05481)
- [Conformer: Convolution-augmented Transformer for Speech Recognition](https://arxiv.org/abs/2005.08100)
- [WenetSpeech: A 10000+ Hours Multi-domain Mandarin Corpus for Speech Recognition](https://arxiv.org/abs/2110.03370)
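
The following is a minimal, illustrative sketch of the two-pass idea described above (it is not the module's actual decoding code, and the function and variable names are made up for illustration): the CTC branch proposes an n-best list with prefix beam search scores, and the attention decoder rescores each hypothesis; the final transcript maximizes a weighted combination of the two scores.

```python
def two_pass_decode(ctc_nbest, attention_score, ctc_weight=0.5):
    """Pick the best hypothesis from a CTC n-best list after attention rescoring.

    ctc_nbest: list of (hypothesis, ctc_log_score) pairs from CTC prefix beam search.
    attention_score: callable mapping a hypothesis to its attention-decoder log score.
    """
    best_hyp, best_score = None, float('-inf')
    for hyp, ctc_score in ctc_nbest:
        # Second pass: combine the first-pass CTC score with the attention rescoring.
        score = ctc_weight * ctc_score + (1.0 - ctc_weight) * attention_score(hyp)
        if score > best_score:
            best_hyp, best_score = hyp, score
    return best_hyp
```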
## II. Installation

- ### 1. System dependencies

  - libsndfile
    - Linux
      ```shell
      $ sudo apt-get install libsndfile1
      # or
      $ sudo yum install libsndfile
      ```
    - macOS
      ```
      $ brew install libsndfile
      ```

- ### 2. Environment dependencies

  - paddlepaddle >= 2.2.0
  - paddlehub >= 2.1.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

- ### 3. Installation

  - ```shell
    $ hub install u2_conformer_wenetspeech
    ```
  - If you run into problems during installation, see: [Windows quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
    | [Linux quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [macOS quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Model API Prediction

- ### 1. Prediction code example

  ```python
  import paddlehub as hub

  # Path to a Chinese speech audio file in wav format, sampled at 16 kHz
  wav_file = '/PATH/TO/AUDIO'

  model = hub.Module(
      name='u2_conformer_wenetspeech',
      version='1.0.0')
  text = model.speech_recognize(wav_file)

  print(text)
  ```
- ### 2. API

  - ```python
    def check_audio(audio_file)
    ```
    - Checks whether the input audio is a wav file with a 16000 Hz sample rate; if not, it is resampled to 16000 Hz and the new audio file is saved to the same directory (see the sketch below).

    - **Parameters**
      - `audio_file`: path to a local audio file (*.wav), e.g. `/path/to/input.wav`
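
  - For illustration only, a minimal sketch of calling `check_audio` directly (the audio path is a placeholder; the resampled-copy behavior follows the module source included later in this commit):

  - ```python
    import paddlehub as hub

    model = hub.Module(name='u2_conformer_wenetspeech', version='1.0.0')

    # If '/PATH/TO/AUDIO.wav' is not sampled at 16000 Hz, a resampled copy such as
    # '/PATH/TO/AUDIO_16k.wav' is written next to it and its path is returned;
    # otherwise the original path is returned unchanged.
    wav_16k = model.check_audio('/PATH/TO/AUDIO.wav')
    print(wav_16k)
    ```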
  - ```python
    def speech_recognize(
        audio_file,
        device='cpu',
    )
    ```
    - Transcribes the input audio into text.

    - **Parameters**
      - `audio_file`: path to a local audio file (*.wav), e.g. `/path/to/input.wav`
      - `device`: device used for prediction, `cpu` by default; set it to `gpu` to predict on GPU.

    - **Returns**
      - `text`: str, the recognized text of the input audio.
## IV. Server Deployment

- PaddleHub Serving can deploy an online speech recognition service.

- ### Step 1: Start PaddleHub Serving

  - ```shell
    $ hub serving start -m u2_conformer_wenetspeech
    ```
  - This deploys the speech recognition API service; the default port is 8866.

  - **NOTE:** To predict on GPU, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it does not need to be set.

- ### Step 2: Send a prediction request

  - With the server configured, the few lines below send a prediction request and fetch the result

  - ```python
    import requests
    import json

    # Path of the audio to be recognized; make sure it is accessible from the machine serving the model
    file = '/path/to/input.wav'

    # Pass the parameters of the prediction method as keys; here the key is "audio_file"
    data = {"audio_file": file}

    # Send a POST request; the content type should be JSON, and the IP in the URL should be the serving machine's IP
    url = "http://127.0.0.1:8866/predict/u2_conformer_wenetspeech"

    # Set the POST request headers to application/json
    headers = {"Content-Type": "application/json"}

    r = requests.post(url=url, headers=headers, data=json.dumps(data))
    print(r.json())
    ```
## V. Release Note

* 1.0.0

  First release
```shell
$ hub install u2_conformer_wenetspeech
```
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import paddle
from paddleaudio import load, save_wav
from paddlespeech.cli import ASRExecutor
from paddlehub.module.module import moduleinfo, serving
from paddlehub.utils.log import logger
@moduleinfo(
name="u2_conformer_wenetspeech", version="1.0.0", summary="", author="Wenet", author_email="", type="audio/asr")
class U2Conformer(paddle.nn.Layer):
def __init__(self):
super(U2Conformer, self).__init__()
self.asr_executor = ASRExecutor()
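        # ASRExecutor (from paddlespeech.cli) performs the actual recognition; the
        # keyword arguments below are forwarded to it on every call.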
self.asr_kw_args = {
'model': 'conformer_wenetspeech',
'lang': 'zh',
'sample_rate': 16000,
'config': None, # Set `config` and `ckpt_path` to None to use pretrained model.
'ckpt_path': None,
}
@staticmethod
def check_audio(audio_file):
assert audio_file.endswith('.wav'), 'Input file must be a wave file `*.wav`.'
sig, sample_rate = load(audio_file)
if sample_rate != 16000:
sig, _ = load(audio_file, 16000)
audio_file_16k = audio_file[:audio_file.rindex('.')] + '_16k.wav'
logger.info('Resampling to 16000 sample rate to new audio file: {}'.format(audio_file_16k))
save_wav(sig, 16000, audio_file_16k)
return audio_file_16k
else:
return audio_file
@serving
def speech_recognize(self, audio_file, device='cpu'):
assert os.path.isfile(audio_file), 'File not exists: {}'.format(audio_file)
audio_file = self.check_audio(audio_file)
text = self.asr_executor(audio_file=audio_file, device=device, **self.asr_kw_args)
return text
# deepvoice3_ljspeech

|Model Name|deepvoice3_ljspeech|
| :--- | :---: |
|Category|Speech - Text-to-Speech|
|Network|DeepVoice3|
|Dataset|LJSpeech-1.1|
|Fine-tuning supported|No|
|Model Size|58MB|
|Latest update date|2020-10-27|
|Data metrics|-|

## I. Basic Information

### Module Introduction

Deep Voice 3 is an end-to-end TTS model released by Baidu Research in 2017 (the paper was accepted at ICLR 2018). It is a seq2seq model based on convolutional neural networks and attention; since it contains no recurrent networks, it can be trained in parallel and is much faster than RNN-based models. Deep Voice 3 can learn the characteristics of multiple speakers and can be paired with several vocoders. deepvoice3_ljspeech is an English TTS model pretrained on the LJSpeech English speech dataset; it supports prediction only.

<p align="center">
<img src="https://raw.githubusercontent.com/PaddlePaddle/Parakeet/release/v0.1/examples/deepvoice3/images/model_architecture.png" hspace='10'/> <br/>
</p>

For more details, please refer to the paper [Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning](https://arxiv.org/abs/1710.07654); the implementation is available in [Parakeet](https://github.com/PaddlePaddle/Parakeet).

## II. Installation

- ### 1. System dependencies

  - libsndfile

    For Ubuntu users, run:
    ```
    sudo apt-get install libsndfile1
    ```
    For CentOS users, run:
    ```
    sudo yum install libsndfile
    ```

- ### 2. Environment dependencies

  - 2.0.0 > paddlepaddle >= 1.8.2

  - 2.0.0 > paddlehub >= 1.7.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

- ### 3. Installation

  - ```shell
    $ hub install deepvoice3_ljspeech
    ```
  - If you run into problems during installation, see: [Windows quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
    | [Linux quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [macOS quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)

## III. Model API Prediction

- ### 1. Command-line prediction

  - ```shell
    $ hub run deepvoice3_ljspeech --input_text='Simple as this proposition is, it is necessary to be stated' --use_gpu True --vocoder griffin-lim
    ```
  - This invokes the text-to-speech model from the command line; for more details see [PaddleHub command-line instructions](https://github.com/shinichiye/PaddleHub/blob/release/v2.1/docs/docs_ch/tutorial/cmd_usage.rst)

- ### 2. Prediction code example

  - ```python
    import paddlehub as hub
    import soundfile as sf

    # Load deepvoice3_ljspeech module.
    module = hub.Module(name="deepvoice3_ljspeech")

    # Synthesize audio for the input texts.
    test_texts = ['Simple as this proposition is, it is necessary to be stated',
                  'Parakeet stands for Paddle PARAllel text-to-speech toolkit']
    wavs, sample_rate = module.synthesize(texts=test_texts)
    for index, wav in enumerate(wavs):
        sf.write(f"{index}.wav", wav, sample_rate)
    ```

- ### 3. API

  - ```python
    def synthesize(texts, use_gpu=False, vocoder="griffin-lim"):
    ```
    - Prediction API that synthesizes audio waveforms from the input texts.

    - **Parameters**
      - texts (list\[str\]): texts to be synthesized;
      - use\_gpu (bool): whether to use GPU; **if you use GPU, set the CUDA\_VISIBLE\_DEVICES environment variable first**;
      - vocoder: vocoder to use, either "griffin-lim" or "waveflow"

    - **Returns**
      - wavs (list): synthesis results; each element is the audio waveform of the corresponding input text and can be further processed or saved with `soundfile.write`.
      - sample\_rate (int): sample rate of the synthesized audio.

## IV. Server Deployment

- PaddleHub Serving can deploy an online text-to-speech service, and the API can be used by online web applications.

- ### Step 1: Start PaddleHub Serving

  - Run the start command:
  - ```shell
    $ hub serving start -m deepvoice3_ljspeech
    ```
  - This deploys the text-to-speech API service; the default port is 8866.

  - **NOTE:** To predict on GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it does not need to be set.

- ### Step 2: Send a prediction request

  - With the server configured, the few lines below send a prediction request and fetch the result

  - ```python
    import requests
    import json

    import soundfile as sf

    # Send an HTTP request
    data = {'texts':['Simple as this proposition is, it is necessary to be stated',
                     'Parakeet stands for Paddle PARAllel text-to-speech toolkit'],
            'use_gpu':False}
    headers = {"Content-type": "application/json"}
    url = "http://127.0.0.1:8866/predict/deepvoice3_ljspeech"
    r = requests.post(url=url, headers=headers, data=json.dumps(data))

    # Save the results
    result = r.json()["results"]
    wavs = result["wavs"]
    sample_rate = result["sample_rate"]
    for index, wav in enumerate(wavs):
        sf.write(f"{index}.wav", wav, sample_rate)
    ```

## V. Release Note

* 1.0.0

  First release

  ```shell
  $ hub install deepvoice3_ljspeech
  ```
# fastspeech_ljspeech

|Model Name|fastspeech_ljspeech|
| :--- | :---: |
|Category|Speech - Text-to-Speech|
|Network|FastSpeech|
|Dataset|LJSpeech-1.1|
|Fine-tuning supported|No|
|Model Size|320MB|
|Latest update date|2020-10-27|
|Data metrics|-|

## I. Basic Information

### Module Introduction

FastSpeech is a feed-forward network based on the Transformer. The authors extract attention alignments from an encoder-decoder teacher model to predict phoneme durations, and a length regulator expands the text sequence to match the length of the target mel-spectrogram, so that mel-spectrograms can be generated in parallel. The model largely eliminates word skipping and repetition in hard cases, allows the speaking speed to be adjusted smoothly and, most importantly, greatly speeds up mel-spectrogram generation. fastspeech_ljspeech is an English TTS model pretrained on the LJSpeech English speech dataset; it supports prediction only.

<p align="center">
<img src="https://raw.githubusercontent.com/PaddlePaddle/Parakeet/release/v0.1/examples/fastspeech/images/model_architecture.png" hspace='10'/> <br/>
</p>

For more details, please refer to the paper [FastSpeech: Fast, Robust and Controllable Text to Speech](https://arxiv.org/abs/1905.09263); the implementation is available in [Parakeet](https://github.com/PaddlePaddle/Parakeet).

## II. Installation

- ### 1. System dependencies

  - libsndfile

    For Ubuntu users, run:
    ```
    sudo apt-get install libsndfile1
    ```
    For CentOS users, run:
    ```
    sudo yum install libsndfile
    ```

- ### 2. Environment dependencies

  - 2.0.0 > paddlepaddle >= 1.8.2

  - 2.0.0 > paddlehub >= 1.7.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

- ### 3. Installation

  - ```shell
    $ hub install fastspeech_ljspeech
    ```
  - If you run into problems during installation, see: [Windows quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
    | [Linux quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [macOS quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)

## III. Model API Prediction

- ### 1. Command-line prediction

  - ```shell
    $ hub run fastspeech_ljspeech --input_text='Simple as this proposition is, it is necessary to be stated' --use_gpu True --vocoder griffin-lim
    ```
  - This invokes the text-to-speech model from the command line; for more details see [PaddleHub command-line instructions](https://github.com/shinichiye/PaddleHub/blob/release/v2.1/docs/docs_ch/tutorial/cmd_usage.rst)

- ### 2. Prediction code example

  - ```python
    import paddlehub as hub
    import soundfile as sf

    # Load fastspeech_ljspeech module.
    module = hub.Module(name="fastspeech_ljspeech")

    # Synthesize audio for the input texts.
    test_texts = ['Simple as this proposition is, it is necessary to be stated',
                  'Parakeet stands for Paddle PARAllel text-to-speech toolkit']
    wavs, sample_rate = module.synthesize(texts=test_texts)
    for index, wav in enumerate(wavs):
        sf.write(f"{index}.wav", wav, sample_rate)
    ```

- ### 3. API

  - ```python
    def synthesize(texts, use_gpu=False, speed=1.0, vocoder="griffin-lim"):
    ```
    - Prediction API that synthesizes audio waveforms from the input texts.

    - **Parameters**
      - texts (list\[str\]): texts to be synthesized;
      - use\_gpu (bool): whether to use GPU; **if you use GPU, set the CUDA\_VISIBLE\_DEVICES environment variable first**;
      - speed (float): speech speed; 1.0 means the original speed.
      - vocoder: vocoder to use, either "griffin-lim" or "waveflow"

    - **Returns**
      - wavs (list): synthesis results; each element is the audio waveform of the corresponding input text and can be further processed or saved with `soundfile.write`.
      - sample\_rate (int): sample rate of the synthesized audio.

## IV. Server Deployment

- PaddleHub Serving can deploy an online text-to-speech service, and the API can be used by online web applications.

- ### Step 1: Start PaddleHub Serving

  - Run the start command:
  - ```shell
    $ hub serving start -m fastspeech_ljspeech
    ```
  - This deploys the text-to-speech API service; the default port is 8866.

  - **NOTE:** To predict on GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it does not need to be set.

- ### Step 2: Send a prediction request

  - With the server configured, the few lines below send a prediction request and fetch the result

  - ```python
    import requests
    import json

    import soundfile as sf

    # Send an HTTP request
    data = {'texts':['Simple as this proposition is, it is necessary to be stated',
                     'Parakeet stands for Paddle PARAllel text-to-speech toolkit'],
            'use_gpu':False}
    headers = {"Content-type": "application/json"}
    url = "http://127.0.0.1:8866/predict/fastspeech_ljspeech"
    r = requests.post(url=url, headers=headers, data=json.dumps(data))

    # Save the results
    result = r.json()["results"]
    wavs = result["wavs"]
    sample_rate = result["sample_rate"]
    for index, wav in enumerate(wavs):
        sf.write(f"{index}.wav", wav, sample_rate)
    ```

## V. Release Note

* 1.0.0

  First release

  ```shell
  $ hub install fastspeech_ljspeech
  ```
# transformer_tts_ljspeech

|Model Name|transformer_tts_ljspeech|
| :--- | :---: |
|Category|Speech - Text-to-Speech|
|Network|Transformer|
|Dataset|LJSpeech-1.1|
|Fine-tuning supported|No|
|Model Size|54MB|
|Latest update date|2020-10-27|
|Data metrics|-|

## I. Basic Information

### Module Introduction

TransformerTTS is an end-to-end text-to-speech model built on the Transformer architecture; it combines ideas from the Transformer and Tacotron2 and achieves satisfying results. Since the recurrent connections of RNNs are removed, the decoder inputs can be provided in parallel for parallel training, which greatly speeds up model training. transformer_tts_ljspeech is an English TTS model pretrained on the LJSpeech English speech dataset; it supports prediction only.

<p align="center">
<img src="https://raw.githubusercontent.com/PaddlePaddle/Parakeet/release/v0.1/examples/transformer_tts/images/model_architecture.jpg" hspace='10'/> <br/>
</p>

For more details, please refer to the paper [Neural Speech Synthesis with Transformer Network](https://arxiv.org/abs/1809.08895); the implementation is available in [Parakeet](https://github.com/PaddlePaddle/Parakeet).

## II. Installation

- ### 1. System dependencies

  - libsndfile

    For Ubuntu users, run:
    ```
    sudo apt-get install libsndfile1
    ```
    For CentOS users, run:
    ```
    sudo yum install libsndfile
    ```

- ### 2. Environment dependencies

  - 2.0.0 > paddlepaddle >= 1.8.2

  - 2.0.0 > paddlehub >= 1.7.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

- ### 3. Installation

  - ```shell
    $ hub install transformer_tts_ljspeech
    ```
  - If you run into problems during installation, see: [Windows quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
    | [Linux quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [macOS quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)

## III. Model API Prediction

- ### 1. Command-line prediction

  - ```shell
    $ hub run transformer_tts_ljspeech --input_text="Life was like a box of chocolates, you never know what you're gonna get." --use_gpu True --vocoder griffin-lim
    ```
  - This invokes the text-to-speech model from the command line; for more details see [PaddleHub command-line instructions](https://github.com/shinichiye/PaddleHub/blob/release/v2.1/docs/docs_ch/tutorial/cmd_usage.rst)

- ### 2. Prediction code example

  - ```python
    import paddlehub as hub
    import soundfile as sf

    # Load transformer_tts_ljspeech module.
    module = hub.Module(name="transformer_tts_ljspeech")

    # Synthesize audio for the input texts.
    test_texts = ["Life was like a box of chocolates, you never know what you're gonna get."]
    wavs, sample_rate = module.synthesize(texts=test_texts, use_gpu=True, vocoder="waveflow")
    for index, wav in enumerate(wavs):
        sf.write(f"{index}.wav", wav, sample_rate)
    ```

- ### 3. API

  - ```python
    def synthesize(texts, use_gpu=False, vocoder="griffin-lim"):
    ```
    - Prediction API that synthesizes audio waveforms from the input texts.

    - **Parameters**
      - texts (list\[str\]): texts to be synthesized;
      - use\_gpu (bool): whether to use GPU; **if you use GPU, set the CUDA\_VISIBLE\_DEVICES environment variable first**;
      - vocoder: vocoder to use, either "griffin-lim" or "waveflow"

    - **Returns**
      - wavs (list): synthesis results; each element is the audio waveform of the corresponding input text and can be further processed or saved with `soundfile.write`.
      - sample\_rate (int): sample rate of the synthesized audio.

## IV. Server Deployment

- PaddleHub Serving can deploy an online text-to-speech service, and the API can be used by online web applications.

- ### Step 1: Start PaddleHub Serving

  - Run the start command:
  - ```shell
    $ hub serving start -m transformer_tts_ljspeech
    ```
  - This deploys the text-to-speech API service; the default port is 8866.

  - **NOTE:** To predict on GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it does not need to be set.

- ### Step 2: Send a prediction request

  - With the server configured, the few lines below send a prediction request and fetch the result

  - ```python
    import requests
    import json

    import soundfile as sf

    # Send an HTTP request
    data = {'texts':['Simple as this proposition is, it is necessary to be stated',
                     'Parakeet stands for Paddle PARAllel text-to-speech toolkit'],
            'use_gpu':False}
    headers = {"Content-type": "application/json"}
    url = "http://127.0.0.1:8866/predict/transformer_tts_ljspeech"
    r = requests.post(url=url, headers=headers, data=json.dumps(data))

    # Save the results
    result = r.json()["results"]
    wavs = result["wavs"]
    sample_rate = result["sample_rate"]
    for index, wav in enumerate(wavs):
        sf.write(f"{index}.wav", wav, sample_rate)
    ```

## V. Release Note

* 1.0.0

  First release

  ```shell
  $ hub install transformer_tts_ljspeech
  ```
# ge2e_fastspeech2_pwgan

|Model Name|ge2e_fastspeech2_pwgan|
| :--- | :---: |
|Category|Speech - Voice Cloning|
|Network|FastSpeech2|
|Dataset|AISHELL-3|
|Fine-tuning supported|No|
|Model Size|462MB|
|Latest update date|2021-12-17|
|Data metrics|-|

## I. Basic Information

### Module Introduction

Voice cloning synthesizes speech for a given text with a specific timbre, so that the generated audio carries the characteristics of the target speaker.

When training a voice cloning model, the target-timbre recording is fed to a Speaker Encoder, which extracts the speaker characteristics (timbre) of the recording as a Speaker Embedding. When the model then learns to re-synthesize speech with this timbre, the speaker embedding is added as an extra condition alongside the input text.

At prediction time, a new recording of the target timbre is fed to the Speaker Encoder to extract its speaker embedding, so that, given a piece of text and a target-timbre recording, the model generates a speech clip of the target voice reading that text.

![](https://ai-studio-static-online.cdn.bcebos.com/982ab955b87244d3bae3b003aff8e28d9ec159ff0d6246a79757339076dfe7d4)

`ge2e_fastspeech2_pwgan` is a voice cloning model for Chinese. It uses LSTMSpeakerEncoder for speaker feature extraction, FastSpeech2 for target audio feature synthesis, and PWGan for waveform generation.

For details about the model, see [PaddleSpeech](https://github.com/PaddlePaddle/PaddleSpeech)
## II. Installation

- ### 1. Environment dependencies

  - paddlepaddle >= 2.2.0
  - paddlehub >= 2.1.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

- ### 2. Installation

  - ```shell
    $ hub install ge2e_fastspeech2_pwgan
    ```
  - If you run into problems during installation, see: [Windows quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
    | [Linux quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [macOS quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Model API Prediction

- ### 1. Prediction code example

  - ```python
    import paddlehub as hub

    model = hub.Module(name='ge2e_fastspeech2_pwgan', output_dir='./', speaker_audio='/data/man.wav')  # Audio file that provides the target timbre

    texts = [
        '语音的表现形式在未来将变得越来越重要$',
        '今天的天气怎么样$', ]
    wavs = model.generate(texts, use_gpu=True)

    for text, wav in zip(texts, wavs):
        print('='*30)
        print(f'Text: {text}')
        print(f'Wav: {wav}')
    ```
- ### 2. API

  - ```python
    def __init__(speaker_audio: str = None,
                 output_dir: str = './')
    ```
    - Initializes the module; the target-timbre audio file and the output directory are configurable.

    - **Parameters**
      - `speaker_audio`(str): path to the target speaker's audio file (*.wav); defaults to None (a default female voice is used as the target timbre).
      - `output_dir`(str): output directory for the synthesized audio, the current directory by default.

  - ```python
    def get_speaker_embedding()
    ```
    - Gets the target speaker embedding of the model.

    - **Returns**
      - `results`(numpy.ndarray): a numpy array of length 256 representing the target speaker's characteristics.

  - ```python
    def set_speaker_embedding(speaker_audio: str)
    ```
    - Sets the target speaker embedding of the model.

    - **Parameters**
      - `speaker_audio`(str): required, path to the target speaker's audio file (*.wav).

  - ```python
    def generate(data: Union[str, List[str]], use_gpu: bool = False):
    ```
    - Synthesizes audio files of the target speaker reading the input texts (see the sketch below).

    - **Parameters**
      - `data`(Union[str, List[str]]): required, content texts of the target audio; currently only Chinese is supported and punctuation is not supported.
      - `use_gpu`(bool): whether to run the computation on GPU, False by default.
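
  - A small illustrative sketch that switches the target timbre after the module has been created and then clones a new sentence (the wav path is a placeholder; the method names and return values follow the module source included later in this commit):

  - ```python
    import paddlehub as hub

    model = hub.Module(name='ge2e_fastspeech2_pwgan', output_dir='./')

    # Replace the default timbre with a new reference recording.
    model.set_speaker_embedding('/PATH/TO/SPEAKER.wav')
    print(model.get_speaker_embedding().shape)  # (256,)

    # generate() returns the paths of the synthesized wav files, one per input text.
    wav_files = model.generate(['今天的天气怎么样$'], use_gpu=False)
    print(wav_files)
    ```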
## IV. Release Note

* 1.0.0

  First release
```shell
$ hub install ge2e_fastspeech2_pwgan
```
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
from typing import List, Union
import numpy as np
import paddle
import soundfile as sf
import yaml
from yacs.config import CfgNode
from paddlehub.env import MODULE_HOME
from paddlehub.module.module import moduleinfo, serving
from paddlehub.utils.log import logger
from paddlespeech.t2s.frontend.zh_frontend import Frontend
from paddlespeech.t2s.models.fastspeech2 import FastSpeech2
from paddlespeech.t2s.models.fastspeech2 import FastSpeech2Inference
from paddlespeech.t2s.models.parallel_wavegan import PWGGenerator
from paddlespeech.t2s.models.parallel_wavegan import PWGInference
from paddlespeech.t2s.modules.normalizer import ZScore
from paddlespeech.vector.exps.ge2e.audio_processor import SpeakerVerificationPreprocessor
from paddlespeech.vector.models.lstm_speaker_encoder import LSTMSpeakerEncoder
@moduleinfo(
name="ge2e_fastspeech2_pwgan",
version="1.0.0",
summary="",
author="paddlepaddle",
author_email="",
type="audio/voice_cloning",
)
class VoiceCloner(paddle.nn.Layer):
def __init__(self, speaker_audio: str = None, output_dir: str = './'):
super(VoiceCloner, self).__init__()
speaker_encoder_ckpt = os.path.join(MODULE_HOME, 'ge2e_fastspeech2_pwgan', 'assets',
'ge2e_ckpt_0.3/step-3000000.pdparams')
synthesizer_res_dir = os.path.join(MODULE_HOME, 'ge2e_fastspeech2_pwgan', 'assets',
'fastspeech2_nosil_aishell3_vc1_ckpt_0.5')
vocoder_res_dir = os.path.join(MODULE_HOME, 'ge2e_fastspeech2_pwgan', 'assets', 'pwg_aishell3_ckpt_0.5')
# Speaker encoder
self.speaker_processor = SpeakerVerificationPreprocessor(
sampling_rate=16000,
audio_norm_target_dBFS=-30,
vad_window_length=30,
vad_moving_average_width=8,
vad_max_silence_length=6,
mel_window_length=25,
mel_window_step=10,
n_mels=40,
partial_n_frames=160,
min_pad_coverage=0.75,
partial_overlap_ratio=0.5)
self.speaker_encoder = LSTMSpeakerEncoder(n_mels=40, num_layers=3, hidden_size=256, output_size=256)
self.speaker_encoder.set_state_dict(paddle.load(speaker_encoder_ckpt))
self.speaker_encoder.eval()
# Voice synthesizer
with open(os.path.join(synthesizer_res_dir, 'default.yaml'), 'r') as f:
fastspeech2_config = CfgNode(yaml.safe_load(f))
with open(os.path.join(synthesizer_res_dir, 'phone_id_map.txt'), 'r') as f:
phn_id = [line.strip().split() for line in f.readlines()]
model = FastSpeech2(idim=len(phn_id), odim=fastspeech2_config.n_mels, **fastspeech2_config["model"])
model.set_state_dict(paddle.load(os.path.join(synthesizer_res_dir, 'snapshot_iter_96400.pdz'))["main_params"])
model.eval()
stat = np.load(os.path.join(synthesizer_res_dir, 'speech_stats.npy'))
mu, std = stat
mu = paddle.to_tensor(mu)
std = paddle.to_tensor(std)
fastspeech2_normalizer = ZScore(mu, std)
self.sample_rate = fastspeech2_config.fs
self.fastspeech2_inference = FastSpeech2Inference(fastspeech2_normalizer, model)
self.fastspeech2_inference.eval()
# Vocoder
with open(os.path.join(vocoder_res_dir, 'default.yaml')) as f:
pwg_config = CfgNode(yaml.safe_load(f))
vocoder = PWGGenerator(**pwg_config["generator_params"])
vocoder.set_state_dict(
paddle.load(os.path.join(vocoder_res_dir, 'snapshot_iter_1000000.pdz'))["generator_params"])
vocoder.remove_weight_norm()
vocoder.eval()
stat = np.load(os.path.join(vocoder_res_dir, 'feats_stats.npy'))
mu, std = stat
mu = paddle.to_tensor(mu)
std = paddle.to_tensor(std)
pwg_normalizer = ZScore(mu, std)
self.pwg_inference = PWGInference(pwg_normalizer, vocoder)
self.pwg_inference.eval()
# Text frontend
self.frontend = Frontend(phone_vocab_path=os.path.join(synthesizer_res_dir, 'phone_id_map.txt'))
# Speaking embedding
self._speaker_embedding = None
if speaker_audio is None or not os.path.isfile(speaker_audio):
speaker_audio = os.path.join(MODULE_HOME, 'ge2e_fastspeech2_pwgan', 'assets', 'voice_cloning.wav')
            logger.warning(f'Since no speaker audio is specified, the speaker encoder will use the default '
                           f'waveform ({speaker_audio}) to extract the speaker embedding. You can use the '
                           '"set_speaker_embedding()" method to set a new speaker audio for voice cloning.')
self.set_speaker_embedding(speaker_audio)
self.output_dir = os.path.abspath(output_dir)
if not os.path.exists(self.output_dir):
os.makedirs(self.output_dir)
def get_speaker_embedding(self):
return self._speaker_embedding.numpy()
@paddle.no_grad()
def set_speaker_embedding(self, speaker_audio: str):
assert os.path.exists(speaker_audio), f'Speaker audio file: {speaker_audio} does not exists.'
mel_sequences = self.speaker_processor.extract_mel_partials(
self.speaker_processor.preprocess_wav(speaker_audio))
self._speaker_embedding = self.speaker_encoder.embed_utterance(paddle.to_tensor(mel_sequences))
logger.info(f'Speaker embedding has been set from file: {speaker_audio}')
@paddle.no_grad()
def generate(self, data: Union[str, List[str]], use_gpu: bool = False):
assert self._speaker_embedding is not None, f'Set speaker embedding before voice cloning.'
if isinstance(data, str):
data = [data]
elif isinstance(data, list):
            assert len(data) > 0 and isinstance(data[0],
                                                str) and len(data[0]) > 0, 'Input data should be str or List[str].'
        else:
            raise Exception('Input data should be str or List[str].')
paddle.set_device('gpu') if use_gpu else paddle.set_device('cpu')
files = []
for idx, text in enumerate(data):
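            # Text -> phone ids -> mel-spectrogram (FastSpeech2 conditioned on the speaker
            # embedding) -> waveform (Parallel WaveGAN vocoder), then write the wav to disk.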
phone_ids = self.frontend.get_input_ids(text, merge_sentences=True)["phone_ids"][0]
wav = self.pwg_inference(self.fastspeech2_inference(phone_ids, spk_emb=self._speaker_embedding))
output_wav = os.path.join(self.output_dir, f'{idx+1}.wav')
sf.write(output_wav, wav.numpy(), samplerate=self.sample_rate)
files.append(output_wav)
return files
# lstm_tacotron2

|Model Name|lstm_tacotron2|
| :--- | :---: |
|Category|Speech - Text-to-Speech|
|Network|LSTM, Tacotron2, WaveFlow|
|Dataset|AISHELL-3|
|Fine-tuning supported|No|
|Model Size|327MB|
|Latest update date|2021-06-15|
|Data metrics|-|

## I. Basic Information

### Module Introduction

Voice cloning synthesizes speech for a given text with a specific timbre, so that the generated audio carries the characteristics of the target speaker.

When training a voice cloning model, the target-timbre recording is fed to a Speaker Encoder, which extracts the speaker characteristics (timbre) of the recording as a Speaker Embedding. When the model then learns to re-synthesize speech with this timbre, the speaker embedding is added as an extra condition alongside the input text.

At prediction time, a new recording of the target timbre is fed to the Speaker Encoder to extract its speaker embedding, so that, given a piece of text and a target-timbre recording, the model generates a speech clip of the target voice reading that text.

<p align="center">
<img src="https://ai-studio-static-online.cdn.bcebos.com/982ab955b87244d3bae3b003aff8e28d9ec159ff0d6246a79757339076dfe7d4" hspace='10'/> <br/>
</p>

`lstm_tacotron2` is a voice cloning model for Chinese. It uses LSTMSpeakerEncoder for speaker feature extraction, Tacotron2 for target audio feature synthesis, and WaveFlow for waveform generation.

For more details, please refer to:
- [Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis](https://arxiv.org/pdf/1806.04558.pdf)
- [Parakeet](https://github.com/PaddlePaddle/Parakeet/tree/release/v0.3/parakeet/models)

## II. Installation

- ### 1. Environment dependencies

  - paddlepaddle >= 2.0.0
  - paddlehub >= 2.1.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

- ### 2. Installation

  - ```shell
    $ hub install lstm_tacotron2
    ```
  - If you run into problems during installation, see: [Windows quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
    | [Linux quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [macOS quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)

## III. Model API Prediction

- ### 1. Prediction code example

  - ```python
    import paddlehub as hub

    model = hub.Module(name='lstm_tacotron2', output_dir='/data', speaker_audio='/data/man.wav')  # Audio file that provides the target timbre
    texts = [
        '语音的表现形式在未来将变得越来越重要$',
        '今天的天气怎么样$', ]
    wavs = model.generate(texts, use_gpu=True)

    for text, wav in zip(texts, wavs):
        print('='*30)
        print(f'Text: {text}')
        print(f'Wav: {wav}')
    ```

    Output
    ```
    ==============================
    Text: 语音的表现形式在未来将变得越来越重要$
    Wav: /data/1.wav
    ==============================
    Text: 今天的天气怎么样$
    Wav: /data/2.wav
    ```

- ### 2. API

  - ```python
    def __init__(speaker_audio: str = None,
                 output_dir: str = './')
    ```
    - Initializes the module; the target-timbre audio file and the output directory are configurable.

    - **Parameters**
      - `speaker_audio`(str): path to the target speaker's audio file (*.wav); defaults to None (a default female voice is used as the target timbre).
      - `output_dir`(str): output directory for the synthesized audio, the current directory by default.

  - ```python
    def get_speaker_embedding()
    ```
    - Gets the target speaker embedding of the model.

    - **Returns**
      - `results`(numpy.ndarray): a numpy array of length 256 representing the target speaker's characteristics.

  - ```python
    def set_speaker_embedding(speaker_audio: str)
    ```
    - Sets the target speaker embedding of the model.

    - **Parameters**
      - `speaker_audio`(str): required, path to the target speaker's audio file (*.wav).

  - ```python
    def generate(data: List[str], batch_size: int = 1, use_gpu: bool = False):
    ```
    - Synthesizes audio files of the target speaker reading the input texts.

    - **Parameters**
      - `data`(List[str]): required, content texts of the target audio; currently only Chinese is supported and punctuation is not supported.
      - `batch_size`(int): optional, batch size used when synthesizing speech, 1 by default.
      - `use_gpu`(bool): whether to run the computation on GPU, False by default.

## IV. Release Note

* 1.0.0

  First release

  ```shell
  $ hub install lstm_tacotron2==1.0.0
  ```
# styleganv2_editing

|Model Name|styleganv2_editing|
| :--- | :---: |
|Category|Image - Image Generation|
|Network|StyleGAN V2|
|Dataset|-|
|Fine-tuning supported|No|
|Model Size|190MB|
|Latest update date|2021-12-15|
|Data metrics|-|
## I. Basic Information

- ### Application effect display

  - Sample results:

    <p align="center">
    <img src="https://user-images.githubusercontent.com/22424850/146483720-fb0ea3c0-b259-4ad6-b176-966675b9b164.png" width = "40%" hspace='10'/>
    <br />
    Input image
    <br />
    <img src="https://user-images.githubusercontent.com/22424850/146483730-3104795e-4ee6-43de-b4dc-b7760d502b50.png" width = "40%" hspace='10'/>
    <br />
    Output image (age edited)
    <br />
    </p>

- ### Module Introduction

  - StyleGAN V2 generates images from style vectors; the Editing module manipulates attributes of the generated image using attribute direction vectors obtained beforehand by classifying and regressing the style vectors of many images.
## II. Installation

- ### 1. Environment dependencies

  - ppgan

- ### 2. Installation

  - ```shell
    $ hub install styleganv2_editing
    ```
  - If you run into problems during installation, see: [Windows quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
    | [Linux quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [macOS quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Model API Prediction

- ### 1. Command-line prediction

  - ```shell
    # Read from a file
    $ hub run styleganv2_editing --input_path "/PATH/TO/IMAGE" --direction_name age --direction_offset 5
    ```
  - This invokes the face editing model from the command line; for more details see [PaddleHub command-line instructions](../../../../docs/docs_ch/tutorial/cmd_usage.rst)

- ### 2. Prediction code example

  - ```python
    import paddlehub as hub

    module = hub.Module(name="styleganv2_editing")
    input_path = ["/PATH/TO/IMAGE"]
    # Read from a file
    module.generate(paths=input_path, direction_name = 'age', direction_offset = 5, output_dir='./editing_result/', use_gpu=True)
    ```
- ### 3. API

  - ```python
    generate(self, images=None, paths=None, direction_name = 'age', direction_offset = 0.0, output_dir='./editing_result/', use_gpu=False, visualization=True)
    ```
    - Face editing generation API (see the sketch below).

    - **Parameters**
      - images (list\[numpy.ndarray\]): image data <br/>
      - paths (list\[str\]): image paths;<br/>
      - direction_name (str): name of the attribute to edit; for ffhq-conf-f the prepared attributes are: age, eyes_open, eye_distance, eye_eyebrow_distance, eye_ratio, gender, lip_ratio, mouth_open, mouth_ratio, nose_mouth_distance, nose_ratio, nose_tip, pitch, roll, smile, yaw <br/>
      - direction_offset (float): offset strength of the attribute <br/>
      - output\_dir (str): directory for saving the results; <br/>
      - use\_gpu (bool): whether to use GPU;<br/>
      - visualization(bool): whether to save the results to a local folder
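
  - Judging from the module source included later in this commit, each element of the returned list is a `(src_img, dst_img, dst_latent)` tuple with RGB image arrays; a minimal sketch of consuming it directly (the image path is a placeholder):

  - ```python
    import cv2
    import paddlehub as hub

    module = hub.Module(name="styleganv2_editing")
    results = module.generate(paths=["/PATH/TO/IMAGE"], direction_name='age', direction_offset=5,
                              use_gpu=False, visualization=False)

    # Images come back in RGB order, so flip the channels before writing with OpenCV.
    src_img, dst_img, dst_latent = results[0]
    cv2.imwrite('edited.png', dst_img[:, :, ::-1])
    print(dst_latent.shape)
    ```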
## IV. Server Deployment

- PaddleHub Serving can deploy an online face editing service.

- ### Step 1: Start PaddleHub Serving

  - Run the start command:
  - ```shell
    $ hub serving start -m styleganv2_editing
    ```
  - This deploys the online face editing API service; the default port is 8866.

  - **NOTE:** To predict on GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it does not need to be set.

- ### Step 2: Send a prediction request

  - With the server configured, the few lines below send a prediction request and fetch the result
  - ```python
    import requests
    import json
    import cv2
    import base64


    def cv2_to_base64(image):
        data = cv2.imencode('.jpg', image)[1]
        return base64.b64encode(data.tostring()).decode('utf8')

    # Send an HTTP request
    data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
    headers = {"Content-type": "application/json"}
    url = "http://127.0.0.1:8866/predict/styleganv2_editing"
    r = requests.post(url=url, headers=headers, data=json.dumps(data))

    # Print the prediction results
    print(r.json()["results"])
    ```

## V. Release Note

* 1.0.0

  First release
- ```shell
$ hub install styleganv2_editing==1.0.0
```
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import random
import numpy as np
import paddle
from ppgan.models.generators import StyleGANv2Generator
from ppgan.utils.download import get_path_from_url
from ppgan.utils.visual import make_grid, tensor2img, save_image
model_cfgs = {
'ffhq-config-f': {
'model_urls': 'https://paddlegan.bj.bcebos.com/models/stylegan2-ffhq-config-f.pdparams',
'size': 1024,
'style_dim': 512,
'n_mlp': 8,
'channel_multiplier': 2
},
'animeface-512': {
'model_urls': 'https://paddlegan.bj.bcebos.com/models/stylegan2-animeface-512.pdparams',
'size': 512,
'style_dim': 512,
'n_mlp': 8,
'channel_multiplier': 2
}
}
@paddle.no_grad()
def get_mean_style(generator):
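    # Average the generator's mean latent over 10 random draws; the result is used
    # as the truncation target when sampling and style-mixing below.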
mean_style = None
for i in range(10):
style = generator.mean_latent(1024)
if mean_style is None:
mean_style = style
else:
mean_style += style
mean_style /= 10
return mean_style
@paddle.no_grad()
def sample(generator, mean_style, n_sample):
image = generator(
[paddle.randn([n_sample, generator.style_dim])],
truncation=0.7,
truncation_latent=mean_style,
)[0]
return image
@paddle.no_grad()
def style_mixing(generator, mean_style, n_source, n_target):
source_code = paddle.randn([n_source, generator.style_dim])
target_code = paddle.randn([n_target, generator.style_dim])
resolution = 2**((generator.n_latent + 2) // 2)
images = [paddle.ones([1, 3, resolution, resolution]) * -1]
source_image = generator([source_code], truncation_latent=mean_style, truncation=0.7)[0]
target_image = generator([target_code], truncation_latent=mean_style, truncation=0.7)[0]
images.append(source_image)
for i in range(n_target):
image = generator(
[target_code[i].unsqueeze(0).tile([n_source, 1]), source_code],
truncation_latent=mean_style,
truncation=0.7,
)[0]
images.append(target_image[i].unsqueeze(0))
images.append(image)
images = paddle.concat(images, 0)
return images
class StyleGANv2Predictor:
def __init__(self,
output_path='output_dir',
weight_path=None,
model_type=None,
seed=None,
size=1024,
style_dim=512,
n_mlp=8,
channel_multiplier=2):
self.output_path = output_path
if weight_path is None:
if model_type in model_cfgs.keys():
weight_path = get_path_from_url(model_cfgs[model_type]['model_urls'])
size = model_cfgs[model_type].get('size', size)
style_dim = model_cfgs[model_type].get('style_dim', style_dim)
n_mlp = model_cfgs[model_type].get('n_mlp', n_mlp)
channel_multiplier = model_cfgs[model_type].get('channel_multiplier', channel_multiplier)
checkpoint = paddle.load(weight_path)
else:
raise ValueError('Predictor need a weight path or a pretrained model type')
else:
checkpoint = paddle.load(weight_path)
self.generator = StyleGANv2Generator(size, style_dim, n_mlp, channel_multiplier)
self.generator.set_state_dict(checkpoint)
self.generator.eval()
if seed is not None:
paddle.seed(seed)
random.seed(seed)
np.random.seed(seed)
def run(self, n_row=3, n_col=5):
os.makedirs(self.output_path, exist_ok=True)
mean_style = get_mean_style(self.generator)
img = sample(self.generator, mean_style, n_row * n_col)
save_image(tensor2img(make_grid(img, nrow=n_col)), f'{self.output_path}/sample.png')
for j in range(2):
img = style_mixing(self.generator, mean_style, n_col, n_row)
save_image(tensor2img(make_grid(img, nrow=n_col + 1)), f'{self.output_path}/sample_mixing_{j}.png')
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import cv2
import numpy as np
import paddle
from ppgan.utils.download import get_path_from_url
from .basemodel import StyleGANv2Predictor
model_cfgs = {
'ffhq-config-f': {
'direction_urls': 'https://paddlegan.bj.bcebos.com/models/stylegan2-ffhq-config-f-directions.pdparams'
}
}
def make_image(tensor):
return (((tensor.detach() + 1) / 2 * 255).clip(min=0, max=255).transpose((0, 2, 3, 1)).numpy().astype('uint8'))
class StyleGANv2EditingPredictor(StyleGANv2Predictor):
def __init__(self, model_type=None, direction_path=None, **kwargs):
super().__init__(model_type=model_type, **kwargs)
if direction_path is None and model_type is not None:
assert model_type in model_cfgs, f'There is not any pretrained direction file for {model_type} model.'
direction_path = get_path_from_url(model_cfgs[model_type]['direction_urls'])
self.directions = paddle.load(direction_path)
@paddle.no_grad()
def run(self, latent, direction, offset):
latent = paddle.to_tensor(latent).unsqueeze(0).astype('float32')
direction = self.directions[direction].unsqueeze(0).astype('float32')
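        # Batch the original latent with the offset latent so a single generator pass
        # returns both the source image and the edited image.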
latent_n = paddle.concat([latent, latent + offset * direction], 0)
generator = self.generator
img_gen, _ = generator([latent_n], input_is_latent=True, randomize_noise=False)
imgs = make_image(img_gen)
src_img = imgs[0]
dst_img = imgs[1]
dst_latent = (latent + offset * direction)[0].numpy().astype('float32')
return src_img, dst_img, dst_latent
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import argparse
import copy
import paddle
import paddlehub as hub
from paddlehub.module.module import moduleinfo, runnable, serving
import numpy as np
import cv2
from skimage.io import imread
from skimage.transform import rescale, resize
from .model import StyleGANv2EditingPredictor
from .util import base64_to_cv2
@moduleinfo(
name="styleganv2_editing",
type="CV/style_transfer",
author="paddlepaddle",
author_email="",
summary="",
version="1.0.0")
class styleganv2_editing:
def __init__(self):
self.pretrained_model = os.path.join(self.directory, "stylegan2-ffhq-config-f-directions.pdparams")
self.network = StyleGANv2EditingPredictor(direction_path=self.pretrained_model, model_type='ffhq-config-f')
self.pixel2style2pixel_module = hub.Module(name='pixel2style2pixel')
def generate(self,
images=None,
paths=None,
direction_name='age',
direction_offset=0.0,
output_dir='./editing_result/',
use_gpu=False,
visualization=True):
'''
images (list[numpy.ndarray]): data of images, shape of each is [H, W, C], color space must be BGR(read by cv2).
paths (list[str]): paths to image.
direction_name(str): Attribute to be manipulated,For ffhq-conf-f, we have: age, eyes_open, eye_distance, eye_eyebrow_distance, eye_ratio, gender, lip_ratio, mouth_open, mouth_ratio, nose_mouth_distance, nose_ratio, nose_tip, pitch, roll, smile, yaw.
direction_offset(float): Offset strength of the attribute.
output_dir: the dir to save the results
use_gpu: if True, use gpu to perform the computation, otherwise cpu.
visualization: if True, save results in output_dir.
'''
results = []
paddle.disable_static()
place = 'gpu:0' if use_gpu else 'cpu'
place = paddle.set_device(place)
if images == None and paths == None:
print('No image provided. Please input an image or a image path.')
return
if images != None:
for image in images:
image = image[:, :, ::-1]
_, latent = self.pixel2style2pixel_module.network.run(image)
out = self.network.run(latent, direction_name, direction_offset)
results.append(out)
if paths != None:
for path in paths:
image = cv2.imread(path)[:, :, ::-1]
_, latent = self.pixel2style2pixel_module.network.run(image)
out = self.network.run(latent, direction_name, direction_offset)
results.append(out)
if visualization == True:
if not os.path.exists(output_dir):
os.makedirs(output_dir, exist_ok=True)
for i, out in enumerate(results):
if out is not None:
cv2.imwrite(os.path.join(output_dir, 'src_{}.png'.format(i)), out[0][:, :, ::-1])
cv2.imwrite(os.path.join(output_dir, 'dst_{}.png'.format(i)), out[1][:, :, ::-1])
np.save(os.path.join(output_dir, 'dst_{}.npy'.format(i)), out[2])
return results
@runnable
def run_cmd(self, argvs: list):
"""
Run as a command.
"""
self.parser = argparse.ArgumentParser(
description="Run the {} module.".format(self.name),
prog='hub run {}'.format(self.name),
usage='%(prog)s',
add_help=True)
self.arg_input_group = self.parser.add_argument_group(title="Input options", description="Input data. Required")
self.arg_config_group = self.parser.add_argument_group(
title="Config options", description="Run configuration for controlling module behavior, not required.")
self.add_module_config_arg()
self.add_module_input_arg()
self.args = self.parser.parse_args(argvs)
results = self.generate(
paths=[self.args.input_path],
direction_name=self.args.direction_name,
direction_offset=self.args.direction_offset,
output_dir=self.args.output_dir,
use_gpu=self.args.use_gpu,
visualization=self.args.visualization)
return results
@serving
def serving_method(self, images, **kwargs):
"""
Run as a service.
"""
images_decode = [base64_to_cv2(image) for image in images]
results = self.generate(images=images_decode, **kwargs)
tolist = [result.tolist() for result in results]
return tolist
def add_module_config_arg(self):
"""
Add the command config options.
"""
self.arg_config_group.add_argument('--use_gpu', action='store_true', help="use GPU or not")
self.arg_config_group.add_argument(
'--output_dir', type=str, default='editing_result', help='output directory for saving result.')
self.arg_config_group.add_argument('--visualization', type=bool, default=False, help='save results or not.')
def add_module_input_arg(self):
"""
Add the command input options.
"""
self.arg_input_group.add_argument('--input_path', type=str, help="path to input image.")
self.arg_input_group.add_argument(
'--direction_name',
type=str,
default='age',
help=
"Attribute to be manipulated,For ffhq-conf-f, we have: age, eyes_open, eye_distance, eye_eyebrow_distance, eye_ratio, gender, lip_ratio, mouth_open, mouth_ratio, nose_mouth_distance, nose_ratio, nose_tip, pitch, roll, smile, yaw."
)
self.arg_input_group.add_argument('--direction_offset', type=float, help="Offset strength of the attribute.")
import base64
import cv2
import numpy as np
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
data = np.fromstring(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
# wav2lip

|Model Name|wav2lip|
| :--- | :---: |
|Category|Image - Video Generation|
|Network|Wav2Lip|
|Dataset|LRS2|
|Fine-tuning supported|No|
|Model Size|139MB|
|Latest update date|2021-12-14|
|Data metrics|-|
## I. Basic Information

- ### Application effect display

  - Sample results:

    <p align="center">
    <img src="https://user-images.githubusercontent.com/22424850/146481773-4ec50285-3b13-4a86-84a2-b105787b63d1.png" width = "40%" hspace='10'/>
    <br />
    Input image
    <br />
    <img src="https://user-images.githubusercontent.com/22424850/146482210-5f309fc3-7582-452d-bcf5-f2c54b5c8dc8.gif" width = "40%" hspace='10'/>
    <br />
    Output video
    <br />
    </p>

- ### Module Introduction

  - Wav2Lip generates lip motion for a person in a video that is synchronized with an input audio track, so that the mouth movements in the generated video match the speech. It can produce a lip-synced video from a single static image as well as convert the lips of an existing video to match a target speech. The key to its accurate lip-audio synchronization is a lip-sync discriminator that forces the generator to keep producing accurate and realistic lip motion. In addition, it improves visual quality by using several consecutive frames instead of a single frame in the discriminator and by adding a visual quality loss (rather than only a contrastive loss) to account for temporal correlation. Wav2Lip works for any face and any language, achieves high accuracy on arbitrary videos, blends seamlessly with the original video, and can also be used on animated faces.
## II. Installation

- ### 1. Environment dependencies

  - ffmpeg
  - libsndfile

- ### 2. Installation

  - ```shell
    $ hub install wav2lip
    ```
  - If you run into problems during installation, see: [Windows quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
    | [Linux quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [macOS quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Model API Prediction

- ### 1. Command-line prediction

  - ```shell
    # Read from a file
    $ hub run wav2lip --face "/PATH/TO/VIDEO or IMAGE" --audio "/PATH/TO/AUDIO"
    ```
  - This invokes the lip-sync generation model from the command line; for more details see [PaddleHub command-line instructions](../../../../docs/docs_ch/tutorial/cmd_usage.rst)

- ### 2. Prediction code example

  - ```python
    import paddlehub as hub

    module = hub.Module(name="wav2lip")
    face_input_path = "/PATH/TO/VIDEO or IMAGE"
    audio_input_path = "/PATH/TO/AUDIO"
    module.wav2lip_transfer(face=face_input_path, audio=audio_input_path, output_dir='./transfer_result/', use_gpu=True)
    ```
- ### 3. API

  - ```python
    def wav2lip_transfer(face, audio, output_dir ='./output_result/', use_gpu=False, visualization=True):
    ```
    - Lip-sync generation API (see the sketch below).

    - **Parameters**
      - face (str): path to a video or image file<br/>
      - audio (str): path to an audio file<br/>
      - output\_dir (str): directory for saving the results; <br/>
      - use\_gpu (bool): whether to use GPU;<br/>
      - visualization(bool): whether to save the results to a local folder
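
  - A small illustrative sketch of the still-image case (the paths are placeholders): judging from the predictor source included later in this commit, when `face` points to a jpg/png/jpeg image the predictor switches to static mode and repeats that frame for the whole duration of the audio.

  - ```python
    import paddlehub as hub

    module = hub.Module(name="wav2lip")

    # A single portrait image plus a speech track; the lip region of the repeated
    # frame is re-generated to match the audio.
    module.wav2lip_transfer(face="/PATH/TO/IMAGE.png",
                            audio="/PATH/TO/AUDIO.wav",
                            output_dir='./transfer_result/',
                            use_gpu=False,
                            visualization=True)
    ```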
## IV. Release Note

* 1.0.0

  First release
- ```shell
$ hub install wav2lip==1.0.0
```
from os import listdir, path, makedirs
import platform
import numpy as np
import scipy, cv2, os, sys, argparse
import json, subprocess, random, string
from tqdm import tqdm
from glob import glob
import paddle
from paddle.utils.download import get_weights_path_from_url
from ppgan.faceutils import face_detection
from ppgan.utils import audio
from ppgan.models.generators.wav2lip import Wav2Lip
WAV2LIP_WEIGHT_URL = 'https://paddlegan.bj.bcebos.com/models/wav2lip_hq.pdparams'
mel_step_size = 16
class Wav2LipPredictor:
def __init__(self,
checkpoint_path=None,
static=False,
fps=25,
pads=[0, 10, 0, 0],
face_det_batch_size=16,
wav2lip_batch_size=128,
resize_factor=1,
crop=[0, -1, 0, -1],
box=[-1, -1, -1, -1],
rotate=False,
nosmooth=False,
face_detector='sfd',
face_enhancement=False):
self.img_size = 96
self.checkpoint_path = checkpoint_path
self.static = static
self.fps = fps
self.pads = pads
self.face_det_batch_size = face_det_batch_size
self.wav2lip_batch_size = wav2lip_batch_size
self.resize_factor = resize_factor
self.crop = crop
self.box = box
self.rotate = rotate
self.nosmooth = nosmooth
self.face_detector = face_detector
self.face_enhancement = face_enhancement
if face_enhancement:
from ppgan.faceutils.face_enhancement import FaceEnhancement
self.faceenhancer = FaceEnhancement()
makedirs('./temp', exist_ok=True)
def get_smoothened_boxes(self, boxes, T):
for i in range(len(boxes)):
if i + T > len(boxes):
window = boxes[len(boxes) - T:]
else:
window = boxes[i:i + T]
boxes[i] = np.mean(window, axis=0)
return boxes
def face_detect(self, images):
detector = face_detection.FaceAlignment(
face_detection.LandmarksType._2D, flip_input=False, face_detector=self.face_detector)
batch_size = self.face_det_batch_size
while 1:
predictions = []
try:
for i in tqdm(range(0, len(images), batch_size)):
predictions.extend(detector.get_detections_for_batch(np.array(images[i:i + batch_size])))
except RuntimeError:
if batch_size == 1:
raise RuntimeError(
'Image too big to run face detection on GPU. Please use the --resize_factor argument')
batch_size //= 2
print('Recovering from OOM error; New batch size: {}'.format(batch_size))
continue
break
results = []
pady1, pady2, padx1, padx2 = self.pads
for rect, image in zip(predictions, images):
if rect is None:
cv2.imwrite('temp/faulty_frame.jpg', image) # check this frame where the face was not detected.
raise ValueError('Face not detected! Ensure the video contains a face in all the frames.')
y1 = max(0, rect[1] - pady1)
y2 = min(image.shape[0], rect[3] + pady2)
x1 = max(0, rect[0] - padx1)
x2 = min(image.shape[1], rect[2] + padx2)
results.append([x1, y1, x2, y2])
boxes = np.array(results)
if not self.nosmooth: boxes = self.get_smoothened_boxes(boxes, T=5)
results = [[image[y1:y2, x1:x2], (y1, y2, x1, x2)] for image, (x1, y1, x2, y2) in zip(images, boxes)]
del detector
return results
def datagen(self, frames, mels):
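        # Yield model-ready batches: face crops (lower half masked and stacked with the
        # originals along the channel axis), mel chunks, original frames and face coordinates.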
img_batch, mel_batch, frame_batch, coords_batch = [], [], [], []
if self.box[0] == -1:
if not self.static:
face_det_results = self.face_detect(frames) # BGR2RGB for CNN face detection
else:
face_det_results = self.face_detect([frames[0]])
else:
print('Using the specified bounding box instead of face detection...')
y1, y2, x1, x2 = self.box
face_det_results = [[f[y1:y2, x1:x2], (y1, y2, x1, x2)] for f in frames]
for i, m in enumerate(mels):
idx = 0 if self.static else i % len(frames)
frame_to_save = frames[idx].copy()
face, coords = face_det_results[idx].copy()
face = cv2.resize(face, (self.img_size, self.img_size))
img_batch.append(face)
mel_batch.append(m)
frame_batch.append(frame_to_save)
coords_batch.append(coords)
if len(img_batch) >= self.wav2lip_batch_size:
img_batch, mel_batch = np.asarray(img_batch), np.asarray(mel_batch)
img_masked = img_batch.copy()
img_masked[:, self.img_size // 2:] = 0
img_batch = np.concatenate((img_masked, img_batch), axis=3) / 255.
mel_batch = np.reshape(mel_batch, [len(mel_batch), mel_batch.shape[1], mel_batch.shape[2], 1])
yield img_batch, mel_batch, frame_batch, coords_batch
img_batch, mel_batch, frame_batch, coords_batch = [], [], [], []
if len(img_batch) > 0:
img_batch, mel_batch = np.asarray(img_batch), np.asarray(mel_batch)
img_masked = img_batch.copy()
img_masked[:, self.img_size // 2:] = 0
img_batch = np.concatenate((img_masked, img_batch), axis=3) / 255.
mel_batch = np.reshape(mel_batch, [len(mel_batch), mel_batch.shape[1], mel_batch.shape[2], 1])
yield img_batch, mel_batch, frame_batch, coords_batch
def run(self, face, audio_seq, output_dir, visualization=True):
if os.path.isfile(face) and path.basename(face).split('.')[-1].lower() in ['jpg', 'png', 'jpeg']:
self.static = True
if not os.path.isfile(face):
raise ValueError('--face argument must be a valid path to video/image file')
elif path.basename(face).split('.')[-1].lower() in ['jpg', 'png', 'jpeg']:
full_frames = [cv2.imread(face)]
fps = self.fps
else:
video_stream = cv2.VideoCapture(face)
fps = video_stream.get(cv2.CAP_PROP_FPS)
print('Reading video frames...')
full_frames = []
while 1:
still_reading, frame = video_stream.read()
if not still_reading:
video_stream.release()
break
if self.resize_factor > 1:
frame = cv2.resize(frame,
(frame.shape[1] // self.resize_factor, frame.shape[0] // self.resize_factor))
if self.rotate:
frame = cv2.rotate(frame, cv2.ROTATE_90_CLOCKWISE)
y1, y2, x1, x2 = self.crop
if x2 == -1: x2 = frame.shape[1]
if y2 == -1: y2 = frame.shape[0]
frame = frame[y1:y2, x1:x2]
full_frames.append(frame)
print("Number of frames available for inference: " + str(len(full_frames)))
if not audio_seq.endswith('.wav'):
print('Extracting raw audio...')
command = 'ffmpeg -y -i {} -strict -2 {}'.format(audio_seq, 'temp/temp.wav')
subprocess.call(command, shell=True)
audio_seq = 'temp/temp.wav'
wav = audio.load_wav(audio_seq, 16000)
mel = audio.melspectrogram(wav)
if np.isnan(mel.reshape(-1)).sum() > 0:
raise ValueError(
'Mel contains nan! Using a TTS voice? Add a small epsilon noise to the wav file and try again')
mel_chunks = []
mel_idx_multiplier = 80. / fps
i = 0
while 1:
start_idx = int(i * mel_idx_multiplier)
if start_idx + mel_step_size > len(mel[0]):
mel_chunks.append(mel[:, len(mel[0]) - mel_step_size:])
break
mel_chunks.append(mel[:, start_idx:start_idx + mel_step_size])
i += 1
print("Length of mel chunks: {}".format(len(mel_chunks)))
full_frames = full_frames[:len(mel_chunks)]
batch_size = self.wav2lip_batch_size
gen = self.datagen(full_frames.copy(), mel_chunks)
model = Wav2Lip()
if self.checkpoint_path is None:
model_weights_path = get_weights_path_from_url(WAV2LIP_WEIGHT_URL)
weights = paddle.load(model_weights_path)
else:
weights = paddle.load(self.checkpoint_path)
model.load_dict(weights)
model.eval()
print("Model loaded")
for i, (img_batch, mel_batch, frames, coords) in enumerate(
tqdm(gen, total=int(np.ceil(float(len(mel_chunks)) / batch_size)))):
if i == 0:
frame_h, frame_w = full_frames[0].shape[:-1]
out = cv2.VideoWriter('temp/result.avi', cv2.VideoWriter_fourcc(*'DIVX'), fps, (frame_w, frame_h))
img_batch = paddle.to_tensor(np.transpose(img_batch, (0, 3, 1, 2))).astype('float32')
mel_batch = paddle.to_tensor(np.transpose(mel_batch, (0, 3, 1, 2))).astype('float32')
with paddle.no_grad():
pred = model(mel_batch, img_batch)
pred = pred.numpy().transpose(0, 2, 3, 1) * 255.
for p, f, c in zip(pred, frames, coords):
y1, y2, x1, x2 = c
if self.face_enhancement:
p = self.faceenhancer.enhance_from_image(p)
p = cv2.resize(p.astype(np.uint8), (x2 - x1, y2 - y1))
f[y1:y2, x1:x2] = p
out.write(f)
out.release()
os.makedirs(output_dir, exist_ok=True)
if visualization:
command = 'ffmpeg -y -i {} -i {} -strict -2 -q:v 1 {}'.format(audio_seq, 'temp/result.avi',
os.path.join(output_dir, 'result.avi'))
subprocess.call(command, shell=platform.system() != 'Windows')
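A minimal usage sketch for the Wav2LipPredictor defined above; the media paths are placeholders, and the pretrained weights are downloaded automatically when checkpoint_path is None:

```python
# Placeholder paths; ./temp is created by the predictor for intermediate files.
predictor = Wav2LipPredictor(face_det_batch_size=8, wav2lip_batch_size=64)
predictor.run(face='/PATH/TO/face_video.mp4',
              audio_seq='/PATH/TO/speech.wav',
              output_dir='./wav2lip_output',
              visualization=True)
```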
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import argparse
import copy
import paddle
import paddlehub as hub
from paddlehub.module.module import moduleinfo, runnable, serving
import numpy as np
import cv2
from .model import Wav2LipPredictor
@moduleinfo(name="wav2lip", type="CV/generation", author="paddlepaddle", author_email="", summary="", version="1.0.0")
class wav2lip:
def __init__(self):
self.pretrained_model = os.path.join(self.directory, "wav2lip_hq.pdparams")
self.network = Wav2LipPredictor(
checkpoint_path=self.pretrained_model,
static=False,
fps=25,
pads=[0, 10, 0, 0],
face_det_batch_size=16,
wav2lip_batch_size=128,
resize_factor=1,
crop=[0, -1, 0, -1],
box=[-1, -1, -1, -1],
rotate=False,
nosmooth=False,
face_detector='sfd',
face_enhancement=True)
def wav2lip_transfer(self, face, audio, output_dir='./output_result/', use_gpu=False, visualization=True):
'''
face (str): path to video/image that contains faces to use.
audio (str): path to input audio.
output_dir: the dir to save the results
use_gpu: if True, use gpu to perform the computation, otherwise cpu.
visualization: if True, save results in output_dir.
'''
paddle.disable_static()
place = 'gpu:0' if use_gpu else 'cpu'
place = paddle.set_device(place)
self.network.run(face, audio, output_dir, visualization)
@runnable
def run_cmd(self, argvs: list):
"""
Run as a command.
"""
self.parser = argparse.ArgumentParser(
description="Run the {} module.".format(self.name),
prog='hub run {}'.format(self.name),
usage='%(prog)s',
add_help=True)
self.arg_input_group = self.parser.add_argument_group(title="Input options", description="Input data. Required")
self.arg_config_group = self.parser.add_argument_group(
title="Config options", description="Run configuration for controlling module behavior, not required.")
self.add_module_config_arg()
self.add_module_input_arg()
self.args = self.parser.parse_args(argvs)
self.wav2lip_transfer(
face=self.args.face,
audio=self.args.audio,
output_dir=self.args.output_dir,
use_gpu=self.args.use_gpu,
visualization=self.args.visualization)
return
def add_module_config_arg(self):
"""
Add the command config options.
"""
self.arg_config_group.add_argument('--use_gpu', action='store_true', help="use GPU or not")
self.arg_config_group.add_argument(
'--output_dir', type=str, default='output_result', help='output directory for saving result.')
self.arg_config_group.add_argument('--visualization', type=bool, default=False, help='save results or not.')
def add_module_input_arg(self):
"""
Add the command input options.
"""
self.arg_input_group.add_argument('--audio', type=str, help="path to input audio.")
self.arg_input_group.add_argument('--face', type=str, help="path to video/image that contains faces to use.")
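A minimal usage sketch for the wav2lip module defined above, assuming it has been installed with `hub install wav2lip`; the file paths are placeholders:

```python
import paddlehub as hub

# Placeholder paths; the result video is written to output_dir when visualization=True.
module = hub.Module(name="wav2lip")
module.wav2lip_transfer(face='/PATH/TO/video.mp4',
                        audio='/PATH/TO/audio.wav',
                        output_dir='./transfer_result',
                        use_gpu=True,
                        visualization=True)
```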
......@@ -50,16 +50,16 @@
## 三、模型API预测
- ### 1、代码示例
- ### 1、预测代码示例
- ```python
import paddlehub as hub
import cv2
model = hub.Module(name="U2Net_Portrait")
result = model.Cartoon_GEN(images=[cv2.imread('/PATH/TO/IMAGE')])
result = model.Portrait_GEN(images=[cv2.imread('/PATH/TO/IMAGE')])
# or
# result = model.Cartoon_GEN(paths=['/PATH/TO/IMAGE'])
# result = model.Portrait_GEN(paths=['/PATH/TO/IMAGE'])
```
- ### 2、API
......
# arabic_ocr_db_crnn_mobile
|模型名称|arabic_ocr_db_crnn_mobile|
| :--- | :---: |
|类别|图像-文字识别|
|网络|Differentiable Binarization+CRNN|
|数据集|icdar2015数据集|
|是否支持Fine-tuning|否|
|最新更新日期|2021-12-2|
|数据指标|-|
## 一、模型基本信息
- ### 模型介绍
- arabic_ocr_db_crnn_mobile Module用于识别图片当中的阿拉伯文字,包括阿拉伯文、波斯文、维吾尔文。其基于multi_languages_ocr_db_crnn检测得到的文本框,继续识别文本框中的阿拉伯文文字。最终识别文字算法采用CRNN(Convolutional Recurrent Neural Network)即卷积递归神经网络。其是DCNN和RNN的组合,专门用于识别图像中的序列式对象。与CTC loss配合使用,进行文字识别,可以直接从文本词级或行级的标注中学习,不需要详细的字符级的标注。该Module是一个识别阿拉伯文的轻量级OCR模型,支持直接预测。
- 更多详情参考:
- [Real-time Scene Text Detection with Differentiable Binarization](https://arxiv.org/pdf/1911.08947.pdf)
- [An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition](https://arxiv.org/pdf/1507.05717.pdf)
## 二、安装
- ### 1、环境依赖
- paddlepaddle >= 2.0.2
- paddlehub >= 2.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 2、安装
- ```shell
$ hub install arabic_ocr_db_crnn_mobile
```
- 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## 三、模型API预测
- ### 1、命令行预测
- ```shell
$ hub run arabic_ocr_db_crnn_mobile --input_path "/PATH/TO/IMAGE"
$ hub run arabic_ocr_db_crnn_mobile --input_path "/PATH/TO/IMAGE" --det True --rec True --use_angle_cls True --box_thresh 0.7 --angle_classification_thresh 0.8 --visualization True
```
- 通过命令行方式实现文字识别模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
- ### 2、预测代码示例
- ```python
import paddlehub as hub
import cv2
ocr = hub.Module(name="arabic_ocr_db_crnn_mobile", enable_mkldnn=True) # mkldnn加速仅在CPU下有效
result = ocr.recognize_text(images=[cv2.imread('/PATH/TO/IMAGE')])
# or
# result = ocr.recognize_text(paths=['/PATH/TO/IMAGE'])
```
- ### 3、API
- ```python
def __init__(self,
det=True,
rec=True,
use_angle_cls=False,
enable_mkldnn=False,
use_gpu=False,
box_thresh=0.6,
angle_classification_thresh=0.9)
```
- 构造ArabicOCRDBCRNNMobile对象
- **参数**
- det(bool): 是否开启文字检测。默认为True。
- rec(bool): 是否开启文字识别。默认为True。
- use_angle_cls(bool): 是否开启方向分类, 用于设置使用方向分类器识别180度旋转文字。默认为False。
- enable_mkldnn(bool): 是否开启mkldnn加速CPU计算。该参数仅在CPU运行下设置有效。默认为False。
- use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA_VISIBLE_DEVICES环境变量**
- box\_thresh (float): 检测文本框置信度的阈值;
- angle_classification_thresh(float): 文本方向分类置信度的阈值
- ```python
def recognize_text(images=[],
paths=[],
output_dir='ocr_result',
visualization=False)
```
- 预测API,检测输入图片中的所有文本的位置和识别文本结果。
- **参数**
- paths (list\[str\]): 图片的路径;
- images (list\[numpy.ndarray\]): 图片数据,ndarray.shape 为 \[H, W, C\],BGR格式;
- output\_dir (str): 图片的保存路径,默认设为 ocr\_result;
- visualization (bool): 是否将识别结果保存为图片文件, 仅有检测开启时有效, 默认为False;
- **返回**
- res (list\[dict\]): 识别结果的列表,列表中每一个元素为 dict,各字段为:
- data (list\[dict\]): 识别文本结果,列表中每一个元素为 dict,各字段为:
- text(str): 识别得到的文本
- confidence(float): 识别文本结果置信度
- text_box_position(list): 文本框在原图中的像素坐标,4*2的矩阵,依次表示文本框左下、右下、右上、左上顶点的坐标,如果无识别结果则data为\[\]
- orientation(str): 分类的方向,仅在只有方向分类开启时输出
- score(float): 分类的得分,仅在只有方向分类开启时输出
- save_path (str, optional): 识别结果的保存路径,如不保存图片则save_path为''
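- 返回结果的解析示例(以下代码仅作参考,图片路径为占位符):
- ```python
  import paddlehub as hub
  import cv2
  ocr = hub.Module(name="arabic_ocr_db_crnn_mobile")
  results = ocr.recognize_text(images=[cv2.imread('/PATH/TO/IMAGE')])
  for res in results:
      # data 中每个元素对应一个文本框;save_path 仅在保存可视化结果时非空
      for item in res['data']:
          print(item['text'], item['confidence'], item['text_box_position'])
  ```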
## 四、服务部署
- PaddleHub Serving 可以部署一个文字识别的在线服务。
- ### 第一步:启动PaddleHub Serving
- 运行启动命令:
- ```shell
$ hub serving start -m arabic_ocr_db_crnn_mobile
```
- 这样就完成了一个文字识别的服务化API的部署,默认端口号为8866。
- **NOTE:** 如使用GPU预测,则需要在启动服务之前设置CUDA\_VISIBLE\_DEVICES环境变量;如使用CPU预测,则无需设置。
- ### 第二步:发送预测请求
- 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果
- ```python
import requests
import json
import cv2
import base64
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tostring()).decode('utf8')
# 发送HTTP请求
data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/arabic_ocr_db_crnn_mobile"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# 打印预测结果
print(r.json()["results"])
```
## 五、更新历史
* 1.0.0
初始发布
- ```shell
$ hub install arabic_ocr_db_crnn_mobile==1.0.0
```
import paddlehub as hub
from paddleocr.ppocr.utils.logging import get_logger
from paddleocr.tools.infer.utility import base64_to_cv2
from paddlehub.module.module import moduleinfo, runnable, serving
@moduleinfo(
name="arabic_ocr_db_crnn_mobile",
version="1.1.0",
summary="ocr service",
author="PaddlePaddle",
type="cv/text_recognition")
class ArabicOCRDBCRNNMobile:
def __init__(self,
det=True,
rec=True,
use_angle_cls=False,
enable_mkldnn=False,
use_gpu=False,
box_thresh=0.6,
angle_classification_thresh=0.9):
"""
initialize with the necessary elements
Args:
det(bool): Whether to use text detector.
rec(bool): Whether to use text recognizer.
use_angle_cls(bool): Whether to use text orientation classifier.
enable_mkldnn(bool): Whether to enable mkldnn.
use_gpu (bool): Whether to use gpu.
box_thresh(float): the threshold of the detected text box's confidence
angle_classification_thresh(float): the threshold of the angle classification confidence
"""
self.logger = get_logger()
self.model = hub.Module(
name="multi_languages_ocr_db_crnn",
lang="arabic",
det=det,
rec=rec,
use_angle_cls=use_angle_cls,
enable_mkldnn=enable_mkldnn,
use_gpu=use_gpu,
box_thresh=box_thresh,
angle_classification_thresh=angle_classification_thresh)
self.model.name = self.name
def recognize_text(self, images=[], paths=[], output_dir='ocr_result', visualization=False):
"""
Get the text in the predicted images.
Args:
images (list[numpy.ndarray]): image data, each with shape [H, W, C] in BGR order; used when paths is not provided.
paths (list[str]): paths of the images; used when images is not provided.
output_dir (str): The directory to store output images.
visualization (bool): Whether to save image or not.
Returns:
res (list): The result of text detection box and save path of images.
"""
all_results = self.model.recognize_text(
images=images, paths=paths, output_dir=output_dir, visualization=visualization)
return all_results
@serving
def serving_method(self, images, **kwargs):
"""
Run as a service.
"""
images_decode = [base64_to_cv2(image) for image in images]
results = self.recognize_text(images_decode, **kwargs)
return results
@runnable
def run_cmd(self, argvs):
"""
Run as a command
"""
results = self.model.run_cmd(argvs)
return results
def export_onnx_model(self, dirname: str, input_shape_dict=None, opset_version=10):
'''
Export the model to ONNX format.
Args:
dirname(str): The directory to save the onnx model.
input_shape_dict: dictionary ``{ input_name: input_value }, eg. {'x': [-1, 3, -1, -1]}``
opset_version(int): operator set
'''
self.model.export_onnx_model(dirname=dirname, input_shape_dict=input_shape_dict, opset_version=opset_version)
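A hedged sketch of calling the export_onnx_model method above; the output directory and dynamic input shape are illustrative assumptions, not values mandated by the module:

```python
import paddlehub as hub

# Illustrative only: the directory name and dynamic shape below are assumptions.
ocr = hub.Module(name="arabic_ocr_db_crnn_mobile")
ocr.export_onnx_model(dirname='./onnx_arabic_ocr',
                      input_shape_dict={'x': [-1, 3, -1, -1]})
```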
paddleocr>=2.3.0.2
paddle2onnx>=0.9.0
shapely
pyclipper
# chinese_cht_ocr_db_crnn_mobile
|模型名称|chinese_cht_ocr_db_crnn_mobile|
| :--- | :---: |
|类别|图像-文字识别|
|网络|Differentiable Binarization+CRNN|
|数据集|icdar2015数据集|
|是否支持Fine-tuning|否|
|最新更新日期|2021-12-2|
|数据指标|-|
## 一、模型基本信息
- ### 模型介绍
- chinese_cht_ocr_db_crnn_mobile Module用于识别图片当中的繁体中文。其基于multi_languages_ocr_db_crnn检测得到的文本框,继续识别文本框中的繁体中文文字。最终识别文字算法采用CRNN(Convolutional Recurrent Neural Network)即卷积递归神经网络。其是DCNN和RNN的组合,专门用于识别图像中的序列式对象。与CTC loss配合使用,进行文字识别,可以直接从文本词级或行级的标注中学习,不需要详细的字符级的标注。该Module是一个识别繁体中文的轻量级OCR模型,支持直接预测。
- 更多详情参考:
- [Real-time Scene Text Detection with Differentiable Binarization](https://arxiv.org/pdf/1911.08947.pdf)
- [An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition](https://arxiv.org/pdf/1507.05717.pdf)
## 二、安装
- ### 1、环境依赖
- paddlepaddle >= 2.0.2
- paddlehub >= 2.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 2、安装
- ```shell
$ hub install chinese_cht_ocr_db_crnn_mobile
```
- 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## 三、模型API预测
- ### 1、命令行预测
- ```shell
$ hub run chinese_cht_ocr_db_crnn_mobile --input_path "/PATH/TO/IMAGE"
$ hub run chinese_cht_ocr_db_crnn_mobile --input_path "/PATH/TO/IMAGE" --det True --rec True --use_angle_cls True --box_thresh 0.7 --angle_classification_thresh 0.8 --visualization True
```
- 通过命令行方式实现文字识别模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
- ### 2、预测代码示例
- ```python
import paddlehub as hub
import cv2
ocr = hub.Module(name="chinese_cht_ocr_db_crnn_mobile", enable_mkldnn=True) # mkldnn加速仅在CPU下有效
result = ocr.recognize_text(images=[cv2.imread('/PATH/TO/IMAGE')])
# or
# result = ocr.recognize_text(paths=['/PATH/TO/IMAGE'])
```
- ### 3、API
- ```python
def __init__(self,
det=True,
rec=True,
use_angle_cls=False,
enable_mkldnn=False,
use_gpu=False,
box_thresh=0.6,
angle_classification_thresh=0.9)
```
- 构造ChineseChtOCRDBCRNNMobile对象
- **参数**
- det(bool): 是否开启文字检测。默认为True。
- rec(bool): 是否开启文字识别。默认为True。
- use_angle_cls(bool): 是否开启方向分类, 用于设置使用方向分类器识别180度旋转文字。默认为False。
- enable_mkldnn(bool): 是否开启mkldnn加速CPU计算。该参数仅在CPU运行下设置有效。默认为False。
- use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA_VISIBLE_DEVICES环境变量**
- box\_thresh (float): 检测文本框置信度的阈值;
- angle_classification_thresh(float): 文本方向分类置信度的阈值
- ```python
def recognize_text(images=[],
paths=[],
output_dir='ocr_result',
visualization=False)
```
- 预测API,检测输入图片中的所有文本的位置和识别文本结果。
- **参数**
- paths (list\[str\]): 图片的路径;
- images (list\[numpy.ndarray\]): 图片数据,ndarray.shape 为 \[H, W, C\],BGR格式;
- output\_dir (str): 图片的保存路径,默认设为 ocr\_result;
- visualization (bool): 是否将识别结果保存为图片文件, 仅有检测开启时有效, 默认为False;
- **返回**
- res (list\[dict\]): 识别结果的列表,列表中每一个元素为 dict,各字段为:
- data (list\[dict\]): 识别文本结果,列表中每一个元素为 dict,各字段为:
- text(str): 识别得到的文本
- confidence(float): 识别文本结果置信度
- text_box_position(list): 文本框在原图中的像素坐标,4*2的矩阵,依次表示文本框左下、右下、右上、左上顶点的坐标,如果无识别结果则data为\[\]
- orientation(str): 分类的方向,仅在只有方向分类开启时输出
- score(float): 分类的得分,仅在只有方向分类开启时输出
- save_path (str, optional): 识别结果的保存路径,如不保存图片则save_path为''
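- 方向分类结果的参考示例(以下配置与路径仅作示意,假设仅开启方向分类时会返回orientation与score字段):
- ```python
  import paddlehub as hub
  import cv2
  # 仅作示意:只开启方向分类器,不做检测与识别
  cls_only = hub.Module(name="chinese_cht_ocr_db_crnn_mobile", det=False, rec=False, use_angle_cls=True)
  results = cls_only.recognize_text(images=[cv2.imread('/PATH/TO/IMAGE')])
  for res in results:
      for item in res['data']:
          print(item['orientation'], item['score'])
  ```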
## 四、服务部署
- PaddleHub Serving 可以部署一个文字识别的在线服务。
- ### 第一步:启动PaddleHub Serving
- 运行启动命令:
- ```shell
$ hub serving start -m chinese_cht_ocr_db_crnn_mobile
```
- 这样就完成了一个文字识别的服务化API的部署,默认端口号为8866。
- **NOTE:** 如使用GPU预测,则需要在启动服务之前设置CUDA\_VISIBLE\_DEVICES环境变量;如使用CPU预测,则无需设置。
- ### 第二步:发送预测请求
- 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果
- ```python
import requests
import json
import cv2
import base64
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tostring()).decode('utf8')
# 发送HTTP请求
data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/chinese_cht_ocr_db_crnn_mobile"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# 打印预测结果
print(r.json()["results"])
```
## 五、更新历史
* 1.0.0
初始发布
- ```shell
$ hub install chinese_cht_ocr_db_crnn_mobile==1.0.0
```
import paddlehub as hub
from paddleocr.ppocr.utils.logging import get_logger
from paddleocr.tools.infer.utility import base64_to_cv2
from paddlehub.module.module import moduleinfo, runnable, serving
@moduleinfo(
name="chinese_cht_ocr_db_crnn_mobile",
version="1.0.0",
summary="ocr service",
author="PaddlePaddle",
type="cv/text_recognition")
class ChineseChtOCRDBCRNNMobile:
def __init__(self,
det=True,
rec=True,
use_angle_cls=False,
enable_mkldnn=False,
use_gpu=False,
box_thresh=0.6,
angle_classification_thresh=0.9):
"""
initialize with the necessary elements
Args:
det(bool): Whether to use text detector.
rec(bool): Whether to use text recognizer.
use_angle_cls(bool): Whether to use text orientation classifier.
enable_mkldnn(bool): Whether to enable mkldnn.
use_gpu (bool): Whether to use gpu.
box_thresh(float): the threshold of the detected text box's confidence
angle_classification_thresh(float): the threshold of the angle classification confidence
"""
self.logger = get_logger()
self.model = hub.Module(
name="multi_languages_ocr_db_crnn",
lang="chinese_cht",
det=det,
rec=rec,
use_angle_cls=use_angle_cls,
enable_mkldnn=enable_mkldnn,
use_gpu=use_gpu,
box_thresh=box_thresh,
angle_classification_thresh=angle_classification_thresh)
self.model.name = self.name
def recognize_text(self, images=[], paths=[], output_dir='ocr_result', visualization=False):
"""
Get the text in the predicted images.
Args:
images (list[numpy.ndarray]): image data, each with shape [H, W, C] in BGR order; used when paths is not provided.
paths (list[str]): paths of the images; used when images is not provided.
output_dir (str): The directory to store output images.
visualization (bool): Whether to save image or not.
Returns:
res (list): The result of text detection box and save path of images.
"""
all_results = self.model.recognize_text(
images=images, paths=paths, output_dir=output_dir, visualization=visualization)
return all_results
@serving
def serving_method(self, images, **kwargs):
"""
Run as a service.
"""
images_decode = [base64_to_cv2(image) for image in images]
results = self.recognize_text(images_decode, **kwargs)
return results
@runnable
def run_cmd(self, argvs):
"""
Run as a command
"""
results = self.model.run_cmd(argvs)
return results
def export_onnx_model(self, dirname: str, input_shape_dict=None, opset_version=10):
'''
Export the model to ONNX format.
Args:
dirname(str): The directory to save the onnx model.
input_shape_dict: dictionary ``{ input_name: input_value }, eg. {'x': [-1, 3, -1, -1]}``
opset_version(int): operator set
'''
self.model.export_onnx_model(dirname=dirname, input_shape_dict=input_shape_dict, opset_version=opset_version)
# cyrillic_ocr_db_crnn_mobile
|模型名称|cyrillic_ocr_db_crnn_mobile|
| :--- | :---: |
|类别|图像-文字识别|
|网络|Differentiable Binarization+CRNN|
|数据集|icdar2015数据集|
|是否支持Fine-tuning|否|
|最新更新日期|2021-12-2|
|数据指标|-|
## 一、模型基本信息
- ### 模型介绍
- cyrillic_ocr_db_crnn_mobile Module用于识别图片当中的斯拉夫文,包括俄罗斯文、塞尔维亚文、白俄罗斯文、保加利亚文、乌克兰文、蒙古文、阿迪赫文、阿瓦尔文、达尔瓦文、因古什文、拉克文、莱兹甘文、塔巴萨兰文。其基于multi_languages_ocr_db_crnn检测得到的文本框,继续识别文本框中的斯拉夫文文字。最终识别文字算法采用CRNN(Convolutional Recurrent Neural Network)即卷积递归神经网络。其是DCNN和RNN的组合,专门用于识别图像中的序列式对象。与CTC loss配合使用,进行文字识别,可以直接从文本词级或行级的标注中学习,不需要详细的字符级的标注。该Module是一个识别斯拉夫文的轻量级OCR模型,支持直接预测。
- 更多详情参考:
- [Real-time Scene Text Detection with Differentiable Binarization](https://arxiv.org/pdf/1911.08947.pdf)
- [An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition](https://arxiv.org/pdf/1507.05717.pdf)
## 二、安装
- ### 1、环境依赖
- paddlepaddle >= 2.0.2
- paddlehub >= 2.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 2、安装
- ```shell
$ hub install cyrillic_ocr_db_crnn_mobile
```
- 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## 三、模型API预测
- ### 1、命令行预测
- ```shell
$ hub run cyrillic_ocr_db_crnn_mobile --input_path "/PATH/TO/IMAGE"
$ hub run cyrillic_ocr_db_crnn_mobile --input_path "/PATH/TO/IMAGE" --det True --rec True --use_angle_cls True --box_thresh 0.7 --angle_classification_thresh 0.8 --visualization True
```
- 通过命令行方式实现文字识别模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
- ### 2、预测代码示例
- ```python
import paddlehub as hub
import cv2
ocr = hub.Module(name="cyrillic_ocr_db_crnn_mobile", enable_mkldnn=True) # mkldnn加速仅在CPU下有效
result = ocr.recognize_text(images=[cv2.imread('/PATH/TO/IMAGE')])
# or
# result = ocr.recognize_text(paths=['/PATH/TO/IMAGE'])
```
- ### 3、API
- ```python
def __init__(self,
det=True,
rec=True,
use_angle_cls=False,
enable_mkldnn=False,
use_gpu=False,
box_thresh=0.6,
angle_classification_thresh=0.9)
```
- 构造CyrillicOCRDBCRNNMobile对象
- **参数**
- det(bool): 是否开启文字检测。默认为True。
- rec(bool): 是否开启文字识别。默认为True。
- use_angle_cls(bool): 是否开启方向分类, 用于设置使用方向分类器识别180度旋转文字。默认为False。
- enable_mkldnn(bool): 是否开启mkldnn加速CPU计算。该参数仅在CPU运行下设置有效。默认为False。
- use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA_VISIBLE_DEVICES环境变量**
- box\_thresh (float): 检测文本框置信度的阈值;
- angle_classification_thresh(float): 文本方向分类置信度的阈值
- ```python
def recognize_text(images=[],
paths=[],
output_dir='ocr_result',
visualization=False)
```
- 预测API,检测输入图片中的所有文本的位置和识别文本结果。
- **参数**
- paths (list\[str\]): 图片的路径;
- images (list\[numpy.ndarray\]): 图片数据,ndarray.shape 为 \[H, W, C\],BGR格式;
- output\_dir (str): 图片的保存路径,默认设为 ocr\_result;
- visualization (bool): 是否将识别结果保存为图片文件, 仅有检测开启时有效, 默认为False;
- **返回**
- res (list\[dict\]): 识别结果的列表,列表中每一个元素为 dict,各字段为:
- data (list\[dict\]): 识别文本结果,列表中每一个元素为 dict,各字段为:
- text(str): 识别得到的文本
- confidence(float): 识别文本结果置信度
- text_box_position(list): 文本框在原图中的像素坐标,4*2的矩阵,依次表示文本框左下、右下、右上、左上顶点的坐标,如果无识别结果则data为\[\]
- orientation(str): 分类的方向,仅在只有方向分类开启时输出
- score(float): 分类的得分,仅在只有方向分类开启时输出
- save_path (str, optional): 识别结果的保存路径,如不保存图片则save_path为''
## 四、服务部署
- PaddleHub Serving 可以部署一个文字识别的在线服务。
- ### 第一步:启动PaddleHub Serving
- 运行启动命令:
- ```shell
$ hub serving start -m cyrillic_ocr_db_crnn_mobile
```
- 这样就完成了一个文字识别的服务化API的部署,默认端口号为8866。
- **NOTE:** 如使用GPU预测,则需要在启动服务之前设置CUDA\_VISIBLE\_DEVICES环境变量;如使用CPU预测,则无需设置。
- ### 第二步:发送预测请求
- 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果
- ```python
import requests
import json
import cv2
import base64
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tostring()).decode('utf8')
# 发送HTTP请求
data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/cyrillic_ocr_db_crnn_mobile"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# 打印预测结果
print(r.json()["results"])
```
## 五、更新历史
* 1.0.0
初始发布
- ```shell
$ hub install cyrillic_ocr_db_crnn_mobile==1.0.0
```
import paddlehub as hub
from paddleocr.ppocr.utils.logging import get_logger
from paddleocr.tools.infer.utility import base64_to_cv2
from paddlehub.module.module import moduleinfo, runnable, serving
@moduleinfo(
name="cyrillic_ocr_db_crnn_mobile",
version="1.0.0",
summary="ocr service",
author="PaddlePaddle",
type="cv/text_recognition")
class CyrillicOCRDBCRNNMobile:
def __init__(self,
det=True,
rec=True,
use_angle_cls=False,
enable_mkldnn=False,
use_gpu=False,
box_thresh=0.6,
angle_classification_thresh=0.9):
"""
initialize with the necessary elements
Args:
det(bool): Whether to use text detector.
rec(bool): Whether to use text recognizer.
use_angle_cls(bool): Whether to use text orientation classifier.
enable_mkldnn(bool): Whether to enable mkldnn.
use_gpu (bool): Whether to use gpu.
box_thresh(float): the threshold of the detected text box's confidence
angle_classification_thresh(float): the threshold of the angle classification confidence
"""
self.logger = get_logger()
self.model = hub.Module(
name="multi_languages_ocr_db_crnn",
lang="cyrillic",
det=det,
rec=rec,
use_angle_cls=use_angle_cls,
enable_mkldnn=enable_mkldnn,
use_gpu=use_gpu,
box_thresh=box_thresh,
angle_classification_thresh=angle_classification_thresh)
self.model.name = self.name
def recognize_text(self, images=[], paths=[], output_dir='ocr_result', visualization=False):
"""
Get the text in the predicted images.
Args:
images (list[numpy.ndarray]): image data, each with shape [H, W, C] in BGR order; used when paths is not provided.
paths (list[str]): paths of the images; used when images is not provided.
output_dir (str): The directory to store output images.
visualization (bool): Whether to save image or not.
Returns:
res (list): The result of text detection box and save path of images.
"""
all_results = self.model.recognize_text(
images=images, paths=paths, output_dir=output_dir, visualization=visualization)
return all_results
@serving
def serving_method(self, images, **kwargs):
"""
Run as a service.
"""
images_decode = [base64_to_cv2(image) for image in images]
results = self.recognize_text(images_decode, **kwargs)
return results
@runnable
def run_cmd(self, argvs):
"""
Run as a command
"""
results = self.model.run_cmd(argvs)
return results
def export_onnx_model(self, dirname: str, input_shape_dict=None, opset_version=10):
'''
Export the model to ONNX format.
Args:
dirname(str): The directory to save the onnx model.
input_shape_dict: dictionary ``{ input_name: input_value }, eg. {'x': [-1, 3, -1, -1]}``
opset_version(int): operator set
'''
self.model.export_onnx_model(dirname=dirname, input_shape_dict=input_shape_dict, opset_version=opset_version)
paddleocr>=2.3.0.2
paddle2onnx>=0.9.0
shapely
pyclipper
# devanagari_ocr_db_crnn_mobile
|模型名称|devanagari_ocr_db_crnn_mobile|
| :--- | :---: |
|类别|图像-文字识别|
|网络|Differentiable Binarization+CRNN|
|数据集|icdar2015数据集|
|是否支持Fine-tuning|否|
|最新更新日期|2021-12-2|
|数据指标|-|
## 一、模型基本信息
- ### 模型介绍
- devanagari_ocr_db_crnn_mobile Module用于识别图片当中的梵文,包括印地文、马拉地文、尼泊尔文、比尔哈文、迈蒂利文、昂加文、孟加拉文、摩揭陀文、那格浦尔文、尼瓦尔文。其基于multi_languages_ocr_db_crnn检测得到的文本框,继续识别文本框中的梵文文字。最终识别文字算法采用CRNN(Convolutional Recurrent Neural Network)即卷积递归神经网络。其是DCNN和RNN的组合,专门用于识别图像中的序列式对象。与CTC loss配合使用,进行文字识别,可以直接从文本词级或行级的标注中学习,不需要详细的字符级的标注。该Module是一个识别梵文的轻量级OCR模型,支持直接预测。
- 更多详情参考:
- [Real-time Scene Text Detection with Differentiable Binarization](https://arxiv.org/pdf/1911.08947.pdf)
- [An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition](https://arxiv.org/pdf/1507.05717.pdf)
## 二、安装
- ### 1、环境依赖
- paddlepaddle >= 2.0.2
- paddlehub >= 2.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 2、安装
- ```shell
$ hub install devanagari_ocr_db_crnn_mobile
```
- 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## 三、模型API预测
- ### 1、命令行预测
- ```shell
$ hub run devanagari_ocr_db_crnn_mobile --input_path "/PATH/TO/IMAGE"
$ hub run devanagari_ocr_db_crnn_mobile --input_path "/PATH/TO/IMAGE" --det True --rec True --use_angle_cls True --box_thresh 0.7 --angle_classification_thresh 0.8 --visualization True
```
- 通过命令行方式实现文字识别模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
- ### 2、预测代码示例
- ```python
import paddlehub as hub
import cv2
ocr = hub.Module(name="devanagari_ocr_db_crnn_mobile", enable_mkldnn=True) # mkldnn加速仅在CPU下有效
result = ocr.recognize_text(images=[cv2.imread('/PATH/TO/IMAGE')])
# or
# result = ocr.recognize_text(paths=['/PATH/TO/IMAGE'])
```
- ### 3、API
- ```python
def __init__(self,
det=True,
rec=True,
use_angle_cls=False,
enable_mkldnn=False,
use_gpu=False,
box_thresh=0.6,
angle_classification_thresh=0.9)
```
- 构造DevanagariOCRDBCRNNMobile对象
- **参数**
- det(bool): 是否开启文字检测。默认为True。
- rec(bool): 是否开启文字识别。默认为True。
- use_angle_cls(bool): 是否开启方向分类, 用于设置使用方向分类器识别180度旋转文字。默认为False。
- enable_mkldnn(bool): 是否开启mkldnn加速CPU计算。该参数仅在CPU运行下设置有效。默认为False。
- use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA_VISIBLE_DEVICES环境变量**
- box\_thresh (float): 检测文本框置信度的阈值;
- angle_classification_thresh(float): 文本方向分类置信度的阈值
- ```python
def recognize_text(images=[],
paths=[],
output_dir='ocr_result',
visualization=False)
```
- 预测API,检测输入图片中的所有文本的位置和识别文本结果。
- **参数**
- paths (list\[str\]): 图片的路径;
- images (list\[numpy.ndarray\]): 图片数据,ndarray.shape 为 \[H, W, C\],BGR格式;
- output\_dir (str): 图片的保存路径,默认设为 ocr\_result;
- visualization (bool): 是否将识别结果保存为图片文件, 仅有检测开启时有效, 默认为False;
- **返回**
- res (list\[dict\]): 识别结果的列表,列表中每一个元素为 dict,各字段为:
- data (list\[dict\]): 识别文本结果,列表中每一个元素为 dict,各字段为:
- text(str): 识别得到的文本
- confidence(float): 识别文本结果置信度
- text_box_position(list): 文本框在原图中的像素坐标,4*2的矩阵,依次表示文本框左下、右下、右上、左上顶点的坐标,如果无识别结果则data为\[\]
- orientation(str): 分类的方向,仅在只有方向分类开启时输出
- score(float): 分类的得分,仅在只有方向分类开启时输出
- save_path (str, optional): 识别结果的保存路径,如不保存图片则save_path为''
## 四、服务部署
- PaddleHub Serving 可以部署一个文字识别的在线服务。
- ### 第一步:启动PaddleHub Serving
- 运行启动命令:
- ```shell
$ hub serving start -m devanagari_ocr_db_crnn_mobile
```
- 这样就完成了一个文字识别的服务化API的部署,默认端口号为8866。
- **NOTE:** 如使用GPU预测,则需要在启动服务之前设置CUDA\_VISIBLE\_DEVICES环境变量;如使用CPU预测,则无需设置。
- ### 第二步:发送预测请求
- 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果
- ```python
import requests
import json
import cv2
import base64
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tostring()).decode('utf8')
# 发送HTTP请求
data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/devanagari_ocr_db_crnn_mobile"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# 打印预测结果
print(r.json()["results"])
```
## 五、更新历史
* 1.0.0
初始发布
- ```shell
$ hub install devanagari_ocr_db_crnn_mobile==1.0.0
```
import paddlehub as hub
from paddleocr.ppocr.utils.logging import get_logger
from paddleocr.tools.infer.utility import base64_to_cv2
from paddlehub.module.module import moduleinfo, runnable, serving
@moduleinfo(
name="devanagari_ocr_db_crnn_mobile",
version="1.0.0",
summary="ocr service",
author="PaddlePaddle",
type="cv/text_recognition")
class DevanagariOCRDBCRNNMobile:
def __init__(self,
det=True,
rec=True,
use_angle_cls=False,
enable_mkldnn=False,
use_gpu=False,
box_thresh=0.6,
angle_classification_thresh=0.9):
"""
initialize with the necessary elements
Args:
det(bool): Whether to use text detector.
rec(bool): Whether to use text recognizer.
use_angle_cls(bool): Whether to use text orientation classifier.
enable_mkldnn(bool): Whether to enable mkldnn.
use_gpu (bool): Whether to use gpu.
box_thresh(float): the threshold of the detected text box's confidence
angle_classification_thresh(float): the threshold of the angle classification confidence
"""
self.logger = get_logger()
self.model = hub.Module(
name="multi_languages_ocr_db_crnn",
lang="devanagari",
det=det,
rec=rec,
use_angle_cls=use_angle_cls,
enable_mkldnn=enable_mkldnn,
use_gpu=use_gpu,
box_thresh=box_thresh,
angle_classification_thresh=angle_classification_thresh)
self.model.name = self.name
def recognize_text(self, images=[], paths=[], output_dir='ocr_result', visualization=False):
"""
Get the text in the predicted images.
Args:
images (list[numpy.ndarray]): image data, each with shape [H, W, C] in BGR order; used when paths is not provided.
paths (list[str]): paths of the images; used when images is not provided.
output_dir (str): The directory to store output images.
visualization (bool): Whether to save image or not.
Returns:
res (list): The result of text detection box and save path of images.
"""
all_results = self.model.recognize_text(
images=images, paths=paths, output_dir=output_dir, visualization=visualization)
return all_results
@serving
def serving_method(self, images, **kwargs):
"""
Run as a service.
"""
images_decode = [base64_to_cv2(image) for image in images]
results = self.recognize_text(images_decode, **kwargs)
return results
@runnable
def run_cmd(self, argvs):
"""
Run as a command
"""
results = self.model.run_cmd(argvs)
return results
def export_onnx_model(self, dirname: str, input_shape_dict=None, opset_version=10):
'''
Export the model to ONNX format.
Args:
dirname(str): The directory to save the onnx model.
input_shape_dict: dictionary ``{ input_name: input_value }, eg. {'x': [-1, 3, -1, -1]}``
opset_version(int): operator set
'''
self.model.export_onnx_model(dirname=dirname, input_shape_dict=input_shape_dict, opset_version=opset_version)
paddleocr>=2.3.0.2
paddle2onnx>=0.9.0
shapely
pyclipper
# french_ocr_db_crnn_mobile
|模型名称|french_ocr_db_crnn_mobile|
| :--- | :---: |
|类别|图像-文字识别|
|网络|Differentiable Binarization+CRNN|
|数据集|icdar2015数据集|
|是否支持Fine-tuning|否|
|最新更新日期|2021-12-2|
|数据指标|-|
## 一、模型基本信息
- ### 模型介绍
- french_ocr_db_crnn_mobile Module用于识别图片当中的法文。其基于multi_languages_ocr_db_crnn检测得到的文本框,继续识别文本框中的法文文字。最终识别文字算法采用CRNN(Convolutional Recurrent Neural Network)即卷积递归神经网络。其是DCNN和RNN的组合,专门用于识别图像中的序列式对象。与CTC loss配合使用,进行文字识别,可以直接从文本词级或行级的标注中学习,不需要详细的字符级的标注。该Module是一个识别法文的轻量级OCR模型,支持直接预测。
- 更多详情参考:
- [Real-time Scene Text Detection with Differentiable Binarization](https://arxiv.org/pdf/1911.08947.pdf)
- [An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition](https://arxiv.org/pdf/1507.05717.pdf)
## 二、安装
- ### 1、环境依赖
- paddlepaddle >= 2.0.2
- paddlehub >= 2.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 2、安装
- ```shell
$ hub install french_ocr_db_crnn_mobile
```
- 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## 三、模型API预测
- ### 1、命令行预测
- ```shell
$ hub run french_ocr_db_crnn_mobile --input_path "/PATH/TO/IMAGE"
$ hub run french_ocr_db_crnn_mobile --input_path "/PATH/TO/IMAGE" --det True --rec True --use_angle_cls True --box_thresh 0.7 --angle_classification_thresh 0.8 --visualization True
```
- 通过命令行方式实现文字识别模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
- ### 2、预测代码示例
- ```python
import paddlehub as hub
import cv2
ocr = hub.Module(name="french_ocr_db_crnn_mobile", enable_mkldnn=True) # mkldnn加速仅在CPU下有效
result = ocr.recognize_text(images=[cv2.imread('/PATH/TO/IMAGE')])
# or
# result = ocr.recognize_text(paths=['/PATH/TO/IMAGE'])
```
- ### 3、API
- ```python
def __init__(self,
det=True,
rec=True,
use_angle_cls=False,
enable_mkldnn=False,
use_gpu=False,
box_thresh=0.6,
angle_classification_thresh=0.9)
```
- 构造FrechOCRDBCRNNMobile对象
- **参数**
- det(bool): 是否开启文字检测。默认为True。
- rec(bool): 是否开启文字识别。默认为True。
- use_angle_cls(bool): 是否开启方向分类, 用于设置使用方向分类器识别180度旋转文字。默认为False。
- enable_mkldnn(bool): 是否开启mkldnn加速CPU计算。该参数仅在CPU运行下设置有效。默认为False。
- use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA_VISIBLE_DEVICES环境变量**
- box\_thresh (float): 检测文本框置信度的阈值;
- angle_classification_thresh(float): 文本方向分类置信度的阈值
- ```python
def recognize_text(images=[],
paths=[],
output_dir='ocr_result',
visualization=False)
```
- 预测API,检测输入图片中的所有文本的位置和识别文本结果。
- **参数**
- paths (list\[str\]): 图片的路径;
- images (list\[numpy.ndarray\]): 图片数据,ndarray.shape 为 \[H, W, C\],BGR格式;
- output\_dir (str): 图片的保存路径,默认设为 ocr\_result;
- visualization (bool): 是否将识别结果保存为图片文件, 仅有检测开启时有效, 默认为False;
- **返回**
- res (list\[dict\]): 识别结果的列表,列表中每一个元素为 dict,各字段为:
- data (list\[dict\]): 识别文本结果,列表中每一个元素为 dict,各字段为:
- text(str): 识别得到的文本
- confidence(float): 识别文本结果置信度
- text_box_position(list): 文本框在原图中的像素坐标,4*2的矩阵,依次表示文本框左下、右下、右上、左上顶点的坐标,如果无识别结果则data为\[\]
- orientation(str): 分类的方向,仅在只有方向分类开启时输出
- score(float): 分类的得分,仅在只有方向分类开启时输出
- save_path (str, optional): 识别结果的保存路径,如不保存图片则save_path为''
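- 批量识别的参考示例(目录路径为占位符,仅作示意,假设返回结果与输入路径顺序一致):
- ```python
  import glob
  import paddlehub as hub
  ocr = hub.Module(name="french_ocr_db_crnn_mobile")
  image_paths = glob.glob('/PATH/TO/IMAGES/*.jpg')
  results = ocr.recognize_text(paths=image_paths, output_dir='ocr_result', visualization=True)
  for path, res in zip(image_paths, results):
      print(path, [item['text'] for item in res['data']])
  ```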
## 四、服务部署
- PaddleHub Serving 可以部署一个文字识别的在线服务。
- ### 第一步:启动PaddleHub Serving
- 运行启动命令:
- ```shell
$ hub serving start -m french_ocr_db_crnn_mobile
```
- 这样就完成了一个文字识别的服务化API的部署,默认端口号为8866。
- **NOTE:** 如使用GPU预测,则需要在启动服务之前设置CUDA\_VISIBLE\_DEVICES环境变量;如使用CPU预测,则无需设置。
- ### 第二步:发送预测请求
- 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果
- ```python
import requests
import json
import cv2
import base64
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tostring()).decode('utf8')
# 发送HTTP请求
data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/french_ocr_db_crnn_mobile"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# 打印预测结果
print(r.json()["results"])
```
## 五、更新历史
* 1.0.0
初始发布
* 1.1.0
优化模型
- ```shell
$ hub install french_ocr_db_crnn_mobile==1.1.0
```
import paddlehub as hub
from paddleocr.ppocr.utils.logging import get_logger
from paddleocr.tools.infer.utility import base64_to_cv2
from paddlehub.module.module import moduleinfo, runnable, serving
@moduleinfo(
name="french_ocr_db_crnn_mobile",
version="1.1.0",
summary="ocr service",
author="PaddlePaddle",
type="cv/text_recognition")
class FrechOCRDBCRNNMobile:
def __init__(self,
det=True,
rec=True,
use_angle_cls=False,
enable_mkldnn=False,
use_gpu=False,
box_thresh=0.6,
angle_classification_thresh=0.9):
"""
initialize with the necessary elements
Args:
det(bool): Whether to use text detector.
rec(bool): Whether to use text recognizer.
use_angle_cls(bool): Whether to use text orientation classifier.
enable_mkldnn(bool): Whether to enable mkldnn.
use_gpu (bool): Whether to use gpu.
box_thresh(float): the threshold of the detected text box's confidence
angle_classification_thresh(float): the threshold of the angle classification confidence
"""
self.logger = get_logger()
self.model = hub.Module(
name="multi_languages_ocr_db_crnn",
lang="fr",
det=det,
rec=rec,
use_angle_cls=use_angle_cls,
enable_mkldnn=enable_mkldnn,
use_gpu=use_gpu,
box_thresh=box_thresh,
angle_classification_thresh=angle_classification_thresh)
self.model.name = self.name
def recognize_text(self, images=[], paths=[], output_dir='ocr_result', visualization=False):
"""
Get the text in the predicted images.
Args:
images (list[numpy.ndarray]): image data, each with shape [H, W, C] in BGR order; used when paths is not provided.
paths (list[str]): paths of the images; used when images is not provided.
output_dir (str): The directory to store output images.
visualization (bool): Whether to save image or not.
Returns:
res (list): The result of text detection box and save path of images.
"""
all_results = self.model.recognize_text(
images=images, paths=paths, output_dir=output_dir, visualization=visualization)
return all_results
@serving
def serving_method(self, images, **kwargs):
"""
Run as a service.
"""
images_decode = [base64_to_cv2(image) for image in images]
results = self.recognize_text(images_decode, **kwargs)
return results
@runnable
def run_cmd(self, argvs):
"""
Run as a command
"""
results = self.model.run_cmd(argvs)
return results
def export_onnx_model(self, dirname: str, input_shape_dict=None, opset_version=10):
'''
Export the model to ONNX format.
Args:
dirname(str): The directory to save the onnx model.
input_shape_dict: dictionary ``{ input_name: input_value }, eg. {'x': [-1, 3, -1, -1]}``
opset_version(int): operator set
'''
self.model.export_onnx_model(dirname=dirname, input_shape_dict=input_shape_dict, opset_version=opset_version)
paddleocr>=2.3.0.2
paddle2onnx>=0.9.0
shapely
pyclipper
......@@ -27,18 +27,9 @@
- ### 1、环境依赖
- paddlepaddle >= 1.8.0
- paddlepaddle >= 2.0.2
- paddlehub >= 1.8.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst)
- shapely
- pyclipper
- ```shell
$ pip install shapely pyclipper
```
- **该Module依赖于第三方库shapely和pyclipper,使用该Module之前,请先安装shapely和pyclipper。**
- paddlehub >= 2.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 2、安装
......@@ -58,7 +49,7 @@
- 通过命令行方式实现文字识别模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
- ### 2、代码示例
- ### 2、预测代码示例
- ```python
import paddlehub as hub
......@@ -159,13 +150,15 @@
print(r.json()["results"])
```
## 五、更新历史
* 1.0.0
初始发布
* 1.1.0
优化模型
- ```shell
$ hub install german_ocr_db_crnn_mobile==1.0.0
$ hub install german_ocr_db_crnn_mobile==1.1.0
```
!
"
$
%
&
'
(
)
+
,
-
.
/
0
1
2
3
4
5
6
7
8
9
:
;
>
?
A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
S
T
U
V
W
X
Y
Z
[
]
a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
p
q
r
s
t
u
v
w
x
y
z
£
§
­
²
´
µ
·
º
¼
½
¿
À
Á
Ä
Å
Ç
É
Í
Ï
Ô
Ö
Ø
Ù
Ü
ß
à
á
â
ã
ä
å
æ
ç
è
é
ê
ë
í
ï
ñ
ò
ó
ô
ö
ø
ù
ú
û
ü
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import numpy as np
import string
class CharacterOps(object):
""" Convert between text-label and text-index """
def __init__(self, config):
self.character_type = config['character_type']
self.loss_type = config['loss_type']
self.max_text_len = config['max_text_length']
if self.character_type == "en":
self.character_str = "0123456789abcdefghijklmnopqrstuvwxyz"
dict_character = list(self.character_str)
elif self.character_type in [
"ch", 'japan', 'korean', 'french', 'german'
]:
character_dict_path = config['character_dict_path']
add_space = False
if 'use_space_char' in config:
add_space = config['use_space_char']
self.character_str = ""
with open(character_dict_path, "rb") as fin:
lines = fin.readlines()
for line in lines:
line = line.decode('utf-8').strip("\n").strip("\r\n")
self.character_str += line
if add_space:
self.character_str += " "
dict_character = list(self.character_str)
elif self.character_type == "en_sensitive":
# same with ASTER setting (use 94 char).
self.character_str = string.printable[:-6]
dict_character = list(self.character_str)
else:
self.character_str = None
assert self.character_str is not None, \
"Unsupported character type: {}".format(self.character_type)
self.beg_str = "sos"
self.end_str = "eos"
if self.loss_type == "attention":
dict_character = [self.beg_str, self.end_str] + dict_character
elif self.loss_type == "srn":
dict_character = dict_character + [self.beg_str, self.end_str]
self.dict = {}
for i, char in enumerate(dict_character):
self.dict[char] = i
self.character = dict_character
def encode(self, text):
"""convert text-label into text-index.
input:
text: text labels of each image. [batch_size]
output:
text: concatenated text index for CTCLoss.
[sum(text_lengths)] = [text_index_0 + text_index_1 + ... + text_index_(n - 1)]
length: length of each text. [batch_size]
"""
if self.character_type == "en":
text = text.lower()
text_list = []
for char in text:
if char not in self.dict:
continue
text_list.append(self.dict[char])
text = np.array(text_list)
return text
def decode(self, text_index, is_remove_duplicate=False):
""" convert text-index into text-label. """
char_list = []
char_num = self.get_char_num()
if self.loss_type == "attention":
beg_idx = self.get_beg_end_flag_idx("beg")
end_idx = self.get_beg_end_flag_idx("end")
ignored_tokens = [beg_idx, end_idx]
else:
ignored_tokens = [char_num]
for idx in range(len(text_index)):
if text_index[idx] in ignored_tokens:
continue
if is_remove_duplicate:
if idx > 0 and text_index[idx - 1] == text_index[idx]:
continue
char_list.append(self.character[int(text_index[idx])])
text = ''.join(char_list)
return text
def get_char_num(self):
return len(self.character)
def get_beg_end_flag_idx(self, beg_or_end):
if self.loss_type == "attention":
if beg_or_end == "beg":
idx = np.array(self.dict[self.beg_str])
elif beg_or_end == "end":
idx = np.array(self.dict[self.end_str])
else:
assert False, "Unsupport type %s in get_beg_end_flag_idx"\
% beg_or_end
return idx
else:
err = "error in get_beg_end_flag_idx when using the loss %s"\
% (self.loss_type)
assert False, err
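A small usage sketch for the CharacterOps class above, using the built-in English charset with CTC-style decoding; the config values are illustrative:

```python
# Illustrative config: built-in English charset (0-9a-z), CTC-style indexing.
char_ops = CharacterOps({
    'character_type': 'en',
    'loss_type': 'ctc',
    'max_text_length': 25,
})
indices = char_ops.encode("Hello")   # lower-cased and mapped to indices
print(indices)                       # [17 14 21 21 24]
print(char_ops.decode(indices))      # hello
```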
def cal_predicts_accuracy(char_ops,
preds,
preds_lod,
labels,
labels_lod,
is_remove_duplicate=False):
acc_num = 0
img_num = 0
for ino in range(len(labels_lod) - 1):
beg_no = preds_lod[ino]
end_no = preds_lod[ino + 1]
preds_text = preds[beg_no:end_no].reshape(-1)
preds_text = char_ops.decode(preds_text, is_remove_duplicate)
beg_no = labels_lod[ino]
end_no = labels_lod[ino + 1]
labels_text = labels[beg_no:end_no].reshape(-1)
labels_text = char_ops.decode(labels_text, is_remove_duplicate)
img_num += 1
if preds_text == labels_text:
acc_num += 1
acc = acc_num * 1.0 / img_num
return acc, acc_num, img_num
def cal_predicts_accuracy_srn(char_ops,
preds,
labels,
max_text_len,
is_debug=False):
acc_num = 0
img_num = 0
char_num = char_ops.get_char_num()
total_len = preds.shape[0]
img_num = int(total_len / max_text_len)
for i in range(img_num):
cur_label = []
cur_pred = []
for j in range(max_text_len):
if labels[j + i * max_text_len] != int(char_num - 1): #0
cur_label.append(labels[j + i * max_text_len][0])
else:
break
for j in range(max_text_len + 1):
if j < len(cur_label) and preds[j + i * max_text_len][
0] != cur_label[j]:
break
elif j == len(cur_label) and j == max_text_len:
acc_num += 1
break
elif j == len(cur_label) and preds[j + i * max_text_len][0] == int(
char_num - 1):
acc_num += 1
break
acc = acc_num * 1.0 / img_num
return acc, acc_num, img_num
def convert_rec_attention_infer_res(preds):
img_num = preds.shape[0]
target_lod = [0]
convert_ids = []
for ino in range(img_num):
end_pos = np.where(preds[ino, :] == 1)[0]
if len(end_pos) <= 1:
text_list = preds[ino, 1:]
else:
text_list = preds[ino, 1:end_pos[1]]
target_lod.append(target_lod[ino] + len(text_list))
convert_ids = convert_ids + list(text_list)
convert_ids = np.array(convert_ids)
convert_ids = convert_ids.reshape((-1, 1))
return convert_ids, target_lod
def convert_rec_label_to_lod(ori_labels):
img_num = len(ori_labels)
target_lod = [0]
convert_ids = []
for ino in range(img_num):
target_lod.append(target_lod[ino] + len(ori_labels[ino]))
convert_ids = convert_ids + list(ori_labels[ino])
convert_ids = np.array(convert_ids)
convert_ids = convert_ids.reshape((-1, 1))
return convert_ids, target_lod
paddleocr>=2.3.0.2
paddle2onnx>=0.9.0
shapely
pyclipper
# -*- coding:utf-8 -*-
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import math
from PIL import Image, ImageDraw, ImageFont
import base64
import cv2
import numpy as np
def draw_ocr(image,
boxes,
txts,
scores,
font_file,
draw_txt=True,
drop_score=0.5):
"""
Visualize the results of OCR detection and recognition
args:
image(Image|array): RGB image
boxes(list): boxes with shape(N, 4, 2)
txts(list): the texts
scores(list): corresponding scores of the texts
draw_txt(bool): whether draw text or not
drop_score(float): only scores greater than drop_threshold will be visualized
return(array):
the visualized img
"""
if scores is None:
scores = [1] * len(boxes)
for (box, score) in zip(boxes, scores):
if score < drop_score or math.isnan(score):
continue
box = np.reshape(np.array(box), [-1, 1, 2]).astype(np.int64)
image = cv2.polylines(np.array(image), [box], True, (255, 0, 0), 2)
if draw_txt:
img = np.array(resize_img(image, input_size=600))
txt_img = text_visual(
txts,
scores,
font_file,
img_h=img.shape[0],
img_w=600,
threshold=drop_score)
img = np.concatenate([np.array(img), np.array(txt_img)], axis=1)
return img
return image
def text_visual(texts, scores, font_file, img_h=400, img_w=600, threshold=0.):
"""
create new blank img and draw txt on it
args:
texts(list): the texts to draw
scores(list|None): corresponding score of each text
font_file(str): path to the font file used for drawing
img_h(int): the height of the blank image
img_w(int): the width of the blank image
threshold(float): only texts with scores greater than threshold are drawn
return(array):
the visualized text image
"""
if scores is not None:
assert len(texts) == len(
scores), "The number of txts and corresponding scores must match"
def create_blank_img():
blank_img = np.ones(shape=[img_h, img_w], dtype=np.int8) * 255
blank_img[:, img_w - 1:] = 0
blank_img = Image.fromarray(blank_img).convert("RGB")
draw_txt = ImageDraw.Draw(blank_img)
return blank_img, draw_txt
blank_img, draw_txt = create_blank_img()
font_size = 20
txt_color = (0, 0, 0)
font = ImageFont.truetype(font_file, font_size, encoding="utf-8")
gap = font_size + 5
txt_img_list = []
count, index = 1, 0
for idx, txt in enumerate(texts):
index += 1
if scores[idx] < threshold or math.isnan(scores[idx]):
index -= 1
continue
first_line = True
while str_count(txt) >= img_w // font_size - 4:
tmp = txt
txt = tmp[:img_w // font_size - 4]
if first_line:
new_txt = str(index) + ': ' + txt
first_line = False
else:
new_txt = ' ' + txt
draw_txt.text((0, gap * count), new_txt, txt_color, font=font)
txt = tmp[img_w // font_size - 4:]
if count >= img_h // gap - 1:
txt_img_list.append(np.array(blank_img))
blank_img, draw_txt = create_blank_img()
count = 0
count += 1
if first_line:
new_txt = str(index) + ': ' + txt + ' ' + '%.3f' % (scores[idx])
else:
new_txt = " " + txt + " " + '%.3f' % (scores[idx])
draw_txt.text((0, gap * count), new_txt, txt_color, font=font)
# whether add new blank img or not
if count >= img_h // gap - 1 and idx + 1 < len(texts):
txt_img_list.append(np.array(blank_img))
blank_img, draw_txt = create_blank_img()
count = 0
count += 1
txt_img_list.append(np.array(blank_img))
if len(txt_img_list) == 1:
blank_img = np.array(txt_img_list[0])
else:
blank_img = np.concatenate(txt_img_list, axis=1)
return np.array(blank_img)
def str_count(s):
"""
Count the display width of a string in Chinese-character units:
a single English letter or digit counts as half a Chinese character.
args:
s(string): the input string
return(int):
the approximate display width of the string
"""
import string
count_zh = count_pu = 0
s_len = len(s)
en_dg_count = 0
for c in s:
if c in string.ascii_letters or c.isdigit() or c.isspace():
en_dg_count += 1
elif c.isalpha():
count_zh += 1
else:
count_pu += 1
return s_len - math.ceil(en_dg_count / 2)
def resize_img(img, input_size=600):
img = np.array(img)
im_shape = img.shape
im_size_min = np.min(im_shape[0:2])
im_size_max = np.max(im_shape[0:2])
im_scale = float(input_size) / float(im_size_max)
im = cv2.resize(img, None, None, fx=im_scale, fy=im_scale)
return im
def get_image_ext(image):
if image.shape[2] == 4:
return ".png"
return ".jpg"
def sorted_boxes(dt_boxes):
"""
Sort text boxes in order from top to bottom, left to right
args:
dt_boxes(array): detected text boxes with shape [N, 4, 2]
return:
sorted boxes(list): N boxes sorted top to bottom, left to right, each with shape [4, 2]
"""
num_boxes = dt_boxes.shape[0]
sorted_boxes = sorted(dt_boxes, key=lambda x: (x[0][1], x[0][0]))
_boxes = list(sorted_boxes)
for i in range(num_boxes - 1):
if abs(_boxes[i + 1][0][1] - _boxes[i][0][1]) < 10 and \
(_boxes[i + 1][0][0] < _boxes[i][0][0]):
tmp = _boxes[i]
_boxes[i] = _boxes[i + 1]
_boxes[i + 1] = tmp
return _boxes
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
data = np.frombuffer(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
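A short sketch exercising the helpers above (relying on the numpy import at the top of this file); the box coordinates are made-up values:

```python
# Two boxes on the same text line, listed right-to-left on purpose;
# sorted_boxes reorders them top-to-bottom, left-to-right.
boxes = np.array([
    [[60, 12], [120, 12], [120, 30], [60, 30]],
    [[5, 10], [55, 10], [55, 28], [5, 28]],
])
ordered = sorted_boxes(boxes)
print([b[0].tolist() for b in ordered])  # [[5, 10], [60, 12]]
```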
......@@ -27,18 +27,9 @@
- ### 1、环境依赖
- paddlepaddle >= 1.8.0
- paddlepaddle >= 2.0.2
- paddlehub >= 1.8.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst)
- shapely
- pyclipper
- ```shell
$ pip install shapely pyclipper
```
- **该Module依赖于第三方库shapely和pyclipper,使用该Module之前,请先安装shapely和pyclipper。**
- paddlehub >= 2.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 2、安装
......@@ -58,7 +49,7 @@
```
- 通过命令行方式实现文字识别模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
- ### 2、代码示例
- ### 2、预测代码示例
- ```python
import paddlehub as hub
......@@ -160,13 +151,15 @@
print(r.json()["results"])
```
## 五、更新历史
* 1.0.0
初始发布
* 1.1.0
优化模型
- ```shell
$ hub install japan_ocr_db_crnn_mobile==1.0.0
$ hub install japan_ocr_db_crnn_mobile==1.1.0
```
paddleocr>=2.3.0.2
paddle2onnx>=0.9.0
shapely
pyclipper
import paddlehub as hub
from paddleocr.ppocr.utils.logging import get_logger
from paddleocr.tools.infer.utility import base64_to_cv2
from paddlehub.module.module import moduleinfo, runnable, serving
@moduleinfo(
name="kannada_ocr_db_crnn_mobile",
version="1.0.0",
summary="ocr service",
author="PaddlePaddle",
type="cv/text_recognition")
class KannadaOCRDBCRNNMobile:
def __init__(self,
det=True,
rec=True,
use_angle_cls=False,
enable_mkldnn=False,
use_gpu=False,
box_thresh=0.6,
angle_classification_thresh=0.9):
"""
initialize with the necessary elements
Args:
det(bool): Whether to use text detector.
rec(bool): Whether to use text recognizer.
use_angle_cls(bool): Whether to use text orientation classifier.
enable_mkldnn(bool): Whether to enable mkldnn.
use_gpu (bool): Whether to use gpu.
box_thresh(float): the threshold of the detected text box's confidence
angle_classification_thresh(float): the threshold of the angle classification confidence
"""
self.logger = get_logger()
self.model = hub.Module(
name="multi_languages_ocr_db_crnn",
lang="ka",
det=det,
rec=rec,
use_angle_cls=use_angle_cls,
enable_mkldnn=enable_mkldnn,
use_gpu=use_gpu,
box_thresh=box_thresh,
angle_classification_thresh=angle_classification_thresh)
self.model.name = self.name
def recognize_text(self, images=[], paths=[], output_dir='ocr_result', visualization=False):
"""
Get the text in the predicted images.
Args:
images (list[numpy.ndarray]): image data, each with shape [H, W, C] in BGR order; used when paths is not provided.
paths (list[str]): paths of the images; used when images is not provided.
output_dir (str): The directory to store output images.
visualization (bool): Whether to save image or not.
Returns:
res (list): The result of text detection box and save path of images.
"""
all_results = self.model.recognize_text(
images=images, paths=paths, output_dir=output_dir, visualization=visualization)
return all_results
@serving
def serving_method(self, images, **kwargs):
"""
Run as a service.
"""
images_decode = [base64_to_cv2(image) for image in images]
results = self.recognize_text(images_decode, **kwargs)
return results
@runnable
def run_cmd(self, argvs):
"""
Run as a command
"""
results = self.model.run_cmd(argvs)
return results
def export_onnx_model(self, dirname: str, input_shape_dict=None, opset_version=10):
'''
Export the model to ONNX format.
Args:
dirname(str): The directory to save the onnx model.
input_shape_dict: dictionary ``{ input_name: input_value }, eg. {'x': [-1, 3, -1, -1]}``
opset_version(int): operator set
'''
self.model.export_onnx_model(dirname=dirname, input_shape_dict=input_shape_dict, opset_version=opset_version)
paddleocr>=2.3.0.2
paddle2onnx>=0.9.0
shapely
pyclipper