diff --git a/deploy/paddle2onnx/readme_ch.md b/deploy/paddle2onnx/readme_ch.md
index 8e821892142d65caddd6fa3bd8ff24a372fe9a5d..5004cab8338ee7e809033dcf5b8f2184b0da065a 100644
--- a/deploy/paddle2onnx/readme_ch.md
+++ b/deploy/paddle2onnx/readme_ch.md
@@ -39,14 +39,14 @@ python3.7 -m pip install onnxruntime==1.9.0
有两种方式获取Paddle静态图模型:在 [model_list](../../doc/doc_ch/models_list.md) 中下载PaddleOCR提供的预测模型;
参考[模型导出说明](../../doc/doc_ch/inference.md#训练模型转inference模型)把训练好的权重转为 inference_model。
-以 ppocr 中文检测、识别、分类模型为例:
+以 PP-OCRv3 中文检测、识别、分类模型为例:
```
-wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar
-cd ./inference && tar xf ch_PP-OCRv2_det_infer.tar && cd ..
+wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar
+cd ./inference && tar xf ch_PP-OCRv3_det_infer.tar && cd ..
-wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar
-cd ./inference && tar xf ch_PP-OCRv2_rec_infer.tar && cd ..
+wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar
+cd ./inference && tar xf ch_PP-OCRv3_rec_infer.tar && cd ..
wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar
cd ./inference && tar xf ch_ppocr_mobile_v2.0_cls_infer.tar && cd ..
@@ -57,7 +57,7 @@ cd ./inference && tar xf ch_ppocr_mobile_v2.0_cls_infer.tar && cd ..
使用 Paddle2ONNX 将Paddle静态图模型转换为ONNX模型格式:
```
-paddle2onnx --model_dir ./inference/ch_PP-OCRv2_det_infer \
+paddle2onnx --model_dir ./inference/ch_PP-OCRv3_det_infer \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
--save_file ./inference/det_onnx/model.onnx \
@@ -65,7 +65,7 @@ paddle2onnx --model_dir ./inference/ch_PP-OCRv2_det_infer \
--input_shape_dict="{'x':[-1,3,-1,-1]}" \
--enable_onnx_checker True
-paddle2onnx --model_dir ./inference/ch_PP-OCRv2_rec_infer \
+paddle2onnx --model_dir ./inference/ch_PP-OCRv3_rec_infer \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
--save_file ./inference/rec_onnx/model.onnx \
@@ -105,8 +105,8 @@ python3.7 tools/infer/predict_system.py --use_gpu=False --use_onnx=True \
```
python3.7 tools/infer/predict_system.py --use_gpu=False \
--cls_model_dir=./inference/ch_ppocr_mobile_v2.0_cls_infer \
---rec_model_dir=./inference/ch_PP-OCRv2_rec_infer \
---det_model_dir=./inference/ch_PP-OCRv2_det_infer \
+--rec_model_dir=./inference/ch_PP-OCRv3_rec_infer \
+--det_model_dir=./inference/ch_PP-OCRv3_det_infer \
--image_dir=./deploy/lite/imgs/lite_demo.png
```
diff --git a/deploy/pdserving/README.md b/deploy/pdserving/README.md
index d3ba7d4cfbabb111831a6ecbce28c1ac352066fe..55e03c4c2654f336ed942ae03e61e88b61940006 100644
--- a/deploy/pdserving/README.md
+++ b/deploy/pdserving/README.md
@@ -15,6 +15,14 @@ Some Key Features of Paddle Serving:
- Industrial serving features supported, such as models management, online loading, online A/B testing etc.
- Highly concurrent and efficient communication between clients and servers supported.
+PaddleServing supports deployment in multiple languages. This example provides two deployment methods: Python pipeline and C++. They compare as follows:
+
+| Language | Speed | Secondary development | Compilation required |
+|-----|-----|---------|------------|
+| C++ | Fast | Slightly difficult | Not needed for single-model prediction; required for multi-model chaining |
+| Python | Moderate | Easy | Not needed for single-model or multi-model prediction |
+
+
The introduction and tutorial of Paddle Serving service deployment framework reference [document](https://github.com/PaddlePaddle/Serving/blob/develop/README.md).
@@ -25,6 +33,7 @@ The introduction and tutorial of Paddle Serving service deployment framework ref
- [Environmental preparation](#environmental-preparation)
- [Model conversion](#model-conversion)
- [Paddle Serving pipeline deployment](#paddle-serving-pipeline-deployment)
+ - [Paddle Serving C++ deployment](#c-serving)
- [WINDOWS Users](#windows-users)
- [FAQ](#faq)
@@ -41,23 +50,23 @@ PaddleOCR operating environment and Paddle Serving operating environment are nee
```bash
# Install serving which used to start the service
-wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.7.0.post102-py3-none-any.whl
-pip3 install paddle_serving_server_gpu-0.7.0.post102-py3-none-any.whl
+wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.8.3.post102-py3-none-any.whl
+pip3 install paddle_serving_server_gpu-0.8.3.post102-py3-none-any.whl
# Install paddle-serving-server for cuda10.1
-# wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.7.0.post101-py3-none-any.whl
-# pip3 install paddle_serving_server_gpu-0.7.0.post101-py3-none-any.whl
+# wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.8.3.post101-py3-none-any.whl
+# pip3 install paddle_serving_server_gpu-0.8.3.post101-py3-none-any.whl
# Install serving which used to start the service
-wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.7.0-cp37-none-any.whl
-pip3 install paddle_serving_client-0.7.0-cp37-none-any.whl
+wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.8.3-cp37-none-any.whl
+pip3 install paddle_serving_client-0.8.3-cp37-none-any.whl
# Install serving-app
-wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_app-0.7.0-py3-none-any.whl
-pip3 install paddle_serving_app-0.7.0-py3-none-any.whl
+wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_app-0.8.3-py3-none-any.whl
+pip3 install paddle_serving_app-0.8.3-py3-none-any.whl
```
- **note:** If you want to install the latest version of PaddleServing, refer to [link](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Latest_Packages_CN.md).
+ **Note:** To install the latest version of PaddleServing, refer to this [link](https://github.com/PaddlePaddle/Serving/blob/v0.8.3/doc/Latest_Packages_CN.md).
@@ -67,37 +76,37 @@ When using PaddleServing for service deployment, you need to convert the saved i
Firstly, download the [inference model](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.3/README_ch.md#pp-ocr%E7%B3%BB%E5%88%97%E6%A8%A1%E5%9E%8B%E5%88%97%E8%A1%A8%E6%9B%B4%E6%96%B0%E4%B8%AD) of PPOCR
```
# Download and unzip the OCR text detection model
-wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar -O ch_PP-OCRv2_det_infer.tar && tar -xf ch_PP-OCRv2_det_infer.tar
+wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar -O ch_PP-OCRv3_det_infer.tar && tar -xf ch_PP-OCRv3_det_infer.tar
# Download and unzip the OCR text recognition model
-wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar -O ch_PP-OCRv2_rec_infer.tar && tar -xf ch_PP-OCRv2_rec_infer.tar
+wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar -O ch_PP-OCRv3_rec_infer.tar && tar -xf ch_PP-OCRv3_rec_infer.tar
```
Then, you can use installed paddle_serving_client tool to convert inference model to mobile model.
```
# Detection model conversion
-python3 -m paddle_serving_client.convert --dirname ./ch_PP-OCRv2_det_infer/ \
+python3 -m paddle_serving_client.convert --dirname ./ch_PP-OCRv3_det_infer/ \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
- --serving_server ./ppocr_det_mobile_2.0_serving/ \
- --serving_client ./ppocr_det_mobile_2.0_client/
+ --serving_server ./ppocr_det_v3_serving/ \
+ --serving_client ./ppocr_det_v3_client/
# Recognition model conversion
-python3 -m paddle_serving_client.convert --dirname ./ch_PP-OCRv2_rec_infer/ \
+python3 -m paddle_serving_client.convert --dirname ./ch_PP-OCRv3_rec_infer/ \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
- --serving_server ./ppocr_rec_mobile_2.0_serving/ \
- --serving_client ./ppocr_rec_mobile_2.0_client/
+ --serving_server ./ppocr_rec_v3_serving/ \
+ --serving_client ./ppocr_rec_v3_client/
```
-After the detection model is converted, there will be additional folders of `ppocr_det_mobile_2.0_serving` and `ppocr_det_mobile_2.0_client` in the current folder, with the following format:
+After the detection model is converted, two new folders, `ppocr_det_v3_serving` and `ppocr_det_v3_client`, appear in the current directory with the following layout:
```
-|- ppocr_det_mobile_2.0_serving/
+|- ppocr_det_v3_serving/
|- __model__
|- __params__
|- serving_server_conf.prototxt
|- serving_server_conf.stream.prototxt
-|- ppocr_det_mobile_2.0_client
+|- ppocr_det_v3_client
|- serving_client_conf.prototxt
|- serving_client_conf.stream.prototxt
@@ -193,16 +202,13 @@ The recognition model is the same.
2021-05-13 03:42:36,979 chl2(In: ['rec'], Out: ['@DAGExecutor']) size[0/0]
```
+
## C++ Serving
Service deployment based on python obviously has the advantage of convenient secondary development. However, the real application often needs to pursue better performance. PaddleServing also provides a more performant C++ deployment version.
The C++ service deployment is the same as python in the environment setup and data preparation stages, the difference is when the service is started and the client sends requests.
-| Language | Speed | Secondary development | Do you need to compile |
-|-----|-----|---------|------------|
-| C++ | fast | Slightly difficult | Single model prediction does not need to be compiled, multi-model concatenation needs to be compiled |
-| python | general | easy | single-model/multi-model no compilation required |
1. Compile Serving
@@ -211,7 +217,7 @@ The C++ service deployment is the same as python in the environment setup and da
2. Run the following command to start the service.
```
# Start the service and save the running log in log.txt
- python3 -m paddle_serving_server.serve --model ppocrv2_det_serving ppocrv2_rec_serving --op GeneralDetectionOp GeneralInferOp --port 9293 &>log.txt &
+ python3 -m paddle_serving_server.serve --model ppocr_det_v3_serving ppocr_rec_v3_serving --op GeneralDetectionOp GeneralInferOp --port 9293 &>log.txt &
```
After the service is successfully started, a log similar to the following will be printed in log.txt
![](./imgs/start_server.png)
@@ -219,7 +225,7 @@ The C++ service deployment is the same as python in the environment setup and da
3. Send service request
Due to the need for pre and post-processing in the C++Server part, in order to speed up the input to the C++Server is only the base64 encoded string of the picture, it needs to be manually modified
- Change the feed_type field and shape field in ppocrv2_det_client/serving_client_conf.prototxt to the following:
+ Change the feed_type field and shape field in ppocr_det_v3_client/serving_client_conf.prototxt to the following:
```
feed_var {
@@ -234,7 +240,7 @@ The C++ service deployment is the same as python in the environment setup and da
start the client:
```
- python3 ocr_cpp_client.py ppocrv2_det_client ppocrv2_rec_client
+ python3 ocr_cpp_client.py ppocr_det_v3_client ppocr_rec_v3_client
```
After successfully running, the predicted result of the model will be printed in the cmd window. An example of the result is:
![](./imgs/results.png)
diff --git a/deploy/pdserving/README_CN.md b/deploy/pdserving/README_CN.md
index b400c883d1a7a1334d9ff2c7a5335dea96699842..0891611db5f39d322473354f7d988b10afa78cbd 100644
--- a/deploy/pdserving/README_CN.md
+++ b/deploy/pdserving/README_CN.md
@@ -54,22 +54,22 @@ AIStudio演示案例可参考 [基于PaddleServing的OCR服务化部署实战](h
```bash
# 安装serving,用于启动服务
-wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.7.0.post102-py3-none-any.whl
-pip3 install paddle_serving_server_gpu-0.7.0.post102-py3-none-any.whl
+wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.8.3.post102-py3-none-any.whl
+pip3 install paddle_serving_server_gpu-0.8.3.post102-py3-none-any.whl
# 如果是cuda10.1环境,可以使用下面的命令安装paddle-serving-server
-# wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.7.0.post101-py3-none-any.whl
-# pip3 install paddle_serving_server_gpu-0.7.0.post101-py3-none-any.whl
+# wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.8.3.post101-py3-none-any.whl
+# pip3 install paddle_serving_server_gpu-0.8.3.post101-py3-none-any.whl
# 安装client,用于向服务发送请求
-wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.7.0-cp37-none-any.whl
-pip3 install paddle_serving_client-0.7.0-cp37-none-any.whl
+wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.8.3-cp37-none-any.whl
+pip3 install paddle_serving_client-0.8.3-cp37-none-any.whl
# 安装serving-app
-wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_app-0.7.0-py3-none-any.whl
-pip3 install paddle_serving_app-0.7.0-py3-none-any.whl
+wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_app-0.8.3-py3-none-any.whl
+pip3 install paddle_serving_app-0.8.3-py3-none-any.whl
```
-**Note:** 如果要安装最新版本的PaddleServing参考[链接](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Latest_Packages_CN.md)。
+**Note:** 如果要安装最新版本的PaddleServing参考[链接](https://github.com/PaddlePaddle/Serving/blob/v0.8.3/doc/Latest_Packages_CN.md)。
## 模型转换
@@ -80,38 +80,38 @@ pip3 install paddle_serving_app-0.7.0-py3-none-any.whl
```bash
# 下载并解压 OCR 文本检测模型
-wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar -O ch_PP-OCRv2_det_infer.tar && tar -xf ch_PP-OCRv2_det_infer.tar
+wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar -O ch_PP-OCRv3_det_infer.tar && tar -xf ch_PP-OCRv3_det_infer.tar
# 下载并解压 OCR 文本识别模型
-wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar -O ch_PP-OCRv2_rec_infer.tar && tar -xf ch_PP-OCRv2_rec_infer.tar
+wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar -O ch_PP-OCRv3_rec_infer.tar && tar -xf ch_PP-OCRv3_rec_infer.tar
```
接下来,用安装的paddle_serving_client把下载的inference模型转换成易于server部署的模型格式。
```bash
# 转换检测模型
-python3 -m paddle_serving_client.convert --dirname ./ch_PP-OCRv2_det_infer/ \
+python3 -m paddle_serving_client.convert --dirname ./ch_PP-OCRv3_det_infer/ \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
- --serving_server ./ppocr_det_mobile_2.0_serving/ \
- --serving_client ./ppocr_det_mobile_2.0_client/
+ --serving_server ./ppocr_det_v3_serving/ \
+ --serving_client ./ppocr_det_v3_client/
# 转换识别模型
-python3 -m paddle_serving_client.convert --dirname ./ch_PP-OCRv2_rec_infer/ \
+python3 -m paddle_serving_client.convert --dirname ./ch_PP-OCRv3_rec_infer/ \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
- --serving_server ./ppocr_rec_mobile_2.0_serving/ \
- --serving_client ./ppocr_rec_mobile_2.0_client/
+ --serving_server ./ppocr_rec_v3_serving/ \
+ --serving_client ./ppocr_rec_v3_client/
```
-检测模型转换完成后,会在当前文件夹多出`ppocr_det_mobile_2.0_serving` 和`ppocr_det_mobile_2.0_client`的文件夹,具备如下格式:
+检测模型转换完成后,会在当前文件夹多出`ppocr_det_v3_serving` 和`ppocr_det_v3_client`的文件夹,具备如下格式:
```
-|- ppocr_det_mobile_2.0_serving/
+|- ppocr_det_v3_serving/
|- __model__
|- __params__
|- serving_server_conf.prototxt
|- serving_server_conf.stream.prototxt
-|- ppocr_det_mobile_2.0_client
+|- ppocr_det_v3_client
|- serving_client_conf.prototxt
|- serving_client_conf.stream.prototxt
@@ -230,7 +230,7 @@ cp -rf general_detection_op.cpp Serving/core/general-server/op
```
# 启动服务,运行日志保存在log.txt
- python3 -m paddle_serving_server.serve --model ppocrv2_det_serving ppocrv2_rec_serving --op GeneralDetectionOp GeneralInferOp --port 9293 &>log.txt &
+ python3 -m paddle_serving_server.serve --model ppocr_det_v3_serving ppocr_rec_v3_serving --op GeneralDetectionOp GeneralInferOp --port 9293 &>log.txt &
```
成功启动服务后,log.txt中会打印类似如下日志
![](./imgs/start_server.png)
@@ -238,7 +238,7 @@ cp -rf general_detection_op.cpp Serving/core/general-server/op
3. 发送服务请求:
由于需要在C++Server部分进行前后处理,为了加速传入C++Server的仅仅是图片的base64编码的字符串,故需要手动修改
- ppocrv2_det_client/serving_client_conf.prototxt 中 feed_type 字段 和 shape 字段,修改成如下内容:
+ ppocr_det_v3_client/serving_client_conf.prototxt 中 feed_type 字段 和 shape 字段,修改成如下内容:
```
feed_var {
name: "x"
@@ -250,7 +250,7 @@ cp -rf general_detection_op.cpp Serving/core/general-server/op
```
启动客户端
```
- python3 ocr_cpp_client.py ppocrv2_det_client ppocrv2_rec_client
+ python3 ocr_cpp_client.py ppocr_det_v3_client ppocr_rec_v3_client
```
成功运行后,模型预测的结果会打印在cmd窗口中,结果示例为:
diff --git a/deploy/pdserving/config.yml b/deploy/pdserving/config.yml
index 2aae922dfa12f46d1c0ebd352e8d3a7077065cf8..6e30a626d0cdb0b4e5fe6feb737ea46c2bc59f90 100644
--- a/deploy/pdserving/config.yml
+++ b/deploy/pdserving/config.yml
@@ -34,7 +34,7 @@ op:
client_type: local_predictor
#det模型路径
- model_config: ./ppocr_det_mobile_2.0_serving
+ model_config: ./ppocr_det_v3_serving
#Fetch结果列表,以client_config中fetch_var的alias_name为准
fetch_list: ["save_infer_model/scale_0.tmp_1"]
@@ -60,10 +60,10 @@ op:
client_type: local_predictor
#rec模型路径
- model_config: ./ppocr_rec_mobile_2.0_serving
+ model_config: ./ppocr_rec_v3_serving
#Fetch结果列表,以client_config中fetch_var的alias_name为准
- fetch_list: ["save_infer_model/scale_0.tmp_1"]
+ fetch_list: ["softmax_5.tmp_0"]
#计算硬件ID,当devices为""或不写时为CPU预测;当devices为"0", "0,1,2"时为GPU预测,表示使用的GPU卡
devices: "0"
diff --git a/deploy/pdserving/ocr_reader.py b/deploy/pdserving/ocr_reader.py
index 67099786ea73b66412dac8f965e20201f0ac1fdc..6a2d57b679d69ab11ac6f0fd74c47a342b391545 100644
--- a/deploy/pdserving/ocr_reader.py
+++ b/deploy/pdserving/ocr_reader.py
@@ -392,38 +392,8 @@ class OCRReader(object):
return norm_img_batch[0]
- def postprocess_old(self, outputs, with_score=False):
- rec_res = []
- rec_idx_lod = outputs["ctc_greedy_decoder_0.tmp_0.lod"]
- rec_idx_batch = outputs["ctc_greedy_decoder_0.tmp_0"]
- if with_score:
- predict_lod = outputs["softmax_0.tmp_0.lod"]
- for rno in range(len(rec_idx_lod) - 1):
- beg = rec_idx_lod[rno]
- end = rec_idx_lod[rno + 1]
- if isinstance(rec_idx_batch, list):
- rec_idx_tmp = [x[0] for x in rec_idx_batch[beg:end]]
- else: #nd array
- rec_idx_tmp = rec_idx_batch[beg:end, 0]
- preds_text = self.char_ops.decode(rec_idx_tmp)
- if with_score:
- beg = predict_lod[rno]
- end = predict_lod[rno + 1]
- if isinstance(outputs["softmax_0.tmp_0"], list):
- outputs["softmax_0.tmp_0"] = np.array(outputs[
- "softmax_0.tmp_0"]).astype(np.float32)
- probs = outputs["softmax_0.tmp_0"][beg:end, :]
- ind = np.argmax(probs, axis=1)
- blank = probs.shape[1]
- valid_ind = np.where(ind != (blank - 1))[0]
- score = np.mean(probs[valid_ind, ind[valid_ind]])
- rec_res.append([preds_text, score])
- else:
- rec_res.append([preds_text])
- return rec_res
-
def postprocess(self, outputs, with_score=False):
- preds = outputs["save_infer_model/scale_0.tmp_1"]
+ preds = outputs["softmax_5.tmp_0"]
try:
preds = preds.numpy()
except:
diff --git a/deploy/pdserving/win/ocr_reader.py b/deploy/pdserving/win/ocr_reader.py
index 3f219784fca79715d09ae9353a32d95e2e427cb6..18b9385aa0c7adf7c3e0cd38efd1160655881f0e 100644
--- a/deploy/pdserving/win/ocr_reader.py
+++ b/deploy/pdserving/win/ocr_reader.py
@@ -392,38 +392,8 @@ class OCRReader(object):
return norm_img_batch[0]
- def postprocess_old(self, outputs, with_score=False):
- rec_res = []
- rec_idx_lod = outputs["ctc_greedy_decoder_0.tmp_0.lod"]
- rec_idx_batch = outputs["ctc_greedy_decoder_0.tmp_0"]
- if with_score:
- predict_lod = outputs["softmax_0.tmp_0.lod"]
- for rno in range(len(rec_idx_lod) - 1):
- beg = rec_idx_lod[rno]
- end = rec_idx_lod[rno + 1]
- if isinstance(rec_idx_batch, list):
- rec_idx_tmp = [x[0] for x in rec_idx_batch[beg:end]]
- else: #nd array
- rec_idx_tmp = rec_idx_batch[beg:end, 0]
- preds_text = self.char_ops.decode(rec_idx_tmp)
- if with_score:
- beg = predict_lod[rno]
- end = predict_lod[rno + 1]
- if isinstance(outputs["softmax_0.tmp_0"], list):
- outputs["softmax_0.tmp_0"] = np.array(outputs[
- "softmax_0.tmp_0"]).astype(np.float32)
- probs = outputs["softmax_0.tmp_0"][beg:end, :]
- ind = np.argmax(probs, axis=1)
- blank = probs.shape[1]
- valid_ind = np.where(ind != (blank - 1))[0]
- score = np.mean(probs[valid_ind, ind[valid_ind]])
- rec_res.append([preds_text, score])
- else:
- rec_res.append([preds_text])
- return rec_res
-
def postprocess(self, outputs, with_score=False):
- preds = outputs["save_infer_model/scale_0.tmp_1"]
+ preds = outputs["softmax_5.tmp_0"]
try:
preds = preds.numpy()
except:
diff --git a/doc/doc_ch/algorithm_rec_crnn.md b/doc/doc_ch/algorithm_rec_crnn.md
index 27bd59b7c1dc79d41f737dca6ca2e0961e6dedaf..70aadd3d684e40ebd1d6e627a26b95b35b544d75 100644
--- a/doc/doc_ch/algorithm_rec_crnn.md
+++ b/doc/doc_ch/algorithm_rec_crnn.md
@@ -25,10 +25,10 @@
参考[DTRB](https://arxiv.org/abs/1904.01906) 文字识别训练和评估流程,使用MJSynth和SynthText两个文字识别数据集训练,在IIIT, SVT, IC03, IC13, IC15, SVTP, CUTE数据集上进行评估,算法效果如下:
-|模型|骨干网络|Avg Accuracy|模型存储命名|下载链接|
+|模型|骨干网络|Avg Accuracy|配置文件|下载链接|
|---|---|---|---|---|
-|CRNN|Resnet34_vd|81.04%|rec_r34_vd_none_bilstm_ctc|[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_none_bilstm_ctc_v2.0_train.tar)|
-|CRNN|MobileNetV3|77.95%|rec_mv3_none_bilstm_ctc|[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_bilstm_ctc_v2.0_train.tar)|
+|CRNN|Resnet34_vd|81.04%|[configs/rec/rec_r34_vd_none_bilstm_ctc.yml](../../configs/rec/rec_r34_vd_none_bilstm_ctc.yml)|[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_none_bilstm_ctc_v2.0_train.tar)|
+|CRNN|MobileNetV3|77.95%|[configs/rec/rec_mv3_none_bilstm_ctc.yml](../../configs/rec/rec_mv3_none_bilstm_ctc.yml)|[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_bilstm_ctc_v2.0_train.tar)|
@@ -41,6 +41,32 @@
请参考[文本识别训练教程](./recognition.md)。PaddleOCR对代码进行了模块化,训练不同的识别模型只需要**更换配置文件**即可。
+- 训练
+
+在完成数据准备后,便可以启动训练,训练命令如下:
+
+```
+#单卡训练(训练周期长,不建议)
+python3 tools/train.py -c configs/rec/rec_r34_vd_none_bilstm_ctc.yml
+
+#多卡训练,通过--gpus参数指定卡号
+python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/rec/rec_r34_vd_none_bilstm_ctc.yml
+
+```
+
+- 评估
+
+```
+# GPU 评估, Global.pretrained_model 为待测权重
+python3 -m paddle.distributed.launch --gpus '0' tools/eval.py -c configs/rec/rec_r34_vd_none_bilstm_ctc.yml -o Global.pretrained_model={path/to/weights}/best_accuracy
+```
+
+- 预测
+
+```
+# 预测使用的配置文件必须与训练一致
+python3 tools/infer_rec.py -c configs/rec/rec_r34_vd_none_bilstm_ctc.yml -o Global.pretrained_model={path/to/weights}/best_accuracy Global.infer_img=doc/imgs_words/en/word_1.png
+```
## 4. 推理部署
diff --git a/doc/doc_ch/algorithm_rec_seed.md b/doc/doc_ch/algorithm_rec_seed.md
index 0db32bfceaa8a142eeba587d0bac555f7ff1087b..94c877ffac3f9716786cdf6618d335511d38325a 100644
--- a/doc/doc_ch/algorithm_rec_seed.md
+++ b/doc/doc_ch/algorithm_rec_seed.md
@@ -17,7 +17,7 @@
## 1. 算法简介
论文信息:
-> [STAR-Net: a spatial attention residue network for scene text recognition.](https://arxiv.org/pdf/2005.10977.pdf)
+> [SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition](https://arxiv.org/pdf/2005.10977.pdf)
> Qiao, Zhi and Zhou, Yu and Yang, Dongbao and Zhou, Yucan and Wang, Weiping
@@ -25,9 +25,9 @@
参考[DTRB](https://arxiv.org/abs/1904.01906) 文字识别训练和评估流程,使用MJSynth和SynthText两个文字识别数据集训练,在IIIT, SVT, IC03, IC13, IC15, SVTP, CUTE数据集上进行评估,算法效果如下:
-|模型|骨干网络|Avg Accuracy|模型存储命名|下载链接|
+|模型|骨干网络|Avg Accuracy|配置文件|下载链接|
|---|---|---|---|---|
-|SEED|Aster_Resnet| 85.2% | rec_resnet_stn_bilstm_att | [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.1/rec/rec_resnet_stn_bilstm_att.tar) |
+|SEED|Aster_Resnet| 85.2% | [configs/rec/rec_resnet_stn_bilstm_att.yml](../../configs/rec/rec_resnet_stn_bilstm_att.yml) | [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.1/rec/rec_resnet_stn_bilstm_att.tar) |
## 2. 环境配置
@@ -39,6 +39,38 @@
请参考[文本识别训练教程](./recognition.md)。PaddleOCR对代码进行了模块化,训练不同的识别模型只需要**更换配置文件**即可。
+- 训练
+
+SEED模型需要额外加载FastText训练好的[语言模型](https://dl.fbaipublicfiles.com/fasttext/vectors-crawl/cc.en.300.bin.gz),并且安装 fasttext 依赖:
+
+```
+python3 -m pip install fasttext==0.9.1
+```
+
+然后,在完成数据准备后,便可以启动训练,训练命令如下:
+
+```
+#单卡训练(训练周期长,不建议)
+python3 tools/train.py -c configs/rec/rec_resnet_stn_bilstm_att.yml
+
+#多卡训练,通过--gpus参数指定卡号
+python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/rec/rec_resnet_stn_bilstm_att.yml
+
+```
+
+- 评估
+
+```
+# GPU 评估, Global.pretrained_model 为待测权重
+python3 -m paddle.distributed.launch --gpus '0' tools/eval.py -c configs/rec/rec_resnet_stn_bilstm_att.yml -o Global.pretrained_model={path/to/weights}/best_accuracy
+```
+
+- 预测
+
+```
+# 预测使用的配置文件必须与训练一致
+python3 tools/infer_rec.py -c configs/rec/rec_resnet_stn_bilstm_att.yml -o Global.pretrained_model={path/to/weights}/best_accuracy Global.infer_img=doc/imgs_words/en/word_1.png
+```
## 4. 推理部署
@@ -48,6 +80,7 @@
comming soon
+
### 4.2 C++推理
diff --git a/doc/doc_ch/algorithm_rec_starnet.md b/doc/doc_ch/algorithm_rec_starnet.md
index 25ad03b65b446d852c438da233b6d3afef73cfcf..c9d7706988763a8ac257129ab54915afe11250ac 100644
--- a/doc/doc_ch/algorithm_rec_starnet.md
+++ b/doc/doc_ch/algorithm_rec_starnet.md
@@ -25,10 +25,10 @@
参考[DTRB](https://arxiv.org/abs/1904.01906) 文字识别训练和评估流程,使用MJSynth和SynthText两个文字识别数据集训练,在IIIT, SVT, IC03, IC13, IC15, SVTP, CUTE数据集上进行评估,算法效果如下:
-|模型|骨干网络|Avg Accuracy|模型存储命名|下载链接|
+|模型|骨干网络|Avg Accuracy|配置文件|下载链接|
|---|---|---|---|---|
-|StarNet|Resnet34_vd|84.44%|rec_r34_vd_tps_bilstm_ctc|[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_tps_bilstm_ctc_v2.0_train.tar)|
-|StarNet|MobileNetV3|81.42%|rec_mv3_tps_bilstm_ctc|[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_tps_bilstm_ctc_v2.0_train.tar)|
+|StarNet|Resnet34_vd|84.44%|[configs/rec/rec_r34_vd_tps_bilstm_ctc.yml](../../configs/rec/rec_r34_vd_tps_bilstm_ctc.yml)|[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_tps_bilstm_ctc_v2.0_train.tar)|
+|StarNet|MobileNetV3|81.42%|[configs/rec/rec_mv3_tps_bilstm_ctc.yml](../../configs/rec/rec_mv3_tps_bilstm_ctc.yml)|[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_tps_bilstm_ctc_v2.0_train.tar)|
@@ -41,6 +41,32 @@
请参考[文本识别训练教程](./recognition.md)。PaddleOCR对代码进行了模块化,训练不同的识别模型只需要**更换配置文件**即可。
+- 训练
+
+在完成数据准备后,便可以启动训练,训练命令如下:
+
+```
+#单卡训练(训练周期长,不建议)
+python3 tools/train.py -c configs/rec/rec_r34_vd_tps_bilstm_ctc.yml
+
+#多卡训练,通过--gpus参数指定卡号
+python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/rec/rec_r34_vd_tps_bilstm_ctc.yml
+
+```
+
+- 评估
+
+```
+# GPU 评估, Global.pretrained_model 为待测权重
+python3 -m paddle.distributed.launch --gpus '0' tools/eval.py -c configs/rec/rec_r34_vd_tps_bilstm_ctc.yml -o Global.pretrained_model={path/to/weights}/best_accuracy
+```
+
+- 预测
+
+```
+# 预测使用的配置文件必须与训练一致
+python3 tools/infer_rec.py -c configs/rec/rec_r34_vd_tps_bilstm_ctc.yml -o Global.pretrained_model={path/to/weights}/best_accuracy Global.infer_img=doc/imgs_words/en/word_1.png
+```
## 4. 推理部署
diff --git a/doc/doc_ch/recognition.md b/doc/doc_ch/recognition.md
index 6cdd547517ebb8888374b22c1b52314da53eebab..bb8e38d79447bce772be5be6f4cf2f97f5bd2c7e 100644
--- a/doc/doc_ch/recognition.md
+++ b/doc/doc_ch/recognition.md
@@ -99,8 +99,6 @@ train_data/rec/train/word_002.jpg 用科技让复杂的世界更简单
若您本地没有数据集,可以在官网下载 [ICDAR2015](http://rrc.cvc.uab.es/?ch=4&com=downloads) 数据,用于快速验证。也可以参考[DTRB](https://github.com/clovaai/deep-text-recognition-benchmark#download-lmdb-dataset-for-traininig-and-evaluation-from-here) ,下载 benchmark 所需的lmdb格式数据集。
-如果希望复现SAR的论文指标,需要下载[SynthAdd](https://pan.baidu.com/share/init?surl=uV0LtoNmcxbO-0YA7Ch4dg), 提取码:627x。此外,真实数据集icdar2013, icdar2015, cocotext, IIIT5也作为训练数据的一部分。具体数据细节可以参考论文SAR。
-
如果你使用的是icdar2015的公开数据集,PaddleOCR 提供了一份用于训练 ICDAR2015 数据集的标签文件,通过以下方式下载:
```
@@ -165,13 +163,12 @@ PaddleOCR内置了一部分字典,可以按需使用。
-
目前的多语言模型仍处在demo阶段,会持续优化模型并补充语种,**非常欢迎您为我们提供其他语言的字典和字体**,
如您愿意可将字典文件提交至 [dict](../../ppocr/utils/dict),我们会在Repo中感谢您。
- 自定义字典
-如需自定义dic文件,请在 `configs/rec/rec_icdar15_train.yml` 中添加 `character_dict_path` 字段, 指向您的字典路径。
+如需自定义dic文件,请在 `configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml` 中添加 `character_dict_path` 字段, 指向您的字典路径。
### 1.4 添加空格类别
@@ -196,17 +193,17 @@ PaddleOCR提供了多种数据增强方式,默认配置文件中已经添加
### 2.2 通用模型训练
-PaddleOCR提供了训练脚本、评估脚本和预测脚本,本节将以 CRNN 识别模型为例:
+PaddleOCR提供了训练脚本、评估脚本和预测脚本,本节将以 PP-OCRv3 英文识别模型为例:
首先下载pretrain model,您可以下载训练好的模型在 icdar2015 数据上进行finetune
```
cd PaddleOCR/
-# 下载MobileNetV3的预训练模型
-wget -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_bilstm_ctc_v2.0_train.tar
+# 下载英文PP-OCRv3的预训练模型
+wget -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_train.tar
# 解压模型参数
cd pretrain_models
-tar -xf rec_mv3_none_bilstm_ctc_v2.0_train.tar && rm -rf rec_mv3_none_bilstm_ctc_v2.0_train.tar
+tar -xf en_PP-OCRv3_rec_train.tar && rm -rf en_PP-OCRv3_rec_train.tar
```
开始训练:
@@ -218,44 +215,23 @@ tar -xf rec_mv3_none_bilstm_ctc_v2.0_train.tar && rm -rf rec_mv3_none_bilstm_ctc
# 训练icdar15英文数据 训练日志会自动保存为 "{save_model_dir}" 下的train.log
#单卡训练(训练周期长,不建议)
-python3 tools/train.py -c configs/rec/rec_icdar15_train.yml
+python3 tools/train.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model=./pretrain_models/en_PP-OCRv3_rec_train/best_accuracy
#多卡训练,通过--gpus参数指定卡号
-python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/rec/rec_icdar15_train.yml
+python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model=./pretrain_models/en_PP-OCRv3_rec_train/best_accuracy
```
-PaddleOCR支持训练和评估交替进行, 可以在 `configs/rec/rec_icdar15_train.yml` 中修改 `eval_batch_step` 设置评估频率,默认每500个iter评估一次。评估过程中默认将最佳acc模型,保存为 `output/rec_CRNN/best_accuracy` 。
+PaddleOCR支持训练和评估交替进行,可以在 `configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml` 中修改 `eval_batch_step` 设置评估频率,默认每500个iter评估一次。评估过程中默认将最佳acc模型保存为 `output/en_PP-OCRv3_rec/best_accuracy` 。
如果验证集很大,测试将会比较耗时,建议减少评估次数,或训练完再进行评估。
-**提示:** 可通过 -c 参数选择 `configs/rec/` 路径下的多种模型配置进行训练,PaddleOCR支持的识别算法有:
-
+**提示:** 可通过 -c 参数选择 `configs/rec/` 路径下的多种模型配置进行训练,PaddleOCR支持的识别算法可以参考[前沿算法列表](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_ch/algorithm_overview.md#12-%E6%96%87%E6%9C%AC%E8%AF%86%E5%88%AB%E7%AE%97%E6%B3%95)。
-| 配置文件 | 算法名称 | backbone | trans | seq | pred |
-| :--------: | :-------: | :-------: | :-------: | :-----: | :-----: |
-| [rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml) | CRNN | Mobilenet_v3 small 0.5 | None | BiLSTM | ctc |
-| [rec_chinese_common_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml) | CRNN | ResNet34_vd | None | BiLSTM | ctc |
-| rec_icdar15_train.yml | CRNN | Mobilenet_v3 large 0.5 | None | BiLSTM | ctc |
-| rec_mv3_none_bilstm_ctc.yml | CRNN | Mobilenet_v3 large 0.5 | None | BiLSTM | ctc |
-| rec_mv3_none_none_ctc.yml | Rosetta | Mobilenet_v3 large 0.5 | None | None | ctc |
-| rec_r34_vd_none_bilstm_ctc.yml | CRNN | Resnet34_vd | None | BiLSTM | ctc |
-| rec_r34_vd_none_none_ctc.yml | Rosetta | Resnet34_vd | None | None | ctc |
-| rec_mv3_tps_bilstm_att.yml | CRNN | Mobilenet_v3 | TPS | BiLSTM | att |
-| rec_r34_vd_tps_bilstm_att.yml | CRNN | Resnet34_vd | TPS | BiLSTM | att |
-| rec_r50fpn_vd_none_srn.yml | SRN | Resnet50_fpn_vd | None | rnn | srn |
-| rec_mtb_nrtr.yml | NRTR | nrtr_mtb | None | transformer encoder | transformer decoder |
-| rec_r31_sar.yml | SAR | ResNet31 | None | LSTM encoder | LSTM decoder |
-| rec_resnet_stn_bilstm_att.yml | SEED | Aster_Resnet | STN | BiLSTM | att |
-
-*其中SEED模型需要额外加载FastText训练好的[语言模型](https://dl.fbaipublicfiles.com/fasttext/vectors-crawl/cc.en.300.bin.gz) ,并且安装 fasttext 依赖:
-```
-python3.7 -m pip install fasttext==0.9.1
-```
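-For training on Chinese data, [rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml) is recommended. To try other algorithms on a Chinese dataset, modify the configuration file as described below:
+For training on Chinese data, [ch_PP-OCRv3_rec.yml](../../configs/rec/PP-OCRv3/ch_PP-OCRv3_rec.yml) is recommended. To try other algorithms on a Chinese dataset, modify the configuration file as described below:
-Taking `rec_chinese_lite_train_v2.0.yml` as an example:
+Taking `ch_PP-OCRv3_rec.yml` as an example: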
```
Global:
...
@@ -288,7 +264,7 @@ Train:
...
- RecResizeImg:
# Modify image_shape to fit long text
- image_shape: [3, 32, 320]
+ image_shape: [3, 48, 320]
...
loader:
...
@@ -308,7 +284,7 @@ Eval:
...
- RecResizeImg:
# Modify image_shape to fit long text
- image_shape: [3, 32, 320]
+ image_shape: [3, 48, 320]
...
loader:
# batch_size for single-GPU evaluation
@@ -383,11 +359,11 @@ PaddleOCR支持了基于知识蒸馏的文本识别模型训练过程,更多
## 3 Evaluation
-The evaluation dataset can be set by modifying `label_file_path` under Eval in `configs/rec/rec_icdar15_train.yml`.
+The evaluation dataset can be set by modifying `label_file_path` under Eval in `configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml`.
```
# GPU evaluation; Global.checkpoints is the path of the weights to evaluate
-python3 -m paddle.distributed.launch --gpus '0' tools/eval.py -c configs/rec/rec_icdar15_train.yml -o Global.checkpoints={path/to/weights}/best_accuracy
+python3 -m paddle.distributed.launch --gpus '0' tools/eval.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.checkpoints={path/to/weights}/best_accuracy
```
@@ -417,7 +393,7 @@ output/rec/
```
# Predict English text
-python3 tools/infer_rec.py -c configs/rec/rec_icdar15_train.yml -o Global.pretrained_model={path/to/weights}/best_accuracy Global.load_static_weights=false Global.infer_img=doc/imgs_words/en/word_1.png
+python3 tools/infer_rec.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model={path/to/weights}/best_accuracy Global.infer_img=doc/imgs_words/en/word_1.png
```
Input image:
@@ -436,7 +412,7 @@ infer_img: doc/imgs_words/en/word_1.png
```
# Predict Chinese text
-python3 tools/infer_rec.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml -o Global.pretrained_model={path/to/weights}/best_accuracy Global.load_static_weights=false Global.infer_img=doc/imgs_words/ch/word_1.jpg
+python3 tools/infer_rec.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml -o Global.pretrained_model={path/to/weights}/best_accuracy Global.infer_img=doc/imgs_words/ch/word_1.jpg
```
Input image:
@@ -462,15 +438,15 @@ infer_img: doc/imgs_words/ch/word_1.jpg
# Global.pretrained_model sets the path of the trained model to convert; do not add the file suffix .pdmodel, .pdopt, or .pdparams.
# Global.save_inference_dir sets the directory where the converted model will be saved.
-python3 tools/export_model.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml -o Global.pretrained_model=./ch_lite/ch_ppocr_mobile_v2.0_rec_train/best_accuracy Global.save_inference_dir=./inference/rec_crnn/
+python3 tools/export_model.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model=en_PP-OCRv3_rec_train/best_accuracy Global.save_inference_dir=./inference/en_PP-OCRv3_rec/
```
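-**Note:** If you trained the model on your own dataset and changed the Chinese character dictionary file, check whether `character_dict_path` in the configuration file points to the required dictionary file.
+**Note:** If you trained the model on your own dataset and changed the Chinese character dictionary file, make sure to set `character_dict_path` in the configuration file to your custom dictionary file.
After a successful conversion, the directory contains three files: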
```
-/inference/rec_crnn/
+/inference/en_PP-OCRv3_rec/
├── inference.pdiparams # parameter file of the recognition inference model
├── inference.pdiparams.info # parameter info of the recognition inference model; can be ignored
└── inference.pdmodel # program file of the recognition inference model
@@ -481,5 +457,5 @@ python3 tools/export_model.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_trai
If the text dictionary was modified during training, specify the dictionary path with `--rec_char_dict_path` when predicting with the inference model.
```
- python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words_en/word_336.png" --rec_model_dir="./your inference model" --rec_image_shape="3, 32, 100" --rec_char_dict_path="your text dict path"
+ python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words_en/word_336.png" --rec_model_dir="./your inference model" --rec_image_shape="3, 48, 320" --rec_char_dict_path="your text dict path"
```
diff --git a/doc/doc_en/algorithm_rec_aster_en.md b/doc/doc_en/algorithm_rec_aster_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..1540681a19f94160e221c37173510395d0fd407f
--- /dev/null
+++ b/doc/doc_en/algorithm_rec_aster_en.md
@@ -0,0 +1,122 @@
+# STAR-Net
+
+- [1. Introduction](#1)
+- [2. Environment](#2)
+- [3. Model Training / Evaluation / Prediction](#3)
+ - [3.1 Training](#3-1)
+ - [3.2 Evaluation](#3-2)
+ - [3.3 Prediction](#3-3)
+- [4. Inference and Deployment](#4)
+ - [4.1 Python Inference](#4-1)
+ - [4.2 C++ Inference](#4-2)
+ - [4.3 Serving](#4-3)
+ - [4.4 More](#4-4)
+- [5. FAQ](#5)
+
+
+## 1. Introduction
+
+Paper:
+> [STAR-Net: a spatial attention residue network for scene text recognition.](http://www.bmva.org/bmvc/2016/papers/paper043/paper043.pdf)
+
+> Wei Liu, Chaofeng Chen, Kwan-Yee K. Wong, Zhizhong Su and Junyu Han.
+
+> BMVC, pages 43.1-43.13, 2016
+
+The models are trained on the MJSynth and SynthText text recognition datasets and evaluated on the IIIT, SVT, IC03, IC13, IC15, SVTP, and CUTE datasets. The reproduced results are as follows:
+
+|Model|Backbone|ACC|config|Download link|
+| --- | --- | --- | --- | --- |
+|StarNet|Resnet34_vd|84.44%|[configs/rec/rec_r34_vd_tps_bilstm_ctc.yml](../../configs/rec/rec_r34_vd_tps_bilstm_ctc.yml)|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_tps_bilstm_ctc_v2.0_train.tar)|
+|StarNet|MobileNetV3|81.42%|[configs/rec/rec_mv3_tps_bilstm_ctc.yml](../../configs/rec/rec_mv3_tps_bilstm_ctc.yml)|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_tps_bilstm_ctc_v2.0_train.tar)|
+
+
+## 2. Environment
+Please refer to ["Environment Preparation"](./environment.md) to configure the PaddleOCR environment, and refer to ["Project Clone"](./clone.md) to clone the project code.
+
+
+
+## 3. Model Training / Evaluation / Prediction
+
+Please refer to [Text Recognition Tutorial](./recognition.md). PaddleOCR modularizes the code, and training different recognition models only requires **changing the configuration file**.
+
+Training:
+
+Specifically, after the data preparation is completed, the training can be started. The training command is as follows:
+
+```
+# Single-GPU training (long training period, not recommended)
+python3 tools/train.py -c configs/rec/rec_r34_vd_tps_bilstm_ctc.yml
+
+# Multi-GPU training; specify the GPU IDs with the --gpus parameter
+python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/rec/rec_r34_vd_tps_bilstm_ctc.yml
+```
+
+Evaluation:
+
+```
+# GPU evaluation
+python3 -m paddle.distributed.launch --gpus '0' tools/eval.py -c configs/rec/rec_r34_vd_tps_bilstm_ctc.yml -o Global.pretrained_model={path/to/weights}/best_accuracy
+```
+
+Prediction:
+
+```
+# The configuration file used for prediction must match the one used for training
+python3 tools/infer_rec.py -c configs/rec/rec_r34_vd_tps_bilstm_ctc.yml -o Global.pretrained_model={path/to/weights}/best_accuracy Global.infer_img=doc/imgs_words/en/word_1.png
+```
+
+
+## 4. Inference and Deployment
+
+
+### 4.1 Python Inference
+First, convert the model saved during STAR-Net text recognition training into an inference model ([model download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_tps_bilstm_ctc_v2.0_train.tar)). You can use the following command to convert it:
+
+```
+python3 tools/export_model.py -c configs/rec/rec_r34_vd_tps_bilstm_ctc.yml -o Global.pretrained_model=./rec_r34_vd_tps_bilstm_ctc_v2.0_train/best_accuracy Global.save_inference_dir=./inference/rec_starnet
+```
+
+To run inference with the STAR-Net text recognition model, execute the following command:
+
+```
+python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words_en/word_336.png" --rec_model_dir="./inference/rec_starnet/" --rec_image_shape="3, 32, 100" --rec_char_dict_path="./ppocr/utils/ic15_dict.txt"
+```
+
+
+### 4.2 C++ Inference
+
+With the inference model prepared, refer to the [cpp infer](../../deploy/cpp_infer/) tutorial for C++ inference.
+
+
+
+### 4.3 Serving
+
+With the inference model prepared, refer to the [pdserving](../../deploy/pdserving/) tutorial to deploy it as a service with Paddle Serving.
+
+
+
+### 4.4 More
+
+More deployment schemes supported for STAR-Net:
+
+- Paddle2ONNX: with the inference model prepared, please refer to the [paddle2onnx](../../deploy/paddle2onnx/) tutorial.
+
+
+
+## 5. FAQ
+
+
+## Citation
+
+```bibtex
+@inproceedings{liu2016star,
+ title={STAR-Net: a spatial attention residue network for scene text recognition.},
+ author={Liu, Wei and Chen, Chaofeng and Wong, Kwan-Yee K and Su, Zhizhong and Han, Junyu},
+ booktitle={BMVC},
+ volume={2},
+ pages={7},
+ year={2016}
+}
+```
diff --git a/doc/doc_en/algorithm_rec_crnn_en.md b/doc/doc_en/algorithm_rec_crnn_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..571569ee445d756ca7bdfeea6d5f960187a5a666
--- /dev/null
+++ b/doc/doc_en/algorithm_rec_crnn_en.md
@@ -0,0 +1,123 @@
+# CRNN
+
+- [1. Introduction](#1)
+- [2. Environment](#2)
+- [3. Model Training / Evaluation / Prediction](#3)
+ - [3.1 Training](#3-1)
+ - [3.2 Evaluation](#3-2)
+ - [3.3 Prediction](#3-3)
+- [4. Inference and Deployment](#4)
+ - [4.1 Python Inference](#4-1)
+ - [4.2 C++ Inference](#4-2)
+ - [4.3 Serving](#4-3)
+ - [4.4 More](#4-4)
+- [5. FAQ](#5)
+
+
+## 1. Introduction
+
+Paper:
+> [An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition](https://arxiv.org/abs/1507.05717)
+
+> Baoguang Shi, Xiang Bai, Cong Yao
+
+> IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017 (arXiv preprint, 2015)
+
+The models are trained on the MJSynth and SynthText text recognition datasets and evaluated on the IIIT, SVT, IC03, IC13, IC15, SVTP, and CUTE datasets. The reproduced results are as follows:
+
+|Model|Backbone|ACC|config|Download link|
+| --- | --- | --- | --- | --- |
+|CRNN|Resnet34_vd|81.04%|[configs/rec/rec_r34_vd_none_bilstm_ctc.yml](../../configs/rec/rec_r34_vd_none_bilstm_ctc.yml)|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_none_bilstm_ctc_v2.0_train.tar)|
+|CRNN|MobileNetV3|77.95%|[configs/rec/rec_mv3_none_bilstm_ctc.yml](../../configs/rec/rec_mv3_none_bilstm_ctc.yml)|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_bilstm_ctc_v2.0_train.tar)|
+
+
+## 2. Environment
+Please refer to ["Environment Preparation"](./environment.md) to configure the PaddleOCR environment, and refer to ["Project Clone"](./clone.md) to clone the project code.
+
+
+
+## 3. Model Training / Evaluation / Prediction
+
+Please refer to [Text Recognition Tutorial](./recognition.md). PaddleOCR modularizes the code, and training different recognition models only requires **changing the configuration file**.
+
+Training:
+
+Specifically, after the data preparation is completed, the training can be started. The training command is as follows:
+
+```
+# Single-GPU training (long training period, not recommended)
+python3 tools/train.py -c configs/rec/rec_r34_vd_none_bilstm_ctc.yml
+
+# Multi-GPU training; specify the GPU IDs with the --gpus parameter
+python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/rec/rec_r34_vd_none_bilstm_ctc.yml
+```
+
+Evaluation:
+
+```
+# GPU evaluation
+python3 -m paddle.distributed.launch --gpus '0' tools/eval.py -c configs/rec/rec_r34_vd_none_bilstm_ctc.yml -o Global.pretrained_model={path/to/weights}/best_accuracy
+```
+
+Prediction:
+
+```
+# The configuration file used for prediction must match the one used for training
+python3 tools/infer_rec.py -c configs/rec/rec_r34_vd_none_bilstm_ctc.yml -o Global.pretrained_model={path/to/weights}/best_accuracy Global.infer_img=doc/imgs_words/en/word_1.png
+```
+
+
+## 4. Inference and Deployment
+
+
+### 4.1 Python Inference
+First, convert the model saved during CRNN text recognition training into an inference model ([model download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_none_bilstm_ctc_v2.0_train.tar)). You can use the following command to convert it:
+
+```
+python3 tools/export_model.py -c configs/rec/rec_r34_vd_none_bilstm_ctc.yml -o Global.pretrained_model=./rec_r34_vd_none_bilstm_ctc_v2.0_train/best_accuracy Global.save_inference_dir=./inference/rec_crnn
+```
+
+To run inference with the CRNN text recognition model, execute the following command:
+
+```
+python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words_en/word_336.png" --rec_model_dir="./inference/rec_crnn/" --rec_image_shape="3, 32, 100" --rec_char_dict_path="./ppocr/utils/ic15_dict.txt"
+```
+
+
+### 4.2 C++ Inference
+
+With the inference model prepared, refer to the [cpp infer](../../deploy/cpp_infer/) tutorial for C++ inference.
+
+
+
+### 4.3 Serving
+
+With the inference model prepared, refer to the [pdserving](../../deploy/pdserving/) tutorial to deploy it as a service with Paddle Serving.
+
+
+
+### 4.4 More
+
+More deployment schemes supported for CRNN:
+
+- Paddle2ONNX: with the inference model prepared, please refer to the [paddle2onnx](../../deploy/paddle2onnx/) tutorial.
+
+
+
+## 5. FAQ
+
+
+## Citation
+
+```bibtex
+@ARTICLE{7801919,
+ author={Shi, Baoguang and Bai, Xiang and Yao, Cong},
+ journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
+ title={An End-to-End Trainable Neural Network for Image-Based Sequence Recognition and Its Application to Scene Text Recognition},
+ year={2017},
+ volume={39},
+ number={11},
+ pages={2298-2304},
+ doi={10.1109/TPAMI.2016.2646371}}
+```
diff --git a/doc/doc_en/algorithm_rec_seed_en.md b/doc/doc_en/algorithm_rec_seed_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..21679f42fd6302228804db49d731f9b69ec692b2
--- /dev/null
+++ b/doc/doc_en/algorithm_rec_seed_en.md
@@ -0,0 +1,111 @@
+# SEED
+
+- [1. Introduction](#1)
+- [2. Environment](#2)
+- [3. Model Training / Evaluation / Prediction](#3)
+ - [3.1 Training](#3-1)
+ - [3.2 Evaluation](#3-2)
+ - [3.3 Prediction](#3-3)
+- [4. Inference and Deployment](#4)
+ - [4.1 Python Inference](#4-1)
+ - [4.2 C++ Inference](#4-2)
+ - [4.3 Serving](#4-3)
+ - [4.4 More](#4-4)
+- [5. FAQ](#5)
+
+
+## 1. Introduction
+
+Paper:
+> [SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition](https://arxiv.org/pdf/2005.10977.pdf)
+
+> Zhi Qiao, Yu Zhou, Dongbao Yang, Yucan Zhou, Weiping Wang
+
+> CVPR, 2020
+
+The model is trained on the MJSynth and SynthText text recognition datasets and evaluated on the IIIT, SVT, IC03, IC13, IC15, SVTP, and CUTE datasets. The reproduced results are as follows:
+
+|Model|Backbone|ACC|config|Download link|
+| --- | --- | --- | --- | --- |
+|SEED|Aster_Resnet| 85.2% | [configs/rec/rec_resnet_stn_bilstm_att.yml](../../configs/rec/rec_resnet_stn_bilstm_att.yml) | [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/rec/rec_resnet_stn_bilstm_att.tar) |
+
+
+## 2. Environment
+Please refer to ["Environment Preparation"](./environment.md) to configure the PaddleOCR environment, and refer to ["Project Clone"](./clone.md) to clone the project code.
+
+
+
+## 3. Model Training / Evaluation / Prediction
+
+Please refer to [Text Recognition Tutorial](./recognition.md). PaddleOCR modularizes the code, and training different recognition models only requires **changing the configuration file**.
+
+Training:
+
+The SEED model additionally needs to load the [language model](https://dl.fbaipublicfiles.com/fasttext/vectors-crawl/cc.en.300.bin.gz) trained with FastText, and requires the fasttext dependency to be installed:
+
+```
+python3 -m pip install fasttext==0.9.1
+```
+
+Specifically, after the data preparation is completed, the training can be started. The training command is as follows:
+
+```
+# Single-GPU training (long training period, not recommended)
+python3 tools/train.py -c configs/rec/rec_resnet_stn_bilstm_att.yml
+
+# Multi-GPU training; specify the GPU IDs with the --gpus parameter
+python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/rec/rec_resnet_stn_bilstm_att.yml
+```
+
+Evaluation:
+
+```
+# GPU evaluation
+python3 -m paddle.distributed.launch --gpus '0' tools/eval.py -c configs/rec/rec_resnet_stn_bilstm_att.yml -o Global.pretrained_model={path/to/weights}/best_accuracy
+```
+
+Prediction:
+
+```
+# The configuration file used for prediction must match the one used for training
+python3 tools/infer_rec.py -c configs/rec/rec_resnet_stn_bilstm_att.yml -o Global.pretrained_model={path/to/weights}/best_accuracy Global.infer_img=doc/imgs_words/en/word_1.png
+```
+
+
+## 4. Inference and Deployment
+
+
+### 4.1 Python Inference
+
+Not supported
+
+
+### 4.2 C++ Inference
+
+Not supported
+
+
+### 4.3 Serving
+
+Not supported
+
+
+### 4.4 More
+
+Not supported
+
+
+## 5. FAQ
+
+
+## Citation
+
+```bibtex
+@inproceedings{qiao2020seed,
+ title={Seed: Semantics enhanced encoder-decoder framework for scene text recognition},
+ author={Qiao, Zhi and Zhou, Yu and Yang, Dongbao and Zhou, Yucan and Wang, Weiping},
+ booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
+ pages={13528--13537},
+ year={2020}
+}
+```