Commit 802caecc, author: grasswolfs

Merge branch 'dygraph' of https://github.com/PaddlePaddle/PaddleOCR into update_readme_1215

...@@ -27,8 +27,8 @@ PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools ...@@ -27,8 +27,8 @@ PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools
## Visualization ## Visualization
<div align="center"> <div align="center">
<img src="doc/imgs_results/1101.jpg" width="800"> <img src="doc/imgs_results/ch_ppocr_mobile_v2.0/test_add_91.jpg" width="800">
<img src="doc/imgs_results/1103.jpg" width="800"> <img src="doc/imgs_results/ch_ppocr_mobile_v2.0/00018069.jpg" width="800">
</div> </div>
The above pictures are the visualizations of the general ppocr_server model. For more effect pictures, please see [More visualizations](./doc/doc_en/visualization_en.md). The above pictures are the visualizations of the general ppocr_server model. For more effect pictures, please see [More visualizations](./doc/doc_en/visualization_en.md).
...@@ -58,6 +58,7 @@ Mobile DEMO experience (based on EasyEdge and Paddle-Lite, supports iOS and Andr ...@@ -58,6 +58,7 @@ Mobile DEMO experience (based on EasyEdge and Paddle-Lite, supports iOS and Andr
<a name="Supported-Chinese-model-list"></a> <a name="Supported-Chinese-model-list"></a>
## PP-OCR 2.0 series model list(Update on Dec 15) ## PP-OCR 2.0 series model list(Update on Dec 15)
| Model introduction | Model name | Recommended scene | Detection model | Direction classifier | Recognition model | | Model introduction | Model name | Recommended scene | Detection model | Direction classifier | Recognition model |
...@@ -129,21 +130,21 @@ PP-OCR is a practical ultra-lightweight OCR system. It is mainly composed of thr ...@@ -129,21 +130,21 @@ PP-OCR is a practical ultra-lightweight OCR system. It is mainly composed of thr
## Visualization [more](./doc/doc_en/visualization_en.md) ## Visualization [more](./doc/doc_en/visualization_en.md)
- Chinese OCR model - Chinese OCR model
<div align="center"> <div align="center">
<img src="./doc/imgs_results/1102.jpg" width="800"> <img src="./doc/imgs_results/ch_ppocr_mobile_v2.0/test_add_91.jpg" width="800">
<img src="./doc/imgs_results/1104.jpg" width="800"> <img src="./doc/imgs_results/ch_ppocr_mobile_v2.0/00015504.jpg" width="800">
<img src="./doc/imgs_results/1106.jpg" width="800"> <img src="./doc/imgs_results/ch_ppocr_mobile_v2.0/00056221.jpg" width="800">
<img src="./doc/imgs_results/1105.jpg" width="800"> <img src="./doc/imgs_results/ch_ppocr_mobile_v2.0/rotate_00052204.jpg" width="800">
</div> </div>
- English OCR model - English OCR model
<div align="center"> <div align="center">
<img src="./doc/imgs_results/img_12.jpg" width="800"> <img src="./doc/imgs_results/ch_ppocr_mobile_v2.0/img_12.jpg" width="800">
</div> </div>
- Multilingual OCR model - Multilingual OCR model
<div align="center"> <div align="center">
<img src="./doc/imgs_results/1110.jpg" width="800"> <img src="./doc/imgs_results/french_0.jpg" width="800">
<img src="./doc/imgs_results/1112.jpg" width="800"> <img src="./doc/imgs_results/korean.jpg" width="800">
</div> </div>
......
...@@ -36,8 +36,8 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库,助力 ...@@ -36,8 +36,8 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库,助力
## 效果展示 ## 效果展示
<div align="center"> <div align="center">
<img src="doc/imgs_results/1101.jpg" width="800"> <img src="doc/imgs_results/ch_ppocr_mobile_v2.0/test_add_91.jpg" width="800">
<img src="doc/imgs_results/1103.jpg" width="800"> <img src="doc/imgs_results/ch_ppocr_mobile_v2.0/00018069.jpg" width="800">
</div> </div>
上图是通用ppocr_server模型效果展示,更多效果图请见[效果展示页面](./doc/doc_ch/visualization.md) 上图是通用ppocr_server模型效果展示,更多效果图请见[效果展示页面](./doc/doc_ch/visualization.md)
...@@ -120,10 +120,10 @@ PP-OCR是一个实用的超轻量OCR系统。主要由DB文本检测、检测框 ...@@ -120,10 +120,10 @@ PP-OCR是一个实用的超轻量OCR系统。主要由DB文本检测、检测框
## 效果展示 [more](./doc/doc_ch/visualization.md) ## 效果展示 [more](./doc/doc_ch/visualization.md)
- 中文模型 - 中文模型
<div align="center"> <div align="center">
<img src="./doc/imgs_results/1102.jpg" width="800"> <img src="./doc/imgs_results/ch_ppocr_mobile_v2.0/test_add_91.jpg" width="800">
<img src="./doc/imgs_results/1104.jpg" width="800"> <img src="./doc/imgs_results/ch_ppocr_mobile_v2.0/00015504.jpg" width="800">
<img src="./doc/imgs_results/1106.jpg" width="800"> <img src="./doc/imgs_results/ch_ppocr_mobile_v2.0/00056221.jpg" width="800">
<img src="./doc/imgs_results/1105.jpg" width="800"> <img src="./doc/imgs_results/ch_ppocr_mobile_v2.0/rotate_00052204.jpg" width="800">
</div> </div>
- 英文模型 - 英文模型
...@@ -133,8 +133,8 @@ PP-OCR是一个实用的超轻量OCR系统。主要由DB文本检测、检测框 ...@@ -133,8 +133,8 @@ PP-OCR是一个实用的超轻量OCR系统。主要由DB文本检测、检测框
- 其他语言模型 - 其他语言模型
<div align="center"> <div align="center">
<img src="./doc/imgs_results/1110.jpg" width="800"> <img src="./doc/imgs_results/french_0.jpg" width="800">
<img src="./doc/imgs_results/1112.jpg" width="800"> <img src="./doc/imgs_results/korean.jpg" width="800">
</div> </div>
<a name="欢迎加入PaddleOCR技术交流群"></a> <a name="欢迎加入PaddleOCR技术交流群"></a>
......
## Style Text Rec ## Style Text Rec
### 目录 ### 目录
[工具简介](#工具简介) - [工具简介](#工具简介)
[环境配置](#环境配置) - [环境配置](#环境配置)
[快速上手](#快速上手) - [快速上手](#快速上手)
[高级使用](#高级使用) - [高级使用](#高级使用)
[应用示例](#应用示例) - [应用示例](#应用示例)
### 工具简介 ### 工具简介
<div align="center"> <div align="center">
<img src="doc/images/3.png" width="800"> <img src="doc/images/3.png" width="800">
</div> </div>
Style-Text是对百度自研文本编辑算法《Editing Text in the Wild》中提出的SRNet网络的改进,不同于常用的GAN的方法只选择一个分支,该工具将文本合成任务分解为三个子模块,文本风格迁移模块、背景抽取模块和前背景融合模块,来提升合成数据的效果。下图显示了一些示例结果。
<div align="center"> <div align="center">
<img src="doc/images/1.png" width="800"> <img src="doc/images/1.png" width="600">
<img src="doc/images/2.png" width="800">
</div> </div>
此外,在实际铭牌文本识别场景和韩语文本识别场景,验证了该合成工具的有效性。 Style-Text数据合成工具是基于百度自研的文本编辑算法《Editing Text in the Wild》https://arxiv.org/abs/1908.03047
不同于常用的基于GAN的数据合成工具,Style-Text主要框架包括:1.文本前景风格迁移模块 2.背景抽取模块 3.融合模块。经过这样三步,就可以迅速实现图片文字风格迁移。下图是一些该数据合成工具效果图。
<div align="center">
<img src="doc/images/2.png" width="1000">
</div>
### 环境配置 ### 环境配置
1. 参考[快速安装](../doc/doc_ch/installation.md),安装PaddleOCR。强烈建议您使用python3环境。 1. 参考[快速安装](../doc/doc_ch/installation.md),安装PaddleOCR。
2. 进入`style_text_rec`目录,下载模型,并解压: 2. 进入`style_text_rec`目录,下载模型,并解压:
```bash ```bash
...@@ -159,4 +161,4 @@ style_text_rec ...@@ -159,4 +161,4 @@ style_text_rec
|-- logging.py |-- logging.py
|-- math_functions.py |-- math_functions.py
`-- sys_funcs.py `-- sys_funcs.py
``` ```
\ No newline at end of file
...@@ -128,24 +128,32 @@ python3 tools/export_model.py -c configs/cls/cls_mv3.yml -o Global.pretrained_mo ...@@ -128,24 +128,32 @@ python3 tools/export_model.py -c configs/cls/cls_mv3.yml -o Global.pretrained_mo
超轻量中文检测模型推理,可以执行如下命令: 超轻量中文检测模型推理,可以执行如下命令:
``` ```
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/2.jpg" --det_model_dir="./inference/det_db/" # 下载超轻量中文检测模型:
wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar
tar xf ch_ppocr_mobile_v2.0_det_infer.tar
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/22.jpg" --det_model_dir="./ch_ppocr_mobile_v2.0_det_infer/"
``` ```
可视化文本检测结果默认保存到`./inference_results`文件夹里面,结果文件的名称前缀为'det_res'。结果示例如下: 可视化文本检测结果默认保存到`./inference_results`文件夹里面,结果文件的名称前缀为'det_res'。结果示例如下:
![](../imgs_results/det_res_2.jpg) ![](../imgs_results/det_res_22.jpg)
通过参数`limit_type`和`det_limit_side_len`来对图片的尺寸进行限制限,`limit_type=max`为限制长边长度<`det_limit_side_len`,`limit_type=min`为限制短边长度>`det_limit_side_len`, 通过参数`limit_type`和`det_limit_side_len`来对图片的尺寸进行限制,
图片不满足限制条件时(`limit_type=max`时长边长度>`det_limit_side_len`或`limit_type=min`时短边长度<`det_limit_side_len`),将对图片进行等比例缩放。 `limit_type`可选参数为[`max`, `min`],
该参数默认设置为`limit_type='max',det_max_side_len=960`。 如果输入图片的分辨率比较大,而且想使用更大的分辨率预测,可以执行如下命令: `det_limit_side_len` 为正整数,一般设置为32 的倍数,比如960。
参数默认设置为`limit_type='max', det_limit_side_len=960`。表示网络输入图像的最长边不能超过960,
如果超过这个值,会对图像做等宽比的resize操作,确保最长边为`det_limit_side_len`
设置为`limit_type='min', det_limit_side_len=960` 则表示限制图像的最短边为960。
如果输入图片的分辨率比较大,而且想使用更大的分辨率预测,可以设置det_limit_side_len 为想要的值,比如1216:
``` ```
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/2.jpg" --det_model_dir="./inference/det_db/" --det_limit_type=max --det_limit_side_len=1200 python3 tools/infer/predict_det.py --image_dir="./doc/imgs/2.jpg" --det_model_dir="./inference/det_db/" --det_limit_type=max --det_limit_side_len=1216
``` ```
如果想使用CPU进行预测,执行命令如下 如果想使用CPU进行预测,执行命令如下
``` ```
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/2.jpg" --det_model_dir="./inference/det_db/" --use_gpu=False python3 tools/infer/predict_det.py --image_dir="./doc/imgs/2.jpg" --det_model_dir="./inference/det_db/" --use_gpu=False
``` ```
<a name="DB文本检测模型推理"></a> <a name="DB文本检测模型推理"></a>
......
# 效果展示 # 效果展示
<a name="通用ppocr_server_2.0效果展示"></a> <a name="超轻量ppocr_server_2.0效果展示"></a>
## 通用ppocr_server_2.0效果展示 ## 通用ppocr_server_2.0 效果展示
<div align="center"> <div align="center">
<img src="../imgs_results/1101.jpg" width="800"> <img src="../imgs_results/ch_ppocr_mobile_v2.0/00006737.jpg" width="800">
<img src="../imgs_results/1102.jpg" width="800"> <img src="../imgs_results/ch_ppocr_mobile_v2.0/00009282.jpg" width="800">
<img src="../imgs_results/1103.jpg" width="800"> <img src="../imgs_results/ch_ppocr_mobile_v2.0/00015504.jpg" width="800">
<img src="../imgs_results/1104.jpg" width="800"> <img src="../imgs_results/ch_ppocr_mobile_v2.0/00018069.jpg" width="800">
<img src="../imgs_results/1105.jpg" width="800"> <img src="../imgs_results/ch_ppocr_mobile_v2.0/00056221.jpg" width="800">
<img src="../imgs_results/1106.jpg" width="800"> <img src="../imgs_results/ch_ppocr_mobile_v2.0/00057937.jpg" width="800">
<img src="../imgs_results/ch_ppocr_mobile_v2.0/00059985.jpg" width="800">
<img src="../imgs_results/ch_ppocr_mobile_v2.0/00111002.jpg" width="800">
<img src="../imgs_results/ch_ppocr_mobile_v2.0/00077949.jpg" width="800">
<img src="../imgs_results/ch_ppocr_mobile_v2.0/00207393.jpg" width="800">
</div> </div>
<a name="英文识别模型效果展示"></a> <a name="英文识别模型效果展示"></a>
## 英文识别模型效果展示 ## 英文识别模型效果展示
<div align="center"> <div align="center">
<img src="../imgs_results/img_12.jpg" width="800"> <img src="../imgs_results/ch_ppocr_mobile_v2.0/img_12.jpg" width="800">
</div> </div>
<a name="多语言识别模型效果展示"></a> <a name="多语言识别模型效果展示"></a>
## 多语言识别模型效果展示 ## 多语言识别模型效果展示
<div align="center"> <div align="center">
<img src="../imgs_results/1110.jpg" width="800"> <img src="../imgs_results/french_0.jpg" width="800">
<img src="../imgs_results/1112.jpg" width="800"> <img src="../imgs_results/korean.jpg" width="800">
</div>
<a name="超轻量ppocr_mobile_1.0效果展示"></a>
## 超轻量ppocr_mobile_1.0效果展示
<div align="center">
<img src="../imgs_results/1.jpg" width="800">
<img src="../imgs_results/7.jpg" width="800">
<img src="../imgs_results/6.jpg" width="800">
<img src="../imgs_results/16.png" width="800">
</div>
<a name="通用ppocr_server_1.0效果展示"></a>
## 通用ppocr_server_1.0效果展示
<div align="center">
<img src="../imgs_results/chinese_db_crnn_server/11.jpg" width="800">
<img src="../imgs_results/chinese_db_crnn_server/2.jpg" width="800">
<img src="../imgs_results/chinese_db_crnn_server/8.jpg" width="800">
</div> </div>
...@@ -134,24 +134,33 @@ Because EAST and DB algorithms are very different, when inference, it is necessa ...@@ -134,24 +134,33 @@ Because EAST and DB algorithms are very different, when inference, it is necessa
For lightweight Chinese detection model inference, you can execute the following commands: For lightweight Chinese detection model inference, you can execute the following commands:
``` ```
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/2.jpg" --det_model_dir="./inference/det_db/" # download DB text detection inference model
wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar
tar xf ch_ppocr_mobile_v2.0_det_infer.tar
# predict
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/22.jpg" --det_model_dir="./ch_ppocr_mobile_v2.0_det_infer/"
``` ```
The visual text detection results are saved to the ./inference_results folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows: The visual text detection results are saved to the ./inference_results folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows:
![](../imgs_results/det_res_2.jpg) ![](../imgs_results/det_res_22.jpg)
The size of the image is limited by the parameters `limit_type` and `det_limit_side_len`, `limit_type=max` is to limit the length of the long side <`det_limit_side_len`, and `limit_type=min` is to limit the length of the short side>`det_limit_side_len`, You can use the parameters `limit_type` and `det_limit_side_len` to limit the size of the input image,
When the picture does not meet the restriction conditions (for `limit_type=max`and long side >`det_limit_side_len` or for `min` and short side <`det_limit_side_len`), the image will be scaled proportionally. The optional parameters of `limit_type` are [`max`, `min`], and
This parameter is set to `limit_type='max', det_max_side_len=960` by default. If the resolution of the input picture is relatively large, and you want to use a larger resolution prediction, you can execute the following command: `det_limit_side_len` is a positive integer, generally set to a multiple of 32, such as 960.
The default setting is `limit_type='max', det_limit_side_len=960`, which means the longest side of the network input image cannot exceed 960.
If this value is exceeded, the image is resized with its aspect ratio preserved so that the longest side equals `det_limit_side_len`.
Setting `limit_type='min', det_limit_side_len=960` instead limits the shortest side of the image to 960.
If the input picture has a relatively high resolution and you want to predict at a larger resolution, you can set `det_limit_side_len` to the desired value, such as 1216:
``` ```
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/2.jpg" --det_model_dir="./inference/det_db/" --det_limit_type=max --det_limit_side_len=1200 python3 tools/infer/predict_det.py --image_dir="./doc/imgs/22.jpg" --det_model_dir="./inference/det_db/" --det_limit_type=max --det_limit_side_len=1216
``` ```
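The resize rule described above can be sketched in a few lines of Python. This is only an illustration of the documented behaviour, not PaddleOCR's actual preprocessing code: the function name `target_size` and the `stride` parameter are hypothetical, and snapping the result to a multiple of 32 is an assumption based on the note that the side length is generally a multiple of 32.

```python
def target_size(height, width, limit_type="max", limit_side_len=960, stride=32):
    """Return the (height, width) an image would be resized to.

    Hypothetical helper: mirrors the rule in the text, not PaddleOCR's source.
    """
    if limit_type not in ("max", "min"):
        raise ValueError("limit_type must be 'max' or 'min'")

    if limit_type == "max":
        # Shrink only when the longest side exceeds the limit.
        longest = max(height, width)
        ratio = limit_side_len / longest if longest > limit_side_len else 1.0
    else:
        # Enlarge only when the shortest side is below the limit.
        shortest = min(height, width)
        ratio = limit_side_len / shortest if shortest < limit_side_len else 1.0

    # Assumption: snap each side to the nearest multiple of `stride`,
    # never going below one stride.
    new_h = max(stride, round(height * ratio / stride) * stride)
    new_w = max(stride, round(width * ratio / stride) * stride)
    return new_h, new_w


if __name__ == "__main__":
    print(target_size(1920, 1080))             # default: limit the longest side
    print(target_size(480, 640, "min", 960))   # limit the shortest side
```

Both sides are scaled by the same ratio, so the aspect ratio is preserved up to the final rounding step.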
If you want to use the CPU for prediction, execute the command as follows If you want to use the CPU for prediction, execute the command as follows
``` ```
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/2.jpg" --det_model_dir="./inference/det_db/" --use_gpu=False python3 tools/infer/predict_det.py --image_dir="./doc/imgs/22.jpg" --det_model_dir="./inference/det_db/" --use_gpu=False
``` ```
<a name="DB_DETECTION"></a> <a name="DB_DETECTION"></a>
......
# Visualization # Visualization
<a name="ppocr_server_1.1"></a>
## ch_ppocr_server_1.1
<div align="center">
<img src="../imgs_results/1101.jpg" width="800">
<img src="../imgs_results/1102.jpg" width="800">
<img src="../imgs_results/1103.jpg" width="800">
<img src="../imgs_results/1104.jpg" width="800">
<img src="../imgs_results/1105.jpg" width="800">
<img src="../imgs_results/1106.jpg" width="800">
</div>
<a name="en_ppocr_mobile_1.1"></a>
## en_ppocr_mobile_1.1
<div align="center">
<img src="../imgs_results/img_12.jpg" width="800">
</div>
<a name="ppocr_server_2.0"></a>
## ch_ppocr_server_2.0
<a name="multilingual"></a>
## (multilingual)_ppocr_mobile_1.1
<div align="center"> <div align="center">
<img src="../imgs_results/1110.jpg" width="800"> <img src="../imgs_results/ch_ppocr_mobile_v2.0/00006737.jpg" width="800">
<img src="../imgs_results/1112.jpg" width="800"> <img src="../imgs_results/ch_ppocr_mobile_v2.0/00009282.jpg" width="800">
<img src="../imgs_results/ch_ppocr_mobile_v2.0/00015504.jpg" width="800">
<img src="../imgs_results/ch_ppocr_mobile_v2.0/00018069.jpg" width="800">
<img src="../imgs_results/ch_ppocr_mobile_v2.0/00056221.jpg" width="800">
<img src="../imgs_results/ch_ppocr_mobile_v2.0/00057937.jpg" width="800">
<img src="../imgs_results/ch_ppocr_mobile_v2.0/00059985.jpg" width="800">
<img src="../imgs_results/ch_ppocr_mobile_v2.0/00111002.jpg" width="800">
<img src="../imgs_results/ch_ppocr_mobile_v2.0/00077949.jpg" width="800">
<img src="../imgs_results/ch_ppocr_mobile_v2.0/00207393.jpg" width="800">
</div> </div>
<a name="ppocr_mobile_1.0"></a>
## ppocr_mobile_1.0
<a name="en_ppocr_mobile_2.0"></a>
## en_ppocr_mobile_2.0
<div align="center"> <div align="center">
<img src="../imgs_results/1.jpg" width="800"> <img src="../imgs_results/ch_ppocr_mobile_v2.0/img_12.jpg" width="800">
<img src="../imgs_results/7.jpg" width="800">
<img src="../imgs_results/6.jpg" width="800">
<img src="../imgs_results/16.png" width="800">
</div> </div>
<a name="ppocr_server_1.0"></a> <a name="multilingual"></a>
## ppocr_server_1.0 ## (multilingual)_ppocr_mobile_2.0
<div align="center"> <div align="center">
<img src="../imgs_results/chinese_db_crnn_server/11.jpg" width="800"> <img src="../imgs_results/french_0.jpg" width="800">
<img src="../imgs_results/chinese_db_crnn_server/2.jpg" width="800"> <img src="../imgs_results/korean.jpg" width="800">
<img src="../imgs_results/chinese_db_crnn_server/8.jpg" width="800">
</div> </div>
doc/imgs/korean_1.jpg (binary image changed; listed sizes: 982.9 KB and 38.8 KB)