diff --git a/README.md b/README.md index a9ea02eeab734a01a30d407aa63ee19f120cb6cf..88c1dc18fa95d96a32fb68b50520d41dfd1b8935 100644 --- a/README.md +++ b/README.md @@ -4,25 +4,22 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库,助力使用者训练出更好的模型,并应用落地。 **近期更新** -- 2020.9.17 更新超轻量ppocr_mobile系列和通用ppocr_server系列系列中英文ocr模型,效果媲美商业效果。[模型下载](#模型下载) +- 2020.9.19 更新超轻量压缩ppocr_mobile_slim系列模型,整体模型3.5M(详见[PP-OCR Pipline](#PP-OCR)),适合在移动端部署使用。[模型下载](#模型下载) +- 2020.9.17 更新超轻量ppocr_mobile系列和通用ppocr_server系列中英文ocr模型,媲美商业效果。[模型下载](#模型下载) - 2020.8.26 更新OCR相关的84个常见问题及解答,具体参考[FAQ](./doc/doc_ch/FAQ.md) - 2020.8.24 支持通过whl包安装使用PaddleOCR,具体参考[Paddleocr Package使用说明](./doc/doc_ch/whl.md) - 2020.8.21 更新8月18日B站直播课回放和PPT,课节2,易学易用的OCR工具大礼包,[获取地址](https://aistudio.baidu.com/aistudio/education/group/info/1519) -- 2020.8.16 开源文本检测算法[SAST](https://arxiv.org/abs/1908.05498)和文本识别算法[SRN](https://arxiv.org/abs/2003.12294) -- 2020.7.23 发布7月21日B站直播课回放和PPT,课节1,PaddleOCR开源大礼包全面解读,[获取地址](https://aistudio.baidu.com/aistudio/course/introduce/1519) -- 2020.7.15 添加基于EasyEdge和Paddle-Lite的移动端DEMO,支持iOS和Android系统 -- [more](./doc/doc_ch/update.md) +- [More](./doc/doc_ch/update.md) ## 特性 - PPOCR系列高质量预训练模型,媲美商业效果 - - 超轻量ppocr_mobile系列:检测(2.5M)+方向分类器(0.9M - )+ 识别(4.5M)= 7.9M + - 超轻量ppocr_mobile移动端系列:检测(2.6M)+方向分类器(0.9M)+ 识别(4.6M)= 8.1M - 通用ppocr_server系列:检测(47.2M)+方向分类器(0.9M)+ 识别(107M)= 155.1M - - 超轻量压缩ppocr_mobile_slim系列:(coming soon) + - 超轻量压缩ppocr_mobile_slim系列:检测(1.4M)+方向分类器(0.5M)+ 识别(1.6M)= 3.5M - 支持中英文数字组合识别、竖排文本识别、长文本识别 -- 支持多语言识别:韩语、日语、德语、法语 (coming soon) +- 支持多语言识别:韩语、日语、德语、法语 - 支持用户自定义训练,提供丰富的预测推理部署方案 - 支持PIP快速安装使用 - 可运行于Linux、Windows、MacOS等多种系统 @@ -31,10 +28,10 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库,助力
- +
-上图是超轻量级中文OCR模型效果展示,更多效果图请见[效果展示页面](./doc/doc_ch/visualization.md)。 +上图是通用ppocr_server模型效果展示,更多效果图请见[效果展示页面](./doc/doc_ch/visualization.md)。 ## 快速体验 - PC端:超轻量级中文OCR在线体验地址:https://www.paddlepaddle.org.cn/hub/scene/ocr @@ -46,44 +43,39 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库,助力 -- 代码体验:可以直接进入[快速安装](./doc/doc_ch/installation.md) +- 代码体验:从[快速安装](./doc/doc_ch/installation.md) 开始 ## PP-OCR 1.1系列模型列表(9月17日更新) | 模型简介 | 模型名称 |推荐场景 | 检测模型 | 方向分类器 | 识别模型 | | | ------------ | --------------- | ----------------|---- | ---------- | -------- | ---- | -| 中英文超轻量OCR模型(7.9M) | ch_ppocr_mobile_v1.1_xx |移动端&服务器端|[预测模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_train.tar)|[预测模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile-v1.1.cls_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile-v1.1.cls_train.tar) |[预测模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_pre.tar) | | -| 中英文通用OCR模型(155.1M) |ch_ppocr_server_v1.1_xx|服务器端 |[预测模型](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_train.tar) |[预测模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile-v1.1.cls_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile-v1.1.cls_train.tar) |[预测模型](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_pre.tar) | | -| 中英文超轻量压缩OCR模型 | ch_ppocr_mobile_slim_v1.1_xx| 移动端 |即将开源 |即将开源|即将开源| | || - -更多V1.1版本模型下载,可以参考[OCR1.1模型列表](./doc/doc_ch/models_list.md) - -## PP-OCR 1.0系列模型列表(7月16日更新) - -| 模型简介 | 模型名称 | 检测模型 | 识别模型 | 支持空格的识别模型 | 
| -| ------------ | ---------------------- | -------- | ---------- | -------- | ---- | -| 超轻量中英文OCR模型(8.6M) | chinese_db_crnn_mobile_xx |[inference模型](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db.tar)|[inference模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn.tar)|[inference模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance.tar) | | -|通用中文OCR模型(212M)|chinese_db_crnn_server_xx|[inference模型](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db.tar)|[inference模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn.tar)|[inference模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance.tar)| | +| 中英文超轻量OCR模型(8.1M) | ch_ppocr_mobile_v1.1_xx |移动端&服务器端|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_train.tar)|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile-v1.1.cls_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile-v1.1.cls_train.tar) |[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_pre.tar) | | +| 中英文通用OCR模型(155.1M) |ch_ppocr_server_v1.1_xx|服务器端 |[预测模型](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_infer.tar) / 
[预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_train.tar) |[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile-v1.1.cls_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile-v1.1.cls_train.tar) |[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_pre.tar) | | +| 中英文超轻量压缩OCR模型 | ch_ppocr_mobile_slim_v1.1_xx| 移动端 |[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/det/ch_ppocr_mobile_v1.1_det_prune_infer.tar) / [slim模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/det/ch_ppocr_mobile_v1.1_det_prune_opt.nb) |[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/rec/ch_ppocr_mobile_v1.1_rec_quant_infer.tar) / [slim模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/rec/ch_ppocr_mobile_v1.1_rec_quant_opt.nb)|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_quant_infer.tar) / [slim模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_cls_quant_opt.nb)| | || +更多模型下载(包括多语言),可以参考[PP-OCR v1.1 系列模型下载](./doc/doc_ch/models_list.md) ## 文档教程 - [快速安装](./doc/doc_ch/installation.md) - [中文OCR模型快速使用](./doc/doc_ch/quickstart.md) +- [代码组织结构](./doc/doc_ch/tree.md) - 算法介绍 - [文本检测](./doc/doc_ch/algorithm_overview.md) - [文本识别](./doc/doc_ch/algorithm_overview.md) - - PP-OCR (coming soon) + - [PP-OCR Pipline](#PP-OCR) - 模型训练/评估 - [文本检测](./doc/doc_ch/detection.md) - [文本识别](./doc/doc_ch/recognition.md) - [yml参数配置文件介绍](./doc/doc_ch/config.md) - 预测部署 - - [基于Python预测引擎推理](./doc/doc_ch/inference.md) + - [基于pip安装whl包快速推理](./doc/doc_ch/whl.md) + - [基于Python脚本预测引擎推理](./doc/doc_ch/inference.md) - [基于C++预测引擎推理](./deploy/cpp_infer/readme.md) - - [服务化部署](./doc/doc_ch/serving.md) + - [服务化部署](./deploy/hubserving/readme.md) - [端侧部署](./deploy/lite/readme.md) - - 模型量化压缩(coming soon) + - [模型量化](./deploy/slim/quantization/README.md) + - 
[Model pruning](./deploy/slim/prune/README_ch.md) - [Benchmark](./doc/doc_ch/benchmark.md) - Datasets - [General Chinese and English OCR datasets](./doc/doc_ch/datasets.md) @@ -101,13 +93,22 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库,助力 - [License](#许可证书) - [Contributing](#贡献代码) + +## PP-OCR Pipeline +
+ +
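+PP-OCR is a practical ultra-lightweight OCR system. It consists of three parts: DB text detection, detection box rectification, and CRNN text recognition. The system applies 19 effective strategies across 8 aspects (backbone selection and adjustment, prediction head design, data augmentation, learning rate schedule, regularization parameter selection, use of pretrained models, automatic model pruning, and quantization) to tune and slim the models of each module. The result is an ultra-lightweight Chinese and English OCR system with an overall size of 3.5M and an English and digit OCR system of 2M. For more details, see the PP-OCR technical article (arXiv link pending). + + ## 效果展示 [more](./doc/doc_ch/visualization.md)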
- + diff --git a/README_en.md b/README_en.md index c0f17b57710b68e8f33573f116a120607fa8847c..0d9a4f6cf5193864b1f1879db6d5b09f948814f3 100644 --- a/README_en.md +++ b/README_en.md @@ -64,7 +64,7 @@ Mobile DEMO experience (based on EasyEdge and Paddle-Lite, supports iOS and Andr - Deployment - [Python Inference](./doc/doc_en/inference_en.md) - [C++ Inference](./deploy/cpp_infer/readme_en.md) - - [Serving](./doc/doc_en/serving_en.md) + - [Serving](./deploy/hubserving/readme_en.md) - [Mobile](./deploy/lite/readme_en.md) - Model Quantization and Compression (coming soon) - [Benchmark](./doc/doc_en/benchmark_en.md) diff --git a/configs/det/det_mv3_db.yml b/configs/det/det_mv3_db.yml index 91a8e86f8bba440df83c1d9f7da0e6523d5907bb..5f67ca1db758069bb6d19276339895302604fd62 100755 --- a/configs/det/det_mv3_db.yml +++ b/configs/det/det_mv3_db.yml @@ -24,6 +24,7 @@ Backbone: function: ppocr.modeling.backbones.det_mobilenet_v3,MobileNetV3 scale: 0.5 model_name: large + disable_se: true Head: function: ppocr.modeling.heads.det_db_head,DBHead diff --git a/configs/det/det_mv3_db_v1.1.yml b/configs/det/det_mv3_db_v1.1.yml deleted file mode 100755 index 5f67ca1db758069bb6d19276339895302604fd62..0000000000000000000000000000000000000000 --- a/configs/det/det_mv3_db_v1.1.yml +++ /dev/null @@ -1,55 +0,0 @@ -Global: - algorithm: DB - use_gpu: true - epoch_num: 1200 - log_smooth_window: 20 - print_batch_step: 2 - save_model_dir: ./output/det_db/ - save_epoch_step: 200 - # evaluation is run every 5000 iterations after the 4000th iteration - eval_batch_step: [4000, 5000] - train_batch_size_per_card: 16 - test_batch_size_per_card: 16 - image_shape: [3, 640, 640] - reader_yml: ./configs/det/det_db_icdar15_reader.yml - pretrain_weights: ./pretrain_models/MobileNetV3_large_x0_5_pretrained/ - checkpoints: - save_res_path: ./output/det_db/predicts_db.txt - save_inference_dir: - -Architecture: - function: ppocr.modeling.architectures.det_model,DetModel - -Backbone: - function: 
ppocr.modeling.backbones.det_mobilenet_v3,MobileNetV3 - scale: 0.5 - model_name: large - disable_se: true - -Head: - function: ppocr.modeling.heads.det_db_head,DBHead - model_name: large - k: 50 - inner_channels: 96 - out_channels: 2 - -Loss: - function: ppocr.modeling.losses.det_db_loss,DBLoss - balance_loss: true - main_loss_type: DiceLoss - alpha: 5 - beta: 10 - ohem_ratio: 3 - -Optimizer: - function: ppocr.optimizer,AdamDecay - base_lr: 0.001 - beta1: 0.9 - beta2: 0.999 - -PostProcess: - function: ppocr.postprocess.db_postprocess,DBPostProcess - thresh: 0.3 - box_thresh: 0.6 - max_candidates: 1000 - unclip_ratio: 1.5 diff --git a/configs/rec/multi_languages/rec_en_lite_train.yml b/configs/rec/multi_languages/rec_en_lite_train.yml index 8b08d9f788415d63028480183a52630bbe245db0..128424b4d3a5631f8237f6cd596c901990ff2277 100644 --- a/configs/rec/multi_languages/rec_en_lite_train.yml +++ b/configs/rec/multi_languages/rec_en_lite_train.yml @@ -16,7 +16,7 @@ Global: loss_type: ctc distort: false use_space_char: false - reader_yml: ./configs/rec/rec_en_reader.yml + reader_yml: ./configs/rec/multi_languages/rec_en_reader.yml pretrain_weights: checkpoints: save_inference_dir: diff --git a/configs/rec/multi_languages/rec_french_lite_train.yml b/configs/rec/multi_languages/rec_french_lite_train.yml index 49d4d3df089a4749674b47f45a97a54abe95ffa2..2cf54c427eb6a7c64f4b54b021c44013a1dc1d6a 100755 --- a/configs/rec/multi_languages/rec_french_lite_train.yml +++ b/configs/rec/multi_languages/rec_french_lite_train.yml @@ -16,7 +16,7 @@ Global: loss_type: ctc distort: true use_space_char: false - reader_yml: ./configs/rec/rec_french_reader.yml + reader_yml: ./configs/rec/multi_languages/rec_french_reader.yml pretrain_weights: checkpoints: save_inference_dir: diff --git a/configs/rec/multi_languages/rec_ger_lite_train.yml b/configs/rec/multi_languages/rec_ger_lite_train.yml index 0ccadd0ad9a91c18b998a07a7c10184893327762..beb1755b105fea9cbade9f35ceac15d380651f37 100755 --- 
a/configs/rec/multi_languages/rec_ger_lite_train.yml +++ b/configs/rec/multi_languages/rec_ger_lite_train.yml @@ -16,7 +16,7 @@ Global: loss_type: ctc distort: true use_space_char: false - reader_yml: ./configs/rec/rec_ger_reader.yml + reader_yml: ./configs/rec/multi_languages/rec_ger_reader.yml pretrain_weights: checkpoints: save_inference_dir: diff --git a/configs/rec/multi_languages/rec_japan_lite_train.yml b/configs/rec/multi_languages/rec_japan_lite_train.yml index 2d3b388a5c9389fc27e23a5a295f9c4c70405a36..fbbab33eadd2901d9eac93f49e737e92d9441270 100755 --- a/configs/rec/multi_languages/rec_japan_lite_train.yml +++ b/configs/rec/multi_languages/rec_japan_lite_train.yml @@ -16,7 +16,7 @@ Global: loss_type: ctc distort: true use_space_char: false - reader_yml: ./configs/rec/rec_japan_reader.yml + reader_yml: ./configs/rec/multi_languages/rec_japan_reader.yml pretrain_weights: checkpoints: save_inference_dir: diff --git a/configs/rec/multi_languages/rec_korean_lite_train.yml b/configs/rec/multi_languages/rec_korean_lite_train.yml index ad55d821c63b987150a140a5924407a259d48c66..29cc08aaefb017c690551e030a57e85ebb21e2dd 100755 --- a/configs/rec/multi_languages/rec_korean_lite_train.yml +++ b/configs/rec/multi_languages/rec_korean_lite_train.yml @@ -16,7 +16,7 @@ Global: loss_type: ctc distort: true use_space_char: false - reader_yml: ./configs/rec/rec_korean_reader.yml + reader_yml: ./configs/rec/multi_languages/rec_korean_reader.yml pretrain_weights: checkpoints: save_inference_dir: diff --git a/deploy/cpp_infer/readme.md b/deploy/cpp_infer/readme.md index 0b2441097fbdd0c0ea3acb7ce5a696837645443f..571ed2eb2b071574aec3cabdff01b6c9d7f17440 100644 --- a/deploy/cpp_infer/readme.md +++ b/deploy/cpp_infer/readme.md @@ -193,6 +193,9 @@ make -j sh tools/run.sh ``` +* 若需要使用方向分类器,则需要将`tools/config.txt`中的`use_angle_cls`参数修改为1,表示开启方向分类器的预测。 + + 最终屏幕上会输出检测结果如下。
 diff --git a/deploy/cpp_infer/readme_en.md b/deploy/cpp_infer/readme_en.md index ecb29f9b9673446c86b2b561440b57d29ea457f4..a545b8606cda0b476b439543382d997065721892 100644 --- a/deploy/cpp_infer/readme_en.md +++ b/deploy/cpp_infer/readme_en.md @@ -162,7 +162,7 @@ inference/ sh tools/build.sh ``` -具体地,`tools/build.sh`中内容如下。 +Specifically, the contents of `tools/build.sh` are as follows. ```shell OPENCV_DIR=your_opencv_dir @@ -201,6 +201,8 @@ make -j sh tools/run.sh ``` +* To use the orientation classifier to correct the detected boxes, set `use_angle_cls` to 1 in `tools/config.txt`. + The detection results will be shown on the screen as follows.
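Returning to the `use_angle_cls` switch described in the hunk above, the following is a minimal sketch of flipping that flag programmatically. It assumes `tools/config.txt` holds one whitespace-separated `key value` pair per line with `#` comment lines, as the `tools/config.txt` hunk in this diff suggests. `set_config_value` is a hypothetical helper and not part of PaddleOCR.

```python
def set_config_value(text, key, value):
    """Return config text with `key` set to `value` (hypothetical helper).

    Assumes one whitespace-separated "key value" pair per line and
    comment lines starting with '#'. Comments, blank lines and lines
    for other keys are returned unchanged.
    """
    out = []
    found = False
    for line in text.splitlines():
        parts = line.split()
        if parts and not parts[0].startswith("#") and parts[0] == key:
            out.append(f"{key} {value}")
            found = True
        else:
            out.append(line)
    if not found:
        # Append the key if the file did not define it yet.
        out.append(f"{key} {value}")
    return "\n".join(out) + "\n"


# Sample text mirroring the "# cls config" section shown in this diff.
sample = (
    "# cls config\n"
    "use_angle_cls 0\n"
    "cls_model_dir ./inference/cls\n"
    "cls_thresh 0.9\n"
)
updated = set_config_value(sample, "use_angle_cls", 1)
print(updated)
```

Editing the file by hand works just as well; the sketch only shows which line changes.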
diff --git a/deploy/cpp_infer/tools/config.txt b/deploy/cpp_infer/tools/config.txt index d43a861d6ae2e35697d0bad0832f551e321a98af..7e03b9d13af9b65239dc257059ef0fa94106e880 100644 --- a/deploy/cpp_infer/tools/config.txt +++ b/deploy/cpp_infer/tools/config.txt @@ -15,7 +15,7 @@ det_model_dir ./inference/det_db # cls config use_angle_cls 0 -cls_model_dir ../inference/cls +cls_model_dir ./inference/cls cls_thresh 0.9 # rec config diff --git a/deploy/hubserving/ocr_det/params.py b/deploy/hubserving/ocr_det/params.py index e88ab45c7bb548ef971465d4aaefb30d247ab17f..f37993a10b85097b11e38bbb2efe25c649bec8d0 100644 --- a/deploy/hubserving/ocr_det/params.py +++ b/deploy/hubserving/ocr_det/params.py @@ -13,7 +13,7 @@ def read_params(): #params for text detector cfg.det_algorithm = "DB" - cfg.det_model_dir = "./inference/ch_det_mv3_db/" + cfg.det_model_dir = "./inference/ch_ppocr_mobile_v1.1_det_infer/" cfg.det_max_side_len = 960 #DB parmas diff --git a/deploy/hubserving/ocr_rec/params.py b/deploy/hubserving/ocr_rec/params.py index 59772e2163d1d5f8279dee85432b5bf93502914e..58a8bc119e2a54ad78446bd616eeb7a9089a6084 100644 --- a/deploy/hubserving/ocr_rec/params.py +++ b/deploy/hubserving/ocr_rec/params.py @@ -28,7 +28,7 @@ def read_params(): #params for text recognizer cfg.rec_algorithm = "CRNN" - cfg.rec_model_dir = "./inference/ch_rec_mv3_crnn/" + cfg.rec_model_dir = "./inference/ch_ppocr_mobile_v1.1_rec_infer/" cfg.rec_image_shape = "3, 32, 320" cfg.rec_char_type = 'ch' diff --git a/deploy/hubserving/ocr_system/params.py b/deploy/hubserving/ocr_system/params.py index 21e8cca4a0990ecb5963280100db1a0a3fb62151..d83fe692dca7c94c7225a1aa26e782765e665bdd 100644 --- a/deploy/hubserving/ocr_system/params.py +++ b/deploy/hubserving/ocr_system/params.py @@ -13,7 +13,7 @@ def read_params(): #params for text detector cfg.det_algorithm = "DB" - cfg.det_model_dir = "./inference/ch_det_mv3_db/" + cfg.det_model_dir = "./inference/ch_ppocr_mobile_v1.1_det_infer/" cfg.det_max_side_len = 960 #DB 
parmas @@ -28,7 +28,7 @@ def read_params(): #params for text recognizer cfg.rec_algorithm = "CRNN" - cfg.rec_model_dir = "./inference/ch_rec_mv3_crnn/" + cfg.rec_model_dir = "./inference/ch_ppocr_mobile_v1.1_rec_infer/" cfg.rec_image_shape = "3, 32, 320" cfg.rec_char_type = 'ch' diff --git a/doc/doc_ch/serving.md b/deploy/hubserving/readme.md similarity index 83% rename from doc/doc_ch/serving.md rename to deploy/hubserving/readme.md index 99fe3006fde8762930ef9a168da81cce9069f8e0..5d29b432ba3d4c098872431c9b5fde13f553eee0 100644 --- a/doc/doc_ch/serving.md +++ b/deploy/hubserving/readme.md @@ -1,10 +1,12 @@ -# 服务部署 +[English](readme_en.md) | 简体中文 PaddleOCR提供2种服务部署方式: -- 基于HubServing的部署:已集成到PaddleOCR中([code](https://github.com/PaddlePaddle/PaddleOCR/tree/develop/deploy/hubserving)),按照本教程使用; -- 基于PaddleServing的部署:详见PaddleServing官网[demo](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/ocr),后续也将集成到PaddleOCR。 +- 基于PaddleHub Serving的部署:代码路径为"`./deploy/hubserving`",按照本教程使用; +- 基于PaddleServing的部署:代码路径为"`./deploy/pdserving`",使用方法参考[文档](../pdserving/readme.md)。 -服务部署目录下包括检测、识别、2阶段串联三种服务包,根据需求选择相应的服务包进行安装和启动。目录如下: +# 基于PaddleHub Serving的服务部署 + +hubserving服务部署目录下包括检测、识别、2阶段串联三种服务包,请根据需求选择相应的服务包进行安装和启动。目录结构如下: ``` deploy/hubserving/ └─ ocr_det 检测模块服务包 @@ -30,11 +32,18 @@ pip3 install paddlehub --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple # 在Linux下设置环境变量 export PYTHONPATH=. -# 在Windows下设置环境变量 + +# 或者,在Windows下设置环境变量 SET PYTHONPATH=. ``` -### 2. 安装服务模块 +### 2. 下载推理模型 +安装服务模块前,需要准备推理模型并放到正确路径。默认使用的是v1.1版的超轻量模型,默认检测模型路径为: +`./inference/ch_ppocr_mobile_v1.1_det_infer/`,识别模型路径为:`./inference/ch_ppocr_mobile_v1.1_rec_infer/`。 + +**模型路径可在`params.py`中查看和修改。** 更多模型可以从PaddleOCR提供的[模型库](../../doc/doc_ch/models_list.md)下载,也可以替换成自己训练转换好的模型。 + +### 3. 
安装服务模块 PaddleOCR提供3种服务模块,根据需要安装所需模块。 * 在Linux环境下,安装示例如下: @@ -61,15 +70,7 @@ hub install deploy\hubserving\ocr_rec\ hub install deploy\hubserving\ocr_system\ ``` -#### 安装模型 -安装服务模块前,需要将训练好的模型放到对应的文件夹内。默认使用的是: -./inference/ch_det_mv3_db/ -和 -./inference/ch_rec_mv3_crnn/ -这两个模型可以在https://github.com/PaddlePaddle/PaddleOCR 下载 -可以在./deploy/hubserving/ocr_system/params.py 里面修改成自己的模型 - -### 3. 启动服务 +### 4. 启动服务 #### 方式1. 命令行命令启动(仅支持CPU) **启动命令:** ```shell @@ -172,7 +173,7 @@ hub serving start -c deploy/hubserving/ocr_system/config.json ```hub serving stop --port/-p XXXX``` - 2、 到相应的`module.py`和`params.py`等文件中根据实际需求修改代码。 -例如,如果需要替换部署服务所用模型,则需要到`params.py`中修改模型路径参数`det_model_dir`和`rec_model_dir`,当然,同时可能还需要修改其他相关参数,请根据实际情况修改调试。 建议修改后先直接运行`module.py`调试,能正确运行预测后再启动服务测试。 +例如,如果需要替换部署服务所用模型,则需要到`params.py`中修改模型路径参数`det_model_dir`和`rec_model_dir`,当然,同时可能还需要修改其他相关参数,请根据实际情况修改调试。 **强烈建议修改后先直接运行`module.py`调试,能正确运行预测后再启动服务测试。** - 3、 卸载旧服务包 ```hub uninstall ocr_system``` diff --git a/doc/doc_en/serving_en.md b/deploy/hubserving/readme_en.md similarity index 84% rename from doc/doc_en/serving_en.md rename to deploy/hubserving/readme_en.md index 7439cc84abb58f091febc3acda169816d34a836b..efef1cda6dd5a91d6ad2f7db27061418fa24e105 100644 --- a/doc/doc_en/serving_en.md +++ b/deploy/hubserving/readme_en.md @@ -1,10 +1,12 @@ -# Service deployment +English | [简体中文](readme.md) -PaddleOCR provides 2 service deployment methods:: -- Based on **HubServing**:Has been integrated into PaddleOCR ([code](https://github.com/PaddlePaddle/PaddleOCR/tree/develop/deploy/hubserving)). Please follow this tutorial. -- Based on **PaddleServing**:See PaddleServing official website for details ([demo](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/ocr)). Follow-up will also be integrated into PaddleOCR. +PaddleOCR provides 2 service deployment methods: +- Based on **PaddleHub Serving**: Code path is "`./deploy/hubserving`". Please follow this tutorial. 
+- Based on **PaddleServing**: Code path is "`./deploy/pdserving`". Please refer to the [tutorial](../pdserving/readme_en.md) for usage. -The service deployment directory includes three service packages: detection, recognition, and two-stage series connection. Select the corresponding service package to install and start service according to your needs. The directory is as follows: +# Service deployment based on PaddleHub Serving + +The hubserving service deployment directory includes three service packages: detection, recognition, and two-stage series connection. Please select the corresponding service package to install and start service according to your needs. The directory is as follows: ``` deploy/hubserving/ └─ ocr_det detection module service package @@ -31,11 +33,17 @@ pip3 install paddlehub --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple # Set environment variables on Linux export PYTHONPATH=. + # Set environment variables on Windows SET PYTHONPATH=. ``` -### 2. Install Service Module +### 2. Download inference model +Before installing the service module, you need to prepare the inference model and put it in the correct path. By default, the ultra lightweight model of v1.1 is used, and the default detection model path is: `./inference/ch_ppocr_mobile_v1.1_det_infer/`, the default recognition model path is: `./inference/ch_ppocr_mobile_v1.1_rec_infer/`. + +**The model path can be found and modified in `params.py`.** More models provided by PaddleOCR can be obtained from the [model library](../../doc/doc_en/models_list_en.md). You can also use models trained by yourself. + +### 3. Install Service Module PaddleOCR provides 3 kinds of service modules, install the required modules according to your needs. * On Linux platform, the examples are as follows. @@ -62,7 +70,7 @@ hub install deploy\hubserving\ocr_rec\ hub install deploy\hubserving\ocr_system\ ``` -### 3. Start service +### 4. Start service #### Way 1. 
Start with command line parameters (CPU only) **start command:** diff --git a/deploy/pdserving/readme.md b/deploy/pdserving/readme.md index a6a88c20517c6ca01db1004c9e634d1adeafaa3a..af12d508ba9c04e6032f2a392701e72b41462395 100644 --- a/deploy/pdserving/readme.md +++ b/deploy/pdserving/readme.md @@ -1,5 +1,10 @@ -# Paddle Serving 服务部署 +[English](readme_en.md) | 简体中文 + +PaddleOCR提供2种服务部署方式: +- 基于PaddleHub Serving的部署:代码路径为"`./deploy/hubserving`",使用方法参考[文档](../hubserving/readme.md)。 +- 基于PaddleServing的部署:代码路径为"`./deploy/pdserving`",按照本教程使用。 +# Paddle Serving 服务部署 本教程将介绍基于[Paddle Serving](https://github.com/PaddlePaddle/Serving)部署PaddleOCR在线预测服务的详细步骤。 ## 快速启动服务 diff --git a/deploy/pdserving/readme_en.md b/deploy/pdserving/readme_en.md new file mode 100644 index 0000000000000000000000000000000000000000..9a0c684fb6fb4f0eeff2552af70f62053d3351fb --- /dev/null +++ b/deploy/pdserving/readme_en.md @@ -0,0 +1,123 @@ +English | [简体中文](readme.md) + +PaddleOCR provides 2 service deployment methods: +- Based on **PaddleHub Serving**: Code path is "`./deploy/hubserving`". Please refer to the [tutorial](../hubserving/readme_en.md) for usage. +- Based on **PaddleServing**: Code path is "`./deploy/pdserving`". Please follow this tutorial. + +# Service deployment based on Paddle Serving + +This tutorial will introduce the detail steps of deploying PaddleOCR online prediction service based on [Paddle Serving](https://github.com/PaddlePaddle/Serving). + +## Quick start service + +### 1. Prepare the environment +Let's first install the relevant components of Paddle Serving. GPU is recommended for service deployment with Paddle Serving. 
+ +**Requirements:** +- **CUDA version: 9.0** +- **CUDNN version: 7.0** +- **Operating system version: >= CentOS 6** +- **Python version: 2.7/3.6/3.7** + +**Installation:** +``` +# install GPU server +python -m pip install paddle_serving_server_gpu + +# or, install CPU server +python -m pip install paddle_serving_server + +# install client and App package (CPU/GPU) +python -m pip install paddle_serving_app paddle_serving_client +``` + +### 2. Model transformation +For convenience, you can directly use the converted models provided by `paddle_serving_app`. Run the following commands to obtain them: +``` +python -m paddle_serving_app.package --get_model ocr_rec +tar -xzvf ocr_rec.tar.gz +python -m paddle_serving_app.package --get_model ocr_det +tar -xzvf ocr_det.tar.gz +``` +The commands above download the `db_crnn_mobile` model, which is in a different format from the inference model. To deploy other models, refer to the [tutorial](https://github.com/PaddlePaddle/Serving/blob/develop/doc/INFERENCE_TO_SERVING_CN.md) to convert your inference model into one that Paddle Serving can deploy. + +Take the `ch_rec_r34_vd_crnn` model as an example. Download the inference model by executing the following command: +``` +wget --no-check-certificate https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_infer.tar +tar xf ch_rec_r34_vd_crnn_infer.tar +``` + +Convert the downloaded model by executing the following Python script: +``` +from paddle_serving_client.io import inference_model_to_serving +inference_model_dir = "ch_rec_r34_vd_crnn" +serving_client_dir = "serving_client_dir" +serving_server_dir = "serving_server_dir" +feed_var_names, fetch_var_names = inference_model_to_serving( + inference_model_dir, serving_client_dir, serving_server_dir, model_filename="model", params_filename="params") +``` + +Finally, the client and server model configurations are generated in `serving_client_dir` and `serving_server_dir`. + +### 3.
Start service +Start the standard version or the fast version service according to your actual needs. The two versions are compared in the table below: + +|version|characteristics|recommended scenarios| +|-|-|-| +|standard version|High stability, suitable for distributed deployment|Large throughput and cross-regional deployment| +|fast version|Easy to deploy and fast to predict|Scenarios that require high prediction speed and fast iteration| + +#### Mode 1. Start the standard mode service + +``` +# start with CPU +python -m paddle_serving_server.serve --model ocr_det_model --port 9293 +python ocr_web_server.py cpu + +# or, with GPU +python -m paddle_serving_server_gpu.serve --model ocr_det_model --port 9293 --gpu_id 0 +python ocr_web_server.py gpu +``` + +#### Mode 2. Start the fast mode service + +``` +# start with CPU +python ocr_local_server.py cpu + +# or, with GPU +python ocr_local_server.py gpu +``` + +## Send prediction requests + +``` +python ocr_web_client.py +``` + +## Returned result format + +The returned result is a JSON string, e.g. +``` +{u'result': {u'res': [u'\u571f\u5730\u6574\u6cbb\u4e0e\u571f\u58e4\u4fee\u590d\u7814\u7a76\u4e2d\u5fc3', u'\u534e\u5357\u519c\u4e1a\u5927\u5b661\u7d20\u56fe']}} +``` + +You can also print the readable result in `res`: +``` +土地整治与土壤修复研究中心 +华南农业大学1素图 +``` + +## User defined service module modification + +The pre-processing and post-processing logic can be found in the `preprocess` and `postprocess` functions in `ocr_web_server.py` or `ocr_local_server.py`. These functions call the pre-processing/post-processing library for common CV models provided by `paddle_serving_app`. +You can modify the corresponding code as needed. + +If you only want to start the detection service or the recognition service, execute the corresponding script from the following table. Specify whether CPU or GPU is used in the start command parameters.
+ +| task | standard | fast | +| ---- | ----------------- | ------------------- | +| detection | det_web_server.py | det_local_server.py | +| recognition | rec_web_server.py | rec_local_server.py | + +More info can be found in [Paddle Serving](https://github.com/PaddlePaddle/Serving). diff --git a/deploy/slim/prune/README_ch.md b/deploy/slim/prune/README_ch.md index 3f55197766188a4d2ec1d066ec2a2f722d892839..fbd9921da91a61e796e1f35c5dfce6531e83bd45 100644 --- a/deploy/slim/prune/README_ch.md +++ b/deploy/slim/prune/README_ch.md @@ -128,7 +128,7 @@ ## 安装PaddleSlim -\```bash +```bash git clone https://github.com/PaddlePaddle/PaddleSlim.git @@ -136,7 +136,7 @@ cd Paddleslim python setup.py install -\``` +``` ## 获取预训练模型 @@ -148,22 +148,22 @@ python setup.py install 进入PaddleOCR根目录,通过以下命令对模型进行敏感度分析: -\```bash +```bash python deploy/slim/prune/sensitivity_anal.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./deploy/slim/prune/pretrain_models/det_mv3_db/best_accuracy Global.test_batch_size_per_card=1 -\``` +``` ## 裁剪模型与fine-tune 裁剪时通过之前的敏感度分析文件决定每个网络层的裁剪比例。在具体实现时,为了尽可能多的保留从图像中提取的低阶特征,我们跳过了backbone中靠近输入的4个卷积层。同样,为了减少由于裁剪导致的模型性能损失,我们通过之前敏感度分析所获得的敏感度表,挑选出了一些冗余较少,对裁剪较为敏感的[网络层](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/deploy/slim/prune/pruning_and_finetune.py#L41),并在之后的裁剪过程中选择避开这些网络层。裁剪过后finetune的过程沿用OCR检测模型原始的训练策略。 -\```bash +```bash python deploy/slim/prune/pruning_and_finetune.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./deploy/slim/prune/pretrain_models/det_mv3_db/best_accuracy Global.test_batch_size_per_card=1 -\``` +``` @@ -173,8 +173,8 @@ python deploy/slim/prune/pruning_and_finetune.py -c configs/det/det_mv3_db.yml - 在得到裁剪训练保存的模型后,我们可以将其导出为inference_model,用于预测部署: -\```bash +```bash python deploy/slim/prune/export_prune_model.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./output/det_db/best_accuracy Global.test_batch_size_per_card=1 Global.save_inference_dir=inference_model -\``` +``` diff --git 
a/deploy/slim/prune/README_en.md b/deploy/slim/prune/README_en.md index d345e24c71fa5a1a017362656502b44e1a082688..d854c10707bc26d0273a26c335eef68e8633e74b 100644 --- a/deploy/slim/prune/README_en.md +++ b/deploy/slim/prune/README_en.md @@ -128,7 +128,7 @@ It is recommended that you could understand following pages before reading this ## Install PaddleSlim -\```bash +```bash git clone https://github.com/PaddlePaddle/PaddleSlim.git @@ -136,7 +136,7 @@ cd Paddleslim python setup.py install -\``` +``` ## Download Pretrain Model @@ -150,11 +150,11 @@ python setup.py install Enter the PaddleOCR root directory,perform sensitivity analysis on the model with the following command: -\```bash +```bash python deploy/slim/prune/sensitivity_anal.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./deploy/slim/prune/pretrain_models/det_mv3_db/best_accuracy Global.test_batch_size_per_card=1 -\``` +``` @@ -162,11 +162,11 @@ python deploy/slim/prune/sensitivity_anal.py -c configs/det/det_mv3_db.yml -o Gl When pruning, the previous sensitivity analysis file would determines the pruning ratio of each network layer. In the specific implementation, in order to retain as many low-level features extracted from the image as possible, we skipped the 4 convolutional layers close to the input in the backbone. Similarly, in order to reduce the model performance loss caused by pruning, we selected some of the less redundant and more sensitive [network layer](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/deploy/slim/prune/pruning_and_finetune.py#L41) through the sensitivity table obtained from the previous sensitivity analysis.And choose to skip these network layers in the subsequent pruning process. After pruning, the model need a finetune process to recover the performance and the training strategy of finetune is similar to the strategy of training original OCR detection model. 
-\```bash +```bash python deploy/slim/prune/pruning_and_finetune.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./deploy/slim/prune/pretrain_models/det_mv3_db/best_accuracy Global.test_batch_size_per_card=1 -\``` +``` @@ -176,8 +176,8 @@ python deploy/slim/prune/pruning_and_finetune.py -c configs/det/det_mv3_db.yml - After getting the model after pruning and finetuning we, can export it as inference_model for predictive deployment: -\```bash +```bash python deploy/slim/prune/export_prune_model.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./output/det_db/best_accuracy Global.test_batch_size_per_card=1 Global.save_inference_dir=inference_model -\``` +``` diff --git a/deploy/slim/quantization/README.md b/deploy/slim/quantization/README.md index f7d87c83602f69ada46b35e7d63260fe8bc6e055..d1aa3d71e5254cf6b5b2be7fdf6943903d42fafd 100755 --- a/deploy/slim/quantization/README.md +++ b/deploy/slim/quantization/README.md @@ -1,21 +1,148 @@ > 运行示例前请先安装1.2.0或更高版本PaddleSlim + # 模型量化压缩教程 +压缩结果: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
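+| ID | Task | Model | Compress Strategy | Accuracy (self-built Chinese dataset) | Inference Time (ms) | Total Inference Time (ms) | Acceleration Ratio | Total Model Size (M) | Compress Ratio | Download Link |
+|-|-|-|-|-|-|-|-|-|-|-|
+| 0 | Detection | MobileNetV3_DB | | 61.7 | 224 | 375 | - | 8.6 | - | |
+| | Recognition | MobileNetV3_CRNN | | 62.0 | 9.52 | | | | | |
+| 1 | Detection | SlimTextDet | PACT quantization training | 62.1 | 195 | 348 | 8% | 2.8 | 67.82% | |
+| | Recognition | SlimTextRec | PACT quantization training | 61.48 | 8.6 | | | | | |
+| 2 | Detection | SlimTextDet_quat_pruning | Pruning + PACT quantization training | 60.86 | 142 | 288 | 30% | 2.8 | 67.82% | |
+| | Recognition | SlimTextRec | PACT quantization training | 61.48 | 8.6 | | | | | |
+| 3 | Detection | SlimTextDet_pruning | Pruning | 61.57 | 138 | 295 | 27% | 2.9 | 66.28% | |
+| | Recognition | SlimTextRec | PACT quantization training | 61.48 | 8.6 | | | | | |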
+
+## Overview
+A complex model helps improve performance but also introduces some redundancy. Model quantization reduces this redundancy by converting full-precision values to fixed-point numbers, which lowers the computational complexity of the model and improves inference performance.
+
 This example uses the [quantization compression API](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/) provided by PaddleSlim to compress the OCR model.
 Before reading this example, we recommend that you become familiar with the following:
 - [General training method for OCR models](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/detection.md)
-- [PaddleSlim documentation](https://paddlepaddle.github.io/PaddleSlim/)
+- [PaddleSlim documentation](https://paddleslim.readthedocs.io/zh_CN/latest/index.html)
+
+
 ## Install PaddleSlim
-Install PaddleSlim by following the steps in the [PaddleSlim documentation](https://paddlepaddle.github.io/PaddleSlim/).
+```bash
+git clone https://github.com/PaddlePaddle/PaddleSlim.git
+
+cd PaddleSlim
+
+python setup.py install
+```
+
+
+
+## Get the pretrained models
+
+[Download link for the recognition pretrained model]()
+
+[Download link for the detection pretrained model]()
 ## Quantization training
+After loading the pretrained model and defining the quantization strategy, the model can be quantized. For details on using the quantization features, see [Model Quantization](https://paddleslim.readthedocs.io/zh_CN/latest/api_cn/quantization_api.html)
 Enter the PaddleOCR root directory and quantize the model with the following command:
@@ -25,10 +152,11 @@ python deploy/slim/quantization/quant.py -c configs/det/det_mv3_db.yml -o Global
+
 ## Export the model
 After obtaining the model saved by quantization training, we can export it as an inference_model for deployment:
 ```bash
-python deploy/slim/quantization/export_model.py -c configs/det/det_mv3_db.yml -o Global.checkpoints=output/quant_model/best_accuracy Global.save_model_dir=./output/quant_model
+python deploy/slim/quantization/export_model.py -c configs/det/det_mv3_db.yml -o Global.checkpoints=output/quant_model/best_accuracy Global.save_model_dir=./output/quant_inference_model
 ```
diff --git a/deploy/slim/quantization/README_en.md b/deploy/slim/quantization/README_en.md
new file mode 100755
index 0000000000000000000000000000000000000000..4b8a2b23a254b143cd230c81a7e433d251e10ff2
--- /dev/null
+++ b/deploy/slim/quantization/README_en.md
@@ -0,0 +1,167 @@
+\> PaddleSlim 1.2.0 or higher should be installed before running this example.
+
+# Model compression tutorial (Quantization)
+
+Compression results:
+
+| ID | Task | Model | Compress Strategy | Criterion (Chinese dataset) | Inference Time (ms) | Inference Time, total model (ms) | Acceleration Ratio | Model Size (MB) | Compress Ratio | Download Link |
+|-|-|-|-|-|-|-|-|-|-|-|
+| 0 | Detection | MobileNetV3_DB | None | 61.7 | 224 | 375 | - | 8.6 | - | |
+| | Recognition | MobileNetV3_CRNN | None | 62.0 | 9.52 | | | | | |
+| 1 | Detection | SlimTextDet | PACT Quant Aware Training | 62.1 | 195 | 348 | 8% | 2.8 | 67.82% | |
+| | Recognition | SlimTextRec | PACT Quant Aware Training | 61.48 | 8.6 | | | | | |
+| 2 | Detection | SlimTextDet_quat_pruning | Pruning + PACT Quant Aware Training | 60.86 | 142 | 288 | 30% | 2.8 | 67.82% | |
+| | Recognition | SlimTextRec | PACT Quant Aware Training | 61.48 | 8.6 | | | | | |
+| 3 | Detection | SlimTextDet_pruning | Pruning | 61.57 | 138 | 295 | 27% | 2.9 | 66.28% | |
+| | Recognition | SlimTextRec | PACT Quant Aware Training | 61.48 | 8.6 | | | | | |
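The ratio columns of the compression table can be sanity-checked with a little arithmetic. The sketch below is not part of the PR; it assumes the acceleration ratio means `baseline_time / new_time - 1` and the compress ratio means `1 - new_size / baseline_size`, both measured against row 0. The function names and these definitions are assumptions, since the README does not state the formulas.

```python
# Baseline (ID 0) values copied from the compression table.
BASELINE_TOTAL_MS = 375.0   # total inference time of the uncompressed pipeline
BASELINE_SIZE_MB = 8.6      # total size of the uncompressed models


def acceleration_ratio(total_ms, baseline_ms=BASELINE_TOTAL_MS):
    """Assumed definition: relative speedup, baseline_time / new_time - 1."""
    return baseline_ms / total_ms - 1.0


def compress_ratio(size_mb, baseline_mb=BASELINE_SIZE_MB):
    """Assumed definition: fraction of the baseline size that was removed."""
    return 1.0 - size_mb / baseline_mb


# (total inference time in ms, total model size in MB) for rows 1-3.
rows = {1: (348.0, 2.8), 2: (288.0, 2.8), 3: (295.0, 2.9)}

for row_id, (total_ms, size_mb) in rows.items():
    print(f"ID {row_id}: acceleration {acceleration_ratio(total_ms):.1%}, "
          f"compression {compress_ratio(size_mb):.2%}")
```

Under these assumed definitions the acceleration values round to the listed 8%, 30% and 27%, and row 3 reproduces 66.28%. Rows 1 and 2 come out near 67.4% rather than the listed 67.82%, possibly because the table rounds the model size to one decimal place.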
+
+## Overview
+
+Generally, a more complex model achieves better performance on the task, but it also introduces some redundancy in the model. Quantization is a technique that reduces this redundancy by converting full-precision data to fixed-point numbers, so as to reduce the computational complexity of the model and improve inference performance.
+
+This example uses the [quantization APIs](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/) provided by PaddleSlim to compress the OCR model.
+
+It is recommended that you read the following pages before this example:
+
+
+
+- [The training strategy of OCR model](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/detection.md)
+
+- [PaddleSlim Document](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/)
+
+
+
+## Install PaddleSlim
+
+```bash
+git clone https://github.com/PaddlePaddle/PaddleSlim.git
+
+cd PaddleSlim
+
+python setup.py install
+
+```
+
+
+## Download Pretrain Model
+
+[Download link of detection pretrain model]()
+
+[Download link of recognition pretrain model]()
+
+
+## Quant-Aware Training
+
+After loading the pretrained model and defining the quantization strategy, the model can be quantized.
+For details of the quantization method, see [Model Quantization](https://paddleslim.readthedocs.io/zh_CN/latest/api_cn/quantization_api.html).
+
+Enter the PaddleOCR root directory and quantize the model with the following command:
+
+```bash
+python deploy/slim/quantization/quant.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./deploy/slim/prune/pretrain_models/det_mv3_db/best_accuracy Global.test_batch_size_per_card=1
+```
+
+
+
+## Export inference model
+
+After obtaining the model saved by quantization-aware training, we can export it as inference_model for predictive deployment:
+
+```bash
+python deploy/slim/quantization/export_model.py -c configs/det/det_mv3_db.yml -o Global.checkpoints=output/quant_model/best_accuracy Global.save_model_dir=./output/quant_inference_model
+```
diff --git a/doc/doc_ch/detection.md b/doc/doc_ch/detection.md
index 84ffeb5d7f1008bfdb1eef269f050fbf4e6fb72e..c2b62edbee7ae855cd32b03cc0019027fb05f669 100644
--- a/doc/doc_ch/detection.md
+++ b/doc/doc_ch/detection.md
@@ -14,6 +14,15 @@ wget -P ./train_data/ https://paddleocr.bj.bcebos.com/dataset/train_icdar2015_l
 wget -P ./train_data/ https://paddleocr.bj.bcebos.com/dataset/test_icdar2015_label.txt
 ```
+PaddleOCR also provides a data format conversion script that converts the official labels into the supported data format. The conversion tool is `train_data/gen_label.py`; the training set is used as the example here:
+
+```
+# Convert the label files downloaded from the official website into train_icdar2015_label.txt
+python gen_label.py --mode="det" --root_path="icdar_c4_train_imgs/" \
+                    --input_path="ch4_training_localization_transcription_gt" \
+                    --output_label="train_icdar2015_label.txt"
+```
+
 After extracting the dataset and downloading the annotation files, PaddleOCR/train_data/ contains two folders and two files:
 ```
 /PaddleOCR/train_data/icdar2015/text_localization/
diff --git a/doc/doc_ch/inference.md b/doc/doc_ch/inference.md
index 431cdb5a4a9f24cf5862c159d51be2a07e9d4047..709a07515c316cdfd60b74f0b090d4baeeb290a7 100644
--- a/doc/doc_ch/inference.md
+++ b/doc/doc_ch/inference.md
@@ -24,6 +24,7 @@ inference model (the model saved by `fluid.io.save_inference_model`)
 - [2. Inference for CTC-based recognition models](#基于CTC损失的识别模型推理)
 - [3. 
基于Attention损失的识别模型推理](#基于Attention损失的识别模型推理) - [4. 自定义文本识别字典的推理](#自定义文本识别字典的推理) + - [5. 多语言模型的推理](#多语言模型的推理) - [四、方向分类模型推理](#方向识别模型推理) - [1. 方向分类模型推理](#方向分类模型推理) @@ -305,6 +306,22 @@ dict_character = list(self.character_str) python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words_en/word_336.png" --rec_model_dir="./your inference model" --rec_image_shape="3, 32, 100" --rec_char_type="en" --rec_char_dict_path="your text dict path" ``` + +### 5. 多语言模型的推理 +如果您需要预测的是其他语言模型,在使用inference模型预测时,需要通过`--rec_char_dict_path`指定使用的字典路径, 同时为了得到正确的可视化结果, +需要通过 `--vis_font_path` 指定可视化的字体路径,`doc/` 路径下有默认提供的小语种字体,例如韩文识别: + +``` +python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words/korean/1.jpg" --rec_model_dir="./your inference model" --rec_char_type="korean" --rec_char_dict_path="ppocr/utils/korean_dict.txt" --vis_font_path="doc/korean.ttf" +``` +![](../imgs_words/korean/1.jpg) + +执行命令后,上图的预测结果为: +``` text +2020-09-19 16:15:05,076-INFO: index: [205 206 38 39] +2020-09-19 16:15:05,077-INFO: word : 바탕으로 +2020-09-19 16:15:05,077-INFO: score: 0.9171358942985535 +``` ## 四、方向分类模型推理 diff --git a/doc/doc_ch/models_list.md b/doc/doc_ch/models_list.md index 497140592ea4f4cbfe2000146b6903844f3f9872..ab47db21ef7e31c53d018a9741c08f24eaf83ca2 100644 --- a/doc/doc_ch/models_list.md +++ b/doc/doc_ch/models_list.md @@ -7,22 +7,22 @@ - [3. 
多语言识别模型](#多语言识别模型) - [三、文本方向分类模型](#文本方向分类模型) -PaddleOCR提供的可下载模型包括`预测模型`、`训练模型`、`预训练模型`、`slim模型`,模型区别说明如下: +PaddleOCR提供的可下载模型包括`推理模型`、`训练模型`、`预训练模型`、`slim模型`,模型区别说明如下: |模型类型|模型格式|简介| |-|-|-| -|预测模型|model、params|用于python预测引擎推理,[详情](./inference.md)| +|推理模型|model、params|用于python预测引擎推理,[详情](./inference.md)| |训练模型、预训练模型|\*.pdmodel、\*.pdopt、\*.pdparams|训练过程中保存的checkpoints模型,保存的是模型的参数,多用于模型指标评估和恢复训练| |slim模型|-|用于lite部署| ### 一、文本检测模型 -|模型名称|模型简介|预测模型大小|下载地址| +|模型名称|模型简介|推理模型大小|下载地址| |-|-|-|-| -|ch_ppocr_mobile_slim_v1.1_det|slim裁剪版超轻量模型,支持中英文、多语种文本检测|-|[预测模型]() / [训练模型]() / [slim模型]()| -|ch_ppocr_mobile_v1.1_det|原始超轻量模型,支持中英文、多语种文本检测|2.6M|[预测模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_train.tar)| -|ch_ppocr_server_v1.1_det|通用模型,支持中英文、多语种文本检测,比超轻量模型更大,但效果更好|47.2M|[预测模型](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_train.tar)| +|ch_ppocr_mobile_slim_v1.1_det|slim裁剪版超轻量模型,支持中英文、多语种文本检测|1.4M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/det/ch_ppocr_mobile_v1.1_det_prune_infer.tar) / [slim模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/det/ch_ppocr_mobile_v1.1_det_prune_opt.nb)| +|ch_ppocr_mobile_v1.1_det|原始超轻量模型,支持中英文、多语种文本检测|2.6M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_train.tar)| +|ch_ppocr_server_v1.1_det|通用模型,支持中英文、多语种文本检测,比超轻量模型更大,但效果更好|47.2M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_train.tar)| @@ -30,41 +30,42 @@ PaddleOCR提供的可下载模型包括`预测模型`、`训练模型`、`预训 #### 1. 
中文识别模型 -|模型名称|模型简介|预测模型大小|下载地址| +|模型名称|模型简介|推理模型大小|下载地址| |-|-|-|-| -|ch_ppocr_mobile_slim_v1.1_rec|slim裁剪量化版超轻量模型,支持中英文、数字识别|-|[预测模型]() / [训练模型]() / [slim模型]()| -|ch_ppocr_mobile_v1.1_rec|原始超轻量模型,支持中英文、数字识别|4.6M|[预测模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_train.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_pre.tar)| -|ch_ppocr_server_v1.1_rec|通用模型,支持中英文、数字识别|105M|[预测模型](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_train.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_pre.tar)| +|ch_ppocr_mobile_slim_v1.1_rec|slim裁剪量化版超轻量模型,支持中英文、数字识别|1.6M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/rec/ch_ppocr_mobile_v1.1_rec_quant_infer.tar) / [slim模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/rec/ch_ppocr_mobile_v1.1_rec_quant_opt.nb)| +|ch_ppocr_mobile_v1.1_rec|原始超轻量模型,支持中英文、数字识别|4.6M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_train.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_pre.tar)| +|ch_ppocr_server_v1.1_rec|通用模型,支持中英文、数字识别|105M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_train.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_pre.tar)| **说明:** `训练模型`是基于预训练模型在真实数据与竖排合成文本数据上finetune得到的模型,在真实应用场景中有着更好的表现,`预训练模型`则是直接基于全量真实数据与合成数据训练得到,更适合用于在自己的数据集上finetune。 #### 2. 
英文识别模型 -|模型名称|模型简介|预测模型大小|下载地址| +|模型名称|模型简介|推理模型大小|下载地址| |-|-|-|-| -|en_ppocr_mobile_slim_v1.1_rec|slim裁剪量化版超轻量模型,支持英文、数字识别|-|[预测模型]() / [训练模型]() / [slim模型]()| -|en_ppocr_mobile_v1.1_rec|原始超轻量模型,支持英文、数字识别|2.0M|[预测模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/en/en_ppocr_mobile_v1.1_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/en/en_ppocr_mobile_v1.1_rec_train.tar)| +|en_ppocr_mobile_slim_v1.1_rec|slim裁剪量化版超轻量模型,支持英文、数字识别|0.9M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/en/en_ppocr_mobile_v1.1_rec_quant_infer.tar) / [slim模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/en/en_ppocr_mobile_v1.1_rec_quant_opt.nb)| +|en_ppocr_mobile_v1.1_rec|原始超轻量模型,支持英文、数字识别|2.0M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/en/en_ppocr_mobile_v1.1_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/en/en_ppocr_mobile_v1.1_rec_train.tar)| #### 3. 多语言识别模型(更多语言持续更新中...) -|模型名称|模型简介|预测模型大小|下载地址| +|模型名称|模型简介|推理模型大小|下载地址| |-|-|-|-| -|-|法文识别|-|[预测模型]() / [训练模型]()| -|-|德文识别|-|[预测模型]() / [训练模型]()| -|-|韩文识别|-|[预测模型]() / [训练模型]()| -|-|日文识别|-|[预测模型]() / [训练模型]()| +| french_ppocr_mobile_v1.1_rec |法文识别|2.1M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/fr/french_ppocr_mobile_v1.1_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/fr/french_ppocr_mobile_v1.1_rec_train.tar)| +| german_ppocr_mobile_v1.1_rec |德文识别|2.1M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/ge/german_ppocr_mobile_v1.1_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/ge/german_ppocr_mobile_v1.1_rec_train.tar)| +| korean_ppocr_mobile_v1.1_rec |韩文识别|3.4M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/kr/korean_ppocr_mobile_v1.1_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/kr/korean_ppocr_mobile_v1.1_rec_train.tar)| +| japan_ppocr_mobile_v1.1_rec |日文识别|3.7M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/jp/japan_ppocr_mobile_v1.1_rec_infer.tar) 
/ [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/jp/japan_ppocr_mobile_v1.1_rec_train.tar)| + ### 三、文本方向分类模型 -|模型名称|模型简介|预测模型大小|下载地址| +|模型名称|模型简介|推理模型大小|下载地址| |-|-|-|-| -|ch_ppocr_mobile_v1.1_cls_quant|slim量化版模型|-|[预测模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_quant_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_quant_train.tar) / [slim模型]()| -|ch_ppocr_mobile_v1.1_cls|原始模型|850kb|[预测模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_train.tar)| +|ch_ppocr_mobile_v1.1_cls_quant|slim量化版模型|0.5M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_quant_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_quant_train.tar) / [slim模型]()| +|ch_ppocr_mobile_v1.1_cls|原始模型|850kb|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_train.tar)| ## OCR模型列表(V1.0,7月16日更新) |模型名称|模型简介|检测模型地址|识别模型地址|支持空格的识别模型地址| |-|-|-|-|-| -|chinese_db_crnn_mobile|8.6M超轻量级中文OCR模型|[inference模型](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db.tar)|[inference模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn.tar)|[inference模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance.tar) -|chinese_db_crnn_server|通用中文OCR模型|[inference模型](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db.tar)|[inference模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_infer.tar) / 
[预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn.tar)|[inference模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance.tar) +|chinese_db_crnn_mobile|8.6M超轻量级中文OCR模型|[推理模型](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db.tar)|[推理模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn.tar)|[推理模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance.tar) +|chinese_db_crnn_server|通用中文OCR模型|[推理模型](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db.tar)|[推理模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn.tar)|[推理模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance.tar) diff --git a/doc/doc_ch/recognition.md b/doc/doc_ch/recognition.md index 1920be56d1a05bb2f7ade944fd225e690fb484a4..c8955f7fe1c7022cf68155be330fad307c68fe43 100644 --- a/doc/doc_ch/recognition.md +++ b/doc/doc_ch/recognition.md @@ -44,6 +44,13 @@ wget -P ./train_data/ic15_data https://paddleocr.bj.bcebos.com/dataset/rec_gt_t wget -P ./train_data/ic15_data https://paddleocr.bj.bcebos.com/dataset/rec_gt_test.txt ``` +PaddleOCR 也提供了数据格式转换脚本,可以将官网 label 转换支持的数据格式。 数据转换工具在 `train_data/gen_label.py`, 这里以训练集为例: + +``` +# 将官网下载的标签文件转换为 rec_gt_label.txt +python gen_label.py --mode="rec" --input_path="{path/of/origin/label}" --output_label="rec_gt_label.txt" +``` + 最终训练集应有如下文件结构: ``` |-train_data @@ -201,7 +208,19 @@ Optimizer: ``` 
**注意,预测/评估时的配置文件请务必与训练一致。** +- 小语种 + +PaddleOCR也提供了多语言的, `configs/rec/multi_languages` 路径下的提供了多语言的配置文件,目前PaddleOCR支持的多语言算法有: + +| 配置文件 | 算法名称 | backbone | trans | seq | pred | language | +| :--------: | :-------: | :-------: | :-------: | :-----: | :-----: | :-----: | +| rec_en_lite_train.yml | CRNN | Mobilenet_v3 small 0.5 | None | BiLSTM | ctc | 英语 | +| rec_french_lite_train.yml | CRNN | Mobilenet_v3 small 0.5 | None | BiLSTM | ctc | 法语 | +| rec_ger_lite_train.yml | CRNN | Mobilenet_v3 small 0.5 | None | BiLSTM | ctc | 德语 | +| rec_japan_lite_train.yml | CRNN | Mobilenet_v3 small 0.5 | None | BiLSTM | ctc | 日语 | +| rec_korean_lite_train.yml | CRNN | Mobilenet_v3 small 0.5 | None | BiLSTM | ctc | 韩语 | +多语言模型训练方式与中文模型一致,训练数据集均为100w的合成数据,少量的字体和测试数据可以在[百度网盘]()上下载。 ### 评估 diff --git a/doc/doc_ch/tree.md b/doc/doc_ch/tree.md new file mode 100644 index 0000000000000000000000000000000000000000..f730d8f01fae467f49a03d68d931eb4fda526626 --- /dev/null +++ b/doc/doc_ch/tree.md @@ -0,0 +1,208 @@ +# 整体目录结构 + +PaddleOCR 的整体目录结构介绍如下: + +``` +PaddleOCR +├── configs // 配置文件,可通过yml文件选择模型结构并修改超参 +│ ├── cls // 方向分类器相关配置文件 +│ │ ├── cls_mv3.yml // 训练配置相关,包括骨干网络、head、loss、优化器 +│ │ └── cls_reader.yml // 数据读取相关,数据读取方式、数据存储路径 +│ ├── det // 检测相关配置文件 +│ │ ├── det_db_icdar15_reader.yml // 数据读取 +│ │ ├── det_mv3_db.yml // 训练配置 +│ │ ... +│ └── rec // 识别相关配置文件 +│ ├── rec_benchmark_reader.yml // LMDB 格式数据读取相关 +│ ├── rec_chinese_common_train.yml // 通用中文训练配置 +│ ├── rec_icdar15_reader.yml // simple 数据读取相关,包括数据读取函数、数据路径、标签文件 +│ ... +├── deploy // 部署相关 +│ ├── android_demo // android_demo +│ │ ... 
+│ ├── cpp_infer // C++ infer +│ │ ├── CMakeLists.txt // Cmake 文件 +│ │ ├── docs // 说明文档 +│ │ │ └── windows_vs2019_build.md +│ │ ├── include // 头文件 +│ │ │ ├── clipper.h // clipper 库 +│ │ │ ├── config.h // 预测配置 +│ │ │ ├── ocr_cls.h // 方向分类器 +│ │ │ ├── ocr_det.h // 文字检测 +│ │ │ ├── ocr_rec.h // 文字识别 +│ │ │ ├── postprocess_op.h // 检测后处理 +│ │ │ ├── preprocess_op.h // 检测预处理 +│ │ │ └── utility.h // 工具 +│ │ ├── readme.md // 说明文档 +│ │ ├── ... +│ │ ├── src // 源文件 +│ │ │ ├── clipper.cpp +│ │ │ ├── config.cpp +│ │ │ ├── main.cpp +│ │ │ ├── ocr_cls.cpp +│ │ │ ├── ocr_det.cpp +│ │ │ ├── ocr_rec.cpp +│ │ │ ├── postprocess_op.cpp +│ │ │ ├── preprocess_op.cpp +│ │ │ └── utility.cpp +│ │ └── tools // 编译、执行脚本 +│ │ ├── build.sh // 编译脚本 +│ │ ├── config.txt // 配置文件 +│ │ └── run.sh // 测试启动脚本 +│ ├── docker +│ │ └── hubserving +│ │ ├── cpu +│ │ │ └── Dockerfile +│ │ ├── gpu +│ │ │ └── Dockerfile +│ │ ├── README_cn.md +│ │ ├── README.md +│ │ └── sample_request.txt +│ ├── hubserving // hubserving +│ │ ├── ocr_det // 文字检测 +│ │ │ ├── config.json // serving 配置 +│ │ │ ├── __init__.py +│ │ │ ├── module.py // 预测模型 +│ │ │ └── params.py // 预测参数 +│ │ ├── ocr_rec // 文字识别 +│ │ │ ├── config.json +│ │ │ ├── __init__.py +│ │ │ ├── module.py +│ │ │ └── params.py +│ │ └── ocr_system // 系统预测 +│ │ ├── config.json +│ │ ├── __init__.py +│ │ ├── module.py +│ │ └── params.py +│ ├── imgs // 预测图片 +│ │ ├── cpp_infer_pred_12.png +│ │ └── demo.png +│ ├── ios_demo // ios demo +│ │ ... +│ ├── lite // lite 部署 +│ │ ├── cls_process.cc // 方向分类器数据处理 +│ │ ├── cls_process.h +│ │ ├── config.txt // 检测配置参数 +│ │ ├── crnn_process.cc // crnn数据处理 +│ │ ├── crnn_process.h +│ │ ├── db_post_process.cc // db数据处理 +│ │ ├── db_post_process.h +│ │ ├── Makefile // 编译文件 +│ │ ├── ocr_db_crnn.cc // 串联预测 +│ │ ├── prepare.sh // 数据准备 +│ │ ├── readme.md // 说明文档 +│ │ ... 
+│ ├── pdserving // pdserving 部署 +│ │ ├── det_local_server.py // 检测 快速版,部署方便预测速度快 +│ │ ├── det_web_server.py // 检测 完整版,稳定性高分布式部署 +│ │ ├── ocr_local_server.py // 检测+识别 快速版 +│ │ ├── ocr_web_client.py // 客户端 +│ │ ├── ocr_web_server.py // 检测+识别 完整版 +│ │ ├── readme.md // 说明文档 +│ │ ├── rec_local_server.py // 识别 快速版 +│ │ └── rec_web_server.py // 识别 完整版 +│ └── slim +│ └── quantization // 量化相关 +│ ├── export_model.py // 导出模型 +│ ├── quant.py // 量化 +│ └── README.md // 说明文档 +├── doc // 文档教程 +│ ... +├── paddleocr.py +├── ppocr // 网络核心代码 +│ ├── data // 数据处理 +│ │ ├── cls // 方向分类器 +│ │ │ ├── dataset_traversal.py // 数据传输,定义数据读取器,读取数据并组成batch +│ │ │ └── randaugment.py // 随机数据增广操作 +│ │ ├── det // 检测 +│ │ │ ├── data_augment.py // 数据增广操作 +│ │ │ ├── dataset_traversal.py // 数据传输,定义数据读取器,读取数据并组成batch +│ │ │ ├── db_process.py // db 数据处理 +│ │ │ ├── east_process.py // east 数据处理 +│ │ │ ├── make_border_map.py // 生成边界图 +│ │ │ ├── make_shrink_map.py // 生成收缩图 +│ │ │ ├── random_crop_data.py // 随机切割 +│ │ │ └── sast_process.py // sast 数据处理 +│ │ ├── reader_main.py // 数据读取器主函数 +│ │ └── rec // 识别 +│ │ ├── dataset_traversal.py // 数据传输,定义数据读取器,包含 LMDB_Reader 和 Simple_Reader +│ │ └── img_tools.py // 数据处理相关,包括数据归一化、扰动 +│ ├── __init__.py +│ ├── modeling // 组网相关 +│ │ ├── architectures // 模型架构,定义模型所需的各个模块 +│ │ │ ├── cls_model.py // 方向分类器 +│ │ │ ├── det_model.py // 检测 +│ │ │ └── rec_model.py // 识别 +│ │ ├── backbones // 骨干网络 +│ │ │ ├── det_mobilenet_v3.py // 检测 mobilenet_v3 +│ │ │ ├── det_resnet_vd.py +│ │ │ ├── det_resnet_vd_sast.py +│ │ │ ├── rec_mobilenet_v3.py // 识别 mobilenet_v3 +│ │ │ ├── rec_resnet_fpn.py +│ │ │ └── rec_resnet_vd.py +│ │ ├── common_functions.py // 公共函数 +│ │ ├── heads // 头函数 +│ │ │ ├── cls_head.py // 分类头 +│ │ │ ├── det_db_head.py // db 检测头 +│ │ │ ├── det_east_head.py // east 检测头 +│ │ │ ├── det_sast_head.py // sast 检测头 +│ │ │ ├── rec_attention_head.py // 识别 attention +│ │ │ ├── rec_ctc_head.py // 识别 ctc +│ │ │ ├── rec_seq_encoder.py // 识别 序列编码 +│ │ │ ├── rec_srn_all_head.py // 识别 srn 相关 +│ │ 
│ └── self_attention // srn attention +│ │ │ └── model.py +│ │ ├── losses // 损失函数 +│ │ │ ├── cls_loss.py // 方向分类器损失函数 +│ │ │ ├── det_basic_loss.py // 检测基础loss +│ │ │ ├── det_db_loss.py // DB loss +│ │ │ ├── det_east_loss.py // EAST loss +│ │ │ ├── det_sast_loss.py // SAST loss +│ │ │ ├── rec_attention_loss.py // attention loss +│ │ │ ├── rec_ctc_loss.py // ctc loss +│ │ │ └── rec_srn_loss.py // srn loss +│ │ └── stns // 空间变换网络 +│ │ └── tps.py // TPS 变换 +│ ├── optimizer.py // 优化器 +│ ├── postprocess // 后处理 +│ │ ├── db_postprocess.py // DB 后处理 +│ │ ├── east_postprocess.py // East 后处理 +│ │ ├── lanms // lanms 相关 +│ │ │ ... +│ │ ├── locality_aware_nms.py // nms +│ │ └── sast_postprocess.py // sast 后处理 +│ └── utils // 工具 +│ ├── character.py // 字符处理,包括对文本的编码和解码,计算预测准确率 +│ ├── check.py // 参数加载检查 +│ ├── ic15_dict.txt // 英文数字字典,区分大小写 +│ ├── ppocr_keys_v1.txt // 中文字典,用于训练中文模型 +│ ├── save_load.py // 模型保存和加载函数 +│ ├── stats.py // 统计 +│ └── utility.py // 工具函数,包含输入参数是否合法等相关检查工具 +├── README_en.md // 说明文档 +├── README.md +├── requirments.txt // 安装依赖 +├── setup.py // whl包打包脚本 +└── tools // 启动工具 + ├── eval.py // 评估函数 + ├── eval_utils // 评估工具 + │ ├── eval_cls_utils.py // 分类相关 + │ ├── eval_det_iou.py // 检测 iou 相关 + │ ├── eval_det_utils.py // 检测相关 + │ ├── eval_rec_utils.py // 识别相关 + │ └── __init__.py + ├── export_model.py // 导出 infer 模型 + ├── infer // 基于预测引擎预测 + │ ├── predict_cls.py + │ ├── predict_det.py + │ ├── predict_rec.py + │ ├── predict_system.py + │ └── utility.py + ├── infer_cls.py // 基于训练引擎 预测分类 + ├── infer_det.py // 基于训练引擎 预测检测 + ├── infer_rec.py // 基于训练引擎 预测识别 + ├── program.py // 整体流程 + ├── test_hubserving.py + └── train.py // 启动训练 + +``` diff --git a/doc/doc_ch/update.md b/doc/doc_ch/update.md index 55442c8dfcaee815d52ef73718aeb0cacf7a4b4a..017cd9473849fa8a6fbdef434e6876cc835717e9 100644 --- a/doc/doc_ch/update.md +++ b/doc/doc_ch/update.md @@ -1,6 +1,7 @@ # 更新 -- 2020.9.17 更新超轻量ppocr_mobile系列和通用ppocr_server系列系列中英文ocr模型,效果媲美商业效果。[模型下载](./models_list.md) -- 2020.8.26 
更新OCR相关的84个常见问题及解答,具体参考[FAQ](./FAQ.md) +- 2020.9.19 更新超轻量压缩ppocr_mobile_slim系列模型,整体模型3.5M(详见[PP-OCR Pipline](#PP-OCR)),适合在移动端部署使用。[模型下载](#模型下载) +- 2020.9.17 更新超轻量ppocr_mobile系列和通用ppocr_server系列中英文ocr模型,媲美商业效果。[模型下载](#模型下载) +- 2020.8.26 更新OCR相关的84个常见问题及解答,具体参考[FAQ](./doc/doc_ch/FAQ.md) - 2020.8.24 支持通过whl包安装使用PaddleOCR,具体参考[Paddleocr Package使用说明](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/whl.md) - 2020.8.21 更新8月18日B站直播课回放和PPT,课节2,易学易用的OCR工具大礼包,[获取地址](https://aistudio.baidu.com/aistudio/education/group/info/1519) - 2020.8.16 开源文本检测算法[SAST](https://arxiv.org/abs/1908.05498)和文本识别算法[SRN](https://arxiv.org/abs/2003.12294) diff --git a/doc/doc_ch/visualization.md b/doc/doc_ch/visualization.md index 70d4321feb5dc5502badde60c4dd02e45e5caf5c..fca075914feb6afd159c5ea6355d3c7bb6842233 100644 --- a/doc/doc_ch/visualization.md +++ b/doc/doc_ch/visualization.md @@ -1,7 +1,7 @@ # 效果展示 - PP-OCR 1.1系列模型效果 - [通用ppocr_server_1.1效果展示](#通用ppocr_server_1.1效果展示) - - [通用ppocr_server_1.1效果展示(待补充)]() + - [通用ppocr_mobile_1.1效果展示(待补充)]() - PP-OCR 1.0系列模型效果 - [超轻量ppocr_mobile_1.0效果展示](#超轻量ppocr_mobile_1.0效果展示) - [通用ppocr_server_1.0效果展示](#通用ppocr_server_1.0效果展示) diff --git a/doc/doc_ch/whl.md b/doc/doc_ch/whl.md index 657f9837a768f6753b68b5e937134e10440e382d..46796ce64a60f12db9bbfbdd7b16ff77238c1831 100644 --- a/doc/doc_ch/whl.md +++ b/doc/doc_ch/whl.md @@ -19,7 +19,9 @@ pip install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x是paddleocr的版本 * 检测+分类+识别全流程 ```python from paddleocr import PaddleOCR, draw_ocr -ocr = PaddleOCR(use_angle_cls=True) # need to run only once to download and load model into memory +# Paddleocr目前支持中英文、英文、法语、德语、韩语、日语,可以通过修改lang参数进行切换 +# 参数依次为`zh`, `en`, `french`, `german`, `korean`, `japan`。 +ocr = PaddleOCR(use_angle_cls=True, lang="ch") # need to run only once to download and load model into memory img_path = 'PaddleOCR/doc/imgs/11.jpg' result = ocr.ocr(img_path, cls=True) for line in result: diff --git a/doc/doc_en/detection_en.md 
b/doc/doc_en/detection_en.md index 9f37ca8d24c75ba80a143233cdc0a3321fee6a4f..401d7a9ad479716a6d6694ca1f432a2c934def88 100644 --- a/doc/doc_en/detection_en.md +++ b/doc/doc_en/detection_en.md @@ -73,7 +73,7 @@ You can also use `-o` to change the training parameters without modifying the ym python3 tools/train.py -c configs/det/det_mv3_db.yml -o Optimizer.base_lr=0.0001 ``` -#### load trained model and conntinue training +#### load trained model and continue training If you expect to load trained model and continue the training again, you can specify the parameter `Global.checkpoints` as the model path to be loaded. For example: diff --git a/doc/doc_en/models_list_en.md b/doc/doc_en/models_list_en.md new file mode 100644 index 0000000000000000000000000000000000000000..8d1192bafb7a80a882f471489f8fdeb53e2abc67 --- /dev/null +++ b/doc/doc_en/models_list_en.md @@ -0,0 +1,70 @@ +## OCR model list(V1.1, updated on 9.22) + +- [1. Text Detection Model](#Detection) +- [2. Text Recognition Model](#Recognition) + - [Chinese Recognition Model](#Chinese) + - [English Recognition Model](#English) + - [Multilingual Recognition Model](#Multilingual) +- [3. Text Angle Classification Model](#Angle) + +The downloadable models provided by PaddleOCR include `inference model`, `trained model`, `pre-trained model` and `slim model`. The differences between the models are as follows: + +|model type|model format|description| +|-|-|-| +|inference model|model、params|Used for reasoning based on Python prediction engine. [detail](./inference_en.md)| +|trained model / pre-trained model|\*.pdmodel、\*.pdopt、\*.pdparams|The checkpoints model saved in the training process, which stores the parameters of the model, mostly used for model evaluation and continuous training.| +|slim model|-|Generally used for Lite deployment| + + + +### 1. 
Text Detection Model +|model name|description|model size|download| +|-|-|-|-| +|ch_ppocr_mobile_slim_v1.1_det|Slim pruned lightweight model, supporting Chinese, English, multilingual text detection|1.4M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/det/ch_ppocr_mobile_v1.1_det_prune_infer.tar) / [slim model](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/det/ch_ppocr_mobile_v1.1_det_prune_opt.nb)| +|ch_ppocr_mobile_v1.1_det|Original lightweight model, supporting Chinese, English, multilingual text detection|2.6M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_train.tar)| +|ch_ppocr_server_v1.1_det|General model, which is larger than the lightweight model, but achieved better performance|47.2M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_train.tar)| + + + +### 2. 
Text Recognition Model + + +#### Chinese Recognition Model +|model name|description|model size|download| +|-|-|-|-| +|ch_ppocr_mobile_slim_v1.1_rec|Slim pruned and quantized lightweight model, supporting Chinese, English and number recognition|1.6M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/rec/ch_ppocr_mobile_v1.1_rec_quant_infer.tar) / [slim model](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/rec/ch_ppocr_mobile_v1.1_rec_quant_opt.nb)| +|ch_ppocr_mobile_v1.1_rec|Original lightweight model, supporting Chinese, English and number recognition|4.6M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_train.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_pre.tar)| +|ch_ppocr_server_v1.1_rec|General model, supporting Chinese, English and number recognition|105M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_train.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_pre.tar)| + +**Note:** The `trained model` is finetuned on the `pre-trained model` with real data and synthsized vertical text data, which achieved better performance in real scene. The `pre-trained model` is directly trained on the full amount of real data and synthsized data, which is more suitable for finetune on your own dataset. 
+ + +#### English Recognition Model +|model name|description|model size|download| +|-|-|-|-| +|en_ppocr_mobile_slim_v1.1_rec|Slim pruned and quantized lightweight model, supporting English and number recognition|0.9M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/en/en_ppocr_mobile_v1.1_rec_quant_infer.tar) / [slim model](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/en/en_ppocr_mobile_v1.1_rec_quant_opt.nb)| +|en_ppocr_mobile_v1.1_rec|Original lightweight model, supporting English and number recognition|2.0M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/en/en_ppocr_mobile_v1.1_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/en/en_ppocr_mobile_v1.1_rec_train.tar)| + + +#### Multilingual Recognition Model(Updating...) +|model name|description|model size|download| +|-|-|-|-| +| french_ppocr_mobile_v1.1_rec |Lightweight model for French recognition|2.1M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/fr/french_ppocr_mobile_v1.1_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/fr/french_ppocr_mobile_v1.1_rec_train.tar)| +| german_ppocr_mobile_v1.1_rec |German model for French recognition|2.1M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/ge/german_ppocr_mobile_v1.1_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/ge/german_ppocr_mobile_v1.1_rec_train.tar)| +| korean_ppocr_mobile_v1.1_rec |Lightweight model for Korean recognition|3.4M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/kr/korean_ppocr_mobile_v1.1_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/kr/korean_ppocr_mobile_v1.1_rec_train.tar)| +| japan_ppocr_mobile_v1.1_rec |Lightweight model for Japanese recognition|3.7M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/jp/japan_ppocr_mobile_v1.1_rec_infer.tar) / [trained 
model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/jp/japan_ppocr_mobile_v1.1_rec_train.tar)| + + + +### 3. Text Angle Classification Model +|model name|description|model size|download| +|-|-|-|-| +|ch_ppocr_mobile_v1.1_cls_quant|Slim quantized model|0.5M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_quant_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_quant_train.tar) / [slim model]()| +|ch_ppocr_mobile_v1.1_cls|Original model|850kb|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_train.tar)| + + +## OCR model list(V1.0, updated on 7.16) +|model name|description|detection model|recognition model|recognition model supporting space recognition| +|-|-|-|-|-| +|chinese_db_crnn_mobile|8.6M lightweight OCR model|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance.tar) +|chinese_db_crnn_server|General OCR model|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance_infer.tar) / [trained 
model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance.tar) diff --git a/doc/doc_en/whl_en.md b/doc/doc_en/whl_en.md index b62e5454e82a9bf4f8242b94b0d37544d3796c13..4049d9dcb2d52eb5f610d5f02017a9d2d4f14f47 100644 --- a/doc/doc_en/whl_en.md +++ b/doc/doc_en/whl_en.md @@ -17,12 +17,16 @@ pip install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x is the version of padd * detection classification and recognition ```python from paddleocr import PaddleOCR,draw_ocr +# PaddleOCR supports Chinese, English, French, German, Korean and Japanese. +# Set the parameter `lang` to `ch`, `en`, `french`, `german`, `korean` or `japan` +# to switch to the corresponding language model. ocr = PaddleOCR(use_angle_cls=True, lang='en') # need to run only once to download and load model into memory img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg' result = ocr.ocr(img_path, cls=True) for line in result: print(line) + # draw result from PIL import Image image = Image.open(img_path).convert('RGB') diff --git a/doc/french.ttf b/doc/french.ttf new file mode 100644 index 0000000000000000000000000000000000000000..ab68fb197d4479b3b6dec6e85bd5cbaf433a87c5 Binary files /dev/null and b/doc/french.ttf differ diff --git a/doc/german.ttf b/doc/german.ttf new file mode 100644 index 0000000000000000000000000000000000000000..ab68fb197d4479b3b6dec6e85bd5cbaf433a87c5 Binary files /dev/null and b/doc/german.ttf differ diff --git a/doc/imgs_results/1106.jpg b/doc/imgs_results/1106.jpg new file mode 100644 index 0000000000000000000000000000000000000000..61f3915d5a36b02537681687dafb0e2e9303eea2 Binary files /dev/null and b/doc/imgs_results/1106.jpg differ diff --git a/doc/imgs_results/1110.jpg b/doc/imgs_results/1110.jpg index ff004c864047ecb1cefcd02e0eea561c415a3a7b..b0c63e7c47c9ddbd555df34f8a9c17bf7d93043d 100644 Binary files a/doc/imgs_results/1110.jpg and b/doc/imgs_results/1110.jpg differ diff --git a/doc/imgs_results/1112.jpg b/doc/imgs_results/1112.jpg index 
c2d87fe5936abf2032f125940b5e99ec8d030da7..35bec155034ba5860620f8c9d387dbc71607d6fe 100644 Binary files a/doc/imgs_results/1112.jpg and b/doc/imgs_results/1112.jpg differ diff --git a/doc/imgs_words/french/1.jpg b/doc/imgs_words/french/1.jpg new file mode 100644 index 0000000000000000000000000000000000000000..077ca28e70b74ed07fa637011c80219aecc448d5 Binary files /dev/null and b/doc/imgs_words/french/1.jpg differ diff --git a/doc/imgs_words/french/2.jpg b/doc/imgs_words/french/2.jpg new file mode 100644 index 0000000000000000000000000000000000000000..38a73caa621710a7eb7378603e0152ba9c14dd41 Binary files /dev/null and b/doc/imgs_words/french/2.jpg differ diff --git a/doc/imgs_words/german/1.jpg b/doc/imgs_words/german/1.jpg new file mode 100644 index 0000000000000000000000000000000000000000..d26ec9ed14de65c2d27e37693ff0da133e774b94 Binary files /dev/null and b/doc/imgs_words/german/1.jpg differ diff --git a/doc/imgs_words/japan/1.jpg b/doc/imgs_words/japan/1.jpg new file mode 100644 index 0000000000000000000000000000000000000000..684879749764a1b6063da32d7910bff911e855f4 Binary files /dev/null and b/doc/imgs_words/japan/1.jpg differ diff --git a/doc/imgs_words/korean/1.jpg b/doc/imgs_words/korean/1.jpg new file mode 100644 index 0000000000000000000000000000000000000000..48a89389ae880783a39a13e9b06a861b88948fba Binary files /dev/null and b/doc/imgs_words/korean/1.jpg differ diff --git a/doc/imgs_words/korean/2.jpg b/doc/imgs_words/korean/2.jpg new file mode 100644 index 0000000000000000000000000000000000000000..b24f28914d574be44e147943d906f8634f149ed5 Binary files /dev/null and b/doc/imgs_words/korean/2.jpg differ diff --git a/doc/japan.ttc b/doc/japan.ttc new file mode 100644 index 0000000000000000000000000000000000000000..ad68243b968fc87b207928594c585039859b75a9 Binary files /dev/null and b/doc/japan.ttc differ diff --git a/doc/korean.ttf b/doc/korean.ttf new file mode 100644 index 0000000000000000000000000000000000000000..e638ce37f67ff1cd9babf73387786eaeb5c52968 
Binary files /dev/null and b/doc/korean.ttf differ diff --git a/doc/ppocr_framework.png b/doc/ppocr_framework.png new file mode 100644 index 0000000000000000000000000000000000000000..ab51c88fe694b210a98423868cd90874be3c09ed Binary files /dev/null and b/doc/ppocr_framework.png differ diff --git a/paddleocr.py b/paddleocr.py index 55ca87ac93996311d2760b9e2b63530acc7e5092..7e9b2402ad792b4d690b1147f042203df46872a5 100644 --- a/paddleocr.py +++ b/paddleocr.py @@ -46,6 +46,26 @@ model_urls = { 'url': 'https://paddleocr.bj.bcebos.com/20-09-22/mobile/en/en_ppocr_mobile_v1.1_rec_infer.tar', 'dict_path': './ppocr/utils/ic15_dict.txt' + }, + 'french': { + 'url': + 'https://paddleocr.bj.bcebos.com/20-09-22/mobile/fr/french_ppocr_mobile_v1.1_rec_infer.tar', + 'dict_path': './ppocr/utils/french_dict.txt' + }, + 'german': { + 'url': + 'https://paddleocr.bj.bcebos.com/20-09-22/mobile/ge/german_ppocr_mobile_v1.1_rec_infer.tar', + 'dict_path': './ppocr/utils/german_dict.txt' + }, + 'korean': { + 'url': + 'https://paddleocr.bj.bcebos.com/20-09-22/mobile/kr/korean_ppocr_mobile_v1.1_rec_infer.tar', + 'dict_path': './ppocr/utils/korean_dict.txt' + }, + 'japan': { + 'url': + 'https://paddleocr.bj.bcebos.com/20-09-22/mobile/jp/japan_ppocr_mobile_v1.1_rec_infer.tar', + 'dict_path': './ppocr/utils/japan_dict.txt' } }, 'cls': @@ -165,8 +185,9 @@ class PaddleOCR(predict_system.TextSystem): postprocess_params.__dict__.update(**kwargs) self.use_angle_cls = postprocess_params.use_angle_cls lang = postprocess_params.lang - assert lang in model_urls['rec'], 'param lang must in {}'.format( - model_urls['rec'].keys()) + assert lang in model_urls[ + 'rec'], 'param lang must in {}, but got {}'.format( + model_urls['rec'].keys(), lang) if postprocess_params.rec_char_dict_path is None: postprocess_params.rec_char_dict_path = model_urls['rec'][lang][ 'dict_path'] diff --git a/tools/infer/predict_system.py b/tools/infer/predict_system.py index 
3e6be234c68dcd82f0f9e844f3ad2859000cec88..29c4d7e8e35ceda3966dfcadcca5f0ae985d1bb1 100755 --- a/tools/infer/predict_system.py +++ b/tools/infer/predict_system.py @@ -133,6 +133,7 @@ def main(args): image_file_list = get_image_file_list(args.image_dir) text_sys = TextSystem(args) is_visualize = True + font_path = args.vis_font_path for image_file in image_file_list: img, flag = check_and_read_gif(image_file) if not flag: @@ -160,7 +161,7 @@ def main(args): scores = [rec_res[i][1] for i in range(len(rec_res))] draw_img = draw_ocr( - image, boxes, txts, scores, drop_score=drop_score) + image, boxes, txts, scores, drop_score=drop_score, font_path=font_path) draw_img_save = "./inference_results/" if not os.path.exists(draw_img_save): os.makedirs(draw_img_save) diff --git a/tools/infer/utility.py b/tools/infer/utility.py index 3a25be52f16453d62fd6892cc824bf39bd5d6f4d..45d7b73707904d3bb2df665a1cf348a32c70f852 100755 --- a/tools/infer/utility.py +++ b/tools/infer/utility.py @@ -71,6 +71,10 @@ def parse_args(): type=str, default="./ppocr/utils/ppocr_keys_v1.txt") parser.add_argument("--use_space_char", type=str2bool, default=True) + parser.add_argument( + "--vis_font_path", + type=str, + default="./doc/simfang.ttf") # params for text classifier parser.add_argument("--use_angle_cls", type=str2bool, default=False) @@ -199,7 +203,7 @@ def draw_ocr(image, return image -def draw_ocr_box_txt(image, boxes, txts): +def draw_ocr_box_txt(image, boxes, txts, font_path="./doc/simfang.ttf"): h, w = image.height, image.width img_left = image.copy() img_right = Image.new('RGB', (w, h), (255, 255, 255)) @@ -226,7 +230,7 @@ def draw_ocr_box_txt(image, boxes, txts): if box_height > 2 * box_width: font_size = max(int(box_width * 0.9), 10) font = ImageFont.truetype( - "./doc/simfang.ttf", font_size, encoding="utf-8") + font_path, font_size, encoding="utf-8") cur_y = box[0][1] for c in txt: char_size = font.getsize(c) @@ -236,7 +240,7 @@ def draw_ocr_box_txt(image, boxes, txts): else: font_size 
= max(int(box_height * 0.8), 10) font = ImageFont.truetype( - "./doc/simfang.ttf", font_size, encoding="utf-8") + font_path, font_size, encoding="utf-8") draw_right.text( [box[0][0], box[0][1]], txt, fill=(0, 0, 0), font=font) img_left = Image.blend(image, img_left, 0.5) diff --git a/tools/program.py b/tools/program.py index 08799d17eb66dd9b97fa9d6a7d509167f5d74c88..2ef203f4cb08231fa04cf2e4c8ee41a40470a0ae 100755 --- a/tools/program.py +++ b/tools/program.py @@ -204,6 +204,15 @@ def build(config, main_prog, startup_prog, mode): def build_export(config, main_prog, startup_prog): """ + Build input and output for exporting a checkpoints model to an inference model + Args: + config(dict): config + main_prog(): main program + startup_prog(): startup program + Returns: + feeded_var_names(list[str]): var names of input for exported inference model + target_vars(list[Variable]): output vars for exported inference model + fetches_var_name: dict of checkpoints model outputs(included loss and measures) """ with fluid.program_guard(main_prog, startup_prog): with fluid.unique_name.guard(): @@ -246,6 +255,9 @@ def train_eval_det_run(config, train_info_dict, eval_info_dict, is_pruning=False): + ''' + main program of evaluation for detection + ''' train_batch_id = 0 log_smooth_window = config['Global']['log_smooth_window'] epoch_num = config['Global']['epoch_num'] @@ -337,6 +349,9 @@ def train_eval_det_run(config, def train_eval_rec_run(config, exe, train_info_dict, eval_info_dict): + ''' + main program of evaluation for recognition + ''' train_batch_id = 0 log_smooth_window = config['Global']['log_smooth_window'] epoch_num = config['Global']['epoch_num'] @@ -513,6 +528,7 @@ def train_eval_cls_run(config, exe, train_info_dict, eval_info_dict): def preprocess(): + # load config from yml file FLAGS = ArgsParser().parse_args() config = load_config(FLAGS.config) merge_config(FLAGS.opt) @@ -522,6 +538,7 @@ def preprocess(): use_gpu = config['Global']['use_gpu'] check_gpu(use_gpu) + # 
check whether the set algorithm belongs to the supported algorithm list alg = config['Global']['algorithm'] assert alg in [ 'EAST', 'DB', 'SAST', 'Rosetta', 'CRNN', 'STARNet', 'RARE', 'SRN', 'CLS' diff --git a/tools/train.py b/tools/train.py index 531dd15933ebfd83527f091215c40b85253f7866..cf0171b340f8cebb6251d2ef12efb14d3cdb709e 100755 --- a/tools/train.py +++ b/tools/train.py @@ -46,6 +46,7 @@ from paddle.fluid.contrib.model_stat import summary def main(): + # build train program train_build_outputs = program.build( config, train_program, startup_program, mode='train') train_loader = train_build_outputs[0] @@ -54,6 +55,7 @@ def main(): train_opt_loss_name = train_build_outputs[3] model_average = train_build_outputs[-1] + # build eval program eval_program = fluid.Program() eval_build_outputs = program.build( config, eval_program, startup_program, mode='eval') @@ -61,9 +63,11 @@ def main(): eval_fetch_varname_list = eval_build_outputs[2] eval_program = eval_program.clone(for_test=True) + # initialize train reader train_reader = reader_main(config=config, mode="train") train_loader.set_sample_list_generator(train_reader, places=place) + # initialize eval reader eval_reader = reader_main(config=config, mode="eval") exe = fluid.Executor(place) diff --git a/train_data/gen_label.py b/train_data/gen_label.py new file mode 100644 index 0000000000000000000000000000000000000000..552f279f34efa0be437d404273c510585da12f83 --- /dev/null +++ b/train_data/gen_label.py @@ -0,0 +1,74 @@ +#copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve. +# +#Licensed under the Apache License, Version 2.0 (the "License"); +#you may not use this file except in compliance with the License. +#You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +#Unless required by applicable law or agreed to in writing, software +#distributed under the License is distributed on an "AS IS" BASIS, +#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+#See the License for the specific language governing permissions and
+#limitations under the License.
+import os
+import argparse
+
+
+def gen_rec_label(input_path, out_label):
+    with open(out_label, 'w') as out_file:
+        with open(input_path, 'r') as f:
+            for line in f.readlines():
+                tmp = line.strip('\n').replace(" ", "").split(',')
+                img_path, label = tmp[0], tmp[1]
+                label = label.replace("\"", "")
+                out_file.write(img_path + '\t' + label + '\n')
+
+
+def gen_det_label(root_path, input_dir, out_label):
+    with open(out_label, 'w') as out_file:
+        for label_file in os.listdir(input_dir):
+            img_path = root_path + label_file[3:-4] + ".jpg"
+            label = []
+            with open(os.path.join(input_dir, label_file), 'r') as f:
+                for line in f.readlines():
+                    tmp = line.strip("\n\r").replace("\xef\xbb\xbf", "").split(',')
+                    points = tmp[:-2]
+                    s = []
+                    for i in range(0, len(points), 2):
+                        b = points[i:i + 2]
+                        s.append(b)
+                    result = {"transcription": tmp[-1], "points": s}
+                    label.append(result)
+            out_file.write(img_path + '\t' + str(label) + '\n')
+
+
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser()
+    parser.add_argument(
+        '--mode',
+        type=str,
+        default="rec",
+        help='Generate rec_label or det_label, can be set rec or det')
+    parser.add_argument(
+        '--root_path',
+        type=str,
+        default=".",
+        help='The root directory of images.Only takes effect when mode=det ')
+    parser.add_argument(
+        '--input_path',
+        type=str,
+        default=".",
+        help='Input_label or input path to be converted')
+    parser.add_argument(
+        '--output_label',
+        type=str,
+        default="out_label.txt",
+        help='Output file name')
+
+    args = parser.parse_args()
+    if args.mode == "rec":
+        print("Generate rec label")
+        gen_rec_label(args.input_path, args.output_label)
+    elif args.mode == "det":
+        gen_det_label(args.root_path, args.input_path, args.output_label)
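The per-line conversion performed by `gen_rec_label` in the new `train_data/gen_label.py` can be tried on an in-memory string, without touching the filesystem. The sketch below is an illustration only and is not part of the commit: `convert_rec_line` and the sample line are hypothetical, and the input layout (`image name, "transcription"`) is assumed from the parsing logic in the diff.

```python
def convert_rec_line(line):
    """Mirror the per-line logic of gen_rec_label (hypothetical helper)."""
    # Drop the trailing newline and every space, then split on commas.
    parts = line.strip("\n").replace(" ", "").split(",")
    img_path, label = parts[0], parts[1]
    # Remove the double quotes that wrap the transcription.
    label = label.replace('"', "")
    # Emit one tab-separated record: image path, then transcription.
    return img_path + "\t" + label + "\n"


# Assumed sample input line; the file name and text are made up.
sample = 'word_1.png, "Genaxis Theatre"\n'
print(repr(convert_rec_line(sample)))
```

Because every space is removed before the split, spaces inside a transcription are dropped as well, and a transcription that itself contains a comma is cut at that comma. If the labels in your dataset can contain either, this step may need adjusting.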