Unverified commit f70371ee authored by MissPenguin, committed by GitHub

Merge branch 'develop' into develop

English | [简体中文](README_ch.md)

## Introduction
PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and apply them in practice.

**Recent updates**
- 2020.9.22 Update the PP-OCR technical article, https://arxiv.org/abs/2009.09941
- 2020.9.19 Update the ultra lightweight compressed ppocr_mobile_slim series models, the overall model size is 3.5M (see [PP-OCR Pipeline](#PP-OCR-Pipeline)), suitable for mobile deployment. [Model Downloads](#Supported-Chinese-model-list)
- 2020.9.17 Update the ultra lightweight ppocr_mobile series and general ppocr_server series Chinese and English OCR models, which are comparable to commercial effects. [Model Downloads](#Supported-Chinese-model-list)
- 2020.9.17 Update the [English recognition model](./doc/doc_en/models_list_en.md#english-recognition-model) and [Multilingual recognition model](doc/doc_en/models_list_en.md#english-recognition-model); `English`, `Chinese`, `German`, `French`, `Japanese` and `Korean` are now supported. Models for more languages will continue to be updated.
- 2020.8.24 Support the use of PaddleOCR through whl package installation; please refer to [PaddleOCR Package](./doc/doc_en/whl_en.md). A minimal usage sketch follows this list.
- 2020.8.21 Update the replay and PPT of the live lesson at Bilibili on August 18, lesson 2, easy to learn and use OCR tool spree. [Get Address](https://aistudio.baidu.com/aistudio/education/group/info/1519)
- [more](./doc/doc_en/update_en.md)
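A hedged sketch of the whl-package route mentioned above (flags and result layout follow the v1.1-era Python API; see the package guide for the full set of options):

```python
# Minimal whl-package usage sketch; install first with: pip3 install paddleocr
from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=True, lang='en')  # downloads det/cls/rec models on first run
result = ocr.ocr('doc/imgs_en/img_12.jpg', cls=True)
for box, (text, confidence) in result:  # one entry per detected text line
    print(box, text, confidence)
```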
@@ -32,6 +32,15 @@ PaddleOCR aims to create rich, leading, and practical OCR tools that help users

The above pictures are the visualizations of the general ppocr_server model. For more effect pictures, please see [More visualizations](./doc/doc_en/visualization_en.md).
<a name="Community"></a>
## Community
- Scan the QR code below with your Wechat, you can access to official technical exchange group. Look forward to your participation.
<div align="center">
<img src="./doc/joinus.PNG" width = "200" height = "200" />
</div>
## Quick Experience ## Quick Experience
You can also quickly experience the ultra-lightweight OCR : [Online Experience](https://www.paddlepaddle.org.cn/hub/scene/ocr) You can also quickly experience the ultra-lightweight OCR : [Online Experience](https://www.paddlepaddle.org.cn/hub/scene/ocr)
@@ -55,9 +64,14 @@ Mobile DEMO experience (based on EasyEdge and Paddle-Lite, supports iOS and Andr
| Chinese and English ultra-lightweight OCR model (8.1M) | ch_ppocr_mobile_v1.1_xx | Mobile & server | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_pre.tar) |
| Chinese and English general OCR model (155.1M) | ch_ppocr_server_v1.1_xx | Server | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_pre.tar) |
| Chinese and English ultra-lightweight compressed OCR model (3.5M) | ch_ppocr_mobile_slim_v1.1_xx | Mobile | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/det/ch_ppocr_mobile_v1.1_det_prune_infer.tar) / [slim model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/lite/ch_ppocr_mobile_v1.1_det_prune_opt.nb) | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_quant_infer.tar) / [slim model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/lite/ch_ppocr_mobile_v1.1_cls_quant_opt.nb) | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/rec/ch_ppocr_mobile_v1.1_rec_quant_infer.tar) / [slim model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/lite/ch_ppocr_mobile_v1.1_rec_quant_opt.nb) |
| French ultra-lightweight OCR model (4.6M) | french_ppocr_mobile_v1.1_xx | Mobile & server | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_train.tar) | - | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/fr/french_ppocr_mobile_v1.1_rec_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/fr/french_ppocr_mobile_v1.1_rec_train.tar) |
| German ultra-lightweight OCR model (4.6M) | german_ppocr_mobile_v1.1_xx | Mobile & server | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_train.tar) | - | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/ge/german_ppocr_mobile_v1.1_rec_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/ge/german_ppocr_mobile_v1.1_rec_train.tar) |
| Korean ultra-lightweight OCR model (5.9M) | korean_ppocr_mobile_v1.1_xx | Mobile & server | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_train.tar) | - | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/kr/korean_ppocr_mobile_v1.1_rec_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/kr/korean_ppocr_mobile_v1.1_rec_train.tar) |
| Japanese ultra-lightweight OCR model (6.2M) | japan_ppocr_mobile_v1.1_xx | Mobile & server | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_train.tar) | - | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/jp/japan_ppocr_mobile_v1.1_rec_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/jp/japan_ppocr_mobile_v1.1_rec_train.tar) |

For more model downloads (including multiple languages), please refer to [PP-OCR v1.1 series model downloads](./doc/doc_en/models_list_en.md).

For a new language request, please refer to the [Guideline for new language requests](#language_requests).
## Tutorials
- [Installation](./doc/doc_en/installation_en.md)

@@ -66,7 +80,7 @@ For more model downloads (including multiple languages), please refer to [PP-OCR

- Algorithm introduction
    - [Text Detection Algorithm](./doc/doc_en/algorithm_overview_en.md)
    - [Text Recognition Algorithm](./doc/doc_en/algorithm_overview_en.md)
    - [PP-OCR Pipeline](#PP-OCR-Pipeline)
- Model training/evaluation
    - [Text Detection](./doc/doc_en/detection_en.md)
    - [Text Recognition](./doc/doc_en/recognition_en.md)

@@ -88,21 +102,22 @@ For more model downloads (including multiple languages), please refer to [PP-OCR

    - [Data Annotation Tools](./doc/doc_en/data_annotation_en.md)
    - [Data Synthesis Tools](./doc/doc_en/data_synthesis_en.md)
- [Visualization](#Visualization)
- [New language requests](#language_requests)
- [FAQ](./doc/doc_en/FAQ_en.md)
- [Community](#Community)
- [References](./doc/doc_en/reference_en.md)
- [License](#LICENSE)
- [Contribution](#CONTRIBUTION)

<a name="PP-OCR-Pipeline"></a>
## PP-OCR Pipeline
<div align="center"> <div align="center">
<img src="./doc/ppocr_framework.png" width="800"> <img src="./doc/ppocr_framework.png" width="800">
</div> </div>
PP-OCR is a practical ultra-lightweight OCR system. It is mainly composed of three parts: DB text detection, detection frame correction and CRNN text recognition. The system adopts 19 effective strategies from 8 aspects including backbone network selection and adjustment, prediction head design, data augmentation, learning rate transformation strategy, regularization parameter selection, pre-training model use, and automatic model tailoring and quantization to optimize and slim down the models of each module. The final results are an ultra-lightweight Chinese and English OCR model with an overall size of 3.5M and a 2.8M English digital OCR model. For more details, please refer to the PP-OCR technical article (https://arxiv.org/abs/2009.09941). Besides, The implementation of the FPGM Pruner and PACT quantization is based on [PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim). PP-OCR is a practical ultra-lightweight OCR system. It is mainly composed of three parts: DB text detection, detection frame correction and CRNN text recognition. The system adopts 19 effective strategies from 8 aspects including backbone network selection and adjustment, prediction head design, data augmentation, learning rate transformation strategy, regularization parameter selection, pre-training model use, and automatic model tailoring and quantization to optimize and slim down the models of each module. The final results are an ultra-lightweight Chinese and English OCR model with an overall size of 3.5M and a 2.8M English digital OCR model. For more details, please refer to the PP-OCR technical article (https://arxiv.org/abs/2009.09941). Besides, The implementation of the FPGM Pruner and PACT quantization is based on [PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim).
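To make the three-part flow concrete, here is a hedged sketch of how the stages compose. The function names are hypothetical placeholders for illustration; the repository's real entry point is `tools/infer/predict_system.py`:

```python
# Hypothetical composition of the PP-OCR stages (illustrative only; the real
# pipeline lives in tools/infer/predict_system.py).
from typing import Any, Callable, List, Tuple

def ppocr_pipeline(
    image: Any,
    detect: Callable,     # DB text detection: image -> list of quadrilateral boxes
    rectify: Callable,    # detection frame correction: (image, box) -> rectified crop
    classify: Callable,   # direction classifier: crop -> crop, rotated 180° if needed
    recognize: Callable,  # CRNN recognition: crop -> (text, confidence)
) -> List[Tuple[str, float]]:
    return [recognize(classify(rectify(image, box))) for box in detect(image)]
```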
@@ -126,13 +141,24 @@ PP-OCR is a practical ultra-lightweight OCR system. It is mainly composed of thr

<div align="center">
    <img src="./doc/imgs_results/1112.jpg" width="800">
</div>

<a name="language_requests"></a>
## Guideline for new language requests
If you want to request support for a new language, a PR with the 2 following files is needed:

1. In the folder [ppocr/utils/dict](https://github.com/PaddlePaddle/PaddleOCR/tree/develop/ppocr/utils/dict),
submit the dictionary text named `{language}_dict.txt`, containing the list of all characters. Please follow the format of the other files in that folder.

2. In the folder [ppocr/utils/corpus](https://github.com/PaddlePaddle/PaddleOCR/tree/develop/ppocr/utils/corpus),
submit the corpus named `{language}_corpus.txt`, containing a list of words in your language.
At least 50,000 words per language are recommended; of course, the more, the better.

If your language has unique elements, please let us know in advance in any way you like, such as useful links or Wikipedia pages.

For more details, please refer to the [Multilingual OCR Development Plan](https://github.com/PaddlePaddle/PaddleOCR/issues/1048).
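As an illustration of the expected dictionary format (one character per line), here is a quick self-check you could run before opening the PR. The helper is hypothetical, not part of the repository:

```python
# Hypothetical sanity check for a new {language}_dict.txt; not part of PaddleOCR.
# Expected format: one character per line, no duplicates.
from collections import Counter

def check_dict(path: str) -> None:
    with open(path, encoding="utf-8") as f:
        chars = [line.rstrip("\n") for line in f if line.rstrip("\n")]
    bad = [c for c in chars if len(c) != 1]
    assert not bad, f"lines with more than one character: {bad[:5]}"
    dups = [c for c, n in Counter(chars).items() if n > 1]
    assert not dups, f"duplicate characters: {dups[:5]}"
    print(f"{path}: {len(chars)} characters, format OK")

check_dict("ppocr/utils/dict/french_dict.txt")  # existing file as a format reference
```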
<a name="LICENSE"></a> <a name="LICENSE"></a>
## License ## License
...@@ -149,3 +175,5 @@ We welcome all the contributions to PaddleOCR and appreciate for your feedback v ...@@ -149,3 +175,5 @@ We welcome all the contributions to PaddleOCR and appreciate for your feedback v
- Thanks [authorfu](https://github.com/authorfu) for contributing Android demo and [xiadeye](https://github.com/xiadeye) contributing iOS demo, respectively. - Thanks [authorfu](https://github.com/authorfu) for contributing Android demo and [xiadeye](https://github.com/xiadeye) contributing iOS demo, respectively.
- Thanks [BeyondYourself](https://github.com/BeyondYourself) for contributing many great suggestions and simplifying part of the code style. - Thanks [BeyondYourself](https://github.com/BeyondYourself) for contributing many great suggestions and simplifying part of the code style.
- Thanks [tangmq](https://gitee.com/tangmq) for contributing Dockerized deployment services to PaddleOCR and supporting the rapid release of callable Restful API services. - Thanks [tangmq](https://gitee.com/tangmq) for contributing Dockerized deployment services to PaddleOCR and supporting the rapid release of callable Restful API services.
- Thanks [lijinhan](https://github.com/lijinhan) for contributing a new way, i.e., java SpringBoot, to achieve the request for the Hubserving deployment.
- Thanks [Mejans](https://github.com/Mejans) for contributing the Occitan corpus and character set.
@@ -4,11 +4,11 @@

PaddleOCR aims to build a rich, leading, and practical OCR toolkit that helps users train better models and put them into practice.

**Recent updates**
- 2020.10.26 Added 5 frequently asked questions to the [FAQ](./doc/doc_ch/FAQ.md), for a total of 94 FAQs and answers; updates are planned every Monday, so stay tuned.
- 2020.9.22 Updated the PP-OCR technical article, https://arxiv.org/abs/2009.09941
- 2020.9.19 Updated the ultra-lightweight compressed ppocr_mobile_slim series models, 3.5M overall (see [PP-OCR Pipeline](#PP-OCR)), suitable for mobile deployment. [Model Downloads](#模型下载)
- 2020.9.17 Updated the ultra-lightweight ppocr_mobile series and general ppocr_server series Chinese and English OCR models, comparable to commercial solutions. [Model Downloads](#模型下载)
- 2020.9.17 Updated the [English recognition model](./doc/doc_ch/models_list.md#英文识别模型) and [multilingual recognition models](doc/doc_ch/models_list.md#多语言识别模型); `German, French, Japanese and Korean` are now supported, and recognition models for more languages will continue to be updated.
- 2020.8.24 Support installing and using PaddleOCR via the whl package; see the [PaddleOCR Package guide](./doc/doc_ch/whl.md)
- 2020.8.21 Updated the replay and PPT of the August 18 Bilibili live lesson, Lesson 2: an easy-to-learn and easy-to-use OCR toolkit spree. [Get them here](https://aistudio.baidu.com/aistudio/education/group/info/1519)
- [More](./doc/doc_ch/update.md)
@@ -35,6 +35,15 @@ PaddleOCR aims to build a rich, leading, and practical OCR toolkit that helps

The image above shows the results of the general ppocr_server model. For more result images, see the [visualization page](./doc/doc_ch/visualization.md).

<a name="欢迎加入PaddleOCR技术交流群"></a>
## Join the PaddleOCR technical exchange group
- Scan the QR code below with WeChat to join the official exchange group, get more efficient answers to your questions, and discuss with developers from all walks of life. We look forward to your joining.

<div align="center">
<img src="./doc/joinus.PNG" width="200" height="200" />
</div>

## Quick Experience
- PC: online experience of the ultra-lightweight Chinese OCR: https://www.paddlepaddle.org.cn/hub/scene/ocr
@@ -65,7 +74,7 @@ PaddleOCR aims to build a rich, leading, and practical OCR toolkit that helps

- Algorithm introduction
    - [Text Detection](./doc/doc_ch/algorithm_overview.md)
    - [Text Recognition](./doc/doc_ch/algorithm_overview.md)
    - [PP-OCR Pipeline](#PP-OCR)
- Model training/evaluation
    - [Text Detection](./doc/doc_ch/detection.md)
    - [Text Recognition](./doc/doc_ch/recognition.md)

@@ -75,7 +84,7 @@ PaddleOCR aims to build a rich, leading, and practical OCR toolkit that helps

    - [Quick inference with the pip-installed whl package](./doc/doc_ch/whl.md)
    - [Inference with the Python prediction engine](./doc/doc_ch/inference.md)
    - [Inference with the C++ prediction engine](./deploy/cpp_infer/readme.md)
    - [Serving deployment](./doc/doc_ch/serving_inference.md)
    - [On-device deployment](./deploy/lite/readme.md)
    - [Model quantization](./deploy/slim/quantization/README.md)
    - [Model pruning](./deploy/slim/prune/README.md)
@@ -89,15 +98,15 @@ PaddleOCR aims to build a rich, leading, and practical OCR toolkit that helps

- [Visualization](#效果展示)
- FAQ
    - [[Featured] 10 selected OCR questions](./doc/doc_ch/FAQ.md)
    - [[Theory] 23 general OCR questions](./doc/doc_ch/FAQ.md)
    - [[Practice] 61 PaddleOCR questions](./doc/doc_ch/FAQ.md)
- [Technical exchange group](#欢迎加入PaddleOCR技术交流群)
- [References](./doc/doc_ch/reference.md)
- [License](#许可证书)
- [Contribution](#贡献代码)

<a name="PP-OCR"></a>
## PP-OCR Pipeline
<div align="center"> <div align="center">
<img src="./doc/ppocr_framework.png" width="800"> <img src="./doc/ppocr_framework.png" width="800">
</div> </div>
...@@ -125,13 +134,7 @@ PP-OCR是一个实用的超轻量OCR系统。主要由DB文本检测、检测框 ...@@ -125,13 +134,7 @@ PP-OCR是一个实用的超轻量OCR系统。主要由DB文本检测、检测框
<img src="./doc/imgs_results/1112.jpg" width="800"> <img src="./doc/imgs_results/1112.jpg" width="800">
</div> </div>
<a name="欢迎加入PaddleOCR技术交流群"></a>
## 欢迎加入PaddleOCR技术交流群
请扫描下面二维码,完成问卷填写,获取加群二维码和OCR方向的炼丹秘籍
<div align="center">
<img src="./doc/joinus.PNG" width = "200" height = "200" />
</div>
<a name="许可证书"></a> <a name="许可证书"></a>
## 许可证书 ## 许可证书
...@@ -148,3 +151,5 @@ PP-OCR是一个实用的超轻量OCR系统。主要由DB文本检测、检测框 ...@@ -148,3 +151,5 @@ PP-OCR是一个实用的超轻量OCR系统。主要由DB文本检测、检测框
- 非常感谢 [authorfu](https://github.com/authorfu) 贡献Android和[xiadeye](https://github.com/xiadeye) 贡献IOS的demo代码 - 非常感谢 [authorfu](https://github.com/authorfu) 贡献Android和[xiadeye](https://github.com/xiadeye) 贡献IOS的demo代码
- 非常感谢 [BeyondYourself](https://github.com/BeyondYourself) 给PaddleOCR提了很多非常棒的建议,并简化了PaddleOCR的部分代码风格。 - 非常感谢 [BeyondYourself](https://github.com/BeyondYourself) 给PaddleOCR提了很多非常棒的建议,并简化了PaddleOCR的部分代码风格。
- 非常感谢 [tangmq](https://gitee.com/tangmq) 给PaddleOCR增加Docker化部署服务,支持快速发布可调用的Restful API服务。 - 非常感谢 [tangmq](https://gitee.com/tangmq) 给PaddleOCR增加Docker化部署服务,支持快速发布可调用的Restful API服务。
- 非常感谢 [lijinhan](https://github.com/lijinhan) 给PaddleOCR增加java SpringBoot 调用OCR Hubserving接口完成对OCR服务化部署的使用。
- 非常感谢 [Mejans](https://github.com/Mejans) 给PaddleOCR增加新语言奥克西坦语Occitan的字典和语料。
@@ -18,5 +18,4 @@ TestReader:

  infer_img:
  img_set_dir: ./train_data/icdar2015/text_localization/
  label_file_path: ./train_data/icdar2015/text_localization/test_icdar2015_label.txt
  do_eval: True
@@ -51,7 +51,7 @@ Optimizer:

PostProcess:
  function: ppocr.postprocess.db_postprocess,DBPostProcess
  thresh: 0.3
  box_thresh: 0.5
  max_candidates: 1000
  unclip_ratio: 1.6
@@ -12,7 +12,7 @@ Global:

  image_shape: [3, 32, 320]
  max_text_length: 25
  character_type: french
  character_dict_path: ./ppocr/utils/dict/french_dict.txt
  loss_type: ctc
  distort: true
  use_space_char: false
...
@@ -12,7 +12,7 @@ Global:

  image_shape: [3, 32, 320]
  max_text_length: 25
  character_type: german
  character_dict_path: ./ppocr/utils/dict/german_dict.txt
  loss_type: ctc
  distort: true
  use_space_char: false
...
@@ -12,7 +12,7 @@ Global:

  image_shape: [3, 32, 320]
  max_text_length: 25
  character_type: japan
  character_dict_path: ./ppocr/utils/dict/japan_dict.txt
  loss_type: ctc
  distort: true
  use_space_char: false
...
@@ -12,7 +12,7 @@ Global:

  image_shape: [3, 32, 320]
  max_text_length: 25
  character_type: korean
  character_dict_path: ./ppocr/utils/dict/korean_dict.txt
  loss_type: ctc
  distort: true
  use_space_char: false
...
@@ -12,6 +12,7 @@ Global:

  image_shape: [3, 32, 100]
  max_text_length: 25
  character_type: en
  character_dict_path:
  loss_type: ctc
  reader_yml: ./configs/rec/rec_benchmark_reader.yml
  pretrain_weights:
...
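In these English benchmark configs, `character_dict_path:` is added but deliberately left empty. The assumption (consistent with common CRNN English benchmark setups) is that with `character_type: en` and no dictionary file, the recognizer falls back to a built-in lowercase alphanumeric charset:

```python
# Assumed fallback charset for character_type "en" with an empty
# character_dict_path: digits plus lowercase letters (36 symbols).
import string

default_en_charset = string.digits + string.ascii_lowercase
print(default_en_charset)       # 0123456789abcdefghijklmnopqrstuvwxyz
print(len(default_en_charset))  # 36
```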
@@ -12,6 +12,7 @@ Global:

  image_shape: [3, 32, 100]
  max_text_length: 25
  character_type: en
  character_dict_path:
  loss_type: ctc
  reader_yml: ./configs/rec/rec_benchmark_reader.yml
  pretrain_weights:
...
@@ -12,6 +12,7 @@ Global:

  image_shape: [3, 32, 100]
  max_text_length: 25
  character_type: en
  character_dict_path:
  loss_type: attention
  tps: true
  reader_yml: ./configs/rec/rec_benchmark_reader.yml
...
@@ -12,6 +12,7 @@ Global:

  image_shape: [3, 32, 100]
  max_text_length: 25
  character_type: en
  character_dict_path:
  loss_type: ctc
  tps: true
  reader_yml: ./configs/rec/rec_benchmark_reader.yml
...
@@ -12,6 +12,7 @@ Global:

  image_shape: [3, 32, 100]
  max_text_length: 25
  character_type: en
  character_dict_path:
  loss_type: ctc
  reader_yml: ./configs/rec/rec_benchmark_reader.yml
  pretrain_weights:
...
@@ -12,6 +12,7 @@ Global:

  image_shape: [3, 32, 100]
  max_text_length: 25
  character_type: en
  character_dict_path:
  loss_type: ctc
  reader_yml: ./configs/rec/rec_benchmark_reader.yml
  pretrain_weights:
...
@@ -12,6 +12,7 @@ Global:

  image_shape: [3, 32, 100]
  max_text_length: 25
  character_type: en
  character_dict_path:
  loss_type: attention
  tps: true
  reader_yml: ./configs/rec/rec_benchmark_reader.yml
...
@@ -12,6 +12,7 @@ Global:

  image_shape: [3, 32, 100]
  max_text_length: 25
  character_type: en
  character_dict_path:
  loss_type: ctc
  tps: true
  reader_yml: ./configs/rec/rec_benchmark_reader.yml
...
@@ -12,6 +12,7 @@ Global:

  image_shape: [1, 64, 256]
  max_text_length: 25
  character_type: en
  character_dict_path:
  loss_type: srn
  num_heads: 8
  average_window: 0.15
...
@@ -2,7 +2,7 @@

### 1. Install the latest version of Android Studio
It can be downloaded from https://developer.android.com/studio. This demo was written with Android Studio 4.0.

### 2. Install NDK 20 or above
The demo was tested with NDK 20b; any NDK version 20 or above compiles successfully.

If you are a beginner, you can install and test the NDK build environment as follows.

@@ -17,3 +17,10 @@ The demo was tested with NDK 20b; any NDK version 20 or above compiles

- Demo APP: install it by scanning the QR code with your phone, a convenient way to quickly try text recognition on mobile
- SDK: the models are packaged as SDKs adapted to different chip hardware and operating systems, with complete interfaces for easy secondary development

# FAQ:
Q1: After updating to the v1.1 models, the demo reports an error?

A1: If you switch to the V1.1 models, please update the prediction library files along with the models; the [PaddleLite 2.6.3](https://github.com/PaddlePaddle/Paddle-Lite/releases/tag/v2.6.3) prediction library files are recommended. For OCR mobile deployment, see the [tutorial](../lite/readme.md).
@@ -35,12 +35,6 @@ public class OCRPredictorNative {

    }

    public ArrayList<OcrResultModel> runImage(float[] inputData, int width, int height, int channels, Bitmap originalImage) {
        Log.i("OCRPredictorNative", "begin to run image " + inputData.length + " " + width + " " + height);

@@ -63,7 +57,7 @@ public class OCRPredictorNative {

    protected native float[] forward(long pointer, float[] buf, float[] ddims, Bitmap originalImage);

    public native void release(long pointer);

    private ArrayList<OcrResultModel> postprocess(float[] raw) {
        ArrayList<OcrResultModel> results = new ArrayList<OcrResultModel>();
...
@@ -59,7 +59,7 @@ public:

class ClsResizeImg {
public:
  virtual void Run(const cv::Mat &img, cv::Mat &resize_img,
                   const std::vector<int> &rec_image_shape = {3, 48, 192});
};

} // namespace PaddleOCR
\ No newline at end of file
@@ -21,7 +21,7 @@ cv::Mat Classifier::Run(cv::Mat &img) {

  img.copyTo(src_img);
  cv::Mat resize_img;

  std::vector<int> rec_image_shape = {3, 48, 192};
  int index = 0;
  float wh_ratio = float(img.cols) / float(img.rows);
...
@@ -85,7 +85,7 @@ void ResizeImgType0::Run(const cv::Mat &img, cv::Mat &resize_img,

  if (resize_w % 32 == 0)
    resize_w = resize_w;
  else if (resize_w / 32 < 1 + 1e-5)
    resize_w = 32;
  else
    resize_w = (resize_w / 32 - 1) * 32;

@@ -138,4 +138,4 @@ void ClsResizeImg::Run(const cv::Mat &img, cv::Mat &resize_img,

  }
}
} // namespace PaddleOCR
\ No newline at end of file
@@ -3,11 +3,14 @@ from __future__ import absolute_import

from __future__ import division
from __future__ import print_function

import os
import sys
sys.path.insert(0, ".")

import argparse
import ast
import copy
import math
import time

from paddle.fluid.core import AnalysisConfig, create_paddle_predictor, PaddleTensor

@@ -67,9 +70,7 @@ class OCRDet(hub.Module):

            images.append(img)
        return images

    def predict(self, images=[], paths=[]):
        """
        Get the text box in the predicted images.
        Args:

@@ -87,7 +88,7 @@ class OCRDet(hub.Module):

            raise TypeError("The input data is inconsistent with expectations.")

        assert predicted_data != [], "There is not any image to be predicted. Please check the input data."

        all_results = []
        for img in predicted_data:
            if img is None:

@@ -99,11 +100,9 @@ class OCRDet(hub.Module):

            rec_res_final = []
            for dno in range(len(dt_boxes)):
                rec_res_final.append({
                    'text_region': dt_boxes[dno].astype(np.int).tolist()
                })
            all_results.append(rec_res_final)
        return all_results

@@ -116,7 +115,7 @@ class OCRDet(hub.Module):

        results = self.predict(images_decode, **kwargs)
        return results


if __name__ == '__main__':
    ocr = OCRDet()
    image_path = [

@@ -124,4 +123,4 @@ if __name__ == '__main__':

        './doc/imgs/12.jpg',
    ]
    res = ocr.predict(paths=image_path)
    print(res)
\ No newline at end of file
@@ -3,11 +3,14 @@ from __future__ import absolute_import

from __future__ import division
from __future__ import print_function

import os
import sys
sys.path.insert(0, ".")

import argparse
import ast
import copy
import math
import time

from paddle.fluid.core import AnalysisConfig, create_paddle_predictor, PaddleTensor

@@ -67,9 +70,7 @@ class OCRRec(hub.Module):

            images.append(img)
        return images

    def predict(self, images=[], paths=[]):
        """
        Get the text box in the predicted images.
        Args:

@@ -87,31 +88,28 @@ class OCRRec(hub.Module):

            raise TypeError("The input data is inconsistent with expectations.")

        assert predicted_data != [], "There is not any image to be predicted. Please check the input data."

        img_list = []
        for img in predicted_data:
            if img is None:
                continue
            img_list.append(img)

        rec_res_final = []
        try:
            rec_res, predict_time = self.text_recognizer(img_list)
            for dno in range(len(rec_res)):
                text, score = rec_res[dno]
                rec_res_final.append({
                    'text': text,
                    'confidence': float(score),
                })
        except Exception as e:
            print(e)
            return [[]]
        return [rec_res_final]

    @serving
    def serving_method(self, images, **kwargs):
        """

@@ -121,7 +119,7 @@ class OCRRec(hub.Module):

        results = self.predict(images_decode, **kwargs)
        return results


if __name__ == '__main__':
    ocr = OCRRec()
    image_path = [

@@ -130,4 +128,4 @@ if __name__ == '__main__':

        './doc/imgs_words/ch/word_3.jpg',
    ]
    res = ocr.predict(paths=image_path)
    print(res)
\ No newline at end of file
@@ -3,11 +3,14 @@ from __future__ import absolute_import

from __future__ import division
from __future__ import print_function

import os
import sys
sys.path.insert(0, ".")

import argparse
import ast
import copy
import math
import time

from paddle.fluid.core import AnalysisConfig, create_paddle_predictor, PaddleTensor

@@ -52,7 +55,7 @@ class OCRSystem(hub.Module):

        )
        cfg.ir_optim = True
        cfg.enable_mkldnn = enable_mkldnn

        self.text_sys = TextSystem(cfg)

    def read_images(self, paths=[]):

@@ -67,9 +70,7 @@ class OCRSystem(hub.Module):

            images.append(img)
        return images

    def predict(self, images=[], paths=[]):
        """
        Get the chinese texts in the predicted images.
        Args:

@@ -104,13 +105,11 @@ class OCRSystem(hub.Module):

            for dno in range(dt_num):
                text, score = rec_res[dno]
                rec_res_final.append({
                    'text': text,
                    'confidence': float(score),
                    'text_region': dt_boxes[dno].astype(np.int).tolist()
                })
            all_results.append(rec_res_final)
        return all_results

@@ -123,7 +122,7 @@ class OCRSystem(hub.Module):

        results = self.predict(images_decode, **kwargs)
        return results


if __name__ == '__main__':
    ocr = OCRSystem()
    image_path = [

@@ -131,4 +130,4 @@ if __name__ == '__main__':

        './doc/imgs/12.jpg',
    ]
    res = ocr.predict(paths=image_path)
    print(res)
\ No newline at end of file
[English](readme_en.md) | 简体中文

PaddleOCR provides 2 service deployment methods:
- Deployment based on PaddleHub Serving: the code is under "`./deploy/hubserving`"; follow this tutorial.
- Deployment based on PaddleServing: the code is under "`./deploy/pdserving`"; see the [documentation](../../doc/doc_ch/serving_inference.md) for usage.

# Service deployment based on PaddleHub Serving

@@ -29,12 +29,6 @@ deploy/hubserving/ocr_system/

```shell
# Install paddlehub
pip3 install paddlehub --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple
```

### 2. Download the inference models
...
English | [简体中文](readme.md)

PaddleOCR provides 2 service deployment methods:
- Based on **PaddleHub Serving**: Code path is "`./deploy/hubserving`". Please follow this tutorial.
- Based on **PaddleServing**: Code path is "`./deploy/pdserving`". Please refer to the [tutorial](../../doc/doc_ch/serving_inference.md) for usage.

# Service deployment based on PaddleHub Serving

@@ -30,12 +30,6 @@ The following steps take the 2-stage series service as an example. If only the d

```shell
# Install paddlehub
pip3 install paddlehub --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple
```
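Once a module such as `ocr_system` has been installed and the service started (covered in the later steps of this tutorial), it can be called over HTTP. A hedged client sketch, assuming the default port 8866 and the `predict/<module_name>` route:

```python
# Hedged client sketch for the PaddleHub Serving OCR service; the port and
# route are assumptions based on the default hub serving setup.
import base64
import json

import requests

with open("doc/imgs/11.jpg", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode("utf-8")

resp = requests.post(
    "http://127.0.0.1:8866/predict/ocr_system",
    headers={"Content-Type": "application/json"},
    data=json.dumps({"images": [img_b64]}),
)
print(resp.json())
```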
### 2. Download inference model
...
@@ -25,17 +25,16 @@ Paddle Lite is PaddlePaddle's lightweight inference engine, providing efficient inference for mobile and IoT

- 1. Direct download; the download links for the prediction libraries are as follows:

| Platform | Prediction library download link |
|-|-|
| Android | [arm7](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.6.3/inference_lite_lib.android.armv7.gcc.c++_shared.with_extra.with_cv.tar.gz) / [arm8](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.6.3/inference_lite_lib.android.armv8.gcc.c++_shared.with_extra.with_cv.tar.gz) |
| IOS | [arm7](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.6.3/inference_lite_lib.ios.armv7.with_cv.with_extra.with_log.tiny_publish.tar.gz) / [arm8](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.6.3/inference_lite_lib.ios.armv8.with_cv.with_extra.with_log.tiny_publish.tar.gz) |

Note: 1. The prediction libraries above were built from the PaddleLite 2.6.3 branch; for details about PaddleLite 2.6.3, see [this link](https://github.com/PaddlePaddle/Paddle-Lite/releases/tag/v2.6.3).

- 2. [Recommended] Build Paddle-Lite from source to obtain the prediction library, as follows:
```
git clone https://github.com/PaddlePaddle/Paddle-Lite.git
cd Paddle-Lite
# switch to the stable Paddle-Lite develop branch
git checkout develop
./lite/tools/build_android.sh --arch=armv8 --with_cv=ON --with_extra=ON
```
@@ -101,6 +100,8 @@ Paddle-Lite provides multiple strategies to automatically optimize the original model, including

```
git clone https://github.com/PaddlePaddle/Paddle-Lite.git
cd Paddle-Lite
git checkout develop
# switch to the fixed commit
git reset --hard 55c53482bcdd2868373d024dd1144e4c5ec0e6b8
# start building
./lite/tools/build.sh build_optimize_tool
```
@@ -221,11 +222,11 @@ demo/cxx/ocr/

1. ppocr_keys_v1.txt is the Chinese dictionary file. If the nb model used is for English/digits or another language, it needs to be replaced with the dictionary of the corresponding language. PaddleOCR stores a variety of dictionaries under ppocr/utils/, including:
```
dict/french_dict.txt  # French dictionary
dict/german_dict.txt  # German dictionary
ic15_dict.txt         # English dictionary
dict/japan_dict.txt   # Japanese dictionary
dict/korean_dict.txt  # Korean dictionary
ppocr_keys_v1.txt     # Chinese dictionary
```
@@ -235,7 +236,7 @@ max_side_len 960 # when the input image's height/width exceeds 960, scale it down proportionally

det_db_thresh 0.3 # used to filter DB's binarized map; values of 0.0-0.3 have no obvious effect on the result
det_db_box_thresh 0.5 # DB post-processing box-filtering threshold; reduce it if boxes are missed
det_db_unclip_ratio 1.6 # tightness of the text box; the smaller the value, the closer the box hugs the text
use_direction_classify 0 # whether to use the direction classifier: 0 = no, 1 = yes
```
5. Start debugging

@@ -253,7 +254,7 @@ use_direction_classify 1 # whether to use the direction classifier; 0 = no, 1

adb push debug /data/local/tmp/
adb shell
cd /data/local/tmp/debug
export LD_LIBRARY_PATH=${PWD}:$LD_LIBRARY_PATH
./ocr_db_crnn ch_ppocr_mobile_v1.1_det_prune_opt.nb ch_ppocr_mobile_v1.1_rec_quant_opt.nb ch_ppocr_mobile_cls_quant_opt.nb ./11.jpg ppocr_keys_v1.txt
```
...
# Tutorial of PaddleOCR Mobile deployment

This tutorial will introduce how to use [paddle-lite](https://github.com/PaddlePaddle/Paddle-Lite) to deploy PaddleOCR ultra-lightweight Chinese and English detection models on mobile phones.

paddle-lite is a lightweight inference engine for PaddlePaddle. It provides efficient inference capabilities for mobile phones and IoT, and extensively integrates cross-platform hardware to provide lightweight deployment solutions for end-side deployment issues.
@@ -22,10 +22,10 @@ deployment solutions for end-side deployment issues.

| Platform | Prebuilt library download link |
|-|-|
| Android | [arm7](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.6.3/inference_lite_lib.android.armv7.gcc.c++_shared.with_extra.with_cv.tar.gz) / [arm8](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.6.3/inference_lite_lib.android.armv8.gcc.c++_shared.with_extra.with_cv.tar.gz) |
| IOS | [arm7](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.6.3/inference_lite_lib.ios.armv7.with_cv.with_extra.with_log.tiny_publish.tar.gz) / [arm8](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.6.3/inference_lite_lib.ios.armv8.with_cv.with_extra.with_log.tiny_publish.tar.gz) |

Note: The above prebuilt inference libraries are compiled from the PaddleLite `release/2.6.3` branch. For more information about PaddleLite 2.6.3, please refer to [this link](https://github.com/PaddlePaddle/Paddle-Lite/releases/tag/v2.6.3).
The structure of the prediction library is as follows:

@@ -65,11 +65,11 @@ If you have prepared the model file ending in `.nb`, you can skip this step.

The following table also provides a series of models that can be deployed on mobile phones to recognize Chinese. You can directly download the optimized models.

| Version | Introduction | Model size | Detection model | Text Direction model | Recognition model | Paddle Lite branch |
| - | - | - | - | - | - | - |
| V1.1 | extra-lightweight chinese OCR optimized model | 8.1M | [Download](https://paddleocr.bj.bcebos.com/20-09-22/mobile/lite/ch_ppocr_mobile_v1.1_det_opt.nb) | [Download](https://paddleocr.bj.bcebos.com/20-09-22/mobile/lite/ch_ppocr_mobile_v1.1_cls_opt.nb) | [Download](https://paddleocr.bj.bcebos.com/20-09-22/mobile/lite/ch_ppocr_mobile_v1.1_rec_opt.nb) | develop |
| [slim] V1.1 | extra-lightweight chinese OCR optimized model | 3.5M | [Download](https://paddleocr.bj.bcebos.com/20-09-22/mobile/lite/ch_ppocr_mobile_v1.1_det_prune_opt.nb) | [Download](https://paddleocr.bj.bcebos.com/20-09-22/mobile/lite/ch_ppocr_mobile_v1.1_cls_quant_opt.nb) | [Download](https://paddleocr.bj.bcebos.com/20-09-22/mobile/lite/ch_ppocr_mobile_v1.1_rec_quant_opt.nb) | develop |
| V1.0 | lightweight Chinese OCR optimized model | 8.6M | [Download](https://paddleocr.bj.bcebos.com/20-09-22/mobile/lite/ch_ppocr_mobile_v1.0_det_opt.nb) | - | [Download](https://paddleocr.bj.bcebos.com/20-09-22/mobile/lite/ch_ppocr_mobile_v1.0_rec_opt.nb) | develop |
If the model to be deployed is not in the above table, you need to follow the steps below to obtain the optimized model.

@@ -77,6 +77,8 @@ If the model to be deployed is not in the above table, you need to follow the st

git clone https://github.com/PaddlePaddle/Paddle-Lite.git
cd Paddle-Lite
git checkout develop
# switch to the specified commit
git reset --hard 55c53482bcdd2868373d024dd1144e4c5ec0e6b8
./lite/tools/build.sh build_optimize_tool
```
@@ -185,11 +187,11 @@ demo/cxx/ocr/

If the nb model is used for English recognition or other language recognition, the dictionary file should be replaced with a dictionary of the corresponding language. PaddleOCR provides a variety of dictionaries under ppocr/utils/, including:
```
dict/french_dict.txt  # french
dict/german_dict.txt  # german
ic15_dict.txt         # english
dict/japan_dict.txt   # japan
dict/korean_dict.txt  # korean
ppocr_keys_v1.txt     # chinese
```
@@ -199,7 +201,7 @@ max_side_len 960 # Limit the maximum image height and width to 960

det_db_thresh 0.3 # Used to filter the binarized image of DB prediction; setting it to 0.0-0.3 has no obvious effect on the result
det_db_box_thresh 0.5 # DB post-processing filter box threshold; if boxes are missed, it can be reduced as appropriate
det_db_unclip_ratio 1.6 # Indicates the compactness of the text box; the smaller the value, the closer the text box is to the text
use_direction_classify 0 # Whether to use the direction classifier, 0 means not to use, 1 means to use
```
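For intuition on `det_db_unclip_ratio`: DB predicts a shrunk text region, and post-processing expands it by an offset proportional to area/perimeter, so a larger ratio gives looser boxes. A hedged sketch of this unclip step, assuming `shapely` and `pyclipper` as in common DBPostProcess implementations (not the exact repository code):

```python
# Hedged sketch of DB's "unclip" step: expand the shrunk text polygon by an
# offset proportional to area/perimeter, scaled by det_db_unclip_ratio.
import numpy as np
import pyclipper
from shapely.geometry import Polygon

def unclip(box: np.ndarray, unclip_ratio: float = 1.6) -> np.ndarray:
    poly = Polygon(box)
    distance = poly.area * unclip_ratio / poly.length  # larger ratio -> looser box
    offset = pyclipper.PyclipperOffset()
    offset.AddPath(box.astype(int).tolist(), pyclipper.JT_ROUND, pyclipper.ET_CLOSEDPOLYGON)
    return np.array(offset.Execute(distance)[0])

quad = np.array([[10, 10], [110, 10], [110, 40], [10, 40]])
print(unclip(quad))  # vertices of the expanded polygon
```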
5. Run Model on phone
...
#!/bin/sh
# ----------------------------------------------------------------------------
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
# ----------------------------------------------------------------------------
# ----------------------------------------------------------------------------
# Maven Start Up Batch script
#
# Required ENV vars:
# ------------------
# JAVA_HOME - location of a JDK home dir
#
# Optional ENV vars
# -----------------
# M2_HOME - location of maven2's installed home dir
# MAVEN_OPTS - parameters passed to the Java VM when running Maven
# e.g. to debug Maven itself, use
# set MAVEN_OPTS=-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=8000
# MAVEN_SKIP_RC - flag to disable loading of mavenrc files
# ----------------------------------------------------------------------------
if [ -z "$MAVEN_SKIP_RC" ] ; then
if [ -f /etc/mavenrc ] ; then
. /etc/mavenrc
fi
if [ -f "$HOME/.mavenrc" ] ; then
. "$HOME/.mavenrc"
fi
fi
# OS specific support. $var _must_ be set to either true or false.
cygwin=false;
darwin=false;
mingw=false
case "`uname`" in
CYGWIN*) cygwin=true ;;
MINGW*) mingw=true;;
Darwin*) darwin=true
# Use /usr/libexec/java_home if available, otherwise fall back to /Library/Java/Home
# See https://developer.apple.com/library/mac/qa/qa1170/_index.html
if [ -z "$JAVA_HOME" ]; then
if [ -x "/usr/libexec/java_home" ]; then
export JAVA_HOME="`/usr/libexec/java_home`"
else
export JAVA_HOME="/Library/Java/Home"
fi
fi
;;
esac
if [ -z "$JAVA_HOME" ] ; then
if [ -r /etc/gentoo-release ] ; then
JAVA_HOME=`java-config --jre-home`
fi
fi
if [ -z "$M2_HOME" ] ; then
## resolve links - $0 may be a link to maven's home
PRG="$0"
# need this for relative symlinks
while [ -h "$PRG" ] ; do
ls=`ls -ld "$PRG"`
link=`expr "$ls" : '.*-> \(.*\)$'`
if expr "$link" : '/.*' > /dev/null; then
PRG="$link"
else
PRG="`dirname "$PRG"`/$link"
fi
done
saveddir=`pwd`
M2_HOME=`dirname "$PRG"`/..
# make it fully qualified
M2_HOME=`cd "$M2_HOME" && pwd`
cd "$saveddir"
# echo Using m2 at $M2_HOME
fi
# For Cygwin, ensure paths are in UNIX format before anything is touched
if $cygwin ; then
[ -n "$M2_HOME" ] &&
M2_HOME=`cygpath --unix "$M2_HOME"`
[ -n "$JAVA_HOME" ] &&
JAVA_HOME=`cygpath --unix "$JAVA_HOME"`
[ -n "$CLASSPATH" ] &&
CLASSPATH=`cygpath --path --unix "$CLASSPATH"`
fi
# For Mingw, ensure paths are in UNIX format before anything is touched
if $mingw ; then
[ -n "$M2_HOME" ] &&
M2_HOME="`(cd "$M2_HOME"; pwd)`"
[ -n "$JAVA_HOME" ] &&
JAVA_HOME="`(cd "$JAVA_HOME"; pwd)`"
fi
if [ -z "$JAVA_HOME" ]; then
javaExecutable="`which javac`"
if [ -n "$javaExecutable" ] && ! [ "`expr \"$javaExecutable\" : '\([^ ]*\)'`" = "no" ]; then
# readlink(1) is not available as standard on Solaris 10.
readLink=`which readlink`
if [ ! `expr "$readLink" : '\([^ ]*\)'` = "no" ]; then
if $darwin ; then
javaHome="`dirname \"$javaExecutable\"`"
javaExecutable="`cd \"$javaHome\" && pwd -P`/javac"
else
javaExecutable="`readlink -f \"$javaExecutable\"`"
fi
javaHome="`dirname \"$javaExecutable\"`"
javaHome=`expr "$javaHome" : '\(.*\)/bin'`
JAVA_HOME="$javaHome"
export JAVA_HOME
fi
fi
fi
if [ -z "$JAVACMD" ] ; then
if [ -n "$JAVA_HOME" ] ; then
if [ -x "$JAVA_HOME/jre/sh/java" ] ; then
# IBM's JDK on AIX uses strange locations for the executables
JAVACMD="$JAVA_HOME/jre/sh/java"
else
JAVACMD="$JAVA_HOME/bin/java"
fi
else
JAVACMD="`which java`"
fi
fi
if [ ! -x "$JAVACMD" ] ; then
echo "Error: JAVA_HOME is not defined correctly." >&2
echo " We cannot execute $JAVACMD" >&2
exit 1
fi
if [ -z "$JAVA_HOME" ] ; then
echo "Warning: JAVA_HOME environment variable is not set."
fi
CLASSWORLDS_LAUNCHER=org.codehaus.plexus.classworlds.launcher.Launcher
# traverses directory structure from process work directory to filesystem root
# first directory with .mvn subdirectory is considered project base directory
find_maven_basedir() {
if [ -z "$1" ]
then
echo "Path not specified to find_maven_basedir"
return 1
fi
basedir="$1"
wdir="$1"
while [ "$wdir" != '/' ] ; do
if [ -d "$wdir"/.mvn ] ; then
basedir=$wdir
break
fi
# workaround for JBEAP-8937 (on Solaris 10/Sparc)
if [ -d "${wdir}" ]; then
wdir=`cd "$wdir/.."; pwd`
fi
# end of workaround
done
echo "${basedir}"
}
# concatenates all lines of a file
concat_lines() {
if [ -f "$1" ]; then
echo "$(tr -s '\n' ' ' < "$1")"
fi
}
BASE_DIR=`find_maven_basedir "$(pwd)"`
if [ -z "$BASE_DIR" ]; then
exit 1;
fi
##########################################################################################
# Extension to allow automatically downloading the maven-wrapper.jar from Maven-central
# This allows using the maven wrapper in projects that prohibit checking in binary data.
##########################################################################################
if [ -r "$BASE_DIR/.mvn/wrapper/maven-wrapper.jar" ]; then
if [ "$MVNW_VERBOSE" = true ]; then
echo "Found .mvn/wrapper/maven-wrapper.jar"
fi
else
if [ "$MVNW_VERBOSE" = true ]; then
echo "Couldn't find .mvn/wrapper/maven-wrapper.jar, downloading it ..."
fi
if [ -n "$MVNW_REPOURL" ]; then
jarUrl="$MVNW_REPOURL/io/takari/maven-wrapper/0.5.6/maven-wrapper-0.5.6.jar"
else
jarUrl="https://repo.maven.apache.org/maven2/io/takari/maven-wrapper/0.5.6/maven-wrapper-0.5.6.jar"
fi
while IFS="=" read key value; do
case "$key" in (wrapperUrl) jarUrl="$value"; break ;;
esac
done < "$BASE_DIR/.mvn/wrapper/maven-wrapper.properties"
if [ "$MVNW_VERBOSE" = true ]; then
echo "Downloading from: $jarUrl"
fi
wrapperJarPath="$BASE_DIR/.mvn/wrapper/maven-wrapper.jar"
if $cygwin; then
wrapperJarPath=`cygpath --path --windows "$wrapperJarPath"`
fi
if command -v wget > /dev/null; then
if [ "$MVNW_VERBOSE" = true ]; then
echo "Found wget ... using wget"
fi
if [ -z "$MVNW_USERNAME" ] || [ -z "$MVNW_PASSWORD" ]; then
wget "$jarUrl" -O "$wrapperJarPath"
else
wget --http-user=$MVNW_USERNAME --http-password=$MVNW_PASSWORD "$jarUrl" -O "$wrapperJarPath"
fi
elif command -v curl > /dev/null; then
if [ "$MVNW_VERBOSE" = true ]; then
echo "Found curl ... using curl"
fi
if [ -z "$MVNW_USERNAME" ] || [ -z "$MVNW_PASSWORD" ]; then
curl -o "$wrapperJarPath" "$jarUrl" -f
else
curl --user $MVNW_USERNAME:$MVNW_PASSWORD -o "$wrapperJarPath" "$jarUrl" -f
fi
else
if [ "$MVNW_VERBOSE" = true ]; then
echo "Falling back to using Java to download"
fi
javaClass="$BASE_DIR/.mvn/wrapper/MavenWrapperDownloader.java"
# For Cygwin, switch paths to Windows format before running javac
if $cygwin; then
javaClass=`cygpath --path --windows "$javaClass"`
fi
if [ -e "$javaClass" ]; then
if [ ! -e "$BASE_DIR/.mvn/wrapper/MavenWrapperDownloader.class" ]; then
if [ "$MVNW_VERBOSE" = true ]; then
echo " - Compiling MavenWrapperDownloader.java ..."
fi
# Compiling the Java class
("$JAVA_HOME/bin/javac" "$javaClass")
fi
if [ -e "$BASE_DIR/.mvn/wrapper/MavenWrapperDownloader.class" ]; then
# Running the downloader
if [ "$MVNW_VERBOSE" = true ]; then
echo " - Running MavenWrapperDownloader.java ..."
fi
("$JAVA_HOME/bin/java" -cp .mvn/wrapper MavenWrapperDownloader "$MAVEN_PROJECTBASEDIR")
fi
fi
fi
fi
##########################################################################################
# End of extension
##########################################################################################
export MAVEN_PROJECTBASEDIR=${MAVEN_BASEDIR:-"$BASE_DIR"}
if [ "$MVNW_VERBOSE" = true ]; then
echo $MAVEN_PROJECTBASEDIR
fi
MAVEN_OPTS="$(concat_lines "$MAVEN_PROJECTBASEDIR/.mvn/jvm.config") $MAVEN_OPTS"
# For Cygwin, switch paths to Windows format before running java
if $cygwin; then
[ -n "$M2_HOME" ] &&
M2_HOME=`cygpath --path --windows "$M2_HOME"`
[ -n "$JAVA_HOME" ] &&
JAVA_HOME=`cygpath --path --windows "$JAVA_HOME"`
[ -n "$CLASSPATH" ] &&
CLASSPATH=`cygpath --path --windows "$CLASSPATH"`
[ -n "$MAVEN_PROJECTBASEDIR" ] &&
MAVEN_PROJECTBASEDIR=`cygpath --path --windows "$MAVEN_PROJECTBASEDIR"`
fi
# Provide a "standardized" way to retrieve the CLI args that will
# work with both Windows and non-Windows executions.
MAVEN_CMD_LINE_ARGS="$MAVEN_CONFIG $@"
export MAVEN_CMD_LINE_ARGS
WRAPPER_LAUNCHER=org.apache.maven.wrapper.MavenWrapperMain
exec "$JAVACMD" \
$MAVEN_OPTS \
-classpath "$MAVEN_PROJECTBASEDIR/.mvn/wrapper/maven-wrapper.jar" \
"-Dmaven.home=${M2_HOME}" "-Dmaven.multiModuleProjectDirectory=${MAVEN_PROJECTBASEDIR}" \
${WRAPPER_LAUNCHER} $MAVEN_CONFIG "$@"
@REM ----------------------------------------------------------------------------
@REM Licensed to the Apache Software Foundation (ASF) under one
@REM or more contributor license agreements. See the NOTICE file
@REM distributed with this work for additional information
@REM regarding copyright ownership. The ASF licenses this file
@REM to you under the Apache License, Version 2.0 (the
@REM "License"); you may not use this file except in compliance
@REM with the License. You may obtain a copy of the License at
@REM
@REM https://www.apache.org/licenses/LICENSE-2.0
@REM
@REM Unless required by applicable law or agreed to in writing,
@REM software distributed under the License is distributed on an
@REM "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
@REM KIND, either express or implied. See the License for the
@REM specific language governing permissions and limitations
@REM under the License.
@REM ----------------------------------------------------------------------------
@REM ----------------------------------------------------------------------------
@REM Maven Start Up Batch script
@REM
@REM Required ENV vars:
@REM JAVA_HOME - location of a JDK home dir
@REM
@REM Optional ENV vars
@REM M2_HOME - location of maven2's installed home dir
@REM MAVEN_BATCH_ECHO - set to 'on' to enable the echoing of the batch commands
@REM MAVEN_BATCH_PAUSE - set to 'on' to wait for a keystroke before ending
@REM MAVEN_OPTS - parameters passed to the Java VM when running Maven
@REM e.g. to debug Maven itself, use
@REM set MAVEN_OPTS=-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=8000
@REM MAVEN_SKIP_RC - flag to disable loading of mavenrc files
@REM ----------------------------------------------------------------------------
@REM Begin all REM lines with '@' in case MAVEN_BATCH_ECHO is 'on'
@echo off
@REM set title of command window
title %0
@REM enable echoing by setting MAVEN_BATCH_ECHO to 'on'
@if "%MAVEN_BATCH_ECHO%" == "on" echo %MAVEN_BATCH_ECHO%
@REM set %HOME% to equivalent of $HOME
if "%HOME%" == "" (set "HOME=%HOMEDRIVE%%HOMEPATH%")
@REM Execute a user defined script before this one
if not "%MAVEN_SKIP_RC%" == "" goto skipRcPre
@REM check for pre script, once with legacy .bat ending and once with .cmd ending
if exist "%HOME%\mavenrc_pre.bat" call "%HOME%\mavenrc_pre.bat"
if exist "%HOME%\mavenrc_pre.cmd" call "%HOME%\mavenrc_pre.cmd"
:skipRcPre
@setlocal
set ERROR_CODE=0
@REM To isolate internal variables from possible post scripts, we use another setlocal
@setlocal
@REM ==== START VALIDATION ====
if not "%JAVA_HOME%" == "" goto OkJHome
echo.
echo Error: JAVA_HOME not found in your environment. >&2
echo Please set the JAVA_HOME variable in your environment to match the >&2
echo location of your Java installation. >&2
echo.
goto error
:OkJHome
if exist "%JAVA_HOME%\bin\java.exe" goto init
echo.
echo Error: JAVA_HOME is set to an invalid directory. >&2
echo JAVA_HOME = "%JAVA_HOME%" >&2
echo Please set the JAVA_HOME variable in your environment to match the >&2
echo location of your Java installation. >&2
echo.
goto error
@REM ==== END VALIDATION ====
:init
@REM Find the project base dir, i.e. the directory that contains the folder ".mvn".
@REM Fallback to current working directory if not found.
set MAVEN_PROJECTBASEDIR=%MAVEN_BASEDIR%
IF NOT "%MAVEN_PROJECTBASEDIR%"=="" goto endDetectBaseDir
set EXEC_DIR=%CD%
set WDIR=%EXEC_DIR%
:findBaseDir
IF EXIST "%WDIR%"\.mvn goto baseDirFound
cd ..
IF "%WDIR%"=="%CD%" goto baseDirNotFound
set WDIR=%CD%
goto findBaseDir
:baseDirFound
set MAVEN_PROJECTBASEDIR=%WDIR%
cd "%EXEC_DIR%"
goto endDetectBaseDir
:baseDirNotFound
set MAVEN_PROJECTBASEDIR=%EXEC_DIR%
cd "%EXEC_DIR%"
:endDetectBaseDir
IF NOT EXIST "%MAVEN_PROJECTBASEDIR%\.mvn\jvm.config" goto endReadAdditionalConfig
@setlocal EnableExtensions EnableDelayedExpansion
for /F "usebackq delims=" %%a in ("%MAVEN_PROJECTBASEDIR%\.mvn\jvm.config") do set JVM_CONFIG_MAVEN_PROPS=!JVM_CONFIG_MAVEN_PROPS! %%a
@endlocal & set JVM_CONFIG_MAVEN_PROPS=%JVM_CONFIG_MAVEN_PROPS%
:endReadAdditionalConfig
SET MAVEN_JAVA_EXE="%JAVA_HOME%\bin\java.exe"
set WRAPPER_JAR="%MAVEN_PROJECTBASEDIR%\.mvn\wrapper\maven-wrapper.jar"
set WRAPPER_LAUNCHER=org.apache.maven.wrapper.MavenWrapperMain
set DOWNLOAD_URL="https://repo.maven.apache.org/maven2/io/takari/maven-wrapper/0.5.6/maven-wrapper-0.5.6.jar"
FOR /F "tokens=1,2 delims==" %%A IN ("%MAVEN_PROJECTBASEDIR%\.mvn\wrapper\maven-wrapper.properties") DO (
IF "%%A"=="wrapperUrl" SET DOWNLOAD_URL=%%B
)
@REM Extension to allow automatically downloading the maven-wrapper.jar from Maven-central
@REM This allows using the maven wrapper in projects that prohibit checking in binary data.
if exist %WRAPPER_JAR% (
if "%MVNW_VERBOSE%" == "true" (
echo Found %WRAPPER_JAR%
)
) else (
if not "%MVNW_REPOURL%" == "" (
SET DOWNLOAD_URL="%MVNW_REPOURL%/io/takari/maven-wrapper/0.5.6/maven-wrapper-0.5.6.jar"
)
if "%MVNW_VERBOSE%" == "true" (
echo Couldn't find %WRAPPER_JAR%, downloading it ...
echo Downloading from: %DOWNLOAD_URL%
)
powershell -Command "&{"^
"$webclient = new-object System.Net.WebClient;"^
"if (-not ([string]::IsNullOrEmpty('%MVNW_USERNAME%') -and [string]::IsNullOrEmpty('%MVNW_PASSWORD%'))) {"^
"$webclient.Credentials = new-object System.Net.NetworkCredential('%MVNW_USERNAME%', '%MVNW_PASSWORD%');"^
"}"^
"[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12; $webclient.DownloadFile('%DOWNLOAD_URL%', '%WRAPPER_JAR%')"^
"}"
if "%MVNW_VERBOSE%" == "true" (
echo Finished downloading %WRAPPER_JAR%
)
)
@REM End of extension
@REM Provide a "standardized" way to retrieve the CLI args that will
@REM work with both Windows and non-Windows executions.
set MAVEN_CMD_LINE_ARGS=%*
%MAVEN_JAVA_EXE% %JVM_CONFIG_MAVEN_PROPS% %MAVEN_OPTS% %MAVEN_DEBUG_OPTS% -classpath %WRAPPER_JAR% "-Dmaven.multiModuleProjectDirectory=%MAVEN_PROJECTBASEDIR%" %WRAPPER_LAUNCHER% %MAVEN_CONFIG% %*
if ERRORLEVEL 1 goto error
goto end
:error
set ERROR_CODE=1
:end
@endlocal & set ERROR_CODE=%ERROR_CODE%
if not "%MAVEN_SKIP_RC%" == "" goto skipRcPost
@REM check for post script, once with legacy .bat ending and once with .cmd ending
if exist "%HOME%\mavenrc_post.bat" call "%HOME%\mavenrc_post.bat"
if exist "%HOME%\mavenrc_post.cmd" call "%HOME%\mavenrc_post.cmd"
:skipRcPost
@REM pause the script if MAVEN_BATCH_PAUSE is set to 'on'
if "%MAVEN_BATCH_PAUSE%" == "on" pause
if "%MAVEN_TERMINATE_CMD%" == "on" exit %ERROR_CODE%
exit /B %ERROR_CODE%
<?xml version="1.0" encoding="UTF-8"?>
<module org.jetbrains.idea.maven.project.MavenProjectsManager.isMavenModule="true" type="JAVA_MODULE" version="4">
<component name="FacetManager">
<facet type="Spring" name="Spring">
<configuration />
</facet>
<facet type="web" name="Web">
<configuration>
<webroots />
<sourceRoots>
<root url="file://$MODULE_DIR$/src/main/java" />
<root url="file://$MODULE_DIR$/src/main/resources" />
</sourceRoots>
</configuration>
</facet>
</component>
<component name="NewModuleRootManager" LANGUAGE_LEVEL="JDK_1_8">
<output url="file://$MODULE_DIR$/target/classes" />
<output-test url="file://$MODULE_DIR$/target/test-classes" />
<content url="file://$MODULE_DIR$">
<sourceFolder url="file://$MODULE_DIR$/src/main/java" isTestSource="false" />
<sourceFolder url="file://$MODULE_DIR$/src/main/resources" type="java-resource" />
<sourceFolder url="file://$MODULE_DIR$/src/test/java" isTestSource="true" />
<excludeFolder url="file://$MODULE_DIR$/target" />
</content>
<orderEntry type="jdk" jdkName="1.8" jdkType="JavaSDK" />
<orderEntry type="sourceFolder" forTests="false" />
<orderEntry type="library" name="Maven: org.springframework.boot:spring-boot-starter-web:2.3.4.RELEASE" level="project" />
<orderEntry type="library" name="Maven: org.springframework.boot:spring-boot-starter:2.3.4.RELEASE" level="project" />
<orderEntry type="library" name="Maven: org.springframework.boot:spring-boot:2.3.4.RELEASE" level="project" />
<orderEntry type="library" name="Maven: org.springframework.boot:spring-boot-autoconfigure:2.3.4.RELEASE" level="project" />
<orderEntry type="library" name="Maven: org.springframework.boot:spring-boot-starter-logging:2.3.4.RELEASE" level="project" />
<orderEntry type="library" name="Maven: ch.qos.logback:logback-classic:1.2.3" level="project" />
<orderEntry type="library" name="Maven: ch.qos.logback:logback-core:1.2.3" level="project" />
<orderEntry type="library" name="Maven: org.apache.logging.log4j:log4j-to-slf4j:2.13.3" level="project" />
<orderEntry type="library" name="Maven: org.apache.logging.log4j:log4j-api:2.13.3" level="project" />
<orderEntry type="library" name="Maven: org.slf4j:jul-to-slf4j:1.7.30" level="project" />
<orderEntry type="library" name="Maven: jakarta.annotation:jakarta.annotation-api:1.3.5" level="project" />
<orderEntry type="library" name="Maven: org.yaml:snakeyaml:1.26" level="project" />
<orderEntry type="library" name="Maven: org.springframework.boot:spring-boot-starter-json:2.3.4.RELEASE" level="project" />
<orderEntry type="library" name="Maven: com.fasterxml.jackson.core:jackson-databind:2.11.2" level="project" />
<orderEntry type="library" name="Maven: com.fasterxml.jackson.core:jackson-annotations:2.11.2" level="project" />
<orderEntry type="library" name="Maven: com.fasterxml.jackson.core:jackson-core:2.11.2" level="project" />
<orderEntry type="library" name="Maven: com.fasterxml.jackson.datatype:jackson-datatype-jdk8:2.11.2" level="project" />
<orderEntry type="library" name="Maven: com.fasterxml.jackson.datatype:jackson-datatype-jsr310:2.11.2" level="project" />
<orderEntry type="library" name="Maven: com.fasterxml.jackson.module:jackson-module-parameter-names:2.11.2" level="project" />
<orderEntry type="library" name="Maven: org.springframework.boot:spring-boot-starter-tomcat:2.3.4.RELEASE" level="project" />
<orderEntry type="library" name="Maven: org.apache.tomcat.embed:tomcat-embed-core:9.0.38" level="project" />
<orderEntry type="library" name="Maven: org.glassfish:jakarta.el:3.0.3" level="project" />
<orderEntry type="library" name="Maven: org.apache.tomcat.embed:tomcat-embed-websocket:9.0.38" level="project" />
<orderEntry type="library" name="Maven: org.springframework:spring-web:5.2.9.RELEASE" level="project" />
<orderEntry type="library" name="Maven: org.springframework:spring-beans:5.2.9.RELEASE" level="project" />
<orderEntry type="library" name="Maven: org.springframework:spring-webmvc:5.2.9.RELEASE" level="project" />
<orderEntry type="library" name="Maven: org.springframework:spring-aop:5.2.9.RELEASE" level="project" />
<orderEntry type="library" name="Maven: org.springframework:spring-context:5.2.9.RELEASE" level="project" />
<orderEntry type="library" name="Maven: org.springframework:spring-expression:5.2.9.RELEASE" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: org.springframework.boot:spring-boot-starter-test:2.3.4.RELEASE" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: org.springframework.boot:spring-boot-test:2.3.4.RELEASE" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: org.springframework.boot:spring-boot-test-autoconfigure:2.3.4.RELEASE" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: com.jayway.jsonpath:json-path:2.4.0" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: net.minidev:json-smart:2.3" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: net.minidev:accessors-smart:1.2" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: org.ow2.asm:asm:5.0.4" level="project" />
<orderEntry type="library" name="Maven: org.slf4j:slf4j-api:1.7.30" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: jakarta.xml.bind:jakarta.xml.bind-api:2.3.3" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: jakarta.activation:jakarta.activation-api:1.2.2" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: org.assertj:assertj-core:3.16.1" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: org.hamcrest:hamcrest:2.2" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: org.junit.jupiter:junit-jupiter:5.6.2" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: org.junit.jupiter:junit-jupiter-api:5.6.2" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: org.apiguardian:apiguardian-api:1.1.0" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: org.opentest4j:opentest4j:1.2.0" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: org.junit.platform:junit-platform-commons:1.6.2" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: org.junit.jupiter:junit-jupiter-params:5.6.2" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: org.junit.jupiter:junit-jupiter-engine:5.6.2" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: org.junit.platform:junit-platform-engine:1.6.2" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: org.mockito:mockito-core:3.3.3" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: net.bytebuddy:byte-buddy:1.10.14" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: net.bytebuddy:byte-buddy-agent:1.10.14" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: org.objenesis:objenesis:2.6" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: org.mockito:mockito-junit-jupiter:3.3.3" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: org.skyscreamer:jsonassert:1.5.0" level="project" />
<orderEntry type="library" name="Maven: org.springframework:spring-core:5.2.9.RELEASE" level="project" />
<orderEntry type="library" name="Maven: org.springframework:spring-jcl:5.2.9.RELEASE" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: org.springframework:spring-test:5.2.9.RELEASE" level="project" />
<orderEntry type="library" scope="TEST" name="Maven: org.xmlunit:xmlunit-core:2.7.0" level="project" />
<orderEntry type="library" name="Maven: org.apache.httpcomponents:httpclient:4.5.12" level="project" />
<orderEntry type="library" name="Maven: org.apache.httpcomponents:httpcore:4.4.13" level="project" />
<orderEntry type="library" name="Maven: commons-codec:commons-codec:1.14" level="project" />
<orderEntry type="library" name="Maven: com.vaadin.external.google:android-json:0.0.20131108.vaadin1" level="project" />
<orderEntry type="library" name="Maven: com.google.code.gson:gson:2.8.6" level="project" />
</component>
</module>
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>2.3.4.RELEASE</version>
<relativePath/> <!-- lookup parent from repository -->
</parent>
<groupId>com.paddelOcr_springBoot</groupId>
<artifactId>demo</artifactId>
<version>0.0.1-SNAPSHOT</version>
<name>demo</name>
<description>Demo project for Spring Boot</description>
<properties>
<java.version>1.8</java.version>
</properties>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
<exclusions>
<exclusion>
<groupId>org.junit.vintage</groupId>
<artifactId>junit-vintage-engine</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<!-- HttpClient, pulled in for RestTemplate -->
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
</dependency>
<dependency>
<groupId>com.vaadin.external.google</groupId>
<artifactId>android-json</artifactId>
<version>0.0.20131108.vaadin1</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>com.google.code.gson</groupId>
<artifactId>gson</artifactId>
<version>2.8.6</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
</plugin>
</plugins>
</build>
</project>
- Before using this tutorial, please complete the deployment based on PaddleHub Serving first.

# Calling PaddleHub Serving from Java Spring Boot

The paddleOcrSpringBoot deployment directory contains all of the Spring Boot code. The directory structure is as follows:
```
deploy/paddleOcrSpringBoot/
└─ src                Source code
   └─ main            Main application code
      └─ java\com\paddelocr_springboot\demo
         └─ DemoApplication.java   Spring Boot entry point
         └─ Controller
            └─ OCR.java            Controller code
   └─ test            Test code
```
- After Hub Serving is started, the API endpoint is:

`http://[ip_address]:[port]/predict/[module_name]`
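For reference, a minimal Python sketch of the same call that the Java demo below makes; it assumes Hub Serving is running the ocr_system module locally on port 8866 and that a test image 1.png exists:

```
import base64
import requests

with open("1.png", "rb") as f:
    image = base64.b64encode(f.read()).decode("utf-8")

# Hub Serving expects a JSON body carrying a list of Base64-encoded images
resp = requests.post("http://127.0.0.1:8866/predict/ocr_system",
                     json={"images": [image]})
print(resp.json())
```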
## Response Format

The returned result is a list; each item in the list is a dict. A dict may contain up to three fields:

|Field|Type|Meaning|
|-|-|-|
|text|str|Text content|
|confidence|float|Confidence of the text recognition|
|text_region|list|Coordinates of the text region|

Different modules return different fields. For example, the result returned by the text recognition module does not contain the `text_region` field. Details:

|Field / Module|ocr_det|ocr_rec|ocr_system|
|-|-|-|-|
|text||✔|✔|
|confidence||✔|✔|
|text_region|✔||✔|
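As a sketch of consuming these fields; the `results` key and the nesting are assumed from a typical ocr_system response, and the body here is made up:

```
# A made-up response body in the format described above
res = {"results": [[{"text": "hello", "confidence": 0.98,
                     "text_region": [[10, 20], [90, 20], [90, 50], [10, 50]]}]]}
for line in res["results"][0]:
    print(line["text"], line["confidence"], line["text_region"])
```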
package com.paddelocr_springboot.demo.Controller;
import java.io.IOException;
import java.io.InputStream;
import java.util.Base64;
import org.springframework.util.LinkedMultiValueMap;
import org.springframework.util.MultiValueMap;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.http.*;
import org.springframework.web.client.RestTemplate;
import java.util.Objects;
@RestController
public class OCR {
@RequestMapping("/")
public ResponseEntity<String> hi(){
        // Build the request headers
        HttpHeaders headers = new HttpHeaders();
        // The Hub Serving API expects a JSON body
        headers.setContentType(MediaType.APPLICATION_JSON);
        // Read the static resource file 1.png from the classpath
        InputStream imagePath = this.getClass().getResourceAsStream("/1.png");
        // Build the request parameters
        MultiValueMap<String, String> map = new LinkedMultiValueMap<String, String>();
        // Add the "images" parameter carrying the Base64-encoded image
        map.add("images", ImageToBase64(imagePath));
        // Assemble the request entity
        HttpEntity<MultiValueMap<String, String>> request = new HttpEntity<MultiValueMap<String, String>>(map, headers);
        RestTemplate restTemplate = new RestTemplate();
        // Send the request
        ResponseEntity<String> response = restTemplate.postForEntity("http://127.0.0.1:8866/predict/ocr_system", request, String.class);
        // Return the service response
return response;
}
    private String ImageToBase64(InputStream imgPath) {
        byte[] data = null;
        // Read the image bytes; available() is acceptable here because the
        // image is a small classpath resource
        try {
            InputStream in = imgPath;
            data = new byte[in.available()];
            in.read(data);
            in.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
        // Base64-encode the byte array and return it as a string
        return Base64.getEncoder().encodeToString(Objects.requireNonNull(data));
    }
}
package com.paddelocr_springboot.demo;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
@SpringBootApplication
public class DemoApplication {
public static void main(String[] args) {
SpringApplication.run(DemoApplication.class, args);
}
}
server.port=8081
http_pool.max_total=200
http_pool.default_max_per_route=100
http_pool.connect_timeout=5000
http_pool.connection_request_timeout=1000
http_pool.socket_timeout=65000
http_pool.validate_after_inactivity=2000
package com.paddelocr_springboot.demo;
import org.junit.jupiter.api.Test;
import org.springframework.boot.test.context.SpringBootTest;
@SpringBootTest
class DemoApplicationTests {
@Test
void contextLoads() {
}
}
from paddle_serving_client import Client
import cv2
import sys
import numpy as np
class TextSystemHelper(TextSystem):
    ...
        fetch_map = self.det_client.predict(feed, fetch)
        outputs = [fetch_map[x] for x in fetch]
        dt_boxes = self.text_detector.postprocess(outputs, self.tmp_args)
        if dt_boxes is None:
            return None, None
        img_crop_list = []
        ...
        feed, fetch, self.tmp_args = self.text_classifier.preprocess(
            img_crop_list)
        fetch_map = self.clas_client.predict(feed, fetch)
        outputs = [fetch_map[x] for x in self.text_classifier.fetch]
        for x in fetch_map.keys():
            if ".lod" in x:
                ...
from paddleslim.quant import quant_aware, convert
from paddle.fluid.layer_helper import LayerHelper
from eval_utils.eval_det_utils import eval_det_run
from eval_utils.eval_rec_utils import eval_rec_run
from eval_utils.eval_cls_utils import eval_cls_run


def main():
    ...
    if alg_type == 'det':
        final_metrics = eval_det_run(exe, config, quant_info_dict, "eval")
    elif alg_type == 'cls':
        final_metrics = eval_cls_run(exe, quant_info_dict)
    else:
        final_metrics = eval_rec_run(exe, config, quant_info_dict, "eval")
    print(final_metrics)
    ...
def main():
    ...
    if train_alg_type == 'det':
        program.train_eval_det_run(
            config, exe, train_info_dict, eval_info_dict, is_slim="quant")
    elif train_alg_type == 'rec':
        program.train_eval_rec_run(
            config, exe, train_info_dict, eval_info_dict, is_slim="quant")
    else:
        program.train_eval_cls_run(
            config, exe, train_info_dict, eval_info_dict, is_slim="quant")


if __name__ == '__main__':
    ...
## PaddleOCR FAQ (continuously updated)

* [Recent updates (2020.10.26)](#近期更新)
* [[Selected] 10 selected OCR questions](#OCR精选10个问题)
* [[Theory] 23 general OCR questions](#OCR通用问题)
* [Basics: 5 questions](#基础知识)
* [Datasets: 4 questions](#数据集)
* [Model training and tuning: 6 questions](#模型训练调优)
* [Inference and deployment: 8 questions](#预测部署)
* [[Practice] 61 PaddleOCR questions](#PaddleOCR实战问题)
* [Usage: 20 questions](#使用咨询)
* [Datasets: 10 questions](#数据集)
* [Model training and tuning: 15 questions](#模型训练调优)
* [Inference and deployment: 16 questions](#预测部署)

<a name="近期更新"></a>
## Recent updates (2020.10.26)

#### Q2.1.4: How can seals (stamps) be recognized?

**A**: 1. Use a recognition network with TPS, or ABCNet. 2. Flatten the image with a polar-coordinate transform first, then recognize it with CRNN.

#### Q2.1.5: The multilingual dictionaries mix different scripts. Is there a reason for that, and how much accuracy is lost by unifying them into one dictionary?

**A**: Unifying everything into one dictionary makes the final FC layer very large and increases the model size. If you have special needs, you can merge the dictionaries of the languages you need and train a model with the merged dictionary; if the merge introduces too many similar-looking characters, accuracy may drop, and character balance may also need to be considered. For now, PaddleOCR keeps the per-language dictionaries separate.

#### Q3.3.16: How can I fine-tune the detection model, e.g. freeze the early layers or train certain layers with a small learning rate?

**A**: To freeze certain layers, set the stop_gradient attribute of a variable to True, so that none of the parameters before that variable are updated; see: https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/faq/train_cn.html#id4

Training certain layers with a smaller learning rate is not very convenient in static-graph mode; one approach is to set a fixed learning rate on the weight attribute at parameter initialization; see: https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/api/paddle/fluid/param_attr/ParamAttr_cn.html#paramattr

In practice we found that directly loading the model and fine-tuning, without setting different learning rates for certain layers, also works well.

#### Q3.3.17: Using the general Chinese model as the pretrained model and changing the dictionary file produces a "ctc_fc_b not used" error.

**A**: After the dictionary is changed, the dimension of the recognition model's final FC layer changes, so its parameters cannot be loaded. This is only a warning and can be safely ignored; just train normally.

#### Q3.1.18: How do I add my own detection algorithm?

**A**: 1. Choose a backbone and a head in the corresponding directories under ppocr/modeling; if none fits, create a new file and add yours.
2. Choose the corresponding data processing under ppocr/data; if none fits, create a new file and add yours.
3. Create a new file under ppocr/losses and implement the loss.
4. Create a new file under ppocr/postprocess and implement the post-processing algorithm.
5. Wire the classes or functions added in the four steps above into the config, following the existing yml files.
<a name="OCR精选10个问题"></a>
## [Selected] 10 selected OCR questions
**A**: In business scenarios where text is densely distributed, end-to-end methods give more reliable efficiency; accuracy depends on how much business data you have accumulated. If you have a lot of line-level recognition data, a two-stage approach works better. Baidu's production scenarios, such as industrial meter reading and license-plate recognition, use end-to-end solutions.

#### Q2.1.4: How can seals (stamps) be recognized?

**A**: 1. Use a recognition network with TPS, or ABCNet. 2. Flatten the image with a polar-coordinate transform first, then recognize it with CRNN.
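As an illustration of option 2, a minimal sketch of flattening a circular seal with OpenCV's polar warp; the file names are hypothetical, and the center and radius are assumed to roughly match the seal:

```
import cv2

img = cv2.imread("seal.png")  # hypothetical seal image
h, w = img.shape[:2]
center = (w // 2, h // 2)
radius = min(h, w) // 2
# Unwrap the circle: output rows correspond to angles, columns to radii
flat = cv2.warpPolar(img, (radius, 360), center, radius, cv2.WARP_POLAR_LINEAR)
# Rotate so the unwrapped ring text is roughly horizontal for a CRNN-style model
flat = cv2.rotate(flat, cv2.ROTATE_90_COUNTERCLOCKWISE)
cv2.imwrite("seal_flat.png", flat)
```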
#### Q2.1.5: The multilingual dictionaries mix different scripts. Is there a reason for that, and how much accuracy is lost by unifying them into one dictionary?

**A**: Unifying everything into one dictionary makes the final FC layer very large and increases the model size. If you have special needs, you can merge the dictionaries of the languages you need and train a model with the merged dictionary; if the merge introduces too many similar-looking characters, accuracy may drop, and character balance may also need to be considered. For now, PaddleOCR keeps the per-language dictionaries separate.
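If you do want a merged dictionary, the sketch below shows the mechanical part (the output path is arbitrary); the accuracy caveats above still apply:

```
chars = []
for path in ["ppocr/utils/dict/french_dict.txt",
             "ppocr/utils/dict/german_dict.txt"]:
    with open(path, "rb") as f:
        for line in f:
            ch = line.decode("utf-8").rstrip("\r\n")
            if ch not in chars:
                chars.append(ch)  # de-duplicate, keep first-seen order
with open("merged_dict.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(chars))
```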
### Datasets

|8.6M ultra-lightweight Chinese OCR model|MobileNetV3+MobileNetV3|det_mv3_db.yml|rec_chinese_lite_train.yml|
|General Chinese OCR model|Resnet50_vd+Resnet34_vd|det_r50_vd_db.yml|rec_chinese_common_train.yml|

#### Q3.1.18: How do I add my own detection algorithm?

**A**: 1. Choose a backbone and a head in the corresponding directories under ppocr/modeling; if none fits, create a new file and add yours.
2. Choose the corresponding data processing under ppocr/data; if none fits, create a new file and add yours.
3. Create a new file under ppocr/losses and implement the loss.
4. Create a new file under ppocr/postprocess and implement the post-processing algorithm.
5. Wire the classes or functions added in the four steps above into the config, following the existing yml files.

### Datasets

**A**: The method is the same as for synthesizing horizontal text; only the font is replaced with a vertical one.
### Model training and tuning

#### Q3.3.1: What should I do when the text length exceeds 25?

#### Q3.3.15: Does the dictionary used for training need to match the dictionary used by the loaded pretrained model?

**A**: It depends. 1. If you do not change the recognized character set, the training dictionary must stay consistent with the dictionary you use when predicting with the model.
2. If you change the recognized character set, they can differ, and the last layer will be retrained.

#### Q3.3.16: How can I fine-tune the detection model, e.g. freeze the early layers or train certain layers with a small learning rate?

**A**: To freeze certain layers, set the stop_gradient attribute of a variable to True, so that none of the parameters before that variable are updated; see: https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/faq/train_cn.html#id4

Training certain layers with a smaller learning rate is not very convenient in static-graph mode; one approach is to set a fixed learning rate on the weight attribute at parameter initialization; see: https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/api/paddle/fluid/param_attr/ParamAttr_cn.html#paramattr

In practice we found that directly loading the model and fine-tuning, without setting different learning rates for certain layers, also works well.
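A minimal static-graph sketch of both options, assuming the paddle 1.x fluid API used by this repo; the toy network is made up:

```
import paddle.fluid as fluid

image = fluid.data(name='image', shape=[None, 3, 32, 32], dtype='float32')
feat = fluid.layers.conv2d(image, num_filters=8, filter_size=3)
# Option 1: freeze everything that produced `feat`
feat.stop_gradient = True
# Option 2: give a layer its own (smaller) learning rate via ParamAttr
slow = fluid.ParamAttr(learning_rate=0.1)  # 10x slower than the base LR
logits = fluid.layers.fc(feat, size=10, param_attr=slow)
loss = fluid.layers.reduce_mean(logits)
fluid.optimizer.SGD(learning_rate=0.01).minimize(loss)
```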
#### Q3.3.17: Using the general Chinese model as the pretrained model and changing the dictionary file produces a "ctc_fc_b not used" error.

**A**: After the dictionary is changed, the dimension of the recognition model's final FC layer changes, so its parameters cannot be loaded. This is only a warning and can be safely ignored; just train normally.

### Inference and deployment
```
docker images

hub.baidubce.com/paddlepaddle/paddle   latest-gpu-cuda9.0-cudnn7-dev   f56310dcc829
```

**2. Install PaddlePaddle v2.0**
```
python3 -m pip install --upgrade pip

# If your machine has CUDA9 or CUDA10 installed, run the following command to install
python3 -m pip install paddlepaddle-gpu==2.0.0b0 -i https://mirror.baidu.com/pypi/simple
```

**3. Clone the PaddleOCR repo**
```
git clone https://gitee.com/paddlepaddle/PaddleOCR
```

**4. Install third-party libraries**
```
cd PaddleOCR
python3 -m pip install -r requirments.txt
```
Note: on Windows, it is recommended to download the shapely package from [here](https://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely) and install it that way.
In addition, the documentation also provides other inference and deployment methods for the Chinese OCR model:
- [Inference with the C++ prediction engine](../../deploy/cpp_infer/readme.md)
- [Service deployment](./serving_inference.md)
- [On-device deployment](../../deploy/lite/readme.md)
In `word_dict.txt`, each line has a single character, mapping characters to numeric indices.

`ppocr/utils/ic15_dict.txt` is an English dictionary containing 36 characters.

`ppocr/utils/dict/french_dict.txt` is a French dictionary containing 118 characters.

`ppocr/utils/dict/japan_dict.txt` is a Japanese dictionary containing 4399 characters.

`ppocr/utils/dict/korean_dict.txt` is a Korean dictionary containing 3636 characters.

`ppocr/utils/dict/german_dict.txt` is a German dictionary containing 131 characters.

You can use them as needed.
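For orientation, a small sketch of how such a dictionary file is consumed: one character per line, with the line index serving as the label id.

```
with open("ppocr/utils/dict/french_dict.txt", "rb") as f:
    charset = [line.decode("utf-8").rstrip("\r\n") for line in f]
print(len(charset), charset[:5])  # 118 entries for the French dictionary
```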
The current multilingual models are still at the demo stage; we will keep optimizing them and adding languages. **You are very welcome to provide us with dictionaries and fonts for other languages**; if you are willing, you can submit dictionary files to [dict](../../ppocr/utils/dict) and corpus files to [corpus](../../ppocr/utils/corpus), and we will thank you in the Repo.

- Custom dictionaries

PaddleOCR also provides multilingual configs under the `configs/rec/multi_languages` path:

```
Global:
  ...
  # Add a custom dictionary; if you modify the dictionary, point the path to the new one
  character_dict_path: ./ppocr/utils/dict/french_dict.txt
  # Enable data augmentation during training
  distort: true
  # Recognize spaces
  ...
```
PaddleOCR provides two service deployment methods:
- Deployment based on PaddleServing: the code lives under `./deploy/pdserving`; follow this tutorial to use it.
- Deployment based on PaddleHub Serving: the code lives under `./deploy/hubserving`; see the [documentation](../../deploy/hubserving/readme.md) for usage.

# Inference with Paddle Serving

Before reading this document, please first read [Inference based on the Python prediction engine](./inference.md).

### 1. Prepare the environment

First, install the Paddle Serving components.
We recommend deploying the Paddle Serving OCR service on a GPU.

**CUDA version: 9.X/10.X**

```
# Choose either the CPU or the GPU version
# GPU server
# CUDA 9
python -m pip install -U https://paddle-serving.bj.bcebos.com/whl/paddle_serving_server_gpu-0.0.0.post9-py3-none-any.whl
# CUDA 10
python -m pip install -U https://paddle-serving.bj.bcebos.com/whl/paddle_serving_server_gpu-0.0.0.post10-py3-none-any.whl
# CPU server
...
```

```
def read_params():
    ...
    # params for text detector
    cfg.det_algorithm = "DB"  # detection algorithm: DB, EAST, etc.
    cfg.det_model_dir = "./det_mv_server/"  # path of the detection model
    cfg.det_max_side_len = 960

    # DB params
    cfg.det_db_thresh = 0.3
    ...
```

Start the services:
```
# GPU users
python -m paddle_serving_server_gpu.serve --model det_infer_server --port 9293 --gpu_id 0
python -m paddle_serving_server_gpu.serve --model cls_infer_server --port 9294 --gpu_id 0
python ocr_rpc_server.py
# CPU users
python -m paddle_serving_server.serve --model det_infer_server --port 9293
python -m paddle_serving_server.serve --model cls_infer_server --port 9294
python ocr_rpc_server.py
# Quick version, Windows/Linux users
python ocr_local_server.py
```
Client
```
# Predict English results
python3 tools/infer_cls.py -c configs/cls/cls_mv3.yml -o Global.checkpoints={path/to/weights}/best_accuracy Global.infer_img=doc/imgs_words/en/word_1.png
```

Input image:
If you need to predict with other language models, you need to specify the visualization font path via `--vis_font_path` when using the inference model. Fonts for minor languages are provided by default under the `doc/` path, e.g. for Korean recognition:

```
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words/korean/1.jpg" --rec_model_dir="./your inference model" --rec_char_type="korean" --rec_char_dict_path="ppocr/utils/dict/korean_dict.txt" --vis_font_path="doc/korean.ttf"
```

![](../imgs_words/korean/1.jpg)
```
cd /home/Projects
# You need to create a docker container the first time you run it; you do not need to run this command on subsequent runs
# Create a docker container named ppocr and map the current directory to the /paddle directory of the container
# If using CPU, use docker instead of nvidia-docker to create docker
sudo docker run --name ppocr -v $PWD:/paddle --network=host -it hub.baidubce.com/paddlepaddle/paddle:latest-gpu-cuda9.0-cudnn7-dev /bin/bash
```
If using CUDA9, please run the following command to create a container:
...
```
docker images

hub.baidubce.com/paddlepaddle/paddle   latest-gpu-cuda9.0-cudnn7-dev   f56310dcc829
```

**2. Install PaddlePaddle v2.0**
```
python3 -m pip install --upgrade pip

# If you have cuda9 or cuda10 installed on your machine, please run the following command to install
python3 -m pip install paddlepaddle-gpu==2.0.0b0 -i https://mirror.baidu.com/pypi/simple
```

**3. Clone the PaddleOCR repo**
```
git clone https://gitee.com/paddlepaddle/PaddleOCR
```

**4. Install third-party libraries**
```
cd PaddleOCR
python3 -m pip install -r requirments.txt
```
If you get the error `OSError: [WinError 126] The specified module could not be found` when you install shapely on Windows.
In addition, the tutorial also provides other deployment methods for the Chinese OCR model:
- [Server-side C++ inference](../../deploy/cpp_infer/readme_en.md)
- [Service deployment](../../deploy/hubserving/readme_en.md)
- [End-to-end deployment](../../deploy/lite/readme_en.md)
In `word_dict.txt`, there is a single character on each line, which maps characters and numeric indexes together.

`ppocr/utils/ic15_dict.txt` is an English dictionary with 36 characters

`ppocr/utils/dict/french_dict.txt` is a French dictionary with 118 characters

`ppocr/utils/dict/japan_dict.txt` is a Japanese dictionary with 4399 characters

`ppocr/utils/dict/korean_dict.txt` is a Korean dictionary with 3636 characters

`ppocr/utils/dict/german_dict.txt` is a German dictionary with 131 characters

You can use them on demand.

The current multi-language models are still in the demo stage; we will continue to optimize them and add more languages. **You are very welcome to provide us with dictionaries and fonts in other languages**;

If you like, you can submit the dictionary file to [dict](../../ppocr/utils/dict) or the corpus file to [corpus](../../ppocr/utils/corpus) and we will thank you in the Repo.

To customize the dict file, please modify the `character_dict_path` field in `configs/rec/rec_icdar15_train.yml` and set `character_type` to `ch`.

```
Global:
  ...
  # Add a custom dictionary, if you modify the dictionary
  # please point the path to the new dictionary
  character_dict_path: ./ppocr/utils/dict/french_dict.txt
  # Add data augmentation during training
  distort: true
  # Identify spaces
  ...
```
doc/joinus.PNG (image updated: 15.7 KB -> 405.4 KB)
class SimpleReader(object):
    ...
    def get_device_num():
        if self.use_gpu:
            gpus = os.environ.get("CUDA_VISIBLE_DEVICES", "1")
            gpu_num = len(gpus.split(','))
            return gpu_num
        else:
            ...
from .make_border_map import MakeBorderMap


class DBProcessTrain(object):
    """
    The pre-process of DB for train mode
    """

    def __init__(self, params):
        """
        :param params: dict of params
        """
        self.img_set_dir = params['img_set_dir']
        self.image_shape = params['image_shape']

    def order_points_clockwise(self, pts):
        """
        Sort the points in the box clockwise
        :param pts: points with shape [4, 2]
        :return: sorted points
        """
        rect = np.zeros((4, 2), dtype="float32")
        s = pts.sum(axis=1)
        rect[0] = pts[np.argmin(s)]
        ...
        return rect

    def make_data_dict(self, imgvalue, entry):
        """
        create input dict
        :param imgvalue: input image
        :param entry: dict of annotations information
        :return: created dict of input data information
        """
        boxes = []
        texts = []
        ignores = []
        ...
        return data

    def NormalizeImage(self, data):
        """
        Normalize input image
        :param data: input dict
        :return: new dict with normalized image
        """
        im = data['image']
        img_mean = [0.485, 0.456, 0.406]
        img_std = [0.229, 0.224, 0.225]
        ...
        return data

    def FilterKeys(self, data):
        """
        Filter keys
        :param data: dict
        :return:
        """
        filter_keys = ['polys', 'texts', 'ignore_tags', 'shape']
        for key in filter_keys:
            if key in data:
                ...
        return data

    def convert_label_infor(self, label_infor):
        """
        encode annotations using json.loads
        :param label_infor: string
        :return: (image, encoded annotations)
        """
        label_infor = label_infor.decode()
        label_infor = label_infor.encode('utf-8').decode('utf-8-sig')
        substr = label_infor.strip("\n").split("\t")
        ...
class DBProcessTest(object):
    ...
        elif resize_h // 32 <= 1:
            resize_h = 32
        else:
            resize_h = (resize_h // 32) * 32
        if resize_w % 32 == 0:
            resize_w = resize_w
        elif resize_w // 32 <= 1:
            resize_w = 32
        else:
            resize_w = (resize_w // 32) * 32
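        # (size // 32) * 32 snaps each side down to the nearest multiple of 32,
        # e.g. 970 -> 960; the previous (size // 32 - 1) * 32 variant shrank
        # the image by one extra 32-pixel block.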
        try:
            if int(resize_w) <= 0 or int(resize_h) <= 0:
                return None, (None, None)
            ...
        return im, (ratio_h, ratio_w)
    def resize_image_type1(self, im):
        """
        resize image to a size self.image_shape
        :param im: input image
        :return: normalized image and resize ratio
        """
        resize_h, resize_w = self.image_shape
        ori_h, ori_w = im.shape[:2]  # (h, w, c)
        im = cv2.resize(im, (int(resize_w), int(resize_h)))
        ...
        return im, (ratio_h, ratio_w)

    def normalize(self, im):
        """
        Normalize image
        :param im: input image
        :return: Normalized image
        """
        img_mean = [0.485, 0.456, 0.406]
        img_std = [0.229, 0.224, 0.225]
        im = im.astype(np.float32, copy=False)
        ...
import json
import sys
import os


class EASTProcessTrain(object):
    def __init__(self, params):
        self.img_set_dir = params['img_set_dir']
        ...


class EASTProcessTest(object):
    ...
        elif resize_h // 32 <= 1:
            resize_h = 32
        else:
            resize_h = (resize_h // 32) * 32
        if resize_w % 32 == 0:
            resize_w = resize_w
        elif resize_w // 32 <= 1:
            resize_w = 32
        else:
            resize_w = (resize_w // 32) * 32
        try:
            if int(resize_w) <= 0 or int(resize_h) <= 0:
                return None, (None, None)
            ...
@@ -599,7 +599,7 @@ class SASTProcessTrain(object):
         """
         text_polys, txt_tags, txts = [], [], []
-        with open(poly_txt_path) as f:
+        with open(poly_txt_path, 'rb') as f:
             for line in f.readlines():
                 poly_str, txt = line.strip().split('\t')
                 poly = map(float, poly_str.split(','))
...
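Opening the polygon file in binary mode pairs with the explicit decoding done by the label readers: a UTF-8 BOM written by some editors would otherwise leak into the first polygon string and break the float parse. A small sketch with a hypothetical first line:

```python
# b'\xef\xbb\xbf' is the UTF-8 BOM some editors prepend to text files
raw = b'\xef\xbb\xbf10,20,110,20,110,40,10,40\tsome_text\n'

bad = raw.decode('utf-8')       # BOM survives: '\ufeff10,20,...'
good = raw.decode('utf-8-sig')  # BOM stripped: '10,20,...'

poly_str, txt = good.strip('\n').split('\t')
print(float(poly_str.split(',')[0]))  # 10.0, parses cleanly
```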
@@ -56,7 +56,7 @@ def resize_norm_img(img, image_shape):

 def resize_norm_img_chinese(img, image_shape):
     imgC, imgH, imgW = image_shape
     # todo: change to 0 and modified image shape
-    max_wh_ratio = 0
+    max_wh_ratio = imgW * 1.0 / imgH
     h, w = img.shape[0], img.shape[1]
     ratio = w * 1.0 / h
     max_wh_ratio = max(max_wh_ratio, ratio)
@@ -309,16 +309,28 @@ def warp(img, ang):
     if config.distort:
         img_height, img_width = img.shape[0:2]
         if random.random() <= prob and img_height >= 20 and img_width >= 20:
-            new_img = tia_distort(new_img, random.randint(3, 6))
+            try:
+                new_img = tia_distort(new_img, random.randint(3, 6))
+            except:
+                logger.warning(
+                    "Exception occurred during tia_distort, pass it...")
     if config.stretch:
         img_height, img_width = img.shape[0:2]
         if random.random() <= prob and img_height >= 20 and img_width >= 20:
-            new_img = tia_stretch(new_img, random.randint(3, 6))
+            try:
+                new_img = tia_stretch(new_img, random.randint(3, 6))
+            except:
+                logger.warning(
+                    "Exception occurred during tia_stretch, pass it...")
     if config.perspective:
         if random.random() <= prob:
-            new_img = tia_perspective(new_img)
+            try:
+                new_img = tia_perspective(new_img)
+            except:
+                logger.warning(
+                    "Exception occurred during tia_perspective, pass it...")
     if config.crop:
         img_height, img_width = img.shape[0:2]
...
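Wrapping each TIA op in `try/except` trades a lost augmentation for a survived training step: one degenerate crop no longer aborts the epoch. A minimal sketch of the pattern with a stand-in augmenter (hypothetical function, not PaddleOCR's):

```python
import logging
import random

logger = logging.getLogger(__name__)

def fragile_augment(img):
    # stand-in for tia_distort / tia_stretch / tia_perspective
    if random.random() < 0.1:
        raise ValueError("degenerate geometry")
    return img

def safe_augment(img):
    try:
        return fragile_augment(img)
    except Exception:
        logger.warning("Exception occurred during augmentation, pass it...")
        return img  # fall back to the unaugmented image

print(safe_augment("img"))
```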
@@ -65,6 +65,7 @@ class ClsModel(object):
             labels = None
             loader = None
         image = fluid.data(name='image', shape=image_shape, dtype='float32')
+        image.stop_gradient = False
         return image, labels, loader

     def __call__(self, mode):
...
@@ -16,6 +16,8 @@ from __future__ import absolute_import
 from __future__ import division
 from __future__ import print_function

+from collections import OrderedDict
+
 from paddle import fluid

 from ppocr.utils.utility import create_module
@@ -215,16 +217,15 @@ class RecModel(object):
             label = labels['label']
             if self.loss_type == 'srn':
                 total_loss, img_loss, word_loss = self.loss(predicts, labels)
-                outputs = {
-                    'total_loss': total_loss,
-                    'img_loss': img_loss,
-                    'word_loss': word_loss,
-                    'decoded_out': decoded_out,
-                    'label': label
-                }
+                outputs = OrderedDict([('total_loss', total_loss),
+                                       ('img_loss', img_loss),
+                                       ('word_loss', word_loss),
+                                       ('decoded_out', decoded_out),
+                                       ('label', label)])
             else:
-                outputs = {'total_loss':loss, 'decoded_out':\
-                    decoded_out, 'label':label}
+                outputs = OrderedDict([('total_loss', loss),
+                                       ('decoded_out', decoded_out),
+                                       ('label', label)])
             return loader, outputs
         # export_model
         elif mode == "export":
@@ -233,16 +234,15 @@ class RecModel(object):
                 predict = fluid.layers.softmax(predict)
             if self.loss_type == "srn":
                 return [
-                    image, labels, {
-                        'decoded_out': decoded_out,
-                        'predicts': predict
-                    }
-                ]
-            return [image, {'decoded_out': decoded_out, 'predicts': predict}]
+                    image, labels, OrderedDict([('decoded_out', decoded_out),
+                                                ('predicts', predict)])]
+            return [image, OrderedDict([('decoded_out', decoded_out),
+                                        ('predicts', predict)])]
         # eval or test
         else:
             predict = predicts['predict']
             if self.loss_type == "ctc":
                 predict = fluid.layers.softmax(predict)
-            return loader, {'decoded_out': decoded_out, 'predicts': predict}
+            return loader, OrderedDict([('decoded_out', decoded_out),
+                                        ('predicts', predict)])
\ No newline at end of file
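The move from plain dicts to `OrderedDict` matters on the Python 2.7 and pre-3.7 interpreters this branch still targets, where dict order is not guaranteed; the training loop is assumed to build its fetch lists from these mappings and to re-pair fetched arrays with names by position. A minimal sketch of that assumption:

```python
from collections import OrderedDict

# hypothetical stand-ins for program variables
outputs = OrderedDict([('total_loss', 'loss_var'),
                       ('decoded_out', 'decode_var'),
                       ('label', 'label_var')])

fetch_name_list = list(outputs.keys())
fetch_varname_list = list(outputs.values())

# results = exe.run(..., fetch_list=fetch_varname_list) returns arrays in
# list order, so they can be re-paired with names deterministically:
# metrics = dict(zip(fetch_name_list, results))
print(fetch_name_list)  # stable order on every interpreter
```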
@@ -23,6 +23,14 @@ import paddle.fluid as fluid

 class ClsHead(object):
+    """
+    Text-orientation classification head
+    Args:
+        params(dict): super parameters for building the classification network
+    """
+
     def __init__(self, params):
         super(ClsHead, self).__init__()
         self.class_dim = params['class_dim']
...
@@ -109,6 +109,12 @@ class EASTHead(object):
         return f_score, f_geo

     def __call__(self, inputs):
+        """
+        Fuse different levels of feature maps from the backbone and predict results.
+        Args:
+            inputs(list): feature maps from backbone
+        Return: predicts
+        """
         f_common = self.unet_fusion(inputs)
         f_score, f_geo = self.detector_header(f_common)
         predicts = OrderedDict()
...
@@ -38,35 +38,66 @@ class SASTHead(object):
         blocks{}: contain block_2, block_3, block_4, block_5, block_6, block_7 with
                   1/4, 1/8, 1/16, 1/32, 1/64, 1/128 resolution.
         """
-        f = [blocks['block_6'], blocks['block_5'], blocks['block_4'], blocks['block_3'], blocks['block_2']]
+        f = [
+            blocks['block_6'], blocks['block_5'], blocks['block_4'],
+            blocks['block_3'], blocks['block_2']
+        ]
         num_outputs = [256, 256, 192, 192, 128]
         g = [None, None, None, None, None]
         h = [None, None, None, None, None]
         for i in range(5):
-            h[i] = conv_bn_layer(input=f[i], num_filters=num_outputs[i],
-                filter_size=1, stride=1, act=None, name='fpn_up_h'+str(i))
+            h[i] = conv_bn_layer(
+                input=f[i],
+                num_filters=num_outputs[i],
+                filter_size=1,
+                stride=1,
+                act=None,
+                name='fpn_up_h' + str(i))
         for i in range(4):
             if i == 0:
-                g[i] = deconv_bn_layer(input=h[i], num_filters=num_outputs[i + 1], act=None, name='fpn_up_g0')
+                g[i] = deconv_bn_layer(
+                    input=h[i],
+                    num_filters=num_outputs[i + 1],
+                    act=None,
+                    name='fpn_up_g0')
                 #print("g[{}] shape: {}".format(i, g[i].shape))
             else:
                 g[i] = fluid.layers.elementwise_add(x=g[i - 1], y=h[i])
                 g[i] = fluid.layers.relu(g[i])
                 #g[i] = conv_bn_layer(input=g[i], num_filters=num_outputs[i],
                 #    filter_size=1, stride=1, act='relu')
-                g[i] = conv_bn_layer(input=g[i], num_filters=num_outputs[i],
-                    filter_size=3, stride=1, act='relu', name='fpn_up_g%d_1'%i)
-                g[i] = deconv_bn_layer(input=g[i], num_filters=num_outputs[i + 1], act=None, name='fpn_up_g%d_2'%i)
+                g[i] = conv_bn_layer(
+                    input=g[i],
+                    num_filters=num_outputs[i],
+                    filter_size=3,
+                    stride=1,
+                    act='relu',
+                    name='fpn_up_g%d_1' % i)
+                g[i] = deconv_bn_layer(
+                    input=g[i],
+                    num_filters=num_outputs[i + 1],
+                    act=None,
+                    name='fpn_up_g%d_2' % i)
                 #print("g[{}] shape: {}".format(i, g[i].shape))
         g[4] = fluid.layers.elementwise_add(x=g[3], y=h[4])
         g[4] = fluid.layers.relu(g[4])
-        g[4] = conv_bn_layer(input=g[4], num_filters=num_outputs[4],
-            filter_size=3, stride=1, act='relu', name='fpn_up_fusion_1')
-        g[4] = conv_bn_layer(input=g[4], num_filters=num_outputs[4],
-            filter_size=1, stride=1, act=None, name='fpn_up_fusion_2')
+        g[4] = conv_bn_layer(
+            input=g[4],
+            num_filters=num_outputs[4],
+            filter_size=3,
+            stride=1,
+            act='relu',
+            name='fpn_up_fusion_1')
+        g[4] = conv_bn_layer(
+            input=g[4],
+            num_filters=num_outputs[4],
+            filter_size=1,
+            stride=1,
+            act=None,
+            name='fpn_up_fusion_2')
         return g[4]

     def FPN_Down_Fusion(self, blocks):
@@ -77,95 +108,245 @@ class SASTHead(object):
         f = [blocks['block_0'], blocks['block_1'], blocks['block_2']]
         num_outputs = [32, 64, 128]
         g = [None, None, None]
         h = [None, None, None]
         for i in range(3):
-            h[i] = conv_bn_layer(input=f[i], num_filters=num_outputs[i],
-                filter_size=3, stride=1, act=None, name='fpn_down_h'+str(i))
+            h[i] = conv_bn_layer(
+                input=f[i],
+                num_filters=num_outputs[i],
+                filter_size=3,
+                stride=1,
+                act=None,
+                name='fpn_down_h' + str(i))
         for i in range(2):
             if i == 0:
-                g[i] = conv_bn_layer(input=h[i], num_filters=num_outputs[i+1], filter_size=3, stride=2, act=None, name='fpn_down_g0')
+                g[i] = conv_bn_layer(
+                    input=h[i],
+                    num_filters=num_outputs[i + 1],
+                    filter_size=3,
+                    stride=2,
+                    act=None,
+                    name='fpn_down_g0')
             else:
                 g[i] = fluid.layers.elementwise_add(x=g[i - 1], y=h[i])
                 g[i] = fluid.layers.relu(g[i])
-                g[i] = conv_bn_layer(input=g[i], num_filters=num_outputs[i], filter_size=3, stride=1, act='relu', name='fpn_down_g%d_1'%i)
-                g[i] = conv_bn_layer(input=g[i], num_filters=num_outputs[i+1], filter_size=3, stride=2, act=None, name='fpn_down_g%d_2'%i)
+                g[i] = conv_bn_layer(
+                    input=g[i],
+                    num_filters=num_outputs[i],
+                    filter_size=3,
+                    stride=1,
+                    act='relu',
+                    name='fpn_down_g%d_1' % i)
+                g[i] = conv_bn_layer(
+                    input=g[i],
+                    num_filters=num_outputs[i + 1],
+                    filter_size=3,
+                    stride=2,
+                    act=None,
+                    name='fpn_down_g%d_2' % i)
             # print("g[{}] shape: {}".format(i, g[i].shape))
         g[2] = fluid.layers.elementwise_add(x=g[1], y=h[2])
         g[2] = fluid.layers.relu(g[2])
-        g[2] = conv_bn_layer(input=g[2], num_filters=num_outputs[2],
-            filter_size=3, stride=1, act='relu', name='fpn_down_fusion_1')
-        g[2] = conv_bn_layer(input=g[2], num_filters=num_outputs[2],
-            filter_size=1, stride=1, act=None, name='fpn_down_fusion_2')
+        g[2] = conv_bn_layer(
+            input=g[2],
+            num_filters=num_outputs[2],
+            filter_size=3,
+            stride=1,
+            act='relu',
+            name='fpn_down_fusion_1')
+        g[2] = conv_bn_layer(
+            input=g[2],
+            num_filters=num_outputs[2],
+            filter_size=1,
+            stride=1,
+            act=None,
+            name='fpn_down_fusion_2')
         return g[2]

     def SAST_Header1(self, f_common):
         """Detector header."""
         #f_score
-        f_score = conv_bn_layer(input=f_common, num_filters=64, filter_size=1, stride=1, act='relu', name='f_score1')
-        f_score = conv_bn_layer(input=f_score, num_filters=64, filter_size=3, stride=1, act='relu', name='f_score2')
-        f_score = conv_bn_layer(input=f_score, num_filters=128, filter_size=1, stride=1, act='relu', name='f_score3')
-        f_score = conv_bn_layer(input=f_score, num_filters=1, filter_size=3, stride=1, name='f_score4')
+        f_score = conv_bn_layer(
+            input=f_common,
+            num_filters=64,
+            filter_size=1,
+            stride=1,
+            act='relu',
+            name='f_score1')
+        f_score = conv_bn_layer(
+            input=f_score,
+            num_filters=64,
+            filter_size=3,
+            stride=1,
+            act='relu',
+            name='f_score2')
+        f_score = conv_bn_layer(
+            input=f_score,
+            num_filters=128,
+            filter_size=1,
+            stride=1,
+            act='relu',
+            name='f_score3')
+        f_score = conv_bn_layer(
+            input=f_score,
+            num_filters=1,
+            filter_size=3,
+            stride=1,
+            name='f_score4')
         f_score = fluid.layers.sigmoid(f_score)
         # print("f_score shape: {}".format(f_score.shape))

         #f_border
-        f_border = conv_bn_layer(input=f_common, num_filters=64, filter_size=1, stride=1, act='relu', name='f_border1')
-        f_border = conv_bn_layer(input=f_border, num_filters=64, filter_size=3, stride=1, act='relu', name='f_border2')
-        f_border = conv_bn_layer(input=f_border, num_filters=128, filter_size=1, stride=1, act='relu', name='f_border3')
-        f_border = conv_bn_layer(input=f_border, num_filters=4, filter_size=3, stride=1, name='f_border4')
+        f_border = conv_bn_layer(
+            input=f_common,
+            num_filters=64,
+            filter_size=1,
+            stride=1,
+            act='relu',
+            name='f_border1')
+        f_border = conv_bn_layer(
+            input=f_border,
+            num_filters=64,
+            filter_size=3,
+            stride=1,
+            act='relu',
+            name='f_border2')
+        f_border = conv_bn_layer(
+            input=f_border,
+            num_filters=128,
+            filter_size=1,
+            stride=1,
+            act='relu',
+            name='f_border3')
+        f_border = conv_bn_layer(
+            input=f_border,
+            num_filters=4,
+            filter_size=3,
+            stride=1,
+            name='f_border4')
         # print("f_border shape: {}".format(f_border.shape))

         return f_score, f_border

     def SAST_Header2(self, f_common):
         """Detector header."""
         #f_tvo
-        f_tvo = conv_bn_layer(input=f_common, num_filters=64, filter_size=1, stride=1, act='relu', name='f_tvo1')
-        f_tvo = conv_bn_layer(input=f_tvo, num_filters=64, filter_size=3, stride=1, act='relu', name='f_tvo2')
-        f_tvo = conv_bn_layer(input=f_tvo, num_filters=128, filter_size=1, stride=1, act='relu', name='f_tvo3')
-        f_tvo = conv_bn_layer(input=f_tvo, num_filters=8, filter_size=3, stride=1, name='f_tvo4')
+        f_tvo = conv_bn_layer(
+            input=f_common,
+            num_filters=64,
+            filter_size=1,
+            stride=1,
+            act='relu',
+            name='f_tvo1')
+        f_tvo = conv_bn_layer(
+            input=f_tvo,
+            num_filters=64,
+            filter_size=3,
+            stride=1,
+            act='relu',
+            name='f_tvo2')
+        f_tvo = conv_bn_layer(
+            input=f_tvo,
+            num_filters=128,
+            filter_size=1,
+            stride=1,
+            act='relu',
+            name='f_tvo3')
+        f_tvo = conv_bn_layer(
+            input=f_tvo, num_filters=8, filter_size=3, stride=1, name='f_tvo4')
         # print("f_tvo shape: {}".format(f_tvo.shape))

         #f_tco
-        f_tco = conv_bn_layer(input=f_common, num_filters=64, filter_size=1, stride=1, act='relu', name='f_tco1')
-        f_tco = conv_bn_layer(input=f_tco, num_filters=64, filter_size=3, stride=1, act='relu', name='f_tco2')
-        f_tco = conv_bn_layer(input=f_tco, num_filters=128, filter_size=1, stride=1, act='relu', name='f_tco3')
-        f_tco = conv_bn_layer(input=f_tco, num_filters=2, filter_size=3, stride=1, name='f_tco4')
+        f_tco = conv_bn_layer(
+            input=f_common,
+            num_filters=64,
+            filter_size=1,
+            stride=1,
+            act='relu',
+            name='f_tco1')
+        f_tco = conv_bn_layer(
+            input=f_tco,
+            num_filters=64,
+            filter_size=3,
+            stride=1,
+            act='relu',
+            name='f_tco2')
+        f_tco = conv_bn_layer(
+            input=f_tco,
+            num_filters=128,
+            filter_size=1,
+            stride=1,
+            act='relu',
+            name='f_tco3')
+        f_tco = conv_bn_layer(
+            input=f_tco, num_filters=2, filter_size=3, stride=1, name='f_tco4')
         # print("f_tco shape: {}".format(f_tco.shape))

         return f_tvo, f_tco

     def cross_attention(self, f_common):
         """
         """
         f_shape = fluid.layers.shape(f_common)
-        f_theta = conv_bn_layer(input=f_common, num_filters=128, filter_size=1, stride=1, act='relu', name='f_theta')
-        f_phi = conv_bn_layer(input=f_common, num_filters=128, filter_size=1, stride=1, act='relu', name='f_phi')
-        f_g = conv_bn_layer(input=f_common, num_filters=128, filter_size=1, stride=1, act='relu', name='f_g')
+        f_theta = conv_bn_layer(
+            input=f_common,
+            num_filters=128,
+            filter_size=1,
+            stride=1,
+            act='relu',
+            name='f_theta')
+        f_phi = conv_bn_layer(
+            input=f_common,
+            num_filters=128,
+            filter_size=1,
+            stride=1,
+            act='relu',
+            name='f_phi')
+        f_g = conv_bn_layer(
+            input=f_common,
+            num_filters=128,
+            filter_size=1,
+            stride=1,
+            act='relu',
+            name='f_g')
         ### horizon
         fh_theta = f_theta
         fh_phi = f_phi
         fh_g = f_g
         #flatten
         fh_theta = fluid.layers.transpose(fh_theta, [0, 2, 3, 1])
-        fh_theta = fluid.layers.reshape(fh_theta, [f_shape[0] * f_shape[2], f_shape[3], 128])
+        fh_theta = fluid.layers.reshape(
+            fh_theta, [f_shape[0] * f_shape[2], f_shape[3], 128])
         fh_phi = fluid.layers.transpose(fh_phi, [0, 2, 3, 1])
-        fh_phi = fluid.layers.reshape(fh_phi, [f_shape[0] * f_shape[2], f_shape[3], 128])
+        fh_phi = fluid.layers.reshape(
+            fh_phi, [f_shape[0] * f_shape[2], f_shape[3], 128])
         fh_g = fluid.layers.transpose(fh_g, [0, 2, 3, 1])
-        fh_g = fluid.layers.reshape(fh_g, [f_shape[0] * f_shape[2], f_shape[3], 128])
+        fh_g = fluid.layers.reshape(fh_g,
+                                    [f_shape[0] * f_shape[2], f_shape[3], 128])
         #correlation
-        fh_attn = fluid.layers.matmul(fh_theta, fluid.layers.transpose(fh_phi, [0, 2, 1]))
+        fh_attn = fluid.layers.matmul(fh_theta,
+                                      fluid.layers.transpose(fh_phi, [0, 2, 1]))
         #scale
-        fh_attn = fh_attn / (128 ** 0.5)
+        fh_attn = fh_attn / (128**0.5)
         fh_attn = fluid.layers.softmax(fh_attn)
         #weighted sum
         fh_weight = fluid.layers.matmul(fh_attn, fh_g)
-        fh_weight = fluid.layers.reshape(fh_weight, [f_shape[0], f_shape[2], f_shape[3], 128])
+        fh_weight = fluid.layers.reshape(
+            fh_weight, [f_shape[0], f_shape[2], f_shape[3], 128])
         # print("fh_weight: {}".format(fh_weight.shape))
         fh_weight = fluid.layers.transpose(fh_weight, [0, 3, 1, 2])
-        fh_weight = conv_bn_layer(input=fh_weight, num_filters=128, filter_size=1, stride=1, name='fh_weight')
+        fh_weight = conv_bn_layer(
+            input=fh_weight,
+            num_filters=128,
+            filter_size=1,
+            stride=1,
+            name='fh_weight')
         #short cut
-        fh_sc = conv_bn_layer(input=f_common, num_filters=128, filter_size=1, stride=1, name='fh_sc')
+        fh_sc = conv_bn_layer(
+            input=f_common,
+            num_filters=128,
+            filter_size=1,
+            stride=1,
+            name='fh_sc')
         f_h = fluid.layers.relu(fh_weight + fh_sc)
         ######
         #vertical
@@ -174,31 +355,60 @@ class SASTHead(object):
         fv_g = fluid.layers.transpose(f_g, [0, 1, 3, 2])
         #flatten
         fv_theta = fluid.layers.transpose(fv_theta, [0, 2, 3, 1])
-        fv_theta = fluid.layers.reshape(fv_theta, [f_shape[0] * f_shape[3], f_shape[2], 128])
+        fv_theta = fluid.layers.reshape(
+            fv_theta, [f_shape[0] * f_shape[3], f_shape[2], 128])
         fv_phi = fluid.layers.transpose(fv_phi, [0, 2, 3, 1])
-        fv_phi = fluid.layers.reshape(fv_phi, [f_shape[0] * f_shape[3], f_shape[2], 128])
+        fv_phi = fluid.layers.reshape(
+            fv_phi, [f_shape[0] * f_shape[3], f_shape[2], 128])
         fv_g = fluid.layers.transpose(fv_g, [0, 2, 3, 1])
-        fv_g = fluid.layers.reshape(fv_g, [f_shape[0] * f_shape[3], f_shape[2], 128])
+        fv_g = fluid.layers.reshape(fv_g,
+                                    [f_shape[0] * f_shape[3], f_shape[2], 128])
         #correlation
-        fv_attn = fluid.layers.matmul(fv_theta, fluid.layers.transpose(fv_phi, [0, 2, 1]))
+        fv_attn = fluid.layers.matmul(fv_theta,
+                                      fluid.layers.transpose(fv_phi, [0, 2, 1]))
         #scale
-        fv_attn = fv_attn / (128 ** 0.5)
+        fv_attn = fv_attn / (128**0.5)
         fv_attn = fluid.layers.softmax(fv_attn)
         #weighted sum
         fv_weight = fluid.layers.matmul(fv_attn, fv_g)
-        fv_weight = fluid.layers.reshape(fv_weight, [f_shape[0], f_shape[3], f_shape[2], 128])
+        fv_weight = fluid.layers.reshape(
+            fv_weight, [f_shape[0], f_shape[3], f_shape[2], 128])
         # print("fv_weight: {}".format(fv_weight.shape))
         fv_weight = fluid.layers.transpose(fv_weight, [0, 3, 2, 1])
-        fv_weight = conv_bn_layer(input=fv_weight, num_filters=128, filter_size=1, stride=1, name='fv_weight')
+        fv_weight = conv_bn_layer(
+            input=fv_weight,
+            num_filters=128,
+            filter_size=1,
+            stride=1,
+            name='fv_weight')
         #short cut
-        fv_sc = conv_bn_layer(input=f_common, num_filters=128, filter_size=1, stride=1, name='fv_sc')
+        fv_sc = conv_bn_layer(
+            input=f_common,
+            num_filters=128,
+            filter_size=1,
+            stride=1,
+            name='fv_sc')
         f_v = fluid.layers.relu(fv_weight + fv_sc)
         ######
         f_attn = fluid.layers.concat([f_h, f_v], axis=1)
-        f_attn = conv_bn_layer(input=f_attn, num_filters=128, filter_size=1, stride=1, act='relu', name='f_attn')
+        f_attn = conv_bn_layer(
+            input=f_attn,
+            num_filters=128,
+            filter_size=1,
+            stride=1,
+            act='relu',
+            name='f_attn')
         return f_attn

     def __call__(self, blocks, with_cab=False):
+        """
+        Fuse different levels of feature maps from the backbone and predict results.
+        Args:
+            blocks(list): feature maps from backbone
+            with_cab(bool): whether to use cross attention
+        Return: predicts
+        """
         # for k, v in blocks.items():
         #     print(k, v.shape)
@@ -212,12 +422,12 @@ class SASTHead(object):
         f_common = fluid.layers.elementwise_add(x=f_down, y=f_up)
         f_common = fluid.layers.relu(f_common)
         # print("f_common: {}".format(f_common.shape))

         if self.with_cab:
             # print('enhance f_common with CAB.')
             f_common = self.cross_attention(f_common)

-        f_score, f_border= self.SAST_Header1(f_common)
+        f_score, f_border = self.SAST_Header1(f_common)
         f_tvo, f_tco = self.SAST_Header2(f_common)

         predicts = OrderedDict()
@@ -225,4 +435,4 @@ class SASTHead(object):
         predicts['f_border'] = f_border
         predicts['f_tvo'] = f_tvo
         predicts['f_tco'] = f_tco
         return predicts
\ No newline at end of file
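The reformatted `cross_attention` is scaled dot-product self-attention applied twice, once along each image row and once along each column. A numpy sketch of the horizontal half with dummy shapes (it mirrors the transpose/reshape/matmul sequence above, not Paddle's API):

```python
import numpy as np

N, C, H, W = 1, 128, 8, 16  # dummy feature-map shape
f_theta, f_phi, f_g = (np.random.rand(N, C, H, W).astype('float32')
                       for _ in range(3))

def rows(x):
    # (N, C, H, W) -> (N*H, W, C): every image row becomes a sequence
    return x.transpose(0, 2, 3, 1).reshape(N * H, W, C)

attn = rows(f_theta) @ rows(f_phi).transpose(0, 2, 1) / 128**0.5  # (N*H, W, W)
attn = np.exp(attn - attn.max(-1, keepdims=True))
attn /= attn.sum(-1, keepdims=True)  # softmax over the width axis
fh_weight = attn @ rows(f_g)         # (N*H, W, C)
print(fh_weight.shape)  # (8, 16, 128); the code above reshapes this back
                        # to (N, 128, H, W) before the 1x1 conv
```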
@@ -28,6 +28,13 @@ gradient_clip = 10

 class SRNPredict(object):
+    """
+    SRN prediction head:
+        see arxiv: https://arxiv.org/abs/2003.12294
+    args:
+        params(dict): the super parameters for building the network
+    """
+
     def __init__(self, params):
         super(SRNPredict, self).__init__()
         self.char_num = params['char_num']
@@ -39,7 +46,15 @@ class SRNPredict(object):
         self.hidden_dims = params['hidden_dims']

     def pvam(self, inputs, others):
+        """
+        Parallel Visual Attention Module
+        args:
+            inputs(variable): feature map extracted from the backbone network
+            others(list): other position-embedding variables
+        return: pvam_features
+        """
         b, c, h, w = inputs.shape
         conv_features = fluid.layers.reshape(x=inputs, shape=[-1, c, h * w])
         conv_features = fluid.layers.transpose(x=conv_features, perm=[0, 2, 1])
@@ -98,6 +113,15 @@ class SRNPredict(object):
         return pvam_features

     def gsrm(self, pvam_features, others):
+        """
+        Global Semantic Reasoning Module
+        args:
+            pvam_features(variable): feature map extracted from pvam
+            others(list): other position-embedding variables
+        return: gsrm_features, word_out, gsrm_out
+        """
         #===== GSRM Visual-to-semantic embedding block =====
         b, t, c = pvam_features.shape
@@ -190,7 +214,15 @@ class SRNPredict(object):
         return gsrm_features, word_out, gsrm_out

     def vsfd(self, pvam_features, gsrm_features):
+        """
+        Visual-Semantic Fusion Decoder Module
+        args:
+            pvam_features(variable): feature map extracted from pvam
+            gsrm_features(variable): feature map extracted from gsrm
+        return: fc_out
+        """
         #===== Visual-Semantic Fusion Decoder Module =====
         b, t, c1 = pvam_features.shape
         b, t, c2 = gsrm_features.shape
...
@@ -70,6 +70,13 @@ class LocalizationNetwork(object):
         return initial_bias

     def __call__(self, image):
+        """
+        Estimate the parameters of the geometric transformation.
+        Args:
+            image: input image
+        Return:
+            batch_C_prime: the matrix of the geometric transformation
+        """
         F = self.F
         loc_lr = self.loc_lr
         if self.model_name == "large":
@@ -215,6 +222,14 @@ class GridGenerator(object):
         return batch_C_ex_part_tensor

     def __call__(self, batch_C_prime, I_r_size):
+        """
+        Generate the grid for the grid_sampler.
+        Args:
+            batch_C_prime: the matrix of the geometric transformation
+            I_r_size: the shape of the input image
+        Return:
+            batch_P_prime: the grid for the grid_sampler
+        """
         C = self.build_C()
         P = self.build_P(I_r_size)
         inv_delta_C = self.build_inv_delta_C(C).astype('float32')
...
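`GridGenerator.__call__` combines a base sampling grid `P` with the predicted fiducial points to produce the warped grid for `grid_sampler`. A plain-numpy sketch of what a `build_P`-style mesh could look like, assuming coordinates normalized to roughly [-1, 1] (an illustration, not the repo's exact implementation):

```python
import numpy as np

def build_P(I_r_size):
    I_r_height, I_r_width = I_r_size
    x = (np.arange(-I_r_width, I_r_width, 2) + 1.0) / I_r_width     # W values
    y = (np.arange(-I_r_height, I_r_height, 2) + 1.0) / I_r_height  # H values
    P = np.stack(np.meshgrid(x, y), axis=2)  # (H, W, 2) grid of (x, y)
    return P.reshape([-1, 2])                # flattened for the grid_sampler

print(build_P((32, 100)).shape)  # (3200, 2)
```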
@@ -29,9 +29,17 @@ def cosine_decay_with_warmup(learning_rate,
                              step_each_epoch,
                              epochs=500,
                              warmup_minibatch=1000):
-    """Applies cosine decay to the learning rate.
+    """
+    Applies cosine decay to the learning rate.
     lr = 0.05 * (math.cos(epoch * (math.pi / 120)) + 1)
     decrease lr for every mini-batch and start with warmup.
+    args:
+        learning_rate(float): initial learning rate
+        step_each_epoch(int): number of steps per epoch in training
+        epochs(int): number of training epochs
+        warmup_minibatch(int): number of minibatches used for warmup
+    return:
+        lr(tensor): learning rate tensor
     """
     global_step = _decay_step_counter()
     lr = fluid.layers.tensor.create_global_var(
@@ -65,6 +73,7 @@ def AdamDecay(params, parameter_list=None):
         params(dict): the super parameters
         parameter_list (list): list of Variable names to update to minimize loss
     return:
+        optimizer: an Adam optimizer instance
     """
     base_lr = params['base_lr']
     beta1 = params['beta1']
@@ -121,6 +130,7 @@ def RMSProp(params, parameter_list=None):
         params(dict): the super parameters
         parameter_list (list): list of Variable names to update to minimize loss
     return:
+        optimizer: an RMSProp optimizer instance
     """
     base_lr = params.get("base_lr", 0.001)
     l2_decay = params.get("l2_decay", 0.00005)
...
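The docstring's formula is one concrete case of a half-amplitude cosine decay; with warmup, the rate ramps linearly for the first `warmup_minibatch` steps and then follows the cosine curve per epoch. A plain-Python sketch of the schedule (an assumed reading of the Paddle ops, not a drop-in replacement):

```python
import math

def lr_at(step, base_lr=0.001, step_each_epoch=100, epochs=500,
          warmup_minibatch=1000):
    if step < warmup_minibatch:
        return base_lr * step / warmup_minibatch  # linear warmup
    cur_epoch = step // step_each_epoch
    return base_lr * (math.cos(cur_epoch * math.pi / epochs) + 1) / 2

for s in (0, 500, 1000, 25000, 49900):
    print(s, round(lr_at(s), 6))  # ramps up, then decays smoothly toward 0
```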
@@ -24,6 +24,7 @@ import string
 import cv2
 from shapely.geometry import Polygon
 import pyclipper
+from copy import deepcopy

 class DBPostProcess(object):
@@ -39,13 +40,15 @@ class DBPostProcess(object):
         self.min_size = 3
         self.dilation_kernel = np.array([[1, 1], [1, 1]])

-    def boxes_from_bitmap(self, pred, _bitmap, dest_width, dest_height):
-        '''
-        _bitmap: single map with shape (1, H, W),
-            whose values are binarized as {0, 1}
-        '''
-
-        bitmap = _bitmap
+    def boxes_from_bitmap(self, pred, mask):
+        """
+        Get boxes from the bitmap predicted by DB.
+        :param pred: the probability map predicted by DB.
+        :param mask: the binarized map obtained by thresholding 'pred'.
+        :return: (boxes, the score of each box)
+        """
+        dest_height, dest_width = pred.shape[-2:]
+        bitmap = deepcopy(mask)
         height, width = bitmap.shape
         outs = cv2.findContours((bitmap * 255).astype(np.uint8), cv2.RETR_LIST,
@@ -87,6 +90,11 @@ class DBPostProcess(object):
         return boxes, scores

     def unclip(self, box):
+        """
+        Shrink or expand the box according to 'unclip_ratio'.
+        :param box: the predicted box
+        :return: the unclipped box
+        """
         unclip_ratio = self.unclip_ratio
         poly = Polygon(box)
         distance = poly.area * unclip_ratio / poly.length
@@ -96,6 +104,11 @@ class DBPostProcess(object):
         return expanded

     def get_mini_boxes(self, contour):
+        """
+        Get the minimum-area box from the contour.
+        :param contour: the predicted contour
+        :return: the predicted box
+        """
         bounding_box = cv2.minAreaRect(contour)
         points = sorted(list(cv2.boxPoints(bounding_box)), key=lambda x: x[0])
@@ -119,6 +132,12 @@ class DBPostProcess(object):
         return box, min(bounding_box[1])

     def box_score_fast(self, bitmap, _box):
+        """
+        Calculate the score of the box.
+        :param bitmap: the binarized image predicted by DB
+        :param _box: the predicted box
+        :return: score
+        """
         h, w = bitmap.shape[:2]
         box = _box.copy()
         xmin = np.clip(np.floor(box[:, 0].min()).astype(np.int), 0, w - 1)
@@ -137,13 +156,14 @@ class DBPostProcess(object):
         pred = pred[:, 0, :, :]
         segmentation = pred > self.thresh

         boxes_batch = []
         for batch_index in range(pred.shape[0]):
-            height, width = pred.shape[-2:]
-            mask = cv2.dilate(np.array(segmentation[batch_index]).astype(np.uint8), self.dilation_kernel)
-            tmp_boxes, tmp_scores = self.boxes_from_bitmap(pred[batch_index], mask, width, height)
+            mask = cv2.dilate(
+                np.array(segmentation[batch_index]).astype(np.uint8),
+                self.dilation_kernel)
+            tmp_boxes, tmp_scores = self.boxes_from_bitmap(pred[batch_index],
+                                                           mask)
             boxes = []
             for k in range(len(tmp_boxes)):
...
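`unclip` grows each shrunken text kernel back to full size: the offset distance is the polygon's area times `unclip_ratio` divided by its perimeter, fed to a round-joint pyclipper offset. A worked sketch on a hypothetical 100x30 box:

```python
import pyclipper
from shapely.geometry import Polygon

box = [(10, 10), (110, 10), (110, 40), (10, 40)]  # 100 x 30 rectangle
unclip_ratio = 2.0

poly = Polygon(box)
distance = poly.area * unclip_ratio / poly.length  # 3000 * 2 / 260

offset = pyclipper.PyclipperOffset()
offset.AddPath(box, pyclipper.JT_ROUND, pyclipper.ET_CLOSEDPOLYGON)
expanded = offset.Execute(distance)  # list of expanded polygons

print(round(distance, 2), len(expanded))  # 23.08 1
```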
(The source diff for this file is too large to display; view the blob instead.)
# Waiting for your contribution
PaddleOCR welcomes you to provide multilingual corpora for us to synthesize more data to optimize the models.
If you are interested, you can submit the corpus text to this directory and name it {language}_corpus.txt.
PaddleOCR thanks you for your contribution.
\ No newline at end of file
# Corpus contributions welcome
PaddleOCR warmly welcomes multilingual corpora so that we can synthesize more data to optimize the models.
If you are interested, you can submit corpus text to this directory, named {language}_corpus.txt, and the PaddleOCR team will thank you for your contribution.
!
"
%
&
'
(
)
+
,
-
.
/
0
1
2
3
4
5
6
7
8
9
:
;
?
[
]
«
³
µ
º
»
A
Á
À
B
C
Ç
D
E
É
È
F
G
H
I
Í
Ï
J
K
L
M
N
O
Ó
Ò
P
Q
R
S
T
U
V
W
X
Y
Z
a
á
à
b
c
d
e
é
è
f
g
h
i
í
ï
j
k
l
m
n
o
ó
ò
p
q
r
s
t
u
ú
ü
v
w
x
y
z
ç
æ
Æ
ê
Ê
ë
Ë
ñ
Ñ
ô
Ô
œ
Œ
ù
Ù
@@ -90,3 +90,15 @@ def check_and_read_gif(img_path):
         return imgvalue, True
     return None, False

+
+def create_multi_devices_program(program, loss_var_name):
+    build_strategy = fluid.BuildStrategy()
+    build_strategy.memory_optimize = False
+    build_strategy.enable_inplace = True
+    exec_strategy = fluid.ExecutionStrategy()
+    exec_strategy.num_iteration_per_drop_scope = 1
+    compile_program = fluid.CompiledProgram(program).with_data_parallel(
+        loss_name=loss_var_name,
+        build_strategy=build_strategy,
+        exec_strategy=exec_strategy)
+    return compile_program
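The new helper wraps a program in a data-parallel `CompiledProgram` so a single `exe.run` call drives all visible devices. A hypothetical usage sketch (the program and loss normally come from the model-build step; the toy network here is made up):

```python
import paddle.fluid as fluid

place = fluid.CUDAPlace(0) if fluid.is_compiled_with_cuda() \
    else fluid.CPUPlace()
exe = fluid.Executor(place)

train_program, startup_program = fluid.Program(), fluid.Program()
with fluid.program_guard(train_program, startup_program):
    x = fluid.data(name='x', shape=[None, 1], dtype='float32')
    y = fluid.data(name='y', shape=[None, 1], dtype='float32')
    pred = fluid.layers.fc(input=x, size=1)
    loss = fluid.layers.reduce_mean(
        fluid.layers.square_error_cost(input=pred, label=y))
    fluid.optimizer.SGD(learning_rate=0.01).minimize(loss)

exe.run(startup_program)
compiled = create_multi_devices_program(train_program, loss.name)
# exe.run(compiled, feed={'x': ..., 'y': ...}, fetch_list=[loss.name])
```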
@@ -3,4 +3,5 @@ imgaug
 pyclipper
 lmdb
 tqdm
 numpy
+opencv-python
@@ -63,8 +63,9 @@ class TextRecognizer(object):
     def resize_norm_img(self, img, max_wh_ratio):
         imgC, imgH, imgW = self.rec_image_shape
         assert imgC == img.shape[2]
+        wh_ratio = max(max_wh_ratio, imgW * 1.0 / imgH)
         if self.character_type == "ch":
-            imgW = int((32 * max_wh_ratio))
+            imgW = int((32 * wh_ratio))
         h, w = img.shape[:2]
         ratio = w / float(h)
         if math.ceil(imgH * ratio) > imgW:
...
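The added floor mirrors the training-side fix in `resize_norm_img_chinese`: a batch whose widest crop is still narrower than the default canvas no longer shrinks `imgW` below the configured shape. A sketch with hypothetical numbers:

```python
imgC, imgH, imgW = 3, 32, 320  # hypothetical rec_image_shape

def batch_img_w(max_wh_ratio):
    wh_ratio = max(max_wh_ratio, imgW * 1.0 / imgH)
    return int(32 * wh_ratio)

print(batch_img_w(2.0))   # 320, not 64: the default canvas is the floor
print(batch_img_w(15.0))  # 480: long lines still widen the canvas
```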
@@ -84,19 +84,29 @@ def parse_args():
     parser.add_argument("--enable_mkldnn", type=str2bool, default=False)
     parser.add_argument("--use_zero_copy_run", type=str2bool, default=False)
     parser.add_argument("--use_pdserving", type=str2bool, default=False)

     return parser.parse_args()

 def create_predictor(args, mode):
+    """
+    Create a predictor for inference.
+    :param args: params of the prediction engine
+    :param mode: one of 'det', 'cls' and 'rec'
+    :return: predictor
+    """
     if mode == "det":
         model_dir = args.det_model_dir
     elif mode == 'cls':
         model_dir = args.cls_model_dir
-    else:
+    elif mode == 'rec':
         model_dir = args.rec_model_dir
+    else:
+        raise ValueError(
+            "'mode' of create_predictor() can only be one of ['det', 'cls', 'rec']"
+        )

     if model_dir is None:
         logger.info("not find {} model file path {}".format(mode, model_dir))
@@ -144,6 +154,12 @@ def create_predictor(args, mode):

 def draw_text_det_res(dt_boxes, img_path):
+    """
+    Visualize the results of detection.
+    :param dt_boxes: the boxes predicted by the detection model
+    :param img_path: image path
+    :return: visualized image
+    """
     src_im = cv2.imread(img_path)
     for box in dt_boxes:
         box = np.array(box).astype(np.int32).reshape(-1, 2)
...
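With the explicit `elif`/`else`, a mistyped mode now fails fast instead of silently loading the recognition model. A hypothetical call sketch, assuming the function's usual `(predictor, input_tensor, output_tensors)` return:

```python
args = parse_args()

# a valid mode returns the inference engine handles
predictor, input_tensor, output_tensors = create_predictor(args, mode='det')

# anything else raises immediately:
# create_predictor(args, mode='table')
# ValueError: 'mode' of create_predictor() can only be one of
#             ['det', 'cls', 'rec']
```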
@@ -90,13 +90,13 @@ def load_config(file_path):
     merge_config(default_config)
     _, ext = os.path.splitext(file_path)
     assert ext in ['.yml', '.yaml'], "only support yaml files for now"
-    merge_config(yaml.load(open(file_path), Loader=yaml.Loader))
+    merge_config(yaml.load(open(file_path, 'rb'), Loader=yaml.Loader))
     assert "reader_yml" in global_config['Global'],\
         "absence reader_yml in global"
     reader_file_path = global_config['Global']['reader_yml']
     _, ext = os.path.splitext(reader_file_path)
     assert ext in ['.yml', '.yaml'], "only support yaml files for reader"
-    merge_config(yaml.load(open(reader_file_path), Loader=yaml.Loader))
+    merge_config(yaml.load(open(reader_file_path, 'rb'), Loader=yaml.Loader))
     return global_config
...
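Opening the YAML configs with `'rb'` lets PyYAML detect the file's encoding itself (UTF-8 or UTF-16, with or without a BOM) instead of trusting the platform's locale default, a common failure on non-UTF-8 locales. A sketch with a hypothetical config path:

```python
import yaml

with open('configs/det/det_db.yml', 'rb') as f:  # hypothetical path
    config = yaml.load(f, Loader=yaml.Loader)

print(type(config))  # dict, parsed regardless of a leading BOM
```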