diff --git a/doc/doc_ch/algorithm_overview.md b/doc/doc_ch/algorithm_overview.md
index 858dc02b9d21981ce3b465f33ce494b290db51fb..ecb0e9dfefbfdef2f8cea273c4e3de468aa29415 100755
--- a/doc/doc_ch/algorithm_overview.md
+++ b/doc/doc_ch/algorithm_overview.md
@@ -24,7 +24,7 @@ PaddleOCR将**持续新增**支持OCR领域前沿算法与模型,**欢迎广
 ### 1.1 文本检测算法
 
 已支持的文本检测算法列表(戳链接获取使用教程):
-- [x] [DB](./algorithm_det_db.md)
+- [x] [DB与DB++](./algorithm_det_db.md)
 - [x] [EAST](./algorithm_det_east.md)
 - [x] [SAST](./algorithm_det_sast.md)
 - [x] [PSENet](./algorithm_det_psenet.md)
@@ -41,6 +41,7 @@ PaddleOCR将**持续新增**支持OCR领域前沿算法与模型,**欢迎广
 |SAST|ResNet50_vd|91.39%|83.77%|87.42%|[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_sast_icdar15_v2.0_train.tar)|
 |PSE|ResNet50_vd|85.81%|79.53%|82.55%|[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_vd_pse_v2.0_train.tar)|
 |PSE|MobileNetV3|82.20%|70.48%|75.89%|[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_mv3_pse_v2.0_train.tar)|
+|DB++|ResNet50|90.89%|82.66%|86.58%|[合成数据预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/ResNet50_dcn_asf_synthtext_pretrained.pdparams)/[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_db%2B%2B_icdar15_train.tar)|
 
 在Total-text文本检测公开数据集上,算法效果如下:
 
@@ -129,10 +130,10 @@ PaddleOCR将**持续新增**支持OCR领域前沿算法与模型,**欢迎广
 
 已支持的关键信息抽取算法列表(戳链接获取使用教程):
 
-- [x] [VI-LayoutXLM](./algorithm_kie_vi_laoutxlm.md)
-- [x] [LayoutLM](./algorithm_kie_laoutxlm.md)
-- [x] [LayoutLMv2](./algorithm_kie_laoutxlm.md)
-- [x] [LayoutXLM](./algorithm_kie_laoutxlm.md)
+- [x] [VI-LayoutXLM](./algorithm_kie_vi_layoutxlm.md)
+- [x] [LayoutLM](./algorithm_kie_layoutxlm.md)
+- [x] [LayoutLMv2](./algorithm_kie_layoutxlm.md)
+- [x] [LayoutXLM](./algorithm_kie_layoutxlm.md)
 - [x] [SDMGR](././algorithm_kie_sdmgr.md)
 
 在wildreceipt发票公开数据集上,算法复现效果如下:
diff --git a/doc/doc_en/algorithm_det_db_en.md b/doc/doc_en/algorithm_det_db_en.md
index f5f333a039acded88f0f28d302821c5eb10d7402..fde344c3572f771e3e0fe5f9f62282cd1ae0a024 100644
--- a/doc/doc_en/algorithm_det_db_en.md
+++ b/doc/doc_en/algorithm_det_db_en.md
@@ -1,4 +1,4 @@
-# DB
+# DB && DB++
 
 - [1. Introduction](#1)
 - [2. Environment](#2)
@@ -21,13 +21,23 @@ Paper:
 > Liao, Minghui and Wan, Zhaoyi and Yao, Cong and Chen, Kai and Bai, Xiang
 > AAAI, 2020
 
+> [Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion](https://arxiv.org/abs/2202.10304)
+> Liao, Minghui and Zou, Zhisheng and Wan, Zhaoyi and Yao, Cong and Bai, Xiang
+> TPAMI, 2022
+
 On the ICDAR2015 dataset, the text detection result is as follows:
 
 |Model|Backbone|Configuration|Precision|Recall|Hmean|Download|
 | --- | --- | --- | --- | --- | --- | --- |
 |DB|ResNet50_vd|[configs/det/det_r50_vd_db.yml](../../configs/det/det_r50_vd_db.yml)|86.41%|78.72%|82.38%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_db_v2.0_train.tar)|
 |DB|MobileNetV3|[configs/det/det_mv3_db.yml](../../configs/det/det_mv3_db.yml)|77.29%|73.08%|75.12%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_mv3_db_v2.0_train.tar)|
+|DB++|ResNet50|[configs/det/det_r50_db++_ic15.yml](../../configs/det/det_r50_db++_ic15.yml)|90.89%|82.66%|86.58%|[pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/ResNet50_dcn_asf_synthtext_pretrained.pdparams)/[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_db%2B%2B_icdar15_train.tar)|
+
+On the TD_TR dataset, the text detection result is as follows:
+|Model|Backbone|Configuration|Precision|Recall|Hmean|Download|
+| --- | --- | --- | --- | --- | --- | --- |
+|DB++|ResNet50|[configs/det/det_r50_db++_td_tr.yml](../../configs/det/det_r50_db++_td_tr.yml)|92.92%|86.48%|89.58%|[pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/ResNet50_dcn_asf_synthtext_pretrained.pdparams)/[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_db%2B%2B_td_tr_train.tar)|
 
 <a name="2"></a>
 ## 2. Environment
 
@@ -96,4 +106,12 @@ More deployment schemes supported for DB:
   pages={11474--11481},
   year={2020}
 }
-```
\ No newline at end of file
+
+@article{liao2022real,
+  title={Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion},
+  author={Liao, Minghui and Zou, Zhisheng and Wan, Zhaoyi and Yao, Cong and Bai, Xiang},
+  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
+  year={2022},
+  publisher={IEEE}
+}
+```
diff --git a/doc/doc_en/algorithm_overview_en.md b/doc/doc_en/algorithm_overview_en.md
index 5bf569e3e1649cfabbe196be7e1a55d1caa3bf61..bca22f78482980bed18d6447d0cf07b27c26720d 100755
--- a/doc/doc_en/algorithm_overview_en.md
+++ b/doc/doc_en/algorithm_overview_en.md
@@ -22,7 +22,7 @@ Developers are welcome to contribute more algorithms! Please refer to [add new a
 ### 1.1 Text Detection Algorithms
 
 Supported text detection algorithms (Click the link to get the tutorial):
-- [x] [DB](./algorithm_det_db_en.md)
+- [x] [DB && DB++](./algorithm_det_db_en.md)
 - [x] [EAST](./algorithm_det_east_en.md)
 - [x] [SAST](./algorithm_det_sast_en.md)
 - [x] [PSENet](./algorithm_det_psenet_en.md)
@@ -39,6 +39,7 @@ On the ICDAR2015 dataset, the text detection result is as follows:
 |SAST|ResNet50_vd|91.39%|83.77%|87.42%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_sast_icdar15_v2.0_train.tar)|
 |PSE|ResNet50_vd|85.81%|79.53%|82.55%|[trianed model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_vd_pse_v2.0_train.tar)|
 |PSE|MobileNetV3|82.20%|70.48%|75.89%|[trianed model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_mv3_pse_v2.0_train.tar)|
+|DB++|ResNet50|90.89%|82.66%|86.58%|[pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/ResNet50_dcn_asf_synthtext_pretrained.pdparams)/[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_db%2B%2B_icdar15_train.tar)|
 
 On Total-Text dataset, the text detection result is as follows:
 
@@ -127,10 +128,10 @@ On the PubTabNet dataset, the algorithm result is as follows:
 
 Supported KIE algorithms (Click the link to get the tutorial):
 
-- [x] [VI-LayoutXLM](./algorithm_kie_vi_laoutxlm_en.md)
-- [x] [LayoutLM](./algorithm_kie_laoutxlm_en.md)
-- [x] [LayoutLMv2](./algorithm_kie_laoutxlm_en.md)
-- [x] [LayoutXLM](./algorithm_kie_laoutxlm_en.md)
+- [x] [VI-LayoutXLM](./algorithm_kie_vi_layoutxlm_en.md)
+- [x] [LayoutLM](./algorithm_kie_layoutxlm_en.md)
+- [x] [LayoutLMv2](./algorithm_kie_layoutxlm_en.md)
+- [x] [LayoutXLM](./algorithm_kie_layoutxlm_en.md)
 - [x] [SDMGR](./algorithm_kie_sdmgr_en.md)
 
 On wildreceipt dataset, the algorithm result is as follows:
diff --git a/ppocr/postprocess/rec_postprocess.py b/ppocr/postprocess/rec_postprocess.py
index f77631700648e84f28223cb14738e7b4ab679012..749060a053f1442f4bf5df6c5f4b56205e893be8 100644
--- a/ppocr/postprocess/rec_postprocess.py
+++ b/ppocr/postprocess/rec_postprocess.py
@@ -24,7 +24,7 @@ class BaseRecLabelDecode(object):
     def __init__(self, character_dict_path=None, use_space_char=False):
         self.beg_str = "sos"
         self.end_str = "eos"
-
+        self.reverse = False
         self.character_str = []
         if character_dict_path is None:
             self.character_str = "0123456789abcdefghijklmnopqrstuvwxyz"
@@ -38,6 +38,8 @@ class BaseRecLabelDecode(object):
             if use_space_char:
                 self.character_str.append(" ")
             dict_character = list(self.character_str)
+        if 'arabic' in character_dict_path:
+            self.reverse = True
 
         dict_character = self.add_special_char(dict_character)
         self.dict = {}
@@ -45,11 +47,6 @@ class BaseRecLabelDecode(object):
             self.dict[char] = i
         self.character = dict_character
 
-        if 'arabic' in character_dict_path:
-            self.reverse = True
-        else:
-            self.reverse = False
-
     def pred_reverse(self, pred):
         pred_re = []
         c_current = ''
diff --git a/ppstructure/kie/README.md b/ppstructure/kie/README.md
index adb19a3ca729821ab16bf8f0f8ec14c2376de1de..d9471fb18d140704fdeb76c321f8a001426f872d 100644
--- a/ppstructure/kie/README.md
+++ b/ppstructure/kie/README.md
@@ -242,9 +242,7 @@ For training, evaluation and inference tutorial for KIE models, please refer to
 
 For training, evaluation and inference tutorial for text detection models, please refer to [text detection doc](../../doc/doc_en/detection_en.md).
 
-For training, evaluation and inference tutorial for text recognition models, please refer to [text recognition doc](../../doc/doc_en/recognition.md).
-
-If you want to finish the KIE tasks in your scene, and don't know what to prepare, please refer to [End cdoc](../../doc/doc_en/recognition.md).
+For training, evaluation and inference tutorial for text recognition models, please refer to [text recognition doc](../../doc/doc_en/recognition_en.md).
 
 To complete the key information extraction task in your own scenario from data preparation to model selection, please refer to: [Guide to End-to-end KIE](./how_to_do_kie_en.md)。
 
diff --git a/ppstructure/pdf2word/pdf2word.md b/ppstructure/pdf2word/pdf2word.md
index 8d69d60748c2a3b39282b75ce28f32f3089b589c..564df4063e101e028afbea5c3acab8946196d31d 100644
--- a/ppstructure/pdf2word/pdf2word.md
+++ b/ppstructure/pdf2word/pdf2word.md
@@ -1,11 +1,6 @@
 # PDF2WORD
 
-PDF2WORD是PaddleOCR社区开发者@whj 基于PP-Structure智能文档分析模型实现的PDF转换Word应用程序,提供可直接安装的exe,方便windows用户运行
-
-<div align="center">
-<img src="..." />
-</div>
-
+PDF2WORD是PaddleOCR社区开发者[whjdark](https://github.com/whjdark) 基于PP-Structure智能文档分析模型实现的PDF转换Word应用程序,提供可直接安装的exe,方便windows用户运行
 
 ## 1.使用
 
@@ -23,17 +18,7 @@ PDF2WORD是PaddleOCR社区开发者@whj 基于PP-Structure智能文档分析模
 python pdf2word.py
 ```
 
-## 2.自行打包
-
-PDF2WORD应用程序通过[QPT](https://github.com/QPT-Family/QPT)工具打包实现,若您修改了界面代码需要重新打包,请在 `PaddleOCR` 文件夹下运行下方指令
-
-```
-cd ./
-mv ./ppstructure/pdf2word .. -r
-python GenEXE.py
-```
-
-## 3.软件下载
+## 2.软件下载
 
 如需获取已打包程序,可以扫描下方二维码,关注公众号填写问卷后,加入PaddleOCR官方交流群免费获取20G OCR学习大礼包,内含OCR场景应用集合(包含数码管、液晶屏、车牌、高精度SVTR模型等7个垂类模型)、《动手学OCR》电子书、课程回放视频、前沿论文等重磅资料
 
diff --git a/ppstructure/pdf2word/pdf2word.py b/ppstructure/pdf2word/pdf2word.py
index 18e2eee86a9776963901da4f0e81974943772bbe..6b394094f3b24bfaa7829541f4f9a2a48f3d493f 100644
--- a/ppstructure/pdf2word/pdf2word.py
+++ b/ppstructure/pdf2word/pdf2word.py
@@ -438,4 +438,4 @@ def main():
 
 
 if __name__ == "__main__":
-    main()
\ No newline at end of file
+    main()
diff --git a/ppstructure/table/README.md b/ppstructure/table/README.md
index c606d641975556fe578a7e1cff8a575ccb4bff21..08635516ba8301e6f98f175e5eba8c0a97b1708e 100644
--- a/ppstructure/table/README.md
+++ b/ppstructure/table/README.md
@@ -51,7 +51,9 @@ The performance indicators are explained as follows:
 
 ### 4.1 Quick start
 
-PP-Structure currently provides table recognition models in both Chinese and English. For the model link, see [models_list](../docs/models_list.md). The following takes the Chinese table recognition model as an example to introduce how to recognize a table.
+PP-Structure currently provides table recognition models in both Chinese and English. For the model link, see [models_list](../docs/models_list.md). The whl package is also provided for quick use, see [quickstart](../docs/quickstart_en.md) for details.
+
+The following takes the Chinese table recognition model as an example to introduce how to recognize a table.
 
 Use the following commands to quickly complete the identification of a table.
 
diff --git a/ppstructure/table/README_ch.md b/ppstructure/table/README_ch.md
index 8aa0dc8653223f9b84a283d8be2329f3c9d12b47..1ef126261d9ce832cd1919a1b3991f341add998c 100644
--- a/ppstructure/table/README_ch.md
+++ b/ppstructure/table/README_ch.md
@@ -57,7 +57,9 @@
 
 ### 4.1 快速开始
 
-PP-Structure目前提供了中英文两种语言的表格识别模型,模型链接见 [models_list](../docs/models_list.md)。下面以中文表格识别模型为例,介绍如何识别一张表格。
+PP-Structure目前提供了中英文两种语言的表格识别模型,模型链接见 [models_list](../docs/models_list.md)。也提供了whl包的形式方便快速使用,详见 [quickstart](../docs/quickstart.md)。
+
+下面以中文表格识别模型为例,介绍如何识别一张表格。
 
 使用如下命令即可快速完成一张表格的识别。
 ```python
diff --git a/requirements.txt b/requirements.txt
index cf80775f73b421f96875d48b4659f2b7adf852c9..2ccd486f34ed4cb01312cf3417404f724d762baf 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -7,7 +7,7 @@ tqdm
 numpy
 visualdl
 rapidfuzz
-opencv-contrib-python==4.4.0.46
+opencv-contrib-python
 cython
 lxml
 premailer