Commit aa2e2835 · Author: qq_25193841

Merge remote-tracking branch 'origin/dygraph' into dy1

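......@@ -24,7 +24,7 @@ PaddleOCR will **keep adding** support for cutting-edge OCR algorithms and models, and **welcomes
### 1.1 Text Detection Algorithms
Supported text detection algorithms (click a link for the tutorial):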
- [x] [DB](./algorithm_det_db.md)
- [x] [DB and DB++](./algorithm_det_db.md)
- [x] [EAST](./algorithm_det_east.md)
- [x] [SAST](./algorithm_det_sast.md)
- [x] [PSENet](./algorithm_det_psenet.md)
......@@ -41,6 +41,7 @@ PaddleOCR will **keep adding** support for cutting-edge OCR algorithms and models, and **welcomes
|SAST|ResNet50_vd|91.39%|83.77%|87.42%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_sast_icdar15_v2.0_train.tar)|
|PSE|ResNet50_vd|85.81%|79.53%|82.55%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_vd_pse_v2.0_train.tar)|
|PSE|MobileNetV3|82.20%|70.48%|75.89%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_mv3_pse_v2.0_train.tar)|
|DB++|ResNet50|90.89%|82.66%|86.58%|[model pretrained on synthetic data](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/ResNet50_dcn_asf_synthtext_pretrained.pdparams)/[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_db%2B%2B_icdar15_train.tar)|
On the public Total-text text detection dataset, the results are as follows:
......@@ -129,10 +130,10 @@ PaddleOCR will **keep adding** support for cutting-edge OCR algorithms and models, and **welcomes
Supported key information extraction algorithms (click a link for the tutorial):
- [x] [VI-LayoutXLM](./algorithm_kie_vi_laoutxlm.md)
- [x] [LayoutLM](./algorithm_kie_laoutxlm.md)
- [x] [LayoutLMv2](./algorithm_kie_laoutxlm.md)
- [x] [LayoutXLM](./algorithm_kie_laoutxlm.md)
- [x] [VI-LayoutXLM](./algorithm_kie_vi_layoutxlm.md)
- [x] [LayoutLM](./algorithm_kie_layoutxlm.md)
- [x] [LayoutLMv2](./algorithm_kie_layoutxlm.md)
- [x] [LayoutXLM](./algorithm_kie_layoutxlm.md)
- [x] [SDMGR](././algorithm_kie_sdmgr.md)
On the public wildreceipt receipt dataset, the reproduced results are as follows:
......
# DB
# DB && DB++
- [1. Introduction](#1)
- [2. Environment](#2)
......@@ -21,13 +21,23 @@ Paper:
> Liao, Minghui and Wan, Zhaoyi and Yao, Cong and Chen, Kai and Bai, Xiang
> AAAI, 2020
> [Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion](https://arxiv.org/abs/2202.10304)
> Liao, Minghui and Zou, Zhisheng and Wan, Zhaoyi and Yao, Cong and Bai, Xiang
> TPAMI, 2022
On the ICDAR2015 dataset, the text detection result is as follows:
|Model|Backbone|Configuration|Precision|Recall|Hmean|Download|
| --- | --- | --- | --- | --- | --- | --- |
|DB|ResNet50_vd|[configs/det/det_r50_vd_db.yml](../../configs/det/det_r50_vd_db.yml)|86.41%|78.72%|82.38%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_db_v2.0_train.tar)|
|DB|MobileNetV3|[configs/det/det_mv3_db.yml](../../configs/det/det_mv3_db.yml)|77.29%|73.08%|75.12%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_mv3_db_v2.0_train.tar)|
|DB++|ResNet50|[configs/det/det_r50_db++_ic15.yml](../../configs/det/det_r50_db++_ic15.yml)|90.89%|82.66%|86.58%|[pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/ResNet50_dcn_asf_synthtext_pretrained.pdparams)/[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_db%2B%2B_icdar15_train.tar)|
On the TD_TR dataset, the text detection result is as follows:
|Model|Backbone|Configuration|Precision|Recall|Hmean|Download|
| --- | --- | --- | --- | --- | --- | --- |
|DB++|ResNet50|[configs/det/det_r50_db++_td_tr.yml](../../configs/det/det_r50_db++_td_tr.yml)|92.92%|86.48%|89.58%|[pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/ResNet50_dcn_asf_synthtext_pretrained.pdparams)/[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_db%2B%2B_td_tr_train.tar)|
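The Hmean column in these tables is the harmonic mean of precision and recall. The following is a minimal sketch for sanity-checking the two DB++ rows above; the function name `hmean` is our own and is not part of PaddleOCR. Because the published precision and recall are already rounded, other rows may differ from a recomputed value in the last digit.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given in percent."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# DB++ rows as reported in the tables above.
print(round(hmean(90.89, 82.66), 2))  # ICDAR2015
print(round(hmean(92.92, 86.48), 2))  # TD_TR
```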
<a name="2"></a>
## 2. Environment
......@@ -96,4 +106,12 @@ More deployment schemes supported for DB:
pages={11474--11481},
year={2020}
}
```
\ No newline at end of file
@article{liao2022real,
title={Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion},
author={Liao, Minghui and Zou, Zhisheng and Wan, Zhaoyi and Yao, Cong and Bai, Xiang},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
year={2022},
publisher={IEEE}
}
```
......@@ -22,7 +22,7 @@ Developers are welcome to contribute more algorithms! Please refer to [add new a
### 1.1 Text Detection Algorithms
Supported text detection algorithms (Click the link to get the tutorial):
- [x] [DB](./algorithm_det_db_en.md)
- [x] [DB && DB++](./algorithm_det_db_en.md)
- [x] [EAST](./algorithm_det_east_en.md)
- [x] [SAST](./algorithm_det_sast_en.md)
- [x] [PSENet](./algorithm_det_psenet_en.md)
......@@ -39,6 +39,7 @@ On the ICDAR2015 dataset, the text detection result is as follows:
|SAST|ResNet50_vd|91.39%|83.77%|87.42%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_sast_icdar15_v2.0_train.tar)|
|PSE|ResNet50_vd|85.81%|79.53%|82.55%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_vd_pse_v2.0_train.tar)|
|PSE|MobileNetV3|82.20%|70.48%|75.89%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_mv3_pse_v2.0_train.tar)|
|DB++|ResNet50|90.89%|82.66%|86.58%|[pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/ResNet50_dcn_asf_synthtext_pretrained.pdparams)/[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_db%2B%2B_icdar15_train.tar)|
On Total-Text dataset, the text detection result is as follows:
......@@ -127,10 +128,10 @@ On the PubTabNet dataset, the algorithm result is as follows:
Supported KIE algorithms (Click the link to get the tutorial):
- [x] [VI-LayoutXLM](./algorithm_kie_vi_laoutxlm_en.md)
- [x] [LayoutLM](./algorithm_kie_laoutxlm_en.md)
- [x] [LayoutLMv2](./algorithm_kie_laoutxlm_en.md)
- [x] [LayoutXLM](./algorithm_kie_laoutxlm_en.md)
- [x] [VI-LayoutXLM](./algorithm_kie_vi_layoutxlm_en.md)
- [x] [LayoutLM](./algorithm_kie_layoutxlm_en.md)
- [x] [LayoutLMv2](./algorithm_kie_layoutxlm_en.md)
- [x] [LayoutXLM](./algorithm_kie_layoutxlm_en.md)
- [x] [SDMGR](./algorithm_kie_sdmgr_en.md)
On wildreceipt dataset, the algorithm result is as follows:
......
......@@ -24,7 +24,7 @@ class BaseRecLabelDecode(object):
    def __init__(self, character_dict_path=None, use_space_char=False):
        self.beg_str = "sos"
        self.end_str = "eos"
        self.reverse = False
        self.character_str = []
        if character_dict_path is None:
            self.character_str = "0123456789abcdefghijklmnopqrstuvwxyz"
......@@ -38,6 +38,8 @@ class BaseRecLabelDecode(object):
            if use_space_char:
                self.character_str.append(" ")
            dict_character = list(self.character_str)
            if 'arabic' in character_dict_path:
                self.reverse = True
        dict_character = self.add_special_char(dict_character)
        self.dict = {}
......@@ -45,11 +47,6 @@ class BaseRecLabelDecode(object):
            self.dict[char] = i
        self.character = dict_character
        if 'arabic' in character_dict_path:
            self.reverse = True
        else:
            self.reverse = False
    def pred_reverse(self, pred):
        pred_re = []
        c_current = ''
......
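The hunks above appear to move the `'arabic'` check next to the dictionary-loading code and set `self.reverse = False` up front. One likely motivation, stated here as an assumption rather than the author's documented reason, is that a substring test against `None` raises `TypeError` when no dictionary path is given. A minimal sketch follows; `TinyLabelDecode` and `old_style_check` are hypothetical stand-ins, not PaddleOCR classes or functions.

```python
class TinyLabelDecode:
    """Hypothetical stand-in that mirrors only the reverse-flag logic."""

    def __init__(self, character_dict_path=None, use_space_char=False):
        # Default is set before any branch, so the attribute always exists.
        self.reverse = False
        if character_dict_path is None:
            characters = list("0123456789abcdefghijklmnopqrstuvwxyz")
        else:
            with open(character_dict_path, "rb") as fin:
                characters = [line.decode("utf-8").strip("\r\n") for line in fin]
            if use_space_char:
                characters.append(" ")
            # Safe here: character_dict_path is known not to be None.
            if "arabic" in character_dict_path:
                self.reverse = True
        self.character = characters


def old_style_check(character_dict_path):
    # Unconditional substring test, as in the removed lines.
    return "arabic" in character_dict_path


decoder = TinyLabelDecode()  # no dictionary path: built-in character set
print(decoder.reverse)

try:
    old_style_check(None)
except TypeError as exc:
    print("unguarded check fails:", exc)
```

With the check placed where the path is known to be set, the default constructor call no longer depends on the path's type.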
......@@ -242,9 +242,7 @@ For training, evaluation and inference tutorial for KIE models, please refer to
For training, evaluation and inference tutorial for text detection models, please refer to [text detection doc](../../doc/doc_en/detection_en.md).
For training, evaluation and inference tutorial for text recognition models, please refer to [text recognition doc](../../doc/doc_en/recognition.md).
If you want to finish the KIE tasks in your scene, and don't know what to prepare, please refer to [End cdoc](../../doc/doc_en/recognition.md).
For training, evaluation and inference tutorial for text recognition models, please refer to [text recognition doc](../../doc/doc_en/recognition_en.md).
To complete the key information extraction task in your own scenario from data preparation to model selection, please refer to: [Guide to End-to-end KIE](./how_to_do_kie_en.md)
......
# PDF2WORD
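PDF2WORD is a PDF-to-Word application built by PaddleOCR community developer @whj on top of the PP-Structure intelligent document analysis model. It ships as a directly installable exe so that Windows users can run it easily.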
<div align="center">
<img src="./doc/imgs_results/PP-OCRv3/en/en_4.png" width="200">
</div>
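PDF2WORD is a PDF-to-Word application built by PaddleOCR community developer [whjdark](https://github.com/whjdark) on top of the PP-Structure intelligent document analysis model. It ships as a directly installable exe so that Windows users can run it easily.
## 1. Usage
......@@ -23,17 +18,7 @@ PDF2WORD is a PDF-to-Word application built by PaddleOCR community developer @whj on top of the PP-Structure intelligent document analysis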
python pdf2word.py
```
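## 2. Packaging it yourself
The PDF2WORD application is packaged with the [QPT](https://github.com/QPT-Family/QPT) tool. If you have modified the UI code and need to repackage, run the commands below from the `PaddleOCR` folder.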
```
cd ./
mv ./ppstructure/pdf2word .. -r
python GenEXE.py
```
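## 3. Software download
## 2. Software download
To get the packaged program, scan the QR code below, follow the official account, and fill in the questionnaire to join the official PaddleOCR discussion group. Members get a free 20 GB OCR learning pack, which includes a collection of OCR scenario applications (seven vertical-domain models, including digital tube, LCD screen, license plate, and high-accuracy SVTR models), the e-book "Dive into OCR", course replay videos, and recent papers.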
......
......@@ -438,4 +438,4 @@ def main():
if __name__ == "__main__":
    main()
\ No newline at end of file
    main()
......@@ -51,7 +51,9 @@ The performance indicators are explained as follows:
### 4.1 Quick start
PP-Structure currently provides table recognition models in both Chinese and English. For the model link, see [models_list](../docs/models_list.md). The following takes the Chinese table recognition model as an example to introduce how to recognize a table.
PP-Structure currently provides table recognition models in both Chinese and English. For the model link, see [models_list](../docs/models_list.md). The whl package is also provided for quick use, see [quickstart](../docs/quickstart_en.md) for details.
The following takes the Chinese table recognition model as an example to introduce how to recognize a table.
Use the following commands to quickly recognize a table.
......
......@@ -57,7 +57,9 @@
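### 4.1 Quick start
PP-Structure currently provides table recognition models for both Chinese and English; see [models_list](../docs/models_list.md) for the model links. The following uses the Chinese table recognition model as an example to show how to recognize a table.
PP-Structure currently provides table recognition models for both Chinese and English; see [models_list](../docs/models_list.md) for the model links. A whl package is also provided for quick use; see [quickstart](../docs/quickstart.md) for details.
The following uses the Chinese table recognition model as an example to show how to recognize a table.
Use the following command to quickly recognize a table.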
```python
......