diff --git a/doc/doc_ch/algorithm_overview.md b/doc/doc_ch/algorithm_overview.md
index 494392029fbcca1df336225e66df3d6aca3ad1f1..f0c16618c0dd0b0f0bcc6a06d6b142a59d58e725 100755
--- a/doc/doc_ch/algorithm_overview.md
+++ b/doc/doc_ch/algorithm_overview.md
@@ -21,7 +21,6 @@ Text detection algorithms open-sourced by PaddleOCR:
 - [x] EAST([paper](https://arxiv.org/abs/1704.03155))[1]
 - [x] SAST([paper](https://arxiv.org/abs/1908.05498))[4]
 - [x] PSENet([paper](https://arxiv.org/abs/1903.12473v2))
-- [x] SDMGR([paper](https://arxiv.org/pdf/2103.14470.pdf))
 
 On the ICDAR2015 public text detection dataset, the results are as follows:
 |Model|Backbone|precision|recall|Hmean|Download|
@@ -33,7 +32,6 @@ Text detection algorithms open-sourced by PaddleOCR:
 |SAST|ResNet50_vd|91.39%|83.77%|87.42%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_sast_icdar15_v2.0_train.tar)|
 |PSE|ResNet50_vd|85.81%|79.53%|82.55%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_vd_pse_v2.0_train.tar)|
 |PSE|MobileNetV3|82.20%|70.48%|75.89%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_mv3_pse_v2.0_train.tar)|
-|SDMGR|VGG16|-|-|87.11%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/kie/kie_vgg16.tar)|
 
 On the Total-text public text detection dataset, the results are as follows:
 
diff --git a/ppstructure/README.md b/ppstructure/README.md
index 8994cdd46191a0fd4fb1beba2fcad91542e19b50..a09a43299b11dccf99897d5a6c69704191253aaf 100644
--- a/ppstructure/README.md
+++ b/ppstructure/README.md
@@ -159,7 +159,6 @@ After running, each image will have a directory with the same name under the dir
 
 **Model List**
 
-
 |model name|description|config|model size|download|
 | --- | --- | --- | --- | --- |
 |en_ppocr_mobile_v2.0_table_structure|Table structure prediction for English table scenarios|[table_mv3.yml](../configs/table/table_mv3.yml)|18.6M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_structure_infer.tar) |
@@ -184,4 +183,5 @@ OCR and table recognition model
 |en_ppocr_mobile_v2.0_table_rec|Text recognition for English table scenes, trained on the PubLayNet dataset|6.9M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/table/en_ppocr_mobile_v2.0_table_rec_train.tar) |
 |en_ppocr_mobile_v2.0_table_structure|Table structure prediction for English table scenes, trained on the PubLayNet dataset|18.6M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_structure_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/table/en_ppocr_mobile_v2.0_table_structure_train.tar) |
+
 
 If you need to use other models, you can download them from the [model_list](../doc/doc_en/models_list_en.md) or use your own trained model, and set the paths in the `det_model_dir`, `rec_model_dir`, and `table_model_dir` fields.
diff --git a/ppstructure/docs/imgs/0.png b/ppstructure/docs/imgs/0.png
new file mode 100644
index 0000000000000000000000000000000000000000..b1e8469f070d73074d9d39c7e5b42d7db1734a14
Binary files /dev/null and b/ppstructure/docs/imgs/0.png differ
diff --git a/ppstructure/docs/kie.md b/ppstructure/docs/kie.md
new file mode 100644
index 0000000000000000000000000000000000000000..67424a46fc6cbae3d6a250ad32b53001ec1cdb81
--- /dev/null
+++ b/ppstructure/docs/kie.md
@@ -0,0 +1,71 @@
+
+
+# Key Information Extraction
+
+This section describes the quick start, training, and evaluation of the SDMGR key information extraction (KIE) method in PaddleOCR.
+
+SDMGR is a key information extraction algorithm that classifies each detected text region into a predefined category, such as order ID, invoice number, or amount.
+
+
+* [1. Quick Start](#1-quick-start)
+* [2. Training](#2-training)
+* [3. Evaluation](#3-evaluation)
+
+
+## 1. Quick Start
+
+Training and evaluation use the wildreceipt dataset. Download it with the following command (the commands below assume it is extracted next to the PaddleOCR directory):
+
+```
+wget https://paddleocr.bj.bcebos.com/dygraph_v2.1/kie/wildreceipt.tar && tar xf wildreceipt.tar
+```
+
+Run inference:
+
+```
+cd PaddleOCR/
+wget https://paddleocr.bj.bcebos.com/dygraph_v2.1/kie/kie_vgg16.tar && tar xf kie_vgg16.tar
+python3.7 tools/infer_kie.py -c configs/kie/kie_unet_sdmgr.yml -o Global.checkpoints=kie_vgg16/best_accuracy Global.infer_img=../wildreceipt/1.txt
+```
+
+The prediction results are saved in `./output/sdmgr_kie/predicts_kie.txt`, and the visualized results are saved in the `./output/sdmgr_kie/kie_results/` directory.
+
+An example of the visualized results is shown below:
+![img](./imgs/0.png)
+
+
+## 2. Training
+
+Create a soft link to the dataset in the PaddleOCR/train_data directory:
+```
+cd PaddleOCR/ && mkdir train_data && cd train_data
+
+ln -s ../../wildreceipt ./
+```
+
+Training uses the configuration file configs/kie/kie_unet_sdmgr.yml, in which the default training data path is `train_data/wildreceipt`. After preparing the data, start training with:
+```
+python3.7 tools/train.py -c configs/kie/kie_unet_sdmgr.yml -o Global.save_model_dir=./output/kie/
+```
+
+## 3. Evaluation
+
+```
+python3.7 tools/eval.py -c configs/kie/kie_unet_sdmgr.yml -o Global.checkpoints=./output/kie/best_accuracy
+```
+
+
+**Reference:**
+
+
+
+```bibtex
+@misc{sun2021spatial,
+      title={Spatial Dual-Modality Graph Reasoning for Key Information Extraction},
+      author={Hongbin Sun and Zhanghui Kuang and Xiaoyu Yue and Chenhao Lin and Wayne Zhang},
+      year={2021},
+      eprint={2103.14470},
+      archivePrefix={arXiv},
+      primaryClass={cs.CV}
+}
+```
diff --git a/ppstructure/docs/model_list.md b/ppstructure/docs/model_list.md
index 835d39a735462edb0d9f51493ec0529248aeadbf..45004490c1c4b0ea01a5fb409024f1eeb922f1a3 100644
--- a/ppstructure/docs/model_list.md
+++ b/ppstructure/docs/model_list.md
@@ -26,3 +26,9 @@
 | --- | --- | --- | --- |
 |PP-Layout_v1.0_ser_pretrained|SER model trained with LayoutXLM on the XFUN Chinese dataset|1.4G|[Inference model coming soon]() / [trained model](https://paddleocr.bj.bcebos.com/pplayout/PP-Layout_v1.0_ser_pretrained.tar) |
 |PP-Layout_v1.0_re_pretrained|RE model trained with LayoutXLM on the XFUN Chinese dataset|1.4G|[Inference model coming soon]() / [trained model](https://paddleocr.bj.bcebos.com/pplayout/PP-Layout_v1.0_re_pretrained.tar) |
+
+## 3. KIE models
+
+|Model name|Description|Model size|Download|
+| --- | --- | --- | --- |
+|SDMGR|Key information extraction model|-|[Inference model coming soon]() / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/kie/kie_vgg16.tar)|
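
The quick-start command in the new `ppstructure/docs/kie.md` passes a text file from the wildreceipt dataset (`../wildreceipt/1.txt`) rather than a single image to `Global.infer_img`, because SDMGR classifies already-detected text regions. Below is a minimal sketch for peeking at such an annotation file before running inference; it only assumes that each line is either a standalone JSON object or an `image_path<TAB>json` pair (the exact layout depends on the downloaded `wildreceipt.tar`, so verify against your copy), and `inspect_annotations` is an illustrative helper, not part of PaddleOCR.

```python
import json

# Hedged sketch: peek at a wildreceipt-style annotation file (for example
# ../wildreceipt/1.txt) before running tools/infer_kie.py. The exact layout
# depends on the downloaded wildreceipt.tar, so this only assumes each line
# is either a standalone JSON object or an "image_path<TAB>json" pair and
# falls back to printing the raw line otherwise.
def inspect_annotations(anno_path, max_lines=3):
    with open(anno_path, "r", encoding="utf-8") as f:
        for i, line in enumerate(f):
            if i >= max_lines:
                break
            line = line.strip()
            if not line:
                continue
            payload = line.split("\t", 1)[-1]  # drop a leading image path, if present
            try:
                record = json.loads(payload)
            except json.JSONDecodeError:
                print(i, line[:120])  # not JSON: show the raw prefix
                continue
            if isinstance(record, dict):
                print(i, sorted(record.keys()))  # annotation fields for this image
            else:
                print(i, type(record).__name__)

if __name__ == "__main__":
    inspect_annotations("../wildreceipt/1.txt")
```

Whatever field names appear, they describe the detected text regions (boxes and transcriptions) and the category labels that SDMGR reasons over during training and inference.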