Unverified commit 52970f04, authored by 文幕地方, committed by GitHub

add PP-StructureV2 (#5558)

* add PP-OCRv2

* add PP-OCRv2 benckmark

* update app

* update introduction

* update PP-OCRv2 introduction

* add pipeline of PP-OCRv2

* add PP-OCRv3

* update benckmark

* add paddleocr to requirements

* update

* support paddleocr 2.6.0.1

* rm print code

* add image download

* add model train

* update PP-OCRv3 ref

* update doc

* update doc

* update PP-OCR info

* add PP-StructureV2

* update doc

* merge upstream

* update info.yaml
Parent 499de374
--- ---
Model_Info: Model_Info:
name: "PP-OCRv2" name: "PP-OCRv2"
description: "" description: "PP-OCRv2文字检测识别系统"
description_en: "" description_en: "PP-OCRv2 text detection and recognition system"
icon: "@后续UE统一设计之后,会存到bos上某个位置" icon: "@后续UE统一设计之后,会存到bos上某个位置"
from_repo: "PaddleOCR" from_repo: "PaddleOCR"
Task: Task:
- tag_en: "CV" - tag_en: "CV"
tag: "计算机视觉" tag: "计算机视觉"
sub_tag_en: "Character Recognition" sub_tag_en: "Text Detection, Character Recognition, Optical Character Recognition"
sub_tag: "文字识别" sub_tag: "文字检测,文字识别,OCR"
Example: Example:
- title: "《动手学OCR》系列课程之:PP-OCRv2预测部署实战" - title: "《动手学OCR》系列课程之:PP-OCRv2预测部署实战"
url: "https://aistudio.baidu.com/aistudio/projectdetail/3552922?channelType=0&channel=0" url: "https://aistudio.baidu.com/aistudio/projectdetail/3552922?channelType=0&channel=0"
title_en: "Dive into OCR series of courses: PP-OCRv2 prediction and deployment"
url_en: "https://aistudio.baidu.com/aistudio/projectdetail/3552922?channelType=0&channel=0"
- title: "《动手学OCR》系列课程之:OCR文本识别实战" - title: "《动手学OCR》系列课程之:OCR文本识别实战"
url: "https://aistudio.baidu.com/aistudio/projectdetail/3552051?channelType=0&channel=0" url: "https://aistudio.baidu.com/aistudio/projectdetail/3552051?channelType=0&channel=0"
title_en: "Dive into OCR series of courses: text recognition in practice"
url_en: "https://aistudio.baidu.com/aistudio/projectdetail/3552051?channelType=0&channel=0"
- title: "《动手学OCR》系列课程之:OCR文本检测实践" - title: "《动手学OCR》系列课程之:OCR文本检测实践"
url: "https://aistudio.baidu.com/aistudio/projectdetail/3551779?channelType=0&channel=0" url: "https://aistudio.baidu.com/aistudio/projectdetail/3551779?channelType=0&channel=0"
title_en: "Dive into OCR series of courses: text detection in practice"
url_en: "https://aistudio.baidu.com/aistudio/projectdetail/3551779?channelType=0&channel=0"
Datasets: "ICDAR 2015, ICDAR2019-LSVT,ICDAR2017-RCTW-17,Total-Text,ICDAR2019-ArT" Datasets: "ICDAR 2015, ICDAR2019-LSVT,ICDAR2017-RCTW-17,Total-Text,ICDAR2019-ArT"
Pulisher: "Baidu" Pulisher: "Baidu"
License: "apache.2.0" License: "apache.2.0"
Paper: Paper:
- title: "PP-OCRv2: Bag of Tricks for Ultra Lightweight OCR System" - title: "PP-OCRv2: Bag of Tricks for Ultra Lightweight OCR System"
url: "https://arxiv.org/pdf/2109.03144v2.pdf" url: "https://arxiv.org/abs/2109.03144"
IfTraining: 0 IfTraining: 0
IfOnlineDemo: 1 IfOnlineDemo: 1
...@@ -196,7 +196,7 @@ ...@@ -196,7 +196,7 @@
], ],
"metadata": { "metadata": {
"kernelspec": { "kernelspec": {
"display_name": "Python 3", "display_name": "Python 3.8.13 ('py38')",
"language": "python", "language": "python",
"name": "python3" "name": "python3"
}, },
...@@ -210,7 +210,12 @@ ...@@ -210,7 +210,12 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.8.8" "version": "3.8.13"
},
"vscode": {
"interpreter": {
"hash": "58fd1890da6594cebec461cf98c6cb9764024814357f166387d10d267624ecd6"
}
} }
}, },
"nbformat": 4, "nbformat": 4,
......
...@@ -112,7 +112,7 @@ ...@@ -112,7 +112,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"### 3.2 Train the model.\n", "### 3.2 Train the model\n",
"The PP-OCR system consists of a text detection model, an angle classifier and a text recognition model. For the three model training tutorials, please refer to the following documents:\n", "The PP-OCR system consists of a text detection model, an angle classifier and a text recognition model. For the three model training tutorials, please refer to the following documents:\n",
"1. text detection model: [text detection training tutorial](https://github.com/PaddlePaddle/PaddleOCR/blob/release%2F2.6/doc/doc_ch/detection.md)\n", "1. text detection model: [text detection training tutorial](https://github.com/PaddlePaddle/PaddleOCR/blob/release%2F2.6/doc/doc_ch/detection.md)\n",
"1. angle classifier: [angle classifier training tutorial](https://github.com/PaddlePaddle/PaddleOCR/blob/release%2F2.6/doc/doc_ch/angle_class.md)\n", "1. angle classifier: [angle classifier training tutorial](https://github.com/PaddlePaddle/PaddleOCR/blob/release%2F2.6/doc/doc_ch/angle_class.md)\n",
...@@ -130,6 +130,7 @@ ...@@ -130,6 +130,7 @@
"source": [ "source": [
"## 4. Model Principles\n", "## 4. Model Principles\n",
"\n", "\n",
"The enhancement strategies are as follows\n",
"\n", "\n",
"1. Text detection enhancement strategies\n", "1. Text detection enhancement strategies\n",
"- Adopt CML (Collaborative Mutual Learning) collaborative mutual learning knowledge distillation strategy.\n", "- Adopt CML (Collaborative Mutual Learning) collaborative mutual learning knowledge distillation strategy.\n",
...@@ -193,7 +194,7 @@ ...@@ -193,7 +194,7 @@
], ],
"metadata": { "metadata": {
"kernelspec": { "kernelspec": {
"display_name": "Python 3", "display_name": "Python 3.8.13 ('py38')",
"language": "python", "language": "python",
"name": "python3" "name": "python3"
}, },
...@@ -207,7 +208,12 @@ ...@@ -207,7 +208,12 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.8.8" "version": "3.8.13"
},
"vscode": {
"interpreter": {
"hash": "58fd1890da6594cebec461cf98c6cb9764024814357f166387d10d267624ecd6"
}
} }
}, },
"nbformat": 4, "nbformat": 4,
......
--- ---
Model_Info: Model_Info:
name: "PP-OCRv3" name: "PP-OCRv3"
description: "" description: "PP-OCRv3文字检测识别系统"
description_en: "" description_en: "PP-OCRv3 text detection and recognition system"
icon: "@后续UE统一设计之后,会存到bos上某个位置" icon: "@后续UE统一设计之后,会存到bos上某个位置"
from_repo: "PaddleOCR" from_repo: "PaddleOCR"
Task: Task:
- tag_en: "CV" - tag_en: "CV"
tag: "计算机视觉" tag: "计算机视觉"
sub_tag_en: "Character Recognition" sub_tag_en: "Text Detection, Character Recognition, Optical Character Recognition"
sub_tag: "文字识别" sub_tag: "文字检测,文字识别,OCR"
Example: Example:
- title: "【官方】十分钟完成 PP-OCRv3 识别全流程实战" - title: "【官方】十分钟完成 PP-OCRv3 识别全流程实战"
url: "https://aistudio.baidu.com/aistudio/projectdetail/3916206?channelType=0&channel=0" url: "https://aistudio.baidu.com/aistudio/projectdetail/3916206?channelType=0&channel=0"
title_en: "[Official] Complete the whole process of PP-OCRv3 identification in ten minutes"
url_en: "https://aistudio.baidu.com/aistudio/projectdetail/3916206?channelType=0&channel=0"
- title: "鸟枪换炮!基于PP-OCRv3的电表检测识别" - title: "鸟枪换炮!基于PP-OCRv3的电表检测识别"
url: "https://aistudio.baidu.com/aistudio/projectdetail/511591?channelType=0&channel=0" url: "https://aistudio.baidu.com/aistudio/projectdetail/511591?channelType=0&channel=0"
title_en: "Swap the shotgun! Detection and recognition electricity meters based on PP-OCRv3"
url_en: "https://aistudio.baidu.com/aistudio/projectdetail/511591?channelType=0&channel=0"
- title: "基于PP-OCRv3实现PCB字符识别" - title: "基于PP-OCRv3实现PCB字符识别"
url: "https://aistudio.baidu.com/aistudio/projectdetail/4008973?channelType=0&channel=0" url: "https://aistudio.baidu.com/aistudio/projectdetail/4008973?channelType=0&channel=0"
title_en: "PCB character recognition based on PP-OCRv3"
url_en: "https://aistudio.baidu.com/aistudio/projectdetail/4008973?channelType=0&channel=0"
Datasets: "ICDAR 2015, ICDAR2019-LSVT,ICDAR2017-RCTW-17,Total-Text,ICDAR2019-ArT" Datasets: "ICDAR 2015, ICDAR2019-LSVT,ICDAR2017-RCTW-17,Total-Text,ICDAR2019-ArT"
Pulisher: "Baidu" Pulisher: "Baidu"
License: "apache.2.0" License: "apache.2.0"
......
...@@ -67,7 +67,7 @@ ...@@ -67,7 +67,7 @@
"source": [ "source": [
"## 3. 模型如何使用\n", "## 3. 模型如何使用\n",
"\n", "\n",
"### 3.1 模型推理\n", "### 3.1 模型推理\n",
"* 安装PaddleOCR whl包" "* 安装PaddleOCR whl包"
] ]
}, },
...@@ -96,7 +96,7 @@ ...@@ -96,7 +96,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": null, "execution_count": 1,
"metadata": { "metadata": {
"scrolled": true, "scrolled": true,
"tags": [] "tags": []
...@@ -136,7 +136,7 @@ ...@@ -136,7 +136,7 @@
"模型训练完成后,可以通过指定模型路径的方式串联使用\n", "模型训练完成后,可以通过指定模型路径的方式串联使用\n",
"命令参考如下:\n", "命令参考如下:\n",
"```python\n", "```python\n",
"paddleocr --image_dir 11.jpg --use_angle_cls true --ocr_version PP-OCRv2 --det_model_dir=/path/to/det_inference_model --cls_model_dir=/path/to/cls_inference_model --rec_model_dir=/path/to/rec_inference_model\n", "paddleocr --image_dir 11.jpg --use_angle_cls true --det_model_dir=/path/to/det_inference_model --cls_model_dir=/path/to/cls_inference_model --rec_model_dir=/path/to/rec_inference_model\n",
"```" "```"
] ]
}, },
...@@ -228,36 +228,11 @@ ...@@ -228,36 +228,11 @@
"source": [ "source": [
"## 6. 相关论文以及引用信息\n", "## 6. 相关论文以及引用信息\n",
"```\n", "```\n",
"@article{du2021pp,\n", "@article{li2022pp,\n",
" title={PP-OCRv2: bag of tricks for ultra lightweight OCR system},\n", " title={PP-OCRv3: More Attempts for the Improvement of Ultra Lightweight OCR System},\n",
" author={Du, Yuning and Li, Chenxia and Guo, Ruoyu and Cui, Cheng and Liu, Weiwei and Zhou, Jun and Lu, Bin and Yang, Yehua and Liu, Qiwen and Hu, Xiaoguang and others},\n", " author={Li, Chenxia and Liu, Weiwei and Guo, Ruoyu and Yin, Xiaoting and Jiang, Kaitao and Du, Yongkun and Du, Yuning and Zhu, Lingfeng and Lai, Baohua and Hu, Xiaoguang and others},\n",
" journal={arXiv preprint arXiv:2109.03144},\n", " journal={arXiv preprint arXiv:2206.03001},\n",
" year={2021}\n", " year={2022}\n",
"}\n",
"\n",
"@inproceedings{zhang2018deep,\n",
" title={Deep mutual learning},\n",
" author={Zhang, Ying and Xiang, Tao and Hospedales, Timothy M and Lu, Huchuan},\n",
" booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},\n",
" pages={4320--4328},\n",
" year={2018}\n",
"}\n",
"\n",
"@inproceedings{hu2020gtc,\n",
" title={Gtc: Guided training of ctc towards efficient and accurate scene text recognition},\n",
" author={Hu, Wenyang and Cai, Xiaocong and Hou, Jun and Yi, Shuai and Lin, Zhiping},\n",
" booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},\n",
" volume={34},\n",
" number={07},\n",
" pages={11005--11012},\n",
" year={2020}\n",
"}\n",
"\n",
"@inproceedings{zhang2022context,\n",
" title={Context-based Contrastive Learning for Scene Text Recognition},\n",
" author={Zhang, Xinyun and Zhu, Binwu and Yao, Xufeng and Sun, Qi and Li, Ruiyu and Yu, Bei},\n",
" year={2022},\n",
" organization={AAAI}\n",
"}\n", "}\n",
"```\n" "```\n"
] ]
...@@ -265,7 +240,7 @@ ...@@ -265,7 +240,7 @@
], ],
"metadata": { "metadata": {
"kernelspec": { "kernelspec": {
"display_name": "Python 3", "display_name": "Python 3.8.13 ('py38')",
"language": "python", "language": "python",
"name": "python3" "name": "python3"
}, },
...@@ -279,7 +254,12 @@ ...@@ -279,7 +254,12 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.8.8" "version": "3.8.13"
},
"vscode": {
"interpreter": {
"hash": "58fd1890da6594cebec461cf98c6cb9764024814357f166387d10d267624ecd6"
}
} }
}, },
"nbformat": 4, "nbformat": 4,
......
...@@ -129,7 +129,7 @@ ...@@ -129,7 +129,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"### 3.2 Train the model.\n", "### 3.2 Train the model\n",
"The PP-OCR system consists of a text detection model, an angle classifier and a text recognition model. For the three model training tutorials, please refer to the following documents:\n", "The PP-OCR system consists of a text detection model, an angle classifier and a text recognition model. For the three model training tutorials, please refer to the following documents:\n",
"1. text detection model: [text detection training tutorial](https://github.com/PaddlePaddle/PaddleOCR/blob/release%2F2.6/doc/doc_ch/detection.md)\n", "1. text detection model: [text detection training tutorial](https://github.com/PaddlePaddle/PaddleOCR/blob/release%2F2.6/doc/doc_ch/detection.md)\n",
"1. angle classifier: [angle classifier training tutorial](https://github.com/PaddlePaddle/PaddleOCR/blob/release%2F2.6/doc/doc_ch/angle_class.md)\n", "1. angle classifier: [angle classifier training tutorial](https://github.com/PaddlePaddle/PaddleOCR/blob/release%2F2.6/doc/doc_ch/angle_class.md)\n",
...@@ -137,7 +137,7 @@ ...@@ -137,7 +137,7 @@
"\n", "\n",
"After the model training is completed, it can be used in series by specifying the model path. The command reference is as follows:\n", "After the model training is completed, it can be used in series by specifying the model path. The command reference is as follows:\n",
"```python\n", "```python\n",
"paddleocr --image_dir 11.jpg --use_angle_cls true --ocr_version PP-OCRv2 --det_model_dir=/path/to/det_inference_model --cls_model_dir=/path/to/cls_inference_model --rec_model_dir=/path/to/rec_inference_model\n", "paddleocr --image_dir 11.jpg --use_angle_cls true --det_model_dir=/path/to/det_inference_model --cls_model_dir=/path/to/cls_inference_model --rec_model_dir=/path/to/rec_inference_model\n",
"```" "```"
] ]
}, },
...@@ -147,7 +147,7 @@ ...@@ -147,7 +147,7 @@
"source": [ "source": [
"## 4. Model Principles\n", "## 4. Model Principles\n",
"\n", "\n",
"The optimization ideas are as follows\n", "The enhancement strategies are as follows\n",
"\n", "\n",
"1. Text detection enhancement strategies\n", "1. Text detection enhancement strategies\n",
"- LK-PAN: a PAN module with large receptive field\n", "- LK-PAN: a PAN module with large receptive field\n",
...@@ -231,36 +231,11 @@ ...@@ -231,36 +231,11 @@
"source": [ "source": [
"## 6. Related papers and citations\n", "## 6. Related papers and citations\n",
"```\n", "```\n",
"@article{du2021pp,\n", "@article{li2022pp,\n",
" title={PP-OCRv2: bag of tricks for ultra lightweight OCR system},\n", " title={PP-OCRv3: More Attempts for the Improvement of Ultra Lightweight OCR System},\n",
" author={Du, Yuning and Li, Chenxia and Guo, Ruoyu and Cui, Cheng and Liu, Weiwei and Zhou, Jun and Lu, Bin and Yang, Yehua and Liu, Qiwen and Hu, Xiaoguang and others},\n", " author={Li, Chenxia and Liu, Weiwei and Guo, Ruoyu and Yin, Xiaoting and Jiang, Kaitao and Du, Yongkun and Du, Yuning and Zhu, Lingfeng and Lai, Baohua and Hu, Xiaoguang and others},\n",
" journal={arXiv preprint arXiv:2109.03144},\n", " journal={arXiv preprint arXiv:2206.03001},\n",
" year={2021}\n", " year={2022}\n",
"}\n",
"\n",
"@inproceedings{zhang2018deep,\n",
" title={Deep mutual learning},\n",
" author={Zhang, Ying and Xiang, Tao and Hospedales, Timothy M and Lu, Huchuan},\n",
" booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},\n",
" pages={4320--4328},\n",
" year={2018}\n",
"}\n",
"\n",
"@inproceedings{hu2020gtc,\n",
" title={Gtc: Guided training of ctc towards efficient and accurate scene text recognition},\n",
" author={Hu, Wenyang and Cai, Xiaocong and Hou, Jun and Yi, Shuai and Lin, Zhiping},\n",
" booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},\n",
" volume={34},\n",
" number={07},\n",
" pages={11005--11012},\n",
" year={2020}\n",
"}\n",
"\n",
"@inproceedings{zhang2022context,\n",
" title={Context-based Contrastive Learning for Scene Text Recognition},\n",
" author={Zhang, Xinyun and Zhu, Binwu and Yao, Xufeng and Sun, Qi and Li, Ruiyu and Yu, Bei},\n",
" year={2022},\n",
" organization={AAAI}\n",
"}\n", "}\n",
"```\n" "```\n"
] ]
...@@ -268,7 +243,7 @@ ...@@ -268,7 +243,7 @@
], ],
"metadata": { "metadata": {
"kernelspec": { "kernelspec": {
"display_name": "Python 3", "display_name": "Python 3.8.13 ('py38')",
"language": "python", "language": "python",
"name": "python3" "name": "python3"
}, },
...@@ -282,7 +257,12 @@ ...@@ -282,7 +257,12 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.8.8" "version": "3.8.13"
},
"vscode": {
"interpreter": {
"hash": "58fd1890da6594cebec461cf98c6cb9764024814357f166387d10d267624ecd6"
}
} }
}, },
"nbformat": 4, "nbformat": 4,
......
import gradio as gr
import base64
from io import BytesIO
from PIL import Image

from paddleocr import PPStructure

table_engine = PPStructure(layout=False, show_log=True)


def image_to_base64(image):
    # Input is an image loaded with PIL; output is a base64 string
    byte_data = BytesIO()  # create an in-memory byte stream
    image.save(byte_data, format="JPEG")  # write the image into the byte stream
    byte_data = byte_data.getvalue()  # read the raw bytes back from the stream
    base64_str = base64.b64encode(byte_data).decode("ascii")  # bytes to base64
    return base64_str


# UGC: Define the inference fn() for your models
def model_inference(image):
    result = table_engine(image)
    res = result[0]['res']['html']
    json_out = {"result": res}
    return res, json_out


def clear_all():
    return None, None, None


with gr.Blocks() as demo:
    gr.Markdown("PP-StructureV2")

    with gr.Column(scale=1, min_width=100):
        img_in = gr.Image(
            value="https://user-images.githubusercontent.com/12406017/200574299-32537341-c329-42a5-ae41-35ee4bd43f2f.png",
            label="Input")

        with gr.Row():
            btn1 = gr.Button("Clear")
            btn2 = gr.Button("Submit")

        html_out = gr.HTML(label="Output")
        json_out = gr.JSON(label="jsonOutput")

    btn2.click(fn=model_inference, inputs=img_in, outputs=[html_out, json_out])
    btn1.click(fn=clear_all, inputs=None, outputs=[img_in, html_out, json_out])
    gr.Button.style(1)

demo.launch()
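The `model_inference` function above depends on the shape of the value returned by the `PPStructure` engine when `layout=False`: a list whose first item carries the table HTML under `res['html']`. The sketch below performs the same lookup defensively. `extract_table_html` is a hypothetical helper name, and `fake_result` is a hand-made stand-in that only mirrors the shape `app.py` assumes; it is not real engine output.

```python
from typing import Any, Dict, List, Optional


def extract_table_html(result: List[Dict[str, Any]]) -> Optional[str]:
    """Return the HTML of the first region that carries one, or None."""
    for region in result:
        res = region.get("res")
        # Only table regions are expected to hold a dict with an "html" key.
        if isinstance(res, dict) and "html" in res:
            return res["html"]
    return None


# Hand-made stand-in for an engine result (assumed shape, not real output).
fake_result = [
    {"type": "table", "res": {"html": "<table><tr><td>1</td></tr></table>"}}
]
print(extract_table_html(fake_result))
```

Returning `None` instead of raising lets a UI callback show an empty output when no table is found, which may or may not be the behavior wanted in a given app.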
【PP-StructureV2-App-YAML】
APP_Info:
  title: PP-StructureV2-App
  colorFrom: blue
  colorTo: yellow
  sdk: gradio
  sdk_version: 3.4.1
  app_file: app.py
  license: apache-2.0
  device: cpu
\ No newline at end of file
gradio
paddlepaddle
paddleocr>=2.6.1.0
# Model list
## 1. Layout analysis models
|Model name|Description|Inference model size|Download|dict path|
| --- | --- | --- | --- | --- |
| picodet_lcnet_x1_0_fgd_layout | English layout analysis model trained on the PubLayNet dataset, based on PicoDet LCNet_x1_0 with FGD distillation; it distinguishes five region types: **text, title, table, picture, and list** | 9.7M | [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout.pdparams) | [PubLayNet dict](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/ppocr/utils/dict/layout_dict/layout_publaynet_dict.txt) |
| ppyolov2_r50vd_dcn_365e_publaynet | English layout analysis model trained on the PubLayNet dataset, based on PP-YOLOv2 | 221.0M | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet.tar) / [trained model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet_pretrained.pdparams) | same as above |
| picodet_lcnet_x1_0_fgd_layout_cdla | Chinese layout analysis model trained on the CDLA dataset; it distinguishes ten region types: **text, title, figure, figure caption, table, table caption, header, footer, reference, and equation** | 9.7M | [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_cdla_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_cdla.pdparams) | [CDLA dict](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/ppocr/utils/dict/layout_dict/layout_cdla_dict.txt) |
| picodet_lcnet_x1_0_fgd_layout_table | Layout analysis model trained on a table dataset; it detects table regions in Chinese and English documents | 9.7M | [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_table_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_table.pdparams) | [Table dict](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/ppocr/utils/dict/layout_dict/layout_table_dict.txt) |
| ppyolov2_r50vd_dcn_365e_tableBank_word | Layout analysis model trained on the TableBank Word dataset, based on PP-YOLOv2; it detects table regions in English documents | 221.0M | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_word.tar) | same as above |
| ppyolov2_r50vd_dcn_365e_tableBank_latex | Layout analysis model trained on the TableBank Latex dataset, based on PP-YOLOv2; it detects table regions in English documents | 221.0M | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_latex.tar) | same as above |
## 2. OCR and table recognition models
### 2.1 OCR
|Model name|Description|Inference model size|Download|
| --- | --- | --- | --- |
|en_ppocr_mobile_v2.0_table_det|Text detection model for English table scenes, trained on the PubTabNet dataset|4.7M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/table/en_ppocr_mobile_v2.0_table_det_train.tar) |
|en_ppocr_mobile_v2.0_table_rec|Text recognition model for English table scenes, trained on the PubTabNet dataset|6.9M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/table/en_ppocr_mobile_v2.0_table_rec_train.tar) |
To use other OCR models, download them from [PP-OCR model_list](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_ch/models_list.md) or use models you trained yourself, and set the `det_model_dir` and `rec_model_dir` fields accordingly.
### 2.2 Table recognition models
|Model name|Description|Inference model size|Download|
| --- | --- | --- | --- |
|en_ppocr_mobile_v2.0_table_structure|English table recognition model trained on the PubTabNet dataset, based on TableRec-RARE|6.8M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_structure_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/table/en_ppocr_mobile_v2.0_table_structure_train.tar) |
|en_ppstructure_mobile_v2.0_SLANet|English table recognition model trained on the PubTabNet dataset, based on SLANet|9.2M|[inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/en_ppstructure_mobile_v2.0_SLANet_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/en_ppstructure_mobile_v2.0_SLANet_train.tar) |
|ch_ppstructure_mobile_v2.0_SLANet|Chinese table recognition model based on SLANet|9.3M|[inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_train.tar) |
<a name="3"></a>
# Model list
## 1. Layout Analysis
|model name|description | inference model size |download|dict path|
| --- |---| --- | --- | --- |
| picodet_lcnet_x1_0_fgd_layout | English layout analysis model trained on the PubLayNet dataset, based on PicoDet LCNet_x1_0 and FGD. It recognizes five region types: **Text, Title, Table, Picture, and List** | 9.7M | [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout.pdparams) | [PubLayNet dict](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/ppocr/utils/dict/layout_dict/layout_publaynet_dict.txt) |
| ppyolov2_r50vd_dcn_365e_publaynet | English layout analysis model trained on the PubLayNet dataset, based on PP-YOLOv2 | 221.0M | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet.tar) / [trained model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet_pretrained.pdparams) | same as above |
| picodet_lcnet_x1_0_fgd_layout_cdla | Chinese layout analysis model trained on the CDLA dataset. It recognizes ten region types: **Text, Title, Figure, Figure caption, Table, Table caption, Header, Footer, Reference, and Equation** | 9.7M | [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_cdla_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_cdla.pdparams) | [CDLA dict](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/ppocr/utils/dict/layout_dict/layout_cdla_dict.txt) |
| picodet_lcnet_x1_0_fgd_layout_table | Layout analysis model trained on a table dataset. It detects tables in Chinese and English documents | 9.7M | [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_table_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_table.pdparams) | [Table dict](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/ppocr/utils/dict/layout_dict/layout_table_dict.txt) |
| ppyolov2_r50vd_dcn_365e_tableBank_word | Layout analysis model trained on the TableBank Word dataset, based on PP-YOLOv2. It detects tables in English documents | 221.0M | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_word.tar) | same as above |
| ppyolov2_r50vd_dcn_365e_tableBank_latex | Layout analysis model trained on the TableBank Latex dataset, based on PP-YOLOv2. It detects tables in English documents | 221.0M | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_latex.tar) | same as above |
## 2. OCR and Table Recognition
### 2.1 OCR
|model name| description | inference model size |download|
| --- |---|---| --- |
|en_ppocr_mobile_v2.0_table_det| Text detection model of English table scenes trained on PubTabNet dataset | 4.7M |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/table/en_ppocr_mobile_v2.0_table_det_train.tar) |
|en_ppocr_mobile_v2.0_table_rec| Text recognition model of English table scenes trained on PubTabNet dataset | 6.9M |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/table/en_ppocr_mobile_v2.0_table_rec_train.tar) |
To use other OCR models, download them from [PP-OCR model_list](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_en/models_list_en.md) or use models you trained yourself, and set the `det_model_dir` and `rec_model_dir` fields accordingly.
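As a rough illustration of setting those two fields from Python, the sketch below collects the optional directories into keyword arguments for the `PPStructure` constructor used in `app.py`. `structure_kwargs` is a hypothetical helper, the `/path/to/...` values are placeholders, and the constructor call is left commented out because it requires `paddleocr` and the model files to be present.

```python
from typing import Dict, Optional


def structure_kwargs(det_model_dir: Optional[str] = None,
                     rec_model_dir: Optional[str] = None) -> Dict[str, object]:
    """Build constructor kwargs, adding only the directories that were given."""
    kwargs: Dict[str, object] = {"layout": False, "show_log": True}
    if det_model_dir:
        kwargs["det_model_dir"] = det_model_dir
    if rec_model_dir:
        kwargs["rec_model_dir"] = rec_model_dir
    return kwargs


# Placeholder paths; point them at extracted inference model directories.
kwargs = structure_kwargs("/path/to/det_inference_model",
                          "/path/to/rec_inference_model")
print(kwargs)

# With paddleocr installed, the engine could then be created like this:
# from paddleocr import PPStructure
# table_engine = PPStructure(**kwargs)
```

Leaving a directory unset keeps whatever default model the library would otherwise select.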
<a name="22"></a>
### 2.2 Table Recognition
|model| description |inference model size|download|
| --- |-----------------------------------------------------------------------------| --- | --- |
|en_ppocr_mobile_v2.0_table_structure| English table recognition model trained on PubTabNet dataset based on TableRec-RARE |6.8M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_structure_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/table/en_ppocr_mobile_v2.0_table_structure_train.tar) |
|en_ppstructure_mobile_v2.0_SLANet|English table recognition model trained on PubTabNet dataset based on SLANet|9.2M|[inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/en_ppstructure_mobile_v2.0_SLANet_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/en_ppstructure_mobile_v2.0_SLANet_train.tar) |
|ch_ppstructure_mobile_v2.0_SLANet|Chinese table recognition model based on SLANet|9.3M|[inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_train.tar) |
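The inference models in the tables above are distributed as `.tar` archives. A minimal sketch of fetching and unpacking one with the Python standard library follows. Both function names are hypothetical, and the assumption that an archive unpacks into a directory named after the file (minus `.tar`) is not confirmed by the tables, so check the extracted layout before relying on the returned path.

```python
import tarfile
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def model_dir_name(url: str) -> str:
    """Derive a local directory name from the archive URL by dropping '.tar'."""
    filename = Path(urlparse(url).path).name
    return filename[:-len(".tar")] if filename.endswith(".tar") else filename


def download_and_extract(url: str, dest: str = "inference") -> Path:
    """Download a .tar archive into dest and unpack it there."""
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    archive = dest_path / Path(urlparse(url).path).name
    urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive) as tar:
        tar.extractall(dest_path)
    # Assumed layout: the archive contains one directory named after the file.
    return dest_path / model_dir_name(url)


url = ("https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/"
       "en_ppstructure_mobile_v2.0_SLANet_infer.tar")
print(model_dir_name(url))
```

Calling `download_and_extract(url)` performs the actual network download; the path it returns could then be passed as a model directory when constructing the engine.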
---
Model_Info:
  name: "PP-StructureV2"
  description: "PP-StructureV2文档分析系统,包含版面分析,表格识别,版面恢复和关键信息抽取"
  description_en: "PP-StructureV2 document analysis system, including layout analysis, table recognition, layout recovery and key information extraction"
  icon: "@后续UE统一设计之后,会存到bos上某个位置"
  from_repo: "PaddleOCR"
Task:
- tag_en: "CV"
  tag: "计算机视觉"
  sub_tag_en: "Layout Analysis, Table Recognition, Layout Recovery, Key Information Extraction"
  sub_tag: "版面分析,表格识别,版面恢复,关键信息提取"
Example:
- title: "表格识别实战"
  url: "https://aistudio.baidu.com/aistudio/projectdetail/4770296?channelType=0&channel=0"
  title_en: "table recognition"
  url_en: "https://aistudio.baidu.com/aistudio/projectdetail/4770296?channelType=0&channel=0"
- title: "OCR发票关键信息抽取"
  url: "https://aistudio.baidu.com/aistudio/projectdetail/4823162?channelType=0&channel=0"
  title_en: "Invoice key information extraction"
  url_en: "https://aistudio.baidu.com/aistudio/projectdetail/4823162?channelType=0&channel=0"
Datasets: "ICDAR 2015, ICDAR2019-LSVT,ICDAR2017-RCTW-17,Total-Text,ICDAR2019-ArT"
Pulisher: "Baidu"
License: "apache.2.0"
Paper:
- title: "PP-StructureV2: A Stronger Document Analysis System"
  url: "https://arxiv.org/abs/2210.05391v2"
IfTraining: 0
IfOnlineDemo: 1
This diff is collapsed.
This diff is collapsed.