From 36f174580f9939396a40587379d487cdc587dc4f Mon Sep 17 00:00:00 2001 From: an1018 <614803115@qq.com> Date: Mon, 22 Aug 2022 16:41:42 +0800 Subject: [PATCH] update doc --- ppstructure/docs/models_list.md | 13 ++++--- ppstructure/docs/models_list_en.md | 15 +++++---- ppstructure/docs/quickstart.md | 54 ++++++++++++++++++++++++++---- ppstructure/docs/quickstart_en.md | 43 ++++++++++++++++++++++-- ppstructure/layout/README.md | 2 +- ppstructure/recovery/README.md | 11 +++--- ppstructure/recovery/README_ch.md | 4 +-- 7 files changed, 115 insertions(+), 27 deletions(-) diff --git a/ppstructure/docs/models_list.md b/ppstructure/docs/models_list.md index 3b8c3790..f4c63659 100644 --- a/ppstructure/docs/models_list.md +++ b/ppstructure/docs/models_list.md @@ -10,11 +10,14 @@ ## 1. 版面分析模型 -|模型名称|模型简介|推理模型大小|下载地址| -| --- | --- | --- | --- | -| picodet_lcnet_x1_0_fgd_layout | PubLayNet 数据集训练的版面分析模型,可以划分**文字、标题、表格、图片以及列表**5类区域 | 9.7M | [推理模型](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout.pdparams) | -| picodet_lcnet_x1_0_fgd_layout_cdla | CDLA数据集训练的版面分析模型,可以划分为**表格、图片、图片标题、表格、表格标题、页眉、脚本、引用、公式**10类区域 | 9.7M | [推理模型](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_cdla_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_cdla.pdparams) | -| picodet_lcnet_x1_0_fgd_layout_table | 表格数据集训练的版面分析模型,只能检测表格 | 9.7M | [推理模型](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_table_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_table.pdparams) | +|模型名称|模型简介|推理模型大小|下载地址|dict path| +| --- | --- | --- | --- | --- | +| picodet_lcnet_x1_0_fgd_layout | 基于PicoDet LCNet_x1_0和FGD蒸馏在PubLayNet 数据集训练的英文版面分析模型,可以划分**文字、标题、表格、图片以及列表**5类区域 | 9.7M | 
[推理模型](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout.pdparams) | [PubLayNet dict](../../ppocr/utils/dict/layout_dict/layout_publaynet_dict.txt) | +| ppyolov2_r50vd_dcn_365e_publaynet | 基于PP-YOLOv2在PubLayNet数据集上训练的英文版面分析模型 | 221M | [推理模型](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet.tar) / [训练模型](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet_pretrained.pdparams) | 同上 | +| picodet_lcnet_x1_0_fgd_layout_cdla | CDLA数据集训练的中文版面分析模型,可以划分为**表格、图片、图片标题、表格、表格标题、页眉、页脚、引用、公式**10类区域 | 9.7M | [推理模型](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_cdla_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_cdla.pdparams) | [CDLA dict](../../ppocr/utils/dict/layout_dict/layout_cdla_dict.txt) | +| picodet_lcnet_x1_0_fgd_layout_table | 表格数据集训练的版面分析模型,支持中英文文档表格区域的检测 | 9.7M | [推理模型](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_table_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_table.pdparams) | [Table dict](../../ppocr/utils/dict/layout_dict/layout_table_dict.txt) | +| ppyolov2_r50vd_dcn_365e_tableBank_word | 基于PP-YOLOv2在TableBank Word 数据集训练的版面分析模型,支持英文文档表格区域的检测 | 221M | [推理模型](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_word.tar) | 同上 | +| ppyolov2_r50vd_dcn_365e_tableBank_latex | 基于PP-YOLOv2在TableBank Latex数据集训练的版面分析模型,支持英文文档表格区域的检测 | 221M | [推理模型](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_latex.tar) | 同上 | diff --git a/ppstructure/docs/models_list_en.md b/ppstructure/docs/models_list_en.md index 300f4c56..7d840b9d 100644 --- 
a/ppstructure/docs/models_list_en.md +++ b/ppstructure/docs/models_list_en.md @@ -6,15 +6,18 @@ - [2.2 Table Recognition](#22-table-recognition) - [3. KIE](#3-kie) - + ## 1. Layout Analysis -|model name| description |download| -| --- |---------------------------------------------------------------------------------------------------------------------------------------------------------| --- | -| picodet_lcnet_x1_0_fgd_layout | The layout analysis model trained on the PubLayNet dataset, the model can recognition 5 types of areas such as **Text, Title, Table, Picture and List** | [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout.pdparams) | -| picodet_lcnet_x1_0_fgd_layout_cdla | The layout analysis model trained on the CDLA dataset, the model can recognition 10 types of areas such as **Table、Figure、Figure caption、Table、Table caption、Header、Footer、Reference、Equation** | [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_cdla_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_cdla.pdparams) | -| picodet_lcnet_x1_0_fgd_layout_table | The layout analysis model trained on the table dataset, the model can only detect tables | [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_table_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_table.pdparams) | +|model name| description | inference model size |download|dict path| +| --- |---------------------------------------------------------------------------------------------------------------------------------------------------------| --- | --- | --- | +| picodet_lcnet_x1_0_fgd_layout | The layout analysis English model 
trained on the PubLayNet dataset based on PicoDet LCNet_x1_0 and FGD. The model can recognize 5 types of areas such as **Text, Title, Table, Picture and List** | 9.7M | [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout.pdparams) | [PubLayNet dict](../../ppocr/utils/dict/layout_dict/layout_publaynet_dict.txt) | +| ppyolov2_r50vd_dcn_365e_publaynet | The layout analysis English model trained on the PubLayNet dataset based on PP-YOLOv2 | 221M | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet.tar) / [trained model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet_pretrained.pdparams) | same as above | +| picodet_lcnet_x1_0_fgd_layout_cdla | The layout analysis Chinese model trained on the CDLA dataset, the model can recognize 10 types of areas such as **Table、Figure、Figure caption、Table、Table caption、Header、Footer、Reference、Equation** | 9.7M | [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_cdla_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_cdla.pdparams) | [CDLA dict](../../ppocr/utils/dict/layout_dict/layout_cdla_dict.txt) | +| picodet_lcnet_x1_0_fgd_layout_table | The layout analysis model trained on the table dataset, the model can detect tables in Chinese and English documents | 9.7M | [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_table_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_table.pdparams) | [Table dict](../../ppocr/utils/dict/layout_dict/layout_table_dict.txt) | +| ppyolov2_r50vd_dcn_365e_tableBank_word | The 
layout analysis model trained on the TableBank Word dataset based on PP-YOLOv2, the model can detect tables in English documents | 221M | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_word.tar) | same as above | +| ppyolov2_r50vd_dcn_365e_tableBank_latex | The layout analysis model trained on the TableBank Latex dataset based on PP-YOLOv2, the model can detect tables in English documents | 221M | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_latex.tar) | same as above | ## 2. OCR and Table Recognition diff --git a/ppstructure/docs/quickstart.md b/ppstructure/docs/quickstart.md index 9a538a6f..51cd1015 100644 --- a/ppstructure/docs/quickstart.md +++ b/ppstructure/docs/quickstart.md @@ -8,14 +8,16 @@ - [2.1.3 版面分析](#213-版面分析) - [2.1.4 表格识别](#214-表格识别) - [2.1.5 DocVQA](#215-dockie) + - [2.1.6 版面恢复](#216-版面恢复) - [2.2 代码使用](#22-代码使用) - [2.2.1 图像方向分类版面分析表格识别](#221-图像方向分类版面分析表格识别) - [2.2.2 版面分析+表格识别](#222-版面分析表格识别) - [2.2.3 版面分析](#223-版面分析) - [2.2.4 表格识别](#224-表格识别) - [2.2.5 DocVQA](#225-dockie) + - [2.2.6 版面恢复](#226-版面恢复) - [2.3 返回结果说明](#23-返回结果说明) - - [2.3.1 版面分析+表格识别](#231-版面分析表格识别) + - [2.3.1 版面分+表格识别](#231-版面分析表格识别) - [2.3.2 DocVQA](#232-dockie) - [2.4 参数说明](#24-参数说明) @@ -24,11 +26,12 @@ ## 1. 
安装依赖包 ```bash -# 安装 paddleocr,推荐使用2.5+版本 -pip3 install "paddleocr>=2.5" +# 安装 paddleocr,推荐使用2.6版本 +pip3 install "paddleocr>=2.6" # 安装 DocVQA依赖包paddlenlp(如不需要DocVQA功能,可跳过) -pip install paddlenlp - +pip3 install paddlenlp +# 安装 图像方向分类依赖包paddleclas(如不需要图像方向分类功能,可跳过) +pip3 install paddleclas ``` @@ -62,15 +65,25 @@ paddleocr --image_dir=PaddleOCR/ppstructure/docs/table/table.jpg --type=structur ``` + #### 2.1.5 DocVQA 请参考:[文档视觉问答](../kie/README.md)。 + + +#### 2.1.6 版面恢复 + +```bash +paddleocr --image_dir=PaddleOCR/ppstructure/docs/table/1.png --type=structure --recovery=true +``` + + ### 2.2 代码使用 -#### 2.2.1 图像方向分类版面分析表格识别 +#### 2.2.1 图像方向分类+版面分析+表格识别 ```python import os @@ -149,6 +162,7 @@ for line in result: ``` + #### 2.2.4 表格识别 ```python @@ -174,6 +188,33 @@ for line in result: 请参考:[文档视觉问答](../kie/README.md)。 + + +#### 2.2.6 版面恢复 + +```python +import os +import cv2 +from paddleocr import PPStructure,save_structure_res +from paddleocr.ppstructure.recovery.recovery_to_doc import sorted_layout_boxes, convert_info_docx + +table_engine = PPStructure(layout=False, show_log=True) + +save_folder = './output' +img_path = 'PaddleOCR/ppstructure/docs/table/1.png' +img = cv2.imread(img_path) +result = table_engine(img) +save_structure_res(result, save_folder, os.path.basename(img_path).split('.')[0]) + +for line in result: + line.pop('img') + print(line) + +h, w, _ = img.shape +res = sorted_layout_boxes(result, w) +convert_info_docx(img, res, save_folder, os.path.basename(img_path).split('.')[0]) +``` + ### 2.3 返回结果说明 PP-Structure的返回结果为一个dict组成的list,示例如下 @@ -235,6 +276,7 @@ dict 里各个字段说明如下 | table | 前向中是否执行表格识别 | True | | ocr | 对于版面分析中的非表格区域,是否执行ocr。当layout为False时会被自动设置为False| True | | recovery | 前向中是否执行版面恢复| False | +| save_pdf | 版面恢复导出docx文件的同时,是否导出pdf文件 | False | | structure_version | 模型版本,可选 PP-structure和PP-structurev2 | PP-structure | 大部分参数和PaddleOCR whl包保持一致,见 [whl包文档](../../doc/doc_ch/whl.md) diff --git a/ppstructure/docs/quickstart_en.md 
b/ppstructure/docs/quickstart_en.md index cf9d12ff..cccb30f8 100644 --- a/ppstructure/docs/quickstart_en.md +++ b/ppstructure/docs/quickstart_en.md @@ -8,12 +8,14 @@ - [2.1.3 layout analysis](#213-layout-analysis) - [2.1.4 table recognition](#214-table-recognition) - [2.1.5 DocVQA](#215-dockie) + - [2.1.6 layout recovery](#216-layout-recovery) - [2.2 Use by code](#22-use-by-code) - [2.2.1 image orientation + layout analysis + table recognition](#221-image-orientation--layout-analysis--table-recognition) - [2.2.2 layout analysis + table recognition](#222-layout-analysis--table-recognition) - [2.2.3 layout analysis](#223-layout-analysis) - [2.2.4 table recognition](#224-table-recognition) - [2.2.5 DocVQA](#225-dockie) + - [2.2.6 layout recovery](#226-layout-recovery) - [2.3 Result description](#23-result-description) - [2.3.1 layout analysis + table recognition](#231-layout-analysis--table-recognition) - [2.3.2 DocVQA](#232-dockie) @@ -24,10 +26,12 @@ ## 1. Install package ```bash -# Install paddleocr, version 2.5+ is recommended -pip3 install "paddleocr>=2.5" +# Install paddleocr, version 2.6 is recommended +pip3 install "paddleocr>=2.6" # Install the DocVQA dependency package paddlenlp (if you do not use the DocVQA, you can skip it) -pip install paddlenlp +pip3 install paddlenlp +# Install the image direction classification dependency package paddleclas (if you do not use the image direction classification, you can skip it) +pip3 install paddleclas ``` @@ -66,6 +70,12 @@ paddleocr --image_dir=PaddleOCR/ppstructure/docs/table/table.jpg --type=structur Please refer to: [Documentation Visual Q&A](../kie/README.md) . + +#### 2.1.6 layout recovery +```bash +paddleocr --image_dir=PaddleOCR/ppstructure/docs/table/1.png --type=structure --recovery=true +``` + ### 2.2 Use by code @@ -174,6 +184,32 @@ for line in result: Please refer to: [Documentation Visual Q&A](../kie/README.md) . 
+ +#### 2.2.6 layout recovery + +```python +import os +import cv2 +from paddleocr import PPStructure,save_structure_res +from paddleocr.ppstructure.recovery.recovery_to_doc import sorted_layout_boxes, convert_info_docx + +table_engine = PPStructure(layout=False, show_log=True) + +save_folder = './output' +img_path = 'PaddleOCR/ppstructure/docs/table/1.png' +img = cv2.imread(img_path) +result = table_engine(img) +save_structure_res(result, save_folder, os.path.basename(img_path).split('.')[0]) + +for line in result: + line.pop('img') + print(line) + +h, w, _ = img.shape +res = sorted_layout_boxes(result, w) +convert_info_docx(img, res, save_folder, os.path.basename(img_path).split('.')[0]) +``` + ### 2.3 Result description @@ -235,6 +271,7 @@ Please refer to: [Documentation Visual Q&A](../kie/README.md) . | table | Whether to perform table recognition in forward | True | | ocr | Whether to perform ocr for non-table areas in layout analysis. When layout is False, it will be automatically set to False| True | | recovery | Whether to perform layout recovery in forward| False | +| save_pdf | Whether to also export a pdf file when layout recovery exports the docx file | False | | structure_version | Structure version, optional PP-structure and PP-structurev2 | PP-structure | Most of the parameters are consistent with the PaddleOCR whl package, see [whl package documentation](../../doc/doc_en/whl.md) diff --git a/ppstructure/layout/README.md b/ppstructure/layout/README.md index f2dc9c0d..45386da3 100644 --- a/ppstructure/layout/README.md +++ b/ppstructure/layout/README.md @@ -175,7 +175,7 @@ cd pretrained_model wget https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_layout.pdparams ``` -下载更多[版面分析模型](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/docs/models_list.md#1-%E7%89%88%E9%9D%A2%E5%88%86%E6%9E%90%E6%A8%A1%E5%9E%8B)(中文CDLA数据集预训练模型、表格预训练模型) +下载更多[版面分析模型](../docs/models_list.md)(中文CDLA数据集预训练模型、表格预训练模型) ### 4.1. 
启动训练 diff --git a/ppstructure/recovery/README.md b/ppstructure/recovery/README.md index 698bee08..90a6a2c3 100644 --- a/ppstructure/recovery/README.md +++ b/ppstructure/recovery/README.md @@ -6,6 +6,8 @@ English | [简体中文](README_ch.md) - [2.1 Installation dependencies](#2.1) - [2.2 Install PaddleOCR](#2.2) - [3. Quick Start](#3) + - [3.1 Download models](#3.1) + - [3.2 Layout recovery](#3.2) @@ -17,8 +19,9 @@ Layout recovery combines [layout analysis](../layout/README.md)、[table recogni The following figure shows the result:
- +
+ ## 2. Install @@ -68,11 +71,11 @@ python3 -m pip install -r ppstructure/recovery/requirements.txt ## 3. Quick Start -### 3.1 下载模型 +### 3.1 Download models If input is English document, download English models: -```python +```bash cd PaddleOCR/ppstructure # download model @@ -91,7 +94,7 @@ If input is Chinese document,download Chinese models: [Chinese and English ultra-lightweight PP-OCRv3 model](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/README.md#pp-ocr-series-model-listupdate-on-september-8th)、[表格识别模型](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/docs/models_list.md#22-表格识别模型)、[版面分析模型](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/docs/models_list.md#1-版面分析模型) -### 3.2 版面恢复 +### 3.2 Layout recovery ```bash diff --git a/ppstructure/recovery/README_ch.md b/ppstructure/recovery/README_ch.md index 73405879..9215976d 100644 --- a/ppstructure/recovery/README_ch.md +++ b/ppstructure/recovery/README_ch.md @@ -78,7 +78,7 @@ python3 -m pip install -r ppstructure/recovery/requirements.txt 如果输入为英文文档类型,下载英文模型 -``` +```bash cd PaddleOCR/ppstructure # 下载模型 @@ -104,7 +104,7 @@ cd .. 使用下载的模型恢复给定文档的版面,以英文模型为例,执行如下命令: -```python +```bash python3 predict_system.py \ --image_dir=./docs/table/1.png \ --det_model_dir=inference/en_PP-OCRv3_det_infer \ -- GitLab