diff --git a/ppstructure/layout/README.md b/ppstructure/layout/README.md index 74cb928e30c012d5b469d685fd63b443a7d22613..0931702a7cf411e6589a1375e014a7374442f9f0 100644 --- a/ppstructure/layout/README.md +++ b/ppstructure/layout/README.md @@ -1,28 +1,19 @@ English | [简体中文](README_ch.md) - +- [Getting Started](#getting-started) + - [1. Install whl package](#1--install-whl-package) + - [2. Quick Start](#2-quick-start) + - [3. PostProcess](#3-postprocess) + - [4. Results](#4-results) + - [5. Training](#5-training) # Getting Started -[1. Install whl package](#Install) - -[2. Quick Start](#QuickStart) - -[3. PostProcess](#PostProcess) - -[4. Results](#Results) - -[5. Training](#Training) - - - ## 1. Install whl package ```bash wget https://paddleocr.bj.bcebos.com/whl/layoutparser-0.0.0-py3-none-any.whl pip install -U layoutparser-0.0.0-py3-none-any.whl ``` - - ## 2. Quick Start Use LayoutParser to identify the layout of a document: @@ -77,8 +68,6 @@ The following model configurations and label maps are currently supported, which * TableBank word and TableBank latex are trained on datasets of word documents and latex documents respectively; * Download TableBank dataset contains both word and latex。 - - ## 3. PostProcess Layout parser contains multiple categories, if you only want to get the detection box for a specific category (such as the "Text" category), you can use the following code: @@ -119,7 +108,6 @@ Displays results with only the "Text" category:
- ## 4. Results @@ -134,8 +122,6 @@ Displays results with only the "Text" category: ​ **GPU:** a single NVIDIA Tesla P40 - - ## 5. Training The above model is based on [PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection). If you want to train your own layout parser model,please refer to:[train_layoutparser_model](train_layoutparser_model.md) diff --git a/ppstructure/layout/README_ch.md b/ppstructure/layout/README_ch.md index c722e0bd88f40ff6b711edecff0433029e101f87..6fec748b7683264f5b4a7d29c0e51c84773425ba 100644 --- a/ppstructure/layout/README_ch.md +++ b/ppstructure/layout/README_ch.md @@ -1,26 +1,18 @@ [English](README.md) | 简体中文 +- [版面分析使用说明](#版面分析使用说明) + - [1. 安装whl包](#1--安装whl包) + - [2. 使用](#2-使用) + - [3. 后处理](#3-后处理) + - [4. 指标](#4-指标) + - [5. 训练版面分析模型](#5-训练版面分析模型) # 版面分析使用说明 -[1. 安装whl包](#安装whl包) - -[2. 使用](#使用) - -[3. 后处理](#后处理) - -[4. 指标](#指标) - -[5. 训练版面分析模型](#训练版面分析模型) - - - ## 1. 安装whl包 ```bash pip install -U https://paddleocr.bj.bcebos.com/whl/layoutparser-0.0.0-py3-none-any.whl ``` - - ## 2. 使用 使用layoutparser识别给定文档的布局: @@ -76,8 +68,6 @@ show_img.show() * TableBank word和TableBank latex分别在word文档、latex文档数据集训练; * 下载的TableBank数据集里同时包含word和latex。 - - ## 3. 后处理 版面分析检测包含多个类别,如果只想获取指定类别(如"Text"类别)的检测框、可以使用下述代码: @@ -119,8 +109,6 @@ show_img.show() - - ## 4. 指标 | Dataset | mAP | CPU time cost | GPU time cost | @@ -134,8 +122,6 @@ show_img.show() ​ **GPU:** a single NVIDIA Tesla P40 - - ## 5. 训练版面分析模型 上述模型基于[PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection) 训练,如果您想训练自己的版面分析模型,请参考:[train_layoutparser_model](train_layoutparser_model_ch.md) diff --git a/ppstructure/layout/train_layoutparser_model.md b/ppstructure/layout/train_layoutparser_model.md index 58975d71606e45b2f68a7f68565459042ef32775..e877c9c0c901e8be8299101daa5ce6248de0a1dc 100644 --- a/ppstructure/layout/train_layoutparser_model.md +++ b/ppstructure/layout/train_layoutparser_model.md @@ -1,31 +1,20 @@ -# Training layout-parse - -[1. Installation](#Installation) - -​ [1.1 Requirements](#Requirements) - -​ [1.2 Install PaddleDetection](#Install_PaddleDetection) - -[2. Data preparation](#Data_reparation) - -[3. Configuration](#Configuration) +English | [简体中文](train_layoutparser_model_ch.md) +- [Training layout-parse](#training-layout-parse) + - [1. Installation](#1--installation) + - [1.1 Requirements](#11-requirements) + - [1.2 Install PaddleDetection](#12-install-paddledetection) + - [2. Data preparation](#2-data-preparation) + - [3. Configuration](#3-configuration) + - [4. Training](#4-training) + - [5. Prediction](#5-prediction) + - [6. Deployment](#6-deployment) + - [6.1 Export model](#61-export-model) + - [6.2 Inference](#62-inference) -[4. Training](#Training) - -[5. Prediction](#Prediction) - -[6. Deployment](#Deployment) - -​ [6.1 Export model](#Export_model) - -​ [6.2 Inference](#Inference) - - +# Training layout-parse ## 1. Installation - - ### 1.1 Requirements - PaddlePaddle 2.1 @@ -35,8 +24,6 @@ - CUDA >= 10.1 - cuDNN >= 7.6 - - ### 1.2 Install PaddleDetection ```bash @@ -51,8 +38,6 @@ pip install -r requirements.txt For more installation tutorials, please refer to: [Install doc](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/docs/tutorials/INSTALL_cn.md) - - ## 2. Data preparation Download the [PubLayNet](https://github.com/ibm-aur-nlp/PubLayNet) dataset @@ -80,8 +65,6 @@ PubLayNet directory structure after decompressing : For other datasets,please refer to [the PrepareDataSet]((https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/docs/tutorials/PrepareDataSet.md) ) - - ## 3. Configuration We use the `configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml` configuration for training,the configuration file is as follows @@ -113,8 +96,6 @@ The `ppyolov2_r50vd_dcn_365e_coco.yml` configuration depends on other configurat Modify the preceding files, such as the dataset path and batch size etc. - - ## 4. Training PaddleDetection provides single-card/multi-card training mode to meet various training needs of users: @@ -146,8 +127,6 @@ python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/ppy Note: If you encounter "`Out of memory error`" , try reducing `batch_size` in the `ppyolov2_reader.yml` file -prediction - ## 5. Prediction Set parameters and use PaddleDetection to predict: @@ -159,14 +138,10 @@ python tools/infer.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --infer `--draw_threshold` is an optional parameter. According to the calculation of [NMS](https://ieeexplore.ieee.org/document/1699659), different threshold will produce different results, ` keep_top_k ` represent the maximum amount of output target, the default value is 10. You can set different value according to your own actual situation。 - - ## 6. Deployment Use your trained model in Layout Parser - - ### 6.1 Export model n the process of model training, the model file saved contains the process of forward prediction and back propagation. In the actual industrial deployment, there is no need for back propagation. Therefore, the model should be translated into the model format required by the deployment. The `tools/export_model.py` script is provided in PaddleDetection to export the model. @@ -183,8 +158,6 @@ The prediction model is exported to `inference/ppyolov2_r50vd_dcn_365e_coco` ,in More model export tutorials, please refer to:[EXPORT_MODEL](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/deploy/EXPORT_MODEL.md) - - ### 6.2 Inference `model_path` represent the trained model path, and layoutparser is used to predict: @@ -194,8 +167,6 @@ import layoutparser as lp model = lp.PaddleDetectionLayoutModel(model_path="inference/ppyolov2_r50vd_dcn_365e_coco", threshold=0.5,label_map={0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"},enforce_cpu=True,enable_mkldnn=True) ``` - - *** More PaddleDetection training tutorials,please reference:[PaddleDetection Training](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/docs/tutorials/GETTING_STARTED_cn.md) diff --git a/ppstructure/layout/train_layoutparser_model_ch.md b/ppstructure/layout/train_layoutparser_model_ch.md index 2f73c63adcea3f82ae579222e658291224f46237..a89b0f3819b52c79b86d2ada13bac23e3d1656ed 100644 --- a/ppstructure/layout/train_layoutparser_model_ch.md +++ b/ppstructure/layout/train_layoutparser_model_ch.md @@ -1,31 +1,20 @@ -# 训练版面分析 - -[1. 安装](#安装) - -​ [1.1 环境要求](#环境要求) - -​ [1.2 安装PaddleDetection](#安装PaddleDetection) - -[2. 准备数据](#准备数据) - -[3. 配置文件改动和说明](#配置文件改动和说明) - -[4. PaddleDetection训练](#训练) - -[5. PaddleDetection预测](#预测) - -[6. 预测部署](#预测部署) - -​ [6.1 模型导出](#模型导出) - -​ [6.2 layout parser预测](#layout_parser预测) +[English](train_layoutparser_model.md) | 简体中文 +- [训练版面分析](#训练版面分析) + - [1. 安装](#1-安装) + - [1.1 环境要求](#11-环境要求) + - [1.2 安装PaddleDetection](#12-安装paddledetection) + - [2. 准备数据](#2-准备数据) + - [3. 配置文件改动和说明](#3-配置文件改动和说明) + - [4. PaddleDetection训练](#4-paddledetection训练) + - [5. PaddleDetection预测](#5-paddledetection预测) + - [6. 预测部署](#6-预测部署) + - [6.1 模型导出](#61-模型导出) + - [6.2 layout_parser预测](#62-layout_parser预测) - +# 训练版面分析 ## 1. 安装 - - ### 1.1 环境要求 - PaddlePaddle 2.1 @@ -35,8 +24,6 @@ - CUDA >= 10.1 - cuDNN >= 7.6 - - ### 1.2 安装PaddleDetection ```bash @@ -51,8 +38,6 @@ pip install -r requirements.txt 更多安装教程,请参考: [Install doc](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/docs/tutorials/INSTALL_cn.md) - - ## 2. 准备数据 下载 [PubLayNet](https://github.com/ibm-aur-nlp/PubLayNet) 数据集: @@ -80,8 +65,6 @@ tar -xvf publaynet.tar.gz 如果使用其它数据集,请参考[准备训练数据](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/docs/tutorials/PrepareDataSet.md) - - ## 3. 配置文件改动和说明 我们使用 `configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml`配置进行训练,配置文件摘要如下: @@ -113,8 +96,6 @@ weights: output/ppyolov2_r50vd_dcn_365e_coco/model_final 根据实际情况,修改上述文件,比如数据集路径、batch size等。 - - ## 4. PaddleDetection训练 PaddleDetection提供了单卡/多卡训练模式,满足用户多种训练需求 @@ -146,8 +127,6 @@ python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/ppy 注意:如果遇到 "`Out of memory error`" 问题, 尝试在 `ppyolov2_reader.yml` 文件中调小`batch_size` - - ## 5. PaddleDetection预测 设置参数,使用PaddleDetection预测: @@ -159,14 +138,10 @@ python tools/infer.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --infer `--draw_threshold` 是个可选参数. 根据 [NMS](https://ieeexplore.ieee.org/document/1699659) 的计算,不同阈值会产生不同的结果 `keep_top_k`表示设置输出目标的最大数量,默认值为100,用户可以根据自己的实际情况进行设定。 - - ## 6. 预测部署 在layout parser中使用自己训练好的模型。 - - ### 6.1 模型导出 在模型训练过程中保存的模型文件是包含前向预测和反向传播的过程,在实际的工业部署则不需要反向传播,因此需要将模型进行导成部署需要的模型格式。 在PaddleDetection中提供了 `tools/export_model.py`脚本来导出模型。 @@ -183,8 +158,6 @@ python tools/export_model.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml 更多模型导出教程,请参考:[EXPORT_MODEL](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/deploy/EXPORT_MODEL.md) - - ### 6.2 layout_parser预测 `model_path`指定训练好的模型路径,使用layout parser进行预测: diff --git a/ppstructure/table/README.md b/ppstructure/table/README.md index 94fa76055b93cefab0ac507a6007ec148aa12945..6137cfaef657d70a2b3a2b7eb9c69e364e421d96 100644 --- a/ppstructure/table/README.md +++ b/ppstructure/table/README.md @@ -1,3 +1,13 @@ +- [Table Recognition](#table-recognition) + - [1. pipeline](#1-pipeline) + - [2. Performance](#2-performance) + - [3. How to use](#3-how-to-use) + - [3.1 quick start](#31-quick-start) + - [3.2 Train](#32-train) + - [3.3 Eval](#33-eval) + - [3.4 Inference](#34-inference) + + # Table Recognition ## 1. pipeline @@ -51,10 +61,10 @@ After running, the excel sheet of each picture will be saved in the directory sp In this chapter, we only introduce the training of the table structure model, For model training of [text detection](../../doc/doc_en/detection_en.md) and [text recognition](../../doc/doc_en/recognition_en.md), please refer to the corresponding documents -#### data preparation +* data preparation The training data uses public data set [PubTabNet](https://arxiv.org/abs/1911.10683 ), Can be downloaded from the official [website](https://github.com/ibm-aur-nlp/PubTabNet) 。The PubTabNet data set contains about 500,000 images, as well as annotations in html format。 -#### Start training +* Start training *If you are installing the cpu version of paddle, please modify the `use_gpu` field in the configuration file to false* ```shell # single GPU training @@ -67,7 +77,7 @@ python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/ In the above instruction, use `-c` to select the training to use the `configs/table/table_mv3.yml` configuration file. For a detailed explanation of the configuration file, please refer to [config](../../doc/doc_en/config_en.md). -#### load trained model and continue training +* load trained model and continue training If you expect to load trained model and continue the training again, you can specify the parameter `Global.checkpoints` as the model path to be loaded. diff --git a/ppstructure/table/README_ch.md b/ppstructure/table/README_ch.md index ef0f1ae5c4554e69e4cbeb0fcd783e6d98f96a41..39081995e6dd1e0a05fc88d067bab119ca7b6e39 100644 --- a/ppstructure/table/README_ch.md +++ b/ppstructure/table/README_ch.md @@ -1,14 +1,14 @@ -# 表格识别 +- [表格识别](#表格识别) + - [1. 表格识别 pipeline](#1-表格识别-pipeline) + - [2. 性能](#2-性能) + - [3. 使用](#3-使用) + - [3.1 快速开始](#31-快速开始) + - [3.2 训练](#32-训练) + - [3.3 评估](#33-评估) + - [3.4 预测](#34-预测) -* [1. 表格识别 pipeline](#1) -* [2. 性能](#2) -* [3. 使用](#3) - + [3.1 快速开始](#31) - + [3.2 训练](#32) - + [3.3 评估](#33) - + [3.4 预测](#34) +# 表格识别 - ## 1. 表格识别 pipeline 表格识别主要包含三个模型 @@ -28,7 +28,6 @@ 4. 单元格的识别结果和表格结构一起构造表格的html字符串。 - ## 2. 性能 我们在 PubTabNet[1] 评估数据集上对算法进行了评估,性能如下 @@ -38,9 +37,8 @@ | EDD[2] | 88.3 | | Ours | 93.32 | - ## 3. 使用 - + ### 3.1 快速开始 ```python @@ -61,14 +59,17 @@ python3 table/predict_table.py --det_model_dir=inference/en_ppocr_mobile_v2.0_ta 运行完成后,每张图片的excel表格会保存到output字段指定的目录下 note: 上述模型是在 PubLayNet 数据集上训练的表格识别模型,仅支持英文扫描场景,如需识别其他场景需要自己训练模型后替换 `det_model_dir`,`rec_model_dir`,`table_model_dir`三个字段即可。 - + ### 3.2 训练 + 在这一章节中,我们仅介绍表格结构模型的训练,[文字检测](../../doc/doc_ch/detection.md)和[文字识别](../../doc/doc_ch/recognition.md)的模型训练请参考对应的文档。 -#### 数据准备 +* 数据准备 + 训练数据使用公开数据集PubTabNet ([论文](https://arxiv.org/abs/1911.10683),[下载地址](https://github.com/ibm-aur-nlp/PubTabNet))。PubTabNet数据集包含约50万张表格数据的图像,以及图像对应的html格式的注释。 -#### 启动训练 +* 启动训练 + *如果您安装的是cpu版本,请将配置文件中的 `use_gpu` 字段修改为false* ```shell # 单机单卡训练 @@ -79,7 +80,7 @@ python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/ 上述指令中,通过-c 选择训练使用configs/table/table_mv3.yml配置文件。有关配置文件的详细解释,请参考[链接](../../doc/doc_ch/config.md)。 -#### 断点训练 +* 断点训练 如果训练程序中断,如果希望加载训练中断的模型从而恢复训练,可以通过指定Global.checkpoints指定要加载的模型路径: ```shell @@ -88,7 +89,6 @@ python3 tools/train.py -c configs/table/table_mv3.yml -o Global.checkpoints=./yo **注意**:`Global.checkpoints`的优先级高于`Global.pretrain_weights`的优先级,即同时指定两个参数时,优先加载`Global.checkpoints`指定的模型,如果`Global.checkpoints`指定的模型路径有误,会加载`Global.pretrain_weights`指定的模型。 - ### 3.3 评估 表格使用 [TEDS(Tree-Edit-Distance-based Similarity)](https://github.com/ibm-aur-nlp/PubTabNet/tree/master/src) 作为模型的评估指标。在进行模型评估之前,需要将pipeline中的三个模型分别导出为inference模型(我们已经提供好),还需要准备评估的gt, gt示例如下: @@ -113,7 +113,6 @@ python3 table/eval_table.py --det_model_dir=path/to/det_model_dir --rec_model_di ```bash teds: 93.32 ``` - ### 3.4 预测 ```python