Commit 36f2ae41 authored by 文幕地方

rm <a> in doc

Parent 208f91b5
English | [简体中文](README_ch.md)
- [Getting Started](#getting-started)
- [1. Install whl package](#1--install-whl-package)
- [2. Quick Start](#2-quick-start)
- [3. PostProcess](#3-postprocess)
- [4. Results](#4-results)
- [5. Training](#5-training)
# Getting Started
## 1. Install whl package
```bash
wget https://paddleocr.bj.bcebos.com/whl/layoutparser-0.0.0-py3-none-any.whl
pip install -U layoutparser-0.0.0-py3-none-any.whl
```
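A quick way to confirm the wheel is importable (a minimal sketch; the version string assumes the custom wheel installed above):

```python
# Sanity check: the custom wheel should expose the PaddleDetection-based layout model class.
import layoutparser as lp

print(lp.__version__)                              # the wheel above is published as version 0.0.0
print(hasattr(lp, "PaddleDetectionLayoutModel"))   # the detector class used in the next sections
```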
## 2. Quick Start
Use LayoutParser to identify the layout of a document:
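A minimal sketch of the typical call sequence; the image path, config identifier, and label map are illustrative and should be checked against the supported-model table referenced below:

```python
# A minimal sketch: load a PubLayNet layout model and visualize the detected blocks.
import cv2
import layoutparser as lp

image = cv2.imread("doc/table/layout.jpg")   # any document image
image = image[..., ::-1]                     # OpenCV loads BGR; convert to RGB

model = lp.PaddleDetectionLayoutModel(
    config_path="lp://PubLayNet/ppyolov2_r50vd_dcn_365e_publaynet/config",  # assumed model id
    threshold=0.5,
    label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"},
    enforce_cpu=False,
    enable_mkldnn=True)

layout = model.detect(image)                 # list of blocks, each with a type and coordinates
show_img = lp.draw_box(image, layout, box_width=3, show_element_type=True)
show_img.show()
```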
......@@ -77,8 +68,6 @@ The following model configurations and label maps are currently supported, which
* TableBank word and TableBank latex are trained on datasets of word documents and latex documents respectively;
* The downloaded TableBank dataset contains both word and latex.
## 3. PostProcess
Layout parser detects multiple categories. If you only want to get the detection boxes for a specific category (such as the "Text" category), you can use the following code:
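For example, a short sketch of keeping only the "Text" blocks, assuming `lp`, `image`, and `layout` come from the quick-start code in section 2:

```python
# Filter the detection result down to a single category and draw it.
text_blocks = lp.Layout([block for block in layout if block.type == "Text"])
show_img = lp.draw_box(image, text_blocks, box_width=3, show_element_type=True)
show_img.show()
```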
......@@ -119,7 +108,6 @@ Displays results with only the "Text" category:
<div align="center">
<img src="../../doc/table/result_text.jpg" width = "600" />
</div>
## 4. Results
......@@ -134,8 +122,6 @@ Displays results with only the "Text" category:
**GPU:** a single NVIDIA Tesla P40
## 5. Training
The above model is based on [PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection). If you want to train your own layout parser model, please refer to: [train_layoutparser_model](train_layoutparser_model.md)
[English](README.md) | Simplified Chinese
- [Layout Analysis Usage Guide](#版面分析使用说明)
- [1. Install the whl package](#1--安装whl包)
- [2. Usage](#2-使用)
- [3. PostProcess](#3-后处理)
- [4. Metrics](#4-指标)
- [5. Train a layout analysis model](#5-训练版面分析模型)
# Layout Analysis Usage Guide
## 1. Install the whl package
```bash
pip install -U https://paddleocr.bj.bcebos.com/whl/layoutparser-0.0.0-py3-none-any.whl
```
## 2. Usage
Use layoutparser to identify the layout of a given document:
......@@ -76,8 +68,6 @@ show_img.show()
* TableBank word and TableBank latex are trained on word-document and latex-document datasets respectively;
* The downloaded TableBank dataset contains both word and latex.
## 3. PostProcess
Layout analysis detects multiple categories. If you only want to get the detection boxes of a specified category (such as the "Text" category), you can use the following code:
......@@ -119,8 +109,6 @@ show_img.show()
<img src="../../doc/table/result_text.jpg" width = "600" />
</div>
## 4. Metrics
| Dataset | mAP | CPU time cost | GPU time cost |
......@@ -134,8 +122,6 @@ show_img.show()
**GPU:** a single NVIDIA Tesla P40
## 5. Train a layout analysis model
The above model is trained with [PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection). If you want to train your own layout analysis model, please refer to: [train_layoutparser_model](train_layoutparser_model_ch.md)
English | [简体中文](train_layoutparser_model_ch.md)
- [Training layout-parse](#training-layout-parse)
- [1. Installation](#1--installation)
- [1.1 Requirements](#11-requirements)
- [1.2 Install PaddleDetection](#12-install-paddledetection)
- [2. Data preparation](#2-data-preparation)
- [3. Configuration](#3-configuration)
- [4. Training](#4-training)
- [5. Prediction](#5-prediction)
- [6. Deployment](#6-deployment)
- [6.1 Export model](#61-export-model)
- [6.2 Inference](#62-inference)
# Training layout-parse
## 1. Installation
### 1.1 Requirements
- PaddlePaddle 2.1
......@@ -35,8 +24,6 @@
- CUDA >= 10.1
- cuDNN >= 7.6
### 1.2 Install PaddleDetection
```bash
......@@ -51,8 +38,6 @@ pip install -r requirements.txt
For more installation tutorials, please refer to: [Install doc](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/docs/tutorials/INSTALL_cn.md)
## 2. Data preparation
Download the [PubLayNet](https://github.com/ibm-aur-nlp/PubLayNet) dataset
......@@ -80,8 +65,6 @@ PubLayNet directory structure after decompressing :
For other datasets, please refer to [PrepareDataSet](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/docs/tutorials/PrepareDataSet.md).
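If you prefer to script the download and extraction instead of using `wget`/`tar`, a rough sketch (the URL below is a placeholder; substitute the official PubLayNet download link):

```python
# Download and unpack PubLayNet into the current directory.
import tarfile
import urllib.request

PUBLAYNET_URL = "https://example.com/publaynet.tar.gz"   # placeholder, see the PubLayNet page
urllib.request.urlretrieve(PUBLAYNET_URL, "publaynet.tar.gz")

with tarfile.open("publaynet.tar.gz") as tar:
    tar.extractall(path=".")                             # produces the directory structure shown above
```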
## 3. Configuration
We use the `configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml` configuration for training. The configuration file is as follows:
......@@ -113,8 +96,6 @@ The `ppyolov2_r50vd_dcn_365e_coco.yml` configuration depends on other configurat
Modify the preceding files as needed, for example the dataset path, batch size, etc.
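The same changes can also be scripted; a sketch assuming PyYAML is installed, that the reader config lives at the path below, and that its keys follow a typical PaddleDetection 2.x reader file:

```python
# Patch the reader config programmatically instead of editing it by hand.
import yaml  # pip install pyyaml

reader_cfg_path = "configs/ppyolo/ppyolov2_reader.yml"  # assumed location inside the PaddleDetection checkout
with open(reader_cfg_path) as f:
    cfg = yaml.safe_load(f)

cfg["TrainReader"]["batch_size"] = 8                    # lower this if you hit "Out of memory error"
with open(reader_cfg_path, "w") as f:
    yaml.safe_dump(cfg, f)
```

Note that round-tripping through `safe_dump` drops any comments in the file, so hand-editing is preferable if you want to keep them.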
## 4. Training
PaddleDetection provides single-GPU/multi-GPU training modes to meet users' various training needs:
......@@ -146,8 +127,6 @@ python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/ppy
Note: If you encounter an "`Out of memory error`", try reducing `batch_size` in the `ppyolov2_reader.yml` file.
## 5. Prediction
Set parameters and use PaddleDetection to predict:
......@@ -159,14 +138,10 @@ python tools/infer.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --infer
`--draw_threshold` is an optional parameter. Because of how [NMS](https://ieeexplore.ieee.org/document/1699659) is computed, different thresholds produce different results. `keep_top_k` sets the maximum number of output targets; the default value is 10. You can set a different value according to your actual situation.
## 6. Deployment
Use your trained model in Layout Parser
### 6.1 Export model
In the process of model training, the saved model file contains both the forward-prediction and back-propagation processes. In actual industrial deployment, back propagation is not needed, so the model should be exported into the format required for deployment. PaddleDetection provides the `tools/export_model.py` script to export the model.
......@@ -183,8 +158,6 @@ The prediction model is exported to `inference/ppyolov2_r50vd_dcn_365e_coco` ,in
More model export tutorials, please refer to:[EXPORT_MODEL](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/deploy/EXPORT_MODEL.md)
### 6.2 Inference
`model_path` specifies the path of the trained model, and layoutparser is used for prediction:
......@@ -194,8 +167,6 @@ import layoutparser as lp
model = lp.PaddleDetectionLayoutModel(model_path="inference/ppyolov2_r50vd_dcn_365e_coco", threshold=0.5,label_map={0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"},enforce_cpu=True,enable_mkldnn=True)
```
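With the model constructed as above, detection works the same way as with the pretrained configurations; a short sketch (the image path is illustrative):

```python
import cv2

image = cv2.imread("doc/table/layout.jpg")[..., ::-1]   # illustrative test image, BGR -> RGB
layout = model.detect(image)                            # same detect() API as before
for block in layout:
    print(block.type, block.coordinates)                # category and (x1, y1, x2, y2)
lp.draw_box(image, layout, box_width=3).show()
```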
***
For more PaddleDetection training tutorials, please refer to: [PaddleDetection Training](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/docs/tutorials/GETTING_STARTED_cn.md)
......
[English](train_layoutparser_model.md) | Simplified Chinese
- [Training a layout analysis model](#训练版面分析)
- [1. Installation](#1-安装)
- [1.1 Requirements](#11-环境要求)
- [1.2 Install PaddleDetection](#12-安装paddledetection)
- [2. Data preparation](#2-准备数据)
- [3. Configuration changes and notes](#3-配置文件改动和说明)
- [4. Training with PaddleDetection](#4-paddledetection训练)
- [5. Prediction with PaddleDetection](#5-paddledetection预测)
- [6. Deployment](#6-预测部署)
- [6.1 Export the model](#61-模型导出)
- [6.2 Prediction with layout_parser](#62-layout_parser预测)
# Training a layout analysis model
## 1. Installation
### 1.1 Requirements
- PaddlePaddle 2.1
......@@ -35,8 +24,6 @@
- CUDA >= 10.1
- cuDNN >= 7.6
### 1.2 Install PaddleDetection
```bash
......@@ -51,8 +38,6 @@ pip install -r requirements.txt
For more installation tutorials, please refer to: [Install doc](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/docs/tutorials/INSTALL_cn.md)
## 2. Data preparation
Download the [PubLayNet](https://github.com/ibm-aur-nlp/PubLayNet) dataset:
......@@ -80,8 +65,6 @@ tar -xvf publaynet.tar.gz
If you use another dataset, please refer to [PrepareDataSet](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/docs/tutorials/PrepareDataSet.md).
## 3. Configuration changes and notes
We use the `configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml` configuration for training. A summary of the configuration file is as follows:
......@@ -113,8 +96,6 @@ weights: output/ppyolov2_r50vd_dcn_365e_coco/model_final
Modify the above files according to your actual situation, for example the dataset path, batch size, etc.
## 4. Training with PaddleDetection
PaddleDetection provides single-GPU/multi-GPU training modes to meet users' various training needs.
......@@ -146,8 +127,6 @@ python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/ppy
Note: If you encounter an "`Out of memory error`", try reducing `batch_size` in the `ppyolov2_reader.yml` file.
## 5. Prediction with PaddleDetection
Set the parameters and use PaddleDetection to predict:
......@@ -159,14 +138,10 @@ python tools/infer.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --infer
`--draw_threshold` is an optional parameter. Because of how [NMS](https://ieeexplore.ieee.org/document/1699659) is computed, different thresholds produce different results. `keep_top_k` sets the maximum number of output targets; the default value is 100. You can set a different value according to your actual situation.
## 6. Deployment
Use your trained model in layout parser.
### 6.1 Export the model
The model file saved during training contains both the forward-prediction and back-propagation processes. In actual industrial deployment, back propagation is not needed, so the model must be exported into the format required for deployment. PaddleDetection provides the `tools/export_model.py` script to export the model.
......@@ -183,8 +158,6 @@ python tools/export_model.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml
For more model export tutorials, please refer to: [EXPORT_MODEL](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/deploy/EXPORT_MODEL.md)
### 6.2 Prediction with layout_parser
`model_path` specifies the path of the trained model; use layout parser to predict:
......
- [Table Recognition](#table-recognition)
- [1. pipeline](#1-pipeline)
- [2. Performance](#2-performance)
- [3. How to use](#3-how-to-use)
- [3.1 quick start](#31-quick-start)
- [3.2 Train](#32-train)
- [3.3 Eval](#33-eval)
- [3.4 Inference](#34-inference)
# Table Recognition
## 1. pipeline
......@@ -51,10 +61,10 @@ After running, the excel sheet of each picture will be saved in the directory sp
In this chapter, we only introduce the training of the table structure model. For model training of [text detection](../../doc/doc_en/detection_en.md) and [text recognition](../../doc/doc_en/recognition_en.md), please refer to the corresponding documents.
* data preparation
The training data uses the public dataset [PubTabNet](https://arxiv.org/abs/1911.10683), which can be downloaded from the official [website](https://github.com/ibm-aur-nlp/PubTabNet). The PubTabNet dataset contains about 500,000 images, as well as annotations in HTML format.
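A sketch of inspecting the downloaded annotations; the file name and field names follow the published PubTabNet jsonl release and should be verified against the version you download:

```python
# Peek at one PubTabNet annotation (jsonl: one JSON object per line).
import json

with open("pubtabnet/PubTabNet_2.0.0.jsonl") as f:      # assumed file name from the PubTabNet release
    sample = json.loads(f.readline())

print(sample["filename"], sample["split"])              # image name and its train/val split
print(sample["html"]["structure"]["tokens"][:10])       # structure tokens such as "<thead>", "<tr>", "<td>"
print(sample["html"]["cells"][0])                       # per-cell text tokens (plus a bbox for non-empty cells)
```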
* Start training
*If you installed the CPU version of Paddle, please change the `use_gpu` field in the configuration file to false*
```shell
# single GPU training
......@@ -67,7 +77,7 @@ python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/
In the above command, `-c` selects the `configs/table/table_mv3.yml` configuration file for training.
For a detailed explanation of the configuration file, please refer to [config](../../doc/doc_en/config_en.md).
* load trained model and continue training
If you want to load a trained model and continue training, you can specify the parameter `Global.checkpoints` as the path of the model to be loaded.
......
# Table Recognition
- [Table Recognition](#表格识别)
- [1. Table recognition pipeline](#1-表格识别-pipeline)
- [2. Performance](#2-性能)
- [3. Usage](#3-使用)
- [3.1 Quick start](#31-快速开始)
- [3.2 Training](#32-训练)
- [3.3 Evaluation](#33-评估)
- [3.4 Prediction](#34-预测)
## 1. Table recognition pipeline
Table recognition mainly involves three models:
......@@ -28,7 +28,6 @@
4. The cell recognition results and the table structure are combined to construct the HTML string of the table.
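As a rough illustration of step 4, a toy sketch of merging PubTabNet-style structure tokens with recognized cell strings into an HTML table (both inputs are made up for the example):

```python
# Fill recognized cell texts into the predicted structure tokens to build the table HTML.
structure_tokens = ["<table>", "<tr>", "<td>", "</td>", "<td>", "</td>", "</tr>",
                    "<tr>", "<td>", "</td>", "<td>", "</td>", "</tr>", "</table>"]
cell_texts = ["Method", "Acc", "Ours", "93.32"]          # output of the text recognition model, in cell order

html_parts, cell_idx = [], 0
for token in structure_tokens:
    html_parts.append(token)
    if token == "<td>":                                  # each opening <td> receives the next recognized cell
        html_parts.append(cell_texts[cell_idx])
        cell_idx += 1

print("".join(html_parts))
# <table><tr><td>Method</td><td>Acc</td></tr><tr><td>Ours</td><td>93.32</td></tr></table>
```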
## 2. Performance
We evaluated the algorithm on the PubTabNet<sup>[1]</sup> evaluation dataset. The performance is as follows:
......@@ -38,9 +37,8 @@
| EDD<sup>[2]</sup> | 88.3 |
| Ours | 93.32 |
## 3. Usage
### 3.1 Quick start
```python
......@@ -61,14 +59,17 @@ python3 table/predict_table.py --det_model_dir=inference/en_ppocr_mobile_v2.0_ta
After running, the Excel file for each image will be saved in the directory specified by the `output` field.
Note: The above model is a table recognition model trained on the PubLayNet dataset and only supports English scanned scenarios. To recognize other scenarios, you need to train your own models and then replace the three fields `det_model_dir`, `rec_model_dir`, and `table_model_dir`.
### 3.2 Training
In this section, we only introduce the training of the table structure model. For model training of [text detection](../../doc/doc_ch/detection.md) and [text recognition](../../doc/doc_ch/recognition.md), please refer to the corresponding documents.
* Data preparation
The training data uses the public dataset PubTabNet ([paper](https://arxiv.org/abs/1911.10683), [download link](https://github.com/ibm-aur-nlp/PubTabNet)). The PubTabNet dataset contains about 500,000 table images, together with the corresponding annotations in HTML format.
* Start training
*If you installed the CPU version, please change the `use_gpu` field in the configuration file to false*
```shell
# single-machine, single-GPU training
......@@ -79,7 +80,7 @@ python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/
In the above command, `-c` selects the `configs/table/table_mv3.yml` configuration file for training. For a detailed explanation of the configuration file, please refer to [this document](../../doc/doc_ch/config.md).
* Resume training
If the training program is interrupted and you want to load the interrupted model to resume training, you can specify the path of the model to load via `Global.checkpoints`:
```shell
......@@ -88,7 +89,6 @@ python3 tools/train.py -c configs/table/table_mv3.yml -o Global.checkpoints=./yo
**Note**: `Global.checkpoints` has a higher priority than `Global.pretrain_weights`, i.e. when both parameters are specified, the model specified by `Global.checkpoints` is loaded first. If the model path specified by `Global.checkpoints` is wrong, the model specified by `Global.pretrain_weights` is loaded instead.
### 3.3 Evaluation
The table model uses [TEDS (Tree-Edit-Distance-based Similarity)](https://github.com/ibm-aur-nlp/PubTabNet/tree/master/src) as its evaluation metric. Before evaluating the model, the three models in the pipeline need to be exported as inference models (we have already provided them), and the ground truth (gt) for evaluation also needs to be prepared. A gt example is as follows:
......@@ -113,7 +113,6 @@ python3 table/eval_table.py --det_model_dir=path/to/det_model_dir --rec_model_di
```bash
teds: 93.32
```
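For reference, TEDS can also be computed directly on a pair of HTML strings. A sketch assuming the `src` directory of the PubTabNet repository (linked above) is on your `PYTHONPATH` and its dependencies are installed; the class and method names follow that implementation and should be double-checked against it:

```python
# Compare a predicted table against the ground truth; a score of 1.0 means identical trees.
from metric import TEDS    # assumed import from PubTabNet/src

pred_html = "<html><body><table><tr><td>1</td><td>2</td></tr></table></body></html>"
gt_html   = "<html><body><table><tr><td>1</td><td>2</td></tr></table></body></html>"

teds = TEDS(structure_only=False)
print(teds.evaluate(pred_html, gt_html))   # should print 1.0 for this identical pair
```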
### 3.4 Prediction
```python
......