English | [简体中文](train_layoutparser_model_ch.md)
- [Training layout-parse](#training-layout-parse)
  - [1.  Installation](#1--installation)
    - [1.1 Requirements](#11-requirements)
    - [1.2 Install PaddleDetection](#12-install-paddledetection)
  - [2. Data preparation](#2-data-preparation)
  - [3. Configuration](#3-configuration)
  - [4. Training](#4-training)
  - [5. Prediction](#5-prediction)
  - [6. Deployment](#6-deployment)
    - [6.1 Export model](#61-export-model)
    - [6.2 Inference](#62-inference)

# Training layout-parse

## 1.  Installation

### 1.1 Requirements

- PaddlePaddle 2.1
- 64-bit OS
- Python 3 (3.5.1+/3.6/3.7/3.8/3.9), 64-bit
- pip/pip3 (9.0.1+), 64-bit
- CUDA >= 10.1
- cuDNN >= 7.6

### 1.2 Install PaddleDetection

```bash
# Clone the PaddleDetection repository
cd <path/to/clone/PaddleDetection>
git clone https://github.com/PaddlePaddle/PaddleDetection.git

cd PaddleDetection
# Install other dependencies
pip install -r requirements.txt
```

For more installation instructions, see the [installation guide](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/docs/tutorials/INSTALL_cn.md).

## 2. Data preparation

Download the [PubLayNet](https://github.com/ibm-aur-nlp/PubLayNet) dataset:

```bash
cd PaddleDetection/dataset/
mkdir publaynet
# Download PubLayNet (the URL is quoted so the shell does not interpret '?' and '=')
wget -O publaynet.tar.gz "https://dax-cdn.cdn.appdomain.cloud/dax-publaynet/1.0.0/publaynet.tar.gz?_ga=2.104193024.1076900768.1622560733-649911202.1622560733"
# Unpack the archive
tar -xvf publaynet.tar.gz
```

The PubLayNet directory structure after decompression:

| File or Folder | Description                                      | Count   |
| :------------- | :----------------------------------------------- | ------- |
| `train/`       | Images in the training subset                    | 335,703 |
| `val/`         | Images in the validation subset                  | 11,245  |
| `test/`        | Images in the testing subset                     | 11,405  |
| `train.json`   | Annotations for training images                  | 1       |
| `val.json`     | Annotations for validation images                | 1       |
| `LICENSE.txt`  | Plaintext version of the CDLA-Permissive license | 1       |
| `README.txt`   | Text file with the file names and description    | 1       |
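
Before training, it can help to sanity-check an annotation file. The following is a minimal sketch, assuming the PubLayNet annotation files follow the COCO layout (top-level `categories` and `annotations` lists); the function names are hypothetical helpers, not part of PaddleDetection.

```python
import json
from collections import Counter


def count_annotations_per_category(coco):
    """Return {category name: annotation count} for a COCO-style dict."""
    # Map numeric category ids to their readable names.
    id_to_name = {cat["id"]: cat["name"] for cat in coco["categories"]}
    counts = Counter(id_to_name[ann["category_id"]] for ann in coco["annotations"])
    return dict(counts)


def load_and_count(path):
    """Load an annotation file from disk and count annotations per category."""
    with open(path, "r", encoding="utf-8") as f:
        return count_annotations_per_category(json.load(f))


# Example call (path assumed from the download step above):
# print(load_and_count("publaynet/val.json"))
```

`val.json` is a convenient file to start with because it is much smaller than `train.json`, which is loaded fully into memory by this sketch.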

For other datasets, see [PrepareDataSet](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/docs/tutorials/PrepareDataSet.md).

## 3. Configuration

We use the `configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml` configuration for training. The configuration file is as follows:

```yaml
_BASE_: [
  '../datasets/coco_detection.yml',
  '../runtime.yml',
  './_base_/ppyolov2_r50vd_dcn.yml',
  './_base_/optimizer_365e.yml',
  './_base_/ppyolov2_reader.yml',
]

snapshot_epoch: 8
weights: output/ppyolov2_r50vd_dcn_365e_coco/model_final
```

The `ppyolov2_r50vd_dcn_365e_coco.yml` configuration depends on other configuration files. In this case:

- `coco_detection.yml`: sets the paths of the training and validation data.
- `runtime.yml`: describes common parameters, such as whether to use the GPU and how many epochs pass between model saves.
- `optimizer_365e.yml`: sets the learning rate and optimizer configuration.
- `ppyolov2_r50vd_dcn.yml`: describes the model and the network.
- `ppyolov2_reader.yml`: describes the data reader configuration, such as the batch size and the number of concurrent data-loading child processes, as well as the preprocessing applied after loading, such as resizing and data augmentation.


Modify these files as needed, for example the dataset path and batch size.

## 4. Training

PaddleDetection provides single-GPU and multi-GPU training modes to meet different training needs:

* Single-GPU training

```bash
export CUDA_VISIBLE_DEVICES=0  # Not needed on Windows or macOS
python tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml
```

* Multi-GPU training

```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --eval
```

`--eval`: evaluate the model on the validation data during training.

* Resuming training

If training is interrupted for any reason, use the `-r` option to resume it from a saved checkpoint:

```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --eval -r output/ppyolov2_r50vd_dcn_365e_coco/10000
```

Note: If you encounter an "`Out of memory error`", try reducing `batch_size` in the `ppyolov2_reader.yml` file.

## 5. Prediction

Set the parameters and run prediction with PaddleDetection:

```bash
export CUDA_VISIBLE_DEVICES=0
python tools/infer.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --infer_img=images/paper-image.jpg --output_dir=infer_output/ --draw_threshold=0.5 -o weights=output/ppyolov2_r50vd_dcn_365e_coco/model_final --use_vdl=True
```

`--draw_threshold` is an optional parameter. Depending on the [NMS](https://ieeexplore.ieee.org/document/1699659) computation, different thresholds produce different results. `keep_top_k` is the maximum number of output targets; the default value is 10. Set both values to suit your own use case.
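
To make the effect of these two parameters concrete, here is a minimal sketch of score thresholding followed by a top-k cut. It is an illustration only, not PaddleDetection's implementation: the `(label, score, box)` tuple layout and the function name are hypothetical, and whether the comparison is inclusive is an assumption.

```python
def filter_detections(detections, draw_threshold, keep_top_k):
    """Keep at most keep_top_k detections scoring at least draw_threshold.

    detections: list of (label, score, box) tuples (hypothetical layout).
    """
    # Drop low-confidence detections first (inclusive comparison assumed).
    kept = [d for d in detections if d[1] >= draw_threshold]
    # Sort the survivors from highest to lowest score.
    kept.sort(key=lambda d: d[1], reverse=True)
    # Cap the number of results.
    return kept[:keep_top_k]


detections = [
    ("Text", 0.9, (10, 10, 200, 50)),
    ("Title", 0.4, (10, 60, 200, 90)),
    ("Table", 0.7, (10, 100, 200, 300)),
    ("Figure", 0.6, (10, 310, 200, 500)),
]
top = filter_detections(detections, draw_threshold=0.5, keep_top_k=2)
print([label for label, _, _ in top])
```

Raising `draw_threshold` removes more low-scoring boxes, while lowering `keep_top_k` caps how many of the remaining boxes are reported.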

## 6. Deployment

This section describes how to use your trained model in Layout Parser.

### 6.1 Export model

During training, the saved model files contain both the forward prediction and the back propagation processes. Actual industrial deployment does not need back propagation, so the model must be converted into the format required for deployment. PaddleDetection provides the `tools/export_model.py` script to export the model.

The exported model files are named `model.*` by default, but Layout Parser's code expects `inference.*`. Therefore, edit [`PaddleDetection/ppdet/engine/trainer.py`](https://github.com/PaddlePaddle/PaddleDetection/blob/b87a1ea86fa18ce69e44a17ad1b49c1326f19ff9/ppdet/engine/trainer.py#L512) (follow the link to see the exact line of code) and change `'model'` to `'inference'`.

Run the script to export the model:

```bash
python tools/export_model.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --output_dir=./inference -o weights=output/ppyolov2_r50vd_dcn_365e_coco/model_final.pdparams
```

The prediction model is exported to `inference/ppyolov2_r50vd_dcn_365e_coco` and includes `infer_cfg.yml` (not required for prediction), `inference.pdiparams`, `inference.pdiparams.info`, and `inference.pdmodel`.

For more model export tutorials, see [EXPORT_MODEL](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/deploy/EXPORT_MODEL.md).

### 6.2 Inference

`model_path` is the path to the exported model. Use layoutparser to load it for prediction:

```python
import layoutparser as lp

# Load the exported model; label_map maps class ids to layout category names.
model = lp.PaddleDetectionLayoutModel(
    model_path="inference/ppyolov2_r50vd_dcn_365e_coco", threshold=0.5,
    label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"},
    enforce_cpu=True, enable_mkldnn=True)
```
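
Once the model is loaded, a typical next step is to run it on a page image and inspect the result. The sketch below assumes layoutparser's `model.detect(image)` call and OpenCV for image loading; the helper names, the image path, and the BGR to RGB conversion step are illustrative assumptions rather than requirements stated in this guide.

```python
from collections import Counter


def detect_layout(image_path, model):
    """Run a loaded layout model on one image and return the detected layout."""
    import cv2  # imported lazily so count_block_types works without OpenCV

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    image = image[..., ::-1]  # OpenCV loads BGR; convert to RGB (assumed input order)
    return model.detect(image)


def count_block_types(layout):
    """Count detected blocks per type, e.g. {"Text": 12, "Table": 1}."""
    return dict(Counter(block.type for block in layout))


# Example usage (hypothetical image path; `model` is the object created above):
# layout = detect_layout("images/paper-image.jpg", model)
# print(count_block_types(layout))
```

Counting block types is a quick way to confirm that the `label_map` matches the categories the model was trained on.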

***

For more PaddleDetection training tutorials, see [PaddleDetection Training](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/docs/tutorials/GETTING_STARTED_cn.md).

***