- [6 Model export and inference](#6-Model-export-and-inference)
- [6.2 Test layout analysis results](#62-Test-layout-analysis-results)
- [6.1 Model export](#61-Model-export)
- [7 Model export and inference](#7-Model-export-and-inference)
- [6.2 Model inference](#62-Model-inference)
- [7.1 Model export](#71-Model-export)
- [7.2 Model inference](#72-Model-inference)
## 1. Introduction
...
@@ -28,11 +29,12 @@ Layout analysis refers to the regional division of documents in the form of pict
<img src="../docs/layout/layout.png" width="800">
</div>
</div>
## 2. Quick start
PP-Structure currently provides layout analysis models for Chinese, English, and table documents. For the model links, see [models_list](../docs/models_list_en.md). A whl package is also provided for quick use; see [quickstart](../docs/quickstart_en.md) for details.
For more requirements, please refer to the [installation instructions](https://www.paddlepaddle.org.cn/install/quick).
### 2.2. Install PaddleDetection
### 3.2. Install PaddleDetection
- **(1) Download PaddleDetection source code**
...
@@ -62,11 +64,11 @@ cd PaddleDetection
python3 -m pip install -r requirements.txt
```
```
## 3. Data preparation
## 4. Data preparation
If you want to try the prediction process directly, you can skip data preparation and download the pre-trained model.
### 3.1. English data set
### 4.1. English data set
Download the document analysis dataset [PubLayNet](https://developer.ibm.com/exchanges/data/all/publaynet/) (dataset size: 96 GB), which contains 5 classes: `{0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"}`
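As a quick sanity check on a downloaded annotation file, the class map above can be used to count regions per class. The following is a minimal sketch only: it assumes each annotation is a dict carrying a `category_id` key that uses the ids listed above, and `count_regions` and the sample data are hypothetical names introduced for illustration.

```python
from collections import Counter

# Class map as listed above for PubLayNet.
CLASS_NAMES = {0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"}


def count_regions(annotations):
    """Count annotated regions per class name.

    `annotations` is assumed to be a list of dicts that each carry a
    `category_id` key (an assumption of this sketch, not confirmed above).
    Ids missing from CLASS_NAMES are grouped under "Unknown".
    """
    counts = Counter()
    for ann in annotations:
        name = CLASS_NAMES.get(ann["category_id"], "Unknown")
        counts[name] += 1
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical sample: two text regions and one table region.
    sample = [{"category_id": 0}, {"category_id": 3}, {"category_id": 0}]
    print(count_regions(sample))
```

If the ids in your annotation file start at a different offset, adjust `CLASS_NAMES` accordingly before relying on the counts.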
...
@@ -141,7 +143,7 @@ The JSON file contains the annotations of all images, and the data is stored in
}
}
```
```
### 3.2. More datasets
### 4.2. More datasets
We provide download links for other datasets such as CDLA (Chinese layout analysis) and TableBank (table layout analysis). Once their annotations are converted to the JSON format of the annotation file described above, training can be conducted in the same way.
...
@@ -154,7 +156,7 @@ We provide CDLA(Chinese layout analysis), TableBank(Table layout analysis)etc. d
If the test image is a Chinese document, download the model pre-trained on the Chinese CDLA dataset, which identifies 10 types of document regions: Text, Title, Figure, Figure caption, Table, Table caption, Header, Footer, Reference, and Equation. The training model and inference model are listed as `picodet_lcnet_x1_0_fgd_layout_cdla` in the [layout analysis model](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/docs/models_list.md) list. If you only need to detect table regions in the image, download the model pre-trained on the table dataset; its training model and inference model are listed as `picodet_lcnet_x1_0_fgd_layout_table` in the [layout analysis model](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/docs/models_list.md) list.
### 4.1. Train
### 5.1. Train
Train:
...
@@ -247,7 +249,7 @@ After starting training normally, you will see the following log output:
**Note that the configuration file used for prediction or evaluation must be the same as the one used for training.**
### 4.2. FGD Distillation Training
### 5.2. FGD Distillation Training
PaddleDetection supports training object detection models with FGD ([Focal and Global Knowledge Distillation for Detectors](https://arxiv.org/abs/2111.11837v1)) distillation. FGD distillation is divided into two parts, `Focal` and `Global`. `Focal` distillation separates the foreground and background of the image, allowing the student model to focus on the key pixels of the teacher model's foreground and background features respectively. `Global` distillation reconstructs the relationships between different pixels and transfers them from the teacher to the student, compensating for the global information lost in `Focal` distillation.
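To make the foreground/background separation in `Focal` distillation concrete, here is a deliberately simplified sketch in plain Python. It is not the PaddleDetection implementation: it only illustrates splitting a feature map by a box-derived mask and weighting the two error terms separately, and it omits the attention weighting and the `Global` term described in the paper. The function names, the weights, and the 2x2 example are hypothetical.

```python
def foreground_mask(height, width, boxes):
    """Return a height x width grid with 1 inside any box and 0 elsewhere.

    Boxes are (x1, y1, x2, y2) in feature-map cells, end-exclusive.
    """
    mask = [[0] * width for _ in range(height)]
    for x1, y1, x2, y2 in boxes:
        for y in range(max(0, y1), min(height, y2)):
            for x in range(max(0, x1), min(width, x2)):
                mask[y][x] = 1
    return mask


def focal_style_loss(teacher, student, mask, fg_weight=1.0, bg_weight=0.5):
    """Weighted mean squared error, computed separately for fg and bg cells."""
    fg_err = bg_err = 0.0
    fg_n = bg_n = 0
    for t_row, s_row, m_row in zip(teacher, student, mask):
        for t, s, m in zip(t_row, s_row, m_row):
            sq = (t - s) ** 2
            if m:
                fg_err += sq
                fg_n += 1
            else:
                bg_err += sq
                bg_n += 1
    # Normalise each term by its own cell count so neither region dominates
    # just because it covers more of the map.
    fg_term = fg_err / fg_n if fg_n else 0.0
    bg_term = bg_err / bg_n if bg_n else 0.0
    return fg_weight * fg_term + bg_weight * bg_term


if __name__ == "__main__":
    teacher = [[1.0, 1.0], [1.0, 1.0]]
    student = [[0.0, 1.0], [1.0, 3.0]]
    mask = foreground_mask(2, 2, [(0, 0, 1, 1)])
    print(focal_style_loss(teacher, student, mask))
```

Normalising the two regions separately is the design point worth noting: without it, a large background area would swamp the signal from small foreground objects.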
...
@@ -265,9 +267,9 @@ python3 tools/train.py \
- `-c`: Specify the model configuration file.
- `--slim_config`: Specify the compression strategy configuration file.
## 5. Model evaluation and prediction
## 6. Model evaluation and prediction
### 5.1. Indicator evaluation
### 6.1. Indicator evaluation
During training, model parameters are saved by default in the `output/picodet_lcnet_x1_0_layout` directory. When evaluating indicators, set `weights` to point to the saved parameter file. The evaluation dataset is configured in `configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml`: modify the `img_dir`, `anno_path`, and `dataset_dir` settings under `EvalDataset`.
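Before launching evaluation, it can help to check that the three `EvalDataset` settings are present and point where you expect. The sketch below is an illustration only: it assumes `img_dir` and `anno_path` are interpreted relative to `dataset_dir`, and the helper name `resolve_eval_paths` and the example paths are hypothetical.

```python
import posixpath

REQUIRED_KEYS = ("dataset_dir", "img_dir", "anno_path")


def resolve_eval_paths(eval_dataset):
    """Join img_dir and anno_path onto dataset_dir.

    Assumes (for this sketch) that both settings are relative to
    dataset_dir. Raises KeyError naming any missing setting.
    """
    missing = [key for key in REQUIRED_KEYS if key not in eval_dataset]
    if missing:
        raise KeyError("missing EvalDataset settings: " + ", ".join(missing))
    dataset_dir = eval_dataset["dataset_dir"]
    return {
        "img_dir": posixpath.join(dataset_dir, eval_dataset["img_dir"]),
        "anno_path": posixpath.join(dataset_dir, eval_dataset["anno_path"]),
    }


if __name__ == "__main__":
    # Hypothetical values; substitute the ones from your own config file.
    settings = {
        "dataset_dir": "dataset/publaynet",
        "img_dir": "val",
        "anno_path": "val.json",
    }
    print(resolve_eval_paths(settings))
```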
...
@@ -310,7 +312,7 @@ python3 tools/eval.py \
- `--slim_config`: Specify the distillation strategy configuration file.
- `-o weights`: Specify the path of the model trained with the distillation algorithm.
### 5.2. Test Layout Analysis Results
### 6.2. Test Layout Analysis Results
The configuration file used for prediction must be the same as the one used for training. For example, this applies if you completed the training process for the model with `python3 tools/train.py -c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml`.
...
@@ -343,10 +345,10 @@ python3 tools/infer.py \
```
```
## 6. Model Export and Inference
## 7. Model Export and Inference
### 6.1 Model Export
### 7.1 Model Export
The inference model (the model saved by `paddle.jit.save`) is a frozen model saved after training is complete, and is mostly used for prediction in deployment.
To run inference with the provided inference model or with a model trained by FGD distillation, set `model_dir` to the inference model path and execute the following commands:
PaddleDetection supports training object detection models with distillation based on FGD ([Focal and Global Knowledge Distillation for Detectors](https://arxiv.org/abs/2111.11837v1)). FGD distillation is divided into two parts, `Focal` and `Global`. `Focal` distillation separates the foreground and background of the image, letting the student model focus on the key pixels of the teacher model's foreground and background features respectively. `Global` distillation reconstructs the relationships between different pixels and transfers them from the teacher to the student, compensating for the global information lost in `Focal` distillation.
For more requirements, please refer to the instructions in [Installation Documentation](https://www.paddlepaddle.org.cn/en/install/quick?docurl=/documentation/docs/en/install/pip/macos-pip_en.html).
...
@@ -85,6 +83,8 @@ Through layout analysis, we divided the image/PDF documents into regions, locate
We can restore the test image from the layout information, the OCR detection and recognition results, the table information, and the saved images.
The whl package is also provided for quick use, see [quickstart](../docs/quickstart_en.md) for details.
<a name="3.1"></a>
### 3.1 Download models
...
@@ -151,10 +151,10 @@ Field:
## 4. More
For training, evaluation and inference tutorial for text detection models, please refer to [text detection doc](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_ch/detection.md).
For training, evaluation and inference tutorial for text detection models, please refer to [text detection doc](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_en/detection_en.md).
For training, evaluation and inference tutorial for text recognition models, please refer to [text recognition doc](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_ch/recognition.md).
For training, evaluation and inference tutorial for text recognition models, please refer to [text recognition doc](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_en/recognition_en.md).
For training, evaluation and inference tutorial for layout analysis models, please refer to [layout analysis doc](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/layout/README_ch.md)
For training, evaluation and inference tutorial for layout analysis models, please refer to [layout analysis doc](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/layout/README.md)
For training, evaluation and inference tutorial for table recognition models, please refer to [table recognition doc](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/table/README_ch.md)
For training, evaluation and inference tutorial for table recognition models, please refer to [table recognition doc](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/table/README.md)