From bd2370f1b9e127a804ccaa70c8b0320023da7f15 Mon Sep 17 00:00:00 2001
From: LDOUBLEV
Date: Mon, 21 Sep 2020 20:59:15 +0800
Subject: [PATCH] fix ocr slim doc

---
 deploy/slim/prune/README.md           |   4 +-
 deploy/slim/prune/README_en.md        | 181 +++++---------------------
 deploy/slim/quantization/README_en.md | 170 ++++++------------------
 3 files changed, 75 insertions(+), 280 deletions(-)

diff --git a/deploy/slim/prune/README.md b/deploy/slim/prune/README.md
index 8ec5492c..9a73463a 100644
--- a/deploy/slim/prune/README.md
+++ b/deploy/slim/prune/README.md
@@ -3,9 +3,9 @@
 A complex model helps to improve performance, but it also introduces some redundancy. Model pruning removes redundant sub-networks from the model to reduce this redundancy, lowering computational complexity and improving inference performance.
 
-This tutorial introduces how to use PaddleSlim to quantize PaddleOCR models.
+This tutorial introduces how to use PaddleSlim to prune PaddleOCR models.
 
-Before starting this tutorial, it is recommended to first read
+Before starting this tutorial, it is recommended to first read:
 1. [How to train PaddleOCR models](../../../doc/doc_ch/quickstart.md)
 2. [Tutorial on pruning classification models](https://paddlepaddle.github.io/PaddleSlim/tutorials/pruning_tutorial/)
 3. [PaddleSlim pruning and compression API](https://paddlepaddle.github.io/PaddleSlim/api/prune_api/)
diff --git a/deploy/slim/prune/README_en.md b/deploy/slim/prune/README_en.md
index d854c107..7a93dce5 100644
--- a/deploy/slim/prune/README_en.md
+++ b/deploy/slim/prune/README_en.md
@@ -1,150 +1,40 @@
-> PaddleSlim develop version should be installed before running this example.
-
-# Model compress tutorial (Pruning)
-
-Compress results:
-
-| ID | Task | Model | Compress Strategy[3][4] | Criterion (Chinese dataset) | Inference Time[1] (ms) | Inference Time (Total model)[2] (ms) | Acceleration Ratio | Model Size (MB) | Compress Ratio | Download Link |
-| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
-| 0 | Detection | MobileNetV3_DB | None | 61.7 | 224 | 375 | - | 8.6 | - | |
-|   | Recognition | MobileNetV3_CRNN | None | 62.0 | 9.52 | | | | | |
-| 1 | Detection | SlimTextDet | PACT Quant Aware Training | 62.1 | 195 | 348 | 8% | 2.8 | 67.82% | |
-|   | Recognition | SlimTextRec | PACT Quant Aware Training | 61.48 | 8.6 | | | | | |
-| 2 | Detection | SlimTextDet_quat_pruning | Pruning+PACT Quant Aware Training | 60.86 | 142 | 288 | 30% | 2.8 | 67.82% | |
-|   | Recognition | SlimTextRecP | PACT Quant Aware Training | 61.48 | 8.6 | | | | | |
-| 3 | Detection | SlimTextDet_pruning | Pruning | 61.57 | 138 | 295 | 27% | 2.9 | 66.28% | |
-|   | Recognition | SlimTextRec | PACT Quant Aware Training | 61.48 | 8.6 | | | | | |
-
-## Overview
-
-Generally, a more complex model would achieve better performance in the task, but it also leads to some redundancy in the model. Model pruning is a technique that reduces this redundancy by removing sub-models from the neural network, so as to reduce model calculation complexity and improve model inference performance.
-
-This example uses PaddleSlim provided [APIs of Pruning](https://paddlepaddle.github.io/PaddleSlim/api/prune_api/) to compress the OCR model.
-
-It is recommended that you understand the following pages before reading this example:
-
-- [The training strategy of OCR model](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/detection.md)
-- [PaddleSlim Document](https://paddlepaddle.github.io/PaddleSlim/)
-
-## Install PaddleSlim
-
-```bash
-git clone https://github.com/PaddlePaddle/PaddleSlim.git
-cd Paddleslim
-python setup.py install
-```
+## Introduction
+
+A complicated model helps to improve performance, but it also introduces some redundancy. Model pruning reduces this redundancy by removing sub-networks from the model, so as to reduce computational complexity and improve inference performance.
+
+This tutorial introduces how to use PaddleSlim to prune a PaddleOCR model.
+
+It is recommended that you read the following pages before this tutorial:
+1. [PaddleOCR training methods](../../../doc/doc_ch/quickstart.md)
+2. [The pruning demo](https://paddlepaddle.github.io/PaddleSlim/tutorials/pruning_tutorial/)
+3. [PaddleSlim prune API](https://paddlepaddle.github.io/PaddleSlim/api/prune_api/)
+
+## Quick start
+
+Pruning an OCR model takes five steps:
+1. Install PaddleSlim
+2. Prepare the trained model
+3. Run sensitivity analysis
+4. Prune the model and fine-tune it
+5. Export the inference model and deploy it
+
+### 1. Install PaddleSlim
+
+```bash
+git clone https://github.com/PaddlePaddle/PaddleSlim.git
+cd PaddleSlim
+python setup.py install
+```
+
+### 2. Download the pretrained model
+
+Model pruning needs to load a pre-trained model.
+PaddleOCR provides a series of [models](../../../doc/doc_en/models_list_en.md); developers can pick one of them or use their own trained model, according to their needs.
+
+### 3. Pruning sensitivity analysis
 
 After the pre-trained model is loaded, sensitivity analysis is performed on each layer of the network to measure its redundancy, which in turn determines the pruning ratio of each layer.
 For specific details of sensitivity analysis, see: [Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/image_classification_sensitivity_analysis_tutorial.md)
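+For reference, the sketch below shows roughly what such an analysis looks like with PaddleSlim's static-graph API. It is a minimal illustration under assumed names, not the actual code of `sensitivity_anal.py`: the program, the evaluation function, the parameter names, and the pruning ratios are all placeholders.
+
+```python
+# Minimal sketch of PaddleSlim (1.x, static graph) sensitivity analysis.
+# eval_program, the parameter names, and eval_func are illustrative placeholders.
+import paddle.fluid as fluid
+from paddleslim.prune import sensitivity
+
+def eval_func(program):
+    # Placeholder: evaluate the detection model (e.g. hmean on a val set)
+    # and return the metric as a float.
+    return 0.0
+
+place = fluid.CUDAPlace(0)
+
+# Try pruning each listed conv parameter by each ratio and record how much
+# the evaluation metric drops; results accumulate in the sensitivities file.
+sensitivity(
+    eval_program,
+    place,
+    ["conv10_expand_weights", "conv11_linear_weights"],
+    eval_func,
+    sensitivities_file="sensitivities_0.data",
+    pruned_ratios=[0.1, 0.2, 0.3, 0.4],
+)
+```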
@@ -158,7 +48,7 @@ python deploy/slim/prune/sensitivity_anal.py -c configs/det/det_mv3_db.yml -o Gl
 
-## Model pruning and Fine-tune
+### 4. Model pruning and Fine-tune
 
 During pruning, the sensitivity file produced in the previous step determines the pruning ratio of each network layer. In the implementation, in order to retain as many of the low-level features extracted from the image as possible, we skip the 4 convolutional layers closest to the input in the backbone.
 Similarly, in order to reduce the performance loss caused by pruning, we use the sensitivity table obtained from the previous analysis to select the less redundant and more sensitive [network layers](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/deploy/slim/prune/pruning_and_finetune.py#L41), and skip these layers in the subsequent pruning process. After pruning, the model needs a fine-tuning process to recover its performance, and the fine-tuning strategy is similar to the one used to train the original OCR detection model.
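+As a rough sketch of this step (assuming PaddleSlim's static-graph pruning API), the sensitivity file can be turned into per-layer pruning ratios as follows. The file name and the 2% acceptable-loss threshold are illustrative assumptions, not the values hard-coded in `pruning_and_finetune.py`:
+
+```python
+# Hypothetical sketch: derive per-layer pruning ratios from a sensitivity file
+# and prune the training program. Names and thresholds are illustrative.
+import paddle.fluid as fluid
+from paddleslim.prune import Pruner, load_sensitivities, get_ratios_by_loss
+
+sens = load_sensitivities("sensitivities_0.data")
+
+# For every analyzed layer, pick the largest pruning ratio whose estimated
+# metric loss stays below 2%; layers chosen to be skipped are simply left out.
+ratios = get_ratios_by_loss(sens, 0.02)
+
+pruner = Pruner()
+pruned_program, _, _ = pruner.prune(
+    train_program,                 # the fluid.Program being trained
+    fluid.global_scope(),
+    params=list(ratios.keys()),
+    ratios=list(ratios.values()),
+    place=fluid.CUDAPlace(0),
+)
+```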
@@ -169,15 +59,14 @@ python deploy/slim/prune/pruning_and_finetune.py -c configs/det/det_mv3_db.yml -
 ```
 
-## Export inference model
-
-After getting the pruned and fine-tuned model, we can export it as inference_model for predictive deployment:
-
+### 5. Export inference model and deploy it
+
+We can export the pruned model as inference_model for deployment:
 
 ```bash
 python deploy/slim/prune/export_prune_model.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./output/det_db/best_accuracy Global.test_batch_size_per_card=1 Global.save_inference_dir=inference_model
 ```
+
+For prediction and deployment of the inference model, refer to:
+1. [Python prediction with the inference model](../../../doc/doc_en/inference_en.md)
+2. [C++ prediction with the inference model](../../cpp_infer/readme_en.md)
+3. [Deploying the inference model on mobile](../../lite/readme_en.md)
diff --git a/deploy/slim/quantization/README_en.md b/deploy/slim/quantization/README_en.md
index 4b8a2b23..e565f13f 100755
--- a/deploy/slim/quantization/README_en.md
+++ b/deploy/slim/quantization/README_en.md
@@ -1,133 +1,30 @@
-> PaddleSlim 1.2.0 or higher version should be installed before running this example.
-
 # Model compress tutorial (Quantization)
 
-Compress results:
-
-| ID | Task | Model | Compress Strategy | Criterion (Chinese dataset) | Inference Time (ms) | Inference Time (Total model) (ms) | Acceleration Ratio | Model Size (MB) | Compress Ratio | Download Link |
-| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
-| 0 | Detection | MobileNetV3_DB | None | 61.7 | 224 | 375 | - | 8.6 | - | |
-|   | Recognition | MobileNetV3_CRNN | None | 62.0 | 9.52 | | | | | |
-| 1 | Detection | SlimTextDet | PACT Quant Aware Training | 62.1 | 195 | 348 | 8% | 2.8 | 67.82% | |
-|   | Recognition | SlimTextRec | PACT Quant Aware Training | 61.48 | 8.6 | | | | | |
-| 2 | Detection | SlimTextDet_quat_pruning | Pruning+PACT Quant Aware Training | 60.86 | 142 | 288 | 30% | 2.8 | 67.82% | |
-|   | Recognition | SlimTextRecP | PACT Quant Aware Training | 61.48 | 8.6 | | | | | |
-| 3 | Detection | SlimTextDet_pruning | Pruning | 61.57 | 138 | 295 | 27% | 2.9 | 66.28% | |
-|   | Recognition | SlimTextRec | PACT Quant Aware Training | 61.48 | 8.6 | | | | | |
-
-## Overview
-
-Generally, a more complex model would achieve better performance in the task, but it also leads to some redundancy in the model. Quantization is a technique that reduces this redundancy by reducing the full precision data to a fixed number, so as to reduce model calculation complexity and improve model inference performance.
-
-This example uses PaddleSlim provided [APIs of Quantization](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/) to compress the OCR model.
-
-It is recommended that you understand the following pages before reading this example:
-
-- [The training strategy of OCR model](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/detection.md)
+## Introduction
+
+Generally, a more complex model achieves better performance on the task, but it also carries some redundancy. Quantization is a technique that reduces this redundancy by mapping full-precision values onto a small set of fixed-point numbers, so as to reduce model calculation complexity and improve model inference performance.
+
+This example uses the [quantization APIs](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/) provided by PaddleSlim to compress the OCR model.
+
+It is recommended that you read the following pages before this example:
+- [The training strategy of OCR model](../../../doc/doc_en/quickstart_en.md)
 - [PaddleSlim Document](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/)
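+To make the idea concrete, here is a minimal numpy sketch of symmetric abs-max quantization, the basic scheme behind the int8 arithmetic discussed in this tutorial. The tensor values are made up for illustration:
+
+```python
+# Minimal sketch of symmetric abs-max quantization: map FP32 values onto the
+# int8 range and back. The input values are illustrative, not from a real model.
+import numpy as np
+
+x = np.array([0.82, -0.31, 0.07, -1.24], dtype=np.float32)
+
+scale = np.abs(x).max() / 127.0          # one scale for the whole tensor
+q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)   # int8 codes
+x_hat = q.astype(np.float32) * scale     # dequantized approximation of x
+
+print(q)      # [  84  -32    7 -127]
+print(x_hat)  # close to x, up to the quantization error
+```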
 
+## Quick Start
+
+Quantization is mostly suitable for deploying lightweight models on mobile terminals.
+After training, if you want to further compress the model size and accelerate prediction, you can quantize the model with the following steps:
+
+1. Install PaddleSlim
+2. Prepare the trained model
+3. Quantization-Aware Training
+4. Export the inference model
+5. Deploy the quantized inference model
+
-## Install PaddleSlim
+### 1. Install PaddleSlim
 
 ```bash
 git clone https://github.com/PaddlePaddle/PaddleSlim.git
@@ -139,29 +36,38 @@ python setup.py install
 ```
 
-## Download Pretrain Model
-
-[Download link of Detection pretrain model]()
-[Download link of recognition pretrain model]()
+### 2. Download Pretrain Model
+
+PaddleOCR provides a series of trained [models](../../../doc/doc_en/models_list_en.md).
+If the model to be quantized is not in the list, you need to follow the [regular training](../../../doc/doc_en/quickstart_en.md) method to get a trained model first.
 
-## Quant-Aware Training
-
-After loading the pre-trained model, the model can be quantized once the quantization strategy is defined.
-For specific details of the quantization method, see: [Model Quantization](https://paddleslim.readthedocs.io/zh_CN/latest/api_cn/quantization_api.html)
-
-Enter the PaddleOCR root directory, perform model quantization with the following command:
-
-```bash
-python deploy/slim/prune/sensitivity_anal.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./deploy/slim/prune/pretrain_models/det_mv3_db/best_accuracy Global.test_batch_size_per_card=1
+### 3. Quant-Aware Training
+
+Quantization training includes offline quantization training and online quantization training; online quantization training is more effective. It requires loading a pre-trained model, and the model can be quantized once the quantization strategy is defined.
+
+The quantization training code is located in `deploy/slim/quantization/quant.py`. For example, to train a quantized detection model, the training command is as follows:
+
+```bash
+python deploy/slim/quantization/quant.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights='your trained model' Global.save_model_dir=./output/quant_model
+
+# or download the provided model and start from it
+wget https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_train.tar
+tar xf ch_ppocr_mobile_v1.1_det_train.tar
+python deploy/slim/quantization/quant.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./ch_ppocr_mobile_v1.1_det_train/best_accuracy Global.save_model_dir=./output/quant_model
 ```
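+Under the hood, `quant.py` relies on PaddleSlim's quantization-aware training API. The sketch below shows the rough shape of that wiring with the PaddleSlim 1.x static-graph API; the programs and config values are common QAT defaults and are assumptions here, not the script's exact settings:
+
+```python
+# Hypothetical sketch of quantization-aware training with PaddleSlim 1.x.
+# The config mirrors common defaults; it is not quant.py's exact setup.
+import paddle.fluid as fluid
+from paddleslim.quant import quant_aware, convert
+
+quant_config = {
+    "weight_quantize_type": "channel_wise_abs_max",        # per-channel weight scales
+    "activation_quantize_type": "moving_average_abs_max",  # running abs-max for activations
+    "weight_bits": 8,
+    "activation_bits": 8,
+    "quantize_op_types": ["conv2d", "depthwise_conv2d", "mul"],
+}
+
+place = fluid.CUDAPlace(0)
+
+# Insert fake-quant/dequant ops so training learns int8-friendly parameters.
+quant_train_program = quant_aware(train_program, place, quant_config, for_test=False)
+quant_eval_program = quant_aware(eval_program, place, quant_config, for_test=True)
+
+# ... train quant_train_program as usual, then freeze the eval program
+# into a quantized inference program:
+inference_program = convert(quant_eval_program, place, quant_config)
+```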
 
-## Export inference model
+### 4. Export inference model
 
 After getting the quantized model, we can export it as inference_model for predictive deployment:
 
 ```bash
 python deploy/slim/quantization/export_model.py -c configs/det/det_mv3_db.yml -o Global.checkpoints=output/quant_model/best_accuracy Global.save_model_dir=./output/quant_inference_model
 ```
+
+### 5. Deploy
+
+The model parameters exported in the above steps are still stored as FP32, but their values fall within the int8 range.
+The exported model can be converted with the `opt` tool of PaddleLite.
+
+For quantized model deployment, please refer to [Mobile model deployment](../lite/readme_en.md)
-- 
GitLab