# PP-OCR Models Quantization

Generally, a more complex model achieves better performance on a task, but it also introduces some redundancy into the model.
Quantization is a technique that reduces this redundancy by converting full-precision parameters to lower-precision fixed-point numbers,
so as to reduce model computation complexity and improve model inference performance.

This example uses PaddleSlim provided [APIs of Quantization](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/api_cn/dygraph/quanter/qat.rst) to compress the OCR model.

It is recommended that you understand the following pages before reading this example:
- [The training strategy of OCR model](../../../doc/doc_en/quickstart_en.md)
- [PaddleSlim Document](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/api_cn/dygraph/quanter/qat.rst)

## Quick Start
Quantization is mostly suitable for deploying lightweight models on mobile devices.
After training, if you want to further compress the model size and accelerate inference, you can quantize the model by following the steps below.

1. Install PaddleSlim
2. Prepare trained model
3. Quantization-Aware Training
4. Export inference model
5. Deploy quantization inference model


### 1. Install PaddleSlim

```bash
pip3 install paddleslim==2.2.2
```
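
As a quick sanity check (optional), you can confirm that the package imports correctly:

```bash
# should exit silently if PaddleSlim is installed properly
python3 -c "import paddleslim"
```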


### 2. Download Pre-trained Model
PaddleOCR provides a series of pre-trained [models](../../../doc/doc_en/models_list_en.md).
If the model to be quantized is not in the list, you need to follow the [Regular Training](../../../doc/doc_en/quickstart_en.md) method to obtain a trained model.
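
For example, the mobile detection pre-trained model used in step 3 below can be downloaded and unpacked as follows; the `best_accuracy` checkpoint inside the unpacked directory is what gets passed to `Global.pretrained_model`:

```bash
# download and unpack the PP-OCR mobile detection pre-trained model
wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar
tar -xf ch_ppocr_mobile_v2.0_det_train.tar
```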


### 3. Quant-Aware Training
Quantization training includes offline quantization training and online quantization training.
Online quantization training is more effective; it requires loading a pre-trained model.
After the quantization strategy is defined, the model can be quantized.

The code for quantization training is located in `slim/quantization/quant.py`. For example, to quantize a detection model, the training command is as follows:
```bash
python deploy/slim/quantization/quant.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o Global.pretrained_model='your trained model'   Global.save_model_dir=./output/quant_model

# download provided model
wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar
tar -xf ch_ppocr_mobile_v2.0_det_train.tar
python deploy/slim/quantization/quant.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o Global.pretrained_model=./ch_ppocr_mobile_v2.0_det_train/best_accuracy   Global.save_model_dir=./output/quant_model
```


Model distillation and model quantization can be used at the same time. Taking the PP-OCRv3 detection model as an example:
```bash
# download provided model
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_distill_train.tar
tar xf ch_PP-OCRv3_det_distill_train.tar

python deploy/slim/quantization/quant.py -c configs/det/ch_PP-OCRv3_det/ch_PP-OCRv3_det_cml.yml -o Global.pretrained_model='./ch_PP-OCRv3_det_distill_train/best_accuracy'   Global.save_model_dir=./output/quant_model_distill/
```

If you want to quantize a text recognition model, modify the configuration file and the loaded model parameters accordingly.
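
For instance, a minimal sketch of a recognition-model command; the config path and checkpoint below are illustrative placeholders, so substitute the recognition config and pre-trained model you actually use:

```bash
# quantization-aware training for a recognition model (paths are illustrative)
python deploy/slim/quantization/quant.py -c configs/rec/PP-OCRv3/ch_PP-OCRv3_rec_distillation.yml -o Global.pretrained_model='your trained rec model'   Global.save_model_dir=./output/quant_rec_model
```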

### 4. Export inference model

Once we have the model after quantization-aware training and fine-tuning, we can export it as an inference model for deployment:

```bash
python deploy/slim/quantization/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o Global.checkpoints=output/quant_model/best_accuracy Global.save_inference_dir=./output/quant_inference_model
```
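
As an optional sanity check before deployment (a sketch assuming PaddleOCR's standard prediction script `tools/infer/predict_det.py` and the example images under `doc/imgs/`), the exported quantized detection model can be run with the regular Python predictor:

```bash
# run text detection with the exported quantized inference model
python tools/infer/predict_det.py --det_model_dir=./output/quant_inference_model --image_dir=./doc/imgs/
```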

### 5. Deploy
The parameters of the quantized model exported in the above steps are still stored as FP32, but their numerical range has been quantized to int8.
The exported model can be converted with PaddleLite's `opt` tool.
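
For reference, a minimal conversion sketch, assuming `paddlelite` is installed via pip and the inference model files exported in step 4 are named `inference.pdmodel` / `inference.pdiparams`:

```bash
# install Paddle Lite, which provides the paddle_lite_opt tool
pip3 install paddlelite

# convert the quantized inference model into a Paddle Lite model (.nb file)
paddle_lite_opt --model_file=./output/quant_inference_model/inference.pdmodel \
                --param_file=./output/quant_inference_model/inference.pdiparams \
                --optimize_out=./output/quant_det_lite \
                --valid_targets=arm
```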

For quantized model deployment, please refer to [Mobile terminal model deployment](../../lite/readme_en.md).