> PaddleSlim 1.2.0 or a higher version should be installed before running this example.
# Model compression tutorial (Quantization)
Compress results:

| ID | Task | Model | Compress Strategy | Criterion (Chinese dataset) | Inference Time (ms) | Inference Time (Total model) (ms) | Acceleration Ratio | Model Size (MB) | Compress Ratio | Download Link |
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| 0 | Detection | MobileNetV3_DB | None | 61.7 | 224 | 375 | - | 8.6 | - | |
|   | Recognition | MobileNetV3_CRNN | None | 62.0 | 9.52 | | | | | |
| 1 | Detection | SlimTextDet | PACT Quant Aware Training | 62.1 | 195 | 348 | 8% | 2.8 | 67.82% | |
|   | Recognition | SlimTextRec | PACT Quant Aware Training | 61.48 | 8.6 | | | | | |
| 2 | Detection | SlimTextDet_quat_pruning | Pruning + PACT Quant Aware Training | 60.86 | 142 | 288 | 30% | 2.8 | 67.82% | |
|   | Recognition | SlimTextRec | PACT Quant Aware Training | 61.48 | 8.6 | | | | | |
| 3 | Detection | SlimTextDet_pruning | Pruning | 61.57 | 138 | 295 | 27% | 2.9 | 66.28% | |
|   | Recognition | SlimTextRec | PACT Quant Aware Training | 61.48 | 8.6 | | | | | |
## Overview
Generally, a more complex model achieves better performance on a task, but it also introduces redundancy. Quantization reduces this redundancy by representing full-precision weights and activations with low-bit fixed-point numbers, which lowers the computational cost of the model and improves inference performance.
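As a toy illustration of the idea (plain NumPy, not PaddleSlim code): 8-bit linear quantization maps each float value to an integer in [-127, 127] using a per-tensor scale, and computation then dequantizes with that scale:

```python
# Toy example of 8-bit abs-max linear quantization (illustration only).
import numpy as np

def fake_quant_int8(x):
    scale = np.abs(x).max() / 127.0                   # per-tensor abs-max scale
    q = np.clip(np.round(x / scale), -127, 127)       # low-bit integer codes
    return q.astype(np.int8), scale

weights = np.random.randn(4).astype(np.float32)
q, scale = fake_quant_int8(weights)
print(weights)
print(q * scale)  # dequantized values approximate the originals
```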
This example uses the [quantization APIs](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/) provided by PaddleSlim to compress the OCR model.
It is recommended that you understand the following pages before reading this example:
- [The training strategy of OCR model](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/detection.md)
- [PaddleSlim Document](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/)
## Install PaddleSlim
```bash
git clone https://github.com/PaddlePaddle/PaddleSlim.git
cd PaddleSlim
python setup.py install
```
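To verify that the installation meets the 1.2.0 requirement noted at the top, a quick check (a minimal sketch; `pip show paddleslim` works equally well):

```python
# Print the installed PaddleSlim version; this tutorial requires >= 1.2.0.
import pkg_resources
print(pkg_resources.get_distribution("paddleslim").version)
```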
## Download Pretrain Model
[Download link of detection pretrain model]()
[Download link of recognition pretrain model]()
## Quant-Aware Training
After loading the pretrained model, define a quantization strategy and then quantize the model with it. For details of the quantization method, see: [Model Quantization](https://paddleslim.readthedocs.io/zh_CN/latest/api_cn/quantization_api.html)
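Under the hood, the training script relies on PaddleSlim's `quant_aware` API. The following is a minimal sketch of that API on a toy one-layer static-graph network (the network, shapes, and config values are illustrative assumptions; the real DB/CRNN program is built from the yml config, and the PACT activation clipping shown in the results table is an additional step not sketched here):

```python
# Minimal sketch of quant-aware training with PaddleSlim's static-graph API
# (PaddleSlim 1.x with Paddle 1.8-style fluid programs).
import paddle.fluid as fluid
from paddleslim.quant import quant_aware

# 8-bit quantization strategy for weights and activations (illustrative values).
quant_config = {
    'weight_quantize_type': 'channel_wise_abs_max',
    'activation_quantize_type': 'moving_average_abs_max',
    'weight_bits': 8,
    'activation_bits': 8,
    'quantize_op_types': ['conv2d', 'depthwise_conv2d', 'mul'],
}

main_prog, startup_prog = fluid.Program(), fluid.Program()
with fluid.program_guard(main_prog, startup_prog):
    # Stand-in network; PaddleOCR builds the real detector from its config.
    image = fluid.data(name='image', shape=[None, 3, 640, 640], dtype='float32')
    out = fluid.layers.conv2d(image, num_filters=16, filter_size=3)

place = fluid.CPUPlace()
exe = fluid.Executor(place)
exe.run(startup_prog)

# Inserts fake-quant/dequant ops so training simulates int8 inference;
# the usual training loop then runs on the returned program.
quant_program = quant_aware(main_prog, place, quant_config, for_test=False)
```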
Enter the PaddleOCR root directory and perform model quantization with the following command:
```bash
python deploy/slim/quantization/quant.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./deploy/slim/quantization/pretrain_models/det_mv3_db/best_accuracy Global.save_model_dir=./output/quant_model
```
## Export inference model
After getting the quantized and finetuned model, we can export it as an inference model for predictive deployment:
```bash
python deploy/slim/quantization/export_model.py -c configs/det/det_mv3_db.yml -o Global.checkpoints=output/quant_model/best_accuracy Global.save_model_dir=./output/quant_inference_model
```
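As a quick sanity check, the exported model can be loaded back with the fluid 1.x inference API (a minimal sketch; the directory matches `Global.save_model_dir` above):

```python
# Load the exported inference model for prediction (fluid 1.x API).
import paddle.fluid as fluid

place = fluid.CPUPlace()
exe = fluid.Executor(place)
# Returns the inference program, its input names, and its output variables.
program, feed_names, fetch_targets = fluid.io.load_inference_model(
    dirname='./output/quant_inference_model', executor=exe)
print(feed_names)
```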