> PaddleSlim 1.2.0 or a higher version should be installed before running this example.

# Model compression tutorial (Quantization)

Compress results:
| ID | Task | Model | Compress Strategy | Criterion (Chinese dataset) | Inference Time (ms) | Inference Time (Total model) (ms) | Acceleration Ratio | Model Size (MB) | Compress Ratio | Download Link |
|:--:|:----:|:-----:|:-----------------:|:---------------------------:|:-------------------:|:---------------------------------:|:------------------:|:---------------:|:--------------:|:-------------:|
| 0 | Detection | MobileNetV3_DB | None | 61.7 | 224 | 375 | - | 8.6 | - | |
|   | Recognition | MobileNetV3_CRNN | None | 62.0 | 9.52 | | | | | |
| 1 | Detection | SlimTextDet | PACT Quant Aware Training | 62.1 | 195 | 348 | 8% | 2.8 | 67.82% | |
|   | Recognition | SlimTextRec | PACT Quant Aware Training | 61.48 | 8.6 | | | | | |
| 2 | Detection | SlimTextDet_quat_pruning | Pruning + PACT Quant Aware Training | 60.86 | 142 | 288 | 30% | 2.8 | 67.82% | |
|   | Recognition | SlimTextRec | PACT Quant Aware Training | 61.48 | 8.6 | | | | | |
| 3 | Detection | SlimTextDet_pruning | Pruning | 61.57 | 138 | 295 | 27% | 2.9 | 66.28% | |
|   | Recognition | SlimTextRec | PACT Quant Aware Training | 61.48 | 8.6 | | | | | |
## Overview

Generally, a more complex model achieves better performance on a task, but it also carries some redundancy. Quantization is a technique that reduces this redundancy by mapping full-precision values to fixed-point numbers, which lowers the computational complexity of the model and improves its inference performance.

This example uses the [quantization APIs](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/) provided by PaddleSlim to compress the OCR model.

It is recommended that you understand the following pages before reading this example:

- [The training strategy of OCR model](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/detection.md)
- [PaddleSlim Document](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/)

## Install PaddleSlim

```bash
git clone https://github.com/PaddlePaddle/PaddleSlim.git
cd PaddleSlim
python setup.py install
```

## Download Pretrained Models

[Download link of Detection pretrained model]()

[Download link of Recognition pretrained model]()

## Quant-Aware Training

After loading the pretrained model, define the quantization strategy; the model can then be quantized. For details of the quantization method, see: [Model Quantization](https://paddleslim.readthedocs.io/zh_CN/latest/api_cn/quantization_api.html)

Enter the PaddleOCR root directory and perform model quantization with the following command:

```bash
python deploy/slim/quantization/quant.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./deploy/slim/prune/pretrain_models/det_mv3_db/best_accuracy Global.save_model_dir=./output/quant_model
```

## Export inference model

After quant-aware training and finetuning, we can export the model as an inference model for predictive deployment:

```bash
python deploy/slim/quantization/export_model.py -c configs/det/det_mv3_db.yml -o Global.checkpoints=output/quant_model/best_accuracy Global.save_model_dir=./output/quant_inference_model
```
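After export, the quantized inference model can be loaded and run with Paddle's standard inference API. Below is a minimal sketch, assuming the default file layout under the export directory (if `export_model.py` saved the model with custom file names, pass `model_filename`/`params_filename` to `load_inference_model`); the random input is only a placeholder, not PaddleOCR's real preprocessing:

```python
import numpy as np
import paddle.fluid as fluid

place = fluid.CPUPlace()
exe = fluid.Executor(place)

# Directory written by export_model.py in the step above.
model_dir = './output/quant_inference_model'

# Loads the serialized program and parameters. If the model was exported with
# custom file names, add model_filename=... / params_filename=... here.
program, feed_names, fetch_targets = fluid.io.load_inference_model(model_dir, exe)

# Placeholder input: a real deployment feeds a preprocessed image
# (resize + normalization, as in PaddleOCR's prediction pipeline).
img = np.random.rand(1, 3, 640, 640).astype('float32')
outs = exe.run(program, feed={feed_names[0]: img}, fetch_list=fetch_targets)
print([o.shape for o in outs])
```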
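For reference, the quant-aware training step above is built on PaddleSlim's `quant_aware` program transformation, which inserts fake-quantization ops into the program before finetuning so the weights adapt to quantization error. The following is a stand-alone sketch with a toy one-layer network; the config values are illustrative 8-bit defaults, not necessarily the exact settings used by the PaddleOCR script, and this shows plain fake-quantization rather than the PACT variant from the results table (PACT additionally learns activation clipping thresholds):

```python
import paddle.fluid as fluid
from paddleslim.quant import quant_aware

# Toy stand-in network; in PaddleOCR the program is built from det_mv3_db.yml.
train_program = fluid.Program()
startup_program = fluid.Program()
with fluid.program_guard(train_program, startup_program):
    image = fluid.data(name='image', shape=[None, 3, 640, 640], dtype='float32')
    conv = fluid.layers.conv2d(image, num_filters=8, filter_size=3, act='relu')

place = fluid.CPUPlace()
exe = fluid.Executor(place)
exe.run(startup_program)

# Illustrative 8-bit config; keys follow the paddleslim.quant API.
quant_config = {
    'weight_quantize_type': 'channel_wise_abs_max',        # per-channel weight scales
    'activation_quantize_type': 'moving_average_abs_max',  # running activation scales
    'weight_bits': 8,
    'activation_bits': 8,
    'quantize_op_types': ['conv2d', 'depthwise_conv2d', 'mul'],
}

# Rewrites the program with fake-quant/dequant ops; quant-aware finetuning
# then runs on this transformed program.
quant_train_program = quant_aware(train_program, place, quant_config, for_test=False)
```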