# Model compress tutorial (Quantization)

> PaddleSlim 1.2.0 or a higher version should be installed before running this example.

Compress results:
| ID | Task | Model | Compress Strategy | Criterion (Chinese dataset) | Inference Time (ms) | Inference Time (Total model) (ms) | Acceleration Ratio | Model Size (MB) | Compress Ratio | Download Link |
|----|------|-------|-------------------|------------------------------|----------------------|------------------------------------|--------------------|------------------|----------------|---------------|
| 0 | Detection | MobileNetV3_DB | None | 61.7 | 224 | 375 | - | 8.6 | - | |
|   | Recognition | MobileNetV3_CRNN | None | 62.0 | 9.52 | | | | | |
| 1 | Detection | SlimTextDet | PACT Quant Aware Training | 62.1 | 195 | 348 | 8% | 2.8 | 67.82% | |
|   | Recognition | SlimTextRec | PACT Quant Aware Training | 61.48 | 8.6 | | | | | |
| 2 | Detection | SlimTextDet_quat_pruning | Pruning + PACT Quant Aware Training | 60.86 | 142 | 288 | 30% | 2.8 | 67.82% | |
|   | Recognition | SlimTextRec | PACT Quant Aware Training | 61.48 | 8.6 | | | | | |
| 3 | Detection | SlimTextDet_pruning | Pruning | 61.57 | 138 | 295 | 27% | 2.9 | 66.28% | |
|   | Recognition | SlimTextRec | PACT Quant Aware Training | 61.48 | 8.6 | | | | | |
## Overview

Generally, a more complex model achieves better performance on a task, but it also carries redundancy. Quantization is a technique that reduces this redundancy by mapping full-precision values to a fixed number of bits, which lowers the model's computational complexity and improves its inference performance.

This example uses the [quantization APIs](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/) provided by PaddleSlim to compress the OCR model.

PaddleSlim (GitHub: https://github.com/PaddlePaddle/PaddleSlim) is an open-source library that integrates model pruning, quantization (including quantization-aware training and post-training quantization), distillation, neural network architecture search, and many other model compression techniques commonly used and leading in the industry.

It is recommended that you understand the following pages before reading this example:

- [The training strategy of OCR model](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/detection.md)
- [PaddleSlim Document](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/)

## Install PaddleSlim

```bash
git clone https://github.com/PaddlePaddle/PaddleSlim.git
cd PaddleSlim
python setup.py install
```

## Download Pretrained Model

[Download link of Detection pretrained model]()

[Download link of Recognition pretrained model]()

## Quantization-Aware Training

After loading the pretrained model and defining the quantization strategy, the model can be quantized.
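As a minimal illustration of what quantization does (a toy sketch in plain Python, not PaddleSlim's implementation), the following maps float32 values onto the 256 integer levels of int8 with a symmetric linear scheme, then reconstructs approximate floats:

```python
# Toy symmetric int8 quantization: scale floats in [-max_abs, max_abs]
# down to integers in [-127, 127], then dequantize back.
def quantize(values, num_bits=8):
    qmax = 2 ** (num_bits - 1) - 1          # 127 for int8
    max_abs = max(abs(v) for v in values)   # symmetric quantization range
    scale = max_abs / qmax if max_abs else 1.0
    q = [round(v / scale) for v in values]  # small integers, 1 byte each
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.82, -1.27, 0.003, 1.27]
q, scale = quantize(weights)
restored = dequantize(q, scale)
print(q)         # [82, -127, 0, 127]
print(restored)  # each value within one quantization step of the original
```

Each int8 value occupies 1 byte instead of 4, which is where the roughly 4x model-size reduction in the table above comes from; the cost is a small reconstruction error bounded by one quantization step.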
For specific details of the quantization method, see: [Model Quantization](https://paddleslim.readthedocs.io/zh_CN/latest/api_cn/quantization_api.html)

Enter the PaddleOCR root directory and perform model quantization with the following command:

```bash
python deploy/slim/quantization/quant.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./deploy/slim/prune/pretrain_models/det_mv3_db/best_accuracy Global.save_model_dir=./output/quant_model Global.test_batch_size_per_card=1
```

## Export Inference Model

After quantization-aware training and finetuning, we can export the model as an inference model for predictive deployment:

```bash
python deploy/slim/quantization/export_model.py -c configs/det/det_mv3_db.yml -o Global.checkpoints=output/quant_model/best_accuracy Global.save_model_dir=./output/quant_inference_model
```
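For background, the PACT strategy listed in the results table bounds activations with a clipping threshold α that is learned during training, so the limited set of int8 levels is spent on the value range that actually occurs. A toy sketch of the clipping step (names and values here are illustrative, not PaddleSlim internals):

```python
def pact_clip(x, alpha):
    """PACT activation clipping: y = clip(x, 0, alpha).

    In PACT, alpha is a trainable parameter; keeping activations in
    [0, alpha] tightens the range that quantization must cover.
    """
    return min(max(x, 0.0), alpha)

activations = [-0.5, 0.3, 1.1, 4.2]
alpha = 2.0  # illustrative value; learned during quant-aware training
clipped = [pact_clip(a, alpha) for a in activations]
print(clipped)  # [0.0, 0.3, 1.1, 2.0]
```

The clipped values are then quantized linearly over [0, α], which is why PACT rows in the table reach the same accuracy as the full-precision baseline at roughly a third of the model size.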