From cf7372475e37918c021db64b436e89f6d44c402b Mon Sep 17 00:00:00 2001 From: yukavio <kavioyu@gmail.com> Date: Fri, 18 Sep 2020 15:26:07 +0000 Subject: [PATCH 1/2] complete prune doc --- deploy/slim/prune/README.md | 40 ------- deploy/slim/prune/README_ch.md | 180 ++++++++++++++++++++++++++++ deploy/slim/prune/README_en.md | 183 +++++++++++++++++++++++++++++ deploy/slim/quantization/README.md | 2 +- 4 files changed, 364 insertions(+), 41 deletions(-) delete mode 100644 deploy/slim/prune/README.md create mode 100644 deploy/slim/prune/README_ch.md create mode 100644 deploy/slim/prune/README_en.md diff --git a/deploy/slim/prune/README.md b/deploy/slim/prune/README.md deleted file mode 100644 index f28d2be0..00000000 --- a/deploy/slim/prune/README.md +++ /dev/null @@ -1,40 +0,0 @@ -> è¿è¡Œç¤ºä¾‹å‰è¯·å…ˆå®‰è£…develop版本PaddleSlim - -# 模型è£å‰ªåŽ‹ç¼©æ•™ç¨‹ - -## 概述 - -该示例使用PaddleSlimæ供的[è£å‰ªåŽ‹ç¼©API](https://paddlepaddle.github.io/PaddleSlim/api/prune_api/)对OCR模型进行压缩。 -在阅读该示例å‰ï¼Œå»ºè®®æ‚¨å…ˆäº†è§£ä»¥ä¸‹å†…容: - -- [OCR模型的常规è®ç»ƒæ–¹æ³•](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/detection.md) -- [PaddleSlim使用文档](https://paddlepaddle.github.io/PaddleSlim/) - -## 安装PaddleSlim -å¯æŒ‰ç…§[PaddleSlim使用文档](https://paddlepaddle.github.io/PaddleSlim/)ä¸çš„æ¥éª¤å®‰è£…PaddleSlim。 - - - -## æ•æ„Ÿåº¦åˆ†æžè®ç»ƒ - -进入PaddleOCRæ ¹ç›®å½•ï¼Œé€šè¿‡ä»¥ä¸‹å‘½ä»¤å¯¹æ¨¡åž‹è¿›è¡Œæ•æ„Ÿåº¦åˆ†æžï¼š - -```bash -python deploy/slim/prune/sensitivity_anal.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./deploy/slim/prune/pretrain_models/det_mv3_db/best_accuracy Global.test_batch_size_per_card=1 -``` - -## è£å‰ªæ¨¡åž‹ä¸Žfine-tune - -```bash -python deploy/slim/prune/pruning_and_finetune.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./deploy/slim/prune/pretrain_models/det_mv3_db/best_accuracy Global.test_batch_size_per_card=1 -``` - - - -## 评估并导出 - -在得到è£å‰ªè®ç»ƒä¿å˜çš„模型åŽï¼Œæˆ‘们å¯ä»¥å°†å…¶å¯¼å‡ºä¸ºinference_model,用于预测部署: - -```bash -python 
deploy/slim/prune/export_prune_model.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./output/det_db/best_accuracy Global.test_batch_size_per_card=1 Global.save_inference_dir=inference_model
-```
diff --git a/deploy/slim/prune/README_ch.md b/deploy/slim/prune/README_ch.md
new file mode 100644
index 00000000..3f551977
--- /dev/null
+++ b/deploy/slim/prune/README_ch.md
@@ -0,0 +1,180 @@
+> è¿è¡Œç¤ºä¾‹å‰è¯·å…ˆå®‰è£…develop版本PaddleSlim
+
+# 模型è£å‰ªåŽ‹ç¼©æ•™ç¨‹
+
+压缩结果:
+<table>
+<thead>
+  <tr>
+    <th>åºå·</th>
+    <th>任务</th>
+    <th>模型</th>
+    <th>压缩ç–ç•¥<sup><a href="#quant">[3]</a><a href="#prune">[4]</a></sup></th>
+    <th>精度(自建ä¸æ–‡æ•°æ®é›†)</th>
+    <th>耗时<sup><a href="#latency">[1]</a></sup>(ms)</th>
+    <th>整体耗时<sup><a href="#rec">[2]</a></sup>(ms)</th>
+    <th>åŠ é€Ÿæ¯”</th>
+    <th>整体模型大å°(M)</th>
+    <th>压缩比例</th>
+    <th>下载链接</th>
+  </tr>
+</thead>
+<tbody>
+  <tr>
+    <td rowspan="2">0</td>
+    <td>检测</td>
+    <td>MobileNetV3_DB</td>
+    <td>æ— </td>
+    <td>61.7</td>
+    <td>224</td>
+    <td rowspan="2">375</td>
+    <td rowspan="2">-</td>
+    <td rowspan="2">8.6</td>
+    <td rowspan="2">-</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>识别</td>
+    <td>MobileNetV3_CRNN</td>
+    <td>æ— </td>
+    <td>62.0</td>
+    <td>9.52</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td rowspan="2">1</td>
+    <td>检测</td>
+    <td>SlimTextDet</td>
+    <td>PACTé‡åŒ–è®ç»ƒ</td>
+    <td>62.1</td>
+    <td>195</td>
+    <td rowspan="2">348</td>
+    <td rowspan="2">8%</td>
+    <td rowspan="2">2.8</td>
+    <td rowspan="2">67.82%</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>识别</td>
+    <td>SlimTextRec</td>
+    <td>PACTé‡åŒ–è®ç»ƒ</td>
+    <td>61.48</td>
+    <td>8.6</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td rowspan="2">2</td>
+    <td>检测</td>
+    <td>SlimTextDet_quat_pruning</td>
+    <td>剪è£+PACTé‡åŒ–è®ç»ƒ</td>
+    <td>60.86</td>
+    <td>142</td>
+    <td rowspan="2">288</td>
+    <td rowspan="2">30%</td>
+    <td rowspan="2">2.8</td>
+    <td rowspan="2">67.82%</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>识别</td>
+    <td>SlimTextRec</td>
+    <td>PACTé‡åŒ–è®ç»ƒ</td>
+    <td>61.48</td>
+    <td>8.6</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td rowspan="2">3</td>
+    <td>检测</td>
+    <td>SlimTextDet_pruning</td>
+    <td>剪è£</td>
+    <td>61.57</td>
+    <td>138</td>
+    <td rowspan="2">295</td>
+    <td rowspan="2">27%</td>
+    <td rowspan="2">2.9</td>
+    <td rowspan="2">66.28%</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>识别</td>
+    <td>SlimTextRec</td>
+    <td>PACTé‡åŒ–è®ç»ƒ</td>
+    <td>61.48</td>
+    <td>8.6</td>
+    <td></td>
+  </tr>
+</tbody>
+</table>
+
+## 概述
+
+å¤æ‚的模型有利于æ高模型的性能,但也导致模型ä¸å˜åœ¨ä¸€å®šå†—余。模型è£å‰ªé€šè¿‡ç§»é™¤ç½‘络模型ä¸çš„å网络æ¥å‡å°‘è¿™ç§å†—余,达到å‡å°‘模型计算å¤æ‚度ã€æ高模型推ç†æ€§èƒ½çš„目的。
+
+该示例使用PaddleSlimæ供的[è£å‰ªåŽ‹ç¼©API](https://paddlepaddle.github.io/PaddleSlim/api/prune_api/)对OCR模型进行压缩。
+
+在阅读该示例å‰ï¼Œå»ºè®®æ‚¨å…ˆäº†è§£ä»¥ä¸‹å†…容:
+
+- [OCR模型的常规è®ç»ƒæ–¹æ³•](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/detection.md)
+- [PaddleSlim使用文档](https://paddlepaddle.github.io/PaddleSlim/)
+
+## 安装PaddleSlim
+
+```bash
+git clone https://github.com/PaddlePaddle/PaddleSlim.git
+cd PaddleSlim
+python setup.py install
+```
+
+## 获å–预è®ç»ƒæ¨¡åž‹
+
+[检测预è®ç»ƒæ¨¡åž‹ä¸‹è½½åœ°å€]()
+
+## æ•æ„Ÿåº¦åˆ†æžè®ç»ƒ
+
+åŠ è½½é¢„è®ç»ƒæ¨¡åž‹åŽï¼Œé€šè¿‡å¯¹çŽ°æœ‰æ¨¡åž‹çš„æ¯ä¸ªç½‘络层进行æ•æ„Ÿåº¦åˆ†æžï¼Œäº†è§£å„网络层的冗余度,从而决定æ¯ä¸ªç½‘络层的è£å‰ªæ¯”例。æ•æ„Ÿåº¦åˆ†æžçš„具体细节è§ï¼š[æ•æ„Ÿåº¦åˆ†æž](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/image_classification_sensitivity_analysis_tutorial.md)
+
+进入PaddleOCRæ ¹ç›®å½•ï¼Œé€šè¿‡ä»¥ä¸‹å‘½ä»¤å¯¹æ¨¡åž‹è¿›è¡Œæ•æ„Ÿåº¦åˆ†æžï¼š
+
+```bash
+python deploy/slim/prune/sensitivity_anal.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./deploy/slim/prune/pretrain_models/det_mv3_db/best_accuracy Global.test_batch_size_per_card=1
+```
+
+## è£å‰ªæ¨¡åž‹ä¸Žfine-tune
+
+è£å‰ªæ—¶é€šè¿‡ä¹‹å‰çš„æ•æ„Ÿåº¦åˆ†æžæ–‡ä»¶å†³å®šæ¯ä¸ªç½‘络层的è£å‰ªæ¯”例。在具体实现时,为了尽å¯èƒ½å¤šåœ°ä¿ç•™ä»Žå›¾åƒä¸æå–的低阶特å¾ï¼Œæˆ‘们跳过了backboneä¸é 近输入的4个å·ç§¯å±‚。åŒæ ·ï¼Œä¸ºäº†å‡å°‘由于è£å‰ªå¯¼è‡´çš„模型性能æŸå¤±ï¼Œæˆ‘们通过之å‰æ•æ„Ÿåº¦åˆ†æžæ‰€èŽ·å¾—çš„æ•æ„Ÿåº¦è¡¨ï¼ŒæŒ‘选出了一些冗余较少ã€å¯¹è£å‰ªè¾ƒä¸ºæ•æ„Ÿçš„[网络层](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/deploy/slim/prune/pruning_and_finetune.py#L41),并在之åŽçš„è£å‰ªè¿‡ç¨‹ä¸é€‰æ‹©é¿å¼€è¿™äº›ç½‘络层。è£å‰ªè¿‡åŽfinetune的过程沿用OCR检测模型原始的è®ç»ƒç–略。
+
+```bash
+python deploy/slim/prune/pruning_and_finetune.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./deploy/slim/prune/pretrain_models/det_mv3_db/best_accuracy Global.test_batch_size_per_card=1
+```
+
+## 导出模型
+
+在得到è£å‰ªè®ç»ƒä¿å˜çš„模型åŽï¼Œæˆ‘们å¯ä»¥å°†å…¶å¯¼å‡ºä¸ºinference_model,用于预测部署:
+
+```bash
+python deploy/slim/prune/export_prune_model.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./output/det_db/best_accuracy Global.test_batch_size_per_card=1 Global.save_inference_dir=inference_model
+```
diff --git a/deploy/slim/prune/README_en.md b/deploy/slim/prune/README_en.md
new file mode 100644
index 00000000..4a13481d
--- /dev/null
+++ b/deploy/slim/prune/README_en.md
@@ -0,0 +1,183 @@
+> PaddleSlim develop version should be installed before running this example.
+ + + +# 模型è£å‰ªåŽ‹ç¼©æ•™ç¨‹ + +Compress results: +<table> +<thead> + <tr> + <th>ID</th> + <th>Task</th> + <th>Model</th> + <th>Compress Strategy<sup><a href="#quant">[3]</a><a href="#prune">[4]</a><sup></th> + <th>Criterion(Chinese dataset)</th> + <th>Inference Time<sup><a href="#latency">[1]</a></sup>(ms)</th> + <th>Inference Time(Total model)<sup><a href="#rec">[2]</a></sup>(ms)</th> + <th>Acceleration Ratio</th> + <th>Model Size(MB)</th> + <th>Commpress Ratio</th> + <th>Download Link</th> + </tr> +</thead> +<tbody> + <tr> + <td rowspan="2">0</td> + <td>Detection</td> + <td>MobileNetV3_DB</td> + <td>None</td> + <td>61.7</td> + <td>224</td> + <td rowspan="2">375</td> + <td rowspan="2">-</td> + <td rowspan="2">8.6</td> + <td rowspan="2">-</td> + <td></td> + </tr> + <tr> + <td>Recognition</td> + <td>MobileNetV3_CRNN</td> + <td>None</td> + <td>62.0</td> + <td>9.52</td> + <td></td> + </tr> + <tr> + <td rowspan="2">1</td> + <td>Detection</td> + <td>SlimTextDet</td> + <td>PACT Quant Aware Training</td> + <td>62.1</td> + <td>195</td> + <td rowspan="2">348</td> + <td rowspan="2">8%</td> + <td rowspan="2">2.8</td> + <td rowspan="2">67.82%</td> + <td></td> + </tr> + <tr> + <td>Recognition</td> + <td>SlimTextRec</td> + <td>PACT Quant Aware Training</td> + <td>61.48</td> + <td>8.6</td> + <td></td> + </tr> + <tr> + <td rowspan="2">2</td> + <td>Detection</td> + <td>SlimTextDet_quat_pruning</td> + <td>Pruning+PACT Quant Aware Training</td> + <td>60.86</td> + <td>142</td> + <td rowspan="2">288</td> + <td rowspan="2">30%</td> + <td rowspan="2">2.8</td> + <td rowspan="2">67.82%</td> + <td></td> + </tr> + <tr> + <td>Recognition</td> + <td>SlimTextRec</td> + <td>PPACT Quant Aware Training</td> + <td>61.48</td> + <td>8.6</td> + <td></td> + </tr> + <tr> + <td rowspan="2">3</td> + <td>Detection</td> + <td>SlimTextDet_pruning</td> + <td>Pruning</td> + <td>61.57</td> + <td>138</td> + <td rowspan="2">295</td> + <td rowspan="2">27%</td> + <td rowspan="2">2.9</td> + <td 
rowspan="2">66.28%</td> + <td></td> + </tr> + <tr> + <td>Recognition</td> + <td>SlimTextRec</td> + <td>PACT Quant Aware Training</td> + <td>61.48</td> + <td>8.6</td> + <td></td> + </tr> +</tbody> +</table> + + +## Overview + +Generally, a more complex model would achive better performance in the task, but it also leads to some redundancy in the model. Model Pruning is a technique that reduces this redundancy by removing the sub-models in the neural network model, so as to reduce model calculation complexity and improve model inference performance. + +This example uses PaddleSlim provided[APIs of Pruning](https://paddlepaddle.github.io/PaddleSlim/api/prune_api/) to compress the OCR model. + +It is recommended that you could understand following pages before reading this example,: + + + +\- [The training strategy of OCR model](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/detection.md) + +\- [PaddleSlim Document](https://paddlepaddle.github.io/PaddleSlim/) + + + +## Install PaddleSlim + +\```bash + +git clone https://github.com/PaddlePaddle/PaddleSlim.git + +cd Paddleslim + +python setup.py install + +\``` + + +## Download Pretrain Model + +[Download link of Detection pretrain model]() + + +## Pruning sensitivity analysis + + After the pre-training model is loaded, sensitivity analysis is performed on each network layer of the model to understand the redundancy of each network layer, thereby determining the pruning ratio of each network layer. 
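The rule for turning a sensitivity table into per-layer pruning ratios can be sketched in plain Python. The table below is made-up illustrative data and `pick_ratios` is a hypothetical helper, not the actual output or code of the analysis script:

```python
# Hypothetical sensitivity table: for each layer, candidate prune ratios
# mapped to the accuracy loss measured when pruning that layer alone.
# (Illustrative numbers only -- the real table is produced by the
# sensitivity analysis script.)
sensitivities = {
    "conv5_weights": {0.1: 0.002, 0.2: 0.008, 0.3: 0.030},
    "conv6_weights": {0.1: 0.001, 0.2: 0.002, 0.3: 0.005},
    "conv7_weights": {0.1: 0.010, 0.2: 0.040, 0.3: 0.090},
}

def pick_ratios(sens, max_acc_loss):
    """For each layer, pick the largest candidate ratio whose measured
    accuracy loss stays within max_acc_loss; 0.0 if even the smallest fails."""
    ratios = {}
    for layer, table in sens.items():
        feasible = [r for r, loss in sorted(table.items()) if loss <= max_acc_loss]
        ratios[layer] = max(feasible) if feasible else 0.0
    return ratios

# conv6 tolerates heavy pruning, conv5 only a little, conv7 none at all
print(pick_ratios(sensitivities, max_acc_loss=0.005))
```

This only illustrates the selection idea (the largest ratio whose measured loss stays within an accuracy budget); the real per-layer ratios come from the sensitivity file written by the analysis script.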
For specific details of sensitivity analysis, see: [Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/image_classification_sensitivity_analysis_tutorial.md)
+
+Enter the PaddleOCR root directory, then perform sensitivity analysis on the model with the following command:
+
+```bash
+python deploy/slim/prune/sensitivity_anal.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./deploy/slim/prune/pretrain_models/det_mv3_db/best_accuracy Global.test_batch_size_per_card=1
+```
+
+## Model pruning and fine-tuning
+
+During pruning, the sensitivity analysis file produced earlier determines the pruning ratio of each network layer. In the implementation, to retain as many of the low-level features extracted from the image as possible, we skip the four convolutional layers of the backbone closest to the input. Similarly, to reduce the performance loss caused by pruning, we use the sensitivity table obtained from the previous analysis to pick out some [network layers](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/deploy/slim/prune/pruning_and_finetune.py#L41) that are less redundant but more sensitive to pruning, and skip them in the subsequent pruning process. After pruning, the model needs a fine-tuning process to recover its performance; the fine-tuning strategy follows the original training strategy of the OCR detection model.
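The layer-skipping rule described above can be sketched as a small filter. The parameter names and the flagged layer here are hypothetical placeholders; the real skip list lives in `pruning_and_finetune.py`:

```python
# Hypothetical backbone parameter names, ordered from input to output.
all_params = [f"conv{i}_weights" for i in range(1, 8)]

SKIP_FIRST = 4                         # keep low-level features near the input
flagged_sensitive = {"conv7_weights"}  # flagged by the sensitivity analysis

def prunable_params(params, skip_first, sensitive):
    """Drop the first `skip_first` layers and any sensitivity-flagged ones;
    only the remaining layers are handed to the pruner."""
    return [p for p in params[skip_first:] if p not in sensitive]

print(prunable_params(all_params, SKIP_FIRST, flagged_sensitive))
```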
+
+```bash
+python deploy/slim/prune/pruning_and_finetune.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./deploy/slim/prune/pretrain_models/det_mv3_db/best_accuracy Global.test_batch_size_per_card=1
+```
+
+## Export inference model
+
+After pruning and fine-tuning, we can export the resulting model as an inference_model for predictive deployment:
+
+```bash
+python deploy/slim/prune/export_prune_model.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./output/det_db/best_accuracy Global.test_batch_size_per_card=1 Global.save_inference_dir=inference_model
+```
diff --git a/deploy/slim/quantization/README.md b/deploy/slim/quantization/README.md
index f2e92f54..f7d87c83 100755
--- a/deploy/slim/quantization/README.md
+++ b/deploy/slim/quantization/README.md
@@ -25,7 +25,7 @@ python deploy/slim/quantization/quant.py -c configs/det/det_mv3_db.yml -o Global
 
 
-## 评估并导出
+## 导出模型
 
 在得到é‡åŒ–è®ç»ƒä¿å˜çš„模型åŽï¼Œæˆ‘们å¯ä»¥å°†å…¶å¯¼å‡ºä¸ºinference_model,用于预测部署:
-- GitLab

From 9958fdde66da4a67bb4e50a8f752cf12171d3eef Mon Sep 17 00:00:00 2001
From: yukavio <kavioyu@gmail.com>
Date: Fri, 18 Sep 2020 15:29:20 +0000
Subject: [PATCH 2/2] fix some bug

---
 deploy/slim/prune/README_en.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/deploy/slim/prune/README_en.md b/deploy/slim/prune/README_en.md
index 4a13481d..d345e24c 100644
--- a/deploy/slim/prune/README_en.md
+++ b/deploy/slim/prune/README_en.md
@@ -2,7 +2,7 @@
 
 
-# 模型è£å‰ªåŽ‹ç¼©æ•™ç¨‹
+# Model Compression Tutorial (Pruning)
 
 Compress results:
 <table>
-- GitLab