diff --git a/docs/en/advanced_tutorials/ModelQuantizationPrunning_en.md b/docs/en/advanced_tutorials/model_prune_quantization_en.md
similarity index 65%
rename from docs/en/advanced_tutorials/ModelQuantizationPrunning_en.md
rename to docs/en/advanced_tutorials/model_prune_quantization_en.md
index c7a0950c34e407828f1c9a2eb41a4e7fdf519801..a96ca79e79f683f9ed315d5036b8b15189279e0d 100644
--- a/docs/en/advanced_tutorials/ModelQuantizationPrunning_en.md
+++ b/docs/en/advanced_tutorials/model_prune_quantization_en.md
@@ -8,23 +8,25 @@ Model pruning decreases the number of model parameters by cutting out the unimpo
This tutorial explains how to use PaddleSlim, PaddlePaddle's model compression library, for PaddleClas compression, i.e., pruning and quantization. [PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim) integrates a variety of common and leading model compression functions such as model pruning, quantization (including quantization training and offline quantization), distillation, and neural network search. If you are interested, please follow us and learn more.
-To start with, you are recommended to learn [PaddleClas Training](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/models_training/classification.md) and [PaddleSlim](https://paddleslim.readthedocs.io/zh_CN/latest/index.html), see [Model Pruning and Quantization Algorithms](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/algorithm_introduction/model_prune_quantization.md) for related pruning and quantization methods.
+To start with, we recommend reading [PaddleClas Training](../models_training/classification_en.md) and [PaddleSlim](https://paddleslim.readthedocs.io/zh_CN/latest/index.html). See [Model Pruning and Quantization Algorithms](../algorithm_introduction/model_prune_quantization_en.md) for the related pruning and quantization methods.
------
-## Contents
+## Catalogue
-- [1. Prepare the Environment](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/advanced_tutorials/model_prune_quantization.md#1)
- - [1.1 Install PaddleSlim](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/advanced_tutorials/model_prune_quantization.md#1.1)
- - [1.2 Prepare the Trained Model](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/advanced_tutorials/model_prune_quantization.md#1.2)
-- [2. Quick Start](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/advanced_tutorials/model_prune_quantization.md#2)
- - [2.1 Model Quantization](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/advanced_tutorials/model_prune_quantization.md#2.1)
- - [2.1.1 Online Quantization Training](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/advanced_tutorials/model_prune_quantization.md#2.1.1)
- - [2.1.2 Offline Quantization](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/advanced_tutorials/model_prune_quantization.md#2.1.2)
- - [2.2 Model Pruning](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/advanced_tutorials/model_prune_quantization.md#2.2)
-- [3. Export the Model](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/advanced_tutorials/model_prune_quantization.md#3)
-- [4. Deploy the Model](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/advanced_tutorials/model_prune_quantization.md#4)
-- [5. Hyperparameter Training](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/advanced_tutorials/model_prune_quantization.md#5)
+- [1. Prepare the Environment](#1)
+ - [1.1 Install PaddleSlim](#1.1)
+ - [1.2 Prepare the Trained Model](#1.2)
+- [2. Quick Start](#2)
+ - [2.1 Model Quantization](#2.1)
+ - [2.1.1 Online Quantization Training](#2.1.1)
+ - [2.1.2 Offline Quantization](#2.1.2)
+ - [2.2 Model Pruning](#2.2)
+- [3. Export the Model](#3)
+- [4. Deploy the Model](#4)
+- [5. Hyperparameter Training](#5)
+
+
## 1. Prepare the Environment
@@ -38,6 +40,8 @@ Five steps are included:
4. Export quantized inference model
5. Inference and deployment of the quantized model
+
+
### 1.1 Install PaddleSlim
- You can adopt pip install for installation.
@@ -54,9 +58,13 @@ cd Paddleslim
python3.7 setup.py install
```
+
+
### 1.2 Prepare the Trained Model
-PaddleClas offers a list of trained [models](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/models/models_intro.md). If the model to be quantized is not in the list, you need to follow the [regular training](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/models_training/classification.md) method to get the trained model.
+PaddleClas offers a list of trained [models](../models/models_intro_en.md). If the model to be quantized is not in the list, you need to follow the [regular training](../models_training/classification_en.md) method to get the trained model.
+
+
## 2. Quick Start
@@ -68,10 +76,14 @@ cd PaddleClas
Related code for `slim` training has been integrated under `ppcls/engine/`, and the offline quantization code can be found in `deploy/slim/quant_post_static.py`.
+
+
### 2.1 Model Quantization
Quantization training includes offline and online training. Online quantitative training, the more effective one, requires loading a pre-trained model, which can be quantized after defining the strategy.
+
+
#### 2.1.1 Online Quantization Training
Try the following command:
@@ -84,7 +96,7 @@ Take CPU for example, if you use GPU, change the `cpu` to `gpu`.
python3.7 tools/train.py -c ppcls/configs/slim/ResNet50_vd_quantization.yaml -o Global.device=cpu
```
-The parsing of the `yaml` file is described in [reference document](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/models_training/config_description.md). For accuracy, the `pretrained model` has already been adopted by the `yaml` file.
+How the `yaml` file is parsed is described in the [reference document](../models_training/config_description_en.md). To preserve accuracy, the `yaml` file is already configured to use a `pretrained model`.
- Launch in single-machine multi-card/ multi-machine multi-card mode
@@ -96,9 +108,11 @@ python3.7 -m paddle.distributed.launch \
-c ppcls/configs/slim/ResNet50_vd_quantization.yaml
```
+
+
#### 2.1.2 Offline Quantization
-**Note**: Currently, the `inference model` exported from the trained model is a must for offline quantization. See the [tutorial](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/export_model .md) for general export of the `inference model`.
+**Note**: Currently, the `inference model` exported from the trained model is a must for offline quantization. See the [tutorial](../inference_deployment/export_model_en.md) for general export of the `inference model`.
Normally, offline quantization may lose more accuracy.
@@ -112,6 +126,8 @@ The `inference model` is stored in`Global.save_inference_dir`.
Successfully executed, the `quant_post_static_model` folder is created in the `Global.save_inference_dir`, where the generated offline quantization models are stored and can be deployed directly without re-exporting the models.
+
+
### 2.2 Model Pruning
Trying the following command:
@@ -134,6 +150,8 @@ python3.7 -m paddle.distributed.launch \
-c ppcls/configs/slim/ResNet50_vd_prune.yaml
```
+
+
## 3. Export the Model
Having obtained the saved model after online quantization training and pruning, it can be exported as an inference model for inference deployment. Here we take model pruning as an example:
@@ -145,11 +163,15 @@ python3.7 tools/export.py \
-o Global.save_inference_dir=./inference
```
+
+
## 4. Deploy the Model
-The exported model can be deployed directly using inference, please refer to [inference deployment](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_ deployment).
+The exported model can be deployed directly for inference. Please refer to [inference deployment](../inference_deployment/).
+
+You can also use PaddleLite's opt tool to convert the inference model to a mobile model for mobile deployment. Please refer to [Mobile Model Deployment](../inference_deployment/paddle_lite_deploy_en.md) for more details.
-You can also use PaddleLite's opt tool to convert the inference model to a mobile model for its mobile deployment. Please refer to [Mobile Model Deployment](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/inference_deployment/paddle_lite_deploy.md ) for more details.
+
## 5. Hyperparameter Training
diff --git a/docs/en/algorithm_introduction/ModelQuantizationPrunning_en.md b/docs/en/algorithm_introduction/model_prune_quantization_en.md
similarity index 99%
rename from docs/en/algorithm_introduction/ModelQuantizationPrunning_en.md
rename to docs/en/algorithm_introduction/model_prune_quantization_en.md
index 909e5928e97c3a011c54c1f593650b25ecd8cdf3..f18fdd46972e3b879da04dceaf52ef3b6e9652c4 100644
--- a/docs/en/algorithm_introduction/ModelQuantizationPrunning_en.md
+++ b/docs/en/algorithm_introduction/model_prune_quantization_en.md
@@ -7,12 +7,12 @@ Deep learning limits the deployment of corresponding models in some scenarios an
See [PaddeSlim](https://github.com/PaddlePaddle/PaddleSlim/) for detailed parameters.
-## Contents
+## Catalogue
- [1. PACT](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/algorithm_introduction/model_prune_quantization.md#1)
- [2. FPGM](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/algorithm_introduction/model_prune_quantization.md#2)
-
+
## 1. PACT
@@ -36,7 +36,9 @@ After the above improvement, *PACT* preprocessing is inserted between the activa
For specific algorithm parameters, please refer to [Introduction to Parameters](https://github.com/PaddlePaddle/PaddleSlim/blob/release/2.0.0/docs/zh_cn/api_cn/dygraph/quanter/qat.rst#qat) in PaddleSlim.
-## FPGM
+
+
+## 2. FPGM
Model pruning is an essential practice to reduce the model size and improve inference efficiency. In previous articles on network pruning, the norm of the network filter is generally adopted to measure its importance, **the smaller the norm value, the less important the filter is** and the more significant it will be to clip it from the network. **FPGM** believes that the previous approach relies on the following two points:
diff --git a/docs/zh_CN/algorithm_introduction/model_prune_quantization.md b/docs/zh_CN/algorithm_introduction/model_prune_quantization.md
index 30cae234875ae403a65df099acee4ac5634cf78f..02125ca9e4ce143c3d4e2f0fdce4d8d8215720c3 100644
--- a/docs/zh_CN/algorithm_introduction/model_prune_quantization.md
+++ b/docs/zh_CN/algorithm_introduction/model_prune_quantization.md
@@ -36,7 +36,8 @@
算法具体参数请参考 PaddleSlim 中[参数介绍](https://github.com/PaddlePaddle/PaddleSlim/blob/release/2.0.0/docs/zh_cn/api_cn/dygraph/quanter/qat.rst#qat)。
-## FPGM 裁剪
+
+## 2. FPGM 裁剪
模型剪枝是减小模型大小,提升预测效率的一种非常重要的手段。在之前的网络剪枝文章中一般将网络 filter 的范数作为其重要性度量,**范数值较小的代表的 filter 越不重要**,将其从网络中裁剪掉,反之也就越重要。而**FPGM**认为之前的方法要依赖如下两点