Unverified commit 74622af4, authored by Walter and committed by GitHub

Merge pull request #1093 from RainFrost1/slim

Add slim functionality
## Introduction to Slim
A more complex model tends to achieve better performance, but it also carries a certain amount of redundancy. This module provides tools to slim the model, in two parts: model quantization (quantization-aware training and post-training quantization) and model pruning.
Model quantization reduces full-precision values to fixed-point numbers to remove this redundancy, lowering the computational complexity of the model and improving inference performance.
Quantization converts FP32 model parameters to Int8 precision with almost no loss of accuracy, shrinking the parameter size and speeding up computation; quantized models have a clear speed advantage when deployed on mobile devices.
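As a rough illustration of the idea, the following is a minimal numpy sketch of symmetric abs-max int8 quantization; it is not PaddleSlim's implementation, just the arithmetic behind it.
```python
# A minimal sketch of symmetric abs-max int8 quantization (illustrative only).
import numpy as np

w = np.random.randn(4, 4).astype("float32")      # FP32 weights
scale = np.abs(w).max() / 127.0                  # map [-max|w|, max|w|] to [-127, 127]
w_int8 = np.clip(np.round(w / scale), -127, 127).astype("int8")
w_dequant = w_int8.astype("float32") * scale     # values used at inference time
print("max abs error:", np.abs(w - w_dequant).max())
```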
Model pruning cuts the unimportant convolution kernels out of the CNN, reducing the number of model parameters and hence the computational complexity of the model.
This tutorial introduces how to compress PaddleClas models with PaddleSlim, the PaddlePaddle model-compression library.
[PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim) integrates pruning, quantization (both quantization-aware training and post-training quantization), knowledge distillation, neural architecture search, and other widely used, industry-leading model-compression techniques; take a look if you are interested.
Before starting this tutorial, it is recommended to first learn [how PaddleClas models are trained](../../docs/zh_CN/tutorials/getting_started.md) and read the [PaddleSlim documentation](https://paddleslim.readthedocs.io/zh_CN/latest/index.html).
## Quick Start
After a model is trained, if you want to further compress its size and speed up prediction, you can compress it with quantization or pruning.
Model compression involves five main steps:
1. Install PaddleSlim
2. Prepare a trained model
3. Model compression
4. Export the inference model
5. Deploy the quantized inference model
* Install with pip:
```bash
pip install paddleslim -i https://pypi.tuna.tsinghua.edu.cn/simple
```
* To get the latest features of PaddleSlim, install it from source.
### 2. Prepare a Trained Model
PaddleClas provides a series of trained [models](../../docs/zh_CN/models/models_intro.md). If the model to be quantized is not among them, train one following the [regular training](../../docs/zh_CN/tutorials/getting_started.md) procedure.
### 3. Model Compression
Go to the PaddleClas root directory:
```bash
cd PaddleClas
```
The `slim` training code has been integrated into `ppcls/engine/`; the post-training quantization code is located at `deploy/slim/quant_post_static.py`.
#### 3.1 Model Quantization
Quantization comes in two flavors: post-training quantization and quantization-aware training. Quantization-aware training gives better accuracy; it requires loading a pretrained model, and the model can be quantized once the quantization strategy is defined.
##### 3.1.1 Quantization-aware training
The training command is as follows:
* CPU / single GPU
Taking the CPU as an example; if you use a GPU, change `cpu` in the command to `gpu`:
```bash
python3.7 tools/train.py -c ppcls/configs/slim/ResNet50_vd_quantization.yaml -o Global.device=cpu
```
The `yaml` file format is described in the [configuration documentation](../../docs/zh_CN/tutorials/config_description.md). To preserve accuracy, a `pretrained model` is already set in the `yaml` file.
* Single machine multi-GPU / multi-machine multi-GPU
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3.7 -m paddle.distributed.launch \
    --gpus="0,1,2,3" \
    tools/train.py \
    -c ppcls/configs/slim/ResNet50_vd_quantization.yaml
```
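For reference, the following is a minimal Python sketch of what the `Slim.quant` setting in the `yaml` file triggers inside the engine (compare `ppcls/engine/slim/quant.py` later in this diff); the stand-in model from `paddle.vision` is an assumption for illustration, since PaddleClas builds its own architectures.
```python
# A minimal sketch of PACT quantization-aware training setup; the stand-in
# model is an assumption -- PaddleClas builds its own architecture instead.
import paddle
from paddleslim.dygraph.quant import QAT

quant_config = {
    "activation_preprocess_type": "PACT",  # learnable clipping before fake quantization
    "weight_quantize_type": "channel_wise_abs_max",
    "activation_quantize_type": "moving_average_abs_max",
    "weight_bits": 8,
    "activation_bits": 8,
    "dtype": "int8",
    "window_size": 10000,
    "moving_rate": 0.9,
    "quantizable_layer_type": ["Conv2D", "Linear"],
}

model = paddle.vision.models.resnet50(num_classes=1000)  # stand-in for ResNet50_vd
quanter = QAT(config=quant_config)
quanter.quantize(model)  # inserts fake-quant ops into Conv2D/Linear layers
# ... then train the wrapped model exactly as in regular training ...
```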
##### 3.1.2 Post-training quantization
**Note**: post-training quantization currently requires an `inference model` exported from a trained model as input. Exporting an `inference model` from a trained model is covered in this [tutorial](../../docs/zh_CN/inference.md).
In general, post-training quantization loses more accuracy than quantization-aware training.
Once the `inference model` has been generated, run post-training quantization as follows:
```bash
python3.7 deploy/slim/quant_post_static.py -c ppcls/configs/ImageNet/ResNet/ResNet50_vd.yaml -o Global.save_inference_dir=./deploy/models/class_ResNet50_vd_ImageNet_infer
```
`Global.save_inference_dir` is the directory where the `inference model` is stored.
On success, a `quant_post_static_model` folder is generated under `Global.save_inference_dir`, containing the offline-quantized model. It can be deployed directly without re-exporting.
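In essence the script performs the call below; this condensed sketch feeds random calibration data instead of training images, so treat it as an illustration of the API, not a replacement for the script.
```python
# A condensed sketch of deploy/slim/quant_post_static.py; the random
# calibration reader is an assumption -- the real script feeds training images.
import numpy as np
import paddle
import paddleslim

paddle.enable_static()
exe = paddle.static.Executor(paddle.CPUPlace())

def sample_generator():
    for _ in range(10):  # a few calibration samples (batch_size 1, as in the script)
        yield np.random.rand(1, 3, 224, 224).astype("float32")

paddleslim.quant.quant_post_static(
    executor=exe,
    model_dir="./deploy/models/class_ResNet50_vd_ImageNet_infer",
    model_filename="inference.pdmodel",
    params_filename="inference.pdiparams",
    quantize_model_path="./deploy/models/class_ResNet50_vd_ImageNet_infer/quant_post_static_model",
    sample_generator=sample_generator,
    batch_nums=10)
```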
#### 3.2 Model Pruning
The training command is as follows:
- CPU / single GPU
Taking the CPU as an example; if you use a GPU, change `cpu` in the command to `gpu`:
```bash
python3.7 tools/train.py -c ppcls/configs/slim/ResNet50_vd_prune.yaml -o Global.device=cpu
```
- Single machine multi-GPU / multi-machine multi-GPU
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3.7 -m paddle.distributed.launch \
    --gpus="0,1,2,3" \
    tools/train.py \
    -c ppcls/configs/slim/ResNet50_vd_prune.yaml
```
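For reference, the following is a minimal sketch of the FPGM pruning that `Slim.prune` with `name: fpgm` and `pruned_ratio: 0.3` performs inside the engine (compare `ppcls/engine/slim/prune.py` later in this diff); the stand-in model from `paddle.vision` is an assumption.
```python
# A minimal FPGM filter-pruning sketch; the stand-in model is an assumption.
import paddle
import paddleslim

model = paddle.vision.models.resnet50(num_classes=1000)  # stand-in for ResNet50_vd
pruner = paddleslim.dygraph.FPGMFilterPruner(model, [1, 3, 224, 224])

# prune 30% of the filters of every Conv2D layer, like pruned_ratio: 0.3
ratios = {}
for layer in model.sublayers():
    if isinstance(layer, paddle.nn.Conv2D):
        for param in layer.parameters(include_sublayers=False):
            ratios[param.name] = 0.3
plan = pruner.prune_vars(ratios, [0])  # prune along the output-channel axis
print("pruned FLOPs ratio:", plan.pruned_flops)
```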
### 4. Export the Model
After obtaining the model saved by quantization-aware training or pruning, export it as an inference model for deployment. Taking pruning as an example:
```bash
python3.7 tools/export.py \
    -c ppcls/configs/slim/ResNet50_vd_prune.yaml \
    -o Global.pretrained_model=./output/ResNet50_vd/best_model \
    -o Global.save_inference_dir=./inference
```
### 5. Model Deployment
The model exported in the step above can be converted with PaddleLite's `opt` model-conversion tool.
For model deployment, refer to [mobile-side model deployment](../lite/readme.md).
## Suggestions on Training Hyperparameters
* For quantization-aware training, load a pretrained model from regular training to speed up convergence.
* For quantization-aware training, set the initial learning rate to `1/20~1/10` of the regular-training value and the number of epochs to `1/5~1/2`; add Warmup to the learning-rate schedule, and leave the other settings unchanged.
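As a quick sanity check of these rules of thumb (with illustrative baseline numbers, not values taken from any particular config):
```python
# Illustrative scaling of a regular-training recipe for quantization-aware
# training; the baseline lr/epochs here are assumptions.
base_lr, base_epochs = 0.1, 200
quant_lr = (base_lr / 20, base_lr / 10)              # suggested range: (0.005, 0.01)
quant_epochs = (base_epochs // 5, base_epochs // 2)  # suggested range: (40, 100)
print(quant_lr, quant_epochs)
```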
## Introduction to Slim
Generally, a more complex model achieves better performance, but it also introduces some redundancy. This part provides model-compression functions in two parts: model quantization (post-training quantization and quantization-aware training) and model pruning.
Quantization is a technique that removes this redundancy by reducing full-precision data to fixed-point numbers, so as to lower model computational complexity and improve inference performance.
This example uses the [quantization APIs](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/) provided by PaddleSlim to compress PaddleClas models.
Model pruning cuts the unimportant convolution kernels out of the CNN to reduce the number of model parameters, so as to lower the computational complexity of the model.
It is recommended that you read the following pages before working through this example:
- [The training strategy of PaddleClas models](../../docs/en/tutorials/getting_started_en.md)
- [PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim)
## Quick Start
After training a model, if you want to further compress its size and speed up prediction, you can compress it with quantization or pruning, following the steps below.
1. Install PaddleSlim
2. Prepare a trained model
3. Model compression
4. Export the inference model
5. Deploy the quantized inference model
* Install by pip.
```bash
pip install paddleslim -i https://pypi.tuna.tsinghua.edu.cn/simple
```
* Install from source to get the latest features.
### 2. Prepare a Trained Model
PaddleClas provides a series of trained [models](../../docs/en/models/models_intro_en.md).
If the model to be quantized is not in the list, you need to follow the [Regular Training](../../docs/en/tutorials/getting_started_en.md) method to get a trained model.
### 3. Model Compression
Go to the root directory of PaddleClas:
```bash
cd PaddleClas
```
The training-related code has been integrated into `ppcls/engine/`. The post-training quantization code is located in `deploy/slim/quant_post_static.py`.
#### 3.1 Model Quantization
Quantization includes post-training quantization and quantization-aware training. Quantization-aware (online) training is more effective; it requires loading the pre-trained model, and the model can be quantized once the quantization strategy is defined.
##### 3.1.1 Online quantization training
The training command is as follows:
* CPU / single GPU
If using a GPU, change `cpu` to `gpu` in the following command:
```bash
python3.7 tools/train.py -c ppcls/configs/slim/ResNet50_vd_quantization.yaml -o Global.device=cpu
```
The description of the `yaml` file can be found in this [doc](../../docs/en/tutorials/config_en.md). To get better accuracy, a `pretrained model` is used in the `yaml` file.
* Distributed training
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3.7 -m paddle.distributed.launch \
    --gpus="0,1,2,3" \
    tools/train.py \
    -c ppcls/configs/slim/ResNet50_vd_quantization.yaml
```
##### 3.1.2 Offline quantization
**Attention**: At present, post-training quantization must take an `inference model` exported from a trained model as input. The process of exporting an `inference model` from a trained model is described in this [doc](../../docs/en/inference.md).
Generally speaking, post-training quantization loses more accuracy than quantization-aware training.
After getting the `inference model`, we can run the following command to produce the offline-quantized model:
```bash
python3.7 deploy/slim/quant_post_static.py -c ppcls/configs/ImageNet/ResNet/ResNet50_vd.yaml -o Global.save_inference_dir=./deploy/models/class_ResNet50_vd_ImageNet_infer
```
`Global.save_inference_dir` is the directory storing the `inference model`.
If it runs successfully, a `quant_post_static_model` directory is generated under `Global.save_inference_dir`, which stores the offline-quantized model; it can be deployed directly.
#### 3.2 Model Pruning
- CPU / single GPU
If using a GPU, change `cpu` to `gpu` in the following command:
```bash
python3.7 tools/train.py -c ppcls/configs/slim/ResNet50_vd_prune.yaml -o Global.device=cpu
```
- Distributed training
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3.7 -m paddle.distributed.launch \
    --gpus="0,1,2,3" \
    tools/train.py \
    -c ppcls/configs/slim/ResNet50_vd_prune.yaml
```
### 4. Export inference model
After getting the compressed model, we can export it as an inference model for predictive deployment. Taking the pruned model as an example:
```bash
python3.7 tools/export.py \
    -c ppcls/configs/slim/ResNet50_vd_prune.yaml \
    -o Global.pretrained_model=./output/ResNet50_vd/best_model \
    -o Global.save_inference_dir=./inference
```
### 5. Deploy
The parameters of the quantized model exported in the above steps are still stored as FP32, but their numerical range is int8.
The exported model can be converted through the `opt` tool of PaddleLite.
For compressed model deployment, please refer to [Mobile terminal model deployment](../lite/readme_en.md).
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import argparse
import os
import sys

__dir__ = os.path.dirname(os.path.abspath(__file__))
sys.path.append(__dir__)
sys.path.append(os.path.abspath(os.path.join(__dir__, '..', '..', '..')))
sys.path.append(
    os.path.abspath(os.path.join(__dir__, '..', '..', '..', 'tools')))

from ppcls.arch import backbone
from ppcls.utils.save_load import load_dygraph_pretrain
import paddle
import paddle.nn.functional as F
from paddle.jit import to_static
from paddleslim.dygraph.quant import QAT

from pact_helper import get_default_quant_config


def parse_args():
    def str2bool(v):
        return v.lower() in ("true", "t", "1")

    parser = argparse.ArgumentParser()
    parser.add_argument("-m", "--model", type=str)
    parser.add_argument("-p", "--pretrained_model", type=str)
    parser.add_argument("-o", "--output_path", type=str, default="./inference")
    parser.add_argument("--class_dim", type=int, default=1000)
    parser.add_argument("--load_static_weights", type=str2bool, default=False)
    parser.add_argument("--img_size", type=int, default=224)
    return parser.parse_args()


class Net(paddle.nn.Layer):
    def __init__(self, net, class_dim, model=None):
        super(Net, self).__init__()
        self.pre_net = net(class_dim=class_dim)
        self.model = model

    def forward(self, inputs):
        x = self.pre_net(inputs)
        if self.model == "GoogLeNet":
            x = x[0]
        x = F.softmax(x)
        return x


def main():
    args = parse_args()

    net = backbone.__dict__[args.model]
    model = Net(net, args.class_dim, args.model)

    # get QAT model
    quant_config = get_default_quant_config()
    # TODO(littletomatodonkey): add PACT for export model
    # quant_config["activation_preprocess_type"] = "PACT"
    quanter = QAT(config=quant_config)
    quanter.quantize(model)

    load_dygraph_pretrain(
        model.pre_net,
        path=args.pretrained_model,
        load_static_weights=args.load_static_weights)
    model.eval()

    save_path = os.path.join(args.output_path, "inference")
    quanter.save_quantized_model(
        model,
        save_path,
        input_spec=[
            paddle.static.InputSpec(
                shape=[None, 3, args.img_size, args.img_size],
                dtype='float32')
        ])
    print('inference QAT model is saved to {}'.format(save_path))


if __name__ == "__main__":
    main()
# copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import paddle


def get_default_quant_config():
    quant_config = {
        # weight preprocess type, default is None and no preprocessing is performed.
        'weight_preprocess_type': None,
        # activation preprocess type, default is None and no preprocessing is performed.
        'activation_preprocess_type': None,
        # weight quantize type, default is 'channel_wise_abs_max'
        'weight_quantize_type': 'channel_wise_abs_max',
        # activation quantize type, default is 'moving_average_abs_max'
        'activation_quantize_type': 'moving_average_abs_max',
        # weight quantize bit num, default is 8
        'weight_bits': 8,
        # activation quantize bit num, default is 8
        'activation_bits': 8,
        # data type after quantization, such as 'uint8', 'int8', etc. default is 'int8'
        'dtype': 'int8',
        # window size for 'range_abs_max' quantization. default is 10000
        'window_size': 10000,
        # The decay coefficient of moving average, default is 0.9
        'moving_rate': 0.9,
        # for dygraph quantization, layers of type in quantizable_layer_type will be quantized
        'quantizable_layer_type': ['Conv2D', 'Linear'],
    }
    return quant_config
# copyright (c) 2021 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import os
import sys

__dir__ = os.path.dirname(os.path.abspath(__file__))
sys.path.append(__dir__)
sys.path.append(os.path.abspath(os.path.join(__dir__, '..', '..', '..')))
sys.path.append(
    os.path.abspath(os.path.join(__dir__, '..', '..', '..', 'tools')))

import paddle
from paddleslim.dygraph.quant import QAT

from ppcls.data import Reader
from ppcls.utils.config import get_config
from ppcls.utils.save_load import init_model, save_model
from ppcls.utils import logger
import program

from pact_helper import get_default_quant_config


def parse_args():
    parser = argparse.ArgumentParser("PaddleClas train script")
    parser.add_argument(
        '-c',
        '--config',
        type=str,
        default='configs/ResNet/ResNet50.yaml',
        help='config file path')
    parser.add_argument(
        '-o',
        '--override',
        action='append',
        default=[],
        help='config options to be overridden')
    args = parser.parse_args()
    return args


def main(args):
    paddle.seed(12345)

    config = get_config(args.config, overrides=args.override, show=True)
    # assign the place
    use_gpu = config.get("use_gpu", True)
    place = paddle.set_device('gpu' if use_gpu else 'cpu')

    trainer_num = paddle.distributed.get_world_size()
    use_data_parallel = trainer_num != 1
    config["use_data_parallel"] = use_data_parallel

    if config["use_data_parallel"]:
        paddle.distributed.init_parallel_env()

    net = program.create_model(config.ARCHITECTURE, config.classes_num)

    # prepare to quant
    quant_config = get_default_quant_config()
    quant_config["activation_preprocess_type"] = "PACT"
    quanter = QAT(config=quant_config)
    quanter.quantize(net)

    optimizer, lr_scheduler = program.create_optimizer(
        config, parameter_list=net.parameters())

    init_model(config, net, optimizer)

    if config["use_data_parallel"]:
        net = paddle.DataParallel(net)

    train_dataloader = Reader(config, 'train', places=place)()

    if config.validate:
        valid_dataloader = Reader(config, 'valid', places=place)()

    last_epoch_id = config.get("last_epoch", -1)
    best_top1_acc = 0.0  # best top1 acc record
    best_top1_epoch = last_epoch_id
    for epoch_id in range(last_epoch_id + 1, config.epochs):
        net.train()
        # 1. train with train dataset
        program.run(train_dataloader, config, net, optimizer, lr_scheduler,
                    epoch_id, 'train')

        # 2. validate with validate dataset
        if config.validate and epoch_id % config.valid_interval == 0:
            net.eval()
            with paddle.no_grad():
                top1_acc = program.run(valid_dataloader, config, net, None,
                                       None, epoch_id, 'valid')
            if top1_acc > best_top1_acc:
                best_top1_acc = top1_acc
                best_top1_epoch = epoch_id
                model_path = os.path.join(config.model_save_dir,
                                          config.ARCHITECTURE["name"])
                save_model(net, optimizer, model_path, "best_model")
            message = "The best top1 acc {:.5f}, in epoch: {:d}".format(
                best_top1_acc, best_top1_epoch)
            logger.info(message)

        # 3. save the persistable model
        if epoch_id % config.save_interval == 0:
            model_path = os.path.join(config.model_save_dir,
                                      config.ARCHITECTURE["name"])
            save_model(net, optimizer, model_path, epoch_id)


if __name__ == '__main__':
    args = parse_args()
    main(args)
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import, division, print_function

import os
import sys

import numpy as np
import paddle
import paddleslim
from paddle.jit import to_static
from paddleslim.analysis import dygraph_flops as flops

__dir__ = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.abspath(os.path.join(__dir__, '../../')))
from paddleslim.dygraph.quant import QAT

from ppcls.data import build_dataloader
from ppcls.utils import config as conf
from ppcls.utils.logger import init_logger


def main():
    args = conf.parse_args()
    config = conf.get_config(args.config, overrides=args.override, show=False)

    assert os.path.exists(
        os.path.join(config["Global"]["save_inference_dir"],
                     'inference.pdmodel')) and os.path.exists(
                         os.path.join(config["Global"]["save_inference_dir"],
                                      'inference.pdiparams'))

    config["DataLoader"]["Train"]["sampler"]["batch_size"] = 1
    config["DataLoader"]["Train"]["loader"]["num_workers"] = 0
    init_logger()
    device = paddle.set_device("cpu")
    train_dataloader = build_dataloader(config["DataLoader"], "Train", device,
                                        False)

    def sample_generator(loader):
        def __reader__():
            for indx, data in enumerate(loader):
                images = np.array(data[0])
                yield images

        return __reader__

    paddle.enable_static()
    place = paddle.CPUPlace()
    exe = paddle.static.Executor(place)

    paddleslim.quant.quant_post_static(
        executor=exe,
        model_dir=config["Global"]["save_inference_dir"],
        model_filename='inference.pdmodel',
        params_filename='inference.pdiparams',
        quantize_model_path=os.path.join(
            config["Global"]["save_inference_dir"],
            "quant_post_static_model"),
        sample_generator=sample_generator(train_dataloader),
        batch_nums=10)


if __name__ == "__main__":
    main()
# (excerpt from the MobileNetV3 backbone, class MobileNetV3(TheseusLayer))
            if_act=True,
            act="hardswish")

        self.blocks = nn.Sequential(* [
            ResidualUnit(
                in_c=_make_divisible(self.inplanes * self.scale if i == 0 else
                                     self.cfg[i - 1][2] * self.scale),
# global configs
Global:
  checkpoints: null
  pretrained_model: null
  output_dir: ./output/
  device: gpu
  save_interval: 1
  eval_during_train: True
  eval_interval: 1
  epochs: 360
  print_batch_step: 10
  use_visualdl: False
  # used for static mode and model export
  image_shape: [3, 224, 224]
  save_inference_dir: ./inference

# for quantization or prune model
Slim:
  ## for prune
  prune:
    name: fpgm
    pruned_ratio: 0.3

# model architecture
Arch:
  name: MobileNetV3_large_x1_0
  class_num: 1000
  pretrained: True

# loss function config for training/eval process
Loss:
  Train:
    - CELoss:
        weight: 1.0
        epsilon: 0.1
  Eval:
    - CELoss:
        weight: 1.0

Optimizer:
  name: Momentum
  momentum: 0.9
  lr:
    name: Cosine
    learning_rate: 0.65
    warmup_epoch: 5
  regularizer:
    name: 'L2'
    coeff: 0.00002

# data loader for train and eval
DataLoader:
  Train:
    dataset:
      name: ImageNetDataset
      image_root: ./dataset/ILSVRC2012/
      cls_label_path: ./dataset/ILSVRC2012/train_list.txt
      transform_ops:
        - DecodeImage:
            to_rgb: True
            channel_first: False
        - RandCropImage:
            size: 224
        - RandFlipImage:
            flip_code: 1
        - AutoAugment:
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
    sampler:
      name: DistributedBatchSampler
      batch_size: 256
      drop_last: False
      shuffle: True
    loader:
      num_workers: 4
      use_shared_memory: True

  Eval:
    dataset:
      name: ImageNetDataset
      image_root: ./dataset/ILSVRC2012/
      cls_label_path: ./dataset/ILSVRC2012/val_list.txt
      transform_ops:
        - DecodeImage:
            to_rgb: True
            channel_first: False
        - ResizeImage:
            resize_short: 256
        - CropImage:
            size: 224
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
    sampler:
      name: DistributedBatchSampler
      batch_size: 64
      drop_last: False
      shuffle: False
    loader:
      num_workers: 4
      use_shared_memory: True

Infer:
  infer_imgs: docs/images/whl/demo.jpg
  batch_size: 10
  transforms:
    - DecodeImage:
        to_rgb: True
        channel_first: False
    - ResizeImage:
        resize_short: 256
    - CropImage:
        size: 224
    - NormalizeImage:
        scale: 1.0/255.0
        mean: [0.485, 0.456, 0.406]
        std: [0.229, 0.224, 0.225]
        order: ''
    - ToCHWImage:
  PostProcess:
    name: Topk
    topk: 5
    class_id_map_file: ppcls/utils/imagenet1k_label_list.txt

Metric:
  Train:
    - TopkAcc:
        topk: [1, 5]
  Eval:
    - TopkAcc:
        topk: [1, 5]
# global configs
Global:
  checkpoints: null
  pretrained_model: null
  output_dir: ./output/
  device: gpu
  save_interval: 1
  eval_during_train: True
  eval_interval: 1
  epochs: 60
  print_batch_step: 10
  use_visualdl: False
  # used for static mode and model export
  image_shape: [3, 224, 224]
  save_inference_dir: ./inference

# for quantization or prune model
Slim:
  ## for quantization
  quant:
    name: pact

# model architecture
Arch:
  name: MobileNetV3_large_x1_0
  class_num: 1000
  pretrained: True

# loss function config for training/eval process
Loss:
  Train:
    - CELoss:
        weight: 1.0
        epsilon: 0.1
  Eval:
    - CELoss:
        weight: 1.0

Optimizer:
  name: Momentum
  momentum: 0.9
  lr:
    name: Cosine
    learning_rate: 0.065
    warmup_epoch: 5
  regularizer:
    name: 'L2'
    coeff: 0.00002

# data loader for train and eval
DataLoader:
  Train:
    dataset:
      name: ImageNetDataset
      image_root: ./dataset/ILSVRC2012/
      cls_label_path: ./dataset/ILSVRC2012/train_list.txt
      transform_ops:
        - DecodeImage:
            to_rgb: True
            channel_first: False
        - RandCropImage:
            size: 224
        - RandFlipImage:
            flip_code: 1
        - AutoAugment:
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
    sampler:
      name: DistributedBatchSampler
      batch_size: 256
      drop_last: False
      shuffle: True
    loader:
      num_workers: 4
      use_shared_memory: True

  Eval:
    dataset:
      name: ImageNetDataset
      image_root: ./dataset/ILSVRC2012/
      cls_label_path: ./dataset/ILSVRC2012/val_list.txt
      transform_ops:
        - DecodeImage:
            to_rgb: True
            channel_first: False
        - ResizeImage:
            resize_short: 256
        - CropImage:
            size: 224
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
    sampler:
      name: DistributedBatchSampler
      batch_size: 64
      drop_last: False
      shuffle: False
    loader:
      num_workers: 4
      use_shared_memory: True

Infer:
  infer_imgs: docs/images/whl/demo.jpg
  batch_size: 10
  transforms:
    - DecodeImage:
        to_rgb: True
        channel_first: False
    - ResizeImage:
        resize_short: 256
    - CropImage:
        size: 224
    - NormalizeImage:
        scale: 1.0/255.0
        mean: [0.485, 0.456, 0.406]
        std: [0.229, 0.224, 0.225]
        order: ''
    - ToCHWImage:
  PostProcess:
    name: Topk
    topk: 5
    class_id_map_file: ppcls/utils/imagenet1k_label_list.txt

Metric:
  Train:
    - TopkAcc:
        topk: [1, 5]
  Eval:
    - TopkAcc:
        topk: [1, 5]
# global configs
Global:
  checkpoints: null
  pretrained_model: null
  output_dir: ./output/
  device: gpu
  save_interval: 1
  eval_during_train: True
  eval_interval: 1
  epochs: 200
  print_batch_step: 10
  use_visualdl: False
  # used for static mode and model export
  image_shape: [3, 224, 224]
  save_inference_dir: ./inference

# for quantization or prune model
Slim:
  ## for prune
  prune:
    name: fpgm
    pruned_ratio: 0.3

# model architecture
Arch:
  name: ResNet50_vd
  class_num: 1000
  pretrained: True

# loss function config for training/eval process
Loss:
  Train:
    - MixCELoss:
        weight: 1.0
        epsilon: 0.1
  Eval:
    - CELoss:
        weight: 1.0

Optimizer:
  name: Momentum
  momentum: 0.9
  lr:
    name: Cosine
    learning_rate: 0.1
  regularizer:
    name: 'L2'
    coeff: 0.00007

# data loader for train and eval
DataLoader:
  Train:
    dataset:
      name: ImageNetDataset
      image_root: ./dataset/ILSVRC2012/
      cls_label_path: ./dataset/ILSVRC2012/train_list.txt
      transform_ops:
        - DecodeImage:
            to_rgb: True
            channel_first: False
        - RandCropImage:
            size: 224
        - RandFlipImage:
            flip_code: 1
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
      batch_transform_ops:
        - MixupOperator:
            alpha: 0.2
    sampler:
      name: DistributedBatchSampler
      batch_size: 64
      drop_last: False
      shuffle: True
    loader:
      num_workers: 4
      use_shared_memory: True

  Eval:
    dataset:
      name: ImageNetDataset
      image_root: ./dataset/ILSVRC2012/
      cls_label_path: ./dataset/ILSVRC2012/val_list.txt
      transform_ops:
        - DecodeImage:
            to_rgb: True
            channel_first: False
        - ResizeImage:
            resize_short: 256
        - CropImage:
            size: 224
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
    sampler:
      name: DistributedBatchSampler
      batch_size: 64
      drop_last: False
      shuffle: False
    loader:
      num_workers: 4
      use_shared_memory: True

Infer:
  infer_imgs: docs/images/whl/demo.jpg
  batch_size: 10
  transforms:
    - DecodeImage:
        to_rgb: True
        channel_first: False
    - ResizeImage:
        resize_short: 256
    - CropImage:
        size: 224
    - NormalizeImage:
        scale: 1.0/255.0
        mean: [0.485, 0.456, 0.406]
        std: [0.229, 0.224, 0.225]
        order: ''
    - ToCHWImage:
  PostProcess:
    name: Topk
    topk: 5
    class_id_map_file: ppcls/utils/imagenet1k_label_list.txt

Metric:
  Train:
  Eval:
    - TopkAcc:
        topk: [1, 5]
# global configs
Global:
  checkpoints: null
  pretrained_model: null
  output_dir: ./output/
  device: gpu
  save_interval: 1
  eval_during_train: True
  eval_interval: 1
  epochs: 30
  print_batch_step: 10
  use_visualdl: False
  # used for static mode and model export
  image_shape: [3, 224, 224]
  save_inference_dir: ./inference

# for quantization or prune model
Slim:
  ## for quantization
  quant:
    name: pact

# model architecture
Arch:
  name: ResNet50_vd
  class_num: 1000
  pretrained: True

# loss function config for training/eval process
Loss:
  Train:
    - MixCELoss:
        weight: 1.0
        epsilon: 0.1
  Eval:
    - CELoss:
        weight: 1.0

Optimizer:
  name: Momentum
  momentum: 0.9
  lr:
    name: Cosine
    learning_rate: 0.01
  regularizer:
    name: 'L2'
    coeff: 0.00007

# data loader for train and eval
DataLoader:
  Train:
    dataset:
      name: ImageNetDataset
      image_root: ./dataset/ILSVRC2012/
      cls_label_path: ./dataset/ILSVRC2012/train_list.txt
      transform_ops:
        - DecodeImage:
            to_rgb: True
            channel_first: False
        - RandCropImage:
            size: 224
        - RandFlipImage:
            flip_code: 1
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
      batch_transform_ops:
        - MixupOperator:
            alpha: 0.2
    sampler:
      name: DistributedBatchSampler
      batch_size: 64
      drop_last: False
      shuffle: True
    loader:
      num_workers: 4
      use_shared_memory: True

  Eval:
    dataset:
      name: ImageNetDataset
      image_root: ./dataset/ILSVRC2012/
      cls_label_path: ./dataset/ILSVRC2012/val_list.txt
      transform_ops:
        - DecodeImage:
            to_rgb: True
            channel_first: False
        - ResizeImage:
            resize_short: 256
        - CropImage:
            size: 224
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
    sampler:
      name: DistributedBatchSampler
      batch_size: 64
      drop_last: False
      shuffle: False
    loader:
      num_workers: 4
      use_shared_memory: True

Infer:
  infer_imgs: docs/images/whl/demo.jpg
  batch_size: 10
  transforms:
    - DecodeImage:
        to_rgb: True
        channel_first: False
    - ResizeImage:
        resize_short: 256
    - CropImage:
        size: 224
    - NormalizeImage:
        scale: 1.0/255.0
        mean: [0.485, 0.456, 0.406]
        std: [0.229, 0.224, 0.225]
        order: ''
    - ToCHWImage:
  PostProcess:
    name: Topk
    topk: 5
    class_id_map_file: ppcls/utils/imagenet1k_label_list.txt

Metric:
  Train:
  Eval:
    - TopkAcc:
        topk: [1, 5]
# global configs
Global:
  checkpoints: null
  pretrained_model: null
  output_dir: "./output/"
  device: "gpu"
  save_interval: 1
  eval_during_train: True
  eval_interval: 1
  epochs: 160
  print_batch_step: 10
  use_visualdl: False
  # used for static mode and model export
  image_shape: [3, 224, 224]
  save_inference_dir: "./inference"
  eval_mode: "retrieval"

# for quantization or prune model
Slim:
  ## for prune
  prune:
    name: fpgm
    pruned_ratio: 0.3

# model architecture
Arch:
  name: "RecModel"
  infer_output_key: "features"
  infer_add_softmax: False
  Backbone:
    name: "ResNet50_last_stage_stride1"
    pretrained: True
  BackboneStopLayer:
    name: "adaptive_avg_pool2d_0"
  Neck:
    name: "VehicleNeck"
    in_channels: 2048
    out_channels: 512
  Head:
    name: "ArcMargin"
    embedding_size: 512
    class_num: 30671
    margin: 0.15
    scale: 32

# loss function config for training/eval process
Loss:
  Train:
    - CELoss:
        weight: 1.0
    - SupConLoss:
        weight: 1.0
        views: 2
  Eval:
    - CELoss:
        weight: 1.0

Optimizer:
  name: Momentum
  momentum: 0.9
  lr:
    name: Cosine
    learning_rate: 0.01
    last_epoch: -1
  regularizer:
    name: 'L2'
    coeff: 0.0005

# data loader for train and eval
DataLoader:
  Train:
    dataset:
      name: "VeriWild"
      image_root: "./dataset/VeRI-Wild/images/"
      cls_label_path: "./dataset/VeRI-Wild/train_test_split/train_list_start0.txt"
      transform_ops:
        - DecodeImage:
            to_rgb: True
            channel_first: False
        - ResizeImage:
            size: 224
        - RandFlipImage:
            flip_code: 1
        - AugMix:
            prob: 0.5
        - NormalizeImage:
            scale: 0.00392157
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
        - RandomErasing:
            EPSILON: 0.5
            sl: 0.02
            sh: 0.4
            r1: 0.3
            mean: [0., 0., 0.]
    sampler:
      name: DistributedRandomIdentitySampler
      batch_size: 128
      num_instances: 2
      drop_last: False
      shuffle: True
    loader:
      num_workers: 6
      use_shared_memory: True

  Eval:
    Query:
      dataset:
        name: "VeriWild"
        image_root: "./dataset/VeRI-Wild/images"
        cls_label_path: "./dataset/VeRI-Wild/train_test_split/test_3000_id_query.txt"
        transform_ops:
          - DecodeImage:
              to_rgb: True
              channel_first: False
          - ResizeImage:
              size: 224
          - NormalizeImage:
              scale: 0.00392157
              mean: [0.485, 0.456, 0.406]
              std: [0.229, 0.224, 0.225]
              order: ''
      sampler:
        name: DistributedBatchSampler
        batch_size: 64
        drop_last: False
        shuffle: False
      loader:
        num_workers: 6
        use_shared_memory: True

    Gallery:
      dataset:
        name: "VeriWild"
        image_root: "./dataset/VeRI-Wild/images"
        cls_label_path: "./dataset/VeRI-Wild/train_test_split/test_3000_id.txt"
        transform_ops:
          - DecodeImage:
              to_rgb: True
              channel_first: False
          - ResizeImage:
              size: 224
          - NormalizeImage:
              scale: 0.00392157
              mean: [0.485, 0.456, 0.406]
              std: [0.229, 0.224, 0.225]
              order: ''
      sampler:
        name: DistributedBatchSampler
        batch_size: 64
        drop_last: False
        shuffle: False
      loader:
        num_workers: 6
        use_shared_memory: True

Metric:
  Eval:
    - Recallk:
        topk: [1, 5]
    - mAP: {}
# (excerpt from the PaddleClas Engine class)
from ppcls.data import create_operators
from ppcls.engine.train import train_epoch
from ppcls.engine import evaluation
from ppcls.arch.gears.identity_head import IdentityHead
from ppcls.engine.slim import get_pruner, get_quaner


class Engine(object):
    # ... (in __init__, right after the pretrained weights are loaded:)
    #         self.model, self.config["Global"]["pretrained_model"])

        # for slim
        self.pruner = get_pruner(self.config, self.model)
        self.quanter = get_quaner(self.config, self.model)

        # build optimizer
        if self.mode == 'train':
    # ...

    # ... (in export(), after loading weights via
    #      self.config["Global"]["pretrained_model"]:)
        model.eval()

        save_path = os.path.join(self.config["Global"]["save_inference_dir"],
                                 "inference")
        if self.quanter:
            self.quanter.save_quantized_model(
                model,
                save_path,
                input_spec=[
                    paddle.static.InputSpec(
                        shape=[None] + self.config["Global"]["image_shape"],
                        dtype='float32')
                ])
        else:
            model = paddle.jit.to_static(
                model,
                input_spec=[
                    paddle.static.InputSpec(
                        shape=[None] + self.config["Global"]["image_shape"],
                        dtype='float32')
                ])
            paddle.jit.save(model, save_path)


class ExportModel(nn.Layer):
    # ...
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from ppcls.engine.slim.prune import get_pruner
from ppcls.engine.slim.quant import get_quaner
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import, division, print_function

import paddle

from ppcls.utils import logger


def get_pruner(config, model):
    if config.get("Slim", False) and config["Slim"].get("prune", False):
        import paddleslim
        prune_method_name = config["Slim"]["prune"]["name"].lower()
        assert prune_method_name in [
            "fpgm", "l1_norm"
        ], "The prune methods only support 'fpgm' and 'l1_norm'"
        if prune_method_name == "fpgm":
            pruner = paddleslim.dygraph.FPGMFilterPruner(
                model, [1] + config["Global"]["image_shape"])
        else:
            pruner = paddleslim.dygraph.L1NormFilterPruner(
                model, [1] + config["Global"]["image_shape"])

        # prune model
        _prune_model(pruner, config, model)
    else:
        pruner = None
    return pruner


def _prune_model(pruner, config, model):
    from paddleslim.analysis import dygraph_flops as flops
    logger.info("FLOPs before pruning: {}GFLOPs".format(
        flops(model, [1] + config["Global"]["image_shape"]) / 1e9))
    model.eval()

    params = []
    for sublayer in model.sublayers():
        for param in sublayer.parameters(include_sublayers=False):
            if isinstance(sublayer, paddle.nn.Conv2D):
                params.append(param.name)
    ratios = {}
    for param in params:
        ratios[param] = config["Slim"]["prune"]["pruned_ratio"]
    plan = pruner.prune_vars(ratios, [0])

    logger.info("FLOPs after pruning: {}GFLOPs; pruned ratio: {}".format(
        flops(model, [1] + config["Global"]["image_shape"]) / 1e9,
        plan.pruned_flops))

    for param in model.parameters():
        if "conv2d" in param.name:
            logger.info("{}\t{}".format(param.name, param.shape))

    model.train()
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import, division, print_function

import paddle

from ppcls.utils import logger

QUANT_CONFIG = {
    # weight preprocess type, default is None and no preprocessing is performed.
    'weight_preprocess_type': None,
    # activation preprocess type, default is None and no preprocessing is performed.
    'activation_preprocess_type': None,
    # weight quantize type, default is 'channel_wise_abs_max'
    'weight_quantize_type': 'channel_wise_abs_max',
    # activation quantize type, default is 'moving_average_abs_max'
    'activation_quantize_type': 'moving_average_abs_max',
    # weight quantize bit num, default is 8
    'weight_bits': 8,
    # activation quantize bit num, default is 8
    'activation_bits': 8,
    # data type after quantization, such as 'uint8', 'int8', etc. default is 'int8'
    'dtype': 'int8',
    # window size for 'range_abs_max' quantization. default is 10000
    'window_size': 10000,
    # The decay coefficient of moving average, default is 0.9
    'moving_rate': 0.9,
    # for dygraph quantization, layers of type in quantizable_layer_type will be quantized
    'quantizable_layer_type': ['Conv2D', 'Linear'],
}


def get_quaner(config, model):
    if config.get("Slim", False) and config["Slim"].get("quant", False):
        from paddleslim.dygraph.quant import QAT
        assert config["Slim"]["quant"]["name"].lower(
        ) == 'pact', 'Only PACT quantization method is supported now'
        QUANT_CONFIG["activation_preprocess_type"] = "PACT"
        quanter = QAT(config=QUANT_CONFIG)
        quanter.quantize(model)
        logger.info("QAT model summary:")
        paddle.summary(model, (1, 3, 224, 224))
    else:
        quanter = None
    return quanter
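As a usage note, a hypothetical minimal driver for these two helpers might look like the following; the config dict and the stand-in model are assumptions, mirroring the `Slim` section of the yaml configs above.
```python
# A hypothetical minimal driver for get_pruner/get_quaner; the config dict
# and stand-in model are assumptions.
import paddle
from ppcls.engine.slim import get_pruner, get_quaner

config = {
    "Global": {"image_shape": [3, 224, 224]},
    "Slim": {"quant": {"name": "pact"}},  # or {"prune": {"name": "fpgm", "pruned_ratio": 0.3}}
}
model = paddle.vision.models.resnet50(num_classes=1000)  # stand-in for a ppcls arch

pruner = get_pruner(config, model)   # None here: no "prune" key in Slim
quanter = get_quaner(config, model)  # wraps the model with PACT fake-quant ops
```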