Unverified commit 2d4964bc, author: Chang Xu, committer: GitHub

Add more classification demo for ACT (#1188)

Parent 9a918144
......@@ -16,16 +16,40 @@
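This example uses the image classification model MobileNetV1 to show how to automatically compress an Inference deployment model from PaddleClas. The auto-compression strategies used in this example are quantization-aware training and distillation.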
## 2. Benchmark
- PaddlePaddle MobileNetV1 model
| Model | Strategy | Top-1 Acc | Latency (ms), threads=4 |
|:------:|:------:|:------:|:------:|
| MobileNetV1 | Base model | 70.90 | 39.041 |
| MobileNetV1 | Quantization + distillation | 70.49 | 29.238 |
### PaddleClas models
- Test environment: `SDM710 2*A75(2.2GHz) 6*A55(1.7GHz)`
- TensorFlow MobileNetV1 model
| Model | Strategy | Top-1 Acc | GPU latency (ms) | ARM CPU latency (ms) |
|:------:|:------:|:------:|:------:|:------:|
| MobileNetV1 | Baseline | 70.90 | - | 33.15 |
| MobileNetV1 | Quantization + distillation | 70.49 | - | 13.64 |
| ResNet50_vd | Baseline | 79.12 | 3.19 | - |
| ResNet50_vd | Quantization + distillation | 78.55 | 0.92 | - |
| ShuffleNetV2_x1_0 | Baseline | 68.65 | - | 10.43 |
| ShuffleNetV2_x1_0 | Quantization + distillation | 67.78 | - | 5.51 |
| SqueezeNet1_0_infer | Baseline | 59.60 | - | 35.98 |
| SqueezeNet1_0_infer | Quantization + distillation | 59.13 | - | 16.96 |
| PPLCNetV2_base | Baseline | 76.86 | - | 36.50 |
| PPLCNetV2_base | Quantization + distillation | 76.43 | - | 15.79 |
| PPHGNet_tiny | Baseline | 79.59 | 2.82 | - |
| PPHGNet_tiny | Quantization + distillation | 79.19 | 0.98 | - |
| EfficientNetB0 | Baseline | 77.02 | 1.95 | - |
| EfficientNetB0 | Quantization + distillation | 73.61 | 1.44 | - |
| GhostNet_x1_0 | Baseline | 74.02 | 2.93 | - |
| GhostNet_x1_0 | Quantization + distillation | 71.11 | 1.03 | - |
| InceptionV3 | Baseline | 79.14 | 4.79 | - |
| InceptionV3 | Quantization + distillation | 73.16 | 1.47 | - |
| MobileNetV3_large_x1_0 | Baseline | 75.32 | - | 16.62 |
| MobileNetV3_large_x1_0 | Quantization + distillation | 68.84 | - | 9.85 |
- ARM CPU test environment: `SDM865(4xA77+4xA55)`
- Nvidia GPU test environment:
  - Hardware: NVIDIA Tesla T4, single card
  - Software: CUDA 11.2, cuDNN 8.0, TensorRT 8.4
  - Test configuration: batch_size: 1, image size: 224
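The accuracy cost and latency gain of each compressed model can be read off the table by simple arithmetic. The sketch below hard-codes three rows copied from the table above; the names `ROWS` and `summarize` are illustrative and not part of the repository.

```python
# Rows copied from the PaddleClas table:
# (model, baseline Top-1, compressed Top-1, baseline ms, compressed ms)
ROWS = [
    ("MobileNetV1", 70.90, 70.49, 33.15, 13.64),  # ARM CPU latency
    ("ResNet50_vd", 79.12, 78.55, 3.19, 0.92),    # GPU latency
    ("InceptionV3", 79.14, 73.16, 4.79, 1.47),    # GPU latency
]


def summarize(row):
    """Return (model, Top-1 drop in points, latency speedup factor)."""
    name, base_acc, comp_acc, base_ms, comp_ms = row
    return name, round(base_acc - comp_acc, 2), round(base_ms / comp_ms, 2)


for name, drop, speedup in map(summarize, ROWS):
    print(f"{name}: -{drop:.2f} Top-1, {speedup:.2f}x faster")
```

Reading the rows this way makes the trade-off explicit: some models lose well under one point of Top-1 accuracy, while others (InceptionV3 in this sample) lose several points for a comparable speedup.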
### TensorFlow MobileNetV1 model
| Model | Strategy | Top-1 Acc | Latency (ms), threads=1 | Inference model |
|:------:|:------:|:------:|:------:|:------:|
......@@ -35,14 +59,8 @@
- Test environment: `Snapdragon 865 4*A77 4*A55`
Notes:
- The MobileNetV1 model comes from [tensorflow/models](http://download.tensorflow.org/models/mobilenet_v1_2018_02_22/mobilenet_v1_1.0_224.tgz). Steps to convert it into a MobileNetV1 inference model with the [X2Paddle](https://github.com/PaddlePaddle/X2Paddle) tool:
(1) Install X2Paddle version 1.3.6 or later (`pip install x2paddle`);
(2) Convert the model:
x2paddle --framework=tensorflow --model=tf_model.pb --save_dir=pd_model
- The MobileNetV1 model comes from [tensorflow/models](http://download.tensorflow.org/models/mobilenet_v1_2018_02_22/mobilenet_v1_1.0_224.tgz)
This produces the MobileNetV1 inference model (`model.pdmodel` and `model.pdiparams`). For a quick start, you can directly download the MobileNetV1 Base inference model from the table above.
## 3. Auto-compression workflow
......@@ -90,24 +108,11 @@ tar -xf MobileNetV1_infer.tar
```shell
# Single-GPU launch
export CUDA_VISIBLE_DEVICES=0
python run.py \
--model_dir='MobileNetV1_infer' \
--model_filename='inference.pdmodel' \
--params_filename='inference.pdiparams' \
--save_dir='./output' \
--batch_size=128 \
--config_path='./configs/mobilenetv1_qat_dis.yaml'\
--data_dir='ILSVRC2012'
# Multi-GPU launch
python -m paddle.distributed.launch run.py \
--model_dir='MobileNetV1_infer' \
--model_filename='inference.pdmodel' \
--params_filename='inference.pdiparams' \
--save_dir='./output' \
--batch_size=128 \
--config_path='./configs/mobilenetv1_qat_dis.yaml'\
--data_dir='ILSVRC2012'
export CUDA_VISIBLE_DEVICES=0,1,2,3
python run.py --save_dir='./save_quant_mobilev1/' --config_path='./configs/MobileNetV1/qat_dis.yaml'
```
......@@ -118,4 +123,3 @@ python -m paddle.distributed.launch run.py \
- [Paddle Lite deployment](https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.5/docs/deployment/lite/lite.md)
## 5. FAQ
[1.] If you hit the error ```ValueError: var inputs not in this block```, the model's input variable is not named ```inputs```. Use netron to visualize the model and find the name of the input variable, then change ```yield {"inputs": imgs}``` on line 35 of ```run.py``` to ```yield {${input_tensor_name}: imgs}```. For deployment models produced by PaddleClas, the input name is usually ```x``` when it is not ```inputs```.
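The idea behind this FAQ entry, and behind the `input_name` field in the configs, is that the feed dictionary key must match the model's input tensor name. Below is a minimal pure-Python sketch of such a wrapper; `fake_reader` is a hypothetical stand-in for the ImageNet reader, and the real `run.py` additionally converts the images to a NumPy array.

```python
def reader_wrapper(reader, input_name):
    """Wrap a batch reader so each batch is fed under `input_name`."""
    def gen():
        for batch in reader():
            # Each sample is (image, label); only images are fed for training.
            imgs = [sample[0] for sample in batch]
            yield {input_name: imgs}
    return gen


def fake_reader():
    # Hypothetical stand-in for the dataset reader: two batches of two samples.
    yield [([0.1, 0.2], 3), ([0.3, 0.4], 7)]
    yield [([0.5, 0.6], 1), ([0.7, 0.8], 2)]


# A model whose input tensor is named "x" rather than "inputs".
for feed in reader_wrapper(fake_reader, "x")():
    print(list(feed.keys()), len(feed["x"]))
```

Passing the name as a parameter avoids editing the source whenever a model uses a different input name.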
Global:
input_name: x
model_dir: EfficientNetB0_infer
model_filename: inference.pdmodel
params_filename: inference.pdiparams
batch_size: 32
data_dir: /ILSVRC2012
Distillation:
alpha: 1.0
loss: l2
node:
- softmax_1.tmp_0
ChannelPrune:
pruned_ratio: 0.25
prune_params_name:
- _blocks.0._se_reduce_weights
- _blocks.0._se_expand_weights
- _blocks.0._project_conv_weights
- _blocks.1._expand_conv_weights
- _blocks.1._se_reduce_weights
- _blocks.1._se_expand_weights
- _blocks.1._project_conv_weights
- _blocks.2._expand_conv_weights
- _blocks.2._se_reduce_weights
- _blocks.2._se_expand_weights
- _blocks.2._project_conv_weights
- _blocks.3._expand_conv_weights
- _blocks.3._se_reduce_weights
- _blocks.3._se_expand_weights
- _blocks.3._project_conv_weights
- _blocks.4._expand_conv_weights
- _blocks.4._se_reduce_weights
- _blocks.4._se_expand_weights
- _blocks.4._project_conv_weights
- _blocks.5._expand_conv_weights
- _blocks.5._se_reduce_weights
- _blocks.5._se_expand_weights
- _blocks.5._project_conv_weights
- _blocks.6._expand_conv_weights
- _blocks.6._se_reduce_weights
- _blocks.6._se_expand_weights
- _blocks.6._project_conv_weights
- _blocks.7._expand_conv_weights
- _blocks.7._se_reduce_weights
- _blocks.7._se_expand_weights
- _blocks.7._project_conv_weights
- _blocks.8._expand_conv_weights
- _blocks.8._se_reduce_weights
- _blocks.8._se_expand_weights
- _blocks.8._project_conv_weights
- _blocks.9._expand_conv_weights
- _blocks.9._se_reduce_weights
- _blocks.9._se_expand_weights
- _blocks.9._project_conv_weights
- _blocks.10._expand_conv_weights
- _blocks.10._se_reduce_weights
- _blocks.10._se_expand_weights
- _blocks.10._project_conv_weights
- _blocks.11._expand_conv_weights
- _blocks.11._se_reduce_weights
- _blocks.11._se_expand_weights
- _blocks.11._project_conv_weights
- _blocks.12._expand_conv_weights
- _blocks.12._se_reduce_weights
- _blocks.12._se_expand_weights
- _blocks.12._project_conv_weights
- _blocks.13._expand_conv_weights
- _blocks.13._se_reduce_weights
- _blocks.13._se_expand_weights
- _blocks.13._project_conv_weights
- _blocks.14._expand_conv_weights
- _blocks.14._se_reduce_weights
- _blocks.14._se_expand_weights
- _blocks.14._project_conv_weights
- _blocks.15._expand_conv_weights
- _blocks.15._se_reduce_weights
- _blocks.15._se_expand_weights
- _blocks.15._project_conv_weights
- _conv_head_weights
criterion: l1_norm
TrainConfig:
epochs: 1
eval_iter: 500
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.015
T_max: 500
optimizer_builder:
optimizer:
type: Momentum
weight_decay: 0.00002
origin_metric: 0.7738
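For intuition about the `CosineAnnealingDecay` settings in this config, the sketch below evaluates the standard closed-form cosine annealing formula with `learning_rate: 0.015` and `T_max: 500`. It is an illustration only: `eta_min=0.0` is an assumed default, and the function `cosine_annealing_lr` is not part of Paddle or PaddleSlim.

```python
import math


def cosine_annealing_lr(step, base_lr=0.015, t_max=500, eta_min=0.0):
    """Closed-form cosine annealing: base_lr decays to eta_min over t_max steps."""
    cosine = math.cos(math.pi * step / t_max)
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + cosine)


for step in (0, 125, 250, 375, 500):
    print(step, round(cosine_annealing_lr(step), 6))
```

A smaller `T_max` makes the learning rate reach its minimum sooner, which is why the value should be chosen with the expected number of scheduler steps in mind.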
Global:
input_name: x
model_dir: EfficientNetB0_infer
model_filename: inference.pdmodel
params_filename: inference.pdiparams
batch_size: 32
data_dir: /ILSVRC2012
Distillation:
alpha: 1.0
loss: l2
node:
- softmax_1.tmp_0
Quantization:
use_pact: true
activation_bits: 8
is_full_quantize: false
activation_quantize_type: range_abs_max
weight_quantize_type: channel_wise_abs_max
not_quant_pattern:
- skip_quant
quantize_op_types:
- conv2d
- depthwise_conv2d
weight_bits: 8
TrainConfig:
epochs: 1
eval_iter: 500
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.015
T_max: 5000
optimizer_builder:
optimizer:
type: Momentum
weight_decay: 0.00002
origin_metric: 0.7738
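The `weight_quantize_type: channel_wise_abs_max` setting with `weight_bits: 8` means each output channel gets its own scale derived from its largest absolute weight. The following pure-Python sketch shows that idea on a toy weight list; it is not PaddleSlim's implementation, which inserts quantization ops into the graph and trains through them, and both function names are hypothetical.

```python
def quantize_channel_wise_abs_max(weights, bits=8):
    """Quantize each output channel with its own abs-max scale.

    weights: list of channels, each a flat list of floats.
    Returns (integer channels, per-channel scales).
    """
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits
    quantized, scales = [], []
    for channel in weights:
        scale = max(abs(w) for w in channel) or 1.0  # guard all-zero channels
        quantized.append([round(w / scale * qmax) for w in channel])
        scales.append(scale)
    return quantized, scales


def dequantize(quantized, scales, bits=8):
    """Map integers back to floats using the per-channel scales."""
    qmax = 2 ** (bits - 1) - 1
    return [[q * s / qmax for q in ch] for ch, s in zip(quantized, scales)]


w = [[0.6, -1.0, 0.2], [0.03, 0.01, -0.04]]
q, s = quantize_channel_wise_abs_max(w)
print(q, s)
print(dequantize(q, s))
```

Per-channel scales keep small-magnitude channels (the second one here) from being crushed by a single scale dominated by the largest channel.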
Global:
input_name: inputs
model_dir: GhostNet_x1_0_infer
model_filename: inference.pdmodel
params_filename: inference.pdiparams
batch_size: 32
data_dir: /ILSVRC2012
Distillation:
alpha: 1.0
loss: l2
node:
- softmax_0.tmp_0
ChannelPrune:
pruned_ratio: 0.25
criterion: l1_norm
prune_params_name:
- conv1_weights
- _ghostbottleneck_0_ghost_module_1_primary_conv_weights
- _ghostbottleneck_0_ghost_module_2_primary_conv_weights
- _ghostbottleneck_1_ghost_module_1_primary_conv_weights
- _ghostbottleneck_1_ghost_module_2_primary_conv_weights
- _ghostbottleneck_1_shortcut_conv_weights
- _ghostbottleneck_2_ghost_module_1_primary_conv_weights
- _ghostbottleneck_2_ghost_module_2_primary_conv_weights
- _ghostbottleneck_3_ghost_module_1_primary_conv_weights
- _ghostbottleneck_3_ghost_module_2_primary_conv_weights
- _ghostbottleneck_3_shortcut_conv_weights
- _ghostbottleneck_4_ghost_module_1_primary_conv_weights
- _ghostbottleneck_4_ghost_module_2_primary_conv_weights
- _ghostbottleneck_5_ghost_module_1_primary_conv_weights
- _ghostbottleneck_5_ghost_module_2_primary_conv_weights
- _ghostbottleneck_5_shortcut_conv_weights
- _ghostbottleneck_6_ghost_module_1_primary_conv_weights
- _ghostbottleneck_6_ghost_module_2_primary_conv_weights
- _ghostbottleneck_7_ghost_module_1_primary_conv_weights
- _ghostbottleneck_7_ghost_module_2_primary_conv_weights
- _ghostbottleneck_8_ghost_module_1_primary_conv_weights
- _ghostbottleneck_8_ghost_module_2_primary_conv_weights
- _ghostbottleneck_9_ghost_module_1_primary_conv_weights
- _ghostbottleneck_9_ghost_module_2_primary_conv_weights
- _ghostbottleneck_9_shortcut_conv_weights
- _ghostbottleneck_10_ghost_module_1_primary_conv_weights
- _ghostbottleneck_10_ghost_module_2_primary_conv_weights
- _ghostbottleneck_11_ghost_module_1_primary_conv_weights
- _ghostbottleneck_11_ghost_module_2_primary_conv_weights
- _ghostbottleneck_11_shortcut_conv_weights
- _ghostbottleneck_12_ghost_module_1_primary_conv_weights
- _ghostbottleneck_12_ghost_module_2_primary_conv_weights
- _ghostbottleneck_13_ghost_module_1_primary_conv_weights
- _ghostbottleneck_13_ghost_module_2_primary_conv_weights
- _ghostbottleneck_14_ghost_module_1_primary_conv_weights
- _ghostbottleneck_14_ghost_module_2_primary_conv_weights
- _ghostbottleneck_15_ghost_module_1_primary_conv_weights
- _ghostbottleneck_15_ghost_module_2_primary_conv_weights
- conv_last_weights
- fc_0_weights
TrainConfig:
epochs: 1
eval_iter: 500
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.015
T_max: 5000
optimizer_builder:
optimizer:
type: Momentum
weight_decay: 0.00002
origin_metric: 0.7402
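The `ChannelPrune` section combines `pruned_ratio: 0.25` with `criterion: l1_norm`. A rough sketch of what an L1-norm criterion means is given below: filters with the smallest sum of absolute weights are selected for removal. This is a simplified illustration on toy data, with a hypothetical function name, and it ignores the dependency handling between layers that a real pruner needs.

```python
def select_channels_to_prune(filters, pruned_ratio=0.25):
    """Return indices of the filters with the smallest L1 norms.

    filters: list of output filters, each a flat list of floats.
    """
    l1_norms = [sum(abs(w) for w in f) for f in filters]
    num_pruned = int(len(filters) * pruned_ratio)
    ranked = sorted(range(len(filters)), key=lambda i: l1_norms[i])
    return sorted(ranked[:num_pruned])


filters = [
    [0.9, -0.8, 0.7],     # L1 norm 2.4
    [0.01, 0.02, -0.01],  # L1 norm 0.04, the weakest filter
    [0.5, 0.5, -0.5],     # L1 norm 1.5
    [-0.3, 0.2, 0.1],     # L1 norm 0.6
]
print(select_channels_to_prune(filters))
```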
Global:
input_name: inputs
model_dir: GhostNet_x1_0_infer
model_filename: inference.pdmodel
params_filename: inference.pdiparams
batch_size: 32
data_dir: /ILSVRC2012
Distillation:
alpha: 1.0
loss: l2
node:
- softmax_0.tmp_0
Quantization:
use_pact: true
activation_bits: 8
is_full_quantize: false
activation_quantize_type: range_abs_max
weight_quantize_type: channel_wise_abs_max
not_quant_pattern:
- skip_quant
quantize_op_types:
- conv2d
- depthwise_conv2d
weight_bits: 8
TrainConfig:
epochs: 1
eval_iter: 500
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.015
T_max: 5000
optimizer_builder:
optimizer:
type: Momentum
weight_decay: 0.00002
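The `Distillation` section names a node (here `softmax_0.tmp_0`) whose output in the compressed model is pulled toward the same node in the original model. The sketch below assumes that `loss: l2` is a mean squared error and that `alpha` is a multiplicative weight; the exact reduction used by PaddleSlim may differ, and the function name is hypothetical.

```python
def l2_distillation_loss(student_out, teacher_out, alpha=1.0):
    """Mean squared difference between student and teacher outputs, times alpha."""
    assert len(student_out) == len(teacher_out)
    squared = [(s - t) ** 2 for s, t in zip(student_out, teacher_out)]
    return alpha * sum(squared) / len(squared)


teacher = [0.7, 0.2, 0.1]  # teacher softmax output for one image
student = [0.6, 0.3, 0.1]  # output of the model being compressed
print(l2_distillation_loss(student, teacher))
```

Because the teacher is the uncompressed model itself, no labels are needed for this loss.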
Global:
input_name: x
model_dir: InceptionV3_infer
model_filename: inference.pdmodel
params_filename: inference.pdiparams
batch_size: 32
data_dir: /ILSVRC2012
Distillation:
alpha: 1.0
loss: l2
node:
- softmax_1.tmp_0
ChannelPrune:
pruned_ratio: 0.25
criterion: l1_norm
prune_params_name:
- conv2d_0.w_0
- conv2d_1.w_0
- conv2d_2.w_0
- conv2d_3.w_0
- conv2d_4.w_0
- conv2d_5.w_0
- conv2d_6.w_0
- conv2d_7.w_0
- conv2d_8.w_0
- conv2d_9.w_0
- conv2d_10.w_0
- conv2d_11.w_0
- conv2d_12.w_0
- conv2d_13.w_0
- conv2d_14.w_0
- conv2d_15.w_0
- conv2d_16.w_0
- conv2d_17.w_0
- conv2d_18.w_0
- conv2d_19.w_0
- conv2d_20.w_0
- conv2d_21.w_0
- conv2d_22.w_0
- conv2d_23.w_0
- conv2d_24.w_0
- conv2d_25.w_0
- conv2d_26.w_0
- conv2d_27.w_0
- conv2d_28.w_0
- conv2d_29.w_0
- conv2d_30.w_0
- conv2d_31.w_0
- conv2d_32.w_0
- conv2d_33.w_0
- conv2d_34.w_0
- conv2d_35.w_0
- conv2d_36.w_0
- conv2d_37.w_0
- conv2d_38.w_0
- conv2d_39.w_0
- conv2d_40.w_0
- conv2d_41.w_0
- conv2d_42.w_0
- conv2d_43.w_0
- conv2d_44.w_0
- conv2d_45.w_0
- conv2d_46.w_0
- conv2d_47.w_0
- conv2d_48.w_0
- conv2d_49.w_0
- conv2d_50.w_0
- conv2d_51.w_0
- conv2d_52.w_0
- conv2d_53.w_0
- conv2d_54.w_0
- conv2d_55.w_0
- conv2d_56.w_0
- conv2d_57.w_0
- conv2d_58.w_0
- conv2d_59.w_0
- conv2d_60.w_0
- conv2d_61.w_0
- conv2d_62.w_0
- conv2d_63.w_0
- conv2d_64.w_0
- conv2d_65.w_0
- conv2d_66.w_0
- conv2d_67.w_0
- conv2d_68.w_0
- conv2d_69.w_0
- conv2d_70.w_0
- conv2d_71.w_0
- conv2d_72.w_0
- conv2d_73.w_0
- conv2d_74.w_0
- conv2d_75.w_0
- conv2d_76.w_0
- conv2d_77.w_0
- conv2d_78.w_0
- conv2d_79.w_0
- conv2d_80.w_0
- conv2d_81.w_0
- conv2d_82.w_0
- conv2d_83.w_0
- conv2d_84.w_0
- conv2d_85.w_0
- conv2d_86.w_0
- conv2d_87.w_0
- conv2d_88.w_0
- conv2d_89.w_0
- conv2d_90.w_0
- conv2d_91.w_0
- conv2d_92.w_0
- conv2d_93.w_0
TrainConfig:
epochs: 1
eval_iter: 500
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.015
T_max: 5000
optimizer_builder:
optimizer:
type: Momentum
weight_decay: 0.00002
origin_metric: 0.7914
Global:
input_name: x
model_dir: InceptionV3_infer
model_filename: inference.pdmodel
params_filename: inference.pdiparams
batch_size: 32
data_dir: /ILSVRC2012
Distillation:
alpha: 10.0
loss: l2
node:
- softmax_1.tmp_0
Quantization:
is_full_quantize: false
activation_quantize_type: range_abs_max
weight_quantize_type: channel_wise_abs_max
not_quant_pattern:
- skip_quant
quantize_op_types:
- conv2d
- depthwise_conv2d
weight_bits: 8
TrainConfig:
epochs: 1
eval_iter: 500
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.015
T_max: 5000
optimizer_builder:
optimizer:
type: Momentum
weight_decay: 0.00002
origin_metric: 0.7914
Global:
input_name: inputs
model_dir: MobileNetV1_infer
model_filename: inference.pdmodel
params_filename: inference.pdiparams
batch_size: 32
data_dir: /ILSVRC2012
Distillation:
alpha: 1.0
loss: l2
node:
- softmax_0.tmp_0
UnstructurePrune:
prune_strategy: gmp
prune_mode: ratio
ratio: 0.75
gmp_config:
stable_iterations: 0
pruning_iterations: 4500
tunning_iterations: 4500
resume_iteration: -1
pruning_steps: 100
initial_ratio: 0.15
prune_params_type: conv1x1_only
local_sparsity: True
TrainConfig:
epochs: 1
eval_iter: 500
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.015
T_max: 10000
optimizer_builder:
optimizer:
type: Momentum
weight_decay: 0.00002
origin_metric: 0.70898
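The `gmp_config` block describes gradual magnitude pruning: sparsity starts at `initial_ratio` and rises to `ratio` over `pruning_iterations`, updated in `pruning_steps` stages. The sketch below uses the cubic schedule commonly associated with GMP. How PaddleSlim maps these keys onto its own schedule is an assumption here, the function name is hypothetical, and `tunning_iterations` (the fine-tuning phase at constant sparsity) is not modeled.

```python
def gmp_sparsity(iteration, final_ratio=0.75, initial_ratio=0.15,
                 stable_iterations=0, pruning_iterations=4500,
                 pruning_steps=100):
    """Cubic gradual-pruning schedule, updated in discrete stages."""
    if iteration < stable_iterations:
        return 0.0  # no pruning during the stable phase
    if iteration >= stable_iterations + pruning_iterations:
        return final_ratio  # target sparsity reached, held from here on
    period = pruning_iterations / pruning_steps  # iterations per stage
    stage = int((iteration - stable_iterations) // period)
    progress = stage / pruning_steps  # fraction of stages completed
    return final_ratio - (final_ratio - initial_ratio) * (1 - progress) ** 3


for it in (0, 450, 2250, 4455, 4500):
    print(it, round(gmp_sparsity(it), 4))
```

The cubic shape prunes aggressively early, when many weights are redundant, and slows down as the target sparsity approaches.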
Global:
input_name: inputs
model_dir: MobileNetV1_infer
model_filename: inference.pdmodel
params_filename: inference.pdiparams
batch_size: 32
data_dir: /workspace/dataset/ILSVRC2012
Distillation:
alpha: 1.0
loss: l2
node:
- softmax_0.tmp_0
Quantization:
use_pact: true
activation_bits: 8
is_full_quantize: false
activation_quantize_type: range_abs_max
weight_quantize_type: abs_max
weight_quantize_type: channel_wise_abs_max
not_quant_pattern:
- skip_quant
quantize_op_types:
......@@ -15,9 +26,12 @@ Quantization:
TrainConfig:
epochs: 1
eval_iter: 500
learning_rate: 0.004
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.015
T_max: 10000
optimizer_builder:
optimizer:
optimizer:
type: Momentum
weight_decay: 0.00003
weight_decay: 0.00002
origin_metric: 0.70898
Global:
input_name: inputs
model_dir: MobileNetV3_large_x1_0_infer
model_filename: inference.pdmodel
params_filename: inference.pdiparams
batch_size: 32
data_dir: /ILSVRC2012
Distillation:
alpha: 1.0
loss: l2
node:
- softmax_0.tmp_0
UnstructurePrune:
prune_strategy: gmp
prune_mode: ratio
ratio: 0.75
gmp_config:
stable_iterations: 0
pruning_iterations: 4500
tunning_iterations: 4500
resume_iteration: -1
pruning_steps: 100
initial_ratio: 0.15
prune_params_type: conv1x1_only
local_sparsity: True
TrainConfig:
epochs: 1
eval_iter: 500
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.015
T_max: 5000
optimizer_builder:
optimizer:
type: Momentum
weight_decay: 0.00002
origin_metric: 0.7532
Global:
input_name: inputs
model_dir: MobileNetV3_large_x1_0_infer
model_filename: inference.pdmodel
params_filename: inference.pdiparams
batch_size: 32
data_dir: /ILSVRC2012
Distillation:
alpha: 1.0
loss: l2
node:
- softmax_0.tmp_0
Quantization:
activation_bits: 8
is_full_quantize: false
use_pact: true
activation_quantize_type: range_abs_max
weight_quantize_type: channel_wise_abs_max
not_quant_pattern:
- skip_quant
quantize_op_types:
- conv2d
- depthwise_conv2d
weight_bits: 8
TrainConfig:
epochs: 1
eval_iter: 2000
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.0001
T_max: 5000
optimizer_builder:
optimizer:
type: Momentum
weight_decay: 0.00002
origin_metric: 0.7532
Global:
input_name: x
model_dir: PPLCNetV2_base_infer
model_filename: inference.pdmodel
params_filename: inference.pdiparams
batch_size: 32
data_dir: /ILSVRC2012
Distillation:
alpha: 1.0
loss: l2
node:
- softmax_1.tmp_0
UnstructurePrune:
prune_strategy: gmp
prune_mode: ratio
ratio: 0.75
gmp_config:
stable_iterations: 0
pruning_iterations: 4500
tunning_iterations: 4500
resume_iteration: -1
pruning_steps: 100
initial_ratio: 0.15
prune_params_type: conv1x1_only
local_sparsity: True
TrainConfig:
epochs: 1
eval_iter: 500
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.015
T_max: 5000
optimizer_builder:
optimizer:
type: Momentum
weight_decay: 0.00002
origin_metric: 0.7704
Global:
input_name: x
model_dir: PPLCNetV2_base_infer
model_filename: inference.pdmodel
params_filename: inference.pdiparams
batch_size: 32
data_dir: /ILSVRC2012
Distillation:
alpha: 1.0
loss: l2
node:
- softmax_1.tmp_0
Quantization:
use_pact: true
activation_bits: 8
is_full_quantize: false
activation_quantize_type: range_abs_max
weight_quantize_type: channel_wise_abs_max
not_quant_pattern:
- skip_quant
quantize_op_types:
- conv2d
- depthwise_conv2d
weight_bits: 8
TrainConfig:
epochs: 1
eval_iter: 500
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.015
T_max: 5000
optimizer_builder:
optimizer:
type: Momentum
weight_decay: 0.00002
origin_metric: 0.7704
Global:
input_name: x
model_dir: PPLCNet_x1_0_infer
model_filename: inference.pdmodel
params_filename: inference.pdiparams
batch_size: 32
data_dir: /ILSVRC2012
Distillation:
alpha: 1.0
loss: l2
node:
- softmax_1.tmp_0
UnstructurePrune:
prune_strategy: gmp
prune_mode: ratio
ratio: 0.75
gmp_config:
stable_iterations: 0
pruning_iterations: 4500
tunning_iterations: 4500
resume_iteration: -1
pruning_steps: 100
initial_ratio: 0.15
prune_params_type: conv1x1_only
local_sparsity: True
TrainConfig:
epochs: 1
eval_iter: 500
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.015
T_max: 5000
optimizer_builder:
optimizer:
type: Momentum
weight_decay: 0.00002
origin_metric: 0.7132
Global:
input_name: x
model_dir: PPLCNet_x1_0_infer
model_filename: inference.pdmodel
params_filename: inference.pdiparams
batch_size: 32
data_dir: /ILSVRC2012
Distillation:
alpha: 1.0
loss: l2
node:
- softmax_1.tmp_0
Quantization:
use_pact: true
activation_bits: 8
is_full_quantize: false
activation_quantize_type: range_abs_max
weight_quantize_type: channel_wise_abs_max
not_quant_pattern:
- skip_quant
quantize_op_types:
- conv2d
- depthwise_conv2d
weight_bits: 8
TrainConfig:
epochs: 1
eval_iter: 500
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.015
T_max: 10000
optimizer_builder:
optimizer:
type: Momentum
weight_decay: 0.00002
origin_metric: 0.7132
Global:
input_name: inputs
model_dir: ResNet50_vd_infer
model_filename: inference.pdmodel
params_filename: inference.pdiparams
batch_size: 32
data_dir: /ILSVRC2012
Distillation:
alpha: 1.0
loss: l2
node:
- softmax_0.tmp_0
ChannelPrune:
pruned_ratio: 0.25
criterion: l1_norm
prune_params_name:
- conv1_1_weights
- conv1_2_weights
- conv1_3_weights
- res2a_branch2a_weights
- res2a_branch2b_weights
- res2a_branch2c_weights
- res2a_branch1_weights
- res2b_branch2a_weights
- res2b_branch2b_weights
- res2b_branch2c_weights
- res2c_branch2a_weights
- res2c_branch2b_weights
- res2c_branch2c_weights
- res3a_branch2a_weights
- res3a_branch2b_weights
- res3a_branch2c_weights
- res3a_branch1_weights
- res3b_branch2a_weights
- res3b_branch2b_weights
- res3b_branch2c_weights
- res3c_branch2a_weights
- res3c_branch2b_weights
- res3c_branch2c_weights
- res3d_branch2a_weights
- res3d_branch2b_weights
- res3d_branch2c_weights
- res4a_branch2a_weights
- res4a_branch2b_weights
- res4a_branch2c_weights
- res4a_branch1_weights
- res4b_branch2a_weights
- res4b_branch2b_weights
- res4b_branch2c_weights
- res4c_branch2a_weights
- res4c_branch2b_weights
- res4c_branch2c_weights
- res4d_branch2a_weights
- res4d_branch2b_weights
- res4d_branch2c_weights
- res4e_branch2a_weights
- res4e_branch2b_weights
- res4e_branch2c_weights
- res4f_branch2a_weights
- res4f_branch2b_weights
- res4f_branch2c_weights
- res5a_branch2a_weights
- res5a_branch2b_weights
- res5a_branch2c_weights
- res5a_branch1_weights
- res5b_branch2a_weights
- res5b_branch2b_weights
- res5b_branch2c_weights
- res5c_branch2a_weights
- res5c_branch2b_weights
- res5c_branch2c_weights
TrainConfig:
epochs: 1
eval_iter: 500
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.015
T_max: 500
optimizer_builder:
optimizer:
type: Momentum
weight_decay: 0.00002
origin_metric: 0.7912
Global:
input_name: inputs
model_dir: ResNet50_vd_infer
model_filename: inference.pdmodel
params_filename: inference.pdiparams
batch_size: 32
data_dir: /ILSVRC2012
Distillation:
alpha: 1.0
loss: l2
node:
- softmax_0.tmp_0
Quantization:
use_pact: true
activation_bits: 8
is_full_quantize: false
activation_quantize_type: range_abs_max
weight_quantize_type: channel_wise_abs_max
not_quant_pattern:
- skip_quant
quantize_op_types:
- conv2d
- depthwise_conv2d
weight_bits: 8
TrainConfig:
epochs: 1
eval_iter: 500
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.015
T_max: 5000
optimizer_builder:
optimizer:
type: Momentum
weight_decay: 0.00002
origin_metric: 0.7912
Global:
input_name: inputs
model_dir: ShuffleNetV2_x1_0_infer
model_filename: inference.pdmodel
params_filename: inference.pdiparams
batch_size: 32
data_dir: /ILSVRC2012
Distillation:
alpha: 1.0
loss: l2
node:
- softmax_0.tmp_0
UnstructurePrune:
prune_strategy: gmp
prune_mode: ratio
ratio: 0.75
gmp_config:
stable_iterations: 0
pruning_iterations: 4500
tunning_iterations: 4500
resume_iteration: -1
pruning_steps: 100
initial_ratio: 0.15
prune_params_type: conv1x1_only
local_sparsity: True
TrainConfig:
epochs: 1
eval_iter: 500
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.015
T_max: 5000
optimizer_builder:
optimizer:
type: Momentum
weight_decay: 0.00002
origin_metric: 0.6880
Global:
input_name: inputs
model_dir: ShuffleNetV2_x1_0_infer
model_filename: inference.pdmodel
params_filename: inference.pdiparams
batch_size: 32
data_dir: /ILSVRC2012
Distillation:
alpha: 1.0
loss: l2
node:
- softmax_0.tmp_0
Quantization:
use_pact: true
activation_bits: 8
is_full_quantize: false
activation_quantize_type: range_abs_max
weight_quantize_type: channel_wise_abs_max
not_quant_pattern:
- skip_quant
quantize_op_types:
- conv2d
- depthwise_conv2d
weight_bits: 8
TrainConfig:
epochs: 1
eval_iter: 500
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.015
T_max: 5000
optimizer_builder:
optimizer:
type: Momentum
weight_decay: 0.00002
origin_metric: 0.6880
Global:
input_name: inputs
model_dir: SqueezeNet1_0_infer
model_filename: inference.pdmodel
params_filename: inference.pdiparams
batch_size: 32
data_dir: /ILSVRC2012
Distillation:
alpha: 1.0
loss: l2
node:
- softmax_0.tmp_0
UnstructurePrune:
prune_strategy: gmp
prune_mode: ratio
ratio: 0.75
gmp_config:
stable_iterations: 0
pruning_iterations: 4500
tunning_iterations: 4500
resume_iteration: -1
pruning_steps: 100
initial_ratio: 0.15
prune_params_type: conv1x1_only
local_sparsity: True
TrainConfig:
epochs: 1
eval_iter: 500
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.015
T_max: 5000
optimizer_builder:
optimizer:
type: Momentum
weight_decay: 0.00002
origin_metric: 0.596
Global:
input_name: inputs
model_dir: SqueezeNet1_0_infer
model_filename: inference.pdmodel
params_filename: inference.pdiparams
batch_size: 32
data_dir: /ILSVRC2012
Distillation:
alpha: 1.0
loss: l2
node:
- softmax_0.tmp_0
teacher_model_dir: SqueezeNet1_0_infer
teacher_model_filename: inference.pdmodel
teacher_params_filename: inference.pdiparams
Quantization:
activation_bits: 8
is_full_quantize: false
activation_quantize_type: range_abs_max
weight_quantize_type: channel_wise_abs_max
not_quant_pattern:
- skip_quant
quantize_op_types:
- conv2d
- depthwise_conv2d
weight_bits: 8
TrainConfig:
epochs: 1
eval_iter: 500
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.015
T_max: 5000
optimizer_builder:
optimizer:
type: Momentum
weight_decay: 0.00002
origin_metric: 0.596
\ No newline at end of file
Global:
input_name: inputs
model_dir: SwinTransformer_base_patch4_window7_224_infer
model_filename: inference.pdmodel
params_filename: inference.pdiparams
batch_size: 32
data_dir: /ILSVRC2012
Distillation:
alpha: 1.0
loss: l2
node:
- softmax_48.tmp_0
Quantization:
use_pact: true
activation_bits: 8
is_full_quantize: false
activation_quantize_type: range_abs_max
weight_quantize_type: channel_wise_abs_max
not_quant_pattern:
- skip_quant
quantize_op_types:
- conv2d
- depthwise_conv2d
weight_bits: 8
TrainConfig:
epochs: 1
eval_iter: 500
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.015
T_max: 5000
optimizer_builder:
optimizer:
type: Momentum
weight_decay: 0.00002
origin_metric: 0.83
......@@ -10,34 +10,38 @@ import numpy as np
import paddle
import paddle.nn as nn
from paddle.io import Dataset, BatchSampler, DataLoader
import imagenet_reader as pd_imagenet_reader
import tf_imagenet_reader
from paddleslim.auto_compression.config_helpers import load_config
import imagenet_reader as reader
from paddleslim.auto_compression.config_helpers import load_config as load_slim_config
from paddleslim.auto_compression import AutoCompression
from utility import add_arguments, print_arguments
parser = argparse.ArgumentParser(description=__doc__)
add_arg = functools.partial(add_arguments, argparser=parser)
# yapf: disable
add_arg('model_dir', str, None, "inference model directory.")
add_arg('model_filename', str, None, "inference model filename.")
add_arg('params_filename', str, None, "inference params filename.")
add_arg('save_dir', str, None, "directory to save compressed model.")
add_arg('batch_size', int, 1, "train batch size.")
add_arg('config_path', str, None, "path of compression strategy config.")
add_arg('data_dir', str, None, "path of dataset")
add_arg('input_name', str, "inputs", "input name of the model")
add_arg('input_shape', int, [3,224,224], "input shape of the model except batch_size", nargs='+')
add_arg('image_reader_type', str, "paddle", "the preprocess of data. choice in [\"paddle\", \"tensorflow\"]")
def argsparser():
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument(
'--config_path',
type=str,
default=None,
help="path of compression strategy config.",
required=True)
parser.add_argument(
'--save_dir',
type=str,
default='output',
help="directory to save compressed model.")
return parser
def print_arguments(args):
print('----------- Running Arguments -----------')
for arg, value in sorted(vars(args).items()):
print('%s: %s' % (arg, value))
print('------------------------------------------')
# yapf: enable
def reader_wrapper(reader, input_name, input_shape):
def reader_wrapper(reader, input_name):
def gen():
for i, data in enumerate(reader()):
imgs = np.float32([item[0] for item in data])
imgs = imgs.reshape([len(data)] + input_shape)
yield {input_name: imgs}
return gen
......@@ -50,9 +54,9 @@ def eval_reader(data_dir, batch_size):
def eval_function(exe, compiled_test_program, test_feed_names, test_fetch_list):
val_reader = eval_reader(data_dir, batch_size=args.batch_size)
val_reader = eval_reader(data_dir, batch_size=global_config['batch_size'])
image = paddle.static.data(
name=args.input_name, shape=[None] + args.input_shape, dtype='float32')
name=global_config['input_name'], shape=[None, 3, 224, 224], dtype='float32')
label = paddle.static.data(name='label', shape=[None, 1], dtype='int64')
results = []
......@@ -60,7 +64,7 @@ def eval_function(exe, compiled_test_program, test_feed_names, test_fetch_list):
# top1_acc, top5_acc
if len(test_feed_names) == 1:
image = np.array([[d[0]] for d in data])
image = image.reshape([len(data)] + args.input_shape)
image = image.reshape((len(data), 3, 224, 224))
label = [[d[1]] for d in data]
pred = exe.run(compiled_test_program,
feed={test_feed_names[0]: image},
......@@ -80,8 +84,7 @@ def eval_function(exe, compiled_test_program, test_feed_names, test_fetch_list):
else:
# eval "eval model", which inputs are image and label, output is top1 and top5 accuracy
image = np.array([[d[0]] for d in data])
image = image.reshape([len(data)] + args.input_shape)
label = [[d[1]] for d in data]
image = image.reshape((len(data), 3, 224, 224))
label = [[d[1]] for d in data]
result = exe.run(
compiled_test_program,
......@@ -96,35 +99,33 @@ def eval_function(exe, compiled_test_program, test_feed_names, test_fetch_list):
return result[0]
if __name__ == '__main__':
args = parser.parse_args()
print_arguments(args)
paddle.enable_static()
data_dir = args.data_dir
def main():
global global_config
all_config = load_slim_config(args.config_path)
assert "Global" in all_config, f"Key 'Global' not found in config file. \n{all_config}"
global_config = all_config["Global"]
global data_dir
data_dir = global_config['data_dir']
if args.image_reader_type == 'paddle':
reader = pd_imagenet_reader
elif args.image_reader_type == 'tensorflow':
reader = tf_imagenet_reader
else:
raise NotImplementedError(
"image_reader_type only can be set to paddle or tensorflow, but now is {}".
format(args.image_reader_type))
train_reader = paddle.batch(
reader.train(data_dir=data_dir), batch_size=args.batch_size)
train_dataloader = reader_wrapper(train_reader, args.input_name,
args.input_shape)
reader.train(data_dir=data_dir), batch_size=global_config['batch_size'])
train_dataloader = reader_wrapper(train_reader, global_config['input_name'])
ac = AutoCompression(
model_dir=args.model_dir,
model_filename=args.model_filename,
params_filename=args.params_filename,
model_dir=global_config['model_dir'],
model_filename=global_config['model_filename'],
params_filename=global_config['params_filename'],
save_dir=args.save_dir,
config=args.config_path,
config=all_config,
train_dataloader=train_dataloader,
eval_callback=eval_function,
eval_dataloader=reader_wrapper(
eval_reader(data_dir, args.batch_size), args.input_name,
args.input_shape))
eval_dataloader=reader_wrapper(eval_reader(data_dir, global_config['batch_size']), global_config['input_name']))
ac.compress()
if __name__ == '__main__':
paddle.enable_static()
parser = argsparser()
args = parser.parse_args()
print_arguments(args)
main()
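After this change, `run.py` reads the model and data settings from the `Global` section of the YAML file instead of from command-line flags. The sketch below shows one way to validate that section before use; it works on a plain dictionary so it runs without PaddleSlim, and `REQUIRED_GLOBAL_KEYS` and `get_global_config` are hypothetical names, not part of the script.

```python
# Keys that the updated run.py reads from the Global section.
REQUIRED_GLOBAL_KEYS = (
    "input_name", "model_dir", "model_filename",
    "params_filename", "batch_size", "data_dir",
)


def get_global_config(all_config):
    """Return the Global section after checking the keys run.py reads."""
    if "Global" not in all_config:
        raise KeyError("Key 'Global' not found in config file.")
    global_config = all_config["Global"]
    missing = [k for k in REQUIRED_GLOBAL_KEYS if k not in global_config]
    if missing:
        raise KeyError(f"Missing keys in 'Global': {missing}")
    return global_config


# Dictionary mirroring the Global section of the MobileNetV1 config.
example = {
    "Global": {
        "input_name": "inputs",
        "model_dir": "MobileNetV1_infer",
        "model_filename": "inference.pdmodel",
        "params_filename": "inference.pdiparams",
        "batch_size": 32,
        "data_dir": "/ILSVRC2012",
    }
}
print(get_global_config(example)["model_dir"])
```

Failing early with a list of missing keys is friendlier than a `KeyError` raised deep inside the data loader.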
......@@ -729,21 +729,24 @@ class AutoCompression:
test_program_info.feed_target_names,
test_program_info.fetch_targets)
_logger.info(
"epoch: {} metric of compressed model is: {:.6f}, best metric of compressed model is {:.6f}".
format(epoch_id, metric, best_metric))
if metric > best_metric:
paddle.static.save(
program=test_program_info.program._program,
model_path=os.path.join(self.tmp_dir,
'best_model'))
best_metric = metric
_logger.info(
"epoch: {} metric of compressed model is: {:.6f}, best metric of compressed model is {:.6f}".
format(epoch_id, metric, best_metric))
if self.metric_before_compressed is not None and float(
abs(best_metric -
self.metric_before_compressed)
) / self.metric_before_compressed <= 0.005:
break
else:
_logger.info(
"epoch: {} metric of compressed model is: {:.6f}, best metric of compressed model is {:.6f}".
format(epoch_id, metric, best_metric))
if train_config.target_metric is not None:
if metric > float(train_config.target_metric):
break
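The early-exit condition in this hunk stops training once the best metric of the compressed model is within 0.5% (relative) of the metric before compression. The standalone sketch below restates that check; `should_stop_early` is a hypothetical helper, not a PaddleSlim function.

```python
def should_stop_early(best_metric, metric_before_compressed, tolerance=0.005):
    """True when the relative gap to the original metric is within tolerance."""
    if metric_before_compressed is None:
        return False  # nothing to compare against
    gap = abs(best_metric - metric_before_compressed)
    return gap / metric_before_compressed <= tolerance


# With origin_metric 0.70898 (MobileNetV1), 0.7070 is close enough to stop,
# while 0.7049 is still outside the 0.5% band.
print(should_stop_early(0.7070, 0.70898))
print(should_stop_early(0.7049, 0.70898))
```

Because the gap is relative, the same 0.005 threshold is stricter in absolute terms for models with a lower baseline accuracy.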
......