Commit 2d4964bc (unverified)

Add more classification demo for ACT (#1188)

Authored by Chang Xu on Jun 24, 2022; committed via GitHub on Jun 24, 2022. Parent: 9a918144.

Showing 24 changed files with 1097 additions and 86 deletions (+1097 −86).
| Changed file | + | − |
|---|---:|---:|
| demo/auto_compression/image_classification/README.md | +37 | −33 |
| demo/auto_compression/image_classification/configs/EfficientNetB0/prune_dis.yaml | +93 | −0 |
| demo/auto_compression/image_classification/configs/EfficientNetB0/qat_dis.yaml | +37 | −0 |
| demo/auto_compression/image_classification/configs/GhostNet_x1_0/prune_dis.yaml | +69 | −0 |
| demo/auto_compression/image_classification/configs/GhostNet_x1_0/qat_dis.yaml | +36 | −0 |
| demo/auto_compression/image_classification/configs/InceptionV3/prune_dis.yaml | +123 | −0 |
| demo/auto_compression/image_classification/configs/InceptionV3/qat_dis.yaml | +35 | −0 |
| demo/auto_compression/image_classification/configs/MobileNetV1/prune_dis.yaml | +38 | −0 |
| demo/auto_compression/image_classification/configs/MobileNetV1/qat_dis.yaml | +37 | −0 |
| demo/auto_compression/image_classification/configs/MobileNetV3_large_x1_0/prune_dis.yaml | +38 | −0 |
| demo/auto_compression/image_classification/configs/MobileNetV3_large_x1_0/qat_dis.yaml | +37 | −0 |
| demo/auto_compression/image_classification/configs/PPLCNetV2_base/prune_dis.yaml | +38 | −0 |
| demo/auto_compression/image_classification/configs/PPLCNetV2_base/qat_dis.yaml | +37 | −0 |
| demo/auto_compression/image_classification/configs/PPLCNet_x1_0/prune_dis.yaml | +38 | −0 |
| demo/auto_compression/image_classification/configs/PPLCNet_x1_0/qat_dis.yaml | +37 | −0 |
| demo/auto_compression/image_classification/configs/ResNet50_vd/prune_dis.yaml | +84 | −0 |
| demo/auto_compression/image_classification/configs/ResNet50_vd/qat_dis.yaml | +37 | −0 |
| demo/auto_compression/image_classification/configs/ShuffleNetV2_x1_0/prune_dis.yaml | +38 | −0 |
| demo/auto_compression/image_classification/configs/ShuffleNetV2_x1_0/qat_dis.yaml | +37 | −0 |
| demo/auto_compression/image_classification/configs/SqueezeNet1_0/prune_dis.yaml | +38 | −0 |
| demo/auto_compression/image_classification/configs/SqueezeNet1_0/qat_dis.yaml | +39 | −0 |
| demo/auto_compression/image_classification/configs/SwinTransformer_base_patch4_window7_224/qat_dis.yaml | +37 | −0 |
| demo/auto_compression/image_classification/run.py | +50 | −49 |
| paddleslim/auto_compression/compressor.py | +7 | −4 |

demo/auto_compression/image_classification/README.md
@@ -16,16 +16,40 @@
This demo takes the image classification model MobileNetV1 as an example and shows how to apply automatic compression to an inference deployment model from PaddleClas. The auto-compression strategy used in this demo is quantization-aware training plus distillation.

## 2. Benchmark

- PaddlePaddle MobileNetV1 model

| Model | Strategy | Top-1 Acc | Latency (ms), threads=4 |
|:------:|:------:|:------:|:------:|
| MobileNetV1 | Base model | 70.90 | 39.041 |
| MobileNetV1 | Quantization + distillation | 70.49 | 29.238 |

### PaddleClas models

- Test environment: `SDM710 2*A75(2.2GHz) 6*A55(1.7GHz)`
- TensorFlow MobileNetV1 model

| Model | Strategy | Top-1 Acc | GPU latency (ms) | ARM CPU latency (ms) |
|:------:|:------:|:------:|:------:|:------:|
| MobileNetV1 | Baseline | 70.90 | - | 33.15 |
| MobileNetV1 | Quantization + distillation | 70.49 | - | 13.64 |
| ResNet50_vd | Baseline | 79.12 | 3.19 | - |
| ResNet50_vd | Quantization + distillation | 78.55 | 0.92 | - |
| ShuffleNetV2_x1_0 | Baseline | 68.65 | - | 10.43 |
| ShuffleNetV2_x1_0 | Quantization + distillation | 67.78 | - | 5.51 |
| SqueezeNet1_0_infer | Baseline | 59.60 | - | 35.98 |
| SqueezeNet1_0_infer | Quantization + distillation | 59.13 | - | 16.96 |
| PPLCNetV2_base | Baseline | 76.86 | - | 36.50 |
| PPLCNetV2_base | Quantization + distillation | 76.43 | - | 15.79 |
| PPHGNet_tiny | Baseline | 79.59 | 2.82 | - |
| PPHGNet_tiny | Quantization + distillation | 79.19 | 0.98 | - |
| EfficientNetB0 | Baseline | 77.02 | 1.95 | - |
| EfficientNetB0 | Quantization + distillation | 73.61 | 1.44 | - |
| GhostNet_x1_0 | Baseline | 74.02 | 2.93 | - |
| GhostNet_x1_0 | Quantization + distillation | 71.11 | 1.03 | - |
| InceptionV3 | Baseline | 79.14 | 4.79 | - |
| InceptionV3 | Quantization + distillation | 73.16 | 1.47 | - |
| MobileNetV3_large_x1_0 | Baseline | 75.32 | - | 16.62 |
| MobileNetV3_large_x1_0 | Quantization + distillation | 68.84 | - | 9.85 |

- ARM CPU test environment: `SDM865(4xA77+4xA55)`
- Nvidia GPU test environment:
- Hardware: NVIDIA Tesla T4, single card
- Software: CUDA 11.2, cuDNN 8.0, TensorRT 8.4
- Test configuration: batch_size: 1, image size: 224

### TensorFlow MobileNetV1 model

| Model | Strategy | Top-1 Acc | Latency (ms), threads=1 | Inference model |
|:------:|:------:|:------:|:------:|:------:|
@@ -35,14 +59,8 @@
- Test environment: `Snapdragon 865 4*A77 4*A55`

Notes:

- The MobileNetV1 model comes from [tensorflow/models](http://download.tensorflow.org/models/mobilenet_v1_2018_02_22/mobilenet_v1_1.0_224.tgz). To convert it into a MobileNetV1 inference model with the [X2Paddle](https://github.com/PaddlePaddle/X2Paddle) tool: (1) install X2Paddle 1.3.6 or later (`pip install x2paddle`); (2) convert the model: `x2paddle --framework=tensorflow --model=tf_model.pb --save_dir=pd_model`
- The MobileNetV1 model comes from [tensorflow/models](http://download.tensorflow.org/models/mobilenet_v1_2018_02_22/mobilenet_v1_1.0_224.tgz), which gives you the MobileNetV1 inference model (`model.pdmodel` and `model.pdiparams`). For a quick try, you can directly download the Base inference model of MobileNetV1 from the table above.

## 3. Auto-compression workflow
@@ -90,24 +108,11 @@ tar -xf MobileNetV1_infer.tar
```shell
# Launch on a single GPU
export CUDA_VISIBLE_DEVICES=0
python run.py \
    --model_dir='MobileNetV1_infer' \
    --model_filename='inference.pdmodel' \
    --params_filename='inference.pdiparams' \
    --save_dir='./output' \
    --batch_size=128 \
    --config_path='./configs/mobilenetv1_qat_dis.yaml' \
    --data_dir='ILSVRC2012'

# Launch with multiple GPUs
python -m paddle.distributed.launch run.py \
    --model_dir='MobileNetV1_infer' \
    --model_filename='inference.pdmodel' \
    --params_filename='inference.pdiparams' \
    --save_dir='./output' \
    --batch_size=128 \
    --config_path='./configs/mobilenetv1_qat_dis.yaml' \
    --data_dir='ILSVRC2012'

export CUDA_VISIBLE_DEVICES=0,1,2,3
python run.py --save_dir='./save_quant_mobilev1/' --config_path='./configs/MobileNetV1/qat_dis.yaml'
```
@@ -118,4 +123,3 @@ python -m paddle.distributed.launch run.py \
- [Paddle Lite deployment](https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.5/docs/deployment/lite/lite.md)

## 5. FAQ

1. If you hit the error ```ValueError: var inputs not in this block```, the model's input variable is not named ```inputs```. Visualize the model with netron to find the actual input name, then change ```yield {"inputs": imgs}``` on line 35 of ```run.py``` to ```yield {${input_tensor_name}: imgs}```. For deployment models exported by PaddleClas, if the input name is not ```inputs``` it is usually ```x```.
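As an illustration of the fix above, here is a minimal sketch of the generator pattern used by `reader_wrapper` in `run.py`, with the input tensor name passed in instead of hard-coding `"inputs"` (the default `"x"` and the shape below are just example values):

```python
import numpy as np

def reader_wrapper(reader, input_name="x", input_shape=(3, 224, 224)):
    # Wrap a batched reader so every batch is fed to the program under the
    # model's real input tensor name (the FAQ fix: not the literal "inputs").
    def gen():
        for data in reader():
            imgs = np.float32([item[0] for item in data])
            imgs = imgs.reshape([len(data)] + list(input_shape))
            yield {input_name: imgs}

    return gen
```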
demo/auto_compression/image_classification/configs/EfficientNetB0/prune_dis.yaml (new file, mode 100644)

```yaml
Global:
  input_name: x
  model_dir: EfficientNetB0_infer
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  batch_size: 32
  data_dir: /ILSVRC2012

Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_1.tmp_0

ChannelPrune:
  pruned_ratio: 0.25
  prune_params_name:
  - _blocks.0._se_reduce_weights
  - _blocks.0._se_expand_weights
  - _blocks.0._project_conv_weights
  - _blocks.1._expand_conv_weights
  - _blocks.1._se_reduce_weights
  - _blocks.1._se_expand_weights
  - _blocks.1._project_conv_weights
  - _blocks.2._expand_conv_weights
  - _blocks.2._se_reduce_weights
  - _blocks.2._se_expand_weights
  - _blocks.2._project_conv_weights
  - _blocks.3._expand_conv_weights
  - _blocks.3._se_reduce_weights
  - _blocks.3._se_expand_weights
  - _blocks.3._project_conv_weights
  - _blocks.4._expand_conv_weights
  - _blocks.4._se_reduce_weights
  - _blocks.4._se_expand_weights
  - _blocks.4._project_conv_weights
  - _blocks.5._expand_conv_weights
  - _blocks.5._se_reduce_weights
  - _blocks.5._se_expand_weights
  - _blocks.5._project_conv_weights
  - _blocks.6._expand_conv_weights
  - _blocks.6._se_reduce_weights
  - _blocks.6._se_expand_weights
  - _blocks.6._project_conv_weights
  - _blocks.7._expand_conv_weights
  - _blocks.7._se_reduce_weights
  - _blocks.7._se_expand_weights
  - _blocks.7._project_conv_weights
  - _blocks.8._expand_conv_weights
  - _blocks.8._se_reduce_weights
  - _blocks.8._se_expand_weights
  - _blocks.8._project_conv_weights
  - _blocks.9._expand_conv_weights
  - _blocks.9._se_reduce_weights
  - _blocks.9._se_expand_weights
  - _blocks.9._project_conv_weights
  - _blocks.10._expand_conv_weights
  - _blocks.10._se_reduce_weights
  - _blocks.10._se_expand_weights
  - _blocks.10._project_conv_weights
  - _blocks.11._expand_conv_weights
  - _blocks.11._se_reduce_weights
  - _blocks.11._se_expand_weights
  - _blocks.11._project_conv_weights
  - _blocks.12._expand_conv_weights
  - _blocks.12._se_reduce_weights
  - _blocks.12._se_expand_weights
  - _blocks.12._project_conv_weights
  - _blocks.13._expand_conv_weights
  - _blocks.13._se_reduce_weights
  - _blocks.13._se_expand_weights
  - _blocks.13._project_conv_weights
  - _blocks.14._expand_conv_weights
  - _blocks.14._se_reduce_weights
  - _blocks.14._se_expand_weights
  - _blocks.14._project_conv_weights
  - _blocks.15._expand_conv_weights
  - _blocks.15._se_reduce_weights
  - _blocks.15._se_expand_weights
  - _blocks.15._project_conv_weights
  - _conv_head_weights
  criterion: l1_norm

TrainConfig:
  epochs: 1
  eval_iter: 500
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 500
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.7738
```
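For context on how a config file like the one above is consumed, the sketch below follows the pattern of the updated `run.py` later in this commit: load the YAML, read its `Global` block, and hand everything to `AutoCompression`. This is only a sketch, not part of the diff; the dataloaders are assumed to be built elsewhere (in `run.py` they come from `reader_wrapper`), and `paddle.enable_static()` must already have been called.

```python
from paddleslim.auto_compression import AutoCompression
from paddleslim.auto_compression.config_helpers import load_config as load_slim_config


def compress_with_config(config_path, save_dir, train_dataloader, eval_dataloader=None):
    # e.g. config_path='./configs/EfficientNetB0/prune_dis.yaml'
    all_config = load_slim_config(config_path)
    assert "Global" in all_config, f"Key 'Global' not found in config file.\n{all_config}"
    global_config = all_config["Global"]

    ac = AutoCompression(
        model_dir=global_config['model_dir'],
        model_filename=global_config['model_filename'],
        params_filename=global_config['params_filename'],
        save_dir=save_dir,
        config=all_config,
        train_dataloader=train_dataloader,
        # run.py also passes eval_callback=eval_function alongside eval_dataloader.
        eval_dataloader=eval_dataloader)
    ac.compress()
```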
demo/auto_compression/image_classification/configs/EfficientNetB0/qat_dis.yaml (new file, mode 100644)

```yaml
Global:
  input_name: x
  model_dir: EfficientNetB0_infer
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  batch_size: 32
  data_dir: /ILSVRC2012

Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_1.tmp_0

Quantization:
  use_pact: true
  activation_bits: 8
  is_full_quantize: false
  activation_quantize_type: range_abs_max
  weight_quantize_type: channel_wise_abs_max
  not_quant_pattern:
  - skip_quant
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
  weight_bits: 8

TrainConfig:
  epochs: 1
  eval_iter: 500
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 5000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.7738
```
demo/auto_compression/image_classification/configs/GhostNet_x1_0/prune_dis.yaml (new file, mode 100644)

```yaml
Global:
  input_name: inputs
  model_dir: GhostNet_x1_0_infer
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  batch_size: 32
  data_dir: /ILSVRC2012

Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_0.tmp_0

ChannelPrune:
  pruned_ratio: 0.25
  criterion: l1_norm
  prune_params_name:
  - conv1_weights
  - _ghostbottleneck_0_ghost_module_1_primary_conv_weights
  - _ghostbottleneck_0_ghost_module_2_primary_conv_weights
  - _ghostbottleneck_1_ghost_module_1_primary_conv_weights
  - _ghostbottleneck_1_ghost_module_2_primary_conv_weights
  - _ghostbottleneck_1_shortcut_conv_weights
  - _ghostbottleneck_2_ghost_module_1_primary_conv_weights
  - _ghostbottleneck_2_ghost_module_2_primary_conv_weights
  - _ghostbottleneck_3_ghost_module_1_primary_conv_weights
  - _ghostbottleneck_3_ghost_module_2_primary_conv_weights
  - _ghostbottleneck_3_shortcut_conv_weights
  - _ghostbottleneck_4_ghost_module_1_primary_conv_weights
  - _ghostbottleneck_4_ghost_module_2_primary_conv_weights
  - _ghostbottleneck_5_ghost_module_1_primary_conv_weights
  - _ghostbottleneck_5_ghost_module_2_primary_conv_weights
  - _ghostbottleneck_5_shortcut_conv_weights
  - _ghostbottleneck_6_ghost_module_1_primary_conv_weights
  - _ghostbottleneck_6_ghost_module_2_primary_conv_weights
  - _ghostbottleneck_7_ghost_module_1_primary_conv_weights
  - _ghostbottleneck_7_ghost_module_2_primary_conv_weights
  - _ghostbottleneck_8_ghost_module_1_primary_conv_weights
  - _ghostbottleneck_8_ghost_module_2_primary_conv_weights
  - _ghostbottleneck_9_ghost_module_1_primary_conv_weights
  - _ghostbottleneck_9_ghost_module_2_primary_conv_weights
  - _ghostbottleneck_9_shortcut_conv_weights
  - _ghostbottleneck_10_ghost_module_1_primary_conv_weights
  - _ghostbottleneck_10_ghost_module_2_primary_conv_weights
  - _ghostbottleneck_11_ghost_module_1_primary_conv_weights
  - _ghostbottleneck_11_ghost_module_2_primary_conv_weights
  - _ghostbottleneck_11_shortcut_conv_weights
  - _ghostbottleneck_12_ghost_module_1_primary_conv_weights
  - _ghostbottleneck_12_ghost_module_2_primary_conv_weights
  - _ghostbottleneck_13_ghost_module_1_primary_conv_weights
  - _ghostbottleneck_13_ghost_module_2_primary_conv_weights
  - _ghostbottleneck_14_ghost_module_1_primary_conv_weights
  - _ghostbottleneck_14_ghost_module_2_primary_conv_weights
  - _ghostbottleneck_15_ghost_module_1_primary_conv_weights
  - _ghostbottleneck_15_ghost_module_2_primary_conv_weights
  - conv_last_weights
  - fc_0_weights

TrainConfig:
  epochs: 1
  eval_iter: 500
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 5000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.7402
```
demo/auto_compression/image_classification/configs/GhostNet_x1_0/qat_dis.yaml (new file, mode 100644)

```yaml
Global:
  input_name: inputs
  model_dir: GhostNet_x1_0_infer
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  batch_size: 32
  data_dir: /ILSVRC2012

Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_0.tmp_0

Quantization:
  use_pact: true
  activation_bits: 8
  is_full_quantize: false
  activation_quantize_type: range_abs_max
  weight_quantize_type: channel_wise_abs_max
  not_quant_pattern:
  - skip_quant
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
  weight_bits: 8

TrainConfig:
  epochs: 1
  eval_iter: 500
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 5000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
```
demo/auto_compression/image_classification/configs/InceptionV3/prune_dis.yaml (new file, mode 100644)

```yaml
Global:
  input_name: x
  model_dir: InceptionV3_infer
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  batch_size: 32
  data_dir: /ILSVRC2012

Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_1.tmp_0

ChannelPrune:
  pruned_ratio: 0.25
  criterion: l1_norm
  prune_params_name:
  - conv2d_0.w_0
  - conv2d_1.w_0
  - conv2d_2.w_0
  - conv2d_3.w_0
  - conv2d_4.w_0
  - conv2d_5.w_0
  - conv2d_6.w_0
  - conv2d_7.w_0
  - conv2d_8.w_0
  - conv2d_9.w_0
  - conv2d_10.w_0
  - conv2d_11.w_0
  - conv2d_12.w_0
  - conv2d_13.w_0
  - conv2d_14.w_0
  - conv2d_15.w_0
  - conv2d_16.w_0
  - conv2d_17.w_0
  - conv2d_18.w_0
  - conv2d_19.w_0
  - conv2d_20.w_0
  - conv2d_21.w_0
  - conv2d_22.w_0
  - conv2d_23.w_0
  - conv2d_24.w_0
  - conv2d_25.w_0
  - conv2d_26.w_0
  - conv2d_27.w_0
  - conv2d_28.w_0
  - conv2d_29.w_0
  - conv2d_30.w_0
  - conv2d_31.w_0
  - conv2d_32.w_0
  - conv2d_33.w_0
  - conv2d_34.w_0
  - conv2d_35.w_0
  - conv2d_36.w_0
  - conv2d_37.w_0
  - conv2d_38.w_0
  - conv2d_39.w_0
  - conv2d_40.w_0
  - conv2d_41.w_0
  - conv2d_42.w_0
  - conv2d_43.w_0
  - conv2d_44.w_0
  - conv2d_45.w_0
  - conv2d_46.w_0
  - conv2d_47.w_0
  - conv2d_48.w_0
  - conv2d_49.w_0
  - conv2d_50.w_0
  - conv2d_51.w_0
  - conv2d_52.w_0
  - conv2d_53.w_0
  - conv2d_54.w_0
  - conv2d_55.w_0
  - conv2d_56.w_0
  - conv2d_57.w_0
  - conv2d_58.w_0
  - conv2d_59.w_0
  - conv2d_60.w_0
  - conv2d_61.w_0
  - conv2d_62.w_0
  - conv2d_63.w_0
  - conv2d_64.w_0
  - conv2d_65.w_0
  - conv2d_66.w_0
  - conv2d_67.w_0
  - conv2d_68.w_0
  - conv2d_69.w_0
  - conv2d_70.w_0
  - conv2d_71.w_0
  - conv2d_72.w_0
  - conv2d_73.w_0
  - conv2d_74.w_0
  - conv2d_75.w_0
  - conv2d_76.w_0
  - conv2d_77.w_0
  - conv2d_78.w_0
  - conv2d_79.w_0
  - conv2d_80.w_0
  - conv2d_81.w_0
  - conv2d_82.w_0
  - conv2d_83.w_0
  - conv2d_84.w_0
  - conv2d_85.w_0
  - conv2d_86.w_0
  - conv2d_87.w_0
  - conv2d_88.w_0
  - conv2d_89.w_0
  - conv2d_90.w_0
  - conv2d_91.w_0
  - conv2d_92.w_0
  - conv2d_93.w_0

TrainConfig:
  epochs: 1
  eval_iter: 500
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 5000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.7914
```
demo/auto_compression/image_classification/configs/InceptionV3/qat_dis.yaml (new file, mode 100644)

```yaml
Global:
  input_name: x
  model_dir: InceptionV3_infer
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  batch_size: 32
  data_dir: /ILSVRC2012

Distillation:
  alpha: 10.0
  loss: l2
  node:
  - softmax_1.tmp_0

Quantization:
  is_full_quantize: false
  activation_quantize_type: range_abs_max
  weight_quantize_type: channel_wise_abs_max
  not_quant_pattern:
  - skip_quant
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
  weight_bits: 8

TrainConfig:
  epochs: 1
  eval_iter: 500
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 5000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.7914
```
demo/auto_compression/image_classification/configs/MobileNetV1/prune_dis.yaml (new file, mode 100644)

```yaml
Global:
  input_name: inputs
  model_dir: MobileNetV1_infer
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  batch_size: 32
  data_dir: /ILSVRC2012

Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_0.tmp_0

UnstructurePrune:
  prune_strategy: gmp
  prune_mode: ratio
  ratio: 0.75
  gmp_config:
    stable_iterations: 0
    pruning_iterations: 4500
    tunning_iterations: 4500
    resume_iteration: -1
    pruning_steps: 100
    initial_ratio: 0.15
  prune_params_type: conv1x1_only
  local_sparsity: True

TrainConfig:
  epochs: 1
  eval_iter: 500
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 10000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.70898
```
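For intuition about the `gmp_config` block above: a GMP (gradual magnitude pruning) schedule ramps sparsity from `initial_ratio` (0.15) to the target `ratio` (0.75) over `pruning_iterations`, updating every `pruning_iterations / pruning_steps` iterations, and then holds the target during the tuning iterations. The sketch below uses the cubic ramp common in the gradual-pruning literature; it is only an illustration, and PaddleSlim's exact schedule may differ.

```python
def gmp_sparsity(step, stable_iterations=0, pruning_iterations=4500,
                 pruning_steps=100, initial_ratio=0.15, ratio=0.75):
    """Illustrative sparsity schedule for the gmp_config values above."""
    if step < stable_iterations:
        return 0.0                     # warm-up phase: no pruning yet
    if step >= stable_iterations + pruning_iterations:
        return ratio                   # tuning phase: sparsity held at the target
    # Quantize progress to the most recent completed pruning step.
    interval = pruning_iterations / pruning_steps
    progress = ((step - stable_iterations) // interval) * interval / pruning_iterations
    # Cubic ramp from initial_ratio up to ratio (Zhu & Gupta style).
    return ratio + (initial_ratio - ratio) * (1.0 - progress) ** 3


# Sparsity at the start, midway through, and at the end of the ramp:
print(gmp_sparsity(0), gmp_sparsity(2250), gmp_sparsity(4500))  # 0.15, 0.675, 0.75
```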
demo/auto_compression/image_classification/configs/mobilenetv1_qat_dis.yaml → demo/auto_compression/image_classification/configs/MobileNetV1/qat_dis.yaml (renamed)

```diff
 Global:
   input_name: inputs
   model_dir: MobileNetV1_infer
   model_filename: inference.pdmodel
   params_filename: inference.pdiparams
   batch_size: 32
   data_dir: /workspace/dataset/ILSVRC2012

 Distillation:
   alpha: 1.0
   loss: l2
   node:
   - softmax_0.tmp_0

 Quantization:
   use_pact: true
   activation_bits: 8
   is_full_quantize: false
   activation_quantize_type: range_abs_max
-  weight_quantize_type: abs_max
+  weight_quantize_type: channel_wise_abs_max
   not_quant_pattern:
   - skip_quant
   quantize_op_types:
```

@@ -15,9 +26,12 @@ Quantization:

```diff
 TrainConfig:
   epochs: 1
   eval_iter: 500
-  learning_rate: 0.004
+  learning_rate:
+    type: CosineAnnealingDecay
+    learning_rate: 0.015
+    T_max: 10000
   optimizer_builder:
     optimizer:
       type: Momentum
-    weight_decay: 0.00003
+    weight_decay: 0.00002
   origin_metric: 0.70898
```
demo/auto_compression/image_classification/configs/MobileNetV3_large_x1_0/prune_dis.yaml (new file, mode 100644)

```yaml
Global:
  input_name: inputs
  model_dir: MobileNetV3_large_x1_0_infer
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  batch_size: 32
  data_dir: /ILSVRC2012

Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_0.tmp_0

UnstructurePrune:
  prune_strategy: gmp
  prune_mode: ratio
  ratio: 0.75
  gmp_config:
    stable_iterations: 0
    pruning_iterations: 4500
    tunning_iterations: 4500
    resume_iteration: -1
    pruning_steps: 100
    initial_ratio: 0.15
  prune_params_type: conv1x1_only
  local_sparsity: True

TrainConfig:
  epochs: 1
  eval_iter: 500
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 5000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.7532
```
demo/auto_compression/image_classification/configs/MobileNetV3_large_x1_0/qat_dis.yaml (new file, mode 100644)

```yaml
Global:
  input_name: inputs
  model_dir: MobileNetV3_large_x1_0_infer
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  batch_size: 32
  data_dir: /ILSVRC2012

Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_0.tmp_0

Quantization:
  activation_bits: 8
  is_full_quantize: false
  use_pact: true
  activation_quantize_type: range_abs_max
  weight_quantize_type: channel_wise_abs_max
  not_quant_pattern:
  - skip_quant
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
  weight_bits: 8

TrainConfig:
  epochs: 1
  eval_iter: 2000
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.0001
    T_max: 5000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.7532
```
demo/auto_compression/image_classification/configs/PPLCNetV2_base/prune_dis.yaml (new file, mode 100644)

```yaml
Global:
  input_name: x
  model_dir: PPLCNetV2_base_infer
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  batch_size: 32
  data_dir: /ILSVRC2012

Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_1.tmp_0

UnstructurePrune:
  prune_strategy: gmp
  prune_mode: ratio
  ratio: 0.75
  gmp_config:
    stable_iterations: 0
    pruning_iterations: 4500
    tunning_iterations: 4500
    resume_iteration: -1
    pruning_steps: 100
    initial_ratio: 0.15
  prune_params_type: conv1x1_only
  local_sparsity: True

TrainConfig:
  epochs: 1
  eval_iter: 500
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 5000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.7704
```
demo/auto_compression/image_classification/configs/PPLCNetV2_base/qat_dis.yaml (new file, mode 100644)

```yaml
Global:
  input_name: x
  model_dir: PPLCNetV2_base_infer
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  batch_size: 32
  data_dir: /ILSVRC2012

Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_1.tmp_0

Quantization:
  use_pact: true
  activation_bits: 8
  is_full_quantize: false
  activation_quantize_type: range_abs_max
  weight_quantize_type: channel_wise_abs_max
  not_quant_pattern:
  - skip_quant
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
  weight_bits: 8

TrainConfig:
  epochs: 1
  eval_iter: 500
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 5000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.7704
```
demo/auto_compression/image_classification/configs/PPLCNet_x1_0/prune_dis.yaml (new file, mode 100644)

```yaml
Global:
  input_name: x
  model_dir: PPLCNet_x1_0_infer
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  batch_size: 32
  data_dir: /ILSVRC2012

Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_1.tmp_0

UnstructurePrune:
  prune_strategy: gmp
  prune_mode: ratio
  ratio: 0.75
  gmp_config:
    stable_iterations: 0
    pruning_iterations: 4500
    tunning_iterations: 4500
    resume_iteration: -1
    pruning_steps: 100
    initial_ratio: 0.15
  prune_params_type: conv1x1_only
  local_sparsity: True

TrainConfig:
  epochs: 1
  eval_iter: 500
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 5000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.7132
```
demo/auto_compression/image_classification/configs/PPLCNet_x1_0/qat_dis.yaml (new file, mode 100644)

```yaml
Global:
  input_name: x
  model_dir: PPLCNet_x1_0_infer
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  batch_size: 32
  data_dir: /ILSVRC2012

Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_1.tmp_0

Quantization:
  use_pact: true
  activation_bits: 8
  is_full_quantize: false
  activation_quantize_type: range_abs_max
  weight_quantize_type: channel_wise_abs_max
  not_quant_pattern:
  - skip_quant
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
  weight_bits: 8

TrainConfig:
  epochs: 1
  eval_iter: 500
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 10000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.7132
```
demo/auto_compression/image_classification/configs/ResNet50_vd/prune_dis.yaml (new file, mode 100644)

```yaml
Global:
  input_name: inputs
  model_dir: ResNet50_vd_infer
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  batch_size: 32
  data_dir: /ILSVRC2012

Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_0.tmp_0

ChannelPrune:
  pruned_ratio: 0.25
  criterion: l1_norm
  prune_params_name:
  - conv1_1_weights
  - conv1_2_weights
  - conv1_3_weights
  - res2a_branch2a_weights
  - res2a_branch2b_weights
  - res2a_branch2c_weights
  - res2a_branch1_weights
  - res2b_branch2a_weights
  - res2b_branch2b_weights
  - res2b_branch2c_weights
  - res2c_branch2a_weights
  - res2c_branch2b_weights
  - res2c_branch2c_weights
  - res3a_branch2a_weights
  - res3a_branch2b_weights
  - res3a_branch2c_weights
  - res3a_branch1_weights
  - res3b_branch2a_weights
  - res3b_branch2b_weights
  - res3b_branch2c_weights
  - res3c_branch2a_weights
  - res3c_branch2b_weights
  - res3c_branch2c_weights
  - res3d_branch2a_weights
  - res3d_branch2b_weights
  - res3d_branch2c_weights
  - res4a_branch2a_weights
  - res4a_branch2b_weights
  - res4a_branch2c_weights
  - res4a_branch1_weights
  - res4b_branch2a_weights
  - res4b_branch2b_weights
  - res4b_branch2c_weights
  - res4c_branch2a_weights
  - res4c_branch2b_weights
  - res4c_branch2c_weights
  - res4d_branch2a_weights
  - res4d_branch2b_weights
  - res4d_branch2c_weights
  - res4e_branch2a_weights
  - res4e_branch2b_weights
  - res4e_branch2c_weights
  - res4f_branch2a_weights
  - res4f_branch2b_weights
  - res4f_branch2c_weights
  - res5a_branch2a_weights
  - res5a_branch2b_weights
  - res5a_branch2c_weights
  - res5a_branch1_weights
  - res5b_branch2a_weights
  - res5b_branch2b_weights
  - res5b_branch2c_weights
  - res5c_branch2a_weights
  - res5c_branch2b_weights
  - res5c_branch2c_weights

TrainConfig:
  epochs: 1
  eval_iter: 500
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 500
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.7912
```
demo/auto_compression/image_classification/configs/ResNet50_vd/qat_dis.yaml (new file, mode 100644)

```yaml
Global:
  input_name: inputs
  model_dir: ResNet50_vd_infer
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  batch_size: 32
  data_dir: /ILSVRC2012

Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_0.tmp_0

Quantization:
  use_pact: true
  activation_bits: 8
  is_full_quantize: false
  activation_quantize_type: range_abs_max
  weight_quantize_type: channel_wise_abs_max
  not_quant_pattern:
  - skip_quant
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
  weight_bits: 8

TrainConfig:
  epochs: 1
  eval_iter: 500
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 5000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.7912
```
demo/auto_compression/image_classification/configs/ShuffleNetV2_x1_0/prune_dis.yaml (new file, mode 100644)

```yaml
Global:
  input_name: inputs
  model_dir: ShuffleNetV2_x1_0_infer
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  batch_size: 32
  data_dir: /ILSVRC2012

Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_0.tmp_0

UnstructurePrune:
  prune_strategy: gmp
  prune_mode: ratio
  ratio: 0.75
  gmp_config:
    stable_iterations: 0
    pruning_iterations: 4500
    tunning_iterations: 4500
    resume_iteration: -1
    pruning_steps: 100
    initial_ratio: 0.15
  prune_params_type: conv1x1_only
  local_sparsity: True

TrainConfig:
  epochs: 1
  eval_iter: 500
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 5000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.6880
```
demo/auto_compression/image_classification/configs/ShuffleNetV2_x1_0/qat_dis.yaml (new file, mode 100644)

```yaml
Global:
  input_name: inputs
  model_dir: ShuffleNetV2_x1_0_infer
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  batch_size: 32
  data_dir: /ILSVRC2012

Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_0.tmp_0

Quantization:
  use_pact: true
  activation_bits: 8
  is_full_quantize: false
  activation_quantize_type: range_abs_max
  weight_quantize_type: channel_wise_abs_max
  not_quant_pattern:
  - skip_quant
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
  weight_bits: 8

TrainConfig:
  epochs: 1
  eval_iter: 500
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 5000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.6880
```
demo/auto_compression/image_classification/configs/SqueezeNet1_0/prune_dis.yaml (new file, mode 100644)

```yaml
Global:
  input_name: inputs
  model_dir: SqueezeNet1_0_infer
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  batch_size: 32
  data_dir: /ILSVRC2012

Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_0.tmp_0

UnstructurePrune:
  prune_strategy: gmp
  prune_mode: ratio
  ratio: 0.75
  gmp_config:
    stable_iterations: 0
    pruning_iterations: 4500
    tunning_iterations: 4500
    resume_iteration: -1
    pruning_steps: 100
    initial_ratio: 0.15
  prune_params_type: conv1x1_only
  local_sparsity: True

TrainConfig:
  epochs: 1
  eval_iter: 500
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 5000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.596
```
demo/auto_compression/image_classification/configs/SqueezeNet1_0/qat_dis.yaml (new file, mode 100644)

```yaml
Global:
  input_name: inputs
  model_dir: SqueezeNet1_0_infer
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  batch_size: 32
  data_dir: /ILSVRC2012

Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_0.tmp_0
  teacher_model_dir: SqueezeNet1_0_infer
  teacher_model_filename: inference.pdmodel
  teacher_params_filename: inference.pdiparams

Quantization:
  activation_bits: 8
  is_full_quantize: false
  activation_quantize_type: range_abs_max
  weight_quantize_type: channel_wise_abs_max
  not_quant_pattern:
  - skip_quant
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
  weight_bits: 8

TrainConfig:
  epochs: 1
  eval_iter: 500
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 5000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.596
```
\ No newline at end of file
demo/auto_compression/image_classification/configs/SwinTransformer_base_patch4_window7_224/qat_dis.yaml (new file, mode 100644)

```yaml
Global:
  input_name: inputs
  model_dir: SwinTransformer_base_patch4_window7_224_infer
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  batch_size: 32
  data_dir: /ILSVRC2012

Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_48.tmp_0

Quantization:
  use_pact: true
  activation_bits: 8
  is_full_quantize: false
  activation_quantize_type: range_abs_max
  weight_quantize_type: channel_wise_abs_max
  not_quant_pattern:
  - skip_quant
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
  weight_bits: 8

TrainConfig:
  epochs: 1
  eval_iter: 500
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 5000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.83
```
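Each config above points `Distillation.node` at a model-specific output tensor (`softmax_0.tmp_0`, `softmax_1.tmp_0`, `softmax_48.tmp_0`, ...) and `Global.input_name` at the model's feed name (`x` or `inputs`). The helper below is not part of this commit; it is a hedged sketch of how those names can be discovered from an inference model, assuming the corresponding `*_infer` directory has been downloaded and extracted.

```python
import paddle

paddle.enable_static()
exe = paddle.static.Executor(paddle.CPUPlace())

# Path prefix for MobileNetV1_infer/inference.pdmodel + inference.pdiparams.
program, feed_names, fetch_targets = paddle.static.load_inference_model(
    'MobileNetV1_infer/inference', exe)

print("feed names (Global.input_name):", feed_names)
for op in program.global_block().ops:
    if op.type == 'softmax':
        # Candidate Distillation.node values, e.g. softmax_0.tmp_0.
        print("softmax output:", op.output_arg_names)
```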
demo/auto_compression/image_classification/run.py

@@ -10,34 +10,38 @@ import numpy as np

```python
import paddle
import paddle.nn as nn
from paddle.io import Dataset, BatchSampler, DataLoader
import imagenet_reader as pd_imagenet_reader
import tf_imagenet_reader
from paddleslim.auto_compression.config_helpers import load_config
import imagenet_reader as reader
from paddleslim.auto_compression.config_helpers import load_config as load_slim_config
from paddleslim.auto_compression import AutoCompression
from utility import add_arguments, print_arguments

parser = argparse.ArgumentParser(description=__doc__)
add_arg = functools.partial(add_arguments, argparser=parser)

# yapf: disable
add_arg('model_dir',         str,    None,          "inference model directory.")
add_arg('model_filename',    str,    None,          "inference model filename.")
add_arg('params_filename',   str,    None,          "inference params filename.")
add_arg('save_dir',          str,    None,          "directory to save compressed model.")
add_arg('batch_size',        int,    1,             "train batch size.")
add_arg('config_path',       str,    None,          "path of compression strategy config.")
add_arg('data_dir',          str,    None,          "path of dataset")
add_arg('input_name',        str,    "inputs",      "input name of the model")
add_arg('input_shape',       int,    [3, 224, 224], "input shape of the model except batch_size", nargs='+')
add_arg('image_reader_type', str,    "paddle",      "the preprocess of data. choice in [\"paddle\", \"tensorflow\"]")


def argsparser():
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument(
        '--config_path',
        type=str,
        default=None,
        help="path of compression strategy config.",
        required=True)
    parser.add_argument(
        '--save_dir',
        type=str,
        default='output',
        help="directory to save compressed model.")
    return parser


def print_arguments(args):
    print('----------- Running Arguments -----------')
    for arg, value in sorted(vars(args).items()):
        print('%s: %s' % (arg, value))
    print('------------------------------------------')
# yapf: enable


def reader_wrapper(reader, input_name, input_shape):
def reader_wrapper(reader, input_name):
    def gen():
        for i, data in enumerate(reader()):
            imgs = np.float32([item[0] for item in data])
            imgs = imgs.reshape([len(data)] + input_shape)
            yield {input_name: imgs}

    return gen
```

@@ -50,9 +54,9 @@ def eval_reader(data_dir, batch_size):

```python
def eval_function(exe, compiled_test_program, test_feed_names, test_fetch_list):
    val_reader = eval_reader(data_dir, batch_size=args.batch_size)
    val_reader = eval_reader(data_dir, batch_size=global_config['batch_size'])
    image = paddle.static.data(
        name=args.input_name, shape=[None] + args.input_shape, dtype='float32')
        name=global_config['input_name'], shape=[None, 3, 224, 224], dtype='float32')
    label = paddle.static.data(name='label', shape=[None, 1], dtype='int64')

    results = []
```

@@ -60,7 +64,7 @@ def eval_function(exe, compiled_test_program, test_feed_names, test_fetch_list):

```python
        # top1_acc, top5_acc
        if len(test_feed_names) == 1:
            image = np.array([[d[0]] for d in data])
            image = image.reshape([len(data)] + args.input_shape)
            image = image.reshape((len(data), 3, 224, 224))
            label = [[d[1]] for d in data]
            pred = exe.run(compiled_test_program,
                           feed={test_feed_names[0]: image},
```

@@ -80,8 +84,7 @@ def eval_function(exe, compiled_test_program, test_feed_names, test_fetch_list):

```python
        else:
            # eval "eval model", which inputs are image and label, output is top1 and top5 accuracy
            image = np.array([[d[0]] for d in data])
            image = image.reshape([len(data)] + args.input_shape)
            label = [[d[1]] for d in data]
            image = image.reshape((len(data), 3, 224, 224))
            label = [[d[1]] for d in data]
            result = exe.run(compiled_test_program,
```

@@ -96,35 +99,33 @@ def eval_function(exe, compiled_test_program, test_feed_names, test_fetch_list):

```python
    return result[0]


if __name__ == '__main__':
    args = parser.parse_args()
    print_arguments(args)
    paddle.enable_static()
    data_dir = args.data_dir


def main():
    global global_config
    all_config = load_slim_config(args.config_path)
    assert "Global" in all_config, f"Key 'Global' not found in config file. \n{all_config}"
    global_config = all_config["Global"]

    global data_dir
    data_dir = global_config['data_dir']

    if args.image_reader_type == 'paddle':
        reader = pd_imagenet_reader
    elif args.image_reader_type == 'tensorflow':
        reader = tf_imagenet_reader
    else:
        raise NotImplementedError(
            "image_reader_type only can be set to paddle or tensorflow, but now is {}".
            format(args.image_reader_type))

    train_reader = paddle.batch(
        reader.train(data_dir=data_dir), batch_size=args.batch_size)
    train_dataloader = reader_wrapper(train_reader, args.input_name, args.input_shape)
        reader.train(data_dir=data_dir), batch_size=global_config['batch_size'])
    train_dataloader = reader_wrapper(train_reader, global_config['input_name'])

    ac = AutoCompression(
        model_dir=args.model_dir,
        model_filename=args.model_filename,
        params_filename=args.params_filename,
        model_dir=global_config['model_dir'],
        model_filename=global_config['model_filename'],
        params_filename=global_config['params_filename'],
        save_dir=args.save_dir,
        config=args.config_path,
        config=all_config,
        train_dataloader=train_dataloader,
        eval_callback=eval_function,
        eval_dataloader=reader_wrapper(
            eval_reader(data_dir, args.batch_size), args.input_name, args.input_shape))
        eval_dataloader=reader_wrapper(
            eval_reader(data_dir, global_config['batch_size']), global_config['input_name']))

    ac.compress()


if __name__ == '__main__':
    paddle.enable_static()
    parser = argsparser()
    args = parser.parse_args()
    print_arguments(args)
    main()
```
paddleslim/auto_compression/compressor.py

@@ -729,21 +729,24 @@ class AutoCompression:

```python
                test_program_info.feed_target_names,
                test_program_info.fetch_targets)
            _logger.info(
                "epoch: {} metric of compressed model is: {:.6f}, best metric of compressed model is {:.6f}".
                format(epoch_id, metric, best_metric))
            if metric > best_metric:
                paddle.static.save(
                    program=test_program_info.program._program,
                    model_path=os.path.join(self.tmp_dir, 'best_model'))
                best_metric = metric
                _logger.info(
                    "epoch: {} metric of compressed model is: {:.6f}, best metric of compressed model is {:.6f}".
                    format(epoch_id, metric, best_metric))
                if self.metric_before_compressed is not None and float(
                        abs(best_metric - self.metric_before_compressed)
                ) / self.metric_before_compressed <= 0.005:
                    break
            else:
                _logger.info(
                    "epoch: {} metric of compressed model is: {:.6f}, best metric of compressed model is {:.6f}".
                    format(epoch_id, metric, best_metric))
            if train_config.target_metric is not None:
                if metric > float(train_config.target_metric):
                    break
```
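The condition added in this hunk stops compression training early once the best metric of the compressed model is within 0.5% (relative) of `metric_before_compressed`, the metric measured before compression. A standalone illustration of that check (the helper name and example numbers are hypothetical):

```python
def close_enough(best_metric, metric_before_compressed, rel_tol=0.005):
    """Mirror of the early-stop check above: True when the compressed model's
    best metric is within rel_tol (relative) of the pre-compression metric."""
    if metric_before_compressed is None:
        return False
    return abs(best_metric - metric_before_compressed) / metric_before_compressed <= rel_tol


# A 0.7090 baseline and a 0.7060 compressed metric differ by about 0.42% relative,
# so the search loop would stop early:
print(close_enough(0.7060, 0.7090))  # True
```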