add YOLOv8 ACT demo (#7624)

ac3b6f85 · Guanghua Yu · GitHub · 1e62f011
13 changed files
--- a/deploy/auto_compression/README.md
+++ b/deploy/auto_compression/README.md
@@ -17,43 +17,52 @@

 ## 2.Benchmark

-### PP-YOLOE
+### PP-YOLOE+

 | Model | Base mAP | Offline quant mAP | ACT quant mAP | TRT-FP32 | TRT-FP16 | TRT-INT8 | Config | Quantized model |
 | :-------- |:-------- |:--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :----------------------: | :---------------------: |
-| PP-YOLOE-l | 50.9  |  - | 50.6  |   11.2ms  |   7.7ms   |  **6.7ms**  |  [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/deploy/auto_compression/configs/ppyoloe_l_qat_dis.yaml) | [Quant Model](https://bj.bcebos.com/v1/paddle-slim-models/act/ppyoloe_crn_l_300e_coco_quant.tar) |
+| PP-YOLOE+_s	 | 43.7  |  - | 42.9  |   -  |   -   |  -  |  [config](./configs/ppyoloe_plus_s_qat_dis.yaml) | [Quant Model](https://bj.bcebos.com/v1/paddledet/deploy/Inference/ppyoloe_plus_s_qat_dis.tar) |
+| PP-YOLOE+_m | 49.8  |  - | 49.3  |   -  |   -   |  -  |  [config](./configs/ppyoloe_plus_m_qat_dis.yaml) | [Quant Model](https://bj.bcebos.com/v1/paddledet/deploy/Inference/ppyoloe_plus_m_qat_dis.tar) |
+| PP-YOLOE+_l | 52.9  |  - | 52.6  |   -  |   -   |  -  |  [config](./configs/ppyoloe_plus_l_qat_dis.yaml) | [Quant Model](https://bj.bcebos.com/v1/paddledet/deploy/Inference/ppyoloe_plus_l_qat_dis.tar) |
+| PP-YOLOE+_x | 54.7  |  - | 54.4  |   -  |   -   |  -  |  [config](./configs/ppyoloe_plus_x_qat_dis.yaml) | [Quant Model](https://bj.bcebos.com/v1/paddledet/deploy/Inference/ppyoloe_plus_x_qat_dis.tar) |

 - All mAP figures are evaluated on the COCO val2017 dataset, IoU=0.5:0.95.
-- The PP-YOLOE-l model was tested on a Tesla V100 GPU with TensorRT enabled, batch_size=1, NMS included; the test script is the [benchmark demo](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4/deploy/python).

-### PP-PicoDet
+### YOLOv8

-| Model | Strategy | mAP | FP32 | FP16 | INT8 | Config | Model |
-| :-------- |:-------- |:--------: | :----------------: | :----------------: | :---------------: | :----------------------: | :---------------------: |
-| PicoDet-S-NPU | Baseline | 30.1   |   -   |  -  |  -  | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_s_416_coco_npu.yml) | [Model](https://bj.bcebos.com/v1/paddle-slim-models/act/picodet_s_416_coco_npu.tar) |
-| PicoDet-S-NPU | Quantization-aware training | 29.7  |   -  |   -   |  -  |  [config](https://github.com/PaddlePaddle/PaddleSlim/tree/develop/demo/full_quantization/detection/configs/picodet_s_qat_dis.yaml) | [Model](https://bj.bcebos.com/v1/paddle-slim-models/act/picodet_s_npu_quant.tar) |
+| Model | Base mAP | Offline quant mAP | ACT quant mAP | TRT-FP32 | TRT-FP16 | TRT-INT8 | Config | Quantized model |
+| :-------- |:-------- |:--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :----------------------: | :---------------------: |
+| YOLOv8-s | 44.9 |  43.9 | 44.3  |   9.27ms  |   4.65ms   |  **3.78ms**  |  [config](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/example/auto_compression/detection/configs/yolov8_s_qat_dis.yaml) | [Model](https://bj.bcebos.com/v1/paddle-slim-models/act/yolov8_s_500e_coco_trt_nms_quant.tar) |

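+**Note:**
+- The YOLOv8 models in the table all include NMS and can be deployed directly in TensorRT; to align with the standard benchmark protocol, test the model without NMS instead.
 - All mAP figures are evaluated on the COCO val2017 dataset, IoU=0.5:0.95.
+- Latency figures in the table were measured on a Tesla T4 GPU with TensorRT enabled, batch_size=1.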

-### PP-YOLOE+
+### PP-YOLOE

 | Model | Base mAP | Offline quant mAP | ACT quant mAP | TRT-FP32 | TRT-FP16 | TRT-INT8 | Config | Quantized model |
 | :-------- |:-------- |:--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :----------------------: | :---------------------: |
-| PP-YOLOE+_s	 | 43.7  |  - | 42.9  |   -  |   -   |  -  |  [config](./configs/ppyoloe_plus_s_qat_dis.yaml) | [Quant Model](https://bj.bcebos.com/v1/paddledet/deploy/Inference/ppyoloe_plus_s_qat_dis.tar) |
-| PP-YOLOE+_m | 49.8  |  - | 49.3  |   -  |   -   |  -  |  [config](./configs/ppyoloe_plus_m_qat_dis.yaml) | [Quant Model](https://bj.bcebos.com/v1/paddledet/deploy/Inference/ppyoloe_plus_m_qat_dis.tar) |
-| PP-YOLOE+_l | 52.9  |  - | 52.6  |   -  |   -   |  -  |  [config](./configs/ppyoloe_plus_l_qat_dis.yaml) | [Quant Model](https://bj.bcebos.com/v1/paddledet/deploy/Inference/ppyoloe_plus_l_qat_dis.tar) |
-| PP-YOLOE+_x | 54.7  |  - | 54.4  |   -  |   -   |  -  |  [config](./configs/ppyoloe_plus_x_qat_dis.yaml) | [Quant Model](https://bj.bcebos.com/v1/paddledet/deploy/Inference/ppyoloe_plus_x_qat_dis.tar) |
+| PP-YOLOE-l | 50.9  |  - | 50.6  |   11.2ms  |   7.7ms   |  **6.7ms**  |  [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/deploy/auto_compression/configs/ppyoloe_l_qat_dis.yaml) | [Quant Model](https://bj.bcebos.com/v1/paddle-slim-models/act/ppyoloe_crn_l_300e_coco_quant.tar) |

 - All mAP figures are evaluated on the COCO val2017 dataset, IoU=0.5:0.95.
+- The PP-YOLOE-l model was tested on a Tesla V100 GPU with TensorRT enabled, batch_size=1, NMS included; the test script is the [benchmark demo](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4/deploy/python).

+### PP-PicoDet
+
+| Model | Strategy | mAP | FP32 | FP16 | INT8 | Config | Model |
+| :-------- |:-------- |:--------: | :----------------: | :----------------: | :---------------: | :----------------------: | :---------------------: |
+| PicoDet-S-NPU | Baseline | 30.1   |   -   |  -  |  -  | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_s_416_coco_npu.yml) | [Model](https://bj.bcebos.com/v1/paddle-slim-models/act/picodet_s_416_coco_npu.tar) |
+| PicoDet-S-NPU | Quantization-aware training | 29.7  |   -  |   -   |  -  |  [config](https://github.com/PaddlePaddle/PaddleSlim/tree/develop/demo/full_quantization/detection/configs/picodet_s_qat_dis.yaml) | [Model](https://bj.bcebos.com/v1/paddle-slim-models/act/picodet_s_npu_quant.tar) |

+- All mAP figures are evaluated on the COCO val2017 dataset, IoU=0.5:0.95.

 ## 3. Auto-compression workflow

 #### 3.1 Prepare the environment
-- PaddlePaddle >= 2.3 (install from the [Paddle website](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html))
-- PaddleSlim >= 2.3
-- PaddleDet >= 2.4
+- PaddlePaddle >= 2.4 (install from the [Paddle website](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html))
+- PaddleSlim >= 2.4.1
+- PaddleDet >= 2.5
 - opencv-python

 Install paddlepaddle:
@@ -74,6 +83,8 @@ pip install paddleslim
 pip install paddledet
 ```

+**Note:** Auto-compression of YOLOv8 models requires the latest [Develop Paddle](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html) and [Develop PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim#%E5%AE%89%E8%A3%85) builds.
+
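 #### 3.2 Prepare the dataset

 This example runs the auto-compression experiment on COCO data by default. For a custom COCO dataset or data in another format, prepare it by following the [data preparation guide](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/docs/tutorials/data/PrepareDataSet.md).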
@@ -102,6 +113,16 @@ python tools/export_model.py \
        trt=True \
 ```

+For the YOLOv8-s model with NMS included, see the [YOLOv8 model documentation](https://github.com/PaddlePaddle/PaddleYOLO/tree/release/2.5/configs/yolov8) for details, then run:
+```shell
+python tools/export_model.py \
+        -c configs/yolov8/yolov8_s_500e_coco.yml \
+        -o weights=https://paddledet.bj.bcebos.com/models/yolov8_s_500e_coco.pdparams \
+        trt=True
+```
+
+For a quick trial, download the [exported YOLOv8-s model](https://bj.bcebos.com/v1/paddle-slim-models/act/yolov8_s_500e_coco_trt_nms.tar) directly.
+
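 #### 3.4 Run auto-compression and produce the model

 The distillation-plus-quantization auto-compression example is launched through the run.py script, which compresses the model via the ```paddleslim.auto_compression.AutoCompression``` interface. Set the model path, distillation, quantization, and training parameters in the config file; once configured, the model can be quantized and distilled. The command is: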

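The latency columns in the README benchmark tables are easier to compare as ratios. The following is a minimal sketch that uses only the YOLOv8-s and PP-YOLOE-l rows quoted in the README diff above; the helper name `summarize` is hypothetical and is not part of PaddleDetection or PaddleSlim.

```python
# Figures copied from the README benchmark tables in this diff.
# name: (base mAP, ACT-quantized mAP, TRT-FP32 ms, TRT-FP16 ms, TRT-INT8 ms)
BENCHMARKS = {
    "YOLOv8-s": (44.9, 44.3, 9.27, 4.65, 3.78),    # measured on Tesla T4
    "PP-YOLOE-l": (50.9, 50.6, 11.2, 7.7, 6.7),    # measured on Tesla V100
}


def summarize(base_map, act_map, fp32_ms, fp16_ms, int8_ms):
    """Return the mAP drop and the INT8 speedup over FP32 and FP16."""
    return {
        "map_drop": round(base_map - act_map, 2),
        "int8_vs_fp32": round(fp32_ms / int8_ms, 2),
        "int8_vs_fp16": round(fp16_ms / int8_ms, 2),
    }


for name, row in BENCHMARKS.items():
    print(name, summarize(*row))
```

For YOLOv8-s this works out to roughly a 2.45x speedup over FP32 and 1.23x over FP16 for a 0.6 mAP drop. The two rows were measured on different GPUs, so their latencies should not be compared with each other.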
--- a/deploy/auto_compression/configs/picodet_s_qat_dis.yaml
+++ b/deploy/auto_compression/configs/picodet_s_qat_dis.yaml
 Global:
  reader_config: ./configs/picodet_reader.yml
-  input_list: ['image', 'scale_factor']
+  include_nms: True
  Evaluation: True
  model_dir: ./picodet_s_416_coco_npu/
  model_filename: model.pdmodel
@@ -10,7 +10,7 @@ Distillation:
  alpha: 1.0
  loss: l2

-Quantization:
+QuantAware:
  use_pact: true
  activation_quantize_type: 'moving_average_abs_max'
  weight_bits: 8

--- a/deploy/auto_compression/configs/ppyoloe_l_qat_dis.yaml
+++ b/deploy/auto_compression/configs/ppyoloe_l_qat_dis.yaml

 Global:
  reader_config: configs/ppyoloe_reader.yml
-  input_list: ['image', 'scale_factor']
-  arch: YOLO
+  include_nms: True
  Evaluation: True
  model_dir: ./ppyoloe_crn_l_300e_coco
  model_filename: model.pdmodel
@@ -12,7 +11,7 @@ Distillation:
  alpha: 1.0
  loss: soft_label

-Quantization:
+QuantAware:
  use_pact: true
  activation_quantize_type: 'moving_average_abs_max'
  quantize_op_types:

--- a/deploy/auto_compression/configs/ppyoloe_plus_l_qat_dis.yaml
+++ b/deploy/auto_compression/configs/ppyoloe_plus_l_qat_dis.yaml

 Global:
  reader_config: configs/ppyoloe_plus_reader.yml
-  input_list: ['image', 'scale_factor']
-  arch: YOLO
+  include_nms: True
  Evaluation: True
  model_dir: ./ppyoloe_plus_crn_l_80e_coco  
  model_filename: model.pdmodel
@@ -12,7 +11,7 @@ Distillation:
  alpha: 1.0
  loss: soft_label

-Quantization:
+QuantAware:
  use_pact: true
  activation_quantize_type: 'moving_average_abs_max'
  quantize_op_types:

--- a/deploy/auto_compression/configs/ppyoloe_plus_m_qat_dis.yaml
+++ b/deploy/auto_compression/configs/ppyoloe_plus_m_qat_dis.yaml

 Global:
  reader_config: configs/ppyoloe_plus_reader.yml
-  input_list: ['image', 'scale_factor']
-  arch: YOLO
+  include_nms: True
  Evaluation: True
  model_dir: ./ppyoloe_plus_crn_m_80e_coco
  model_filename: model.pdmodel
@@ -12,7 +11,7 @@ Distillation:
  alpha: 1.0
  loss: soft_label

-Quantization:
+QuantAware:
  use_pact: true
  activation_quantize_type: 'moving_average_abs_max'
  quantize_op_types:

--- a/deploy/auto_compression/configs/ppyoloe_plus_s_qat_dis.yaml
+++ b/deploy/auto_compression/configs/ppyoloe_plus_s_qat_dis.yaml

 Global:
  reader_config: configs/ppyoloe_plus_reader.yml
-  input_list: ['image', 'scale_factor']
-  arch: YOLO
+  include_nms: True
  Evaluation: True
  model_dir: ./ppyoloe_plus_crn_s_80e_coco
  model_filename: model.pdmodel
@@ -12,7 +11,7 @@ Distillation:
  alpha: 1.0
  loss: soft_label

-Quantization:
+QuantAware:
  use_pact: true
  activation_quantize_type: 'moving_average_abs_max'
  quantize_op_types:

--- a/deploy/auto_compression/configs/ppyoloe_plus_x_qat_dis.yaml
+++ b/deploy/auto_compression/configs/ppyoloe_plus_x_qat_dis.yaml

 Global:
  reader_config: configs/ppyoloe_plus_reader.yml
-  input_list: ['image', 'scale_factor']
-  arch: YOLO
+  include_nms: True
  Evaluation: True
  model_dir: ./ppyoloe_plus_crn_x_80e_coco  
  model_filename: model.pdmodel
@@ -12,7 +11,7 @@ Distillation:
  alpha: 1.0
  loss: soft_label

-Quantization:
+QuantAware:
  use_pact: true
  activation_quantize_type: 'moving_average_abs_max'
  quantize_op_types:

--- a/deploy/auto_compression/configs/yolov5_s_qat_dis.yml
+++ b/deploy/auto_compression/configs/yolov5_s_qat_dis.yml

 Global:
  reader_config: configs/yolov5_reader.yml
-  input_list: ['image', 'scale_factor']
-  arch: YOLO
+  include_nms: True
  Evaluation: True
  model_dir: ./yolov5_s_300e_coco
  model_filename: model.pdmodel
@@ -12,7 +11,7 @@ Distillation:
  alpha: 1.0
  loss: soft_label

-Quantization:
+QuantAware:
  use_pact: true
  activation_quantize_type: 'moving_average_abs_max'
  quantize_op_types:

--- a/deploy/auto_compression/configs/yolov6mt_s_qat_dis.yaml
+++ b/deploy/auto_compression/configs/yolov6mt_s_qat_dis.yaml

 Global:
  reader_config: configs/yolov5_reader.yml
-  input_list: ['image', 'scale_factor']
-  arch: YOLO
+  include_nms: True
  Evaluation: True
  model_dir: ./yolov6mt_s_400e_coco
  model_filename: model.pdmodel
@@ -12,7 +11,7 @@ Distillation:
  alpha: 1.0
  loss: soft_label

-Quantization:
+QuantAware:
  activation_quantize_type: 'moving_average_abs_max'
  quantize_op_types:
  - conv2d

--- a/deploy/auto_compression/configs/yolov7_l_qat_dis.yaml
+++ b/deploy/auto_compression/configs/yolov7_l_qat_dis.yaml

 Global:
  reader_config: configs/yolov5_reader.yml
-  input_list: ['image', 'scale_factor']
-  arch: YOLO
+  include_nms: True
  Evaluation: True
  model_dir: ./yolov7_l_300e_coco
  model_filename: model.pdmodel
@@ -12,7 +11,7 @@ Distillation:
  alpha: 1.0
  loss: soft_label

-Quantization:
+QuantAware:
  activation_quantize_type: 'moving_average_abs_max'
  quantize_op_types:
  - conv2d

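The config diffs above share one pattern: `input_list` and `arch` are dropped from `Global`, `include_nms: True` is added, and the `Quantization` section is renamed to `QuantAware`. A minimal sketch of that migration on an already-parsed config dict follows; `migrate_act_config` is a hypothetical helper written for illustration, not a PaddleSlim API.

```python
import copy


def migrate_act_config(old_config):
    """Apply the key changes seen in this commit to a parsed config dict.

    Hypothetical helper for illustration; not part of PaddleSlim.
    """
    new_config = copy.deepcopy(old_config)

    global_cfg = new_config.setdefault("Global", {})
    # input_list is no longer hard-coded in the configs.
    global_cfg.pop("input_list", None)
    # arch is no longer listed in these configs.
    global_cfg.pop("arch", None)
    # The configs in this commit all set include_nms to True.
    global_cfg.setdefault("include_nms", True)

    # The quantization section was renamed.
    if "Quantization" in new_config:
        new_config["QuantAware"] = new_config.pop("Quantization")
    return new_config


old = {
    "Global": {
        "reader_config": "configs/yolov5_reader.yml",
        "input_list": ["image", "scale_factor"],
        "arch": "YOLO",
        "Evaluation": True,
    },
    "Quantization": {"activation_quantize_type": "moving_average_abs_max"},
}
new = migrate_act_config(old)
print(new)
```

The helper copies the input first, so the original dict is left untouched and the two versions can be compared side by side.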
--- a/deploy/auto_compression/configs/yolov8_reader.yml
+++ b/deploy/auto_compression/configs/yolov8_reader.yml
+metric: COCO
+num_classes: 80
+
+# Dataset configuration
+TrainDataset:
+  !COCODataSet
+    image_dir: train2017
+    anno_path: annotations/instances_train2017.json
+    dataset_dir: dataset/coco/
+
+EvalDataset:
+  !COCODataSet
+    image_dir: val2017
+    anno_path: annotations/instances_val2017.json
+    dataset_dir: dataset/coco/
+
+worker_num: 0
+
+# preprocess reader in test
+EvalReader:
+  sample_transforms:
+    - Decode: {}
+    - Resize: {target_size: [640, 640], keep_ratio: True, interp: 1}
+    - Pad: {size: [640, 640], fill_value: [114., 114., 114.]}
+    - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+    - Permute: {}
+  batch_size: 4
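The `EvalReader` above resizes each image with `keep_ratio: True` to fit inside 640x640 and then pads it to exactly 640x640 with the fill value 114. The sketch below computes only the resulting geometry. It assumes the scale is the largest one that fits the image inside the target, that sizes are rounded to the nearest integer, and that padding goes on the bottom and right; the exact rounding and pad placement in PaddleDetection may differ. `letterbox_geometry` is a hypothetical name.

```python
def letterbox_geometry(height, width, target=(640, 640)):
    """Compute resized size and padding for a keep-ratio resize plus pad.

    Assumptions (not confirmed by the diff): the scale is the largest one
    that fits the image inside `target`, dimensions are rounded to the
    nearest integer, and padding goes on the bottom and right edges.
    """
    target_h, target_w = target
    scale = min(target_h / height, target_w / width)
    new_h = min(target_h, round(height * scale))
    new_w = min(target_w, round(width * scale))
    return {
        "scale": scale,
        "resized": (new_h, new_w),
        "pad_bottom": target_h - new_h,
        "pad_right": target_w - new_w,
    }


# A 480x640 (height x width) image already spans the target width,
# so only rows are padded.
print(letterbox_geometry(480, 640))
# A 1080x1920 frame is scaled by 1/3 and padded at the bottom.
print(letterbox_geometry(1080, 1920))
```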
--- a/deploy/auto_compression/configs/yolov8_s_qat_dis.yaml
+++ b/deploy/auto_compression/configs/yolov8_s_qat_dis.yaml
+
+Global:
+  reader_config: configs/yolov8_reader.yml
+  include_nms: True
+  Evaluation: True
+  model_dir: ./yolov8_s_500e_coco_trt_nms/
+  model_filename: model.pdmodel
+  params_filename: model.pdiparams
+
+Distillation:
+  alpha: 1.0
+  loss: soft_label
+
+QuantAware:
+  onnx_format: true
+  activation_quantize_type: 'moving_average_abs_max'
+  quantize_op_types:
+  - conv2d
+  - depthwise_conv2d
+
+TrainConfig:
+  train_iter: 8000
+  eval_iter: 1000
+  learning_rate:  
+    type: CosineAnnealingDecay
+    learning_rate: 0.00003
+    T_max: 10000
+  optimizer_builder:
+    optimizer: 
+      type: SGD
+    weight_decay: 4.0e-05
+
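The `TrainConfig` above pairs `train_iter: 8000` with a `CosineAnnealingDecay` schedule whose `T_max` is 10000, so training ends before the schedule reaches its minimum. The sketch below evaluates the standard cosine annealing formula at a few steps. It assumes a minimum learning rate of 0 and one scheduler step per training iteration, neither of which is stated in the config.

```python
import math


def cosine_annealing_lr(step, base_lr=0.00003, t_max=10000, eta_min=0.0):
    """Standard cosine annealing.

    lr = eta_min + (base_lr - eta_min) * (1 + cos(pi * step / t_max)) / 2
    eta_min=0.0 is an assumption; the config does not set it.
    """
    cosine = math.cos(math.pi * step / t_max)
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + cosine)


for step in (0, 1000, 4000, 8000):
    print(step, f"{cosine_annealing_lr(step):.3e}")
```

Under these assumptions the rate at the final iteration (8000) is about 2.86e-6, a little under 10% of the initial 3e-5.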
--- a/deploy/auto_compression/run.py
+++ b/deploy/auto_compression/run.py
@@ -23,6 +23,7 @@ from ppdet.metrics import COCOMetric, VOCMetric, KeyPointTopDownCOCOEval
 from paddleslim.auto_compression.config_helpers import load_config as load_slim_config
 from paddleslim.auto_compression import AutoCompression
 from post_process import PPYOLOEPostProcess
+from paddleslim.common.dataloader import get_feed_vars


 def argsparser():
@@ -94,9 +95,12 @@ def eval_function(exe, compiled_test_program, test_feed_names, test_fetch_list):
                       fetch_list=test_fetch_list,
                       return_numpy=False)
        res = {}
+        if 'include_nms' in global_config and not global_config['include_nms']:
            if 'arch' in global_config and global_config['arch'] == 'PPYOLOE':
                postprocess = PPYOLOEPostProcess(
                    score_threshold=0.01, nms_threshold=0.6)
+            else:
+                raise ValueError("Not support arch={} now.".format(global_config.get('arch')))
            res = postprocess(np.array(outs[0]), data_all['scale_factor'])
        else:
            for out in outs:
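The hunk above runs the external post-processing only when `include_nms` is present in the config and set to false. A minimal sketch of that branch on plain dicts follows; `needs_external_nms` and `pick_postprocess` are hypothetical names, and the returned strings are labels for illustration rather than real objects from run.py.

```python
def needs_external_nms(global_config):
    """Mirror the condition added to eval_function in this hunk.

    External post-processing runs only when `include_nms` is present
    and false; a missing key is treated like `include_nms: True`.
    """
    return "include_nms" in global_config and not global_config["include_nms"]


def pick_postprocess(global_config):
    """Return a label for the post-processing path (illustrative only)."""
    if not needs_external_nms(global_config):
        return "model_outputs"  # NMS is already inside the exported model
    arch = global_config.get("arch")
    if arch == "PPYOLOE":
        return "PPYOLOEPostProcess"
    # Unsupported architectures fail loudly.
    raise ValueError("Not support arch={} now.".format(arch))


print(pick_postprocess({"include_nms": True}))
print(pick_postprocess({"include_nms": False, "arch": "PPYOLOE"}))
```

A config that omits `include_nms` takes the same path as one that sets it to `True`, which matches the updated configs in this commit where every model is exported with NMS.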
@@ -128,6 +132,10 @@ def main():
    train_loader = create('EvalReader')(reader_cfg['TrainDataset'],
                                        reader_cfg['worker_num'],
                                        return_list=True)
+    if global_config.get('input_list') is None:
+        global_config['input_list'] = get_feed_vars(
+            global_config['model_dir'], global_config['model_filename'],
+            global_config['params_filename'])
    train_loader = reader_wrapper(train_loader, global_config['input_list'])

    if 'Evaluation' in global_config.keys() and global_config[
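
The last hunk in `main()` fills `input_list` from the exported model whenever the config omits it. The sketch below reproduces that fallback with a stand-in callable, since the real `get_feed_vars` needs Paddle and a model on disk. `resolve_input_list` and `fake_get_feed_vars` are hypothetical, and the feed names are simply the `input_list` values removed from the configs in this commit.

```python
def resolve_input_list(global_config, get_feed_vars):
    """Fill `input_list` from the model only when the config omits it."""
    if global_config.get("input_list") is None:
        global_config["input_list"] = get_feed_vars(
            global_config["model_dir"],
            global_config["model_filename"],
            global_config["params_filename"],
        )
    return global_config["input_list"]


def fake_get_feed_vars(model_dir, model_filename, params_filename):
    """Hypothetical stand-in that returns fixed feed names."""
    return ["image", "scale_factor"]


cfg = {
    "model_dir": "./yolov8_s_500e_coco_trt_nms/",
    "model_filename": "model.pdmodel",
    "params_filename": "model.pdiparams",
}
print(resolve_input_list(cfg, fake_get_feed_vars))
```

An explicit `input_list` in the config still wins, so older configs that set it keep their behavior.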