diff --git a/example/auto_compression/detection/README.md b/example/auto_compression/detection/README.md index 1800054c0a0b46baba7a0ad5a9d41e5aa56afa3c..4261fe7076578c82d4c435377e1e44636523ad88 100644 --- a/example/auto_compression/detection/README.md +++ b/example/auto_compression/detection/README.md @@ -14,7 +14,6 @@ ## 1. 简介 本示例将以目标检测模型PP-YOLOE为例,介绍如何使用PaddleDetection中Inference部署模型进行自动压缩。本示例使用的自动压缩策略为量化蒸馏。 - ## 2.Benchmark ### PP-YOLOE @@ -28,6 +27,18 @@ - mAP的指标均在COCO val2017数据集中评测得到,IoU=0.5:0.95。 - PP-YOLOE-l模型在Tesla V100的GPU环境下测试,并且开启TensorRT,batch_size=1,包含NMS,测试脚本是[benchmark demo](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4/deploy/python)。 - PP-YOLOE-s模型在Tesla T4,TensorRT 8.4.1,CUDA 11.2,batch_size=1,不包含NMS,测试脚本是[cpp_infer_ppyoloe](./cpp_infer_ppyoloe)。 + +### YOLOv8 + +| 模型 | Base mAP | 离线量化mAP | ACT量化mAP | TRT-FP32 | TRT-FP16 | TRT-INT8 | 配置文件 | 量化模型 | +| :-------- |:-------- |:--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :----------------------: | :---------------------: | +| YOLOv8-s | 44.9 | 43.9 | 44.3 | 9.27ms | 4.65ms | **3.78ms** | [config](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/example/auto_compression/detection/configs/yolov8_s_qat_dis.yaml) | [Model](https://bj.bcebos.com/v1/paddle-slim-models/act/yolov8_s_500e_coco_trt_nms_quant.tar) | + +**注意:** +- 表格中YOLOv8模型均为带NMS的模型,可直接在TRT中部署,如果需要对齐测试标准,需要测试不带NMS的模型。 +- mAP的指标均在COCO val2017数据集中评测得到,IoU=0.5:0.95。 +- 表格中的性能在Tesla T4的GPU环境下测试,并且开启TensorRT,batch_size=1。 + ### SSD on Pascal VOC | 模型 | Box AP | ACT量化Box AP | TRT-FP32 | TRT-INT8 | 配置文件 | 量化模型 | | :-------- |:-------- | :---------------------: | :----------------: | :---------------: | :----------------------: | :---------------------: | @@ -99,6 +110,16 @@ python tools/export_model.py \ trt=True exclude_post_process=True \ ``` +YOLOv8-s模型,包含NMS,具体可参考[YOLOv8模型文档](https://github.com/PaddlePaddle/PaddleYOLO/tree/release/2.5/configs/yolov8), 然后执行: +```shell +python tools/export_model.py \ + -c configs/yolov8/yolov8_s_500e_coco.yml \ + -o weights=https://paddledet.bj.bcebos.com/models/yolov8_s_500e_coco.pdparams \ + trt=True +``` + +如快速体验,可直接下载[YOLOv8-s导出模型](https://bj.bcebos.com/v1/paddle-slim-models/act/yolov8_s_500e_coco_trt_nms.tar) + #### 3.4 自动压缩并产出模型 蒸馏量化自动压缩示例通过run.py脚本启动,会使用接口```paddleslim.auto_compression.AutoCompression```对模型进行自动压缩。配置config文件中模型路径、蒸馏、量化、和训练等部分的参数,配置完成后便可对模型进行量化和蒸馏。具体运行命令为: diff --git a/example/auto_compression/detection/configs/yolov8_reader.yml b/example/auto_compression/detection/configs/yolov8_reader.yml new file mode 100644 index 0000000000000000000000000000000000000000..54565c78dd12df138a5ea30f72c78ddcef4460cd --- /dev/null +++ b/example/auto_compression/detection/configs/yolov8_reader.yml @@ -0,0 +1,27 @@ +metric: COCO +num_classes: 80 + +# Dataset configuration +TrainDataset: + !COCODataSet + image_dir: train2017 + anno_path: annotations/instances_train2017.json + dataset_dir: dataset/coco/ + +EvalDataset: + !COCODataSet + image_dir: val2017 + anno_path: annotations/instances_val2017.json + dataset_dir: dataset/coco/ + +worker_num: 0 + +# preprocess reader in train & test +EvalReader: + sample_transforms: + - Decode: {} + - Resize: {target_size: [640, 640], keep_ratio: True, interp: 1} + - Pad: {size: [640, 640], fill_value: [114., 114., 114.]} + - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none} + - Permute: {} + batch_size: 4 diff --git a/example/auto_compression/detection/configs/yolov8_s_qat_dis.yaml b/example/auto_compression/detection/configs/yolov8_s_qat_dis.yaml new file mode 100644 index 0000000000000000000000000000000000000000..144fba9a1a6f08b0c60b791bcbd5d2e7820ece79 --- /dev/null +++ b/example/auto_compression/detection/configs/yolov8_s_qat_dis.yaml @@ -0,0 +1,31 @@ + +Global: + reader_config: configs/yolov8_reader.yml + include_nms: True + Evaluation: True + model_dir: ./yolov8_s_500e_coco_trt_nms/ + model_filename: model.pdmodel + params_filename: model.pdiparams + +Distillation: + alpha: 1.0 + loss: soft_label + +QuantAware: + onnx_format: true + activation_quantize_type: 'moving_average_abs_max' + quantize_op_types: + - conv2d + - depthwise_conv2d + +TrainConfig: + train_iter: 8000 + eval_iter: 1000 + learning_rate: + type: CosineAnnealingDecay + learning_rate: 0.00003 + T_max: 10000 + optimizer_builder: + optimizer: + type: SGD + weight_decay: 4.0e-05