diff --git a/example/auto_compression/detection/README.md b/example/auto_compression/detection/README.md
index ea73146b8fe19ad2da738ed8f58c4ad161be239a..4c08e59a6094e1c895a75b48cb70ca5adaceedc4 100644
--- a/example/auto_compression/detection/README.md
+++ b/example/auto_compression/detection/README.md
@@ -145,7 +145,5 @@ python eval.py --config_path=./configs/ppyoloe_l_qat_dis.yaml
 
 ## 5.FAQ
 
-- To evaluate the accuracy of a post-training quantized model, run:
-```shell
-python post_quant.py --config_path=./configs/ppyoloe_s_qat_dis.yaml
-```
+
+- To apply post-training quantization to a model, see the [Detection model post-training quantization example](https://github.com/PaddlePaddle/PaddleSlim/tree/develop/example/post_training_quantization/detection).
diff --git a/example/post_training_quantization/detection/README.md b/example/post_training_quantization/detection/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..2f155a3885585f685046c64b09d19e254260d5c4
--- /dev/null
+++ b/example/post_training_quantization/detection/README.md
@@ -0,0 +1,158 @@
+# Object Detection Model Post-Training Quantization Example
+
+Contents:
+- [1. Introduction](#1-introduction)
+- [2. Benchmark](#2-benchmark)
+- [3. Post-training quantization workflow](#3-post-training-quantization-workflow)
+  - [3.1 Prepare the environment](#31-prepare-the-environment)
+  - [3.2 Prepare the dataset](#32-prepare-the-dataset)
+  - [3.3 Prepare the inference model](#33-prepare-the-inference-model)
+  - [3.4 Quantize and export the model](#34-quantize-and-export-the-model)
+  - [3.5 Evaluate model accuracy](#35-evaluate-model-accuracy)
+  - [3.6 Improve post-training quantization accuracy](#36-improve-post-training-quantization-accuracy)
+
+- [4. Deployment](#4-deployment)
+- [5. FAQ](#5-faq)
+
+## 1. Introduction
+This example uses the object detection models PP-YOLOE and PicoDet to show how to take an Inference deployment model from PaddleDetection, compress it with post-training quantization, and use sensitivity analysis to improve the accuracy of the quantized model.
+
+
+## 2. Benchmark
+
+| Model | Strategy | Input size | mAP (val, 0.5:0.95) | Latency FP32 (ms) | Latency FP16 (ms) | Latency INT8 (ms) | Config | Inference model |
+| :-------- |:-------- |:--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :-----------------------------: | :-----------------------------: |
+| PP-YOLOE-s | Base model | 640*640 | 43.1 | 11.2ms | 7.7ms | - | - | [Model](https://bj.bcebos.com/v1/paddle-slim-models/act/ppyoloe_crn_s_300e_coco.tar) |
+| PP-YOLOE-s | Post-training quantization | 640*640 | 42.6 | - | - | 6.7ms | - | [Model](https://bj.bcebos.com/v1/paddle-slim-models/act/ppyoloe_s_ptq.tar) |
+| | | | | | | | | |
+| PicoDet-s | Base model | 416*416 | 32.5 | - | - | - | - | [Model](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_416_coco_lcnet.tar) |
+| PicoDet-s | Post-training quantization (before analysis) | 416*416 | 0.0 | - | - | - | - | - |
+| PicoDet-s | Post-training quantization (after analysis) | 416*416 | 24.9 | - | - | - | - | [Infer Model](https://bj.bcebos.com/v1/paddle-slim-models/act/picodet_s_ptq.tar) |
+
+- All mAP values are measured on the COCO val2017 dataset with IoU=0.5:0.95.
+
+
+## 3. Post-training quantization workflow
+
+#### 3.1 Prepare the environment
+- PaddlePaddle >= 2.3 (can be installed from the [Paddle website](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html))
+- PaddleSlim >= 2.3
+- PaddleDet >= 2.4
+- opencv-python
+
+Install paddlepaddle:
+```shell
+# CPU
+pip install paddlepaddle
+# GPU
+pip install paddlepaddle-gpu
+```
+
+Install paddleslim:
+```shell
+pip install paddleslim
+```
+
+Install paddledet:
+```shell
+pip install paddledet
+```
+
+Note: PaddleDet is installed only so that the Dataloader component from PaddleDetection can be used directly.
+
+#### 3.2 Prepare the dataset
+
+By default this example runs post-training quantization on COCO data. For a custom COCO-format dataset, or data in another format, prepare the data by following the [PaddleDetection data preparation guide](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/docs/tutorials/PrepareDataSet.md).
+
+If the dataset is not in COCO format, modify the Dataset fields in the reader config files under [configs](./configs).
+
+Taking PP-YOLOE as an example: once the dataset is ready, set the `dataset_dir` field of `EvalDataset` in [./configs/ppyoloe_s_ptq.yaml](./configs/ppyoloe_s_ptq.yaml) to your own dataset path.
+
+#### 3.3 Prepare the inference model
+
+An inference model consists of two files, `model.pdmodel` and `model.pdiparams`: the file with the `pdmodel` suffix is the model file and the file with the `pdiparams` suffix is the weights file.
+
+Note: other names such as `__model__` and `__params__` correspond to `model.pdmodel` and `model.pdiparams`, respectively.
+
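+
+Export the Inference model by following the [PaddleDetection documentation](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/docs/tutorials/GETTING_STARTED_cn.md#8-%E6%A8%A1%E5%9E%8B%E5%AF%BC%E5%87%BA). The PP-YOLOE export below can serve as a reference:
+- Download the code
+```
+git clone https://github.com/PaddlePaddle/PaddleDetection.git
+```
+- Export the inference model
+
+
+- PPYOLOE-s model, without NMS: for a quick trial, download the [exported PP-YOLOE-s model](https://bj.bcebos.com/v1/paddle-slim-models/act/ppyoloe_crn_s_300e_coco.tar) directly
+```shell
+python tools/export_model.py \
+        -c configs/ppyoloe/ppyoloe_crn_s_300e_coco.yml \
+        -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_300e_coco.pdparams \
+        trt=True exclude_nms=True
+```
+
+- PicoDet-s model, with NMS: for a quick trial, download the [exported PicoDet-s model](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_416_coco_lcnet.tar) directly
+
+```shell
+python tools/export_model.py -c configs/picodet/picodet_s_320_coco_lcnet.yml \
+        -o weights=https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams \
+        --output_dir=output_inference
+```
+
+#### 3.4 Quantize and export the model
+
+The post-training quantization example is launched with the post_quant.py script, which quantizes the model through the `paddleslim.quant.quant_post_static` API. Set the model path, data path, and quantization parameters in the config file, then run post-training quantization with the following commands:
+
+- PPYOLOE-s:
+
+```
+export CUDA_VISIBLE_DEVICES=0
+python post_quant.py --config_path=./configs/ppyoloe_s_ptq.yaml --save_dir=./ppyoloe_s_ptq
+```
+
+- PicoDet-s:
+
+```
+export CUDA_VISIBLE_DEVICES=0
+python post_quant.py --config_path=./configs/picodet_s_ptq.yaml --save_dir=./picodet_s_ptq
+```
+
+
+#### 3.5 Evaluate model accuracy
+
+Use the eval.py script to compute the model's mAP:
+```
+export CUDA_VISIBLE_DEVICES=0
+python eval.py --config_path=./configs/ppyoloe_s_ptq.yaml
+```
+
+**Note**:
+- The path of the model to evaluate can be changed through the `model_dir` field in the config file.
+
+#### 3.6 Improve post-training quantization accuracy
+This section shows how to use the quantization analysis tool to improve post-training quantization accuracy. Post-training quantization needs only a small amount of data, is simple to use, and produces a quantized model quickly, but it often causes a large accuracy loss. PaddleSlim provides a quantization analysis tool, built on the `paddleslim.quant.AnalysisQuant` API, that visualizes the layers that are poorly suited to quantization. Skipping those layers improves the accuracy of the quantized model.
+
+Across multiple experiments, including several activation algorithms (avg, KL, etc.) and weight quantization types (abs_max, channel_wise_abs_max), post-training quantization of PicoDet-s always yielded an accuracy of 0. Taking PicoDet-s as an example, the quantization analysis tool is used as follows:
+
+```shell
+python analysis.py --config_path=./configs/picodet_s_analysis.yaml
+```
+
+As the figure below shows, the analysis reveals that the layers `conv2d_1.w_0`, `conv2d_3.w_0`, `conv2d_5.w_0`, `conv2d_7.w_0`, and `conv2d_9.w_0` cause a large accuracy loss. All of them are `depthwise_conv` layers in the early part of the backbone.
+
+![PicoDet-s quantization analysis result](./images/picodet_analysis.png)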
+
+Based on this analysis, the layers that cause the largest accuracy drop can be skipped during post-training quantization by using [picodet_s_analyzed_ptq.yaml](./configs/picodet_s_analyzed_ptq.yaml) and running post-training quantization again. With these layers skipped, the accuracy of the quantized model rises by 24.9 points (from 0.0 to 24.9 mAP).
+
+```shell
+python post_quant.py --config_path=./configs/picodet_s_analyzed_ptq.yaml --save_dir=./picodet_s_analyzed_ptq_out
+```
+
+## 4. Deployment
+For deployment, see the [Detection model auto-compression example](https://github.com/PaddlePaddle/PaddleSlim/tree/develop/example/auto_compression/detection).
+
+## 5. FAQ
+
+- To run auto-compression on a model, see the [Detection model auto-compression example](https://github.com/PaddlePaddle/PaddleSlim/tree/develop/example/auto_compression/detection).
diff --git a/example/post_training_quantization/detection/configs/picodet_s_analyzed_ptq.yaml b/example/post_training_quantization/detection/configs/picodet_s_analyzed_ptq.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..54aa3cb9ce3486eb87974c20af0352ae8429ee14
--- /dev/null
+++ b/example/post_training_quantization/detection/configs/picodet_s_analyzed_ptq.yaml
@@ -0,0 +1,38 @@
+input_list: ['image', 'scale_factor']
+model_dir: ./picodet_s_416_coco_lcnet/
+model_filename: model.pdmodel
+params_filename: model.pdiparams
+skip_tensor_list: ['conv2d_9.w_0', 'conv2d_7.w_0', 'conv2d_3.w_0', 'conv2d_5.w_0', 'conv2d_1.w_0', ]
+
+metric: COCO
+num_classes: 80
+
+# Dataset configuration
+TrainDataset:
+  !COCODataSet
+    image_dir: train2017
+    anno_path: annotations/instances_train2017.json
+    dataset_dir: /paddle/dataset/coco/
+
+EvalDataset:
+  !COCODataSet
+    image_dir: val2017
+    anno_path: annotations/instances_val2017.json
+    dataset_dir: /paddle/dataset/coco/
+
+eval_height: &eval_height 416
+eval_width: &eval_width 416
+eval_size: &eval_size [*eval_height, *eval_width]
+
+worker_num: 0
+
+EvalReader:
+  inputs_def:
+    image_shape: [1, 3, *eval_height, *eval_width]
+  sample_transforms:
+    - Decode: {}
+    - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+    - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+    - Permute: {}
+  batch_size: 32
+
diff --git a/example/post_training_quantization/detection/configs/ppyoloe_s_analysis.yaml b/example/post_training_quantization/detection/configs/ppyoloe_s_analysis.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..96d87a0372e8665d141c85ecf8243f48f6500ff9
--- /dev/null
+++ b/example/post_training_quantization/detection/configs/ppyoloe_s_analysis.yaml
@@ -0,0 +1,41 @@
+input_list: ['image']
+arch: PPYOLOE  # When export exclude_nms=True, need set arch: PPYOLOE
+model_dir: ./ppyoloe_crn_s_300e_coco
+model_filename: model.pdmodel
+params_filename: model.pdiparams
+save_dir: ./analysis_results_ppyoloe
+metric: COCO
+num_classes: 80
+
+PTQ:
+  quantizable_op_type: ["conv2d", "depthwise_conv2d"]
+  weight_quantize_type: 'abs_max'
+  activation_quantize_type: 'moving_average_abs_max'
+  is_full_quantize: False
+  batch_size: 32
+  batch_nums: 10
+
+
+# Dataset configuration
+TrainDataset:
+  !COCODataSet
+    image_dir: train2017
+    anno_path: annotations/instances_train2017.json
+    dataset_dir: /dataset/coco/
+
+EvalDataset:
+  !COCODataSet
+    image_dir: val2017
+    anno_path: annotations/instances_val2017.json
+    dataset_dir: /dataset/coco/
+
+worker_num: 0
+
+# preprocess reader in test
+EvalReader:
+  sample_transforms:
+    - Decode: {}
+    - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+    - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+    - Permute: {}
+  batch_size: 32
\ No newline at end of file
diff --git a/example/post_training_quantization/detection/configs/ppyoloe_s_ptq.yaml b/example/post_training_quantization/detection/configs/ppyoloe_s_ptq.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..3c8752652e7dbff5f9ec4587184aedd1ffdec64f
--- /dev/null
+++ b/example/post_training_quantization/detection/configs/ppyoloe_s_ptq.yaml
@@ -0,0 +1,32 @@
+input_list: ['image']
+arch: PPYOLOE  # When export exclude_nms=True, need set arch: PPYOLOE
+model_dir: ./ppyoloe_crn_s_300e_coco
+model_filename: model.pdmodel
+params_filename: model.pdiparams
+metric: COCO
+num_classes: 80
+
+
+# Dataset configuration
+TrainDataset:
+  !COCODataSet
+    image_dir: train2017
+    anno_path: annotations/instances_train2017.json
+    dataset_dir: /dataset/coco/
+
+EvalDataset:
+  !COCODataSet
+    image_dir: val2017
+    anno_path: annotations/instances_val2017.json
+    dataset_dir: /dataset/coco/
+
+worker_num: 0
+
+# preprocess reader in test
+EvalReader:
+  sample_transforms:
+    - Decode: {}
+    - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+    - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+    - Permute: {}
+  batch_size: 32
\ No newline at end of file
diff --git a/example/post_training_quantization/detection/eval.py b/example/post_training_quantization/detection/eval.py
index fc0c09ae46c644fea8ca6218d0f0da3544d59161..f8e1342d5d10978c4e94a3f2ffcf7bb5e06321f7 100644
--- a/example/post_training_quantization/detection/eval.py
+++ b/example/post_training_quantization/detection/eval.py
@@ -20,7 +20,7 @@ import paddle
 from ppdet.core.workspace import load_config, merge_config
 from ppdet.core.workspace import create
 from ppdet.metrics import COCOMetric, VOCMetric, KeyPointTopDownCOCOEval
-from paddleslim.common import load_config as load_slim_config
+from paddleslim.common import load_inference_model
 from keypoint_utils import keypoint_post_process
 from post_process import PPYOLOEPostProcess
 
@@ -77,34 +77,35 @@ def eval():
     place = paddle.CUDAPlace(0) if FLAGS.devices == 'gpu' else paddle.CPUPlace()
     exe = paddle.static.Executor(place)
 
-    val_program, feed_target_names, fetch_targets = paddle.static.load_inference_model(
-        global_config["model_dir"].rstrip('/'),
+    val_program, feed_target_names, fetch_targets = load_inference_model(
+        config["model_dir"].rstrip('/'),
         exe,
-        model_filename=global_config["model_filename"],
-        params_filename=global_config["params_filename"])
-    print('Loaded model from: {}'.format(global_config["model_dir"]))
+        model_filename=config["model_filename"],
+        params_filename=config["params_filename"])
 
-    metric = global_config['metric']
+    print('Loaded model from: {}'.format(config["model_dir"]))
+
+    metric = config['metric']
     for batch_id, data in enumerate(val_loader):
         data_all = convert_numpy_data(data, metric)
         data_input = {}
         for k, v in data.items():
-            if isinstance(global_config['input_list'], list):
-                if k in global_config['input_list']:
+            if isinstance(config['input_list'], list):
+                if k in config['input_list']:
                     data_input[k] = np.array(v)
-            elif isinstance(global_config['input_list'], dict):
-                if k in global_config['input_list'].keys():
-                    data_input[global_config['input_list'][k]] = np.array(v)
+            elif isinstance(config['input_list'], dict):
+                if k in config['input_list'].keys():
+                    data_input[config['input_list'][k]] = np.array(v)
         outs = exe.run(val_program,
                        feed=data_input,
                        fetch_list=fetch_targets,
                        return_numpy=False)
         res = {}
-        if 'arch' in global_config and global_config['arch'] == 'keypoint':
+        if 'arch' in config and config['arch'] == 'keypoint':
             res = keypoint_post_process(data, data_input, exe, val_program,
                                         fetch_targets, outs)
-        if 'arch' in global_config and global_config['arch'] == 'PPYOLOE':
+        if 'arch' in config and config['arch'] == 'PPYOLOE':
             postprocess = PPYOLOEPostProcess(
                 score_threshold=0.01, nms_threshold=0.6)
             res = postprocess(np.array(outs[0]), data_all['scale_factor'])
@@ -124,34 +125,32 @@ def eval():
 
 
 def main():
-    global global_config
-    all_config = load_slim_config(FLAGS.config_path)
-    global_config = all_config["Global"]
-    reader_cfg = load_config(global_config['reader_config'])
+    global config
+    config = load_config(FLAGS.config_path)
 
-    dataset = reader_cfg['EvalDataset']
+    dataset = config['EvalDataset']
     global val_loader
-    val_loader = create('EvalReader')(reader_cfg['EvalDataset'],
-                                      reader_cfg['worker_num'],
+    val_loader = create('EvalReader')(config['EvalDataset'],
+                                      config['worker_num'],
                                       return_list=True)
     metric = None
-    if reader_cfg['metric'] == 'COCO':
+    if config['metric'] == 'COCO':
         clsid2catid = {v: k for k, v in dataset.catid2clsid.items()}
         anno_file = dataset.get_anno()
         metric = COCOMetric(
             anno_file=anno_file, clsid2catid=clsid2catid, IouType='bbox')
-    elif reader_cfg['metric'] == 'VOC':
+    elif config['metric'] == 'VOC':
         metric = VOCMetric(
             label_list=dataset.get_label_list(),
-            class_num=reader_cfg['num_classes'],
-            map_type=reader_cfg['map_type'])
-    elif reader_cfg['metric'] == 'KeyPointTopDownCOCOEval':
+            class_num=config['num_classes'],
+            map_type=config['map_type'])
+    elif config['metric'] == 'KeyPointTopDownCOCOEval':
         anno_file = dataset.get_anno()
         metric = KeyPointTopDownCOCOEval(anno_file,
                                          len(dataset), 17, 'output_eval')
     else:
         raise ValueError("metric currently only supports COCO and VOC.")
-    global_config['metric'] = metric
+    config['metric'] = metric
 
     eval()
 
diff --git a/example/post_training_quantization/detection/images/picodet_analysis.png b/example/post_training_quantization/detection/images/picodet_analysis.png
new file mode 100644
index 0000000000000000000000000000000000000000..8e518aa0fbd78ecf2819f9e7648578a1bb96a66c
Binary files /dev/null and b/example/post_training_quantization/detection/images/picodet_analysis.png differ
diff --git a/example/post_training_quantization/detection/post_quant.py b/example/post_training_quantization/detection/post_quant.py
index a0c010364dd1b47ce33131814fd95942da7d96b0..052fdcb5ec9da888be5b2c9ad15e7047f4161cc4 100644
--- a/example/post_training_quantization/detection/post_quant.py
+++ b/example/post_training_quantization/detection/post_quant.py
@@ -41,7 +41,7 @@ def argsparser():
         default='gpu',
         help="which device used to compress.")
     parser.add_argument(
-        '--algo', type=str, default='KL', help="post quant algo.")
+        '--algo', type=str, default='avg', help="post quant algo.")
 
     return parser
 
@@ -79,8 +79,8 @@ def main():
         data_loader=train_loader,
         model_filename=config["model_filename"],
         params_filename=config["params_filename"],
-        batch_size=4,
-        batch_nums=64,
+        batch_size=32,
+        batch_nums=10,
         algo=FLAGS.algo,
         hist_percent=0.999,
         is_full_quantize=False,
diff --git a/example/post_training_quantization/pytorch_yolo_series/README.md b/example/post_training_quantization/pytorch_yolo_series/README.md
index 085a0ee575f478eaa4e9071eaed4f975eb0615b6..d54efe9d5b74375aff76172309a4defeaa1e9482 100644
--- a/example/post_training_quantization/pytorch_yolo_series/README.md
+++ b/example/post_training_quantization/pytorch_yolo_series/README.md
@@ -144,6 +144,7 @@ python post_quant.py --config_path=./configs/yolov6s_analyzed_ptq.yaml --save_di
 ```
 
 ## 4. Deployment
+For deployment, see the [YOLO series model auto-compression example](https://github.com/PaddlePaddle/PaddleSlim/tree/develop/example/auto_compression/pytorch_yolo_series).
 
 ## 5.FAQ
 
diff --git a/example/post_training_quantization/pytorch_yolo_series/eval.py b/example/post_training_quantization/pytorch_yolo_series/eval.py
index e105bb788b1483a086537f2c2ba9860707a54c99..6c2462acf4a19e7402990847b7363d347972f7f1 100644
--- a/example/post_training_quantization/pytorch_yolo_series/eval.py
+++ b/example/post_training_quantization/pytorch_yolo_series/eval.py
@@ -49,7 +49,10 @@ def eval():
     exe = paddle.static.Executor(place)
 
     val_program, feed_target_names, fetch_targets = load_inference_model(
-        config["model_dir"], exe, "model.pdmodel", "model.pdiparams")
+        config["model_dir"].rstrip('/'),
+        exe,
+        model_filename=config["model_filename"],
+        params_filename=config["params_filename"])
 
     bboxes_list, bbox_nums_list, image_id_list = [], [], []
     with tqdm(
diff --git a/example/post_training_quantization/pytorch_yolo_series/post_quant.py b/example/post_training_quantization/pytorch_yolo_series/post_quant.py
index ec4fbdbc420e518b94fd0386e86dcd610e96ba3f..4964a3ff13447dd6be880db2ccc04ecd24ff696f 100644
--- a/example/post_training_quantization/pytorch_yolo_series/post_quant.py
+++ b/example/post_training_quantization/pytorch_yolo_series/post_quant.py
@@ -41,7 +41,7 @@ def argsparser():
         default='gpu',
         help="which device used to compress.")
     parser.add_argument(
-        '--algo', type=str, default='KL', help="post quant algo.")
+        '--algo', type=str, default='avg', help="post quant algo.")
 
     return parser
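
The detection `eval.py` hunk above builds the model's feed dict from `config['input_list']`, which may be either a list of names or a dict that renames dataloader keys to feed names. The sketch below isolates that selection logic so it can be read and tried without Paddle installed. It is a minimal illustration, not the script itself: the function name `select_model_inputs`, the sample batch, and the feed name `renamed_image_input` are hypothetical, and the `np.array(v)` conversion from the diff is left out to keep the example dependency-free.

```python
def select_model_inputs(batch, input_list):
    """Pick the entries of `batch` that the inference model expects.

    Mirrors the loop in eval.py: a list keeps matching keys unchanged,
    a dict maps dataloader keys to the model's feed names.
    (Hypothetical helper; eval.py inlines this logic and wraps each
    value in np.array.)
    """
    feed = {}
    for key, value in batch.items():
        if isinstance(input_list, list):
            if key in input_list:
                feed[key] = value
        elif isinstance(input_list, dict):
            if key in input_list:
                feed[input_list[key]] = value
    return feed


# Hypothetical batch, shaped like what a detection dataloader might yield.
batch = {"image": [[0.1, 0.2]], "scale_factor": [[1.0, 1.0]], "im_id": [7]}

# List form, as in picodet_s_analyzed_ptq.yaml: keys pass through unchanged.
list_feed = select_model_inputs(batch, ["image", "scale_factor"])

# Dict form: the dataloader key is renamed to a (hypothetical) feed name.
dict_feed = select_model_inputs(batch, {"image": "renamed_image_input"})

print(sorted(list_feed))
print(dict_feed)
```

Keys that are absent from `input_list`, such as `im_id` here, are dropped from the feed in both forms, which is why the PP-YOLOE config lists only `image` while the PicoDet config, exported with NMS, also lists `scale_factor`.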