diff --git a/demo/auto_compression/README.md b/demo/auto_compression/README.md
index e780ee0363368b8252226d722496e9e35bb444b3..ff26dc29addfdd5c0edd83bebf5f643d4a1f4777 100644
--- a/demo/auto_compression/README.md
+++ b/demo/auto_compression/README.md
@@ -1,126 +1,62 @@
-# 使用预测模型进行量化训练示例
+# 自动压缩工具ACT(Auto Compression Toolkit)
-预测模型保存接口:
-动态图使用``paddle.jit.save``保存;
-静态图使用``paddle.static.save_inference_model``保存。
+## 简介
+PaddleSlim推出全新自动压缩工具(ACT),旨在通过Source-Free的方式,自动对预测模型进行压缩,压缩后模型可直接部署应用。ACT自动压缩工具主要特性如下:
+- **『更便捷』**:开发者无需了解或修改模型源码,直接使用导出的预测模型进行压缩;
+- **『更智能』**:开发者简单配置即可启动压缩,ACT工具会自动优化得到最优的预测模型;
+- **『更丰富』**:ACT中提供了量化训练、蒸馏、结构化剪枝、非结构化剪枝、多种离线量化方法及超参搜索等等,可任意搭配使用。
-本示例将介绍如何使用预测模型进行蒸馏量化训练,
-首先使用接口``paddleslim.quant.quant_aware_with_infermodel``训练量化模型,
-训练完成后,使用接口``paddleslim.quant.export_quant_infermodel``将训好的量化模型导出为预测模型。
-## 分类模型量化训练流程
+## 环境准备
-### 1. 准备数据
+- 安装PaddlePaddle >= 2.3版本 (从[Paddle官网](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html)下载安装)
+- 安装PaddleSlim >= 2.3 或者适当develop版本
-在``demo``文件夹下创建``data``文件夹,将``ImageNet``数据集解压在``data``文件夹下,解压后``data/ILSVRC2012``文件夹下应包含以下文件:
-- ``'train'``文件夹,训练图片
-- ``'train_list.txt'``文件
-- ``'val'``文件夹,验证图片
-- ``'val_list.txt'``文件
+## 快速上手
-### 2. 准备需要量化的模型
-
-飞桨图像识别套件PaddleClas是飞桨为工业界和学术界所准备的一个图像识别任务的工具集,本示例使用该套件产出imagenet分类模型。
-#### 2.1 下载PaddleClas release/2.3分支代码
-
-解压后,进入PaddleClas目录
-```
-cd PaddleClas-release-2.3
-```
-#### 2.2 下载MobileNetV2预训练模型
-在PaddleClas根目录创建``pretrained``文件夹:
-```
-mkdir pretrained
+```python
+# 导入依赖包
+from paddleslim.auto_compression.config_helpers import load_config
+from paddleslim.auto_compression import AutoCompression
+from paddleslim.common.imagenet_reader import reader
+# 加载配置文件
+compress_config, train_config = load_config("./image_classification/mobilenetv1_qat_dis.yaml")
+# 定义DataLoader
+train_loader = reader(mode='test')  # DataLoader
+# 开始自动压缩
+ac = AutoCompression(
+    model_dir="./mobilenetv1_infer",
+    model_filename="model.pdmodel",
+    params_filename="model.pdiparams",
+    save_dir="output",
+    strategy_config=compress_config,
+    train_config=train_config,
+    train_dataloader=train_loader,
+    eval_callback=None)  # eval_function to verify accuracy
+ac.compress()
 ```
-下载预训练模型
-分类预训练模型库地址
-MobileNetV2预训练模型地址
-执行下载命令:
-```
-wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV2_pretrained.pdparams -O ./pretrained/MobileNetV2_pretrained.pdparams
-```
+**提示:**
+- DataLoader传入的数据集是待压缩模型所用的数据集,DataLoader继承自`paddle.io.DataLoader`。
+- 如无需验证自动压缩过程中模型的精度,`eval_callback`可不传入function,程序会自动根据损失来选择最优模型。
+- 自动压缩Config中定义的量化、蒸馏、剪枝等压缩算法会合并执行,压缩策略有:量化+蒸馏,剪枝+蒸馏等等。
-#### 2.3 导出预测模型
-PaddleClas代码库根目录执行如下命令,导出预测模型
-```
-python tools/export_model.py \
-    -c ppcls/configs/ImageNet/MobileNetV2/MobileNetV2.yaml \
-    -o Global.pretrained_model=pretrained/MobileNetV2_pretrained \
-    -o Global.save_inference_dir=infermodel_mobilenetv2
-```
-#### 2.4 测试模型精度
-拷贝``infermodel_mobilenetv2``文件夹到``PaddleSlim/demo/auto_compression/``文件夹。
-```
-cd PaddleSlim/demo/auto_compression/
-```
-使用[eval.py](../quant/quant_post/eval.py)脚本得到模型的分类精度,压缩后的模型也可以使用同一个脚本测试精度:
-```
-python ../quant/quant_post/eval.py --model_path infermodel_mobilenetv2 --model_name inference.pdmodel --params_name inference.pdiparams
-```
-精度输出为:
-```
-top1_acc/top5_acc= [0.71918 0.90568]
-```
+## 应用示例
-### 3. 
进行多策略融合压缩 +#### [图像分类](./image_classification) -每一个小章节代表一种多策略融合压缩,不代表需要串行执行。 +#### [目标检测](./detection) -### 3.1 进行量化蒸馏压缩 -蒸馏量化训练示例脚本为[demo_imagenet.py](./demo_imagenet.py),使用接口``paddleslim.auto_compression.AutoCompression``对模型进行量化训练。运行命令为: -``` -python demo_imagenet.py \ - --model_dir='infermodel_mobilenetv2' \ - --model_filename='inference.pdmodel' \ - --params_filename='./inference.pdiparams' \ - --save_dir='./save_qat_mbv2/' \ - --devices='gpu' \ - --batch_size=64 \ - --data_dir='../data/ILSVRC2012/' \ - --config_path='./configs/CV/mbv2_qat_dis.yaml' -``` +#### [语义分割](./semantic_segmentation) -### 3.2 进行离线量化超参搜索压缩 -离线量化超参搜索压缩示例脚本为[demo_imagenet.py](./demo_imagenet.py),使用接口``paddleslim.auto_compression.AutoCompression``对模型进行压缩。运行命令为: -``` -python demo_imagenet.py \ - --model_dir='infermodel_mobilenetv2' \ - --model_filename='inference.pdmodel' \ - --params_filename='./inference.pdiparams' \ - --save_dir='./save_ptq_mbv2/' \ - --devices='gpu' \ - --batch_size=64 \ - --data_dir='../data/ILSVRC2012/' \ - --config_path='./configs/CV/mbv2_ptq_hpo.yaml' -``` +#### [NLP](./nlp) -### 3.3 进行剪枝蒸馏策略融合压缩 -注意:本示例为对BERT模型进行ASP稀疏。 -首先参考[脚本](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/language_model/bert#%E9%A2%84%E6%B5%8B)得到可部署的模型,或者下载SST-2数据集上的示例模型[SST-2-BERT](https://paddle-qa.bj.bcebos.com/PaddleSlim_datasets/static_bert_models.tar.gz)。 -剪枝蒸馏压缩示例脚本为[demo_glue.py](./demo_glue.py),使用接口``paddleslim.auto_compression.AutoCompression``对模型进行压缩。运行命令为: -``` -python demo_glue.py \ - --model_dir='./static_bert_models/' \ - --model_filename='bert.pdmodel' \ - --params_filename='bert.pdiparams' \ - --save_dir='./save_asp_bert/' \ - --devices='gpu' \ - --batch_size=32 \ - --task='sst-2' \ - --config_path='./configs/NLP/bert_asp_dis.yaml' -``` +#### 即将发布 +- [ ] 更多自动压缩应用示例 +- [ ] X2Paddle模型自动压缩示例 -### 3.4 进行非结构化稀疏蒸馏策略融合压缩 -非结构化稀疏蒸馏压缩示例脚本为[demo_imagenet.py](./demo_imagenet.py),使用接口``paddleslim.auto_compression.AutoCompression``对模型进行压缩。运行命令为: -``` -python demo_imagenet.py \ - --model_dir='infermodel_mobilenetv2' \ - --model_filename='inference.pdmodel' \ - --params_filename='./inference.pdiparams' \ - --save_dir='./save_asp_mbv2/' \ - --devices='gpu' \ - --batch_size=64 \ - --data_dir='../data/ILSVRC2012/' \ - --config_path='./configs/CV/xxx.yaml' -``` +## 其他 + +- ACT可以自动处理常见的预测模型,如果有更特殊的改造需求,可以参考[ACT超参配置教程](./hyperparameter_tutorial.md)来进行单独配置压缩策略。 + +- 如果你发现任何关于ACT自动压缩工具的问题或者是建议, 欢迎通过[GitHub Issues](https://github.com/PaddlePaddle/PaddleSlim/issues)给我们提issues。同时欢迎贡献更多优秀模型,共建开源生态。 diff --git a/demo/auto_compression/configs/CV/mbv2_ptq_hpo.yaml b/demo/auto_compression/configs/CV/mbv2_ptq_hpo.yaml deleted file mode 100644 index 02962a91083bf46a89db6fd9f977ef132a3df8ba..0000000000000000000000000000000000000000 --- a/demo/auto_compression/configs/CV/mbv2_ptq_hpo.yaml +++ /dev/null @@ -1,22 +0,0 @@ -HyperParameterOptimization: - batch_num: - - 4 - - 16 - bias_correct: - - true - hist_percent: - - 0.999 - - 0.99999 - max_quant_count: 20 - ptq_algo: - - KL - - hist - weight_quantize_type: - - channel_wise_abs_max -Quantization: - activation_bits: 8 - quantize_op_types: - - conv2d - - depthwise_conv2d - - mul - weight_bits: 8 diff --git a/demo/auto_compression/configs/CV/mbv2_qat_dis.yaml b/demo/auto_compression/configs/CV/mbv2_qat_dis.yaml deleted file mode 100644 index e1b3702ec5532938b8dd306803328fa34bac3029..0000000000000000000000000000000000000000 --- a/demo/auto_compression/configs/CV/mbv2_qat_dis.yaml +++ /dev/null @@ -1,63 +0,0 @@ -Distillation: - distill_lambda: 1.0 - distill_loss: l2_loss - 
distill_node_pair: - - teacher_conv2d_54.tmp_0 - - conv2d_54.tmp_0 - - teacher_conv2d_55.tmp_0 - - conv2d_55.tmp_0 - - teacher_conv2d_57.tmp_0 - - conv2d_57.tmp_0 - - teacher_elementwise_add_0 - - elementwise_add_0 - - teacher_conv2d_61.tmp_0 - - conv2d_61.tmp_0 - - teacher_elementwise_add_1 - - elementwise_add_1 - - teacher_elementwise_add_2 - - elementwise_add_2 - - teacher_conv2d_67.tmp_0 - - conv2d_67.tmp_0 - - teacher_elementwise_add_3 - - elementwise_add_3 - - teacher_elementwise_add_4 - - elementwise_add_4 - - teacher_elementwise_add_5 - - elementwise_add_5 - - teacher_conv2d_75.tmp_0 - - conv2d_75.tmp_0 - - teacher_elementwise_add_6 - - elementwise_add_6 - - teacher_elementwise_add_7 - - elementwise_add_7 - - teacher_conv2d_81.tmp_0 - - conv2d_81.tmp_0 - - teacher_elementwise_add_8 - - elementwise_add_8 - - teacher_elementwise_add_9 - - elementwise_add_9 - - teacher_conv2d_87.tmp_0 - - conv2d_87.tmp_0 - - teacher_linear_1.tmp_0 - - linear_1.tmp_0 - merge_feed: true - teacher_model_dir: ./infermodel_mobilenetv2 - teacher_model_filename: inference.pdmodel - teacher_params_filename: inference.pdiparams -Quantization: - activation_bits: 8 - is_full_quantize: false - not_quant_pattern: - - skip_quant - quantize_op_types: - - conv2d - - depthwise_conv2d - weight_bits: 8 -TrainConfig: - epochs: 1 - eval_iter: 1000 - learning_rate: 0.0001 - optimizer: SGD - optim_args: - weight_decay: 4.0e-05 - origin_metric: 0.765 diff --git a/demo/auto_compression/configs/NLP/bert_asp_dis.yaml b/demo/auto_compression/configs/NLP/bert_asp_dis.yaml deleted file mode 100644 index c6d998b5b6afe5116e356bf65b7194122a4819bc..0000000000000000000000000000000000000000 --- a/demo/auto_compression/configs/NLP/bert_asp_dis.yaml +++ /dev/null @@ -1,44 +0,0 @@ -Distillation: - distill_lambda: 1.0 - distill_loss: l2_loss - distill_node_pair: - - teacher_tmp_9 - - tmp_9 - - teacher_tmp_12 - - tmp_12 - - teacher_tmp_15 - - tmp_15 - - teacher_tmp_18 - - tmp_18 - - teacher_tmp_21 - - tmp_21 - - teacher_tmp_24 - - tmp_24 - - teacher_tmp_27 - - tmp_27 - - teacher_tmp_30 - - tmp_30 - - teacher_tmp_33 - - tmp_33 - - teacher_tmp_36 - - tmp_36 - - teacher_tmp_39 - - tmp_39 - - teacher_tmp_42 - - tmp_42 - - teacher_linear_147.tmp_1 - - linear_147.tmp_1 - merge_feed: true - teacher_model_dir: static_bert_models - teacher_model_filename: bert.pdmodel - teacher_params_filename: bert.pdiparams -Prune: - prune_algo: asp -TrainConfig: - epochs: 3 - eval_iter: 1000 - learning_rate: 2.0e-05 - optim_args: - weight_decay: 0.0 - optimizer: AdamW - origin_metric: 0.93 diff --git a/demo/auto_compression/configs/NLP/bert_ptq_hpo.yaml b/demo/auto_compression/configs/NLP/bert_ptq_hpo.yaml deleted file mode 100644 index 02962a91083bf46a89db6fd9f977ef132a3df8ba..0000000000000000000000000000000000000000 --- a/demo/auto_compression/configs/NLP/bert_ptq_hpo.yaml +++ /dev/null @@ -1,22 +0,0 @@ -HyperParameterOptimization: - batch_num: - - 4 - - 16 - bias_correct: - - true - hist_percent: - - 0.999 - - 0.99999 - max_quant_count: 20 - ptq_algo: - - KL - - hist - weight_quantize_type: - - channel_wise_abs_max -Quantization: - activation_bits: 8 - quantize_op_types: - - conv2d - - depthwise_conv2d - - mul - weight_bits: 8 diff --git a/demo/auto_compression/configs/NLP/bert_qat_dis.yaml b/demo/auto_compression/configs/NLP/bert_qat_dis.yaml deleted file mode 100644 index bb2687488fd399c26c1937ac402d7c5b953eeed8..0000000000000000000000000000000000000000 --- a/demo/auto_compression/configs/NLP/bert_qat_dis.yaml +++ /dev/null @@ -1,53 +0,0 @@ -Distillation: - 
distill_lambda: 1.0
-  distill_loss: l2_loss
-  distill_node_pair:
-  - teacher_tmp_9
-  - tmp_9
-  - teacher_tmp_12
-  - tmp_12
-  - teacher_tmp_15
-  - tmp_15
-  - teacher_tmp_18
-  - tmp_18
-  - teacher_tmp_21
-  - tmp_21
-  - teacher_tmp_24
-  - tmp_24
-  - teacher_tmp_27
-  - tmp_27
-  - teacher_tmp_30
-  - tmp_30
-  - teacher_tmp_33
-  - tmp_33
-  - teacher_tmp_36
-  - tmp_36
-  - teacher_tmp_39
-  - tmp_39
-  - teacher_tmp_42
-  - tmp_42
-  - teacher_linear_147.tmp_1
-  - linear_147.tmp_1
-  merge_feed: true
-  teacher_model_dir: ../auto-compression_origin/static_bert_models
-  teacher_model_filename: bert.pdmodel
-  teacher_params_filename: bert.pdiparams
-Quantization:
-  activation_bits: 8
-  is_full_quantize: false
-  not_quant_pattern:
-  - skip_quant
-  quantize_op_types:
-  - conv2d
-  - depthwise_conv2d
-  - mul
-  - matmul
-  weight_bits: 8
-TrainConfig:
-  epochs: 3
-  eval_iter: 1000
-  learning_rate: 0.0001
-  optimizer: SGD
-  optim_args:
-    weight_decay: 4.0e-05
-  origin_metric: 0.93
diff --git a/demo/auto_compression/detection/README.md b/demo/auto_compression/detection/README.md
index 9830a6abdd0cf69607281864fcefbc70daaccd8d..2cc064a37bcce67d7002a17eff86819d9873f893 100644
--- a/demo/auto_compression/detection/README.md
+++ b/demo/auto_compression/detection/README.md
@@ -61,9 +61,9 @@ tar -xf ppyoloe_crn_l_300e_coco.tar
 
 ### 4. 测试模型精度
 
-使用[run_main.py](run_main.py)脚本得到模型的mAP:
+使用[run.py](run.py)脚本得到模型的mAP:
 ```
-python3.7 run_main.py --config_path=./configs/ppyoloe_l_qat_dist.yaml --eval=True
+python run.py --config_path=./configs/ppyoloe_l_qat_dist.yaml --eval=True
 ```
 
 **注意**:TinyPose模型暂不支持精度测试。
@@ -71,9 +71,9 @@ python3.7 run_main.py --config_path=./configs/ppyoloe_l_qat_dist.yaml --eval=Tru
 
 ## 开始自动压缩
 
 ### 进行量化蒸馏自动压缩
-蒸馏量化自动压缩示例通过[run_main.py](run_main.py)脚本启动,会使用接口``paddleslim.auto_compression.AutoCompression``对模型进行量化训练。具体运行命令为:
+蒸馏量化自动压缩示例通过[run.py](run.py)脚本启动,会使用接口``paddleslim.auto_compression.AutoCompression``对模型进行量化训练。具体运行命令为:
 ```
-python run_main.py --config_path=./configs/ppyoloe_l_qat_dist.yaml --save_dir='./output/' --devices='gpu'
+python run.py --config_path=./configs/ppyoloe_l_qat_dist.yaml --save_dir='./output/' --devices='gpu'
 ```
 
diff --git a/demo/auto_compression/detection/run_main.py b/demo/auto_compression/detection/run.py
similarity index 100%
rename from demo/auto_compression/detection/run_main.py
rename to demo/auto_compression/detection/run.py
diff --git a/demo/auto_compression/hyperparameter_tutorial.md b/demo/auto_compression/hyperparameter_tutorial.md
new file mode 100644
index 0000000000000000000000000000000000000000..c86dcfe7f9dc42feaf7c515c8c7a8c6eef23bc3d
--- /dev/null
+++ b/demo/auto_compression/hyperparameter_tutorial.md
@@ -0,0 +1,77 @@
+
+# ACT超参详细教程
+
+## 各压缩方法超参解析
+
+#### 配置定制量化方案
+
+量化参数主要设置量化比特数和量化op类型,其中量化op包含卷积层(conv2d, depthwise_conv2d)和全连接层(mul, matmul_v2)。以下为只量化卷积层的示例:
+```yaml
+Quantization:
+    use_pact: true                                # 量化训练是否使用PACT方法
+    activation_bits: 8                            # 激活量化比特数
+    weight_bits: 8                                # 权重量化比特数
+    activation_quantize_type: 'range_abs_max'     # 激活量化方式
+    weight_quantize_type: 'channel_wise_abs_max'  # 权重量化方式
+    is_full_quantize: false                       # 是否全量化
+    not_quant_pattern: [skip_quant]               # 跳过量化层的name_scope命名(保持默认即可)
+    quantize_op_types: [conv2d, depthwise_conv2d] # 量化OP列表
+```
+
+#### 配置定制蒸馏策略
+
+蒸馏参数主要设置蒸馏节点(`distill_node_pair`)和教师预测模型路径。蒸馏节点需包含教师网络节点和对应的学生网络节点,其中教师网络节点名称将在程序中自动添加 “teacher_” 前缀,如下所示:
+```yaml
+Distillation:
+  distill_lambda: 1.0
+  distill_loss: l2_loss
+  distill_node_pair:
+  - teacher_relu_30.tmp_0
+  - relu_30.tmp_0
+  merge_feed: true
+  teacher_model_dir: ./inference_model
+  teacher_model_filename: model.pdmodel
+  teacher_params_filename: model.pdiparams
+```
+
+#### 配置定制非结构化稀疏策略
+
+非结构化稀疏参数设置如下所示,其中参数含义详见[非结构化稀疏API文档](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/api_cn/dygraph/pruners/unstructured_pruner.rst):
+```yaml
+UnstructurePrune:
+    prune_strategy: gmp
+    prune_mode: ratio
+    pruned_ratio: 0.75
+    gmp_config:
+      stable_iterations: 0
+      pruning_iterations: 4500
+      tunning_iterations: 4500
+      resume_iteration: -1
+      pruning_steps: 100
+      initial_ratio: 0.15
+    prune_params_type: conv1x1_only
+    local_sparsity: True
+```
+
+#### 配置训练超参
+
+训练参数主要设置学习率、训练次数(epochs)和优化器等。
+```yaml
+TrainConfig:
+  epochs: 14
+  eval_iter: 400
+  learning_rate: 5.0e-03
+  optimizer: SGD
+  optim_args:
+    weight_decay: 0.0005
+```
+
+## 其他参数配置
+
+#### 1.自动蒸馏效果不理想,怎么自主选择蒸馏节点?
+
+首先使用[Netron工具](https://netron.app/)可视化`model.pdmodel`模型文件,选择模型中某些层的输出Tensor名称,对蒸馏节点进行配置。(一般选择Backbone或网络的输出等层进行蒸馏)
+
+![蒸馏节点示例](../../docs/images/dis_node.png)
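+
+如果需要批量查看可选的节点名称,下面给出一个简单的辅助脚本示意(仅供参考,模型路径为示例值,请替换为实际的预测模型路径):直接解析 `model.pdmodel` 中保存的Program结构,打印各OP类型及其输出Tensor名称,再从中挑选合适的节点填入 `distill_node_pair`:
+
+```python
+import paddle
+
+paddle.enable_static()
+
+# 读取并反序列化模型结构文件(路径仅为示例)
+with open("./inference_model/model.pdmodel", "rb") as f:
+    program = paddle.static.Program.parse_from_string(f.read())
+
+# 遍历全部OP,打印OP类型及其输出Tensor名称,便于挑选蒸馏节点
+for op in program.global_block().ops:
+    for out_name in op.output_arg_names:
+        print(op.type, out_name)
+```
+
+挑选出学生网络节点名称后,对应的教师网络节点按上文约定写成 `teacher_` 加同名节点即可。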
diff --git a/demo/auto_compression/demo_imagenet.py b/demo/auto_compression/image_classification/run.py similarity index 98% rename from demo/auto_compression/demo_imagenet.py rename to demo/auto_compression/image_classification/run.py index aaa8a0a9df91204c66b729b910ad335711932db6..86c16692807b0b3c50e3c1f24324d590611a01b2 100644 --- a/demo/auto_compression/demo_imagenet.py +++ b/demo/auto_compression/image_classification/run.py @@ -107,6 +107,6 @@ if __name__ == '__main__': train_config=train_config, train_dataloader=train_dataloader, eval_callback=eval_function, - eval_dataloader=eader_wrapper(eval_reader(data_dir, 64))) + eval_dataloader=reader_wrapper(eval_reader(data_dir, 64))) ac.compress() diff --git a/demo/auto_compression/demo_glue.py b/demo/auto_compression/nlp/run.py similarity index 100% rename from demo/auto_compression/demo_glue.py rename to demo/auto_compression/nlp/run.py diff --git a/demo/auto_compression/pp-humanseg/README.md b/demo/auto_compression/pp-humanseg/README.md deleted file mode 100644 index 38de84508db3108d7512ba6c48965dacd2c71294..0000000000000000000000000000000000000000 --- a/demo/auto_compression/pp-humanseg/README.md +++ /dev/null @@ -1,164 +0,0 @@ -# 使用预测模型进行自动压缩示例 - -本示例将介绍如何使用PaddleSeg中预测模型进行自动压缩训练。 - -以[PP-HumanSeg-Lite](https://github.com/PaddlePaddle/PaddleSeg/tree/develop/contrib/PP-HumanSeg#portrait-segmentation)模型为例,使用自动压缩接口分别进行了蒸馏稀疏训练和蒸馏量化训练实验,并在SD710上使用单线程测试加速效果,其压缩结果和测速结果如下所示: -| 压缩方式 | Total IoU | 耗时(ms)
thread=1 | 加速比 | -|:-----:|:----------:|:---------:| :------:| -| Baseline | 0.9287 | 56.363 | - | -| 非结构化稀疏 | 0.9235 | 37.712 | 49.456% | -| 量化 | 0.9284 | 49.656 | 13.506% | - -## 自动压缩训练流程 - -### 1. 准备数据集 - -参考[PaddleSeg数据准备文档](https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.5/docs/data/marker/marker_cn.md) - -### 2. 准备待压缩模型 - -PaddleSeg 是基于飞桨 PaddlePaddle 开发的端到端图像分割开发套件,涵盖了高精度和轻量级等不同方向的大量高质量分割模型。 -安装 PaddleSeg 指令如下: -``` -pip install paddleseg -``` -PaddleSeg 环境依赖详见[安装文档](https://github.com/PaddlePaddle/PaddleSeg/blob/develop/docs/install_cn.md)。 - -#### 2.1 下载代码 -``` -git clone https://github.com/PaddlePaddle/PaddleSeg.git -``` -#### 2.2 准备预训练模型 - -在 PaddleSeg 目录下执行如下指令,下载预训练模型。 -``` shell -wget https://paddleseg.bj.bcebos.com/dygraph/ppseg/ppseg_lite_portrait_398x224.tar.gz -tar -xzf ppseg_lite_portrait_398x224.tar.gz -``` -#### 2.3 导出预测模型 - -在 PaddleSeg 目录下执行如下命令,则预测模型会保存在 inference_model 文件夹。 -```shell -# 设置1张可用的卡 -export CUDA_VISIBLE_DEVICES=0 -# windows下请执行以下命令 -# set CUDA_VISIBLE_DEVICES=0 -python export.py \ - --config configs/pp_humanseg_lite/pp_humanseg_lite_export_398x224.yml \ - --model_path ppseg_lite_portrait_398x224/model.pdparams \ - --save_dir inference_model - --with_softmax -``` -或直接下载 PP-HumanSeg-Lite 的预测模型: -```shell -wegt https://paddleseg.bj.bcebos.com/dygraph/ppseg/ppseg_lite_portrait_398x224_with_softmax.tar.gz -tar -xzf ppseg_lite_portrait_398x224_with_softmax.tar.gz -``` - -### 3. 多策略融合压缩 - -每一个小章节代表一种多策略融合压缩方式。 - -### 3.1 进行蒸馏稀疏压缩 -自动压缩训练需要准备 config 文件、数据集 dataloader 以及测试函数(``eval_function``)。 -#### 3.1.1 配置config - -使用自动压缩进行蒸馏和非结构化稀疏的联合训练,首先要配置 config 文件,包含蒸馏、稀疏和训练三部分参数。 - -- 蒸馏参数 - -蒸馏参数主要设置蒸馏节点(``distill_node_pair``)和教师网络测预测模型路径。蒸馏节点需包含教师网络节点和对应的学生网络节点,其中教师网络节点名称将在程序中自动添加 “teacher_” 前缀,如下所示。 -```yaml -Distillation: - distill_lambda: 1.0 - distill_loss: l2_loss - distill_node_pair: - - teacher_relu_30.tmp_0 - - relu_30.tmp_0 - merge_feed: true - teacher_model_dir: ./inference_model - teacher_model_filename: model.pdmodel - teacher_params_filename: model.pdiparams -``` -- 稀疏参数 - -稀疏参数设置如下所示,其中参数含义详见[非结构化稀疏API文档](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/api_cn/dygraph/pruners/unstructured_pruner.rst)。 -```yaml -UnstructurePrune: - prune_strategy: gmp - prune_mode: ratio - pruned_ratio: 0.75 - gmp_config: - stable_iterations: 0 - pruning_iterations: 4500 - tunning_iterations: 4500 - resume_iteration: -1 - pruning_steps: 100 - initial_ratio: 0.15 - prune_params_type: conv1x1_only - local_sparsity: True -``` - -- 训练参数 - -训练参数主要设置学习率、训练次数(epochs)和优化器等。 -```yaml -TrainConfig: - epochs: 14 - eval_iter: 400 - learning_rate: 5.0e-03 - optimizer: SGD - optim_args: - weight_decay: 0.0005 -``` -#### 3.1.2 准备 dataloader 和测试函数 -准备好数据集后,需将训练数据封装成 dict 类型传入自动压缩接口,可参考以下函数进行封装。测试函数用于测试模型精度,需在静态图模式下实现。 -```python -def reader_wrapper(reader): - def gen(): - for i, data in enumerate(reader()): - imgs = np.array(data[0]) - yield {"x": imgs} - return gen -``` -> 注:该dict类型的key值要和保存预测模型时的输入名称保持一致。 - -#### 3.1.3 开启训练 - -将训练数据集 dataloader 和测试函数传入接口 ``paddleslim.auto_compression.AutoCompression``,对模型进行非结构化稀疏训练。运行指令如下: -```shell -python run.py \ - --model_dir='inference_model' \ - --model_filename='inference.pdmodel' \ - --params_filename='./inference.pdiparams' \ - --save_dir='./save_model' \ - --config_path='configs/humanseg_sparse_dis.yaml' -``` - -### 3.2 进行蒸馏量化压缩 -#### 3.2.1 配置config -使用自动压缩进行量化训练,首先要配置config文件,包含蒸馏、量化和训练三部分参数。其中蒸馏和训练参数与稀疏训练类似,下面主要介绍量化参数的设置。 -- 量化参数 - -量化参数主要设置量化比特数和量化op类型,其中量化op包含卷积层(conv2d, 
depthwise_conv2d)和全连接层(matmul)。以下为只量化卷积层的示例:
-```yaml
-Quantization:
-  activation_bits: 8
-  weight_bits: 8
-  is_full_quantize: false
-  not_quant_pattern:
-  - skip_quant
-  quantize_op_types:
-  - conv2d
-  - depthwise_conv2d
-```
-#### 3.2.2 开启训练
-将数据集 dataloader 和测试函数(``eval_function``)传入接口``paddleslim.auto_compression.AutoCompression``,对模型进行量化训练。运行指令如下:
-```
-python run.py \
-    --model_dir='inference_model' \
-    --model_filename='inference.pdmodel' \
-    --params_filename='./inference.pdiparams' \
-    --save_dir='./save_model' \
-    --config_path='configs/humanseg_quant_dis.yaml'
-```
diff --git a/demo/auto_compression/run_gelu.sh b/demo/auto_compression/run_gelu.sh
deleted file mode 100644
index 8391fe73880b49a9666630ee5f88b3a826d863ea..0000000000000000000000000000000000000000
--- a/demo/auto_compression/run_gelu.sh
+++ /dev/null
@@ -1,9 +0,0 @@
-python3.7 demo_glue.py \
-    --model_dir='./static_bert_models/' \
-    --model_filename='bert.pdmodel' \
-    --params_filename='bert.pdiparams' \
-    --save_dir='./save_asp_bert/' \
-    --devices='cpu' \
-    --batch_size=32 \
-    --task='sst-2' \
-    --config_path='./configs/NLP/bert_asp_dis.yaml'
diff --git a/demo/auto_compression/run_imagenet.sh b/demo/auto_compression/run_imagenet.sh
deleted file mode 100644
index 376554f37e5dbfa9e356862ec6e0fb93fa460b57..0000000000000000000000000000000000000000
--- a/demo/auto_compression/run_imagenet.sh
+++ /dev/null
@@ -1,9 +0,0 @@
-python3.7 demo_imagenet.py \
-    --model_dir='infermodel_mobilenetv2' \
-    --model_filename='inference.pdmodel' \
-    --params_filename='./inference.pdiparams' \
-    --save_dir='./save_qat_mbv2/' \
-    --devices='cpu' \
-    --batch_size=2 \
-    --data_dir='../data/ILSVRC2012/' \
-    --config_path='./configs/CV/mbv2_ptq_hpo.yaml'
diff --git a/demo/auto_compression/semantic_segmentation/README.md b/demo/auto_compression/semantic_segmentation/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..1a829ccf4b90f4a6f6cd6bbec3534d456fd98299
--- /dev/null
+++ b/demo/auto_compression/semantic_segmentation/README.md
@@ -0,0 +1,91 @@
+# 语义分割自动压缩
+
+目录:
+- [1.简介](#1简介)
+- [2.Benchmark](#2benchmark)
+- [3.开始自动压缩](#3开始自动压缩)
+  - [3.1 环境准备](#31-环境准备)
+  - [3.2 准备数据集](#32-准备数据集)
+  - [3.3 准备预测模型](#33-准备预测模型)
+  - [3.4 自动压缩并产出模型](#34-自动压缩并产出模型)
+- [4.预测部署](#4预测部署)
+- [5.FAQ](#5faq)
+
+## 1.简介
+
+语义分割是计算机视觉领域的重要研究方向,在很多场景中均有应用落地,分割模型部署后的预测性能也备受关注。自动压缩工具(ACT)致力于以更便捷的方式自动压缩、优化模型,达到压缩模型体积、加速模型预测的效果。
+
+## 2.Benchmark
+
+- [PP-HumanSeg-Lite](https://github.com/PaddlePaddle/PaddleSeg/tree/develop/contrib/PP-HumanSeg#portrait-segmentation)
+
+| 模型 | 策略 | Total IoU | 耗时(ms)<br>thread=1 | 配置文件 | Inference模型 |
+|:-----:|:-----:|:----------:|:---------:| :------:| :------:|
+| PP-HumanSeg-Lite | Baseline | 0.9287 | 56.363 | - | [model](https://paddleseg.bj.bcebos.com/dygraph/ppseg/ppseg_lite_portrait_398x224_with_softmax.tar.gz) |
+| PP-HumanSeg-Lite | 非结构化稀疏+蒸馏 | 0.9235 | 37.712 | [config](./configs/pp_humanseg_sparse_dis.yaml) | - |
+| PP-HumanSeg-Lite | 量化+蒸馏 | 0.9284 | 49.656 | [config](./configs/pp_humanseg_quant_dis.yaml) | - |
+
+- 测试环境:`SDM710 2*A75(2.2GHz) 6*A55(1.7GHz)`;
+- 测试数据集:AISegment + PP-HumanSeg14K + 内部自建数据集。
+
+下面将以开源数据集为例介绍如何进行自动压缩。
+
+## 3.开始自动压缩
+
+#### 3.1 环境准备
+
+- PaddlePaddle >= 2.2 (从[Paddle官网](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html)下载安装)
+- PaddleSlim >= 2.3 或者适当develop版本
+- PaddleSeg >= 2.5
+
+```shell
+pip install paddleslim
+pip install paddleseg
+```
+
+注:安装[PaddleSeg](https://github.com/PaddlePaddle/PaddleSeg)的目的只是为了直接使用PaddleSeg中的Dataloader组件,不涉及模型组网等。
+
+#### 3.2 准备数据集
+
+开发者可下载开源数据集或自定义语义分割数据集,例如PP-HumanSeg-Lite模型中使用的语义分割数据集[PP-HumanSeg14K](https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.5/contrib/PP-HumanSeg/paper.md#pp-humanseg14k-a-large-scale-teleconferencing-video-dataset)可从官方渠道下载。
+
+如果是自定义数据集,请参考[PaddleSeg数据准备文档](https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.5/docs/data/marker/marker_cn.md)检查并对齐数据格式即可。
+
+#### 3.3 准备预测模型
+
+预测模型的格式为 `model.pdmodel` 和 `model.pdiparams` 两个文件,其中带 `pdmodel` 后缀的是模型结构文件,带 `pdiparams` 后缀的是权重文件。
+
+注:部分旧格式导出的模型文件名为 `__model__` 和 `__params__`,分别对应 `model.pdmodel` 和 `model.pdiparams` 文件。
+
+- 如果想快速体验,可直接下载PP-HumanSeg-Lite 的预测模型:
+
+```shell
+wget https://paddleseg.bj.bcebos.com/dygraph/ppseg/ppseg_lite_portrait_398x224_with_softmax.tar.gz
+tar -xzf ppseg_lite_portrait_398x224_with_softmax.tar.gz
+```
+
+也可进入[PaddleSeg](https://github.com/PaddlePaddle/PaddleSeg)中自行导出所需预测模型。
+
+#### 3.4 自动压缩并产出模型
+
+首先要配置config文件中模型路径、数据集路径、蒸馏、量化、稀疏化和训练等部分的参数,配置完成后便可开始自动压缩。
+
+```shell
+python run.py \
+    --model_dir='inference_model' \
+    --model_filename='inference.pdmodel' \
+    --params_filename='./inference.pdiparams' \
+    --save_dir='./save_model' \
+    --config_path='configs/pp_humanseg_sparse_dis.yaml'
+```
+
+压缩完成后会在`save_dir`中产出压缩好的预测模型,可直接预测部署。
+
+
+## 4.预测部署
+
+- [Paddle Inference Python部署](https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.5/docs/deployment/inference/python_inference.md)
+- [Paddle Inference C++部署](https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.5/docs/deployment/inference/cpp_inference.md)
+- [Paddle Lite部署](https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.5/docs/deployment/lite/lite.md)
+
+## 5.FAQ
diff --git a/demo/auto_compression/pp-humanseg/configs/humanseg_quant_dis.yaml b/demo/auto_compression/semantic_segmentation/configs/pp_humanseg_quant_dis.yaml
similarity index 73%
rename from demo/auto_compression/pp-humanseg/configs/humanseg_quant_dis.yaml
rename to demo/auto_compression/semantic_segmentation/configs/pp_humanseg_quant_dis.yaml
index ba05eff47c9f4339f15c6a12a0474b029da2fccd..0f38f9dc2a792b65f59bf0d2837a49efe9d2bc00 100644
--- a/demo/auto_compression/pp-humanseg/configs/humanseg_quant_dis.yaml
+++ b/demo/auto_compression/semantic_segmentation/configs/pp_humanseg_quant_dis.yaml
@@ -2,33 +2,33 @@ Distillation:
   distill_lambda: 1.0
   distill_loss: l2_loss
   distill_node_pair:
-  - teacher_reshape2_1.tmp_0 #
+  - teacher_reshape2_1.tmp_0
   - reshape2_1.tmp_0
-  - teacher_reshape2_3.tmp_0 #
+  - teacher_reshape2_3.tmp_0
   - reshape2_3.tmp_0
-  - teacher_reshape2_5.tmp_0 #
+  - teacher_reshape2_5.tmp_0
   - reshape2_5.tmp_0
   - teacher_reshape2_7.tmp_0 #block1
   - reshape2_7.tmp_0
-  - teacher_reshape2_9.tmp_0 #
+  - teacher_reshape2_9.tmp_0
   - reshape2_9.tmp_0
-  - teacher_reshape2_11.tmp_0 #
+  - teacher_reshape2_11.tmp_0
   - reshape2_11.tmp_0
-  - teacher_reshape2_13.tmp_0 #
+  - teacher_reshape2_13.tmp_0
   - reshape2_13.tmp_0
-  - teacher_reshape2_15.tmp_0 #
+  - teacher_reshape2_15.tmp_0
   - reshape2_15.tmp_0
-  - teacher_reshape2_17.tmp_0 #
+  - teacher_reshape2_17.tmp_0
   - reshape2_17.tmp_0
-  - teacher_reshape2_19.tmp_0 #
+  - teacher_reshape2_19.tmp_0
   - reshape2_19.tmp_0
-  - teacher_reshape2_21.tmp_0 #
+  - teacher_reshape2_21.tmp_0
   - reshape2_21.tmp_0
   - teacher_depthwise_conv2d_14.tmp_0 # block2
   - depthwise_conv2d_14.tmp_0
   - teacher_depthwise_conv2d_15.tmp_0
   - depthwise_conv2d_15.tmp_0
-  - teacher_reshape2_23.tmp_0 #block1
+  - teacher_reshape2_23.tmp_0 #block3
   - reshape2_23.tmp_0
   - teacher_relu_30.tmp_0 # final_conv
   - relu_30.tmp_0
diff --git a/demo/auto_compression/pp-humanseg/configs/humanseg_sparse_dis.yaml b/demo/auto_compression/semantic_segmentation/configs/pp_humanseg_sparse_dis.yaml
similarity index 100%
rename from demo/auto_compression/pp-humanseg/configs/humanseg_sparse_dis.yaml
rename to demo/auto_compression/semantic_segmentation/configs/pp_humanseg_sparse_dis.yaml
diff --git a/demo/auto_compression/pp-humanseg/run.py b/demo/auto_compression/semantic_segmentation/run.py
similarity index 100%
rename from demo/auto_compression/pp-humanseg/run.py
rename to demo/auto_compression/semantic_segmentation/run.py
diff --git a/docs/images/dis_node.png b/docs/images/dis_node.png
new file mode 100644
index 0000000000000000000000000000000000000000..897000505f6acdeac7e8ebeab19cb7ff17672892
Binary files /dev/null and b/docs/images/dis_node.png differ