README.md 16.2 KB
Newer Older
W
whs 已提交
1
# PaddleSlim
W
wanghaoshuang 已提交
2

W
whs 已提交
3
中文 | [English](README_en.md)
V
vincentXiyu 已提交
4

W
whs 已提交
5 6 7
[![Documentation Status](https://img.shields.io/badge/docs-latest-brightgreen.svg?style=flat)](https://paddleslim.readthedocs.io/en/latest/)
[![Documentation Status](https://img.shields.io/badge/中文文档-最新-brightgreen.svg)](https://paddleslim.readthedocs.io/zh_CN/latest/)
[![License](https://img.shields.io/badge/license-Apache%202-blue.svg)](LICENSE)
V
vincentXiyu 已提交
8

9
PaddleSlim是一个模型压缩工具库,包含模型剪裁、定点量化、知识蒸馏、超参搜索和模型结构搜索等一系列模型压缩策略。
V
vincentXiyu 已提交
10

11 12 13
对于业务用户,PaddleSlim提供完整的模型压缩解决方案,可用于图像分类、检测、分割等各种类型的视觉场景。
同时也在持续探索NLP领域模型的压缩方案。另外,PaddleSlim提供且在不断完善各种压缩策略在经典开源任务的benchmark,
以便业务用户参考。
V
vincentXiyu 已提交
14

15 16
对于模型压缩算法研究者或开发者,PaddleSlim提供各种压缩策略的底层辅助接口,方便用户复现、调研和使用最新论文方法。
PaddleSlim会从底层能力、技术咨询合作和业务场景等角度支持开发者进行模型压缩策略相关的创新工作。
V
vincentXiyu 已提交
17 18


19
## 功能
V
vincentXiyu 已提交
20

W
whs 已提交
21 22 23 24
<table style="width:100%;" cellpadding="2" cellspacing="0" border="1" bordercolor="#000000">
	<tbody>
		<tr>
			<td style="text-align:center;">
25
				<span style="font-size:18px;">功能模块</span>
W
whs 已提交
26 27
			</td>
			<td style="text-align:center;">
28
				<span style="font-size:18px;">算法</span>
W
whs 已提交
29 30
			</td>
			<td style="text-align:center;">
31
				<span style="font-size:18px;">教程</span><span style="font-size:18px;">与文档</span>
W
whs 已提交
32 33 34 35 36 37 38 39 40 41 42 43 44 45 46
			</td>
		</tr>
		<tr>
			<td style="text-align:center;">
				<span style="font-size:12px;">剪裁</span><span style="font-size:12px;"></span><br />
			</td>
			<td>
				<ul>
					<li>
						Sensitivity&nbsp;&nbsp;Pruner:&nbsp;<a href="https://arxiv.org/abs/1608.08710" target="_blank"><span style="font-family:&quot;font-size:14px;background-color:#FFFFFF;"><span style="font-family:&quot;font-size:14px;background-color:#FFFFFF;">Li H , Kadav A , Durdanovic I , et al. Pruning Filters for Efficient ConvNets[J]. 2016.</span></span></a>
					</li>
					<li>
						AMC Pruner:&nbsp;<a href="https://arxiv.org/abs/1802.03494" target="_blank"><span style="font-family:&quot;font-size:13px;background-color:#FFFFFF;">He, Yihui , et al. "AMC: AutoML for Model Compression and Acceleration on Mobile Devices." (2018).</span></a>
					</li>
					<li>
Y
yukavio 已提交
47
						FPGM Pruner:&nbsp;<a href="https://arxiv.org/abs/1811.00250" target="_blank"><span style="font-family:&quot;font-size:14px;background-color:#FFFFFF;">He Y , Liu P , Wang Z , et al. Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration[C]// IEEE/CVF Conference on Computer Vision &amp; Pattern Recognition. IEEE, 2019.</span></a>
W
whs 已提交
48 49 50 51 52
					</li>
					<li>
						Slim Pruner:<span style="background-color:#FFFDFA;">&nbsp;<a href="https://arxiv.org/pdf/1708.06519.pdf" target="_blank"><span style="font-family:&quot;font-size:14px;background-color:#FFFFFF;">Liu Z , Li J , Shen Z , et al. Learning Efficient Convolutional Networks through Network Slimming[J]. 2017.</span></a></span>
					</li>
					<li>
W
whs 已提交
53
						<span style="background-color:#FFFDFA;">Opt Slim Pruner:&nbsp;<a href="https://arxiv.org/pdf/2003.04566.pdf" target="_blank"><span style="font-family:&quot;font-size:14px;background-color:#FFFFFF;">Ye Y , You G , Fwu J K , et al. Channel Pruning via Optimal Thresholding[J]. 2020.</span></a><br />
54
</span>
W
whs 已提交
55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
					</li>
				</ul>
			</td>
			<td>
					<ul>
						<li>
							<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/api_cn/prune_api.rst" target="_blank">剪裁模块API文档</a>
						</li>
					        <li>
								<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/quick_start/pruning_tutorial.md" target="_blank">剪裁快速开始示例</a>
						</li>
						<li>
							<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/image_classification_sensitivity_analysis_tutorial.md" target="_blank">分类模敏感度分析教程</a>
						</li>
						<li>
							<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/paddledetection_slim_pruing_tutorial.md" target="_blank">检测模型剪裁教程</a>
						</li>
						<li>
								<span id="__kindeditor_bookmark_start_313__"></span><a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/paddledetection_slim_prune_dist_tutorial.md" target="_blank">检测模型剪裁+蒸馏教程</a>
						</li>
						<li>
								<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/paddledetection_slim_sensitivy_tutorial.md" target="_blank">检测模型敏感度分析教程</a>
						</li>
					</ul>
			</td>
		</tr>
		<tr>
			<td style="text-align:center;">
				量化
			</td>
			<td>
				<ul>
					<li>
						Quantization Aware Training:&nbsp;<a href="https://arxiv.org/abs/1806.08342" target="_blank"><span style="font-family:&quot;font-size:14px;background-color:#FFFFFF;">Krishnamoorthi R . Quantizing deep convolutional networks for efficient inference: A whitepaper[J]. 2018.</span></a>
					</li>
					<li>
91
						Post Training&nbsp;<span>Quantization&nbsp;</span><a href="http://on-demand.gputechconf.com/gtc/2017/presentation/s7310-8-bit-inference-with-tensorrt.pdf" target="_blank">原理</a>
W
whs 已提交
92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112
					</li>
					<li>
						Embedding&nbsp;<span>Quantization:&nbsp;<a href="https://arxiv.org/pdf/1603.01025.pdf" target="_blank"><span style="font-family:&quot;font-size:14px;background-color:#FFFFFF;">Miyashita D , Lee E H , Murmann B . Convolutional Neural Networks using Logarithmic Data Representation[J]. 2016.</span></a></span>
					</li>
					<li>
						DSQ: <a href="https://arxiv.org/abs/1908.05033" target="_blank"><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">Gong, Ruihao, et al. "Differentiable soft quantization: Bridging full-precision and low-bit neural networks."&nbsp;</span><i>Proceedings of the IEEE International Conference on Computer Vision</i><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">. 2019.</span></a>
					</li>
					<li>
						PACT:&nbsp; <a href="https://arxiv.org/abs/1805.06085" target="_blank"><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">Choi, Jungwook, et al. "Pact: Parameterized clipping activation for quantized neural networks."&nbsp;</span><i>arXiv preprint arXiv:1805.06085</i><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">&nbsp;(2018).</span></a>
					</li>
				</ul>
			</td>
			<td>
				<ul>
					<li>
						<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/api_cn/quantization_api.rst" target="_blank">量化API文档</a>
					</li>
					<li>
						<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/quick_start/quant_aware_tutorial.md" target="_blank">量化训练快速开始示例</a>
					</li>
					<li>
113
						<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/quick_start/quant_post_static_tutorial.md" target="_blank">静态离线量化快速开始示例</a>
W
whs 已提交
114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197
					</li>
					<li>
						<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/paddledetection_slim_quantization_tutorial.md" target="_blank">检测模型量化教程</a>
					</li>
				</ul>
			</td>
		</tr>
		<tr>
			<td style="text-align:center;">
				蒸馏
			</td>
			<td>
				<ul>
					<li>
						<span>Knowledge Distillation</span>:&nbsp;<a href="https://arxiv.org/abs/1503.02531" target="_blank"><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">Hinton, Geoffrey, Oriol Vinyals, and Jeff Dean. "Distilling the knowledge in a neural network."&nbsp;</span><i>arXiv preprint arXiv:1503.02531</i><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">&nbsp;(2015).</span></a>
					</li>
					<li>
						FSP <span>Knowledge Distillation</span>:&nbsp;&nbsp;<a href="http://openaccess.thecvf.com/content_cvpr_2017/papers/Yim_A_Gift_From_CVPR_2017_paper.pdf" target="_blank"><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">Yim, Junho, et al. "A gift from knowledge distillation: Fast optimization, network minimization and transfer learning."&nbsp;</span><i>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</i><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">. 2017.</span></a>
					</li>
					<li>
						YOLO Knowledge Distillation:&nbsp;&nbsp;<a href="http://openaccess.thecvf.com/content_ECCVW_2018/papers/11133/Mehta_Object_detection_at_200_Frames_Per_Second_ECCVW_2018_paper.pdf" target="_blank"><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">Mehta, Rakesh, and Cemalettin Ozturk. "Object detection at 200 frames per second."&nbsp;</span><i>Proceedings of the European Conference on Computer Vision (ECCV)</i><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">. 2018.</span></a>
					</li>
					<li>
						DML:&nbsp;<a href="https://arxiv.org/abs/1706.00384" target="_blank"><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">Zhang, Ying, et al. "Deep mutual learning."&nbsp;</span><i>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</i><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">. 2018.</span></a>
					</li>
				</ul>
			</td>
			<td>
				<ul>
					<li>
						<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/api_cn/single_distiller_api.rst" target="_blank">蒸馏API文档</a>
					</li>
					<li>
						<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/quick_start/distillation_tutorial.md" target="_blank">蒸馏快速开始示例</a>
					</li>
					<li>
						<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/paddledetection_slim_distillation_tutorial.md" target="_blank">检测模型蒸馏教程</a>
					</li>
				</ul>
			</td>
		</tr>
		<tr>
			<td style="text-align:center;">
				模型结构搜索(NAS)
			</td>
			<td>
				<ul>
					<li>
						Simulate Anneal NAS:&nbsp;<a href="https://arxiv.org/pdf/2005.04117.pdf" target="_blank"><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">Abdelhamed, Abdelrahman, et al. "Ntire 2020 challenge on real image denoising: Dataset, methods and results."&nbsp;</span><i>The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops</i><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">. Vol. 2. 2020.</span></a>
					</li>
					<li>
						DARTS <a href="https://arxiv.org/abs/1806.09055" target="_blank"><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">Liu, Hanxiao, Karen Simonyan, and Yiming Yang. "Darts: Differentiable architecture search."&nbsp;</span><i>arXiv preprint arXiv:1806.09055</i><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">&nbsp;(2018).</span></a>
					</li>
					<li>
						PC-DARTS <a href="https://arxiv.org/abs/1907.05737" target="_blank"><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">Xu, Yuhui, et al. "Pc-darts: Partial channel connections for memory-efficient differentiable architecture search."&nbsp;</span><i>arXiv preprint arXiv:1907.05737</i><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">&nbsp;(2019).</span></a>
					</li>
					<li>
						OneShot&nbsp;
					</li>
				</ul>
			</td>
			<td>
						<ul>
							<li>
								<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/api_cn/nas_api.rst" target="_blank">NAS API文档</a>
							</li>
							<li>
								<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/api_cn/darts.rst" target="_blank">DARTS API文档</a>
							</li>
							<li>
								<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/quick_start/nas_tutorial.md" target="_blank">NAS快速开始示例</a>
							</li>
							<li>
								<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/paddledetection_slim_nas_tutorial.md" target="_blank">检测模型NAS教程</a>
							</li>
							<li>
								<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/sanas_darts_space.md" target="_blank">SANAS进阶版实验教程-压缩DARTS产出模型</a>
							</li>
						</ul>
			</td>
		</tr>
	</tbody>
</table>
<br />
V
vincentXiyu 已提交
198

199
## 安装
V
vincentXiyu 已提交
200

W
whs 已提交
201
```bash
W
whs 已提交
202
pip install paddleslim -i https://pypi.tuna.tsinghua.edu.cn/simple
W
whs 已提交
203
```
204 205
### 量化和Paddle版本的对应关系

206
如果在ARM和GPU上预测,每个版本都可以,如果在CPU上预测,请选择Paddle 2.0对应的PaddleSlim 1.2.0版本
207 208 209 210 211 212 213 214 215 216 217 218 219

- Paddle 1.7 系列版本,需要安装PaddleSlim 1.0.1版本

```bash
pip install paddleslim==1.0.1 -i https://pypi.tuna.tsinghua.edu.cn/simple
```

- Paddle 1.8 系列版本,需要安装PaddleSlim 1.1.1版本

```bash
pip install paddleslim==1.1.1 -i https://pypi.tuna.tsinghua.edu.cn/simple
```

220
- Paddle 2.0 系列版本,需要安装PaddleSlim 1.2.0版本
221 222

```bash
223
pip install paddleslim==1.2.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
224 225 226
```


W
wanghaoshuang 已提交
227

228
## 使用
W
wanghaoshuang 已提交
229

230
- [快速开始](docs/zh_cn/quick_start):通过简单示例介绍如何快速使用PaddleSlim。
B
Bai Yifan 已提交
231 232 233
- 动态图:
    - 剪枝:[教程](dygraph_docs/)[示例](demo/dygraph/pruning)
    - 量化:[示例](demo/dygraph/quant)
234 235 236 237
- [进阶教程](docs/zh_cn/tutorials):PaddleSlim高阶教程。
- [模型库](docs/zh_cn/model_zoo.md):各个压缩策略在图像分类、目标检测和图像语义分割模型上的实验结论,包括模型精度、预测速度和可供下载的预训练模型。
- [API文档](https://paddlepaddle.github.io/PaddleSlim/api_cn/index.html)
- [算法原理](https://paddlepaddle.github.io/PaddleSlim/algo/algo.html): 介绍量化、剪枝、蒸馏、NAS的基本知识背景。
W
whs 已提交
238 239 240 241 242
- 视觉模型压缩
    - [SlimMobileNet](paddleslim/models#slimmobilenet系列指标)
    - [SlimFaceNet](demo/slimfacenet/README.md)
    - [OCR模型压缩(基于PaddleOCR)](demo/ocr/README.md)
    - [检测模型压缩(基于PaddleDetection)](demo/detection/README.md)
B
Bai Yifan 已提交
243
- [TensorRT部署](demo/quant/deploy/TensorRT): 介绍如何使用TensorRT部署PaddleSlim量化得到的模型。
W
wanghaoshuang 已提交
244

245
## 部分压缩策略效果
W
wanghaoshuang 已提交
246

247
### 分类模型
W
wanghaoshuang 已提交
248

249
数据: ImageNet2012; 模型: MobileNetV1;
W
wanghaoshuang 已提交
250

251
|压缩策略 |精度收益(baseline: 70.91%) |模型大小(baseline: 17.0M)|
W
whs 已提交
252
|:---:|:---:|:---:|
253 254 255
| 知识蒸馏(ResNet50)| [+1.06%]() |-|
| 知识蒸馏(ResNet50) + int8量化训练 |[+1.10%]()| [-71.76%]()|
| 剪裁(FLOPs-50%) + int8量化训练|[-1.71%]()|[-86.47%]()|
W
wanghaoshuang 已提交
256 257


258
### 图像检测模型
W
wanghaoshuang 已提交
259

260
#### 数据:Pascal VOC;模型:MobileNet-V1-YOLOv3
W
wanghaoshuang 已提交
261

262
|        压缩方法           | mAP(baseline: 76.2%)         | 模型大小(baseline: 94MB)      |
W
whs 已提交
263
| :---------------------:   | :------------: | :------------:|
264 265 266
| 知识蒸馏(ResNet34-YOLOv3) | [+2.8%](#)      |       -       |
| 剪裁 FLOPs -52.88%        | [+1.4%]()      | [-67.76%]()   |
|知识蒸馏(ResNet34-YOLOv3)+剪裁(FLOPs-69.57%)| [+2.6%]()|[-67.00%]()|
W
wanghaoshuang 已提交
267 268


269
#### 数据:COCO;模型:MobileNet-V1-YOLOv3
W
wanghaoshuang 已提交
270

271
|        压缩方法           | mAP(baseline: 29.3%) | 模型大小|
W
whs 已提交
272
| :---------------------:   | :------------: | :------:|
273 274
| 知识蒸馏(ResNet34-YOLOv3) |  [+2.1%]()     |-|
| 知识蒸馏(ResNet34-YOLOv3)+剪裁(FLOPs-67.56%) | [-0.3%]() | [-66.90%]()|
W
wanghaoshuang 已提交
275

276
### 搜索
W
wanghaoshuang 已提交
277

278
数据:ImageNet2012; 模型:MobileNetV2
W
wanghaoshuang 已提交
279

280
|硬件环境           | 推理耗时 | Top1准确率(baseline:71.90%) |
W
whs 已提交
281 282 283 284
|:---------------:|:---------:|:--------------------:|
| RK3288  | [-23%]()    | +0.07%    |
| Android cellphone  | [-20%]()    | +0.16% |
| iPhone 6s   | [-17%]()    | +0.32%  |
W
whs 已提交
285 286 287 288 289 290 291

## 许可证书
本项目的发布受[Apache 2.0 license](LICENSE)许可认证。

## 如何贡献代码

我们非常欢迎你可以为PaddleSlim提供代码,也十分感谢你的反馈。