Unverified commit 466b587f, authored by whs, committed by GitHub

Modify readme (#651)

Parent 250eaea0
......@@ -6,286 +6,160 @@
[![Documentation Status](https://img.shields.io/badge/中文文档-最新-brightgreen.svg)](https://paddleslim.readthedocs.io/zh_CN/latest/)
[![License](https://img.shields.io/badge/license-Apache%202-blue.svg)](LICENSE)
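PaddleSlim is a model compression library that includes a range of compression strategies such as pruning, fixed-point quantization, knowledge distillation, hyperparameter search, and neural architecture search.
## Introduction
For business users, PaddleSlim provides complete model compression solutions for vision scenarios such as image classification, detection, and segmentation.
It also continues to explore compression solutions for NLP models. In addition, PaddleSlim provides, and keeps improving, benchmarks of each compression strategy on classic open-source tasks
for business users to reference.
PaddleSlim is a library focused on deep learning model compression. It provides compression strategies including **pruning, quantization, distillation, and neural architecture search** to help users quickly reduce model size.
For researchers and developers of model compression algorithms, PaddleSlim provides low-level helper interfaces for each compression strategy, making it easier to reproduce, study, and apply methods from the latest papers.
PaddleSlim supports developers' innovative work on compression strategies in terms of low-level capabilities, technical consulting and cooperation, and business scenarios.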
## Version compatibility
| PaddleSlim | PaddlePaddle | PaddleLite | Notes |
| :-----------: | :------------: | :------------:| :----------:|
| 1.0.1 | <=1.7 | 2.7 | Supports static graph |
| 1.1.1 | 1.8 | 2.7 | Supports static graph |
| 1.2.0 | 2.0Beta/RC | 2.8 | Supports static graph; adds CPU inference |
| 2.0.0 | 2.0 | 2.8 | Supports dynamic and static graph |
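The version table above can be read as a simple lookup. The sketch below is only an illustration of that lookup, not part of PaddleSlim: the function names `paddleslim_version_for` and `pip_command` are hypothetical, and the way pre-release strings such as `2.0.0rc1` are detected is an assumption about how the PaddlePaddle version is written.

```python
import re


def paddleslim_version_for(paddle_version: str) -> str:
    """Return the PaddleSlim version the table pairs with a PaddlePaddle version.

    Hypothetical helper: it only encodes the four rows of the table.
    """
    match = re.match(r"(\d+)\.(\d+)", paddle_version)
    if match is None:
        raise ValueError(f"unrecognized version string: {paddle_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Assumption: Beta/RC builds carry "beta" or "rc" somewhere in the string.
    is_prerelease = re.search(r"beta|rc", paddle_version, re.IGNORECASE) is not None

    if (major, minor) <= (1, 7):
        return "1.0.1"
    if (major, minor) == (1, 8):
        return "1.1.1"
    if (major, minor) == (2, 0):
        return "1.2.0" if is_prerelease else "2.0.0"
    raise ValueError(f"PaddlePaddle {paddle_version} is not covered by the table")


def pip_command(paddle_version: str) -> str:
    """Build the matching pip command (note the double '==' for a pinned version)."""
    return f"pip install paddleslim=={paddleslim_version_for(paddle_version)}"


if __name__ == "__main__":
    for version in ("1.7.2", "1.8.5", "2.0.0rc1", "2.0.1"):
        print(version, "->", pip_command(version))
```

Versions outside the table raise an error on purpose, since the table says nothing about them.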
## Features
<table style="width:100%;" cellpadding="2" cellspacing="0" border="1" bordercolor="#000000">
<tbody>
<tr>
<td style="text-align:center;">
<span style="font-size:18px;">Module</span>
</td>
<td style="text-align:center;">
<span style="font-size:18px;">Algorithms</span>
</td>
<td style="text-align:center;">
<span style="font-size:18px;">Tutorials</span><span style="font-size:18px;"> and documentation</span>
</td>
</tr>
<tr>
<td style="text-align:center;">
<span style="font-size:12px;">Pruning</span><span style="font-size:12px;"></span><br />
</td>
<td>
<ul>
<li>
Sensitivity&nbsp;&nbsp;Pruner:&nbsp;<a href="https://arxiv.org/abs/1608.08710" target="_blank"><span style="font-family:&quot;font-size:14px;background-color:#FFFFFF;"><span style="font-family:&quot;font-size:14px;background-color:#FFFFFF;">Li H , Kadav A , Durdanovic I , et al. Pruning Filters for Efficient ConvNets[J]. 2016.</span></span></a>
</li>
<li>
AMC Pruner:&nbsp;<a href="https://arxiv.org/abs/1802.03494" target="_blank"><span style="font-family:&quot;font-size:13px;background-color:#FFFFFF;">He, Yihui , et al. "AMC: AutoML for Model Compression and Acceleration on Mobile Devices." (2018).</span></a>
</li>
<li>
FPGM Pruner:&nbsp;<a href="https://arxiv.org/abs/1811.00250" target="_blank"><span style="font-family:&quot;font-size:14px;background-color:#FFFFFF;">He Y , Liu P , Wang Z , et al. Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration[C]// IEEE/CVF Conference on Computer Vision &amp; Pattern Recognition. IEEE, 2019.</span></a>
</li>
<li>
Slim Pruner:<span style="background-color:#FFFDFA;">&nbsp;<a href="https://arxiv.org/pdf/1708.06519.pdf" target="_blank"><span style="font-family:&quot;font-size:14px;background-color:#FFFFFF;">Liu Z , Li J , Shen Z , et al. Learning Efficient Convolutional Networks through Network Slimming[J]. 2017.</span></a></span>
</li>
<li>
<span style="background-color:#FFFDFA;">Opt Slim Pruner:&nbsp;<a href="https://arxiv.org/pdf/2003.04566.pdf" target="_blank"><span style="font-family:&quot;font-size:14px;background-color:#FFFFFF;">Ye Y , You G , Fwu J K , et al. Channel Pruning via Optimal Thresholding[J]. 2020.</span></a><br />
</span>
</li>
</ul>
</td>
<td>
<ul>
<li>
<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/api_cn/prune_api.rst" target="_blank">Pruning API documentation</a>
</li>
<li>
<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/quick_start/pruning_tutorial.md" target="_blank">Pruning quick-start example</a>
</li>
<li>
<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/image_classification_sensitivity_analysis_tutorial.md" target="_blank">Classification model sensitivity analysis tutorial</a>
</li>
<li>
<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/paddledetection_slim_pruing_tutorial.md" target="_blank">Detection model pruning tutorial</a>
</li>
<li>
<span id="__kindeditor_bookmark_start_313__"></span><a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/paddledetection_slim_prune_dist_tutorial.md" target="_blank">Detection model pruning + distillation tutorial</a>
</li>
<li>
<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/paddledetection_slim_sensitivy_tutorial.md" target="_blank">Detection model sensitivity analysis tutorial</a>
</li>
</ul>
</td>
</tr>
<tr>
<td style="text-align:center;">
Quantization
</td>
<td>
<ul>
<li>
Quantization Aware Training:&nbsp;<a href="https://arxiv.org/abs/1806.08342" target="_blank"><span style="font-family:&quot;font-size:14px;background-color:#FFFFFF;">Krishnamoorthi R . Quantizing deep convolutional networks for efficient inference: A whitepaper[J]. 2018.</span></a>
</li>
<li>
Post Training&nbsp;<span>Quantization&nbsp;</span><a href="http://on-demand.gputechconf.com/gtc/2017/presentation/s7310-8-bit-inference-with-tensorrt.pdf" target="_blank">principles</a>
</li>
<li>
Embedding&nbsp;<span>Quantization:&nbsp;<a href="https://arxiv.org/pdf/1603.01025.pdf" target="_blank"><span style="font-family:&quot;font-size:14px;background-color:#FFFFFF;">Miyashita D , Lee E H , Murmann B . Convolutional Neural Networks using Logarithmic Data Representation[J]. 2016.</span></a></span>
</li>
<li>
DSQ: <a href="https://arxiv.org/abs/1908.05033" target="_blank"><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">Gong, Ruihao, et al. "Differentiable soft quantization: Bridging full-precision and low-bit neural networks."&nbsp;</span><i>Proceedings of the IEEE International Conference on Computer Vision</i><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">. 2019.</span></a>
</li>
<li>
PACT:&nbsp; <a href="https://arxiv.org/abs/1805.06085" target="_blank"><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">Choi, Jungwook, et al. "Pact: Parameterized clipping activation for quantized neural networks."&nbsp;</span><i>arXiv preprint arXiv:1805.06085</i><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">&nbsp;(2018).</span></a>
</li>
</ul>
</td>
<td>
<ul>
<li>
<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/api_cn/quantization_api.rst" target="_blank">Quantization API documentation</a>
</li>
<li>
<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/quick_start/quant_aware_tutorial.md" target="_blank">Quantization-aware training quick-start example</a>
</li>
<li>
<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/quick_start/quant_post_static_tutorial.md" target="_blank">Static post-training quantization quick-start example</a>
</li>
<li>
<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/paddledetection_slim_quantization_tutorial.md" target="_blank">Detection model quantization tutorial</a>
</li>
</ul>
</td>
</tr>
<tr>
<td style="text-align:center;">
Distillation
</td>
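## Installation
Install the latest version:
```bash
pip install paddleslim -i https://pypi.tuna.tsinghua.edu.cn/simple
```
Install a specific version:
```bash
pip install paddleslim==1.2.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
```
## Recent updates
2021.2.5: Released V2.0.0, adding dynamic graph support and OFA compression, and improving pruning.
## Feature overview
PaddleSlim supports the following features, and also supports custom quantization, pruning, and other functions.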
<table>
<tr align="center" valign="bottom">
<th>Quantization</th>
<th>Pruning</th>
<th>NAS</th>
<th>Distilling</th>
</tr>
<tr valign="top">
<td>
<ul>
<li>
<span>Knowledge Distillation</span>:&nbsp;<a href="https://arxiv.org/abs/1503.02531" target="_blank"><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">Hinton, Geoffrey, Oriol Vinyals, and Jeff Dean. "Distilling the knowledge in a neural network."&nbsp;</span><i>arXiv preprint arXiv:1503.02531</i><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">&nbsp;(2015).</span></a>
</li>
<li>
FSP <span>Knowledge Distillation</span>:&nbsp;&nbsp;<a href="http://openaccess.thecvf.com/content_cvpr_2017/papers/Yim_A_Gift_From_CVPR_2017_paper.pdf" target="_blank"><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">Yim, Junho, et al. "A gift from knowledge distillation: Fast optimization, network minimization and transfer learning."&nbsp;</span><i>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</i><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">. 2017.</span></a>
</li>
<li>
YOLO Knowledge Distillation:&nbsp;&nbsp;<a href="http://openaccess.thecvf.com/content_ECCVW_2018/papers/11133/Mehta_Object_detection_at_200_Frames_Per_Second_ECCVW_2018_paper.pdf" target="_blank"><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">Mehta, Rakesh, and Cemalettin Ozturk. "Object detection at 200 frames per second."&nbsp;</span><i>Proceedings of the European Conference on Computer Vision (ECCV)</i><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">. 2018.</span></a>
</li>
<li>
DML:&nbsp;<a href="https://arxiv.org/abs/1706.00384" target="_blank"><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">Zhang, Ying, et al. "Deep mutual learning."&nbsp;</span><i>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</i><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">. 2018.</span></a>
</li>
<li>QAT</li>
<li>PACT</li>
<li>PTQ-Static</li>
<li>PTQ-Dynamic</li>
<li>Embedding Quant</li>
</ul>
</td>
<td>
<ul>
<li>
<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/api_cn/single_distiller_api.rst" target="_blank">Distillation API documentation</a>
</li>
<li>
<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/quick_start/distillation_tutorial.md" target="_blank">Distillation quick-start example</a>
</li>
<li>
<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/paddledetection_slim_distillation_tutorial.md" target="_blank">Detection model distillation tutorial</a>
</li>
<li>SensitivityPruner</li>
<li>FPGMFilterPruner</li>
<li>L1NormFilterPruner</li>
<li>L2NormFilterPruner</li>
<li>*SlimFilterPruner</li>
<li>*OptSlimFilterPruner</li>
</ul>
</td>
</tr>
<tr>
<td style="text-align:center;">
Neural architecture search (NAS)
</td>
<td>
<ul>
<li>
Simulate Anneal NAS:&nbsp;<a href="https://arxiv.org/pdf/2005.04117.pdf" target="_blank"><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">Abdelhamed, Abdelrahman, et al. "Ntire 2020 challenge on real image denoising: Dataset, methods and results."&nbsp;</span><i>The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops</i><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">. Vol. 2. 2020.</span></a>
</li>
<li>
DARTS <a href="https://arxiv.org/abs/1806.09055" target="_blank"><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">Liu, Hanxiao, Karen Simonyan, and Yiming Yang. "Darts: Differentiable architecture search."&nbsp;</span><i>arXiv preprint arXiv:1806.09055</i><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">&nbsp;(2018).</span></a>
</li>
<li>
PC-DARTS <a href="https://arxiv.org/abs/1907.05737" target="_blank"><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">Xu, Yuhui, et al. "Pc-darts: Partial channel connections for memory-efficient differentiable architecture search."&nbsp;</span><i>arXiv preprint arXiv:1907.05737</i><span style="color:#222222;font-family:Arial, sans-serif;font-size:13px;background-color:#FFFFFF;">&nbsp;(2019).</span></a>
</li>
<li>
OneShot&nbsp;
</li>
<li>*Simulate Anneal based NAS</li>
<li>*Reinforcement Learning based NAS</li>
<li>**DARTS</li>
<li>**PC-DARTS</li>
<li>**Once-for-All</li>
<li>*Hardware-aware Search</li>
</ul>
</td>
<td>
<ul>
<li>
<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/api_cn/nas_api.rst" target="_blank">NAS API documentation</a>
</li>
<li>
<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/api_cn/darts.rst" target="_blank">DARTS API documentation</a>
</li>
<li>
<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/quick_start/nas_tutorial.md" target="_blank">NAS quick-start example</a>
</li>
<li>
<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/paddledetection_slim_nas_tutorial.md" target="_blank">Detection model NAS tutorial</a>
</li>
<li>
<a href="https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/sanas_darts_space.md" target="_blank">Advanced SANAS tutorial: compressing models produced by DARTS</a>
</li>
<li>*FSP</li>
<li>*DML</li>
<li>*DK for YOLOv3</li>
</ul>
</td>
</tr>
</tbody>
</tr>
</table>
<br />
## Installation
Note: * means static graph only; ** means dynamic graph only.
```bash
pip install paddleslim -i https://pypi.tuna.tsinghua.edu.cn/simple
```
### Quantization and matching Paddle versions
For inference on ARM and GPU, any version works. For inference on CPU, choose PaddleSlim 1.2.0, which matches Paddle 2.0.
- Paddle 1.7 series requires PaddleSlim 1.0.1:
```bash
pip install paddleslim==1.0.1 -i https://pypi.tuna.tsinghua.edu.cn/simple
```
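### Results
- Paddle 1.8 series requires PaddleSlim 1.1.1:
PaddleSlim has compressed models for typical vision and natural language processing tasks and measured the speedup on devices such as Nvidia GPUs and ARM. Some of the results are shown here; for the detailed solutions, see the CV and NLP model compression solutions below: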
```bash
pip install paddleslim==1.1.1 -i https://pypi.tuna.tsinghua.edu.cn/simple
```
<p align="center">
<img src="docs/images/benchmark.png" height=185 width=849 hspace='10'/> <br />
<strong>Table 1: Compression and speedup of selected models</strong>
</p>
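- Paddle 2.0 series requires PaddleSlim 1.2.0:
Notes:
- YOLOv3: 3.55x speedup on the SD855 mobile platform.
- PP-OCR: model size reduced from 8.9M to 2.9M, with a 1.27x speedup on SD855.
- BERT: parameters reduced from 110M to 80M; with improved accuracy, FP16 computation on a Tesla T4 GPU is 1.47x faster.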
```bash
pip install paddleslim==1.2.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
```
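## Documentation and tutorials
### Quick start
- Quantization-aware training - [dynamic graph]() | [static graph]()
- Post-training quantization - [dynamic graph]() | [static graph]()
- Pruning - [dynamic graph]() | [static graph]()
- Distillation - [dynamic graph]() | [static graph]()
- NAS - [dynamic graph]() | [static graph]()
## Usage
### Advanced tutorials
- [Quick start](docs/zh_cn/quick_start): simple examples that show how to get started with PaddleSlim.
- Dynamic graph:
- Pruning: [tutorial](dygraph_docs/) | [example](demo/dygraph/pruning)
- Quantization: [example](demo/dygraph/quant)
- [Advanced tutorials](docs/zh_cn/tutorials): advanced PaddleSlim tutorials.
- [Model zoo](docs/zh_cn/model_zoo.md): experimental results of each compression strategy on image classification, object detection, and semantic segmentation models, including accuracy, inference speed, and downloadable pretrained models.
- [API documentation](https://paddlepaddle.github.io/PaddleSlim/api_cn/index.html)
- [Algorithm background](https://paddlepaddle.github.io/PaddleSlim/algo/algo.html): introduces the basics of quantization, pruning, distillation, and NAS.
- Vision model compression
- [SlimMobileNet](paddleslim/models#slimmobilenet系列指标)
- [SlimFaceNet](demo/slimfacenet/README.md)
- [OCR model compression (based on PaddleOCR)](demo/ocr/README.md)
- [Detection model compression (based on PaddleDetection)](demo/detection/README.md)
- [TensorRT deployment](demo/quant/deploy/TensorRT): how to deploy models quantized by PaddleSlim with TensorRT.
#### Compression features in detail
## Results of selected compression strategies
[Quantization-aware training]() | [Post-training quantization]() | [Pruning]() | [Distillation]() | [NAS]()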
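### Classification models
#### Inference deployment
Data: ImageNet2012; model: MobileNetV1
- [Overview]()
- [PaddleInference quantized deployment]()
- [Intel CPU quantized deployment]()
- [GPU quantized deployment]()
- [PaddleLite quantized deployment]()
| Compression strategy | Accuracy gain (baseline: 70.91%) | Model size (baseline: 17.0M) |
|:---:|:---:|:---:|
| Knowledge distillation (ResNet50) | [+1.06%]() | - |
| Knowledge distillation (ResNet50) + int8 quantization-aware training | [+1.10%]() | [-71.76%]() |
| Pruning (FLOPs -50%) + int8 quantization-aware training | [-1.71%]() | [-86.47%]() |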
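The model-size column reports a relative change against the baseline rather than an absolute size. As a rough worked example, the sketch below converts those percentages back to absolute sizes; the helper name `compressed_size` is hypothetical, and it assumes the percentages are relative to the stated 17.0M baseline.

```python
def compressed_size(baseline: float, relative_change_pct: float) -> float:
    """Apply a relative size change (in percent, negative means smaller) to a baseline."""
    return baseline * (1.0 + relative_change_pct / 100.0)


if __name__ == "__main__":
    baseline_m = 17.0  # MobileNetV1 baseline size from the table, in the table's "M" unit
    rows = {
        "distillation + int8 quantization-aware training": -71.76,
        "pruning (FLOPs -50%) + int8 quantization-aware training": -86.47,
    }
    for name, change in rows.items():
        print(f"{name}: {compressed_size(baseline_m, change):.2f}M")
```

Under that assumption, the two rows correspond to roughly 4.8M and 2.3M.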
### CV model compression
- [Detection model compression (based on PaddleDetection)]()
- YOLOv3 3.5x speedup solution
### Object detection models
- [Segmentation model compression (based on PaddleSeg)]()
#### Data: Pascal VOC; model: MobileNet-V1-YOLOv3
- [OCR model compression (based on PaddleOCR)]()
- [3.5M model compression solution]()
| Compression method | mAP (baseline: 76.2%) | Model size (baseline: 94MB) |
| :---------------------: | :------------: | :------------:|
| Knowledge distillation (ResNet34-YOLOv3) | [+2.8%](#) | - |
| Pruning (FLOPs -52.88%) | [+1.4%]() | [-67.76%]() |
| Knowledge distillation (ResNet34-YOLOv3) + pruning (FLOPs -69.57%) | [+2.6%]() | [-67.00%]() |
### NLP model compression
- [BERT]()
- [ERNIE]()
#### Data: COCO; model: MobileNet-V1-YOLOv3
### General lightweight models
| Compression method | mAP (baseline: 29.3%) | Model size |
| :---------------------: | :------------: | :------:|
| Knowledge distillation (ResNet34-YOLOv3) | [+2.1%]() | - |
| Knowledge distillation (ResNet34-YOLOv3) + pruning (FLOPs -67.56%) | [-0.3%]() | [-66.90%]() |
- Face model (SlimFaceNet)
- Image classification model (SlimMobileNet)
### Search
### API documentation
Data: ImageNet2012; model: MobileNetV2
- Dynamic graph
- Static graph
| Hardware | Inference latency | Top-1 accuracy (baseline: 71.90%) |
|:---------------:|:---------:|:--------------------:|
| RK3288 | [-23%]() | +0.07% |
| Android cellphone | [-20%]() | +0.16% |
| iPhone 6s | [-17%]() | +0.32% |
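The search table reports latency as a relative reduction, while other parts of this README quote speedup factors. The two are related by speedup = 1 / (1 + change). The sketch below shows that conversion; the helper name `speedup_from_latency_change` is hypothetical, and it assumes the percentages are measured against the baseline model's latency on the same device.

```python
def speedup_from_latency_change(latency_change_pct: float) -> float:
    """Convert a relative latency change (percent, negative means faster) to a speedup factor."""
    remaining = 1.0 + latency_change_pct / 100.0
    if remaining <= 0:
        raise ValueError("latency cannot drop by 100% or more")
    return 1.0 / remaining


if __name__ == "__main__":
    for device, change in (("RK3288", -23), ("Android cellphone", -20), ("iPhone 6s", -17)):
        print(f"{device}: {speedup_from_latency_change(change):.2f}x")
```

For example, a 20% latency reduction corresponds to a 1.25x speedup, not 1.20x.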
### [FAQ]()
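## License
This project is released under the [Apache 2.0 license](LICENSE).
## How to contribute
This project is released under the [Apache 2.0 license](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/LICENSE).
## Contributing
We warmly welcome code contributions to PaddleSlim and greatly appreciate your feedback.
## Join the PaddleSlim technical discussion group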