[中文](README.md) | English

Documentation: https://paddlepaddle.github.io/PaddleSlim

# PaddleSlim

PaddleSlim is a toolkit for model compression. It contains a collection of compression strategies, such as pruning, fixed point quantization, knowledge distillation, hyperparameter searching and neural architecture search.

PaddleSlim provides compression solutions for computer vision models, such as image classification, object detection and semantic segmentation. Meanwhile, PaddleSlim keeps exploring advanced compression strategies for language models. Furthermore, benchmarks of compression strategies on some open tasks are available for your reference.

PaddleSlim also provides auxiliary and primitive APIs for developers and researchers to survey, implement and apply methods from the latest papers. PaddleSlim also supports developers with framework capabilities and technology consulting.

## Features

### Pruning

  - Uniform pruning of convolution channels
  - Sensitivity-based pruning
  - Automated pruning based on an evolution search strategy
  - Support pruning of various deep architectures such as VGG, ResNet, and MobileNet.
  - Support self-defined range of pruning, i.e., layers to be pruned.
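
The filter-selection idea behind uniform pruning can be sketched in a few lines of plain Python. This is a conceptual illustration with made-up names, not PaddleSlim's actual API: rank each layer's convolution filters by L1 norm and drop a fixed ratio of the weakest.

```python
# Conceptual sketch of uniform L1-norm filter pruning (illustration only;
# see the PaddleSlim API documentation for the real pruning interfaces).

def prune_filters(filters, ratio):
    """Keep the (1 - ratio) fraction of filters with the largest L1 norm.

    `filters` is a list of weight lists, one per output channel.
    Returns the sorted indices of the filters that survive pruning.
    """
    norms = [sum(abs(w) for w in f) for f in filters]      # L1 norm per filter
    order = sorted(range(len(filters)), key=lambda i: -norms[i])
    keep = max(1, int(len(filters) * (1.0 - ratio)))       # never prune everything
    return sorted(order[:keep])

# A toy layer with four filters; with ratio=0.5 the two weakest are removed.
layer = [[0.1, -0.2], [1.0, 0.9], [0.0, 0.05], [-0.8, 0.7]]
print(prune_filters(layer, 0.5))  # -> [1, 3]
```

Sensitivity-based pruning differs only in how the per-layer ratio is chosen: instead of one uniform ratio, each layer gets a ratio derived from how much accuracy drops when that layer is pruned.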

### Fixed Point Quantization

  - **Training aware**
    - Dynamic strategy: During inference, we quantize models with hyperparameters dynamically estimated from small batches of samples.
    - Static strategy: During inference, we quantize models with the same hyperparameters estimated from training data.
    - Support layer-wise and channel-wise quantization.
  - **Post training**
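
Independent of PaddleSlim's concrete API, fixed point quantization boils down to mapping float tensors onto an int8 grid with a scale estimated from data. A minimal max-abs sketch in plain Python (all names here are illustrative):

```python
# Minimal sketch of symmetric max-abs int8 quantization (illustration only).

def quantize(values, scale):
    """Map floats onto the int8 range [-127, 127] using a precomputed scale."""
    return [max(-127, min(127, round(v / scale * 127))) for v in values]

def dequantize(q_values, scale):
    """Map int8 values back to approximate floats."""
    return [q * scale / 127 for q in q_values]

# The "static" strategy estimates the scale once from calibration data;
# the "dynamic" strategy would recompute it per batch at inference time.
calibration = [0.5, -1.5, 0.25, 1.0]
scale = max(abs(v) for v in calibration)   # max-abs scale = 1.5

print(quantize(calibration, scale))        # -> [42, -127, 21, 85]
```

Channel-wise quantization applies this same mapping with a separate scale per output channel rather than one scale per layer.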

### Knowledge Distillation

  - **Naive knowledge distillation:** transfers dark knowledge by merging the teacher and student models into the same Program
  - **Paddle large-scale scalable knowledge distillation framework Pantheon:** a universal solution for knowledge distillation, more flexible than naive knowledge distillation, and easier to scale to large-scale applications.

    - Decouple the teacher and student models --- they run in different processes in the same or different nodes, and transfer knowledge via TCP/IP ports or local files;
    - Friendly to assemble multiple teacher models and each of them can work in either online or offline mode independently;
    - Merge knowledge from different teachers and make batch data for the student model automatically;
    - Support the large-scale knowledge prediction of teacher models on multiple devices.
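
The dark-knowledge transfer above can be sketched as the student matching the teacher's temperature-softened output distribution. A toy plain-Python version (illustrative names only; Pantheon handles the real teacher/student plumbing and knowledge transport):

```python
# Sketch of the soft-target loss used in knowledge distillation
# (illustration only, not the Pantheon API).
import math

def softmax(logits, temperature=1.0):
    """Softmax with a temperature; higher temperature softens the distribution."""
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=4.0):
    """Cross-entropy between softened teacher and student distributions."""
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return -sum(ti * math.log(si) for ti, si in zip(t, s))

teacher = [6.0, 2.0, 1.0]   # a confident teacher
student = [3.0, 2.5, 1.0]   # a less confident student
print(round(distillation_loss(teacher, student), 3))
```

The loss is minimized exactly when the student reproduces the teacher's softened distribution, which is why the softened probabilities carry more "dark knowledge" than hard labels alone.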

### Neural Architecture Search

  - Neural architecture search based on evolution strategy.
  - Support distributed search.
  - One-Shot neural architecture search.
  - Support FLOPs and latency constrained search.
  - Support the latency estimation on different hardware and platforms.
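
The evolution-based search loop can be sketched as mutate-and-select over integer-coded architectures. Below is a toy plain-Python version with a made-up reward standing in for accuracy under a FLOPs constraint; none of the names come from PaddleSlim's NAS API:

```python
# Toy sketch of evolution-based architecture search (illustration only):
# mutate the best architectures seen so far and keep the top performers.
import random

def evolve(evaluate, init_arch, rounds=50, population=8, seed=0):
    """Hill-climb over integer-coded architectures by random mutation."""
    rng = random.Random(seed)
    pool = [list(init_arch)]
    for _ in range(rounds):
        parent = max(pool, key=evaluate)
        child = list(parent)
        i = rng.randrange(len(child))
        child[i] = max(0, child[i] + rng.choice([-1, 1]))  # mutate one gene
        pool.append(child)
        pool = sorted(pool, key=evaluate, reverse=True)[:population]
    return max(pool, key=evaluate)

# A made-up reward: prefer architectures close to [3, 2, 4], standing in for
# accuracy measured under a FLOPs or latency constraint.
target = [3, 2, 4]
reward = lambda a: -sum((x - t) ** 2 for x, t in zip(a, target))
best = evolve(reward, [0, 0, 0])
print(best)  # typically converges toward [3, 2, 4]
```

In the real setting, `evaluate` is the expensive step (training or estimating a candidate network), which is why distributed search and latency estimators matter.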

## Install

Requires:

Paddle >= 1.7.0

```bash
pip install paddleslim -i https://pypi.org/simple
```

### Quantization

If you want to use quantization in PaddleSlim, please install PaddleSlim as follows.

To deploy quantized models on ARM or GPU, any PaddleSlim version works; for CPU deployment, install PaddleSlim 1.2.0.

- For Paddle 1.7, install PaddleSlim 1.0.1

```bash
pip install paddleslim==1.0.1 -i https://pypi.tuna.tsinghua.edu.cn/simple
```

- For Paddle 1.8, install PaddleSlim 1.1.1

```bash
pip install paddleslim==1.1.1 -i https://pypi.tuna.tsinghua.edu.cn/simple
```

- For Paddle 2.0, install PaddleSlim 1.2.0

```bash
pip install paddleslim==1.2.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
```

## Usage

- [QuickStart](https://paddlepaddle.github.io/PaddleSlim/quick_start/index_en.html): Introduces how to use PaddleSlim through simple examples.

- Dynamic graph
  - Pruning: [Tutorial](dygraph_docs/), [Demo](demo/dygraph/pruning)
  - Quantization: [Demo](demo/dygraph/quant)

- [Advanced Tutorials](https://paddlepaddle.github.io/PaddleSlim/tutorials/index_en.html): Tutorials on advanced usage of PaddleSlim.

- [Model Zoo](https://paddlepaddle.github.io/PaddleSlim/model_zoo_en.html): Benchmarks and pretrained models.

- [API Documents](https://paddlepaddle.github.io/PaddleSlim/api_en/index_en.html)

- [Algorithm Background](https://paddlepaddle.github.io/PaddleSlim/algo/algo.html): Introduces the background of quantization, pruning, distillation and NAS.

- [PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection/tree/master/slim): How to use PaddleSlim in the PaddleDetection library.

- [PaddleSeg](https://github.com/PaddlePaddle/PaddleSeg/tree/develop/slim): How to use PaddleSlim in the PaddleSeg library.

- [PaddleLite](https://paddlepaddle.github.io/Paddle-Lite/): How to use Paddle Lite to deploy models generated by PaddleSlim.

- [TensorRT Deploy](demo/quant/deploy/TensorRT): How to use TensorRT to deploy models generated by PaddleSlim.

## Performance

### Image Classification

Dataset: ImageNet2012; Model: MobileNetV1

|Method |Accuracy(baseline: 70.91%) |Model Size(baseline: 17.0M)|
|:---:|:---:|:---:|
| Knowledge Distillation(ResNet50)| [+1.06%]() |-|
| Knowledge Distillation(ResNet50) + int8 quantization |[+1.10%]()| [-71.76%]()|
| Pruning(FLOPs-50%) + int8 quantization|[-1.71%]()|[-86.47%]()|


### Object Detection

#### Dataset: Pascal VOC; Model: MobileNet-V1-YOLOv3

|        Method           | mAP(baseline: 76.2%)         | Model Size(baseline: 94MB)      |
| :---------------------:   | :------------: | :------------:|
| Knowledge Distillation(ResNet34-YOLOv3) | [+2.8%]()      |       -       |
| Pruning(FLOPs -52.88%)        | [+1.4%]()      | [-67.76%]()   |
|Knowledge Distillation(ResNet34-YOLOv3) + Pruning(FLOPs-69.57%)| [+2.6%]()|[-67.00%]()|


#### Dataset: COCO; Model: MobileNet-V1-YOLOv3

|        Method           | mAP(baseline: 29.3%) | Model Size|
| :---------------------:   | :------------: | :------:|
| Knowledge Distillation(ResNet34-YOLOv3) |  [+2.1%]()     |-|
| Knowledge Distillation(ResNet34-YOLOv3)+Pruning(FLOPs-67.56%) | [-0.3%]() | [-66.90%]()|

### NAS

Dataset: ImageNet2012; Model: MobileNetV2

|Device           | Infer time cost | Top1 accuracy(baseline:71.90%) |
|:---------------:|:---------:|:--------------------:|
| RK3288  | [-23%]()    | +0.07%    |
| Android cellphone  | [-20%]()    | +0.16% |
| iPhone 6s   | [-17%]()    | +0.32%  |