Model Quantization
---

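This directory contains code for quantization-aware training and deployment implemented with MegEngine, covering the commonly used ResNet, ShuffleNet, and MobileNet. The ImageNet top-1 accuracy of the quantized models is as follows: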

| Model | top1 acc (float32) | FPS* (float32) | top1 acc (int8) | FPS* (int8) |
| --- | --- | --- | --- | --- |
| ResNet18 |  69.824  | 10.5   | 69.754 | 16.3 |
| ShufflenetV1 (1.5x) | 71.954  |  17.3 | 70.656 | 25.3 |
| MobilenetV2 | 72.820  |  13.1  | 71.378 | 17.4 |

*\*: FPS is measured on an Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz with a single 224x224 image.*

*We fine-tune the mobile models with QAT for 30 epochs; training longer may yield better accuracy.*

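When using the quantized models, every image is read as uint8 in the range 0-255, the mean of 128 is subtracted, and the result is converted to int8 before being fed to the network.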
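
The input conversion described above can be sketched in NumPy as follows. This is a minimal illustration, not the preprocessing code used by the scripts in this directory: the helper name `uint8_to_int8`, the random test image, and the channel-first `(3, 224, 224)` layout are assumptions made for the example.

```python
import numpy as np


def uint8_to_int8(image):
    """Shift a uint8 image in [0, 255] to int8 in [-128, 127] by subtracting 128."""
    if image.dtype != np.uint8:
        raise TypeError("expected a uint8 image")
    # Widen to int16 first so the subtraction cannot wrap around in uint8.
    return (image.astype(np.int16) - 128).astype(np.int8)


# Hypothetical 224x224 RGB image (channel-first layout is an assumption).
image = np.random.randint(0, 256, size=(3, 224, 224), dtype=np.uint8)
x = uint8_to_int8(image)
```

Widening to `int16` before subtracting matters: subtracting 128 directly from a `uint8` array would wrap values below 128 around instead of making them negative.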


#### (Optional) Download Pretrained Models
```
wget https://data.megengine.org.cn/models/weights/mobilenet_v2_normal_72820.pkl 
wget https://data.megengine.org.cn/models/weights/mobilenet_v2_qat_71378.pkl
wget https://data.megengine.org.cn/models/weights/resnet18_normal_69824.pkl
wget https://data.megengine.org.cn/models/weights/resnet18_qat_69754.pkl
wget https://data.megengine.org.cn/models/weights/shufflenet_v1_x1_5_g3_normal_71954.pkl
wget https://data.megengine.org.cn/models/weights/shufflenet_v1_x1_5_g3_qat_70656.pkl
```

## Quantization Aware Training (QAT)

```python
import megengine.quantization as Q

model = ...

# Quantization Aware Training
Q.quantize_qat(model, qconfig=Q.ema_fakequant_qconfig)

for _ in range(...):
    train(model)
```
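
For intuition about what the fake-quantization step in QAT simulates, here is a rough NumPy sketch: a value is rounded onto the int8 grid and immediately mapped back to float, so training sees the rounding and saturation error while still computing in floating point. This is not MegEngine's implementation; the function name `fake_quantize` and the fixed `scale` are assumptions for illustration. A real QAT config, as the name `ema_fakequant_qconfig` suggests, estimates the scale from observed statistics rather than fixing it.

```python
import numpy as np


def fake_quantize(x, scale, qmin=-128, qmax=127):
    """Quantize to the int8 grid, then dequantize back to float32."""
    q = np.clip(np.round(x / scale), qmin, qmax)  # integer grid, saturated
    return (q * scale).astype(np.float32)


x = np.array([-1.0, -0.26, 0.0, 0.26, 1.0], dtype=np.float32)
scale = 1.0 / 127  # hypothetical scale covering roughly [-1, 1]
y = fake_quantize(x, scale)
```

For inputs inside the representable range, the error introduced is at most half a quantization step (`scale / 2`); inputs outside it saturate at `qmin * scale` or `qmax * scale`.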

## Deploying Quantized Model

```python
import megengine.quantization as Q
import megengine.jit as jit

model = ...

Q.quantize_qat(model, qconfig=Q.ema_fakequant_qconfig)

# Convert the QAT model into a real quantized (int8) model
Q.quantize(model)

@jit.trace(symbolic=True)
def inference_func(x):
    return model(x)

inference_func.dump(...)
```

# HOWTO use this codebase

## Step 1. Train a fp32 model

```
python3 train.py -a resnet18 -d /path/to/imagenet --mode normal
```

## Step 2. Fine-tune the fp32 model with quantization-aware training (QAT)

```
python3 finetune.py -a resnet18 -d /path/to/imagenet --checkpoint /path/to/resnet18.normal/checkpoint.pkl --mode qat
```

## Step 3. Test the QAT model on the ImageNet test set

```
python3 test.py -a resnet18 -d /path/to/imagenet --checkpoint /path/to/resnet18.qat/checkpoint.pkl --mode qat
```

Alternatively, test in quantized mode, which uses only the CPU for inference and takes longer:

```
python3 test.py -a resnet18 -d /path/to/imagenet --checkpoint /path/to/resnet18.qat/checkpoint.pkl --mode quantized -n 1
```

## Step 4. Inference and dump

```
python3 inference.py -a resnet18 --checkpoint /path/to/resnet18.qat/checkpoint.pkl --mode quantized --dump
```

This feeds a cat image to the quantized network and outputs the classification probabilities.

In addition, setting `--dump` dumps the quantized network to the `resnet18.quantized.megengine` binary file.