Commit 3880f3d2 · Author: Michał Gallus · Committer: Tao Luo

Add document for int8 object detection quantization (#19356) (#20669)

test=release/1.6 test=document_fix
Parent: c0f94aef
# INT8 MKL-DNN quantization
This document describes how to use the Paddle inference engine to convert FP32 models to INT8 models. We provide instructions for enabling INT8 MKL-DNN quantization in Paddle inference and show the accuracy and performance results of the quantized models, including 7 image classification models: GoogleNet, MobileNet-V1, MobileNet-V2, ResNet-101, ResNet-50, VGG16 and VGG19, and 1 object detection model: Mobilenet-SSD.
## 0. Install PaddlePaddle
Note: MKL-DNN and MKL are required.
## 1. Enable INT8 MKL-DNN quantization
For reference, please examine the code of the unit tests enclosed in [analyzer_int8_image_classification_tester.cc](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/api/analyzer_int8_image_classification_tester.cc) and [analyzer_int8_object_detection_tester.cc](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/api/analyzer_int8_object_detection_tester.cc).
* ### Create Analysis config
```
cfg.mkldnn_quantizer_config()->SetWarmupData(warmup_data);
cfg.mkldnn_quantizer_config()->SetWarmupBatchSize(100);
```
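The warm-up batch configured above supplies calibration data from which the quantizer derives its INT8 scales. As a rough, self-contained illustration of the idea (a max-abs calibration sketch, not Paddle's internal implementation; all names here are hypothetical):

```python
# Illustrative sketch of max-abs INT8 calibration -- NOT Paddle's internal code.
# A scale is derived from warm-up (calibration) samples, then used to map
# FP32 activations into the signed 8-bit range [-127, 127].

def maxabs_scale(samples):
    """Scale factor so the largest calibration value maps to 127."""
    m = max(abs(v) for v in samples)
    return 127.0 / m if m else 1.0

def quantize(x, scale):
    """FP32 -> INT8 with rounding and saturation."""
    q = round(x * scale)
    return max(-127, min(127, q))

def dequantize(q, scale):
    """INT8 -> approximate FP32."""
    return q / scale

warmup = [0.5, -1.25, 2.0, -0.75]  # stand-in for warm-up batch activations
s = maxabs_scale(warmup)           # 127 / 2.0 = 63.5
assert quantize(2.0, s) == 127
assert abs(dequantize(quantize(-1.25, s), s) - (-1.25)) < 1e-2
```

Paddle computes its scales per tensor during the warm-up run, which is why the choice of warm-up data can slightly shift the resulting INT8 accuracy.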
## 2. Accuracy and Performance benchmark for Image Classification models
We provide the results of accuracy and performance measured on Intel(R) Xeon(R) Gold 6271 on a single core.
>**Dataset: ILSVRC2012 Validation dataset**
>**I. Top-1 Accuracy on Intel(R) Xeon(R) Gold 6271**

| Model | FP32 Accuracy | INT8 Accuracy | Accuracy Diff(FP32-INT8) |
| VGG16 | 3.64 | 10.56 | 2.90 |
| VGG19 | 2.95 | 9.02 | 3.05 |
* ## Prepare dataset

Run the following commands to download and preprocess the ILSVRC2012 Validation dataset.
```bash
cd /PATH/TO/PADDLE/build
python ../paddle/fluid/inference/tests/api/full_ILSVRC2012_val_preprocess.py
```
Then the ILSVRC2012 Validation dataset will be preprocessed and saved by default in `~/.cache/paddle/dataset/int8/download/int8_full_val.bin`.
* ## Commands to reproduce image classification benchmark
You can run `test_analyzer_int8_image_classification` with the following arguments to reproduce the accuracy result on ResNet-50.
```bash
cd /PATH/TO/PADDLE/build
./paddle/fluid/inference/tests/api/test_analyzer_int8_image_classification --infer_model=third_party/inference_demo/int8v2/resnet50/model --infer_data=$HOME/.cache/paddle/dataset/int8/download/int8_full_val.bin --batch_size=1 --paddle_num_threads=1
```
To verify all the 7 models, set the `--infer_model` parameter to one of the following values on the command line:
```text
MODEL_NAME=googlenet, mobilenetv1, mobilenetv2, resnet101, resnet50, vgg16, vgg19
```
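Since only the `--infer_model` path changes between models, the seven benchmark invocations can be generated in a small loop. A sketch that merely builds the command lines (the paths are placeholders; nothing is executed):

```python
# Build (but do not run) the benchmark command line for each released model.
# BUILD_DIR and the data path are assumptions; adjust them to your environment.
BUILD_DIR = "/PATH/TO/PADDLE/build"
DATA = "~/.cache/paddle/dataset/int8/download/int8_full_val.bin"
MODELS = ["googlenet", "mobilenetv1", "mobilenetv2",
          "resnet101", "resnet50", "vgg16", "vgg19"]

def benchmark_cmd(model):
    """Argument vector for one model, e.g. for subprocess.run()."""
    return [
        f"{BUILD_DIR}/paddle/fluid/inference/tests/api/"
        "test_analyzer_int8_image_classification",
        f"--infer_model={BUILD_DIR}/third_party/inference_demo/int8v2/{model}/model",
        f"--infer_data={DATA}",
        "--batch_size=1",
        "--paddle_num_threads=1",
    ]

cmds = [benchmark_cmd(m) for m in MODELS]
print(len(cmds))  # 7 command lines, one per model
```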
## 3. Accuracy and Performance benchmark for Object Detection models
>**I. mAP on Intel(R) Xeon(R) Gold 6271 (batch size 1 on single core):**
| Model | FP32 Accuracy | INT8 Accuracy | Accuracy Diff(FP32-INT8) |
| :----------: | :-------------: | :------------: | :--------------: |
| Mobilenet-SSD| 73.80% | 73.17% | 0.63% |
>**II. Throughput on Intel(R) Xeon(R) Gold 6271 (batch size 1 on single core)**
| Model | FP32 Throughput(images/s) | INT8 Throughput(images/s) | Ratio(INT8/FP32)|
| :-----------:| :------------: | :------------: | :------------: |
| Mobilenet-SSD | 37.8180 | 115.0604 | 3.04 |
* ## Prepare dataset
* Run the following commands to download and preprocess the Pascal VOC2007 test set.
```bash
cd /PATH/TO/PADDLE/build
python ./paddle/fluid/inference/tests/api/full_pascalvoc_test_preprocess.py --choice=VOC_test_2007
```
Then the Pascal VOC2007 test set will be preprocessed and saved by default in `~/.cache/paddle/dataset/pascalvoc/pascalvoc_full.bin`.
* Run the following commands to prepare your own dataset.
```bash
cd /PATH/TO/PADDLE/build
python ./paddle/fluid/inference/tests/api/full_pascalvoc_test_preprocess.py --choice=local \
    --data_dir=./third_party/inference_demo/int8v2/pascalvoc_small \
    --img_annotation_list=test_100.txt \
    --label_file=label_list \
    --output_file=pascalvoc_small.bin \
    --resize_h=300 \
    --resize_w=300 \
    --mean_value='[127.5, 127.5, 127.5]' \
    --ap_version=11point
```
Then the user dataset will be preprocessed and saved by default in `/PATH/TO/PADDLE/build/third_party/inference_demo/int8v2/pascalvoc_small/pascalvoc_small.bin`.
* ## Commands to reproduce object detection benchmark
You can run `test_analyzer_int8_object_detection` with the following arguments to reproduce the benchmark results for Mobilenet-SSD.
```bash
cd /PATH/TO/PADDLE/build
./paddle/fluid/inference/tests/api/test_analyzer_int8_object_detection --infer_model=third_party/inference_demo/int8v2/mobilenet-ssd/model --infer_data=$HOME/.cache/paddle/dataset/pascalvoc/pascalvoc_full.bin --warmup_batch_size=10 --batch_size=100 --paddle_num_threads=1
```
## 4. Notes
* Accuracy measurement requires a model that accepts two inputs: data and labels.
* Different sampling batch sizes for the warm-up data may cause slight differences in the INT8 accuracy.
* The C API performance is better than the Python API performance because of the Python overhead; for small computational models, the Python overhead is even more pronounced.
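The second note can be illustrated directly: two different calibration (warm-up) samples generally have different maxima, so a max-abs style calibration yields slightly different scales and therefore slightly different quantized models. A toy sketch (not Paddle code; the batches are made up):

```python
# Toy illustration of why the calibration batch matters -- not Paddle code.
# Two warm-up samples drawn from the same activation distribution yield
# slightly different max-abs scales, hence slightly different INT8 mappings.

def maxabs_scale(samples):
    """Scale factor so the largest calibration value maps to 127."""
    return 127.0 / max(abs(v) for v in samples)

batch_a = [0.1, -0.9, 1.8, 0.4]   # hypothetical warm-up batch A
batch_b = [0.2, -1.1, 2.1, 0.3]   # hypothetical warm-up batch B

scale_a = maxabs_scale(batch_a)   # 127 / 1.8
scale_b = maxabs_scale(batch_b)   # 127 / 2.1
assert scale_a != scale_b         # different batches -> different scales
```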