diff --git a/paddle/fluid/inference/tests/api/int8_mkldnn_quantization.md b/paddle/fluid/inference/tests/api/int8_mkldnn_quantization.md
index 6f0e0d14bdf8c4f27761b278fd9bc617cf1cd527..2d73c43da9218df6ee0758268c8c7f48d0c593f8 100644
--- a/paddle/fluid/inference/tests/api/int8_mkldnn_quantization.md
+++ b/paddle/fluid/inference/tests/api/int8_mkldnn_quantization.md
@@ -42,25 +42,25 @@ We provide the results of accuracy and performance measured on Intel(R) Xeon(R)
 
 | Model | FP32 Accuracy | INT8 Accuracy | Accuracy Diff(FP32-INT8) |
 | :----------: | :-------------: | :------------: | :--------------: |
-| GoogleNet | 70.50% | 69.81% | 0.69% |
-| MobileNet-V1 | 70.78% | 70.42% | 0.36% |
-| MobileNet-V2 | 71.90% | 71.35% | 0.55% |
-| ResNet-101 | 77.50% | 77.42% | 0.08% |
-| ResNet-50 | 76.63% | 76.52% | 0.11% |
-| VGG16 | 72.08% | 72.03% | 0.05% |
-| VGG19 | 72.57% | 72.55% | 0.02% |
+| GoogleNet | 70.50% | 70.08% | 0.42% |
+| MobileNet-V1 | 70.78% | 70.41% | 0.37% |
+| MobileNet-V2 | 71.90% | 71.34% | 0.56% |
+| ResNet-101 | 77.50% | 77.43% | 0.07% |
+| ResNet-50 | 76.63% | 76.57% | 0.06% |
+| VGG16 | 72.08% | 72.05% | 0.03% |
+| VGG19 | 72.57% | 72.57% | 0.00% |
 
 >**II. Throughput on Intel(R) Xeon(R) Gold 6271 (batch size 1 on single core)**
 
 | Model | FP32 Throughput(images/s) | INT8 Throughput(images/s) | Ratio(INT8/FP32)|
 | :-----------:| :------------: | :------------: | :------------: |
-| GoogleNet | 34.06 | 72.79 | 2.14 |
-| MobileNet-V1 | 80.02 | 230.65 | 2.88 |
-| MobileNet-V2 | 99.38 | 206.92 | 2.08 |
-| ResNet-101 | 7.38 | 27.31 | 3.70 |
-| ResNet-50 | 13.71 | 50.55 | 3.69 |
-| VGG16 | 3.64 | 10.56 | 2.90 |
-| VGG19 | 2.95 | 9.02 | 3.05 |
+| GoogleNet | 32.76 | 67.43 | 2.06 |
+| MobileNet-V1 | 73.96 | 218.82 | 2.96 |
+| MobileNet-V2 | 87.94 | 193.70 | 2.20 |
+| ResNet-101 | 7.17 | 26.37 | 3.42 |
+| ResNet-50 | 13.26 | 48.72 | 3.67 |
+| VGG16 | 3.47 | 10.10 | 2.91 |
+| VGG19 | 2.82 | 8.68 | 3.07 |
 
 * ## Prepare dataset
 
diff --git a/python/paddle/fluid/contrib/slim/tests/QAT_mkldnn_int8_readme.md b/python/paddle/fluid/contrib/slim/tests/QAT_mkldnn_int8_readme.md
index 115ba489bff257581a052691cda4a31fd74648dc..dcf69d05e6a12f8229ffd1891c725441a24287dd 100644
--- a/python/paddle/fluid/contrib/slim/tests/QAT_mkldnn_int8_readme.md
+++ b/python/paddle/fluid/contrib/slim/tests/QAT_mkldnn_int8_readme.md
@@ -65,12 +65,12 @@ Notes:
 
 | Model | Fake QAT Original Throughput(images/s) | INT8 Throughput(images/s) | Ratio(INT8/FP32)|
 | :-----------:| :-------------------------: | :------------: | :------------: |
-| MobileNet-V1 | 13.66 | 114.98 | 8.42 |
-| MobileNet-V2 | 10.22 | 79.78 | 7.81 |
-| ResNet101 | 2.65 | 18.97 | 7.16 |
-| ResNet50 | 4.58 | 35.09 | 7.66 |
-| VGG16 | 2.38 | 9.93 | 4.17 |
-| VGG19 | 2.03 | 8.53 | 4.20 |
+| MobileNet-V1 | 12.86 | 118.05 | 9.18 |
+| MobileNet-V2 | 9.76 | 85.89 | 8.80 |
+| ResNet101 | 2.55 | 19.40 | 7.61 |
+| ResNet50 | 4.39 | 35.78 | 8.15 |
+| VGG16 | 2.26 | 9.89 | 4.38 |
+| VGG19 | 1.96 | 8.41 | 4.29 |
 
 ## 3. How to reproduce the results
 Three steps to reproduce the above-mentioned accuracy results, and we take ResNet50 benchmark as an example:
@@ -95,7 +95,7 @@ cd /PATH/TO/DOWNLOAD/MODEL/
 wget http://paddle-inference-dist.bj.bcebos.com/int8/${MODEL_FILE_NAME}
 ```
 
-To download and verify all the 7 models, you need to set `MODEL_NAME` to one of the following values in command line:
+Unzip the downloaded model to the folder.To verify all the 7 models, you need to set `MODEL_NAME` to one of the following values in command line:
 ```text
 QAT MKL-DNN 1.0
 MODEL_NAME=ResNet50, ResNet101, GoogleNet, MobileNetV1, MobileNetV2, VGG16, VGG19