Unverified · Commit 8ef3c02e · Author: lidanqing · Committer: GitHub

Update DNNL QAT document 2.0-alpha (#24494)

Update DNNL QAT document 2.0-alpha
Parent db2b6b65
@@ -109,10 +109,9 @@ The code snipped shows how the `Qat2Int8MkldnnPass` can be applied to a model gr
 ## 5. Accuracy and Performance benchmark
-This section contain QAT2 MKL-DNN accuracy and performance benchmark results measured on two servers:
+This section contain QAT2 MKL-DNN accuracy and performance benchmark results measured on the following server:
 * Intel(R) Xeon(R) Gold 6271 (with AVX512 VNNI support),
-* Intel(R) Xeon(R) Gold 6148.
 Performance benchmarks were run with the following environment settings:
@@ -144,17 +143,6 @@ Performance benchmarks were run with the following environment settings:
 | VGG16 | 72.08% | 71.73% | -0.35% | 90.63% | 89.71% | -0.92% |
 | VGG19 | 72.57% | 72.12% | -0.45% | 90.84% | 90.15% | -0.69% |
->**Intel(R) Xeon(R) Gold 6148**
-| Model | FP32 Top1 Accuracy | INT8 QAT Top1 Accuracy | Top1 Diff | FP32 Top5 Accuracy | INT8 QAT Top5 Accuracy | Top5 Diff |
-| :----------: | :----------------: | :--------------------: | :-------: | :----------------: | :--------------------: | :-------: |
-| MobileNet-V1 | 70.78% | 70.85% | 0.07% | 89.69% | 89.41% | -0.28% |
-| MobileNet-V2 | 71.90% | 72.08% | 0.18% | 90.56% | 90.66% | +0.10% |
-| ResNet101 | 77.50% | 77.51% | 0.01% | 93.58% | 93.50% | -0.08% |
-| ResNet50 | 76.63% | 76.55% | -0.08% | 93.10% | 92.96% | -0.14% |
-| VGG16 | 72.08% | 71.72% | -0.36% | 90.63% | 89.75% | -0.88% |
-| VGG19 | 72.57% | 72.08% | -0.49% | 90.84% | 90.11% | -0.73% |
 #### Performance
 Image classification models performance was measured using a single thread. The setting is included in the benchmark reproduction commands below.
@@ -164,23 +152,12 @@ Image classification models performance was measured using a single thread. The
 | Model | FP32 (images/s) | INT8 QAT (images/s) | Ratio (INT8/FP32) |
 | :----------: | :-------------: | :-----------------: | :---------------: |
-| MobileNet-V1 | 77.00 | 210.76 | 2.74 |
-| MobileNet-V2 | 88.43 | 182.47 | 2.06 |
-| ResNet101 | 7.20 | 25.88 | 3.60 |
-| ResNet50 | 13.26 | 47.44 | 3.58 |
-| VGG16 | 3.48 | 10.11 | 2.90 |
-| VGG19 | 2.83 | 8.77 | 3.10 |
->**Intel(R) Xeon(R) Gold 6148**
-| Model | FP32 (images/s) | INT8 QAT (images/s) | Ratio (INT8/FP32) |
-| :----------: | :-------------: | :-----------------: | :---------------: |
-| MobileNet-V1 | 75.23 | 103.63 | 1.38 |
-| MobileNet-V2 | 86.65 | 128.14 | 1.48 |
-| ResNet101 | 6.61 | 10.79 | 1.63 |
-| ResNet50 | 12.42 | 19.65 | 1.58 |
-| VGG16 | 3.31 | 4.74 | 1.43 |
-| VGG19 | 2.68 | 3.91 | 1.46 |
+| MobileNet-V1 | 74.05 | 196.98 | 2.66 |
+| MobileNet-V2 | 88.60 | 187.67 | 2.12 |
+| ResNet101 | 7.20 | 26.43 | 3.67 |
+| ResNet50 | 13.23 | 47.44 | 3.59 |
+| VGG16 | 3.47 | 10.20 | 2.94 |
+| VGG19 | 2.83 | 8.67 | 3.06 |
 Notes:
@@ -194,13 +171,8 @@ Notes:
 | Model | FP32 Accuracy | QAT INT8 Accuracy | Accuracy Diff |
 |:------------:|:----------------------:|:----------------------:|:---------:|
-| Ernie | 80.20% | 79.88% | -0.32% |
->**Intel(R) Xeon(R) Gold 6148**
-| Model | FP32 Accuracy | QAT INT8 Accuracy | Accuracy Diff |
-| :---: | :-----------: | :---------------: | :-----------: |
-| Ernie | 80.20% | 79.64% | -0.56% |
+| Ernie | 80.20% | 79.44% | -0.76% |
 #### Performance
@@ -209,16 +181,9 @@ Notes:
 | Model | Threads | FP32 Latency (ms) | QAT INT8 Latency (ms) | Ratio (FP32/INT8) |
 |:------------:|:----------------------:|:-------------------:|:---------:|:---------:|
-| Ernie | 1 thread | 236.72 | 83.70 | 2.82x |
-| Ernie | 20 threads | 27.40 | 15.01 | 1.83x |
->**Intel(R) Xeon(R) Gold 6148**
-| Model | Threads | FP32 Latency (ms) | QAT INT8 Latency (ms) | Ratio (FP32/INT8) |
-| :---: | :--------: | :---------------: | :-------------------: | :---------------: |
-| Ernie | 1 thread | 248.42 | 169.30 | 1.46 |
-| Ernie | 20 threads | 28.92 | 20.83 | 1.39 |
+| Ernie | 1 thread | 237.21 | 79.26 | 2.99x |
+| Ernie | 20 threads | 22.08 | 12.57 | 1.76x |
 ## 6. How to reproduce the results
......
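The ratio columns in the updated Gold 6271 tables can be re-derived from the raw numbers in this diff. The following is a minimal sketch of that check, not part of the Paddle repository: the names `THROUGHPUT`, `LATENCY`, `speedup`, and `check_ratios` are illustrative, and the values are copied from the added rows above. Note that the throughput ratio is INT8/FP32 (higher images/s is better), while the latency ratio is FP32/INT8 (lower ms is better).

```python
# Illustrative consistency check for the updated benchmark tables.
# All names here are hypothetical; the numbers come from the diff's added rows.

# Throughput in images/s: (FP32, INT8 QAT, reported INT8/FP32 ratio)
THROUGHPUT = {
    "MobileNet-V1": (74.05, 196.98, 2.66),
    "MobileNet-V2": (88.60, 187.67, 2.12),
    "ResNet101": (7.20, 26.43, 3.67),
    "ResNet50": (13.23, 47.44, 3.59),
    "VGG16": (3.47, 10.20, 2.94),
    "VGG19": (2.83, 8.67, 3.06),
}

# Latency in ms: (FP32, QAT INT8, reported FP32/INT8 ratio)
LATENCY = {
    "Ernie, 1 thread": (237.21, 79.26, 2.99),
    "Ernie, 20 threads": (22.08, 12.57, 1.76),
}


def speedup(numerator, denominator):
    """Return numerator / denominator rounded to two decimals, as in the tables."""
    return round(numerator / denominator, 2)


def check_ratios():
    """Return the names of rows whose reported ratio differs from the recomputed one."""
    mismatches = []
    for name, (fp32, int8, reported) in THROUGHPUT.items():
        # Throughput: INT8 images/s divided by FP32 images/s.
        if abs(speedup(int8, fp32) - reported) > 1e-9:
            mismatches.append(name)
    for name, (fp32, int8, reported) in LATENCY.items():
        # Latency: FP32 ms divided by INT8 ms.
        if abs(speedup(fp32, int8) - reported) > 1e-9:
            mismatches.append(name)
    return mismatches


if __name__ == "__main__":
    bad_rows = check_ratios()
    print("mismatched rows:", bad_rows if bad_rows else "none")
```

An empty list from `check_ratios()` means every reported ratio agrees with the raw FP32 and INT8 figures to two decimal places.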