Commit 8ef3c02e (unverified)
Authored May 14, 2020 by lidanqing
Committed by GitHub on May 14, 2020
Update DNNL QAT document 2.0-alpha (#24494)
Update DNNL QAT document 2.0-alpha
Parent: db2b6b65
Showing 1 changed file with 10 additions and 45 deletions (+10 -45)
python/paddle/fluid/contrib/slim/tests/QAT_mkldnn_int8_readme.md
...
...
@@ -109,10 +109,9 @@ The code snipped shows how the `Qat2Int8MkldnnPass` can be applied to a model gr
## 5. Accuracy and Performance benchmark
-This section contain QAT2 MKL-DNN accuracy and performance benchmark results measured on two servers:
+This section contain QAT2 MKL-DNN accuracy and performance benchmark results measured on the following server:
* Intel(R) Xeon(R) Gold 6271 (with AVX512 VNNI support),
-* Intel(R) Xeon(R) Gold 6148.
Performance benchmarks were run with the following environment settings:
...
...
@@ -144,17 +143,6 @@ Performance benchmarks were run with the following environment settings:
| VGG16 | 72.08% | 71.73% | -0.35% | 90.63% | 89.71% | -0.92% |
| VGG19 | 72.57% | 72.12% | -0.45% | 90.84% | 90.15% | -0.69% |
->**Intel(R) Xeon(R) Gold 6148**
-| Model | FP32 Top1 Accuracy | INT8 QAT Top1 Accuracy | Top1 Diff | FP32 Top5 Accuracy | INT8 QAT Top5 Accuracy | Top5 Diff |
-| :----------: | :----------------: | :--------------------: | :-------: | :----------------: | :--------------------: | :-------: |
-| MobileNet-V1 | 70.78% | 70.85% | 0.07% | 89.69% | 89.41% | -0.28% |
-| MobileNet-V2 | 71.90% | 72.08% | 0.18% | 90.56% | 90.66% | +0.10% |
-| ResNet101 | 77.50% | 77.51% | 0.01% | 93.58% | 93.50% | -0.08% |
-| ResNet50 | 76.63% | 76.55% | -0.08% | 93.10% | 92.96% | -0.14% |
-| VGG16 | 72.08% | 71.72% | -0.36% | 90.63% | 89.75% | -0.88% |
-| VGG19 | 72.57% | 72.08% | -0.49% | 90.84% | 90.11% | -0.73% |
#### Performance
Image classification models performance was measured using a single thread. The setting is included in the benchmark reproduction commands below.
...
...
@@ -164,23 +152,12 @@ Image classification models performance was measured using a single thread. The
| Model | FP32 (images/s) | INT8 QAT (images/s) | Ratio (INT8/FP32) |
| :----------: | :-------------: | :-----------------: | :---------------: |
-| MobileNet-V1 | 77.00 | 210.76 | 2.74 |
-| MobileNet-V2 | 88.43 | 182.47 | 2.06 |
-| ResNet101 | 7.20 | 25.88 | 3.60 |
-| ResNet50 | 13.26 | 47.44 | 3.58 |
-| VGG16 | 3.48 | 10.11 | 2.90 |
-| VGG19 | 2.83 | 8.77 | 3.10 |
->**Intel(R) Xeon(R) Gold 6148**
-| Model | FP32 (images/s) | INT8 QAT (images/s) | Ratio (INT8/FP32) |
-| :----------: | :-------------: | :-----------------: | :---------------: |
-| MobileNet-V1 | 75.23 | 103.63 | 1.38 |
-| MobileNet-V2 | 86.65 | 128.14 | 1.48 |
-| ResNet101 | 6.61 | 10.79 | 1.63 |
-| ResNet50 | 12.42 | 19.65 | 1.58 |
-| VGG16 | 3.31 | 4.74 | 1.43 |
-| VGG19 | 2.68 | 3.91 | 1.46 |
+| MobileNet-V1 | 74.05 | 196.98 | 2.66 |
+| MobileNet-V2 | 88.60 | 187.67 | 2.12 |
+| ResNet101 | 7.20 | 26.43 | 3.67 |
+| ResNet50 | 13.23 | 47.44 | 3.59 |
+| VGG16 | 3.47 | 10.20 | 2.94 |
+| VGG19 | 2.83 | 8.67 | 3.06 |
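The Ratio (INT8/FP32) column for the Gold 6271 throughput rows can be recomputed from the two images/s columns. The following is a minimal sketch for checking the updated values; the helper name `throughput_ratio` and the `UPDATED_ROWS` dictionary are hypothetical and not part of Paddle, and rounding to two decimals is an assumption about how the table was produced.

```python
# Hypothetical helper (not a Paddle API): recompute the Ratio (INT8/FP32) column.
def throughput_ratio(fp32_images_per_s: float, int8_images_per_s: float) -> float:
    """Return INT8 throughput divided by FP32 throughput, rounded to 2 decimals."""
    return round(int8_images_per_s / fp32_images_per_s, 2)


# (FP32 images/s, INT8 QAT images/s) copied from the updated table rows.
UPDATED_ROWS = {
    "MobileNet-V1": (74.05, 196.98),
    "MobileNet-V2": (88.60, 187.67),
    "ResNet101": (7.20, 26.43),
    "ResNet50": (13.23, 47.44),
    "VGG16": (3.47, 10.20),
    "VGG19": (2.83, 8.67),
}

if __name__ == "__main__":
    for model, (fp32, int8) in UPDATED_ROWS.items():
        print(f"{model}: {throughput_ratio(fp32, int8):.2f}")
```

A ratio above 1 means the INT8 QAT model processes more images per second than the FP32 model on the same machine.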
Notes:
...
...
@@ -194,13 +171,8 @@ Notes:
| Model | FP32 Accuracy | QAT INT8 Accuracy | Accuracy Diff |
|:------------:|:----------------------:|:----------------------:|:---------:|
-| Ernie | 80.20% | 79.88% | -0.32% |
+| Ernie | 80.20% | 79.44% | -0.76% |
->**Intel(R) Xeon(R) Gold 6148**
-| Model | FP32 Accuracy | QAT INT8 Accuracy | Accuracy Diff |
-| :---: | :-----------: | :---------------: | :-----------: |
-| Ernie | 80.20% | 79.64% | -0.56% |
#### Performance
...
...
@@ -209,16 +181,9 @@ Notes:
| Model | Threads | FP32 Latency (ms) | QAT INT8 Latency (ms) | Ratio (FP32/INT8) |
|:------------:|:----------------------:|:-------------------:|:---------:|:---------:|
-| Ernie | 1 thread | 236.72 | 83.70 | 2.82x |
-| Ernie | 20 threads | 27.40 | 15.01 | 1.83x |
->**Intel(R) Xeon(R) Gold 6148**
+| Ernie | 1 thread | 237.21 | 79.26 | 2.99x |
+| Ernie | 20 threads | 22.08 | 12.57 | 1.76x |
-| Model | Threads | FP32 Latency (ms) | QAT INT8 Latency (ms) | Ratio (FP32/INT8) |
-| :---: | :--------: | :---------------: | :-------------------: | :---------------: |
-| Ernie | 1 thread | 248.42 | 169.30 | 1.46 |
-| Ernie | 20 threads | 28.92 | 20.83 | 1.39 |
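For the Ernie latency table the ratio is inverted relative to the throughput tables: lower latency is better, so the column reports FP32 latency divided by INT8 latency. The following is a minimal sketch for checking the updated Gold 6271 rows; the helper name `latency_speedup` is hypothetical and not part of Paddle, and rounding to two decimals is an assumption.

```python
# Hypothetical helper (not a Paddle API): recompute the Ratio (FP32/INT8) column.
def latency_speedup(fp32_latency_ms: float, int8_latency_ms: float) -> float:
    """Return FP32 latency divided by INT8 latency, rounded to 2 decimals."""
    return round(fp32_latency_ms / int8_latency_ms, 2)


# (FP32 latency ms, QAT INT8 latency ms) copied from the updated Ernie rows.
ERNIE_UPDATED = {
    "1 thread": (237.21, 79.26),
    "20 threads": (22.08, 12.57),
}

if __name__ == "__main__":
    for threads, (fp32_ms, int8_ms) in ERNIE_UPDATED.items():
        print(f"Ernie, {threads}: {latency_speedup(fp32_ms, int8_ms):.2f}x")
```

Keeping the larger-is-better convention for both throughput and latency makes the ratio columns directly comparable across tables.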
## 6. How to reproduce the results
...
...