PaddlePaddle / PaddleSlim — commit 18d3524a

Update Quant Analysis (#1463)

Authored by Chang Xu on Oct 24, 2022; committed via GitHub. Parent: 12ec1002.

Showing 6 changed files with 436 additions and 245 deletions (+436 −245).
- docs/zh_cn/tutorials/quant/AnalysisQuant.md (+98 −0)
- example/post_training_quantization/detection/README.md (+2 −1)
- example/post_training_quantization/detection/analysis.py (+4 −7)
- example/post_training_quantization/pytorch_yolo_series/README.md (+3 −1)
- example/post_training_quantization/pytorch_yolo_series/analysis.py (+2 −6)
- paddleslim/quant/analysis.py (+327 −230)
example/post_training_quantization/analysis.md → docs/zh_cn/tutorials/quant/AnalysisQuant.md
# Detailed Tutorial of the Quantization Analysis Tool

## 1. Features of the Quantization Analysis Tool

1. statistical_analyse:
   - Visualize box plots of activations and weights. Box plots reveal whether outliers are present.
   - Visualize histograms of weights and activations. Histograms show the concrete value distribution.
   - Provide detailed statistics of weights and activations before and after quantization, including min, max, mean, std, and so on.
2. metric_error_analyse:
   - Traverse every layer of the model, quantize one layer at a time, and compute the accuracy after quantization. This locates the specific layer that causes the quantization loss.
3. get_target_quant_model:
   - Given an expected accuracy, directly produce a quantized model that meets it.
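As a concrete illustration of how a box plot flags outliers, here is a minimal, library-free sketch of the 1.5×IQR whisker rule; the function name and the example values are our own, not part of PaddleSlim:

```python
def iqr_outliers(values):
    """Flag values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] -- the whisker
    rule a box plot uses to draw points as outliers."""
    xs = sorted(values)
    n = len(xs)

    def quantile(q):
        # linear interpolation between closest ranks
        pos = q * (n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        return xs[lo] * (1 - frac) + xs[hi] * frac

    q1, q3 = quantile(0.25), quantile(0.75)
    spread = 1.5 * (q3 - q1)
    return [v for v in values if v < q1 - spread or v > q3 + spread]
```

An activation tensor whose sampled values trigger this rule shows up in the box plot as isolated points far from the whiskers, which is exactly the pattern the tool visualizes for layers that quantize poorly.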
## 2. Parameters of paddleslim.quant.AnalysisQuant

```yaml
...
eval_function: None
data_loader: None
save_dir: 'analysis_results'
checkpoint_name: 'analysis_checkpoint.pkl'
num_histogram_plots: 10
resume: False
ptq_config
```
- model_dir: Required path to the model; it can be a directory name. For an ONNX model, pass the '.onnx' model file name directly.
- model_filename: Defaults to None. If model_dir is a directory, the model file name ending with '.pdmodel' must be provided; if model_dir is an '.onnx' file name, it is not needed.
- params_filename: Defaults to None. If model_dir is a directory, the parameter file name ending with '.pdiparams' must be provided; if model_dir is an '.onnx' file name, it is not needed.
- eval_function: A custom evaluation function; required if accuracy needs to be verified.
- data_loader: The data used to calibrate the model; a DataLoader inheriting from `paddle.io.DataLoader`. You can use the DataLoader from a model suite directly, or build the DataLoader you need following [paddle.io.DataLoader](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/io/DataLoader_cn.html#dataloader).
- save_dir: Directory where the accuracy results, PDFs, and other analysis files are saved; defaults to `analysis_results`.
- checkpoint_name: Since the model may contain many layers to analyze, intermediate results are saved during the analysis; if the program is interrupted, the already analyzed results are loaded automatically. Defaults to `analysis_checkpoint.pkl`.
- num_histogram_plots: The number of histograms to visualize, i.e. the weight and activation distributions of that many layers with the best and the worst quantization results. Defaults to 10; set it to 0 if histograms are not needed.
- resume: Whether to load the intermediate analysis file.
- ptq_config: Parameters passed to post-training quantization; see the [post-training quantization docs](https://github.com/PaddlePaddle/PaddleSlim/tree/develop/demo/quant/quant_post) for details.
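Putting the parameters above together, a filled-in configuration might look like the sketch below; all paths and values are illustrative, not from the source:

```yaml
model_dir: ./picodet_s_infer        # hypothetical folder containing the inference model
model_filename: model.pdmodel
params_filename: model.pdiparams
save_dir: analysis_results
resume: False
ptq_config:
  batch_nums: 10                    # assumed PTQ option; see the PTQ docs for the full list
```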
## 3. Using the Quantization Analysis Tool

**Create the analysis tool**:

```
analyzer = AnalysisQuant(
    model_dir=config["model_dir"],
    ...
    ptq_config=config['PTQ'])
```
**Statistical analysis**

```
analyzer.statistical_analyse()
```
Taking picodet-s among the detection models as an example, the activation box plots below (a subset of layers) show that the activation inputs of `conv2d_7.w_0` and `conv2d_9.w_0` contain many outliers, which leads to poor quantization results.

<p align="center">
  <img src="./detection/images/act_distribution.png" width=849 hspace='10'/>
  <br />
</p>
Calling this interface collects the data of every quantizable weight and its corresponding activation, both before and after quantization. The eval function is not required when using only this interface, but a DataLoader is, and a small amount of data is enough. It produces the following files:

- `fp_activation_boxplot.pdf`: box plot of the activations of the float-type model before quantization
- `fp_weight_boxplot.pdf`: box plot of the weights of the float-type model before quantization
- `quantized_activation_boxplot.pdf`: box plot of the activations of the INT-type model after quantization
- `quantized_weight_boxplot.pdf`: box plot of the weights of the INT-type model after quantization
- `fp_activation_histplot.pdf`: histogram of the activations of the float-type model before quantization
- `fp_weight_histplot.pdf`: histogram of the weights of the float-type model before quantization
- `quantized_activation_histplot.pdf`: histogram of the activations of the INT-type model after quantization
- `quantized_weight_histplot.pdf`: histogram of the weights of the INT-type model after quantization
- `statistic.csv`: detailed statistics of the weights and activations before and after quantization. The table stores:
  - Var Name: name of the variable
  - Var Type: type of the variable, Weight or Activation
  - Corresponding Weight Name: for an Activation, the name of its corresponding Weight
  - FP32 Min: minimum of the float-type data before quantization
  - FP32 Max: maximum of the float-type data before quantization
  - FP32 Mean: mean of the float-type data before quantization
  - FP32 Std: standard deviation of the float-type data before quantization
  - Quantized Min: minimum of the INT-type data after quantization
  - Quantized Max: maximum of the INT-type data after quantization
  - Quantized Mean: mean of the INT-type data after quantization
  - Quantized Std: standard deviation of the INT-type data after quantization
  - Diff Min: minimum of the difference of the variable before and after quantization
  - Diff Max: maximum of the difference of the variable before and after quantization
  - Diff Mean: mean of the difference of the variable before and after quantization
  - Diff Std: standard deviation of the difference of the variable before and after quantization
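To see how `statistic.csv` might be consumed afterwards, here is a small stdlib-only sketch (the helper name and the ranking criterion are ours) that ranks rows by `Diff Std`, since a large spread of the per-element quantization error usually marks a problematic tensor:

```python
import csv
import io


def largest_quant_diffs(csv_text, top_k=3):
    """Rank statistic.csv rows by 'Diff Std': a large spread of the
    quantization error usually marks a tensor that quantizes poorly."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.sort(key=lambda r: float(r['Diff Std']), reverse=True)
    return [(r['Var Name'], float(r['Diff Std'])) for r in rows[:top_k]]
```

The returned variable names can then be cross-checked against the box plots and the metric-error ranking described next.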
**Metric error analysis**

```
analyzer.metric_error_analyse()
```
By default the quantization analysis tool produces the following directory:

```
analysis_results/
├── analysis.txt
├── best_weight_hist_result.pdf
├── best_act_hist_result.pdf
├── worst_weight_hist_result.pdf
├── worst_act_hist_result.pdf
```

- The accuracy ranking of all models in which only one layer is quantized is saved in `./analysis_results/analysis.txt` by default.
- By setting the `num_histogram_plots` parameter, the weight and activation histograms of that many layers with the best and the worst quantization results are plotted and saved as PDFs under `./analysis_results`, as `best_weight_hist_result.pdf`, `best_act_hist_result.pdf`, `worst_weight_hist_result.pdf`, and `worst_act_hist_result.pdf`, for comparative analysis.
Calling this interface quantizes one layer of the model at a time and computes the loss of the resulting quantized model; an eval function must be provided. It produces the accuracy ranking of all models in which only one layer is quantized, saved in `./analysis_results/analysis.txt` by default.

Taking picodet-s among the detection models as an example, `analysis.txt` shows that `conv2d_1.w_0`, `conv2d_3.w_0`, `conv2d_5.w_0`, `conv2d_7.w_0`, and `conv2d_9.w_0` cause large accuracy losses. This matches the observation from the activation box plots.
<p align="center">
  <img src="./detection/images/picodet_analysis.png" width=849 hspace='10'/>
  <br />
</p>
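The `analysis.txt` lines follow the fixed pattern `quant layer name: X, eval metric: Y`, so deriving a skip list from them can be sketched with plain string parsing; the helper name and the loss threshold below are hypothetical:

```python
def layers_to_skip(analysis_text, base_metric, max_loss):
    """Collect layers whose single-layer quantization costs more than
    max_loss relative to base_metric, to pass on as a PTQ skip list."""
    skip = []
    for line in analysis_text.splitlines():
        if not line.startswith('quant layer name: '):
            continue
        # each line reads: 'quant layer name: <name>, eval metric: <value>'
        name_part, metric_part = line.split(', eval metric: ')
        name = name_part[len('quant layer name: '):]
        if base_metric - float(metric_part) > max_loss:
            skip.append(name)
    return skip
```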
**Directly produce a quantized model that meets the expected accuracy**

```
analyzer.get_target_quant_model(target_metric)
```
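Conceptually, this interface keeps adding the most harmful layers to the skip list until the evaluated metric reaches the target. A stripped-down sketch of that greedy loop, with a stand-in `quantize_and_eval` callback in place of the real PTQ-plus-evaluation step, looks like:

```python
def greedy_skip_search(ranked_layers, quantize_and_eval, target_metric):
    """Greedy sketch: skip the most harmful layer first, re-quantize,
    and stop once the target metric is reached. `quantize_and_eval`
    stands in for post-training quantization plus evaluation."""
    skip, metric = [], float('-inf')
    for layer in ranked_layers:  # ranked worst-first, as in analysis.txt
        skip.append(layer)
        metric = quantize_and_eval(skip)
        if metric >= target_metric:
            break
    return skip, metric
```

Each iteration re-runs quantization with one more layer excluded, which is why the real interface can be slow and why a small validation set is suggested for the analysis phase.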
...
...
example/post_training_quantization/detection/README.md
...
...
@@ -130,7 +130,8 @@ python eval.py --config_path=./configs/ppyoloe_s_ptq.yaml
- The path of the model to test can be modified under the `model_dir` field in the configuration file.
#### 3.6 Improve the accuracy of post-training quantization

This section describes how to use the quantization analysis tool to improve the accuracy of post-training quantization. Post-training quantization needs only a small amount of data, is easy to use, and produces a quantized model quickly, but it often causes a large accuracy loss. PaddleSlim provides a quantization analysis tool built on the ```paddleslim.quant.AnalysisQuant``` interface, which visualizes the layers that are unsuitable for quantization; skipping these layers improves the accuracy of the post-training quantized model. See [AnalysisQuant.md](../../../../docs/zh_cn/tutorials/quant/AnalysisQuant.md) for a detailed explanation of ```paddleslim.quant.AnalysisQuant```.

After many experiments, including trying several activation calibration algorithms (avg, KL, etc.) and weight quantization methods (abs_max, channel_wise_abs_max), the accuracy of PicoDet-s after post-training quantization remains 0. Taking PicoDet-s as an example, the quantization analysis tool is used as follows:
...
...
example/post_training_quantization/detection/analysis.py
...
...
```diff
@@ -168,14 +168,11 @@ def main():
         eval_function=eval_function,
         data_loader=ptq_data_loader,
         save_dir=config['save_dir'],
-        ptq_config=ptq_config)
+        ptq_config=ptq_config,
+        resume=True, )
 
-    # plot the boxplot of activations of quantizable weights
-    analyzer.plot_activation_distribution()
-    # get the rank of sensitivity of each quantized layer
-    # plot the histogram plot of best and worst activations and weights if plot_hist is True
-    analyzer.compute_quant_sensitivity(plot_hist=config['plot_hist'])
+    analyzer.statistical_analyse()
+    analyzer.metric_error_analyse()
 
     if config['get_target_quant_model']:
         if 'FastEvalDataset' in config:
```
...
...
example/post_training_quantization/pytorch_yolo_series/README.md
...
...
@@ -116,7 +116,7 @@ python eval.py --config_path=./configs/yolov5s_ptq.yaml
#### 3.6 Improve the accuracy of post-training quantization

This section describes how to use the quantization analysis tool to improve the accuracy of post-training quantization. Post-training quantization needs only a small amount of data, is easy to use, and produces a quantized model quickly, but it often causes a large accuracy loss. PaddleSlim provides a quantization analysis tool built on the ```paddleslim.quant.AnalysisQuant``` interface, which visualizes the layers that are unsuitable for quantization; skipping these layers improves the accuracy of the post-training quantized model. See [AnalysisQuant.md](../../../../docs/zh_cn/tutorials/quant/AnalysisQuant.md) for a detailed explanation of ```paddleslim.quant.AnalysisQuant```.

Since post-training quantization of YOLOv6 gives poor results, taking YOLOv6 as an example, the quantization analysis tool is used as follows:
...
...
@@ -148,6 +148,8 @@ python post_quant.py --config_path=./configs/yolov6s_analyzed_ptq.yaml --save_di
To directly produce a quantized model that meets the target accuracy after the analysis, set `get_target_quant_model` to True in `yolov6s_analysis.yaml` and fill in `target_metric`; note that `target_metric` cannot be higher than the accuracy of the original model.

**Speed up the analysis**

The quantization analysis tool quantizes and evaluates the model layer by layer, so the process can be slow. To speed it up, set `fast_val_anno_path` in the configuration file to the path of an annotation file with a small number of images. Note that the accuracy verified on a small dataset is not necessarily equal to the accuracy on the full dataset; if you only need the relative ranking of the quantization effect across layers during analysis, a small dataset is enough, but if accurate metrics are required, use the full validation dataset. To use the full validation data, set `fast_val_anno_path` to None.
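Putting the fields above together, a configuration excerpt might look like the following sketch; the file paths and metric value are illustrative, only the field names come from the text above:

```yaml
# Hypothetical excerpt of yolov6s_analysis.yaml
model_dir: ./yolov6s_infer
get_target_quant_model: True
target_metric: 0.35                              # must not exceed the FP32 model's metric
fast_val_anno_path: ./annotations/small_val.json # or None to evaluate on the full set
```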
...
...
example/post_training_quantization/pytorch_yolo_series/analysis.py
...
...
```diff
@@ -113,12 +113,8 @@ def main():
         resume=FLAGS.resume,
         ptq_config=ptq_config)
 
-    # plot the boxplot of activations of quantizable weights
-    analyzer.plot_activation_distribution()
-    # get the rank of sensitivity of each quantized layer
-    # plot the histogram plot of best and worst activations and weights if plot_hist is True
-    analyzer.compute_quant_sensitivity(plot_hist=config['plot_hist'])
+    analyzer.statistical_analyse()
+    analyzer.metric_error_analyse()
 
     if config['get_target_quant_model']:
         if config['fast_val_anno_path'] is not None:
```
...
...
paddleslim/quant/analysis.py
...
...
```diff
@@ -19,6 +19,7 @@ import copy
 import logging
 import matplotlib.pyplot as plt
 from matplotlib.backends.backend_pdf import PdfPages
+import csv
 import numpy as np
 import random
 import tempfile
```
...
...
```diff
@@ -28,7 +29,7 @@ from paddle.fluid import framework
 from paddle.fluid.framework import IrGraph
 from paddle.fluid.executor import global_scope
 from paddle.fluid.contrib.slim.quantization import PostTrainingQuantization
-from paddle.fluid.contrib.slim.quantization.utils import _get_op_input_var_names, load_variable_data
+from paddle.fluid.contrib.slim.quantization.utils import _get_op_input_var_names, _get_op_output_var_names, load_variable_data
 from .quanter import quant_post
 from ..core import GraphWrapper
 from ..common import get_logger
```
...
...
```diff
@@ -47,7 +48,6 @@ class AnalysisQuant(object):
                  eval_function=None,
                  data_loader=None,
                  save_dir='analysis_results',
-                 num_histogram_plots=10,
                  resume=False,
                  ptq_config=None):
         """
```
...
...
```diff
@@ -79,7 +79,6 @@ class AnalysisQuant(object):
         self.quant_layer_names = []
         self.checkpoint_name = os.path.join(save_dir,
                                             'analysis_checkpoint.pkl')
         self.quant_layer_metrics = {}
-        self.num_histogram_plots = num_histogram_plots
         self.ptq_config = ptq_config
         self.batch_nums = ptq_config[
             'batch_nums'] if 'batch_nums' in ptq_config else 10
```
...
...
```diff
@@ -112,25 +111,12 @@ class AnalysisQuant(object):
         # create data_loader
         self.data_loader = wrap_dataloader(data_loader, self.feed_list)
 
-        # evaluate before quant
-        # TODO: self.eval_function can be None
-        if self.eval_function is not None:
-            self.base_metric = self.eval_function(
-                executor, program, self.feed_list, self.fetch_list)
-            _logger.info('Before quantized, the accuracy of the model is: {}'.
-                         format(self.base_metric))
-
-        # quant and evaluate after quant (skip_list = None)
-        post_training_quantization = PostTrainingQuantization(
-            executor=executor,
-            data_loader=self.data_loader,
-            model_dir=self.model_dir,
-            model_filename=self.model_filename,
-            params_filename=self.params_filename,
-            skip_tensor_list=None,
-            onnx_format=self.onnx_format,
-            **self.ptq_config)
+        # quant model to get quantizable ops
+        post_training_quantization = self.create_ptq(executor, None, 'avg')
+        _logger.info('Run PTQ before analysis.')
         program = post_training_quantization.quantize()
         if self.onnx_format:
             post_training_quantization.save_quantized_model(
                 self.temp_save_path,
```
...
...
```diff
@@ -141,16 +127,14 @@ class AnalysisQuant(object):
                 executor,
                 model_filename='model.pdmodel',
                 params_filename='model.pdiparams')
-        self.quant_metric = self.eval_function(
-            executor, program, self.feed_list, self.fetch_list)
-        _logger.info('After quantized, the accuracy of the model is: {}'.
-                     format(self.quant_metric))
         # get quantized weight and act var name
         self.quantized_weight_var_name = post_training_quantization._quantized_weight_var_name
         self.quantized_act_var_name = post_training_quantization._quantized_act_var_name
         self.support_quant_val_name_list = self.quantized_weight_var_name if not self.is_full_quantize else list(
             self.quantized_act_var_name)
+        self.weight_names = list(self.quantized_weight_var_name)
+        self.act_names = list(self.quantized_act_var_name)
         executor.close()
         # load tobe_analyized_layer from checkpoint
```
...
...
```diff
@@ -160,131 +144,60 @@ class AnalysisQuant(object):
                 list(self.quant_layer_metrics.keys()))
         self.tobe_analyized_layer = sorted(list(self.tobe_analyized_layer))
 
-    def compute_quant_sensitivity(self, plot_hist=True):
-        '''
-        compute the sensitivity of quantized layers by eval function
-        '''
-        assert self.data_loader is not None, "When computing the sensitivity of quantized layers, the data loader is needed"
-        assert self.eval_function is not None, "When computing the sensitivity of quantized layers, the eval function is needed"
-        self.eval_quant_model()
-        self.sensitivity_ranklist = sorted(
-            self.quant_layer_metrics,
-            key=self.quant_layer_metrics.get,
-            reverse=False)
-        _logger.info('Finished computing the sensitivity of the model.')
-        for name in self.sensitivity_ranklist:
-            _logger.info("quant layer name: {}, eval metric: {}".format(
-                name, self.quant_layer_metrics[name]))
-        analysis_file = os.path.join(self.save_dir, "analysis.txt")
-        with open(analysis_file, "w") as analysis_ret_f:
-            for name in self.sensitivity_ranklist:
-                analysis_ret_f.write(
-                    "quant layer name: {}, eval metric: {}\n".format(
-                        name, self.quant_layer_metrics[name]))
-        _logger.info('Analysis file is saved in {}'.format(analysis_file))
-        if plot_hist:
-            self.calculate_histogram()
-
     def save_checkpoint(self):
         if not os.path.exists(self.save_dir):
             os.makedirs(self.save_dir)
         with open(self.checkpoint_name, 'wb') as f:
             pickle.dump(self.quant_layer_metrics, f)
-        _logger.info('save checkpoint to {}'.format(self.checkpoint_name))
+        _logger.info('Save checkpoint to {}.'.format(self.checkpoint_name))
 
     def load_checkpoint(self):
         if not os.path.exists(self.checkpoint_name):
             _logger.info('Checkpoint path {} does not exist.'.format(
                 self.checkpoint_name))
             return False
         with open(self.checkpoint_name, 'rb') as f:
             self.quant_layer_metrics = pickle.load(f)
-        _logger.info('load checkpoint from {}'.format(self.checkpoint_name))
+        _logger.info('Load checkpoint from {}.'.format(self.checkpoint_name))
         return True
 
-    def plot_activation_distribution(self, axis=None):
-        '''
-        Collect and plot the distribution of the activation of each weight layer.
-        '''
-        devices = paddle.device.get_device().split(':')[0]
-        places = paddle.device._convert_to_place(devices)
-        executor = paddle.static.Executor(places)
-        [program, feed_list, fetch_list] = load_inference_model( \
-            self.model_dir, \
-            executor=executor, \
-            model_filename=self.model_filename, \
-            params_filename=self.params_filename)
-        scope = global_scope()
-        persistable_var_names = []
-        for var in program.list_vars():
-            if var.persistable:
-                persistable_var_names.append(var.name)
-        weight_names = sorted(list(self.quantized_weight_var_name))
-        acts_weight_map = self.get_weight_act_map(program, weight_names,
-                                                  persistable_var_names)
-        all_acts = list(acts_weight_map.keys())
-        all_weights = [acts_weight_map[act] for act in all_acts]
-        act_distribution = []
-        for var_name in all_acts:
-            var_tensor = load_variable_data(scope, var_name)
-            if axis is None:
-                var_tensor = var_tensor.flatten()
-            else:
-                var_tensor = var_tensor.reshape(
-                    [-1, var_tensor.shape[axis]]).abs().max(axis=-1)
-            sample_num = len(var_tensor) if len(var_tensor) < 1000 else 1000
-            var_tensor = random.sample(list(var_tensor), sample_num)
-            act_distribution.append(var_tensor)
-        all_values = sum(act_distribution, [])
-        max_value = np.max(all_values)
-        min_value = np.min(all_values)
-        pdf_path = os.path.join(self.save_dir, 'activation_distribution.pdf')
-        with PdfPages(pdf_path) as pdf:
-            for i in range(0, len(act_distribution), 20):
-                r = i + 20 if i + 20 < len(act_distribution) else len(
-                    act_distribution)
-                plt.boxplot(
-                    act_distribution[i:r],
-                    labels=all_weights[i:r],
-                    showbox=True,
-                    patch_artist=True)
-                plt.xticks(rotation=90)
-                plt.tick_params(axis='x')
-                plt.ylim([min_value, max_value])
-                plt.xlabel('Weight Name')
-                plt.ylabel("Activation Distribution")
-                plt.tight_layout()
-                plt.show()
-                pdf.savefig()
-                plt.close()
-        _logger.info('Distribution plots is saved in {}'.format(pdf_path))
-
-    def eval_quant_model(self):
-        '''
-        For each layer, quantize the weight op and evaluate the quantized model.
-        '''
-        for i, layer_name in enumerate(self.tobe_analyized_layer):
-            _logger.info('Checking {}/{} quant model: quant layer {}'.format(
-                i + 1, len(self.tobe_analyized_layer), layer_name))
-            skip_list = copy.copy(list(self.support_quant_val_name_list))
-            skip_list.remove(layer_name)
-            executor = paddle.static.Executor(self.places)
-            post_training_quantization = PostTrainingQuantization(
+    def save_csv(self, data, save_name, csv_columns):
+        save_path = os.path.join(self.save_dir, save_name)
+        with open(save_path, 'w') as csvfile:
+            writer = csv.DictWriter(csvfile, fieldnames=csv_columns)
+            writer.writeheader()
+            for d in data:
+                writer.writerow(d)
+        _logger.info('Activation Statistic is saved in {}'.format(save_path))
+
+    def create_ptq(self, executor, skip_tensor_list, algo):
+        return PostTrainingQuantization(
+            executor=executor,
+            data_loader=self.data_loader,
+            model_dir=self.model_dir,
+            model_filename=self.model_filename,
+            params_filename=self.params_filename,
+            skip_tensor_list=skip_tensor_list,
+            algo=algo,  # avg fastest
+            onnx_format=self.onnx_format,
+            **self.ptq_config)
+
+    def sampling(self, executor, program, scope):
+        batch_id = 0
+        for data in self.data_loader():
+            executor.run(program=program,
+                         feed=data,
+                         fetch_list=self.fetch_list,
+                         return_numpy=False,
+                         scope=scope)
+            batch_id += 1
+            if batch_id >= self.batch_nums:
+                break
+
+    def eval_quant_model(self, skip_list):
+        executor = paddle.static.Executor(self.places)
+        post_training_quantization = self.create_ptq(
+            executor, skip_list, algo='avg')
+        program = post_training_quantization.quantize()
+        _logger.info('Evaluating...')
         if self.onnx_format:
```
...
...
```diff
@@ -300,6 +213,41 @@ class AnalysisQuant(object):
             quant_metric = self.eval_function(executor, program,
                                               self.feed_list, self.fetch_list)
         executor.close()
         return quant_metric
 
+    def metric_error_analyse(self):
+        '''
+        Evaluate the quantized models, which are generated by quantizing each weight operator one by one. The results will be saved into analysis.txt.
+        '''
+        assert self.data_loader is not None, "When computing the sensitivity of quantized layers, the data loader is needed"
+        assert self.eval_function is not None, "When computing the sensitivity of quantized layers, the eval function is needed"
+
+        # evaluate before quant
+        _logger.info('Start to evaluate the base model.')
+        executor = paddle.static.Executor(self.places)
+        [program, feed_list, fetch_list] = load_inference_model( \
+            self.model_dir, \
+            executor=executor, \
+            model_filename=self.model_filename, \
+            params_filename=self.params_filename)
+        self.base_metric = self.eval_function(executor, program, feed_list,
+                                              fetch_list)
+        _logger.info('Before quantized, the accuracy of the model is: {}'.
+                     format(self.base_metric))
+
+        # evaluate after quant
+        _logger.info('Start to evaluate the quantized model.')
+        self.quant_metric = self.eval_quant_model(None)
+        _logger.info('After quantized, the accuracy of the model is: {}'.
+                     format(self.quant_metric))
+
+        # For each layer, quantize the weight op and evaluate the quantized model.
+        for i, layer_name in enumerate(self.tobe_analyized_layer):
+            _logger.info('Checking {}/{} quant model: quant layer {}'.format(
+                i + 1, len(self.tobe_analyized_layer), layer_name))
+            skip_list = copy.copy(list(self.support_quant_val_name_list))
+            skip_list.remove(layer_name)
+            quant_metric = self.eval_quant_model(skip_list)
+            _logger.info(
+                "Quantized layer name: {}, eval metric: {}, the loss caused by this layer: {}".
+                format(layer_name,
```
...
...
```diff
@@ -307,117 +255,261 @@ class AnalysisQuant(object):
                        round(self.base_metric - quant_metric, 4)))
             self.quant_layer_metrics[layer_name] = quant_metric
             self.save_checkpoint()
 
+        self.sensitivity_ranklist = sorted(
+            self.quant_layer_metrics,
+            key=self.quant_layer_metrics.get,
+            reverse=False)
+        _logger.info('Finished computing the sensitivity of the model.')
+        for name in self.sensitivity_ranklist:
+            _logger.info("quant layer name: {}, eval metric: {}".format(
+                name, self.quant_layer_metrics[name]))
+
+        analysis_file = os.path.join(self.save_dir, "analysis.txt")
+        with open(analysis_file, "w") as analysis_ret_f:
+            for name in self.sensitivity_ranklist:
+                analysis_ret_f.write(
+                    "quant layer name: {}, eval metric: {}\n".format(
+                        name, self.quant_layer_metrics[name]))
+        _logger.info('Analysis file is saved in {}'.format(analysis_file))
+
         if self.onnx_format:
             self.temp_root_path.cleanup()
 
-    def get_weight_act_map(self, program, weight_names, persistable_var_names):
-        act_names = {}
-        for op_name in weight_names:
-            for block_id in range(len(program.blocks)):
-                for op in program.blocks[block_id].ops:
-                    var_name_list = _get_op_input_var_names(op)
-                    if op_name in var_name_list:
-                        for var_name in var_name_list:
-                            if var_name not in persistable_var_names:
-                                act_names[var_name] = op_name
-        return act_names
-
-    def get_hist_ops_name(self, graph, program):
-        if self.num_histogram_plots <= 0:
-            return []
-
-        best_weight_ops = self.sensitivity_ranklist[::-1][:self.
-                                                          num_histogram_plots]
-        worst_weight_ops = self.sensitivity_ranklist[:self.num_histogram_plots]
-        persistable_var_names = []
-        for var in program.list_vars():
-            if var.persistable:
-                persistable_var_names.append(var.name)
-        best_acts = self.get_weight_act_map(program, best_weight_ops,
-                                            persistable_var_names)
-        worst_acts = self.get_weight_act_map(program, worst_weight_ops,
-                                             persistable_var_names)
-        return [best_weight_ops, best_acts, worst_weight_ops, worst_acts]
-
-    def collect_tensors_histogram(self, scope, ops):
-        hist = {}
-        for var_name in ops:
-            var_tensor = load_variable_data(scope, var_name)
-            min_v = float(np.min(var_tensor))
-            max_v = float(np.max(var_tensor))
-            var_tensor = var_tensor.flatten()
-            _, hist_edges = np.histogram(
-                var_tensor.copy(),
-                bins=self.histogram_bins,
-                range=(min_v, max_v))
-            hist[var_name] = [var_tensor, hist_edges]
-        return hist
-
-    def calculate_histogram(self):
-        '''
-        Sample histograms for the weight and corresponding act tensors
-        '''
-        devices = paddle.device.get_device().split(':')[0]
-        places = paddle.device._convert_to_place(devices)
-        executor = paddle.static.Executor(places)
-        graph = IrGraph(core.Graph(program.desc), for_test=False)
-        tensors_tobe_draw_hist = self.get_hist_ops_name(graph, program)
-        if not tensors_tobe_draw_hist:
-            return
-        for var in program.list_vars():
-            if var.name in self.quantized_act_var_name:
-                var.persistable = True
-        # sample before collect histogram
-        batch_id = 0
-        for data in self.data_loader():
-            executor.run(program=program,
-                         feed=data,
-                         fetch_list=fetch_list,
-                         return_numpy=False,
-                         scope=scope)
-            batch_id += 1
-            if batch_id >= self.batch_nums:
-                break
-        pdf_names = [
-            'best_weight_hist_result.pdf',
-            'best_act_hist_result.pdf',
-            'worst_weight_hist_result.pdf',
-            'worst_act_hist_result.pdf',
-        ]
-        for tensors, save_pdf_name in zip(tensors_tobe_draw_hist, pdf_names):
-            if isinstance(tensors, list):
-                hist_data = self.collect_tensors_histogram(scope, tensors)
-                self.draw_hist_pdf(hist_data, save_pdf_name, None)
-            else:
-                hist_data = self.collect_tensors_histogram(
-                    scope, list(tensors.keys()))
-                self.draw_hist_pdf(hist_data, save_pdf_name, tensors)
-
+    def collect_vars(self, scope, var_names):
+        all_vars = {}
+        for var_name in var_names:
+            var_tensor = load_variable_data(scope, var_name)
+            var_tensor = np.array(var_tensor)
+            all_vars[var_name] = var_tensor
+        return all_vars
+
+    def collect_base_stat(self):
+        _logger.info('Collecting Statistic Before PTQ...')
+        executor = paddle.static.Executor(self.places)
+        [program, feed_list, fetch_list] = load_inference_model( \
+            self.model_dir, \
+            executor=executor, \
+            model_filename=self.model_filename, \
+            params_filename=self.params_filename)
+        scope = global_scope()
+        persistable_var_names = []
+        for var in program.list_vars():
+            if var.persistable:
+                persistable_var_names.append(var.name)
+        self.acts_weight_map = self.get_weight_act_map(
+            program, self.weight_names, persistable_var_names)
+        activations_names = list(self.acts_weight_map.keys())
+        # sample
+        self.sampling(executor, program, scope)
+        before_act_data = self.collect_vars(scope, activations_names)
+        before_weight_data = self.collect_vars(scope, self.weight_names)
+        executor.close()
+        return before_act_data, before_weight_data
+
+    def collect_quant_stat(self):
+        _logger.info('Collecting Statistic After PTQ...')
+        executor = paddle.static.Executor(self.places)
+        scope = global_scope()
+        post_training_quantization = self.create_ptq(executor, None, algo='avg')
+        program = post_training_quantization.quantize()
+        persistable_var_names = []
+        for var in program.list_vars():
+            if var.persistable:
+                persistable_var_names.append(var.name)
+        quant_weight_names = self.weight_names
+        dequant_act_names = ["%s.quantized" % (n) for n in self.acts_weight_map]
+        for var in program.list_vars():
+            if var.name in dequant_act_names:
+                var.persistable = True
+        self.sampling(executor, program, scope)
+        after_act_data = self.collect_vars(scope, dequant_act_names)
+        after_weight_data = self.collect_vars(scope, quant_weight_names)
+        executor.close()
+        return after_act_data, after_weight_data
+
+    def statistical_analyse(self, analysis_axis=None):
+        self.act_data, self.weight_data = self.collect_base_stat()
+        self.quant_act_data, self.dequant_weight_data = self.collect_quant_stat(
+        )
+        fp_q_act_name_map = {
+            n: "%s.quantized" % (n)
+            for n in self.acts_weight_map
+        }
+        act_statistic, box_fp_dist, box_q_dist, hist_fp_dist, hist_q_dist = self.collect_statistic(
+            self.act_data,
+            self.quant_act_data,
+            fp_q_act_name_map,
+            is_weight=False,
+            axis=analysis_axis)
+        self.plot_box_distribution(box_fp_dist,
+                                   list(self.acts_weight_map.keys()),
+                                   'fp_activation_boxplot.pdf')
+        self.plot_box_distribution(box_q_dist,
+                                   list(self.acts_weight_map.keys()),
+                                   'quantized_activation_boxplot.pdf')
+        self.plot_hist_distribution(hist_fp_dist, 'fp_activation_histplot.pdf')
+        self.plot_hist_distribution(hist_q_dist,
+                                    'quantized_activation_histplot.pdf')
+
+        weight_statistic, box_fp_dist, box_q_dist, hist_fp_dist, hist_q_dist = self.collect_statistic(
+            self.weight_data,
+            self.dequant_weight_data,
+            None,
+            is_weight=True,
+            axis=analysis_axis)
+        self.plot_box_distribution(box_fp_dist,
+                                   list(self.quantized_weight_var_name),
+                                   'fp_weight_boxplot.pdf')
+        self.plot_box_distribution(box_q_dist,
+                                   list(self.quantized_weight_var_name),
+                                   'quantized_weight_boxplot.pdf')
+        self.plot_hist_distribution(hist_fp_dist, 'fp_weight_histplot.pdf')
+        self.plot_hist_distribution(hist_q_dist,
+                                    'quantized_weight_histplot.pdf')
+
+        statistic = act_statistic + weight_statistic
+        csv_columns = [
+            'Var Name', 'Var Type', 'Corresponding Weight Name', 'FP32 Min',
+            'FP32 Max', 'FP32 Mean', 'FP32 Std', 'Quantized Min',
+            'Quantized Max', 'Quantized Mean', 'Quantized Std', 'Diff Min',
+            'Diff Max', 'Diff Mean', 'Diff Std'
+        ]
+        self.save_csv(statistic, 'statistic.csv', csv_columns)
+
+    def get_weight_act_map(self, program, weight_names, persistable_var_names):
+        weight_act_map = {}
+        for op_name in weight_names:
+            for block_id in range(len(program.blocks)):
+                for op in program.blocks[block_id].ops:
+                    var_name_list = _get_op_input_var_names(op)
+                    if op_name in var_name_list:
+                        for var_name in var_name_list:
+                            if var_name not in persistable_var_names:
+                                weight_act_map[var_name] = op_name
+        return weight_act_map
+
+    def collect_statistic(self,
+                          fp_tensors,
+                          quant_tensors,
+                          var_name_map,
+                          is_weight,
+                          axis=None):
+        statistic = []
+        box_fp_dist, box_q_dist = [], []
+        hist_fp_dist, hist_q_dist = {}, {}
+        for var_name in fp_tensors:
+            fp_tensor = fp_tensors[var_name]
+            quant_name = var_name_map[
+                var_name] if var_name_map is not None else var_name
+            quant_tensor = quant_tensors[quant_name]
+            diff = fp_tensor - quant_tensor
+
+            fp_min = round(fp_tensor.min(), 4)
+            fp_max = round(fp_tensor.max(), 4)
+            fp_mean = round(fp_tensor.mean(), 4)
+            fp_std = round(fp_tensor.std(), 4)
+            q_min = round(quant_tensor.min(), 4)
+            q_max = round(quant_tensor.max(), 4)
+            q_mean = round(quant_tensor.mean(), 4)
+            q_std = round(quant_tensor.std(), 4)
+            diff_min = round(diff.min(), 4)
+            diff_max = round(diff.max(), 4)
+            diff_mean = round(diff.mean(), 4)
+            diff_std = round(diff.std(), 4)
+
+            stat = {
+                'Var Name': var_name,
+                'Var Type': 'Weight' if is_weight else 'Activation',
+                'Corresponding Weight Name': self.acts_weight_map[var_name]
+                if not is_weight else None,
+                'FP32 Min': fp_min,
+                'FP32 Max': fp_max,
+                'FP32 Mean': fp_mean,
+                'FP32 Std': fp_std,
+                'Quantized Min': q_min,
+                'Quantized Max': q_max,
+                'Quantized Mean': q_mean,
+                'Quantized Std': q_std,
+                'Diff Min': diff_min,
+                'Diff Max': diff_max,
+                'Diff Mean': diff_mean,
+                'Diff Std': diff_std,
+            }
+            statistic.append(stat)
+
+            # for boxplot
+            if axis is None:
+                box_fp_tensor = fp_tensor.flatten()
+                box_q_tensor = quant_tensor.flatten()
+            else:
+                box_fp_tensor = fp_tensor.reshape(
+                    [-1, fp_tensor.shape[axis]]).abs().max(axis=-1)
+                box_q_tensor = quant_tensor.reshape(
+                    [-1, quant_tensor.shape[axis]]).abs().max(axis=-1)
+            sample_num = len(
+                box_fp_tensor) if len(box_fp_tensor) < 1000 else 1000
+            box_fp_tensor = random.sample(list(box_fp_tensor), sample_num)
+            box_q_tensor = random.sample(list(box_q_tensor), sample_num)
+            box_fp_dist.append(box_fp_tensor)
+            box_q_dist.append(box_q_tensor)
+
+            # for histplot
+            _, hist_edges = np.histogram(
+                fp_tensor.copy(), bins=50, range=(fp_min, fp_max))
+            hist_fp_dist[var_name] = [fp_tensor.flatten(), hist_edges]
+            _, hist_edges = np.histogram(
+                quant_tensor.copy(), bins=50, range=(q_min, q_max))
+            hist_q_dist[quant_name] = [quant_tensor.flatten(), hist_edges]
+
+        return statistic, box_fp_dist, box_q_dist, hist_fp_dist, hist_q_dist
+
+    def plot_box_distribution(self, distribution, labels, save_name):
+        all_values = sum(distribution, [])
+        max_value = np.max(all_values)
+        min_value = np.min(all_values)
+        pdf_path = os.path.join(self.save_dir, save_name)
+        with PdfPages(pdf_path) as pdf:
+            for i in range(0, len(distribution), 20):
+                r = i + 20 if i + 20 < len(distribution) else len(distribution)
+                plt.boxplot(
+                    distribution[i:r],
+                    labels=labels[i:r],
+                    showbox=True,
+                    patch_artist=True)
+                plt.xticks(rotation=90)
+                plt.tick_params(axis='x')
+                plt.ylim([min_value, max_value])
+                if 'act' in save_name:
+                    plt.xlabel('Activation Name')
+                else:
+                    plt.xlabel('Weight Name')
+                plt.ylabel("Box Distribution")
+                plt.tight_layout()
+                plt.show()
+                pdf.savefig()
+                plt.close()
+        _logger.info('Distribution plots is saved in {}'.format(pdf_path))
+
-    def draw_hist_pdf(self, hist_data, save_pdf_name, weight_act_map):
-        pdf_path = os.path.join(self.save_dir, save_pdf_name)
+    def plot_hist_distribution(self, hist_data, save_name):
+        pdf_path = os.path.join(self.save_dir, save_name)
         with PdfPages(pdf_path) as pdf:
             for name in hist_data:
                 plt.hist(hist_data[name][0], bins=hist_data[name][1])
                 plt.xlabel(name)
                 plt.ylabel("Frequency")
-                if 'act' in save_pdf_name:
-                    plt.title("Hist of Activation {}/Input of Weight {}".
-                              format(name, weight_act_map[name]))
+                if 'act' in save_name:
+                    plt.title("Hist of Activation {}".format(name))
                 else:
                     plt.title("Hist of Weight {}".format(name))
                 plt.show()
```
...
...
```diff
@@ -427,25 +519,30 @@ class AnalysisQuant(object):
     def get_target_quant_model(self, target_metric):
         _logger.info(
-            'Start to Find quant model that satisfies the target metric.')
+            'Start to Find quantized model that satisfies the target metric.')
         _logger.info(
             'Make sure that you are using full eval dataset to get target quantized model.'
         )
         skip_list = []
-        rank_list = copy.copy(self.sensitivity_ranklist)
+        if self.quant_layer_metrics:
+            rank_list = sorted(
+                self.quant_layer_metrics,
+                key=self.quant_layer_metrics.get,
+                reverse=False)
+        else:
+            _logger.info(
+                'Analyse metric error before get target quantized model.')
+            self.metric_error_analyse()
 
         while True:
             skip_list.append(rank_list.pop(0))
             _logger.info('Skip Ops: {}'.format(skip_list))
             executor = paddle.static.Executor(self.places)
-            post_training_quantization = PostTrainingQuantization(
-                executor=executor,
-                data_loader=self.data_loader,
-                model_dir=self.model_dir,
-                model_filename=self.model_filename,
-                params_filename=self.params_filename,
-                onnx_format=self.onnx_format,
-                skip_tensor_list=skip_list,
-                **self.ptq_config)
+            post_training_quantization = self.create_ptq(
+                executor,
+                skip_list,
+                algo=self.ptq_config['algo']
+                if 'algo' in self.ptq_config else 'KL')
             program = post_training_quantization.quantize()
             _logger.info('Evaluating...')
```
...
...