PaddlePaddle / models
Commit fa7227e5 (unverified)
Authored by Guanghua Yu on Dec 31, 2021; committed via GitHub on Dec 31, 2021
Parent commit: 4f732271

add ptq docs and demo (#5451)

* add ptq docs and demo
* fix readme
* update readme

8 changed files: +665 additions, −7 deletions
Changed files:
- tutorials/mobilenetv3_prod/Step6/deploy/ptq_python/README.md (+173, −0)
- tutorials/mobilenetv3_prod/Step6/deploy/ptq_python/eval.py (+139, −0)
- tutorials/mobilenetv3_prod/Step6/deploy/ptq_python/post_quant.py (+113, −0)
- tutorials/tipc/images/post_training_quant_guide.png (+0, −0)
- tutorials/tipc/kl_infer_python/kl_infer_python.md (+0, −7)
- tutorials/tipc/ptq_infer_python/README.md (+0, −0)
- tutorials/tipc/ptq_infer_python/ptq_infer_python.md (+240, −0)
- tutorials/tipc/ptq_infer_python/test_ptq_infer_python.md (+0, −0)
tutorials/mobilenetv3_prod/Step6/deploy/ptq_python/README.md
# MobileNetV3

## Table of Contents

- [1. Introduction](#1)
- [2. Post-Training Quantization](#2)
    - [2.1 Prepare the Inference Model and Environment](#2.1)
    - [2.2 Run Post-Training Quantization](#2.2)
    - [2.3 Verify the Inference Results](#2.3)
- [3. FAQ](#3)
<a name="1"></a>

## 1. Introduction

Static post-training quantization in Paddle uses a small amount of calibration data to compute quantization factors, so an FP32 model can be quickly quantized into a low-bit model (most commonly int8). Running inference with the quantized model reduces computation, memory usage, and model size.

This document walks through post-training quantization of Paddle's MobileNetV3 model.

For more details on post-training quantization of Paddle models, see the [official Paddle post-training quantization tutorial](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/api_cn/static/quant/quantization_api.rst#quant_post_static).
<a name="2"></a>

## 2. Post-Training Quantization

<a name="2.1"></a>

### 2.1 Prepare the Inference Model and Environment

Post-training quantization works directly on an inference model and does not depend on the model-building code, so an inference model must be prepared in advance.

A MobileNetV3-small inference model (exported via dynamic-to-static conversion) is provided and can be downloaded directly from [mobilenet_v3_small_infer](https://paddle-model-ecology.bj.bcebos.com/model/mobilenetv3_reprod/mobilenet_v3_small_infer.tar):

```shell
wget https://paddle-model-ecology.bj.bcebos.com/model/mobilenetv3_reprod/mobilenet_v3_small_infer.tar
tar -xf mobilenet_v3_small_infer.tar
```

Alternatively, you can convert the MobileNetV3-small model into an inference model yourself by following the [MobileNetV3 dynamic-to-static guide](xxx).
Environment setup:

- Install PaddleSlim:

```shell
pip install paddleslim==2.2.1
```

- Install PaddlePaddle:

```shell
pip install paddlepaddle-gpu==2.2.1.post112 -f https://www.paddlepaddle.org.cn/whl/linux/mkl/avx/stable.html
```

- Prepare the data:

See the [data preparation guide](https://github.com/PaddlePaddle/models/tree/release/2.2/tutorials/mobilenetv3_prod/Step6#32-%E5%87%86%E5%A4%87%E6%95%B0%E6%8D%AE).
<a name="2.2"></a>

### 2.2 Run Post-Training Quantization

Start post-training quantization:

```bash
python post_quant.py \
    --model_path=mobilenet_v3_small_infer/ \
    --model_filename=inference.pdmodel \
    --params_filename=inference.pdiparams \
    --data_dir=/path/dataset/ILSVRC2012/ \
    --use_gpu=True \
    --batch_size=32 \
    --batch_num=20
```
Part of the post-training quantization log is shown below:
```
Thu Dec 30 12:36:17-INFO: Collect quantized variable names ...
Thu Dec 30 12:36:17-INFO: Preparation stage ...
Thu Dec 30 12:36:27-INFO: Run batch: 0
Thu Dec 30 12:37:10-INFO: Run batch: 5
Thu Dec 30 12:37:43-INFO: Finish preparation stage, all batch:10
Thu Dec 30 12:37:43-INFO: Sampling stage ...
Thu Dec 30 12:38:10-INFO: Run batch: 0
Thu Dec 30 12:39:03-INFO: Run batch: 5
Thu Dec 30 12:39:46-INFO: Finish sampling stage, all batch: 10
Thu Dec 30 12:39:46-INFO: Calculate hist threshold ...
Thu Dec 30 12:39:47-INFO: Update the program ...
Thu Dec 30 12:39:49-INFO: The quantized model is saved in output/mv3_int8_infer
```
After post-training quantization finishes, the quantized inference model is generated under `output_dir` (here, `output/mv3_int8_infer`).

<a name="2.3"></a>
### 2.3 Verify the Inference Results

- Rename the quantized inference model files:

The quantized model is saved as `__model__` and `__params__`; rename `__model__` to `inference.pdmodel` and `__params__` to `inference.pdiparams`.

The correct layout looks like this:

```shell
output/mv3_int8_infer/
    |----inference.pdiparams : model parameters file (originally __params__)
    |----inference.pdmodel   : model structure file (originally __model__)
```
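If you prefer to script the renaming, the following is a minimal sketch (it assumes the quantized model was written to `output/mv3_int8_infer`, as in the log above):

```python
# A minimal sketch of the renaming step; adjust quant_dir if you used a different output_dir.
import os

quant_dir = "output/mv3_int8_infer"
os.rename(os.path.join(quant_dir, "__model__"),
          os.path.join(quant_dir, "inference.pdmodel"))
os.rename(os.path.join(quant_dir, "__params__"),
          os.path.join(quant_dir, "inference.pdiparams"))
```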
- Use Paddle Inference to check that the quantized model produces correct predictions:

For the detailed test procedure, see the [Paddle Inference guide](https://github.com/PaddlePaddle/models/blob/release/2.2/tutorials/mobilenetv3_prod/Step6/deploy/inference_python/README.md).

If you also want to verify the accuracy of the quantized model on the full validation set, follow the steps below.

Evaluate the accuracy of the MobileNetV3-small models with the following commands:

- FP32 model:
```bash
python eval.py \
    --model_path=mobilenet_v3_small_infer/ \
    --model_filename=inference.pdmodel \
    --params_filename=inference.pdiparams \
    --data_dir=/path/dataset/ILSVRC2012/ \
    --batch_size=128 \
    --use_gpu=True
```
The FP32 model accuracy log looks like this:
```
batch_id 300, acc1 0.602, acc5 0.825, avg time 0.00005 sec/img
batch_id 310, acc1 0.602, acc5 0.825, avg time 0.00005 sec/img
batch_id 320, acc1 0.602, acc5 0.825, avg time 0.00005 sec/img
batch_id 330, acc1 0.602, acc5 0.825, avg time 0.00005 sec/img
batch_id 340, acc1 0.601, acc5 0.825, avg time 0.00005 sec/img
batch_id 350, acc1 0.601, acc5 0.825, avg time 0.00005 sec/img
batch_id 360, acc1 0.602, acc5 0.826, avg time 0.00005 sec/img
batch_id 370, acc1 0.602, acc5 0.826, avg time 0.00005 sec/img
batch_id 380, acc1 0.602, acc5 0.825, avg time 0.00005 sec/img
batch_id 390, acc1 0.601, acc5 0.825, avg time 0.00005 sec/img
End test: test image 50000.0
test_acc1 0.6015, test_acc5 0.8253, avg time 0.00005 sec/img
```
- Quantized model:
```shell
python eval.py \
    --model_path=output/mv3_int8_infer/ \
    --model_filename=__model__ \
    --params_filename=__params__ \
    --data_dir=/path/dataset/ILSVRC2012/ \
    --batch_size=128 \
    --use_gpu=True
```
The quantized model accuracy log looks like this:
```
batch_id 300, acc1 0.564, acc5 0.800, avg time 0.00006 sec/img
batch_id 310, acc1 0.562, acc5 0.798, avg time 0.00006 sec/img
batch_id 320, acc1 0.560, acc5 0.796, avg time 0.00006 sec/img
batch_id 330, acc1 0.556, acc5 0.792, avg time 0.00006 sec/img
batch_id 340, acc1 0.554, acc5 0.792, avg time 0.00006 sec/img
batch_id 350, acc1 0.552, acc5 0.790, avg time 0.00006 sec/img
batch_id 360, acc1 0.550, acc5 0.789, avg time 0.00006 sec/img
batch_id 370, acc1 0.551, acc5 0.789, avg time 0.00006 sec/img
batch_id 380, acc1 0.551, acc5 0.789, avg time 0.00006 sec/img
batch_id 390, acc1 0.553, acc5 0.790, avg time 0.00006 sec/img
End test: test image 50000.0
test_acc1 0.5530, test_acc5 0.7905, avg time 0.00006 sec/img
```
<a name="3"></a>

## 3. FAQ
tutorials/mobilenetv3_prod/Step6/deploy/ptq_python/eval.py
new file (mode 100644)
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os
import numpy as np
import time
import sys
import argparse
import math

sys.path[0] = os.path.join(
    os.path.dirname("__file__"), os.path.pardir, os.path.pardir)

import paddle
import paddle.inference as paddle_infer
from presets import ClassificationPresetEval
import paddlevision


def eval():
    # create predictor
    model_file = os.path.join(FLAGS.model_path, FLAGS.model_filename)
    params_file = os.path.join(FLAGS.model_path, FLAGS.params_filename)
    config = paddle_infer.Config(model_file, params_file)
    if FLAGS.use_gpu:
        config.enable_use_gpu(1000, 0)
    if not FLAGS.ir_optim:
        config.switch_ir_optim(False)

    predictor = paddle_infer.create_predictor(config)

    input_names = predictor.get_input_names()
    input_handle = predictor.get_input_handle(input_names[0])
    output_names = predictor.get_output_names()
    output_handle = predictor.get_output_handle(output_names[0])

    # prepare data
    resize_size, crop_size = (256, 224)
    val_dataset = paddlevision.datasets.ImageFolder(
        os.path.join(FLAGS.data_dir, 'val'),
        ClassificationPresetEval(
            crop_size=crop_size, resize_size=resize_size))

    eval_loader = paddle.io.DataLoader(
        val_dataset, batch_size=FLAGS.batch_size, num_workers=5)

    cost_time = 0.
    total_num = 0.
    correct_1_num = 0
    correct_5_num = 0
    for batch_id, data in enumerate(eval_loader()):
        # set input
        img_np = np.array([tensor.numpy() for tensor in data[0]])
        label_np = np.array([tensor.numpy() for tensor in data[1]])

        input_handle.reshape(img_np.shape)
        input_handle.copy_from_cpu(img_np)

        # run
        t1 = time.time()
        predictor.run()
        t2 = time.time()
        cost_time += (t2 - t1)

        output_data = output_handle.copy_to_cpu()

        # calculate accuracy
        for i in range(len(label_np)):
            label = label_np[i][0]
            result = output_data[i, :]
            index = result.argsort()
            total_num += 1
            if index[-1] == label:
                correct_1_num += 1
            if label in index[-5:]:
                correct_5_num += 1

        if batch_id % 10 == 0:
            acc1 = correct_1_num / total_num
            acc5 = correct_5_num / total_num
            avg_time = cost_time / total_num
            print("batch_id {}, acc1 {:.3f}, acc5 {:.3f}, avg time {:.5f} sec/img".
                  format(batch_id, acc1, acc5, avg_time))

    acc1 = correct_1_num / total_num
    acc5 = correct_5_num / total_num
    avg_time = cost_time / total_num
    print("End test: test image {}".format(total_num))
    print("test_acc1 {:.4f}, test_acc5 {:.4f}, avg time {:.5f} sec/img".format(
        acc1, acc5, avg_time))
    print("\n")


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument(
        '--model_path', type=str, default="", help="The inference model path.")
    parser.add_argument(
        '--model_filename',
        type=str,
        default="model.pdmodel",
        help="model filename")
    parser.add_argument(
        '--params_filename',
        type=str,
        default="model.pdiparams",
        help="params filename")
    parser.add_argument(
        '--data_dir',
        type=str,
        default="dataset/ILSVRC2012/",
        help="The ImageNet dataset root dir.")
    parser.add_argument(
        '--batch_size', type=int, default=10, help="Batch size.")
    parser.add_argument(
        '--use_gpu', type=bool, default=False, help="Whether use gpu or not.")
    parser.add_argument(
        '--ir_optim', type=bool, default=False, help="Enable ir optim.")

    FLAGS = parser.parse_args()
    eval()
tutorials/mobilenetv3_prod/Step6/deploy/ptq_python/post_quant.py
new file (mode 100644)
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import division
from __future__ import print_function

import argparse
import os
import sys
import numpy as np
from PIL import Image

sys.path[0] = os.path.join(
    os.path.dirname("__file__"), os.path.pardir, os.path.pardir)

import paddle
import paddlevision
from presets import ClassificationPresetEval

from paddleslim.quant import quant_post_static


def sample_generator(loader):
    def __reader__():
        for indx, data in enumerate(loader):
            images = np.array(data[0])
            yield images

    return __reader__


def main():
    paddle.enable_static()
    place = paddle.CUDAPlace(0) if FLAGS.use_gpu else paddle.CPUPlace()

    resize_size, crop_size = (256, 224)
    val_dataset = paddlevision.datasets.ImageFolder(
        os.path.join(FLAGS.data_dir, 'val'),
        ClassificationPresetEval(
            crop_size=crop_size, resize_size=resize_size))
    data_loader = paddle.io.DataLoader(
        val_dataset, places=place, batch_size=FLAGS.batch_size)

    quant_output_dir = os.path.join(FLAGS.output_dir, "mv3_int8_infer")

    exe = paddle.static.Executor(place)
    quant_post_static(
        executor=exe,
        model_dir=FLAGS.model_path,
        quantize_model_path=quant_output_dir,
        sample_generator=sample_generator(data_loader),
        model_filename=FLAGS.model_filename,
        params_filename=FLAGS.params_filename,
        batch_size=FLAGS.batch_size,
        batch_nums=FLAGS.batch_num,
        algo=FLAGS.algo,
        hist_percent=FLAGS.hist_percent)


if __name__ == '__main__':
    parser = argparse.ArgumentParser("Quantization on ImageNet")

    parser.add_argument(
        "--model_path", type=str, default=None, help="Inference model path")
    parser.add_argument(
        "--model_filename",
        type=str,
        default=None,
        help="Inference model model_filename")
    parser.add_argument(
        "--params_filename",
        type=str,
        default=None,
        help="Inference model params_filename")
    parser.add_argument(
        "--output_dir", type=str, default='output', help="save dir")
    parser.add_argument(
        '--data_dir',
        default="/dataset/ILSVRC2012",
        help='path to dataset (should have subdirectories named "train" and "val")')
    parser.add_argument(
        '--use_gpu', default=True, type=bool, help='Whether to use GPU or not.')

    # train
    parser.add_argument(
        "--batch_num", default=10, type=int, help="batch num for quant")
    parser.add_argument(
        "--batch_size", default=10, type=int, help="batch size for quant")
    parser.add_argument(
        '--algo', default='hist', type=str, help="calibration algorithm")
    parser.add_argument(
        '--hist_percent',
        default=0.999,
        type=float,
        help="The percentile of algo:hist")

    FLAGS = parser.parse_args()

    assert FLAGS.data_dir, "error: must provide data path"
    main()
tutorials/tipc/images/post_training_quant_guide.png
new file (mode 100644), 324.9 KB
tutorials/tipc/kl_infer_python/kl_infer_python.md
deleted file (mode 100644 → 0)
# Linux GPU/CPU Post-Training Quantization Development Guide

# Table of Contents

- [1. Introduction](#1---)
- [2. Development Workflow](#2---)
- [3. FAQ](#3---)
tutorials/tipc/kl_infer_python/README.md → tutorials/tipc/ptq_infer_python/README.md
File moved.
tutorials/tipc/ptq_infer_python/ptq_infer_python.md
new file (mode 100644)
# Linux GPU/CPU Post-Training Quantization Development Guide

# Table of Contents

- [1. Introduction](#1)
- [2. Developing Post-Training Quantization](#2)
    - [2.1 Prepare Calibration Data and Environment](#2.1)
    - [2.2 Prepare the Inference Model](#2.2)
    - [2.3 Prepare the Post-Training Quantization Code](#2.3)
    - [2.4 Run Post-Training Quantization](#2.4)
    - [2.5 Verify the Correctness of Inference Results](#2.5)
- [3. FAQ](#3)
    - [3.1 General Questions](#3.1)
<a name="1"></a>

## 1. Introduction

Static post-training quantization in Paddle uses a small amount of calibration data to compute quantization factors, so an FP32 model can be quickly quantized into a low-bit model (most commonly int8). Running inference with the quantized model reduces computation, memory usage, and model size.

For more details on post-training quantization of Paddle models, see the [official Paddle post-training quantization tutorial](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/api_cn/static/quant/quantization_api.rst#quant_post_static).
<a name="2"></a>

## 2. Developing Post-Training Quantization

Post-training quantization development in Paddle can be divided into four steps, as shown in the figure below.

<div align="center">
    <img src="../images/post_training_quant_guide.png" width="600">
</div>

Two checkpoints are set along the way:

* Prepare the inference model
* Verify that the quantized model's inference results are correct
<a name="2.1"></a>

### 2.1 Prepare Calibration Data and Environment

**[Prepare calibration data]**

Post-training quantization needs the scale of each layer's activations in order to map value ranges, so a moderate amount of data must be run through the network's forward pass. A calibration dataset therefore has to be prepared in advance.

Taking the ImageNet-1k dataset as an example, see the [data preparation guide](https://github.com/PaddlePaddle/models/tree/release/2.2/tutorials/mobilenetv3_prod/Step6#32-%E5%87%86%E5%A4%87%E6%95%B0%E6%8D%AE).
**[Prepare the development environment]**

- Make sure paddle is installed. The pip command for the Linux build is shown below; for other installation options, see the [PaddlePaddle website](https://www.paddlepaddle.org.cn/).
- Make sure paddleslim is installed. The pip command for the Linux build is shown below; for other installation options, see [PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim).

```
pip install paddlepaddle-gpu==2.2.1.post112 -f https://www.paddlepaddle.org.cn/whl/linux/mkl/avx/stable.html
pip install paddleslim==2.2.1
```
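To confirm that the environment is usable before moving on, a quick sanity check such as the following can be run (a sketch; it assumes only the two packages installed above):

```python
# Sanity-check the installation from the pip commands above.
import paddle
import paddleslim  # raises ImportError if PaddleSlim is not installed

paddle.utils.run_check()   # verifies the PaddlePaddle installation (including GPU visibility)
print(paddle.__version__)  # expected: 2.2.1
```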
<a name="2.2"></a>

### 2.2 Prepare the Inference Model

**[Basic workflow]**

Preparing the inference model takes three steps:

- Step 1: Define a network model that inherits from `paddle.nn.Layer`.
- Step 2: Use the `paddle.jit.save` API to convert the model from dynamic to static graph and export it as an inference model.
- Step 3: Check that `model.pdmodel` and `model.pdiparams` files were generated in the export path.
**[Hands-on]**

The model definition can be found in [mobilenet_v3](https://github.com/PaddlePaddle/models/blob/release/2.2/tutorials/mobilenetv3_prod/Step6/paddlevision/models/mobilenet_v3.py):

```python
fp32_model = mobilenet_v3_small()
fp32_model.eval()
```
Then convert the model from dynamic to static graph:

```python
# save inference model
input_spec = paddle.static.InputSpec(
    shape=[None, 3, 224, 224], dtype='float32')
fp32_output_model_path = os.path.join("mv3_fp32_infer", "model")
paddle.jit.save(fp32_model, fp32_output_model_path, [input_spec])
```

This generates the `model.pdmodel` and `model.pdiparams` files under the `mv3_fp32_infer` directory.
<a name="2.3"></a>

### 2.3 Prepare the Post-Training Quantization Code

**[Basic workflow]**

Post-training quantization is performed with the PaddleSlim API ``paddleslim.quant.quant_post_static``:

- Step 1: Define a `sample_generator` that wraps a `paddle.io.DataLoader` instance and iterates over the calibration dataset.
- Step 2: Define an Executor. Because the model being quantized is an inference model, calibration also runs in static-graph mode, so a static-graph Executor is needed to execute the calibration.

**[Hands-on]**

1) Define the dataset; see the [Datasets definition](https://github.com/PaddlePaddle/models/blob/release/2.2/tutorials/mobilenetv3_prod/Step6/paddlevision/datasets/vision.py).

2) Define the `sample_generator`:
```python
def sample_generator(loader):
    def __reader__():
        for indx, data in enumerate(loader):
            images = np.array(data[0])
            yield images

    return __reader__
```
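For reference, the `paddle.io.DataLoader` wrapped by `sample_generator` can be built as in this commit's `post_quant.py`; the sketch below assumes the ImageNet-style directory layout and the `ClassificationPresetEval` preset from this tutorial:

```python
# A sketch of building the calibration DataLoader passed to sample_generator,
# mirroring post_quant.py; the dataset path is a placeholder.
import os
import paddle
import paddlevision
from presets import ClassificationPresetEval

resize_size, crop_size = 256, 224
val_dataset = paddlevision.datasets.ImageFolder(
    os.path.join('/path/dataset/ILSVRC2012/', 'val'),
    ClassificationPresetEval(crop_size=crop_size, resize_size=resize_size))
data_loader = paddle.io.DataLoader(val_dataset, batch_size=32)
```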
3) Define the Executor:
```python
use_gpu = True
place = paddle.CUDAPlace(0) if use_gpu else paddle.CPUPlace()
exe = paddle.static.Executor(place)
```
<a name="2.4"></a>

### 2.4 Run Post-Training Quantization

**[Basic workflow]**

Run post-training quantization with PaddleSlim's `quant_post_static` API:

- Step 1: Import the `quant_post_static` API.
```python
from paddleslim.quant import quant_post_static
```
- Step 2: Fill in the `quant_post_static` arguments and start post-training quantization.
```python
fp32_model_dir = 'mv3_fp32_infer'
quant_output_dir = 'quant_model'
quant_post_static(
    executor=exe,
    model_dir=fp32_model_dir,
    quantize_model_path=quant_output_dir,
    sample_generator=sample_generator(data_loader),
    model_filename='model.pdmodel',
    params_filename='model.pdiparams',
    batch_size=32,
    batch_nums=10,
    algo='KL')
```
- Step 3: Check the output and make sure the `__model__` and `__params__` files were generated after quantization; a minimal check sketch follows this list.
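The Step 3 check can be scripted, for example, as follows (a sketch that assumes the `quant_output_dir` used above):

```python
# A minimal sketch of the Step 3 check, assuming quant_output_dir = 'quant_model' as above.
import os

quant_output_dir = 'quant_model'
for name in ('__model__', '__params__'):
    path = os.path.join(quant_output_dir, name)
    assert os.path.isfile(path), "missing quantized model file: {}".format(path)
```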
**[Hands-on]**

Run post-training quantization; for a complete example, see the MobileNetV3 [post-training quantization code](https://github.com/PaddlePaddle/models/tree/release/2.2/tutorials/mobilenetv3_prod/Step6/deploy/ptq_python/post_quant.py).
<a name="2.5"></a>

### 2.5 Verify the Correctness of Inference Results

**[Basic workflow]**

Test the quantized model with the Paddle Inference library and make sure its accuracy meets expectations.

- Step 1: Initialize the `paddle.inference` library and configure it:
```python
import paddle.inference as paddle_infer

model_file = os.path.join('quant_model', '__model__')
params_file = os.path.join('quant_model', '__params__')
config = paddle_infer.Config(model_file, params_file)
if FLAGS.use_gpu:
    config.enable_use_gpu(1000, 0)
if not FLAGS.ir_optim:
    config.switch_ir_optim(False)

predictor = paddle_infer.create_predictor(config)
```
- Step 2: Configure the predictor's inputs and outputs:
python
``
`
python
input_names
=
predictor
.
get_input_names
()
input_handle
=
predictor
.
get_input_handle
(
input_names
[
0
])
output_names
=
predictor
.
get_output_names
()
output_handle
=
predictor
.
get_output_handle
(
output_names
[
0
])
```
- Step 3: Run prediction and check that the results are correct:
```python
input_handle.copy_from_cpu(img_np)
predictor.run()
output_data = output_handle.copy_to_cpu()
```
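To turn `output_data` into a quick correctness check, something like the following can be used (a sketch that assumes `label_np` holds the batch's ground-truth labels, shaped as in `eval.py`):

```python
# A sketch of a per-batch top-1/top-5 check; label_np is assumed to have shape (batch, 1).
import numpy as np

top5 = np.argsort(output_data, axis=1)[:, -5:]  # indices of the 5 highest scores per image
acc1 = float(np.mean(top5[:, -1] == label_np[:, 0]))
acc5 = float(np.mean([label in row for label, row in zip(label_np[:, 0], top5)]))
print("batch acc1 {:.3f}, acc5 {:.3f}".format(acc1, acc5))
```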
**[Hands-on]**

1) Initialize the `paddle.inference` library and configure it; see the MobileNetV3 [inference model test code](https://github.com/PaddlePaddle/models/tree/release/2.2/tutorials/mobilenetv3_prod/Step6/deploy/ptq_python/eval.py).

2) Configure the predictor's inputs and outputs; see the MobileNetV3 [inference model test code](https://github.com/PaddlePaddle/models/tree/release/2.2/tutorials/mobilenetv3_prod/Step6/deploy/ptq_python/eval.py).

3) Run prediction; see the MobileNetV3 [inference model test code](https://github.com/PaddlePaddle/models/tree/release/2.2/tutorials/mobilenetv3_prod/Step6/deploy/ptq_python/eval.py).

4) Check that the prediction for a single image is correct; see the [inference guide](https://github.com/PaddlePaddle/models/blob/release/2.2/docs/tipc/train_infer_python/infer_python.md).

5) You can also compare the accuracy of the quantized model and the FP32 model to make sure the accuracy loss after quantization is acceptable; see the [MobileNet quantized model accuracy verification guide](https://github.com/PaddlePaddle/models/tree/release/2.2/tutorials/mobilenetv3_prod/Step6/deploy/ptq_python/README.md).
<a name="3"></a>

## 3. FAQ

If you run into problems while following this guide, please open an issue [here](https://github.com/PaddlePaddle/PaddleSlim/issues) and we will follow up with high priority.
### 3.1 General Questions

- How do I choose a post-training quantization method?

Pick a suitable calibration algorithm, such as `KL`, `hist`, or `mse`. For guidance on choosing, see the [quant_post_static API documentation](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/api_cn/static/quant/quantization_api.rst#quant_post_static).
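When it is unclear which algorithm suits a given model, one practical approach is to quantize with several algorithms and evaluate each result. A sketch (reusing `exe`, `sample_generator`, `data_loader`, and the `quant_post_static` import from sections 2.3 and 2.4) might look like this:

```python
# A sketch of comparing calibration algorithms; evaluate each output model afterwards
# (e.g. with eval.py) and keep the one with the best accuracy.
for algo in ['KL', 'hist', 'mse']:
    quant_post_static(
        executor=exe,
        model_dir='mv3_fp32_infer',
        quantize_model_path='quant_model_{}'.format(algo),  # one output dir per algorithm
        sample_generator=sample_generator(data_loader),
        model_filename='model.pdmodel',
        params_filename='model.pdiparams',
        batch_size=32,
        batch_nums=10,
        algo=algo)
```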
tutorials/tipc/kl_infer_python/test_kl_infer_python.md → tutorials/tipc/ptq_infer_python/test_ptq_infer_python.md
File moved.