PaddlePaddle
PaddleDetection
Commit 0ea122f0
Authored Oct 25, 2019 by Bai Yifan
Committed Oct 25, 2019 by whs
Copy slim from release/1.6 to develop (#3758)
Parent: 3126a437
Showing 11 changed files with 790 additions and 141 deletions
slim/distillation/README.md (+141 -0)
slim/distillation/compress.py (+325 -0)
slim/distillation/run.sh (+47 -0)
slim/distillation/yolov3_mobilenet_v1_yolov3_resnet34_distillation.yml (+18 -0)
slim/distillation/yolov3_resnet34.yml (+34 -0)
slim/infer.py (+37 -13)
slim/prune/README.md (+37 -15)
slim/prune/compress.py (+16 -17)
slim/quantization/README.md (+75 -31)
slim/quantization/compress.py (+19 -20)
slim/quantization/freeze.py (+41 -45)
slim/distillation/README.md
0 → 100755
View file @
0ea122f0
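> Install Paddle 1.6 or later before running this example.

# Detection Model Distillation Example

## Overview

This example uses the [distillation strategy](https://github.com/PaddlePaddle/models/blob/develop/PaddleSlim/docs/tutorial.md#3-蒸馏) provided by PaddleSlim to run distillation training on models from the detection library.
Before reading this example, we recommend that you first read:

- [Standard training workflow of the detection library](https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/PaddleDetection)
- [PaddleSlim usage documentation](https://github.com/PaddlePaddle/models/blob/develop/PaddleSlim/docs/usage.md)

## Configuration files

For how to write the configuration files, see:

- [Writing PaddleSlim configuration files](https://github.com/PaddlePaddle/models/blob/develop/PaddleSlim/docs/usage.md#122-%E9%85%8D%E7%BD%AE%E6%96%87%E4%BB%B6%E7%9A%84%E4%BD%BF%E7%94%A8)
- [Writing the distillation strategy configuration](https://github.com/PaddlePaddle/models/blob/develop/PaddleSlim/docs/usage.md#23-蒸馏)

This example uses ResNet34-YoloV3 as the teacher to distill a MobileNetV1-YoloV3 model. To get an overall picture of the `student model` and the `teacher model`, and to decide which tensors to distill, first print the names and shapes of the variables (Variable) in both networks: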
```python
# Inspect the Variables of the student model
for v in fluid.default_main_program().list_vars():
    if "py_reader" not in v.name and "double_buffer" not in v.name and "generated_var" not in v.name:
        print(v.name, v.shape)
# Inspect the Variables of the teacher model
for v in teacher_program.list_vars():
    print(v.name, v.shape)
```
Comparing the two listings shows that some of the intermediate results of the `student model` and the `teacher model` are, respectively:

```bash
# student model
conv2d_15.tmp_0
# teacher model
teacher_teacher_conv2d_1.tmp_0
```
We therefore use `l2_distiller` to distill these two feature maps. Configure it in the configuration file as follows:

```yaml
distillers:
    l2_distiller:
        class: 'L2Distiller'
        teacher_feature_map: 'teacher_teacher_conv2d_1.tmp_0'
        student_feature_map: 'conv2d_15.tmp_0'
        distillation_loss_weight: 1
strategies:
    distillation_strategy:
        class: 'DistillationStrategy'
        distillers: ['l2_distiller']
        start_epoch: 0
        end_epoch: 270
```
Following the same steps, you can also choose a different loss for the distillation strategy. PaddleSlim supports `FSP_loss`, `L2_loss`, and `softmax_with_cross_entropy_loss`.
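To make the L2 case concrete, here is a minimal NumPy sketch of the kind of quantity an L2 distiller minimizes. It assumes the loss is the mean squared difference between the teacher and student feature maps, scaled by `distillation_loss_weight`. The function name `l2_distillation_loss` is hypothetical, and PaddleSlim's actual reduction (mean or sum, for example) may differ.

```python
import numpy as np


def l2_distillation_loss(teacher_feature_map, student_feature_map, weight=1.0):
    """Mean squared difference between two feature maps, scaled by `weight`.

    Illustrative only: the reduction used by PaddleSlim's L2Distiller is an
    assumption here, not taken from its source.
    """
    teacher = np.asarray(teacher_feature_map, dtype=np.float64)
    student = np.asarray(student_feature_map, dtype=np.float64)
    if teacher.shape != student.shape:
        raise ValueError("feature maps must have the same shape, got {} and {}"
                         .format(teacher.shape, student.shape))
    return weight * float(np.mean((teacher - student) ** 2))


# Toy 1x2x2x2 arrays standing in for the two tensors named in the config.
teacher = np.ones((1, 2, 2, 2))
student = np.zeros((1, 2, 2, 2))
print(l2_distillation_loss(teacher, student, weight=1.0))
```

The shape check reflects a practical constraint: an element-wise L2 loss only makes sense when the two chosen feature maps have identical shapes, which is why the variable shapes are printed before picking them.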
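## Training

The compression script compress.py is written based on [PaddleDetection/tools/train.py](https://github.com/PaddlePaddle/models/blob/develop/PaddleCV/PaddleDetection/tools/train.py).
The script defines a Compressor object that runs the compression task.

You can run this example with the script `run.sh`.

### Saving checkpoints

If `checkpoint_path` is set in the configuration file, checkpoints are saved automatically while the distillation task runs. If the task is interrupted abnormally, restarting it automatically loads the latest checkpoint, chosen in numeric order, from `checkpoint_path`. If you do not want the restarted task to resume from a checkpoint, change `checkpoint_path` in the configuration file or delete the files under `checkpoint_path`.

> Note: the configuration is not stored in checkpoints, so changes made to the configuration file before a restart take effect.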
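The "latest checkpoint in numeric order" rule can be sketched as follows. This is an illustration, not PaddleSlim's code: the helper `latest_checkpoint` is hypothetical, and it assumes checkpoints live in sub-directories named by epoch id, as the `${checkpoint_path}/${epoch_id}/` layout used elsewhere in this README suggests.

```python
import os
import tempfile


def latest_checkpoint(checkpoint_path):
    """Return the largest numeric sub-directory name under checkpoint_path, or None."""
    if not os.path.isdir(checkpoint_path):
        return None
    epoch_ids = [
        int(name) for name in os.listdir(checkpoint_path)
        if name.isdigit() and os.path.isdir(os.path.join(checkpoint_path, name))
    ]
    return max(epoch_ids) if epoch_ids else None


# Demo on a throwaway directory: numeric order picks 10, not the string maximum "9".
with tempfile.TemporaryDirectory() as root:
    for epoch_id in ("2", "9", "10"):
        os.makedirs(os.path.join(root, epoch_id))
    print(latest_checkpoint(root))
```

Comparing as integers rather than strings matters once training passes epoch 9, since a lexicographic sort would rank `"9"` above `"10"`.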
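## Evaluation

If `checkpoint_path` is set in the configuration file, a compressed model for evaluation is saved every epoch under `${checkpoint_path}/${epoch_id}/eval_model/`. It consists of two files, `__model__` and `__params__`: `__model__` stores the model structure and `__params__` stores the parameters.

If you do not need the evaluation model, set the `save_eval_model` option to False when defining the Compressor object (the default is True).

Run the evaluation with: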
```
python ../eval.py \
--model_path ${checkpoint_path}/${epoch_id}/eval_model/ \
--model_name __model__ \
--params_name __params__ \
-c ../../configs/yolov3_mobilenet_v1_voc.yml \
-d "../../dataset/voc"
```
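Before launching the command above, it can help to confirm that the evaluation model for a given epoch was actually written. The sketch below only encodes the directory layout described in this section; the helper names `eval_model_paths` and `has_eval_model` are hypothetical and not part of PaddleSlim or PaddleDetection.

```python
import os
import tempfile


def eval_model_paths(checkpoint_path, epoch_id):
    """Build the paths of the two files that make up one saved evaluation model."""
    model_dir = os.path.join(checkpoint_path, str(epoch_id), "eval_model")
    return os.path.join(model_dir, "__model__"), os.path.join(model_dir, "__params__")


def has_eval_model(checkpoint_path, epoch_id):
    """True only if both __model__ and __params__ exist for this epoch."""
    return all(os.path.isfile(p) for p in eval_model_paths(checkpoint_path, epoch_id))


# Demo: fake an epoch-3 evaluation model inside a temporary checkpoint directory.
with tempfile.TemporaryDirectory() as root:
    for path in eval_model_paths(root, 3):
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "wb").close()
    print(has_eval_model(root, 3), has_eval_model(root, 4))
```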
## Prediction

If `checkpoint_path` is set in the configuration file and the `prune_infer_model` option is specified when defining the Compressor object, an `inference model` is saved every epoch. It is obtained by removing redundant operators from eval_program.
The model is saved under `${checkpoint_path}/${epoch_id}/eval_model/` and consists of two files, `__model__.infer` and `__params__`: `__model__.infer` stores the model structure and `__params__` stores the parameters.

For more about the `prune_infer_model` option, see [Introduction to Compressor](https://github.com/PaddlePaddle/models/blob/develop/PaddleSlim/docs/usage.md#121-%E5%A6%82%E4%BD%95%E6%94%B9%E5%86%99%E6%99%AE%E9%80%9A%E8%AE%AD%E7%BB%83%E8%84%9A%E6%9C%AC)

### Python prediction

The script <a href="../infer.py">slim/infer.py</a> shows how to load and run the inference model with the fluid Python API.

Run it with:
```
python ../infer.py \
--model_path ${checkpoint_path}/${epoch_id}/eval_model/ \
--model_name __model__.infer \
--params_name __params__ \
-c ../../configs/yolov3_mobilenet_v1_voc.yml \
--infer_dir ../../demo
```
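### PaddleLite

The inference model produced by this example can be loaded and used directly by PaddleLite.
For how to use PaddleLite, see [PaddleLite usage documentation](https://github.com/PaddlePaddle/Paddle-Lite/wiki#%E4%BD%BF%E7%94%A8)

## Results

> The results of the current release are not the best achievable after hyperparameter tuning. They are for reference only and will be improved later.

### MobileNetV1-YOLO-V3

| FLOPS | Box AP |
|---|---|
| baseline | 76.2 |
| After distillation | - |

## FAQ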
slim/distillation/compress.py
0 → 100644
View file @
0ea122f0
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os
import time
import multiprocessing
import numpy as np
from collections import deque, OrderedDict
from paddle.fluid.contrib.slim.core import Compressor
from paddle.fluid.framework import IrGraph


def set_paddle_flags(**kwargs):
    for key, value in kwargs.items():
        if os.environ.get(key, None) is None:
            os.environ[key] = str(value)


# NOTE(paddle-dev): All of these flags should be set before
# `import paddle`. Otherwise, it would not take any effect.
set_paddle_flags(
    FLAGS_eager_delete_tensor_gb=0,  # enable GC to save memory
)

from paddle import fluid

import sys
sys.path.append("../../")
from ppdet.core.workspace import load_config, merge_config, create
from ppdet.data.data_feed import create_reader
from ppdet.utils.eval_utils import parse_fetches, eval_results
from ppdet.utils.stats import TrainingStats
from ppdet.utils.cli import ArgsParser
from ppdet.utils.check import check_gpu
import ppdet.utils.checkpoint as checkpoint
from ppdet.modeling.model_input import create_feed

import logging
FORMAT = '%(asctime)s-%(levelname)s: %(message)s'
logging.basicConfig(level=logging.INFO, format=FORMAT)
logger = logging.getLogger(__name__)


def eval_run(exe, compile_program, reader, keys, values, cls, test_feed):
    """
    Run evaluation program, return program outputs.
    """
    iter_id = 0
    results = []
    if len(cls) != 0:
        values = []
        for i in range(len(cls)):
            _, accum_map = cls[i].get_map_var()
            cls[i].reset(exe)
            values.append(accum_map)

    images_num = 0
    start_time = time.time()
    has_bbox = 'bbox' in keys
    for data in reader():
        data = test_feed.feed(data)
        feed_data = {'image': data['image'], 'im_size': data['im_size']}
        outs = exe.run(compile_program,
                       feed=feed_data,
                       fetch_list=[values[0]],
                       return_numpy=False)
        outs.append(data['gt_box'])
        outs.append(data['gt_label'])
        outs.append(data['is_difficult'])
        res = {
            k: (np.array(v), v.recursive_sequence_lengths())
            for k, v in zip(keys, outs)
        }
        results.append(res)
        if iter_id % 100 == 0:
            logger.info('Test iter {}'.format(iter_id))
        iter_id += 1
        images_num += len(res['bbox'][1][0]) if has_bbox else 1
    logger.info('Test finish iter {}'.format(iter_id))

    end_time = time.time()
    fps = images_num / (end_time - start_time)
    if has_bbox:
        logger.info('Total number of images: {}, inference time: {} fps.'.
                    format(images_num, fps))
    else:
        logger.info('Total iteration: {}, inference time: {} batch/s.'.format(
            images_num, fps))

    return results


def main():
    cfg = load_config(FLAGS.config)
    if 'architecture' in cfg:
        main_arch = cfg.architecture
    else:
        raise ValueError("'architecture' not specified in config file.")

    merge_config(FLAGS.opt)
    if 'log_iter' not in cfg:
        cfg.log_iter = 20

    # check if set use_gpu=True in paddlepaddle cpu version
    check_gpu(cfg.use_gpu)

    if cfg.use_gpu:
        devices_num = fluid.core.get_cuda_device_count()
    else:
        devices_num = int(
            os.environ.get('CPU_NUM', multiprocessing.cpu_count()))

    if 'train_feed' not in cfg:
        train_feed = create(main_arch + 'TrainFeed')
    else:
        train_feed = create(cfg.train_feed)

    if 'eval_feed' not in cfg:
        eval_feed = create(main_arch + 'EvalFeed')
    else:
        eval_feed = create(cfg.eval_feed)

    place = fluid.CUDAPlace(0) if cfg.use_gpu else fluid.CPUPlace()
    exe = fluid.Executor(place)

    lr_builder = create('LearningRate')
    optim_builder = create('OptimizerBuilder')

    # build program
    model = create(main_arch)
    train_loader, train_feed_vars = create_feed(train_feed, iterable=True)
    train_fetches = model.train(train_feed_vars)
    loss = train_fetches['loss']
    lr = lr_builder()
    opt = optim_builder(lr)
    opt.minimize(loss)
    #for v in fluid.default_main_program().list_vars():
    # if "py_reader" not in v.name and "double_buffer" not in v.name and "generated_var" not in v.name:
    # print(v.name, v.shape)

    cfg.max_iters = 258
    train_reader = create_reader(train_feed, cfg.max_iters, FLAGS.dataset_dir)

    train_loader.set_sample_list_generator(train_reader, place)

    exe.run(fluid.default_startup_program())

    # parse train fetches
    train_keys, train_values, _ = parse_fetches(train_fetches)
    train_keys.append('lr')
    train_values.append(lr.name)

    train_fetch_list = []
    for k, v in zip(train_keys, train_values):
        train_fetch_list.append((k, v))
    print("train_fetch_list: {}".format(train_fetch_list))

    eval_prog = fluid.Program()
    startup_prog = fluid.Program()
    with fluid.program_guard(eval_prog, startup_prog):
        with fluid.unique_name.guard():
            model = create(main_arch)
            _, test_feed_vars = create_feed(eval_feed, iterable=True)
            fetches = model.eval(test_feed_vars)
    eval_prog = eval_prog.clone(True)

    eval_reader = create_reader(eval_feed, args_path=FLAGS.dataset_dir)
    test_data_feed = fluid.DataFeeder(test_feed_vars.values(), place)

    # parse eval fetches
    extra_keys = []
    if cfg.metric == 'COCO':
        extra_keys = ['im_info', 'im_id', 'im_shape']
    if cfg.metric == 'VOC':
        extra_keys = ['gt_box', 'gt_label', 'is_difficult']
    eval_keys, eval_values, eval_cls = parse_fetches(fetches, eval_prog,
                                                     extra_keys)
    eval_fetch_list = []
    for k, v in zip(eval_keys, eval_values):
        eval_fetch_list.append((k, v))
    print("eval_fetch_list: {}".format(eval_fetch_list))

    exe.run(startup_prog)
    checkpoint.load_params(exe,
                           fluid.default_main_program(), cfg.pretrain_weights)

    best_box_ap_list = []

    def eval_func(program, scope):
        results = eval_run(exe, program, eval_reader, eval_keys, eval_values,
                           eval_cls, test_data_feed)

        resolution = None
        is_bbox_normalized = False
        if 'mask' in results[0]:
            resolution = model.mask_head.resolution
        box_ap_stats = eval_results(results, eval_feed, cfg.metric,
                                    cfg.num_classes, resolution,
                                    is_bbox_normalized, FLAGS.output_eval)
        if len(best_box_ap_list) == 0:
            best_box_ap_list.append(box_ap_stats[0])
        elif box_ap_stats[0] > best_box_ap_list[0]:
            best_box_ap_list[0] = box_ap_stats[0]
        logger.info("Best test box ap: {}".format(best_box_ap_list[0]))
        return best_box_ap_list[0]

    test_feed = [('image', test_feed_vars['image'].name),
                 ('im_size', test_feed_vars['im_size'].name)]

    teacher_cfg = load_config(FLAGS.teacher_config)
    teacher_arch = teacher_cfg.architecture
    teacher_programs = []
    teacher_program = fluid.Program()
    teacher_startup_program = fluid.Program()
    with fluid.program_guard(teacher_program, teacher_startup_program):
        with fluid.unique_name.guard('teacher_'):
            teacher_feed_vars = OrderedDict()
            for name, var in train_feed_vars.items():
                teacher_feed_vars[name] = teacher_program.global_block(
                )._clone_variable(
                    var, force_persistable=False)
            model = create(teacher_arch)
            train_fetches = model.train(teacher_feed_vars)
    #print("="*50+"teacher_model_params"+"="*50)
    #for v in teacher_program.list_vars():
    # print(v.name, v.shape)
    #return

    exe.run(teacher_startup_program)
    assert FLAGS.teacher_pretrained and os.path.exists(
        FLAGS.teacher_pretrained
    ), "teacher_pretrained should be set when teacher_model is not None."

    def if_exist(var):
        return os.path.exists(os.path.join(FLAGS.teacher_pretrained, var.name))

    fluid.io.load_vars(
        exe,
        FLAGS.teacher_pretrained,
        main_program=teacher_program,
        predicate=if_exist)

    teacher_programs.append(teacher_program.clone(for_test=True))

    com = Compressor(
        place,
        fluid.global_scope(),
        fluid.default_main_program(),
        train_reader=train_reader,
        train_feed_list=[(key, value.name)
                         for key, value in train_feed_vars.items()],
        train_fetch_list=train_fetch_list,
        eval_program=eval_prog,
        eval_reader=eval_reader,
        eval_feed_list=test_feed,
        eval_func={'map': eval_func},
        eval_fetch_list=eval_fetch_list[0:1],
        save_eval_model=True,
        prune_infer_model=[["image", "im_size"], ["multiclass_nms_0.tmp_0"]],
        teacher_programs=teacher_programs,
        train_optimizer=None,
        distiller_optimizer=opt,
        log_period=20)
    com.config(FLAGS.slim_file)
    com.run()


if __name__ == '__main__':
    parser = ArgsParser()
    parser.add_argument(
        "-t",
        "--teacher_config",
        default=None,
        type=str,
        help="Config file of teacher architecture.")
    parser.add_argument(
        "-s",
        "--slim_file",
        default=None,
        type=str,
        help="Config file of PaddleSlim.")
    parser.add_argument(
        "-r",
        "--resume_checkpoint",
        default=None,
        type=str,
        help="Checkpoint path for resuming training.")
    parser.add_argument(
        "--eval",
        action='store_true',
        default=False,
        help="Whether to perform evaluation in train")
    parser.add_argument(
        "--teacher_pretrained",
        default=None,
        type=str,
        help="Whether to use pretrained model.")
    parser.add_argument(
        "--output_eval",
        default=None,
        type=str,
        help="Evaluation directory, default is current directory.")
    parser.add_argument(
        "-d",
        "--dataset_dir",
        default=None,
        type=str,
        help="Dataset path, same as DataFeed.dataset.dataset_dir")
    FLAGS = parser.parse_args()
    main()
slim/distillation/run.sh
0 → 100644
View file @
0ea122f0
#!/usr/bin/env bash

# download pretrain model
root_url="https://paddlemodels.bj.bcebos.com/object_detection"
yolov3_r34_voc="yolov3_r34_voc.tar"
pretrain_dir='./pretrain'

if [ ! -d ${pretrain_dir} ]; then
  mkdir ${pretrain_dir}
fi

cd ${pretrain_dir}

if [ ! -f ${yolov3_r34_voc} ]; then
  wget ${root_url}/${yolov3_r34_voc}
  tar xf ${yolov3_r34_voc}
fi
cd -

# enable GC strategy
export FLAGS_fast_eager_deletion_mode=1
export FLAGS_eager_delete_tensor_gb=0.0

# for distillation
#-----------------
export CUDA_VISIBLE_DEVICES=0,1,2,3

# Fixing name conflicts in distillation
cd ${pretrain_dir}/yolov3_r34_voc
for files in $(ls teacher_*)
    do mv $files ${files#*_}
done
for files in $(ls *)
    do mv $files "teacher_"$files
done
cd -

python -u compress.py \
       -c ../../configs/yolov3_mobilenet_v1_voc.yml \
       -t yolov3_resnet34.yml \
       -s yolov3_mobilenet_v1_yolov3_resnet34_distillation.yml \
       -o YoloTrainFeed.batch_size=64 \
       -d ../../dataset/voc \
       --teacher_pretrained ./pretrain/yolov3_r34_voc \
       > yolov3_distallation.log 2>&1 &
tailf yolov3_distallation.log
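
The two rename loops under "Fixing name conflicts in distillation" first strip an existing `teacher_` prefix and then add it back to every file, so each weight file ends up with exactly one prefix. A Python sketch of the same idea follows. The function `ensure_prefix` is hypothetical and not part of this commit, and it assumes that no un-prefixed file shares its name with an already-prefixed one.

```python
import os
import tempfile


def ensure_prefix(directory, prefix="teacher_"):
    """Rename every file in `directory` so that it carries `prefix` exactly once."""
    for name in sorted(os.listdir(directory)):
        # Strip an existing prefix first, mirroring the first shell loop.
        base = name[len(prefix):] if name.startswith(prefix) else name
        target = prefix + base
        if target != name:
            os.rename(os.path.join(directory, name),
                      os.path.join(directory, target))
    return sorted(os.listdir(directory))


# Demo with made-up file names in a temporary directory.
with tempfile.TemporaryDirectory() as weights_dir:
    for name in ("teacher_conv1_weights", "conv2_weights"):
        open(os.path.join(weights_dir, name), "wb").close()
    print(ensure_prefix(weights_dir))
    # Running it again changes nothing, so the step is safe to repeat.
    print(ensure_prefix(weights_dir))
```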
slim/distillation/yolov3_mobilenet_v1_yolov3_resnet34_distillation.yml
0 → 100644
View file @
0ea122f0
version: 1.0
distillers:
    l2_distiller:
        class: 'L2Distiller'
        teacher_feature_map: 'teacher_teacher_conv2d_1.tmp_0'
        student_feature_map: 'conv2d_15.tmp_0'
        distillation_loss_weight: 1
strategies:
    distillation_strategy:
        class: 'DistillationStrategy'
        distillers: ['l2_distiller']
        start_epoch: 0
        end_epoch: 270
compressor:
    epoch: 271
    checkpoint_path: './checkpoints/'
    strategies:
        - distillation_strategy
slim/quantization/yolov3_mobilenet_v1_voc.yml → slim/distillation/yolov3_resnet34.yml
View file @
0ea122f0
 architecture: YOLOv3
 train_feed: YoloTrainFeed
 eval_feed: YoloEvalFeed
 test_feed: YoloTestFeed
 use_gpu: true
 max_iters: 1000
 log_smooth_window: 20
 save_dir: output
 snapshot_iter: 2000
 metric: VOC
 map_type: 11point
 pretrain_weights: https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1_voc.tar
 weights: output/yolov3_mobilenet_v1_voc/model_final
 num_classes: 20
 weight_prefix_name: teacher_

 YOLOv3:
-  backbone: MobileNet
+  backbone: ResNet
   yolo_head: YOLOv3Head

-MobileNet:
+ResNet:
   norm_type: sync_bn
   freeze_at: 0
   freeze_norm: false
   norm_decay: 0.
   conv_group_scale: 1
   with_extra_blocks: false
   depth: 34
   feature_maps: [3, 4, 5]

 YOLOv3Head:
   anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
...
...
@@ -38,50 +32,3 @@ YOLOv3Head:
     nms_top_k: 1000
     normalized: false
     score_threshold: 0.01
-
-LearningRate:
-  base_lr: 0.0001
-  schedulers:
-  - !PiecewiseDecay
-    gamma: 0.1
-    milestones:
-    - 1000
-    - 2000
-  #- !LinearWarmup
-  #start_factor: 0.
-  #steps: 1000
-
-OptimizerBuilder:
-  optimizer:
-    momentum: 0.9
-    type: Momentum
-  regularizer:
-    factor: 0.0005
-    type: L2
-
-YoloTrainFeed:
-  batch_size: 8
-  dataset:
-    dataset_dir: ../../dataset/voc
-    annotation: VOCdevkit/VOC_all/ImageSets/Main/train.txt
-    image_dir: VOCdevkit/VOC_all/JPEGImages
-    use_default_label: true
-  num_workers: 8
-  bufsize: 128
-  use_process: true
-  mixup_epoch: 250
-
-YoloEvalFeed:
-  batch_size: 8
-  image_shape: [3, 608, 608]
-  dataset:
-    dataset_dir: ../../dataset/voc
-    annotation: VOCdevkit/VOC_all/ImageSets/Main/val.txt
-    image_dir: VOCdevkit/VOC_all/JPEGImages
-    use_default_label: true
-
-YoloTestFeed:
-  batch_size: 1
-  image_shape: [3, 608, 608]
-  dataset:
-    use_default_label: true
slim/infer.py
View file @
0ea122f0
...
...
@@ -19,11 +19,13 @@ from __future__ import print_function
import os
import sys
import glob
import time

import numpy as np
from PIL import Image
sys.path.append("../../")


def set_paddle_flags(**kwargs):
    for key, value in kwargs.items():
        if os.environ.get(key, None) is None:
...
...
@@ -117,20 +119,19 @@ def main():
    test_images = get_test_images(FLAGS.infer_dir, FLAGS.infer_img)
    test_feed.dataset.add_images(test_images)

    place = fluid.CUDAPlace(0) if cfg.use_gpu else fluid.CPUPlace()
    exe = fluid.Executor(place)

    infer_prog, feed_var_names, fetch_list = fluid.io.load_inference_model(
        dirname=FLAGS.model_path,
        model_filename=FLAGS.model_name,
        params_filename=FLAGS.params_name,
        executor=exe)

    reader = create_reader(test_feed)
    feeder = fluid.DataFeeder(
        place=place, feed_list=feed_var_names, program=infer_prog)

    # parse infer fetches
    assert cfg.metric in ['COCO', 'VOC'], \
...
...
@@ -140,7 +141,9 @@ def main():
        extra_keys = ['im_info', 'im_id', 'im_shape']
    if cfg['metric'] == 'VOC':
        extra_keys = ['im_id', 'im_shape']
    keys, values, _ = parse_fetches({
        'bbox': fetch_list
    }, infer_prog, extra_keys)

    # parse dataset category
    if cfg.metric == 'COCO':
...
...
@@ -166,9 +169,33 @@ def main():
     imid2path = reader.imid2path
     keys = ['bbox']
+    infer_time = True
+    compile_prog = fluid.compiler.CompiledProgram(infer_prog)
+
     for iter_id, data in enumerate(reader()):
         feed_data = [[d[0], d[1]] for d in data]
-        outs = exe.run(infer_prog,
+        # for infer time
+        if infer_time:
+            warmup_times = 10
+            repeats_time = 100
+            feed_data_dict = feeder.feed(feed_data)
+            for i in range(warmup_times):
+                exe.run(compile_prog,
+                        feed=feed_data_dict,
+                        fetch_list=fetch_list,
+                        return_numpy=False)
+            start_time = time.time()
+            for i in range(repeats_time):
+                exe.run(compile_prog,
+                        feed=feed_data_dict,
+                        fetch_list=fetch_list,
+                        return_numpy=False)
+
+            print("infer time: {} ms/sample".format((time.time() - start_time)
+                                                    * 1000 / repeats_time))
+            infer_time = False
+
+        outs = exe.run(compile_prog,
                        feed=feeder.feed(feed_data),
                        fetch_list=fetch_list,
                        return_numpy=False)
...
...
@@ -258,10 +285,7 @@ if __name__ == '__main__':
        default="tb_log_dir/image",
        help='Tensorboard logging directory for image.')
    parser.add_argument(
        '--model_path', type=str, default=None, help="inference model path")
    parser.add_argument(
        '--model_name',
        type=str,
...
...
slim/prune/README.md
View file @
0ea122f0
...
...
@@ -7,7 +7,8 @@
 This example uses the [convolution channel pruning compression strategy](https://github.com/PaddlePaddle/models/blob/develop/PaddleSlim/docs/tutorial.md#2-%E5%8D%B7%E7%A7%AF%E6%A0%B8%E5%89%AA%E8%A3%81%E5%8E%9F%E7%90%86) provided by PaddleSlim to compress models from the detection library.
 Before reading this example, we recommend that you first read:

-- <a href="../..README_cn.md">Standard training workflow of the detection library</a>
+- <a href="../../README_cn.md">Standard training workflow of the detection library</a>
 - [Data preparation for detection models](https://github.com/PaddlePaddle/models/blob/develop/PaddleCV/PaddleDetection/docs/INSTALL_cn.md#%E6%95%B0%E6%8D%AE%E9%9B%86)
 - [PaddleSlim usage documentation](https://github.com/PaddlePaddle/models/blob/develop/PaddleSlim/docs/usage.md)
...
...
@@ -29,7 +30,7 @@ from paddle.fluid.framework import IrGraph
from paddle.fluid import core
graph = IrGraph(core.Graph(train_prog.desc), for_test=True)
marked_nodes = set()
for op in graph.all_op_nodes():
    print(op.name())
    if op.name().find('conv') > -1:
...
...
@@ -39,12 +40,12 @@ graph.draw('.', 'forward', marked_nodes)
 Visualization of the MobileNetV1-YoloV3 model structure used in this example: <a href="./images/MobileNetV1-YoloV3.pdf">MobileNetV1-YoloV3.pdf</a>

 Also use the following commands to inspect the names and shapes of the parameters of the target convolution layers:

 ```
 for param in fluid.default_main_program().global_block().all_parameters():
     if 'weights' in param.name:
-        print param.name, param.shape
+        print(param.name, param.shape)
 ```
...
...
@@ -109,6 +110,7 @@ python compress.py \
-s yolov3_mobilenet_v1_slim.yaml \
-c ../../configs/yolov3_mobilenet_v1_voc.yml \
-o max_iters=258 \
YoloTrainFeed.batch_size=64 \
-d "../../dataset/voc"
```
...
...
@@ -117,9 +119,9 @@ python compress.py \
 To change the number of training GPUs, adjust the following parameters in the configuration file `yolov3_mobilenet_v1_voc.yml`:

 - **max_iters:** the number of batches in one `epoch`. Set it to `total_num / batch_size`, where `total_num` is the total number of training samples and `batch_size` is the total batch size across all GPUs.
-- **YoloTrainFeed.batch_size:** the batch size on a single GPU, limited by GPU memory.
+- **YoloTrainFeed.batch_size:** with DataLoader, the batch size on a single GPU; with a plain reader, the total `batch_size` across all GPUs. `batch_size` is limited by GPU memory.
 - **LeaningRate.base_lr:** adjust `base_lr` according to the total `batch_size` across all GPUs. The two are positively correlated, and scaling proportionally is a simple choice.
 - **LearningRate.schedulers.PiecewiseDecay.milestones:** adjust according to the change in batch size.
 - **LearningRate.schedulers.PiecewiseDecay.LinearWarmup.steps:** adjust according to the change in batch size.
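
The arithmetic behind the first and third bullets can be sketched as follows. The sample count, reference batch size, and reference learning rate below are made-up illustrative values, the helper name `scale_for_batch_size` is hypothetical, and proportional scaling of `base_lr` is only the simple rule the README suggests, not a tuned recipe.

```python
def scale_for_batch_size(total_num, total_batch_size, ref_batch_size, ref_base_lr):
    """Return (max_iters, base_lr) for a given total batch size.

    max_iters is the number of whole batches in one epoch; base_lr is scaled
    in proportion to the total batch size relative to a reference setting.
    """
    max_iters = total_num // total_batch_size
    base_lr = ref_base_lr * total_batch_size / ref_batch_size
    return max_iters, base_lr


# Hypothetical dataset of 16512 samples, reference setting: batch 64, lr 0.001.
for batch_size in (64, 32):
    print(batch_size, scale_for_batch_size(16512, batch_size, 64, 0.001))
```

Halving the total batch size doubles the number of batches per epoch and halves the learning rate, which is why the scheduler milestones and warmup steps, counted in batches, need to be rescaled as well.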
...
...
@@ -130,7 +132,7 @@ python compress.py \
     -s yolov3_mobilenet_v1_slim.yaml \
     -c ../../configs/yolov3_mobilenet_v1_voc.yml \
     -o max_iters=258 \
-    -o YoloTrainFeed.batch_size = 16 \
+    YoloTrainFeed.batch_size=64 \
     -d "../../dataset/voc"
```
...
...
@@ -140,9 +142,9 @@ python compress.py \
     -s yolov3_mobilenet_v1_slim.yaml \
     -c ../../configs/yolov3_mobilenet_v1_voc.yml \
     -o max_iters=516 \
-    -o LeaningRate.base_lr=0.005 \ # 0.001 /2
-    -o YoloTrainFeed.batch_size = 16 \
-    -o LearningRate.schedulers='[!PiecewiseDecay {gamma: 0.1, milestones: [110000, 124000]}, !LinearWarmup {start_factor: 0., steps: 2000}]' \
+    LeaningRate.base_lr=0.005 \
+    YoloTrainFeed.batch_size=32 \
+    LearningRate.schedulers='[!PiecewiseDecay {gamma: 0.1, milestones: [110000, 124000]}, !LinearWarmup {start_factor: 0., steps: 2000}]' \
     -d "../../dataset/voc"
```
...
...
@@ -166,6 +168,16 @@ python compress.py \
If you do not need the evaluation model, set the `save_eval_model` option to False when defining the Compressor object (the default is True).
Run the evaluation with:
```
python ../eval.py \
--model_path ${checkpoint_path}/${epoch_id}/eval_model/ \
--model_name __model__ \
--params_name __params__ \
-c ../../configs/yolov3_mobilenet_v1_voc.yml \
-d "../../dataset/voc"
```
## Prediction
If `checkpoint_path` is set in the configuration file and the `prune_infer_model` option is specified when defining the Compressor object, then every epoch
...
@@ -180,6 +192,16 @@ python compress.py \
The script <a href="../infer.py">PaddleDetection/tools/infer.py</a> shows how to load and run the inference model with the fluid Python API.
Run it with:
```
python ../infer.py \
--model_path ${checkpoint_path}/${epoch_id}/eval_model/ \
--model_name __model__.infer \
--params_name __params__ \
-c ../../configs/yolov3_mobilenet_v1_voc.yml \
--infer_dir ../../demo
```
### PaddleLite
The inference model produced by this example can be loaded and used directly by PaddleLite.
...
...
@@ -187,13 +209,13 @@ python compress.py \
 ## Results

 > The results of the current release are not the best achievable after hyperparameter tuning. They are for reference only and will be improved later.

 ### MobileNetV1-YOLO-V3

-| FLOPS |top1_acc/top5_acc| model_size |Paddle Fluid inference time(ms)| Paddle Lite inference time(ms)|
+| FLOPS |Box AP| model_size |Paddle Fluid inference time(ms)| Paddle Lite inference time(ms)|
 |---|---|---|---|---|
-|baseline|- |- |- |-|
-|-10%|- |- |- |-|
-|-30%|- |- |- |-|
-|-50%|- |- |- |-|
+|baseline|76.2 |93M |- |-|
+|-50%|69.48 |51M |- |-|

 ## FAQ
slim/prune/compress.py
View file @
0ea122f0
...
...
@@ -24,11 +24,13 @@ import sys
sys.path.append("../../")
from paddle.fluid.contrib.slim import Compressor


def set_paddle_flags(**kwargs):
    for key, value in kwargs.items():
        if os.environ.get(key, None) is None:
            os.environ[key] = str(value)


# NOTE(paddle-dev): All of these flags should be set before
# `import paddle`. Otherwise, it would not take any effect.
set_paddle_flags(
...
...
@@ -48,6 +50,8 @@ import logging
FORMAT = '%(asctime)s-%(levelname)s: %(message)s'
logging.basicConfig(level=logging.INFO, format=FORMAT)
logger = logging.getLogger(__name__)


def eval_run(exe, compile_program, reader, keys, values, cls, test_feed):
    """
    Run evaluation program, return program outputs.
...
...
@@ -66,8 +70,7 @@ def eval_run(exe, compile_program, reader, keys, values, cls, test_feed):
    has_bbox = 'bbox' in keys
    for data in reader():
        data = test_feed.feed(data)
        feed_data = {'image': data['image'], 'im_size': data['im_size']}
        outs = exe.run(compile_program,
                       feed=feed_data,
                       fetch_list=[values[0]],
...
...
@@ -147,9 +150,7 @@ def main():
     optimizer = optim_builder(lr)
     optimizer.minimize(loss)

-    train_reader = create_reader(train_feed, cfg.max_iters * devices_num,
-                                 FLAGS.dataset_dir)
+    train_reader = create_reader(train_feed, cfg.max_iters, FLAGS.dataset_dir)
     train_loader.set_sample_list_generator(train_reader, place)

     # parse train fetches
...
...
@@ -157,7 +158,7 @@ def main():
    train_keys.append("lr")
    train_values.append(lr.name)

    train_fetch_list = []
    for k, v in zip(train_keys, train_values):
        train_fetch_list.append((k, v))
...
...
@@ -181,8 +182,8 @@ def main():
    if cfg.metric == 'VOC':
        extra_keys = ['gt_box', 'gt_label', 'is_difficult']
    eval_keys, eval_values, eval_cls = parse_fetches(fetches, eval_prog,
                                                     extra_keys)
    eval_fetch_list = []
    for k, v in zip(eval_keys, eval_values):
        eval_fetch_list.append((k, v))
...
...
@@ -195,21 +196,20 @@ def main():
         #place = fluid.CPUPlace()
         #exe = fluid.Executor(place)
         results = eval_run(exe, program, eval_reader, eval_keys, eval_values,
                            eval_cls, test_data_feed)

         resolution = None
         if 'mask' in results[0]:
             resolution = model.mask_head.resolution
         box_ap_stats = eval_results(results, eval_feed, cfg.metric,
                                     cfg.num_classes, resolution, False,
                                     FLAGS.output_eval)
         if len(best_box_ap_list) == 0:
             best_box_ap_list.append(box_ap_stats[0])
         elif box_ap_stats[0] > best_box_ap_list[0]:
             best_box_ap_list[0] = box_ap_stats[0]
-            checkpoint.save(exe, train_prog, os.path.join(save_dir, "best_model"))
         logger.info("Best test box ap: {}".format(best_box_ap_list[0]))
         return best_box_ap_list[0]

     test_feed = [('image', test_feed_vars['image'].name),
...
...
@@ -228,13 +228,12 @@ def main():
        eval_func={'map': eval_func},
        eval_fetch_list=[eval_fetch_list[0]],
        save_eval_model=True,
        prune_infer_model=[["image", "im_size"], ["multiclass_nms_0.tmp_0"]],
        train_optimizer=None)
    com.config(FLAGS.slim_file)
    com.run()


if __name__ == '__main__':
    parser = ArgsParser()
    parser.add_argument(
...
...
slim/quantization/README.md
View file @
0ea122f0
...
...
@@ -37,28 +37,76 @@
-
config: 检测库的配置,其中配置了训练超参数、数据集信息等。
-
slim_file: PaddleSlim的配置文件,参见
[
配置文件说明
](
#配置文件说明
)
。
您可以通过运行以下命令运行该示例
,请确保已正确下载
[
pretrained model
](
https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/image_classification#%E5%B7%B2%E5%8F%91%E5%B8%83%E6%A8%A1%E5%9E%8B%E5%8F%8A%E5%85%B6%E6%80%A7%E8%83%BD
)
。
您可以通过运行以下命令运行该示例。
step1:
开启显存优化策略
step1:
设置gpu卡
```
export FLAGS_fast_eager_deletion_mode=1
export FLAGS_eager_delete_tensor_gb=0.0
export CUDA_VISIBLE_DEVICES=0
```
step2: 设置gpu卡,目前的超参设置适合2卡训练
step2: 开始训练
使用PaddleDetection提供的配置文件在用8卡进行训练:
```
python compress.py \
-s yolov3_mobilenet_v1_slim.yaml \
-c ../../configs/yolov3_mobilenet_v1_voc.yml \
-d "../../dataset/voc" \
-o max_iters=258 \
LearningRate.base_lr=0.0001 \
LearningRate.schedulers="[!PiecewiseDecay {gamma: 0.1, milestones: [258, 516]}]" \
pretrain_weights=https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1_voc.tar \
YoloTrainFeed.batch_size=64
```
>通过命令行覆盖设置max_iters选项,因为PaddleDetection中训练是以`batch`为单位迭代的,并没有涉及`epoch`的概念,但是PaddleSlim需要知道当前训练进行到第几个`epoch`, 所以需要将`max_iters`设置为一个`epoch`内的`batch`的数量。
如果要调整训练卡数,需要调整配置文件
`yolov3_mobilenet_v1_voc.yml`
中的以下参数:
-
**max_iters:**
一个
`epoch`
中batch的数量,需要设置为
`total_num / batch_size`
, 其中
`total_num`
为训练样本总数量,
`batch_size`
为多卡上总的batch size.
-
**YoloTrainFeed.batch_size:**
当使用DataLoader时,表示单张卡上的batch size; 当使用普通reader时,则表示多卡上的总的batch_size。batch_size受限于显存大小。
-
**LeaningRate.base_lr:**
根据多卡的总
`batch_size`
调整
`base_lr`
,两者大小正相关,可以简单的按比例进行调整。
-
**LearningRate.schedulers.PiecewiseDecay.milestones:**
请根据batch size的变化对其调整。
-
**LearningRate.schedulers.PiecewiseDecay.LinearWarmup.steps:**
请根据batch size的变化对其进行调整。
以下为4卡训练示例,通过命令行覆盖
`yolov3_mobilenet_v1_voc.yml`
中的参数:
```
export CUDA_VISIBLE_DEVICES=0,1
python compress.py \
-s yolov3_mobilenet_v1_slim.yaml \
-c ../../configs/yolov3_mobilenet_v1_voc.yml \
-d "../../dataset/voc" \
-o max_iters=258 \
LearningRate.base_lr=0.0001 \
LearningRate.schedulers="[!PiecewiseDecay {gamma: 0.1, milestones: [258, 516]}]" \
pretrain_weights=https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1_voc.tar \
YoloTrainFeed.batch_size=64
```
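step3: Start training
The following is a 2-card training example. Limited by GPU memory, the per-card `batch_size` stays the same, so the total `batch_size` decreases, `base_lr` decreases, and the number of batches per epoch increases; the learning-rate parameters must be adjusted accordingly, as follows: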
```
python compress.py \
-s yolov3_mobilenet_v1_slim.yaml \
-c yolov3_mobilenet_v1_voc.yml
-c ../../configs/yolov3_mobilenet_v1_voc.yml \
-d "../../dataset/voc" \
-o max_iters=516 \
LearningRate.base_lr=0.00005 \
LearningRate.schedulers="[!PiecewiseDecay {gamma: 0.1, milestones: [516, 1012]}]" \
pretrain_weights=https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1_voc.tar \
YoloTrainFeed.batch_size=32
```
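Run `python compress.py --help` to see the configurable arguments.
Run `python ../../tools/configure.py ${option_name} help` to see how to override parameters in the configuration file `yolov3_mobilenet_v1_voc.yml` from the command line.
### Model structure during training
This part is taken from [Introduction to the quantization low-level API](https://github.com/PaddlePaddle/models/tree/develop/PaddleSlim/quant_low_level_api#1-%E9%87%8F%E5%8C%96%E8%AE%AD%E7%BB%83low-level-apis%E4%BB%8B%E7%BB%8D).
The PaddlePaddle framework has four quantization-related IrPasses: QuantizationTransformPass, QuantizationFreezePass, ConvertToInt8Pass, and TransformForMobilePass. During training, QuantizationTransformPass is applied to the network; it inserts consecutive quantization and dequantization ops before each input of operators such as conv2d, depthwise_conv2d, and mul, and changes some inputs of the corresponding backward operators. An example is shown below:
The quantization-related IrPasses in the PaddlePaddle framework are QuantizationTransformPass, QuantizationFreezePass, and ConvertToInt8Pass. During training, QuantizationTransformPass is applied to the network; it inserts consecutive quantization and dequantization ops before each input of operators such as conv2d, depthwise_conv2d, and mul, and changes some inputs of the corresponding backward operators. An example is shown below: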
<p align="center">
<img src="./images/TransformPass.png" height=400 width=520 hspace='10'/> <br />
...
...
@@ -76,10 +124,10 @@ PaddlePaddle框架中有四个和量化相关的IrPass, 分别是QuantizationTra
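### Saving the evaluation and inference models
If `float_model_save_path`, `int8_model_save_path`, and `mobile_model_save_path` are set in the quantization strategy of the configuration file, the quantized models used for inference are saved when training ends. The differences between these three inference models are described next.
If `float_model_save_path` and `int8_model_save_path` are set in the quantization strategy of the configuration file, the quantized models used for inference are saved when training ends. The differences between these 2 inference models are described next.
#### FP32 model
As described in the section on the model structure during quantization training, the PaddlePaddle framework has four quantization-related IrPasses: QuantizationTransformPass, QuantizationFreezePass, ConvertToInt8Pass, and TransformForMobilePass. The FP32 model is the model saved after applying QuantizationFreezePass and removing redundant operators from eval_program.
As described in the section on the model structure during quantization training, the quantization-related IrPasses in the PaddlePaddle framework are QuantizationTransformPass, QuantizationFreezePass, and ConvertToInt8Pass. The FP32 model is the model saved after applying QuantizationFreezePass and removing redundant operators from eval_program.
QuantizationFreezePass mainly changes the order of the quantization and dequantization ops in the IrGraph, turning an arrangement like the one in Figure 1 into the layout in Figure 2. In addition, QuantizationFreezePass quantizes the weights of operators such as `conv2d`, `depthwise_conv2d`, and `mul` offline to values within the int8_t range (the data type remains float32), which avoids quantizing the weights during inference. See Figure 2: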
...
...
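To make the "int8 range, still stored as floats" point concrete, here is a minimal pure-Python sketch of symmetric abs_max weight quantization. It assumes 8-bit symmetric quantization with a single scale per tensor; `quantize_abs_max` and `dequantize` are hypothetical helpers for illustration, not the code that QuantizationFreezePass actually runs.

```python
def quantize_abs_max(weights, bits=8):
    """Map floats onto the signed integer grid [-127, 127], kept as floats."""
    bound = 2 ** (bits - 1) - 1  # 127 for 8 bits
    scale = max(abs(w) for w in weights)  # abs_max: one scale for the whole tensor
    if scale == 0:
        return [0.0 for _ in weights], scale
    # Values land on integers, but the container type is still float.
    quantized = [float(round(w / scale * bound)) for w in weights]
    return quantized, scale


def dequantize(quantized, scale, bits=8):
    """Recover approximate float weights from the quantized values."""
    bound = 2 ** (bits - 1) - 1
    return [q * scale / bound for q in quantized]


if __name__ == "__main__":
    q, s = quantize_abs_max([1.0, -0.25, 0.0])
    print(q, s)
    print(dequantize(q, s))
```

Because the quantized weights are computed once offline, inference only needs the stored scale to dequantize results.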
@@ -97,19 +145,13 @@ QuantizationFreezePass主要用于改变IrGraph中量化op和反量化op的顺
<strong>Figure 3: Result after applying ConvertToInt8Pass</strong>
</p>
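#### mobile model
After conversion by TransformForMobilePass, users obtain a quantized model compatible with the [paddle-lite](https://github.com/PaddlePaddle/Paddle-Lite) mobile inference library. In paddle-mobile, the quantization and dequantization ops are named `quantize` and `dequantize`. The `quantize` operator is functionally similar to the `fake_quantize_abs_max` operator family in the PaddlePaddle framework, and the `dequantize` operator is functionally identical to the `fake_dequantize_max_abs` operator family. To run a model produced by quantization training with paddle-mobile, operators such as `fake_quantize_abs_max` must be replaced by `quantize`, and operators such as `fake_dequantize_max_abs` by `dequantize`. See Figure 4: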
<p align="center">
<img src="./images/TransformForMobilePass.png" height=400 width=400 hspace='10'/> <br />
<strong>Figure 4: Result after applying TransformForMobilePass</strong>
</p>
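> In summary, the following model structures appear during quantization:
1. The original model.
2. The quantized model structure for training, obtained after QuantizationTransformPass. The `eval_model` saved under ${checkpoint_path} has this structure, and it is also the network evaluated at the end of each epoch during training. Although it is not the final structure wanted, the per-epoch evaluation results can be used to select a model.
3. The FP32 model structure obtained after QuantizationFreezePass, described above. The dataset evaluation results listed in this document were obtained on the FP32 model structure. This structure is saved only once during training, at the end of the `end_epoch` set in the quantization configuration file. To convert the result of another epoch into an FP32 model, use the script <a href='./freeze.py'>PaddleSlim/classification/quantization/freeze.py</a>; its usage is described in [Evaluation](#评估).
4. The 8-bit model structure obtained after ConvertToInt8Pass, described above. This structure is saved only once during training, at the end of the `end_epoch` set in the quantization configuration file. To convert the result of another epoch into an 8-bit model, use the script <a href='./freeze.py'>slim/quantization/freeze.py</a>; its usage is described in [Evaluation](#评估).
5. The mobile model structure obtained after TransformForMobilePass, described above. This structure is saved only once during training, at the end of the `end_epoch` set in the quantization configuration file. To convert the result of another epoch into a mobile model, use the script <a href='./freeze.py'>slim/quantization/freeze.py</a>; its usage is described in [Evaluation](#评估).
## Evaluation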
...
...
@@ -128,21 +170,24 @@ python ../eval.py \
--model_path ${checkpoint_path}/${epoch_id}/eval_model/ \
--model_name __model__ \
--params_name __params__ \
-c yolov3_mobilenet_v1_voc.yml
-c ../../configs/yolov3_mobilenet_v1_voc.yml \
-d "../../dataset/voc"
```
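After evaluation, select the model from the best-performing epoch. The script <a href='./freeze.py'>slim/quantization/freeze.py</a> converts it into the three models described above (FP32 model, int8 model, mobile model). The parameters to configure are:
After evaluation, select the model from the best-performing epoch. The script <a href='./freeze.py'>slim/quantization/freeze.py</a> converts it into the 2 models described above (FP32 model, int8 model). The parameters to configure are:
- model_path: path of the model to load, i.e. `${checkpoint_path}/${epoch_id}/eval_model/`
- weight_quant_type: quantization type of the model weights; keep it consistent with the type in the configuration file
- save_path: where the `FP32`, `8-bit`, and `mobile` models are saved, namely `${save_path}/float/`, `${save_path}/int8/`, and `${save_path}/mobile/`
- save_path: where the `FP32` and `8-bit` models are saved, namely `${save_path}/float/` and `${save_path}/int8/`
Example command: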
```
python freeze.py \
--model_path ${checkpoint_path}/${epoch_id}/eval_model/ \
--weight_quant_type ${weight_quant_type} \
--save_path ${any path you want}
--save_path ${any path you want} \
-c ../../configs/yolov3_mobilenet_v1_voc.yml \
-d "../../dataset/voc"
```
### Final evaluation of the model
...
...
@@ -150,10 +195,11 @@ python freeze.py \
Run the following command:
```
python ../eval.py \
--model_path ${float_model_path}
--model_path ${float_model_path}
--model_name model \
--params_name weights \
-c yolov3_mobilenet_v1_voc.yml
-c ../../configs/yolov3_mobilenet_v1_voc.yml \
-d "../../dataset/voc"
```
## Inference
...
...
@@ -169,7 +215,7 @@ python ../infer.py \
--model_path ${save_path}/float \
--model_name model \
--params_name weights \
-c yolov3_mobilenet_v1_voc.yml \
-c ../../configs/yolov3_mobilenet_v1_voc.yml \
--infer_dir ../../demo
```
...
...
@@ -180,7 +226,9 @@ FP32模型可使用PaddleLite进行加载预测,可参见教程[Paddle-Lite如
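## Example results
### MobileNetV1
>The results of the current release are not the best achievable after hyperparameter tuning; they are for reference only, and we will improve them later.
### MobileNetV1-YOLO-V3
| Weight quantization type | Activation quantization type | Box AP | Paddle Fluid inference time (ms) | Paddle Lite inference time (ms) |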
|---|---|---|---|---|
...
...
@@ -189,9 +237,5 @@ FP32模型可使用PaddleLite进行加载预测,可参见教程[Paddle-Lite如
|abs_max|moving_average_abs_max|- |- |-|
|channel_wise_abs_max|abs_max|- |- |-|
>Training hyperparameters:
## FAQ
slim/quantization/compress.py
View file @ 0ea122f0
...
...
@@ -28,11 +28,13 @@ from paddle.fluid.contrib.slim import Compressor
from paddle.fluid.framework import IrGraph
from paddle.fluid import core
def set_paddle_flags(**kwargs):
    for key, value in kwargs.items():
        if os.environ.get(key, None) is None:
            os.environ[key] = str(value)
# NOTE(paddle-dev): All of these flags should be set before
# `import paddle`. Otherwise, it would not take any effect.
set_paddle_flags(
...
...
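The helper above only fills in flags that are not already present in the environment, so values exported in the shell take precedence. A minimal, self-contained sketch of that behavior follows; the flag names `DEMO_FLAG_A` and `DEMO_FLAG_B` are made up for the demonstration and are not real Paddle flags.

```python
import os


def set_paddle_flags(**kwargs):
    # Same logic as the helper in the diff: set a flag only if it is unset.
    for key, value in kwargs.items():
        if os.environ.get(key, None) is None:
            os.environ[key] = str(value)


# Pretend the user exported DEMO_FLAG_A before launching the script.
os.environ["DEMO_FLAG_A"] = "1"
os.environ.pop("DEMO_FLAG_B", None)

set_paddle_flags(DEMO_FLAG_A=0, DEMO_FLAG_B=0.5)

print(os.environ["DEMO_FLAG_A"])  # user value is kept
print(os.environ["DEMO_FLAG_B"])  # default is filled in as a string
```

Since environment variables are strings, every default is passed through `str()` before being stored.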
@@ -46,7 +48,7 @@ from ppdet.data.data_feed import create_reader
from ppdet.utils.eval_utils import parse_fetches, eval_results
from ppdet.utils.stats import TrainingStats
from ppdet.utils.cli import ArgsParser
from ppdet.utils.cli import ArgsParser, print_total_cfg
from ppdet.utils.check import check_gpu, check_version
import ppdet.utils.checkpoint as checkpoint
from ppdet.modeling.model_input import create_feed
...
...
@@ -55,6 +57,8 @@ import logging
FORMAT = '%(asctime)s-%(levelname)s: %(message)s'
logging.basicConfig(level=logging.INFO, format=FORMAT)
logger = logging.getLogger(__name__)
def eval_run(exe, compile_program, reader, keys, values, cls, test_feed):
"""
Run evaluation program, return program outputs.
...
...
@@ -73,11 +77,10 @@ def eval_run(exe, compile_program, reader, keys, values, cls, test_feed):
    has_bbox = 'bbox' in keys
    for data in reader():
        data = test_feed.feed(data)
        feed_data = {'image': data['image'], 'im_size': data['im_size']}
        feed_data = {'image': data['image'], 'im_size': data['im_size']}
        outs = exe.run(compile_program,
                       feed=feed_data,
                       fetch_list=values[0],
                       fetch_list=[values[0]],
                       return_numpy=False)
        outs.append(data['gt_box'])
        outs.append(data['gt_label'])
...
...
@@ -118,8 +121,8 @@ def main():
    # check if set use_gpu=True in paddlepaddle cpu version
    check_gpu(cfg.use_gpu)
    # print_total_cfg(cfg)
    #check_version()
    if cfg.use_gpu:
        devices_num = fluid.core.get_cuda_device_count()
    else:
...
...
@@ -155,16 +158,14 @@ def main():
        optimizer = optim_builder(lr)
        optimizer.minimize(loss)
    train_reader = create_reader(train_feed, cfg.max_iters * devices_num, FLAGS.dataset_dir)
    train_reader = create_reader(train_feed, cfg.max_iters, FLAGS.dataset_dir)
    train_loader.set_sample_list_generator(train_reader, place)
    # parse train fetches
    train_keys, train_values, _ = parse_fetches(train_fetches)
    train_values.append(lr)
    train_fetch_list = []
    train_fetch_list = []
    for k, v in zip(train_keys, train_values):
        train_fetch_list.append((k, v))
    print("train_fetch_list: {}".format(train_fetch_list))
...
...
@@ -188,14 +189,13 @@ def main():
    if cfg.metric == 'VOC':
        extra_keys = ['gt_box', 'gt_label', 'is_difficult']
    eval_keys, eval_values, eval_cls = parse_fetches(fetches, eval_prog,
                                                     extra_keys)
                                                     extra_keys)
    # print(eval_values)
    eval_fetch_list = []
    eval_fetch_list = []
    for k, v in zip(eval_keys, eval_values):
        eval_fetch_list.append((k, v))
    exe.run(startup_prog)
    start_iter = 0
...
...
@@ -208,21 +208,20 @@ def main():
        #place = fluid.CPUPlace()
        #exe = fluid.Executor(place)
        results = eval_run(exe, program, eval_reader, eval_keys, eval_values, eval_cls, test_data_feed)
        results = eval_run(exe, program, eval_reader, eval_keys, eval_values, eval_cls, test_data_feed)
        resolution = None
        if 'mask' in results[0]:
            resolution = model.mask_head.resolution
        box_ap_stats = eval_results(results, eval_feed, cfg.metric, cfg.num_classes, resolution, False, FLAGS.output_eval)
        box_ap_stats = eval_results(results, eval_feed, cfg.metric, cfg.num_classes, resolution, False, FLAGS.output_eval)
        if len(best_box_ap_list) == 0:
            best_box_ap_list.append(box_ap_stats[0])
        elif box_ap_stats[0] > best_box_ap_list[0]:
            best_box_ap_list[0] = box_ap_stats[0]
            checkpoint.save(exe, train_prog, os.path.join(save_dir, "best_model"))
        logger.info("Best test box ap: {}".format(best_box_ap_list[0]))
        logger.info("Best test box ap: {}".format(best_box_ap_list[0]))
        return best_box_ap_list[0]
    test_feed = [('image', test_feed_vars['image'].name),
...
...
@@ -240,12 +239,12 @@ def main():
        eval_feed_list=test_feed,
        eval_func={'map': eval_func},
        eval_fetch_list=[eval_fetch_list[0]],
        prune_infer_model=[["image", "im_size"], ["multiclass_nms_0.tmp_0"]],
        train_optimizer=None)
    com.config(FLAGS.slim_file)
    com.run()
if __name__ == '__main__':
    parser = ArgsParser()
    parser.add_argument(
...
...
slim/quantization/freeze.py
View file @ 0ea122f0
...
...
@@ -32,11 +32,13 @@ from paddle.fluid.contrib.slim.quantization import QuantizationFreezePass
from paddle.fluid.contrib.slim.quantization import ConvertToInt8Pass
from paddle.fluid.contrib.slim.quantization import TransformForMobilePass
def set_paddle_flags(**kwargs):
    for key, value in kwargs.items():
        if os.environ.get(key, None) is None:
            os.environ[key] = str(value)
# NOTE(paddle-dev): All of these flags should be set before
# `import paddle`. Otherwise, it would not take any effect.
set_paddle_flags(
...
...
@@ -59,6 +61,8 @@ import logging
FORMAT = '%(asctime)s-%(levelname)s: %(message)s'
logging.basicConfig(level=logging.INFO, format=FORMAT)
logger = logging.getLogger(__name__)
def eval_run(exe, compile_program, reader, keys, values, cls, test_feed):
"""
Run evaluation program, return program outputs.
...
...
@@ -71,8 +75,7 @@ def eval_run(exe, compile_program, reader, keys, values, cls, test_feed):
    has_bbox = 'bbox' in keys
    for data in reader():
        data = test_feed.feed(data)
        feed_data = {'image': data['image'], 'im_size': data['im_size']}
        feed_data = {'image': data['image'], 'im_size': data['im_size']}
        outs = exe.run(compile_program,
                       feed=feed_data,
                       fetch_list=values[0],
...
...
@@ -123,7 +126,6 @@ def main():
        devices_num = int(os.environ.get('CPU_NUM', multiprocessing.cpu_count()))
    if 'eval_feed' not in cfg:
        eval_feed = create(main_arch + 'EvalFeed')
    else:
...
...
@@ -138,85 +140,79 @@ def main():
#eval_pyreader.decorate_sample_list_generator(eval_reader, place)
    test_data_feed = fluid.DataFeeder(test_feed_vars.values(), place)
    assert os.path.exists(FLAGS.model_path)
    infer_prog, feed_names, fetch_targets = fluid.io.load_inference_model(
        dirname=FLAGS.model_path, executor=exe, model_filename='__model__', params_filename='__params__')
        dirname=FLAGS.model_path, executor=exe, model_filename='__model__.infer', params_filename='__params__')
    eval_keys = ['bbox', 'gt_box', 'gt_label', 'is_difficult']
    eval_values = ['multiclass_nms_0.tmp_0', 'gt_box', 'gt_label', 'is_difficult']
    eval_values = ['multiclass_nms_0.tmp_0', 'gt_box', 'gt_label', 'is_difficult']
    eval_cls = []
    eval_values[0] = fetch_targets[0]
    results = eval_run(exe, infer_prog, eval_reader, eval_keys, eval_values, eval_cls, test_data_feed)
    results = eval_run(exe, infer_prog, eval_reader, eval_keys, eval_values, eval_cls, test_data_feed)
    resolution = None
    if 'mask' in results[0]:
        resolution = model.mask_head.resolution
    box_ap_stats = eval_results(results, eval_feed, cfg.metric, cfg.num_classes,
                                resolution, False, FLAGS.output_eval)
                                resolution, False, FLAGS.output_eval)
    logger.info("freeze the graph for inference")
    test_graph = IrGraph(core.Graph(infer_prog.desc), for_test=True)
    freeze_pass = QuantizationFreezePass(
        scope=fluid.global_scope(), place=place, weight_quantize_type=FLAGS.weight_quant_type)
        scope=fluid.global_scope(), place=place, weight_quantize_type=FLAGS.weight_quant_type)
    freeze_pass.apply(test_graph)
    server_program = test_graph.to_program()
    fluid.io.save_inference_model(
        dirname=os.path.join(FLAGS.save_path, 'float'), feeded_var_names=feed_names, target_vars=fetch_targets, executor=exe, main_program=server_program, model_filename='model', params_filename='weights')
        dirname=os.path.join(FLAGS.save_path, 'float'), feeded_var_names=feed_names, target_vars=fetch_targets, executor=exe, main_program=server_program, model_filename='model', params_filename='weights')
    logger.info("convert the weights into int8 type")
    convert_int8_pass = ConvertToInt8Pass(
        scope=fluid.global_scope(), place=place)
        scope=fluid.global_scope(), place=place)
    convert_int8_pass.apply(test_graph)
    server_int8_program = test_graph.to_program()
    fluid.io.save_inference_model(
        dirname=os.path.join(FLAGS.save_path, 'int8'), feeded_var_names=feed_names, target_vars=fetch_targets, executor=exe, main_program=server_int8_program, model_filename='model', params_filename='weights')
        dirname=os.path.join(FLAGS.save_path, 'int8'), feeded_var_names=feed_names, target_vars=fetch_targets, executor=exe, main_program=server_int8_program, model_filename='model', params_filename='weights')
    logger.info("convert the freezed pass to paddle-lite execution")
    mobile_pass = TransformForMobilePass()
    mobile_pass.apply(test_graph)
    mobile_program = test_graph.to_program()
    fluid.io.save_inference_model(
        dirname=os.path.join(FLAGS.save_path, 'mobile'), feeded_var_names=feed_names, target_vars=fetch_targets, executor=exe, main_program=mobile_program, model_filename='model', params_filename='weights')
        dirname=os.path.join(FLAGS.save_path, 'mobile'), feeded_var_names=feed_names, target_vars=fetch_targets, executor=exe, main_program=mobile_program, model_filename='model', params_filename='weights')
if __name__ == '__main__':
    parser = ArgsParser()
    parser.add_argument(
        "-m", "--model_path", default=None, type=str, help="path of checkpoint")
        "-m", "--model_path", default=None, type=str, help="path of checkpoint")
    parser.add_argument(
        "--output_eval",
        default=None,
...
...