内鬼850 / PaddleDetection (fork of PaddlePaddle / PaddleDetection, in sync with the fork source)
Commit 0ea122f0
Authored Oct 25, 2019 by Bai Yifan
Committed Oct 25, 2019 by whs

Copy slim from release/1.6 to develop (#3758)

Parent: 3126a437
Changes: 11 changed files with 790 additions and 141 deletions
Changed files:
- slim/distillation/README.md (+141, -0)
- slim/distillation/compress.py (+325, -0)
- slim/distillation/run.sh (+47, -0)
- slim/distillation/yolov3_mobilenet_v1_yolov3_resnet34_distillation.yml (+18, -0)
- slim/distillation/yolov3_resnet34.yml (+34, -0)
- slim/infer.py (+37, -13)
- slim/prune/README.md (+37, -15)
- slim/prune/compress.py (+16, -17)
- slim/quantization/README.md (+75, -31)
- slim/quantization/compress.py (+19, -20)
- slim/quantization/freeze.py (+41, -45)
slim/distillation/README.md (new file, mode 0 → 100755)
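> Install Paddle 1.6 or later before running this example.

# Detection Model Distillation Example

## Overview

This example uses the [distillation strategy](https://github.com/PaddlePaddle/models/blob/develop/PaddleSlim/docs/tutorial.md#3-蒸馏) provided by PaddleSlim to run distillation training on models in the detection library.
Before reading this example, we recommend getting familiar with the following:

- [Standard training workflow of the detection library](https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/PaddleDetection)
- [PaddleSlim usage documentation](https://github.com/PaddlePaddle/models/blob/develop/PaddleSlim/docs/usage.md)

## Configuration Files

For how to write the configuration files, see:

- [Writing PaddleSlim configuration files](https://github.com/PaddlePaddle/models/blob/develop/PaddleSlim/docs/usage.md#122-%E9%85%8D%E7%BD%AE%E6%96%87%E4%BB%B6%E7%9A%84%E4%BD%BF%E7%94%A8)
- [Writing the distillation strategy configuration](https://github.com/PaddlePaddle/models/blob/develop/PaddleSlim/docs/usage.md#23-蒸馏)

This example distills a MobileNetV1-YoloV3 model using ResNet34-YoloV3 as the teacher. To get an overall picture of the `student model` and the `teacher model`, and to decide which tensors to distill, first print the names and shapes of the Variables in both networks with the following code: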
```python
# Inspect the Variables of the student model
for v in fluid.default_main_program().list_vars():
    if "py_reader" not in v.name and "double_buffer" not in v.name and "generated_var" not in v.name:
        print(v.name, v.shape)

# Inspect the Variables of the teacher model
for v in teacher_program.list_vars():
    print(v.name, v.shape)
```
Comparing the two listings shows that the `student model` and the `teacher model` produce, among others, the following intermediate results:

```bash
# student model
conv2d_15.tmp_0
# teacher model
teacher_teacher_conv2d_1.tmp_0
```

We therefore use `l2_distiller` to distill between these two feature maps, configured as follows in the configuration file:

```yaml
distillers:
    l2_distiller:
        class: 'L2Distiller'
        teacher_feature_map: 'teacher_teacher_conv2d_1.tmp_0'
        student_feature_map: 'conv2d_15.tmp_0'
        distillation_loss_weight: 1
strategies:
    distillation_strategy:
        class: 'DistillationStrategy'
        distillers: ['l2_distiller']
        start_epoch: 0
        end_epoch: 270
```
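
To make the `L2Distiller` configuration more concrete, the sketch below shows what an L2 distillation loss between a student and a teacher feature map amounts to numerically. It is a framework-free illustration and not PaddleSlim's implementation: the function name `l2_distill_loss`, the flat-list representation of a feature map, and the mean reduction are all assumptions made for this example.

```python
def l2_distill_loss(student_fm, teacher_fm, weight=1.0):
    """Weighted mean squared difference between two equally sized feature maps.

    Hypothetical helper for illustration; feature maps are flat lists of floats.
    """
    if len(student_fm) != len(teacher_fm):
        raise ValueError("feature maps must have the same number of elements")
    squared = [(s - t) ** 2 for s, t in zip(student_fm, teacher_fm)]
    return weight * sum(squared) / len(squared)


# Toy values standing in for conv2d_15.tmp_0 and teacher_teacher_conv2d_1.tmp_0.
student = [0.0, 1.0, 2.0, 3.0]
teacher = [0.0, 1.0, 2.0, 5.0]
print(l2_distill_loss(student, teacher, weight=1.0))
```

The element-wise comparison only makes sense when both feature maps have the same shape, which is one reason to print the Variable shapes before picking a student/teacher pair.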
Following the same procedure, other losses can be chosen for the distillation strategy. PaddleSlim supports `FSP_loss`, `L2_loss`, and `softmax_with_cross_entropy_loss`.
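
Of these alternatives, the FSP loss is the least self-explanatory. As a rough sketch under stated assumptions: an FSP matrix is commonly defined as the spatially averaged inner product between the channels of two feature maps, and the loss compares the student's matrix with the teacher's. The helpers `fsp_matrix` and `fsp_loss` below are hypothetical pure-Python stand-ins that represent a feature map as `[channels][h*w]` nested lists; they do not reproduce PaddleSlim's code.

```python
def fsp_matrix(fm_a, fm_b):
    """FSP matrix of two feature maps given as [channels][h*w] nested lists.

    Entry (i, j) is the mean over spatial positions of fm_a[i][k] * fm_b[j][k].
    """
    n = len(fm_a[0])
    if any(len(row) != n for row in fm_a + fm_b):
        raise ValueError("all channels must share the same spatial size")
    return [[sum(a * b for a, b in zip(row_a, row_b)) / n for row_b in fm_b]
            for row_a in fm_a]


def fsp_loss(student_pair, teacher_pair):
    """Mean squared difference between the student and teacher FSP matrices.

    Assumes both pairs yield matrices of the same shape.
    """
    g_s = fsp_matrix(*student_pair)
    g_t = fsp_matrix(*teacher_pair)
    diffs = [(s - t) ** 2
             for row_s, row_t in zip(g_s, g_t)
             for s, t in zip(row_s, row_t)]
    return sum(diffs) / len(diffs)


# Two feature maps with 2 channels and 2 spatial positions each.
fm_a = [[1.0, 2.0], [3.0, 4.0]]
fm_b = [[1.0, 0.0], [0.0, 1.0]]
print(fsp_matrix(fm_a, fm_b))
print(fsp_loss((fm_a, fm_b), (fm_a, fm_b)))
```

Unlike the L2 loss, this formulation compares channel correlations between a pair of layers, so each FSP distiller needs a pair of feature maps per network rather than a single one.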
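## Training

The compression script compress.py is written based on [PaddleDetection/tools/train.py](https://github.com/PaddlePaddle/models/blob/develop/PaddleCV/PaddleDetection/tools/train.py).
The script defines a Compressor object that runs the compression task.

You can run this example with the script `run.sh`.

### Saving checkpoints

If `checkpoint_path` is set in the configuration file, checkpoints are saved automatically while the distillation task runs. When the task is interrupted abnormally, restarting it automatically loads the latest checkpoint under `checkpoint_path`, chosen by numeric order. If you do not want the restarted task to resume from a checkpoint, either change `checkpoint_path` in the configuration file or empty the directory it points to.

> Note: the contents of the configuration file are not saved in checkpoints, so changes made to the configuration file before restarting take effect.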
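
The "latest checkpoint by numeric order" rule can be sketched as follows. This is an illustration only, assuming that checkpoints live in sub-directories named after the epoch id (as the `${checkpoint_path}/${epoch_id}/` paths elsewhere in this README suggest); `latest_checkpoint` is a hypothetical helper, not a PaddleSlim function.

```python
import os
import tempfile


def latest_checkpoint(checkpoint_path):
    """Return the sub-directory with the largest numeric name, or None."""
    if not os.path.isdir(checkpoint_path):
        return None
    epochs = [name for name in os.listdir(checkpoint_path)
              if name.isdigit()
              and os.path.isdir(os.path.join(checkpoint_path, name))]
    if not epochs:
        return None
    # Compare as integers so that "10" sorts after "9".
    return os.path.join(checkpoint_path, max(epochs, key=int))


with tempfile.TemporaryDirectory() as root:
    for epoch_id in ("0", "9", "10"):
        os.mkdir(os.path.join(root, epoch_id))
    print(os.path.basename(latest_checkpoint(root)))
```

Comparing the names as integers rather than strings matters once training passes epoch 9; a plain string sort would pick `9` over `10`.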
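## Evaluation

If `checkpoint_path` is set in the configuration file, a compressed model for evaluation is saved every epoch under `${checkpoint_path}/${epoch_id}/eval_model/`. It consists of two files, `__model__` and `__params__`: `__model__` stores the model structure and `__params__` stores the parameters.

If you do not need to save the evaluation model, set the `save_eval_model` option to False (the default is True) when defining the Compressor object.

Run the evaluation with: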
```
python ../eval.py \
--model_path ${checkpoint_path}/${epoch_id}/eval_model/ \
--model_name __model__ \
--params_name __params__ \
-c ../../configs/yolov3_mobilenet_v1_voc.yml \
-d "../../dataset/voc"
```
## Inference

If `checkpoint_path` is set in the configuration file and the `prune_infer_model` option is specified when defining the Compressor object, an `inference model` is saved every epoch. This model is obtained by removing redundant operators from eval_program.
It is saved under `${checkpoint_path}/${epoch_id}/eval_model/` and consists of two files, `__model__.infer` and `__params__`: `__model__.infer` stores the model structure and `__params__` stores the parameters.

For more on the `prune_infer_model` option, see [Introduction to Compressor](https://github.com/PaddlePaddle/models/blob/develop/PaddleSlim/docs/usage.md#121-%E5%A6%82%E4%BD%95%E6%94%B9%E5%86%99%E6%99%AE%E9%80%9A%E8%AE%AD%E7%BB%83%E8%84%9A%E6%9C%AC).

### Inference with Python

The script <a href="../infer.py">slim/infer.py</a> shows how to load and run the inference model with the fluid Python API.

Run it with:
```
python ../infer.py \
--model_path ${checkpoint_path}/${epoch_id}/eval_model/ \
--model_name __model__.infer \
--params_name __params__ \
-c ../../configs/yolov3_mobilenet_v1_voc.yml \
--infer_dir ../../demo
```
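### PaddleLite

The inference model produced by this example can be loaded and used directly by PaddleLite.
For how to use PaddleLite, see the [PaddleLite documentation](https://github.com/PaddlePaddle/Paddle-Lite/wiki#%E4%BD%BF%E7%94%A8).

## Example Results

> The results in the current release are not the best obtainable after hyperparameter tuning and serve only as a reference; we will improve them later.

### MobileNetV1-YOLO-V3

| FLOPS |Box AP|
|---|---|
|baseline|76.2 |
|distilled|- |

## FAQ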
slim/distillation/compress.py (new file, mode 0 → 100644)
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os
import time
import multiprocessing
import numpy as np
from collections import deque, OrderedDict
from paddle.fluid.contrib.slim.core import Compressor
from paddle.fluid.framework import IrGraph


def set_paddle_flags(**kwargs):
    for key, value in kwargs.items():
        if os.environ.get(key, None) is None:
            os.environ[key] = str(value)


# NOTE(paddle-dev): All of these flags should be set before
# `import paddle`. Otherwise, it would not take any effect.
set_paddle_flags(
    FLAGS_eager_delete_tensor_gb=0,  # enable GC to save memory
)

from paddle import fluid

import sys
sys.path.append("../../")
from ppdet.core.workspace import load_config, merge_config, create
from ppdet.data.data_feed import create_reader

from ppdet.utils.eval_utils import parse_fetches, eval_results
from ppdet.utils.stats import TrainingStats
from ppdet.utils.cli import ArgsParser
from ppdet.utils.check import check_gpu
import ppdet.utils.checkpoint as checkpoint
from ppdet.modeling.model_input import create_feed

import logging
FORMAT = '%(asctime)s-%(levelname)s: %(message)s'
logging.basicConfig(level=logging.INFO, format=FORMAT)
logger = logging.getLogger(__name__)


def eval_run(exe, compile_program, reader, keys, values, cls, test_feed):
    """
    Run evaluation program, return program outputs.
    """
    iter_id = 0
    results = []
    if len(cls) != 0:
        values = []
        for i in range(len(cls)):
            _, accum_map = cls[i].get_map_var()
            cls[i].reset(exe)
            values.append(accum_map)

    images_num = 0
    start_time = time.time()
    has_bbox = 'bbox' in keys
    for data in reader():
        data = test_feed.feed(data)
        feed_data = {'image': data['image'], 'im_size': data['im_size']}
        outs = exe.run(compile_program,
                       feed=feed_data,
                       fetch_list=[values[0]],
                       return_numpy=False)
        outs.append(data['gt_box'])
        outs.append(data['gt_label'])
        outs.append(data['is_difficult'])
        res = {
            k: (np.array(v), v.recursive_sequence_lengths())
            for k, v in zip(keys, outs)
        }
        results.append(res)
        if iter_id % 100 == 0:
            logger.info('Test iter {}'.format(iter_id))
        iter_id += 1
        images_num += len(res['bbox'][1][0]) if has_bbox else 1
    logger.info('Test finish iter {}'.format(iter_id))

    end_time = time.time()
    fps = images_num / (end_time - start_time)
    if has_bbox:
        logger.info('Total number of images: {}, inference time: {} fps.'.
                    format(images_num, fps))
    else:
        logger.info('Total iteration: {}, inference time: {} batch/s.'.format(
            images_num, fps))

    return results


def main():
    cfg = load_config(FLAGS.config)
    if 'architecture' in cfg:
        main_arch = cfg.architecture
    else:
        raise ValueError("'architecture' not specified in config file.")

    merge_config(FLAGS.opt)
    if 'log_iter' not in cfg:
        cfg.log_iter = 20

    # check if set use_gpu=True in paddlepaddle cpu version
    check_gpu(cfg.use_gpu)

    if cfg.use_gpu:
        devices_num = fluid.core.get_cuda_device_count()
    else:
        devices_num = int(
            os.environ.get('CPU_NUM', multiprocessing.cpu_count()))

    if 'train_feed' not in cfg:
        train_feed = create(main_arch + 'TrainFeed')
    else:
        train_feed = create(cfg.train_feed)

    if 'eval_feed' not in cfg:
        eval_feed = create(main_arch + 'EvalFeed')
    else:
        eval_feed = create(cfg.eval_feed)

    place = fluid.CUDAPlace(0) if cfg.use_gpu else fluid.CPUPlace()
    exe = fluid.Executor(place)

    lr_builder = create('LearningRate')
    optim_builder = create('OptimizerBuilder')

    # build program
    model = create(main_arch)
    train_loader, train_feed_vars = create_feed(train_feed, iterable=True)
    train_fetches = model.train(train_feed_vars)
    loss = train_fetches['loss']
    lr = lr_builder()
    opt = optim_builder(lr)
    opt.minimize(loss)
    #for v in fluid.default_main_program().list_vars():
    #    if "py_reader" not in v.name and "double_buffer" not in v.name and "generated_var" not in v.name:
    #        print(v.name, v.shape)

    cfg.max_iters = 258
    train_reader = create_reader(train_feed, cfg.max_iters, FLAGS.dataset_dir)

    train_loader.set_sample_list_generator(train_reader, place)

    exe.run(fluid.default_startup_program())

    # parse train fetches
    train_keys, train_values, _ = parse_fetches(train_fetches)
    train_keys.append('lr')
    train_values.append(lr.name)

    train_fetch_list = []
    for k, v in zip(train_keys, train_values):
        train_fetch_list.append((k, v))
    print("train_fetch_list: {}".format(train_fetch_list))

    eval_prog = fluid.Program()
    startup_prog = fluid.Program()
    with fluid.program_guard(eval_prog, startup_prog):
        with fluid.unique_name.guard():
            model = create(main_arch)
            _, test_feed_vars = create_feed(eval_feed, iterable=True)
            fetches = model.eval(test_feed_vars)
    eval_prog = eval_prog.clone(True)

    eval_reader = create_reader(eval_feed, args_path=FLAGS.dataset_dir)
    test_data_feed = fluid.DataFeeder(test_feed_vars.values(), place)

    # parse eval fetches
    extra_keys = []
    if cfg.metric == 'COCO':
        extra_keys = ['im_info', 'im_id', 'im_shape']
    if cfg.metric == 'VOC':
        extra_keys = ['gt_box', 'gt_label', 'is_difficult']
    eval_keys, eval_values, eval_cls = parse_fetches(fetches, eval_prog,
                                                     extra_keys)
    eval_fetch_list = []
    for k, v in zip(eval_keys, eval_values):
        eval_fetch_list.append((k, v))
    print("eval_fetch_list: {}".format(eval_fetch_list))

    exe.run(startup_prog)
    checkpoint.load_params(exe,
                           fluid.default_main_program(), cfg.pretrain_weights)

    best_box_ap_list = []

    def eval_func(program, scope):
        results = eval_run(exe, program, eval_reader, eval_keys, eval_values,
                           eval_cls, test_data_feed)

        resolution = None
        is_bbox_normalized = False
        if 'mask' in results[0]:
            resolution = model.mask_head.resolution
        box_ap_stats = eval_results(results, eval_feed, cfg.metric,
                                    cfg.num_classes, resolution,
                                    is_bbox_normalized, FLAGS.output_eval)
        if len(best_box_ap_list) == 0:
            best_box_ap_list.append(box_ap_stats[0])
        elif box_ap_stats[0] > best_box_ap_list[0]:
            best_box_ap_list[0] = box_ap_stats[0]
        logger.info("Best test box ap: {}".format(best_box_ap_list[0]))
        return best_box_ap_list[0]

    test_feed = [('image', test_feed_vars['image'].name),
                 ('im_size', test_feed_vars['im_size'].name)]

    teacher_cfg = load_config(FLAGS.teacher_config)
    teacher_arch = teacher_cfg.architecture
    teacher_programs = []
    teacher_program = fluid.Program()
    teacher_startup_program = fluid.Program()
    with fluid.program_guard(teacher_program, teacher_startup_program):
        with fluid.unique_name.guard('teacher_'):
            teacher_feed_vars = OrderedDict()
            for name, var in train_feed_vars.items():
                teacher_feed_vars[name] = teacher_program.global_block(
                )._clone_variable(
                    var, force_persistable=False)
            model = create(teacher_arch)
            train_fetches = model.train(teacher_feed_vars)
    #print("="*50+"teacher_model_params"+"="*50)
    #for v in teacher_program.list_vars():
    #    print(v.name, v.shape)
    #return

    exe.run(teacher_startup_program)
    assert FLAGS.teacher_pretrained and os.path.exists(
        FLAGS.teacher_pretrained
    ), "teacher_pretrained should be set when teacher_model is not None."

    def if_exist(var):
        return os.path.exists(os.path.join(FLAGS.teacher_pretrained, var.name))

    fluid.io.load_vars(
        exe,
        FLAGS.teacher_pretrained,
        main_program=teacher_program,
        predicate=if_exist)

    teacher_programs.append(teacher_program.clone(for_test=True))

    com = Compressor(
        place,
        fluid.global_scope(),
        fluid.default_main_program(),
        train_reader=train_reader,
        train_feed_list=[(key, value.name)
                         for key, value in train_feed_vars.items()],
        train_fetch_list=train_fetch_list,
        eval_program=eval_prog,
        eval_reader=eval_reader,
        eval_feed_list=test_feed,
        eval_func={'map': eval_func},
        eval_fetch_list=eval_fetch_list[0:1],
        save_eval_model=True,
        prune_infer_model=[["image", "im_size"], ["multiclass_nms_0.tmp_0"]],
        teacher_programs=teacher_programs,
        train_optimizer=None,
        distiller_optimizer=opt,
        log_period=20)

    com.config(FLAGS.slim_file)
    com.run()


if __name__ == '__main__':
    parser = ArgsParser()
    parser.add_argument(
        "-t",
        "--teacher_config",
        default=None,
        type=str,
        help="Config file of teacher architecture.")
    parser.add_argument(
        "-s",
        "--slim_file",
        default=None,
        type=str,
        help="Config file of PaddleSlim.")
    parser.add_argument(
        "-r",
        "--resume_checkpoint",
        default=None,
        type=str,
        help="Checkpoint path for resuming training.")
    parser.add_argument(
        "--eval",
        action='store_true',
        default=False,
        help="Whether to perform evaluation in train")
    parser.add_argument(
        "--teacher_pretrained",
        default=None,
        type=str,
        help="Whether to use pretrained model.")
    parser.add_argument(
        "--output_eval",
        default=None,
        type=str,
        help="Evaluation directory, default is current directory.")
    parser.add_argument(
        "-d",
        "--dataset_dir",
        default=None,
        type=str,
        help="Dataset path, same as DataFeed.dataset.dataset_dir")
    FLAGS = parser.parse_args()
    main()
slim/distillation/run.sh (new file, mode 0 → 100644)
#!/usr/bin/env bash
# download pretrain model
root_url="https://paddlemodels.bj.bcebos.com/object_detection"
yolov3_r34_voc="yolov3_r34_voc.tar"
pretrain_dir='./pretrain'

if [ ! -d ${pretrain_dir} ]; then
  mkdir ${pretrain_dir}
fi

cd ${pretrain_dir}

if [ ! -f ${yolov3_r34_voc} ]; then
    wget ${root_url}/${yolov3_r34_voc}
    tar xf ${yolov3_r34_voc}
fi
cd -

# enable GC strategy
export FLAGS_fast_eager_deletion_mode=1
export FLAGS_eager_delete_tensor_gb=0.0

# for distillation
#-----------------
export CUDA_VISIBLE_DEVICES=0,1,2,3

# Fixing name conflicts in distillation
cd ${pretrain_dir}/yolov3_r34_voc
for files in $(ls teacher_*)
    do mv $files ${files#*_}
done
for files in $(ls *)
    do mv $files "teacher_"$files
done
cd -

python -u compress.py \
-c ../../configs/yolov3_mobilenet_v1_voc.yml \
-t yolov3_resnet34.yml \
-s yolov3_mobilenet_v1_yolov3_resnet34_distillation.yml \
-o YoloTrainFeed.batch_size=64 \
-d ../../dataset/voc \
--teacher_pretrained ./pretrain/yolov3_r34_voc \
> yolov3_distallation.log 2>&1 &
tailf yolov3_distallation.log
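
The two rename loops in run.sh ("Fixing name conflicts in distillation") can be read as a pure function over file names: `${files#*_}` drops everything up to and including the first underscore of names that already start with `teacher_`, and the second loop then prefixes every file with `teacher_`. The sketch below restates that logic in Python; the function name and the sample file names are made up for illustration.

```python
def normalize_teacher_names(names, prefix="teacher_"):
    """Mirror the two shell loops: strip an existing prefix, then add it to all.

    Hypothetical helper; it only computes new names and renames nothing on disk.
    """
    # First loop: `ls teacher_*` selects prefixed files, `${files#*_}` removes
    # the shortest leading match of `*_`, i.e. the text through the first "_".
    stripped = [n.split("_", 1)[1] if n.startswith(prefix) else n for n in names]
    # Second loop: every file receives the prefix exactly once.
    return [prefix + n for n in stripped]


print(normalize_teacher_names(["teacher_conv1_weights", "bn1_scale"]))
```

The net effect is that every weight file ends up with exactly one `teacher_` prefix, whether or not it had one before, so running the script repeatedly does not stack prefixes.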
slim/distillation/yolov3_mobilenet_v1_yolov3_resnet34_distillation.yml (new file, mode 0 → 100644)
version: 1.0
distillers:
    l2_distiller:
        class: 'L2Distiller'
        teacher_feature_map: 'teacher_teacher_conv2d_1.tmp_0'
        student_feature_map: 'conv2d_15.tmp_0'
        distillation_loss_weight: 1
strategies:
    distillation_strategy:
        class: 'DistillationStrategy'
        distillers: ['l2_distiller']
        start_epoch: 0
        end_epoch: 270
compressor:
    epoch: 271
    checkpoint_path: './checkpoints/'
    strategies:
        - distillation_strategy
slim/quantization/yolov3_mobilenet_v1_voc.yml → slim/distillation/yolov3_resnet34.yml
 architecture: YOLOv3
-train_feed: YoloTrainFeed
-eval_feed: YoloEvalFeed
-test_feed: YoloTestFeed
-use_gpu: true
-max_iters: 1000
 log_smooth_window: 20
-save_dir: output
-snapshot_iter: 2000
 metric: VOC
 map_type: 11point
-pretrain_weights: https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1_voc.tar
-weights: output/yolov3_mobilenet_v1_voc/model_final
 num_classes: 20
+weight_prefix_name: teacher_

 YOLOv3:
-  backbone: MobileNet
+  backbone: ResNet
   yolo_head: YOLOv3Head

-MobileNet:
+ResNet:
   norm_type: sync_bn
+  freeze_at: 0
+  freeze_norm: false
   norm_decay: 0.
-  conv_group_scale: 1
-  with_extra_blocks: false
+  depth: 34
+  feature_maps: [3, 4, 5]

 YOLOv3Head:
   anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
...
@@ -38,50 +32,3 @@ YOLOv3Head:
     nms_top_k: 1000
     normalized: false
     score_threshold: 0.01
-
-LearningRate:
-  base_lr: 0.0001
-  schedulers:
-  - !PiecewiseDecay
-    gamma: 0.1
-    milestones:
-    - 1000
-    - 2000
-  #- !LinearWarmup
-    #start_factor: 0.
-    #steps: 1000
-
-OptimizerBuilder:
-  optimizer:
-    momentum: 0.9
-    type: Momentum
-  regularizer:
-    factor: 0.0005
-    type: L2
-
-YoloTrainFeed:
-  batch_size: 8
-  dataset:
-    dataset_dir: ../../dataset/voc
-    annotation: VOCdevkit/VOC_all/ImageSets/Main/train.txt
-    image_dir: VOCdevkit/VOC_all/JPEGImages
-    use_default_label: true
-  num_workers: 8
-  bufsize: 128
-  use_process: true
-  mixup_epoch: 250
-
-YoloEvalFeed:
-  batch_size: 8
-  image_shape: [3, 608, 608]
-  dataset:
-    dataset_dir: ../../dataset/voc
-    annotation: VOCdevkit/VOC_all/ImageSets/Main/val.txt
-    image_dir: VOCdevkit/VOC_all/JPEGImages
-    use_default_label: true
-
-YoloTestFeed:
-  batch_size: 1
-  image_shape: [3, 608, 608]
-  dataset:
-    use_default_label: true
slim/infer.py
...
@@ -19,11 +19,13 @@ from __future__ import print_function
 import os
 import sys
 import glob
+import time

 import numpy as np
 from PIL import Image

+sys.path.append("../../")

 def set_paddle_flags(**kwargs):
     for key, value in kwargs.items():
         if os.environ.get(key, None) is None:
...
@@ -117,20 +119,19 @@ def main():
     test_images = get_test_images(FLAGS.infer_dir, FLAGS.infer_img)
     test_feed.dataset.add_images(test_images)

     place = fluid.CUDAPlace(0) if cfg.use_gpu else fluid.CPUPlace()
     exe = fluid.Executor(place)

     infer_prog, feed_var_names, fetch_list = fluid.io.load_inference_model(
         dirname=FLAGS.model_path,
         model_filename=FLAGS.model_name,
         params_filename=FLAGS.params_name,
         executor=exe)

     reader = create_reader(test_feed)
     feeder = fluid.DataFeeder(
         place=place, feed_list=feed_var_names, program=infer_prog)

     # parse infer fetches
     assert cfg.metric in ['COCO', 'VOC'], \
...
@@ -140,7 +141,9 @@ def main():
         extra_keys = ['im_info', 'im_id', 'im_shape']
     if cfg['metric'] == 'VOC':
         extra_keys = ['im_id', 'im_shape']
     keys, values, _ = parse_fetches({'bbox': fetch_list}, infer_prog, extra_keys)

     # parse dataset category
     if cfg.metric == 'COCO':
...
@@ -166,9 +169,33 @@ def main():
     imid2path = reader.imid2path
     keys = ['bbox']
+    infer_time = True
+    compile_prog = fluid.compiler.CompiledProgram(infer_prog)
+
     for iter_id, data in enumerate(reader()):
         feed_data = [[d[0], d[1]] for d in data]
-        outs = exe.run(infer_prog,
+        # for infer time
+        if infer_time:
+            warmup_times = 10
+            repeats_time = 100
+            feed_data_dict = feeder.feed(feed_data)
+            for i in range(warmup_times):
+                exe.run(compile_prog,
+                        feed=feed_data_dict,
+                        fetch_list=fetch_list,
+                        return_numpy=False)
+            start_time = time.time()
+            for i in range(repeats_time):
+                exe.run(compile_prog,
+                        feed=feed_data_dict,
+                        fetch_list=fetch_list,
+                        return_numpy=False)
+
+            print("infer time: {} ms/sample".format((time.time() - start_time)
+                                                    * 1000 / repeats_time))
+            infer_time = False
+
+        outs = exe.run(compile_prog,
                        feed=feeder.feed(feed_data),
                        fetch_list=fetch_list,
                        return_numpy=False)
...
@@ -258,10 +285,7 @@ if __name__ == '__main__':
         default="tb_log_dir/image",
         help='Tensorboard logging directory for image.')
     parser.add_argument(
         '--model_path', type=str, default=None, help="inference model path")
     parser.add_argument(
         '--model_name',
         type=str,
...
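
The timing code added to slim/infer.py follows a warm-up-then-measure pattern: a number of untimed runs first, then a timed loop whose total is divided by the repeat count. A minimal framework-free sketch of the same idea is shown below. The helper `time_per_call_ms` is hypothetical, and it uses `time.perf_counter` instead of the `time.time` call in the diff, which is a deliberate substitution for a monotonic clock.

```python
import time


def time_per_call_ms(fn, warmup_times=10, repeats_time=100):
    """Average wall-clock milliseconds per call of fn after a warm-up phase."""
    for _ in range(warmup_times):
        fn()  # warm-up runs are executed but not timed
    start = time.perf_counter()
    for _ in range(repeats_time):
        fn()
    return (time.perf_counter() - start) * 1000 / repeats_time


calls = []
avg_ms = time_per_call_ms(lambda: calls.append(1), warmup_times=10, repeats_time=100)
print(len(calls), "calls,", avg_ms, "ms per call")
```

The warm-up phase keeps one-off costs such as lazy initialization or caching out of the measured average, which is presumably why the diff runs 10 untimed iterations before timing 100.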
slim/prune/README.md
...
@@ -7,7 +7,8 @@
 This example uses the [convolution channel pruning strategy](https://github.com/PaddlePaddle/models/blob/develop/PaddleSlim/docs/tutorial.md#2-%E5%8D%B7%E7%A7%AF%E6%A0%B8%E5%89%AA%E8%A3%81%E5%8E%9F%E7%90%86) provided by PaddleSlim to compress models in the detection library.
 Before reading this example, we recommend getting familiar with the following:
-- <a href="../..README_cn.md">Standard training workflow of the detection library</a>
+- <a href="../../README_cn.md">Standard training workflow of the detection library</a>
 - [Data preparation for detection models](https://github.com/PaddlePaddle/models/blob/develop/PaddleCV/PaddleDetection/docs/INSTALL_cn.md#%E6%95%B0%E6%8D%AE%E9%9B%86)
 - [PaddleSlim usage documentation](https://github.com/PaddlePaddle/models/blob/develop/PaddleSlim/docs/usage.md)
...
@@ -29,7 +30,7 @@ from paddle.fluid.framework import IrGraph
 from paddle.fluid import core
 graph = IrGraph(core.Graph(train_prog.desc), for_test=True)
 marked_nodes = set()
 for op in graph.all_op_nodes():
     print(op.name())
     if op.name().find('conv') > -1:
...
@@ -39,12 +40,12 @@ graph.draw('.', 'forward', marked_nodes)
 Visualization of the MobileNetV1-YoloV3 model structure in this example: <a href="./images/MobileNetV1-YoloV3.pdf">MobileNetV1-YoloV3.pdf</a>

 The following code also prints the names and shapes of the parameters of the target convolution layers:
 ```
 for param in fluid.default_main_program().global_block().all_parameters():
     if 'weights' in param.name:
-        print param.name, param.shape
+        print(param.name, param.shape)
 ```
...
@@ -109,6 +110,7 @@ python compress.py \
     -s yolov3_mobilenet_v1_slim.yaml \
     -c ../../configs/yolov3_mobilenet_v1_voc.yml \
     -o max_iters=258 \
+    YoloTrainFeed.batch_size=64 \
     -d "../../dataset/voc"
 ```
...
@@ -117,9 +119,9 @@ python compress.py \
 To change the number of training GPUs, adjust the following parameters in the configuration file `yolov3_mobilenet_v1_voc.yml`:

 - **max_iters:** The number of batches in one `epoch`. Set it to `total_num / batch_size`, where `total_num` is the total number of training samples and `batch_size` is the total batch size across all GPUs.
-- **YoloTrainFeed.batch_size:** The batch size on a single GPU, limited by GPU memory.
+- **YoloTrainFeed.batch_size:** When DataLoader is used, the batch size on a single GPU; when a plain reader is used, the total `batch_size` across all GPUs. `batch_size` is limited by GPU memory.
 - **LearningRate.base_lr:** Adjust `base_lr` according to the total `batch_size` across GPUs. The two are positively correlated, so `base_lr` can simply be scaled in proportion.
 - **LearningRate.schedulers.PiecewiseDecay.milestones:** Adjust according to the change in batch size.
 - **LearningRate.schedulers.PiecewiseDecay.LinearWarmup.steps:** Adjust according to the change in batch size.
...
@@ -130,7 +132,7 @@ python compress.py \
     -s yolov3_mobilenet_v1_slim.yaml \
     -c ../../configs/yolov3_mobilenet_v1_voc.yml \
     -o max_iters=258 \
-    -o YoloTrainFeed.batch_size = 16 \
+    YoloTrainFeed.batch_size=64 \
     -d "../../dataset/voc"
 ```
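
The relationship between `max_iters`, the total batch size, and `base_lr` described in the parameter list can be sketched as a small calculation. This is an illustration under assumptions, not part of the repository: `scale_for_total_batch` is a hypothetical helper, the sample count of 16,512 is chosen only so that the numbers come out as 258 and 516, integer division is assumed for `total_num / batch_size`, and the reference learning rate of 0.001 is a placeholder.

```python
def scale_for_total_batch(total_num, total_batch_size, ref_batch_size, ref_base_lr):
    """Derive max_iters (batches per epoch) and a proportionally scaled base_lr."""
    max_iters = total_num // total_batch_size  # batches in one epoch
    base_lr = ref_base_lr * total_batch_size / ref_batch_size  # linear scaling
    return max_iters, base_lr


# Hypothetical dataset size; reference setting is a total batch size of 64.
TOTAL_NUM = 16512
for total_batch_size in (64, 32):
    print(total_batch_size,
          scale_for_total_batch(TOTAL_NUM, total_batch_size, 64, 0.001))
```

Halving the total batch size doubles the number of batches per epoch and halves the learning rate, so any schedule expressed in iterations (milestones, warm-up steps) has to be stretched by the same factor.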
...
@@ -140,9 +142,9 @@ python compress.py \
     -s yolov3_mobilenet_v1_slim.yaml \
     -c ../../configs/yolov3_mobilenet_v1_voc.yml \
     -o max_iters=516 \
-    -o LeaningRate.base_lr=0.005 \ # 0.001 /2
-    -o YoloTrainFeed.batch_size = 16 \
-    -o LearningRate.schedulers='[!PiecewiseDecay {gamma: 0.1, milestones: [110000, 124000]}, !LinearWarmup {start_factor: 0., steps: 2000}]' \
+    LeaningRate.base_lr=0.005 \
+    YoloTrainFeed.batch_size=32 \
+    LearningRate.schedulers='[!PiecewiseDecay {gamma: 0.1, milestones: [110000, 124000]}, !LinearWarmup {start_factor: 0., steps: 2000}]' \
     -d "../../dataset/voc"
 ```
...
@@ -166,6 +168,16 @@ python compress.py \
 If you do not need to save the evaluation model, set the `save_eval_model` option to False (the default is True) when defining the Compressor object.

+Run the evaluation with:
+```
+python ../eval.py \
+    --model_path ${checkpoint_path}/${epoch_id}/eval_model/ \
+    --model_name __model__ \
+    --params_name __params__ \
+    -c ../../configs/yolov3_mobilenet_v1_voc.yml \
+    -d "../../dataset/voc"
+```
+
 ## Inference

 If `checkpoint_path` is set in the configuration file and the `prune_infer_model` option is specified when defining the Compressor object, then every epoch
...
@@ -180,6 +192,16 @@ python compress.py \
 The script <a href="../infer.py">PaddleDetection/tools/infer.py</a> shows how to load and run the inference model with the fluid Python API.

+Run it with:
+```
+python ../infer.py \
+    --model_path ${checkpoint_path}/${epoch_id}/eval_model/ \
+    --model_name __model__.infer \
+    --params_name __params__ \
+    -c ../../configs/yolov3_mobilenet_v1_voc.yml \
+    --infer_dir ../../demo
+```
+
 ### PaddleLite

 The inference model produced by this example can be loaded and used directly by PaddleLite.
...
@@ -187,13 +209,13 @@ python compress.py \
 ## Example Results

 > The results in the current release are not the best obtainable after hyperparameter tuning and serve only as a reference; we will improve them later.

 ### MobileNetV1-YOLO-V3

-| FLOPS |top1_acc/top5_acc| model_size |Paddle Fluid inference time(ms)| Paddle Lite inference time(ms)|
+| FLOPS |Box AP| model_size |Paddle Fluid inference time(ms)| Paddle Lite inference time(ms)|
 |---|---|---|---|---|
-|baseline|- |- |- |-|
-|-10%|- |- |- |-|
-|-30%|- |- |- |-|
-|-50%|- |- |- |-|
+|baseline|76.2 |93M |- |-|
+|-50%|69.48 |51M |- |-|

 ## FAQ
slim/prune/compress.py
...
@@ -24,11 +24,13 @@ import sys
 sys.path.append("../../")
 from paddle.fluid.contrib.slim import Compressor


 def set_paddle_flags(**kwargs):
     for key, value in kwargs.items():
         if os.environ.get(key, None) is None:
             os.environ[key] = str(value)


 # NOTE(paddle-dev): All of these flags should be set before
 # `import paddle`. Otherwise, it would not take any effect.
 set_paddle_flags(
...
@@ -48,6 +50,8 @@ import logging
 FORMAT = '%(asctime)s-%(levelname)s: %(message)s'
 logging.basicConfig(level=logging.INFO, format=FORMAT)
 logger = logging.getLogger(__name__)


 def eval_run(exe, compile_program, reader, keys, values, cls, test_feed):
     """
     Run evaluation program, return program outputs.
...
@@ -66,8 +70,7 @@ def eval_run(exe, compile_program, reader, keys, values, cls, test_feed):
     has_bbox = 'bbox' in keys
     for data in reader():
         data = test_feed.feed(data)
         feed_data = {'image': data['image'], 'im_size': data['im_size']}
         outs = exe.run(compile_program,
                        feed=feed_data,
                        fetch_list=[values[0]],
...
@@ -147,9 +150,7 @@ def main():
     optimizer = optim_builder(lr)
     optimizer.minimize(loss)

-    train_reader = create_reader(train_feed, cfg.max_iters * devices_num, FLAGS.dataset_dir)
+    train_reader = create_reader(train_feed, cfg.max_iters, FLAGS.dataset_dir)
     train_loader.set_sample_list_generator(train_reader, place)

     # parse train fetches
...
@@ -157,7 +158,7 @@ def main():
     train_keys.append("lr")
     train_values.append(lr.name)

     train_fetch_list = []
     for k, v in zip(train_keys, train_values):
         train_fetch_list.append((k, v))
...
@@ -181,8 +182,8 @@ def main():
     if cfg.metric == 'VOC':
         extra_keys = ['gt_box', 'gt_label', 'is_difficult']
     eval_keys, eval_values, eval_cls = parse_fetches(fetches, eval_prog,
                                                      extra_keys)
     eval_fetch_list = []
     for k, v in zip(eval_keys, eval_values):
         eval_fetch_list.append((k, v))
...
@@ -195,21 +196,20 @@ def main():
         #place = fluid.CPUPlace()
         #exe = fluid.Executor(place)
         results = eval_run(exe, program, eval_reader, eval_keys, eval_values,
                            eval_cls, test_data_feed)

         resolution = None
         if 'mask' in results[0]:
             resolution = model.mask_head.resolution
         box_ap_stats = eval_results(results, eval_feed, cfg.metric,
                                     cfg.num_classes, resolution, False,
                                     FLAGS.output_eval)
         if len(best_box_ap_list) == 0:
             best_box_ap_list.append(box_ap_stats[0])
         elif box_ap_stats[0] > best_box_ap_list[0]:
             best_box_ap_list[0] = box_ap_stats[0]
             checkpoint.save(exe, train_prog, os.path.join(save_dir, "best_model"))
         logger.info("Best test box ap: {}".format(best_box_ap_list[0]))
         return best_box_ap_list[0]

     test_feed = [('image', test_feed_vars['image'].name),
...
@@ -228,13 +228,12 @@ def main():
         eval_func={'map': eval_func},
         eval_fetch_list=[eval_fetch_list[0]],
         save_eval_model=True,
-        prune_infer_model=[["image", "im_size"],["multiclass_nms_0.tmp_0"]],
+        prune_infer_model=[["image", "im_size"], ["multiclass_nms_0.tmp_0"]],
         train_optimizer=None)
     com.config(FLAGS.slim_file)
     com.run()


 if __name__ == '__main__':
     parser = ArgsParser()
     parser.add_argument(
...
slim/quantization/README.md
浏览文件 @
0ea122f0
...
...
@@ -37,28 +37,76 @@
-
config: 检测库的配置,其中配置了训练超参数、数据集信息等。
-
slim_file: PaddleSlim的配置文件,参见
[
配置文件说明
](
#配置文件说明
)
。
您可以通过运行以下命令运行该示例
,请确保已正确下载
[
pretrained model
](
https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/image_classification#%E5%B7%B2%E5%8F%91%E5%B8%83%E6%A8%A1%E5%9E%8B%E5%8F%8A%E5%85%B6%E6%80%A7%E8%83%BD
)
。
您可以通过运行以下命令运行该示例。
step1:
开启显存优化策略
step1:
设置gpu卡
```
export FLAGS_fast_eager_deletion_mode=1
export FLAGS_eager_delete_tensor_gb=0.0
export CUDA_VISIBLE_DEVICES=0
```
step2: 设置gpu卡,目前的超参设置适合2卡训练
step2: 开始训练
使用PaddleDetection提供的配置文件在用8卡进行训练:
```
python compress.py \
-s yolov3_mobilenet_v1_slim.yaml \
-c ../../configs/yolov3_mobilenet_v1_voc.yml \
-d "../../dataset/voc" \
-o max_iters=258 \
LearningRate.base_lr=0.0001 \
LearningRate.schedulers="[!PiecewiseDecay {gamma: 0.1, milestones: [258, 516]}]" \
pretrain_weights=https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1_voc.tar \
YoloTrainFeed.batch_size=64
```
>通过命令行覆盖设置max_iters选项,因为PaddleDetection中训练是以`batch`为单位迭代的,并没有涉及`epoch`的概念,但是PaddleSlim需要知道当前训练进行到第几个`epoch`, 所以需要将`max_iters`设置为一个`epoch`内的`batch`的数量。
如果要调整训练卡数,需要调整配置文件
`yolov3_mobilenet_v1_voc.yml`
中的以下参数:
-
**max_iters:**
一个
`epoch`
中batch的数量,需要设置为
`total_num / batch_size`
, 其中
`total_num`
为训练样本总数量,
`batch_size`
为多卡上总的batch size.
-
**YoloTrainFeed.batch_size:**
当使用DataLoader时,表示单张卡上的batch size; 当使用普通reader时,则表示多卡上的总的batch_size。batch_size受限于显存大小。
-
**LeaningRate.base_lr:**
根据多卡的总
`batch_size`
调整
`base_lr`
,两者大小正相关,可以简单的按比例进行调整。
-
**LearningRate.schedulers.PiecewiseDecay.milestones:**
请根据batch size的变化对其调整。
-
**LearningRate.schedulers.PiecewiseDecay.LinearWarmup.steps:**
请根据batch size的变化对其进行调整。
以下为4卡训练示例,通过命令行覆盖
`yolov3_mobilenet_v1_voc.yml`
中的参数:
```
export CUDA_VISIBLE_DEVICES=0,1
python compress.py \
-s yolov3_mobilenet_v1_slim.yaml \
-c ../../configs/yolov3_mobilenet_v1_voc.yml \
-d "../../dataset/voc" \
-o max_iters=258 \
LearningRate.base_lr=0.0001 \
LearningRate.schedulers="[!PiecewiseDecay {gamma: 0.1, milestones: [258, 516]}]" \
pretrain_weights=https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1_voc.tar \
YoloTrainFeed.batch_size=64
```
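step3: Start training
The following is a 2-GPU training example. Because of GPU memory limits, the per-GPU `batch_size` stays the same, the total `batch_size` decreases, `base_lr` decreases, and the number of batches in one epoch increases. The learning-rate parameters must be adjusted as well, as follows: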
```
python compress.py \
-s yolov3_mobilenet_v1_slim.yaml \
-c yolov3_mobilenet_v1_voc.yml
-c ../../configs/yolov3_mobilenet_v1_voc.yml \
-d "../../dataset/voc" \
-o max_iters=516 \
LearningRate.base_lr=0.00005 \
LearningRate.schedulers="[!PiecewiseDecay {gamma: 0.1, milestones: [516, 1012]}]" \
pretrain_weights=https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1_voc.tar \
YoloTrainFeed.batch_size=32
```
Run `python compress.py --help` to see the configurable parameters.
Run `python ../../tools/configure.py ${option_name} help` to see how to override the parameters in the configuration file `yolov3_mobilenet_v1_voc.yml` from the command line.
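### Model structure during training
This part is taken from the [introduction to the quantization low-level API](https://github.com/PaddlePaddle/models/tree/develop/PaddleSlim/quant_low_level_api#1-%E9%87%8F%E5%8C%96%E8%AE%AD%E7%BB%83low-level-apis%E4%BB%8B%E7%BB%8D).
The PaddlePaddle framework has four quantization-related IrPasses: QuantizationTransformPass, QuantizationFreezePass, ConvertToInt8Pass, and TransformForMobilePass. During training, QuantizationTransformPass is applied to the network. It inserts consecutive quantization and dequantization ops before each input of operators such as conv2d, depthwise_conv2d, and mul, and changes some inputs of the corresponding backward operators. An example is shown in the figure below:
The quantization-related IrPasses in the PaddlePaddle framework are QuantizationTransformPass, QuantizationFreezePass, and ConvertToInt8Pass. During training, QuantizationTransformPass is applied to the network. It inserts consecutive quantization and dequantization ops before each input of operators such as conv2d, depthwise_conv2d, and mul, and changes some inputs of the corresponding backward operators. An example is shown in the figure below: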
<p align="center">
<img src="./images/TransformPass.png" height=400 width=520 hspace='10'/> <br />
...
...
@@ -76,10 +124,10 @@ PaddlePaddle框架中有四个和量化相关的IrPass, 分别是QuantizationTra
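### Saving the evaluation and inference models
If `float_model_save_path`, `int8_model_save_path`, and `mobile_model_save_path` are set in the quantization strategy of the configuration file, the quantized models used for inference are saved when training ends. The differences between these three inference models are described next.
If `float_model_save_path` and `int8_model_save_path` are set in the quantization strategy of the configuration file, the quantized models used for inference are saved when training ends. The differences between these 2 inference models are described next.
#### FP32 model
The description of the model structure during quantization training noted that the PaddlePaddle framework has four quantization-related IrPasses: QuantizationTransformPass, QuantizationFreezePass, ConvertToInt8Pass, and TransformForMobilePass. The FP32 model is the model saved after applying QuantizationFreezePass and removing the redundant operators from eval_program.
The description of the model structure during quantization training noted that the quantization-related IrPasses in the PaddlePaddle framework are QuantizationTransformPass, QuantizationFreezePass, and ConvertToInt8Pass. The FP32 model is the model saved after applying QuantizationFreezePass and removing the redundant operators from eval_program.
QuantizationFreezePass mainly changes the order of the quantization and dequantization ops in the IrGraph, that is, it turns a layout like the one in Figure 1 into the layout in Figure 2. In addition, QuantizationFreezePass quantizes the weights of operators such as `conv2d`, `depthwise_conv2d`, and `mul` offline to values within the int8_t range (the data type remains float32), which reduces the weight quantization work during inference. An example is shown in Figure 2: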
...
...
@@ -97,19 +145,13 @@ QuantizationFreezePass主要用于改变IrGraph中量化op和反量化op的顺
<strong>Figure 3: Result after applying ConvertToInt8Pass</strong>
</p>
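The difference between the FP32 model (weights restricted to the int8_t range but still stored as float32) and the 8-bit model (weights stored as int8) can be illustrated with a small standard-library sketch. It assumes symmetric 8-bit abs-max quantization; the helper names `quantize_abs_max` and `dequantize` are hypothetical and are not PaddlePaddle APIs, and the weights are made-up values.

```python
from array import array


def quantize_abs_max(weights, bits=8):
    """Map floats to integers in [-(2**(bits-1) - 1), 2**(bits-1) - 1]."""
    bin_cnt = (1 << (bits - 1)) - 1  # 127 when bits == 8
    scale = max(abs(w) for w in weights)
    if scale == 0.0:
        return [0] * len(weights), 0.0
    return [int(round(w / scale * bin_cnt)) for w in weights], scale


def dequantize(quantized, scale, bits=8):
    """Recover approximate float values from quantized integers."""
    bin_cnt = (1 << (bits - 1)) - 1
    return [q * scale / bin_cnt for q in quantized]


weights = [0.3, -1.0, 0.75, 0.0]  # illustrative values only
quantized, scale = quantize_abs_max(weights)

# FP32-style storage: int8-range values kept in a float32 container.
fp32_style = array('f', quantized)
# 8-bit-style storage: the same values kept in a signed 8-bit container.
int8_style = array('b', quantized)

print(list(fp32_style), fp32_style.itemsize)
print(list(int8_style), int8_style.itemsize)
print(dequantize(quantized, scale))
```

Both containers hold the same numbers; only the storage type differs, which is why the 8-bit variant is smaller on disk.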
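#### mobile model
After conversion by TransformForMobilePass, users obtain a quantized model compatible with the [paddle-lite](https://github.com/PaddlePaddle/Paddle-Lite) mobile inference library. In paddle-mobile, the quantization and dequantization ops are named `quantize` and `dequantize`. The `quantize` operator is similar in function to the `fake_quantize_abs_max` operator family in the PaddlePaddle framework, and the `dequantize` operator has the same function as the `fake_dequantize_max_abs` operator family. To run a model produced by quantization training with paddle-mobile, operators such as `fake_quantize_abs_max` must be replaced with `quantize`, and operators such as `fake_dequantize_max_abs` with `dequantize`. An example is shown in Figure 4: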
<p align="center">
<img src="./images/TransformForMobilePass.png" height=400 width=400 hspace='10'/> <br />
<strong>Figure 4: Result after applying TransformForMobilePass</strong>
</p>
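> In summary, the quantization process involves the following model structures:
1. The original model.
2. The quantized model structure used for training, obtained after QuantizationTransformPass. The `eval_model` saved under ${checkpoint_path} has this structure, and this network is also used for evaluation at the end of every epoch during training. Although it is not the final desired model structure, the per-epoch evaluation results can be used to select a model.
3. The FP32 model structure obtained after QuantizationFreezePass, described above. The dataset evaluation results listed in this document were obtained by evaluating the FP32 model structure. This structure is saved only once during training, at the end of the `end_epoch` set in the quantization configuration file. To convert the training result of another epoch into an FP32 model, use the script <a href='./freeze.py'>PaddleSlim/classification/quantization/freeze.py</a>; its usage is described in [Evaluation](#评估).
4. The 8-bit model structure obtained after ConvertToInt8Pass, described above. This structure is saved only once during training, at the end of the `end_epoch` set in the quantization configuration file. To convert the training result of another epoch into an 8-bit model, use the script <a href='./freeze.py'>slim/quantization/freeze.py</a>; its usage is described in [Evaluation](#评估).
5. The mobile model structure obtained after TransformForMobilePass, described above. This structure is saved only once during training, at the end of the `end_epoch` set in the quantization configuration file. To convert the training result of another epoch into a mobile model, use the script <a href='./freeze.py'>slim/quantization/freeze.py</a>; its usage is described in [Evaluation](#评估).
## Evaluation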
...
...
@@ -128,21 +170,24 @@ python ../eval.py \
--model_path ${checkpoint_path}/${epoch_id}/eval_model/ \
--model_name __model__ \
--params_name __params__ \
-c yolov3_mobilenet_v1_voc.yml
-c ../../configs/yolov3_mobilenet_v1_voc.yml \
-d "../../dataset/voc"
```
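After evaluation, select the model from the best-performing epoch. The script <a href='./freeze.py'>slim/quantization/freeze.py</a> converts it into the three models described above (FP32 model, int8 model, and mobile model). The parameters to configure are:
After evaluation, select the model from the best-performing epoch. The script <a href='./freeze.py'>slim/quantization/freeze.py</a> converts it into the 2 models described above (FP32 model and int8 model). The parameters to configure are:
- model_path: path of the model to load, `${checkpoint_path}/${epoch_id}/eval_model/`
- weight_quant_type: quantization type of the model parameters; keep it consistent with the type in the configuration file
- save_path: save path for the `FP32`, `8-bit`, and `mobile` models, which are `${save_path}/float/`, `${save_path}/int8/`, and `${save_path}/mobile/` respectively
- save_path: save path for the `FP32` and `8-bit` models, which are `${save_path}/float/` and `${save_path}/int8/` respectively
Example command: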
```
python freeze.py \
--model_path ${checkpoint_path}/${epoch_id}/eval_model/ \
--weight_quant_type ${weight_quant_type} \
--save_path ${any path you want}
--save_path ${any path you want} \
-c ../../configs/yolov3_mobilenet_v1_voc.yml \
-d "../../dataset/voc"
```
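As a small illustration of the `save_path` layout described above, the output directories can be derived with `os.path.join`, in the same way `freeze.py` builds them. The helper name `frozen_model_dirs` and the example path `output/quant` are hypothetical.

```python
import os


def frozen_model_dirs(save_path):
    """Return the directories that hold the FP32 and 8-bit models."""
    return {
        "float": os.path.join(save_path, "float"),
        "int8": os.path.join(save_path, "int8"),
    }


# Hypothetical save path passed to freeze.py via --save_path.
for name, path in frozen_model_dirs("output/quant").items():
    print(name, path)
```

The `float` directory is the one later passed as `--model_path` when evaluating or running inference with the FP32 model.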
### Final evaluation of the model
...
...
@@ -150,10 +195,11 @@ python freeze.py \
The command is:
```
python ../eval.py \
--model_path ${float_model_path}
--model_path ${float_model_path}
--model_name model \
--params_name weights \
-c yolov3_mobilenet_v1_voc.yml
-c ../../configs/yolov3_mobilenet_v1_voc.yml \
-d "../../dataset/voc"
```
## Inference
...
...
@@ -169,7 +215,7 @@ python ../infer.py \
--model_path ${save_path}/float \
--model_name model \
--params_name weights \
-c yolov3_mobilenet_v1_voc.yml \
-c ../../configs/yolov3_mobilenet_v1_voc.yml \
--infer_dir ../../demo
```
...
...
@@ -180,7 +226,9 @@ FP32模型可使用PaddleLite进行加载预测,可参见教程[Paddle-Lite如
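## Example results
### MobileNetV1
> The results in the current release are not the best achievable after hyperparameter tuning. They are for reference only, and we will improve them later.
### MobileNetV1-YOLO-V3
| Weight quantization type | Activation quantization type | Box AP | Paddle Fluid inference time (ms) | Paddle Lite inference time (ms) |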
|---|---|---|---|---|
...
...
@@ -189,9 +237,5 @@ FP32模型可使用PaddleLite进行加载预测,可参见教程[Paddle-Lite如
|abs_max|moving_average_abs_max|- |- |-|
|channel_wise_abs_max|abs_max|- |- |-|
> Training hyperparameters:
## FAQ
slim/quantization/compress.py
View file @ 0ea122f0
...
...
@@ -28,11 +28,13 @@ from paddle.fluid.contrib.slim import Compressor
from paddle.fluid.framework import IrGraph
from paddle.fluid import core


def set_paddle_flags(**kwargs):
    for key, value in kwargs.items():
        if os.environ.get(key, None) is None:
            os.environ[key] = str(value)


# NOTE(paddle-dev): All of these flags should be set before
# `import paddle`. Otherwise, it would not take any effect.
set_paddle_flags(
...
...
@@ -46,7 +48,7 @@ from ppdet.data.data_feed import create_reader
from ppdet.utils.eval_utils import parse_fetches, eval_results
from ppdet.utils.stats import TrainingStats
from ppdet.utils.cli import ArgsParser
from ppdet.utils.cli import ArgsParser, print_total_cfg
from ppdet.utils.check import check_gpu, check_version
import ppdet.utils.checkpoint as checkpoint
from ppdet.modeling.model_input import create_feed
...
...
@@ -55,6 +57,8 @@ import logging
FORMAT = '%(asctime)s-%(levelname)s: %(message)s'
logging.basicConfig(level=logging.INFO, format=FORMAT)
logger = logging.getLogger(__name__)


def eval_run(exe, compile_program, reader, keys, values, cls, test_feed):
    """
    Run evaluation program, return program outputs.
...
...
@@ -73,11 +77,10 @@ def eval_run(exe, compile_program, reader, keys, values, cls, test_feed):
    has_bbox = 'bbox' in keys
    for data in reader():
        data = test_feed.feed(data)
        feed_data = {'image': data['image'], 'im_size': data['im_size']}
        feed_data = {'image': data['image'], 'im_size': data['im_size']}
        outs = exe.run(compile_program,
                       feed=feed_data,
                       fetch_list=values[0],
                       fetch_list=[values[0]],
                       return_numpy=False)
        outs.append(data['gt_box'])
        outs.append(data['gt_label'])
...
...
@@ -118,8 +121,8 @@ def main():
    # check if set use_gpu=True in paddlepaddle cpu version
    check_gpu(cfg.use_gpu)
    # print_total_cfg(cfg)
    #check_version()
    if cfg.use_gpu:
        devices_num = fluid.core.get_cuda_device_count()
    else:
...
...
@@ -155,16 +158,14 @@ def main():
            optimizer = optim_builder(lr)
            optimizer.minimize(loss)

    train_reader = create_reader(train_feed, cfg.max_iters * devices_num,
                                 FLAGS.dataset_dir)
    train_reader = create_reader(train_feed, cfg.max_iters, FLAGS.dataset_dir)
    train_loader.set_sample_list_generator(train_reader, place)

    # parse train fetches
    train_keys, train_values, _ = parse_fetches(train_fetches)
    train_values.append(lr)

    train_fetch_list = []
    train_fetch_list = []
    for k, v in zip(train_keys, train_values):
        train_fetch_list.append((k, v))
    print("train_fetch_list: {}".format(train_fetch_list))
...
...
@@ -188,14 +189,13 @@ def main():
    if cfg.metric == 'VOC':
        extra_keys = ['gt_box', 'gt_label', 'is_difficult']
    eval_keys, eval_values, eval_cls = parse_fetches(fetches, eval_prog,
                                                     extra_keys)
                                                     extra_keys)
    # print(eval_values)

    eval_fetch_list = []
    eval_fetch_list = []
    for k, v in zip(eval_keys, eval_values):
        eval_fetch_list.append((k, v))

    exe.run(startup_prog)
    start_iter = 0
...
...
@@ -208,21 +208,20 @@ def main():
        #place = fluid.CPUPlace()
        #exe = fluid.Executor(place)
        results = eval_run(exe, program, eval_reader, eval_keys, eval_values,
                           eval_cls, test_data_feed)
        results = eval_run(exe, program, eval_reader, eval_keys, eval_values,
                           eval_cls, test_data_feed)

        resolution = None
        if 'mask' in results[0]:
            resolution = model.mask_head.resolution
        box_ap_stats = eval_results(results, eval_feed, cfg.metric,
                                    cfg.num_classes, resolution, False,
                                    FLAGS.output_eval)
        box_ap_stats = eval_results(results, eval_feed, cfg.metric,
                                    cfg.num_classes, resolution, False,
                                    FLAGS.output_eval)
        if len(best_box_ap_list) == 0:
            best_box_ap_list.append(box_ap_stats[0])
        elif box_ap_stats[0] > best_box_ap_list[0]:
            best_box_ap_list[0] = box_ap_stats[0]
            checkpoint.save(exe, train_prog,
                            os.path.join(save_dir, "best_model"))
        logger.info("Best test box ap: {}".format(best_box_ap_list[0]))
        logger.info("Best test box ap: {}".format(best_box_ap_list[0]))
        return best_box_ap_list[0]

    test_feed = [('image', test_feed_vars['image'].name),
...
...
@@ -240,12 +239,12 @@ def main():
        eval_feed_list=test_feed,
        eval_func={'map': eval_func},
        eval_fetch_list=[eval_fetch_list[0]],
        prune_infer_model=[["image", "im_size"], ["multiclass_nms_0.tmp_0"]],
        train_optimizer=None)
    com.config(FLAGS.slim_file)
    com.run()


if __name__ == '__main__':
    parser = ArgsParser()
    parser.add_argument(
...
...
slim/quantization/freeze.py
View file @ 0ea122f0
...
...
@@ -32,11 +32,13 @@ from paddle.fluid.contrib.slim.quantization import QuantizationFreezePass
from paddle.fluid.contrib.slim.quantization import ConvertToInt8Pass
from paddle.fluid.contrib.slim.quantization import TransformForMobilePass


def set_paddle_flags(**kwargs):
    for key, value in kwargs.items():
        if os.environ.get(key, None) is None:
            os.environ[key] = str(value)


# NOTE(paddle-dev): All of these flags should be set before
# `import paddle`. Otherwise, it would not take any effect.
set_paddle_flags(
...
...
@@ -59,6 +61,8 @@ import logging
FORMAT = '%(asctime)s-%(levelname)s: %(message)s'
logging.basicConfig(level=logging.INFO, format=FORMAT)
logger = logging.getLogger(__name__)


def eval_run(exe, compile_program, reader, keys, values, cls, test_feed):
    """
    Run evaluation program, return program outputs.
...
...
@@ -71,8 +75,7 @@ def eval_run(exe, compile_program, reader, keys, values, cls, test_feed):
    has_bbox = 'bbox' in keys
    for data in reader():
        data = test_feed.feed(data)
        feed_data = {'image': data['image'], 'im_size': data['im_size']}
        feed_data = {'image': data['image'], 'im_size': data['im_size']}
        outs = exe.run(compile_program,
                       feed=feed_data,
                       fetch_list=values[0],
...
...
@@ -123,7 +126,6 @@ def main():
        devices_num = int(os.environ.get('CPU_NUM', multiprocessing.cpu_count()))

    if 'eval_feed' not in cfg:
        eval_feed = create(main_arch + 'EvalFeed')
    else:
...
...
@@ -138,85 +140,79 @@ def main():
    #eval_pyreader.decorate_sample_list_generator(eval_reader, place)
    test_data_feed = fluid.DataFeeder(test_feed_vars.values(), place)

    assert os.path.exists(FLAGS.model_path)
    infer_prog, feed_names, fetch_targets = fluid.io.load_inference_model(
        dirname=FLAGS.model_path, executor=exe,
        model_filename='__model__', params_filename='__params__')
        dirname=FLAGS.model_path, executor=exe,
        model_filename='__model__.infer', params_filename='__params__')

    eval_keys = ['bbox', 'gt_box', 'gt_label', 'is_difficult']
    eval_values = ['multiclass_nms_0.tmp_0', 'gt_box', 'gt_label', 'is_difficult']
    eval_values = ['multiclass_nms_0.tmp_0', 'gt_box', 'gt_label', 'is_difficult']
    eval_cls = []
    eval_values[0] = fetch_targets[0]

    results = eval_run(exe, infer_prog, eval_reader, eval_keys, eval_values,
                       eval_cls, test_data_feed)
    results = eval_run(exe, infer_prog, eval_reader, eval_keys, eval_values,
                       eval_cls, test_data_feed)

    resolution = None
    if 'mask' in results[0]:
        resolution = model.mask_head.resolution
    box_ap_stats = eval_results(results, eval_feed, cfg.metric, cfg.num_classes,
                                resolution, False, FLAGS.output_eval)
                                resolution, False, FLAGS.output_eval)

    logger.info("freeze the graph for inference")
    test_graph = IrGraph(core.Graph(infer_prog.desc), for_test=True)

    freeze_pass = QuantizationFreezePass(
        scope=fluid.global_scope(), place=place,
        weight_quantize_type=FLAGS.weight_quant_type)
        scope=fluid.global_scope(), place=place,
        weight_quantize_type=FLAGS.weight_quant_type)
    freeze_pass.apply(test_graph)
    server_program = test_graph.to_program()
    fluid.io.save_inference_model(
        dirname=os.path.join(FLAGS.save_path, 'float'),
        feeded_var_names=feed_names, target_vars=fetch_targets,
        executor=exe, main_program=server_program,
        model_filename='model', params_filename='weights')
        dirname=os.path.join(FLAGS.save_path, 'float'),
        feeded_var_names=feed_names, target_vars=fetch_targets,
        executor=exe, main_program=server_program,
        model_filename='model', params_filename='weights')

    logger.info("convert the weights into int8 type")
    convert_int8_pass = ConvertToInt8Pass(
        scope=fluid.global_scope(), place=place)
        scope=fluid.global_scope(), place=place)
    convert_int8_pass.apply(test_graph)
    server_int8_program = test_graph.to_program()
    fluid.io.save_inference_model(
        dirname=os.path.join(FLAGS.save_path, 'int8'),
        feeded_var_names=feed_names, target_vars=fetch_targets,
        executor=exe, main_program=server_int8_program,
        model_filename='model', params_filename='weights')
        dirname=os.path.join(FLAGS.save_path, 'int8'),
        feeded_var_names=feed_names, target_vars=fetch_targets,
        executor=exe, main_program=server_int8_program,
        model_filename='model', params_filename='weights')

    logger.info("convert the freezed pass to paddle-lite execution")
    mobile_pass = TransformForMobilePass()
    mobile_pass.apply(test_graph)
    mobile_program = test_graph.to_program()
    fluid.io.save_inference_model(
        dirname=os.path.join(FLAGS.save_path, 'mobile'),
        feeded_var_names=feed_names, target_vars=fetch_targets,
        executor=exe, main_program=mobile_program,
        model_filename='model', params_filename='weights')
        dirname=os.path.join(FLAGS.save_path, 'mobile'),
        feeded_var_names=feed_names, target_vars=fetch_targets,
        executor=exe, main_program=mobile_program,
        model_filename='model', params_filename='weights')


if __name__ == '__main__':
    parser = ArgsParser()
    parser.add_argument(
        "-m", "--model_path", default=None, type=str, help="path of checkpoint")
        "-m", "--model_path", default=None, type=str, help="path of checkpoint")
    parser.add_argument(
        "--output_eval",
        default=None,
...
...