Unverified commit e3c75b78 · Author: whs · Committer: GitHub

Add demo for pruning mobilenet-ssd. (#3283)

* Add demo for pruning mobilenet-ssd.

* Remove unused option.

* Use soft link instead of copying files from PaddleCV/ssd
Parent ba781d9d
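The compression target of this example is [MobileNetV1-SSD](https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/ssd). It mainly prunes the convolution channels of the backbone network.
## Step 1: Inspect the network structure
The backbone of this model is MobileNetV1, which mainly contains two kinds of convolution: depthwise convolution and regular 1x1 convolution. Because of the special structure of depthwise convolution (each filter is tied to exactly one input channel), we prune only the regular 1x1 convolutions.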
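
To make the idea of channel pruning concrete, here is a minimal pure-Python sketch that ranks the filters of a toy 1x1 convolution by L1 norm and drops the weakest ones. It is an illustration only, not PaddleSlim's implementation: the helper names, the toy weights, and the use of a per-layer channel ratio (rather than a FLOPs ratio) are all assumptions made for the example. It also shows why the depthwise convolution that follows has to be sliced with the same channel indices.

```python
def l1_norms(filters):
    """Return the L1 norm of each filter (each filter is a flat list of weights)."""
    return [sum(abs(w) for w in f) for f in filters]


def channels_to_keep(filters, prune_ratio):
    """Indices of output channels that survive when the smallest-norm filters are pruned."""
    norms = l1_norms(filters)
    num_pruned = int(round(len(filters) * prune_ratio))
    # Sort channel indices by ascending L1 norm; the first ones are pruned.
    order = sorted(range(len(filters)), key=lambda i: norms[i])
    pruned = set(order[:num_pruned])
    return [i for i in range(len(filters)) if i not in pruned]


# Toy 1x1 convolution: 5 output channels, 2 input channels (hypothetical weights).
pointwise = [[0.9, -0.8], [0.01, 0.02], [0.5, 0.5], [-0.3, 0.1], [1.0, 1.0]]
# The following depthwise convolution has exactly one 3x3 filter per channel.
depthwise = [[float(c)] * 9 for c in range(5)]

keep = channels_to_keep(pointwise, 0.2)
pruned_pointwise = [pointwise[i] for i in keep]
# The depthwise filters must be selected with the same indices.
pruned_depthwise = [depthwise[i] for i in keep]
print(keep)
```

In this toy case the second filter has by far the smallest L1 norm, so it is the one removed, and the matching depthwise filter goes with it.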
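First, we need the names of the parameters (filter weights) of the regular 1x1 convolutions in the backbone. In the current implementation the network is defined in `fluid.default_main_program`, and the names and shapes of all parameters can be printed as follows: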
```
import paddle.fluid as fluid
for param in fluid.default_main_program().global_block().all_parameters():
    print("{}: {}".format(param.name, param.shape))
```
The code above prints the parameter names in the order in which the network is defined, as shown below:
```
conv2d_0.w_0 (32L, 3L, 3L, 3L)
depthwise_conv2d_0.w_0 (32L, 1L, 3L, 3L)
conv2d_1.w_0 (64L, 32L, 1L, 1L)
depthwise_conv2d_1.w_0 (64L, 1L, 3L, 3L)
conv2d_2.w_0 (128L, 64L, 1L, 1L)
depthwise_conv2d_2.w_0 (128L, 1L, 3L, 3L)
conv2d_3.w_0 (128L, 128L, 1L, 1L)
depthwise_conv2d_3.w_0 (128L, 1L, 3L, 3L)
conv2d_4.w_0 (256L, 128L, 1L, 1L)
depthwise_conv2d_4.w_0 (256L, 1L, 3L, 3L)
conv2d_5.w_0 (256L, 256L, 1L, 1L)
depthwise_conv2d_5.w_0 (256L, 1L, 3L, 3L)
conv2d_6.w_0 (512L, 256L, 1L, 1L)
depthwise_conv2d_6.w_0 (512L, 1L, 3L, 3L)
conv2d_7.w_0 (512L, 512L, 1L, 1L)
depthwise_conv2d_7.w_0 (512L, 1L, 3L, 3L)
conv2d_8.w_0 (512L, 512L, 1L, 1L)
depthwise_conv2d_8.w_0 (512L, 1L, 3L, 3L)
conv2d_9.w_0 (512L, 512L, 1L, 1L)
depthwise_conv2d_9.w_0 (512L, 1L, 3L, 3L)
conv2d_10.w_0 (512L, 512L, 1L, 1L)
depthwise_conv2d_10.w_0 (512L, 1L, 3L, 3L)
conv2d_11.w_0 (512L, 512L, 1L, 1L)
depthwise_conv2d_11.w_0 (512L, 1L, 3L, 3L)
conv2d_12.w_0 (1024L, 512L, 1L, 1L)
depthwise_conv2d_12.w_0 (1024L, 1L, 3L, 3L)
```
From this output, the regular 1x1 convolutions are named `conv2d_1.w_0` through `conv2d_12.w_0`, which can be expressed with the regular expression:
```
"conv2d_[1-9].w_0|conv2d_1[0-2].w_0"
```
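
As a quick sanity check, the pattern can be tried against the parameter names listed above using Python's `re` module. This is only a sketch: whether the pruning strategy matches with `re.match` (anchored at the start of the name) or `re.search` (anywhere in the name) is an assumption here, so both are shown. The difference matters, because the depthwise names contain the `conv2d_N.w_0` names as substrings.

```python
import re

PATTERN = r"conv2d_[1-9].w_0|conv2d_1[0-2].w_0"

# Rebuild the parameter names printed in Step 1.
names = ["conv2d_0.w_0", "depthwise_conv2d_0.w_0"]
for i in range(1, 13):
    names.append("conv2d_{}.w_0".format(i))
    names.append("depthwise_conv2d_{}.w_0".format(i))

# Anchored at the start of the name: only the regular 1x1 convolutions match.
matched_prefix = [n for n in names if re.match(PATTERN, n)]
# Unanchored: the depthwise parameters would be matched as well.
matched_anywhere = [n for n in names if re.search(PATTERN, n)]

print(len(matched_prefix), len(matched_anywhere))
```

Note that the `.` in the pattern is unescaped and therefore matches any character. That is harmless for the names above, but worth keeping in mind when reusing the pattern for other networks.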
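## Step 2: Write the configuration file
Taking uniform pruning as the example, pay particular attention to the following options:
- target_ratio: the fraction of FLOPs to be pruned. When setting it, also consider what share of the whole network the backbone parameters account for. If the value is too large, all channels of some convolution layers may be pruned away. To avoid this, try several values first, observe how each convolution layer is pruned, and then choose a suitable value. This example uses 0.2.
- pruned_params: the names of the parameters to be pruned. Regular expressions are supported, but make sure the expression does not match any parameter you do not want pruned. The safest approach is a pattern of the form `param_1_name|param_2_name|param_3_name`, which matches only the listed names. Based on Step 1, this example uses `conv2d_[1-9].w_0|conv2d_1[0-2].w_0`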
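
The "list every name explicitly" form of `pruned_params` can be generated rather than typed by hand. The sketch below is one possible way to do it. The helper `exact_name_pattern` is hypothetical (not part of PaddleSlim), and it assumes that the value of `pruned_params` is handed to Python's `re` module unchanged, so that escaped dots and a trailing `$` are accepted.

```python
import re


def exact_name_pattern(param_names):
    """Build an alternation that matches only the given names, literally."""
    # re.escape turns '.' into a literal dot; '$' stops longer names from matching.
    return "|".join(re.escape(name) + "$" for name in param_names)


targets = ["conv2d_{}.w_0".format(i) for i in range(1, 13)]
pattern = exact_name_pattern(targets)

for name in ["conv2d_1.w_0", "conv2d_12.w_0", "conv2d_0.w_0", "depthwise_conv2d_1.w_0"]:
    print(name, bool(re.match(pattern, name)))
```

The generated string is longer than the hand-written expression, but it cannot accidentally match a parameter that was not listed.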
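## Step 3: Write the compression script
The compression script in this example is a modified version of [ssd/train.py](https://github.com/PaddlePaddle/models/blob/develop/PaddleCV/ssd/train.py).
Note the following points:
### fluid.metrics.DetectionMAP
PaddleSlim does not currently support fluid.metrics or fluid.evaluator, so metrics.DetectionMAP is rewritten here as: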
```
gt_label = fluid.layers.cast(x=gt_label, dtype=gt_box.dtype)
if difficult:
    difficult = fluid.layers.cast(x=difficult, dtype=gt_box.dtype)
    gt_label = fluid.layers.reshape(gt_label, [-1, 1])
    difficult = fluid.layers.reshape(difficult, [-1, 1])
    label = fluid.layers.concat([gt_label, difficult, gt_box], axis=1)
else:
    label = fluid.layers.concat([gt_label, gt_box], axis=1)
map_var = fluid.layers.detection.detection_map(
    nmsed_out,
    label,
    class_num,
    background_label=0,
    overlap_threshold=0.5,
    evaluate_difficult=False,
    ap_version=ap_version)
```
### data reader
Note that when constructing the Compressor, both train_reader and eval_reader are given a py_reader.
Because a py_reader is used, no feed_list needs to be supplied.
```
compressor = Compressor(
    place,
    fluid.global_scope(),
    train_prog,
    train_reader=train_py_reader,  # NOTE: pass the py_reader directly
    train_feed_list=None,  # NOTE: not needed with a py_reader
    train_fetch_list=train_fetch_list,
    eval_program=test_prog,
    eval_reader=test_py_reader,  # NOTE: pass the py_reader directly
    eval_feed_list=None,  # NOTE: not needed with a py_reader
    eval_fetch_list=val_fetch_list,
    train_optimizer=None)
```
## Step 4: Save the pruned model
The following code saves the pruned model:
```
com_pass = Compressor(...)
com_pass.config(args.compress_config)
com_pass.run()
pruned_prog = com_pass.eval_graph.program
# NOTE: image, label, acc_top1 and exe must already be defined by the surrounding script.
fluid.io.save_inference_model("./pruned_model/", [image.name, label.name], [acc_top1], exe, main_program=pruned_prog)
# check the shape of parameters
for param in pruned_prog.global_block().all_parameters():
    print("name: {}; shape: {}".format(param.name, param.shape))
```
For details on the save_inference_model API, see: https://www.paddlepaddle.org.cn/documentation/docs/zh/1.5/api_cn/io_cn.html#save-inference-model
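
Once the shapes have been printed before and after pruning, it can help to summarize how much of each layer was removed, for example when trying out different `target_ratio` values. The following is a small, framework-independent sketch. The function name and the "after" shapes are hypothetical and only illustrate the bookkeeping. It looks at the output-channel dimension (axis 0) only.

```python
def pruned_channel_report(original_shapes, pruned_shapes):
    """Map parameter name -> fraction of output channels (axis 0) removed."""
    report = {}
    for name, orig in original_shapes.items():
        new = pruned_shapes.get(name)
        if new is None or tuple(new) == tuple(orig):
            continue  # parameter missing or untouched
        report[name] = 1.0 - float(new[0]) / orig[0]
    return report


original = {
    "conv2d_0.w_0": (32, 3, 3, 3),
    "conv2d_1.w_0": (64, 32, 1, 1),
    "depthwise_conv2d_1.w_0": (64, 1, 3, 3),
}
# Hypothetical shapes after pruning, for illustration only.
pruned = {
    "conv2d_0.w_0": (32, 3, 3, 3),
    "conv2d_1.w_0": (48, 32, 1, 1),
    "depthwise_conv2d_1.w_0": (48, 1, 3, 3),
}

report = pruned_channel_report(original, pruned)
for name in sorted(report):
    print("{}: {:.0%} of output channels pruned".format(name, report[name]))
```

A layer that reports a fraction close to 1.0 is a sign that `target_ratio` is too aggressive for that layer.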
version: 1.0
pruners:
    pruner_1:
        class: 'StructurePruner'
        pruning_axis:
            '*': 0
        criterions:
            '*': 'l1_norm'
strategies:
    uniform_pruning_strategy:
        class: 'UniformPruneStrategy'
        pruner: 'pruner_1'
        start_epoch: 0
        target_ratio: 0.2
        pruned_params: 'conv2d_[1-9].w_0|conv2d_1[0-2].w_0'
compressor:
    epoch: 200
    #init_model: './checkpoints/0' # Please enable this option for loading checkpoint.
    checkpoint_path: './checkpoints/'
    strategies:
        - uniform_pruning_strategy
../../PaddleCV/ssd/image_util.py
\ No newline at end of file
../../PaddleCV/ssd/mobilenet_ssd.py
\ No newline at end of file
../../PaddleCV/ssd/reader.py
\ No newline at end of file
export CUDA_VISIBLE_DEVICES=2
python train.py \
    --dataset="coco2014" \
    --data_dir="./data/coco" \
    > ./run.log 2>&1 &
tail -f run.log
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import time
import numpy as np
import argparse
import functools
import shutil
import math
import multiprocessing
from paddle.fluid.contrib.slim import Compressor

def set_paddle_flags(**kwargs):
    for key, value in kwargs.items():
        if os.environ.get(key, None) is None:
            os.environ[key] = str(value)

# NOTE(paddle-dev): All of these flags should be
# set before `import paddle`. Otherwise, it would
# not take any effect.
set_paddle_flags(
    FLAGS_eager_delete_tensor_gb=0,  # enable GC to save memory
)

import paddle
import paddle.fluid as fluid
import reader
from mobilenet_ssd import build_mobilenet_ssd
from utility import add_arguments, print_arguments, check_cuda
parser = argparse.ArgumentParser(description=__doc__)
add_arg = functools.partial(add_arguments, argparser=parser)
# yapf: disable
add_arg('learning_rate', float, 0.001, "Learning rate.")
add_arg('batch_size', int, 64, "Minibatch size of all devices.")
add_arg('epoc_num', int, 120, "Epoch number.")
add_arg('use_gpu', bool, True, "Whether use GPU.")
add_arg('parallel', bool, True, "Whether train in parallel on multi-devices.")
add_arg('dataset', str, 'pascalvoc', "dataset can be coco2014, coco2017, and pascalvoc.")
add_arg('model_save_dir', str, 'model', "The path to save model.")
add_arg('pretrained_model', str, 'pretrained/ssd_mobilenet_v1_coco/', "The init model path.")
add_arg('ap_version', str, '11point', "mAP version can be integral or 11point.")
add_arg('image_shape', str, '3,300,300', "Input image shape.")
add_arg('mean_BGR', str, '127.5,127.5,127.5', "Mean value for B,G,R channel which will be subtracted.")
add_arg('data_dir', str, 'data/pascalvoc', "Data directory.")
add_arg('use_multiprocess', bool, True, "Whether use multi-process for data preprocessing.")
add_arg('enable_ce', bool, False, "Whether use CE to evaluate the model.")
#yapf: enable

train_parameters = {
    "pascalvoc": {
        "train_images": 16551,
        "image_shape": [3, 300, 300],
        "class_num": 21,
        "batch_size": 64,
        "lr": 0.001,
        "lr_epochs": [40, 60, 80, 100],
        "lr_decay": [1, 0.5, 0.25, 0.1, 0.01],
        "ap_version": '11point',
    },
    "coco2014": {
        "train_images": 82783,
        "image_shape": [3, 300, 300],
        "class_num": 91,
        "batch_size": 64,
        "lr": 0.001,
        "lr_epochs": [12, 19],
        "lr_decay": [1, 0.5, 0.25],
        "ap_version": 'integral',  # should use eval_coco_map.py to test model
    },
    "coco2017": {
        "train_images": 118287,
        "image_shape": [3, 300, 300],
        "class_num": 91,
        "batch_size": 64,
        "lr": 0.001,
        "lr_epochs": [12, 19],
        "lr_decay": [1, 0.5, 0.25],
        "ap_version": 'integral',  # should use eval_coco_map.py to test model
    }
}

def optimizer_setting(train_params):
    batch_size = train_params["batch_size"]
    iters = train_params["train_images"] // batch_size
    lr = train_params["lr"]
    boundaries = [i * iters for i in train_params["lr_epochs"]]
    values = [i * lr for i in train_params["lr_decay"]]

    optimizer = fluid.optimizer.RMSProp(
        learning_rate=fluid.layers.piecewise_decay(boundaries, values),
        regularization=fluid.regularizer.L2Decay(0.00005), )

    return optimizer

def build_program(main_prog, startup_prog, train_params, is_train):
    image_shape = train_params['image_shape']
    class_num = train_params['class_num']
    ap_version = train_params['ap_version']
    outs = []
    with fluid.program_guard(main_prog, startup_prog):
        py_reader = fluid.layers.py_reader(
            capacity=64,
            shapes=[[-1] + image_shape, [-1, 4], [-1, 1], [-1, 1]],
            lod_levels=[0, 1, 1, 1],
            dtypes=["float32", "float32", "int32", "int32"],
            use_double_buffer=True)
        with fluid.unique_name.guard():
            image, gt_box, gt_label, difficult = fluid.layers.read_file(py_reader)
            locs, confs, box, box_var = build_mobilenet_ssd(image, class_num, image_shape)
            if is_train:
                with fluid.unique_name.guard("train"):
                    loss = fluid.layers.ssd_loss(locs, confs, gt_box, gt_label, box,
                                                 box_var)
                    loss = fluid.layers.reduce_sum(loss)
                    optimizer = optimizer_setting(train_params)
                    optimizer.minimize(loss)
                outs = [py_reader, loss]
            else:
                with fluid.unique_name.guard("inference"):
                    nmsed_out = fluid.layers.detection_output(
                        locs, confs, box, box_var, nms_threshold=0.45)
                    gt_label = fluid.layers.cast(x=gt_label, dtype=gt_box.dtype)
                    if difficult:
                        difficult = fluid.layers.cast(x=difficult, dtype=gt_box.dtype)
                        gt_label = fluid.layers.reshape(gt_label, [-1, 1])
                        difficult = fluid.layers.reshape(difficult, [-1, 1])
                        label = fluid.layers.concat([gt_label, difficult, gt_box], axis=1)
                    else:
                        label = fluid.layers.concat([gt_label, gt_box], axis=1)

                    map_var = fluid.layers.detection.detection_map(
                        nmsed_out,
                        label,
                        class_num,
                        background_label=0,
                        overlap_threshold=0.5,
                        evaluate_difficult=False,
                        ap_version=ap_version)

                # nmsed_out and image are used to save the model for inference
                outs = [py_reader, map_var, nmsed_out, image]
    return outs

def train(args,
          data_args,
          train_params,
          train_file_list,
          val_file_list):

    model_save_dir = args.model_save_dir
    pretrained_model = args.pretrained_model
    use_gpu = args.use_gpu
    parallel = args.parallel
    enable_ce = args.enable_ce
    is_shuffle = True

    if not use_gpu:
        devices_num = int(os.environ.get('CPU_NUM',
                                         multiprocessing.cpu_count()))
    else:
        devices_num = fluid.core.get_cuda_device_count()

    batch_size = train_params['batch_size']
    epoc_num = train_params['epoc_num']
    batch_size_per_device = batch_size // devices_num
    num_workers = 8

    startup_prog = fluid.Program()
    train_prog = fluid.Program()
    test_prog = fluid.Program()

    train_py_reader, loss = build_program(
        main_prog=train_prog,
        startup_prog=startup_prog,
        train_params=train_params,
        is_train=True)
    test_py_reader, map_var, _, _ = build_program(
        main_prog=test_prog,
        startup_prog=startup_prog,
        train_params=train_params,
        is_train=False)

    test_prog = test_prog.clone(for_test=True)

    # Print the convolution parameters so the names to prune can be checked.
    for param in train_prog.global_block().all_parameters():
        if 'conv' in param.name:
            print("{} {}".format(param.name, param.shape))

    place = fluid.CUDAPlace(0) if use_gpu else fluid.CPUPlace()
    exe = fluid.Executor(place)
    exe.run(startup_prog)

    if pretrained_model:
        def if_exist(var):
            return os.path.exists(os.path.join(pretrained_model, var.name))
        fluid.io.load_vars(exe, pretrained_model, main_program=train_prog,
                           predicate=if_exist)

    test_reader = reader.test(data_args, val_file_list, batch_size)
    test_py_reader.decorate_paddle_reader(test_reader)
    train_reader = reader.train(data_args,
                                train_file_list,
                                batch_size_per_device,
                                shuffle=is_shuffle,
                                use_multiprocess=args.use_multiprocess,
                                num_workers=num_workers,
                                enable_ce=enable_ce)
    train_py_reader.decorate_paddle_reader(train_reader)

    train_fetch_list = [("loss", loss.name)]
    val_fetch_list = [("map", map_var.name)]
    compressor = Compressor(
        place,
        fluid.global_scope(),
        train_prog,
        train_reader=train_py_reader,
        train_feed_list=None,
        train_fetch_list=train_fetch_list,
        eval_program=test_prog,
        eval_reader=test_py_reader,
        eval_feed_list=None,
        eval_fetch_list=val_fetch_list,
        train_optimizer=None)
    compressor.config('./compress.yaml')
    compressor.run()

def main():
    args = parser.parse_args()
    print_arguments(args)

    check_cuda(args.use_gpu)

    data_dir = args.data_dir
    dataset = args.dataset
    assert dataset in ['pascalvoc', 'coco2014', 'coco2017']

    # for pascalvoc
    label_file = 'label_list'
    train_file_list = 'trainval.txt'
    val_file_list = 'test.txt'

    if dataset == 'coco2014':
        train_file_list = 'annotations/instances_train2014.json'
        val_file_list = 'annotations/instances_val2014.json'
    elif dataset == 'coco2017':
        train_file_list = 'annotations/instances_train2017.json'
        val_file_list = 'annotations/instances_val2017.json'

    mean_BGR = [float(m) for m in args.mean_BGR.split(",")]
    image_shape = [int(m) for m in args.image_shape.split(",")]
    train_parameters[dataset]['image_shape'] = image_shape
    train_parameters[dataset]['batch_size'] = args.batch_size
    train_parameters[dataset]['lr'] = args.learning_rate
    train_parameters[dataset]['epoc_num'] = args.epoc_num
    train_parameters[dataset]['ap_version'] = args.ap_version

    data_args = reader.Settings(
        dataset=args.dataset,
        data_dir=data_dir,
        label_file=label_file,
        resize_h=image_shape[1],
        resize_w=image_shape[2],
        mean_value=mean_BGR,
        apply_distort=True,
        apply_expand=True,
        ap_version=args.ap_version)
    train(args,
          data_args,
          train_parameters[dataset],
          train_file_list=train_file_list,
          val_file_list=val_file_list)


if __name__ == '__main__':
    main()
../../PaddleCV/ssd/utility.py
\ No newline at end of file