Commit 14473532 authored by LielinJiang, committed by Zeyu Chen

Add paddleslim demo for seg (#154)

* add prune distill nas

* update readme

* update readme

* remove debug code
Parent 0193ce28
......@@ -110,7 +110,7 @@ pip install -r requirements.txt
* [How to handle the class imbalance problem in binary segmentation](./docs/loss_select.md)
* [Using specialized vertical-domain models](./contrib)
* [Multi-process training and mixed-precision training](./docs/multiple_gpus_train_and_mixed_precision_train.md)
* Segmentation model compression with PaddleSlim ([quantization](./slim/quantization/README.md), [distillation](./slim/distillation/README.md), [pruning](./slim/prune/README.md), [NAS](./slim/nas/README.md))
## Online Demo
We provide online tutorials on the AI Studio platform; you are welcome to try them out:
......
......@@ -236,3 +236,19 @@ cfg.FREEZE.MODEL_FILENAME = '__model__'
cfg.FREEZE.PARAMS_FILENAME = '__params__'
# Directory where the exported inference model parameters are saved
cfg.FREEZE.SAVE_DIR = 'freeze_model'
########################## paddle-slim ######################################
cfg.SLIM.KNOWLEDGE_DISTILL_IS_TEACHER = False
cfg.SLIM.KNOWLEDGE_DISTILL = False
cfg.SLIM.KNOWLEDGE_DISTILL_TEACHER_MODEL_DIR = ""
cfg.SLIM.NAS_PORT = 23333
cfg.SLIM.NAS_ADDRESS = ""
cfg.SLIM.NAS_SEARCH_STEPS = 100
cfg.SLIM.NAS_START_EVAL_EPOCH = 0
cfg.SLIM.NAS_IS_SERVER = True
cfg.SLIM.NAS_SPACE_NAME = ""
cfg.SLIM.PRUNE_PARAMS = ''
cfg.SLIM.PRUNE_RATIOS = []
>Before running this example, please install PaddleSlim and Paddle 1.6 or later.
# PaddleSeg Distillation Tutorial
Before reading this tutorial, please make sure you have gone through PaddleSeg's [Quick Start](../README.md#快速入门) and [Basic Features](../README.md#基础功能) sections so that you have a general understanding of PaddleSeg.
This document describes how to use [PaddleSlim](https://paddlepaddle.github.io/PaddleSlim) to distill the models in the segmentation library.
Unless otherwise noted, all commands in this tutorial are executed from the `PaddleSeg/` directory.
## Overview
This example applies the [distillation strategies](https://paddlepaddle.github.io/PaddleSlim/algo/algo/#3) provided by PaddleSlim to train segmentation models with knowledge distillation.
Before reading this example, we recommend that you first review:
- [PaddleSlim distillation API documentation](https://paddlepaddle.github.io/PaddleSlim/api/single_distiller_api/)
## Install PaddleSlim
PaddleSlim can be installed by following the steps in the [PaddleSlim documentation](https://paddlepaddle.github.io/PaddleSlim/).
## Distillation strategy
For how to use the distillation APIs, please refer to the PaddleSlim distillation API documentation.
Here we take distilling a Deeplabv3-mobilenet student from a Deeplabv3-xception teacher as an example. First, to get an overall picture of the `student model` and the `teacher model` and to pin down which tensors to distill, we print the names and shapes of the Variables of the two networks with the following code:
```python
# Inspect the Variables of the student model
student_vars = []
for v in fluid.default_main_program().list_vars():
try:
student_vars.append((v.name, v.shape))
except:
pass
print("="*50+"student_model_vars"+"="*50)
print(student_vars)
# Inspect the Variables of the teacher model
teacher_vars = []
for v in teacher_program.list_vars():
try:
teacher_vars.append((v.name, v.shape))
except:
pass
print("="*50+"teacher_model_vars"+"="*50)
print(teacher_vars)
```
By comparing the two lists, we can see that the feature maps fed into the `loss` by the `student model` and the `teacher model` are, respectively:
```bash
# student model
bilinear_interp_1.tmp_0
# teacher model
bilinear_interp_2.tmp_0
```
These feature maps have matching shapes and both sit at the output end of their networks, so we attach a distillation loss to each matching pair with `l2_loss`. Note that the teacher's Variables are automatically given a `name_prefix` during the merge step, so the prefix `"teacher_"` must be added here as well; for the merge step see the [distillation API documentation](https://paddlepaddle.github.io/PaddleSlim/api/single_distiller_api/#merge).
```python
distill_loss = l2_loss('teacher_bilinear_interp_2.tmp_0', 'bilinear_interp_1.tmp_0')
```
Following the same procedure, other losses can also be chosen for the distillation strategy; PaddleSlim supports `FSP_loss`, `L2_loss`, `softmax_with_cross_entropy_loss`, as well as any custom loss. A minimal sketch of how the merge step and the distillation loss fit together is shown below.
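The following is a minimal sketch (not the full training script) of how the merge step and the loss above are typically combined with PaddleSlim's single-distiller API. Here `student_program` and `teacher_program` stand for the programs built by `train_distill.py`, with the teacher's pretrained parameters already loaded; the variable names are the ones found above.
```python
import paddle.fluid as fluid
from paddleslim.dist import merge, l2_loss

place = fluid.CUDAPlace(0)
# Map each teacher input name to the corresponding student input name.
data_name_map = {'image': 'image'}

# Merge the teacher program into the student program; during this step the
# teacher's Variables are renamed with the default prefix "teacher_".
merge(teacher_program, student_program, data_name_map, place)

with fluid.program_guard(student_program):
    # Attach the L2 distillation loss between the matching output feature maps.
    distill_loss = l2_loss('teacher_bilinear_interp_2.tmp_0',
                           'bilinear_interp_1.tmp_0',
                           program=student_program)
    # The total loss is then the original segmentation loss plus (a possibly
    # weighted) distill_loss, and the optimizer minimizes this sum.
```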
## Training
The compression script `train_distill.py` is written based on [PaddleSeg/pdseg/train.py](../../pdseg/train.py).
The script defines a teacher_model and a student_model, and uses the output of the teacher_model to guide the training of the student_model.
### Example
Start training with the following command; an evaluation is run every `cfg.TRAIN.SNAPSHOT_EPOCH` epochs.
```shell
CUDA_VISIBLE_DEVICES=0,1
python -m paddle.distributed.launch ./slim/distill/train.py \
--log_steps 10 --cfg ./slim/distill/cityscape_fast_scnn.yaml \
--teacher_cfg ./slim/distill/cityscape_teacher.yaml \
--use_gpu \
--use_mpio \
--do_eval
```
## Evaluation and prediction
For evaluation and prediction after training, please refer to PaddleSeg's [Quick Start](../../README.md#快速入门) and [Basic Features](../../README.md#基础功能) sections.
EVAL_CROP_SIZE: (2049, 1025) # (width, height), for unpadding rangescaling and stepscaling
TRAIN_CROP_SIZE: (769, 769) # (width, height), for unpadding rangescaling and stepscaling
AUG:
AUG_METHOD: "stepscaling" # choice unpadding rangescaling and stepscaling
FIX_RESIZE_SIZE: (640, 640) # (width, height), for unpadding
INF_RESIZE_VALUE: 500 # for rangescaling
MAX_RESIZE_VALUE: 600 # for rangescaling
MIN_RESIZE_VALUE: 400 # for rangescaling
MAX_SCALE_FACTOR: 2.0 # for stepscaling
MIN_SCALE_FACTOR: 0.5 # for stepscaling
SCALE_STEP_SIZE: 0.25 # for stepscaling
MIRROR: True
FLIP: True
FLIP_RATIO: 0.2
RICH_CROP:
ENABLE: False
ASPECT_RATIO: 0.33
BLUR: True
BLUR_RATIO: 0.1
MAX_ROTATION: 15
MIN_AREA_RATIO: 0.5
BRIGHTNESS_JITTER_RATIO: 0.5
CONTRAST_JITTER_RATIO: 0.5
SATURATION_JITTER_RATIO: 0.5
BATCH_SIZE: 16
MEAN: [0.5, 0.5, 0.5]
STD: [0.5, 0.5, 0.5]
DATASET:
DATA_DIR: "./dataset/cityscapes/"
IMAGE_TYPE: "rgb" # choice rgb or rgba
NUM_CLASSES: 19
TEST_FILE_LIST: "dataset/cityscapes/val.list"
TRAIN_FILE_LIST: "dataset/cityscapes/train.list"
VAL_FILE_LIST: "dataset/cityscapes/val.list"
IGNORE_INDEX: 255
FREEZE:
MODEL_FILENAME: "model"
PARAMS_FILENAME: "params"
MODEL:
DEFAULT_NORM_TYPE: "bn"
MODEL_NAME: "deeplabv3p"
DEEPLAB:
BACKBONE: "mobilenet"
ASPP_WITH_SEP_CONV: True
DECODER_USE_SEP_CONV: True
ENCODER_WITH_ASPP: False
ENABLE_DECODER: False
TEST:
TEST_MODEL: "snapshots/cityscape_v5/final/"
TRAIN:
MODEL_SAVE_DIR: "snapshots/cityscape_mbv2_kd_e100_1/"
PRETRAINED_MODEL_DIR: u"/workspace/pretrained_models/mobilenet_cityscapes"
SNAPSHOT_EPOCH: 5
SYNC_BATCH_NORM: True
SOLVER:
LR: 0.001
LR_POLICY: "poly"
OPTIMIZER: "sgd"
NUM_EPOCHS: 100
EVAL_CROP_SIZE: (2049, 1025) # (width, height), for unpadding rangescaling and stepscaling
TRAIN_CROP_SIZE: (769, 769) # (width, height), for unpadding rangescaling and stepscaling
AUG:
AUG_METHOD: "stepscaling" # choice unpadding rangescaling and stepscaling
FIX_RESIZE_SIZE: (640, 640) # (width, height), for unpadding
INF_RESIZE_VALUE: 500 # for rangescaling
MAX_RESIZE_VALUE: 600 # for rangescaling
MIN_RESIZE_VALUE: 400 # for rangescaling
MAX_SCALE_FACTOR: 2.0 # for stepscaling
MIN_SCALE_FACTOR: 0.5 # for stepscaling
SCALE_STEP_SIZE: 0.25 # for stepscaling
MIRROR: True
FLIP: True
FLIP_RATIO: 0.2
RICH_CROP:
ENABLE: False
ASPECT_RATIO: 0.33
BLUR: True
BLUR_RATIO: 0.1
MAX_ROTATION: 15
MIN_AREA_RATIO: 0.5
BRIGHTNESS_JITTER_RATIO: 0.5
CONTRAST_JITTER_RATIO: 0.5
SATURATION_JITTER_RATIO: 0.5
BATCH_SIZE: 16
MEAN: [0.5, 0.5, 0.5]
STD: [0.5, 0.5, 0.5]
DATASET:
DATA_DIR: "./dataset/cityscapes/"
IMAGE_TYPE: "rgb" # choice rgb or rgba
NUM_CLASSES: 19
TEST_FILE_LIST: "dataset/cityscapes/val.list"
TRAIN_FILE_LIST: "dataset/cityscapes/train.list"
VAL_FILE_LIST: "dataset/cityscapes/val.list"
IGNORE_INDEX: 255
FREEZE:
MODEL_FILENAME: "model"
PARAMS_FILENAME: "params"
MODEL:
DEFAULT_NORM_TYPE: "bn"
MODEL_NAME: "deeplabv3p"
DEEPLAB:
BACKBONE: "xception_65"
ASPP_WITH_SEP_CONV: True
DECODER_USE_SEP_CONV: True
ENCODER_WITH_ASPP: True
ENABLE_DECODER: True
TEST:
TEST_MODEL: "snapshots/cityscape_v5/final/"
TRAIN:
MODEL_SAVE_DIR: "snapshots/cityscape_v7/"
PRETRAINED_MODEL_DIR: u"pretrain/deeplabv3plus_gn_init"
SNAPSHOT_EPOCH: 5
SYNC_BATCH_NORM: True
SOLVER:
LR: 0.001
LR_POLICY: "poly"
OPTIMIZER: "sgd"
NUM_EPOCHS: 100
SLIM:
KNOWLEDGE_DISTILL_IS_TEACHER: True
KNOWLEDGE_DISTILL: True
KNOWLEDGE_DISTILL_TEACHER_MODEL_DIR: "/workspace/pretrained_models/xception65_bn_cityscapes"
# coding: utf8
# copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import struct
import paddle.fluid as fluid
import numpy as np
from paddle.fluid.proto.framework_pb2 import VarType
import solver
from utils.config import cfg
from loss import multi_softmax_with_loss
from loss import multi_dice_loss
from loss import multi_bce_loss
from models.modeling import deeplab, unet, icnet, pspnet, hrnet, fast_scnn
class ModelPhase(object):
"""
Standard name for model phase in PaddleSeg
The following standard keys are defined:
* `TRAIN`: training mode.
* `EVAL`: testing/evaluation mode.
* `PREDICT`: prediction/inference mode.
* `VISUAL` : visualization mode
"""
TRAIN = 'train'
EVAL = 'eval'
PREDICT = 'predict'
VISUAL = 'visual'
@staticmethod
def is_train(phase):
return phase == ModelPhase.TRAIN
@staticmethod
def is_predict(phase):
return phase == ModelPhase.PREDICT
@staticmethod
def is_eval(phase):
return phase == ModelPhase.EVAL
@staticmethod
def is_visual(phase):
return phase == ModelPhase.VISUAL
@staticmethod
def is_valid_phase(phase):
""" Check valid phase """
if ModelPhase.is_train(phase) or ModelPhase.is_predict(phase) \
or ModelPhase.is_eval(phase) or ModelPhase.is_visual(phase):
return True
return False
def seg_model(image, class_num):
model_name = cfg.MODEL.MODEL_NAME
if model_name == 'unet':
logits = unet.unet(image, class_num)
elif model_name == 'deeplabv3p':
logits = deeplab.deeplabv3p(image, class_num)
elif model_name == 'icnet':
logits = icnet.icnet(image, class_num)
elif model_name == 'pspnet':
logits = pspnet.pspnet(image, class_num)
elif model_name == 'hrnet':
logits = hrnet.hrnet(image, class_num)
elif model_name == 'fast_scnn':
logits = fast_scnn.fast_scnn(image, class_num)
else:
        raise Exception(
            "unknown model name, only support unet, deeplabv3p, icnet, pspnet, hrnet, fast_scnn"
        )
return logits
def softmax(logit):
logit = fluid.layers.transpose(logit, [0, 2, 3, 1])
logit = fluid.layers.softmax(logit)
logit = fluid.layers.transpose(logit, [0, 3, 1, 2])
return logit
def sigmoid_to_softmax(logit):
"""
one channel to two channel
"""
logit = fluid.layers.transpose(logit, [0, 2, 3, 1])
logit = fluid.layers.sigmoid(logit)
logit_back = 1 - logit
logit = fluid.layers.concat([logit_back, logit], axis=-1)
logit = fluid.layers.transpose(logit, [0, 3, 1, 2])
return logit
def export_preprocess(image):
    """Preprocessing pipeline for the exported inference model"""
image = fluid.layers.transpose(image, [0, 3, 1, 2])
origin_shape = fluid.layers.shape(image)[-2:]
    # Resize according to the chosen AUG_METHOD
if cfg.AUG.AUG_METHOD == 'unpadding':
h_fix = cfg.AUG.FIX_RESIZE_SIZE[1]
w_fix = cfg.AUG.FIX_RESIZE_SIZE[0]
image = fluid.layers.resize_bilinear(
image, out_shape=[h_fix, w_fix], align_corners=False, align_mode=0)
elif cfg.AUG.AUG_METHOD == 'rangescaling':
size = cfg.AUG.INF_RESIZE_VALUE
value = fluid.layers.reduce_max(origin_shape)
scale = float(size) / value.astype('float32')
image = fluid.layers.resize_bilinear(
image, scale=scale, align_corners=False, align_mode=0)
    # Record the image shape after resizing
valid_shape = fluid.layers.shape(image)[-2:]
    # Pad up to EVAL_CROP_SIZE
width = cfg.EVAL_CROP_SIZE[0]
height = cfg.EVAL_CROP_SIZE[1]
pad_target = fluid.layers.assign(
np.array([height, width]).astype('float32'))
up = fluid.layers.assign(np.array([0]).astype('float32'))
down = pad_target[0] - valid_shape[0]
left = up
right = pad_target[1] - valid_shape[1]
paddings = fluid.layers.concat([up, down, left, right])
paddings = fluid.layers.cast(paddings, 'int32')
image = fluid.layers.pad2d(image, paddings=paddings, pad_value=127.5)
# normalize
mean = np.array(cfg.MEAN).reshape(1, len(cfg.MEAN), 1, 1)
mean = fluid.layers.assign(mean.astype('float32'))
std = np.array(cfg.STD).reshape(1, len(cfg.STD), 1, 1)
std = fluid.layers.assign(std.astype('float32'))
image = (image / 255 - mean) / std
    # Reshape so that downstream layers can obtain the feature map shape via image.shape
image = fluid.layers.reshape(
image, shape=[-1, cfg.DATASET.DATA_DIM, height, width])
return image, valid_shape, origin_shape
def build_model(main_prog=None, start_prog=None, phase=ModelPhase.TRAIN, **kwargs):
if not ModelPhase.is_valid_phase(phase):
raise ValueError("ModelPhase {} is not valid!".format(phase))
if ModelPhase.is_train(phase):
width = cfg.TRAIN_CROP_SIZE[0]
height = cfg.TRAIN_CROP_SIZE[1]
else:
width = cfg.EVAL_CROP_SIZE[0]
height = cfg.EVAL_CROP_SIZE[1]
image_shape = [cfg.DATASET.DATA_DIM, height, width]
grt_shape = [1, height, width]
class_num = cfg.DATASET.NUM_CLASSES
#with fluid.program_guard(main_prog, start_prog):
# with fluid.unique_name.guard():
    # When exporting the model, add image normalization preprocessing to simplify image handling at deployment time
    # At inference time, only a batch_size dimension needs to be added to the input image
if cfg.SLIM.KNOWLEDGE_DISTILL_IS_TEACHER:
print('teacher input:')
image = main_prog.global_block()._clone_variable(kwargs['image'],
force_persistable=False)
label = main_prog.global_block()._clone_variable(kwargs['label'],
force_persistable=False)
mask = main_prog.global_block()._clone_variable(kwargs['mask'],
force_persistable=False)
else:
if ModelPhase.is_predict(phase):
origin_image = fluid.layers.data(
name='image',
shape=[-1, -1, -1, cfg.DATASET.DATA_DIM],
dtype='float32',
append_batch_size=False)
image, valid_shape, origin_shape = export_preprocess(
origin_image)
else:
image = fluid.layers.data(
name='image', shape=image_shape, dtype='float32')
label = fluid.layers.data(
name='label', shape=grt_shape, dtype='int32')
mask = fluid.layers.data(
name='mask', shape=grt_shape, dtype='int32')
    # use PyReader when doing training and evaluation
if ModelPhase.is_train(phase) or ModelPhase.is_eval(phase):
py_reader = None
if not cfg.SLIM.KNOWLEDGE_DISTILL_IS_TEACHER:
py_reader = fluid.io.PyReader(
feed_list=[image, label, mask],
capacity=cfg.DATALOADER.BUF_SIZE,
iterable=False,
use_double_buffer=True)
loss_type = cfg.SOLVER.LOSS
if not isinstance(loss_type, list):
loss_type = list(loss_type)
    # dice_loss and bce_loss only apply to binary segmentation
if class_num > 2 and (("dice_loss" in loss_type) or
("bce_loss" in loss_type)):
raise Exception(
"dice loss and bce loss is only applicable to binary classfication"
)
    # For binary segmentation, when dice_loss or bce_loss is selected, the final logit has a single output channel
if ("dice_loss" in loss_type) or ("bce_loss" in loss_type):
class_num = 1
if "softmax_loss" in loss_type:
raise Exception(
"softmax loss can not combine with dice loss or bce loss"
)
logits = seg_model(image, class_num)
    # Compute the corresponding losses according to the selected loss functions
if ModelPhase.is_train(phase) or ModelPhase.is_eval(phase):
loss_valid = False
avg_loss_list = []
valid_loss = []
if "softmax_loss" in loss_type:
weight = cfg.SOLVER.CROSS_ENTROPY_WEIGHT
avg_loss_list.append(
multi_softmax_with_loss(logits, label, mask, class_num, weight))
loss_valid = True
valid_loss.append("softmax_loss")
if "dice_loss" in loss_type:
avg_loss_list.append(multi_dice_loss(logits, label, mask))
loss_valid = True
valid_loss.append("dice_loss")
if "bce_loss" in loss_type:
avg_loss_list.append(multi_bce_loss(logits, label, mask))
loss_valid = True
valid_loss.append("bce_loss")
if not loss_valid:
raise Exception(
"SOLVER.LOSS: {} is set wrong. it should "
"include one of (softmax_loss, bce_loss, dice_loss) at least"
" example: ['softmax_loss'], ['dice_loss'], ['bce_loss', 'dice_loss']"
.format(cfg.SOLVER.LOSS))
invalid_loss = [x for x in loss_type if x not in valid_loss]
if len(invalid_loss) > 0:
print(
"Warning: the loss {} you set is invalid. it will not be included in loss computed."
.format(invalid_loss))
avg_loss = 0
for i in range(0, len(avg_loss_list)):
avg_loss += avg_loss_list[i]
#get pred result in original size
if isinstance(logits, tuple):
logit = logits[0]
else:
logit = logits
if logit.shape[2:] != label.shape[2:]:
logit = fluid.layers.resize_bilinear(logit, label.shape[2:])
# return image input and logit output for inference graph prune
if ModelPhase.is_predict(phase):
        # For binary segmentation with dice_loss or bce_loss, the logit has a single channel; convert it to two channels
if class_num == 1:
logit = sigmoid_to_softmax(logit)
else:
logit = softmax(logit)
        # Crop out the valid region
logit = fluid.layers.slice(
logit, axes=[2, 3], starts=[0, 0], ends=valid_shape)
logit = fluid.layers.resize_bilinear(
logit,
out_shape=origin_shape,
align_corners=False,
align_mode=0)
logit = fluid.layers.argmax(logit, axis=1)
return origin_image, logit
if class_num == 1:
out = sigmoid_to_softmax(logit)
out = fluid.layers.transpose(out, [0, 2, 3, 1])
else:
out = fluid.layers.transpose(logit, [0, 2, 3, 1])
pred = fluid.layers.argmax(out, axis=3)
pred = fluid.layers.unsqueeze(pred, axes=[3])
if ModelPhase.is_visual(phase):
if class_num == 1:
logit = sigmoid_to_softmax(logit)
else:
logit = softmax(logit)
return pred, logit
if ModelPhase.is_eval(phase):
return py_reader, avg_loss, pred, label, mask
if ModelPhase.is_train(phase):
decayed_lr = None
if not cfg.SLIM.KNOWLEDGE_DISTILL:
optimizer = solver.Solver(main_prog, start_prog)
decayed_lr = optimizer.optimise(avg_loss)
# optimizer = solver.Solver(main_prog, start_prog)
# decayed_lr = optimizer.optimise(avg_loss)
return py_reader, avg_loss, decayed_lr, pred, label, mask, image
def to_int(string, dest="I"):
return struct.unpack(dest, string)[0]
def parse_shape_from_file(filename):
with open(filename, "rb") as file:
version = file.read(4)
lod_level = to_int(file.read(8), dest="Q")
for i in range(lod_level):
_size = to_int(file.read(8), dest="Q")
_ = file.read(_size)
version = file.read(4)
tensor_desc_size = to_int(file.read(4))
tensor_desc = VarType.TensorDesc()
tensor_desc.ParseFromString(file.read(tensor_desc_size))
return tuple(tensor_desc.dims)
This diff is collapsed.
>Before running this example, please install Paddle 1.6 or later.
# PaddleSeg Neural Architecture Search (NAS) Example
Before reading this tutorial, please make sure you have gone through PaddleSeg's [Quick Start](../README.md#快速入门) and [Basic Features](../README.md#基础功能) sections so that you have a general understanding of PaddleSeg.
This document describes how to use [PaddleSlim](https://paddlepaddle.github.io/PaddleSlim) to search for models in the segmentation library.
Unless otherwise noted, all commands in this tutorial are executed from the `PaddleSeg/` directory.
## Overview
We take the Deeplab+mobilenetv2 model as the NAS example. The example uses [PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim)
to carry out the architecture search experiment; for technical details, please refer to the [NAS tutorial](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/docs/tutorials/nas_demo.md).
## Define the search space
In this experiment we use SANAS for the search, searching over the channel numbers and kernel sizes of the network,
so we define the following search space:
- Head channels `head_num`: the range of channel numbers for the MobilenetV2 head module;
- `filter_num1-6` of inverse_res_block1-6: the range of channel numbers in the inverse_res_block modules;
- `repeat` of inverse_res_block: the number of units in each MobilenetV2 inverse_res_block module;
- `multiply` of inverse_res_block: the range of the expansion_factor in each MobilenetV2 inverse_res_block module;
- Kernel size `k_size`: whether the MobilenetV2 convolution kernels are 3x3 or 5x5.
Given the ranges defined above, the search-space tokens have 25 positions, varying within ([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [7, 5, 8, 6, 2, 5, 8, 6, 2, 5, 8, 6, 2, 5, 10, 6, 2, 5, 10, 6, 2, 5, 12, 6, 2]).
The initial tokens are: [4, 4, 5, 1, 0, 4, 4, 1, 0, 4, 4, 3, 0, 4, 5, 2, 0, 4, 7, 2, 0, 4, 9, 0, 0]. A short sketch of how these tokens are decoded into a network is shown below.
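As a rough illustration of the token decoding, the sketch below uses the `MobileNetV2SpaceSeg` search space defined in `slim/nas/mobilenetv2_search_space.py` (added in this PR); it assumes that directory is on `PYTHONPATH`, and the `input_size`/`output_size`/`block_num` values follow the ones used in the NAS training script.
```python
from mobilenetv2_search_space import MobileNetV2SpaceSeg

space = MobileNetV2SpaceSeg(input_size=769, output_size=1, block_num=7)

tokens = space.init_tokens()   # the 25-dimensional initial tokens listed above
print(space.range_table())     # number of choices at each token position
net_arch = space.token2arch(tokens)  # function that builds the corresponding MobileNetV2 backbone
```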
## Start the search
First install PaddleSlim; please refer to the [installation guide](https://paddlepaddle.github.io/PaddleSlim/#_2).
Configure the PaddleSeg config file; only the NAS-related options are shown below.
```shell
SLIM:
    NAS_PORT: 23333 # server port
    NAS_ADDRESS: "" # server IP address; leave empty when running as the server, set it to the server IP when running as a client
    NAS_SEARCH_STEPS: 100 # number of architectures to search
    NAS_START_EVAL_EPOCH: -1 # epoch from which each searched model starts to be evaluated
    NAS_IS_SERVER: True # whether this process acts as the server
    NAS_SPACE_NAME: "MobileNetV2SpaceSeg" # search space
```
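Internally, the NAS training script (`slim/nas/train.py`) drives the search with PaddleSlim's SANAS controller roughly as follows. This is a condensed, non-runnable sketch of that loop: `cfg` is PaddleSeg's global configuration object and `best_miou` stands for the best mIoU measured for the current architecture.
```python
from paddleslim.nas.sa_nas import SANAS
from utils.config import cfg  # PaddleSeg config object (pdseg/utils/config.py)

config = [(cfg.SLIM.NAS_SPACE_NAME, {'input_size': 769, 'output_size': 1, 'block_num': 7})]
sa_nas = SANAS(config,
               server_addr=(cfg.SLIM.NAS_ADDRESS, cfg.SLIM.NAS_PORT),
               search_steps=cfg.SLIM.NAS_SEARCH_STEPS,
               is_server=cfg.SLIM.NAS_IS_SERVER)

for step in range(cfg.SLIM.NAS_SEARCH_STEPS):
    arch = sa_nas.next_archs()[0]     # backbone-building function for this step
    # ... build the training program with `arch`, train, and evaluate mIoU ...
    sa_nas.reward(float(best_miou))   # report the reward back to the SA controller
```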
## Training and evaluation
Run the following command to train and evaluate at the same time:
```shell
python -u ./slim/nas/train.py --log_steps 10 --cfg configs/cityscape.yaml --use_gpu --use_mpio \
SLIM.NAS_PORT 23333 \
SLIM.NAS_ADDRESS "" \
SLIM.NAS_SEARCH_STEPS 2 \
SLIM.NAS_START_EVAL_EPOCH -1 \
SLIM.NAS_IS_SERVER True \
SLIM.NAS_SPACE_NAME "MobileNetV2SpaceSeg"
```
## FAQ
- Error: `socket.error: [Errno 98] Address already in use`
  Solution: the port is already in use; change `SLIM.NAS_PORT` to another port.
# coding: utf8
# copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import contextlib
import paddle
import paddle.fluid as fluid
from utils.config import cfg
from models.libs.model_libs import scope, name_scope
from models.libs.model_libs import bn, bn_relu, relu
from models.libs.model_libs import conv
from models.libs.model_libs import separate_conv
from models.backbone.mobilenet_v2 import MobileNetV2 as mobilenet_backbone
from models.backbone.xception import Xception as xception_backbone
def encoder(input):
    # Encoder: ASPP architecture, i.e. pooling + 1x1 conv + three parallel dilated convolutions at different rates, followed by concat and a 1x1 conv
    # ASPP_WITH_SEP_CONV: True by default, use depthwise separable convolutions; otherwise use ordinary convolutions
    # OUTPUT_STRIDE: downsampling factor, 8 or 16, determines aspp_ratios
    # aspp_ratios: dilation rates of the dilated convolutions in the ASPP module
if cfg.MODEL.DEEPLAB.OUTPUT_STRIDE == 16:
aspp_ratios = [6, 12, 18]
elif cfg.MODEL.DEEPLAB.OUTPUT_STRIDE == 8:
aspp_ratios = [12, 24, 36]
else:
raise Exception("deeplab only support stride 8 or 16")
param_attr = fluid.ParamAttr(
name=name_scope + 'weights',
regularizer=None,
initializer=fluid.initializer.TruncatedNormal(loc=0.0, scale=0.06))
with scope('encoder'):
channel = 256
with scope("image_pool"):
image_avg = fluid.layers.reduce_mean(
input, [2, 3], keep_dim=True)
image_avg = bn_relu(
conv(
image_avg,
channel,
1,
1,
groups=1,
padding=0,
param_attr=param_attr))
image_avg = fluid.layers.resize_bilinear(image_avg, input.shape[2:])
with scope("aspp0"):
aspp0 = bn_relu(
conv(
input,
channel,
1,
1,
groups=1,
padding=0,
param_attr=param_attr))
with scope("aspp1"):
if cfg.MODEL.DEEPLAB.ASPP_WITH_SEP_CONV:
aspp1 = separate_conv(
input, channel, 1, 3, dilation=aspp_ratios[0], act=relu)
else:
aspp1 = bn_relu(
conv(
input,
channel,
stride=1,
filter_size=3,
dilation=aspp_ratios[0],
padding=aspp_ratios[0],
param_attr=param_attr))
with scope("aspp2"):
if cfg.MODEL.DEEPLAB.ASPP_WITH_SEP_CONV:
aspp2 = separate_conv(
input, channel, 1, 3, dilation=aspp_ratios[1], act=relu)
else:
aspp2 = bn_relu(
conv(
input,
channel,
stride=1,
filter_size=3,
dilation=aspp_ratios[1],
padding=aspp_ratios[1],
param_attr=param_attr))
with scope("aspp3"):
if cfg.MODEL.DEEPLAB.ASPP_WITH_SEP_CONV:
aspp3 = separate_conv(
input, channel, 1, 3, dilation=aspp_ratios[2], act=relu)
else:
aspp3 = bn_relu(
conv(
input,
channel,
stride=1,
filter_size=3,
dilation=aspp_ratios[2],
padding=aspp_ratios[2],
param_attr=param_attr))
with scope("concat"):
data = fluid.layers.concat([image_avg, aspp0, aspp1, aspp2, aspp3],
axis=1)
data = bn_relu(
conv(
data,
channel,
1,
1,
groups=1,
padding=0,
param_attr=param_attr))
data = fluid.layers.dropout(data, 0.9)
return data
def decoder(encode_data, decode_shortcut):
    # Decoder
    # encode_data: output of the encoder
    # decode_shortcut: branch taken from the backbone, resized and concatenated with encode_data
    # DECODER_USE_SEP_CONV: True by default, apply two separable convolutions after the concat; otherwise use ordinary convolutions
param_attr = fluid.ParamAttr(
name=name_scope + 'weights',
regularizer=None,
initializer=fluid.initializer.TruncatedNormal(loc=0.0, scale=0.06))
with scope('decoder'):
with scope('concat'):
decode_shortcut = bn_relu(
conv(
decode_shortcut,
48,
1,
1,
groups=1,
padding=0,
param_attr=param_attr))
encode_data = fluid.layers.resize_bilinear(
encode_data, decode_shortcut.shape[2:])
encode_data = fluid.layers.concat([encode_data, decode_shortcut],
axis=1)
if cfg.MODEL.DEEPLAB.DECODER_USE_SEP_CONV:
with scope("separable_conv1"):
encode_data = separate_conv(
encode_data, 256, 1, 3, dilation=1, act=relu)
with scope("separable_conv2"):
encode_data = separate_conv(
encode_data, 256, 1, 3, dilation=1, act=relu)
else:
with scope("decoder_conv1"):
encode_data = bn_relu(
conv(
encode_data,
256,
stride=1,
filter_size=3,
dilation=1,
padding=1,
param_attr=param_attr))
with scope("decoder_conv2"):
encode_data = bn_relu(
conv(
encode_data,
256,
stride=1,
filter_size=3,
dilation=1,
padding=1,
param_attr=param_attr))
return encode_data
def nas_backbone(input, arch):
# scale = cfg.MODEL.DEEPLAB.DEPTH_MULTIPLIER
# output_stride = cfg.MODEL.DEEPLAB.OUTPUT_STRIDE
# model = mobilenet_backbone(scale=scale, output_stride=output_stride)
end_points = 8
decode_point = 3
data, decode_shortcuts = arch(
input, end_points=end_points, return_block=decode_point, output_stride=16)
decode_shortcut = decode_shortcuts[decode_point]
return data, decode_shortcut
def deeplabv3p_nas(img, num_classes, arch=None):
data, decode_shortcut = nas_backbone(img, arch)
    # Encoder/decoder setup
cfg.MODEL.DEFAULT_EPSILON = 1e-5
if cfg.MODEL.DEEPLAB.ENCODER_WITH_ASPP:
data = encoder(data)
if cfg.MODEL.DEEPLAB.ENABLE_DECODER:
data = decoder(data, decode_shortcut)
    # Set the output channels of the final conv layer according to the number of classes, then resize to the original image size
param_attr = fluid.ParamAttr(
name=name_scope + 'weights',
regularizer=fluid.regularizer.L2DecayRegularizer(
regularization_coeff=0.0),
initializer=fluid.initializer.TruncatedNormal(loc=0.0, scale=0.01))
with scope('logit'):
logit = conv(
data,
num_classes,
1,
stride=1,
padding=0,
bias_attr=True,
param_attr=param_attr)
logit = fluid.layers.resize_bilinear(logit, img.shape[2:])
return logit
# coding: utf8
# copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
# GPU memory garbage collection optimization flags
os.environ['FLAGS_eager_delete_tensor_gb'] = "0.0"
import sys
LOCAL_PATH = os.path.dirname(os.path.abspath(__file__))
SEG_PATH = os.path.join(LOCAL_PATH, "../../", "pdseg")
sys.path.append(SEG_PATH)
import time
import argparse
import functools
import pprint
import cv2
import numpy as np
import paddle
import paddle.fluid as fluid
from utils.config import cfg
from utils.timer import Timer, calculate_eta
from model_builder import build_model
from model_builder import ModelPhase
from reader import SegDataset
from metrics import ConfusionMatrix
from mobilenetv2_search_space import MobileNetV2SpaceSeg
def parse_args():
    parser = argparse.ArgumentParser(description='PaddleSeg model evaluation')
parser.add_argument(
'--cfg',
dest='cfg_file',
help='Config file for training (and optionally testing)',
default=None,
type=str)
parser.add_argument(
'--use_gpu',
dest='use_gpu',
help='Use gpu or cpu',
action='store_true',
default=False)
parser.add_argument(
'--use_mpio',
dest='use_mpio',
help='Use multiprocess IO or not',
action='store_true',
default=False)
parser.add_argument(
'opts',
help='See utils/config.py for all options',
default=None,
nargs=argparse.REMAINDER)
if len(sys.argv) == 1:
parser.print_help()
sys.exit(1)
return parser.parse_args()
def evaluate(cfg, ckpt_dir=None, use_gpu=False, use_mpio=False, **kwargs):
np.set_printoptions(precision=5, suppress=True)
startup_prog = fluid.Program()
test_prog = fluid.Program()
dataset = SegDataset(
file_list=cfg.DATASET.VAL_FILE_LIST,
mode=ModelPhase.EVAL,
data_dir=cfg.DATASET.DATA_DIR)
def data_generator():
        # TODO: check whether the batch reader is compatible with Windows
if use_mpio:
data_gen = dataset.multiprocess_generator(
num_processes=cfg.DATALOADER.NUM_WORKERS,
max_queue_size=cfg.DATALOADER.BUF_SIZE)
else:
data_gen = dataset.generator()
for b in data_gen:
yield b[0], b[1], b[2]
py_reader, avg_loss, pred, grts, masks = build_model(
test_prog, startup_prog, phase=ModelPhase.EVAL, arch=kwargs['arch'])
py_reader.decorate_sample_generator(
data_generator, drop_last=False, batch_size=cfg.BATCH_SIZE)
# Get device environment
places = fluid.cuda_places() if use_gpu else fluid.cpu_places()
place = places[0]
dev_count = len(places)
print("#Device count: {}".format(dev_count))
exe = fluid.Executor(place)
exe.run(startup_prog)
test_prog = test_prog.clone(for_test=True)
ckpt_dir = cfg.TEST.TEST_MODEL if not ckpt_dir else ckpt_dir
if not os.path.exists(ckpt_dir):
raise ValueError('The TEST.TEST_MODEL {} is not found'.format(ckpt_dir))
if ckpt_dir is not None:
print('load test model:', ckpt_dir)
fluid.io.load_params(exe, ckpt_dir, main_program=test_prog)
# Use streaming confusion matrix to calculate mean_iou
np.set_printoptions(
precision=4, suppress=True, linewidth=160, floatmode="fixed")
conf_mat = ConfusionMatrix(cfg.DATASET.NUM_CLASSES, streaming=True)
fetch_list = [avg_loss.name, pred.name, grts.name, masks.name]
num_images = 0
step = 0
all_step = cfg.DATASET.TEST_TOTAL_IMAGES // cfg.BATCH_SIZE + 1
timer = Timer()
timer.start()
py_reader.start()
while True:
try:
step += 1
loss, pred, grts, masks = exe.run(
test_prog, fetch_list=fetch_list, return_numpy=True)
loss = np.mean(np.array(loss))
num_images += pred.shape[0]
conf_mat.calculate(pred, grts, masks)
_, iou = conf_mat.mean_iou()
_, acc = conf_mat.accuracy()
speed = 1.0 / timer.elapsed_time()
print(
"[EVAL]step={} loss={:.5f} acc={:.4f} IoU={:.4f} step/sec={:.2f} | ETA {}"
.format(step, loss, acc, iou, speed,
calculate_eta(all_step - step, speed)))
timer.restart()
sys.stdout.flush()
except fluid.core.EOFException:
break
category_iou, avg_iou = conf_mat.mean_iou()
category_acc, avg_acc = conf_mat.accuracy()
print("[EVAL]#image={} acc={:.4f} IoU={:.4f}".format(
num_images, avg_acc, avg_iou))
print("[EVAL]Category IoU:", category_iou)
print("[EVAL]Category Acc:", category_acc)
print("[EVAL]Kappa:{:.4f}".format(conf_mat.kappa()))
return category_iou, avg_iou, category_acc, avg_acc
def main():
args = parse_args()
if args.cfg_file is not None:
cfg.update_from_file(args.cfg_file)
if args.opts:
cfg.update_from_list(args.opts)
cfg.check_and_infer()
print(pprint.pformat(cfg))
evaluate(cfg, **args.__dict__)
if __name__ == '__main__':
main()
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import paddle.fluid as fluid
from paddle.fluid.param_attr import ParamAttr
from paddleslim.nas.search_space.search_space_base import SearchSpaceBase
from paddleslim.nas.search_space.base_layer import conv_bn_layer
from paddleslim.nas.search_space.search_space_registry import SEARCHSPACE
from paddleslim.nas.search_space.utils import check_points
__all__ = ["MobileNetV2SpaceSeg"]
@SEARCHSPACE.register
class MobileNetV2SpaceSeg(SearchSpaceBase):
def __init__(self, input_size, output_size, block_num, block_mask=None):
super(MobileNetV2SpaceSeg, self).__init__(input_size, output_size,
block_num, block_mask)
# self.head_num means the first convolution channel
self.head_num = np.array([3, 4, 8, 12, 16, 24, 32]) #7
# self.filter_num1 ~ self.filter_num6 means following convlution channel
self.filter_num1 = np.array([3, 4, 8, 12, 16, 24, 32, 48]) #8
self.filter_num2 = np.array([8, 12, 16, 24, 32, 48, 64, 80]) #8
self.filter_num3 = np.array([16, 24, 32, 48, 64, 80, 96, 128]) #8
self.filter_num4 = np.array(
[24, 32, 48, 64, 80, 96, 128, 144, 160, 192]) #10
self.filter_num5 = np.array(
[32, 48, 64, 80, 96, 128, 144, 160, 192, 224]) #10
self.filter_num6 = np.array(
[64, 80, 96, 128, 144, 160, 192, 224, 256, 320, 384, 512]) #12
# self.k_size means kernel size
self.k_size = np.array([3, 5]) #2
# self.multiply means expansion_factor of each _inverted_residual_unit
self.multiply = np.array([1, 2, 3, 4, 6]) #5
# self.repeat means repeat_num _inverted_residual_unit in each _invresi_blocks
self.repeat = np.array([1, 2, 3, 4, 5, 6]) #6
def init_tokens(self):
"""
The initial token.
The first one is the index of the first layers' channel in self.head_num,
each line in the following represent the index of the [expansion_factor, filter_num, repeat_num, kernel_size]
"""
# original MobileNetV2
# yapf: disable
init_token_base = [4, # 1, 16, 1
4, 5, 1, 0, # 6, 24, 2
4, 4, 2, 0, # 6, 32, 3
4, 4, 3, 0, # 6, 64, 4
4, 5, 2, 0, # 6, 96, 3
4, 7, 2, 0, # 6, 160, 3
4, 9, 0, 0] # 6, 320, 1
# yapf: enable
return init_token_base
def range_table(self):
"""
Get range table of current search space, constrains the range of tokens.
"""
# head_num + 6 * [multiple(expansion_factor), filter_num, repeat, kernel_size]
# yapf: disable
range_table_base = [len(self.head_num),
len(self.multiply), len(self.filter_num1), len(self.repeat), len(self.k_size),
len(self.multiply), len(self.filter_num2), len(self.repeat), len(self.k_size),
len(self.multiply), len(self.filter_num3), len(self.repeat), len(self.k_size),
len(self.multiply), len(self.filter_num4), len(self.repeat), len(self.k_size),
len(self.multiply), len(self.filter_num5), len(self.repeat), len(self.k_size),
len(self.multiply), len(self.filter_num6), len(self.repeat), len(self.k_size)]
# yapf: enable
return range_table_base
def token2arch(self, tokens=None):
"""
return net_arch function
"""
if tokens is None:
tokens = self.init_tokens()
self.bottleneck_params_list = []
self.bottleneck_params_list.append(
(1, self.head_num[tokens[0]], 1, 1, 3))
self.bottleneck_params_list.append(
(self.multiply[tokens[1]], self.filter_num1[tokens[2]],
self.repeat[tokens[3]], 2, self.k_size[tokens[4]]))
self.bottleneck_params_list.append(
(self.multiply[tokens[5]], self.filter_num2[tokens[6]],
self.repeat[tokens[7]], 2, self.k_size[tokens[8]]))
self.bottleneck_params_list.append(
(self.multiply[tokens[9]], self.filter_num3[tokens[10]],
self.repeat[tokens[11]], 2, self.k_size[tokens[12]]))
self.bottleneck_params_list.append(
(self.multiply[tokens[13]], self.filter_num4[tokens[14]],
self.repeat[tokens[15]], 1, self.k_size[tokens[16]]))
self.bottleneck_params_list.append(
(self.multiply[tokens[17]], self.filter_num5[tokens[18]],
self.repeat[tokens[19]], 2, self.k_size[tokens[20]]))
self.bottleneck_params_list.append(
(self.multiply[tokens[21]], self.filter_num6[tokens[22]],
self.repeat[tokens[23]], 1, self.k_size[tokens[24]]))
def _modify_bottle_params(output_stride=None):
if output_stride is not None and output_stride % 2 != 0:
                raise Exception("output stride must be an even number")
if output_stride is None:
return
else:
stride = 2
for i, layer_setting in enumerate(self.bottleneck_params_list):
t, c, n, s, ks = layer_setting
stride = stride * s
if stride > output_stride:
s = 1
self.bottleneck_params_list[i] = (t, c, n, s, ks)
def net_arch(input,
scale=1.0,
return_block=None,
end_points=None,
output_stride=None):
self.scale = scale
_modify_bottle_params(output_stride)
decode_ends = dict()
def check_points(count, points):
if points is None:
return False
else:
if isinstance(points, list):
return (True if count in points else False)
else:
return (True if count == points else False)
#conv1
# all padding is 'SAME' in the conv2d, can compute the actual padding automatic.
input = conv_bn_layer(
input,
num_filters=int(32 * self.scale),
filter_size=3,
stride=2,
padding='SAME',
act='relu6',
name='mobilenetv2_conv1')
layer_count = 1
depthwise_output = None
# bottleneck sequences
in_c = int(32 * self.scale)
for i, layer_setting in enumerate(self.bottleneck_params_list):
t, c, n, s, k = layer_setting
layer_count += 1
### return_block and end_points means block num
if check_points((layer_count - 1), return_block):
decode_ends[layer_count - 1] = depthwise_output
if check_points((layer_count - 1), end_points):
return input, decode_ends
input, depthwise_output = self._invresi_blocks(
input=input,
in_c=in_c,
t=t,
c=int(c * self.scale),
n=n,
s=s,
k=k,
name='mobilenetv2_conv' + str(i))
in_c = int(c * self.scale)
### return_block and end_points means block num
if check_points(layer_count, return_block):
decode_ends[layer_count] = depthwise_output
if check_points(layer_count, end_points):
return input, decode_ends
# last conv
input = conv_bn_layer(
input=input,
num_filters=int(1280 * self.scale)
if self.scale > 1.0 else 1280,
filter_size=1,
stride=1,
padding='SAME',
act='relu6',
name='mobilenetv2_conv' + str(i + 1))
input = fluid.layers.pool2d(
input=input,
pool_type='avg',
global_pooling=True,
name='mobilenetv2_last_pool')
return input
return net_arch
def _shortcut(self, input, data_residual):
"""Build shortcut layer.
Args:
input(Variable): input.
data_residual(Variable): residual layer.
Returns:
Variable, layer output.
"""
return fluid.layers.elementwise_add(input, data_residual)
def _inverted_residual_unit(self,
input,
num_in_filter,
num_filters,
ifshortcut,
stride,
filter_size,
expansion_factor,
reduction_ratio=4,
name=None):
"""Build inverted residual unit.
Args:
input(Variable), input.
num_in_filter(int), number of in filters.
num_filters(int), number of filters.
ifshortcut(bool), whether using shortcut.
stride(int), stride.
filter_size(int), filter size.
padding(str|int|list), padding.
expansion_factor(float), expansion factor.
name(str), name.
Returns:
Variable, layers output.
"""
num_expfilter = int(round(num_in_filter * expansion_factor))
channel_expand = conv_bn_layer(
input=input,
num_filters=num_expfilter,
filter_size=1,
stride=1,
padding='SAME',
num_groups=1,
act='relu6',
name=name + '_expand')
bottleneck_conv = conv_bn_layer(
input=channel_expand,
num_filters=num_expfilter,
filter_size=filter_size,
stride=stride,
padding='SAME',
num_groups=num_expfilter,
act='relu6',
name=name + '_dwise',
use_cudnn=False)
depthwise_output = bottleneck_conv
linear_out = conv_bn_layer(
input=bottleneck_conv,
num_filters=num_filters,
filter_size=1,
stride=1,
padding='SAME',
num_groups=1,
act=None,
name=name + '_linear')
out = linear_out
if ifshortcut:
out = self._shortcut(input=input, data_residual=out)
return out, depthwise_output
def _invresi_blocks(self, input, in_c, t, c, n, s, k, name=None):
"""Build inverted residual blocks.
Args:
input: Variable, input.
in_c: int, number of in filters.
t: float, expansion factor.
c: int, number of filters.
n: int, number of layers.
s: int, stride.
k: int, filter size.
name: str, name.
Returns:
Variable, layers output.
"""
first_block, depthwise_output = self._inverted_residual_unit(
input=input,
num_in_filter=in_c,
num_filters=c,
ifshortcut=False,
stride=s,
filter_size=k,
expansion_factor=t,
name=name + '_1')
last_residual_block = first_block
last_c = c
for i in range(1, n):
last_residual_block, depthwise_output = self._inverted_residual_unit(
input=last_residual_block,
num_in_filter=last_c,
num_filters=c,
ifshortcut=True,
stride=1,
filter_size=k,
expansion_factor=t,
name=name + '_' + str(i + 1))
return last_residual_block, depthwise_output
# coding: utf8
# copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import struct
import paddle.fluid as fluid
import numpy as np
from paddle.fluid.proto.framework_pb2 import VarType
import solver
from utils.config import cfg
from loss import multi_softmax_with_loss
from loss import multi_dice_loss
from loss import multi_bce_loss
import deeplab
class ModelPhase(object):
"""
Standard name for model phase in PaddleSeg
The following standard keys are defined:
* `TRAIN`: training mode.
* `EVAL`: testing/evaluation mode.
* `PREDICT`: prediction/inference mode.
* `VISUAL` : visualization mode
"""
TRAIN = 'train'
EVAL = 'eval'
PREDICT = 'predict'
VISUAL = 'visual'
@staticmethod
def is_train(phase):
return phase == ModelPhase.TRAIN
@staticmethod
def is_predict(phase):
return phase == ModelPhase.PREDICT
@staticmethod
def is_eval(phase):
return phase == ModelPhase.EVAL
@staticmethod
def is_visual(phase):
return phase == ModelPhase.VISUAL
@staticmethod
def is_valid_phase(phase):
""" Check valid phase """
if ModelPhase.is_train(phase) or ModelPhase.is_predict(phase) \
or ModelPhase.is_eval(phase) or ModelPhase.is_visual(phase):
return True
return False
def seg_model(image, class_num, arch):
model_name = cfg.MODEL.MODEL_NAME
if model_name == 'deeplabv3p':
logits = deeplab.deeplabv3p_nas(image, class_num, arch)
else:
        raise Exception(
            "unknown model name, only support deeplabv3p"
        )
return logits
def softmax(logit):
logit = fluid.layers.transpose(logit, [0, 2, 3, 1])
logit = fluid.layers.softmax(logit)
logit = fluid.layers.transpose(logit, [0, 3, 1, 2])
return logit
def sigmoid_to_softmax(logit):
"""
one channel to two channel
"""
logit = fluid.layers.transpose(logit, [0, 2, 3, 1])
logit = fluid.layers.sigmoid(logit)
logit_back = 1 - logit
logit = fluid.layers.concat([logit_back, logit], axis=-1)
logit = fluid.layers.transpose(logit, [0, 3, 1, 2])
return logit
def export_preprocess(image):
    """Preprocessing pipeline for the exported inference model"""
image = fluid.layers.transpose(image, [0, 3, 1, 2])
origin_shape = fluid.layers.shape(image)[-2:]
    # Resize according to the chosen AUG_METHOD
if cfg.AUG.AUG_METHOD == 'unpadding':
h_fix = cfg.AUG.FIX_RESIZE_SIZE[1]
w_fix = cfg.AUG.FIX_RESIZE_SIZE[0]
image = fluid.layers.resize_bilinear(
image, out_shape=[h_fix, w_fix], align_corners=False, align_mode=0)
elif cfg.AUG.AUG_METHOD == 'rangescaling':
size = cfg.AUG.INF_RESIZE_VALUE
value = fluid.layers.reduce_max(origin_shape)
scale = float(size) / value.astype('float32')
image = fluid.layers.resize_bilinear(
image, scale=scale, align_corners=False, align_mode=0)
    # Record the image shape after resizing
valid_shape = fluid.layers.shape(image)[-2:]
    # Pad up to EVAL_CROP_SIZE
width = cfg.EVAL_CROP_SIZE[0]
height = cfg.EVAL_CROP_SIZE[1]
pad_target = fluid.layers.assign(
np.array([height, width]).astype('float32'))
up = fluid.layers.assign(np.array([0]).astype('float32'))
down = pad_target[0] - valid_shape[0]
left = up
right = pad_target[1] - valid_shape[1]
paddings = fluid.layers.concat([up, down, left, right])
paddings = fluid.layers.cast(paddings, 'int32')
image = fluid.layers.pad2d(image, paddings=paddings, pad_value=127.5)
# normalize
mean = np.array(cfg.MEAN).reshape(1, len(cfg.MEAN), 1, 1)
mean = fluid.layers.assign(mean.astype('float32'))
std = np.array(cfg.STD).reshape(1, len(cfg.STD), 1, 1)
std = fluid.layers.assign(std.astype('float32'))
image = (image / 255 - mean) / std
    # Reshape so that downstream layers can obtain the feature map shape via image.shape
image = fluid.layers.reshape(
image, shape=[-1, cfg.DATASET.DATA_DIM, height, width])
return image, valid_shape, origin_shape
def build_model(main_prog, start_prog, phase=ModelPhase.TRAIN, arch=None):
if not ModelPhase.is_valid_phase(phase):
raise ValueError("ModelPhase {} is not valid!".format(phase))
if ModelPhase.is_train(phase):
width = cfg.TRAIN_CROP_SIZE[0]
height = cfg.TRAIN_CROP_SIZE[1]
else:
width = cfg.EVAL_CROP_SIZE[0]
height = cfg.EVAL_CROP_SIZE[1]
image_shape = [cfg.DATASET.DATA_DIM, height, width]
grt_shape = [1, height, width]
class_num = cfg.DATASET.NUM_CLASSES
with fluid.program_guard(main_prog, start_prog):
with fluid.unique_name.guard():
            # When exporting the model, add image normalization preprocessing to simplify image handling at deployment time
            # At inference time, only a batch_size dimension needs to be added to the input image
if ModelPhase.is_predict(phase):
origin_image = fluid.layers.data(
name='image',
shape=[-1, -1, -1, cfg.DATASET.DATA_DIM],
dtype='float32',
append_batch_size=False)
image, valid_shape, origin_shape = export_preprocess(
origin_image)
else:
image = fluid.layers.data(
name='image', shape=image_shape, dtype='float32')
label = fluid.layers.data(
name='label', shape=grt_shape, dtype='int32')
mask = fluid.layers.data(
name='mask', shape=grt_shape, dtype='int32')
            # use PyReader when doing training and evaluation
if ModelPhase.is_train(phase) or ModelPhase.is_eval(phase):
py_reader = fluid.io.PyReader(
feed_list=[image, label, mask],
capacity=cfg.DATALOADER.BUF_SIZE,
iterable=False,
use_double_buffer=True)
loss_type = cfg.SOLVER.LOSS
if not isinstance(loss_type, list):
loss_type = list(loss_type)
            # dice_loss and bce_loss only apply to binary segmentation
if class_num > 2 and (("dice_loss" in loss_type) or
("bce_loss" in loss_type)):
raise Exception(
"dice loss and bce loss is only applicable to binary classfication"
)
            # For binary segmentation, when dice_loss or bce_loss is selected, the final logit has a single output channel
if ("dice_loss" in loss_type) or ("bce_loss" in loss_type):
class_num = 1
if "softmax_loss" in loss_type:
raise Exception(
"softmax loss can not combine with dice loss or bce loss"
)
logits = seg_model(image, class_num, arch)
            # Compute the corresponding losses according to the selected loss functions
if ModelPhase.is_train(phase) or ModelPhase.is_eval(phase):
loss_valid = False
avg_loss_list = []
valid_loss = []
if "softmax_loss" in loss_type:
weight = cfg.SOLVER.CROSS_ENTROPY_WEIGHT
avg_loss_list.append(
multi_softmax_with_loss(logits, label, mask, class_num, weight))
loss_valid = True
valid_loss.append("softmax_loss")
if "dice_loss" in loss_type:
avg_loss_list.append(multi_dice_loss(logits, label, mask))
loss_valid = True
valid_loss.append("dice_loss")
if "bce_loss" in loss_type:
avg_loss_list.append(multi_bce_loss(logits, label, mask))
loss_valid = True
valid_loss.append("bce_loss")
if not loss_valid:
raise Exception(
"SOLVER.LOSS: {} is set wrong. it should "
"include one of (softmax_loss, bce_loss, dice_loss) at least"
" example: ['softmax_loss'], ['dice_loss'], ['bce_loss', 'dice_loss']"
.format(cfg.SOLVER.LOSS))
invalid_loss = [x for x in loss_type if x not in valid_loss]
if len(invalid_loss) > 0:
print(
"Warning: the loss {} you set is invalid. it will not be included in loss computed."
.format(invalid_loss))
avg_loss = 0
for i in range(0, len(avg_loss_list)):
avg_loss += avg_loss_list[i]
#get pred result in original size
if isinstance(logits, tuple):
logit = logits[0]
else:
logit = logits
if logit.shape[2:] != label.shape[2:]:
logit = fluid.layers.resize_bilinear(logit, label.shape[2:])
# return image input and logit output for inference graph prune
if ModelPhase.is_predict(phase):
                # For binary segmentation with dice_loss or bce_loss, the logit has a single channel; convert it to two channels
if class_num == 1:
logit = sigmoid_to_softmax(logit)
else:
logit = softmax(logit)
                # Crop out the valid region
logit = fluid.layers.slice(
logit, axes=[2, 3], starts=[0, 0], ends=valid_shape)
logit = fluid.layers.resize_bilinear(
logit,
out_shape=origin_shape,
align_corners=False,
align_mode=0)
logit = fluid.layers.argmax(logit, axis=1)
return origin_image, logit
if class_num == 1:
out = sigmoid_to_softmax(logit)
out = fluid.layers.transpose(out, [0, 2, 3, 1])
else:
out = fluid.layers.transpose(logit, [0, 2, 3, 1])
pred = fluid.layers.argmax(out, axis=3)
pred = fluid.layers.unsqueeze(pred, axes=[3])
if ModelPhase.is_visual(phase):
if class_num == 1:
logit = sigmoid_to_softmax(logit)
else:
logit = softmax(logit)
return pred, logit
if ModelPhase.is_eval(phase):
return py_reader, avg_loss, pred, label, mask
if ModelPhase.is_train(phase):
optimizer = solver.Solver(main_prog, start_prog)
decayed_lr = optimizer.optimise(avg_loss)
return py_reader, avg_loss, decayed_lr, pred, label, mask
def to_int(string, dest="I"):
return struct.unpack(dest, string)[0]
def parse_shape_from_file(filename):
with open(filename, "rb") as file:
version = file.read(4)
lod_level = to_int(file.read(8), dest="Q")
for i in range(lod_level):
_size = to_int(file.read(8), dest="Q")
_ = file.read(_size)
version = file.read(4)
tensor_desc_size = to_int(file.read(4))
tensor_desc = VarType.TensorDesc()
tensor_desc.ParseFromString(file.read(tensor_desc_size))
return tuple(tensor_desc.dims)
# coding: utf8
# copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
# GPU memory garbage collection optimization flags
os.environ['FLAGS_eager_delete_tensor_gb'] = "0.0"
import sys
LOCAL_PATH = os.path.dirname(os.path.abspath(__file__))
SEG_PATH = os.path.join(LOCAL_PATH, "../../", "pdseg")
sys.path.append(SEG_PATH)
import argparse
import pprint
import random
import shutil
import functools
import paddle
import numpy as np
import paddle.fluid as fluid
from utils.config import cfg
from utils.timer import Timer, calculate_eta
from metrics import ConfusionMatrix
from reader import SegDataset
from model_builder import build_model
from model_builder import ModelPhase
from model_builder import parse_shape_from_file
from eval_nas import evaluate
from vis import visualize
from utils import dist_utils
from mobilenetv2_search_space import MobileNetV2SpaceSeg
from paddleslim.nas.search_space.search_space_factory import SearchSpaceFactory
from paddleslim.analysis import flops
from paddleslim.nas.sa_nas import SANAS
from paddleslim.nas import search_space
def parse_args():
parser = argparse.ArgumentParser(description='PaddleSeg training')
parser.add_argument(
'--cfg',
dest='cfg_file',
help='Config file for training (and optionally testing)',
default=None,
type=str)
parser.add_argument(
'--use_gpu',
dest='use_gpu',
help='Use gpu or cpu',
action='store_true',
default=False)
parser.add_argument(
'--use_mpio',
dest='use_mpio',
help='Use multiprocess I/O or not',
action='store_true',
default=False)
parser.add_argument(
'--log_steps',
dest='log_steps',
help='Display logging information at every log_steps',
default=10,
type=int)
parser.add_argument(
'--debug',
dest='debug',
help='debug mode, display detail information of training',
action='store_true')
parser.add_argument(
'--use_tb',
dest='use_tb',
help='whether to record the data during training to Tensorboard',
action='store_true')
parser.add_argument(
'--tb_log_dir',
dest='tb_log_dir',
help='Tensorboard logging directory',
default=None,
type=str)
parser.add_argument(
'--do_eval',
dest='do_eval',
help='Evaluation models result on every new checkpoint',
action='store_true')
parser.add_argument(
'opts',
help='See utils/config.py for all options',
default=None,
nargs=argparse.REMAINDER)
parser.add_argument(
'--enable_ce',
dest='enable_ce',
help='If set True, enable continuous evaluation job.'
'This flag is only used for internal test.',
action='store_true')
return parser.parse_args()
def save_vars(executor, dirname, program=None, vars=None):
"""
    Temporary resolution for Win save variables compatibility.
Will fix in PaddlePaddle v1.5.2
"""
save_program = fluid.Program()
save_block = save_program.global_block()
for each_var in vars:
# NOTE: don't save the variable which type is RAW
if each_var.type == fluid.core.VarDesc.VarType.RAW:
continue
new_var = save_block.create_var(
name=each_var.name,
shape=each_var.shape,
dtype=each_var.dtype,
type=each_var.type,
lod_level=each_var.lod_level,
persistable=True)
file_path = os.path.join(dirname, new_var.name)
file_path = os.path.normpath(file_path)
save_block.append_op(
type='save',
inputs={'X': [new_var]},
outputs={},
attrs={'file_path': file_path})
executor.run(save_program)
def save_checkpoint(exe, program, ckpt_name):
"""
Save checkpoint for evaluation or resume training
"""
ckpt_dir = os.path.join(cfg.TRAIN.MODEL_SAVE_DIR, str(ckpt_name))
print("Save model checkpoint to {}".format(ckpt_dir))
if not os.path.isdir(ckpt_dir):
os.makedirs(ckpt_dir)
save_vars(
exe,
ckpt_dir,
program,
vars=list(filter(fluid.io.is_persistable, program.list_vars())))
return ckpt_dir
def load_checkpoint(exe, program):
"""
    Load checkpoint from the pretrained model directory for resuming training
"""
print('Resume model training from:', cfg.TRAIN.RESUME_MODEL_DIR)
if not os.path.exists(cfg.TRAIN.RESUME_MODEL_DIR):
raise ValueError("TRAIN.PRETRAIN_MODEL {} not exist!".format(
cfg.TRAIN.RESUME_MODEL_DIR))
fluid.io.load_persistables(
exe, cfg.TRAIN.RESUME_MODEL_DIR, main_program=program)
model_path = cfg.TRAIN.RESUME_MODEL_DIR
    # Check whether the path ends with a path separator
if model_path[-1] == os.sep:
model_path = model_path[0:-1]
epoch_name = os.path.basename(model_path)
# If resume model is final model
if epoch_name == 'final':
begin_epoch = cfg.SOLVER.NUM_EPOCHS
# If resume model path is end of digit, restore epoch status
elif epoch_name.isdigit():
epoch = int(epoch_name)
begin_epoch = epoch + 1
else:
raise ValueError("Resume model path is not valid!")
print("Model checkpoint loaded successfully!")
return begin_epoch
def update_best_model(ckpt_dir):
best_model_dir = os.path.join(cfg.TRAIN.MODEL_SAVE_DIR, 'best_model')
if os.path.exists(best_model_dir):
shutil.rmtree(best_model_dir)
shutil.copytree(ckpt_dir, best_model_dir)
def print_info(*msg):
if cfg.TRAINER_ID == 0:
print(*msg)
def train(cfg):
startup_prog = fluid.Program()
train_prog = fluid.Program()
if args.enable_ce:
startup_prog.random_seed = 1000
train_prog.random_seed = 1000
drop_last = True
dataset = SegDataset(
file_list=cfg.DATASET.TRAIN_FILE_LIST,
mode=ModelPhase.TRAIN,
shuffle=True,
data_dir=cfg.DATASET.DATA_DIR)
def data_generator():
if args.use_mpio:
data_gen = dataset.multiprocess_generator(
num_processes=cfg.DATALOADER.NUM_WORKERS,
max_queue_size=cfg.DATALOADER.BUF_SIZE)
else:
data_gen = dataset.generator()
batch_data = []
for b in data_gen:
batch_data.append(b)
if len(batch_data) == (cfg.BATCH_SIZE // cfg.NUM_TRAINERS):
for item in batch_data:
yield item[0], item[1], item[2]
batch_data = []
# If use sync batch norm strategy, drop last batch if number of samples
# in batch_data is less then cfg.BATCH_SIZE to avoid NCCL hang issues
if not cfg.TRAIN.SYNC_BATCH_NORM:
for item in batch_data:
yield item[0], item[1], item[2]
# Get device environment
# places = fluid.cuda_places() if args.use_gpu else fluid.cpu_places()
# place = places[0]
gpu_id = int(os.environ.get('FLAGS_selected_gpus', 0))
place = fluid.CUDAPlace(gpu_id) if args.use_gpu else fluid.CPUPlace()
places = fluid.cuda_places() if args.use_gpu else fluid.cpu_places()
# Get number of GPU
dev_count = cfg.NUM_TRAINERS if cfg.NUM_TRAINERS > 1 else len(places)
print_info("#Device count: {}".format(dev_count))
    # Make sure BATCH_SIZE is divisible by the number of GPU cards
assert cfg.BATCH_SIZE % dev_count == 0, (
        'BATCH_SIZE:{} not divisible by number of GPUs:{}'.format(
cfg.BATCH_SIZE, dev_count))
# If use multi-gpu training mode, batch data will allocated to each GPU evenly
batch_size_per_dev = cfg.BATCH_SIZE // dev_count
print_info("batch_size_per_dev: {}".format(batch_size_per_dev))
config_info = {'input_size': 769, 'output_size': 1, 'block_num': 7}
config = ([(cfg.SLIM.NAS_SPACE_NAME, config_info)])
factory = SearchSpaceFactory()
space = factory.get_search_space(config)
port = cfg.SLIM.NAS_PORT
server_address = (cfg.SLIM.NAS_ADDRESS, port)
sa_nas = SANAS(config, server_addr=server_address, search_steps=cfg.SLIM.NAS_SEARCH_STEPS,
is_server=cfg.SLIM.NAS_IS_SERVER)
for step in range(cfg.SLIM.NAS_SEARCH_STEPS):
arch = sa_nas.next_archs()[0]
start_prog = fluid.Program()
train_prog = fluid.Program()
py_reader, avg_loss, lr, pred, grts, masks = build_model(
train_prog, start_prog, arch=arch, phase=ModelPhase.TRAIN)
cur_flops = flops(train_prog)
print('current step:', step, 'flops:', cur_flops)
py_reader.decorate_sample_generator(
data_generator, batch_size=batch_size_per_dev, drop_last=drop_last)
exe = fluid.Executor(place)
exe.run(start_prog)
exec_strategy = fluid.ExecutionStrategy()
# Clear temporary variables every 100 iteration
if args.use_gpu:
exec_strategy.num_threads = fluid.core.get_cuda_device_count()
exec_strategy.num_iteration_per_drop_scope = 100
build_strategy = fluid.BuildStrategy()
if cfg.NUM_TRAINERS > 1 and args.use_gpu:
dist_utils.prepare_for_multi_process(exe, build_strategy, train_prog)
exec_strategy.num_threads = 1
if cfg.TRAIN.SYNC_BATCH_NORM and args.use_gpu:
if dev_count > 1:
# Apply sync batch norm strategy
print_info("Sync BatchNorm strategy is effective.")
build_strategy.sync_batch_norm = True
else:
print_info(
"Sync BatchNorm strategy will not be effective if GPU device"
" count <= 1")
compiled_train_prog = fluid.CompiledProgram(train_prog).with_data_parallel(
loss_name=avg_loss.name,
exec_strategy=exec_strategy,
build_strategy=build_strategy)
# Resume training
begin_epoch = cfg.SOLVER.BEGIN_EPOCH
if cfg.TRAIN.RESUME_MODEL_DIR:
begin_epoch = load_checkpoint(exe, train_prog)
# Load pretrained model
elif os.path.exists(cfg.TRAIN.PRETRAINED_MODEL_DIR):
print_info('Pretrained model dir: ', cfg.TRAIN.PRETRAINED_MODEL_DIR)
load_vars = []
load_fail_vars = []
def var_shape_matched(var, shape):
"""
            Check whether the persistable variable shape matches the current network
"""
var_exist = os.path.exists(
os.path.join(cfg.TRAIN.PRETRAINED_MODEL_DIR, var.name))
if var_exist:
var_shape = parse_shape_from_file(
os.path.join(cfg.TRAIN.PRETRAINED_MODEL_DIR, var.name))
return var_shape == shape
return False
for x in train_prog.list_vars():
if isinstance(x, fluid.framework.Parameter):
shape = tuple(fluid.global_scope().find_var(
x.name).get_tensor().shape())
if var_shape_matched(x, shape):
load_vars.append(x)
else:
load_fail_vars.append(x)
fluid.io.load_vars(
exe, dirname=cfg.TRAIN.PRETRAINED_MODEL_DIR, vars=load_vars)
for var in load_vars:
print_info("Parameter[{}] loaded sucessfully!".format(var.name))
for var in load_fail_vars:
print_info(
"Parameter[{}] don't exist or shape does not match current network, skip"
" to load it.".format(var.name))
print_info("{}/{} pretrained parameters loaded successfully!".format(
len(load_vars),
len(load_vars) + len(load_fail_vars)))
else:
print_info(
'Pretrained model dir {} does not exist; training from scratch...'.
format(cfg.TRAIN.PRETRAINED_MODEL_DIR))
fetch_list = [avg_loss.name, lr.name]
global_step = 0
all_step = cfg.DATASET.TRAIN_TOTAL_IMAGES // cfg.BATCH_SIZE
if cfg.DATASET.TRAIN_TOTAL_IMAGES % cfg.BATCH_SIZE and not drop_last:
all_step += 1
all_step *= (cfg.SOLVER.NUM_EPOCHS - begin_epoch + 1)
avg_loss = 0.0
timer = Timer()
timer.start()
if begin_epoch > cfg.SOLVER.NUM_EPOCHS:
raise ValueError(
("begin epoch[{}] is larger than cfg.SOLVER.NUM_EPOCHS[{}]").format(
begin_epoch, cfg.SOLVER.NUM_EPOCHS))
if args.use_mpio:
print_info("Use multiprocess reader")
else:
print_info("Use multi-thread reader")
best_miou = 0.0
for epoch in range(begin_epoch, cfg.SOLVER.NUM_EPOCHS + 1):
py_reader.start()
while True:
try:
loss, lr = exe.run(
program=compiled_train_prog,
fetch_list=fetch_list,
return_numpy=True)
avg_loss += np.mean(np.array(loss))
global_step += 1
if global_step % args.log_steps == 0 and cfg.TRAINER_ID == 0:
avg_loss /= args.log_steps
speed = args.log_steps / timer.elapsed_time()
print((
"epoch={} step={} lr={:.5f} loss={:.4f} step/sec={:.3f} | ETA {}"
).format(epoch, global_step, lr[0], avg_loss, speed,
calculate_eta(all_step - global_step, speed)))
sys.stdout.flush()
avg_loss = 0.0
timer.restart()
except fluid.core.EOFException:
py_reader.reset()
break
except Exception as e:
print(e)
if epoch > cfg.SLIM.NAS_START_EVAL_EPOCH:
ckpt_dir = save_checkpoint(exe, train_prog, '{}_tmp'.format(port))
_, mean_iou, _, mean_acc = evaluate(
cfg=cfg,
arch=arch,
ckpt_dir=ckpt_dir,
use_gpu=args.use_gpu,
use_mpio=args.use_mpio)
if best_miou < mean_iou:
print('search step {}, epoch {} best iou {}'.format(step, epoch, mean_iou))
best_miou = mean_iou
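# Reward the controller with the best validation mIoU achieved by this candidate.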
sa_nas.reward(float(best_miou))
def main(args):
if args.cfg_file is not None:
cfg.update_from_file(args.cfg_file)
if args.opts:
cfg.update_from_list(args.opts)
if args.enable_ce:
random.seed(0)
np.random.seed(0)
cfg.TRAINER_ID = int(os.getenv("PADDLE_TRAINER_ID", 0))
cfg.NUM_TRAINERS = int(os.environ.get('PADDLE_TRAINERS_NUM', 1))
cfg.check_and_infer()
print_info(pprint.pformat(cfg))
train(cfg)
if __name__ == '__main__':
args = parse_args()
if not fluid.core.is_compiled_with_cuda() and args.use_gpu:
print(
"You cannot set use_gpu=True because you are using paddlepaddle-cpu."
)
print(
"Please: 1. Install paddlepaddle-gpu to run your models on GPU or 2. Set use_gpu=False to run models on CPU."
)
sys.exit(1)
main(args)
# PaddleSeg Pruning Tutorial
Before reading this tutorial, please make sure you have gone through PaddleSeg's [Quick Start](../README.md#快速入门) and [Basic Features](../README.md#基础功能) chapters so that you have a basic understanding of PaddleSeg.
This document describes how to use the convolution channel pruning interface of [PaddleSlim](https://paddlepaddle.github.io/PaddleSlim) to prune the number of channels in the convolution layers of models in the segmentation library.
In the segmentation library, pruning can be performed directly with the `PaddleSeg/slim/prune/train_prune.py` script, which calls PaddleSlim's [paddleslim.prune.Pruner](https://paddlepaddle.github.io/PaddleSlim/api/prune_api/#Pruner) API.
Unless otherwise stated, all commands in this tutorial are executed under the `PaddleSeg/` directory.
## 1. Prepare the Data and Pretrained Model
Run the following command to download the cityscapes dataset:
```
python dataset/download_cityscapes.py
```
Refer to the [pretrained model list](../../docs/model_zoo.md) to obtain the required pretrained model.
## 2. Determine the Parameters to Prune
We reduce the number of channels in convolution layers by pruning their parameters. Before pruning, we need to determine the names of the convolution-layer parameters to be pruned.
View all parameters of the current model with the following command:
```python
# List all Parameters of the model
for x in train_prog.list_vars():
    if isinstance(x, fluid.framework.Parameter):
        print(x.name, x.shape)
```
By inspecting the parameter names and shapes, pick out the convolution-layer parameters and decide which of them to prune, as sketched below.
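As a rough illustration (a minimal sketch that reuses the `train_prog` variable from the snippet above; treating every 4-D parameter as a convolution weight is only a heuristic), the candidate parameters can be collected like this:
```python
import paddle.fluid as fluid

# Hypothetical helper: treat every 4-D Parameter (out_channels, in_channels, kh, kw)
# as a convolution weight and list it as a pruning candidate.
conv_params = [
    x.name for x in train_prog.list_vars()
    if isinstance(x, fluid.framework.Parameter) and len(x.shape) == 4
]
print(conv_params)
```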
## 3. Launch the Pruning Task
When launching the pruning task with `train_prune.py`, use the `SLIM.PRUNE_PARAMS` option to specify the comma-separated list of parameter names to prune, and the `SLIM.PRUNE_RATIOS` option to specify the ratio pruned from each parameter.
```shell
export CUDA_VISIBLE_DEVICES=0
python -u ./slim/prune/train_prune.py --log_steps 10 --cfg configs/cityscape_fast_scnn.yaml --use_gpu --use_mpio \
SLIM.PRUNE_PARAMS 'learning_to_downsample/weights,learning_to_downsample/dsconv1/pointwise/weights,learning_to_downsample/dsconv2/pointwise/weights' \
SLIM.PRUNE_RATIOS '[0.1,0.1,0.1]'
```
Here we select three parameters and prune each of them by a ratio of 0.1. The sketch below shows roughly how these options reach PaddleSlim inside `train_prune.py`.
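For reference, here is a minimal sketch of how `train_prune.py` is expected to apply these options through PaddleSlim's `Pruner`; the variable names (`train_prog`, `place`) and the exact call site are assumptions, and the real script may differ in detail:
```python
import paddle.fluid as fluid
from paddleslim.prune import Pruner

# A minimal sketch, assuming a built training program `train_prog`, an executor
# place `place`, and the PaddleSeg config object `cfg` shown above.
pruner = Pruner()
pruned_prog, _, _ = pruner.prune(
    train_prog,                     # program whose conv layers will be pruned
    fluid.global_scope(),           # scope that holds the parameter tensors
    params=cfg.SLIM.PRUNE_PARAMS.strip().split(','),
    ratios=cfg.SLIM.PRUNE_RATIOS,
    place=place,
    only_graph=False)               # also prune the parameter values in the scope
```
Training would then continue on `pruned_prog` instead of the original program.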
## 4. Evaluation
```shell
export CUDA_VISIBLE_DEVICES=0
python -u ./slim/prune/eval_prune.py --cfg configs/cityscape_fast_scnn.yaml --use_gpu --use_mpio \
TEST.TEST_MODEL your_trained_model
```
# coding: utf8
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
# GPU memory garbage collection optimization flags
os.environ['FLAGS_eager_delete_tensor_gb'] = "0.0"
import sys
LOCAL_PATH = os.path.dirname(os.path.abspath(__file__))
SEG_PATH = os.path.join(LOCAL_PATH, "../../", "pdseg")
sys.path.append(SEG_PATH)
import time
import argparse
import functools
import pprint
import cv2
import numpy as np
import paddle
import paddle.fluid as fluid
from utils.config import cfg
from utils.timer import Timer, calculate_eta
from models.model_builder import build_model
from models.model_builder import ModelPhase
from reader import SegDataset
from metrics import ConfusionMatrix
from paddleslim.prune.io import *
def parse_args():
parser = argparse.ArgumentParser(description='PaddleSeg model evaluation')
parser.add_argument(
'--cfg',
dest='cfg_file',
help='Config file for training (and optionally testing)',
default=None,
type=str)
parser.add_argument(
'--use_gpu',
dest='use_gpu',
help='Use gpu or cpu',
action='store_true',
default=False)
parser.add_argument(
'--use_mpio',
dest='use_mpio',
help='Use multiprocess IO or not',
action='store_true',
default=False)
parser.add_argument(
'opts',
help='See utils/config.py for all options',
default=None,
nargs=argparse.REMAINDER)
if len(sys.argv) == 1:
parser.print_help()
sys.exit(1)
return parser.parse_args()
def evaluate(cfg, ckpt_dir=None, use_gpu=False, use_mpio=False, **kwargs):
np.set_printoptions(precision=5, suppress=True)
startup_prog = fluid.Program()
test_prog = fluid.Program()
dataset = SegDataset(
file_list=cfg.DATASET.VAL_FILE_LIST,
mode=ModelPhase.EVAL,
data_dir=cfg.DATASET.DATA_DIR)
def data_generator():
# TODO: check whether the batch reader is compatible with Windows
if use_mpio:
data_gen = dataset.multiprocess_generator(
num_processes=cfg.DATALOADER.NUM_WORKERS,
max_queue_size=cfg.DATALOADER.BUF_SIZE)
else:
data_gen = dataset.generator()
for b in data_gen:
yield b[0], b[1], b[2]
py_reader, avg_loss, pred, grts, masks = build_model(
test_prog, startup_prog, phase=ModelPhase.EVAL)
py_reader.decorate_sample_generator(
data_generator, drop_last=False, batch_size=cfg.BATCH_SIZE)
# Get device environment
places = fluid.cuda_places() if use_gpu else fluid.cpu_places()
place = places[0]
dev_count = len(places)
print("#Device count: {}".format(dev_count))
exe = fluid.Executor(place)
exe.run(startup_prog)
test_prog = test_prog.clone(for_test=True)
ckpt_dir = cfg.TEST.TEST_MODEL if not ckpt_dir else ckpt_dir
if not os.path.exists(ckpt_dir):
raise ValueError('The TEST.TEST_MODEL {} is not found'.format(ckpt_dir))
if ckpt_dir is not None:
print('load test model:', ckpt_dir)
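# Load the pruned checkpoint with PaddleSlim's load_model so that parameters whose
# shapes were changed by pruning are restored correctly.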
load_model(exe, test_prog, ckpt_dir)
#fluid.io.load_params(exe, ckpt_dir, main_program=test_prog)
# Use streaming confusion matrix to calculate mean_iou
np.set_printoptions(
precision=4, suppress=True, linewidth=160, floatmode="fixed")
conf_mat = ConfusionMatrix(cfg.DATASET.NUM_CLASSES, streaming=True)
fetch_list = [avg_loss.name, pred.name, grts.name, masks.name]
num_images = 0
step = 0
all_step = cfg.DATASET.TEST_TOTAL_IMAGES // cfg.BATCH_SIZE + 1
timer = Timer()
timer.start()
py_reader.start()
while True:
try:
step += 1
loss, pred, grts, masks = exe.run(
test_prog, fetch_list=fetch_list, return_numpy=True)
loss = np.mean(np.array(loss))
num_images += pred.shape[0]
conf_mat.calculate(pred, grts, masks)
_, iou = conf_mat.mean_iou()
_, acc = conf_mat.accuracy()
speed = 1.0 / timer.elapsed_time()
print(
"[EVAL]step={} loss={:.5f} acc={:.4f} IoU={:.4f} step/sec={:.2f} | ETA {}"
.format(step, loss, acc, iou, speed,
calculate_eta(all_step - step, speed)))
timer.restart()
sys.stdout.flush()
except fluid.core.EOFException:
break
category_iou, avg_iou = conf_mat.mean_iou()
category_acc, avg_acc = conf_mat.accuracy()
print("[EVAL]#image={} acc={:.4f} IoU={:.4f}".format(
num_images, avg_acc, avg_iou))
print("[EVAL]Category IoU:", category_iou)
print("[EVAL]Category Acc:", category_acc)
print("[EVAL]Kappa:{:.4f}".format(conf_mat.kappa()))
return category_iou, avg_iou, category_acc, avg_acc
def main():
args = parse_args()
if args.cfg_file is not None:
cfg.update_from_file(args.cfg_file)
if args.opts:
cfg.update_from_list(args.opts)
cfg.check_and_infer()
print(pprint.pformat(cfg))
evaluate(cfg, **args.__dict__)
if __name__ == '__main__':
main()