# [Dual Attention Network for Scene Segmentation (CVPR2019)](https://arxiv.org/pdf/1809.02983.pdf)
This project is a PaddlePaddle implementation of [DANet](https://arxiv.org/pdf/1809.02983.pdf), covering model training, validation, and related utilities.
## Model Overview
![net](img/Network.png)
The backbone is ResNet. To make it better suited to semantic segmentation, the authors modify ResNet as follows:
1. Remove the downsampling in the last two layers, so the feature map is 1/8 of the input size and keeps a relatively high spatial resolution.
2. Use dilated convolutions in the last two layers to enlarge the receptive field.
Two parallel attention modules (position attention and channel attention) are then attached; their outputs are fused element-wise, and a final convolution layer produces the segmentation map.
### Position Attention
![position](img/position.png)
A is the feature map produced by applying one convolution to the ResNet backbone output; its shape is C×H×W.
Three convolutions applied to A produce B, C, and D, each of shape C×H×W; B, C, and D are reshaped to C×N (N = H×W).
The transpose of the reshaped B is multiplied with C to obtain an N×N matrix, and a softmax over it gives the spatial attention map.
D is then multiplied by the softmax result, reshaped back to C×H×W, and fused element-wise with A. A minimal sketch is shown below.
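Below is a shape-level NumPy sketch of the computation just described. It is illustrative only: the helper name `position_attention` is made up here, and the module described in the paper implements the same flow with learned convolutions and a learnable fusion weight.
```python
import numpy as np

def position_attention(A, B, C, D):
    """Illustrative position attention; A, B, C, D are feature maps of shape (C, H, W)."""
    c, h, w = A.shape
    n = h * w
    b, c_, d = B.reshape(c, n), C.reshape(c, n), D.reshape(c, n)  # each C x N
    energy = b.T @ c_                                 # N x N spatial affinity
    energy = energy - energy.max(axis=-1, keepdims=True)  # numerically stable softmax
    attn = np.exp(energy)
    attn /= attn.sum(axis=-1, keepdims=True)          # softmax over positions
    out = d @ attn.T                                  # every position aggregates all positions
    return out.reshape(c, h, w) + A                   # element-wise fusion with A
```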
### Channel Attention
![channel](img/channel.png)
A is the feature map produced by applying one convolution to the ResNet backbone output; its shape is C×H×W.
Three reshape operations applied to A produce B, C, and D, each of shape C×N (N = H×W).
B is multiplied with the transpose of C to obtain a C×C matrix, and a softmax over it gives the channel attention map.
D is then multiplied by the softmax result, reshaped back to C×H×W, and fused element-wise with A. A minimal sketch is shown below.
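A corresponding shape-level sketch (again illustrative only; the module in the paper additionally rescales the fused output with a learnable weight):
```python
import numpy as np

def channel_attention(A):
    """Illustrative channel attention; A is a feature map of shape (C, H, W)."""
    c, h, w = A.shape
    a = A.reshape(c, h * w)                           # B, C and D are all reshapes of A: C x N
    energy = a @ a.T                                  # C x C channel affinity
    energy = energy - energy.max(axis=-1, keepdims=True)  # numerically stable softmax
    attn = np.exp(energy)
    attn /= attn.sum(axis=-1, keepdims=True)          # softmax over channels
    out = attn @ a                                    # every channel aggregates all channels
    return out.reshape(c, h, w) + A                   # element-wise fusion with A
```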
## Data Preparation
Public dataset: Cityscapes
The training set has 2975 images, the validation set 500, and the test set 1525; all images are 1024×2048.
Dataset source: [download](https://aistudio.baidu.com/aistudio/datasetDetail/11503) it from the AI Studio dataset page and place it under the dataset folder, with the following layout:
```text
dataset
├── cityscapes              # Cityscapes dataset
    ├── gtFine              # finely annotated labels
    ├── leftImg8bit         # train / val / test images
    ├── trainLabels.txt     # training file list
    ├── valLabels.txt       # validation file list
... ...
```
## Training Notes
#### Data Augmentation
1. Random scaling: scale range 0.75 to 2.0.
2. Random horizontal flip: probability 0.5.
3. Proportional (aspect-preserving) resize: the target size is determined by step 1.
4. Random crop.
5. Gaussian blur: probability 0.3 (optional).
6. Color jitter (color, contrast, sharpness, brightness): probability 0.3 (optional).
###### By default steps 1-4 are enabled and steps 5-6 are disabled; a condensed sketch of steps 1-4 follows.
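The sketch below condenses steps 1-4 and is adapted from sync_transform in utils/base.py (the full version also centers the padding and contains the optional blur/jitter steps):
```python
import random
from PIL import Image, ImageOps

def augment(image, label, base_size=1024, crop_size=768):
    """Condensed sketch of augmentation steps 1-4 (see utils/base.py for the full version)."""
    # 1. random scale: pick the target shorter side in [0.75, 2.0] * base_size
    short_size = random.randint(int(base_size * 0.75), int(base_size * 2.0))
    # 2. random horizontal flip with probability 0.5
    if random.random() > 0.5:
        image = image.transpose(Image.FLIP_LEFT_RIGHT)
        label = label.transpose(Image.FLIP_LEFT_RIGHT)
    # 3. proportional resize so that the shorter side equals short_size
    w, h = image.size
    if h > w:
        out_w, out_h = short_size, int(1.0 * h / w * short_size)
    else:
        out_h, out_w = short_size, int(1.0 * w / h * short_size)
    image = image.resize((out_w, out_h), Image.BILINEAR)
    label = label.resize((out_w, out_h), Image.NEAREST)
    # pad (image with 0, label with the ignore value 255) if smaller than crop_size
    if short_size < crop_size:
        pad_w, pad_h = max(crop_size - out_w, 0), max(crop_size - out_h, 0)
        image = ImageOps.expand(image, border=(0, 0, pad_w, pad_h), fill=0)
        label = ImageOps.expand(label, border=(0, 0, pad_w, pad_h), fill=255)
    # 4. random crop of crop_size x crop_size
    w, h = image.size
    x = random.randint(0, w - crop_size)
    y = random.randint(0, h - crop_size)
    box = (x, y, x + crop_size, y + crop_size)
    return image.crop(box), label.crop(box)
```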
#### Learning Rate Schedule
1. Warm-up: the learning rate increases from 0 to base_lr over the first 5 epochs.
2. After warm-up, a poly decay schedule reduces the learning rate from base_lr to 0; see the sketch below.
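A minimal sketch of how the per-step learning rate is computed under this schedule (assumed formula for illustration; the training scripts use utils/lr_scheduler.Lr for the actual schedule):
```python
def warmup_poly_lr(step, base_lr, epoch_nums, step_per_epoch, warmup_epoch=5, power=0.9):
    """Linear warm-up from 0 to base_lr, then poly decay towards 0 (illustrative)."""
    warmup_steps = warmup_epoch * step_per_epoch
    total_steps = epoch_nums * step_per_epoch
    if step < warmup_steps:
        return base_lr * step / warmup_steps              # warm-up: grow linearly to base_lr
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * (1.0 - progress) ** power            # poly decay towards 0
```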
#### Optimizer
Momentum: momentum 0.9, weight-decay (L2) coefficient 1e-4
#### Loading a Pretrained Model
Set --load_pretrained_model True (default: True).
Pretrained files:
checkpoint/DANet50_pretrained_model_paddle1.6.pdparams
checkpoint/DANet101_pretrained_model_paddle1.6.pdparams
#### Training Without a Pretrained Model
To train from scratch, set --load_pretrained_model False.
#### Loading a Trained Model
Set --load_better_model True (default: False).
Trained model file:
checkpoint/DANet101_better_model_paddle1.6.pdparams
##### Note
Training was done with Paddle 1.5.2; the code has since been ported to (and is compatible with) 1.6. The pretrained and trained parameters come from the 1.5.2 run.
#### Model File Layout
[Download the pretrained and best-model parameters](https://paddlemodels.bj.bcebos.com/DANet/DANet_models.tar)
The directory structure is as follows:
```text
checkpoint
├── DANet50_pretrained_model_paddle1.6.pdparams # DANet50 pretrained model, for Paddle 1.6.0
├── DANet101_pretrained_model_paddle1.6.pdparams # DANet101 pretrained model, for Paddle 1.6.0
├── DANet101_better_model_paddle1.6.pdparams # best trained DANet101 model, for Paddle 1.6.0
├── DANet101_better_model_paddle1.5.2 # best trained DANet101 model, for Paddle 1.5.2 only
```
## Training
```sh
# open garbage collection to save memory
export FLAGS_eager_delete_tensor_gb=0.0
# setting visible devices for train
export CUDA_VISIBLE_DEVICES=0,1,2,3
python train_executor.py --backbone resnet101 --batch_size 2 --lr 0.003 --lr_scheduler poly --epoch_num 350 --crop_size 768 --base_size 1024 --warm_up True --cuda True --use_data_parallel True --dilated True --multi_grid True --multi_dilation [4, 8, 16] --scale True --load_pretrained_model True --load_better_model False
# or
python train_dygraph.py --backbone resnet101 --batch_size 2 --lr 0.003 --lr_scheduler poly --epoch_num 350 --crop_size 768 --base_size 1024 --warm_up True --cuda True --use_data_parallel True --dilated True --multi_grid True --multi_dilation [4, 8, 16] --scale True --load_pretrained_model True --load_better_model False
```
#### Note
##### train_executor.py trains with the executor (static graph) API and targets Paddle 1.5.2; train_dygraph.py trains in dygraph mode and targets Paddle 1.6.0. Either can be used.
##### The validation metrics reported during training are not the real numbers; run eval.py to obtain the final validation results.
## Evaluation
```sh
# open garbage collection to save memory
export FLAGS_eager_delete_tensor_gb=0.0
# setting visible devices for prediction
export CUDA_VISIBLE_DEVICES=0
python eval.py --backbone resnet101 --load_better_model True --batch_size 1 --crop_size 1024 --base_size 2048 --cuda True --multi_scales True --flip True --dilated True --multi_grid True --multi_dilation [4, 8, 16]
```
## Evaluation Results
Metric: mean IoU (mean intersection over union).
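Per-class IoU and its mean over the 19 Cityscapes classes follow the standard definition (computed in iou.py from the accumulated confusion matrix, with the ignore label excluded):

$$\mathrm{IoU}_c = \frac{TP_c}{TP_c + FP_c + FN_c}, \qquad \mathrm{mIoU} = \frac{1}{19}\sum_{c=1}^{19}\mathrm{IoU}_c$$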
| Model | Single scale | Multi scale |
| :---:|:---:| :---:|
|DANet101|0.8043836|0.8138021|
##### Per-class IoU
| Model | cls1 | cls2 | cls3 | cls4 | cls5 | cls6 | cls7 | cls8 | cls9 | cls10 | cls11 | cls12 | cls13 | cls14 | cls15 | cls16 |cls17 | cls18 | cls19 |
| :---:|:---: | :---:| :---:|:---: | :---:| :---:|:---: | :---:| :---:|:---: |:---: |:---: |:---: | :---: | :---: |:---: | :---:| :---: |:---: |
|DANet101-SS|0.98212|0.85372|0.92799|0.59976|0.63318|0.65819|0.72023|0.80000|0.92605|0.65788|0.94841|0.83377|0.65206|0.95566|0.87148|0.91233|0.84352|0.71948|0.78737|
|DANet101-MS|0.98047|0.84637|0.93084|0.62699|0.64839|0.67769|0.73650|0.81343|0.92942|0.67010|0.95127|0.84466|0.66635|0.95749|0.87755|0.92370|0.85344|0.73007|0.79742|
## Visualization of Outputs
![val_1](img/val_1.png)
###### Input image
![val_gt](img/val_gt.png)
###### Ground-truth label
![val_output](img/val_output.png)
###### DANet101 output
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
os.environ['FLAGS_eager_delete_tensor_gb'] = "0.0"
os.environ['FLAGS_fraction_of_gpu_memory_to_use'] = "0.99"
import paddle.fluid as fluid
import paddle
import logging
import math
import numpy as np
import shutil
from PIL import ImageOps, Image, ImageEnhance, ImageFilter
from datetime import datetime
from danet import DANet
from options import Options
from utils.cityscapes_data import cityscapes_train
from utils.cityscapes_data import cityscapes_val
from utils.cityscapes_data import cityscapes_test
from utils.lr_scheduler import Lr
from iou import IOUMetric
# globals
data_mean = np.array([0.485, 0.456, 0.406]).reshape(3, 1, 1)
data_std = np.array([0.229, 0.224, 0.225]).reshape(3, 1, 1)
def pad_single_image(image, crop_size):
w, h = image.size
pad_h = crop_size - h if h < crop_size else 0
pad_w = crop_size - w if w < crop_size else 0
image = ImageOps.expand(image, border=(0, 0, pad_w, pad_h), fill=0)
assert (image.size[0] >= crop_size and image.size[1] >= crop_size)
return image
def crop_image(image, h0, w0, h1, w1):
return image.crop((w0, h0, w1, h1))
def flip_left_right_image(image):
return image.transpose(Image.FLIP_LEFT_RIGHT)
def resize_image(image, out_h, out_w, mode=Image.BILINEAR):
return image.resize((out_w, out_h), mode)
def mapper_image(image):
image_array = np.array(image) # HWC
image_array = image_array.transpose((2, 0, 1)) # CHW
image_array = image_array / 255.0
image_array = (image_array - data_mean) / data_std
image_array = image_array.astype('float32')
image_array = image_array[np.newaxis, :]
return image_array
def get_model(args):
model = DANet('DANet',
backbone=args.backbone,
num_classes=args.num_classes,
batch_size=1,
dilated=args.dilated,
multi_grid=args.multi_grid,
multi_dilation=args.multi_dilation)
return model
def copy_model(path, new_path):
shutil.rmtree(new_path, ignore_errors=True)
shutil.copytree(path, new_path)
model_path = os.path.join(new_path, '__model__')
if os.path.exists(model_path):
os.remove(model_path)
def mean_iou(pred, label, num_classes=19):
label = fluid.layers.elementwise_min(fluid.layers.cast(label, np.int32),
fluid.layers.assign(np.array([num_classes], dtype=np.int32)))
label_ig = (label == num_classes).astype('int32')
label_ng = (label != num_classes).astype('int32')
pred = fluid.layers.cast(fluid.layers.argmax(pred, axis=1), 'int32')
pred = pred * label_ng + label_ig * num_classes
miou, wrong, correct = fluid.layers.mean_iou(pred, label, num_classes + 1)
label.stop_gradient = True
return miou, wrong, correct
def eval(args, model_path):
with fluid.dygraph.guard():
num_classes = args.num_classes
        base_size = args.base_size  # longest image side, 2048
        crop_size = args.crop_size  # network input size, 1024
        multi_scales = args.multi_scales  # multi-scale testing
        flip = args.flip  # horizontal-flip testing
        if not multi_scales:
            scales = [1.0]
        else:
            # scales = [0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0, 2.2]
            scales = [0.5, 0.75, 1.0, 1.25, 1.35, 1.5, 1.75, 2.0, 2.2]  # may give slightly better results
        if len(scales) == 1:  # single scale
            # stride_rate = 2.0 / 3.0
            stride_rate = 1.0 / 2.0  # may give slightly better results
        else:
            stride_rate = 1.0 / 2.0
        stride = int(crop_size * stride_rate)  # sliding-window stride
model = get_model(args)
x = np.random.randn(1, 3, 224, 224).astype('float32')
x = fluid.dygraph.to_variable(x)
y = model(x)
iou = IOUMetric(num_classes)
        # load the best trained model
        if args.load_better_model and paddle.__version__ == '1.5.2':
            assert os.path.exists(model_path), "model file path does not exist, please check it"
            print('better model exist!')
            new_model_path = 'dygraph/' + model_path
            copy_model(model_path, new_model_path)
            model_param, _ = fluid.dygraph.load_persistables(new_model_path)
            model.load_dict(model_param)
        elif args.load_better_model and paddle.__version__ == '1.6.0':
            assert os.path.exists(model_path + '.pdparams'), "model file does not exist; Paddle 1.6 loads a single .pdparams file"
            print('better model exist!')
            model_param, _ = fluid.dygraph.load_dygraph(model_path)
            model.load_dict(model_param)
        else:
            raise ValueError('Please set load_better_model = True!')
        assert len(model_param) == len(model.state_dict()), "parameter count mismatch, loading failed; " \
                                                            "check that the model is initialized and matches the checkpoint"
model.eval()
prev_time = datetime.now()
# reader = cityscapes_test(split='test', base_size=2048, crop_size=1024, scale=True, xmap=True)
reader = cityscapes_test(split='val', base_size=2048, crop_size=1024, scale=True, xmap=True)
print('MultiEvalModule: base_size {}, crop_size {}'.
format(base_size, crop_size))
print('scales: {}'.format(scales))
        print('validating...')
palette = pat()
for data in reader():
# print(data)
image = data[0]
label_path = data[1] # val_label is a picture, test_label is a path
label = Image.open(label_path, mode='r') # val_label is a picture, test_label is a path
save_png_path = label_path.replace('val', '{}_val'.format(args.backbone)).replace('test', '{}_test'.format(args.backbone))
label_np = np.array(label)
w, h = image.size # h 1024, w 2048
            scores = np.zeros(shape=[num_classes, h, w], dtype='float32')  # score accumulator over all scales
for scale in scales:
long_size = int(math.ceil(base_size * scale)) # long_size
if h > w:
height = long_size
width = int(1.0 * w * long_size / h + 0.5)
short_size = width
else:
width = long_size
height = int(1.0 * h * long_size / w + 0.5)
short_size = height
cur_img = resize_image(image, height, width)
                # pad on the right and bottom
if long_size <= crop_size:
pad_img = pad_single_image(cur_img, crop_size)
pad_img = mapper_image(pad_img)
pad_img = fluid.dygraph.to_variable(pad_img)
pred1, pred2, pred3 = model(pad_img)
pred1 = pred1.numpy()
outputs = pred1[:, :, :height, :width]
if flip:
pad_img_filp = flip_left_right_image(cur_img)
pad_img_filp = pad_single_image(pad_img_filp, crop_size) # pad
pad_img_filp = mapper_image(pad_img_filp)
pad_img_filp = fluid.dygraph.to_variable(pad_img_filp)
pred1, pred2, pred3 = model(pad_img_filp)
pred1 = fluid.layers.reverse(pred1, axis=3)
pred1 = pred1.numpy()
outputs += pred1[:, :, :height, :width]
else:
if short_size < crop_size:
# pad if needed
pad_img = pad_single_image(cur_img, crop_size)
else:
pad_img = cur_img
pw, ph = pad_img.size
assert (ph >= height and pw >= width)
                    # sliding grid over the padded image
h_grids = int(math.ceil(1.0 * (ph - crop_size) / stride)) + 1
w_grids = int(math.ceil(1.0 * (pw - crop_size) / stride)) + 1
outputs = np.zeros(shape=[1, num_classes, ph, pw], dtype='float32')
count_norm = np.zeros(shape=[1, 1, ph, pw], dtype='int32')
for idh in range(h_grids):
for idw in range(w_grids):
h0 = idh * stride
w0 = idw * stride
h1 = min(h0 + crop_size, ph)
w1 = min(w0 + crop_size, pw)
crop_img = crop_image(pad_img, h0, w0, h1, w1)
pad_crop_img = pad_single_image(crop_img, crop_size)
pad_crop_img = mapper_image(pad_crop_img)
pad_crop_img = fluid.dygraph.to_variable(pad_crop_img)
pred1, pred2, pred3 = model(pad_crop_img) # shape [1, num_class, h, w]
pred = pred1.numpy() # channel, h, w
outputs[:, :, h0:h1, w0:w1] += pred[:, :, 0:h1 - h0, 0:w1 - w0]
count_norm[:, :, h0:h1, w0:w1] += 1
if flip:
pad_img_filp = flip_left_right_image(crop_img)
pad_img_filp = pad_single_image(pad_img_filp, crop_size) # pad
pad_img_array = mapper_image(pad_img_filp)
pad_img_array = fluid.dygraph.to_variable(pad_img_array)
pred1, pred2, pred3 = model(pad_img_array)
pred1 = fluid.layers.reverse(pred1, axis=3)
pred = pred1.numpy()
outputs[:, :, h0:h1, w0:w1] += pred[:, :, 0:h1 - h0, 0:w1 - w0]
count_norm[:, :, h0:h1, w0:w1] += 1
assert ((count_norm == 0).sum() == 0)
outputs = outputs / count_norm
outputs = outputs[:, :, :height, :width]
outputs = fluid.dygraph.to_variable(outputs)
outputs = fluid.layers.resize_bilinear(outputs, out_shape=[h, w])
score = outputs.numpy()[0]
                scores += score  # scores accumulates over all scales, shape: [channel, h, w]
pred = np.argmax(score, axis=0).astype('uint8')
picture_path = '{}'.format(save_png_path).replace('.png', '_scale_{}'.format(scale))
save_png(pred, palette, picture_path)
pred = np.argmax(scores, axis=0).astype('uint8')
picture_path = '{}'.format(save_png_path).replace('.png', '_scores')
save_png(pred, palette, picture_path)
            iou.add_batch(pred, label_np)  # accumulate IoU statistics
print('eval done!')
acc, acc_cls, iu, mean_iu, fwavacc, kappa = iou.evaluate()
print('acc = {}'.format(acc))
print('acc_cls = {}'.format(acc_cls))
print('iu = {}'.format(iu))
        print('mean_iou (including the ignore class 255) = {}'.format(mean_iu))
        print('mean_iou = {}'.format(np.nanmean(iu[:-1])))  # the real mIoU, ignore class excluded
print('fwavacc = {}'.format(fwavacc))
print('kappa = {}'.format(kappa))
cur_time = datetime.now()
h, remainder = divmod((cur_time - prev_time).seconds, 3600)
m, s = divmod(remainder, 60)
time_str = "Time %02d:%02d:%02d" % (h, m, s)
print('val ' + time_str)
def save_png(pred_value, palette, name):
if isinstance(pred_value, np.ndarray):
if pred_value.ndim == 3:
batch_size = pred_value.shape[0]
if batch_size == 1:
pred_value = pred_value.squeeze(axis=0)
image = Image.fromarray(pred_value).convert('P')
image.putpalette(palette)
save_path = '{}.png'.format(name)
save_dir = os.path.dirname(save_path)
if not os.path.exists(save_dir):
os.makedirs(save_dir)
image.save(save_path)
else:
for batch_id in range(batch_size):
value = pred_value[batch_id]
image = Image.fromarray(value).convert('P')
image.putpalette(palette)
save_path = '{}.png'.format(name[batch_id])
save_dir = os.path.dirname(save_path)
if not os.path.exists(save_dir):
os.makedirs(save_dir)
image.save(save_path)
elif pred_value.ndim == 2:
image = Image.fromarray(pred_value).convert('P')
image.putpalette(palette)
save_path = '{}.png'.format(name)
save_dir = os.path.dirname(save_path)
if not os.path.exists(save_dir):
os.makedirs(save_dir)
image.save(save_path)
else:
        raise ValueError('Only numpy ndarrays are supported for now')
def save_png_test(path):
im = Image.open(path)
im_array = np.array(im).astype('uint8')
save_png(im_array, pat(), 'save_png_test')
def pat():
palette = []
for i in range(256):
palette.extend((i, i, i))
palette[:3 * 19] = np.array([[128, 64, 128],
[244, 35, 232],
[70, 70, 70],
[102, 102, 156],
[190, 153, 153],
[153, 153, 153],
[250, 170, 30],
[220, 220, 0],
[107, 142, 35],
[152, 251, 152],
[70, 130, 180],
[220, 20, 60],
[255, 0, 0],
[0, 0, 142],
[0, 0, 70],
[0, 60, 100],
[0, 80, 100],
[0, 0, 230],
[119, 11, 32]], dtype='uint8').flatten()
return palette
if __name__ == '__main__':
options = Options()
args = options.parse()
options.print_args()
# model_path = 'checkpoint/DANet101_better_model_paddle1.5.2'
model_path = 'checkpoint/DANet101_better_model_paddle1.6'
eval(args, model_path)
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import numpy as np
class IOUMetric:
def __init__(self, num_classes):
self.num_classes = num_classes+1
self.hist = np.zeros((num_classes+1, num_classes+1))
def _fast_hist(self, label_pred, label_true):
mask = (label_true >= 0) & (label_true < self.num_classes)
hist = np.bincount(
self.num_classes * label_true[mask].astype(int) +
label_pred[mask], minlength=self.num_classes ** 2).reshape(self.num_classes, self.num_classes)
return hist
def add_batch(self, predictions, gts):
# gts = BHW
# predictions = BHW
if isinstance(gts, np.ndarray):
gts_ig = (gts == 255).astype(np.int32)
gts_nig = (gts != 255).astype(np.int32)
# print(predictions)
gts[gts == 255] = self.num_classes-1 # 19
predictions = gts_nig * predictions + gts_ig * (self.num_classes-1)
# print(predictions)
for lp, lt in zip(predictions, gts):
self.hist += self._fast_hist(lp.flatten(), lt.flatten())
def evaluate(self):
acc = np.diag(self.hist).sum() / self.hist.sum()
acc_cls = np.nanmean(np.diag(self.hist) / self.hist.sum(axis=1))
iu = np.diag(self.hist) / (self.hist.sum(axis=1) + self.hist.sum(axis=0) - np.diag(self.hist))
mean_iu = np.nanmean(iu)
freq = self.hist.sum(axis=1) / self.hist.sum()
fwavacc = (freq[freq > 0] * iu[freq > 0]).sum()
kappa = (self.hist.sum() * np.diag(self.hist).sum() - (self.hist.sum(axis=0) * self.hist.sum(axis=1)).sum()) / (
self.hist.sum() ** 2 - (self.hist.sum(axis=0) * self.hist.sum(axis=1)).sum())
return acc, acc_cls, iu, mean_iu, fwavacc, kappa
def evaluate_kappa(self):
kappa = (self.hist.sum() * np.diag(self.hist).sum() - (self.hist.sum(axis=0) * self.hist.sum(axis=1)).sum()) / (
self.hist.sum() ** 2 - (self.hist.sum(axis=0) * self.hist.sum(axis=1)).sum())
return kappa
def evaluate_iou_kappa(self):
iu = np.diag(self.hist) / (self.hist.sum(axis=1) + self.hist.sum(axis=0) - np.diag(self.hist))
mean_iu = np.nanmean(iu)
kappa = (self.hist.sum() * np.diag(self.hist).sum() - (self.hist.sum(axis=0) * self.hist.sum(axis=1)).sum()) / (
self.hist.sum() ** 2 - (self.hist.sum(axis=0) * self.hist.sum(axis=1)).sum())
return mean_iu, kappa
def evaluate_iu(self):
iu = np.diag(self.hist) / (self.hist.sum(axis=1) + self.hist.sum(axis=0) - np.diag(self.hist))
return iu
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import argparse
class Options:
def __init__(self):
parser = argparse.ArgumentParser(description='Paddle DANet Segmentation')
# model and dataset
parser.add_argument('--model', type=str, default='danet',
help='model name (default: danet)')
parser.add_argument('--backbone', type=str, default='resnet101',
help='backbone name (default: resnet101)')
parser.add_argument('--dataset', type=str, default='cityscapes',
help='dataset name (default: cityscapes)')
parser.add_argument('--num_classes', type=int, default=19,
help='num_classes (default: cityscapes = 19)')
parser.add_argument('--data_folder', type=str,
default='./dataset',
                            help='training dataset folder (default: ./dataset)')
parser.add_argument('--base_size', type=int, default=1024,
help='base image size')
parser.add_argument('--crop_size', type=int, default=768,
help='crop image size')
# training hyper params
parser.add_argument('--aux', default=True,
                            help='Auxiliary Loss')
parser.add_argument('--se_loss', default=True,
help='Semantic Encoding Loss SE-loss')
parser.add_argument('--epoch_num', type=int, default=1200, metavar='N',
help='number of epochs to train (default: auto)')
parser.add_argument('--start_epoch', type=int, default=0,
metavar='N', help='start epochs (default:0)')
parser.add_argument('--batch_size', type=int, default=None,
metavar='N', help='input batch size for \
training (default: auto)')
parser.add_argument('--test_batch_size', type=int, default=None,
metavar='N', help='input batch size for \
testing (default: same as batch size)')
# optimizer params
parser.add_argument('--lr', type=float, default=0.01, metavar='LR',
help='learning rate (default: auto)')
parser.add_argument('--lr_scheduler', type=str, default='poly',
help='learning rate scheduler (default: poly)')
parser.add_argument('--lr_pow', type=float, default=0.9,
help='learning rate scheduler (default: 0.9)')
parser.add_argument('--lr_step', type=int, default=None,
help='lr step to change lr')
parser.add_argument('--warm_up', type=bool, default=False,
help='warm_up (default: False)')
parser.add_argument('--warmup_epoch', type=int, default=5,
help='warmup_epoch (default: 5)')
        parser.add_argument('--total_step', type=int, default=500000,
                            metavar='N', help='total_step (default: 500000)')
parser.add_argument('--step_per_epoch', type=int, default=None,
metavar='N', help='step_per_epoch (default: auto)')
parser.add_argument('--momentum', type=float, default=0.9,
metavar='M', help='momentum (default: 0.9)')
        parser.add_argument('--weight_decay', type=float, default=1e-4,  # L2 regularization coefficient
metavar='M', help='w-decay (default: 1e-4)')
# cuda, seed and logging
parser.add_argument('--cuda', default=True, type=bool,
help='use CUDA training')
parser.add_argument('--use_data_parallel', default=True, type=bool,
help='use data_parallel training')
parser.add_argument('--seed', type=int, default=1, metavar='S',
help='random seed (default: 1)')
parser.add_argument('--log_root', type=str,
default='./', help='set a log path folder')
# checkpoint
parser.add_argument("--save_model", default='./checkpoint/', type=str,
help="model path")
# finetuning pre-trained models
parser.add_argument("--load_pretrained_model", default=True, type=bool,
help="load pretrained model (default: True)")
# load better models
parser.add_argument("--load_better_model", default=False, type=bool,
help="load better model (default: False)")
parser.add_argument('--multi-scales', type=bool, default=True,
help="testing scale,default:(multi scale)")
parser.add_argument('--flip', type=bool, default=True,
help="testing flip image,default:(True)")
# multi grid dilation option
parser.add_argument("--dilated", default=True, type=bool,
help="use dilation policy")
parser.add_argument("--multi_grid", default=True, type=bool,
help="use multi grid dilation policy")
parser.add_argument('--multi_dilation', type=int, default=[4, 8, 16],
help="multi grid dilation list")
parser.add_argument('--scale', action='store_false', default=True,
help='choose to use random scale transform(0.75-2.0),default:multi scale')
# the parser
self.parser = parser
def parse(self):
args = self.parser.parse_args()
# default settings for epochs, batch_size and lr
if args.epoch_num is None:
epoches = {
'pascal_voc': 180,
'pascal_aug': 180,
'pcontext': 180,
'ade20k': 180,
'cityscapes': 240,
}
num_class_dict = {
'pascal_voc': None,
'pascal_aug': None,
'pcontext': None,
'ade20k': None,
'cityscapes': 19,
}
total_steps = {
'pascal_voc': 500000,
'pascal_aug': 500000,
'pcontext': 500000,
'ade20k': 500000,
'cityscapes': 500000,
}
args.epoch_num = epoches[args.dataset.lower()]
args.num_classes = num_class_dict[args.dataset.lower()]
args.total_step = total_steps[args.dataset.lower()]
if args.batch_size is None:
args.batch_size = 2
if args.test_batch_size is None:
args.test_batch_size = args.batch_size
if args.step_per_epoch is None:
step_per_epoch = {
'pascal_voc': 185,
'pascal_aug': 185,
'pcontext': 185,
'ade20k': 185,
'cityscapes': 185, # 2975 // batch_size // GPU_num
}
args.step_per_epoch = step_per_epoch[args.dataset.lower()]
if args.lr is None:
lrs = {
'pascal_voc': 0.0001,
'pascal_aug': 0.001,
'pcontext': 0.001,
'ade20k': 0.01,
'cityscapes': 0.01,
}
args.lr = lrs[args.dataset.lower()] / 8 * args.batch_size
return args
def print_args(self):
arg_dict = self.parse().__dict__
for k, v in arg_dict.items():
print('{:30s}: {}'.format(k, v))
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
os.environ['FLAGS_eager_delete_tensor_gb'] = "0.0"
os.environ['FLAGS_fraction_of_gpu_memory_to_use'] = "0.99"
import paddle.fluid as fluid
import numpy as np
import paddle
import logging
import shutil
from datetime import datetime
from paddle.utils import Ploter
from danet import DANet
from options import Options
from utils.cityscapes_data import cityscapes_train
from utils.cityscapes_data import cityscapes_val
from utils.lr_scheduler import Lr
def get_model(args):
model = DANet('DANet',
backbone=args.backbone,
num_classes=args.num_classes,
batch_size=args.batch_size,
dilated=args.dilated,
multi_grid=args.multi_grid,
multi_dilation=args.multi_dilation)
return model
def mean_iou(pred, label, num_classes=19):
label = fluid.layers.elementwise_min(fluid.layers.cast(label, np.int32),
fluid.layers.assign(np.array([num_classes], dtype=np.int32)))
label_ig = (label == num_classes).astype('int32')
label_ng = (label != num_classes).astype('int32')
pred = fluid.layers.cast(fluid.layers.argmax(pred, axis=1), 'int32')
pred = pred * label_ng + label_ig * num_classes
miou, wrong, correct = fluid.layers.mean_iou(pred, label, num_classes + 1)
label.stop_gradient = True
return miou, wrong, correct
def loss_fn(pred, pred2, pred3, label, num_classes=19):
pred = fluid.layers.transpose(pred, perm=[0, 2, 3, 1])
pred = fluid.layers.reshape(pred, [-1, num_classes])
pred2 = fluid.layers.transpose(pred2, perm=[0, 2, 3, 1])
pred2 = fluid.layers.reshape(pred2, [-1, num_classes])
pred3 = fluid.layers.transpose(pred3, perm=[0, 2, 3, 1])
pred3 = fluid.layers.reshape(pred3, [-1, num_classes])
label = fluid.layers.reshape(label, [-1, 1])
pred = fluid.layers.softmax(pred, use_cudnn=False)
loss1 = fluid.layers.cross_entropy(pred, label, ignore_index=255)
pred2 = fluid.layers.softmax(pred2, use_cudnn=False)
loss2 = fluid.layers.cross_entropy(pred2, label, ignore_index=255)
pred3 = fluid.layers.softmax(pred3, use_cudnn=False)
loss3 = fluid.layers.cross_entropy(pred3, label, ignore_index=255)
label.stop_gradient = True
return loss1 + loss2 + loss3
def optimizer_setting(args):
if args.weight_decay is not None:
regular = fluid.regularizer.L2Decay(regularization_coeff=args.weight_decay)
else:
regular = None
if args.lr_scheduler == 'poly':
lr_scheduler = Lr(lr_policy='poly',
base_lr=args.lr,
epoch_nums=args.epoch_num,
step_per_epoch=args.step_per_epoch,
power=args.lr_pow,
warm_up=args.warm_up,
warmup_epoch=args.warmup_epoch)
decayed_lr = lr_scheduler.get_lr()
elif args.lr_scheduler == 'cosine':
lr_scheduler = Lr(lr_policy='cosine',
base_lr=args.lr,
epoch_nums=args.epoch_num,
step_per_epoch=args.step_per_epoch,
warm_up=args.warm_up,
warmup_epoch=args.warmup_epoch)
decayed_lr = lr_scheduler.get_lr()
elif args.lr_scheduler == 'piecewise':
lr_scheduler = Lr(lr_policy='piecewise',
base_lr=args.lr,
epoch_nums=args.epoch_num,
step_per_epoch=args.step_per_epoch,
warm_up=args.warm_up,
warmup_epoch=args.warmup_epoch,
decay_epoch=[50, 100, 150],
gamma=0.1)
decayed_lr = lr_scheduler.get_lr()
else:
decayed_lr = args.lr
return fluid.optimizer.MomentumOptimizer(learning_rate=decayed_lr,
momentum=args.momentum,
regularization=regular)
def main(args):
batch_size = args.batch_size
num_epochs = args.epoch_num
num_classes = args.num_classes
data_root = args.data_folder
num = fluid.core.get_cuda_device_count()
    print('Number of GPU devices: {}'.format(num))
# program
start_prog = fluid.default_startup_program()
train_prog = fluid.default_main_program()
start_prog.random_seed = args.seed
train_prog.random_seed = args.seed
logging.basicConfig(level=logging.INFO,
filename='DANet_{}_train.log'.format(args.backbone),
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logging.info('DANet')
logging.info(args)
place = fluid.CUDAPlace(0) if args.cuda else fluid.CPUPlace()
train_loss_title = 'Train_loss'
test_loss_title = 'Test_loss'
train_iou_title = 'Train_mIOU'
test_iou_title = 'Test_mIOU'
plot_loss = Ploter(train_loss_title, test_loss_title)
plot_iou = Ploter(train_iou_title, test_iou_title)
with fluid.dygraph.guard(place):
model = get_model(args)
x = np.random.randn(batch_size, 3, 224, 224).astype('float32')
x = fluid.dygraph.to_variable(x)
model(x)
        # load the pretrained model
        if args.load_pretrained_model:
            save_dir = 'checkpoint/DANet101_pretrained_model_paddle1.6'
            if os.path.exists(save_dir + '.pdparams'):
                param, _ = fluid.load_dygraph(save_dir)
                model.set_dict(param)
                assert len(param) == len(model.state_dict()), "parameter count mismatch, loading failed; " \
                                                              "check that the model is initialized and matches the checkpoint"
                print('load pretrained model!')
        # load the best trained model
        if args.load_better_model:
            save_dir = 'checkpoint/DANet101_better_model_paddle1.6'
            if os.path.exists(save_dir + '.pdparams'):
                param, _ = fluid.load_dygraph(save_dir)
                model.set_dict(param)
                assert len(param) == len(model.state_dict()), "parameter count mismatch, loading failed; " \
                                                              "check that the model is initialized and matches the checkpoint"
                print('load better model!')
optimizer = optimizer_setting(args)
train_data = cityscapes_train(data_root=data_root,
base_size=args.base_size,
crop_size=args.crop_size,
scale=args.scale,
xmap=True,
batch_size=batch_size,
gpu_num=num)
batch_train_data = paddle.batch(paddle.reader.shuffle(
train_data, buf_size=batch_size * 64),
batch_size=batch_size,
drop_last=True)
val_data = cityscapes_val(data_root=data_root,
base_size=args.base_size,
crop_size=args.crop_size,
scale=args.scale,
xmap=True)
batch_test_data = paddle.batch(val_data,
batch_size=batch_size,
drop_last=True)
train_iou_manager = fluid.metrics.Accuracy()
train_avg_loss_manager = fluid.metrics.Accuracy()
test_iou_manager = fluid.metrics.Accuracy()
test_avg_loss_manager = fluid.metrics.Accuracy()
better_miou_train = 0
better_miou_test = 0
for epoch in range(num_epochs):
prev_time = datetime.now()
train_avg_loss_manager.reset()
train_iou_manager.reset()
for batch_id, data in enumerate(batch_train_data()):
image = np.array([x[0] for x in data]).astype('float32')
label = np.array([x[1] for x in data]).astype('int64')
image = fluid.dygraph.to_variable(image)
label = fluid.dygraph.to_variable(label)
label.stop_gradient = True
pred, pred2, pred3 = model(image)
train_loss = loss_fn(pred, pred2, pred3, label, num_classes=num_classes)
train_avg_loss = fluid.layers.mean(train_loss)
miou, wrong, correct = mean_iou(pred, label, num_classes=num_classes)
train_avg_loss.backward()
optimizer.minimize(train_avg_loss)
model.clear_gradients()
train_iou_manager.update(miou.numpy(), weight=batch_size*num)
train_avg_loss_manager.update(train_avg_loss.numpy(), weight=batch_size*num)
batch_train_str = "epoch: {}, batch: {}, train_avg_loss: {:.6f}, " \
"train_miou: {:.6f}.".format(epoch + 1,
batch_id + 1,
train_avg_loss.numpy()[0],
miou.numpy()[0])
if batch_id % 100 == 0:
logging.info(batch_train_str)
print(batch_train_str)
cur_time = datetime.now()
h, remainder = divmod((cur_time - prev_time).seconds, 3600)
m, s = divmod(remainder, 60)
time_str = " Time %02d:%02d:%02d" % (h, m, s)
train_str = "\nepoch: {}, train_avg_loss: {:.6f}, " \
"train_miou: {:.6f}.".format(epoch + 1,
train_avg_loss_manager.eval()[0],
train_iou_manager.eval()[0])
print(train_str + time_str + '\n')
logging.info(train_str + time_str + '\n')
plot_loss.append(train_loss_title, epoch, train_avg_loss_manager.eval()[0])
plot_loss.plot('./DANet_loss.jpg')
plot_iou.append(train_iou_title, epoch, train_iou_manager.eval()[0])
plot_iou.plot('./DANet_miou.jpg')
fluid.dygraph.save_dygraph(model.state_dict(), 'checkpoint/DANet_epoch_new')
# save_model
            if better_miou_train < train_iou_manager.eval()[0]:
                # the previous best checkpoint is a single .pdparams file, so remove it with os.remove
                old_path = 'checkpoint/DAnet_better_train_{:.4f}.pdparams'.format(better_miou_train)
                if os.path.exists(old_path):
                    os.remove(old_path)
                better_miou_train = train_iou_manager.eval()[0]
                fluid.dygraph.save_dygraph(model.state_dict(),
                                           'checkpoint/DAnet_better_train_{:.4f}'.format(better_miou_train))
########## test ############
model.eval()
test_iou_manager.reset()
test_avg_loss_manager.reset()
prev_time = datetime.now()
for (batch_id, data) in enumerate(batch_test_data()):
image = np.array([x[0] for x in data]).astype('float32')
label = np.array([x[1] for x in data]).astype('int64')
image = fluid.dygraph.to_variable(image)
label = fluid.dygraph.to_variable(label)
label.stop_gradient = True
pred, pred2, pred3 = model(image)
test_loss = loss_fn(pred, pred2, pred3, label, num_classes=num_classes)
test_avg_loss = fluid.layers.mean(test_loss)
miou, wrong, correct = mean_iou(pred, label, num_classes=num_classes)
test_iou_manager.update(miou.numpy(), weight=batch_size*num)
test_avg_loss_manager.update(test_avg_loss.numpy(), weight=batch_size*num)
batch_test_str = "epoch: {}, batch: {}, test_avg_loss: {:.6f}, " \
"test_miou: {:.6f}.".format(epoch + 1, batch_id + 1,
test_avg_loss.numpy()[0],
miou.numpy()[0])
if batch_id % 20 == 0:
logging.info(batch_test_str)
print(batch_test_str)
cur_time = datetime.now()
h, remainder = divmod((cur_time - prev_time).seconds, 3600)
m, s = divmod(remainder, 60)
time_str = " Time %02d:%02d:%02d" % (h, m, s)
test_str = "\nepoch: {}, test_avg_loss: {:.6f}, " \
"test_miou: {:.6f}.".format(epoch + 1,
test_avg_loss_manager.eval()[0],
test_iou_manager.eval()[0])
print(test_str + time_str + '\n')
logging.info(test_str + time_str + '\n')
plot_loss.append(test_loss_title, epoch, test_avg_loss_manager.eval()[0])
plot_loss.plot('./DANet_loss.jpg')
plot_iou.append(test_iou_title, epoch, test_iou_manager.eval()[0])
plot_iou.plot('./DANet_miou.jpg')
model.train()
# save_model
            if better_miou_test < test_iou_manager.eval()[0]:
                # the previous best checkpoint is a single .pdparams file, so remove it with os.remove
                old_path = 'checkpoint/DAnet_better_test_{:.4f}.pdparams'.format(better_miou_test)
                if os.path.exists(old_path):
                    os.remove(old_path)
                better_miou_test = test_iou_manager.eval()[0]
                fluid.dygraph.save_dygraph(model.state_dict(),
                                           'checkpoint/DAnet_better_test_{:.4f}'.format(better_miou_test))
if __name__ == '__main__':
options = Options()
args = options.parse()
options.print_args()
main(args)
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
os.environ['FLAGS_eager_delete_tensor_gb'] = "0.0"
os.environ['FLAGS_fraction_of_gpu_memory_to_use'] = "0.99"
import paddle.fluid as fluid
import numpy as np
import paddle
import logging
import shutil
from datetime import datetime
from paddle.utils import Ploter
from danet import DANet
from options import Options
from utils.cityscapes_data import cityscapes_train
from utils.cityscapes_data import cityscapes_val
from utils.lr_scheduler import Lr
def get_model(args):
model = DANet('DANet',
backbone=args.backbone,
num_classes=args.num_classes,
batch_size=args.batch_size,
dilated=args.dilated,
multi_grid=args.multi_grid,
multi_dilation=args.multi_dilation)
return model
def mean_iou(pred, label, num_classes=19):
label = fluid.layers.elementwise_min(fluid.layers.cast(label, np.int32),
fluid.layers.assign(np.array([num_classes], dtype=np.int32)))
label_ig = (label == num_classes).astype('int32')
label_ng = (label != num_classes).astype('int32')
pred = fluid.layers.cast(fluid.layers.argmax(pred, axis=1), 'int32')
pred = pred * label_ng + label_ig * num_classes
miou, wrong, correct = fluid.layers.mean_iou(pred, label, num_classes + 1)
label.stop_gradient = True
return miou, wrong, correct
def loss_fn(pred, pred2, pred3, label, num_classes=19):
pred = fluid.layers.transpose(pred, perm=[0, 2, 3, 1])
pred = fluid.layers.reshape(pred, [-1, num_classes])
pred2 = fluid.layers.transpose(pred2, perm=[0, 2, 3, 1])
pred2 = fluid.layers.reshape(pred2, [-1, num_classes])
pred3 = fluid.layers.transpose(pred3, perm=[0, 2, 3, 1])
pred3 = fluid.layers.reshape(pred3, [-1, num_classes])
label = fluid.layers.reshape(label, [-1, 1])
# loss1 = fluid.layers.softmax_with_cross_entropy(pred, label, ignore_index=255)
    # the commented approach above produced NaN losses, so softmax + cross_entropy is used instead
pred = fluid.layers.softmax(pred, use_cudnn=False)
loss1 = fluid.layers.cross_entropy(pred, label, ignore_index=255)
pred2 = fluid.layers.softmax(pred2, use_cudnn=False)
loss2 = fluid.layers.cross_entropy(pred2, label, ignore_index=255)
pred3 = fluid.layers.softmax(pred3, use_cudnn=False)
loss3 = fluid.layers.cross_entropy(pred3, label, ignore_index=255)
label.stop_gradient = True
return loss1 + loss2 + loss3
def save_model(save_dir, exe, program=None):
if os.path.exists(save_dir):
shutil.rmtree(save_dir, ignore_errors=True)
os.makedirs(save_dir)
fluid.io.save_persistables(exe, save_dir, program)
        print('saved: {}'.format(os.path.basename(save_dir)))
else:
os.makedirs(save_dir)
fluid.io.save_persistables(exe, save_dir, program)
        print('did not exist, created and saved: {}'.format(os.path.basename(save_dir)))
def load_model(save_dir, exe, program=None):
if os.path.exists(save_dir):
fluid.io.load_persistables(exe, save_dir, program)
        print('checkpoint exists, loaded successfully')
    else:
        raise Exception('Please check the checkpoint path')
def optimizer_setting(args):
if args.weight_decay is not None:
regular = fluid.regularizer.L2Decay(regularization_coeff=args.weight_decay)
else:
regular = None
if args.lr_scheduler == 'poly':
lr_scheduler = Lr(lr_policy='poly',
base_lr=args.lr,
epoch_nums=args.epoch_num,
step_per_epoch=args.step_per_epoch,
power=args.lr_pow,
warm_up=args.warm_up,
warmup_epoch=args.warmup_epoch)
decayed_lr = lr_scheduler.get_lr()
elif args.lr_scheduler == 'cosine':
lr_scheduler = Lr(lr_policy='cosine',
base_lr=args.lr,
epoch_nums=args.epoch_num,
step_per_epoch=args.step_per_epoch,
warm_up=args.warm_up,
warmup_epoch=args.warmup_epoch)
decayed_lr = lr_scheduler.get_lr()
elif args.lr_scheduler == 'piecewise':
lr_scheduler = Lr(lr_policy='piecewise',
base_lr=args.lr,
epoch_nums=args.epoch_num,
step_per_epoch=args.step_per_epoch,
warm_up=args.warm_up,
warmup_epoch=args.warmup_epoch,
decay_epoch=[50, 100, 150],
gamma=0.1)
decayed_lr = lr_scheduler.get_lr()
else:
decayed_lr = args.lr
return fluid.optimizer.MomentumOptimizer(learning_rate=decayed_lr,
momentum=args.momentum,
regularization=regular)
def main(args):
image_shape = args.crop_size
image = fluid.layers.data(name='image', shape=[3, image_shape, image_shape], dtype='float32')
label = fluid.layers.data(name='label', shape=[image_shape, image_shape], dtype='int64')
batch_size = args.batch_size
epoch_num = args.epoch_num
num_classes = args.num_classes
data_root = args.data_folder
num = fluid.core.get_cuda_device_count()
    print('Number of GPU devices: {}'.format(num))
# program
start_prog = fluid.default_startup_program()
train_prog = fluid.default_main_program()
start_prog.random_seed = args.seed
train_prog.random_seed = args.seed
# clone
test_prog = train_prog.clone(for_test=True)
logging.basicConfig(level=logging.INFO,
filename='DANet_{}_train.log'.format(args.backbone),
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logging.info('DANet')
logging.info(args)
with fluid.program_guard(train_prog, start_prog):
with fluid.unique_name.guard():
train_py_reader = fluid.io.PyReader(feed_list=[image, label],
capacity=64,
use_double_buffer=True,
iterable=False)
train_data = cityscapes_train(data_root=data_root,
base_size=args.base_size,
crop_size=args.crop_size,
scale=args.scale,
xmap=True,
batch_size=batch_size,
gpu_num=num)
batch_train_data = paddle.batch(paddle.reader.shuffle(
train_data, buf_size=batch_size * 16),
batch_size=batch_size,
drop_last=True)
train_py_reader.decorate_sample_list_generator(batch_train_data)
model = get_model(args)
pred, pred2, pred3 = model(image)
train_loss = loss_fn(pred, pred2, pred3, label, num_classes=num_classes)
train_avg_loss = fluid.layers.mean(train_loss)
optimizer = optimizer_setting(args)
optimizer.minimize(train_avg_loss)
            # this mIoU is not the real metric; use eval.py for the final numbers
miou, wrong, correct = mean_iou(pred, label, num_classes=num_classes)
with fluid.program_guard(test_prog, start_prog):
with fluid.unique_name.guard():
test_py_reader = fluid.io.PyReader(feed_list=[image, label],
capacity=64,
iterable=False,
use_double_buffer=True)
val_data = cityscapes_val(data_root=data_root,
base_size=args.base_size,
crop_size=args.crop_size,
scale=args.scale,
xmap=True)
batch_test_data = paddle.batch(val_data,
batch_size=batch_size,
drop_last=True)
test_py_reader.decorate_sample_list_generator(batch_test_data)
model = get_model(args)
pred, pred2, pred3 = model(image)
test_loss = loss_fn(pred, pred2, pred3, label, num_classes=num_classes)
test_avg_loss = fluid.layers.mean(test_loss)
            # this mIoU is not the real metric; use eval.py for the final numbers
miou, wrong, correct = mean_iou(pred, label, num_classes=num_classes)
place = fluid.CUDAPlace(0) if args.cuda else fluid.CPUPlace()
exe = fluid.Executor(place)
exe.run(start_prog)
if args.use_data_parallel:
exec_strategy = fluid.ExecutionStrategy()
exec_strategy.num_threads = fluid.core.get_cuda_device_count()
exec_strategy.num_iteration_per_drop_scope = 100
build_strategy = fluid.BuildStrategy()
build_strategy.sync_batch_norm = True
print("sync_batch_norm = True!")
compiled_train_prog = fluid.compiler.CompiledProgram(train_prog).with_data_parallel(
loss_name=train_avg_loss.name,
build_strategy=build_strategy,
exec_strategy=exec_strategy)
else:
compiled_train_prog = fluid.compiler.CompiledProgram(train_prog)
    # load the pretrained model
if args.load_pretrained_model:
save_dir = 'checkpoint/DANet101_better_model_paddle1.5.2'
if os.path.exists(save_dir):
load_model(save_dir, exe, program=train_prog)
print('load pretrained model!')
    # load the best trained model
if args.load_better_model:
save_dir = 'checkpoint/DANet101_better_model_paddle1.5.2'
if os.path.exists(save_dir):
load_model(save_dir, exe, program=train_prog)
print('load better model!')
train_iou_manager = fluid.metrics.Accuracy()
train_avg_loss_manager = fluid.metrics.Accuracy()
test_iou_manager = fluid.metrics.Accuracy()
test_avg_loss_manager = fluid.metrics.Accuracy()
better_miou_train = 0
better_miou_test = 0
train_loss_title = 'Train_loss'
test_loss_title = 'Test_loss'
train_iou_title = 'Train_mIOU'
test_iou_title = 'Test_mIOU'
plot_loss = Ploter(train_loss_title, test_loss_title)
plot_iou = Ploter(train_iou_title, test_iou_title)
for epoch in range(epoch_num):
prev_time = datetime.now()
train_avg_loss_manager.reset()
train_iou_manager.reset()
logging.info('training, epoch = {}'.format(epoch + 1))
train_py_reader.start()
batch_id = 0
while True:
try:
train_fetch_list = [train_avg_loss, miou, wrong, correct]
train_avg_loss_value, train_iou_value, w, c = exe.run(
program=compiled_train_prog,
fetch_list=train_fetch_list)
train_iou_manager.update(train_iou_value, weight=batch_size * num)
train_avg_loss_manager.update(train_avg_loss_value, weight=batch_size * num)
batch_train_str = "epoch: {}, batch: {}, train_avg_loss: {:.6f}, " \
"train_miou: {:.6f}.".format(epoch + 1,
batch_id + 1,
train_avg_loss_value[0],
train_iou_value[0])
if batch_id % 40 == 0:
logging.info(batch_train_str)
print(batch_train_str)
batch_id += 1
except fluid.core.EOFException:
train_py_reader.reset()
break
cur_time = datetime.now()
h, remainder = divmod((cur_time - prev_time).seconds, 3600)
m, s = divmod(remainder, 60)
time_str = " Time %02d:%02d:%02d" % (h, m, s)
train_str = "epoch: {}, train_avg_loss: {:.6f}, " \
"train_miou: {:.6f}.".format(epoch + 1,
train_avg_loss_manager.eval()[0],
train_iou_manager.eval()[0])
print(train_str + time_str + '\n')
logging.info(train_str + time_str)
plot_loss.append(train_loss_title, epoch, train_avg_loss_manager.eval()[0])
plot_loss.plot('./DANet_loss.jpg')
plot_iou.append(train_iou_title, epoch, train_iou_manager.eval()[0])
plot_iou.plot('./DANet_miou.jpg')
# save_model
if better_miou_train < train_iou_manager.eval()[0]:
shutil.rmtree('./checkpoint/DAnet_better_train_{:.4f}'.format(better_miou_train),
ignore_errors=True)
better_miou_train = train_iou_manager.eval()[0]
logging.warning(
'-----------train---------------better_train: {:.6f}, epoch: {}, -----------successful save train model!\n'.format(
better_miou_train, epoch + 1))
save_dir = './checkpoint/DAnet_better_train_{:.4f}'.format(better_miou_train)
save_model(save_dir, exe, program=train_prog)
if (epoch + 1) % 5 == 0:
save_dir = './checkpoint/DAnet_epoch_train'
save_model(save_dir, exe, program=train_prog)
# test
test_py_reader.start()
test_iou_manager.reset()
test_avg_loss_manager.reset()
prev_time = datetime.now()
logging.info('testing, epoch = {}'.format(epoch + 1))
batch_id = 0
while True:
try:
test_fetch_list = [test_avg_loss, miou, wrong, correct]
test_avg_loss_value, test_iou_value, _, _ = exe.run(program=test_prog,
fetch_list=test_fetch_list)
test_iou_manager.update(test_iou_value, weight=batch_size * num)
test_avg_loss_manager.update(test_avg_loss_value, weight=batch_size * num)
batch_test_str = "epoch: {}, batch: {}, test_avg_loss: {:.6f}, " \
"test_miou: {:.6f}. ".format(epoch + 1,
batch_id + 1,
test_avg_loss_value[0],
test_iou_value[0])
if batch_id % 40 == 0:
logging.info(batch_test_str)
print(batch_test_str)
batch_id += 1
except fluid.core.EOFException:
test_py_reader.reset()
break
cur_time = datetime.now()
h, remainder = divmod((cur_time - prev_time).seconds, 3600)
m, s = divmod(remainder, 60)
time_str = " Time %02d:%02d:%02d" % (h, m, s)
test_str = "epoch: {}, test_avg_loss: {:.6f}, " \
"test_miou: {:.6f}.".format(epoch + 1,
test_avg_loss_manager.eval()[0],
test_iou_manager.eval()[0])
print(test_str + time_str + '\n')
logging.info(test_str + time_str)
plot_loss.append(test_loss_title, epoch, test_avg_loss_manager.eval()[0])
plot_loss.plot('./DANet_loss.jpg')
plot_iou.append(test_iou_title, epoch, test_iou_manager.eval()[0])
plot_iou.plot('./DANet_miou.jpg')
# save_model_infer
if better_miou_test < test_iou_manager.eval()[0]:
shutil.rmtree('./checkpoint/infer/DAnet_better_test_{:.4f}'.format(better_miou_test),
ignore_errors=True)
better_miou_test = test_iou_manager.eval()[0]
logging.warning(
'------------test-------------infer better_test: {:.6f}, epoch: {}, ----------------successful save infer model!\n'.format(
better_miou_test, epoch + 1))
save_dir = './checkpoint/infer/DAnet_better_test_{:.4f}'.format(better_miou_test)
# save_model(save_dir, exe, program=test_prog)
fluid.io.save_inference_model(save_dir, [image.name], [pred, pred2, pred3], exe)
print('successful save infer model!')
if __name__ == '__main__':
options = Options()
args = options.parse()
options.print_args()
main(args)
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from .base import BaseDataSet
from .cityscapes import CityScapes
from .lr_scheduler import Lr
from .cityscapes_data import *
from .voc import VOC
from .voc_data import *
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import random
import numpy as np
from PIL import Image, ImageOps, ImageFilter, ImageEnhance
import os
import sys
curPath = os.path.abspath(os.path.dirname(__file__))
parentPath = os.path.split(curPath)[0]
rootPath = os.path.split(parentPath)[0]
sys.path.append(rootPath)
class BaseDataSet:
def __init__(self, root, split, base_size=1024, crop_size=768, scale=True):
self.root = root
support = ['train', 'train_val', 'val', 'test']
assert split in support, "split= \'{}\' not in {}".format(split, support)
self.split = split
        self.crop_size = crop_size  # crop size
        self.base_size = base_size  # target shorter side of the image
self.scale = scale
self.image_path = None
self.label_path = None
def sync_transform(self, image, label):
crop_size = self.crop_size
if self.scale:
short_size = random.randint(int(self.base_size * 0.75), int(self.base_size * 2.0))
else:
short_size = self.base_size
        # random horizontal flip
if random.random() > 0.5:
image = image.transpose(Image.FLIP_LEFT_RIGHT)
label = label.transpose(Image.FLIP_LEFT_RIGHT)
w, h = image.size
        # proportional (aspect-preserving) resize
if h > w:
out_w = short_size
out_h = int(1.0 * h / w * out_w)
else:
out_h = short_size
out_w = int(1.0 * w / h * out_h)
image = image.resize((out_w, out_h), Image.BILINEAR)
label = label.resize((out_w, out_h), Image.NEAREST)
        # pad on all four sides
if short_size < crop_size:
pad_h = crop_size - out_h if out_h < crop_size else 0
pad_w = crop_size - out_w if out_w < crop_size else 0
image = ImageOps.expand(image, border=(pad_w // 2, pad_h // 2, pad_w - pad_w // 2, pad_h - pad_h // 2),
fill=0)
label = ImageOps.expand(label, border=(pad_w // 2, pad_h // 2, pad_w - pad_w // 2, pad_h - pad_h // 2),
fill=255)
        # random crop
w, h = image.size
x = random.randint(0, w - crop_size)
y = random.randint(0, h - crop_size)
image = image.crop((x, y, x + crop_size, y + crop_size))
label = label.crop((x, y, x + crop_size, y + crop_size))
        # # Gaussian blur, optional
        # if random.random() > 0.7:
        #     image = image.filter(ImageFilter.GaussianBlur(radius=random.random()))
        # optional color/brightness/contrast/sharpness jitter
        # if random.random() > 0.7:
        #     # random brightness
        #     factor = np.random.uniform(0.75, 1.25)
        #     image = ImageEnhance.Brightness(image).enhance(factor)
        #
        #     # color jitter
        #     factor = np.random.uniform(0.75, 1.25)
        #     image = ImageEnhance.Color(image).enhance(factor)
        #
        #     # random contrast
        #     factor = np.random.uniform(0.75, 1.25)
        #     image = ImageEnhance.Contrast(image).enhance(factor)
        #
        #     # random sharpness
        #     factor = np.random.uniform(0.75, 1.25)
        #     image = ImageEnhance.Sharpness(image).enhance(factor)
return image, label
def sync_val_transform(self, image, label):
crop_size = self.crop_size
short_size = self.base_size
w, h = image.size
        # proportional (aspect-preserving) resize
if h > w:
out_w = short_size
out_h = int(1.0 * h / w * out_w)
else:
out_h = short_size
out_w = int(1.0 * w / h * out_h)
image = image.resize((out_w, out_h), Image.BILINEAR)
label = label.resize((out_w, out_h), Image.NEAREST)
        # center crop
w, h = image.size
x1 = int(round((w - crop_size) / 2.))
y1 = int(round((h - crop_size) / 2.))
image = image.crop((x1, y1, x1 + crop_size, y1 + crop_size))
label = label.crop((x1, y1, x1 + crop_size, y1 + crop_size))
return image, label
def eval(self, image):
pass
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
import os
from utils.base import BaseDataSet
class CityScapes(BaseDataSet):
"""prepare cityscapes path_pairs"""
BASE_DIR = 'cityscapes'
NUM_CLASS = 19
def __init__(self, root='./dataset', split='train', **kwargs):
super(CityScapes, self).__init__(root, split, **kwargs)
if os.sep == '\\': # windows
root = root.replace('/', '\\')
root = os.path.join(root, self.BASE_DIR)
        assert os.path.exists(root), "please download the Cityscapes dataset, put it under the dataset directory, or check the root path"
        self.image_path, self.label_path = self._get_cityscapes_pairs(root, split)
        assert len(self.image_path) == len(self.label_path), "the image list and the label list must have the same length"
self.print_param()
    def print_param(self):  # prints dataset info for sanity checking
print('INFO: dataset_root: {}, split: {}, '
'base_size: {}, crop_size: {}, scale: {}, '
'image_length: {}, label_length: {}'.format(self.root, self.split, self.base_size,
self.crop_size, self.scale, len(self.image_path),
len(self.label_path)))
@staticmethod
def _get_cityscapes_pairs(root, split):
def get_pairs(root, file_image, file_label):
file_image = os.path.join(root, file_image)
file_label = os.path.join(root, file_label)
with open(file_image, 'r') as f:
file_list_image = f.read().split()
with open(file_label, 'r') as f:
file_list_label = f.read().split()
if os.sep == '\\': # for windows
image_path = [os.path.join(root, x.replace('/', '\\')) for x in file_list_image]
label_path = [os.path.join(root, x.replace('/', '\\')) for x in file_list_label]
else:
image_path = [os.path.join(root, x) for x in file_list_image]
label_path = [os.path.join(root, x) for x in file_list_label]
return image_path, label_path
if split == 'train':
image_path, label_path = get_pairs(root, 'trainImages.txt', 'trainLabels.txt')
elif split == 'val':
image_path, label_path = get_pairs(root, 'valImages.txt', 'valLabels.txt')
elif split == 'test':
            image_path, label_path = get_pairs(root, 'testImages.txt', 'testLabels.txt')  # returns file paths; the test labels do not actually exist
else: # 'train_val'
image_path1, label_path1 = get_pairs(root, 'trainImages.txt', 'trainLabels.txt')
image_path2, label_path2 = get_pairs(root, 'valImages.txt', 'valLabels.txt')
image_path, label_path = image_path1+image_path2, label_path1+label_path2
return image_path, label_path
def get_path_pairs(self):
return self.image_path, self.label_path
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import random
import paddle
import numpy as np
from PIL import Image
from utils.cityscapes import CityScapes
__all__ = ['cityscapes_train', 'cityscapes_val', 'cityscapes_train_val', 'cityscapes_test']
# globals
data_mean = np.array([0.485, 0.456, 0.406]).reshape(3, 1, 1)
data_std = np.array([0.229, 0.224, 0.225]).reshape(3, 1, 1)
def mapper_train(sample):
image_path, label_path, city = sample
image = Image.open(image_path, mode='r').convert('RGB')
label = Image.open(label_path, mode='r')
image, label = city.sync_transform(image, label)
image_array = np.array(image) # HWC
label_array = np.array(label) # HW
image_array = image_array.transpose((2, 0, 1)) # CHW
image_array = image_array / 255.0
image_array = (image_array - data_mean) / data_std
image_array = image_array.astype('float32')
label_array = label_array.astype('int64')
return image_array, label_array
def mapper_val(sample):
image_path, label_path, city = sample
image = Image.open(image_path, mode='r').convert('RGB')
label = Image.open(label_path, mode='r')
image, label = city.sync_val_transform(image, label)
image_array = np.array(image) # HWC
label_array = np.array(label) # HW
image_array = image_array.transpose((2, 0, 1)) # CHW
image_array = image_array / 255.0
image_array = (image_array - data_mean) / data_std
image_array = image_array.astype('float32')
label_array = label_array.astype('int64')
return image_array, label_array
def mapper_test(sample):
image_path, label_path = sample # label is path
image = Image.open(image_path, mode='r').convert('RGB')
image_array = image
return image_array, label_path # image is a picture, label is path
# root, base_size, crop_size; gpu_num must be set, otherwise with sync BN some devices may receive no data
def cityscapes_train(data_root='./dataset', base_size=1024, crop_size=768, scale=True, xmap=True, batch_size=1, gpu_num=1):
city = CityScapes(root=data_root, split='train', base_size=base_size, crop_size=crop_size, scale=scale)
image_path, label_path = city.get_path_pairs()
def reader():
if len(image_path) % (batch_size * gpu_num) != 0:
length = (len(image_path) // (batch_size * gpu_num)) * (batch_size * gpu_num)
else:
length = len(image_path)
for i in range(length):
if i == 0:
cc = list(zip(image_path, label_path))
random.shuffle(cc)
image_path[:], label_path[:] = zip(*cc)
yield image_path[i], label_path[i], city
if xmap:
return paddle.reader.xmap_readers(mapper_train, reader, 4, 32)
else:
return paddle.reader.map_readers(mapper_train, reader)
def cityscapes_val(data_root='./dataset', base_size=1024, crop_size=768, scale=True, xmap=True):
city = CityScapes(root=data_root, split='val', base_size=base_size, crop_size=crop_size, scale=scale)
image_path, label_path = city.get_path_pairs()
def reader():
for i in range(len(image_path[:2])):  # only the first 2 validation images are read here (quick in-training check)
yield image_path[i], label_path[i], city
if xmap:
return paddle.reader.xmap_readers(mapper_val, reader, 4, 32)
else:
return paddle.reader.map_readers(mapper_val, reader)
def cityscapes_train_val(data_root='./dataset', base_size=1024, crop_size=768, scale=True, xmap=True, batch_size=1, gpu_num=1):
city = CityScapes(root=data_root, split='train_val', base_size=base_size, crop_size=crop_size, scale=scale)
image_path, label_path = city.get_path_pairs()
def reader():
if len(image_path) % (batch_size * gpu_num) != 0:
length = (len(image_path) // (batch_size * gpu_num)) * (batch_size * gpu_num)
else:
length = len(image_path)
for i in range(length):
if i == 0:
cc = list(zip(image_path, label_path))
random.shuffle(cc)
image_path[:], label_path[:] = zip(*cc)
yield image_path[i], label_path[i], city
if xmap:
return paddle.reader.xmap_readers(mapper_train, reader, 4, 32)
else:
return paddle.reader.map_readers(mapper_train, reader)
def cityscapes_test(split='test', base_size=2048, crop_size=1024, scale=True, xmap=True):
# base_size, crop_size and scale are not actually used here
city = CityScapes(split=split, base_size=base_size, crop_size=crop_size, scale=scale)
image_path, label_path = city.get_path_pairs()
def reader():
for i in range(len(image_path)):
yield image_path[i], label_path[i]
if xmap:
return paddle.reader.xmap_readers(mapper_test, reader, 4, 32)
else:
return paddle.reader.map_readers(mapper_test, reader)
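# A minimal usage sketch (an assumption, not part of the original file): wrap the sample-level
# reader above in paddle.batch and pull one mini-batch; the shape comments assume sync_transform
# produces crop_size x crop_size crops.
if __name__ == '__main__':
    train_reader = paddle.batch(
        cityscapes_train(data_root='./dataset', base_size=1024, crop_size=768,
                         scale=True, xmap=True, batch_size=2, gpu_num=1),
        batch_size=2, drop_last=True)
    for batch in train_reader():
        images = np.array([s[0] for s in batch])  # float32, e.g. (2, 3, 768, 768)
        labels = np.array([s[1] for s in batch])  # int64,   e.g. (2, 768, 768)
        print(images.shape, labels.shape)
        break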
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import paddle.fluid as fluid
import math
class Lr(object):
"""
示例:使用poly策略, 有热身,
lr_scheduler = Lr(lr_policy='poly', base_lr=0.003, epoch_nums=200, step_per_epoch=20,
warm_up=True, warmup_epoch=11)
lr = lr_scheduler.get_lr()
示例:使用cosine策略, 有热身,
lr_scheduler = Lr(lr_policy='cosine', base_lr=0.003, epoch_nums=200, step_per_epoch=20,
warm_up=True, warmup_epoch=11)
lr = lr_scheduler.get_lr()
示例:使用piecewise策略, 有热身,必须设置边界(decay_epoch list), gamma系数默认0.1
lr_scheduler = Lr(lr_policy='piecewise', base_lr=0.003, epoch_nums=200, step_per_epoch=20,
warm_up=True, warmup_epoch=11, decay_epoch=[50], gamma=0.1)
lr = lr_scheduler.get_lr()
"""
def __init__(self, lr_policy, base_lr, epoch_nums, step_per_epoch,
power=0.9, end_lr=0.0, gamma=0.1, decay_epoch=[],
warm_up=False, warmup_epoch=0):
support_lr_policy = ['poly', 'piecewise', 'cosine']
assert lr_policy in support_lr_policy, "Only poly, piecewise and cosine are supported"
self.lr_policy = lr_policy # learning rate decay policy: str ('cosine', 'poly', 'piecewise')
assert base_lr >= 0, "base_lr should be greater than or equal to 0"
self.base_lr = base_lr # base learning rate: float
assert end_lr >= 0, "end_lr should be greater than or equal to 0"
self.end_lr = end_lr # final learning rate: float
assert epoch_nums, "epoch_nums should be greater than 0"
assert step_per_epoch, "step_per_epoch should be greater than 0"
self.epoch_nums = epoch_nums # number of epochs: int
self.step_per_epoch = step_per_epoch # iterations per epoch: int
self.total_step = epoch_nums * step_per_epoch # total number of iterations: int
self.power = power # poly decay exponent: float
self.gamma = gamma # piecewise decay factor: float
self.decay_epoch = decay_epoch # epochs at which piecewise decay is applied: list
if self.lr_policy == 'piecewise':
assert len(decay_epoch) >= 1, "the piecewise policy requires a non-empty decay_epoch list"
self.warm_up = warm_up # whether to use warm-up: bool
if self.warm_up:
assert warmup_epoch, "warmup_epoch should be greater than 0"
assert warmup_epoch < epoch_nums, "warmup_epoch should be less than epoch_nums"
self.warmup_epoch = warmup_epoch
self.warmup_steps = warmup_epoch * step_per_epoch # warm-up steps: int (warmup_epoch * step_per_epoch)
def _piecewise_decay(self):
gamma = self.gamma
bd = [self.step_per_epoch * e for e in self.decay_epoch]
lr = [self.base_lr * (gamma ** i) for i in range(len(bd) + 1)]
decayed_lr = fluid.layers.piecewise_decay(boundaries=bd, values=lr)
return decayed_lr
def _poly_decay(self):
decayed_lr = fluid.layers.polynomial_decay(
self.base_lr, self.total_step, end_learning_rate=self.end_lr, power=self.power)
return decayed_lr
def _cosine_decay(self):
decayed_lr = fluid.layers.cosine_decay(
self.base_lr, self.step_per_epoch, self.epoch_nums)
return decayed_lr
def get_lr(self):
if self.lr_policy.lower() == 'poly':
if self.warm_up:
warm_up_end_lr = (self.base_lr - self.end_lr) * pow(
(1 - self.warmup_steps / self.total_step), self.power) + self.end_lr
print('poly warm_up_end_lr:', warm_up_end_lr)
decayed_lr = fluid.layers.linear_lr_warmup(self._poly_decay(),
warmup_steps=self.warmup_steps,
start_lr=0.0,
end_lr=warm_up_end_lr)
else:
decayed_lr = self._poly_decay()
elif self.lr_policy.lower() == 'piecewise':
if self.warm_up:
assert self.warmup_steps < self.decay_epoch[0] * self.step_per_epoch
warm_up_end_lr = self.base_lr
print('piecewise warm_up_end_lr:', warm_up_end_lr)
decayed_lr = fluid.layers.linear_lr_warmup(self._piecewise_decay(),
warmup_steps=self.warmup_steps,
start_lr=0.0,
end_lr=warm_up_end_lr)
else:
decayed_lr = self._piecewise_decay()
elif self.lr_policy.lower() == 'cosine':
if self.warm_up:
warm_up_end_lr = self.base_lr*0.5*(math.cos(self.warmup_epoch*math.pi/self.epoch_nums)+1)
print('cosine warm_up_end_lr:', warm_up_end_lr)
decayed_lr = fluid.layers.linear_lr_warmup(self._cosine_decay(),
warmup_steps=self.warmup_steps,
start_lr=0.0,
end_lr=warm_up_end_lr)
else:
decayed_lr = self._cosine_decay()
else:
raise Exception(
"unsupport learning decay policy! only support poly,piecewise,cosine"
)
return decayed_lr
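# A minimal sketch (an assumption, not part of the original file) of handing the scheduled
# learning rate to a fluid optimizer inside a static-graph program; the momentum and
# weight-decay values below are illustrative only.
def momentum_optimizer_example(loss):
    lr_scheduler = Lr(lr_policy='poly', base_lr=0.003, epoch_nums=350, step_per_epoch=180,
                      warm_up=True, warmup_epoch=5)
    decayed_lr = lr_scheduler.get_lr()
    optimizer = fluid.optimizer.Momentum(
        learning_rate=decayed_lr,
        momentum=0.9,
        regularization=fluid.regularizer.L2Decay(1e-4))
    optimizer.minimize(loss)
    return optimizer, decayed_lr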
if __name__ == '__main__':
epoch_nums = 200
step_per_epoch = 180
base_lr = 0.003
warmup_epoch = 5 # number of warm-up epochs
lr_scheduler = Lr(lr_policy='poly', base_lr=base_lr, epoch_nums=epoch_nums, step_per_epoch=step_per_epoch,
warm_up=True, warmup_epoch=warmup_epoch, decay_epoch=[50])
lr = lr_scheduler.get_lr()
exe = fluid.Executor(fluid.CPUPlace())
exe.run(fluid.default_startup_program())
lr_list = []
for epoch in range(epoch_nums):
for i in range(step_per_epoch):
x = exe.run(fluid.default_main_program(),
fetch_list=[lr])
lr_list.append(x[0])
# print(x[0])
# plot the learning rate curve
from matplotlib import pyplot as plt
plt.plot(range(epoch_nums*step_per_epoch), lr_list)
plt.xlabel('step')
plt.ylabel('lr')
plt.show()
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
from utils.base import BaseDataSet
class VOC(BaseDataSet):
"""prepare pascalVOC path_pairs"""
BASE_DIR = 'VOC2012_SBD'
NUM_CLASS = 21
def __init__(self, root='../dataset', split='train', **kwargs):
super(VOC, self).__init__(root, split, **kwargs)
if os.sep == '\\': # windows
root = root.replace('/', '\\')
root = os.path.join(root, self.BASE_DIR)
assert os.path.exists(root), "please download the VOC2012 dataset and put it under the dataset directory"
if split == 'test':
self.image_path = self._get_cityscapes_pairs(root, split)
self.label_path = None # the test split has no label files
else:
self.image_path, self.label_path = self._get_cityscapes_pairs(root, split)
if self.label_path is None:
pass
else:
assert len(self.image_path) == len(self.label_path), "image list and label list must have the same length"
self.print_param()
def print_param(self): # print the current dataset configuration for verification
if self.label_path is None:
print('INFO: dataset_root: {}, split: {}, '
'base_size: {}, crop_size: {}, scale: {}, '
'image_length: {}'.format(self.root, self.split, self.base_size,
self.crop_size, self.scale, len(self.image_path)))
else:
print('INFO: dataset_root: {}, split: {}, '
'base_size: {}, crop_size: {}, scale: {}, '
'image_length: {}, label_length: {}'.format(self.root, self.split, self.base_size,
self.crop_size, self.scale, len(self.image_path),
len(self.label_path)))
@staticmethod
def _get_cityscapes_pairs(root, split):
def get_pairs(root, file):
if file.find('test') == -1:
file = os.path.join(root, file)
with open(file, 'r') as f:
file_list = f.readlines()
if os.sep == '\\': # for windows
image_path = [
os.path.join(root, 'pascal', 'VOC2012', x.split()[0][1:].replace('/', '\\').replace('\n', ''))
for x in file_list]
label_path = [os.path.join(root, 'pascal', 'VOC2012', x.split()[1][1:].replace('/', '\\')) for x in
file_list]
else:
image_path = [os.path.join(root, 'pascal', 'VOC2012', x.split()[0][1:]) for x in file_list]
label_path = [os.path.join(root, 'pascal', 'VOC2012', x.split()[1][1:]) for x in file_list]
return image_path, label_path
else:
file = os.path.join(root, file)
with open(file, 'r') as f:
file_list = f.readlines()
if os.sep == '\\': # for windows
image_path = [
os.path.join(root, 'pascal', 'VOC2012', x.split()[0][1:].replace('/', '\\').replace('\n', ''))
for x in file_list]
else:
image_path = [os.path.join(root, 'pascal', 'VOC2012', x.split()[0][1:]) for x in file_list]
return image_path
if split == 'train':
image_path, label_path = get_pairs(root, 'list/train_aug.txt')
elif split == 'val':
image_path, label_path = get_pairs(root, 'list/val.txt')
elif split == 'test':
image_path = get_pairs(root, 'list/test.txt') # returns file paths; the test labels do not exist
return image_path
else: # 'train_val'
image_path, label_path = get_pairs(root, 'list/trainval_aug.txt')
return image_path, label_path
def get_path_pairs(self):
return self.image_path, self.label_path
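# Note (inferred from the parsing in _get_cityscapes_pairs above, not a documented format):
# each line of the list files, e.g. list/train_aug.txt, is expected to hold two root-relative
# paths separated by whitespace, such as
#   /JPEGImages/2007_000032.jpg /SegmentationClassAug/2007_000032.png
# the leading '/' is stripped via x.split()[k][1:] before joining with root/pascal/VOC2012.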
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import random
import paddle
import numpy as np
from PIL import Image
from utils.voc import VOC
__all__ = ['voc_train', 'voc_val', 'voc_train_val', 'voc_test']
# globals
data_mean = np.array([0.485, 0.456, 0.406]).reshape(3, 1, 1)
data_std = np.array([0.229, 0.224, 0.225]).reshape(3, 1, 1)
def mapper_train(sample):
image_path, label_path, voc = sample
image = Image.open(image_path, mode='r').convert('RGB')
label = Image.open(label_path, mode='r')
image, label = voc.sync_transform(image, label)
image_array = np.array(image) # HWC
label_array = np.array(label) # HW
image_array = image_array.transpose((2, 0, 1)) # CHW
image_array = image_array / 255.0
image_array = (image_array - data_mean) / data_std
image_array = image_array.astype('float32')
label_array = label_array.astype('int64')
return image_array, label_array
def mapper_val(sample):
image_path, label_path, city = sample
image = Image.open(image_path, mode='r').convert('RGB')
label = Image.open(label_path, mode='r')
image, label = city.sync_val_transform(image, label)
image_array = np.array(image)
label_array = np.array(label)
image_array = image_array.transpose((2, 0, 1))
image_array = image_array / 255.0
image_array = (image_array - data_mean) / data_std
image_array = image_array.astype('float32')
label_array = label_array.astype('int64')
return image_array, label_array
def mapper_test(sample):
image_path, label_path = sample # label is path
image = Image.open(image_path, mode='r').convert('RGB')
image_array = image
return image_array, label_path # label is path
# complete; remember to pass root, base_size, crop_size, etc. when calling; gpu_num must be set, otherwise some GPUs may get no data when syncBN is used
def voc_train(data_root='../dataset', base_size=768, crop_size=576, scale=True, xmap=True, batch_size=1, gpu_num=1):
voc = VOC(root=data_root, split='train', base_size=base_size, crop_size=crop_size, scale=scale)
image_path, label_path = voc.get_path_pairs()
def reader():
if len(image_path) % (batch_size * gpu_num) != 0:
length = (len(image_path) // (batch_size * gpu_num)) * (batch_size * gpu_num)
else:
length = len(image_path)
for i in range(length):
if i == 0:
cc = list(zip(image_path, label_path))
random.shuffle(cc)
image_path[:], label_path[:] = zip(*cc)
yield image_path[i], label_path[i], voc
if xmap:
return paddle.reader.xmap_readers(mapper_train, reader, 4, 32)
else:
return paddle.reader.map_readers(mapper_train, reader)
def voc_val(data_root='../dataset', base_size=768, crop_size=576, scale=True, xmap=True):
voc = VOC(root=data_root, split='val', base_size=base_size, crop_size=crop_size, scale=scale)
image_path, label_path = voc.get_path_pairs()
def reader():
for i in range(len(image_path)):
yield image_path[i], label_path[i], voc
if xmap:
return paddle.reader.xmap_readers(mapper_val, reader, 4, 32)
else:
return paddle.reader.map_readers(mapper_val, reader)
def voc_train_val(data_root='./dataset', base_size=768, crop_size=576, scale=True, xmap=True, batch_size=1, gpu_num=1):
voc = VOC(root=data_root, split='train_val', base_size=base_size, crop_size=crop_size, scale=scale)
image_path, label_path = voc.get_path_pairs()
def reader():
if len(image_path) % (batch_size * gpu_num) != 0:
length = (len(image_path) // (batch_size * gpu_num)) * (batch_size * gpu_num)
else:
length = len(image_path)
for i in range(length):
if i == 0:
cc = list(zip(image_path, label_path))
random.shuffle(cc)
image_path[:], label_path[:] = zip(*cc)
yield image_path[i], label_path[i], voc
if xmap:
return paddle.reader.xmap_readers(mapper_train, reader, 4, 32)
else:
return paddle.reader.map_readers(mapper_train, reader)
def voc_test(split='test', base_size=2048, crop_size=1024, scale=True, xmap=True):
# base_size, crop_size and scale are not actually used here
voc = VOC(split=split, base_size=base_size, crop_size=crop_size, scale=scale)
image_path, _ = voc.get_path_pairs() # the test split has no labels; only image paths are used
def reader():
for i in range(len(image_path[:1])):  # note: only the first test image is yielded here (looks like a debug limit); widen the slice for the full test set
yield image_path[i], image_path[i]
if xmap:
return paddle.reader.xmap_readers(mapper_test, reader, 4, 32)
else:
return paddle.reader.map_readers(mapper_test, reader)
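# A minimal usage sketch (an assumption, not part of the original file): the test reader yields
# (PIL.Image, image_path) pairs via mapper_test, so any normalization is left to the caller.
if __name__ == '__main__':
    test_reader = voc_test(split='test')
    for img, path in test_reader():
        arr = (np.array(img).transpose((2, 0, 1)) / 255.0 - data_mean) / data_std
        print(arr.shape, path)
        break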