diff --git a/PaddleCV/Research/danet/README.md b/PaddleCV/Research/danet/README.md
deleted file mode 100644
index 02348ae2d2c06a548ac6ad9987d398eb83dce49d..0000000000000000000000000000000000000000
--- a/PaddleCV/Research/danet/README.md
+++ /dev/null
@@ -1,155 +0,0 @@
-# [Dual Attention Network for Scene Segmentation (CVPR2019)](https://arxiv.org/pdf/1809.02983.pdf)
-
-This project is a PaddlePaddle (>=1.5.2) implementation of [DANet](https://arxiv.org/pdf/1809.02983.pdf), covering model training and evaluation.
-
-## Model Overview
-![net](img/Network.png)
-The backbone is ResNet. To make it better suited to semantic segmentation, the authors modify ResNet as follows:
-
-  1. Downsampling is removed from the last two layers, so the feature map is 1/8 of the input size and keeps a higher spatial resolution.
-
-  2. The last two layers use dilated convolution to enlarge the receptive field.
-Two parallel attention modules (position attention and channel attention) follow the backbone. Their outputs are combined by element-wise addition, and a final convolution layer produces the segmentation map.
-
-### Position Attention
-
-![position](img/position.png)
-
-A is the feature map produced by passing the ResNet backbone output through one convolution layer; its shape is C×H×W.
-A passes through three convolutions to produce B, C and D, each of shape C×H×W. B, C and D are then reshaped to C×N (N = H*W).
-The reshaped B is transposed and multiplied with C, giving an N×N matrix, to which softmax is applied.
-D is then multiplied with the softmax result, reshaped back to C×H×W, and added element-wise to A.
-
-### Channel Attention
-![channel](img/channel.png)
-
-
-A is the feature map produced by passing the ResNet backbone output through one convolution layer; its shape is C×H×W.
-A is reshaped three times to produce B, C and D, each of shape C×N (N = H*W).
-B is multiplied with the transpose of C, giving a C×C matrix, to which softmax is applied.
-The softmax result is then multiplied with D, reshaped back to C×H×W, and added element-wise to A.
-
-
-
-## Data Preparation
-
-Public dataset: Cityscapes
-
-The training set has 2975 images, the validation set 500 and the test set 1525. All images have a resolution of 1024*2048.
-
-Dataset source: [download](https://aistudio.baidu.com/aistudio/datasetDetail/11503) it from the AIstudio dataset page. Unzip cityscapes.zip into the dataset folder and unzip train.zip into cityscapes/leftImg8bit. The directory structure is as follows:
-```text
-dataset
-  ├── cityscapes               # Cityscapes dataset
-         ├── gtFine            # finely annotated labels
-         ├── leftImg8bit       # training, validation and test images
-         ├── trainLabels.txt   # training image paths
-         ├── valLabels.txt     # validation image paths
-         ...      ...
-```
-## Training Notes
-
-#### Data augmentation
-  1. Random scaling: scale range 0.75 to 2.0
-  2. Random horizontal flip: probability 0.5
-  3. Aspect-preserving resize: the target size is determined by option 1.
-  4. Random crop
-  5. Gaussian blur: probability 0.3 (optional)
-  6. Color jitter (contrast, sharpness, brightness): probability 0.3 (optional)
-###### Options 1, 2, 3, 4, 5 and 6 are all enabled by default
-
-#### Learning rate schedule
-  1. Warm-up: the learning rate increases from 0 to base_lr over 5 warm-up epochs
-  2. After warm-up, poly decay: the learning rate decreases from base_lr to 0
-
-#### Optimizer
-  Momentum: momentum 0.9, regularization coefficient 1e-4
-
-#### Loading the pretrained model
-  Set --load_pretrained_model (default: False)
-  Pretrained files:
-    checkpoint/DANet50_pretrained_model_paddle1.6.pdparams
-    checkpoint/DANet101_pretrained_model_paddle1.6.pdparams
-
-#### Loading the trained model
-  Set --load_better_model (default: False)
-  Trained file:
-    checkpoint/DANet101_better_model_paddle1.6.pdparams
-##### [Note]
-  Training was done with paddle 1.5.2. The code has since been converted to 1.6 (and is compatible with 1.6); the pretrained and trained parameters come from 1.5.2.
-
-#### Model file paths
-[Download the pretrained parameters and the best model parameters](https://paddlemodels.bj.bcebos.com/DANet/DANet_models.tar)
-
-The directory structure is as follows:
-```text
-checkpoint
-  ├── DANet50_pretrained_model_paddle1.6.pdparams    # DANet50 pretrained model, requires paddle >= 1.6.0
-  ├── DANet101_pretrained_model_paddle1.6.pdparams   # DANet101 pretrained model, requires paddle >= 1.6.0
-  ├── DANet101_better_model_paddle1.6.pdparams       # DANet101 best trained model, requires paddle >= 1.6.0
-  ├── DANet101_better_model_paddle1.5.2              # DANet101 best model trained with 1.5.2, requires paddle >= 1.5.2
-
-```
-
-## Training
-
-```sh
-cd danet
-export PYTHONPATH=`pwd`:$PYTHONPATH
-# open garbage collection to save memory
-export FLAGS_eager_delete_tensor_gb=0.0
-# setting visible devices for train
-export CUDA_VISIBLE_DEVICES=0,1,2,3
-```
-
-To train in executor mode, run:
-```sh
-python train_executor.py --backbone resnet101 --base_size 1024 --crop_size 768 --epoch_num 350 --batch_size 2 --lr 0.003 --lr_scheduler poly --warm_up --warmup_epoch 2 --cuda --use_data_parallel --load_pretrained_model --save_model checkpoint/DANet101_better_model_paddle1.5.2 --multi_scales --flip --dilated --multi_grid --scale --multi_dilation 4 8 16
-```
-Parameters: ResNet101 backbone, base training image size 1024, crop size 768, 350 training epochs, batch size 2,
-learning rate 0.003, poly learning rate decay, learning rate warm-up with 2 warm-up epochs, GPU, data parallelism, load the pretrained model, set the model path, multi-scale testing, horizontal-flip testing, dilated convolution, multi_grid, multi_dilation set to 4 8 16, multi-scale training
-##### On Windows, remove --use_data_parallel for training
-#### Or
-To train in dygraph mode, run:
-```sh
-python train_dygraph.py --backbone resnet101 --base_size 1024 --crop_size 768 --epoch_num 350 --batch_size 2 --lr 0.003 --lr_scheduler poly --cuda --use_data_parallel --load_pretrained_model --save_model checkpoint/DANet101_better_model_paddle1.6 --multi_scales --flip --dilated --multi_grid --scale --multi_dilation 4 8 16
-```
-Parameters: ResNet101 backbone, base training image size 1024, crop size 768, 350 training epochs, batch size 2, learning rate 0.003, poly learning rate decay, GPU, data parallelism, load the pretrained model, set the model path, multi-scale testing, horizontal-flip testing, dilated convolution, multi_grid, multi_dilation set to 4 8 16, multi-scale training
-
-#### [Note]
-##### train_executor.py trains in executor mode (for paddle >= 1.5.2) and train_dygraph.py trains in dynamic graph mode (for paddle >= 1.6.0); either works
-##### Dynamic graph training does not support learning rate warm-up for now
-
-#### The validation results printed during training are not accurate; use eval.py to obtain the final validation results.
-
-## Model Evaluation
-```sh
-# open garbage collection to save memory
-export FLAGS_eager_delete_tensor_gb=0.0
-# setting visible devices for prediction
-export CUDA_VISIBLE_DEVICES=0
-
-python eval.py --backbone resnet101 --base_size 2048 --crop_size 1024 --cuda --use_data_parallel --load_better_model --save_model checkpoint/DANet101_better_model_paddle1.6 --multi_scales --flip --dilated --multi_grid --multi_dilation 4 8 16
-```
-##### To evaluate parameters trained in executor mode under dygraph mode, add --change_executor_to_dygraph to the command line
-
-## Evaluation Results
-Metric: mean IoU (mean intersection over union)
-
-
-| Model | Single-scale | Multi-scale |
-| :---:|:---:| :---:|
-|DANet101|0.8043836|0.8138021|
-
-##### Per-class values
-| Model | cls1 | cls2 | cls3 | cls4 | cls5 | cls6 | cls7 | cls8 | cls9 | cls10 | cls11 | cls12 | cls13 | cls14 | cls15 | cls16 | cls17 | cls18 | cls19 |
-| :---:|:---: | :---:| :---:|:---: | :---:| :---:|:---: | :---:| :---:|:---: |:---: |:---: |:---: | :---: | :---: |:---: | :---:| :---: |:---: |
-|DANet101-SS|0.98212|0.85372|0.92799|0.59976|0.63318|0.65819|0.72023|0.80000|0.92605|0.65788|0.94841|0.83377|0.65206|0.95566|0.87148|0.91233|0.84352|0.71948|0.78737|
-|DANet101-MS|0.98047|0.84637|0.93084|0.62699|0.64839|0.67769|0.73650|0.81343|0.92942|0.67010|0.95127|0.84466|0.66635|0.95749|0.87755|0.92370|0.85344|0.73007|0.79742|
-
-## Output Visualization
-![val_1](img/val_1.png)
-###### Input image
-![val_gt](img/val_gt.png)
-###### Image label
-![val_output](img/val_output.png)
-###### DANet101 model output
diff --git a/PaddleCV/Research/danet/checkpoint/.gitkeep b/PaddleCV/Research/danet/checkpoint/.gitkeep
deleted file mode 100644
index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..0000000000000000000000000000000000000000
diff --git a/PaddleCV/Research/danet/danet.py b/PaddleCV/Research/danet/danet.py
deleted file mode 100644
index 566a13e5cb7c9079de704db86647bcf2a5cabf1b..0000000000000000000000000000000000000000
--- a/PaddleCV/Research/danet/danet.py
+++ /dev/null
@@ -1,641 +0,0 @@
-# -*- coding: utf-8 -*-
-# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -import shutil -import paddle.fluid as fluid -import os - - -__all__ = ['DANet'] - - -class ConvBN(fluid.dygraph.Layer): - - def __init__(self, - name_scope, - num_filters, - filter_size=3, - stride=1, - dilation=1, - act=None, - learning_rate=1.0, - dtype='float32', - bias_attr=False): - super(ConvBN, self).__init__(name_scope) - - if dilation != 1: - padding = dilation - else: - padding = (filter_size - 1) // 2 - - self._conv = fluid.dygraph.Conv2D(name_scope, - num_filters=num_filters, - filter_size=filter_size, - stride=stride, - padding=padding, - dilation=dilation, - act=None, - dtype=dtype, - bias_attr=bias_attr if bias_attr is False else fluid.ParamAttr( - learning_rate=learning_rate, - name='bias'), - param_attr=fluid.ParamAttr( - learning_rate=learning_rate, - name='weight') - ) - self._bn = fluid.dygraph.BatchNorm(name_scope, - num_channels=num_filters, - act=act, - dtype=dtype, - momentum=0.9, - epsilon=1e-5, - bias_attr=fluid.ParamAttr( - learning_rate=learning_rate, - name='bias'), - param_attr=fluid.ParamAttr( - learning_rate=learning_rate, - name='weight'), - moving_mean_name='running_mean', - moving_variance_name='running_var' - ) - - def forward(self, inputs): - x = self._conv(inputs) - x = self._bn(x) - return x - - -class BasicBlock(fluid.dygraph.Layer): - - def __init__(self, - name_scope, - num_filters, - stride=1, - dilation=1, - same=False): - super(BasicBlock, self).__init__(name_scope) - self._conv0 = ConvBN(self.full_name(), - num_filters=num_filters, - filter_size=3, - stride=stride, - dilation=dilation, - act='relu') - self._conv1 = ConvBN(self.full_name(), - num_filters=num_filters, - filter_size=3, - stride=1, - dilation=dilation, - act=None) - - self.same = same - - if not same: - self._skip = ConvBN(self.full_name(), - num_filters=num_filters, - filter_size=1, - stride=stride, - act=None) - - def forward(self, inputs): - 
x = self._conv0(inputs) - x = self._conv1(x) - if self.same: - skip = inputs - else: - skip = self._skip(inputs) - x = fluid.layers.elementwise_add(x, skip, act='relu') - return x - - -class BottleneckBlock(fluid.dygraph.Layer): - def __init__(self, name_scope, num_filters, stride, dilation=1, same=False): - super(BottleneckBlock, self).__init__(name_scope) - self.expansion = 4 - - self._conv0 = ConvBN(name_scope, - num_filters=num_filters, - filter_size=1, - stride=1, - act='relu') - self._conv1 = ConvBN(name_scope, - num_filters=num_filters, - filter_size=3, - stride=stride, - dilation=dilation, - act='relu') - self._conv2 = ConvBN(name_scope, - num_filters=num_filters * self.expansion, - filter_size=1, - stride=1, - act=None) - self.same = same - - if not same: - self._skip = ConvBN(name_scope, - num_filters=num_filters * self.expansion, - filter_size=1, - stride=stride, - act=None) - - def forward(self, inputs): - x = self._conv0(inputs) - x = self._conv1(x) - x = self._conv2(x) - if self.same: - skip = inputs - else: - skip = self._skip(inputs) - x = fluid.layers.elementwise_add(x, skip, act='relu') - return x - - -class ResNet(fluid.dygraph.Layer): - def __init__(self, - name_scope, - layer=152, - num_class=1000, - dilated=True, - multi_grid=True, - multi_dilation=[4, 8, 16], - need_fc=False): - super(ResNet, self).__init__(name_scope) - - support_layer = [18, 34, 50, 101, 152] - assert layer in support_layer, 'layer({}) not in {}'.format(layer, support_layer) - self.need_fc = need_fc - self.num_filters_list = [64, 128, 256, 512] - if layer == 18: - self.depth = [2, 2, 2, 2] - elif layer == 34: - self.depth = [3, 4, 6, 3] - elif layer == 50: - self.depth = [3, 4, 6, 3] - elif layer == 101: - self.depth = [3, 4, 23, 3] - elif layer == 152: - self.depth = [3, 8, 36, 3] - - if multi_grid: - assert multi_dilation is not None - self.multi_dilation = multi_dilation - - self._conv = ConvBN(name_scope, 64, 7, 2, act='relu') - self._pool = 
fluid.dygraph.Pool2D(name_scope, - pool_size=3, - pool_stride=2, - pool_padding=1, - pool_type='max') - if layer >= 50: - self.layer1 = self._make_layer(block=BottleneckBlock, - depth=self.depth[0], - num_filters=self.num_filters_list[0], - stride=1, - same=False, - name='layer1') - self.layer2 = self._make_layer(block=BottleneckBlock, - depth=self.depth[1], - num_filters=self.num_filters_list[1], - stride=2, - same=False, - name='layer2') - if dilated: - self.layer3 = self._make_layer(block=BottleneckBlock, - depth=self.depth[2], - num_filters=self.num_filters_list[2], - stride=2, - dilation=2, - same=False, - name='layer3') - if multi_grid: # layer4 uses different dilation rates - self.layer4 = self._make_layer(block=BottleneckBlock, - depth=self.depth[3], - num_filters=self.num_filters_list[3], - stride=2, - dilation=4, - multi_grid=multi_grid, - multi_dilation=self.multi_dilation, - same=False, - name='layer4') - else: - self.layer4 = self._make_layer(block=BottleneckBlock, - depth=self.depth[3], - num_filters=self.num_filters_list[3], - stride=2, - dilation=4, - same=False, - name='layer4') - else: - self.layer3 = self._make_layer(block=BottleneckBlock, - depth=self.depth[2], - num_filters=self.num_filters_list[2], - stride=2, - dilation=1, - same=False, - name='layer3') - self.layer4 = self._make_layer(block=BottleneckBlock, - depth=self.depth[3], - num_filters=self.num_filters_list[3], - stride=2, - dilation=1, - same=False, - name='layer4') - - else: # layer=18 or layer=34 - self.layer1 = self._make_layer(block=BasicBlock, - depth=self.depth[0], - num_filters=self.num_filters_list[0], - stride=1, - same=True, - name=name_scope) - self.layer2 = self._make_layer(block=BasicBlock, - depth=self.depth[1], - num_filters=self.num_filters_list[1], - stride=2, - same=False, - name=name_scope) - self.layer3 = self._make_layer(block=BasicBlock, - depth=self.depth[2], - num_filters=self.num_filters_list[2], - stride=2, - dilation=1, - same=False, - name=name_scope) - self.layer4 = 
self._make_layer(block=BasicBlock, - depth=self.depth[3], - num_filters=self.num_filters_list[3], - stride=2, - dilation=1, - same=False, - name=name_scope) - - self._avgpool = fluid.dygraph.Pool2D(name_scope, - global_pooling=True, - pool_type='avg') - self.fc = fluid.dygraph.FC(name_scope, - size=num_class, - act='softmax') - - def _make_layer(self, block, depth, num_filters, stride=1, dilation=1, same=False, multi_grid=False, - multi_dilation=None, name=None): - layers = [] - if dilation != 1: - # stride(2x2) with a dilated convolution instead - stride = 1 - - if multi_grid: - assert len(multi_dilation) == 3 - for depth in range(depth): - temp = block(name + '.{}'.format(depth), - num_filters=num_filters, - stride=stride, - dilation=multi_dilation[depth], - same=same) - stride = 1 - same = True - layers.append(self.add_sublayer('_{}_{}'.format(name, depth + 1), temp)) - else: - for depth in range(depth): - temp = block(name + '.{}'.format(depth), - num_filters=num_filters, - stride=stride, - dilation=dilation if depth > 0 else 1, - same=same) - stride = 1 - same = True - layers.append(self.add_sublayer('_{}_{}'.format(name, depth + 1), temp)) - return layers - - def forward(self, inputs): - x = self._conv(inputs) - - x = self._pool(x) - for layer in self.layer1: - x = layer(x) - c1 = x - - for layer in self.layer2: - x = layer(x) - c2 = x - - for layer in self.layer3: - x = layer(x) - c3 = x - - for layer in self.layer4: - x = layer(x) - c4 = x - - if self.need_fc: - x = self._avgpool(x) - x = self.fc(x) - return x - else: - return c1, c2, c3, c4 - - -class CAM(fluid.dygraph.Layer): - def __init__(self, - name_scope, - in_channels=512, - default_value=0): - """ - channel_attention_module - """ - super(CAM, self).__init__(name_scope) - self.in_channels = in_channels - self.gamma = fluid.layers.create_parameter(shape=[1], - dtype='float32', - is_bias=True, - attr=fluid.ParamAttr( - learning_rate=10.0, - name='cam_gamma'), - 
default_initializer=fluid.initializer.ConstantInitializer( - value=default_value) - ) - - def forward(self, inputs): - batch_size, c, h, w = inputs.shape - out_b = fluid.layers.reshape(inputs, shape=[batch_size, self.in_channels, h * w]) - out_c = fluid.layers.reshape(inputs, shape=[batch_size, self.in_channels, h * w]) - out_c_t = fluid.layers.transpose(out_c, perm=[0, 2, 1]) - mul_bc = fluid.layers.matmul(out_b, out_c_t) - - mul_bc_max = fluid.layers.reduce_max(mul_bc, dim=-1, keep_dim=True) - mul_bc_max = fluid.layers.expand(mul_bc_max, expand_times=[1, 1, c]) - x = fluid.layers.elementwise_sub(mul_bc_max, mul_bc) - - attention = fluid.layers.softmax(x, use_cudnn=True, axis=-1) - - out_d = fluid.layers.reshape(inputs, shape=[batch_size, self.in_channels, h * w]) - attention_mul = fluid.layers.matmul(attention, out_d) - - attention_reshape = fluid.layers.reshape(attention_mul, shape=[batch_size, self.in_channels, h, w]) - gamma_attention = fluid.layers.elementwise_mul(attention_reshape, self.gamma) - out = fluid.layers.elementwise_add(gamma_attention, inputs) - return out - - -class PAM(fluid.dygraph.Layer): - def __init__(self, - name_scope, - in_channels=512, - default_value=0): - """ - position_attention_module - """ - super(PAM, self).__init__(name_scope) - - assert in_channels // 8, 'in_channel // 8 > 0 ' - self.channel_in = in_channels // 8 - self._convB = fluid.dygraph.Conv2D(name_scope, - num_filters=in_channels // 8, - filter_size=1, - bias_attr=fluid.ParamAttr( - learning_rate=10.0, - name='bias'), - param_attr=fluid.ParamAttr( - learning_rate=10.0, - name='weight') - ) - self._convC = fluid.dygraph.Conv2D(name_scope, - num_filters=in_channels // 8, - filter_size=1, - bias_attr=fluid.ParamAttr( - learning_rate=10.0, - name='bias'), - param_attr=fluid.ParamAttr( - learning_rate=10.0, - name='weight') - ) - self._convD = fluid.dygraph.Conv2D(name_scope, - num_filters=in_channels, - filter_size=1, - bias_attr=fluid.ParamAttr( - learning_rate=10.0, - 
name='bias'), - param_attr=fluid.ParamAttr( - learning_rate=10.0, - name='weight') - ) - self.gamma = fluid.layers.create_parameter(shape=[1], - dtype='float32', - is_bias=True, - attr=fluid.ParamAttr( - learning_rate=10.0, - name='pam_gamma'), - default_initializer=fluid.initializer.ConstantInitializer( - value=default_value)) - - def forward(self, inputs): - batch_size, c, h, w = inputs.shape - out_b = self._convB(inputs) - out_b_reshape = fluid.layers.reshape(out_b, shape=[batch_size, self.channel_in, h * w]) - out_b_reshape_t = fluid.layers.transpose(out_b_reshape, perm=[0, 2, 1]) - out_c = self._convC(inputs) - out_c_reshape = fluid.layers.reshape(out_c, shape=[batch_size, self.channel_in, h * w]) - - mul_bc = fluid.layers.matmul(out_b_reshape_t, out_c_reshape) - soft_max_bc = fluid.layers.softmax(mul_bc, use_cudnn=True, axis=-1) - - out_d = self._convD(inputs) - out_d_reshape = fluid.layers.reshape(out_d, shape=[batch_size, self.channel_in * 8, h * w]) - attention = fluid.layers.matmul(out_d_reshape, fluid.layers.transpose(soft_max_bc, perm=[0, 2, 1])) - attention = fluid.layers.reshape(attention, shape=[batch_size, self.channel_in * 8, h, w]) - - gamma_attention = fluid.layers.elementwise_mul(attention, self.gamma) - out = fluid.layers.elementwise_add(gamma_attention, inputs) - return out - - -class DAHead(fluid.dygraph.Layer): - def __init__(self, - name_scope, - in_channels, - out_channels, - batch_size): - super(DAHead, self).__init__(name_scope) - self.in_channel = in_channels // 4 - self.batch_size = batch_size - self._conv_bn_relu0 = ConvBN(name_scope, - num_filters=self.in_channel, - filter_size=3, - stride=1, - act='relu', - learning_rate=10.0, - bias_attr=False) - - self._conv_bn_relu1 = ConvBN(name_scope, - num_filters=self.in_channel, - filter_size=3, - stride=1, - act='relu', - learning_rate=10.0, - bias_attr=False) - - self._pam = PAM('pam', in_channels=self.in_channel, default_value=0.0) - self._cam = CAM('cam', in_channels=self.in_channel, 
default_value=0.0) - - self._conv_bn_relu2 = ConvBN(name_scope, - num_filters=self.in_channel, - filter_size=3, - stride=1, - act='relu', - learning_rate=10.0, - bias_attr=False) - - self._conv_bn_relu3 = ConvBN(name_scope, - num_filters=self.in_channel, - filter_size=3, - stride=1, - act='relu', - learning_rate=10.0, - bias_attr=False) - self._pam_last_conv = fluid.dygraph.Conv2D(name_scope, - num_filters=out_channels, - filter_size=1, - bias_attr=fluid.ParamAttr( - learning_rate=10.0, - name='bias'), - param_attr=fluid.ParamAttr( - learning_rate=10.0, - name='weight') - ) - self._cam_last_conv = fluid.dygraph.Conv2D(name_scope, - num_filters=out_channels, - filter_size=1, - bias_attr=fluid.ParamAttr( - learning_rate=10.0, - name='bias'), - param_attr=fluid.ParamAttr( - learning_rate=10.0, - name='weight') - ) - self._last_conv = fluid.dygraph.Conv2D(name_scope, - num_filters=out_channels, - filter_size=1, - bias_attr=fluid.ParamAttr( - learning_rate=10.0, - name='bias'), - param_attr=fluid.ParamAttr( - learning_rate=10.0, - name='weight') - ) - - def forward(self, inputs): - out = [] - inputs_pam = self._conv_bn_relu0(inputs) - pam = self._pam(inputs_pam) - position = self._conv_bn_relu2(pam) - - batch_size, num_channels = position.shape[:2] - - # dropout2d - ones = fluid.layers.ones(shape=[self.batch_size, num_channels], dtype='float32') - dropout1d_P = fluid.layers.dropout(ones, 0.1, dropout_implementation='upscale_in_train') - out_position_drop2d = fluid.layers.elementwise_mul(position, dropout1d_P, axis=0) - dropout1d_P.stop_gradient = True - - inputs_cam = self._conv_bn_relu1(inputs) - cam = self._cam(inputs_cam) - channel = self._conv_bn_relu3(cam) - - # dropout2d - ones2 = fluid.layers.ones(shape=[self.batch_size, num_channels], dtype='float32') - dropout1d_C = fluid.layers.dropout(ones2, 0.1, dropout_implementation='upscale_in_train') - out_channel_drop2d = fluid.layers.elementwise_mul(channel, dropout1d_C, axis=0) - dropout1d_C.stop_gradient = True - 
position_out = self._pam_last_conv(out_position_drop2d) - channel_out = self._cam_last_conv(out_channel_drop2d) - - feat_sum = fluid.layers.elementwise_add(position, channel, axis=1) - feat_sum_batch_size, feat_sum_num_channels = feat_sum.shape[:2] - - # dropout2d - feat_sum_ones = fluid.layers.ones(shape=[self.batch_size, feat_sum_num_channels], dtype='float32') - dropout1d_sum = fluid.layers.dropout(feat_sum_ones, 0.1, dropout_implementation='upscale_in_train') - dropout2d_feat_sum = fluid.layers.elementwise_mul(feat_sum, dropout1d_sum, axis=0) - dropout1d_sum.stop_gradient = True - feat_sum_out = self._last_conv(dropout2d_feat_sum) - - out.append(feat_sum_out) - out.append(position_out) - out.append(channel_out) - return tuple(out) - - -class DANet(fluid.dygraph.Layer): - def __init__(self, - name_scope, - backbone='resnet50', - num_classes=19, - batch_size=1, - dilated=True, - multi_grid=True, - multi_dilation=[4, 8, 16]): - super(DANet, self).__init__(name_scope) - if backbone == 'resnet50': - print('backbone resnet50, dilated={}, multi_grid={}, ' - 'multi_dilation={}'.format(dilated, multi_grid, multi_dilation)) - self._backone = ResNet('resnet50', layer=50, dilated=dilated, - multi_grid=multi_grid, multi_dilation=multi_dilation) - elif backbone == 'resnet101': - print('backbone resnet101, dilated={}, multi_grid={}, ' - 'multi_dilation={}'.format(dilated, multi_grid, multi_dilation)) - self._backone = ResNet('resnet101', layer=101, dilated=dilated, - multi_grid=multi_grid, multi_dilation=multi_dilation) - elif backbone == 'resnet152': - print('backbone resnet152, dilated={}, multi_grid={}, ' - 'multi_dilation={}'.format(dilated, multi_grid, multi_dilation)) - self._backone = ResNet('resnet152', layer=152, dilated=dilated, - multi_grid=multi_grid, multi_dilation=multi_dilation) - else: - raise ValueError('unknown backbone: {}'.format(backbone)) - - self._head = DAHead('DA_head', in_channels=2048, out_channels=num_classes, batch_size=batch_size) - - def 
forward(self, inputs): - h, w = inputs.shape[2:] - _, _, c3, c4 = self._backone(inputs) - x1, x2, x3 = self._head(c4) - out = [] - out1 = fluid.layers.resize_bilinear(x1, out_shape=[h, w]) - out2 = fluid.layers.resize_bilinear(x2, out_shape=[h, w]) - out3 = fluid.layers.resize_bilinear(x3, out_shape=[h, w]) - out.append(out1) - out.append(out2) - out.append(out3) - return out - - -def copy_model(path, new_path): - shutil.rmtree(new_path, ignore_errors=True) - shutil.copytree(path, new_path) - model_path = os.path.join(new_path, '__model__') - if os.path.exists(model_path): - os.remove(model_path) - - -if __name__ == '__main__': - import numpy as np - - with fluid.dygraph.guard(fluid.CPUPlace()): - x = np.random.randn(2, 3, 224, 224).astype('float32') - x = fluid.dygraph.to_variable(x) - model = DANet('test', backbone='resnet101', num_classes=19, batch_size=2) - y = model(x) - print(y[0].shape) diff --git a/PaddleCV/Research/danet/dataset/.gitkeep b/PaddleCV/Research/danet/dataset/.gitkeep deleted file mode 100644 index 8b137891791fe96927ad78e64b0aad7bded08bdc..0000000000000000000000000000000000000000 --- a/PaddleCV/Research/danet/dataset/.gitkeep +++ /dev/null @@ -1 +0,0 @@ - diff --git a/PaddleCV/Research/danet/eval.py b/PaddleCV/Research/danet/eval.py deleted file mode 100644 index 46c825fabb71e8fee5834d2c09d5a2332833e007..0000000000000000000000000000000000000000 --- a/PaddleCV/Research/danet/eval.py +++ /dev/null @@ -1,410 +0,0 @@ -# -*- coding: utf-8 -*- -# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
-# See the License for the specific language governing permissions and -# limitations under the License. -from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -import os - -os.environ['FLAGS_eager_delete_tensor_gb'] = "0.0" -os.environ['FLAGS_fraction_of_gpu_memory_to_use'] = "0.99" - -import paddle.fluid as fluid -import paddle -import logging -import math -import numpy as np -import shutil -import os - -from PIL import ImageOps, Image, ImageEnhance, ImageFilter -from datetime import datetime - -from danet import DANet -from options import Options -from utils.cityscapes_data import cityscapes_train -from utils.cityscapes_data import cityscapes_val -from utils.cityscapes_data import cityscapes_test -from utils.lr_scheduler import Lr -from iou import IOUMetric - -# globals -data_mean = np.array([0.485, 0.456, 0.406]).reshape(3, 1, 1) -data_std = np.array([0.229, 0.224, 0.225]).reshape(3, 1, 1) - - -def pad_single_image(image, crop_size): - w, h = image.size - pad_h = crop_size - h if h < crop_size else 0 - pad_w = crop_size - w if w < crop_size else 0 - image = ImageOps.expand(image, border=(0, 0, pad_w, pad_h), fill=0) - assert (image.size[0] >= crop_size and image.size[1] >= crop_size) - return image - - -def crop_image(image, h0, w0, h1, w1): - return image.crop((w0, h0, w1, h1)) - - -def flip_left_right_image(image): - return image.transpose(Image.FLIP_LEFT_RIGHT) - - -def resize_image(image, out_h, out_w, mode=Image.BILINEAR): - return image.resize((out_w, out_h), mode) - - -def mapper_image(image): - image_array = np.array(image) - image_array = image_array.transpose((2, 0, 1)) - image_array = image_array / 255.0 - image_array = (image_array - data_mean) / data_std - image_array = image_array.astype('float32') - image_array = image_array[np.newaxis, :] - return image_array - - -def get_model(args): - model = DANet('DANet', - backbone=args.backbone, - num_classes=args.num_classes, - batch_size=1, - 
dilated=args.dilated, - multi_grid=args.multi_grid, - multi_dilation=args.multi_dilation) - return model - - -def copy_model(path, new_path): - shutil.rmtree(new_path, ignore_errors=True) - shutil.copytree(path, new_path) - model_path = os.path.join(new_path, '__model__') - if os.path.exists(model_path): - os.remove(model_path) - - -def mean_iou(pred, label, num_classes=19): - label = fluid.layers.elementwise_min(fluid.layers.cast(label, np.int32), - fluid.layers.assign(np.array([num_classes], dtype=np.int32))) - label_ig = (label == num_classes).astype('int32') - label_ng = (label != num_classes).astype('int32') - pred = fluid.layers.cast(fluid.layers.argmax(pred, axis=1), 'int32') - pred = pred * label_ng + label_ig * num_classes - miou, wrong, correct = fluid.layers.mean_iou(pred, label, num_classes + 1) - label.stop_gradient = True - return miou, wrong, correct - - -def change_model_executor_to_dygraph(args): - temp_image = fluid.layers.data(name='temp_image', shape=[3, 224, 224], dtype='float32') - model = get_model(args) - y = model(temp_image) - if args.cuda: - gpu_id = int(os.environ.get('FLAGS_selected_gpus', 0)) - place = fluid.CUDAPlace(gpu_id) if args.cuda else fluid.CPUPlace() - exe = fluid.Executor(place) - exe.run(fluid.default_startup_program()) - model_path = args.save_model - assert os.path.exists(model_path), "Please check whether the executor model file address {} exists. 
" \ - "Note: the executor model file is multiple files.".format(model_path) - fluid.io.load_persistables(exe, model_path, fluid.default_main_program()) - print('load executor train model successful, start change!') - param_list = fluid.default_main_program().block(0).all_parameters() - param_name_list = [p.name for p in param_list] - temp_dict = {} - for name in param_name_list: - tensor = fluid.global_scope().find_var(name).get_tensor() - npt = np.asarray(tensor) - temp_dict[name] = npt - del model - with fluid.dygraph.guard(): - x = np.random.randn(1, 3, 224, 224).astype('float32') - x = fluid.dygraph.to_variable(x) - model = get_model(args) - y = model(x) - new_param_dict = {} - for k, v in temp_dict.items(): - value = v - value_shape = value.shape - name = k - tensor = fluid.layers.create_parameter(shape=value_shape, - name=name, - dtype='float32', - default_initializer=fluid.initializer.NumpyArrayInitializer(value)) - new_param_dict[name] = tensor - assert len(new_param_dict) == len( - model.state_dict()), "The number of parameters is not equal. Loading parameters failed, " \ - "Please check whether the model is consistent!" 
- model.set_dict(new_param_dict) - fluid.save_dygraph(model.state_dict(), model_path) - del model - del temp_dict - print('change executor model to dygraph successful!') - - -def eval(args): - if args.change_executor_to_dygraph: - change_model_executor_to_dygraph(args) - with fluid.dygraph.guard(): - num_classes = args.num_classes - base_size = args.base_size - crop_size = args.crop_size - multi_scales = args.multi_scales - flip = args.flip - - if not multi_scales: - scales = [1.0] - else: - # scales = [0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0, 2.2] - scales = [0.5, 0.75, 1.0, 1.25, 1.35, 1.5, 1.75, 2.0, 2.2] # It might work better - - if len(scales) == 1: # single scale - # stride_rate = 2.0 / 3.0 - stride_rate = 1.0 / 2.0 # It might work better - else: - stride_rate = 1.0 / 2.0 - stride = int(crop_size * stride_rate) # slid stride - - model = get_model(args) - x = np.random.randn(1, 3, 224, 224).astype('float32') - x = fluid.dygraph.to_variable(x) - y = model(x) - iou = IOUMetric(num_classes) - model_path = args.save_model - # load_better_model - if paddle.__version__ == '1.5.2' and args.load_better_model: - assert os.path.exists(model_path), "your input save_model: {} ,but '{}' is not exists".format( - model_path, model_path) - print('better model exist!') - new_model_path = 'dygraph/' + model_path - copy_model(model_path, new_model_path) - model_param, _ = fluid.dygraph.load_persistables(new_model_path) - model.load_dict(model_param) - elif args.load_better_model: - assert os.path.exists(model_path + '.pdparams'), "your input save_model: {} ,but '{}' is not exists".format( - model_path, model_path + '.pdparams') - print('better model exist!') - model_param, _ = fluid.dygraph.load_dygraph(model_path) - model.load_dict(model_param) - else: - raise ValueError('Please set --load_better_model!') - - assert len(model_param) == len( - model.state_dict()), "The number of parameters is not equal. 
Loading parameters failed, " \ - "Please check whether the model is consistent!" - model.eval() - - prev_time = datetime.now() - # reader = cityscapes_test(split='test', base_size=2048, crop_size=1024, scale=True, xmap=True) - reader = cityscapes_test(split='val', base_size=2048, crop_size=1024, scale=True, xmap=True) - - print('MultiEvalModule: base_size {}, crop_size {}'. - format(base_size, crop_size)) - print('scales: {}'.format(scales)) - print('val ing...') - logging.basicConfig(level=logging.INFO, - filename='DANet_{}_eval_dygraph.log'.format(args.backbone), - format='%(asctime)s - %(name)s - %(levelname)s - %(message)s') - logging.info('DANet') - logging.info(args) - palette = pat() - for data in reader(): - image = data[0] - label_path = data[1] # val_label is a picture, test_label is a path - label = Image.open(label_path, mode='r') # val_label is a picture, test_label is a path - save_png_path = label_path.replace('val', '{}_val'.format(args.backbone)).replace('test', '{}_test'.format( - args.backbone)) - label_np = np.array(label) - w, h = image.size # h 1024, w 2048 - scores = np.zeros(shape=[num_classes, h, w], dtype='float32') - for scale in scales: - long_size = int(math.ceil(base_size * scale)) # long_size - if h > w: - height = long_size - width = int(1.0 * w * long_size / h + 0.5) - short_size = width - else: - width = long_size - height = int(1.0 * h * long_size / w + 0.5) - short_size = height - - cur_img = resize_image(image, height, width) - # pad - if long_size <= crop_size: - pad_img = pad_single_image(cur_img, crop_size) - pad_img = mapper_image(pad_img) - pad_img = fluid.dygraph.to_variable(pad_img) - pred1, pred2, pred3 = model(pad_img) - pred1 = pred1.numpy() - outputs = pred1[:, :, :height, :width] - if flip: - pad_img_filp = flip_left_right_image(cur_img) - pad_img_filp = pad_single_image(pad_img_filp, crop_size) # pad - pad_img_filp = mapper_image(pad_img_filp) - pad_img_filp = fluid.dygraph.to_variable(pad_img_filp) - pred1, pred2, 
pred3 = model(pad_img_filp) - pred1 = fluid.layers.reverse(pred1, axis=3) - pred1 = pred1.numpy() - outputs += pred1[:, :, :height, :width] - else: - if short_size < crop_size: - # pad if needed - pad_img = pad_single_image(cur_img, crop_size) - else: - pad_img = cur_img - pw, ph = pad_img.size - assert (ph >= height and pw >= width) - - # slid window - h_grids = int(math.ceil(1.0 * (ph - crop_size) / stride)) + 1 - w_grids = int(math.ceil(1.0 * (pw - crop_size) / stride)) + 1 - outputs = np.zeros(shape=[1, num_classes, ph, pw], dtype='float32') - count_norm = np.zeros(shape=[1, 1, ph, pw], dtype='int32') - for idh in range(h_grids): - for idw in range(w_grids): - h0 = idh * stride - w0 = idw * stride - h1 = min(h0 + crop_size, ph) - w1 = min(w0 + crop_size, pw) - crop_img = crop_image(pad_img, h0, w0, h1, w1) - pad_crop_img = pad_single_image(crop_img, crop_size) - pad_crop_img = mapper_image(pad_crop_img) - pad_crop_img = fluid.dygraph.to_variable(pad_crop_img) - pred1, pred2, pred3 = model(pad_crop_img) # shape [1, num_class, h, w] - pred = pred1.numpy() # channel, h, w - outputs[:, :, h0:h1, w0:w1] += pred[:, :, 0:h1 - h0, 0:w1 - w0] - count_norm[:, :, h0:h1, w0:w1] += 1 - if flip: - pad_img_filp = flip_left_right_image(crop_img) - pad_img_filp = pad_single_image(pad_img_filp, crop_size) # pad - pad_img_array = mapper_image(pad_img_filp) - pad_img_array = fluid.dygraph.to_variable(pad_img_array) - pred1, pred2, pred3 = model(pad_img_array) - pred1 = fluid.layers.reverse(pred1, axis=3) - pred = pred1.numpy() - outputs[:, :, h0:h1, w0:w1] += pred[:, :, 0:h1 - h0, 0:w1 - w0] - count_norm[:, :, h0:h1, w0:w1] += 1 - assert ((count_norm == 0).sum() == 0) - outputs = outputs / count_norm - outputs = outputs[:, :, :height, :width] - outputs = fluid.dygraph.to_variable(outputs) - outputs = fluid.layers.resize_bilinear(outputs, out_shape=[h, w]) - score = outputs.numpy()[0] - scores += score # the sum of all scales, shape: [channel, h, w] - pred = np.argmax(score, 
axis=0).astype('uint8') - picture_path = '{}'.format(save_png_path).replace('.png', '_scale_{}'.format(scale)) - save_png(pred, palette, picture_path) - pred = np.argmax(scores, axis=0).astype('uint8') - picture_path = '{}'.format(save_png_path).replace('.png', '_scores') - save_png(pred, palette, picture_path) - iou.add_batch(pred, label_np) # cal iou - print('eval done!') - logging.info('eval done!') - acc, acc_cls, iu, mean_iu, fwavacc, kappa = iou.evaluate() - print('acc = {}'.format(acc)) - logging.info('acc = {}'.format(acc)) - print('acc_cls = {}'.format(acc_cls)) - logging.info('acc_cls = {}'.format(acc_cls)) - print('iu = {}'.format(iu)) - logging.info('iu = {}'.format(iu)) - print('mean_iou -- 255 = {}'.format(mean_iu)) - logging.info('mean_iou --255 = {}'.format(mean_iu)) - print('mean_iou = {}'.format(np.nanmean(iu[:-1]))) # realy iou - logging.info('mean_iou = {}'.format(np.nanmean(iu[:-1]))) - print('fwavacc = {}'.format(fwavacc)) - logging.info('fwavacc = {}'.format(fwavacc)) - print('kappa = {}'.format(kappa)) - logging.info('kappa = {}'.format(kappa)) - cur_time = datetime.now() - h, remainder = divmod((cur_time - prev_time).seconds, 3600) - m, s = divmod(remainder, 60) - time_str = "Time %02d:%02d:%02d" % (h, m, s) - print('val ' + time_str) - logging.info('val ' + time_str) - - -def save_png(pred_value, palette, name): - if isinstance(pred_value, np.ndarray): - if pred_value.ndim == 3: - batch_size = pred_value.shape[0] - if batch_size == 1: - pred_value = pred_value.squeeze(axis=0) - image = Image.fromarray(pred_value).convert('P') - image.putpalette(palette) - save_path = '{}.png'.format(name) - save_dir = os.path.dirname(save_path) - if not os.path.exists(save_dir): - os.makedirs(save_dir) - image.save(save_path) - else: - for batch_id in range(batch_size): - value = pred_value[batch_id] - image = Image.fromarray(value).convert('P') - image.putpalette(palette) - save_path = '{}.png'.format(name[batch_id]) - save_dir = 
os.path.dirname(save_path) - if not os.path.exists(save_dir): - os.makedirs(save_dir) - image.save(save_path) - elif pred_value.ndim == 2: - image = Image.fromarray(pred_value).convert('P') - image.putpalette(palette) - save_path = '{}.png'.format(name) - save_dir = os.path.dirname(save_path) - if not os.path.exists(save_dir): - os.makedirs(save_dir) - image.save(save_path) - else: - raise ValueError('Only support nd-array') - - -def save_png_test(path): - im = Image.open(path) - im_array = np.array(im).astype('uint8') - save_png(im_array, pat(), 'save_png_test') - - -def pat(): - palette = [] - for i in range(256): - palette.extend((i, i, i)) - palette[:3 * 19] = np.array([[128, 64, 128], - [244, 35, 232], - [70, 70, 70], - [102, 102, 156], - [190, 153, 153], - [153, 153, 153], - [250, 170, 30], - [220, 220, 0], - [107, 142, 35], - [152, 251, 152], - [70, 130, 180], - [220, 20, 60], - [255, 0, 0], - [0, 0, 142], - [0, 0, 70], - [0, 60, 100], - [0, 80, 100], - [0, 0, 230], - [119, 11, 32]], dtype='uint8').flatten() - return palette - - -if __name__ == '__main__': - options = Options() - args = options.parse() - options.print_args() - eval(args) - diff --git a/PaddleCV/Research/danet/img/Network.png b/PaddleCV/Research/danet/img/Network.png deleted file mode 100644 index ac109b403a122a0241cb391c2d17b45ca43cb41b..0000000000000000000000000000000000000000 Binary files a/PaddleCV/Research/danet/img/Network.png and /dev/null differ diff --git a/PaddleCV/Research/danet/img/channel.png b/PaddleCV/Research/danet/img/channel.png deleted file mode 100644 index eae8854c4252dec561f0b71febf5ddf1372b428c..0000000000000000000000000000000000000000 Binary files a/PaddleCV/Research/danet/img/channel.png and /dev/null differ diff --git a/PaddleCV/Research/danet/img/position.png b/PaddleCV/Research/danet/img/position.png deleted file mode 100644 index b46f9e1751783eb338b4554da70696df5e411457..0000000000000000000000000000000000000000 Binary files 
a/PaddleCV/Research/danet/img/position.png and /dev/null differ diff --git a/PaddleCV/Research/danet/img/val_1.png b/PaddleCV/Research/danet/img/val_1.png deleted file mode 100644 index 4f4610d36f3d16ec669a89aaaf6ee71b24982435..0000000000000000000000000000000000000000 Binary files a/PaddleCV/Research/danet/img/val_1.png and /dev/null differ diff --git a/PaddleCV/Research/danet/img/val_gt.png b/PaddleCV/Research/danet/img/val_gt.png deleted file mode 100644 index 5a0d27351a66a0cab3f885f86e42141a7f96b06d..0000000000000000000000000000000000000000 Binary files a/PaddleCV/Research/danet/img/val_gt.png and /dev/null differ diff --git a/PaddleCV/Research/danet/img/val_output.png b/PaddleCV/Research/danet/img/val_output.png deleted file mode 100644 index 3d9ee2191629b8dad656672e99716e1bcb6f720c..0000000000000000000000000000000000000000 Binary files a/PaddleCV/Research/danet/img/val_output.png and /dev/null differ diff --git a/PaddleCV/Research/danet/iou.py b/PaddleCV/Research/danet/iou.py deleted file mode 100644 index 1f560a3041c29f47deb70a7eecbe937d1d096317..0000000000000000000000000000000000000000 --- a/PaddleCV/Research/danet/iou.py +++ /dev/null @@ -1,74 +0,0 @@ -# -*- coding: utf-8 -*- -# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -import numpy as np - - -class IOUMetric(object): - - def __init__(self, num_classes): - self.num_classes = num_classes + 1 - self.hist = np.zeros((num_classes + 1, num_classes + 1)) - - def _fast_hist(self, label_pred, label_true): - mask = (label_true >= 0) & (label_true < self.num_classes) - hist = np.bincount( - self.num_classes * label_true[mask].astype(int) + - label_pred[mask], minlength=self.num_classes ** 2).reshape(self.num_classes, self.num_classes) - return hist - - def add_batch(self, predictions, gts): - # gts = BHW - # predictions = BHW - if isinstance(gts, np.ndarray): - gts_ig = (gts == 255).astype(np.int32) - gts_nig = (gts != 255).astype(np.int32) - # print(predictions) - gts[gts == 255] = self.num_classes - 1 # 19 - predictions = gts_nig * predictions + gts_ig * (self.num_classes - 1) - # print(predictions) - for lp, lt in zip(predictions, gts): - self.hist += self._fast_hist(lp.flatten(), lt.flatten()) - - def evaluate(self): - acc = np.diag(self.hist).sum() / self.hist.sum() - acc_cls = np.nanmean(np.diag(self.hist) / self.hist.sum(axis=1)) - iu = np.diag(self.hist) / (self.hist.sum(axis=1) + self.hist.sum(axis=0) - np.diag(self.hist)) - mean_iu = np.nanmean(iu) - freq = self.hist.sum(axis=1) / self.hist.sum() - fwavacc = (freq[freq > 0] * iu[freq > 0]).sum() - kappa = (self.hist.sum() * np.diag(self.hist).sum() - (self.hist.sum(axis=0) * self.hist.sum(axis=1)).sum()) / ( - self.hist.sum() ** 2 - (self.hist.sum(axis=0) * self.hist.sum(axis=1)).sum()) - return acc, acc_cls, iu, mean_iu, fwavacc, kappa - - def evaluate_kappa(self): - kappa = (self.hist.sum() * np.diag(self.hist).sum() - (self.hist.sum(axis=0) * self.hist.sum(axis=1)).sum()) / ( - self.hist.sum() ** 2 - (self.hist.sum(axis=0) * self.hist.sum(axis=1)).sum()) - return kappa - - def evaluate_iou_kappa(self): - iu = np.diag(self.hist) / (self.hist.sum(axis=1) + 
self.hist.sum(axis=0) - np.diag(self.hist)) - mean_iu = np.nanmean(iu) - kappa = (self.hist.sum() * np.diag(self.hist).sum() - (self.hist.sum(axis=0) * self.hist.sum(axis=1)).sum()) / ( - self.hist.sum() ** 2 - (self.hist.sum(axis=0) * self.hist.sum(axis=1)).sum()) - return mean_iu, kappa - - def evaluate_iu(self): - iu = np.diag(self.hist) / (self.hist.sum(axis=1) + self.hist.sum(axis=0) - np.diag(self.hist)) - return iu - diff --git a/PaddleCV/Research/danet/options.py b/PaddleCV/Research/danet/options.py deleted file mode 100644 index 40f73feef8ae2ee53491c506cba8cb5232e0e4c8..0000000000000000000000000000000000000000 --- a/PaddleCV/Research/danet/options.py +++ /dev/null @@ -1,176 +0,0 @@ -# -*- coding: utf-8 -*- -# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -import os -import argparse - - -class Options(object): - def __init__(self): - parser = argparse.ArgumentParser(description='Paddle DANet Segmentation') - - # model and dataset - parser.add_argument('--model', type=str, default='danet', - help='model name (default: danet)') - parser.add_argument('--backbone', type=str, default='resnet101', - help='backbone name (default: resnet101)') - parser.add_argument('--dataset', type=str, default='cityscapes', - help='dataset name (default: cityscapes)') - parser.add_argument('--num_classes', type=int, default=19, - help='num_classes (default: cityscapes = 19)') - parser.add_argument('--data_folder', type=str, - default='./dataset', - help='training dataset folder (default: ./dataset') - parser.add_argument('--base_size', type=int, default=1024, - help='base image size') - parser.add_argument('--crop_size', type=int, default=768, - help='crop image size') - - # training hyper params - parser.add_argument('--epoch_num', type=int, default=None, metavar='N', - help='number of epochs to train (default: auto)') - parser.add_argument('--start_epoch', type=int, default=0, - metavar='N', help='start epochs (default:0)') - parser.add_argument('--batch_size', type=int, default=None, - metavar='N', help='input batch size for \ - training (default: auto)') - parser.add_argument('--test_batch_size', type=int, default=None, - metavar='N', help='input batch size for \ - testing (default: same as batch size)') - - # optimizer params - parser.add_argument('--lr', type=float, default=None, metavar='LR', - help='learning rate (default: auto)') - parser.add_argument('--lr_scheduler', type=str, default='poly', - help='learning rate scheduler (default: poly)') - parser.add_argument('--lr_pow', type=float, default=0.9, - help='learning rate scheduler (default: 0.9)') - parser.add_argument('--lr_step', type=int, default=None, - help='lr 
step to change lr') - parser.add_argument('--warm_up', action='store_true', default=False, - help='warm_up (default: False)') - parser.add_argument('--warmup_epoch', type=int, default=5, - help='warmup_epoch (default: 5)') - parser.add_argument('--total_step', type=int, default=None, - metavar='N', help='total_step (default: auto)') - parser.add_argument('--step_per_epoch', type=int, default=None, - metavar='N', help='step_per_epoch (default: auto)') - parser.add_argument('--momentum', type=float, default=0.9, - metavar='M', help='momentum (default: 0.9)') - parser.add_argument('--weight_decay', type=float, default=1e-4, - metavar='M', help='w-decay (default: 1e-4)') - - # cuda, seed and logging - parser.add_argument('--cuda', action='store_true', default=False, - help='use CUDA training, (default: False)') - parser.add_argument('--use_data_parallel', action='store_true', default=False, - help='use data_parallel training, (default: False)') - parser.add_argument('--seed', type=int, default=1, metavar='S', - help='random seed (default: 1)') - parser.add_argument('--log_root', type=str, - default='./', help='set a log path folder') - - # checkpoint - parser.add_argument("--save_model", default='checkpoint/DANet101_better_model_paddle1.6', type=str, - help="model path, (default: checkpoint/DANet101_better_model_paddle1.6)") - - # change executor model params to dygraph model params - parser.add_argument("--change_executor_to_dygraph", action='store_true', default=False, - help="change executor model params to dygraph model params (default:False)") - - # finetuning pre-trained models - parser.add_argument("--load_pretrained_model", action='store_true', default=False, - help="load pretrained model (default: False)") - # load better models - parser.add_argument("--load_better_model", action='store_true', default=False, - help="load better model (default: False)") - parser.add_argument('--multi_scales', action='store_true', default=False, - help="testing scale, (default: 
False)") - parser.add_argument('--flip', action='store_true', default=False, - help="testing flip image, (default: False)") - - # multi grid dilation option - parser.add_argument("--dilated", action='store_true', default=False, - help="use dilation policy, (default: False)") - parser.add_argument("--multi_grid", action='store_true', default=False, - help="use multi grid dilation policy, default: False") - parser.add_argument('--multi_dilation', nargs='+', type=int, default=None, - help="multi grid dilation list, (default: None), can use --mutil_dilation 4 8 16") - parser.add_argument('--scale', action='store_true', default=False, - help='choose to use random scale transform(0.75-2.0) for train, (default: False)') - - # the parser - self.parser = parser - - def parse(self): - args = self.parser.parse_args() - # default settings for epochs, batch_size and lr - if args.epoch_num is None: - epoches = { - 'pascal_voc': 180, - 'pascal_aug': 180, - 'pcontext': 180, - 'ade20k': 180, - 'cityscapes': 350, - } - num_class_dict = { - 'pascal_voc': 21, - 'pascal_aug': 21, - 'pcontext': 21, - 'ade20k': None, - 'cityscapes': 19, - } - total_steps = { - 'pascal_voc': 200000, - 'pascal_aug': 500000, - 'pcontext': 500000, - 'ade20k': 500000, - 'cityscapes': 150000, - } - args.epoch_num = epoches[args.dataset.lower()] - args.num_classes = num_class_dict[args.dataset.lower()] - args.total_step = total_steps[args.dataset.lower()] - if args.batch_size is None: - args.batch_size = 2 - if args.test_batch_size is None: - args.test_batch_size = args.batch_size - if args.step_per_epoch is None: - step_per_epoch = { - 'pascal_voc': 185, - 'pascal_aug': 185, - 'pcontext': 185, - 'ade20k': 185, - 'cityscapes': 371, # 2975 // batch_size // GPU_num - } - args.step_per_epoch = step_per_epoch[args.dataset.lower()] - if args.lr is None: - lrs = { - 'pascal_voc': 0.0001, - 'pascal_aug': 0.001, - 'pcontext': 0.001, - 'ade20k': 0.01, - 'cityscapes': 0.003, - } - args.lr = lrs[args.dataset.lower()] / 8 
* args.batch_size - return args - - def print_args(self): - arg_dict = self.parse().__dict__ - for k, v in arg_dict.items(): - print('{:30s}: {}'.format(k, v)) - diff --git a/PaddleCV/Research/danet/train_dygraph.py b/PaddleCV/Research/danet/train_dygraph.py deleted file mode 100644 index df610999e5eaff47aafe3e53b18833b1eb73b576..0000000000000000000000000000000000000000 --- a/PaddleCV/Research/danet/train_dygraph.py +++ /dev/null @@ -1,353 +0,0 @@ -# -*- coding: utf-8 -*- -# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -import os - -os.environ['FLAGS_eager_delete_tensor_gb'] = "0.0" -os.environ['FLAGS_fraction_of_gpu_memory_to_use'] = "0.99" - -import paddle.fluid as fluid -import numpy as np -import random -import paddle -import logging -import shutil -import multiprocessing -import sys -from datetime import datetime -from paddle.utils import Ploter - -from danet import DANet -from options import Options -from utils.cityscapes_data import cityscapes_train -from utils.cityscapes_data import cityscapes_val -from utils.lr_scheduler import Lr -import matplotlib - -matplotlib.use('Agg') - - -def get_model(args): - model = DANet('DANet', - backbone=args.backbone, - num_classes=args.num_classes, - batch_size=args.batch_size, - dilated=args.dilated, - multi_grid=args.multi_grid, - multi_dilation=args.multi_dilation) - return model - - -def _cpu_num(): - if "CPU_NUM" not in os.environ.keys(): - if multiprocessing.cpu_count() > 1: - sys.stderr.write( - '!!! The CPU_NUM is not specified, you should set CPU_NUM in the environment variable list.\n' - 'CPU_NUM indicates that how many CPUPlace are used in the current task.\n' - 'And if this parameter are set as N (equal to the number of physical CPU core) the program may be faster.\n\n' - 'export CPU_NUM={} # for example, set CPU_NUM as number of physical CPU core which is {}.\n\n' - '!!! 
The default number of CPU_NUM=1.\n'.format( - multiprocessing.cpu_count(), multiprocessing.cpu_count())) - os.environ['CPU_NUM'] = str(1) - cpu_num = os.environ.get('CPU_NUM') - return int(cpu_num) - - -def mean_iou(pred, label, num_classes=19): - label = fluid.layers.elementwise_min(fluid.layers.cast(label, np.int32), - fluid.layers.assign(np.array([num_classes], dtype=np.int32))) - label_ig = (label == num_classes).astype('int32') - label_ng = (label != num_classes).astype('int32') - pred = fluid.layers.cast(fluid.layers.argmax(pred, axis=1), 'int32') - pred = pred * label_ng + label_ig * num_classes - miou, wrong, correct = fluid.layers.mean_iou(pred, label, num_classes + 1) - label.stop_gradient = True - return miou, wrong, correct - - -def loss_fn(pred, pred2, pred3, label, num_classes=19): - pred = fluid.layers.transpose(pred, perm=[0, 2, 3, 1]) - pred = fluid.layers.reshape(pred, [-1, num_classes]) - - pred2 = fluid.layers.transpose(pred2, perm=[0, 2, 3, 1]) - pred2 = fluid.layers.reshape(pred2, [-1, num_classes]) - - pred3 = fluid.layers.transpose(pred3, perm=[0, 2, 3, 1]) - pred3 = fluid.layers.reshape(pred3, [-1, num_classes]) - - label = fluid.layers.reshape(label, [-1, 1]) - - pred = fluid.layers.softmax(pred, use_cudnn=False) - loss1 = fluid.layers.cross_entropy(pred, label, ignore_index=255) - - pred2 = fluid.layers.softmax(pred2, use_cudnn=False) - loss2 = fluid.layers.cross_entropy(pred2, label, ignore_index=255) - - pred3 = fluid.layers.softmax(pred3, use_cudnn=False) - loss3 = fluid.layers.cross_entropy(pred3, label, ignore_index=255) - - label.stop_gradient = True - return loss1 + loss2 + loss3 - - -def optimizer_setting(args): - if args.weight_decay is not None: - regular = fluid.regularizer.L2Decay(regularization_coeff=args.weight_decay) - else: - regular = None - if args.lr_scheduler == 'poly': - lr_scheduler = Lr(lr_policy='poly', - base_lr=args.lr, - epoch_nums=args.epoch_num, - step_per_epoch=args.step_per_epoch, - power=args.lr_pow, - 
warm_up=args.warm_up, - warmup_epoch=args.warmup_epoch) - decayed_lr = lr_scheduler.get_lr() - elif args.lr_scheduler == 'cosine': - lr_scheduler = Lr(lr_policy='cosine', - base_lr=args.lr, - epoch_nums=args.epoch_num, - step_per_epoch=args.step_per_epoch, - warm_up=args.warm_up, - warmup_epoch=args.warmup_epoch) - decayed_lr = lr_scheduler.get_lr() - elif args.lr_scheduler == 'piecewise': - lr_scheduler = Lr(lr_policy='piecewise', - base_lr=args.lr, - epoch_nums=args.epoch_num, - step_per_epoch=args.step_per_epoch, - warm_up=args.warm_up, - warmup_epoch=args.warmup_epoch, - decay_epoch=[50, 100, 150], - gamma=0.1) - decayed_lr = lr_scheduler.get_lr() - else: - decayed_lr = args.lr - return fluid.optimizer.MomentumOptimizer(learning_rate=decayed_lr, - momentum=args.momentum, - regularization=regular) - - -def main(args): - batch_size = args.batch_size - num_epochs = args.epoch_num - num_classes = args.num_classes - data_root = args.data_folder - if args.cuda: - num = fluid.core.get_cuda_device_count() - print('The number of GPU: {}'.format(num)) - else: - num = _cpu_num() - print('The number of CPU: {}'.format(num)) - - # program - start_prog = fluid.default_startup_program() - train_prog = fluid.default_main_program() - - start_prog.random_seed = args.seed - train_prog.random_seed = args.seed - np.random.seed(args.seed) - random.seed(args.seed) - - logging.basicConfig(level=logging.INFO, - filename='DANet_{}_train_dygraph.log'.format(args.backbone), - format='%(asctime)s - %(name)s - %(levelname)s - %(message)s') - logging.info('DANet') - logging.info(args) - - if args.cuda: - gpu_id = int(os.environ.get('FLAGS_selected_gpus', 0)) - - place = fluid.CUDAPlace(gpu_id) if args.cuda else fluid.CPUPlace() - train_loss_title = 'Train_loss' - test_loss_title = 'Test_loss' - - train_iou_title = 'Train_mIOU' - test_iou_title = 'Test_mIOU' - - plot_loss = Ploter(train_loss_title, test_loss_title) - plot_iou = Ploter(train_iou_title, test_iou_title) - - with 
fluid.dygraph.guard(place): - - model = get_model(args) - x = np.random.randn(batch_size, 3, 224, 224).astype('float32') - x = fluid.dygraph.to_variable(x) - model(x) - - # load_pretrained_model - if args.load_pretrained_model: - save_dir = args.save_model - assert os.path.exists(save_dir + '.pdparams'), "your input save_model: {} ,but '{}' is not exists".format( - save_dir, save_dir + '.pdparams') - param, _ = fluid.load_dygraph(save_dir) - model.set_dict(param) - assert len(param) == len( - model.state_dict()), "The number of parameters is not equal. Loading parameters failed, " \ - "Please check whether the model is consistent!" - print('load pretrained model!') - - # load_better_model - if args.load_better_model: - save_dir = args.save_model - assert os.path.exists(save_dir + '.pdparams'), "your input save_model: {} ,but '{}' is not exists".format( - save_dir, save_dir + '.pdparams') - param, _ = fluid.load_dygraph(save_dir) - model.set_dict(param) - assert len(param) == len( - model.state_dict()), "The number of parameters is not equal. Loading parameters failed, " \ - "Please check whether the model is consistent!" 
- print('load better model!') - - optimizer = optimizer_setting(args) - train_data = cityscapes_train(data_root=data_root, - base_size=args.base_size, - crop_size=args.crop_size, - scale=args.scale, - xmap=True, - batch_size=batch_size, - gpu_num=num) - batch_train_data = paddle.batch(paddle.reader.shuffle( - train_data, buf_size=batch_size * 64), - batch_size=batch_size, - drop_last=True) - - val_data = cityscapes_val(data_root=data_root, - base_size=args.base_size, - crop_size=args.crop_size, - scale=args.scale, - xmap=True) - batch_test_data = paddle.batch(val_data, - batch_size=batch_size, - drop_last=True) - - train_iou_manager = fluid.metrics.Accuracy() - train_avg_loss_manager = fluid.metrics.Accuracy() - test_iou_manager = fluid.metrics.Accuracy() - test_avg_loss_manager = fluid.metrics.Accuracy() - - better_miou_train = 0 - better_miou_test = 0 - - for epoch in range(num_epochs): - prev_time = datetime.now() - train_avg_loss_manager.reset() - train_iou_manager.reset() - for batch_id, data in enumerate(batch_train_data()): - image = np.array([x[0] for x in data]).astype('float32') - label = np.array([x[1] for x in data]).astype('int64') - - image = fluid.dygraph.to_variable(image) - label = fluid.dygraph.to_variable(label) - label.stop_gradient = True - pred, pred2, pred3 = model(image) - train_loss = loss_fn(pred, pred2, pred3, label, num_classes=num_classes) - train_avg_loss = fluid.layers.mean(train_loss) - miou, wrong, correct = mean_iou(pred, label, num_classes=num_classes) - train_avg_loss.backward() - optimizer.minimize(train_avg_loss) - model.clear_gradients() - train_iou_manager.update(miou.numpy(), weight=int(batch_size * num)) - train_avg_loss_manager.update(train_avg_loss.numpy(), weight=int(batch_size * num)) - batch_train_str = "epoch: {}, batch: {}, train_avg_loss: {:.6f}, " \ - "train_miou: {:.6f}.".format(epoch + 1, - batch_id + 1, - train_avg_loss.numpy()[0], - miou.numpy()[0]) - if batch_id % 100 == 0: - logging.info(batch_train_str) - 
print(batch_train_str) - cur_time = datetime.now() - h, remainder = divmod((cur_time - prev_time).seconds, 3600) - m, s = divmod(remainder, 60) - time_str = " Time %02d:%02d:%02d" % (h, m, s) - train_str = "\nepoch: {}, train_avg_loss: {:.6f}, " \ - "train_miou: {:.6f}.".format(epoch + 1, - train_avg_loss_manager.eval()[0], - train_iou_manager.eval()[0]) - print(train_str + time_str + '\n') - logging.info(train_str + time_str + '\n') - plot_loss.append(train_loss_title, epoch, train_avg_loss_manager.eval()[0]) - plot_loss.plot('./DANet_loss_dygraph.jpg') - plot_iou.append(train_iou_title, epoch, train_iou_manager.eval()[0]) - plot_iou.plot('./DANet_miou_dygraph.jpg') - fluid.dygraph.save_dygraph(model.state_dict(), 'checkpoint/DANet_epoch_new') - # save_model - if better_miou_train < train_iou_manager.eval()[0]: - shutil.rmtree('checkpoint/DANet_better_train_{:.4f}.pdparams'.format(better_miou_train), - ignore_errors=True) - better_miou_train = train_iou_manager.eval()[0] - fluid.dygraph.save_dygraph(model.state_dict(), - 'checkpoint/DANet_better_train_{:.4f}'.format(better_miou_train)) - - ########## test ############ - model.eval() - test_iou_manager.reset() - test_avg_loss_manager.reset() - prev_time = datetime.now() - for (batch_id, data) in enumerate(batch_test_data()): - image = np.array([x[0] for x in data]).astype('float32') - label = np.array([x[1] for x in data]).astype('int64') - - image = fluid.dygraph.to_variable(image) - label = fluid.dygraph.to_variable(label) - - label.stop_gradient = True - pred, pred2, pred3 = model(image) - test_loss = loss_fn(pred, pred2, pred3, label, num_classes=num_classes) - test_avg_loss = fluid.layers.mean(test_loss) - miou, wrong, correct = mean_iou(pred, label, num_classes=num_classes) - test_iou_manager.update(miou.numpy(), weight=int(batch_size * num)) - test_avg_loss_manager.update(test_avg_loss.numpy(), weight=int(batch_size * num)) - batch_test_str = "epoch: {}, batch: {}, test_avg_loss: {:.6f}, " \ - "test_miou: 
{:.6f}.".format(epoch + 1, batch_id + 1, - test_avg_loss.numpy()[0], - miou.numpy()[0]) - if batch_id % 20 == 0: - logging.info(batch_test_str) - print(batch_test_str) - cur_time = datetime.now() - h, remainder = divmod((cur_time - prev_time).seconds, 3600) - m, s = divmod(remainder, 60) - time_str = " Time %02d:%02d:%02d" % (h, m, s) - test_str = "\nepoch: {}, test_avg_loss: {:.6f}, " \ - "test_miou: {:.6f}.".format(epoch + 1, - test_avg_loss_manager.eval()[0], - test_iou_manager.eval()[0]) - print(test_str + time_str + '\n') - logging.info(test_str + time_str + '\n') - plot_loss.append(test_loss_title, epoch, test_avg_loss_manager.eval()[0]) - plot_loss.plot('./DANet_loss_dygraph.jpg') - plot_iou.append(test_iou_title, epoch, test_iou_manager.eval()[0]) - plot_iou.plot('./DANet_miou_dygraph.jpg') - model.train() - # save_model - if better_miou_test < test_iou_manager.eval()[0]: - shutil.rmtree('checkpoint/DANet_better_test_{:.4f}.pdparams'.format(better_miou_test), - ignore_errors=True) - better_miou_test = test_iou_manager.eval()[0] - fluid.dygraph.save_dygraph(model.state_dict(), - 'checkpoint/DANet_better_test_{:.4f}'.format(better_miou_test)) - - -if __name__ == '__main__': - options = Options() - args = options.parse() - options.print_args() - main(args) diff --git a/PaddleCV/Research/danet/train_executor.py b/PaddleCV/Research/danet/train_executor.py deleted file mode 100644 index 82f451dd168527886081c684686dd271a9e3c38c..0000000000000000000000000000000000000000 --- a/PaddleCV/Research/danet/train_executor.py +++ /dev/null @@ -1,423 +0,0 @@ -# -*- coding: utf-8 -*- -# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -import os - -os.environ['FLAGS_eager_delete_tensor_gb'] = "0.0" -os.environ['FLAGS_fraction_of_gpu_memory_to_use'] = "0.99" - -import paddle.fluid as fluid -import numpy as np -import random -import paddle -import logging -import shutil -import multiprocessing -import sys -from datetime import datetime -from paddle.utils import Ploter - -from danet import DANet -from options import Options -from utils.cityscapes_data import cityscapes_train -from utils.cityscapes_data import cityscapes_val -from utils.lr_scheduler import Lr -import matplotlib - -matplotlib.use('Agg') - - -def get_model(args): - model = DANet('DANet', - backbone=args.backbone, - num_classes=args.num_classes, - batch_size=args.batch_size, - dilated=args.dilated, - multi_grid=args.multi_grid, - multi_dilation=args.multi_dilation) - return model - - -def _cpu_num(): - if "CPU_NUM" not in os.environ.keys(): - if multiprocessing.cpu_count() > 1: - sys.stderr.write( - '!!! The CPU_NUM is not specified, you should set CPU_NUM in the environment variable list.\n' - 'CPU_NUM indicates that how many CPUPlace are used in the current task.\n' - 'And if this parameter are set as N (equal to the number of physical CPU core) the program may be faster.\n\n' - 'export CPU_NUM={} # for example, set CPU_NUM as number of physical CPU core which is {}.\n\n' - '!!! 
The default number of CPU_NUM=1.\n'.format( - multiprocessing.cpu_count(), multiprocessing.cpu_count())) - os.environ['CPU_NUM'] = str(1) - cpu_num = os.environ.get('CPU_NUM') - return int(cpu_num) - - -def mean_iou(pred, label, num_classes=19): - label = fluid.layers.elementwise_min(fluid.layers.cast(label, np.int32), - fluid.layers.assign(np.array([num_classes], dtype=np.int32))) - label_ig = (label == num_classes).astype('int32') - label_ng = (label != num_classes).astype('int32') - pred = fluid.layers.cast(fluid.layers.argmax(pred, axis=1), 'int32') - pred = pred * label_ng + label_ig * num_classes - miou, wrong, correct = fluid.layers.mean_iou(pred, label, num_classes + 1) - label.stop_gradient = True - return miou, wrong, correct - - -def loss_fn(pred, pred2, pred3, label, num_classes=19): - pred = fluid.layers.transpose(pred, perm=[0, 2, 3, 1]) - pred = fluid.layers.reshape(pred, [-1, num_classes]) - - pred2 = fluid.layers.transpose(pred2, perm=[0, 2, 3, 1]) - pred2 = fluid.layers.reshape(pred2, [-1, num_classes]) - - pred3 = fluid.layers.transpose(pred3, perm=[0, 2, 3, 1]) - pred3 = fluid.layers.reshape(pred3, [-1, num_classes]) - - label = fluid.layers.reshape(label, [-1, 1]) - - # loss1 = fluid.layers.softmax_with_cross_entropy(pred, label, ignore_index=255) - # the call above can produce a NaN loss - pred = fluid.layers.softmax(pred, use_cudnn=False) - loss1 = fluid.layers.cross_entropy(pred, label, ignore_index=255) - - pred2 = fluid.layers.softmax(pred2, use_cudnn=False) - loss2 = fluid.layers.cross_entropy(pred2, label, ignore_index=255) - - pred3 = fluid.layers.softmax(pred3, use_cudnn=False) - loss3 = fluid.layers.cross_entropy(pred3, label, ignore_index=255) - - label.stop_gradient = True - return loss1 + loss2 + loss3 - - -def save_model(save_dir, exe, program=None): - if os.path.exists(save_dir): - shutil.rmtree(save_dir, ignore_errors=True) - os.makedirs(save_dir) - # fluid.io.save_persistables(exe, save_dir, program) - fluid.io.save_params(exe, save_dir, program)
- print('save: {}'.format(os.path.basename(save_dir))) - else: - os.makedirs(save_dir) - fluid.io.save_persistables(exe, save_dir, program) - print('create: {}'.format(os.path.basename(save_dir))) - - -def load_model(save_dir, exe, program=None): - if os.path.exists(save_dir): - # fluid.io.load_persistables(exe, save_dir, program) - fluid.io.load_params(exe, save_dir, program) - print('Load successful!') - else: - raise Exception('Please check the model path!') - - -def optimizer_setting(args): - if args.weight_decay is not None: - regular = fluid.regularizer.L2Decay(regularization_coeff=args.weight_decay) - else: - regular = None - if args.lr_scheduler == 'poly': - lr_scheduler = Lr(lr_policy='poly', - base_lr=args.lr, - epoch_nums=args.epoch_num, - step_per_epoch=args.step_per_epoch, - power=args.lr_pow, - warm_up=args.warm_up, - warmup_epoch=args.warmup_epoch) - decayed_lr = lr_scheduler.get_lr() - elif args.lr_scheduler == 'cosine': - lr_scheduler = Lr(lr_policy='cosine', - base_lr=args.lr, - epoch_nums=args.epoch_num, - step_per_epoch=args.step_per_epoch, - warm_up=args.warm_up, - warmup_epoch=args.warmup_epoch) - decayed_lr = lr_scheduler.get_lr() - elif args.lr_scheduler == 'piecewise': - lr_scheduler = Lr(lr_policy='piecewise', - base_lr=args.lr, - epoch_nums=args.epoch_num, - step_per_epoch=args.step_per_epoch, - warm_up=args.warm_up, - warmup_epoch=args.warmup_epoch, - decay_epoch=[50, 100, 150], - gamma=0.1) - decayed_lr = lr_scheduler.get_lr() - else: - decayed_lr = args.lr - return fluid.optimizer.MomentumOptimizer(learning_rate=decayed_lr, - momentum=args.momentum, - regularization=regular) - - -def main(args): - image_shape = args.crop_size - image = fluid.layers.data(name='image', shape=[3, image_shape, image_shape], dtype='float32') - label = fluid.layers.data(name='label', shape=[image_shape, image_shape], dtype='int64') - - batch_size = args.batch_size - epoch_num = args.epoch_num - num_classes = args.num_classes - data_root = args.data_folder - 
if args.cuda: - num = fluid.core.get_cuda_device_count() - print('The number of GPU: {}'.format(num)) - else: - num = _cpu_num() - print('The number of CPU: {}'.format(num)) - - # program - start_prog = fluid.default_startup_program() - train_prog = fluid.default_main_program() - - start_prog.random_seed = args.seed - train_prog.random_seed = args.seed - np.random.seed(args.seed) - random.seed(args.seed) - - # clone - test_prog = train_prog.clone(for_test=True) - - logging.basicConfig(level=logging.INFO, - filename='DANet_{}_train_executor.log'.format(args.backbone), - format='%(asctime)s - %(name)s - %(levelname)s - %(message)s') - logging.info('DANet') - logging.info(args) - - with fluid.program_guard(train_prog, start_prog): - with fluid.unique_name.guard(): - train_py_reader = fluid.io.PyReader(feed_list=[image, label], - capacity=64, - use_double_buffer=True, - iterable=False) - train_data = cityscapes_train(data_root=data_root, - base_size=args.base_size, - crop_size=args.crop_size, - scale=args.scale, - xmap=True, - batch_size=batch_size, - gpu_num=num) - batch_train_data = paddle.batch(paddle.reader.shuffle( - train_data, buf_size=batch_size * 16), - batch_size=batch_size, - drop_last=True) - train_py_reader.decorate_sample_list_generator(batch_train_data) - - model = get_model(args) - pred, pred2, pred3 = model(image) - train_loss = loss_fn(pred, pred2, pred3, label, num_classes=num_classes) - train_avg_loss = fluid.layers.mean(train_loss) - optimizer = optimizer_setting(args) - optimizer.minimize(train_avg_loss) - # this mIoU is not the true value - miou, wrong, correct = mean_iou(pred, label, num_classes=num_classes) - - with fluid.program_guard(test_prog, start_prog): - with fluid.unique_name.guard(): - test_py_reader = fluid.io.PyReader(feed_list=[image, label], - capacity=64, - iterable=False, - use_double_buffer=True) - val_data = cityscapes_val(data_root=data_root, - base_size=args.base_size, - crop_size=args.crop_size, - scale=args.scale, - xmap=True) - batch_test_data
= paddle.batch(val_data, - batch_size=batch_size, - drop_last=True) - test_py_reader.decorate_sample_list_generator(batch_test_data) - - model = get_model(args) - pred, pred2, pred3 = model(image) - test_loss = loss_fn(pred, pred2, pred3, label, num_classes=num_classes) - test_avg_loss = fluid.layers.mean(test_loss) - # this mIoU is not the true value - miou, wrong, correct = mean_iou(pred, label, num_classes=num_classes) - - place = fluid.CUDAPlace(0) if args.cuda else fluid.CPUPlace() - exe = fluid.Executor(place) - exe.run(start_prog) - - if args.use_data_parallel and args.cuda: - exec_strategy = fluid.ExecutionStrategy() - exec_strategy.num_threads = fluid.core.get_cuda_device_count() - exec_strategy.num_iteration_per_drop_scope = 100 - build_strategy = fluid.BuildStrategy() - build_strategy.sync_batch_norm = True - print("sync_batch_norm = True!") - compiled_train_prog = fluid.compiler.CompiledProgram(train_prog).with_data_parallel( - loss_name=train_avg_loss.name, - build_strategy=build_strategy, - exec_strategy=exec_strategy) - else: - compiled_train_prog = fluid.compiler.CompiledProgram(train_prog) - - # load the pretrained model - if args.load_pretrained_model: - assert os.path.exists(args.save_model), "your input save_model: {}, but '{}' does not exist".format( - args.save_model, args.save_model) - load_model(args.save_model, exe, program=train_prog) - print('load pretrained model!') - - # load the best model - if args.load_better_model: - assert os.path.exists(args.save_model), "your input save_model: {}, but '{}' does not exist".format( - args.save_model, args.save_model) - load_model(args.save_model, exe, program=train_prog) - print('load better model!') - - train_iou_manager = fluid.metrics.Accuracy() - train_avg_loss_manager = fluid.metrics.Accuracy() - test_iou_manager = fluid.metrics.Accuracy() - test_avg_loss_manager = fluid.metrics.Accuracy() - better_miou_train = 0 - better_miou_test = 0 - - train_loss_title = 'Train_loss' - test_loss_title = 'Test_loss' - - train_iou_title = 'Train_mIOU' -
test_iou_title = 'Test_mIOU' - - plot_loss = Ploter(train_loss_title, test_loss_title) - plot_iou = Ploter(train_iou_title, test_iou_title) - - for epoch in range(epoch_num): - prev_time = datetime.now() - train_avg_loss_manager.reset() - train_iou_manager.reset() - logging.info('training, epoch = {}'.format(epoch + 1)) - train_py_reader.start() - batch_id = 0 - while True: - try: - train_fetch_list = [train_avg_loss, miou, wrong, correct] - train_avg_loss_value, train_iou_value, w, c = exe.run( - program=compiled_train_prog, - fetch_list=train_fetch_list) - - train_iou_manager.update(train_iou_value, weight=int(batch_size * num)) - train_avg_loss_manager.update(train_avg_loss_value, weight=int(batch_size * num)) - batch_train_str = "epoch: {}, batch: {}, train_avg_loss: {:.6f}, " \ - "train_miou: {:.6f}.".format(epoch + 1, - batch_id + 1, - train_avg_loss_value[0], - train_iou_value[0]) - if batch_id % 40 == 0: - logging.info(batch_train_str) - print(batch_train_str) - batch_id += 1 - except fluid.core.EOFException: - train_py_reader.reset() - break - cur_time = datetime.now() - h, remainder = divmod((cur_time - prev_time).seconds, 3600) - m, s = divmod(remainder, 60) - time_str = " Time %02d:%02d:%02d" % (h, m, s) - train_str = "epoch: {}, train_avg_loss: {:.6f}, " \ - "train_miou: {:.6f}.".format(epoch + 1, - train_avg_loss_manager.eval()[0], - train_iou_manager.eval()[0]) - print(train_str + time_str + '\n') - logging.info(train_str + time_str) - plot_loss.append(train_loss_title, epoch, train_avg_loss_manager.eval()[0]) - plot_loss.plot('./DANet_loss_executor.jpg') - plot_iou.append(train_iou_title, epoch, train_iou_manager.eval()[0]) - plot_iou.plot('./DANet_miou_executor.jpg') - - # save_model - if better_miou_train < train_iou_manager.eval()[0]: - shutil.rmtree('./checkpoint/DANet_better_train_{:.4f}'.format(better_miou_train), - ignore_errors=True) - better_miou_train = train_iou_manager.eval()[0] - logging.warning( - 
'-----------train---------------better_train: {:.6f}, epoch: {}, -----------Train model saved successfully!\n'.format( - better_miou_train, epoch + 1)) - save_dir = './checkpoint/DANet_better_train_{:.4f}'.format(better_miou_train) - save_model(save_dir, exe, program=train_prog) - if (epoch + 1) % 5 == 0: - save_dir = './checkpoint/DANet_epoch_train' - save_model(save_dir, exe, program=train_prog) - - # test - test_py_reader.start() - test_iou_manager.reset() - test_avg_loss_manager.reset() - prev_time = datetime.now() - logging.info('testing, epoch = {}'.format(epoch + 1)) - batch_id = 0 - while True: - try: - test_fetch_list = [test_avg_loss, miou, wrong, correct] - test_avg_loss_value, test_iou_value, _, _ = exe.run(program=test_prog, - fetch_list=test_fetch_list) - test_iou_manager.update(test_iou_value, weight=int(batch_size * num)) - test_avg_loss_manager.update(test_avg_loss_value, weight=int(batch_size * num)) - batch_test_str = "epoch: {}, batch: {}, test_avg_loss: {:.6f}, " \ - "test_miou: {:.6f}. 
".format(epoch + 1, - batch_id + 1, - test_avg_loss_value[0], - test_iou_value[0]) - if batch_id % 40 == 0: - logging.info(batch_test_str) - print(batch_test_str) - batch_id += 1 - except fluid.core.EOFException: - test_py_reader.reset() - break - cur_time = datetime.now() - h, remainder = divmod((cur_time - prev_time).seconds, 3600) - m, s = divmod(remainder, 60) - time_str = " Time %02d:%02d:%02d" % (h, m, s) - test_str = "epoch: {}, test_avg_loss: {:.6f}, " \ - "test_miou: {:.6f}.".format(epoch + 1, - test_avg_loss_manager.eval()[0], - test_iou_manager.eval()[0]) - print(test_str + time_str + '\n') - logging.info(test_str + time_str) - plot_loss.append(test_loss_title, epoch, test_avg_loss_manager.eval()[0]) - plot_loss.plot('./DANet_loss_executor.jpg') - plot_iou.append(test_iou_title, epoch, test_iou_manager.eval()[0]) - plot_iou.plot('./DANet_miou_executor.jpg') - - # save_model_infer - if better_miou_test < test_iou_manager.eval()[0]: - shutil.rmtree('./checkpoint/infer/DANet_better_test_{:.4f}'.format(better_miou_test), - ignore_errors=True) - better_miou_test = test_iou_manager.eval()[0] - logging.warning( - '------------test-------------infer better_test: {:.6f}, epoch: {}, ----------------Inference model saved successfully!\n'.format( - better_miou_test, epoch + 1)) - save_dir = './checkpoint/infer/DANet_better_test_{:.4f}'.format(better_miou_test) - # save_model(save_dir, exe, program=test_prog) - fluid.io.save_inference_model(save_dir, [image.name], [pred, pred2, pred3], exe) - print('Inference model saved successfully') - - -if __name__ == '__main__': - options = Options() - args = options.parse() - options.print_args() - main(args) - - - diff --git a/PaddleCV/Research/danet/utils/__init__.py b/PaddleCV/Research/danet/utils/__init__.py deleted file mode 100644 index 8469aa22358578b58a9161761d987815064060f2..0000000000000000000000000000000000000000 --- a/PaddleCV/Research/danet/utils/__init__.py +++ /dev/null @@ -1,24 +0,0 @@ -# -*- coding: utf-8 -*- 
-# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -from .base import BaseDataSet -from .cityscapes import CityScapes -from .lr_scheduler import Lr -from .cityscapes_data import * -from .voc import VOC -from .voc_data import * diff --git a/PaddleCV/Research/danet/utils/base.py b/PaddleCV/Research/danet/utils/base.py deleted file mode 100644 index b1528917f870b5965634b3b31f803866036b539f..0000000000000000000000000000000000000000 --- a/PaddleCV/Research/danet/utils/base.py +++ /dev/null @@ -1,132 +0,0 @@ -# -*- coding: utf-8 -*- -# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -import random -import numpy as np -from PIL import Image, ImageOps, ImageFilter, ImageEnhance -import os -import sys - -curPath = os.path.abspath(os.path.dirname(__file__)) -parentPath = os.path.split(curPath)[0] -rootPath = os.path.split(parentPath)[0] -sys.path.append(rootPath) - - -class BaseDataSet(object): - - def __init__(self, root, split, base_size=1024, crop_size=768, scale=True): - self.root = root - support = ['train', 'train_val', 'val', 'test'] - assert split in support, "split= \'{}\' not in {}".format(split, support) - self.split = split - self.crop_size = crop_size # crop size - self.base_size = base_size # shortest side of the image - self.scale = scale - self.image_path = None - self.label_path = None - - def sync_transform(self, image, label, aug=True): - crop_size = self.crop_size - if self.scale: - short_size = random.randint(int(self.base_size * 0.75), int(self.base_size * 2.0)) - else: - short_size = self.base_size - - # random horizontal flip - if random.random() > 0.5: - image = image.transpose(Image.FLIP_LEFT_RIGHT) - label = label.transpose(Image.FLIP_LEFT_RIGHT) - w, h = image.size - - # resize, keeping the aspect ratio - if h > w: - out_w = short_size - out_h = int(1.0 * h / w * out_w) - else: - out_h = short_size - out_w = int(1.0 * w / h * out_h) - image = image.resize((out_w, out_h), Image.BILINEAR) - label = label.resize((out_w, out_h), Image.NEAREST) - - # pad on all four sides - if short_size < crop_size: - pad_h = crop_size - out_h if out_h < crop_size else 0 - pad_w = crop_size - out_w if out_w < crop_size else 0 - image = ImageOps.expand(image, border=(pad_w // 2, pad_h // 2, pad_w - pad_w // 2, pad_h - pad_h // 2), - fill=0) - label = ImageOps.expand(label, border=(pad_w // 2, pad_h // 2, pad_w - pad_w // 2, pad_h - pad_h // 2), - fill=255) - - # random crop - w, h = image.size - x = random.randint(0, w - crop_size) - y = random.randint(0, h - crop_size) - image = image.crop((x, y, x + crop_size, y +
crop_size)) - label = label.crop((x, y, x + crop_size, y + crop_size)) - - if aug: - # Gaussian blur, optional - if random.random() > 0.7: - image = image.filter(ImageFilter.GaussianBlur(radius=random.random())) - - # optional - if random.random() > 0.7: - # random brightness - factor = np.random.uniform(0.75, 1.25) - image = ImageEnhance.Brightness(image).enhance(factor) - - # color jitter - factor = np.random.uniform(0.75, 1.25) - image = ImageEnhance.Color(image).enhance(factor) - - # random contrast - factor = np.random.uniform(0.75, 1.25) - image = ImageEnhance.Contrast(image).enhance(factor) - - # random sharpness - factor = np.random.uniform(0.75, 1.25) - image = ImageEnhance.Sharpness(image).enhance(factor) - return image, label - - def sync_val_transform(self, image, label): - crop_size = self.crop_size - short_size = self.base_size - - w, h = image.size - - # resize, keeping the aspect ratio - if h > w: - out_w = short_size - out_h = int(1.0 * h / w * out_w) - else: - out_h = short_size - out_w = int(1.0 * w / h * out_h) - image = image.resize((out_w, out_h), Image.BILINEAR) - label = label.resize((out_w, out_h), Image.NEAREST) - - # center crop - w, h = image.size - x1 = int(round((w - crop_size) / 2.)) - y1 = int(round((h - crop_size) / 2.)) - image = image.crop((x1, y1, x1 + crop_size, y1 + crop_size)) - label = label.crop((x1, y1, x1 + crop_size, y1 + crop_size)) - return image, label - - def eval(self, image): - pass diff --git a/PaddleCV/Research/danet/utils/cityscapes.py b/PaddleCV/Research/danet/utils/cityscapes.py deleted file mode 100644 index 5c7ee431c11c8d9eb0cdd9d3bd976156d5883766..0000000000000000000000000000000000000000 --- a/PaddleCV/Research/danet/utils/cityscapes.py +++ /dev/null @@ -1,79 +0,0 @@ -# -*- coding: utf-8 -*- -# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -import os -from utils.base import BaseDataSet - - -class CityScapes(BaseDataSet): - """prepare cityscapes path_pairs""" - - BASE_DIR = 'cityscapes' - NUM_CLASS = 19 - - def __init__(self, root='./dataset', split='train', **kwargs): - super(CityScapes, self).__init__(root, split, **kwargs) - if os.sep == '\\': # windows - root = root.replace('/', '\\') - - root = os.path.join(root, self.BASE_DIR) - assert os.path.exists(root), "please download cityscapes data_set, put in dataset(dir),or check root" - self.image_path, self.label_path = self._get_cityscapes_pairs(root, split) - assert len(self.image_path) == len(self.label_path), "please check image_length = label_length" - self.print_param() - - def print_param(self): # print the current dataset settings so they can be checked - print('INFO: dataset_root: {}, split: {}, ' - 'base_size: {}, crop_size: {}, scale: {}, ' - 'image_length: {}, label_length: {}'.format(self.root, self.split, self.base_size, - self.crop_size, self.scale, len(self.image_path), - len(self.label_path))) - - @staticmethod - def _get_cityscapes_pairs(root, split): - - def get_pairs(root, file_image, file_label): - file_image = os.path.join(root, file_image) - file_label = os.path.join(root, file_label) - with open(file_image, 'r') as f: - file_list_image = f.read().split() - with open(file_label, 'r') as f: - file_list_label = f.read().split() - if os.sep == '\\': # for windows - image_path = [os.path.join(root, x.replace('/', '\\')) for x in file_list_image] -
label_path = [os.path.join(root, x.replace('/', '\\')) for x in file_list_label] - else: - image_path = [os.path.join(root, x) for x in file_list_image] - label_path = [os.path.join(root, x) for x in file_list_label] - return image_path, label_path - - if split == 'train': - image_path, label_path = get_pairs(root, 'trainImages.txt', 'trainLabels.txt') - elif split == 'val': - image_path, label_path = get_pairs(root, 'valImages.txt', 'valLabels.txt') - elif split == 'test': - image_path, label_path = get_pairs(root, 'testImages.txt', 'testLabels.txt') # returns file paths; the test labels do not actually exist - else: # 'train_val' - image_path1, label_path1 = get_pairs(root, 'trainImages.txt', 'trainLabels.txt') - image_path2, label_path2 = get_pairs(root, 'valImages.txt', 'valLabels.txt') - image_path, label_path = image_path1+image_path2, label_path1+label_path2 - return image_path, label_path - - def get_path_pairs(self): - return self.image_path, self.label_path - diff --git a/PaddleCV/Research/danet/utils/cityscapes_data.py b/PaddleCV/Research/danet/utils/cityscapes_data.py deleted file mode 100644 index e96534cf31d5f5a2e226435d527515bef7bd8f03..0000000000000000000000000000000000000000 --- a/PaddleCV/Research/danet/utils/cityscapes_data.py +++ /dev/null @@ -1,144 +0,0 @@ -# -*- coding: utf-8 -*- -# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License.
-from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -import random -import paddle -import numpy as np - -from PIL import Image - -from utils.cityscapes import CityScapes - -__all__ = ['cityscapes_train', 'cityscapes_val', 'cityscapes_train_val', 'cityscapes_test'] - -# globals -data_mean = np.array([0.485, 0.456, 0.406]).reshape(3, 1, 1) -data_std = np.array([0.229, 0.224, 0.225]).reshape(3, 1, 1) - - -def mapper_train(sample): - image_path, label_path, city = sample - image = Image.open(image_path, mode='r').convert('RGB') - label = Image.open(label_path, mode='r') - - image, label = city.sync_transform(image, label) - image_array = np.array(image) # HWC - label_array = np.array(label) # HW - - image_array = image_array.transpose((2, 0, 1)) # CHW - image_array = image_array / 255.0 - image_array = (image_array - data_mean) / data_std - image_array = image_array.astype('float32') - label_array = label_array.astype('int64') - return image_array, label_array - - -def mapper_val(sample): - image_path, label_path, city = sample - image = Image.open(image_path, mode='r').convert('RGB') - label = Image.open(label_path, mode='r') - - image, label = city.sync_val_transform(image, label) - image_array = np.array(image) # HWC - label_array = np.array(label) # HW - - image_array = image_array.transpose((2, 0, 1)) # CHW - image_array = image_array / 255.0 - image_array = (image_array - data_mean) / data_std - image_array = image_array.astype('float32') - label_array = label_array.astype('int64') - return image_array, label_array - - -def mapper_test(sample): - image_path, label_path = sample # label is path - image = Image.open(image_path, mode='r').convert('RGB') - image_array = image - return image_array, label_path # image is a picture, label is path - - -# root, base_size, crop_size; gpu_num must be set, otherwise with syncBN some GPUs may receive no data -def cityscapes_train(data_root='./dataset', base_size=1024, crop_size=768, scale=True,
xmap=True, batch_size=1, gpu_num=1): - city = CityScapes(root=data_root, split='train', base_size=base_size, crop_size=crop_size, scale=scale) - image_path, label_path = city.get_path_pairs() - - def reader(): - if len(image_path) % (batch_size * gpu_num) != 0: - length = (len(image_path) // (batch_size * gpu_num)) * (batch_size * gpu_num) - else: - length = len(image_path) - for i in range(length): - if i == 0: - cc = list(zip(image_path, label_path)) - random.shuffle(cc) - image_path[:], label_path[:] = zip(*cc) - yield image_path[i], label_path[i], city - if xmap: - return paddle.reader.xmap_readers(mapper_train, reader, 4, 32) - else: - return paddle.reader.map_readers(mapper_train, reader) - - -def cityscapes_val(data_root='./dataset', base_size=1024, crop_size=768, scale=True, xmap=True): - city = CityScapes(root=data_root, split='val', base_size=base_size, crop_size=crop_size, scale=scale) - image_path, label_path = city.get_path_pairs() - - def reader(): - for i in range(len(image_path)): - yield image_path[i], label_path[i], city - - if xmap: - return paddle.reader.xmap_readers(mapper_val, reader, 4, 32) - else: - return paddle.reader.map_readers(mapper_val, reader) - - -def cityscapes_train_val(data_root='./dataset', base_size=1024, crop_size=768, scale=True, xmap=True, batch_size=1, gpu_num=1): - city = CityScapes(root=data_root, split='train_val', base_size=base_size, crop_size=crop_size, scale=scale) - image_path, label_path = city.get_path_pairs() - - def reader(): - if len(image_path) % (batch_size * gpu_num) != 0: - length = (len(image_path) // (batch_size * gpu_num)) * (batch_size * gpu_num) - else: - length = len(image_path) - for i in range(length): - if i == 0: - cc = list(zip(image_path, label_path)) - random.shuffle(cc) - image_path[:], label_path[:] = zip(*cc) - yield image_path[i], label_path[i], city - - if xmap: - return paddle.reader.xmap_readers(mapper_train, reader, 4, 32) - else: - return paddle.reader.map_readers(mapper_train, reader) 
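Review note on the two training readers above: both truncate the sample list to a multiple of `batch_size * gpu_num`, so that every device gets a full batch when sync batch norm is on. Below is a minimal standalone sketch of that truncation. The helper name `usable_length` is hypothetical and does not appear in the original code. It also shows that floor division already covers the evenly divisible case, so the `if ... % ... != 0` branch in the readers looks redundant.

```python
def usable_length(num_samples, batch_size=1, gpu_num=1):
    """Largest multiple of batch_size * gpu_num that is <= num_samples.

    Hypothetical helper mirroring the `length` computation in the readers.
    """
    group = batch_size * gpu_num
    # Floor division drops the remainder; when num_samples is already a
    # multiple of `group`, the result equals num_samples unchanged.
    return (num_samples // group) * group


if __name__ == "__main__":
    # 2975 Cityscapes training images, batch size 2 per card, 4 cards.
    print(usable_length(2975, batch_size=2, gpu_num=4))
```

With these numbers the last 7 of the 2975 samples would be skipped in each epoch. Because the readers reshuffle the list at `i == 0`, a different set of samples is dropped each time.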
- - -def cityscapes_test(split='test', base_size=2048, crop_size=1024, scale=True, xmap=True): - # base_size, crop_size and scale are not actually used - city = CityScapes(split=split, base_size=base_size, crop_size=crop_size, scale=scale) - image_path, label_path = city.get_path_pairs() - - def reader(): - for i in range(len(image_path)): - yield image_path[i], label_path[i] - if xmap: - return paddle.reader.xmap_readers(mapper_test, reader, 4, 32) - else: - return paddle.reader.map_readers(mapper_test, reader) diff --git a/PaddleCV/Research/danet/utils/lr_scheduler.py b/PaddleCV/Research/danet/utils/lr_scheduler.py deleted file mode 100644 index 4ce8316a43536aef9414ca5e40a4e8a5ccb63aba..0000000000000000000000000000000000000000 --- a/PaddleCV/Research/danet/utils/lr_scheduler.py +++ /dev/null @@ -1,152 +0,0 @@ -# -*- coding: utf-8 -*- -# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License.
-from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -import paddle.fluid as fluid -import math - - -class Lr(object): - """ - Example: poly policy with warm-up, - lr_scheduler = Lr(lr_policy='poly', base_lr=0.003, epoch_nums=200, step_per_epoch=20, - warm_up=True, warmup_epoch=11) - lr = lr_scheduler.get_lr() - - Example: cosine policy with warm-up, - lr_scheduler = Lr(lr_policy='cosine', base_lr=0.003, epoch_nums=200, step_per_epoch=20, - warm_up=True, warmup_epoch=11) - lr = lr_scheduler.get_lr() - - Example: piecewise policy with warm-up; the boundaries (decay_epoch list) must be set, gamma defaults to 0.1 - lr_scheduler = Lr(lr_policy='piecewise', base_lr=0.003, epoch_nums=200, step_per_epoch=20, - warm_up=True, warmup_epoch=11, decay_epoch=[50], gamma=0.1) - lr = lr_scheduler.get_lr() - """ - def __init__(self, lr_policy, base_lr, epoch_nums, step_per_epoch, - power=0.9, end_lr=0.0, gamma=0.1, decay_epoch=[], - warm_up=False, warmup_epoch=0): - support_lr_policy = ['poly', 'piecewise', 'cosine'] - assert lr_policy in support_lr_policy, "Only poly, piecewise and cosine are supported" - self.lr_policy = lr_policy # learning rate decay policy: str(`cosine`, `poly`, `piecewise`) - - assert base_lr >= 0, "Start learning rate should not be negative" - self.base_lr = base_lr # base learning rate: float - - assert end_lr >= 0, "End learning rate should not be negative" - self.end_lr = end_lr # final learning rate: float - - assert epoch_nums, "epoch_nums should be greater than 0" - assert step_per_epoch, "step_per_epoch should be greater than 0" - - self.epoch_nums = epoch_nums # number of epochs: int - self.step_per_epoch = step_per_epoch # iterations per epoch: int - self.total_step = epoch_nums * step_per_epoch # total iterations: computed automatically - self.power = power # exponent: float - self.gamma = gamma # piecewise decay factor: float - self.decay_epoch = decay_epoch # epochs at which piecewise decay applies: list - if self.lr_policy == 'piecewise': - assert len(decay_epoch) >= 1, "the piecewise policy requires a non-empty decay_epoch list" - self.warm_up = warm_up # whether to warm up: bool - if self.warm_up: - assert warmup_epoch, "warmup_epoch should
be greater than 0" - assert warmup_epoch < epoch_nums, "warmup_epoch should be less than epoch_nums" - self.warmup_epoch = warmup_epoch - self.warmup_steps = warmup_epoch * step_per_epoch # warm-up steps: int(epoch*step_per_epoch) - - def _piecewise_decay(self): - gamma = self.gamma - bd = [self.step_per_epoch * e for e in self.decay_epoch] - lr = [self.base_lr * (gamma ** i) for i in range(len(bd) + 1)] - decayed_lr = fluid.layers.piecewise_decay(boundaries=bd, values=lr) - return decayed_lr - - def _poly_decay(self): - decayed_lr = fluid.layers.polynomial_decay( - self.base_lr, self.total_step, end_learning_rate=self.end_lr, power=self.power) - return decayed_lr - - def _cosine_decay(self): - decayed_lr = fluid.layers.cosine_decay( - self.base_lr, self.step_per_epoch, self.epoch_nums) - return decayed_lr - - def get_lr(self): - if self.lr_policy.lower() == 'poly': - if self.warm_up: - warm_up_end_lr = (self.base_lr - self.end_lr) * pow( - (1 - self.warmup_steps / self.total_step), self.power) + self.end_lr - print('poly warm_up_end_lr:', warm_up_end_lr) - decayed_lr = fluid.layers.linear_lr_warmup(self._poly_decay(), - warmup_steps=self.warmup_steps, - start_lr=0.0, - end_lr=warm_up_end_lr) - else: - decayed_lr = self._poly_decay() - elif self.lr_policy.lower() == 'piecewise': - if self.warm_up: - assert self.warmup_steps < self.decay_epoch[0] * self.step_per_epoch - warm_up_end_lr = self.base_lr - print('piecewise warm_up_end_lr:', warm_up_end_lr) - decayed_lr = fluid.layers.linear_lr_warmup(self._piecewise_decay(), - warmup_steps=self.warmup_steps, - start_lr=0.0, - end_lr=warm_up_end_lr) - else: - decayed_lr = self._piecewise_decay() - elif self.lr_policy.lower() == 'cosine': - if self.warm_up: - warm_up_end_lr = self.base_lr*0.5*(math.cos(self.warmup_epoch*math.pi/self.epoch_nums)+1) - print('cosine warm_up_end_lr:', warm_up_end_lr) - decayed_lr = fluid.layers.linear_lr_warmup(self._cosine_decay(), - warmup_steps=self.warmup_steps, - start_lr=0.0, - end_lr=warm_up_end_lr)
- else: - decayed_lr = self._cosine_decay() - else: - raise Exception( - "unsupported learning rate decay policy! only poly, piecewise and cosine are supported" - ) - return decayed_lr - - -if __name__ == '__main__': - epoch_nums = 200 - step_per_epoch = 180 - base_lr = 0.003 - warmup_epoch = 5 # number of warm-up epochs - lr_scheduler = Lr(lr_policy='poly', base_lr=base_lr, epoch_nums=epoch_nums, step_per_epoch=step_per_epoch, - warm_up=True, warmup_epoch=warmup_epoch, decay_epoch=[50]) - lr = lr_scheduler.get_lr() - exe = fluid.Executor(fluid.CPUPlace()) - exe.run(fluid.default_startup_program()) - - lr_list = [] - for epoch in range(epoch_nums): - for i in range(step_per_epoch): - x = exe.run(fluid.default_main_program(), - fetch_list=[lr]) - lr_list.append(x[0]) - # print(x[0]) - # plot the learning-rate curve - from matplotlib import pyplot as plt - plt.plot(range(epoch_nums*step_per_epoch), lr_list) - plt.xlabel('step') - plt.ylabel('lr') - plt.show() - diff --git a/PaddleCV/Research/danet/utils/voc.py b/PaddleCV/Research/danet/utils/voc.py deleted file mode 100644 index 01021ec01f6e0e96df65e6af50863db96e400eef..0000000000000000000000000000000000000000 --- a/PaddleCV/Research/danet/utils/voc.py +++ /dev/null @@ -1,101 +0,0 @@ -# -*- coding: utf-8 -*- -# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -import os - -from utils.base import BaseDataSet - - -class VOC(BaseDataSet): - """prepare pascalVOC path_pairs""" - BASE_DIR = 'VOC2012_SBD' - NUM_CLASS = 21 - - def __init__(self, root='../dataset', split='train', **kwargs): - super(VOC, self).__init__(root, split, **kwargs) - if os.sep == '\\': # windows - root = root.replace('/', '\\') - - root = os.path.join(root, self.BASE_DIR) - assert os.path.exists(root), "please download voc2012 data_set, put in dataset(dir)" - if split == 'test': - self.image_path = self._get_cityscapes_pairs(root, split) - else: - self.image_path, self.label_path = self._get_cityscapes_pairs(root, split) - if self.label_path is None: - pass - else: - assert len(self.image_path) == len(self.label_path), "please check image_length = label_length" - self.print_param() - - def print_param(self): # print the current dataset settings for verification - if self.label_path is None: - print('INFO: dataset_root: {}, split: {}, ' - 'base_size: {}, crop_size: {}, scale: {}, ' - 'image_length: {}'.format(self.root, self.split, self.base_size, - self.crop_size, self.scale, len(self.image_path))) - else: - print('INFO: dataset_root: {}, split: {}, ' - 'base_size: {}, crop_size: {}, scale: {}, ' - 'image_length: {}, label_length: {}'.format(self.root, self.split, self.base_size, - self.crop_size, self.scale, len(self.image_path), - len(self.label_path))) - - @staticmethod - def _get_cityscapes_pairs(root, split): - - def get_pairs(root, file): - if file.find('test') == -1: - file = os.path.join(root, file) - with open(file, 'r') as f: - file_list = f.readlines() - if os.sep == '\\': # for windows - image_path = [ - os.path.join(root, 'pascal', 'VOC2012', x.split()[0][1:].replace('/', '\\').replace('\n', '')) - for x in file_list] - label_path = [os.path.join(root, 'pascal', 'VOC2012', x.split()[1][1:].replace('/', '\\')) for x in - file_list] - else: - image_path = 
[os.path.join(root, 'pascal', 'VOC2012', x.split()[0][1:]) for x in file_list] - label_path = [os.path.join(root, 'pascal', 'VOC2012', x.split()[1][1:]) for x in file_list] - return image_path, label_path - else: - file = os.path.join(root, file) - with open(file, 'r') as f: - file_list = f.readlines() - if os.sep == '\\': # for windows - image_path = [ - os.path.join(root, 'pascal', 'VOC2012', x.split()[0][1:].replace('/', '\\').replace('\n', '')) - for x in file_list] - else: - image_path = [os.path.join(root, 'pascal', 'VOC2012', x.split()[0][1:]) for x in file_list] - return image_path - - if split == 'train': - image_path, label_path = get_pairs(root, 'list/train_aug.txt') - elif split == 'val': - image_path, label_path = get_pairs(root, 'list/val.txt') - elif split == 'test': - image_path = get_pairs(root, 'list/test.txt') # returns image paths only; the test split has no labels - return image_path - else: # 'train_val' - image_path, label_path = get_pairs(root, 'list/trainval_aug.txt') - return image_path, label_path - - def get_path_pairs(self): - return self.image_path, self.label_path diff --git a/PaddleCV/Research/danet/utils/voc_data.py b/PaddleCV/Research/danet/utils/voc_data.py deleted file mode 100644 index d2dba4f9135dc80fd9c015ea7a7c3bde1af5b0e1..0000000000000000000000000000000000000000 --- a/PaddleCV/Research/danet/utils/voc_data.py +++ /dev/null @@ -1,144 +0,0 @@ -# -*- coding: utf-8 -*- -# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
-# See the License for the specific language governing permissions and -# limitations under the License. -from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -import random -import paddle -import numpy as np - -from PIL import Image - -from utils.voc import VOC - -__all__ = ['voc_train', 'voc_val', 'voc_train_val', 'voc_test'] - -# globals -data_mean = np.array([0.485, 0.456, 0.406]).reshape(3, 1, 1) -data_std = np.array([0.229, 0.224, 0.225]).reshape(3, 1, 1) - - -def mapper_train(sample): - image_path, label_path, voc = sample - image = Image.open(image_path, mode='r').convert('RGB') - label = Image.open(label_path, mode='r') - - image, label = voc.sync_transform(image, label) - image_array = np.array(image) # HWC - label_array = np.array(label) # HW - - image_array = image_array.transpose((2, 0, 1)) # CHW - image_array = image_array / 255.0 - image_array = (image_array - data_mean) / data_std - image_array = image_array.astype('float32') - label_array = label_array.astype('int64') - return image_array, label_array - - -def mapper_val(sample): - image_path, label_path, city = sample - image = Image.open(image_path, mode='r').convert('RGB') - label = Image.open(label_path, mode='r') - - image, label = city.sync_val_transform(image, label) - image_array = np.array(image) - label_array = np.array(label) - - image_array = image_array.transpose((2, 0, 1)) - image_array = image_array / 255.0 - image_array = (image_array - data_mean) / data_std - image_array = image_array.astype('float32') - label_array = label_array.astype('int64') - return image_array, label_array - - -def mapper_test(sample): - image_path, label_path = sample # label is path - image = Image.open(image_path, mode='r').convert('RGB') - image_array = image - return image_array, label_path # label is path - - -# Done. Remember to pass root, base_size, crop_size, etc. when calling; gpu_num must be set, otherwise some GPUs may receive no data under syncBN -def voc_train(data_root='../dataset', base_size=768, 
crop_size=576, scale=True, xmap=True, batch_size=1, gpu_num=1): - voc = VOC(root=data_root, split='train', base_size=base_size, crop_size=crop_size, scale=scale) - image_path, label_path = voc.get_path_pairs() - - def reader(): - if len(image_path) % (batch_size * gpu_num) != 0: - length = (len(image_path) // (batch_size * gpu_num)) * (batch_size * gpu_num) - else: - length = len(image_path) - for i in range(length): - if i == 0: - cc = list(zip(image_path, label_path)) - random.shuffle(cc) - image_path[:], label_path[:] = zip(*cc) - yield image_path[i], label_path[i], voc - if xmap: - return paddle.reader.xmap_readers(mapper_train, reader, 4, 32) - else: - return paddle.reader.map_readers(mapper_train, reader) - - -def voc_val(data_root='../dataset', base_size=768, crop_size=576, scale=True, xmap=True): - voc = VOC(root=data_root, split='val', base_size=base_size, crop_size=crop_size, scale=scale) - image_path, label_path = voc.get_path_pairs() - - def reader(): - for i in range(len(image_path)): - yield image_path[i], label_path[i], voc - - if xmap: - return paddle.reader.xmap_readers(mapper_val, reader, 4, 32) - else: - return paddle.reader.map_readers(mapper_val, reader) - - -def voc_train_val(data_root='./dataset', base_size=768, crop_size=576, scale=True, xmap=True, batch_size=1, gpu_num=1): - voc = VOC(root=data_root, split='train_val', base_size=base_size, crop_size=crop_size, scale=scale) - image_path, label_path = voc.get_path_pairs() - - def reader(): - if len(image_path) % (batch_size * gpu_num) != 0: - length = (len(image_path) // (batch_size * gpu_num)) * (batch_size * gpu_num) - else: - length = len(image_path) - for i in range(length): - if i == 0: - cc = list(zip(image_path, label_path)) - random.shuffle(cc) - image_path[:], label_path[:] = zip(*cc) - yield image_path[i], label_path[i], voc # mapper_train unpacks three items - - if xmap: - return paddle.reader.xmap_readers(mapper_train, reader, 4, 32) - else: - return paddle.reader.map_readers(mapper_train, reader) - - -def 
voc_test(split='test', base_size=2048, crop_size=1024, scale=True, xmap=True): - # base_size, crop_size and scale are not actually used - voc = VOC(split=split, base_size=base_size, crop_size=crop_size, scale=scale) - image_path, _ = voc.get_path_pairs() # get_path_pairs returns (image_path, label_path) - - def reader(): - for i in range(len(image_path[:1])): - yield image_path[i], image_path[i] - if xmap: - return paddle.reader.xmap_readers(mapper_test, reader, 4, 32) - else: - return paddle.reader.map_readers(mapper_test, reader)
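
The training readers in `voc_data.py` shuffle the image/label path pairs together and then stop at the largest sample count divisible by `batch_size * gpu_num`, so that every card receives a full batch. The following is a minimal standalone sketch of that idea in plain Python, without Paddle. The helper names `usable_length` and `shuffled_pairs` and the `seed` parameter are hypothetical, introduced only for illustration; unlike the deleted code, the sketch does not shuffle the input lists in place.

```python
import random


def usable_length(num_samples, batch_size, gpu_num):
    """Largest sample count that is a multiple of batch_size * gpu_num."""
    group = batch_size * gpu_num
    return (num_samples // group) * group


def shuffled_pairs(image_paths, label_paths, batch_size=1, gpu_num=1, seed=None):
    """Yield (image_path, label_path) pairs for one epoch.

    Pairs are shuffled together so each image keeps its own label, then the
    remainder that would not fill a whole batch on every GPU is dropped.
    """
    rng = random.Random(seed)  # hypothetical seed parameter, for reproducibility
    pairs = list(zip(image_paths, label_paths))
    rng.shuffle(pairs)
    keep = usable_length(len(pairs), batch_size, gpu_num)
    for image_path, label_path in pairs[:keep]:
        yield image_path, label_path
```

With 10 samples, `batch_size=2` and `gpu_num=2`, the sketch yields 8 pairs per epoch and drops 2; which samples are dropped changes with each shuffle.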