未验证 提交 7126bb37 编写于 作者: L LutaoChu 提交者: GitHub

add HRNet, polish README, add precision and recall metrics (#256)

* modify label tool

* fix label tools bug

* add RemoteSensing

* optimize data clip and normalize method for remote sensing

* fix label problem using mobile devices

* add README.md

* polish README.md, optimize main.py and transforms.py, add transforms.md

* add demo dataset

* add create_dataset_list tool, requirements.txt

* add channel-by-channel normalize and clip, remove RemoteSensing import

* update vdl to 2.0

* polish README

* add tool to randomly split dataset and generate file list

* open eval_best_metric while traing, fix typo

* remove python2.7 ci

* add hrnet, polish README

* add metrics for unet like precision,recall,etc

* save more attrs in model.yaml, fix typo

* optimize predict_demo.py

* fix hrnet dice loss bug

* polish remote sensing README

* refactor model base class

* modify train_demo.py

* Update README.md

* Update requirements.txt
上级 cbce658d
......@@ -102,7 +102,7 @@ class SegModel(object):
# 当前模型状态
self.status = 'Normal'
def _get_single_car_bs(self, batch_size):
def _get_single_card_bs(self, batch_size):
if batch_size % len(self.places) == 0:
return int(batch_size // len(self.places))
else:
......@@ -144,7 +144,7 @@ class SegModel(object):
capacity=64,
use_double_buffer=True,
iterable=True)
batch_size_each_gpu = self._get_single_car_bs(batch_size)
batch_size_each_gpu = self._get_single_card_bs(batch_size)
self.train_data_loader.set_sample_list_generator(
dataset.generator(batch_size=batch_size_each_gpu),
places=self.places)
......
......@@ -24,7 +24,7 @@ import models
def load_model(model_dir):
if not osp.exists(osp.join(model_dir, "model.yml")):
raise Exception("There's not model.yml in {}".format(model_dir))
raise Exception("There's no model.yml in {}".format(model_dir))
with open(osp.join(model_dir, "model.yml")) as f:
info = yaml.load(f.read(), Loader=yaml.Loader)
status = info['status']
......
......@@ -3,6 +3,7 @@
提供基于PaddlSeg最新的分割特色模型:
- [人像分割](./HumanSeg)
- [遥感分割](./RemoteSensing)
- [人体解析](./ACE2P)
- [车道线分割](./LaneNet)
- [工业表盘分割](#工业表盘分割)
......@@ -12,6 +13,14 @@
HumanSeg系列全新升级,提供三个适用于不同场景,包含适用于移动端实时分割场景的模型`HumanSeg-lite`,提供了包含光流的后处理的优化,使人像分割在视频场景中更加顺畅,更多详情请参考[HumanSeg](./HumanSeg)
## 遥感分割 Remote Sensing Segmentation
PaddleSeg遥感影像分割涵盖图像预处理、数据增强、模型训练、预测流程。
针对遥感数据多通道、分布范围大、分布不均的特点,我们支持多通道训练预测,内置10+多通道预处理和数据增强的策略,可结合实际业务场景进行定制组合,提升模型泛化能力和鲁棒性。
内置U-Net, HRNet两种主流分割网络,可选择不同的损失函数如Dice Loss, BCE Loss等方式强化小目标和不均衡样本场景下的分割精度。更多详情请参考[RemoteSensing](./RemoteSensing)
以下是遥感云检测的示例效果:
![](./RemoteSensing/docs/imgs/rs.png)
## 人体解析 Human Parsing
......
# 遥感分割(RemoteSensing)
# PaddleSeg遥感影像分割
遥感影像分割是图像分割领域中的重要应用场景,广泛应用于土地测绘、环境监测、城市建设等领域。遥感影像分割的目标多种多样,有诸如积雪、农作物、道路、建筑、水源等地物目标,也有例如云层的空中目标。
PaddleSeg提供了针对遥感专题的语义分割库RemoteSensing,涵盖图像预处理、数据增强、模型训练、预测流程,帮助用户利用深度学习技术解决遥感影像分割问题。
PaddleSeg遥感影像分割涵盖图像预处理、数据增强、模型训练、预测流程,帮助用户利用深度学习技术解决遥感影像分割问题。
## 特点
针对遥感数据多通道、分布范围大、分布不均的特点,我们支持多通道训练预测,内置一系列多通道预处理和数据增强的策略,可结合实际业务场景进行定制组合,提升模型泛化能力和鲁棒性。
- 针对遥感数据多通道、分布范围大、分布不均的特点,我们支持多通道训练预测,内置10+多通道预处理和数据增强的策略,可结合实际业务场景进行定制组合,提升模型泛化能力和鲁棒性。
**Note:** 所有命令需要在`PaddleSeg/contrib/RemoteSensing/`目录下执行。
- 内置U-Net, HRNet两种主流分割网络,可选择不同的损失函数如Dice Loss, BCE Loss等方式强化小目标和不均衡样本场景下的分割精度。
以下是遥感云检测的示例效果:
![](./docs/imgs/rs.png)
## 前置依赖
**Note:** 若没有特殊说明,以下所有命令需要在`PaddleSeg/contrib/RemoteSensing/`目录下执行。
- Paddle 1.7.1+
由于图像分割模型计算开销大,推荐在GPU版本的PaddlePaddle下使用。
PaddlePaddle的安装, 请按照[官网指引](https://paddlepaddle.org.cn/install/quick)安装合适自己的版本。
......@@ -18,7 +24,6 @@ PaddlePaddle的安装, 请按照[官网指引](https://paddlepaddle.org.cn/insta
- 其他依赖安装
通过以下命令安装python包依赖,请确保至少执行过一次以下命令:
```
cd RemoteSensing
pip install -r requirements.txt
```
......@@ -63,9 +68,9 @@ RemoteSensing # 根目录
```
其中,相应的文件名可根据需要自行定义。
遥感领域图像格式多种多样,不同传感器产生的数据格式可能不同。为方便数据加载,本分割库统一采用numpy存储格式`npy`作为原图格式,采用`png`无损压缩格式作为标注图片格式。
原图的前两维是图像的尺寸,第3维是图像的通道数。
标注图像为单通道图像,像素值即为对应的类别,像素标注类别需要从0开始递增
遥感影像的格式多种多样,不同传感器产生的数据格式也可能不同。PaddleSeg以numpy.ndarray数据类型进行图像预处理。为统一接口并方便数据加载,我们采用numpy存储格式`npy`作为原图格式,采用`png`无损压缩格式作为标注图片格式。
原图的尺寸应为(h, w, channel),其中h, w为图像的高和宽,channel为图像的通道数。
标注图像为单通道图像,像素值即为对应的类别,像素标注类别需要从0开始递增
例如0,1,2,3表示有4种类别,标注类别最多为256类。其中可以指定特定的像素值用于表示该值的像素不参与训练和评估(默认为255)。
`train_list.txt``val_list.txt`文本以空格为分割符分为两列,第一列为图像文件相对于dataset的相对路径,第二列为标注图像文件相对于dataset的相对路径。如下所示:
......@@ -93,154 +98,38 @@ labelB
### 1. 准备数据集
为了快速体验,我们准备了一个小型demo数据集,已位于`RemoteSensing/dataset/demo/`目录下.
对于您自己的数据集,您需要按照上述的数据协议进行格式转换,可分别使用numpy和pil库保存遥感数据和标注图片。其中numpy api示例如下:
对于您自己的数据集,您需要按照上述的数据协议进行格式转换,可分别使用numpy和Pillow库保存遥感数据和标注图片。其中numpy API示例如下:
```python
import numpy as np
# 保存遥感数据
# 将遥感数据保存到以 .npy 为扩展名的文件中
# img类型:numpy.ndarray
np.save(save_path, img)
```
### 2. 训练代码开发
通过如下`train_demo.py`代码进行训练。
> 导入RemoteSensing api
```python
import transforms.transforms as T
from readers.reader import Reader
from models import UNet
```
> 定义训练和验证时的数据处理和增强流程, 在`train_transforms`中加入了`RandomVerticalFlip`,`RandomHorizontalFlip`等数据增强方式。
```python
train_transforms = T.Compose([
T.RandomVerticalFlip(0.5),
T.RandomHorizontalFlip(0.5),
T.ResizeStepScaling(0.5, 2.0, 0.25),
T.RandomPaddingCrop(256),
T.Normalize(mean=[0.5] * channel, std=[0.5] * channel),
])
eval_transforms = T.Compose([
T.Normalize(mean=[0.5] * channel, std=[0.5] * channel),
])
```
> 定义数据读取器
```python
import os
import os.path as osp
train_list = osp.join(data_dir, 'train.txt')
val_list = osp.join(data_dir, 'val.txt')
label_list = osp.join(data_dir, 'labels.txt')
train_reader = Reader(
data_dir=data_dir,
file_list=train_list,
label_list=label_list,
transforms=train_transforms,
num_workers=8,
buffer_size=16,
shuffle=True,
parallel_method='thread')
eval_reader = Reader(
data_dir=data_dir,
file_list=val_list,
label_list=label_list,
transforms=eval_transforms,
num_workers=8,
buffer_size=16,
shuffle=False,
parallel_method='thread')
```
> 模型构建
```python
model = UNet(
num_classes=2, input_channel=channel, use_bce_loss=True, use_dice_loss=True)
```
> 模型训练,并开启边训边评估
```python
model.train(
num_epochs=num_epochs,
train_reader=train_reader,
train_batch_size=train_batch_size,
eval_reader=eval_reader,
save_interval_epochs=5,
log_interval_steps=10,
save_dir=save_dir,
pretrain_weights=None,
optimizer=None,
learning_rate=lr,
use_vdl=True
)
```
### 3. 模型训练
> 设置GPU卡号
### 2. 模型训练
#### (1) 设置GPU卡号
```shell script
export CUDA_VISIBLE_DEVICES=0
```
> 在RemoteSensing目录下运行`train_demo.py`即可开始训练。
#### (2) 以U-Net为例,在RemoteSensing目录下运行`train_demo.py`即可开始训练。
```shell script
python train_demo.py --data_dir dataset/demo/ --save_dir saved_model/unet/ --channel 3 --num_epochs 20
```
### 4. 模型预测代码开发
通过如下`predict_demo.py`代码进行预测。
> 导入RemoteSensing api
```python
from models import load_model
```
> 加载训练过程中最好的模型,设置预测结果保存路径。
```python
import os
import os.path as osp
model = load_model(osp.join(save_dir, 'best_model'))
pred_dir = osp.join(save_dir, 'pred')
if not osp.exists(pred_dir):
os.mkdir(pred_dir)
```
> 使用模型对验证集进行测试,并保存预测结果。
```python
import numpy as np
from PIL import Image as Image
val_list = osp.join(data_dir, 'val.txt')
color_map = [0, 0, 0, 255, 255, 255]
with open(val_list) as f:
lines = f.readlines()
for line in lines:
img_path = line.split(' ')[0]
print('Predicting {}'.format(img_path))
img_path_ = osp.join(data_dir, img_path)
pred = model.predict(img_path_)
# 以伪彩色png图片保存预测结果
pred_name = osp.basename(img_path).rstrip('npy') + 'png'
pred_path = osp.join(pred_dir, pred_name)
pred_mask = Image.fromarray(pred.astype(np.uint8), mode='P')
pred_mask.putpalette(color_map)
pred_mask.save(pred_path)
python train_demo.py --model_type unet --data_dir dataset/demo/ --save_dir saved_model/unet/ --channel 3 --num_epochs 20
```
### 5. 模型预测
> 设置GPU卡号
### 3. 模型预测
#### (1) 设置GPU卡号
```shell script
export CUDA_VISIBLE_DEVICES=0
```
> 在RemoteSensing目录下运行`predict_demo.py`即可开始训练。
#### (2) 以刚训练好的U-Net最优模型为例,在RemoteSensing目录下运行`predict_demo.py`即可开始训练。
```shell script
python predict_demo.py --data_dir dataset/demo/ --load_model_dir saved_model/unet/best_model/
python predict_demo.py --data_dir dataset/demo/ --file_list val.txt --load_model_dir saved_model/unet/best_model
```
## Api说明
## API说明
您可以使用`RemoteSensing`目录下提供的api构建自己的分割代码。
您可以使用`RemoteSensing`目录下提供的API构建自己的分割代码。
- [数据处理-transforms](docs/transforms.md)
from .load_model import *
from .unet import *
from .hrnet import *
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
import paddle.fluid as fluid
import os
from os import path as osp
import numpy as np
from collections import OrderedDict
import copy
import math
import time
import tqdm
import cv2
import yaml
import utils
import utils.logging as logging
from utils.utils import seconds_to_hms, get_environ_info
from utils.metrics import ConfusionMatrix
import nets
import transforms.transforms as T
from .base import BaseModel
def dict2str(dict_input):
out = ''
for k, v in dict_input.items():
try:
v = round(float(v), 6)
except:
pass
out = out + '{}={}, '.format(k, v)
return out.strip(', ')
class HRNet(BaseModel):
def __init__(self,
num_classes=2,
input_channel=3,
stage1_num_modules=1,
stage1_num_blocks=[4],
stage1_num_channels=[64],
stage2_num_modules=1,
stage2_num_blocks=[4, 4],
stage2_num_channels=[18, 36],
stage3_num_modules=4,
stage3_num_blocks=[4, 4, 4],
stage3_num_channels=[18, 36, 72],
stage4_num_modules=3,
stage4_num_blocks=[4, 4, 4, 4],
stage4_num_channels=[18, 36, 72, 144],
use_bce_loss=False,
use_dice_loss=False,
class_weight=None,
ignore_index=255,
sync_bn=True):
super().__init__(
num_classes=num_classes,
use_bce_loss=use_bce_loss,
use_dice_loss=use_dice_loss,
class_weight=class_weight,
ignore_index=ignore_index,
sync_bn=sync_bn)
self.init_params = locals()
self.input_channel = input_channel
self.stage1_num_modules = stage1_num_modules
self.stage1_num_blocks = stage1_num_blocks
self.stage1_num_channels = stage1_num_channels
self.stage2_num_modules = stage2_num_modules
self.stage2_num_blocks = stage2_num_blocks
self.stage2_num_channels = stage2_num_channels
self.stage3_num_modules = stage3_num_modules
self.stage3_num_blocks = stage3_num_blocks
self.stage3_num_channels = stage3_num_channels
self.stage4_num_modules = stage4_num_modules
self.stage4_num_blocks = stage4_num_blocks
self.stage4_num_channels = stage4_num_channels
def build_net(self, mode='train'):
"""应根据不同的情况进行构建"""
model = nets.HRNet(
self.num_classes,
self.input_channel,
mode=mode,
stage1_num_modules=self.stage1_num_modules,
stage1_num_blocks=self.stage1_num_blocks,
stage1_num_channels=self.stage1_num_channels,
stage2_num_modules=self.stage2_num_modules,
stage2_num_blocks=self.stage2_num_blocks,
stage2_num_channels=self.stage2_num_channels,
stage3_num_modules=self.stage3_num_modules,
stage3_num_blocks=self.stage3_num_blocks,
stage3_num_channels=self.stage3_num_channels,
stage4_num_modules=self.stage4_num_modules,
stage4_num_blocks=self.stage4_num_blocks,
stage4_num_channels=self.stage4_num_channels,
use_bce_loss=self.use_bce_loss,
use_dice_loss=self.use_dice_loss,
class_weight=self.class_weight,
ignore_index=self.ignore_index)
inputs = model.generate_inputs()
model_out = model.build_net(inputs)
outputs = OrderedDict()
if mode == 'train':
self.optimizer.minimize(model_out)
outputs['loss'] = model_out
else:
outputs['pred'] = model_out[0]
outputs['logit'] = model_out[1]
return inputs, outputs
def train(self,
num_epochs,
train_reader,
train_batch_size=2,
eval_reader=None,
eval_best_metric='kappa',
save_interval_epochs=1,
log_interval_steps=2,
save_dir='output',
pretrain_weights=None,
resume_weights=None,
optimizer=None,
learning_rate=0.01,
lr_decay_power=0.9,
regularization_coeff=5e-4,
use_vdl=False):
super().train(
num_epochs=num_epochs,
train_reader=train_reader,
train_batch_size=train_batch_size,
eval_reader=eval_reader,
eval_best_metric=eval_best_metric,
save_interval_epochs=save_interval_epochs,
log_interval_steps=log_interval_steps,
save_dir=save_dir,
pretrain_weights=pretrain_weights,
resume_weights=resume_weights,
optimizer=optimizer,
learning_rate=learning_rate,
lr_decay_power=lr_decay_power,
regularization_coeff=regularization_coeff,
use_vdl=use_vdl)
......@@ -25,7 +25,7 @@ import models
def load_model(model_dir):
if not osp.exists(osp.join(model_dir, "model.yml")):
raise Exception("There's not model.yml in {}".format(model_dir))
raise Exception("There's no model.yml in {}".format(model_dir))
with open(osp.join(model_dir, "model.yml")) as f:
info = yaml.load(f.read(), Loader=yaml.Loader)
status = info['status']
......@@ -35,8 +35,7 @@ def load_model(model_dir):
info['Model']))
model = getattr(models, info['Model'])(**info['_init_params'])
if status == "Normal" or \
status == "Prune":
if status == "Normal":
startup_prog = fluid.Program()
model.test_prog = fluid.Program()
with fluid.program_guard(model.test_prog, startup_prog):
......@@ -45,17 +44,12 @@ def load_model(model_dir):
mode='test')
model.test_prog = model.test_prog.clone(for_test=True)
model.exe.run(startup_prog)
if status == "Prune":
from .slim.prune import update_program
model.test_prog = update_program(model.test_prog, model_dir,
model.places[0])
import pickle
with open(osp.join(model_dir, 'model.pdparams'), 'rb') as f:
load_dict = pickle.load(f)
fluid.io.set_program_state(model.test_prog, load_dict)
elif status == "Infer" or \
status == "Quant":
elif status == "Infer":
[prog, input_names, outputs] = fluid.io.load_inference_model(
model_dir, model.exe, params_filename='__params__')
model.test_prog = prog
......@@ -67,8 +61,8 @@ def load_model(model_dir):
for i, out in enumerate(outputs):
var_desc = test_outputs_info[i]
model.test_outputs[var_desc[0]] = out
if 'Transforms' in info:
model.test_transforms = build_transforms(info['Transforms'])
if 'test_transforms' in info:
model.test_transforms = build_transforms(info['test_transforms'])
model.eval_transforms = copy.deepcopy(model.test_transforms)
if '_Attributes' in info:
......
......@@ -13,19 +13,18 @@
#limitations under the License.
from __future__ import absolute_import
import os.path as osp
import numpy as np
import math
import cv2
import paddle.fluid as fluid
import utils.logging as logging
from collections import OrderedDict
from .base import BaseAPI
from .base import BaseModel
from utils.metrics import ConfusionMatrix
import nets
class UNet(BaseAPI):
class UNet(BaseModel):
"""实现UNet网络的构建并进行训练、评估、预测和模型导出。
Args:
......@@ -55,9 +54,16 @@ class UNet(BaseAPI):
use_bce_loss=False,
use_dice_loss=False,
class_weight=None,
ignore_index=255):
ignore_index=255,
sync_bn=True):
super().__init__(
num_classes=num_classes,
use_bce_loss=use_bce_loss,
use_dice_loss=use_dice_loss,
class_weight=class_weight,
ignore_index=ignore_index,
sync_bn=sync_bn)
self.init_params = locals()
super(UNet, self).__init__()
# dice_loss或bce_loss只适用两类分割中
if num_classes > 2 and (use_bce_loss or use_dice_loss):
raise ValueError(
......@@ -115,24 +121,6 @@ class UNet(BaseAPI):
outputs['logit'] = model_out[1]
return inputs, outputs
def default_optimizer(self,
learning_rate,
num_epochs,
num_steps_each_epoch,
lr_decay_power=0.9):
decay_step = num_epochs * num_steps_each_epoch
lr_decay = fluid.layers.polynomial_decay(
learning_rate,
decay_step,
end_learning_rate=0,
power=lr_decay_power)
optimizer = fluid.optimizer.Momentum(
lr_decay,
momentum=0.9,
regularization=fluid.regularizer.L2Decay(
regularization_coeff=4e-05))
return optimizer
def train(self,
num_epochs,
train_reader,
......@@ -142,13 +130,13 @@ class UNet(BaseAPI):
save_interval_epochs=1,
log_interval_steps=2,
save_dir='output',
pretrain_weights='COCO',
pretrain_weights=None,
resume_weights=None,
optimizer=None,
learning_rate=0.01,
lr_decay_power=0.9,
use_vdl=False,
sensitivities_file=None,
eval_metric_loss=0.05):
regularization_coeff=5e-4,
use_vdl=False):
"""训练。
Args:
......@@ -160,46 +148,17 @@ class UNet(BaseAPI):
save_interval_epochs (int): 模型保存间隔(单位:迭代轮数)。默认为1。
log_interval_steps (int): 训练日志输出间隔(单位:迭代次数)。默认为2。
save_dir (str): 模型保存路径。默认'output'。
pretrain_weights (str): 若指定为路径时,则加载路径下预训练模型;若为字符串'COCO',
则自动下载在COCO图片数据上预训练的模型权重;若为None,则不使用预训练模型。默认为'COCO'。
pretrain_weights (str): 若指定为路径时,则加载路径下预训练模型;若为None,则不使用预训练模型。
optimizer (paddle.fluid.optimizer): 优化器。当改参数为None时,使用默认的优化器:使用
fluid.optimizer.Momentum优化方法,polynomial的学习率衰减策略。
learning_rate (float): 默认优化器的初始学习率。默认0.01。
lr_decay_power (float): 默认优化器学习率多项式衰减系数。默认0.9。
use_vdl (bool): 是否使用VisualDL进行可视化。默认False。
sensitivities_file (str): 若指定为路径时,则加载路径下敏感度信息进行裁剪;若为字符串'DEFAULT',
则自动下载在ImageNet图片数据上获得的敏感度信息进行裁剪;若为None,则不进行裁剪。默认为None。
eval_metric_loss (float): 可容忍的精度损失。默认为0.05。
Raises:
ValueError: 模型从inference model进行加载。
"""
if not self.trainable:
raise ValueError(
"Model is not trainable since it was loaded from a inference model."
)
self.labels = train_reader.labels
if optimizer is None:
num_steps_each_epoch = train_reader.num_samples // train_batch_size
optimizer = self.default_optimizer(
learning_rate=learning_rate,
num_epochs=num_epochs,
num_steps_each_epoch=num_steps_each_epoch,
lr_decay_power=lr_decay_power)
self.optimizer = optimizer
# 构建训练、验证、预测网络
self.build_program()
# 初始化网络权重
self.net_initialize(
startup_prog=fluid.default_startup_program(),
pretrain_weights=pretrain_weights,
save_dir=save_dir,
sensitivities_file=sensitivities_file,
eval_metric_loss=eval_metric_loss)
# 训练
self.train_loop(
super().train(
num_epochs=num_epochs,
train_reader=train_reader,
train_batch_size=train_batch_size,
......@@ -208,6 +167,12 @@ class UNet(BaseAPI):
save_interval_epochs=save_interval_epochs,
log_interval_steps=log_interval_steps,
save_dir=save_dir,
pretrain_weights=pretrain_weights,
resume_weights=resume_weights,
optimizer=optimizer,
learning_rate=learning_rate,
lr_decay_power=lr_decay_power,
regularization_coeff=regularization_coeff,
use_vdl=use_vdl)
def evaluate(self,
......@@ -231,7 +196,7 @@ class UNet(BaseAPI):
tuple (metrics, eval_details):当return_details为True时,增加返回dict (eval_details),
包含关键字:'confusion_matrix',表示评估的混淆矩阵。
"""
self.arrange_transforms(transforms=eval_reader.transforms, mode='eval')
self.arrange_transform(transforms=eval_reader.transforms, mode='eval')
total_steps = math.ceil(eval_reader.num_samples * 1.0 / batch_size)
conf_mat = ConfusionMatrix(self.num_classes, streaming=True)
data_generator = eval_reader.generator(
......@@ -272,11 +237,16 @@ class UNet(BaseAPI):
category_iou, miou = conf_mat.mean_iou()
category_acc, macc = conf_mat.accuracy()
precision, recall = conf_mat.precision_recall()
metrics = OrderedDict(
zip(['miou', 'category_iou', 'macc', 'category_acc', 'kappa'],
[miou, category_iou, macc, category_acc,
conf_mat.kappa()]))
zip([
'miou', 'category_iou', 'macc', 'category_acc', 'kappa',
'precision', 'recall'
], [
miou, category_iou, macc, category_acc,
conf_mat.kappa(), precision, recall
]))
if return_details:
eval_details = {
'confusion_matrix': conf_mat.confusion_matrix.tolist()
......@@ -296,11 +266,10 @@ class UNet(BaseAPI):
if transforms is None and not hasattr(self, 'test_transforms'):
raise Exception("transforms need to be defined, now is None.")
if transforms is not None:
self.arrange_transforms(transforms=transforms, mode='test')
self.arrange_transform(transforms=transforms, mode='test')
im, im_info = transforms(im_file)
else:
self.arrange_transforms(
transforms=self.test_transforms, mode='test')
self.arrange_transform(transforms=self.test_transforms, mode='test')
im, im_info = self.test_transforms(im_file)
im = im.astype(np.float32)
im = np.expand_dims(im, axis=0)
......@@ -319,4 +288,4 @@ class UNet(BaseAPI):
h, w = im_info[k][0], im_info[k][1]
pred = pred[0:h, 0:w]
return pred
return {'label_map': pred}
from .unet import UNet
from .hrnet import HRNet
# coding: utf8
# copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from collections import OrderedDict
import paddle.fluid as fluid
from paddle.fluid.initializer import MSRA
from paddle.fluid.param_attr import ParamAttr
from .loss import softmax_with_loss
from .loss import dice_loss
from .loss import bce_loss
from .libs import sigmoid_to_softmax
class HRNet(object):
def __init__(self,
num_classes,
input_channel=3,
mode='train',
stage1_num_modules=1,
stage1_num_blocks=[4],
stage1_num_channels=[64],
stage2_num_modules=1,
stage2_num_blocks=[4, 4],
stage2_num_channels=[18, 36],
stage3_num_modules=4,
stage3_num_blocks=[4, 4, 4],
stage3_num_channels=[18, 36, 72],
stage4_num_modules=3,
stage4_num_blocks=[4, 4, 4, 4],
stage4_num_channels=[18, 36, 72, 144],
use_bce_loss=False,
use_dice_loss=False,
class_weight=None,
ignore_index=255):
# dice_loss或bce_loss只适用两类分割中
if num_classes > 2 and (use_bce_loss or use_dice_loss):
raise ValueError(
"dice loss and bce loss is only applicable to binary classfication"
)
if class_weight is not None:
if isinstance(class_weight, list):
if len(class_weight) != num_classes:
raise ValueError(
"Length of class_weight should be equal to number of classes"
)
elif isinstance(class_weight, str):
if class_weight.lower() != 'dynamic':
raise ValueError(
"if class_weight is string, must be dynamic!")
else:
raise TypeError(
'Expect class_weight is a list or string but receive {}'.
format(type(class_weight)))
self.num_classes = num_classes
self.input_channel = input_channel
self.mode = mode
self.use_bce_loss = use_bce_loss
self.use_dice_loss = use_dice_loss
self.class_weight = class_weight
self.ignore_index = ignore_index
self.stage1_num_modules = stage1_num_modules
self.stage1_num_blocks = stage1_num_blocks
self.stage1_num_channels = stage1_num_channels
self.stage2_num_modules = stage2_num_modules
self.stage2_num_blocks = stage2_num_blocks
self.stage2_num_channels = stage2_num_channels
self.stage3_num_modules = stage3_num_modules
self.stage3_num_blocks = stage3_num_blocks
self.stage3_num_channels = stage3_num_channels
self.stage4_num_modules = stage4_num_modules
self.stage4_num_blocks = stage4_num_blocks
self.stage4_num_channels = stage4_num_channels
def build_net(self, inputs):
if self.use_dice_loss or self.use_bce_loss:
self.num_classes = 1
image = inputs['image']
logit = self._high_resolution_net(image, self.num_classes)
if self.num_classes == 1:
out = sigmoid_to_softmax(logit)
out = fluid.layers.transpose(out, [0, 2, 3, 1])
else:
out = fluid.layers.transpose(logit, [0, 2, 3, 1])
pred = fluid.layers.argmax(out, axis=3)
pred = fluid.layers.unsqueeze(pred, axes=[3])
if self.mode == 'train':
label = inputs['label']
mask = label != self.ignore_index
return self._get_loss(logit, label, mask)
else:
if self.num_classes == 1:
logit = sigmoid_to_softmax(logit)
else:
logit = fluid.layers.softmax(logit, axis=1)
return pred, logit
return logit
def generate_inputs(self):
inputs = OrderedDict()
inputs['image'] = fluid.data(
dtype='float32',
shape=[None, self.input_channel, None, None],
name='image')
if self.mode == 'train':
inputs['label'] = fluid.data(
dtype='int32', shape=[None, 1, None, None], name='label')
elif self.mode == 'eval':
inputs['label'] = fluid.data(
dtype='int32', shape=[None, 1, None, None], name='label')
return inputs
def _get_loss(self, logit, label, mask):
avg_loss = 0
if not (self.use_dice_loss or self.use_bce_loss):
avg_loss += softmax_with_loss(
logit,
label,
mask,
num_classes=self.num_classes,
weight=self.class_weight,
ignore_index=self.ignore_index)
else:
if self.use_dice_loss:
avg_loss += dice_loss(logit, label, mask)
if self.use_bce_loss:
avg_loss += bce_loss(
logit, label, mask, ignore_index=self.ignore_index)
return avg_loss
def _conv_bn_layer(self,
input,
filter_size,
num_filters,
stride=1,
padding=1,
num_groups=1,
if_act=True,
name=None):
conv = fluid.layers.conv2d(
input=input,
num_filters=num_filters,
filter_size=filter_size,
stride=stride,
padding=(filter_size - 1) // 2,
groups=num_groups,
act=None,
param_attr=ParamAttr(initializer=MSRA(), name=name + '_weights'),
bias_attr=False)
bn_name = name + '_bn'
bn = fluid.layers.batch_norm(
input=conv,
param_attr=ParamAttr(
name=bn_name + "_scale",
initializer=fluid.initializer.Constant(1.0)),
bias_attr=ParamAttr(
name=bn_name + "_offset",
initializer=fluid.initializer.Constant(0.0)),
moving_mean_name=bn_name + '_mean',
moving_variance_name=bn_name + '_variance')
if if_act:
bn = fluid.layers.relu(bn)
return bn
def _basic_block(self,
input,
num_filters,
stride=1,
downsample=False,
name=None):
residual = input
conv = self._conv_bn_layer(
input=input,
filter_size=3,
num_filters=num_filters,
stride=stride,
name=name + '_conv1')
conv = self._conv_bn_layer(
input=conv,
filter_size=3,
num_filters=num_filters,
if_act=False,
name=name + '_conv2')
if downsample:
residual = self._conv_bn_layer(
input=input,
filter_size=1,
num_filters=num_filters,
if_act=False,
name=name + '_downsample')
return fluid.layers.elementwise_add(x=residual, y=conv, act='relu')
def _bottleneck_block(self,
input,
num_filters,
stride=1,
downsample=False,
name=None):
residual = input
conv = self._conv_bn_layer(
input=input,
filter_size=1,
num_filters=num_filters,
name=name + '_conv1')
conv = self._conv_bn_layer(
input=conv,
filter_size=3,
num_filters=num_filters,
stride=stride,
name=name + '_conv2')
conv = self._conv_bn_layer(
input=conv,
filter_size=1,
num_filters=num_filters * 4,
if_act=False,
name=name + '_conv3')
if downsample:
residual = self._conv_bn_layer(
input=input,
filter_size=1,
num_filters=num_filters * 4,
if_act=False,
name=name + '_downsample')
return fluid.layers.elementwise_add(x=residual, y=conv, act='relu')
def _fuse_layers(self, x, channels, multi_scale_output=True, name=None):
out = []
for i in range(len(channels) if multi_scale_output else 1):
residual = x[i]
shape = fluid.layers.shape(residual)[-2:]
for j in range(len(channels)):
if j > i:
y = self._conv_bn_layer(
x[j],
filter_size=1,
num_filters=channels[i],
if_act=False,
name=name + '_layer_' + str(i + 1) + '_' + str(j + 1))
y = fluid.layers.resize_bilinear(input=y, out_shape=shape)
residual = fluid.layers.elementwise_add(
x=residual, y=y, act=None)
elif j < i:
y = x[j]
for k in range(i - j):
if k == i - j - 1:
y = self._conv_bn_layer(
y,
filter_size=3,
num_filters=channels[i],
stride=2,
if_act=False,
name=name + '_layer_' + str(i + 1) + '_' +
str(j + 1) + '_' + str(k + 1))
else:
y = self._conv_bn_layer(
y,
filter_size=3,
num_filters=channels[j],
stride=2,
name=name + '_layer_' + str(i + 1) + '_' +
str(j + 1) + '_' + str(k + 1))
residual = fluid.layers.elementwise_add(
x=residual, y=y, act=None)
residual = fluid.layers.relu(residual)
out.append(residual)
return out
def _branches(self, x, block_num, channels, name=None):
out = []
for i in range(len(channels)):
residual = x[i]
for j in range(block_num[i]):
residual = self._basic_block(
residual,
channels[i],
name=name + '_branch_layer_' + str(i + 1) + '_' +
str(j + 1))
out.append(residual)
return out
def _high_resolution_module(self,
x,
blocks,
channels,
multi_scale_output=True,
name=None):
residual = self._branches(x, blocks, channels, name=name)
out = self._fuse_layers(
residual,
channels,
multi_scale_output=multi_scale_output,
name=name)
return out
def _transition_layer(self, x, in_channels, out_channels, name=None):
num_in = len(in_channels)
num_out = len(out_channels)
out = []
for i in range(num_out):
if i < num_in:
if in_channels[i] != out_channels[i]:
residual = self._conv_bn_layer(
x[i],
filter_size=3,
num_filters=out_channels[i],
name=name + '_layer_' + str(i + 1))
out.append(residual)
else:
out.append(x[i])
else:
residual = self._conv_bn_layer(
x[-1],
filter_size=3,
num_filters=out_channels[i],
stride=2,
name=name + '_layer_' + str(i + 1))
out.append(residual)
return out
def _stage(self,
x,
num_modules,
num_blocks,
num_channels,
multi_scale_output=True,
name=None):
out = x
for i in range(num_modules):
if i == num_modules - 1 and multi_scale_output == False:
out = self._high_resolution_module(
out,
num_blocks,
num_channels,
multi_scale_output=False,
name=name + '_' + str(i + 1))
else:
out = self._high_resolution_module(
out, num_blocks, num_channels, name=name + '_' + str(i + 1))
return out
def _layer1(self, input, num_modules, num_blocks, num_channels, name=None):
# num_modules 默认为1,是否增加处理,官网实现为[1],是否对齐。
conv = input
for i in range(num_blocks[0]):
conv = self._bottleneck_block(
conv,
num_filters=num_channels[0],
downsample=True if i == 0 else False,
name=name + '_' + str(i + 1))
return conv
def _high_resolution_net(self, input, num_classes):
x = self._conv_bn_layer(
input=input,
filter_size=3,
num_filters=self.stage1_num_channels[0],
stride=2,
if_act=True,
name='layer1_1')
x = self._conv_bn_layer(
input=x,
filter_size=3,
num_filters=self.stage1_num_channels[0],
stride=2,
if_act=True,
name='layer1_2')
la1 = self._layer1(
x,
self.stage1_num_modules,
self.stage1_num_blocks,
self.stage1_num_channels,
name='layer2')
tr1 = self._transition_layer([la1],
self.stage1_num_channels,
self.stage2_num_channels,
name='tr1')
st2 = self._stage(
tr1,
self.stage2_num_modules,
self.stage2_num_blocks,
self.stage2_num_channels,
name='st2')
tr2 = self._transition_layer(
st2, self.stage2_num_channels, self.stage3_num_channels, name='tr2')
st3 = self._stage(
tr2,
self.stage3_num_modules,
self.stage3_num_blocks,
self.stage3_num_channels,
name='st3')
tr3 = self._transition_layer(
st3, self.stage3_num_channels, self.stage4_num_channels, name='tr3')
st4 = self._stage(
tr3,
self.stage4_num_modules,
self.stage4_num_blocks,
self.stage4_num_channels,
name='st4')
# upsample
shape = fluid.layers.shape(st4[0])[-2:]
st4[1] = fluid.layers.resize_bilinear(st4[1], out_shape=shape)
st4[2] = fluid.layers.resize_bilinear(st4[2], out_shape=shape)
st4[3] = fluid.layers.resize_bilinear(st4[3], out_shape=shape)
out = fluid.layers.concat(st4, axis=1)
last_channels = sum(self.stage4_num_channels)
out = self._conv_bn_layer(
input=out,
filter_size=1,
num_filters=last_channels,
stride=1,
if_act=True,
name='conv-2')
out = fluid.layers.conv2d(
input=out,
num_filters=num_classes,
filter_size=1,
stride=1,
padding=0,
act=None,
param_attr=ParamAttr(initializer=MSRA(), name='conv-1_weights'),
bias_attr=False)
input_shape = fluid.layers.shape(input)[-2:]
out = fluid.layers.resize_bilinear(out, input_shape)
return out
import os
import os.path as osp
import sys
import numpy as np
from PIL import Image as Image
import argparse
......@@ -8,35 +9,65 @@ from models import load_model
def parse_args():
parser = argparse.ArgumentParser(description='RemoteSensing predict')
parser.add_argument(
'--single_img',
dest='single_img',
help='single image path to predict',
default=None,
type=str)
parser.add_argument(
'--data_dir',
dest='data_dir',
help='dataset directory',
default=None,
type=str)
parser.add_argument(
'--file_list',
dest='file_list',
help='file name of predict file list',
default=None,
type=str)
parser.add_argument(
'--load_model_dir',
dest='load_model_dir',
help='model load directory',
default=None,
type=str)
parser.add_argument(
'--save_img_dir',
dest='save_img_dir',
help='save directory name of predict results',
default='predict_results',
type=str)
if len(sys.argv) < 2:
parser.print_help()
sys.exit(1)
return parser.parse_args()
args = parse_args()
data_dir = args.data_dir
file_list = args.file_list
single_img = args.single_img
load_model_dir = args.load_model_dir
save_img_dir = args.save_img_dir
if not osp.exists(save_img_dir):
os.makedirs(save_img_dir)
# predict
model = load_model(load_model_dir)
pred_dir = osp.join(load_model_dir, 'predict')
if not osp.exists(pred_dir):
os.mkdir(pred_dir)
val_list = osp.join(data_dir, 'val.txt')
color_map = [0, 0, 0, 255, 255, 255]
with open(val_list) as f:
color_map = [0, 0, 0, 0, 255, 0]
if single_img is not None:
pred = model.predict(single_img)
# 以伪彩色png图片保存预测结果
pred_name = osp.basename(single_img).rstrip('npy') + 'png'
pred_path = osp.join(save_img_dir, pred_name)
pred_mask = Image.fromarray(pred['label_map'].astype(np.uint8), mode='P')
pred_mask.putpalette(color_map)
pred_mask.save(pred_path)
elif (file_list is not None) and (data_dir is not None):
with open(osp.join(data_dir, file_list)) as f:
lines = f.readlines()
for line in lines:
img_path = line.split(' ')[0]
......@@ -47,7 +78,12 @@ with open(val_list) as f:
# 以伪彩色png图片保存预测结果
pred_name = osp.basename(img_path).rstrip('npy') + 'png'
pred_path = osp.join(pred_dir, pred_name)
pred_mask = Image.fromarray(pred.astype(np.uint8), mode='P')
pred_path = osp.join(save_img_dir, pred_name)
pred_mask = Image.fromarray(
pred['label_map'].astype(np.uint8), mode='P')
pred_mask.putpalette(color_map)
pred_mask.save(pred_path)
else:
raise Exception(
'You should either set the parameter single_img, or set the parameters data_dir, file_list.'
)
......@@ -2,11 +2,17 @@ import os.path as osp
import argparse
import transforms.transforms as T
from readers.reader import Reader
from models import UNet
from models import UNet, HRNet
def parse_args():
parser = argparse.ArgumentParser(description='RemoteSensing training')
parser.add_argument(
'--model_type',
dest='model_type',
help="Model type for traing, which is one of ('unet', 'hrnet')",
type=str,
default='hrnet')
parser.add_argument(
'--data_dir',
dest='data_dir',
......@@ -43,7 +49,6 @@ def parse_args():
args = parse_args()
data_dir = args.data_dir
save_dir = args.save_dir
channel = args.channel
......@@ -52,17 +57,9 @@ train_batch_size = args.train_batch_size
lr = args.lr
# 定义训练和验证时的transforms
train_transforms = T.Compose([
T.RandomVerticalFlip(0.5),
T.RandomHorizontalFlip(0.5),
T.ResizeStepScaling(0.5, 2.0, 0.25),
T.RandomPaddingCrop(256),
T.Normalize(mean=[0.5] * channel, std=[0.5] * channel),
])
train_transforms = T.Compose([T.RandomHorizontalFlip(0.5), T.Normalize()])
eval_transforms = T.Compose([
T.Normalize(mean=[0.5] * channel, std=[0.5] * channel),
])
eval_transforms = T.Compose([T.Normalize()])
train_list = osp.join(data_dir, 'train.txt')
val_list = osp.join(data_dir, 'val.txt')
......@@ -74,23 +71,30 @@ train_reader = Reader(
file_list=train_list,
label_list=label_list,
transforms=train_transforms,
num_workers=8,
buffer_size=16,
shuffle=True,
parallel_method='thread')
shuffle=True)
eval_reader = Reader(
data_dir=data_dir,
file_list=val_list,
label_list=label_list,
transforms=eval_transforms,
num_workers=8,
buffer_size=16,
shuffle=False,
parallel_method='thread')
transforms=eval_transforms)
model = UNet(
num_classes=2, input_channel=channel, use_bce_loss=True, use_dice_loss=True)
if args.model_type == 'unet':
model = UNet(
num_classes=2,
input_channel=channel,
use_bce_loss=True,
use_dice_loss=True)
elif args.model_type == 'hrnet':
model = HRNet(
num_classes=2,
input_channel=channel,
use_bce_loss=True,
use_dice_loss=True)
else:
raise ValueError(
"--model_type: {} is set wrong, it shold be one of ('unet', "
"'hrnet')".format(args.model_type))
model.train(
num_epochs=num_epochs,
......@@ -100,7 +104,5 @@ model.train(
save_interval_epochs=5,
log_interval_steps=10,
save_dir=save_dir,
pretrain_weights=None,
optimizer=None,
learning_rate=lr,
use_vdl=True)
......@@ -143,3 +143,14 @@ class ConfusionMatrix(object):
kappa = (po - pe) / (1 - pe)
return kappa
def precision_recall(self):
'''
precision, recall of foreground(value=1) for 2 categories
'''
TP = self.confusion_matrix[1, 1]
FN = self.confusion_matrix[1, 0]
FP = self.confusion_matrix[0, 1]
recall = TP / (TP + FN)
precision = TP / (TP + FP)
return precision, recall
......@@ -12,13 +12,10 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
import time
import os
import os.path as osp
import numpy as np
import six
import yaml
import math
from . import logging
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册