Commit 0a2bf840 authored by wuyefeilin, committed by wuzewu

add hrnet turtorial (#103)

* add hrnet tutorial
Parent 50c2074e
......@@ -64,6 +64,7 @@ $ pip install -r requirements.txt
* [How to train U-Net](./turtorial/finetune_unet.md)
* [How to train ICNet](./turtorial/finetune_icnet.md)
* [How to train PSPNet](./turtorial/finetune_pspnet.md)
* [How to train HRNet](./turtorial/finetune_hrnet.md)
### Inference deployment
......
TRAIN_CROP_SIZE: (512, 512) # (width, height), for unpadding rangescaling and stepscaling
EVAL_CROP_SIZE: (512, 512) # (width, height), for unpadding rangescaling and stepscaling
AUG:
AUG_METHOD: "unpadding" # choice unpadding rangescaling and stepscaling
FIX_RESIZE_SIZE: (512, 512) # (width, height), for unpadding
INF_RESIZE_VALUE: 500 # for rangescaling
MAX_RESIZE_VALUE: 600 # for rangescaling
MIN_RESIZE_VALUE: 400 # for rangescaling
MAX_SCALE_FACTOR: 1.25 # for stepscaling
MIN_SCALE_FACTOR: 0.75 # for stepscaling
SCALE_STEP_SIZE: 0.25 # for stepscaling
MIRROR: True
BATCH_SIZE: 4
DATASET:
DATA_DIR: "./dataset/mini_pet/"
IMAGE_TYPE: "rgb" # choice rgb or rgba
NUM_CLASSES: 3
TEST_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt"
TRAIN_FILE_LIST: "./dataset/mini_pet/file_list/train_list.txt"
VAL_FILE_LIST: "./dataset/mini_pet/file_list/val_list.txt"
VIS_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt"
IGNORE_INDEX: 255
SEPARATOR: " "
FREEZE:
MODEL_FILENAME: "__model__"
PARAMS_FILENAME: "__params__"
MODEL:
MODEL_NAME: "hrnet"
DEFAULT_NORM_TYPE: "bn"
HRNET:
STAGE2:
NUM_CHANNELS: [18, 36]
STAGE3:
NUM_CHANNELS: [18, 36, 72]
STAGE4:
NUM_CHANNELS: [18, 36, 72, 144]
TRAIN:
PRETRAINED_MODEL_DIR: "./pretrained_model/hrnet_w18_bn_cityscapes/"
MODEL_SAVE_DIR: "./saved_model/hrnet_w18_bn_pet/"
SNAPSHOT_EPOCH: 10
TEST:
TEST_MODEL: "./saved_model/hrnet_w18_bn_pet/final"
SOLVER:
NUM_EPOCHS: 100
LR: 0.005
LR_POLICY: "poly"
OPTIMIZER: "sgd"
......@@ -5,10 +5,11 @@ The MODEL Group stores all model-related configuration; it also contains three sub-Groups
* [DeepLabv3p](./model_deeplabv3p_group.md)
* [UNet](./model_unet_group.md)
* [ICNet](./model_icnet_group.md)
* [HRNet](./model_hrnet_group.md)
## `MODEL_NAME`
The selected model. The `deeplabv3p`, `unet` and `icnet` models are supported.
The selected model. The `deeplabv3p`, `unet`, `icnet` and `hrnet` models are supported.
### Default value
......
# cfg.MODEL.HRNET
The MODEL.HRNET sub-Group stores all configuration related to the HRNet model.
## `STAGE2.NUM_MODULES`
The number of times HRNet repeats the modularized block (multi-resolution parallel convolution + multi-resolution fusion) in stage 2.
### Default value
1
<br/>
<br/>
## `STAGE2.NUM_CHANNELS`
The number of channels of each branch in stage 2 of HRNet.
### Default value
[40, 80]
<br/>
<br/>
## `STAGE3.NUM_MODULES`
The number of times HRNet repeats the modularized block in stage 3.
### Default value
4
<br/>
<br/>
## `STAGE3.NUM_CHANNELS`
The number of channels of each branch in stage 3 of HRNet.
### Default value
[40, 80, 160]
<br/>
<br/>
## `STAGE4.NUM_MODULES`
The number of times HRNet repeats the modularized block in stage 4.
### Default value
3
<br/>
<br/>
## `STAGE4.NUM_CHANNELS`
The number of channels of each branch in stage 4 of HRNet.
### Default value
[40, 80, 160, 320]
<br/>
<br/>
\ No newline at end of file
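For reference, the sketch below simply restates the documented default values above (the HRNet-W40 configuration) as they would appear together in a config file; adjust the channel lists to match the HRNet variant you actually use:

```yaml
# minimal sketch of the MODEL.HRNET group using the documented defaults (HRNet-W40)
MODEL:
    MODEL_NAME: "hrnet"
    HRNET:
        STAGE2:
            NUM_MODULES: 1
            NUM_CHANNELS: [40, 80]
        STAGE3:
            NUM_MODULES: 4
            NUM_CHANNELS: [40, 80, 160]
        STAGE4:
            NUM_MODULES: 3
            NUM_CHANNELS: [40, 80, 160, 320]
```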
......@@ -22,6 +22,16 @@ PaddleSeg provides pretrained models on public datasets for all built-in segmentation models
| Xception65 | ImageNet | [Xception65_pretrained.tgz](https://paddleseg.bj.bcebos.com/models/Xception65_pretrained.tgz) | 80.32%/94.47% |
| Xception71 | ImageNet | coming soon | -- |
| Model | Dataset | Download | Top-1/Top-5 Accuracy |
|---|---|---|---|
| HRNet_W18 | ImageNet | [hrnet_w18_imagenet.tar](https://paddleseg.bj.bcebos.com/models/hrnet_w18_imagenet.tar) | 76.92%/93.39% |
| HRNet_W30 | ImageNet | [hrnet_w30_imagenet.tar](https://paddleseg.bj.bcebos.com/models/hrnet_w30_imagenet.tar) | 78.04%/94.02% |
| HRNet_W32 | ImageNet | [hrnet_w32_imagenet.tar](https://paddleseg.bj.bcebos.com/models/hrnet_w32_imagenet.tar) | 78.28%/94.24% |
| HRNet_W40 | ImageNet | [hrnet_w40_imagenet.tar](https://paddleseg.bj.bcebos.com/models/hrnet_w40_imagenet.tar) | 78.77%/94.47% |
| HRNet_W44 | ImageNet | [hrnet_w44_imagenet.tar](https://paddleseg.bj.bcebos.com/models/hrnet_w44_imagenet.tar) | 79.00%/94.51% |
| HRNet_W48 | ImageNet | [hrnet_w48_imagenet.tar](https://paddleseg.bj.bcebos.com/models/hrnet_w48_imagenet.tar) | 78.95%/94.42% |
| HRNet_W64 | ImageNet | [hrnet_w64_imagenet.tar](https://paddleseg.bj.bcebos.com/models/hrnet_w64_imagenet.tar) | 79.30%/94.61% |
## COCO pretrained models
The dataset is a semantic segmentation dataset converted from the COCO instance segmentation dataset.
......@@ -46,3 +56,4 @@ The training set is the Cityscapes training set; testing uses the Cityscapes validation set
| ICNet/bn | Cityscapes |[icnet_cityscapes.tgz](https://paddleseg.bj.bcebos.com/models/icnet_cityscapes.tar.gz) |16|false| 0.6831 |
| PSPNet/bn | Cityscapes |[pspnet50_cityscapes.tgz](https://paddleseg.bj.bcebos.com/models/pspnet50_cityscapes.tgz) |16|false| 0.7013 |
| PSPNet/bn | Cityscapes |[pspnet101_cityscapes.tgz](https://paddleseg.bj.bcebos.com/models/pspnet101_cityscapes.tgz) |16|false| 0.7734 |
| HRNet_W18/bn | Cityscapes |[hrnet_w18_bn_cityscapes.tgz](https://paddleseg.bj.bcebos.com/models/hrnet_w18_bn_cityscapes.tgz) | 4 | false | 0.7936 |
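As an illustration, any of the HRNet pretrained models listed above can be fetched with the repository's download script by passing its model key; the HRNet tutorial uses the same pattern for the Cityscapes weights, so the command below for the ImageNet-pretrained HRNet-W18 is a sketch under that assumption:

```shell
# sketch: download the ImageNet-pretrained HRNet-W18 weights via the download script
python pretrained_model/download_model.py hrnet_w18_bn_imagenet
```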
......@@ -112,6 +112,7 @@ def softmax(logit):
logit = fluid.layers.transpose(logit, [0, 3, 1, 2])
return logit
def sigmoid_to_softmax(logit):
"""
one channel to two channel
......@@ -143,8 +144,9 @@ def build_model(main_prog, start_prog, phase=ModelPhase.TRAIN):
# When exporting the model, add image normalization as a preprocessing step so the image processing pipeline at inference/deployment time is simpler
# At inference time, only a batch_size dimension needs to be added to the input image
if ModelPhase.is_predict(phase):
origin_image = fluid.layers.data(name='image',
shape=[ -1, 1, 1, cfg.DATASET.DATA_DIM],
origin_image = fluid.layers.data(
name='image',
shape=[-1, 1, 1, cfg.DATASET.DATA_DIM],
dtype='float32',
append_batch_size=False)
image = fluid.layers.transpose(origin_image, [0, 3, 1, 2])
......@@ -153,9 +155,12 @@ def build_model(main_prog, start_prog, phase=ModelPhase.TRAIN):
mean = fluid.layers.assign(mean.astype('float32'))
std = np.array(cfg.STD).reshape(1, len(cfg.STD), 1, 1)
std = fluid.layers.assign(std.astype('float32'))
image = (image/255 - mean)/std
image = fluid.layers.resize_bilinear(image,
out_shape=[height, width], align_corners=False, align_mode=0)
image = fluid.layers.resize_bilinear(
image,
out_shape=[height, width],
align_corners=False,
align_mode=0)
image = (image / 255 - mean) / std
else:
image = fluid.layers.data(
name='image', shape=image_shape, dtype='float32')
......@@ -180,14 +185,19 @@ def build_model(main_prog, start_prog, phase=ModelPhase.TRAIN):
loss_type = list(loss_type)
# dice_loss and bce_loss are only applicable to binary (two-class) segmentation
if class_num > 2 and (("dice_loss" in loss_type) or ("bce_loss" in loss_type)):
raise Exception("dice loss and bce loss is only applicable to binary classfication")
if class_num > 2 and (("dice_loss" in loss_type) or
("bce_loss" in loss_type)):
raise Exception(
"dice loss and bce loss is only applicable to binary classfication"
)
# For binary segmentation, when dice_loss or bce_loss is selected as the loss function, set the number of channels of the final logit output to 1
if ("dice_loss" in loss_type) or ("bce_loss" in loss_type):
class_num = 1
if "softmax_loss" in loss_type:
raise Exception("softmax loss can not combine with dice loss or bce loss")
raise Exception(
"softmax loss can not combine with dice loss or bce loss"
)
logits = model_func(image, class_num)
......@@ -197,8 +207,8 @@ def build_model(main_prog, start_prog, phase=ModelPhase.TRAIN):
avg_loss_list = []
valid_loss = []
if "softmax_loss" in loss_type:
avg_loss_list.append(multi_softmax_with_loss(logits,
label, mask,class_num))
avg_loss_list.append(
multi_softmax_with_loss(logits, label, mask, class_num))
loss_valid = True
valid_loss.append("softmax_loss")
if "dice_loss" in loss_type:
......@@ -210,13 +220,17 @@ def build_model(main_prog, start_prog, phase=ModelPhase.TRAIN):
loss_valid = True
valid_loss.append("bce_loss")
if not loss_valid:
raise Exception("SOLVER.LOSS: {} is set wrong. it should "
raise Exception(
"SOLVER.LOSS: {} is set wrong. it should "
"include one of (softmax_loss, bce_loss, dice_loss) at least"
" example: ['softmax_loss'], ['dice_loss'], ['bce_loss', 'dice_loss']".format(cfg.SOLVER.LOSS))
" example: ['softmax_loss'], ['dice_loss'], ['bce_loss', 'dice_loss']"
.format(cfg.SOLVER.LOSS))
invalid_loss = [x for x in loss_type if x not in valid_loss]
if len(invalid_loss) > 0:
print("Warning: the loss {} you set is invalid. it will not be included in loss computed.".format(invalid_loss))
print(
"Warning: the loss {} you set is invalid. it will not be included in loss computed."
.format(invalid_loss))
avg_loss = 0
for i in range(0, len(avg_loss_list)):
......@@ -238,7 +252,11 @@ def build_model(main_prog, start_prog, phase=ModelPhase.TRAIN):
logit = sigmoid_to_softmax(logit)
else:
logit = softmax(logit)
logit = fluid.layers.resize_bilinear(logit, out_shape=origin_shape, align_corners=False, align_mode=0)
logit = fluid.layers.resize_bilinear(
logit,
out_shape=origin_shape,
align_corners=False,
align_mode=0)
logit = fluid.layers.transpose(logit, [0, 2, 3, 1])
logit = fluid.layers.argmax(logit, axis=3)
return origin_image, logit
......
......@@ -146,7 +146,7 @@ def layer1(input, name=None):
name=name + '_' + str(i + 1))
return conv
def highResolutionNet(input, num_classes):
def high_resolution_net(input, num_classes):
channels_2 = cfg.MODEL.HRNET.STAGE2.NUM_CHANNELS
channels_3 = cfg.MODEL.HRNET.STAGE3.NUM_CHANNELS
......@@ -198,7 +198,7 @@ def highResolutionNet(input, num_classes):
def hrnet(input, num_classes):
logit = highResolutionNet(input, num_classes)
logit = high_resolution_net(input, num_classes)
return logit
if __name__ == '__main__':
......
......@@ -37,6 +37,20 @@ model_urls = {
"https://paddleseg.bj.bcebos.com/models/Xception41_pretrained.tgz",
"xception65_imagenet":
"https://paddleseg.bj.bcebos.com/models/Xception65_pretrained.tgz",
"hrnet_w18_bn_imagenet":
"https://paddleseg.bj.bcebos.com/models/hrnet_w18_imagenet.tar",
"hrnet_w30_bn_imagenet":
"https://paddleseg.bj.bcebos.com/models/hrnet_w30_imagenet.tar",
"hrnet_w32_bn_imagenet":
"https://paddleseg.bj.bcebos.com/models/hrnet_w32_imagenet.tar" ,
"hrnet_w40_bn_imagenet":
"https://paddleseg.bj.bcebos.com/models/hrnet_w40_imagenet.tar",
"hrnet_w44_bn_imagenet":
"https://paddleseg.bj.bcebos.com/models/hrnet_w44_imagenet.tar",
"hrnet_w48_bn_imagenet":
"https://paddleseg.bj.bcebos.com/models/hrnet_w48_imagenet.tar",
"hrnet_w64_bn_imagenet":
"https://paddleseg.bj.bcebos.com/models/hrnet_w64_imagenet.tar",
# COCO pretrained
"deeplabv3p_mobilenetv2-1-0_bn_coco":
......@@ -65,6 +79,8 @@ model_urls = {
"https://paddleseg.bj.bcebos.com/models/pspnet50_cityscapes.tgz",
"pspnet101_bn_cityscapes":
"https://paddleseg.bj.bcebos.com/models/pspnet101_cityscapes.tgz",
"hrnet_w18_bn_cityscapes":
"https://paddleseg.bj.bcebos.com/models/hrnet_w18_bn_cityscapes.tgz",
}
if __name__ == "__main__":
......
# HRNet Model Training Tutorial

* This tutorial explains how to train on a custom dataset using the ***`HRNet`*** pretrained models provided by PaddleSeg.
* Before reading this tutorial, please make sure you are familiar with the [Quick Start](../README.md#快速入门) and [Basic Features](../README.md#基础功能) sections, so that you have a basic understanding of PaddleSeg.
* All commands in this tutorial are executed from the PaddleSeg root directory.
## 1. Prepare the training data

We have prepared a dataset in advance; download it with the following command:
```shell
python dataset/download_pet.py
```
## 2. Download the pretrained model

For the full list of pretrained models supported by PaddleSeg, see [Model configurations](#model-configurations) to look up the name and configuration of the model we need.

Then download the corresponding pretrained model:
```shell
python pretrained_model/download_model.py hrnet_w18_bn_cityscapes
```
## 3. Prepare the configuration

Next we need to set up the relevant configuration. For the purposes of this tutorial, it falls into three parts:

* Dataset
  * Training set root directory
  * Training set file list
  * Test set file list
  * Validation set file list
* Pretrained model
  * Pretrained model name
  * Number of channels for each stage of the pretrained model
  * Normalization type of the pretrained model
  * Pretrained model path
* Others
  * Learning rate
  * Batch size
  * ...

Of the three, the pretrained model configuration is especially important: if the model is configured incorrectly, the pretrained parameters will not be loaded, which in turn slows convergence. The pretrained-model-related settings are those shown in step 2.

The dataset configuration depends on the data paths; in this tutorial the data is stored under `dataset/mini_pet`.

The remaining settings can be tuned to the dataset and machine environment. In the end we save a yaml configuration file with the following content at **configs/hrnet_w18_pet.yaml**:
```yaml
# dataset configuration
DATASET:
DATA_DIR: "./dataset/mini_pet/"
NUM_CLASSES: 3
TEST_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt"
TRAIN_FILE_LIST: "./dataset/mini_pet/file_list/train_list.txt"
VAL_FILE_LIST: "./dataset/mini_pet/file_list/val_list.txt"
VIS_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt"
# pretrained model configuration
MODEL:
MODEL_NAME: "hrnet"
DEFAULT_NORM_TYPE: "bn"
HRNET:
STAGE2:
NUM_CHANNELS: [18, 36]
STAGE3:
NUM_CHANNELS: [18, 36, 72]
STAGE4:
NUM_CHANNELS: [18, 36, 72, 144]
# other configuration
TRAIN_CROP_SIZE: (512, 512)
EVAL_CROP_SIZE: (512, 512)
AUG:
AUG_METHOD: "unpadding"
FIX_RESIZE_SIZE: (512, 512)
BATCH_SIZE: 4
TRAIN:
PRETRAINED_MODEL_DIR: "./pretrained_model/hrnet_w18_bn_cityscapes/"
MODEL_SAVE_DIR: "./saved_model/hrnet_w18_bn_pet/"
SNAPSHOT_EPOCH: 10
TEST:
TEST_MODEL: "./saved_model/hrnet_w18_bn_pet/final"
SOLVER:
NUM_EPOCHS: 100
LR: 0.005
LR_POLICY: "poly"
OPTIMIZER: "sgd"
```
## 4. Configuration/data validation

Before starting training and evaluation, we need to validate the configuration and data to make sure both are correct. Start the validation with the following command:
```shell
python pdseg/check.py --cfg ./configs/hrnet_w18_pet.yaml
```
## 5. Start training

Once validation passes, start training with the following command:
```shell
python pdseg/train.py --use_gpu --cfg ./configs/hrnet_w18_pet.yaml
```
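Optionally, on a multi-GPU machine you can restrict training to specific GPUs with the standard `CUDA_VISIBLE_DEVICES` environment variable before launching the same command; this is a generic CUDA mechanism rather than a PaddleSeg-specific flag:

```shell
# sketch: limit training to GPUs 0 and 1 (CUDA_VISIBLE_DEVICES is a standard CUDA environment variable)
export CUDA_VISIBLE_DEVICES=0,1
python pdseg/train.py --use_gpu --cfg ./configs/hrnet_w18_pet.yaml
```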
## 6. Evaluation

After training finishes, start evaluation with the following command:
```shell
python pdseg/eval.py --use_gpu --cfg ./configs/hrnet_w18_pet.yaml
```
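Optionally, since the configuration above already sets `VIS_FILE_LIST`, you can render predictions for those images after evaluation. This is a sketch that assumes the `pdseg/vis.py` visualization script used by the other model tutorials:

```shell
# sketch: visualize predictions for the images listed in VIS_FILE_LIST (assumes pdseg/vis.py exists)
python pdseg/vis.py --use_gpu --cfg ./configs/hrnet_w18_pet.yaml
```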
## Model configurations

|Pretrained model name|Backbone|Norm Type|Dataset|Configuration|
|-|-|-|-|-|
|hrnet_w18_bn_cityscapes|-|bn| Cityscapes | MODEL.MODEL_NAME: hrnet <br> MODEL.HRNET.STAGE2.NUM_CHANNELS: [18, 36] <br> MODEL.HRNET.STAGE3.NUM_CHANNELS: [18, 36, 72] <br> MODEL.HRNET.STAGE4.NUM_CHANNELS: [18, 36, 72, 144] <br> MODEL.DEFAULT_NORM_TYPE: bn|
| hrnet_w18_bn_imagenet |-|bn| ImageNet | MODEL.MODEL_NAME: hrnet <br> MODEL.HRNET.STAGE2.NUM_CHANNELS: [18, 36] <br> MODEL.HRNET.STAGE3.NUM_CHANNELS: [18, 36, 72] <br> MODEL.HRNET.STAGE4.NUM_CHANNELS: [18, 36, 72, 144] <br> MODEL.DEFAULT_NORM_TYPE: bn |
| hrnet_w30_bn_imagenet |-|bn| ImageNet | MODEL.MODEL_NAME: hrnet <br> MODEL.HRNET.STAGE2.NUM_CHANNELS: [30, 60] <br> MODEL.HRNET.STAGE3.NUM_CHANNELS: [30, 60, 120] <br> MODEL.HRNET.STAGE4.NUM_CHANNELS: [30, 60, 120, 240] <br> MODEL.DEFAULT_NORM_TYPE: bn |
| hrnet_w32_bn_imagenet |-|bn| ImageNet | MODEL.MODEL_NAME: hrnet <br> MODEL.HRNET.STAGE2.NUM_CHANNELS: [32, 64] <br> MODEL.HRNET.STAGE3.NUM_CHANNELS: [32, 64, 128] <br> MODEL.HRNET.STAGE4.NUM_CHANNELS: [32, 64, 128, 256] <br> MODEL.DEFAULT_NORM_TYPE: bn |
| hrnet_w40_bn_imagenet |-|bn| ImageNet | MODEL.MODEL_NAME: hrnet <br> MODEL.HRNET.STAGE2.NUM_CHANNELS: [40, 80] <br> MODEL.HRNET.STAGE3.NUM_CHANNELS: [40, 80, 160] <br> MODEL.HRNET.STAGE4.NUM_CHANNELS: [40, 80, 160, 320] <br> MODEL.DEFAULT_NORM_TYPE: bn |
| hrnet_w44_bn_imagenet |-|bn| ImageNet | MODEL.MODEL_NAME: hrnet <br> MODEL.HRNET.STAGE2.NUM_CHANNELS: [44, 88] <br> MODEL.HRNET.STAGE3.NUM_CHANNELS: [44, 88, 176] <br> MODEL.HRNET.STAGE4.NUM_CHANNELS: [44, 88, 176, 352] <br> MODEL.DEFAULT_NORM_TYPE: bn |
| hrnet_w48_bn_imagenet |-|bn| ImageNet | MODEL.MODEL_NAME: hrnet <br> MODEL.HRNET.STAGE2.NUM_CHANNELS: [48, 96] <br> MODEL.HRNET.STAGE3.NUM_CHANNELS: [48, 96, 192] <br> MODEL.HRNET.STAGE4.NUM_CHANNELS: [48, 96, 192, 384] <br> MODEL.DEFAULT_NORM_TYPE: bn |
| hrnet_w64_bn_imagenet |-|bn| ImageNet | MODEL.MODEL_NAME: hrnet <br> MODEL.HRNET.STAGE2.NUM_CHANNELS: [64, 128] <br> MODEL.HRNET.STAGE3.NUM_CHANNELS: [64, 128, 256] <br> MODEL.HRNET.STAGE4.NUM_CHANNELS: [64, 128, 256, 512] <br> MODEL.DEFAULT_NORM_TYPE: bn |
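For example, to switch this tutorial to `hrnet_w48_bn_imagenet`, the relevant parts of the config would change as sketched below, taking the channel lists straight from the table; the `PRETRAINED_MODEL_DIR` path is an assumption based on the download script's naming and should be checked against where the weights are actually extracted:

```yaml
# sketch: config changes for the hrnet_w48_bn_imagenet pretrained model (channel lists from the table above)
MODEL:
    MODEL_NAME: "hrnet"
    DEFAULT_NORM_TYPE: "bn"
    HRNET:
        STAGE2:
            NUM_CHANNELS: [48, 96]
        STAGE3:
            NUM_CHANNELS: [48, 96, 192]
        STAGE4:
            NUM_CHANNELS: [48, 96, 192, 384]
TRAIN:
    # assumed extraction directory; confirm where download_model.py places these weights
    PRETRAINED_MODEL_DIR: "./pretrained_model/hrnet_w48_bn_imagenet/"
```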