Commit 6abfe42d authored by LielinJiang

Merge branch 'master' of https://github.com/PaddlePaddle/PaddleSeg

track official
......@@ -60,7 +60,9 @@ PaddleSeg supports training acceleration strategies such as multi-process IO, multi-GPU parallelism, and cross-GPU synchronized Batch Norm
### Advanced features
* [Data augmentation in PaddleSeg](./docs/data_aug.md)
* [Loss selection in PaddleSeg](./docs/loss_select.md)
* [Specialized vertical-domain models](./contrib)
* [Multi-process training and mixed-precision training](./docs/multiple_gpus_train_and_mixed_precision_train.md)
</br>
......@@ -87,6 +89,10 @@ A: Reduce the batch size and use the Group Norm strategy; note that during training, when `DEFAU
</br>
#### Q: The error ModuleNotFoundError: No module named 'paddle.fluid.contrib.mixed_precision' occurs
A: Please upgrade PaddlePaddle to version 1.5.2 or above.
## Online demos
PaddleSeg provides online tutorials on the AI Studio platform; you are welcome to try them:
......@@ -100,15 +106,13 @@ PaddleSeg provides online tutorials on the AI Studio platform; you are welcome to try them:
</br>
## Communication and feedback
* Feel free to submit questions, bug reports and suggestions via Github Issues
* Feel free to submit questions, bug reports and suggestions via [Github Issues](https://github.com/PaddlePaddle/PaddleSeg/issues)
* WeChat official account: 飞桨PaddlePaddle
* QQ group: 796771754
<p align="center"><img width="200" height="200" src="https://user-images.githubusercontent.com/45189361/64117959-1969de80-cdc9-11e9-84f7-e1c2849a004c.jpeg"/>&#8194;&#8194;&#8194;&#8194;&#8194;<img width="200" height="200" margin="500" src="./docs/imgs/qq_group2.png"/></p>
<p align="center"> &#8194;&#8194;&#8194;微信公众号&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;官方技术交流QQ群</p>
* 论坛: 欢迎大家在[PaddlePaddle论坛](https://ai.baidu.com/forum/topic/list/168)分享在使用PaddlePaddle中遇到的问题和经验, 营造良好的论坛氛围
## Changelog
* 2019.09.10
......
EVAL_CROP_SIZE: (1025, 1025) # (width, height), for unpadding, rangescaling and stepscaling
TRAIN_CROP_SIZE: (769, 769) # (width, height), for unpadding, rangescaling and stepscaling
AUG:
AUG_METHOD: u"stepscaling" # choose from unpadding, rangescaling and stepscaling
FIX_RESIZE_SIZE: (640, 640) # (width, height), for unpadding
INF_RESIZE_VALUE: 500 # for rangescaling
MAX_RESIZE_VALUE: 600 # for rangescaling
MIN_RESIZE_VALUE: 400 # for rangescaling
MAX_SCALE_FACTOR: 2.0 # for stepscaling
MIN_SCALE_FACTOR: 0.5 # for stepscaling
SCALE_STEP_SIZE: 0.25 # for stepscaling
BATCH_SIZE: 8
DATASET:
DATA_DIR: "./dataset/MiniDeepGlobeRoadExtraction/"
IMAGE_TYPE: "rgb" # choice rgb or rgba
NUM_CLASSES: 2
TEST_FILE_LIST: "dataset/MiniDeepGlobeRoadExtraction/val.txt"
TRAIN_FILE_LIST: "dataset/MiniDeepGlobeRoadExtraction/train.txt"
VAL_FILE_LIST: "dataset/MiniDeepGlobeRoadExtraction/val.txt"
IGNORE_INDEX: 255
SEPARATOR: '|'
FREEZE:
MODEL_FILENAME: "model"
PARAMS_FILENAME: "params"
SAVE_DIR: "freeze_model"
MODEL:
DEFAULT_NORM_TYPE: "bn"
MODEL_NAME: "deeplabv3p"
DEEPLAB:
BACKBONE: "mobilenet"
DEPTH_MULTIPLIER: 1.0
ENCODER_WITH_ASPP: False
ENABLE_DECODER: False
TEST:
TEST_MODEL: "./saved_model/deeplabv3p_mobilenetv2-1-0_bn_deepglobe_road_extraction/final"
TRAIN:
MODEL_SAVE_DIR: "./saved_model/deeplabv3p_mobilenetv2-1-0_bn_deepglobe_road_extraction/"
PRETRAINED_MODEL_DIR: "./pretrained_model/deeplabv3p_mobilenetv2-1-0_bn_coco/"
SNAPSHOT_EPOCH: 10
SOLVER:
LR: 0.001
LR_POLICY: "poly"
OPTIMIZER: "adam"
NUM_EPOCHS: 300
......@@ -30,13 +30,13 @@ MODEL:
MODEL_NAME: "unet"
DEFAULT_NORM_TYPE: "bn"
TEST:
TEST_MODEL: "./test/saved_model/unet_pet/final/"
TEST_MODEL: "./saved_model/unet_pet/final/"
TRAIN:
MODEL_SAVE_DIR: "./test/saved_models/unet_pet/"
PRETRAINED_MODEL_DIR: "./test/models/unet_coco/"
MODEL_SAVE_DIR: "./saved_model/unet_pet/"
PRETRAINED_MODEL_DIR: "./pretrained_model/unet_bn_coco/"
SNAPSHOT_EPOCH: 10
SOLVER:
NUM_EPOCHS: 500
NUM_EPOCHS: 100
LR: 0.005
LR_POLICY: "poly"
OPTIMIZER: "adam"
# Data download
## PASCAL VOC 2012 dataset
Download the PASCAL VOC 2012 dataset, convert the pseudo-color segmentation annotations (the `SegmentationClass` folder) to grayscale images stored in the `SegmentationClassAug` folder, and regenerate the list files `train.list`, `val.list` and `trainval.list` under `ImageSets/Segmentation`.
```shell
# Download, extract and convert the dataset
python download_and_convert_voc2012.py
```
If you have already downloaded the PASCAL VOC 2012 dataset, move it into the dataset directory and run the following command to convert it directly.
```shell
# Dataset conversion
python convert_voc2012.py
```
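The conversion relies on the VOC annotations being palette-mode PNGs. A minimal sketch of what one file goes through (the filename is hypothetical):

```python
from PIL import Image
import numpy as np

# Opening a palette-mode ("P") PNG and converting it to a NumPy array
# yields the palette indices, i.e. the class ids, not RGB triples.
anno = np.array(Image.open("SegmentationClass/2007_000032.png"))
print(anno.dtype, anno.shape)  # uint8 (H, W)

# Saving the index array back as a single-channel PNG produces the
# grayscale annotation expected in SegmentationClassAug.
Image.fromarray(anno).save("SegmentationClassAug/2007_000032.png")
```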
## Oxford-IIIT Pet dataset
We built a small dataset, mini_pet, from the cat and dog categories of Oxford-IIIT. For more about the dataset, see [Oxford-IIIT Pet](https://www.robots.ox.ac.uk/~vgg/data/pets/).
```shell
# Download and extract the dataset
python dataset/download_pet.py
```
## Cityscapes dataset
Run the following command to download and extract the Cityscapes dataset.
```shell
# Download and extract the dataset
python dataset/download_cityscapes.py
```
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import glob

import numpy as np
from PIL import Image

LOCAL_PATH = os.path.dirname(os.path.abspath(__file__))

def remove_colormap(filename):
    # Palette-mode PNGs load as index arrays, i.e. the class ids.
    gray_anno = np.array(Image.open(filename))
    return gray_anno

def save_annotation(annotation, filename):
    annotation = annotation.astype(dtype=np.uint8)
    annotation = Image.fromarray(annotation)
    annotation.save(filename)

def convert_list(origin_file, seg_file, output_folder):
    # Rewrite each "<id>" line as a "JPEGImages/<id>.jpg <anno_dir>/<id>.png" pair.
    with open(seg_file, 'w') as fid_seg:
        with open(origin_file) as fid_ori:
            lines = fid_ori.readlines()
            for line in lines:
                line = line.strip()
                line = '.'.join([line, 'jpg'])
                img_name = os.path.join("JPEGImages", line)
                line = line.replace('jpg', 'png')
                anno_name = os.path.join(output_folder.split(os.sep)[-1], line)
                new_line = ' '.join([img_name, anno_name])
                fid_seg.write(new_line + "\n")

if __name__ == "__main__":
    pascal_root = "./VOCtrainval_11-May-2012/VOC2012"
    pascal_root = os.path.join(LOCAL_PATH, pascal_root)
    seg_folder = os.path.join(pascal_root, "SegmentationClass")
    txt_folder = os.path.join(pascal_root, "ImageSets/Segmentation")
    train_path = os.path.join(txt_folder, "train.txt")
    val_path = os.path.join(txt_folder, "val.txt")
    trainval_path = os.path.join(txt_folder, "trainval.txt")
    # Directory for the converted annotation images
    output_folder = os.path.join(pascal_root, "SegmentationClassAug")
    print("annotation convert and file list convert")
    if not os.path.exists(output_folder):
        os.mkdir(output_folder)
    annotation_names = glob.glob(os.path.join(seg_folder, '*.png'))
    for annotation_name in annotation_names:
        annotation = remove_colormap(annotation_name)
        filename = os.path.basename(annotation_name)
        save_name = os.path.join(output_folder, filename)
        save_annotation(annotation, save_name)
    convert_list(train_path, train_path.replace('txt', 'list'), output_folder)
    convert_list(val_path, val_path.replace('txt', 'list'), output_folder)
    convert_list(trainval_path, trainval_path.replace('txt', 'list'), output_folder)
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
import os
import glob

import numpy as np

LOCAL_PATH = os.path.dirname(os.path.abspath(__file__))
TEST_PATH = os.path.join(LOCAL_PATH, "..", "test")
sys.path.append(TEST_PATH)

from test_utils import download_file_and_uncompress
from convert_voc2012 import convert_list
from convert_voc2012 import remove_colormap
from convert_voc2012 import save_annotation

def download_VOC_dataset(savepath, extrapath):
    url = "https://paddleseg.bj.bcebos.com/dataset/VOCtrainval_11-May-2012.tar"
    download_file_and_uncompress(
        url=url, savepath=savepath, extrapath=extrapath)

if __name__ == "__main__":
    download_VOC_dataset(LOCAL_PATH, LOCAL_PATH)
    print("Dataset download finished!")
    pascal_root = "./VOCtrainval_11-May-2012/VOC2012"
    pascal_root = os.path.join(LOCAL_PATH, pascal_root)
    seg_folder = os.path.join(pascal_root, "SegmentationClass")
    txt_folder = os.path.join(pascal_root, "ImageSets/Segmentation")
    train_path = os.path.join(txt_folder, "train.txt")
    val_path = os.path.join(txt_folder, "val.txt")
    trainval_path = os.path.join(txt_folder, "trainval.txt")
    # Directory for the converted annotation images
    output_folder = os.path.join(pascal_root, "SegmentationClassAug")
    print("annotation convert and file list convert")
    if not os.path.exists(output_folder):
        os.mkdir(output_folder)
    annotation_names = glob.glob(os.path.join(seg_folder, '*.png'))
    for annotation_name in annotation_names:
        annotation = remove_colormap(annotation_name)
        filename = os.path.basename(annotation_name)
        save_name = os.path.join(output_folder, filename)
        save_annotation(annotation, save_name)
    convert_list(train_path, train_path.replace('txt', 'list'), output_folder)
    convert_list(val_path, val_path.replace('txt', 'list'), output_folder)
    convert_list(trainval_path, trainval_path.replace('txt', 'list'), output_folder)
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
import os

LOCAL_PATH = os.path.dirname(os.path.abspath(__file__))
TEST_PATH = os.path.join(LOCAL_PATH, "..", "test")
sys.path.append(TEST_PATH)

from test_utils import download_file_and_uncompress

def download_deepglobe_road_dataset(savepath, extrapath):
    url = "https://paddleseg.bj.bcebos.com/dataset/MiniDeepGlobeRoadExtraction.zip"
    download_file_and_uncompress(
        url=url, savepath=savepath, extrapath=extrapath)

if __name__ == "__main__":
    download_deepglobe_road_dataset(LOCAL_PATH, LOCAL_PATH)
    print("Dataset download finished!")
# PaddleSeg data annotation
Collect the images for training, evaluation and testing in advance, then label them with a data annotation tool.
PaddleSeg supports two annotation tools: LabelMe and the Jingling (精灵) annotation tool.
Annotation tutorials:
- [LabelMe tutorial](labelme2seg.md)
- [Jingling annotation tool tutorial](jingling2seg.md)
Finally, use the conversion scripts we provide to convert the output of these tools into the data format required for model training.
......@@ -6,7 +6,7 @@
# YAML_FILE_PATH is the path to the yaml config file
python pdseg/check.py --cfg ${YAML_FILE_PATH}
```
After running, the command line shows an overview of the validation results; details can be found in the detail.log file.
After running, the command line shows an overview of the validation results; detailed error information can be found in the detail.log file.
### 1 List separator validation
Checks whether the separator `DATASET.SEPARATOR` is set correctly for the `TRAIN_FILE_LIST`, `VAL_FILE_LIST` and `TEST_FILE_LIST` list files.
......@@ -31,18 +31,24 @@ python pdseg/check.py --cfg ${YAML_FILE_PATH}
Label classes should preferably start from 0; otherwise accuracy may be affected.
### 6 Annotation pixel statistics
Counts the number of pixels per class for reference.
Counts the total number and proportion of pixels per class for reference. The statistics look like:
```
Doing label pixel statistics:
(label class, total pixel number, percentage) = [(0, 2048984, 0.5211), (1, 1682943, 0.428), (2, 197976, 0.0503), (3, 2257, 0.0006)]
```
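A statistic like this can be computed with `np.bincount` over the flattened annotation; a toy sketch (not the pdseg/check.py source):

```python
import numpy as np

anno = np.random.randint(0, 4, size=(64, 64))  # toy annotation map
counts = np.bincount(anno.ravel())
ratios = counts / counts.sum()
print(list(zip(range(len(counts)), counts, np.round(ratios, 4))))
```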
### 7 Image format validation
Checks whether the image type `DATASET.IMAGE_TYPE` is set correctly.
**NOTE:** when the dataset contains three-channel images, set `DATASET.IMAGE_TYPE` to rgb;
when the dataset consists entirely of four-channel images, set `DATASET.IMAGE_TYPE` to rgba.
### 8 Image/annotation size consistency validation
### 8 Maximum image size statistics
Reports the maximum height and maximum width of the images in the dataset for reference.
### 9 Image/annotation size consistency validation
Verifies that each image has the same size as its annotation.
### 9 Validation of the evaluation parameter `EVAL_CROP_SIZE`
### 10 Validation of the evaluation parameter `EVAL_CROP_SIZE`
Verifies that `EVAL_CROP_SIZE` is set correctly; there are three cases:
- when `AUG.AUG_METHOD` is unpadding, the width and height of `EVAL_CROP_SIZE` should be no smaller than those of `AUG.FIX_RESIZE_SIZE`.
......@@ -51,5 +57,5 @@ python pdseg/check.py --cfg ${YAML_FILE_PATH}
-`AUG.AUG_METHOD`为rangscaling时,`EVAL_CROP_SIZE`的宽高应不小于缩放后图像中最大的宽高。
### 10 数据增强参数`AUG.INF_RESIZE_VALUE`校验
### 11 数据增强参数`AUG.INF_RESIZE_VALUE`校验
验证`AUG.INF_RESIZE_VALUE`是否在[`AUG.MIN_RESIZE_VALUE`~`AUG.MAX_RESIZE_VALUE`]范围内。若在范围内,则通过校验。
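With the sample config above (INF_RESIZE_VALUE: 500, MIN_RESIZE_VALUE: 400, MAX_RESIZE_VALUE: 600), the check amounts to:

```python
inf, lo, hi = 500, 400, 600
print(lo <= inf <= hi)  # True, so the rangescaling check passes
```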
......@@ -121,4 +121,20 @@ L2 regularization coefficient
10 (meaning a checkpoint is saved every 10 training epochs)
<br/>
<br/>
\ No newline at end of file
<br/>
## `loss`
The loss function(s) used during training. Three losses are supported: `softmax_loss` (softmax with cross-entropy loss),
`dice_loss` (dice coefficient loss) and `bce_loss` (binary cross-entropy loss).
`dice_loss` and `bce_loss` are only applicable to two-class segmentation. `softmax_loss` cannot be combined with `dice_loss`
or `bce_loss`, while `dice_loss` can be combined with `bce_loss`. Usage examples:
`['softmax_loss']`, `['dice_loss','bce_loss']`
### Default
['softmax_loss']
<br/>
<br/>
......@@ -2,10 +2,18 @@
## Data annotation
Data annotation with the LabelMe tool is recommended; see [PaddleSeg data annotation](./annotation/README.md)
Collect the images for training, evaluation and testing in advance, then label them with a data annotation tool.
PaddleSeg supports two annotation tools: LabelMe and the Jingling (精灵) annotation tool. Annotation tutorials:
## Semantic segmentation annotation conventions
- [LabelMe tutorial](annotation/labelme2seg.md)
- [Jingling annotation tool tutorial](annotation/jingling2seg.md)
Finally, use the conversion scripts we provide to convert the output of these tools into the data format required for model training.
## File lists
### File list conventions
PaddleSeg organizes the training, validation and test sets with plain file lists. Pixel label classes must increase from 0.
......@@ -57,4 +65,94 @@ PaddleSeg organizes the training, validation and test sets with plain file lists
![cityscapes_filelist](./imgs/file_list.png)
If the dataset lacks annotation images, the file list should contain neither the separator nor the annotation path, as shown below.
![cityscapes_filelist](./imgs/file_list2.png)
**Notes**
Such a file list can only be used for visualization via `pdseg/vis.py`,
i.e. only in the `DATASET.TEST_FILE_LIST` and `DATASET.VIS_FILE_LIST` config options.
It cannot be used in `DATASET.TRAIN_FILE_LIST` or `DATASET.VAL_FILE_LIST`.
For a complete configuration, see the yaml and file lists under [`./dataset/cityscapes_demo`](../dataset/cityscapes_demo/); a short parsing sketch follows below.
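For illustration, one line of a Cityscapes-style file list can be parsed like this (hypothetical paths, default `|` separator):

```python
line = "leftImg8bit/train/aachen/a_leftImg8bit.png|gtFine/train/aachen/a_gtFine_labelTrainIds.png"
parts = line.strip().split("|")
img_path = parts[0]
label_path = parts[1] if len(parts) > 1 else None  # vis-only lists may omit labels
```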
### File list generation
PaddleSeg provides a script for generating file lists. It works with custom datasets as well as the cityscapes dataset, and supports flags to enable specific features.
```
python pdseg/tools/create_dataset_list.py <your/dataset/dir> ${FLAGS}
```
After running, file lists for the training/validation/test sets are generated under the dataset root directory (base names follow `--second_folder`, with the `.txt` extension).
**Note:** if annotation images are missing for the training/validation/test set, a file list without separators and annotation paths is still generated automatically.
#### Command-line FLAGS
|FLAG|Purpose|Default|Number of arguments|
|-|-|-|-|
|--type|dataset type, `cityscapes` or `custom`|`custom`|1|
|--separator|file list separator|'&#124;'|1|
|--folder|folder names of the images and labels|'images' 'annotations'|2|
|--second_folder|folder names of the training/validation/test sets|'train' 'val' 'test'|variable|
|--format|data format of the images and labels|'jpg' 'png'|2|
|--postfix|filter images and labels by whether the base name (without extension) ends with the given postfix|'' '' (2 empty strings)|2|
#### Usage examples
- **For a custom dataset**
To generate file lists for your own dataset, organize it in the following directory structure:
```
./dataset/ # dataset root
├── annotations # annotations directory
│   ├── test
│   │   ├── ...
│   │   └── ...
│   ├── train
│   │   ├── ...
│   │   └── ...
│   └── val
│   ├── ...
│   └── ...
└── images # images directory
├── test
│   ├── ...
│   └── ...
├── train
│   ├── ...
│   └── ...
└── val
├── ...
└── ...
Note: the directory names above are arbitrary
```
The custom dataset directory must be specified; the flags can be set as needed.
**Note:** there is no need to specify `--type`
```
# Generate file lists with a space separator; images and labels are both in png format
python pdseg/tools/create_dataset_list.py <your/dataset/dir> --separator " " --format png png
```
```
# Generate file lists with image/label folders named img and gt and train/val folders named training and validation; no test list is generated
python pdseg/tools/create_dataset_list.py <your/dataset/dir> \
--folder img gt --second_folder training validation
```
- **For the cityscapes dataset**
The cityscapes dataset directory must be specified, and `--type` must be `cityscapes`.
For the cityscapes type, the following flags are preset and need not be given manually:
|FLAG|Fixed value|
|-|-|
|--folder|'leftImg8bit' 'gtFine'|
|--format|'png' 'png'|
|--postfix|'_leftImg8bit' '_gtFine_labelTrainIds'|
The remaining flags can be set as needed.
```
# Generate cityscapes file lists with a comma separator
python pdseg/tools/create_dataset_list.py <your/dataset/dir> --type cityscapes --separator ","
```
# PaddleSeg installation guide
## Recommended development environment
## 1. Install PaddlePaddle
* Python 2.7 or 3.5+
* CUDA 9.2
* NVIDIA cuDNN v7.1
Version requirements
* PaddlePaddle >= 1.5.2
* For multi-GPU training, install NVIDIA NCCL >= 2.4.7 and run in a Linux environment
## 1. Install PaddlePaddle
* Python 2.7 or 3.5+
### Install via pip
......@@ -27,6 +22,8 @@ The latest PaddlePaddle 1.5 supports installation via Conda, which reduces the cost of installing dependencies
conda install -c paddle paddlepaddle-gpu cudatoolkit=9.0
```
* For multi-GPU training, install NVIDIA NCCL >= 2.4.7 and run in a Linux environment
For more installation options, see the [PaddlePaddle installation guide](https://www.paddlepaddle.org.cn/documentation/docs/zh/beginners_guide/install/index_cn.html)
......
# Using dice loss for class imbalance in binary segmentation
Binary image segmentation tasks often have imbalanced class distributions, e.g. defect detection, road extraction and lesion segmentation.
In the Road Extraction track of the DeepGlobe challenge, roads make up only 4.5% of the training data. A sample image:
<p align="center">
<img src="./imgs/deepglobe.png" hspace='10'/> <br />
</p>
As can be seen, roads occupy a very small fraction of the image.
## Dataset download
We randomly sampled 800 training images and 200 validation images from the DeepGlobe Road Extraction training set
to build a small road extraction dataset, [MiniDeepGlobeRoadExtraction](https://paddleseg.bj.bcebos.com/dataset/MiniDeepGlobeRoadExtraction.zip).
## softmax loss vs. dice loss
In image segmentation, softmax loss (softmax with cross-entropy loss) treats every pixel equally, so when the background dominates,
the network is biased towards learning the background and its ability to extract the target degrades. `dice_loss` (dice coefficient loss) computes the loss from the overlap between prediction and annotation, avoiding the impact of class imbalance and usually achieving better results.
In practice `dice_loss` is often combined with `bce_loss` (binary cross-entropy loss) to improve training stability.
The dice loss is defined as:
![equation](http://latex.codecogs.com/gif.latex?dice\\_loss=1-\frac{2|Y\bigcap{P}|}{|Y|+|P|})
where ![equation](http://latex.codecogs.com/gif.latex?|Y\bigcap{P}|) is the number of elements shared by *Y* and *P*;
in practice it is computed as the sum of the element-wise product of the two, as shown below:
<p align="center">
<img src="./imgs/dice1.png" hspace='10' height="68" width="513"/> <br />
</p>
[Dice coefficient explained](https://zh.wikipedia.org/wiki/Dice%E7%B3%BB%E6%95%B0)
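A minimal NumPy sketch of this definition for binary masks (not the PaddleSeg implementation):

```python
import numpy as np

def dice_loss(prob, label, eps=1e-5):
    """prob: predicted foreground probabilities (H, W); label: 0/1 mask (H, W)."""
    inter = np.sum(prob * label)              # |Y ∩ P| via the sum of products
    denom = np.sum(prob) + np.sum(label)      # |Y| + |P|
    return 1.0 - 2.0 * inter / (denom + eps)  # 1 - dice coefficient

label = np.array([[0, 1], [1, 1]], dtype=np.float32)
prob = np.array([[0.1, 0.9], [0.8, 0.95]], dtype=np.float32)
print(dice_loss(prob, label))  # ~0.08: good overlap gives a small loss
```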
## Selecting the training loss in PaddleSeg
The training loss is selected via the `cfg.SOLVER.LOSS` config option;
`cfg.SOLVER.LOSS=['dice_loss','bce_loss']` selects the combination of `dice loss` and `bce loss`.
## Experimental comparison
We ran a comparison on the MiniDeepGlobeRoadExtraction dataset.
* Dataset download
```shell
python dataset/download_mini_deepglobe_road_extraction.py
```
* Pretrained model download
```shell
python pretrained_model/download_model.py deeplabv3p_mobilenetv2-1-0_bn_coco
```
* Config/data validation
```shell
python pdseg/check.py --cfg ./configs/deepglobe_road_extraction.yaml
```
* Training
```shell
python pdseg/train.py --cfg ./configs/deepglobe_road_extraction.yaml --use_gpu SOLVER.LOSS "['dice_loss','bce_loss']"
```
* Evaluation
```
python pdseg/eval.py --cfg ./configs/deepglobe_road_extraction.yaml --use_gpu SOLVER.LOSS "['dice_loss','bce_loss']"
```
* Results
The results for softmax loss and dice loss + bce loss are shown below.
The orange curve is dice loss + bce loss, with a best mIoU of 76.02%; the blue curve is softmax loss, with a best mIoU of 73.62%.
<p align="center">
<img src="./imgs/loss_comparison.png" hspace='10' height="208" width="516"/> <br />
</p>
......@@ -8,11 +8,11 @@ PaddleSeg provides pretrained models on public datasets for all built-in segmentation models
| Model | Dataset | Depth multiplier | Download | Top-1/Top-5 Accuracy |
|---|---|---|---|---|
| MobieNetV2_1.0x | ImageNet | 1.0x | [MobileNetV2_1.0x](https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_pretrained.tar) | 72.15%/90.65% |
| MobieNetV2_0.25x | ImageNet | 0.25x |[MobileNetV2_0.25x](https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_x0_25_pretrained.tar) | 53.21%/76.52% |
| MobieNetV2_0.5x | ImageNet | 0.5x | [MobileNetV2_0.5x](https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_x0_5_pretrained.tar) | 65.03%/85.72% |
| MobieNetV2_1.5x | ImageNet | 1.5x | [MobileNetV2_1.5x](https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_x1_5_pretrained.tar) | 74.12%/91.67% |
| MobieNetV2_2.0x | ImageNet | 2.0x | [MobileNetV2_2.0x](https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_x2_0_pretrained.tar) | 75.23%/92.58% |
| MobileNetV2_1.0x | ImageNet | 1.0x | [MobileNetV2_1.0x](https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_pretrained.tar) | 72.15%/90.65% |
| MobileNetV2_0.25x | ImageNet | 0.25x |[MobileNetV2_0.25x](https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_x0_25_pretrained.tar) | 53.21%/76.52% |
| MobileNetV2_0.5x | ImageNet | 0.5x | [MobileNetV2_0.5x](https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_x0_5_pretrained.tar) | 65.03%/85.72% |
| MobileNetV2_1.5x | ImageNet | 1.5x | [MobileNetV2_1.5x](https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_x1_5_pretrained.tar) | 74.12%/91.67% |
| MobileNetV2_2.0x | ImageNet | 2.0x | [MobileNetV2_2.0x](https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_x2_0_pretrained.tar) | 75.23%/92.58% |
Choose a MobileNet model with a suitable `Depth multiplier` according to the accuracy and inference performance requirements of your scenario.
......@@ -31,6 +31,8 @@ PaddleSeg provides pretrained models on public datasets for all built-in segmentation models
| DeepLabv3+/MobileNetv2/bn | COCO |[deeplab_mobilenet_x1_0_coco.tgz](https://bj.bcebos.com/v1/paddleseg/deeplab_mobilenet_x1_0_coco.tgz) | 16 | --| -- |
| DeeplabV3+/Xception65/bn | COCO | [xception65_coco.tgz](https://paddleseg.bj.bcebos.com/models/xception65_coco.tgz)| 16 | -- | -- |
| U-Net/bn | COCO | [unet_coco.tgz](https://paddleseg.bj.bcebos.com/models/unet_coco_v3.tgz) | 16 | -- | -- |
| PSPNet/bn | COCO | [pspnet50_coco.tgz](https://paddleseg.bj.bcebos.com/models/pspnet50_coco.tgz) | 16 | -- | -- |
| PSPNet/bn | COCO | [pspnet101_coco.tgz](https://paddleseg.bj.bcebos.com/models/pspnet101_coco.tgz) | 16 | -- | -- |
## Cityscapes pretrained models
......@@ -42,4 +44,5 @@ Training uses the Cityscapes training set; testing uses the Cityscapes validation set
| DeepLabv3+/Xception65/gn | Cityscapes |[deeplabv3p_xception65_gn_cityscapes.tgz](https://paddleseg.bj.bcebos.com/models/deeplabv3p_xception65_cityscapes.tgz) |16|false| 0.7824 |
| DeepLabv3+/Xception65/bn | Cityscapes |[deeplabv3p_xception65_bn_cityscapes_.tgz](https://paddleseg.bj.bcebos.com/models/xception65_bn_cityscapes.tgz) | 16 | false | 0.7930 |
| ICNet/bn | Cityscapes |[icnet_cityscapes.tgz](https://paddleseg.bj.bcebos.com/models/icnet6831.tar.gz) |16|false| 0.6831 |
| PSPNet/bn | Cityscapes |[pspnet50_cityscapes.tgz](https://paddleseg.bj.bcebos.com/models/pspnet50_cityscapes.tgz) |16|false| 0.6968 |
| PSPNet/bn | Cityscapes |[pspnet50_cityscapes.tgz](https://paddleseg.bj.bcebos.com/models/pspnet50_cityscapes.tgz) |16|false| 0.7013 |
| PSPNet/bn | Cityscapes |[pspnet101_cityscapes.tgz](https://paddleseg.bj.bcebos.com/models/pspnet101_cityscapes.tgz) |16|false| 0.7734 |
# PaddleSeg multi-process training and mixed-precision training
### Requirements
* PaddlePaddle >= 1.6.0
* NVIDIA NCCL >= 2.4.7, running in a Linux environment
For environment setup and data/pretrained-model preparation, see the [installation guide](./installation.md) and the [PaddleSeg usage guide](./usage.md)
### Multi-process training example
Multi-process training can be launched as follows
```
export CUDA_VISIBLE_DEVICES=0,1
python -m paddle.distributed.launch pdseg/train.py --use_gpu \
--do_eval \
--cfg configs/unet_pet.yaml \
BATCH_SIZE 4 \
TRAIN.PRETRAINED_MODEL_DIR pretrained_model/unet_bn_coco \
SOLVER.LR 5e-5
```
### Mixed-precision training example
To enable mixed-precision training, simply set ```MODEL.FP16``` to ```True```; the full command is as follows
```
export CUDA_VISIBLE_DEVICES=0,1
python -m paddle.distributed.launch pdseg/train.py --use_gpu \
--do_eval \
--cfg configs/unet_pet.yaml \
BATCH_SIZE 4 \
TRAIN.PRETRAINED_MODEL_DIR pretrained_model/unet_bn_coco \
SOLVER.LR 5e-5 \
MODEL.FP16 True
```
This uses dynamic loss scaling. To use static scaling instead, set it through ```MODEL.SCALE_LOSS```, as in the following command
```
export CUDA_VISIBLE_DEVICES=0,1
python -m paddle.distributed.launch pdseg/train.py --use_gpu \
--do_eval \
--cfg configs/unet_pet.yaml \
BATCH_SIZE 8 \
TRAIN.PRETRAINED_MODEL_DIR pretrained_model/unet_bn_coco \
SOLVER.LR 5e-5 \
MODEL.FP16 True \
MODEL.SCALE_LOSS 512.0
```
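Why scaling is needed at all: in FP16, very small gradient values underflow to zero; scaling the loss (and hence the gradients) keeps them representable, and they are divided back before the parameter update. A NumPy toy, independent of PaddleSeg:

```python
import numpy as np

scale = 512.0
true_grad = 1e-8                        # a gradient too small for FP16
print(np.float16(true_grad))            # 0.0 -- underflows
scaled = np.float16(true_grad * scale)  # ~5.1e-06, survives in FP16
print(float(scaled) / scale)            # ~1e-08, recovered after unscaling
```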
### Benchmark
| Model | Dataset | batch size | GPU cards | multi-process | mixed precision | memory usage | speed (images/s) | mIoU on val |
|---|---|---|---|---|---|---|---|---|
| DeepLabv3+/Xception65/bn | Cityscapes | 16 | 4 | False | False | 15988 MiB | 17.27 | 79.20 |
| DeepLabv3+/Xception65/bn | Cityscapes | 16 | 4 | True | False | 15814 MiB | 19.80 | 78.90 |
| DeepLabv3+/Xception65/bn | Cityscapes | 16 | 4 | True | True | 14922 MiB | 25.84 |79.06|
### References
- [Mixed Precision Training](https://arxiv.org/abs/1710.03740)
......@@ -26,7 +26,7 @@ python pdseg/train.py BATCH_SIZE 1 --cfg configs/cityscapes.yaml
|-|-|-|-|-|
|--cfg|ALL|path to the config file|None||
|--use_gpu|ALL|whether to train on GPU|False||
|--use_mpio|train/eval|whether to use multi-threaded IO|False|Enabling this uses some CPU memory but speeds up training.</br> **NOTE:** not supported on Windows; keep it off when training on custom data for the first time, otherwise data-reading errors become invisible. </br> |
|--use_mpio|train/eval|whether to use multi-process IO|False|Enabling this uses some CPU memory but speeds up training.</br> **NOTE:** not supported on Windows; keep it off when training on custom data for the first time, otherwise data-reading errors become invisible. </br> |
|--use_tb|train|whether to log training data to TensorBoard|False||
|--log_steps|train|training-log printing interval (in steps)|10||
|--debug|train|whether to print debug info|False|metrics such as IoU require computing the confusion matrix, which slows down training|
......@@ -117,16 +117,24 @@ NOTE:
```shell
python pdseg/eval.py --use_gpu \
--cfg configs/unet_pet.yaml \
TEST.TEST_MODEL test/saved_models/unet_pet/final
TEST.TEST_MODEL saved_model/unet_pet/final
```
After training, the model reaches an mIoU of 0.70+ on the validation set (small fluctuations due to random seeds and other factors are normal).
### Model visualization
We inspect the model qualitatively with vis.py, using the last saved checkpoint:
```shell
python pdseg/vis.py --use_gpu \
--cfg configs/unet_pet.yaml \
TEST.TEST_MODEL test/saved_models/unet_pet/final
TEST.TEST_MODEL saved_model/unet_pet/final
```
The script produces a visual/visual_results folder in the project root containing predictions for the test images. Inspecting a few of them shows that the predictions on the test set are already quite good:
![](./imgs/usage_vis_demo.jpg)
![](./imgs/usage_vis_demo2.jpg)
![](./imgs/usage_vis_demo3.jpg)
`NOTE`
1. Visualized images are saved to visual/visual_results by default; the output directory can be changed with `--vis_dir`
2. During training the images in DATASET.VIS_FILE_LIST are used for visualization, whereas vis.py uses DATASET.TEST_FILE_LIST
......@@ -42,14 +42,23 @@ include_directories("${PADDLE_DIR}/third_party/install/protobuf/include")
include_directories("${PADDLE_DIR}/third_party/install/glog/include")
include_directories("${PADDLE_DIR}/third_party/install/gflags/include")
include_directories("${PADDLE_DIR}/third_party/install/xxhash/include")
include_directories("${PADDLE_DIR}/third_party/install/snappy/include")
include_directories("${PADDLE_DIR}/third_party/install/snappystream/include")
if (EXISTS "${PADDLE_DIR}/third_party/install/snappy/include")
include_directories("${PADDLE_DIR}/third_party/install/snappy/include")
endif()
if(EXISTS "${PADDLE_DIR}/third_party/install/snappystream/include")
include_directories("${PADDLE_DIR}/third_party/install/snappystream/include")
endif()
include_directories("${PADDLE_DIR}/third_party/install/zlib/include")
include_directories("${PADDLE_DIR}/third_party/boost")
include_directories("${PADDLE_DIR}/third_party/eigen3")
link_directories("${PADDLE_DIR}/third_party/install/snappy/lib")
link_directories("${PADDLE_DIR}/third_party/install/snappystream/lib")
if (EXISTS "${PADDLE_DIR}/third_party/install/snappy/lib")
link_directories("${PADDLE_DIR}/third_party/install/snappy/lib")
endif()
if(EXISTS "${PADDLE_DIR}/third_party/install/snappystream/lib")
link_directories("${PADDLE_DIR}/third_party/install/snappystream/lib")
endif()
link_directories("${PADDLE_DIR}/third_party/install/zlib/lib")
link_directories("${PADDLE_DIR}/third_party/install/protobuf/lib")
link_directories("${PADDLE_DIR}/third_party/install/glog/lib")
......@@ -82,7 +91,7 @@ if (WIN32)
add_definitions(-DSTATIC_LIB)
endif()
else()
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O2 -std=c++11")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O2 -fopenmp -std=c++11")
set(CMAKE_STATIC_LIBRARY_PREFIX "")
endif()
......@@ -160,13 +169,25 @@ if (NOT WIN32)
set(EXTERNAL_LIB "-lrt -ldl -lpthread")
set(DEPS ${DEPS}
${MATH_LIB} ${MKLDNN_LIB}
glog gflags protobuf yaml-cpp snappystream snappy z xxhash
glog gflags protobuf yaml-cpp z xxhash
${EXTERNAL_LIB})
if(EXISTS "${PADDLE_DIR}/third_party/install/snappystream/lib")
set(DEPS ${DEPS} snappystream)
endif()
if (EXISTS "${PADDLE_DIR}/third_party/install/snappy/lib")
set(DEPS ${DEPS} snappy)
endif()
else()
set(DEPS ${DEPS}
${MATH_LIB} ${MKLDNN_LIB}
opencv_world346 glog libyaml-cppmt gflags_static libprotobuf snappy zlibstatic xxhash snappystream ${EXTERNAL_LIB})
opencv_world346 glog libyaml-cppmt gflags_static libprotobuf zlibstatic xxhash ${EXTERNAL_LIB})
set(DEPS ${DEPS} libcmt shlwapi)
if (EXISTS "${PADDLE_DIR}/third_party/install/snappy/lib")
set(DEPS ${DEPS} snappy)
endif()
if(EXISTS "${PADDLE_DIR}/third_party/install/snappystream/lib")
set(DEPS ${DEPS} snappystream)
endif()
endif(NOT WIN32)
if(WITH_GPU)
......
......@@ -71,7 +71,7 @@ deeplabv3p_xception65_humanseg
### 2. Modify the configuration
When a model trained with `PaddleSeg` is exported, the corresponding inference config file is generated automatically; see [model export](../docs/export_model.md)
When a model trained with `PaddleSeg` is exported, the corresponding inference config file is generated automatically; see [model export](../docs/model_export.md)
The `conf` directory of the `inference` source tree (this directory) provides `humanseg.yaml`, the config file for the sample human segmentation model. Its fields are explained below:
......@@ -88,7 +88,7 @@ DEPLOY:
# standard input size for prediction images; inputs of other sizes are resized
EVAL_CROP_SIZE: (513, 513)
# mean
MEAN: [104.008, 116.669, 122.675]
MEAN: [0.40787450980392154, 0.4575254901960784, 0.481078431372549]
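# NOTE (editor's illustration): these values are the old 0-255 channel means
# divided by 255, e.g. 104.008 / 255 = 0.40787..., matching preprocessing that
# first scales pixels to [0, 1]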
# standard deviation
STD: [1.0, 1.0, 1.0]
# image type, rgb or rgba
......
DEPLOY:
USE_GPU: 1
MODEL_PATH: "/root/projects/models/deeplabv3p_xception65_humanseg"
MODEL_NAME: "unet"
MODEL_FILENAME: "__model__"
PARAMS_FILENAME: "__params__"
EVAL_CROP_SIZE: (513, 513)
MEAN: [0.5, 0.5, 0.5]
STD: [1.0, 1.0, 1.0]
MEAN: [0.40787450980392154, 0.4575254901960784, 0.481078431372549]
STD: [0.00392156862745098, 0.00392156862745098, 0.00392156862745098]
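# NOTE (editor's illustration): an STD of 0.00392156862745098 is exactly 1/255,
# so (pixel/255 - MEAN) / STD reduces to (pixel - 255 * MEAN) on the raw
# 0-255 pixel scale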
IMAGE_TYPE: "rgb"
NUM_CLASSES: 2
CHANNELS : 3
PRE_PROCESSOR: "SegPreProcessor"
PREDICTOR_MODE: "NATIVE"
BATCH_SIZE : 3
BATCH_SIZE : 1
#include "seg_predictor.h"
#include <unsupported/Eigen/CXX11/Tensor>
namespace PaddleSolution {
......@@ -78,26 +79,8 @@ namespace PaddleSolution {
//post process
_mask.clear();
_scoremap.clear();
int out_img_len = eval_height * eval_width;
for (int i = 0; i < out_img_len; ++i) {
float max_value = -1;
int label = 0;
for (int j = 0; j < eval_num_class; ++j) {
int index = i + j * out_img_len;
if (index >= blob_out_len) {
break;
}
float value = p_out[index];
if (value > max_value) {
max_value = value;
label = j;
}
}
if (label == 0) max_value = 0;
_mask[i] = uchar(label);
_scoremap[i] = uchar(max_value * 255);
}
std::vector<int> out_shape{eval_num_class, eval_height, eval_width};
utils::argmax(p_out, out_shape, _mask, _scoremap);
cv::Mat mask_png = cv::Mat(eval_height, eval_width, CV_8UC1);
mask_png.data = _mask.data();
std::string nname(fname);
......@@ -251,6 +234,7 @@ namespace PaddleSolution {
int idx = u * default_batch_size + i;
imgs_batch.push_back(imgs[idx]);
}
if (!_preprocessor->batch_process(imgs_batch, input_buffer.data(), org_height.data(), org_width.data())) {
return -1;
}
......
......@@ -32,21 +32,7 @@ namespace PaddleSolution {
if (*ori_h != rh || *ori_w != rw) {
cv::resize(im, im, resize_size, 0, 0, cv::INTER_LINEAR);
}
float* pmean = _config->_mean.data();
float* pscale = _config->_std.data();
for (int h = 0; h < rh; ++h) {
const uchar* ptr = im.ptr<uchar>(h);
int im_index = 0;
for (int w = 0; w < rw; ++w) {
for (int c = 0; c < channels; ++c) {
int top_index = (c * rh + h) * rw + w;
float pixel = static_cast<float>(ptr[im_index++]);
pixel = (pixel / 255 - pmean[c]) / pscale[c];
data[top_index] = pixel;
}
}
}
utils::normalize(im, data, _config->_mean, _config->_std);
return true;
}
......
#pragma once
#include "preprocessor.h"
#include "utils/utils.h"
namespace PaddleSolution {
......
......@@ -4,6 +4,10 @@
#include <vector>
#include <string>
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#ifdef _WIN32
#include <filesystem>
#else
......@@ -59,5 +63,58 @@ namespace PaddleSolution {
return imgs;
}
#endif
// normalize and convert layout HWC -> CHW (channel order is unchanged)
inline void normalize(cv::Mat& im, float* data, std::vector<float>& fmean, std::vector<float>& fstd) {
    int rh = im.rows;
    int rw = im.cols;
    int rc = im.channels();
    double normf = (double)1.0 / 255.0;
    #pragma omp parallel for
    for (int h = 0; h < rh; ++h) {
        const uchar* ptr = im.ptr<uchar>(h);
        int im_index = 0;
        for (int w = 0; w < rw; ++w) {
            for (int c = 0; c < rc; ++c) {
                int top_index = (c * rh + h) * rw + w;
                float pixel = static_cast<float>(ptr[im_index++]);
                pixel = (pixel * normf - fmean[c]) / fstd[c];
                data[top_index] = pixel;
            }
        }
    }
}
// argmax over the class dimension of a CHW score blob: writes the label
// mask and a score map scaled to [0, 255]
inline void argmax(float* out, std::vector<int>& shape, std::vector<uchar>& mask, std::vector<uchar>& scoremap) {
    int out_img_len = shape[1] * shape[2];
    int blob_out_len = out_img_len * shape[0];
    /*
    Eigen::TensorMap<Eigen::Tensor<float, 3>> out_3d(out, shape[0], shape[1], shape[2]);
    Eigen::Tensor<Eigen::DenseIndex, 2> argmax = out_3d.argmax(0);
    */
    // Parallelize over pixels; max_value and label are per-pixel locals, so
    // each iteration is independent (shared variables here would race).
    #pragma omp parallel for
    for (int i = 0; i < out_img_len; ++i) {
        float max_value = -1;
        int label = 0;
        for (int j = 0; j < shape[0]; ++j) {
            int index = i + j * out_img_len;
            if (index >= blob_out_len) {
                continue;
            }
            float value = out[index];
            if (value > max_value) {
                max_value = value;
                label = j;
            }
        }
        if (label == 0) max_value = 0;
        mask[i] = uchar(label);
        scoremap[i] = uchar(max_value * 255);
    }
}
}
}
This diff is collapsed.
......@@ -361,7 +361,7 @@ def hsv_color_jitter(crop_img,
if brightness_jitter_ratio > 0 or \
saturation_jitter_ratio > 0 or \
contrast_jitter_ratio > 0:
random_jitter(crop_img, saturation_jitter_ratio,
crop_img = random_jitter(crop_img, saturation_jitter_ratio,
brightness_jitter_ratio, contrast_jitter_ratio)
return crop_img
......
......@@ -29,6 +29,7 @@ def softmax_with_loss(logit, label, ignore_mask=None, num_classes=2):
label = fluid.layers.reshape(label, [-1, 1])
label = fluid.layers.cast(label, 'int64')
ignore_mask = fluid.layers.reshape(ignore_mask, [-1, 1])
loss, probs = fluid.layers.softmax_with_cross_entropy(
logit,
label,
......@@ -36,18 +37,48 @@ def softmax_with_loss(logit, label, ignore_mask=None, num_classes=2):
return_softmax=True)
loss = loss * ignore_mask
if cfg.MODEL.FP16:
loss = fluid.layers.cast(loss, 'float32')
avg_loss = fluid.layers.mean(loss) / fluid.layers.mean(ignore_mask)
avg_loss = fluid.layers.cast(avg_loss, 'float16')
else:
avg_loss = fluid.layers.mean(loss) / fluid.layers.mean(ignore_mask)
if cfg.MODEL.SCALE_LOSS > 1.0:
avg_loss = avg_loss * cfg.MODEL.SCALE_LOSS
avg_loss = fluid.layers.mean(loss) / fluid.layers.mean(ignore_mask)
label.stop_gradient = True
ignore_mask.stop_gradient = True
return avg_loss
# TODO: decide how to apply the ignore index and ignore mask
def dice_loss(logit, label, ignore_mask=None, epsilon=0.00001):
    if logit.shape[1] != 1 or label.shape[1] != 1 or ignore_mask.shape[1] != 1:
        raise Exception("dice loss is only applicable to one-channel classification")
    ignore_mask = fluid.layers.cast(ignore_mask, 'float32')
    logit = fluid.layers.transpose(logit, [0, 2, 3, 1])
    label = fluid.layers.transpose(label, [0, 2, 3, 1])
    label = fluid.layers.cast(label, 'int64')
    ignore_mask = fluid.layers.transpose(ignore_mask, [0, 2, 3, 1])
    logit = fluid.layers.sigmoid(logit)
    logit = logit * ignore_mask
    label = label * ignore_mask
    reduce_dim = list(range(1, len(logit.shape)))
    inse = fluid.layers.reduce_sum(logit * label, dim=reduce_dim)
    dice_denominator = fluid.layers.reduce_sum(
        logit, dim=reduce_dim) + fluid.layers.reduce_sum(
            label, dim=reduce_dim)
    # 1 - dice coefficient; despite the name, this is the per-sample loss
    dice_score = 1 - inse * 2 / (dice_denominator + epsilon)
    label.stop_gradient = True
    ignore_mask.stop_gradient = True
    return fluid.layers.reduce_mean(dice_score)
def bce_loss(logit, label, ignore_mask=None):
    if logit.shape[1] != 1 or label.shape[1] != 1 or ignore_mask.shape[1] != 1:
        raise Exception("bce loss is only applicable to binary classification")
    label = fluid.layers.cast(label, 'float32')
    loss = fluid.layers.sigmoid_cross_entropy_with_logits(
        x=logit,
        label=label,
        ignore_index=cfg.DATASET.IGNORE_INDEX,
        normalize=True)  # normalize by the number of non-ignored pixels
    loss = fluid.layers.reduce_sum(loss)
    label.stop_gradient = True
    ignore_mask.stop_gradient = True
    return loss
def multi_softmax_with_loss(logits, label, ignore_mask=None, num_classes=2):
if isinstance(logits, tuple):
......@@ -63,19 +94,28 @@ def multi_softmax_with_loss(logits, label, ignore_mask=None, num_classes=2):
avg_loss = softmax_with_loss(logits, label, ignore_mask, num_classes)
return avg_loss
def multi_dice_loss(logits, label, ignore_mask=None):
if isinstance(logits, tuple):
avg_loss = 0
for i, logit in enumerate(logits):
logit_label = fluid.layers.resize_nearest(label, logit.shape[2:])
logit_mask = (logit_label.astype('int32') !=
cfg.DATASET.IGNORE_INDEX).astype('int32')
loss = dice_loss(logit, logit_label, logit_mask)
avg_loss += cfg.MODEL.MULTI_LOSS_WEIGHT[i] * loss
else:
avg_loss = dice_loss(logits, label, ignore_mask)
return avg_loss
# to change, how to appicate ignore index and ignore mask
def dice_loss(logit, label, ignore_mask=None, num_classes=2):
if num_classes != 2:
raise Exception("dice loss is only applicable to binary classfication")
ignore_mask = fluid.layers.cast(ignore_mask, 'float32')
label = fluid.layers.elementwise_min(
label, fluid.layers.assign(np.array([num_classes - 1], dtype=np.int32)))
logit = fluid.layers.transpose(logit, [0, 2, 3, 1])
logit = fluid.layers.reshape(logit, [-1, num_classes])
logit = fluid.layers.softmax(logit)
label = fluid.layers.reshape(label, [-1, 1])
label = fluid.layers.cast(label, 'int64')
ignore_mask = fluid.layers.reshape(ignore_mask, [-1, 1])
loss = fluid.layers.dice_loss(logit, label)
return loss
def multi_bce_loss(logits, label, ignore_mask=None):
if isinstance(logits, tuple):
avg_loss = 0
for i, logit in enumerate(logits):
logit_label = fluid.layers.resize_nearest(label, logit.shape[2:])
logit_mask = (logit_label.astype('int32') !=
cfg.DATASET.IGNORE_INDEX).astype('int32')
loss = bce_loss(logit, logit_label, logit_mask)
avg_loss += cfg.MODEL.MULTI_LOSS_WEIGHT[i] * loss
else:
avg_loss = bce_loss(logits, label, ignore_mask)
return avg_loss
......@@ -228,7 +228,7 @@ class MobileNetV2():
num_groups=num_expfilter,
if_act=True,
name=name + '_dwise',
use_cudnn=True if cfg.MODEL.FP16 else False)
use_cudnn=False)
depthwise_output = bottleneck_conv
......
......@@ -149,7 +149,7 @@ def separate_conv(input, channel, stride, filter, dilation=1, act=None):
groups=input.shape[1],
padding=(filter // 2) * dilation,
dilation=dilation,
use_cudnn=True if cfg.MODEL.FP16 else False,
use_cudnn=False,
param_attr=param_attr)
input = bn(input)
if act: input = act(input)
......
......@@ -24,6 +24,8 @@ from paddle.fluid.proto.framework_pb2 import VarType
import solver
from utils.config import cfg
from loss import multi_softmax_with_loss
from loss import multi_dice_loss
from loss import multi_bce_loss
class ModelPhase(object):
......@@ -109,6 +111,17 @@ def softmax(logit):
logit = fluid.layers.transpose(logit, [0, 3, 1, 2])
return logit
def sigmoid_to_softmax(logit):
    """
    Expand a one-channel sigmoid output into a two-channel
    [background, foreground] softmax-style probability map.
    """
    logit = fluid.layers.transpose(logit, [0, 2, 3, 1])
    logit = fluid.layers.sigmoid(logit)
    logit_back = 1 - logit
    logit = fluid.layers.concat([logit_back, logit], axis=-1)
    logit = fluid.layers.transpose(logit, [0, 3, 1, 2])
    return logit
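# Editor's note (illustration, not part of this file): in NumPy terms the
# function above computes, for a logit tensor x of shape (N, 1, H, W):
#     fg = 1 / (1 + np.exp(-x))                    # sigmoid
#     out = np.concatenate([1 - fg, fg], axis=1)   # shape (N, 2, H, W)
# so zero logits give [0.5, 0.5] per pixel, and out behaves like a softmax
# over two classes.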
def build_model(main_prog, start_prog, phase=ModelPhase.TRAIN):
if not ModelPhase.is_valid_phase(phase):
......@@ -140,15 +153,53 @@ def build_model(main_prog, start_prog, phase=ModelPhase.TRAIN):
capacity=cfg.DATALOADER.BUF_SIZE,
iterable=False,
use_double_buffer=True)
if cfg.MODEL.FP16:
image = fluid.layers.cast(image, "float16")
model_name = map_model_name(cfg.MODEL.MODEL_NAME)
model_func = get_func("modeling." + model_name)
loss_type = cfg.SOLVER.LOSS
if not isinstance(loss_type, list):
loss_type = list(loss_type)
if class_num > 2 and (("dice_loss" in loss_type) or ("bce_loss" in loss_type)):
raise Exception("dice loss and bce loss is only applicable to binary classfication")
if ("dice_loss" in loss_type) or ("bce_loss" in loss_type):
class_num = 1
if "softmax_loss" in loss_type:
raise Exception("softmax loss can not combine with dice loss or bce loss")
logits = model_func(image, class_num)
if ModelPhase.is_train(phase) or ModelPhase.is_eval(phase):
avg_loss = multi_softmax_with_loss(logits, label, mask,
class_num)
loss_valid = False
avg_loss_list = []
valid_loss = []
if "softmax_loss" in loss_type:
avg_loss_list.append(multi_softmax_with_loss(logits,
label, mask,class_num))
loss_valid = True
valid_loss.append("softmax_loss")
if "dice_loss" in loss_type:
avg_loss_list.append(multi_dice_loss(logits, label, mask))
loss_valid = True
valid_loss.append("dice_loss")
if "bce_loss" in loss_type:
avg_loss_list.append(multi_bce_loss(logits, label, mask))
loss_valid = True
valid_loss.append("bce_loss")
if not loss_valid:
raise Exception("SOLVER.LOSS: {} is set wrong. it should "
"include one of (softmax_loss, bce_loss, dice_loss) at least"
" example: ['softmax_loss'], ['dice_loss'], ['bce_loss', 'dice_loss']".format(cfg.SOLVER.LOSS))
invalid_loss = [x for x in loss_type if x not in valid_loss]
if len(invalid_loss) > 0:
print("Warning: the loss {} you set is invalid. it will not be included in loss computed.".format(invalid_loss))
avg_loss = 0
for i in range(0, len(avg_loss_list)):
avg_loss += avg_loss_list[i]
# get the prediction result at the original size
if isinstance(logits, tuple):
......@@ -161,17 +212,25 @@ def build_model(main_prog, start_prog, phase=ModelPhase.TRAIN):
# return image input and logit output for inference graph prune
if ModelPhase.is_predict(phase):
logit = softmax(logit)
if class_num == 1:
logit = sigmoid_to_softmax(logit)
else:
logit = softmax(logit)
return image, logit
out = fluid.layers.transpose(x=logit, perm=[0, 2, 3, 1])
if cfg.MODEL.FP16:
out = fluid.layers.cast(out, 'float32')
if class_num == 1:
out = sigmoid_to_softmax(logit)
out = fluid.layers.transpose(out, [0, 2, 3, 1])
else:
out = fluid.layers.transpose(logit, [0, 2, 3, 1])
pred = fluid.layers.argmax(out, axis=3)
pred = fluid.layers.unsqueeze(pred, axes=[3])
if ModelPhase.is_visual(phase):
logit = softmax(logit)
if class_num == 1:
logit = sigmoid_to_softmax(logit)
else:
logit = softmax(logit)
return pred, logit
if ModelPhase.is_eval(phase):
......
......@@ -27,7 +27,6 @@ from models.libs.model_libs import separate_conv
from models.backbone.mobilenet_v2 import MobileNetV2 as mobilenet_backbone
from models.backbone.xception import Xception as xception_backbone
def encoder(input):
# Encoder configuration using the ASPP architecture: pooling + 1x1 conv + three parallel atrous convolutions at different rates, concatenated and fused by a 1x1 conv
# ASPP_WITH_SEP_CONV: defaults to True, meaning depthwise separable convolutions are used; otherwise ordinary convolutions
......@@ -48,13 +47,8 @@ def encoder(input):
with scope('encoder'):
channel = 256
with scope("image_pool"):
if cfg.MODEL.FP16:
image_avg = fluid.layers.reduce_mean(
fluid.layers.cast(input, 'float32'), [2, 3], keep_dim=True)
image_avg = fluid.layers.cast(image_avg, 'float16')
else:
image_avg = fluid.layers.reduce_mean(
input, [2, 3], keep_dim=True)
image_avg = fluid.layers.reduce_mean(
input, [2, 3], keep_dim=True)
image_avg = bn_relu(
conv(
image_avg,
......@@ -64,11 +58,8 @@ def encoder(input):
groups=1,
padding=0,
param_attr=param_attr))
if cfg.MODEL.FP16:
image_avg = fluid.layers.cast(image_avg, 'float32')
image_avg = fluid.layers.resize_bilinear(image_avg, input.shape[2:])
if cfg.MODEL.FP16:
image_avg = fluid.layers.cast(image_avg, 'float16')
with scope("aspp0"):
aspp0 = bn_relu(
conv(
......@@ -157,12 +148,9 @@ def decoder(encode_data, decode_shortcut):
groups=1,
padding=0,
param_attr=param_attr))
if cfg.MODEL.FP16:
encode_data = fluid.layers.cast(encode_data, 'float32')
encode_data = fluid.layers.resize_bilinear(
encode_data, decode_shortcut.shape[2:])
if cfg.MODEL.FP16:
encode_data = fluid.layers.cast(encode_data, 'float16')
encode_data = fluid.layers.concat([encode_data, decode_shortcut],
axis=1)
if cfg.MODEL.DEEPLAB.DECODER_USE_SEP_CONV:
......@@ -270,9 +258,6 @@ def deeplabv3p(img, num_classes):
padding=0,
bias_attr=True,
param_attr=param_attr)
if cfg.MODEL.FP16:
logit = fluid.layers.cast(logit, 'float32')
logit = fluid.layers.resize_bilinear(logit, img.shape[2:])
if cfg.MODEL.FP16:
logit = fluid.layers.cast(logit, 'float16')
return logit
......@@ -32,7 +32,7 @@ import data_aug as aug
from utils.config import cfg
from data_utils import GeneratorEnqueuer
from models.model_builder import ModelPhase
import copy
def cv2_imread(file_path, flag=cv2.IMREAD_COLOR):
# work around cv2.imread failing to open Chinese file paths on Windows
......@@ -49,15 +49,25 @@ class SegDataset(object):
self.shuffle = shuffle
self.data_dir = data_dir
self.shuffle_seed = 0
# NOTE: please ensure the file list is saved with UTF-8 encoding
with codecs.open(file_list, 'r', 'utf-8') as flist:
self.lines = [line.strip() for line in flist]
if shuffle:
self.all_lines = copy.deepcopy(self.lines)
if shuffle and cfg.NUM_TRAINERS > 1:
np.random.RandomState(self.shuffle_seed).shuffle(self.all_lines)
elif shuffle:
np.random.shuffle(self.lines)
def generator(self):
if self.shuffle:
if self.shuffle and cfg.NUM_TRAINERS > 1:
np.random.RandomState(self.shuffle_seed).shuffle(self.all_lines)
num_lines = len(self.all_lines) // cfg.NUM_TRAINERS
self.lines = self.all_lines[num_lines * cfg.TRAINER_ID: num_lines * (cfg.TRAINER_ID + 1)]
self.shuffle_seed += 1
elif self.shuffle:
np.random.shuffle(self.lines)
for line in self.lines:
yield self.process_image(line, self.data_dir, self.mode)
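# Editor's note (illustration, not part of the repo): every trainer shuffles
# the same list with the same seed and keeps its own contiguous slice, so the
# shards are disjoint without any cross-process communication, e.g.:
#     lines = list(range(10)); n = len(lines) // 2
#     for tid in range(2):
#         s = list(lines); np.random.RandomState(0).shuffle(s)
#         print(tid, s[n * tid: n * (tid + 1)])   # two disjoint halves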
......@@ -78,8 +88,14 @@ class SegDataset(object):
def multiprocess_generator(self, max_queue_size=32, num_processes=8):
# Re-shuffle file list
if self.shuffle:
if self.shuffle and cfg.NUM_TRAINERS > 1:
np.random.RandomState(self.shuffle_seed).shuffle(self.all_lines)
num_lines = len(self.all_lines) // self.num_trainers
self.lines = self.all_lines[num_lines * self.trainer_id: num_lines * (self.trainer_id + 1)]
self.shuffle_seed += 1
elif self.shuffle:
np.random.shuffle(self.lines)
# Create multiple sharding generators according to num_processes for multiple processes
generators = []
for pid in range(num_processes):
......
......@@ -18,7 +18,7 @@ import paddle.fluid as fluid
import numpy as np
import importlib
from utils.config import cfg
from paddle.fluid.contrib.mixed_precision.fp16_utils import create_master_params_grads, master_param_to_train_param
from paddle.fluid.contrib.mixed_precision.decorator import OptimizerWithMixedPrecison, decorate, AutoMixedPrecisionLists
class Solver(object):
......@@ -74,15 +74,22 @@ class Solver(object):
regularization_coeff=self.weight_decay),
)
if cfg.MODEL.FP16:
params_grads = optimizer.backward(loss, self.start_prog)
master_params_grads = create_master_params_grads(
params_grads, self.main_prog, self.start_prog,
cfg.MODEL.SCALE_LOSS)
optimizer.apply_gradients(master_params_grads)
master_param_to_train_param(master_params_grads, params_grads,
self.main_prog)
else:
optimizer.minimize(loss)
if cfg.MODEL.MODEL_NAME in ["pspnet"]:
custom_black_list = {"pool2d"}
else:
custom_black_list = {}
amp_lists = AutoMixedPrecisionLists(custom_black_list=custom_black_list)
assert isinstance(cfg.MODEL.SCALE_LOSS, float) or isinstance(cfg.MODEL.SCALE_LOSS, str), \
"data type of MODEL.SCALE_LOSS must be float or str"
if isinstance(cfg.MODEL.SCALE_LOSS, float):
optimizer = decorate(optimizer, amp_lists=amp_lists, init_loss_scaling=cfg.MODEL.SCALE_LOSS,
use_dynamic_loss_scaling=False)
else:
assert cfg.MODEL.SCALE_LOSS.lower() in ['dynamic'], "if MODEL.SCALE_LOSS is a string,\
must be set as 'DYNAMIC'!"
optimizer = decorate(optimizer, amp_lists=amp_lists, use_dynamic_loss_scaling=True)
optimizer.minimize(loss)
return decayed_lr
def adam_optimizer(self, lr_policy, loss):
......
# coding: utf8
# copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import glob
import os.path
import argparse
import warnings

def parse_args():
    parser = argparse.ArgumentParser(
        description='PaddleSeg generate file list on cityscapes or your customized dataset.')
    parser.add_argument(
        'dataset_root',
        help='dataset root directory',
        type=str)
    parser.add_argument(
        '--type',
        help='dataset type: \n'
        '- cityscapes \n'
        '- custom(default)',
        default="custom",
        type=str)
    parser.add_argument(
        '--separator',
        dest='separator',
        help='file list separator',
        default="|",
        type=str)
    parser.add_argument(
        '--folder',
        help='the folder names of images and labels',
        type=str,
        nargs=2,
        default=['images', 'annotations'])
    parser.add_argument(
        '--second_folder',
        help='the second-level folder names of train set, validation set, test set',
        type=str,
        nargs='*',
        default=['train', 'val', 'test'])
    parser.add_argument(
        '--format',
        help='data format of images and labels, e.g. jpg or png.',
        type=str,
        nargs=2,
        default=['jpg', 'png'])
    parser.add_argument(
        '--postfix',
        help='postfix of images or labels',
        type=str,
        nargs=2,
        default=['', ''])
    return parser.parse_args()

def cityscape_cfg(args):
    # Preset flags for the cityscapes directory layout.
    args.postfix = ['_leftImg8bit', '_gtFine_labelTrainIds']
    args.folder = ['leftImg8bit', 'gtFine']
    args.format = ['png', 'png']

def get_files(image_or_label, dataset_split, args):
    dataset_root = args.dataset_root
    postfix = args.postfix
    format = args.format
    folder = args.folder
    pattern = '*%s.%s' % (postfix[image_or_label], format[image_or_label])
    search_files = os.path.join(dataset_root, folder[image_or_label],
                                dataset_split, pattern)
    search_files2 = os.path.join(dataset_root, folder[image_or_label],
                                 dataset_split, "*", pattern)  # including subdirectories
    search_files3 = os.path.join(dataset_root, folder[image_or_label],
                                 dataset_split, "*", "*", pattern)  # including third-level directories
    filenames = glob.glob(search_files)
    filenames2 = glob.glob(search_files2)
    filenames3 = glob.glob(search_files3)
    filenames = filenames + filenames2 + filenames3
    return sorted(filenames)

def generate_list(args):
    dataset_root = args.dataset_root
    separator = args.separator
    for dataset_split in args.second_folder:
        print("Creating {}.txt...".format(dataset_split))
        image_files = get_files(0, dataset_split, args)
        label_files = get_files(1, dataset_split, args)
        if not image_files:
            img_dir = os.path.join(dataset_root, args.folder[0], dataset_split)
            print("No files in {}".format(img_dir))
        num_images = len(image_files)
        if not label_files:
            label_dir = os.path.join(dataset_root, args.folder[1], dataset_split)
            print("No files in {}".format(label_dir))
        num_label = len(label_files)
        if num_images < num_label:
            warnings.warn("number of images = {} < number of labels = {}."
                          .format(num_images, num_label))
            continue
        file_list = os.path.join(dataset_root, dataset_split + '.txt')
        with open(file_list, "w") as f:
            for item in range(num_images):
                left = image_files[item].replace(dataset_root, '')
                if left[0] == os.path.sep:
                    left = left.lstrip(os.path.sep)
                try:
                    right = label_files[item].replace(dataset_root, '')
                    if right[0] == os.path.sep:
                        right = right.lstrip(os.path.sep)
                    line = left + separator + right + '\n'
                except IndexError:
                    # No matching label: write the image path only.
                    line = left + '\n'
                f.write(line)
                print(line)

if __name__ == '__main__':
    args = parse_args()
    if args.type == 'cityscapes':
        cityscape_cfg(args)
    generate_list(args)
......@@ -40,8 +40,7 @@ from models.model_builder import ModelPhase
from models.model_builder import parse_shape_from_file
from eval import evaluate
from vis import visualize
from utils.fp16_utils import load_fp16_vars
from utils import dist_utils
def parse_args():
parser = argparse.ArgumentParser(description='PaddleSeg training')
......@@ -178,6 +177,9 @@ def load_checkpoint(exe, program):
return begin_epoch
def print_info(*msg):
if cfg.TRAINER_ID == 0:
print(*msg)
def train(cfg):
startup_prog = fluid.Program()
......@@ -201,7 +203,7 @@ def train(cfg):
batch_data = []
for b in data_gen:
batch_data.append(b)
if len(batch_data) == cfg.BATCH_SIZE:
if len(batch_data) == (cfg.BATCH_SIZE // cfg.NUM_TRAINERS):
for item in batch_data:
yield item[0], item[1], item[2]
batch_data = []
......@@ -212,11 +214,15 @@ def train(cfg):
yield item[0], item[1], item[2]
# Get device environment
# places = fluid.cuda_places() if args.use_gpu else fluid.cpu_places()
# place = places[0]
gpu_id = int(os.environ.get('FLAGS_selected_gpus', 0))
place = fluid.CUDAPlace(gpu_id) if args.use_gpu else fluid.CPUPlace()
places = fluid.cuda_places() if args.use_gpu else fluid.cpu_places()
place = places[0]
# Get number of GPU
dev_count = len(places)
print("#Device count: {}".format(dev_count))
dev_count = cfg.NUM_TRAINERS if cfg.NUM_TRAINERS > 1 else len(places)
print_info("#Device count: {}".format(dev_count))
# Make sure BATCH_SIZE can divided by GPU cards
assert cfg.BATCH_SIZE % dev_count == 0, (
......@@ -224,7 +230,7 @@ def train(cfg):
cfg.BATCH_SIZE, dev_count))
# If use multi-gpu training mode, batch data will allocated to each GPU evenly
batch_size_per_dev = cfg.BATCH_SIZE // dev_count
print("batch_size_per_dev: {}".format(batch_size_per_dev))
print_info("batch_size_per_dev: {}".format(batch_size_per_dev))
py_reader, avg_loss, lr, pred, grts, masks = build_model(
train_prog, startup_prog, phase=ModelPhase.TRAIN)
......@@ -240,13 +246,18 @@ def train(cfg):
exec_strategy.num_threads = fluid.core.get_cuda_device_count()
exec_strategy.num_iteration_per_drop_scope = 100
build_strategy = fluid.BuildStrategy()
if cfg.NUM_TRAINERS > 1 and args.use_gpu:
dist_utils.prepare_for_multi_process(exe, build_strategy, train_prog)
exec_strategy.num_threads = 1
if cfg.TRAIN.SYNC_BATCH_NORM and args.use_gpu:
if dev_count > 1:
# Apply sync batch norm strategy
print("Sync BatchNorm strategy is effective.")
print_info("Sync BatchNorm strategy is effective.")
build_strategy.sync_batch_norm = True
else:
print("Sync BatchNorm strategy will not be effective if GPU device"
print_info("Sync BatchNorm strategy will not be effective if GPU device"
" count <= 1")
compiled_train_prog = fluid.CompiledProgram(train_prog).with_data_parallel(
loss_name=avg_loss.name,
......@@ -259,7 +270,7 @@ def train(cfg):
begin_epoch = load_checkpoint(exe, train_prog)
# Load pretrained model
elif os.path.exists(cfg.TRAIN.PRETRAINED_MODEL_DIR):
print('Pretrained model dir:', cfg.TRAIN.PRETRAINED_MODEL_DIR)
print_info('Pretrained model dir: ', cfg.TRAIN.PRETRAINED_MODEL_DIR)
load_vars = []
load_fail_vars = []
......@@ -283,22 +294,19 @@ def train(cfg):
load_vars.append(x)
else:
load_fail_vars.append(x)
if cfg.MODEL.FP16:
# If open FP16 training mode, load FP16 var separate
load_fp16_vars(exe, cfg.TRAIN.PRETRAINED_MODEL_DIR, train_prog)
else:
fluid.io.load_vars(
exe, dirname=cfg.TRAIN.PRETRAINED_MODEL_DIR, vars=load_vars)
fluid.io.load_vars(
exe, dirname=cfg.TRAIN.PRETRAINED_MODEL_DIR, vars=load_vars)
for var in load_vars:
print("Parameter[{}] loaded sucessfully!".format(var.name))
print_info("Parameter[{}] loaded sucessfully!".format(var.name))
for var in load_fail_vars:
print("Parameter[{}] don't exist or shape does not match current network, skip"
print_info("Parameter[{}] don't exist or shape does not match current network, skip"
" to load it.".format(var.name))
print("{}/{} pretrained parameters loaded successfully!".format(
print_info("{}/{} pretrained parameters loaded successfully!".format(
len(load_vars),
len(load_vars) + len(load_fail_vars)))
else:
print('Pretrained model dir {} not exists, training from scratch...'.
print_info('Pretrained model dir {} not exists, training from scratch...'.
format(cfg.TRAIN.PRETRAINED_MODEL_DIR))
fetch_list = [avg_loss.name, lr.name]
......@@ -312,12 +320,14 @@ def train(cfg):
if args.use_tb:
if not args.tb_log_dir:
print("Please specify the log directory by --tb_log_dir.")
print_info("Please specify the log directory by --tb_log_dir.")
exit(1)
from tb_paddle import SummaryWriter
log_writer = SummaryWriter(args.tb_log_dir)
# trainer_id = int(os.getenv("PADDLE_TRAINER_ID", 0))
# num_trainers = int(os.environ.get('PADDLE_TRAINERS_NUM', 1))
global_step = 0
all_step = cfg.DATASET.TRAIN_TOTAL_IMAGES // cfg.BATCH_SIZE
if cfg.DATASET.TRAIN_TOTAL_IMAGES % cfg.BATCH_SIZE and drop_last != True:
......@@ -333,9 +343,9 @@ def train(cfg):
begin_epoch, cfg.SOLVER.NUM_EPOCHS))
if args.use_mpio:
print("Use multiprocess reader")
print_info("Use multiprocess reader")
else:
print("Use multi-thread reader")
print_info("Use multi-thread reader")
for epoch in range(begin_epoch, cfg.SOLVER.NUM_EPOCHS + 1):
py_reader.start()
......@@ -348,7 +358,6 @@ def train(cfg):
program=compiled_train_prog,
fetch_list=fetch_list,
return_numpy=True)
cm.calculate(pred, grts, masks)
avg_loss += np.mean(np.array(loss))
global_step += 1
......@@ -359,13 +368,13 @@ def train(cfg):
category_acc, mean_acc = cm.accuracy()
category_iou, mean_iou = cm.mean_iou()
print((
print_info((
"epoch={} step={} lr={:.5f} loss={:.4f} acc={:.5f} mIoU={:.5f} step/sec={:.3f} | ETA {}"
).format(epoch, global_step, lr[0], avg_loss, mean_acc,
mean_iou, speed,
calculate_eta(all_step - global_step, speed)))
print("Category IoU:", category_iou)
print("Category Acc:", category_acc)
print_info("Category IoU: ", category_iou)
print_info("Category Acc: ", category_acc)
if args.use_tb:
log_writer.add_scalar('Train/mean_iou', mean_iou,
global_step)
......@@ -390,7 +399,7 @@ def train(cfg):
avg_loss += np.mean(np.array(loss))
global_step += 1
if global_step % args.log_steps == 0:
if global_step % args.log_steps == 0 and cfg.TRAINER_ID == 0:
avg_loss /= args.log_steps
speed = args.log_steps / timer.elapsed_time()
print((
......@@ -414,7 +423,7 @@ def train(cfg):
except Exception as e:
print(e)
if epoch % cfg.TRAIN.SNAPSHOT_EPOCH == 0:
if epoch % cfg.TRAIN.SNAPSHOT_EPOCH == 0 and cfg.TRAINER_ID == 0:
ckpt_dir = save_checkpoint(exe, train_prog, epoch)
if args.do_eval:
......@@ -441,16 +450,20 @@ def train(cfg):
log_writer=log_writer)
# save final model
save_checkpoint(exe, train_prog, 'final')
if cfg.TRAINER_ID == 0:
save_checkpoint(exe, train_prog, 'final')
def main(args):
if args.cfg_file is not None:
cfg.update_from_file(args.cfg_file)
if args.opts is not None:
cfg.update_from_list(args.opts)
cfg.check_and_infer(reset_dataset=True)
print(pprint.pformat(cfg))
cfg.TRAINER_ID = int(os.getenv("PADDLE_TRAINER_ID", 0))
cfg.NUM_TRAINERS = int(os.environ.get('PADDLE_TRAINERS_NUM', 1))
cfg.check_and_infer()
print_info(pprint.pformat(cfg))
train(cfg)
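# Editorial note: the two environment variables read in main() are populated
# by Paddle's distributed launcher (e.g. `python -m paddle.distributed.launch
# ... pdseg/train.py ...`; exact flags depend on the Paddle version). Each
# spawned worker sees its own PADDLE_TRAINER_ID (rank, 0..N-1) and a shared
# PADDLE_TRAINERS_NUM (world size N); rank 0 then owns logging and
# checkpointing via the cfg.TRAINER_ID == 0 guards added in train().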
......
......@@ -88,7 +88,7 @@ class SegConfig(dict):
except KeyError:
raise KeyError('Non-existent config key: {}'.format(key))
def check_and_infer(self, reset_dataset=False):
def check_and_infer(self):
if self.DATASET.IMAGE_TYPE in ['rgb', 'gray']:
self.DATASET.DATA_DIM = 3
elif self.DATASET.IMAGE_TYPE in ['rgba']:
......@@ -110,17 +110,13 @@ class SegConfig(dict):
'EVAL_CROP_SIZE is empty! Please set a pair of values in format (width, height)'
)
if reset_dataset:
# Ensure file lists use UTF-8 encoding
train_sets = codecs.open(self.DATASET.TRAIN_FILE_LIST, 'r',
'utf-8').readlines()
val_sets = codecs.open(self.DATASET.VAL_FILE_LIST, 'r',
'utf-8').readlines()
test_sets = codecs.open(self.DATASET.TEST_FILE_LIST, 'r',
'utf-8').readlines()
self.DATASET.TRAIN_TOTAL_IMAGES = len(train_sets)
self.DATASET.VAL_TOTAL_IMAGES = len(val_sets)
self.DATASET.TEST_TOTAL_IMAGES = len(test_sets)
# Ensure file lists use UTF-8 encoding
train_sets = codecs.open(self.DATASET.TRAIN_FILE_LIST, 'r', 'utf-8').readlines()
val_sets = codecs.open(self.DATASET.VAL_FILE_LIST, 'r', 'utf-8').readlines()
test_sets = codecs.open(self.DATASET.TEST_FILE_LIST, 'r', 'utf-8').readlines()
self.DATASET.TRAIN_TOTAL_IMAGES = len(train_sets)
self.DATASET.VAL_TOTAL_IMAGES = len(val_sets)
self.DATASET.TEST_TOTAL_IMAGES = len(test_sets)
if self.MODEL.MODEL_NAME == 'icnet' and \
len(self.MODEL.MULTI_LOSS_WEIGHT) != 3:
......@@ -139,7 +135,7 @@ class SegConfig(dict):
def update_from_file(self, config_file):
with codecs.open(config_file, 'r', 'utf-8') as file:
dic = yaml.load(file)
dic = yaml.load(file, Loader=yaml.FullLoader)
self.update_from_segconfig(dic)
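# Editorial note on the yaml.load change above: PyYAML 5.1 deprecated calling
# yaml.load without an explicit Loader; FullLoader keeps standard tag
# resolution while silencing the warning. For untrusted config files the
# stricter equivalent would be:
# dic = yaml.safe_load(file)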
def set_immutable(self, immutable):
......
......@@ -31,7 +31,10 @@ cfg.BATCH_SIZE = 1
cfg.EVAL_CROP_SIZE = tuple()
# Image crop size (width, height) during training
cfg.TRAIN_CROP_SIZE = tuple()
# Total number of trainer processes in multi-process training
cfg.NUM_TRAINERS = 1
# Trainer (process) ID in multi-process training
cfg.TRAINER_ID = 0
########################## Data loading configuration ########################
# Number of concurrent data loading workers; recommended value: 8
cfg.DATALOADER.NUM_WORKERS = 8
......@@ -149,6 +152,8 @@ cfg.SOLVER.WEIGHT_DECAY = 0.00004
cfg.SOLVER.BEGIN_EPOCH = 1
# Number of training epochs (a positive integer)
cfg.SOLVER.NUM_EPOCHS = 30
# Loss selection; supports softmax_loss, bce_loss and dice_loss
cfg.SOLVER.LOSS = ["softmax_loss"]
########################## Test configuration ################################
# Path of the model to test
......@@ -169,8 +174,8 @@ cfg.MODEL.DEFAULT_EPSILON = 1e-5
cfg.MODEL.BN_MOMENTUM = 0.99
# Whether to train with FP16
cfg.MODEL.FP16 = False
# FP16 training requires loss scaling; 8.0 is a typical value
cfg.MODEL.SCALE_LOSS = 1.0
# Mixed precision training requires loss scaling; dynamic scaling by default, a static scale such as 512.0 can also be set
cfg.MODEL.SCALE_LOSS = "DYNAMIC"
########################## DeepLab model configuration #######################
# DeepLab backbone; options: xception_65, mobilenetv2
......
#copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import paddle.fluid as fluid
def nccl2_prepare(args, startup_prog, main_prog):
config = fluid.DistributeTranspilerConfig()
config.mode = "nccl2"
t = fluid.DistributeTranspiler(config=config)
envs = args.dist_env
t.transpile(
envs["trainer_id"],
trainers=','.join(envs["trainer_endpoints"]),
current_endpoint=envs["current_endpoint"],
startup_program=startup_prog,
program=main_prog)
def pserver_prepare(args, train_prog, startup_prog):
config = fluid.DistributeTranspilerConfig()
config.slice_var_up = args.split_var
t = fluid.DistributeTranspiler(config=config)
envs = args.dist_env
training_role = envs["training_role"]
t.transpile(
envs["trainer_id"],
program=train_prog,
pservers=envs["pserver_endpoints"],
trainers=envs["num_trainers"],
sync_mode=not args.async_mode,
startup_program=startup_prog)
if training_role == "PSERVER":
pserver_program = t.get_pserver_program(envs["current_endpoint"])
pserver_startup_program = t.get_startup_program(
envs["current_endpoint"],
pserver_program,
startup_program=startup_prog)
return pserver_program, pserver_startup_program
elif training_role == "TRAINER":
train_program = t.get_trainer_program()
return train_program, startup_prog
else:
raise ValueError(
'PADDLE_TRAINING_ROLE environment variable must be either TRAINER or PSERVER'
)
def nccl2_prepare_paddle(trainer_id, startup_prog, main_prog):
config = fluid.DistributeTranspilerConfig()
config.mode = "nccl2"
t = fluid.DistributeTranspiler(config=config)
t.transpile(
trainer_id,
trainers=os.environ.get('PADDLE_TRAINER_ENDPOINTS'),
current_endpoint=os.environ.get('PADDLE_CURRENT_ENDPOINT'),
startup_program=startup_prog,
program=main_prog)
def prepare_for_multi_process(exe, build_strategy, train_prog):
# prepare for multi-process
trainer_id = int(os.environ.get('PADDLE_TRAINER_ID', 0))
num_trainers = int(os.environ.get('PADDLE_TRAINERS_NUM', 1))
if num_trainers < 2: return
build_strategy.num_trainers = num_trainers
build_strategy.trainer_id = trainer_id
# NOTE(zcd): use multiple processes to train the model,
# and each process uses one GPU card.
startup_prog = fluid.Program()
nccl2_prepare_paddle(trainer_id, startup_prog, train_prog)
# the startup_prog may be run twice, but it doesn't matter.
exe.run(startup_prog)
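# Editorial usage sketch (caller-side names are assumptions; with fewer than
# two trainers the helper is a no-op, otherwise it sets
# build_strategy.num_trainers / trainer_id and runs the NCCL2 transpile):
#
#   exe = fluid.Executor(fluid.CUDAPlace(int(os.getenv("FLAGS_selected_gpus", 0))))
#   build_strategy = fluid.BuildStrategy()
#   prepare_for_multi_process(exe, build_strategy, train_prog)
#   compiled_prog = fluid.CompiledProgram(train_prog).with_data_parallel(
#       loss_name=avg_loss.name, build_strategy=build_strategy)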
......@@ -45,6 +45,10 @@ model_urls = {
"https://paddleseg.bj.bcebos.com/models/xception65_coco.tgz",
"unet_bn_coco":
"https://paddleseg.bj.bcebos.com/models/unet_coco_v3.tgz",
"pspnet50_bn_coco":
"https://paddleseg.bj.bcebos.com/models/pspnet50_coco.tgz",
"pspnet101_bn_coco":
"https://paddleseg.bj.bcebos.com/models/pspnet101_coco.tgz",
# Cityscapes pretrained
"deeplabv3p_mobilenetv2-1-0_bn_cityscapes":
......@@ -58,7 +62,9 @@ model_urls = {
"icnet_bn_cityscapes":
"https://paddleseg.bj.bcebos.com/models/icnet_cityscapes.tar.gz",
"pspnet50_bn_cityscapes":
"https://paddleseg.bj.bcebos.com/models/pspnet50_cityscapes.tgz"
"https://paddleseg.bj.bcebos.com/models/pspnet50_cityscapes.tgz",
"pspnet101_bn_cityscapes":
"https://paddleseg.bj.bcebos.com/models/pspnet101_cityscapes.tgz",
}
if __name__ == "__main__":
......
......@@ -31,10 +31,10 @@ MODEL:
ASPP_WITH_SEP_CONV: True
DECODER_USE_SEP_CONV: True
TEST:
TEST_MODEL: "snapshots/cityscape_v5/final/"
TEST_MODEL: "./saved_model/cityscape_v5/final/"
TRAIN:
MODEL_SAVE_DIR: "snapshots/cityscape_v5/"
PRETRAINED_MODEL_DIR: "pretrain/deeplabv3plus_gn_init"
MODEL_SAVE_DIR: "./saved_model/cityscape_v5/"
PRETRAINED_MODEL_DIR: "pretrained_model/deeplabv3plus_gn_init"
SNAPSHOT_EPOCH: 10
SOLVER:
LR: 0.001
......
......@@ -12,15 +12,15 @@ AUG:
MIN_SCALE_FACTOR: 0.75 # for stepscaling
SCALE_STEP_SIZE: 0.25 # for stepscaling
MIRROR: True
BATCH_SIZE: 6
BATCH_SIZE: 4
DATASET:
DATA_DIR: "./dataset/pet/"
DATA_DIR: "./dataset/mini_pet/"
IMAGE_TYPE: "rgb" # choice rgb or rgba
NUM_CLASSES: 4 # including ignore
TEST_FILE_LIST: "./dataset/pet/test_list.txt"
TRAIN_FILE_LIST: "./dataset/pet/train_list.txt"
VAL_FILE_LIST: "./dataset/pet/val_list.txt"
VIS_FILE_LIST: "./dataset/pet/val_list.txt"
NUM_CLASSES: 3
TEST_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt"
TRAIN_FILE_LIST: "./dataset/mini_pet/file_list/train_list.txt"
VAL_FILE_LIST: "./dataset/mini_pet/file_list/val_list.txt"
VIS_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt"
IGNORE_INDEX: 255
SEPARATOR: " "
FREEZE:
......@@ -30,13 +30,13 @@ MODEL:
MODEL_NAME: "unet"
DEFAULT_NORM_TYPE: "bn"
TEST:
TEST_MODEL: "./test/saved_model/unet_pet/final/"
TEST_MODEL: "./saved_model/unet_pet/final/"
TRAIN:
MODEL_SAVE_DIR: "./test/saved_models/unet_pet/"
PRETRAINED_MODEL_DIR: "./test/models/unet_coco/"
MODEL_SAVE_DIR: "./saved_model/unet_pet/"
PRETRAINED_MODEL_DIR: "./test/models/unet_coco_init/"
SNAPSHOT_EPOCH: 10
SOLVER:
NUM_EPOCHS: 500
NUM_EPOCHS: 100
LR: 0.005
LR_POLICY: "poly"
OPTIMIZER: "adam"
......@@ -50,7 +50,7 @@ if __name__ == "__main__":
dest="devices",
help="GPU id of running. if more than one, use spacing to separate.",
nargs="+",
default=0,
default=[0],
type=int)
args = parser.parse_args()
......
......@@ -51,7 +51,7 @@ if __name__ == "__main__":
dest="devices",
help="GPU id of running. if more than one, use spacing to separate.",
nargs="+",
default=0,
default=[0],
type=int)
args = parser.parse_args()
......
......@@ -21,7 +21,7 @@ python dataset/download_pet.py
Next, download the corresponding pretrained model:
```shell
python pretrained_model/download_model.py deeplabv3p_xception65_bn_cityscapes
python pretrained_model/download_model.py deeplabv3p_xception65_bn_coco
```
## 3. Prepare the configuration
......@@ -47,7 +47,7 @@ python pretrained_model/download_model.py deeplabv3p_xception65_bn_cityscapes
The dataset configuration depends on the data path; in this tutorial the data is stored under `dataset/mini_pet`.
Other settings are tuned according to the dataset and the machine environment. Finally we save a yaml configuration file with the following content to `configs/test_deeplabv3p_pet.yaml`
Other settings are tuned according to the dataset and the machine environment. Finally we save a yaml configuration file with the following content to **configs/deeplabv3p_xception65_pet.yaml**
```yaml
# Dataset configuration
......@@ -91,7 +91,7 @@ SOLVER:
Before starting training and evaluation, we also need to validate the configuration and the data once to make sure both are correct. Launch the validation with the following command:
```shell
python pdseg/check.py --cfg ./configs/test_deeplabv3p_pet.yaml
python pdseg/check.py --cfg ./configs/deeplabv3p_xception65_pet.yaml
```
......@@ -100,7 +100,7 @@ python pdseg/check.py --cfg ./configs/test_deeplabv3p_pet.yaml
Once validation passes, start training with the following command:
```shell
python pdseg/train.py --use_gpu --cfg ./configs/test_deeplabv3p_pet.yaml
python pdseg/train.py --use_gpu --cfg ./configs/deeplabv3p_xception65_pet.yaml
```
## 6. Run evaluation
......@@ -108,7 +108,7 @@ python pdseg/train.py --use_gpu --cfg ./configs/test_deeplabv3p_pet.yaml
After training completes, start evaluation with the following command:
```shell
python pdseg/eval.py --use_gpu --cfg ./configs/test_deeplabv3p_pet.yaml
python pdseg/eval.py --use_gpu --cfg ./configs/deeplabv3p_xception65_pet.yaml
```
## Model combinations
......@@ -123,7 +123,7 @@ python pdseg/eval.py --use_gpu --cfg ./configs/test_deeplabv3p_pet.yaml
|xception41_imagenet|-|bn|ImageNet|MODEL.MODEL_NAME: deeplabv3p <br> MODEL.DEEPLAB.BACKBONE: xception_41 <br> MODEL.DEFAULT_NORM_TYPE: bn|
|xception65_imagenet|-|bn|ImageNet|MODEL.MODEL_NAME: deeplabv3p <br> MODEL.DEEPLAB.BACKBONE: xception_65 <br> MODEL.DEFAULT_NORM_TYPE: bn|
|deeplabv3p_mobilenetv2-1-0_bn_coco|MobileNet V2|bn|COCO|MODEL.MODEL_NAME: deeplabv3p <br> MODEL.DEEPLAB.BACKBONE: mobilenet <br> MODEL.DEEPLAB.DEPTH_MULTIPLIER: 1.0 <br> MODEL.DEEPLAB.ENCODER_WITH_ASPP: False <br> MODEL.DEEPLAB.ENABLE_DECODER: False <br> MODEL.DEFAULT_NORM_TYPE: bn|
|deeplabv3p_xception65_bn_coco|Xception|bn|COCO|MODEL.MODEL_NAME: deeplabv3p <br> MODEL.DEEPLAB.BACKBONE: xception_65 <br> MODEL.DEFAULT_NORM_TYPE: bn |
|**deeplabv3p_xception65_bn_coco**|Xception|bn|COCO|MODEL.MODEL_NAME: deeplabv3p <br> MODEL.DEEPLAB.BACKBONE: xception_65 <br> MODEL.DEFAULT_NORM_TYPE: bn |
|deeplabv3p_mobilenetv2-1-0_bn_cityscapes|MobileNet V2|bn|Cityscapes|MODEL.MODEL_NAME: deeplabv3p <br> MODEL.DEEPLAB.BACKBONE: mobilenet <br> MODEL.DEEPLAB.DEPTH_MULTIPLIER: 1.0 <br> MODEL.DEEPLAB.ENCODER_WITH_ASPP: False <br> MODEL.DEEPLAB.ENABLE_DECODER: False <br> MODEL.DEFAULT_NORM_TYPE: bn|
|deeplabv3p_xception65_gn_cityscapes|Xception|gn|Cityscapes|MODEL.MODEL_NAME: deeplabv3p <br> MODEL.DEEPLAB.BACKBONE: xception_65 <br> MODEL.DEFAULT_NORM_TYPE: gn|
|**deeplabv3p_xception65_bn_cityscapes**|Xception|bn|Cityscapes|MODEL.MODEL_NAME: deeplabv3p <br> MODEL.DEEPLAB.BACKBONE: xception_65 <br> MODEL.DEFAULT_NORM_TYPE: bn|
|deeplabv3p_xception65_bn_cityscapes|Xception|bn|Cityscapes|MODEL.MODEL_NAME: deeplabv3p <br> MODEL.DEEPLAB.BACKBONE: xception_65 <br> MODEL.DEFAULT_NORM_TYPE: bn|
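The configuration column above maps one-to-one onto config keys. As an editorial sketch (the `update_from_list` method appears in `pdseg/utils/config.py` in this commit, but the exact key/value list format is an assumption), the same overrides could be applied programmatically:

```python
# Hypothetical programmatic equivalent of one row of the table above:
cfg.update_from_list([
    "MODEL.MODEL_NAME", "deeplabv3p",
    "MODEL.DEEPLAB.BACKBONE", "xception_65",
    "MODEL.DEFAULT_NORM_TYPE", "bn",
])
```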
......@@ -47,7 +47,7 @@ python pretrained_model/download_model.py icnet_bn_cityscapes
The dataset configuration depends on the data path; in this tutorial the data is stored under `dataset/mini_pet`.
Other settings are tuned according to the dataset and the machine environment. Finally we save a yaml configuration file with the following content to `configs/test_pet.yaml`
Other settings are tuned according to the dataset and the machine environment. Finally we save a yaml configuration file with the following content to **configs/icnet_pet.yaml**
```yaml
# Dataset configuration
......@@ -93,7 +93,7 @@ SOLVER:
Before starting training and evaluation, we also need to validate the configuration and the data once to make sure both are correct. Launch the validation with the following command:
```shell
python pdseg/check.py --cfg ./configs/test_pet.yaml
python pdseg/check.py --cfg ./configs/icnet_pet.yaml
```
......@@ -102,7 +102,7 @@ python pdseg/check.py --cfg ./configs/test_pet.yaml
Once validation passes, start training with the following command:
```shell
python pdseg/train.py --use_gpu --cfg ./configs/test_pet.yaml
python pdseg/train.py --use_gpu --cfg ./configs/icnet_pet.yaml
```
## 6. Run evaluation
......@@ -110,11 +110,11 @@ python pdseg/train.py --use_gpu --cfg ./configs/test_pet.yaml
After training completes, start evaluation with the following command:
```shell
python pdseg/eval.py --use_gpu --cfg ./configs/test_pet.yaml
python pdseg/eval.py --use_gpu --cfg ./configs/icnet_pet.yaml
```
## Model combinations
|Pretrained model|Backbone|Norm|Dataset|Configuration|
|-|-|-|-|-|
|icnet_bn_cityscapes|-|bn|Cityscapes|MODEL.MODEL_NAME: icnet <br> MODEL.DEFAULT_NORM_TYPE: bn <br> MULTI_LOSS_WEIGHT: [1.0, 0.4, 0.16]|
|icnet_bn_cityscapes|-|bn|Cityscapes|MODEL.MODEL_NAME: icnet <br> MODEL.DEFAULT_NORM_TYPE: bn <br> MODEL.MULTI_LOSS_WEIGHT: [1.0, 0.4, 0.16]|
......@@ -47,7 +47,7 @@ python pretrained_model/download_model.py unet_bn_coco
The dataset configuration depends on the data path; in this tutorial the data is stored under `dataset/mini_pet`.
Other settings are tuned according to the dataset and the machine environment. Finally we save a yaml configuration file with the following content to `configs/test_unet_pet.yaml`
Other settings are tuned according to the dataset and the machine environment. Finally we save a yaml configuration file with the following content to **configs/unet_pet.yaml**
```yaml
# Dataset configuration
......@@ -90,7 +90,7 @@ SOLVER:
Before starting training and evaluation, we also need to validate the configuration and the data once to make sure both are correct. Launch the validation with the following command:
```shell
python pdseg/check.py --cfg ./configs/test_unet_pet.yaml
python pdseg/check.py --cfg ./configs/unet_pet.yaml
```
......@@ -99,7 +99,7 @@ python pdseg/check.py --cfg ./configs/test_unet_pet.yaml
Once validation passes, start training with the following command:
```shell
python pdseg/train.py --use_gpu --cfg ./configs/test_unet_pet.yaml
python pdseg/train.py --use_gpu --cfg ./configs/unet_pet.yaml
```
## 6. Run evaluation
......@@ -107,11 +107,11 @@ python pdseg/train.py --use_gpu --cfg ./configs/test_unet_pet.yaml
After training completes, start evaluation with the following command:
```shell
python pdseg/eval.py --use_gpu --cfg ./configs/test_unet_pet.yaml
python pdseg/eval.py --use_gpu --cfg ./configs/unet_pet.yaml
```
## Model combinations
|Pretrained model|Backbone|Norm|Dataset|Configuration|
|-|-|-|-|-|
|unet_bn_coco|-|bn|Cityscapes|MODEL.MODEL_NAME: unet <br> MODEL.DEFAULT_NORM_TYPE: bn|
|unet_bn_coco|-|bn|COCO|MODEL.MODEL_NAME: unet <br> MODEL.DEFAULT_NORM_TYPE: bn|