diff --git a/README.md b/README.md index b7537c23590912f33f3758695d291a1c68daa84c..0fb10073f04f7f050be3575ac3217cecd465cce5 100644 --- a/README.md +++ b/README.md @@ -1,7 +1,8 @@ -# PaddleSeg 语义分割库 +# PaddleSeg 图像分割库 [![Build Status](https://travis-ci.org/PaddlePaddle/PaddleSeg.svg?branch=master)](https://travis-ci.org/PaddlePaddle/PaddleSeg) [![License](https://img.shields.io/badge/license-Apache%202-blue.svg)](LICENSE) +[![Version](https://img.shields.io/github/release/PaddlePaddle/PaddleSeg.svg)](https://github.com/PaddlePaddle/PaddleSeg/releases) ## 简介 @@ -32,7 +33,7 @@ PaddleSeg支持多进程IO、多卡并行、跨卡Batch Norm同步等训练加 我们提供了一系列的使用教程,来说明如何使用PaddleSeg完成一个语义分割模型的训练、评估、部署。 -这一系列的文档被分为`快速入门`、`基础功能`、`预测部署`、`高级功能`四个部分,四个教程由浅至深地介绍PaddleSeg的设计思路和使用方法。 +这一系列的文档被分为**快速入门**、**基础功能**、**预测部署**、**高级功能**四个部分,四个教程由浅至深地介绍PaddleSeg的设计思路和使用方法。 ### 快速入门 @@ -45,9 +46,9 @@ PaddleSeg支持多进程IO、多卡并行、跨卡Batch Norm同步等训练加 * [预训练模型列表](./docs/model_zoo.md) * [自定义数据的准备与标注](./docs/data_prepare.md) * [数据和配置校验](./docs/check.md) -* [使用DeepLabv3+预训练模型](./turtorial/finetune_deeplabv3plus.md) -* [使用UNet预训练模型](./turtorial/finetune_unet.md) -* [使用ICNet预训练模型](./turtorial/finetune_icnet.md) +* [如何训练DeepLabv3+](./turtorial/finetune_deeplabv3plus.md) +* [如何训练U-Net](./turtorial/finetune_unet.md) +* [如何训练ICNet](./turtorial/finetune_icnet.md) ### 预测部署 @@ -66,34 +67,50 @@ PaddleSeg支持多进程IO、多卡并行、跨卡Batch Norm同步等训练加 #### Q: 安装requirements.txt指定的依赖包时,部分包提示找不到? -A: 可能是pip源的问题,这种情况下建议切换为官方源 +A: 可能是pip源的问题,这种情况下建议切换为官方源,或者通过`pip install -r requirements.txt -i `指定其他源地址。 -#### Q:图像分割的数据增强如何配置,unpadding, step-scaling, range-scaling的原理是什么? +#### Q:图像分割的数据增强如何配置,Unpadding, StepScaling, RangeScaling的原理是什么? A: 更详细数据增强文档可以参考[数据增强](./docs/data_aug.md) +#### Q: 训练时因为某些原因中断了,如何恢复训练? + +A: 启动训练脚本时通过命令行将TRAIN.RESUME_MODEL_DIR配置覆盖为模型checkpoint目录即可, 以下代码示例为从第100轮的checkpoint恢复训练: +``` +python pdseg/train.py --cfg xxx.yaml TRAIN.RESUME_MODEL_DIR /PATH/TO/MODEL_CKPT/100 +``` + +#### Q: 预测时图片过大,导致显存不足如何处理? 
-A: 降低Batch size,使用Group Norm策略等。 +A: 降低Batch size,使用Group Norm策略;请注意:训练过程中当`DEFAULT_NORM_TYPE`选择`bn`时,为保证Batch Norm计算的稳定性,batch size需满足>=2。
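例如,可以参考如下yaml配置片段(仅为示意,字段名与本仓库configs中的配置一致,具体数值请根据实际显存情况调整):

```yaml
# 示意配置:显存不足时改用Group Norm并调低batch size
MODEL:
    DEFAULT_NORM_TYPE: "gn"
BATCH_SIZE: 2
```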
## 在线体验 -PaddleSeg提供了多种预训练模型,并且以NoteBook的方式提供了在线体验的教程,欢迎体验: +PaddleSeg在AI Studio平台上提供了在线体验的教程,欢迎体验: |教程|链接| |-|-| -|快速开始:人像分割|[点击体验](https://aistudio.baidu.com/aistudio/projectDetail/100798)| |U-Net宠物分割|[点击体验](https://aistudio.baidu.com/aistudio/projectDetail/102889)| |DeepLabv3+图像分割|[点击体验](https://aistudio.baidu.com/aistudio/projectDetail/101696)| |PaddleSeg特色垂类模型|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/115541)|
+## 交流与反馈 +* 欢迎您通过GitHub Issues来提交问题、报告与建议 +* 微信公众号:飞桨PaddlePaddle +* QQ群: 796771754 + +
+（此处为二维码图片:微信公众号 与 官方技术交流QQ群）
+
+   微信公众号                官方技术交流QQ群
+
+ +* 论坛: 欢迎大家在[PaddlePaddle论坛](https://ai.baidu.com/forum/topic/list/168)分享在使用PaddlePaddle中遇到的问题和经验, 营造良好的论坛氛围 + ## 更新日志 -* 2019.08.26 +* 2019.09.10 **`v0.1.0`** * PaddleSeg分割库初始版本发布,包含DeepLabv3+, U-Net, ICNet三类分割模型, 其中DeepLabv3+支持Xception, MobileNet两种可调节的骨干网络。 diff --git a/configs/cityscape.yaml b/configs/cityscape.yaml deleted file mode 100644 index f14234d77970e680cb39be46287f94ee5999c1a7..0000000000000000000000000000000000000000 --- a/configs/cityscape.yaml +++ /dev/null @@ -1,41 +0,0 @@ -EVAL_CROP_SIZE: (2049, 1025) # (width, height), for unpadding rangescaling and stepscaling -TRAIN_CROP_SIZE: (769, 769) # (width, height), for unpadding rangescaling and stepscaling -AUG: - AUG_METHOD: "stepscaling" # choice unpadding rangescaling and stepscaling - FIX_RESIZE_SIZE: (640, 640) # (width, height), for unpadding - INF_RESIZE_VALUE: 500 # for rangescaling - MAX_RESIZE_VALUE: 600 # for rangescaling - MIN_RESIZE_VALUE: 400 # for rangescaling - MAX_SCALE_FACTOR: 2.0 # for stepscaling - MIN_SCALE_FACTOR: 0.5 # for stepscaling - SCALE_STEP_SIZE: 0.25 # for stepscaling - MIRROR: True -BATCH_SIZE: 4 -DATASET: - DATA_DIR: "./dataset/cityscapes/" - IMAGE_TYPE: "rgb" # choice rgb or rgba - NUM_CLASSES: 19 - TEST_FILE_LIST: "dataset/cityscapes/val.list" - TRAIN_FILE_LIST: "dataset/cityscapes/train.list" - VAL_FILE_LIST: "dataset/cityscapes/val.list" - IGNORE_INDEX: 255 -FREEZE: - MODEL_FILENAME: "model" - PARAMS_FILENAME: "params" -MODEL: - DEFAULT_NORM_TYPE: "gn" - MODEL_NAME: "deeplabv3p" - DEEPLAB: - ASPP_WITH_SEP_CONV: True - DECODER_USE_SEP_CONV: True -TEST: - TEST_MODEL: "snapshots/cityscape_v5/final/" -TRAIN: - MODEL_SAVE_DIR: "snapshots/cityscape_v7/" - PRETRAINED_MODEL_DIR: "pretrain/deeplabv3plus_gn_init" - SNAPSHOT_EPOCH: 10 -SOLVER: - LR: 0.001 - LR_POLICY: "poly" - OPTIMIZER: "sgd" - NUM_EPOCHS: 700 diff --git a/docs/data_aug.md b/docs/data_aug.md index 90f0e8b5ab13262f0f47cc44a833ef567e2bb4eb..2865d413b7090f414eb44c0681562837de21f19a 100644 --- 
a/docs/data_aug.md +++ b/docs/data_aug.md @@ -7,20 +7,20 @@ ## Resize -resize 步骤是指将输入图像按照某种规则先进行resize,PaddleSeg支持以下3种resize方式: +resize步骤是指将输入图像按照某种规则重新缩放到某一尺寸,PaddleSeg支持以下3种resize方式: ![](imgs/aug_method.png) -- unpadding +- Un-padding 将输入图像直接resize到某一个固定大小下,送入到网络中间训练,对应参数为AUG.FIX_RESIZE_SIZE。预测时同样操作。 -- stepscaling +- Step-Scaling 将输入图像按照某一个比例resize,这个比例以某一个步长在一定范围内随机变动。设定最小比例参数为`AUG.MIN_SCALE_FACTOR`, 最大比例参数`AUG.MAX_SCALE_FACTOR`,步长参数为`AUG.SCALE_STEP_SIZE`。预测时不对输入图像做处理。 -- rangescaling +- Range-Scaling 固定长宽比resize,即图像长边对齐到某一个固定大小,短边随同样的比例变化。设定最小大小参数为`AUG.MIN_RESIZE_VALUE`,设定最大大小参数为`AUG.MAX_RESIZE_VALUE`。预测时需要将长边对齐到`AUG.INF_RESIZE_VALUE`所指定的大小,其中`AUG.INF_RESIZE_VALUE`在`AUG.MIN_RESIZE_VALUE`和`AUG.MAX_RESIZE_VALUE`范围内。 -rangescaling示意图如下: +Range-Scaling示意图如下: ![](imgs/rangescale.png) diff --git a/docs/imgs/qq_group2.png b/docs/imgs/qq_group2.png new file mode 100644 index 0000000000000000000000000000000000000000..b28e0c30600d5dcfc513ab648071ca01d12080a8 Binary files /dev/null and b/docs/imgs/qq_group2.png differ diff --git a/docs/model_export.md b/docs/model_export.md index 4389f2d07f445b861890f6ed0d3573e471bee72d..eab31fddb4c590403726b504c340ae0155bd952b 100644 --- a/docs/model_export.md +++ b/docs/model_export.md @@ -18,4 +18,4 @@ python pdseg/export_model.py --cfg configs/unet_pet.yaml TEST.TEST_MODEL test/saved_models/unet_pet/final ``` -模型会导出到freeze_model目录 +预测模型会导出到`freeze_model`目录,用于C++预测的模型配置会导出到`freeze_model/deploy.yaml`下 diff --git a/docs/model_zoo.md b/docs/model_zoo.md index 817acd5d55d55543f9a13c157d34f02b51fe4991..484c7cfa9c15b9f8f6c9a0e019293193b38c183f 100644 --- a/docs/model_zoo.md +++ b/docs/model_zoo.md @@ -39,6 +39,6 @@ train数据集合为Cityscapes训练集合,测试为Cityscapes的验证集合 | 模型 | 数据集合 | 下载地址 |Output Stride| mutli-scale test| mIoU on val| |---|---|---|---|---|---| | DeepLabv3+/MobileNetv2/bn | Cityscapes |[mobilenet_cityscapes.tgz](https://paddleseg.bj.bcebos.com/models/mobilenet_cityscapes.tgz) |16|false| 0.698| -| DeepLabv3+/Xception65/gn | 
Cityscapes |[deeplabv3p_xception65_cityscapes.tgz](https://paddleseg.bj.bcebos.com/models/deeplabv3p_xception65_cityscapes.tgz) |16|false| 0.7804 | -| DeepLabv3+/Xception65/bn | Cityscapes |[Xception65_deeplab_cityscapes.tgz](https://paddleseg.bj.bcebos.com/models/xception65_bn_cityscapes.tgz) | 16 | false | 0.7715 | +| DeepLabv3+/Xception65/gn | Cityscapes |[deeplabv3p_xception65_cityscapes.tgz](https://paddleseg.bj.bcebos.com/models/deeplabv3p_xception65_cityscapes.tgz) |16|false| 0.7824 | +| DeepLabv3+/Xception65/bn | Cityscapes |[Xception65_deeplab_cityscapes.tgz](https://paddleseg.bj.bcebos.com/models/xception65_bn_cityscapes.tgz) | 16 | false | 0.7930 | | ICNet/bn | Cityscapes |[icnet_cityscapes.tgz](https://paddleseg.bj.bcebos.com/models/icnet6831.tar.gz) |16|false| 0.6831 | diff --git a/docs/usage.md b/docs/usage.md index 7b06846c565281ede2dab7bc5d40eaa1eb607941..03418eba8edff90e7350430fa97e83a0d4ec937c 100644 --- a/docs/usage.md +++ b/docs/usage.md @@ -1,6 +1,6 @@ # 训练/评估/可视化 -PaddleSeg提供了 `训练`/`评估`/`可视化` 等三个功能的使用脚本。三个脚本都支持通过不同的Flags来开启特定功能,也支持通过Options来修改默认的[训练配置](./config.md)。三者的使用方式非常接近,如下: +PaddleSeg提供了 **训练**/**评估**/**可视化** 等三个功能的使用脚本。三个脚本都支持通过不同的Flags来开启特定功能,也支持通过Options来修改默认的[训练配置](./config.md)。三者的使用方式非常接近,如下: ```shell # 训练 python pdseg/train.py ${FLAGS} ${OPTIONS} @@ -11,22 +11,22 @@ python pdseg/eval.py ${FLAGS} ${OPTIONS} python pdseg/vis.py ${FLAGS} ${OPTIONS} ``` -`Note`: +**Note:** -> * FLAGS必须位于OPTIONS之前,否会将会遇到报错,例如如下的例子: -> -> ```shell -> # FLAGS "--cfg configs/cityscapes.yaml" 必须在 OPTIONS "BATCH_SIZE 1" 之前 -> python pdseg/train.py BATCH_SIZE 1 --cfg configs/cityscapes.yaml -> ``` +* FLAGS必须位于OPTIONS之前,否则将会报错,例如以下例子: -## FLAGS +```shell +# FLAGS "--cfg configs/cityscapes.yaml" 必须在 OPTIONS "BATCH_SIZE 1" 之前 +python pdseg/train.py BATCH_SIZE 1 --cfg configs/cityscapes.yaml +``` + +## 命令行FLAGS列表 |FLAG|支持脚本|用途|默认值|备注| |-|-|-|-|-| |--cfg|ALL|配置文件路径|None|| |--use_gpu|ALL|是否使用GPU进行训练|False|| -|--use_mpio|train/eval|是否使用多线程进行IO处理|False|打开该开关会占用一定量的CPU内存,但是可以提高训练速度。
NOTE:windows平台下不支持该功能, 建议使用自定义数据初次训练时不打开,打开会导致数据读取异常不可见。
| +|--use_mpio|train/eval|是否使用多线程进行IO处理|False|打开该开关会占用一定量的CPU内存,但是可以提高训练速度。
**NOTE:** Windows平台下不支持该功能, 建议初次使用自定义数据训练时不打开,否则数据读取异常时报错信息将不可见。
| +|--use_tb|train|是否使用TensorBoard记录训练数据|False|| |--log_steps|train|训练日志的打印周期(单位为step)|10|| |--debug|train|是否打印debug信息|False|IOU等指标涉及到混淆矩阵的计算,会降低训练速度| @@ -55,8 +55,10 @@ python pdseg/vis.py ${FLAGS} ${OPTIONS} # 下载预训练模型并进行解压 python pretrained_model/download_model.py unet_bn_coco ``` -### 下载mini_pet数据集 -我们使用了Oxford-IIIT中的猫和狗两个类别数据制作了一个小数据集mini_pet,用于快速体验 +### 下载Oxford-IIIT Pet数据集 +我们使用了Oxford-IIIT中的猫和狗两个类别数据制作了一个小数据集mini_pet,用于快速体验。 +更多关于数据集的介绍请参考[Oxford-IIIT Pet](https://www.robots.ox.ac.uk/~vgg/data/pets/) + ```shell # 下载预训练模型并进行解压 python dataset/download_pet.py @@ -77,19 +79,22 @@ python pdseg/train.py --use_gpu \ --cfg configs/unet_pet.yaml \ BATCH_SIZE 4 \ TRAIN.PRETRAINED_MODEL_DIR pretrained_model/unet_bn_coco \ - TRAIN.SYNC_BATCH_NORM True \ SOLVER.LR 5e-5 ``` -`NOTE`: -> * 上述示例中,一共存在三套配置方案: PaddleSeg默认配置/unet_pet.yaml/OPTIONS,三者的优先级顺序为 OPTIONS > yaml > 默认配置。这个原则对于train.py/eval.py/vis.py都适用 -> -> * 如果发现因为内存不足而Crash。请适当调低BATCH_SIZE。如果本机GPU内存充足,则可以调高BATCH_SIZE的大小以获得更快的训练速度 +**NOTE:** + +* 上述示例中,一共存在三套配置方案: PaddleSeg默认配置/unet_pet.yaml/OPTIONS,三者的优先级顺序为 OPTIONS > yaml > 默认配置。这个原则对于train.py/eval.py/vis.py都适用 + +* 如果发现因为内存不足而Crash,请适当调低BATCH_SIZE。如果本机GPU内存充足,则可以调高BATCH_SIZE的大小以获得更快的训练速度;BATCH_SIZE增大时,可以适当调高学习率。 + +* 如果在Linux系统下训练,可以使用`--use_mpio`开启多进程I/O,通过提升数据增强的处理速度进而大幅度提升GPU利用率。 + ### 训练过程可视化 -当打开do_eval和use_tb两个开关后,我们可以通过TensorBoard查看训练的效果 +当打开do_eval和use_tb两个开关后,我们可以通过TensorBoard查看边训练边评估的效果。 ```shell tensorboard --logdir train_log --host {$HOST_IP} --port {$PORT} @@ -107,14 +112,14 @@ NOTE: ![](./imgs/tensorboard_image.JPG) ### 模型评估 -训练完成后,我们可以通过eval.py来评估模型效果。由于我们设置的训练EPOCH数量为500,保存间隔为10,因此一共会产生50个定期保存的模型,加上最终保存的final模型,一共有51个模型。我们选择最后保存的模型进行效果的评估: +训练完成后,我们可以通过eval.py来评估模型效果。由于我们设置的训练EPOCH数量为100,保存间隔为10,因此一共会产生10个定期保存的模型,加上最终保存的final模型,一共有11个模型。我们选择最后保存的模型进行效果的评估: + ```shell python pdseg/eval.py --use_gpu \ --cfg configs/unet_pet.yaml \ TEST.TEST_MODEL test/saved_models/unet_pet/final ``` - ### 模型可视化 通过vis.py来评估模型效果,我们选择最后保存的模型进行效果的评估: ```shell diff --git 
a/inference/README.md b/inference/README.md index a29a45b54de496399809081a3dd4bd0c8cbde929..15872fe20bde05e410300bbc7e2ae1586e349a92 100644 --- a/inference/README.md +++ b/inference/README.md @@ -70,7 +70,11 @@ deeplabv3p_xception65_humanseg ### 2. 修改配置 -源代码的`conf`目录下提供了示例人像分割模型的配置文件`humanseg.yaml`, 相关的字段含义和说明如下: + +基于`PaddleSeg`训练的模型导出时,会自动生成对应的预测模型配置文件,请参考文档:[模型导出](../docs/model_export.md)。 + +`inference`源代码(即本目录)的`conf`目录下提供了示例人像分割模型的配置文件`humanseg.yaml`, 相关的字段含义和说明如下: + ```yaml DEPLOY: # 是否使用GPU预测 @@ -102,7 +106,6 @@ DEPLOY: ``` 修改字段`MODEL_PATH`的值为你在**上一步**下载并解压的模型文件所放置的目录即可。 - ### 3. 执行预测 在终端中切换到生成的可执行文件所在目录为当前目录(Windows系统为`cmd`)。 diff --git a/pdseg/data_aug.py b/pdseg/data_aug.py index d845d9c3ac5d40675de35276bbe71bb4f19589a3..474fba9a1236ee8db478a45dd5355f225c875afb 100644 --- a/pdseg/data_aug.py +++ b/pdseg/data_aug.py @@ -374,7 +374,7 @@ def rand_crop(crop_img, crop_seg, mode=ModelPhase.TRAIN): Args: crop_img(numpy.ndarray): 输入图像 crop_seg(numpy.ndarray): 标签图 - mode(string): 模式, 默认训练模式,验证或预测模式时crop尺寸需大于原始图片尺寸, 其他模式无限制 + mode(string): 模式, 默认训练模式,验证或预测、可视化模式时crop尺寸需大于原始图片尺寸 Returns: 裁剪后的图片和标签图 @@ -391,7 +391,7 @@ def rand_crop(crop_img, crop_seg, mode=ModelPhase.TRAIN): crop_width = cfg.EVAL_CROP_SIZE[0] crop_height = cfg.EVAL_CROP_SIZE[1] - if ModelPhase.is_eval(mode) or ModelPhase.is_predict(mode): + if not ModelPhase.is_train(mode): if (crop_height < img_height or crop_width < img_width): raise Exception( "Crop size({},{}) must large than img size({},{}) when in EvalPhase." 
@@ -410,7 +410,7 @@ def rand_crop(crop_img, crop_seg, mode=ModelPhase.TRAIN): 0, pad_width, cv2.BORDER_CONSTANT, - value=cfg.MEAN) + value=cfg.DATASET.PADDING_VALUE) if crop_seg is not None: crop_seg = cv2.copyMakeBorder( crop_seg, diff --git a/pdseg/export_model.py b/pdseg/export_model.py index 93c9bea98862572454960c75a6d7548249deebe9..27423bb705ba94d6418e376af04ad06d5d8ccb8e 100644 --- a/pdseg/export_model.py +++ b/pdseg/export_model.py @@ -49,6 +49,32 @@ def parse_args(): sys.exit(1) return parser.parse_args() +def export_inference_config(): + deploy_cfg = '''DEPLOY: + USE_GPU : 1 + MODEL_PATH : "%s" + MODEL_FILENAME : "%s" + PARAMS_FILENAME : "%s" + EVAL_CROP_SIZE : %s + MEAN : %s + STD : %s + IMAGE_TYPE : "%s" + NUM_CLASSES : %d + CHANNELS : %d + PRE_PROCESSOR : "SegPreProcessor" + PREDICTOR_MODE : "ANALYSIS" + BATCH_SIZE : 1 + ''' % (cfg.FREEZE.SAVE_DIR, cfg.FREEZE.MODEL_FILENAME, + cfg.FREEZE.PARAMS_FILENAME, cfg.EVAL_CROP_SIZE, + cfg.MEAN, cfg.STD, cfg.DATASET.IMAGE_TYPE, + cfg.DATASET.NUM_CLASSES, len(cfg.STD)) + if not os.path.exists(cfg.FREEZE.SAVE_DIR): + os.mkdir(cfg.FREEZE.SAVE_DIR) + yaml_path = os.path.join(cfg.FREEZE.SAVE_DIR, 'deploy.yaml') + with open(yaml_path, "w") as fp: + fp.write(deploy_cfg) + return yaml_path + def export_inference_model(args): """ @@ -81,6 +107,9 @@ def export_inference_model(args): model_filename=cfg.FREEZE.MODEL_FILENAME, params_filename=cfg.FREEZE.PARAMS_FILENAME) print("Inference model exported!") + print("Exporting inference model config...") + deploy_cfg_path = export_inference_config() + print("Inference model saved : [%s]" % (deploy_cfg_path)) def main(): diff --git a/pdseg/models/backbone/resnet.py b/pdseg/models/backbone/resnet.py index 5a2b624a9d7a9271d7120f728640142d57533fb4..6eb9f12bc33c97f42664327ada3712c8283943b1 100644 --- a/pdseg/models/backbone/resnet.py +++ b/pdseg/models/backbone/resnet.py @@ -85,7 +85,7 @@ class ResNet(): depth = [3, 8, 36, 3] num_filters = [64, 128, 256, 512] - if self.stem == 
'icnet': + if self.stem == 'icnet' or self.stem == 'pspnet': conv = self.conv_bn_layer( input=input, num_filters=int(64 * self.scale), @@ -139,9 +139,9 @@ class ResNet(): else: conv_name = "res" + str(block + 2) + "b" + str(i) else: - conv_name = "conv" + str(block + 2) + '_' + str(1 + i) + conv_name = "res" + str(block + 2) + chr(97 + i) dilation_rate = get_dilated_rate(dilation_dict, block) - + conv = self.bottleneck_block( input=conv, num_filters=int(num_filters[block] * self.scale), @@ -215,6 +215,12 @@ class ResNet(): groups=1, act=None, name=None): + + if self.stem == 'pspnet': + bias_attr=ParamAttr(name=name + "_biases") + else: + bias_attr=False + conv = fluid.layers.conv2d( input=input, num_filters=num_filters, @@ -224,20 +230,21 @@ class ResNet(): dilation=dilation, groups=groups, act=None, - param_attr=ParamAttr(name=name + "/weights"), - bias_attr=False, + param_attr=ParamAttr(name=name + "_weights"), + bias_attr=bias_attr, name=name + '.conv2d.output.1') - bn_name = name + '/BatchNorm/' - return fluid.layers.batch_norm( - input=conv, - act=act, - name=bn_name + '.output.1', - param_attr=ParamAttr(name=bn_name + 'gamma'), - bias_attr=ParamAttr(bn_name + 'beta'), - moving_mean_name=bn_name + 'moving_mean', - moving_variance_name=bn_name + 'moving_variance', - ) + if name == "conv1": + bn_name = "bn_" + name + else: + bn_name = "bn" + name[3:] + return fluid.layers.batch_norm(input=conv, + act=act, + name=bn_name + '.output.1', + param_attr=ParamAttr(name=bn_name + '_scale'), + bias_attr=ParamAttr(bn_name + '_offset'), + moving_mean_name=bn_name + '_mean', + moving_variance_name=bn_name + '_variance', ) def shortcut(self, input, ch_out, stride, is_first, name): ch_in = input.shape[1] @@ -247,12 +254,17 @@ class ResNet(): return input def bottleneck_block(self, input, num_filters, stride, name, dilation=1): + if self.stem == 'pspnet' and self.layers == 101: + strides = [1, stride] + else: + strides = [stride, 1] + conv0 = self.conv_bn_layer( input=input, 
num_filters=num_filters, filter_size=1, dilation=1, - stride=stride, + stride=strides[0], act='relu', name=name + "_branch2a") if dilation > 1: @@ -262,6 +274,7 @@ class ResNet(): num_filters=num_filters, filter_size=3, dilation=dilation, + stride=strides[1], act='relu', name=name + "_branch2b") conv2 = self.conv_bn_layer( diff --git a/pdseg/models/model_builder.py b/pdseg/models/model_builder.py index 5ff5d51e3f2a8539529d1606a78747f7ecfa61e7..f2ba513a7b14b1b34c2f5dfba2080072cf965356 100644 --- a/pdseg/models/model_builder.py +++ b/pdseg/models/model_builder.py @@ -73,6 +73,7 @@ def map_model_name(model_name): "unet": "unet.unet", "deeplabv3p": "deeplab.deeplabv3p", "icnet": "icnet.icnet", + "pspnet": "pspnet.pspnet", } if model_name in name_dict.keys(): return name_dict[model_name] diff --git a/pdseg/models/modeling/pspnet.py b/pdseg/models/modeling/pspnet.py new file mode 100644 index 0000000000000000000000000000000000000000..8c322b44011f90ac6b29716a51d3e89483bdb3e4 --- /dev/null +++ b/pdseg/models/modeling/pspnet.py @@ -0,0 +1,112 @@ +# coding: utf8 +# copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +import paddle.fluid as fluid +from paddle.fluid.param_attr import ParamAttr +from models.libs.model_libs import scope, name_scope +from models.libs.model_libs import avg_pool, conv, bn +from models.backbone.resnet import ResNet as resnet_backbone +from utils.config import cfg + +def get_logit_interp(input, num_classes, out_shape, name="logit"): + # 根据类别数决定最后一层卷积输出, 并插值回原始尺寸 + param_attr = fluid.ParamAttr( + name=name + 'weights', + regularizer=fluid.regularizer.L2DecayRegularizer( + regularization_coeff=0.0), + initializer=fluid.initializer.TruncatedNormal(loc=0.0, scale=0.01)) + + with scope(name): + logit = conv(input, + num_classes, + filter_size=1, + param_attr=param_attr, + bias_attr=True, + name=name+'_conv') + logit_interp = fluid.layers.resize_bilinear( + logit, + out_shape=out_shape, + name=name+'_interp') + return logit_interp + + +def psp_module(input, out_features): + # Pyramid Scene Parsing 金字塔池化模块 + # 输入:backbone输出的特征 + # 输出:对输入进行不同尺度pooling, 卷积操作后插值回原始尺寸,并concat + # 最后进行一个卷积及BN操作 + + cat_layers = [] + sizes = (1,2,3,6) + for size in sizes: + psp_name = "psp" + str(size) + with scope(psp_name): + pool = fluid.layers.adaptive_pool2d(input, + pool_size=[size, size], + pool_type='avg', + name=psp_name+'_adapool') + data = conv(pool, out_features, + filter_size=1, + bias_attr=True, + name= psp_name + '_conv') + data_bn = bn(data, act='relu') + interp = fluid.layers.resize_bilinear(data_bn, + out_shape=input.shape[2:], + name=psp_name+'_interp') + cat_layers.append(interp) + cat_layers = [input] + cat_layers[::-1] + cat = fluid.layers.concat(cat_layers, axis=1, name='psp_cat') + + psp_end_name = "psp_end" + with scope(psp_end_name): + data = conv(cat, + out_features, + filter_size=3, + padding=1, + bias_attr=True, + name=psp_end_name) + out = bn(data, act='relu') + + return out + +def resnet(input): + # PSPNET backbone: resnet, 默认resnet50 + 
# end_points: resnet终止层数 + # dilation_dict: resnet block数及对应的膨胀卷积尺度 + scale = cfg.MODEL.PSPNET.DEPTH_MULTIPLIER + layers = cfg.MODEL.PSPNET.LAYERS + end_points = layers - 1 + dilation_dict = {2:2, 3:4} + model = resnet_backbone(layers, scale, stem='pspnet') + data, _ = model.net(input, + end_points=end_points, + dilation_dict=dilation_dict) + + return data + +def pspnet(input, num_classes): + # Backbone: ResNet + res = resnet(input) + # PSP模块 + psp = psp_module(res, 512) + dropout = fluid.layers.dropout(psp, dropout_prob=0.1, name="dropout") + # 根据类别数决定最后一层卷积输出, 并插值回原始尺寸 + logit = get_logit_interp(dropout, num_classes, input.shape[2:]) + return logit + diff --git a/pdseg/reader.py b/pdseg/reader.py index c839828cf99b89ceff62837cd6877a5659f80d06..e53b5912e07b1591d4014314f809e6997d925731 100644 --- a/pdseg/reader.py +++ b/pdseg/reader.py @@ -246,7 +246,7 @@ class SegDataset(object): img, grt, rich_crop_max_rotation=cfg.AUG.RICH_CROP.MAX_ROTATION, - mean_value=cfg.MEAN) + mean_value=cfg.DATASET.PADDING_VALUE) img, grt = aug.rand_scale_aspect( img, diff --git a/pdseg/train.py b/pdseg/train.py index 295468ef14b87d8f4886fe0203d362bef3611a42..22a430f7f1bd3ff4c5c1e5ee28a624badc3cac41 100644 --- a/pdseg/train.py +++ b/pdseg/train.py @@ -292,7 +292,7 @@ def train(cfg): for var in load_vars: print("Parameter[{}] loaded sucessfully!".format(var.name)) for var in load_fail_vars: - print("Parameter[{}] shape does not match current network, skip" + print("Parameter[{}] doesn't exist or shape does not match current network, skip" " to load it.".format(var.name)) print("{}/{} pretrained parameters loaded successfully!".format( len(load_vars), diff --git a/pdseg/utils/collect.py b/pdseg/utils/collect.py index 010a5c4bcc7bc84b9bb21cdf4f845c4ad3de3dbc..6b8f2f4eb0b4c98ce3078812f41dacafc3097bc1 100644 --- a/pdseg/utils/collect.py +++ b/pdseg/utils/collect.py @@ -97,6 +97,8 @@ class SegConfig(dict): raise KeyError( 'DATASET.IMAGE_TYPE config error, only support `rgb`, `gray` and `rgba`' ) 
+ if self.MEAN is not None: + self.DATASET.PADDING_VALUE = [x*255.0 for x in self.MEAN] if not self.TRAIN_CROP_SIZE: raise ValueError( diff --git a/pdseg/utils/config.py b/pdseg/utils/config.py index 1c1c56a6508988cf23fc382771593bd9e8cc52b4..332a5143cfb837160be674c0ccc0bf8edc428000 100644 --- a/pdseg/utils/config.py +++ b/pdseg/utils/config.py @@ -65,6 +65,8 @@ cfg.DATASET.DATA_DIM = 3 cfg.DATASET.SEPARATOR = ' ' # 忽略的像素标签值, 默认为255,一般无需改动 cfg.DATASET.IGNORE_INDEX = 255 +# 数据增强时图像的padding值 +cfg.DATASET.PADDING_VALUE = [127.5,127.5,127.5] ########################### 数据增强配置 ###################################### # 图像镜像左右翻转 @@ -196,6 +198,12 @@ cfg.MODEL.ICNET.DEPTH_MULTIPLIER = 0.5 # RESNET 层数 设置 cfg.MODEL.ICNET.LAYERS = 50 +########################## PSPNET模型配置 ###################################### +# RESNET backbone scale 设置 +cfg.MODEL.PSPNET.DEPTH_MULTIPLIER = 1 +# RESNET 层数 设置 50或101 +cfg.MODEL.PSPNET.LAYERS = 50 + ########################## 预测部署模型配置 ################################### # 预测保存的模型名称 cfg.FREEZE.MODEL_FILENAME = '__model__' diff --git a/pretrained_model/download_model.py b/pretrained_model/download_model.py index b2bde566b210286452813c3ca4b2e424fb8af6b9..b175e17bd2ef122f84428ca7d6614805268ff139 100644 --- a/pretrained_model/download_model.py +++ b/pretrained_model/download_model.py @@ -23,15 +23,15 @@ from test_utils import download_file_and_uncompress model_urls = { # ImageNet Pretrained - "mobilnetv2-2-0_bn_imagenet": + "mobilenetv2-2-0_bn_imagenet": "https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_x2_0_pretrained.tar", - "mobilnetv2-1-5_bn_imagenet": + "mobilenetv2-1-5_bn_imagenet": "https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_x1_5_pretrained.tar", - "mobilnetv2-1-0_bn_imagenet": + "mobilenetv2-1-0_bn_imagenet": "https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_pretrained.tar", - "mobilnetv2-0-5_bn_imagenet": + "mobilenetv2-0-5_bn_imagenet": 
"https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_x0_5_pretrained.tar", - "mobilnetv2-0-25_bn_imagenet": + "mobilenetv2-0-25_bn_imagenet": "https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_x0_25_pretrained.tar", "xception41_imagenet": "https://paddleseg.bj.bcebos.com/models/Xception41_pretrained.tgz", @@ -39,11 +39,12 @@ model_urls = { "https://paddleseg.bj.bcebos.com/models/Xception65_pretrained.tgz", # COCO pretrained - "deeplabv3p_mobilnetv2-1-0_bn_coco": - "https://bj.bcebos.com/v1/paddleseg/deeplabv3plus_coco_bn_init.tgz", + "deeplabv3p_mobilenetv2-1-0_bn_coco": + "https://paddleseg.bj.bcebos.com/deeplab_mobilenet_x1_0_coco.tgz", "deeplabv3p_xception65_bn_coco": "https://paddleseg.bj.bcebos.com/models/xception65_coco.tgz", - "unet_bn_coco": "https://paddleseg.bj.bcebos.com/models/unet_coco_v3.tgz", + "unet_bn_coco": + "https://paddleseg.bj.bcebos.com/models/unet_coco_v3.tgz", # Cityscapes pretrained "deeplabv3p_mobilenetv2-1-0_bn_cityscapes": @@ -52,9 +53,10 @@ model_urls = { "https://paddleseg.bj.bcebos.com/models/deeplabv3p_xception65_cityscapes.tgz", "deeplabv3p_xception65_bn_cityscapes": "https://paddleseg.bj.bcebos.com/models/xception65_bn_cityscapes.tgz", - "unet_bn_coco": "https://paddleseg.bj.bcebos.com/models/unet_coco_v3.tgz", + "unet_bn_coco": + "https://paddleseg.bj.bcebos.com/models/unet_coco_v3.tgz", "icnet_bn_cityscapes": - "https://paddleseg.bj.bcebos.com/models/icnet6831.tar.gz" + "https://paddleseg.bj.bcebos.com/models/icnet_cityscapes.tar.gz" } if __name__ == "__main__": diff --git a/turtorial/finetune_deeplabv3plus.md b/turtorial/finetune_deeplabv3plus.md index 18e3e1428921e43dba92e39df53f362d5833206c..7bc3918d36d73d3a4368135f08b89417ec55b570 100644 --- a/turtorial/finetune_deeplabv3plus.md +++ b/turtorial/finetune_deeplabv3plus.md @@ -1,4 +1,4 @@ -# 关于本教程 +# DeepLabv3+模型训练教程 * 本教程旨在介绍如何通过使用PaddleSeg提供的 ***`DeeplabV3+/Xception65/BatchNorm`*** 
预训练模型在自定义数据集上进行训练。除了该配置之外,DeeplabV3+还支持以下不同[模型组合](#模型组合)的预训练模型,如果需要使用对应模型作为预训练模型,将下述内容中的Xception Backbone中的内容进行替换即可 @@ -47,7 +47,7 @@ python pretrained_model/download_model.py deeplabv3p_xception65_bn_cityscapes 数据集的配置和数据路径有关,在本教程中,数据存放在`dataset/mini_pet`中 -其他配置则根据数据集和机器环境的情况进行调节,最终我们保存一个如下内容的yaml配置文件,存放路径为`configs/test_pet.yaml` +其他配置则根据数据集和机器环境的情况进行调节,最终我们保存一个如下内容的yaml配置文件,存放路径为`configs/test_deeplabv3p_pet.yaml` ```yaml # 数据集配置 @@ -59,16 +59,12 @@ DATASET: VAL_FILE_LIST: "./dataset/mini_pet/file_list/val_list.txt" VIS_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt" - # 预训练模型配置 MODEL: MODEL_NAME: "deeplabv3p" DEFAULT_NORM_TYPE: "bn" DEEPLAB: BACKBONE: "xception_65" -TRAIN: - PRETRAINED_MODEL_DIR: "./pretrained_model/deeplabv3p_xception65_bn_pet/" - # 其他配置 TRAIN_CROP_SIZE: (512, 512) @@ -78,15 +74,16 @@ AUG: FIX_RESIZE_SIZE: (512, 512) BATCH_SIZE: 4 TRAIN: - MODEL_SAVE_DIR: "./finetune/deeplabv3p_xception65_bn_pet/" + PRETRAINED_MODEL_DIR: "./pretrained_model/deeplabv3p_xception65_bn_coco/" + MODEL_SAVE_DIR: "./saved_model/deeplabv3p_xception65_bn_pet/" SNAPSHOT_EPOCH: 10 TEST: - TEST_MODEL: "./finetune/deeplabv3p_xception65_bn_pet/final" + TEST_MODEL: "./saved_model/deeplabv3p_xception65_bn_pet/final" SOLVER: - NUM_EPOCHS: 500 + NUM_EPOCHS: 100 LR: 0.005 LR_POLICY: "poly" - OPTIMIZER: "adam" + OPTIMIZER: "sgd" ``` ## 四. 配置/数据校验 @@ -94,7 +91,7 @@ SOLVER: 在开始训练和评估之前,我们还需要对配置和数据进行一次校验,确保数据和配置是正确的。使用下述命令启动校验流程 ```shell -python pdseg/check.py --cfg ./configs/test_pet.yaml +python pdseg/check.py --cfg ./configs/test_deeplabv3p_pet.yaml ``` @@ -103,7 +100,7 @@ python pdseg/check.py --cfg ./configs/test_pet.yaml 校验通过后,使用下述命令启动训练 ```shell -python pdseg/train.py --use_gpu --cfg ./configs/test_pet.yaml +python pdseg/train.py --use_gpu --cfg ./configs/test_deeplabv3p_pet.yaml ``` ## 六. 
进行评估 @@ -111,7 +108,7 @@ python pdseg/train.py --use_gpu --cfg ./configs/test_pet.yaml 模型训练完成,使用下述命令启动评估 ```shell -python pdseg/eval.py --use_gpu --cfg ./configs/test_pet.yaml +python pdseg/eval.py --use_gpu --cfg ./configs/test_deeplabv3p_pet.yaml ``` ## 模型组合 diff --git a/turtorial/finetune_icnet.md b/turtorial/finetune_icnet.md index 231d163d53b25e870e2181ff4f9f500cb2e0a1b1..54aa9d43c395abe32b76cfcdf74759fe58753cd7 100644 --- a/turtorial/finetune_icnet.md +++ b/turtorial/finetune_icnet.md @@ -1,4 +1,4 @@ -# 关于本教程 +# ICNet模型训练教程 * 本教程旨在介绍如何通过使用PaddleSeg提供的 ***`ICNet`*** 预训练模型在自定义数据集上进行训练 @@ -65,9 +65,8 @@ MODEL: MODEL_NAME: "icnet" DEFAULT_NORM_TYPE: "bn" MULTI_LOSS_WEIGHT: "[1.0, 0.4, 0.16]" -TRAIN: - PRETRAINED_MODEL_DIR: "./pretrained_model/icnet_bn_cityscapes/" - + ICNET: + DEPTH_MULTIPLIER: 0.5 # 其他配置 TRAIN_CROP_SIZE: (512, 512) @@ -77,13 +76,14 @@ AUG: FIX_RESIZE_SIZE: (512, 512) BATCH_SIZE: 4 TRAIN: + PRETRAINED_MODEL_DIR: "./pretrained_model/icnet_bn_cityscapes/" MODEL_SAVE_DIR: "./saved_model/icnet_pet/" SNAPSHOT_EPOCH: 10 TEST: TEST_MODEL: "./saved_model/icnet_pet/final" SOLVER: NUM_EPOCHS: 100 - LR: 0.01 + LR: 0.005 LR_POLICY: "poly" OPTIMIZER: "sgd" ``` diff --git a/turtorial/finetune_unet.md b/turtorial/finetune_unet.md index e24535501d9c1ee25b255a8241418b85e78c4571..656541d842c3e89ca0f41f50e23bb9a2b120988b 100644 --- a/turtorial/finetune_unet.md +++ b/turtorial/finetune_unet.md @@ -1,6 +1,6 @@ -# 关于本教程 +# U-Net模型训练教程 -* 本教程旨在介绍如何通过使用PaddleSeg提供的 ***`UNet`*** 预训练模型在自定义数据集上进行训练 +* 本教程旨在介绍如何通过使用PaddleSeg提供的 ***`U-Net`*** 预训练模型在自定义数据集上进行训练 * 在阅读本教程前,请确保您已经了解过PaddleSeg的[快速入门](../README.md#快速入门)和[基础功能](../README.md#基础功能)等章节,以便对PaddleSeg有一定的了解 @@ -64,9 +64,6 @@ DATASET: MODEL: MODEL_NAME: "unet" DEFAULT_NORM_TYPE: "bn" -TRAIN: - PRETRAINED_MODEL_DIR: "./pretrained_model/unet_bn_coco/" - # 其他配置 TRAIN_CROP_SIZE: (512, 512) @@ -76,6 +73,7 @@ AUG: FIX_RESIZE_SIZE: (512, 512) BATCH_SIZE: 4 TRAIN: + PRETRAINED_MODEL_DIR: "./pretrained_model/unet_bn_coco/" 
MODEL_SAVE_DIR: "./saved_model/unet_pet/" SNAPSHOT_EPOCH: 10 TEST: