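diff --git a/README.md b/README.md
index af7ba67a8c2280a51580815762ed8d5e306c567f..ab654768f09d13a8e130bc0284d21912a0e6248d 100644
--- a/README.md
+++ b/README.md
@@ -6,10 +6,26 @@
 ## Introduction
-PaddleSeg is a semantic segmentation library built on [PaddlePaddle](https://www.paddlepaddle.org.cn). It covers three mainstream families of segmentation models: DeepLabv3+, U-Net, and ICNet. Through a unified configuration, it helps users complete the whole image segmentation workflow, from training to deployment, more conveniently.
+PaddleSeg is a semantic segmentation library built on [PaddlePaddle](https://www.paddlepaddle.org.cn). It covers mainstream segmentation models such as DeepLabv3+, U-Net, ICNet, PSPNet, and HRNet. Through a unified configuration, it helps users complete the whole image segmentation workflow, from training to deployment, more conveniently.
-PaddleSeg offers high performance, rich data augmentation, industrial-grade deployment, and full-workflow support:
+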
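+
+- [Features](#特点)
+- [Installation](#安装)
+- [Tutorials](#使用教程)
+  - [Quick Start](#快速入门)
+  - [Basic Features](#基础功能)
+  - [Inference and Deployment](#预测部署)
+  - [Advanced Features](#高级功能)
+- [Online Demos](#在线体验)
+- [FAQ](#FAQ)
+- [Feedback and Discussion](#交流与反馈)
+- [Changelog](#更新日志)
+- [Contributing](#贡献代码)
+
+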
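+## Features
 - **Rich data augmentation**
@@ -17,29 +33,42 @@ PaddleSeg offers high performance, rich data augmentation, industrial-grade deployment, full-workflow
 - **Modular design**
-Supports four mainstream segmentation networks: U-Net, DeepLabv3+, ICNet, and PSPNet. Combined with pretrained models and adjustable backbones, they meet different performance and accuracy requirements. Choosing a different loss function, such as Dice Loss or BCE Loss, can improve segmentation accuracy for small objects and imbalanced samples.
+Supports five mainstream segmentation networks: U-Net, DeepLabv3+, ICNet, PSPNet, and HRNet. Combined with pretrained models and adjustable backbones, they meet different performance and accuracy requirements. Choosing a different loss function, such as Dice Loss or BCE Loss, can improve segmentation accuracy for small objects and imbalanced samples.
 - **High performance**
-PaddleSeg supports training acceleration strategies such as multi-process IO, multi-GPU parallelism, and cross-GPU synchronized Batch Norm. Combined with the GPU memory optimization of the PaddlePaddle core framework, it greatly reduces the memory footprint of segmentation models and finishes training faster.
+PaddleSeg supports training acceleration strategies such as multi-process I/O, multi-GPU parallelism, and cross-GPU synchronized Batch Norm. Combined with the GPU memory optimization of the PaddlePaddle core framework, it greatly reduces the memory footprint of segmentation models, so developers can train image segmentation models at lower cost and with higher efficiency.
 - **Industrial-grade deployment**
-Built on [Paddle Serving](https://github.com/PaddlePaddle/Serving) and the PaddlePaddle high-performance inference engine, and combined with Baidu's open AI capabilities, it makes it easy to set up portrait segmentation and lane-line segmentation services.
+Provides industrial-grade deployment for both **server** and **mobile** targets. Backed by the PaddlePaddle high-performance inference engine and a high-performance image processing implementation, developers can easily deploy and integrate high-performance segmentation models. With [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite), lightweight, high-performance portrait segmentation models can be deployed on mobile or embedded devices.
-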
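+## Installation
-## Dependencies
+### 1. Install PaddlePaddle
+Version requirements
 * PaddlePaddle >= 1.6.1
 * Python 2.7 or 3.5+
-Install the Python package dependencies with the command below. Make sure it has been run at least once on this branch
-```shell
-$ pip install -r requirements.txt
+Because image segmentation models are computationally expensive, we recommend using PaddleSeg with the GPU version of PaddlePaddle.
+```
+pip install paddlepaddle-gpu
 ```
+For more details on installing PaddlePaddle, see the [PaddlePaddle installation guide](https://www.paddlepaddle.org.cn/install/doc/index).
-For other compatibility information, such as CUDA and cuDNN versions, see [PaddlePaddle installation](https://www.paddlepaddle.org.cn/install/doc/index)
+### 2. Download the PaddleSeg code
+
+```
+git clone https://github.com/PaddlePaddle/PaddleSeg
+```
+
+### 3. Install the PaddleSeg dependencies
+Install the Python package dependencies with the commands below. Make sure they have been run at least once on this branch:
+```
+cd PaddleSeg
+pip install -r requirements.txt
+```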
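The version requirements in the installation section (PaddlePaddle >= 1.6.1, Python 2.7 or 3.5+) can be checked before training starts. The following is a minimal sketch, not part of this patch: the helper names `parse_version`, `meets_minimum`, and `python_supported` are hypothetical, and the version string is passed in by the caller so that the block runs without PaddlePaddle installed.

```python
import sys


def parse_version(text):
    """Turn a dotted version string such as '1.6.1' into a tuple of ints.

    Parsing stops at the first component that does not start with a digit,
    so '1.6.1.post97' is read as (1, 6, 1).
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if ch not in "0123456789":
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum="1.6.1"):
    """Return True if `installed` is at least `minimum` (hypothetical helper)."""
    have = parse_version(installed)
    need = parse_version(minimum)
    # Pad the shorter tuple with zeros so '1.6' compares like '1.6.0'.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def python_supported(version_info=None):
    """Return True for Python 2.7 or Python 3.5 and later."""
    if version_info is None:
        version_info = sys.version_info
    major_minor = (version_info[0], version_info[1])
    return major_minor == (2, 7) or major_minor >= (3, 5)


if __name__ == "__main__":
    print("Python supported:", python_supported())
    print("1.6.1 accepted:", meets_minimum("1.6.1"))
    print("1.5.2 accepted:", meets_minimum("1.5.2"))
```

In a real environment the installed version would typically come from `paddle.__version__`, for example `meets_minimum(paddle.__version__)`. Development builds that report a placeholder version would need separate handling, which this sketch does not attempt.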
@@ -51,36 +80,49 @@ $ pip install -r requirements.txt
 ### Quick Start
-* [Installation](./docs/installation.md)
-* [Training/evaluation/visualization](./docs/usage.md)
+* [PaddleSeg quick start](./docs/usage.md)
 ### Basic Features
-* [Segmentation model overview](./docs/models.md)
-* [Pretrained model list](./docs/model_zoo.md)
-* [Preparing and annotating custom data](./docs/data_prepare.md)
+* [Annotating and preparing custom data](./docs/data_prepare.md)
+* [Script usage and configuration](./docs/config.md)
 * [Data and configuration validation](./docs/check.md)
-* [How to train DeepLabv3+](./turtorial/finetune_deeplabv3plus.md)
-* [How to train U-Net](./turtorial/finetune_unet.md)
-* [How to train ICNet](./turtorial/finetune_icnet.md)
-* [How to train PSPNet](./turtorial/finetune_pspnet.md)
-* [How to train HRNet](./turtorial/finetune_hrnet.md)
+* [Segmentation model overview](./docs/models.md)
+* [Pretrained model downloads](./docs/model_zoo.md)
+* [DeepLabv3+ model tutorial](./turtorial/finetune_deeplabv3plus.md)
+* [U-Net model tutorial](./turtorial/finetune_unet.md)
+* [ICNet model tutorial](./turtorial/finetune_icnet.md)
+* [PSPNet model tutorial](./turtorial/finetune_pspnet.md)
+* [HRNet model tutorial](./turtorial/finetune_hrnet.md)
 ### Inference and Deployment
 * [Model export](./docs/model_export.md)
-* [Inference with Python](./deploy/python/)
-* [Inference with C++](./deploy/cpp/)
-* [Mobile inference deployment](./deploy/lite/)
+* [Python inference](./deploy/python/)
+* [C++ inference](./deploy/cpp/)
+* [Paddle-Lite mobile inference deployment](./deploy/lite/)
 ### Advanced Features
 * [Data augmentation in PaddleSeg](./docs/data_aug.md)
-* [Choosing a loss in PaddleSeg](./docs/loss_select.md)
+* [How to handle class imbalance in binary classification](./docs/loss_select.md)
 * [Using the specialized vertical-domain models](./contrib)
 * [Multi-process training and mixed-precision training](./docs/multiple_gpus_train_and_mixed_precision_train.md)
+## Online Demos
+
+We provide online tutorials on the AI Studio platform. Feel free to try them:
+
+|Online tutorial|Link|
+|-|-|
+|Quick start|[Try it](https://aistudio.baidu.com/aistudio/projectdetail/100798)|
+|U-Net image segmentation|[Try it](https://aistudio.baidu.com/aistudio/projectDetail/102889)|
+|DeepLabv3+ image segmentation|[Try it](https://aistudio.baidu.com/aistudio/projectDetail/226703)|
+|Industrial quality inspection (part defect detection)|[Try it](https://aistudio.baidu.com/aistudio/projectdetail/184392)|
+|Portrait segmentation|[Try it](https://aistudio.baidu.com/aistudio/projectdetail/188833)|
+|PaddleSeg specialized vertical-domain models|[Try it](https://aistudio.baidu.com/aistudio/projectdetail/226710)|
+
 ## FAQ
@@ -104,25 +146,14 @@ python pdseg/train.py --cfg xxx.yaml TRAIN.RESUME_MODEL_DIR /PATH/TO/MODEL_CKPT/
 A: Reduce the batch size and use the Group Norm strategy. Note that when `DEFAULT_NORM_TYPE` is set to `bn` during training, the batch size must be >= 2 to keep the Batch Norm computation stable.
-
 #### Q: Error: ModuleNotFoundError: No module named 'paddle.fluid.contrib.mixed_precision'
 A: Upgrade PaddlePaddle to version 1.5.2 or later.
-## Online Demos
-
-PaddleSeg provides online tutorials on the AI Studio platform. Feel free to try them:
-
-|Tutorial|Link|
-|-|-|
-|U-Net pet segmentation|[Try it](https://aistudio.baidu.com/aistudio/projectDetail/102889)|
-|DeepLabv3+ image segmentation|[Try it](https://aistudio.baidu.com/aistudio/projectDetail/101696)|
-|PaddleSeg specialized vertical-domain models|[Try it](https://aistudio.baidu.com/aistudio/projectdetail/115541)|
-
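-## Feedback and Discussion
+## Feedback and Discussion
 * You are welcome to submit questions, reports, and suggestions through [GitHub Issues](https://github.com/PaddlePaddle/PaddleSeg/issues)
 * WeChat official account: 飞桨PaddlePaddle
 * QQ group: 796771754
@@ -131,25 +162,36 @@ PaddleSeg provides online tutorials on the AI Studio platform. Feel free to try them: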

   WeChat official account                Official technical discussion QQ group

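 ## Changelog
-
+* 2019.12.15
+
+  **`v0.3.0`**
+  * Added the HRNet segmentation network, with 8 [pretrained models](./docs/model_zoo.md) based on cityscapes and ImageNet
+  * Added support for training/evaluation/prediction with [pseudo-color labels](./docs/data_prepare.md#%E7%81%B0%E5%BA%A6%E6%A0%87%E6%B3%A8vs%E4%BC%AA%E5%BD%A9%E8%89%B2%E6%A0%87%E6%B3%A8), improving the training experience, and a script that converts grayscale annotation images to pseudo-color annotation images
+  * Added [learning-rate warmup](./docs/configs/solver_group.md#lr_warmup), which can be combined with different learning-rate decay strategies
+  * Added a GPU implementation of image normalization, further improving inference speed.
+  * Added a Python deployment solution for lower-cost industrial-grade deployment.
+  * Added a Paddle-Lite mobile deployment solution that supports deploying portrait segmentation models on mobile devices.
+  * Added [inference performance benchmarks](./deploy/python/docs/PaddleSeg_Infer_Benchmark.md) for the different segmentation models, giving developers a performance reference for model selection.
+
+
 * 2019.11.04
 **`v0.2.0`**
-  * Added the PSPNet segmentation network, with 4 [pretrained models](./docs/model_zoo.md) based on the COCO and cityscapes datasets
-  * Added Dice Loss, BCE Loss, and combined-loss configurations, supporting [model optimization](./docs/loss_select.md) for imbalanced-sample scenarios
-  * Added support for [FP16 mixed-precision training](./docs/multiple_gpus_train_and_mixed_precision_train.md) and dynamic loss scaling, speeding up training by 30%+ without loss of accuracy
-  * Added support for [PaddlePaddle multi-GPU, multi-process training](./docs/multiple_gpus_train_and_mixed_precision_train.md), speeding up multi-GPU training by 15%+
-  * Released a UNet-based [industrial meter dial segmentation model](./contrib#%E5%B7%A5%E4%B8%9A%E7%94%A8%E8%A1%A8%E5%88%86%E5%89%B2)
+  * Added the PSPNet segmentation network, with 4 [pretrained models](./docs/model_zoo.md) based on the COCO and cityscapes datasets.
+  * Added Dice Loss, BCE Loss, and combined-loss configurations, supporting [model optimization](./docs/loss_select.md) for imbalanced-sample scenarios.
+  * Added support for [FP16 mixed-precision training](./docs/multiple_gpus_train_and_mixed_precision_train.md) and dynamic loss scaling, speeding up training by 30%+ without loss of accuracy.
+  * Added support for [PaddlePaddle multi-GPU, multi-process training](./docs/multiple_gpus_train_and_mixed_precision_train.md), speeding up multi-GPU training by 15%+.
+  * Released a UNet-based [industrial meter dial segmentation model](./contrib#%E5%B7%A5%E4%B8%9A%E7%94%A8%E8%A1%A8%E5%88%86%E5%89%B2).
 * 2019.09.10
 **`v0.1.0`**
 * Initial release of the PaddleSeg segmentation library, including three families of segmentation models (DeepLabv3+, U-Net, and ICNet). DeepLabv3+ supports two adjustable backbones, Xception and MobileNet v2.
- * Released [ACE2P](./contrib/ACE2P), the inference model that won the CVPR19 LIP human parsing challenge
- * Released built-in [portrait segmentation](./contrib/HumanSeg/) and [lane-line segmentation](./contrib/RoadLine) inference models based on the DeepLabv3+ network
+ * Released [ACE2P](./contrib/ACE2P), the inference model that won the CVPR19 LIP human parsing challenge.
+ * Released built-in [portrait segmentation](./contrib/HumanSeg/) and [lane-line segmentation](./contrib/RoadLine) inference models based on the DeepLabv3+ network.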
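The v0.3.0 notes mention learning-rate warmup combined with a decay strategy, and the configs in this patch set `LR_POLICY: "poly"`. The sketch below shows one common way such a schedule is computed. It is an illustration under stated assumptions, not PaddleSeg's implementation: the function names, the `power=0.9` default, and the linear warmup shape are all assumed here.

```python
def poly_lr(base_lr, step, total_steps, power=0.9):
    """Polynomial decay: base_lr * (1 - step / total_steps) ** power.

    `power=0.9` is an assumed default for illustration.
    """
    step = min(step, total_steps)  # clamp so the base never goes negative
    return base_lr * (1.0 - float(step) / total_steps) ** power


def lr_at_step(base_lr, step, total_steps,
               warmup_steps=0, warmup_start_lr=0.0, power=0.9):
    """Linear warmup for the first `warmup_steps`, then poly decay.

    During warmup the rate rises linearly from `warmup_start_lr` towards
    `base_lr`. After warmup the poly schedule is evaluated at the global
    step, so the rate can drop slightly at the hand-over point.
    """
    if step < warmup_steps:
        fraction = float(step) / warmup_steps
        return warmup_start_lr + (base_lr - warmup_start_lr) * fraction
    return poly_lr(base_lr, step, total_steps, power)


if __name__ == "__main__":
    # Values chosen only for demonstration.
    for step in (0, 50, 100, 500, 1000):
        print(step, lr_at_step(0.01, step, total_steps=1000, warmup_steps=100))
```

Evaluating the decay at the global step, rather than restarting it after warmup, keeps the end of training unchanged whether or not warmup is enabled. Whether PaddleSeg makes the same choice should be checked against `./docs/configs/solver_group.md`.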
-## How to Contribute
+## Contributing
-We warmly welcome code contributions and usage suggestions for PaddleSeg.
+We warmly welcome code contributions and usage suggestions for PaddleSeg. If you can fix an issue or add a new feature, feel free to submit a pull request.
diff --git a/configs/pspnet.yaml b/configs/deeplabv3p_xception65_cityscapes.yaml
similarity index 64%
rename from configs/pspnet.yaml
rename to configs/deeplabv3p_xception65_cityscapes.yaml
index fdc960d6af81bcba30128e972f260da05b33b0e8..ec352f0f5856218fabd00b6b316d0184a45d90d1 100644
--- a/configs/pspnet.yaml
+++ b/configs/deeplabv3p_xception65_cityscapes.yaml
@@ -1,8 +1,8 @@
 EVAL_CROP_SIZE: (2049, 1025) # (width, height), for unpadding rangescaling and stepscaling
-TRAIN_CROP_SIZE: (713, 713) # (width, height), for unpadding rangescaling and stepscaling
+TRAIN_CROP_SIZE: (769, 769) # (width, height), for unpadding rangescaling and stepscaling
 AUG:
     AUG_METHOD: "stepscaling" # choice unpadding rangescaling and stepscaling
-    FIX_RESIZE_SIZE: (640, 640) # (width, height), for unpadding
+    FIX_RESIZE_SIZE: (2048, 1024) # (width, height), for unpadding
     INF_RESIZE_VALUE: 500 # for rangescaling
     MAX_RESIZE_VALUE: 600 # for rangescaling
     MIN_RESIZE_VALUE: 400 # for rangescaling
@@ -19,23 +19,25 @@ DATASET:
     TRAIN_FILE_LIST: "dataset/cityscapes/train.list"
     VAL_FILE_LIST: "dataset/cityscapes/val.list"
     IGNORE_INDEX: 255
+    SEPARATOR: " "
 FREEZE:
     MODEL_FILENAME: "model"
     PARAMS_FILENAME: "params"
 MODEL:
-    MODEL_NAME: "pspnet"
     DEFAULT_NORM_TYPE: "bn"
-    PSPNET:
-        DEPTH_MULTIPLIER: 1
-        LAYERS: 50
-TEST:
-    TEST_MODEL: "snapshots/cityscapes_pspnet50/final"
+    MODEL_NAME: "deeplabv3p"
+    DEEPLAB:
+        ASPP_WITH_SEP_CONV: True
+        DECODER_USE_SEP_CONV: True
 TRAIN:
-    MODEL_SAVE_DIR: "snapshots/cityscapes_pspnet50/"
-    PRETRAINED_MODEL_DIR: u"pretrained_model/pspnet50_bn_cityscapes/"
+    PRETRAINED_MODEL_DIR: u"pretrained_model/deeplabv3p_xception65_bn_coco"
+    MODEL_SAVE_DIR: "saved_model/deeplabv3p_xception65_bn_cityscapes"
     SNAPSHOT_EPOCH: 10
+    SYNC_BATCH_NORM: True
+TEST:
+    TEST_MODEL: "saved_model/deeplabv3p_xception65_bn_cityscapes/final"
 SOLVER:
-    LR: 0.001
+    LR: 0.01
     LR_POLICY: "poly"
     OPTIMIZER: "sgd"
-    NUM_EPOCHS: 700
+    NUM_EPOCHS: 100
diff --git a/configs/deeplabv3p_xception65_optic.yaml b/configs/deeplabv3p_xception65_optic.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..7ec86926db355c53439802a5d891d9e736a1bba0
--- /dev/null
+++ b/configs/deeplabv3p_xception65_optic.yaml
@@ -0,0 +1,34 @@
+# Dataset configuration
+DATASET:
+    DATA_DIR: "./dataset/optic_disc_seg/"
+    NUM_CLASSES: 2
+    TEST_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"
+    TRAIN_FILE_LIST: "./dataset/optic_disc_seg/train_list.txt"
+    VAL_FILE_LIST: "./dataset/optic_disc_seg/val_list.txt"
+    VIS_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"
+
+# Pretrained model configuration
+MODEL:
+    MODEL_NAME: "deeplabv3p"
+    DEFAULT_NORM_TYPE: "bn"
+    DEEPLAB:
+        BACKBONE: "xception_65"
+
+# Other configuration
+TRAIN_CROP_SIZE: (512, 512)
+EVAL_CROP_SIZE: (512, 512)
+AUG:
+    AUG_METHOD: "unpadding"
+    FIX_RESIZE_SIZE: (512, 512)
+BATCH_SIZE: 4
+TRAIN:
+    PRETRAINED_MODEL_DIR: "./pretrained_model/deeplabv3p_xception65_bn_coco/"
+    MODEL_SAVE_DIR: "./saved_model/deeplabv3p_xception65_bn_optic/"
+    SNAPSHOT_EPOCH: 5
+TEST:
+    TEST_MODEL: "./saved_model/deeplabv3p_xception65_bn_optic/final"
+SOLVER:
+    NUM_EPOCHS: 10
+    LR: 0.001
+    LR_POLICY: "poly"
+    OPTIMIZER: "adam"
\ No newline at end of file
diff --git a/configs/deeplabv3p_xception65_pet.yaml b/configs/deeplabv3p_xception65_pet.yaml
deleted file mode 100644
index 1b574497ea882c86c7e5785e16de976e5b33a50f..0000000000000000000000000000000000000000
--- a/configs/deeplabv3p_xception65_pet.yaml
+++ /dev/null
@@ -1,44 +0,0 @@
-TRAIN_CROP_SIZE: (512, 512) # (width, height), for unpadding rangescaling and stepscaling
-EVAL_CROP_SIZE: (512, 512) # (width, height), for unpadding rangescaling and stepscaling
-AUG:
-    AUG_METHOD: "unpadding" # choice unpadding rangescaling and stepscaling
-    FIX_RESIZE_SIZE: (512, 512) # (width, height), for unpadding
-
-    INF_RESIZE_VALUE: 500 # for rangescaling
-    MAX_RESIZE_VALUE: 600 # for rangescaling
-    MIN_RESIZE_VALUE: 400 # for
rangescaling
-
-    MAX_SCALE_FACTOR: 1.25 # for stepscaling
-    MIN_SCALE_FACTOR: 0.75 # for stepscaling
-    SCALE_STEP_SIZE: 0.25 # for stepscaling
-    MIRROR: True
-BATCH_SIZE: 4
-DATASET:
-    DATA_DIR: "./dataset/mini_pet/"
-    IMAGE_TYPE: "rgb" # choice rgb or rgba
-    NUM_CLASSES: 3
-    TEST_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt"
-    TRAIN_FILE_LIST: "./dataset/mini_pet/file_list/train_list.txt"
-    VAL_FILE_LIST: "./dataset/mini_pet/file_list/val_list.txt"
-    VIS_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt"
-    IGNORE_INDEX: 255
-    SEPARATOR: " "
-FREEZE:
-    MODEL_FILENAME: "__model__"
-    PARAMS_FILENAME: "__params__"
-MODEL:
-    MODEL_NAME: "deeplabv3p"
-    DEFAULT_NORM_TYPE: "bn"
-    DEEPLAB:
-        BACKBONE: "xception_65"
-TRAIN:
-    PRETRAINED_MODEL_DIR: "./pretrained_model/deeplabv3p_xception65_bn_coco/"
-    MODEL_SAVE_DIR: "./saved_model/deeplabv3p_xception65_bn_pet/"
-    SNAPSHOT_EPOCH: 10
-TEST:
-    TEST_MODEL: "./saved_model/deeplabv3p_xception65_bn_pet/final"
-SOLVER:
-    NUM_EPOCHS: 100
-    LR: 0.005
-    LR_POLICY: "poly"
-    OPTIMIZER: "sgd"
diff --git a/configs/hrnet_optic.yaml b/configs/hrnet_optic.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..7154bceeeaf99ec1962a4ac6e5ac79a9d78d3f4a
--- /dev/null
+++ b/configs/hrnet_optic.yaml
@@ -0,0 +1,39 @@
+# Dataset configuration
+DATASET:
+    DATA_DIR: "./dataset/optic_disc_seg/"
+    NUM_CLASSES: 2
+    TEST_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"
+    TRAIN_FILE_LIST: "./dataset/optic_disc_seg/train_list.txt"
+    VAL_FILE_LIST: "./dataset/optic_disc_seg/val_list.txt"
+    VIS_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"
+
+# Pretrained model configuration
+MODEL:
+    MODEL_NAME: "hrnet"
+    DEFAULT_NORM_TYPE: "bn"
+    HRNET:
+        STAGE2:
+            NUM_CHANNELS: [18, 36]
+        STAGE3:
+            NUM_CHANNELS: [18, 36, 72]
+        STAGE4:
+            NUM_CHANNELS: [18, 36, 72, 144]
+
+# Other configuration
+TRAIN_CROP_SIZE: (512, 512)
+EVAL_CROP_SIZE: (512, 512)
+AUG:
+    AUG_METHOD: "unpadding"
+    FIX_RESIZE_SIZE: (512, 512)
+BATCH_SIZE: 4
+TRAIN:
+    PRETRAINED_MODEL_DIR:
"./pretrained_model/hrnet_w18_bn_cityscapes/" + MODEL_SAVE_DIR: "./saved_model/hrnet_optic/" + SNAPSHOT_EPOCH: 5 +TEST: + TEST_MODEL: "./saved_model/hrnet_optic/final" +SOLVER: + NUM_EPOCHS: 10 + LR: 0.001 + LR_POLICY: "poly" + OPTIMIZER: "adam" diff --git a/configs/hrnet_w18_pet.yaml b/configs/hrnet_w18_pet.yaml deleted file mode 100644 index b1bfb9215e7f204444613fd9f6c78eba9c1c1432..0000000000000000000000000000000000000000 --- a/configs/hrnet_w18_pet.yaml +++ /dev/null @@ -1,49 +0,0 @@ -TRAIN_CROP_SIZE: (512, 512) # (width, height), for unpadding rangescaling and stepscaling -EVAL_CROP_SIZE: (512, 512) # (width, height), for unpadding rangescaling and stepscaling -AUG: - AUG_METHOD: "unpadding" # choice unpadding rangescaling and stepscaling - FIX_RESIZE_SIZE: (512, 512) # (width, height), for unpadding - - INF_RESIZE_VALUE: 500 # for rangescaling - MAX_RESIZE_VALUE: 600 # for rangescaling - MIN_RESIZE_VALUE: 400 # for rangescaling - - MAX_SCALE_FACTOR: 1.25 # for stepscaling - MIN_SCALE_FACTOR: 0.75 # for stepscaling - SCALE_STEP_SIZE: 0.25 # for stepscaling - MIRROR: True -BATCH_SIZE: 4 -DATASET: - DATA_DIR: "./dataset/mini_pet/" - IMAGE_TYPE: "rgb" # choice rgb or rgba - NUM_CLASSES: 3 - TEST_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt" - TRAIN_FILE_LIST: "./dataset/mini_pet/file_list/train_list.txt" - VAL_FILE_LIST: "./dataset/mini_pet/file_list/val_list.txt" - VIS_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt" - IGNORE_INDEX: 255 - SEPARATOR: " " -FREEZE: - MODEL_FILENAME: "__model__" - PARAMS_FILENAME: "__params__" -MODEL: - MODEL_NAME: "hrnet" - DEFAULT_NORM_TYPE: "bn" - HRNET: - STAGE2: - NUM_CHANNELS: [18, 36] - STAGE3: - NUM_CHANNELS: [18, 36, 72] - STAGE4: - NUM_CHANNELS: [18, 36, 72, 144] -TRAIN: - PRETRAINED_MODEL_DIR: "./pretrained_model/hrnet_w18_bn_cityscapes/" - MODEL_SAVE_DIR: "./saved_model/hrnet_w18_bn_pet/" - SNAPSHOT_EPOCH: 10 -TEST: - TEST_MODEL: "./saved_model/hrnet_w18_bn_pet/final" -SOLVER: - NUM_EPOCHS: 100 - LR: 
0.005
-    LR_POLICY: "poly"
-    OPTIMIZER: "sgd"
diff --git a/configs/icnet_optic.yaml b/configs/icnet_optic.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..0f2742e6cf3626ed82c1f379749c24ee6200fa3c
--- /dev/null
+++ b/configs/icnet_optic.yaml
@@ -0,0 +1,35 @@
+# Dataset configuration
+DATASET:
+    DATA_DIR: "./dataset/optic_disc_seg/"
+    NUM_CLASSES: 2
+    TEST_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"
+    TRAIN_FILE_LIST: "./dataset/optic_disc_seg/train_list.txt"
+    VAL_FILE_LIST: "./dataset/optic_disc_seg/val_list.txt"
+    VIS_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"
+
+# Pretrained model configuration
+MODEL:
+    MODEL_NAME: "icnet"
+    DEFAULT_NORM_TYPE: "bn"
+    MULTI_LOSS_WEIGHT: "[1.0, 0.4, 0.16]"
+    ICNET:
+        DEPTH_MULTIPLIER: 0.5
+
+# Other configuration
+TRAIN_CROP_SIZE: (512, 512)
+EVAL_CROP_SIZE: (512, 512)
+AUG:
+    AUG_METHOD: "unpadding"
+    FIX_RESIZE_SIZE: (512, 512)
+BATCH_SIZE: 4
+TRAIN:
+    PRETRAINED_MODEL_DIR: "./pretrained_model/icnet_bn_cityscapes/"
+    MODEL_SAVE_DIR: "./saved_model/icnet_optic/"
+    SNAPSHOT_EPOCH: 5
+TEST:
+    TEST_MODEL: "./saved_model/icnet_optic/final"
+SOLVER:
+    NUM_EPOCHS: 10
+    LR: 0.001
+    LR_POLICY: "poly"
+    OPTIMIZER: "adam"
diff --git a/configs/icnet_pet.yaml b/configs/icnet_pet.yaml
deleted file mode 100644
index 0398d131ca12aea7902ec7be6542650377201c25..0000000000000000000000000000000000000000
--- a/configs/icnet_pet.yaml
+++ /dev/null
@@ -1,45 +0,0 @@
-TRAIN_CROP_SIZE: (512, 512) # (width, height), for unpadding rangescaling and stepscaling
-EVAL_CROP_SIZE: (512, 512) # (width, height), for unpadding rangescaling and stepscaling
-AUG:
-    AUG_METHOD: "unpadding" # choice unpadding rangescaling and stepscaling
-    FIX_RESIZE_SIZE: (512, 512) # (width, height), for unpadding
-
-    INF_RESIZE_VALUE: 500 # for rangescaling
-    MAX_RESIZE_VALUE: 600 # for rangescaling
-    MIN_RESIZE_VALUE: 400 # for rangescaling
-
-    MAX_SCALE_FACTOR: 1.25 # for stepscaling
-    MIN_SCALE_FACTOR: 0.75 # for stepscaling
-    SCALE_STEP_SIZE: 0.25 # for stepscaling
-    MIRROR: True
-BATCH_SIZE: 4
-DATASET:
-    DATA_DIR: "./dataset/mini_pet/"
-    IMAGE_TYPE: "rgb" # choice rgb or rgba
-    NUM_CLASSES: 3
-    TEST_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt"
-    TRAIN_FILE_LIST: "./dataset/mini_pet/file_list/train_list.txt"
-    VAL_FILE_LIST: "./dataset/mini_pet/file_list/val_list.txt"
-    VIS_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt"
-    IGNORE_INDEX: 255
-    SEPARATOR: " "
-FREEZE:
-    MODEL_FILENAME: "__model__"
-    PARAMS_FILENAME: "__params__"
-MODEL:
-    MODEL_NAME: "icnet"
-    DEFAULT_NORM_TYPE: "bn"
-    MULTI_LOSS_WEIGHT: "[1.0, 0.4, 0.16]"
-    ICNET:
-        DEPTH_MULTIPLIER: 0.5
-TRAIN:
-    PRETRAINED_MODEL_DIR: "./pretrained_model/icnet_bn_cityscapes/"
-    MODEL_SAVE_DIR: "./saved_model/icnet_pet/"
-    SNAPSHOT_EPOCH: 10
-TEST:
-    TEST_MODEL: "./saved_model/icnet_pet/final"
-SOLVER:
-    NUM_EPOCHS: 100
-    LR: 0.005
-    LR_POLICY: "poly"
-    OPTIMIZER: "sgd"
diff --git a/configs/pspnet_optic.yaml b/configs/pspnet_optic.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..589e2b53cc640f124ad868f59a412e36fd7ced85
--- /dev/null
+++ b/configs/pspnet_optic.yaml
@@ -0,0 +1,35 @@
+# Dataset configuration
+DATASET:
+    DATA_DIR: "./dataset/optic_disc_seg/"
+    NUM_CLASSES: 2
+    TEST_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"
+    TRAIN_FILE_LIST: "./dataset/optic_disc_seg/train_list.txt"
+    VAL_FILE_LIST: "./dataset/optic_disc_seg/val_list.txt"
+    VIS_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"
+
+# Pretrained model configuration
+MODEL:
+    MODEL_NAME: "pspnet"
+    DEFAULT_NORM_TYPE: "bn"
+    PSPNET:
+        DEPTH_MULTIPLIER: 1
+        LAYERS: 50
+
+# Other configuration
+TRAIN_CROP_SIZE: (512, 512)
+EVAL_CROP_SIZE: (512, 512)
+AUG:
+    AUG_METHOD: "unpadding"
+    FIX_RESIZE_SIZE: (512, 512)
+BATCH_SIZE: 4
+TRAIN:
+    PRETRAINED_MODEL_DIR: "./pretrained_model/pspnet50_bn_cityscapes/"
+    MODEL_SAVE_DIR: "./saved_model/pspnet_optic/"
+    SNAPSHOT_EPOCH: 5
+TEST:
+    TEST_MODEL: "./saved_model/pspnet_optic/final"
+SOLVER:
+    NUM_EPOCHS: 10
+    LR: 0.001
+    LR_POLICY: "poly"
+    OPTIMIZER: "adam"
diff --git
a/configs/unet_optic.yaml b/configs/unet_optic.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..cd564817c7147c18ceaf360993042735019ec16d
--- /dev/null
+++ b/configs/unet_optic.yaml
@@ -0,0 +1,32 @@
+# Dataset configuration
+DATASET:
+    DATA_DIR: "./dataset/optic_disc_seg/"
+    NUM_CLASSES: 2
+    TEST_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"
+    TRAIN_FILE_LIST: "./dataset/optic_disc_seg/train_list.txt"
+    VAL_FILE_LIST: "./dataset/optic_disc_seg/val_list.txt"
+    VIS_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"
+
+# Pretrained model configuration
+MODEL:
+    MODEL_NAME: "unet"
+    DEFAULT_NORM_TYPE: "bn"
+
+# Other configuration
+TRAIN_CROP_SIZE: (512, 512)
+EVAL_CROP_SIZE: (512, 512)
+AUG:
+    AUG_METHOD: "unpadding"
+    FIX_RESIZE_SIZE: (512, 512)
+BATCH_SIZE: 4
+TRAIN:
+    PRETRAINED_MODEL_DIR: "./pretrained_model/unet_bn_coco/"
+    MODEL_SAVE_DIR: "./saved_model/unet_optic/"
+    SNAPSHOT_EPOCH: 5
+TEST:
+    TEST_MODEL: "./saved_model/unet_optic/final"
+SOLVER:
+    NUM_EPOCHS: 10
+    LR: 0.001
+    LR_POLICY: "poly"
+    OPTIMIZER: "adam"
diff --git a/configs/unet_pet.yaml b/configs/unet_pet.yaml
deleted file mode 100644
index a1781c5e8c4963ac269c4850f1012cc3d9ad8d15..0000000000000000000000000000000000000000
--- a/configs/unet_pet.yaml
+++ /dev/null
@@ -1,42 +0,0 @@
-TRAIN_CROP_SIZE: (512, 512) # (width, height), for unpadding rangescaling and stepscaling
-EVAL_CROP_SIZE: (512, 512) # (width, height), for unpadding rangescaling and stepscaling
-AUG:
-    AUG_METHOD: "unpadding" # choice unpadding rangescaling and stepscaling
-    FIX_RESIZE_SIZE: (512, 512) # (width, height), for unpadding
-
-    INF_RESIZE_VALUE: 500 # for rangescaling
-    MAX_RESIZE_VALUE: 600 # for rangescaling
-    MIN_RESIZE_VALUE: 400 # for rangescaling
-
-    MAX_SCALE_FACTOR: 1.25 # for stepscaling
-    MIN_SCALE_FACTOR: 0.75 # for stepscaling
-    SCALE_STEP_SIZE: 0.25 # for stepscaling
-    MIRROR: True
-BATCH_SIZE: 4
-DATASET:
-    DATA_DIR: "./dataset/mini_pet/"
-    IMAGE_TYPE: "rgb" # choice rgb or rgba
-    NUM_CLASSES: 3
-    TEST_FILE_LIST:
"./dataset/mini_pet/file_list/test_list.txt" - TRAIN_FILE_LIST: "./dataset/mini_pet/file_list/train_list.txt" - VAL_FILE_LIST: "./dataset/mini_pet/file_list/val_list.txt" - VIS_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt" - IGNORE_INDEX: 255 - SEPARATOR: " " -FREEZE: - MODEL_FILENAME: "__model__" - PARAMS_FILENAME: "__params__" -MODEL: - MODEL_NAME: "unet" - DEFAULT_NORM_TYPE: "bn" -TEST: - TEST_MODEL: "./saved_model/unet_pet/final/" -TRAIN: - MODEL_SAVE_DIR: "./saved_model/unet_pet/" - PRETRAINED_MODEL_DIR: "./pretrained_model/unet_bn_coco/" - SNAPSHOT_EPOCH: 10 -SOLVER: - NUM_EPOCHS: 100 - LR: 0.005 - LR_POLICY: "poly" - OPTIMIZER: "adam" diff --git a/contrib/ACE2P/config.py b/contrib/ACE2P/config.py index f6ad509581a84d04bc1b6badca83648505c19444..a1fad0ec1e6c50493a0a8dfab0c5301add410ad0 100644 --- a/contrib/ACE2P/config.py +++ b/contrib/ACE2P/config.py @@ -6,13 +6,13 @@ args = get_arguments() cfg = AttrDict() # 待预测图像所在路径 -cfg.data_dir = os.path.join(args.example , "data", "testing_images") +cfg.data_dir = os.path.join("data", "testing_images") # 待预测图像名称列表 -cfg.data_list_file = os.path.join(args.example , "data", "test_id.txt") +cfg.data_list_file = os.path.join("data", "test_id.txt") # 模型加载路径 -cfg.model_path = os.path.join(args.example , "ACE2P") +cfg.model_path = args.example # 预测结果保存路径 -cfg.vis_dir = os.path.join(args.example , "result") +cfg.vis_dir = "result" # 预测类别数 cfg.class_num = 20 diff --git a/contrib/ACE2P/download_ACE2P.py b/contrib/ACE2P/download_ACE2P.py new file mode 100644 index 0000000000000000000000000000000000000000..bb4d33771dbd879a2d77664d2e0e45ed33b9bcb2 --- /dev/null +++ b/contrib/ACE2P/download_ACE2P.py @@ -0,0 +1,31 @@ +# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License" +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import sys
+import os
+
+LOCAL_PATH = os.path.dirname(os.path.abspath(__file__))
+TEST_PATH = os.path.join(LOCAL_PATH, "..", "..", "test")
+sys.path.append(TEST_PATH)
+
+from test_utils import download_file_and_uncompress
+
+if __name__ == "__main__":
+    download_file_and_uncompress(
+        url='https://paddleseg.bj.bcebos.com/models/ACE2P.tgz',
+        savepath=LOCAL_PATH,
+        extrapath=LOCAL_PATH,
+        extraname='ACE2P')
+
+    print("Pretrained Model download success!")
diff --git a/contrib/ACE2P/imgs/117676_2149260.jpg b/contrib/ACE2P/imgs/117676_2149260.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..8314d8f8cc723b6f96785053bdcfe39d867755d5
Binary files /dev/null and b/contrib/ACE2P/imgs/117676_2149260.jpg differ
diff --git a/contrib/ACE2P/imgs/117676_2149260.png b/contrib/ACE2P/imgs/117676_2149260.png
new file mode 100644
index 0000000000000000000000000000000000000000..e3a9529644ead2013748431a3ade2f34264f19de
Binary files /dev/null and b/contrib/ACE2P/imgs/117676_2149260.png differ
diff --git a/contrib/ACE2P/infer.py b/contrib/ACE2P/infer.py
new file mode 100644
index 0000000000000000000000000000000000000000..16eddc1eab8628eec7e38d27b1f18df13dd480d7
--- /dev/null
+++ b/contrib/ACE2P/infer.py
@@ -0,0 +1,130 @@
+# -*- coding: utf-8 -*-
+import os
+import cv2
+import numpy as np
+from utils.util import get_arguments
+from utils.palette import get_palette
+from PIL import Image as PILImage
+import importlib
+
+args = get_arguments()
+config = importlib.import_module('config')
+cfg = getattr(config, 'cfg')
+
+# Paddle garbage-collection strategy flag. The ACE2P model is large, so enabling this is recommended when GPU memory is limited
+os.environ['FLAGS_eager_delete_tensor_gb']='0.0'
+
+import paddle.fluid as fluid
+
+# Dataset class used for prediction
+class TestDataSet():
+    def __init__(self):
+        self.data_dir = cfg.data_dir
+        self.data_list_file = cfg.data_list_file
+        self.data_list = self.get_data_list()
+        self.data_num = len(self.data_list)
+
+    def get_data_list(self):
+        # Build the list of image paths to predict
+        data_list = []
+        data_file_handler = open(self.data_list_file, 'r')
+        for line in data_file_handler:
+            img_name = line.strip()
+            name_prefix = img_name.split('.')[0]
+            if len(img_name.split('.')) == 1:
+                img_name = img_name + '.jpg'
+            img_path = os.path.join(self.data_dir, img_name)
+            data_list.append(img_path)
+        return data_list
+
+    def preprocess(self, img):
+        # Image preprocessing
+        if cfg.example == 'ACE2P':
+            reader = importlib.import_module('reader')
+            ACE2P_preprocess = getattr(reader, 'preprocess')
+            img = ACE2P_preprocess(img)
+        else:
+            img = cv2.resize(img, cfg.input_size).astype(np.float32)
+            img -= np.array(cfg.MEAN)
+            img /= np.array(cfg.STD)
+            img = img.transpose((2, 0, 1))
+            img = np.expand_dims(img, axis=0)
+        return img
+
+    def get_data(self, index):
+        # Load the image and its metadata
+        img_path = self.data_list[index]
+        img = cv2.imread(img_path, cv2.IMREAD_COLOR)
+        if img is None:
+            return img, img,img_path, None
+
+        img_name = img_path.split(os.sep)[-1]
+        name_prefix = img_name.replace('.'+img_name.split('.')[-1],'')
+        img_shape = img.shape[:2]
+        img_process = self.preprocess(img)
+
+        return img, img_process, name_prefix, img_shape
+
+
+def infer():
+    if not os.path.exists(cfg.vis_dir):
+        os.makedirs(cfg.vis_dir)
+    palette = get_palette(cfg.class_num)
+    # Display threshold for portrait segmentation results
+    thresh = 120
+
+    place = fluid.CUDAPlace(0) if cfg.use_gpu else fluid.CPUPlace()
+    exe = fluid.Executor(place)
+
+    # Load the inference model
+    test_prog, feed_name, fetch_list = fluid.io.load_inference_model(
+        dirname=cfg.model_path, executor=exe, params_filename='__params__')
+
+    # Load the prediction dataset
+    test_dataset = TestDataSet()
+    data_num = test_dataset.data_num
+
+    for idx in range(data_num):
+        # Fetch the data
+        ori_img,
image, im_name, im_shape = test_dataset.get_data(idx)
+        if image is None:
+            print(im_name, 'is None')
+            continue
+
+        # Predict
+        if cfg.example == 'ACE2P':
+            # The ACE2P model uses multi-scale prediction
+            reader = importlib.import_module('reader')
+            multi_scale_test = getattr(reader, 'multi_scale_test')
+            parsing, logits = multi_scale_test(exe, test_prog, feed_name, fetch_list, image, im_shape)
+        else:
+            # The HumanSeg and RoadLine models use single-scale prediction
+            result = exe.run(program=test_prog, feed={feed_name[0]: image}, fetch_list=fetch_list)
+            parsing = np.argmax(result[0][0], axis=0)
+            parsing = cv2.resize(parsing.astype(np.uint8), im_shape[::-1])
+
+        # Save the prediction result
+        result_path = os.path.join(cfg.vis_dir, im_name + '.png')
+        if cfg.example == 'HumanSeg':
+            logits = result[0][0][1]*255
+            logits = cv2.resize(logits, im_shape[::-1])
+            ret, logits = cv2.threshold(logits, thresh, 0, cv2.THRESH_TOZERO)
+            logits = 255 *(logits - thresh)/(255 - thresh)
+            # Add the segmentation result as the alpha channel
+            rgba = np.concatenate((ori_img, np.expand_dims(logits, axis=2)), axis=2)
+            cv2.imwrite(result_path, rgba)
+        else:
+            output_im = PILImage.fromarray(np.asarray(parsing, dtype=np.uint8))
+            output_im.putpalette(palette)
+            output_im.save(result_path)
+
+        if (idx + 1) % 100 == 0:
+            print('%d processed' % (idx + 1))
+
+    print('%d processed done' % (idx + 1))
+
+    return 0
+
+
+if __name__ == "__main__":
+    infer()
diff --git a/contrib/ACE2P/reader.py b/contrib/ACE2P/reader.py
index 0a266637f3cf425a1bc3d61ad7377ff30de55723..ef5cc370738daf8550adfc20c227f942f1dd300f 100644
--- a/contrib/ACE2P/reader.py
+++ b/contrib/ACE2P/reader.py
@@ -1,7 +1,7 @@
 # -*- coding: utf-8 -*-
 import numpy as np
 import paddle.fluid as fluid
-from ACE2P.config import cfg
+from config import cfg
 import cv2
 def get_affine_points(src_shape, dst_shape, rot_grad=0):
diff --git a/contrib/utils/__init__.py b/contrib/ACE2P/utils/__init__.py
similarity index 100%
rename from contrib/utils/__init__.py
rename to contrib/ACE2P/utils/__init__.py
diff --git a/contrib/utils/palette.py
b/contrib/ACE2P/utils/palette.py
similarity index 100%
rename from contrib/utils/palette.py
rename to contrib/ACE2P/utils/palette.py
diff --git a/contrib/utils/util.py b/contrib/ACE2P/utils/util.py
similarity index 100%
rename from contrib/utils/util.py
rename to contrib/ACE2P/utils/util.py
diff --git a/contrib/imgs/Human.jpg b/contrib/HumanSeg/imgs/Human.jpg
similarity index 100%
rename from contrib/imgs/Human.jpg
rename to contrib/HumanSeg/imgs/Human.jpg
diff --git a/contrib/imgs/HumanSeg.jpg b/contrib/HumanSeg/imgs/HumanSeg.jpg
similarity index 100%
rename from contrib/imgs/HumanSeg.jpg
rename to contrib/HumanSeg/imgs/HumanSeg.jpg
diff --git a/contrib/infer.py b/contrib/HumanSeg/infer.py
similarity index 98%
rename from contrib/infer.py
rename to contrib/HumanSeg/infer.py
index 8f939c8455cd3868120781a7a8d96ace0ff772b1..971476933c431977ce80c73e1d939fe079e1af19 100644
--- a/contrib/infer.py
+++ b/contrib/HumanSeg/infer.py
@@ -8,7 +8,7 @@ from PIL import Image as PILImage
 import importlib
 args = get_arguments()
-config = importlib.import_module(args.example+'.config')
+config = importlib.import_module('config')
 cfg = getattr(config, 'cfg')
 # Paddle garbage-collection strategy flag. The ACE2P model is large, so enabling this is recommended when GPU memory is limited
diff --git a/contrib/HumanSeg/utils/__init__.py b/contrib/HumanSeg/utils/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
diff --git a/contrib/HumanSeg/utils/palette.py b/contrib/HumanSeg/utils/palette.py
new file mode 100644
index 0000000000000000000000000000000000000000..2186203cbc2789f6eff70dfd92f724b4fe16cdb7
--- /dev/null
+++ b/contrib/HumanSeg/utils/palette.py
@@ -0,0 +1,38 @@
+##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+## Created by: RainbowSecret
+## Microsoft Research
+## yuyua@microsoft.com
+## Copyright (c) 2018
+##
+## This source code is licensed under the MIT-style license found in the
+## LICENSE file in the root directory of this source tree
+##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+import numpy as np
+import cv2
+
+
+def get_palette(num_cls):
+    """ Returns the color map for visualizing the segmentation mask.
+    Args:
+        num_cls: Number of classes
+    Returns:
+        The color map
+    """
+    n = num_cls
+    palette = [0] * (n * 3)
+    for j in range(0, n):
+        lab = j
+        palette[j * 3 + 0] = 0
+        palette[j * 3 + 1] = 0
+        palette[j * 3 + 2] = 0
+        i = 0
+        while lab:
+            palette[j * 3 + 0] |= (((lab >> 0) & 1) << (7 - i))
+            palette[j * 3 + 1] |= (((lab >> 1) & 1) << (7 - i))
+            palette[j * 3 + 2] |= (((lab >> 2) & 1) << (7 - i))
+            i += 1
+            lab >>= 3
+    return palette
diff --git a/contrib/HumanSeg/utils/util.py b/contrib/HumanSeg/utils/util.py
new file mode 100644
index 0000000000000000000000000000000000000000..7394870e7c94c1fb16169e314696b931eecdc3b2
--- /dev/null
+++ b/contrib/HumanSeg/utils/util.py
@@ -0,0 +1,47 @@
+from __future__ import division
+from __future__ import print_function
+from __future__ import unicode_literals
+import argparse
+import os
+
+def get_arguments():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--use_gpu",
+                        action="store_true",
+                        help="Use gpu or cpu to test.")
+    parser.add_argument('--example',
+                        type=str,
+                        help='RoadLine, HumanSeg or ACE2P')
+
+    return parser.parse_args()
+
+
+class AttrDict(dict):
+    def __init__(self, *args, **kwargs):
+        super(AttrDict, self).__init__(*args, **kwargs)
+
+    def __getattr__(self, name):
+        if name in self.__dict__:
+            return self.__dict__[name]
+        elif name in self:
+            return self[name]
+        else:
+            raise AttributeError(name)
+
+    def __setattr__(self, name, value):
+        if name in self.__dict__:
+            self.__dict__[name] = value
+        else:
+            self[name] = value
+
+def merge_cfg_from_args(args, cfg):
+    """Merge config keys, values in args into the global config."""
+    for k, v in vars(args).items():
+        d = cfg
+        try:
+            value
= eval(v) + except: + value = v + if value is not None: + cfg[k] = value + diff --git a/dataset/download_mini_mechanical_industry_meter.py b/contrib/MechanicalIndustryMeter/download_mini_mechanical_industry_meter.py similarity index 95% rename from dataset/download_mini_mechanical_industry_meter.py rename to contrib/MechanicalIndustryMeter/download_mini_mechanical_industry_meter.py index 3049df25219df7641990cedd409566779012a08d..f0409581ea9454417c545aa616b98ee8ece4dc53 100644 --- a/dataset/download_mini_mechanical_industry_meter.py +++ b/contrib/MechanicalIndustryMeter/download_mini_mechanical_industry_meter.py @@ -16,7 +16,7 @@ import sys import os LOCAL_PATH = os.path.dirname(os.path.abspath(__file__)) -TEST_PATH = os.path.join(LOCAL_PATH, "..", "test") +TEST_PATH = os.path.join(LOCAL_PATH, "..", "..", "test") sys.path.append(TEST_PATH) from test_utils import download_file_and_uncompress diff --git a/contrib/MechanicalIndustryMeter/download_unet_mechanical_industry_meter.py b/contrib/MechanicalIndustryMeter/download_unet_mechanical_industry_meter.py new file mode 100644 index 0000000000000000000000000000000000000000..aa55bf5e03b8dcf31e52043fd5dc87086c03c32f --- /dev/null +++ b/contrib/MechanicalIndustryMeter/download_unet_mechanical_industry_meter.py @@ -0,0 +1,30 @@ +# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License" +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import sys +import os + +LOCAL_PATH = os.path.dirname(os.path.abspath(__file__)) +TEST_PATH = os.path.join(LOCAL_PATH, "..", "..", "test") +sys.path.append(TEST_PATH) + +from test_utils import download_file_and_uncompress + +if __name__ == "__main__": + download_file_and_uncompress( + url='https://paddleseg.bj.bcebos.com/models/unet_mechanical_industry_meter.tar', + savepath=LOCAL_PATH, + extrapath=LOCAL_PATH) + + print("Pretrained Model download success!") diff --git a/contrib/imgs/1560143028.5_IMG_3091.JPG b/contrib/MechanicalIndustryMeter/imgs/1560143028.5_IMG_3091.JPG similarity index 100% rename from contrib/imgs/1560143028.5_IMG_3091.JPG rename to contrib/MechanicalIndustryMeter/imgs/1560143028.5_IMG_3091.JPG diff --git a/contrib/imgs/1560143028.5_IMG_3091.png b/contrib/MechanicalIndustryMeter/imgs/1560143028.5_IMG_3091.png similarity index 100% rename from contrib/imgs/1560143028.5_IMG_3091.png rename to contrib/MechanicalIndustryMeter/imgs/1560143028.5_IMG_3091.png diff --git a/configs/unet_mechanical_meter.yaml b/contrib/MechanicalIndustryMeter/unet_mechanical_meter.yaml similarity index 77% rename from configs/unet_mechanical_meter.yaml rename to contrib/MechanicalIndustryMeter/unet_mechanical_meter.yaml index e1bc3a1183d2b435c84ad7b16002a3f604cf85b0..45ac8616f7993e15d3d262dc0e27f67624957e2a 100644 --- a/configs/unet_mechanical_meter.yaml +++ b/contrib/MechanicalIndustryMeter/unet_mechanical_meter.yaml @@ -21,14 +21,14 @@ DATALOADER: BUF_SIZE: 256 NUM_WORKERS: 4 DATASET: - DATA_DIR: "./dataset/mini_mechanical_industry_meter_data/" + DATA_DIR: "./contrib/MechanicalIndustryMeter/mini_mechanical_industry_meter_data/" IMAGE_TYPE: "rgb" # choice rgb or rgba NUM_CLASSES: 5 - TEST_FILE_LIST: "./dataset/mini_mechanical_industry_meter_data/val_mini.txt" + TEST_FILE_LIST: "./contrib/MechanicalIndustryMeter/mini_mechanical_industry_meter_data/val_mini.txt" TEST_TOTAL_IMAGES: 8 - TRAIN_FILE_LIST: "./dataset/mini_mechanical_industry_meter_data/train_mini.txt" + 
TRAIN_FILE_LIST: "./contrib/MechanicalIndustryMeter/mini_mechanical_industry_meter_data/train_mini.txt"
     TRAIN_TOTAL_IMAGES: 64
-    VAL_FILE_LIST: "./dataset/mini_mechanical_industry_meter_data/val_mini.txt"
+    VAL_FILE_LIST: "./contrib/MechanicalIndustryMeter/mini_mechanical_industry_meter_data/val_mini.txt"
     VAL_TOTAL_IMAGES: 8
     SEPARATOR: "|"
     IGNORE_INDEX: 255
diff --git a/contrib/README.md b/contrib/README.md
index 0dbbb9b473500820a919badff3ea21b5b9123bef..225ffb7747dba76b3dc3db2b27c764868a5e4fc5 100644
--- a/contrib/README.md
+++ b/contrib/README.md
@@ -1,72 +1,139 @@
 # PaddleSeg Featured Vertical-Domain Segmentation Models
-Provides the latest featured segmentation models based on PaddlePaddle
+Provides the latest featured segmentation models based on PaddlePaddle:
-## Augmented Context Embedding with Edge Perceiving (ACE2P)
+- [Portrait Segmentation](#portrait-segmentation)
+- [Human Parsing](#human-parsing)
+- [Lane Line Segmentation](#lane-line-segmentation)
+- [Industrial Meter Segmentation](#industrial-meter-segmentation)
+- [Online Demo](#online-demo)
+## Portrait Segmentation
-### 1. Model Overview
-
-Champion model of the CVPR 19 Look into Person (LIP) single-person portrait segmentation competition; see [ACE2P](./ACE2P) for details
+**Note:** All commands in this section are run from the `contrib/HumanSeg` directory.
-### 2. Model Download
+```
+cd contrib/HumanSeg
+```
-Click the [link](https://paddleseg.bj.bcebos.com/models/ACE2P.tgz) to download, then extract under contrib/ACE2P: `tar -xzf ACE2P.tgz`
+### 1. Model Architecture
-### 3. Data Download
+DeepLabv3+ with an Xception65 backbone
-Go to the official LIP dataset site: http://47.100.21.47:9999/overview.php or click [Baidu_Drive](https://pan.baidu.com/s/1nvqmZBN#list/path=%2Fsharelink2787269280-523292635003760%2FLIP%2FLIP&parentPath=%2Fsharelink2787269280-523292635003760),
+### 2. Download the Model and Data
+
+Run the following command to download and extract the model and dataset:
-download Testing_images.zip and extract it into the contrib/ACE2P/data folder
+```
+python download_HumanSeg.py
+```
+Alternatively, click the [link](https://paddleseg.bj.bcebos.com/models/HumanSeg.tgz) to download manually and extract into the contrib/HumanSeg folder
-### 4. Run
-**NOTE:** Running this model requires about 2 GB of GPU memory
+### 3. Run
-Predict with GPU
+Predict with GPU:
 ```
-python -u infer.py --example ACE2P --use_gpu
+python -u infer.py --example HumanSeg --use_gpu
 ```
+
 Predict with CPU:
 ```
-python -u infer.py --example ACE2P
+python -u infer.py --example HumanSeg
 ```
-## Portrait Segmentation (HumanSeg)
+Prediction results are saved in the contrib/HumanSeg/HumanSeg/result directory.
-### 1. Model Architecture
+### 4. Example Prediction Results:
-DeepLabv3+ with an Xception65 backbone
+ Original image:
+
+ ![](HumanSeg/imgs/Human.jpg)
+
+ Prediction result:
+
+ ![](HumanSeg/imgs/HumanSeg.jpg)
-### 2. Download the Model and Data
-
-Click the [link](https://paddleseg.bj.bcebos.com/models/HumanSeg.tgz) to download, then extract into the contrib folder
-### 3. Run
+## Human Parsing
+
+![](ACE2P/imgs/result.jpg)
+
+Human parsing is a fine-grained semantic segmentation task that aims to identify the components of a human image (for example, body parts and clothing) at the pixel level. This section uses the champion model Augmented Context Embedding with Edge Perceiving (ACE2P) for prediction.
+
+
+**Note:** All commands in this section are run from the `contrib/ACE2P` directory.
-Predict with GPU:
 ```
-python -u infer.py --example HumanSeg --use_gpu
+cd contrib/ACE2P
 ```
+### 1. Model Overview
+
+Augmented Context Embedding with Edge Perceiving (ACE2P) learns the human parsing task end to end by fusing low-level features, global context, and edge details. A solution built on the ACE2P single-person human parsing network won first place in all three human parsing tasks of the third Look into Person (LIP) challenge at CVPR 2019. See [ACE2P](./ACE2P) for details.
+
+### 2. Model Download
+
+Run the following command to download and extract the ACE2P inference model:
-Predict with CPU:
 ```
-python -u infer.py --example HumanSeg
+python download_ACE2P.py
 ```
+Alternatively, click the [link](https://paddleseg.bj.bcebos.com/models/ACE2P.tgz) to download manually and extract under contrib/ACE2P.
-### 4. Example Prediction Results:
+### 3. Data Download
+
+There are 10,000 test images in total.
+Click [Baidu_Drive](https://pan.baidu.com/s/1nvqmZBN#list/path=%2Fsharelink2787269280-523292635003760%2FLIP%2FLIP&parentPath=%2Fsharelink2787269280-523292635003760)
+to download Testing_images.zip, or download it from the official LIP dataset site.
+After downloading, extract it into the contrib/ACE2P/data folder
+
+
+### 4. Run
+
+
+Predict with GPU
+```
+python -u infer.py --example ACE2P --use_gpu
+```
+
+Predict with CPU:
+```
+python -u infer.py --example ACE2P
+```
- Original image:![](imgs/Human.jpg)
+**NOTE:** Running this model requires about 2 GB of GPU memory. Because the dataset contains many images, prediction takes a while.
+
+#### 5. Example Prediction Results:
+
+ Original image:
+
+ ![](ACE2P/imgs/117676_2149260.jpg)
+
+ Prediction result:
- Prediction result:![](imgs/HumanSeg.jpg)
+ ![](ACE2P/imgs/117676_2149260.png)
+
+### Notes
-## Lane Line Segmentation (RoadLine)
+1. For detailed configuration such as data and model paths, see the config.py file under ACE2P/HumanSeg/RoadLine
+2. The ACE2P model needs 2 GB of GPU memory reserved; if GPU memory runs out, lower FLAGS_fraction_of_gpu_memory_to_use
+
+
+
+
+## Lane Line Segmentation
+
+**Note:** All commands in this section are run from the `contrib/RoadLine` directory.
+
+```
+cd contrib/RoadLine
+```
 ### 1. Model Architecture
@@ -75,7 +142,15 @@ Deeplabv3+ with a MobileNetv2 backbone
 ### 2. Download the Model and Data
-Click the [link](https://paddleseg.bj.bcebos.com/inference_model/RoadLine.tgz) to download, then extract into the contrib folder
+
+Run the following command to download and extract the model and dataset:
+
+```
+python download_RoadLine.py
+```
+
+Alternatively, click the [link](https://paddleseg.bj.bcebos.com/inference_model/RoadLine.tgz) to download manually and extract into the contrib/RoadLine folder
+
 ### 3. Run
@@ -92,45 +167,84 @@ python -u infer.py --example RoadLine --use_gpu
 python -u infer.py --example RoadLine
 ```
+Prediction results are saved in the contrib/RoadLine/RoadLine/result directory.
 #### 4. Example Prediction Results:
- Original image:![](imgs/RoadLine.jpg)
+ Original image:
+
+ ![](RoadLine/imgs/RoadLine.jpg)
- Prediction result:![](imgs/RoadLine.png)
+ Prediction result:
+
+ ![](RoadLine/imgs/RoadLine.png)
+
+
 ## Industrial Meter Segmentation
+
+**Note:** All commands in this section are run from the `PaddleSeg` directory.
+
 ### 1. Model Architecture
 unet
 ### 2. Data Preparation
-cd into the PaddleSeg/dataset folder and run download_mini_mechanical_industry_meter.py
+Run the following command to download and extract the dataset, which is saved in the contrib/MechanicalIndustryMeter folder:
+
+```
+python ./contrib/MechanicalIndustryMeter/download_mini_mechanical_industry_meter.py
+```
+
+
+### 3. Download the Pretrained Model
+
+```
+python ./pretrained_model/download_model.py unet_bn_coco
+```
+### 4. Training and Evaluation
-### 3. Training and Evaluation
+```
+export CUDA_VISIBLE_DEVICES=0
+python ./pdseg/train.py --log_steps 10 --cfg contrib/MechanicalIndustryMeter/unet_mechanical_meter.yaml --use_gpu --do_eval --use_mpio
+```
+
+### 5. Visualization
+We provide a trained model; run the following command to download it into the ./contrib/MechanicalIndustryMeter/ folder.
 ```
-CUDA_VISIBLE_DEVICES=0 python ./pdseg/train.py --log_steps 10 --cfg configs/unet_mechanical_meter.yaml --use_gpu --do_eval --use_mpio
+python ./contrib/MechanicalIndustryMeter/download_unet_mechanical_industry_meter.py
 ```
-### 4. Visualization
-We provide a trained model; click the [link](https://paddleseg.bj.bcebos.com/models/unet_mechanical_industry_meter.tar) to download it and place it under PaddleSeg/pretrained_model
+Use this model to visualize predictions:
+
 ```
-CUDA_VISIBLE_DEVICES=0 python ./pdseg/vis.py --cfg configs/unet_mechanical_meter.yaml --use_gpu --vis_dir vis_meter \
-TEST.TEST_MODEL "./pretrained_model/unet_gongyeyongbiao/"
+python ./pdseg/vis.py --cfg contrib/MechanicalIndustryMeter/unet_mechanical_meter.yaml --use_gpu --vis_dir vis_meter \
+TEST.TEST_MODEL "./contrib/MechanicalIndustryMeter/unet_mechanical_industry_meter/"
 ```
-Visualization results are saved in the vis_meter folder
+Visualization results are saved in the ./vis_meter folder.
-### 5. Example Visualization Results:
+### 6. Example Visualization Results:
- Original image:![](imgs/1560143028.5_IMG_3091.JPG)
+ Original image:
+
+ ![](MechanicalIndustryMeter/imgs/1560143028.5_IMG_3091.JPG)
- Prediction result:![](imgs/1560143028.5_IMG_3091.png)
+ Prediction result:
-# Notes
+ ![](MechanicalIndustryMeter/imgs/1560143028.5_IMG_3091.png)
+
+## Online Demo
+
+PaddleSeg provides hands-on online tutorials on the AI Studio platform; feel free to try them:
+
+|Tutorial|Link|
+|-|-|
+|Industrial quality inspection|[Try it](https://aistudio.baidu.com/aistudio/projectdetail/184392)|
+|Portrait segmentation|[Try it](https://aistudio.baidu.com/aistudio/projectdetail/188833)|
+|Featured vertical-domain models|[Try it](https://aistudio.baidu.com/aistudio/projectdetail/115541)|
+
+
-1. For detailed configuration such as data and model paths, see the config.py file under ACE2P/HumanSeg/RoadLine
-2. The ACE2P model needs 2 GB of GPU memory reserved; if GPU memory runs out, lower FLAGS_fraction_of_gpu_memory_to_use
diff --git a/contrib/RoadLine/download_RoadLine.py b/contrib/RoadLine/download_RoadLine.py
new file mode 100644
index 0000000000000000000000000000000000000000..86b631784edadcff6d575c59e67ee23a1775216d
--- /dev/null
+++ b/contrib/RoadLine/download_RoadLine.py
@@ -0,0 +1,31 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import sys
+import os
+
+LOCAL_PATH = os.path.dirname(os.path.abspath(__file__))
+TEST_PATH = os.path.join(LOCAL_PATH, "..", "..", "test")
+sys.path.append(TEST_PATH)
+
+from test_utils import download_file_and_uncompress
+
+if __name__ == "__main__":
+    download_file_and_uncompress(
+        url='https://paddleseg.bj.bcebos.com/inference_model/RoadLine.tgz',
+        savepath=LOCAL_PATH,
+        extrapath=LOCAL_PATH,
+        extraname='RoadLine')
+
+    print("Pretrained Model download success!")
diff --git a/contrib/imgs/RoadLine.jpg b/contrib/RoadLine/imgs/RoadLine.jpg
similarity index 100%
rename from contrib/imgs/RoadLine.jpg
rename to contrib/RoadLine/imgs/RoadLine.jpg
diff --git a/contrib/imgs/RoadLine.png b/contrib/RoadLine/imgs/RoadLine.png
similarity index 100%
rename from contrib/imgs/RoadLine.png
rename to contrib/RoadLine/imgs/RoadLine.png
diff --git a/contrib/RoadLine/infer.py b/contrib/RoadLine/infer.py
new file mode 100644
index 0000000000000000000000000000000000000000..971476933c431977ce80c73e1d939fe079e1af19
--- /dev/null
+++ b/contrib/RoadLine/infer.py
@@ -0,0 +1,130 @@
+# -*- coding: utf-8 -*-
+import os
+import cv2
+import numpy as np
+from utils.util import get_arguments
+from utils.palette import get_palette
+from PIL import Image as PILImage
+import importlib
+
+args = get_arguments()
+config = importlib.import_module('config')
+cfg = getattr(config, 'cfg')
+
+# Paddle garbage-collection strategy flag; the ACE2P model is large, so enabling this is recommended when GPU memory is insufficient
+os.environ['FLAGS_eager_delete_tensor_gb']='0.0'
+
+import paddle.fluid as fluid
+
+# Dataset class for prediction
+class TestDataSet():
+    def __init__(self):
+        self.data_dir = cfg.data_dir
+        self.data_list_file = cfg.data_list_file
+        self.data_list = self.get_data_list()
+        self.data_num = len(self.data_list)
+
+    def get_data_list(self):
+        # Build the list of image paths to predict
+        data_list = []
+        data_file_handler = open(self.data_list_file, 'r')
+        for line in data_file_handler:
+            img_name = line.strip()
+            name_prefix = img_name.split('.')[0]
+            if len(img_name.split('.')) == 1:
+                img_name = img_name + '.jpg'
+            img_path = os.path.join(self.data_dir, img_name)
+            data_list.append(img_path)
+        return data_list
+
+    def preprocess(self, img):
+        # Image preprocessing
+        if cfg.example == 'ACE2P':
+            reader = importlib.import_module(args.example+'.reader')
+            ACE2P_preprocess = getattr(reader, 'preprocess')
+            img = ACE2P_preprocess(img)
+        else:
+            img = cv2.resize(img, cfg.input_size).astype(np.float32)
+            img -= np.array(cfg.MEAN)
+            img /= np.array(cfg.STD)
+            img = img.transpose((2, 0, 1))
+            img = np.expand_dims(img, axis=0)
+        return img
+
+    def get_data(self, index):
+        # Load the image and its metadata
+        img_path = self.data_list[index]
+        img = cv2.imread(img_path, cv2.IMREAD_COLOR)
+        if img is None:
+            return img, img,img_path, None
+
+        img_name = img_path.split(os.sep)[-1]
+        name_prefix = img_name.replace('.'+img_name.split('.')[-1],'')
+        img_shape = img.shape[:2]
+        img_process = self.preprocess(img)
+
+        return img, img_process, name_prefix, img_shape
+
+
+def infer():
+    if not os.path.exists(cfg.vis_dir):
+        os.makedirs(cfg.vis_dir)
+    palette = get_palette(cfg.class_num)
+    # Display threshold for portrait segmentation results
+    thresh = 120
+
+    place = fluid.CUDAPlace(0) if cfg.use_gpu else fluid.CPUPlace()
+    exe = fluid.Executor(place)
+
+    # Load the inference model
+    test_prog, feed_name, fetch_list = fluid.io.load_inference_model(
+        dirname=cfg.model_path, executor=exe, params_filename='__params__')
+
+    # Load the prediction dataset
+    test_dataset = TestDataSet()
+    data_num = test_dataset.data_num
+
+    for idx in range(data_num):
+        # Fetch data
+        ori_img, image, im_name, im_shape = test_dataset.get_data(idx)
+        if image is None:
+            print(im_name, 'is None')
+            continue
+
+        # Predict
+        if cfg.example == 'ACE2P':
+            # The ACE2P model uses multi-scale prediction
+            reader = importlib.import_module(args.example+'.reader')
+            multi_scale_test = getattr(reader, 'multi_scale_test')
+            parsing, logits = multi_scale_test(exe, test_prog, feed_name, fetch_list, image, im_shape)
+        else:
+            # The HumanSeg and RoadLine models use single-scale prediction
+            result = exe.run(program=test_prog, feed={feed_name[0]: image}, fetch_list=fetch_list)
+            parsing = np.argmax(result[0][0], axis=0)
+            parsing = cv2.resize(parsing.astype(np.uint8), im_shape[::-1])
+
+        # Save the prediction result
+        result_path = os.path.join(cfg.vis_dir, im_name + '.png')
+        if cfg.example == 'HumanSeg':
+            logits = result[0][0][1]*255
+            logits = cv2.resize(logits, im_shape[::-1])
+            ret, logits = cv2.threshold(logits, thresh, 0, cv2.THRESH_TOZERO)
+            logits = 255 *(logits - thresh)/(255 - thresh)
+            # Add the segmentation result as the alpha channel
+            rgba = np.concatenate((ori_img, np.expand_dims(logits, axis=2)), axis=2)
+            cv2.imwrite(result_path, rgba)
+        else:
+            output_im = PILImage.fromarray(np.asarray(parsing, dtype=np.uint8))
+            output_im.putpalette(palette)
+            output_im.save(result_path)
+
+        if (idx + 1) % 100 == 0:
+            print('%d processed' % (idx + 1))
+
+    print('%d processed done' % (idx + 1))
+
+    return 0
+
+
+if __name__ == "__main__":
+    infer()
diff --git a/contrib/RoadLine/utils/__init__.py b/contrib/RoadLine/utils/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
diff --git a/contrib/RoadLine/utils/palette.py b/contrib/RoadLine/utils/palette.py
new file mode 100644
index 0000000000000000000000000000000000000000..2186203cbc2789f6eff70dfd92f724b4fe16cdb7
--- /dev/null
+++ b/contrib/RoadLine/utils/palette.py
@@ -0,0 +1,38 @@
+##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+## Created by: RainbowSecret
+## Microsoft Research
+## yuyua@microsoft.com
+## Copyright (c) 2018
+##
+## This source code is licensed under the MIT-style license found in the
+## LICENSE file in the root directory of this source tree
+##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+import numpy as np
+import cv2
+
+
+def get_palette(num_cls):
+    """ Returns the color map for visualizing the segmentation mask.
+    Args:
+        num_cls: Number of classes
+    Returns:
+        The color map
+    """
+    n = num_cls
+    palette = [0] * (n * 3)
+    for j in range(0, n):
+        lab = j
+        palette[j * 3 + 0] = 0
+        palette[j * 3 + 1] = 0
+        palette[j * 3 + 2] = 0
+        i = 0
+        while lab:
+            palette[j * 3 + 0] |= (((lab >> 0) & 1) << (7 - i))
+            palette[j * 3 + 1] |= (((lab >> 1) & 1) << (7 - i))
+            palette[j * 3 + 2] |= (((lab >> 2) & 1) << (7 - i))
+            i += 1
+            lab >>= 3
+    return palette
diff --git a/contrib/RoadLine/utils/util.py b/contrib/RoadLine/utils/util.py
new file mode 100644
index 0000000000000000000000000000000000000000..7394870e7c94c1fb16169e314696b931eecdc3b2
--- /dev/null
+++ b/contrib/RoadLine/utils/util.py
@@ -0,0 +1,47 @@
+from __future__ import division
+from __future__ import print_function
+from __future__ import unicode_literals
+import argparse
+import os
+
+def get_arguments():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--use_gpu",
+                        action="store_true",
+                        help="Use gpu or cpu to test.")
+    parser.add_argument('--example',
+                        type=str,
+                        help='RoadLine, HumanSeg or ACE2P')
+
+    return parser.parse_args()
+
+
+class AttrDict(dict):
+    def __init__(self, *args, **kwargs):
+        super(AttrDict, self).__init__(*args, **kwargs)
+
+    def __getattr__(self, name):
+        if name in self.__dict__:
+            return self.__dict__[name]
+        elif name in self:
+            return self[name]
+        else:
+            raise AttributeError(name)
+
+    def __setattr__(self, name, value):
+        if name in self.__dict__:
+            self.__dict__[name] = value
+        else:
+            self[name] = value
+
+def merge_cfg_from_args(args, cfg):
+    """Merge config keys, values in args into the global config."""
+    for k, v in vars(args).items():
+        d = cfg
+        try:
+            value = eval(v)
+        except:
+            value = v
+        if value is not None:
+            cfg[k] = value
+
diff --git a/dataset/download_optic.py b/dataset/download_optic.py
new file mode 100644
index 0000000000000000000000000000000000000000..2fd66be11ef2e0bca483ecf6d7bcec2b01bebd7a
--- /dev/null
+++ b/dataset/download_optic.py
@@ -0,0 +1,33 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import sys
+import os
+
+LOCAL_PATH = os.path.dirname(os.path.abspath(__file__))
+TEST_PATH = os.path.join(LOCAL_PATH, "..", "test")
+sys.path.append(TEST_PATH)
+
+from test_utils import download_file_and_uncompress
+
+
+def download_pet_dataset(savepath, extrapath):
+    url = "https://paddleseg.bj.bcebos.com/dataset/optic_disc_seg.zip"
+    download_file_and_uncompress(
+        url=url, savepath=savepath, extrapath=extrapath)
+
+
+if __name__ == "__main__":
+    download_pet_dataset(LOCAL_PATH, LOCAL_PATH)
+    print("Dataset download finish!")
diff --git a/deploy/lite/README.md b/deploy/lite/README.md
index f4ec50be28e75d79ce2f61453737930bccf52cf4..a46dc2077df3e061e18e8ebf9e4b21ca4d0fbbaf 100644
--- a/deploy/lite/README.md
+++ b/deploy/lite/README.md
@@ -10,11 +10,10 @@
 * An Android phone or development board;
 ### 2.2 Installation
-* git clone https://github.com/PaddlePaddle/PaddleSeg.git ;
-* Open Android Studio, click "Open an existing Android Studio project" in the "Welcome to Android Studio" window, navigate to the "/PaddleSeg/lite/humanseg-android-demo/" directory in the path selection dialog, and click the "Open" button in the lower-right corner to import the project
+* git clone https://github.com/PaddlePaddle/PaddleSeg.git ;
+* Open Android Studio, click "Open an existing Android Studio project" in the "Welcome to Android Studio" window, navigate to the "/PaddleSeg/lite/humanseg_android_demo/" directory in the path selection dialog, and click the "Open" button in the lower-right corner to import the project; the models and Lite inference library required by the demo are downloaded while the project builds;
 * Connect the Android phone or development board via USB;
 * After the project loads, click Run->Run 'App' in the menu bar, select the connected Android device in the "Select Deployment Target" window, and click "OK";
-* The demo's main screen appears on the phone; select the "Image Segmentation" icon to enter the portrait segmentation sample app;
 * The portrait segmentation demo loads a portrait image by default and shows the CPU prediction result below the image;
 * In the portrait segmentation demo, you can also load test images from the gallery or camera using the "Gallery" and "Take Photo" buttons at the top;
@@ -48,7 +47,7 @@ Paddle-Lite currently supports building on Docker, Linux, and Mac OS development environments; we recommend
 * PaddlePredictor.jar;
 * arm64-v8a/libpaddle_lite_jni.so;
-* armeabi-v7a/libpaddle_lite_jni.so;
+* armeabi-v7a/libpaddle_lite_jni.so;
 The two methods are described below:
diff --git a/deploy/lite/humanseg-android-demo/.gitignore b/deploy/lite/human_segmentation_demo/.gitignore
similarity index 100%
rename from deploy/lite/humanseg-android-demo/.gitignore
rename to deploy/lite/human_segmentation_demo/.gitignore
diff --git a/deploy/lite/humanseg-android-demo/app/.gitignore b/deploy/lite/human_segmentation_demo/app/.gitignore
similarity index 100%
rename from deploy/lite/humanseg-android-demo/app/.gitignore
rename to deploy/lite/human_segmentation_demo/app/.gitignore
diff --git a/deploy/lite/human_segmentation_demo/app/build.gradle b/deploy/lite/human_segmentation_demo/app/build.gradle
new file mode 100644
index 0000000000000000000000000000000000000000..88d5a19ece9d3b1c14069a6fca3ceb70c2e3e7e6
--- /dev/null
+++ b/deploy/lite/human_segmentation_demo/app/build.gradle
@@ -0,0 +1,119 @@
+import java.security.MessageDigest
+
+apply plugin: 'com.android.application'
+
+android {
+    compileSdkVersion 28
+    defaultConfig {
+        applicationId "com.baidu.paddle.lite.demo.human_segmentation"
+        minSdkVersion 15
+        targetSdkVersion 28
+        versionCode 1
+        versionName "1.0"
+        testInstrumentationRunner "android.support.test.runner.AndroidJUnitRunner"
+    }
+    buildTypes {
+        release {
+            minifyEnabled false
+            proguardFiles
getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro' + } + } +} + +dependencies { + implementation fileTree(include: ['*.jar'], dir: 'libs') + implementation 'com.android.support:appcompat-v7:28.0.0' + implementation 'com.android.support.constraint:constraint-layout:1.1.3' + implementation 'com.android.support:design:28.0.0' + testImplementation 'junit:junit:4.12' + androidTestImplementation 'com.android.support.test:runner:1.0.2' + androidTestImplementation 'com.android.support.test.espresso:espresso-core:3.0.2' + implementation files('libs/PaddlePredictor.jar') +} + +def paddleLiteLibs = 'https://paddlelite-demo.bj.bcebos.com/libs/android/paddle_lite_libs_v2_1_0_bug_fixed.tar.gz' +task downloadAndExtractPaddleLiteLibs(type: DefaultTask) { + doFirst { + println "Downloading and extracting Paddle Lite libs" + } + doLast { + // Prepare cache folder for libs + if (!file("cache").exists()) { + mkdir "cache" + } + // Generate cache name for libs + MessageDigest messageDigest = MessageDigest.getInstance('MD5') + messageDigest.update(paddleLiteLibs.bytes) + String cacheName = new BigInteger(1, messageDigest.digest()).toString(32) + // Download libs + if (!file("cache/${cacheName}.tar.gz").exists()) { + ant.get(src: paddleLiteLibs, dest: file("cache/${cacheName}.tar.gz")) + } + // Unpack libs + copy { + from tarTree("cache/${cacheName}.tar.gz") + into "cache/${cacheName}" + } + // Copy PaddlePredictor.jar + if (!file("libs/PaddlePredictor.jar").exists()) { + copy { + from "cache/${cacheName}/java/PaddlePredictor.jar" + into "libs" + } + } + // Copy libpaddle_lite_jni.so for armeabi-v7a and arm64-v8a + if (!file("src/main/jniLibs/armeabi-v7a/libpaddle_lite_jni.so").exists()) { + copy { + from "cache/${cacheName}/java/libs/armeabi-v7a/" + into "src/main/jniLibs/armeabi-v7a" + } + } + if (!file("src/main/jniLibs/arm64-v8a/libpaddle_lite_jni.so").exists()) { + copy { + from "cache/${cacheName}/java/libs/arm64-v8a/" + into "src/main/jniLibs/arm64-v8a" 
+ } + } + } +} +preBuild.dependsOn downloadAndExtractPaddleLiteLibs + +def paddleLiteModels = [ + [ + 'src' : 'https://paddlelite-demo.bj.bcebos.com/models/deeplab_mobilenet_fp32_for_cpu_v2_1_0.tar.gz', + 'dest' : 'src/main/assets/image_segmentation/models/deeplab_mobilenet_for_cpu' + ], +] +task downloadAndExtractPaddleLiteModels(type: DefaultTask) { + doFirst { + println "Downloading and extracting Paddle Lite models" + } + doLast { + // Prepare cache folder for models + if (!file("cache").exists()) { + mkdir "cache" + } + paddleLiteModels.eachWithIndex { model, index -> + MessageDigest messageDigest = MessageDigest.getInstance('MD5') + messageDigest.update(model.src.bytes) + String cacheName = new BigInteger(1, messageDigest.digest()).toString(32) + // Download model file + if (!file("cache/${cacheName}.tar.gz").exists()) { + ant.get(src: model.src, dest: file("cache/${cacheName}.tar.gz")) + } + // Unpack model file + copy { + from tarTree("cache/${cacheName}.tar.gz") + into "cache/${cacheName}" + } + // Copy model file + if (!file("${model.dest}/__model__.nb").exists() || !file("${model.dest}/param.nb").exists()) { + copy { + from "cache/${cacheName}" + into "${model.dest}" + } + } + } + } +} +preBuild.dependsOn downloadAndExtractPaddleLiteModels diff --git a/deploy/lite/humanseg-android-demo/app/gradle/wrapper/gradle-wrapper.jar b/deploy/lite/human_segmentation_demo/app/gradle/wrapper/gradle-wrapper.jar similarity index 100% rename from deploy/lite/humanseg-android-demo/app/gradle/wrapper/gradle-wrapper.jar rename to deploy/lite/human_segmentation_demo/app/gradle/wrapper/gradle-wrapper.jar diff --git a/deploy/lite/humanseg-android-demo/app/gradle/wrapper/gradle-wrapper.properties b/deploy/lite/human_segmentation_demo/app/gradle/wrapper/gradle-wrapper.properties similarity index 100% rename from deploy/lite/humanseg-android-demo/app/gradle/wrapper/gradle-wrapper.properties rename to 
deploy/lite/human_segmentation_demo/app/gradle/wrapper/gradle-wrapper.properties diff --git a/deploy/lite/humanseg-android-demo/app/gradlew b/deploy/lite/human_segmentation_demo/app/gradlew similarity index 100% rename from deploy/lite/humanseg-android-demo/app/gradlew rename to deploy/lite/human_segmentation_demo/app/gradlew diff --git a/deploy/lite/humanseg-android-demo/app/gradlew.bat b/deploy/lite/human_segmentation_demo/app/gradlew.bat similarity index 100% rename from deploy/lite/humanseg-android-demo/app/gradlew.bat rename to deploy/lite/human_segmentation_demo/app/gradlew.bat diff --git a/deploy/lite/human_segmentation_demo/app/local.properties b/deploy/lite/human_segmentation_demo/app/local.properties new file mode 100644 index 0000000000000000000000000000000000000000..f3bc0d0f5319e7573b7cba2cd997b979060f3eec --- /dev/null +++ b/deploy/lite/human_segmentation_demo/app/local.properties @@ -0,0 +1,8 @@ +## This file must *NOT* be checked into Version Control Systems, +# as it contains information specific to your local configuration. +# +# Location of the SDK. This is only used by Gradle. +# For customization when using a Version Control System, please read the +# header note. 
+#Mon Nov 25 17:01:52 CST 2019 +sdk.dir=/Users/chenlingchi/Library/Android/sdk diff --git a/deploy/lite/humanseg-android-demo/app/proguard-rules.pro b/deploy/lite/human_segmentation_demo/app/proguard-rules.pro similarity index 100% rename from deploy/lite/humanseg-android-demo/app/proguard-rules.pro rename to deploy/lite/human_segmentation_demo/app/proguard-rules.pro diff --git a/deploy/lite/humanseg-android-demo/app/src/androidTest/java/com/baidu/paddle/lite/demo/ExampleInstrumentedTest.java b/deploy/lite/human_segmentation_demo/app/src/androidTest/java/com/baidu/paddle/lite/demo/ExampleInstrumentedTest.java similarity index 100% rename from deploy/lite/humanseg-android-demo/app/src/androidTest/java/com/baidu/paddle/lite/demo/ExampleInstrumentedTest.java rename to deploy/lite/human_segmentation_demo/app/src/androidTest/java/com/baidu/paddle/lite/demo/ExampleInstrumentedTest.java diff --git a/deploy/lite/humanseg-android-demo/app/src/main/AndroidManifest.xml b/deploy/lite/human_segmentation_demo/app/src/main/AndroidManifest.xml similarity index 79% rename from deploy/lite/humanseg-android-demo/app/src/main/AndroidManifest.xml rename to deploy/lite/human_segmentation_demo/app/src/main/AndroidManifest.xml index 67e06269f4b2764034d4d7c400f1c93c1504fe6a..39789e0370b04e67a6e80e1b21e79ef058500370 100644 --- a/deploy/lite/humanseg-android-demo/app/src/main/AndroidManifest.xml +++ b/deploy/lite/human_segmentation_demo/app/src/main/AndroidManifest.xml @@ -1,6 +1,6 @@ + package="com.baidu.paddle.lite.demo.segmentation"> @@ -17,15 +17,11 @@ - - diff --git a/deploy/lite/humanseg-android-demo/app/src/main/assets/image_segmentation/images/human.jpg b/deploy/lite/human_segmentation_demo/app/src/main/assets/image_segmentation/images/human.jpg similarity index 100% rename from deploy/lite/humanseg-android-demo/app/src/main/assets/image_segmentation/images/human.jpg rename to deploy/lite/human_segmentation_demo/app/src/main/assets/image_segmentation/images/human.jpg diff --git 
a/deploy/lite/humanseg-android-demo/app/src/main/assets/image_segmentation/labels/label_list b/deploy/lite/human_segmentation_demo/app/src/main/assets/image_segmentation/labels/label_list similarity index 100% rename from deploy/lite/humanseg-android-demo/app/src/main/assets/image_segmentation/labels/label_list rename to deploy/lite/human_segmentation_demo/app/src/main/assets/image_segmentation/labels/label_list diff --git a/deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/AppCompatPreferenceActivity.java b/deploy/lite/human_segmentation_demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/AppCompatPreferenceActivity.java similarity index 98% rename from deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/AppCompatPreferenceActivity.java rename to deploy/lite/human_segmentation_demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/AppCompatPreferenceActivity.java index 960f34257d58b9b19d3e9701f92659575be8a701..314c045620e5edc8911196cbe8ff5d1eadfb7a16 100644 --- a/deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/AppCompatPreferenceActivity.java +++ b/deploy/lite/human_segmentation_demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/AppCompatPreferenceActivity.java @@ -14,7 +14,7 @@ * limitations under the License. 
*/ -package com.baidu.paddle.lite.demo; +package com.baidu.paddle.lite.demo.segmentation; import android.content.res.Configuration; import android.os.Bundle; diff --git a/deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/CommonActivity.java b/deploy/lite/human_segmentation_demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/MainActivity.java similarity index 56% rename from deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/CommonActivity.java rename to deploy/lite/human_segmentation_demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/MainActivity.java index 88146b3961e5f2c8ed366816e505144ba3ac9f6b..aab9f54c30c2a963b845970a8cf42480eb3fcf17 100644 --- a/deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/CommonActivity.java +++ b/deploy/lite/human_segmentation_demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/MainActivity.java @@ -1,39 +1,45 @@ -package com.baidu.paddle.lite.demo; +package com.baidu.paddle.lite.demo.segmentation; import android.Manifest; import android.app.ProgressDialog; import android.content.ContentResolver; import android.content.Intent; +import android.content.SharedPreferences; import android.content.pm.PackageManager; import android.database.Cursor; import android.graphics.Bitmap; import android.graphics.BitmapFactory; import android.net.Uri; import android.os.Bundle; -import android.os.Environment; import android.os.Handler; import android.os.HandlerThread; import android.os.Message; +import android.preference.PreferenceManager; import android.provider.MediaStore; import android.support.annotation.NonNull; import android.support.v4.app.ActivityCompat; import android.support.v4.content.ContextCompat; -import android.support.v4.content.FileProvider; -import android.support.v7.app.ActionBar; import android.support.v7.app.AppCompatActivity; +import android.text.method.ScrollingMovementMethod; import android.util.Log; import 
android.view.Menu; import android.view.MenuInflater; import android.view.MenuItem; +import android.widget.ImageView; +import android.widget.TextView; import android.widget.Toast; +import com.baidu.paddle.lite.demo.segmentation.config.Config; +import com.baidu.paddle.lite.demo.segmentation.preprocess.Preprocess; +import com.baidu.paddle.lite.demo.segmentation.visual.Visualize; + import java.io.File; import java.io.IOException; -import java.text.SimpleDateFormat; -import java.util.Date; +import java.io.InputStream; + +public class MainActivity extends AppCompatActivity { -public class CommonActivity extends AppCompatActivity { - private static final String TAG = CommonActivity.class.getSimpleName(); + private static final String TAG = MainActivity.class.getSimpleName(); public static final int OPEN_GALLERY_REQUEST_CODE = 0; public static final int TAKE_PHOTO_REQUEST_CODE = 1; @@ -51,14 +57,25 @@ public class CommonActivity extends AppCompatActivity { protected Handler sender = null; // send command to worker thread protected HandlerThread worker = null; // worker thread to load&run model + + protected TextView tvInputSetting; + protected ImageView ivInputImage; + protected TextView tvOutputResult; + protected TextView tvInferenceTime; + + // model config + Config config = new Config(); + + protected Predictor predictor = new Predictor(); + + Preprocess preprocess = new Preprocess(); + + Visualize visualize = new Visualize(); + @Override protected void onCreate(Bundle savedInstanceState) { super.onCreate(savedInstanceState); - ActionBar supportActionBar = getSupportActionBar(); - if (supportActionBar != null) { - supportActionBar.setDisplayHomeAsUpEnabled(true); - } - + setContentView(R.layout.activity_main); receiver = new Handler() { @Override public void handleMessage(Message msg) { @@ -69,7 +86,7 @@ public class CommonActivity extends AppCompatActivity { break; case RESPONSE_LOAD_MODEL_FAILED: pbLoadModel.dismiss(); - Toast.makeText(CommonActivity.this, "Load 
model failed!", Toast.LENGTH_SHORT).show(); + Toast.makeText(MainActivity.this, "Load model failed!", Toast.LENGTH_SHORT).show(); onLoadModelFailed(); break; case RESPONSE_RUN_MODEL_SUCCESSED: @@ -78,7 +95,7 @@ public class CommonActivity extends AppCompatActivity { break; case RESPONSE_RUN_MODEL_FAILED: pbRunModel.dismiss(); - Toast.makeText(CommonActivity.this, "Run model failed!", Toast.LENGTH_SHORT).show(); + Toast.makeText(MainActivity.this, "Run model failed!", Toast.LENGTH_SHORT).show(); onRunModelFailed(); break; default: @@ -113,6 +130,29 @@ public class CommonActivity extends AppCompatActivity { } } }; + + tvInputSetting = findViewById(R.id.tv_input_setting); + ivInputImage = findViewById(R.id.iv_input_image); + tvInferenceTime = findViewById(R.id.tv_inference_time); + tvOutputResult = findViewById(R.id.tv_output_result); + tvInputSetting.setMovementMethod(ScrollingMovementMethod.getInstance()); + tvOutputResult.setMovementMethod(ScrollingMovementMethod.getInstance()); + } + + + public boolean onLoadModel() { + return predictor.init(MainActivity.this, config); + } + + + public boolean onRunModel() { + return predictor.isLoaded() && predictor.runModel(preprocess,visualize); + } + + public void onLoadModelFailed() { + + } + public void onRunModelFailed() { } public void loadModel() { @@ -125,33 +165,61 @@ public class CommonActivity extends AppCompatActivity { sender.sendEmptyMessage(REQUEST_RUN_MODEL); } - public boolean onLoadModel() { - return true; - } - - public boolean onRunModel() { - return true; - } - public void onLoadModelSuccessed() { - } - - public void onLoadModelFailed() { + // load test image from file_paths and run model + try { + if (config.imagePath.isEmpty()) { + return; + } + Bitmap image = null; + // read test image file from custom file_paths if the first character of mode file_paths is '/', otherwise read test + // image file from assets + if (!config.imagePath.substring(0, 1).equals("/")) { + InputStream imageStream = 
getAssets().open(config.imagePath); + image = BitmapFactory.decodeStream(imageStream); + } else { + if (!new File(config.imagePath).exists()) { + return; + } + image = BitmapFactory.decodeFile(config.imagePath); + } + if (image != null && predictor.isLoaded()) { + predictor.setInputImage(image); + runModel(); + } + } catch (IOException e) { + Toast.makeText(MainActivity.this, "Load image failed!", Toast.LENGTH_SHORT).show(); + e.printStackTrace(); + } } public void onRunModelSuccessed() { + // obtain results and update UI + tvInferenceTime.setText("Inference time: " + predictor.inferenceTime() + " ms"); + Bitmap outputImage = predictor.outputImage(); + if (outputImage != null) { + ivInputImage.setImageBitmap(outputImage); + } + tvOutputResult.setText(predictor.outputResult()); + tvOutputResult.scrollTo(0, 0); } - public void onRunModelFailed() { - } public void onImageChanged(Bitmap image) { + // rerun model if users pick test image from gallery or camera + if (image != null && predictor.isLoaded()) { + predictor.setInputImage(image); + runModel(); + } } public void onImageChanged(String path) { - + Bitmap image = BitmapFactory.decodeFile(path); + predictor.setInputImage(image); + runModel(); } public void onSettingsClicked() { + startActivity(new Intent(MainActivity.this, SettingsActivity.class)); } @Override @@ -186,7 +254,6 @@ public class CommonActivity extends AppCompatActivity { } return super.onOptionsItemSelected(item); } - @Override public void onRequestPermissionsResult(int requestCode, @NonNull String[] permissions, @NonNull int[] grantResults) { @@ -195,33 +262,6 @@ public class CommonActivity extends AppCompatActivity { Toast.makeText(this, "Permission Denied", Toast.LENGTH_SHORT).show(); } } - - private boolean requestAllPermissions() { - if (ContextCompat.checkSelfPermission(this, Manifest.permission.WRITE_EXTERNAL_STORAGE) - != PackageManager.PERMISSION_GRANTED || ContextCompat.checkSelfPermission(this, - Manifest.permission.CAMERA) - != 
PackageManager.PERMISSION_GRANTED) { - ActivityCompat.requestPermissions(this, new String[]{Manifest.permission.WRITE_EXTERNAL_STORAGE, - Manifest.permission.CAMERA}, - 0); - return false; - } - return true; - } - - private void openGallery() { - Intent intent = new Intent(Intent.ACTION_PICK, null); - intent.setDataAndType(MediaStore.Images.Media.EXTERNAL_CONTENT_URI, "image/*"); - startActivityForResult(intent, OPEN_GALLERY_REQUEST_CODE); - } - - private void takePhoto() { - Intent takePhotoIntent = new Intent(MediaStore.ACTION_IMAGE_CAPTURE); - if (takePhotoIntent.resolveActivity(getPackageManager()) != null) { - startActivityForResult(takePhotoIntent, TAKE_PHOTO_REQUEST_CODE); - } - } - @Override protected void onActivityResult(int requestCode, int resultCode, Intent data) { super.onActivityResult(requestCode, resultCode, data); @@ -251,14 +291,97 @@ public class CommonActivity extends AppCompatActivity { } } } + private boolean requestAllPermissions() { + if (ContextCompat.checkSelfPermission(this, Manifest.permission.WRITE_EXTERNAL_STORAGE) + != PackageManager.PERMISSION_GRANTED || ContextCompat.checkSelfPermission(this, + Manifest.permission.CAMERA) + != PackageManager.PERMISSION_GRANTED) { + ActivityCompat.requestPermissions(this, new String[]{Manifest.permission.WRITE_EXTERNAL_STORAGE, + Manifest.permission.CAMERA}, + 0); + return false; + } + return true; + } + + private void openGallery() { + Intent intent = new Intent(Intent.ACTION_PICK, null); + intent.setDataAndType(MediaStore.Images.Media.EXTERNAL_CONTENT_URI, "image/*"); + startActivityForResult(intent, OPEN_GALLERY_REQUEST_CODE); + } + + private void takePhoto() { + Intent takePhotoIntent = new Intent(MediaStore.ACTION_IMAGE_CAPTURE); + if (takePhotoIntent.resolveActivity(getPackageManager()) != null) { + startActivityForResult(takePhotoIntent, TAKE_PHOTO_REQUEST_CODE); + } + } + + @Override + public boolean onPrepareOptionsMenu(Menu menu) { + boolean isLoaded = predictor.isLoaded(); + 
menu.findItem(R.id.open_gallery).setEnabled(isLoaded); + menu.findItem(R.id.take_photo).setEnabled(isLoaded); + return super.onPrepareOptionsMenu(menu); + } @Override protected void onResume() { + Log.i(TAG,"begin onResume"); super.onResume(); + + SharedPreferences sharedPreferences = PreferenceManager.getDefaultSharedPreferences(this); + boolean settingsChanged = false; + String model_path = sharedPreferences.getString(getString(R.string.MODEL_PATH_KEY), + getString(R.string.MODEL_PATH_DEFAULT)); + String label_path = sharedPreferences.getString(getString(R.string.LABEL_PATH_KEY), + getString(R.string.LABEL_PATH_DEFAULT)); + String image_path = sharedPreferences.getString(getString(R.string.IMAGE_PATH_KEY), + getString(R.string.IMAGE_PATH_DEFAULT)); + settingsChanged |= !model_path.equalsIgnoreCase(config.modelPath); + settingsChanged |= !label_path.equalsIgnoreCase(config.labelPath); + settingsChanged |= !image_path.equalsIgnoreCase(config.imagePath); + int cpu_thread_num = Integer.parseInt(sharedPreferences.getString(getString(R.string.CPU_THREAD_NUM_KEY), + getString(R.string.CPU_THREAD_NUM_DEFAULT))); + settingsChanged |= cpu_thread_num != config.cpuThreadNum; + String cpu_power_mode = + sharedPreferences.getString(getString(R.string.CPU_POWER_MODE_KEY), + getString(R.string.CPU_POWER_MODE_DEFAULT)); + settingsChanged |= !cpu_power_mode.equalsIgnoreCase(config.cpuPowerMode); + String input_color_format = + sharedPreferences.getString(getString(R.string.INPUT_COLOR_FORMAT_KEY), + getString(R.string.INPUT_COLOR_FORMAT_DEFAULT)); + settingsChanged |= !input_color_format.equalsIgnoreCase(config.inputColorFormat); + long[] input_shape = + Utils.parseLongsFromString(sharedPreferences.getString(getString(R.string.INPUT_SHAPE_KEY), + getString(R.string.INPUT_SHAPE_DEFAULT)), ","); + + settingsChanged |= input_shape.length != config.inputShape.length; + + if (!settingsChanged) { + for (int i = 0; i < input_shape.length; i++) { + settingsChanged |= input_shape[i] != 
config.inputShape[i]; + } + } + + if (settingsChanged) { + config.init(model_path,label_path,image_path,cpu_thread_num,cpu_power_mode, + input_color_format,input_shape); + preprocess.init(config); + // update UI + tvInputSetting.setText("Model: " + config.modelPath.substring(config.modelPath.lastIndexOf("/") + 1) + "\n" + "CPU" + + " Thread Num: " + Integer.toString(config.cpuThreadNum) + "\n" + "CPU Power Mode: " + config.cpuPowerMode); + tvInputSetting.scrollTo(0, 0); + // reload model if configure has been changed + loadModel(); + } } @Override protected void onDestroy() { + if (predictor != null) { + predictor.releaseModel(); + } worker.quit(); super.onDestroy(); } diff --git a/deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/ImgSegPredictor.java b/deploy/lite/human_segmentation_demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/Predictor.java similarity index 50% rename from deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/ImgSegPredictor.java rename to deploy/lite/human_segmentation_demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/Predictor.java index 717e086adf078a2eea69bf3fc720af8c233fd9a3..27bfe3544a9913f77c56b6f059616b6e83ca5dc8 100644 --- a/deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/ImgSegPredictor.java +++ b/deploy/lite/human_segmentation_demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/Predictor.java @@ -4,9 +4,12 @@ import android.content.Context; import android.graphics.Bitmap; import android.util.Log; +import com.baidu.paddle.lite.MobileConfig; +import com.baidu.paddle.lite.PaddlePredictor; +import com.baidu.paddle.lite.PowerMode; import com.baidu.paddle.lite.Tensor; -import com.baidu.paddle.lite.demo.Predictor; import com.baidu.paddle.lite.demo.segmentation.config.Config; + import com.baidu.paddle.lite.demo.segmentation.preprocess.Preprocess; import 
com.baidu.paddle.lite.demo.segmentation.visual.Visualize; @@ -14,15 +17,11 @@ import java.io.InputStream; import java.util.Date; import java.util.Vector; -import static android.graphics.Color.blue; -import static android.graphics.Color.green; -import static android.graphics.Color.red; - -public class ImgSegPredictor extends Predictor { - private static final String TAG = ImgSegPredictor.class.getSimpleName(); +public class Predictor { + private static final String TAG = Predictor.class.getSimpleName(); protected Vector wordLabels = new Vector(); - Config config; + Config config = new Config(); protected Bitmap inputImage = null; protected Bitmap scaledImage = null; @@ -31,10 +30,27 @@ public class ImgSegPredictor extends Predictor { protected float preprocessTime = 0; protected float postprocessTime = 0; - public ImgSegPredictor() { + public boolean isLoaded = false; + public int warmupIterNum = 0; + public int inferIterNum = 1; + protected Context appCtx = null; + public int cpuThreadNum = 1; + public String cpuPowerMode = "LITE_POWER_HIGH"; + public String modelPath = ""; + public String modelName = ""; + protected PaddlePredictor paddlePredictor = null; + protected float inferenceTime = 0; + + public Predictor() { super(); } + public boolean init(Context appCtx, String modelPath, int cpuThreadNum, String cpuPowerMode) { + this.appCtx = appCtx; + isLoaded = loadModel(modelPath, cpuThreadNum, cpuPowerMode); + return isLoaded; + } + public boolean init(Context appCtx, Config config) { if (config.inputShape.length != 4) { @@ -55,8 +71,9 @@ public class ImgSegPredictor extends Predictor { Log.i(TAG, "only RGB and BGR color format is supported."); return false; } - super.init(appCtx, config.modelPath, config.cpuThreadNum, config.cpuPowerMode); - if (!super.isLoaded()) { + init(appCtx, config.modelPath, config.cpuThreadNum, config.cpuPowerMode); + + if (!isLoaded()) { return false; } this.config = config; @@ -64,6 +81,11 @@ public class ImgSegPredictor extends 
Predictor { return isLoaded; } + + public boolean isLoaded() { + return paddlePredictor != null && isLoaded; + } + protected boolean loadLabel(String labelPath) { wordLabels.clear(); // load word labels from file @@ -87,11 +109,80 @@ public class ImgSegPredictor extends Predictor { } public Tensor getInput(int idx) { - return super.getInput(idx); + if (!isLoaded()) { + return null; + } + return paddlePredictor.getInput(idx); } public Tensor getOutput(int idx) { - return super.getOutput(idx); + if (!isLoaded()) { + return null; + } + return paddlePredictor.getOutput(idx); + } + + protected boolean loadModel(String modelPath, int cpuThreadNum, String cpuPowerMode) { + // release model if exists + releaseModel(); + + // load model + if (modelPath.isEmpty()) { + return false; + } + String realPath = modelPath; + if (!modelPath.substring(0, 1).equals("/")) { + // read model files from custom file_paths if the first character of mode file_paths is '/' + // otherwise copy model to cache from assets + realPath = appCtx.getCacheDir() + "/" + modelPath; + Utils.copyDirectoryFromAssets(appCtx, modelPath, realPath); + } + if (realPath.isEmpty()) { + return false; + } + MobileConfig modelConfig = new MobileConfig(); + modelConfig.setModelDir(realPath); + modelConfig.setThreads(cpuThreadNum); + if (cpuPowerMode.equalsIgnoreCase("LITE_POWER_HIGH")) { + modelConfig.setPowerMode(PowerMode.LITE_POWER_HIGH); + } else if (cpuPowerMode.equalsIgnoreCase("LITE_POWER_LOW")) { + modelConfig.setPowerMode(PowerMode.LITE_POWER_LOW); + } else if (cpuPowerMode.equalsIgnoreCase("LITE_POWER_FULL")) { + modelConfig.setPowerMode(PowerMode.LITE_POWER_FULL); + } else if (cpuPowerMode.equalsIgnoreCase("LITE_POWER_NO_BIND")) { + modelConfig.setPowerMode(PowerMode.LITE_POWER_NO_BIND); + } else if (cpuPowerMode.equalsIgnoreCase("LITE_POWER_RAND_HIGH")) { + modelConfig.setPowerMode(PowerMode.LITE_POWER_RAND_HIGH); + } else if (cpuPowerMode.equalsIgnoreCase("LITE_POWER_RAND_LOW")) { + 
modelConfig.setPowerMode(PowerMode.LITE_POWER_RAND_LOW); + } else { + Log.e(TAG, "unknown cpu power mode!"); + return false; + } + paddlePredictor = PaddlePredictor.createPaddlePredictor(modelConfig); + this.cpuThreadNum = cpuThreadNum; + this.cpuPowerMode = cpuPowerMode; + this.modelPath = realPath; + this.modelName = realPath.substring(realPath.lastIndexOf("/") + 1); + return true; + } + + public boolean runModel() { + if (!isLoaded()) { + return false; + } + // warm up + for (int i = 0; i < warmupIterNum; i++){ + paddlePredictor.run(); + } + // inference + Date start = new Date(); + for (int i = 0; i < inferIterNum; i++) { + paddlePredictor.run(); + } + Date end = new Date(); + inferenceTime = (end.getTime() - start.getTime()) / (float) inferIterNum; + return true; } public boolean runModel(Bitmap image) { @@ -106,39 +197,42 @@ public class ImgSegPredictor extends Predictor { // set input shape Tensor inputTensor = getInput(0); - inputTensor.resize(config.inputShape); // pre-process image Date start = new Date(); preprocess.init(config); - preprocess.to_array(scaledImage); // feed input tensor with pre-processed data - inputTensor.setData(preprocess.inputData); Date end = new Date(); preprocessTime = (float) (end.getTime() - start.getTime()); // inference - super.runModel(); + runModel(); + start = new Date(); Tensor outputTensor = getOutput(0); // post-process - this.outputImage = visualize.draw(inputImage,outputTensor); - + this.outputImage = visualize.draw(inputImage, outputTensor); postprocessTime = (float) (end.getTime() - start.getTime()); - start = new Date(); outputResult = new String(); - end = new Date(); return true; } + public void releaseModel() { + paddlePredictor = null; + isLoaded = false; + cpuThreadNum = 1; + cpuPowerMode = "LITE_POWER_HIGH"; + modelPath = ""; + modelName = ""; + } public void setConfig(Config config){ this.config = config; @@ -164,13 +258,32 @@ public class ImgSegPredictor extends Predictor { return postprocessTime; } + public 
String modelPath() { + return modelPath; + } + + public String modelName() { + return modelName; + } + + public int cpuThreadNum() { + return cpuThreadNum; + } + + public String cpuPowerMode() { + return cpuPowerMode; + } + + public float inferenceTime() { + return inferenceTime; + } + public void setInputImage(Bitmap image) { if (image == null) { return; } // scale image to the size of input tensor Bitmap rgbaImage = image.copy(Bitmap.Config.ARGB_8888, true); - Bitmap scaleImage = Bitmap.createScaledBitmap(rgbaImage, (int) this.config.inputShape[3], (int) this.config.inputShape[2], true); this.inputImage = rgbaImage; this.scaledImage = scaleImage; diff --git a/deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/ImgSegSettingsActivity.java b/deploy/lite/human_segmentation_demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/SettingsActivity.java similarity index 60% rename from deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/ImgSegSettingsActivity.java rename to deploy/lite/human_segmentation_demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/SettingsActivity.java index 710d318572088f63f97d298df0ce931d7ecd323e..8f53974d48ed572cd3ccf5d9da4ea74dcdd718c0 100644 --- a/deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/ImgSegSettingsActivity.java +++ b/deploy/lite/human_segmentation_demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/SettingsActivity.java @@ -7,14 +7,10 @@ import android.preference.EditTextPreference; import android.preference.ListPreference; import android.support.v7.app.ActionBar; -import com.baidu.paddle.lite.demo.AppCompatPreferenceActivity; -import com.baidu.paddle.lite.demo.R; -import com.baidu.paddle.lite.demo.Utils; - import java.util.ArrayList; import java.util.List; -public class ImgSegSettingsActivity extends AppCompatPreferenceActivity implements 
SharedPreferences.OnSharedPreferenceChangeListener { +public class SettingsActivity extends AppCompatPreferenceActivity implements SharedPreferences.OnSharedPreferenceChangeListener { ListPreference lpChoosePreInstalledModel = null; CheckBoxPreference cbEnableCustomSettings = null; EditTextPreference etModelPath = null; @@ -23,24 +19,21 @@ public class ImgSegSettingsActivity extends AppCompatPreferenceActivity implemen ListPreference lpCPUThreadNum = null; ListPreference lpCPUPowerMode = null; ListPreference lpInputColorFormat = null; - EditTextPreference etInputShape = null; - EditTextPreference etInputMean = null; - EditTextPreference etInputStd = null; + + List preInstalledModelPaths = null; List preInstalledLabelPaths = null; List preInstalledImagePaths = null; - List preInstalledInputShapes = null; List preInstalledCPUThreadNums = null; List preInstalledCPUPowerModes = null; List preInstalledInputColorFormats = null; - List preInstalledInputMeans = null; - List preInstalledInputStds = null; + @Override public void onCreate(Bundle savedInstanceState) { super.onCreate(savedInstanceState); - addPreferencesFromResource(R.xml.settings_img_seg); + addPreferencesFromResource(R.xml.settings); ActionBar supportActionBar = getSupportActionBar(); if (supportActionBar != null) { supportActionBar.setDisplayHomeAsUpEnabled(true); @@ -50,24 +43,20 @@ public class ImgSegSettingsActivity extends AppCompatPreferenceActivity implemen preInstalledModelPaths = new ArrayList(); preInstalledLabelPaths = new ArrayList(); preInstalledImagePaths = new ArrayList(); - preInstalledInputShapes = new ArrayList(); + preInstalledCPUThreadNums = new ArrayList(); preInstalledCPUPowerModes = new ArrayList(); preInstalledInputColorFormats = new ArrayList(); - preInstalledInputMeans = new ArrayList(); - preInstalledInputStds = new ArrayList(); // add deeplab_mobilenet_for_cpu - preInstalledModelPaths.add(getString(R.string.ISG_MODEL_PATH_DEFAULT)); - 
preInstalledLabelPaths.add(getString(R.string.ISG_LABEL_PATH_DEFAULT)); - preInstalledImagePaths.add(getString(R.string.ISG_IMAGE_PATH_DEFAULT)); - preInstalledCPUThreadNums.add(getString(R.string.ISG_CPU_THREAD_NUM_DEFAULT)); - preInstalledCPUPowerModes.add(getString(R.string.ISG_CPU_POWER_MODE_DEFAULT)); - preInstalledInputColorFormats.add(getString(R.string.ISG_INPUT_COLOR_FORMAT_DEFAULT)); - preInstalledInputShapes.add(getString(R.string.ISG_INPUT_SHAPE_DEFAULT)); - + preInstalledModelPaths.add(getString(R.string.MODEL_PATH_DEFAULT)); + preInstalledLabelPaths.add(getString(R.string.LABEL_PATH_DEFAULT)); + preInstalledImagePaths.add(getString(R.string.IMAGE_PATH_DEFAULT)); + preInstalledCPUThreadNums.add(getString(R.string.CPU_THREAD_NUM_DEFAULT)); + preInstalledCPUPowerModes.add(getString(R.string.CPU_POWER_MODE_DEFAULT)); + preInstalledInputColorFormats.add(getString(R.string.INPUT_COLOR_FORMAT_DEFAULT)); // initialize UI components lpChoosePreInstalledModel = - (ListPreference) findPreference(getString(R.string.ISG_CHOOSE_PRE_INSTALLED_MODEL_KEY)); + (ListPreference) findPreference(getString(R.string.CHOOSE_PRE_INSTALLED_MODEL_KEY)); String[] preInstalledModelNames = new String[preInstalledModelPaths.size()]; for (int i = 0; i < preInstalledModelPaths.size(); i++) { preInstalledModelNames[i] = @@ -76,38 +65,36 @@ public class ImgSegSettingsActivity extends AppCompatPreferenceActivity implemen lpChoosePreInstalledModel.setEntries(preInstalledModelNames); lpChoosePreInstalledModel.setEntryValues(preInstalledModelPaths.toArray(new String[preInstalledModelPaths.size()])); cbEnableCustomSettings = - (CheckBoxPreference) findPreference(getString(R.string.ISG_ENABLE_CUSTOM_SETTINGS_KEY)); - etModelPath = (EditTextPreference) findPreference(getString(R.string.ISG_MODEL_PATH_KEY)); + (CheckBoxPreference) findPreference(getString(R.string.ENABLE_CUSTOM_SETTINGS_KEY)); + etModelPath = (EditTextPreference) findPreference(getString(R.string.MODEL_PATH_KEY)); 
etModelPath.setTitle("Model Path (SDCard: " + Utils.getSDCardDirectory() + ")"); - etLabelPath = (EditTextPreference) findPreference(getString(R.string.ISG_LABEL_PATH_KEY)); - etImagePath = (EditTextPreference) findPreference(getString(R.string.ISG_IMAGE_PATH_KEY)); + etLabelPath = (EditTextPreference) findPreference(getString(R.string.LABEL_PATH_KEY)); + etImagePath = (EditTextPreference) findPreference(getString(R.string.IMAGE_PATH_KEY)); lpCPUThreadNum = - (ListPreference) findPreference(getString(R.string.ISG_CPU_THREAD_NUM_KEY)); + (ListPreference) findPreference(getString(R.string.CPU_THREAD_NUM_KEY)); lpCPUPowerMode = - (ListPreference) findPreference(getString(R.string.ISG_CPU_POWER_MODE_KEY)); + (ListPreference) findPreference(getString(R.string.CPU_POWER_MODE_KEY)); lpInputColorFormat = - (ListPreference) findPreference(getString(R.string.ISG_INPUT_COLOR_FORMAT_KEY)); - etInputShape = (EditTextPreference) findPreference(getString(R.string.ISG_INPUT_SHAPE_KEY)); + (ListPreference) findPreference(getString(R.string.INPUT_COLOR_FORMAT_KEY)); } private void reloadPreferenceAndUpdateUI() { SharedPreferences sharedPreferences = getPreferenceScreen().getSharedPreferences(); boolean enableCustomSettings = - sharedPreferences.getBoolean(getString(R.string.ISG_ENABLE_CUSTOM_SETTINGS_KEY), false); - String modelPath = sharedPreferences.getString(getString(R.string.ISG_CHOOSE_PRE_INSTALLED_MODEL_KEY), - getString(R.string.ISG_MODEL_PATH_DEFAULT)); + sharedPreferences.getBoolean(getString(R.string.ENABLE_CUSTOM_SETTINGS_KEY), false); + String modelPath = sharedPreferences.getString(getString(R.string.CHOOSE_PRE_INSTALLED_MODEL_KEY), + getString(R.string.MODEL_PATH_DEFAULT)); int modelIdx = lpChoosePreInstalledModel.findIndexOfValue(modelPath); if (modelIdx >= 0 && modelIdx < preInstalledModelPaths.size()) { if (!enableCustomSettings) { SharedPreferences.Editor editor = sharedPreferences.edit(); - editor.putString(getString(R.string.ISG_MODEL_PATH_KEY), 
preInstalledModelPaths.get(modelIdx)); - editor.putString(getString(R.string.ISG_LABEL_PATH_KEY), preInstalledLabelPaths.get(modelIdx)); - editor.putString(getString(R.string.ISG_IMAGE_PATH_KEY), preInstalledImagePaths.get(modelIdx)); - editor.putString(getString(R.string.ISG_CPU_THREAD_NUM_KEY), preInstalledCPUThreadNums.get(modelIdx)); - editor.putString(getString(R.string.ISG_CPU_POWER_MODE_KEY), preInstalledCPUPowerModes.get(modelIdx)); - editor.putString(getString(R.string.ISG_INPUT_COLOR_FORMAT_KEY), + editor.putString(getString(R.string.MODEL_PATH_KEY), preInstalledModelPaths.get(modelIdx)); + editor.putString(getString(R.string.LABEL_PATH_KEY), preInstalledLabelPaths.get(modelIdx)); + editor.putString(getString(R.string.IMAGE_PATH_KEY), preInstalledImagePaths.get(modelIdx)); + editor.putString(getString(R.string.CPU_THREAD_NUM_KEY), preInstalledCPUThreadNums.get(modelIdx)); + editor.putString(getString(R.string.CPU_POWER_MODE_KEY), preInstalledCPUPowerModes.get(modelIdx)); + editor.putString(getString(R.string.INPUT_COLOR_FORMAT_KEY), preInstalledInputColorFormats.get(modelIdx)); - editor.putString(getString(R.string.ISG_INPUT_SHAPE_KEY), preInstalledInputShapes.get(modelIdx)); editor.commit(); } lpChoosePreInstalledModel.setSummary(modelPath); @@ -119,23 +106,18 @@ public class ImgSegSettingsActivity extends AppCompatPreferenceActivity implemen lpCPUThreadNum.setEnabled(enableCustomSettings); lpCPUPowerMode.setEnabled(enableCustomSettings); lpInputColorFormat.setEnabled(enableCustomSettings); - etInputShape.setEnabled(enableCustomSettings); - etInputMean.setEnabled(enableCustomSettings); - etInputStd.setEnabled(enableCustomSettings); - modelPath = sharedPreferences.getString(getString(R.string.ISG_MODEL_PATH_KEY), - getString(R.string.ISG_MODEL_PATH_DEFAULT)); - String labelPath = sharedPreferences.getString(getString(R.string.ISG_LABEL_PATH_KEY), - getString(R.string.ISG_LABEL_PATH_DEFAULT)); - String imagePath = 
sharedPreferences.getString(getString(R.string.ISG_IMAGE_PATH_KEY), - getString(R.string.ISG_IMAGE_PATH_DEFAULT)); - String cpuThreadNum = sharedPreferences.getString(getString(R.string.ISG_CPU_THREAD_NUM_KEY), - getString(R.string.ISG_CPU_THREAD_NUM_DEFAULT)); - String cpuPowerMode = sharedPreferences.getString(getString(R.string.ISG_CPU_POWER_MODE_KEY), - getString(R.string.ISG_CPU_POWER_MODE_DEFAULT)); - String inputColorFormat = sharedPreferences.getString(getString(R.string.ISG_INPUT_COLOR_FORMAT_KEY), - getString(R.string.ISG_INPUT_COLOR_FORMAT_DEFAULT)); - String inputShape = sharedPreferences.getString(getString(R.string.ISG_INPUT_SHAPE_KEY), - getString(R.string.ISG_INPUT_SHAPE_DEFAULT)); + modelPath = sharedPreferences.getString(getString(R.string.MODEL_PATH_KEY), + getString(R.string.MODEL_PATH_DEFAULT)); + String labelPath = sharedPreferences.getString(getString(R.string.LABEL_PATH_KEY), + getString(R.string.LABEL_PATH_DEFAULT)); + String imagePath = sharedPreferences.getString(getString(R.string.IMAGE_PATH_KEY), + getString(R.string.IMAGE_PATH_DEFAULT)); + String cpuThreadNum = sharedPreferences.getString(getString(R.string.CPU_THREAD_NUM_KEY), + getString(R.string.CPU_THREAD_NUM_DEFAULT)); + String cpuPowerMode = sharedPreferences.getString(getString(R.string.CPU_POWER_MODE_KEY), + getString(R.string.CPU_POWER_MODE_DEFAULT)); + String inputColorFormat = sharedPreferences.getString(getString(R.string.INPUT_COLOR_FORMAT_KEY), + getString(R.string.INPUT_COLOR_FORMAT_DEFAULT)); etModelPath.setSummary(modelPath); etModelPath.setText(modelPath); etLabelPath.setSummary(labelPath); @@ -148,8 +130,7 @@ public class ImgSegSettingsActivity extends AppCompatPreferenceActivity implemen lpCPUPowerMode.setSummary(cpuPowerMode); lpInputColorFormat.setValue(inputColorFormat); lpInputColorFormat.setSummary(inputColorFormat); - etInputShape.setSummary(inputShape); - etInputShape.setText(inputShape); + } @Override @@ -167,9 +148,9 @@ public class ImgSegSettingsActivity 
extends AppCompatPreferenceActivity implemen @Override public void onSharedPreferenceChanged(SharedPreferences sharedPreferences, String key) { - if (key.equals(getString(R.string.ISG_CHOOSE_PRE_INSTALLED_MODEL_KEY))) { + if (key.equals(getString(R.string.CHOOSE_PRE_INSTALLED_MODEL_KEY))) { SharedPreferences.Editor editor = sharedPreferences.edit(); - editor.putBoolean(getString(R.string.ISG_ENABLE_CUSTOM_SETTINGS_KEY), false); + editor.putBoolean(getString(R.string.ENABLE_CUSTOM_SETTINGS_KEY), false); editor.commit(); } reloadPreferenceAndUpdateUI(); diff --git a/deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/Utils.java b/deploy/lite/human_segmentation_demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/Utils.java similarity index 98% rename from deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/Utils.java rename to deploy/lite/human_segmentation_demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/Utils.java index a8b252365d05313d847d4ccd491fb44596f31227..3d581592dfc78b0e26bedb201c704fe9eff79ebc 100644 --- a/deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/Utils.java +++ b/deploy/lite/human_segmentation_demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/Utils.java @@ -1,4 +1,4 @@ -package com.baidu.paddle.lite.demo; +package com.baidu.paddle.lite.demo.segmentation; import android.content.Context; import android.os.Environment; diff --git a/deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/config/Config.java b/deploy/lite/human_segmentation_demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/config/Config.java similarity index 99% rename from deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/config/Config.java rename to deploy/lite/human_segmentation_demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/config/Config.java index 
3f059878334cb324dbf01a9fb7b7e91632c1333f..4f09eb53cb642bfbeca001c75e410774bb984fa8 100644 --- a/deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/config/Config.java +++ b/deploy/lite/human_segmentation_demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/config/Config.java @@ -9,7 +9,6 @@ public class Config { public String imagePath = ""; public int cpuThreadNum = 1; public String cpuPowerMode = ""; - public String inputColorFormat = ""; public long[] inputShape = new long[]{}; @@ -22,7 +21,6 @@ public class Config { this.imagePath = imagePath; this.cpuThreadNum = cpuThreadNum; this.cpuPowerMode = cpuPowerMode; - this.inputColorFormat = inputColorFormat; this.inputShape = inputShape; } @@ -30,7 +28,6 @@ public class Config { public void setInputShape(Bitmap inputImage){ this.inputShape[0] = 1; this.inputShape[1] = 3; - this.inputShape[2] = inputImage.getHeight(); this.inputShape[3] = inputImage.getWidth(); diff --git a/deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/preprocess/Preprocess.java b/deploy/lite/human_segmentation_demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/preprocess/Preprocess.java similarity index 100% rename from deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/preprocess/Preprocess.java rename to deploy/lite/human_segmentation_demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/preprocess/Preprocess.java diff --git a/deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/visual/Visualize.java b/deploy/lite/human_segmentation_demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/visual/Visualize.java similarity index 100% rename from deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/visual/Visualize.java rename to 
deploy/lite/human_segmentation_demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/visual/Visualize.java diff --git a/deploy/lite/humanseg-android-demo/app/src/main/res/drawable-v24/ic_launcher_foreground.xml b/deploy/lite/human_segmentation_demo/app/src/main/res/drawable-v24/ic_launcher_foreground.xml similarity index 100% rename from deploy/lite/humanseg-android-demo/app/src/main/res/drawable-v24/ic_launcher_foreground.xml rename to deploy/lite/human_segmentation_demo/app/src/main/res/drawable-v24/ic_launcher_foreground.xml diff --git a/deploy/lite/humanseg-android-demo/app/src/main/res/drawable/ic_launcher_background.xml b/deploy/lite/human_segmentation_demo/app/src/main/res/drawable/ic_launcher_background.xml similarity index 100% rename from deploy/lite/humanseg-android-demo/app/src/main/res/drawable/ic_launcher_background.xml rename to deploy/lite/human_segmentation_demo/app/src/main/res/drawable/ic_launcher_background.xml diff --git a/deploy/lite/humanseg-android-demo/app/src/main/res/layout/activity_img_seg.xml b/deploy/lite/human_segmentation_demo/app/src/main/res/layout/activity_main.xml similarity index 98% rename from deploy/lite/humanseg-android-demo/app/src/main/res/layout/activity_img_seg.xml rename to deploy/lite/human_segmentation_demo/app/src/main/res/layout/activity_main.xml index a2839ba627ef41bda0676d225d3bd95508795b2b..356b0069df58b2f33eaab1dc1077daeda946f9e5 100644 --- a/deploy/lite/humanseg-android-demo/app/src/main/res/layout/activity_img_seg.xml +++ b/deploy/lite/human_segmentation_demo/app/src/main/res/layout/activity_main.xml @@ -4,7 +4,7 @@ xmlns:tools="http://schemas.android.com/tools" android:layout_width="match_parent" android:layout_height="match_parent" - tools:context=".segmentation.ImgSegActivity"> + tools:context=".segmentation.MainActivity"> +Human Segmentation + +CHOOSE_PRE_INSTALLED_MODEL_KEY +ENABLE_CUSTOM_SETTINGS_KEY +MODEL_PATH_KEY +LABEL_PATH_KEY +IMAGE_PATH_KEY +CPU_THREAD_NUM_KEY +CPU_POWER_MODE_KEY 
+INPUT_COLOR_FORMAT_KEY +INPUT_SHAPE_KEY +image_segmentation/models/deeplab_mobilenet_for_cpu +image_segmentation/labels/label_list +image_segmentation/images/human.jpg +1 +LITE_POWER_HIGH +RGB +1,3,513,513 + diff --git a/deploy/lite/humanseg-android-demo/app/src/main/res/values/styles.xml b/deploy/lite/human_segmentation_demo/app/src/main/res/values/styles.xml similarity index 100% rename from deploy/lite/humanseg-android-demo/app/src/main/res/values/styles.xml rename to deploy/lite/human_segmentation_demo/app/src/main/res/values/styles.xml diff --git a/deploy/lite/humanseg-android-demo/app/src/main/res/xml/settings_img_seg.xml b/deploy/lite/human_segmentation_demo/app/src/main/res/xml/settings.xml similarity index 62% rename from deploy/lite/humanseg-android-demo/app/src/main/res/xml/settings_img_seg.xml rename to deploy/lite/human_segmentation_demo/app/src/main/res/xml/settings.xml index 8f9e5e76634fae82cf800cc75d3e058c14a255f3..8f1a723ceb4a3b6860c05a3b09c5b3f61e1a6ae2 100644 --- a/deploy/lite/humanseg-android-demo/app/src/main/res/xml/settings_img_seg.xml +++ b/deploy/lite/human_segmentation_demo/app/src/main/res/xml/settings.xml @@ -2,42 +2,42 @@ - + diff --git a/deploy/lite/humanseg-android-demo/app/src/test/java/com/baidu/paddle/lite/demo/ExampleUnitTest.java b/deploy/lite/human_segmentation_demo/app/src/test/java/com/baidu/paddle/lite/demo/ExampleUnitTest.java similarity index 100% rename from deploy/lite/humanseg-android-demo/app/src/test/java/com/baidu/paddle/lite/demo/ExampleUnitTest.java rename to deploy/lite/human_segmentation_demo/app/src/test/java/com/baidu/paddle/lite/demo/ExampleUnitTest.java diff --git a/deploy/lite/humanseg-android-demo/build.gradle b/deploy/lite/human_segmentation_demo/build.gradle similarity index 100% rename from deploy/lite/humanseg-android-demo/build.gradle rename to deploy/lite/human_segmentation_demo/build.gradle diff --git a/deploy/lite/humanseg-android-demo/gradle.properties 
b/deploy/lite/human_segmentation_demo/gradle.properties similarity index 100% rename from deploy/lite/humanseg-android-demo/gradle.properties rename to deploy/lite/human_segmentation_demo/gradle.properties diff --git a/deploy/lite/humanseg-android-demo/gradle/wrapper/gradle-wrapper.jar b/deploy/lite/human_segmentation_demo/gradle/wrapper/gradle-wrapper.jar similarity index 100% rename from deploy/lite/humanseg-android-demo/gradle/wrapper/gradle-wrapper.jar rename to deploy/lite/human_segmentation_demo/gradle/wrapper/gradle-wrapper.jar diff --git a/deploy/lite/humanseg-android-demo/gradle/wrapper/gradle-wrapper.properties b/deploy/lite/human_segmentation_demo/gradle/wrapper/gradle-wrapper.properties similarity index 100% rename from deploy/lite/humanseg-android-demo/gradle/wrapper/gradle-wrapper.properties rename to deploy/lite/human_segmentation_demo/gradle/wrapper/gradle-wrapper.properties diff --git a/deploy/lite/humanseg-android-demo/gradlew b/deploy/lite/human_segmentation_demo/gradlew similarity index 100% rename from deploy/lite/humanseg-android-demo/gradlew rename to deploy/lite/human_segmentation_demo/gradlew diff --git a/deploy/lite/humanseg-android-demo/gradlew.bat b/deploy/lite/human_segmentation_demo/gradlew.bat similarity index 100% rename from deploy/lite/humanseg-android-demo/gradlew.bat rename to deploy/lite/human_segmentation_demo/gradlew.bat diff --git a/deploy/lite/humanseg-android-demo/settings.gradle b/deploy/lite/human_segmentation_demo/settings.gradle similarity index 100% rename from deploy/lite/humanseg-android-demo/settings.gradle rename to deploy/lite/human_segmentation_demo/settings.gradle diff --git a/deploy/lite/humanseg-android-demo/app/build.gradle b/deploy/lite/humanseg-android-demo/app/build.gradle deleted file mode 100644 index 087d90ca07b67a94030346989c9b1e8597693f61..0000000000000000000000000000000000000000 --- a/deploy/lite/humanseg-android-demo/app/build.gradle +++ /dev/null @@ -1,30 +0,0 @@ -apply plugin: 
'com.android.application' - -android { - compileSdkVersion 28 - defaultConfig { - applicationId "com.baidu.paddle.lite.demo" - minSdkVersion 15 - targetSdkVersion 28 - versionCode 1 - versionName "1.0" - testInstrumentationRunner "android.support.test.runner.AndroidJUnitRunner" - } - buildTypes { - release { - minifyEnabled false - proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro' - } - } -} - -dependencies { - implementation fileTree(include: ['*.jar'], dir: 'libs') - implementation 'com.android.support:appcompat-v7:28.0.0' - implementation 'com.android.support.constraint:constraint-layout:1.1.3' - implementation 'com.android.support:design:28.0.0' - testImplementation 'junit:junit:4.12' - androidTestImplementation 'com.android.support.test:runner:1.0.2' - androidTestImplementation 'com.android.support.test.espresso:espresso-core:3.0.2' - implementation files('libs/PaddlePredictor.jar') -} diff --git a/deploy/lite/humanseg-android-demo/app/libs/PaddlePredictor.jar b/deploy/lite/humanseg-android-demo/app/libs/PaddlePredictor.jar deleted file mode 100644 index 037d569f712578c5cda766b1160654ea491115df..0000000000000000000000000000000000000000 Binary files a/deploy/lite/humanseg-android-demo/app/libs/PaddlePredictor.jar and /dev/null differ diff --git a/deploy/lite/humanseg-android-demo/app/src/main/assets/image_segmentation/models/deeplab_mobilenet_for_cpu/__model__.nb b/deploy/lite/humanseg-android-demo/app/src/main/assets/image_segmentation/models/deeplab_mobilenet_for_cpu/__model__.nb deleted file mode 100644 index 1a83251934c56c808de90abb5bc20887305da87e..0000000000000000000000000000000000000000 Binary files a/deploy/lite/humanseg-android-demo/app/src/main/assets/image_segmentation/models/deeplab_mobilenet_for_cpu/__model__.nb and /dev/null differ diff --git a/deploy/lite/humanseg-android-demo/app/src/main/assets/image_segmentation/models/deeplab_mobilenet_for_cpu/param.nb 
b/deploy/lite/humanseg-android-demo/app/src/main/assets/image_segmentation/models/deeplab_mobilenet_for_cpu/param.nb deleted file mode 100644 index 1a184669cadd6e435c143d8a43eb09e33a1ef9c2..0000000000000000000000000000000000000000 Binary files a/deploy/lite/humanseg-android-demo/app/src/main/assets/image_segmentation/models/deeplab_mobilenet_for_cpu/param.nb and /dev/null differ diff --git a/deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/MainActivity.java b/deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/MainActivity.java deleted file mode 100644 index 00728f865a77e601ec60dc30d2f8dc047aa42472..0000000000000000000000000000000000000000 --- a/deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/MainActivity.java +++ /dev/null @@ -1,43 +0,0 @@ -package com.baidu.paddle.lite.demo; - -import android.content.Intent; -import android.content.SharedPreferences; -import android.os.Bundle; -import android.preference.PreferenceManager; -import android.support.v7.app.AppCompatActivity; -import android.util.Log; -import android.view.View; - -import com.baidu.paddle.lite.demo.segmentation.ImgSegActivity; - -public class MainActivity extends AppCompatActivity implements View.OnClickListener { - private static final String TAG = MainActivity.class.getSimpleName(); - - @Override - protected void onCreate(Bundle savedInstanceState) { - super.onCreate(savedInstanceState); - setContentView(R.layout.activity_main); - - // clear all setting items to avoid app crashing due to the incorrect settings - SharedPreferences sharedPreferences = PreferenceManager.getDefaultSharedPreferences(this); - SharedPreferences.Editor editor = sharedPreferences.edit(); - editor.clear(); - editor.commit(); - } - - @Override - public void onClick(View v) { - switch (v.getId()) { - case R.id.v_img_seg: { - Intent intent = new Intent(MainActivity.this, ImgSegActivity.class); - startActivity(intent); - } break; - } - } - - 
@Override - protected void onDestroy() { - super.onDestroy(); - System.exit(0); - } -} diff --git a/deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/Predictor.java b/deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/Predictor.java deleted file mode 100644 index 27bd971017eba6bb52901a7e2aa1e0a8e3cf5ef0..0000000000000000000000000000000000000000 --- a/deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/Predictor.java +++ /dev/null @@ -1,143 +0,0 @@ -package com.baidu.paddle.lite.demo; - -import android.content.Context; -import android.util.Log; -import com.baidu.paddle.lite.*; - -import java.util.ArrayList; -import java.util.Date; - -public class Predictor { - private static final String TAG = Predictor.class.getSimpleName(); - - public boolean isLoaded = false; - public int warmupIterNum = 0; - public int inferIterNum = 1; - protected Context appCtx = null; - public int cpuThreadNum = 1; - public String cpuPowerMode = "LITE_POWER_HIGH"; - public String modelPath = ""; - public String modelName = ""; - protected PaddlePredictor paddlePredictor = null; - protected float inferenceTime = 0; - - public Predictor() { - } - - public boolean init(Context appCtx, String modelPath, int cpuThreadNum, String cpuPowerMode) { - this.appCtx = appCtx; - isLoaded = loadModel(modelPath, cpuThreadNum, cpuPowerMode); - return isLoaded; - } - - protected boolean loadModel(String modelPath, int cpuThreadNum, String cpuPowerMode) { - // release model if exists - releaseModel(); - - // load model - if (modelPath.isEmpty()) { - return false; - } - String realPath = modelPath; - if (!modelPath.substring(0, 1).equals("/")) { - // read model files from custom file_paths if the first character of mode file_paths is '/' - // otherwise copy model to cache from assets - realPath = appCtx.getCacheDir() + "/" + modelPath; - Utils.copyDirectoryFromAssets(appCtx, modelPath, realPath); - } - if (realPath.isEmpty()) { - 
return false; - } - MobileConfig config = new MobileConfig(); - config.setModelDir(realPath); - config.setThreads(cpuThreadNum); - if (cpuPowerMode.equalsIgnoreCase("LITE_POWER_HIGH")) { - config.setPowerMode(PowerMode.LITE_POWER_HIGH); - } else if (cpuPowerMode.equalsIgnoreCase("LITE_POWER_LOW")) { - config.setPowerMode(PowerMode.LITE_POWER_LOW); - } else if (cpuPowerMode.equalsIgnoreCase("LITE_POWER_FULL")) { - config.setPowerMode(PowerMode.LITE_POWER_FULL); - } else if (cpuPowerMode.equalsIgnoreCase("LITE_POWER_NO_BIND")) { - config.setPowerMode(PowerMode.LITE_POWER_NO_BIND); - } else if (cpuPowerMode.equalsIgnoreCase("LITE_POWER_RAND_HIGH")) { - config.setPowerMode(PowerMode.LITE_POWER_RAND_HIGH); - } else if (cpuPowerMode.equalsIgnoreCase("LITE_POWER_RAND_LOW")) { - config.setPowerMode(PowerMode.LITE_POWER_RAND_LOW); - } else { - Log.e(TAG, "unknown cpu power mode!"); - return false; - } - paddlePredictor = PaddlePredictor.createPaddlePredictor(config); - - this.cpuThreadNum = cpuThreadNum; - this.cpuPowerMode = cpuPowerMode; - this.modelPath = realPath; - this.modelName = realPath.substring(realPath.lastIndexOf("/") + 1); - return true; - } - - public void releaseModel() { - paddlePredictor = null; - isLoaded = false; - cpuThreadNum = 1; - cpuPowerMode = "LITE_POWER_HIGH"; - modelPath = ""; - modelName = ""; - } - - public Tensor getInput(int idx) { - if (!isLoaded()) { - return null; - } - return paddlePredictor.getInput(idx); - } - - public Tensor getOutput(int idx) { - if (!isLoaded()) { - return null; - } - return paddlePredictor.getOutput(idx); - } - - public boolean runModel() { - if (!isLoaded()) { - return false; - } - // warm up - for (int i = 0; i < warmupIterNum; i++){ - paddlePredictor.run(); - } - // inference - Date start = new Date(); - for (int i = 0; i < inferIterNum; i++) { - paddlePredictor.run(); - } - Date end = new Date(); - inferenceTime = (end.getTime() - start.getTime()) / (float) inferIterNum; - return true; - } - - public boolean 
isLoaded() { - return paddlePredictor != null && isLoaded; - } - - public String modelPath() { - return modelPath; - } - - public String modelName() { - return modelName; - } - - public int cpuThreadNum() { - return cpuThreadNum; - } - - public String cpuPowerMode() { - return cpuPowerMode; - } - - public float inferenceTime() { - return inferenceTime; - } -} diff --git a/deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/ImgSegActivity.java b/deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/ImgSegActivity.java deleted file mode 100644 index d18895aedb892405783c030167cb3e9d1ed2d304..0000000000000000000000000000000000000000 --- a/deploy/lite/humanseg-android-demo/app/src/main/java/com/baidu/paddle/lite/demo/segmentation/ImgSegActivity.java +++ /dev/null @@ -1,210 +0,0 @@ -package com.baidu.paddle.lite.demo.segmentation; - -import android.content.Intent; -import android.content.SharedPreferences; -import android.graphics.Bitmap; -import android.graphics.BitmapFactory; -import android.os.Bundle; -import android.preference.PreferenceManager; -import android.text.method.ScrollingMovementMethod; -import android.util.Log; -import android.view.Menu; -import android.widget.ImageView; -import android.widget.TextView; -import android.widget.Toast; - -import com.baidu.paddle.lite.demo.CommonActivity; -import com.baidu.paddle.lite.demo.R; -import com.baidu.paddle.lite.demo.Utils; -import com.baidu.paddle.lite.demo.segmentation.config.Config; -import com.baidu.paddle.lite.demo.segmentation.preprocess.Preprocess; -import com.baidu.paddle.lite.demo.segmentation.visual.Visualize; - -import java.io.File; -import java.io.IOException; -import java.io.InputStream; - -public class ImgSegActivity extends CommonActivity { - private static final String TAG = ImgSegActivity.class.getSimpleName(); - - protected TextView tvInputSetting; - protected ImageView ivInputImage; - protected TextView tvOutputResult; - 
protected TextView tvInferenceTime; - - // model config - Config config = new Config(); - - protected ImgSegPredictor predictor = new ImgSegPredictor(); - - Preprocess preprocess = new Preprocess(); - - Visualize visualize = new Visualize(); - - @Override - protected void onCreate(Bundle savedInstanceState) { - - super.onCreate(savedInstanceState); - setContentView(R.layout.activity_img_seg); - tvInputSetting = findViewById(R.id.tv_input_setting); - ivInputImage = findViewById(R.id.iv_input_image); - tvInferenceTime = findViewById(R.id.tv_inference_time); - tvOutputResult = findViewById(R.id.tv_output_result); - tvInputSetting.setMovementMethod(ScrollingMovementMethod.getInstance()); - tvOutputResult.setMovementMethod(ScrollingMovementMethod.getInstance()); - } - - @Override - public boolean onLoadModel() { - return super.onLoadModel() && predictor.init(ImgSegActivity.this, config); - } - - @Override - public boolean onRunModel() { - return super.onRunModel() && predictor.isLoaded() && predictor.runModel(preprocess,visualize); - } - - @Override - public void onLoadModelSuccessed() { - super.onLoadModelSuccessed(); - // load test image from file_paths and run model - try { - if (config.imagePath.isEmpty()) { - return; - } - Bitmap image = null; - // read test image file from custom file_paths if the first character of mode file_paths is '/', otherwise read test - // image file from assets - if (!config.imagePath.substring(0, 1).equals("/")) { - InputStream imageStream = getAssets().open(config.imagePath); - image = BitmapFactory.decodeStream(imageStream); - } else { - if (!new File(config.imagePath).exists()) { - return; - } - image = BitmapFactory.decodeFile(config.imagePath); - } - if (image != null && predictor.isLoaded()) { - predictor.setInputImage(image); - runModel(); - } - } catch (IOException e) { - Toast.makeText(ImgSegActivity.this, "Load image failed!", Toast.LENGTH_SHORT).show(); - e.printStackTrace(); - } - } - - @Override - public void 
onLoadModelFailed() { - super.onLoadModelFailed(); - } - - @Override - public void onRunModelSuccessed() { - super.onRunModelSuccessed(); - // obtain results and update UI - tvInferenceTime.setText("Inference time: " + predictor.inferenceTime() + " ms"); - Bitmap outputImage = predictor.outputImage(); - if (outputImage != null) { - ivInputImage.setImageBitmap(outputImage); - } - tvOutputResult.setText(predictor.outputResult()); - tvOutputResult.scrollTo(0, 0); - } - - @Override - public void onRunModelFailed() { - super.onRunModelFailed(); - } - - @Override - public void onImageChanged(Bitmap image) { - super.onImageChanged(image); - // rerun model if users pick test image from gallery or camera - if (image != null && predictor.isLoaded()) { -// predictor.setConfig(config); - predictor.setInputImage(image); - runModel(); - } - } - - @Override - public void onImageChanged(String path) { - super.onImageChanged(path); - Bitmap image = BitmapFactory.decodeFile(path); - predictor.setInputImage(image); - runModel(); - } - public void onSettingsClicked() { - super.onSettingsClicked(); - startActivity(new Intent(ImgSegActivity.this, ImgSegSettingsActivity.class)); - } - - @Override - public boolean onPrepareOptionsMenu(Menu menu) { - boolean isLoaded = predictor.isLoaded(); - menu.findItem(R.id.open_gallery).setEnabled(isLoaded); - menu.findItem(R.id.take_photo).setEnabled(isLoaded); - return super.onPrepareOptionsMenu(menu); - } - - @Override - protected void onResume() { - Log.i(TAG,"begin onResume"); - super.onResume(); - - SharedPreferences sharedPreferences = PreferenceManager.getDefaultSharedPreferences(this); - boolean settingsChanged = false; - String model_path = sharedPreferences.getString(getString(R.string.ISG_MODEL_PATH_KEY), - getString(R.string.ISG_MODEL_PATH_DEFAULT)); - String label_path = sharedPreferences.getString(getString(R.string.ISG_LABEL_PATH_KEY), - getString(R.string.ISG_LABEL_PATH_DEFAULT)); - String image_path = 
sharedPreferences.getString(getString(R.string.ISG_IMAGE_PATH_KEY), - getString(R.string.ISG_IMAGE_PATH_DEFAULT)); - settingsChanged |= !model_path.equalsIgnoreCase(config.modelPath); - settingsChanged |= !label_path.equalsIgnoreCase(config.labelPath); - settingsChanged |= !image_path.equalsIgnoreCase(config.imagePath); - int cpu_thread_num = Integer.parseInt(sharedPreferences.getString(getString(R.string.ISG_CPU_THREAD_NUM_KEY), - getString(R.string.ISG_CPU_THREAD_NUM_DEFAULT))); - settingsChanged |= cpu_thread_num != config.cpuThreadNum; - String cpu_power_mode = - sharedPreferences.getString(getString(R.string.ISG_CPU_POWER_MODE_KEY), - getString(R.string.ISG_CPU_POWER_MODE_DEFAULT)); - settingsChanged |= !cpu_power_mode.equalsIgnoreCase(config.cpuPowerMode); - String input_color_format = - sharedPreferences.getString(getString(R.string.ISG_INPUT_COLOR_FORMAT_KEY), - getString(R.string.ISG_INPUT_COLOR_FORMAT_DEFAULT)); - settingsChanged |= !input_color_format.equalsIgnoreCase(config.inputColorFormat); - long[] input_shape = - Utils.parseLongsFromString(sharedPreferences.getString(getString(R.string.ISG_INPUT_SHAPE_KEY), - getString(R.string.ISG_INPUT_SHAPE_DEFAULT)), ","); - - settingsChanged |= input_shape.length != config.inputShape.length; - - if (!settingsChanged) { - for (int i = 0; i < input_shape.length; i++) { - settingsChanged |= input_shape[i] != config.inputShape[i]; - } - } - - if (settingsChanged) { - config.init(model_path,label_path,image_path,cpu_thread_num,cpu_power_mode, - input_color_format,input_shape); - preprocess.init(config); - // update UI - tvInputSetting.setText("Model: " + config.modelPath.substring(config.modelPath.lastIndexOf("/") + 1) + "\n" + "CPU" + - " Thread Num: " + Integer.toString(config.cpuThreadNum) + "\n" + "CPU Power Mode: " + config.cpuPowerMode); - tvInputSetting.scrollTo(0, 0); - // reload model if configure has been changed - loadModel(); - } - } - - @Override - protected void onDestroy() { - if (predictor != null) { 
- predictor.releaseModel(); - } - super.onDestroy(); - } -} diff --git a/deploy/lite/humanseg-android-demo/app/src/main/jniLibs/arm64-v8a/libhiai.so b/deploy/lite/humanseg-android-demo/app/src/main/jniLibs/arm64-v8a/libhiai.so deleted file mode 100644 index 8b6c40b403ecaa9ace3dbc44eb328c1ad928775b..0000000000000000000000000000000000000000 Binary files a/deploy/lite/humanseg-android-demo/app/src/main/jniLibs/arm64-v8a/libhiai.so and /dev/null differ diff --git a/deploy/lite/humanseg-android-demo/app/src/main/jniLibs/arm64-v8a/libpaddle_lite_jni.so b/deploy/lite/humanseg-android-demo/app/src/main/jniLibs/arm64-v8a/libpaddle_lite_jni.so deleted file mode 100644 index b8d79c61f2981f6c7581ad2dd5aa4547ca11aad6..0000000000000000000000000000000000000000 Binary files a/deploy/lite/humanseg-android-demo/app/src/main/jniLibs/arm64-v8a/libpaddle_lite_jni.so and /dev/null differ diff --git a/deploy/lite/humanseg-android-demo/app/src/main/jniLibs/armeabi-v7a/libhiai.so b/deploy/lite/humanseg-android-demo/app/src/main/jniLibs/armeabi-v7a/libhiai.so deleted file mode 100644 index f0ba095c525217f288d9db98dc853882bf7ba6ed..0000000000000000000000000000000000000000 Binary files a/deploy/lite/humanseg-android-demo/app/src/main/jniLibs/armeabi-v7a/libhiai.so and /dev/null differ diff --git a/deploy/lite/humanseg-android-demo/app/src/main/jniLibs/armeabi-v7a/libpaddle_lite_jni.so b/deploy/lite/humanseg-android-demo/app/src/main/jniLibs/armeabi-v7a/libpaddle_lite_jni.so deleted file mode 100644 index 5947bf2b4dd44d11585dc53f2a7e9256c3396cf4..0000000000000000000000000000000000000000 Binary files a/deploy/lite/humanseg-android-demo/app/src/main/jniLibs/armeabi-v7a/libpaddle_lite_jni.so and /dev/null differ diff --git a/deploy/lite/humanseg-android-demo/app/src/main/res/drawable/image_segementation.jpg b/deploy/lite/humanseg-android-demo/app/src/main/res/drawable/image_segementation.jpg deleted file mode 100644 index 
234044abb6b978124c811c9a632b80e29c002c3f..0000000000000000000000000000000000000000 Binary files a/deploy/lite/humanseg-android-demo/app/src/main/res/drawable/image_segementation.jpg and /dev/null differ diff --git a/deploy/lite/humanseg-android-demo/app/src/main/res/layout/activity_main.xml b/deploy/lite/humanseg-android-demo/app/src/main/res/layout/activity_main.xml deleted file mode 100644 index 84f15a20fde16981d3d05c8389c66cffd35633ef..0000000000000000000000000000000000000000 --- a/deploy/lite/humanseg-android-demo/app/src/main/res/layout/activity_main.xml +++ /dev/null @@ -1,58 +0,0 @@ - - - - - - - - - - - - - - - - - - \ No newline at end of file diff --git a/deploy/lite/humanseg-android-demo/app/src/main/res/values/strings.xml b/deploy/lite/humanseg-android-demo/app/src/main/res/values/strings.xml deleted file mode 100644 index 88b26f593cdc0d619a163221d59f96b931392f21..0000000000000000000000000000000000000000 --- a/deploy/lite/humanseg-android-demo/app/src/main/res/values/strings.xml +++ /dev/null @@ -1,20 +0,0 @@ - -Segmentation-demo - -ISG_CHOOSE_PRE_INSTALLED_MODEL_KEY -ISG_ENABLE_CUSTOM_SETTINGS_KEY -ISG_MODEL_PATH_KEY -ISG_LABEL_PATH_KEY -ISG_IMAGE_PATH_KEY -ISG_CPU_THREAD_NUM_KEY -ISG_CPU_POWER_MODE_KEY -ISG_INPUT_COLOR_FORMAT_KEY -ISG_INPUT_SHAPE_KEY -image_segmentation/models/deeplab_mobilenet_for_cpu -image_segmentation/labels/label_list -image_segmentation/images/human.jpg -1 -LITE_POWER_HIGH -RGB -1,3,513,513 - diff --git a/deploy/python/docs/PaddleSeg_Infer_Benchmark.md b/deploy/python/docs/PaddleSeg_Infer_Benchmark.md index bfe0f4eca91a50c7112cb8678f6b008d5bb26a21..196e3c3055fa3be6d71b18716216713a6127d301 100644 --- a/deploy/python/docs/PaddleSeg_Infer_Benchmark.md +++ b/deploy/python/docs/PaddleSeg_Infer_Benchmark.md @@ -1,4 +1,4 @@ -# PaddleSeg Segmentation Model Inference Performance Test +# PaddleSeg Segmentation Model Inference Benchmark ## Test Software Environment - CUDA 9.0 @@ -9,15 +9,6 @@ - GPU: Tesla V100 - CPU: Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz -## Test Method -- Input: 1000 RGB images, with batch_size fixed at 1. -- 
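Run multiple rounds, discard the first (warm-up) round, and average the time of the remaining rounds; the time covers copying data to the GPU, inference-engine computation, and copying the prediction results back to the CPU. -- Use the Fluid C++ inference engine -- FLAGS_cudnn_exhaustive_search=True was enabled during testing, so convolution algorithms are selected by exhaustive search. -- For each model, we tested the inference speed of both the `OP`-optimized model and the native model, each with and without `FP16` and `FP32` enabled - - - ## Inference Speed Test Data **Note**: `OP-optimized model` refers to the new-style models exported by `PaddleSeg 0.3.0` and later, which move image pre-processing and post-processing onto the GPU for acceleration and better performance. Each model covers three `eval_crop_size` values: `192x192`/`512x512`/`768x768`. @@ -501,7 +492,7 @@ -### 3. Effect of different EVAL_CROP_SIZE values on image 想能 +### 3. Effect of different EVAL_CROP_SIZE values on image performance Comparison chart for `deeplabv3p_xception`: ![xception](https://paddleseg.bj.bcebos.com/inference/benchmark/xception.png) diff --git a/deploy/python/docs/compile_paddle_with_tensorrt.md b/deploy/python/docs/compile_paddle_with_tensorrt.md index e2afad0519867a98363776ebd2879d06adf08bc4..cca07daf7b50a7fd926024ed912db94a869a64d4 100644 --- a/deploy/python/docs/compile_paddle_with_tensorrt.md +++ b/deploy/python/docs/compile_paddle_with_tensorrt.md @@ -11,11 +11,11 @@ ## 2. Install TensorRT 5.1 -Please refer to `Nvidia`'s [official installation guide](https://docs.nvidia.com/deeplearning/sdk/tensorrt-install-guide/index.html) +Please refer to Nvidia's [official installation guide](https://docs.nvidia.com/deeplearning/sdk/tensorrt-install-guide/index.html) ## 3. Build PaddlePaddle - Here we assume `Python` version `3.7` and the following installation paths for `cuda` `cudnn` `tensorRT`: + Here we assume `Python` version `3.7` and the following installation paths for `CUDA` `cuDNN` `TensorRT`: ```bash # Assume cuda is installed at /usr/local/cuda-9.0/ diff --git a/docs/annotation/jingling2seg.md b/docs/annotation/jingling2seg.md index 2637df5146e6bd5027600a26a42d5c2a6d3ece80..de36b5395bc599f83875c6732c1372bd86862c3c 100644 --- a/docs/annotation/jingling2seg.md +++ b/docs/annotation/jingling2seg.md @@ -44,7 +44,7 @@ **Note: exported annotation files are placed in the `outputs` directory under `保存位置` (the save location).** -For the ground-truth files produced by the Jingling annotation tool, see our folder `docs/annotation/jingling_demo`. +For the ground-truth files produced by the Jingling annotation tool, see our folder [docs/annotation/jingling_demo](jingling_demo)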
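@@ -54,6 +54,7 @@ **Note:** For objects with a hole in the middle (for example, a swim ring), annotating the hole is not yet supported. If you need this, use [labelme](./labelme2seg.md) instead. ## 3 Data Format Conversion +Finally, use the conversion script we provide to convert the output of the annotation tool above into the data format required for model training. * After conversion, the dataset directory structure is as follows: @@ -84,13 +85,18 @@ pip install pillow * Run the following command to convert the annotated data into a dataset in the format above: ``` - python pdseg/tools/jingling2seg.py +python pdseg/tools/jingling2seg.py ``` -Here, `` is the directory containing the json files produced by the Jingling annotation tool, usually the `outputs` directory under `保存位置` (the save location) from step (3) of the Jingling tool usage. +Here, `` is the directory containing the json files produced by the Jingling annotation tool, usually the `outputs` directory under `保存位置` (the save location) from step (3) of the Jingling tool usage. +A built-in annotation example is included; run the following command to try it: -For the converted dataset, see our folder `docs/annotation/jingling_demo`. The file `class_names.txt` lists the names of all annotated classes in the dataset, including the background class; the `annotations` folder holds the pixel-level ground truth for each image, where the background class `_background_` maps to 0 and the other classes count up from 1, to at most 255. +``` +python pdseg/tools/jingling2seg.py docs/annotation/jingling_demo/outputs/ +``` + +For the converted dataset, see our folder [docs/annotation/jingling_demo](jingling_demo). The file `class_names.txt` lists the names of all annotated classes in the dataset, including the background class; the `annotations` folder holds the pixel-level ground truth for each image, where the background class `_background_` maps to 0 and the other classes count up from 1, to at most 255.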
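The labeling rule described in this hunk (background `_background_` is 0, other classes count up from 1, at most 255) can be sketched roughly as follows. This is not the code of `pdseg/tools/jingling2seg.py`; the helper names `polygon_points` and `build_class_ids` are hypothetical, and the sample record is an abridged stand-in modeled on the demo JSON in this change.

```python
import json

BACKGROUND = "_background_"  # background class name, mapped to label 0


def polygon_points(polygon):
    """Return [(x1, y1), (x2, y2), ...] from a polygon dict keyed x1, y1, x2, y2, ..."""
    points = []
    i = 1
    while f"x{i}" in polygon and f"y{i}" in polygon:
        points.append((polygon[f"x{i}"], polygon[f"y{i}"]))
        i += 1
    return points


def build_class_ids(annotations):
    """Map class names to label values: background is 0, others count up from 1."""
    class_ids = {BACKGROUND: 0}
    for ann in annotations:
        for obj in ann.get("outputs", {}).get("object", []):
            name = obj["name"]
            if name not in class_ids:
                class_ids[name] = len(class_ids)
    # The largest label value is len(class_ids) - 1 and must fit in 0..255.
    if len(class_ids) - 1 > 255:
        raise ValueError("label values must not exceed 255")
    return class_ids


# Abridged, hypothetical record: only the first three polygon vertices are kept.
sample = json.loads(
    '{"outputs": {"object": [{"name": "person", "polygon": '
    '{"x1": 321.99, "y1": 63, "x2": 293, "y2": 98.01, "x3": 245.01, "y3": 141.01}}]},'
    ' "size": {"width": 706, "height": 1000, "depth": 3}}'
)
class_ids = build_class_ids([sample])
points = polygon_points(sample["outputs"]["object"][0]["polygon"])
print(class_ids)
print(points)
```

Rasterizing the polygon into the `annotations` PNG is left out here, since it would depend on an imaging library such as Pillow and on details of the real script that this diff does not show.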
diff --git a/docs/annotation/jingling_demo/aa63d7e6db0d03137883772c246c6761fc201059.jpg b/docs/annotation/jingling_demo/jingling.jpg similarity index 100% rename from docs/annotation/jingling_demo/aa63d7e6db0d03137883772c246c6761fc201059.jpg rename to docs/annotation/jingling_demo/jingling.jpg diff --git a/docs/annotation/jingling_demo/outputs/aa63d7e6db0d03137883772c246c6761fc201059.json b/docs/annotation/jingling_demo/outputs/aa63d7e6db0d03137883772c246c6761fc201059.json deleted file mode 100644 index 69d80205de92afc9cffa304b32ff0e3e95502687..0000000000000000000000000000000000000000 --- a/docs/annotation/jingling_demo/outputs/aa63d7e6db0d03137883772c246c6761fc201059.json +++ /dev/null @@ -1 +0,0 @@ -{"path":"/Users/dataset/aa63d7e6db0d03137883772c246c6761fc201059.jpg","outputs":{"object":[{"name":"person","polygon":{"x1":321.99,"y1":63,"x2":293,"y2":98.00999999999999,"x3":245.01,"y3":141.01,"x4":221,"y4":194,"x5":231.99,"y5":237,"x6":231.99,"y6":348.01,"x7":191,"y7":429,"x8":197,"y8":465.01,"x9":193,"y9":586,"x10":151,"y10":618.01,"x11":124,"y11":622,"x12":100,"y12":703,"x13":121.99,"y13":744,"x14":141.99,"y14":724,"x15":163,"y15":658.01,"x16":238.01,"y16":646,"x17":259,"y17":627,"x18":313,"y18":618.01,"x19":416,"y19":639,"x20":464,"y20":606,"x21":454,"y21":555.01,"x22":404,"y22":508.01,"x23":430,"y23":489,"x24":407,"y24":464,"x25":397,"y25":365.01,"x26":407,"y26":290,"x27":361.99,"y27":252,"x28":376,"y28":215.01,"x29":391.99,"y29":189,"x30":388.01,"y30":135.01,"x31":340,"y31":120,"x32":313,"y32":161.01,"x33":307,"y33":188.01,"x34":311,"y34":207,"x35":277,"y35":186,"x36":293,"y36":137,"x37":308.01,"y37":117,"x38":361,"y38":93}}]},"time_labeled":1568101256852,"labeled":true,"size":{"width":706,"height":1000,"depth":3}} \ No newline at end of file diff --git a/docs/annotation/jingling_demo/outputs/annotations/aa63d7e6db0d03137883772c246c6761fc201059.png b/docs/annotation/jingling_demo/outputs/annotations/aa63d7e6db0d03137883772c246c6761fc201059.png deleted file 
mode 100644 index 8dfbff7b73bcfff7ef79b904667241731641d4a4..0000000000000000000000000000000000000000 Binary files a/docs/annotation/jingling_demo/outputs/annotations/aa63d7e6db0d03137883772c246c6761fc201059.png and /dev/null differ diff --git a/docs/annotation/jingling_demo/outputs/annotations/jingling.png b/docs/annotation/jingling_demo/outputs/annotations/jingling.png new file mode 100644 index 0000000000000000000000000000000000000000..526acefdcdd8317c5778a5d47495d7049a46269d Binary files /dev/null and b/docs/annotation/jingling_demo/outputs/annotations/jingling.png differ diff --git a/docs/annotation/jingling_demo/outputs/jingling.json b/docs/annotation/jingling_demo/outputs/jingling.json new file mode 100644 index 0000000000000000000000000000000000000000..0021522487a26f66dadc979a96ea631c0314adab --- /dev/null +++ b/docs/annotation/jingling_demo/outputs/jingling.json @@ -0,0 +1 @@ +{"path":"/Users/dataset/jingling.jpg","outputs":{"object":[{"name":"person","polygon":{"x1":321.99,"y1":63,"x2":293,"y2":98.00999999999999,"x3":245.01,"y3":141.01,"x4":221,"y4":194,"x5":231.99,"y5":237,"x6":231.99,"y6":348.01,"x7":191,"y7":429,"x8":197,"y8":465.01,"x9":193,"y9":586,"x10":151,"y10":618.01,"x11":124,"y11":622,"x12":100,"y12":703,"x13":121.99,"y13":744,"x14":141.99,"y14":724,"x15":163,"y15":658.01,"x16":238.01,"y16":646,"x17":259,"y17":627,"x18":313,"y18":618.01,"x19":416,"y19":639,"x20":464,"y20":606,"x21":454,"y21":555.01,"x22":404,"y22":508.01,"x23":430,"y23":489,"x24":407,"y24":464,"x25":397,"y25":365.01,"x26":407,"y26":290,"x27":361.99,"y27":252,"x28":376,"y28":215.01,"x29":391.99,"y29":189,"x30":388.01,"y30":135.01,"x31":340,"y31":120,"x32":313,"y32":161.01,"x33":307,"y33":188.01,"x34":311,"y34":207,"x35":277,"y35":186,"x36":293,"y36":137,"x37":308.01,"y37":117,"x38":361,"y38":93}}]},"time_labeled":1568101256852,"labeled":true,"size":{"width":706,"height":1000,"depth":3}} \ No newline at end of file diff --git a/docs/annotation/labelme2seg.md 
b/docs/annotation/labelme2seg.md
index a270591d06131ec48f4ebb0d25ec206031956a24..235e3c41b6a79ece0b7512955aba04fe06faabe3 100644
--- a/docs/annotation/labelme2seg.md
+++ b/docs/annotation/labelme2seg.md
@@ -47,7 +47,7 @@ git clone https://github.com/wkentaro/labelme

 (3) Once every object in the image has been annotated, click `Save` to save the json file (**keep the json file in the same folder as the image**), then click `Next Image` to annotate the next image.

-For the ground-truth files produced by LabelMe, see the provided folder `docs/annotation/labelme_demo`.
+For the ground-truth files produced by LabelMe, see the provided folder [docs/annotation/labelme_demo](labelme_demo).
@@ -64,6 +64,7 @@ For the ground-truth files produced by LabelMe, see the provided folder `docs/annotation/la
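 ## 3 Data format conversion

+Finally, use the data conversion script we provide to convert the output of the annotation tool above into the data format required for model training.

 * After conversion, the dataset directory is structured as follows:

@@ -94,12 +95,18 @@ pip install pillow
 * Run the following command to convert the annotated data into a dataset that matches the format above:

 ```
- python pdseg/tools/labelme2seg.py
+ python pdseg/tools/labelme2seg.py
 ```

-Here, `` is the folder that contains the images and the json files produced by LabelMe; the converted annotation set is written to the same folder.
+Here, `` is the folder that contains the images and the json files produced by LabelMe; the converted annotation set is written to the same folder.

-For the converted dataset, see the provided folder `docs/annotation/labelme_demo`. The file `class_names.txt` lists the names of all annotated classes in the dataset, including the background class. The folder `annotations` holds the pixel-level ground truth of each image: the background class `_background_` maps to 0, and the other object classes are numbered upward from 1, up to 255.
+We have built in an annotated example; run the following command to try it:
+
+```
+python pdseg/tools/labelme2seg.py docs/annotation/labelme_demo/
+```
+
+For the converted dataset, see the provided folder [docs/annotation/labelme_demo](labelme_demo). The file `class_names.txt` lists the names of all annotated classes in the dataset, including the background class. The folder `annotations` holds the pixel-level ground truth of each image: the background class `_background_` maps to 0, and the other object classes are numbered upward from 1, up to 255.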
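The labelling convention described in the labelme2seg.md hunk above (`_background_` is 0, other classes count upward from 1, at most 255) can be illustrated with a minimal sketch. The helper name `build_label_map` and the first-seen ordering of classes are assumptions made for illustration; the actual `labelme2seg.py` script may order classes differently.

```python
def build_label_map(class_names):
    """Map class names to label values: `_background_` -> 0, others from 1 upward.

    Hypothetical helper, not part of PaddleSeg. Classes keep first-seen order,
    which is an assumption about how the real conversion script behaves.
    """
    ordered = ["_background_"]
    for name in class_names:
        if name not in ordered:
            ordered.append(name)
    # A single-channel label image can hold values 0..255,
    # so at most 255 foreground classes fit next to the background.
    if len(ordered) - 1 > 255:
        raise ValueError("too many classes for a single-channel label image")
    return {name: value for value, name in enumerate(ordered)}


if __name__ == "__main__":
    # Duplicate names and an explicit background entry are tolerated.
    print(build_label_map(["person", "_background_", "dog", "person"]))
```

A map like this could be compared against `class_names.txt` to spot a class that was annotated but never listed.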
diff --git a/docs/annotation/labelme_demo/annotations/2011_000025.png b/docs/annotation/labelme_demo/annotations/2011_000025.png
index dcf7c96517d4870f6e83293cef62e3285e5b37e3..0b5a56dda153c92f4411ac7d71665aaf93111e10 100644
Binary files a/docs/annotation/labelme_demo/annotations/2011_000025.png and b/docs/annotation/labelme_demo/annotations/2011_000025.png differ
diff --git a/docs/benchmark.md b/docs/benchmark.md
deleted file mode 100644
index c1e6de2fcee971437c29e370e9410f9d00c9145f..0000000000000000000000000000000000000000
--- a/docs/benchmark.md
+++ /dev/null
@@ -1,17 +0,0 @@
-# PaddleSeg performance benchmark
-
-## Training performance
-
-### Multi-GPU speedup
-
-### GPU memory usage comparison
-
-## Inference performance comparison
-
-### Windows
-
-### Linux
-
-#### Naive
-
-#### Analysis
diff --git a/docs/check.md b/docs/check.md
index fac9520f11ef46d3628ecab3fcc4127a468a3ca5..20dc87f7e10d856f050a80554adb9c93d0ff05e3 100644
--- a/docs/check.md
+++ b/docs/check.md
@@ -55,7 +55,7 @@ Doing label pixel statistics:

 - When `AUG.AUG_METHOD` is stepscaling, the width and height of `EVAL_CROP_SIZE` should be no smaller than the largest width and height among the original images.

-- When `AUG.AUG_METHOD` is rangscaling, the width and height of `EVAL_CROP_SIZE` should be no smaller than the largest width and height among the resized images.
+- When `AUG.AUG_METHOD` is rangescaling, the width and height of `EVAL_CROP_SIZE` should be no smaller than the largest width and height among the resized images.

 ### 11 Validation of the data augmentation parameter `AUG.INF_RESIZE_VALUE`
 Checks whether `AUG.INF_RESIZE_VALUE` lies within [`AUG.MIN_RESIZE_VALUE`~`AUG.MAX_RESIZE_VALUE`]. The check passes if it does.
diff --git a/docs/config.md b/docs/config.md
index 387af4d4e18dc0e5b8cee7baa96ecf5b713f03ab..67e1353a7d88994b584d5bd3da4dd36d81430a59 100644
--- a/docs/config.md
+++ b/docs/config.md
@@ -1,18 +1,281 @@
-# PaddleSeg segmentation library configuration guide
+# Script usage and configuration guide

-PaddleSeg provides a unified configuration for training/evaluation/visualization/model export
+PaddleSeg provides scripts for 4 functions: **training**, **evaluation**, **visualization** and **model export**. Every script accepts Flags that switch on specific features and Options that override the default training configuration. The scripts are used in nearly the same way:
+
+```shell
+# Training
+python pdseg/train.py ${FLAGS} ${OPTIONS}
+# Evaluation
+python pdseg/eval.py ${FLAGS} ${OPTIONS}
+# Visualization
+python pdseg/vis.py ${FLAGS} ${OPTIONS}
+# Model export
+python pdseg/export_model.py ${FLAGS} ${OPTIONS}
+```
+
+**Note:** FLAGS must come before OPTIONS; otherwise an error is raised, as in the following example:
+
+```shell
+# FLAGS "--cfg
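configs/unet_optic.yaml" must come before OPTIONS "BATCH_SIZE 1"
+python pdseg/train.py BATCH_SIZE 1 --cfg configs/unet_optic.yaml
+```
+
+## Command-line FLAGS
+
+|FLAG|Purpose|Supported scripts|Default|Remarks|
+|-|-|-|-|-|
+|--cfg|Path to the configuration file|ALL|None||
+|--use_gpu|Whether to run on the GPU|train/eval/vis|False||
+|--use_mpio|Whether to use multiple processes for I/O|train/eval|False|Turning this on uses some extra CPU memory but speeds up training. **NOTE:** this feature is not supported on Windows. When training on a custom dataset for the first time, leave it off, because turning it on hides data-reading errors.|
+|--use_tb|Whether to record training data with TensorBoard|train|False||
+|--log_steps|Interval between training log prints (in steps)|train|10||
+|--debug|Whether to print debug information|train|False|Metrics such as IoU require computing a confusion matrix, which slows training|
+|--tb_log_dir|TensorBoard log directory|train|None||
+|--do_eval|Whether to evaluate the model each time it is saved|train|False||
+|--vis_dir|Directory where visualization images are saved|vis|"visual"||
+
+## OPTIONS
+
+PaddleSeg provides a unified configuration for training/evaluation/visualization/model export. There are three sources of configuration:
+* Arguments passed on the command line.
+* The yaml files under the configs directory.
+* The default parameters, located in pdseg/utils/config.py.
+
+Their priority is: command line > yaml > default configuration.

 The configuration contains the following Groups:

-* [General](./configs/basic_group.md)
-* [DATASET](./configs/dataset_group.md)
-* [DATALOADER](./configs/dataloader_group.md)
-* [FREEZE](./configs/freeze_group.md)
-* [MODEL](./configs/model_group.md)
-* [SOLVER](./configs/solver_group.md)
-* [TRAIN](./configs/train_group.md)
-* [TEST](./configs/test_group.md)
-
-`Note`:
-
- See pdseg/utils/config.py for the code
+|OPTIONS|Purpose|Supported scripts|
+|-|-|-|
+|[BASIC](./configs/basic_group.md)|General settings|ALL|
+|[DATASET](./configs/dataset_group.md)|Dataset settings|train/eval/vis|
+|[MODEL](./configs/model_group.md)|Model settings|ALL|
+|[TRAIN](./configs/train_group.md)|Training settings|train|
+|[SOLVER](./configs/solver_group.md)|Training optimization settings|train|
+|[TEST](./configs/test_group.md)|Model testing settings|eval/vis/export_model|
+|[AUG](./data_aug.md)|Data augmentation|ALL|
+|[FREEZE](./configs/freeze_group.md)|Model export settings|export_model|
+|[DATALOADER](./configs/dataloader_group.md)|Data loading settings|ALL|
+
+Before running a custom segmentation task, you need to prepare a yaml file; we recommend adapting one of the [example yaml files under the configs directory](../configs).
+
+The default PaddleSeg configuration is listed below for reference.
+
+```yaml
+########################## Basic configuration ###########################################
+# Batch size
+BATCH_SIZE: 1
+# Image crop size during validation (width, height)
+EVAL_CROP_SIZE: tuple()
+# Image crop size during training (width, height)
+TRAIN_CROP_SIZE: tuple()
+
+########################## Dataset configuration #########################################
+DATASET:
+    # Dataset root directory
+    DATA_DIR: './dataset/cityscapes/'
+    # Training file list
+    TRAIN_FILE_LIST: './dataset/cityscapes/train.list'
+    # Validation file list
+    VAL_FILE_LIST: './dataset/cityscapes/val.list'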
+    # Test file list
+    TEST_FILE_LIST: './dataset/cityscapes/test.list'
+    # Dataset visualized in TensorBoard
+    VIS_FILE_LIST: None
+    # Number of classes (including the background class)
+    NUM_CLASSES: 19
+    # Input image type: three-channel 'rgb', four-channel 'rgba' or single-channel grayscale 'gray'
+    IMAGE_TYPE: 'rgb'
+    # Number of channels of the input image
+    DATA_DIM: 3
+    # Separator used in the file lists, a space by default
+    SEPARATOR: ' '
+    # Label value of pixels to ignore, 255 by default; usually left unchanged
+    IGNORE_INDEX: 255
+
+########################## General model configuration #######################################
+MODEL:
+    # Model name; deeplabv3p, unet, icnet, pspnet and hrnet are supported
+    MODEL_NAME: ''
+    # Norm type: bn, gn (group_norm)
+    DEFAULT_NORM_TYPE: 'bn'
+    # Weights of the multi-branch losses
+    MULTI_LOSS_WEIGHT: [1.0]
+    # Number of groups when DEFAULT_NORM_TYPE is gn
+    DEFAULT_GROUP_NUMBER: 32
+    # Small epsilon that guards against division by zero; usually left unchanged
+    DEFAULT_EPSILON: 1e-5
+    # BatchNorm momentum; usually left unchanged
+    BN_MOMENTUM: 0.99
+    # Whether to train with FP16
+    FP16: False
+
+    ########################## DeepLab model configuration ####################################
+    DEEPLAB:
+        # DeepLab backbone; options are xception_65 and mobilenetv2
+        BACKBONE: "xception_65"
+        # DeepLab output stride
+        OUTPUT_STRIDE: 16
+        # MobileNet v2 backbone scale
+        DEPTH_MULTIPLIER: 1.0
+        # Whether the encoder uses ASPP
+        ENCODER_WITH_ASPP: True
+        # Whether to enable the decoder
+        ENABLE_DECODER: True
+        # Whether ASPP uses separable convolution
+        ASPP_WITH_SEP_CONV: True
+        # Whether the decoder uses separable convolution
+        DECODER_USE_SEP_CONV: True
+
+    ########################## UNET model configuration #######################################
+    UNET:
+        # Upsampling method, bilinear interpolation by default
+        UPSAMPLE_MODE: 'bilinear'
+
+    ########################## ICNET model configuration ######################################
+    ICNET:
+        # ResNet backbone scale
+        DEPTH_MULTIPLIER: 0.5
+        # Number of ResNet layers
+        LAYERS: 50
+
+    ########################## PSPNET model configuration ######################################
+    PSPNET:
+        # ResNet backbone scale
+        DEPTH_MULTIPLIER: 1
+        # Number of ResNet backbone layers
+        LAYERS: 50
+
+    ########################## HRNET model configuration ######################################
+    HRNET:
+        # HRNET STAGE2 settings
+        STAGE2:
+            NUM_MODULES: 1
+            NUM_CHANNELS: [40, 80]
+        # HRNET STAGE3 settings
+        STAGE3:
+            NUM_MODULES: 4
+            NUM_CHANNELS: [40, 80, 160]
+        # HRNET STAGE4 settings
+        STAGE4:
+            NUM_MODULES: 3
+            NUM_CHANNELS: [40, 80, 160, 320]
+
+########################### Training configuration ##########################################
+TRAIN:
+    # Directory where models are saved
+    MODEL_SAVE_DIR: ''
+    # Directory of the pretrained model
+    PRETRAINED_MODEL_DIR: ''
+    # Directory to resume training from
+    RESUME_MODEL_DIR: ''
+    # Whether to synchronize the BatchNorm mean and variance across cards
+    SYNC_BATCH_NORM: False
+    # Number of epochs between saved snapshots; snapshots can be used to resume interrupted training
+    SNAPSHOT_EPOCH: 10
+
+########################### Model optimization configuration ##################################
+SOLVER:
+    # Initial learning rate
+    LR: 0.1
+    # Learning-rate decay policy: poly, piecewise or cosine
+    LR_POLICY: "poly"
+    # Optimizer: SGD or Adam
+    OPTIMIZER: "sgd"
+    # Momentum
+    MOMENTUM: 0.9
+    # Exponential decay rate of the second-moment estimate
+    MOMENTUM2: 0.999
+    # Power of the poly learning-rate decay
+    POWER: 0.9
+    # Decay factor of the step policy
+    GAMMA: 0.1
+    # Epochs at which the step policy decays
+    DECAY_EPOCH: [10, 20]
+    # Weight decay, 0-1
+    WEIGHT_DECAY: 0.00004
+    # Epoch at which training starts, 1 by default
+    BEGIN_EPOCH: 1
+    # Number of training epochs, a positive integer
+    NUM_EPOCHS: 30
+    # Loss selection: softmax_loss, bce_loss and dice_loss are supported
+    LOSS: ["softmax_loss"]
+    # Whether to enable learning-rate warmup
+    LR_WARMUP: False
+    # Number of warmup iterations
+    LR_WARMUP_STEPS: 2000
+
+########################## Test configuration ###########################################
+TEST:
+    # Path of the model to test
+    TEST_MODEL: ''
+
+########################### Data augmentation configuration ######################################
+AUG:
+    # There are three ways to resize an image:
+    # unpadding (fixed size), stepscaling (resize by a ratio), rangescaling (align the long side)
+    AUG_METHOD: 'unpadding'
+
+    # Fixed size the image is resized to (width, height), non-negative
+    FIX_RESIZE_SIZE: (500, 500)
+
+    # For stepscaling: minimum resize scale, non-negative
+    MIN_SCALE_FACTOR: 0.5
+    # For stepscaling: maximum resize scale, no smaller than MIN_SCALE_FACTOR
+    MAX_SCALE_FACTOR: 2.0
+    # For stepscaling: step between resize scales, non-negative
+    SCALE_STEP_SIZE: 0.25
+
+    # For rangescaling: minimum long-side length during training, non-negative
+    MIN_RESIZE_VALUE: 400
+    # For rangescaling: maximum long-side length during training,
+    # no smaller than MIN_RESIZE_VALUE
+    MAX_RESIZE_VALUE: 600
+    # For rangescaling: long-side length used in test/validation/visualization mode,
+    # within the range MIN_RESIZE_VALUE to MAX_RESIZE_VALUE
+    INF_RESIZE_VALUE: 500
+
+    # Horizontal (left-right) mirroring
+    MIRROR: True
+    # Vertical (up-down) flip switch, True/False
+    FLIP: False
+    # Probability of applying the vertical flip, 0-1
+    FLIP_RATIO: 0.5
+
+    RICH_CROP:
+        # RichCrop augmentation switch, used to improve model robustness
+        ENABLE: False
+        # Maximum rotation angle, 0-90
+        MAX_ROTATION: 15
+        # Area ratio of the cropped image to the original image, 0-1
+        MIN_AREA_RATIO: 0.5
+        # Aspect-ratio range of the cropped image, non-negative
+        ASPECT_RATIO: 0.33
+        # Brightness adjustment range, 0-1
+        BRIGHTNESS_JITTER_RATIO: 0.5
+        # Saturation adjustment range, 0-1
+        SATURATION_JITTER_RATIO: 0.5
+        # Contrast adjustment range, 0-1
+        CONTRAST_JITTER_RATIO: 0.5
+        # Blur switch, True/False
+        BLUR: False
+        # Probability of applying blur, 0-1
+        BLUR_RATIO: 0.1
+
+########################## Inference deployment model configuration ###################################
+FREEZE:
+    # File name of the saved inference model
+    MODEL_FILENAME: '__model__'
+    # File name of the saved inference parameters
+    PARAMS_FILENAME: '__params__'
+    # Directory where the inference model and parameters are saved
+    SAVE_DIR: 'freeze_model'
+
+########################## Data loading configuration #######################################
+DATALOADER:
+    # Number of concurrent data-loading workers, recommended value 8
+    NUM_WORKERS: 8
+    # Size of the data-loading buffer queue, recommended value 256
+    BUF_SIZE: 256
+```
+
diff --git a/docs/configs/basic_group.md b/docs/configs/basic_group.md
index c66752f38e153084601c89e0aeb0c9385f02885b..dbe22b91da0632ad6b0b435582495b784aa2b276 100644
--- a/docs/configs/basic_group.md
+++ b/docs/configs/basic_group.md
@@ -2,70 +2,58 @@

 BASIC Group holds all general-purpose settings

-## `MEAN`
+## `BATCH_SIZE`

-Mean subtracted during image preprocessing (format: *[R, G, B]*)
+Batch size used for training, evaluation and visualization

 ### Default value

-[0.5, 0.5, 0.5]
+1 (set according to your actual needs)

-
-
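+### Notes

-## `STD`

-Standard deviation that image preprocessing divides by (format: *[R, G, B]*)
+* When multi-GPU execution is specified, PaddleSeg splits the data evenly across the cards, so each card processes BATCH_SIZE // dev_count samples per step

-### Default value
+* For multi-GPU runs, make sure BATCH_SIZE is divisible by dev_count

-[0.5, 0.5, 0.5]
+* A larger BATCH_SIZE helps convergence speed during training but costs more GPU memory. Choose a value after evaluating your actual setup

+* Many of the pretrained models PaddleSeg currently provides contain BN layers; with BATCH_SIZE set to 1, training may become unstable and produce NaN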

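-## `EVAL_CROP_SIZE`
+## `TRAIN_CROP_SIZE`

-Crop size applied to images during evaluation (format: *[width, height]*)
+Crop size applied to images during training (format: *[width, height]*)

 ### Default value

 None (must be set by the user)

 ### Notes

-* The crop size must not be smaller than the original image; set this field to the largest width and height in the evaluation data
+`TRAIN_CROP_SIZE` can be set to any size; the right value depends on the dataset.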

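-## `TRAIN_CROP_SIZE`
+## `EVAL_CROP_SIZE`

-Crop size applied to images during training (format: *[width, height]*)
+Crop size applied to images during evaluation (format: *[width, height]*)

 ### Default value

 None (must be set by the user)

-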
-
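-
-## `BATCH_SIZE`
-
-Batch size used for training, evaluation and visualization
-
-### Default value
-
-1 (set according to your actual needs)
-
 ### Notes
+`EVAL_CROP_SIZE` must satisfy the following conditions, in 3 cases:
+- When `AUG.AUG_METHOD` is unpadding, the width and height of `EVAL_CROP_SIZE` should be no smaller than those of `AUG.FIX_RESIZE_SIZE`.
+- When `AUG.AUG_METHOD` is stepscaling, the width and height of `EVAL_CROP_SIZE` should be no smaller than the largest width and height among the original images.
+- When `AUG.AUG_METHOD` is rangescaling, the width and height of `EVAL_CROP_SIZE` should be no smaller than the largest width and height among the resized images.

-* When multi-GPU execution is specified, PaddleSeg splits the data evenly across the cards, so each card processes BATCH_SIZE // dev_count samples per step

+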
+
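-* For multi-GPU runs, make sure BATCH_SIZE is divisible by dev_count
-* A larger BATCH_SIZE helps convergence speed during training but costs more GPU memory. Choose a value after evaluating your actual setup
-* Many of the pretrained models PaddleSeg currently provides contain BN layers; with BATCH_SIZE set to 1, training may become unstable and produce NaN
-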
-
diff --git a/docs/configs/model_group.md b/docs/configs/model_group.md
index e11b769de7d8aabbd14583e6666045de6cfc5b42..ca8758cdf2e93337da9bcd4400d572e88f006445 100644
--- a/docs/configs/model_group.md
+++ b/docs/configs/model_group.md
@@ -5,11 +5,12 @@ MODEL Group holds all model-related settings; the Group also contains three sub-Grou
 * [DeepLabv3p](./model_deeplabv3p_group.md)
 * [UNet](./model_unet_group.md)
 * [ICNet](./model_icnet_group.md)
+* [PSPNet](./model_pspnet_group.md)
 * [HRNet](./model_hrnet_group.md)

 ## `MODEL_NAME`

-The selected model; the four models `deeplabv3p` `unet` `icnet` `hrnet` are supported
+The selected model; the five models `deeplabv3p` `unet` `icnet` `pspnet` `hrnet` are supported

 ### Default value

@@ -20,7 +21,13 @@ MODEL Group holds all model-related settings; the Group also contains three sub-Grou

 ## `DEFAULT_NORM_TYPE`

-Norm type used by the model; supports `bn` [`gn`]()
+Norm type used by the model; supports `bn` (Batch Norm) and `gn` (Group Norm)
+
+![](../imgs/gn.png)
+
+For an introduction to Group Norm, see the paper: https://arxiv.org/abs/1803.08494
+
+GN divides the channels into groups and normalizes with the mean and variance computed within each group. Its computation does not depend on the batch size, and its accuracy stays stable across a wide range of batch sizes. It suits models with heavy parameter counts, such as deeplabv3+, where it can give good training results with a small batch.

 ### Default value

@@ -111,4 +118,3 @@ loss = 1.0 * loss1 + 0.4 * loss2 + 0.16 * loss3

-
diff --git a/docs/configs/model_pspnet_group.md b/docs/configs/model_pspnet_group.md
new file mode 100644
index 0000000000000000000000000000000000000000..c1acd31b296b8b64ac05730e0e92b840264a4f23
--- /dev/null
+++ b/docs/configs/model_pspnet_group.md
@@ -0,0 +1,25 @@
+# cfg.MODEL.PSPNET
+
+The MODEL.PSPNET sub-Group holds all settings related to the PSPNet model
+
+## `DEPTH_MULTIPLIER`
+
+Depth multiplier of the ResNet backbone
+
+### Default value
+
+1
+
+
+
+
+## `LAYERS`
+
+Number of layers of the ResNet backbone; the five values `18` `34` `50` `101` `152` are supported
+
+### Default value
+
+50
+
+
+
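As a reading aid for the new PSPNet group document above, the following minimal sketch checks a pair of PSPNet options against the values that document lists. The function name `check_pspnet_options` and the rule that the depth multiplier must be positive are assumptions for illustration; PaddleSeg's own validation lives elsewhere and may differ.

```python
# Layer counts listed in the PSPNet group document.
SUPPORTED_LAYERS = (18, 34, 50, 101, 152)


def check_pspnet_options(layers=50, depth_multiplier=1):
    """Validate hypothetical MODEL.PSPNET options and return them as a dict.

    Defaults mirror the documented defaults (LAYERS: 50, depth multiplier: 1).
    """
    if layers not in SUPPORTED_LAYERS:
        raise ValueError("LAYERS must be one of %s, got %r" % (SUPPORTED_LAYERS, layers))
    # Assumption: a non-positive multiplier would leave the backbone without channels.
    if depth_multiplier <= 0:
        raise ValueError("DEPTH_MULTIPLIER must be positive, got %r" % (depth_multiplier,))
    return {"LAYERS": layers, "DEPTH_MULTIPLIER": depth_multiplier}


if __name__ == "__main__":
    print(check_pspnet_options())
    print(check_pspnet_options(layers=101, depth_multiplier=0.5))
```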
diff --git a/docs/configs/train_group.md b/docs/configs/train_group.md
index 6c8a0d79c79af665d8c7bf54a2b7555aa024bb8d..2fc8806c457d561978379589f6e05657e62a6e86 100644
--- a/docs/configs/train_group.md
+++ b/docs/configs/train_group.md
@@ -5,7 +5,7 @@ TRAIN Group holds all training-related settings
 ## `MODEL_SAVE_DIR`
 Main directory in which models are saved periodically during training

-## Default value
+### Default value

 None (must be set by the user)
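@@ -14,10 +14,10 @@ TRAIN Group holds all training-related settings
 ## `PRETRAINED_MODEL_DIR`
 Path of the pretrained model

-## Default value
+### Default value

 None

-## Notes
+### Notes

 * If this field is not specified, the model initializes all parameters randomly and trains from scratch

@@ -31,10 +31,10 @@ TRAIN Group holds all training-related settings
 ## `RESUME_MODEL_DIR`
 Restores parameters from the given path and continues training

-## Default value
+### Default value

 None

-## Notes
+### Notes

 * When `RESUME_MODEL_DIR` is set, PaddleSeg resumes from the most recent epoch of the previous run and restores the temporary variables of the training process (such as the already decayed learning rate and the optimizer's momentum data). The last directory of the `PRETRAINED_MODEL` path must be an int value or the string final. PaddleSeg uses the int value as the starting EPOCH and continues training; if the directory is final, training does not continue. If the directory meets neither condition, PaddleSeg raises an error.

@@ -42,12 +42,17 @@ TRAIN Group holds all training-related settings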
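 ## `SYNC_BATCH_NORM`

-Whether to synchronize the BN mean and variance across cards
+Whether to synchronize the BN mean and variance across cards.

-## Default value
+The Synchronized Batch Norm strategy for cross-GPU batch normalization was first proposed in the paper [MegDet: A Large Mini-Batch Object Detector](https://arxiv.org/abs/1711.07240).
+The paper [Bag of Freebies for Training Object Detection Neural Networks](https://arxiv.org/pdf/1902.04103.pdf) verified its effectiveness on Yolov3, and [PaddleCV/yolov3](https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/yolov3) implements this family of strategies, scoring 5.9 mAP higher on COCO17 than the Darknet framework version.
+
+Building on the sync_batch_norm strategy of the PaddlePaddle framework, PaddleSeg supports training segmentation models with a large batch size across multiple cards, which yields a higher mIoU.
+
+### Default value

 False

-## Notes
+### Notes

 * Enabling this option costs some performance (caused by synchronizing data across cards)

diff --git a/docs/data_aug.md b/docs/data_aug.md
index 2865d413b7090f414eb44c0681562837de21f19a..ed1b5c3c2dc66fa94f6dc1067bdec19161cda431 100644
--- a/docs/data_aug.md
+++ b/docs/data_aug.md
@@ -7,67 +7,108 @@

 ## Resize

-The resize step rescales the input image to a given size according to some rule. PaddleSeg supports the following 3 resize methods:
+The Resize step rescales the input image to a given size according to some rule. PaddleSeg supports the following 3 resize methods:

 ![](imgs/aug_method.png)

-- Un-padding
-Resizes the input image directly to a fixed size before feeding it to the network for training; the corresponding parameter is AUG.FIX_RESIZE_SIZE. The same is done at prediction time.
+- Unpadding
+Resizes the input image directly to a fixed size before feeding it to the network for training. The same is done at prediction time.

 - Step-Scaling
-Resizes the input image by a ratio that varies randomly within a range at a given step. The minimum ratio is set by `AUG.MIN_SCALE_FACTOR`, the maximum ratio by `AUG.MAX_SCALE_FACTOR`, and the step by `AUG.SCALE_STEP_SIZE`. The input image is left unprocessed at prediction time.
+Resizes the input image by a ratio that varies randomly within a range at a given step. The input image is left unprocessed at prediction time.

 - Range-Scaling
-Resizes with a fixed aspect ratio: the long side of the image is aligned to a fixed size and the short side changes by the same ratio. The minimum size is set by `AUG.MIN_RESIZE_VALUE` and the maximum size by `AUG.MAX_RESIZE_VALUE`. At prediction time the long side must be aligned to the size given by `AUG.INF_RESIZE_VALUE`, which lies between `AUG.MIN_RESIZE_VALUE` and `AUG.MAX_RESIZE_VALUE`.
+Resizes the input image according to its long side: the long side is aligned to a length that varies randomly within a range, and the short side changes by the same ratio.
+At prediction time the long side must be aligned to a separately specified fixed length.

 Range-Scaling is illustrated below:

 ![](imgs/rangescale.png)

+|Resize method|Configuration parameter|Meaning|Remarks|
+|-|-|-|-|
+|Unpadding|AUG.FIX_RESIZE_SIZE|Fixed size to resize to|
+|Step-Scaling|AUG.MIN_SCALE_FACTOR|Minimum resize ratio|
+||AUG.MAX_SCALE_FACTOR|Maximum resize ratio|
+||AUG.SCALE_STEP_SIZE|Step between candidate resize ratios|
+|Range-Scaling|AUG.MIN_RESIZE_VALUE|Minimum of the long-side range|
+||AUG.MAX_RESIZE_VALUE|Maximum of the long-side range|
+||AUG.INF_RESIZE_VALUE|Fixed length the long side is aligned to at prediction time|Must lie within [AUG.MIN_RESIZE_VALUE, AUG.MAX_RESIZE_VALUE].|
+
+**Note: all configuration parameters in this document can be set in your yaml file under the configs directory.**
+
 ## Image flipping

 PaddleSeg supports the following 2 flipping methods:

 - Horizontal flip (Mirror)
-Controlled by the switch `AUG.MIRROR`: enabled when True, disabled when False.
+Flips the image left-right with a probability of 50%.

 - Vertical flip (Flip)
-Controlled by the switch `AUG.FLIP`: enabled when True, with `AUG.FLIP_RATIO` controlling the probability of flipping; disabled when False.
+Flips the image up-down with a given probability.

 The 2 switches above work independently and can be combined, so image flipping has the following 4 possible outcomes:

+|Flip method|Configuration parameter|Meaning|Remarks|
+|-|-|-|-|
+|Mirror|AUG.MIRROR|Horizontal flip switch|Enabled when True, disabled when False|
+|Flip|AUG.FLIP|Vertical flip switch|Enabled when True, disabled when False|
+||AUG.FLIP_RATIO|Probability of applying the vertical flip|Has no effect when AUG.FLIP is False|
+
+
 ## Rich Crop

 Rich Crop is a set of data augmentation strategies that PaddleSeg has opened up based on real business experience, aimed at segmentation scenarios with little annotated data and highly varied test data. The pipeline is shown below:

 ![RichCrop illustration](imgs/data_aug_example.png)

-Rich crop applies several transformations to the image to keep the training data rich and diverse. PaddleSeg supports the following transformations. When `AUG.RICH_CROP.ENABLE` is False, this step is skipped.
+Rich Crop applies several transformations to the image to keep the training data rich and diverse. It consists of the following 4 transformations:
+
+- Blur
+Smooths the image with a Gaussian blur.
+
+- Rotation
+Rotates the image by an angle chosen randomly within a range; the extra area produced by the rotation is filled with the `DATASET.PADDING_VALUE` value.

-- blur
-Blurs the image. Controlled by the switch `AUG.RICH_CROP.BLUR`; disabled when False. `AUG.RICH_CROP.BLUR_RATIO` controls the probability of applying blur.
+- Aspect
+Adjusts the aspect ratio: a region of a given size and aspect ratio is cropped from the image and then resized.

-- rotation
-Rotates the image; `AUG.RICH_CROP.MAX_ROTATION` controls the maximum rotation angle. The extra area produced by the rotation is filled with the mean value.
+- Color jitter
+Jitters the image colors by adjusting three color properties: brightness, saturation and contrast.

-- aspect
-Adjusts the aspect ratio: a region is cropped from the image and then resized within a given aspect ratio. Controlled by `AUG.RICH_CROP.MIN_AREA_RATIO` and `AUG.RICH_CROP.ASPECT_RATIO`.
+|Rich crop method|Configuration parameter|Meaning|Remarks|
+|-|-|-|-|
+|Rich crop|AUG.RICH_CROP.ENABLE|Master switch for Rich crop|Enabled when True; all transformations are disabled when False|
+|Blur|AUG.RICH_CROP.BLUR|Blur switch|Enabled when True, disabled when False|
+||AUG.RICH_CROP.BLUR_RATIO|Probability of applying blur|Has no effect when AUG.RICH_CROP.BLUR is False|
+|Rotation|AUG.RICH_CROP.MAX_ROTATION|Maximum angle of positive rotation|Value in 0~90°; the actual angle is chosen randomly within \[-AUG.RICH_CROP.MAX_ROTATION, AUG.RICH_CROP.MAX_ROTATION]|
+|Aspect|AUG.RICH_CROP.MIN_AREA_RATIO|Minimum area ratio of the cropped image to the original image|Value in 0~1; the smaller the value, the larger the variation; 0 disables the adjustment|
+||AUG.RICH_CROP.ASPECT_RATIO|Aspect-ratio range of the cropped image|Non-negative; the smaller the value, the larger the variation; 0 disables the adjustment|
+|Color jitter|AUG.RICH_CROP.BRIGHTNESS_JITTER_RATIO|Brightness adjustment factor|Value in 0~1; the larger the value, the larger the variation; 0 disables the adjustment|
+||AUG.RICH_CROP.SATURATION_JITTER_RATIO|Saturation adjustment factor|Value in 0~1; the larger the value, the larger the variation; 0 disables the adjustment|
+||AUG.RICH_CROP.CONTRAST_JITTER_RATIO|Contrast adjustment factor|Value in 0~1; the larger the value, the larger the variation; 0 disables the adjustment|

-- color jitter
-Adjusts the image colors. Controlled by `AUG.RICH_CROP.BRIGHTNESS_JITTER_RATIO`, `AUG.RICH_CROP.SATURATION_JITTER_RATIO` and `AUG.RICH_CROP.CONTRAST_JITTER_RATIO`.

 ## Random Crop

-This step crops the image so that the input to the network has a fixed size. The size is controlled by TRAIN_CROP_SIZE, a tuple in the format (width, height). When the input image is smaller than CROP_SIZE, it is padded with the mean value.
-
-- Input image format
- - Original image
- - Image format: three-channel RGB and four-channel RGBA images can be used for training, but a single training run may use only one format.
- - Image conversion: grayscale images are converted to three-channel images by preprocessing
- - Image parameter settings: for three-channel images set IMAGE_TYPE to rgb, and MEAN and STD must each be a list of length 3; for four-channel images set IMAGE_TYPE to rgba, and MEAN and STD must each be a list of length 4.
- - Annotation image
- - Image format: the annotation image must be a single-channel multi-valued png, where each element value is the class that element belongs to.
- - Image conversion: any resize or rotation applied to the label image in the data layer must use nearest-neighbor interpolation.
- - Image ignore: setting the TRAIN.IGNORE_INDEX parameter selectively ignores all pixels that belong to a given class. This parameter is usually set to 255
+Randomly crops the image and its label map. This step uses cropping so that the input to the network has a fixed size.
+
+Random crop handles 3 cases:
+- When the input image size equals CROP_SIZE, the original image is returned.
+- When the input image size is larger than CROP_SIZE, the image is cropped directly.
+- When the input image size is smaller than CROP_SIZE, the image and the label map are padded with the `DATASET.PADDING_VALUE` and `DATASET.IGNORE_INDEX` values respectively, and then cropped.
+
+|Random crop method|Configuration parameter|Meaning|Remarks|
+|-|-|-|-|
+|Train crop|TRAIN_CROP_SIZE|Image size after random crop during training|A tuple in the format (width, height)|
+|Eval crop|EVAL_CROP_SIZE|Image size after random crop in all phases other than training|A tuple in the format (width, height)|
+
+`TRAIN_CROP_SIZE` can be set to any size; the right value depends on the dataset.
+
+`EVAL_CROP_SIZE` must satisfy the following conditions, in 3 cases:
+- When `AUG.AUG_METHOD` is unpadding, the width and height of `EVAL_CROP_SIZE` should be no smaller than those of `AUG.FIX_RESIZE_SIZE`.
+- When `AUG.AUG_METHOD` is stepscaling, the width and height of `EVAL_CROP_SIZE` should be no smaller than the largest width and height among the original images.
+- When `AUG.AUG_METHOD` is rangescaling, the width and height of `EVAL_CROP_SIZE` should be no smaller than the largest width and height among the resized images.
+
diff --git a/docs/data_prepare.md b/docs/data_prepare.md
index 50864a730a534c4a0e5eba84fb11dfb1bb9c542d..de1fd7965cf74efe22b5c126b94ae063ac8a52ca 100644
--- a/docs/data_prepare.md
+++ b/docs/data_prepare.md
@@ -2,6 +2,45 @@

 ## Data annotation

+### Annotation protocol
+PaddleSeg uses single-channel annotation images in which each pixel value represents one class. Class labels must increase from 0; for example, 0, 1, 2, 3 means there are 4 classes.
+
+**NOTE:** use images in the losslessly compressed PNG format for annotations. There can be at most 256 annotated classes.
+
+### Grayscale annotation vs pseudo-color annotation
+Most segmentation libraries use single-channel grayscale images as annotation images, which tend to look completely black when displayed. Drawbacks of grayscale annotation images:
+1. After annotating an image, you cannot directly see whether the annotation is correct.
+2.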
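The actual segmentation result cannot be judged directly during model testing.
+
+**PaddleSeg supports pseudo-color images as annotation images: a palette is injected into the original single-channel image. The image then displays in color with almost no increase in file size.**
+
+PaddleSeg also remains compatible with grayscale annotations, so existing grayscale datasets can be used directly without modification.
+![](./imgs/annotation/image-11.png)
+
+### Converting grayscale annotations to pseudo-color annotations
+If you need pseudo-color annotation images, you can use our conversion tool. It covers the following two common cases:
+1. To convert all grayscale annotation images under a directory into pseudo-color annotation images, run the following command and specify the directory that holds the grayscale annotations.
+```buildoutcfg
+python pdseg/tools/gray2pseudo_color.py
+```
+
+|Parameter|Purpose|
+|-|-|
+|dir_or_file|Directory that holds the grayscale annotations|
+|output_dir|Output directory for the color annotation images|
+
+2. To convert only some of the grayscale annotation images in a dataset into pseudo-color annotation images, run the following command. An existing file list is required, and the listed images are read from it.
+```buildoutcfg
+python pdseg/tools/gray2pseudo_color.py --dataset_dir --file_separator
+```
+|Parameter|Purpose|
+|-|-|
+|dir_or_file|Path of the file list|
+|output_dir|Output directory for the color annotation images|
+|--dataset_dir|Root directory of the dataset|
+|--file_separator|Separator used in the file list|
+
+### Annotation tutorials
 Users need to collect the images for training, evaluation and testing in advance, and then annotate them with a data annotation tool.

 PaddleSeg supports 2 annotation tools: LabelMe and the Jingling annotation tool. The tutorials are:
@@ -9,63 +48,32 @@ PaddleSeg supports 2 annotation tools: LabelMe and the Jingling annotation tool. The
 - [LabelMe annotation tutorial](annotation/labelme2seg.md)
 - [Jingling annotation tool tutorial](annotation/jingling2seg.md)

-Finally, use the data conversion script we provide to convert the output of the annotation tools above into the data format required for model training.

 ## File lists

 ### File list specification

-PaddleSeg organizes the training, validation and test sets with generic file lists. Class labels must increase from 0.
-
-**NOTE:** use images in the losslessly compressed PNG format for annotations
-
-Taking the Cityscapes dataset as an example, we only need to prepare the lists of original images and annotation files for the training, validation and test sets for PaddleSeg training.
-
-`DATASET.DATA_DIR` is the dataset root directory, and paths in the file lists are relative to the dataset root directory.
-
-```
-./cityscapes/ # dataset root directory
-├── gtFine # annotation directory
-│   ├── test
-│   │   ├── berlin
-│   │   └── ...
-│   ├── train
-│   │   ├── aachen
-│   │   └── ...
-│   └── val
-│       ├── frankfurt
-│       └── ...
-└── leftImg8bit # original image directory
-    ├── test
-    │   ├── berlin
-    │   └── ...
-    ├── train
-    │   ├── aachen
-    │   └── ...
-    └── val
-        ├── frankfurt
-        └── ...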
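-```
+PaddleSeg organizes the training, validation and test sets with generic file lists. The corresponding file lists must be prepared before training, evaluation and visualization.

 A file list is organized as follows
 ```
 original image path [SEP] annotation image path
 ```
+`[SEP]` is the path separator, which can be changed through the `DATASET.SEPARATOR` option and defaults to a space. Paths in the file list are relative to the dataset root directory, and `DATASET.DATA_DIR` is that root directory.
+
+As the figure below shows, the left column is the path of the original image and the right column is the path of its annotation.

-`[SEP]` is the path separator, which can be changed through the `DATASET.SEPARATOR` option and defaults to a space.
+![cityscapes_filelist](./imgs/file_list.png)

 **Notes**

-* Make sure the separator appears only once on each line of the file list. If file names contain spaces, split with a character that cannot appear in file names, such as '|'
+* Make sure the separator appears only once on each line of the file list. If file names contain spaces, split with a character that cannot appear in file names, such as "|"

 * Save the file list in **UTF-8**; PaddleSeg reads file_list files with UTF-8 encoding by default

-As the figure below shows, the left column is the path of the original image and the right column is the path of its annotation.
-
-![cityscapes_filelist](./imgs/file_list.png)
-
 If the dataset has no annotation images, the file list does not need the separator or the annotation image paths, as shown below.
+
 ![cityscapes_filelist](./imgs/file_list2.png)

 **Notes**
@@ -75,32 +83,14 @@ PaddleSeg organizes the training, validation and test sets with generic file lists

 It cannot be used in the `DATASET.TRAIN_FILE_LIST` and `DATASET.VAL_FILE_LIST` options.

-For the complete configuration, see the yaml and file lists under [`./docs/annotation/cityscapes_demo`](../docs/annotation/cityscapes_demo/).
+**What does a conforming file list look like?**

-### Generating file lists
-PaddleSeg provides a script for generating file lists. It works for custom datasets and for the cityscapes dataset, and supports Flags that switch on specific features.
-```
-python pdseg/tools/create_dataset_list.py ${FLAGS}
-```
-Running it generates the file lists for the training/validation/test sets under the dataset root directory (the base file name matches `--second_folder` and the extension is `.txt`).
-
-**Note:** if the training/validation/test set has no annotation images, file lists without the separator and annotation image paths are still generated automatically.
-
-#### Command-line FLAGS
+See the directory [`./docs/annotation/cityscapes_demo`](../docs/annotation/cityscapes_demo/).

-|FLAG|Purpose|Default|Number of arguments|
-|-|-|-|-|
-|--type|Dataset type, `cityscapes` or `custom`|`custom`|1|
-|--separator|Separator used in the file list|'|'|1|
-|--folder|Folder names of the image set and label set|'images' 'annotations'|2|
-|--second_folder|Folder names of the training/validation/test sets|'train' 'val' 'test'|several|
-|--format|Data formats of the image set and label set|'jpg' 'png'|2|
-|--postfix|Filters the image set and label set by whether the base file name (without extension) contains the given suffix|'' '' (2 empty strings)|2|
+### Organizing the dataset directory structure

-#### Usage examples
-- **For a custom dataset**
+To generate the file lists of a dataset, arrange it in the following directory structure (similar to the Cityscapes dataset):

-To generate the file lists of your own dataset, arrange it in the following directory structure:
 ```
 ./dataset/ # dataset root directory
 ├── annotations # annotation directory
@@ -125,9 +115,32 @@ python pdseg/tools/create_dataset_list.py ${FLAGS}
 └── ...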
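 Note: the directory names above are arbitrary
 ```
-The custom dataset directory must be specified; the FLAGs can be set as needed.

-**Note:** `--type` does not need to be specified.
+### Generating file lists
+PaddleSeg provides a script for generating file lists. It works for custom datasets and for the cityscapes dataset, and supports Flags that switch on specific features.
+```
+python pdseg/tools/create_dataset_list.py ${FLAGS}
+```
+Running it generates the file lists for the training/validation/test sets under the dataset root directory (the base file name matches `--second_folder` and the extension is `.txt`).
+
+**Note:** generating a file list requires either that the numbers of original images and annotation images match, or that there are only original images and no annotation images. If the dataset has no annotation images, file lists without the separator and annotation image paths are still generated automatically.
+
+#### Command-line FLAGS
+
+|FLAG|Purpose|Default|Number of arguments|
+|-|-|-|-|
+|--type|Dataset type, `cityscapes` or `custom`|`custom`|1|
+|--separator|Separator used in the file list|"|"|1|
+|--folder|Folder names of the image set and label set|"images" "annotations"|2|
+|--second_folder|Folder names of the training/validation/test sets|"train" "val" "test"|several|
+|--format|Data formats of the image set and label set|"jpg" "png"|2|
+|--postfix|Filters the image set and label set by whether the base file name (without extension) contains the given suffix|"" "" (2 empty strings)|2|
+
+#### Usage examples
+- **For a custom dataset**
+
+If you have arranged the dataset directory as described above, run the commands below to generate the file lists.
+
 ```
 # Generate file lists with a space as the separator; images and labels are both png
 python pdseg/tools/create_dataset_list.py --separator " " --format png png
@@ -137,22 +150,26 @@ python pdseg/tools/create_dataset_list.py --separator " " --f
 python pdseg/tools/create_dataset_list.py \
 --folder img gt --second_folder training validation
 ```
-
+**Note:** the custom dataset directory must be specified; the FLAGs can be set as needed. `--type` does not need to be specified.

 - **For the cityscapes dataset**

+If you are using the cityscapes dataset, run the command below to generate the file lists.
+
+```
+# Generate cityscapes file lists with a comma as the separator
+python pdseg/tools/create_dataset_list.py --type cityscapes --separator ","
+```
+**Note:**
+
 The cityscapes dataset directory must be specified, and `--type` must be `cityscapes`.

 With the cityscapes type, some FLAGs are reset automatically and do not need to be specified by hand:

 |FLAG|Fixed value|
 |-|-|
-|--folder|'leftImg8bit' 'gtFine'|
-|--format|'png' 'png'|
-|--postfix|'_leftImg8bit' '_gtFine_labelTrainIds'|
+|--folder|"leftImg8bit" "gtFine"|
+|--format|"png" "png"|
+|--postfix|"_leftImg8bit" "_gtFine_labelTrainIds"|

 The remaining FLAGs can be set as needed.

-```
-# Generate cityscapes file lists with a comma as the separator
-python pdseg/tools/create_dataset_list.py --type cityscapes --separator ","
-```
diff --git a/docs/imgs/annotation/image-11.png b/docs/imgs/annotation/image-11.png
new file mode 100644
index 0000000000000000000000000000000000000000..2e3b6ff1f1ffd33fb57a35b547bcce31ca248e19
Binary files /dev/null and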
b/docs/imgs/annotation/image-11.png differ diff --git a/docs/imgs/annotation/image-7.png b/docs/imgs/annotation/image-7.png index b65d56e92b2b5c1633f5c3168eee2971b476e8f3..7c24ca50361e0f602bc5a603e6377af021dbb63d 100644 Binary files a/docs/imgs/annotation/image-7.png and b/docs/imgs/annotation/image-7.png differ diff --git a/docs/imgs/annotation/jingling-5.png b/docs/imgs/annotation/jingling-5.png index 59a15567a3e25df338a3577fe9a9035c5bd0c719..5106559099570140fe91a94e2cdffffe2fdbdaca 100644 Binary files a/docs/imgs/annotation/jingling-5.png and b/docs/imgs/annotation/jingling-5.png differ diff --git a/docs/imgs/deeplabv3p.png b/docs/imgs/deeplabv3p.png index c0f12db6474e28f68ea45aa498026ef5261bcbe9..ba754f3e8b75c49630a96d4cd9fcb4aa45d6e5bd 100644 Binary files a/docs/imgs/deeplabv3p.png and b/docs/imgs/deeplabv3p.png differ diff --git a/docs/imgs/dice.png b/docs/imgs/dice.png new file mode 100644 index 0000000000000000000000000000000000000000..56f443dfade0a02240dad61d6554a23c91213bb5 Binary files /dev/null and b/docs/imgs/dice.png differ diff --git a/docs/imgs/dice1.png b/docs/imgs/dice1.png deleted file mode 100644 index f8520802296cc264849fae4a8442792cf56cb20a..0000000000000000000000000000000000000000 Binary files a/docs/imgs/dice1.png and /dev/null differ diff --git a/docs/imgs/dice2.png b/docs/imgs/dice2.png new file mode 100644 index 0000000000000000000000000000000000000000..37c3da1f1906421c0d3928ab18212a4d1a0966a0 Binary files /dev/null and b/docs/imgs/dice2.png differ diff --git a/docs/imgs/dice3.png b/docs/imgs/dice3.png new file mode 100644 index 0000000000000000000000000000000000000000..50b422385ee1e6b0cf7652ac63571652ce1d52ef Binary files /dev/null and b/docs/imgs/dice3.png differ diff --git a/docs/imgs/hrnet.png b/docs/imgs/hrnet.png new file mode 100644 index 0000000000000000000000000000000000000000..a4733a7b7c62534f8cfc8f8cfeb4fe049d6dfba8 Binary files /dev/null and b/docs/imgs/hrnet.png differ diff --git a/docs/imgs/icnet.png b/docs/imgs/icnet.png 
index 7d9659db01bfb7a887f94b36fdaad303284deab7..125889691edcc5857d8e1322704dda652412d33f 100644 Binary files a/docs/imgs/icnet.png and b/docs/imgs/icnet.png differ diff --git a/docs/imgs/pspnet.png b/docs/imgs/pspnet.png new file mode 100644 index 0000000000000000000000000000000000000000..2963aeadb89aef05bfb19163f89d413d620c6564 Binary files /dev/null and b/docs/imgs/pspnet.png differ diff --git a/docs/imgs/pspnet2.png b/docs/imgs/pspnet2.png new file mode 100644 index 0000000000000000000000000000000000000000..401263a9b5fddc4c6ca77ef2dc172c7cb565c00f Binary files /dev/null and b/docs/imgs/pspnet2.png differ diff --git a/docs/imgs/softmax_loss.png b/docs/imgs/softmax_loss.png new file mode 100644 index 0000000000000000000000000000000000000000..3c5cbbce470fe48ca5f500c59995776c2fbd5ec5 Binary files /dev/null and b/docs/imgs/softmax_loss.png differ diff --git a/docs/imgs/tensorboard_image.JPG b/docs/imgs/tensorboard_image.JPG index 2d5d0ceb001cb7fc9f68622842710afd9d032463..140aa2a0ed6a9b1a2d0a98477685b9e6d434a113 100644 Binary files a/docs/imgs/tensorboard_image.JPG and b/docs/imgs/tensorboard_image.JPG differ diff --git a/docs/imgs/tensorboard_scalar.JPG b/docs/imgs/tensorboard_scalar.JPG index 2de89c32a3469764631352597f0e55f8a431ad4b..322c98dc8ba7e5ca96477f3dbe193a70a8cf4609 100644 Binary files a/docs/imgs/tensorboard_scalar.JPG and b/docs/imgs/tensorboard_scalar.JPG differ diff --git a/docs/imgs/unet.png b/docs/imgs/unet.png index 960f289321a9a6b894d3054ec4f257a36cb8969e..5a7b691ae54f9fe29dded913d8e6f6cacac494f7 100644 Binary files a/docs/imgs/unet.png and b/docs/imgs/unet.png differ diff --git a/docs/imgs/usage_vis_demo.jpg b/docs/imgs/usage_vis_demo.jpg index 50bedf2f547d11cb4aaefa0435022acc0392ba3c..40b35f13418e7c68e0bfaabf992d8411bd87bc77 100644 Binary files a/docs/imgs/usage_vis_demo.jpg and b/docs/imgs/usage_vis_demo.jpg differ diff --git a/docs/imgs/usage_vis_demo2.jpg b/docs/imgs/usage_vis_demo2.jpg deleted file mode 100644 index 
9665e9e2f4d90d6db75411d43d0dc5a34d8b28e7..0000000000000000000000000000000000000000
Binary files a/docs/imgs/usage_vis_demo2.jpg and /dev/null differ
diff --git a/docs/imgs/usage_vis_demo3.jpg b/docs/imgs/usage_vis_demo3.jpg
deleted file mode 100644
index 318c06bcf7debf76b7bff504648df056802130df..0000000000000000000000000000000000000000
Binary files a/docs/imgs/usage_vis_demo3.jpg and /dev/null differ
diff --git a/docs/installation.md b/docs/installation.md
deleted file mode 100644
index 80cc341bb8764065dc7fd871e81fdb31225d636a..0000000000000000000000000000000000000000
--- a/docs/installation.md
+++ /dev/null
@@ -1,44 +0,0 @@
-# PaddleSeg installation guide
-
-## 1. Install PaddlePaddle
-
-Version requirements
-* PaddlePaddle >= 1.6.1
-* Python 2.7 or 3.5+
-
-For more installation details, such as compatibility with CUDA and cuDNN versions, see [PaddlePaddle installation](https://www.paddlepaddle.org.cn/install/doc/index)
-
-### Installing with pip
-
-Image segmentation models are computationally expensive, so we recommend using PaddleSeg with the GPU version of PaddlePaddle.
-
-```
-pip install paddlepaddle-gpu
-```
-
-### Installing with Conda
-
-The latest PaddlePaddle release, 1.5, supports installation via Conda, which reduces the cost of installing dependencies. For Conda usage, see [Anaconda](https://www.anaconda.com/distribution/)
-
-```
-conda install -c paddle paddlepaddle-gpu cudatoolkit=9.0
-```
-
- * For multi-GPU training, install NVIDIA NCCL >= 2.4.7 and run under Linux
-
-For more installation options, see the [PaddlePaddle installation guide](https://www.paddlepaddle.org.cn/documentation/docs/zh/beginners_guide/install/index_cn.html)
-
-
-## 2. Download the PaddleSeg code
-
-```
-git clone https://github.com/PaddlePaddle/PaddleSeg
-```
-
-
-## 3. Install the PaddleSeg dependencies
-
-```
-cd PaddleSeg
-pip install -r requirements.txt
-```
diff --git a/docs/loss_select.md b/docs/loss_select.md
index 454085c9c22a5c3308c77c93c961628b53157042..6749979821de5cd7387f3161e0a2bd25a9f02e4e 100644
--- a/docs/loss_select.md
+++ b/docs/loss_select.md
@@ -1,41 +1,66 @@
-# Using dice loss to handle sample imbalance in binary classification
+# How to handle class imbalance in binary classification
+In two-class image segmentation tasks, the class distribution is often uneven, for example in defect detection of industrial products, road extraction and lesion extraction.
+
+PaddleSeg currently provides three loss functions: softmax loss (softmax with cross entropy loss), dice loss (dice coefficient loss) and bce loss (binary cross entropy loss). We can use dice loss to address this problem.
+
+Note: dice loss and bce loss support binary classification only.
+
+## Dice loss
+Dice loss is defined as follows:

-In two-class image segmentation tasks, the class distribution is often uneven, for example in defect detection, road extraction and lesion extraction.
-In the Road Extraction task of the DeepGlobe challenge, roads account for %4.5 of the training data. A sample image is shown below:

-
+

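-As the image shows, roads occupy only a small fraction of the whole picture. - -## Dataset download -We randomly sampled 800 images from the training set of the DeepGlobe Road Extraction challenge as a training set and 200 images as a validation set, -producing a small road-extraction dataset, [MiniDeepGlobeRoadExtraction](https://paddleseg.bj.bcebos.com/dataset/MiniDeepGlobeRoadExtraction.zip) -## softmax loss and dice loss -In image segmentation, softmax loss (softmax with cross entropy loss) treats every pixel equally, so when the background occupies most of the image, -the network is biased toward learning the background, which weakens its ability to extract the target. `dice loss (dice coefficient loss)` computes the loss from the overlap between the prediction and the annotation, which avoids the effect of class imbalance and gives better results. -In practice, `dice loss` is often combined with `bce loss (binary cross entropy loss)` to make training more stable. +Here Y denotes the ground truth and P denotes the prediction. | | denotes the sum of the matrix elements. ![](./imgs/dice2.png) denotes the number of elements shared by *Y* and *P*, +which in practice is computed as the sum of their pixel-wise product. For example: + +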

+
+

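+ +Here 1 denotes foreground and 0 denotes background. + +**Note:** In the annotation images, make sure that foreground pixels have the value 1 and background pixels have the value 0. -dice loss is defined as follows: +For the Dice coefficient, see [Wikipedia](https://zh.wikipedia.org/wiki/Dice%E7%B3%BB%E6%95%B0) -![equation](http://latex.codecogs.com/gif.latex?dice\\_loss=1-\frac{2|Y\bigcap{P}|}{|Y|+|P|}) +**Why does dice loss work better than softmax loss on class-imbalanced problems?** -where ![equation](http://latex.codecogs.com/gif.latex?|Y\bigcap{P}|) denotes the number of elements shared by *Y* and *P*, -which in practice is computed as the sum of their product. As shown below: +First, consider the definition of softmax loss: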

-
+

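+ +Here y denotes the ground truth and p denotes the network output. + +In image segmentation, `softmax loss` evaluates the class prediction at every pixel and then averages over all pixels. In effect, every pixel in the image is learned with equal weight. This causes a problem: if the classes in an image are represented unevenly, training is dominated by the most prevalent class. Taking the DeepGlobe road-extraction data described below as an example, the network is biased toward learning the background, which weakens its ability to extract foreground targets. +By contrast, `dice loss (dice coefficient loss)` divides the intersection of the prediction and the annotation by their total number of pixels. It treats all pixels of a class as a whole and measures the share of the intersection in that total, so it is not swamped by the large number of background pixels and gives better results. + +In practice, `dice loss` is often combined with `bce loss (binary cross entropy loss)` to make training more stable. -[Dice coefficient explained](https://zh.wikipedia.org/wiki/Dice%E7%B3%BB%E6%95%B0) ## Specifying the training loss in PaddleSeg PaddleSeg selects the loss function used in training through the `cfg.SOLVER.LOSS` parameter. For example, `cfg.SOLVER.LOSS=['dice_loss','bce_loss']` sets the training loss to the combination of `dice loss` and `bce loss`. -## Experimental comparison +## Example: using dice loss to handle class imbalance + +We apply dice loss to a road-extraction task as an example. +In the DeepGlobe Road Extraction challenge, roads account for 4.5% of the training data. A sample image is shown below: +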

+
+

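+As the image shows, roads occupy only a small fraction of the whole picture. + +### Dataset download +We randomly sampled 800 images from the training set of the DeepGlobe Road Extraction challenge as a training set and 200 images as a validation set, +producing a small road-extraction dataset, [MiniDeepGlobeRoadExtraction](https://paddleseg.bj.bcebos.com/dataset/MiniDeepGlobeRoadExtraction.zip) + +### Experimental comparison We ran a comparison experiment on the MiniDeepGlobeRoadExtraction dataset. @@ -73,5 +98,4 @@ The results for softmax loss and for dice loss + bce loss are shown in the figure below.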
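The dice loss definition from this file, `dice_loss = 1 - 2|Y∩P| / (|Y| + |P|)`, can be checked numerically with a small NumPy sketch. This is an illustration only and not PaddleSeg's implementation: the function names `dice_loss` and `pixelwise_ce_loss`, the smoothing term `eps`, and the 20x20 toy mask are all assumptions made for this example. The mask has 18 foreground pixels out of 400 (4.5%, the road ratio quoted above), and the pixel-averaged cross entropy stands in for softmax loss in the two-class case.

```python
import numpy as np


def dice_loss(y_true, y_pred, eps=1e-5):
    """Dice loss: 1 - 2|Y ∩ P| / (|Y| + |P|).

    y_true holds 0/1 labels (1 = foreground, 0 = background) and y_pred
    holds foreground probabilities. eps is an assumed smoothing term that
    avoids division by zero when both masks are empty.
    """
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    intersection = np.sum(y_true * y_pred)  # sum of the pixel-wise product
    total = np.sum(y_true) + np.sum(y_pred)
    return float(1.0 - (2.0 * intersection + eps) / (total + eps))


def pixelwise_ce_loss(y_true, y_pred, eps=1e-7):
    """Cross entropy averaged over all pixels (two-class case)."""
    y_true = np.asarray(y_true, dtype=np.float64)
    # Clip probabilities so that log() never receives 0.
    y_pred = np.clip(np.asarray(y_pred, dtype=np.float64), eps, 1.0 - eps)
    per_pixel = -(y_true * np.log(y_pred)
                  + (1.0 - y_true) * np.log(1.0 - y_pred))
    return float(np.mean(per_pixel))


# Toy label: a thin horizontal "road" of 18 pixels in a 20x20 image (4.5%).
label = np.zeros((20, 20))
label[10, 1:19] = 1.0

# Prediction A: the network calls everything background.
all_background = np.full(label.shape, 0.01)
# Prediction B: the network finds the road.
good = np.where(label == 1.0, 0.99, 0.01)

ce_bg, dice_bg = pixelwise_ce_loss(label, all_background), dice_loss(label, all_background)
ce_good, dice_good = pixelwise_ce_loss(label, good), dice_loss(label, good)

print("all background: ce=%.4f dice=%.4f" % (ce_bg, dice_bg))
print("road found:     ce=%.4f dice=%.4f" % (ce_good, dice_good))
```

With these inputs the all-background prediction gets a small cross-entropy value (roughly 0.2) but a dice loss close to 1 (about 0.98). The averaged cross entropy barely penalizes missing the road entirely, while dice loss, which only looks at the overlap relative to the foreground mass, penalizes it heavily.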

- diff --git a/docs/model_zoo.md b/docs/model_zoo.md index 7e625db73a5ae185b8db00e8dd6f04e26d4e11e5..2b18260e3290561dcc7aa729ea307f23e45c26b0 100644 --- a/docs/model_zoo.md +++ b/docs/model_zoo.md @@ -1,6 +1,7 @@ # PaddleSeg 预训练模型 -PaddleSeg对所有内置的分割模型都提供了公开数据集下的预训练模型,通过加载预训练模型后训练可以在自定义数据集中得到更稳定地效果。 +PaddleSeg对所有内置的分割模型都提供了公开数据集下的预训练模型。因为对于自定 +义数据集的场景,使用预训练模型进行训练可以得到更稳定地效果。用户可以根据模型类型、自己的数据集和预训练数据集的相似程度,选择并下载预训练模型。 ## ImageNet预训练模型 @@ -32,6 +33,11 @@ PaddleSeg对所有内置的分割模型都提供了公开数据集下的预训 | HRNet_W48 | ImageNet | [hrnet_w48_imagenet.tar](https://paddleseg.bj.bcebos.com/models/hrnet_w48_imagenet.tar) | 78.95%/94.42% | | HRNet_W64 | ImageNet | [hrnet_w64_imagenet.tar](https://paddleseg.bj.bcebos.com/models/hrnet_w64_imagenet.tar) | 79.30%/94.61% | +| 模型 | 数据集合 | 下载地址 | Accuray Top1/5 Error | +|---|---|---|---| +| ResNet50(适配PSPNet) | ImageNet | [resnet50_v2_pspnet](https://paddleseg.bj.bcebos.com/resnet50_v2_pspnet.tgz)| -- | +| ResNet101(适配PSPNet) | ImageNet | [resnet101_v2_pspnet](https://paddleseg.bj.bcebos.com/resnet101_v2_pspnet.tgz)| -- | + ## COCO预训练模型 数据集为COCO实例分割数据集合转换成的语义分割数据集合 diff --git a/docs/models.md b/docs/models.md index 680dfe87356db9dd6be181e003598d3eb8967ffe..a452aa3639c3901d8f75d1aa4f5f1b7f393ce0b7 100644 --- a/docs/models.md +++ b/docs/models.md @@ -1,56 +1,74 @@ # PaddleSeg 分割模型介绍 -### U-Net -U-Net 起源于医疗图像分割,整个网络是标准的encoder-decoder网络,特点是参数少,计算快,应用性强,对于一般场景适应度很高。 +- [U-Net](#U-Net) +- [DeepLabv3+](#DeepLabv3) +- [PSPNet](#PSPNet) +- [ICNet](#ICNet) +- [HRNet](#HRNet) + +## U-Net +U-Net [1] 起源于医疗图像分割,整个网络是标准的encoder-decoder网络,特点是参数少,计算快,应用性强,对于一般场景适应度很高。U-Net最早于2015年提出,并在ISBI 2015 Cell Tracking Challenge取得了第一。经过发展,目前有多个变形和应用。 + +原始U-Net的结构如下图所示,由于网络整体结构类似于大写的英文字母U,故得名U-net。左侧可视为一个编码器,右侧可视为一个解码器。编码器有四个子模块,每个子模块包含两个卷积层,每个子模块之后通过max pool进行下采样。由于卷积使用的是valid模式,故实际输出比输入图像小一些。具体来说,后一个子模块的分辨率=(前一个子模块的分辨率-4)/2。U-Net使用了Overlap-tile 策略用于补全输入图像的上下信息,使得任意大小的输入图像都可获得无缝分割。同样解码器也包含四个子模块,分辨率通过上采样操作依次上升,直到与输入图像的分辨率基本一致。该网络还使用了跳跃连接,以拼接的方式将解码器和编码器中相同分辨率的feature 
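map进行特征融合,帮助解码器更好地恢复目标的细节。 + ![](./imgs/unet.png) -### DeepLabv3+ +## DeepLabv3+ -DeepLabv3+ is the latest paper in the DeepLab series; its predecessors are DeepLabv1, DeepLabv2, and DeepLabv3. -In this latest work, the DeepLab authors fuse multi-scale information through an encoder-decoder while keeping the original atrous convolution and ASPP layers. -Its backbone uses the Xception model, which improves the robustness and speed of semantic segmentation, and it sets a new state-of-the-art result on the PASCAL VOC 2012 dataset, 89.0 mIOU. +DeepLabv3+ [2] is the latest paper in the DeepLab series; its predecessors are DeepLabv1, DeepLabv2, and DeepLabv3. +In this latest work, the authors fuse multi-scale information through an encoder-decoder to improve segmentation quality, especially along object boundaries. +It also uses the Xception model as the backbone and applies depthwise separable convolution to both the atrous spatial pyramid pooling (ASPP) and decoder modules, which improves the robustness and speed of semantic segmentation and sets new state-of-the-art results on the PASCAL VOC 2012 and Cityscapes datasets. ![](./imgs/deeplabv3p.png) -The current PaddleSeg implementation supports switching between two classification backbone networks +The current PaddleSeg implementation supports switching between two classification backbone networks: -- MobileNetv2: +- MobileNetv2 A fast network suited to mobile devices. Use this backbone if you have high requirements on segmentation performance. -- Xception: +- Xception The backbone of the original DeepLabv3+ implementation. It balances accuracy and performance and is suited to server-side deployment. +## PSPNet + +Pyramid Scene Parsing Network (PSPNet) [3] originated in the field of scene parsing. As shown in the figure below, a plain FCN [4] makes three kinds of segmentation errors on complex scenes: (1) Mismatched relationship. A boat is misclassified as a car, although cars do not normally appear on water. (2) Confusion categories. Skyscraper and building are similar classes, and the skyscraper is misclassified as a building. (3) Inconspicuous classes. The pillow region is small and its texture resembles the bed, so the pillow is misclassified as the bed. + +![](./imgs/pspnet2.png) -### ICNet +PSPNet starts from the idea of bringing more contextual information into the algorithm to solve these problems. To fuse contextual information from different regions of the image, PSPNet builds a Pyramid Pooling Module from specially designed global average pooling operations and feature fusion. PSPNet won the 2016 ImageNet Scene Parsing Challenge and achieved the best results at the time on the PASCAL VOC 2012 and Cityscapes datasets. The overall network structure is as follows: -Image Cascade Network (ICNet) is mainly used for real-time semantic segmentation of images. Compared with other methods that compress computation, ICNet takes both speed and accuracy into account. The main idea of ICNet is to transform the input image into different resolutions, process each resolution with a sub-network of different computational complexity, and then merge the results. ICNet consists of three sub-networks: the high-complexity network processes the low-resolution input and the low-complexity network processes the high-resolution input, striking a balance between the accuracy of high-resolution images and the efficiency of low-complexity networks. +![](./imgs/pspnet.png) + + +## ICNet + +Image Cascade Network (ICNet) [5] is a semantic segmentation network based on PSPNet, designed to reduce PSPNet's inference time. ICNet is mainly used for real-time semantic segmentation of images. ICNet consists of three sub-networks operating at different resolutions: the input image is transformed into different resolutions, the high-complexity network processes the low-resolution input, and the low-complexity network processes the high-resolution input, striking a balance between the accuracy of high-resolution images and the efficiency of low-complexity networks. On top of PSPNet it also introduces a cascade feature fusion unit, yielding a fast and high-quality segmentation model. The overall network structure is as follows: ![](./imgs/icnet.png) -## References +## HRNet -- [Encoder-Decoder with Atrous Separable 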
Convolution for Semantic Image Segmentation](https://arxiv.org/abs/1802.02611) +High-Resolution Network (HRNet) [6] 在整个训练过程中始终维持高分辨率表示。 +HRNet具有两个特点:(1)从高分辨率到低分辨率并行连接各子网络,(2)反复交换跨分辨率子网络信息。这两个特点使HRNet网络能够学习到更丰富的语义信息和细节信息。 +HRNet在人体姿态估计、语义分割和目标检测领域都取得了显著的性能提升。 -- [U-Net: Convolutional Networks for Biomedical Image Segmentation](https://arxiv.org/abs/1505.04597) - -- [ICNet for Real-Time Semantic Segmentation on High-Resolution Images](https://arxiv.org/abs/1704.08545) +整个网络结构如下: -# PaddleSeg特殊网络结构介绍 +![](./imgs/hrnet.png) -### Group Norm +## 参考文献 -![](./imgs/gn.png) -关于Group Norm的介绍可以参考论文:https://arxiv.org/abs/1803.08494 +[1] [U-Net: Convolutional Networks for Biomedical Image Segmentation](https://arxiv.org/abs/1505.04597) -GN 把通道分为组,并计算每一组之内的均值和方差,以进行归一化。GN 的计算与批量大小无关,其精度也在各种批量大小下保持稳定。适应于网络参数很重的模型,比如deeplabv3+这种,可以在一个小batch下取得一个较好的训练效果。 +[2] [Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation](https://arxiv.org/abs/1802.02611) +[3] [Pyramid Scene Parsing Network](https://arxiv.org/abs/1612.01105) -### Synchronized Batch Norm +[4] [Fully Convolutional Networks for Semantic Segmentation](https://people.eecs.berkeley.edu/~jonlong/long_shelhamer_fcn.pdf) -Synchronized Batch Norm跨GPU批归一化策略最早在[MegDet: A Large Mini-Batch Object Detector](https://arxiv.org/abs/1711.07240) -论文中提出,在[Bag of Freebies for Training Object Detection Neural Networks](https://arxiv.org/pdf/1902.04103.pdf)论文中以Yolov3验证了这一策略的有效性,[PaddleCV/yolov3](https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/yolov3)实现了这一系列策略并比Darknet框架版本在COCO17数据上mAP高5.9. 
+[5] [ICNet for Real-Time Semantic Segmentation on High-Resolution Images](https://arxiv.org/abs/1704.08545) -PaddleSeg基于PaddlePaddle框架的sync_batch_norm策略,可以支持通过多卡实现大batch size的分割模型训练,可以得到更高的mIoU精度。 +[6] [Deep High-Resolution Representation Learning for Visual Recognition](https://arxiv.org/abs/1908.07919) diff --git a/docs/usage.md b/docs/usage.md index e38d16e047b4b97a71278b1ba17682d20c4586ee..6da85a2de7b8be220e955a9e20a351c2d306b489 100644 --- a/docs/usage.md +++ b/docs/usage.md @@ -1,98 +1,74 @@ -# 训练/评估/可视化 +# PaddleSeg快速入门 -PaddleSeg提供了 **训练**/**评估**/**可视化** 等三个功能的使用脚本。三个脚本都支持通过不同的Flags来开启特定功能,也支持通过Options来修改默认的[训练配置](./config.md)。三者的使用方式非常接近,如下: +本教程通过一个简单的示例,说明如何基于PaddleSeg启动训练(训练可视化)、评估和可视化。我们选择基于COCO数据集预训练的unet模型作为预训练模型,以一个眼底医疗分割数据集为例。 -```shell -# 训练 -python pdseg/train.py ${FLAGS} ${OPTIONS} -# 评估 -python pdseg/eval.py ${FLAGS} ${OPTIONS} -# 可视化 -python pdseg/vis.py ${FLAGS} ${OPTIONS} -``` - -**Note:** - -* FLAGS必须位于OPTIONS之前,否会将会遇到报错,例如如下的例子: - -```shell -# FLAGS "--cfg configs/cityscapes.yaml" 必须在 OPTIONS "BATCH_SIZE 1" 之前 -python pdseg/train.py BATCH_SIZE 1 --cfg configs/cityscapes.yaml -``` - -## 命令行FLAGS列表 - -|FLAG|支持脚本|用途|默认值|备注| -|-|-|-|-|-| -|--cfg|ALL|配置文件路径|None|| -|--use_gpu|ALL|是否使用GPU进行训练|False|| -|--use_mpio|train/eval|是否使用多进程进行IO处理|False|打开该开关会占用一定量的CPU内存,但是可以提高训练速度。
**NOTE:** windows平台下不支持该功能, 建议使用自定义数据初次训练时不打开,打开会导致数据读取异常不可见。
| -|--use_tb|train|是否使用TensorBoard记录训练数据|False|| -|--log_steps|train|训练日志的打印周期(单位为step)|10|| -|--debug|train|是否打印debug信息|False|IOU等指标涉及到混淆矩阵的计算,会降低训练速度| -|--tb_log_dir|train|TensorBoard的日志路径|None|| -|--do_eval|train|是否在保存模型时进行效果评估|False|| -|--vis_dir|vis|保存可视化图片的路径|"visual"|| -|--also_save_raw_results|vis|是否保存原始的预测图片|False|| - -## OPTIONS - -详见[训练配置](./config.md) +- [1.准备工作](#1准备工作) +- [2.下载待训练数据](#2下载待训练数据) +- [3.下载预训练模型](#3下载预训练模型) +- [4.模型训练](#4模型训练) +- [5.训练过程可视化](#5训练过程可视化) +- [6.模型评估](#6模型评估) +- [7.模型可视化](#7模型可视化) +- [在线体验](#在线体验) -## 使用示例 -下面通过一个简单的示例,说明如何基于PaddleSeg提供的预训练模型启动训练。我们选择基于COCO数据集预训练的unet模型作为预训练模型,在一个Oxford-IIIT Pet数据集上进行训练。 -**Note:** 为了快速体验,我们使用Oxford-IIIT Pet做了一个小型数据集,后续数据都使用该小型数据集。 -### 准备工作 +## 1.准备工作 在开始教程前,请先确认准备工作已经完成: 1. 正确安装了PaddlePaddle 2. PaddleSeg相关依赖已经安装 -如果有不确认的地方,请参考[安装说明](./installation.md) +如果有不确认的地方,请参考[首页安装说明](../README.md#安装) + +## 2.下载待训练数据 + +![](../turtorial/imgs/optic.png) + +我们提前准备好了一份眼底医疗分割数据集--视盘分割(optic disc segmentation),包含267张训练图片、76张验证图片、38张测试图片。通过以下命令进行下载: -### 下载预训练模型 ```shell -# 下载预训练模型并进行解压 -python pretrained_model/download_model.py unet_bn_coco +# 下载待训练数据集 +python dataset/download_optic.py ``` -### 下载Oxford-IIIT Pet数据集 -我们使用了Oxford-IIIT中的猫和狗两个类别数据制作了一个小数据集mini_pet,用于快速体验。 -更多关于数据集的介绍情参考[Oxford-IIIT Pet](https://www.robots.ox.ac.uk/~vgg/data/pets/) +## 3.下载预训练模型 ```shell # 下载预训练模型并进行解压 -python dataset/download_pet.py +python pretrained_model/download_model.py unet_bn_coco ``` -### 模型训练 +## 4.模型训练 -为了方便体验,我们在configs目录下放置了mini_pet所对应的配置文件`unet_pet.yaml`,可以通过`--cfg`指向该文件来设置训练配置。 +为了方便体验,我们在configs目录下放置了配置文件`unet_optic.yaml`,可以通过`--cfg`指向该文件来设置训练配置。 -我们选择GPU 0号卡进行训练,这可以通过环境变量`CUDA_VISIBLE_DEVICES`来指定。 +可以通过环境变量`CUDA_VISIBLE_DEVICES`来指定GPU卡号。 ``` +# 指定GPU卡号(以0号卡为例) export CUDA_VISIBLE_DEVICES=0 -python pdseg/train.py --use_gpu \ +# 训练 +python pdseg/train.py --cfg configs/unet_optic.yaml \ + --use_gpu \ --do_eval \ --use_tb \ --tb_log_dir train_log \ - --cfg configs/unet_pet.yaml \ BATCH_SIZE 4 \ - 
TRAIN.PRETRAINED_MODEL_DIR pretrained_model/unet_bn_coco \ - SOLVER.LR 5e-5 + SOLVER.LR 0.001 + +``` +若需要使用多块GPU,以0、1、2号卡为例,可输入 +``` +export CUDA_VISIBLE_DEVICES=0,1,2 ``` **NOTE:** -* 上述示例中,一共存在三套配置方案: PaddleSeg默认配置/unet_pet.yaml/OPTIONS,三者的优先级顺序为 OPTIONS > yaml > 默认配置。这个原则对于train.py/eval.py/vis.py都适用 - -* 如果发现因为内存不足而Crash。请适当调低BATCH_SIZE。如果本机GPU内存充足,则可以调高BATCH_SIZE的大小以获得更快的训练速度,BATCH_SIZE增大时,可以适当调高学习率。 +* 如果发现因为内存不足而Crash。请适当调低`BATCH_SIZE`。如果本机GPU内存充足,则可以调高`BATCH_SIZE`的大小以获得更快的训练速度,`BATCH_SIZE`增大时,可以适当调高学习率`SOLVER.LR`. * 如果在Linux系统下训练,可以使用`--use_mpio`使用多进程I/O,通过提升数据增强的处理速度进而大幅度提升GPU利用率。 -### 训练过程可视化 +## 5.训练过程可视化 当打开do_eval和use_tb两个开关后,我们可以通过TensorBoard查看边训练边评估的效果。 @@ -101,40 +77,42 @@ tensorboard --logdir train_log --host {$HOST_IP} --port {$PORT} ``` NOTE: -1. 上述示例中,$HOST\_IP为机器IP地址,请替换为实际IP,$PORT请替换为可访问的端口 -2. 数据量较大时,前端加载速度会比较慢,请耐心等待 +1. 上述示例中,$HOST\_IP为机器IP地址,请替换为实际IP,$PORT请替换为可访问的端口。 +2. 数据量较大时,前端加载速度会比较慢,请耐心等待。 -启动TensorBoard命令后,我们可以在浏览器中查看对应的训练数据 -在`SCALAR`这个tab中,查看训练loss、iou、acc的变化趋势 +启动TensorBoard命令后,我们可以在浏览器中查看对应的训练数据。 +在`SCALAR`这个tab中,查看训练loss、iou、acc的变化趋势。 ![](./imgs/tensorboard_scalar.JPG) -在`IMAGE`这个tab中,查看样本的预测情况 +在`IMAGE`这个tab中,查看样本图片。 ![](./imgs/tensorboard_image.JPG) -### 模型评估 -训练完成后,我们可以通过eval.py来评估模型效果。由于我们设置的训练EPOCH数量为100,保存间隔为10,因此一共会产生10个定期保存的模型,加上最终保存的final模型,一共有11个模型。我们选择最后保存的模型进行效果的评估: +## 6.模型评估 +训练完成后,我们可以通过eval.py来评估模型效果。由于我们设置的训练EPOCH数量为10,保存间隔为5,因此一共会产生2个定期保存的模型,加上最终保存的final模型,一共有3个模型。我们选择最后保存的模型进行效果的评估: ```shell python pdseg/eval.py --use_gpu \ - --cfg configs/unet_pet.yaml \ - TEST.TEST_MODEL saved_model/unet_pet/final + --cfg configs/unet_optic.yaml \ + TEST.TEST_MODEL saved_model/unet_optic/final ``` -可以看到,在经过训练后,模型在验证集上的mIoU指标达到了0.70+(由于随机种子等因素的影响,效果会有小范围波动,属于正常情况)。 +可以看到,在经过训练后,模型在验证集上的mIoU指标达到了0.85+(由于随机种子等因素的影响,效果会有小范围波动,属于正常情况)。 -### 模型可视化 -通过vis.py来评估模型效果,我们选择最后保存的模型进行效果的评估: +## 7.模型可视化 +通过vis.py进行测试和可视化,以选择最后保存的模型进行测试为例: ```shell python pdseg/vis.py --use_gpu \ - --cfg configs/unet_pet.yaml \ - TEST.TEST_MODEL 
saved_model/unet_pet/final + --cfg configs/unet_optic.yaml \ + TEST.TEST_MODEL saved_model/unet_optic/final ``` -执行上述脚本后,会在主目录下产生一个visual/visual_results文件夹,里面存放着测试集图片的预测结果,我们选择其中几张图片进行查看,可以看到,在测试集中的图片上的预测效果已经很不错: +执行上述脚本后,会在主目录下产生一个visual文件夹,里面存放着测试集图片的预测结果,我们选择其中1张图片进行查看: ![](./imgs/usage_vis_demo.jpg) -![](./imgs/usage_vis_demo2.jpg) -![](./imgs/usage_vis_demo3.jpg) `NOTE` -1. 可视化的图片会默认保存在visual/visual_results目录下,可以通过`--vis_dir`来指定输出目录 -2. 训练过程中会使用DATASET.VIS_FILE_LIST中的图片进行可视化显示,而vis.py则会使用DATASET.TEST_FILE_LIST +1. 可视化的图片会默认保存在visual目录下,可以通过`--vis_dir`来指定输出目录。 +2. 训练过程中会使用`DATASET.VIS_FILE_LIST`中的图片进行可视化显示,而vis.py则会使用`DATASET.TEST_FILE_LIST`. + +## 在线体验 + +PaddleSeg在AI Studio平台上提供了在线体验的快速入门教程,欢迎[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/100798) diff --git a/pdseg/__init__.py b/pdseg/__init__.py index 7f051e1e16ed29046c6ea46e341d62e4280f412d..5a1851ecb5fc0575deb449110d69da3087282719 100644 --- a/pdseg/__init__.py +++ b/pdseg/__init__.py @@ -14,3 +14,4 @@ # limitations under the License. 
import models import utils +import tools \ No newline at end of file diff --git a/pdseg/data_aug.py b/pdseg/data_aug.py index 15186150a3734a3a0c026386a04206ac036c7858..ae976bf7e4bb6751ba6ec4186a137cbf5644ce84 100644 --- a/pdseg/data_aug.py +++ b/pdseg/data_aug.py @@ -327,7 +327,7 @@ def random_jitter(cv_img, saturation_range, brightness_range, contrast_range): brightness_ratio = np.random.uniform(-brightness_range, brightness_range) contrast_ratio = np.random.uniform(-contrast_range, contrast_range) - order = [1, 2, 3] + order = [0, 1, 2] np.random.shuffle(order) for i in range(3): @@ -368,7 +368,7 @@ def hsv_color_jitter(crop_img, def rand_crop(crop_img, crop_seg, mode=ModelPhase.TRAIN): """ - 随机裁剪图片和标签图, 若crop尺寸大于原始尺寸,分别使用均值和ignore值填充再进行crop, + 随机裁剪图片和标签图, 若crop尺寸大于原始尺寸,分别使用DATASET.PADDING_VALUE值和DATASET.IGNORE_INDEX值填充再进行crop, crop尺寸与原始尺寸一致,返回原图,crop尺寸小于原始尺寸直接crop Args: diff --git a/pdseg/tools/create_dataset_list.py b/pdseg/tools/create_dataset_list.py index aca6d95d20bc645c1843399c99f5e56d4560f7f8..6c7d7c943c9baf916533621d353d5f2700388a01 100644 --- a/pdseg/tools/create_dataset_list.py +++ b/pdseg/tools/create_dataset_list.py @@ -116,18 +116,19 @@ def generate_list(args): label_files = get_files(1, dataset_split, args) if not image_files: img_dir = os.path.join(dataset_root, args.folder[0], dataset_split) - print("No files in {}".format(img_dir)) + warnings.warn("No images in {} !!!".format(img_dir)) num_images = len(image_files) if not label_files: label_dir = os.path.join(dataset_root, args.folder[1], dataset_split) - print("No files in {}".format(label_dir)) + warnings.warn("No labels in {} !!!".format(label_dir)) num_label = len(label_files) - if num_images < num_label: - warnings.warn("number of images = {} < number of labels = {}." 
- .format(num_images, num_label)) - continue + if num_images != num_label and num_label > 0: + raise Exception("Number of images = {} number of labels = {} \n" + "Either number of images is equal to number of labels, " + "or number of labels is equal to 0.\n" + "Please check your dataset!".format(num_images, num_label)) file_list = os.path.join(dataset_root, dataset_split + '.txt') with open(file_list, "w") as f: diff --git a/pdseg/tools/gray2pseudo_color.py b/pdseg/tools/gray2pseudo_color.py index b385049172c4b134aca849682cbf76193c569f62..3627db0b216175b04a50d9012999d441f4df69fb 100644 --- a/pdseg/tools/gray2pseudo_color.py +++ b/pdseg/tools/gray2pseudo_color.py @@ -2,13 +2,11 @@ from __future__ import print_function import argparse -import glob import os import os.path as osp import sys import numpy as np from PIL import Image -from pdseg.vis import get_color_map_list def parse_args(): @@ -26,6 +24,28 @@ def parse_args(): return parser.parse_args() +def get_color_map_list(num_classes): + """ Returns the color map for visualizing the segmentation mask, + which can support arbitrary number of classes. 
+ Args: + num_classes: Number of classes + Returns: + The color map + """ + color_map = num_classes * [0, 0, 0] + for i in range(0, num_classes): + j = 0 + lab = i + while lab: + color_map[i * 3] |= (((lab >> 0) & 1) << (7 - j)) + color_map[i * 3 + 1] |= (((lab >> 1) & 1) << (7 - j)) + color_map[i * 3 + 2] |= (((lab >> 2) & 1) << (7 - j)) + j += 1 + lab >>= 3 + + return color_map + + def gray2pseudo_color(args): """将灰度标注图片转换为伪彩色图片""" input = args.dir_or_file @@ -36,18 +56,28 @@ def gray2pseudo_color(args): color_map = get_color_map_list(256) if os.path.isdir(input): - for grt_path in glob.glob(osp.join(input, '*.png')): - print('Converting original label:', grt_path) - basename = osp.basename(grt_path) + for fpath, dirs, fs in os.walk(input): + for f in fs: + try: + grt_path = osp.join(fpath, f) + _output_dir = fpath.replace(input, '') + _output_dir = _output_dir.lstrip(os.path.sep) - im = Image.open(grt_path) - lbl = np.asarray(im) + im = Image.open(grt_path) + lbl = np.asarray(im) - lbl_pil = Image.fromarray(lbl.astype(np.uint8), mode='P') - lbl_pil.putpalette(color_map) + lbl_pil = Image.fromarray(lbl.astype(np.uint8), mode='P') + lbl_pil.putpalette(color_map) - new_file = osp.join(output_dir, basename) - lbl_pil.save(new_file) + real_dir = osp.join(output_dir, _output_dir) + if not osp.exists(real_dir): + os.makedirs(real_dir) + new_grt_path = osp.join(real_dir, f) + + lbl_pil.save(new_grt_path) + print('New label path:', new_grt_path) + except: + continue elif os.path.isfile(input): if args.dataset_dir is None or args.file_separator is None: print('No dataset_dir or file_separator input!') @@ -58,17 +88,20 @@ def gray2pseudo_color(args): grt_name = parts[1] grt_path = os.path.join(args.dataset_dir, grt_name) - print('Converting original label:', grt_path) - basename = osp.basename(grt_path) - im = Image.open(grt_path) lbl = np.asarray(im) lbl_pil = Image.fromarray(lbl.astype(np.uint8), mode='P') lbl_pil.putpalette(color_map) - new_file = osp.join(output_dir, 
basename) - lbl_pil.save(new_file) + grt_dir, _ = osp.split(grt_name) + new_dir = osp.join(output_dir, grt_dir) + if not osp.exists(new_dir): + os.makedirs(new_dir) + new_grt_path = osp.join(output_dir, grt_name) + + lbl_pil.save(new_grt_path) + print('New label path:', new_grt_path) else: print('It\'s neither a dir nor a file') diff --git a/pdseg/tools/jingling2seg.py b/pdseg/tools/jingling2seg.py index 9c1d663685cb357017387c54ed25115e6117408e..28bce3b0436242f5174087c0852dde99a7878684 100644 --- a/pdseg/tools/jingling2seg.py +++ b/pdseg/tools/jingling2seg.py @@ -12,7 +12,7 @@ import numpy as np import PIL.Image import labelme -from pdseg.vis import get_color_map_list +from gray2pseudo_color import get_color_map_list def parse_args(): diff --git a/pdseg/tools/labelme2seg.py b/pdseg/tools/labelme2seg.py index be1c99ee32c249cda29fea3d628b707415bf8b23..6ae3ad3a50a6df750ce321d94b7235ef57dcf80b 100755 --- a/pdseg/tools/labelme2seg.py +++ b/pdseg/tools/labelme2seg.py @@ -12,7 +12,7 @@ import numpy as np import PIL.Image import labelme -from pdseg.vis import get_color_map_list +from gray2pseudo_color import get_color_map_list def parse_args(): diff --git a/pdseg/utils/config.py b/pdseg/utils/config.py index d321aa4f8475fcdef7645fbd051aa26deeed3221..1beff8f055479e5ae6a2cb982ea50d9ea2a900da 100644 --- a/pdseg/utils/config.py +++ b/pdseg/utils/config.py @@ -72,17 +72,11 @@ cfg.DATASET.IGNORE_INDEX = 255 cfg.DATASET.PADDING_VALUE = [127.5, 127.5, 127.5] ########################### 数据增强配置 ###################################### -# 图像镜像左右翻转 -cfg.AUG.MIRROR = True -# 图像上下翻转开关,True/False -cfg.AUG.FLIP = False -# 图像启动上下翻转的概率,0-1 -cfg.AUG.FLIP_RATIO = 0.5 -# 图像resize的固定尺寸(宽,高),非负 -cfg.AUG.FIX_RESIZE_SIZE = tuple() # 图像resize的方式有三种: # unpadding(固定尺寸),stepscaling(按比例resize),rangescaling(长边对齐) -cfg.AUG.AUG_METHOD = 'rangescaling' +cfg.AUG.AUG_METHOD = 'unpadding' +# 图像resize的固定尺寸(宽,高),非负 +cfg.AUG.FIX_RESIZE_SIZE = (512, 512) # 图像resize方式为stepscaling,resize最小尺度,非负 
cfg.AUG.MIN_SCALE_FACTOR = 0.5 # 图像resize方式为stepscaling,resize最大尺度,不小于MIN_SCALE_FACTOR @@ -98,6 +92,13 @@ cfg.AUG.MAX_RESIZE_VALUE = 600 # 在MIN_RESIZE_VALUE到MAX_RESIZE_VALUE范围内 cfg.AUG.INF_RESIZE_VALUE = 500 +# 图像镜像左右翻转 +cfg.AUG.MIRROR = True +# 图像上下翻转开关,True/False +cfg.AUG.FLIP = False +# 图像启动上下翻转的概率,0-1 +cfg.AUG.FLIP_RATIO = 0.5 + # RichCrop数据增广开关,用于提升模型鲁棒性 cfg.AUG.RICH_CROP.ENABLE = False # 图像旋转最大角度,0-90 @@ -167,7 +168,7 @@ cfg.SOLVER.CROSS_ENTROPY_WEIGHT = None cfg.TEST.TEST_MODEL = '' ########################## 模型通用配置 ####################################### -# 模型名称, 支持deeplab, unet, icnet三种 +# 模型名称, 已支持deeplabv3p, unet, icnet,pspnet,hrnet cfg.MODEL.MODEL_NAME = '' # BatchNorm类型: bn、gn(group_norm) cfg.MODEL.DEFAULT_NORM_TYPE = 'bn' diff --git a/pdseg/vis.py b/pdseg/vis.py index 9fc349a3876f2667f8cc86bc1b9556594acfa638..d94221c0be1a0b4abe241e75966215863d8fd35d 100644 --- a/pdseg/vis.py +++ b/pdseg/vis.py @@ -34,6 +34,7 @@ from utils.config import cfg from reader import SegDataset from models.model_builder import build_model from models.model_builder import ModelPhase +from tools.gray2pseudo_color import get_color_map_list def parse_args(): @@ -73,28 +74,6 @@ def makedirs(directory): os.makedirs(directory) -def get_color_map_list(num_classes): - """ Returns the color map for visualizing the segmentation mask, - which can support arbitrary number of classes. 
- Args: - num_classes: Number of classes - Returns: - The color map - """ - color_map = num_classes * [0, 0, 0] - for i in range(0, num_classes): - j = 0 - lab = i - while lab: - color_map[i * 3] |= (((lab >> 0) & 1) << (7 - j)) - color_map[i * 3 + 1] |= (((lab >> 1) & 1) << (7 - j)) - color_map[i * 3 + 2] |= (((lab >> 2) & 1) << (7 - j)) - j += 1 - lab >>= 3 - - return color_map - - def to_png_fn(fn): """ Append png as filename postfix @@ -108,7 +87,7 @@ def to_png_fn(fn): def visualize(cfg, vis_file_list=None, use_gpu=False, - vis_dir="visual_predict", + vis_dir="visual", ckpt_dir=None, log_writer=None, local_test=False, @@ -138,7 +117,7 @@ def visualize(cfg, fluid.io.load_params(exe, ckpt_dir, main_program=test_prog) - save_dir = os.path.join('visual', vis_dir) + save_dir = vis_dir makedirs(save_dir) fetch_list = [pred.name] diff --git a/turtorial/finetune_deeplabv3plus.md b/turtorial/finetune_deeplabv3plus.md index 35fb677d9d416512a79ded14bcdcadf516aa6b70..89d4c801bfc7c21c2a26aa1ba1c6e41f419d6bbe 100644 --- a/turtorial/finetune_deeplabv3plus.md +++ b/turtorial/finetune_deeplabv3plus.md @@ -1,29 +1,32 @@ -# DeepLabv3+模型训练教程 +# DeepLabv3+模型使用教程 -* 本教程旨在介绍如何通过使用PaddleSeg提供的 ***`DeeplabV3+/Xception65/BatchNorm`*** 预训练模型在自定义数据集上进行训练。除了该配置之外,DeeplabV3+还支持以下不同[模型组合](#模型组合)的预训练模型,如果需要使用对应模型作为预训练模型,将下述内容中的Xception Backbone中的内容进行替换即可 +本教程旨在介绍如何使用`DeepLabv3+`预训练模型在自定义数据集上进行训练、评估和可视化。我们以`DeeplabV3+/Xception65/BatchNorm`预训练模型为例。 -* 在阅读本教程前,请确保您已经了解过PaddleSeg的[快速入门](../README.md#快速入门)和[基础功能](../README.md#基础功能)等章节,以便对PaddleSeg有一定的了解 +* 在阅读本教程前,请确保您已经了解过PaddleSeg的[快速入门](../README.md#快速入门)和[基础功能](../README.md#基础功能)等章节,以便对PaddleSeg有一定的了解。 -* 本教程的所有命令都基于PaddleSeg主目录进行执行 +* 本教程的所有命令都基于PaddleSeg主目录进行执行。 ## 一. 准备待训练数据 -我们提前准备好了一份数据集,通过以下代码进行下载 +![](./imgs/optic.png) + +我们提前准备好了一份眼底医疗分割数据集,包含267张训练图片、76张验证图片、38张测试图片。通过以下命令进行下载: ```shell -python dataset/download_pet.py +python dataset/download_optic.py ``` ## 二. 
下载预训练模型 -关于PaddleSeg支持的所有预训练模型的列表,我们可以从[模型组合](#模型组合)中查看我们所需模型的名字和配置 - 接着下载对应的预训练模型 ```shell python pretrained_model/download_model.py deeplabv3p_xception65_bn_coco ``` +关于已有的DeepLabv3+预训练模型的列表,请参见[模型组合](#模型组合)。如果需要使用其他预训练模型,下载该模型并将配置中的BACKBONE、NORM_TYPE等进行替换即可。 + + ## 三. 准备配置 接着我们需要确定相关配置,从本教程的角度,配置分为三部分: @@ -45,19 +48,19 @@ python pretrained_model/download_model.py deeplabv3p_xception65_bn_coco 在三者中,预训练模型的配置尤为重要,如果模型或者BACKBONE配置错误,会导致预训练的参数没有加载,进而影响收敛速度。预训练模型相关的配置如第二步所展示。 -数据集的配置和数据路径有关,在本教程中,数据存放在`dataset/mini_pet`中 +数据集的配置和数据路径有关,在本教程中,数据存放在`dataset/optic_disc_seg`中。 -其他配置则根据数据集和机器环境的情况进行调节,最终我们保存一个如下内容的yaml配置文件,存放路径为**configs/deeplabv3p_xception65_pet.yaml** +其他配置则根据数据集和机器环境的情况进行调节,最终我们保存一个如下内容的yaml配置文件,存放路径为**configs/deeplabv3p_xception65_optic.yaml**。 ```yaml # 数据集配置 DATASET: - DATA_DIR: "./dataset/mini_pet/" - NUM_CLASSES: 3 - TEST_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt" - TRAIN_FILE_LIST: "./dataset/mini_pet/file_list/train_list.txt" - VAL_FILE_LIST: "./dataset/mini_pet/file_list/val_list.txt" - VIS_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt" + DATA_DIR: "./dataset/optic_disc_seg/" + NUM_CLASSES: 2 + TEST_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt" + TRAIN_FILE_LIST: "./dataset/optic_disc_seg/train_list.txt" + VAL_FILE_LIST: "./dataset/optic_disc_seg/val_list.txt" + VIS_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt" # 预训练模型配置 MODEL: @@ -75,15 +78,15 @@ AUG: BATCH_SIZE: 4 TRAIN: PRETRAINED_MODEL_DIR: "./pretrained_model/deeplabv3p_xception65_bn_coco/" - MODEL_SAVE_DIR: "./saved_model/deeplabv3p_xception65_bn_pet/" - SNAPSHOT_EPOCH: 10 + MODEL_SAVE_DIR: "./saved_model/deeplabv3p_xception65_bn_optic/" + SNAPSHOT_EPOCH: 5 TEST: - TEST_MODEL: "./saved_model/deeplabv3p_xception65_bn_pet/final" + TEST_MODEL: "./saved_model/deeplabv3p_xception65_bn_optic/final" SOLVER: - NUM_EPOCHS: 100 - LR: 0.005 + NUM_EPOCHS: 10 + LR: 0.001 LR_POLICY: "poly" - OPTIMIZER: "sgd" + OPTIMIZER: "adam" ``` ## 四. 
配置/数据校验 @@ -91,7 +94,7 @@ SOLVER: 在开始训练和评估之前,我们还需要对配置和数据进行一次校验,确保数据和配置是正确的。使用下述命令启动校验流程 ```shell -python pdseg/check.py --cfg ./configs/deeplabv3p_xception65_pet.yaml +python pdseg/check.py --cfg ./configs/deeplabv3p_xception65_optic.yaml ``` @@ -100,7 +103,10 @@ python pdseg/check.py --cfg ./configs/deeplabv3p_xception65_pet.yaml 校验通过后,使用下述命令启动训练 ```shell -python pdseg/train.py --use_gpu --cfg ./configs/deeplabv3p_xception65_pet.yaml +# 指定GPU卡号(以0号卡为例) +export CUDA_VISIBLE_DEVICES=0 +# 训练 +python pdseg/train.py --use_gpu --cfg ./configs/deeplabv3p_xception65_optic.yaml ``` ## 六. 进行评估 @@ -108,22 +114,39 @@ python pdseg/train.py --use_gpu --cfg ./configs/deeplabv3p_xception65_pet.yaml 模型训练完成,使用下述命令启动评估 ```shell -python pdseg/eval.py --use_gpu --cfg ./configs/deeplabv3p_xception65_pet.yaml +python pdseg/eval.py --use_gpu --cfg ./configs/deeplabv3p_xception65_optic.yaml +``` + +## 七. 进行可视化 + +使用下述命令启动预测和可视化 + +```shell +python pdseg/vis.py --use_gpu --cfg ./configs/deeplabv3p_xception65_optic.yaml ``` +预测结果将保存在`visual`目录下,以下展示其中1张图片的预测效果: + +![](imgs/optic_deeplab.png) + +## 在线体验 + +PaddleSeg在AI Studio平台上提供了在线体验的DeepLabv3+图像分割教程,欢迎[点击体验](https://aistudio.baidu.com/aistudio/projectDetail/101696)。 + + ## 模型组合 -|预训练模型名称|BackBone|Norm Type|数据集|配置| +|预训练模型名称|Backbone|Norm Type|数据集|配置| |-|-|-|-|-| -|mobilenetv2-2-0_bn_imagenet|-|bn|ImageNet|MODEL.MODEL_NAME: deeplabv3p
MODEL.DEEPLAB.BACKBONE: mobilenetv2
MODEL.DEEPLAB.DEPTH_MULTIPLIER: 2.0
MODEL.DEFAULT_NORM_TYPE: bn| -|mobilenetv2-1-5_bn_imagenet|-|bn|ImageNet|MODEL.MODEL_NAME: deeplabv3p
MODEL.DEEPLAB.BACKBONE: mobilenetv2
MODEL.DEEPLAB.DEPTH_MULTIPLIER: 1.5
MODEL.DEFAULT_NORM_TYPE: bn| -|mobilenetv2-1-0_bn_imagenet|-|bn|ImageNet|MODEL.MODEL_NAME: deeplabv3p
MODEL.DEEPLAB.BACKBONE: mobilenetv2
MODEL.DEEPLAB.DEPTH_MULTIPLIER: 1.0
MODEL.DEFAULT_NORM_TYPE: bn| -|mobilenetv2-0-5_bn_imagenet|-|bn|ImageNet|MODEL.MODEL_NAME: deeplabv3p
MODEL.DEEPLAB.BACKBONE: mobilenetv2
MODEL.DEEPLAB.DEPTH_MULTIPLIER: 0.5
MODEL.DEFAULT_NORM_TYPE: bn| -|mobilenetv2-0-25_bn_imagenet|-|bn|ImageNet|MODEL.MODEL_NAME: deeplabv3p
MODEL.DEEPLAB.BACKBONE: mobilenetv2
MODEL.DEEPLAB.DEPTH_MULTIPLIER: 0.25
MODEL.DEFAULT_NORM_TYPE: bn| -|xception41_imagenet|-|bn|ImageNet|MODEL.MODEL_NAME: deeplabv3p
MODEL.DEEPLAB.BACKBONE: xception_41
MODEL.DEFAULT_NORM_TYPE: bn| -|xception65_imagenet|-|bn|ImageNet|MODEL.MODEL_NAME: deeplabv3p
MODEL.DEEPLAB.BACKBONE: xception_65
MODEL.DEFAULT_NORM_TYPE: bn| -|deeplabv3p_mobilenetv2-1-0_bn_coco|MobileNet V2|bn|COCO|MODEL.MODEL_NAME: deeplabv3p
MODEL.DEEPLAB.BACKBONE: mobilenetv2
MODEL.DEEPLAB.DEPTH_MULTIPLIER: 1.0
MODEL.DEEPLAB.ENCODER_WITH_ASPP: False
MODEL.DEEPLAB.ENABLE_DECODER: False
MODEL.DEFAULT_NORM_TYPE: bn| -|**deeplabv3p_xception65_bn_coco**|Xception|bn|COCO|MODEL.MODEL_NAME: deeplabv3p
MODEL.DEEPLAB.BACKBONE: xception_65
MODEL.DEFAULT_NORM_TYPE: bn | -|deeplabv3p_mobilenetv2-1-0_bn_cityscapes|MobileNet V2|bn|Cityscapes|MODEL.MODEL_NAME: deeplabv3p
MODEL.DEEPLAB.BACKBONE: mobilenetv2
MODEL.DEEPLAB.DEPTH_MULTIPLIER: 1.0
MODEL.DEEPLAB.ENCODER_WITH_ASPP: False
MODEL.DEEPLAB.ENABLE_DECODER: False
MODEL.DEFAULT_NORM_TYPE: bn| -|deeplabv3p_xception65_gn_cityscapes|Xception|gn|Cityscapes|MODEL.MODEL_NAME: deeplabv3p
MODEL.DEEPLAB.BACKBONE: xception_65
MODEL.DEFAULT_NORM_TYPE: gn| -|deeplabv3p_xception65_bn_cityscapes|Xception|bn|Cityscapes|MODEL.MODEL_NAME: deeplabv3p
MODEL.DEEPLAB.BACKBONE: xception_65
MODEL.DEFAULT_NORM_TYPE: bn| +|mobilenetv2-2-0_bn_imagenet|MobileNetV2|bn|ImageNet|MODEL.MODEL_NAME: deeplabv3p
MODEL.DEEPLAB.BACKBONE: mobilenetv2
MODEL.DEEPLAB.DEPTH_MULTIPLIER: 2.0
MODEL.DEFAULT_NORM_TYPE: bn| +|mobilenetv2-1-5_bn_imagenet|MobileNetV2|bn|ImageNet|MODEL.MODEL_NAME: deeplabv3p
MODEL.DEEPLAB.BACKBONE: mobilenetv2
MODEL.DEEPLAB.DEPTH_MULTIPLIER: 1.5
MODEL.DEFAULT_NORM_TYPE: bn| +|mobilenetv2-1-0_bn_imagenet|MobileNetV2|bn|ImageNet|MODEL.MODEL_NAME: deeplabv3p
MODEL.DEEPLAB.BACKBONE: mobilenetv2
MODEL.DEEPLAB.DEPTH_MULTIPLIER: 1.0
MODEL.DEFAULT_NORM_TYPE: bn| +|mobilenetv2-0-5_bn_imagenet|MobileNetV2|bn|ImageNet|MODEL.MODEL_NAME: deeplabv3p
MODEL.DEEPLAB.BACKBONE: mobilenetv2
MODEL.DEEPLAB.DEPTH_MULTIPLIER: 0.5
MODEL.DEFAULT_NORM_TYPE: bn| +|mobilenetv2-0-25_bn_imagenet|MobileNetV2|bn|ImageNet|MODEL.MODEL_NAME: deeplabv3p
MODEL.DEEPLAB.BACKBONE: mobilenetv2
MODEL.DEEPLAB.DEPTH_MULTIPLIER: 0.25
MODEL.DEFAULT_NORM_TYPE: bn| +|xception41_imagenet|Xception41|bn|ImageNet|MODEL.MODEL_NAME: deeplabv3p
MODEL.DEEPLAB.BACKBONE: xception_41
MODEL.DEFAULT_NORM_TYPE: bn| +|xception65_imagenet|Xception65|bn|ImageNet|MODEL.MODEL_NAME: deeplabv3p
MODEL.DEEPLAB.BACKBONE: xception_65
MODEL.DEFAULT_NORM_TYPE: bn| +|deeplabv3p_mobilenetv2-1-0_bn_coco|MobileNetV2|bn|COCO|MODEL.MODEL_NAME: deeplabv3p
MODEL.DEEPLAB.BACKBONE: mobilenetv2
MODEL.DEEPLAB.DEPTH_MULTIPLIER: 1.0
MODEL.DEEPLAB.ENCODER_WITH_ASPP: False
MODEL.DEEPLAB.ENABLE_DECODER: False
MODEL.DEFAULT_NORM_TYPE: bn| +|**deeplabv3p_xception65_bn_coco**|Xception65|bn|COCO|MODEL.MODEL_NAME: deeplabv3p
MODEL.DEEPLAB.BACKBONE: xception_65
MODEL.DEFAULT_NORM_TYPE: bn | +|deeplabv3p_mobilenetv2-1-0_bn_cityscapes|MobileNetV2|bn|Cityscapes|MODEL.MODEL_NAME: deeplabv3p
MODEL.DEEPLAB.BACKBONE: mobilenetv2
MODEL.DEEPLAB.DEPTH_MULTIPLIER: 1.0
MODEL.DEEPLAB.ENCODER_WITH_ASPP: False
MODEL.DEEPLAB.ENABLE_DECODER: False
MODEL.DEFAULT_NORM_TYPE: bn| +|deeplabv3p_xception65_gn_cityscapes|Xception65|gn|Cityscapes|MODEL.MODEL_NAME: deeplabv3p
MODEL.DEEPLAB.BACKBONE: xception_65
MODEL.DEFAULT_NORM_TYPE: gn| +|deeplabv3p_xception65_bn_cityscapes|Xception65|bn|Cityscapes|MODEL.MODEL_NAME: deeplabv3p
MODEL.DEEPLAB.BACKBONE: xception_65
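MODEL.DEFAULT_NORM_TYPE: bn|
diff --git a/turtorial/finetune_hrnet.md b/turtorial/finetune_hrnet.md
index f7feb9ddafd909fa829cf5f3e3d1c66c82505f57..9475a8aab8386364ab6be7e976ac30dae73d4645 100644
--- a/turtorial/finetune_hrnet.md
+++ b/turtorial/finetune_hrnet.md
@@ -1,22 +1,23 @@
-# HRNet Model Training Tutorial
+# HRNet Model Usage Tutorial

-* This tutorial shows how to train on a custom dataset using the ***`HRNet`*** pretrained model provided by PaddleSeg.
+This tutorial shows how to train, evaluate, and visualize on a custom dataset using the ***`HRNet`*** pretrained model provided by PaddleSeg.

-* Before reading this tutorial, make sure you have read the [Quick Start](../README.md#快速入门), [Basic Features](../README.md#基础功能), and related sections of PaddleSeg, so that you have a basic understanding of it
+* Before reading this tutorial, make sure you have read the [Quick Start](../README.md#快速入门), [Basic Features](../README.md#基础功能), and related sections of PaddleSeg, so that you have a basic understanding of it.

-* All commands in this tutorial are run from the PaddleSeg root directory
+* All commands in this tutorial are run from the PaddleSeg root directory.

 ## 1. Prepare the training data

-We have prepared a dataset in advance; download it with the following code
+![](./imgs/optic.png)
+
+We have prepared a fundus medical segmentation dataset in advance, containing 267 training images, 76 validation images, and 38 test images. Download it with the following command:

 ```shell
-python dataset/download_pet.py
+python dataset/download_optic.py
 ```

-## 2. Download the pretrained model

-For the list of all pretrained models supported by PaddleSeg, see [Model Combinations](#模型组合) for the name and configuration of the model we need
+## 2. Download the pretrained model

 Next, download the corresponding pretrained model

@@ -24,6 +25,8 @@ python dataset/download_pet.py
 python pretrained_model/download_model.py hrnet_w18_bn_cityscapes
 ```

+For the list of available HRNet pretrained models, see [Model Combinations](#模型组合). To use a different pretrained model, download it and replace BACKBONE, NORM_TYPE, and related settings in the configuration accordingly.
+
 ## 3. Prepare the configuration

 Next, we need to settle the relevant configuration. For the purposes of this tutorial, the configuration has three parts:
@@ -45,19 +48,19 @@ python pretrained_model/download_model.py hrnet_w18_bn_cityscapes

 Of the three, the pretrained model configuration is especially important: if the model is misconfigured, the pretrained parameters will not be loaded, which in turn slows convergence. The configuration related to the pretrained model is shown in step 2.

-The dataset configuration depends on the data path; in this tutorial, the data is stored in `dataset/mini_pet`
+The dataset configuration depends on the data path; in this tutorial, the data is stored in `dataset/optic_disc_seg`

-The remaining settings are tuned to the dataset and the machine environment. In the end we save a yaml configuration file with the following content at **configs/hrnet_w18_pet.yaml**
+The remaining settings are tuned to the dataset and the machine environment. In the end we save a yaml configuration file with the following content at **configs/hrnet_optic.yaml**

 ```yaml
 # Dataset configuration
 DATASET:
-    DATA_DIR: "./dataset/mini_pet/"
-    NUM_CLASSES: 3
-    TEST_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt"
-    TRAIN_FILE_LIST: "./dataset/mini_pet/file_list/train_list.txt"
-    VAL_FILE_LIST: "./dataset/mini_pet/file_list/val_list.txt"
-    VIS_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt"
+    DATA_DIR: "./dataset/optic_disc_seg/"
+    NUM_CLASSES: 2
+    TEST_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"
+    TRAIN_FILE_LIST: "./dataset/optic_disc_seg/train_list.txt"
+    VAL_FILE_LIST: "./dataset/optic_disc_seg/val_list.txt"
+    VIS_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"

 # Pretrained model configuration
 MODEL:
@@ -80,15 +83,15 @@ AUG:
 BATCH_SIZE: 4
 TRAIN:
     PRETRAINED_MODEL_DIR: "./pretrained_model/hrnet_w18_bn_cityscapes/"
-    MODEL_SAVE_DIR: "./saved_model/hrnet_w18_bn_pet/"
-    SNAPSHOT_EPOCH: 10
+    MODEL_SAVE_DIR: "./saved_model/hrnet_optic/"
+    SNAPSHOT_EPOCH: 5
 TEST:
-    TEST_MODEL: "./saved_model/hrnet_w18_bn_pet/final"
+    TEST_MODEL: "./saved_model/hrnet_optic/final"
 SOLVER:
-    NUM_EPOCHS: 100
-    LR: 0.005
+    NUM_EPOCHS: 10
+    LR: 0.001
     LR_POLICY: "poly"
-    OPTIMIZER: "sgd"
+    OPTIMIZER: "adam"
 ```

 ## 4. Configuration/data validation
@@ -96,7 +99,7 @@ SOLVER:
 Before starting training and evaluation, we also need to validate the configuration and data once to make sure both are correct. Start the validation with the following command

 ```shell
-python pdseg/check.py --cfg ./configs/hrnet_w18_pet.yaml
+python pdseg/check.py --cfg ./configs/hrnet_optic.yaml
 ```

@@ -105,7 +108,10 @@ python pdseg/check.py --cfg ./configs/hrnet_w18_pet.yaml
 Once validation passes, start training with the following command

 ```shell
-python pdseg/train.py --use_gpu --cfg ./configs/hrnet_w18_pet.yaml
+# Select the GPU to use (card 0 in this example)
+export CUDA_VISIBLE_DEVICES=0
+# Train
+python pdseg/train.py --use_gpu --cfg ./configs/hrnet_optic.yaml
 ```

 ## 6. Evaluate
@@ -113,19 +119,30 @@ python pdseg/train.py --use_gpu --cfg ./configs/hrnet_w18_pet.yaml
 Once training is complete, start evaluation with the following command

 ```shell
-python pdseg/eval.py --use_gpu --cfg ./configs/hrnet_w18_pet.yaml
+python pdseg/eval.py --use_gpu --cfg ./configs/hrnet_optic.yaml
+```
+
+## 7. Visualize
+Start prediction and visualization with the following command
+
+```shell
+python pdseg/vis.py --use_gpu --cfg ./configs/hrnet_optic.yaml
 ```

+The prediction results are saved in the visual directory. The prediction for one of the images is shown below:
+
+![](imgs/optic_hrnet.png)
+
 ## Model Combinations

-|Pretrained model name|BackBone|Norm Type|Dataset|Configuration|
-|-|-|-|-|-|
-|hrnet_w18_bn_cityscapes|-|bn| ImageNet | MODEL.MODEL_NAME: hrnet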
MODEL.HRNET.STAGE2.NUM_CHANNELS: [18, 36]
MODEL.HRNET.STAGE3.NUM_CHANNELS: [18, 36, 72]
MODEL.HRNET.STAGE4.NUM_CHANNELS: [18, 36, 72, 144]
MODEL.DEFAULT_NORM_TYPE: bn| -| hrnet_w18_bn_imagenet |-|bn| ImageNet | MODEL.MODEL_NAME: hrnet
MODEL.HRNET.STAGE2.NUM_CHANNELS: [18, 36]
MODEL.HRNET.STAGE3.NUM_CHANNELS: [18, 36, 72]
MODEL.HRNET.STAGE4.NUM_CHANNELS: [18, 36, 72, 144]
MODEL.DEFAULT_NORM_TYPE: bn | -| hrnet_w30_bn_imagenet |-|bn| ImageNet | MODEL.MODEL_NAME: hrnet
MODEL.HRNET.STAGE2.NUM_CHANNELS: [30, 60]
MODEL.HRNET.STAGE3.NUM_CHANNELS: [30, 60, 120]
MODEL.HRNET.STAGE4.NUM_CHANNELS: [30, 60, 120, 240]
MODEL.DEFAULT_NORM_TYPE: bn | -| hrnet_w32_bn_imagenet |-|bn| ImageNet | MODEL.MODEL_NAME: hrnet
MODEL.HRNET.STAGE2.NUM_CHANNELS: [32, 64]
MODEL.HRNET.STAGE3.NUM_CHANNELS: [32, 64, 128]
MODEL.HRNET.STAGE4.NUM_CHANNELS: [32, 64, 128, 256]
MODEL.DEFAULT_NORM_TYPE: bn | -| hrnet_w40_bn_imagenet |-|bn| ImageNet | MODEL.MODEL_NAME: hrnet
MODEL.HRNET.STAGE2.NUM_CHANNELS: [40, 80]
MODEL.HRNET.STAGE3.NUM_CHANNELS: [40, 80, 160]
MODEL.HRNET.STAGE4.NUM_CHANNELS: [40, 80, 160, 320]
MODEL.DEFAULT_NORM_TYPE: bn | -| hrnet_w44_bn_imagenet |-|bn| ImageNet | MODEL.MODEL_NAME: hrnet
MODEL.HRNET.STAGE2.NUM_CHANNELS: [44, 88]
MODEL.HRNET.STAGE3.NUM_CHANNELS: [44, 88, 176]
MODEL.HRNET.STAGE4.NUM_CHANNELS: [44, 88, 176, 352]
MODEL.DEFAULT_NORM_TYPE: bn | -| hrnet_w48_bn_imagenet |-|bn| ImageNet | MODEL.MODEL_NAME: hrnet
MODEL.HRNET.STAGE2.NUM_CHANNELS: [48, 96]
MODEL.HRNET.STAGE3.NUM_CHANNELS: [48, 96, 192]
MODEL.HRNET.STAGE4.NUM_CHANNELS: [48, 96, 192, 384]
MODEL.DEFAULT_NORM_TYPE: bn | -| hrnet_w64_bn_imagenet |-|bn| ImageNet | MODEL.MODEL_NAME: hrnet
MODEL.HRNET.STAGE2.NUM_CHANNELS: [64, 128]
MODEL.HRNET.STAGE3.NUM_CHANNELS: [64, 128, 256]
MODEL.HRNET.STAGE4.NUM_CHANNELS: [64, 128, 256, 512]
MODEL.DEFAULT_NORM_TYPE: bn |
+|Pretrained model name|Backbone|Dataset|Configuration|
+|-|-|-|-|
+|hrnet_w18_bn_cityscapes|HRNet| Cityscapes | MODEL.MODEL_NAME: hrnet
MODEL.HRNET.STAGE2.NUM_CHANNELS: [18, 36]
MODEL.HRNET.STAGE3.NUM_CHANNELS: [18, 36, 72]
MODEL.HRNET.STAGE4.NUM_CHANNELS: [18, 36, 72, 144]
MODEL.DEFAULT_NORM_TYPE: bn| +| hrnet_w18_bn_imagenet |HRNet| ImageNet | MODEL.MODEL_NAME: hrnet
MODEL.HRNET.STAGE2.NUM_CHANNELS: [18, 36]
MODEL.HRNET.STAGE3.NUM_CHANNELS: [18, 36, 72]
MODEL.HRNET.STAGE4.NUM_CHANNELS: [18, 36, 72, 144]
MODEL.DEFAULT_NORM_TYPE: bn | +| hrnet_w30_bn_imagenet |HRNet| ImageNet | MODEL.MODEL_NAME: hrnet
MODEL.HRNET.STAGE2.NUM_CHANNELS: [30, 60]
MODEL.HRNET.STAGE3.NUM_CHANNELS: [30, 60, 120]
MODEL.HRNET.STAGE4.NUM_CHANNELS: [30, 60, 120, 240]
MODEL.DEFAULT_NORM_TYPE: bn | +| hrnet_w32_bn_imagenet |HRNet|ImageNet | MODEL.MODEL_NAME: hrnet
MODEL.HRNET.STAGE2.NUM_CHANNELS: [32, 64]
MODEL.HRNET.STAGE3.NUM_CHANNELS: [32, 64, 128]
MODEL.HRNET.STAGE4.NUM_CHANNELS: [32, 64, 128, 256]
MODEL.DEFAULT_NORM_TYPE: bn | +| hrnet_w40_bn_imagenet |HRNet| ImageNet | MODEL.MODEL_NAME: hrnet
MODEL.HRNET.STAGE2.NUM_CHANNELS: [40, 80]
MODEL.HRNET.STAGE3.NUM_CHANNELS: [40, 80, 160]
MODEL.HRNET.STAGE4.NUM_CHANNELS: [40, 80, 160, 320]
MODEL.DEFAULT_NORM_TYPE: bn | +| hrnet_w44_bn_imagenet |HRNet| ImageNet | MODEL.MODEL_NAME: hrnet
MODEL.HRNET.STAGE2.NUM_CHANNELS: [44, 88]
MODEL.HRNET.STAGE3.NUM_CHANNELS: [44, 88, 176]
MODEL.HRNET.STAGE4.NUM_CHANNELS: [44, 88, 176, 352]
MODEL.DEFAULT_NORM_TYPE: bn | +| hrnet_w48_bn_imagenet |HRNet| ImageNet | MODEL.MODEL_NAME: hrnet
MODEL.HRNET.STAGE2.NUM_CHANNELS: [48, 96]
MODEL.HRNET.STAGE3.NUM_CHANNELS: [48, 96, 192]
MODEL.HRNET.STAGE4.NUM_CHANNELS: [48, 96, 192, 384]
MODEL.DEFAULT_NORM_TYPE: bn | +| hrnet_w64_bn_imagenet |HRNet| ImageNet | MODEL.MODEL_NAME: hrnet
MODEL.HRNET.STAGE2.NUM_CHANNELS: [64, 128]
MODEL.HRNET.STAGE3.NUM_CHANNELS: [64, 128, 256]
MODEL.HRNET.STAGE4.NUM_CHANNELS: [64, 128, 256, 512]
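MODEL.DEFAULT_NORM_TYPE: bn |
diff --git a/turtorial/finetune_icnet.md b/turtorial/finetune_icnet.md
index 00caf4f87f206000bc2dde8440bdbe08ff03f555..57adc200d9d4857768d5055d8160b7b729332389 100644
--- a/turtorial/finetune_icnet.md
+++ b/turtorial/finetune_icnet.md
@@ -1,32 +1,34 @@
-# ICNet Model Training Tutorial
+# ICNet Model Usage Tutorial

-* This tutorial shows how to train on a custom dataset using the ***`ICNet`*** pretrained model provided by PaddleSeg
+This tutorial shows how to train, evaluate, and visualize on a custom dataset using the ***`ICNet`*** pretrained model provided by PaddleSeg.

-* Before reading this tutorial, make sure you have read the [Quick Start](../README.md#快速入门), [Basic Features](../README.md#基础功能), and related sections of PaddleSeg, so that you have a basic understanding of it
+* Before reading this tutorial, make sure you have read the [Quick Start](../README.md#快速入门), [Basic Features](../README.md#基础功能), and related sections of PaddleSeg, so that you have a basic understanding of it.

-* All commands in this tutorial are run from the PaddleSeg root directory
+* All commands in this tutorial are run from the PaddleSeg root directory.

 * Note that ***`ICNet`*** does not support training and evaluation in a CPU environment

 ## 1. Prepare the training data

-We have prepared a dataset in advance; download it with the following code
+![](./imgs/optic.png)
+
+We have prepared a fundus medical segmentation dataset in advance, containing 267 training images, 76 validation images, and 38 test images. Download it with the following command:

 ```shell
-python dataset/download_pet.py
+python dataset/download_optic.py
 ```

 ## 2. Download the pretrained model

-For the list of all pretrained models supported by PaddleSeg, see [Model Combinations](#模型组合) for the name and configuration of the model we need.
-
 Next, download the corresponding pretrained model

 ```shell
 python pretrained_model/download_model.py icnet_bn_cityscapes
 ```

+For the list of available ICNet pretrained models, see [Model Combinations](#模型组合). To use a different pretrained model, download it and replace BACKBONE, NORM_TYPE, and related settings in the configuration accordingly.
+
 ## 3. Prepare the configuration

 Next, we need to settle the relevant configuration. For the purposes of this tutorial, the configuration has three parts:
@@ -48,20 +50,19 @@ python pretrained_model/download_model.py icnet_bn_cityscapes

 Of the three, the pretrained model configuration is especially important: if the model or BACKBONE is misconfigured, the pretrained parameters will not be loaded, which in turn slows convergence. The configuration related to the pretrained model is shown in step 2.

-The dataset configuration depends on the data path; in this tutorial, the data is stored in `dataset/mini_pet`
+The dataset configuration depends on the data path; in this tutorial, the data is stored in `dataset/optic_disc_seg`

-The remaining settings are tuned to the dataset and the machine environment. In the end we save a yaml configuration file with the following content at **configs/icnet_pet.yaml**
+The remaining settings are tuned to the dataset and the machine environment. In the end we save a yaml configuration file with the following content at **configs/icnet_optic.yaml**

 ```yaml
 # Dataset configuration
 DATASET:
-    DATA_DIR: "./dataset/mini_pet/"
-    NUM_CLASSES: 3
-    TEST_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt"
-    TRAIN_FILE_LIST: "./dataset/mini_pet/file_list/train_list.txt"
-    VAL_FILE_LIST: "./dataset/mini_pet/file_list/val_list.txt"
-    VIS_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt"
-
+    DATA_DIR: "./dataset/optic_disc_seg/"
+    NUM_CLASSES: 2
+    TEST_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"
+    TRAIN_FILE_LIST: "./dataset/optic_disc_seg/train_list.txt"
+    VAL_FILE_LIST: "./dataset/optic_disc_seg/val_list.txt"
+    VIS_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"

 # Pretrained model configuration
 MODEL:
@@ -80,15 +81,15 @@ AUG:
 BATCH_SIZE: 4
 TRAIN:
     PRETRAINED_MODEL_DIR: "./pretrained_model/icnet_bn_cityscapes/"
-    MODEL_SAVE_DIR: "./saved_model/icnet_pet/"
-    SNAPSHOT_EPOCH: 10
+    MODEL_SAVE_DIR: "./saved_model/icnet_optic/"
+    SNAPSHOT_EPOCH: 5
 TEST:
-    TEST_MODEL: "./saved_model/icnet_pet/final"
+    TEST_MODEL: "./saved_model/icnet_optic/final"
 SOLVER:
-    NUM_EPOCHS: 100
-    LR: 0.005
+    NUM_EPOCHS: 10
+    LR: 0.001
     LR_POLICY: "poly"
-    OPTIMIZER: "sgd"
+    OPTIMIZER: "adam"
 ```

 ## 4. Configuration/data validation
@@ -96,7 +97,7 @@ SOLVER:
 Before starting training and evaluation, we also need to validate the configuration and data once to make sure both are correct. Start the validation with the following command

 ```shell
-python pdseg/check.py --cfg ./configs/icnet_pet.yaml
+python pdseg/check.py --cfg ./configs/icnet_optic.yaml
 ```

@@ -105,7 +106,10 @@ python pdseg/check.py --cfg ./configs/icnet_pet.yaml
 Once validation passes, start training with the following command

 ```shell
-python pdseg/train.py --use_gpu --cfg ./configs/icnet_pet.yaml
+# Select the GPU to use (card 0 in this example)
+export CUDA_VISIBLE_DEVICES=0
+# Train
+python pdseg/train.py --use_gpu --cfg ./configs/icnet_optic.yaml
 ```

 ## 6. Evaluate
@@ -113,11 +117,22 @@ python pdseg/train.py --use_gpu --cfg ./configs/icnet_pet.yaml
 Once training is complete, start evaluation with the following command

 ```shell
-python pdseg/eval.py --use_gpu --cfg ./configs/icnet_pet.yaml
+python pdseg/eval.py --use_gpu --cfg ./configs/icnet_optic.yaml
+```
+
+## 7. Visualize
+Start prediction and visualization with the following command
+
+```shell
+python pdseg/vis.py --use_gpu --cfg ./configs/icnet_optic.yaml
 ```

+The prediction results are saved in the visual directory. The prediction for one of the images is shown below:
+
+![](imgs/optic_icnet.png)
+
 ## Model Combinations

-|Pretrained model name|BackBone|Norm|Dataset|Configuration|
-|-|-|-|-|-|
-|icnet_bn_cityscapes|-|bn|Cityscapes|MODEL.MODEL_NAME: icnet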
MODEL.DEFAULT_NORM_TYPE: bn
MODEL.MULTI_LOSS_WEIGHT: [1.0, 0.4, 0.16]|
+|Pretrained model name|Backbone|Dataset|Configuration|
+|-|-|-|-|
+|icnet_bn_cityscapes|ResNet50|Cityscapes|MODEL.MODEL_NAME: icnet
MODEL.DEFAULT_NORM_TYPE: bn
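MODEL.MULTI_LOSS_WEIGHT: [1.0, 0.4, 0.16]|
diff --git a/turtorial/finetune_pspnet.md b/turtorial/finetune_pspnet.md
index 931c3c5f7515e2ebec3d4fccf3069ecc6d6c00fb..8c52bbe4646d253f70a24001ed6e414a1bee3cc3 100644
--- a/turtorial/finetune_pspnet.md
+++ b/turtorial/finetune_pspnet.md
@@ -1,29 +1,31 @@
 # PSPNET Model Training Tutorial

-* This tutorial shows how to train on a custom dataset using the ***`PSPNET`*** pretrained model provided by PaddleSeg
+This tutorial shows how to train on a custom dataset using the ***`PSPNET`*** pretrained model provided by PaddleSeg.

-* Before reading this tutorial, make sure you have read the [Quick Start](../README.md#快速入门), [Basic Features](../README.md#基础功能), and related sections of PaddleSeg, so that you have a basic understanding of it
+* Before reading this tutorial, make sure you have read the [Quick Start](../README.md#快速入门), [Basic Features](../README.md#基础功能), and related sections of PaddleSeg, so that you have a basic understanding of it.

-* All commands in this tutorial are run from the PaddleSeg root directory
+* All commands in this tutorial are run from the PaddleSeg root directory.

 ## 1. Prepare the training data

-We have prepared a dataset in advance; download it with the following code
+![](./imgs/optic.png)
+
+We have prepared a fundus medical segmentation dataset in advance, containing 267 training images, 76 validation images, and 38 test images. Download it with the following command:

 ```shell
-python dataset/download_pet.py
+python dataset/download_optic.py
 ```

 ## 2. Download the pretrained model

-For the list of all pretrained models supported by PaddleSeg, see [Model Combinations](#模型组合) for the name and configuration of the model we need.
-
 Next, download the corresponding pretrained model

 ```shell
 python pretrained_model/download_model.py pspnet50_bn_cityscapes
 ```

+For the list of available PSPNet pretrained models, see [PSPNet Pretrained Model Combinations](#PSPNet预训练模型组合). To use a different pretrained model, download it and replace BACKBONE, NORM_TYPE, and related settings in the configuration accordingly.
+
 ## 3. Prepare the configuration

 Next, we need to settle the relevant configuration. For the purposes of this tutorial, the configuration has three parts:
@@ -45,20 +47,19 @@ python pretrained_model/download_model.py pspnet50_bn_cityscapes

 Of the three, the pretrained model configuration is especially important: if the model or BACKBONE is misconfigured, the pretrained parameters will not be loaded, which in turn slows convergence. The configuration related to the pretrained model is shown in step 2.

-The dataset configuration depends on the data path; in this tutorial, the data is stored in `dataset/mini_pet`
+The dataset configuration depends on the data path; in this tutorial, the data is stored in `dataset/optic_disc_seg`

-The remaining settings are tuned to the dataset and the machine environment. In the end we save a yaml configuration file with the following content at `configs/test_pet.yaml`
+The remaining settings are tuned to the dataset and the machine environment. In the end we save a yaml configuration file with the following content at `configs/pspnet_optic.yaml`

 ```yaml
 # Dataset configuration
 DATASET:
-    DATA_DIR: "./dataset/mini_pet/"
-    NUM_CLASSES: 3
-    TEST_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt"
-    TRAIN_FILE_LIST: "./dataset/mini_pet/file_list/train_list.txt"
-    VAL_FILE_LIST: "./dataset/mini_pet/file_list/val_list.txt"
-    VIS_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt"
-
+    DATA_DIR: "./dataset/optic_disc_seg/"
+    NUM_CLASSES: 2
+    TEST_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"
+    TRAIN_FILE_LIST: "./dataset/optic_disc_seg/train_list.txt"
+    VAL_FILE_LIST: "./dataset/optic_disc_seg/val_list.txt"
+    VIS_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"

 # Pretrained model configuration
 MODEL:
@@ -77,15 +78,15 @@ AUG:
 BATCH_SIZE: 4
 TRAIN:
     PRETRAINED_MODEL_DIR: "./pretrained_model/pspnet50_bn_cityscapes/"
-    MODEL_SAVE_DIR: "./saved_model/pspnet_pet/"
-    SNAPSHOT_EPOCH: 10
+    MODEL_SAVE_DIR: "./saved_model/pspnet_optic/"
+    SNAPSHOT_EPOCH: 5
 TEST:
-    TEST_MODEL: "./saved_model/pspnet_pet/final"
+    TEST_MODEL: "./saved_model/pspnet_optic/final"
 SOLVER:
-    NUM_EPOCHS: 100
-    LR: 0.005
+    NUM_EPOCHS: 10
+    LR: 0.001
     LR_POLICY: "poly"
-    OPTIMIZER: "sgd"
+    OPTIMIZER: "adam"
 ```

 ## 4. Configuration/data validation
@@ -93,7 +94,7 @@ SOLVER:
 Before starting training and evaluation, we also need to validate the configuration and data once to make sure both are correct. Start the validation with the following command

 ```shell
-python pdseg/check.py --cfg ./configs/test_pet.yaml
+python pdseg/check.py --cfg ./configs/pspnet_optic.yaml
 ```

@@ -102,7 +103,10 @@ python pdseg/check.py --cfg ./configs/test_pet.yaml
 Once validation passes, start training with the following command

 ```shell
-python pdseg/train.py --use_gpu --cfg ./configs/test_pet.yaml
+# Select the GPU to use (card 0 in this example)
+export CUDA_VISIBLE_DEVICES=0
+# Train
+python pdseg/train.py --use_gpu --cfg ./configs/pspnet_optic.yaml
 ```

 ## 6. Evaluate
@@ -110,12 +114,27 @@ python pdseg/train.py --use_gpu --cfg ./configs/test_pet.yaml
 Once training is complete, start evaluation with the following command

 ```shell
-python pdseg/eval.py --use_gpu --cfg ./configs/test_pet.yaml
+python pdseg/eval.py --use_gpu --cfg ./configs/pspnet_optic.yaml
+```
+
+## 7. Visualize
+Start prediction and visualization with the following command
+
+```shell
+python pdseg/vis.py --use_gpu --cfg ./configs/pspnet_optic.yaml
 ```

-## Model Combinations
+The prediction results are saved in the visual directory. The prediction for one of the images is shown below:
+
+![](imgs/optic_pspnet.png)
+
+## PSPNet Pretrained Model Combinations

-|Pretrained model name|BackBone|Norm|Dataset|Configuration|
-|-|-|-|-|-|
-|pspnet50_bn_cityscapes|ResNet50|bn|Cityscapes|MODEL.MODEL_NAME: pspnet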
MODEL.DEFAULT_NORM_TYPE: bn
MODEL.PSPNET.LAYERS: 50| -|pspnet101_bn_cityscapes|ResNet101|bn|Cityscapes|MODEL.MODEL_NAME: pspnet
MODEL.DEFAULT_NORM_TYPE: bn
MODEL.PSPNET.LAYERS: 101|
+|Model|BackBone|Dataset|Configuration|
+|-|-|-|-|
+|[pspnet50_cityscapes](https://paddleseg.bj.bcebos.com/models/pspnet50_cityscapes.tgz)|ResNet50 (adapted for PSPNet)|Cityscapes |MODEL.MODEL_NAME: pspnet
MODEL.DEFAULT_NORM_TYPE: bn
MODEL.PSPNET.LAYERS: 50|
+|[pspnet101_cityscapes](https://paddleseg.bj.bcebos.com/models/pspnet101_cityscapes.tgz)|ResNet101 (adapted for PSPNet)|Cityscapes |MODEL.MODEL_NAME: pspnet
MODEL.DEFAULT_NORM_TYPE: bn
MODEL.PSPNET.LAYERS: 101|
+| [pspnet50_coco](https://paddleseg.bj.bcebos.com/models/pspnet50_coco.tgz)|ResNet50 (adapted for PSPNet)|COCO |MODEL.MODEL_NAME: pspnet
MODEL.DEFAULT_NORM_TYPE: bn
MODEL.PSPNET.LAYERS: 50|
+| [pspnet101_coco](https://paddleseg.bj.bcebos.com/models/pspnet101_coco.tgz) |ResNet101 (adapted for PSPNet)| COCO |MODEL.MODEL_NAME: pspnet
MODEL.DEFAULT_NORM_TYPE: bn
MODEL.PSPNET.LAYERS: 101|
+| [resnet50_v2_pspnet](https://paddleseg.bj.bcebos.com/resnet50_v2_pspnet.tgz)| ResNet50 (adapted for PSPNet) | ImageNet | MODEL.MODEL_NAME: pspnet
MODEL.DEFAULT_NORM_TYPE: bn
MODEL.PSPNET.LAYERS: 50 |
+| [resnet101_v2_pspnet](https://paddleseg.bj.bcebos.com/resnet101_v2_pspnet.tgz)| ResNet101 (adapted for PSPNet) | ImageNet | MODEL.MODEL_NAME: pspnet
MODEL.DEFAULT_NORM_TYPE: bn
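MODEL.PSPNET.LAYERS: 101 |
diff --git a/turtorial/finetune_unet.md b/turtorial/finetune_unet.md
index b1baff8b0d6a9438df0ae4ed6a5f0dfdae4d3414..dd2945cf587fc18ed760639a56ad7b8edebc0087 100644
--- a/turtorial/finetune_unet.md
+++ b/turtorial/finetune_unet.md
@@ -1,29 +1,31 @@
-# U-Net Model Training Tutorial
+# U-Net Model Usage Tutorial

-* This tutorial shows how to train on a custom dataset using the ***`U-Net`*** pretrained model provided by PaddleSeg
+This tutorial shows how to train, evaluate, and visualize on a custom dataset using the ***`U-Net`*** pretrained model provided by PaddleSeg.

-* Before reading this tutorial, make sure you have read the [Quick Start](../README.md#快速入门), [Basic Features](../README.md#基础功能), and related sections of PaddleSeg, so that you have a basic understanding of it
+* Before reading this tutorial, make sure you have read the [Quick Start](../README.md#快速入门), [Basic Features](../README.md#基础功能), and related sections of PaddleSeg, so that you have a basic understanding of it.

-* All commands in this tutorial are run from the PaddleSeg root directory
+* All commands in this tutorial are run from the PaddleSeg root directory.

 ## 1. Prepare the training data

-We have prepared a dataset in advance; download it with the following code
+![](./imgs/optic.png)
+
+We have prepared a fundus medical segmentation dataset in advance, containing 267 training images, 76 validation images, and 38 test images. Download it with the following command:

 ```shell
-python dataset/download_pet.py
+python dataset/download_optic.py
 ```

 ## 2. Download the pretrained model

-For the list of all pretrained models supported by PaddleSeg, see [Model Combinations](#模型组合) for the name and configuration of the model we need.
-
 Next, download the corresponding pretrained model

 ```shell
 python pretrained_model/download_model.py unet_bn_coco
 ```

+For the list of available U-Net pretrained models, see [Model Combinations](#模型组合). To use a different pretrained model, download it and replace BACKBONE, NORM_TYPE, and related settings in the configuration accordingly.
+
 ## 3. Prepare the configuration

 Next, we need to settle the relevant configuration. For the purposes of this tutorial, the configuration has three parts:
@@ -45,20 +47,19 @@ python pretrained_model/download_model.py unet_bn_coco

 Of the three, the pretrained model configuration is especially important: if the model or BACKBONE is misconfigured, the pretrained parameters will not be loaded, which in turn slows convergence. The configuration related to the pretrained model is shown in step 2.

-The dataset configuration depends on the data path; in this tutorial, the data is stored in `dataset/mini_pet`
+The dataset configuration depends on the data path; in this tutorial, the data is stored in `dataset/optic_disc_seg`

-The remaining settings are tuned to the dataset and the machine environment. In the end we save a yaml configuration file with the following content at **configs/unet_pet.yaml**
+The remaining settings are tuned to the dataset and the machine environment. In the end we save a yaml configuration file with the following content at **configs/unet_optic.yaml**

 ```yaml
 # Dataset configuration
 DATASET:
-    DATA_DIR: "./dataset/mini_pet/"
-    NUM_CLASSES: 3
-    TEST_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt"
-    TRAIN_FILE_LIST: "./dataset/mini_pet/file_list/train_list.txt"
-    VAL_FILE_LIST: "./dataset/mini_pet/file_list/val_list.txt"
-    VIS_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt"
-
+    DATA_DIR: "./dataset/optic_disc_seg/"
+    NUM_CLASSES: 2
+    TEST_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"
+    TRAIN_FILE_LIST: "./dataset/optic_disc_seg/train_list.txt"
+    VAL_FILE_LIST: "./dataset/optic_disc_seg/val_list.txt"
+    VIS_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"

 # Pretrained model configuration
 MODEL:
@@ -74,13 +75,13 @@ AUG:
 BATCH_SIZE: 4
 TRAIN:
     PRETRAINED_MODEL_DIR: "./pretrained_model/unet_bn_coco/"
-    MODEL_SAVE_DIR: "./saved_model/unet_pet/"
-    SNAPSHOT_EPOCH: 10
+    MODEL_SAVE_DIR: "./saved_model/unet_optic/"
+    SNAPSHOT_EPOCH: 5
 TEST:
-    TEST_MODEL: "./saved_model/unet_pet/final"
+    TEST_MODEL: "./saved_model/unet_optic/final"
 SOLVER:
-    NUM_EPOCHS: 100
-    LR: 0.005
+    NUM_EPOCHS: 10
+    LR: 0.001
     LR_POLICY: "poly"
     OPTIMIZER: "adam"
 ```
@@ -90,7 +91,7 @@ SOLVER:
 Before starting training and evaluation, we also need to validate the configuration and data once to make sure both are correct. Start the validation with the following command

 ```shell
-python pdseg/check.py --cfg ./configs/unet_pet.yaml
+python pdseg/check.py --cfg ./configs/unet_optic.yaml
 ```

@@ -99,7 +100,10 @@ python pdseg/check.py --cfg ./configs/unet_pet.yaml
 Once validation passes, start training with the following command

 ```shell
-python pdseg/train.py --use_gpu --cfg ./configs/unet_pet.yaml
+# Select the GPU to use (card 0 in this example)
+export CUDA_VISIBLE_DEVICES=0
+# Train
+python pdseg/train.py --use_gpu --cfg ./configs/unet_optic.yaml
 ```

 ## 6. Evaluate
@@ -107,11 +111,26 @@ python pdseg/train.py --use_gpu --cfg ./configs/unet_pet.yaml
 Once training is complete, start evaluation with the following command

 ```shell
-python pdseg/eval.py --use_gpu --cfg ./configs/unet_pet.yaml
+python pdseg/eval.py --use_gpu --cfg ./configs/unet_optic.yaml
+```
+
+## 7. Visualize
+Start prediction and visualization with the following command
+
+```shell
+python pdseg/vis.py --use_gpu --cfg ./configs/unet_optic.yaml
 ```

+The prediction results are saved in the visual directory. The prediction for one of the images is shown below:
+
+![](imgs/optic_unet.png)
+
+## Online Demo
+
+PaddleSeg provides a U-Net segmentation tutorial that can be tried online on the AI Studio platform; [click here to try it](https://aistudio.baidu.com/aistudio/projectDetail/102889).
+
 ## Model Combinations

-|Pretrained model name|BackBone|Norm|Dataset|Configuration|
-|-|-|-|-|-|
-|unet_bn_coco|-|bn|COCO|MODEL.MODEL_NAME: unet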
MODEL.DEFAULT_NORM_TYPE: bn|
+|Pretrained model name|Backbone|Dataset|Configuration|
+|-|-|-|-|
+|unet_bn_coco|VGG16|COCO|MODEL.MODEL_NAME: unet
MODEL.DEFAULT_NORM_TYPE: bn|
diff --git a/turtorial/imgs/optic.png b/turtorial/imgs/optic.png
new file mode 100644
index 0000000000000000000000000000000000000000..34acaae49303e71e6b59db26202a9079965f05eb
Binary files /dev/null and b/turtorial/imgs/optic.png differ
diff --git a/turtorial/imgs/optic_deeplab.png b/turtorial/imgs/optic_deeplab.png
new file mode 100644
index 0000000000000000000000000000000000000000..8edc957362715bb742042d6f0f6e6c36fd7aec52
Binary files /dev/null and b/turtorial/imgs/optic_deeplab.png differ
diff --git a/turtorial/imgs/optic_hrnet.png b/turtorial/imgs/optic_hrnet.png
new file mode 100644
index 0000000000000000000000000000000000000000..8d19190aa5a057fe5aa72cd800c1c9fed642d9ef
Binary files /dev/null and b/turtorial/imgs/optic_hrnet.png differ
diff --git a/turtorial/imgs/optic_icnet.png b/turtorial/imgs/optic_icnet.png
new file mode 100644
index 0000000000000000000000000000000000000000..a4d36b7ab0f086af46840a5c1e8f1624054048be
Binary files /dev/null and b/turtorial/imgs/optic_icnet.png differ
diff --git a/turtorial/imgs/optic_pspnet.png b/turtorial/imgs/optic_pspnet.png
new file mode 100644
index 0000000000000000000000000000000000000000..44fd2795d6edfdc95378046da906949ad01431d9
Binary files /dev/null and b/turtorial/imgs/optic_pspnet.png differ
diff --git a/turtorial/imgs/optic_unet.png b/turtorial/imgs/optic_unet.png
new file mode 100644
index 0000000000000000000000000000000000000000..9ca439ebc76427516127d56aac56b5d09dd68263
Binary files /dev/null and b/turtorial/imgs/optic_unet.png differ
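
Reviewer note: the configuration column of the model tables in this change lists settings as dotted keys (for example `MODEL.HRNET.STAGE2.NUM_CHANNELS: [18, 36]`), while the tutorial yaml files group settings under nested sections such as `MODEL:` and `SOLVER:`. The following is a minimal sketch of how the dotted form could be expanded into the nested form, assuming each dot corresponds to one level of nesting. The helper `expand_dotted_keys` is hypothetical and is not part of PaddleSeg; it uses only the Python standard library.

```python
import ast
import json


def expand_dotted_keys(entries):
    """Turn 'A.B.C: value' strings into a nested dict {'A': {'B': {'C': value}}}.

    Hypothetical helper for illustration only; not a PaddleSeg API.
    """
    config = {}
    for entry in entries:
        dotted_key, _, raw_value = entry.partition(":")
        raw_value = raw_value.strip()
        try:
            # Lists, numbers and booleans ("[18, 36]", "1.0", "False")
            # parse as Python literals.
            value = ast.literal_eval(raw_value)
        except (ValueError, SyntaxError):
            # Bare words such as "hrnet" or "bn" are kept as strings.
            value = raw_value
        *parents, leaf = dotted_key.strip().split(".")
        node = config
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value
    return config


# Entries copied from the hrnet_w18_bn_cityscapes row of the HRNet table.
hrnet_w18 = expand_dotted_keys([
    "MODEL.MODEL_NAME: hrnet",
    "MODEL.HRNET.STAGE2.NUM_CHANNELS: [18, 36]",
    "MODEL.HRNET.STAGE3.NUM_CHANNELS: [18, 36, 72]",
    "MODEL.HRNET.STAGE4.NUM_CHANNELS: [18, 36, 72, 144]",
    "MODEL.DEFAULT_NORM_TYPE: bn",
])

# Print the nested structure so it can be compared with the yaml layout.
print(json.dumps(hrnet_w18, indent=4))
```

The nested result can be compared by eye with the `MODEL:` block of a tutorial yaml file when swapping in a different pretrained model.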