提交 1942c445 编写于 作者: C chenguowei01

Merge branch 'develop' of https://github.com/PaddlePaddle/PaddleSeg into develop

...@@ -6,10 +6,26 @@ ...@@ -6,10 +6,26 @@
## 简介 ## 简介
PaddleSeg是基于[PaddlePaddle](https://www.paddlepaddle.org.cn)开发的语义分割库,覆盖了DeepLabv3+, U-Net, ICNet三类主流的分割模型。通过统一的配置,帮助用户更便捷地完成从训练到部署的全流程图像分割应用。 PaddleSeg是基于[PaddlePaddle](https://www.paddlepaddle.org.cn)开发的语义分割库,覆盖了DeepLabv3+, U-Net, ICNet, PSPNet, HRNet等主流分割模型。通过统一的配置,帮助用户更便捷地完成从训练到部署的全流程图像分割应用。
PaddleSeg具备高性能、丰富的数据增强、工业级部署、全流程应用的特点: </br>
- [特点](#特点)
- [安装](#安装)
- [使用教程](#使用教程)
- [快速入门](#快速入门)
- [基础功能](#基础功能)
- [预测部署](#预测部署)
- [高级功能](#高级功能)
- [在线体验](#在线体验)
- [FAQ](#FAQ)
- [交流与反馈](#交流与反馈)
- [更新日志](#更新日志)
- [贡献代码](#贡献代码)
</br>
## 特点
- **丰富的数据增强** - **丰富的数据增强**
...@@ -17,29 +33,42 @@ PaddleSeg具备高性能、丰富的数据增强、工业级部署、全流程 ...@@ -17,29 +33,42 @@ PaddleSeg具备高性能、丰富的数据增强、工业级部署、全流程
- **模块化设计** - **模块化设计**
支持U-Net, DeepLabv3+, ICNet, PSPNet种主流分割网络,结合预训练模型和可调节的骨干网络,满足不同性能和精度的要求;选择不同的损失函数如Dice Loss, BCE Loss等方式可以强化小目标和不均衡样本场景下的分割精度。 支持U-Net, DeepLabv3+, ICNet, PSPNet, HRNet五种主流分割网络,结合预训练模型和可调节的骨干网络,满足不同性能和精度的要求;选择不同的损失函数如Dice Loss, BCE Loss等方式可以强化小目标和不均衡样本场景下的分割精度。
- **高性能** - **高性能**
PaddleSeg支持多进程IO、多卡并行、跨卡Batch Norm同步等训练加速策略,结合飞桨核心框架的显存优化功能,可以大幅度减少分割模型的显存开销,更快完成分割模型训练。 PaddleSeg支持多进程I/O、多卡并行、跨卡Batch Norm同步等训练加速策略,结合飞桨核心框架的显存优化功能,可大幅度减少分割模型的显存开销,让开发者更低成本、更高效地完成图像分割训练。
- **工业级部署** - **工业级部署**
基于[Paddle Serving](https://github.com/PaddlePaddle/Serving)和PaddlePaddle高性能预测引擎,结合百度开放的AI能力,轻松搭建人像分割和车道线分割服务 全面提供**服务端****移动端**的工业级部署能力,依托飞桨高性能推理引擎和高性能图像处理实现,开发者可以轻松完成高性能的分割模型部署和集成。通过[Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite),可以在移动设备或者嵌入式设备上完成轻量级、高性能的人像分割模型部署
</br> ## 安装
## 环境依赖 ### 1. 安装PaddlePaddle
版本要求
* PaddlePaddle >= 1.6.1 * PaddlePaddle >= 1.6.1
* Python 2.7 or 3.5+ * Python 2.7 or 3.5+
通过以下命令安装python包依赖,请确保在该分支上至少执行过一次以下命令 由于图像分割模型计算开销大,推荐在GPU版本的PaddlePaddle下使用PaddleSeg.
```shell ```
$ pip install -r requirements.txt pip install -U paddlepaddle-gpu
``` ```
同时请保证您参考NVIDIA官网,已经正确配置和安装了显卡驱动,CUDA 9,cuDNN 7.3,NCCL2等依赖,其他更加详细的安装信息请参考:[PaddlePaddle安装说明](https://www.paddlepaddle.org.cn/install/doc/index)
其他如CUDA版本、cuDNN版本等兼容信息请查看[PaddlePaddle安装](https://www.paddlepaddle.org.cn/install/doc/index) ### 2. 下载PaddleSeg代码
```
git clone https://github.com/PaddlePaddle/PaddleSeg
```
### 3. 安装PaddleSeg依赖
通过以下命令安装python包依赖,请确保在该分支上至少执行过一次以下命令:
```
cd PaddleSeg
pip install -r requirements.txt
```
</br> </br>
...@@ -51,35 +80,49 @@ $ pip install -r requirements.txt ...@@ -51,35 +80,49 @@ $ pip install -r requirements.txt
### 快速入门 ### 快速入门
* [安装说明](./docs/installation.md) * [PaddleSeg快速入门](./docs/usage.md)
* [训练/评估/可视化](./docs/usage.md)
### 基础功能 ### 基础功能
* [分割模型介绍](./docs/models.md) * [自定义数据的标注与准备](./docs/data_prepare.md)
* [预训练模型列表](./docs/model_zoo.md) * [脚本使用和配置说明](./docs/config.md)
* [自定义数据的准备与标注](./docs/data_prepare.md)
* [数据和配置校验](./docs/check.md) * [数据和配置校验](./docs/check.md)
* [如何训练DeepLabv3+](./turtorial/finetune_deeplabv3plus.md) * [分割模型介绍](./docs/models.md)
* [如何训练U-Net](./turtorial/finetune_unet.md) * [预训练模型下载](./docs/model_zoo.md)
* [如何训练ICNet](./turtorial/finetune_icnet.md) * [DeepLabv3+模型使用教程](./turtorial/finetune_deeplabv3plus.md)
* [如何训练PSPNet](./turtorial/finetune_pspnet.md) * [U-Net模型使用教程](./turtorial/finetune_unet.md)
* [如何训练HRNet](./turtorial/finetune_hrnet.md) * [ICNet模型使用教程](./turtorial/finetune_icnet.md)
* [PSPNet模型使用教程](./turtorial/finetune_pspnet.md)
* [HRNet模型使用教程](./turtorial/finetune_hrnet.md)
* [Fast-SCNN模型使用教程](./turtorial/finetune_fast_scnn.md)
### 预测部署 ### 预测部署
* [模型导出](./docs/model_export.md) * [模型导出](./docs/model_export.md)
* [使用Python预测](./deploy/python/) * [Python预测](./deploy/python/)
* [使用C++预测](./deploy/cpp/) * [C++预测](./deploy/cpp/)
* [移动端预测部署](./deploy/lite/) * [Paddle-Lite移动端预测部署](./deploy/lite/)
### 高级功能 ### 高级功能
* [PaddleSeg的数据增强](./docs/data_aug.md) * [PaddleSeg的数据增强](./docs/data_aug.md)
* [PaddleSeg的loss选择](./docs/loss_select.md) * [如何解决二分类中类别不均衡问题](./docs/loss_select.md)
* [特色垂类模型使用](./contrib) * [特色垂类模型使用](./contrib)
* [多进程训练和混合精度训练](./docs/multiple_gpus_train_and_mixed_precision_train.md) * [多进程训练和混合精度训练](./docs/multiple_gpus_train_and_mixed_precision_train.md)
* 使用PaddleSlim进行分割模型压缩([量化](./slim/quantization/README.md), [蒸馏](./slim/distillation/README.md), [剪枝](./slim/prune/README.md), [搜索](./slim/nas/README.md))
## 在线体验
我们在AI Studio平台上提供了在线体验的教程,欢迎体验:
|在线教程|链接|
|-|-|
|快速开始|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/100798)|
|U-Net图像分割|[点击体验](https://aistudio.baidu.com/aistudio/projectDetail/102889)|
|DeepLabv3+图像分割|[点击体验](https://aistudio.baidu.com/aistudio/projectDetail/226703)|
|工业质检(零件瑕疵检测)|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/184392)|
|人像分割|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/188833)|
|PaddleSeg特色垂类模型|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/226710)|
</br> </br>
...@@ -104,25 +147,14 @@ python pdseg/train.py --cfg xxx.yaml TRAIN.RESUME_MODEL_DIR /PATH/TO/MODEL_CKPT/ ...@@ -104,25 +147,14 @@ python pdseg/train.py --cfg xxx.yaml TRAIN.RESUME_MODEL_DIR /PATH/TO/MODEL_CKPT/
A: 降低Batch size,使用Group Norm策略;请注意训练过程中当`DEFAULT_NORM_TYPE`选择`bn`时,为了Batch Norm计算稳定性,batch size需要满足>=2 A: 降低Batch size,使用Group Norm策略;请注意训练过程中当`DEFAULT_NORM_TYPE`选择`bn`时,为了Batch Norm计算稳定性,batch size需要满足>=2
</br>
#### Q: 出现错误 ModuleNotFoundError: No module named 'paddle.fluid.contrib.mixed_precision' #### Q: 出现错误 ModuleNotFoundError: No module named 'paddle.fluid.contrib.mixed_precision'
A: 请将PaddlePaddle升级至1.5.2版本或以上。 A: 请将PaddlePaddle升级至1.5.2版本或以上。
## 在线体验
PaddleSeg在AI Studio平台上提供了在线体验的教程,欢迎体验:
|教程|链接|
|-|-|
|U-Net宠物分割|[点击体验](https://aistudio.baidu.com/aistudio/projectDetail/102889)|
|DeepLabv3+图像分割|[点击体验](https://aistudio.baidu.com/aistudio/projectDetail/101696)|
|PaddleSeg特色垂类模型|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/115541)|
</br> </br>
## 交流与反馈 ## 交流与反馈
* 欢迎您通过[Github Issues](https://github.com/PaddlePaddle/PaddleSeg/issues)来提交问题、报告与建议 * 欢迎您通过[Github Issues](https://github.com/PaddlePaddle/PaddleSeg/issues)来提交问题、报告与建议
* 微信公众号:飞桨PaddlePaddle * 微信公众号:飞桨PaddlePaddle
* QQ群: 796771754 * QQ群: 796771754
...@@ -131,25 +163,36 @@ PaddleSeg在AI Studio平台上提供了在线体验的教程,欢迎体验: ...@@ -131,25 +163,36 @@ PaddleSeg在AI Studio平台上提供了在线体验的教程,欢迎体验:
<p align="center"> &#8194;&#8194;&#8194;微信公众号&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;官方技术交流QQ群</p> <p align="center"> &#8194;&#8194;&#8194;微信公众号&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;官方技术交流QQ群</p>
## 更新日志 ## 更新日志
* 2019.12.15
**`v0.3.0`**
* 新增HRNet分割网络,提供基于cityscapes和ImageNet的[预训练模型](./docs/model_zoo.md)8个
* 支持使用[伪彩色标签](./docs/data_prepare.md#%E7%81%B0%E5%BA%A6%E6%A0%87%E6%B3%A8vs%E4%BC%AA%E5%BD%A9%E8%89%B2%E6%A0%87%E6%B3%A8)进行训练/评估/预测,提升训练体验,并提供将灰度标注图转为伪彩色标注图的脚本
* 新增[学习率warmup](./docs/configs/solver_group.md#lr_warmup)功能,支持与不同的学习率Decay策略配合使用
* 新增图像归一化操作的GPU化实现,进一步提升预测速度。
* 新增Python部署方案,更低成本完成工业级部署。
* 新增Paddle-Lite移动端部署方案,支持人像分割模型的移动端部署。
* 新增不同分割模型的预测[性能数据Benchmark](./deploy/python/docs/PaddleSeg_Infer_Benchmark.md), 便于开发者提供模型选型性能参考。
* 2019.11.04 * 2019.11.04
**`v0.2.0`** **`v0.2.0`**
* 新增PSPNet分割网络,提供基于COCO和cityscapes数据集的[预训练模型](./docs/model_zoo.md)4个 * 新增PSPNet分割网络,提供基于COCO和cityscapes数据集的[预训练模型](./docs/model_zoo.md)4个
* 新增Dice Loss、BCE Loss以及组合Loss配置,支持样本不均衡场景下的[模型优化](./docs/loss_select.md) * 新增Dice Loss、BCE Loss以及组合Loss配置,支持样本不均衡场景下的[模型优化](./docs/loss_select.md)
* 支持[FP16混合精度训练](./docs/multiple_gpus_train_and_mixed_precision_train.md)以及动态Loss Scaling,在不损耗精度的情况下,训练速度提升30%+ * 支持[FP16混合精度训练](./docs/multiple_gpus_train_and_mixed_precision_train.md)以及动态Loss Scaling,在不损耗精度的情况下,训练速度提升30%+
* 支持[PaddlePaddle多卡多进程训练](./docs/multiple_gpus_train_and_mixed_precision_train.md),多卡训练时训练速度提升15%+ * 支持[PaddlePaddle多卡多进程训练](./docs/multiple_gpus_train_and_mixed_precision_train.md),多卡训练时训练速度提升15%+
* 发布基于UNet的[工业标记表盘分割模型](./contrib#%E5%B7%A5%E4%B8%9A%E7%94%A8%E8%A1%A8%E5%88%86%E5%89%B2) * 发布基于UNet的[工业标记表盘分割模型](./contrib#%E5%B7%A5%E4%B8%9A%E7%94%A8%E8%A1%A8%E5%88%86%E5%89%B2)
* 2019.09.10 * 2019.09.10
**`v0.1.0`** **`v0.1.0`**
* PaddleSeg分割库初始版本发布,包含DeepLabv3+, U-Net, ICNet三类分割模型, 其中DeepLabv3+支持Xception, MobileNet v2两种可调节的骨干网络。 * PaddleSeg分割库初始版本发布,包含DeepLabv3+, U-Net, ICNet三类分割模型, 其中DeepLabv3+支持Xception, MobileNet v2两种可调节的骨干网络。
* CVPR19 LIP人体部件分割比赛冠军预测模型发布[ACE2P](./contrib/ACE2P) * CVPR19 LIP人体部件分割比赛冠军预测模型发布[ACE2P](./contrib/ACE2P)
* 预置基于DeepLabv3+网络的[人像分割](./contrib/HumanSeg/)[车道线分割](./contrib/RoadLine)预测模型发布 * 预置基于DeepLabv3+网络的[人像分割](./contrib/HumanSeg/)[车道线分割](./contrib/RoadLine)预测模型发布
</br> </br>
## 如何贡献代码 ## 贡献代码
我们非常欢迎您为PaddleSeg贡献代码或者提供使用建议。 我们非常欢迎您为PaddleSeg贡献代码或者提供使用建议。如果您可以修复某个issue或者增加一个新功能,欢迎给我们提交pull requests.
EVAL_CROP_SIZE: (2048, 1024) # (width, height), for unpadding rangescaling and stepscaling
TRAIN_CROP_SIZE: (1024, 1024) # (width, height), for unpadding rangescaling and stepscaling
AUG:
AUG_METHOD: "stepscaling" # choice unpadding rangescaling and stepscaling
FIX_RESIZE_SIZE: (640, 640) # (width, height), for unpadding
INF_RESIZE_VALUE: 500 # for rangescaling
MAX_RESIZE_VALUE: 600 # for rangescaling
MIN_RESIZE_VALUE: 400 # for rangescaling
MAX_SCALE_FACTOR: 2.0 # for stepscaling
MIN_SCALE_FACTOR: 0.5 # for stepscaling
SCALE_STEP_SIZE: 0.25 # for stepscaling
MIRROR: True
FLIP: False
FLIP_RATIO: 0.2
RICH_CROP:
ENABLE: True
ASPECT_RATIO: 0.0
BLUR: False
BLUR_RATIO: 0.1
MAX_ROTATION: 0
MIN_AREA_RATIO: 0.0
BRIGHTNESS_JITTER_RATIO: 0.4
CONTRAST_JITTER_RATIO: 0.4
SATURATION_JITTER_RATIO: 0.4
BATCH_SIZE: 12
MEAN: [0.5, 0.5, 0.5]
STD: [0.5, 0.5, 0.5]
DATASET:
DATA_DIR: "./dataset/cityscapes/"
IMAGE_TYPE: "rgb" # choice rgb or rgba
NUM_CLASSES: 19
TEST_FILE_LIST: "dataset/cityscapes/val.list"
TRAIN_FILE_LIST: "dataset/cityscapes/train.list"
VAL_FILE_LIST: "dataset/cityscapes/val.list"
IGNORE_INDEX: 255
FREEZE:
MODEL_FILENAME: "model"
PARAMS_FILENAME: "params"
MODEL:
DEFAULT_NORM_TYPE: "bn"
MODEL_NAME: "fast_scnn"
TEST:
TEST_MODEL: "snapshots/cityscape_fast_scnn/final/"
TRAIN:
MODEL_SAVE_DIR: "snapshots/cityscape_fast_scnn/"
SNAPSHOT_EPOCH: 10
SOLVER:
LR: 0.001
LR_POLICY: "poly"
OPTIMIZER: "sgd"
NUM_EPOCHS: 100
TRAIN_CROP_SIZE: (512, 512) # (width, height), for unpadding rangescaling and stepscaling EVAL_CROP_SIZE: (2049, 1025) # (width, height), for unpadding rangescaling and stepscaling
EVAL_CROP_SIZE: (512, 512) # (width, height), for unpadding rangescaling and stepscaling TRAIN_CROP_SIZE: (769, 769) # (width, height), for unpadding rangescaling and stepscaling
AUG: AUG:
AUG_METHOD: "unpadding" # choice unpadding rangescaling and stepscaling AUG_METHOD: "stepscaling" # choice unpadding rangescaling and stepscaling
FIX_RESIZE_SIZE: (512, 512) # (width, height), for unpadding FIX_RESIZE_SIZE: (2048, 1024) # (width, height), for unpadding
INF_RESIZE_VALUE: 500 # for rangescaling INF_RESIZE_VALUE: 500 # for rangescaling
MAX_RESIZE_VALUE: 600 # for rangescaling MAX_RESIZE_VALUE: 600 # for rangescaling
MIN_RESIZE_VALUE: 400 # for rangescaling MIN_RESIZE_VALUE: 400 # for rangescaling
MAX_SCALE_FACTOR: 2.0 # for stepscaling
MAX_SCALE_FACTOR: 1.25 # for stepscaling MIN_SCALE_FACTOR: 0.5 # for stepscaling
MIN_SCALE_FACTOR: 0.75 # for stepscaling
SCALE_STEP_SIZE: 0.25 # for stepscaling SCALE_STEP_SIZE: 0.25 # for stepscaling
MIRROR: True MIRROR: True
BATCH_SIZE: 4 BATCH_SIZE: 4
DATASET: DATASET:
DATA_DIR: "./dataset/mini_pet/" DATA_DIR: "./dataset/cityscapes/"
IMAGE_TYPE: "rgb" # choice rgb or rgba IMAGE_TYPE: "rgb" # choice rgb or rgba
NUM_CLASSES: 3 NUM_CLASSES: 19
TEST_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt" TEST_FILE_LIST: "dataset/cityscapes/val.list"
TRAIN_FILE_LIST: "./dataset/mini_pet/file_list/train_list.txt" TRAIN_FILE_LIST: "dataset/cityscapes/train.list"
VAL_FILE_LIST: "./dataset/mini_pet/file_list/val_list.txt" VAL_FILE_LIST: "dataset/cityscapes/val.list"
VIS_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt"
IGNORE_INDEX: 255 IGNORE_INDEX: 255
SEPARATOR: " " SEPARATOR: " "
FREEZE: FREEZE:
MODEL_FILENAME: "__model__" MODEL_FILENAME: "model"
PARAMS_FILENAME: "__params__" PARAMS_FILENAME: "params"
MODEL: MODEL:
MODEL_NAME: "deeplabv3p"
DEFAULT_NORM_TYPE: "bn" DEFAULT_NORM_TYPE: "bn"
MODEL_NAME: "deeplabv3p"
DEEPLAB: DEEPLAB:
BACKBONE: "xception_65" BACKBONE: "mobilenetv2"
ASPP_WITH_SEP_CONV: True
DECODER_USE_SEP_CONV: True
ENCODER_WITH_ASPP: False
ENABLE_DECODER: False
TRAIN: TRAIN:
PRETRAINED_MODEL_DIR: "./pretrained_model/deeplabv3p_xception65_bn_coco/" PRETRAINED_MODEL_DIR: u"pretrained_model/deeplabv3p_mobilenetv2-1-0_bn_coco"
MODEL_SAVE_DIR: "./saved_model/deeplabv3p_xception65_bn_pet/" MODEL_SAVE_DIR: "saved_model/deeplabv3p_mobilenetv2_cityscapes"
SNAPSHOT_EPOCH: 10 SNAPSHOT_EPOCH: 10
SYNC_BATCH_NORM: True
TEST: TEST:
TEST_MODEL: "./saved_model/deeplabv3p_xception65_bn_pet/final" TEST_MODEL: "saved_model/deeplabv3p_mobilenetv2_cityscapes/final"
SOLVER: SOLVER:
NUM_EPOCHS: 100 LR: 0.01
LR: 0.005
LR_POLICY: "poly" LR_POLICY: "poly"
OPTIMIZER: "sgd" OPTIMIZER: "sgd"
NUM_EPOCHS: 100
EVAL_CROP_SIZE: (2049, 1025) # (width, height), for unpadding rangescaling and stepscaling EVAL_CROP_SIZE: (2049, 1025) # (width, height), for unpadding rangescaling and stepscaling
TRAIN_CROP_SIZE: (713, 713) # (width, height), for unpadding rangescaling and stepscaling TRAIN_CROP_SIZE: (769, 769) # (width, height), for unpadding rangescaling and stepscaling
AUG: AUG:
AUG_METHOD: "stepscaling" # choice unpadding rangescaling and stepscaling AUG_METHOD: "stepscaling" # choice unpadding rangescaling and stepscaling
FIX_RESIZE_SIZE: (640, 640) # (width, height), for unpadding FIX_RESIZE_SIZE: (2048, 1024) # (width, height), for unpadding
INF_RESIZE_VALUE: 500 # for rangescaling INF_RESIZE_VALUE: 500 # for rangescaling
MAX_RESIZE_VALUE: 600 # for rangescaling MAX_RESIZE_VALUE: 600 # for rangescaling
MIN_RESIZE_VALUE: 400 # for rangescaling MIN_RESIZE_VALUE: 400 # for rangescaling
...@@ -19,23 +19,25 @@ DATASET: ...@@ -19,23 +19,25 @@ DATASET:
TRAIN_FILE_LIST: "dataset/cityscapes/train.list" TRAIN_FILE_LIST: "dataset/cityscapes/train.list"
VAL_FILE_LIST: "dataset/cityscapes/val.list" VAL_FILE_LIST: "dataset/cityscapes/val.list"
IGNORE_INDEX: 255 IGNORE_INDEX: 255
SEPARATOR: " "
FREEZE: FREEZE:
MODEL_FILENAME: "model" MODEL_FILENAME: "model"
PARAMS_FILENAME: "params" PARAMS_FILENAME: "params"
MODEL: MODEL:
MODEL_NAME: "pspnet"
DEFAULT_NORM_TYPE: "bn" DEFAULT_NORM_TYPE: "bn"
PSPNET: MODEL_NAME: "deeplabv3p"
DEPTH_MULTIPLIER: 1 DEEPLAB:
LAYERS: 50 ASPP_WITH_SEP_CONV: True
TEST: DECODER_USE_SEP_CONV: True
TEST_MODEL: "snapshots/cityscapes_pspnet50/final"
TRAIN: TRAIN:
MODEL_SAVE_DIR: "snapshots/cityscapes_pspnet50/" PRETRAINED_MODEL_DIR: u"pretrained_model/deeplabv3p_xception65_bn_coco"
PRETRAINED_MODEL_DIR: u"pretrained_model/pspnet50_bn_cityscapes/" MODEL_SAVE_DIR: "saved_model/deeplabv3p_xception65_bn_cityscapes"
SNAPSHOT_EPOCH: 10 SNAPSHOT_EPOCH: 10
SYNC_BATCH_NORM: True
TEST:
TEST_MODEL: "saved_model/deeplabv3p_xception65_bn_cityscapes/final"
SOLVER: SOLVER:
LR: 0.001 LR: 0.01
LR_POLICY: "poly" LR_POLICY: "poly"
OPTIMIZER: "sgd" OPTIMIZER: "sgd"
NUM_EPOCHS: 700 NUM_EPOCHS: 100
# 数据集配置
DATASET:
DATA_DIR: "./dataset/optic_disc_seg/"
NUM_CLASSES: 2
TEST_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"
TRAIN_FILE_LIST: "./dataset/optic_disc_seg/train_list.txt"
VAL_FILE_LIST: "./dataset/optic_disc_seg/val_list.txt"
VIS_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"
# 预训练模型配置
MODEL:
MODEL_NAME: "deeplabv3p"
DEFAULT_NORM_TYPE: "bn"
DEEPLAB:
BACKBONE: "xception_65"
# 其他配置
TRAIN_CROP_SIZE: (512, 512)
EVAL_CROP_SIZE: (512, 512)
AUG:
AUG_METHOD: "unpadding"
FIX_RESIZE_SIZE: (512, 512)
BATCH_SIZE: 4
TRAIN:
PRETRAINED_MODEL_DIR: "./pretrained_model/deeplabv3p_xception65_bn_coco/"
MODEL_SAVE_DIR: "./saved_model/deeplabv3p_xception65_bn_optic/"
SNAPSHOT_EPOCH: 5
TEST:
TEST_MODEL: "./saved_model/deeplabv3p_xception65_bn_optic/final"
SOLVER:
NUM_EPOCHS: 10
LR: 0.001
LR_POLICY: "poly"
OPTIMIZER: "adam"
\ No newline at end of file
...@@ -27,16 +27,17 @@ FREEZE: ...@@ -27,16 +27,17 @@ FREEZE:
MODEL_FILENAME: "__model__" MODEL_FILENAME: "__model__"
PARAMS_FILENAME: "__params__" PARAMS_FILENAME: "__params__"
MODEL: MODEL:
MODEL_NAME: "unet" MODEL_NAME: "fast_scnn"
DEFAULT_NORM_TYPE: "bn" DEFAULT_NORM_TYPE: "bn"
TEST:
TEST_MODEL: "./saved_model/unet_pet/final/"
TRAIN: TRAIN:
MODEL_SAVE_DIR: "./saved_model/unet_pet/" PRETRAINED_MODEL_DIR: "./pretrained_model/fast_scnn_cityscapes/"
PRETRAINED_MODEL_DIR: "./pretrained_model/unet_bn_coco/" MODEL_SAVE_DIR: "./saved_model/fast_scnn_pet/"
SNAPSHOT_EPOCH: 10 SNAPSHOT_EPOCH: 10
TEST:
TEST_MODEL: "./saved_model/fast_scnn_pet/final"
SOLVER: SOLVER:
NUM_EPOCHS: 100 NUM_EPOCHS: 100
LR: 0.005 LR: 0.005
LR_POLICY: "poly" LR_POLICY: "poly"
OPTIMIZER: "adam" OPTIMIZER: "sgd"
# 数据集配置
DATASET:
DATA_DIR: "./dataset/optic_disc_seg/"
NUM_CLASSES: 2
TEST_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"
TRAIN_FILE_LIST: "./dataset/optic_disc_seg/train_list.txt"
VAL_FILE_LIST: "./dataset/optic_disc_seg/val_list.txt"
VIS_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"
# 预训练模型配置
MODEL:
MODEL_NAME: "hrnet"
DEFAULT_NORM_TYPE: "bn"
HRNET:
STAGE2:
NUM_CHANNELS: [18, 36]
STAGE3:
NUM_CHANNELS: [18, 36, 72]
STAGE4:
NUM_CHANNELS: [18, 36, 72, 144]
# 其他配置
TRAIN_CROP_SIZE: (512, 512)
EVAL_CROP_SIZE: (512, 512)
AUG:
AUG_METHOD: "unpadding"
FIX_RESIZE_SIZE: (512, 512)
BATCH_SIZE: 4
TRAIN:
PRETRAINED_MODEL_DIR: "./pretrained_model/hrnet_w18_bn_cityscapes/"
MODEL_SAVE_DIR: "./saved_model/hrnet_optic/"
SNAPSHOT_EPOCH: 5
TEST:
TEST_MODEL: "./saved_model/hrnet_optic/final"
SOLVER:
NUM_EPOCHS: 10
LR: 0.001
LR_POLICY: "poly"
OPTIMIZER: "adam"
# 数据集配置
DATASET:
DATA_DIR: "./dataset/optic_disc_seg/"
NUM_CLASSES: 2
TEST_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"
TRAIN_FILE_LIST: "./dataset/optic_disc_seg/train_list.txt"
VAL_FILE_LIST: "./dataset/optic_disc_seg/val_list.txt"
VIS_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"
# 预训练模型配置
MODEL:
MODEL_NAME: "icnet"
DEFAULT_NORM_TYPE: "bn"
MULTI_LOSS_WEIGHT: "[1.0, 0.4, 0.16]"
ICNET:
DEPTH_MULTIPLIER: 0.5
# 其他配置
TRAIN_CROP_SIZE: (512, 512)
EVAL_CROP_SIZE: (512, 512)
AUG:
AUG_METHOD: "unpadding"
FIX_RESIZE_SIZE: (512, 512)
BATCH_SIZE: 4
TRAIN:
PRETRAINED_MODEL_DIR: "./pretrained_model/icnet_bn_cityscapes/"
MODEL_SAVE_DIR: "./saved_model/icnet_optic/"
SNAPSHOT_EPOCH: 5
TEST:
TEST_MODEL: "./saved_model/icnet_optic/final"
SOLVER:
NUM_EPOCHS: 10
LR: 0.001
LR_POLICY: "poly"
OPTIMIZER: "adam"
TRAIN_CROP_SIZE: (512, 512) # (width, height), for unpadding rangescaling and stepscaling
EVAL_CROP_SIZE: (512, 512) # (width, height), for unpadding rangescaling and stepscaling
AUG:
AUG_METHOD: "unpadding" # choice unpadding rangescaling and stepscaling
FIX_RESIZE_SIZE: (512, 512) # (width, height), for unpadding
INF_RESIZE_VALUE: 500 # for rangescaling
MAX_RESIZE_VALUE: 600 # for rangescaling
MIN_RESIZE_VALUE: 400 # for rangescaling
MAX_SCALE_FACTOR: 1.25 # for stepscaling
MIN_SCALE_FACTOR: 0.75 # for stepscaling
SCALE_STEP_SIZE: 0.25 # for stepscaling
MIRROR: True
BATCH_SIZE: 4
DATASET:
DATA_DIR: "./dataset/mini_pet/"
IMAGE_TYPE: "rgb" # choice rgb or rgba
NUM_CLASSES: 3
TEST_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt"
TRAIN_FILE_LIST: "./dataset/mini_pet/file_list/train_list.txt"
VAL_FILE_LIST: "./dataset/mini_pet/file_list/val_list.txt"
VIS_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt"
IGNORE_INDEX: 255
SEPARATOR: " "
FREEZE:
MODEL_FILENAME: "__model__"
PARAMS_FILENAME: "__params__"
MODEL:
MODEL_NAME: "icnet"
DEFAULT_NORM_TYPE: "bn"
MULTI_LOSS_WEIGHT: "[1.0, 0.4, 0.16]"
ICNET:
DEPTH_MULTIPLIER: 0.5
TRAIN:
PRETRAINED_MODEL_DIR: "./pretrained_model/icnet_bn_cityscapes/"
MODEL_SAVE_DIR: "./saved_model/icnet_pet/"
SNAPSHOT_EPOCH: 10
TEST:
TEST_MODEL: "./saved_model/icnet_pet/final"
SOLVER:
NUM_EPOCHS: 100
LR: 0.005
LR_POLICY: "poly"
OPTIMIZER: "sgd"
# 数据集配置
DATASET:
DATA_DIR: "./dataset/optic_disc_seg/"
NUM_CLASSES: 2
TEST_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"
TRAIN_FILE_LIST: "./dataset/optic_disc_seg/train_list.txt"
VAL_FILE_LIST: "./dataset/optic_disc_seg/val_list.txt"
VIS_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"
# 预训练模型配置
MODEL:
MODEL_NAME: "pspnet"
DEFAULT_NORM_TYPE: "bn"
PSPNET:
DEPTH_MULTIPLIER: 1
LAYERS: 50
# 其他配置
TRAIN_CROP_SIZE: (512, 512)
EVAL_CROP_SIZE: (512, 512)
AUG:
AUG_METHOD: "unpadding"
FIX_RESIZE_SIZE: (512, 512)
BATCH_SIZE: 4
TRAIN:
PRETRAINED_MODEL_DIR: "./pretrained_model/pspnet50_bn_cityscapes/"
MODEL_SAVE_DIR: "./saved_model/pspnet_optic/"
SNAPSHOT_EPOCH: 5
TEST:
TEST_MODEL: "./saved_model/pspnet_optic/final"
SOLVER:
NUM_EPOCHS: 10
LR: 0.001
LR_POLICY: "poly"
OPTIMIZER: "adam"
# 数据集配置
DATASET:
DATA_DIR: "./dataset/optic_disc_seg/"
NUM_CLASSES: 2
TEST_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"
TRAIN_FILE_LIST: "./dataset/optic_disc_seg/train_list.txt"
VAL_FILE_LIST: "./dataset/optic_disc_seg/val_list.txt"
VIS_FILE_LIST: "./dataset/optic_disc_seg/test_list.txt"
# 预训练模型配置
MODEL:
MODEL_NAME: "unet"
DEFAULT_NORM_TYPE: "bn"
# 其他配置
TRAIN_CROP_SIZE: (512, 512)
EVAL_CROP_SIZE: (512, 512)
AUG:
AUG_METHOD: "unpadding"
FIX_RESIZE_SIZE: (512, 512)
BATCH_SIZE: 4
TRAIN:
PRETRAINED_MODEL_DIR: "./pretrained_model/unet_bn_coco/"
MODEL_SAVE_DIR: "./saved_model/unet_optic/"
SNAPSHOT_EPOCH: 5
TEST:
TEST_MODEL: "./saved_model/unet_optic/final"
SOLVER:
NUM_EPOCHS: 10
LR: 0.001
LR_POLICY: "poly"
OPTIMIZER: "adam"
...@@ -6,13 +6,13 @@ args = get_arguments() ...@@ -6,13 +6,13 @@ args = get_arguments()
cfg = AttrDict() cfg = AttrDict()
# 待预测图像所在路径 # 待预测图像所在路径
cfg.data_dir = os.path.join(args.example , "data", "testing_images") cfg.data_dir = os.path.join("data", "testing_images")
# 待预测图像名称列表 # 待预测图像名称列表
cfg.data_list_file = os.path.join(args.example , "data", "test_id.txt") cfg.data_list_file = os.path.join("data", "test_id.txt")
# 模型加载路径 # 模型加载路径
cfg.model_path = os.path.join(args.example , "ACE2P") cfg.model_path = args.example
# 预测结果保存路径 # 预测结果保存路径
cfg.vis_dir = os.path.join(args.example , "result") cfg.vis_dir = "result"
# 预测类别数 # 预测类别数
cfg.class_num = 20 cfg.class_num = 20
......
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
import os
LOCAL_PATH = os.path.dirname(os.path.abspath(__file__))
TEST_PATH = os.path.join(LOCAL_PATH, "..", "..", "test")
sys.path.append(TEST_PATH)
from test_utils import download_file_and_uncompress
if __name__ == "__main__":
download_file_and_uncompress(
url='https://paddleseg.bj.bcebos.com/models/ACE2P.tgz',
savepath=LOCAL_PATH,
extrapath=LOCAL_PATH,
extraname='ACE2P')
print("Pretrained Model download success!")
# -*- coding: utf-8 -*-
import os
import cv2
import numpy as np
from utils.util import get_arguments
from utils.palette import get_palette
from PIL import Image as PILImage
import importlib
args = get_arguments()
config = importlib.import_module('config')
cfg = getattr(config, 'cfg')
# paddle垃圾回收策略FLAG,ACE2P模型较大,当显存不够时建议开启
os.environ['FLAGS_eager_delete_tensor_gb']='0.0'
import paddle.fluid as fluid
# 预测数据集类
class TestDataSet():
def __init__(self):
self.data_dir = cfg.data_dir
self.data_list_file = cfg.data_list_file
self.data_list = self.get_data_list()
self.data_num = len(self.data_list)
def get_data_list(self):
# 获取预测图像路径列表
data_list = []
data_file_handler = open(self.data_list_file, 'r')
for line in data_file_handler:
img_name = line.strip()
name_prefix = img_name.split('.')[0]
if len(img_name.split('.')) == 1:
img_name = img_name + '.jpg'
img_path = os.path.join(self.data_dir, img_name)
data_list.append(img_path)
return data_list
def preprocess(self, img):
# 图像预处理
if cfg.example == 'ACE2P':
reader = importlib.import_module('reader')
ACE2P_preprocess = getattr(reader, 'preprocess')
img = ACE2P_preprocess(img)
else:
img = cv2.resize(img, cfg.input_size).astype(np.float32)
img -= np.array(cfg.MEAN)
img /= np.array(cfg.STD)
img = img.transpose((2, 0, 1))
img = np.expand_dims(img, axis=0)
return img
def get_data(self, index):
# 获取图像信息
img_path = self.data_list[index]
img = cv2.imread(img_path, cv2.IMREAD_COLOR)
if img is None:
return img, img,img_path, None
img_name = img_path.split(os.sep)[-1]
name_prefix = img_name.replace('.'+img_name.split('.')[-1],'')
img_shape = img.shape[:2]
img_process = self.preprocess(img)
return img, img_process, name_prefix, img_shape
def infer():
if not os.path.exists(cfg.vis_dir):
os.makedirs(cfg.vis_dir)
palette = get_palette(cfg.class_num)
# 人像分割结果显示阈值
thresh = 120
place = fluid.CUDAPlace(0) if cfg.use_gpu else fluid.CPUPlace()
exe = fluid.Executor(place)
# 加载预测模型
test_prog, feed_name, fetch_list = fluid.io.load_inference_model(
dirname=cfg.model_path, executor=exe, params_filename='__params__')
#加载预测数据集
test_dataset = TestDataSet()
data_num = test_dataset.data_num
for idx in range(data_num):
# 数据获取
ori_img, image, im_name, im_shape = test_dataset.get_data(idx)
if image is None:
print(im_name, 'is None')
continue
# 预测
if cfg.example == 'ACE2P':
# ACE2P模型使用多尺度预测
reader = importlib.import_module('reader')
multi_scale_test = getattr(reader, 'multi_scale_test')
parsing, logits = multi_scale_test(exe, test_prog, feed_name, fetch_list, image, im_shape)
else:
# HumanSeg,RoadLine模型单尺度预测
result = exe.run(program=test_prog, feed={feed_name[0]: image}, fetch_list=fetch_list)
parsing = np.argmax(result[0][0], axis=0)
parsing = cv2.resize(parsing.astype(np.uint8), im_shape[::-1])
# 预测结果保存
result_path = os.path.join(cfg.vis_dir, im_name + '.png')
if cfg.example == 'HumanSeg':
logits = result[0][0][1]*255
logits = cv2.resize(logits, im_shape[::-1])
ret, logits = cv2.threshold(logits, thresh, 0, cv2.THRESH_TOZERO)
logits = 255 *(logits - thresh)/(255 - thresh)
# 将分割结果添加到alpha通道
rgba = np.concatenate((ori_img, np.expand_dims(logits, axis=2)), axis=2)
cv2.imwrite(result_path, rgba)
else:
output_im = PILImage.fromarray(np.asarray(parsing, dtype=np.uint8))
output_im.putpalette(palette)
output_im.save(result_path)
if (idx + 1) % 100 == 0:
print('%d processd' % (idx + 1))
print('%d processd done' % (idx + 1))
return 0
if __name__ == "__main__":
infer()
# -*- coding: utf-8 -*- # -*- coding: utf-8 -*-
import numpy as np import numpy as np
import paddle.fluid as fluid import paddle.fluid as fluid
from ACE2P.config import cfg from config import cfg
import cv2 import cv2
def get_affine_points(src_shape, dst_shape, rot_grad=0): def get_affine_points(src_shape, dst_shape, rot_grad=0):
......
...@@ -8,7 +8,7 @@ from PIL import Image as PILImage ...@@ -8,7 +8,7 @@ from PIL import Image as PILImage
import importlib import importlib
args = get_arguments() args = get_arguments()
config = importlib.import_module(args.example+'.config') config = importlib.import_module('config')
cfg = getattr(config, 'cfg') cfg = getattr(config, 'cfg')
# paddle垃圾回收策略FLAG,ACE2P模型较大,当显存不够时建议开启 # paddle垃圾回收策略FLAG,ACE2P模型较大,当显存不够时建议开启
......
##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
## Created by: RainbowSecret
## Microsoft Research
## yuyua@microsoft.com
## Copyright (c) 2018
##
## This source code is licensed under the MIT-style license found in the
## LICENSE file in the root directory of this source tree
##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import cv2
def get_palette(num_cls):
""" Returns the color map for visualizing the segmentation mask.
Args:
num_cls: Number of classes
Returns:
The color map
"""
n = num_cls
palette = [0] * (n * 3)
for j in range(0, n):
lab = j
palette[j * 3 + 0] = 0
palette[j * 3 + 1] = 0
palette[j * 3 + 2] = 0
i = 0
while lab:
palette[j * 3 + 0] |= (((lab >> 0) & 1) << (7 - i))
palette[j * 3 + 1] |= (((lab >> 1) & 1) << (7 - i))
palette[j * 3 + 2] |= (((lab >> 2) & 1) << (7 - i))
i += 1
lab >>= 3
return palette
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals
import argparse
import os
def get_arguments():
parser = argparse.ArgumentParser()
parser.add_argument("--use_gpu",
action="store_true",
help="Use gpu or cpu to test.")
parser.add_argument('--example',
type=str,
help='RoadLine, HumanSeg or ACE2P')
return parser.parse_args()
class AttrDict(dict):
def __init__(self, *args, **kwargs):
super(AttrDict, self).__init__(*args, **kwargs)
def __getattr__(self, name):
if name in self.__dict__:
return self.__dict__[name]
elif name in self:
return self[name]
else:
raise AttributeError(name)
def __setattr__(self, name, value):
if name in self.__dict__:
self.__dict__[name] = value
else:
self[name] = value
def merge_cfg_from_args(args, cfg):
"""Merge config keys, values in args into the global config."""
for k, v in vars(args).items():
d = cfg
try:
value = eval(v)
except:
value = v
if value is not None:
cfg[k] = value
# LaneNet 模型训练教程
* 本教程旨在介绍如何通过使用PaddleSeg进行车道线检测
* 在阅读本教程前,请确保您已经了解过PaddleSeg的[快速入门](../README.md#快速入门)[基础功能](../README.md#基础功能)等章节,以便对PaddleSeg有一定的了解
## 环境依赖
* PaddlePaddle >= 1.7.0 或develop版本
* Python 3.5+
通过以下命令安装python包依赖,请确保在该分支上至少执行过一次以下命令
```shell
$ pip install -r requirements.txt
```
## 一. 准备待训练数据
我们提前准备好了一份处理好的数据集,通过以下代码进行下载,该数据集由图森车道线检测数据集转换而来,你也可以在这个[页面](https://github.com/TuSimple/tusimple-benchmark/issues/3)下载原始数据集。
```shell
python dataset/download_tusimple.py
```
数据目录结构
```
LaneNet
|-- dataset
|-- tusimple_lane_detection
|-- training
|-- gt_binary_image
|-- gt_image
|-- gt_instance_image
|-- train_part.txt
|-- val_part.txt
```
## 二. 下载预训练模型
下载[vgg预训练模型](https://paddle-imagenet-models-name.bj.bcebos.com/VGG16_pretrained.tar),放在```pretrained_models```文件夹下。
## 三. 准备配置
接着我们需要确定相关配置,从本教程的角度,配置分为三部分:
* 数据集
* 训练集主目录
* 训练集文件列表
* 测试集文件列表
* 评估集文件列表
* 预训练模型
* 预训练模型名称
* 预训练模型的backbone网络
* 预训练模型路径
* 其他
* 学习率
* Batch大小
* ...
在三者中,预训练模型的配置尤为重要,如果模型或者BACKBONE配置错误,会导致预训练的参数没有加载,进而影响收敛速度。预训练模型相关的配置如第二步所展示。
数据集的配置和数据路径有关,在本教程中,数据存放在`dataset/tusimple_lane_detection`
其他配置则根据数据集和机器环境的情况进行调节,最终我们保存一个如下内容的yaml配置文件,存放路径为**configs/lanenet.yaml**
```yaml
# 数据集配置
DATASET:
DATA_DIR: "./dataset/tusimple_lane_detection"
IMAGE_TYPE: "rgb" # choice rgb or rgba
NUM_CLASSES: 2
TEST_FILE_LIST: "./dataset/tusimple_lane_detection/training/val_part.txt"
TRAIN_FILE_LIST: "./dataset/tusimple_lane_detection/training/train_part.txt"
VAL_FILE_LIST: "./dataset/tusimple_lane_detection/training/val_part.txt"
SEPARATOR: " "
# 预训练模型配置
MODEL:
MODEL_NAME: "lanenet"
# 其他配置
EVAL_CROP_SIZE: (512, 256)
TRAIN_CROP_SIZE: (512, 256)
AUG:
AUG_METHOD: u"unpadding" # choice unpadding rangescaling and stepscaling
FIX_RESIZE_SIZE: (512, 256) # (width, height), for unpadding
MIRROR: False
RICH_CROP:
ENABLE: False
BATCH_SIZE: 4
TEST:
TEST_MODEL: "./saved_model/lanenet/final/"
TRAIN:
MODEL_SAVE_DIR: "./saved_model/lanenet/"
PRETRAINED_MODEL_DIR: "./pretrained_models/VGG16_pretrained"
SNAPSHOT_EPOCH: 5
SOLVER:
NUM_EPOCHS: 100
LR: 0.0005
LR_POLICY: "poly"
OPTIMIZER: "sgd"
WEIGHT_DECAY: 0.001
```
## 五. 开始训练
使用下述命令启动训练
```shell
CUDA_VISIBLE_DEVICES=0 python -u train.py --cfg configs/lanenet.yaml --use_gpu --use_mpio --do_eval
```
## 六. 进行评估
模型训练完成,使用下述命令启动评估
```shell
CUDA_VISIBLE_DEVICES=0 python -u eval.py --use_gpu --cfg configs/lanenet.yaml
```
## 七. 可视化
需要先下载一个车前视角和鸟瞰图视角转换所需文件,点击[链接](https://paddleseg.bj.bcebos.com/resources/tusimple_ipm_remap.tar),下载后放在```./utils```下。同时我们提供了一个训练好的模型,点击[链接](https://paddleseg.bj.bcebos.com/models/lanenet_vgg_tusimple.tar),下载后放在```./pretrained_models/```下,使用如下命令进行可视化
```shell
CUDA_VISIBLE_DEVICES=0 python -u ./vis.py --cfg configs/lanenet.yaml --use_gpu --vis_dir vis_result \
TEST.TEST_MODEL pretrained_models/LaneNet_vgg_tusimple/
```
可视化结果示例:
预测结果:<br/>
![](imgs/0005_pred_lane.png)
分割结果:<br/>
![](imgs/0005_pred_binary.png)<br/>
车道线实例预测结果:<br/>
![](imgs/0005_pred_instance.png)
TRAIN_CROP_SIZE: (512, 512) # (width, height), for unpadding rangescaling and stepscaling EVAL_CROP_SIZE: (512, 256) # (width, height), for unpadding rangescaling and stepscaling
EVAL_CROP_SIZE: (512, 512) # (width, height), for unpadding rangescaling and stepscaling TRAIN_CROP_SIZE: (512, 256) # (width, height), for unpadding rangescaling and stepscaling
AUG: AUG:
AUG_METHOD: "unpadding" # choice unpadding rangescaling and stepscaling AUG_METHOD: u"unpadding" # choice unpadding rangescaling and stepscaling
FIX_RESIZE_SIZE: (512, 512) # (width, height), for unpadding FIX_RESIZE_SIZE: (512, 256) # (width, height), for unpadding
INF_RESIZE_VALUE: 500 # for rangescaling INF_RESIZE_VALUE: 500 # for rangescaling
MAX_RESIZE_VALUE: 600 # for rangescaling MAX_RESIZE_VALUE: 600 # for rangescaling
MIN_RESIZE_VALUE: 400 # for rangescaling MIN_RESIZE_VALUE: 400 # for rangescaling
MAX_SCALE_FACTOR: 2.0 # for stepscaling
MAX_SCALE_FACTOR: 1.25 # for stepscaling MIN_SCALE_FACTOR: 0.5 # for stepscaling
MIN_SCALE_FACTOR: 0.75 # for stepscaling
SCALE_STEP_SIZE: 0.25 # for stepscaling SCALE_STEP_SIZE: 0.25 # for stepscaling
MIRROR: True MIRROR: False
RICH_CROP:
ENABLE: False
BATCH_SIZE: 4 BATCH_SIZE: 4
DATASET:
DATA_DIR: "./dataset/mini_pet/" DATALOADER:
BUF_SIZE: 256
NUM_WORKERS: 4
DATASET:
DATA_DIR: "./dataset/tusimple_lane_detection"
IMAGE_TYPE: "rgb" # choice rgb or rgba IMAGE_TYPE: "rgb" # choice rgb or rgba
NUM_CLASSES: 3 NUM_CLASSES: 2
TEST_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt" TEST_FILE_LIST: "./dataset/tusimple_lane_detection/training/val_part.txt"
TRAIN_FILE_LIST: "./dataset/mini_pet/file_list/train_list.txt" TEST_TOTAL_IMAGES: 362
VAL_FILE_LIST: "./dataset/mini_pet/file_list/val_list.txt" TRAIN_FILE_LIST: "./dataset/tusimple_lane_detection/training/train_part.txt"
VIS_FILE_LIST: "./dataset/mini_pet/file_list/test_list.txt" TRAIN_TOTAL_IMAGES: 3264
IGNORE_INDEX: 255 VAL_FILE_LIST: "./dataset/tusimple_lane_detection/training/val_part.txt"
VAL_TOTAL_IMAGES: 362
SEPARATOR: " " SEPARATOR: " "
IGNORE_INDEX: 255
FREEZE: FREEZE:
MODEL_FILENAME: "__model__" MODEL_FILENAME: "__model__"
PARAMS_FILENAME: "__params__" PARAMS_FILENAME: "__params__"
MODEL: MODEL:
MODEL_NAME: "hrnet" MODEL_NAME: "lanenet"
DEFAULT_NORM_TYPE: "bn" DEFAULT_NORM_TYPE: "bn"
HRNET:
STAGE2:
NUM_CHANNELS: [18, 36]
STAGE3:
NUM_CHANNELS: [18, 36, 72]
STAGE4:
NUM_CHANNELS: [18, 36, 72, 144]
TRAIN:
PRETRAINED_MODEL_DIR: "./pretrained_model/hrnet_w18_bn_cityscapes/"
MODEL_SAVE_DIR: "./saved_model/hrnet_w18_bn_pet/"
SNAPSHOT_EPOCH: 10
TEST: TEST:
TEST_MODEL: "./saved_model/hrnet_w18_bn_pet/final" TEST_MODEL: "./saved_model/lanenet/final/"
TRAIN:
MODEL_SAVE_DIR: "./saved_model/lanenet/"
PRETRAINED_MODEL_DIR: "./pretrained_models/VGG16_pretrained"
SNAPSHOT_EPOCH: 1
SOLVER: SOLVER:
NUM_EPOCHS: 100 NUM_EPOCHS: 100
LR: 0.005 LR: 0.0005
LR_POLICY: "poly" LR_POLICY: "poly"
OPTIMIZER: "sgd" OPTIMIZER: "sgd"
WEIGHT_DECAY: 0.001
# coding: utf8
# copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import print_function
import cv2
import numpy as np
from utils.config import cfg
from models.model_builder import ModelPhase
from pdseg.data_aug import get_random_scale, randomly_scale_image_and_label, random_rotation, \
rand_scale_aspect, hsv_color_jitter, rand_crop
def resize(img, grt=None, grt_instance=None, mode=ModelPhase.TRAIN):
"""
改变图像及标签图像尺寸
AUG.AUG_METHOD为unpadding,所有模式均直接resize到AUG.FIX_RESIZE_SIZE的尺寸
AUG.AUG_METHOD为stepscaling, 按比例resize,训练时比例范围AUG.MIN_SCALE_FACTOR到AUG.MAX_SCALE_FACTOR,间隔为AUG.SCALE_STEP_SIZE,其他模式返回原图
AUG.AUG_METHOD为rangescaling,长边对齐,短边按比例变化,训练时长边对齐范围AUG.MIN_RESIZE_VALUE到AUG.MAX_RESIZE_VALUE,其他模式长边对齐AUG.INF_RESIZE_VALUE
Args:
img(numpy.ndarray): 输入图像
grt(numpy.ndarray): 标签图像,默认为None
mode(string): 模式, 默认训练模式,即ModelPhase.TRAIN
Returns:
resize后的图像和标签图
"""
if cfg.AUG.AUG_METHOD == 'unpadding':
target_size = cfg.AUG.FIX_RESIZE_SIZE
img = cv2.resize(img, target_size, interpolation=cv2.INTER_LINEAR)
if grt is not None:
grt = cv2.resize(grt, target_size, interpolation=cv2.INTER_NEAREST)
if grt_instance is not None:
grt_instance = cv2.resize(grt_instance, target_size, interpolation=cv2.INTER_NEAREST)
elif cfg.AUG.AUG_METHOD == 'stepscaling':
if mode == ModelPhase.TRAIN:
min_scale_factor = cfg.AUG.MIN_SCALE_FACTOR
max_scale_factor = cfg.AUG.MAX_SCALE_FACTOR
step_size = cfg.AUG.SCALE_STEP_SIZE
scale_factor = get_random_scale(min_scale_factor, max_scale_factor,
step_size)
img, grt = randomly_scale_image_and_label(
img, grt, scale=scale_factor)
elif cfg.AUG.AUG_METHOD == 'rangescaling':
min_resize_value = cfg.AUG.MIN_RESIZE_VALUE
max_resize_value = cfg.AUG.MAX_RESIZE_VALUE
if mode == ModelPhase.TRAIN:
if min_resize_value == max_resize_value:
random_size = min_resize_value
else:
random_size = int(
np.random.uniform(min_resize_value, max_resize_value) + 0.5)
else:
random_size = cfg.AUG.INF_RESIZE_VALUE
value = max(img.shape[0], img.shape[1])
scale = float(random_size) / float(value)
img = cv2.resize(
img, (0, 0), fx=scale, fy=scale, interpolation=cv2.INTER_LINEAR)
if grt is not None:
grt = cv2.resize(
grt, (0, 0),
fx=scale,
fy=scale,
interpolation=cv2.INTER_NEAREST)
else:
raise Exception("Unexpect data augmention method: {}".format(
cfg.AUG.AUG_METHOD))
return img, grt, grt_instance
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
import os
LOCAL_PATH = os.path.dirname(os.path.abspath(__file__))
TEST_PATH = os.path.join(LOCAL_PATH, "../../../", "test")
sys.path.append(TEST_PATH)
from test_utils import download_file_and_uncompress
def download_tusimple_dataset(savepath, extrapath):
url = "https://paddleseg.bj.bcebos.com/dataset/tusimple_lane_detection.tar"
download_file_and_uncompress(
url=url, savepath=savepath, extrapath=extrapath)
if __name__ == "__main__":
download_tusimple_dataset(LOCAL_PATH, LOCAL_PATH)
print("Dataset download finish!")
# coding: utf8
# copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
# GPU memory garbage collection optimization flags
os.environ['FLAGS_eager_delete_tensor_gb'] = "0.0"
import sys
cur_path = os.path.abspath(os.path.dirname(__file__))
root_path = os.path.split(os.path.split(cur_path)[0])[0]
LOCAL_PATH = os.path.dirname(os.path.abspath(__file__))
SEG_PATH = os.path.join(LOCAL_PATH, "../../../")
sys.path.append(SEG_PATH)
sys.path.append(root_path)
import time
import argparse
import functools
import pprint
import cv2
import numpy as np
import paddle
import paddle.fluid as fluid
from utils.config import cfg
from pdseg.utils.timer import Timer, calculate_eta
from models.model_builder import build_model
from models.model_builder import ModelPhase
from reader import LaneNetDataset
def parse_args():
parser = argparse.ArgumentParser(description='PaddleSeg model evalution')
parser.add_argument(
'--cfg',
dest='cfg_file',
help='Config file for training (and optionally testing)',
default=None,
type=str)
parser.add_argument(
'--use_gpu',
dest='use_gpu',
help='Use gpu or cpu',
action='store_true',
default=False)
parser.add_argument(
'--use_mpio',
dest='use_mpio',
help='Use multiprocess IO or not',
action='store_true',
default=False)
parser.add_argument(
'opts',
help='See utils/config.py for all options',
default=None,
nargs=argparse.REMAINDER)
if len(sys.argv) == 1:
parser.print_help()
sys.exit(1)
return parser.parse_args()
def evaluate(cfg, ckpt_dir=None, use_gpu=False, use_mpio=False, **kwargs):
np.set_printoptions(precision=5, suppress=True)
startup_prog = fluid.Program()
test_prog = fluid.Program()
dataset = LaneNetDataset(
file_list=cfg.DATASET.VAL_FILE_LIST,
mode=ModelPhase.TRAIN,
shuffle=True,
data_dir=cfg.DATASET.DATA_DIR)
def data_generator():
#TODO: check is batch reader compatitable with Windows
if use_mpio:
data_gen = dataset.multiprocess_generator(
num_processes=cfg.DATALOADER.NUM_WORKERS,
max_queue_size=cfg.DATALOADER.BUF_SIZE)
else:
data_gen = dataset.generator()
for b in data_gen:
yield b
py_reader, pred, grts, masks, accuracy, fp, fn = build_model(
test_prog, startup_prog, phase=ModelPhase.EVAL)
py_reader.decorate_sample_generator(
data_generator, drop_last=False, batch_size=cfg.BATCH_SIZE)
# Get device environment
places = fluid.cuda_places() if use_gpu else fluid.cpu_places()
place = places[0]
dev_count = len(places)
print("#Device count: {}".format(dev_count))
exe = fluid.Executor(place)
exe.run(startup_prog)
test_prog = test_prog.clone(for_test=True)
ckpt_dir = cfg.TEST.TEST_MODEL if not ckpt_dir else ckpt_dir
if ckpt_dir is not None:
print('load test model:', ckpt_dir)
fluid.io.load_params(exe, ckpt_dir, main_program=test_prog)
# Use streaming confusion matrix to calculate mean_iou
np.set_printoptions(
precision=4, suppress=True, linewidth=160, floatmode="fixed")
fetch_list = [pred.name, grts.name, masks.name, accuracy.name, fp.name, fn.name]
num_images = 0
step = 0
avg_acc = 0.0
avg_fp = 0.0
avg_fn = 0.0
# cur_images = 0
all_step = cfg.DATASET.TEST_TOTAL_IMAGES // cfg.BATCH_SIZE + 1
timer = Timer()
timer.start()
py_reader.start()
while True:
try:
step += 1
pred, grts, masks, out_acc, out_fp, out_fn = exe.run(
test_prog, fetch_list=fetch_list, return_numpy=True)
avg_acc += np.mean(out_acc) * pred.shape[0]
avg_fp += np.mean(out_fp) * pred.shape[0]
avg_fn += np.mean(out_fn) * pred.shape[0]
num_images += pred.shape[0]
speed = 1.0 / timer.elapsed_time()
print(
"[EVAL]step={} accuracy={:.4f} fp={:.4f} fn={:.4f} step/sec={:.2f} | ETA {}"
.format(step, avg_acc / num_images, avg_fp / num_images, avg_fn / num_images, speed,
calculate_eta(all_step - step, speed)))
timer.restart()
sys.stdout.flush()
except fluid.core.EOFException:
break
print("[EVAL]#image={} accuracy={:.4f} fp={:.4f} fn={:.4f}".format(
num_images, avg_acc / num_images, avg_fp / num_images, avg_fn / num_images))
return avg_acc / num_images, avg_fp / num_images, avg_fn / num_images
def main():
args = parse_args()
if args.cfg_file is not None:
cfg.update_from_file(args.cfg_file)
if args.opts:
cfg.update_from_list(args.opts)
cfg.check_and_infer()
print(pprint.pformat(cfg))
evaluate(cfg, **args.__dict__)
if __name__ == '__main__':
main()
# coding: utf8
# copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import paddle.fluid as fluid
import numpy as np
from utils.config import cfg
def unsorted_segment_sum(data, segment_ids, unique_labels, feature_dims):
zeros = fluid.layers.fill_constant_batch_size_like(unique_labels, shape=[1, feature_dims],
dtype='float32', value=0)
segment_ids = fluid.layers.unsqueeze(segment_ids, axes=[1])
segment_ids.stop_gradient = True
segment_sum = fluid.layers.scatter_nd_add(zeros, segment_ids, data)
zeros.stop_gradient = True
return segment_sum
def norm(x, axis=-1):
distance = fluid.layers.reduce_sum(fluid.layers.abs(x), dim=axis, keep_dim=True)
return distance
def discriminative_loss_single(
prediction,
correct_label,
feature_dim,
label_shape,
delta_v,
delta_d,
param_var,
param_dist,
param_reg):
correct_label = fluid.layers.reshape(
correct_label, [
label_shape[1] * label_shape[0]])
prediction = fluid.layers.transpose(prediction, [1, 2, 0])
reshaped_pred = fluid.layers.reshape(
prediction, [
label_shape[1] * label_shape[0], feature_dim])
unique_labels, unique_id, counts = fluid.layers.unique_with_counts(correct_label)
correct_label.stop_gradient = True
counts = fluid.layers.cast(counts, 'float32')
num_instances = fluid.layers.shape(unique_labels)
segmented_sum = unsorted_segment_sum(
reshaped_pred, unique_id, unique_labels, feature_dims=feature_dim)
counts_rsp = fluid.layers.reshape(counts, (-1, 1))
mu = fluid.layers.elementwise_div(segmented_sum, counts_rsp)
counts_rsp.stop_gradient = True
mu_expand = fluid.layers.gather(mu, unique_id)
tmp = fluid.layers.elementwise_sub(mu_expand, reshaped_pred)
distance = norm(tmp)
distance = distance - delta_v
distance_pos = fluid.layers.greater_equal(distance, fluid.layers.zeros_like(distance))
distance_pos = fluid.layers.cast(distance_pos, 'float32')
distance = distance * distance_pos
distance = fluid.layers.square(distance)
l_var = unsorted_segment_sum(distance, unique_id, unique_labels, feature_dims=1)
l_var = fluid.layers.elementwise_div(l_var, counts_rsp)
l_var = fluid.layers.reduce_sum(l_var)
l_var = l_var / fluid.layers.cast(num_instances * (num_instances - 1), 'float32')
mu_interleaved_rep = fluid.layers.expand(mu, [num_instances, 1])
mu_band_rep = fluid.layers.expand(mu, [1, num_instances])
mu_band_rep = fluid.layers.reshape(mu_band_rep, (num_instances * num_instances, feature_dim))
mu_diff = fluid.layers.elementwise_sub(mu_band_rep, mu_interleaved_rep)
intermediate_tensor = fluid.layers.reduce_sum(fluid.layers.abs(mu_diff), dim=1)
intermediate_tensor.stop_gradient = True
zero_vector = fluid.layers.zeros([1], 'float32')
bool_mask = fluid.layers.not_equal(intermediate_tensor, zero_vector)
temp = fluid.layers.where(bool_mask)
mu_diff_bool = fluid.layers.gather(mu_diff, temp)
mu_norm = norm(mu_diff_bool)
mu_norm = 2. * delta_d - mu_norm
mu_norm_pos = fluid.layers.greater_equal(mu_norm, fluid.layers.zeros_like(mu_norm))
mu_norm_pos = fluid.layers.cast(mu_norm_pos, 'float32')
mu_norm = mu_norm * mu_norm_pos
mu_norm_pos.stop_gradient = True
mu_norm = fluid.layers.square(mu_norm)
l_dist = fluid.layers.reduce_mean(mu_norm)
l_reg = fluid.layers.reduce_mean(norm(mu, axis=1))
l_var = param_var * l_var
l_dist = param_dist * l_dist
l_reg = param_reg * l_reg
loss = l_var + l_dist + l_reg
return loss, l_var, l_dist, l_reg
def discriminative_loss(prediction, correct_label, feature_dim, image_shape,
delta_v, delta_d, param_var, param_dist, param_reg):
batch_size = int(cfg.BATCH_SIZE_PER_DEV)
output_ta_loss = 0.
output_ta_var = 0.
output_ta_dist = 0.
output_ta_reg = 0.
for i in range(batch_size):
disc_loss_single, l_var_single, l_dist_single, l_reg_single = discriminative_loss_single(
prediction[i], correct_label[i], feature_dim, image_shape, delta_v, delta_d, param_var, param_dist,
param_reg)
output_ta_loss += disc_loss_single
output_ta_var += l_var_single
output_ta_dist += l_dist_single
output_ta_reg += l_reg_single
disc_loss = output_ta_loss / batch_size
l_var = output_ta_var / batch_size
l_dist = output_ta_dist / batch_size
l_reg = output_ta_reg / batch_size
return disc_loss, l_var, l_dist, l_reg
# coding: utf8
# copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import models.modeling
#import models.backbone
# coding: utf8
# copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
sys.path.append("..")
import struct
import paddle.fluid as fluid
from paddle.fluid.proto.framework_pb2 import VarType
from pdseg import solver
from utils.config import cfg
from pdseg.loss import multi_softmax_with_loss
from loss import discriminative_loss
from models.modeling import lanenet
class ModelPhase(object):
"""
Standard name for model phase in PaddleSeg
The following standard keys are defined:
* `TRAIN`: training mode.
* `EVAL`: testing/evaluation mode.
* `PREDICT`: prediction/inference mode.
* `VISUAL` : visualization mode
"""
TRAIN = 'train'
EVAL = 'eval'
PREDICT = 'predict'
VISUAL = 'visual'
@staticmethod
def is_train(phase):
return phase == ModelPhase.TRAIN
@staticmethod
def is_predict(phase):
return phase == ModelPhase.PREDICT
@staticmethod
def is_eval(phase):
return phase == ModelPhase.EVAL
@staticmethod
def is_visual(phase):
return phase == ModelPhase.VISUAL
@staticmethod
def is_valid_phase(phase):
""" Check valid phase """
if ModelPhase.is_train(phase) or ModelPhase.is_predict(phase) \
or ModelPhase.is_eval(phase) or ModelPhase.is_visual(phase):
return True
return False
def seg_model(image, class_num):
model_name = cfg.MODEL.MODEL_NAME
if model_name == 'lanenet':
logits = lanenet.lanenet(image, class_num)
else:
raise Exception(
"unknow model name, only support unet, deeplabv3p, icnet, pspnet, hrnet"
)
return logits
def softmax(logit):
logit = fluid.layers.transpose(logit, [0, 2, 3, 1])
logit = fluid.layers.softmax(logit)
logit = fluid.layers.transpose(logit, [0, 3, 1, 2])
return logit
def sigmoid_to_softmax(logit):
"""
one channel to two channel
"""
logit = fluid.layers.transpose(logit, [0, 2, 3, 1])
logit = fluid.layers.sigmoid(logit)
logit_back = 1 - logit
logit = fluid.layers.concat([logit_back, logit], axis=-1)
logit = fluid.layers.transpose(logit, [0, 3, 1, 2])
return logit
def build_model(main_prog, start_prog, phase=ModelPhase.TRAIN):
if not ModelPhase.is_valid_phase(phase):
raise ValueError("ModelPhase {} is not valid!".format(phase))
if ModelPhase.is_train(phase):
width = cfg.TRAIN_CROP_SIZE[0]
height = cfg.TRAIN_CROP_SIZE[1]
else:
width = cfg.EVAL_CROP_SIZE[0]
height = cfg.EVAL_CROP_SIZE[1]
image_shape = [cfg.DATASET.DATA_DIM, height, width]
grt_shape = [1, height, width]
class_num = cfg.DATASET.NUM_CLASSES
with fluid.program_guard(main_prog, start_prog):
with fluid.unique_name.guard():
image = fluid.layers.data(
name='image', shape=image_shape, dtype='float32')
label = fluid.layers.data(
name='label', shape=grt_shape, dtype='int32')
if cfg.MODEL.MODEL_NAME == 'lanenet':
label_instance = fluid.layers.data(
name='label_instance', shape=grt_shape, dtype='int32')
mask = fluid.layers.data(
name='mask', shape=grt_shape, dtype='int32')
# use PyReader when doing traning and evaluation
if ModelPhase.is_train(phase) or ModelPhase.is_eval(phase):
py_reader = fluid.io.PyReader(
feed_list=[image, label, label_instance, mask],
capacity=cfg.DATALOADER.BUF_SIZE,
iterable=False,
use_double_buffer=True)
loss_type = cfg.SOLVER.LOSS
if not isinstance(loss_type, list):
loss_type = list(loss_type)
logits = seg_model(image, class_num)
if ModelPhase.is_train(phase):
loss_valid = False
valid_loss = []
if cfg.MODEL.MODEL_NAME == 'lanenet':
embeding_logit = logits[1]
logits = logits[0]
disc_loss, _, _, l_reg = discriminative_loss(embeding_logit, label_instance, 4,
image_shape[1:], 0.5, 3.0, 1.0, 1.0, 0.001)
if "softmax_loss" in loss_type:
weight = None
if cfg.MODEL.MODEL_NAME == 'lanenet':
weight = get_dynamic_weight(label)
seg_loss = multi_softmax_with_loss(logits, label, mask, class_num, weight)
loss_valid = True
valid_loss.append("softmax_loss")
if not loss_valid:
raise Exception("SOLVER.LOSS: {} is set wrong. it should "
"include one of (softmax_loss, bce_loss, dice_loss) at least"
" example: ['softmax_loss']".format(cfg.SOLVER.LOSS))
invalid_loss = [x for x in loss_type if x not in valid_loss]
if len(invalid_loss) > 0:
print("Warning: the loss {} you set is invalid. it will not be included in loss computed.".format(invalid_loss))
avg_loss = disc_loss + 0.00001 * l_reg + seg_loss
#get pred result in original size
if isinstance(logits, tuple):
logit = logits[0]
else:
logit = logits
if logit.shape[2:] != label.shape[2:]:
logit = fluid.layers.resize_bilinear(logit, label.shape[2:])
# return image input and logit output for inference graph prune
if ModelPhase.is_predict(phase):
if class_num == 1:
logit = sigmoid_to_softmax(logit)
else:
logit = softmax(logit)
return image, logit
if class_num == 1:
out = sigmoid_to_softmax(logit)
out = fluid.layers.transpose(out, [0, 2, 3, 1])
else:
out = fluid.layers.transpose(logit, [0, 2, 3, 1])
pred = fluid.layers.argmax(out, axis=3)
pred = fluid.layers.unsqueeze(pred, axes=[3])
if ModelPhase.is_visual(phase):
if cfg.MODEL.MODEL_NAME == 'lanenet':
return pred, logits[1]
if class_num == 1:
logit = sigmoid_to_softmax(logit)
else:
logit = softmax(logit)
return pred, logit
accuracy, fp, fn = compute_metric(pred, label)
if ModelPhase.is_eval(phase):
return py_reader, pred, label, mask, accuracy, fp, fn
if ModelPhase.is_train(phase):
optimizer = solver.Solver(main_prog, start_prog)
decayed_lr = optimizer.optimise(avg_loss)
return py_reader, avg_loss, decayed_lr, pred, label, mask, disc_loss, seg_loss, accuracy, fp, fn
def compute_metric(pred, label):
label = fluid.layers.transpose(label, [0, 2, 3, 1])
idx = fluid.layers.where(pred == 1)
pix_cls_ret = fluid.layers.gather_nd(label, idx)
correct_num = fluid.layers.reduce_sum(fluid.layers.cast(pix_cls_ret, 'float32'))
gt_num = fluid.layers.cast(fluid.layers.shape(fluid.layers.gather_nd(label,
fluid.layers.where(label == 1)))[0], 'int64')
pred_num = fluid.layers.cast(fluid.layers.shape(fluid.layers.gather_nd(pred, idx))[0], 'int64')
accuracy = correct_num / gt_num
false_pred = pred_num - correct_num
fp = fluid.layers.cast(false_pred, 'float32') / fluid.layers.cast(fluid.layers.shape(pix_cls_ret)[0], 'int64')
label_cls_ret = fluid.layers.gather_nd(label, fluid.layers.where(label == 1))
mis_pred = fluid.layers.cast(fluid.layers.shape(label_cls_ret)[0], 'int64') - correct_num
fn = fluid.layers.cast(mis_pred, 'float32') / fluid.layers.cast(fluid.layers.shape(label_cls_ret)[0], 'int64')
accuracy.stop_gradient = True
fp.stop_gradient = True
fn.stop_gradient = True
return accuracy, fp, fn
def get_dynamic_weight(label):
label = fluid.layers.reshape(label, [-1])
unique_labels, unique_id, counts = fluid.layers.unique_with_counts(label)
counts = fluid.layers.cast(counts, 'float32')
weight = 1.0 / fluid.layers.log((counts / fluid.layers.reduce_sum(counts) + 1.02))
return weight
def to_int(string, dest="I"):
return struct.unpack(dest, string)[0]
def parse_shape_from_file(filename):
with open(filename, "rb") as file:
version = file.read(4)
lod_level = to_int(file.read(8), dest="Q")
for i in range(lod_level):
_size = to_int(file.read(8), dest="Q")
_ = file.read(_size)
version = file.read(4)
tensor_desc_size = to_int(file.read(4))
tensor_desc = VarType.TensorDesc()
tensor_desc.ParseFromString(file.read(tensor_desc_size))
return tuple(tensor_desc.dims)
# coding: utf8
# copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import division
from __future__ import print_function
import paddle.fluid as fluid
from utils.config import cfg
from pdseg.models.libs.model_libs import scope, name_scope
from pdseg.models.libs.model_libs import bn, bn_relu, relu
from pdseg.models.libs.model_libs import conv, max_pool, deconv
from pdseg.models.backbone.vgg import VGGNet as vgg_backbone
#from models.backbone.vgg import VGGNet as vgg_backbone
# Bottleneck type
REGULAR = 1
DOWNSAMPLING = 2
UPSAMPLING = 3
DILATED = 4
ASYMMETRIC = 5
def prelu(x, decoder=False):
# If decoder, then perform relu else perform prelu
if decoder:
return fluid.layers.relu(x)
return fluid.layers.prelu(x, 'channel')
def iniatial_block(inputs, name_scope='iniatial_block'):
'''
The initial block for Enet has 2 branches: The convolution branch and Maxpool branch.
The conv branch has 13 filters, while the maxpool branch gives 3 channels corresponding to the RGB channels.
Both output layers are then concatenated to give an output of 16 channels.
:param inputs(Tensor): A 4D tensor of shape [batch_size, height, width, channels]
:return net_concatenated(Tensor): a 4D Tensor of new shape [batch_size, height, width, channels]
'''
# Convolutional branch
with scope(name_scope):
net_conv = conv(inputs, 13, 3, stride=2, padding=1)
net_conv = bn(net_conv)
net_conv = fluid.layers.prelu(net_conv, 'channel')
# Max pool branch
net_pool = max_pool(inputs, [2, 2], stride=2, padding='SAME')
# Concatenated output - does it matter max pool comes first or conv comes first? probably not.
net_concatenated = fluid.layers.concat([net_conv, net_pool], axis=1)
return net_concatenated
def bottleneck(inputs,
output_depth,
filter_size,
regularizer_prob,
projection_ratio=4,
type=REGULAR,
seed=0,
output_shape=None,
dilation_rate=None,
decoder=False,
name_scope='bottleneck'):
# Calculate the depth reduction based on the projection ratio used in 1x1 convolution.
reduced_depth = int(inputs.shape[1] / projection_ratio)
# DOWNSAMPLING BOTTLENECK
if type == DOWNSAMPLING:
#=============MAIN BRANCH=============
#Just perform a max pooling
with scope('down_sample'):
inputs_shape = inputs.shape
with scope('main_max_pool'):
net_main = fluid.layers.conv2d(inputs, inputs_shape[1], filter_size=3, stride=2, padding='SAME')
#First get the difference in depth to pad, then pad with zeros only on the last dimension.
depth_to_pad = abs(inputs_shape[1] - output_depth)
paddings = [0, 0, 0, depth_to_pad, 0, 0, 0, 0]
with scope('main_padding'):
net_main = fluid.layers.pad(net_main, paddings=paddings)
with scope('block1'):
net = conv(inputs, reduced_depth, [2, 2], stride=2, padding='same')
net = bn(net)
net = prelu(net, decoder=decoder)
with scope('block2'):
net = conv(net, reduced_depth, [filter_size, filter_size], padding='same')
net = bn(net)
net = prelu(net, decoder=decoder)
with scope('block3'):
net = conv(net, output_depth, [1, 1], padding='same')
net = bn(net)
net = prelu(net, decoder=decoder)
# Regularizer
net = fluid.layers.dropout(net, regularizer_prob, seed=seed)
# Finally, combine the two branches together via an element-wise addition
net = fluid.layers.elementwise_add(net, net_main)
net = prelu(net, decoder=decoder)
return net, inputs_shape
# DILATION CONVOLUTION BOTTLENECK
# Everything is the same as a regular bottleneck except for the dilation rate argument
elif type == DILATED:
#Check if dilation rate is given
if not dilation_rate:
raise ValueError('Dilation rate is not given.')
with scope('dilated'):
# Save the main branch for addition later
net_main = inputs
# First projection with 1x1 kernel (dimensionality reduction)
with scope('block1'):
net = conv(inputs, reduced_depth, [1, 1])
net = bn(net)
net = prelu(net, decoder=decoder)
# Second conv block --- apply dilated convolution here
with scope('block2'):
net = conv(net, reduced_depth, filter_size, padding='SAME', dilation=dilation_rate)
net = bn(net)
net = prelu(net, decoder=decoder)
# Final projection with 1x1 kernel (Expansion)
with scope('block3'):
net = conv(net, output_depth, [1,1])
net = bn(net)
net = prelu(net, decoder=decoder)
# Regularizer
net = fluid.layers.dropout(net, regularizer_prob, seed=seed)
net = prelu(net, decoder=decoder)
# Add the main branch
net = fluid.layers.elementwise_add(net_main, net)
net = prelu(net, decoder=decoder)
return net
# ASYMMETRIC CONVOLUTION BOTTLENECK
# Everything is the same as a regular bottleneck except for a [5,5] kernel decomposed into two [5,1] then [1,5]
elif type == ASYMMETRIC:
# Save the main branch for addition later
with scope('asymmetric'):
net_main = inputs
# First projection with 1x1 kernel (dimensionality reduction)
with scope('block1'):
net = conv(inputs, reduced_depth, [1, 1])
net = bn(net)
net = prelu(net, decoder=decoder)
# Second conv block --- apply asymmetric conv here
with scope('block2'):
with scope('asymmetric_conv2a'):
net = conv(net, reduced_depth, [filter_size, 1], padding='same')
with scope('asymmetric_conv2b'):
net = conv(net, reduced_depth, [1, filter_size], padding='same')
net = bn(net)
net = prelu(net, decoder=decoder)
# Final projection with 1x1 kernel
with scope('block3'):
net = conv(net, output_depth, [1, 1])
net = bn(net)
net = prelu(net, decoder=decoder)
# Regularizer
net = fluid.layers.dropout(net, regularizer_prob, seed=seed)
net = prelu(net, decoder=decoder)
# Add the main branch
net = fluid.layers.elementwise_add(net_main, net)
net = prelu(net, decoder=decoder)
return net
# UPSAMPLING BOTTLENECK
# Everything is the same as a regular one, except convolution becomes transposed.
elif type == UPSAMPLING:
#Check if pooling indices is given
#Check output_shape given or not
if output_shape is None:
raise ValueError('Output depth is not given')
#=======MAIN BRANCH=======
#Main branch to upsample. output shape must match with the shape of the layer that was pooled initially, in order
#for the pooling indices to work correctly. However, the initial pooled layer was padded, so need to reduce dimension
#before unpooling. In the paper, padding is replaced with convolution for this purpose of reducing the depth!
with scope('upsampling'):
with scope('unpool'):
net_unpool = conv(inputs, output_depth, [1, 1])
net_unpool = bn(net_unpool)
net_unpool = fluid.layers.resize_bilinear(net_unpool, out_shape=output_shape[2:])
# First 1x1 projection to reduce depth
with scope('block1'):
net = conv(inputs, reduced_depth, [1, 1])
net = bn(net)
net = prelu(net, decoder=decoder)
with scope('block2'):
net = deconv(net, reduced_depth, filter_size=filter_size, stride=2, padding='same')
net = bn(net)
net = prelu(net, decoder=decoder)
# Final projection with 1x1 kernel
with scope('block3'):
net = conv(net, output_depth, [1, 1])
net = bn(net)
net = prelu(net, decoder=decoder)
# Regularizer
net = fluid.layers.dropout(net, regularizer_prob, seed=seed)
net = prelu(net, decoder=decoder)
# Finally, add the unpooling layer and the sub branch together
net = fluid.layers.elementwise_add(net, net_unpool)
net = prelu(net, decoder=decoder)
return net
# REGULAR BOTTLENECK
else:
with scope('regular'):
net_main = inputs
# First projection with 1x1 kernel
with scope('block1'):
net = conv(inputs, reduced_depth, [1, 1])
net = bn(net)
net = prelu(net, decoder=decoder)
# Second conv block
with scope('block2'):
net = conv(net, reduced_depth, [filter_size, filter_size], padding='same')
net = bn(net)
net = prelu(net, decoder=decoder)
# Final projection with 1x1 kernel
with scope('block3'):
net = conv(net, output_depth, [1, 1])
net = bn(net)
net = prelu(net, decoder=decoder)
# Regularizer
net = fluid.layers.dropout(net, regularizer_prob, seed=seed)
net = prelu(net, decoder=decoder)
# Add the main branch
net = fluid.layers.elementwise_add(net_main, net)
net = prelu(net, decoder=decoder)
return net
def ENet_stage1(inputs, name_scope='stage1_block'):
with scope(name_scope):
with scope('bottleneck1_0'):
net, inputs_shape_1 \
= bottleneck(inputs, output_depth=64, filter_size=3, regularizer_prob=0.01, type=DOWNSAMPLING,
name_scope='bottleneck1_0')
with scope('bottleneck1_1'):
net = bottleneck(net, output_depth=64, filter_size=3, regularizer_prob=0.01,
name_scope='bottleneck1_1')
with scope('bottleneck1_2'):
net = bottleneck(net, output_depth=64, filter_size=3, regularizer_prob=0.01,
name_scope='bottleneck1_2')
with scope('bottleneck1_3'):
net = bottleneck(net, output_depth=64, filter_size=3, regularizer_prob=0.01,
name_scope='bottleneck1_3')
with scope('bottleneck1_4'):
net = bottleneck(net, output_depth=64, filter_size=3, regularizer_prob=0.01,
name_scope='bottleneck1_4')
return net, inputs_shape_1
def ENet_stage2(inputs, name_scope='stage2_block'):
with scope(name_scope):
net, inputs_shape_2 \
= bottleneck(inputs, output_depth=128, filter_size=3, regularizer_prob=0.1, type=DOWNSAMPLING,
name_scope='bottleneck2_0')
for i in range(2):
with scope('bottleneck2_{}'.format(str(4 * i + 1))):
net = bottleneck(net, output_depth=128, filter_size=3, regularizer_prob=0.1,
name_scope='bottleneck2_{}'.format(str(4 * i + 1)))
with scope('bottleneck2_{}'.format(str(4 * i + 2))):
net = bottleneck(net, output_depth=128, filter_size=3, regularizer_prob=0.1, type=DILATED, dilation_rate=(2 ** (2*i+1)),
name_scope='bottleneck2_{}'.format(str(4 * i + 2)))
with scope('bottleneck2_{}'.format(str(4 * i + 3))):
net = bottleneck(net, output_depth=128, filter_size=5, regularizer_prob=0.1, type=ASYMMETRIC,
name_scope='bottleneck2_{}'.format(str(4 * i + 3)))
with scope('bottleneck2_{}'.format(str(4 * i + 4))):
net = bottleneck(net, output_depth=128, filter_size=3, regularizer_prob=0.1, type=DILATED, dilation_rate=(2 ** (2*i+2)),
name_scope='bottleneck2_{}'.format(str(4 * i + 4)))
return net, inputs_shape_2
def ENet_stage3(inputs, name_scope='stage3_block'):
with scope(name_scope):
for i in range(2):
with scope('bottleneck3_{}'.format(str(4 * i + 0))):
net = bottleneck(inputs, output_depth=128, filter_size=3, regularizer_prob=0.1,
name_scope='bottleneck3_{}'.format(str(4 * i + 0)))
with scope('bottleneck3_{}'.format(str(4 * i + 1))):
net = bottleneck(net, output_depth=128, filter_size=3, regularizer_prob=0.1, type=DILATED, dilation_rate=(2 ** (2*i+1)),
name_scope='bottleneck3_{}'.format(str(4 * i + 1)))
with scope('bottleneck3_{}'.format(str(4 * i + 2))):
net = bottleneck(net, output_depth=128, filter_size=5, regularizer_prob=0.1, type=ASYMMETRIC,
name_scope='bottleneck3_{}'.format(str(4 * i + 2)))
with scope('bottleneck3_{}'.format(str(4 * i + 3))):
net = bottleneck(net, output_depth=128, filter_size=3, regularizer_prob=0.1, type=DILATED, dilation_rate=(2 ** (2*i+2)),
name_scope='bottleneck3_{}'.format(str(4 * i + 3)))
return net
def ENet_stage4(inputs, inputs_shape, connect_tensor,
skip_connections=True, name_scope='stage4_block'):
with scope(name_scope):
with scope('bottleneck4_0'):
net = bottleneck(inputs, output_depth=64, filter_size=3, regularizer_prob=0.1,
type=UPSAMPLING, decoder=True, output_shape=inputs_shape,
name_scope='bottleneck4_0')
if skip_connections:
net = fluid.layers.elementwise_add(net, connect_tensor)
with scope('bottleneck4_1'):
net = bottleneck(net, output_depth=64, filter_size=3, regularizer_prob=0.1, decoder=True,
name_scope='bottleneck4_1')
with scope('bottleneck4_2'):
net = bottleneck(net, output_depth=64, filter_size=3, regularizer_prob=0.1, decoder=True,
name_scope='bottleneck4_2')
return net
def ENet_stage5(inputs, inputs_shape, connect_tensor, skip_connections=True,
name_scope='stage5_block'):
with scope(name_scope):
net = bottleneck(inputs, output_depth=16, filter_size=3, regularizer_prob=0.1, type=UPSAMPLING,
decoder=True, output_shape=inputs_shape,
name_scope='bottleneck5_0')
if skip_connections:
net = fluid.layers.elementwise_add(net, connect_tensor)
with scope('bottleneck5_1'):
net = bottleneck(net, output_depth=16, filter_size=3, regularizer_prob=0.1, decoder=True,
name_scope='bottleneck5_1')
return net
def decoder(input, num_classes):
if 'enet' in cfg.MODEL.LANENET.BACKBONE:
# Segmentation branch
with scope('LaneNetSeg'):
initial, stage1, stage2, inputs_shape_1, inputs_shape_2 = input
segStage3 = ENet_stage3(stage2)
segStage4 = ENet_stage4(segStage3, inputs_shape_2, stage1)
segStage5 = ENet_stage5(segStage4, inputs_shape_1, initial)
segLogits = deconv(segStage5, num_classes, filter_size=2, stride=2, padding='SAME')
# Embedding branch
with scope('LaneNetEm'):
emStage3 = ENet_stage3(stage2)
emStage4 = ENet_stage4(emStage3, inputs_shape_2, stage1)
emStage5 = ENet_stage5(emStage4, inputs_shape_1, initial)
emLogits = deconv(emStage5, 4, filter_size=2, stride=2, padding='SAME')
elif 'vgg' in cfg.MODEL.LANENET.BACKBONE:
encoder_list = ['pool5', 'pool4', 'pool3']
# score stage
input_tensor = input[encoder_list[0]]
with scope('score_origin'):
score = conv(input_tensor, 64, 1)
encoder_list = encoder_list[1:]
for i in range(len(encoder_list)):
with scope('deconv_{:d}'.format(i + 1)):
deconv_out = deconv(score, 64, filter_size=4, stride=2, padding='SAME')
input_tensor = input[encoder_list[i]]
with scope('score_{:d}'.format(i + 1)):
score = conv(input_tensor, 64, 1)
score = fluid.layers.elementwise_add(deconv_out, score)
with scope('deconv_final'):
emLogits = deconv(score, 64, filter_size=16, stride=8, padding='SAME')
with scope('score_final'):
segLogits = conv(emLogits, num_classes, 1)
emLogits = relu(conv(emLogits, 4, 1))
return segLogits, emLogits
def encoder(input):
if 'vgg' in cfg.MODEL.LANENET.BACKBONE:
model = vgg_backbone(layers=16)
#output = model.net(input)
_, encode_feature_dict = model.net(input, end_points=13, decode_points=[7, 10, 13])
output = {}
output['pool3'] = encode_feature_dict[7]
output['pool4'] = encode_feature_dict[10]
output['pool5'] = encode_feature_dict[13]
elif 'enet' in cfg.MODEL.LANET.BACKBONE:
with scope('LaneNetBase'):
initial = iniatial_block(input)
stage1, inputs_shape_1 = ENet_stage1(initial)
stage2, inputs_shape_2 = ENet_stage2(stage1)
output = (initial, stage1, stage2, inputs_shape_1, inputs_shape_2)
else:
raise Exception("LaneNet expect enet and vgg backbone, but received {}".
format(cfg.MODEL.LANENET.BACKBONE))
return output
def lanenet(img, num_classes):
output = encoder(img)
segLogits, emLogits = decoder(output, num_classes)
return segLogits, emLogits
# coding: utf8
# copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import print_function
import sys
import os
import time
import codecs
import numpy as np
import cv2
from utils.config import cfg
import data_aug as aug
from pdseg.data_utils import GeneratorEnqueuer
from models.model_builder import ModelPhase
import copy
def cv2_imread(file_path, flag=cv2.IMREAD_COLOR):
# resolve cv2.imread open Chinese file path issues on Windows Platform.
return cv2.imdecode(np.fromfile(file_path, dtype=np.uint8), flag)
class LaneNetDataset():
def __init__(self,
file_list,
data_dir,
shuffle=False,
mode=ModelPhase.TRAIN):
self.mode = mode
self.shuffle = shuffle
self.data_dir = data_dir
self.shuffle_seed = 0
# NOTE: Please ensure file list was save in UTF-8 coding format
with codecs.open(file_list, 'r', 'utf-8') as flist:
self.lines = [line.strip() for line in flist]
self.all_lines = copy.deepcopy(self.lines)
if shuffle and cfg.NUM_TRAINERS > 1:
np.random.RandomState(self.shuffle_seed).shuffle(self.all_lines)
elif shuffle:
np.random.shuffle(self.lines)
def generator(self):
if self.shuffle and cfg.NUM_TRAINERS > 1:
np.random.RandomState(self.shuffle_seed).shuffle(self.all_lines)
num_lines = len(self.all_lines) // cfg.NUM_TRAINERS
self.lines = self.all_lines[num_lines * cfg.TRAINER_ID: num_lines * (cfg.TRAINER_ID + 1)]
self.shuffle_seed += 1
elif self.shuffle:
np.random.shuffle(self.lines)
for line in self.lines:
yield self.process_image(line, self.data_dir, self.mode)
def sharding_generator(self, pid=0, num_processes=1):
"""
Use line id as shard key for multiprocess io
It's a normal generator if pid=0, num_processes=1
"""
for index, line in enumerate(self.lines):
# Use index and pid to shard file list
if index % num_processes == pid:
yield self.process_image(line, self.data_dir, self.mode)
def batch_reader(self, batch_size):
br = self.batch(self.reader, batch_size)
for batch in br:
yield batch[0], batch[1], batch[2]
def multiprocess_generator(self, max_queue_size=32, num_processes=8):
# Re-shuffle file list
if self.shuffle and cfg.NUM_TRAINERS > 1:
np.random.RandomState(self.shuffle_seed).shuffle(self.all_lines)
num_lines = len(self.all_lines) // self.num_trainers
self.lines = self.all_lines[num_lines * self.trainer_id: num_lines * (self.trainer_id + 1)]
self.shuffle_seed += 1
elif self.shuffle:
np.random.shuffle(self.lines)
# Create multiple sharding generators according to num_processes for multiple processes
generators = []
for pid in range(num_processes):
generators.append(self.sharding_generator(pid, num_processes))
try:
enqueuer = GeneratorEnqueuer(generators)
enqueuer.start(max_queue_size=max_queue_size, workers=num_processes)
while True:
generator_out = None
while enqueuer.is_running():
if not enqueuer.queue.empty():
generator_out = enqueuer.queue.get(timeout=5)
break
else:
time.sleep(0.01)
if generator_out is None:
break
yield generator_out
finally:
if enqueuer is not None:
enqueuer.stop()
def batch(self, reader, batch_size, is_test=False, drop_last=False):
def batch_reader(is_test=False, drop_last=drop_last):
if is_test:
imgs, grts, grts_instance, img_names, valid_shapes, org_shapes = [], [], [], [], [], []
for img, grt, grt_instance, img_name, valid_shape, org_shape in reader():
imgs.append(img)
grts.append(grt)
grts_instance.append(grt_instance)
img_names.append(img_name)
valid_shapes.append(valid_shape)
org_shapes.append(org_shape)
if len(imgs) == batch_size:
yield np.array(imgs), np.array(
grts), np.array(grts_instance), img_names, np.array(valid_shapes), np.array(
org_shapes)
imgs, grts, grts_instance, img_names, valid_shapes, org_shapes = [], [], [], [], [], []
if not drop_last and len(imgs) > 0:
yield np.array(imgs), np.array(grts), np.array(grts_instance), img_names, np.array(
valid_shapes), np.array(org_shapes)
else:
imgs, labs, labs_instance, ignore = [], [], [], []
bs = 0
for img, lab, lab_instance, ig in reader():
imgs.append(img)
labs.append(lab)
labs_instance.append(lab_instance)
ignore.append(ig)
bs += 1
if bs == batch_size:
yield np.array(imgs), np.array(labs), np.array(labs_instance), np.array(ignore)
bs = 0
imgs, labs, labs_instance, ignore = [], [], [], []
if not drop_last and bs > 0:
yield np.array(imgs), np.array(labs), np.array(labs_instance), np.array(ignore)
return batch_reader(is_test, drop_last)
def load_image(self, line, src_dir, mode=ModelPhase.TRAIN):
# original image cv2.imread flag setting
cv2_imread_flag = cv2.IMREAD_COLOR
if cfg.DATASET.IMAGE_TYPE == "rgba":
# If use RBGA 4 channel ImageType, use IMREAD_UNCHANGED flags to
# reserver alpha channel
cv2_imread_flag = cv2.IMREAD_UNCHANGED
parts = line.strip().split(cfg.DATASET.SEPARATOR)
if len(parts) != 3:
if mode == ModelPhase.TRAIN or mode == ModelPhase.EVAL:
raise Exception("File list format incorrect! It should be"
" image_name{}label_name\\n".format(
cfg.DATASET.SEPARATOR))
img_name, grt_name, grt_instance_name = parts[0], None, None
else:
img_name, grt_name, grt_instance_name = parts[0], parts[1], parts[2]
img_path = os.path.join(src_dir, img_name)
img = cv2_imread(img_path, cv2_imread_flag)
if grt_name is not None:
grt_path = os.path.join(src_dir, grt_name)
grt_instance_path = os.path.join(src_dir, grt_instance_name)
grt = cv2_imread(grt_path, cv2.IMREAD_GRAYSCALE)
grt[grt == 255] = 1
grt[grt != 1] = 0
grt_instance = cv2_imread(grt_instance_path, cv2.IMREAD_GRAYSCALE)
else:
grt = None
grt_instance = None
if img is None:
raise Exception(
"Empty image, src_dir: {}, img: {} & lab: {}".format(
src_dir, img_path, grt_path))
img_height = img.shape[0]
img_width = img.shape[1]
if grt is not None:
grt_height = grt.shape[0]
grt_width = grt.shape[1]
if img_height != grt_height or img_width != grt_width:
raise Exception(
"source img and label img must has the same size")
else:
if mode == ModelPhase.TRAIN or mode == ModelPhase.EVAL:
raise Exception(
"Empty image, src_dir: {}, img: {} & lab: {}".format(
src_dir, img_path, grt_path))
if len(img.shape) < 3:
img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
img_channels = img.shape[2]
if img_channels < 3:
raise Exception("PaddleSeg only supports gray, rgb or rgba image")
if img_channels != cfg.DATASET.DATA_DIM:
raise Exception(
"Input image channel({}) is not match cfg.DATASET.DATA_DIM({}), img_name={}"
.format(img_channels, cfg.DATASET.DATADIM, img_name))
if img_channels != len(cfg.MEAN):
raise Exception(
"img name {}, img chns {} mean size {}, size unequal".format(
img_name, img_channels, len(cfg.MEAN)))
if img_channels != len(cfg.STD):
raise Exception(
"img name {}, img chns {} std size {}, size unequal".format(
img_name, img_channels, len(cfg.STD)))
return img, grt, grt_instance, img_name, grt_name
def normalize_image(self, img):
""" 像素归一化后减均值除方差 """
img = img.transpose((2, 0, 1)).astype('float32') / 255.0
img_mean = np.array(cfg.MEAN).reshape((len(cfg.MEAN), 1, 1))
img_std = np.array(cfg.STD).reshape((len(cfg.STD), 1, 1))
img -= img_mean
img /= img_std
return img
def process_image(self, line, data_dir, mode):
""" process_image """
img, grt, grt_instance, img_name, grt_name = self.load_image(
line, data_dir, mode=mode)
if mode == ModelPhase.TRAIN:
img, grt, grt_instance = aug.resize(img, grt, grt_instance, mode)
if cfg.AUG.RICH_CROP.ENABLE:
if cfg.AUG.RICH_CROP.BLUR:
if cfg.AUG.RICH_CROP.BLUR_RATIO <= 0:
n = 0
elif cfg.AUG.RICH_CROP.BLUR_RATIO >= 1:
n = 1
else:
n = int(1.0 / cfg.AUG.RICH_CROP.BLUR_RATIO)
if n > 0:
if np.random.randint(0, n) == 0:
radius = np.random.randint(3, 10)
if radius % 2 != 1:
radius = radius + 1
if radius > 9:
radius = 9
img = cv2.GaussianBlur(img, (radius, radius), 0, 0)
img, grt = aug.random_rotation(
img,
grt,
rich_crop_max_rotation=cfg.AUG.RICH_CROP.MAX_ROTATION,
mean_value=cfg.DATASET.PADDING_VALUE)
img, grt = aug.rand_scale_aspect(
img,
grt,
rich_crop_min_scale=cfg.AUG.RICH_CROP.MIN_AREA_RATIO,
rich_crop_aspect_ratio=cfg.AUG.RICH_CROP.ASPECT_RATIO)
img = aug.hsv_color_jitter(
img,
brightness_jitter_ratio=cfg.AUG.RICH_CROP.
BRIGHTNESS_JITTER_RATIO,
saturation_jitter_ratio=cfg.AUG.RICH_CROP.
SATURATION_JITTER_RATIO,
contrast_jitter_ratio=cfg.AUG.RICH_CROP.
CONTRAST_JITTER_RATIO)
if cfg.AUG.FLIP:
if cfg.AUG.FLIP_RATIO <= 0:
n = 0
elif cfg.AUG.FLIP_RATIO >= 1:
n = 1
else:
n = int(1.0 / cfg.AUG.FLIP_RATIO)
if n > 0:
if np.random.randint(0, n) == 0:
img = img[::-1, :, :]
grt = grt[::-1, :]
if cfg.AUG.MIRROR:
if np.random.randint(0, 2) == 1:
img = img[:, ::-1, :]
grt = grt[:, ::-1]
img, grt = aug.rand_crop(img, grt, mode=mode)
elif ModelPhase.is_eval(mode):
img, grt, grt_instance = aug.resize(img, grt, grt_instance, mode=mode)
elif ModelPhase.is_visual(mode):
ori_img = img.copy()
img, grt, grt_instance = aug.resize(img, grt, grt_instance, mode=mode)
valid_shape = [img.shape[0], img.shape[1]]
else:
raise ValueError("Dataset mode={} Error!".format(mode))
# Normalize image
img = self.normalize_image(img)
if ModelPhase.is_train(mode) or ModelPhase.is_eval(mode):
grt = np.expand_dims(np.array(grt).astype('int32'), axis=0)
ignore = (grt != cfg.DATASET.IGNORE_INDEX).astype('int32')
if ModelPhase.is_train(mode):
return (img, grt, grt_instance, ignore)
elif ModelPhase.is_eval(mode):
return (img, grt, grt_instance, ignore)
elif ModelPhase.is_visual(mode):
return (img, grt, grt_instance, img_name, valid_shape, ori_img)
pre-commit
yapf == 0.26.0
flake8
pyyaml >= 5.1
tb-paddle
tensorboard >= 1.15.0
Pillow
numpy
six
opencv-python
tqdm
requests
sklearn
# coding: utf8
# copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
# GPU memory garbage collection optimization flags
os.environ['FLAGS_eager_delete_tensor_gb'] = "0.0"
import sys
cur_path = os.path.abspath(os.path.dirname(__file__))
root_path = os.path.split(os.path.split(cur_path)[0])[0]
SEG_PATH = os.path.join(cur_path, "../../../")
sys.path.append(SEG_PATH)
sys.path.append(root_path)
import argparse
import pprint
import numpy as np
import paddle.fluid as fluid
from utils.config import cfg
from pdseg.utils.timer import Timer, calculate_eta
from reader import LaneNetDataset
from models.model_builder import build_model
from models.model_builder import ModelPhase
from models.model_builder import parse_shape_from_file
from eval import evaluate
from vis import visualize
from utils import dist_utils
def parse_args():
parser = argparse.ArgumentParser(description='PaddleSeg training')
parser.add_argument(
'--cfg',
dest='cfg_file',
help='Config file for training (and optionally testing)',
default=None,
type=str)
parser.add_argument(
'--use_gpu',
dest='use_gpu',
help='Use gpu or cpu',
action='store_true',
default=False)
parser.add_argument(
'--use_mpio',
dest='use_mpio',
help='Use multiprocess I/O or not',
action='store_true',
default=False)
parser.add_argument(
'--log_steps',
dest='log_steps',
help='Display logging information at every log_steps',
default=10,
type=int)
parser.add_argument(
'--debug',
dest='debug',
help='debug mode, display detail information of training',
action='store_true')
parser.add_argument(
'--use_tb',
dest='use_tb',
help='whether to record the data during training to Tensorboard',
action='store_true')
parser.add_argument(
'--tb_log_dir',
dest='tb_log_dir',
help='Tensorboard logging directory',
default=None,
type=str)
parser.add_argument(
'--do_eval',
dest='do_eval',
help='Evaluation models result on every new checkpoint',
action='store_true')
parser.add_argument(
'opts',
help='See utils/config.py for all options',
default=None,
nargs=argparse.REMAINDER)
return parser.parse_args()
def save_vars(executor, dirname, program=None, vars=None):
"""
Temporary resolution for Win save variables compatability.
Will fix in PaddlePaddle v1.5.2
"""
save_program = fluid.Program()
save_block = save_program.global_block()
for each_var in vars:
# NOTE: don't save the variable which type is RAW
if each_var.type == fluid.core.VarDesc.VarType.RAW:
continue
new_var = save_block.create_var(
name=each_var.name,
shape=each_var.shape,
dtype=each_var.dtype,
type=each_var.type,
lod_level=each_var.lod_level,
persistable=True)
file_path = os.path.join(dirname, new_var.name)
file_path = os.path.normpath(file_path)
save_block.append_op(
type='save',
inputs={'X': [new_var]},
outputs={},
attrs={'file_path': file_path})
executor.run(save_program)
def save_checkpoint(exe, program, ckpt_name):
"""
Save checkpoint for evaluation or resume training
"""
ckpt_dir = os.path.join(cfg.TRAIN.MODEL_SAVE_DIR, str(ckpt_name))
print("Save model checkpoint to {}".format(ckpt_dir))
if not os.path.isdir(ckpt_dir):
os.makedirs(ckpt_dir)
save_vars(
exe,
ckpt_dir,
program,
vars=list(filter(fluid.io.is_persistable, program.list_vars())))
return ckpt_dir
def load_checkpoint(exe, program):
"""
Load checkpoiont from pretrained model directory for resume training
"""
print('Resume model training from:', cfg.TRAIN.RESUME_MODEL_DIR)
if not os.path.exists(cfg.TRAIN.RESUME_MODEL_DIR):
raise ValueError("TRAIN.PRETRAIN_MODEL {} not exist!".format(
cfg.TRAIN.RESUME_MODEL_DIR))
fluid.io.load_persistables(
exe, cfg.TRAIN.RESUME_MODEL_DIR, main_program=program)
model_path = cfg.TRAIN.RESUME_MODEL_DIR
# Check is path ended by path spearator
if model_path[-1] == os.sep:
model_path = model_path[0:-1]
epoch_name = os.path.basename(model_path)
# If resume model is final model
if epoch_name == 'final':
begin_epoch = cfg.SOLVER.NUM_EPOCHS
# If resume model path is end of digit, restore epoch status
elif epoch_name.isdigit():
epoch = int(epoch_name)
begin_epoch = epoch + 1
else:
raise ValueError("Resume model path is not valid!")
print("Model checkpoint loaded successfully!")
return begin_epoch
def print_info(*msg):
if cfg.TRAINER_ID == 0:
print(*msg)
def train(cfg):
startup_prog = fluid.Program()
train_prog = fluid.Program()
drop_last = True
dataset = LaneNetDataset(
file_list=cfg.DATASET.TRAIN_FILE_LIST,
mode=ModelPhase.TRAIN,
shuffle=True,
data_dir=cfg.DATASET.DATA_DIR)
def data_generator():
if args.use_mpio:
data_gen = dataset.multiprocess_generator(
num_processes=cfg.DATALOADER.NUM_WORKERS,
max_queue_size=cfg.DATALOADER.BUF_SIZE)
else:
data_gen = dataset.generator()
batch_data = []
for b in data_gen:
batch_data.append(b)
if len(batch_data) == (cfg.BATCH_SIZE // cfg.NUM_TRAINERS):
for item in batch_data:
yield item
batch_data = []
# Get device environment
gpu_id = int(os.environ.get('FLAGS_selected_gpus', 0))
place = fluid.CUDAPlace(gpu_id) if args.use_gpu else fluid.CPUPlace()
places = fluid.cuda_places() if args.use_gpu else fluid.cpu_places()
# Get number of GPU
dev_count = cfg.NUM_TRAINERS if cfg.NUM_TRAINERS > 1 else len(places)
print_info("#Device count: {}".format(dev_count))
# Make sure BATCH_SIZE can divided by GPU cards
assert cfg.BATCH_SIZE % dev_count == 0, (
'BATCH_SIZE:{} not divisble by number of GPUs:{}'.format(
cfg.BATCH_SIZE, dev_count))
# If use multi-gpu training mode, batch data will allocated to each GPU evenly
batch_size_per_dev = cfg.BATCH_SIZE // dev_count
cfg.BATCH_SIZE_PER_DEV = batch_size_per_dev
print_info("batch_size_per_dev: {}".format(batch_size_per_dev))
py_reader, avg_loss, lr, pred, grts, masks, emb_loss, seg_loss, accuracy, fp, fn = build_model(
train_prog, startup_prog, phase=ModelPhase.TRAIN)
py_reader.decorate_sample_generator(
data_generator, batch_size=batch_size_per_dev, drop_last=drop_last)
exe = fluid.Executor(place)
exe.run(startup_prog)
exec_strategy = fluid.ExecutionStrategy()
# Clear temporary variables every 100 iteration
if args.use_gpu:
exec_strategy.num_threads = fluid.core.get_cuda_device_count()
exec_strategy.num_iteration_per_drop_scope = 100
build_strategy = fluid.BuildStrategy()
if cfg.NUM_TRAINERS > 1 and args.use_gpu:
dist_utils.prepare_for_multi_process(exe, build_strategy, train_prog)
exec_strategy.num_threads = 1
if cfg.TRAIN.SYNC_BATCH_NORM and args.use_gpu:
if dev_count > 1:
# Apply sync batch norm strategy
print_info("Sync BatchNorm strategy is effective.")
build_strategy.sync_batch_norm = True
else:
print_info(
"Sync BatchNorm strategy will not be effective if GPU device"
" count <= 1")
compiled_train_prog = fluid.CompiledProgram(train_prog).with_data_parallel(
loss_name=avg_loss.name,
exec_strategy=exec_strategy,
build_strategy=build_strategy)
# Resume training
begin_epoch = cfg.SOLVER.BEGIN_EPOCH
if cfg.TRAIN.RESUME_MODEL_DIR:
begin_epoch = load_checkpoint(exe, train_prog)
# Load pretrained model
elif os.path.exists(cfg.TRAIN.PRETRAINED_MODEL_DIR):
print_info('Pretrained model dir: ', cfg.TRAIN.PRETRAINED_MODEL_DIR)
load_vars = []
load_fail_vars = []
def var_shape_matched(var, shape):
"""
Check whehter persitable variable shape is match with current network
"""
var_exist = os.path.exists(
os.path.join(cfg.TRAIN.PRETRAINED_MODEL_DIR, var.name))
if var_exist:
var_shape = parse_shape_from_file(
os.path.join(cfg.TRAIN.PRETRAINED_MODEL_DIR, var.name))
if var_shape != shape:
print(var.name, var_shape, shape)
return var_shape == shape
return False
for x in train_prog.list_vars():
if isinstance(x, fluid.framework.Parameter):
shape = tuple(fluid.global_scope().find_var(
x.name).get_tensor().shape())
if var_shape_matched(x, shape):
load_vars.append(x)
else:
load_fail_vars.append(x)
fluid.io.load_vars(
exe, dirname=cfg.TRAIN.PRETRAINED_MODEL_DIR, vars=load_vars)
for var in load_vars:
print_info("Parameter[{}] loaded sucessfully!".format(var.name))
for var in load_fail_vars:
print_info(
"Parameter[{}] don't exist or shape does not match current network, skip"
" to load it.".format(var.name))
print_info("{}/{} pretrained parameters loaded successfully!".format(
len(load_vars),
len(load_vars) + len(load_fail_vars)))
else:
print_info(
'Pretrained model dir {} not exists, training from scratch...'.
format(cfg.TRAIN.PRETRAINED_MODEL_DIR))
# fetch_list = [avg_loss.name, lr.name, accuracy.name, precision.name, recall.name]
fetch_list = [avg_loss.name, lr.name, seg_loss.name, emb_loss.name, accuracy.name, fp.name, fn.name]
if args.debug:
# Fetch more variable info and use streaming confusion matrix to
# calculate IoU results if in debug mode
np.set_printoptions(
precision=4, suppress=True, linewidth=160, floatmode="fixed")
fetch_list.extend([pred.name, grts.name, masks.name])
# cm = ConfusionMatrix(cfg.DATASET.NUM_CLASSES, streaming=True)
if args.use_tb:
if not args.tb_log_dir:
print_info("Please specify the log directory by --tb_log_dir.")
exit(1)
from tb_paddle import SummaryWriter
log_writer = SummaryWriter(args.tb_log_dir)
# trainer_id = int(os.getenv("PADDLE_TRAINER_ID", 0))
# num_trainers = int(os.environ.get('PADDLE_TRAINERS_NUM', 1))
global_step = 0
all_step = cfg.DATASET.TRAIN_TOTAL_IMAGES // cfg.BATCH_SIZE
if cfg.DATASET.TRAIN_TOTAL_IMAGES % cfg.BATCH_SIZE and drop_last != True:
all_step += 1
all_step *= (cfg.SOLVER.NUM_EPOCHS - begin_epoch + 1)
avg_loss = 0.0
avg_seg_loss = 0.0
avg_emb_loss = 0.0
avg_acc = 0.0
avg_fp = 0.0
avg_fn = 0.0
timer = Timer()
timer.start()
if begin_epoch > cfg.SOLVER.NUM_EPOCHS:
raise ValueError(
("begin epoch[{}] is larger than cfg.SOLVER.NUM_EPOCHS[{}]").format(
begin_epoch, cfg.SOLVER.NUM_EPOCHS))
if args.use_mpio:
print_info("Use multiprocess reader")
else:
print_info("Use multi-thread reader")
for epoch in range(begin_epoch, cfg.SOLVER.NUM_EPOCHS + 1):
py_reader.start()
while True:
try:
# If not in debug mode, avoid unnessary log and calculate
loss, lr, out_seg_loss, out_emb_loss, out_acc, out_fp, out_fn = exe.run(
program=compiled_train_prog,
fetch_list=fetch_list,
return_numpy=True)
avg_loss += np.mean(np.array(loss))
avg_seg_loss += np.mean(np.array(out_seg_loss))
avg_emb_loss += np.mean(np.array(out_emb_loss))
avg_acc += np.mean(out_acc)
avg_fp += np.mean(out_fp)
avg_fn += np.mean(out_fn)
global_step += 1
if global_step % args.log_steps == 0 and cfg.TRAINER_ID == 0:
avg_loss /= args.log_steps
avg_seg_loss /= args.log_steps
avg_emb_loss /= args.log_steps
avg_acc /= args.log_steps
avg_fp /= args.log_steps
avg_fn /= args.log_steps
speed = args.log_steps / timer.elapsed_time()
print((
"epoch={} step={} lr={:.5f} loss={:.4f} seg_loss={:.4f} emb_loss={:.4f} accuracy={:.4} fp={:.4} fn={:.4} step/sec={:.3f} | ETA {}"
).format(epoch, global_step, lr[0], avg_loss, avg_seg_loss, avg_emb_loss, avg_acc, avg_fp, avg_fn, speed,
calculate_eta(all_step - global_step, speed)))
if args.use_tb:
log_writer.add_scalar('Train/loss', avg_loss,
global_step)
log_writer.add_scalar('Train/lr', lr[0],
global_step)
log_writer.add_scalar('Train/speed', speed,
global_step)
sys.stdout.flush()
avg_loss = 0.0
avg_seg_loss = 0.0
avg_emb_loss = 0.0
avg_acc = 0.0
avg_fp = 0.0
avg_fn = 0.0
timer.restart()
except fluid.core.EOFException:
py_reader.reset()
break
except Exception as e:
print(e)
if epoch % cfg.TRAIN.SNAPSHOT_EPOCH == 0 and cfg.TRAINER_ID == 0:
ckpt_dir = save_checkpoint(exe, train_prog, epoch)
if args.do_eval:
print("Evaluation start")
accuracy, fp, fn = evaluate(
cfg=cfg,
ckpt_dir=ckpt_dir,
use_gpu=args.use_gpu,
use_mpio=args.use_mpio)
if args.use_tb:
log_writer.add_scalar('Evaluate/accuracy', accuracy,
global_step)
log_writer.add_scalar('Evaluate/fp', fp,
global_step)
log_writer.add_scalar('Evaluate/fn', fn,
global_step)
# Use Tensorboard to visualize results
if args.use_tb and cfg.DATASET.VIS_FILE_LIST is not None:
visualize(
cfg=cfg,
use_gpu=args.use_gpu,
vis_file_list=cfg.DATASET.VIS_FILE_LIST,
vis_dir="visual",
ckpt_dir=ckpt_dir,
log_writer=log_writer)
# save final model
if cfg.TRAINER_ID == 0:
save_checkpoint(exe, train_prog, 'final')
def main(args):
if args.cfg_file is not None:
cfg.update_from_file(args.cfg_file)
if args.opts:
cfg.update_from_list(args.opts)
cfg.TRAINER_ID = int(os.getenv("PADDLE_TRAINER_ID", 0))
cfg.NUM_TRAINERS = int(os.environ.get('PADDLE_TRAINERS_NUM', 1))
cfg.check_and_infer()
print_info(pprint.pformat(cfg))
train(cfg)
if __name__ == '__main__':
args = parse_args()
if fluid.core.is_compiled_with_cuda() != True and args.use_gpu == True:
print(
"You can not set use_gpu = True in the model because you are using paddlepaddle-cpu."
)
print(
"Please: 1. Install paddlepaddle-gpu to run your models on GPU or 2. Set use_gpu=False to run models on CPU."
)
sys.exit(1)
main(args)
# -*- coding: utf-8 -*-
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import print_function
from __future__ import unicode_literals
import os
import sys
# LOCAL_PATH = os.path.dirname(os.path.abspath(__file__))
# PDSEG_PATH = os.path.join(LOCAL_PATH, "../../../", "pdseg")
# print(PDSEG_PATH)
# sys.path.insert(0, PDSEG_PATH)
# print(sys.path)
from pdseg.utils.collect import SegConfig
import numpy as np
cfg = SegConfig()
########################## 基本配置 ###########################################
# 均值,图像预处理减去的均值
cfg.MEAN = [0.5, 0.5, 0.5]
# 标准差,图像预处理除以标准差·
cfg.STD = [0.5, 0.5, 0.5]
# 批处理大小
cfg.BATCH_SIZE = 1
# 验证时图像裁剪尺寸(宽,高)
cfg.EVAL_CROP_SIZE = tuple()
# 训练时图像裁剪尺寸(宽,高)
cfg.TRAIN_CROP_SIZE = tuple()
# 多进程训练总进程数
cfg.NUM_TRAINERS = 1
# 多进程训练进程ID
cfg.TRAINER_ID = 0
# 每张gpu上的批大小,无需设置,程序会自动根据batch调整
cfg.BATCH_SIZE_PER_DEV = 1
########################## 数据载入配置 #######################################
# 数据载入时的并发数, 建议值8
cfg.DATALOADER.NUM_WORKERS = 8
# 数据载入时缓存队列大小, 建议值256
cfg.DATALOADER.BUF_SIZE = 256
########################## 数据集配置 #########################################
# 数据主目录目录
cfg.DATASET.DATA_DIR = './dataset/cityscapes/'
# 训练集列表
cfg.DATASET.TRAIN_FILE_LIST = './dataset/cityscapes/train.list'
# 训练集数量
cfg.DATASET.TRAIN_TOTAL_IMAGES = 2975
# 验证集列表
cfg.DATASET.VAL_FILE_LIST = './dataset/cityscapes/val.list'
# 验证数据数量
cfg.DATASET.VAL_TOTAL_IMAGES = 500
# 测试数据列表
cfg.DATASET.TEST_FILE_LIST = './dataset/cityscapes/test.list'
# 测试数据数量
cfg.DATASET.TEST_TOTAL_IMAGES = 500
# Tensorboard 可视化的数据集
cfg.DATASET.VIS_FILE_LIST = None
# 类别数(需包括背景类)
cfg.DATASET.NUM_CLASSES = 19
# 输入图像类型, 支持三通道'rgb',四通道'rgba',单通道灰度图'gray'
cfg.DATASET.IMAGE_TYPE = 'rgb'
# 输入图片的通道数
cfg.DATASET.DATA_DIM = 3
# 数据列表分割符, 默认为空格
cfg.DATASET.SEPARATOR = ' '
# 忽略的像素标签值, 默认为255,一般无需改动
cfg.DATASET.IGNORE_INDEX = 255
# 数据增强是图像的padding值
cfg.DATASET.PADDING_VALUE = [127.5,127.5,127.5]
########################### 数据增强配置 ######################################
# 图像镜像左右翻转
cfg.AUG.MIRROR = True
# 图像上下翻转开关,True/False
cfg.AUG.FLIP = False
# 图像启动上下翻转的概率,0-1
cfg.AUG.FLIP_RATIO = 0.5
# 图像resize的固定尺寸(宽,高),非负
cfg.AUG.FIX_RESIZE_SIZE = tuple()
# 图像resize的方式有三种:
# unpadding(固定尺寸),stepscaling(按比例resize),rangescaling(长边对齐)
cfg.AUG.AUG_METHOD = 'rangescaling'
# 图像resize方式为stepscaling,resize最小尺度,非负
cfg.AUG.MIN_SCALE_FACTOR = 0.5
# 图像resize方式为stepscaling,resize最大尺度,不小于MIN_SCALE_FACTOR
cfg.AUG.MAX_SCALE_FACTOR = 2.0
# 图像resize方式为stepscaling,resize尺度范围间隔,非负
cfg.AUG.SCALE_STEP_SIZE = 0.25
# 图像resize方式为rangescaling,训练时长边resize的范围最小值,非负
cfg.AUG.MIN_RESIZE_VALUE = 400
# 图像resize方式为rangescaling,训练时长边resize的范围最大值,
# 不小于MIN_RESIZE_VALUE
cfg.AUG.MAX_RESIZE_VALUE = 600
# 图像resize方式为rangescaling, 测试验证可视化模式下长边resize的长度,
# 在MIN_RESIZE_VALUE到MAX_RESIZE_VALUE范围内
cfg.AUG.INF_RESIZE_VALUE = 500
# RichCrop数据增广开关,用于提升模型鲁棒性
cfg.AUG.RICH_CROP.ENABLE = False
# 图像旋转最大角度,0-90
cfg.AUG.RICH_CROP.MAX_ROTATION = 15
# 裁取图像与原始图像面积比,0-1
cfg.AUG.RICH_CROP.MIN_AREA_RATIO = 0.5
# 裁取图像宽高比范围,非负
cfg.AUG.RICH_CROP.ASPECT_RATIO = 0.33
# 亮度调节范围,0-1
cfg.AUG.RICH_CROP.BRIGHTNESS_JITTER_RATIO = 0.5
# 饱和度调节范围,0-1
cfg.AUG.RICH_CROP.SATURATION_JITTER_RATIO = 0.5
# 对比度调节范围,0-1
cfg.AUG.RICH_CROP.CONTRAST_JITTER_RATIO = 0.5
# 图像模糊开关,True/False
cfg.AUG.RICH_CROP.BLUR = False
# 图像启动模糊百分比,0-1
cfg.AUG.RICH_CROP.BLUR_RATIO = 0.1
########################### 训练配置 ##########################################
# 模型保存路径
cfg.TRAIN.MODEL_SAVE_DIR = ''
# 预训练模型路径
cfg.TRAIN.PRETRAINED_MODEL_DIR = ''
# 是否resume,继续训练
cfg.TRAIN.RESUME_MODEL_DIR = ''
# 是否使用多卡间同步BatchNorm均值和方差
cfg.TRAIN.SYNC_BATCH_NORM = False
# 模型参数保存的epoch间隔数,可用来继续训练中断的模型
cfg.TRAIN.SNAPSHOT_EPOCH = 10
########################### 模型优化相关配置 ##################################
# 初始学习率
cfg.SOLVER.LR = 0.1
# 学习率下降方法, 支持poly piecewise cosine 三种
cfg.SOLVER.LR_POLICY = "poly"
# 优化算法, 支持SGD和Adam两种算法
cfg.SOLVER.OPTIMIZER = "sgd"
# 动量参数
cfg.SOLVER.MOMENTUM = 0.9
# 二阶矩估计的指数衰减率
cfg.SOLVER.MOMENTUM2 = 0.999
# 学习率Poly下降指数
cfg.SOLVER.POWER = 0.9
# step下降指数
cfg.SOLVER.GAMMA = 0.1
# step下降间隔
cfg.SOLVER.DECAY_EPOCH = [10, 20]
# 学习率权重衰减,0-1
cfg.SOLVER.WEIGHT_DECAY = 0.00004
# 训练开始epoch数,默认为1
cfg.SOLVER.BEGIN_EPOCH = 1
# 训练epoch数,正整数
cfg.SOLVER.NUM_EPOCHS = 30
# loss的选择,支持softmax_loss, bce_loss, dice_loss
cfg.SOLVER.LOSS = ["softmax_loss"]
# cross entropy weight, 默认为None,如果设置为'dynamic',会根据每个batch中各个类别的数目,
# 动态调整类别权重。
# 也可以设置一个静态权重(list的方式),比如有3类,每个类别权重可以设置为[0.1, 2.0, 0.9]
cfg.SOLVER.CROSS_ENTROPY_WEIGHT = None
########################## 测试配置 ###########################################
# 测试模型路径
cfg.TEST.TEST_MODEL = ''
########################## 模型通用配置 #######################################
# 模型名称, 支持deeplab, unet, icnet三种
cfg.MODEL.MODEL_NAME = ''
# BatchNorm类型: bn、gn(group_norm)
cfg.MODEL.DEFAULT_NORM_TYPE = 'bn'
# 多路损失加权值
cfg.MODEL.MULTI_LOSS_WEIGHT = [1.0]
# DEFAULT_NORM_TYPE为gn时group数
cfg.MODEL.DEFAULT_GROUP_NUMBER = 32
# 极小值, 防止分母除0溢出,一般无需改动
cfg.MODEL.DEFAULT_EPSILON = 1e-5
# BatchNorm动量, 一般无需改动
cfg.MODEL.BN_MOMENTUM = 0.99
# 是否使用FP16训练
cfg.MODEL.FP16 = False
# 混合精度训练需对LOSS进行scale, 默认为动态scale,静态scale可以设置为512.0
cfg.MODEL.SCALE_LOSS = "DYNAMIC"
########################## DeepLab模型配置 ####################################
# DeepLab backbone 配置, 可选项xception_65, mobilenetv2
cfg.MODEL.DEEPLAB.BACKBONE = "xception_65"
# DeepLab output stride
cfg.MODEL.DEEPLAB.OUTPUT_STRIDE = 16
# MobileNet backbone scale 设置
cfg.MODEL.DEEPLAB.DEPTH_MULTIPLIER = 1.0
# MobileNet backbone scale 设置
cfg.MODEL.DEEPLAB.ENCODER_WITH_ASPP = True
# MobileNet backbone scale 设置
cfg.MODEL.DEEPLAB.ENABLE_DECODER = True
# ASPP是否使用可分离卷积
cfg.MODEL.DEEPLAB.ASPP_WITH_SEP_CONV = True
# 解码器是否使用可分离卷积
cfg.MODEL.DEEPLAB.DECODER_USE_SEP_CONV = True
########################## UNET模型配置 #######################################
# 上采样方式, 默认为双线性插值
cfg.MODEL.UNET.UPSAMPLE_MODE = 'bilinear'
########################## ICNET模型配置 ######################################
# RESNET backbone scale 设置
cfg.MODEL.ICNET.DEPTH_MULTIPLIER = 0.5
# RESNET 层数 设置
cfg.MODEL.ICNET.LAYERS = 50
########################## PSPNET模型配置 ######################################
# Lannet backbone name
cfg.MODEL.LANENET.BACKBONE = "vgg"
########################## LaneNet模型配置 ######################################
########################## 预测部署模型配置 ###################################
# 预测保存的模型名称
cfg.FREEZE.MODEL_FILENAME = '__model__'
# 预测保存的参数名称
cfg.FREEZE.PARAMS_FILENAME = '__params__'
# 预测模型参数保存的路径
cfg.FREEZE.SAVE_DIR = 'freeze_model'
#copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import paddle.fluid as fluid
def nccl2_prepare(args, startup_prog, main_prog):
config = fluid.DistributeTranspilerConfig()
config.mode = "nccl2"
t = fluid.DistributeTranspiler(config=config)
envs = args.dist_env
t.transpile(
envs["trainer_id"],
trainers=','.join(envs["trainer_endpoints"]),
current_endpoint=envs["current_endpoint"],
startup_program=startup_prog,
program=main_prog)
def pserver_prepare(args, train_prog, startup_prog):
config = fluid.DistributeTranspilerConfig()
config.slice_var_up = args.split_var
t = fluid.DistributeTranspiler(config=config)
envs = args.dist_env
training_role = envs["training_role"]
t.transpile(
envs["trainer_id"],
program=train_prog,
pservers=envs["pserver_endpoints"],
trainers=envs["num_trainers"],
sync_mode=not args.async_mode,
startup_program=startup_prog)
if training_role == "PSERVER":
pserver_program = t.get_pserver_program(envs["current_endpoint"])
pserver_startup_program = t.get_startup_program(
envs["current_endpoint"],
pserver_program,
startup_program=startup_prog)
return pserver_program, pserver_startup_program
elif training_role == "TRAINER":
train_program = t.get_trainer_program()
return train_program, startup_prog
else:
raise ValueError(
'PADDLE_TRAINING_ROLE environment variable must be either TRAINER or PSERVER'
)
def nccl2_prepare_paddle(trainer_id, startup_prog, main_prog):
config = fluid.DistributeTranspilerConfig()
config.mode = "nccl2"
t = fluid.DistributeTranspiler(config=config)
t.transpile(
trainer_id,
trainers=os.environ.get('PADDLE_TRAINER_ENDPOINTS'),
current_endpoint=os.environ.get('PADDLE_CURRENT_ENDPOINT'),
startup_program=startup_prog,
program=main_prog)
def prepare_for_multi_process(exe, build_strategy, train_prog):
# prepare for multi-process
trainer_id = int(os.environ.get('PADDLE_TRAINER_ID', 0))
num_trainers = int(os.environ.get('PADDLE_TRAINERS_NUM', 1))
if num_trainers < 2: return
build_strategy.num_trainers = num_trainers
build_strategy.trainer_id = trainer_id
# NOTE(zcd): use multi processes to train the model,
# and each process use one GPU card.
startup_prog = fluid.Program()
nccl2_prepare_paddle(trainer_id, startup_prog, train_prog)
# the startup_prog are run two times, but it doesn't matter.
exe.run(startup_prog)
"""
generate tusimple training dataset
"""
import argparse
import glob
import json
import os
import os.path as ops
import shutil
import cv2
import numpy as np
def init_args():
parser = argparse.ArgumentParser()
parser.add_argument('--src_dir', type=str, help='The origin path of unzipped tusimple dataset')
return parser.parse_args()
def process_json_file(json_file_path, src_dir, ori_dst_dir, binary_dst_dir, instance_dst_dir):
assert ops.exists(json_file_path), '{:s} not exist'.format(json_file_path)
image_nums = len(os.listdir(os.path.join(src_dir, ori_dst_dir)))
with open(json_file_path, 'r') as file:
for line_index, line in enumerate(file):
info_dict = json.loads(line)
image_dir = ops.split(info_dict['raw_file'])[0]
image_dir_split = image_dir.split('/')[1:]
image_dir_split.append(ops.split(info_dict['raw_file'])[1])
image_name = '_'.join(image_dir_split)
image_path = ops.join(src_dir, info_dict['raw_file'])
assert ops.exists(image_path), '{:s} not exist'.format(image_path)
h_samples = info_dict['h_samples']
lanes = info_dict['lanes']
image_name_new = '{:s}.png'.format('{:d}'.format(line_index + image_nums).zfill(4))
src_image = cv2.imread(image_path, cv2.IMREAD_COLOR)
dst_binary_image = np.zeros([src_image.shape[0], src_image.shape[1]], np.uint8)
dst_instance_image = np.zeros([src_image.shape[0], src_image.shape[1]], np.uint8)
for lane_index, lane in enumerate(lanes):
assert len(h_samples) == len(lane)
lane_x = []
lane_y = []
for index in range(len(lane)):
if lane[index] == -2:
continue
else:
ptx = lane[index]
pty = h_samples[index]
lane_x.append(ptx)
lane_y.append(pty)
if not lane_x:
continue
lane_pts = np.vstack((lane_x, lane_y)).transpose()
lane_pts = np.array([lane_pts], np.int64)
cv2.polylines(dst_binary_image, lane_pts, isClosed=False,
color=255, thickness=5)
cv2.polylines(dst_instance_image, lane_pts, isClosed=False,
color=lane_index * 50 + 20, thickness=5)
dst_binary_image_path = ops.join(src_dir, binary_dst_dir, image_name_new)
dst_instance_image_path = ops.join(src_dir, instance_dst_dir, image_name_new)
dst_rgb_image_path = ops.join(src_dir, ori_dst_dir, image_name_new)
cv2.imwrite(dst_binary_image_path, dst_binary_image)
cv2.imwrite(dst_instance_image_path, dst_instance_image)
cv2.imwrite(dst_rgb_image_path, src_image)
print('Process {:s} success'.format(image_name))
def gen_sample(src_dir, b_gt_image_dir, i_gt_image_dir, image_dir, phase='train', split=False):
label_list = []
with open('{:s}/{}ing/{}.txt'.format(src_dir, phase, phase), 'w') as file:
for image_name in os.listdir(b_gt_image_dir):
if not image_name.endswith('.png'):
continue
binary_gt_image_path = ops.join(b_gt_image_dir, image_name)
instance_gt_image_path = ops.join(i_gt_image_dir, image_name)
image_path = ops.join(image_dir, image_name)
assert ops.exists(image_path), '{:s} not exist'.format(image_path)
assert ops.exists(instance_gt_image_path), '{:s} not exist'.format(instance_gt_image_path)
b_gt_image = cv2.imread(binary_gt_image_path, cv2.IMREAD_COLOR)
i_gt_image = cv2.imread(instance_gt_image_path, cv2.IMREAD_COLOR)
image = cv2.imread(image_path, cv2.IMREAD_COLOR)
if b_gt_image is None or image is None or i_gt_image is None:
print('image: {:s} corrupt'.format(image_name))
continue
else:
info = '{:s} {:s} {:s}'.format(image_path, binary_gt_image_path, instance_gt_image_path)
file.write(info + '\n')
label_list.append(info)
if phase == 'train' and split:
np.random.RandomState(0).shuffle(label_list)
val_list_len = len(label_list) // 10
val_label_list = label_list[:val_list_len]
train_label_list = label_list[val_list_len:]
with open('{:s}/{}ing/train_part.txt'.format(src_dir, phase, phase), 'w') as file:
for info in train_label_list:
file.write(info + '\n')
with open('{:s}/{}ing/val_part.txt'.format(src_dir, phase, phase), 'w') as file:
for info in val_label_list:
file.write(info + '\n')
return
def process_tusimple_dataset(src_dir):
traing_folder_path = ops.join(src_dir, 'training')
testing_folder_path = ops.join(src_dir, 'testing')
os.makedirs(traing_folder_path, exist_ok=True)
os.makedirs(testing_folder_path, exist_ok=True)
for json_label_path in glob.glob('{:s}/label*.json'.format(src_dir)):
json_label_name = ops.split(json_label_path)[1]
shutil.copyfile(json_label_path, ops.join(traing_folder_path, json_label_name))
for json_label_path in glob.glob('{:s}/test_label.json'.format(src_dir)):
json_label_name = ops.split(json_label_path)[1]
shutil.copyfile(json_label_path, ops.join(testing_folder_path, json_label_name))
train_gt_image_dir = ops.join('training', 'gt_image')
train_gt_binary_dir = ops.join('training', 'gt_binary_image')
train_gt_instance_dir = ops.join('training', 'gt_instance_image')
test_gt_image_dir = ops.join('testing', 'gt_image')
test_gt_binary_dir = ops.join('testing', 'gt_binary_image')
test_gt_instance_dir = ops.join('testing', 'gt_instance_image')
os.makedirs(os.path.join(src_dir, train_gt_image_dir), exist_ok=True)
os.makedirs(os.path.join(src_dir, train_gt_binary_dir), exist_ok=True)
os.makedirs(os.path.join(src_dir, train_gt_instance_dir), exist_ok=True)
os.makedirs(os.path.join(src_dir, test_gt_image_dir), exist_ok=True)
os.makedirs(os.path.join(src_dir, test_gt_binary_dir), exist_ok=True)
os.makedirs(os.path.join(src_dir, test_gt_instance_dir), exist_ok=True)
for json_label_path in glob.glob('{:s}/*.json'.format(traing_folder_path)):
process_json_file(json_label_path, src_dir, train_gt_image_dir, train_gt_binary_dir, train_gt_instance_dir)
gen_sample(src_dir, train_gt_binary_dir, train_gt_instance_dir, train_gt_image_dir, 'train', True)
if __name__ == '__main__':
args = init_args()
process_tusimple_dataset(args.src_dir)
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# this code heavily base on https://github.com/MaybeShewill-CV/lanenet-lane-detection/blob/master/lanenet_model/lanenet_postprocess.py
"""
LaneNet model post process
"""
import os.path as ops
import math
import cv2
import time
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler
def _morphological_process(image, kernel_size=5):
"""
morphological process to fill the hole in the binary segmentation result
:param image:
:param kernel_size:
:return:
"""
if len(image.shape) == 3:
raise ValueError('Binary segmentation result image should be a single channel image')
if image.dtype is not np.uint8:
image = np.array(image, np.uint8)
kernel = cv2.getStructuringElement(shape=cv2.MORPH_ELLIPSE, ksize=(kernel_size, kernel_size))
# close operation fille hole
closing = cv2.morphologyEx(image, cv2.MORPH_CLOSE, kernel, iterations=1)
return closing
def _connect_components_analysis(image):
"""
connect components analysis to remove the small components
:param image:
:return:
"""
if len(image.shape) == 3:
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
else:
gray_image = image
return cv2.connectedComponentsWithStats(gray_image, connectivity=8, ltype=cv2.CV_32S)
class _LaneFeat(object):
"""
"""
def __init__(self, feat, coord, class_id=-1):
"""
lane feat object
:param feat: lane embeddng feats [feature_1, feature_2, ...]
:param coord: lane coordinates [x, y]
:param class_id: lane class id
"""
self._feat = feat
self._coord = coord
self._class_id = class_id
@property
def feat(self):
return self._feat
@feat.setter
def feat(self, value):
if not isinstance(value, np.ndarray):
value = np.array(value, dtype=np.float64)
if value.dtype != np.float32:
value = np.array(value, dtype=np.float64)
self._feat = value
@property
def coord(self):
return self._coord
@coord.setter
def coord(self, value):
if not isinstance(value, np.ndarray):
value = np.array(value)
if value.dtype != np.int32:
value = np.array(value, dtype=np.int32)
self._coord = value
@property
def class_id(self):
return self._class_id
@class_id.setter
def class_id(self, value):
if not isinstance(value, np.int64):
raise ValueError('Class id must be integer')
self._class_id = value
class _LaneNetCluster(object):
"""
Instance segmentation result cluster
"""
def __init__(self):
"""
"""
self._color_map = [np.array([255, 0, 0]),
np.array([0, 255, 0]),
np.array([0, 0, 255]),
np.array([125, 125, 0]),
np.array([0, 125, 125]),
np.array([125, 0, 125]),
np.array([50, 100, 50]),
np.array([100, 50, 100])]
@staticmethod
def _embedding_feats_dbscan_cluster(embedding_image_feats):
"""
dbscan cluster
"""
db = DBSCAN(eps=0.4, min_samples=500)
try:
features = StandardScaler().fit_transform(embedding_image_feats)
db.fit(features)
except Exception as err:
print(err)
ret = {
'origin_features': None,
'cluster_nums': 0,
'db_labels': None,
'unique_labels': None,
'cluster_center': None
}
return ret
db_labels = db.labels_
unique_labels = np.unique(db_labels)
num_clusters = len(unique_labels)
cluster_centers = db.components_
ret = {
'origin_features': features,
'cluster_nums': num_clusters,
'db_labels': db_labels,
'unique_labels': unique_labels,
'cluster_center': cluster_centers
}
return ret
@staticmethod
def _get_lane_embedding_feats(binary_seg_ret, instance_seg_ret):
"""
get lane embedding features according the binary seg result
"""
idx = np.where(binary_seg_ret == 255)
lane_embedding_feats = instance_seg_ret[idx]
lane_coordinate = np.vstack((idx[1], idx[0])).transpose()
assert lane_embedding_feats.shape[0] == lane_coordinate.shape[0]
ret = {
'lane_embedding_feats': lane_embedding_feats,
'lane_coordinates': lane_coordinate
}
return ret
def apply_lane_feats_cluster(self, binary_seg_result, instance_seg_result):
"""
:param binary_seg_result:
:param instance_seg_result:
:return:
"""
# get embedding feats and coords
get_lane_embedding_feats_result = self._get_lane_embedding_feats(
binary_seg_ret=binary_seg_result,
instance_seg_ret=instance_seg_result
)
# dbscan cluster
dbscan_cluster_result = self._embedding_feats_dbscan_cluster(
embedding_image_feats=get_lane_embedding_feats_result['lane_embedding_feats']
)
mask = np.zeros(shape=[binary_seg_result.shape[0], binary_seg_result.shape[1], 3], dtype=np.uint8)
db_labels = dbscan_cluster_result['db_labels']
unique_labels = dbscan_cluster_result['unique_labels']
coord = get_lane_embedding_feats_result['lane_coordinates']
if db_labels is None:
return None, None
lane_coords = []
for index, label in enumerate(unique_labels.tolist()):
if label == -1:
continue
idx = np.where(db_labels == label)
pix_coord_idx = tuple((coord[idx][:, 1], coord[idx][:, 0]))
mask[pix_coord_idx] = self._color_map[index]
lane_coords.append(coord[idx])
return mask, lane_coords
class LaneNetPostProcessor(object):
"""
lanenet post process for lane generation
"""
def __init__(self, ipm_remap_file_path='./utils/tusimple_ipm_remap.yml'):
"""
convert front car view to bird view
"""
assert ops.exists(ipm_remap_file_path), '{:s} not exist'.format(ipm_remap_file_path)
self._cluster = _LaneNetCluster()
self._ipm_remap_file_path = ipm_remap_file_path
remap_file_load_ret = self._load_remap_matrix()
self._remap_to_ipm_x = remap_file_load_ret['remap_to_ipm_x']
self._remap_to_ipm_y = remap_file_load_ret['remap_to_ipm_y']
self._color_map = [np.array([255, 0, 0]),
np.array([0, 255, 0]),
np.array([0, 0, 255]),
np.array([125, 125, 0]),
np.array([0, 125, 125]),
np.array([125, 0, 125]),
np.array([50, 100, 50]),
np.array([100, 50, 100])]
def _load_remap_matrix(self):
fs = cv2.FileStorage(self._ipm_remap_file_path, cv2.FILE_STORAGE_READ)
remap_to_ipm_x = fs.getNode('remap_ipm_x').mat()
remap_to_ipm_y = fs.getNode('remap_ipm_y').mat()
ret = {
'remap_to_ipm_x': remap_to_ipm_x,
'remap_to_ipm_y': remap_to_ipm_y,
}
fs.release()
return ret
def postprocess(self, binary_seg_result, instance_seg_result=None,
min_area_threshold=100, source_image=None,
data_source='tusimple'):
# convert binary_seg_result
binary_seg_result = np.array(binary_seg_result * 255, dtype=np.uint8)
# apply image morphology operation to fill in the hold and reduce the small area
morphological_ret = _morphological_process(binary_seg_result, kernel_size=5)
connect_components_analysis_ret = _connect_components_analysis(image=morphological_ret)
labels = connect_components_analysis_ret[1]
stats = connect_components_analysis_ret[2]
for index, stat in enumerate(stats):
if stat[4] <= min_area_threshold:
idx = np.where(labels == index)
morphological_ret[idx] = 0
# apply embedding features cluster
mask_image, lane_coords = self._cluster.apply_lane_feats_cluster(
binary_seg_result=morphological_ret,
instance_seg_result=instance_seg_result
)
if mask_image is None:
return {
'mask_image': None,
'fit_params': None,
'source_image': None,
}
# lane line fit
fit_params = []
src_lane_pts = []
for lane_index, coords in enumerate(lane_coords):
if data_source == 'tusimple':
tmp_mask = np.zeros(shape=(720, 1280), dtype=np.uint8)
tmp_mask[tuple((np.int_(coords[:, 1] * 720 / 256), np.int_(coords[:, 0] * 1280 / 512)))] = 255
else:
raise ValueError('Wrong data source now only support tusimple')
tmp_ipm_mask = cv2.remap(
tmp_mask,
self._remap_to_ipm_x,
self._remap_to_ipm_y,
interpolation=cv2.INTER_NEAREST
)
nonzero_y = np.array(tmp_ipm_mask.nonzero()[0])
nonzero_x = np.array(tmp_ipm_mask.nonzero()[1])
fit_param = np.polyfit(nonzero_y, nonzero_x, 2)
fit_params.append(fit_param)
[ipm_image_height, ipm_image_width] = tmp_ipm_mask.shape
plot_y = np.linspace(10, ipm_image_height, ipm_image_height - 10)
fit_x = fit_param[0] * plot_y ** 2 + fit_param[1] * plot_y + fit_param[2]
lane_pts = []
for index in range(0, plot_y.shape[0], 5):
src_x = self._remap_to_ipm_x[
int(plot_y[index]), int(np.clip(fit_x[index], 0, ipm_image_width - 1))]
if src_x <= 0:
continue
src_y = self._remap_to_ipm_y[
int(plot_y[index]), int(np.clip(fit_x[index], 0, ipm_image_width - 1))]
src_y = src_y if src_y > 0 else 0
lane_pts.append([src_x, src_y])
src_lane_pts.append(lane_pts)
# tusimple test data sample point along y axis every 10 pixels
source_image_width = source_image.shape[1]
for index, single_lane_pts in enumerate(src_lane_pts):
single_lane_pt_x = np.array(single_lane_pts, dtype=np.float32)[:, 0]
single_lane_pt_y = np.array(single_lane_pts, dtype=np.float32)[:, 1]
if data_source == 'tusimple':
start_plot_y = 240
end_plot_y = 720
else:
raise ValueError('Wrong data source now only support tusimple')
step = int(math.floor((end_plot_y - start_plot_y) / 10))
for plot_y in np.linspace(start_plot_y, end_plot_y, step):
diff = single_lane_pt_y - plot_y
fake_diff_bigger_than_zero = diff.copy()
fake_diff_smaller_than_zero = diff.copy()
fake_diff_bigger_than_zero[np.where(diff <= 0)] = float('inf')
fake_diff_smaller_than_zero[np.where(diff > 0)] = float('-inf')
idx_low = np.argmax(fake_diff_smaller_than_zero)
idx_high = np.argmin(fake_diff_bigger_than_zero)
previous_src_pt_x = single_lane_pt_x[idx_low]
previous_src_pt_y = single_lane_pt_y[idx_low]
last_src_pt_x = single_lane_pt_x[idx_high]
last_src_pt_y = single_lane_pt_y[idx_high]
if previous_src_pt_y < start_plot_y or last_src_pt_y < start_plot_y or \
fake_diff_smaller_than_zero[idx_low] == float('-inf') or \
fake_diff_bigger_than_zero[idx_high] == float('inf'):
continue
interpolation_src_pt_x = (abs(previous_src_pt_y - plot_y) * previous_src_pt_x +
abs(last_src_pt_y - plot_y) * last_src_pt_x) / \
(abs(previous_src_pt_y - plot_y) + abs(last_src_pt_y - plot_y))
interpolation_src_pt_y = (abs(previous_src_pt_y - plot_y) * previous_src_pt_y +
abs(last_src_pt_y - plot_y) * last_src_pt_y) / \
(abs(previous_src_pt_y - plot_y) + abs(last_src_pt_y - plot_y))
if interpolation_src_pt_x > source_image_width or interpolation_src_pt_x < 10:
continue
lane_color = self._color_map[index].tolist()
cv2.circle(source_image, (int(interpolation_src_pt_x),
int(interpolation_src_pt_y)), 5, lane_color, -1)
ret = {
'mask_image': mask_image,
'fit_params': fit_params,
'source_image': source_image,
}
return ret
# coding: utf8
# copyright (c) 2019 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
# GPU memory garbage collection optimization flags
os.environ['FLAGS_eager_delete_tensor_gb'] = "0.0"
import sys
cur_path = os.path.abspath(os.path.dirname(__file__))
root_path = os.path.split(os.path.split(cur_path)[0])[0]
SEG_PATH = os.path.join(cur_path, "../../../")
sys.path.append(SEG_PATH)
sys.path.append(root_path)
import matplotlib
matplotlib.use('Agg')
import time
import argparse
import pprint
import cv2
import numpy as np
import paddle.fluid as fluid
from utils.config import cfg
from reader import LaneNetDataset
from models.model_builder import build_model
from models.model_builder import ModelPhase
from utils import lanenet_postprocess
import matplotlib.pyplot as plt
def parse_args():
parser = argparse.ArgumentParser(description='PaddeSeg visualization tools')
parser.add_argument(
'--cfg',
dest='cfg_file',
help='Config file for training (and optionally testing)',
default=None,
type=str)
parser.add_argument(
'--use_gpu', dest='use_gpu', help='Use gpu or cpu', action='store_true')
parser.add_argument(
'--vis_dir',
dest='vis_dir',
help='visual save dir',
type=str,
default='visual')
parser.add_argument(
'--also_save_raw_results',
dest='also_save_raw_results',
help='whether to save raw result',
action='store_true')
parser.add_argument(
'--local_test',
dest='local_test',
help='if in local test mode, only visualize 5 images for testing',
action='store_true')
parser.add_argument(
'opts',
help='See config.py for all options',
default=None,
nargs=argparse.REMAINDER)
if len(sys.argv) == 1:
parser.print_help()
sys.exit(1)
return parser.parse_args()
def makedirs(directory):
if not os.path.exists(directory):
os.makedirs(directory)
def to_png_fn(fn, name=""):
"""
Append png as filename postfix
"""
directory, filename = os.path.split(fn)
basename, ext = os.path.splitext(filename)
return basename + name + ".png"
def minmax_scale(input_arr):
min_val = np.min(input_arr)
max_val = np.max(input_arr)
output_arr = (input_arr - min_val) * 255.0 / (max_val - min_val)
return output_arr
def visualize(cfg,
vis_file_list=None,
use_gpu=False,
vis_dir="visual",
also_save_raw_results=False,
ckpt_dir=None,
log_writer=None,
local_test=False,
**kwargs):
if vis_file_list is None:
vis_file_list = cfg.DATASET.TEST_FILE_LIST
dataset = LaneNetDataset(
file_list=vis_file_list,
mode=ModelPhase.VISUAL,
shuffle=True,
data_dir=cfg.DATASET.DATA_DIR)
startup_prog = fluid.Program()
test_prog = fluid.Program()
pred, logit = build_model(test_prog, startup_prog, phase=ModelPhase.VISUAL)
# Clone forward graph
test_prog = test_prog.clone(for_test=True)
# Get device environment
place = fluid.CUDAPlace(0) if use_gpu else fluid.CPUPlace()
exe = fluid.Executor(place)
exe.run(startup_prog)
ckpt_dir = cfg.TEST.TEST_MODEL if not ckpt_dir else ckpt_dir
fluid.io.load_params(exe, ckpt_dir, main_program=test_prog)
save_dir = os.path.join(vis_dir, 'visual_results')
makedirs(save_dir)
if also_save_raw_results:
raw_save_dir = os.path.join(vis_dir, 'raw_results')
makedirs(raw_save_dir)
fetch_list = [pred.name, logit.name]
test_reader = dataset.batch(dataset.generator, batch_size=1, is_test=True)
postprocessor = lanenet_postprocess.LaneNetPostProcessor()
for imgs, grts, grts_instance, img_names, valid_shapes, org_imgs in test_reader:
segLogits, emLogits = exe.run(
program=test_prog,
feed={'image': imgs},
fetch_list=fetch_list,
return_numpy=True)
num_imgs = segLogits.shape[0]
for i in range(num_imgs):
gt_image = org_imgs[i]
binary_seg_image, instance_seg_image = segLogits[i].squeeze(-1), emLogits[i].transpose((1,2,0))
postprocess_result = postprocessor.postprocess(
binary_seg_result=binary_seg_image,
instance_seg_result=instance_seg_image,
source_image=gt_image
)
pred_binary_fn = os.path.join(save_dir, to_png_fn(img_names[i], name='_pred_binary'))
pred_lane_fn = os.path.join(save_dir, to_png_fn(img_names[i], name='_pred_lane'))
pred_instance_fn = os.path.join(save_dir, to_png_fn(img_names[i], name='_pred_instance'))
dirname = os.path.dirname(pred_binary_fn)
makedirs(dirname)
mask_image = postprocess_result['mask_image']
for i in range(4):
instance_seg_image[:, :, i] = minmax_scale(instance_seg_image[:, :, i])
embedding_image = np.array(instance_seg_image).astype(np.uint8)
plt.figure('mask_image')
plt.imshow(mask_image[:, :, (2, 1, 0)])
plt.figure('src_image')
plt.imshow(gt_image[:, :, (2, 1, 0)])
plt.figure('instance_image')
plt.imshow(embedding_image[:, :, (2, 1, 0)])
plt.figure('binary_image')
plt.imshow(binary_seg_image * 255, cmap='gray')
plt.show()
cv2.imwrite(pred_binary_fn, np.array(binary_seg_image * 255).astype(np.uint8))
cv2.imwrite(pred_lane_fn, postprocess_result['source_image'])
cv2.imwrite(pred_instance_fn, mask_image)
print(pred_lane_fn, 'saved!')
if __name__ == '__main__':
args = parse_args()
if args.cfg_file is not None:
cfg.update_from_file(args.cfg_file)
if args.opts:
cfg.update_from_list(args.opts)
cfg.check_and_infer()
print(pprint.pformat(cfg))
visualize(cfg, **args.__dict__)
...@@ -16,7 +16,7 @@ import sys ...@@ -16,7 +16,7 @@ import sys
import os import os
LOCAL_PATH = os.path.dirname(os.path.abspath(__file__)) LOCAL_PATH = os.path.dirname(os.path.abspath(__file__))
TEST_PATH = os.path.join(LOCAL_PATH, "..", "test") TEST_PATH = os.path.join(LOCAL_PATH, "..", "..", "test")
sys.path.append(TEST_PATH) sys.path.append(TEST_PATH)
from test_utils import download_file_and_uncompress from test_utils import download_file_and_uncompress
......
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
import os
LOCAL_PATH = os.path.dirname(os.path.abspath(__file__))
TEST_PATH = os.path.join(LOCAL_PATH, "..", "..", "test")
sys.path.append(TEST_PATH)
from test_utils import download_file_and_uncompress
if __name__ == "__main__":
download_file_and_uncompress(
url='https://paddleseg.bj.bcebos.com/models/unet_mechanical_industry_meter.tar',
savepath=LOCAL_PATH,
extrapath=LOCAL_PATH)
print("Pretrained Model download success!")
...@@ -21,14 +21,14 @@ DATALOADER: ...@@ -21,14 +21,14 @@ DATALOADER:
BUF_SIZE: 256 BUF_SIZE: 256
NUM_WORKERS: 4 NUM_WORKERS: 4
DATASET: DATASET:
DATA_DIR: "./dataset/mini_mechanical_industry_meter_data/" DATA_DIR: "./contrib/MechanicalIndustryMeter/mini_mechanical_industry_meter_data/"
IMAGE_TYPE: "rgb" # choice rgb or rgba IMAGE_TYPE: "rgb" # choice rgb or rgba
NUM_CLASSES: 5 NUM_CLASSES: 5
TEST_FILE_LIST: "./dataset/mini_mechanical_industry_meter_data/val_mini.txt" TEST_FILE_LIST: "./contrib/MechanicalIndustryMeter/mini_mechanical_industry_meter_data/val_mini.txt"
TEST_TOTAL_IMAGES: 8 TEST_TOTAL_IMAGES: 8
TRAIN_FILE_LIST: "./dataset/mini_mechanical_industry_meter_data/train_mini.txt" TRAIN_FILE_LIST: "./contrib/MechanicalIndustryMeter/mini_mechanical_industry_meter_data/train_mini.txt"
TRAIN_TOTAL_IMAGES: 64 TRAIN_TOTAL_IMAGES: 64
VAL_FILE_LIST: "./dataset/mini_mechanical_industry_meter_data/val_mini.txt" VAL_FILE_LIST: "./contrib/MechanicalIndustryMeter/mini_mechanical_industry_meter_data/val_mini.txt"
VAL_TOTAL_IMAGES: 8 VAL_TOTAL_IMAGES: 8
SEPARATOR: "|" SEPARATOR: "|"
IGNORE_INDEX: 255 IGNORE_INDEX: 255
......
# PaddleSeg 特色垂类分割模型 # PaddleSeg 特色垂类分割模型
提供基于PaddlePaddle最新的分割特色模型 提供基于PaddlePaddle最新的分割特色模型:
## Augmented Context Embedding with Edge Perceiving (ACE2P) - [人像分割](#人像分割)
- [人体解析](#人体解析)
- [车道线分割](#车道线分割)
- [工业用表分割](#工业用表分割)
- [在线体验](#在线体验)
## 人像分割
### 1. 模型概述 **Note:** 本章节所有命令均在`contrib/HumanSeg`目录下执行。
CVPR 19 Look into Person (LIP) 单人人像分割比赛冠军模型,详见[ACE2P](./ACE2P)
### 2. 模型下载 ```
cd contrib/HumanSeg
```
点击[链接](https://paddleseg.bj.bcebos.com/models/ACE2P.tgz),下载, 在contrib/ACE2P下解压, `tar -xzf ACE2P.tgz` ### 1. 模型结构
### 3. 数据下载 DeepLabv3+ backbone为Xception65
前往LIP数据集官网: http://47.100.21.47:9999/overview.php 或点击 [Baidu_Drive](https://pan.baidu.com/s/1nvqmZBN#list/path=%2Fsharelink2787269280-523292635003760%2FLIP%2FLIP&parentPath=%2Fsharelink2787269280-523292635003760), ### 2. 下载模型和数据
执行以下命令下载并解压模型和数据集:
加载Testing_images.zip, 解压到contrib/ACE2P/data文件夹下 ```
python download_HumanSeg.py
```
或点击[链接](https://paddleseg.bj.bcebos.com/models/HumanSeg.tgz)进行手动下载,并解压到contrib/HumanSeg文件夹下
### 4. 运行
**NOTE:** 运行该模型需要2G左右显存 ### 3. 运行
使用GPU预测 使用GPU预测
``` ```
python -u infer.py --example ACE2P --use_gpu python -u infer.py --example HumanSeg --use_gpu
``` ```
使用CPU预测: 使用CPU预测:
``` ```
python -u infer.py --example ACE2P python -u infer.py --example HumanSeg
``` ```
## 人像分割 (HumanSeg)
预测结果存放在contrib/HumanSeg/HumanSeg/result目录下。
### 1. 模型结构 ### 4. 预测结果示例:
DeepLabv3+ backbone为Xception65 原图:
![](HumanSeg/imgs/Human.jpg)
预测结果:
![](HumanSeg/imgs/HumanSeg.jpg)
### 2. 下载模型和数据
点击[链接](https://paddleseg.bj.bcebos.com/models/HumanSeg.tgz),下载解压到contrib文件夹下
### 3. 运行 ## 人体解析
![](ACE2P/imgs/result.jpg)
人体解析(Human Parsing)是细粒度的语义分割任务,旨在识别像素级别的人类图像的组成部分(例如,身体部位和服装)。本章节使用冠军模型Augmented Context Embedding with Edge Perceiving (ACE2P)进行预测分割。
**Note:** 本章节所有命令均在`contrib/ACE2P`目录下执行。
使用GPU预测:
``` ```
python -u infer.py --example HumanSeg --use_gpu cd contrib/ACE2P
``` ```
### 1. 模型概述
Augmented Context Embedding with Edge Perceiving (ACE2P)通过融合底层特征、全局上下文信息和边缘细节,端到端训练学习人体解析任务。以ACE2P单人人体解析网络为基础的解决方案在CVPR2019第三届Look into Person (LIP)挑战赛中赢得了全部三个人体解析任务的第一名。详情请参见[ACE2P](./ACE2P)
### 2. 模型下载
执行以下命令下载并解压ACE2P预测模型:
使用CPU预测:
``` ```
python -u infer.py --example HumanSeg python download_ACE2P.py
``` ```
或点击[链接](https://paddleseg.bj.bcebos.com/models/ACE2P.tgz)进行手动下载, 并在contrib/ACE2P下解压。
### 4. 预测结果示例: ### 3. 数据下载
测试图片共10000张,
点击 [Baidu_Drive](https://pan.baidu.com/s/1nvqmZBN#list/path=%2Fsharelink2787269280-523292635003760%2FLIP%2FLIP&parentPath=%2Fsharelink2787269280-523292635003760)
下载Testing_images.zip,或前往LIP数据集官网进行下载。
下载后解压到contrib/ACE2P/data文件夹下
### 4. 运行
使用GPU预测
```
python -u infer.py --example ACE2P --use_gpu
```
使用CPU预测:
```
python -u infer.py --example ACE2P
```
原图:![](imgs/Human.jpg) **NOTE:** 运行该模型需要2G左右显存。由于数据图片较多,预测过程将比较耗时。
#### 5. 预测结果示例:
原图:
![](ACE2P/imgs/117676_2149260.jpg)
预测结果:
预测结果:![](imgs/HumanSeg.jpg) ![](ACE2P/imgs/117676_2149260.png)
### 备注
## 车道线分割 (RoadLine) 1. 数据及模型路径等详细配置见ACE2P/HumanSeg/RoadLine下的config.py文件
2. ACE2P模型需预留2G显存,若显存超可调小FLAGS_fraction_of_gpu_memory_to_use
## 车道线分割
**Note:** 本章节所有命令均在`contrib/RoadLine`目录下执行。
```
cd contrib/RoadLine
```
### 1. 模型结构 ### 1. 模型结构
...@@ -75,7 +142,15 @@ Deeplabv3+ backbone为MobileNetv2 ...@@ -75,7 +142,15 @@ Deeplabv3+ backbone为MobileNetv2
### 2. 下载模型和数据 ### 2. 下载模型和数据
点击[链接](https://paddleseg.bj.bcebos.com/inference_model/RoadLine.tgz),下载解压在contrib文件夹下
执行以下命令下载并解压模型和数据集:
```
python download_RoadLine.py
```
或点击[链接](https://paddleseg.bj.bcebos.com/inference_model/RoadLine.tgz)进行手动下载,并解压到contrib/RoadLine文件夹下
### 3. 运行 ### 3. 运行
...@@ -92,45 +167,84 @@ python -u infer.py --example RoadLine --use_gpu ...@@ -92,45 +167,84 @@ python -u infer.py --example RoadLine --use_gpu
python -u infer.py --example RoadLine python -u infer.py --example RoadLine
``` ```
预测结果存放在contrib/RoadLine/RoadLine/result目录下。
#### 4. 预测结果示例: #### 4. 预测结果示例:
原图:![](imgs/RoadLine.jpg) 原图:
![](RoadLine/imgs/RoadLine.jpg)
预测结果:![](imgs/RoadLine.png) 预测结果:
![](RoadLine/imgs/RoadLine.png)
## 工业用表分割 ## 工业用表分割
**Note:** 本章节所有命令均在`PaddleSeg`目录下执行。
### 1. 模型结构 ### 1. 模型结构
unet unet
### 2. 数据准备 ### 2. 数据准备
cd到PaddleSeg/dataset文件夹下,执行download_mini_mechanical_industry_meter.py 执行以下命令下载并解压数据集,数据集将存放在contrib/MechanicalIndustryMeter文件夹下:
```
python ./contrib/MechanicalIndustryMeter/download_mini_mechanical_industry_meter.py
```
### 3. 下载预训练模型
```
python ./pretrained_model/download_model.py unet_bn_coco
```
### 4. 训练与评估
### 3. 训练与评估 ```
export CUDA_VISIBLE_DEVICES=0
python ./pdseg/train.py --log_steps 10 --cfg contrib/MechanicalIndustryMeter/unet_mechanical_meter.yaml --use_gpu --do_eval --use_mpio
```
### 5. 可视化
我们已提供了一个训练好的模型,执行以下命令进行下载,下载后将存放在./contrib/MechanicalIndustryMeter/文件夹下。
``` ```
CUDA_VISIBLE_DEVICES=0 python ./pdseg/train.py --log_steps 10 --cfg configs/unet_mechanical_meter.yaml --use_gpu --do_eval --use_mpio python ./contrib/MechanicalIndustryMeter/download_unet_mechanical_industry_meter.py
``` ```
### 4. 可视化 使用该模型进行预测可视化:
我们提供了一个训练好的模型,点击[链接](https://paddleseg.bj.bcebos.com/models/unet_mechanical_industry_meter.tar),下载后放在PaddleSeg/pretrained_model下
``` ```
CUDA_VISIBLE_DEVICES=0 python ./pdseg/vis.py --cfg configs/unet_mechanical_meter.yaml --use_gpu --vis_dir vis_meter \ python ./pdseg/vis.py --cfg contrib/MechanicalIndustryMeter/unet_mechanical_meter.yaml --use_gpu --vis_dir vis_meter \
TEST.TEST_MODEL "./pretrained_model/unet_gongyeyongbiao/" TEST.TEST_MODEL "./contrib/MechanicalIndustryMeter/unet_mechanical_industry_meter/"
``` ```
可视化结果会保存在vis_meter文件夹下 可视化结果会保存在./vis_meter文件夹下。
### 5. 可视化结果示例: ### 6. 可视化结果示例:
原图:![](imgs/1560143028.5_IMG_3091.JPG) 原图:
![](MechanicalIndustryMeter/imgs/1560143028.5_IMG_3091.JPG)
预测结果:![](imgs/1560143028.5_IMG_3091.png) 预测结果:
# 备注 ![](MechanicalIndustryMeter/imgs/1560143028.5_IMG_3091.png)
## 在线体验
PaddleSeg在AI Studio平台上提供了在线体验的教程,欢迎体验:
|教程|链接|
|-|-|
|工业质检|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/184392)|
|人像分割|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/188833)|
|特色垂类模型|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/226710)|
1. 数据及模型路径等详细配置见ACE2P/HumanSeg/RoadLine下的config.py文件
2. ACE2P模型需预留2G显存,若显存超可调小FLAGS_fraction_of_gpu_memory_to_use
cmake_minimum_required(VERSION 3.0)
project(PaddleMaskDetector CXX C)
option(WITH_MKL "Compile demo with MKL/OpenBlas support,defaultuseMKL." ON)
option(WITH_GPU "Compile demo with GPU/CPU, default use CPU." ON)
option(WITH_STATIC_LIB "Compile demo with static/shared library, default use static." ON)
option(USE_TENSORRT "Compile demo with TensorRT." OFF)
SET(PADDLE_DIR "" CACHE PATH "Location of libraries")
SET(OPENCV_DIR "" CACHE PATH "Location of libraries")
SET(CUDA_LIB "" CACHE PATH "Location of libraries")
macro(safe_set_static_flag)
foreach(flag_var
CMAKE_CXX_FLAGS CMAKE_CXX_FLAGS_DEBUG CMAKE_CXX_FLAGS_RELEASE
CMAKE_CXX_FLAGS_MINSIZEREL CMAKE_CXX_FLAGS_RELWITHDEBINFO)
if(${flag_var} MATCHES "/MD")
string(REGEX REPLACE "/MD" "/MT" ${flag_var} "${${flag_var}}")
endif(${flag_var} MATCHES "/MD")
endforeach(flag_var)
endmacro()
if (WITH_MKL)
ADD_DEFINITIONS(-DUSE_MKL)
endif()
if (NOT DEFINED PADDLE_DIR OR ${PADDLE_DIR} STREQUAL "")
message(FATAL_ERROR "please set PADDLE_DIR with -DPADDLE_DIR=/path/paddle_influence_dir")
endif()
if (NOT DEFINED OPENCV_DIR OR ${OPENCV_DIR} STREQUAL "")
message(FATAL_ERROR "please set OPENCV_DIR with -DOPENCV_DIR=/path/opencv")
endif()
include_directories("${CMAKE_SOURCE_DIR}/")
include_directories("${PADDLE_DIR}/")
include_directories("${PADDLE_DIR}/third_party/install/protobuf/include")
include_directories("${PADDLE_DIR}/third_party/install/glog/include")
include_directories("${PADDLE_DIR}/third_party/install/gflags/include")
include_directories("${PADDLE_DIR}/third_party/install/xxhash/include")
if (EXISTS "${PADDLE_DIR}/third_party/install/snappy/include")
include_directories("${PADDLE_DIR}/third_party/install/snappy/include")
endif()
if(EXISTS "${PADDLE_DIR}/third_party/install/snappystream/include")
include_directories("${PADDLE_DIR}/third_party/install/snappystream/include")
endif()
include_directories("${PADDLE_DIR}/third_party/install/zlib/include")
include_directories("${PADDLE_DIR}/third_party/boost")
include_directories("${PADDLE_DIR}/third_party/eigen3")
if (EXISTS "${PADDLE_DIR}/third_party/install/snappy/lib")
link_directories("${PADDLE_DIR}/third_party/install/snappy/lib")
endif()
if(EXISTS "${PADDLE_DIR}/third_party/install/snappystream/lib")
link_directories("${PADDLE_DIR}/third_party/install/snappystream/lib")
endif()
link_directories("${PADDLE_DIR}/third_party/install/zlib/lib")
link_directories("${PADDLE_DIR}/third_party/install/protobuf/lib")
link_directories("${PADDLE_DIR}/third_party/install/glog/lib")
link_directories("${PADDLE_DIR}/third_party/install/gflags/lib")
link_directories("${PADDLE_DIR}/third_party/install/xxhash/lib")
link_directories("${PADDLE_DIR}/paddle/lib/")
link_directories("${CMAKE_CURRENT_BINARY_DIR}")
if (WIN32)
include_directories("${PADDLE_DIR}/paddle/fluid/inference")
include_directories("${PADDLE_DIR}/paddle/include")
link_directories("${PADDLE_DIR}/paddle/fluid/inference")
include_directories("${OPENCV_DIR}/build/include")
include_directories("${OPENCV_DIR}/opencv/build/include")
link_directories("${OPENCV_DIR}/build/x64/vc14/lib")
else ()
find_package(OpenCV REQUIRED PATHS ${OPENCV_DIR}/share/OpenCV NO_DEFAULT_PATH)
include_directories("${PADDLE_DIR}/paddle/include")
link_directories("${PADDLE_DIR}/paddle/lib")
include_directories(${OpenCV_INCLUDE_DIRS})
endif ()
if (WIN32)
add_definitions("/DGOOGLE_GLOG_DLL_DECL=")
set(CMAKE_C_FLAGS_DEBUG "${CMAKE_C_FLAGS_DEBUG} /bigobj /MTd")
set(CMAKE_C_FLAGS_RELEASE "${CMAKE_C_FLAGS_RELEASE} /bigobj /MT")
set(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} /bigobj /MTd")
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} /bigobj /MT")
if (WITH_STATIC_LIB)
safe_set_static_flag()
add_definitions(-DSTATIC_LIB)
endif()
else()
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -g -o2 -fopenmp -std=c++11")
set(CMAKE_STATIC_LIBRARY_PREFIX "")
endif()
# TODO let users define cuda lib path
if (WITH_GPU)
if (NOT DEFINED CUDA_LIB OR ${CUDA_LIB} STREQUAL "")
message(FATAL_ERROR "please set CUDA_LIB with -DCUDA_LIB=/path/cuda-8.0/lib64")
endif()
if (NOT WIN32)
if (NOT DEFINED CUDNN_LIB)
message(FATAL_ERROR "please set CUDNN_LIB with -DCUDNN_LIB=/path/cudnn_v7.4/cuda/lib64")
endif()
endif(NOT WIN32)
endif()
if (NOT WIN32)
if (USE_TENSORRT AND WITH_GPU)
include_directories("${PADDLE_DIR}/third_party/install/tensorrt/include")
link_directories("${PADDLE_DIR}/third_party/install/tensorrt/lib")
endif()
endif(NOT WIN32)
if (NOT WIN32)
set(NGRAPH_PATH "${PADDLE_DIR}/third_party/install/ngraph")
if(EXISTS ${NGRAPH_PATH})
include(GNUInstallDirs)
include_directories("${NGRAPH_PATH}/include")
link_directories("${NGRAPH_PATH}/${CMAKE_INSTALL_LIBDIR}")
set(NGRAPH_LIB ${NGRAPH_PATH}/${CMAKE_INSTALL_LIBDIR}/libngraph${CMAKE_SHARED_LIBRARY_SUFFIX})
endif()
endif()
if(WITH_MKL)
include_directories("${PADDLE_DIR}/third_party/install/mklml/include")
if (WIN32)
set(MATH_LIB ${PADDLE_DIR}/third_party/install/mklml/lib/mklml.lib
${PADDLE_DIR}/third_party/install/mklml/lib/libiomp5md.lib)
else ()
set(MATH_LIB ${PADDLE_DIR}/third_party/install/mklml/lib/libmklml_intel${CMAKE_SHARED_LIBRARY_SUFFIX}
${PADDLE_DIR}/third_party/install/mklml/lib/libiomp5${CMAKE_SHARED_LIBRARY_SUFFIX})
execute_process(COMMAND cp -r ${PADDLE_DIR}/third_party/install/mklml/lib/libmklml_intel${CMAKE_SHARED_LIBRARY_SUFFIX} /usr/lib)
endif ()
set(MKLDNN_PATH "${PADDLE_DIR}/third_party/install/mkldnn")
if(EXISTS ${MKLDNN_PATH})
include_directories("${MKLDNN_PATH}/include")
if (WIN32)
set(MKLDNN_LIB ${MKLDNN_PATH}/lib/mkldnn.lib)
else ()
set(MKLDNN_LIB ${MKLDNN_PATH}/lib/libmkldnn.so.0)
endif ()
endif()
else()
set(MATH_LIB ${PADDLE_DIR}/third_party/install/openblas/lib/libopenblas${CMAKE_STATIC_LIBRARY_SUFFIX})
endif()
if (WIN32)
if(EXISTS "${PADDLE_DIR}/paddle/fluid/inference/libpaddle_fluid${CMAKE_STATIC_LIBRARY_SUFFIX}")
set(DEPS
${PADDLE_DIR}/paddle/fluid/inference/libpaddle_fluid${CMAKE_STATIC_LIBRARY_SUFFIX})
else()
set(DEPS
${PADDLE_DIR}/paddle/lib/libpaddle_fluid${CMAKE_STATIC_LIBRARY_SUFFIX})
endif()
endif()
if(WITH_STATIC_LIB)
set(DEPS
${PADDLE_DIR}/paddle/lib/libpaddle_fluid${CMAKE_STATIC_LIBRARY_SUFFIX})
else()
set(DEPS
${PADDLE_DIR}/paddle/lib/libpaddle_fluid${CMAKE_SHARED_LIBRARY_SUFFIX})
endif()
if (NOT WIN32)
set(DEPS ${DEPS}
${MATH_LIB} ${MKLDNN_LIB}
glog gflags protobuf z xxhash
)
if(EXISTS "${PADDLE_DIR}/third_party/install/snappystream/lib")
set(DEPS ${DEPS} snappystream)
endif()
if (EXISTS "${PADDLE_DIR}/third_party/install/snappy/lib")
set(DEPS ${DEPS} snappy)
endif()
else()
set(DEPS ${DEPS}
${MATH_LIB} ${MKLDNN_LIB}
opencv_world346 glog gflags_static libprotobuf zlibstatic xxhash)
set(DEPS ${DEPS} libcmt shlwapi)
if (EXISTS "${PADDLE_DIR}/third_party/install/snappy/lib")
set(DEPS ${DEPS} snappy)
endif()
if(EXISTS "${PADDLE_DIR}/third_party/install/snappystream/lib")
set(DEPS ${DEPS} snappystream)
endif()
endif(NOT WIN32)
if(WITH_GPU)
if(NOT WIN32)
if (USE_TENSORRT)
set(DEPS ${DEPS} ${PADDLE_DIR}/third_party/install/tensorrt/lib/libnvinfer${CMAKE_STATIC_LIBRARY_SUFFIX})
set(DEPS ${DEPS} ${PADDLE_DIR}/third_party/install/tensorrt/lib/libnvinfer_plugin${CMAKE_STATIC_LIBRARY_SUFFIX})
endif()
set(DEPS ${DEPS} ${CUDA_LIB}/libcudart${CMAKE_SHARED_LIBRARY_SUFFIX})
set(DEPS ${DEPS} ${CUDNN_LIB}/libcudnn${CMAKE_SHARED_LIBRARY_SUFFIX})
else()
set(DEPS ${DEPS} ${CUDA_LIB}/cudart${CMAKE_STATIC_LIBRARY_SUFFIX} )
set(DEPS ${DEPS} ${CUDA_LIB}/cublas${CMAKE_STATIC_LIBRARY_SUFFIX} )
set(DEPS ${DEPS} ${CUDA_LIB}/cudnn${CMAKE_STATIC_LIBRARY_SUFFIX})
endif()
endif()
if (NOT WIN32)
set(EXTERNAL_LIB "-ldl -lrt -lgomp -lz -lm -lpthread")
set(DEPS ${DEPS} ${EXTERNAL_LIB} ${OpenCV_LIBS})
endif()
add_executable(main main.cc humanseg.cc humanseg_postprocess.cc)
target_link_libraries(main ${DEPS})
if (WIN32)
add_custom_command(TARGET main POST_BUILD
COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/mklml/lib/mklml.dll ./mklml.dll
COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/mklml/lib/libiomp5md.dll ./libiomp5md.dll
COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/mkldnn/lib/mkldnn.dll ./mkldnn.dll
COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/mklml/lib/mklml.dll ./release/mklml.dll
COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/mklml/lib/libiomp5md.dll ./release/libiomp5md.dll
COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/mkldnn/lib/mkldnn.dll ./release/mkldnn.dll
)
endif()
{
"configurations": [
{
"name": "x64-Release",
"generator": "Ninja",
"configurationType": "RelWithDebInfo",
"inheritEnvironments": [ "msvc_x64_x64" ],
"buildRoot": "${projectDir}\\out\\build\\${name}",
"installRoot": "${projectDir}\\out\\install\\${name}",
"cmakeCommandArgs": "",
"buildCommandArgs": "-v",
"ctestCommandArgs": "",
"variables": [
{
"name": "CUDA_LIB",
"value": "D:/projects/packages/cuda10_0/lib64",
"type": "PATH"
},
{
"name": "CUDNN_LIB",
"value": "D:/projects/packages/cuda10_0/lib64",
"type": "PATH"
},
{
"name": "OPENCV_DIR",
"value": "D:/projects/packages/opencv3_4_6",
"type": "PATH"
},
{
"name": "PADDLE_DIR",
"value": "D:/projects/packages/fluid_inference1_6_1",
"type": "PATH"
},
{
"name": "CMAKE_BUILD_TYPE",
"value": "Release",
"type": "STRING"
}
]
}
]
}
\ No newline at end of file
# 视频实时图像分割模型C++预测部署
本文档主要介绍实时图像分割模型如何在`Windows``Linux`上完成基于`C++`的预测部署。
## C++预测部署编译
### 1. 下载模型
点击右边下载:[模型下载地址](https://paddleseg.bj.bcebos.com/deploy/models/humanseg_paddleseg_int8.zip)
模型文件路径将做为预测时的输入参数,请解压到合适的目录位置。
### 2. 编译
本项目支持在Windows和Linux上编译并部署C++项目,不同平台的编译请参考:
- [Linux 编译](./docs/linux_build.md)
- [Windows 使用 Visual Studio 2019编译](./docs/windows_build.md)
# 视频实时人像分割模型Linux平台C++预测部署
## 1. 系统和软件依赖
### 1.1 操作系统及硬件要求
- Ubuntu 14.04 或者 16.04 (其它平台未测试)
- GCC版本4.8.5 ~ 4.9.2
- 支持Intel MKL-DNN的CPU
- NOTE: 如需在Nvidia GPU运行,请自行安装CUDA 9.0 / 10.0 + CUDNN 7.3+ (不支持9.1/10.1版本的CUDA)
### 1.2 下载PaddlePaddle C++预测库
PaddlePaddle C++ 预测库主要分为CPU版本和GPU版本。
其中,GPU 版本支持`CUDA 10.0``CUDA 9.0`:
以下为各版本C++预测库的下载链接:
| 版本 | 链接 |
| ---- | ---- |
| CPU+MKL版 | [fluid_inference.tgz](https://paddle-inference-lib.bj.bcebos.com/1.6.3-cpu-avx-mkl/fluid_inference.tgz) |
| CUDA9.0+MKL 版 | [fluid_inference.tgz](https://paddle-inference-lib.bj.bcebos.com/1.6.3-gpu-cuda9-cudnn7-avx-mkl/fluid_inference.tgz) |
| CUDA10.0+MKL 版 | [fluid_inference.tgz](https://paddle-inference-lib.bj.bcebos.com/1.6.3-gpu-cuda10-cudnn7-avx-mkl/fluid_inference.tgz) |
更多可用预测库版本,请点击以下链接下载:[C++预测库下载列表](https://paddlepaddle.org.cn/documentation/docs/zh/advanced_usage/deploy/inference/build_and_install_lib_cn.html)
下载并解压, 解压后的 `fluid_inference`目录包含的内容:
```
fluid_inference
├── paddle # paddle核心库和头文件
|
├── third_party # 第三方依赖库和头文件
|
└── version.txt # 版本和编译信息
```
**注意:** 请把解压后的目录放到合适的路径,**该目录路径后续会作为编译依赖**使用。
## 2. 编译与运行
### 2.1 配置编译脚本
打开文件`linux_build.sh`, 看到以下内容:
```shell
# 是否使用GPU
WITH_GPU=OFF
# Paddle 预测库路径
PADDLE_DIR=/PATH/TO/fluid_inference/
# CUDA库路径, 仅 WITH_GPU=ON 时设置
CUDA_LIB=/PATH/TO/CUDA_LIB64/
# CUDNN库路径,仅 WITH_GPU=ON 且 CUDA_LIB有效时设置
CUDNN_LIB=/PATH/TO/CUDNN_LIB64/
# OpenCV 库路径, 无须设置
OPENCV_DIR=/PATH/TO/opencv3gcc4.8/
cd build
cmake .. \
-DWITH_GPU=${WITH_GPU} \
-DPADDLE_DIR=${PADDLE_DIR} \
-DCUDA_LIB=${CUDA_LIB} \
-DCUDNN_LIB=${CUDNN_LIB} \
-DOPENCV_DIR=${OPENCV_DIR} \
-DWITH_STATIC_LIB=OFF
make -j4
```
把上述参数根据实际情况做修改后,运行脚本编译程序:
```shell
sh linux_build.sh
```
### 2.2. 运行和可视化
可执行文件有 **2** 个参数,第一个是前面导出的`inference_model`路径,第二个是需要预测的视频路径。
示例:
```shell
./build/main ./models /PATH/TO/TEST_VIDEO
```
点击下载[测试视频](https://paddleseg.bj.bcebos.com/deploy/data/test.avi)
预测的结果保存在视频文件`result.avi`中。
# 视频实时人像分割模型Windows平台C++预测部署
## 1. 系统和软件依赖
### 1.1 基础依赖
- Windows 10 / Windows Server 2016+ (其它平台未测试)
- Visual Studio 2019 (社区版或专业版均可)
- CUDA 9.0 / 10.0 + CUDNN 7.3+ (不支持9.1/10.1版本的CUDA)
### 1.2 下载OpenCV并设置环境变量
- 在OpenCV官网下载适用于Windows平台的3.4.6版本: [点击下载](https://sourceforge.net/projects/opencvlibrary/files/3.4.6/opencv-3.4.6-vc14_vc15.exe/download)
- 运行下载的可执行文件,将OpenCV解压至合适目录,这里以解压到`D:\projects\opencv`为例
- 把OpenCV动态库加入到系统环境变量
- 此电脑(我的电脑)->属性->高级系统设置->环境变量
- 在系统变量中找到Path(如没有,自行创建),并双击编辑
- 新建,将opencv路径填入并保存,如D:\projects\opencv\build\x64\vc14\bin
**注意:** `OpenCV`的解压目录后续将做为编译配置项使用,所以请放置合适的目录中。
### 1.3 下载PaddlePaddle C++ 预测库
`PaddlePaddle` **C++ 预测库** 主要分为`CPU``GPU`版本, 其中`GPU版本`提供`CUDA 9.0``CUDA 10.0` 支持。
常用的版本如下:
| 版本 | 链接 |
| ---- | ---- |
| CPU+MKL版 | [fluid_inference_install_dir.zip](https://paddle-wheel.bj.bcebos.com/1.6.3/win-infer/mkl/cpu/fluid_inference_install_dir.zip) |
| CUDA9.0+MKL 版 | [fluid_inference_install_dir.zip](https://paddle-wheel.bj.bcebos.com/1.6.3/win-infer/mkl/post97/fluid_inference_install_dir.zip) |
| CUDA10.0+MKL 版 | [fluid_inference_install_dir.zip](https://paddle-wheel.bj.bcebos.com/1.6.3/win-infer/mkl/post107/fluid_inference_install_dir.zip) |
更多不同平台的可用预测库版本,请[点击查看](https://paddlepaddle.org.cn/documentation/docs/zh/advanced_usage/deploy/inference/windows_cpp_inference.html) 选择适合你的版本。
下载并解压, 解压后的 `fluid_inference`目录包含的内容:
```
fluid_inference_install_dir
├── paddle # paddle核心库和头文件
|
├── third_party # 第三方依赖库和头文件
|
└── version.txt # 版本和编译信息
```
**注意:** 这里的`fluid_inference_install_dir` 目录所在路径,将用于后面的编译参数设置,请放置在合适的位置。
## 2. Visual Studio 2019 编译
- 2.1 打开Visual Studio 2019 Community,点击`继续但无需代码`, 如下图:
![step2.1](https://paddleseg.bj.bcebos.com/inference/vs2019_step1.png)
- 2.2 点击 `文件`->`打开`->`CMake`, 如下图:
![step2.2](https://paddleseg.bj.bcebos.com/inference/vs2019_step2.png)
- 2.3 选择本项目根目录`CMakeList.txt`文件打开, 如下图:
![step2.3](https://paddleseg.bj.bcebos.com/deploy/docs/vs2019_step2.3.png)
- 2.4 点击:`项目`->`PaddleMaskDetector的CMake设置`
![step2.4](https://paddleseg.bj.bcebos.com/deploy/docs/vs2019_step2.4.png)
- 2.5 点击浏览设置`OPENCV_DIR`, `CUDA_LIB``PADDLE_DIR` 3个编译依赖库的位置, 设置完成后点击`保存并生成CMake缓存并加载变量`
![step2.5](https://paddleseg.bj.bcebos.com/inference/vs2019_step5.png)
- 2.6 点击`生成`->`全部生成` 编译项目
![step2.6](https://paddleseg.bj.bcebos.com/inference/vs2019_step6.png)
## 3. 运行程序
成功编译后, 产出的可执行文件在项目子目录`out\build\x64-Release`目录, 按以下步骤运行代码:
- 打开`cmd`切换至该目录
- 运行以下命令传入模型路径与测试视频
```shell
main.exe ./models/ ./data/test.avi
```
第一个参数即人像分割预测模型的路径,第二个参数即要预测的视频。
点击下载[测试视频](https://paddleseg.bj.bcebos.com/deploy/data/test.avi)
运行后,预测结果保存在文件`result.avi`中。
此差异已折叠。
此差异已折叠。
此差异已折叠。
// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#pragma once
#include <opencv2/core/core.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/optflow.hpp>
int ThresholdMask(const cv::Mat &fg_cfd,
const float fg_thres,
const float bg_thres,
cv::Mat& fg_mask);
cv::Mat MergeSegMat(const cv::Mat& seg_mat,
const cv::Mat& ori_frame);
int MergeProcess(const uchar *im_buff,
const float *im_scoremap_buff,
const int height,
const int width,
uchar *result_buff);
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
...@@ -99,4 +99,4 @@ cd /d D:\projects\PaddleSeg\deploy\cpp\out\build\x64-Release ...@@ -99,4 +99,4 @@ cd /d D:\projects\PaddleSeg\deploy\cpp\out\build\x64-Release
demo.exe --conf=/path/to/your/conf --input_dir=/path/to/your/input/data/directory demo.exe --conf=/path/to/your/conf --input_dir=/path/to/your/input/data/directory
``` ```
更详细说明请参考ReadMe文档: [预测和可视化部分](../ReadMe.md) 更详细说明请参考ReadMe文档: [预测和可视化部分](../README.md)
此差异已折叠。
此差异已折叠。
## This file must *NOT* be checked into Version Control Systems,
# as it contains information specific to your local configuration.
#
# Location of the SDK. This is only used by Gradle.
# For customization when using a Version Control System, please read the
# header note.
#Mon Nov 25 17:01:52 CST 2019
sdk.dir=/Users/chenlingchi/Library/Android/sdk
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册