提交 06690edf 编写于 作者: C Channingss

fix conflict

...@@ -8,16 +8,26 @@ ...@@ -8,16 +8,26 @@
PaddleX是基于飞桨技术生态的全流程深度学习模型开发工具。具备易集成,易使用,全流程等特点。PaddleX作为深度学习开发工具,不仅提供了开源的内核代码,可供用户灵活使用或集成,同时也提供了配套的前端可视化客户端套件,让用户以可视化的方式进行模型开发,免去代码开发过程,访问[PaddleX官网](https://www.paddlepaddle.org.cn/paddle/paddlex)获取更多相关细节。 PaddleX是基于飞桨技术生态的全流程深度学习模型开发工具。具备易集成,易使用,全流程等特点。PaddleX作为深度学习开发工具,不仅提供了开源的内核代码,可供用户灵活使用或集成,同时也提供了配套的前端可视化客户端套件,让用户以可视化的方式进行模型开发,免去代码开发过程,访问[PaddleX官网](https://www.paddlepaddle.org.cn/paddle/paddlex)获取更多相关细节。
## 安装 ## 安装
查看[PaddleX安装文档](docs/install.md) ### pip安装(使用Python代码进行模型训练)
> **依赖**
> - cython
> - pycocotools
> - python3
```
pip install paddlex -i https://mirror.baidu.com/pypi/simple
```
### PaddleX模型训练客户端安装(使用可视化界面进行模型训练)
> 进入官网[下载使用](https://www.paddlepaddle.org.cn/paddle/paddleX)
## 文档 ## 文档
推荐访问[PaddleX在线使用文档](https://paddlex.readthedocs.io/zh_CN/latest/index.html),快速查阅读使用教程和API文档说明。 推荐访问[PaddleX在线使用文档](https://paddlex.readthedocs.io/zh_CN/latest/index.html),快速查阅读使用教程和API文档说明。
常用文档
- [10分钟快速上手PaddleX模型训练](docs/quick_start.md) - [10分钟快速上手PaddleX模型训练](docs/quick_start.md)
- [PaddleX使用教程](docs/tutorials) - [PaddleX使用教程](docs/tutorials)
- [PaddleX模型库](docs/model_zoo.md) - [PaddleX模型库](docs/model_zoo.md)
- [导出模型部署](docs/deploy.md) - [导出模型部署](docs/deploy.md)
- [使用PaddleX客户端进行模型训练](docs/client_use.md)
## 反馈 ## 反馈
......
# Anaconda安装使用
Anaconda是一个开源的Python发行版本,其包含了conda、Python等180多个科学包及其依赖项。使用Anaconda可以通过创建多个独立的Python环境,避免用户的Python环境安装太多不同版本依赖导致冲突。
## Windows安装Anaconda
### 第一步 下载
在Anaconda官网[(https://www.anaconda.com/products/individual)](https://www.anaconda.com/products/individual)选择下载Windows Python3.7 64-Bit版本
### 第二步 安装
运行下载的安装包(以.exe为后辍),根据引导完成安装, 用户可自行修改安装目录(如下图)
![](./images/anaconda_windows.png)
### 第三步 使用
- 点击Windows系统左下角的Windows图标,打开:所有程序->Anaconda3/2(64-bit)->Anaconda Prompt
- 在命令行中执行下述命令
```cmd
# 创建名为my_paddlex的环境,指定Python版本为3.7
conda create -n my_paddlex python=3.7
# 进入my_paddlex环境
conda activate my_paddlex
# 安装git
conda install git
# 安装pycocotools
pip install cython
pip install git+https://github.com/philferriere/cocoapi.git#subdirectory=PythonAPI
# 安装paddlepaddle-gpu
pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
# 安装paddlex
pip install paddlex -i https://mirror.baidu.com/pypi/simple
```
按如上方式配置后,即可在环境中使用PaddleX了,命令行输入`python`回车后,`import paddlex`试试吧,之后再次使用都可以通过打开'所有程序->Anaconda3/2(64-bit)->Anaconda Prompt',再执行`conda activate my_paddlex`进入环境后,即可再次使用paddlex
## Linux/Mac安装
### 第一步 下载
在Anaconda官网[(https://www.anaconda.com/products/individual)](https://www.anaconda.com/products/individual)选择下载对应系统 Python3.7版本下载(Mac下载Command Line Installer版本即可)
### 第二步 安装
打开终端,在终端安装Anaconda
```
# ~/Downloads/Anaconda3-2019.07-Linux-x86_64.sh即下载的文件
bash ~/Downloads/Anaconda3-2019.07-Linux-x86_64.sh
```
安装过程中一直回车即可,如提示设置安装路径,可根据需求修改,一般默认即可。
### 第三步 使用
```
# 创建名为my_paddlex的环境,指定Python版本为3.7
conda create -n my_paddlex python=3.7
# 进入paddlex环境
conda activate my_paddlex
# 安装pycocotools
pip install cython
pip install pycocotools
# 安装paddlepaddle-gpu
pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
# 安装paddlex
pip install paddlex -i https://mirror.baidu.com/pypi/simple
```
按如上方式配置后,即可在环境中使用PaddleX了,终端输入`python`回车后,`import paddlex`试试吧,之后再次使用只需再打开终端,再执行`conda activate my_paddlex`进入环境后,即可使用paddlex
...@@ -122,56 +122,42 @@ paddlex.det.transforms.MixupImage(alpha=1.5, beta=1.5, mixup_epoch=-1) ...@@ -122,56 +122,42 @@ paddlex.det.transforms.MixupImage(alpha=1.5, beta=1.5, mixup_epoch=-1)
## RandomExpand类 ## RandomExpand类
```python ```python
paddlex.det.transforms.RandomExpand(max_ratio=4., prob=0.5, mean=[127.5, 127.5, 127.5]) paddlex.det.transforms.RandomExpand(ratio=4., prob=0.5, fill_value=[123.675, 116.28, 103.53])
``` ```
随机扩张图像,模型训练时的数据增强操作,模型训练时的数据增强操作。 随机扩张图像,模型训练时的数据增强操作
1. 随机选取扩张比例(扩张比例大于1时才进行扩张)。 1. 随机选取扩张比例(扩张比例大于1时才进行扩张)。
2. 计算扩张后图像大小。 2. 计算扩张后图像大小。
3. 初始化像素值为数据集均值的图像,并将原图像随机粘贴于该图像上。 3. 初始化像素值为输入填充值的图像,并将原图像随机粘贴于该图像上。
4. 根据原图像粘贴位置换算出扩张后真实标注框的位置坐标。 4. 根据原图像粘贴位置换算出扩张后真实标注框的位置坐标。
5. 根据原图像粘贴位置换算出扩张后真实分割区域的位置坐标。
### 参数 ### 参数
* **max_ratio** (float): 图像扩张的最大比例。默认为4.0。 * **ratio** (float): 图像扩张的最大比例。默认为4.0。
* **prob** (float): 随机扩张的概率。默认为0.5。 * **prob** (float): 随机扩张的概率。默认为0.5。
* **mean** (list): 图像数据集的均值(0-255)。默认为[127.5, 127.5, 127.5]。 * **fill_value** (list): 扩张图像的初始填充值(0-255)。默认为[123.675, 116.28, 103.53]。
## RandomCrop类 ## RandomCrop类
```python ```python
paddlex.det.transforms.RandomCrop(batch_sampler=None, satisfy_all=False, avoid_no_bbox=True) paddlex.det.transforms.RandomCrop(aspect_ratio=[.5, 2.], thresholds=[.0, .1, .3, .5, .7, .9], scaling=[.3, 1.], num_attempts=50, allow_no_crop=True, cover_all_box=False)
``` ```
随机裁剪图像,模型训练时的数据增强操作。 随机裁剪图像,模型训练时的数据增强操作。
1. 根据batch_sampler计算获取裁剪候选区域的位置。 1. 若allow_no_crop为True,则在thresholds加入’no_crop’。
(1) 根据min scale、max scale、min aspect ratio、max aspect ratio计算随机剪裁的高、宽。 2. 随机打乱thresholds。
(2) 根据随机剪裁的高、宽随机选取剪裁的起始点。 3. 遍历thresholds中各元素:
(3) 筛选出裁剪候选区域: (1) 如果当前thresh为’no_crop’,则返回原始图像和标注信息。
* 当satisfy_all为True时,需所有真实标注框与裁剪候选区域的重叠度满足需求时,该裁剪候选区域才可保留。 (2) 随机取出aspect_ratio和scaling中的值并由此计算出候选裁剪区域的高、宽、起始点。
* 当satisfy_all为False时,当有一个真实标注框与裁剪候选区域的重叠度满足需求时,该裁剪候选区域就可保留。 (3) 计算真实标注框与候选裁剪区域IoU,若全部真实标注框的IoU都小于thresh,则继续第3步。
2. 遍历所有裁剪候选区域: (4) 如果cover_all_box为True且存在真实标注框的IoU小于thresh,则继续第3步。
(1) 若真实标注框与候选裁剪区域不重叠,或其中心点不在候选裁剪区域,则将该真实标注框去除。 (5) 筛选出位于候选裁剪区域内的真实标注框,若有效框的个数为0,则继续第3步,否则进行第4步。
(2) 计算相对于该候选裁剪区域,真实标注框的位置,并筛选出对应的类别、混合得分。 4. 换算有效真值标注框相对候选裁剪区域的位置坐标。
(3) 若avoid_no_bbox为False,返回当前裁剪后的信息即可;反之,要找到一个裁剪区域中真实标注框个数不为0的区域,才返回裁剪后的信息 5. 换算有效分割区域相对候选裁剪区域的位置坐标
### 参数 ### 参数
* **batch_sampler** (list): 随机裁剪参数的多种组合,每种组合包含8个值,如下: * **aspect_ratio** (list): 裁剪后短边缩放比例的取值范围,以[min, max]形式表示。默认值为[.5, 2.]。
- max sample (int):满足当前组合的裁剪区域的个数上限。 * **thresholds** (list): 判断裁剪候选区域是否有效所需的IoU阈值取值列表。默认值为[.0, .1, .3, .5, .7, .9]。
- max trial (int): 查找满足当前组合的次数。 * **scaling** (list): 裁剪面积相对原面积的取值范围,以[min, max]形式表示。默认值为[.3, 1.]。
- min scale (float): 裁剪面积相对原面积,每条边缩短比例的最小限制。 * **num_attempts** (int): 在放弃寻找有效裁剪区域前尝试的次数。默认值为50。
- max scale (float): 裁剪面积相对原面积,每条边缩短比例的最大限制。 * **allow_no_crop** (bool): 是否允许未进行裁剪。默认值为True。
- min aspect ratio (float): 裁剪后短边缩放比例的最小限制。 * **cover_all_box** (bool): 是否要求所有的真实标注框都必须在裁剪区域内。默认值为False。
- max aspect ratio (float): 裁剪后短边缩放比例的最大限制。
- min overlap (float): 真实标注框与裁剪图像重叠面积的最小限制。
- max overlap (float): 真实标注框与裁剪图像重叠面积的最大限制。
默认值为None,当为None时采用如下设置:
[[1, 1, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0],
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.1, 1.0],
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.3, 1.0],
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.5, 1.0],
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.7, 1.0],
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.9, 1.0],
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.0, 1.0]]
* **satisfy_all** (bool): 是否需要所有标注框满足条件,裁剪候选区域才保留。默认为False。
* **avoid_no_bbox** (bool): 是否对裁剪图像不存在标注框的图像进行保留。默认为True。
...@@ -3,7 +3,7 @@ PaddleX提供了一系列模型预测和结果分析的可视化函数。 ...@@ -3,7 +3,7 @@ PaddleX提供了一系列模型预测和结果分析的可视化函数。
## 目标检测/实例分割预测结果可视化 ## 目标检测/实例分割预测结果可视化
``` ```
paddlex.det.visualize(image, result, threshold=0.5, save_dir=None) paddlex.det.visualize(image, result, threshold=0.5, save_dir='./')
``` ```
将目标检测/实例分割模型预测得到的Box框和Mask在原图上进行可视化 将目标检测/实例分割模型预测得到的Box框和Mask在原图上进行可视化
...@@ -11,7 +11,7 @@ paddlex.det.visualize(image, result, threshold=0.5, save_dir=None) ...@@ -11,7 +11,7 @@ paddlex.det.visualize(image, result, threshold=0.5, save_dir=None)
> * **image** (str): 原图文件路径。 > * **image** (str): 原图文件路径。
> * **result** (str): 模型预测结果。 > * **result** (str): 模型预测结果。
> * **threshold**(float): score阈值,将Box置信度低于该阈值的框过滤不进行可视化。默认0.5 > * **threshold**(float): score阈值,将Box置信度低于该阈值的框过滤不进行可视化。默认0.5
> * **save_dir**(str): 可视化结果保存路径。若为None,则表示不保存,该函数将可视化的结果以np.ndarray的形式返回;若设为目录路径,则将可视化结果保存至该目录下 > * **save_dir**(str): 可视化结果保存路径。若为None,则表示不保存,该函数将可视化的结果以np.ndarray的形式返回;若设为目录路径,则将可视化结果保存至该目录下。默认值为'./'。
### 使用示例 ### 使用示例
> 点击下载如下示例中的[模型](https://bj.bcebos.com/paddlex/models/xiaoduxiong_epoch_12.tar.gz)和[测试图片](https://bj.bcebos.com/paddlex/datasets/xiaoduxiong.jpeg) > 点击下载如下示例中的[模型](https://bj.bcebos.com/paddlex/models/xiaoduxiong_epoch_12.tar.gz)和[测试图片](https://bj.bcebos.com/paddlex/datasets/xiaoduxiong.jpeg)
...@@ -23,17 +23,81 @@ pdx.det.visualize('xiaoduxiong.jpeg', result, save_dir='./') ...@@ -23,17 +23,81 @@ pdx.det.visualize('xiaoduxiong.jpeg', result, save_dir='./')
# 预测结果保存在./visualize_xiaoduxiong.jpeg # 预测结果保存在./visualize_xiaoduxiong.jpeg
``` ```
## 目标检测/实例分割准确率-召回率可视化
```
paddlex.det.draw_pr_curve(eval_details_file=None, gt=None, pred_bbox=None, pred_mask=None, iou_thresh=0.5, save_dir='./')
```
将目标检测/实例分割模型评估结果中各个类别的准确率和召回率的对应关系进行可视化,同时可视化召回率和置信度阈值的对应关系。
### 参数
> * **eval_details_file** (str): 模型评估结果的保存路径,包含真值信息和预测结果。默认值为None。
> * **gt** (list): 数据集的真值信息。默认值为None。
> * **pred_bbox** (list): 模型在数据集上的预测框。默认值为None。
> * **pred_mask** (list): 模型在数据集上的预测mask。默认值为None。
> * **iou_thresh** (float): 判断预测框或预测mask为真阳时的IoU阈值。默认值为0.5。
> * **save_dir** (str): 可视化结果保存路径。默认值为'./'。
**注意:**`eval_details_file`的优先级更高,只要`eval_details_file`不为None,就会从`eval_details_file`提取真值信息和预测结果做分析。当`eval_details_file`为None时,则用`gt``pred_mask``pred_mask`做分析。
### 使用示例
> 示例一:
点击下载如下示例中的[模型](https://bj.bcebos.com/paddlex/models/xiaoduxiong_epoch_12.tar.gz)[数据集](https://bj.bcebos.com/paddlex/datasets/xiaoduxiong_ins_det.tar.gz)
```
import os
# 选择使用0号卡
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
from paddlex.det import transforms
import paddlex as pdx
eval_transforms = transforms.Compose([
transforms.Normalize(),
transforms.ResizeByShort(short_size=800, max_size=1333),
transforms.Padding(coarsest_stride=32)
])
eval_dataset = pdx.datasets.CocoDetection(
data_dir='xiaoduxiong_ins_det/JPEGImages',
ann_file='xiaoduxiong_ins_det/val.json',
transforms=eval_transforms)
model = pdx.load_model('xiaoduxiong_epoch_12')
metrics, evaluate_details = model.evaluate(eval_dataset, batch_size=1, return_details=True)
gt = evaluate_details['gt']
bbox = evaluate_details['bbox']
mask = evaluate_details['mask']
# 分别可视化bboxmask的准召曲线
pdx.det.draw_pr_curve(gt=gt, pred_bbox=bbox, pred_mask=mask, save_dir='./xiaoduxiong')
```
预测框的各个类别的准确率和召回率的对应关系、召回率和置信度阈值的对应关系可视化如下:
![](./images/xiaoduxiong_bbox_pr_curve(iou-0.5).png)
预测mask的各个类别的准确率和召回率的对应关系、召回率和置信度阈值的对应关系可视化如下:
![](./images/xiaoduxiong_segm_pr_curve(iou-0.5).png)
> 示例二:
使用[yolov3_darknet53.py示例代码](https://github.com/PaddlePaddle/PaddleX/blob/develop/tutorials/train/detection/yolov3_darknet53.py)训练完成后,加载模型评估结果文件进行分析:
```
import paddlex as pdx
eval_details_file = 'output/yolov3_darknet53/best_model/eval_details.json'
pdx.det.draw_pr_curve(eval_details_file, save_dir='./insect')
```
预测框的各个类别的准确率和召回率的对应关系、召回率和置信度阈值的对应关系可视化如下:
![](./images/insect_bbox_pr_curve(iou-0.5).png)
## 语义分割预测结果可视化 ## 语义分割预测结果可视化
``` ```
paddlex.seg.visualize(image, result, weight=0.6, save_dir=None) paddlex.seg.visualize(image, result, weight=0.6, save_dir='./')
``` ```
将语义分割模型预测得到的Mask在原图上进行可视化 将语义分割模型预测得到的Mask在原图上进行可视化
### 参数 ### 参数
> * **image** (str): 原图文件路径。 > * **image** (str): 原图文件路径。
> * **result** (str): 模型预测结果。 > * **result** (str): 模型预测结果。
> * **weight**(float): mask可视化结果与原图权重因子,weight表示原图的权重。默认0.6 > * **weight**(float): mask可视化结果与原图权重因子,weight表示原图的权重。默认0.6
> * **save_dir**(str): 可视化结果保存路径。若为None,则表示不保存,该函数将可视化的结果以np.ndarray的形式返回;若设为目录路径,则将可视化结果保存至该目录下 > * **save_dir**(str): 可视化结果保存路径。若为None,则表示不保存,该函数将可视化的结果以np.ndarray的形式返回;若设为目录路径,则将可视化结果保存至该目录下。默认值为'./'。
### 使用示例 ### 使用示例
> 点击下载如下示例中的[模型](https://bj.bcebos.com/paddlex/models/cityscape_deeplab.tar.gz)和[测试图片](https://bj.bcebos.com/paddlex/datasets/city.png) > 点击下载如下示例中的[模型](https://bj.bcebos.com/paddlex/models/cityscape_deeplab.tar.gz)和[测试图片](https://bj.bcebos.com/paddlex/datasets/city.png)
......
# 使用PaddleX客户端进行模型训练
**第一步:下载PaddleX客户端**
您需要前往 [官网](https://www.paddlepaddle.org.cn/paddle/paddlex)填写基本信息后下载试用PaddleX客户端
**第二步:准备数据**
在开始模型训练前,您需要根据不同的任务类型,将数据标注为相应的格式。目前PaddleX支持【图像分类】、【目标检测】、【语义分割】、【实例分割】四种任务类型。不同类型任务的数据处理方式可查看[数据标注方式](https://github.com/PaddlePaddle/PaddleX/tree/master/DataAnnotation/AnnotationNote)
**第三步:导入我的数据集**
① 数据标注完成后,您需要根据不同的任务,将数据和标注文件,按照客户端提示更名并保存到正确的文件中。
② 在客户端新建数据集,选择与数据集匹配的任务类型,并选择数据集对应的路径,将数据集导入。
![](./images/00_loaddata.png)
③ 选定导入数据集后,客户端会自动校验数据及标注文件是否合规,校验成功后,您可根据实际需求,将数据集按比例划分为训练集、验证集、测试集。
④ 您可在「数据分析」模块按规则预览您标注的数据集,双击单张图片可放大查看。
![](./images/01_datasplit.png)
**第四步:创建项目**
① 在完成数据导入后,您可以点击「新建项目」创建一个项目。
② 您可根据实际任务需求选择项目的任务类型,需要注意项目所采用的数据集也带有任务类型属性,两者需要进行匹配。
![](./images/02_newproject.png)
**第五步:项目开发**
**数据选择**:项目创建完成后,您需要选择已载入客户端并校验后的数据集,并点击下一步,进入参数配置页面。
![](./images/03_choosedata.png)
**参数配置**:主要分为**模型参数****训练参数****优化策略**三部分。您可根据实际需求选择模型结构及对应的训练参数、优化策略,使得任务效果最佳。
![](./images/04_parameter.png)
参数配置完成后,点击启动训练,模型开始训练并进行效果评估。
**训练可视化**
在训练过程中,您可通过VisualDL查看模型训练过程时的参数变化、日志详情,及当前最优的训练集和验证集训练指标。模型在训练过程中通过点击"终止训练"随时终止训练过程。
![](./images/05_train.png)
![](./images/06_VisualDL.png)
模型训练结束后,点击”下一步“,进入模型评估页面。
**模型评估**
在模型评估页面,您可将训练后的模型应用在切分时留出的「验证数据集」以测试模型在验证集上的效果。评估方法包括混淆矩阵、精度、召回率等。在这个页面,您也可以直接查看模型在测试数据集上的预测效果。
根据评估结果,您可决定进入模型发布页面,或返回先前步骤调整参数配置重新进行训练。
![](./images/07_evaluate.png)
**模型发布**
当模型效果满意后,您可根据实际的生产环境需求,选择将模型发布为需要的版本。
![](./images/08_deploy.png)
...@@ -19,9 +19,10 @@ PaddleX是基于飞桨技术生态的深度学习全流程开发工具。具备 ...@@ -19,9 +19,10 @@ PaddleX是基于飞桨技术生态的深度学习全流程开发工具。具备
tutorials/index.rst tutorials/index.rst
metrics.md metrics.md
deploy.md deploy.md
client_use.md
FAQ.md FAQ.md
* PaddleX版本: v0.1.0 * PaddleX版本: v0.1.6
* 项目官网: http://www.paddlepaddle.org.cn/paddle/paddlex * 项目官网: http://www.paddlepaddle.org.cn/paddle/paddlex
* 项目GitHub: https://github.com/PaddlePaddle/PaddleX/tree/develop * 项目GitHub: https://github.com/PaddlePaddle/PaddleX/tree/develop
* 官方QQ用户群: 1045148026 * 官方QQ用户群: 1045148026
......
# 安装 # 安装
> 以下安装过程默认用户已安装好**paddlepaddle-gpu或paddlepaddle(版本大于或等于1.7.1)**,paddlepaddle安装方式参照[飞桨官网](https://www.paddlepaddle.org.cn/install/quick) 以下安装过程默认用户已安装好**paddlepaddle-gpu或paddlepaddle(版本大于或等于1.7.1)**,paddlepaddle安装方式参照[飞桨官网](https://www.paddlepaddle.org.cn/install/quick)
> 推荐使用Anaconda Python环境,Anaconda下安装PaddleX参考文档[Anaconda安装使用](./anaconda_install.md)
## Github代码安装 ## Github代码安装
github代码会跟随开发进度不断更新 github代码会跟随开发进度不断更新
......
# 10分钟快速上手使用 # 10分钟快速上手使用
本文档在一个小数据集上展示了如何通过PaddleX进行训练,您可以阅读PaddleX的**使用教程**来了解更多模型任务的训练使用方式。本示例同步在AIStudio上,可直接[在线体验模型训练](https://aistudio.baidu.com/aistudio/projectdetail/423472) 本文档在一个小数据集上展示了如何通过PaddleX进行训练,您可以阅读PaddleX的**使用教程**来了解更多模型任务的训练使用方式。本示例同步在AIStudio上,可直接[在线体验模型训练](https://aistudio.baidu.com/aistudio/projectdetail/439860)
## 1. 准备蔬菜分类数据集 ## 1. 准备蔬菜分类数据集
``` ```
...@@ -42,7 +42,7 @@ train_dataset = pdx.datasets.ImageNet( ...@@ -42,7 +42,7 @@ train_dataset = pdx.datasets.ImageNet(
shuffle=True) shuffle=True)
eval_dataset = pdx.datasets.ImageNet( eval_dataset = pdx.datasets.ImageNet(
data_dir='vegetables_cls', data_dir='vegetables_cls',
file_list='vegetables_cls/train_list.txt', file_list='vegetables_cls/val_list.txt',
label_list='vegetables_cls/labels.txt', label_list='vegetables_cls/labels.txt',
transforms=eval_transforms) transforms=eval_transforms)
``` ```
......
...@@ -25,4 +25,4 @@ load_model = cv.models.load_model ...@@ -25,4 +25,4 @@ load_model = cv.models.load_model
datasets = cv.datasets datasets = cv.datasets
log_level = 2 log_level = 2
__version__ = '0.1.5.github' __version__ = '0.1.6.github'
...@@ -15,10 +15,14 @@ ...@@ -15,10 +15,14 @@
import os import os
import cv2 import cv2
import numpy as np import numpy as np
import matplotlib.pyplot as plt
from PIL import Image, ImageDraw from PIL import Image, ImageDraw
import paddlex.utils.logging as logging
from .detection_eval import fixed_linspace, backup_linspace, loadRes
def visualize_detection(image, result, threshold=0.5, save_dir=None):
def visualize_detection(image, result, threshold=0.5, save_dir='./'):
""" """
Visualize bbox and mask results Visualize bbox and mask results
""" """
...@@ -31,11 +35,12 @@ def visualize_detection(image, result, threshold=0.5, save_dir=None): ...@@ -31,11 +35,12 @@ def visualize_detection(image, result, threshold=0.5, save_dir=None):
os.makedirs(save_dir) os.makedirs(save_dir)
out_path = os.path.join(save_dir, 'visualize_{}'.format(image_name)) out_path = os.path.join(save_dir, 'visualize_{}'.format(image_name))
image.save(out_path, quality=95) image.save(out_path, quality=95)
logging.info('The visualized result is saved as {}'.format(out_path))
else: else:
return image return image
def visualize_segmentation(image, result, weight=0.6, save_dir=None): def visualize_segmentation(image, result, weight=0.6, save_dir='./'):
""" """
Convert segment result to color image, and save added image. Convert segment result to color image, and save added image.
Args: Args:
...@@ -62,6 +67,7 @@ def visualize_segmentation(image, result, weight=0.6, save_dir=None): ...@@ -62,6 +67,7 @@ def visualize_segmentation(image, result, weight=0.6, save_dir=None):
image_name = os.path.split(image)[-1] image_name = os.path.split(image)[-1]
out_path = os.path.join(save_dir, 'visualize_{}'.format(image_name)) out_path = os.path.join(save_dir, 'visualize_{}'.format(image_name))
cv2.imwrite(out_path, vis_result) cv2.imwrite(out_path, vis_result)
logging.info('The visualized result is saved as {}'.format(out_path))
else: else:
return vis_result return vis_result
...@@ -160,3 +166,129 @@ def draw_bbox_mask(image, results, threshold=0.5, alpha=0.7): ...@@ -160,3 +166,129 @@ def draw_bbox_mask(image, results, threshold=0.5, alpha=0.7):
img_array[idx[0], idx[1], :] += alpha * color_mask img_array[idx[0], idx[1], :] += alpha * color_mask
image = Image.fromarray(img_array.astype('uint8')) image = Image.fromarray(img_array.astype('uint8'))
return image return image
def draw_pr_curve(eval_details_file=None,
gt=None,
pred_bbox=None,
pred_mask=None,
iou_thresh=0.5,
save_dir='./'):
if eval_details_file is not None:
import json
with open(eval_details_file, 'r') as f:
eval_details = json.load(f)
pred_bbox = eval_details['bbox']
if 'mask' in eval_details:
pred_mask = eval_details['mask']
gt = eval_details['gt']
if gt is None or pred_bbox is None:
raise Exception(
"gt/pred_bbox/pred_mask is None now, please set right eval_details_file or gt/pred_bbox/pred_mask."
)
if pred_bbox is not None and len(pred_bbox) == 0:
raise Exception("There is no predicted bbox.")
if pred_mask is not None and len(pred_mask) == 0:
raise Exception("There is no predicted mask.")
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval
coco = COCO()
coco.dataset = gt
coco.createIndex()
def _summarize(coco_gt, ap=1, iouThr=None, areaRng='all', maxDets=100):
p = coco_gt.params
aind = [i for i, aRng in enumerate(p.areaRngLbl) if aRng == areaRng]
mind = [i for i, mDet in enumerate(p.maxDets) if mDet == maxDets]
if ap == 1:
# dimension of precision: [TxRxKxAxM]
s = coco_gt.eval['precision']
# IoU
if iouThr is not None:
t = np.where(iouThr == p.iouThrs)[0]
s = s[t]
s = s[:, :, :, aind, mind]
else:
# dimension of recall: [TxKxAxM]
s = coco_gt.eval['recall']
if iouThr is not None:
t = np.where(iouThr == p.iouThrs)[0]
s = s[t]
s = s[:, :, aind, mind]
if len(s[s > -1]) == 0:
mean_s = -1
else:
mean_s = np.mean(s[s > -1])
return mean_s
def cal_pr(coco_gt, coco_dt, iou_thresh, save_dir, style='bbox'):
from pycocotools.cocoeval import COCOeval
coco_dt = loadRes(coco_gt, coco_dt)
np.linspace = fixed_linspace
coco_eval = COCOeval(coco_gt, coco_dt, style)
coco_eval.params.iouThrs = np.linspace(
iou_thresh, iou_thresh, 1, endpoint=True)
np.linspace = backup_linspace
coco_eval.evaluate()
coco_eval.accumulate()
stats = _summarize(coco_eval, iouThr=iou_thresh)
catIds = coco_gt.getCatIds()
if len(catIds) != coco_eval.eval['precision'].shape[2]:
raise Exception(
"The category number must be same as the third dimension of precisions."
)
x = np.arange(0.0, 1.01, 0.01)
color_map = get_color_map_list(256)[1:256]
plt.subplot(1, 2, 1)
plt.title(style + " precision-recall IoU={}".format(iou_thresh))
plt.xlabel("recall")
plt.ylabel("precision")
plt.xlim(0, 1.01)
plt.ylim(0, 1.01)
plt.grid(linestyle='--', linewidth=1)
plt.plot([0, 1], [0, 1], 'r--', linewidth=1)
my_x_ticks = np.arange(0, 1.01, 0.1)
my_y_ticks = np.arange(0, 1.01, 0.1)
plt.xticks(my_x_ticks, fontsize=5)
plt.yticks(my_y_ticks, fontsize=5)
for idx, catId in enumerate(catIds):
pr_array = coco_eval.eval['precision'][0, :, idx, 0, 2]
precision = pr_array[pr_array > -1]
ap = np.mean(precision) if precision.size else float('nan')
nm = coco_gt.loadCats(catId)[0]['name'] + ' AP={:0.2f}'.format(
float(ap * 100))
color = tuple(color_map[idx])
color = [float(c) / 255 for c in color]
color.append(0.75)
plt.plot(x, pr_array, color=color, label=nm, linewidth=1)
plt.legend(loc="lower left", fontsize=5)
plt.subplot(1, 2, 2)
plt.title(style + " score-recall IoU={}".format(iou_thresh))
plt.xlabel('recall')
plt.ylabel('score')
plt.xlim(0, 1.01)
plt.ylim(0, 1.01)
plt.grid(linestyle='--', linewidth=1)
plt.xticks(my_x_ticks, fontsize=5)
plt.yticks(my_y_ticks, fontsize=5)
for idx, catId in enumerate(catIds):
nm = coco_gt.loadCats(catId)[0]['name']
sr_array = coco_eval.eval['scores'][0, :, idx, 0, 2]
color = tuple(color_map[idx])
color = [float(c) / 255 for c in color]
color.append(0.75)
plt.plot(x, sr_array, color=color, label=nm, linewidth=1)
plt.legend(loc="lower left", fontsize=5)
plt.savefig(
os.path.join(save_dir, "./{}_pr_curve(iou-{}).png".format(
style, iou_thresh)),
dpi=800)
plt.close()
if not os.path.exists(save_dir):
os.makedirs(save_dir)
cal_pr(coco, pred_bbox, iou_thresh, save_dir, style='bbox')
if pred_mask is not None:
cal_pr(coco, pred_mask, iou_thresh, save_dir, style='segm')
...@@ -95,9 +95,9 @@ class YOLOv3(BaseAPI): ...@@ -95,9 +95,9 @@ class YOLOv3(BaseAPI):
elif backbone_name == 'MobileNetV1': elif backbone_name == 'MobileNetV1':
backbone = paddlex.cv.nets.MobileNetV1(norm_type='sync_bn') backbone = paddlex.cv.nets.MobileNetV1(norm_type='sync_bn')
elif backbone_name.startswith('MobileNetV3'): elif backbone_name.startswith('MobileNetV3'):
models_name = backbone_name.split('_')[1] model_name = backbone_name.split('_')[1]
backbone = paddlex.cv.nets.MobileNetV3( backbone = paddlex.cv.nets.MobileNetV3(
norm_type='sync_bn', models_name=models_name) norm_type='sync_bn', model_name=model_name)
return backbone return backbone
def build_net(self, mode='train'): def build_net(self, mode='train'):
......
...@@ -19,25 +19,6 @@ import cv2 ...@@ -19,25 +19,6 @@ import cv2
import scipy import scipy
def meet_emit_constraint(src_bbox, sample_bbox):
center_x = (src_bbox[2] + src_bbox[0]) / 2
center_y = (src_bbox[3] + src_bbox[1]) / 2
if center_x >= sample_bbox[0] and \
center_x <= sample_bbox[2] and \
center_y >= sample_bbox[1] and \
center_y <= sample_bbox[3]:
return True
return False
def clip_bbox(src_bbox):
src_bbox[0] = max(min(src_bbox[0], 1.0), 0.0)
src_bbox[1] = max(min(src_bbox[1], 1.0), 0.0)
src_bbox[2] = max(min(src_bbox[2], 1.0), 0.0)
src_bbox[3] = max(min(src_bbox[3], 1.0), 0.0)
return src_bbox
def bbox_area(src_bbox): def bbox_area(src_bbox):
if src_bbox[2] < src_bbox[0] or src_bbox[3] < src_bbox[1]: if src_bbox[2] < src_bbox[0] or src_bbox[3] < src_bbox[1]:
return 0. return 0.
...@@ -47,189 +28,6 @@ def bbox_area(src_bbox): ...@@ -47,189 +28,6 @@ def bbox_area(src_bbox):
return width * height return width * height
def is_overlap(object_bbox, sample_bbox):
if object_bbox[0] >= sample_bbox[2] or \
object_bbox[2] <= sample_bbox[0] or \
object_bbox[1] >= sample_bbox[3] or \
object_bbox[3] <= sample_bbox[1]:
return False
else:
return True
def filter_and_process(sample_bbox, bboxes, labels, scores=None):
new_bboxes = []
new_labels = []
new_scores = []
for i in range(len(bboxes)):
new_bbox = [0, 0, 0, 0]
obj_bbox = [bboxes[i][0], bboxes[i][1], bboxes[i][2], bboxes[i][3]]
if not meet_emit_constraint(obj_bbox, sample_bbox):
continue
if not is_overlap(obj_bbox, sample_bbox):
continue
sample_width = sample_bbox[2] - sample_bbox[0]
sample_height = sample_bbox[3] - sample_bbox[1]
new_bbox[0] = (obj_bbox[0] - sample_bbox[0]) / sample_width
new_bbox[1] = (obj_bbox[1] - sample_bbox[1]) / sample_height
new_bbox[2] = (obj_bbox[2] - sample_bbox[0]) / sample_width
new_bbox[3] = (obj_bbox[3] - sample_bbox[1]) / sample_height
new_bbox = clip_bbox(new_bbox)
if bbox_area(new_bbox) > 0:
new_bboxes.append(new_bbox)
new_labels.append([labels[i][0]])
if scores is not None:
new_scores.append([scores[i][0]])
bboxes = np.array(new_bboxes)
labels = np.array(new_labels)
scores = np.array(new_scores)
return bboxes, labels, scores
def bbox_area_sampling(bboxes, labels, scores, target_size, min_size):
new_bboxes = []
new_labels = []
new_scores = []
for i, bbox in enumerate(bboxes):
w = float((bbox[2] - bbox[0]) * target_size)
h = float((bbox[3] - bbox[1]) * target_size)
if w * h < float(min_size * min_size):
continue
else:
new_bboxes.append(bbox)
new_labels.append(labels[i])
if scores is not None and scores.size != 0:
new_scores.append(scores[i])
bboxes = np.array(new_bboxes)
labels = np.array(new_labels)
scores = np.array(new_scores)
return bboxes, labels, scores
def generate_sample_bbox(sampler):
scale = np.random.uniform(sampler[2], sampler[3])
aspect_ratio = np.random.uniform(sampler[4], sampler[5])
aspect_ratio = max(aspect_ratio, (scale**2.0))
aspect_ratio = min(aspect_ratio, 1 / (scale**2.0))
bbox_width = scale * (aspect_ratio**0.5)
bbox_height = scale / (aspect_ratio**0.5)
xmin_bound = 1 - bbox_width
ymin_bound = 1 - bbox_height
xmin = np.random.uniform(0, xmin_bound)
ymin = np.random.uniform(0, ymin_bound)
xmax = xmin + bbox_width
ymax = ymin + bbox_height
sampled_bbox = [xmin, ymin, xmax, ymax]
return sampled_bbox
def generate_sample_bbox_square(sampler, image_width, image_height):
scale = np.random.uniform(sampler[2], sampler[3])
aspect_ratio = np.random.uniform(sampler[4], sampler[5])
aspect_ratio = max(aspect_ratio, (scale**2.0))
aspect_ratio = min(aspect_ratio, 1 / (scale**2.0))
bbox_width = scale * (aspect_ratio**0.5)
bbox_height = scale / (aspect_ratio**0.5)
if image_height < image_width:
bbox_width = bbox_height * image_height / image_width
else:
bbox_height = bbox_width * image_width / image_height
xmin_bound = 1 - bbox_width
ymin_bound = 1 - bbox_height
xmin = np.random.uniform(0, xmin_bound)
ymin = np.random.uniform(0, ymin_bound)
xmax = xmin + bbox_width
ymax = ymin + bbox_height
sampled_bbox = [xmin, ymin, xmax, ymax]
return sampled_bbox
def data_anchor_sampling(bbox_labels, image_width, image_height, scale_array,
resize_width):
num_gt = len(bbox_labels)
# np.random.randint range: [low, high)
rand_idx = np.random.randint(0, num_gt) if num_gt != 0 else 0
if num_gt != 0:
norm_xmin = bbox_labels[rand_idx][0]
norm_ymin = bbox_labels[rand_idx][1]
norm_xmax = bbox_labels[rand_idx][2]
norm_ymax = bbox_labels[rand_idx][3]
xmin = norm_xmin * image_width
ymin = norm_ymin * image_height
wid = image_width * (norm_xmax - norm_xmin)
hei = image_height * (norm_ymax - norm_ymin)
range_size = 0
area = wid * hei
for scale_ind in range(0, len(scale_array) - 1):
if area > scale_array[scale_ind] ** 2 and area < \
scale_array[scale_ind + 1] ** 2:
range_size = scale_ind + 1
break
if area > scale_array[len(scale_array) - 2]**2:
range_size = len(scale_array) - 2
scale_choose = 0.0
if range_size == 0:
rand_idx_size = 0
else:
# np.random.randint range: [low, high)
rng_rand_size = np.random.randint(0, range_size + 1)
rand_idx_size = rng_rand_size % (range_size + 1)
if rand_idx_size == range_size:
min_resize_val = scale_array[rand_idx_size] / 2.0
max_resize_val = min(2.0 * scale_array[rand_idx_size],
2 * math.sqrt(wid * hei))
scale_choose = random.uniform(min_resize_val, max_resize_val)
else:
min_resize_val = scale_array[rand_idx_size] / 2.0
max_resize_val = 2.0 * scale_array[rand_idx_size]
scale_choose = random.uniform(min_resize_val, max_resize_val)
sample_bbox_size = wid * resize_width / scale_choose
w_off_orig = 0.0
h_off_orig = 0.0
if sample_bbox_size < max(image_height, image_width):
if wid <= sample_bbox_size:
w_off_orig = np.random.uniform(xmin + wid - sample_bbox_size,
xmin)
else:
w_off_orig = np.random.uniform(xmin,
xmin + wid - sample_bbox_size)
if hei <= sample_bbox_size:
h_off_orig = np.random.uniform(ymin + hei - sample_bbox_size,
ymin)
else:
h_off_orig = np.random.uniform(ymin,
ymin + hei - sample_bbox_size)
else:
w_off_orig = np.random.uniform(image_width - sample_bbox_size, 0.0)
h_off_orig = np.random.uniform(image_height - sample_bbox_size,
0.0)
w_off_orig = math.floor(w_off_orig)
h_off_orig = math.floor(h_off_orig)
# Figure out top left coordinates.
w_off = float(w_off_orig / image_width)
h_off = float(h_off_orig / image_height)
sampled_bbox = [
w_off, h_off, w_off + float(sample_bbox_size / image_width),
h_off + float(sample_bbox_size / image_height)
]
return sampled_bbox
else:
return 0
def jaccard_overlap(sample_bbox, object_bbox): def jaccard_overlap(sample_bbox, object_bbox):
if sample_bbox[0] >= object_bbox[2] or \ if sample_bbox[0] >= object_bbox[2] or \
sample_bbox[2] <= object_bbox[0] or \ sample_bbox[2] <= object_bbox[0] or \
...@@ -249,143 +47,143 @@ def jaccard_overlap(sample_bbox, object_bbox): ...@@ -249,143 +47,143 @@ def jaccard_overlap(sample_bbox, object_bbox):
return overlap return overlap
def intersect_bbox(bbox1, bbox2): def iou_matrix(a, b):
if bbox2[0] > bbox1[2] or bbox2[2] < bbox1[0] or \ tl_i = np.maximum(a[:, np.newaxis, :2], b[:, :2])
bbox2[1] > bbox1[3] or bbox2[3] < bbox1[1]: br_i = np.minimum(a[:, np.newaxis, 2:], b[:, 2:])
intersection_box = [0.0, 0.0, 0.0, 0.0]
else:
intersection_box = [
max(bbox1[0], bbox2[0]),
max(bbox1[1], bbox2[1]),
min(bbox1[2], bbox2[2]),
min(bbox1[3], bbox2[3])
]
return intersection_box
def bbox_coverage(bbox1, bbox2):
inter_box = intersect_bbox(bbox1, bbox2)
intersect_size = bbox_area(inter_box)
if intersect_size > 0:
bbox1_size = bbox_area(bbox1)
return intersect_size / bbox1_size
else:
return 0.
area_i = np.prod(br_i - tl_i, axis=2) * (tl_i < br_i).all(axis=2)
area_a = np.prod(a[:, 2:] - a[:, :2], axis=1)
area_b = np.prod(b[:, 2:] - b[:, :2], axis=1)
area_o = (area_a[:, np.newaxis] + area_b - area_i)
return area_i / (area_o + 1e-10)
def satisfy_sample_constraint(sampler,
sample_bbox, def crop_box_with_center_constraint(box, crop):
gt_bboxes, cropped_box = box.copy()
satisfy_all=False):
if sampler[6] == 0 and sampler[7] == 0: cropped_box[:, :2] = np.maximum(box[:, :2], crop[:2])
return True cropped_box[:, 2:] = np.minimum(box[:, 2:], crop[2:])
satisfied = [] cropped_box[:, :2] -= crop[:2]
for i in range(len(gt_bboxes)): cropped_box[:, 2:] -= crop[:2]
object_bbox = [
gt_bboxes[i][0], gt_bboxes[i][1], gt_bboxes[i][2], gt_bboxes[i][3] centers = (box[:, :2] + box[:, 2:]) / 2
] valid = np.logical_and(crop[:2] <= centers, centers < crop[2:]).all(axis=1)
overlap = jaccard_overlap(sample_bbox, object_bbox) valid = np.logical_and(
if sampler[6] != 0 and \ valid, (cropped_box[:, :2] < cropped_box[:, 2:]).all(axis=1))
overlap < sampler[6]:
satisfied.append(False) return cropped_box, np.where(valid)[0]
def is_poly(segm):
if not isinstance(segm, (list, dict)):
raise Exception("Invalid segm type: {}".format(type(segm)))
return isinstance(segm, list)
def crop_image(img, crop):
x1, y1, x2, y2 = crop
return img[y1:y2, x1:x2, :]
def crop_segms(segms, valid_ids, crop, height, width):
def _crop_poly(segm, crop):
xmin, ymin, xmax, ymax = crop
crop_coord = [xmin, ymin, xmin, ymax, xmax, ymax, xmax, ymin]
crop_p = np.array(crop_coord).reshape(4, 2)
crop_p = Polygon(crop_p)
crop_segm = list()
for poly in segm:
poly = np.array(poly).reshape(len(poly) // 2, 2)
polygon = Polygon(poly)
if not polygon.is_valid:
exterior = polygon.exterior
multi_lines = exterior.intersection(exterior)
polygons = shapely.ops.polygonize(multi_lines)
polygon = MultiPolygon(polygons)
multi_polygon = list()
if isinstance(polygon, MultiPolygon):
multi_polygon = copy.deepcopy(polygon)
else:
multi_polygon.append(copy.deepcopy(polygon))
for per_polygon in multi_polygon:
inter = per_polygon.intersection(crop_p)
if not inter:
continue continue
if sampler[7] != 0 and \ if isinstance(inter, (MultiPolygon, GeometryCollection)):
overlap > sampler[7]: for part in inter:
satisfied.append(False) if not isinstance(part, Polygon):
continue continue
satisfied.append(True) part = np.squeeze(
if not satisfy_all: np.array(part.exterior.coords[:-1]).reshape(1, -1))
return True part[0::2] -= xmin
part[1::2] -= ymin
if satisfy_all: crop_segm.append(part.tolist())
return np.all(satisfied) elif isinstance(inter, Polygon):
crop_poly = np.squeeze(
np.array(inter.exterior.coords[:-1]).reshape(1, -1))
crop_poly[0::2] -= xmin
crop_poly[1::2] -= ymin
crop_segm.append(crop_poly.tolist())
else: else:
return False continue
return crop_segm
def _crop_rle(rle, crop, height, width):
if 'counts' in rle and type(rle['counts']) == list:
rle = mask_util.frPyObjects(rle, height, width)
mask = mask_util.decode(rle)
mask = mask[crop[1]:crop[3], crop[0]:crop[2]]
rle = mask_util.encode(np.array(mask, order='F', dtype=np.uint8))
return rle
def satisfy_sample_constraint_coverage(sampler, sample_bbox, gt_bboxes): crop_segms = []
if sampler[6] == 0 and sampler[7] == 0: for id in valid_ids:
has_jaccard_overlap = False segm = segms[id]
if is_poly(segm):
import copy
import shapely.ops
import logging
from shapely.geometry import Polygon, MultiPolygon, GeometryCollection
logging.getLogger("shapely").setLevel(logging.WARNING)
# Polygon format
crop_segms.append(_crop_poly(segm, crop))
else: else:
has_jaccard_overlap = True # RLE format
if sampler[8] == 0 and sampler[9] == 0: import pycocotools.mask as mask_util
has_object_coverage = False crop_segms.append(_crop_rle(segm, crop, height, width))
return crop_segms
def expand_segms(segms, x, y, height, width, ratio):
def _expand_poly(poly, x, y):
expanded_poly = np.array(poly)
expanded_poly[0::2] += x
expanded_poly[1::2] += y
return expanded_poly.tolist()
def _expand_rle(rle, x, y, height, width, ratio):
if 'counts' in rle and type(rle['counts']) == list:
rle = mask_util.frPyObjects(rle, height, width)
mask = mask_util.decode(rle)
expanded_mask = np.full((int(height * ratio), int(width * ratio)),
0).astype(mask.dtype)
expanded_mask[y:y + height, x:x + width] = mask
rle = mask_util.encode(
np.array(expanded_mask, order='F', dtype=np.uint8))
return rle
expanded_segms = []
for segm in segms:
if is_poly(segm):
# Polygon format
expanded_segms.append([_expand_poly(poly, x, y) for poly in segm])
else: else:
has_object_coverage = True # RLE format
import pycocotools.mask as mask_util
if not has_jaccard_overlap and not has_object_coverage: expanded_segms.append(
return True _expand_rle(segm, x, y, height, width, ratio))
found = False return expanded_segms
for i in range(len(gt_bboxes)):
object_bbox = [
gt_bboxes[i][0], gt_bboxes[i][1], gt_bboxes[i][2], gt_bboxes[i][3]
]
if has_jaccard_overlap:
overlap = jaccard_overlap(sample_bbox, object_bbox)
if sampler[6] != 0 and \
overlap < sampler[6]:
continue
if sampler[7] != 0 and \
overlap > sampler[7]:
continue
found = True
if has_object_coverage:
object_coverage = bbox_coverage(object_bbox, sample_bbox)
if sampler[8] != 0 and \
object_coverage < sampler[8]:
continue
if sampler[9] != 0 and \
object_coverage > sampler[9]:
continue
found = True
if found:
return True
return found
def crop_image_sampling(img, sample_bbox, image_width, image_height,
target_size):
# no clipping here
xmin = int(sample_bbox[0] * image_width)
xmax = int(sample_bbox[2] * image_width)
ymin = int(sample_bbox[1] * image_height)
ymax = int(sample_bbox[3] * image_height)
w_off = xmin
h_off = ymin
width = xmax - xmin
height = ymax - ymin
cross_xmin = max(0.0, float(w_off))
cross_ymin = max(0.0, float(h_off))
cross_xmax = min(float(w_off + width - 1.0), float(image_width))
cross_ymax = min(float(h_off + height - 1.0), float(image_height))
cross_width = cross_xmax - cross_xmin
cross_height = cross_ymax - cross_ymin
roi_xmin = 0 if w_off >= 0 else abs(w_off)
roi_ymin = 0 if h_off >= 0 else abs(h_off)
roi_width = cross_width
roi_height = cross_height
roi_y1 = int(roi_ymin)
roi_y2 = int(roi_ymin + roi_height)
roi_x1 = int(roi_xmin)
roi_x2 = int(roi_xmin + roi_width)
cross_y1 = int(cross_ymin)
cross_y2 = int(cross_ymin + cross_height)
cross_x1 = int(cross_xmin)
cross_x2 = int(cross_xmin + cross_width)
sample_img = np.zeros((height, width, 3))
sample_img[roi_y1: roi_y2, roi_x1: roi_x2] = \
img[cross_y1: cross_y2, cross_x1: cross_x2]
sample_img = cv2.resize(
sample_img, (target_size, target_size), interpolation=cv2.INTER_AREA)
return sample_img
def box_horizontal_flip(bboxes, width): def box_horizontal_flip(bboxes, width):
...@@ -409,15 +207,10 @@ def segms_horizontal_flip(segms, height, width): ...@@ -409,15 +207,10 @@ def segms_horizontal_flip(segms, height, width):
if 'counts' in rle and type(rle['counts']) == list: if 'counts' in rle and type(rle['counts']) == list:
rle = mask_util.frPyObjects([rle], height, width) rle = mask_util.frPyObjects([rle], height, width)
mask = mask_util.decode(rle) mask = mask_util.decode(rle)
mask = mask[:, ::-1, :] mask = mask[:, ::-1]
rle = mask_util.encode(np.array(mask, order='F', dtype=np.uint8)) rle = mask_util.encode(np.array(mask, order='F', dtype=np.uint8))
return rle return rle
def is_poly(segm):
if not isinstance(segm, (list, dict)):
raise Exception("Invalid segm type: {}".format(type(segm)))
return isinstance(segm, list)
flipped_segms = [] flipped_segms = []
for segm in segms: for segm in segms:
if is_poly(segm): if is_poly(segm):
......
...@@ -385,15 +385,12 @@ class RandomDistort: ...@@ -385,15 +385,12 @@ class RandomDistort:
'saturation': self.saturation_prob, 'saturation': self.saturation_prob,
'hue': self.hue_prob, 'hue': self.hue_prob,
} }
im = im.astype('uint8')
im = Image.fromarray(im)
for id in range(len(ops)): for id in range(len(ops)):
params = params_dict[ops[id].__name__] params = params_dict[ops[id].__name__]
prob = prob_dict[ops[id].__name__] prob = prob_dict[ops[id].__name__]
params['im'] = im params['im'] = im
if np.random.uniform(0, 1) < prob: if np.random.uniform(0, 1) < prob:
im = ops[id](**params) im = ops[id](**params)
im = np.asarray(im).astype('float32')
if label is None: if label is None:
return (im, ) return (im, )
else: else:
......
...@@ -12,13 +12,20 @@ ...@@ -12,13 +12,20 @@
# See the License for the specific language governing permissions and # See the License for the specific language governing permissions and
# limitations under the License. # limitations under the License.
from .ops import * try:
from .box_utils import * from collections.abc import Sequence
except Exception:
from collections import Sequence
import random import random
import os.path as osp import os.path as osp
import numpy as np import numpy as np
from PIL import Image, ImageEnhance
import cv2 import cv2
from PIL import Image, ImageEnhance
from .ops import *
from .box_utils import *
class Compose: class Compose:
...@@ -534,15 +541,13 @@ class RandomDistort: ...@@ -534,15 +541,13 @@ class RandomDistort:
'saturation': self.saturation_prob, 'saturation': self.saturation_prob,
'hue': self.hue_prob 'hue': self.hue_prob
} }
im = im.astype('uint8')
im = Image.fromarray(im)
for id in range(4): for id in range(4):
params = params_dict[ops[id].__name__] params = params_dict[ops[id].__name__]
prob = prob_dict[ops[id].__name__] prob = prob_dict[ops[id].__name__]
params['im'] = im params['im'] = im
if np.random.uniform(0, 1) < prob: if np.random.uniform(0, 1) < prob:
im = ops[id](**params) im = ops[id](**params)
im = np.asarray(im).astype('float32')
if label_info is None: if label_info is None:
return (im, im_info) return (im, im_info)
else: else:
...@@ -591,7 +596,7 @@ class MixupImage: ...@@ -591,7 +596,7 @@ class MixupImage:
img1.astype('float32') * factor img1.astype('float32') * factor
img[:img2.shape[0], :img2.shape[1], :] += \ img[:img2.shape[0], :img2.shape[1], :] += \
img2.astype('float32') * (1.0 - factor) img2.astype('float32') * (1.0 - factor)
return img.astype('uint8') return img.astype('float32')
def __call__(self, im, im_info=None, label_info=None): def __call__(self, im, im_info=None, label_info=None):
""" """
...@@ -658,9 +663,17 @@ class MixupImage: ...@@ -658,9 +663,17 @@ class MixupImage:
gt_score2 = im_info['mixup'][2]['gt_score'] gt_score2 = im_info['mixup'][2]['gt_score']
gt_score = np.concatenate( gt_score = np.concatenate(
(gt_score1 * factor, gt_score2 * (1. - factor)), axis=0) (gt_score1 * factor, gt_score2 * (1. - factor)), axis=0)
if 'gt_poly' in label_info:
gt_poly1 = label_info['gt_poly']
gt_poly2 = im_info['mixup'][2]['gt_poly']
label_info['gt_poly'] = gt_poly1 + gt_poly2
is_crowd1 = label_info['is_crowd']
is_crowd2 = im_info['mixup'][2]['is_crowd']
is_crowd = np.concatenate((is_crowd1, is_crowd2), axis=0)
label_info['gt_bbox'] = gt_bbox label_info['gt_bbox'] = gt_bbox
label_info['gt_score'] = gt_score label_info['gt_score'] = gt_score
label_info['gt_class'] = gt_class label_info['gt_class'] = gt_class
label_info['is_crowd'] = is_crowd
im_info['augment_shape'] = np.array([im.shape[0], im_info['augment_shape'] = np.array([im.shape[0],
im.shape[1]]).astype('int32') im.shape[1]]).astype('int32')
im_info.pop('mixup') im_info.pop('mixup')
...@@ -672,23 +685,30 @@ class MixupImage: ...@@ -672,23 +685,30 @@ class MixupImage:
class RandomExpand: class RandomExpand:
"""随机扩张图像,模型训练时的数据增强操作。 """随机扩张图像,模型训练时的数据增强操作。
1. 随机选取扩张比例(扩张比例大于1时才进行扩张)。 1. 随机选取扩张比例(扩张比例大于1时才进行扩张)。
2. 计算扩张后图像大小。 2. 计算扩张后图像大小。
3. 初始化像素值为数据集均值的图像,并将原图像随机粘贴于该图像上。 3. 初始化像素值为输入填充值的图像,并将原图像随机粘贴于该图像上。
4. 根据原图像粘贴位置换算出扩张后真实标注框的位置坐标。 4. 根据原图像粘贴位置换算出扩张后真实标注框的位置坐标。
5. 根据原图像粘贴位置换算出扩张后真实分割区域的位置坐标。
Args: Args:
max_ratio (float): 图像扩张的最大比例。默认为4.0。 ratio (float): 图像扩张的最大比例。默认为4.0。
prob (float): 随机扩张的概率。默认为0.5。 prob (float): 随机扩张的概率。默认为0.5。
mean (list): 图像数据集的均值(0-255)。默认为[127.5, 127.5, 127.5]。 fill_value (list): 扩张图像的初始填充值(0-255)。默认为[123.675, 116.28, 103.53]。
""" """
def __init__(self, max_ratio=4., prob=0.5, mean=[127.5, 127.5, 127.5]): def __init__(self,
self.max_ratio = max_ratio ratio=4.,
self.mean = mean prob=0.5,
fill_value=[123.675, 116.28, 103.53]):
super(RandomExpand, self).__init__()
assert ratio > 1.01, "expand ratio must be larger than 1.01"
self.ratio = ratio
self.prob = prob self.prob = prob
assert isinstance(fill_value, Sequence), \
"fill value must be sequence"
if not isinstance(fill_value, tuple):
fill_value = tuple(fill_value)
self.fill_value = fill_value
def __call__(self, im, im_info=None, label_info=None): def __call__(self, im, im_info=None, label_info=None):
""" """
...@@ -696,7 +716,6 @@ class RandomExpand: ...@@ -696,7 +716,6 @@ class RandomExpand:
im (np.ndarray): 图像np.ndarray数据。 im (np.ndarray): 图像np.ndarray数据。
im_info (dict, 可选): 存储与图像相关的信息。 im_info (dict, 可选): 存储与图像相关的信息。
label_info (dict, 可选): 存储与标注框相关的信息。 label_info (dict, 可选): 存储与标注框相关的信息。
Returns: Returns:
tuple: 当label_info为空时,返回的tuple为(im, im_info),分别对应图像np.ndarray数据、存储与图像相关信息的字典; tuple: 当label_info为空时,返回的tuple为(im, im_info),分别对应图像np.ndarray数据、存储与图像相关信息的字典;
当label_info不为空时,返回的tuple为(im, im_info, label_info),分别对应图像np.ndarray数据、 当label_info不为空时,返回的tuple为(im, im_info, label_info),分别对应图像np.ndarray数据、
...@@ -708,7 +727,6 @@ class RandomExpand: ...@@ -708,7 +727,6 @@ class RandomExpand:
其中n代表真实标注框的个数。 其中n代表真实标注框的个数。
- gt_class (np.ndarray): 随机扩张后每个真实标注框对应的类别序号,形状为(n, 1), - gt_class (np.ndarray): 随机扩张后每个真实标注框对应的类别序号,形状为(n, 1),
其中n代表真实标注框的个数。 其中n代表真实标注框的个数。
Raises: Raises:
TypeError: 形参数据类型不满足需求。 TypeError: 形参数据类型不满足需求。
""" """
...@@ -723,108 +741,68 @@ class RandomExpand: ...@@ -723,108 +741,68 @@ class RandomExpand:
'gt_class' not in label_info: 'gt_class' not in label_info:
raise TypeError('Cannot do RandomExpand! ' + \ raise TypeError('Cannot do RandomExpand! ' + \
'Becasuse gt_bbox/gt_class is not in label_info!') 'Becasuse gt_bbox/gt_class is not in label_info!')
prob = np.random.uniform(0, 1) if np.random.uniform(0., 1.) < self.prob:
return (im, im_info, label_info)
augment_shape = im_info['augment_shape'] augment_shape = im_info['augment_shape']
im_width = augment_shape[1] height = int(augment_shape[0])
im_height = augment_shape[0] width = int(augment_shape[1])
gt_bbox = label_info['gt_bbox']
gt_class = label_info['gt_class']
if prob < self.prob: expand_ratio = np.random.uniform(1., self.ratio)
if self.max_ratio - 1 >= 0.01: h = int(height * expand_ratio)
expand_ratio = np.random.uniform(1, self.max_ratio) w = int(width * expand_ratio)
height = int(im_height * expand_ratio) if not h > height or not w > width:
width = int(im_width * expand_ratio)
h_off = math.floor(np.random.uniform(0, height - im_height))
w_off = math.floor(np.random.uniform(0, width - im_width))
expand_bbox = [
-w_off / im_width, -h_off / im_height,
(width - w_off) / im_width, (height - h_off) / im_height
]
expand_im = np.ones((height, width, 3))
expand_im = np.uint8(expand_im * np.squeeze(self.mean))
expand_im = Image.fromarray(expand_im)
im = im.astype('uint8')
im = Image.fromarray(im)
expand_im.paste(im, (int(w_off), int(h_off)))
expand_im = np.asarray(expand_im)
for i in range(gt_bbox.shape[0]):
gt_bbox[i][0] = gt_bbox[i][0] / im_width
gt_bbox[i][1] = gt_bbox[i][1] / im_height
gt_bbox[i][2] = gt_bbox[i][2] / im_width
gt_bbox[i][3] = gt_bbox[i][3] / im_height
gt_bbox, gt_class, _ = filter_and_process(
expand_bbox, gt_bbox, gt_class)
for i in range(gt_bbox.shape[0]):
gt_bbox[i][0] = gt_bbox[i][0] * width
gt_bbox[i][1] = gt_bbox[i][1] * height
gt_bbox[i][2] = gt_bbox[i][2] * width
gt_bbox[i][3] = gt_bbox[i][3] * height
im = expand_im.astype('float32')
label_info['gt_bbox'] = gt_bbox
label_info['gt_class'] = gt_class
im_info['augment_shape'] = np.array([height,
width]).astype('int32')
if label_info is None:
return (im, im_info)
else:
return (im, im_info, label_info) return (im, im_info, label_info)
y = np.random.randint(0, h - height)
x = np.random.randint(0, w - width)
canvas = np.ones((h, w, 3), dtype=np.float32)
canvas *= np.array(self.fill_value, dtype=np.float32)
canvas[y:y + height, x:x + width, :] = im
im_info['augment_shape'] = np.array([h, w]).astype('int32')
if 'gt_bbox' in label_info and len(label_info['gt_bbox']) > 0:
label_info['gt_bbox'] += np.array([x, y] * 2, dtype=np.float32)
if 'gt_poly' in label_info and len(label_info['gt_poly']) > 0:
label_info['gt_poly'] = expand_segms(label_info['gt_poly'], x, y,
height, width, expand_ratio)
return (canvas, im_info, label_info)
class RandomCrop: class RandomCrop:
"""随机裁剪图像。 """随机裁剪图像。
1. 若allow_no_crop为True,则在thresholds加入’no_crop’。
1. 根据batch_sampler计算获取裁剪候选区域的位置。 2. 随机打乱thresholds。
(1) 根据min scale、max scale、min aspect ratio、max aspect ratio计算随机剪裁的高、宽。 3. 遍历thresholds中各元素:
(2) 根据随机剪裁的高、宽随机选取剪裁的起始点。 (1) 如果当前thresh为’no_crop’,则返回原始图像和标注信息。
(3) 筛选出裁剪候选区域: (2) 随机取出aspect_ratio和scaling中的值并由此计算出候选裁剪区域的高、宽、起始点。
- 当satisfy_all为True时,需所有真实标注框与裁剪候选区域的重叠度满足需求时,该裁剪候选区域才可保留。 (3) 计算真实标注框与候选裁剪区域IoU,若全部真实标注框的IoU都小于thresh,则继续第3步。
- 当satisfy_all为False时,当有一个真实标注框与裁剪候选区域的重叠度满足需求时,该裁剪候选区域就可保留。 (4) 如果cover_all_box为True且存在真实标注框的IoU小于thresh,则继续第3步。
2. 遍历所有裁剪候选区域: (5) 筛选出位于候选裁剪区域内的真实标注框,若有效框的个数为0,则继续第3步,否则进行第4步。
(1) 若真实标注框与候选裁剪区域不重叠,或其中心点不在候选裁剪区域, 4. 换算有效真值标注框相对候选裁剪区域的位置坐标。
则将该真实标注框去除。 5. 换算有效分割区域相对候选裁剪区域的位置坐标。
(2) 计算相对于该候选裁剪区域,真实标注框的位置,并筛选出对应的类别、混合得分。
(3) 若avoid_no_bbox为False,返回当前裁剪后的信息即可;
反之,要找到一个裁剪区域中真实标注框个数不为0的区域,才返回裁剪后的信息。
Args: Args:
batch_sampler (list): 随机裁剪参数的多种组合,每种组合包含8个值,如下: aspect_ratio (list): 裁剪后短边缩放比例的取值范围,以[min, max]形式表示。默认值为[.5, 2.]。
- max sample (int):满足当前组合的裁剪区域的个数上限。 thresholds (list): 判断裁剪候选区域是否有效所需的IoU阈值取值列表。默认值为[.0, .1, .3, .5, .7, .9]。
- max trial (int): 查找满足当前组合的次数。 scaling (list): 裁剪面积相对原面积的取值范围,以[min, max]形式表示。默认值为[.3, 1.]。
- min scale (float): 裁剪面积相对原面积,每条边缩短比例的最小限制。 num_attempts (int): 在放弃寻找有效裁剪区域前尝试的次数。默认值为50。
- max scale (float): 裁剪面积相对原面积,每条边缩短比例的最大限制。 allow_no_crop (bool): 是否允许未进行裁剪。默认值为True。
- min aspect ratio (float): 裁剪后短边缩放比例的最小限制。 cover_all_box (bool): 是否要求所有的真实标注框都必须在裁剪区域内。默认值为False。
- max aspect ratio (float): 裁剪后短边缩放比例的最大限制。
- min overlap (float): 真实标注框与裁剪图像重叠面积的最小限制。
- max overlap (float): 真实标注框与裁剪图像重叠面积的最大限制。
默认值为None,当为None时采用如下设置:
[[1, 1, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0],
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.1, 1.0],
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.3, 1.0],
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.5, 1.0],
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.7, 1.0],
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.9, 1.0],
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.0, 1.0]]
satisfy_all (bool): 是否需要所有标注框满足条件,裁剪候选区域才保留。默认为False。
avoid_no_bbox (bool): 是否对裁剪图像不存在标注框的图像进行保留。默认为True。
""" """
def __init__(self, def __init__(self,
batch_sampler=None, aspect_ratio=[.5, 2.],
satisfy_all=False, thresholds=[.0, .1, .3, .5, .7, .9],
avoid_no_bbox=True): scaling=[.3, 1.],
if batch_sampler is None: num_attempts=50,
batch_sampler = [[1, 1, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0], allow_no_crop=True,
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.1, 1.0], cover_all_box=False):
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.3, 1.0], self.aspect_ratio = aspect_ratio
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.5, 1.0], self.thresholds = thresholds
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.7, 1.0], self.scaling = scaling
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.9, 1.0], self.num_attempts = num_attempts
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.0, 1.0]] self.allow_no_crop = allow_no_crop
self.batch_sampler = batch_sampler self.cover_all_box = cover_all_box
self.satisfy_all = satisfy_all
self.avoid_no_bbox = avoid_no_bbox
def __call__(self, im, im_info=None, label_info=None): def __call__(self, im, im_info=None, label_info=None):
""" """
...@@ -859,65 +837,83 @@ class RandomCrop: ...@@ -859,65 +837,83 @@ class RandomCrop:
'gt_class' not in label_info: 'gt_class' not in label_info:
raise TypeError('Cannot do RandomCrop! ' + \ raise TypeError('Cannot do RandomCrop! ' + \
'Becasuse gt_bbox/gt_class is not in label_info!') 'Becasuse gt_bbox/gt_class is not in label_info!')
if len(label_info['gt_bbox']) == 0:
return (im, im_info, label_info)
augment_shape = im_info['augment_shape'] augment_shape = im_info['augment_shape']
im_width = augment_shape[1] w = augment_shape[1]
im_height = augment_shape[0] h = augment_shape[0]
gt_bbox = label_info['gt_bbox'] gt_bbox = label_info['gt_bbox']
gt_bbox_tmp = gt_bbox.copy() thresholds = list(self.thresholds)
for i in range(gt_bbox_tmp.shape[0]): if self.allow_no_crop:
gt_bbox_tmp[i][0] = gt_bbox[i][0] / im_width thresholds.append('no_crop')
gt_bbox_tmp[i][1] = gt_bbox[i][1] / im_height np.random.shuffle(thresholds)
gt_bbox_tmp[i][2] = gt_bbox[i][2] / im_width
gt_bbox_tmp[i][3] = gt_bbox[i][3] / im_height
gt_class = label_info['gt_class']
gt_score = None for thresh in thresholds:
if 'gt_score' in label_info: if thresh == 'no_crop':
gt_score = label_info['gt_score'] return (im, im_info, label_info)
sampled_bbox = []
gt_bbox_tmp = gt_bbox_tmp.tolist() found = False
for sampler in self.batch_sampler: for i in range(self.num_attempts):
found = 0 scale = np.random.uniform(*self.scaling)
for i in range(sampler[1]): min_ar, max_ar = self.aspect_ratio
if found >= sampler[0]: aspect_ratio = np.random.uniform(
break max(min_ar, scale**2), min(max_ar, scale**-2))
sample_bbox = generate_sample_bbox(sampler) crop_h = int(h * scale / np.sqrt(aspect_ratio))
if satisfy_sample_constraint(sampler, sample_bbox, gt_bbox_tmp, crop_w = int(w * scale * np.sqrt(aspect_ratio))
self.satisfy_all): crop_y = np.random.randint(0, h - crop_h)
sampled_bbox.append(sample_bbox) crop_x = np.random.randint(0, w - crop_w)
found = found + 1 crop_box = [crop_x, crop_y, crop_x + crop_w, crop_y + crop_h]
im = np.array(im) iou = iou_matrix(gt_bbox, np.array([crop_box],
while sampled_bbox: dtype=np.float32))
idx = int(np.random.uniform(0, len(sampled_bbox))) if iou.max() < thresh:
sample_bbox = sampled_bbox.pop(idx)
sample_bbox = clip_bbox(sample_bbox)
crop_bbox, crop_class, crop_score = \
filter_and_process(sample_bbox, gt_bbox_tmp, gt_class, gt_score)
if self.avoid_no_bbox:
if len(crop_bbox) < 1:
continue continue
xmin = int(sample_bbox[0] * im_width)
xmax = int(sample_bbox[2] * im_width) if self.cover_all_box and iou.min() < thresh:
ymin = int(sample_bbox[1] * im_height) continue
ymax = int(sample_bbox[3] * im_height)
im = im[ymin:ymax, xmin:xmax] cropped_box, valid_ids = crop_box_with_center_constraint(
for i in range(crop_bbox.shape[0]): gt_bbox, np.array(crop_box, dtype=np.float32))
crop_bbox[i][0] = crop_bbox[i][0] * (xmax - xmin) if valid_ids.size > 0:
crop_bbox[i][1] = crop_bbox[i][1] * (ymax - ymin) found = True
crop_bbox[i][2] = crop_bbox[i][2] * (xmax - xmin) break
crop_bbox[i][3] = crop_bbox[i][3] * (ymax - ymin)
label_info['gt_bbox'] = crop_bbox if found:
label_info['gt_class'] = crop_class if 'gt_poly' in label_info and len(label_info['gt_poly']) > 0:
label_info['gt_score'] = crop_score crop_polys = crop_segms(label_info['gt_poly'], valid_ids,
im_info['augment_shape'] = np.array([ymax - ymin, np.array(crop_box, dtype=np.int64),
xmax - xmin]).astype('int32') h, w)
if label_info is None: if [] in crop_polys:
return (im, im_info) delete_id = list()
valid_polys = list()
for id, crop_poly in enumerate(crop_polys):
if crop_poly == []:
delete_id.append(id)
else: else:
valid_polys.append(crop_poly)
valid_ids = np.delete(valid_ids, delete_id)
if len(valid_polys) == 0:
return (im, im_info, label_info) return (im, im_info, label_info)
if label_info is None: label_info['gt_poly'] = valid_polys
return (im, im_info)
else: else:
label_info['gt_poly'] = crop_polys
im = crop_image(im, crop_box)
label_info['gt_bbox'] = np.take(cropped_box, valid_ids, axis=0)
label_info['gt_class'] = np.take(
label_info['gt_class'], valid_ids, axis=0)
im_info['augment_shape'] = np.array(
[crop_box[3] - crop_box[1],
crop_box[2] - crop_box[0]]).astype('int32')
if 'gt_score' in label_info:
label_info['gt_score'] = np.take(
label_info['gt_score'], valid_ids, axis=0)
if 'is_crowd' in label_info:
label_info['is_crowd'] = np.take(
label_info['is_crowd'], valid_ids, axis=0)
return (im, im_info, label_info)
return (im, im_info, label_info) return (im, im_info, label_info)
......
...@@ -111,32 +111,41 @@ def bgr2rgb(im): ...@@ -111,32 +111,41 @@ def bgr2rgb(im):
return im[:, :, ::-1] return im[:, :, ::-1]
def brightness(im, brightness_lower, brightness_upper): def hue(im, hue_lower, hue_upper):
brightness_delta = np.random.uniform(brightness_lower, brightness_upper) delta = np.random.uniform(hue_lower, hue_upper)
im = ImageEnhance.Brightness(im).enhance(brightness_delta) u = np.cos(delta * np.pi)
w = np.sin(delta * np.pi)
bt = np.array([[1.0, 0.0, 0.0], [0.0, u, -w], [0.0, w, u]])
tyiq = np.array([[0.299, 0.587, 0.114], [0.596, -0.274, -0.321],
[0.211, -0.523, 0.311]])
ityiq = np.array([[1.0, 0.956, 0.621], [1.0, -0.272, -0.647],
[1.0, -1.107, 1.705]])
t = np.dot(np.dot(ityiq, bt), tyiq).T
im = np.dot(im, t)
return im return im
def contrast(im, contrast_lower, contrast_upper): def saturation(im, saturation_lower, saturation_upper):
contrast_delta = np.random.uniform(contrast_lower, contrast_upper) delta = np.random.uniform(saturation_lower, saturation_upper)
im = ImageEnhance.Contrast(im).enhance(contrast_delta) gray = im * np.array([[[0.299, 0.587, 0.114]]], dtype=np.float32)
gray = gray.sum(axis=2, keepdims=True)
gray *= (1.0 - delta)
im *= delta
im += gray
return im return im
def saturation(im, saturation_lower, saturation_upper): def contrast(im, contrast_lower, contrast_upper):
saturation_delta = np.random.uniform(saturation_lower, saturation_upper) delta = np.random.uniform(contrast_lower, contrast_upper)
im = ImageEnhance.Color(im).enhance(saturation_delta) im *= delta
return im return im
def hue(im, hue_lower, hue_upper): def brightness(im, brightness_lower, brightness_upper):
hue_delta = np.random.uniform(hue_lower, hue_upper) delta = np.random.uniform(brightness_lower, brightness_upper)
im = np.array(im.convert('HSV')) im += delta
im[:, :, 0] = im[:, :, 0] + hue_delta
im = Image.fromarray(im, mode='HSV').convert('RGB')
return im return im
def rotate(im, rotate_lower, rotate_upper): def rotate(im, rotate_lower, rotate_upper):
rotate_delta = np.random.uniform(rotate_lower, rotate_upper) rotate_delta = np.random.uniform(rotate_lower, rotate_upper)
im = im.rotate(int(rotate_delta)) im = im.rotate(int(rotate_delta))
......
...@@ -889,15 +889,12 @@ class RandomDistort: ...@@ -889,15 +889,12 @@ class RandomDistort:
'saturation': self.saturation_prob, 'saturation': self.saturation_prob,
'hue': self.hue_prob 'hue': self.hue_prob
} }
im = im.astype('uint8')
im = Image.fromarray(im)
for id in range(4): for id in range(4):
params = params_dict[ops[id].__name__] params = params_dict[ops[id].__name__]
prob = prob_dict[ops[id].__name__] prob = prob_dict[ops[id].__name__]
params['im'] = im params['im'] = im
if np.random.uniform(0, 1) < prob: if np.random.uniform(0, 1) < prob:
im = ops[id](**params) im = ops[id](**params)
im = np.asarray(im).astype('float32')
if label is None: if label is None:
return (im, im_info) return (im, im_info)
else: else:
......
...@@ -20,3 +20,4 @@ YOLOv3 = cv.models.YOLOv3 ...@@ -20,3 +20,4 @@ YOLOv3 = cv.models.YOLOv3
MaskRCNN = cv.models.MaskRCNN MaskRCNN = cv.models.MaskRCNN
transforms = cv.transforms.det_transforms transforms = cv.transforms.det_transforms
visualize = cv.models.utils.visualize.visualize_detection visualize = cv.models.utils.visualize.visualize_detection
draw_pr_curve = cv.models.utils.visualize.draw_pr_curve
...@@ -6,3 +6,4 @@ cython ...@@ -6,3 +6,4 @@ cython
pycocotools pycocotools
visualdl=1.3.0 visualdl=1.3.0
paddleslim=1.0.1 paddleslim=1.0.1
shapely
...@@ -19,7 +19,7 @@ long_description = "PaddleX. A end-to-end deeplearning model development toolkit ...@@ -19,7 +19,7 @@ long_description = "PaddleX. A end-to-end deeplearning model development toolkit
setuptools.setup( setuptools.setup(
name="paddlex", name="paddlex",
version='0.1.5', version='0.1.6',
author="paddlex", author="paddlex",
author_email="paddlex@baidu.com", author_email="paddlex@baidu.com",
description=long_description, description=long_description,
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册