PaddlePaddle / PaddleX, commit 1d2e74d9

Authored May 06, 2020 by FlyingQianMM

randomcrop and randomexpand for mask

Parent: cbf72fca

Showing 4 changed files with 325 additions and 544 deletions (+325 -544):

- docs/apis/transforms/det_transforms.md (+25 -39)
- docs/deploy.md (+2 -2)
- paddlex/cv/transforms/box_utils.py (+134 -341)
- paddlex/cv/transforms/det_transforms.py (+164 -162)
docs/apis/transforms/det_transforms.md (+25 -39)

@@ -122,56 +122,42 @@ paddlex.det.transforms.MixupImage(alpha=1.5, beta=1.5, mixup_epoch=-1)

## RandomExpand

Before:

```python
paddlex.det.transforms.RandomExpand(max_ratio=4., prob=0.5, mean=[127.5, 127.5, 127.5])
```

After:

```python
paddlex.det.transforms.RandomExpand(ratio=4., prob=0.5, fill_value=[123.675, 116.28, 103.53])
```

Randomly expands the image; a data augmentation operation used during model training.

1. Randomly pick an expansion ratio (the image is expanded only when the ratio is greater than 1).
2. Compute the size of the expanded image.
3. Initialize an image whose pixels are set to the given fill value (previously: the dataset mean) and paste the original image onto it at a random position.
4. Convert the ground-truth bounding boxes to their coordinates in the expanded image according to the paste position.
5. (new) Convert the ground-truth segmentation regions to their coordinates in the expanded image according to the paste position.

### Parameters

* **ratio** (float): maximum expansion ratio. Defaults to 4.0. (Renamed from **max_ratio**.)
* **prob** (float): probability of applying the expansion. Defaults to 0.5.
* **fill_value** (list): initial fill value (0-255) of the expanded image. Defaults to [123.675, 116.28, 103.53]. (Replaces **mean**, the dataset mean, which defaulted to [127.5, 127.5, 127.5].)
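A construction sketch for the updated interface. The argument names and defaults come from the block above; the `paddlex.det.transforms` import path is the one already used on this page.

```python
import paddlex

# Sketch: building the updated RandomExpand with the new argument names.
expand = paddlex.det.transforms.RandomExpand(
    ratio=4.,                                # expand to at most 4x the input size
    prob=0.5,                                # probability of applying the expansion
    fill_value=[123.675, 116.28, 103.53])    # per-channel fill for the new canvas
```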
## RandomCrop

Before:

```python
paddlex.det.transforms.RandomCrop(batch_sampler=None, satisfy_all=False, avoid_no_bbox=True)
```

After:

```python
paddlex.det.transforms.RandomCrop(aspect_ratio=[.5, 2.], thresholds=[.0, .1, .3, .5, .7, .9], scaling=[.3, 1.], num_attempts=50, allow_no_crop=True, cover_all_box=False)
```

Randomly crops the image; a data augmentation operation used during model training.

New procedure:

1. If allow_no_crop is True, append 'no_crop' to thresholds.
2. Shuffle thresholds randomly.
3. For each element thresh of thresholds:
   (1) If the current thresh is 'no_crop', return the original image and annotations.
   (2) Randomly draw values from aspect_ratio and scaling and use them to compute the height, width and origin of a candidate crop.
   (3) Compute the IoU between the ground-truth boxes and the candidate crop; if the IoUs of all ground-truth boxes are below thresh, continue with step 3.
   (4) If cover_all_box is True and any ground-truth box has an IoU below thresh, continue with step 3.
   (5) Keep the ground-truth boxes that lie inside the candidate crop; if no valid box remains, continue with step 3, otherwise go to step 4.
4. Convert the valid ground-truth boxes to coordinates relative to the candidate crop.
5. Convert the valid segmentation regions to coordinates relative to the candidate crop.

Old procedure:

1. Compute candidate crop regions according to batch_sampler:
   (1) Compute the crop height and width from min scale, max scale, min aspect ratio and max aspect ratio.
   (2) Randomly pick the crop origin for the chosen crop height and width.
   (3) Filter the candidate crops:
       - when satisfy_all is True, a candidate crop is kept only if every ground-truth box satisfies the overlap requirement;
       - when satisfy_all is False, a candidate crop is kept as soon as one ground-truth box satisfies the overlap requirement.
2. Iterate over the candidate crops:
   (1) Remove a ground-truth box if it does not overlap the candidate crop or its center is outside the candidate crop.
   (2) Compute the box positions relative to the candidate crop and filter the corresponding classes and mixup scores.
   (3) If avoid_no_bbox is False, return the cropped result directly; otherwise keep searching until a crop containing at least one ground-truth box is found.

### Parameters

New:

* **aspect_ratio** (list): value range of the crop aspect ratio (scaling of the shorter side), in [min, max] form. Defaults to [.5, 2.].
* **thresholds** (list): list of IoU thresholds used to decide whether a candidate crop is valid. Defaults to [.0, .1, .3, .5, .7, .9].
* **scaling** (list): value range of the crop area relative to the original area, in [min, max] form. Defaults to [.3, 1.].
* **num_attempts** (int): number of attempts before giving up on finding a valid crop. Defaults to 50.
* **allow_no_crop** (bool): whether returning the image without cropping is allowed. Defaults to True.
* **cover_all_box** (bool): whether all ground-truth boxes must be inside the crop region. Defaults to False.

Removed:

* **batch_sampler** (list): multiple combinations of random-crop parameters, each combination holding 8 values:
  - max sample (int): upper bound on the number of crop regions satisfying this combination.
  - max trial (int): number of attempts at finding a crop for this combination.
  - min scale (float): lower bound on the per-edge shrink ratio of the crop area relative to the original area.
  - max scale (float): upper bound on the per-edge shrink ratio of the crop area relative to the original area.
  - min aspect ratio (float): lower bound on the crop's short-side scaling ratio.
  - max aspect ratio (float): upper bound on the crop's short-side scaling ratio.
  - min overlap (float): lower bound on the overlap between a ground-truth box and the cropped image.
  - max overlap (float): upper bound on the overlap between a ground-truth box and the cropped image.
  Defaulted to None; when None, the following settings were used:
  [[1, 1, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0],
   [1, 50, 0.3, 1.0, 0.5, 2.0, 0.1, 1.0],
   [1, 50, 0.3, 1.0, 0.5, 2.0, 0.3, 1.0],
   [1, 50, 0.3, 1.0, 0.5, 2.0, 0.5, 1.0],
   [1, 50, 0.3, 1.0, 0.5, 2.0, 0.7, 1.0],
   [1, 50, 0.3, 1.0, 0.5, 2.0, 0.9, 1.0],
   [1, 50, 0.3, 1.0, 0.5, 2.0, 0.0, 1.0]]
* **satisfy_all** (bool): whether every annotation box had to satisfy the condition before a candidate crop was kept. Defaulted to False.
* **avoid_no_bbox** (bool): whether to keep cropped images that contain no annotation box. Defaulted to True.
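A minimal pipeline sketch combining both updated transforms. The arguments are the new defaults from this commit; wrapping them in `Compose` is an assumption based on the `Compose` class that appears in det_transforms.py further down this page, and the rest of the training setup is omitted.

```python
import paddlex

# Sketch: chaining the updated augmentations for detection / instance
# segmentation training (Compose usage is assumed, see lead-in above).
train_transforms = paddlex.det.transforms.Compose([
    paddlex.det.transforms.RandomExpand(
        ratio=4., prob=0.5, fill_value=[123.675, 116.28, 103.53]),
    paddlex.det.transforms.RandomCrop(
        aspect_ratio=[.5, 2.],
        thresholds=[.0, .1, .3, .5, .7, .9],
        scaling=[.3, 1.],
        num_attempts=50,
        allow_no_crop=True,
        cover_all_box=False),
])
```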
docs/deploy.md (+2 -2)

@@ -4,10 +4,10 @@

A model deployed on the server side must first be exported in the inference format. The exported model consists of three files, `__model__`, `__params__` and `model.yml`, which hold the network structure, the model weights, and the model configuration (including data preprocessing parameters and so on). After installing PaddleX, run the following command in a terminal to export the model into the `inference_model` directory under the current directory.

Before:

> To test the workflow in this document, you can directly download the garbage detection model [garbage_epoch_12.tar.gz](https://bj.bcebos.com/paddlex/models/garbage_epoch_12.tar.gz)

After:

> To test the workflow in this document, you can directly download the xiaoduxiong detection model [xiaoduxiong_epoch_12.tar.gz](https://bj.bcebos.com/paddlex/models/xiaoduxiong_epoch_12.tar.gz)

Before:

```
paddlex --export_inference --model_dir=./garbage_epoch_12 --save_dir=./inference_model
```

After:

```
paddlex --export_inference --model_dir=./xiaoduxiong_epoch_12 --save_dir=./inference_model
```

## The C++ and Python deployment solutions for models are expected within about a week...
paddlex/cv/transforms/box_utils.py (+134 -341)

@@ -19,25 +19,6 @@ import cv2

import scipy

Removed:

```python
def meet_emit_constraint(src_bbox, sample_bbox):
    center_x = (src_bbox[2] + src_bbox[0]) / 2
    center_y = (src_bbox[3] + src_bbox[1]) / 2
    if center_x >= sample_bbox[0] and \
            center_x <= sample_bbox[2] and \
            center_y >= sample_bbox[1] and \
            center_y <= sample_bbox[3]:
        return True
    return False


def clip_bbox(src_bbox):
    src_bbox[0] = max(min(src_bbox[0], 1.0), 0.0)
    src_bbox[1] = max(min(src_bbox[1], 1.0), 0.0)
    src_bbox[2] = max(min(src_bbox[2], 1.0), 0.0)
    src_bbox[3] = max(min(src_bbox[3], 1.0), 0.0)
    return src_bbox
```

Unchanged context:

```python
def bbox_area(src_bbox):
    if src_bbox[2] < src_bbox[0] or src_bbox[3] < src_bbox[1]:
        return 0.
```
@@ -47,189 +28,6 @@ def bbox_area(src_bbox):

    return width * height

Removed, together with the old batch_sampler-based sampling they supported:

```python
def is_overlap(object_bbox, sample_bbox):
    if object_bbox[0] >= sample_bbox[2] or \
            object_bbox[2] <= sample_bbox[0] or \
            object_bbox[1] >= sample_bbox[3] or \
            object_bbox[3] <= sample_bbox[1]:
        return False
    else:
        return True


def filter_and_process(sample_bbox, bboxes, labels, scores=None):
    new_bboxes = []
    new_labels = []
    new_scores = []
    for i in range(len(bboxes)):
        new_bbox = [0, 0, 0, 0]
        obj_bbox = [bboxes[i][0], bboxes[i][1], bboxes[i][2], bboxes[i][3]]
        if not meet_emit_constraint(obj_bbox, sample_bbox):
            continue
        if not is_overlap(obj_bbox, sample_bbox):
            continue
        sample_width = sample_bbox[2] - sample_bbox[0]
        sample_height = sample_bbox[3] - sample_bbox[1]
        new_bbox[0] = (obj_bbox[0] - sample_bbox[0]) / sample_width
        new_bbox[1] = (obj_bbox[1] - sample_bbox[1]) / sample_height
        new_bbox[2] = (obj_bbox[2] - sample_bbox[0]) / sample_width
        new_bbox[3] = (obj_bbox[3] - sample_bbox[1]) / sample_height
        new_bbox = clip_bbox(new_bbox)
        if bbox_area(new_bbox) > 0:
            new_bboxes.append(new_bbox)
            new_labels.append([labels[i][0]])
            if scores is not None:
                new_scores.append([scores[i][0]])
    bboxes = np.array(new_bboxes)
    labels = np.array(new_labels)
    scores = np.array(new_scores)
    return bboxes, labels, scores
```

Also removed from this hunk: `bbox_area_sampling(bboxes, labels, scores, target_size, min_size)` (drops boxes whose area at the target size falls below min_size squared), `generate_sample_bbox(sampler)` (draws a normalized candidate crop from the sampler's scale and aspect-ratio ranges), `generate_sample_bbox_square(sampler, image_width, image_height)` (the square-crop variant), and `data_anchor_sampling(bbox_labels, image_width, image_height, scale_array, resize_width)` (picks a random ground-truth box and a matching scale from scale_array to build the candidate crop).

Unchanged context:

```python
def jaccard_overlap(sample_bbox, object_bbox):
    if sample_bbox[0] >= object_bbox[2] or \
            sample_bbox[2] <= object_bbox[0] or \
```
@@ -249,143 +47,143 @@ def jaccard_overlap(sample_bbox, object_bbox):

    return overlap

Removed: `intersect_bbox(bbox1, bbox2)` (intersection box of two boxes), `bbox_coverage(bbox1, bbox2)` (fraction of bbox1 covered by the intersection), `satisfy_sample_constraint(sampler, sample_bbox, gt_bboxes, satisfy_all=False)` and `satisfy_sample_constraint_coverage(sampler, sample_bbox, gt_bboxes)` (the overlap/coverage checks of a candidate crop against the sampler's min/max constraints), and `crop_image_sampling(img, sample_bbox, image_width, image_height, target_size)` (pixel cropping with padding and a resize to target_size).

Added, a vectorised replacement that also handles segmentation annotations:

```python
def iou_matrix(a, b):
    tl_i = np.maximum(a[:, np.newaxis, :2], b[:, :2])
    br_i = np.minimum(a[:, np.newaxis, 2:], b[:, 2:])
    area_i = np.prod(br_i - tl_i, axis=2) * (tl_i < br_i).all(axis=2)
    area_a = np.prod(a[:, 2:] - a[:, :2], axis=1)
    area_b = np.prod(b[:, 2:] - b[:, :2], axis=1)
    area_o = (area_a[:, np.newaxis] + area_b - area_i)
    return area_i / (area_o + 1e-10)


def crop_box_with_center_constraint(box, crop):
    cropped_box = box.copy()
    cropped_box[:, :2] = np.maximum(box[:, :2], crop[:2])
    cropped_box[:, 2:] = np.minimum(box[:, 2:], crop[2:])
    cropped_box[:, :2] -= crop[:2]
    cropped_box[:, 2:] -= crop[:2]
    centers = (box[:, :2] + box[:, 2:]) / 2
    valid = np.logical_and(crop[:2] <= centers, centers < crop[2:]).all(axis=1)
    valid = np.logical_and(
        valid, (cropped_box[:, :2] < cropped_box[:, 2:]).all(axis=1))
    return cropped_box, np.where(valid)[0]


def crop_image(img, crop):
    x1, y1, x2, y2 = crop
    return img[y1:y2, x1:x2, :]


def is_poly(segm):
    if not isinstance(segm, (list, dict)):
        raise Exception("Invalid segm type: {}".format(type(segm)))
    return isinstance(segm, list)


def crop_segms(segms, valid_ids, crop, height, width):
    def _crop_poly(segm, crop):
        xmin, ymin, xmax, ymax = crop
        crop_coord = [xmin, ymin, xmin, ymax, xmax, ymax, xmax, ymin]
        crop_p = np.array(crop_coord).reshape(4, 2)
        crop_p = Polygon(crop_p)

        crop_segm = list()
        for poly in segm:
            poly = np.array(poly).reshape(len(poly) // 2, 2)
            polygon = Polygon(poly)
            if not polygon.is_valid:
                exterior = polygon.exterior
                multi_lines = exterior.intersection(exterior)
                polygons = shapely.ops.polygonize(multi_lines)
                polygon = MultiPolygon(polygons)
            multi_polygon = list()
            if isinstance(polygon, MultiPolygon):
                multi_polygon = copy.deepcopy(polygon)
            else:
                multi_polygon.append(copy.deepcopy(polygon))
            for per_polygon in multi_polygon:
                inter = per_polygon.intersection(crop_p)
                if not inter:
                    continue
                if isinstance(inter, (MultiPolygon, GeometryCollection)):
                    for part in inter:
                        if not isinstance(part, Polygon):
                            continue
                        part = np.squeeze(
                            np.array(part.exterior.coords[:-1]).reshape(1, -1))
                        part[0::2] -= xmin
                        part[1::2] -= ymin
                        crop_segm.append(part.tolist())
                elif isinstance(inter, Polygon):
                    crop_poly = np.squeeze(
                        np.array(inter.exterior.coords[:-1]).reshape(1, -1))
                    crop_poly[0::2] -= xmin
                    crop_poly[1::2] -= ymin
                    crop_segm.append(crop_poly.tolist())
                else:
                    continue
        return crop_segm

    def _crop_rle(rle, crop, height, width):
        if 'counts' in rle and type(rle['counts']) == list:
            rle = mask_util.frPyObjects(rle, height, width)
        mask = mask_util.decode(rle)
        mask = mask[crop[1]:crop[3], crop[0]:crop[2]]
        rle = mask_util.encode(np.array(mask, order='F', dtype=np.uint8))
        return rle

    crop_segms = []
    for id in valid_ids:
        segm = segms[id]
        if is_poly(segm):
            import copy
            import shapely.ops
            import logging
            from shapely.geometry import Polygon, MultiPolygon, GeometryCollection
            logging.getLogger("shapely").setLevel(logging.WARNING)
            # Polygon format
            crop_segms.append(_crop_poly(segm, crop))
        else:
            # RLE format
            import pycocotools.mask as mask_util
            crop_segms.append(_crop_rle(segm, crop, height, width))
    return crop_segms


def expand_segms(segms, x, y, height, width, ratio):
    def _expand_poly(poly, x, y):
        expanded_poly = np.array(poly)
        expanded_poly[0::2] += x
        expanded_poly[1::2] += y
        return expanded_poly.tolist()

    def _expand_rle(rle, x, y, height, width, ratio):
        if 'counts' in rle and type(rle['counts']) == list:
            rle = mask_util.frPyObjects(rle, height, width)
        mask = mask_util.decode(rle)
        expanded_mask = np.full((int(height * ratio), int(width * ratio)),
                                0).astype(mask.dtype)
        expanded_mask[y:y + height, x:x + width] = mask
        rle = mask_util.encode(
            np.array(expanded_mask, order='F', dtype=np.uint8))
        return rle

    expanded_segms = []
    for segm in segms:
        if is_poly(segm):
            # Polygon format
            expanded_segms.append([_expand_poly(poly, x, y) for poly in segm])
        else:
            # RLE format
            import pycocotools.mask as mask_util
            expanded_segms.append(
                _expand_rle(segm, x, y, height, width, ratio))
    return expanded_segms
```

Unchanged context:

```python
def box_horizontal_flip(bboxes, width):
```
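A small sanity-check sketch for the two new box helpers. The toy boxes and the import path (taken from the module's file path above) are illustrative assumptions, not part of the commit.

```python
import numpy as np
from paddlex.cv.transforms.box_utils import (iou_matrix,
                                             crop_box_with_center_constraint)

# Toy data: two ground-truth boxes and one candidate crop, in pixel coordinates.
gt = np.array([[10., 10., 50., 50.],
               [60., 60., 90., 90.]], dtype=np.float32)
crop = np.array([0., 0., 55., 55.], dtype=np.float32)

# IoU of every ground-truth box against the crop, shape (2, 1).
iou = iou_matrix(gt, crop[np.newaxis, :])
print(iou)                   # first box overlaps (~0.53), second not at all (0.0)

# Clip boxes to the crop and keep only the ones whose center lies inside it.
boxes_in_crop, keep = crop_box_with_center_constraint(gt, crop)
print(keep)                  # -> [0]; the second box is dropped
print(boxes_in_crop[keep])   # first box expressed in crop-local coordinates
```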
@@ -409,15 +207,10 @@ def segms_horizontal_flip(segms, height, width):

```python
        if 'counts' in rle and type(rle['counts']) == list:
            rle = mask_util.frPyObjects([rle], height, width)
        mask = mask_util.decode(rle)
        mask = mask[:, ::-1]    # was: mask = mask[:, ::-1, :]
        rle = mask_util.encode(np.array(mask, order='F', dtype=np.uint8))
        return rle
```

The local `is_poly(segm)` helper that used to be defined here is removed; the function now relies on the module-level `is_poly` added above.

```python
    flipped_segms = []
    for segm in segms:
        if is_poly(segm):
```

...
paddlex/cv/transforms/det_transforms.py (+164 -162)

@@ -12,13 +12,22 @@

# See the License for the specific language governing permissions and
# limitations under the License.

Before:

```python
from .ops import *
from .box_utils import *
import random
import os.path as osp
import numpy as np
from PIL import Image, ImageEnhance
import cv2
```

After:

```python
try:
    from collections.abc import Sequence
except Exception:
    from collections import Sequence
from numbers import Number
import random
import os.path as osp
import numpy as np
import cv2
from PIL import Image, ImageEnhance
from .ops import *
from .box_utils import *
```

```python
class Compose:
```
@@ -81,7 +90,7 @@ class Compose:

```python
                im = cv2.imread(im_file).astype('float32')
            except:
                raise TypeError(
                    'Can\'t read The image file {}!'.format(im_file))
            im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
            # make default im_info with [h, w, 1]
            im_info['im_resize_info'] = np.array(
```

Only unchanged context lines of this hunk are visible here.
@@ -658,9 +667,17 @@ class MixupImage:

```python
            gt_score2 = im_info['mixup'][2]['gt_score']
            gt_score = np.concatenate(
                (gt_score1 * factor, gt_score2 * (1. - factor)), axis=0)
            if 'gt_poly' in label_info:                                # added
                gt_poly1 = label_info['gt_poly']                       # added
                gt_poly2 = im_info['mixup'][2]['gt_poly']              # added
                label_info['gt_poly'] = gt_poly1 + gt_poly2            # added
            is_crowd1 = label_info['is_crowd']                         # added
            is_crowd2 = im_info['mixup'][2]['is_crowd']                # added
            is_crowd = np.concatenate((is_crowd1, is_crowd2), axis=0)  # added
            label_info['gt_bbox'] = gt_bbox
            label_info['gt_score'] = gt_score
            label_info['gt_class'] = gt_class
            label_info['is_crowd'] = is_crowd                          # added
            im_info['augment_shape'] = np.array([im.shape[0],
                                                 im.shape[1]]).astype('int32')
            im_info.pop('mixup')
```

MixupImage now also concatenates the two samples' segmentation polygons (`gt_poly`) and crowd flags (`is_crowd`).
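A toy illustration (made-up values, not from the commit) of the bookkeeping above: scores of the two mixed samples are weighted by the mixup factor, while polygon lists are simply concatenated.

```python
import numpy as np

factor = 0.6
gt_score1 = np.array([[1.0], [1.0]])          # two instances in sample 1
gt_score2 = np.array([[1.0]])                 # one instance in sample 2
gt_score = np.concatenate(
    (gt_score1 * factor, gt_score2 * (1. - factor)), axis=0)
print(gt_score)                               # [[0.6], [0.6], [0.4]]

gt_poly1 = [[[0., 0., 10., 0., 10., 10.]]]    # one flat polygon per instance
gt_poly2 = [[[5., 5., 15., 5., 15., 15.]]]
print(len(gt_poly1 + gt_poly2))               # 2 instances after mixup
```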
@@ -672,23 +689,32 @@ class MixupImage:

class RandomExpand:

The class docstring changes in the same way as the documentation above: step 3 now uses the given fill value instead of the dataset mean, a fifth step for the ground-truth segmentation regions is added, and the Args section documents `ratio`, `prob` and `fill_value` instead of `max_ratio`, `prob` and `mean`.

Before:

```python
    def __init__(self, max_ratio=4., prob=0.5, mean=[127.5, 127.5, 127.5]):
        self.max_ratio = max_ratio
        self.mean = mean
        self.prob = prob
```

After:

```python
    def __init__(self,
                 ratio=4.,
                 prob=0.5,
                 fill_value=[123.675, 116.28, 103.53]):
        super(RandomExpand, self).__init__()
        assert ratio > 1.01, "expand ratio must be larger than 1.01"
        self.ratio = ratio
        self.prob = prob
        assert isinstance(fill_value, (Number, Sequence)), \
            "fill value must be either float or sequence"
        if isinstance(fill_value, Number):
            fill_value = (fill_value, ) * 3
        if not isinstance(fill_value, tuple):
            fill_value = tuple(fill_value)
        self.fill_value = fill_value
```

@@ -696,7 +722,6 @@ class RandomExpand:
@@ -708,7 +733,6 @@ class RandomExpand:

The `__call__(self, im, im_info=None, label_info=None)` entry point and its docstring (image as np.ndarray, optional im_info and label_info dicts, a (im, im_info) tuple when label_info is empty and (im, im_info, label_info) otherwise, TypeError when the argument types do not match) stay the same apart from minor docstring trims.
@@ -723,108 +747,68 @@ class RandomExpand:

```python
                'gt_class' not in label_info:
            raise TypeError('Cannot do RandomExpand! ' + \
                            'Becasuse gt_bbox/gt_class is not in label_info!')
```

The old implementation normalized `gt_bbox` to [0, 1], built the expanded canvas from `self.mean` with PIL (`Image.fromarray` and `paste`), clipped and rescaled the boxes through `filter_and_process`, and did not touch segmentation annotations. The new implementation works directly in pixel coordinates and also expands `gt_poly`:

```python
        if np.random.uniform(0., 1.) < self.prob:
            return (im, im_info, label_info)

        augment_shape = im_info['augment_shape']
        height = int(augment_shape[0])
        width = int(augment_shape[1])
        expand_ratio = np.random.uniform(1., self.ratio)
        h = int(height * expand_ratio)
        w = int(width * expand_ratio)
        if not h > height or not w > width:
            return (im, im_info, label_info)
        y = np.random.randint(0, h - height)
        x = np.random.randint(0, w - width)
        canvas = np.ones((h, w, 3), dtype=np.uint8)
        canvas *= np.array(self.fill_value, dtype=np.uint8)
        canvas[y:y + height, x:x + width, :] = im.astype(np.uint8)
        im_info['augment_shape'] = np.array([h, w]).astype('int32')
        if 'gt_bbox' in label_info and len(label_info['gt_bbox']) > 0:
            label_info['gt_bbox'] += np.array([x, y] * 2, dtype=np.float32)
        if 'gt_poly' in label_info and len(label_info['gt_poly']) > 0:
            label_info['gt_poly'] = expand_segms(label_info['gt_poly'], x, y,
                                                 height, width, expand_ratio)
        return (canvas, im_info, label_info)
```
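A call sketch for the new implementation. The call convention and the dict keys ('augment_shape', 'gt_bbox', 'gt_class') come from the code above; the synthetic image, box values and dtypes are assumptions for illustration.

```python
import numpy as np
import paddlex

# Sketch: a synthetic 100x200 sample pushed through the new RandomExpand.
im = np.zeros((100, 200, 3), dtype='float32')
im_info = {'augment_shape': np.array([100, 200]).astype('int32')}
label_info = {
    'gt_bbox': np.array([[20., 30., 80., 90.]], dtype=np.float32),
    'gt_class': np.array([[1]], dtype=np.int32),
}

expand = paddlex.det.transforms.RandomExpand(
    ratio=4., prob=0.5, fill_value=[123.675, 116.28, 103.53])
im, im_info, label_info = expand(im, im_info, label_info)
print(im_info['augment_shape'])   # up to 4x the original [h, w]
print(label_info['gt_bbox'])      # box shifted by the paste offset (x, y)
```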
class RandomCrop:

The class docstring changes in the same way as the documentation above: the batch_sampler-based procedure and the batch_sampler/satisfy_all/avoid_no_bbox arguments are replaced by the thresholds-based procedure and the aspect_ratio/thresholds/scaling/num_attempts/allow_no_crop/cover_all_box arguments, with extra steps for the ground-truth segmentation regions.

Before:

```python
    def __init__(self,
                 batch_sampler=None,
                 satisfy_all=False,
                 avoid_no_bbox=True):
        if batch_sampler is None:
            batch_sampler = [[1, 1, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0],
                             [1, 50, 0.3, 1.0, 0.5, 2.0, 0.1, 1.0],
                             [1, 50, 0.3, 1.0, 0.5, 2.0, 0.3, 1.0],
                             [1, 50, 0.3, 1.0, 0.5, 2.0, 0.5, 1.0],
                             [1, 50, 0.3, 1.0, 0.5, 2.0, 0.7, 1.0],
                             [1, 50, 0.3, 1.0, 0.5, 2.0, 0.9, 1.0],
                             [1, 50, 0.3, 1.0, 0.5, 2.0, 0.0, 1.0]]
        self.batch_sampler = batch_sampler
        self.satisfy_all = satisfy_all
        self.avoid_no_bbox = avoid_no_bbox
```

After:

```python
    def __init__(self,
                 aspect_ratio=[.5, 2.],
                 thresholds=[.0, .1, .3, .5, .7, .9],
                 scaling=[.3, 1.],
                 num_attempts=50,
                 allow_no_crop=True,
                 cover_all_box=False):
        self.aspect_ratio = aspect_ratio
        self.thresholds = thresholds
        self.scaling = scaling
        self.num_attempts = num_attempts
        self.allow_no_crop = allow_no_crop
        self.cover_all_box = cover_all_box
```

Unchanged context:

```python
    def __call__(self, im, im_info=None, label_info=None):
```
@@ -859,66 +843,84 @@ class RandomCrop:

```python
                'gt_class' not in label_info:
            raise TypeError('Cannot do RandomCrop! ' + \
                            'Becasuse gt_bbox/gt_class is not in label_info!')
```

The old implementation normalized `gt_bbox` to [0, 1], generated candidate crops per `batch_sampler` entry with `generate_sample_bbox`, filtered them with `satisfy_sample_constraint`, picked one at random, mapped the boxes through `filter_and_process`, cropped the image with pixel arithmetic and rescaled the boxes back, honouring `avoid_no_bbox`. The new implementation samples crops from `scaling`/`aspect_ratio`, validates them with `iou_matrix` against shuffled `thresholds`, and keeps boxes and segmentation polygons with `crop_box_with_center_constraint` and `crop_segms`:

```python
        if len(label_info['gt_bbox']) == 0:
            return (im, im_info, label_info)

        augment_shape = im_info['augment_shape']
        w = augment_shape[1]
        h = augment_shape[0]
        gt_bbox = label_info['gt_bbox']
        thresholds = list(self.thresholds)
        if self.allow_no_crop:
            thresholds.append('no_crop')
        np.random.shuffle(thresholds)

        for thresh in thresholds:
            if thresh == 'no_crop':
                return (im, im_info, label_info)

            found = False
            for i in range(self.num_attempts):
                scale = np.random.uniform(*self.scaling)
                min_ar, max_ar = self.aspect_ratio
                aspect_ratio = np.random.uniform(
                    max(min_ar, scale**2), min(max_ar, scale**-2))
                crop_h = int(h * scale / np.sqrt(aspect_ratio))
                crop_w = int(w * scale * np.sqrt(aspect_ratio))
                crop_y = np.random.randint(0, h - crop_h)
                crop_x = np.random.randint(0, w - crop_w)
                crop_box = [crop_x, crop_y, crop_x + crop_w, crop_y + crop_h]
                iou = iou_matrix(gt_bbox, np.array([crop_box],
                                                   dtype=np.float32))
                if iou.max() < thresh:
                    continue
                if self.cover_all_box and iou.min() < thresh:
                    continue
                cropped_box, valid_ids = crop_box_with_center_constraint(
                    gt_bbox, np.array(crop_box, dtype=np.float32))
                if valid_ids.size > 0:
                    found = True
                    break

            if found:
                if 'gt_poly' in label_info and len(label_info['gt_poly']) > 0:
                    crop_polys = crop_segms(label_info['gt_poly'], valid_ids,
                                            np.array(crop_box, dtype=np.int64),
                                            h, w)
                    if [] in crop_polys:
                        delete_id = list()
                        valid_polys = list()
                        for id, crop_poly in enumerate(crop_polys):
                            if crop_poly == []:
                                delete_id.append(id)
                            else:
                                valid_polys.append(crop_poly)
                        valid_ids = np.delete(valid_ids, delete_id)
                        if len(valid_polys) == 0:
                            return (im, im_info, label_info)
                        label_info['gt_poly'] = valid_polys
                    else:
                        label_info['gt_poly'] = crop_polys
                im = crop_image(im, crop_box)
                label_info['gt_bbox'] = np.take(cropped_box, valid_ids, axis=0)
                label_info['gt_class'] = np.take(
                    label_info['gt_class'], valid_ids, axis=0)
                im_info['augment_shape'] = np.array(
                    [crop_box[3] - crop_box[1],
                     crop_box[2] - crop_box[0]]).astype('int32')
                if 'gt_score' in label_info:
                    label_info['gt_score'] = np.take(
                        label_info['gt_score'], valid_ids, axis=0)
                if 'is_crowd' in label_info:
                    label_info['is_crowd'] = np.take(
                        label_info['is_crowd'], valid_ids, axis=0)
                return (im, im_info, label_info)

        if label_info is None:
            return (im, im_info)
        return (im, im_info, label_info)
```
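An analogous call sketch for the new RandomCrop. Only the dict keys and the default arguments come from the commit; the sample data is made up. When the shuffled thresholds yield 'no_crop', or no valid crop is found, the sample is returned unchanged.

```python
import numpy as np
import paddlex

# Sketch: cropping a synthetic sample with the new defaults. 'gt_poly' is
# optional; when present it is clipped through crop_segms as shown above.
im = np.zeros((300, 400, 3), dtype='float32')
im_info = {'augment_shape': np.array([300, 400]).astype('int32')}
label_info = {
    'gt_bbox': np.array([[50., 60., 200., 220.]], dtype=np.float32),
    'gt_class': np.array([[3]], dtype=np.int32),
}

crop = paddlex.det.transforms.RandomCrop()
im, im_info, label_info = crop(im, im_info, label_info)
print(im.shape, im_info['augment_shape'], label_info['gt_bbox'])
```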
```python
class ArrangeFasterRCNN:
```

...