PaddlePaddle / PaddleHub
Commit 59499f1b, authored May 07, 2020 by wuzewu
Add yolov3_resnet50_vd_coco2017
Parent: 4d82efa0
Showing 21 changed files with 1755 additions and 23 deletions.
Changed files:
hub_module/modules/image/object_detection/faster_rcnn_resnet50_coco2017/README.md (+2 -2)
hub_module/modules/image/object_detection/faster_rcnn_resnet50_fpn_coco2017/README.md (+2 -2)
hub_module/modules/image/object_detection/ssd_mobilenet_v1_pascal/README.md (+3 -3)
hub_module/modules/image/object_detection/ssd_vgg16_300_coco2017/README.md (+3 -3)
hub_module/modules/image/object_detection/ssd_vgg16_512_coco2017/README.md (+3 -3)
hub_module/modules/image/object_detection/yolov3_darknet53_coco2017/README.md (+2 -2)
hub_module/modules/image/object_detection/yolov3_darknet53_pedestrian/README.md (+2 -2)
hub_module/modules/image/object_detection/yolov3_darknet53_vehicles/README.md (+2 -2)
hub_module/modules/image/object_detection/yolov3_mobilenet_v1_coco2017/README.md (+2 -2)
hub_module/modules/image/object_detection/yolov3_resnet34_coco2017/README.md (+2 -2)
hub_module/modules/image/object_detection/yolov3_resnet50_vd_coco2017/README.md (+138 -0)
hub_module/modules/image/object_detection/yolov3_resnet50_vd_coco2017/__init__.py (+0 -0)
hub_module/modules/image/object_detection/yolov3_resnet50_vd_coco2017/data_feed.py (+71 -0)
hub_module/modules/image/object_detection/yolov3_resnet50_vd_coco2017/label_file.txt (+80 -0)
hub_module/modules/image/object_detection/yolov3_resnet50_vd_coco2017/module.py (+321 -0)
hub_module/modules/image/object_detection/yolov3_resnet50_vd_coco2017/name_adapter.py (+61 -0)
hub_module/modules/image/object_detection/yolov3_resnet50_vd_coco2017/nonlocal_helper.py (+154 -0)
hub_module/modules/image/object_detection/yolov3_resnet50_vd_coco2017/processor.py (+180 -0)
hub_module/modules/image/object_detection/yolov3_resnet50_vd_coco2017/resnet.py (+447 -0)
hub_module/modules/image/object_detection/yolov3_resnet50_vd_coco2017/yolo_head.py (+273 -0)
hub_module/scripts/configs/yolov3_resnet50_vd_coco2017.yml (+7 -0)
hub_module/modules/image/object_detection/faster_rcnn_resnet50_coco2017/README.md
## Command-Line Prediction
-```
+```shell
$ hub run faster_rcnn_resnet50_coco2017 --input_path "/PATH/TO/IMAGE"
```
## API
-```
+```python
def context(num_classes=81,
            trainable=True,
            pretrained=True,
...
hub_module/modules/image/object_detection/faster_rcnn_resnet50_fpn_coco2017/README.md
## Command-Line Prediction
-```
+```shell
$ hub run faster_rcnn_resnet50_fpn_coco2017 --input_path "/PATH/TO/IMAGE"
```
## API
-```
+```python
def context(num_classes=81,
            trainable=True,
            pretrained=True,
...
hub_module/modules/image/object_detection/ssd_mobilenet_v1_pascal/README.md
## Command-Line Prediction
-```
+```shell
$ hub run ssd_mobilenet_v1_pascal --input_path "/PATH/TO/IMAGE"
```
## API
-```
+```python
def context(trainable=True,
            pretrained=True,
            get_prediction=False)
...
@@ -25,7 +25,7 @@ def context(trainable=True,
* inputs (dict): the model inputs; keys include 'image' and 'im\_size', with the following values:
    * image (Variable): the image variable
    * im\_size (Variable): the image size
-* outputs (dict): the model outputs. If get\_prediction is False, the output is 'head\_fatures'; otherwise 'bbox\_out'.
+* outputs (dict): the model outputs. If get\_prediction is False, the output is 'head\_features'; otherwise 'bbox\_out'.
* context\_prog (Program): the Program used for transfer learning.

```python
...
hub_module/modules/image/object_detection/ssd_vgg16_300_coco2017/README.md
## Command-Line Prediction
-```
+```shell
$ hub run ssd_vgg16_300_coco2017 --input_path "/PATH/TO/IMAGE"
```
## API
-```
+```python
def context(trainable=True,
            pretrained=True,
            get_prediction=False)
...
@@ -25,7 +25,7 @@ def context(trainable=True,
* inputs (dict): the model inputs; keys include 'image' and 'im\_size', with the following values:
    * image (Variable): the image variable
    * im\_size (Variable): the image size
-* outputs (dict): the model outputs. If get\_prediction is False, the output is 'head\_fatures'; otherwise 'bbox\_out'.
+* outputs (dict): the model outputs. If get\_prediction is False, the output is 'head\_features'; otherwise 'bbox\_out'.
* context\_prog (Program): the Program used for transfer learning.

```python
...
hub_module/modules/image/object_detection/ssd_vgg16_512_coco2017/README.md
## Command-Line Prediction
-```
+```shell
$ hub run ssd_vgg16_512_coco2017 --input_path "/PATH/TO/IMAGE"
```
## API
-```
+```python
def context(trainable=True,
            pretrained=True,
            get_prediction=False)
...
@@ -25,7 +25,7 @@ def context(trainable=True,
* inputs (dict): the model inputs; keys include 'image' and 'im\_size', with the following values:
    * image (Variable): the image variable
    * im\_size (Variable): the image size
-* outputs (dict): the model outputs. If get\_prediction is False, the output is 'head\_fatures'; otherwise 'bbox\_out'.
+* outputs (dict): the model outputs. If get\_prediction is False, the output is 'head\_features'; otherwise 'bbox\_out'.
* context\_prog (Program): the Program used for transfer learning.

```python
...
hub_module/modules/image/object_detection/yolov3_darknet53_coco2017/README.md
## Command-Line Prediction
-```
+```shell
$ hub run yolov3_darknet53_coco2017 --input_path "/PATH/TO/IMAGE"
```
## API
-```
+```python
def context(trainable=True,
            pretrained=True,
            get_prediction=False)
...
hub_module/modules/image/object_detection/yolov3_darknet53_pedestrian/README.md
## Command-Line Prediction
-```
+```shell
$ hub run yolov3_darknet53_pedestrian --input_path "/PATH/TO/IMAGE"
```
## API
-```
+```python
def context(trainable=True,
            pretrained=True,
            get_prediction=False)
...
hub_module/modules/image/object_detection/yolov3_darknet53_vehicles/README.md
## Command-Line Prediction
-```
+```shell
$ hub run yolov3_darknet53_vehicles --input_path "/PATH/TO/IMAGE"
```
## API
-```
+```python
def context(trainable=True,
            pretrained=True,
            get_prediction=False)
...
hub_module/modules/image/object_detection/yolov3_mobilenet_v1_coco2017/README.md
## Command-Line Prediction
-```
+```shell
$ hub run yolov3_mobilenet_v1_coco2017 --input_path "/PATH/TO/IMAGE"
```
## API
-```
+```python
def context(trainable=True,
            pretrained=True,
            get_prediction=False)
...
hub_module/modules/image/object_detection/yolov3_resnet34_coco2017/README.md
## Command-Line Prediction
-```
+```shell
$ hub run yolov3_resnet34_coco2017 --input_path "/PATH/TO/IMAGE"
```
## API
-```
+```python
def context(trainable=True,
            pretrained=True,
            get_prediction=False)
...
hub_module/modules/image/object_detection/yolov3_resnet50_vd_coco2017/README.md
new file mode 100644
## Command-Line Prediction

```shell
$ hub run yolov3_resnet50_vd_coco2017 --input_path "/PATH/TO/IMAGE"
```

## API

```python
def context(trainable=True,
            pretrained=True,
            get_prediction=False)
```

Extracts features for transfer learning.

**Parameters**

* trainable (bool): whether the parameters are trainable;
* pretrained (bool): whether to load the pretrained model;
* get\_prediction (bool): whether to run prediction.

**Returns**

* inputs (dict): the model inputs; keys include 'image' and 'im\_size', with the following values:
    * image (Variable): the image variable
    * im\_size (Variable): the image size
* outputs (dict): the model outputs. If get\_prediction is False, the outputs are 'head\_features' and 'body\_features'; otherwise 'bbox\_out'.
* context\_prog (Program): the Program used for transfer learning.
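
```python
def object_detection(paths=None,
                     images=None,
                     batch_size=1,
                     use_gpu=False,
                     output_dir='detection_result',
                     score_thresh=0.5,
                     visualization=True)
```

Prediction API: detects the positions of all objects in the input images.

**Parameters**

* paths (list\[str\]): image paths;
* images (list\[numpy.ndarray\]): image data; each ndarray has shape \[H, W, C\] in BGR format;
* batch\_size (int): batch size;
* use\_gpu (bool): whether to use the GPU;
* score\_thresh (float): confidence threshold for detections;
* visualization (bool): whether to save the detection results as image files;
* output\_dir (str): directory for the saved images, detection\_result by default.

**Returns**

* res (list\[dict\]): list of detection results; each element is a dict with the fields:
    * data (list): detections; each element is a dict with the fields:
        * confidence (float): confidence of the detection;
        * label (str): label;
        * left (int): x coordinate of the bounding box's top-left corner;
        * top (int): y coordinate of the bounding box's top-left corner;
        * right (int): x coordinate of the bounding box's bottom-right corner;
        * bottom (int): y coordinate of the bounding box's bottom-right corner;
    * save\_path (str, optional): path where the result image is saved (present only when visualization=True).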
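
The return structure above can be consumed with plain dictionary access. The sketch below is only an illustration: `summarize_detections`, `box_area`, and the `sample` list are hypothetical names and hand-written data in the documented shape, not part of the module and not real model output.

```python
def summarize_detections(res, min_confidence=0.5):
    """Count detections per label across all images, keeping confident ones."""
    counts = {}
    for image_result in res:
        for det in image_result["data"]:
            if det["confidence"] >= min_confidence:
                counts[det["label"]] = counts.get(det["label"], 0) + 1
    return counts


def box_area(det):
    """Area in pixels of one detection's bounding box."""
    width = max(0, det["right"] - det["left"])
    height = max(0, det["bottom"] - det["top"])
    return width * height


# Hand-written sample in the documented shape (not real model output).
sample = [{
    "data": [
        {"confidence": 0.92, "label": "dog",
         "left": 10, "top": 20, "right": 110, "bottom": 220},
        {"confidence": 0.41, "label": "cat",
         "left": 0, "top": 0, "right": 50, "bottom": 40},
    ],
    "save_path": "detection_result/example.jpg",
}]

print(summarize_detections(sample))    # {'dog': 1}
print(box_area(sample[0]["data"][0]))  # 20000
```

Because `save_path` is documented as optional, code that reads it should probably use `image_result.get("save_path")` rather than indexing.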

```python
def save_inference_model(dirname,
                         model_filename=None,
                         params_filename=None,
                         combined=True)
```

Saves the model to the specified path.

**Parameters**

* dirname: name of the directory that holds the saved model
* model\_filename: model file name, \_\_model\_\_ by default
* params\_filename: parameter file name, \_\_params\_\_ by default (only takes effect when `combined` is True)
* combined: whether to save the parameters into a single file.
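
The default file names can be illustrated without Paddle. This minimal sketch mirrors the conditional in this commit's `module.py`, where both defaults are filled in only when `combined` is true; `resolve_filenames` is a hypothetical helper name used here for illustration.

```python
def resolve_filenames(model_filename=None, params_filename=None, combined=True):
    """Return the (model, params) file names that would be used for saving."""
    if combined:
        # Fall back to the documented defaults only in combined mode.
        model_filename = model_filename or "__model__"
        params_filename = params_filename or "__params__"
    return model_filename, params_filename


print(resolve_filenames())                     # ('__model__', '__params__')
print(resolve_filenames(combined=False))       # (None, None)
print(resolve_filenames("detector.pdmodel"))   # ('detector.pdmodel', '__params__')
```

With `combined=False` the names are passed through unchanged, so the underlying save call decides how to lay out the files.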

## Code Example

```python
import paddlehub as hub
import cv2

object_detector = hub.Module(name="yolov3_resnet50_vd_coco2017")
result = object_detector.object_detection(images=[cv2.imread('/PATH/TO/IMAGE')])
# or
# result = object_detector.object_detection(paths=['/PATH/TO/IMAGE'])
```
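
## Service Deployment

PaddleHub Serving can deploy an online object detection service.

## Step 1: Start PaddleHub Serving

Run the start command:

```shell
$ hub serving start -m yolov3_resnet50_vd_coco2017
```

This deploys an object detection service API; the default port is 8866.

**NOTE:** To predict on GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it does not need to be set.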
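
The note about CUDA\_VISIBLE\_DEVICES matches a check in this commit's `module.py`, which treats the GPU as available only if the first character of the variable parses as an integer. A minimal standalone sketch of that check follows; `gpu_requested` is a hypothetical name, and unlike the module it catches specific exceptions instead of using a bare `except`.

```python
import os


def gpu_requested(environ=None):
    """Return True if CUDA_VISIBLE_DEVICES starts with a digit, e.g. "0" or "1,2"."""
    environ = os.environ if environ is None else environ
    try:
        int(environ["CUDA_VISIBLE_DEVICES"][0])
        return True
    except (KeyError, IndexError, ValueError):
        # Unset, empty, or non-numeric first character (such as "-1").
        return False


print(gpu_requested({"CUDA_VISIBLE_DEVICES": "0"}))   # True
print(gpu_requested({}))                              # False
print(gpu_requested({"CUDA_VISIBLE_DEVICES": "-1"}))  # False
```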

## Step 2: Send a Prediction Request

With the server configured, the few lines of code below send a prediction request and fetch the result:

```python
import requests
import json
import cv2
import base64


def cv2_to_base64(image):
    # Encode the image as JPEG, then as a base64 string for the JSON payload.
    data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')


# Send the HTTP request
data = {'images': [cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/yolov3_resnet50_vd_coco2017"
r = requests.post(url=url, headers=headers, data=json.dumps(data))

# Print the prediction results
print(r.json()["results"])
```
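
The request body is JSON carrying base64 text, which the server decodes back to bytes before handing them to OpenCV. The round trip can be checked with the standard library alone. In this sketch `bytes_to_base64`, `base64_to_bytes`, and the `fake_jpeg` bytes are hypothetical stand-ins; no real image or server is involved.

```python
import base64
import json


def bytes_to_base64(raw):
    """Client side: raw encoded-image bytes to a JSON-safe string."""
    return base64.b64encode(raw).decode("utf8")


def base64_to_bytes(b64str):
    """Server side: the inverse step, before image decoding."""
    return base64.b64decode(b64str.encode("utf8"))


# Placeholder bytes standing in for the output of cv2.imencode.
fake_jpeg = b"\xff\xd8\xff\xe0 not a real image"

payload = json.dumps({"images": [bytes_to_base64(fake_jpeg)]})
decoded = [base64_to_bytes(s) for s in json.loads(payload)["images"]]

print(decoded[0] == fake_jpeg)  # True
```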

### Dependencies

paddlepaddle >= 1.6.2

paddlehub >= 1.6.0
hub_module/modules/image/object_detection/yolov3_resnet50_vd_coco2017/__init__.py
new file mode 100644 (empty file)
hub_module/modules/image/object_detection/yolov3_resnet50_vd_coco2017/data_feed.py
new file mode 100644
# coding=utf-8
from __future__ import absolute_import
from __future__ import print_function
from __future__ import division

import os

import cv2
import numpy as np

__all__ = ['reader']


def reader(paths=[], images=None):
    """
    data generator

    Args:
        paths (list[str]): paths to images.
        images (list(numpy.ndarray)): data of images, shape of each is [H, W, C]

    Yield:
        res (list): preprocessed image and the size of original image.
    """
    img_list = []
    if paths:
        assert type(paths) is list, "type(paths) is not list."
        for img_path in paths:
            assert os.path.isfile(
                img_path), "The {} isn't a valid file path.".format(img_path)
            img = cv2.imread(img_path).astype('float32')
            img_list.append(img)
    if images is not None:
        for img in images:
            img_list.append(img)

    for im in img_list:
        # im_size
        im_shape = im.shape
        im_size = np.array([im_shape[0], im_shape[1]], dtype=np.int32)

        # decode image
        im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)

        # resize image
        target_size = 608
        im_size_min = np.min(im_shape[0:2])
        im_size_max = np.max(im_shape[0:2])
        if float(im_size_min) == 0:
            raise ZeroDivisionError('min size of image is 0')

        im_scale_x = float(target_size) / float(im_shape[1])
        im_scale_y = float(target_size) / float(im_shape[0])
        im = cv2.resize(
            im, None, None, fx=im_scale_x, fy=im_scale_y, interpolation=2)

        # normalize image
        mean = [0.485, 0.456, 0.406]
        std = [0.229, 0.224, 0.225]
        im = im.astype(np.float32, copy=False)
        mean = np.array(mean)[np.newaxis, np.newaxis, :]
        std = np.array(std)[np.newaxis, np.newaxis, :]
        im = im / 255.0
        im -= mean
        im /= std

        # permute
        im = np.swapaxes(im, 1, 2)
        im = np.swapaxes(im, 1, 0)

        yield [im, im_size]
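
Note on the preprocessing in `reader`: the resize and normalization steps reduce to simple arithmetic, sketched below in pure Python for a single pixel. The constants are copied from the code above; `resize_scales` and `normalize_pixel` are hypothetical helper names, and the sketch leaves out the OpenCV resize and the HWC to CHW axis swap.

```python
TARGET_SIZE = 608
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def resize_scales(height, width, target_size=TARGET_SIZE):
    """Per-axis scale factors (fx, fy) that stretch the image to a square."""
    if min(height, width) == 0:
        raise ZeroDivisionError("min size of image is 0")
    return target_size / float(width), target_size / float(height)


def normalize_pixel(rgb):
    """Map one RGB pixel in [0, 255] to (value / 255 - mean) / std per channel."""
    return [(v / 255.0 - m) / s for v, m, s in zip(rgb, MEAN, STD)]


print(resize_scales(304, 608))  # (1.0, 2.0)
print(normalize_pixel([255, 255, 255]))
```

Since the two axes are scaled independently, the aspect ratio of the input is not preserved; every image becomes 608 x 608.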
hub_module/modules/image/object_detection/yolov3_resnet50_vd_coco2017/label_file.txt
new file mode 100644
person
bicycle
car
motorcycle
airplane
bus
train
truck
boat
traffic light
fire hydrant
stop sign
parking meter
bench
bird
cat
dog
horse
sheep
cow
elephant
bear
zebra
giraffe
backpack
umbrella
handbag
tie
suitcase
frisbee
skis
snowboard
sports ball
kite
baseball bat
baseball glove
skateboard
surfboard
tennis racket
bottle
wine glass
cup
fork
knife
spoon
bowl
banana
apple
sandwich
orange
broccoli
carrot
hot dog
pizza
donut
cake
chair
couch
potted plant
bed
dining table
toilet
tv
laptop
mouse
remote
keyboard
cell phone
microwave
oven
toaster
sink
refrigerator
book
clock
vase
scissors
teddy bear
hair drier
toothbrush
hub_module/modules/image/object_detection/yolov3_resnet50_vd_coco2017/module.py
new file mode 100644
# coding=utf-8
from __future__ import absolute_import

import ast
import argparse
import os
from functools import partial

import numpy as np
import paddle.fluid as fluid
import paddlehub as hub
from paddle.fluid.core import PaddleTensor, AnalysisConfig, create_paddle_predictor
from paddlehub.module.module import moduleinfo, runnable, serving
from paddlehub.common.paddle_helper import add_vars_prefix

from yolov3_resnet50_vd_coco2017.resnet import ResNet
from yolov3_resnet50_vd_coco2017.processor import load_label_info, postprocess, base64_to_cv2
from yolov3_resnet50_vd_coco2017.data_feed import reader
from yolov3_resnet50_vd_coco2017.yolo_head import MultiClassNMS, YOLOv3Head


@moduleinfo(
    name="yolov3_resnet50_vd_coco2017",
    version="1.0.0",
    type="CV/object_detection",
    summary=
    "Baidu's YOLOv3 model for object detection with backbone ResNet50, trained with dataset coco2017.",
    author="paddlepaddle",
    author_email="paddle-dev@baidu.com")
class YOLOv3ResNet50Coco2017(hub.Module):
    def _initialize(self):
        self.default_pretrained_model_path = os.path.join(
            self.directory, "yolov3_resnet50_model")
        self.label_names = load_label_info(
            os.path.join(self.directory, "label_file.txt"))
        self._set_config()

    def _set_config(self):
        """
        predictor config setting.
        """
        cpu_config = AnalysisConfig(self.default_pretrained_model_path)
        cpu_config.disable_glog_info()
        cpu_config.disable_gpu()
        cpu_config.switch_ir_optim(False)
        self.cpu_predictor = create_paddle_predictor(cpu_config)

        try:
            _places = os.environ["CUDA_VISIBLE_DEVICES"]
            int(_places[0])
            use_gpu = True
        except:
            use_gpu = False
        if use_gpu:
            gpu_config = AnalysisConfig(self.default_pretrained_model_path)
            gpu_config.disable_glog_info()
            gpu_config.enable_use_gpu(memory_pool_init_size_mb=500, device_id=0)
            self.gpu_predictor = create_paddle_predictor(gpu_config)

    def context(self, trainable=True, pretrained=True, get_prediction=False):
        """
        Distill the Head Features, so as to perform transfer learning.

        Args:
            trainable (bool): whether to set parameters trainable.
            pretrained (bool): whether to load default pretrained model.
            get_prediction (bool): whether to get prediction.

        Returns:
             inputs(dict): the input variables.
             outputs(dict): the output variables.
             context_prog (Program): the program to execute transfer learning.
        """
        context_prog = fluid.Program()
        startup_program = fluid.Program()
        with fluid.program_guard(context_prog, startup_program):
            with fluid.unique_name.guard():
                # image
                image = fluid.layers.data(
                    name='image', shape=[3, 608, 608], dtype='float32')
                # backbone
                backbone = ResNet(
                    norm_type='sync_bn',
                    freeze_at=0,
                    freeze_norm=False,
                    norm_decay=0.,
                    dcn_v2_stages=[5],
                    depth=50,
                    variant='d',
                    feature_maps=[3, 4, 5])
                # body_feats
                body_feats = backbone(image)
                # im_size
                im_size = fluid.layers.data(
                    name='im_size', shape=[2], dtype='int32')
                # yolo_head
                yolo_head = YOLOv3Head(num_classes=80)
                # head_features
                head_features, body_features = yolo_head._get_outputs(
                    body_feats, is_train=trainable)

                place = fluid.CPUPlace()
                exe = fluid.Executor(place)
                exe.run(fluid.default_startup_program())

                # var_prefix
                var_prefix = '@HUB_{}@'.format(self.name)
                # name of inputs
                inputs = {
                    'image': var_prefix + image.name,
                    'im_size': var_prefix + im_size.name
                }
                # name of outputs
                if get_prediction:
                    bbox_out = yolo_head.get_prediction(head_features, im_size)
                    outputs = {'bbox_out': [var_prefix + bbox_out.name]}
                else:
                    outputs = {
                        'head_features':
                        [var_prefix + var.name for var in head_features],
                        'body_features':
                        [var_prefix + var.name for var in body_features]
                    }
                # add_vars_prefix
                add_vars_prefix(context_prog, var_prefix)
                add_vars_prefix(fluid.default_startup_program(), var_prefix)
                # inputs
                inputs = {
                    key: context_prog.global_block().vars[value]
                    for key, value in inputs.items()
                }
                # outputs
                outputs = {
                    key: [
                        context_prog.global_block().vars[varname]
                        for varname in value
                    ]
                    for key, value in outputs.items()
                }
                # trainable
                for param in context_prog.global_block().iter_parameters():
                    param.trainable = trainable
                # pretrained
                if pretrained:

                    def _if_exist(var):
                        return os.path.exists(
                            os.path.join(self.default_pretrained_model_path,
                                         var.name))

                    fluid.io.load_vars(
                        exe,
                        self.default_pretrained_model_path,
                        predicate=_if_exist)
                else:
                    exe.run(startup_program)

                return inputs, outputs, context_prog

    def object_detection(self,
                         paths=None,
                         images=None,
                         batch_size=1,
                         use_gpu=False,
                         output_dir='detection_result',
                         score_thresh=0.5,
                         visualization=True):
        """API of Object Detection.

        Args:
            paths (list[str]): The paths of images.
            images (list(numpy.ndarray)): images data, shape of each is [H, W, C]
            batch_size (int): batch size.
            use_gpu (bool): Whether to use gpu.
            output_dir (str): The path to store output images.
            visualization (bool): Whether to save image or not.
            score_thresh (float): threshold for object detection.

        Returns:
            res (list[dict]): The result of coco2017 detection. keys include 'data', 'save_path', the corresponding value is:
                data (dict): the result of object detection, keys include 'left', 'top', 'right', 'bottom', 'label', 'confidence', the corresponding value is:
                    left (float): The X coordinate of the upper left corner of the bounding box;
                    top (float): The Y coordinate of the upper left corner of the bounding box;
                    right (float): The X coordinate of the lower right corner of the bounding box;
                    bottom (float): The Y coordinate of the lower right corner of the bounding box;
                    label (str): The label of detection result;
                    confidence (float): The confidence of detection result.
                save_path (str, optional): The path to save output images.
        """
        if use_gpu:
            try:
                _places = os.environ["CUDA_VISIBLE_DEVICES"]
                int(_places[0])
            except:
                raise RuntimeError(
                    "Attempt to use GPU for prediction, but environment variable CUDA_VISIBLE_DEVICES was not set correctly."
                )

        paths = paths if paths else list()
        data_reader = partial(reader, paths, images)
        batch_reader = fluid.io.batch(data_reader, batch_size=batch_size)
        res = []
        for iter_id, feed_data in enumerate(batch_reader()):
            feed_data = np.array(feed_data)
            image_tensor = PaddleTensor(np.array(list(feed_data[:, 0])))
            im_size_tensor = PaddleTensor(np.array(list(feed_data[:, 1])))
            if use_gpu:
                data_out = self.gpu_predictor.run(
                    [image_tensor, im_size_tensor])
            else:
                data_out = self.cpu_predictor.run(
                    [image_tensor, im_size_tensor])

            output = postprocess(
                paths=paths,
                images=images,
                data_out=data_out,
                score_thresh=score_thresh,
                label_names=self.label_names,
                output_dir=output_dir,
                handle_id=iter_id * batch_size,
                visualization=visualization)
            res.extend(output)
        return res

    def save_inference_model(self,
                             dirname,
                             model_filename=None,
                             params_filename=None,
                             combined=True):
        if combined:
            model_filename = "__model__" if not model_filename else model_filename
            params_filename = "__params__" if not params_filename else params_filename
        place = fluid.CPUPlace()
        exe = fluid.Executor(place)

        program, feeded_var_names, target_vars = fluid.io.load_inference_model(
            dirname=self.default_pretrained_model_path, executor=exe)

        fluid.io.save_inference_model(
            dirname=dirname,
            main_program=program,
            executor=exe,
            feeded_var_names=feeded_var_names,
            target_vars=target_vars,
            model_filename=model_filename,
            params_filename=params_filename)

    @serving
    def serving_method(self, images, **kwargs):
        """
        Run as a service.
        """
        images_decode = [base64_to_cv2(image) for image in images]
        results = self.object_detection(images=images_decode, **kwargs)
        return results

    @runnable
    def run_cmd(self, argvs):
        """
        Run as a command.
        """
        self.parser = argparse.ArgumentParser(
            description="Run the {} module.".format(self.name),
            prog='hub run {}'.format(self.name),
            usage='%(prog)s',
            add_help=True)
        self.arg_input_group = self.parser.add_argument_group(
            title="Input options", description="Input data. Required")
        self.arg_config_group = self.parser.add_argument_group(
            title="Config options",
            description=
            "Run configuration for controlling module behavior, not required.")
        self.add_module_config_arg()
        self.add_module_input_arg()
        args = self.parser.parse_args(argvs)
        results = self.object_detection(
            paths=[args.input_path],
            batch_size=args.batch_size,
            use_gpu=args.use_gpu,
            output_dir=args.output_dir,
            visualization=args.visualization,
            score_thresh=args.score_thresh)
        return results

    def add_module_config_arg(self):
        """
        Add the command config options.
        """
        self.arg_config_group.add_argument(
            '--use_gpu',
            type=ast.literal_eval,
            default=False,
            help="whether use GPU or not")
        self.arg_config_group.add_argument(
            '--output_dir',
            type=str,
            default='detection_result',
            help="The directory to save output images.")
        self.arg_config_group.add_argument(
            '--visualization',
            type=ast.literal_eval,
            default=False,
            help="whether to save output as images.")

    def add_module_input_arg(self):
        """
        Add the command input options.
        """
        self.arg_input_group.add_argument(
            '--input_path', type=str, help="path to image.")
        self.arg_input_group.add_argument(
            '--batch_size',
            type=ast.literal_eval,
            default=1,
            help="batch size.")
        self.arg_input_group.add_argument(
            '--score_thresh',
            type=ast.literal_eval,
            default=0.5,
            help="threshold for object detection.")
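
Note on the command-line options above: several of them use `type=ast.literal_eval`, so each value is parsed as a Python literal. The standalone sketch below shows that behaviour with plain `argparse`; `build_parser` is a hypothetical helper that copies only a few of the options and does not depend on PaddleHub.

```python
import argparse
import ast


def build_parser():
    """A reduced parser with the same option types as the module's run_cmd."""
    parser = argparse.ArgumentParser(prog="hub run example")
    parser.add_argument("--input_path", type=str, help="path to image.")
    parser.add_argument("--use_gpu", type=ast.literal_eval, default=False)
    parser.add_argument("--batch_size", type=ast.literal_eval, default=1)
    parser.add_argument("--score_thresh", type=ast.literal_eval, default=0.5)
    return parser


args = build_parser().parse_args(
    ["--input_path", "img.jpg", "--use_gpu", "True", "--score_thresh", "0.3"])
print(args.use_gpu, args.batch_size, args.score_thresh)  # True 1 0.3
```

One consequence of this design is that values must be valid Python literals: `--use_gpu True` works, while a lowercase `true` is rejected by the parser.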
hub_module/modules/image/object_detection/yolov3_resnet50_vd_coco2017/name_adapter.py
new file mode 100644
# coding=utf-8


class NameAdapter(object):
    """Fix the backbones variable names for pretrained weight"""

    def __init__(self, model):
        super(NameAdapter, self).__init__()
        self.model = model

    @property
    def model_type(self):
        return getattr(self.model, '_model_type', '')

    @property
    def variant(self):
        return getattr(self.model, 'variant', '')

    def fix_conv_norm_name(self, name):
        if name == "conv1":
            bn_name = "bn_" + name
        else:
            bn_name = "bn" + name[3:]
        # the naming rule is same as pretrained weight
        if self.model_type == 'SEResNeXt':
            bn_name = name + "_bn"
        return bn_name

    def fix_shortcut_name(self, name):
        if self.model_type == 'SEResNeXt':
            name = 'conv' + name + '_prj'
        return name

    def fix_bottleneck_name(self, name):
        if self.model_type == 'SEResNeXt':
            conv_name1 = 'conv' + name + '_x1'
            conv_name2 = 'conv' + name + '_x2'
            conv_name3 = 'conv' + name + '_x3'
            shortcut_name = name
        else:
            conv_name1 = name + "_branch2a"
            conv_name2 = name + "_branch2b"
            conv_name3 = name + "_branch2c"
            shortcut_name = name + "_branch1"
        return conv_name1, conv_name2, conv_name3, shortcut_name

    def fix_layer_warp_name(self, stage_num, count, i):
        name = 'res' + str(stage_num)
        if count > 10 and stage_num == 4:
            if i == 0:
                conv_name = name + "a"
            else:
                conv_name = name + "b" + str(i)
        else:
            conv_name = name + chr(ord("a") + i)
        if self.model_type == 'SEResNeXt':
            conv_name = str(stage_num + 2) + '_' + str(i + 1)
        return conv_name

    def fix_c1_stage_name(self):
        return "res_conv1" if self.model_type == 'ResNeXt' else "conv1"
hub_module/modules/image/object_detection/yolov3_resnet50_vd_coco2017/nonlocal_helper.py
new file mode 100644
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals

import paddle.fluid as fluid
from paddle.fluid import ParamAttr

nonlocal_params = {
    "use_zero_init_conv": False,
    "conv_init_std": 0.01,
    "no_bias": True,
    "use_maxpool": False,
    "use_softmax": True,
    "use_bn": False,
    "use_scale": True,  # vital for the model performance!!!
    "use_affine": False,
    "bn_momentum": 0.9,
    "bn_epsilon": 1.0000001e-5,
    "bn_init_gamma": 0.9,
    "weight_decay_bn": 1.e-4,
}


def space_nonlocal(input, dim_in, dim_out, prefix, dim_inner,
                   max_pool_stride=2):
    cur = input
    theta = fluid.layers.conv2d(
        input=cur,
        num_filters=dim_inner,
        filter_size=[1, 1],
        stride=[1, 1],
        padding=[0, 0],
        param_attr=ParamAttr(
            name=prefix + '_theta' + "_w",
            initializer=fluid.initializer.Normal(
                loc=0.0, scale=nonlocal_params["conv_init_std"])),
        bias_attr=ParamAttr(
            name=prefix + '_theta' + "_b",
            initializer=fluid.initializer.Constant(value=0.))
        if not nonlocal_params["no_bias"] else False,
        name=prefix + '_theta')
    theta_shape = theta.shape
    theta_shape_op = fluid.layers.shape(theta)
    theta_shape_op.stop_gradient = True

    if nonlocal_params["use_maxpool"]:
        max_pool = fluid.layers.pool2d(
            input=cur,
            pool_size=[max_pool_stride, max_pool_stride],
            pool_type='max',
            pool_stride=[max_pool_stride, max_pool_stride],
            pool_padding=[0, 0],
            name=prefix + '_pool')
    else:
        max_pool = cur

    phi = fluid.layers.conv2d(
        input=max_pool,
        num_filters=dim_inner,
        filter_size=[1, 1],
        stride=[1, 1],
        padding=[0, 0],
        param_attr=ParamAttr(
            name=prefix + '_phi' + "_w",
            initializer=fluid.initializer.Normal(
                loc=0.0, scale=nonlocal_params["conv_init_std"])),
        bias_attr=ParamAttr(
            name=prefix + '_phi' + "_b",
            initializer=fluid.initializer.Constant(value=0.))
        if (nonlocal_params["no_bias"] == 0) else False,
        name=prefix + '_phi')
    phi_shape = phi.shape

    g = fluid.layers.conv2d(
        input=max_pool,
        num_filters=dim_inner,
        filter_size=[1, 1],
        stride=[1, 1],
        padding=[0, 0],
        param_attr=ParamAttr(
            name=prefix + '_g' + "_w",
            initializer=fluid.initializer.Normal(
                loc=0.0, scale=nonlocal_params["conv_init_std"])),
        bias_attr=ParamAttr(
            name=prefix + '_g' + "_b",
            initializer=fluid.initializer.Constant(value=0.))
        if (nonlocal_params["no_bias"] == 0) else False,
        name=prefix + '_g')
    g_shape = g.shape

    # we have to use explicit batch size (to support arbitrary spacetime size)
    # e.g. (8, 1024, 4, 14, 14) => (8, 1024, 784)
    theta = fluid.layers.reshape(theta, shape=(0, 0, -1))
    theta = fluid.layers.transpose(theta, [0, 2, 1])
    phi = fluid.layers.reshape(phi, [0, 0, -1])
    theta_phi = fluid.layers.matmul(theta, phi, name=prefix + '_affinity')
    g = fluid.layers.reshape(g, [0, 0, -1])

    if nonlocal_params["use_softmax"]:
        if nonlocal_params["use_scale"]:
            theta_phi_sc = fluid.layers.scale(theta_phi, scale=dim_inner**-.5)
        else:
            theta_phi_sc = theta_phi
        p = fluid.layers.softmax(
            theta_phi_sc, name=prefix + '_affinity' + '_prob')
    else:
        # not clear about what is doing in xlw's code
        p = None  # not implemented
        raise NotImplementedError("Not implemented when not use softmax")

    # note g's axis[2] corresponds to p's axis[2]
    # e.g. g(8, 1024, 784_2) * p(8, 784_1, 784_2) => (8, 1024, 784_1)
    p = fluid.layers.transpose(p, [0, 2, 1])
    t = fluid.layers.matmul(g, p, name=prefix + '_y')

    # reshape back
    # e.g. (8, 1024, 784) => (8, 1024, 4, 14, 14)
    t_shape = t.shape
    t_re = fluid.layers.reshape(
        t, shape=list(theta_shape), actual_shape=theta_shape_op)
    blob_out = t_re
    blob_out = fluid.layers.conv2d(
        input=blob_out,
        num_filters=dim_out,
        filter_size=[1, 1],
        stride=[1, 1],
        padding=[0, 0],
        param_attr=ParamAttr(
            name=prefix + '_out' + "_w",
            initializer=fluid.initializer.Constant(value=0.)
            if nonlocal_params["use_zero_init_conv"] else
            fluid.initializer.Normal(
                loc=0.0, scale=nonlocal_params["conv_init_std"])),
        bias_attr=ParamAttr(
            name=prefix + '_out' + "_b",
            initializer=fluid.initializer.Constant(value=0.))
        if (nonlocal_params["no_bias"] == 0) else False,
        name=prefix + '_out')

    blob_out_shape = blob_out.shape

    if nonlocal_params["use_bn"]:
        bn_name = prefix + "_bn"
        blob_out = fluid.layers.batch_norm(
            blob_out,
            # is_test=test_mode,
            momentum=nonlocal_params["bn_momentum"],
            epsilon=nonlocal_params["bn_epsilon"],
            name=bn_name,
            param_attr=ParamAttr(
                name=bn_name + "_s",
                initializer=fluid.initializer.Constant(
                    value=nonlocal_params["bn_init_gamma"]),
                regularizer=fluid.regularizer.L2Decay(
                    nonlocal_params["weight_decay_bn"])),
            bias_attr=ParamAttr(
                name=bn_name + "_b",
                regularizer=fluid.regularizer.L2Decay(
                    nonlocal_params["weight_decay_bn"])),
            moving_mean_name=bn_name + "_rm",
            moving_variance_name=bn_name + "_riv")  # add bn

    if nonlocal_params["use_affine"]:
        affine_scale = fluid.layers.create_parameter(
            shape=[blob_out_shape[1]],
            dtype=blob_out.dtype,
            attr=ParamAttr(name=prefix + '_affine' + '_s'),
            default_initializer=fluid.initializer.Constant(value=1.))
        affine_bias = fluid.layers.create_parameter(
            shape=[blob_out_shape[1]],
            dtype=blob_out.dtype,
            attr=ParamAttr(name=prefix + '_affine' + '_b'),
            default_initializer=fluid.initializer.Constant(value=0.))
        blob_out = fluid.layers.affine_channel(
            blob_out,
            scale=affine_scale,
            bias=affine_bias,
            name=prefix + '_affine')  # add affine

    return blob_out


def add_space_nonlocal(input, dim_in, dim_out, prefix, dim_inner):
    '''
    add_space_nonlocal:
        Non-local Neural Networks: see https://arxiv.org/abs/1711.07971
    '''
    conv = space_nonlocal(input, dim_in, dim_out, prefix, dim_inner)
    output = fluid.layers.elementwise_add(input, conv, name=prefix + '_sum')
    return output
hub_module/modules/image/object_detection/yolov3_resnet50_vd_coco2017/processor.py
0 → 100644
浏览文件 @
59499f1b
# coding=utf-8
import base64
import os

import cv2
import numpy as np
from PIL import Image, ImageDraw

__all__ = ['base64_to_cv2', 'load_label_info', 'postprocess']
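

def base64_to_cv2(b64str):
    # Decode the base64 payload into raw bytes, then into a BGR image.
    data = base64.b64decode(b64str.encode('utf8'))
    # np.fromstring is deprecated for binary input; np.frombuffer is the replacement.
    data = np.frombuffer(data, np.uint8)
    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
    return data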


def check_dir(dir_path):
    if not os.path.exists(dir_path):
        os.makedirs(dir_path)
    elif os.path.isfile(dir_path):
        os.remove(dir_path)
        os.makedirs(dir_path)


def get_save_image_name(img, output_dir, image_path):
    """Get save image name from source image path.
    """
    image_name = os.path.split(image_path)[-1]
    name, ext = os.path.splitext(image_name)

    if ext == '':
        if img.format == 'PNG':
            ext = '.png'
        elif img.format == 'JPEG':
            ext = '.jpg'
        elif img.format == 'BMP':
            ext = '.bmp'
        else:
            if img.mode == "RGB" or img.mode == "L":
                ext = ".jpg"
            elif img.mode == "RGBA" or img.mode == "P":
                ext = '.png'

    return os.path.join(output_dir, "{}".format(name)) + ext
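The extension fallback in `get_save_image_name` can be exercised without Pillow. The sketch below repeats the function and feeds it stand-in objects built with `SimpleNamespace`; those stubs and the example paths are assumptions for illustration, not part of the commit.

```python
import os
from types import SimpleNamespace


def get_save_image_name(img, output_dir, image_path):
    """Same logic as in processor.py: keep the source extension if present,
    otherwise derive one from the image format, then from the image mode."""
    image_name = os.path.split(image_path)[-1]
    name, ext = os.path.splitext(image_name)
    if ext == '':
        if img.format == 'PNG':
            ext = '.png'
        elif img.format == 'JPEG':
            ext = '.jpg'
        elif img.format == 'BMP':
            ext = '.bmp'
        else:
            if img.mode == "RGB" or img.mode == "L":
                ext = ".jpg"
            elif img.mode == "RGBA" or img.mode == "P":
                ext = '.png'
    return os.path.join(output_dir, "{}".format(name)) + ext


# Hypothetical stand-ins for PIL images: only .format and .mode are read.
from_file = SimpleNamespace(format='PNG', mode='RGBA')
from_array = SimpleNamespace(format=None, mode='RGB')

kept_ext = get_save_image_name(from_file, 'out', '/data/cat.jpeg')
derived_ext = get_save_image_name(from_array, 'out', 'image_numpy_0')
print(kept_ext)
print(derived_ext)
```

An image created with `Image.fromarray` has no `format`, which is why `postprocess` ends up with a `.jpg` name for RGB arrays.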


def draw_bounding_box_on_image(image_path, data_list, save_dir):
    image = Image.open(image_path)
    draw = ImageDraw.Draw(image)
    for data in data_list:
        left, right, top, bottom = data['left'], data['right'], data[
            'top'], data['bottom']
        # draw bbox
        draw.line([(left, top), (left, bottom), (right, bottom), (right, top),
                   (left, top)],
                  width=2,
                  fill='red')
        # draw label
        if image.mode == 'RGB':
            text = data['label'] + ": %.2f%%" % (100 * data['confidence'])
            textsize_width, textsize_height = draw.textsize(text=text)
            draw.rectangle(
                xy=(left, top - (textsize_height + 5),
                    left + textsize_width + 10, top),
                fill=(255, 255, 255))
            draw.text(xy=(left, top - 15), text=text, fill=(0, 0, 0))

    save_name = get_save_image_name(image, save_dir, image_path)
    if os.path.exists(save_name):
        os.remove(save_name)

    image.save(save_name)
    return save_name


def clip_bbox(bbox, img_width, img_height):
    xmin = max(min(bbox[0], img_width), 0.)
    ymin = max(min(bbox[1], img_height), 0.)
    xmax = max(min(bbox[2], img_width), 0.)
    ymax = max(min(bbox[3], img_height), 0.)
    return float(xmin), float(ymin), float(xmax), float(ymax)
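A short check of what `clip_bbox` does with a box that overshoots the image. The box values and the 640x480 image size are made up for illustration.

```python
def clip_bbox(bbox, img_width, img_height):
    """Clamp an [xmin, ymin, xmax, ymax] box to the image bounds."""
    xmin = max(min(bbox[0], img_width), 0.)
    ymin = max(min(bbox[1], img_height), 0.)
    xmax = max(min(bbox[2], img_width), 0.)
    ymax = max(min(bbox[3], img_height), 0.)
    return float(xmin), float(ymin), float(xmax), float(ymax)


# Hypothetical detection that spills over all four borders of a 640x480 image.
clipped = clip_bbox([-12.5, -3.0, 700.0, 512.0], img_width=640, img_height=480)
# A box already inside the image is returned unchanged (as floats).
inside = clip_bbox([10, 20, 30, 40], img_width=640, img_height=480)
print(clipped)  # (0.0, 0.0, 640.0, 480.0)
print(inside)   # (10.0, 20.0, 30.0, 40.0)
```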


def load_label_info(file_path):
    with open(file_path, 'r') as fr:
        text = fr.readlines()
        label_names = []
        for info in text:
            label_names.append(info.strip())
        return label_names
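`load_label_info` reads one label per line, so a category id from the detector is simply an index into the returned list. A minimal sketch, assuming a three-line label file written to a temporary directory (the file name and labels are placeholders):

```python
import os
import tempfile


def load_label_info(file_path):
    """Read one label per line and strip surrounding whitespace."""
    with open(file_path, 'r') as fr:
        return [info.strip() for info in fr.readlines()]


with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical label file; the real module ships its own list.
    label_file = os.path.join(tmp_dir, 'label_file.txt')
    with open(label_file, 'w') as fw:
        fw.write('person\nbicycle\ncar\n')
    labels = load_label_info(label_file)

category_id = 2
print(labels[category_id])  # car
```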


def postprocess(paths,
                images,
                data_out,
                score_thresh,
                label_names,
                output_dir,
                handle_id,
                visualization=True):
    """
    Postprocess the lod_tensor produced by fluid.Executor.run.

    Args:
        paths (list[str]): The paths of images.
        images (list(numpy.ndarray)): images data, shape of each is [H, W, C]
        data_out (lod_tensor): data output of predictor.
        score_thresh (float): the lower limit of bounding box confidence.
        label_names (list[str]): label names.
        output_dir (str): The path to store output images.
        handle_id (int): The number of images that have been handled.
        visualization (bool): Whether to save image or not.

    Returns:
        res (list[dict]): The result of object detection. keys include 'data', 'save_path', the corresponding value is:
            data (list[dict]): the result of object detection, keys of each item include 'left', 'top', 'right', 'bottom', 'label', 'confidence', the corresponding value is:
                left (float): The X coordinate of the upper left corner of the bounding box;
                top (float): The Y coordinate of the upper left corner of the bounding box;
                right (float): The X coordinate of the lower right corner of the bounding box;
                bottom (float): The Y coordinate of the lower right corner of the bounding box;
                label (str): The label of detection result;
                confidence (float): The confidence of detection result.
            save_path (str): The path to save output images.
    """
    lod_tensor = data_out[0]
    lod = lod_tensor.lod[0]
    results = lod_tensor.as_ndarray()

    check_dir(output_dir)

    assert type(paths) is list, "type(paths) is not list."
    if handle_id < len(paths):
        unhandled_paths = paths[handle_id:]
        unhandled_paths_num = len(unhandled_paths)
    else:
        unhandled_paths_num = 0

    output = list()
    for index in range(len(lod) - 1):
        output_i = {'data': []}
        if index < unhandled_paths_num:
            org_img_path = unhandled_paths[index]
            org_img = Image.open(org_img_path)
        else:
            org_img = images[index - unhandled_paths_num]
            org_img = org_img.astype(np.uint8)
            # Convert BGR (OpenCV order) to RGB before handing over to PIL.
            org_img = Image.fromarray(org_img[:, :, ::-1])
            if visualization:
                org_img_path = get_save_image_name(
                    org_img, output_dir, 'image_numpy_{}'.format(
                        (handle_id + index)))
                org_img.save(org_img_path)
        org_img_height = org_img.height
        org_img_width = org_img.width
        # Rows lod[index]:lod[index + 1] belong to the index-th image.
        result_i = results[lod[index]:lod[index + 1]]
        for row in result_i:
            if len(row) != 6:
                continue
            if row[1] < score_thresh:
                continue
            category_id = int(row[0])
            confidence = row[1]
            bbox = row[2:]
            dt = {}
            dt['label'] = label_names[category_id]
            dt['confidence'] = float(confidence)
            dt['left'], dt['top'], dt['right'], dt['bottom'] = clip_bbox(
                bbox, org_img_width, org_img_height)
            output_i['data'].append(dt)

        output.append(output_i)
        if visualization:
            output_i['save_path'] = draw_bounding_box_on_image(
                org_img_path, output_i['data'], output_dir)

    return output
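The core of `postprocess` is the LoD (level-of-detail) slicing: the predictor returns all detections of a batch stacked in one array, and the offsets in `lod` say which rows belong to which image. The sketch below mimics that step with plain lists. The helper name `split_by_lod` and the sample rows are assumptions for illustration, not part of the module.

```python
def split_by_lod(rows, lod, score_thresh):
    """Group stacked detection rows per image and drop low-score rows.

    Each row is [category_id, score, xmin, ymin, xmax, ymax];
    rows lod[i]:lod[i + 1] belong to image i.
    """
    per_image = []
    for index in range(len(lod) - 1):
        kept = [
            row for row in rows[lod[index]:lod[index + 1]]
            if len(row) == 6 and row[1] >= score_thresh
        ]
        per_image.append(kept)
    return per_image


# Hypothetical batch of two images: two detections, then one detection.
rows = [
    [0, 0.90, 10, 10, 50, 50],
    [2, 0.30, 5, 5, 20, 20],
    [1, 0.75, 0, 0, 30, 40],
]
lod = [0, 2, 3]
grouped = split_by_lod(rows, lod, score_thresh=0.5)
print([len(g) for g in grouped])  # [1, 1]
```

With `lod = [0, 2, 3]` the loop runs `len(lod) - 1 = 2` times, matching the `range(len(lod) - 1)` loop in `postprocess`.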
hub_module/modules/image/object_detection/yolov3_resnet50_vd_coco2017/resnet.py
0 → 100644
View file @ 59499f1b
# coding=utf-8
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import math
from collections import OrderedDict
from numbers import Integral

from paddle import fluid
from paddle.fluid.param_attr import ParamAttr
from paddle.fluid.framework import Variable
from paddle.fluid.regularizer import L2Decay
from paddle.fluid.initializer import Constant

from .nonlocal_helper import add_space_nonlocal
from .name_adapter import NameAdapter

__all__ = ['ResNet', 'ResNetC5']


class ResNet(object):
    """
    Residual Network, see https://arxiv.org/abs/1512.03385
    Args:
        depth (int): ResNet depth, should be 34, 50.
        freeze_at (int): freeze the backbone at which stage
        norm_type (str): normalization type, 'bn'/'sync_bn'/'affine_channel'
        freeze_norm (bool): freeze normalization layers
        norm_decay (float): weight decay for normalization layer weights
        variant (str): ResNet variant, supports 'a', 'b', 'c', 'd' currently
        feature_maps (list): index of stages whose feature maps are returned
        dcn_v2_stages (list): index of stages that use deformable conv v2
        nonlocal_stages (list): index of stages that use non-local blocks
    """
    __shared__ = ['norm_type', 'freeze_norm', 'weight_prefix_name']

    def __init__(self,
                 depth=50,
                 freeze_at=0,
                 norm_type='sync_bn',
                 freeze_norm=False,
                 norm_decay=0.,
                 variant='b',
                 feature_maps=[3, 4, 5],
                 dcn_v2_stages=[],
                 weight_prefix_name='',
                 nonlocal_stages=[],
                 get_prediction=False,
                 class_dim=1000):
        super(ResNet, self).__init__()

        if isinstance(feature_maps, Integral):
            feature_maps = [feature_maps]

        # The original message was never formatted; fill in the offending depth.
        assert depth in [34, 50], \
            "depth {} not in [34, 50]".format(depth)
        assert variant in ['a', 'b', 'c', 'd'], "invalid ResNet variant"
        assert 0 <= freeze_at <= 4, "freeze_at should be 0, 1, 2, 3 or 4"
        assert len(feature_maps) > 0, "need one or more feature maps"
        assert norm_type in ['bn', 'sync_bn', 'affine_channel']
        assert not (len(nonlocal_stages) > 0 and depth < 50), \
            "non-local is not supported for resnet18 or resnet34"

        self.depth = depth
        self.freeze_at = freeze_at
        self.norm_type = norm_type
        self.norm_decay = norm_decay
        self.freeze_norm = freeze_norm
        self.variant = variant
        self._model_type = 'ResNet'
        self.feature_maps = feature_maps
        self.dcn_v2_stages = dcn_v2_stages
        self.depth_cfg = {
            34: ([3, 4, 6, 3], self.basicblock),
            50: ([3, 4, 6, 3], self.bottleneck),
        }
        self.stage_filters = [64, 128, 256, 512]
        self._c1_out_chan_num = 64
        self.na = NameAdapter(self)
        self.prefix_name = weight_prefix_name

        self.nonlocal_stages = nonlocal_stages
        self.nonlocal_mod_cfg = {
            50: 2,
            101: 5,
            152: 8,
            200: 12,
        }
        self.get_prediction = get_prediction
        self.class_dim = class_dim

    def _conv_offset(self,
                     input,
                     filter_size,
                     stride,
                     padding,
                     act=None,
                     name=None):
        out_channel = filter_size * filter_size * 3
        out = fluid.layers.conv2d(
            input,
            num_filters=out_channel,
            filter_size=filter_size,
            stride=stride,
            padding=padding,
            param_attr=ParamAttr(initializer=Constant(0.0), name=name + ".w_0"),
            bias_attr=ParamAttr(initializer=Constant(0.0), name=name + ".b_0"),
            act=act,
            name=name)
        return out

    def _conv_norm(self,
                   input,
                   num_filters,
                   filter_size,
                   stride=1,
                   groups=1,
                   act=None,
                   name=None,
                   dcn_v2=False):
        _name = self.prefix_name + name if self.prefix_name != '' else name
        if not dcn_v2:
            conv = fluid.layers.conv2d(
                input=input,
                num_filters=num_filters,
                filter_size=filter_size,
                stride=stride,
                padding=(filter_size - 1) // 2,
                groups=groups,
                act=None,
                param_attr=ParamAttr(name=_name + "_weights"),
                bias_attr=False,
                name=_name + '.conv2d.output.1')
        else:
            # select deformable conv
            offset_mask = self._conv_offset(
                input=input,
                filter_size=filter_size,
                stride=stride,
                padding=(filter_size - 1) // 2,
                act=None,
                name=_name + "_conv_offset")
            offset_channel = filter_size**2 * 2
            mask_channel = filter_size**2
            offset, mask = fluid.layers.split(
                input=offset_mask,
                num_or_sections=[offset_channel, mask_channel],
                dim=1)
            mask = fluid.layers.sigmoid(mask)
            conv = fluid.layers.deformable_conv(
                input=input,
                offset=offset,
                mask=mask,
                num_filters=num_filters,
                filter_size=filter_size,
                stride=stride,
                padding=(filter_size - 1) // 2,
                groups=groups,
                deformable_groups=1,
                im2col_step=1,
                param_attr=ParamAttr(name=_name + "_weights"),
                bias_attr=False,
                name=_name + ".conv2d.output.1")

        bn_name = self.na.fix_conv_norm_name(name)
        bn_name = self.prefix_name + bn_name if self.prefix_name != '' else bn_name

        norm_lr = 0. if self.freeze_norm else 1.
        norm_decay = self.norm_decay
        pattr = ParamAttr(
            name=bn_name + '_scale',
            learning_rate=norm_lr,
            regularizer=L2Decay(norm_decay))
        battr = ParamAttr(
            name=bn_name + '_offset',
            learning_rate=norm_lr,
            regularizer=L2Decay(norm_decay))

        if self.norm_type in ['bn', 'sync_bn']:
            global_stats = True if self.freeze_norm else False
            out = fluid.layers.batch_norm(
                input=conv,
                act=act,
                name=bn_name + '.output.1',
                param_attr=pattr,
                bias_attr=battr,
                moving_mean_name=bn_name + '_mean',
                moving_variance_name=bn_name + '_variance',
                use_global_stats=global_stats)
            scale = fluid.framework._get_var(pattr.name)
            bias = fluid.framework._get_var(battr.name)
        elif self.norm_type == 'affine_channel':
            scale = fluid.layers.create_parameter(
                shape=[conv.shape[1]],
                dtype=conv.dtype,
                attr=pattr,
                default_initializer=fluid.initializer.Constant(1.))
            bias = fluid.layers.create_parameter(
                shape=[conv.shape[1]],
                dtype=conv.dtype,
                attr=battr,
                default_initializer=fluid.initializer.Constant(0.))
            out = fluid.layers.affine_channel(
                x=conv, scale=scale, bias=bias, act=act)
        if self.freeze_norm:
            scale.stop_gradient = True
            bias.stop_gradient = True
        return out
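The deformable branch of `_conv_norm` relies on simple channel bookkeeping: `_conv_offset` emits `filter_size * filter_size * 3` channels, which are then split into `2 * k^2` offset channels and `k^2` mask channels. The sketch below only checks that arithmetic and the "same" padding rule `(k - 1) // 2`. The helper names are hypothetical and no Paddle call is involved.

```python
def dcn_channels(filter_size):
    """Channel counts used by the deformable-conv branch for a k x k kernel."""
    total = filter_size * filter_size * 3   # output channels of _conv_offset
    offset_channel = filter_size ** 2 * 2   # one (dx, dy) pair per kernel tap
    mask_channel = filter_size ** 2         # one modulation scalar per tap
    assert offset_channel + mask_channel == total
    return total, offset_channel, mask_channel


def same_padding(filter_size):
    """Padding used by _conv_norm; keeps spatial size at stride 1 for odd kernels."""
    return (filter_size - 1) // 2


print(dcn_channels(3))                         # (27, 18, 9)
print([same_padding(k) for k in (1, 3, 7)])    # [0, 1, 3]
```

Because the two split sizes always sum to the offset convolution's output width, the `fluid.layers.split` call along `dim=1` consumes every channel.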

    def _shortcut(self, input, ch_out, stride, is_first, name):
        max_pooling_in_short_cut = self.variant == 'd'
        ch_in = input.shape[1]
        # the naming rule is same as pretrained weight
        name = self.na.fix_shortcut_name(name)
        std_senet = getattr(self, 'std_senet', False)
        if ch_in != ch_out or stride != 1 or (self.depth < 50 and is_first):
            if std_senet:
                if is_first:
                    return self._conv_norm(input, ch_out, 1, stride, name=name)
                else:
                    return self._conv_norm(input, ch_out, 3, stride, name=name)
            if max_pooling_in_short_cut and not is_first:
                input = fluid.layers.pool2d(
                    input=input,
                    pool_size=2,
                    pool_stride=2,
                    pool_padding=0,
                    ceil_mode=True,
                    pool_type='avg')
                return self._conv_norm(input, ch_out, 1, 1, name=name)
            return self._conv_norm(input, ch_out, 1, stride, name=name)
        else:
            return input

    def bottleneck(self,
                   input,
                   num_filters,
                   stride,
                   is_first,
                   name,
                   dcn_v2=False):
        if self.variant == 'a':
            stride1, stride2 = stride, 1
        else:
            stride1, stride2 = 1, stride

        # ResNeXt
        groups = getattr(self, 'groups', 1)
        group_width = getattr(self, 'group_width', -1)
        if groups == 1:
            expand = 4
        elif (groups * group_width) == 256:
            expand = 1
        else:  # FIXME hard code for now, handles 32x4d, 64x4d and 32x8d
            num_filters = num_filters // 2
            expand = 2

        conv_name1, conv_name2, conv_name3, \
            shortcut_name = self.na.fix_bottleneck_name(name)
        std_senet = getattr(self, 'std_senet', False)
        if std_senet:
            conv_def = [[
                int(num_filters / 2), 1, stride1, 'relu', 1, conv_name1
            ], [num_filters, 3, stride2, 'relu', groups, conv_name2],
                        [num_filters * expand, 1, 1, None, 1, conv_name3]]
        else:
            conv_def = [[num_filters, 1, stride1, 'relu', 1, conv_name1],
                        [num_filters, 3, stride2, 'relu', groups, conv_name2],
                        [num_filters * expand, 1, 1, None, 1, conv_name3]]

        residual = input
        for i, (c, k, s, act, g, _name) in enumerate(conv_def):
            residual = self._conv_norm(
                input=residual,
                num_filters=c,
                filter_size=k,
                stride=s,
                act=act,
                groups=g,
                name=_name,
                dcn_v2=(i == 1 and dcn_v2))
        short = self._shortcut(
            input,
            num_filters * expand,
            stride,
            is_first=is_first,
            name=shortcut_name)
        # Squeeze-and-Excitation
        if callable(getattr(self, '_squeeze_excitation', None)):
            residual = self._squeeze_excitation(
                input=residual, num_channels=num_filters, name='fc' + name)
        return fluid.layers.elementwise_add(
            x=short, y=residual, act='relu', name=name + ".add.output.5")
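Two small decisions at the top of `bottleneck` shape the block: where the stride is applied, and how far the last 1x1 convolution expands the channels. The sketch below restates both as pure functions so they can be checked in isolation. The function names are hypothetical, and the ResNeXt branches only mirror the hard-coded cases visible in the source.

```python
def bottleneck_strides(variant, stride):
    """Variant 'a' strides in the first 1x1 conv; 'b', 'c', 'd' stride in the 3x3 conv."""
    if variant == 'a':
        return stride, 1
    return 1, stride


def bottleneck_widths(num_filters, groups=1, group_width=-1):
    """Return (inner_filters, output_filters) following the expand rules in the source."""
    if groups == 1:
        expand = 4
    elif groups * group_width == 256:
        expand = 1
    else:
        # Hard-coded ResNeXt case in the source (32x4d, 64x4d, 32x8d).
        num_filters = num_filters // 2
        expand = 2
    return num_filters, num_filters * expand


print(bottleneck_strides('a', 2))   # (2, 1)
print(bottleneck_strides('d', 2))   # (1, 2)
print(bottleneck_widths(64))        # (64, 256)
```

For the plain ResNet50 path used by this module (`groups == 1`), every stage therefore outputs four times its nominal filter count.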

    def basicblock(self,
                   input,
                   num_filters,
                   stride,
                   is_first,
                   name,
                   dcn_v2=False):
        assert dcn_v2 is False, "Not implemented yet."
        conv0 = self._conv_norm(
            input=input,
            num_filters=num_filters,
            filter_size=3,
            act='relu',
            stride=stride,
            name=name + "_branch2a")
        conv1 = self._conv_norm(
            input=conv0,
            num_filters=num_filters,
            filter_size=3,
            act=None,
            name=name + "_branch2b")
        short = self._shortcut(
            input, num_filters, stride, is_first, name=name + "_branch1")
        return fluid.layers.elementwise_add(x=short, y=conv1, act='relu')

    def layer_warp(self, input, stage_num):
        """
        Args:
            input (Variable): input variable.
            stage_num (int): the stage number, should be 2, 3, 4, 5

        Returns:
            The last variable in endpoint-th stage.
        """
        assert stage_num in [2, 3, 4, 5]

        stages, block_func = self.depth_cfg[self.depth]
        count = stages[stage_num - 2]

        ch_out = self.stage_filters[stage_num - 2]
        is_first = False if stage_num != 2 else True
        dcn_v2 = True if stage_num in self.dcn_v2_stages else False

        nonlocal_mod = 1000
        if stage_num in self.nonlocal_stages:
            nonlocal_mod = self.nonlocal_mod_cfg[
                self.depth] if stage_num == 4 else 2

        # Make the layer name and parameter name consistent
        # with ImageNet pre-trained model
        conv = input
        for i in range(count):
            conv_name = self.na.fix_layer_warp_name(stage_num, count, i)
            if self.depth < 50:
                is_first = True if i == 0 and stage_num == 2 else False
            conv = block_func(
                input=conv,
                num_filters=ch_out,
                stride=2 if i == 0 and stage_num != 2 else 1,
                is_first=is_first,
                name=conv_name,
                dcn_v2=dcn_v2)

            # add non local model
            dim_in = conv.shape[1]
            nonlocal_name = "nonlocal_conv{}".format(stage_num)
            if i % nonlocal_mod == nonlocal_mod - 1:
                conv = add_space_nonlocal(conv, dim_in, dim_in,
                                          nonlocal_name + '_{}'.format(i),
                                          int(dim_in / 2))
        return conv
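The per-stage layout that `layer_warp` produces for depth 50 can be tabulated without building a graph. The sketch below takes the block counts and filter widths from `depth_cfg` and `stage_filters`, and assumes the plain bottleneck expansion of 4. The starting stride of 4 comes from the stride-2 convolution plus the stride-2 max pool in `c1_stage`. The helper `stage_plan` is a hypothetical name.

```python
STAGE_BLOCKS = [3, 4, 6, 3]          # depth_cfg[50][0]
STAGE_FILTERS = [64, 128, 256, 512]  # stage_filters


def stage_plan(stage_num, expand=4):
    """Block count, output channels and per-block strides of one ResNet50 stage."""
    count = STAGE_BLOCKS[stage_num - 2]
    ch_out = STAGE_FILTERS[stage_num - 2]
    # Only the first block of stages 3-5 downsamples.
    strides = [2 if i == 0 and stage_num != 2 else 1 for i in range(count)]
    return {'blocks': count, 'out_channels': ch_out * expand, 'strides': strides}


total_stride = 4  # c1_stage: stride-2 conv followed by stride-2 max pool
stride_by_stage = {}
for stage in (2, 3, 4, 5):
    plan = stage_plan(stage)
    total_stride *= plan['strides'][0]
    stride_by_stage[stage] = total_stride
    print(stage, plan['blocks'], plan['out_channels'], total_stride)
```

With the default `feature_maps=[3, 4, 5]`, the backbone therefore hands the head feature maps at strides 8, 16 and 32.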

    def c1_stage(self, input):
        out_chan = self._c1_out_chan_num

        conv1_name = self.na.fix_c1_stage_name()

        if self.variant in ['c', 'd']:
            conv_def = [
                [out_chan // 2, 3, 2, "conv1_1"],
                [out_chan // 2, 3, 1, "conv1_2"],
                [out_chan, 3, 1, "conv1_3"],
            ]
        else:
            conv_def = [[out_chan, 7, 2, conv1_name]]

        for (c, k, s, _name) in conv_def:
            input = self._conv_norm(
                input=input,
                num_filters=c,
                filter_size=k,
                stride=s,
                act='relu',
                name=_name)

        output = fluid.layers.pool2d(
            input=input,
            pool_size=3,
            pool_stride=2,
            pool_padding=1,
            pool_type='max')
        return output

    def __call__(self, input):
        assert isinstance(input, Variable)
        assert not (set(self.feature_maps) - set([2, 3, 4, 5])), \
            "feature maps {} not in [2, 3, 4, 5]".format(self.feature_maps)

        res_endpoints = []

        res = input
        feature_maps = self.feature_maps
        severed_head = getattr(self, 'severed_head', False)
        if not severed_head:
            res = self.c1_stage(res)
            feature_maps = range(2, max(self.feature_maps) + 1)

        for i in feature_maps:
            res = self.layer_warp(res, i)
            if i in self.feature_maps:
                res_endpoints.append(res)
            if self.freeze_at >= i:
                res.stop_gradient = True
        if self.get_prediction:
            pool = fluid.layers.pool2d(
                input=res, pool_type='avg', global_pooling=True)
            stdv = 1.0 / math.sqrt(pool.shape[1] * 1.0)

            out = fluid.layers.fc(
                input=pool,
                size=self.class_dim,
                param_attr=fluid.param_attr.ParamAttr(
                    initializer=fluid.initializer.Uniform(-stdv, stdv)))
            out = fluid.layers.softmax(out)
            return out
        return OrderedDict(
            [('res{}_sum'.format(self.feature_maps[idx]), feat)
             for idx, feat in enumerate(res_endpoints)])


class ResNetC5(ResNet):
    def __init__(self,
                 depth=50,
                 freeze_at=2,
                 norm_type='affine_channel',
                 freeze_norm=True,
                 norm_decay=0.,
                 variant='b',
                 feature_maps=[5],
                 weight_prefix_name=''):
        super(ResNetC5, self).__init__(depth, freeze_at, norm_type,
                                       freeze_norm, norm_decay, variant,
                                       feature_maps)
        self.severed_head = True
hub_module/modules/image/object_detection/yolov3_resnet50_vd_coco2017/yolo_head.py
0 → 100644
View file @ 59499f1b
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from collections import OrderedDict

from paddle import fluid
from paddle.fluid.param_attr import ParamAttr
from paddle.fluid.regularizer import L2Decay

__all__ = ['MultiClassNMS', 'YOLOv3Head']


class MultiClassNMS(object):
    # __op__ = fluid.layers.multiclass_nms
    def __init__(self, background_label, keep_top_k, nms_threshold, nms_top_k,
                 normalized, score_threshold):
        super(MultiClassNMS, self).__init__()
        self.background_label = background_label
        self.keep_top_k = keep_top_k
        self.nms_threshold = nms_threshold
        self.nms_top_k = nms_top_k
        self.normalized = normalized
        self.score_threshold = score_threshold


class YOLOv3Head(object):
    """Head block for YOLOv3 network

    Args:
        norm_decay (float): weight decay for normalization layer weights
        num_classes (int): number of output classes
        ignore_thresh (float): threshold to ignore confidence loss
        label_smooth (bool): whether to use label smoothing
        anchors (list): anchors
        anchor_masks (list): anchor masks
        nms (object): an instance of `MultiClassNMS`
    """

    def __init__(self,
                 norm_decay=0.,
                 num_classes=80,
                 ignore_thresh=0.7,
                 label_smooth=True,
                 anchors=[[10, 13], [16, 30], [33, 23], [30, 61], [62, 45],
                          [59, 119], [116, 90], [156, 198], [373, 326]],
                 anchor_masks=[[6, 7, 8], [3, 4, 5], [0, 1, 2]],
                 nms=MultiClassNMS(
                     background_label=-1,
                     keep_top_k=100,
                     nms_threshold=0.45,
                     nms_top_k=1000,
                     normalized=True,
                     score_threshold=0.01),
                 weight_prefix_name=''):
        self.norm_decay = norm_decay
        self.num_classes = num_classes
        self.ignore_thresh = ignore_thresh
        self.label_smooth = label_smooth
        self.anchor_masks = anchor_masks
        self._parse_anchors(anchors)
        self.nms = nms
        self.prefix_name = weight_prefix_name

    def _conv_bn(self,
                 input,
                 ch_out,
                 filter_size,
                 stride,
                 padding,
                 act='leaky',
                 is_test=True,
                 name=None):
        conv = fluid.layers.conv2d(
            input=input,
            num_filters=ch_out,
            filter_size=filter_size,
            stride=stride,
            padding=padding,
            act=None,
            param_attr=ParamAttr(name=name + ".conv.weights"),
            bias_attr=False)

        bn_name = name + ".bn"
        bn_param_attr = ParamAttr(
            regularizer=L2Decay(self.norm_decay), name=bn_name + '.scale')
        bn_bias_attr = ParamAttr(
            regularizer=L2Decay(self.norm_decay), name=bn_name + '.offset')
        out = fluid.layers.batch_norm(
            input=conv,
            act=None,
            is_test=is_test,
            param_attr=bn_param_attr,
            bias_attr=bn_bias_attr,
            moving_mean_name=bn_name + '.mean',
            moving_variance_name=bn_name + '.var')

        if act == 'leaky':
            out = fluid.layers.leaky_relu(x=out, alpha=0.1)
        return out

    def _detection_block(self, input, channel, is_test=True, name=None):
        assert channel % 2 == 0, \
            "channel {} cannot be divided by 2 in detection block {}" \
            .format(channel, name)

        conv = input
        for j in range(2):
            conv = self._conv_bn(
                conv,
                channel,
                filter_size=1,
                stride=1,
                padding=0,
                is_test=is_test,
                name='{}.{}.0'.format(name, j))
            conv = self._conv_bn(
                conv,
                channel * 2,
                filter_size=3,
                stride=1,
                padding=1,
                is_test=is_test,
                name='{}.{}.1'.format(name, j))
        route = self._conv_bn(
            conv,
            channel,
            filter_size=1,
            stride=1,
            padding=0,
            is_test=is_test,
            name='{}.2'.format(name))
        tip = self._conv_bn(
            route,
            channel * 2,
            filter_size=3,
            stride=1,
            padding=1,
            is_test=is_test,
            name='{}.tip'.format(name))
        return route, tip

    def _upsample(self, input, scale=2, name=None):
        out = fluid.layers.resize_nearest(
            input=input, scale=float(scale), name=name)
        return out

    def _parse_anchors(self, anchors):
        """
        Check ANCHORS/ANCHOR_MASKS in config and parse mask_anchors
        """
        self.anchors = []
        self.mask_anchors = []

        assert len(anchors) > 0, "ANCHORS not set."
        assert len(self.anchor_masks) > 0, "ANCHOR_MASKS not set."

        for anchor in anchors:
            assert len(anchor) == 2, "anchor {} len should be 2".format(anchor)
            self.anchors.extend(anchor)

        anchor_num = len(anchors)
        for masks in self.anchor_masks:
            self.mask_anchors.append([])
            for mask in masks:
                assert mask < anchor_num, "anchor mask index overflow"
                self.mask_anchors[-1].extend(anchors[mask])
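To see what `_parse_anchors` builds, the sketch below runs the same flattening on the default anchors and masks from `YOLOv3Head.__init__`. The standalone function `parse_anchors` is a hypothetical stand-in for the method and returns the two lists instead of storing them on `self`.

```python
def parse_anchors(anchors, anchor_masks):
    """Flatten [w, h] anchor pairs and gather the anchors selected by each mask."""
    assert len(anchors) > 0, "ANCHORS not set."
    assert len(anchor_masks) > 0, "ANCHOR_MASKS not set."

    flat = []
    for anchor in anchors:
        assert len(anchor) == 2, "anchor {} len should be 2".format(anchor)
        flat.extend(anchor)

    mask_anchors = []
    for masks in anchor_masks:
        mask_anchors.append([])
        for mask in masks:
            assert mask < len(anchors), "anchor mask index overflow"
            mask_anchors[-1].extend(anchors[mask])
    return flat, mask_anchors


# Default values taken from YOLOv3Head.__init__.
anchors = [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45],
           [59, 119], [116, 90], [156, 198], [373, 326]]
anchor_masks = [[6, 7, 8], [3, 4, 5], [0, 1, 2]]

flat, mask_anchors = parse_anchors(anchors, anchor_masks)
print(len(flat))         # 18
print(mask_anchors[0])   # [116, 90, 156, 198, 373, 326]
```

The first mask selects the three largest anchors, which matches the first head output being computed on the coarsest feature map.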

    def _get_outputs(self, input, is_train=True):
        """
        Get YOLOv3 head output

        Args:
            input (list): List of Variables, output of backbone stages
            is_train (bool): whether in train or test mode

        Returns:
            outputs (list): Variables of each output layer
            blocks (list): the backbone feature maps that were used, deepest first
        """

        outputs = []

        # get last out_layer_num blocks in reverse order
        out_layer_num = len(self.anchor_masks)
        if isinstance(input, OrderedDict):
            blocks = list(input.values())[-1:-out_layer_num - 1:-1]
        else:
            blocks = input[-1:-out_layer_num - 1:-1]
        route = None
        for i, block in enumerate(blocks):
            if i > 0:  # perform concat in first 2 detection_block
                block = fluid.layers.concat(input=[route, block], axis=1)
            route, tip = self._detection_block(
                block,
                channel=512 // (2**i),
                is_test=(not is_train),
                name=self.prefix_name + "yolo_block.{}".format(i))

            # out channel number = mask_num * (5 + class_num)
            num_filters = len(self.anchor_masks[i]) * (self.num_classes + 5)
            block_out = fluid.layers.conv2d(
                input=tip,
                num_filters=num_filters,
                filter_size=1,
                stride=1,
                padding=0,
                act=None,
                param_attr=ParamAttr(
                    name=self.prefix_name +
                    "yolo_output.{}.conv.weights".format(i)),
                bias_attr=ParamAttr(
                    regularizer=L2Decay(0.),
                    name=self.prefix_name +
                    "yolo_output.{}.conv.bias".format(i)))
            outputs.append(block_out)

            if i < len(blocks) - 1:
                # do not perform upsample in the last detection_block
                route = self._conv_bn(
                    input=route,
                    ch_out=256 // (2**i),
                    filter_size=1,
                    stride=1,
                    padding=0,
                    is_test=(not is_train),
                    name=self.prefix_name + "yolo_transition.{}".format(i))
                # upsample
                route = self._upsample(route)

        return outputs, blocks
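The channel arithmetic in `_get_outputs` is easy to verify by hand. The sketch below recomputes, for the default three masks and 80 classes, the detection-block width `512 // 2**i`, the width of the `tip` tensor, and the output filter count `mask_num * (5 + class_num)`. It also replays the reverse slice that picks the backbone outputs. The helper name `head_channels` and the string placeholders for feature maps are assumptions for illustration.

```python
def head_channels(anchor_masks, num_classes):
    """Per-level channel counts used by the YOLOv3 head."""
    plan = []
    for i, masks in enumerate(anchor_masks):
        channel = 512 // (2 ** i)
        plan.append({
            'detection_block_channel': channel,
            'tip_channels': channel * 2,
            # 4 box coordinates + 1 objectness + num_classes scores per anchor
            'output_filters': len(masks) * (num_classes + 5),
        })
    return plan


anchor_masks = [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
plan = head_channels(anchor_masks, num_classes=80)
print([p['output_filters'] for p in plan])           # [255, 255, 255]
print([p['detection_block_channel'] for p in plan])  # [512, 256, 128]

# Placeholder names standing in for the backbone Variables.
backbone_outputs = ['res3_sum', 'res4_sum', 'res5_sum']
out_layer_num = len(anchor_masks)
blocks = backbone_outputs[-1:-out_layer_num - 1:-1]
print(blocks)  # ['res5_sum', 'res4_sum', 'res3_sum']
```

The reversed order is what lets the loop start at the deepest feature map and upsample `route` toward the shallower ones.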

    def get_prediction(self, outputs, im_size):
        """
        Get prediction result of YOLOv3 network

        Args:
            outputs (list): list of Variables, return from _get_outputs
            im_size (Variable): Variable of size([h, w]) of each image

        Returns:
            pred (Variable): The prediction result after non-max suppress.
        """
        boxes = []
        scores = []
        downsample = 32
        for i, output in enumerate(outputs):
            box, score = fluid.layers.yolo_box(
                x=output,
                img_size=im_size,
                anchors=self.mask_anchors[i],
                class_num=self.num_classes,
                conf_thresh=self.nms.score_threshold,
                downsample_ratio=downsample,
                name=self.prefix_name + "yolo_box" + str(i))
            boxes.append(box)
            scores.append(fluid.layers.transpose(score, perm=[0, 2, 1]))

            downsample //= 2

        yolo_boxes = fluid.layers.concat(boxes, axis=1)
        yolo_scores = fluid.layers.concat(scores, axis=2)
        pred = fluid.layers.multiclass_nms(
            bboxes=yolo_boxes,
            scores=yolo_scores,
            score_threshold=self.nms.score_threshold,
            nms_top_k=self.nms.nms_top_k,
            keep_top_k=self.nms.keep_top_k,
            nms_threshold=self.nms.nms_threshold,
            background_label=self.nms.background_label,
            normalized=self.nms.normalized,
            name="multiclass_nms")
        return pred
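`get_prediction` halves `downsample` after every head output, so the three levels decode boxes at strides 32, 16 and 8. The sketch below counts how many candidate boxes that yields before NMS. The 608x608 input size is an assumption chosen for illustration and is not stated in this commit, and `grid_summary` is a hypothetical helper.

```python
def grid_summary(input_size, num_levels=3, anchors_per_level=3):
    """Return (downsample_ratio, grid_size, num_boxes) for each head level."""
    downsample = 32
    levels = []
    for _ in range(num_levels):
        grid = input_size // downsample
        levels.append((downsample, grid, grid * grid * anchors_per_level))
        downsample //= 2  # same update as in get_prediction
    return levels


# Assumed square input of 608 pixels; other sizes divisible by 32 work the same way.
levels = grid_summary(608)
total_boxes = sum(num_boxes for _, _, num_boxes in levels)
for ratio, grid, num_boxes in levels:
    print(ratio, grid, num_boxes)
print(total_boxes)  # 22743
```

Under that assumption, `multiclass_nms` reduces roughly twenty thousand candidates to at most `keep_top_k=100` detections per image.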
hub_module/scripts/configs/yolov3_resnet50_vd_coco2017.yml
0 → 100644
View file @ 59499f1b
name: yolov3_resnet50_vd_coco2017
dir: "modules/image/object_detection/yolov3_resnet50_vd_coco2017"
resources:
  -
    url: https://paddlehub.bj.bcebos.com/model/cv/yolov3_resnet50_model.tar.gz
    dest: yolov3_resnet50_model
    uncompress: True