PaddlePaddle / PaddleHub
Commit c3f1e085 (unverified)
Authored Sep 18, 2021 by chenjian; committed by GitHub on Sep 18, 2021
Update object detection module README (#1589)
Parent: 13cc2d57
Showing 63 changed files with 4245 additions and 2031 deletions (+4245 -2031)
modules/image/object_detection/faster_rcnn_resnet50_coco2017/README.md (+164 -150)
modules/image/object_detection/faster_rcnn_resnet50_coco2017/bbox_head.py (+41 -13)
modules/image/object_detection/faster_rcnn_resnet50_coco2017/data_feed.py (+21 -8)
modules/image/object_detection/faster_rcnn_resnet50_coco2017/module.py (+101 -37)
modules/image/object_detection/faster_rcnn_resnet50_coco2017/nonlocal_helper.py (+6 -3)
modules/image/object_detection/faster_rcnn_resnet50_coco2017/processor.py (+46 -13)
modules/image/object_detection/faster_rcnn_resnet50_coco2017/resnet.py (+113 -30)
modules/image/object_detection/faster_rcnn_resnet50_coco2017/rpn_head.py (+66 -24)
modules/image/object_detection/faster_rcnn_resnet50_fpn_coco2017/README.md (+164 -150)
modules/image/object_detection/faster_rcnn_resnet50_fpn_coco2017/bbox_head.py (+41 -13)
modules/image/object_detection/faster_rcnn_resnet50_fpn_coco2017/data_feed.py (+21 -8)
modules/image/object_detection/faster_rcnn_resnet50_fpn_coco2017/fpn.py (+64 -19)
modules/image/object_detection/faster_rcnn_resnet50_fpn_coco2017/module.py (+112 -38)
modules/image/object_detection/faster_rcnn_resnet50_fpn_coco2017/nonlocal_helper.py (+6 -3)
modules/image/object_detection/faster_rcnn_resnet50_fpn_coco2017/processor.py (+46 -13)
modules/image/object_detection/faster_rcnn_resnet50_fpn_coco2017/resnet.py (+113 -30)
modules/image/object_detection/faster_rcnn_resnet50_fpn_coco2017/roi_extractor.py (+2 -2)
modules/image/object_detection/faster_rcnn_resnet50_fpn_coco2017/rpn_head.py (+123 -45)
modules/image/object_detection/faster_rcnn_resnet50_fpn_venus/README.md (+106 -71)
modules/image/object_detection/ssd_mobilenet_v1_pascal/README.md (+135 -100)
modules/image/object_detection/ssd_mobilenet_v1_pascal/data_feed.py (+31 -10)
modules/image/object_detection/ssd_mobilenet_v1_pascal/mobilenet_v1.py (+66 -22)
modules/image/object_detection/ssd_mobilenet_v1_pascal/module.py (+75 -23)
modules/image/object_detection/ssd_mobilenet_v1_pascal/processor.py (+47 -13)
modules/image/object_detection/ssd_vgg16_512_coco2017/README.md (+133 -100)
modules/image/object_detection/ssd_vgg16_512_coco2017/data_feed.py (+26 -8)
modules/image/object_detection/ssd_vgg16_512_coco2017/module.py (+80 -25)
modules/image/object_detection/ssd_vgg16_512_coco2017/processor.py (+47 -13)
modules/image/object_detection/ssd_vgg16_512_coco2017/vgg.py (+57 -17)
modules/image/object_detection/yolov3_darknet53_coco2017/README.md (+132 -100)
modules/image/object_detection/yolov3_darknet53_pedestrian/README.md (+134 -100)
modules/image/object_detection/yolov3_darknet53_pedestrian/darknet.py (+59 -12)
modules/image/object_detection/yolov3_darknet53_pedestrian/data_feed.py (+4 -2)
modules/image/object_detection/yolov3_darknet53_pedestrian/module.py (+72 -27)
modules/image/object_detection/yolov3_darknet53_pedestrian/processor.py (+39 -14)
modules/image/object_detection/yolov3_darknet53_pedestrian/yolo_head.py (+56 -14)
modules/image/object_detection/yolov3_darknet53_vehicles/README.md (+133 -100)
modules/image/object_detection/yolov3_darknet53_vehicles/darknet.py (+59 -12)
modules/image/object_detection/yolov3_darknet53_vehicles/data_feed.py (+4 -2)
modules/image/object_detection/yolov3_darknet53_vehicles/module.py (+72 -27)
modules/image/object_detection/yolov3_darknet53_vehicles/processor.py (+38 -14)
modules/image/object_detection/yolov3_darknet53_vehicles/yolo_head.py (+56 -14)
modules/image/object_detection/yolov3_darknet53_venus/README.md (+106 -36)
modules/image/object_detection/yolov3_mobilenet_v1_coco2017/README.md (+134 -100)
modules/image/object_detection/yolov3_mobilenet_v1_coco2017/data_feed.py (+4 -2)
modules/image/object_detection/yolov3_mobilenet_v1_coco2017/mobilenet_v1.py (+62 -20)
modules/image/object_detection/yolov3_mobilenet_v1_coco2017/module.py (+79 -27)
modules/image/object_detection/yolov3_mobilenet_v1_coco2017/processor.py (+38 -14)
modules/image/object_detection/yolov3_mobilenet_v1_coco2017/yolo_head.py (+56 -14)
modules/image/object_detection/yolov3_resnet34_coco2017/README.md (+133 -100)
modules/image/object_detection/yolov3_resnet34_coco2017/data_feed.py (+4 -2)
modules/image/object_detection/yolov3_resnet34_coco2017/module.py (+80 -27)
modules/image/object_detection/yolov3_resnet34_coco2017/nonlocal_helper.py (+6 -3)
modules/image/object_detection/yolov3_resnet34_coco2017/processor.py (+38 -14)
modules/image/object_detection/yolov3_resnet34_coco2017/resnet.py (+113 -30)
modules/image/object_detection/yolov3_resnet34_coco2017/yolo_head.py (+56 -14)
modules/image/object_detection/yolov3_resnet50_vd_coco2017/README.md (+133 -100)
modules/image/object_detection/yolov3_resnet50_vd_coco2017/data_feed.py (+4 -2)
modules/image/object_detection/yolov3_resnet50_vd_coco2017/module.py (+74 -26)
modules/image/object_detection/yolov3_resnet50_vd_coco2017/nonlocal_helper.py (+6 -3)
modules/image/object_detection/yolov3_resnet50_vd_coco2017/processor.py (+39 -14)
modules/image/object_detection/yolov3_resnet50_vd_coco2017/resnet.py (+113 -30)
modules/image/object_detection/yolov3_resnet50_vd_coco2017/yolo_head.py (+56 -14)
modules/image/object_detection/faster_rcnn_resnet50_coco2017/README.md

Removed (previous README):

## Command-Line Prediction

```shell
$ hub run faster_rcnn_resnet50_coco2017 --input_path "/PATH/TO/IMAGE"
```

## API

```python
def context(num_classes=81,
            trainable=True,
            pretrained=True,
            phase='train')
```

Extracts features for transfer learning.

**Parameters**

* num\_classes (int): number of classes;
* trainable (bool): whether the parameters are trainable;
* pretrained (bool): whether to load the pretrained model;
* phase (str): either 'train' or 'predict'; 'train' is used for training and 'predict' for prediction.

**Returns**

* inputs (dict): the model inputs, with the following entries:
    When phase is 'train':
        * image (Variable): image variable
        * im\_size (Variable): image size
        * im\_info (Variable): image scaling information
        * gt\_class (Variable): detection box class
        * gt\_box (Variable): detection box coordinates
        * is\_crowd (Variable): whether a single box contains multiple objects
    When phase is 'predict':
        * image (Variable): image variable
        * im\_size (Variable): image size
        * im\_info (Variable): image scaling information
* outputs (dict): the model outputs, with the following entries:
    When phase is 'train':
        * head_features (Variable): the extracted features
        * rpn\_cls\_loss (Variable): detection box classification loss
        * rpn\_reg\_loss (Variable): detection box regression loss
        * generate\_proposal\_labels (Variable): image information
    When phase is 'predict':
        * head_features (Variable): the extracted features
        * rois (Variable): the extracted RoIs
        * bbox\_out (Variable): prediction results
* context\_prog (Program): the Program used for transfer learning.

```python
def object_detection(paths=None,
                     images=None,
                     batch_size=1,
                     use_gpu=False,
                     output_dir='detection_result',
                     score_thresh=0.5,
                     visualization=True)
```

Prediction API that detects the positions of all objects in the input images.

**Parameters**

* paths (list\[str\]): image paths;
* images (list\[numpy.ndarray\]): image data, with ndarray.shape \[H, W, C\] in BGR format;
* batch\_size (int): batch size;
* use\_gpu (bool): whether to use the GPU;
* score\_thresh (float): confidence threshold for detections;
* visualization (bool): whether to save the detection results as image files;
* output\_dir (str): directory where images are saved, detection\_result by default;

**Returns**

* res (list\[dict\]): list of detection results; each element is a dict with the fields:
    * data (list): detections; each element is a dict with the fields:
        * confidence (float): detection confidence;
        * label (str): label;
        * left (int): x coordinate of the top-left corner of the bounding box;
        * top (int): y coordinate of the top-left corner of the bounding box;
        * right (int): x coordinate of the bottom-right corner of the bounding box;
        * bottom (int): y coordinate of the bottom-right corner of the bounding box;
    * save\_path (str, optional): path where the result is saved (present only when visualization=True).

```python
def save_inference_model(dirname,
                         model_filename=None,
                         params_filename=None,
                         combined=True)
```

Saves the model to the given path.

**Parameters**

* dirname: name of the directory that holds the model
* model\_filename: model file name, \_\_model\_\_ by default
* params\_filename: parameter file name, \_\_params\_\_ by default (takes effect only when `combined` is True)
* combined: whether to save all parameters into a single file

## Code Example

```python
import paddlehub as hub
import cv2

object_detector = hub.Module(name="faster_rcnn_resnet50_coco2017")
result = object_detector.object_detection(images=[cv2.imread('/PATH/TO/IMAGE')])
# or
# result = object_detector.object_detection(paths=['/PATH/TO/IMAGE'])
```

## Service Deployment

PaddleHub Serving can deploy an online object detection service.

## Step 1: Start PaddleHub Serving

Run the start command:

```shell
$ hub serving start -m faster_rcnn_resnet50_coco2017
```

This deploys the object detection service API; the default port is 8866.

**NOTE:** To predict on the GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it does not need to be set.

## Step 2: Send a Prediction Request

With the server configured, the few lines below send a prediction request and fetch the result:

```python
import requests
import json
import cv2
import base64


def cv2_to_base64(image):
    data = cv2.imencode('.jpg', image)[1]
    # tobytes() replaces the deprecated ndarray.tostring()
    return base64.b64encode(data.tobytes()).decode('utf8')


# Send the HTTP request
data = {'images': [cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/faster_rcnn_resnet50_coco2017"
r = requests.post(url=url, headers=headers, data=json.dumps(data))

# Print the prediction results
print(r.json()["results"])
```

### Dependencies

paddlepaddle >= 1.6.2

paddlehub >= 1.6.0

Added (rewritten README):

# faster_rcnn_resnet50_coco2017

|Model name|faster_rcnn_resnet50_coco2017|
| :--- | :---: |
|Category|Image - Object Detection|
|Network|faster_rcnn|
|Dataset|COCO2017|
|Fine-tuning supported|No|
|Model size|131MB|
|Last updated|2021-03-15|
|Metrics|-|

## I. Basic Model Information

- ### Application Effect Display
  - Sample result:
    <p align="center">
    <img src="https://user-images.githubusercontent.com/22424850/131504887-d024c7e5-fc09-4d6b-92b8-4d0c965949d0.jpg" width='50%' hspace='10'/>
    <br />
    </p>

- ### Model Introduction

  - Faster_RCNN is a two-stage object detector: it generates region proposals for an image, extracts features, classifies them, and refines the proposal boxes. The overall network has four parts: ResNet-50 as the base convolutional layers, the region proposal network, RoI Align, and the detection layer. This Faster_RCNN model is pretrained on the MS-COCO dataset. Only prediction is currently supported.
## II. Installation

- ### 1. Environment Dependencies

  - paddlepaddle >= 1.6.2

  - paddlehub >= 1.6.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

- ### 2. Installation

  - ```shell
    $ hub install faster_rcnn_resnet50_coco2017
    ```
  - If you run into problems during installation, see: [Windows installation from scratch](../../../../docs/docs_ch/get_start/windows_quickstart.md) | [Linux installation from scratch](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [MacOS installation from scratch](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Model API Prediction

- ### 1. Command-Line Prediction

  - ```shell
    $ hub run faster_rcnn_resnet50_coco2017 --input_path "/PATH/TO/IMAGE"
    ```
  - This invokes the object detection model from the command line. For more, see [PaddleHub command-line instructions](../../../../docs/docs_ch/tutorial/cmd_usage.rst)

- ### 2. Code Example

  - ```python
    import paddlehub as hub
    import cv2

    object_detector = hub.Module(name="faster_rcnn_resnet50_coco2017")
    result = object_detector.object_detection(images=[cv2.imread('/PATH/TO/IMAGE')])
    # or
    # result = object_detector.object_detection(paths=['/PATH/TO/IMAGE'])
    ```
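- ### 3. API

  - ```python
    def object_detection(paths=None,
                         images=None,
                         batch_size=1,
                         use_gpu=False,
                         output_dir='detection_result',
                         score_thresh=0.5,
                         visualization=True)
    ```

    - Prediction API that detects the positions of all objects in the input images.

    - **Parameters**

      - paths (list\[str\]): image paths; <br/>
      - images (list\[numpy.ndarray\]): image data, with ndarray.shape \[H, W, C\] in BGR format; <br/>
      - batch\_size (int): batch size;<br/>
      - use\_gpu (bool): whether to use the GPU;<br/>
      - output\_dir (str): directory where images are saved, detection\_result by default;<br/>
      - score\_thresh (float): confidence threshold for detections;<br/>
      - visualization (bool): whether to save the detection results as image files.

      **NOTE:** supply the input data through one of the two parameters, paths or images

    - **Returns**

      - res (list\[dict\]): list of detection results; each element is a dict with the fields:
        - data (list): detections; each element is a dict with the fields:
          - confidence (float): detection confidence
          - label (str): label
          - left (int): x coordinate of the top-left corner of the bounding box
          - top (int): y coordinate of the top-left corner of the bounding box
          - right (int): x coordinate of the bottom-right corner of the bounding box
          - bottom (int): y coordinate of the bottom-right corner of the bounding box
        - save\_path (str, optional): path where the result is saved (present only when visualization=True)

  - ```python
    def save_inference_model(dirname,
                             model_filename=None,
                             params_filename=None,
                             combined=True)
    ```

    - Saves the model to the given path.

    - **Parameters**

      - dirname: name of the directory that holds the model; <br/>
      - model\_filename: model file name, \_\_model\_\_ by default; <br/>
      - params\_filename: parameter file name, \_\_params\_\_ by default (takes effect only when `combined` is True);<br/>
      - combined: whether to save all parameters into a single file.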
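The structure returned by `object_detection` can be post-processed with plain Python. The following is a minimal sketch, not part of the module: `sample_res` is made-up data shaped like the documented return value, and `filter_detections` and `box_area` are hypothetical helper names introduced only for illustration.

```python
from typing import Dict, List

# Hypothetical sample shaped like the documented return value of
# object_detection(); labels and numbers are invented for illustration.
sample_res = [
    {
        "data": [
            {"confidence": 0.92, "label": "cat",
             "left": 12, "top": 30, "right": 210, "bottom": 180},
            {"confidence": 0.41, "label": "dog",
             "left": 5, "top": 8, "right": 60, "bottom": 90},
        ],
        "save_path": "detection_result/image.jpg",
    }
]


def filter_detections(res: List[Dict], min_confidence: float = 0.5) -> List[List[Dict]]:
    """Keep, per image, only detections at or above min_confidence."""
    return [
        [det for det in item["data"] if det["confidence"] >= min_confidence]
        for item in res
    ]


def box_area(det: Dict) -> int:
    """Area in pixels of a detection's bounding box."""
    width = max(0, det["right"] - det["left"])
    height = max(0, det["bottom"] - det["top"])
    return width * height


kept = filter_detections(sample_res, min_confidence=0.5)
for image_index, detections in enumerate(kept):
    for det in detections:
        print(image_index, det["label"], det["confidence"], box_area(det))
```

Filtering after the call is independent of `score_thresh`, which the module applies itself; a second pass like this is only useful when a stricter cut-off is wanted for a particular consumer.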
## IV. Service Deployment

- PaddleHub Serving can deploy an online object detection service.

- ### Step 1: Start PaddleHub Serving

  - Run the start command:
  - ```shell
    $ hub serving start -m faster_rcnn_resnet50_coco2017
    ```

  - This deploys the object detection service API; the default port is 8866.

  - **NOTE:** To predict on the GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it does not need to be set.

- ### Step 2: Send a Prediction Request

  - With the server configured, the few lines below send a prediction request and fetch the result:

  - ```python
    import requests
    import json
    import cv2
    import base64


    def cv2_to_base64(image):
        data = cv2.imencode('.jpg', image)[1]
        # tobytes() replaces the deprecated ndarray.tostring()
        return base64.b64encode(data.tobytes()).decode('utf8')


    # Send the HTTP request
    data = {'images': [cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
    headers = {"Content-type": "application/json"}
    url = "http://127.0.0.1:8866/predict/faster_rcnn_resnet50_coco2017"
    r = requests.post(url=url, headers=headers, data=json.dumps(data))

    # Print the prediction results
    print(r.json()["results"])
    ```
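The request body above is just JSON with base64-encoded JPEG bytes, so it can be built and inspected without OpenCV. The sketch below uses only the standard library. `build_payload` and `extract_labels` are hypothetical helpers, and two things are assumptions rather than facts taken from this README: that the server accepts any valid JPEG bytes (for example, a file read from disk), and that `results` in the response mirrors the `res` structure documented for `object_detection`.

```python
import base64
import json
from typing import Dict, List


def build_payload(jpeg_blobs: List[bytes]) -> str:
    """Serialize already-encoded JPEG bytes into the JSON request body."""
    images = [base64.b64encode(blob).decode("utf8") for blob in jpeg_blobs]
    return json.dumps({"images": images})


def extract_labels(response_json: Dict) -> List[List[str]]:
    """Collect labels per image, assuming 'results' mirrors the documented res."""
    return [
        [det["label"] for det in item["data"]]
        for item in response_json["results"]
    ]


# Placeholder bytes stand in for a real JPEG file's contents.
payload = build_payload([b"\xff\xd8 placeholder jpeg bytes"])

# Made-up response used only to exercise extract_labels().
fake_response = {"results": [{"data": [{"label": "cat", "confidence": 0.9}]}]}
print(extract_labels(fake_response))
```

The string returned by `build_payload` would be passed as `data=` to `requests.post`, in place of `json.dumps(data)` in the example above.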
## V. Release Notes

* 1.1.0

  Initial release

* 1.1.1

  Fixed a problem with reading numpy data

  - ```shell
    $ hub install faster_rcnn_resnet50_coco2017==1.1.1
    ```
modules/image/object_detection/faster_rcnn_resnet50_coco2017/bbox_head.py
@@ -45,11 +45,18 @@ class SmoothL1Loss(object):
     def __call__(self, x, y, inside_weight=None, outside_weight=None):
         return fluid.layers.smooth_l1(
             x, y, inside_weight=inside_weight, outside_weight=outside_weight, sigma=self.sigma)
 class BoxCoder(object):
     def __init__(self, prior_box_var=[0.1, 0.1, 0.2, 0.2], code_type='decode_center_size', box_normalized=False,
                  axis=1):
         super(BoxCoder, self).__init__()
         self.prior_box_var = prior_box_var
@@ -78,14 +85,16 @@ class TwoFCHead(object):
             act='relu',
             name='fc6',
             param_attr=ParamAttr(name='fc6_w', initializer=Xavier(fan_out=fan)),
             bias_attr=ParamAttr(name='fc6_b', learning_rate=2., regularizer=L2Decay(0.)))
         head_feat = fluid.layers.fc(
             input=fc6,
             size=self.mlp_dim,
             act='relu',
             name='fc7',
             param_attr=ParamAttr(name='fc7_w', initializer=Xavier()),
             bias_attr=ParamAttr(name='fc7_b', learning_rate=2., regularizer=L2Decay(0.)))
         return head_feat
@@ -103,7 +112,12 @@ class BBoxHead(object):
     __inject__ = ['head', 'box_coder', 'nms', 'bbox_loss']
     __shared__ = ['num_classes']
     def __init__(self, head, box_coder=BoxCoder(), nms=MultiClassNMS(), bbox_loss=SmoothL1Loss(), num_classes=81):
         super(BBoxHead, self).__init__()
         self.head = head
         self.num_classes = num_classes
@@ -140,24 +154,30 @@ class BBoxHead(object):
         head_feat = self.get_head_feat(roi_feat)
         # when ResNetC5 output a single feature map
         if not isinstance(self.head, TwoFCHead):
             head_feat = fluid.layers.pool2d(head_feat, pool_type='avg', global_pooling=True)
         cls_score = fluid.layers.fc(
             input=head_feat,
             size=self.num_classes,
             act=None,
             name='cls_score',
-            param_attr=ParamAttr(name='cls_score_w', initializer=Normal(loc=0.0, scale=0.01)),
-            bias_attr=ParamAttr(name='cls_score_b', learning_rate=2., regularizer=L2Decay(0.)))
+            param_attr=ParamAttr(
+                name='cls_score_w', initializer=Normal(loc=0.0, scale=0.01)),
+            bias_attr=ParamAttr(name='cls_score_b', learning_rate=2., regularizer=L2Decay(0.)))
         bbox_pred = fluid.layers.fc(
             input=head_feat,
             size=4 * self.num_classes,
             act=None,
             name='bbox_pred',
-            param_attr=ParamAttr(name='bbox_pred_w', initializer=Normal(loc=0.0, scale=0.001)),
-            bias_attr=ParamAttr(name='bbox_pred_b', learning_rate=2., regularizer=L2Decay(0.)))
+            param_attr=ParamAttr(
+                name='bbox_pred_w', initializer=Normal(loc=0.0, scale=0.001)),
+            bias_attr=ParamAttr(name='bbox_pred_b', learning_rate=2., regularizer=L2Decay(0.)))
         return cls_score, bbox_pred
     def get_loss(self, roi_feat, labels_int32, bbox_targets, bbox_inside_weights, bbox_outside_weights):
         """
         Get bbox_head loss.
@@ -186,11 +206,19 @@ class BBoxHead(object):
             logits=cls_score, label=labels_int64, numeric_stable_mode=True)
         loss_cls = fluid.layers.reduce_mean(loss_cls)
         loss_bbox = self.bbox_loss(
             x=bbox_pred, y=bbox_targets, inside_weight=bbox_inside_weights, outside_weight=bbox_outside_weights)
         loss_bbox = fluid.layers.reduce_mean(loss_bbox)
         return {'loss_cls': loss_cls, 'loss_bbox': loss_bbox}
     def get_prediction(self, roi_feat, rois, im_info, im_shape, return_box_score=False):
         """
         Get prediction bounding box in test stage.
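`SmoothL1Loss` in the diff above delegates to `fluid.layers.smooth_l1` with inside and outside weights and a `sigma` parameter. As a rough illustration of what such a loss computes, here is a pure-Python sketch of the commonly used sigma-parameterized smooth L1 form. The function name and the flat-list interface are hypothetical, and whether this matches the Paddle operator in every detail (for example, how it reduces over dimensions) is an assumption.

```python
from typing import Optional, Sequence


def smooth_l1_sketch(x: Sequence[float],
                     y: Sequence[float],
                     inside_weight: Optional[Sequence[float]] = None,
                     outside_weight: Optional[Sequence[float]] = None,
                     sigma: float = 1.0) -> float:
    """Sum of element-wise smooth L1 terms for one instance (illustrative)."""
    sigma2 = sigma * sigma
    n = len(x)
    inside = inside_weight if inside_weight is not None else [1.0] * n
    outside = outside_weight if outside_weight is not None else [1.0] * n
    total = 0.0
    for xi, yi, w_in, w_out in zip(x, y, inside, outside):
        diff = w_in * (xi - yi)
        abs_diff = abs(diff)
        if abs_diff < 1.0 / sigma2:
            # Quadratic region near zero
            term = 0.5 * sigma2 * diff * diff
        else:
            # Linear region for large errors
            term = abs_diff - 0.5 / sigma2
        total += w_out * term
    return total


print(smooth_l1_sketch([0.5], [0.0]))
print(smooth_l1_sketch([2.0], [0.0]))
```

In box regression, an inside weight of zero removes a coordinate from the loss, which is how targets for background proposals are typically masked out.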
modules/image/object_detection/faster_rcnn_resnet50_coco2017/data_feed.py
@@ -30,7 +30,8 @@ def test_reader(paths=None, images=None):
     img_list = list()
     if paths:
         for img_path in paths:
             assert os.path.isfile(img_path), "The {} isn't a valid file path.".format(img_path)
             img = cv2.imread(img_path).astype('float32')
             img_list.append(img)
     if images is not None:
@@ -65,7 +66,13 @@ def test_reader(paths=None, images=None):
         # im_info holds the resize info of image.
         im_info = np.array([resize_h, resize_w, im_scale]).astype('float32')
         im = cv2.resize(im, None, None, fx=im_scale, fy=im_scale, interpolation=cv2.INTER_LINEAR)
         # HWC --> CHW
         im = np.swapaxes(im, 1, 2)
@@ -74,11 +81,14 @@ def test_reader(paths=None, images=None):
 def padding_minibatch(batch_data, coarsest_stride=0, use_padded_im_info=True):
-    max_shape_org = np.array([data['image'].shape for data in batch_data]).max(axis=0)
+    max_shape_org = np.array(
+        [data['image'].shape for data in batch_data]).max(axis=0)
     if coarsest_stride > 0:
         max_shape = np.zeros((3)).astype('int32')
-        max_shape[1] = int(np.ceil(max_shape_org[1] / coarsest_stride) * coarsest_stride)
-        max_shape[2] = int(np.ceil(max_shape_org[2] / coarsest_stride) * coarsest_stride)
+        max_shape[1] = int(
+            np.ceil(max_shape_org[1] / coarsest_stride) * coarsest_stride)
+        max_shape[2] = int(
+            np.ceil(max_shape_org[2] / coarsest_stride) * coarsest_stride)
     else:
         max_shape = max_shape_org.astype('int32')
@@ -89,12 +99,15 @@ def padding_minibatch(batch_data, coarsest_stride=0, use_padded_im_info=True):
     for data in batch_data:
         im_c, im_h, im_w = data['image'].shape
         # image
         padding_im = np.zeros((im_c, max_shape[1], max_shape[2]), dtype=np.float32)
         padding_im[:, 0:im_h, 0:im_w] = data['image']
         padding_image.append(padding_im)
         # im_info
-        data['im_info'][0] = max_shape[1] if use_padded_im_info else max_shape_org[1]
-        data['im_info'][1] = max_shape[2] if use_padded_im_info else max_shape_org[2]
+        data['im_info'][
+            0] = max_shape[1] if use_padded_im_info else max_shape_org[1]
+        data['im_info'][
+            1] = max_shape[2] if use_padded_im_info else max_shape_org[2]
         padding_info.append(data['im_info'])
         padding_shape.append(data['im_shape'])
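The shape arithmetic in `padding_minibatch` can be checked in isolation. The sketch below reproduces only the step that computes the padded height and width for a batch, using the standard library instead of NumPy. `padded_hw` is a hypothetical helper name, and the shapes in the example are invented.

```python
import math
from typing import List, Tuple


def padded_hw(shapes: List[Tuple[int, int, int]], coarsest_stride: int = 0) -> Tuple[int, int]:
    """Return the (height, width) every CHW image in the batch is padded to.

    Takes the per-batch maximum and, when coarsest_stride > 0, rounds each
    dimension up to the next multiple of the stride.
    """
    max_h = max(shape[1] for shape in shapes)
    max_w = max(shape[2] for shape in shapes)
    if coarsest_stride > 0:
        max_h = int(math.ceil(max_h / coarsest_stride) * coarsest_stride)
        max_w = int(math.ceil(max_w / coarsest_stride) * coarsest_stride)
    return max_h, max_w


batch_shapes = [(3, 800, 1201), (3, 750, 1333)]
print(padded_hw(batch_shapes))
print(padded_hw(batch_shapes, coarsest_stride=32))
```

Rounding up to a stride multiple keeps feature-map sizes integral after repeated downsampling, which is presumably why the original function exposes `coarsest_stride`.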
modules/image/object_detection/faster_rcnn_resnet50_coco2017/module.py
@@ -29,16 +29,19 @@ from faster_rcnn_resnet50_coco2017.roi_extractor import RoIAlign
 @moduleinfo(
     name="faster_rcnn_resnet50_coco2017",
-    version="1.1.0",
+    version="1.1.1",
     type="cv/object_detection",
     summary="Baidu's Faster R-CNN model for object detection with backbone ResNet50, trained with dataset COCO2017",
     author="paddlepaddle",
     author_email="paddle-dev@baidu.com")
 class FasterRCNNResNet50(hub.Module):
     def _initialize(self):
         # default pretrained model, Faster-RCNN with backbone ResNet50, shape of input tensor is [3, 800, 1333]
-        self.default_pretrained_model_path = os.path.join(self.directory, "faster_rcnn_resnet50_model")
-        self.label_names = load_label_info(os.path.join(self.directory, "label_file.txt"))
+        self.default_pretrained_model_path = os.path.join(
+            self.directory, "faster_rcnn_resnet50_model")
+        self.label_names = load_label_info(
+            os.path.join(self.directory, "label_file.txt"))
         self._set_config()
     def _set_config(self):
@@ -62,7 +65,11 @@ class FasterRCNNResNet50(hub.Module):
             gpu_config.enable_use_gpu(memory_pool_init_size_mb=500, device_id=0)
             self.gpu_predictor = create_paddle_predictor(gpu_config)
     def context(self, num_classes=81, trainable=True, pretrained=True, phase='train'):
         """
         Distill the Head Features, so as to perform transfer learning.
@@ -81,24 +88,34 @@ class FasterRCNNResNet50(hub.Module):
         startup_program = fluid.Program()
         with fluid.program_guard(context_prog, startup_program):
             with fluid.unique_name.guard():
                 image = fluid.layers.data(name='image', shape=[-1, 3, -1, -1], dtype='float32')
                 # backbone
                 backbone = ResNet(norm_type='affine_channel', depth=50, feature_maps=4, freeze_at=2)
                 body_feats = backbone(image)
                 # var_prefix
                 var_prefix = '@HUB_{}@'.format(self.name)
-                im_info = fluid.layers.data(name='im_info', shape=[3], dtype='float32', lod_level=0)
-                im_shape = fluid.layers.data(name='im_shape', shape=[3], dtype='float32', lod_level=0)
+                im_info = fluid.layers.data(
+                    name='im_info', shape=[3], dtype='float32', lod_level=0)
+                im_shape = fluid.layers.data(
+                    name='im_shape', shape=[3], dtype='float32', lod_level=0)
                 body_feat_names = list(body_feats.keys())
                 # rpn_head: RPNHead
                 rpn_head = self.rpn_head()
                 rois = rpn_head.get_proposals(body_feats, im_info, mode=phase)
                 # train
                 if phase == 'train':
-                    gt_bbox = fluid.layers.data(name='gt_bbox', shape=[4], dtype='float32', lod_level=1)
-                    is_crowd = fluid.layers.data(name='is_crowd', shape=[1], dtype='int32', lod_level=1)
-                    gt_class = fluid.layers.data(name='gt_class', shape=[1], dtype='int32', lod_level=1)
+                    gt_bbox = fluid.layers.data(
+                        name='gt_bbox', shape=[4], dtype='float32', lod_level=1)
+                    is_crowd = fluid.layers.data(
+                        name='is_crowd', shape=[1], dtype='int32', lod_level=1)
+                    gt_class = fluid.layers.data(
+                        name='gt_class', shape=[1], dtype='int32', lod_level=1)
                     rpn_loss = rpn_head.get_loss(im_info, gt_bbox, is_crowd)
                     # bbox_assigner: BBoxAssigner
                     bbox_assigner = self.bbox_assigner(num_classes)
@@ -143,13 +160,18 @@ class FasterRCNNResNet50(hub.Module):
                         'is_crowd': var_prefix + is_crowd.name
                     }
                     outputs = {
-                        'head_features': var_prefix + head_feat.name,
-                        'rpn_cls_loss': var_prefix + rpn_loss['rpn_cls_loss'].name,
-                        'rpn_reg_loss': var_prefix + rpn_loss['rpn_reg_loss'].name,
-                        'generate_proposal_labels': [var_prefix + var.name for var in outs]
+                        'head_features':
+                        var_prefix + head_feat.name,
+                        'rpn_cls_loss':
+                        var_prefix + rpn_loss['rpn_cls_loss'].name,
+                        'rpn_reg_loss':
+                        var_prefix + rpn_loss['rpn_reg_loss'].name,
+                        'generate_proposal_labels':
+                        [var_prefix + var.name for var in outs]
                     }
                 elif phase == 'predict':
                     pred = bbox_head.get_prediction(roi_feat, rois, im_info, im_shape)
                     inputs = {
                         'image': var_prefix + image.name,
                         'im_info': var_prefix + im_info.name,
...
@@ -164,9 +186,13 @@ class FasterRCNNResNet50(hub.Module):
                 add_vars_prefix(startup_program, var_prefix)

                 global_vars = context_prog.global_block().vars
-                inputs = {key: global_vars[value] for key, value in inputs.items()}
+                inputs = {
+                    key: global_vars[value]
+                    for key, value in inputs.items()
+                }
                 outputs = {
-                    key: global_vars[value] if not isinstance(value, list) else [global_vars[var] for var in value]
+                    key: global_vars[value] if not isinstance(value, list) else
+                    [global_vars[var] for var in value]
                     for key, value in outputs.items()
                 }
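The comprehensions in this hunk look each prefixed name up in the program's global variables and expand list values element by element. A minimal stand-alone sketch of that lookup follows; plain dictionaries stand in for `context_prog.global_block().vars`, and the helper name `resolve` and all sample names are hypothetical, not part of the commit.

```python
# Hypothetical stand-in for context_prog.global_block().vars:
# maps prefixed variable names to variable objects (strings here).
global_vars = {
    '@HUB_demo@image': 'var:image',
    '@HUB_demo@loss_a': 'var:loss_a',
    '@HUB_demo@loss_b': 'var:loss_b',
}


def resolve(names, variables):
    """Replace each name (or list of names) with the matching variable."""
    return {
        key: variables[value] if not isinstance(value, list) else
        [variables[name] for name in value]
        for key, value in names.items()
    }


outputs = resolve(
    {
        'image': '@HUB_demo@image',
        'losses': ['@HUB_demo@loss_a', '@HUB_demo@loss_b'],
    }, global_vars)
```

Scalar entries come back as a single variable and list entries as a list of variables, which is the shape the `outputs` dictionary in the hunk relies on.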
...
@@ -182,9 +208,14 @@ class FasterRCNNResNet50(hub.Module):
                         if num_classes != 81:
                             if 'bbox_pred' in var.name or 'cls_score' in var.name:
                                 return False
-                        return os.path.exists(os.path.join(self.default_pretrained_model_path, var.name))
-
-                    fluid.io.load_vars(exe, self.default_pretrained_model_path, predicate=_if_exist)
+                        return os.path.exists(
+                            os.path.join(self.default_pretrained_model_path,
+                                         var.name))
+
+                    fluid.io.load_vars(
+                        exe,
+                        self.default_pretrained_model_path,
+                        predicate=_if_exist)
                 return inputs, outputs, context_prog

     def rpn_head(self):
...
@@ -200,8 +231,16 @@ class FasterRCNNResNet50(hub.Module):
                 rpn_negative_overlap=0.3,
                 rpn_positive_overlap=0.7,
                 rpn_straddle_thresh=0.0),
-            train_proposal=GenerateProposals(min_size=0.0, nms_thresh=0.7, post_nms_top_n=12000, pre_nms_top_n=2000),
-            test_proposal=GenerateProposals(min_size=0.0, nms_thresh=0.7, post_nms_top_n=6000, pre_nms_top_n=1000))
+            train_proposal=GenerateProposals(
+                min_size=0.0,
+                nms_thresh=0.7,
+                post_nms_top_n=12000,
+                pre_nms_top_n=2000),
+            test_proposal=GenerateProposals(
+                min_size=0.0,
+                nms_thresh=0.7,
+                post_nms_top_n=6000,
+                pre_nms_top_n=1000))

     def roi_extractor(self):
         return RoIAlign(resolution=14, sampling_ratio=0, spatial_scale=0.0625)
...
@@ -209,7 +248,8 @@ class FasterRCNNResNet50(hub.Module):
     def bbox_head(self, num_classes):
         return BBoxHead(
             head=ResNetC5(depth=50, norm_type='affine_channel'),
-            nms=MultiClassNMS(keep_top_k=100, nms_threshold=0.5, score_threshold=0.05),
+            nms=MultiClassNMS(
+                keep_top_k=100, nms_threshold=0.5, score_threshold=0.05),
             bbox_loss=SmoothL1Loss(),
             num_classes=num_classes)
...
@@ -223,7 +263,11 @@ class FasterRCNNResNet50(hub.Module):
             fg_thresh=0.5,
             class_nums=num_classes)

-    def save_inference_model(self, dirname, model_filename=None, params_filename=None, combined=True):
+    def save_inference_model(self,
+                             dirname,
+                             model_filename=None,
+                             params_filename=None,
+                             combined=True):
         if combined:
             model_filename = "__model__" if not model_filename else model_filename
             params_filename = "__params__" if not params_filename else params_filename
...
@@ -279,7 +323,7 @@ class FasterRCNNResNet50(hub.Module):
                 int(_places[0])
             except:
                 raise RuntimeError(
-                    "Environment Variable CUDA_VISIBLE_DEVICES is not set correctly. If you wanna use gpu, please set CUDA_VISIBLE_DEVICES as cuda_device_id."
+                    "Attempt to use GPU for prediction, but environment variable CUDA_VISIBLE_DEVICES was not set correctly."
                 )
         paths = paths if paths else list()
         if data and 'image' in data:
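The hunk above only rewords the error raised when the first character of `CUDA_VISIBLE_DEVICES` cannot be parsed as an integer. A minimal sketch of that check in isolation, assuming `_places` is read from the environment (the read itself is outside the hunk); the function name and the `environ` parameter are hypothetical, added so the check can be exercised without touching the real environment.

```python
import os


def check_cuda_visible_devices(environ=None):
    """Raise RuntimeError unless CUDA_VISIBLE_DEVICES starts with a digit."""
    environ = os.environ if environ is None else environ
    try:
        places = environ["CUDA_VISIBLE_DEVICES"]
        int(places[0])  # same probe as the diff: first character only
    except (KeyError, IndexError, ValueError):
        raise RuntimeError(
            "Attempt to use GPU for prediction, but environment variable "
            "CUDA_VISIBLE_DEVICES was not set correctly.")
    return places
```

The sketch catches specific exceptions instead of the bare `except:` shown in the diff; that is a deliberate difference in this illustration, not a description of the module.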
...
@@ -301,11 +345,14 @@ class FasterRCNNResNet50(hub.Module):
                 except:
                     pass
-            padding_image, padding_info, padding_shape = padding_minibatch(batch_data)
+            padding_image, padding_info, padding_shape = padding_minibatch(
+                batch_data)
             padding_image_tensor = PaddleTensor(padding_image.copy())
             padding_info_tensor = PaddleTensor(padding_info.copy())
             padding_shape_tensor = PaddleTensor(padding_shape.copy())
-            feed_list = [padding_image_tensor, padding_info_tensor, padding_shape_tensor]
+            feed_list = [
+                padding_image_tensor, padding_info_tensor, padding_shape_tensor
+            ]
             if use_gpu:
                 data_out = self.gpu_predictor.run(feed_list)
             else:
...
@@ -327,17 +374,29 @@ class FasterRCNNResNet50(hub.Module):
         Add the command config options
         """
         self.arg_config_group.add_argument(
-            '--use_gpu', type=ast.literal_eval, default=False, help="whether use GPU or not")
+            '--use_gpu',
+            type=ast.literal_eval,
+            default=False,
+            help="whether use GPU or not")

-        self.arg_config_group.add_argument('--batch_size', type=int, default=1, help="batch size for prediction")
+        self.arg_config_group.add_argument(
+            '--batch_size',
+            type=int,
+            default=1,
+            help="batch size for prediction")

     def add_module_input_arg(self):
         """
         Add the command input options
         """
-        self.arg_input_group.add_argument('--input_path', type=str, default=None, help="input data")
+        self.arg_input_group.add_argument(
+            '--input_path', type=str, default=None, help="input data")

-        self.arg_input_group.add_argument('--input_file', type=str, default=None, help="file contain input data")
+        self.arg_input_group.add_argument(
+            '--input_file',
+            type=str,
+            default=None,
+            help="file contain input data")

     def check_input_data(self, args):
         input_data = []
...
@@ -366,9 +425,12 @@ class FasterRCNNResNet50(hub.Module):
             prog="hub run {}".format(self.name),
             usage='%(prog)s',
             add_help=True)
-        self.arg_input_group = self.parser.add_argument_group(title="Input options", description="Input data. Required")
+        self.arg_input_group = self.parser.add_argument_group(
+            title="Input options", description="Input data. Required")
         self.arg_config_group = self.parser.add_argument_group(
-            title="Config options", description="Run configuration for controlling module behavior, not required.")
+            title="Config options",
+            description=
+            "Run configuration for controlling module behavior, not required.")
         self.add_module_config_arg()
         self.add_module_input_arg()
...
@@ -380,5 +442,7 @@ class FasterRCNNResNet50(hub.Module):
         else:
             for image_path in input_data:
                 if not os.path.exists(image_path):
-                    raise RuntimeError("File %s or %s is not exist." % image_path)
-        return self.object_detection(paths=input_data, use_gpu=args.use_gpu, batch_size=args.batch_size)
+                    raise RuntimeError(
+                        "File %s or %s is not exist." % image_path)
+        return self.object_detection(
+            paths=input_data,
+            use_gpu=args.use_gpu,
+            batch_size=args.batch_size)
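Both sides of this hunk keep the expression `"File %s or %s is not exist." % image_path`, which has two `%s` placeholders but a single argument. In Python that formatting step raises `TypeError` before the `RuntimeError` can be constructed. The sketch below contrasts the expression as it appears in the diff with a one-placeholder variant; both function names and the reworded message are hypothetical and not part of the commit.

```python
def message_as_in_diff(image_path):
    # Two placeholders, one argument: formatting fails with TypeError.
    return "File %s or %s is not exist." % image_path


def message_one_placeholder(image_path):
    # One placeholder per argument; the tuple guards against tuple-valued input.
    return "File %s does not exist." % (image_path, )


try:
    message_as_in_diff("missing.jpg")
    formatting_failed = False
except TypeError:
    formatting_failed = True

fixed_text = message_one_placeholder("missing.jpg")
```

Here `formatting_failed` ends up `True`, so a caller of the code in the diff would see a `TypeError` about the format string instead of the intended message.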
modules/image/object_detection/faster_rcnn_resnet50_coco2017/nonlocal_helper.py
View file @
c3f1e085
...
@@ -22,7 +22,8 @@ nonlocal_params = {
 }


-def space_nonlocal(input, dim_in, dim_out, prefix, dim_inner, max_pool_stride=2):
+def space_nonlocal(input, dim_in, dim_out, prefix, dim_inner,
+                   max_pool_stride=2):
     cur = input
     theta = fluid.layers.conv2d(input = cur, num_filters = dim_inner, \
                                 filter_size = [1, 1], stride = [1, 1], \
...
@@ -82,7 +83,8 @@ def space_nonlocal(input, dim_in, dim_out, prefix, dim_inner, max_pool_stride=2)
             theta_phi_sc = fluid.layers.scale(theta_phi, scale=dim_inner**-.5)
         else:
             theta_phi_sc = theta_phi
-        p = fluid.layers.softmax(theta_phi_sc, name=prefix + '_affinity' + '_prob')
+        p = fluid.layers.softmax(
+            theta_phi_sc, name=prefix + '_affinity' + '_prob')
     else:
         # not clear about what is doing in xlw's code
         p = None  # not implemented
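The context lines of this hunk scale the affinity scores by `dim_inner**-.5` before the softmax. A minimal pure-Python sketch of that step on a flat list of scores, offered as an illustration only: the function name is hypothetical, and the real code operates on Paddle tensors through `fluid.layers.scale` and `fluid.layers.softmax`.

```python
import math


def scaled_softmax(scores, dim_inner, use_scale=True):
    """Softmax over a list of scores, optionally scaled by dim_inner ** -0.5."""
    factor = dim_inner**-.5 if use_scale else 1.0
    scaled = [s * factor for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


probs = scaled_softmax([2.0, 0.0, -2.0], dim_inner=4)
```

With `dim_inner=4` the scores are halved before normalisation, which flattens the distribution compared with an unscaled softmax.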
...
@@ -96,7 +98,8 @@ def space_nonlocal(input, dim_in, dim_out, prefix, dim_inner, max_pool_stride=2)
     # reshape back
     # e.g. (8, 1024, 784) => (8, 1024, 4, 14, 14)
     t_shape = t.shape
-    t_re = fluid.layers.reshape(t, shape=list(theta_shape), actual_shape=theta_shape_op)
+    t_re = fluid.layers.reshape(
+        t, shape=list(theta_shape), actual_shape=theta_shape_op)
     blob_out = t_re
     blob_out = fluid.layers.conv2d(input = blob_out, num_filters = dim_out, \
                                   filter_size = [1, 1], stride = [1, 1], padding = [0, 0], \
...
modules/image/object_detection/faster_rcnn_resnet50_coco2017/processor.py
View file @
c3f1e085
...
@@ -19,6 +19,12 @@ def base64_to_cv2(b64str):
     data = cv2.imdecode(data, cv2.IMREAD_COLOR)
     return data
+def check_dir(dir_path):
+    if not os.path.exists(dir_path):
+        os.makedirs(dir_path)
+    elif os.path.isfile(dir_path):
+        os.remove(dir_path)
+        os.makedirs(dir_path)
 def get_save_image_name(img, output_dir, image_path):
     """Get save image name from source image path.
...
@@ -48,17 +54,23 @@ def draw_bounding_box_on_image(image_path, data_list, save_dir):
     image = Image.open(image_path)
     draw = ImageDraw.Draw(image)
     for data in data_list:
-        left, right, top, bottom = data['left'], data['right'], data['top'], data['bottom']
+        left, right, top, bottom = data['left'], data['right'], data[
+            'top'], data['bottom']

         # draw bbox
-        draw.line([(left, top), (left, bottom), (right, bottom), (right, top), (left, top)], width=2, fill='red')
+        draw.line([(left, top), (left, bottom), (right, bottom), (right, top),
+                   (left, top)],
+                  width=2,
+                  fill='red')

         # draw label
         if image.mode == 'RGB':
             text = data['label'] + ": %.2f%%" % (100 * data['confidence'])
             textsize_width, textsize_height = draw.textsize(text=text)
             draw.rectangle(
-                xy=(left, top - (textsize_height + 5), left + textsize_width + 10, top), fill=(255, 255, 255))
+                xy=(left, top - (textsize_height + 5),
+                    left + textsize_width + 10, top),
+                fill=(255, 255, 255))
             draw.text(xy=(left, top - 15), text=text, fill=(0, 0, 0))

     save_name = get_save_image_name(image, save_dir, image_path)
...
@@ -86,7 +98,14 @@ def load_label_info(file_path):
     return label_names


-def postprocess(paths, images, data_out, score_thresh, label_names, output_dir, handle_id, visualization=True):
+def postprocess(paths,
+                images,
+                data_out,
+                score_thresh,
+                label_names,
+                output_dir,
+                handle_id,
+                visualization=True):
     """
     postprocess the lod_tensor produced by fluid.Executor.run
...
@@ -115,16 +134,26 @@ def postprocess(paths, images, data_out, score_thresh, label_names, output_dir,
     lod = lod_tensor.lod[0]
     results = lod_tensor.as_ndarray()

-    if handle_id < len(paths):
-        unhandled_paths = paths[handle_id:]
-        unhandled_paths_num = len(unhandled_paths)
-    else:
-        unhandled_paths_num = 0
+    check_dir(output_dir)
+
+    if paths:
+        assert type(paths) is list, "type(paths) is not list."
+        if handle_id < len(paths):
+            unhandled_paths = paths[handle_id:]
+            unhandled_paths_num = len(unhandled_paths)
+        else:
+            unhandled_paths_num = 0
+    if images is not None:
+        if handle_id < len(images):
+            unhandled_paths = None
+            unhandled_paths_num = len(images) - handle_id
+        else:
+            unhandled_paths_num = 0

     output = []
     for index in range(len(lod) - 1):
         output_i = {'data': []}
-        if index < unhandled_paths_num:
+        if unhandled_paths and index < unhandled_paths_num:
             org_img_path = unhandled_paths[index]
             org_img = Image.open(org_img_path)
             output_i['path'] = org_img_path
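This hunk is the one behavioural change in `processor.py`: the remaining work is now computed separately for file paths and for in-memory images. The sketch below isolates that branch logic as a plain function so the cases can be checked by hand. The function name is hypothetical, and the initial values on the first line are an assumption, since the diff does not show how the variables start out when neither branch assigns them.

```python
def count_unhandled(paths, images, handle_id):
    """Return (unhandled_paths, unhandled_paths_num) following the new branches."""
    # Assumed defaults; the hunk does not show any initialisation.
    unhandled_paths, unhandled_paths_num = None, 0
    if paths:
        assert type(paths) is list, "type(paths) is not list."
        if handle_id < len(paths):
            unhandled_paths = paths[handle_id:]
            unhandled_paths_num = len(unhandled_paths)
        else:
            unhandled_paths_num = 0
    if images is not None:
        if handle_id < len(images):
            unhandled_paths = None
            unhandled_paths_num = len(images) - handle_id
        else:
            unhandled_paths_num = 0
    return unhandled_paths, unhandled_paths_num
```

For example, `count_unhandled(['a.jpg', 'b.jpg', 'c.jpg'], None, 1)` yields the last two paths and a count of 2, while passing only images leaves `unhandled_paths` as `None`, which is why the loop condition in the hunk now also tests `unhandled_paths`.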
...
@@ -133,7 +162,9 @@ def postprocess(paths, images, data_out, score_thresh, label_names, output_dir,
             org_img = org_img.astype(np.uint8)
             org_img = Image.fromarray(org_img[:, :, ::-1])
             if visualization:
-                org_img_path = get_save_image_name(org_img, output_dir, 'image_numpy_{}'.format((handle_id + index)))
+                org_img_path = get_save_image_name(
+                    org_img, output_dir, 'image_numpy_{}'.format(
+                        (handle_id + index)))
                 org_img.save(org_img_path)
         org_img_height = org_img.height
         org_img_width = org_img.width
...
@@ -149,11 +180,13 @@ def postprocess(paths, images, data_out, score_thresh, label_names, output_dir,
             dt = {}
             dt['label'] = label_names[category_id]
             dt['confidence'] = float(confidence)
-            dt['left'], dt['top'], dt['right'], dt['bottom'] = clip_bbox(bbox, org_img_width, org_img_height)
+            dt['left'], dt['top'], dt['right'], dt['bottom'] = clip_bbox(
+                bbox, org_img_width, org_img_height)
             output_i['data'].append(dt)

         output.append(output_i)
         if visualization:
-            output_i['save_path'] = draw_bounding_box_on_image(org_img_path, output_i['data'], output_dir)
+            output_i['save_path'] = draw_bounding_box_on_image(
+                org_img_path, output_i['data'], output_dir)

     return output
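The `check_dir` helper added to `processor.py` in this commit creates the output directory and, if a regular file already occupies that path, deletes the file first. A small self-contained run of the same logic inside a temporary directory; the directory name `detection_result` and the placeholder file are made up for the demonstration.

```python
import os
import tempfile


def check_dir(dir_path):
    """Make sure dir_path is a directory, replacing a same-named file."""
    if not os.path.exists(dir_path):
        os.makedirs(dir_path)
    elif os.path.isfile(dir_path):
        os.remove(dir_path)
        os.makedirs(dir_path)


with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, 'detection_result')
    with open(target, 'w') as handle:  # a plain file blocks the path
        handle.write('placeholder')
    check_dir(target)
    replaced_by_directory = os.path.isdir(target)
    check_dir(target)  # an existing directory is left untouched
    still_directory = os.path.isdir(target)
```

Note that the helper removes a conflicting file without asking, so it is only safe for paths the module owns.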
modules/image/object_detection/faster_rcnn_resnet50_coco2017/resnet.py
View file @
c3f1e085
...
@@ -90,7 +90,13 @@ class ResNet(object):
         self.get_prediction = get_prediction
         self.class_dim = class_dim

-    def _conv_offset(self, input, filter_size, stride, padding, act=None, name=None):
+    def _conv_offset(self,
+                     input,
+                     filter_size,
+                     stride,
+                     padding,
+                     act=None,
+                     name=None):
         out_channel = filter_size * filter_size * 3
         out = fluid.layers.conv2d(
             input,
...
@@ -104,7 +110,15 @@ class ResNet(object):
             name=name)
         return out

-    def _conv_norm(self, input, num_filters, filter_size, stride=1, groups=1, act=None, name=None, dcn_v2=False):
+    def _conv_norm(self,
+                   input,
+                   num_filters,
+                   filter_size,
+                   stride=1,
+                   groups=1,
+                   act=None,
+                   name=None,
+                   dcn_v2=False):
         _name = self.prefix_name + name if self.prefix_name != '' else name
         if not dcn_v2:
             conv = fluid.layers.conv2d(
...
@@ -129,7 +143,10 @@ class ResNet(object):
                 name=_name + "_conv_offset")
             offset_channel = filter_size**2 * 2
             mask_channel = filter_size**2
-            offset, mask = fluid.layers.split(input=offset_mask, num_or_sections=[offset_channel, mask_channel], dim=1)
+            offset, mask = fluid.layers.split(
+                input=offset_mask,
+                num_or_sections=[offset_channel, mask_channel],
+                dim=1)
             mask = fluid.layers.sigmoid(mask)
             conv = fluid.layers.deformable_conv(
                 input=input,
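The split sizes in this hunk follow from the deformable-convolution layout: every kernel tap gets a two-component offset and one mask value, which together account for the `filter_size * filter_size * 3` output channels of `_conv_offset`. A short arithmetic sketch; the function name is hypothetical.

```python
def dcn_v2_channels(filter_size):
    """Channel counts used to split the offset/mask tensor of a DCNv2 layer."""
    offset_channel = filter_size**2 * 2  # an (x, y) offset per kernel tap
    mask_channel = filter_size**2  # one modulation scalar per kernel tap
    return offset_channel, mask_channel


offset_channel, mask_channel = dcn_v2_channels(3)
total_channels = offset_channel + mask_channel
```

For a 3x3 kernel this gives 18 offset channels and 9 mask channels, matching the 27 channels produced by `_conv_offset`.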
...
@@ -151,8 +168,14 @@ class ResNet(object):
         norm_lr = 0. if self.freeze_norm else 1.
         norm_decay = self.norm_decay
-        pattr = ParamAttr(name=bn_name + '_scale', learning_rate=norm_lr, regularizer=L2Decay(norm_decay))
-        battr = ParamAttr(name=bn_name + '_offset', learning_rate=norm_lr, regularizer=L2Decay(norm_decay))
+        pattr = ParamAttr(
+            name=bn_name + '_scale',
+            learning_rate=norm_lr,
+            regularizer=L2Decay(norm_decay))
+        battr = ParamAttr(
+            name=bn_name + '_offset',
+            learning_rate=norm_lr,
+            regularizer=L2Decay(norm_decay))

         if self.norm_type in ['bn', 'sync_bn']:
             global_stats = True if self.freeze_norm else False
...
@@ -169,10 +192,17 @@ class ResNet(object):
             bias = fluid.framework._get_var(battr.name)
         elif self.norm_type == 'affine_channel':
             scale = fluid.layers.create_parameter(
-                shape=[conv.shape[1]], dtype=conv.dtype, attr=pattr, default_initializer=fluid.initializer.Constant(1.))
+                shape=[conv.shape[1]],
+                dtype=conv.dtype,
+                attr=pattr,
+                default_initializer=fluid.initializer.Constant(1.))
             bias = fluid.layers.create_parameter(
-                shape=[conv.shape[1]], dtype=conv.dtype, attr=battr, default_initializer=fluid.initializer.Constant(0.))
-            out = fluid.layers.affine_channel(x=conv, scale=scale, bias=bias, act=act)
+                shape=[conv.shape[1]],
+                dtype=conv.dtype,
+                attr=battr,
+                default_initializer=fluid.initializer.Constant(0.))
+            out = fluid.layers.affine_channel(
+                x=conv, scale=scale, bias=bias, act=act)
             if self.freeze_norm:
                 scale.stop_gradient = True
                 bias.stop_gradient = True
...
@@ -192,13 +222,24 @@ class ResNet(object):
                     return self._conv_norm(input, ch_out, 3, stride, name=name)
             if max_pooling_in_short_cut and not is_first:
                 input = fluid.layers.pool2d(
-                    input=input, pool_size=2, pool_stride=2, pool_padding=0, ceil_mode=True, pool_type='avg')
+                    input=input,
+                    pool_size=2,
+                    pool_stride=2,
+                    pool_padding=0,
+                    ceil_mode=True,
+                    pool_type='avg')
                 return self._conv_norm(input, ch_out, 1, 1, name=name)
             return self._conv_norm(input, ch_out, 1, stride, name=name)
         else:
             return input

-    def bottleneck(self, input, num_filters, stride, is_first, name, dcn_v2=False):
+    def bottleneck(self,
+                   input,
+                   num_filters,
+                   stride,
+                   is_first,
+                   name,
+                   dcn_v2=False):
         if self.variant == 'a':
             stride1, stride2 = stride, 1
         else:
...
@@ -219,8 +260,9 @@ class ResNet(object):
             shortcut_name = self.na.fix_bottleneck_name(name)
         std_senet = getattr(self, 'std_senet', False)
         if std_senet:
-            conv_def = [[int(num_filters / 2), 1, stride1, 'relu', 1, conv_name1],
-                        [num_filters, 3, stride2, 'relu', groups, conv_name2],
+            conv_def = [[
+                int(num_filters / 2), 1, stride1, 'relu', 1, conv_name1
+            ], [num_filters, 3, stride2, 'relu', groups, conv_name2],
                         [num_filters * expand, 1, 1, None, 1, conv_name3]]
         else:
             conv_def = [[num_filters, 1, stride1, 'relu', 1, conv_name1],
...
@@ -238,18 +280,42 @@ class ResNet(object):
                 groups=g,
                 name=_name,
                 dcn_v2=(i == 1 and dcn_v2))
-        short = self._shortcut(input, num_filters * expand, stride, is_first=is_first, name=shortcut_name)
+        short = self._shortcut(
+            input,
+            num_filters * expand,
+            stride,
+            is_first=is_first,
+            name=shortcut_name)
         # Squeeze-and-Excitation
         if callable(getattr(self, '_squeeze_excitation', None)):
-            residual = self._squeeze_excitation(input=residual, num_channels=num_filters, name='fc' + name)
-        return fluid.layers.elementwise_add(x=short, y=residual, act='relu', name=name + ".add.output.5")
-
-    def basicblock(self, input, num_filters, stride, is_first, name, dcn_v2=False):
+            residual = self._squeeze_excitation(
+                input=residual, num_channels=num_filters, name='fc' + name)
+        return fluid.layers.elementwise_add(
+            x=short, y=residual, act='relu', name=name + ".add.output.5")
+
+    def basicblock(self,
+                   input,
+                   num_filters,
+                   stride,
+                   is_first,
+                   name,
+                   dcn_v2=False):
         assert dcn_v2 is False, "Not implemented yet."
         conv0 = self._conv_norm(
-            input=input, num_filters=num_filters, filter_size=3, act='relu', stride=stride, name=name + "_branch2a")
-        conv1 = self._conv_norm(input=conv0, num_filters=num_filters, filter_size=3, act=None, name=name + "_branch2b")
-        short = self._shortcut(input, num_filters, stride, is_first, name=name + "_branch1")
+            input=input,
+            num_filters=num_filters,
+            filter_size=3,
+            act='relu',
+            stride=stride,
+            name=name + "_branch2a")
+        conv1 = self._conv_norm(
+            input=conv0,
+            num_filters=num_filters,
+            filter_size=3,
+            act=None,
+            name=name + "_branch2b")
+        short = self._shortcut(
+            input, num_filters, stride, is_first, name=name + "_branch1")
         return fluid.layers.elementwise_add(x=short, y=conv1, act='relu')

     def layer_warp(self, input, stage_num):
...
@@ -272,7 +338,8 @@ class ResNet(object):
         nonlocal_mod = 1000
         if stage_num in self.nonlocal_stages:
-            nonlocal_mod = self.nonlocal_mod_cfg[self.depth] if stage_num == 4 else 2
+            nonlocal_mod = self.nonlocal_mod_cfg[
+                self.depth] if stage_num == 4 else 2

         # Make the layer name and parameter name consistent
         # with ImageNet pre-trained model
...
@@ -293,7 +360,9 @@ class ResNet(object):
             dim_in = conv.shape[1]
             nonlocal_name = "nonlocal_conv{}".format(stage_num)
             if i % nonlocal_mod == nonlocal_mod - 1:
-                conv = add_space_nonlocal(conv, dim_in, dim_in, nonlocal_name + '_{}'.format(i), int(dim_in / 2))
+                conv = add_space_nonlocal(conv, dim_in, dim_in,
+                                          nonlocal_name + '_{}'.format(i),
+                                          int(dim_in / 2))
         return conv

     def c1_stage(self, input):
...
@@ -311,9 +380,20 @@ class ResNet(object):
             conv_def = [[out_chan, 7, 2, conv1_name]]

         for (c, k, s, _name) in conv_def:
-            input = self._conv_norm(input=input, num_filters=c, filter_size=k, stride=s, act='relu', name=_name)
+            input = self._conv_norm(
+                input=input,
+                num_filters=c,
+                filter_size=k,
+                stride=s,
+                act='relu',
+                name=_name)

-        output = fluid.layers.pool2d(input=input, pool_size=3, pool_stride=2, pool_padding=1, pool_type='max')
+        output = fluid.layers.pool2d(
+            input=input,
+            pool_size=3,
+            pool_stride=2,
+            pool_padding=1,
+            pool_type='max')
         return output

     def __call__(self, input):
...
@@ -337,17 +417,19 @@ class ResNet(object):
             if self.freeze_at >= i:
                 res.stop_gradient = True
         if self.get_prediction:
-            pool = fluid.layers.pool2d(input=res, pool_type='avg', global_pooling=True)
+            pool = fluid.layers.pool2d(
+                input=res, pool_type='avg', global_pooling=True)
             stdv = 1.0 / math.sqrt(pool.shape[1] * 1.0)
             out = fluid.layers.fc(
                 input=pool,
                 size=self.class_dim,
-                param_attr=fluid.param_attr.ParamAttr(initializer=fluid.initializer.Uniform(-stdv, stdv)))
+                param_attr=fluid.param_attr.ParamAttr(
+                    initializer=fluid.initializer.Uniform(-stdv, stdv)))
             out = fluid.layers.softmax(out)
             return out
         return OrderedDict(
             [('res{}_sum'.format(self.feature_maps[idx]), feat)
              for idx, feat in enumerate(res_endpoints)])


 class ResNetC5(ResNet):
...
@@ -360,5 +442,6 @@ class ResNetC5(ResNet):
                  variant='b',
                  feature_maps=[5],
                  weight_prefix_name=''):
-        super(ResNetC5, self).__init__(depth, freeze_at, norm_type, freeze_norm, norm_decay, variant, feature_maps)
+        super(ResNetC5, self).__init__(depth, freeze_at, norm_type, freeze_norm,
+                                       norm_decay, variant, feature_maps)
         self.severed_head = True
modules/image/object_detection/faster_rcnn_resnet50_coco2017/rpn_head.py
View file @
c3f1e085
...
@@ -45,7 +45,12 @@ class RPNTargetAssign(object):
 class GenerateProposals(object):
     # __op__ = fluid.layers.generate_proposals

-    def __init__(self, pre_nms_top_n=6000, post_nms_top_n=1000, nms_thresh=.5, min_size=.1, eta=1.):
+    def __init__(self,
+                 pre_nms_top_n=6000,
+                 post_nms_top_n=1000,
+                 nms_thresh=.5,
+                 min_size=.1,
+                 eta=1.):
         super(GenerateProposals, self).__init__()
         self.pre_nms_top_n = pre_nms_top_n
         self.post_nms_top_n = post_nms_top_n
...
@@ -65,9 +70,17 @@ class RPNHead(object):
         test_proposal (object): `GenerateProposals` instance for testing
         num_classes (int): number of classes in rpn output
     """
-    __inject__ = ['anchor_generator', 'rpn_target_assign', 'train_proposal', 'test_proposal']
+    __inject__ = [
+        'anchor_generator', 'rpn_target_assign', 'train_proposal',
+        'test_proposal'
+    ]

-    def __init__(self, anchor_generator, rpn_target_assign, train_proposal, test_proposal, num_classes=1):
+    def __init__(self,
+                 anchor_generator,
+                 rpn_target_assign,
+                 train_proposal,
+                 test_proposal,
+                 num_classes=1):
         super(RPNHead, self).__init__()
         self.anchor_generator = anchor_generator
         self.rpn_target_assign = rpn_target_assign
...
@@ -95,8 +108,10 @@ class RPNHead(object):
...
@@ -95,8 +108,10 @@ class RPNHead(object):
padding
=
1
,
padding
=
1
,
act
=
'relu'
,
act
=
'relu'
,
name
=
'conv_rpn'
,
name
=
'conv_rpn'
,
param_attr
=
ParamAttr
(
name
=
"conv_rpn_w"
,
initializer
=
Normal
(
loc
=
0.
,
scale
=
0.01
)),
param_attr
=
ParamAttr
(
bias_attr
=
ParamAttr
(
name
=
"conv_rpn_b"
,
learning_rate
=
2.
,
regularizer
=
L2Decay
(
0.
)))
name
=
"conv_rpn_w"
,
initializer
=
Normal
(
loc
=
0.
,
scale
=
0.01
)),
bias_attr
=
ParamAttr
(
name
=
"conv_rpn_b"
,
learning_rate
=
2.
,
regularizer
=
L2Decay
(
0.
)))
# Generate anchors self.anchor_generator
# Generate anchors self.anchor_generator
self
.
anchor
,
self
.
anchor_var
=
fluid
.
layers
.
anchor_generator
(
self
.
anchor
,
self
.
anchor_var
=
fluid
.
layers
.
anchor_generator
(
input
=
rpn_conv
,
input
=
rpn_conv
,
...
@@ -115,8 +130,13 @@ class RPNHead(object):
...
@@ -115,8 +130,13 @@ class RPNHead(object):
padding
=
0
,
padding
=
0
,
act
=
None
,
act
=
None
,
name
=
'rpn_cls_score'
,
name
=
'rpn_cls_score'
,
param_attr
=
ParamAttr
(
name
=
"rpn_cls_logits_w"
,
initializer
=
Normal
(
loc
=
0.
,
scale
=
0.01
)),
param_attr
=
ParamAttr
(
bias_attr
=
ParamAttr
(
name
=
"rpn_cls_logits_b"
,
learning_rate
=
2.
,
regularizer
=
L2Decay
(
0.
)))
name
=
"rpn_cls_logits_w"
,
initializer
=
Normal
(
loc
=
0.
,
scale
=
0.01
)),
bias_attr
=
ParamAttr
(
name
=
"rpn_cls_logits_b"
,
learning_rate
=
2.
,
regularizer
=
L2Decay
(
0.
)))
# Proposal bbox regression deltas
# Proposal bbox regression deltas
self
.
rpn_bbox_pred
=
fluid
.
layers
.
conv2d
(
self
.
rpn_bbox_pred
=
fluid
.
layers
.
conv2d
(
rpn_conv
,
rpn_conv
,
...
@@ -126,8 +146,12 @@ class RPNHead(object):
...
@@ -126,8 +146,12 @@ class RPNHead(object):
padding
=
0
,
padding
=
0
,
act
=
None
,
act
=
None
,
name
=
'rpn_bbox_pred'
,
name
=
'rpn_bbox_pred'
,
param_attr
=
ParamAttr
(
name
=
"rpn_bbox_pred_w"
,
initializer
=
Normal
(
loc
=
0.
,
scale
=
0.01
)),
param_attr
=
ParamAttr
(
bias_attr
=
ParamAttr
(
name
=
"rpn_bbox_pred_b"
,
learning_rate
=
2.
,
regularizer
=
L2Decay
(
0.
)))
name
=
"rpn_bbox_pred_w"
,
initializer
=
Normal
(
loc
=
0.
,
scale
=
0.01
)),
bias_attr
=
ParamAttr
(
name
=
"rpn_bbox_pred_b"
,
learning_rate
=
2.
,
regularizer
=
L2Decay
(
0.
)))
return
self
.
rpn_cls_score
,
self
.
rpn_bbox_pred
return
self
.
rpn_cls_score
,
self
.
rpn_bbox_pred
def
get_proposals
(
self
,
body_feats
,
im_info
,
mode
=
'train'
):
def
get_proposals
(
self
,
body_feats
,
im_info
,
mode
=
'train'
):
...
@@ -150,15 +174,22 @@ class RPNHead(object):
...
@@ -150,15 +174,22 @@ class RPNHead(object):
rpn_cls_score
,
rpn_bbox_pred
=
self
.
_get_output
(
body_feat
)
rpn_cls_score
,
rpn_bbox_pred
=
self
.
_get_output
(
body_feat
)
if
self
.
num_classes
==
1
:
if
self
.
num_classes
==
1
:
rpn_cls_prob
=
fluid
.
layers
.
sigmoid
(
rpn_cls_score
,
name
=
'rpn_cls_prob'
)
rpn_cls_prob
=
fluid
.
layers
.
sigmoid
(
rpn_cls_score
,
name
=
'rpn_cls_prob'
)
else
:
else
:
rpn_cls_score
=
fluid
.
layers
.
transpose
(
rpn_cls_score
,
perm
=
[
0
,
2
,
3
,
1
])
rpn_cls_score
=
fluid
.
layers
.
transpose
(
rpn_cls_score
=
fluid
.
layers
.
reshape
(
rpn_cls_score
,
shape
=
(
0
,
0
,
0
,
-
1
,
self
.
num_classes
))
rpn_cls_score
,
perm
=
[
0
,
2
,
3
,
1
])
rpn_cls_prob_tmp
=
fluid
.
layers
.
softmax
(
rpn_cls_score
,
use_cudnn
=
False
,
name
=
'rpn_cls_prob'
)
rpn_cls_score
=
fluid
.
layers
.
reshape
(
rpn_cls_prob_slice
=
fluid
.
layers
.
slice
(
rpn_cls_prob_tmp
,
axes
=
[
4
],
starts
=
[
1
],
ends
=
[
self
.
num_classes
])
rpn_cls_score
,
shape
=
(
0
,
0
,
0
,
-
1
,
self
.
num_classes
))
rpn_cls_prob_tmp
=
fluid
.
layers
.
softmax
(
rpn_cls_score
,
use_cudnn
=
False
,
name
=
'rpn_cls_prob'
)
rpn_cls_prob_slice
=
fluid
.
layers
.
slice
(
rpn_cls_prob_tmp
,
axes
=
[
4
],
starts
=
[
1
],
ends
=
[
self
.
num_classes
])
rpn_cls_prob
,
_
=
fluid
.
layers
.
topk
(
rpn_cls_prob_slice
,
1
)
rpn_cls_prob
,
_
=
fluid
.
layers
.
topk
(
rpn_cls_prob_slice
,
1
)
rpn_cls_prob
=
fluid
.
layers
.
reshape
(
rpn_cls_prob
,
shape
=
(
0
,
0
,
0
,
-
1
))
rpn_cls_prob
=
fluid
.
layers
.
reshape
(
rpn_cls_prob
=
fluid
.
layers
.
transpose
(
rpn_cls_prob
,
perm
=
[
0
,
3
,
1
,
2
])
rpn_cls_prob
,
shape
=
(
0
,
0
,
0
,
-
1
))
rpn_cls_prob
=
fluid
.
layers
.
transpose
(
rpn_cls_prob
,
perm
=
[
0
,
3
,
1
,
2
])
prop_op
=
self
.
train_proposal
if
mode
==
'train'
else
self
.
test_proposal
prop_op
=
self
.
train_proposal
if
mode
==
'train'
else
self
.
test_proposal
# prop_op
# prop_op
rpn_rois
,
rpn_roi_probs
=
fluid
.
layers
.
generate_proposals
(
rpn_rois
,
rpn_roi_probs
=
fluid
.
layers
.
generate_proposals
(
...
@@ -174,20 +205,24 @@ class RPNHead(object):
...
@@ -174,20 +205,24 @@ class RPNHead(object):
eta
=
prop_op
.
eta
)
eta
=
prop_op
.
eta
)
return
rpn_rois
return
rpn_rois
def
_transform_input
(
self
,
rpn_cls_score
,
rpn_bbox_pred
,
anchor
,
anchor_var
):
def
_transform_input
(
self
,
rpn_cls_score
,
rpn_bbox_pred
,
anchor
,
anchor_var
):
rpn_cls_score
=
fluid
.
layers
.
transpose
(
rpn_cls_score
,
perm
=
[
0
,
2
,
3
,
1
])
rpn_cls_score
=
fluid
.
layers
.
transpose
(
rpn_cls_score
,
perm
=
[
0
,
2
,
3
,
1
])
rpn_bbox_pred
=
fluid
.
layers
.
transpose
(
rpn_bbox_pred
,
perm
=
[
0
,
2
,
3
,
1
])
rpn_bbox_pred
=
fluid
.
layers
.
transpose
(
rpn_bbox_pred
,
perm
=
[
0
,
2
,
3
,
1
])
anchor
=
fluid
.
layers
.
reshape
(
anchor
,
shape
=
(
-
1
,
4
))
anchor
=
fluid
.
layers
.
reshape
(
anchor
,
shape
=
(
-
1
,
4
))
anchor_var
=
fluid
.
layers
.
reshape
(
anchor_var
,
shape
=
(
-
1
,
4
))
anchor_var
=
fluid
.
layers
.
reshape
(
anchor_var
,
shape
=
(
-
1
,
4
))
rpn_cls_score
=
fluid
.
layers
.
reshape
(
x
=
rpn_cls_score
,
shape
=
(
0
,
-
1
,
self
.
num_classes
))
rpn_cls_score
=
fluid
.
layers
.
reshape
(
x
=
rpn_cls_score
,
shape
=
(
0
,
-
1
,
self
.
num_classes
))
rpn_bbox_pred
=
fluid
.
layers
.
reshape
(
x
=
rpn_bbox_pred
,
shape
=
(
0
,
-
1
,
4
))
rpn_bbox_pred
=
fluid
.
layers
.
reshape
(
x
=
rpn_bbox_pred
,
shape
=
(
0
,
-
1
,
4
))
return
rpn_cls_score
,
rpn_bbox_pred
,
anchor
,
anchor_var
return
rpn_cls_score
,
rpn_bbox_pred
,
anchor
,
anchor_var
def
_get_loss_input
(
self
):
def
_get_loss_input
(
self
):
for
attr
in
[
'rpn_cls_score'
,
'rpn_bbox_pred'
,
'anchor'
,
'anchor_var'
]:
for
attr
in
[
'rpn_cls_score'
,
'rpn_bbox_pred'
,
'anchor'
,
'anchor_var'
]:
if
not
getattr
(
self
,
attr
,
None
):
if
not
getattr
(
self
,
attr
,
None
):
raise
ValueError
(
"self.{} should not be None,"
.
format
(
attr
),
"call RPNHead.get_proposals first"
)
raise
ValueError
(
"self.{} should not be None,"
.
format
(
attr
),
return
self
.
_transform_input
(
self
.
rpn_cls_score
,
self
.
rpn_bbox_pred
,
self
.
anchor
,
self
.
anchor_var
)
"call RPNHead.get_proposals first"
)
return
self
.
_transform_input
(
self
.
rpn_cls_score
,
self
.
rpn_bbox_pred
,
self
.
anchor
,
self
.
anchor_var
)
def
get_loss
(
self
,
im_info
,
gt_box
,
is_crowd
,
gt_label
=
None
):
def
get_loss
(
self
,
im_info
,
gt_box
,
is_crowd
,
gt_label
=
None
):
"""
"""
...
@@ -227,7 +262,8 @@ class RPNHead(object):
...
@@ -227,7 +262,8 @@ class RPNHead(object):
use_random
=
self
.
rpn_target_assign
.
use_random
)
use_random
=
self
.
rpn_target_assign
.
use_random
)
score_tgt
=
fluid
.
layers
.
cast
(
x
=
score_tgt
,
dtype
=
'float32'
)
score_tgt
=
fluid
.
layers
.
cast
(
x
=
score_tgt
,
dtype
=
'float32'
)
score_tgt
.
stop_gradient
=
True
score_tgt
.
stop_gradient
=
True
rpn_cls_loss
=
fluid
.
layers
.
sigmoid_cross_entropy_with_logits
(
x
=
score_pred
,
label
=
score_tgt
)
rpn_cls_loss
=
fluid
.
layers
.
sigmoid_cross_entropy_with_logits
(
x
=
score_pred
,
label
=
score_tgt
)
else
:
else
:
score_pred
,
loc_pred
,
score_tgt
,
loc_tgt
,
bbox_weight
=
\
score_pred
,
loc_pred
,
score_tgt
,
loc_tgt
,
bbox_weight
=
\
self
.
rpn_target_assign
(
self
.
rpn_target_assign
(
...
@@ -245,13 +281,19 @@ class RPNHead(object):
...
@@ -245,13 +281,19 @@ class RPNHead(object):
rpn_cls_loss
=
fluid
.
layers
.
softmax_with_cross_entropy
(
rpn_cls_loss
=
fluid
.
layers
.
softmax_with_cross_entropy
(
logits
=
score_pred
,
label
=
labels_int64
,
numeric_stable_mode
=
True
)
logits
=
score_pred
,
label
=
labels_int64
,
numeric_stable_mode
=
True
)
rpn_cls_loss
=
fluid
.
layers
.
reduce_mean
(
rpn_cls_loss
,
name
=
'loss_rpn_cls'
)
rpn_cls_loss
=
fluid
.
layers
.
reduce_mean
(
rpn_cls_loss
,
name
=
'loss_rpn_cls'
)
loc_tgt
=
fluid
.
layers
.
cast
(
x
=
loc_tgt
,
dtype
=
'float32'
)
loc_tgt
=
fluid
.
layers
.
cast
(
x
=
loc_tgt
,
dtype
=
'float32'
)
loc_tgt
.
stop_gradient
=
True
loc_tgt
.
stop_gradient
=
True
rpn_reg_loss
=
fluid
.
layers
.
smooth_l1
(
rpn_reg_loss
=
fluid
.
layers
.
smooth_l1
(
x
=
loc_pred
,
y
=
loc_tgt
,
sigma
=
3.0
,
inside_weight
=
bbox_weight
,
outside_weight
=
bbox_weight
)
x
=
loc_pred
,
rpn_reg_loss
=
fluid
.
layers
.
reduce_sum
(
rpn_reg_loss
,
name
=
'loss_rpn_bbox'
)
y
=
loc_tgt
,
sigma
=
3.0
,
inside_weight
=
bbox_weight
,
outside_weight
=
bbox_weight
)
rpn_reg_loss
=
fluid
.
layers
.
reduce_sum
(
rpn_reg_loss
,
name
=
'loss_rpn_bbox'
)
score_shape
=
fluid
.
layers
.
shape
(
score_tgt
)
score_shape
=
fluid
.
layers
.
shape
(
score_tgt
)
score_shape
=
fluid
.
layers
.
cast
(
x
=
score_shape
,
dtype
=
'float32'
)
score_shape
=
fluid
.
layers
.
cast
(
x
=
score_shape
,
dtype
=
'float32'
)
norm
=
fluid
.
layers
.
reduce_prod
(
score_shape
)
norm
=
fluid
.
layers
.
reduce_prod
(
score_shape
)
...
...
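The RPN regression loss in the hunk above is computed with `fluid.layers.smooth_l1` using `sigma=3.0` and the same `bbox_weight` for both the inside and outside weights. As a framework-free sketch of what such a loss computes, assuming the sigma-scaled smooth L1 form used by Fast R-CNN-style detectors (the helper names below are hypothetical and this is not Paddle's own kernel):

```python
def smooth_l1_elementwise(diff, sigma=3.0):
    """Sigma-scaled smooth L1 for a single residual (hypothetical helper).

    Quadratic for |diff| < 1 / sigma^2, linear beyond that point.
    """
    sigma2 = sigma * sigma
    abs_diff = abs(diff)
    if abs_diff < 1.0 / sigma2:
        return 0.5 * sigma2 * diff * diff
    return abs_diff - 0.5 / sigma2


def smooth_l1_sum(pred, target, inside_weight, outside_weight, sigma=3.0):
    """Sum the weighted loss over flat lists of equal length.

    Assumption: the inside weight scales the residual and the outside
    weight scales the resulting per-element loss.
    """
    total = 0.0
    for p, t, w_in, w_out in zip(pred, target, inside_weight, outside_weight):
        total += w_out * smooth_l1_elementwise(w_in * (p - t), sigma)
    return total


if __name__ == "__main__":
    # Made-up numbers, only to exercise the helpers.
    print(smooth_l1_sum([0.5, 2.0], [0.4, 1.0], [1.0, 1.0], [1.0, 1.0]))
```

With `sigma=3.0` the quadratic region is narrow (|diff| < 1/9), so most residuals fall on the linear branch.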
modules/image/object_detection/faster_rcnn_resnet50_fpn_coco2017/README.md
## Command-line prediction
# faster_rcnn_resnet50_fpn_coco2017
```shell
$ hub run faster_rcnn_resnet50_fpn_coco2017 --input_path "/PATH/TO/IMAGE"
```
## API
```python
def context(num_classes=81,
            trainable=True,
            pretrained=True,
            phase='train')
```
Extracts features for transfer learning.

**Parameters**

* num\_classes (int): number of classes;
* trainable (bool): whether the parameters are trainable;
* pretrained (bool): whether to load the pretrained model;
* phase (str): either 'train' or 'predict'; 'train' is used for training and 'predict' for prediction.

**Returns**

* inputs (dict): the model inputs, with the following entries:
    When phase is 'train':
        * image (Variable): image variable
        * im\_size (Variable): image size
        * im\_info (Variable): image scaling information
        * gt\_class (Variable): class of each ground-truth box
        * gt\_box (Variable): coordinates of each ground-truth box
        * is\_crowd (Variable): whether a single box contains multiple objects
    When phase is 'predict':
        * image (Variable): image variable
        * im\_size (Variable): image size
        * im\_info (Variable): image scaling information
* outputs (dict): the model outputs, with the following entries:
    When phase is 'train':
        * head_features (Variable): the extracted features
        * rpn\_cls\_loss (Variable): box classification loss
        * rpn\_reg\_loss (Variable): box regression loss
        * generate\_proposal\_labels (Variable): image information
    When phase is 'predict':
        * head_features (Variable): the extracted features
        * rois (Variable): the extracted RoIs
        * bbox\_out (Variable): prediction results
* context\_prog (Program): the Program used for transfer learning.
```python
def object_detection(paths=None,
                     images=None,
                     batch_size=1,
                     use_gpu=False,
                     output_dir='detection_result',
                     score_thresh=0.5,
                     visualization=True)
```
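Prediction API: detects the location of every object in the input images.

**Parameters**

* paths (list\[str\]): image paths;
* images (list\[numpy.ndarray\]): image data; each ndarray has shape \[H, W, C\] in BGR format;
* batch\_size (int): batch size;
* use\_gpu (bool): whether to use the GPU;
* score\_thresh (float): confidence threshold for detections;
* visualization (bool): whether to save the detection results as image files;
* output\_dir (str): directory where the images are saved; defaults to detection\_result.

**Returns**

* res (list\[dict\]): list of detection results; each element is a dict with the following fields:
    * data (list): detections; each element is a dict with the following fields:
        * confidence (float): detection confidence;
        * label (str): label;
        * left (int): x-coordinate of the bounding box's top-left corner;
        * top (int): y-coordinate of the bounding box's top-left corner;
        * right (int): x-coordinate of the bounding box's bottom-right corner;
        * bottom (int): y-coordinate of the bounding box's bottom-right corner;
    * save\_path (str, optional): path where the result is saved (present only when visualization=True).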
```python
def save_inference_model(dirname,
                         model_filename=None,
                         params_filename=None,
                         combined=True)
```
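Saves the model to the specified path.

**Parameters**

* dirname: name of the directory in which the model is saved
* model\_filename: model file name; defaults to \_\_model\_\_
* params\_filename: parameter file name; defaults to \_\_params\_\_ (takes effect only when `combined` is True)
* combined: whether to save the parameters into a single file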
## Code example

```python
import paddlehub as hub
import cv2

object_detector = hub.Module(name="faster_rcnn_resnet50_fpn_coco2017")
result = object_detector.object_detection(images=[cv2.imread('/PATH/TO/IMAGE')])
# or
# result = object_detector.object_detection(paths=['/PATH/TO/IMAGE'])
```
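## Service deployment

PaddleHub Serving can deploy an online object detection service.

## Step 1: Start PaddleHub Serving

Run the start command:

```shell
$ hub serving start -m faster_rcnn_resnet50_fpn_coco2017
```

This deploys an object detection service API; the default port is 8866.

**NOTE:** To run prediction on a GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it does not need to be set.

## Step 2: Send a prediction request

With the server configured, the few lines of code below send a prediction request and retrieve the result: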
```python
import requests
import json
import cv2
import base64


def cv2_to_base64(image):
    # Encode the image as JPEG, then base64-encode the raw bytes.
    data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')


# Send the HTTP request
data = {'images': [cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/faster_rcnn_resnet50_fpn_coco2017"
r = requests.post(url=url, headers=headers, data=json.dumps(data))

# Print the prediction result
print(r.json()["results"])
```

### Dependencies

paddlepaddle >= 1.6.2

paddlehub >= 1.6.0

|Module name|faster_rcnn_resnet50_fpn_coco2017|
| :--- | :---: |
|Category|Image - object detection|
|Network|faster_rcnn|
|Dataset|COCO2017|
|Fine-tuning supported|No|
|Model size|161MB|
|Last updated|2021-03-15|
|Metrics|-|

## 1. Basic model information

- ### Sample results

  - Example output:
    <p align="center">
    <img src="https://user-images.githubusercontent.com/22424850/131504887-d024c7e5-fc09-4d6b-92b8-4d0c965949d0.jpg" width='50%' hspace='10'/>
    <br />
    </p>

- ### Model introduction

  - Faster_RCNN is a two-stage object detector: it generates candidate regions for an image, extracts features, classifies them, and refines the candidate box positions. The overall network has four parts: ResNet-50 as the base convolutional layers, a region proposal network, RoI Align, and the detection layer. This Faster_RCNN module was pretrained on the MS-COCO dataset. It currently supports prediction only.
## 2. Installation

- ### 1. Environment dependencies

  - paddlepaddle >= 1.6.2

  - paddlehub >= 1.6.0 | [How to install paddlehub](../../../../docs/docs_ch/get_start/installation.rst)

- ### 2. Installation

  - ```shell
    $ hub install faster_rcnn_resnet50_fpn_coco2017
    ```
  - If you run into problems during installation, see: [Windows installation from scratch](../../../../docs/docs_ch/get_start/windows_quickstart.md) | [Linux installation from scratch](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [macOS installation from scratch](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## 3. Model API prediction

- ### 1. Command-line prediction

  - ```shell
    $ hub run faster_rcnn_resnet50_fpn_coco2017 --input_path "/PATH/TO/IMAGE"
    ```
  - This invokes the object detection model from the command line. For more, see [PaddleHub command-line instructions](../../../../docs/docs_ch/tutorial/cmd_usage.rst)

- ### 2. Code example

  - ```python
    import paddlehub as hub
    import cv2

    object_detector = hub.Module(name="faster_rcnn_resnet50_fpn_coco2017")
    result = object_detector.object_detection(images=[cv2.imread('/PATH/TO/IMAGE')])
    # or
    # result = object_detector.object_detection(paths=['/PATH/TO/IMAGE'])
    ```
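- ### 3. API

  - ```python
    def object_detection(paths=None,
                         images=None,
                         batch_size=1,
                         use_gpu=False,
                         output_dir='detection_result',
                         score_thresh=0.5,
                         visualization=True)
    ```

    - Prediction API: detects the location of every object in the input images.

    - **Parameters**

      - paths (list\[str\]): image paths; <br/>
      - images (list\[numpy.ndarray\]): image data; each ndarray has shape \[H, W, C\] in BGR format; <br/>
      - batch\_size (int): batch size;<br/>
      - use\_gpu (bool): whether to use the GPU;<br/>
      - output\_dir (str): directory where the images are saved; defaults to detection\_result;<br/>
      - score\_thresh (float): confidence threshold for detections;<br/>
      - visualization (bool): whether to save the detection results as image files.

      **NOTE:** supply the input data through exactly one of the two parameters, paths or images.

    - **Returns**

      - res (list\[dict\]): list of detection results; each element is a dict with the following fields:
        - data (list): detections; each element is a dict with the following fields:
          - confidence (float): detection confidence
          - label (str): label
          - left (int): x-coordinate of the bounding box's top-left corner
          - top (int): y-coordinate of the bounding box's top-left corner
          - right (int): x-coordinate of the bounding box's bottom-right corner
          - bottom (int): y-coordinate of the bounding box's bottom-right corner
        - save\_path (str, optional): path where the result is saved (present only when visualization=True)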
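The return structure described above is a list with one dict per image, each holding a `data` list of detections. A minimal sketch of walking that structure and keeping only confident detections follows; the helper name `filter_detections` and the sample values are made up for illustration and are not real model output:

```python
def filter_detections(results, min_confidence=0.5):
    """Collect (label, confidence, box) tuples from object_detection-style results.

    `results` is assumed to follow the documented layout: a list of dicts,
    each with a 'data' list whose items carry 'label', 'confidence',
    'left', 'top', 'right' and 'bottom'.
    """
    kept = []
    for image_result in results:
        for det in image_result.get('data', []):
            if det['confidence'] >= min_confidence:
                box = (det['left'], det['top'], det['right'], det['bottom'])
                kept.append((det['label'], det['confidence'], box))
    return kept


# Hypothetical sample in the documented shape (values are invented).
sample = [{
    'data': [
        {'label': 'cat', 'confidence': 0.92,
         'left': 10, 'top': 20, 'right': 110, 'bottom': 220},
        {'label': 'dog', 'confidence': 0.31,
         'left': 5, 'top': 8, 'right': 60, 'bottom': 90},
    ],
    'save_path': 'detection_result/example.jpg',
}]

for label, confidence, box in filter_detections(sample, min_confidence=0.5):
    print(label, confidence, box)
```

In practice `sample` would be replaced by the value returned from `object_detection`; the model already applies `score_thresh`, so a second filter like this is only useful for a stricter cut-off.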
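  - ```python
    def save_inference_model(dirname,
                             model_filename=None,
                             params_filename=None,
                             combined=True)
    ```

    - Saves the model to the specified path.

    - **Parameters**

      - dirname: name of the directory in which the model is saved; <br/>
      - model\_filename: model file name; defaults to \_\_model\_\_; <br/>
      - params\_filename: parameter file name; defaults to \_\_params\_\_ (takes effect only when `combined` is True);<br/>
      - combined: whether to save the parameters into a single file.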
## 4. Service deployment

- PaddleHub Serving can deploy an online object detection service.

- ### Step 1: Start PaddleHub Serving

  - Run the start command:
  - ```shell
    $ hub serving start -m faster_rcnn_resnet50_fpn_coco2017
    ```

  - This deploys an object detection service API; the default port is 8866.

  - **NOTE:** To run prediction on a GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it does not need to be set.

- ### Step 2: Send a prediction request

  - With the server configured, the few lines of code below send a prediction request and retrieve the result:

  - ```python
    import requests
    import json
    import cv2
    import base64


    def cv2_to_base64(image):
        # Encode the image as JPEG, then base64-encode the raw bytes.
        data = cv2.imencode('.jpg', image)[1]
        return base64.b64encode(data.tobytes()).decode('utf8')


    # Send the HTTP request
    data = {'images': [cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
    headers = {"Content-type": "application/json"}
    url = "http://127.0.0.1:8866/predict/faster_rcnn_resnet50_fpn_coco2017"
    r = requests.post(url=url, headers=headers, data=json.dumps(data))

    # Print the prediction result
    print(r.json()["results"])
    ```
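The request above sends a JSON body whose `images` field is a list of base64 strings. To make that payload shape concrete without OpenCV or a running server, here is a standard-library-only sketch; the helper names `build_request_body` and `decode_request_body` are hypothetical, and the bytes used are a stand-in rather than a real JPEG:

```python
import base64
import json


def build_request_body(encoded_images):
    """Build the JSON body from already-encoded image files (bytes objects)."""
    images = [base64.b64encode(raw).decode('utf8') for raw in encoded_images]
    return json.dumps({'images': images})


def decode_request_body(body):
    """Inverse of build_request_body: recover the raw bytes of each image."""
    return [base64.b64decode(item) for item in json.loads(body)['images']]


# Stand-in bytes for illustration; a real client would pass JPEG file contents.
fake_jpeg = b'\xff\xd8\xff\xe0 not a real image'
body = build_request_body([fake_jpeg])
print(decode_request_body(body) == [fake_jpeg])
```

Reading the image file as bytes and passing it through a helper like this avoids decoding and re-encoding the image on the client, at the cost of trusting that the file is already in a format the server can decode.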
## 5. Release history

* 1.0.0

  Initial release

* 1.0.1

  Fixed a numpy data reading issue

  - ```shell
    $ hub install faster_rcnn_resnet50_fpn_coco2017==1.0.1
    ```
modules/image/object_detection/faster_rcnn_resnet50_fpn_coco2017/bbox_head.py
...
@@ -45,11 +45,18 @@ class SmoothL1Loss(object):
...
@@ -45,11 +45,18 @@ class SmoothL1Loss(object):
def
__call__
(
self
,
x
,
y
,
inside_weight
=
None
,
outside_weight
=
None
):
def
__call__
(
self
,
x
,
y
,
inside_weight
=
None
,
outside_weight
=
None
):
return
fluid
.
layers
.
smooth_l1
(
return
fluid
.
layers
.
smooth_l1
(
x
,
y
,
inside_weight
=
inside_weight
,
outside_weight
=
outside_weight
,
sigma
=
self
.
sigma
)
x
,
y
,
inside_weight
=
inside_weight
,
outside_weight
=
outside_weight
,
sigma
=
self
.
sigma
)
class
BoxCoder
(
object
):
class
BoxCoder
(
object
):
def
__init__
(
self
,
prior_box_var
=
[
0.1
,
0.1
,
0.2
,
0.2
],
code_type
=
'decode_center_size'
,
box_normalized
=
False
,
def
__init__
(
self
,
prior_box_var
=
[
0.1
,
0.1
,
0.2
,
0.2
],
code_type
=
'decode_center_size'
,
box_normalized
=
False
,
axis
=
1
):
axis
=
1
):
super
(
BoxCoder
,
self
).
__init__
()
super
(
BoxCoder
,
self
).
__init__
()
self
.
prior_box_var
=
prior_box_var
self
.
prior_box_var
=
prior_box_var
...
@@ -79,14 +86,16 @@ class TwoFCHead(object):
...
@@ -79,14 +86,16 @@ class TwoFCHead(object):
act
=
'relu'
,
act
=
'relu'
,
name
=
'fc6'
,
name
=
'fc6'
,
param_attr
=
ParamAttr
(
name
=
'fc6_w'
,
initializer
=
Xavier
(
fan_out
=
fan
)),
param_attr
=
ParamAttr
(
name
=
'fc6_w'
,
initializer
=
Xavier
(
fan_out
=
fan
)),
bias_attr
=
ParamAttr
(
name
=
'fc6_b'
,
learning_rate
=
2.
,
regularizer
=
L2Decay
(
0.
)))
bias_attr
=
ParamAttr
(
name
=
'fc6_b'
,
learning_rate
=
2.
,
regularizer
=
L2Decay
(
0.
)))
head_feat
=
fluid
.
layers
.
fc
(
head_feat
=
fluid
.
layers
.
fc
(
input
=
fc6
,
input
=
fc6
,
size
=
self
.
mlp_dim
,
size
=
self
.
mlp_dim
,
act
=
'relu'
,
act
=
'relu'
,
name
=
'fc7'
,
name
=
'fc7'
,
param_attr
=
ParamAttr
(
name
=
'fc7_w'
,
initializer
=
Xavier
()),
param_attr
=
ParamAttr
(
name
=
'fc7_w'
,
initializer
=
Xavier
()),
bias_attr
=
ParamAttr
(
name
=
'fc7_b'
,
learning_rate
=
2.
,
regularizer
=
L2Decay
(
0.
)))
bias_attr
=
ParamAttr
(
name
=
'fc7_b'
,
learning_rate
=
2.
,
regularizer
=
L2Decay
(
0.
)))
return
head_feat
return
head_feat
...
@@ -104,7 +113,12 @@ class BBoxHead(object):
...
@@ -104,7 +113,12 @@ class BBoxHead(object):
__inject__
=
[
'head'
,
'box_coder'
,
'nms'
,
'bbox_loss'
]
__inject__
=
[
'head'
,
'box_coder'
,
'nms'
,
'bbox_loss'
]
__shared__
=
[
'num_classes'
]
__shared__
=
[
'num_classes'
]
def
__init__
(
self
,
head
,
box_coder
=
BoxCoder
(),
nms
=
MultiClassNMS
(),
bbox_loss
=
SmoothL1Loss
(),
num_classes
=
81
):
def
__init__
(
self
,
head
,
box_coder
=
BoxCoder
(),
nms
=
MultiClassNMS
(),
bbox_loss
=
SmoothL1Loss
(),
num_classes
=
81
):
super
(
BBoxHead
,
self
).
__init__
()
super
(
BBoxHead
,
self
).
__init__
()
self
.
head
=
head
self
.
head
=
head
self
.
num_classes
=
num_classes
self
.
num_classes
=
num_classes
...
@@ -141,24 +155,30 @@ class BBoxHead(object):
...
@@ -141,24 +155,30 @@ class BBoxHead(object):
head_feat
=
self
.
get_head_feat
(
roi_feat
)
head_feat
=
self
.
get_head_feat
(
roi_feat
)
# when ResNetC5 output a single feature map
# when ResNetC5 output a single feature map
if
not
isinstance
(
self
.
head
,
TwoFCHead
):
if
not
isinstance
(
self
.
head
,
TwoFCHead
):
head_feat
=
fluid
.
layers
.
pool2d
(
head_feat
,
pool_type
=
'avg'
,
global_pooling
=
True
)
head_feat
=
fluid
.
layers
.
pool2d
(
head_feat
,
pool_type
=
'avg'
,
global_pooling
=
True
)
cls_score
=
fluid
.
layers
.
fc
(
cls_score
=
fluid
.
layers
.
fc
(
input
=
head_feat
,
input
=
head_feat
,
size
=
self
.
num_classes
,
size
=
self
.
num_classes
,
act
=
None
,
act
=
None
,
name
=
'cls_score'
,
name
=
'cls_score'
,
param_attr
=
ParamAttr
(
name
=
'cls_score_w'
,
initializer
=
Normal
(
loc
=
0.0
,
scale
=
0.01
)),
param_attr
=
ParamAttr
(
bias_attr
=
ParamAttr
(
name
=
'cls_score_b'
,
learning_rate
=
2.
,
regularizer
=
L2Decay
(
0.
)))
name
=
'cls_score_w'
,
initializer
=
Normal
(
loc
=
0.0
,
scale
=
0.01
)),
bias_attr
=
ParamAttr
(
name
=
'cls_score_b'
,
learning_rate
=
2.
,
regularizer
=
L2Decay
(
0.
)))
bbox_pred
=
fluid
.
layers
.
fc
(
bbox_pred
=
fluid
.
layers
.
fc
(
input
=
head_feat
,
input
=
head_feat
,
size
=
4
*
self
.
num_classes
,
size
=
4
*
self
.
num_classes
,
act
=
None
,
act
=
None
,
name
=
'bbox_pred'
,
name
=
'bbox_pred'
,
param_attr
=
ParamAttr
(
name
=
'bbox_pred_w'
,
initializer
=
Normal
(
loc
=
0.0
,
scale
=
0.001
)),
param_attr
=
ParamAttr
(
bias_attr
=
ParamAttr
(
name
=
'bbox_pred_b'
,
learning_rate
=
2.
,
regularizer
=
L2Decay
(
0.
)))
name
=
'bbox_pred_w'
,
initializer
=
Normal
(
loc
=
0.0
,
scale
=
0.001
)),
bias_attr
=
ParamAttr
(
name
=
'bbox_pred_b'
,
learning_rate
=
2.
,
regularizer
=
L2Decay
(
0.
)))
return
cls_score
,
bbox_pred
return
cls_score
,
bbox_pred
def
get_loss
(
self
,
roi_feat
,
labels_int32
,
bbox_targets
,
bbox_inside_weights
,
bbox_outside_weights
):
def
get_loss
(
self
,
roi_feat
,
labels_int32
,
bbox_targets
,
bbox_inside_weights
,
bbox_outside_weights
):
"""
"""
Get bbox_head loss.
Get bbox_head loss.
...
@@ -187,11 +207,19 @@ class BBoxHead(object):
...
@@ -187,11 +207,19 @@ class BBoxHead(object):
logits
=
cls_score
,
label
=
labels_int64
,
numeric_stable_mode
=
True
)
logits
=
cls_score
,
label
=
labels_int64
,
numeric_stable_mode
=
True
)
loss_cls
=
fluid
.
layers
.
reduce_mean
(
loss_cls
)
loss_cls
=
fluid
.
layers
.
reduce_mean
(
loss_cls
)
loss_bbox
=
self
.
bbox_loss
(
loss_bbox
=
self
.
bbox_loss
(
x
=
bbox_pred
,
y
=
bbox_targets
,
inside_weight
=
bbox_inside_weights
,
outside_weight
=
bbox_outside_weights
)
x
=
bbox_pred
,
y
=
bbox_targets
,
inside_weight
=
bbox_inside_weights
,
outside_weight
=
bbox_outside_weights
)
loss_bbox
=
fluid
.
layers
.
reduce_mean
(
loss_bbox
)
loss_bbox
=
fluid
.
layers
.
reduce_mean
(
loss_bbox
)
return
{
'loss_cls'
:
loss_cls
,
'loss_bbox'
:
loss_bbox
}
return
{
'loss_cls'
:
loss_cls
,
'loss_bbox'
:
loss_bbox
}
def
get_prediction
(
self
,
roi_feat
,
rois
,
im_info
,
im_shape
,
return_box_score
=
False
):
def
get_prediction
(
self
,
roi_feat
,
rois
,
im_info
,
im_shape
,
return_box_score
=
False
):
"""
"""
Get prediction bounding box in test stage.
Get prediction bounding box in test stage.
...
...
modules/image/object_detection/faster_rcnn_resnet50_fpn_coco2017/data_feed.py
...
@@ -31,7 +31,8 @@ def test_reader(paths=None, images=None):
...
@@ -31,7 +31,8 @@ def test_reader(paths=None, images=None):
img_list
=
list
()
img_list
=
list
()
if
paths
:
if
paths
:
for
img_path
in
paths
:
for
img_path
in
paths
:
assert
os
.
path
.
isfile
(
img_path
),
"The {} isn't a valid file path."
.
format
(
img_path
)
assert
os
.
path
.
isfile
(
img_path
),
"The {} isn't a valid file path."
.
format
(
img_path
)
img
=
cv2
.
imread
(
img_path
).
astype
(
'float32'
)
img
=
cv2
.
imread
(
img_path
).
astype
(
'float32'
)
img_list
.
append
(
img
)
img_list
.
append
(
img
)
if
images
is
not
None
:
if
images
is
not
None
:
...
@@ -66,7 +67,13 @@ def test_reader(paths=None, images=None):
...
@@ -66,7 +67,13 @@ def test_reader(paths=None, images=None):
# im_info holds the resize info of image.
# im_info holds the resize info of image.
im_info
=
np
.
array
([
resize_h
,
resize_w
,
im_scale
]).
astype
(
'float32'
)
im_info
=
np
.
array
([
resize_h
,
resize_w
,
im_scale
]).
astype
(
'float32'
)
im
=
cv2
.
resize
(
im
,
None
,
None
,
fx
=
im_scale
,
fy
=
im_scale
,
interpolation
=
cv2
.
INTER_LINEAR
)
im
=
cv2
.
resize
(
im
,
None
,
None
,
fx
=
im_scale
,
fy
=
im_scale
,
interpolation
=
cv2
.
INTER_LINEAR
)
# HWC --> CHW
# HWC --> CHW
im
=
np
.
swapaxes
(
im
,
1
,
2
)
im
=
np
.
swapaxes
(
im
,
1
,
2
)
...
@@ -75,11 +82,14 @@ def test_reader(paths=None, images=None):
...
@@ -75,11 +82,14 @@ def test_reader(paths=None, images=None):
def
padding_minibatch
(
batch_data
,
coarsest_stride
=
0
,
use_padded_im_info
=
True
):
def
padding_minibatch
(
batch_data
,
coarsest_stride
=
0
,
use_padded_im_info
=
True
):
max_shape_org
=
np
.
array
([
data
[
'image'
].
shape
for
data
in
batch_data
]).
max
(
axis
=
0
)
max_shape_org
=
np
.
array
(
[
data
[
'image'
].
shape
for
data
in
batch_data
]).
max
(
axis
=
0
)
if
coarsest_stride
>
0
:
if
coarsest_stride
>
0
:
max_shape
=
np
.
zeros
((
3
)).
astype
(
'int32'
)
max_shape
=
np
.
zeros
((
3
)).
astype
(
'int32'
)
max_shape
[
1
]
=
int
(
np
.
ceil
(
max_shape_org
[
1
]
/
coarsest_stride
)
*
coarsest_stride
)
max_shape
[
1
]
=
int
(
max_shape
[
2
]
=
int
(
np
.
ceil
(
max_shape_org
[
2
]
/
coarsest_stride
)
*
coarsest_stride
)
np
.
ceil
(
max_shape_org
[
1
]
/
coarsest_stride
)
*
coarsest_stride
)
max_shape
[
2
]
=
int
(
np
.
ceil
(
max_shape_org
[
2
]
/
coarsest_stride
)
*
coarsest_stride
)
else
:
else
:
max_shape
=
max_shape_org
.
astype
(
'int32'
)
max_shape
=
max_shape_org
.
astype
(
'int32'
)
...
@@ -90,12 +100,15 @@ def padding_minibatch(batch_data, coarsest_stride=0, use_padded_im_info=True):
...
@@ -90,12 +100,15 @@ def padding_minibatch(batch_data, coarsest_stride=0, use_padded_im_info=True):
for
data
in
batch_data
:
for
data
in
batch_data
:
im_c
,
im_h
,
im_w
=
data
[
'image'
].
shape
im_c
,
im_h
,
im_w
=
data
[
'image'
].
shape
# image
# image
padding_im
=
np
.
zeros
((
im_c
,
max_shape
[
1
],
max_shape
[
2
]),
dtype
=
np
.
float32
)
padding_im
=
np
.
zeros
((
im_c
,
max_shape
[
1
],
max_shape
[
2
]),
dtype
=
np
.
float32
)
padding_im
[:,
0
:
im_h
,
0
:
im_w
]
=
data
[
'image'
]
padding_im
[:,
0
:
im_h
,
0
:
im_w
]
=
data
[
'image'
]
padding_image
.
append
(
padding_im
)
padding_image
.
append
(
padding_im
)
# im_info
# im_info
data
[
'im_info'
][
0
]
=
max_shape
[
1
]
if
use_padded_im_info
else
max_shape_org
[
1
]
data
[
'im_info'
][
data
[
'im_info'
][
1
]
=
max_shape
[
2
]
if
use_padded_im_info
else
max_shape_org
[
2
]
0
]
=
max_shape
[
1
]
if
use_padded_im_info
else
max_shape_org
[
1
]
data
[
'im_info'
][
1
]
=
max_shape
[
2
]
if
use_padded_im_info
else
max_shape_org
[
2
]
padding_info
.
append
(
data
[
'im_info'
])
padding_info
.
append
(
data
[
'im_info'
])
padding_shape
.
append
(
data
[
'im_shape'
])
padding_shape
.
append
(
data
[
'im_shape'
])
...
...
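`padding_minibatch` in the hunks above rounds the largest height and width in a batch up to a multiple of `coarsest_stride` before zero-padding each image. A small NumPy-free sketch of that rounding step follows; the helper names `round_up_to_stride` and `padded_hw` are hypothetical and the shapes are invented:

```python
import math


def round_up_to_stride(size, coarsest_stride):
    """Round `size` up to the next multiple of `coarsest_stride`.

    A stride of 0 (or less) leaves the size unchanged, mirroring the
    `coarsest_stride > 0` check in the source.
    """
    if coarsest_stride <= 0:
        return int(size)
    return int(math.ceil(size / coarsest_stride) * coarsest_stride)


def padded_hw(shapes, coarsest_stride=0):
    """Return the padded (height, width) for a batch of (C, H, W) shapes."""
    max_h = max(shape[1] for shape in shapes)
    max_w = max(shape[2] for shape in shapes)
    return (round_up_to_stride(max_h, coarsest_stride),
            round_up_to_stride(max_w, coarsest_stride))


if __name__ == "__main__":
    batch_shapes = [(3, 600, 801), (3, 608, 750)]  # invented example shapes
    print(padded_hw(batch_shapes, coarsest_stride=32))
```

Padding to a stride multiple keeps every feature-pyramid level at an integer size, which is presumably why the FPN variant passes a non-zero stride.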
modules/image/object_detection/faster_rcnn_resnet50_fpn_coco2017/fpn.py
...
@@ -52,13 +52,22 @@ def ConvNorm(input,
...
@@ -52,13 +52,22 @@ def ConvNorm(input,
dilation
=
dilation
,
dilation
=
dilation
,
groups
=
groups
,
groups
=
groups
,
act
=
None
,
act
=
None
,
param_attr
=
ParamAttr
(
name
=
name
+
"_weights"
,
initializer
=
initializer
,
learning_rate
=
lr_scale
),
param_attr
=
ParamAttr
(
name
=
name
+
"_weights"
,
initializer
=
initializer
,
learning_rate
=
lr_scale
),
bias_attr
=
False
,
bias_attr
=
False
,
name
=
name
+
'.conv2d.output.1'
)
name
=
name
+
'.conv2d.output.1'
)
norm_lr
=
0.
if
freeze_norm
else
1.
norm_lr
=
0.
if
freeze_norm
else
1.
pattr
=
ParamAttr
(
name
=
norm_name
+
'_scale'
,
learning_rate
=
norm_lr
*
lr_scale
,
regularizer
=
L2Decay
(
norm_decay
))
pattr
=
ParamAttr
(
battr
=
ParamAttr
(
name
=
norm_name
+
'_offset'
,
learning_rate
=
norm_lr
*
lr_scale
,
regularizer
=
L2Decay
(
norm_decay
))
name
=
norm_name
+
'_scale'
,
learning_rate
=
norm_lr
*
lr_scale
,
regularizer
=
L2Decay
(
norm_decay
))
battr
=
ParamAttr
(
name
=
norm_name
+
'_offset'
,
learning_rate
=
norm_lr
*
lr_scale
,
regularizer
=
L2Decay
(
norm_decay
))
if
norm_type
in
[
'bn'
,
'sync_bn'
]:
if
norm_type
in
[
'bn'
,
'sync_bn'
]:
global_stats
=
True
if
freeze_norm
else
False
global_stats
=
True
if
freeze_norm
else
False
...
@@ -75,15 +84,27 @@ def ConvNorm(input,
...
@@ -75,15 +84,27 @@ def ConvNorm(input,
bias
=
fluid
.
framework
.
_get_var
(
battr
.
name
)
bias
=
fluid
.
framework
.
_get_var
(
battr
.
name
)
elif
norm_type
==
'gn'
:
elif
norm_type
==
'gn'
:
out
=
fluid
.
layers
.
group_norm
(
out
=
fluid
.
layers
.
group_norm
(
input
=
conv
,
act
=
act
,
name
=
norm_name
+
'.output.1'
,
groups
=
norm_groups
,
param_attr
=
pattr
,
bias_attr
=
battr
)
input
=
conv
,
act
=
act
,
name
=
norm_name
+
'.output.1'
,
groups
=
norm_groups
,
param_attr
=
pattr
,
bias_attr
=
battr
)
scale
=
fluid
.
framework
.
_get_var
(
pattr
.
name
)
scale
=
fluid
.
framework
.
_get_var
(
pattr
.
name
)
bias
=
fluid
.
framework
.
_get_var
(
battr
.
name
)
bias
=
fluid
.
framework
.
_get_var
(
battr
.
name
)
elif
norm_type
==
'affine_channel'
:
elif
norm_type
==
'affine_channel'
:
scale
=
fluid
.
layers
.
create_parameter
(
scale
=
fluid
.
layers
.
create_parameter
(
shape
=
[
conv
.
shape
[
1
]],
dtype
=
conv
.
dtype
,
attr
=
pattr
,
default_initializer
=
fluid
.
initializer
.
Constant
(
1.
))
shape
=
[
conv
.
shape
[
1
]],
dtype
=
conv
.
dtype
,
attr
=
pattr
,
default_initializer
=
fluid
.
initializer
.
Constant
(
1.
))
bias
=
fluid
.
layers
.
create_parameter
(
bias
=
fluid
.
layers
.
create_parameter
(
shape
=
[
conv
.
shape
[
1
]],
dtype
=
conv
.
dtype
,
attr
=
battr
,
default_initializer
=
fluid
.
initializer
.
Constant
(
0.
))
shape
=
[
conv
.
shape
[
1
]],
out
=
fluid
.
layers
.
affine_channel
(
x
=
conv
,
scale
=
scale
,
bias
=
bias
,
act
=
act
)
dtype
=
conv
.
dtype
,
attr
=
battr
,
default_initializer
=
fluid
.
initializer
.
Constant
(
0.
))
out
=
fluid
.
layers
.
affine_channel
(
x
=
conv
,
scale
=
scale
,
bias
=
bias
,
act
=
act
)
if
freeze_norm
:
if
freeze_norm
:
scale
.
stop_gradient
=
True
scale
.
stop_gradient
=
True
bias
.
stop_gradient
=
True
bias
.
stop_gradient
=
True
...
@@ -140,10 +161,15 @@ class FPN(object):
...
@@ -140,10 +161,15 @@ class FPN(object):
body_input
,
body_input
,
self
.
num_chan
,
self
.
num_chan
,
1
,
1
,
param_attr
=
ParamAttr
(
name
=
lateral_name
+
"_w"
,
initializer
=
Xavier
(
fan_out
=
fan
)),
param_attr
=
ParamAttr
(
bias_attr
=
ParamAttr
(
name
=
lateral_name
+
"_b"
,
learning_rate
=
2.
,
regularizer
=
L2Decay
(
0.
)),
name
=
lateral_name
+
"_w"
,
initializer
=
Xavier
(
fan_out
=
fan
)),
bias_attr
=
ParamAttr
(
name
=
lateral_name
+
"_b"
,
learning_rate
=
2.
,
regularizer
=
L2Decay
(
0.
)),
name
=
lateral_name
)
name
=
lateral_name
)
topdown
=
fluid
.
layers
.
resize_nearest
(
upper_output
,
scale
=
2.
,
name
=
topdown_name
)
topdown
=
fluid
.
layers
.
resize_nearest
(
upper_output
,
scale
=
2.
,
name
=
topdown_name
)
return
lateral
+
topdown
return
lateral
+
topdown
def
get_output
(
self
,
body_dict
):
def
get_output
(
self
,
body_dict
):
...
@@ -182,14 +208,20 @@ class FPN(object):
...
@@ -182,14 +208,20 @@ class FPN(object):
body_input
,
body_input
,
self
.
num_chan
,
self
.
num_chan
,
1
,
1
,
param_attr
=
ParamAttr
(
name
=
fpn_inner_name
+
"_w"
,
initializer
=
Xavier
(
fan_out
=
fan
)),
param_attr
=
ParamAttr
(
bias_attr
=
ParamAttr
(
name
=
fpn_inner_name
+
"_b"
,
learning_rate
=
2.
,
regularizer
=
L2Decay
(
0.
)),
name
=
fpn_inner_name
+
"_w"
,
initializer
=
Xavier
(
fan_out
=
fan
)),
bias_attr
=
ParamAttr
(
name
=
fpn_inner_name
+
"_b"
,
learning_rate
=
2.
,
regularizer
=
L2Decay
(
0.
)),
name
=
fpn_inner_name
)
name
=
fpn_inner_name
)
for
i
in
range
(
1
,
num_backbone_stages
):
for
i
in
range
(
1
,
num_backbone_stages
):
body_name
=
body_name_list
[
i
]
body_name
=
body_name_list
[
i
]
body_input
=
body_dict
[
body_name
]
body_input
=
body_dict
[
body_name
]
top_output
=
self
.
fpn_inner_output
[
i
-
1
]
top_output
=
self
.
fpn_inner_output
[
i
-
1
]
fpn_inner_single
=
self
.
_add_topdown_lateral
(
body_name
,
body_input
,
top_output
)
fpn_inner_single
=
self
.
_add_topdown_lateral
(
body_name
,
body_input
,
top_output
)
self
.
fpn_inner_output
[
i
]
=
fpn_inner_single
self
.
fpn_inner_output
[
i
]
=
fpn_inner_single
fpn_dict
=
{}
fpn_dict
=
{}
fpn_name_list
=
[]
fpn_name_list
=
[]
...
@@ -213,15 +245,24 @@ class FPN(object):
self
.
num_chan
,
self
.
num_chan
,
filter_size
=
3
,
filter_size
=
3
,
padding
=
1
,
padding
=
1
,
param_attr
=
ParamAttr
(
name
=
fpn_name
+
"_w"
,
initializer
=
Xavier
(
fan_out
=
fan
)),
param_attr
=
ParamAttr
(
bias_attr
=
ParamAttr
(
name
=
fpn_name
+
"_b"
,
learning_rate
=
2.
,
regularizer
=
L2Decay
(
0.
)),
name
=
fpn_name
+
"_w"
,
initializer
=
Xavier
(
fan_out
=
fan
)),
bias_attr
=
ParamAttr
(
name
=
fpn_name
+
"_b"
,
learning_rate
=
2.
,
regularizer
=
L2Decay
(
0.
)),
name
=
fpn_name
)
name
=
fpn_name
)
fpn_dict
[
fpn_name
]
=
fpn_output
fpn_dict
[
fpn_name
]
=
fpn_output
fpn_name_list
.
append
(
fpn_name
)
fpn_name_list
.
append
(
fpn_name
)
if
not
self
.
has_extra_convs
and
self
.
max_level
-
self
.
min_level
==
len
(
spatial_scale
):
if
not
self
.
has_extra_convs
and
self
.
max_level
-
self
.
min_level
==
len
(
spatial_scale
):
body_top_name
=
fpn_name_list
[
0
]
body_top_name
=
fpn_name_list
[
0
]
body_top_extension
=
fluid
.
layers
.
pool2d
(
body_top_extension
=
fluid
.
layers
.
pool2d
(
fpn_dict
[
body_top_name
],
1
,
'max'
,
pool_stride
=
2
,
name
=
body_top_name
+
'_subsampled_2x'
)
fpn_dict
[
body_top_name
],
1
,
'max'
,
pool_stride
=
2
,
name
=
body_top_name
+
'_subsampled_2x'
)
fpn_dict
[
body_top_name
+
'_subsampled_2x'
]
=
body_top_extension
fpn_dict
[
body_top_name
+
'_subsampled_2x'
]
=
body_top_extension
fpn_name_list
.
insert
(
0
,
body_top_name
+
'_subsampled_2x'
)
fpn_name_list
.
insert
(
0
,
body_top_name
+
'_subsampled_2x'
)
spatial_scale
.
insert
(
0
,
spatial_scale
[
0
]
*
0.5
)
spatial_scale
.
insert
(
0
,
spatial_scale
[
0
]
*
0.5
)
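The bookkeeping at the end of this hunk (prepending a `_subsampled_2x` level and halving the first spatial scale) can be sketched in isolation. This is a minimal sketch, assuming the lists are ordered coarsest-first as the `insert(0, ...)` calls suggest; the level names in the example are hypothetical, and the pooling op itself is omitted.

```python
def extend_with_subsampled_level(fpn_name_list, spatial_scale,
                                 min_level, max_level, has_extra_convs=False):
    """Mirror the name/scale bookkeeping for the extra max-pooled level."""
    names = list(fpn_name_list)
    scales = list(spatial_scale)
    if not has_extra_convs and max_level - min_level == len(scales):
        top_name = names[0]
        # The real code also creates the pooled tensor (kernel 1, stride 2) here.
        names.insert(0, top_name + '_subsampled_2x')
        scales.insert(0, scales[0] * 0.5)
    return names, scales


# Scales and levels taken from the FPN(...) call in module.py; names are made up.
names, scales = extend_with_subsampled_level(
    ['level5', 'level4', 'level3', 'level2'],
    [0.03125, 0.0625, 0.125, 0.25],
    min_level=2,
    max_level=6)
```

With `min_level=2`, `max_level=6` and four input scales, the condition holds and a fifth level with scale `0.015625` is added.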
...
@@ -241,8 +282,12 @@ class FPN(object):
filter_size
=
3
,
filter_size
=
3
,
stride
=
2
,
stride
=
2
,
padding
=
1
,
padding
=
1
,
param_attr
=
ParamAttr
(
name
=
fpn_name
+
"_w"
,
initializer
=
Xavier
(
fan_out
=
fan
)),
param_attr
=
ParamAttr
(
bias_attr
=
ParamAttr
(
name
=
fpn_name
+
"_b"
,
learning_rate
=
2.
,
regularizer
=
L2Decay
(
0.
)),
name
=
fpn_name
+
"_w"
,
initializer
=
Xavier
(
fan_out
=
fan
)),
bias_attr
=
ParamAttr
(
name
=
fpn_name
+
"_b"
,
learning_rate
=
2.
,
regularizer
=
L2Decay
(
0.
)),
name
=
fpn_name
)
name
=
fpn_name
)
fpn_dict
[
fpn_name
]
=
fpn_blob
fpn_dict
[
fpn_name
]
=
fpn_blob
fpn_name_list
.
insert
(
0
,
fpn_name
)
fpn_name_list
.
insert
(
0
,
fpn_name
)
...
modules/image/object_detection/faster_rcnn_resnet50_fpn_coco2017/module.py
View file @
c3f1e085
...
@@ -30,7 +30,7 @@ from faster_rcnn_resnet50_fpn_coco2017.roi_extractor import FPNRoIAlign

 @moduleinfo(
     name="faster_rcnn_resnet50_fpn_coco2017",
-    version="1.0.0",
+    version="1.0.1",
     type="cv/object_detection",
     summary=
     "Baidu's Faster-RCNN model for object detection, whose backbone is ResNet50, processed with Feature Pyramid Networks",
...
@@ -39,8 +39,10 @@ from faster_rcnn_resnet50_fpn_coco2017.roi_extractor import FPNRoIAlign
class
FasterRCNNResNet50RPN
(
hub
.
Module
):
class
FasterRCNNResNet50RPN
(
hub
.
Module
):
def
_initialize
(
self
):
def
_initialize
(
self
):
# default pretrained model, Faster-RCNN with backbone ResNet50, shape of input tensor is [3, 800, 1333]
# default pretrained model, Faster-RCNN with backbone ResNet50, shape of input tensor is [3, 800, 1333]
self
.
default_pretrained_model_path
=
os
.
path
.
join
(
self
.
directory
,
"faster_rcnn_resnet50_fpn_model"
)
self
.
default_pretrained_model_path
=
os
.
path
.
join
(
self
.
label_names
=
load_label_info
(
os
.
path
.
join
(
self
.
directory
,
"label_file.txt"
))
self
.
directory
,
"faster_rcnn_resnet50_fpn_model"
)
self
.
label_names
=
load_label_info
(
os
.
path
.
join
(
self
.
directory
,
"label_file.txt"
))
self
.
_set_config
()
self
.
_set_config
()
def
_set_config
(
self
):
def
_set_config
(
self
):
...
@@ -64,7 +66,11 @@ class FasterRCNNResNet50RPN(hub.Module):
gpu_config
.
enable_use_gpu
(
memory_pool_init_size_mb
=
500
,
device_id
=
0
)
gpu_config
.
enable_use_gpu
(
memory_pool_init_size_mb
=
500
,
device_id
=
0
)
self
.
gpu_predictor
=
create_paddle_predictor
(
gpu_config
)
self
.
gpu_predictor
=
create_paddle_predictor
(
gpu_config
)
def
context
(
self
,
num_classes
=
81
,
trainable
=
True
,
pretrained
=
True
,
phase
=
'train'
):
def
context
(
self
,
num_classes
=
81
,
trainable
=
True
,
pretrained
=
True
,
phase
=
'train'
):
"""
"""
Distill the Head Features, so as to perform transfer learning.
Distill the Head Features, so as to perform transfer learning.
...
@@ -83,15 +89,26 @@ class FasterRCNNResNet50RPN(hub.Module):
startup_program
=
fluid
.
Program
()
startup_program
=
fluid
.
Program
()
with
fluid
.
program_guard
(
context_prog
,
startup_program
):
with
fluid
.
program_guard
(
context_prog
,
startup_program
):
with
fluid
.
unique_name
.
guard
():
with
fluid
.
unique_name
.
guard
():
image
=
fluid
.
layers
.
data
(
name
=
'image'
,
shape
=
[
-
1
,
3
,
-
1
,
-
1
],
dtype
=
'float32'
)
image
=
fluid
.
layers
.
data
(
name
=
'image'
,
shape
=
[
-
1
,
3
,
-
1
,
-
1
],
dtype
=
'float32'
)
# backbone
# backbone
backbone
=
ResNet
(
norm_type
=
'affine_channel'
,
depth
=
50
,
feature_maps
=
[
2
,
3
,
4
,
5
],
freeze_at
=
2
)
backbone
=
ResNet
(
norm_type
=
'affine_channel'
,
depth
=
50
,
feature_maps
=
[
2
,
3
,
4
,
5
],
freeze_at
=
2
)
body_feats
=
backbone
(
image
)
body_feats
=
backbone
(
image
)
# fpn
# fpn
fpn
=
FPN
(
max_level
=
6
,
min_level
=
2
,
num_chan
=
256
,
spatial_scale
=
[
0.03125
,
0.0625
,
0.125
,
0.25
])
fpn
=
FPN
(
max_level
=
6
,
min_level
=
2
,
num_chan
=
256
,
spatial_scale
=
[
0.03125
,
0.0625
,
0.125
,
0.25
])
var_prefix
=
'@HUB_{}@'
.
format
(
self
.
name
)
var_prefix
=
'@HUB_{}@'
.
format
(
self
.
name
)
im_info
=
fluid
.
layers
.
data
(
name
=
'im_info'
,
shape
=
[
3
],
dtype
=
'float32'
,
lod_level
=
0
)
im_info
=
fluid
.
layers
.
data
(
im_shape
=
fluid
.
layers
.
data
(
name
=
'im_shape'
,
shape
=
[
3
],
dtype
=
'float32'
,
lod_level
=
0
)
name
=
'im_info'
,
shape
=
[
3
],
dtype
=
'float32'
,
lod_level
=
0
)
im_shape
=
fluid
.
layers
.
data
(
name
=
'im_shape'
,
shape
=
[
3
],
dtype
=
'float32'
,
lod_level
=
0
)
body_feat_names
=
list
(
body_feats
.
keys
())
body_feat_names
=
list
(
body_feats
.
keys
())
body_feats
,
spatial_scale
=
fpn
.
get_output
(
body_feats
)
body_feats
,
spatial_scale
=
fpn
.
get_output
(
body_feats
)
# rpn_head: RPNHead
# rpn_head: RPNHead
...
@@ -99,9 +116,12 @@ class FasterRCNNResNet50RPN(hub.Module):
rois
=
rpn_head
.
get_proposals
(
body_feats
,
im_info
,
mode
=
phase
)
rois
=
rpn_head
.
get_proposals
(
body_feats
,
im_info
,
mode
=
phase
)
# train
# train
if
phase
==
'train'
:
if
phase
==
'train'
:
gt_bbox
=
fluid
.
layers
.
data
(
name
=
'gt_bbox'
,
shape
=
[
4
],
dtype
=
'float32'
,
lod_level
=
1
)
gt_bbox
=
fluid
.
layers
.
data
(
is_crowd
=
fluid
.
layers
.
data
(
name
=
'is_crowd'
,
shape
=
[
1
],
dtype
=
'int32'
,
lod_level
=
1
)
name
=
'gt_bbox'
,
shape
=
[
4
],
dtype
=
'float32'
,
lod_level
=
1
)
gt_class
=
fluid
.
layers
.
data
(
name
=
'gt_class'
,
shape
=
[
1
],
dtype
=
'int32'
,
lod_level
=
1
)
is_crowd
=
fluid
.
layers
.
data
(
name
=
'is_crowd'
,
shape
=
[
1
],
dtype
=
'int32'
,
lod_level
=
1
)
gt_class
=
fluid
.
layers
.
data
(
name
=
'gt_class'
,
shape
=
[
1
],
dtype
=
'int32'
,
lod_level
=
1
)
rpn_loss
=
rpn_head
.
get_loss
(
im_info
,
gt_bbox
,
is_crowd
)
rpn_loss
=
rpn_head
.
get_loss
(
im_info
,
gt_bbox
,
is_crowd
)
# bbox_assigner: BBoxAssigner
# bbox_assigner: BBoxAssigner
bbox_assigner
=
self
.
bbox_assigner
(
num_classes
)
bbox_assigner
=
self
.
bbox_assigner
(
num_classes
)
...
@@ -122,7 +142,10 @@ class FasterRCNNResNet50RPN(hub.Module):
rois
=
outs
[
0
]
rois
=
outs
[
0
]
roi_extractor
=
self
.
roi_extractor
()
roi_extractor
=
self
.
roi_extractor
()
roi_feat
=
roi_extractor
(
head_inputs
=
body_feats
,
rois
=
rois
,
spatial_scale
=
spatial_scale
)
roi_feat
=
roi_extractor
(
head_inputs
=
body_feats
,
rois
=
rois
,
spatial_scale
=
spatial_scale
)
# head_feat
# head_feat
bbox_head
=
self
.
bbox_head
(
num_classes
)
bbox_head
=
self
.
bbox_head
(
num_classes
)
head_feat
=
bbox_head
.
head
(
roi_feat
)
head_feat
=
bbox_head
.
head
(
roi_feat
)
...
@@ -138,13 +161,18 @@ class FasterRCNNResNet50RPN(hub.Module):
'is_crowd'
:
var_prefix
+
is_crowd
.
name
'is_crowd'
:
var_prefix
+
is_crowd
.
name
}
}
outputs
=
{
outputs
=
{
'head_features'
:
var_prefix
+
head_feat
.
name
,
'head_features'
:
'rpn_cls_loss'
:
var_prefix
+
rpn_loss
[
'rpn_cls_loss'
].
name
,
var_prefix
+
head_feat
.
name
,
'rpn_reg_loss'
:
var_prefix
+
rpn_loss
[
'rpn_reg_loss'
].
name
,
'rpn_cls_loss'
:
'generate_proposal_labels'
:
[
var_prefix
+
var
.
name
for
var
in
outs
]
var_prefix
+
rpn_loss
[
'rpn_cls_loss'
].
name
,
'rpn_reg_loss'
:
var_prefix
+
rpn_loss
[
'rpn_reg_loss'
].
name
,
'generate_proposal_labels'
:
[
var_prefix
+
var
.
name
for
var
in
outs
]
}
}
elif
phase
==
'predict'
:
elif
phase
==
'predict'
:
pred
=
bbox_head
.
get_prediction
(
roi_feat
,
rois
,
im_info
,
im_shape
)
pred
=
bbox_head
.
get_prediction
(
roi_feat
,
rois
,
im_info
,
im_shape
)
inputs
=
{
inputs
=
{
'image'
:
var_prefix
+
image
.
name
,
'image'
:
var_prefix
+
image
.
name
,
'im_info'
:
var_prefix
+
im_info
.
name
,
'im_info'
:
var_prefix
+
im_info
.
name
,
...
@@ -159,9 +187,13 @@ class FasterRCNNResNet50RPN(hub.Module):
add_vars_prefix
(
startup_program
,
var_prefix
)
add_vars_prefix
(
startup_program
,
var_prefix
)
global_vars
=
context_prog
.
global_block
().
vars
global_vars
=
context_prog
.
global_block
().
vars
inputs
=
{
key
:
global_vars
[
value
]
for
key
,
value
in
inputs
.
items
()}
inputs
=
{
key
:
global_vars
[
value
]
for
key
,
value
in
inputs
.
items
()
}
outputs
=
{
outputs
=
{
key
:
global_vars
[
value
]
if
not
isinstance
(
value
,
list
)
else
[
global_vars
[
var
]
for
var
in
value
]
key
:
global_vars
[
value
]
if
not
isinstance
(
value
,
list
)
else
[
global_vars
[
var
]
for
var
in
value
]
for
key
,
value
in
outputs
.
items
()
for
key
,
value
in
outputs
.
items
()
}
}
...
@@ -177,9 +209,14 @@ class FasterRCNNResNet50RPN(hub.Module):
if
num_classes
!=
81
:
if
num_classes
!=
81
:
if
'bbox_pred'
in
var
.
name
or
'cls_score'
in
var
.
name
:
if
'bbox_pred'
in
var
.
name
or
'cls_score'
in
var
.
name
:
return
False
return
False
return
os
.
path
.
exists
(
os
.
path
.
join
(
self
.
default_pretrained_model_path
,
var
.
name
))
return
os
.
path
.
exists
(
os
.
path
.
join
(
self
.
default_pretrained_model_path
,
fluid
.
io
.
load_vars
(
exe
,
self
.
default_pretrained_model_path
,
predicate
=
_if_exist
)
var
.
name
))
fluid
.
io
.
load_vars
(
exe
,
self
.
default_pretrained_model_path
,
predicate
=
_if_exist
)
return
inputs
,
outputs
,
context_prog
return
inputs
,
outputs
,
context_prog
def
rpn_head
(
self
):
def
rpn_head
(
self
):
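The `_if_exist` predicate in this hunk decides which pretrained variables `fluid.io.load_vars` restores: when `num_classes` differs from 81, the class-dependent `bbox_pred` and `cls_score` parameters are skipped. A minimal, framework-free sketch of that filtering follows. It assumes one file per variable in the model directory, and it takes a plain name string instead of a Paddle variable; the factory `make_if_exist` and the file names are hypothetical.

```python
import os
import tempfile


def make_if_exist(model_dir, num_classes, pretrained_num_classes=81):
    """Build a predicate similar to _if_exist, but working on name strings."""
    def _if_exist(var_name):
        if num_classes != pretrained_num_classes:
            # Head parameters depend on the class count, so do not reload them.
            if 'bbox_pred' in var_name or 'cls_score' in var_name:
                return False
        return os.path.exists(os.path.join(model_dir, var_name))
    return _if_exist


candidates = ['conv1_weights', 'cls_score_w', 'bbox_pred_w', 'not_saved']
with tempfile.TemporaryDirectory() as model_dir:
    for file_name in candidates[:3]:
        open(os.path.join(model_dir, file_name), 'w').close()
    same_classes = [n for n in candidates if make_if_exist(model_dir, 81)(n)]
    new_classes = [n for n in candidates if make_if_exist(model_dir, 20)(n)]
```

With the default class count every saved variable is selected; with a different count only the backbone-style variable survives the filter.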
...
@@ -195,8 +232,16 @@ class FasterRCNNResNet50RPN(hub.Module):
rpn_negative_overlap
=
0.3
,
rpn_negative_overlap
=
0.3
,
rpn_positive_overlap
=
0.7
,
rpn_positive_overlap
=
0.7
,
rpn_straddle_thresh
=
0.0
),
rpn_straddle_thresh
=
0.0
),
train_proposal
=
GenerateProposals
(
min_size
=
0.0
,
nms_thresh
=
0.7
,
post_nms_top_n
=
2000
,
pre_nms_top_n
=
2000
),
train_proposal
=
GenerateProposals
(
test_proposal
=
GenerateProposals
(
min_size
=
0.0
,
nms_thresh
=
0.7
,
post_nms_top_n
=
1000
,
pre_nms_top_n
=
1000
),
min_size
=
0.0
,
nms_thresh
=
0.7
,
post_nms_top_n
=
2000
,
pre_nms_top_n
=
2000
),
test_proposal
=
GenerateProposals
(
min_size
=
0.0
,
nms_thresh
=
0.7
,
post_nms_top_n
=
1000
,
pre_nms_top_n
=
1000
),
anchor_start_size
=
32
,
anchor_start_size
=
32
,
num_chan
=
256
,
num_chan
=
256
,
min_level
=
2
,
min_level
=
2
,
...
@@ -204,12 +249,18 @@ class FasterRCNNResNet50RPN(hub.Module):
def
roi_extractor
(
self
):
def
roi_extractor
(
self
):
return
FPNRoIAlign
(
return
FPNRoIAlign
(
canconical_level
=
4
,
canonical_size
=
224
,
max_level
=
5
,
min_level
=
2
,
box_resolution
=
7
,
sampling_ratio
=
2
)
canconical_level
=
4
,
canonical_size
=
224
,
max_level
=
5
,
min_level
=
2
,
box_resolution
=
7
,
sampling_ratio
=
2
)
def
bbox_head
(
self
,
num_classes
):
def
bbox_head
(
self
,
num_classes
):
return
BBoxHead
(
return
BBoxHead
(
head
=
TwoFCHead
(
mlp_dim
=
1024
),
head
=
TwoFCHead
(
mlp_dim
=
1024
),
nms
=
MultiClassNMS
(
keep_top_k
=
100
,
nms_threshold
=
0.5
,
score_threshold
=
0.05
),
nms
=
MultiClassNMS
(
keep_top_k
=
100
,
nms_threshold
=
0.5
,
score_threshold
=
0.05
),
num_classes
=
num_classes
)
num_classes
=
num_classes
)
def
bbox_assigner
(
self
,
num_classes
):
def
bbox_assigner
(
self
,
num_classes
):
...
@@ -222,7 +273,11 @@ class FasterRCNNResNet50RPN(hub.Module):
fg_thresh
=
0.5
,
fg_thresh
=
0.5
,
class_nums
=
num_classes
)
class_nums
=
num_classes
)
def
save_inference_model
(
self
,
dirname
,
model_filename
=
None
,
params_filename
=
None
,
combined
=
True
):
def
save_inference_model
(
self
,
dirname
,
model_filename
=
None
,
params_filename
=
None
,
combined
=
True
):
if
combined
:
if
combined
:
model_filename
=
"__model__"
if
not
model_filename
else
model_filename
model_filename
=
"__model__"
if
not
model_filename
else
model_filename
params_filename
=
"__params__"
if
not
params_filename
else
params_filename
params_filename
=
"__params__"
if
not
params_filename
else
params_filename
...
@@ -278,7 +333,7 @@ class FasterRCNNResNet50RPN(hub.Module):
                 int(_places[0])
             except:
                 raise RuntimeError(
-                    "Environment Variable CUDA_VISIBLE_DEVICES is not set correctly. If you wanna use gpu, please set CUDA_VISIBLE_DEVICES as cuda_device_id."
+                    "Attempt to use GPU for prediction, but environment variable CUDA_VISIBLE_DEVICES was not set correctly."
                 )

         paths = paths if paths else list()
...
@@ -308,7 +363,9 @@ class FasterRCNNResNet50RPN(hub.Module):
padding_image_tensor
=
PaddleTensor
(
padding_image
.
copy
())
padding_image_tensor
=
PaddleTensor
(
padding_image
.
copy
())
padding_info_tensor
=
PaddleTensor
(
padding_info
.
copy
())
padding_info_tensor
=
PaddleTensor
(
padding_info
.
copy
())
padding_shape_tensor
=
PaddleTensor
(
padding_shape
.
copy
())
padding_shape_tensor
=
PaddleTensor
(
padding_shape
.
copy
())
feed_list
=
[
padding_image_tensor
,
padding_info_tensor
,
padding_shape_tensor
]
feed_list
=
[
padding_image_tensor
,
padding_info_tensor
,
padding_shape_tensor
]
if
use_gpu
:
if
use_gpu
:
data_out
=
self
.
gpu_predictor
.
run
(
feed_list
)
data_out
=
self
.
gpu_predictor
.
run
(
feed_list
)
...
@@ -333,17 +390,29 @@ class FasterRCNNResNet50RPN(hub.Module):
Add the command config options
Add the command config options
"""
"""
self
.
arg_config_group
.
add_argument
(
self
.
arg_config_group
.
add_argument
(
'--use_gpu'
,
type
=
ast
.
literal_eval
,
default
=
False
,
help
=
"whether use GPU or not"
)
'--use_gpu'
,
type
=
ast
.
literal_eval
,
default
=
False
,
help
=
"whether use GPU or not"
)
self
.
arg_config_group
.
add_argument
(
'--batch_size'
,
type
=
int
,
default
=
1
,
help
=
"batch size for prediction"
)
self
.
arg_config_group
.
add_argument
(
'--batch_size'
,
type
=
int
,
default
=
1
,
help
=
"batch size for prediction"
)
def
add_module_input_arg
(
self
):
def
add_module_input_arg
(
self
):
"""
"""
Add the command input options
Add the command input options
"""
"""
self
.
arg_input_group
.
add_argument
(
'--input_path'
,
type
=
str
,
default
=
None
,
help
=
"input data"
)
self
.
arg_input_group
.
add_argument
(
'--input_path'
,
type
=
str
,
default
=
None
,
help
=
"input data"
)
self
.
arg_input_group
.
add_argument
(
'--input_file'
,
type
=
str
,
default
=
None
,
help
=
"file contain input data"
)
self
.
arg_input_group
.
add_argument
(
'--input_file'
,
type
=
str
,
default
=
None
,
help
=
"file contain input data"
)
def
check_input_data
(
self
,
args
):
def
check_input_data
(
self
,
args
):
input_data
=
[]
input_data
=
[]
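The `--use_gpu` option in this hunk is declared with `type=ast.literal_eval` rather than `type=bool`. A small standalone sketch of why that matters: `bool("False")` is `True` because any non-empty string is truthy, whereas `ast.literal_eval("False")` yields the Python literal `False`. The parser below only reproduces the two config options shown in the diff; the `prog` string is a placeholder.

```python
import argparse
import ast

parser = argparse.ArgumentParser(prog="hub run demo_module")
parser.add_argument(
    '--use_gpu', type=ast.literal_eval, default=False,
    help="whether use GPU or not")
parser.add_argument(
    '--batch_size', type=int, default=1, help="batch size for prediction")

# Strings on the command line are converted by the `type` callables.
args = parser.parse_args(['--use_gpu', 'True', '--batch_size', '4'])
explicit_false = parser.parse_args(['--use_gpu', 'False'])
defaults = parser.parse_args([])

# What type=bool would have produced for the string "False".
naive_bool = bool('False')
```

One consequence of this choice is that the value must be a valid Python literal, so `--use_gpu true` (lower case) would be rejected by the parser.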
...
@@ -372,9 +441,12 @@ class FasterRCNNResNet50RPN(hub.Module):
prog
=
"hub run {}"
.
format
(
self
.
name
),
prog
=
"hub run {}"
.
format
(
self
.
name
),
usage
=
'%(prog)s'
,
usage
=
'%(prog)s'
,
add_help
=
True
)
add_help
=
True
)
self
.
arg_input_group
=
self
.
parser
.
add_argument_group
(
title
=
"Input options"
,
description
=
"Input data. Required"
)
self
.
arg_input_group
=
self
.
parser
.
add_argument_group
(
title
=
"Input options"
,
description
=
"Input data. Required"
)
self
.
arg_config_group
=
self
.
parser
.
add_argument_group
(
self
.
arg_config_group
=
self
.
parser
.
add_argument_group
(
title
=
"Config options"
,
description
=
"Run configuration for controlling module behavior, not required."
)
title
=
"Config options"
,
description
=
"Run configuration for controlling module behavior, not required."
)
self
.
add_module_config_arg
()
self
.
add_module_config_arg
()
self
.
add_module_input_arg
()
self
.
add_module_input_arg
()
...
@@ -386,5 +458,7 @@ class FasterRCNNResNet50RPN(hub.Module):
...
@@ -386,5 +458,7 @@ class FasterRCNNResNet50RPN(hub.Module):
else
:
else
:
for
image_path
in
input_data
:
for
image_path
in
input_data
:
if
not
os
.
path
.
exists
(
image_path
):
if
not
os
.
path
.
exists
(
image_path
):
raise
RuntimeError
(
"File %s or %s is not exist."
%
image_path
)
raise
RuntimeError
(
return
self
.
object_detection
(
paths
=
input_data
,
use_gpu
=
args
.
use_gpu
,
batch_size
=
args
.
batch_size
)
"File %s or %s is not exist."
%
image_path
)
return
self
.
object_detection
(
paths
=
input_data
,
use_gpu
=
args
.
use_gpu
,
batch_size
=
args
.
batch_size
)
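One detail in this last hunk is worth flagging: the message `"File %s or %s is not exist."` has two `%s` placeholders but is formatted with a single string, so the formatting itself would raise `TypeError` before the intended `RuntimeError` is ever constructed. The commit only re-wraps the line and keeps this behaviour. The sketch below demonstrates the failure and one possible single-placeholder variant; `check_input_paths` is a hypothetical helper, not part of the module.

```python
import os
import tempfile


def check_input_paths(input_data):
    """Raise RuntimeError for the first path that does not exist."""
    for image_path in input_data:
        if not os.path.exists(image_path):
            raise RuntimeError("File %s does not exist." % image_path)
    return input_data


# The original format string fails at formatting time.
try:
    "File %s or %s is not exist." % "missing.jpg"
    original_error = None
except TypeError as err:
    original_error = err

# The single-placeholder variant reports the missing path as intended.
with tempfile.TemporaryDirectory() as tmp_dir:
    missing = os.path.join(tmp_dir, "missing.jpg")
    try:
        check_input_paths([missing])
        message = None
    except RuntimeError as err:
        message = str(err)
    expected = "File %s does not exist." % missing
```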
modules/image/object_detection/faster_rcnn_resnet50_fpn_coco2017/nonlocal_helper.py
View file @
c3f1e085
...
@@ -22,7 +22,8 @@ nonlocal_params = {
}
}
def
space_nonlocal
(
input
,
dim_in
,
dim_out
,
prefix
,
dim_inner
,
max_pool_stride
=
2
):
def
space_nonlocal
(
input
,
dim_in
,
dim_out
,
prefix
,
dim_inner
,
max_pool_stride
=
2
):
cur
=
input
cur
=
input
theta
=
fluid
.
layers
.
conv2d
(
input
=
cur
,
num_filters
=
dim_inner
,
\
theta
=
fluid
.
layers
.
conv2d
(
input
=
cur
,
num_filters
=
dim_inner
,
\
filter_size
=
[
1
,
1
],
stride
=
[
1
,
1
],
\
filter_size
=
[
1
,
1
],
stride
=
[
1
,
1
],
\
...
@@ -82,7 +83,8 @@ def space_nonlocal(input, dim_in, dim_out, prefix, dim_inner, max_pool_stride=2)
theta_phi_sc
=
fluid
.
layers
.
scale
(
theta_phi
,
scale
=
dim_inner
**-
.
5
)
theta_phi_sc
=
fluid
.
layers
.
scale
(
theta_phi
,
scale
=
dim_inner
**-
.
5
)
else
:
else
:
theta_phi_sc
=
theta_phi
theta_phi_sc
=
theta_phi
p
=
fluid
.
layers
.
softmax
(
theta_phi_sc
,
name
=
prefix
+
'_affinity'
+
'_prob'
)
p
=
fluid
.
layers
.
softmax
(
theta_phi_sc
,
name
=
prefix
+
'_affinity'
+
'_prob'
)
else
:
else
:
# not clear about what is doing in xlw's code
# not clear about what is doing in xlw's code
p
=
None
# not implemented
p
=
None
# not implemented
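The scaled-softmax branch in this hunk multiplies the affinity scores by `dim_inner**-.5` before the softmax. A minimal sketch of that computation on a single row of scores, in plain Python instead of `fluid.layers`, is shown below. The function name `scaled_softmax` is illustrative, and the max-subtraction is a standard numerical-stability step that is an addition here, not something visible in the diff.

```python
import math


def scaled_softmax(scores, dim_inner):
    """Scale by dim_inner**-.5, then apply softmax over one row of scores."""
    scale = dim_inner ** -.5
    scaled = [s * scale for s in scores]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# dim_inner = 16 gives a scale of 0.25, so the scores become [1.0, 0.0].
probs = scaled_softmax([4.0, 0.0], dim_inner=16)
```

Scaling by the inverse square root of the inner dimension keeps the softmax inputs in a moderate range as `dim_inner` grows.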
...
@@ -96,7 +98,8 @@ def space_nonlocal(input, dim_in, dim_out, prefix, dim_inner, max_pool_stride=2)
# reshape back
# reshape back
# e.g. (8, 1024, 784) => (8, 1024, 4, 14, 14)
# e.g. (8, 1024, 784) => (8, 1024, 4, 14, 14)
t_shape
=
t
.
shape
t_shape
=
t
.
shape
t_re
=
fluid
.
layers
.
reshape
(
t
,
shape
=
list
(
theta_shape
),
actual_shape
=
theta_shape_op
)
t_re
=
fluid
.
layers
.
reshape
(
t
,
shape
=
list
(
theta_shape
),
actual_shape
=
theta_shape_op
)
blob_out
=
t_re
blob_out
=
t_re
blob_out
=
fluid
.
layers
.
conv2d
(
input
=
blob_out
,
num_filters
=
dim_out
,
\
blob_out
=
fluid
.
layers
.
conv2d
(
input
=
blob_out
,
num_filters
=
dim_out
,
\
filter_size
=
[
1
,
1
],
stride
=
[
1
,
1
],
padding
=
[
0
,
0
],
\
filter_size
=
[
1
,
1
],
stride
=
[
1
,
1
],
padding
=
[
0
,
0
],
\
...
modules/image/object_detection/faster_rcnn_resnet50_fpn_coco2017/processor.py
View file @
c3f1e085
...
@@ -19,6 +19,12 @@ def base64_to_cv2(b64str):
     data = cv2.imdecode(data, cv2.IMREAD_COLOR)
     return data

+def check_dir(dir_path):
+    if not os.path.exists(dir_path):
+        os.makedirs(dir_path)
+    elif os.path.isfile(dir_path):
+        os.remove(dir_path)
+        os.makedirs(dir_path)

 def get_save_image_name(img, output_dir, image_path):
     """Get save image name from source image path.
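The `check_dir` helper added in this hunk can be exercised on its own. The sketch below copies the function as it appears in the diff and runs it against a temporary directory; the directory and file names are made up for the demonstration. Note one behaviour that is easy to miss: if a regular file already sits at the target path, it is deleted and replaced by a directory.

```python
import os
import tempfile


def check_dir(dir_path):
    if not os.path.exists(dir_path):
        os.makedirs(dir_path)
    elif os.path.isfile(dir_path):
        # A file is in the way: remove it, then create the directory.
        os.remove(dir_path)
        os.makedirs(dir_path)


with tempfile.TemporaryDirectory() as root:
    # Case 1: a nested path that does not exist yet.
    nested = os.path.join(root, 'results', 'run1')
    check_dir(nested)
    nested_created = os.path.isdir(nested)

    # Case 2: a regular file occupies the target path.
    blocked = os.path.join(root, 'detection_result')
    open(blocked, 'w').close()
    check_dir(blocked)
    file_replaced = os.path.isdir(blocked)

    # Case 3: calling it again on an existing directory changes nothing.
    check_dir(blocked)
    still_dir = os.path.isdir(blocked)
```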
...
@@ -48,17 +54,23 @@ def draw_bounding_box_on_image(image_path, data_list, save_dir):
image
=
Image
.
open
(
image_path
)
image
=
Image
.
open
(
image_path
)
draw
=
ImageDraw
.
Draw
(
image
)
draw
=
ImageDraw
.
Draw
(
image
)
for
data
in
data_list
:
for
data
in
data_list
:
left
,
right
,
top
,
bottom
=
data
[
'left'
],
data
[
'right'
],
data
[
'top'
],
data
[
'bottom'
]
left
,
right
,
top
,
bottom
=
data
[
'left'
],
data
[
'right'
],
data
[
'top'
],
data
[
'bottom'
]
# draw bbox
# draw bbox
draw
.
line
([(
left
,
top
),
(
left
,
bottom
),
(
right
,
bottom
),
(
right
,
top
),
(
left
,
top
)],
width
=
2
,
fill
=
'red'
)
draw
.
line
([(
left
,
top
),
(
left
,
bottom
),
(
right
,
bottom
),
(
right
,
top
),
(
left
,
top
)],
width
=
2
,
fill
=
'red'
)
# draw label
# draw label
if
image
.
mode
==
'RGB'
:
if
image
.
mode
==
'RGB'
:
text
=
data
[
'label'
]
+
": %.2f%%"
%
(
100
*
data
[
'confidence'
])
text
=
data
[
'label'
]
+
": %.2f%%"
%
(
100
*
data
[
'confidence'
])
textsize_width
,
textsize_height
=
draw
.
textsize
(
text
=
text
)
textsize_width
,
textsize_height
=
draw
.
textsize
(
text
=
text
)
draw
.
rectangle
(
draw
.
rectangle
(
xy
=
(
left
,
top
-
(
textsize_height
+
5
),
left
+
textsize_width
+
10
,
top
),
fill
=
(
255
,
255
,
255
))
xy
=
(
left
,
top
-
(
textsize_height
+
5
),
left
+
textsize_width
+
10
,
top
),
fill
=
(
255
,
255
,
255
))
draw
.
text
(
xy
=
(
left
,
top
-
15
),
text
=
text
,
fill
=
(
0
,
0
,
0
))
draw
.
text
(
xy
=
(
left
,
top
-
15
),
text
=
text
,
fill
=
(
0
,
0
,
0
))
save_name
=
get_save_image_name
(
image
,
save_dir
,
image_path
)
save_name
=
get_save_image_name
(
image
,
save_dir
,
image_path
)
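The label text drawn in this hunk is built with `data['label'] + ": %.2f%%" % (100 * data['confidence'])`. Because `%` binds more tightly than `+`, only the `": %.2f%%"` part is formatted and the label is then prepended; `%%` produces a literal percent sign. A small sketch, with a made-up detection record and an illustrative helper name:

```python
def format_label(data):
    """Build the caption the same way the drawing code does."""
    return data['label'] + ": %.2f%%" % (100 * data['confidence'])


# Hypothetical detection record with the keys used by the drawing code.
detection = {'label': 'cat', 'confidence': 0.875,
             'left': 10, 'top': 20, 'right': 110, 'bottom': 220}
text = format_label(detection)
```

For this record the caption is `cat: 87.50%`.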
...
@@ -86,7 +98,14 @@ def load_label_info(file_path):
return
label_names
return
label_names
def
postprocess
(
paths
,
images
,
data_out
,
score_thresh
,
label_names
,
output_dir
,
handle_id
,
visualization
=
True
):
def
postprocess
(
paths
,
images
,
data_out
,
score_thresh
,
label_names
,
output_dir
,
handle_id
,
visualization
=
True
):
"""
"""
postprocess the lod_tensor produced by fluid.Executor.run
postprocess the lod_tensor produced by fluid.Executor.run
...
@@ -115,16 +134,26 @@ def postprocess(paths, images, data_out, score_thresh, label_names, output_dir,
     lod = lod_tensor.lod[0]
     results = lod_tensor.as_ndarray()

-    if handle_id < len(paths):
-        unhandled_paths = paths[handle_id:]
-        unhandled_paths_num = len(unhandled_paths)
-    else:
-        unhandled_paths_num = 0
+    check_dir(output_dir)
+
+    if paths:
+        assert type(paths) is list, "type(paths) is not list."
+        if handle_id < len(paths):
+            unhandled_paths = paths[handle_id:]
+            unhandled_paths_num = len(unhandled_paths)
+        else:
+            unhandled_paths_num = 0
+    if images is not None:
+        if handle_id < len(images):
+            unhandled_paths = None
+            unhandled_paths_num = len(images) - handle_id
+        else:
+            unhandled_paths_num = 0

     output = []
     for index in range(len(lod) - 1):
         output_i = {'data': []}
-        if index < unhandled_paths_num:
+        if unhandled_paths and index < unhandled_paths_num:
             org_img_path = unhandled_paths[index]
             org_img = Image.open(org_img_path)
             output_i['path'] = org_img_path
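The new bookkeeping in this hunk handles both input styles: file paths and in-memory images. A minimal sketch of the same branching, pulled out into a standalone function, is given below. The initial values for `unhandled_paths` and `unhandled_paths_num` are an assumption added so the function is total (the diff does not show whether they are initialised earlier), and `count_unhandled` is a hypothetical name.

```python
def count_unhandled(paths, images, handle_id):
    """Return (unhandled_paths, unhandled_paths_num) as postprocess would compute them."""
    unhandled_paths = None   # assumed default, not shown in the diff
    unhandled_paths_num = 0  # assumed default, not shown in the diff
    if paths:
        assert type(paths) is list, "type(paths) is not list."
        if handle_id < len(paths):
            unhandled_paths = paths[handle_id:]
            unhandled_paths_num = len(unhandled_paths)
        else:
            unhandled_paths_num = 0
    if images is not None:
        if handle_id < len(images):
            unhandled_paths = None
            unhandled_paths_num = len(images) - handle_id
        else:
            unhandled_paths_num = 0
    return unhandled_paths, unhandled_paths_num


from_paths = count_unhandled(['a.jpg', 'b.jpg', 'c.jpg'], None, 1)
from_images = count_unhandled([], ['img0', 'img1', 'img2'], 1)
exhausted = count_unhandled(['a.jpg'], None, 5)
```

When images are supplied, `unhandled_paths` is `None`, which is why the loop condition in the diff now also checks `unhandled_paths` before indexing into it.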
...
@@ -133,7 +162,9 @@ def postprocess(paths, images, data_out, score_thresh, label_names, output_dir,
             org_img = org_img.astype(np.uint8)
             org_img = Image.fromarray(org_img[:, :, ::-1])
             if visualization:
-                org_img_path = get_save_image_name(org_img, output_dir, 'image_numpy_{}'.format((handle_id + index)))
+                org_img_path = get_save_image_name(
+                    org_img, output_dir, 'image_numpy_{}'.format(
+                        (handle_id + index)))
                 org_img.save(org_img_path)
         org_img_height = org_img.height
         org_img_width = org_img.width
...
@@ -149,11 +180,13 @@ def postprocess(paths, images, data_out, score_thresh, label_names, output_dir,
dt
=
{}
dt
=
{}
dt
[
'label'
]
=
label_names
[
category_id
]
dt
[
'label'
]
=
label_names
[
category_id
]
dt
[
'confidence'
]
=
float
(
confidence
)
dt
[
'confidence'
]
=
float
(
confidence
)
dt
[
'left'
],
dt
[
'top'
],
dt
[
'right'
],
dt
[
'bottom'
]
=
clip_bbox
(
bbox
,
org_img_width
,
org_img_height
)
dt
[
'left'
],
dt
[
'top'
],
dt
[
'right'
],
dt
[
'bottom'
]
=
clip_bbox
(
bbox
,
org_img_width
,
org_img_height
)
output_i
[
'data'
].
append
(
dt
)
output_i
[
'data'
].
append
(
dt
)
output
.
append
(
output_i
)
output
.
append
(
output_i
)
if
visualization
:
if
visualization
:
output_i
[
'save_path'
]
=
draw_bounding_box_on_image
(
org_img_path
,
output_i
[
'data'
],
output_dir
)
output_i
[
'save_path'
]
=
draw_bounding_box_on_image
(
org_img_path
,
output_i
[
'data'
],
output_dir
)
return
output
return
output
modules/image/object_detection/faster_rcnn_resnet50_fpn_coco2017/resnet.py
View file @
c3f1e085
...
@@ -90,7 +90,13 @@ class ResNet(object):
self
.
get_prediction
=
get_prediction
self
.
get_prediction
=
get_prediction
self
.
class_dim
=
class_dim
self
.
class_dim
=
class_dim
def
_conv_offset
(
self
,
input
,
filter_size
,
stride
,
padding
,
act
=
None
,
name
=
None
):
def
_conv_offset
(
self
,
input
,
filter_size
,
stride
,
padding
,
act
=
None
,
name
=
None
):
out_channel
=
filter_size
*
filter_size
*
3
out_channel
=
filter_size
*
filter_size
*
3
out
=
fluid
.
layers
.
conv2d
(
out
=
fluid
.
layers
.
conv2d
(
input
,
input
,
...
@@ -104,7 +110,15 @@ class ResNet(object):
name
=
name
)
name
=
name
)
return
out
return
out
def
_conv_norm
(
self
,
input
,
num_filters
,
filter_size
,
stride
=
1
,
groups
=
1
,
act
=
None
,
name
=
None
,
dcn_v2
=
False
):
def
_conv_norm
(
self
,
input
,
num_filters
,
filter_size
,
stride
=
1
,
groups
=
1
,
act
=
None
,
name
=
None
,
dcn_v2
=
False
):
_name
=
self
.
prefix_name
+
name
if
self
.
prefix_name
!=
''
else
name
_name
=
self
.
prefix_name
+
name
if
self
.
prefix_name
!=
''
else
name
if
not
dcn_v2
:
if
not
dcn_v2
:
conv
=
fluid
.
layers
.
conv2d
(
conv
=
fluid
.
layers
.
conv2d
(
...
@@ -129,7 +143,10 @@ class ResNet(object):
name
=
_name
+
"_conv_offset"
)
name
=
_name
+
"_conv_offset"
)
offset_channel
=
filter_size
**
2
*
2
offset_channel
=
filter_size
**
2
*
2
mask_channel
=
filter_size
**
2
mask_channel
=
filter_size
**
2
offset
,
mask
=
fluid
.
layers
.
split
(
input
=
offset_mask
,
num_or_sections
=
[
offset_channel
,
mask_channel
],
dim
=
1
)
offset
,
mask
=
fluid
.
layers
.
split
(
input
=
offset_mask
,
num_or_sections
=
[
offset_channel
,
mask_channel
],
dim
=
1
)
mask
=
fluid
.
layers
.
sigmoid
(
mask
)
mask
=
fluid
.
layers
.
sigmoid
(
mask
)
conv
=
fluid
.
layers
.
deformable_conv
(
conv
=
fluid
.
layers
.
deformable_conv
(
input
=
input
,
input
=
input
,
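The channel arithmetic for the deformable-convolution (DCN v2) branch in this hunk can be checked by hand. The offset branch needs two values (x and y displacement) per kernel position and the mask branch needs one, which is why `_conv_offset` earlier in the file produces `filter_size * filter_size * 3` channels that are then split. A tiny sketch, with an illustrative helper name:

```python
def dcn_v2_channels(filter_size):
    """Channel counts used to split the offset/mask feature map."""
    offset_channel = filter_size**2 * 2  # (dx, dy) for every kernel position
    mask_channel = filter_size**2        # one modulation scalar per position
    return offset_channel, mask_channel


offset_channel, mask_channel = dcn_v2_channels(3)
total_channel = 3 * 3 * 3  # out_channel computed by _conv_offset for a 3x3 kernel
```

For a 3x3 kernel this gives 18 offset channels and 9 mask channels, which together account for all 27 output channels of the offset convolution.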
...
@@ -151,8 +168,14 @@ class ResNet(object):
norm_lr
=
0.
if
self
.
freeze_norm
else
1.
norm_lr
=
0.
if
self
.
freeze_norm
else
1.
norm_decay
=
self
.
norm_decay
norm_decay
=
self
.
norm_decay
pattr
=
ParamAttr
(
name
=
bn_name
+
'_scale'
,
learning_rate
=
norm_lr
,
regularizer
=
L2Decay
(
norm_decay
))
pattr
=
ParamAttr
(
battr
=
ParamAttr
(
name
=
bn_name
+
'_offset'
,
learning_rate
=
norm_lr
,
regularizer
=
L2Decay
(
norm_decay
))
name
=
bn_name
+
'_scale'
,
learning_rate
=
norm_lr
,
regularizer
=
L2Decay
(
norm_decay
))
battr
=
ParamAttr
(
name
=
bn_name
+
'_offset'
,
learning_rate
=
norm_lr
,
regularizer
=
L2Decay
(
norm_decay
))
if
self
.
norm_type
in
[
'bn'
,
'sync_bn'
]:
if
self
.
norm_type
in
[
'bn'
,
'sync_bn'
]:
global_stats
=
True
if
self
.
freeze_norm
else
False
global_stats
=
True
if
self
.
freeze_norm
else
False
...
@@ -169,10 +192,17 @@ class ResNet(object):
bias
=
fluid
.
framework
.
_get_var
(
battr
.
name
)
bias
=
fluid
.
framework
.
_get_var
(
battr
.
name
)
elif
self
.
norm_type
==
'affine_channel'
:
elif
self
.
norm_type
==
'affine_channel'
:
scale
=
fluid
.
layers
.
create_parameter
(
scale
=
fluid
.
layers
.
create_parameter
(
shape
=
[
conv
.
shape
[
1
]],
dtype
=
conv
.
dtype
,
attr
=
pattr
,
default_initializer
=
fluid
.
initializer
.
Constant
(
1.
))
shape
=
[
conv
.
shape
[
1
]],
dtype
=
conv
.
dtype
,
attr
=
pattr
,
default_initializer
=
fluid
.
initializer
.
Constant
(
1.
))
bias
=
fluid
.
layers
.
create_parameter
(
bias
=
fluid
.
layers
.
create_parameter
(
shape
=
[
conv
.
shape
[
1
]],
dtype
=
conv
.
dtype
,
attr
=
battr
,
default_initializer
=
fluid
.
initializer
.
Constant
(
0.
))
shape
=
[
conv
.
shape
[
1
]],
out
=
fluid
.
layers
.
affine_channel
(
x
=
conv
,
scale
=
scale
,
bias
=
bias
,
act
=
act
)
dtype
=
conv
.
dtype
,
attr
=
battr
,
default_initializer
=
fluid
.
initializer
.
Constant
(
0.
))
out
=
fluid
.
layers
.
affine_channel
(
x
=
conv
,
scale
=
scale
,
bias
=
bias
,
act
=
act
)
if
self
.
freeze_norm
:
if
self
.
freeze_norm
:
scale
.
stop_gradient
=
True
scale
.
stop_gradient
=
True
bias
.
stop_gradient
=
True
bias
.
stop_gradient
=
True
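The `affine_channel` branch above replaces batch normalization with a fixed per-channel scale and bias. A minimal NumPy sketch of that transform follows, assuming NCHW layout. The helper name `affine_channel` and the sample values are illustrative only, and the activation argument is left out.

```python
import numpy as np


def affine_channel(x, scale, bias):
    """Per-channel affine transform for an NCHW tensor: x * scale[c] + bias[c]."""
    x = np.asarray(x, dtype=np.float32)
    # Reshape the 1-D per-channel vectors so they broadcast over N, H and W.
    scale = np.asarray(scale, dtype=np.float32).reshape(1, -1, 1, 1)
    bias = np.asarray(bias, dtype=np.float32).reshape(1, -1, 1, 1)
    return x * scale + bias


# Illustrative input: batch of 2, 3 channels, 4x4 feature map filled with ones.
x = np.ones((2, 3, 4, 4), dtype=np.float32)
out = affine_channel(x, scale=[1.0, 2.0, 3.0], bias=[0.0, 0.5, -1.0])
print(out[0, :, 0, 0])
```

When `freeze_norm` is set, the diff marks `scale` and `bias` with `stop_gradient = True`, so in that case the transform stays constant during training.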
...
@@ -192,13 +222,24 @@ class ResNet(object):
                     return self._conv_norm(input, ch_out, 3, stride, name=name)
             if max_pooling_in_short_cut and not is_first:
                 input = fluid.layers.pool2d(
-                    input=input, pool_size=2, pool_stride=2, pool_padding=0, ceil_mode=True, pool_type='avg')
+                    input=input,
+                    pool_size=2,
+                    pool_stride=2,
+                    pool_padding=0,
+                    ceil_mode=True,
+                    pool_type='avg')
                 return self._conv_norm(input, ch_out, 1, 1, name=name)
             return self._conv_norm(input, ch_out, 1, stride, name=name)
         else:
             return input
-    def bottleneck(self, input, num_filters, stride, is_first, name, dcn_v2=False):
+    def bottleneck(self,
+                   input,
+                   num_filters,
+                   stride,
+                   is_first,
+                   name,
+                   dcn_v2=False):
         if self.variant == 'a':
             stride1, stride2 = stride, 1
         else:
...
@@ -219,8 +260,9 @@ class ResNet(object):
         shortcut_name = self.na.fix_bottleneck_name(name)
         std_senet = getattr(self, 'std_senet', False)
         if std_senet:
-            conv_def = [[int(num_filters / 2), 1, stride1, 'relu', 1, conv_name1],
-                        [num_filters, 3, stride2, 'relu', groups, conv_name2],
+            conv_def = [[
+                int(num_filters / 2), 1, stride1, 'relu', 1, conv_name1
+            ], [num_filters, 3, stride2, 'relu', groups, conv_name2],
                         [num_filters * expand, 1, 1, None, 1, conv_name3]]
         else:
             conv_def = [[num_filters, 1, stride1, 'relu', 1, conv_name1],
...
@@ -238,18 +280,42 @@ class ResNet(object):
                 groups=g,
                 name=_name,
                 dcn_v2=(i == 1 and dcn_v2))
-        short = self._shortcut(input, num_filters * expand, stride, is_first=is_first, name=shortcut_name)
+        short = self._shortcut(
+            input,
+            num_filters * expand,
+            stride,
+            is_first=is_first,
+            name=shortcut_name)
         # Squeeze-and-Excitation
         if callable(getattr(self, '_squeeze_excitation', None)):
-            residual = self._squeeze_excitation(input=residual, num_channels=num_filters, name='fc' + name)
-        return fluid.layers.elementwise_add(x=short, y=residual, act='relu', name=name + ".add.output.5")
-    def basicblock(self, input, num_filters, stride, is_first, name, dcn_v2=False):
+            residual = self._squeeze_excitation(
+                input=residual, num_channels=num_filters, name='fc' + name)
+        return fluid.layers.elementwise_add(
+            x=short, y=residual, act='relu', name=name + ".add.output.5")
+    def basicblock(self,
+                   input,
+                   num_filters,
+                   stride,
+                   is_first,
+                   name,
+                   dcn_v2=False):
         assert dcn_v2 is False, "Not implemented yet."
         conv0 = self._conv_norm(
-            input=input, num_filters=num_filters, filter_size=3, act='relu', stride=stride, name=name + "_branch2a")
-        conv1 = self._conv_norm(input=conv0, num_filters=num_filters, filter_size=3, act=None, name=name + "_branch2b")
-        short = self._shortcut(input, num_filters, stride, is_first, name=name + "_branch1")
+            input=input,
+            num_filters=num_filters,
+            filter_size=3,
+            act='relu',
+            stride=stride,
+            name=name + "_branch2a")
+        conv1 = self._conv_norm(
+            input=conv0,
+            num_filters=num_filters,
+            filter_size=3,
+            act=None,
+            name=name + "_branch2b")
+        short = self._shortcut(
+            input, num_filters, stride, is_first, name=name + "_branch1")
         return fluid.layers.elementwise_add(x=short, y=conv1, act='relu')
     def layer_warp(self, input, stage_num):
...
@@ -272,7 +338,8 @@ class ResNet(object):
         nonlocal_mod = 1000
         if stage_num in self.nonlocal_stages:
-            nonlocal_mod = self.nonlocal_mod_cfg[self.depth] if stage_num == 4 else 2
+            nonlocal_mod = self.nonlocal_mod_cfg[
+                self.depth] if stage_num == 4 else 2
         # Make the layer name and parameter name consistent
         # with ImageNet pre-trained model
...
@@ -293,7 +360,9 @@ class ResNet(object):
             dim_in = conv.shape[1]
             nonlocal_name = "nonlocal_conv{}".format(stage_num)
             if i % nonlocal_mod == nonlocal_mod - 1:
-                conv = add_space_nonlocal(conv, dim_in, dim_in, nonlocal_name + '_{}'.format(i), int(dim_in / 2))
+                conv = add_space_nonlocal(conv, dim_in, dim_in,
+                                          nonlocal_name + '_{}'.format(i),
+                                          int(dim_in / 2))
         return conv
     def c1_stage(self, input):
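The condition `i % nonlocal_mod == nonlocal_mod - 1` in the hunk above decides after which residual blocks a non-local module is appended. The sketch below lists the selected block indices for a given stage. The function name and the block counts are hypothetical and chosen only for illustration.

```python
def nonlocal_block_indices(count, nonlocal_mod):
    """Return the block indices i in [0, count) that satisfy
    i % nonlocal_mod == nonlocal_mod - 1, i.e. every nonlocal_mod-th block."""
    return [i for i in range(count) if i % nonlocal_mod == nonlocal_mod - 1]


# With nonlocal_mod = 2, every second block is followed by a non-local module.
print(nonlocal_block_indices(6, 2))
# With the default of 1000, no block index in a realistic stage matches.
print(nonlocal_block_indices(6, 1000))
```

This also shows why the code initialises `nonlocal_mod = 1000`: for stages not listed in `nonlocal_stages`, the large modulus effectively disables the insertion.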
...
@@ -311,9 +380,20 @@ class ResNet(object):
             conv_def = [[out_chan, 7, 2, conv1_name]]
         for (c, k, s, _name) in conv_def:
-            input = self._conv_norm(input=input, num_filters=c, filter_size=k, stride=s, act='relu', name=_name)
-        output = fluid.layers.pool2d(input=input, pool_size=3, pool_stride=2, pool_padding=1, pool_type='max')
+            input = self._conv_norm(
+                input=input,
+                num_filters=c,
+                filter_size=k,
+                stride=s,
+                act='relu',
+                name=_name)
+        output = fluid.layers.pool2d(
+            input=input,
+            pool_size=3,
+            pool_stride=2,
+            pool_padding=1,
+            pool_type='max')
         return output
     def __call__(self, input):
...
@@ -337,17 +417,19 @@ class ResNet(object):
             if self.freeze_at >= i:
                 res.stop_gradient = True
         if self.get_prediction:
-            pool = fluid.layers.pool2d(input=res, pool_type='avg', global_pooling=True)
+            pool = fluid.layers.pool2d(
+                input=res, pool_type='avg', global_pooling=True)
             stdv = 1.0 / math.sqrt(pool.shape[1] * 1.0)
             out = fluid.layers.fc(
                 input=pool,
                 size=self.class_dim,
-                param_attr=fluid.param_attr.ParamAttr(initializer=fluid.initializer.Uniform(-stdv, stdv)))
+                param_attr=fluid.param_attr.ParamAttr(
+                    initializer=fluid.initializer.Uniform(-stdv, stdv)))
             out = fluid.layers.softmax(out)
             return out
         return OrderedDict(
             [('res{}_sum'.format(self.feature_maps[idx]), feat)
              for idx, feat in enumerate(res_endpoints)])
 class ResNetC5(ResNet):
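In the prediction branch above, the fully connected layer is initialised uniformly in `[-stdv, stdv]`, where `stdv = 1.0 / math.sqrt(pool.shape[1] * 1.0)`. A small sketch of that bound follows. The function name is hypothetical, and the channel count 2048 is an assumed example value, not something stated in the diff.

```python
import math
import random


def fc_uniform_bound(fan_in):
    """Bound used for the uniform initializer: 1 / sqrt(fan_in)."""
    return 1.0 / math.sqrt(fan_in * 1.0)


# Assumed example: 2048 pooled channels feeding the classifier.
stdv = fc_uniform_bound(2048)
# Draw a few illustrative weights from the same range.
weights = [random.uniform(-stdv, stdv) for _ in range(5)]
print(round(stdv, 6), all(-stdv <= w <= stdv for w in weights))
```

The bound shrinks as the number of input channels grows, which keeps the initial logits at a comparable scale regardless of the backbone width.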
...
@@ -360,5 +442,6 @@ class ResNetC5(ResNet):
                  variant='b',
                  feature_maps=[5],
                  weight_prefix_name=''):
-        super(ResNetC5, self).__init__(depth, freeze_at, norm_type, freeze_norm, norm_decay, variant, feature_maps)
+        super(ResNetC5, self).__init__(depth, freeze_at, norm_type, freeze_norm,
+                                       norm_decay, variant, feature_maps)
         self.severed_head = True
modules/image/object_detection/faster_rcnn_resnet50_fpn_coco2017/roi_extractor.py
View file @ c3f1e085
...
@@ -51,8 +51,8 @@ class FPNRoIAlign(object):
         name_list = list(head_inputs.keys())
         input_name_list = name_list[-num_roi_lvls:]
         spatial_scale = spatial_scale[-num_roi_lvls:]
-        rois_dist, restore_index = fluid.layers.distribute_fpn_proposals(rois, k_min, k_max, self.canconical_level,
-                                                                          self.canonical_size)
+        rois_dist, restore_index = fluid.layers.distribute_fpn_proposals(
+            rois, k_min, k_max, self.canconical_level, self.canonical_size)
         # rois_dist is in ascend order
         roi_out_list = []
         resolution = is_mask and self.mask_resolution or self.box_resolution
...
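The line `resolution = is_mask and self.mask_resolution or self.box_resolution` uses the older `and`/`or` selection idiom. The sketch below compares it with a conditional expression. The function names and the resolutions 14 and 7 are hypothetical values for illustration. The two forms differ only when the first candidate is falsy (for example 0).

```python
def pick_resolution_and_or(is_mask, mask_resolution, box_resolution):
    """Selection via the `and`/`or` idiom, as written in the diff."""
    return is_mask and mask_resolution or box_resolution


def pick_resolution_ternary(is_mask, mask_resolution, box_resolution):
    """Equivalent selection via a conditional expression."""
    return mask_resolution if is_mask else box_resolution


# Both forms agree for ordinary positive resolutions.
print(pick_resolution_and_or(True, 14, 7), pick_resolution_ternary(True, 14, 7))
print(pick_resolution_and_or(False, 14, 7), pick_resolution_ternary(False, 14, 7))
# They diverge if the mask resolution is falsy: the idiom falls through to the box value.
print(pick_resolution_and_or(True, 0, 7), pick_resolution_ternary(True, 0, 7))
```

For positive resolutions the idiom is safe, so this is a readability note rather than a bug report against the diff.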
modules/image/object_detection/faster_rcnn_resnet50_fpn_coco2017/rpn_head.py
View file @ c3f1e085
...
@@ -8,7 +8,10 @@ from paddle.fluid.param_attr import ParamAttr
 from paddle.fluid.initializer import Normal
 from paddle.fluid.regularizer import L2Decay
-__all__ = ['AnchorGenerator', 'RPNTargetAssign', 'GenerateProposals', 'RPNHead', 'FPNRPNHead']
+__all__ = [
+    'AnchorGenerator', 'RPNTargetAssign', 'GenerateProposals', 'RPNHead',
+    'FPNRPNHead'
+]
 class AnchorGenerator(object):
...
@@ -45,7 +48,12 @@ class RPNTargetAssign(object):
 class GenerateProposals(object):
     # __op__ = fluid.layers.generate_proposals
-    def __init__(self, pre_nms_top_n=6000, post_nms_top_n=1000, nms_thresh=.5, min_size=.1, eta=1.):
+    def __init__(self,
+                 pre_nms_top_n=6000,
+                 post_nms_top_n=1000,
+                 nms_thresh=.5,
+                 min_size=.1,
+                 eta=1.):
         super(GenerateProposals, self).__init__()
         self.pre_nms_top_n = pre_nms_top_n
         self.post_nms_top_n = post_nms_top_n
...
@@ -65,9 +73,17 @@ class RPNHead(object):
         test_proposal (object): `GenerateProposals` instance for testing
         num_classes (int): number of classes in rpn output
     """
-    __inject__ = ['anchor_generator', 'rpn_target_assign', 'train_proposal', 'test_proposal']
+    __inject__ = [
+        'anchor_generator', 'rpn_target_assign', 'train_proposal',
+        'test_proposal'
+    ]
-    def __init__(self, anchor_generator, rpn_target_assign, train_proposal, test_proposal, num_classes=1):
+    def __init__(self,
+                 anchor_generator,
+                 rpn_target_assign,
+                 train_proposal,
+                 test_proposal,
+                 num_classes=1):
         super(RPNHead, self).__init__()
         self.anchor_generator = anchor_generator
         self.rpn_target_assign = rpn_target_assign
...
@@ -95,8 +111,10 @@ class RPNHead(object):
             padding=1,
             act='relu',
             name='conv_rpn',
-            param_attr=ParamAttr(name="conv_rpn_w", initializer=Normal(loc=0., scale=0.01)),
-            bias_attr=ParamAttr(name="conv_rpn_b", learning_rate=2., regularizer=L2Decay(0.)))
+            param_attr=ParamAttr(
+                name="conv_rpn_w", initializer=Normal(loc=0., scale=0.01)),
+            bias_attr=ParamAttr(
+                name="conv_rpn_b", learning_rate=2., regularizer=L2Decay(0.)))
         # Generate anchors self.anchor_generator
         self.anchor, self.anchor_var = fluid.layers.anchor_generator(
             input=rpn_conv,
...
@@ -115,8 +133,13 @@ class RPNHead(object):
             padding=0,
             act=None,
             name='rpn_cls_score',
-            param_attr=ParamAttr(name="rpn_cls_logits_w", initializer=Normal(loc=0., scale=0.01)),
-            bias_attr=ParamAttr(name="rpn_cls_logits_b", learning_rate=2., regularizer=L2Decay(0.)))
+            param_attr=ParamAttr(
+                name="rpn_cls_logits_w", initializer=Normal(
+                    loc=0., scale=0.01)),
+            bias_attr=ParamAttr(
+                name="rpn_cls_logits_b",
+                learning_rate=2.,
+                regularizer=L2Decay(0.)))
         # Proposal bbox regression deltas
         self.rpn_bbox_pred = fluid.layers.conv2d(
             rpn_conv,
...
@@ -126,8 +149,12 @@ class RPNHead(object):
             padding=0,
             act=None,
             name='rpn_bbox_pred',
-            param_attr=ParamAttr(name="rpn_bbox_pred_w", initializer=Normal(loc=0., scale=0.01)),
-            bias_attr=ParamAttr(name="rpn_bbox_pred_b", learning_rate=2., regularizer=L2Decay(0.)))
+            param_attr=ParamAttr(
+                name="rpn_bbox_pred_w", initializer=Normal(loc=0., scale=0.01)),
+            bias_attr=ParamAttr(
+                name="rpn_bbox_pred_b",
+                learning_rate=2.,
+                regularizer=L2Decay(0.)))
         return self.rpn_cls_score, self.rpn_bbox_pred
     def get_proposals(self, body_feats, im_info, mode='train'):
...
@@ -150,15 +177,22 @@ class RPNHead(object):
         rpn_cls_score, rpn_bbox_pred = self._get_output(body_feat)
         if self.num_classes == 1:
-            rpn_cls_prob = fluid.layers.sigmoid(rpn_cls_score, name='rpn_cls_prob')
+            rpn_cls_prob = fluid.layers.sigmoid(
+                rpn_cls_score, name='rpn_cls_prob')
         else:
-            rpn_cls_score = fluid.layers.transpose(rpn_cls_score, perm=[0, 2, 3, 1])
-            rpn_cls_score = fluid.layers.reshape(rpn_cls_score, shape=(0, 0, 0, -1, self.num_classes))
-            rpn_cls_prob_tmp = fluid.layers.softmax(rpn_cls_score, use_cudnn=False, name='rpn_cls_prob')
-            rpn_cls_prob_slice = fluid.layers.slice(rpn_cls_prob_tmp, axes=[4], starts=[1], ends=[self.num_classes])
+            rpn_cls_score = fluid.layers.transpose(
+                rpn_cls_score, perm=[0, 2, 3, 1])
+            rpn_cls_score = fluid.layers.reshape(
+                rpn_cls_score, shape=(0, 0, 0, -1, self.num_classes))
+            rpn_cls_prob_tmp = fluid.layers.softmax(
+                rpn_cls_score, use_cudnn=False, name='rpn_cls_prob')
+            rpn_cls_prob_slice = fluid.layers.slice(
+                rpn_cls_prob_tmp, axes=[4], starts=[1], ends=[self.num_classes])
             rpn_cls_prob, _ = fluid.layers.topk(rpn_cls_prob_slice, 1)
-            rpn_cls_prob = fluid.layers.reshape(rpn_cls_prob, shape=(0, 0, 0, -1))
-            rpn_cls_prob = fluid.layers.transpose(rpn_cls_prob, perm=[0, 3, 1, 2])
+            rpn_cls_prob = fluid.layers.reshape(
+                rpn_cls_prob, shape=(0, 0, 0, -1))
+            rpn_cls_prob = fluid.layers.transpose(
+                rpn_cls_prob, perm=[0, 3, 1, 2])
         prop_op = self.train_proposal if mode == 'train' else self.test_proposal
         # prop_op
         rpn_rois, rpn_roi_probs = fluid.layers.generate_proposals(
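The multi-class branch above (transpose, reshape, softmax, slice, top-1, reshape, transpose) can be followed shape by shape in NumPy. This is only a sketch of the tensor bookkeeping, not the Paddle implementation. It assumes class index 0 is the background class (the diff slices from index 1), and the function names are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def rpn_objectness(rpn_cls_score, num_classes):
    """Map (N, A*C, H, W) logits to (N, A, H, W) best non-background probability."""
    n, _, h, w = rpn_cls_score.shape
    score = rpn_cls_score.transpose(0, 2, 3, 1)        # (N, H, W, A*C)
    score = score.reshape(n, h, w, -1, num_classes)    # (N, H, W, A, C)
    prob = softmax(score, axis=-1)                     # softmax over classes
    foreground = prob[..., 1:num_classes]              # drop index 0 (assumed background)
    best = foreground.max(axis=-1)                     # top-1 over remaining classes
    return best.transpose(0, 3, 1, 2)                  # back to (N, A, H, W)


# Illustrative input: 2 anchors x 3 classes on a 2x2 feature map, all logits equal.
scores = np.zeros((1, 6, 2, 2), dtype=np.float32)
objectness = rpn_objectness(scores, num_classes=3)
print(objectness.shape)
```

With equal logits every class receives probability 1/3, so each anchor position ends up with an objectness of 1/3.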
...
@@ -174,20 +208,24 @@ class RPNHead(object):
             eta=prop_op.eta)
         return rpn_rois
-    def _transform_input(self, rpn_cls_score, rpn_bbox_pred, anchor, anchor_var):
+    def _transform_input(self, rpn_cls_score, rpn_bbox_pred, anchor,
+                         anchor_var):
         rpn_cls_score = fluid.layers.transpose(rpn_cls_score, perm=[0, 2, 3, 1])
         rpn_bbox_pred = fluid.layers.transpose(rpn_bbox_pred, perm=[0, 2, 3, 1])
         anchor = fluid.layers.reshape(anchor, shape=(-1, 4))
         anchor_var = fluid.layers.reshape(anchor_var, shape=(-1, 4))
-        rpn_cls_score = fluid.layers.reshape(x=rpn_cls_score, shape=(0, -1, self.num_classes))
+        rpn_cls_score = fluid.layers.reshape(
+            x=rpn_cls_score, shape=(0, -1, self.num_classes))
         rpn_bbox_pred = fluid.layers.reshape(x=rpn_bbox_pred, shape=(0, -1, 4))
         return rpn_cls_score, rpn_bbox_pred, anchor, anchor_var
     def _get_loss_input(self):
         for attr in ['rpn_cls_score', 'rpn_bbox_pred', 'anchor', 'anchor_var']:
             if not getattr(self, attr, None):
-                raise ValueError("self.{} should not be None,".format(attr), "call RPNHead.get_proposals first")
-        return self._transform_input(self.rpn_cls_score, self.rpn_bbox_pred, self.anchor, self.anchor_var)
+                raise ValueError("self.{} should not be None,".format(attr),
+                                 "call RPNHead.get_proposals first")
+        return self._transform_input(self.rpn_cls_score, self.rpn_bbox_pred,
+                                     self.anchor, self.anchor_var)
     def get_loss(self, im_info, gt_box, is_crowd, gt_label=None):
         """
...
@@ -227,7 +265,8 @@ class RPNHead(object):
                     use_random=self.rpn_target_assign.use_random)
             score_tgt = fluid.layers.cast(x=score_tgt, dtype='float32')
             score_tgt.stop_gradient = True
-            rpn_cls_loss = fluid.layers.sigmoid_cross_entropy_with_logits(x=score_pred, label=score_tgt)
+            rpn_cls_loss = fluid.layers.sigmoid_cross_entropy_with_logits(
+                x=score_pred, label=score_tgt)
         else:
             score_pred, loc_pred, score_tgt, loc_tgt, bbox_weight = \
                 self.rpn_target_assign(
...
@@ -245,13 +284,19 @@ class RPNHead(object):
             rpn_cls_loss = fluid.layers.softmax_with_cross_entropy(
                 logits=score_pred, label=labels_int64, numeric_stable_mode=True)
-        rpn_cls_loss = fluid.layers.reduce_mean(rpn_cls_loss, name='loss_rpn_cls')
+        rpn_cls_loss = fluid.layers.reduce_mean(
+            rpn_cls_loss, name='loss_rpn_cls')
         loc_tgt = fluid.layers.cast(x=loc_tgt, dtype='float32')
         loc_tgt.stop_gradient = True
         rpn_reg_loss = fluid.layers.smooth_l1(
-            x=loc_pred, y=loc_tgt, sigma=3.0, inside_weight=bbox_weight, outside_weight=bbox_weight)
-        rpn_reg_loss = fluid.layers.reduce_sum(rpn_reg_loss, name='loss_rpn_bbox')
+            x=loc_pred,
+            y=loc_tgt,
+            sigma=3.0,
+            inside_weight=bbox_weight,
+            outside_weight=bbox_weight)
+        rpn_reg_loss = fluid.layers.reduce_sum(
+            rpn_reg_loss, name='loss_rpn_bbox')
         score_shape = fluid.layers.shape(score_tgt)
         score_shape = fluid.layers.cast(x=score_shape, dtype='float32')
         norm = fluid.layers.reduce_prod(score_shape)
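The regression loss above combines `smooth_l1` with `sigma=3.0` and a final `reduce_sum`. The NumPy sketch below follows the commonly documented sigma-parameterised smooth L1 form: quadratic when the absolute difference is below `1 / sigma**2`, linear otherwise. It assumes the inside weight scales the difference and the outside weight scales the per-element loss. It illustrates the formula only and is not a drop-in replacement for the Paddle operator.

```python
import numpy as np


def smooth_l1_sum(x, y, sigma=3.0, inside_weight=None, outside_weight=None):
    """Sum of element-wise smooth L1 losses between x and y."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    diff = x - y
    if inside_weight is not None:
        diff = diff * np.asarray(inside_weight, dtype=np.float64)
    abs_diff = np.abs(diff)
    sigma2 = sigma * sigma
    # Quadratic near zero, linear in the tails; the two pieces meet at 1 / sigma^2.
    loss = np.where(abs_diff < 1.0 / sigma2,
                    0.5 * sigma2 * diff ** 2,
                    abs_diff - 0.5 / sigma2)
    if outside_weight is not None:
        loss = loss * np.asarray(outside_weight, dtype=np.float64)
    return float(loss.sum())


pred = [[0.05, 2.0]]
target = [[0.0, 0.0]]
ones = [[1.0, 1.0]]
print(smooth_l1_sum(pred, target, sigma=3.0, inside_weight=ones, outside_weight=ones))
```

Passing the same `bbox_weight` as both inside and outside weight, as the diff does, means anchors with weight 0 contribute nothing to the loss.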
...
@@ -286,7 +331,8 @@ class FPNRPNHead(RPNHead):
                  min_level=2,
                  max_level=6,
                  num_classes=1):
-        super(FPNRPNHead, self).__init__(anchor_generator, rpn_target_assign, train_proposal, test_proposal)
+        super(FPNRPNHead, self).__init__(anchor_generator, rpn_target_assign,
+                                         train_proposal, test_proposal)
         self.anchor_start_size = anchor_start_size
         self.num_chan = num_chan
         self.min_level = min_level
...
@@ -328,13 +374,19 @@ class FPNRPNHead(RPNHead):
             padding=1,
             act='relu',
             name=conv_name,
-            param_attr=ParamAttr(name=conv_share_name + '_w', initializer=Normal(loc=0., scale=0.01)),
-            bias_attr=ParamAttr(name=conv_share_name + '_b', learning_rate=2., regularizer=L2Decay(0.)))
+            param_attr=ParamAttr(
+                name=conv_share_name + '_w',
+                initializer=Normal(loc=0., scale=0.01)),
+            bias_attr=ParamAttr(
+                name=conv_share_name + '_b',
+                learning_rate=2.,
+                regularizer=L2Decay(0.)))
         # self.anchor_generator
         self.anchors, self.anchor_var = fluid.layers.anchor_generator(
             input=conv_rpn_fpn,
-            anchor_sizes=(self.anchor_start_size * 2.**(feat_lvl - self.min_level), ),
+            anchor_sizes=(self.anchor_start_size * 2.**
+                          (feat_lvl - self.min_level), ),
             stride=(2.**feat_lvl, 2.**feat_lvl),
             aspect_ratios=self.anchor_generator.aspect_ratios,
             variance=self.anchor_generator.variance)
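The `anchor_sizes` and `stride` expressions above give each FPN level a single anchor size that doubles per level and a stride of `2**feat_lvl`. The sketch below tabulates both for `min_level=2` and `max_level=6`, which are the defaults shown in the diff. `anchor_start_size=32` is an assumed value for illustration only, since the diff does not show its default.

```python
def fpn_anchor_layout(anchor_start_size, min_level, max_level):
    """Return {level: (anchor_size, stride)} using the expressions from the diff."""
    layout = {}
    for feat_lvl in range(min_level, max_level + 1):
        anchor_size = anchor_start_size * 2.**(feat_lvl - min_level)
        stride = 2.**feat_lvl
        layout[feat_lvl] = (anchor_size, stride)
    return layout


# anchor_start_size=32 is a hypothetical choice, not taken from the diff.
for level, (size, stride) in fpn_anchor_layout(32, 2, 6).items():
    print(level, size, stride)
```

Under that assumption the coarsest level (6) pairs a 512-pixel anchor with a stride of 64, while the finest level (2) pairs a 32-pixel anchor with a stride of 4.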
...
@@ -346,16 +398,26 @@ class FPNRPNHead(RPNHead):
             filter_size=1,
             act=None,
             name=cls_name,
-            param_attr=ParamAttr(name=cls_share_name + '_w', initializer=Normal(loc=0., scale=0.01)),
-            bias_attr=ParamAttr(name=cls_share_name + '_b', learning_rate=2., regularizer=L2Decay(0.)))
+            param_attr=ParamAttr(
+                name=cls_share_name + '_w',
+                initializer=Normal(loc=0., scale=0.01)),
+            bias_attr=ParamAttr(
+                name=cls_share_name + '_b',
+                learning_rate=2.,
+                regularizer=L2Decay(0.)))
         self.rpn_bbox_pred = fluid.layers.conv2d(
             input=conv_rpn_fpn,
             num_filters=num_anchors * 4,
             filter_size=1,
             act=None,
             name=bbox_name,
-            param_attr=ParamAttr(name=bbox_share_name + '_w', initializer=Normal(loc=0., scale=0.01)),
-            bias_attr=ParamAttr(name=bbox_share_name + '_b', learning_rate=2., regularizer=L2Decay(0.)))
+            param_attr=ParamAttr(
+                name=bbox_share_name + '_w',
+                initializer=Normal(loc=0., scale=0.01)),
+            bias_attr=ParamAttr(
+                name=bbox_share_name + '_b',
+                learning_rate=2.,
+                regularizer=L2Decay(0.)))
         return self.rpn_cls_score, self.rpn_bbox_pred
     def _get_single_proposals(self, body_feat, im_info, feat_lvl, mode='train'):
...
@@ -375,20 +437,29 @@ class FPNRPNHead(RPNHead):
             shape of (rois_num, 1).
         """
-        rpn_cls_score_fpn, rpn_bbox_pred_fpn = self._get_output(body_feat, feat_lvl)
+        rpn_cls_score_fpn, rpn_bbox_pred_fpn = self._get_output(body_feat,
+                                                                feat_lvl)
         prop_op = self.train_proposal if mode == 'train' else self.test_proposal
         if self.num_classes == 1:
-            rpn_cls_prob_fpn = fluid.layers.sigmoid(rpn_cls_score_fpn, name='rpn_cls_prob_fpn' + str(feat_lvl))
+            rpn_cls_prob_fpn = fluid.layers.sigmoid(
+                rpn_cls_score_fpn, name='rpn_cls_prob_fpn' + str(feat_lvl))
         else:
-            rpn_cls_score_fpn = fluid.layers.transpose(rpn_cls_score_fpn, perm=[0, 2, 3, 1])
-            rpn_cls_score_fpn = fluid.layers.reshape(rpn_cls_score_fpn, shape=(0, 0, 0, -1, self.num_classes))
+            rpn_cls_score_fpn = fluid.layers.transpose(
+                rpn_cls_score_fpn, perm=[0, 2, 3, 1])
+            rpn_cls_score_fpn = fluid.layers.reshape(
+                rpn_cls_score_fpn, shape=(0, 0, 0, -1, self.num_classes))
             rpn_cls_prob_fpn = fluid.layers.softmax(
-                rpn_cls_score_fpn, use_cudnn=False, name='rpn_cls_prob_fpn' + str(feat_lvl))
-            rpn_cls_prob_fpn = fluid.layers.slice(rpn_cls_prob_fpn, axes=[4], starts=[1], ends=[self.num_classes])
+                rpn_cls_score_fpn,
+                use_cudnn=False,
+                name='rpn_cls_prob_fpn' + str(feat_lvl))
+            rpn_cls_prob_fpn = fluid.layers.slice(
+                rpn_cls_prob_fpn, axes=[4], starts=[1], ends=[self.num_classes])
             rpn_cls_prob_fpn, _ = fluid.layers.topk(rpn_cls_prob_fpn, 1)
-            rpn_cls_prob_fpn = fluid.layers.reshape(rpn_cls_prob_fpn, shape=(0, 0, 0, -1))
-            rpn_cls_prob_fpn = fluid.layers.transpose(rpn_cls_prob_fpn, perm=[0, 3, 1, 2])
+            rpn_cls_prob_fpn = fluid.layers.reshape(
+                rpn_cls_prob_fpn, shape=(0, 0, 0, -1))
+            rpn_cls_prob_fpn = fluid.layers.transpose(
+                rpn_cls_prob_fpn, perm=[0, 3, 1, 2])
         # prop_op
         rpn_rois_fpn, rpn_roi_prob_fpn = fluid.layers.generate_proposals(
             scores=rpn_cls_prob_fpn,
...
@@ -423,7 +494,8 @@ class FPNRPNHead(RPNHead):
         for lvl in range(self.min_level, self.max_level + 1):
             fpn_feat_name = fpn_feat_names[self.max_level - lvl]
             fpn_feat = fpn_feats[fpn_feat_name]
-            rois_fpn, roi_probs_fpn = self._get_single_proposals(fpn_feat, im_info, lvl, mode)
+            rois_fpn, roi_probs_fpn = self._get_single_proposals(
+                fpn_feat, im_info, lvl, mode)
             self.fpn_rpn_list.append((self.rpn_cls_score, self.rpn_bbox_pred))
             rois_list.append(rois_fpn)
             roi_probs_list.append(roi_probs_fpn)
...
@@ -432,7 +504,12 @@ class FPNRPNHead(RPNHead):
         prop_op = self.train_proposal if mode == 'train' else self.test_proposal
         post_nms_top_n = prop_op.post_nms_top_n
         rois_collect = fluid.layers.collect_fpn_proposals(
-            rois_list, roi_probs_list, self.min_level, self.max_level, post_nms_top_n, name='collect')
+            rois_list,
+            roi_probs_list,
+            self.min_level,
+            self.max_level,
+            post_nms_top_n,
+            name='collect')
         return rois_collect
     def _get_loss_input(self):
...
@@ -441,8 +518,9 @@ class FPNRPNHead(RPNHead):
         anchors = []
         anchor_vars = []
         for i in range(len(self.fpn_rpn_list)):
-            single_input = self._transform_input(self.fpn_rpn_list[i][0], self.fpn_rpn_list[i][1], self.anchors_list[i],
-                                                 self.anchor_var_list[i])
+            single_input = self._transform_input(
+                self.fpn_rpn_list[i][0], self.fpn_rpn_list[i][1],
+                self.anchors_list[i], self.anchor_var_list[i])
             rpn_clses.append(single_input[0])
             rpn_bboxes.append(single_input[1])
             anchors.append(single_input[2])
...
modules/image/object_detection/faster_rcnn_resnet50_fpn_venus/README.md
View file @ c3f1e085
# faster_rcnn_resnet50_fpn_venus

|Module Name|faster_rcnn_resnet50_fpn_venus|
| :--- | :---: |
|Category|Image - Object Detection|
|Network|faster_rcnn|
|Dataset|Baidu in-house dataset|
|Fine-tuning supported|Yes|
|Module Size|317MB|
|Latest update date|2021-02-26|
|Data indicators|-|

## I. Basic Information

- ### Module Introduction

  - Faster_RCNN is a two-stage object detector: it generates region proposals for an image, extracts features, classifies them, and refines the positions of the proposal boxes. The overall network has four parts: (1) ResNet-50 as the base convolutional layers, (2) a region proposal network, (3) RoI Align, and (4) the detection layer. This PaddleHub Module is a large-scale general-purpose detection model trained on 800+ tags, 1.7 million images, and 10 million+ bounding boxes. Across 8 datasets it improves mAP by 2.06% on average and accuracy at IoU=0.5 by 1.78% on average. Compared with other general-purpose detection models, fine-tuning from this Module converges faster and reaches better results.

## II. Installation

- ### 1. Environmental Dependence

  - paddlepaddle >= 1.6.2

  - paddlehub >= 1.6.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

- ### 2. Installation

  - ```shell
    $ hub install faster_rcnn_resnet50_fpn_venus
    ```
  - If you run into problems during installation, see: [Windows quick start](../../../../docs/docs_ch/get_start/windows_quickstart.md)
 | [Linux quick start](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [MacOS quick start](../../../../docs/docs_ch/get_start/mac_quickstart.md)

## III. Module API Prediction

- ### 1. API

  - ```python
    def context(num_classes=81,
                trainable=True,
                pretrained=True,
                phase='train')
    ```

    - Extracts features for transfer learning.

    - **Parameters**
      - num\_classes (int): number of classes;<br/>
      - trainable (bool): whether the parameters are trainable;<br/>
      - pretrained (bool): whether to load the pre-trained model;<br/>
      - phase (str): either 'train' or 'predict'; 'train' is for training, 'predict' is for prediction.

    - **Return**
      - inputs (dict): the model inputs, with the following values:
        When phase is 'train':
          - image (Variable): image variable
          - im\_size (Variable): image size
          - im\_info (Variable): image scaling information
          - gt\_class (Variable): bounding-box classes
          - gt\_box (Variable): bounding-box coordinates
          - is\_crowd (Variable): whether a single box contains multiple objects
        When phase is 'predict':
          - image (Variable): image variable
          - im\_size (Variable): image size
          - im\_info (Variable): image scaling information
      - outputs (dict): the model outputs, with the following values:
        When phase is 'train':
          - head_features (Variable): extracted features
          - rpn\_cls\_loss (Variable): bounding-box classification loss
          - rpn\_reg\_loss (Variable): bounding-box regression loss
          - generate\_proposal\_labels (Variable): image information
        When phase is 'predict':
          - head_features (Variable): extracted features
          - rois (Variable): extracted RoIs
          - bbox\_out (Variable): prediction results
      - context\_prog (Program): the Program used for transfer learning

  - ```python
    def save_inference_model(dirname,
                             model_filename=None,
                             params_filename=None,
                             combined=True)
    ```
    - Saves the model to the specified path.

    - **Parameters**
      - dirname: directory in which the model is saved; <br/>
      - model\_filename: model file name, defaults to \_\_model\_\_; <br/>
      - params\_filename: parameter file name, defaults to \_\_params\_\_ (takes effect only when `combined` is True);<br/>
      - combined: whether to save the parameters into a single file.

## IV. Release Note

* 1.0.0

  Initial release

  - ```shell
    $ hub install faster_rcnn_resnet50_fpn_venus==1.0.0
    ```
modules/image/object_detection/ssd_mobilenet_v1_pascal/README.md
# ssd_mobilenet_v1_pascal

|Module Name|ssd_mobilenet_v1_pascal|
| :--- | :---: |
|Category|Image - Object Detection|
|Network|SSD|
|Dataset|PASCAL VOC|
|Fine-tuning supported|No|
|Module Size|24MB|
|Latest update date|2021-02-26|
|Data indicators|-|

## I. Basic Information

- ### Application Effect Display
  - Sample results:
    <p align="center">
    <img src="https://user-images.githubusercontent.com/22424850/131504887-d024c7e5-fc09-4d6b-92b8-4d0c965949d0.jpg" width='50%' hspace='10'/>
    <br />
    </p>

- ### Module Introduction

  - Single Shot MultiBox Detector (SSD) is a single-stage object detector. Unlike two-stage methods, single-stage detection does not generate region proposals; it regresses bounding boxes and class probabilities directly from the feature maps. SSD builds on this idea and improves it by detecting objects of a given scale on the feature map of the corresponding scale. This PaddleHub Module uses MobileNet-v1 as its backbone, is pre-trained on the Pascal dataset, and currently supports prediction only.

## II. Installation

- ### 1. Environmental Dependence

  - paddlepaddle >= 1.6.2

  - paddlehub >= 1.6.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

- ### 2. Installation

  - ```shell
    $ hub install ssd_mobilenet_v1_pascal
    ```
  - If you run into problems during installation, see: [Windows quick start](../../../../docs/docs_ch/get_start/windows_quickstart.md)
 | [Linux quick start](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [MacOS quick start](../../../../docs/docs_ch/get_start/mac_quickstart.md)

## III. Module API Prediction

- ### 1. Command Line Prediction

  - ```shell
    $ hub run ssd_mobilenet_v1_pascal --input_path "/PATH/TO/IMAGE"
    ```
  - This invokes the object detection model from the command line. For more, see [PaddleHub command line instructions](../../../../docs/docs_ch/tutorial/cmd_usage.rst)

- ### 2. Prediction Code Example

  - ```python
    import paddlehub as hub
    import cv2

    object_detector = hub.Module(name="ssd_mobilenet_v1_pascal")
    result = object_detector.object_detection(images=[cv2.imread('/PATH/TO/IMAGE')])
    # or
    # result = object_detector.object_detection(paths=['/PATH/TO/IMAGE'])
    ```

- ### 3. API

  - ```python
    def object_detection(paths=None,
                         images=None,
                         batch_size=1,
                         use_gpu=False,
                         output_dir='detection_result',
                         score_thresh=0.5,
                         visualization=True,
    )
    ```

    - Prediction API: detects the positions of all objects in the input images.

    - **Parameters**

      - paths (list\[str\]): image paths; <br/>
      - images (list\[numpy.ndarray\]): image data, each with shape \[H, W, C\], in BGR format; <br/>
      - batch\_size (int): batch size;<br/>
      - use\_gpu (bool): whether to use the GPU;<br/>
      - output\_dir (str): directory where result images are saved, defaults to detection\_result; <br/>
      - score\_thresh (float): confidence threshold for detections;<br/>
      - visualization (bool): whether to save the detection results as image files.

      **NOTE:** provide the data through either `paths` or `images`, not both.

    - **Return**

      - res (list\[dict\]): list of results, where each element is a dict with the fields:
        - data (list): detection results, where each element is a dict with the fields:
          - confidence (float): confidence of the detection
          - label (str): label
          - left (int): x-coordinate of the top-left corner of the bounding box
          - top (int): y-coordinate of the top-left corner of the bounding box
          - right (int): x-coordinate of the bottom-right corner of the bounding box
          - bottom (int): y-coordinate of the bottom-right corner of the bounding box
        - save\_path (str, optional): path where the result is saved (present only when visualization=True)

  - ```python
    def save_inference_model(dirname,
                             model_filename=None,
                             params_filename=None,
                             combined=True)
    ```
    - Saves the model to the specified path.

    - **Parameters**

      - dirname: directory in which the model is saved; <br/>
      - model\_filename: model file name, defaults to \_\_model\_\_; <br/>
      - params\_filename: parameter file name, defaults to \_\_params\_\_ (takes effect only when `combined` is True);<br/>
      - combined: whether to save the parameters into a single file.

## IV. Server Deployment

- PaddleHub Serving can deploy an online object detection service.

- ### Step 1: Start PaddleHub Serving

  - Run the startup command:
  - ```shell
    $ hub serving start -m ssd_mobilenet_v1_pascal
    ```

  - This deploys the object detection service API; the default port is 8866.

  - **NOTE:** to predict on the GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it does not need to be set.

- ### Step 2: Send a prediction request

  - With the server configured, the following few lines of code send a prediction request and retrieve the result:

  - ```python
    import requests
    import json
    import cv2
    import base64


    def cv2_to_base64(image):
        # Encode the image as JPEG, then as a base64 string
        data = cv2.imencode('.jpg', image)[1]
        return base64.b64encode(data.tobytes()).decode('utf8')

    # Send the HTTP request
    data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
    headers = {"Content-type": "application/json"}
    url = "http://127.0.0.1:8866/predict/ssd_mobilenet_v1_pascal"
    r = requests.post(url=url, headers=headers, data=json.dumps(data))

    # Print the prediction results
    print(r.json()["results"])
    ```

## V. Release Note

* 1.0.0

  Initial release

* 1.1.2

  Fix the numpy data reading problem

  - ```shell
    $ hub install ssd_mobilenet_v1_pascal==1.1.2
    ```
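The return structure documented for `object_detection` can be consumed with plain Python. The following is a minimal sketch, not part of the module: `sample_result` is a made-up value shaped like the documented return (labels, scores, and coordinates are illustrative only), and `keep_confident` is a hypothetical helper name.

```python
# Hypothetical sample shaped like the documented return value of
# object_detection(); the labels and numbers are invented for illustration.
sample_result = [{
    "data": [
        {"label": "dog", "confidence": 0.91, "left": 12, "top": 30, "right": 220, "bottom": 310},
        {"label": "cat", "confidence": 0.42, "left": 5, "top": 8, "right": 60, "bottom": 90},
    ],
    "save_path": "detection_result/image.jpg",
}]


def keep_confident(results, min_confidence):
    """Return (label, confidence, box) tuples at or above min_confidence."""
    kept = []
    for image_result in results:          # one dict per input image
        for det in image_result["data"]:  # one dict per detected object
            if det["confidence"] >= min_confidence:
                box = (det["left"], det["top"], det["right"], det["bottom"])
                kept.append((det["label"], det["confidence"], box))
    return kept


confident = keep_confident(sample_result, 0.5)
print(confident)
```

Filtering after the call is independent of `score_thresh`, which the module applies during prediction; a second pass like this is only useful when a stricter cut-off is wanted for some downstream step.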
modules/image/object_detection/ssd_mobilenet_v1_pascal/data_feed.py
@@ -34,7 +34,11 @@ class DecodeImage(object):
 class ResizeImage(object):
     def __init__(self, target_size=0, max_size=0, interp=cv2.INTER_LINEAR, use_cv2=True):
         """
         Rescale image to the specified target size, and capped at max_size
         if max_size != 0.
@@ -88,11 +92,18 @@ class ResizeImage(object):
             resize_h = selected_size
         if self.use_cv2:
             im = cv2.resize(im, None, None, fx=im_scale_x, fy=im_scale_y, interpolation=self.interp)
         else:
             if self.max_size != 0:
                 raise TypeError('If you set max_size to cap the maximum size of image,'
                                 'please set use_cv2 to True to resize the image.')
             im = im.astype('uint8')
             im = Image.fromarray(im)
             im = im.resize((int(resize_w), int(resize_h)), self.interp)
@@ -102,7 +113,11 @@ class ResizeImage(object):
 class NormalizeImage(object):
     def __init__(self, mean=[0.485, 0.456, 0.406], std=[1, 1, 1], is_scale=True, is_channel_first=True):
         """
         Args:
             mean (list): the pixel mean
@@ -158,9 +173,11 @@ class Permute(object):
 def reader(paths=[],
            images=None,
            decode_image=DecodeImage(to_rgb=True, with_mixup=False),
            resize_image=ResizeImage(target_size=512, interp=1, max_size=0, use_cv2=False),
            permute_image=Permute(to_bgr=False),
            normalize_image=NormalizeImage(mean=[104, 117, 123], std=[1, 1, 1], is_scale=False)):
     """
     data generator
@@ -176,7 +193,8 @@ def reader(paths=[],
     if paths is not None:
         assert type(paths) is list, "type(paths) is not list."
         for img_path in paths:
             assert os.path.isfile(img_path), "The {} isn't a valid file path.".format(img_path)
             img = cv2.imread(img_path).astype('float32')
             img_list.append(img)
     if images is not None:
@@ -184,10 +202,13 @@ def reader(paths=[],
             img_list.append(img)
     decode_image = DecodeImage(to_rgb=True, with_mixup=False)
     resize_image = ResizeImage(target_size=300, interp=1, max_size=0, use_cv2=False)
     permute_image = Permute()
     normalize_image = NormalizeImage(
         mean=[127.5, 127.5, 127.5], std=[127.502231, 127.502231, 127.502231], is_scale=False)
     for img in img_list:
         preprocessed_img = decode_image(img)
...
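The `NormalizeImage` settings in the last hunk (mean 127.5, std 127.502231, `is_scale=False`) can be checked with a little arithmetic. The sketch below assumes the transform computes `(value - mean) / std` per channel, with an optional division by 255 when `is_scale` is set; that formula is an assumption, since the body of `NormalizeImage` is not shown in this diff, and `normalize_pixel` is a hypothetical helper name.

```python
def normalize_pixel(value, mean, std, is_scale=False):
    """Normalize one channel value.

    Assumed formula (not shown in the diff): optionally scale to [0, 1],
    then subtract the mean and divide by the standard deviation.
    """
    if is_scale:
        value = value / 255.0
    return (value - mean) / std


MEAN = 127.5
STD = 127.502231

low = normalize_pixel(0.0, MEAN, STD)     # darkest 8-bit value
mid = normalize_pixel(127.5, MEAN, STD)   # the mean itself
high = normalize_pixel(255.0, MEAN, STD)  # brightest 8-bit value
print(low, mid, high)
```

Under that assumption, 8-bit pixel values land in a range just inside (-1, 1), centered on zero.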
modules/image/object_detection/ssd_mobilenet_v1_pascal/mobilenet_v1.py
@@ -31,7 +31,8 @@ class MobileNet(object):
                  conv_group_scale=1,
                  conv_learning_rate=1.0,
                  with_extra_blocks=False,
                  extra_block_filters=[[256, 512], [128, 256], [128, 256], [64, 128]],
                  weight_prefix_name='',
                  class_dim=1000,
                  yolo_v3=False):
@@ -56,7 +57,9 @@ class MobileNet(object):
                    use_cudnn=True,
                    name=None):
         parameter_attr = ParamAttr(
             learning_rate=self.conv_learning_rate, initializer=fluid.initializer.MSRA(), name=name + "_weights")
         conv = fluid.layers.conv2d(
             input=input,
             num_filters=num_filters,
@@ -71,8 +74,10 @@ class MobileNet(object):
         bn_name = name + "_bn"
         norm_decay = self.norm_decay
         bn_param_attr = ParamAttr(regularizer=L2Decay(norm_decay), name=bn_name + '_scale')
         bn_bias_attr = ParamAttr(regularizer=L2Decay(norm_decay), name=bn_name + '_offset')
         return fluid.layers.batch_norm(
             input=conv,
             act=act,
@@ -81,7 +86,14 @@ class MobileNet(object):
             moving_mean_name=bn_name + '_mean',
             moving_variance_name=bn_name + '_variance')
     def depthwise_separable(self, input, num_filters1, num_filters2, num_groups, stride, scale, name=None):
         depthwise_conv = self._conv_norm(
             input=input,
             filter_size=3,
@@ -101,7 +113,13 @@ class MobileNet(object):
             name=name + "_sep")
         return pointwise_conv
     def _extra_block(self, input, num_filters1, num_filters2, num_groups, stride, name=None):
         pointwise_conv = self._conv_norm(
             input=input,
             filter_size=1,
@@ -124,44 +142,70 @@ class MobileNet(object):
         scale = self.conv_group_scale
         blocks = []
         # input 1/1
         out = self._conv_norm(input, 3, int(32 * scale), 2, 1, name=self.prefix_name + "conv1")
         # 1/2
         out = self.depthwise_separable(out, 32, 64, 32, 1, scale, name=self.prefix_name + "conv2_1")
         out = self.depthwise_separable(out, 64, 128, 64, 2, scale, name=self.prefix_name + "conv2_2")
         # 1/4
         out = self.depthwise_separable(out, 128, 128, 128, 1, scale, name=self.prefix_name + "conv3_1")
         out = self.depthwise_separable(out, 128, 256, 128, 2, scale, name=self.prefix_name + "conv3_2")
         # 1/8
         blocks.append(out)
         out = self.depthwise_separable(out, 256, 256, 256, 1, scale, name=self.prefix_name + "conv4_1")
         out = self.depthwise_separable(out, 256, 512, 256, 2, scale, name=self.prefix_name + "conv4_2")
         # 1/16
         blocks.append(out)
         for i in range(5):
             out = self.depthwise_separable(
                 out, 512, 512, 512, 1, scale, name=self.prefix_name + "conv5_" + str(i + 1))
         module11 = out
         out = self.depthwise_separable(out, 512, 1024, 512, 2, scale, name=self.prefix_name + "conv5_6")
         # 1/32
         out = self.depthwise_separable(out, 1024, 1024, 1024, 1, scale, name=self.prefix_name + "conv6")
         module13 = out
         blocks.append(out)
         if self.yolo_v3:
             return blocks
         if not self.with_extra_blocks:
             out = fluid.layers.pool2d(input=out, pool_type='avg', global_pooling=True)
             out = fluid.layers.fc(
                 input=out,
                 size=self.class_dim,
                 param_attr=ParamAttr(initializer=fluid.initializer.MSRA(), name="fc7_weights"),
                 bias_attr=ParamAttr(name="fc7_offset"))
             out = fluid.layers.softmax(out)
             blocks.append(out)
             return blocks
         num_filters = self.extra_block_filters
         module14 = self._extra_block(module13, num_filters[0][0], num_filters[0][1], 1, 2,
                                      self.prefix_name + "conv7_1")
         module15 = self._extra_block(module14, num_filters[1][0], num_filters[1][1], 1, 2,
                                      self.prefix_name + "conv7_2")
         module16 = self._extra_block(module15, num_filters[2][0], num_filters[2][1], 1, 2,
                                      self.prefix_name + "conv7_3")
         module17 = self._extra_block(module16, num_filters[3][0], num_filters[3][1], 1, 2,
                                      self.prefix_name + "conv7_4")
         return module11, module13, module14, module15, module16, module17
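The stride comments in the hunk above (`# 1/2` through `# 1/32`) imply the spatial size of each returned feature map. The sketch below works this out for a 300x300 input, which is the image shape declared in `module.py`. It assumes every 3x3 convolution uses padding 1, and that each extra block is a 1x1 convolution followed by a 3x3 convolution with stride 2; the padding values are not visible in this diff, so treat the result as an estimate. `conv_out_size` and `feature_map_sizes` are hypothetical helper names.

```python
def conv_out_size(size, kernel=3, stride=1, padding=1):
    """Standard convolution output-size formula for one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_map_sizes(input_size=300):
    """Spatial size of each feature map returned when with_extra_blocks is set.

    Assumes padding=1 for all 3x3 convolutions (an assumption, not shown
    in the diff).
    """
    size = input_size
    # conv1, conv2_1, conv2_2, conv3_1, conv3_2, conv4_1, conv4_2, conv5_1..conv5_5
    for stride in [2, 1, 2, 1, 2, 1, 2] + [1] * 5:
        size = conv_out_size(size, stride=stride)
    sizes = {"module11": size}
    # conv5_6 (stride 2) and conv6 (stride 1)
    for stride in [2, 1]:
        size = conv_out_size(size, stride=stride)
    sizes["module13"] = size
    # conv7_1 .. conv7_4: each extra block halves the resolution once
    for name in ["module14", "module15", "module16", "module17"]:
        size = conv_out_size(size, stride=2)
        sizes[name] = size
    return sizes


sizes = feature_map_sizes(300)
print(sizes)
```

Under these assumptions the six outputs have side lengths 19, 10, 5, 3, 2, and 1, so the detection head would see progressively coarser grids for progressively larger objects.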
modules/image/object_detection/ssd_mobilenet_v1_pascal/module.py
@@ -21,15 +21,17 @@ from ssd_mobilenet_v1_pascal.data_feed import reader
 @moduleinfo(
     name="ssd_mobilenet_v1_pascal",
-    version="1.1.1",
+    version="1.1.2",
     type="cv/object_detection",
     summary="SSD with backbone MobileNet_V1, trained with dataset Pasecal VOC.",
     author="paddlepaddle",
     author_email="paddle-dev@baidu.com")
 class SSDMobileNetv1(hub.Module):
     def _initialize(self):
         self.default_pretrained_model_path = os.path.join(self.directory, "ssd_mobilenet_v1_model")
         self.label_names = load_label_info(os.path.join(self.directory, "label_file.txt"))
         self.model_config = None
         self._set_config()
@@ -81,34 +83,55 @@ class SSDMobileNetv1(hub.Module):
         with fluid.program_guard(context_prog, startup_program):
             with fluid.unique_name.guard():
                 # image
                 image = fluid.layers.data(name='image', shape=[3, 300, 300], dtype='float32')
                 # backbone
                 backbone = MobileNet(**self.mobilenet_config)
                 # body_feats
                 body_feats = backbone(image)
                 # im_size
                 im_size = fluid.layers.data(name='im_size', shape=[2], dtype='int32')
                 # var_prefix
                 var_prefix = '@HUB_{}@'.format(self.name)
                 # names of inputs
                 inputs = {'image': var_prefix + image.name, 'im_size': var_prefix + im_size.name}
                 # names of outputs
                 if get_prediction:
                     locs, confs, box, box_var = fluid.layers.multi_box_head(
                         inputs=body_feats, image=image, num_classes=21, **self.multi_box_head_config)
                     pred = fluid.layers.detection_output(
                         loc=locs, scores=confs, prior_box=box, prior_box_var=box_var, **self.output_decoder_config)
                     outputs = {'bbox_out': [var_prefix + pred.name]}
                 else:
                     outputs = {'body_features': [var_prefix + var.name for var in body_feats]}
                 # add_vars_prefix
                 add_vars_prefix(context_prog, var_prefix)
                 add_vars_prefix(fluid.default_startup_program(), var_prefix)
                 # inputs
                 inputs = {key: context_prog.global_block().vars[value] for key, value in inputs.items()}
                 outputs = {
                     out_key: [context_prog.global_block().vars[varname] for varname in out_value]
                     for out_key, out_value in outputs.items()
                 }
                 # trainable
@@ -121,9 +144,14 @@ class SSDMobileNetv1(hub.Module):
                 if pretrained:
                     def _if_exist(var):
                         return os.path.exists(os.path.join(self.default_pretrained_model_path, var.name))
                     fluid.io.load_vars(exe, self.default_pretrained_model_path, predicate=_if_exist)
                 else:
                     exe.run(startup_program)
@@ -166,7 +194,7 @@ class SSDMobileNetv1(hub.Module):
                 int(_places[0])
             except:
                 raise RuntimeError(
-                    "Environment Variable CUDA_VISIBLE_DEVICES is not set correctly. If you wanna use gpu, please set CUDA_VISIBLE_DEVICES as cuda_device_id."
+                    "Attempt to use GPU for prediction, but environment variable CUDA_VISIBLE_DEVICES was not set correctly."
                 )
         paths = paths if paths else list()
@@ -196,7 +224,11 @@ class SSDMobileNetv1(hub.Module):
             res.extend(output)
         return res
     def save_inference_model(self, dirname, model_filename=None, params_filename=None, combined=True):
         if combined:
             model_filename = "__model__" if not model_filename else model_filename
             params_filename = "__params__" if not params_filename else params_filename
@@ -234,9 +266,12 @@ class SSDMobileNetv1(hub.Module):
             prog='hub run {}'.format(self.name),
             usage='%(prog)s',
             add_help=True)
         self.arg_input_group = self.parser.add_argument_group(title="Input options", description="Input data. Required")
         self.arg_config_group = self.parser.add_argument_group(
             title="Config options", description="Run configuration for controlling module behavior, not required.")
         self.add_module_config_arg()
         self.add_module_input_arg()
         args = self.parser.parse_args(argvs)
@@ -254,17 +289,34 @@ class SSDMobileNetv1(hub.Module):
         Add the command config options.
         """
         self.arg_config_group.add_argument(
             '--use_gpu', type=ast.literal_eval, default=False, help="whether use GPU or not")
         self.arg_config_group.add_argument(
             '--output_dir', type=str, default='detection_result', help="The directory to save output images.")
         self.arg_config_group.add_argument(
             '--visualization', type=ast.literal_eval, default=False, help="whether to save output as images.")
     def add_module_input_arg(self):
         """
         Add the command input options.
         """
         self.arg_input_group.add_argument('--input_path', type=str, help="path to image.")
         self.arg_input_group.add_argument('--batch_size', type=ast.literal_eval, default=1, help="batch size.")
         self.arg_input_group.add_argument(
             '--score_thresh', type=ast.literal_eval, default=0.5, help="threshold for object detecion.")
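The command-line options above use `type=ast.literal_eval`, so values are parsed as Python literals rather than plain strings. The following standalone sketch rebuilds the same option set with only the standard library to show the effect; it does not import PaddleHub, and the variable names `parser`, `config_group`, and `input_group` stand in for the module's `self.parser` and its argument groups.

```python
import argparse
import ast

# Standalone reconstruction of the options shown in the diff above.
parser = argparse.ArgumentParser(
    prog="hub run ssd_mobilenet_v1_pascal", usage="%(prog)s", add_help=True)
config_group = parser.add_argument_group(
    title="Config options",
    description="Run configuration for controlling module behavior, not required.")
input_group = parser.add_argument_group(
    title="Input options", description="Input data. Required")

config_group.add_argument("--use_gpu", type=ast.literal_eval, default=False)
config_group.add_argument("--output_dir", type=str, default="detection_result")
config_group.add_argument("--visualization", type=ast.literal_eval, default=False)
input_group.add_argument("--input_path", type=str)
input_group.add_argument("--batch_size", type=ast.literal_eval, default=1)
input_group.add_argument("--score_thresh", type=ast.literal_eval, default=0.5)

# literal_eval turns the strings "True" and "0.3" into a bool and a float.
args = parser.parse_args(
    ["--input_path", "/PATH/TO/IMAGE", "--use_gpu", "True", "--score_thresh", "0.3"])
print(args.use_gpu, args.score_thresh, args.batch_size, args.visualization)
```

Because the values go through `ast.literal_eval`, a flag such as `--use_gpu true` (lowercase) is rejected by the parser; the value has to be a valid Python literal such as `True`. Note also that the command-line default for `--visualization` in this code is `False`, whereas the `object_detection` signature in the README lists `visualization=True`.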
modules/image/object_detection/ssd_mobilenet_v1_pascal/processor.py
@@ -15,6 +15,12 @@ def base64_to_cv2(b64str):
     data = cv2.imdecode(data, cv2.IMREAD_COLOR)
     return data
+def check_dir(dir_path):
+    if not os.path.exists(dir_path):
+        os.makedirs(dir_path)
+    elif os.path.isfile(dir_path):
+        os.remove(dir_path)
+        os.makedirs(dir_path)
 def get_save_image_name(img, output_dir, image_path):
     """
@@ -44,17 +50,23 @@ def draw_bounding_box_on_image(image_path, data_list, save_dir):
     image = Image.open(image_path)
     draw = ImageDraw.Draw(image)
     for data in data_list:
         left, right, top, bottom = data['left'], data['right'], data['top'], data['bottom']
         # draw bbox
         draw.line([(left, top), (left, bottom), (right, bottom), (right, top), (left, top)], width=2, fill='red')
         # draw label
         if image.mode == 'RGB':
             text = data['label'] + ": %.2f%%" % (100 * data['confidence'])
             textsize_width, textsize_height = draw.textsize(text=text)
             draw.rectangle(
                 xy=(left, top - (textsize_height + 5), left + textsize_width + 10, top), fill=(255, 255, 255))
             draw.text(xy=(left, top - 15), text=text, fill=(0, 0, 0))
     save_name = get_save_image_name(image, save_dir, image_path)
...
@@ -83,7 +95,14 @@ def load_label_info(file_path):
...
@@ -83,7 +95,14 @@ def load_label_info(file_path):
return
label_names
return
label_names
def
postprocess
(
paths
,
images
,
data_out
,
score_thresh
,
label_names
,
output_dir
,
handle_id
,
visualization
=
True
):
def
postprocess
(
paths
,
images
,
data_out
,
score_thresh
,
label_names
,
output_dir
,
handle_id
,
visualization
=
True
):
"""
"""
postprocess the lod_tensor produced by fluid.Executor.run
postprocess the lod_tensor produced by fluid.Executor.run
@@ -111,16 +130,27 @@ def postprocess(paths, images, data_out, score_thresh, label_names, output_dir,
     lod_tensor = data_out[0]
     lod = lod_tensor.lod[0]
     results = lod_tensor.as_ndarray()
-    if handle_id < len(paths):
-        unhandled_paths = paths[handle_id:]
-        unhandled_paths_num = len(unhandled_paths)
-    else:
-        unhandled_paths_num = 0
+    check_dir(output_dir)
+    if paths:
+        assert type(paths) is list, "type(paths) is not list."
+        if handle_id < len(paths):
+            unhandled_paths = paths[handle_id:]
+            unhandled_paths_num = len(unhandled_paths)
+        else:
+            unhandled_paths_num = 0
+    if images is not None:
+        if handle_id < len(images):
+            unhandled_paths = None
+            unhandled_paths_num = len(images) - handle_id
+        else:
+            unhandled_paths_num = 0
     output = []
     for index in range(len(lod) - 1):
         output_i = {'data': []}
-        if index < unhandled_paths_num:
+        if unhandled_paths and index < unhandled_paths_num:
             org_img_path = unhandled_paths[index]
             org_img = Image.open(org_img_path)
             output_i['path'] = org_img_path
@@ -129,7 +159,9 @@ def postprocess(paths, images, data_out, score_thresh, label_names, output_dir,
             org_img = org_img.astype(np.uint8)
             org_img = Image.fromarray(org_img[:, :, ::-1])
             if visualization:
                 org_img_path = get_save_image_name(org_img, output_dir, 'image_numpy_{}'.format((handle_id + index)))
                 org_img.save(org_img_path)
         org_img_height = org_img.height
         org_img_width = org_img.width
@@ -149,11 +181,13 @@ def postprocess(paths, images, data_out, score_thresh, label_names, output_dir,
             dt = {}
             dt['label'] = label_names[category_id]
             dt['confidence'] = float(confidence)
             dt['left'], dt['top'], dt['right'], dt['bottom'] = clip_bbox(bbox, org_img_width, org_img_height)
             output_i['data'].append(dt)
         output.append(output_i)
         if visualization:
             output_i['save_path'] = draw_bounding_box_on_image(org_img_path, output_i['data'], output_dir)
     return output
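The `postprocess` changes above iterate over `range(len(lod) - 1)` and count how many inputs are still unhandled, first from `paths` and then from `images`. The sketch below restates both ideas in plain Python. It assumes that `lod` holds cumulative row offsets (the usual level-0 LoD convention, which this diff does not show explicitly), and both helper names are hypothetical.

```python
def split_by_lod(rows, lod):
    """Group result rows per image, assuming lod is a list of cumulative offsets."""
    return [rows[lod[i]:lod[i + 1]] for i in range(len(lod) - 1)]


def unhandled_count(paths, images, handle_id):
    """Side-effect-free mirror of the counting logic added in the diff."""
    count = 0
    if paths:
        count = len(paths) - handle_id if handle_id < len(paths) else 0
    if images is not None:
        # As in the diff, the images branch runs last and overrides the paths result.
        count = len(images) - handle_id if handle_id < len(images) else 0
    return count


print(split_by_lod(["a", "b", "c", "d", "e"], [0, 2, 5]))
print(unhandled_count(["x.jpg", "y.jpg"], None, 1))
print(unhandled_count(None, [object(), object(), object()], 1))
```

In the original function `unhandled_paths` is only assigned inside these branches, so a caller that supplies neither `paths` nor `images` would reach the loop with the name undefined; the sketch sidesteps that by returning a count only.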
modules/image/object_detection/ssd_vgg16_512_coco2017/README.md
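-## Command-line prediction
-```shell
-$ hub run ssd_vgg16_512_coco2017 --input_path "/PATH/TO/IMAGE"
-```
-## API
-```python
-def context(trainable=True,
-            pretrained=True,
-            get_prediction=False)
-```
-Extracts features for transfer learning.
-**Parameters**
-* trainable (bool): whether the parameters are trainable;
-* pretrained (bool): whether to load the pretrained model;
-* get\_prediction (bool): whether to run prediction.
-**Returns**
-* inputs (dict): the model inputs; the keys are 'image' and 'im\_size', with these values:
-    * image (Variable): the image variable
-    * im\_size (Variable): the image size
-* outputs (dict): the model outputs. If get\_prediction is False, the output is 'head\_features'; otherwise it is 'bbox\_out'.
-* context\_prog (Program): the Program used for transfer learning.
-```python
-def object_detection(paths=None,
-                     images=None,
-                     batch_size=1,
-                     use_gpu=False,
-                     output_dir='detection_result',
-                     score_thresh=0.5,
-                     visualization=True)
-```
-Prediction API; detects the positions of all objects in the input images.
-**Parameters**
-* paths (list\[str\]): image paths;
-* images (list\[numpy.ndarray\]): image data; each ndarray has shape \[H, W, C\] in BGR format;
-* batch\_size (int): batch size;
-* use\_gpu (bool): whether to use the GPU;
-* score\_thresh (float): confidence threshold for detections;
-* visualization (bool): whether to save the detection results as image files;
-* output\_dir (str): directory for saved images; defaults to detection\_result;
-**Returns**
-* res (list\[dict\]): list of detection results; each element is a dict with these fields:
-    * data (list): detections; each element is a dict with these fields:
-        * confidence (float): detection confidence;
-        * label (str): class label;
-        * left (int): x coordinate of the top-left corner of the bounding box;
-        * top (int): y coordinate of the top-left corner of the bounding box;
-        * right (int): x coordinate of the bottom-right corner of the bounding box;
-        * bottom (int): y coordinate of the bottom-right corner of the bounding box;
-    * save\_path (str, optional): path of the saved result image (present only when visualization=True).
-```python
-def save_inference_model(dirname,
-                         model_filename=None,
-                         params_filename=None,
-                         combined=True)
-```
-Saves the model to the specified path.
-**Parameters**
-* dirname: name of the directory that holds the model
-* model\_filename: model file name; defaults to \_\_model\_\_
-* params\_filename: parameter file name; defaults to \_\_params\_\_ (takes effect only when `combined` is True)
-* combined: whether to save the parameters into a single file
-## Code example
-```python
-import paddlehub as hub
-import cv2
-object_detector = hub.Module(name="ssd_vgg16_512_coco2017")
-result = object_detector.object_detection(images=[cv2.imread('/PATH/TO/IMAGE')])
-# or
-# result = object_detector.object_detection(paths=['/PATH/TO/IMAGE'])
-```
-## Service deployment
-PaddleHub Serving can deploy an online object detection service.
-## Step 1: Start PaddleHub Serving
-Run the start command:
-```shell
-$ hub serving start -m ssd_vgg16_512_coco2017
-```
-This deploys the object detection service API; the default port is 8866.
-**NOTE:** To predict on the GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it does not need to be set.
-## Step 2: Send a prediction request
-Once the server is configured, the following few lines send a prediction request and retrieve the result
-```python
-import requests
-import json
-import cv2
-import base64
-def cv2_to_base64(image):
-    data = cv2.imencode('.jpg', image)[1]
-    return base64.b64encode(data.tostring()).decode('utf8')
-# Send the HTTP request
-data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
-headers = {"Content-type": "application/json"}
-url = "http://127.0.0.1:8866/predict/ssd_vgg16_512_coco2017"
-r = requests.post(url=url, headers=headers, data=json.dumps(data))
-# Print the prediction result
-print(r.json()["results"])
-```
-### Dependencies
-paddlepaddle >= 1.6.2
-paddlehub >= 1.6.0
+# ssd_vgg16_512_coco2017
+|Module Name|ssd_vgg16_512_coco2017|
+| :--- | :---: |
+|Category|Image - Object Detection|
+|Network|SSD|
+|Dataset|COCO2017|
+|Fine-tuning supported|No|
+|Module Size|139MB|
+|Latest update date|2021-03-15|
+|Data indicators|-|
+## I. Basic Information
+- ### Application Effect Display
+  - Sample results:
+     <p align="center">
+     <img src="https://user-images.githubusercontent.com/22424850/131506781-b4ecb77b-5ab1-4795-88da-5f547f7f7f9c.jpg" width='50%' hspace='10'/>
+     <br />
+     </p>
+- ### Module Introduction
+  - Single Shot MultiBox Detector (SSD) is a single-stage object detector. Unlike two-stage methods, single-stage detection does not generate region proposals; it regresses bounding boxes and class probabilities directly from the feature maps. SSD builds on this idea and improves it by detecting objects of each scale on the feature map of the corresponding scale. The backbone of this PaddleHub Module is VGG16, pretrained on the Pascal dataset; it currently supports prediction only.
+## II. Installation
+- ### 1. Environment dependencies
+  - paddlepaddle >= 1.6.2
+  - paddlehub >= 1.6.0 | [How to install paddlehub](../../../../docs/docs_ch/get_start/installation.rst)
+- ### 2. Installation
+  - ```shell
+    $ hub install ssd_vgg16_512_coco2017
+    ```
+  - If you run into problems during installation, see: [Windows quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md) | [Linux quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [MacOS quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
+## III. Module API Prediction
+- ### 1. Command-line prediction
+  - ```shell
+    $ hub run ssd_vgg16_512_coco2017 --input_path "/PATH/TO/IMAGE"
+    ```
+  - This invokes the object detection module from the command line; for more, see [PaddleHub command-line instructions](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
+- ### 2. Code example
+  - ```python
+    import paddlehub as hub
+    import cv2
+    object_detector = hub.Module(name="ssd_vgg16_512_coco2017")
+    result = object_detector.object_detection(images=[cv2.imread('/PATH/TO/IMAGE')])
+    # or
+    # result = object_detector.object_detection(paths=['/PATH/TO/IMAGE'])
+    ```
+- ### 3. API
+  - ```python
+    def object_detection(paths=None,
+                         images=None,
+                         batch_size=1,
+                         use_gpu=False,
+                         output_dir='detection_result',
+                         score_thresh=0.5,
+                         visualization=True)
+    ```
+    - Prediction API; detects the positions of all objects in the input images.
+    - **Parameters**
+      - paths (list\[str\]): image paths; <br/>
+      - images (list\[numpy.ndarray\]): image data; each ndarray has shape \[H, W, C\] in BGR format; <br/>
+      - batch\_size (int): batch size;<br/>
+      - use\_gpu (bool): whether to use the GPU;<br/>
+      - output\_dir (str): directory for saved images; defaults to detection\_result;<br/>
+      - score\_thresh (float): confidence threshold for detections;<br/>
+      - visualization (bool): whether to save the detection results as image files.
+      **NOTE:** provide the data through either paths or images, not both
+    - **Returns**
+      - res (list\[dict\]): list of detection results; each element is a dict with these fields:
+        - data (list): detections; each element is a dict with these fields:
+          - confidence (float): detection confidence
+          - label (str): class label
+          - left (int): x coordinate of the top-left corner of the bounding box
+          - top (int): y coordinate of the top-left corner of the bounding box
+          - right (int): x coordinate of the bottom-right corner of the bounding box
+          - bottom (int): y coordinate of the bottom-right corner of the bounding box
+        - save\_path (str, optional): path of the saved result image (present only when visualization=True)
+  - ```python
+    def save_inference_model(dirname,
+                             model_filename=None,
+                             params_filename=None,
+                             combined=True)
+    ```
+    - Saves the model to the specified path.
+    - **Parameters**
+      - dirname: name of the directory that holds the model; <br/>
+      - model\_filename: model file name; defaults to \_\_model\_\_; <br/>
+      - params\_filename: parameter file name; defaults to \_\_params\_\_ (takes effect only when `combined` is True);<br/>
+      - combined: whether to save the parameters into a single file.
+## IV. Server Deployment
+- PaddleHub Serving can deploy an online object detection service.
+- ### Step 1: Start PaddleHub Serving
+  - Run the start command:
+  - ```shell
+    $ hub serving start -m ssd_vgg16_512_coco2017
+    ```
+  - This deploys the object detection service API; the default port is 8866.
+  - **NOTE:** To predict on the GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it does not need to be set.
+- ### Step 2: Send a prediction request
+  - Once the server is configured, the following few lines send a prediction request and retrieve the result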
+  - ```python
+    import requests
+    import json
+    import cv2
+    import base64
+    def cv2_to_base64(image):
+      data = cv2.imencode('.jpg', image)[1]
+      return base64.b64encode(data.tostring()).decode('utf8')
+    # Send the HTTP request
+    data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
+    headers = {"Content-type": "application/json"}
+    url = "http://127.0.0.1:8866/predict/ssd_vgg16_512_coco2017"
+    r = requests.post(url=url, headers=headers, data=json.dumps(data))
+    # Print the prediction result
+    print(r.json()["results"])
+    ```
+## V. Release Note
+* 1.0.0
+  Initial release
+* 1.0.2
+  Fix the numpy data reading problem
+  - ```shell
+    $ hub install ssd_vgg16_512_coco2017==1.0.2
+    ```
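The README documents the return value of `object_detection` as a list of dicts, each holding a `data` list of detections and an optional `save_path`. The sketch below shows one way such a result could be consumed without PaddleHub installed; `summarize` and the sample data are hypothetical and only follow the documented field names.

```python
def summarize(results, min_confidence=0.5):
    """Return (label, confidence, box_area) for detections at or above min_confidence."""
    summary = []
    for item in results:
        for det in item['data']:
            if det['confidence'] >= min_confidence:
                area = (det['right'] - det['left']) * (det['bottom'] - det['top'])
                summary.append((det['label'], det['confidence'], area))
    return summary


# Hypothetical output shaped like the documented return value.
sample = [{
    'data': [
        {'label': 'cat', 'confidence': 0.91, 'left': 10, 'top': 20, 'right': 110, 'bottom': 220},
        {'label': 'dog', 'confidence': 0.32, 'left': 0, 'top': 0, 'right': 50, 'bottom': 40},
    ],
    'save_path': 'detection_result/image_numpy_0.jpg',
}]
print(summarize(sample))
```

Since `save_path` is present only when `visualization=True`, code that reads it would do better to use `item.get('save_path')` than direct indexing.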
modules/image/object_detection/ssd_vgg16_512_coco2017/data_feed.py
@@ -34,7 +34,11 @@ class DecodeImage(object):
 class ResizeImage(object):
     def __init__(self, target_size=0, max_size=0, interp=cv2.INTER_LINEAR, use_cv2=True):
         """
         Rescale image to the specified target size, and capped at max_size
         if max_size != 0.
@@ -88,11 +92,18 @@ class ResizeImage(object):
             resize_h = selected_size
         if self.use_cv2:
             im = cv2.resize(im, None, None, fx=im_scale_x, fy=im_scale_y, interpolation=self.interp)
         else:
             if self.max_size != 0:
-                raise TypeError('If you set max_size to cap the maximum size of image,'
-                                'please set use_cv2 to True to resize the image.')
+                raise TypeError(
+                    'If you set max_size to cap the maximum size of image,'
+                    'please set use_cv2 to True to resize the image.')
             im = im.astype('uint8')
             im = Image.fromarray(im)
             im = im.resize((int(resize_w), int(resize_h)), self.interp)
@@ -102,7 +113,11 @@ class ResizeImage(object):
 class NormalizeImage(object):
     def __init__(self, mean=[0.485, 0.456, 0.406], std=[1, 1, 1], is_scale=True, is_channel_first=True):
         """
         Args:
             mean (list): the pixel mean
@@ -158,9 +173,11 @@ class Permute(object):
 def reader(paths=[],
            images=None,
            decode_image=DecodeImage(to_rgb=True, with_mixup=False),
            resize_image=ResizeImage(target_size=512, interp=1, max_size=0, use_cv2=False),
            permute_image=Permute(to_bgr=False),
            normalize_image=NormalizeImage(mean=[104, 117, 123], std=[1, 1, 1], is_scale=False)):
     """
     data generator
@@ -176,7 +193,8 @@ def reader(paths=[],
     if paths is not None:
         assert type(paths) is list, "type(paths) is not list."
         for img_path in paths:
             assert os.path.isfile(img_path), "The {} isn't a valid file path.".format(img_path)
             img = cv2.imread(img_path).astype('float32')
             img_list.append(img)
     if images is not None:
...
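The `ResizeImage` docstring says the image is rescaled to `target_size` and capped at `max_size` when `max_size != 0`. The hunks do not show how the scale is derived, so the following is only a sketch of one common reading (scale the shorter side to `target_size`, then shrink if the longer side would exceed `max_size`); `compute_scale` is a hypothetical helper, not code from `data_feed.py`.

```python
def compute_scale(height, width, target_size, max_size=0):
    """Scale the shorter side to target_size; cap so the longer side stays within max_size."""
    short_side = min(height, width)
    long_side = max(height, width)
    scale = float(target_size) / short_side
    # max_size == 0 means "no cap", matching the docstring's condition.
    if max_size != 0 and round(scale * long_side) > max_size:
        scale = float(max_size) / long_side
    return scale


print(compute_scale(480, 640, target_size=600, max_size=1000))
print(compute_scale(300, 1200, target_size=600, max_size=1000))
```

The `reader` defaults in this module use `target_size=512` with `max_size=0` and `use_cv2=False`, so this capped branch would not be reached there; as the hunk shows, combining `max_size != 0` with `use_cv2=False` raises a `TypeError`.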
modules/image/object_detection/ssd_vgg16_512_coco2017/module.py
@@ -21,15 +21,17 @@ from ssd_vgg16_512_coco2017.data_feed import reader
 @moduleinfo(
     name="ssd_vgg16_512_coco2017",
-    version="1.0.1",
+    version="1.0.2",
     type="cv/object_detection",
     summary="SSD with backbone VGG16, trained with dataset COCO.",
     author="paddlepaddle",
     author_email="paddle-dev@baidu.com")
 class SSDVGG16_512(hub.Module):
     def _initialize(self):
         self.default_pretrained_model_path = os.path.join(self.directory, "ssd_vgg16_512_model")
         self.label_names = load_label_info(os.path.join(self.directory, "label_file.txt"))
         self.model_config = None
         self._set_config()
@@ -80,39 +82,63 @@ class SSDVGG16_512(hub.Module):
         with fluid.program_guard(context_prog, startup_program):
             with fluid.unique_name.guard():
                 # image
                 image = fluid.layers.data(name='image', shape=[3, 512, 512], dtype='float32')
                 # backbone
                 backbone = VGG(
                     depth=16,
                     with_extra_blocks=True,
                     normalizations=[20., -1, -1, -1, -1, -1, -1],
                     extra_block_filters=[[256, 512, 1, 2, 3], [128, 256, 1, 2, 3], [128, 256, 1, 2, 3],
                                          [128, 256, 1, 2, 3], [128, 256, 1, 1, 4]])
                 # body_feats
                 body_feats = backbone(image)
                 # im_size
                 im_size = fluid.layers.data(name='im_size', shape=[2], dtype='int32')
                 # var_prefix
                 var_prefix = '@HUB_{}@'.format(self.name)
                 # names of inputs
                 inputs = {'image': var_prefix + image.name, 'im_size': var_prefix + im_size.name}
                 # names of outputs
                 if get_prediction:
                     locs, confs, box, box_var = fluid.layers.multi_box_head(
                         inputs=body_feats, image=image, num_classes=81, **self.multi_box_head_config)
                     pred = fluid.layers.detection_output(
                         loc=locs, scores=confs, prior_box=box, prior_box_var=box_var, **self.output_decoder_config)
                     outputs = {'bbox_out': [var_prefix + pred.name]}
                 else:
                     outputs = {'body_features': [var_prefix + var.name for var in body_feats]}
                 # add_vars_prefix
                 add_vars_prefix(context_prog, var_prefix)
                 add_vars_prefix(fluid.default_startup_program(), var_prefix)
                 # inputs
                 inputs = {key: context_prog.global_block().vars[value] for key, value in inputs.items()}
                 outputs = {
                     out_key: [context_prog.global_block().vars[varname] for varname in out_value]
                     for out_key, out_value in outputs.items()
                 }
                 # trainable
@@ -125,9 +151,14 @@ class SSDVGG16_512(hub.Module):
                 if pretrained:
                     def _if_exist(var):
                         return os.path.exists(os.path.join(self.default_pretrained_model_path, var.name))
                     fluid.io.load_vars(exe, self.default_pretrained_model_path, predicate=_if_exist)
                 else:
                     exe.run(startup_program)
@@ -169,7 +200,7 @@ class SSDVGG16_512(hub.Module):
                 int(_places[0])
             except:
                 raise RuntimeError(
-                    "Environment Variable CUDA_VISIBLE_DEVICES is not set correctly. If you wanna use gpu, please set CUDA_VISIBLE_DEVICES as cuda_device_id."
+                    "Attempt to use GPU for prediction, but environment variable CUDA_VISIBLE_DEVICES was not set correctly."
                 )
         paths = paths if paths else list()
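The hunk above rewords the error raised when `int(_places[0])` fails during the GPU check. As a sketch of that validation in isolation, the helper below assumes `_places` is read from the `CUDA_VISIBLE_DEVICES` environment variable (that line is outside the hunk shown); `check_cuda_visible_devices` and its `environ` parameter are hypothetical, added so the check can be tried without touching the real environment.

```python
import os


def check_cuda_visible_devices(environ=None):
    """Raise RuntimeError unless CUDA_VISIBLE_DEVICES is set and starts with a digit."""
    environ = os.environ if environ is None else environ
    try:
        places = environ["CUDA_VISIBLE_DEVICES"]  # assumed source of `_places`
        int(places[0])
    except (KeyError, IndexError, ValueError):
        raise RuntimeError(
            "Attempt to use GPU for prediction, but environment variable "
            "CUDA_VISIBLE_DEVICES was not set correctly.")


check_cuda_visible_devices({"CUDA_VISIBLE_DEVICES": "0,1"})  # passes silently
```

The sketch catches the three specific exceptions that the lookup and conversion can raise, where the original uses a bare `except:`; this is a deliberate narrowing, not a description of the module's behaviour.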
@@ -196,7 +227,11 @@ class SSDVGG16_512(hub.Module):
             res.extend(output)
         return res
     def save_inference_model(self, dirname, model_filename=None, params_filename=None, combined=True):
         if combined:
             model_filename = "__model__" if not model_filename else model_filename
             params_filename = "__params__" if not params_filename else params_filename
@@ -234,9 +269,12 @@ class SSDVGG16_512(hub.Module):
             prog='hub run {}'.format(self.name),
             usage='%(prog)s',
             add_help=True)
         self.arg_input_group = self.parser.add_argument_group(title="Input options", description="Input data. Required")
         self.arg_config_group = self.parser.add_argument_group(
             title="Config options", description="Run configuration for controlling module behavior, not required.")
         self.add_module_config_arg()
         self.add_module_input_arg()
         args = self.parser.parse_args(argvs)
@@ -254,17 +292,34 @@ class SSDVGG16_512(hub.Module):
         Add the command config options.
         """
         self.arg_config_group.add_argument(
             '--use_gpu', type=ast.literal_eval, default=False, help="whether use GPU or not")
         self.arg_config_group.add_argument(
             '--output_dir', type=str, default='detection_result', help="The directory to save output images.")
         self.arg_config_group.add_argument(
             '--visualization', type=ast.literal_eval, default=False, help="whether to save output as images.")
     def add_module_input_arg(self):
         """
         Add the command input options.
         """
         self.arg_input_group.add_argument('--input_path', type=str, help="path to image.")
         self.arg_input_group.add_argument('--batch_size', type=ast.literal_eval, default=1, help="batch size.")
         self.arg_input_group.add_argument(
             '--score_thresh', type=ast.literal_eval, default=0.5, help="threshold for object detection.")
modules/image/object_detection/ssd_vgg16_512_coco2017/processor.py
@@ -15,6 +15,12 @@ def base64_to_cv2(b64str):
     data = cv2.imdecode(data, cv2.IMREAD_COLOR)
     return data
+def check_dir(dir_path):
+    if not os.path.exists(dir_path):
+        os.makedirs(dir_path)
+    elif os.path.isfile(dir_path):
+        os.remove(dir_path)
+        os.makedirs(dir_path)
 def get_save_image_name(img, output_dir, image_path):
     """
@@ -44,17 +50,23 @@ def draw_bounding_box_on_image(image_path, data_list, save_dir):
     image = Image.open(image_path)
     draw = ImageDraw.Draw(image)
     for data in data_list:
         left, right, top, bottom = data['left'], data['right'], data['top'], data['bottom']
         # draw bbox
         draw.line([(left, top), (left, bottom), (right, bottom), (right, top), (left, top)], width=2, fill='red')
         # draw label
         if image.mode == 'RGB':
             text = data['label'] + ": %.2f%%" % (100 * data['confidence'])
             textsize_width, textsize_height = draw.textsize(text=text)
             draw.rectangle(
                 xy=(left, top - (textsize_height + 5), left + textsize_width + 10, top), fill=(255, 255, 255))
             draw.text(xy=(left, top - 15), text=text, fill=(0, 0, 0))
     save_name = get_save_image_name(image, save_dir, image_path)
@@ -83,7 +95,14 @@ def load_label_info(file_path):
     return label_names
 def postprocess(paths, images, data_out, score_thresh, label_names, output_dir, handle_id, visualization=True):
     """
     postprocess the lod_tensor produced by fluid.Executor.run
@@ -111,16 +130,27 @@ def postprocess(paths, images, data_out, score_thresh, label_names, output_dir,
     lod_tensor = data_out[0]
     lod = lod_tensor.lod[0]
     results = lod_tensor.as_ndarray()
-    if handle_id < len(paths):
-        unhandled_paths = paths[handle_id:]
-        unhandled_paths_num = len(unhandled_paths)
-    else:
-        unhandled_paths_num = 0
+    check_dir(output_dir)
+    if paths:
+        assert type(paths) is list, "type(paths) is not list."
+        if handle_id < len(paths):
+            unhandled_paths = paths[handle_id:]
+            unhandled_paths_num = len(unhandled_paths)
+        else:
+            unhandled_paths_num = 0
+    if images is not None:
+        if handle_id < len(images):
+            unhandled_paths = None
+            unhandled_paths_num = len(images) - handle_id
+        else:
+            unhandled_paths_num = 0
     output = []
     for index in range(len(lod) - 1):
         output_i = {'data': []}
-        if index < unhandled_paths_num:
+        if unhandled_paths and index < unhandled_paths_num:
             org_img_path = unhandled_paths[index]
             org_img = Image.open(org_img_path)
             output_i['path'] = org_img_path
@@ -129,7 +159,9 @@ def postprocess(paths, images, data_out, score_thresh, label_names, output_dir,
             org_img = org_img.astype(np.uint8)
             org_img = Image.fromarray(org_img[:, :, ::-1])
             if visualization:
                 org_img_path = get_save_image_name(org_img, output_dir, 'image_numpy_{}'.format((handle_id + index)))
                 org_img.save(org_img_path)
         org_img_height = org_img.height
         org_img_width = org_img.width
@@ -149,11 +181,13 @@ def postprocess(paths, images, data_out, score_thresh, label_names, output_dir,
             dt = {}
             dt['label'] = label_names[category_id]
             dt['confidence'] = float(confidence)
             dt['left'], dt['top'], dt['right'], dt['bottom'] = clip_bbox(bbox, org_img_width, org_img_height)
             output_i['data'].append(dt)
         output.append(output_i)
         if visualization:
             output_i['save_path'] = draw_bounding_box_on_image(org_img_path, output_i['data'], output_dir)
     return output
modules/image/object_detection/ssd_vgg16_512_coco2017/vgg.py
...
@@ -27,8 +27,8 @@ class VGG(object):
...
@@ -27,8 +27,8 @@ class VGG(object):
depth
=
16
,
depth
=
16
,
with_extra_blocks
=
False
,
with_extra_blocks
=
False
,
normalizations
=
[
20.
,
-
1
,
-
1
,
-
1
,
-
1
,
-
1
],
normalizations
=
[
20.
,
-
1
,
-
1
,
-
1
,
-
1
,
-
1
],
extra_block_filters
=
[[
256
,
512
,
1
,
2
,
3
],
[
128
,
256
,
1
,
2
,
3
],
[
128
,
256
,
0
,
1
,
3
],
extra_block_filters
=
[[
256
,
512
,
1
,
2
,
3
],
[
128
,
256
,
1
,
2
,
3
],
[
128
,
256
,
0
,
1
,
3
]],
[
128
,
256
,
0
,
1
,
3
]
,
[
128
,
256
,
0
,
1
,
3
]
],
class_dim
=
1000
):
class_dim
=
1000
):
assert
depth
in
[
16
,
19
],
"depth {} not in [16, 19]"
assert
depth
in
[
16
,
19
],
"depth {} not in [16, 19]"
self
.
depth
=
depth
self
.
depth
=
depth
@@ -60,7 +60,8 @@ class VGG(object):
        res_layer = []
        layers = []
        for k, v in enumerate(vgg_base):
            conv = self._conv_block(conv, v, nums[k], name="conv{}_".format(k + 1))
            layers.append(conv)
            if self.with_extra_blocks:
                if k == 4:
@@ -76,19 +77,25 @@ class VGG(object):
            input=conv,
            size=fc_dim,
            act='relu',
            param_attr=fluid.param_attr.ParamAttr(
                name=fc_name[0] + "_weights"),
            bias_attr=fluid.param_attr.ParamAttr(
                name=fc_name[0] + "_offset"))
        fc2 = fluid.layers.fc(
            input=fc1,
            size=fc_dim,
            act='relu',
            param_attr=fluid.param_attr.ParamAttr(
                name=fc_name[1] + "_weights"),
            bias_attr=fluid.param_attr.ParamAttr(
                name=fc_name[1] + "_offset"))
        out = fluid.layers.fc(
            input=fc2,
            size=self.class_dim,
            param_attr=fluid.param_attr.ParamAttr(
                name=fc_name[2] + "_weights"),
            bias_attr=fluid.param_attr.ParamAttr(
                name=fc_name[2] + "_offset"))
        out = fluid.layers.softmax(out)
        res_layer.append(out)
        return [out]
@@ -103,7 +110,14 @@ class VGG(object):
        layers = []
        for k, v in enumerate(cfg):
            assert len(v) == 5, "extra_block_filters size not fix"
            conv = self._extra_block(
                conv, v[0], v[1], v[2], v[3], v[4], name="conv{}_".format(6 + k))
            layers.append(conv)
        return layers
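Each `extra_block_filters` entry must hold exactly five values, which the loop above passes positionally to `_extra_block` as `num_filters1, num_filters2, padding_size, stride_size, filter_size`. The sketch below names those fields and tracks the feature-map size through the default configuration. It assumes that the second convolution of each block uses the entry's padding, stride and filter size (that call is truncated in this diff) and that the usual floor formula for convolution output size applies. `ExtraBlockCfg`, `parse_extra_blocks`, `conv_out_size` and the starting size of 19 are hypothetical.

```python
from collections import namedtuple

# Field order follows the _extra_block signature shown in this diff.
ExtraBlockCfg = namedtuple(
    "ExtraBlockCfg",
    ["num_filters1", "num_filters2", "padding_size", "stride_size", "filter_size"])


def parse_extra_blocks(cfg):
    """Validate and name every 5-value entry of extra_block_filters."""
    blocks = []
    for v in cfg:
        assert len(v) == 5, "each extra block needs exactly 5 values"
        blocks.append(ExtraBlockCfg(*v))
    return blocks


def conv_out_size(size, block):
    """floor((size + 2 * padding - filter) / stride) + 1 for the strided conv."""
    return (size + 2 * block.padding_size - block.filter_size) // block.stride_size + 1


extra_block_filters = [[256, 512, 1, 2, 3], [128, 256, 1, 2, 3], [128, 256, 0, 1, 3], [128, 256, 0, 1, 3]]
size = 19  # hypothetical side length of the feature map entering the extra blocks
sizes = []
for block in parse_extra_blocks(extra_block_filters):
    size = conv_out_size(size, block)
    sizes.append((block.num_filters2, size))
print(sizes)
```

Under these assumptions the two strided blocks roughly halve the map and the two unpadded stride-1 blocks trim it by two pixels each.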
@@ -121,10 +135,23 @@ class VGG(object):
                name=name + str(i + 1))
        return conv

    def _extra_block(self, input, num_filters1, num_filters2, padding_size, stride_size, filter_size, name=None):
        # 1x1 conv
        conv_1 = self._conv_layer(
            input=input,
            num_filters=int(num_filters1),
            filter_size=1,
            stride=1,
            act='relu',
            padding=0,
            name=name + "1")
        # 3x3 conv
        conv_2 = self._conv_layer(
@@ -157,11 +184,17 @@ class VGG(object):
            act=act,
            use_cudnn=use_cudnn,
            param_attr=ParamAttr(name=name + "_weights"),
            bias_attr=ParamAttr(name=name + "_biases") if self.with_extra_blocks else False,
            name=name + '.conv2d.output.1')
        return conv

    def _pooling_block(self, conv, pool_size, pool_stride, pool_padding=0, ceil_mode=True):
        pool = fluid.layers.pool2d(
            input=conv,
            pool_size=pool_size,
@@ -175,10 +208,17 @@ class VGG(object):
        from paddle.fluid.layer_helper import LayerHelper
        from paddle.fluid.initializer import Constant
        helper = LayerHelper("Scale")
        l2_norm = fluid.layers.l2_normalize(input, axis=1)  # l2 norm along channel
        shape = [1] if channel_shared else [input.shape[1]]
        scale = helper.create_parameter(
            attr=helper.param_attr,
            shape=shape,
            dtype=input.dtype,
            default_initializer=Constant(init_scale))
        out = fluid.layers.elementwise_mul(
            x=l2_norm, y=scale, axis=-1 if channel_shared else 1, name="conv4_3_norm_scale")
        return out
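The last hunk L2-normalises the feature map along the channel axis and multiplies it by a learnable scale, which is either a single shared value or one value per channel. The dependency-free sketch below does the same for one spatial position. The epsilon and its placement are assumptions rather than Paddle's exact definition, and the activation values are hypothetical. The scale of 20 matches the first entry of `normalizations` in the constructor.

```python
import math


def l2_normalize_channels(pixel, eps=1e-10):
    """L2-normalise the channel values of one spatial position (eps is assumed)."""
    norm = math.sqrt(sum(c * c for c in pixel))
    return [c / (norm + eps) for c in pixel]


def norm_scale(pixel, scale):
    """Normalise, then multiply by a shared scalar or a per-channel scale."""
    normed = l2_normalize_channels(pixel)
    if len(scale) == 1:  # corresponds to channel_shared=True
        scale = scale * len(pixel)
    return [n * s for n, s in zip(normed, scale)]


pixel = [3.0, 4.0]  # hypothetical 2-channel activation with L2 norm 5
shared = norm_scale(pixel, [20.0])
per_channel = norm_scale(pixel, [20.0, 10.0])
print(shared, per_channel)
```

With the norm fixed at 1, the scale alone sets the magnitude of the features that reach the detection head.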
modules/image/object_detection/yolov3_darknet53_coco2017/README.md
View file @ c3f1e085
-## Command-line prediction
-
-```shell
-$ hub run yolov3_darknet53_coco2017 --input_path "/PATH/TO/IMAGE"
-```
-
-## API
-
-```python
-def context(trainable=True, pretrained=True, get_prediction=False)
-```
-
-Extracts features for transfer learning.
-
-**Parameters**
-
-* trainable (bool): whether the parameters are trainable;
-* pretrained (bool): whether to load the pretrained model;
-* get\_prediction (bool): whether to run prediction.
-
-**Returns**
-
-* inputs (dict): the model inputs; the keys are 'image' and 'im\_size', with the following values:
-    * image (Variable): the image variable
-    * im\_size (Variable): the image size
-* outputs (dict): the model outputs. If get\_prediction is False, the outputs are 'head\_features' and 'body\_features'; otherwise the output is 'bbox\_out'.
-* context\_prog (Program): the Program used for transfer learning.
-
-```python
-def object_detection(paths=None,
-                     images=None,
-                     batch_size=1,
-                     use_gpu=False,
-                     output_dir='detection_result',
-                     score_thresh=0.5,
-                     visualization=True)
-```
-
-Prediction API; detects the positions of all objects in the input images.
-
-**Parameters**
-
-* paths (list\[str\]): image paths;
-* images (list\[numpy.ndarray\]): image data; each ndarray has shape \[H, W, C\] in BGR format;
-* batch\_size (int): batch size;
-* use\_gpu (bool): whether to use the GPU;
-* score\_thresh (float): confidence threshold for detections;
-* visualization (bool): whether to save the detection results as image files;
-* output\_dir (str): directory where the images are saved, detection\_result by default;
-
-**Returns**
-
-* res (list\[dict\]): list of detection results; each element is a dict with the fields:
-    * data (list): detection results; each element is a dict with the fields:
-        * confidence (float): detection confidence;
-        * label (str): label;
-        * left (int): x coordinate of the top-left corner of the bounding box;
-        * top (int): y coordinate of the top-left corner of the bounding box;
-        * right (int): x coordinate of the bottom-right corner of the bounding box;
-        * bottom (int): y coordinate of the bottom-right corner of the bounding box;
-    * save\_path (str, optional): path where the result is saved (present only when visualization=True).
-
-```python
-def save_inference_model(dirname, model_filename=None, params_filename=None, combined=True)
-```
-
-Saves the model to the specified path.
-
-**Parameters**
-
-* dirname: name of the directory in which the model is stored
-* model\_filename: model file name, \_\_model\_\_ by default
-* params\_filename: parameter file name, \_\_params\_\_ by default (takes effect only when `combined` is True)
-* combined: whether to save the parameters into a single file
-
-## Code example
-
-```python
-import paddlehub as hub
-import cv2
-
-object_detector = hub.Module(name="yolov3_darknet53_coco2017")
-result = object_detector.object_detection(images=[cv2.imread('/PATH/TO/IMAGE')])
-# or
-# result = object_detector.object_detection(paths=['/PATH/TO/IMAGE'])
-```
-
-## Service deployment
-
-PaddleHub Serving can deploy an online object detection service.
-
-## Step 1: Start PaddleHub Serving
-
-Run the start command:
-
-```shell
-$ hub serving start -m yolov3_darknet53_coco2017
-```
-
-This deploys an object detection service API; the default port is 8866.
-
-**NOTE:** To predict on a GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it does not need to be set.
-
-## Step 2: Send a prediction request
-
-Once the server is configured, the following few lines of code send a prediction request and fetch the result.
-
-```python
-import requests
-import json
-import cv2
-import base64
-
-def cv2_to_base64(image):
-    data = cv2.imencode('.jpg', image)[1]
-    return base64.b64encode(data.tostring()).decode('utf8')
-
-# Send the HTTP request
-data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
-headers = {"Content-type": "application/json"}
-url = "http://127.0.0.1:8866/predict/yolov3_darknet53_coco2017"
-r = requests.post(url=url, headers=headers, data=json.dumps(data))
-
-# Print the prediction results
-print(r.json()["results"])
-```
-
-### Dependencies
-
-paddlepaddle >= 1.6.2
-
-paddlehub >= 1.6.0
+# yolov3_darknet53_coco2017
+
+|Model name|yolov3_darknet53_coco2017|
+| :--- | :---: |
+|Category|Image - Object Detection|
+|Network|YOLOv3|
+|Dataset|COCO2017|
+|Fine-tuning supported|No|
+|Model size|239MB|
+|Latest update date|2021-02-26|
+|Metrics|-|
+
+## I. Basic Information
+
+- ### Demo
+  - Sample result:
+    <p align="center">
+    <img src="https://user-images.githubusercontent.com/22424850/131506781-b4ecb77b-5ab1-4795-88da-5f547f7f7f9c.jpg" width='50%' hspace='10'/>
+    <br />
+    </p>
+
+- ### Model introduction
+
+  - YOLOv3 is a single-stage detector proposed by Joseph Redmon and Ali Farhadi. Compared with traditional object detection methods of the same accuracy, its inference is nearly twice as fast. YOLOv3 divides the input image into a grid and predicts bounding boxes for each grid cell. Its loss function has three parts: location error, confidence error and classification error. This PaddleHub Module is pretrained on COCO2017 and currently supports prediction only.
+
+## II. Installation
+
+- ### 1. Environment dependencies
+
+  - paddlepaddle >= 1.6.2
+
+  - paddlehub >= 1.6.0 | [How to install paddlehub](../../../../docs/docs_ch/get_start/installation.rst)
+
+- ### 2. Installation
+
+  - ```shell
+    $ hub install yolov3_darknet53_coco2017
+    ```
+  - If you run into problems during installation, see: [Windows quick start](../../../../docs/docs_ch/get_start/windows_quickstart.md) | [Linux quick start](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [macOS quick start](../../../../docs/docs_ch/get_start/mac_quickstart.md)
+
+## III. Module API Prediction
+
+- ### 1. Command-line prediction
+
+  - ```shell
+    $ hub run yolov3_darknet53_coco2017 --input_path "/PATH/TO/IMAGE"
+    ```
+  - This invokes the object detection model from the command line; for more, see [PaddleHub command-line instructions](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
+
+- ### 2. Code example
+
+  - ```python
+    import paddlehub as hub
+    import cv2
+
+    object_detector = hub.Module(name="yolov3_darknet53_coco2017")
+    result = object_detector.object_detection(images=[cv2.imread('/PATH/TO/IMAGE')])
+    # or
+    # result = object_detector.object_detection(paths=['/PATH/TO/IMAGE'])
+    ```
+
+- ### 3. API
+
+  - ```python
+    def object_detection(paths=None,
+                         images=None,
+                         batch_size=1,
+                         use_gpu=False,
+                         output_dir='detection_result',
+                         score_thresh=0.5,
+                         visualization=True)
+    ```
+
+    - Prediction API; detects the positions of all objects in the input images.
+
+    - **Parameters**
+
+      - paths (list\[str\]): image paths; <br/>
+      - images (list\[numpy.ndarray\]): image data; each ndarray has shape \[H, W, C\] in BGR format; <br/>
+      - batch\_size (int): batch size;<br/>
+      - use\_gpu (bool): whether to use the GPU;<br/>
+      - output\_dir (str): directory where the images are saved, detection\_result by default;<br/>
+      - score\_thresh (float): confidence threshold for detections;<br/>
+      - visualization (bool): whether to save the detection results as image files.
+
+      **NOTE:** provide the data through one of the two parameters, paths or images.
+
+    - **Returns**
+
+      - res (list\[dict\]): list of detection results; each element is a dict with the fields:
+        - data (list): detection results; each element is a dict with the fields:
+          - confidence (float): detection confidence
+          - label (str): label
+          - left (int): x coordinate of the top-left corner of the bounding box
+          - top (int): y coordinate of the top-left corner of the bounding box
+          - right (int): x coordinate of the bottom-right corner of the bounding box
+          - bottom (int): y coordinate of the bottom-right corner of the bounding box
+        - save\_path (str, optional): path where the result is saved (present only when visualization=True)
+
+  - ```python
+    def save_inference_model(dirname,
+                             model_filename=None,
+                             params_filename=None,
+                             combined=True)
+    ```
+    - Saves the model to the specified path.
+
+    - **Parameters**
+
+      - dirname: name of the directory in which the model is stored; <br/>
+      - model\_filename: model file name, \_\_model\_\_ by default; <br/>
+      - params\_filename: parameter file name, \_\_params\_\_ by default (takes effect only when `combined` is True);<br/>
+      - combined: whether to save the parameters into a single file.
+
+## IV. Service Deployment
+
+- PaddleHub Serving can deploy an online object detection service.
+
+- ### Step 1: Start PaddleHub Serving
+
+  - Run the start command:
+  - ```shell
+    $ hub serving start -m yolov3_darknet53_coco2017
+    ```
+
+  - This deploys an object detection service API; the default port is 8866.
+
+  - **NOTE:** To predict on a GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it does not need to be set.
+
+- ### Step 2: Send a prediction request
+
+  - Once the server is configured, the following few lines of code send a prediction request and fetch the result.
+
+  - ```python
+    import requests
+    import json
+    import cv2
+    import base64
+
+    def cv2_to_base64(image):
+        data = cv2.imencode('.jpg', image)[1]
+        return base64.b64encode(data.tostring()).decode('utf8')
+
+    # Send the HTTP request
+    data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
+    headers = {"Content-type": "application/json"}
+    url = "http://127.0.0.1:8866/predict/yolov3_darknet53_coco2017"
+    r = requests.post(url=url, headers=headers, data=json.dumps(data))
+
+    # Print the prediction results
+    print(r.json()["results"])
+    ```
+
+## V. Release Notes
+
+* 1.0.0
+
+  Initial release
+
+* 1.1.1
+
+  Fixed a numpy data reading issue
+  - ```shell
+    $ hub install yolov3_darknet53_coco2017==1.1.1
+    ```
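The README above documents the value returned by `object_detection` as a list with one dict per image, each holding a `data` list of detections and an optional `save_path`. The sketch below shows one way to consume that structure without calling the model. The `sample` data and the `summarize_detections` helper are hypothetical; only the field names come from the README.

```python
def summarize_detections(results, min_confidence=0.5):
    """Flatten the documented result structure into (label, confidence, area) tuples."""
    summary = []
    for res in results:                  # one dict per input image
        for det in res.get('data', []):  # one dict per detected object
            if det['confidence'] < min_confidence:
                continue
            width = det['right'] - det['left']
            height = det['bottom'] - det['top']
            summary.append((det['label'], det['confidence'], width * height))
    return summary


# Hypothetical output shaped like the documented return value.
sample = [{
    'data': [
        {'label': 'dog', 'confidence': 0.93, 'left': 10, 'top': 20, 'right': 110, 'bottom': 220},
        {'label': 'cat', 'confidence': 0.41, 'left': 0, 'top': 0, 'right': 50, 'bottom': 50},
    ],
    'save_path': 'detection_result/image.jpg',
}]
summary = summarize_detections(sample)
print(summary)
```

Because `save_path` is only present when `visualization=True`, code that reads it should use `res.get('save_path')` rather than indexing.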
modules/image/object_detection/yolov3_darknet53_pedestrian/README.md
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_darknet53_pedestrian/darknet.py
View file @ c3f1e085
@@ -39,7 +39,14 @@ class DarkNet(object):
        self.class_dim = class_dim
        self.get_prediction = get_prediction

    def _conv_norm(self, input, ch_out, filter_size, stride, padding, act='leaky', name=None):
        conv = fluid.layers.conv2d(
            input=input,
            num_filters=ch_out,
@@ -51,8 +58,12 @@ class DarkNet(object):
            bias_attr=False)

        bn_name = name + ".bn"
        bn_param_attr = ParamAttr(
            regularizer=L2Decay(float(self.norm_decay)), name=bn_name + '.scale')
        bn_bias_attr = ParamAttr(
            regularizer=L2Decay(float(self.norm_decay)), name=bn_name + '.offset')

        out = fluid.layers.batch_norm(
            input=conv,
@@ -69,12 +80,36 @@ class DarkNet(object):
        return out

    def _downsample(self, input, ch_out, filter_size=3, stride=2, padding=1, name=None):
        return self._conv_norm(
            input, ch_out=ch_out, filter_size=filter_size, stride=stride, padding=padding, name=name)

    def basicblock(self, input, ch_out, name=None):
        conv1 = self._conv_norm(input, ch_out=ch_out, filter_size=1, stride=1, padding=0, name=name + ".0")
        conv2 = self._conv_norm(conv1, ch_out=ch_out * 2, filter_size=3, stride=1, padding=1, name=name + ".1")
        out = fluid.layers.elementwise_add(x=input, y=conv2, act=None)
        return out
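`_downsample` defaults to a 3x3 convolution with stride 2 and padding 1, and `basicblock` adds its input to a branch that ends with `ch_out * 2` channels. The sketch below works through the resulting shape arithmetic under the standard floor formula for convolution output size. The helper names and the 416-pixel input are hypothetical and do not come from the diff.

```python
def conv_out_size(size, filter_size, stride, padding):
    """Spatial output size of a convolution (standard floor formula)."""
    return (size + 2 * padding - filter_size) // stride + 1


def downsample_size(size, filter_size=3, stride=2, padding=1):
    """Same defaults as _downsample: 3x3 conv, stride 2, padding 1."""
    return conv_out_size(size, filter_size, stride, padding)


def basicblock_channels(in_channels, ch_out):
    """Channel bookkeeping for basicblock: 1x1 conv -> ch_out, 3x3 conv -> ch_out * 2."""
    out_channels = ch_out * 2
    if out_channels != in_channels:
        # elementwise_add(x=input, y=conv2) needs matching channel counts
        raise ValueError("residual add needs in_channels == ch_out * 2")
    return out_channels


size = 416  # hypothetical input resolution
sizes = []
for _ in range(5):
    size = downsample_size(size)
    sizes.append(size)
print(sizes)
print(basicblock_channels(in_channels=64, ch_out=32))
```

Each downsample halves the side length, so five of them give a total stride of 32, and the residual add only type-checks when the block input already has `ch_out * 2` channels.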
@@ -94,9 +129,16 @@ class DarkNet(object):
        stages, block_func = self.depth_cfg[self.depth]
        stages = stages[0:5]
        conv = self._conv_norm(
            input=input, ch_out=32, filter_size=3, stride=1, padding=1, name=self.prefix_name + "yolo_input")
        downsample_ = self._downsample(
            input=conv, ch_out=conv.shape[1] * 2, name=self.prefix_name + "yolo_input.downsample")
        blocks = []
        for i, stage in enumerate(stages):
            block = self.layer_warp(
@@ -108,14 +150,19 @@ class DarkNet(object):
            blocks.append(block)
            if i < len(stages) - 1:  # do not downsample in the last stage
                downsample_ = self._downsample(
                    input=block, ch_out=block.shape[1] * 2, name=self.prefix_name + "stage.{}.downsample".format(i))
        if self.get_prediction:
            pool = fluid.layers.pool2d(input=block, pool_type='avg', global_pooling=True)
            stdv = 1.0 / math.sqrt(pool.shape[1] * 1.0)
            out = fluid.layers.fc(
                input=pool,
                size=self.class_dim,
                param_attr=ParamAttr(initializer=fluid.initializer.Uniform(-stdv, stdv), name='fc_weights'),
                bias_attr=ParamAttr(name='fc_offset'))
            out = fluid.layers.softmax(out)
            return out
...
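When `get_prediction` is set, the classification head above initialises the `fc` weights uniformly in `[-stdv, stdv]` with `stdv = 1.0 / math.sqrt(pool.shape[1] * 1.0)`, where `pool.shape[1]` is the number of pooled channels. The sketch below reproduces that bound with the standard library only. The fan-in of 1024, the seed and the helper names are hypothetical.

```python
import math
import random


def fc_uniform_bound(fan_in):
    """Bound used in the diff: 1 / sqrt(number of input channels)."""
    return 1.0 / math.sqrt(fan_in * 1.0)


def init_fc_weights(fan_in, fan_out, seed=0):
    """Draw a fan_in x fan_out weight matrix uniformly from [-stdv, stdv]."""
    rng = random.Random(seed)
    stdv = fc_uniform_bound(fan_in)
    return [[rng.uniform(-stdv, stdv) for _ in range(fan_out)] for _ in range(fan_in)]


bound = fc_uniform_bound(1024)  # hypothetical channel count after global pooling
weights = init_fc_weights(fan_in=1024, fan_out=4)
print(bound, len(weights), len(weights[0]))
```

Shrinking the bound with the square root of the fan-in keeps the variance of the pre-softmax logits roughly independent of how many channels feed the layer.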
modules/image/object_detection/yolov3_darknet53_pedestrian/data_feed.py
View file @ c3f1e085
@@ -26,7 +26,8 @@ def reader(paths=[], images=None):
    if paths:
        assert type(paths) is list, "type(paths) is not list."
        for img_path in paths:
            assert os.path.isfile(
                img_path), "The {} isn't a valid file path.".format(img_path)
            img = cv2.imread(img_path).astype('float32')
            img_list.append(img)
    if images is not None:
@@ -50,7 +51,8 @@ def reader(paths=[], images=None):
        im_scale_x = float(target_size) / float(im_shape[1])
        im_scale_y = float(target_size) / float(im_shape[0])
        im = cv2.resize(
            im, None, None, fx=im_scale_x, fy=im_scale_y, interpolation=2)
        # normalize image
        mean = [0.485, 0.456, 0.406]
...
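The `reader` hunk above resizes every image to a square of side `target_size` by scaling each axis independently and then starts a per-channel normalisation with `mean = [0.485, 0.456, 0.406]`. The sketch below covers both steps for a single pixel without OpenCV. The standard deviation values and the division by 255 are assumptions borrowed from the common ImageNet convention, since those lines are cut off in this diff. The image shape and `target_size` of 608 are hypothetical.

```python
def resize_scales(im_shape, target_size):
    """Per-axis scale factors, as computed in reader (im_shape is H, W, C)."""
    im_scale_x = float(target_size) / float(im_shape[1])
    im_scale_y = float(target_size) / float(im_shape[0])
    return im_scale_x, im_scale_y


def normalize_pixel(pixel, mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)):
    """Scale to [0, 1], subtract the mean, divide by std (std and /255 are assumed)."""
    return [((c / 255.0) - m) / s for c, m, s in zip(pixel, mean, std)]


scale_x, scale_y = resize_scales((480, 640, 3), 608)
print(scale_x, scale_y)
print(normalize_pixel([255.0, 255.0, 255.0]))
```

Because the two factors differ whenever the image is not square, this resize does not preserve the aspect ratio.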
modules/image/object_detection/yolov3_darknet53_pedestrian/module.py
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_darknet53_pedestrian/processor.py
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_darknet53_pedestrian/yolo_head.py
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_darknet53_vehicles/README.md
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_darknet53_vehicles/darknet.py
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_darknet53_vehicles/data_feed.py
View file @ c3f1e085
@@ -26,7 +26,8 @@ def reader(paths=[], images=None):
    if paths:
        assert type(paths) is list, "type(paths) is not list."
        for img_path in paths:
            assert os.path.isfile(
                img_path), "The {} isn't a valid file path.".format(img_path)
            img = cv2.imread(img_path).astype('float32')
            img_list.append(img)
    if images is not None:
@@ -50,7 +51,8 @@ def reader(paths=[], images=None):
        im_scale_x = float(target_size) / float(im_shape[1])
        im_scale_y = float(target_size) / float(im_shape[0])
        im = cv2.resize(
            im, None, None, fx=im_scale_x, fy=im_scale_y, interpolation=2)
        # normalize image
        mean = [0.485, 0.456, 0.406]
...
modules/image/object_detection/yolov3_darknet53_vehicles/module.py
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_darknet53_vehicles/processor.py
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_darknet53_vehicles/yolo_head.py
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_darknet53_venus/README.md
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_mobilenet_v1_coco2017/README.md
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_mobilenet_v1_coco2017/data_feed.py
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_mobilenet_v1_coco2017/mobilenet_v1.py
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_mobilenet_v1_coco2017/module.py
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_mobilenet_v1_coco2017/processor.py
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_mobilenet_v1_coco2017/yolo_head.py
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_resnet34_coco2017/README.md
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_resnet34_coco2017/data_feed.py
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_resnet34_coco2017/module.py
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_resnet34_coco2017/nonlocal_helper.py
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_resnet34_coco2017/processor.py
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_resnet34_coco2017/resnet.py
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_resnet34_coco2017/yolo_head.py
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_resnet50_vd_coco2017/README.md
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_resnet50_vd_coco2017/data_feed.py
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_resnet50_vd_coco2017/module.py
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_resnet50_vd_coco2017/nonlocal_helper.py
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_resnet50_vd_coco2017/processor.py
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_resnet50_vd_coco2017/resnet.py
View file @ c3f1e085
This diff is collapsed.
modules/image/object_detection/yolov3_resnet50_vd_coco2017/yolo_head.py
View file @ c3f1e085
This diff is collapsed.