Unverified · Commit e1a0be18 authored by Mark Ma, committed by GitHub

add multi-scale test (#4377)

* add multi-scale test

* add resnet34-fpn-mstest item in model zoo
Parent 94ae6e1f
@@ -8,6 +8,7 @@
| ResNet50-vd | Faster | 1 | 1x | ---- | 37.6 | [Download](https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_vd_1x_coco.pdparams) | [Config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/faster_rcnn/faster_rcnn_r50_vd_1x_coco.yml) |
| ResNet101 | Faster | 1 | 1x | ---- | 39.0 | [Download](https://paddledet.bj.bcebos.com/models/faster_rcnn_r101_1x_coco.pdparams) | [Config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/faster_rcnn/faster_rcnn_r101_1x_coco.yml) |
| ResNet34-FPN | Faster | 1 | 1x | ---- | 37.8 | [Download](https://paddledet.bj.bcebos.com/models/faster_rcnn_r34_fpn_1x_coco.pdparams) | [Config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/faster_rcnn/faster_rcnn_r34_fpn_1x_coco.yml) |
| ResNet34-FPN-MultiScaleTest | Faster | 1 | 1x | ---- | 38.2 | - | [Config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/faster_rcnn/faster_rcnn_r34_fpn_multiscaletest_1x_coco.yml) |
| ResNet34-vd-FPN | Faster | 1 | 1x | ---- | 38.5 | [Download](https://paddledet.bj.bcebos.com/models/faster_rcnn_r34_vd_fpn_1x_coco.pdparams) | [Config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/faster_rcnn/faster_rcnn_r34_vd_fpn_1x_coco.yml) |
| ResNet50-FPN | Faster | 1 | 1x | ---- | 38.4 | [Download](https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_fpn_1x_coco.pdparams) | [Config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.yml) |
| ResNet50-FPN | Faster | 1 | 2x | ---- | 40.0 | [Download](https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_fpn_2x_coco.pdparams) | [Config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/faster_rcnn/faster_rcnn_r50_fpn_2x_coco.yml) |
......
_BASE_: [
  'faster_rcnn_r34_fpn_1x_coco.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet34_pretrained.pdparams
weights: output/faster_rcnn_r34_fpn_multiscaletest_1x_coco/model_final

EvalReader:
  sample_transforms:
  - Decode: {}
  # - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
  - MultiscaleTestResize: {origin_target_size: [800, 1333], target_size: [700, 900], use_flip: False}
  - NormalizeImage: {is_scale: true, mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225]}
  - Permute: {}

TestReader:
  sample_transforms:
  - Decode: {}
  # - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
  - MultiscaleTestResize: {origin_target_size: [800, 1333], target_size: [700, 900], use_flip: False}
  - NormalizeImage: {is_scale: true, mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225]}
  - Permute: {}
\ No newline at end of file
# Multi Scale Test Configuration
Tags: Configuration
---
```yaml
##################################### Multi scale test configuration #####################################
EvalReader:
  sample_transforms:
  - Decode: {}
  - MultiscaleTestResize: {origin_target_size: [800, 1333], target_size: [700, 900]}
  - NormalizeImage: {is_scale: true, mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225]}
  - Permute: {}

TestReader:
  sample_transforms:
  - Decode: {}
  - MultiscaleTestResize: {origin_target_size: [800, 1333], target_size: [700, 900]}
  - NormalizeImage: {is_scale: true, mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225]}
  - Permute: {}
```
---
Multi-scale test is a test-time augmentation (TTA) technique that can improve object detection performance.
The input image is resized to several scales, the model generates predictions (bounding boxes) at each scale, and the per-scale predictions are then combined into the final prediction. (Here **NMS** is used to aggregate the predictions.)
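Conceptually, the merge step concatenates the boxes predicted at every scale, runs greedy NMS separately for each class, and keeps the highest-scoring boxes. The NumPy sketch below is illustrative only (the function name is made up here; the `[class_id, score, x1, y1, x2, y2]` layout follows the detector's `bbox` output); the real logic lives in `BaseArch.merge_multi_scale_predictions` in this change:

```python
import numpy as np


def merge_scales_with_nms(per_scale_boxes, iou_threshold=0.5, keep_top_k=100):
    """Merge detections from several scales with class-wise greedy NMS.

    Each element of per_scale_boxes is an (N, 6) array laid out as
    [class_id, score, x1, y1, x2, y2].
    """
    all_boxes = np.concatenate(per_scale_boxes, axis=0)
    merged = []
    for cls in np.unique(all_boxes[:, 0]):
        dets = all_boxes[all_boxes[:, 0] == cls]
        order = dets[:, 1].argsort()[::-1]  # indices sorted by descending score
        keep = []
        while order.size > 0:
            i = order[0]
            keep.append(i)
            # IoU of the current best box against the remaining candidates
            xx1 = np.maximum(dets[i, 2], dets[order[1:], 2])
            yy1 = np.maximum(dets[i, 3], dets[order[1:], 3])
            xx2 = np.minimum(dets[i, 4], dets[order[1:], 4])
            yy2 = np.minimum(dets[i, 5], dets[order[1:], 5])
            inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
            area_i = (dets[i, 4] - dets[i, 2]) * (dets[i, 5] - dets[i, 3])
            area_rest = (dets[order[1:], 4] - dets[order[1:], 2]) * \
                        (dets[order[1:], 5] - dets[order[1:], 3])
            iou = inter / (area_i + area_rest - inter)
            # drop every remaining box that overlaps the kept box too much
            order = order[1:][iou < iou_threshold]
        merged.append(dets[keep])
    merged = np.concatenate(merged, axis=0)
    # keep only the highest-scoring keep_top_k boxes overall
    top = merged[:, 1].argsort()[::-1][:keep_top_k]
    return merged[top]
```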
## _MultiscaleTestResize_ option
The `MultiscaleTestResize` transform enables multi-scale test prediction.
`origin_target_size: [800, 1333]` means the input image is first resized so that its short edge is 800 pixels and its long edge does not exceed 1333 pixels.
`target_size: [700, 900]` specifies the additional scales to predict at.
Multi-scale test can be plugged into the evaluation or the test (inference) pipeline by adding a `MultiscaleTestResize` entry to `EvalReader.sample_transforms` or `TestReader.sample_transforms`, as in the configuration above.
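For reference, the unit test added in this change drives multi-scale evaluation and inference through `ppdet.engine.Trainer`. A minimal sketch along the same lines (config path and weights URL are copied from that test; adjust them to your setup):

```python
# Sketch of running evaluation with the multi-scale test config through
# ppdet's Trainer API, mirroring the unit test added in this change.
from ppdet.core.workspace import load_config
from ppdet.engine import Trainer

cfg = load_config(
    'configs/faster_rcnn/faster_rcnn_r34_fpn_multiscaletest_1x_coco.yml')
cfg.weights = 'https://paddledet.bj.bcebos.com/models/faster_rcnn_r34_fpn_1x_coco.pdparams'

trainer = Trainer(cfg, mode='eval')  # multi-scale test comes from EvalReader.sample_transforms
trainer.load_weights(cfg.weights)
trainer.evaluate()
```

Inference works the same way with `mode='test'` and `trainer.predict(...)`.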
---
### Note
Currently only CascadeRCNN, FasterRCNN, and MaskRCNN support multi-scale testing, and the batch size must be 1.
\ No newline at end of file
# Multi-Scale Test Configuration
Tags: Configuration
---
```yaml
##################################### Multi-scale test configuration #####################################
EvalReader:
  sample_transforms:
  - Decode: {}
  - MultiscaleTestResize: {origin_target_size: [800, 1333], target_size: [700, 900]}
  - NormalizeImage: {is_scale: true, mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225]}
  - Permute: {}

TestReader:
  sample_transforms:
  - Decode: {}
  - MultiscaleTestResize: {origin_target_size: [800, 1333], target_size: [700, 900]}
  - NormalizeImage: {is_scale: true, mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225]}
  - Permute: {}
```
---
Multi-scale test is a test-time augmentation (TTA) method that can improve object detection accuracy.
The input image is first resized to several scales, the model then predicts on each of these scales, and finally the per-scale predictions are merged into the final result. (Here **NMS** is used to merge the predictions from different scales.)
## _MultiscaleTestResize_ option
The `MultiscaleTestResize` option turns on multi-scale testing.
`origin_target_size: [800, 1333]` means the input image is first resized so that its short edge is 800 pixels and its long edge does not exceed 1333 pixels.
`target_size: [700, 900]` sets the additional prediction scales.
Multi-scale test can be enabled during evaluation or inference by adding a `MultiscaleTestResize` entry to `EvalReader.sample_transforms` or `TestReader.sample_transforms`.
---
### Note
Currently multi-scale testing only supports the CascadeRCNN, FasterRCNN, and MaskRCNN architectures, and the batch size must be 1.
\ No newline at end of file
@@ -16,6 +16,8 @@ from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import typing

try:
    from collections.abc import Sequence
except Exception:
@@ -69,7 +71,13 @@ class PadBatch(BaseOperator):
        """
        coarsest_stride = self.pad_to_stride

        # multi scale input is nested list
        if isinstance(samples, typing.Sequence) and len(samples) > 0 and isinstance(samples[0], typing.Sequence):
            inner_samples = samples[0]
        else:
            inner_samples = samples

        max_shape = np.array([data['image'].shape for data in inner_samples]).max(
            axis=0)
        if coarsest_stride > 0:
            max_shape[1] = int(
@@ -77,7 +85,7 @@ class PadBatch(BaseOperator):
            max_shape[2] = int(
                np.ceil(max_shape[2] / coarsest_stride) * coarsest_stride)

        for data in inner_samples:
            im = data['image']
            im_c, im_h, im_w = im.shape[:]
            padding_im = np.zeros(
......
@@ -22,6 +22,7 @@ import copy
import time
import numpy as np
import typing
from PIL import Image, ImageOps
import paddle
@@ -473,6 +474,10 @@ class Trainer(object):
            for metric in self._metrics:
                metric.update(data, outs)

            # multi-scale inputs: all inputs have same im_id
            if isinstance(data, typing.Sequence):
                sample_num += data[0]['im_id'].numpy().shape[0]
            else:
                sample_num += data['im_id'].numpy().shape[0]
            self._compose_callback.on_step_end(self.status)
@@ -517,6 +522,9 @@ class Trainer(object):
                outs = self.model(data)

            for key in ['im_shape', 'scale_factor', 'im_id']:
                if isinstance(data, typing.Sequence):
                    outs[key] = data[0][key]
                else:
                    outs[key] = data[key]
            for key, value in outs.items():
                if hasattr(value, 'numpy'):
......
@@ -21,6 +21,7 @@ import sys
import json
import paddle
import numpy as np
import typing

from .map_utils import prune_zero_padding, DetectionMAP
from .coco_utils import get_infer_results, cocoapi_eval
@@ -98,6 +99,10 @@ class COCOMetric(Metric):
        for k, v in outputs.items():
            outs[k] = v.numpy() if isinstance(v, paddle.Tensor) else v

        # multi-scale inputs: all inputs have same im_id
        if isinstance(inputs, typing.Sequence):
            im_id = inputs[0]['im_id']
        else:
            im_id = inputs['im_id']
        outs['im_id'] = im_id.numpy() if isinstance(im_id, paddle.Tensor) else im_id
......
@@ -2,9 +2,13 @@ from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import paddle
import paddle.nn as nn
import typing

from ppdet.core.workspace import register
from ppdet.modeling.post_process import nms

__all__ = ['BaseArch']
@@ -53,7 +57,53 @@ class BaseArch(nn.Layer):
        if self.training:
            out = self.get_loss()
        else:
            inputs_list = []
            # multi-scale input: a Sequence of input dicts, one per scale
            if not isinstance(inputs, typing.Sequence):
                inputs_list.append(inputs)
            else:
                inputs_list.extend(inputs)

            outs = []
            for inp in inputs_list:
                self.inputs = inp
                outs.append(self.get_pred())

            # multi-scale test: merge the per-scale predictions
            if len(outs) > 1:
                out = self.merge_multi_scale_predictions(outs)
            else:
                out = outs[0]
        return out

    def merge_multi_scale_predictions(self, outs):
        # default values for architectures not included in the following list
        num_classes = 80
        nms_threshold = 0.5
        keep_top_k = 100

        if self.__class__.__name__ in ('CascadeRCNN', 'FasterRCNN', 'MaskRCNN'):
            num_classes = self.bbox_head.num_classes
            keep_top_k = self.bbox_post_process.nms.keep_top_k
            nms_threshold = self.bbox_post_process.nms.nms_threshold
        else:
            raise Exception(
                "Multi scale test only supports CascadeRCNN, FasterRCNN and MaskRCNN for now")

        final_boxes = []
        all_scale_outs = paddle.concat([o['bbox'] for o in outs]).numpy()
        for c in range(num_classes):
            idxs = all_scale_outs[:, 0] == c
            if np.count_nonzero(idxs) == 0:
                continue
            # drop the class column, then apply greedy NMS within this class
            r = nms(all_scale_outs[idxs, 1:], nms_threshold)
            final_boxes.append(np.concatenate([np.full((r.shape[0], 1), c), r], 1))
        out = np.concatenate(final_boxes)
        # keep the keep_top_k highest-scoring boxes across all classes
        out = np.concatenate(sorted(out, key=lambda e: e[1])[-keep_top_k:]).reshape((-1, 6))
        out = {
            'bbox': paddle.to_tensor(out),
            'bbox_num': paddle.to_tensor(np.array([out.shape[0], ]))
        }
        return out

    def build_inputs(self, data, input_def):
......
@@ -654,3 +654,59 @@ class SparsePostProcess(object):
        bbox_pred = paddle.concat(boxes_final)

        return bbox_pred, bbox_num


def nms(dets, thresh):
    """Apply classic DPM-style greedy NMS.

    `dets` is an (N, 5) array laid out as [score, x1, y1, x2, y2]; any box
    whose IoU with a higher-scoring box is >= `thresh` is suppressed.
    """
    if dets.shape[0] == 0:
        return dets[[], :]
    scores = dets[:, 0]
    x1 = dets[:, 1]
    y1 = dets[:, 2]
    x2 = dets[:, 3]
    y2 = dets[:, 4]

    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]

    ndets = dets.shape[0]
    suppressed = np.zeros((ndets), dtype=int)

    # Walk boxes in descending score order (index i); every remaining
    # lower-scoring box j that overlaps box i by at least `thresh` IoU
    # is marked as suppressed.
    for _i in range(ndets):
        i = order[_i]
        if suppressed[i] == 1:
            continue
        ix1 = x1[i]
        iy1 = y1[i]
        ix2 = x2[i]
        iy2 = y2[i]
        iarea = areas[i]
        for _j in range(_i + 1, ndets):
            j = order[_j]
            if suppressed[j] == 1:
                continue
            xx1 = max(ix1, x1[j])
            yy1 = max(iy1, y1[j])
            xx2 = min(ix2, x2[j])
            yy2 = min(iy2, y2[j])
            w = max(0.0, xx2 - xx1 + 1)
            h = max(0.0, yy2 - yy1 + 1)
            inter = w * h
            ovr = inter / (iarea + areas[j] - inter)
            if ovr >= thresh:
                suppressed[j] = 1

    keep = np.where(suppressed == 0)[0]
    dets = dets[keep, :]
    return dets
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import unittest
from ppdet.core.workspace import load_config
from ppdet.engine import Trainer

class TestMultiScaleInference(unittest.TestCase):
    def setUp(self):
        self.set_config()

    def set_config(self):
        self.mstest_cfg_file = 'configs/faster_rcnn/faster_rcnn_r34_fpn_multiscaletest_1x_coco.yml'

    # test evaluation with multi scale test
    def test_eval_mstest(self):
        cfg = load_config(self.mstest_cfg_file)
        trainer = Trainer(cfg, mode='eval')

        cfg.weights = 'https://paddledet.bj.bcebos.com/models/faster_rcnn_r34_fpn_1x_coco.pdparams'
        trainer.load_weights(cfg.weights)

        trainer.evaluate()

    # test inference with multi scale test
    def test_infer_mstest(self):
        cfg = load_config(self.mstest_cfg_file)
        trainer = Trainer(cfg, mode='test')

        cfg.weights = 'https://paddledet.bj.bcebos.com/models/faster_rcnn_r34_fpn_1x_coco.pdparams'
        trainer.load_weights(cfg.weights)
        tests_img_root = os.path.join(os.path.dirname(__file__), 'imgs')

        # input images to predict
        imgs = ['coco2017_val2017_000000000139.jpg',
                'coco2017_val2017_000000000724.jpg']
        imgs = [os.path.join(tests_img_root, img) for img in imgs]
        trainer.predict(imgs,
                        draw_threshold=0.5,
                        output_dir='output',
                        save_txt=True)


if __name__ == '__main__':
    unittest.main()