This operation proposes RoIs according to each box with their
probability to be a foreground object and
the box can be calculated by anchors. Bbox_deltais and scores
to be an object are the output of RPN. Final proposals
could be used to train detection net.
For generating proposals, this operation performs following steps:
1. Transposes and resizes scores and bbox_deltas in size of
(H*W*A, 1) and (H*W*A, 4)
2. Calculate box locations as proposals candidates.
3. Clip boxes to image
4. Remove predicted boxes with small area.
5. Apply NMS to get final proposals as output.
Args:
scores(Tensor): A 4-D Tensor with shape [N, A, H, W] represents
the probability for each box to be an object.
N is batch size, A is number of anchors, H and W are height and
width of the feature map. The data type must be float32.
bbox_deltas(Tensor): A 4-D Tensor with shape [N, 4*A, H, W]
represents the difference between predicted box location and
anchor location. The data type must be float32.
im_shape(Tensor): A 2-D Tensor with shape [N, 2] represents H, W, the
origin image size or input size. The data type can be float32 or
float64.
anchors(Tensor): A 4-D Tensor represents the anchors with a layout
of [H, W, A, 4]. H and W are height and width of the feature map,
num_anchors is the box count of each position. Each anchor is
in (xmin, ymin, xmax, ymax) format an unnormalized. The data type must be float32.
variances(Tensor): A 4-D Tensor. The expanded variances of anchors with a layout of
[H, W, num_priors, 4]. Each variance is in
(xcenter, ycenter, w, h) format. The data type must be float32.
pre_nms_top_n(float): Number of total bboxes to be kept per
image before NMS. The data type must be float32. `6000` by default.
post_nms_top_n(float): Number of total bboxes to be kept per
image after NMS. The data type must be float32. `1000` by default.
nms_thresh(float): Threshold in NMS. The data type must be float32. `0.5` by default.
min_size(float): Remove predicted boxes with either height or
width < min_size. The data type must be float32. `0.1` by default.
eta(float): Apply in adaptive NMS, if adaptive `threshold > 0.5`,
`adaptive_threshold = adaptive_threshold * eta` in each iteration.
return_rois_num(bool): When setting True, it will return a 1D Tensor with shape [N, ] that includes Rois's
num of each image in one batch. The N is the image's num. For example, the tensor has values [4,5] that represents
the first image has 4 Rois, the second image has 5 Rois. It only used in rcnn model.
'False' by default.
name(str, optional): For detailed information, please refer
to :ref:`api_guide_Name`. Usually name is no need to set and
None by default.
Returns:
tuple:
A tuple with format ``(rpn_rois, rpn_roi_probs)``.
- **rpn_rois**: The generated RoIs. 2-D Tensor with shape ``[N, 4]`` while ``N`` is the number of RoIs. The data type is the same as ``scores``.
- **rpn_roi_probs**: The scores of generated RoIs. 2-D Tensor with shape ``[N, 1]`` while ``N`` is the number of RoIs. The data type is the same as ``scores``.