Unverified commit b1e56994, authored by Kaipeng Deng, committed by GitHub

Merge pull request #15901 from heavengate/release/1.3

cherry-pick pool/adaptive_pool/yolov3_loss doc fix
@@ -144,34 +144,40 @@ class Yolov3LossOpMaker : public framework::OpProtoAndCheckerMaker {
  "The ignore threshold to ignore confidence loss.")
  .SetDefault(0.7);
  AddComment(R"DOC(
- This operator generate yolov3 loss by given predict result and ground
+ This operator generates yolov3 loss based on given predict result and ground
  truth boxes.
  The output of previous network is in shape [N, C, H, W], while H and W
- should be the same, specify the grid size, each grid point predict given
- number boxes, this given number is specified by anchors, it should be
- half anchors length, which following will be represented as S. In the
- second dimention(the channel dimention), C should be S * (class_num + 5),
- class_num is the box categoriy number of source dataset(such as coco),
- so in the second dimention, stores 4 box location coordinates x, y, w, h
- and confidence score of the box and class one-hot key of each anchor box.
- While the 4 location coordinates if $$tx, ty, tw, th$$, the box predictions
- correspnd to:
+ should be the same, H and W specify the grid size, each grid point predict
+ given number boxes, this given number, which following will be represented as S,
+ is specified by the number of anchors, In the second dimension(the channel
+ dimension), C should be equal to S * (class_num + 5), class_num is the object
+ category number of source dataset(such as 80 in coco dataset), so in the
+ second(channel) dimension, apart from 4 box location coordinates x, y, w, h,
+ also includes confidence score of the box and class one-hot key of each anchor box.
+ Assume the 4 location coordinates are :math:`t_x, t_y, t_w, t_h`, the box predictions
+ should be as follows:
  $$
- b_x = \sigma(t_x) + c_x
- b_y = \sigma(t_y) + c_y
+ b_x = \\sigma(t_x) + c_x
+ $$
+ $$
+ b_y = \\sigma(t_y) + c_y
+ $$
+ $$
  b_w = p_w e^{t_w}
+ $$
+ $$
  b_h = p_h e^{t_h}
  $$
- While $$c_x, c_y$$ is the left top corner of current grid and $$p_w, p_h$$
- is specified by anchors.
+ In the equation above, :math:`c_x, c_y` is the left top corner of current grid
+ and :math:`p_w, p_h` is specified by anchors.
  As for confidence score, it is the logistic regression value of IoU between
  anchor boxes and ground truth boxes, the score of the anchor box which has
- the max IoU should be 1, and if the anchor box has IoU bigger then ignore
+ the max IoU should be 1, and if the anchor box has IoU bigger than ignore
  thresh, the confidence score loss of this anchor box will be ignored.
  Therefore, the yolov3 loss consist of three major parts, box location loss,
@@ -186,13 +192,13 @@ class Yolov3LossOpMaker : public framework::OpProtoAndCheckerMaker {
  In order to trade off box coordinate losses between big boxes and small
  boxes, box coordinate losses will be mutiplied by scale weight, which is
- calculated as follow.
+ calculated as follows.
  $$
  weight_{box} = 2.0 - t_w * t_h
  $$
- Final loss will be represented as follow.
+ Final loss will be represented as follows.
  $$
  loss = (loss_{xy} + loss_{wh}) * weight_{box}
......
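The box decoding and scale-weight formulas in the yolov3_loss docstring above can be sketched in plain Python. This is only an illustration of the documented equations, not the operator's implementation; the function names `decode_box` and `box_scale_weight` and the sample values are assumptions made for this example.

```python
import math


def sigmoid(v):
    """Logistic function, written as sigma in the docstring."""
    return 1.0 / (1.0 + math.exp(-v))


def decode_box(t_x, t_y, t_w, t_h, c_x, c_y, p_w, p_h):
    """Apply the four docstring equations to one raw prediction.

    (c_x, c_y) is the top-left corner of the grid cell and
    (p_w, p_h) is the anchor size. Names are hypothetical.
    """
    b_x = sigmoid(t_x) + c_x
    b_y = sigmoid(t_y) + c_y
    b_w = p_w * math.exp(t_w)
    b_h = p_h * math.exp(t_h)
    return b_x, b_y, b_w, b_h


def box_scale_weight(t_w, t_h):
    """weight_box = 2.0 - t_w * t_h, as written in the docstring."""
    return 2.0 - t_w * t_h


# Zero offsets place the box centre in the middle of cell (3, 4)
# and leave the anchor size (10, 13) unchanged.
print(decode_box(0.0, 0.0, 0.0, 0.0, c_x=3, c_y=4, p_w=10, p_h=13))
print(box_scale_weight(0.5, 0.5))
```

With zero offsets the sigmoid contributes 0.5 to each centre coordinate, so the sketch returns the cell centre and the unscaled anchor size.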
@@ -168,9 +168,10 @@ void Pool2dOpMaker::Make() {
  "be ignored."); // TODO(Chengduo): Add checker.
  // (Currently,
  // TypedAttrChecker don't support vector type.)
- AddAttr<bool>("global_pooling",
+ AddAttr<bool>(
+ "global_pooling",
  "(bool, default false) Whether to use the global pooling. "
- "If global_pooling = true, ksize and paddings will be ignored.")
+ "If global_pooling = true, kernel size and paddings will be ignored.")
  .SetDefault(false);
  AddAttr<std::vector<int>>("strides",
  "(vector<int>, default {1, 1}), strides(height, "
@@ -182,7 +183,7 @@ void Pool2dOpMaker::Make() {
  "paddings",
  "(vector<int>, default {0,0}), paddings(height, width) of pooling "
  "operator."
- "If global_pooling = true, paddings and ksize will be ignored.")
+ "If global_pooling = true, paddings and kernel size will be ignored.")
  .SetDefault({0, 0});
  AddAttr<bool>(
  "exclusive",
@@ -204,7 +205,7 @@ void Pool2dOpMaker::Make() {
  .SetDefault(false);
  AddAttr<bool>(
  "ceil_mode",
- "(bool, default false) Wether to use the ceil function to calculate "
+ "(bool, default false) Whether to use the ceil function to calculate "
  "output height and width. False is the default. If it is set to False, "
  "the floor function will be used.")
  .SetDefault(false);
@@ -259,31 +260,40 @@ Example:
  W_{out} = \\frac{(W_{in} - ksize[1] + 2 * paddings[1] + strides[1] - 1)}{strides[1]} + 1
  $$
- For exclusive = true:
+ For exclusive = false:
  $$
  hstart = i * strides[0] - paddings[0]
+ $$
+ $$
  hend = hstart + ksize[0]
+ $$
+ $$
  wstart = j * strides[1] - paddings[1]
+ $$
+ $$
  wend = wstart + ksize[1]
+ $$
+ $$
  Output(i ,j) = \\frac{sum(Input[hstart:hend, wstart:wend])}{ksize[0] * ksize[1]}
  $$
- For exclusive = false:
+ For exclusive = true:
  $$
  hstart = max(0, i * strides[0] - paddings[0])
+ $$
+ $$
  hend = min(H, hstart + ksize[0])
+ $$
+ $$
  wstart = max(0, j * strides[1] - paddings[1])
+ $$
+ $$
  wend = min(W, wstart + ksize[1])
- Output(i ,j) = \\frac{sum(Input[hstart:hend, wstart:wend])}{(hend - hstart) * (wend - wstart)}
  $$
- For adaptive = true:
  $$
- hstart = floor(i * H_{in} / H_{out})
- hend = ceil((i + 1) * H_{in} / H_{out})
- wstart = floor(j * W_{in} / W_{out})
- wend = ceil((j + 1) * W_{in} / W_{out})
  Output(i ,j) = \\frac{sum(Input[hstart:hend, wstart:wend])}{(hend - hstart) * (wend - wstart)}
  $$
  )DOC");
  }
@@ -324,7 +334,7 @@ void Pool3dOpMaker::Make() {
  AddAttr<bool>(
  "global_pooling",
  "(bool, default false) Whether to use the global pooling. "
- "If global_pooling = true, ksize and paddings wille be ignored.")
+ "If global_pooling = true, kernel size and paddings will be ignored.")
  .SetDefault(false);
  AddAttr<std::vector<int>>(
  "strides",
@@ -359,7 +369,7 @@ void Pool3dOpMaker::Make() {
  .SetDefault(false);
  AddAttr<bool>(
  "ceil_mode",
- "(bool, default false) Wether to use the ceil function to calculate "
+ "(bool, default false) Whether to use the ceil function to calculate "
  "output height and width. False is the default. If it is set to False, "
  "the floor function will be used.")
  .SetDefault(false);
@@ -393,45 +403,65 @@ Example:
  Out shape: $(N, C, D_{out}, H_{out}, W_{out})$
  For ceil_mode = false:
  $$
- D_{out} = \frac{(D_{in} - ksize[0] + 2 * paddings[0])}{strides[0]} + 1 \\
- H_{out} = \frac{(H_{in} - ksize[1] + 2 * paddings[1])}{strides[1]} + 1 \\
- W_{out} = \frac{(W_{in} - ksize[2] + 2 * paddings[2])}{strides[2]} + 1
+ D_{out} = \\frac{(D_{in} - ksize[0] + 2 * paddings[0])}{strides[0]} + 1
+ $$
+ $$
+ H_{out} = \\frac{(H_{in} - ksize[1] + 2 * paddings[1])}{strides[2]} + 1
+ $$
+ $$
+ W_{out} = \\frac{(W_{in} - ksize[2] + 2 * paddings[2])}{strides[2]} + 1
  $$
  For ceil_mode = true:
  $$
- D_{out} = \frac{(D_{in} - ksize[0] + 2 * paddings[0] + strides[0] -1)}{strides[0]} + 1 \\
- H_{out} = \frac{(H_{in} - ksize[1] + 2 * paddings[1] + strides[1] -1)}{strides[1]} + 1 \\
- W_{out} = \frac{(W_{in} - ksize[2] + 2 * paddings[2] + strides[2] -1)}{strides[2]} + 1
+ D_{out} = \\frac{(D_{in} - ksize[0] + 2 * paddings[0] + strides[0] -1)}{strides[0]} + 1
  $$
- For exclusive = true:
+ $$
+ H_{out} = \\frac{(H_{in} - ksize[1] + 2 * paddings[1] + strides[1] -1)}{strides[1]} + 1
+ $$
+ $$
+ W_{out} = \\frac{(W_{in} - ksize[2] + 2 * paddings[2] + strides[2] -1)}{strides[2]} + 1
+ $$
+ For exclusive = false:
  $$
  dstart = i * strides[0] - paddings[0]
+ $$
+ $$
  dend = dstart + ksize[0]
+ $$
+ $$
  hstart = j * strides[1] - paddings[1]
+ $$
+ $$
  hend = hstart + ksize[1]
+ $$
+ $$
  wstart = k * strides[2] - paddings[2]
+ $$
+ $$
  wend = wstart + ksize[2]
+ $$
+ $$
  Output(i ,j, k) = \\frac{sum(Input[dstart:dend, hstart:hend, wstart:wend])}{ksize[0] * ksize[1] * ksize[2]}
  $$
- For exclusive = false:
+ For exclusive = true:
  $$
  dstart = max(0, i * strides[0] - paddings[0])
+ $$
+ $$
  dend = min(D, dstart + ksize[0])
- hstart = max(0, j * strides[1] - paddings[1])
+ $$
+ $$
  hend = min(H, hstart + ksize[1])
+ $$
+ $$
  wstart = max(0, k * strides[2] - paddings[2])
+ $$
+ $$
  wend = min(W, wstart + ksize[2])
- Output(i ,j, k) = \\frac{sum(Input[dstart:dend, hstart:hend, wstart:wend])}{(dend - dstart) * (hend - hstart) * (wend - wstart)}
  $$
- For adaptive = true:
  $$
- dstart = floor(i * D_{in} / D_{out})
- dend = ceil((i + 1) * D_{in} / D_{out})
- hstart = floor(j * H_{in} / H_{out})
- hend = ceil((j + 1) * H_{in} / H_{out})
- wstart = floor(k * W_{in} / W_{out})
- wend = ceil((k + 1) * W_{in} / W_{out})
  Output(i ,j, k) = \\frac{sum(Input[dstart:dend, hstart:hend, wstart:wend])}{(dend - dstart) * (hend - hstart) * (wend - wstart)}
  $$
......
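To make the difference between `exclusive = false` and `exclusive = true` in the pooling docstrings above concrete, here is a minimal one-dimensional sketch in plain Python. It is not the Paddle kernel: the helper names `pool_output_size` and `avg_pool_1d` are hypothetical, and the sketch assumes the window end is derived from the unclipped window start before both ends are clamped to the input, with padded positions contributing zero to the sum.

```python
def pool_output_size(in_size, ksize, padding, stride, ceil_mode=False):
    """Output length along one axis, following the docstring formulas.

    With ceil_mode the numerator gains (stride - 1) before the
    integer division, which rounds the quotient up.
    """
    extra = stride - 1 if ceil_mode else 0
    return (in_size - ksize + 2 * padding + extra) // stride + 1


def avg_pool_1d(values, ksize, stride, padding, exclusive):
    """Average pooling over a 1-D list (illustrative helper).

    exclusive=False divides by the full kernel size, so padded
    positions count as zeros; exclusive=True divides only by the
    number of real input elements inside the window.
    """
    n = len(values)
    out = []
    for i in range(pool_output_size(n, ksize, padding, stride)):
        start = i * stride - padding
        end = start + ksize
        lo, hi = max(0, start), min(n, end)  # clamp to the input
        total = sum(values[lo:hi])
        divisor = (hi - lo) if exclusive else ksize
        out.append(total / divisor)
    return out


data = [1.0, 2.0, 3.0, 4.0]
print(avg_pool_1d(data, ksize=2, stride=2, padding=1, exclusive=False))
print(avg_pool_1d(data, ksize=2, stride=2, padding=1, exclusive=True))
```

The two modes only differ at the borders, where part of the window falls on padding; interior windows give the same average either way.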
@@ -551,9 +551,10 @@ def yolov3_loss(x,
  gtbox = fluid.layers.data(name='gtbox', shape=[6, 5], dtype='float32')
  gtlabel = fluid.layers.data(name='gtlabel', shape=[6, 1], dtype='int32')
  anchors = [10, 13, 16, 30, 33, 23, 30, 61, 62, 45, 59, 119, 116, 90, 156, 198, 373, 326]
- anchors = [0, 1, 2]
- loss = fluid.layers.yolov3_loss(x=x, gtbox=gtbox, class_num=80, anchors=anchors,
- ignore_thresh=0.5, downsample_ratio=32)
+ anchor_mask = [0, 1, 2]
+ loss = fluid.layers.yolov3_loss(x=x, gtbox=gtbox, gtlabel=gtlabel, anchors=anchors,
+ anchor_mask=anchor_mask, class_num=80,
+ ignore_thresh=0.7, downsample_ratio=32)
  """
  helper = LayerHelper('yolov3_loss', **locals())
......
@@ -2441,7 +2441,7 @@ def pool2d(input,
  data = fluid.layers.data(
  name='data', shape=[3, 32, 32], dtype='float32')
- conv2d = fluid.layers.pool2d(
+ pool2d = fluid.layers.pool2d(
  input=data,
  pool_size=2,
  pool_type='max',
@@ -2490,6 +2490,7 @@ def pool2d(input,
  return pool_out
+ @templatedoc()
  def pool3d(input,
  pool_size=-1,
  pool_type="max",
@@ -2501,13 +2502,19 @@ def pool3d(input,
  name=None,
  exclusive=True):
  """
- This function adds the operator for pooling in 3-dimensions, using the
- pooling configurations mentioned in input parameters.
+ ${comment}
  Args:
- input (Variable): ${input_comment}
- pool_size (int): ${ksize_comment}
- pool_type (str): ${pooling_type_comment}
+ input (Variable): The input tensor of pooling operator. The format of
+ input tensor is NCDHW, where N is batch size, C is
+ the number of channels, D is the depth of the feature,
+ H is the height of the feature, and W is the width
+ of the feature.
+ pool_size (int|list|tuple): The pool kernel size. If pool kernel size
+ is a tuple or list, it must contain three integers,
+ (pool_size_Depth, pool_size_Height, pool_size_Width).
+ Otherwise, the pool kernel size will be the cube of an int.
+ pool_type (string): ${pooling_type_comment}
  pool_stride (int): stride of the pooling layer.
  pool_padding (int): padding size.
  global_pooling (bool): ${global_pooling_comment}
@@ -2520,6 +2527,19 @@ def pool3d(input,
  Returns:
  Variable: output of pool3d layer.
+ Examples:
+ .. code-block:: python
+ data = fluid.layers.data(
+ name='data', shape=[3, 32, 32, 32], dtype='float32')
+ pool3d = fluid.layers.pool3d(
+ input=data,
+ pool_size=2,
+ pool_type='max',
+ pool_stride=1,
+ global_pooling=False)
  """
  if pool_type not in ["max", "avg"]:
  raise ValueError(
@@ -2569,7 +2589,27 @@ def adaptive_pool2d(input,
  require_index=False,
  name=None):
  """
- ${comment}
+ **Adaptive Pool2d Operator**
+ The adaptive_pool2d operation calculates the output based on the input, pool_size,
+ pool_type parameters. Input(X) and output(Out) are in NCHW format, where N is batch
+ size, C is the number of channels, H is the height of the feature, and W is
+ the width of the feature. Parameters(pool_size) should contain two elements which
+ represent height and width, respectively. Also the H and W dimensions of output(Out)
+ is same as Parameter(pool_size).
+ For average adaptive pool2d:
+ .. math::
+ hstart &= floor(i * H_{in} / H_{out})
+ hend &= ceil((i + 1) * H_{in} / H_{out})
+ wstart &= floor(j * W_{in} / W_{out})
+ wend &= ceil((j + 1) * W_{in} / W_{out})
+ Output(i ,j) &= \\frac{sum(Input[hstart:hend, wstart:wend])}{(hend - hstart) * (wend - wstart)}
  Args:
  input (Variable): The input tensor of pooling operator. The format of
@@ -2579,8 +2619,8 @@ def adaptive_pool2d(input,
  pool_size (int|list|tuple): The pool kernel size. If pool kernel size is a tuple or list,
  it must contain two integers, (pool_size_Height, pool_size_Width).
  pool_type: ${pooling_type_comment}
- require_index (bool): If true, the index of max pooling point along with outputs.
- it cannot be set in average pooling type.
+ require_index (bool): If true, the index of max pooling point will be returned along
+ with outputs. It cannot be set in average pooling type.
  name (str|None): A name for this layer(optional). If set None, the
  layer will be named automatically.
@@ -2661,18 +2701,42 @@ def adaptive_pool3d(input,
  require_index=False,
  name=None):
  """
- ${comment}
+ **Adaptive Pool3d Operator**
+ The adaptive_pool3d operation calculates the output based on the input, pool_size,
+ pool_type parameters. Input(X) and output(Out) are in NCDHW format, where N is batch
+ size, C is the number of channels, D is the depth of the feature, H is the height of
+ the feature, and W is the width of the feature. Parameters(pool_size) should contain
+ three elements which represent height and width, respectively. Also the D, H and W
+ dimensions of output(Out) is same as Parameter(pool_size).
+ For average adaptive pool3d:
+ .. math::
+ dstart &= floor(i * D_{in} / D_{out})
+ dend &= ceil((i + 1) * D_{in} / D_{out})
+ hstart &= floor(j * H_{in} / H_{out})
+ hend &= ceil((j + 1) * H_{in} / H_{out})
+ wstart &= floor(k * W_{in} / W_{out})
+ wend &= ceil((k + 1) * W_{in} / W_{out})
+ Output(i ,j, k) &= \\frac{sum(Input[dstart:dend, hstart:hend, wstart:wend])}{(dend - dstart) * (hend - hstart) * (wend - wstart)}
  Args:
  input (Variable): The input tensor of pooling operator. The format of
- input tensor is NCHW, where N is batch size, C is
- the number of channels, H is the height of the
- feature, and W is the width of the feature.
+ input tensor is NCDHW, where N is batch size, C is
+ the number of channels, D is the depth of the feature,
+ H is the height of the feature, and W is the width of the feature.
  pool_size (int|list|tuple): The pool kernel size. If pool kernel size is a tuple or list,
- it must contain two integers, (Depth, Height, Width).
+ it must contain three integers, (Depth, Height, Width).
  pool_type: ${pooling_type_comment}
- require_index (bool): If true, the index of max pooling point along with outputs.
- it cannot be set in average pooling type.
+ require_index (bool): If true, the index of max pooling point will be returned along
+ with outputs. It cannot be set in average pooling type.
  name (str|None): A name for this layer(optional). If set None, the
  layer will be named automatically.
@@ -2709,7 +2773,7 @@ def adaptive_pool3d(input,
  name='data', shape=[3, 32, 32], dtype='float32')
  pool_out, mask = fluid.layers.adaptive_pool3d(
  input=data,
- pool_size=[3, 3],
+ pool_size=[3, 3, 3],
  pool_type='avg')
  """
  if pool_type not in ["max", "avg"]:
......
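The adaptive pooling window formulas added to the `adaptive_pool2d` and `adaptive_pool3d` docstrings above apply the same floor/ceil rule independently along each axis. The sketch below shows that rule along a single axis in plain Python; it is an illustration under that assumption, and the helper names `adaptive_window` and `adaptive_avg_pool_1d` are hypothetical, not Paddle APIs.

```python
def adaptive_window(i, in_size, out_size):
    """Return (start, end) of output cell i along one axis.

    start = floor(i * in_size / out_size)
    end   = ceil((i + 1) * in_size / out_size)
    Integer arithmetic is used to avoid floating-point rounding.
    """
    start = (i * in_size) // out_size
    end = -((-(i + 1) * in_size) // out_size)  # ceiling division
    return start, end


def adaptive_avg_pool_1d(values, out_size):
    """Average each adaptive window of a 1-D list (illustrative)."""
    out = []
    for i in range(out_size):
        start, end = adaptive_window(i, len(values), out_size)
        window = values[start:end]
        out.append(sum(window) / (end - start))
    return out


# Five inputs pooled to three outputs: neighbouring windows overlap
# because 5 is not divisible by 3.
print([adaptive_window(i, 5, 3) for i in range(3)])
print(adaptive_avg_pool_1d([1.0, 2.0, 3.0, 4.0, 5.0], 3))
```

When the input size is a multiple of the output size the windows tile the axis without overlap, which matches ordinary pooling with kernel and stride both equal to `in_size / out_size`.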