Unverified commit 420570c9, authored by sunzhongkai588, committed by GitHub

paddle/nn/functional docs' bug fix (#34580)

* fix paddle.optimizer test=document_fix

* fix paddle.optimizer test=document_fix

* fix bugs in paddle.nn.functional document test=document_fix

* fix bugs in paddle.nn.functional document test=document_fix

* fix bugs in paddle.nn.functional document test=document_fix

* fix bugs in paddle.nn.functional document test=document_fix
Parent: d9e63a81
@@ -7097,9 +7097,9 @@ def dice_loss(input, label, epsilon=0.00001, name=None):
     .. math::
-        dice\_loss &= 1 - \\frac{2 * intersection\_area}{total\_area} \\\\
-        &= \\frac{(total\_area - intersection\_area) - intersection\_area}{total\_area} \\\\
-        &= \\frac{(union\_area - intersection\_area)}{total\_area}
+        dice\_loss &= 1 - \frac{2 * intersection\_area}{total\_area} \\
+        &= \frac{(total\_area - intersection\_area) - intersection\_area}{total\_area} \\
+        &= \frac{(union\_area - intersection\_area)}{total\_area}
     Parameters:
@@ -13065,8 +13065,8 @@ def log_loss(input, label, epsilon=1e-4, name=None):
     .. math::
-        Out = -label * \\log{(input + \\epsilon)}
-              - (1 - label) * \\log{(1 - input + \\epsilon)}
+        Out = -label * \log{(input + \epsilon)}
+              - (1 - label) * \log{(1 - input + \epsilon)}
     Args:
         input (Tensor|list): A 2-D tensor with shape [N x 1], where N is the
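For readers checking the corrected formula, here is a minimal NumPy sketch of the element-wise log loss; the `log_loss_ref` helper and the sample values are illustrative, not part of Paddle.

```python
import numpy as np

def log_loss_ref(input, label, epsilon=1e-4):
    # Out = -label * log(input + eps) - (1 - label) * log(1 - input + eps)
    return -label * np.log(input + epsilon) - (1 - label) * np.log(1 - input + epsilon)

probs = np.array([[0.9], [0.2]], dtype=np.float32)   # shape [N, 1], predicted probabilities
labels = np.array([[1.0], [0.0]], dtype=np.float32)
print(log_loss_ref(probs, labels))  # small loss for confident, correct predictions
```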
@@ -14500,17 +14500,17 @@ def unfold(x, kernel_sizes, strides=1, paddings=0, dilations=1, name=None):
     .. math::
-        dkernel[0] &= dilations[0] \\times (kernel\_sizes[0] - 1) + 1
-        dkernel[1] &= dilations[1] \\times (kernel\_sizes[1] - 1) + 1
-        hout &= \\frac{H + paddings[0] + paddings[2] - dkernel[0]}{strides[0]} + 1
-        wout &= \\frac{W + paddings[1] + paddings[3] - dkernel[1]}{strides[1]} + 1
-        Cout &= C \\times kernel\_sizes[0] \\times kernel\_sizes[1]
-        Lout &= hout \\times wout
+        dkernel[0] &= dilations[0] \times (kernel\_sizes[0] - 1) + 1
+        dkernel[1] &= dilations[1] \times (kernel\_sizes[1] - 1) + 1
+        hout &= \frac{H + paddings[0] + paddings[2] - dkernel[0]}{strides[0]} + 1
+        wout &= \frac{W + paddings[1] + paddings[3] - dkernel[1]}{strides[1]} + 1
+        Cout &= C \times kernel\_sizes[0] \times kernel\_sizes[1]
+        Lout &= hout \times wout
     Parameters:
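The unfold output-shape formulas can be checked with plain integer arithmetic. The sizes below are assumed example values, not taken from the diff; paddings are indexed exactly as in the formula above.

```python
# Illustrative check of the unfold output-shape formulas (all values assumed).
H, W, C = 32, 32, 3
kernel_sizes = [3, 3]
strides = [1, 1]
paddings = [1, 1, 1, 1]   # paddings[0]/[2] act on H, paddings[1]/[3] on W, as in the formula
dilations = [1, 1]

dkernel0 = dilations[0] * (kernel_sizes[0] - 1) + 1
dkernel1 = dilations[1] * (kernel_sizes[1] - 1) + 1
hout = (H + paddings[0] + paddings[2] - dkernel0) // strides[0] + 1
wout = (W + paddings[1] + paddings[3] - dkernel1) // strides[1] + 1
Cout = C * kernel_sizes[0] * kernel_sizes[1]
Lout = hout * wout
print(hout, wout, Cout, Lout)  # 32 32 27 1024
```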
@@ -37,7 +37,7 @@ def elu(x, alpha=1.0, name=None):
     .. math::
-        elu(x) = max(0, x) + min(0, \\alpha * (e^{x}-1))
+        elu(x) = max(0, x) + min(0, \alpha * (e^{x}-1))
     Parameters:
         x (Tensor): The input Tensor with data type float32, float64.
@@ -91,13 +91,13 @@ def gelu(x, approximate=False, name=None):
     .. math::
-        gelu(x) = 0.5 * x * (1 + tanh(\\sqrt{\\frac{2}{\\pi}} * (x + 0.044715x^{3})))
+        gelu(x) = 0.5 * x * (1 + tanh(\sqrt{\frac{2}{\pi}} * (x + 0.044715x^{3})))
     else
     .. math::
-        gelu(x) = 0.5 * x * (1 + erf(\\frac{x}{\\sqrt{2}}))
+        gelu(x) = 0.5 * x * (1 + erf(\frac{x}{\sqrt{2}}))
     Parameters:
         x (Tensor): The input Tensor with data type float32, float64.
@@ -144,13 +144,13 @@ def hardshrink(x, threshold=0.5, name=None):
     .. math::
         hardshrink(x)=
-            \\left\\{
-            \\begin{aligned}
-            &x, & & if \\ x > threshold \\\\
-            &x, & & if \\ x < -threshold \\\\
-            &0, & & if \\ others
-            \\end{aligned}
-            \\right.
+            \left\{
+            \begin{array}{rcl}
+            x,& &if \ {x > threshold} \\
+            x,& &if \ {x < -threshold} \\
+            0,& &if \ {others} &
+            \end{array}
+            \right.
     Args:
         x (Tensor): The input Tensor with data type float32, float64.
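A minimal NumPy sketch of the piecewise definition in this docstring; `hardshrink_ref` is an illustrative name, not Paddle's implementation.

```python
import numpy as np

def hardshrink_ref(x, threshold=0.5):
    # keep x where |x| > threshold, output 0 otherwise
    return np.where(np.abs(x) > threshold, x, 0.0)

print(hardshrink_ref(np.array([-1.0, 0.3, 2.5])))  # [-1.   0.   2.5]
```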
@@ -192,11 +192,14 @@ def hardtanh(x, min=-1.0, max=1.0, name=None):
     .. math::
-        hardtanh(x)= \\begin{cases}
-            max, \\text{if } x > max \\\\
-            min, \\text{if } x < min \\\\
-            x, \\text{otherwise}
-            \\end{cases}
+        hardtanh(x)=
+            \left\{
+            \begin{array}{cll}
+            max,& & \text{if } x > max \\
+            min,& & \text{if } x < min \\
+            x,& & \text{otherwise}
+            \end{array}
+            \right.
     Parameters:
         x (Tensor): The input Tensor with data type float32, float64.
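The piecewise definition above is simply a clip to [min, max]; a rough NumPy sketch (parameter names mirror the signature in the hunk header, shadowing builtins only for illustration):

```python
import numpy as np

def hardtanh_ref(x, min=-1.0, max=1.0):
    # max where x > max, min where x < min, x otherwise -- i.e. an element-wise clip
    return np.clip(x, min, max)

print(hardtanh_ref(np.array([-1.5, -0.3, 2.0])))  # [-1.  -0.3  1. ]
```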
@@ -246,13 +249,13 @@ def hardsigmoid(x, slope=0.1666667, offset=0.5, name=None):
     .. math::
         hardsigmoid(x)=
-            \\left\\{
-            \\begin{aligned}
-            &0, & & \\text{if } x \\leq -3 \\\\
-            &1, & & \\text{if } x \\geq 3 \\\\
-            &slope * x + offset, & & \\text{otherwise}
-            \\end{aligned}
-            \\right.
+            \left\{
+            \begin{array}{lcl}
+            0, & &\text{if } \ x \leq -3 \\
+            1, & &\text{if } \ x \geq 3 \\
+            slope * x + offset, & &\text{otherwise}
+            \end{array}
+            \right.
     Parameters:
         x (Tensor): The input Tensor with data type float32, float64.
@@ -302,13 +305,13 @@ def hardswish(x, name=None):
     .. math::
         hardswish(x)=
-            \\left\\{
-            \\begin{aligned}
-            &0, & & \\text{if } x \\leq -3 \\\\
-            &x, & & \\text{if } x \\geq 3 \\\\
-            &\\frac{x(x+3)}{6}, & & \\text{otherwise}
-            \\end{aligned}
-            \\right.
+            \left\{
+            \begin{array}{cll}
+            0 &, & \text{if } x \leq -3 \\
+            x &, & \text{if } x \geq 3 \\
+            \frac{x(x+3)}{6} &, & \text{otherwise}
+            \end{array}
+            \right.
     Parameters:
         x (Tensor): The input Tensor with data type float32, float64.
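A minimal NumPy sketch of the three-branch hardswish definition above; the helper name is illustrative only.

```python
import numpy as np

def hardswish_ref(x):
    # 0 for x <= -3, x for x >= 3, x*(x+3)/6 in between
    return np.where(x <= -3, 0.0, np.where(x >= 3, x, x * (x + 3) / 6))

print(hardswish_ref(np.array([-4.0, 0.0, 5.0])))  # [0. 0. 5.]
```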
@@ -345,13 +348,13 @@ def leaky_relu(x, negative_slope=0.01, name=None):
     leaky_relu activation
     .. math::
-        leaky\\_relu(x)=
-            \\left\\{
-            \\begin{aligned}
-            &x, & & if \\ x >= 0 \\\\
-            &negative\_slope * x, & & otherwise \\\\
-            \\end{aligned}
-            \\right. \\\\
+        leaky\_relu(x)=
+            \left\{
+            \begin{array}{rcl}
+            x, & & if \ x >= 0 \\
+            negative\_slope * x, & & otherwise \\
+            \end{array}
+            \right.
     Args:
         x (Tensor): The input Tensor with data type float32, float64.
@@ -513,7 +516,7 @@ def log_sigmoid(x, name=None):
     .. math::
-        log\\_sigmoid(x) = log \\frac{1}{1 + e^{-x}}
+        log\_sigmoid(x) = log \frac{1}{1 + e^{-x}}
     Parameters:
         x (Tensor): The input Tensor with data type float32, float64.
@@ -554,12 +557,15 @@ def maxout(x, groups, axis=1, name=None):
     .. math::
-        &out_{si+j} = \\max_{k} x_{gsi + sk + j} \\\\
-        &g = groups \\\\
-        &s = \\frac{input.size}{num\\_channels} \\\\
-        &0 \\le i < \\frac{num\\_channels}{groups} \\\\
-        &0 \\le j < s \\\\
-        &0 \\le k < groups
+        \begin{array}{l}
+        &out_{si+j} = \max_{k} x_{gsi + sk + j} \\
+        &g = groups \\
+        &s = \frac{input.size}{num\_channels} \\
+        &0 \le i < \frac{num\_channels}{groups} \\
+        &0 \le j < s \\
+        &0 \le k < groups
+        \end{array}
     Parameters:
         x (Tensor): The input is 4-D Tensor with shape [N, C, H, W] or [N, H, W, C], the data type
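The index bookkeeping in the maxout formula is easier to see as a reshape-and-max. A rough NumPy sketch for an NCHW input and axis=1, assuming the channel count is divisible by groups; this is an illustration of the formula, not Paddle's operator.

```python
import numpy as np

def maxout_ref(x, groups, axis=1):
    # x: [N, C, H, W]; per the formula, output channel i takes the max over
    # input channels groups*i + k for k in [0, groups).
    n, c, h, w = x.shape
    assert axis == 1 and c % groups == 0, "sketch covers the NCHW, axis=1 case only"
    return x.reshape(n, c // groups, groups, h, w).max(axis=2)

x = np.random.rand(2, 6, 4, 4).astype("float32")
print(maxout_ref(x, groups=2).shape)  # (2, 3, 4, 4)
```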
@@ -670,10 +676,12 @@ def selu(x,
     .. math::
         selu(x)= scale *
-            \\begin{cases}
-            x, \\text{if } x > 0 \\\\
-            alpha * e^{x} - alpha, \\text{if } x <= 0
-            \\end{cases}
+            \left\{
+            \begin{array}{lcl}
+            x,& &\text{if } \ x > 0 \\
+            alpha * e^{x} - alpha,& &\text{if } \ x <= 0
+            \end{array}
+            \right.
     Parameters:
         x (Tensor): The input Tensor with data type float32, float64.
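A minimal NumPy sketch of the scaled piecewise definition above. The scale/alpha defaults below are the standard SELU constants, used here only to make the sketch runnable; they are not quoted from this diff.

```python
import numpy as np

def selu_ref(x, scale=1.0507009873554805, alpha=1.6732632423543772):
    # scale * (x if x > 0 else alpha * e^x - alpha)
    return scale * np.where(x > 0, x, alpha * np.exp(x) - alpha)

print(selu_ref(np.array([-1.0, 0.0, 2.0])))
```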
@@ -719,9 +727,11 @@ def selu(x,
 def silu(x, name=None):
-    """
-    silu activation.
-    .. math:
+    r"""
+    silu activation
+
+    .. math::
+
     silu(x) = \frac{x}{1 + e^{-x}}
     Parameters:
@@ -734,11 +744,12 @@ def silu(x, name=None):
     Examples:
         .. code-block:: python
-            import paddle
-            import paddle.nn.functional as F
-            x = paddle.to_tensor([1.0, 2.0, 3.0, 4.0])
-            out = F.silu(x) # [ 0.731059, 1.761594, 2.857722, 3.928055 ]
+            import paddle
+            import paddle.nn.functional as F
+
+            x = paddle.to_tensor([1.0, 2.0, 3.0, 4.0])
+            out = F.silu(x) # [ 0.731059, 1.761594, 2.857722, 3.928055 ]
     """
     if in_dygraph_mode():
@@ -778,7 +789,7 @@ def softmax(x, axis=-1, dtype=None, name=None):
     .. math::
-        softmax[i, j] = \\frac{\\exp(x[i, j])}{\\sum_j(exp(x[i, j])}
+        softmax[i, j] = \frac{\exp(x[i, j])}{\sum_j(exp(x[i, j])}
     Example:
@@ -923,8 +934,8 @@ def softplus(x, beta=1, threshold=20, name=None):
     .. math::
-        softplus(x) = \\frac{1}{beta} * \\log(1 + e^{beta * x}) \\\\
-        \\text{For numerical stability, the implementation reverts to the linear function when: beta * x > threshold.}
+        softplus(x) = \frac{1}{beta} * \log(1 + e^{beta * x}) \\
+        \text{For numerical stability, the implementation reverts to the linear function when: beta * x > threshold.}
     Parameters:
         x (Tensor): The input Tensor with data type float32, float64.
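The "reverts to the linear function" note is the usual numerically stable softplus trick. A rough NumPy sketch of that behavior (not Paddle's kernel):

```python
import numpy as np

def softplus_ref(x, beta=1.0, threshold=20.0):
    # (1/beta) * log(1 + exp(beta * x)), but return x directly when beta*x > threshold
    # so that exp() never sees a huge argument.
    bx = beta * x
    return np.where(bx > threshold, x, np.log1p(np.exp(np.minimum(bx, threshold))) / beta)

print(softplus_ref(np.array([-5.0, 0.0, 50.0])))  # [~0.0067  0.6931  50.    ]
```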
@@ -968,11 +979,14 @@ def softshrink(x, threshold=0.5, name=None):
     .. math::
-        softshrink(x)= \\begin{cases}
-            x - threshold, \\text{if } x > threshold \\\\
-            x + threshold, \\text{if } x < -threshold \\\\
-            0, \\text{otherwise}
-            \\end{cases}
+        softshrink(x)=
+            \left\{
+            \begin{array}{rcl}
+            x - threshold,& & \text{if } x > threshold \\
+            x + threshold,& & \text{if } x < -threshold \\
+            0,& & \text{otherwise}
+            \end{array}
+            \right.
     Parameters:
         x (Tensor): The input Tensor with data type float32, float64.
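A minimal NumPy sketch of the three-branch softshrink definition above; the helper is illustrative, not Paddle's implementation.

```python
import numpy as np

def softshrink_ref(x, threshold=0.5):
    # x - threshold above the threshold, x + threshold below -threshold, 0 otherwise
    return np.where(x > threshold, x - threshold,
                    np.where(x < -threshold, x + threshold, 0.0))

print(softshrink_ref(np.array([-0.9, 0.2, 0.8])))  # [-0.4  0.   0.3] (up to float rounding)
```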
@@ -1019,7 +1033,7 @@ def softsign(x, name=None):
     .. math::
-        softsign(x) = \\frac{x}{1 + |x|}
+        softsign(x) = \frac{x}{1 + |x|}
     Parameters:
         x (Tensor): The input Tensor with data type float32, float64.
@@ -1056,7 +1070,7 @@ def swish(x, name=None):
     .. math::
-        swish(x) = \\frac{x}{1 + e^{-x}}
+        swish(x) = \frac{x}{1 + e^{-x}}
     Parameters:
         x (Tensor): The input Tensor with data type float32, float64.
@@ -1134,10 +1148,14 @@ def thresholded_relu(x, threshold=1.0, name=None):
     .. math::
-        thresholded\\_relu(x) = \\begin{cases}
-            x, \\text{if } x > threshold \\\\
-            0, \\text{otherwise}
-            \\end{cases}
+        thresholded\_relu(x) =
+            \left\{
+            \begin{array}{rl}
+            x,& \text{if } \ x > threshold \\
+            0,& \text{otherwise}
+            \end{array}
+            \right.
     Parameters:
         x (Tensor): The input Tensor with data type float32, float64.
@@ -1181,10 +1199,10 @@ def log_softmax(x, axis=-1, dtype=None, name=None):
     .. math::
-        \\begin{aligned}
-        log\\_softmax[i, j] &= log(softmax(x)) \\\\
-        &= log(\\frac{\\exp(X[i, j])}{\\sum_j(\\exp(X[i, j])})
-        \\end{aligned}
+        \begin{aligned}
+        log\_softmax[i, j] &= log(softmax(x)) \\
+        &= log(\frac{\exp(X[i, j])}{\sum_j(\exp(X[i, j])})
+        \end{aligned}
     Parameters:
         x (Tensor): The input Tensor with data type float32, float64.
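A small NumPy sketch of the log-softmax formula above, written in the usual max-shifted form for numerical stability; it is an illustration, not Paddle's kernel.

```python
import numpy as np

def log_softmax_ref(x, axis=-1):
    # log(exp(x)/sum(exp(x))) computed as (x - max) - log(sum(exp(x - max)))
    shifted = x - x.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

x = np.array([[1.0, 2.0, 3.0]])
print(log_softmax_ref(x))                        # log-probabilities
print(np.exp(log_softmax_ref(x)).sum(axis=-1))   # [1.] -- exponentiates to a distribution
```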
@@ -180,18 +180,18 @@ def binary_cross_entropy_with_logits(logit,
     First this operator calculate loss function as follows:
     .. math::
-        Out = -Labels * \\log(\\sigma(Logit)) - (1 - Labels) * \\log(1 - \\sigma(Logit))
+        Out = -Labels * \log(\sigma(Logit)) - (1 - Labels) * \log(1 - \sigma(Logit))
-    We know that :math:`\\sigma(Logit) = \\frac{1}{1 + e^{-Logit}}`. By substituting this we get:
+    We know that :math:`\sigma(Logit) = \frac{1}{1 + e^{-Logit}}`. By substituting this we get:
     .. math::
-        Out = Logit - Logit * Labels + \\log(1 + e^{-Logit})
+        Out = Logit - Logit * Labels + \log(1 + e^{-Logit})
     For stability and to prevent overflow of :math:`e^{-Logit}` when Logit < 0,
     we reformulate the loss as follows:
     .. math::
-        Out = \\max(Logit, 0) - Logit * Labels + \\log(1 + e^{-\|Logit\|})
+        Out = \max(Logit, 0) - Logit * Labels + \log(1 + e^{-\|Logit\|})
     Then, if ``weight`` or ``pos_weight`` is not None, this operator multiply the
     weight tensor on the loss `Out`. The ``weight`` tensor will attach different
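The stable reformulation in the last equation can be checked numerically. A NumPy sketch of that element-wise form only (weight, pos_weight, and reduction are deliberately omitted; this is not Paddle's operator):

```python
import numpy as np

def bce_with_logits_ref(logit, labels):
    # Out = max(Logit, 0) - Logit * Labels + log(1 + exp(-|Logit|))
    return np.maximum(logit, 0) - logit * labels + np.log1p(np.exp(-np.abs(logit)))

logit = np.array([-100.0, 0.0, 3.0])
labels = np.array([0.0, 1.0, 1.0])
print(bce_with_logits_ref(logit, labels))
# Note: the intermediate form log(1 + exp(-Logit)) would overflow at Logit = -100,
# which is exactly what the |Logit| rewrite avoids.
```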
@@ -450,17 +450,17 @@ def smooth_l1_loss(input, label, reduction='mean', delta=1.0, name=None):
     .. math::
-        loss(x,y) = \\frac{1}{n}\\sum_{i}z_i
+        loss(x,y) = \frac{1}{n}\sum_{i}z_i
     where z_i is given by:
     .. math::
-        \\mathop{z_i} = \\left\\{\\begin{array}{rcl}
-        0.5(x_i - y_i)^2 & & {if |x_i - y_i| < delta} \\\\
+        \mathop{z_i} = \left\{\begin{array}{rcl}
+        0.5(x_i - y_i)^2 & & {if |x_i - y_i| < delta} \\
         delta * |x_i - y_i| - 0.5 * delta^2 & & {otherwise}
-        \\end{array} \\right.
+        \end{array} \right.
     Parameters:
         input (Tensor): Input tensor, the data type is float32 or float64. Shape is
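A minimal NumPy sketch of the piecewise z_i and the mean reduction shown above (the 'mean' case only; not Paddle's implementation):

```python
import numpy as np

def smooth_l1_ref(x, y, delta=1.0):
    # z_i = 0.5*(x_i - y_i)^2 if |x_i - y_i| < delta, else delta*|x_i - y_i| - 0.5*delta^2
    diff = np.abs(x - y)
    z = np.where(diff < delta, 0.5 * diff**2, delta * diff - 0.5 * delta**2)
    return z.mean()   # loss = (1/n) * sum_i z_i

print(smooth_l1_ref(np.array([0.2, 3.0]), np.array([0.0, 0.0])))  # mean of [0.02, 2.5] = 1.26
```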
@@ -631,17 +631,17 @@ def l1_loss(input, label, reduction='mean', name=None):
     If `reduction` set to ``'none'``, the loss is:
     .. math::
-        Out = \\lvert input - label \\rvert
+        Out = \lvert input - label \rvert
     If `reduction` set to ``'mean'``, the loss is:
     .. math::
-        Out = MEAN(\\lvert input - label \\rvert)
+        Out = MEAN(\lvert input - label \rvert)
     If `reduction` set to ``'sum'``, the loss is:
     .. math::
-        Out = SUM(\\lvert input - label\\rvert)
+        Out = SUM(\lvert input - label \rvert)
     Parameters:
@@ -1563,15 +1563,15 @@ def sigmoid_focal_loss(logit,
     This operator measures focal loss function as follows:
     .. math::
-        Out = -Labels * alpha * {(1 - \\sigma(Logit))}^{gamma}\\log(\\sigma(Logit)) - (1 - Labels) * (1 - alpha) * {\\sigma(Logit)}^{gamma}\\log(1 - \\sigma(Logit))
+        Out = -Labels * alpha * {(1 - \sigma(Logit))}^{gamma}\log(\sigma(Logit)) - (1 - Labels) * (1 - alpha) * {\sigma(Logit)}^{gamma}\log(1 - \sigma(Logit))
-    We know that :math:`\\sigma(Logit) = \\frac{1}{1 + \\exp(-Logit)}`.
+    We know that :math:`\sigma(Logit) = \frac{1}{1 + \exp(-Logit)}`.
     Then, if :attr:`normalizer` is not None, this operator divides the
     normalizer tensor on the loss `Out`:
     .. math::
-        Out = \\frac{Out}{normalizer}
+        Out = \frac{Out}{normalizer}
     Finally, this operator applies reduce operation on the loss.
     If :attr:`reduction` set to ``'none'``, the operator will return the original loss `Out`.
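A rough NumPy sketch of the focal-loss formula and the normalizer division above (reduction omitted; the alpha/gamma defaults are the common focal-loss values, shown only to make the sketch runnable):

```python
import numpy as np

def sigmoid_focal_loss_ref(logit, labels, alpha=0.25, gamma=2.0, normalizer=None):
    # Out = -y*alpha*(1-p)^gamma*log(p) - (1-y)*(1-alpha)*p^gamma*log(1-p), p = sigmoid(logit)
    p = 1.0 / (1.0 + np.exp(-logit))
    out = (-labels * alpha * (1 - p) ** gamma * np.log(p)
           - (1 - labels) * (1 - alpha) * p ** gamma * np.log(1 - p))
    return out / normalizer if normalizer is not None else out

logit = np.array([2.0, -1.0])
labels = np.array([1.0, 0.0])
print(sigmoid_focal_loss_ref(logit, labels, normalizer=2.0))
```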
@@ -34,12 +34,12 @@ def normalize(x, p=2, axis=1, epsilon=1e-12, name=None):
     .. math::
-        y = \\frac{x}{ \\max\\left( \\lvert \\lvert x \\rvert \\rvert_p, epsilon\\right) }
+        y = \frac{x}{ \max\left( \lvert \lvert x \rvert \rvert_p, epsilon\right) }
     .. math::
-        \\lvert \\lvert x \\rvert \\rvert_p = \\left( \\sum_i {\\lvert x_i \\rvert^p} \\right)^{1/p}
+        \lvert \lvert x \rvert \rvert_p = \left( \sum_i {\lvert x_i \rvert^p} \right)^{1/p}
-    where, :math:`\\sum_i{\\lvert x_i \\rvert^p}` is calculated along the ``axis`` dimension.
+    where, :math:`\sum_i{\lvert x_i \rvert^p}` is calculated along the ``axis`` dimension.
     Parameters:
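A minimal NumPy sketch of the two formulas above, with the epsilon guard that keeps all-zero rows from dividing by zero; `normalize_ref` is illustrative, not Paddle's implementation.

```python
import numpy as np

def normalize_ref(x, p=2, axis=1, epsilon=1e-12):
    # y = x / max(||x||_p, epsilon), where ||x||_p = (sum_i |x_i|^p)^(1/p) along `axis`
    norm = (np.abs(x) ** p).sum(axis=axis, keepdims=True) ** (1.0 / p)
    return x / np.maximum(norm, epsilon)

x = np.array([[3.0, 4.0], [0.0, 0.0]])
print(normalize_ref(x, axis=1))  # first row -> [0.6, 0.8]; zero row stays 0 thanks to epsilon
```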
@@ -432,7 +432,7 @@ def local_response_norm(x,
     .. math::
-        Output(i, x, y) = Input(i, x, y) / \\left(k + \\alpha \\sum\\limits^{\\min(C-1, i + size/2)}_{j = \\max(0, i - size/2)}(Input(j, x, y))^2\\right)^{\\beta}
+        Output(i, x, y) = Input(i, x, y) / \left(k + \alpha \sum\limits^{\min(C-1, i + size/2)}_{j = \max(0, i - size/2)}(Input(j, x, y))^2\right)^{\beta}
     In the above equation: