Commit 49972701 authored by Megvii Engine Team, committed by Xinran Xu

docs(mge/module): refine the docstring of several apis

GitOrigin-RevId: ea04e05be44b3db8062bff04e3d5bd49a23c2c31
Parent 4c5d01fa
@@ -191,7 +191,7 @@ class LeakyReLU(Module):
     Applies the element-wise function:
     .. math::
-        \text{LeakyReLU}(x) = \max(0,x) + 0.01 * \min(0,x)
+        \text{LeakyReLU}(x) = \max(0,x) + negative\_slope \times \min(0,x)
     or
@@ -199,7 +199,7 @@ class LeakyReLU(Module):
         \text{LeakyReLU}(x) =
         \begin{cases}
             x, & \text{ if } x \geq 0 \\
-            0.01x, & \text{ otherwise }
+            negative\_slope \times x, & \text{ otherwise }
         \end{cases}
     Examples:
@@ -211,7 +211,7 @@ class LeakyReLU(Module):
         import megengine.module as M
         data = mge.tensor(np.array([-8, -12, 6, 10]).astype(np.float32))
-        leakyrelu = M.LeakyReLU()
+        leakyrelu = M.LeakyReLU(0.01)
         output = leakyrelu(data)
         print(output.numpy())
...
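As a sanity check on the revised formula, here is a plain-NumPy sketch (the `leaky_relu` helper is hypothetical, not MegEngine's implementation):

    import numpy as np

    def leaky_relu(x, negative_slope=0.01):
        # max(0, x) + negative_slope * min(0, x), as in the docstring formula
        return np.maximum(0, x) + negative_slope * np.minimum(0, x)

    data = np.array([-8, -12, 6, 10], dtype=np.float32)
    print(leaky_relu(data))  # [-0.08 -0.12  6.   10.  ]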
@@ -204,7 +204,7 @@ class ConvTranspose2d(_ConvNd):
     with respect to its input.
     Convolution usually reduces the size of input, while transposed convolution works
-    the other way, transforming a smaller input to a larger output while preserving the
+    the opposite way, transforming a smaller input to a larger output while preserving the
     connectivity pattern.
     :param in_channels: number of input channels.
...
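To make "smaller input to a larger output" concrete, here is a sketch of the standard transposed-convolution output-size relation, assuming unit dilation and no output padding (the helper name is hypothetical, not part of the diff):

    def conv_transpose2d_out_size(in_size, kernel_size, stride=1, padding=0):
        # inverse of the conv output-size relation: an n x n input becomes
        # (n - 1) * stride - 2 * padding + kernel_size along each spatial dim
        return (in_size - 1) * stride - 2 * padding + kernel_size

    print(conv_transpose2d_out_size(4, kernel_size=3, stride=2))  # 9, i.e. 4x4 -> 9x9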
@@ -11,9 +11,9 @@ from .module import Module
 class Dropout(Module):
-    r"""Randomly set input elements to zeros. Commonly used in large networks to prevent overfitting.
+    r"""Randomly set input elements to zeros with the probability :math:`drop\_prob` during training. Commonly used in large networks to prevent overfitting.
     Note that we perform dropout only during training, we also rescale(multiply) the output tensor
-    by :math:`\frac{1}{1 - p}`. During inference :class:`~.Dropout` is equal to :class:`~.Identity`.
+    by :math:`\frac{1}{1 - drop\_prob}`. During inference :class:`~.Dropout` is equal to :class:`~.Identity`.
     :param drop_prob: The probability to drop (set to zero) each single element
     """
...
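The rescaling keeps the expected value of each element unchanged across training and inference; a minimal NumPy sketch of the training-time behaviour described above (hypothetical `dropout` helper, not MegEngine's implementation):

    import numpy as np

    def dropout(x, drop_prob, rng):
        mask = rng.random(x.shape) >= drop_prob  # keep with probability 1 - drop_prob
        return x * mask / (1.0 - drop_prob)      # rescale so E[output] == x

    rng = np.random.default_rng(0)
    x = np.ones(8, dtype=np.float32)
    print(dropout(x, 0.5, rng))  # kept entries become 2.0, dropped ones 0.0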
@@ -11,5 +11,7 @@ from .module import Module
 class Identity(Module):
+    r"""A placeholder identity operator that will ignore any argument."""
     def forward(self, x):
         return identity(x)
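A minimal usage sketch of the module documented above (the printed values simply assume the stated identity behaviour):

    import numpy as np
    import megengine as mge
    import megengine.module as M

    x = mge.tensor(np.arange(4, dtype=np.float32))
    print(M.Identity()(x).numpy())  # [0. 1. 2. 3.], returned unchanged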
@@ -176,8 +176,8 @@ def xavier_uniform_(tensor: Tensor, gain: float = 1.0) -> None:
         a = \text{gain} \times \sqrt{\frac{6}{\text{fan_in} + \text{fan_out}}}
     Also known as Glorot initialization. Detailed information can be retrieved from
-    `Understanding the difficulty of training deep feedforward neural networks` -
-    Glorot, X. & Bengio, Y. (2010).
+    `"Understanding the difficulty of training deep feedforward neural networks" <http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf>`_.
     :param tensor: An n-dimentional tensor to be initialized
     :param gain: Scaling factor for :math:`a`.
@@ -196,8 +196,7 @@ def xavier_normal_(tensor: Tensor, gain: float = 1.0) -> None:
         \text{std} = \text{gain} \times \sqrt{\frac{2}{\text{fan_in} + \text{fan_out}}}
     Also known as Glorot initialization. Detailed information can be retrieved from
-    `Understanding the difficulty of training deep feedforward neural networks` -
-    Glorot, X. & Bengio, Y. (2010).
+    `"Understanding the difficulty of training deep feedforward neural networks" <http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf>`_.
     :param tensor: An n-dimentional tensor to be initialized
     :param gain: Scaling factor for :math:`std`.
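For reference, the two Glorot formulas above (the uniform bound :math:`a` and the normal :math:`std`) reduce to one line each; these helpers are hypothetical, and `fan_in`/`fan_out` would normally be derived from the tensor shape:

    import math

    def xavier_uniform_bound(fan_in, fan_out, gain=1.0):
        return gain * math.sqrt(6.0 / (fan_in + fan_out))  # uniform in [-a, a]

    def xavier_normal_std(fan_in, fan_out, gain=1.0):
        return gain * math.sqrt(2.0 / (fan_in + fan_out))  # std of N(0, std^2)

    print(xavier_uniform_bound(256, 128))  # 0.125
    print(xavier_normal_std(256, 128))     # ~0.0722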
@@ -217,8 +216,9 @@ def msra_uniform_(
         \text{bound} = \sqrt{\frac{6}{(1 + a^2) \times \text{fan_in}}}
     Detailed information can be retrieved from
-    `Delving deep into rectifiers: Surpassing human-level performance on ImageNet
-    classification`
+    `"Delving deep into rectifiers: Surpassing human-level performance on ImageNet
+    classification" <https://www.cv-foundation.org/openaccess/content_iccv_2015/papers/He_Delving_Deep_into_ICCV_2015_paper.pdf>`_.
     :param tensor: An n-dimentional tensor to be initialized
     :param a: Optional parameter for calculating gain for leaky_relu. See
@@ -246,8 +246,8 @@ def msra_normal_(
         \text{std} = \sqrt{\frac{2}{(1 + a^2) \times \text{fan_in}}}
     Detailed information can be retrieved from
-    `Delving deep into rectifiers: Surpassing human-level performance on ImageNet
-    classification`
+    `"Delving deep into rectifiers: Surpassing human-level performance on ImageNet
+    classification" <https://www.cv-foundation.org/openaccess/content_iccv_2015/papers/He_Delving_Deep_into_ICCV_2015_paper.pdf>`_.
     :param tensor: An n-dimentional tensor to be initialized
     :param a: Optional parameter for calculating gain for leaky_relu. See
...
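Likewise for the two He/MSRA formulas, where `a` is the leaky_relu negative-slope parameter mentioned in the parameter docs (hypothetical helpers, shown only to ground the math):

    import math

    def msra_uniform_bound(fan_in, a=0.0):
        return math.sqrt(6.0 / ((1 + a ** 2) * fan_in))  # uniform in [-bound, bound]

    def msra_normal_std(fan_in, a=0.0):
        return math.sqrt(2.0 / ((1 + a ** 2) * fan_in))  # std of N(0, std^2)

    print(msra_uniform_bound(256))  # ~0.153
    print(msra_normal_std(256))     # ~0.0884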