:param padding: size of the paddings added to the input on both sides of its
:param padding: size of the paddings added to the input on both sides of its
spatial dimensions. Only zero-padding is supported. Default: 0
spatial dimensions. Only zero-padding is supported. Default: 0
:param dilation: dilation of the 2D convolution operation. Default: 1
:param dilation: dilation of the 2D convolution operation. Default: 1
:param groups: number of groups to divide input and output channels into,
:param groups: number of groups into which the input and output channels are divided, so as to perform a ``grouped convolution``. When ``groups`` is not 1,
so as to perform a ``grouped convolution``. When groups is not 1,
``in_channels`` and ``out_channels`` must be divisible by ``groups``,
in_channels and out_channels must be divisible by groups,
and the shape of weight should be `(groups, out_channel // groups,
and the shape of weight should be `(groups, out_channel // groups,
in_channels // groups, height, width)`.
in_channels // groups, height, width)`.
:type conv_mode: string or :class:`P.Convolution.Mode`.
:type conv_mode: string or :class:`P.Convolution.Mode`
:param conv_mode: supports "CROSS_CORRELATION" or "CONVOLUTION". Default:
:param conv_mode: supports "CROSS_CORRELATION" or "CONVOLUTION". Default:
"CROSS_CORRELATION"
"CROSS_CORRELATION"
:type compute_mode: string or
:type compute_mode: string or
:class:`P.Convolution.ComputeMode`.
:class:`P.Convolution.ComputeMode`
:param compute_mode: when set to "DEFAULT", no special requirements will be
:param compute_mode: when set to "DEFAULT", no special requirements will be
placed on the precision of intermediate results. When set to "FLOAT32",
placed on the precision of intermediate results. When set to "FLOAT32",
Float32 would be used for accumulator and intermediate result, but only
"Float32" would be used for accumulator and intermediate result, but only
effective when input and output are of Float16 dtype.
effective when input and output are of Float16 dtype.
:return: output tensor.
:return: output tensor.
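
As an illustration of the ``groups`` description above, here is a minimal NumPy sketch, not MegEngine's implementation. The helper names `grouped_weight_shape` and `grouped_cross_correlation` are hypothetical, and the sketch assumes stride 1, no padding and no dilation.

```python
import numpy as np


def grouped_weight_shape(in_channels, out_channels, groups, height, width):
    """Return the weight shape described in the docstring for a grouped convolution."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("in_channels and out_channels must be divisible by groups")
    return (groups, out_channels // groups, in_channels // groups, height, width)


def grouped_cross_correlation(inp, weight):
    """Naive grouped cross-correlation (assumed: stride 1, no padding, no dilation).

    inp:    (N, C_in, H, W)
    weight: (groups, C_out // groups, C_in // groups, KH, KW)
    """
    n, c_in, h, w = inp.shape
    g, oc_g, ic_g, kh, kw = weight.shape
    assert c_in == g * ic_g, "input channels must match groups * (C_in // groups)"
    out = np.zeros((n, g * oc_g, h - kh + 1, w - kw + 1), dtype=inp.dtype)
    for gi in range(g):
        # Each group only sees its own slice of the input channels.
        x = inp[:, gi * ic_g:(gi + 1) * ic_g]
        for i in range(out.shape[2]):
            for j in range(out.shape[3]):
                patch = x[:, :, i:i + kh, j:j + kw]  # (N, ic_g, KH, KW)
                out[:, gi * oc_g:(gi + 1) * oc_g, i, j] = np.einsum(
                    "nchw,ochw->no", patch, weight[gi]
                )
    return out


shape = grouped_weight_shape(in_channels=4, out_channels=6, groups=2, height=3, width=3)
weight = np.ones(shape, dtype=np.float32)
inp = np.ones((1, 4, 5, 5), dtype=np.float32)
out = grouped_cross_correlation(inp, weight)
```

With two groups, each output channel is computed from only `in_channels // groups` input channels, which is why the weight carries the extra leading ``groups`` dimension.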
"""
"""
...
@@ -168,24 +167,23 @@ def conv_transpose2d(
...
@@ -168,24 +167,23 @@ def conv_transpose2d(
:param inp: feature map of the convolution operation.
:param inp: feature map of the convolution operation.
:param weight: convolution kernel.
:param weight: convolution kernel.
:param bias: bias added to the result of convolution (if given)
:param bias: bias added to the result of convolution (if given).
:param stride: stride of the 2D convolution operation. Default: 1
:param stride: stride of the 2D convolution operation. Default: 1
:param padding: size of the paddings added to the input on both sides of its
:param padding: size of the paddings added to the input on both sides of its
spatial dimensions. Only zero-padding is supported. Default: 0
spatial dimensions. Only zero-padding is supported. Default: 0
:param dilation: dilation of the 2D convolution operation. Default: 1
:param dilation: dilation of the 2D convolution operation. Default: 1
:param groups: number of groups to divide input and output channels into,
:param groups: number of groups into which the input and output channels are divided, so as to perform a ``grouped convolution``. When ``groups`` is not 1,
so as to perform a ``grouped convolution``. When groups is not 1,
``in_channels`` and ``out_channels`` must be divisible by groups,
in_channels and out_channels must be divisible by groups,
and the shape of weight should be `(groups, out_channel // groups,
and the shape of weight should be `(groups, out_channel // groups,
It is applied to all elements along axis, and will re-scale them so that
It is applied to all elements along axis, and rescales elements so that
the elements lie in the range `[0, 1]` and sum to 1.
they stay in the range `[0, 1]` and sum to 1.
See :class:`~megengine.module.activation.Softmax` for more details.
See :class:`~megengine.module.activation.Softmax` for more details.
:param inp: The input tensor.
:param inp: input tensor.
:param axis: An axis along which softmax will be applied. By default,
:param axis: an axis along which softmax will be applied. By default,
softmax will apply along the highest ranked axis.
softmax will apply along the highest ranked axis.
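
The rescaling described above can be sketched in plain NumPy. This is an illustration only, not MegEngine's implementation; it assumes that "the highest ranked axis" means the last axis, and the function name `softmax_sketch` is hypothetical.

```python
import numpy as np


def softmax_sketch(x, axis=-1):
    """Rescale elements along `axis` so they lie in [0, 1] and sum to 1."""
    # Subtracting the maximum does not change the result but avoids overflow in exp.
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


x = np.arange(-2, 3, dtype=np.float32).reshape(1, 5)
y = softmax_sketch(x)
```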
Examples:
Examples:
...
@@ -573,7 +570,7 @@ def batch_norm2d(
...
@@ -573,7 +570,7 @@ def batch_norm2d(
eps: float = 1e-5,
eps: float = 1e-5,
inplace: bool = True
inplace: bool = True
):
):
"""Applies batch normalization to the input.
r"""Applies batch normalization to the input.
Refer to :class:`~.BatchNorm2d` and :class:`~.BatchNorm1d` for more information.
Refer to :class:`~.BatchNorm2d` and :class:`~.BatchNorm1d` for more information.
...
@@ -585,13 +582,13 @@ def batch_norm2d(
...
@@ -585,13 +582,13 @@ def batch_norm2d(
:param bias: bias tensor in the learnable affine parameters.
:param bias: bias tensor in the learnable affine parameters.
See :math:`\beta` in :class:`~.BatchNorm2d`.
See :math:`\beta` in :class:`~.BatchNorm2d`.
:param training: a boolean value to indicate whether batch norm is performed
:param training: a boolean value to indicate whether batch norm is performed
in traning mode. Default: False
in training mode. Default: False
:param momentum: value used for the ``running_mean`` and ``running_var``
:param momentum: value used for the ``running_mean`` and ``running_var``
computation.
computation.
Default: 0.9
Default: 0.9
:param eps: a value added to the denominator for numerical stability.
:param eps: a value added to the denominator for numerical stability.
Default: 1e-5
Default: 1e-5
:param inplace: whether to update running_mean and running_var inplace or return new tensors
:param inplace: whether to update ``running_mean`` and ``running_var`` inplace or return new tensors
Default: True
Default: True
:return: output tensor.
:return: output tensor.
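
To make the roles of ``training``, ``momentum`` and ``eps`` concrete, here is a minimal NumPy sketch for a 4D input. It is not MegEngine's implementation. Two details are assumptions: the running statistics are updated as ``momentum * running + (1 - momentum) * batch``, and the biased batch variance is used throughout. The function name `batch_norm_sketch` is hypothetical, and it always returns new running statistics rather than updating them in place.

```python
import numpy as np


def batch_norm_sketch(x, weight, bias, running_mean, running_var,
                      training=False, momentum=0.9, eps=1e-5):
    """Normalize x of shape (N, C, H, W) per channel; return output and running stats."""
    if training:
        mean = x.mean(axis=(0, 2, 3))
        var = x.var(axis=(0, 2, 3))  # biased variance (assumption)
        # Assumed update convention: the old statistics are weighted by `momentum`.
        running_mean = momentum * running_mean + (1 - momentum) * mean
        running_var = momentum * running_var + (1 - momentum) * var
    else:
        mean, var = running_mean, running_var
    shape = (1, -1, 1, 1)
    # eps is added to the denominator for numerical stability.
    x_hat = (x - mean.reshape(shape)) / np.sqrt(var.reshape(shape) + eps)
    out = x_hat * weight.reshape(shape) + bias.reshape(shape)
    return out, running_mean, running_var


rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(8, 3, 4, 4))
out, new_mean, new_var = batch_norm_sketch(
    x, np.ones(3), np.zeros(3), np.zeros(3), np.ones(3), training=True
)
```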
"""
"""
...
@@ -677,7 +674,7 @@ def sync_batch_norm(
...
@@ -677,7 +674,7 @@ def sync_batch_norm(
eps_mode="ADDITIVE",
eps_mode="ADDITIVE",
group=WORLD,
group=WORLD,
) -> Tensor:
) -> Tensor:
"""Applies synchronized batch normalization to the input.
r"""Applies synchronized batch normalization to the input.
Refer to :class:`~.BatchNorm2d` and :class:`~.BatchNorm1d` for more information.
Refer to :class:`~.BatchNorm2d` and :class:`~.BatchNorm1d` for more information.
...
@@ -887,19 +884,18 @@ def matmul(
...
@@ -887,19 +884,18 @@ def matmul(
With different inputs dim, this function behaves differently:
With different inputs dim, this function behaves differently:
- Both 1-D tensor, simply forward to dot.
- Both 1-D tensor, simply forward to ``dot``.
- Both 2-D tensor, normal matrix multiplication.
- Both 2-D tensor, normal matrix multiplication.
- If one input tensor is 1-D, matrix vector multiplication.
- If one input tensor is 1-D, matrix vector multiplication.
- If at least one tensor are 3-dimensional or >3-dimensional, the batched matrix-matrix is returned, and the tensor with smaller dimension will
- If at least one tensor are 3-dimensional or >3-dimensional, the other tensor should have dim >= 2, the batched matrix-matrix is returned, and the tensor with smaller dimension will
:param bias: bias added to the result of convolution
:param bias: bias added to the result of convolution
:param stride: stride of the 2D convolution operation. Default: 1
:param stride: stride of the 2D convolution operation. Default: 1
:param padding: size of the paddings added to the input on both sides of its
:param padding: size of the paddings added to the input on both sides of its spatial dimensions. Only zero-padding is supported. Default: 0
spatial dimensions. Only zero-padding is supported. Default: 0
:param dilation: dilation of the 2D convolution operation. Default: 1
:param dilation: dilation of the 2D convolution operation. Default: 1
:param groups: number of groups to divide input and output channels into,
:param groups: number of groups into which the input and output channels are divided, so as to perform a "grouped convolution". When ``groups`` is not 1,
so as to perform a "grouped convolution". When groups is not 1,
``in_channels`` and ``out_channels`` must be divisible by ``groups``,
in_channels and out_channels must be divisible by groups,
and the shape of weight should be `(groups, out_channel // groups,
and the shape of weight should be `(groups, out_channel // groups,
in_channels // groups, height, width)`.
in_channels // groups, height, width)`.
:type conv_mode: string or :class:`P.Convolution.Mode`.
:type conv_mode: string or :class:`P.Convolution.Mode`.
:param conv_mode: supports 'CROSS_CORRELATION' or 'CONVOLUTION'. Default:
:param conv_mode: supports 'CROSS_CORRELATION' or 'CONVOLUTION'. Default:
'CROSS_CORRELATION'
'CROSS_CORRELATION'
:param dtype: support for np.dtype, Default: np.int8
:param dtype: support for ``np.dtype``, Default: np.int8
:param scale: scale if use quantization, Default: 0.0
:param scale: scale if use quantization, Default: 0.0
:param zero_point: scale if use quantization quint8, Default: 0.0
:param zero_point: scale if use quantization quint8, Default: 0.0
:type compute_mode: string or
:type compute_mode: string or
:class:`P.Convolution.ComputeMode`.
:class:`P.Convolution.ComputeMode`.
:param compute_mode: when set to 'DEFAULT', no special requirements will be
:param compute_mode: when set to "DEFAULT", no special requirements will be
placed on the precision of intermediate results. When set to 'FLOAT32',
placed on the precision of intermediate results. When set to "FLOAT32",
Float32 would be used for accumulator and intermediate result, but only
"Float32" would be used for accumulator and intermediate result, but only effective when input and output are of Float16 dtype.
effective when input and output are of Float16 dtype.
where :math:`\star` is the valid 2D cross-correlation operator,
where :math:`\star` is the valid 2D cross-correlation operator,
:math:`N` is a batch size, :math:`C` denotes a number of channels,
:math:`N` is batch size, :math:`C` denotes number of channels,
:math:`H` is a height of input planes in pixels, and :math:`W` is
:math:`H` is height of input planes in pixels, and :math:`W` is
width in pixels.
width in pixels.
When `groups == in_channels` and `out_channels == K * in_channels`,
When `groups == in_channels` and `out_channels == K * in_channels`,
...
@@ -120,9 +120,8 @@ class Conv2d(_ConvNd):
...
@@ -120,9 +120,8 @@ class Conv2d(_ConvNd):
:param padding: size of the paddings added to the input on both sides of its
:param padding: size of the paddings added to the input on both sides of its
spatial dimensions. Only zero-padding is supported. Default: 0
spatial dimensions. Only zero-padding is supported. Default: 0
:param dilation: dilation of the 2D convolution operation. Default: 1
:param dilation: dilation of the 2D convolution operation. Default: 1
:param groups: number of groups to divide input and output channels into,
:param groups: number of groups into which the input and output channels are divided, so as to perform a "grouped convolution". When ``groups`` is not 1,
so as to perform a "grouped convolution". When groups is not 1,
``in_channels`` and ``out_channels`` must be divisible by ``groups``,
in_channels and out_channels must be divisible by groups,
and there would be an extra dimension at the beginning of the weight's
and there would be an extra dimension at the beginning of the weight's
shape. Specifically, the shape of weight would be `(groups,
shape. Specifically, the shape of weight would be `(groups,
:param conv_mode: Supports `CROSS_CORRELATION` or `CONVOLUTION`. Default:
:param conv_mode: Supports `CROSS_CORRELATION` or `CONVOLUTION`. Default:
`CROSS_CORRELATION`
`CROSS_CORRELATION`
:param compute_mode: When set to `DEFAULT`, no special requirements will be
:param compute_mode: When set to "DEFAULT", no special requirements will be
placed on the precision of intermediate results. When set to `FLOAT32`,
placed on the precision of intermediate results. When set to "FLOAT32",
float32 would be used for accumulator and intermediate result, but only
"Float32" would be used for accumulator and intermediate result, but only
effective when input and output are of float16 dtype.
effective when input and output are of float16 dtype.
Examples:
Examples:
...
@@ -236,7 +235,7 @@ class ConvTranspose2d(_ConvNd):
...
@@ -236,7 +235,7 @@ class ConvTranspose2d(_ConvNd):
r"""Applies a 2D transposed convolution over an input tensor.
r"""Applies a 2D transposed convolution over an input tensor.
This module is also known as a deconvolution or a fractionally-strided convolution.
This module is also known as a deconvolution or a fractionally-strided convolution.
:class:`ConvTranspose2d` can ben seen as the gradient of :class:`Conv2d` operation
:class:`ConvTranspose2d` can be seen as the gradient of :class:`Conv2d` operation
with respect to its input.
with respect to its input.
Convolution usually reduces the size of input, while transposed convolution works
Convolution usually reduces the size of input, while transposed convolution works
...
@@ -252,8 +251,7 @@ class ConvTranspose2d(_ConvNd):
...
@@ -252,8 +251,7 @@ class ConvTranspose2d(_ConvNd):
:param padding: size of the paddings added to the input on both sides of its
:param padding: size of the paddings added to the input on both sides of its
spatial dimensions. Only zero-padding is supported. Default: 0
spatial dimensions. Only zero-padding is supported. Default: 0
:param dilation: dilation of the 2D convolution operation. Default: 1
:param dilation: dilation of the 2D convolution operation. Default: 1
:param groups: number of groups to divide input and output channels into,
:param groups: number of groups into which the input and output channels are divided, so as to perform a "grouped convolution". When ``groups`` is not 1,
so as to perform a "grouped convolution". When ``groups`` is not 1,
``in_channels`` and ``out_channels`` must be divisible by ``groups``,
``in_channels`` and ``out_channels`` must be divisible by ``groups``,
and there would be an extra dimension at the beginning of the weight's
and there would be an extra dimension at the beginning of the weight's
shape. Specifically, the shape of weight would be ``(groups,
shape. Specifically, the shape of weight would be ``(groups,
...
@@ -262,9 +260,9 @@ class ConvTranspose2d(_ConvNd):
...
@@ -262,9 +260,9 @@ class ConvTranspose2d(_ConvNd):
True
True
:param conv_mode: Supports `CROSS_CORRELATION` or `CONVOLUTION`. Default:
:param conv_mode: Supports `CROSS_CORRELATION` or `CONVOLUTION`. Default:
`CROSS_CORRELATION`
`CROSS_CORRELATION`
:param compute_mode: When set to `DEFAULT`, no special requirements will be
:param compute_mode: When set to "DEFAULT", no special requirements will be
placed on the precision of intermediate results. When set to `FLOAT32`,
placed on the precision of intermediate results. When set to "FLOAT32",
float32 would be used for accumulator and intermediate result, but only
"Float32" would be used for accumulator and intermediate result, but only
effective when input and output are of float16 dtype.
effective when input and output are of float16 dtype.
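
The motivation for a float32 accumulator with float16 inputs can be seen in a small NumPy experiment. This is only a sketch of the numerical effect, not of how MegEngine implements ``compute_mode``; the helper `accumulate` is hypothetical.

```python
import numpy as np


def accumulate(values, acc_dtype):
    """Sum `values` one by one, rounding the running sum to `acc_dtype` at each step."""
    acc = acc_dtype(0)
    for v in values:
        acc = acc_dtype(acc + acc_dtype(v))
    return acc


# float16 inputs, as in a float16 convolution.
values = np.full(10000, 0.01, dtype=np.float16)
exact = float(values.astype(np.float64).sum())

sum_fp16 = float(accumulate(values, np.float16))  # accumulator kept in float16
sum_fp32 = float(accumulate(values, np.float32))  # accumulator kept in float32

err_fp16 = abs(sum_fp16 - exact)
err_fp32 = abs(sum_fp32 - exact)
```

The float16 accumulator loses small addends once the running sum grows large relative to them, while the float32 accumulator stays close to the exact sum.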
"""
"""
...
@@ -342,7 +340,7 @@ class ConvTranspose2d(_ConvNd):
...
@@ -342,7 +340,7 @@ class ConvTranspose2d(_ConvNd):
class LocalConv2d(Conv2d):
class LocalConv2d(Conv2d):
r"""Applies a spatial convolution with untied kernels over an input 4D tensor.
r"""Applies a spatial convolution with unshared kernels over an input 4D tensor.
It is also known as the locally connected layer.
It is also known as the locally connected layer.
:param in_channels: number of input channels.
:param in_channels: number of input channels.
...
@@ -355,9 +353,9 @@ class LocalConv2d(Conv2d):
...
@@ -355,9 +353,9 @@ class LocalConv2d(Conv2d):
:param stride: stride of the 2D convolution operation. Default: 1
:param stride: stride of the 2D convolution operation. Default: 1
:param padding: size of the paddings added to the input on both sides of its
:param padding: size of the paddings added to the input on both sides of its
spatial dimensions. Only zero-padding is supported. Default: 0
spatial dimensions. Only zero-padding is supported. Default: 0
:param groups: number of groups to divide input and output channels into,
:param groups: number of groups into which the input and output channels are divided,
so as to perform a "grouped convolution". When groups is not 1,
so as to perform a "grouped convolution". When ``groups`` is not 1,
in_channels and out_channels must be divisible by groups.
``in_channels`` and ``out_channels`` must be divisible by ``groups``.
The shape of weight is `(groups, output_height, output_width,
The shape of weight is `(groups, output_height, output_width,