Unverified commit 12473236, authored by zhangyingying520, committed by GitHub

Fix the docs of paddle.nn.ChannelShuffle and other APIs (#48742)

* 1

* Update vision.py

; test=docs_preview

* Update vision.py

* Update loss.py

Fix indentation; test=docs_preview

* Update loss.py

* test=document_fix

* test=document_fix

* for codestyle; test=document_fix
Co-authored-by: Ligoml <39876205+Ligoml@users.noreply.github.com>
Co-authored-by: Ligoml <limengliu@tiaozhan.com>
Parent b89cea33
@@ -2346,21 +2346,21 @@ def cross_entropy(
 ):
     r"""
-    By default, this operator implements the cross entropy loss function with softmax. This function
+    By default, the cross entropy loss function is implemented using softmax. This function
     combines the calculation of the softmax operation and the cross entropy loss function
     to provide a more numerically stable computing.
-    This operator will calculate the cross entropy loss function without softmax when use_softmax=False.
-    By default, this operator will calculate the mean of the result, and you can also affect
+    Calculate the cross entropy loss function without softmax when use_softmax=False.
+    By default, calculate the mean of the result, and you can also affect
     the default behavior by using the reduction parameter. Please refer to the part of
     parameters for details.
-    This operator can be used to calculate the softmax cross entropy loss with soft and hard labels.
+    Can be used to calculate the softmax cross entropy loss with soft and hard labels.
     Where, the hard labels mean the actual label value, 0, 1, 2, etc. And the soft labels
     mean the probability of the actual label, 0.6, 0.8, 0.2, etc.
-    The calculation of this operator includes the following two steps.
+    The calculation includes the following two steps.
     - **1.softmax cross entropy**
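To make the rewritten docstring concrete, here is a minimal usage sketch of `paddle.nn.functional.cross_entropy` with hard and soft labels (a sketch assuming the Paddle 2.x signature; shapes and values are illustrative):

```python
import paddle
import paddle.nn.functional as F

# Hard labels: integer class indices; softmax is applied internally by default.
logits = paddle.rand([4, 10])                # [N, C] unscaled logits
hard_labels = paddle.randint(0, 10, [4])     # [N] class indices (int64)
loss = F.cross_entropy(logits, hard_labels)  # reduction='mean' by default

# Soft labels: per-class probabilities along the class axis.
soft_labels = F.softmax(paddle.rand([4, 10]))
loss_soft = F.cross_entropy(logits, soft_labels, soft_label=True)
```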
@@ -3480,7 +3480,7 @@ def cosine_embedding_loss(
     input1, input2, label, margin=0, reduction='mean', name=None
 ):
     r"""
-    This operator computes the cosine embedding loss of Tensor ``input1``, ``input2`` and ``label`` as follows.
+    Compute the cosine embedding loss of Tensor ``input1``, ``input2`` and ``label`` as follows.
     If label = 1, then the loss value can be calculated as follow:
@@ -3496,12 +3496,12 @@ def cosine_embedding_loss(
     .. math::
         cos(x1, x2) = \frac{x1 \cdot{} x2}{\Vert x1 \Vert_2 * \Vert x2 \Vert_2}
     Parameters:
-        input1 (Tensor): tensor with shape: [N, M] or [M], 'N' means batch size, 'M' means the length of input array.
+        input1 (Tensor): tensor with shape: [N, M] or [M], 'N' means batch size, which can be 0, 'M' means the length of input array.
             Available dtypes are float32, float64.
-        input2 (Tensor): tensor with shape: [N, M] or [M], 'N' means batch size, 'M' means the length of input array.
+        input2 (Tensor): tensor with shape: [N, M] or [M], 'N' means batch size, which can be 0, 'M' means the length of input array.
             Available dtypes are float32, float64.
-        label (Tensor): tensor with shape: [N] or [1]. The target labels values should be -1 or 1.
+        label (Tensor): tensor with shape: [N] or [1], 'N' means the length of input array. The target labels values should be -1 or 1.
             Available dtypes are int32, int64, float32, float64.
         margin (float, optional): Should be a number from :math:`-1` to :math:`1`,
             :math:`0` to :math:`0.5` is suggested. If :attr:`margin` is missing, the
...
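A brief usage sketch of the functional form documented above (signature taken from the hunk header; the margin value is illustrative):

```python
import paddle
import paddle.nn.functional as F

input1 = paddle.to_tensor([[1.6, 1.2, -0.5], [3.2, 2.6, -5.8]], dtype='float32')
input2 = paddle.to_tensor([[0.5, 0.5, -1.8], [2.3, -1.4, 1.1]], dtype='float32')
label = paddle.to_tensor([1, -1], dtype='int64')  # 1: similar pair, -1: dissimilar pair

loss = F.cosine_embedding_loss(input1, input2, label, margin=0.5, reduction='mean')
```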
@@ -465,12 +465,12 @@ def pixel_unshuffle(x, downscale_factor, data_format="NCHW", name=None):
 def channel_shuffle(x, groups, data_format="NCHW", name=None):
     """
     This API implements channel shuffle operation.
-    See more details in :ref:`api_nn_vision_ChannelShuffle` .
+    See more details in :ref:`api_nn_vision_ChannelShuffle`.
     Parameters:
         x (Tensor): 4-D tensor, the data type should be float32 or float64.
         groups (int): Number of groups to divide channels in.
-        data_format (str): The data format of the input and output data. An optional string of NCHW or NHWC. The default is NCHW. When it is NCHW, the data is stored in the order of [batch_size, input_channels, input_height, input_width].
+        data_format (str, optional): The data format of the input and output data. An optional string of NCHW or NHWC. The default is NCHW. When it is NCHW, the data is stored in the order of [batch_size, input_channels, input_height, input_width].
         name (str, optional): Name for the operation (optional, default is None). Normally there is no need for user to set this property. For more information, please refer to :ref:`api_guide_Name`.
     Returns:
...
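For reference, a minimal sketch of `paddle.nn.functional.channel_shuffle` (signature as in the hunk above; the input shape is illustrative, and the channel count must be divisible by `groups`):

```python
import paddle
import paddle.nn.functional as F

x = paddle.arange(36, dtype='float32').reshape([1, 6, 3, 2])  # [N, C, H, W]
y = F.channel_shuffle(x, groups=3)  # channels regrouped; shape stays [1, 6, 3, 2]
```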
@@ -138,21 +138,21 @@ class BCEWithLogitsLoss(Layer):
 class CrossEntropyLoss(Layer):
     r"""
-    By default, this operator implements the cross entropy loss function with softmax. This function
+    By default, the cross entropy loss function is implemented using softmax. This function
     combines the calculation of the softmax operation and the cross entropy loss function
     to provide a more numerically stable computing.
-    This operator will calculate the cross entropy loss function without softmax when use_softmax=False.
-    By default, this operator will calculate the mean of the result, and you can also affect
+    Calculate the cross entropy loss function without softmax when use_softmax=False.
+    By default, calculate the mean of the result, and you can also affect
     the default behavior by using the reduction parameter. Please refer to the part of
     parameters for details.
-    This operator can be used to calculate the softmax cross entropy loss with soft and hard labels.
+    Can be used to calculate the softmax cross entropy loss with soft and hard labels.
     Where, the hard labels mean the actual label value, 0, 1, 2, etc. And the soft labels
     mean the probability of the actual label, 0.6, 0.8, 0.2, etc.
-    The calculation of this operator includes the following two steps.
+    The calculation includes the following two steps.
     - **I.softmax cross entropy**
@@ -277,8 +277,8 @@ class CrossEntropyLoss(Layer):
     Shape:
-        - **input** (Tensor), the data type is float32, float64. Shape is
-          :math:`[N_1, N_2, ..., N_k, C]`, where C is number of classes , ``k >= 1`` .
+        - **input** (Tensor), the data type is float32, float64. Shape is :math:`[N_1, N_2, ..., N_k, C]`, where C is number of classes, ``k >= 1`` .
     Note:
         1. when use_softmax=True, it expects unscaled logits. This operator should not be used with the
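A minimal sketch of the layer form, `paddle.nn.CrossEntropyLoss` (constructor arguments assumed from the Paddle 2.x API; shapes are illustrative):

```python
import paddle

loss_fn = paddle.nn.CrossEntropyLoss(reduction='mean')  # use_softmax=True by default
logits = paddle.rand([4, 10])        # [N, C] unscaled logits
labels = paddle.randint(0, 10, [4])  # [N] hard labels
loss = loss_fn(logits, labels)
```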
@@ -1413,11 +1413,11 @@ class CosineEmbeddingLoss(Layer):
             For more information, please refer to :ref:`api_guide_Name`.
     Shape:
-        input1 (Tensor): tensor with shape: [N, M] or [M], 'N' means batch size, 'M' means the length of input array.
+        input1 (Tensor): tensor with shape: [N, M] or [M], 'N' means batch size, which can be 0, 'M' means the length of input array.
             Available dtypes are float32, float64.
-        input2 (Tensor): tensor with shape: [N, M] or [M], 'N' means batch size, 'M' means the length of input array.
+        input2 (Tensor): tensor with shape: [N, M] or [M], 'N' means batch size, which can be 0, 'M' means the length of input array.
             Available dtypes are float32, float64.
-        label (Tensor): tensor with shape: [N] or [1]. The target labels values should be -1 or 1.
+        label (Tensor): tensor with shape: [N] or [1], 'N' means the length of input array. The target labels values should be -1 or 1.
             Available dtypes are int32, int64, float32, float64.
         output (Tensor): Tensor, the cosine embedding Loss of Tensor ``input1`` ``input2`` and ``label``.
             If `reduction` is ``'none'``, the shape of output loss is [N], the same as ``input`` .
...
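And the layer form of the cosine embedding loss, sketched under the same assumptions:

```python
import paddle

loss_fn = paddle.nn.CosineEmbeddingLoss(margin=0.5, reduction='mean')
input1 = paddle.rand([4, 8])  # [N, M]
input2 = paddle.rand([4, 8])  # [N, M]
label = paddle.to_tensor([1, -1, 1, -1], dtype='int64')  # [N], values in {-1, 1}
loss = loss_fn(input1, input2, label)
```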
@@ -156,7 +156,7 @@ class PixelUnshuffle(Layer):
 class ChannelShuffle(Layer):
     """
-    This operator divides channels in a tensor of shape [N, C, H, W] or [N, H, W, C] into g groups,
+    Can divide channels in a tensor of shape [N, C, H, W] or [N, H, W, C] into g groups,
     getting a tensor with the shape of [N, g, C/g, H, W] or [N, H, W, g, C/g], and transposes them
     as [N, C/g, g, H, W] or [N, H, W, g, C/g], then rearranges them to original tensor shape. This
     operation can improve the interaction between channels, using features efficiently. Please
@@ -166,7 +166,7 @@ class ChannelShuffle(Layer):
     Parameters:
         groups (int): Number of groups to divide channels in.
-        data_format (str): The data format of the input and output data. An optional string of NCHW or NHWC. The default is NCHW. When it is NCHW, the data is stored in the order of [batch_size, input_channels, input_height, input_width].
+        data_format (str, optional): The data format of the input and output data. An optional string of NCHW or NHWC. The default is NCHW. When it is NCHW, the data is stored in the order of [batch_size, input_channels, input_height, input_width].
         name (str, optional): Name for the operation (optional, default is None). Normally there is no need for user to set this property. For more information, please refer to :ref:`api_guide_Name`.
     Shape:
...
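A minimal sketch of the `paddle.nn.ChannelShuffle` layer (parameters as documented in the hunk above; the input shape is illustrative):

```python
import paddle

shuffle = paddle.nn.ChannelShuffle(groups=3, data_format='NCHW')
x = paddle.arange(36, dtype='float32').reshape([1, 6, 3, 2])  # [N, C, H, W], C divisible by groups
y = shuffle(x)  # same shape; channels interleaved across the 3 groups
```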