Unverified commit f4a94a99, authored by Chen Long, committed by GitHub

add_three_args (#2318)

* add_three_args

* add broadcasting index and some common args text=develop

* fix broadcasting index test=develop

* fix broadcasting test=develop
Parent 09a87cf3
.. _cn_user_guide_broadcasting:
==================
Broadcasting
==================

PaddlePaddle (hereafter Paddle), like other frameworks, provides broadcasting semantics in some of its APIs, allowing tensors with different shapes to be used in certain operations.
Typically, given a tensor with a smaller shape and one with a larger shape, we want to reuse the smaller tensor several times when operating on the larger one; conceptually, the smaller tensor's shape is first expanded to match that of the larger tensor, and then the operation is performed.
......
.. _user_guide_broadcasting:
==================
Broadcasting
==================

PaddlePaddle, like other deep learning frameworks, provides broadcasting semantics in some of its APIs, which allows tensors with different shapes to be used in the same operation.
In general, broadcasting is the rule by which the smaller tensor is "broadcast" across the larger tensor so that the two end up with the same shape.
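As a quick illustration (a minimal sketch using the imperative fluid API of this release; the shapes are arbitrary), adding a tensor of shape [2, 3, 4] to one of shape [4] broadcasts the smaller tensor along the trailing dimension:

.. code-block:: python

    import numpy as np
    import paddle.fluid as fluid

    with fluid.dygraph.guard():
        x = fluid.dygraph.to_variable(np.ones((2, 3, 4), dtype="float32"))
        y = fluid.dygraph.to_variable(np.ones((4,), dtype="float32"))
        # y is broadcast across the leading dimensions of x
        z = fluid.layers.elementwise_add(x, y)
        print(z.shape)  # [2, 3, 4]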
......@@ -98,4 +98,4 @@ For example:
z = paddle.elementwise_add(x, y, axis=1)
print(z.shape)
# z's shape is [2, 3, 4, 5]
# Dimensions are compared starting at axis=1, from front to back.
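A runnable version of the snippet above might look like the following sketch, written with the equivalent fluid.layers.elementwise_add call; the shapes of x and y ([2, 3, 4, 5] and [3, 4]) are assumptions chosen to match the commented output, and with axis=1 the dimensions of y are aligned with x starting at axis 1:

.. code-block:: python

    import numpy as np
    import paddle.fluid as fluid

    with fluid.dygraph.guard():
        x = fluid.dygraph.to_variable(np.ones((2, 3, 4, 5), dtype="float32"))
        y = fluid.dygraph.to_variable(np.ones((3, 4), dtype="float32"))
        # y's dimensions are matched against x starting at axis=1
        z = fluid.layers.elementwise_add(x, y, axis=1)
        print(z.shape)  # [2, 3, 4, 5]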
......@@ -11,7 +11,7 @@
- `Operator <operator.html>`_ : An Operator represents an operation on data.
- `Program <program.html>`_ : A Program describes the computation process.
- `Executor <executor.html>`_ : An Executor is the execution engine.
- `Broadcasting <broadcasting.html>`_ : Notes on Paddle's support for broadcasting.
.. toctree::
:hidden:
......@@ -22,4 +22,4 @@
operator.rst
program.rst
executor.rst
broadcasting.rst
......@@ -6,13 +6,13 @@ This section introduces the basic concepts of Paddle:
- `Guide to Fluid Programming <./programming_guide/programming_guide_en.html>`_ : introduces the basic concepts and usage of Paddle.
- `LoD-Tensor User Guide <lod_tensor_en.html>`_ : LoD-Tensor is a high-level feature of Paddle. It adds sequence information on top of tensors and supports processing variable-length data.
- `Broadcasting <broadcasting_en.html>`_ : introduces the broadcasting semantics provided by Paddle.
.. toctree::
:hidden:
programming_guide/programming_guide_en.md
lod_tensor_en.rst
broadcasting_en.rst
.. _cn_user_guide_Operator:
=========
Operator
=========

In PaddlePaddle (hereafter Paddle), every operation on data is represented by an Operator.
......
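For example (a minimal sketch under the static-graph fluid API; the specific layer is only an illustration), calling a layer API appends the corresponding operator to the current Program:

.. code-block:: python

    import paddle.fluid as fluid

    # Calling a layer API adds the corresponding operator to the default Program.
    x = fluid.data(name="x", shape=[None, 3], dtype="float32")
    y = fluid.layers.relu(x)

    # Inspect the operators recorded in the global block.
    print([op.type for op in fluid.default_main_program().global_block().ops])
    # e.g. ['relu']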
......@@ -22,33 +22,45 @@ common_args_en = """
include_sublayers (bool, optional): Whether to include the sublayers. If True, the returned list includes the sublayers' weights. Default is True.
stride (tuple|int): The stride size. It can be a single integer or a tuple containing two integers, representing the strides of the convolution along the height and width. If it is a single integer, the strides along the height and width are equal to that integer. Default is 1.
groups (int, optional): The group number of the convolution layer. When groups=n, the input and the convolution kernels are divided evenly into n groups; the first group of kernels is convolved with the first group of inputs, the second group of kernels with the second group of inputs, ..., and the nth group of kernels with the nth group of inputs. Default is 1.
regularization (WeightDecayRegularizer, optional): The strategy of regularization. There are two methods: :ref:`api_fluid_regularizer_L1Decay` , :ref:`api_fluid_regularizer_L2Decay` . If a parameter has already set a regularizer using :ref:`api_fluid_ParamAttr` , the regularization setting here in the optimizer is ignored for that parameter; otherwise, the setting here takes effect. Default is None, meaning there is no regularization.
dilation (tuple|int): The dilation size. It can be a single integer or a tuple containing two integers, representing the height and width of dilation of the convolution kernel elements. If it is a single integer, the height and width of dilation are equal to that integer. Default is 1.
stop_gradient (bool, optional): A boolean that indicates whether gradients should flow. Default is True, which means gradients are not calculated for this variable.
force_cpu (bool, optional): Whether to force the output tensor to be stored in CPU memory. If force_cpu is False, the output tensor is stored in the memory of the running device; otherwise it is stored in CPU memory. Default is False.
data_format (str, optional): Specify the input data format; the output data format will be consistent with the input. It can be "NCHW" or "NHWC". N is batch size, C is channels, H is height, and W is width. Default is "NCHW".
grad_clip (GradientClipBase, optional): Gradient clipping strategy; it is an instance of a class derived from ``GradientClipBase`` . There are three clipping strategies ( :ref:`api_fluid_clip_GradientClipByGlobalNorm` , :ref:`api_fluid_clip_GradientClipByNorm` , :ref:`api_fluid_clip_GradientClipByValue` ). Default is None, meaning there is no gradient clipping.
num_filters (int): The number of filters. It is the same as the number of output channels.
dim (int, optional): A dimension along which to operate. Default is 0.
is_sparse (bool, optional): Whether to use sparse updating. For more information, please refer to :ref:`api_guide_sparse_update_en` . If it is True, sparse updating is used.
"""
common_args_cn = """
x (Tensor) - The input Tensor. Supported data types: float32, float64, int32, int64.
y (Tensor) - The input Tensor. Supported data types: float32, float64, int32, int64.
name (str, optional) - The name of the operation (optional; the default value is None). For more information, please refer to :ref:`api_guide_Name`.
dtype (str, optional) - The data type of the output Tensor. Supported types: int32, int64, float32, float64.
param_attr (ParamAttr, optional) - The parameter attribute of the learnable weights (Parameter) of this Layer. For more information, please refer to :ref:`cn_api_fluid_ParamAttr`.
bias_attr (ParamAttr, optional) - The parameter attribute of the learnable bias (Bias) of this Layer. For more information, please refer to :ref:`cn_api_fluid_ParamAttr`.
label (Tensor) - The label of the training data. Supported data types: int32, int64.
learning_rate (Tensor|float) - The learning rate. It can be a Tensor or a float. The default value is 1e-03.
axis (int, optional) - The axis of the input Tensor along which to operate. The default value is 0.
epsilon (float, optional) - A value added to the denominator to prevent division by zero. The default value is 1e-05.
is_test (bool, optional) - Whether execution is in the test phase. The default value is False, meaning it is not the test phase.
shape (Tensor|tuple|list) - The shape of the Tensor. If shape is a list or tuple, its elements should be integers or Tensors of shape [1]. If shape is a Tensor, it should be a 1-D Tensor.
keep_dim (bool) - Whether to keep the reduced dimension in the output Tensor. If keep_dim is True, the reduced dimension is retained; otherwise the result tensor has fewer dimensions than the input tensor. The default value is False.
filter_size (tuple|list|int) - The size of the convolution kernel. It can be a single integer or a tuple/list containing two integers, representing the height and width of the kernel. If it is a single integer, the height and width of the kernel are both equal to that integer.
padding (tuple|int) - The padding size. It can be a single integer or a tuple containing two integers, representing the padding on both sides of the input along the height and width. If it is a single integer, the padding along the height and width are both equal to that integer. The default value is 0.
include_sublayers (bool, optional) - Whether to return the parameters of the sublayers. If True, the returned list includes the parameters of the sublayers. The default value is True.
stride (tuple|int) - The stride size. It can be a single integer or a tuple containing two integers, representing the strides of the convolution along the height and width. If it is a single integer, the strides along the height and width are both equal to that integer. The default value is 1.
groups (int, optional) - The number of groups of the convolution. When groups=n, the input and the convolution kernels are divided evenly into n groups; the first group of kernels is convolved with the first group of inputs, the second group of kernels with the second group of inputs, ..., and the nth group of kernels with the nth group of inputs. The default value is 1.
regularization (WeightDecayRegularizer, optional) - The regularization strategy. Two strategies are supported: :ref:`cn_api_fluid_regularizer_L1Decay` , :ref:`cn_api_fluid_regularizer_L2Decay` . If a parameter has already set its regularizer in :ref:`cn_api_fluid_ParamAttr` , the regularization set here is ignored for that parameter; only when no regularizer is set in :ref:`cn_api_fluid_ParamAttr` does the setting here take effect. The default value is None, meaning there is no regularization.
dilation (tuple|int, optional) - The dilation size. It can be a single integer or a tuple containing two integers, representing the dilation of the kernel elements along the height and width. If it is a single integer, the dilation along the height and width are both equal to that integer. The default value is 1.
stop_gradient (bool, optional) - Whether to stop computing gradients. The default value is True, meaning gradients are not calculated.
force_cpu (bool, optional) - Whether to force the output Tensor to be written to CPU memory. If False, the output Tensor is written to the memory of the current running device; otherwise it is written to CPU memory. The default value is False.
data_format (str, optional) - Specify the input data format; the output data format will be consistent with the input. It can be "NCHW" or "NHWC". N is batch size, C is channels, H is height, and W is width. The default value is "NCHW".
grad_clip (GradientClipBase, optional) - The gradient clipping strategy. Three clipping strategies are supported: :ref:`cn_api_fluid_clip_GradientClipByGlobalNorm` , :ref:`cn_api_fluid_clip_GradientClipByNorm` , :ref:`cn_api_fluid_clip_GradientClipByValue` . The default value is None, meaning there is no gradient clipping.
num_filters (int) - The number of convolution kernels (filters), the same as the number of output channels.
dim (int, optional) - The dimension of the input Tensor along which to operate. The default value is 0.
is_sparse (bool, optional) - Whether to use sparse updating. For more information, please refer to :ref:`api_guide_sparse_update` . The default value is True, meaning sparse updating is used.
"""