cn api split all (#1041)

5880ad2b · Hao Wang · Cheerego · bec74123 · 5880ad2b · 5880ad2b
480 changed files
--- a/doc/fluid/api_cn/average_cn.rst
+++ b/doc/fluid/api_cn/average_cn.rst
-#################
+=======================
- fluid.average
+fluid.average
-#################
+=======================
-.. _cn_api_fluid_average_WeightedAverage:
+..  toctree::
-WeightedAverage
+    :maxdepth: 1
-------------------------------
+    average_cn/WeightedAverage_cn.rst
-.. py:class:: paddle.fluid.average.WeightedAverage
-计算加权平均值。
-平均计算完全通过Python完成。它们不会改变Paddle的程序，也不会修改NN模型的配置。它们完全是Python函数的包装器。
-**示例代码**
-.. code-block:: python
-            import paddle.fluid as fluid
-            avg = fluid.average.WeightedAverage()
-            avg.add(value=2.0, weight=1)
-            avg.add(value=4.0, weight=2)
-            avg.eval()
-            # 结果为 3.333333333.
-            # 因为 (2.0 * 1 + 4.0 * 2) / (1 + 2) = 3.333333333
--- a/doc/fluid/api_cn/average_cn/WeightedAverage_cn.rst
+++ b/doc/fluid/api_cn/average_cn/WeightedAverage_cn.rst
+.. _cn_api_fluid_average_WeightedAverage:
+WeightedAverage
+-------------------------------
+.. py:class:: paddle.fluid.average.WeightedAverage
+Computes a weighted average.
+The averaging is done entirely in Python. It does not change the Paddle Program or modify the NN model configuration; it is purely a wrapper around Python functions.
+**Example code**
+.. code-block:: python
+            import paddle.fluid as fluid
+            avg = fluid.average.WeightedAverage()
+            avg.add(value=2.0, weight=1)
+            avg.add(value=4.0, weight=2)
+            avg.eval()
+            # The result is 3.333333333,
+            # because (2.0 * 1 + 4.0 * 2) / (1 + 2) = 3.333333333
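The arithmetic behind ``add`` and ``eval`` can be sketched in plain Python. This is only an illustration of the weighted-average formula, not Paddle's implementation; the class name ``WeightedAverageSketch`` and its attributes are hypothetical.

```python
class WeightedAverageSketch:
    """Hypothetical stand-in that mimics the add/eval interface (not Paddle code)."""

    def __init__(self):
        self.numerator = 0.0    # running sum of value * weight
        self.denominator = 0.0  # running sum of weights

    def add(self, value, weight):
        # Accumulate one weighted observation.
        self.numerator += value * weight
        self.denominator += weight

    def eval(self):
        # Refuse to divide by zero when nothing has been added yet.
        if self.denominator == 0:
            raise ValueError("add() must be called before eval()")
        return self.numerator / self.denominator


avg = WeightedAverageSketch()
avg.add(value=2.0, weight=1)
avg.add(value=4.0, weight=2)
result = avg.eval()  # (2.0 * 1 + 4.0 * 2) / (1 + 2)
```

Keeping only two running sums means the average can be updated batch by batch without storing the individual values.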
--- a/doc/fluid/api_cn/backward_cn.rst
+++ b/doc/fluid/api_cn/backward_cn.rst
-#################
+=======================
- fluid.backward
+fluid.backward
-#################
+=======================
-.. _cn_api_fluid_backward_append_backward:
+..  toctree::
-append_backward
+    :maxdepth: 1
-------------------------------
+    backward_cn/append_backward_cn.rst
-.. py:function:: paddle.fluid.backward.append_backward(loss, parameter_list=None, no_grad_set=None, callbacks=None)
+    backward_cn/gradients_cn.rst
-将向 ``main_program`` 追加 ``backward`` 。
-完整的神经网络训练由前向和反向传播组成。但是当我们配置网络时，我们只需要指定其前向部分。通过该功能，根据前向部分自动生成反向部分。
-在大多数情况下，用户无需手动调用此功能。它将由优化程序的最小化函数自动调用。
-参数：
-    - **loss** （Variable）- 网络的损失变量。
-    - **parameter_list** （list [string] | None）- 优化器需要更新的参数名称。如果为None，则将更新所有参数。默认值：None。
-    - **no_grad_set** （set | None）- ``block`` 0中变量的梯度应该被忽略。所有 ``block`` 中带有 ``step_gradient = True`` 的所有变量都将自动添加到此集合中。默认值：None。
-    - **callbacks** （list [callable object] | None）- 回调用于在反向传播构建中执行一些自定义作业。每次将新的梯度运算符添加到程序中时，将调用其中的所有可调用对象。可调用对象必须有两个输入参数： ``block`` 和 ``context`` 。 ``block`` 是将被添加到新梯度算子的块。 ``context`` 是一个映射，其键是梯度变量名，值是对应的原始变量。除此之外， ``context`` 还有另一个特殊的键值对：键是字符串 ``__ current_op_desc__`` ，值是刚刚触发可调用对象的梯度运算符的 ``op_desc`` 。
-返回：   成对参数及其相应的梯度。键是参数，值是梯度变量。
-返回类型：       list[(Variable,Variable)]
-抛出：     
-    - ``AssertionError`` - 如果loss不是Variable的实例。
-**示例代码**
-.. code-block:: python
-        # 网络配置
-        # 损失计算
-        import paddle.fluid as fluid
-        x = fluid.layers.data(name='x', shape=[13], dtype='float32')
-        y = fluid.layers.data(name='y', shape=[1], dtype='float32') 
-        y_predict = fluid.layers.fc(input=x, size=1, act=None)
-        loss = fluid.layers.square_error_cost(input=y_predict, label=y)
-        avg_loss = fluid.layers.mean(loss)
-        param_grad_list = fluid.backward.append_backward(loss=avg_loss)
-.. _cn_api_fluid_backward_gradients:
-gradients
-------------------------------
-.. py:function:: paddle.fluid.backward.gradients(targets, inputs, target_gradients=None, no_grad_set=None)
-将目标梯度反向传播到输入。
-参数：  
-  - **targets** (Variable|list[Variable]) – 目标变量
-  - **inputs** (Variable|list[Variable]) – 输入变量
-  - **target_gradients** (Variable|list[Variable]|None) – 目标的梯度变量，应与目标变量形状相同；如果设置为None，则以1初始化所有梯度变量
-  - **no_grad_sethread** (set[string]) – 在Block 0中不具有梯度的变量，所有block中被设置 ``stop_gradient=True`` 的变量将被自动加入该set
-返回：数组，包含与输入对应的梯度。如果一个输入不影响目标函数，则对应的梯度变量为None
-返回类型：(list[Variable])
-**示例代码**
-.. code-block:: python
-            import paddle.fluid as fluid
-            x = fluid.layers.data(name='x', shape=[2,8,8], dtype='float32')
-            x.stop_gradient=False
-            y = fluid.layers.conv2d(x, 4, 1, bias_attr=False)
-            y = fluid.layers.relu(y)
-            y = fluid.layers.conv2d(y, 4, 1, bias_attr=False)
-            y = fluid.layers.relu(y)
-            z = fluid.gradients([y], x)
-            print(z)
\ No newline at end of file
--- a/doc/fluid/api_cn/backward_cn/append_backward_cn.rst
+++ b/doc/fluid/api_cn/backward_cn/append_backward_cn.rst
+.. _cn_api_fluid_backward_append_backward:
+append_backward
+-------------------------------
+.. py:function:: paddle.fluid.backward.append_backward(loss, parameter_list=None, no_grad_set=None, callbacks=None)
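+Appends the ``backward`` part to ``main_program`` .
+A complete neural network training pass consists of forward and backward propagation. When configuring a network, however, only the forward part needs to be specified; this function generates the backward part automatically from the forward part.
+In most cases users do not need to call this function manually. It is called automatically by the optimizer's ``minimize`` function.
+Parameters:
+    - **loss** (Variable) - The loss variable of the network.
+    - **parameter_list** (list[string] | None) - Names of the parameters that the optimizer should update. If None, all parameters are updated. Default: None.
+    - **no_grad_set** (set | None) - Variables in ``block`` 0 whose gradients should be ignored. All variables with ``stop_gradient=True`` in every ``block`` are added to this set automatically. Default: None.
+    - **callbacks** (list[callable object] | None) - Callbacks that run custom jobs while the backward part is being built. Every callable in the list is invoked each time a new gradient operator is added to the program. A callable must take two arguments: ``block`` and ``context`` . ``block`` is the block to which the new gradient operator is added. ``context`` is a map whose keys are gradient variable names and whose values are the corresponding original variables. In addition, ``context`` holds one special key-value pair: the key is the string ``__current_op_desc__`` and the value is the ``op_desc`` of the gradient operator that just triggered the callable.
+Returns: Pairs of parameters and their corresponding gradients. In each pair the first element is the parameter and the second is its gradient variable.
+Return type: list[(Variable, Variable)]
+Raises:
+    - ``AssertionError`` - If ``loss`` is not an instance of Variable.
+**Example code**
+.. code-block:: python
+        # Network configuration
+        # Loss computation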
+        import paddle.fluid as fluid
+        x = fluid.layers.data(name='x', shape=[13], dtype='float32')
+        y = fluid.layers.data(name='y', shape=[1], dtype='float32') 
+        y_predict = fluid.layers.fc(input=x, size=1, act=None)
+        loss = fluid.layers.square_error_cost(input=y_predict, label=y)
+        avg_loss = fluid.layers.mean(loss)
+        param_grad_list = fluid.backward.append_backward(loss=avg_loss)
--- a/doc/fluid/api_cn/backward_cn/gradients_cn.rst
+++ b/doc/fluid/api_cn/backward_cn/gradients_cn.rst
+.. _cn_api_fluid_backward_gradients:
+gradients
+-------------------------------
+.. py:function:: paddle.fluid.backward.gradients(targets, inputs, target_gradients=None, no_grad_set=None)
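+Backpropagates the gradients of the targets to the inputs.
+Parameters:
+  - **targets** (Variable|list[Variable]) – The target variables.
+  - **inputs** (Variable|list[Variable]) – The input variables.
+  - **target_gradients** (Variable|list[Variable]|None) – Gradient variables of the targets; each should have the same shape as its target. If None, all gradient variables are initialized to 1.
+  - **no_grad_set** (set[string]) – Variables in Block 0 that have no gradient. Variables with ``stop_gradient=True`` in any block are added to this set automatically.
+Returns: A list containing the gradients for the inputs. If an input does not affect the targets, its gradient variable is None.
+Return type: (list[Variable])
+**Example code**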
+.. code-block:: python
+            import paddle.fluid as fluid
+            x = fluid.layers.data(name='x', shape=[2,8,8], dtype='float32')
+            x.stop_gradient=False
+            y = fluid.layers.conv2d(x, 4, 1, bias_attr=False)
+            y = fluid.layers.relu(y)
+            y = fluid.layers.conv2d(y, 4, 1, bias_attr=False)
+            y = fluid.layers.relu(y)
+            z = fluid.gradients([y], x)
+            print(z)
\ No newline at end of file
--- a/doc/fluid/api_cn/clip_cn.rst
+++ b/doc/fluid/api_cn/clip_cn.rst
-#################
+=======================
- fluid.clip
+fluid.clip
-#################
+=======================
-.. _cn_api_fluid_clip_ErrorClipByValue:
+..  toctree::
-ErrorClipByValue
+    :maxdepth: 1
-------------------------------
+    clip_cn/ErrorClipByValue_cn.rst
-.. py:class:: paddle.fluid.clip.ErrorClipByValue(max, min=None)
+    clip_cn/GradientClipByGlobalNorm_cn.rst
+    clip_cn/GradientClipByNorm_cn.rst
-将张量值的范围压缩到 [min, max]。
+    clip_cn/GradientClipByValue_cn.rst
-给定一个张量 ``t`` ，该操作将它的值压缩到 ``min`` 和 ``max``  之间
- 任何小于min（最小值）的值都被设置为min
- 任何大于max（最大值）的值都被设置为max
-参数:
- - **max** (foat) - 要修剪的最大值。
- - **min** (float) - 要修剪的最小值。如果用户没有设置，将被框架默认设置为 ``-max`` 
-**代码示例**
-.. code-block:: python
-     import paddle.fluid as fluid
-     BATCH_SIZE = 128
-     CLIP_MAX = 2e-6
-     CLIP_MIN = -1e-6
-     prog = fluid.framework.Program()
-     with fluid.program_guard(main_program=prog):
-        image = fluid.layers.data(name='x', shape=[784], dtype='float32')
-        hidden1 = fluid.layers.fc(input=image, size=128, act='relu')
-        hidden2 = fluid.layers.fc(input=hidden1, size=64, act='relu')
-        predict = fluid.layers.fc(input=hidden2, size=10, act='softmax')
-        label = fluid.layers.data(name='y', shape=[1], dtype='int64')
-        cost = fluid.layers.cross_entropy(input=predict, label=label)
-        avg_cost = fluid.layers.mean(cost)
-     prog_clip = prog.clone()
-     prog_clip.block(0).var(hidden1.name)._set_error_clip(
-        fluid.clip.ErrorClipByValue(
-            max=CLIP_MAX, min=CLIP_MIN)
-.. _cn_api_fluid_clip_GradientClipByGlobalNorm:
-GradientClipByGlobalNorm
-------------------------------
-.. py:class:: paddle.fluid.clip.GradientClipByGlobalNorm(clip_norm, group_name='default_group')
-通过多个张量的范数之和的比率来剪切（clip）多个张量。
-给定一个张量列表 :math:`t\_list` 和一个剪切比率 ``clip_norm`` ，返回一个被剪切的张量列表list_clipped和 :math:`t\_list` 中所有张量的全局范数(global_norm)。
-剪切过程如下：
-.. math::
-            \\t\_list[i]=t\_list[i]∗\frac{clip\_norm}{max(global\_norm,clip\_norm)}\\
-其中：
-.. math::            
-            \\global\_norm=\sqrt{\sum_{i=0}^{n-1}(l2norm(t\_list[i]))^2}\\
-如果 :math:`clip\_norm>global\_norm` ， :math:`t\_list` 中的张量保持不变，否则它们都会按照全局比率缩减。
-参数:
- - **clip_norm** (float) - 范数最大值
- - **group_name** (str, optional) - 剪切的组名
-**代码示例**
-.. code-block:: python
-    import paddle.fluid as fluid
-    prog = fluid.framework.Program()
-    startup_program = fluid.framework.Program()
-    with fluid.program_guard(
-            main_program=prog, startup_program=startup_program):
-        image = fluid.layers.data(name='x', shape=[784], dtype='float32')
-        label = fluid.layers.data(name='y', shape=[1], dtype='int64')
-        hidden1 = fluid.layers.fc(input=image, size=128, act='relu')
-        hidden2 = fluid.layers.fc(input=hidden1, size=64, act='relu')
-        predict = fluid.layers.fc(input=hidden2, size=10, act='softmax')
-        cost = fluid.layers.cross_entropy(input=predict, label=label)
-        avg_cost = fluid.layers.mean(cost)
-    prog_clip = prog.clone()
-    avg_cost_clip = prog_clip.block(0).var(avg_cost.name)
-    p_g_clip = fluid.backward.append_backward(loss=avg_cost_clip)
-    with fluid.program_guard(main_program=prog_clip):
-        fluid.clip.set_gradient_clip(
-            fluid.clip.GradientClipByGlobalNorm(clip_norm=2.0))
-        p_g_clip = fluid.clip.append_gradient_clip_ops(p_g_clip)
-.. _cn_api_fluid_clip_GradientClipByNorm:
-GradientClipByNorm
-------------------------------
-.. py:class:: paddle.fluid.clip.GradientClipByNorm(clip_norm)
-将张量转换为L2范数不超过 ``clip_norm`` 的张量
-该operator 限制了 输入张量 :math:`X` 的L2范数不会超过 :math:`max\_norm` 。如果 :math:`X` 的 ``L2`` 范数小于或等于 :math:`max\_norm` ,输出和 :math:`X` 一样，如果 :math:`X` 的L2范数大于 :math:`max\_norm` , :math:`X` 将被线性缩放到L2范数等于 :math:`max\_norm` ,如以下公式所示:
-.. math::
-            \\Out = \frac{max\_norm∗X}{norm(X)}\\
-其中 :math:`norm（X）` 代表 :math:`X` 的 L2 范数
-参数:
- - **clip_norm** (float) - 二范数最大值
-**代码示例**
-.. code-block:: python
-    import paddle.fluid as fluid
-    w_param_attrs = fluid.ParamAttr(name=None,
-                                    initializer=fluid.initializer.UniformInitializer(low=-1.0, high=1.0, seed=0),
-                                    learning_rate=1.0,
-                                    regularizer=fluid.regularizer.L1Decay(1.0),
-                                    trainable=True,
-                                    gradient_clip=fluid.clip.GradientClipByNorm(clip_norm=2.0))
-    x = fluid.layers.data(name='x', shape=[10], dtype='float32')
-    y_predict = fluid.layers.fc(input=x, size=1, param_attr=w_param_attrs)
-.. _cn_api_fluid_clip_GradientClipByValue:
-GradientClipByValue
-------------------------------
-.. py:class:: paddle.fluid.clip.GradientClipByValue(max, min=None)
-将梯度值(gradient values)的范围压缩到 [min, max]。
-给定一个张量 ``t`` ，该操作将它的值压缩到 ``min`` 和 ``max`` 之间
- 任何小于最小值的值都被设置为最小值
- 任何大于max的值都被设置为max
-参数:
- - **max** (foat) - 要修剪的最大值。
- - **min** (float，optional) - 要修剪的最小值。如果用户没有设置，将被 ``framework`` 设置为 ``-max`` 。
-**代码示例**
-.. code-block:: python
-     import paddle.fluid as fluid
-     w_param_attrs = fluid.ParamAttr(name=None,
-                                     initializer=fluid.initializer.UniformInitializer(low=-1.0, high=1.0, seed=0),
-                                     learning_rate=1.0,
-                                     regularizer=fluid.regualrizer.L1Decay(1.0),
-                                     trainable=True,
-                                     gradient_clip=fluid.clip.GradientClipByValue(-1.0, 1.0))
-     x = fluid.layers.data(name='x', shape=[10], dtype='float32')
-     y_predict = fluid.layers.fc(input=x, size=1, param_attr=w_param_attrs)
--- a/doc/fluid/api_cn/clip_cn/ErrorClipByValue_cn.rst
+++ b/doc/fluid/api_cn/clip_cn/ErrorClipByValue_cn.rst
+.. _cn_api_fluid_clip_ErrorClipByValue:
+ErrorClipByValue
+-------------------------------
+.. py:class:: paddle.fluid.clip.ErrorClipByValue(max, min=None)
+Clips tensor values to the range [min, max].
+Given a tensor ``t`` , this operation clips its values to lie between ``min`` and ``max`` :
+- Any value less than min is set to min.
+- Any value greater than max is set to max.
+Parameters:
+ - **max** (float) - The maximum value to clip to.
+ - **min** (float, optional) - The minimum value to clip to. If not set by the user, the framework sets it to ``-max`` by default.
+**Example code**
+.. code-block:: python
+     import paddle.fluid as fluid
+     BATCH_SIZE = 128
+     CLIP_MAX = 2e-6
+     CLIP_MIN = -1e-6
+     prog = fluid.framework.Program()
+     with fluid.program_guard(main_program=prog):
+        image = fluid.layers.data(name='x', shape=[784], dtype='float32')
+        hidden1 = fluid.layers.fc(input=image, size=128, act='relu')
+        hidden2 = fluid.layers.fc(input=hidden1, size=64, act='relu')
+        predict = fluid.layers.fc(input=hidden2, size=10, act='softmax')
+        label = fluid.layers.data(name='y', shape=[1], dtype='int64')
+        cost = fluid.layers.cross_entropy(input=predict, label=label)
+        avg_cost = fluid.layers.mean(cost)
+     prog_clip = prog.clone()
+     prog_clip.block(0).var(hidden1.name)._set_error_clip(
+        fluid.clip.ErrorClipByValue(
+            max=CLIP_MAX, min=CLIP_MIN))
--- a/doc/fluid/api_cn/clip_cn/GradientClipByGlobalNorm_cn.rst
+++ b/doc/fluid/api_cn/clip_cn/GradientClipByGlobalNorm_cn.rst
+.. _cn_api_fluid_clip_GradientClipByGlobalNorm:
+GradientClipByGlobalNorm
+-------------------------------
+.. py:class:: paddle.fluid.clip.GradientClipByGlobalNorm(clip_norm, group_name='default_group')
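+Clips multiple tensors by the ratio of the sum of their norms.
+Given a list of tensors :math:`t\_list` and a clipping ratio ``clip_norm`` , returns a list of clipped tensors ``list_clipped`` and the global norm ( ``global_norm`` ) of all tensors in :math:`t\_list` .
+The clipping is performed as follows:
+.. math::
+            t\_list[i] = t\_list[i] \cdot \frac{clip\_norm}{\max(global\_norm, clip\_norm)}
+where:
+.. math::
+            global\_norm = \sqrt{\sum_{i=0}^{n-1}(l2norm(t\_list[i]))^2}
+If :math:`clip\_norm > global\_norm` , the tensors in :math:`t\_list` are left unchanged; otherwise they are all scaled down by the global ratio.
+Parameters:
+ - **clip_norm** (float) - The maximum norm value.
+ - **group_name** (str, optional) - The name of the clipping group.
+**Example code**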
+.. code-block:: python
+    import paddle.fluid as fluid
+    prog = fluid.framework.Program()
+    startup_program = fluid.framework.Program()
+    with fluid.program_guard(
+            main_program=prog, startup_program=startup_program):
+        image = fluid.layers.data(name='x', shape=[784], dtype='float32')
+        label = fluid.layers.data(name='y', shape=[1], dtype='int64')
+        hidden1 = fluid.layers.fc(input=image, size=128, act='relu')
+        hidden2 = fluid.layers.fc(input=hidden1, size=64, act='relu')
+        predict = fluid.layers.fc(input=hidden2, size=10, act='softmax')
+        cost = fluid.layers.cross_entropy(input=predict, label=label)
+        avg_cost = fluid.layers.mean(cost)
+    prog_clip = prog.clone()
+    avg_cost_clip = prog_clip.block(0).var(avg_cost.name)
+    p_g_clip = fluid.backward.append_backward(loss=avg_cost_clip)
+    with fluid.program_guard(main_program=prog_clip):
+        fluid.clip.set_gradient_clip(
+            fluid.clip.GradientClipByGlobalNorm(clip_norm=2.0))
+        p_g_clip = fluid.clip.append_gradient_clip_ops(p_g_clip)
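The global-norm formula can be checked by hand with a small pure-Python sketch. It treats each tensor as a flat list of floats and is not Paddle code; the function name ``clip_by_global_norm`` and the sample values are assumptions made for illustration.

```python
import math


def clip_by_global_norm(t_list, clip_norm):
    """Scale every tensor by clip_norm / max(global_norm, clip_norm)."""
    # global_norm = sqrt(sum of squared L2 norms of all tensors)
    global_norm = math.sqrt(sum(x * x for t in t_list for x in t))
    scale = clip_norm / max(global_norm, clip_norm)
    clipped = [[x * scale for x in t] for t in t_list]
    return clipped, global_norm


grads = [[3.0, 4.0], [12.0]]  # global norm = sqrt(9 + 16 + 144) = 13
clipped, global_norm = clip_by_global_norm(grads, clip_norm=6.5)
# scale = 6.5 / 13 = 0.5, so every element is halved

unchanged, _ = clip_by_global_norm(grads, clip_norm=20.0)
# clip_norm exceeds the global norm, so scale = 1 and nothing changes
```

Because a single scale factor is shared by all tensors, the direction of the combined gradient is preserved; only its length is reduced.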
--- a/doc/fluid/api_cn/clip_cn/GradientClipByNorm_cn.rst
+++ b/doc/fluid/api_cn/clip_cn/GradientClipByNorm_cn.rst
+.. _cn_api_fluid_clip_GradientClipByNorm:
+GradientClipByNorm
+-------------------------------
+.. py:class:: paddle.fluid.clip.GradientClipByNorm(clip_norm)
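+Converts a tensor into one whose L2 norm does not exceed ``clip_norm`` .
+This operator limits the L2 norm of the input tensor :math:`X` to at most :math:`max\_norm` (the value passed as ``clip_norm`` ). If the L2 norm of :math:`X` is less than or equal to :math:`max\_norm` , the output equals :math:`X` ; if it is greater than :math:`max\_norm` , :math:`X` is scaled linearly so that its L2 norm equals :math:`max\_norm` , as in the following formula:
+.. math::
+            Out = \frac{max\_norm \cdot X}{norm(X)}
+where :math:`norm(X)` denotes the L2 norm of :math:`X` .
+Parameters:
+ - **clip_norm** (float) - The maximum L2 norm.
+**Example code**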
+.. code-block:: python
+    import paddle.fluid as fluid
+    w_param_attrs = fluid.ParamAttr(name=None,
+                                    initializer=fluid.initializer.UniformInitializer(low=-1.0, high=1.0, seed=0),
+                                    learning_rate=1.0,
+                                    regularizer=fluid.regularizer.L1Decay(1.0),
+                                    trainable=True,
+                                    gradient_clip=fluid.clip.GradientClipByNorm(clip_norm=2.0))
+    x = fluid.layers.data(name='x', shape=[10], dtype='float32')
+    y_predict = fluid.layers.fc(input=x, size=1, param_attr=w_param_attrs)
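As a rough illustration of the per-tensor rule, the sketch below applies the formula to a flat list of floats in plain Python. It is not the Paddle operator; ``clip_by_norm`` is a hypothetical helper name.

```python
import math


def clip_by_norm(x, max_norm):
    """Return x unchanged if its L2 norm is within max_norm, else rescale it."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= max_norm:
        return list(x)
    # Out = max_norm * X / norm(X)
    return [max_norm * v / norm for v in x]


small = clip_by_norm([0.6, 0.8], max_norm=2.0)   # norm 1.0, left as is
large = clip_by_norm([3.0, 4.0], max_norm=2.0)   # norm 5.0, rescaled to norm 2.0
large_norm = math.sqrt(sum(v * v for v in large))
```

Unlike global-norm clipping, each tensor is rescaled independently here, so the relative magnitudes between different parameters can change.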
--- a/doc/fluid/api_cn/clip_cn/GradientClipByValue_cn.rst
+++ b/doc/fluid/api_cn/clip_cn/GradientClipByValue_cn.rst
+.. _cn_api_fluid_clip_GradientClipByValue:
+GradientClipByValue
+-------------------------------
+.. py:class:: paddle.fluid.clip.GradientClipByValue(max, min=None)
+Clips gradient values to the range [min, max].
+Given a tensor ``t`` , this operation clips its values to lie between ``min`` and ``max`` :
+- Any value less than min is set to min.
+- Any value greater than max is set to max.
+Parameters:
+ - **max** (float) - The maximum value to clip to.
+ - **min** (float, optional) - The minimum value to clip to. If not set by the user, the framework sets it to ``-max`` .
+**Example code**
+.. code-block:: python
+     import paddle.fluid as fluid
+     w_param_attrs = fluid.ParamAttr(name=None,
+                                     initializer=fluid.initializer.UniformInitializer(low=-1.0, high=1.0, seed=0),
+                                     learning_rate=1.0,
+                                     regularizer=fluid.regularizer.L1Decay(1.0),
+                                     trainable=True,
+                                     gradient_clip=fluid.clip.GradientClipByValue(max=1.0, min=-1.0))
+     x = fluid.layers.data(name='x', shape=[10], dtype='float32')
+     y_predict = fluid.layers.fc(input=x, size=1, param_attr=w_param_attrs)
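The element-wise rule, including the documented ``-max`` default for ``min``, can be sketched in plain Python. This is an illustration on a flat list of floats, not Paddle's implementation; ``clip_by_value`` and its parameter names are hypothetical.

```python
def clip_by_value(values, max_value, min_value=None):
    """Clamp each element to [min_value, max_value]; min defaults to -max."""
    if min_value is None:
        min_value = -max_value
    return [min(max(v, min_value), max_value) for v in values]


symmetric = clip_by_value([-3.0, -0.5, 0.2, 2.5], max_value=1.0)
asymmetric = clip_by_value([-3.0, -0.5, 0.2, 2.5], max_value=1.0, min_value=0.0)
```

Note the argument order in the class signature: ``max`` comes first, so passing the bounds by keyword avoids swapping them by accident.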
--- a/doc/fluid/api_cn/data/data_reader_cn.rst
+++ b/doc/fluid/api_cn/data/data_reader_cn.rst
-#################
+=======================
 Data Reader
-#################
+=======================
-.. _cn_api_paddle_data_reader_datafeeder:
-DataFeeder
-==================================
-.. py:class:: paddle.fluid.data_feeder.DataFeeder(feed_list, place, program=None)
+..  toctree::
+    :maxdepth: 1
+    data_reader_cn/DataFeeder_cn.rst
+    data_reader_cn/Reader_cn.rst
-DataFeeder将reader返回的数据转换为可以输入Executor和ParallelExecutor的数据结构。reader通常返回一个小批量数据条目列表。列表中的每个数据条目都是一个样本。每个样本都是具有一个或多个特征的列表或元组。
-简单用法如下：
-**代码示例**
-..  code-block:: python
-    import paddle.fluid as fluid
-    place = fluid.CPUPlace()
-    img = fluid.layers.data(name='image', shape=[1, 28, 28])
-    label = fluid.layers.data(name='label', shape=[1], dtype='int64')
-    feeder = fluid.DataFeeder([img, label], fluid.CPUPlace())
-    result = feeder.feed([([0] * 784, [9]), ([1] * 784, [1])])
-如果您想在使用多个GPU训练模型时预先将数据单独输入GPU端，可以使用decorate_reader函数。
-**代码示例**
-..  code-block:: python
-    import paddle
-    import paddle.fluid as fluid
-    place=fluid.CUDAPlace(0)
-    data = fluid.layers.data(name='data', shape=[3, 224, 224], dtype='float32')
-    label = fluid.layers.data(name='label', shape=[1], dtype='int64')
-    feeder = fluid.DataFeeder(place=place, feed_list=[data, label])
-    reader = feeder.decorate_reader(
-        paddle.batch(paddle.dataset.flowers.train(), batch_size=16), multi_devices=False)
-参数：
-    - **feed_list**  (list) –  将输入模型的变量或变量的名称。
-    - **place**  (Place) – place表示将数据输入CPU或GPU，如果要将数据输入GPU，请使用fluid.CUDAPlace(i)（i表示GPU的ID），如果要将数据输入CPU，请使用fluid.CPUPlace()。
-    - **program**  (Program) –将数据输入的Program，如果Program为None，它将使用default_main_program() 。默认值None。
-抛出异常：     ``ValueError`` – 如果某些变量未在Program中出现
-**代码示例**
-..  code-block:: python
-    import numpy as np
-    import paddle
-    import paddle.fluid as fluid
-    place = fluid.CPUPlace()
-    def reader():
-        yield [np.random.random([4]).astype('float32'), np.random.random([3]).astype('float32')],
-    main_program = fluid.Program()
-    startup_program = fluid.Program()
-    with fluid.program_guard(main_program, startup_program):
-        data_1 = fluid.layers.data(name='data_1', shape=[1, 2, 2])
-        data_2 = fluid.layers.data(name='data_2', shape=[1, 1, 3])
-        out = fluid.layers.fc(input=[data_1, data_2], size=2)
-        # ...
-    feeder = fluid.DataFeeder([data_1, data_2], place)
-    exe = fluid.Executor(place)
-    exe.run(startup_program)
-    for data in reader():
-        outs = exe.run(program=main_program,
-                       feed=feeder.feed(data),
-                       fetch_list=[out])
-.. py:method::  feed(iterable)
-根据feed_list和iterable，将输入转换成一个数据结构，该数据结构可以输入Executor和ParallelExecutor。
-参数：
-    - **iterable** (list|tuple) – 输入的数据
-返回： 转换结果
-返回类型： dict
-**代码示例**
-..  code-block:: python
-        import numpy.random as random
-        import paddle.fluid as fluid
-        def reader(limit=5):
-            for i in range(limit):
-                    yield random.random([784]).astype('float32'), random.random([1]).astype('int64'), random.random([256]).astype('float32')
-        data_1 = fluid.layers.data(name='data_1', shape=[1, 28, 28])
-        data_2 = fluid.layers.data(name='data_2', shape=[1], dtype='int64')
-        data_3 = fluid.layers.data(name='data_3', shape=[16, 16], dtype='float32')
-        feeder = fluid.DataFeeder(['data_1','data_2', 'data_3'], fluid.CPUPlace())
-        result = feeder.feed(reader())
-.. py:method::  feed_parallel(iterable, num_places=None)
-需要多个mini-batches。每个mini-batch都将提前在每个设备上输入。
-参数：
-    - **iterable** (list|tuple) – 输入的数据。
-    - **num_places**  (int) – 设备编号，默认值为None。
-返回： 转换结果
-返回类型： dict
-.. note::
-    设备数量和mini-batches数量必须一致。
-**代码示例**
-..  code-block:: python
-        import numpy.random as random
-        import paddle.fluid as fluid
-        def reader(limit=10):
-            for i in range(limit):
-                yield [random.random([784]).astype('float32'), random.randint(10)],
-        x = fluid.layers.data(name='x', shape=[1, 28, 28])
-        y = fluid.layers.data(name='y', shape=[1], dtype='int64')
-        feeder = fluid.DataFeeder(['x','y'], fluid.CPUPlace())
-        place_num = 2
-        places = [fluid.CPUPlace() for x in range(place_num)]
-        data = []
-        exe = fluid.Executor(fluid.CPUPlace())
-        exe.run(fluid.default_startup_program())
-        program = fluid.CompiledProgram(fluid.default_main_program()).with_data_parallel(places=places)
-        for item in reader():
-            data.append(item)
-            if place_num == len(data):
-                exe.run(program=program, feed=list(feeder.feed_parallel(data, place_num)), fetch_list=[])
-                data = []
-.. py:method::  decorate_reader(reader, multi_devices, num_places=None, drop_last=True)
-将输入数据转换成reader返回的多个mini-batches。每个mini-batch分别送入各设备中。
-参数：
-    - **reader** (function) – reader是可以生成数据的函数。
-    - **multi_devices** (bool) – 是否用多个设备。
-    - **num_places** (int) – 如果multi_devices是True, 你可以指定GPU的使用数量, 如果multi_devices是None, 会使用当前机器的所有GPU ，默认值None。
-    - **drop_last** (bool) – 如果最后一个batch的大小小于batch_size，选择是否删除最后一个batch，默认值True。
-返回： 转换结果
-返回类型： dict
-抛出异常：     ``ValueError`` – 如果drop_last为False并且数据batch和设备数目不匹配。
-**代码示例**
-..  code-block:: python
-        import numpy.random as random
-        import paddle
-        import paddle.fluid as fluid
-        def reader(limit=5):
-            for i in range(limit):
-                yield (random.random([784]).astype('float32'), random.random([1]).astype('int64')),
-        place=fluid.CUDAPlace(0)
-        data = fluid.layers.data(name='data', shape=[1, 28, 28], dtype='float32')
-        label = fluid.layers.data(name='label', shape=[1], dtype='int64')
-        feeder = fluid.DataFeeder(place=place, feed_list=[data, label])
-        reader = feeder.decorate_reader(reader, multi_devices=False)
-        exe = fluid.Executor(place)
-        exe.run(fluid.default_startup_program())
-        for data in reader():
-            exe.run(feed=data)
-.. _cn_api_paddle_data_reader_reader:
-Reader
-==================================
-在训练和测试时，PaddlePaddle需要读取数据。为了简化用户编写数据读取代码的工作，我们定义了
-    - reader是一个读取数据（从文件、网络、随机数生成器等）并生成数据项的函数。
-    - reader creator是返回reader函数的函数。
-    - reader decorator是一个函数，它接受一个或多个reader，并返回一个reader。
-    - batch reader是一个函数，它读取数据（从reader、文件、网络、随机数生成器等）并生成一批数据项。
-Data Reader Interface
------------------------------------
-的确，data reader不必是读取和生成数据项的函数，它可以是任何不带参数的函数来创建一个iterable（任何东西都可以被用于 ``for x in iterable`` ):
-..  code-block:: python
-    iterable = data_reader()
-从iterable生成的元素应该是单个数据条目，而不是mini batch。数据输入可以是单个项目，也可以是项目的元组，但应为 :ref:`user_guide_paddle_support_data_types` （如, numpy 1d array of float32, int, list of int）
-单项目数据读取器创建者的示例实现：
-..  code-block:: python
-    def reader_creator_random_image(width, height):
-        def reader():
-            while True:
-                yield numpy.random.uniform(-1, 1, size=width*height)
-        return reader
-多项目数据读取器创建者的示例实现：
-..  code-block:: python
-    def reader_creator_random_image_and_label(width, height, label):
-        def reader():
-            while True:
-                yield numpy.random.uniform(-1, 1, size=width*height), label
-        return reader
-.. py:function::   paddle.reader.map_readers(func, *readers)
-创建使用每个数据读取器的输出作为参数输出函数返回值的数据读取器。
-参数：
-    - **func**  - 使用的函数. 函数类型应为(Sample) => Sample
-    - **readers**  - 其输出将用作func参数的reader。
-类型：callable
-返回： 被创建数据的读取器
-返回类型： callable
-.. py:function::  paddle.reader.buffered(reader, size)
-创建缓冲数据读取器。
-缓冲数据reader将读取数据条目并将其保存到缓冲区中。只要缓冲区不为空，就将继续从缓冲数据读取器读取数据。
-参数：
-    - **reader** (callable) - 要读取的数据读取器
-    - **size** (int) - 最大缓冲
-返回：缓冲数据的读取器
-.. py:function::   paddle.reader.compose(*readers, **kwargs)
-创建一个数据reader，其输出是输入reader的组合。
-如果输入reader输出以下数据项：（1，2）3（4，5），则组合reader将输出：（1，2，3，4，5）。
-参数：
-    - **readers** - 将被组合的多个读取器。
-    - **check_alignment** (bool) - 如果为True，将检查输入reader是否正确对齐。如果为False，将不检查对齐，将丢弃跟踪输出。默认值True。
-返回：新的数据读取器
-抛出异常：     ``ComposeNotAligned`` – reader的输出不一致。 当check_alignment设置为False，不会抛出异常。
-.. py:function:: paddle.reader.chain(*readers)
-创建一个数据reader，其输出是链接在一起的输入数据reader的输出。
-如果输入reader输出以下数据条目：[0，0，0][1，1，1][2，2，2]，链接reader将输出：[0，0，0，1，1，1，2，2，2] 。
-参数：
-    - **readers** – 输入的数据。
-返回： 新的数据读取器
-返回类型： callable
-.. py:function:: paddle.reader.shuffle(reader, buf_size)
-创建数据读取器，该reader的数据输出将被无序排列。
-由原始reader创建的迭代器的输出将被缓冲到shuffle缓冲区，然后进行打乱。打乱缓冲区的大小由参数buf_size决定。
-参数：
-    - **reader** (callable)  – 输出会被打乱的原始reader
-    - **buf_size** (int)  – 打乱缓冲器的大小
-返回： 输出会被打乱的reader
-返回类型： callable
-.. py:function:: paddle.reader.firstn(reader, n)
-限制reader可以返回的最大样本数。
-参数：
-    - **reader** (callable)  – 要读取的数据读取器。
-    - **n** (int)  – 返回的最大样本数 。
-返回： 装饰reader
-返回类型： callable
-.. py:function:: paddle.reader.xmap_readers(mapper, reader, process_num, buffer_size, order=False)
-通过多线程方式，通过用户自定义的映射器mapper来映射reader返回的样本（到输出队列）。
-参数：
-    - **mapper** （callable） - 一种映射reader数据的函数。
-    - **reader** （callable） - 产生数据的reader。
-    - **process_num** （int） - 用于处理样本的线程数目。
-    - **buffer_size** （int） - 存有待读取数据的队列的大小。
-    - **order** （bool） - 是否保持原始reader的数据顺序。 默认为False。
-返回：一个将原数据进行映射后的decorated reader。
-返回类型： callable
-.. py:class:: paddle.reader.PipeReader(command, bufsize=8192, file_type='plain')
-PipeReader通过流从一个命令中读取数据，将它的stdout放到管道缓冲区中，并将其重定向到解析器进行解析，然后根据需要的格式生成数据。
-您可以使用标准Linux命令或调用其他Program来读取数据，例如通过HDFS、CEPH、URL、AWS S3中读取：
-**代码示例**
-..  code-block:: python
-    def example_reader():
-        for f in myfiles:
-            pr = PipeReader("cat %s"%f)
-            for l in pr.get_line():
-                sample = l.split(" ")
-                yield sample
-.. py:method:: get_line(cut_lines=True, line_break='\n')
-参数：
-    - **cut_lines** （bool） - 将缓冲区分行。
-    - **line_break** （string） - 文件中的行分割符，比如 ‘\\n’ 或者 ‘\\r’。
-返回：一行或者一段缓冲区。
-返回类型： string
-.. py:function:: paddle.reader.multiprocess_reader(readers, use_pipe=True, queue_size=1000)
-多进程reader使用python多进程从reader中读取数据，然后使用multi process.queue或multi process.pipe合并所有数据。进程号等于输入reader的编号，每个进程调用一个reader。
-multiprocess.queue需要/dev/shm的rw访问权限，某些平台不支持。
-您需要首先创建多个reader，这些reader应该相互独立，这样每个进程都可以独立工作。
-**代码示例**
-..  code-block:: python
-    reader0 = reader(["file01", "file02"])
-    reader1 = reader(["file11", "file12"])
-    reader1 = reader(["file21", "file22"])
-    reader = multiprocess_reader([reader0, reader1, reader2],
-        queue_size=100, use_pipe=False)
-.. py:class::paddle.reader.Fake
-Fakereader将缓存它读取的第一个数据，并将其输出data_num次。它用于缓存来自真实reader的数据，并将其用于速度测试。
-参数：
-    - **reader** – 原始读取器。
-    - **data_num** – reader产生数据的次数 。
-返回： 一个Fake读取器
-**代码示例**
-..  code-block:: python
-    def reader():
-        for i in range(10):
-            yield i
-    fake_reader = Fake()(reader, 100)
-Creator包包含一些简单的reader creator，可以在用户Program中使用。
-.. py:function:: paddle.reader.creator.np_array(x)
-如果是numpy向量，则创建一个生成x个元素的读取器。或者，如果它是一个numpy矩阵，创建一个生成x行元素的读取器。或由最高维度索引的任何子超平面。
-参数：
-    - **x** – 用于创建reader的numpy数组。
-返回： 从x创建的数据读取器
-.. py:function:: paddle.reader.creator.text_file(path)
-创建从给定文本文件逐行输出文本的数据读取器。将删除每行的行尾的(‘\n’)。
-路径：文本文件的路径
-返回： 文本文件的数据读取器
-.. py:function::  paddle.reader.creator.recordio(paths, buf_size=100)
-从给定的recordio文件路径创建数据reader，用“，”分隔“，支持全局模式。
-路径：recordio文件的路径，可以是字符串或字符串列表。
-返回：recordio文件的数据读取器
--- a/doc/fluid/api_cn/data/data_reader_cn/DataFeeder_cn.rst
+++ b/doc/fluid/api_cn/data/data_reader_cn/DataFeeder_cn.rst
+.. _cn_api_paddle_data_reader_datafeeder:
+DataFeeder
+-----------------------------------
+.. py:class:: paddle.fluid.data_feeder.DataFeeder(feed_list, place, program=None)
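+DataFeeder converts the data returned by a reader into a data structure that can be fed into Executor and ParallelExecutor. A reader usually returns a list of mini-batch data entries. Each entry in the list is a sample, and each sample is a list or tuple with one or more features.
+A simple usage is shown below:
+**Example code**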
+..  code-block:: python
+    import paddle.fluid as fluid
+    place = fluid.CPUPlace()
+    img = fluid.layers.data(name='image', shape=[1, 28, 28])
+    label = fluid.layers.data(name='label', shape=[1], dtype='int64')
+    feeder = fluid.DataFeeder([img, label], fluid.CPUPlace())
+    result = feeder.feed([([0] * 784, [9]), ([1] * 784, [1])])
+To feed data into each GPU separately ahead of time when training a model on multiple GPUs, use the ``decorate_reader`` function.
+**Example code**
+..  code-block:: python
+    import paddle
+    import paddle.fluid as fluid
+    place=fluid.CUDAPlace(0)
+    data = fluid.layers.data(name='data', shape=[3, 224, 224], dtype='float32')
+    label = fluid.layers.data(name='label', shape=[1], dtype='int64')
+    feeder = fluid.DataFeeder(place=place, feed_list=[data, label])
+    reader = feeder.decorate_reader(
+        paddle.batch(paddle.dataset.flowers.train(), batch_size=16), multi_devices=False)
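+Parameters:
+    - **feed_list** (list) – Variables, or names of variables, to be fed into the model.
+    - **place** (Place) – Indicates whether the data is fed to the CPU or a GPU. To feed data to a GPU, use ``fluid.CUDAPlace(i)`` (where i is the GPU ID); to feed data to the CPU, use ``fluid.CPUPlace()`` .
+    - **program** (Program) – The Program the data is fed into. If None, ``default_main_program()`` is used. Default: None.
+Raises: ``ValueError`` – If some variables do not appear in the Program.
+**Example code**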
+..  code-block:: python
+    import numpy as np
+    import paddle
+    import paddle.fluid as fluid
+    place = fluid.CPUPlace()
+    def reader():
+        yield [np.random.random([4]).astype('float32'), np.random.random([3]).astype('float32')],
+    main_program = fluid.Program()
+    startup_program = fluid.Program()
+    with fluid.program_guard(main_program, startup_program):
+        data_1 = fluid.layers.data(name='data_1', shape=[1, 2, 2])
+        data_2 = fluid.layers.data(name='data_2', shape=[1, 1, 3])
+        out = fluid.layers.fc(input=[data_1, data_2], size=2)
+        # ...
+    feeder = fluid.DataFeeder([data_1, data_2], place)
+    exe = fluid.Executor(place)
+    exe.run(startup_program)
+    for data in reader():
+        outs = exe.run(program=main_program,
+                       feed=feeder.feed(data),
+                       fetch_list=[out])
+.. py:method::  feed(iterable)
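+Converts the input into a data structure that can be fed into Executor and ParallelExecutor, according to feed_list and iterable.
+Parameters:
+    - **iterable** (list|tuple) – The input data.
+Returns: the converted result
+Return type: dict
+**Code example**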
+..  code-block:: python
+        import numpy.random as random
+        import paddle.fluid as fluid
+        def reader(limit=5):
+            for i in range(limit):
+                yield random.random([784]).astype('float32'), random.random([1]).astype('int64'), random.random([256]).astype('float32')
+        data_1 = fluid.layers.data(name='data_1', shape=[1, 28, 28])
+        data_2 = fluid.layers.data(name='data_2', shape=[1], dtype='int64')
+        data_3 = fluid.layers.data(name='data_3', shape=[16, 16], dtype='float32')
+        feeder = fluid.DataFeeder(['data_1','data_2', 'data_3'], fluid.CPUPlace())
+        result = feeder.feed(reader())
+.. py:method::  feed_parallel(iterable, num_places=None)
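+Takes multiple mini-batches. Each mini-batch is fed to its own device in advance.
+Parameters:
+    - **iterable** (list|tuple) – The input data.
+    - **num_places** (int) – The number of devices. Default: None.
+Returns: the converted result
+Return type: dict
+.. note::
+    The number of devices and the number of mini-batches must be the same.
+**Code example**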
+..  code-block:: python
+        import numpy.random as random
+        import paddle.fluid as fluid
+        def reader(limit=10):
+            for i in range(limit):
+                yield [random.random([784]).astype('float32'), random.randint(10)],
+        x = fluid.layers.data(name='x', shape=[1, 28, 28])
+        y = fluid.layers.data(name='y', shape=[1], dtype='int64')
+        feeder = fluid.DataFeeder(['x','y'], fluid.CPUPlace())
+        place_num = 2
+        places = [fluid.CPUPlace() for _ in range(place_num)]  # use `_` so the data variable `x` is not shadowed
+        data = []
+        exe = fluid.Executor(fluid.CPUPlace())
+        exe.run(fluid.default_startup_program())
+        program = fluid.CompiledProgram(fluid.default_main_program()).with_data_parallel(places=places)
+        for item in reader():
+            data.append(item)
+            if place_num == len(data):
+                exe.run(program=program, feed=list(feeder.feed_parallel(data, place_num)), fetch_list=[])
+                data = []
+.. py:method::  decorate_reader(reader, multi_devices, num_places=None, drop_last=True)
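+Converts the data returned by reader into multiple mini-batches. Each mini-batch is fed to a separate device.
+Parameters:
+    - **reader** (function) – A function that generates data.
+    - **multi_devices** (bool) – Whether to use multiple devices.
+    - **num_places** (int) – If multi_devices is True, you can specify the number of GPUs to use. If num_places is None, all GPUs of the current machine are used. Default: None.
+    - **drop_last** (bool) – Whether to drop the last batch if its size is smaller than batch_size. Default: True.
+Returns: the converted result
+Return type: dict
+Raises:     ``ValueError`` – If drop_last is False and the number of data batches does not match the number of devices.
+**Code example**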
+..  code-block:: python
+        import numpy.random as random
+        import paddle
+        import paddle.fluid as fluid
+        def reader(limit=5):
+            for i in range(limit):
+                yield (random.random([784]).astype('float32'), random.random([1]).astype('int64')),
+        place = fluid.CUDAPlace(0)
+        data = fluid.layers.data(name='data', shape=[1, 28, 28], dtype='float32')
+        label = fluid.layers.data(name='label', shape=[1], dtype='int64')
+        feeder = fluid.DataFeeder(place=place, feed_list=[data, label])
+        reader = feeder.decorate_reader(reader, multi_devices=False)
+        exe = fluid.Executor(place)
+        exe.run(fluid.default_startup_program())
+        for data in reader():
+            exe.run(feed=data)
\ No newline at end of file
--- a/doc/fluid/api_cn/data/data_reader_cn/Reader_cn.rst
+++ b/doc/fluid/api_cn/data/data_reader_cn/Reader_cn.rst
+.. _cn_api_paddle_data_reader_reader:
+Reader
+-------------------------------------
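+PaddlePaddle needs to read data during training and testing. To make it easier for users to write data-reading code, we define the following:
+    - A reader is a function that reads data (from a file, the network, a random number generator, etc.) and yields data items.
+    - A reader creator is a function that returns a reader function.
+    - A reader decorator is a function that accepts one or more readers and returns a reader.
+    - A batch reader is a function that reads data (from a reader, a file, the network, a random number generator, etc.) and yields a batch of data items.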
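+The four roles above can be sketched in plain Python. This is only an illustration that uses the standard library; the names ``counting_reader_creator``, ``double`` and ``batch`` are hypothetical and are not Paddle APIs.

```python
def counting_reader_creator(n):
    """Reader creator: returns a reader that yields the integers 0..n-1."""
    def reader():
        for i in range(n):
            yield i
    return reader


def double(reader):
    """Reader decorator: takes a reader and returns a new reader."""
    def decorated():
        for item in reader():
            yield item * 2
    return decorated


def batch(reader, batch_size):
    """Batch reader: groups the items of a reader into lists of batch_size."""
    def batch_reader():
        buf = []
        for item in reader():
            buf.append(item)
            if len(buf) == batch_size:
                yield buf
                buf = []
        if buf:  # emit the last, possibly smaller, batch
            yield buf
    return batch_reader


reader = counting_reader_creator(5)
batches = list(batch(double(reader), batch_size=2)())
print(batches)
```

+Note that every role is a function returning a function, so a reader can be called repeatedly to restart iteration from the beginning.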
+Data Reader Interface
+======================================
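+Strictly speaking, a data reader does not have to be a function that reads and yields data items. It can be any function that takes no arguments and creates an iterable (anything that can be used in ``for x in iterable``):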
+..  code-block:: python
+    iterable = data_reader()
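+Each element produced by the iterable should be a single data entry, not a mini-batch. A data entry can be a single item or a tuple of items, but each item must be one of the :ref:`user_guide_paddle_support_data_types` (e.g. a numpy 1d array of float32, an int, or a list of int).
+An example implementation of a single-item data reader creator:
+..  code-block:: python
+    import numpy
+    def reader_creator_random_image(width, height):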
+        def reader():
+            while True:
+                yield numpy.random.uniform(-1, 1, size=width*height)
+        return reader
+An example implementation of a multiple-item data reader creator:
+..  code-block:: python
+    import numpy
+    def reader_creator_random_image_and_label(width, height, label):
+        def reader():
+            while True:
+                yield numpy.random.uniform(-1, 1, size=width*height), label
+        return reader
+.. py:function::   paddle.reader.map_readers(func, *readers)
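+Creates a data reader that outputs the return value of func, called with the output of each input reader as its arguments.
+Parameters:
+    - **func** (callable) - The function to apply. Its type should be (Sample) => Sample.
+    - **readers** - The readers whose outputs are passed to func as arguments.
+Returns: the created data reader
+Return type: callable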
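+The behaviour described above can be sketched in plain Python. ``map_readers_sketch`` is a hypothetical stand-in written for illustration, not the Paddle implementation; it assumes the readers are consumed in lockstep and that iteration stops with the shortest one.

```python
def map_readers_sketch(func, *readers):
    """Return a reader that yields func(a, b, ...) for the items of each reader."""
    def reader():
        iterators = [r() for r in readers]
        # zip pulls one item from every reader per step.
        for items in zip(*iterators):
            yield func(*items)
    return reader


def small():
    for i in range(3):
        yield i


def large():
    for i in range(10, 13):
        yield i


summed = list(map_readers_sketch(lambda a, b: a + b, small, large)())
print(summed)
```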
+.. py:function::  paddle.reader.buffered(reader, size)
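+Creates a buffered data reader.
+The buffered data reader reads data entries and saves them into a buffer. Reading from the buffered data reader proceeds as long as the buffer is not empty.
+Parameters:
+    - **reader** (callable) - The data reader to read from.
+    - **size** (int) - The maximum buffer size.
+Returns: the buffered data reader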
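+One way to picture a buffered reader is a background thread that fills a bounded queue while the consumer drains it. The sketch below makes that assumption and uses only the standard library; ``buffered_sketch`` is a hypothetical name, and the real implementation may differ.

```python
import queue
import threading


def buffered_sketch(reader, size):
    """Return a reader that prefetches up to `size` items in a background thread."""
    end = object()  # sentinel that marks the end of the data

    def fill(q):
        for item in reader():
            q.put(item)  # blocks while the buffer is full
        q.put(end)

    def buffered_reader():
        q = queue.Queue(maxsize=size)
        worker = threading.Thread(target=fill, args=(q,))
        worker.daemon = True
        worker.start()
        item = q.get()
        while item is not end:
            yield item
            item = q.get()
    return buffered_reader


def numbers():
    for i in range(5):
        yield i


buffered_items = list(buffered_sketch(numbers, size=2)())
print(buffered_items)
```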
+.. py:function::   paddle.reader.compose(*readers, **kwargs)
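+Creates a data reader whose output is the combination of the outputs of the input readers.
+If the input readers output the data entries (1, 2), 3 and (4, 5), the composed reader outputs (1, 2, 3, 4, 5).
+Parameters:
+    - **readers** - The readers to be composed.
+    - **check_alignment** (bool) - If True, checks whether the input readers are aligned correctly. If False, alignment is not checked and trailing outputs are discarded. Default: True.
+Returns: the new data reader
+Raises:     ``ComposeNotAligned`` – The outputs of the readers are not aligned. Not raised when check_alignment is set to False.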
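+The flattening rule can be sketched as follows. ``compose_sketch`` is a hypothetical illustration; it raises a plain ``ValueError`` where the documented function raises ``ComposeNotAligned``.

```python
import itertools


def compose_sketch(*readers, check_alignment=True):
    """Return a reader that joins one item from every reader into a flat tuple."""
    def as_tuple(x):
        return x if isinstance(x, tuple) else (x,)

    def flatten(outputs):
        return sum((as_tuple(o) for o in outputs), ())

    def reader():
        iterators = [r() for r in readers]
        if not check_alignment:
            # zip stops at the shortest reader, dropping trailing outputs.
            for outputs in zip(*iterators):
                yield flatten(outputs)
            return
        missing = object()
        for outputs in itertools.zip_longest(*iterators, fillvalue=missing):
            if any(o is missing for o in outputs):
                raise ValueError("outputs of readers are not aligned")
            yield flatten(outputs)
    return reader


def pairs():
    yield (1, 2)


def singles():
    yield 3


def more_pairs():
    yield (4, 5)


composed = list(compose_sketch(pairs, singles, more_pairs)())
print(composed)
```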
+.. py:function:: paddle.reader.chain(*readers)
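+Creates a data reader whose output is the outputs of the input readers chained together.
+If the input readers output the data entries [0, 0, 0], [1, 1, 1] and [2, 2, 2], the chained reader outputs [0, 0, 0, 1, 1, 1, 2, 2, 2].
+Parameters:
+    - **readers** – The input readers.
+Returns: the new data reader
+Return type: callable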
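+Chaining amounts to exhausting the readers one after another. A minimal sketch, with the hypothetical name ``chain_sketch`` and only the standard library:

```python
import itertools


def chain_sketch(*readers):
    """Return a reader that yields all items of the first reader, then the second, and so on."""
    def reader():
        return itertools.chain.from_iterable(r() for r in readers)
    return reader


def constant_reader(value, count):
    """Hypothetical reader creator that yields `value` `count` times."""
    def reader():
        for _ in range(count):
            yield value
    return reader


chained = list(chain_sketch(constant_reader(0, 3),
                            constant_reader(1, 3),
                            constant_reader(2, 3))())
print(chained)
```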
+.. py:function:: paddle.reader.shuffle(reader, buf_size)
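+Creates a data reader whose output is shuffled.
+The output of the iterator created by the original reader is first stored in a shuffle buffer and then shuffled. The size of the shuffle buffer is set by the buf_size parameter.
+Parameters:
+    - **reader** (callable)  – The original reader whose output will be shuffled.
+    - **buf_size** (int)  – The size of the shuffle buffer.
+Returns: a reader whose output is shuffled
+Return type: callable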
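+A buffer-based shuffle only mixes items that share a buffer, so a small buf_size gives a local shuffle rather than a global one. The sketch below illustrates this under the assumption that the buffer is flushed whenever it is full; ``shuffle_sketch`` is a hypothetical name.

```python
import random


def shuffle_sketch(reader, buf_size):
    """Return a reader that shuffles items within consecutive buffers of buf_size."""
    def shuffled_reader():
        buf = []
        for item in reader():
            buf.append(item)
            if len(buf) >= buf_size:
                random.shuffle(buf)
                for b in buf:
                    yield b
                buf = []
        if buf:  # shuffle and flush what is left at the end
            random.shuffle(buf)
            for b in buf:
                yield b
    return shuffled_reader


def numbers():
    for i in range(10):
        yield i


shuffled = list(shuffle_sketch(numbers, buf_size=4)())
print(shuffled)
```

+With buf_size=4 the first four outputs are always some ordering of 0..3, which is why a buffer at least as large as the dataset is needed for a full shuffle.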
+.. py:function:: paddle.reader.firstn(reader, n)
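+Limits the maximum number of samples that the reader can return.
+Parameters:
+    - **reader** (callable)  – The data reader to read from.
+    - **n** (int)  – The maximum number of samples to return.
+Returns: the decorated reader
+Return type: callable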
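+Limiting a reader is useful when the underlying reader is infinite. A minimal sketch with the hypothetical name ``firstn_sketch``:

```python
import itertools


def firstn_sketch(reader, n):
    """Return a reader that yields at most the first n items of `reader`."""
    def firstn_reader():
        return itertools.islice(reader(), n)
    return firstn_reader


def endless():
    i = 0
    while True:
        yield i
        i += 1


limited = list(firstn_sketch(endless, 3)())
print(limited)
```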
+.. py:function:: paddle.reader.xmap_readers(mapper, reader, process_num, buffer_size, order=False)
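+Uses multiple threads to map the samples returned by reader (into an output queue) with the user-defined mapper.
+Parameters:
+    - **mapper** (callable) - A function that maps the data of the reader.
+    - **reader** (callable) - The reader that produces the data.
+    - **process_num** (int) - The number of threads used to process samples.
+    - **buffer_size** (int) - The size of the queue that holds the data waiting to be read.
+    - **order** (bool) - Whether to keep the data order of the original reader. Default: False.
+Returns: a decorated reader that yields the mapped data
+Return type: callable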
+.. py:class:: paddle.reader.PipeReader(command, bufsize=8192, file_type='plain')
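+PipeReader reads data as a stream from a command: it takes the stdout of the command into a pipe buffer, redirects it to a parser, and yields data in the required format.
+You can use standard Linux commands or call other programs to read data, for example from HDFS, Ceph, a URL or AWS S3:
+**Code example**
+..  code-block:: python
+    from paddle.reader import PipeReader
+    def example_reader():
+        # myfiles is a user-defined list of file paths
+        for f in myfiles:
+            pr = PipeReader("cat %s" % f)
+            for l in pr.get_line():
+                # split each line into space-separated fields
+                sample = l.split(" ")
+                yield sample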
+.. py:method:: get_line(cut_lines=True, line_break='\n')
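+Parameters:
+    - **cut_lines** (bool) - Whether to split the buffer into lines.
+    - **line_break** (string) - The line separator used in the file, such as '\\n' or '\\r'.
+Returns: one line, or one chunk of the buffer
+Return type: string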
+.. py:function:: paddle.reader.multiprocess_reader(readers, use_pipe=True, queue_size=1000)
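+multiprocess_reader uses Python multiprocessing to read data from the readers, and then merges all the data through multiprocessing.Queue or multiprocessing.Pipe. The number of processes equals the number of input readers, and each process calls one reader.
+multiprocessing.Queue requires read/write access to /dev/shm, which some platforms do not support.
+You need to create the readers first. They should be independent of each other so that each process can work on its own.
+**Code example**
+..  code-block:: python
+    # reader is a user-defined reader creator that takes a list of files
+    reader0 = reader(["file01", "file02"])
+    reader1 = reader(["file11", "file12"])
+    reader2 = reader(["file21", "file22"])
+    reader = multiprocess_reader([reader0, reader1, reader2],
+        queue_size=100, use_pipe=False)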
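+.. py:class:: paddle.reader.Fake
+Fake caches the first item it reads from the reader and yields it data_num times. It is used to cache data from a real reader for speed testing.
+Parameters:
+    - **reader** – The original reader.
+    - **data_num** – The number of times the reader yields data.
+Returns: a fake reader
+**Code example**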
+..  code-block:: python
+    def reader():
+        for i in range(10):
+            yield i
+    fake_reader = Fake()(reader, 100)
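+The caching behaviour can be imitated without Paddle. ``FakeSketch`` below is a hypothetical class written for illustration; it assumes that only the first item of the real reader is ever read.

```python
class FakeSketch:
    """Cache the first item of a reader and replay it a fixed number of times."""

    def __init__(self):
        self.data = None

    def __call__(self, reader, data_num):
        def fake_reader():
            if self.data is None:
                # Read the real reader only once.
                self.data = next(reader())
            for _ in range(data_num):
                yield self.data
        return fake_reader


def reader():
    for i in range(10):
        yield i


fake_items = list(FakeSketch()(reader, 3)())
print(fake_items)
```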
+The creator package contains some simple reader creators that can be used in user programs.
+.. py:function:: paddle.reader.creator.np_array(x)
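+Creates a reader that yields the elements of x if it is a numpy vector, the rows of x if it is a numpy matrix, or, more generally, any sub-hyperplane indexed by the highest dimension.
+Parameters:
+    - **x** – The numpy array used to create the reader.
+Returns: a data reader created from x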
+.. py:function:: paddle.reader.creator.text_file(path)
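+Creates a data reader that outputs the text of the given text file line by line. The trailing '\\n' of each line is removed.
+Parameters:
+    - **path** – The path of the text file.
+Returns: a data reader for the text file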
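+A line-by-line text reader creator can be sketched as follows. ``text_file_sketch`` is a hypothetical stand-in; the temporary file exists only to make the example self-contained.

```python
import os
import tempfile


def text_file_sketch(path):
    """Return a reader that yields the lines of `path` without the trailing newline."""
    def reader():
        with open(path, "r") as f:
            for line in f:
                yield line.rstrip("\n")
    return reader


# Write a small file, read it back through the reader, then clean up.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first line\nsecond line\n")
    tmp_path = tmp.name

lines = list(text_file_sketch(tmp_path)())
os.remove(tmp_path)
print(lines)
```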
+.. py:function::  paddle.reader.creator.recordio(paths, buf_size=100)
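+Creates a data reader from the given recordio file paths. Multiple paths are separated by ",", and glob patterns are supported.
+Parameters:
+    - **paths** – The paths of the recordio files; either a string or a list of strings.
+Returns: a data reader for the recordio files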
\ No newline at end of file
--- a/doc/fluid/api_cn/data/dataset_cn.rst
+++ b/doc/fluid/api_cn/data/dataset_cn.rst
-#################
+=======================
 dataset
-#################
+=======================
-.. _cn_api_paddle_dataset_mnist:
-mnist
-------------------------------
-MNIST数据集。
-此模块将从 http://yann.lecun.com/exdb/mnist/ 下载数据集，并将训练集和测试集解析为paddle reader creator。
-.. py:function:: paddle.dataset.mnist.train()
-MNIST训练数据集的creator。
-它返回一个reader creator, reader中的每个样本的图像像素范围是[-1，1]，标签范围是[0，9]。
-返回： 训练数据的reader creator
-返回类型：callable
-.. py:function:: paddle.dataset.mnist.test()
-MNIST测试数据集的creator。
-它返回一个reader creator, reader中的每个样本的图像像素范围是[-1，1]，标签范围是[0，9]。
-返回： 测试数据集的reader creator
-返回类型：callable
-.. py:function:: paddle.dataset.mnist.convert(path)
-将数据集转换为recordio格式。
-.. _cn_api_paddle_dataset_cifar:
-cifar
-------------------------------
-CIFAR数据集。
-此模块将从 https://www.cs.toronto.edu/~kriz/cifar.html 下载数据集，并将训练集和测试集解析为paddle reader creator。
-cifar-10数据集由10个类别的60000张32x32彩色图像组成，每个类别6000张图像。共有5万张训练图像，1万张测试图像。
-cifar-100数据集与cifar-10类似，只是它有100个类，每个类包含600张图像。每个类有500张训练图像和100张测试图像。
-.. py:function:: paddle.dataset.cifar.train100()
-CIFAR-100训练数据集的creator。
-它返回一个reader creator, reader中的每个样本的图像像素范围是[0，1]，标签范围是[0，9]。
-返回： 训练数据集的reader creator。
-返回类型：callable
-.. py:function:: paddle.dataset.cifar.test100()
-CIFAR-100测试数据集的creator。
-它返回一个reader creator, reader中的每个样本的图像像素范围是[0，1]，标签范围是[0，9]。
-返回： 测试数据集的reader creator
-返回类型：callable
-.. py:function:: paddle.dataset.cifar.train10(cycle=False)
-CIFAR-10训练数据集的creator。
-它返回一个reader creator, reader中的每个样本的图像像素范围是[0，1]，标签范围是[0，9]。
-参数：
-    - **cycle** (bool) – 是否循环使用数据集
-返回： 训练数据集的reader creator
-返回类型：callable
-.. py:function:: paddle.dataset.cifar.test10(cycle=False)
-CIFAR-10测试数据集的creator。
-它返回一个reader creator, reader中的每个样本的图像像素范围是[0，1]，标签范围是[0，9]。
-参数：
-    - **cycle** (bool) – 是否循环使用数据集
-返回： 测试数据集的reader creator
-返回类型：callable
-.. py:function:: paddle.dataset.cifar.convert(path)
-将数据集转换为recordio格式。
-.. _cn_api_paddle_dataset_Conll05:
-Conll05
-------------------------------
-Conll05数据集。Paddle深度学习基础中的语义角色标注文档使用这个数据集为例。因为Conll05数据集不是免费公开的，所以默认下载的url是Conll05的测试集（它是公开的）。用户可以将url和md5更改为其Conll数据集。并采用基于维基百科语料库的预训练词向量模型对SRL模型进行初始化。
-.. py:function:: paddle.dataset.conll05.get_dict()
-获取维基百科语料库的单词、动词和标签字典。
-.. py:function:: paddle.dataset.conll05.get_embedding()
-获取基于维基百科语料库的训练词向量。
-.. py:function:: paddle.dataset.conll05.test()
-Conll05测试数据集的creator。
-因为训练数据集不是免费公开的，所以用测试数据集进行训练。它返回一个reader creator，reader中的每个样本都有九个特征，包括句子序列、谓词、谓词上下文、谓词上下文标记和标记序列。
-返回： 训练数据集的reader creator
-返回类型：callable
-.. _cn_api_paddle_dataset_imdb:
-imdb
-------------------------------
-IMDB数据集。
-本模块的数据集从 http://ai.stanford.edu/%7Eamaas/data/sentiment/IMDB 数据集。这个数据集包含了25000条训练用电影评论数据，25000条测试用评论数据，且这些评论带有明显情感倾向。此外，该模块还提供了用于构建词典的API。
-.. py:function:: paddle.dataset.imdb.build_dict(pattern, cutoff)
-从语料库构建一个单词字典，词典的键是word，值是这些单词从0开始的ID。
-.. py:function:: paddle.dataset.imdb.train(word_idx)
-IMDB训练数据集的creator。
-它返回一个reader creator, reader中的每个样本的是一个从0开始的ID序列，标签范围是[0，1]。
-参数：
-    - **word_idx** (dict) – 词典
-返回： 训练数据集的reader creator
-返回类型：callable
-.. py:function:: paddle.dataset.imdb.test(word_idx)
-IMDB测试数据集的creator。
-它返回一个reader creator, reader中的每个样本的是一个从0开始的ID序列，标签范围是[0，1]。
-参数：
-    - **word_idx** (dict) – 词典
-返回： 训练数据集的reader creator
-返回类型：callable
-.. py:function:: paddle.dataset.imdb.convert(path)
-将数据集转换为recordio格式。
-.. _cn_api_paddle_dataset_imikolov:
-imikolov
-------------------------------
-imikolov的简化版数据集。
-此模块将从 http://www.fit.vutbr.cz/~imikolov/rnnlm/ 下载数据集，并将训练集和测试集解析为paddle reader creator。
-.. py:function:: paddle.dataset.imikolov.build_dict(min_word_freq=50)
-从语料库构建一个单词字典，字典的键是word，值是这些单词从0开始的ID。
-.. py:function:: paddle.dataset.imikolov.train(word_idx, n, data_type=1)
-imikolov训练数据集的creator。
-它返回一个reader creator, reader中的每个样本的是一个单词ID元组。
-参数：
-    - **word_idx** (dict) – 词典
-    - **n** (int) – 如果类型是ngram，表示滑窗大小；否则表示序列最大长度
-    - **data_type** (数据类型的成员变量(NGRAM 或 SEQ)) – 数据类型 (ngram 或 sequence)
-返回： 训练数据集的reader creator
-返回类型：callable
-.. py:function::paddle.dataset.imikolov.test(word_idx, n, data_type=1)
-imikolov测试数据集的creator。
-它返回一个reader creator, reader中的每个样本的是一个单词ID元组。
-参数：
-    - **word_idx** (dict) – 词典
-    - **n** (int) – 如果类型是ngram，表示滑窗大小；否则表示序列最大长度
-    - **data_type** (数据类型的成员变量(NGRAM 或 SEQ)) – 数据类型 (ngram 或 sequence)
-返回： 测试数据集的reader creator
-返回类型：callable
-.. py:function:: paddle.dataset.imikolov.convert(path)
-将数据集转换为recordio格式。
-.. _cn_api_paddle_dataset_movielens:
-movielens
-------------------------------
-Movielens 1-M数据集。
-Movielens 1-M数据集是由GroupLens Research采集的6000个用户对4000个电影的的100万个评级。 该模块将从 http://files.grouplens.org/datasets/movielens/ml-1m.zip 下载Movielens 1-M数据集，并将训练集和测试集解析为paddle reader creator。
-.. py:function:: paddle.dataset.movielens.get_movie_title_dict()
-获取电影标题词典。
-.. py:function:: paddle.dataset.movielens.max_movie_id()
-获取电影ID的最大值。
-.. py:function:: paddle.dataset.movielens.max_user_id()
-获取用户ID的最大值。
-.. py:function:: paddle.dataset.movielens.max_job_id()
-获取职业ID的最大值。
-.. py:function:: paddle.dataset.movielens.movie_categories()
-获取电影类别词典。
-.. py:function:: paddle.dataset.movielens.user_info()
-获取用户信息词典。
-.. py:function:: paddle.dataset.movielens.movie_info()
-获取电影信息词典。
-.. py:function:: paddle.dataset.movielens.convert(path)
-将数据集转换为recordio格式。
-.. py:class:: paddle.dataset.movielens.MovieInfo(index, categories, title)
-电影ID，标题和类别信息存储在MovieInfo中。
-.. py:class:: paddle.dataset.movielens.UserInfo(index, gender, age, job_id)
-用户ID，性别，年龄和工作信息存储在UserInfo中。
-.. _cn_api_paddle_dataset_sentiment:
-sentiment
-------------------------------
-脚本获取并预处理由NLTK提供的movie_reviews数据集。
-.. py:function:: paddle.dataset.sentiment.get_word_dict()
-按照样本中出现的单词的频率对单词进行排序。
-返回： words_freq_sorted
-.. py:function:: paddle.dataset.sentiment.train()
-默认的训练集reader creator。
-.. py:function:: paddle.dataset.sentiment.test()
-默认的测试集reader creator。
-.. py:function:: paddle.dataset.sentiment.convert(path)
-将数据集转换为recordio格式。
-.. _cn_api_paddle_dataset_uci_housing:
-uci_housing
-------------------------------
-UCI Housing数据集。
-该模块将从 https://archive.ics.uci.edu/ml/machine-learning-databases/housing/下载数据集，并将训练集和测试集解析为paddle reader creator。
-.. py:function:: paddle.dataset.uci_housing.train()
-UCI_HOUSING训练集creator。
-它返回一个reader creator，reader中的每个样本都是正则化和价格编号后的特征。
-返回：训练集reader creator
-返回类型：callable
-.. py:function:: paddle.dataset.uci_housing.test()
-UCI_HOUSING测试集creator。
-它返回一个reader creator，reader中的每个样本都是正则化和价格编号后的特征。
-返回：测试集reader creator
-返回类型：callable
-.. _cn_api_paddle_dataset_wmt14:
-wmt14
-------------------------------
-WMT14数据集。 原始WMT14数据集太大，所以提供了一组小数据集。 该模块将从 http://paddlepaddle.cdn.bcebos.com/demo/wmt_shrinked_data/wmt14.tgz 下载数据集，并将训练集和测试集解析为paddle reader creator。
-.. py:function:: paddle.dataset.wmt14.train(dict_size)
-WMT14训练集creator。
-它返回一个reader creator，reader中的每个样本都是源语言单词ID序列，目标语言单词ID序列和下一个单词ID序列。
-返回：训练集reader creator
-返回类型：callable
-.. py:function:: paddle.dataset.wmt14.test(dict_size)
-WMT14测试集creator。
-它返回一个reader creator，reader中的每个样本都是源语言单词ID序列，目标语言单词ID序列和下一个单词ID序列。
-返回：测试集reader creator
-返回类型：callable
-.. py:function:: paddle.dataset.wmt14.convert(path)
-将数据集转换为recordio格式。
-.. _cn_api_paddle_dataset_wmt16:
-wmt16
-------------------------------
-ACL2016多模式机器翻译。 有关更多详细信息，请访问此网站：http://www.statmt.org/wmt16/multimodal-task.html#task1
-如果您任务中使用该数据集，请引用以下文章：Multi30K：多语言英语 - 德语图像描述。
-@article{elliott-EtAl:2016:VL16, author = {{Elliott}, D. and {Frank}, S. and {Sima”an}, K. and {Specia}, L.}, title = {Multi30K: Multilingual English-German Image Descriptions}, booktitle = {Proceedings of the 6th Workshop on Vision and Language}, year = {2016}, pages = {70–74}, year = 2016
-}
-.. py:function:: paddle.dataset.wmt16.train(src_dict_size, trg_dict_size, src_lang='en')
-WMT16训练集reader（读取器）。
-此功能返回可读取训练数据的reader。 reader返回的每个样本由三个字段组成：源语言单词索引序列，目标语言单词索引序列和下一单词索引序列。
-注意：训练数据的原始内容如下： http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/training.tar.gz
-paddle.dataset.wmt16使用moses的tokenization脚本提供原始数据集的tokenized版本： https://github.com/moses-smt/mosesdecoder/blob/master/scripts/tokenizer/tokenizer.perl
-参数：
-    - **src_dict_size** (int) – 源语言词典的大小。三个特殊标记将被添加到所述词典：<S>为起始标记，<E>为结束标记，<UNK>为未知单词。
-    - **trg_dict_size**  (int) – 目标语言字典的大小。三个特殊标记将被添加到所述词典：<S>为起始标记，<E>为结束标记，<UNK>为未知单词。
-    - **src_lang**  (string) – 一个字符串，指示哪种语言是源语言。 可用选项包括：英语为“en”，德国为“de”。
-返回: 读训练集数据的reader
-返回类型: callable
-.. py:function:: paddle.dataset.wmt16.test(src_dict_size, trg_dict_size, src_lang='en')
-WMT16测试(test)集reader。
-此功能返回可读取测试数据的reader。reader返回的每个样本由三个字段组成：源语言单词索引序列，目标语言单词索引序列和下一单词索引序列。
-注意：原始测试数据如下： http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/mmt16_task1_test.tar.gz
-paddle.dataset.wmt16使用moses的tokenization脚本提供原始数据集的tokenized版本： https://github.com/moses-smt/mosesdecoder/blob/master/scripts/tokenizer/tokenizer.perl
-参数：
-    - **src_dict_size** (int) – 源语言词典的大小。三个特殊token将被添加到所述词典：<S>为起始标记，<E>为结束标记，<UNK>为未知单词。
-    - **trg_dict_size**  (int) – 目标语言字典的大小。三个特殊token将被添加到所述词典：<S>为起始标记，<E>为结束标记，<UNK>为未知单词。
-    - **src_lang**  (string) – 一个字符串，指示哪种语言是源语言。 可用选项包括：英语为“en”，德国为“de”。
-返回: 读测试集数据的reader
-返回类型: callable
-.. py:function:: paddle.dataset.wmt16.validation(src_dict_size, trg_dict_size, src_lang='en')
-WMT16验证(validation)集reader。
-此功能返回可读取验证数据的reader 。reader返回的每个样本由三个字段组成：源语言单词索引序列，目标语言单词索引序列和下一单词索引序列。
-注意：验证数据的原始内容如下：http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/validation.tar.gz
-paddle.dataset.wmt16使用moses的tokenization脚本提供原始数据集的tokenized版本：https://github.com/moses-smt/mosesdecoder/blob/master/scripts/tokenizer/tokenizer.perl
-参数：
-    - **src_dict_size** (int) – 源语言词典的大小。三个特殊token将被添加到所述词典：<S>为起始标记，<E>为结束标记，<UNK>为未知单词。
-    - **trg_dict_size**  (int) – 目标语言字典的大小。三个特殊token将被添加到所述词典：<S>为起始标记，<E>为结束标记，<UNK>为未知单词。
-    - **src_lang**  (string) – 一个字符串，指示哪种语言是源语言。 可用选项包括：英语为“en”，德国为“de”。
-返回: 读集数据的reader
-返回类型: callable
-.. py:function:: paddle.dataset.wmt16.get_dict(lang, dict_size, reverse=False)
-返回指定语言的词典(word dictionary)。
-参数：
-    - **lang** （string） - 表示哪种语言是源语言的字符串。 可用选项包括：英语为“en”，德国为“de”。
-    - **dict_size** （int） - 指定语言字典的大小。
-    - **reverse** （bool） - 如果reverse设置为False，则返回的python字典将使用word作为键并使用index作为值。 如果reverse设置为True，则返回的python字典将使用index作为键，将word作为值。
-返回：特定语言的单词词典。
-返回类型： dict
-.. py:function:: paddle.dataset.wmt16.fetch()
-下载完整的数据集。
-.. py:function:: paddle.dataset.wmt16.convert(path, src_dict_size, trg_dict_size, src_lang)
-将数据集转换为recordio格式。
+..  toctree::
+    :maxdepth: 1
+    dataset_cn/mnist_cn.rst
+    dataset_cn/cifar_cn.rst
+    dataset_cn/Conll05_cn.rst
+    dataset_cn/imdb_cn.rst
+    dataset_cn/imikolov_cn.rst
+    dataset_cn/movielens_cn.rst
+    dataset_cn/sentiment_cn.rst
+    dataset_cn/uci_housing_cn.rst
+    dataset_cn/wmt14_cn.rst
+    dataset_cn/wmt16_cn.rst
--- a/doc/fluid/api_cn/data/dataset_cn/Conll05_cn.rst
+++ b/doc/fluid/api_cn/data/dataset_cn/Conll05_cn.rst
+.. _cn_api_paddle_dataset_Conll05:
+Conll05
+-------------------------------
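+Conll05 dataset. The semantic role labeling chapter of the Paddle deep learning basics uses this dataset as an example. Because the Conll05 dataset is not freely available, the default download URL points to the Conll05 test set (which is public). Users can change the URL and MD5 to point to their own Conll dataset. A pre-trained word vector model based on the Wikipedia corpus is used to initialize the SRL model.
+.. py:function:: paddle.dataset.conll05.get_dict()
+Gets the word, verb and label dictionaries of the Wikipedia corpus.
+.. py:function:: paddle.dataset.conll05.get_embedding()
+Gets the pre-trained word vectors based on the Wikipedia corpus.
+.. py:function:: paddle.dataset.conll05.test()
+Creator for the Conll05 test dataset.
+Because the training dataset is not freely available, the test dataset is used for training. It returns a reader creator. Each sample in the reader has nine features, including the sentence sequence, predicate, predicate context, predicate context flag and label sequence.
+Returns: the reader creator used for training
+Return type: callable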
--- a/doc/fluid/api_cn/data/dataset_cn/cifar_cn.rst
+++ b/doc/fluid/api_cn/data/dataset_cn/cifar_cn.rst
+.. _cn_api_paddle_dataset_cifar:
+cifar
+-------------------------------
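+CIFAR dataset.
+This module downloads the dataset from https://www.cs.toronto.edu/~kriz/cifar.html and parses the training set and test set into paddle reader creators.
+The cifar-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
+The cifar-100 dataset is similar to cifar-10, except that it has 100 classes containing 600 images each. Each class has 500 training images and 100 test images.
+.. py:function:: paddle.dataset.cifar.train100()
+Creator for the CIFAR-100 training dataset.
+It returns a reader creator. For each sample in the reader, the image pixels are in the range [0, 1] and the label is in the range [0, 99].
+Returns: the reader creator for the training dataset
+Return type: callable
+.. py:function:: paddle.dataset.cifar.test100()
+Creator for the CIFAR-100 test dataset.
+It returns a reader creator. For each sample in the reader, the image pixels are in the range [0, 1] and the label is in the range [0, 99].
+Returns: the reader creator for the test dataset
+Return type: callable
+.. py:function:: paddle.dataset.cifar.train10(cycle=False)
+Creator for the CIFAR-10 training dataset.
+It returns a reader creator. For each sample in the reader, the image pixels are in the range [0, 1] and the label is in the range [0, 9].
+Parameters:
+    - **cycle** (bool) – Whether to cycle through the dataset.
+Returns: the reader creator for the training dataset
+Return type: callable
+.. py:function:: paddle.dataset.cifar.test10(cycle=False)
+Creator for the CIFAR-10 test dataset.
+It returns a reader creator. For each sample in the reader, the image pixels are in the range [0, 1] and the label is in the range [0, 9].
+Parameters:
+    - **cycle** (bool) – Whether to cycle through the dataset.
+Returns: the reader creator for the test dataset
+Return type: callable
+.. py:function:: paddle.dataset.cifar.convert(path)
+Converts the dataset to recordio format.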
--- a/doc/fluid/api_cn/data/dataset_cn/imdb_cn.rst
+++ b/doc/fluid/api_cn/data/dataset_cn/imdb_cn.rst
+.. _cn_api_paddle_dataset_imdb:
+imdb
+-------------------------------
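+IMDB dataset.
+This module downloads the IMDB dataset from http://ai.stanford.edu/%7Eamaas/data/sentiment/ . The dataset contains 25000 movie reviews for training and 25000 reviews for testing, all of them highly polarized. The module also provides an API for building a word dictionary.
+.. py:function:: paddle.dataset.imdb.build_dict(pattern, cutoff)
+Builds a word dictionary from the corpus. The keys of the dictionary are words, and the values are the zero-based IDs of those words.
+.. py:function:: paddle.dataset.imdb.train(word_idx)
+Creator for the IMDB training dataset.
+It returns a reader creator. Each sample in the reader is a sequence of zero-based IDs, and the label is in the range [0, 1].
+Parameters:
+    - **word_idx** (dict) – The word dictionary.
+Returns: the reader creator for the training dataset
+Return type: callable
+.. py:function:: paddle.dataset.imdb.test(word_idx)
+Creator for the IMDB test dataset.
+It returns a reader creator. Each sample in the reader is a sequence of zero-based IDs, and the label is in the range [0, 1].
+Parameters:
+    - **word_idx** (dict) – The word dictionary.
+Returns: the reader creator for the test dataset
+Return type: callable
+.. py:function:: paddle.dataset.imdb.convert(path)
+Converts the dataset to recordio format.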
--- a/doc/fluid/api_cn/data/dataset_cn/imikolov_cn.rst
+++ b/doc/fluid/api_cn/data/dataset_cn/imikolov_cn.rst
+.. _cn_api_paddle_dataset_imikolov:
+imikolov
+-------------------------------
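+A simplified version of the imikolov dataset.
+This module downloads the dataset from http://www.fit.vutbr.cz/~imikolov/rnnlm/ and parses the training set and test set into paddle reader creators.
+.. py:function:: paddle.dataset.imikolov.build_dict(min_word_freq=50)
+Builds a word dictionary from the corpus. The keys of the dictionary are words, and the values are the zero-based IDs of those words.
+.. py:function:: paddle.dataset.imikolov.train(word_idx, n, data_type=1)
+Creator for the imikolov training dataset.
+It returns a reader creator. Each sample in the reader is a tuple of word IDs.
+Parameters:
+    - **word_idx** (dict) – The word dictionary.
+    - **n** (int) – The sliding window size if the type is ngram; otherwise the maximum sequence length.
+    - **data_type** (member variable of DataType (NGRAM or SEQ)) – The data type (ngram or sequence).
+Returns: the reader creator for the training dataset
+Return type: callable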
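+To make the ngram mode concrete, the sketch below slides a window of size n over one sentence and emits a tuple of word IDs per position. It is an illustration only: the function ``ngram_samples``, the toy dictionary, and the use of ``<s>``, ``<e>`` and ``<unk>`` markers are assumptions of this example, not confirmed details of the module.

```python
def ngram_samples(words, word_idx, n):
    """Return every window of n consecutive word IDs from one sentence."""
    # Assumption: the sentence is wrapped in start/end markers and
    # out-of-dictionary words map to the <unk> ID.
    tokens = ["<s>"] + words + ["<e>"]
    unk = word_idx["<unk>"]
    ids = [word_idx.get(w, unk) for w in tokens]
    return [tuple(ids[i - n:i]) for i in range(n, len(ids) + 1)]


word_idx = {"<s>": 0, "<e>": 1, "<unk>": 2, "the": 3, "cat": 4, "sat": 5}
samples = ngram_samples(["the", "cat", "sat", "down"], word_idx, n=3)
print(samples)
```

+A sentence shorter than n tokens (after adding the markers) produces no samples in this sketch.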
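+.. py:function:: paddle.dataset.imikolov.test(word_idx, n, data_type=1)
+Creator for the imikolov test dataset.
+It returns a reader creator. Each sample in the reader is a tuple of word IDs.
+Parameters:
+    - **word_idx** (dict) – The word dictionary.
+    - **n** (int) – The sliding window size if the type is ngram; otherwise the maximum sequence length.
+    - **data_type** (member variable of DataType (NGRAM or SEQ)) – The data type (ngram or sequence).
+Returns: the reader creator for the test dataset
+Return type: callable
+.. py:function:: paddle.dataset.imikolov.convert(path)
+Converts the dataset to recordio format.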
--- a/doc/fluid/api_cn/data/dataset_cn/mnist_cn.rst
+++ b/doc/fluid/api_cn/data/dataset_cn/mnist_cn.rst
+.. _cn_api_paddle_dataset_mnist:
+mnist
+-------------------------------
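+MNIST dataset.
+This module downloads the dataset from http://yann.lecun.com/exdb/mnist/ and parses the training set and test set into paddle reader creators.
+.. py:function:: paddle.dataset.mnist.train()
+Creator for the MNIST training dataset.
+It returns a reader creator. For each sample in the reader, the image pixels are in the range [-1, 1] and the label is in the range [0, 9].
+Returns: the reader creator for the training dataset
+Return type: callable
+.. py:function:: paddle.dataset.mnist.test()
+Creator for the MNIST test dataset.
+It returns a reader creator. For each sample in the reader, the image pixels are in the range [-1, 1] and the label is in the range [0, 9].
+Returns: the reader creator for the test dataset
+Return type: callable
+.. py:function:: paddle.dataset.mnist.convert(path)
+Converts the dataset to recordio format.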
--- a/doc/fluid/api_cn/data/dataset_cn/movielens_cn.rst
+++ b/doc/fluid/api_cn/data/dataset_cn/movielens_cn.rst
+.. _cn_api_paddle_dataset_movielens:
+movielens
+-------------------------------
+Movielens 1-M dataset.
+The Movielens 1-M dataset contains 1 million ratings of 4000 movies from 6000 users, collected by GroupLens Research. This module downloads the Movielens 1-M dataset from http://files.grouplens.org/datasets/movielens/ml-1m.zip and parses the training set and test set into paddle reader creators.
+.. py:function:: paddle.dataset.movielens.get_movie_title_dict()
+Gets the movie title dictionary.
+.. py:function:: paddle.dataset.movielens.max_movie_id()
+Gets the maximum movie ID.
+.. py:function:: paddle.dataset.movielens.max_user_id()
+Gets the maximum user ID.
+.. py:function:: paddle.dataset.movielens.max_job_id()
+Gets the maximum job ID.
+.. py:function:: paddle.dataset.movielens.movie_categories()
+Gets the movie category dictionary.
+.. py:function:: paddle.dataset.movielens.user_info()
+Gets the user information dictionary.
+.. py:function:: paddle.dataset.movielens.movie_info()
+Gets the movie information dictionary.
+.. py:function:: paddle.dataset.movielens.convert(path)
+Converts the dataset to recordio format.
+.. py:class:: paddle.dataset.movielens.MovieInfo(index, categories, title)
+MovieInfo stores the movie ID, title and category information.
+.. py:class:: paddle.dataset.movielens.UserInfo(index, gender, age, job_id)
+UserInfo stores the user ID, gender, age and job information.
--- a/doc/fluid/api_cn/data/dataset_cn/sentiment_cn.rst
+++ b/doc/fluid/api_cn/data/dataset_cn/sentiment_cn.rst
+.. _cn_api_paddle_dataset_sentiment:
+sentiment
+-------------------------------
+This script fetches and preprocesses the movie_reviews dataset provided by NLTK.
+.. py:function:: paddle.dataset.sentiment.get_word_dict()
+Sorts the words by how frequently they occur in the samples.
+Returns: words_freq_sorted
+.. py:function:: paddle.dataset.sentiment.train()
+The default reader creator for the training set.
+.. py:function:: paddle.dataset.sentiment.test()
+The default reader creator for the test set.
+.. py:function:: paddle.dataset.sentiment.convert(path)
+Converts the dataset to recordio format.
--- a/doc/fluid/api_cn/data/dataset_cn/uci_housing_cn.rst
+++ b/doc/fluid/api_cn/data/dataset_cn/uci_housing_cn.rst
+.. _cn_api_paddle_dataset_uci_housing:
+uci_housing
+-------------------------------
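+UCI Housing dataset.
+This module downloads the dataset from https://archive.ics.uci.edu/ml/machine-learning-databases/housing/ and parses the training set and test set into paddle reader creators.
+.. py:function:: paddle.dataset.uci_housing.train()
+Creator for the UCI_HOUSING training set.
+It returns a reader creator. Each sample in the reader consists of the normalized features and the price.
+Returns: the reader creator for the training set
+Return type: callable
+.. py:function:: paddle.dataset.uci_housing.test()
+Creator for the UCI_HOUSING test set.
+It returns a reader creator. Each sample in the reader consists of the normalized features and the price.
+Returns: the reader creator for the test set
+Return type: callable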
--- a/doc/fluid/api_cn/data/dataset_cn/wmt14_cn.rst
+++ b/doc/fluid/api_cn/data/dataset_cn/wmt14_cn.rst
+.. _cn_api_paddle_dataset_wmt14:
+wmt14
+-------------------------------
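+WMT14 dataset. The original WMT14 dataset is too large, so a small subset is provided instead. This module downloads the dataset from http://paddlepaddle.cdn.bcebos.com/demo/wmt_shrinked_data/wmt14.tgz and parses the training set and test set into paddle reader creators.
+.. py:function:: paddle.dataset.wmt14.train(dict_size)
+Creator for the WMT14 training set.
+It returns a reader creator. Each sample in the reader consists of the source language word ID sequence, the target language word ID sequence and the next word ID sequence.
+Returns: the reader creator for the training set
+Return type: callable
+.. py:function:: paddle.dataset.wmt14.test(dict_size)
+Creator for the WMT14 test set.
+It returns a reader creator. Each sample in the reader consists of the source language word ID sequence, the target language word ID sequence and the next word ID sequence.
+Returns: the reader creator for the test set
+Return type: callable
+.. py:function:: paddle.dataset.wmt14.convert(path)
+Converts the dataset to recordio format.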
--- a/doc/fluid/api_cn/data/dataset_cn/wmt16_cn.rst
+++ b/doc/fluid/api_cn/data/dataset_cn/wmt16_cn.rst
+.. _cn_api_paddle_dataset_wmt16:
+wmt16
+-------------------------------
+ACL 2016 Multimodal Machine Translation. For more details, visit: http://www.statmt.org/wmt16/multimodal-task.html#task1
+If you use this dataset in your work, please cite the following paper: Multi30K: Multilingual English-German Image Descriptions.
+@article{elliott-EtAl:2016:VL16, author = {{Elliott}, D. and {Frank}, S. and {Sima'an}, K. and {Specia}, L.}, title = {Multi30K: Multilingual English-German Image Descriptions}, booktitle = {Proceedings of the 6th Workshop on Vision and Language}, year = {2016}, pages = {70--74}
+}
+.. py:function:: paddle.dataset.wmt16.train(src_dict_size, trg_dict_size, src_lang='en')
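+Reader for the WMT16 training set.
+This function returns a reader that reads the training data. Each sample returned by the reader consists of three fields: the source language word index sequence, the target language word index sequence and the next word index sequence.
+Note: the original training data is available at http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/training.tar.gz
+paddle.dataset.wmt16 provides a tokenized version of the original dataset, produced with the moses tokenization script: https://github.com/moses-smt/mosesdecoder/blob/master/scripts/tokenizer/tokenizer.perl
+Parameters:
+    - **src_dict_size** (int) – The size of the source language dictionary. Three special tokens are added to the dictionary: <S> as the start mark, <E> as the end mark and <UNK> for unknown words.
+    - **trg_dict_size**  (int) – The size of the target language dictionary. Three special tokens are added to the dictionary: <S> as the start mark, <E> as the end mark and <UNK> for unknown words.
+    - **src_lang**  (string) – A string indicating which language is the source language. Available options: "en" for English and "de" for German.
+Returns: a reader for the training set data
+Return type: callable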
+.. py:function:: paddle.dataset.wmt16.test(src_dict_size, trg_dict_size, src_lang='en')
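+Reader for the WMT16 test set.
+This function returns a reader that reads the test data. Each sample returned by the reader consists of three fields: the source language word index sequence, the target language word index sequence and the next word index sequence.
+Note: the original test data is available at http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/mmt16_task1_test.tar.gz
+paddle.dataset.wmt16 provides a tokenized version of the original dataset, produced with the moses tokenization script: https://github.com/moses-smt/mosesdecoder/blob/master/scripts/tokenizer/tokenizer.perl
+Parameters:
+    - **src_dict_size** (int) – The size of the source language dictionary. Three special tokens are added to the dictionary: <S> as the start mark, <E> as the end mark and <UNK> for unknown words.
+    - **trg_dict_size**  (int) – The size of the target language dictionary. Three special tokens are added to the dictionary: <S> as the start mark, <E> as the end mark and <UNK> for unknown words.
+    - **src_lang**  (string) – A string indicating which language is the source language. Available options: "en" for English and "de" for German.
+Returns: a reader for the test set data
+Return type: callable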
+.. py:function:: paddle.dataset.wmt16.validation(src_dict_size, trg_dict_size, src_lang='en')
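+Reader for the WMT16 validation set.
+This function returns a reader that reads the validation data. Each sample returned by the reader consists of three fields: the source language word index sequence, the target language word index sequence and the next word index sequence.
+Note: the original validation data is available at http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/validation.tar.gz
+paddle.dataset.wmt16 provides a tokenized version of the original dataset, produced with the moses tokenization script: https://github.com/moses-smt/mosesdecoder/blob/master/scripts/tokenizer/tokenizer.perl
+Parameters:
+    - **src_dict_size** (int) – The size of the source language dictionary. Three special tokens are added to the dictionary: <S> as the start mark, <E> as the end mark and <UNK> for unknown words.
+    - **trg_dict_size**  (int) – The size of the target language dictionary. Three special tokens are added to the dictionary: <S> as the start mark, <E> as the end mark and <UNK> for unknown words.
+    - **src_lang**  (string) – A string indicating which language is the source language. Available options: "en" for English and "de" for German.
+Returns: a reader for the validation set data
+Return type: callable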
+.. py:function:: paddle.dataset.wmt16.get_dict(lang, dict_size, reverse=False)
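+Returns the word dictionary of the specified language.
+Parameters:
+    - **lang** (string) - A string indicating the language of the dictionary. Available options: "en" for English and "de" for German.
+    - **dict_size** (int) - The size of the dictionary for the specified language.
+    - **reverse** (bool) - If reverse is False, the returned Python dictionary uses the word as the key and the index as the value. If reverse is True, the returned Python dictionary uses the index as the key and the word as the value.
+Returns: the word dictionary of the specified language
+Return type: dict
+.. py:function:: paddle.dataset.wmt16.fetch()
+Downloads the complete dataset.
+.. py:function:: paddle.dataset.wmt16.convert(path, src_dict_size, trg_dict_size, src_lang)
+Converts the dataset to recordio format.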
--- a/doc/fluid/api_cn/data_feeder_cn.rst
+++ b/doc/fluid/api_cn/data_feeder_cn.rst
-###################
+=======================
- fluid.data_feeder
+fluid.data_feeder
-###################
+=======================
-.. _cn_api_fluid_data_feeder_DataFeeder:
+..  toctree::
-DataFeeder
+    :maxdepth: 1
-------------------------------
+    data_feeder_cn/DataFeeder_cn.rst
-.. py:class:: paddle.fluid.data_feeder.DataFeeder(feed_list, place, program=None)
-``DataFeeder`` 负责将reader(读取器)返回的数据转成一种特殊的数据结构，使它们可以输入到 ``Executor`` 和 ``ParallelExecutor`` 中。
-reader通常返回一个minibatch条目列表。在列表中每一条目都是一个样本（sample），它是由具有一至多个特征的列表或元组组成的。
-以下是简单用法：
-.. code-block:: python
-    import paddle.fluid as fluid
-    place = fluid.CPUPlace()
-    img = fluid.layers.data(name='image', shape=[1, 28, 28])
-    label = fluid.layers.data(name='label', shape=[1], dtype='int64')
-    feeder = fluid.DataFeeder([img, label], fluid.CPUPlace())
-    result = feeder.feed([([0] * 784, [9]), ([1] * 784, [1])])
-在多GPU模型训练时，如果需要提前分别向各GPU输入数据，可以使用 ``decorate_reader`` 函数。
-.. code-block:: python
-    import paddle
-    import paddle.fluid as fluid
-    place=fluid.CUDAPlace(0)
-    data = fluid.layers.data(name='data', shape=[3, 224, 224], dtype='float32')
-    label = fluid.layers.data(name='label', shape=[1], dtype='int64')
-    feeder = fluid.DataFeeder(place=place, feed_list=[data, label])
-    reader = feeder.decorate_reader(
-            paddle.batch(paddle.dataset.flowers.train(), batch_size=16), multi_devices=False)
-参数：  
-    - **feed_list** (list) – 向模型输入的变量表或者变量表名
-    - **place** (Place) – place表明是向GPU还是CPU中输入数据。如果想向GPU中输入数据, 请使用 ``fluid.CUDAPlace(i)`` (i 代表 the GPU id)；如果向CPU中输入数据, 请使用  ``fluid.CPUPlace()``
-    - **program** (Program) – 需要向其中输入数据的Program。如果为None, 会默认使用 ``default_main_program()``。 缺省值为None
-弹出异常:   ``ValueError``  – 如果一些变量不在此 Program 中
-**代码示例**
-.. code-block:: python
-    import numpy as np
-    import paddle
-    import paddle.fluid as fluid
-    place = fluid.CPUPlace()
-    def reader():
-        yield [np.random.random([4]).astype('float32'), np.random.random([3]).astype('float32')],
-    main_program = fluid.Program()
-    startup_program = fluid.Program()   
-    with fluid.program_guard(main_program, startup_program):
-        data_1 = fluid.layers.data(name='data_1', shape=[1, 2, 2])
-        data_2 = fluid.layers.data(name='data_2', shape=[1, 1, 3])
-        out = fluid.layers.fc(input=[data_1, data_2], size=2)
-        # ...
-    feeder = fluid.DataFeeder([data_1, data_2], place)
-    exe = fluid.Executor(place)
-    exe.run(startup_program)
-    for data in reader():
-        outs = exe.run(program=main_program,
-                        feed=feeder.feed(data),
-                        fetch_list=[out])
-.. py:method:: feed(iterable)
-根据feed_list（数据输入表）和iterable（可遍历的数据）提供的信息，将输入数据转成一种特殊的数据结构，使它们可以输入到 ``Executor`` 和 ``ParallelExecutor`` 中。
-参数:    
-    - **iterable** (list|tuple) – 要输入的数据
-返回：  转换结果
-返回类型: dict
-**代码示例**
-.. code-block:: python
-        import numpy.random as random
-        import paddle.fluid as fluid
-        def reader(limit=5):
-            for i in range(limit):
-                yield random.random([784]).astype('float32'), random.random([1]).astype('int64'), random.random([256]).astype('float32')
-        data_1 = fluid.layers.data(name='data_1', shape=[1, 28, 28])
-        data_2 = fluid.layers.data(name='data_2', shape=[1], dtype='int64')
-        data_3 = fluid.layers.data(name='data_3', shape=[16, 16], dtype='float32')
-        feeder = fluid.DataFeeder(['data_1','data_2', 'data_3'], fluid.CPUPlace())
-        result = feeder.feed(reader())
-.. py:method:: feed_parallel(iterable, num_places=None)
-该方法获取的多个minibatch，并把每个minibatch提前输入进各个设备中。
-参数:    
-    - **iterable** (list|tuple) – 要输入的数据
-    - **num_places** (int) – 设备数目。默认为None。
-返回: 转换结果
-返回类型: dict
-.. note::
-   设备（CPU或GPU）的数目必须等于minibatch的数目
-**代码示例**
-.. code-block:: python
-    import numpy.random as random
-    import paddle.fluid as fluid
-    def reader(limit=10):
-        for i in range(limit):
-            yield [random.random([784]).astype('float32'), random.randint(10)],
-    x = fluid.layers.data(name='x', shape=[1, 28, 28])
-    y = fluid.layers.data(name='y', shape=[1], dtype='int64')
-    feeder = fluid.DataFeeder(['x','y'], fluid.CPUPlace())
-    place_num = 2
-    places = [fluid.CPUPlace() for x in range(place_num)]
-    data = []
-    exe = fluid.Executor(fluid.CPUPlace())
-    exe.run(fluid.default_startup_program())
-    program = fluid.CompiledProgram(fluid.default_main_program()).with_data_parallel(places=places)
-    for item in reader():
-        data.append(item)
-        if place_num == len(data):
-            exe.run(program=program, feed=list(feeder.feed_parallel(data, place_num)), fetch_list=[])
-            data = []
-.. py:method::  decorate_reader(reader, multi_devices, num_places=None, drop_last=True)
-将reader返回的输入数据batch转换为多个mini-batch，之后每个mini-batch都会被输入进各个设备（CPU或GPU）中。
-参数：
-        - **reader** (fun) – 该参数是一个可以生成数据的函数
-        - **multi_devices** (bool) – bool型，指明是否使用多个设备
-        - **num_places** (int) – 如果 ``multi_devices`` 为 ``True`` , 可以使用此参数来设置GPU数目。如果 ``num_places`` 为 ``None`` ，该函数默认使用当前训练机所有GPU设备。默认为None。
-        - **drop_last** (bool) – 如果最后一个batch的大小比 ``batch_size`` 要小，则可使用该参数来指明是否选择丢弃最后一个batch数据。 默认为 ``True`` 
-返回：转换结果
-返回类型: dict
-弹出异常： ValueError – 如果 ``drop_last`` 值为False并且reader返回的minibatch数目与设备数目不相等时，产生此异常
-**代码示例**
-.. code-block:: python
-    import numpy.random as random
-    import paddle
-    import paddle.fluid as fluid
-    def reader(limit=5):
-        for i in range(limit):
-            yield (random.random([784]).astype('float32'), random.random([1]).astype('int64')),
-    place=fluid.CUDAPlace(0)
-    data = fluid.layers.data(name='data', shape=[1, 28, 28], dtype='float32')
-    label = fluid.layers.data(name='label', shape=[1], dtype='int64')
-    feeder = fluid.DataFeeder(place=place, feed_list=[data, label])
-    reader = feeder.decorate_reader(reader, multi_devices=False)
-    exe = fluid.Executor(place)
-    exe.run(fluid.default_startup_program())
-    for data in reader():
-        exe.run(feed=data)
--- a/doc/fluid/api_cn/data_feeder_cn/DataFeeder_cn.rst
+++ b/doc/fluid/api_cn/data_feeder_cn/DataFeeder_cn.rst
+.. _cn_api_fluid_data_feeder_DataFeeder:
+DataFeeder
+-------------------------------
+.. py:class:: paddle.fluid.data_feeder.DataFeeder(feed_list, place, program=None)
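+``DataFeeder`` converts the data returned by a reader into a data structure that can be fed to ``Executor`` and ``ParallelExecutor``.
+A reader usually returns a list of minibatch entries. Each entry in the list is a sample, which is a list or tuple containing one or more features.
+Simple usage: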
+.. code-block:: python
+    import paddle.fluid as fluid
+    place = fluid.CPUPlace()
+    img = fluid.layers.data(name='image', shape=[1, 28, 28])
+    label = fluid.layers.data(name='label', shape=[1], dtype='int64')
+    feeder = fluid.DataFeeder([img, label], fluid.CPUPlace())
+    result = feeder.feed([([0] * 784, [9]), ([1] * 784, [1])])
+For multi-GPU training, if you need to feed data to each GPU ahead of time, use the ``decorate_reader`` function.
+.. code-block:: python
+    import paddle
+    import paddle.fluid as fluid
+    place=fluid.CUDAPlace(0)
+    data = fluid.layers.data(name='data', shape=[3, 224, 224], dtype='float32')
+    label = fluid.layers.data(name='label', shape=[1], dtype='int64')
+    feeder = fluid.DataFeeder(place=place, feed_list=[data, label])
+    reader = feeder.decorate_reader(
+            paddle.batch(paddle.dataset.flowers.train(), batch_size=16), multi_devices=False)
+Parameters:
+    - **feed_list** (list) – The list of variables, or variable names, to feed to the model
+    - **place** (Place) – Indicates whether data is fed to the GPU or the CPU. To feed data to the GPU, use ``fluid.CUDAPlace(i)`` (where i is the GPU id); to feed data to the CPU, use ``fluid.CPUPlace()``
+    - **program** (Program) – The Program to feed data into. If None, ``default_main_program()`` is used. Default: None
+Raises:   ``ValueError``  – if some variables are not in this Program
+**Code example**
+.. code-block:: python
+    import numpy as np
+    import paddle
+    import paddle.fluid as fluid
+    place = fluid.CPUPlace()
+    def reader():
+        yield [np.random.random([4]).astype('float32'), np.random.random([3]).astype('float32')],
+    main_program = fluid.Program()
+    startup_program = fluid.Program()   
+    with fluid.program_guard(main_program, startup_program):
+        data_1 = fluid.layers.data(name='data_1', shape=[1, 2, 2])
+        data_2 = fluid.layers.data(name='data_2', shape=[1, 1, 3])
+        out = fluid.layers.fc(input=[data_1, data_2], size=2)
+        # ...
+    feeder = fluid.DataFeeder([data_1, data_2], place)
+    exe = fluid.Executor(place)
+    exe.run(startup_program)
+    for data in reader():
+        outs = exe.run(program=main_program,
+                        feed=feeder.feed(data),
+                        fetch_list=[out])
+.. py:method:: feed(iterable)
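+Using the information in feed_list (the list of input variables) and iterable (the data to iterate over), converts the input data into a data structure that can be fed to ``Executor`` and ``ParallelExecutor``.
+Parameters:
+    - **iterable** (list|tuple) – The data to feed
+Returns: the converted result
+Return type: dict
+**Code example**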
+.. code-block:: python
+        import numpy.random as random
+        import paddle.fluid as fluid
+        def reader(limit=5):
+            for i in range(limit):
+                yield random.random([784]).astype('float32'), random.random([1]).astype('int64'), random.random([256]).astype('float32')
+        data_1 = fluid.layers.data(name='data_1', shape=[1, 28, 28])
+        data_2 = fluid.layers.data(name='data_2', shape=[1], dtype='int64')
+        data_3 = fluid.layers.data(name='data_3', shape=[16, 16], dtype='float32')
+        feeder = fluid.DataFeeder(['data_1','data_2', 'data_3'], fluid.CPUPlace())
+        result = feeder.feed(reader())
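The regrouping that ``feed`` performs can be pictured with a plain-Python sketch. This is an assumption-laden illustration, not the library's implementation: ``to_feed_dict`` is a hypothetical helper, and the real ``DataFeeder`` also converts each column into a tensor for the target place, which is omitted here.

```python
# Hypothetical sketch; the real DataFeeder.feed also builds tensors per place.
def to_feed_dict(feed_names, minibatch):
    """Regroup a list of samples into one column of values per feed name."""
    columns = {name: [] for name in feed_names}
    for sample in minibatch:
        if len(sample) != len(feed_names):
            raise ValueError(
                "sample has %d fields, expected %d" % (len(sample), len(feed_names))
            )
        # The i-th field of every sample goes to the i-th feed name.
        for name, field in zip(feed_names, sample):
            columns[name].append(field)
    return columns


minibatch = [([0.0] * 4, [9]), ([1.0] * 4, [1])]
feed = to_feed_dict(["image", "label"], minibatch)
```

Each key of the resulting dict is a feed name, which is why the return type of ``feed`` is a dict.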
+.. py:method:: feed_parallel(iterable, num_places=None)
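+Takes multiple minibatches and feeds each minibatch to its own device in advance.
+Parameters:
+    - **iterable** (list|tuple) – The data to feed
+    - **num_places** (int) – The number of devices. Default: None.
+Returns: the converted result
+Return type: dict
+.. note::
+   The number of devices (CPU or GPU) must equal the number of minibatches
+**Code example**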
+.. code-block:: python
+    import numpy.random as random
+    import paddle.fluid as fluid
+    def reader(limit=10):
+        for i in range(limit):
+            yield [random.random([784]).astype('float32'), random.randint(10)],
+    x = fluid.layers.data(name='x', shape=[1, 28, 28])
+    y = fluid.layers.data(name='y', shape=[1], dtype='int64')
+    feeder = fluid.DataFeeder(['x','y'], fluid.CPUPlace())
+    place_num = 2
+    places = [fluid.CPUPlace() for x in range(place_num)]
+    data = []
+    exe = fluid.Executor(fluid.CPUPlace())
+    exe.run(fluid.default_startup_program())
+    program = fluid.CompiledProgram(fluid.default_main_program()).with_data_parallel(places=places)
+    for item in reader():
+        data.append(item)
+        if place_num == len(data):
+            exe.run(program=program, feed=list(feeder.feed_parallel(data, place_num)), fetch_list=[])
+            data = []
+.. py:method::  decorate_reader(reader, multi_devices, num_places=None, drop_last=True)
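+Converts each batch of input data returned by the reader into multiple mini-batches, each of which is then fed to one device (CPU or GPU).
+Parameters:
+        - **reader** (function) – A function that generates data
+        - **multi_devices** (bool) – Whether to use multiple devices
+        - **num_places** (int) – If ``multi_devices`` is ``True``, this parameter sets the number of GPUs. If ``num_places`` is ``None``, the function uses all GPUs on the current machine. Default: None.
+        - **drop_last** (bool) – Whether to drop the last batch when it is smaller than ``batch_size``. Default: ``True``
+Returns: the converted result
+Return type: dict
+Raises: ValueError – if ``drop_last`` is False and the number of minibatches returned by the reader does not equal the number of devices
+**Code example**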
+.. code-block:: python
+    import numpy.random as random
+    import paddle
+    import paddle.fluid as fluid
+    def reader(limit=5):
+        for i in range(limit):
+            yield (random.random([784]).astype('float32'), random.random([1]).astype('int64')),
+    place=fluid.CUDAPlace(0)
+    data = fluid.layers.data(name='data', shape=[1, 28, 28], dtype='float32')
+    label = fluid.layers.data(name='label', shape=[1], dtype='int64')
+    feeder = fluid.DataFeeder(place=place, feed_list=[data, label])
+    reader = feeder.decorate_reader(reader, multi_devices=False)
+    exe = fluid.Executor(place)
+    exe.run(fluid.default_startup_program())
+    for data in reader():
+        exe.run(feed=data)
--- a/doc/fluid/api_cn/dataset_cn.rst
+++ b/doc/fluid/api_cn/dataset_cn.rst
-#################
+=======================
- fluid.dataset
+fluid.dataset
-#################
+=======================
+..  toctree::
+    :maxdepth: 1
-.. _cn_api_fluid_dataset_DatasetFactory:
+    dataset_cn/DatasetFactory_cn.rst
-DatasetFactory
+    dataset_cn/InMemoryDataset_cn.rst
-------------------------------
+    dataset_cn/QueueDataset_cn.rst
-.. py:class:: paddle.fluid.dataset.DatasetFactory
-DatasetFactory是一个按数据集名称创建数据集的 "工厂"，可以创建“QueueDataset”，“InMemoryDataset”或“FileInstantDataset”，默认为“QueueDataset”。
-**代码示例**
-.. code-block:: python
-    import paddle.fluid as fluid
-    dataset = paddle.fluid.DatasetFactory().create_dataset("InMemoryDataset")
-.. py:method:: create_dataset(datafeed_class='QueueDataset')
-创建“QueueDataset”，“InMemoryDataset” 或 “FileInstantDataset”，默认为“QueueDataset”。
-参数：
-    - **datafeed_class** (str) – datafeed类名，为QueueDataset或InMemoryDataset。默认为QueueDataset。
-**代码示例**:
-.. code-block:: python
-    import paddle.fluid as fluid
-    dataset = fluid.DatasetFactory().create_dataset()
-.. _cn_api_fluid_dataset_InMemoryDataset:
-InMemoryDataset
-------------------------------
-.. py:class:: paddle.fluid.dataset.InMemoryDataset
-InMemoryDataset会向内存中加载数据并在训练前缓冲数据。此类由DatasetFactory创建。
-**代码示例**:
-.. code-block:: python
-    dataset = paddle.fluid.DatasetFactory().create_dataset(“InMemoryDataset”)
-.. py:method:: load_into_memory()
-向内存中加载数据。
-**代码示例**:
-.. code-block:: python
-    import paddle.fluid as fluid
-    dataset = fluid.DatasetFactory().create_dataset("InMemoryDataset")
-    filelist = ["a.txt", "b.txt"]
-    dataset.set_filelist(filelist)
-    dataset.load_into_memory()
-.. py:method:: local_shuffle()
-局域shuffle。
-**代码示例**:
-.. code-block:: python
-    import paddle.fluid as fluid
-    dataset = fluid.DatasetFactory().create_dataset("InMemoryDataset")
-    filelist = ["a.txt", "b.txt"]
-    dataset.set_filelist(filelist)
-    dataset.load_into_memory()
-    dataset.local_shuffle()
-.. py:method:: global_shuffle(fleet=None)
-全局shuffle。
-只能用在分布式模式（单机多进程或多机多进程）中。您如果在分布式模式中运行，应当传递fleet而非None。
-**代码示例**:
-.. code-block:: python
-    import paddle.fluid as fluid
-    from paddle.fluid.incubate.fleet.parameter_server.pslib import fleet
-    dataset = fluid.DatasetFactory().create_dataset("InMemoryDataset")
-    filelist = ["a.txt", "b.txt"]
-    dataset.set_filelist(filelist)
-    dataset.load_into_memory()
-    dataset.global_shuffle(fleet)
-参数：
-    - **fleet** (Fleet) – fleet单例。默认为None。
-.. py:method:: release_memory()
-当数据不再使用时，释放InMemoryDataset内存数据。
-**代码示例**:
-.. code-block:: python
-    import paddle.fluid as fluid
-    from paddle.fluid.incubate.fleet.parameter_server.pslib import fleet
-    dataset = fluid.DatasetFactory().create_dataset("InMemoryDataset")
-    filelist = ["a.txt", "b.txt"]
-    dataset.set_filelist(filelist)
-    dataset.load_into_memory()
-    dataset.global_shuffle(fleet)
-    exe = fluid.Executor(fluid.CPUPlace())
-    exe.run(fluid.default_startup_program())
-    exe.train_from_dataset(fluid.default_main_program(), dataset)dataset.release_memory()
-    dataset.release_memory()
-.. py:method:: get_memory_data_size(fleet=None)
-用户可以调用此函数以了解加载进内存后所有workers中的ins数量。
-.. note::
-    该函数可能会导致性能不佳，因为它具有barrier。
-参数：
-    - **fleet** (Fleet) – fleet对象。
-返回：内存数据的大小。
-**代码示例**:
-.. code-block:: python
-    import paddle.fluid as fluid
-    from paddle.fluid.incubate.fleet.parameter_server.pslib import fleet
-    dataset = fluid.DatasetFactory().create_dataset("InMemoryDataset")
-    filelist = ["a.txt", "b.txt"]
-    dataset.set_filelist(filelist)
-    dataset.load_into_memory()
-    print dataset.get_memory_data_size(fleet)
-.. py:method:: get_shuffle_data_size(fleet=None)
-获取shuffle数据大小，用户可以调用此函数以了解局域/全局shuffle后所有workers中的ins数量。
-.. note::
-    该函数可能会导致局域shuffle性能不佳，因为它具有barrier。但其不影响局域shuffle。
-参数：
-    - **fleet** (Fleet) – fleet对象。
-返回：shuffle数据的大小。
-**代码示例**:
-.. code-block:: python
-    import paddle.fluid as fluid
-    from paddle.fluid.incubate.fleet.parameter_server.pslib import fleet
-    dataset = fluid.DatasetFactory().create_dataset("InMemoryDataset")
-    filelist = ["a.txt", "b.txt"]
-    dataset.set_filelist(filelist)
-    dataset.load_into_memory()
-    dataset.global_shuffle(fleet)
-    print dataset.get_shuffle_data_size(fleet)
-.. _cn_api_fluid_dataset_QueueDataset:
-QueueDataset
-------------------------------
-.. py:class:: paddle.fluid.dataset.QueueDataset
-流式处理数据。
-**代码示例**:
-.. code-block:: python
-    import paddle.fluid as fluid
-    dataset = fluid.DatasetFactory().create_dataset("QueueDataset")
-.. py:method:: local_shuffle()
-局域shuffle数据
-QueueDataset中不支持局域shuffle，可能抛出NotImplementedError
-**代码示例**:
-.. code-block:: python
-    import paddle.fluid as fluid
-    dataset = fluid.DatasetFactory().create_dataset("QueueDataset")
-    dataset.local_shuffle()
-.. py:method:: global_shuffle(fleet=None)
-全局shuffle数据
-QueueDataset中不支持全局shuffle，可能抛出NotImplementedError
-**代码示例**:
-.. code-block:: python
-    import paddle.fluid as fluid
-    from paddle.fluid.incubate.fleet.parameter_server.pslib import fleet
-    dataset = fluid.DatasetFactory().create_dataset("QueueDataset")
-    dataset.global_shuffle(fleet)
--- a/doc/fluid/api_cn/dataset_cn/DatasetFactory_cn.rst
+++ b/doc/fluid/api_cn/dataset_cn/DatasetFactory_cn.rst
+.. _cn_api_fluid_dataset_DatasetFactory:
+DatasetFactory
+-------------------------------
+.. py:class:: paddle.fluid.dataset.DatasetFactory
+DatasetFactory is a "factory" that creates datasets by name. It can create "QueueDataset", "InMemoryDataset", or "FileInstantDataset"; the default is "QueueDataset".
+**Code example**
+.. code-block:: python
+    import paddle.fluid as fluid
+    dataset = fluid.DatasetFactory().create_dataset("InMemoryDataset")
+.. py:method:: create_dataset(datafeed_class='QueueDataset')
+Creates a "QueueDataset", "InMemoryDataset", or "FileInstantDataset"; the default is "QueueDataset".
+Parameters:
+    - **datafeed_class** (str) – The datafeed class name, either QueueDataset or InMemoryDataset. Default: QueueDataset.
+**Code example**:
+.. code-block:: python
+    import paddle.fluid as fluid
+    dataset = fluid.DatasetFactory().create_dataset()
--- a/doc/fluid/api_cn/dataset_cn/InMemoryDataset_cn.rst
+++ b/doc/fluid/api_cn/dataset_cn/InMemoryDataset_cn.rst
+.. _cn_api_fluid_dataset_InMemoryDataset:
+InMemoryDataset
+-------------------------------
+.. py:class:: paddle.fluid.dataset.InMemoryDataset
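+InMemoryDataset loads data into memory and buffers it before training. Instances of this class are created by DatasetFactory.
+**Code example**:
+.. code-block:: python
+    import paddle.fluid as fluid
+    dataset = fluid.DatasetFactory().create_dataset("InMemoryDataset")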
+.. py:method:: load_into_memory()
+Loads data into memory.
+**Code example**:
+.. code-block:: python
+    import paddle.fluid as fluid
+    dataset = fluid.DatasetFactory().create_dataset("InMemoryDataset")
+    filelist = ["a.txt", "b.txt"]
+    dataset.set_filelist(filelist)
+    dataset.load_into_memory()
+.. py:method:: local_shuffle()
+Local shuffle.
+**Code example**:
+.. code-block:: python
+    import paddle.fluid as fluid
+    dataset = fluid.DatasetFactory().create_dataset("InMemoryDataset")
+    filelist = ["a.txt", "b.txt"]
+    dataset.set_filelist(filelist)
+    dataset.load_into_memory()
+    dataset.local_shuffle()
+.. py:method:: global_shuffle(fleet=None)
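+Global shuffle.
+Can only be used in distributed mode (multiple processes on one machine, or multiple processes across machines). When running in distributed mode, pass fleet instead of None.
+**Code example**: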
+.. code-block:: python
+    import paddle.fluid as fluid
+    from paddle.fluid.incubate.fleet.parameter_server.pslib import fleet
+    dataset = fluid.DatasetFactory().create_dataset("InMemoryDataset")
+    filelist = ["a.txt", "b.txt"]
+    dataset.set_filelist(filelist)
+    dataset.load_into_memory()
+    dataset.global_shuffle(fleet)
+Parameters:
+    - **fleet** (Fleet) – The fleet singleton. Default: None.
+.. py:method:: release_memory()
+Releases the in-memory data of InMemoryDataset once it is no longer needed.
+**Code example**:
+.. code-block:: python
+    import paddle.fluid as fluid
+    from paddle.fluid.incubate.fleet.parameter_server.pslib import fleet
+    dataset = fluid.DatasetFactory().create_dataset("InMemoryDataset")
+    filelist = ["a.txt", "b.txt"]
+    dataset.set_filelist(filelist)
+    dataset.load_into_memory()
+    dataset.global_shuffle(fleet)
+    exe = fluid.Executor(fluid.CPUPlace())
+    exe.run(fluid.default_startup_program())
+    exe.train_from_dataset(fluid.default_main_program(), dataset)
+    dataset.release_memory()
+.. py:method:: get_memory_data_size(fleet=None)
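+Returns the number of instances (ins) held by all workers after the data has been loaded into memory.
+.. note::
+    This function may hurt performance because it contains a barrier.
+Parameters:
+    - **fleet** (Fleet) – The fleet object.
+Returns: the size of the in-memory data.
+**Code example**: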
+.. code-block:: python
+    import paddle.fluid as fluid
+    from paddle.fluid.incubate.fleet.parameter_server.pslib import fleet
+    dataset = fluid.DatasetFactory().create_dataset("InMemoryDataset")
+    filelist = ["a.txt", "b.txt"]
+    dataset.set_filelist(filelist)
+    dataset.load_into_memory()
+    print(dataset.get_memory_data_size(fleet))
+.. py:method:: get_shuffle_data_size(fleet=None)
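+Gets the shuffle data size. Call this function to learn the number of instances (ins) held by all workers after a local or global shuffle.
+.. note::
+    This function may slow down local shuffle because it contains a barrier. It does not affect global shuffle.
+Parameters:
+    - **fleet** (Fleet) – The fleet object.
+Returns: the size of the shuffled data.
+**Code example**: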
+.. code-block:: python
+    import paddle.fluid as fluid
+    from paddle.fluid.incubate.fleet.parameter_server.pslib import fleet
+    dataset = fluid.DatasetFactory().create_dataset("InMemoryDataset")
+    filelist = ["a.txt", "b.txt"]
+    dataset.set_filelist(filelist)
+    dataset.load_into_memory()
+    dataset.global_shuffle(fleet)
+    print(dataset.get_shuffle_data_size(fleet))
--- a/doc/fluid/api_cn/dataset_cn/QueueDataset_cn.rst
+++ b/doc/fluid/api_cn/dataset_cn/QueueDataset_cn.rst
+.. _cn_api_fluid_dataset_QueueDataset:
+QueueDataset
+-------------------------------
+.. py:class:: paddle.fluid.dataset.QueueDataset
+Processes data as a stream.
+**Code example**:
+.. code-block:: python
+    import paddle.fluid as fluid
+    dataset = fluid.DatasetFactory().create_dataset("QueueDataset")
+.. py:method:: local_shuffle()
+Locally shuffles the data.
+QueueDataset does not support local shuffle, and this method may raise NotImplementedError.
+**Code example**:
+.. code-block:: python
+    import paddle.fluid as fluid
+    dataset = fluid.DatasetFactory().create_dataset("QueueDataset")
+    dataset.local_shuffle()
+.. py:method:: global_shuffle(fleet=None)
+Globally shuffles the data.
+QueueDataset does not support global shuffle, and this method may raise NotImplementedError.
+**Code example**:
+.. code-block:: python
+    import paddle.fluid as fluid
+    from paddle.fluid.incubate.fleet.parameter_server.pslib import fleet
+    dataset = fluid.DatasetFactory().create_dataset("QueueDataset")
+    dataset.global_shuffle(fleet)
--- a/doc/fluid/api_cn/dygraph_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn.rst
--- a/doc/fluid/api_cn/dygraph_cn/BackwardStrategy_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/BackwardStrategy_cn.rst
+.. _cn_api_fluid_dygraph_BackwardStrategy:
+BackwardStrategy
+-------------------------------
+.. py:class:: paddle.fluid.dygraph.BackwardStrategy
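+BackwardStrategy is a descriptor of the backward process. It currently provides the following feature:
+1. ``sort_sum_gradient``: sums the gradients in the reverse order of the backward trace
+**Code example**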
+.. code-block:: python
+    import numpy as np
+    import paddle.fluid as fluid
+    from paddle.fluid import FC
+    x = np.ones([2, 2], np.float32)
+    with fluid.dygraph.guard():
+        inputs2 = []
+        for _ in range(10):
+            inputs2.append(fluid.dygraph.base.to_variable(x))
+        ret2 = fluid.layers.sums(inputs2)
+        loss2 = fluid.layers.reduce_sum(ret2)
+        backward_strategy = fluid.dygraph.BackwardStrategy()
+        backward_strategy.sort_sum_gradient = True
+        loss2.backward(backward_strategy)
--- a/doc/fluid/api_cn/dygraph_cn/BatchNorm_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/BatchNorm_cn.rst
+.. _cn_api_fluid_dygraph_BatchNorm:
+BatchNorm
+-------------------------------
+.. py:class:: paddle.fluid.dygraph.BatchNorm(name_scope, num_channels, act=None, is_test=False, momentum=0.9, epsilon=1e-05, param_attr=None, bias_attr=None, dtype='float32', data_layout='NCHW', in_place=False, moving_mean_name=None, moving_variance_name=None, do_model_average_for_mean_and_var=False, fuse_with_relu=False, use_global_stats=False, trainable_statistics=False)
+Batch Normalization Layer
+Can be used as a normalizer function for conv2d and fully connected operations. The layer requires one of the following data formats:
+1. NHWC [batch, in_height, in_width, in_channels]
+2. NCHW [batch, in_channels, in_height, in_width]
+For more details, see: `Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift <https://arxiv.org/pdf/1502.03167.pdf>`_
+``input`` is the input features of a mini-batch.
+.. math::
+    \mu_{\beta}        &\gets \frac{1}{m} \sum_{i=1}^{m} x_i                                 \quad &// mini-batch-mean \\
+    \sigma_{\beta}^{2} &\gets \frac{1}{m} \sum_{i=1}^{m}(x_i - \mu_{\beta})^2               \quad &// mini-batch-variance \\
+    \hat{x_i}          &\gets \frac{x_i - \mu_\beta} {\sqrt{\sigma_{\beta}^{2} + \epsilon}}  \quad &// normalize \\
+    y_i &\gets \gamma \hat{x_i} + \beta                                                      \quad &// scale-and-shift
+When use_global_stats = True, :math:`\mu_{\beta}` and :math:`\sigma_{\beta}^{2}` are not the statistics of one mini-batch. They are global (or running) statistics, usually taken from a pre-trained model. Training and testing (or inference) then behave the same way:
+.. math::
+    \hat{x_i} &\gets \frac{x_i - \mu_\beta} {\sqrt{\sigma_{\beta}^{2} + \epsilon}}  \\
+    y_i &\gets \gamma \hat{x_i} + \beta
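+Parameters:
+    - **name_scope** (str) - The name of this class
+    - **act** (string, default None) - Activation type: linear|relu|prelu|...
+    - **is_test** (bool, default False) - Whether the layer is in the test phase.
+    - **momentum** (float, default 0.9) - Used to compute moving_mean and moving_var. The update rules are :math:`moving\_mean = moving\_mean * momentum + new\_mean * (1. - momentum)` and :math:`moving\_var = moving\_var * momentum + new\_var * (1. - momentum)` . Default: 0.9.
+    - **epsilon** (float, default 1e-05) - A value added to the denominator for numerical stability. Default: 1e-5.
+    - **param_attr** (ParamAttr|None) - Attribute of the scale parameter of batch_norm. If set to None or to one attribute of ParamAttr, batch_norm creates a ParamAttr as param_attr. If no initializer is set for param_attr, the parameter is initialized with Xavier. Default: None
+    - **bias_attr** (ParamAttr|None) - Attribute of the bias parameter of batch_norm. If set to None or to one attribute of ParamAttr, batch_norm creates a ParamAttr as bias_attr. If no initializer is set for bias_attr, the parameter is initialized to 0. Default: None
+    - **data_layout** (string, default NCHW) - NCHW|NHWC. Default: NCHW
+    - **in_place** (bool, default False) - Lets the input and output of batch norm reuse the same memory
+    - **moving_mean_name** (string, default None) - Name of moving_mean, which stores the global mean.
+    - **moving_variance_name** (string, default None) - Name of moving_variance, which stores the global variance.
+    - **do_model_average_for_mean_and_var** (bool, default False) - Whether to apply model averaging to mean and variance
+    - **fuse_with_relu** (bool) - If True, relu is applied after batch norm. Default: False.
+    - **use_global_stats** (bool, default False) – Whether to use the global mean and variance. In inference or test mode, setting use_global_stats to True and setting is_test to True are equivalent. In training mode, setting use_global_stats to True makes training use the global mean and variance as well.
+    - **trainable_statistics** (bool) - Whether to compute mean and variance in eval mode. In eval mode, when trainable_statistics is True, mean and variance are computed from the current batch. Default: False.
+Returns: a tensor holding the result of applying batch normalization to the input
+Return type: Variable
+**Code example**
+.. code-block:: python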
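+    import numpy as np
+    import paddle.fluid as fluid
+    with fluid.dygraph.guard():
+        # x was undefined in the original snippet; a random batch of 4 samples with 32 features is assumed here
+        x = fluid.dygraph.base.to_variable(np.random.random((4, 32)).astype('float32'))
+        fc = fluid.FC('fc', size=200, param_attr='fc1.w')
+        hidden1 = fc(x)
+        # num_channels is set to 200 to match the channel dimension of the FC output
+        batch_norm = fluid.BatchNorm("batch_norm", 200)
+        hidden2 = batch_norm(hidden1)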
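To make the formulas above concrete, here is a minimal plain-Python sketch of batch normalization over a single feature, together with the momentum update of the moving mean. It is not the Paddle implementation: ``batch_norm_1d`` and ``update_moving`` are hypothetical names, and scalar ``gamma`` and ``beta`` are assumed.

```python
import math


def batch_norm_1d(xs, gamma=1.0, beta=0.0, epsilon=1e-5):
    """Normalize one feature over a mini-batch, then scale and shift."""
    m = len(xs)
    mean = sum(xs) / m                           # mini-batch mean
    var = sum((x - mean) ** 2 for x in xs) / m   # mini-batch variance
    normalized = [(x - mean) / math.sqrt(var + epsilon) for x in xs]
    return [gamma * x_hat + beta for x_hat in normalized], mean, var


def update_moving(moving, new, momentum=0.9):
    """moving = moving * momentum + new * (1 - momentum)."""
    return moving * momentum + new * (1.0 - momentum)


ys, mean, var = batch_norm_1d([1.0, 2.0, 3.0, 4.0])
moving_mean = update_moving(0.0, mean)
```

With ``use_global_stats`` set to True, the layer would use the moving statistics in place of ``mean`` and ``var`` computed from the batch.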
--- a/doc/fluid/api_cn/dygraph_cn/BilinearTensorProduct_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/BilinearTensorProduct_cn.rst
+.. _cn_api_fluid_dygraph_BilinearTensorProduct:
+BilinearTensorProduct
+-------------------------------
+.. py:class:: paddle.fluid.dygraph.BilinearTensorProduct(name_scope, size, name=None, act=None, param_attr=None, bias_attr=None)
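+This layer computes the bilinear tensor product of a pair of tensors, for example:
+.. math::
+    out_{i} = x * W_{i} * {y^\mathrm{T}}, i=0,1,...,size-1
+where
+- :math:`x` : the first input, containing M elements, with shape [batch_size, M]
+- :math:`y` : the second input, containing N elements, with shape [batch_size, N]
+- :math:`W_i` : the i-th learned weight, with shape [M,N]
+- :math:`out_i` : the i-th element of the output
+- :math:`y^T` : the transpose of :math:`y`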
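The formula can be checked by hand with a small plain-Python sketch for one sample. This is an illustration under stated assumptions: ``bilinear`` is a hypothetical function, the bias term and activation are left out, and the weights are fixed numbers rather than learned parameters.

```python
def bilinear(x, weights, y):
    """Compute out_i = x * W_i * y^T for one sample.

    x has M elements, y has N elements, and weights is a list of
    `size` matrices, each with M rows and N columns.
    """
    out = []
    for w in weights:
        total = 0.0
        for m, x_m in enumerate(x):
            for n, y_n in enumerate(y):
                total += x_m * w[m][n] * y_n
        out.append(total)
    return out


x = [1.0, 2.0]            # M = 2
y = [1.0, 0.0, -1.0]      # N = 3
weights = [               # size = 2
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
]
out = bilinear(x, weights, y)
```

The output has one element per weight matrix, which matches the [batch_size, size] shape of the layer's result for a batch of one.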
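+Parameters:
+    - **name_scope**  (str) – The name of the class.
+    - **size**  (int) – The dimension of this layer.
+    - **act**  (str) – The activation function applied to the output. Default: None.
+    - **name**  (str) – The name of this layer. Default: None.
+    - **param_attr**  (ParamAttr) – Parameter attribute of the learnable weight w of this layer. Default: None.
+    - **bias_attr**  (ParamAttr) – Parameter attribute of the bias of this layer. If False, no bias is applied to the output. If None, the bias is initialized to 0. Default: None.
+Returns: a 2-D tensor with shape [batch_size, size]
+Return type: Variable
+**Code example**
+.. code-block:: python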
+    import paddle.fluid as fluid
+    import numpy
+    with fluid.dygraph.guard():
+        layer1 = numpy.random.random((5, 5)).astype('float32')
+        layer2 = numpy.random.random((5, 4)).astype('float32')
+        bilinearTensorProduct = fluid.dygraph.nn.BilinearTensorProduct(
+               'BilinearTensorProduct', size=1000)
+        ret = bilinearTensorProduct(fluid.dygraph.base.to_variable(layer1),
+                           fluid.dygraph.base.to_variable(layer2))
--- a/doc/fluid/api_cn/dygraph_cn/Conv2DTranspose_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/Conv2DTranspose_cn.rst
+.. _cn_api_fluid_dygraph_Conv2DTranspose:
+Conv2DTranspose
+-------------------------------
+.. py:class:: paddle.fluid.dygraph.Conv2DTranspose(name_scope, num_filters, output_size=None, filter_size=None, padding=0, stride=1, dilation=1, groups=None, param_attr=None, bias_attr=None, use_cudnn=True, act=None)
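+2-D convolution transpose layer (Convolution2D transpose layer)
+This layer computes its output from the input, the filter, and the dilation, stride, and padding parameters. Input and Output are in NCHW format, where ``N`` is the batch size, ``C`` the number of channels, ``H`` the feature height, and ``W`` the feature width. The dilation, stride, and padding parameters each contain two elements, which give the value along the height and the width. For details of the convolution transpose layer, see the explanation below and the reference_ . If ``bias_attr`` and ``act`` are not ``None``, a bias is added to the convolution output and the corresponding activation function is applied to the final result.
+.. _reference: http://www.matthewzeiler.com/wp-content/uploads/2017/07/cvpr2010.pdf
+The input :math:`X` and the output :math:`Out` are related as follows: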
+.. math::
+                        Out=\sigma (W*X+b)\\
+where:
+    -  :math:`X` : the input tensor, in ``NCHW`` format
+    -  :math:`W` : the filter tensor, in ``MCHW`` format
+    -  :math:`*` : the convolution operation
+    -  :math:`b` : the bias, a 2-D tensor with shape ``[M,1]``
+    -  :math:`\sigma` : the activation function
+    -  :math:`Out` : the output; the ``shape`` of Out may differ from that of ``X``
+**Example**:
+- Input:
+  Input tensor shape: :math:`(N, C_{in}, H_{in}, W_{in})`
+  Filter shape: :math:`(C_{in}, C_{out}, H_f, W_f)`
+- Output:
+  Output tensor shape: :math:`(N, C_{out}, H_{out}, W_{out})`
+where
+.. math::
+        & H'_{out} = (H_{in}-1)*strides[0]-2*paddings[0]+dilations[0]*(H_f-1)+1\\
+        & W'_{out} = (W_{in}-1)*strides[1]-2*paddings[1]+dilations[1]*(W_f-1)+1 \\
+        & H_{out}\in[H'_{out},H'_{out} + strides[0])\\
+        & W_{out}\in[W'_{out},W'_{out} + strides[1])\\
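The output-size formulas above can be evaluated for one spatial dimension with a few lines of Python. This is a sketch for checking numbers by hand; ``conv2d_transpose_out_range`` is a hypothetical helper, not a Paddle function.

```python
def conv2d_transpose_out_range(in_size, filter_size, stride=1, padding=0, dilation=1):
    """Return (lower, upper) so that the valid output size lies in [lower, upper)."""
    lower = (in_size - 1) * stride - 2 * padding + dilation * (filter_size - 1) + 1
    return lower, lower + stride


# A 32-pixel input with a 3-pixel filter, default stride, padding and dilation.
default_range = conv2d_transpose_out_range(32, 3)
# The same input with stride 2: two output sizes become valid.
strided_range = conv2d_transpose_out_range(32, 3, stride=2)
```

The half-open interval explains why ``output_size`` can be given explicitly: with a stride larger than 1, more than one output size is consistent with the same input size.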
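+Parameters:
+    - **name_scope** (str) - The name of this class
+    - **num_filters** (int) - The number of filters (convolution kernels), equal to the number of channels of the output image
+    - **output_size** (int|tuple|None) - The size of the output image. If output_size is a tuple, it has the form (image_H, image_W), and both values must be integers. If output_size=None, output_size is computed internally from filter_size, padding, and stride. If output_size and filter_size are both specified, they must satisfy the formulas above. Default: None.
+    - **filter_size** (int|tuple|None) - The filter size. If filter_size is a tuple, it has the form (filter_size_H, filter_size_W). Otherwise the filter is square. If filter_size=None, the filter size is computed internally from the output size. Default: None.
+    - **padding** (int|tuple) - The padding size. If padding is a tuple, it must contain two integers (padding_H, padding_W). Otherwise, padding_H = padding_W = padding. Default: padding = 0.
+    - **stride** (int|tuple) - The stride size. If stride is a tuple, it has the form (stride_H, stride_W). Otherwise, stride_H = stride_W = stride. Default: stride = 1.
+    - **dilation** (int|tuple) - The dilation size. If dilation is a tuple, it has the form (dilation_H, dilation_W). Otherwise, dilation_H = dilation_W = dilation. Default: dilation = 1.
+    - **groups** (int) - The number of groups of the Conv2d transpose layer. Inspired by the grouped convolution in Alex Krizhevsky's Deep CNN paper: when group=2, the first half of the filters is connected only to the first half of the input channels, and the second half of the filters only to the second half of the input channels. Default: group = 1.
+    - **param_attr** (ParamAttr|None) - Attribute of the learnable parameters/weights of conv2d_transpose. If param_attr is None or one attribute of ParamAttr, conv2d_transpose creates a ParamAttr as param_attr. If no initializer is set for param_attr, Xavier initialization is used. Default: None.
+    - **bias_attr** (ParamAttr|bool|None) - Attribute of the bias of conv2d_transpose. If set to False, no bias is added to the output units. If bias_attr is None or one attribute of ParamAttr, conv2d_transpose creates a ParamAttr as bias_attr. If no initializer is set for bias_attr, the bias is initialized to zero. Default: None.
+    - **use_cudnn** (bool) - Whether to use the cudnn kernel; only effective when the cudnn library is installed. Default: True.
+    - **act** (str) - Activation type. If set to None, no activation is applied. Default: None.
+Returns: a tensor storing the result of the convolution transpose.
+Return type: Variable
+Raises:
+    -  ``ValueError`` : if the input shape, filter_size, stride, padding, and groups do not match
+**Code example**
+.. code-block:: python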
+    import paddle.fluid as fluid
+    import numpy
+    with fluid.dygraph.guard():
+        # NCHW input: a batch of 1 image with 3 channels and 32x32 pixels
+        data = numpy.random.random((1, 3, 32, 32)).astype('float32')
+        conv2DTranspose = fluid.dygraph.nn.Conv2DTranspose(
+              'Conv2DTranspose', num_filters=2, filter_size=3)
+        ret = conv2DTranspose(fluid.dygraph.base.to_variable(data))
--- a/doc/fluid/api_cn/dygraph_cn/Conv2D_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/Conv2D_cn.rst
+.. _cn_api_fluid_dygraph_Conv2D:
+Conv2D
+-------------------------------
+.. py:class:: paddle.fluid.dygraph.Conv2D(name_scope, num_filters, filter_size, stride=1, padding=0, dilation=1, groups=None, param_attr=None, bias_attr=None, use_cudnn=True, act=None, dtype='float32')
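+The 2-D convolution layer (Convolution2D layer) computes its output from the input, filter, stride, padding, dilation, and groups parameters. Input and output are in NCHW format, where N is the batch size, C the number of channels, H the feature height, and W the feature width. The filter is in MCHW format, where M is the number of output image channels, C the number of input image channels, H the filter height, and W the filter width. If groups is greater than 1, C equals the number of input image channels divided by groups. For details, see UFLDL's `convolution <http://ufldl.stanford.edu/tutorial/supervised/FeatureExtractionUsingConvolution/>`_ tutorial. If a bias attribute and an activation type are provided, the bias is added to the convolution result and the activation is then applied to the final result.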
+对每个输入X，有等式：
+.. math::
+    Out = \sigma \left ( W * X + b \right )
+其中：
+    - :math:`X` ：输入值，NCHW格式的张量（Tensor）
+    - :math:`W` ：滤波器值，MCHW格式的张量（Tensor）
+    - :math:`*` ： 卷积操作
+    - :math:`b` ：Bias值，二维张量（Tensor），shape为 ``[M,1]``
+    - :math:`\sigma` ：激活函数
+    - :math:`Out` ：输出值，``Out`` 和 ``X`` 的shape可能不同
+**示例**
+- 输入：
+  输入shape：:math:`( N,C_{in},H_{in},W_{in} )`
+  滤波器shape： :math:`( C_{out},C_{in},H_{f},W_{f} )`
+- 输出：
+  输出shape： :math:`( N,C_{out},H_{out},W_{out} )`
+其中
+.. math::
+    H_{out} = \frac{\left ( H_{in}+2*paddings[0]-\left ( dilations[0]*\left ( H_{f}-1 \right )+1 \right ) \right )}{strides[0]}+1
+    W_{out} = \frac{\left ( W_{in}+2*paddings[1]-\left ( dilations[1]*\left ( W_{f}-1 \right )+1 \right ) \right )}{strides[1]}+1
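+The output-size formula above can be checked with a few lines of plain Python. This is a minimal sketch, not Paddle code: the helper name ``conv_out_size`` is hypothetical, and rounding the division down is an assumption, since the formula itself does not state how a non-integer quotient is handled.

```python
def conv_out_size(in_size, filter_size, stride=1, padding=0, dilation=1):
    """Output size along one spatial axis, following the formula above.

    Floor division is an assumption; the formula only shows a fraction.
    """
    effective_filter = dilation * (filter_size - 1) + 1
    return (in_size + 2 * padding - effective_filter) // stride + 1


# Input of shape (N=10, C=3, H=32, W=32), 2 filters of size 3, default stride/padding.
h_out = conv_out_size(32, 3)
w_out = conv_out_size(32, 3)
out_shape = (10, 2, h_out, w_out)
print(out_shape)
```

+With the default stride, padding, and dilation, a 3x3 filter shrinks each spatial axis by two.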
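+Parameters:
+    - **name_scope** (str) - Name of this class
+    - **num_filters** (int) - Number of filters, equal to the number of output image channels
+    - **filter_size** (int|tuple|None) - Filter size. If filter_size is a tuple, it must contain two integers, (filter_size_H, filter_size_W). Otherwise, the filter is square
+    - **stride** (int|tuple) - Stride size. If stride is a tuple, it must contain two integers, (stride_H, stride_W). Otherwise, stride_H = stride_W = stride. Default: stride = 1
+    - **padding** (int|tuple) - Padding size. If padding is a tuple, it must contain two integers, (padding_H, padding_W). Otherwise, padding_H = padding_W = padding. Default: padding = 0
+    - **dilation** (int|tuple) - Dilation size. If dilation is a tuple, it must contain two integers, (dilation_H, dilation_W). Otherwise, dilation_H = dilation_W = dilation. Default: dilation = 1
+    - **groups** (int) - Number of groups of the Conv2D layer. Following the grouped convolution in Alex Krizhevsky's deep CNN paper: when groups=2, the first half of the filters is connected only to the first half of the input channels, and the second half of the filters only to the second half of the input channels. Default: groups = 1
+    - **param_attr** (ParamAttr|None) - Parameter attribute of the learnable weights of conv2d. If set to None or to one attribute of ParamAttr, conv2d creates a ParamAttr as param_attr. If the initializer of param_attr is not set, the parameter is initialized with :math:`Normal(0.0, std)` , where std is :math:`\left ( \frac{2.0}{filter\_elem\_num} \right )^{0.5}` . Default: None
+    - **bias_attr** (ParamAttr|bool|None) - Parameter attribute of the bias of conv2d. If set to False, no bias is added to the output. If set to None or to one attribute of ParamAttr, conv2d creates a ParamAttr as bias_attr. If the initializer of bias_attr is not set, the bias is initialized to 0. Default: None
+    - **use_cudnn** (bool) - Whether to use the cuDNN kernel; only takes effect when the cuDNN library is installed. Default: True
+    - **act** (str) - Activation type. If set to None, no activation is applied. Default: None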
+抛出异常:
+  - ``ValueError`` - 如果输入shape和filter_size，stride,padding和groups不匹配。
+**代码示例**
+.. code-block:: python
+    from paddle.fluid.dygraph.base import to_variable
+    import paddle.fluid as fluid
+    from paddle.fluid.dygraph import Conv2D
+    import numpy as np
+    data = np.random.uniform( -1, 1, [10, 3, 32, 32] ).astype('float32')
+    with fluid.dygraph.guard():
+        conv2d = Conv2D( "conv2d", 2, 3)
+        data = to_variable( data )
+        conv = conv2d( data )
--- a/doc/fluid/api_cn/dygraph_cn/Conv3DTranspose_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/Conv3DTranspose_cn.rst
+.. _cn_api_fluid_dygraph_Conv3DTranspose:
+Conv3DTranspose
+-------------------------------
+.. py:class:: paddle.fluid.dygraph.Conv3DTranspose(name_scope, num_filters, output_size=None, filter_size=None, padding=0, stride=1, dilation=1, groups=None, param_attr=None, bias_attr=None, use_cudnn=True, act=None, name=None)
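+3-D convolution transpose layer (Convolution3D transpose layer)
+This layer computes its output from the input, filter, dilation, stride, and padding. Input and output are in NCDHW format, where ``N`` is the batch size, ``C`` the number of channels, ``D`` the feature depth, ``H`` the feature height, and ``W`` the feature width. The dilation, stride, and padding parameters each contain three elements, for depth, height, and width respectively. For details of the convolution transpose layer, see the explanation below and the reference_ . If ``bias_attr`` and ``act`` are not None, a bias is added to the convolution output and the corresponding activation is applied to the final result.
+.. _reference: http://www.matthewzeiler.com/wp-content/uploads/2017/07/cvpr2010.pdf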
+The input X and the output Out are related by:
+.. math::
+    Out = \sigma (W * X + b)
+where:
+    -  :math:`X` : input tensor in ``NCDHW`` format
+    -  :math:`W` : filter tensor in ``MCDHW`` format
+    -  :math:`*` : convolution operation
+    -  :math:`b` : bias, a 2-D tensor with shape ``[M,1]``
+    -  :math:`\sigma` : activation function
+    -  :math:`Out` : output value; ``Out`` and ``X`` may have different shapes
+**样例**
+输入:
+    输入形状: :math:`(N,C_{in},D_{in},H_{in},W_{in})` 
+    Filter形状: :math:`(C_{in},C_{out},D_f,H_f,W_f)` 
+输出:
+    输出形状: :math:`(N,C_{out},D_{out},H_{out},W_{out})`
+其中：
+.. math::
+    D_{out}=(D_{in}-1)*strides[0]-2*paddings[0]+dilations[0]*(D_f-1)+1
+    H_{out}=(H_{in}-1)*strides[1]-2*paddings[1]+dilations[1]*(H_f-1)+1
+    W_{out}=(W_{in}-1)*strides[2]-2*paddings[2]+dilations[2]*(W_f-1)+1
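+To make the three formulas above concrete, the sketch below evaluates them for the input used in the code example of this section. The helper name ``conv_transpose_out_size`` is hypothetical and is not part of Paddle; it only restates the formula for one axis.

```python
def conv_transpose_out_size(in_size, filter_size, stride=1, padding=0, dilation=1):
    """Output size along one axis of a transposed convolution (formula above)."""
    return (in_size - 1) * stride - 2 * padding + dilation * (filter_size - 1) + 1


# Input of shape (N=5, C=3, D=12, H=32, W=32), num_filters=12, filter_size=12.
in_dhw = (12, 32, 32)
out_dhw = tuple(conv_transpose_out_size(size, 12) for size in in_dhw)
out_shape = (5, 12) + out_dhw
print(out_shape)
```

+Unlike an ordinary convolution, the transposed form grows each axis: with stride 1 and no padding, a filter of size 12 adds 11 to every spatial dimension.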
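+Parameters:
+      - **name_scope** (str) - Name of this class
+      - **num_filters** (int) - Number of filters (convolution kernels), equal to the number of channels of the output image
+      - **output_size** (int|tuple|None) - Output image size. If ``output_size`` is a tuple, it must contain three integers, (image_D, image_H, image_W). If ``output_size=None`` , it is computed internally from filter_size, padding, and stride. If ``output_size`` and ``filter_size`` are both given, they must satisfy the formulas above.
+      - **filter_size** (int|tuple|None) - Filter size. If ``filter_size`` is a tuple, it has the form (filter_size_D, filter_size_H, filter_size_W). Otherwise, the filter has the same size along every axis. If ``filter_size=None`` , it is computed internally from ``output_size`` .
+      - **padding** (int|tuple) - Padding size. If ``padding`` is a tuple, it must contain three integers, (padding_D, padding_H, padding_W). Otherwise, padding_D = padding_H = padding_W = padding. Default: padding = 0.
+      - **stride** (int|tuple) - Stride size. If ``stride`` is a tuple, it has the form (stride_D, stride_H, stride_W). Otherwise, stride_D = stride_H = stride_W = stride. Default: stride = 1.
+      - **dilation** (int|tuple) - Dilation size. If ``dilation`` is a tuple, it has the form (dilation_D, dilation_H, dilation_W). Otherwise, dilation_D = dilation_H = dilation_W = dilation. Default: dilation = 1.
+      - **groups** (int) - Number of groups of the Conv3DTranspose layer. Following the grouped convolution in Alex Krizhevsky's deep CNN paper: when groups=2, the first half of the filters is connected only to the first half of the input channels, and the second half of the filters only to the second half of the input channels. Default: groups = 1.
+      - **param_attr** (ParamAttr|None) - Parameter attribute of the learnable weights of conv3d_transpose. If set to None or to one attribute of ParamAttr, conv3d_transpose creates a ParamAttr as param_attr. If the initializer of param_attr is not set, the parameter is initialized with Xavier. Default: None.
+      - **bias_attr** (ParamAttr|bool|None) - Parameter attribute of the bias of conv3d_transpose. If set to False, no bias is added to the output units. If set to None or to one attribute of ParamAttr, conv3d_transpose creates a ParamAttr as bias_attr. If the initializer of bias_attr is not set, the bias is initialized to zero. Default: None.
+      - **use_cudnn** (bool) - Whether to use the cuDNN kernel; only takes effect when the cuDNN library is installed. Default: True.
+      - **act** (str) - Activation type. If set to None, no activation is applied. Default: None.
+      - **name** (str|None) - Name of this layer (optional). If set to None, the layer is named automatically. Default: None.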
+返回： 存储卷积转置结果的张量。
+返回类型: 变量（variable）
+抛出异常:
+    -  ``ValueError``  - 如果输入的shape、filter_size、stride、padding和groups不匹配，抛出ValueError
+**代码示例**
+..  code-block:: python
+    import paddle.fluid as fluid
+    import numpy
+    with fluid.dygraph.guard():
+        data = numpy.random.random((5, 3, 12, 32, 32)).astype('float32')
+        conv3dTranspose = fluid.dygraph.nn.Conv3DTranspose(
+               'Conv3DTranspose',
+               num_filters=12,
+               filter_size=12,
+               use_cudnn=False)
+        ret = conv3dTranspose(fluid.dygraph.base.to_variable(data))
--- a/doc/fluid/api_cn/dygraph_cn/Conv3D_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/Conv3D_cn.rst
+.. _cn_api_fluid_dygraph_Conv3D:
+Conv3D
+-------------------------------
+.. py:class:: paddle.fluid.dygraph.Conv3D(name_scope, num_filters, filter_size, stride=1, padding=0, dilation=1, groups=None, param_attr=None, bias_attr=None, use_cudnn=True, act=None)
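+The 3-D convolution layer (Convolution3D layer) computes its output from the input, filter, stride, padding, dilation, and groups parameters. Input and output are in NCDHW format, where N is the batch size, C the number of channels, D the feature depth, H the feature height, and W the feature width. Convolution3D is similar to Convolution2D but has one extra dimension, depth. If a bias attribute and an activation type are provided, the bias is added to the convolution result and the activation is then applied to the final result.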
+对每个输入X，有等式：
+.. math::
+    Out = \sigma \left ( W * X + b \right )
+其中：
+    - :math:`X` ：输入值，NCDHW格式的张量（Tensor）
+    - :math:`W` ：滤波器值，MCDHW格式的张量（Tensor）
+    - :math:`*` ： 卷积操作
+    - :math:`b` ：Bias值，二维张量（Tensor），形为 ``[M,1]``
+    - :math:`\sigma` ：激活函数
+    - :math:`Out` ：输出值, 和 ``X`` 的形状可能不同
+**示例**
+- 输入：
+    输入shape： :math:`(N, C_{in}, D_{in}, H_{in}, W_{in})`
+    滤波器shape： :math:`(C_{out}, C_{in}, D_f, H_f, W_f)`
+- 输出：
+    输出shape： :math:`(N, C_{out}, D_{out}, H_{out}, W_{out})`
+其中
+.. math::
+    D_{out}&= \frac{(D_{in} + 2 * paddings[0] - (dilations[0] * (D_f - 1) + 1))}{strides[0]} + 1 \\
+    H_{out}&= \frac{(H_{in} + 2 * paddings[1] - (dilations[1] * (H_f - 1) + 1))}{strides[1]} + 1 \\
+    W_{out}&= \frac{(W_{in} + 2 * paddings[2] - (dilations[2] * (W_f - 1) + 1))}{strides[2]} + 1
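+Parameters:
+    - **name_scope** (str) - Name of this class
+    - **num_filters** (int) - Number of filters, equal to the number of output image channels
+    - **filter_size** (int|tuple|None) - Filter size. If filter_size is a tuple, it must contain three integers, (filter_size_D, filter_size_H, filter_size_W). Otherwise, the filter is a cube whose edge length is the given int.
+    - **stride** (int|tuple) - Stride size. If stride is a tuple, it must contain three integers, (stride_D, stride_H, stride_W). Otherwise, stride_D = stride_H = stride_W = stride. Default: stride = 1
+    - **padding** (int|tuple) - Padding size. If padding is a tuple, it must contain three integers, (padding_D, padding_H, padding_W). Otherwise, padding_D = padding_H = padding_W = padding. Default: padding = 0
+    - **dilation** (int|tuple) - Dilation size. If dilation is a tuple, it must contain three integers, (dilation_D, dilation_H, dilation_W). Otherwise, dilation_D = dilation_H = dilation_W = dilation. Default: dilation = 1
+    - **groups** (int) - Number of groups of the Conv3D layer. Following the grouped convolution in Alex Krizhevsky's deep CNN paper: when groups=2, the first half of the filters is connected only to the first half of the input channels, and the second half of the filters only to the second half of the input channels. Default: groups = 1
+    - **param_attr** (ParamAttr|None) - Parameter attribute of the learnable weights of conv3d. If set to None or to one attribute of ParamAttr, conv3d creates a ParamAttr as param_attr. If the initializer of param_attr is not set, the parameter is initialized with :math:`Normal(0.0, std)` , where std is :math:`\left ( \frac{2.0}{filter\_elem\_num} \right )^{0.5}` . Default: None
+    - **bias_attr** (ParamAttr|bool|None) - Parameter attribute of the bias of conv3d. If set to False, no bias is added to the output. If set to None or to one attribute of ParamAttr, conv3d creates a ParamAttr as bias_attr. If the initializer of bias_attr is not set, the bias is initialized to 0. Default: None
+    - **use_cudnn** (bool) - Whether to use the cuDNN kernel; only takes effect when the cuDNN library is installed. Default: True
+    - **act** (str) - Activation type. If set to None, no activation is applied. Default: None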
+返回：张量，存储卷积和非线性激活结果
+返回类型：变量（Variable）
+抛出异常：
+  - ``ValueError`` - 如果 ``input`` 的形和 ``filter_size`` ， ``stride`` , ``padding`` 和 ``groups`` 不匹配。
+**代码示例**：
+.. code-block:: python
+    import paddle.fluid as fluid
+    import numpy
+    with fluid.dygraph.guard():
+        data = numpy.random.random((5, 3, 12, 32, 32)).astype('float32')
+        conv3d = fluid.dygraph.nn.Conv3D(
+              'Conv3D', num_filters=2, filter_size=3, act="relu")
+        ret = conv3d(fluid.dygraph.base.to_variable(data))
--- a/doc/fluid/api_cn/dygraph_cn/CosineDecay_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/CosineDecay_cn.rst
+.. _cn_api_fluid_dygraph_CosineDecay:
+CosineDecay
+-------------------------------
+.. py:class:: paddle.fluid.dygraph.CosineDecay(learning_rate, step_each_epoch, epochs, begin=0, step=1, dtype='float32')
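+Adjusts the learning rate with a cosine decay schedule.
+When training a model, it is recommended to lower the learning rate as training progresses. With this method, the learning rate is decayed according to the following cosine schedule:
+.. math::
+    decayed\_lr = learning\_rate * 0.5 * (\cos(epoch * \frac{\pi}{epochs}) + 1)
+Parameters:
+    - **learning_rate** (Variable | float) - Initial learning rate.
+    - **step_each_epoch** (int) - Number of steps in one epoch.
+    - **epochs** (int) - Total number of epochs.
+    - **begin** (int) - Starting step. Default: 0.
+    - **step** (int) - Step size. Default: 1.
+    - **dtype**  (str) - dtype of the learning rate. Default: 'float32'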
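+The schedule can be reproduced outside Paddle with the standard library. This is a sketch under two assumptions that the text above does not state: the formula is read as the cosine of ``epoch * pi / epochs`` , and ``epoch`` is taken to be the global step divided by ``step_each_epoch`` and rounded down. The function name ``cosine_decay`` and the ``global_step`` argument are illustrative only.

```python
import math


def cosine_decay(learning_rate, step_each_epoch, epochs, global_step):
    """Cosine-decayed learning rate at a given global step.

    Assumption: the current epoch is floor(global_step / step_each_epoch).
    """
    epoch = global_step // step_each_epoch
    return learning_rate * 0.5 * (math.cos(epoch * math.pi / epochs) + 1)


# Same hyperparameters as the code example in this section: base_lr=0.1,
# 10000 steps per epoch, 120 epochs.
for step in (0, 600000, 1200000):
    print(step, cosine_decay(0.1, 10000, 120, step))
```

+The rate starts at the initial value, reaches half of it midway through training, and approaches zero in the final epoch.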
+**代码示例**
+.. code-block:: python
+    import paddle.fluid as fluid
+    base_lr = 0.1
+    with fluid.dygraph.guard():
+        optimizer  = fluid.optimizer.SGD(
+            learning_rate = fluid.dygraph.CosineDecay(
+                    base_lr, 10000, 120) )
--- a/doc/fluid/api_cn/dygraph_cn/Embedding_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/Embedding_cn.rst
+.. _cn_api_fluid_dygraph_Embedding:
+Embedding
+-------------------------------
+.. py:class:: paddle.fluid.dygraph.Embedding(name_scope, size, is_sparse=False, is_distributed=False, padding_idx=None, param_attr=None, dtype='float32')
+Embedding层
+该层用于在查找表中查找 ``input`` 中的ID对应的embeddings。查找的结果是input里每个ID对应的embedding。
+所有的输入变量都作为局部变量传入LayerHelper构造器
+参数：
+    - **name_scope** (str)-该类的名称。
+    - **size** (tuple|list)-查找表参数的维度。应当有两个参数，一个代表嵌入矩阵字典的大小，一个代表每个嵌入向量的大小。
+    - **is_sparse** (bool)-代表是否用稀疏更新的标志。
+    - **is_distributed** (bool)-是否从远程参数服务端运行查找表。
+    - **padding_idx** (int|long|None) - If ``None`` , the lookup result is unaffected. If ``padding_idx`` is not None, the output is filled with zeros wherever ``padding_idx`` appears in input. If ``padding_idx`` < 0, the ``padding_idx`` actually used in the lookup is :math:`size[0] + padding\_idx` . Default: None.
+    - **param_attr** (ParamAttr) - Parameter attribute of this layer. Default: None.
+    - **dtype** (np.dtype|core.VarDesc.VarType|str) - Data type: float32, float16, int, etc. Default: 'float32'
+返回：张量，存储已有输入的嵌入矩阵。
+返回类型：变量(Variable)
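+The effect of ``padding_idx`` is easiest to see on a tiny table. The following is a pure-Python sketch of the lookup semantics described above, not the Paddle implementation; ``embedding_lookup`` is a hypothetical helper, and resolving a negative ``padding_idx`` as ``size[0] + padding_idx`` is the reading assumed here.

```python
def embedding_lookup(table, ids, padding_idx=None):
    """Return one embedding row per id; rows for padding_idx are all zeros."""
    if padding_idx is not None and padding_idx < 0:
        # Assumed resolution of a negative index: size[0] + padding_idx.
        padding_idx = len(table) + padding_idx
    dim = len(table[0])
    return [[0.0] * dim if i == padding_idx else list(table[i]) for i in ids]


table = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]  # size = [3, 2]
print(embedding_lookup(table, [0, 2]))                  # no padding index
print(embedding_lookup(table, [0, 2], padding_idx=-1))  # -1 resolves to id 2
```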
+**代码示例**
+.. code-block:: python
+    import paddle.fluid as fluid
+    import paddle.fluid.dygraph.base as base
+    import numpy as np
+    inp_word = np.array([[[1]]]).astype('int64')
+    dict_size = 20
+    with fluid.dygraph.guard():
+        emb = fluid.dygraph.Embedding(
+            name_scope='embedding',
+            size=[dict_size, 32],
+            param_attr='emb.w',
+            is_sparse=False)
+        static_rlt3 = emb(base.to_variable(inp_word))
--- a/doc/fluid/api_cn/dygraph_cn/ExponentialDecay_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/ExponentialDecay_cn.rst
+.. _cn_api_fluid_dygraph_ExponentialDecay:
+ExponentialDecay
+-------------------------------
+.. py:class:: paddle.fluid.dygraph.ExponentialDecay(learning_rate, decay_steps, decay_rate, staircase=False, begin=0, step=1, dtype='float32')
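+Applies exponential decay to the learning rate.
+When training a model, it is recommended to lower the learning rate as training progresses. The learning rate is decayed by ``decay_rate`` every ``decay_steps`` steps.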
+.. code-block:: text
+    if staircase == True:
+        decayed_learning_rate = learning_rate * decay_rate ^ floor(global_step / decay_steps)
+    else:
+        decayed_learning_rate = learning_rate * decay_rate ^ (global_step / decay_steps)
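+Parameters:
+    - **learning_rate** (Variable|float) - Initial learning rate
+    - **decay_steps** (int) - See the decay computation above
+    - **decay_rate** (float) - Decay rate. See the decay computation above
+    - **staircase** (bool) - If True, the learning rate is decayed at discrete intervals. Default: False
+    - **begin** (int) - Starting step. Default: 0.
+    - **step** (int) - Step size. Default: 1.
+    - **dtype**  (str) - dtype of the learning rate. Default: 'float32'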
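+The pseudocode above translates directly into Python. This is an illustrative sketch rather than the library's code; the function name ``exponential_decay`` and the explicit ``global_step`` argument are assumptions made for the example.

```python
import math


def exponential_decay(learning_rate, decay_steps, decay_rate, global_step,
                      staircase=False):
    """Exponentially decayed learning rate, following the pseudocode above."""
    exponent = global_step / decay_steps
    if staircase:
        # Staircase mode only changes the rate once per decay_steps steps.
        exponent = math.floor(exponent)
    return learning_rate * decay_rate ** exponent


print(exponential_decay(0.1, 10000, 0.5, 15000, staircase=True))
print(exponential_decay(0.1, 10000, 0.5, 15000, staircase=False))
```

+With ``staircase=True`` the rate stays constant inside each block of ``decay_steps`` steps; otherwise it decreases smoothly at every step.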
+**代码示例**
+.. code-block:: python
+    import paddle.fluid as fluid
+    base_lr = 0.1
+    with fluid.dygraph.guard():
+        sgd_optimizer = fluid.optimizer.SGD(
+              learning_rate=fluid.dygraph.ExponentialDecay(
+                  learning_rate=base_lr,
+                  decay_steps=10000,
+                  decay_rate=0.5,
+                  staircase=True))
--- a/doc/fluid/api_cn/dygraph_cn/FC_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/FC_cn.rst
+.. _cn_api_fluid_dygraph_FC:
+FC
+-------------------------------
+.. py:class:: paddle.fluid.dygraph.FC(name_scope, size, num_flatten_dims=1, param_attr=None, bias_attr=None, act=None, is_test=False, dtype='float32')
+**全连接层**
+该函数在神经网络中建立一个全连接层。 它可以将一个或多个tensor（ ``input`` 可以是一个list或者Variable，详见参数说明）作为自己的输入，并为每个输入的tensor创立一个变量，称为“权”（weights），等价于一个从每个输入单元到每个输出单元的全连接权矩阵。FC层用每个tensor和它对应的权相乘得到形状为[M, size]输出tensor，M是批大小。如果有多个输入tensor，那么形状为[M, size]的多个输出张量的结果将会被加起来。如果 ``bias_attr`` 非空，则会新创建一个偏向变量（bias variable），并把它加入到输出结果的运算中。最后，如果 ``act`` 非空，它也会加入最终输出的计算中。
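+When the input is a single tensor:
+.. math::
+        Out = Act({XW + b})
+When the input is a list of tensors:
+.. math::
+        Out=Act(\sum^{N-1}_{i=0}X_iW_i+b)
+In the equations above:
+  - :math:`N` : number of inputs; if the input is a list of Variables, N equals len(input)
+  - :math:`X_i` : the i-th input tensor
+  - :math:`W_i` : the i-th weight matrix, corresponding to the i-th input tensor
+  - :math:`b` : bias parameter created by this layer
+  - :math:`Act` : activation function
+  - :math:`Out` : output tensor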
+::
+            Given:
+                data_1.data = [[[0.1, 0.2],
+                               [0.3, 0.4]]]
+                data_1.shape = (1, 2, 2) # 1 is batch_size
+                data_2 = [[[0.1, 0.2, 0.3]]]
+                data_2.shape = (1, 1, 3)
+                out = fluid.layers.fc(input=[data_1, data_2], size=2)
+            Then:
+                out.data = [[0.18669507, 0.1893476]]
+                out.shape = (1, 2)
+参数:
+  - **name_scope** (str) – 该类的名称
+  - **size** (int) – 该层输出单元的数目
+  - **num_flatten_dims** (int, 默认为1) – fc层可以接受一个维度大于2的tensor。此时， 它首先会被扁平化(flattened)为一个二维矩阵。 参数 ``num_flatten_dims`` 决定了输入tensor的flattened方式: 前 ``num_flatten_dims`` (包含边界，从1开始数) 个维度会被扁平化为最终矩阵的第一维 (维度即为矩阵的高), 剩下的 rank(X) - num_flatten_dims 维被扁平化为最终矩阵的第二维 (即矩阵的宽)。 例如， 假设X是一个五维tensor，其形可描述为(2, 3, 4, 5, 6), 且num_flatten_dims = 3。那么扁平化的矩阵形状将会如此： (2 x 3 x 4, 5 x 6) = (24, 30)
+  - **param_attr** (ParamAttr|list of ParamAttr|None) – 该层可学习的参数/权的参数属性
+  - **bias_attr** (ParamAttr|list of ParamAttr, default None) – 该层bias变量的参数属性。如果值为False， 则bias变量不参与输出单元运算。 如果值为None，bias变量被初始化为0。默认为 None。
+  - **act** (str|None) – 应用于输出的Activation（激励函数）
+  - **is_test** (bool) – 表明当前执行是否处于测试阶段的标志
+  - **dtype** (str) – 权重的数据类型
+Raises: ``ValueError`` - if the input tensor has fewer than 2 dimensions
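+The flattening rule for ``num_flatten_dims`` can be verified with a short helper. This is a sketch only: ``flattened_shape`` is a hypothetical name, and it computes just the shape of the 2-D matrix described above, not the FC layer itself.

```python
import math


def flattened_shape(shape, num_flatten_dims):
    """Shape of the 2-D matrix an FC layer would see for a given input shape."""
    rows = math.prod(shape[:num_flatten_dims])  # leading dims -> matrix height
    cols = math.prod(shape[num_flatten_dims:])  # remaining dims -> matrix width
    return rows, cols


print(flattened_shape((2, 3, 4, 5, 6), 3))  # the example from the parameter list
print(flattened_shape((30, 10, 32), 2))     # the input used in the code example
```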
+**代码示例**
+..  code-block:: python
+    from paddle.fluid.dygraph.base import to_variable
+    import paddle.fluid as fluid
+    from paddle.fluid.dygraph import FC
+    import numpy as np
+    data = np.random.uniform( -1, 1, [30, 10, 32] ).astype('float32')
+    with fluid.dygraph.guard():
+        fc = FC( "fc", 64, num_flatten_dims=2)
+        data = to_variable( data )
+        conv = fc( data )
--- a/doc/fluid/api_cn/dygraph_cn/GRUUnit_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/GRUUnit_cn.rst
+.. _cn_api_fluid_dygraph_GRUUnit:
+GRUUnit
+-------------------------------
+.. py:class:: paddle.fluid.dygraph.GRUUnit(name_scope, size, param_attr=None, bias_attr=None, activation='tanh', gate_activation='sigmoid', origin_mode=False, dtype='float32')
+GRU unit layer. One GRU step is computed with the following equations.
+If origin_mode is True, the equations come from the paper
+`Learning Phrase Representations using RNN Encoder Decoder for Statistical Machine Translation <https://arxiv.org/pdf/1406.1078.pdf>`_ .
+The equations are:
+.. math::
+    u_t=actGate(xu_t+W_{u}h_{t-1}+b_u)
+.. math::
+    r_t=actGate(xr_t+W_{r}h_{t-1}+b_r)
+.. math::
+    m_t=actNode(xm_t+W_{c}dot(r_t,h_{t-1})+b_m)
+.. math::
+    h_t=dot((1-u_t),m_t)+dot(u_t,h_{t-1})
+If origin_mode is False, the equations come from the paper
+`Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling  <https://arxiv.org/pdf/1412.3555.pdf>`_ .
+.. math::
+    u_t & = actGate(xu_{t} + W_u h_{t-1} + b_u)\\
+    r_t & = actGate(xr_{t} + W_r h_{t-1} + b_r)\\
+    m_t & = actNode(xm_t + W_c dot(r_t, h_{t-1}) + b_m)\\
+    h_t & = dot((1-u_t), h_{t-1}) + dot(u_t, m_t)
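+The inputs of the GRU unit are :math:`z_t` and :math:`h_{t-1}` . In the equations above, :math:`z_t` is split into three parts: :math:`xu_t` , :math:`xr_t` , and :math:`xm_t` .
+This means that, to implement a full GRU layer for a batch of inputs, a fully connected layer is needed first to obtain :math:`z_t=W_{fc}x_t` .
+:math:`u_t` and :math:`r_t` are the update gate and the reset gate of the GRU unit, respectively.
+Unlike LSTM, GRU has one gate fewer (it has no counterpart of the LSTM forget gate). It does, however, produce an intermediate candidate hidden output,
+denoted :math:`m_t` . The layer has three outputs: :math:`h_t` , :math:`dot(r_t,h_{t-1})` , and the concatenation of :math:`u_t, r_t, m_t` .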
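+The only difference between the two modes is how the update gate mixes the previous state with the candidate state. The scalar sketch below illustrates this with hidden size 1, so every ``dot`` becomes an ordinary product. It is not the Paddle kernel: ``gru_step`` , its argument names, and the fixed sigmoid/tanh activations are assumptions made for illustration.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(xu, xr, xm, h_prev, w_u, w_r, w_c,
             b_u=0.0, b_r=0.0, b_m=0.0, origin_mode=False):
    """One GRU step for hidden size 1, following the equations above."""
    u = sigmoid(xu + w_u * h_prev + b_u)              # update gate
    r = sigmoid(xr + w_r * h_prev + b_r)              # reset gate
    m = math.tanh(xm + w_c * (r * h_prev) + b_m)      # candidate hidden state
    if origin_mode:
        h = (1.0 - u) * m + u * h_prev
    else:
        h = (1.0 - u) * h_prev + u * m
    # Mirrors the three outputs: hidden, reset-hidden, and the gate values.
    return h, r * h_prev, (u, r, m)


# A saturated update gate (u close to 1) exposes the difference between modes.
h_origin, _, _ = gru_step(100.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, origin_mode=True)
h_default, _, _ = gru_step(100.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, origin_mode=False)
print(h_origin, h_default)
```

+With the update gate saturated, ``origin_mode=True`` keeps the previous state, while ``origin_mode=False`` replaces it with the candidate.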
+参数:
+    - **name_scope** (str) – 该类的名称
+    - **size** (int) – 输入数据的维度
+    - **param_attr** (ParamAttr|None) – 可学习的隐藏层权重矩阵的参数属性。
+    注意：
+      - The weight matrix has shape :math:`(D×3D)` , where :math:`D` is the hidden size
+      - 该权重矩阵的所有元素由两部分组成， 一是update gate和reset gate的权重，形为 :math:`(D×2D)` ；二是候选隐藏状态（candidate hidden state）的权重矩阵，形为 :math:`(D×D)`
+      如果该函数参数值为None或者 ``ParamAttr`` 类中的属性之一，gru_unit则会创建一个 ``ParamAttr`` 类的对象作为param_attr。如果param_attr没有被初始化，那么会由Xavier来初始化它。默认值为None
+    - **bias_attr** (ParamAttr|bool|None) - GRU的bias变量的参数属性。形为 :math:`(1x3D)` 的bias连结（concatenate）在update gates（更新门），reset gates(重置门)以及candidate calculations（候选隐藏状态计算）中的bias。如果值为False，那么上述三者将没有bias参与运算。若值为None或者 ``ParamAttr`` 类中的属性之一，gru_unit则会创建一个 ``ParamAttr`` 类的对象作为 bias_attr。如果bias_attr没有被初始化，那它会被默认初始化为0。默认值为None。
+    - **activation** (str) –  神经元 “actNode” 的激励函数（activation）类型。默认类型为‘tanh’
+    - **gate_activation** (str) – 门 “actGate” 的激励函数（activation）类型。 默认类型为 ‘sigmoid’。
+    - **dtype** (str) – 该层的数据类型，默认为‘float32’。
+返回：  hidden value（隐藏状态的值），reset-hidden value(重置隐藏状态值)，gate values(门值)
+返回类型:  元组（tuple）
+**代码示例**
+.. code-block:: python
+    import paddle.fluid as fluid
+    import paddle.fluid.dygraph.base as base
+    import numpy
+    lod = [[2, 4, 3]]
+    D = 5
+    T = sum(lod[0])
+    hidden_input = numpy.random.rand(T, D).astype('float32')
+    with fluid.dygraph.guard():
+        # The input holds the three projected parts (xu, xr, xm), so its width is 3 * D.
+        x = numpy.random.rand(T, 3 * D).astype('float32')
+        gru = fluid.dygraph.GRUUnit('gru', size=D * 3)
+        dy_ret = gru(
+          base.to_variable(x), base.to_variable(hidden_input))
--- a/doc/fluid/api_cn/dygraph_cn/GroupNorm_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/GroupNorm_cn.rst
+.. _cn_api_fluid_dygraph_GroupNorm:
+GroupNorm
+-------------------------------
+.. py:class:: paddle.fluid.dygraph.GroupNorm(name_scope, groups, epsilon=1e-05, param_attr=None, bias_attr=None, act=None, data_layout='NCHW')
+**Group Normalization层**
+请参考 `Group Normalization <https://arxiv.org/abs/1803.08494>`_ 。
+参数：
+    - **name_scope** (str) - 该类名称
+    - **groups** (int) - 从 channel 中分离出来的 group 的数目
+    - **epsilon** (float) - 为防止方差除零，增加一个很小的值
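+    - **param_attr** (ParamAttr|None)  - Parameter attribute of the learnable scale :math:`g` . If set to False, no scale is applied to the output units. If set to None, the scale is initialized to 1. Default: None
+    - **bias_attr** (ParamAttr|None) - Parameter attribute of the learnable bias :math:`b` . If set to False, no bias is added to the output units. If set to None, the bias is initialized to zero. Default: None.
+    - **act** (str) - Activation to apply to the output of group normalization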
+    - **data_layout** (string|NCHW) - 只支持NCHW。
+返回： 一个张量变量，它是对输入进行 group normalization 后的结果。
+返回类型：Variable
+**代码示例**
+..  code-block:: python
+    import paddle.fluid as fluid
+    import numpy
+    with fluid.dygraph.guard():
+        x = numpy.random.random((8, 32, 32)).astype('float32')
+        groupNorm = fluid.dygraph.nn.GroupNorm('GroupNorm', groups=4)
+        ret = groupNorm(fluid.dygraph.base.to_variable(x))
--- a/doc/fluid/api_cn/dygraph_cn/InverseTimeDecay_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/InverseTimeDecay_cn.rst
+.. _cn_api_fluid_dygraph_InverseTimeDecay:
+InverseTimeDecay
+-------------------------------
+.. py:class:: paddle.fluid.dygraph.InverseTimeDecay(learning_rate, decay_steps, decay_rate, staircase=False, begin=0, step=1, dtype='float32')
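+Applies inverse time decay to the initial learning rate.
+When training a model, it is usually best to lower the learning rate as training progresses. This class applies an inverse decay function to the initial learning rate.
+.. code-block:: text
+    if staircase == True:
+         decayed_learning_rate = learning_rate / (1 + decay_rate * floor(global_step / decay_steps))
+    else:
+         decayed_learning_rate = learning_rate / (1 + decay_rate * global_step / decay_steps)
+Parameters:
+    - **learning_rate** (Variable|float) - Initial learning rate
+    - **decay_steps** (int) - See the decay computation above
+    - **decay_rate** (float) - Decay rate. See the decay computation above
+    - **staircase** (bool) - If True, the learning rate is decayed at discrete intervals. Default: False
+    - **begin** (int) - Starting step. Default: 0.
+    - **step** (int) - Step size. Default: 1.
+    - **dtype**  (str) - dtype of the learning rate. Default: 'float32'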
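+As a quick numerical check of the inverse time rule, the pseudocode above can be written as a small Python function. The name ``inverse_time_decay`` and the explicit ``global_step`` argument are illustrative assumptions, not Paddle API.

```python
import math


def inverse_time_decay(learning_rate, decay_steps, decay_rate, global_step,
                       staircase=False):
    """Inverse-time-decayed learning rate, following the pseudocode above."""
    ratio = global_step / decay_steps
    if staircase:
        ratio = math.floor(ratio)
    return learning_rate / (1 + decay_rate * ratio)


print(inverse_time_decay(0.1, 10000, 0.5, 15000, staircase=True))
print(inverse_time_decay(0.1, 10000, 0.5, 15000, staircase=False))
```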
+**代码示例**
+.. code-block:: python
+    import paddle.fluid as fluid
+    base_lr = 0.1
+    with fluid.dygraph.guard():
+        sgd_optimizer = fluid.optimizer.SGD(
+            learning_rate=fluid.dygraph.InverseTimeDecay(
+                  learning_rate=base_lr,
+                  decay_steps=10000,
+                  decay_rate=0.5,
+                  staircase=True))
--- a/doc/fluid/api_cn/dygraph_cn/LayerNorm_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/LayerNorm_cn.rst
+.. _cn_api_fluid_dygraph_LayerNorm:
+LayerNorm
+-------------------------------
+.. py:class:: paddle.fluid.dygraph.LayerNorm(name_scope, scale=True, shift=True, begin_norm_axis=1, epsilon=1e-05, param_attr=None, bias_attr=None, act=None)
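+Assuming the feature vectors lie on dimensions ``begin_norm_axis ... rank(input)`` , this layer computes the moment statistics of each feature vector :math:`a` of size ``H`` along those dimensions, then normalizes each feature vector with those statistics. Afterwards, if ``scale`` and ``shift`` are set, a learnable gain and bias are applied to the normalized tensor for scaling and shifting.
+See `Layer Normalization <https://arxiv.org/pdf/1607.06450v1.pdf>`_
+The formulas are:
+.. math::
+            \mu=\frac{1}{H}\sum_{i=1}^{H}a_i
+.. math::
+            \sigma=\sqrt{\frac{1}{H}\sum_{i=1}^{H}{(a_i-\mu)^2}}
+.. math::
+             h=f(\frac{g}{\sigma}(a-\mu) + b)
+- :math:`a` : vector representation of the summed inputs to the neurons in the layer
+- :math:`H` : number of hidden units in the layer
+- :math:`g` : trainable gain (scale) parameter
+- :math:`b` : trainable bias parameter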
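+The three formulas above can be traced on a single feature vector in plain Python. This is a sketch, not the layer's implementation: the name ``layer_norm`` is hypothetical, the activation :math:`f` is taken to be the identity, and adding ``epsilon`` to the variance inside the square root is an assumption based on the description of the ``epsilon`` parameter.

```python
import math


def layer_norm(a, g=1.0, b=0.0, epsilon=1e-5):
    """Normalize one feature vector `a` with scalar gain g and bias b."""
    H = len(a)
    mu = sum(a) / H
    variance = sum((x - mu) ** 2 for x in a) / H
    sigma = math.sqrt(variance + epsilon)  # epsilon placement is assumed
    return [g * (x - mu) / sigma + b for x in a]


y = layer_norm([1.0, 2.0, 3.0, 4.0])
print(y)
```

+With the default gain and bias, the result has mean close to zero and variance close to one.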
+参数:
+    - **name_scope** (str) – 该类的名称
+    - **scale** （bool） - 是否在归一化后学习自适应增益g。默认为True。
+    - **shift** （bool） - 是否在归一化后学习自适应偏差b。默认为True。
+    - **begin_norm_axis** (int) - Normalization is performed over dimensions ``begin_norm_axis`` to ``rank(input)`` . Default: 1.
+    - **epsilon** (float) - Small value added to the variance to prevent division by zero. Default: 1e-05.
+    - **param_attr** (ParamAttr | None) - Parameter attribute of the learnable gain g. If ``scale`` is False, ``param_attr`` is ignored. If ``scale`` is True and ``param_attr`` is None, a default ``ParamAttr`` is added as the gain and initialized to 1. Default: None.
+    - **bias_attr** (ParamAttr | None) - Parameter attribute of the learnable bias b. If ``shift`` is False, ``bias_attr`` is ignored. If ``shift`` is True and ``bias_attr`` is None, a default ``ParamAttr`` is added as the bias and initialized to 0. Default: None.
+    - **act** （str） - 激活函数。默认 None
+返回： 标准化后的结果
+**代码示例**
+..  code-block:: python
+    import paddle.fluid as fluid
+    import numpy
+    with fluid.dygraph.guard():
+        x = numpy.random.random((3, 32, 32)).astype('float32')
+        layerNorm = fluid.dygraph.nn.LayerNorm(
+              'LayerNorm', begin_norm_axis=1)
+        ret = layerNorm(fluid.dygraph.base.to_variable(x))
--- a/doc/fluid/api_cn/dygraph_cn/Layer_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/Layer_cn.rst
+.. _cn_api_fluid_dygraph_Layer:
+Layer
+-------------------------------
+.. py:class:: paddle.fluid.dygraph.Layer(name_scope, dtype=VarType.FP32)
+由多个算子组成的层。
+参数：
+    - **name_scope** - 层为其参数命名而采用的名称前缀。如果前缀为“my_model/layer_1”，在一个名为MyLayer的层中，参数名为“my_model/layer_1/MyLayer/w_n”，其中w是参数的基础名称，n为自动生成的具有唯一性的后缀。
+    - **dtype** - 层中变量的数据类型
+.. py:method:: full_name()
+层的全名。
+组成方式如下：
+name_scope + “/” + MyLayer.__class__.__name__
+返回：  层的全名
+.. py:method:: create_parameter(attr, shape, dtype, is_bias=False, default_initializer=None)
+为层(layers)创建参数。
+参数：
+    - **attr** (ParamAttr)- 参数的参数属性
+    - **shape** - 参数的形状
+    - **dtype** - 参数的数据类型
+    - **is_bias** - 是否为偏置bias参数      
+    - **default_initializer** - 设置参数的默认初始化方法
+返回：    创建的参数变量
+.. py:method:: create_variable(name=None, persistable=None, dtype=None, type=VarType.LOD_TENSOR)
+为层创建变量
+参数：
+    - **name** - 变量名
+    - **persistable** - 是否为持久性变量
+    - **dtype** - 变量中的数据类型
+    - **type** - 变量类型   
+返回： 创建的变量(Variable)
+.. py:method:: parameters(include_sublayers=True)
+返回一个由当前层及其子层的参数组成的列表。
+参数：
+    - **include_sublayers** - 如果为True，返回的列表中包含子层的参数
+返回：  一个由当前层及其子层的参数组成的列表
+.. py:method:: sublayers(include_sublayers=True)
+返回一个由所有子层组成的列表。
+参数：
+    - **include_sublayers** - 如果为True，则包括子层中的各层
+返回： 一个由所有子层组成的列表
+.. py:method:: add_sublayer(name, sublayer)
+添加子层实例。被添加的子层实例的访问方式和self.name类似。
+参数：
+    - **name** - 该子层的命名
+    - **sublayer** - Layer实例
+返回：   传入的子层
+.. py:method:: add_parameter(name, parameter)
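+Adds a parameter instance. The added parameter can be accessed in the same way as self.name.
+Parameters:
+    - **name** - Name of the parameter
+    - **parameter** - A Parameter instance
+Returns:   the parameter instance that was passed in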
--- a/doc/fluid/api_cn/dygraph_cn/NCE_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/NCE_cn.rst
+.. _cn_api_fluid_dygraph_NCE:
+NCE
+-------------------------------
+.. py:class:: paddle.fluid.dygraph.NCE(name_scope, num_total_classes, param_attr=None, bias_attr=None, num_neg_samples=None, sampler='uniform', custom_dist=None, seed=0, is_sparse=False)
+Computes and returns the noise-contrastive estimation training loss.
+See `Noise-contrastive estimation: A new estimation principle for unnormalized statistical models
+<http://www.jmlr.org/proceedings/papers/v9/gutmann10a/gutmann10a.pdf>`_ .
+By default this operator samples from a uniform distribution.
+参数:
+    - **name_scope** (str) – 该类的名称
+    - **num_total_classes** (int) - 所有样本中的类别的总数
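+    - **sample_weight** (Variable|None) - Stores the weight of each sample, with shape [batch_size, 1]. The default weight of each sample is 1.0
+    - **param_attr** (ParamAttr|None) - Parameter attribute of the learnable parameters/weights of nce. If set to None or to one attribute of ParamAttr, nce creates a ParamAttr as param_attr. If the initializer of param_attr is not set, the parameter is initialized with Xavier. Default: None
+    - **bias_attr** (ParamAttr|bool|None) -  Parameter attribute of the nce bias. If set to False, no bias is added to the output. If set to None or to one attribute of ParamAttr, nce creates a ParamAttr as bias_attr. If the initializer of bias_attr is not set, the bias is initialized to zero. Default: None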
+    - **num_neg_samples** (int) - 负样例的数量。默认值是10
+    - **name** (str|None) - 该layer的名称(可选)。如果设置为None，该层将被自动命名
+    - **sampler** (str) – 取样器，用于从负类别中进行取样。可以是 ‘uniform’, ‘log_uniform’ 或 ‘custom_dist’。 默认 ‘uniform’
+    - **custom_dist** (float[]) – 一个 float[] 并且它的长度为 ``num_total_classes`` 。  如果取样器类别为‘custom_dist’，则使用此参数。 custom_dist[i] 是第i个类别被取样的概率。默认为 None
+    - **seed** (int) – 取样器使用的seed。默认为0
+    - **is_sparse** (bool) – 标志位，指明是否使用稀疏更新,  :math:`weight@GRAD` 和 :math:`bias@GRAD` 会变为 SelectedRows
+返回： nce loss
+返回类型: 变量（Variable）
+**代码示例**
+..  code-block:: python
+    import numpy as np
+    import paddle.fluid as fluid
+    window_size = 5
+    dict_size = 20
+    label_word = int(window_size // 2) + 1
+    inp_word = np.array([[[1]], [[2]], [[3]], [[4]], [[5]]]).astype('int64')
+    nid_freq_arr = np.random.dirichlet(np.ones(20) * 1000).astype('float32')
+    with fluid.dygraph.guard():
+        words = []
+        for i in range(window_size):
+            words.append(fluid.dygraph.base.to_variable(inp_word[i]))
+        emb = fluid.Embedding(
+            'embedding',
+            size=[dict_size, 32],
+            param_attr='emb.w',
+            is_sparse=False)
+        embs3 = []
+        for i in range(window_size):
+            if i == label_word:
+                continue
+            emb_rlt = emb(words[i])
+            embs3.append(emb_rlt)
+        embs3 = fluid.layers.concat(input=embs3, axis=1)
+        nce = fluid.NCE('nce',
+                     num_total_classes=dict_size,
+                     num_neg_samples=2,
+                     sampler="custom_dist",
+                     custom_dist=nid_freq_arr.tolist(),
+                     seed=1,
+                     param_attr='nce.w',
+                     bias_attr='nce.b')
+        nce_loss3 = nce(embs3, words[label_word])
--- a/doc/fluid/api_cn/dygraph_cn/NaturalExpDecay_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/NaturalExpDecay_cn.rst
+.. _cn_api_fluid_dygraph_NaturalExpDecay:
+NaturalExpDecay
+-------------------------------
+.. py:class:: paddle.fluid.dygraph.NaturalExpDecay(learning_rate, decay_steps, decay_rate, staircase=False, begin=0, step=1, dtype='float32')
+Applies natural exponential decay to the initial learning rate.
+.. code-block:: text
+    if not staircase:
+        decayed_learning_rate = learning_rate * exp(- decay_rate * (global_step / decay_steps))
+    else:
+        decayed_learning_rate = learning_rate * exp(- decay_rate * floor(global_step / decay_steps))
+Parameters:
+    - **learning_rate** (Variable|float) - A float32 scalar or a Variable; the initial learning rate for training.
+    - **decay_steps** (int) - A Python int.
+    - **decay_rate** (float) - A Python float.
+    - **staircase** (bool) - If True, the learning rate is decayed once every decay_steps steps.
+    - **begin** (int) - Starting step. Default: 0.
+    - **step** (int) - Step size. Default: 1.
+    - **dtype** (str) - dtype used to initialize the learning rate variable. Default: 'float32'.
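+For comparison with the other schedules, here is a standalone sketch of natural exponential decay. It assumes that staircase mode rounds ``global_step / decay_steps`` down before it enters the exponent, which is what decaying once every ``decay_steps`` steps implies; the function name ``natural_exp_decay`` and the ``global_step`` argument are illustrative only.

```python
import math


def natural_exp_decay(learning_rate, decay_steps, decay_rate, global_step,
                      staircase=False):
    """Learning rate decayed by exp(-decay_rate * global_step / decay_steps)."""
    ratio = global_step / decay_steps
    if staircase:
        # Assumption: staircase mode floors the ratio.
        ratio = math.floor(ratio)
    return learning_rate * math.exp(-decay_rate * ratio)


print(natural_exp_decay(0.1, 10000, 0.5, 15000, staircase=True))
print(natural_exp_decay(0.1, 10000, 0.5, 15000, staircase=False))
```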
+**代码示例**
+.. code-block:: python
+    import paddle.fluid as fluid
+    base_lr = 0.1
+    with fluid.dygraph.guard():
+        sgd_optimizer = fluid.optimizer.SGD(
+                learning_rate=fluid.dygraph.NaturalExpDecay(
+                      learning_rate=base_lr,
+                      decay_steps=10000,
+                      decay_rate=0.5,
+                      staircase=True))
--- a/doc/fluid/api_cn/dygraph_cn/NoamDecay_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/NoamDecay_cn.rst
+.. _cn_api_fluid_dygraph_NoamDecay:
+NoamDecay
+-------------------------------
+.. py:class:: paddle.fluid.dygraph.NoamDecay(d_model, warmup_steps, begin=1, step=1, dtype='float32')
+Noam衰减方法。noam衰减的numpy实现如下。
+.. code-block:: python
+    import numpy as np
+    # 设置超参数
+    d_model = 2
+    current_steps = 20
+    warmup_steps = 200
+    # 计算
+    lr_value = np.power(d_model, -0.5) * np.min([
+                           np.power(current_steps, -0.5),
+                           np.power(warmup_steps, -1.5) * current_steps])
+请参照 `attention is all you need <https://arxiv.org/pdf/1706.03762.pdf>`_
+参数：
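+    - **d_model** (Variable) - Input and output dimension of the model
+    - **warmup_steps** (Variable) - Hyperparameter: number of warm-up steps
+    - **begin**  – Starting step (default 1).
+    - **step**  – Step size (default 1).
+    - **dtype**  – dtype of the initial learning rate (default 'float32').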
+**代码示例**
+.. code-block:: python
+    import paddle.fluid as fluid
+    warmup_steps = 100
+    learning_rate = 0.01
+    with fluid.dygraph.guard():
+        optimizer  = fluid.optimizer.SGD(
+            learning_rate = fluid.dygraph.NoamDecay(
+                   1/(warmup_steps *(learning_rate ** 2)),
+                   warmup_steps) )
--- a/doc/fluid/api_cn/dygraph_cn/PRelu_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/PRelu_cn.rst
+.. _cn_api_fluid_dygraph_PRelu:
+PRelu
+-------------------------------
+.. py:class:: paddle.fluid.dygraph.PRelu(name_scope, mode, param_attr=None)
+Formula:
+.. math::
+    y = \max(0, x) + \alpha \min(0, x)
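+Parameters:
+          - **name_scope** (string) - The name of this class.
+          - **mode** (string) - The weight-sharing mode. Three modes are available:
+             .. code-block:: text
+                all: all elements share a single weight
+                channel: elements in the same channel share a weight
+                element: each element has its own weight
+          - **param_attr** (ParamAttr|None) - The parameter attribute of the learnable weight :math:`[\alpha]`.
+Returns: The output Tensor, which has the same shape as the input Tensor.
+Return type: Variable
+**Code example:**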
+.. code-block:: python
+          import paddle.fluid as fluid
+          import numpy as np
+          inp_np = np.ones([5, 200, 100, 100]).astype('float32')
+          with fluid.dygraph.guard():
+              mode = 'channel'
+              prelu = fluid.dygraph.PRelu(
+                 'prelu',
+                 mode=mode,
+                 param_attr=fluid.ParamAttr(initializer=fluid.initializer.Constant(1.0)))
+              dy_rlt = prelu(fluid.dygraph.base.to_variable(inp_np))
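The three weight-sharing modes can be illustrated with a NumPy reference of the formula above. This is a sketch only: the alpha shapes are assumptions chosen so that NumPy broadcasting reproduces each mode on an NCHW input, and they may differ from the shapes Paddle allocates internally. ``prelu_reference`` is a hypothetical helper name.

```python
import numpy as np


def prelu_reference(x, alpha):
    """Hypothetical reference: y = max(0, x) + alpha * min(0, x)."""
    return np.maximum(0, x) + alpha * np.minimum(0, x)


x = np.random.uniform(-1, 1, size=(5, 3, 4, 4)).astype('float32')  # NCHW input

# Assumed alpha shapes, picked only so that broadcasting matches each mode.
alphas = {
    'all': np.full((1,), 0.25, dtype='float32'),              # one shared weight
    'channel': np.full((1, 3, 1, 1), 0.25, dtype='float32'),  # one weight per channel
    'element': np.full((1, 3, 4, 4), 0.25, dtype='float32'),  # one weight per position
}
outputs = {mode: prelu_reference(x, alpha) for mode, alpha in alphas.items()}

# With alpha = 1.0, as set by Constant(1.0) in the example above, the formula
# reduces to the identity.
identity = prelu_reference(x, 1.0)
```

In every mode the output keeps the input shape. Only the number of distinct learnable weights changes.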
--- a/doc/fluid/api_cn/dygraph_cn/PiecewiseDecay_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/PiecewiseDecay_cn.rst
+.. _cn_api_fluid_dygraph_PiecewiseDecay:
+PiecewiseDecay
+-------------------------------
+.. py:class:: paddle.fluid.dygraph.PiecewiseDecay(boundaries, values, begin, step=1, dtype='float32')
+Applies piecewise decay to the initial learning rate.
+The algorithm can be described by the following code.
+.. code-block:: text
+    boundaries = [10000, 20000]
+    values = [1.0, 0.5, 0.1]
+    if step < 10000:
+        learning_rate = 1.0
+    elif 10000 <= step < 20000:
+        learning_rate = 0.5
+    else:
+        learning_rate = 0.1
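+Parameters:
+    - **boundaries** - A list of step numbers that mark the boundaries between intervals.
+    - **values** - A list of learning rate values, one for each interval defined by the boundaries.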
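+    - **begin** - The starting step used to initialize self.step_num. The signature above declares no default, so pass it explicitly (the example below passes 0).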
+    - **step** - The step size used when computing the new step_num (default 1).
+    - **dtype** - The dtype used to initialize the learning rate variable.
+**Code example**
+.. code-block:: python
+    import paddle.fluid as fluid
+    boundaries = [10000, 20000]
+    values = [1.0, 0.5, 0.1]
+    with fluid.dygraph.guard():
+        optimizer = fluid.optimizer.SGD(
+           learning_rate=fluid.dygraph.PiecewiseDecay(boundaries, values, 0) )
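The lookup described by the pseudocode above generalizes to any number of boundaries. A minimal plain-Python sketch follows. ``piecewise_lr`` is a hypothetical helper rather than a Paddle function, and it assumes ``boundaries`` is sorted in ascending order.

```python
import bisect


def piecewise_lr(boundaries, values, step):
    """Hypothetical helper: pick the learning rate for `step`."""
    if len(values) != len(boundaries) + 1:
        raise ValueError("values must have exactly one more entry than boundaries")
    # bisect_right counts the boundaries that are <= step, which is the index
    # of the interval containing step.
    return values[bisect.bisect_right(boundaries, step)]


boundaries = [10000, 20000]
values = [1.0, 0.5, 0.1]
lr_early = piecewise_lr(boundaries, values, 9999)    # step < 10000
lr_middle = piecewise_lr(boundaries, values, 10000)  # 10000 <= step < 20000
lr_late = piecewise_lr(boundaries, values, 20000)    # step >= 20000
```

``bisect_right`` gives the same half-open intervals as the ``if``/``elif`` chain: a step equal to a boundary falls into the later interval.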
--- a/doc/fluid/api_cn/dygraph_cn/PolynomialDecay_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/PolynomialDecay_cn.rst
--- a/doc/fluid/api_cn/dygraph_cn/Pool2D_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/Pool2D_cn.rst
--- a/doc/fluid/api_cn/dygraph_cn/SpectralNorm_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/SpectralNorm_cn.rst
--- a/doc/fluid/api_cn/dygraph_cn/TreeConv_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/TreeConv_cn.rst
+.. _cn_api_fluid_dygraph_TreeConv:
+TreeConv
+-------------------------------
+.. py:class:: paddle.fluid.dygraph.TreeConv(name_scope, output_size, num_filters=1, max_depth=2, act='tanh', param_attr=None, bias_attr=None, name=None)
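+Tree-based convolution operation.
+Tree-based convolution is part of the Tree-Based Convolutional Neural Network (TBCNN), which is used to classify tree structures such as abstract syntax trees. Tree-based convolution introduces a data structure called the continuous binary tree, which treats a multiway tree as a binary tree. See the `tree-based convolution paper <https://arxiv.org/abs/1409.5718v1>`_ for details.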
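+Parameters:
+    - **name_scope** (str) - The name of this class.
+    - **output_size** (int) - The output feature width.
+    - **num_filters** (int) - The number of filters. Default: 1.
+    - **max_depth** (int) - The maximum depth of the filters. Default: 2.
+    - **act** (str) - The activation function. Default: tanh.
+    - **param_attr** (ParamAttr) - The parameter attribute of the filters. Default: None.
+    - **bias_attr** (ParamAttr) - The parameter attribute of this layer's bias. Default: None.
+    - **name** (str) - The name of this layer (optional). If None, the layer is named automatically. Default: None.
+Returns: (Tensor) The feature vectors of the subtrees. The output tensor has shape [max_tree_node_size, output_size, num_filters] and can serve as the new feature vectors for the next tree convolution layer.
+Return type: out (Variable)
+**Code example**: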
+.. code-block:: python
+    import paddle.fluid as fluid
+    import numpy
+    with fluid.dygraph.guard():
+        nodes_vector = numpy.random.random((1, 10, 5)).astype('float32')
+        edge_set = numpy.random.random((1, 9, 2)).astype('int32')
+        treeConv = fluid.dygraph.nn.TreeConv(
+          'TreeConv', output_size=6, num_filters=1, max_depth=2)
+        ret = treeConv(fluid.dygraph.base.to_variable(nodes_vector), fluid.dygraph.base.to_variable(edge_set))
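In the example above, ``numpy.random.random`` draws values in [0, 1), so casting them to ``int32`` produces an edge set that is entirely zero. It exercises the layer's shapes but does not describe a real tree. The NumPy-only sketch below shows this and builds a hand-written alternative. The edge convention it uses (directed ``(parent, child)`` pairs with 1-based node ids, 0 reserved for padding) is an assumption and should be checked against the Paddle ``tree_conv`` documentation.

```python
import numpy

# Values in [0, 1) truncate to 0 when cast to int32.
random_edges = numpy.random.random((1, 9, 2)).astype('int32')

# Hypothetical tree with 10 nodes (ids 1..10) and 9 directed edges.
# Assumed convention: each row is (parent, child), ids start at 1.
edges = [(1, 2), (1, 3), (1, 4),
         (2, 5), (2, 6),
         (3, 7),
         (4, 8), (4, 9),
         (9, 10)]
edge_set = numpy.array([edges], dtype='int32')                    # shape (1, 9, 2)
nodes_vector = numpy.random.random((1, 10, 5)).astype('float32')  # 10 nodes, 5 features each
```

Both arrays keep the shapes used in the example above, so they could be passed through ``to_variable`` in the same way.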
--- a/doc/fluid/api_cn/dygraph_cn/guard_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/guard_cn.rst
--- a/doc/fluid/api_cn/dygraph_cn/load_persistables_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/load_persistables_cn.rst
--- a/doc/fluid/api_cn/dygraph_cn/save_persistables_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/save_persistables_cn.rst
--- a/doc/fluid/api_cn/dygraph_cn/to_variable_cn.rst
+++ b/doc/fluid/api_cn/dygraph_cn/to_variable_cn.rst
--- a/doc/fluid/api_cn/executor_cn.rst
+++ b/doc/fluid/api_cn/executor_cn.rst
--- a/doc/fluid/api_cn/executor_cn/Executor_cn.rst
+++ b/doc/fluid/api_cn/executor_cn/Executor_cn.rst
--- a/doc/fluid/api_cn/executor_cn/global_scope_cn.rst
+++ b/doc/fluid/api_cn/executor_cn/global_scope_cn.rst
--- a/doc/fluid/api_cn/executor_cn/scope_guard_cn.rst
+++ b/doc/fluid/api_cn/executor_cn/scope_guard_cn.rst
--- a/doc/fluid/api_cn/fluid_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/BuildStrategy_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/BuildStrategy_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/CPUPlace_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/CPUPlace_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/CUDAPinnedPlace_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/CUDAPinnedPlace_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/CUDAPlace_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/CUDAPlace_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/CompiledProgram_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/CompiledProgram_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/DataFeedDesc_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/DataFeedDesc_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/DataFeeder_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/DataFeeder_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/DistributeTranspilerConfig_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/DistributeTranspilerConfig_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/DistributeTranspiler_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/DistributeTranspiler_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/ExecutionStrategy_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/ExecutionStrategy_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/Executor_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/Executor_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/LoDTensorArray_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/LoDTensorArray_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/LoDTensor_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/LoDTensor_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/ParallelExecutor_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/ParallelExecutor_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/ParamAttr_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/ParamAttr_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/Program_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/Program_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/Tensor_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/Tensor_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/WeightNormParamAttr_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/WeightNormParamAttr_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/cpu_places_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/cpu_places_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/create_lod_tensor_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/create_lod_tensor_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/create_random_int_lodtensor_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/create_random_int_lodtensor_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/cuda_pinned_places_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/cuda_pinned_places_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/cuda_places_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/cuda_places_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/default_main_program_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/default_main_program_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/default_startup_program_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/default_startup_program_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/global_scope_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/global_scope_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/gradients_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/gradients_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/in_dygraph_mode_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/in_dygraph_mode_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/memory_optimize_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/memory_optimize_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/name_scope_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/name_scope_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/program_guard_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/program_guard_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/release_memory_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/release_memory_cn.rst
--- a/doc/fluid/api_cn/fluid_cn/scope_guard_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn/scope_guard_cn.rst
--- a/doc/fluid/api_cn/initializer_cn.rst
+++ b/doc/fluid/api_cn/initializer_cn.rst
--- a/doc/fluid/api_cn/initializer_cn/BilinearInitializer_cn.rst
+++ b/doc/fluid/api_cn/initializer_cn/BilinearInitializer_cn.rst
--- a/doc/fluid/api_cn/initializer_cn/Bilinear_cn.rst
+++ b/doc/fluid/api_cn/initializer_cn/Bilinear_cn.rst
--- a/doc/fluid/api_cn/initializer_cn/ConstantInitializer_cn.rst
+++ b/doc/fluid/api_cn/initializer_cn/ConstantInitializer_cn.rst
--- a/doc/fluid/api_cn/initializer_cn/Constant_cn.rst
+++ b/doc/fluid/api_cn/initializer_cn/Constant_cn.rst
--- a/doc/fluid/api_cn/initializer_cn/MSRAInitializer_cn.rst
+++ b/doc/fluid/api_cn/initializer_cn/MSRAInitializer_cn.rst
--- a/doc/fluid/api_cn/initializer_cn/MSRA_cn.rst
+++ b/doc/fluid/api_cn/initializer_cn/MSRA_cn.rst
--- a/doc/fluid/api_cn/initializer_cn/NormalInitializer_cn.rst
+++ b/doc/fluid/api_cn/initializer_cn/NormalInitializer_cn.rst
--- a/doc/fluid/api_cn/initializer_cn/Normal_cn.rst
+++ b/doc/fluid/api_cn/initializer_cn/Normal_cn.rst
--- a/doc/fluid/api_cn/initializer_cn/NumpyArrayInitializer_cn.rst
+++ b/doc/fluid/api_cn/initializer_cn/NumpyArrayInitializer_cn.rst
--- a/doc/fluid/api_cn/initializer_cn/TruncatedNormalInitializer_cn.rst
+++ b/doc/fluid/api_cn/initializer_cn/TruncatedNormalInitializer_cn.rst
--- a/doc/fluid/api_cn/initializer_cn/TruncatedNormal_cn.rst
+++ b/doc/fluid/api_cn/initializer_cn/TruncatedNormal_cn.rst
--- a/doc/fluid/api_cn/initializer_cn/UniformInitializer_cn.rst
+++ b/doc/fluid/api_cn/initializer_cn/UniformInitializer_cn.rst
--- a/doc/fluid/api_cn/initializer_cn/Uniform_cn.rst
+++ b/doc/fluid/api_cn/initializer_cn/Uniform_cn.rst
--- a/doc/fluid/api_cn/initializer_cn/XavierInitializer_cn.rst
+++ b/doc/fluid/api_cn/initializer_cn/XavierInitializer_cn.rst
--- a/doc/fluid/api_cn/initializer_cn/Xavier_cn.rst
+++ b/doc/fluid/api_cn/initializer_cn/Xavier_cn.rst
--- a/doc/fluid/api_cn/initializer_cn/force_init_on_cpu_cn.rst
+++ b/doc/fluid/api_cn/initializer_cn/force_init_on_cpu_cn.rst
--- a/doc/fluid/api_cn/initializer_cn/init_on_cpu_cn.rst
+++ b/doc/fluid/api_cn/initializer_cn/init_on_cpu_cn.rst
--- a/doc/fluid/api_cn/io_cn.rst
+++ b/doc/fluid/api_cn/io_cn.rst
--- a/doc/fluid/api_cn/io_cn/PyReader_cn.rst
+++ b/doc/fluid/api_cn/io_cn/PyReader_cn.rst
--- a/doc/fluid/api_cn/io_cn/load_inference_model_cn.rst
+++ b/doc/fluid/api_cn/io_cn/load_inference_model_cn.rst
--- a/doc/fluid/api_cn/io_cn/load_params_cn.rst
+++ b/doc/fluid/api_cn/io_cn/load_params_cn.rst
--- a/doc/fluid/api_cn/io_cn/load_persistables_cn.rst
+++ b/doc/fluid/api_cn/io_cn/load_persistables_cn.rst
--- a/doc/fluid/api_cn/io_cn/load_vars_cn.rst
+++ b/doc/fluid/api_cn/io_cn/load_vars_cn.rst
--- a/doc/fluid/api_cn/io_cn/save_inference_model_cn.rst
+++ b/doc/fluid/api_cn/io_cn/save_inference_model_cn.rst
--- a/doc/fluid/api_cn/io_cn/save_params_cn.rst
+++ b/doc/fluid/api_cn/io_cn/save_params_cn.rst
--- a/doc/fluid/api_cn/io_cn/save_persistables_cn.rst
+++ b/doc/fluid/api_cn/io_cn/save_persistables_cn.rst
--- a/doc/fluid/api_cn/io_cn/save_vars_cn.rst
+++ b/doc/fluid/api_cn/io_cn/save_vars_cn.rst
--- a/doc/fluid/api_cn/layers-breakdown.py
+++ b/doc/fluid/api_cn/layers-breakdown.py
--- a/doc/fluid/api_cn/layers_cn.rst
+++ b/doc/fluid/api_cn/layers_cn.rst
--- a/doc/fluid/api_cn/layers_cn/DynamicRNN_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/DynamicRNN_cn.rst
--- a/doc/fluid/api_cn/layers_cn/IfElse_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/IfElse_cn.rst
--- a/doc/fluid/api_cn/layers_cn/Preprocessor_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/Preprocessor_cn.rst
--- a/doc/fluid/api_cn/layers_cn/Print_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/Print_cn.rst
--- a/doc/fluid/api_cn/layers_cn/StaticRNN_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/StaticRNN_cn.rst
--- a/doc/fluid/api_cn/layers_cn/Switch_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/Switch_cn.rst
--- a/doc/fluid/api_cn/layers_cn/While_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/While_cn.rst
--- a/doc/fluid/api_cn/layers_cn/abs_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/abs_cn.rst
--- a/doc/fluid/api_cn/layers_cn/accuracy_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/accuracy_cn.rst
--- a/doc/fluid/api_cn/layers_cn/acos_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/acos_cn.rst
--- a/doc/fluid/api_cn/layers_cn/adaptive_pool2d_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/adaptive_pool2d_cn.rst
--- a/doc/fluid/api_cn/layers_cn/adaptive_pool3d_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/adaptive_pool3d_cn.rst
--- a/doc/fluid/api_cn/layers_cn/add_position_encoding_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/add_position_encoding_cn.rst
--- a/doc/fluid/api_cn/layers_cn/affine_channel_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/affine_channel_cn.rst
--- a/doc/fluid/api_cn/layers_cn/affine_grid_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/affine_grid_cn.rst
--- a/doc/fluid/api_cn/layers_cn/anchor_generator_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/anchor_generator_cn.rst
--- a/doc/fluid/api_cn/layers_cn/argmax_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/argmax_cn.rst
--- a/doc/fluid/api_cn/layers_cn/argmin_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/argmin_cn.rst
--- a/doc/fluid/api_cn/layers_cn/argsort_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/argsort_cn.rst
--- a/doc/fluid/api_cn/layers_cn/array_length_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/array_length_cn.rst
--- a/doc/fluid/api_cn/layers_cn/array_read_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/array_read_cn.rst
--- a/doc/fluid/api_cn/layers_cn/array_write_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/array_write_cn.rst
--- a/doc/fluid/api_cn/layers_cn/asin_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/asin_cn.rst
--- a/doc/fluid/api_cn/layers_cn/assign_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/assign_cn.rst
--- a/doc/fluid/api_cn/layers_cn/atan_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/atan_cn.rst
--- a/doc/fluid/api_cn/layers_cn/metric_op_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/metric_op_cn.rst
--- a/doc/fluid/api_cn/layers_cn/autoincreased_step_counter_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/autoincreased_step_counter_cn.rst
--- a/doc/fluid/api_cn/layers_cn/batch_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/batch_cn.rst
--- a/doc/fluid/api_cn/layers_cn/batch_norm_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/batch_norm_cn.rst
--- a/doc/fluid/api_cn/layers_cn/beam_search_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/beam_search_cn.rst
--- a/doc/fluid/api_cn/layers_cn/beam_search_decode_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/beam_search_decode_cn.rst
--- a/doc/fluid/api_cn/layers_cn/bilinear_tensor_product_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/bilinear_tensor_product_cn.rst
--- a/doc/fluid/api_cn/layers_cn/bipartite_match_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/bipartite_match_cn.rst
--- a/doc/fluid/api_cn/layers_cn/box_clip_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/box_clip_cn.rst
--- a/doc/fluid/api_cn/layers_cn/box_coder_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/box_coder_cn.rst
--- a/doc/fluid/api_cn/layers_cn/box_decoder_and_assign_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/box_decoder_and_assign_cn.rst
--- a/doc/fluid/api_cn/layers_cn/bpr_loss_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/bpr_loss_cn.rst
--- a/doc/fluid/api_cn/layers_cn/brelu_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/brelu_cn.rst
--- a/doc/fluid/api_cn/layers_cn/cast_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/cast_cn.rst
--- a/doc/fluid/api_cn/layers_cn/ceil_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/ceil_cn.rst
--- a/doc/fluid/api_cn/layers_cn/chunk_eval_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/chunk_eval_cn.rst
--- a/doc/fluid/api_cn/layers_cn/clip_by_norm_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/clip_by_norm_cn.rst
--- a/doc/fluid/api_cn/layers_cn/clip_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/clip_cn.rst
--- a/doc/fluid/api_cn/layers_cn/collect_fpn_proposals_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/collect_fpn_proposals_cn.rst
--- a/doc/fluid/api_cn/layers_cn/concat_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/concat_cn.rst
--- a/doc/fluid/api_cn/layers_cn/continuous_value_model_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/continuous_value_model_cn.rst
--- a/doc/fluid/api_cn/layers_cn/control_flow_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/control_flow_cn.rst
--- a/doc/fluid/api_cn/layers_cn/conv2d_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/conv2d_cn.rst
--- a/doc/fluid/api_cn/layers_cn/conv2d_transpose_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/conv2d_transpose_cn.rst
--- a/doc/fluid/api_cn/layers_cn/conv3d_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/conv3d_cn.rst
--- a/doc/fluid/api_cn/layers_cn/conv3d_transpose_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/conv3d_transpose_cn.rst
--- a/doc/fluid/api_cn/layers_cn/cos_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/cos_cn.rst
--- a/doc/fluid/api_cn/layers_cn/cos_sim_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/cos_sim_cn.rst
--- a/doc/fluid/api_cn/layers_cn/cosine_decay_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/cosine_decay_cn.rst
--- a/doc/fluid/api_cn/layers_cn/create_array_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/create_array_cn.rst
--- a/doc/fluid/api_cn/layers_cn/create_global_var_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/create_global_var_cn.rst
--- a/doc/fluid/api_cn/layers_cn/create_parameter_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/create_parameter_cn.rst
--- a/doc/fluid/api_cn/layers_cn/create_py_reader_by_data_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/create_py_reader_by_data_cn.rst
--- a/doc/fluid/api_cn/layers_cn/create_tensor_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/create_tensor_cn.rst
--- a/doc/fluid/api_cn/layers_cn/crf_decoding_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/crf_decoding_cn.rst
--- a/doc/fluid/api_cn/layers_cn/crop_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/crop_cn.rst
--- a/doc/fluid/api_cn/layers_cn/cross_entropy_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/cross_entropy_cn.rst
--- a/doc/fluid/api_cn/layers_cn/ctc_greedy_decoder_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/ctc_greedy_decoder_cn.rst
--- a/doc/fluid/api_cn/layers_cn/cumsum_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/cumsum_cn.rst
--- a/doc/fluid/api_cn/layers_cn/data_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/data_cn.rst
--- a/doc/fluid/api_cn/layers_cn/data_norm_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/data_norm_cn.rst
--- a/doc/fluid/api_cn/layers_cn/deformable_conv_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/deformable_conv_cn.rst
--- a/doc/fluid/api_cn/layers_cn/deformable_roi_pooling_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/deformable_roi_pooling_cn.rst
--- a/doc/fluid/api_cn/layers_cn/density_prior_box_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/density_prior_box_cn.rst
--- a/doc/fluid/api_cn/layers_cn/detection_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/detection_cn.rst
--- a/doc/fluid/api_cn/layers_cn/detection_map_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/detection_map_cn.rst
--- a/doc/fluid/api_cn/layers_cn/detection_output_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/detection_output_cn.rst
--- a/doc/fluid/api_cn/layers_cn/diag_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/diag_cn.rst
--- a/doc/fluid/api_cn/layers_cn/dice_loss_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/dice_loss_cn.rst
--- a/doc/fluid/api_cn/layers_cn/distribute_fpn_proposals_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/distribute_fpn_proposals_cn.rst
--- a/doc/fluid/api_cn/layers_cn/double_buffer_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/double_buffer_cn.rst
--- a/doc/fluid/api_cn/layers_cn/dropout_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/dropout_cn.rst
--- a/doc/fluid/api_cn/layers_cn/dynamic_gru_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/dynamic_gru_cn.rst
--- a/doc/fluid/api_cn/layers_cn/dynamic_lstm_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/dynamic_lstm_cn.rst
--- a/doc/fluid/api_cn/layers_cn/dynamic_lstmp_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/dynamic_lstmp_cn.rst
--- a/doc/fluid/api_cn/layers_cn/edit_distance_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/edit_distance_cn.rst
--- a/doc/fluid/api_cn/layers_cn/elementwise_add_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/elementwise_add_cn.rst
--- a/doc/fluid/api_cn/layers_cn/elementwise_div_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/elementwise_div_cn.rst
--- a/doc/fluid/api_cn/layers_cn/elementwise_floordiv_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/elementwise_floordiv_cn.rst
--- a/doc/fluid/api_cn/layers_cn/elementwise_max_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/elementwise_max_cn.rst
--- a/doc/fluid/api_cn/layers_cn/elementwise_min_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/elementwise_min_cn.rst
--- a/doc/fluid/api_cn/layers_cn/elementwise_mod_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/elementwise_mod_cn.rst
--- a/doc/fluid/api_cn/layers_cn/elementwise_mul_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/elementwise_mul_cn.rst
--- a/doc/fluid/api_cn/layers_cn/elementwise_pow_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/elementwise_pow_cn.rst
--- a/doc/fluid/api_cn/layers_cn/elementwise_sub_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/elementwise_sub_cn.rst
--- a/doc/fluid/api_cn/layers_cn/elu_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/elu_cn.rst
--- a/doc/fluid/api_cn/layers_cn/embedding_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/embedding_cn.rst
--- a/doc/fluid/api_cn/layers_cn/equal_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/equal_cn.rst
--- a/doc/fluid/api_cn/layers_cn/exp_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/exp_cn.rst
--- a/doc/fluid/api_cn/layers_cn/expand_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/expand_cn.rst
--- a/doc/fluid/api_cn/layers_cn/exponential_decay_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/exponential_decay_cn.rst
--- a/doc/fluid/api_cn/layers_cn/fc_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/fc_cn.rst
--- a/doc/fluid/api_cn/layers_cn/fill_constant_batch_size_like_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/fill_constant_batch_size_like_cn.rst
--- a/doc/fluid/api_cn/layers_cn/fill_constant_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/fill_constant_cn.rst
--- a/doc/fluid/api_cn/layers_cn/flatten_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/flatten_cn.rst
--- a/doc/fluid/api_cn/layers_cn/floor_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/floor_cn.rst
--- a/doc/fluid/api_cn/layers_cn/fsp_matrix_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/fsp_matrix_cn.rst
--- a/doc/fluid/api_cn/layers_cn/gather_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/gather_cn.rst
--- a/doc/fluid/api_cn/layers_cn/gaussian_random_batch_size_like_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/gaussian_random_batch_size_like_cn.rst
--- a/doc/fluid/api_cn/layers_cn/gaussian_random_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/gaussian_random_cn.rst
--- a/doc/fluid/api_cn/layers_cn/generate_mask_labels_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/generate_mask_labels_cn.rst
--- a/doc/fluid/api_cn/layers_cn/generate_proposal_labels_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/generate_proposal_labels_cn.rst
--- a/doc/fluid/api_cn/layers_cn/generate_proposals_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/generate_proposals_cn.rst
--- a/doc/fluid/api_cn/layers_cn/get_tensor_from_selected_rows_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/get_tensor_from_selected_rows_cn.rst
--- a/doc/fluid/api_cn/layers_cn/greater_equal_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/greater_equal_cn.rst
--- a/doc/fluid/api_cn/layers_cn/greater_than_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/greater_than_cn.rst
--- a/doc/fluid/api_cn/layers_cn/grid_sampler_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/grid_sampler_cn.rst
--- a/doc/fluid/api_cn/layers_cn/group_norm_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/group_norm_cn.rst
--- a/doc/fluid/api_cn/layers_cn/gru_unit_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/gru_unit_cn.rst
--- a/doc/fluid/api_cn/layers_cn/hard_shrink_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/hard_shrink_cn.rst
--- a/doc/fluid/api_cn/layers_cn/hard_sigmoid_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/hard_sigmoid_cn.rst
--- a/doc/fluid/api_cn/layers_cn/has_inf_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/has_inf_cn.rst
--- a/doc/fluid/api_cn/layers_cn/has_nan_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/has_nan_cn.rst
--- a/doc/fluid/api_cn/layers_cn/hash_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/hash_cn.rst
--- a/doc/fluid/api_cn/layers_cn/hsigmoid_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/hsigmoid_cn.rst
--- a/doc/fluid/api_cn/layers_cn/huber_loss_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/huber_loss_cn.rst
--- a/doc/fluid/api_cn/layers_cn/im2sequence_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/im2sequence_cn.rst
--- a/doc/fluid/api_cn/layers_cn/image_resize_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/image_resize_cn.rst
--- a/doc/fluid/api_cn/layers_cn/image_resize_short_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/image_resize_short_cn.rst
--- a/doc/fluid/api_cn/layers_cn/increment_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/increment_cn.rst
--- a/doc/fluid/api_cn/layers_cn/inverse_time_decay_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/inverse_time_decay_cn.rst
--- a/doc/fluid/api_cn/layers_cn/io_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/io_cn.rst
--- a/doc/fluid/api_cn/layers_cn/iou_similarity_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/iou_similarity_cn.rst
--- a/doc/fluid/api_cn/layers_cn/is_empty_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/is_empty_cn.rst
--- a/doc/fluid/api_cn/layers_cn/isfinite_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/isfinite_cn.rst
--- a/doc/fluid/api_cn/layers_cn/kldiv_loss_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/kldiv_loss_cn.rst
--- a/doc/fluid/api_cn/layers_cn/l2_normalize_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/l2_normalize_cn.rst
--- a/doc/fluid/api_cn/layers_cn/label_smooth_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/label_smooth_cn.rst
--- a/doc/fluid/api_cn/layers_cn/layer_norm_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/layer_norm_cn.rst
--- a/doc/fluid/api_cn/layers_cn/leaky_relu_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/leaky_relu_cn.rst
--- a/doc/fluid/api_cn/layers_cn/learning_rate_scheduler_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/learning_rate_scheduler_cn.rst
--- a/doc/fluid/api_cn/layers_cn/less_equal_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/less_equal_cn.rst
--- a/doc/fluid/api_cn/layers_cn/less_than_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/less_than_cn.rst
--- a/doc/fluid/api_cn/layers_cn/linear_chain_crf_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/linear_chain_crf_cn.rst
--- a/doc/fluid/api_cn/layers_cn/linear_lr_warmup_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/linear_lr_warmup_cn.rst
--- a/doc/fluid/api_cn/layers_cn/linspace_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/linspace_cn.rst
--- a/doc/fluid/api_cn/layers_cn/load_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/load_cn.rst
--- a/doc/fluid/api_cn/layers_cn/lod_reset_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/lod_reset_cn.rst
--- a/doc/fluid/api_cn/layers_cn/log_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/log_cn.rst
--- a/doc/fluid/api_cn/layers_cn/log_loss_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/log_loss_cn.rst
--- a/doc/fluid/api_cn/layers_cn/logical_and_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/logical_and_cn.rst
--- a/doc/fluid/api_cn/layers_cn/logical_not_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/logical_not_cn.rst
--- a/doc/fluid/api_cn/layers_cn/logical_or_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/logical_or_cn.rst
--- a/doc/fluid/api_cn/layers_cn/logical_xor_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/logical_xor_cn.rst
--- a/doc/fluid/api_cn/layers_cn/logsigmoid_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/logsigmoid_cn.rst
--- a/doc/fluid/api_cn/layers_cn/lrn_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/lrn_cn.rst
--- a/doc/fluid/api_cn/layers_cn/lstm_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/lstm_cn.rst
--- a/doc/fluid/api_cn/layers_cn/lstm_unit_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/lstm_unit_cn.rst
--- a/doc/fluid/api_cn/layers_cn/margin_rank_loss_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/margin_rank_loss_cn.rst
--- a/doc/fluid/api_cn/layers_cn/matmul_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/matmul_cn.rst
--- a/doc/fluid/api_cn/layers_cn/maxout_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/maxout_cn.rst
--- a/doc/fluid/api_cn/layers_cn/mean_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/mean_cn.rst
--- a/doc/fluid/api_cn/layers_cn/mean_iou_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/mean_iou_cn.rst
--- a/doc/fluid/api_cn/layers_cn/merge_selected_rows_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/merge_selected_rows_cn.rst
--- a/doc/fluid/api_cn/layers_cn/mul_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/mul_cn.rst
--- a/doc/fluid/api_cn/layers_cn/multi_box_head_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/multi_box_head_cn.rst
--- a/doc/fluid/api_cn/layers_cn/multiclass_nms_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/multiclass_nms_cn.rst
--- a/doc/fluid/api_cn/layers_cn/multiplex_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/multiplex_cn.rst
--- a/doc/fluid/api_cn/layers_cn/natural_exp_decay_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/natural_exp_decay_cn.rst
--- a/doc/fluid/api_cn/layers_cn/nce_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/nce_cn.rst
--- a/doc/fluid/api_cn/layers_cn/nn_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/nn_cn.rst
--- a/doc/fluid/api_cn/layers_cn/noam_decay_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/noam_decay_cn.rst
--- a/doc/fluid/api_cn/layers_cn/not_equal_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/not_equal_cn.rst
--- a/doc/fluid/api_cn/layers_cn/npair_loss_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/npair_loss_cn.rst
--- a/doc/fluid/api_cn/layers_cn/one_hot_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/one_hot_cn.rst
--- a/doc/fluid/api_cn/layers_cn/ones_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/ones_cn.rst
--- a/doc/fluid/api_cn/layers_cn/open_files_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/open_files_cn.rst
--- a/doc/fluid/api_cn/layers_cn/ops_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/ops_cn.rst
--- a/doc/fluid/api_cn/layers_cn/pad2d_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/pad2d_cn.rst
--- a/doc/fluid/api_cn/layers_cn/pad_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/pad_cn.rst
--- a/doc/fluid/api_cn/layers_cn/pad_constant_like_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/pad_constant_like_cn.rst
--- a/doc/fluid/api_cn/layers_cn/piecewise_decay_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/piecewise_decay_cn.rst
--- a/doc/fluid/api_cn/layers_cn/pixel_shuffle_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/pixel_shuffle_cn.rst
--- a/doc/fluid/api_cn/layers_cn/polygon_box_transform_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/polygon_box_transform_cn.rst
--- a/doc/fluid/api_cn/layers_cn/polynomial_decay_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/polynomial_decay_cn.rst
--- a/doc/fluid/api_cn/layers_cn/pool2d_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/pool2d_cn.rst
--- a/doc/fluid/api_cn/layers_cn/pool3d_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/pool3d_cn.rst
--- a/doc/fluid/api_cn/layers_cn/pow_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/pow_cn.rst
--- a/doc/fluid/api_cn/layers_cn/prelu_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/prelu_cn.rst
--- a/doc/fluid/api_cn/layers_cn/prior_box_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/prior_box_cn.rst
--- a/doc/fluid/api_cn/layers_cn/psroi_pool_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/psroi_pool_cn.rst
--- a/doc/fluid/api_cn/layers_cn/py_func_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/py_func_cn.rst
--- a/doc/fluid/api_cn/layers_cn/py_reader_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/py_reader_cn.rst
--- a/doc/fluid/api_cn/layers_cn/random_crop_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/random_crop_cn.rst
--- a/doc/fluid/api_cn/layers_cn/random_data_generator_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/random_data_generator_cn.rst
--- a/doc/fluid/api_cn/layers_cn/range_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/range_cn.rst
--- a/doc/fluid/api_cn/layers_cn/rank_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/rank_cn.rst
--- a/doc/fluid/api_cn/layers_cn/rank_loss_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/rank_loss_cn.rst
--- a/doc/fluid/api_cn/layers_cn/read_file_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/read_file_cn.rst
--- a/doc/fluid/api_cn/layers_cn/reciprocal_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/reciprocal_cn.rst
--- a/doc/fluid/api_cn/layers_cn/reduce_all_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/reduce_all_cn.rst
--- a/doc/fluid/api_cn/layers_cn/reduce_any_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/reduce_any_cn.rst
--- a/doc/fluid/api_cn/layers_cn/reduce_max_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/reduce_max_cn.rst
--- a/doc/fluid/api_cn/layers_cn/reduce_mean_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/reduce_mean_cn.rst
--- a/doc/fluid/api_cn/layers_cn/reduce_min_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/reduce_min_cn.rst
--- a/doc/fluid/api_cn/layers_cn/reduce_prod_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/reduce_prod_cn.rst
--- a/doc/fluid/api_cn/layers_cn/reduce_sum_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/reduce_sum_cn.rst
--- a/doc/fluid/api_cn/layers_cn/relu6_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/relu6_cn.rst
--- a/doc/fluid/api_cn/layers_cn/relu_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/relu_cn.rst
--- a/doc/fluid/api_cn/layers_cn/reorder_lod_tensor_by_rank_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/reorder_lod_tensor_by_rank_cn.rst
--- a/doc/fluid/api_cn/layers_cn/reshape_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/reshape_cn.rst
--- a/doc/fluid/api_cn/layers_cn/resize_bilinear_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/resize_bilinear_cn.rst
--- a/doc/fluid/api_cn/layers_cn/resize_nearest_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/resize_nearest_cn.rst
--- a/doc/fluid/api_cn/layers_cn/retinanet_detection_output_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/retinanet_detection_output_cn.rst
--- a/doc/fluid/api_cn/layers_cn/retinanet_target_assign_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/retinanet_target_assign_cn.rst
--- a/doc/fluid/api_cn/layers_cn/reverse_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/reverse_cn.rst
--- a/doc/fluid/api_cn/layers_cn/roi_align_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/roi_align_cn.rst
--- a/doc/fluid/api_cn/layers_cn/roi_perspective_transform_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/roi_perspective_transform_cn.rst
--- a/doc/fluid/api_cn/layers_cn/roi_pool_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/roi_pool_cn.rst
--- a/doc/fluid/api_cn/layers_cn/round_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/round_cn.rst
--- a/doc/fluid/api_cn/layers_cn/row_conv_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/row_conv_cn.rst
--- a/doc/fluid/api_cn/layers_cn/rpn_target_assign_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/rpn_target_assign_cn.rst
--- a/doc/fluid/api_cn/layers_cn/rsqrt_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/rsqrt_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sampled_softmax_with_cross_entropy_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sampled_softmax_with_cross_entropy_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sampling_id_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sampling_id_cn.rst
--- a/doc/fluid/api_cn/layers_cn/scale_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/scale_cn.rst
--- a/doc/fluid/api_cn/layers_cn/scatter_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/scatter_cn.rst
--- a/doc/fluid/api_cn/layers_cn/selu_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/selu_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sequence_concat_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sequence_concat_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sequence_conv_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sequence_conv_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sequence_enumerate_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sequence_enumerate_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sequence_expand_as_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sequence_expand_as_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sequence_expand_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sequence_expand_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sequence_first_step_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sequence_first_step_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sequence_last_step_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sequence_last_step_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sequence_mask_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sequence_mask_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sequence_pad_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sequence_pad_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sequence_pool_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sequence_pool_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sequence_reshape_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sequence_reshape_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sequence_reverse_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sequence_reverse_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sequence_scatter_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sequence_scatter_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sequence_slice_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sequence_slice_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sequence_softmax_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sequence_softmax_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sequence_unpad_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sequence_unpad_cn.rst
--- a/doc/fluid/api_cn/layers_cn/shape_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/shape_cn.rst
--- a/doc/fluid/api_cn/layers_cn/shuffle_channel_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/shuffle_channel_cn.rst
--- a/doc/fluid/api_cn/layers_cn/shuffle_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/shuffle_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sigmoid_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sigmoid_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sigmoid_cross_entropy_with_logits_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sigmoid_cross_entropy_with_logits_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sigmoid_focal_loss_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sigmoid_focal_loss_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sign_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sign_cn.rst
--- a/doc/fluid/api_cn/layers_cn/similarity_focus_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/similarity_focus_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sin_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sin_cn.rst
--- a/doc/fluid/api_cn/layers_cn/slice_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/slice_cn.rst
--- a/doc/fluid/api_cn/layers_cn/smooth_l1_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/smooth_l1_cn.rst
--- a/doc/fluid/api_cn/layers_cn/soft_relu_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/soft_relu_cn.rst
--- a/doc/fluid/api_cn/layers_cn/softmax_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/softmax_cn.rst
--- a/doc/fluid/api_cn/layers_cn/softmax_with_cross_entropy_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/softmax_with_cross_entropy_cn.rst
--- a/doc/fluid/api_cn/layers_cn/softplus_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/softplus_cn.rst
--- a/doc/fluid/api_cn/layers_cn/softshrink_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/softshrink_cn.rst
--- a/doc/fluid/api_cn/layers_cn/softsign_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/softsign_cn.rst
--- a/doc/fluid/api_cn/layers_cn/space_to_depth_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/space_to_depth_cn.rst
--- a/doc/fluid/api_cn/layers_cn/spectral_norm_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/spectral_norm_cn.rst
--- a/doc/fluid/api_cn/layers_cn/split_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/split_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sqrt_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sqrt_cn.rst
--- a/doc/fluid/api_cn/layers_cn/square_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/square_cn.rst
--- a/doc/fluid/api_cn/layers_cn/square_error_cost_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/square_error_cost_cn.rst
--- a/doc/fluid/api_cn/layers_cn/squeeze_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/squeeze_cn.rst
--- a/doc/fluid/api_cn/layers_cn/ssd_loss_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/ssd_loss_cn.rst
--- a/doc/fluid/api_cn/layers_cn/stack_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/stack_cn.rst
--- a/doc/fluid/api_cn/layers_cn/stanh_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/stanh_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sum_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sum_cn.rst
--- a/doc/fluid/api_cn/layers_cn/sums_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/sums_cn.rst
--- a/doc/fluid/api_cn/layers_cn/swish_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/swish_cn.rst
--- a/doc/fluid/api_cn/layers_cn/tanh_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/tanh_cn.rst
--- a/doc/fluid/api_cn/layers_cn/tanh_shrink_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/tanh_shrink_cn.rst
--- a/doc/fluid/api_cn/layers_cn/target_assign_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/target_assign_cn.rst
--- a/doc/fluid/api_cn/layers_cn/teacher_student_sigmoid_loss_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/teacher_student_sigmoid_loss_cn.rst
--- a/doc/fluid/api_cn/layers_cn/temporal_shift_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/temporal_shift_cn.rst
--- a/doc/fluid/api_cn/layers_cn/tensor_array_to_tensor_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/tensor_array_to_tensor_cn.rst
--- a/doc/fluid/api_cn/layers_cn/tensor_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/tensor_cn.rst
--- a/doc/fluid/api_cn/layers_cn/thresholded_relu_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/thresholded_relu_cn.rst
--- a/doc/fluid/api_cn/layers_cn/topk_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/topk_cn.rst
--- a/doc/fluid/api_cn/layers_cn/transpose_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/transpose_cn.rst
--- a/doc/fluid/api_cn/layers_cn/tree_conv_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/tree_conv_cn.rst
--- a/doc/fluid/api_cn/layers_cn/uniform_random_batch_size_like_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/uniform_random_batch_size_like_cn.rst
--- a/doc/fluid/api_cn/layers_cn/uniform_random_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/uniform_random_cn.rst
--- a/doc/fluid/api_cn/layers_cn/unsqueeze_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/unsqueeze_cn.rst
--- a/doc/fluid/api_cn/layers_cn/unstack_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/unstack_cn.rst
--- a/doc/fluid/api_cn/layers_cn/warpctc_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/warpctc_cn.rst
--- a/doc/fluid/api_cn/layers_cn/where_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/where_cn.rst
--- a/doc/fluid/api_cn/layers_cn/yolo_box_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/yolo_box_cn.rst
--- a/doc/fluid/api_cn/layers_cn/yolov3_loss_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/yolov3_loss_cn.rst
--- a/doc/fluid/api_cn/layers_cn/zeros_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/zeros_cn.rst
--- a/doc/fluid/api_cn/layers_cn/zeros_like_cn.rst
+++ b/doc/fluid/api_cn/layers_cn/zeros_like_cn.rst
--- a/doc/fluid/api_cn/metrics_cn.rst
+++ b/doc/fluid/api_cn/metrics_cn.rst
--- a/doc/fluid/api_cn/metrics_cn/Accuracy_cn.rst
+++ b/doc/fluid/api_cn/metrics_cn/Accuracy_cn.rst
--- a/doc/fluid/api_cn/metrics_cn/Auc_cn.rst
+++ b/doc/fluid/api_cn/metrics_cn/Auc_cn.rst
--- a/doc/fluid/api_cn/metrics_cn/ChunkEvaluator_cn.rst
+++ b/doc/fluid/api_cn/metrics_cn/ChunkEvaluator_cn.rst
--- a/doc/fluid/api_cn/metrics_cn/CompositeMetric_cn.rst
+++ b/doc/fluid/api_cn/metrics_cn/CompositeMetric_cn.rst
--- a/doc/fluid/api_cn/metrics_cn/DetectionMAP_cn.rst
+++ b/doc/fluid/api_cn/metrics_cn/DetectionMAP_cn.rst
--- a/doc/fluid/api_cn/metrics_cn/EditDistance_cn.rst
+++ b/doc/fluid/api_cn/metrics_cn/EditDistance_cn.rst
--- a/doc/fluid/api_cn/metrics_cn/MetricBase_cn.rst
+++ b/doc/fluid/api_cn/metrics_cn/MetricBase_cn.rst
--- a/doc/fluid/api_cn/metrics_cn/Precision_cn.rst
+++ b/doc/fluid/api_cn/metrics_cn/Precision_cn.rst
--- a/doc/fluid/api_cn/metrics_cn/Recall_cn.rst
+++ b/doc/fluid/api_cn/metrics_cn/Recall_cn.rst
--- a/doc/fluid/api_cn/nets_cn.rst
+++ b/doc/fluid/api_cn/nets_cn.rst
--- a/doc/fluid/api_cn/nets_cn/glu_cn.rst
+++ b/doc/fluid/api_cn/nets_cn/glu_cn.rst
--- a/doc/fluid/api_cn/nets_cn/img_conv_group_cn.rst
+++ b/doc/fluid/api_cn/nets_cn/img_conv_group_cn.rst
--- a/doc/fluid/api_cn/nets_cn/scaled_dot_product_attention_cn.rst
+++ b/doc/fluid/api_cn/nets_cn/scaled_dot_product_attention_cn.rst
--- a/doc/fluid/api_cn/nets_cn/sequence_conv_pool_cn.rst
+++ b/doc/fluid/api_cn/nets_cn/sequence_conv_pool_cn.rst
--- a/doc/fluid/api_cn/nets_cn/simple_img_conv_pool_cn.rst
+++ b/doc/fluid/api_cn/nets_cn/simple_img_conv_pool_cn.rst
--- a/doc/fluid/api_cn/optimizer_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn.rst
--- a/doc/fluid/api_cn/optimizer_cn/Adadelta_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn/Adadelta_cn.rst
--- a/doc/fluid/api_cn/optimizer_cn/AdagradOptimizer_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn/AdagradOptimizer_cn.rst
--- a/doc/fluid/api_cn/optimizer_cn/Adagrad_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn/Adagrad_cn.rst
--- a/doc/fluid/api_cn/optimizer_cn/AdamOptimizer_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn/AdamOptimizer_cn.rst
--- a/doc/fluid/api_cn/optimizer_cn/Adam_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn/Adam_cn.rst
--- a/doc/fluid/api_cn/optimizer_cn/AdamaxOptimizer_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn/AdamaxOptimizer_cn.rst
--- a/doc/fluid/api_cn/optimizer_cn/Adamax_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn/Adamax_cn.rst
--- a/doc/fluid/api_cn/optimizer_cn/DGCMomentumOptimizer_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn/DGCMomentumOptimizer_cn.rst
--- a/doc/fluid/api_cn/optimizer_cn/DecayedAdagradOptimizer_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn/DecayedAdagradOptimizer_cn.rst
--- a/doc/fluid/api_cn/optimizer_cn/DecayedAdagrad_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn/DecayedAdagrad_cn.rst
--- a/doc/fluid/api_cn/optimizer_cn/ExponentialMovingAverage_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn/ExponentialMovingAverage_cn.rst
--- a/doc/fluid/api_cn/optimizer_cn/FtrlOptimizer_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn/FtrlOptimizer_cn.rst
--- a/doc/fluid/api_cn/optimizer_cn/Ftrl_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn/Ftrl_cn.rst
--- a/doc/fluid/api_cn/optimizer_cn/LambOptimizer_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn/LambOptimizer_cn.rst
--- a/doc/fluid/api_cn/optimizer_cn/LarsMomentumOptimizer_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn/LarsMomentumOptimizer_cn.rst
--- a/doc/fluid/api_cn/optimizer_cn/LarsMomentum_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn/LarsMomentum_cn.rst
--- a/doc/fluid/api_cn/optimizer_cn/ModelAverage_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn/ModelAverage_cn.rst
--- a/doc/fluid/api_cn/optimizer_cn/MomentumOptimizer_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn/MomentumOptimizer_cn.rst
--- a/doc/fluid/api_cn/optimizer_cn/Momentum_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn/Momentum_cn.rst
--- a/doc/fluid/api_cn/optimizer_cn/PipelineOptimizer_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn/PipelineOptimizer_cn.rst
--- a/doc/fluid/api_cn/optimizer_cn/RMSPropOptimizer_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn/RMSPropOptimizer_cn.rst
--- a/doc/fluid/api_cn/optimizer_cn/SGDOptimizer_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn/SGDOptimizer_cn.rst
--- a/doc/fluid/api_cn/optimizer_cn/SGD_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn/SGD_cn.rst
--- a/doc/fluid/api_cn/profiler_cn.rst
+++ b/doc/fluid/api_cn/profiler_cn.rst
--- a/doc/fluid/api_cn/profiler_cn/cuda_profiler_cn.rst
+++ b/doc/fluid/api_cn/profiler_cn/cuda_profiler_cn.rst
--- a/doc/fluid/api_cn/profiler_cn/profiler_cn.rst
+++ b/doc/fluid/api_cn/profiler_cn/profiler_cn.rst
--- a/doc/fluid/api_cn/profiler_cn/reset_profiler_cn.rst
+++ b/doc/fluid/api_cn/profiler_cn/reset_profiler_cn.rst
--- a/doc/fluid/api_cn/profiler_cn/start_profiler_cn.rst
+++ b/doc/fluid/api_cn/profiler_cn/start_profiler_cn.rst
--- a/doc/fluid/api_cn/profiler_cn/stop_profiler_cn.rst
+++ b/doc/fluid/api_cn/profiler_cn/stop_profiler_cn.rst
--- a/doc/fluid/api_cn/regularizer_cn.rst
+++ b/doc/fluid/api_cn/regularizer_cn.rst
--- a/doc/fluid/api_cn/regularizer_cn/L1DecayRegularizer_cn.rst
+++ b/doc/fluid/api_cn/regularizer_cn/L1DecayRegularizer_cn.rst
--- a/doc/fluid/api_cn/regularizer_cn/L1Decay_cn.rst
+++ b/doc/fluid/api_cn/regularizer_cn/L1Decay_cn.rst
--- a/doc/fluid/api_cn/regularizer_cn/L2DecayRegularizer_cn.rst
+++ b/doc/fluid/api_cn/regularizer_cn/L2DecayRegularizer_cn.rst
--- a/doc/fluid/api_cn/regularizer_cn/L2Decay_cn.rst
+++ b/doc/fluid/api_cn/regularizer_cn/L2Decay_cn.rst
--- a/doc/fluid/api_cn/transpiler_cn.rst
+++ b/doc/fluid/api_cn/transpiler_cn.rst
--- a/doc/fluid/api_cn/transpiler_cn/DistributeTranspilerConfig_cn.rst
+++ b/doc/fluid/api_cn/transpiler_cn/DistributeTranspilerConfig_cn.rst
--- a/doc/fluid/api_cn/transpiler_cn/DistributeTranspiler_cn.rst
+++ b/doc/fluid/api_cn/transpiler_cn/DistributeTranspiler_cn.rst
--- a/doc/fluid/api_cn/transpiler_cn/HashName_cn.rst
+++ b/doc/fluid/api_cn/transpiler_cn/HashName_cn.rst
--- a/doc/fluid/api_cn/transpiler_cn/RoundRobin_cn.rst
+++ b/doc/fluid/api_cn/transpiler_cn/RoundRobin_cn.rst
--- a/doc/fluid/api_cn/transpiler_cn/memory_optimize_cn.rst
+++ b/doc/fluid/api_cn/transpiler_cn/memory_optimize_cn.rst
--- a/doc/fluid/api_cn/transpiler_cn/release_memory_cn.rst
+++ b/doc/fluid/api_cn/transpiler_cn/release_memory_cn.rst
--- a/doc/fluid/api_cn/unique_name_cn.rst
+++ b/doc/fluid/api_cn/unique_name_cn.rst
--- a/doc/fluid/api_cn/unique_name_cn/generate_cn.rst
+++ b/doc/fluid/api_cn/unique_name_cn/generate_cn.rst
--- a/doc/fluid/api_cn/unique_name_cn/guard_cn.rst
+++ b/doc/fluid/api_cn/unique_name_cn/guard_cn.rst
--- a/doc/fluid/api_cn/unique_name_cn/switch_cn.rst
+++ b/doc/fluid/api_cn/unique_name_cn/switch_cn.rst