Update0323 (#718)

* fix layers. ==> fluid.layers. (#688)
* fix data_reader_cn and dataset_cn (#685)
* fix textual content
* fix issue #682 3 3.2
* fix issues in conv
* Fix io_cn (#597)
* routine bug fix
* update text info
* fix io_cn all
* 持久性变量=>长期变量
* 1220 improve cnapi (#509)
* 1220 update cnapi
* Apply suggestions from code review Co-Authored-By: N haowang101779990 <31058429+haowang101779990@users.noreply.github.com>
* refine reader doc (#686)
* update VisualDL README to add more details (#694)
* improve save_load_variables cn (#689)
* improve save_load_variables
* Update save_load_variables.rst
* fix typo of models (#698)
* fix typo of models
* fix en
* Fix arguments according to 1.3 en + ceil,floor 翻译 +operator 翻译 (#695)
* Fix arguments according to 1.3 en + ceil,floor trans +operator trans
* fix LoDTensorArray spell
* update_reading_data (#699)
* add api_guides low_level backward parameter program_en (#696)
* add api_guides low_level backward parameter program_en
* Apply suggestions from code review Co-Authored-By: N zy0531 <48094155+zy0531@users.noreply.github.com>
* Apply suggestions from code review Co-Authored-By: N zy0531 <48094155+zy0531@users.noreply.github.com>
* Update backward_en.rst
* Update parameter_en.rst
* Update program_en.rst
* Update doc/fluid/api_guides/low_level/program_en.rst

60acca73 · Cheerego · GitHub · 6cd92b0b · 60acca73 · 60acca73
24 changed files
--- a/doc/fluid/api_cn/data/data_reader_cn.rst
+++ b/doc/fluid/api_cn/data/data_reader_cn.rst
@@ -11,7 +11,7 @@ DataFeeder
 .. py:class:: paddle.fluid.data_feeder.DataFeeder(feed_list, place, program=None)
-DataFeeder将读卡器返回的数据转换为可以输入Executor和ParallelExecutor的数据结构。读卡器通常返回一个小批量数据条目列表。列表中的每个数据条目都是一个样本。每个样本都是具有一个或多个特征的列表或元组。
+DataFeeder将reader返回的数据转换为可以输入Executor和ParallelExecutor的数据结构。reader通常返回一个小批量数据条目列表。列表中的每个数据条目都是一个样本。每个样本都是具有一个或多个特征的列表或元组。
 简单用法如下：
@@ -42,7 +42,7 @@ DataFeeder将读卡器返回的数据转换为可以输入Executor和ParallelExe
 参数：
    - **feed_list**  (list) –  将输入模型的变量或变量的名称。
    - **place**  (Place) – place表示将数据输入CPU或GPU，如果要将数据输入GPU，请使用fluid.CUDAPlace(i)（i表示GPU的ID），如果要将数据输入CPU，请使用fluid.CPUPlace()。
-    - **program**  (Program) –将数据输入的Program，如果Program为None，它将使用default_main_program() 。默认值None.
+    - **program**  (Program) –将数据输入的Program，如果Program为None，它将使用default_main_program() 。默认值None。
 抛出异常： 	``ValueError`` – 如果某些变量未在Program中出现
@@ -81,7 +81,7 @@ DataFeeder将读卡器返回的数据转换为可以输入Executor和ParallelExe
 需要多个mini-batches。每个mini-batch都将提前在每个设备上输入。
 参数：
-    - **iterable** (list|tuple) – 输入的数据
+    - **iterable** (list|tuple) – 输入的数据。
    - **num_places**  (int) – 设备编号，默认值为None。
 返回： 转换结果
@@ -96,19 +96,19 @@ DataFeeder将读卡器返回的数据转换为可以输入Executor和ParallelExe
 .. py:method::  decorate_reader(reader, multi_devices, num_places=None, drop_last=True)
-将输入数据转换成读卡器返回的多个mini-batches。每个mini-batch
+将输入数据转换成reader返回的多个mini-batches。每个mini-batch分别送入各设备中。
 参数：
-    - **reader** (function) – reader是可以生成数据的函数
+    - **reader** (function) – reader是可以生成数据的函数。
-    - **multi_devices** (bool) – 是否用多个设备
+    - **multi_devices** (bool) – 是否用多个设备。
    - **num_places** (int) – 如果multi_devices是True, 你可以指定GPU的使用数量, 如果multi_devices是None, 会使用当前机器的所有GPU ，默认值None。
-    - **drop_last** (bool) – 如果最后一个batch的大小小于batch_size，是否删除最后一个batch，默认值True。
+    - **drop_last** (bool) – 如果最后一个batch的大小小于batch_size，选择是否删除最后一个batch，默认值True。
 返回： 转换结果
 返回类型： dict
-引起异常： 	ValueError – 如果drop_last为False并且数据批不适合设备。
+抛出异常： 	``ValueError`` – 如果drop_last为False并且数据batch和设备数目不匹配。
 .. _cn_api_paddle_data_reader_reader:
@@ -120,14 +120,14 @@ Reader
 	- reader是一个读取数据（从文件、网络、随机数生成器等）并生成数据项的函数。
 	- reader creator是返回reader函数的函数。
-	- reader decorator是一个函数，它接受一个或多个读卡器，并返回一个读卡器。
+	- reader decorator是一个函数，它接受一个或多个reader，并返回一个reader。
-	- batch reader是一个函数，它读取数据（从读卡器、文件、网络、随机数生成器等）并生成一批数据项。 
+	- batch reader是一个函数，它读取数据（从reader、文件、网络、随机数生成器等）并生成一批数据项。 
 Data Reader Interface
 ------------------------------------
-的确，数据阅读器不必是读取和生成数据项的函数，它可以是任何不带参数的函数来创建一个iterable（任何东西都可以被用于 ``for x in iterable`` ):
+的确，data reader不必是读取和生成数据项的函数，它可以是任何不带参数的函数来创建一个iterable（任何东西都可以被用于 ``for x in iterable`` ):
 ..  code-block:: python
@@ -163,7 +163,7 @@ Data Reader Interface
 参数：
    - **func**  - 使用的函数. 函数类型应为(Sample) => Sample
-    - **readers**  - 其输出将用作func参数的读卡器。
+    - **readers**  - 其输出将用作func参数的reader。
 类型：callable
@@ -176,7 +176,7 @@ Data Reader Interface
 创建缓冲数据读取器。
-缓冲数据读卡器将读取数据条目并将其保存到缓冲区中。只要缓冲区不为空，就将继续从缓冲数据读取器读取数据。
+缓冲数据reader将读取数据条目并将其保存到缓冲区中。只要缓冲区不为空，就将继续从缓冲数据读取器读取数据。
 参数：
    - **reader** (callable) - 要读取的数据读取器
@@ -188,28 +188,28 @@ Data Reader Interface
 .. py:function::   paddle.reader.compose(*readers, **kwargs)
-创建一个数据读卡器，其输出是输入读卡器的组合。
+创建一个数据reader，其输出是输入reader的组合。
-如果输入读卡器输出以下数据项：（1，2）3（4，5），则组合读卡器将输出：（1，2，3，4，5）
+如果输入reader输出以下数据项：（1，2）3（4，5），则组合reader将输出：（1，2，3，4，5）。
 参数：
-    - **readers** - 将被组合的多个读取器
+    - **readers** - 将被组合的多个读取器。
-    - **check_alignment** (bool) - 如果为True，将检查输入读卡器是否正确对齐。如果为False，将不检查对齐，将丢弃跟踪输出。默认值True。 
+    - **check_alignment** (bool) - 如果为True，将检查输入reader是否正确对齐。如果为False，将不检查对齐，将丢弃跟踪输出。默认值True。 
 返回：新的数据读取器
-引起异常： 	``ComposeNotAligned`` – 读卡器的输出不一致。 当check_alignment设置为False，不会升高。 
+抛出异常： 	``ComposeNotAligned`` – reader的输出不一致。 当check_alignment设置为False，不会升高。 
 .. py:function:: paddle.reader.chain(*readers)
-创建一个数据读卡器，其输出是链接在一起的输入数据读卡器的输出。
+创建一个数据reader，其输出是链接在一起的输入数据reader的输出。
-如果输入读卡器输出以下数据条目：[0，0，0][1，1，1][2，2，2]，链接读卡器将输出：[0，0，0，1，1，1，2，2，2] 
+如果输入reader输出以下数据条目：[0，0，0][1，1，1][2，2，2]，链接reader将输出：[0，0，0，1，1，1，2，2，2] 。
 参数：
-    - **readers** – 输入的数据
+    - **readers** – 输入的数据。
 返回： 新的数据读取器
@@ -218,15 +218,15 @@ Data Reader Interface
 .. py:function:: paddle.reader.shuffle(reader, buf_size)
-创建数据读取器，该阅读器的数据输出将被无序排列。
+创建数据读取器，该reader的数据输出将被无序排列。
-由原始读卡器创建的迭代器的输出将被缓冲到shuffle缓冲区，然后进行打乱。打乱缓冲区的大小由参数buf_size决定。 
+由原始reader创建的迭代器的输出将被缓冲到shuffle缓冲区，然后进行打乱。打乱缓冲区的大小由参数buf_size决定。 
 参数：
-    - **reader** (callable)  – 输出会被打乱的原始读卡器
+    - **reader** (callable)  – 输出会被打乱的原始reader
    - **buf_size** (int)  – 打乱缓冲器的大小
-返回： 输出会被打乱的读卡器
+返回： 输出会被打乱的reader
 返回类型： callable
@@ -234,13 +234,13 @@ Data Reader Interface
 .. py:function:: paddle.reader.firstn(reader, n)
-限制读卡器可以返回的最大样本数。
+限制reader可以返回的最大样本数。
 参数：
-    - **reader** (callable)  – 要读取的数据读取器
+    - **reader** (callable)  – 要读取的数据读取器。
-    - **n** (int)  – 返回的最大样本数 
+    - **n** (int)  – 返回的最大样本数 。
-返回： 装饰读卡器
+返回： 装饰reader
 返回类型： callable
@@ -294,11 +294,11 @@ rtype:	string
 .. py:function:: paddle.reader.multiprocess_reader(readers, use_pipe=True, queue_size=1000)
-多进程读卡器使用python多进程从读卡器中读取数据，然后使用multi process.queue或multi process.pipe合并所有数据。进程号等于输入读卡器的编号，每个进程调用一个读卡器。
+多进程reader使用python多进程从reader中读取数据，然后使用multi process.queue或multi process.pipe合并所有数据。进程号等于输入reader的编号，每个进程调用一个reader。
 multiprocess.queue需要/dev/shm的rw访问权限，某些平台不支持。
-您需要首先创建多个读卡器，这些读卡器应该相互独立，这样每个进程都可以独立工作。
+您需要首先创建多个reader，这些reader应该相互独立，这样每个进程都可以独立工作。
 **代码示例**
@@ -314,11 +314,11 @@ multiprocess.queue需要/dev/shm的rw访问权限，某些平台不支持。
 .. py:class::paddle.reader.Fake
-Fake读卡器将缓存它读取的第一个数据，并将其输出data_num次。它用于缓存来自真实阅读器的数据，并将其用于速度测试。
+Fakereader将缓存它读取的第一个数据，并将其输出data_num次。它用于缓存来自真实reader的数据，并将其用于速度测试。
 参数：
-    - **reader** – 原始读取器
+    - **reader** – 原始读取器。
-    - **data_num** – 读卡器产生数据的次数 
+    - **data_num** – reader产生数据的次数 。
 返回： 一个Fake读取器
@@ -343,7 +343,7 @@ Creator包包含一些简单的reader creator，可以在用户Program中使用
 如果是numpy向量，则创建一个生成x个元素的读取器。或者，如果它是一个numpy矩阵，创建一个生成x行元素的读取器。或由最高维度索引的任何子超平面。 
 参数：
-    - **x** – 用于创建读卡器的numpy数组
+    - **x** – 用于创建reader的numpy数组。
 返回： 从x创建的数据读取器
@@ -359,7 +359,7 @@ Creator包包含一些简单的reader creator，可以在用户Program中使用
 .. py:function::  paddle.reader.creator.recordio(paths, buf_size=100)
-从给定的recordio文件路径创建数据读卡器，用“，”分隔“，支持全局模式。 
+从给定的recordio文件路径创建数据reader，用“，”分隔“，支持全局模式。 
 路径：recordio文件的路径，可以是字符串或字符串列表。
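
The reader conventions described in the diff above (a reader is a no-argument function that yields data items; a reader decorator takes readers and returns a new reader) can be sketched in plain Python. This is only an illustration of the semantics stated in the documentation, not the `paddle.reader` implementation; the names `make_reader`, `chain_readers`, `compose_readers` and `firstn_reader` are hypothetical, and the compose sketch simply stops at the shortest input instead of checking alignment.

```python
import itertools


def make_reader(items):
    """Reader creator (hypothetical helper): returns a no-argument reader."""
    def reader():
        for item in items:
            yield item
    return reader


def chain_readers(*readers):
    """Decorator sketch: output of each input reader, one after another."""
    def reader():
        return itertools.chain(*(r() for r in readers))
    return reader


def compose_readers(*readers):
    """Decorator sketch: merge the i-th items of all readers into one tuple.

    Assumption: no alignment check; iteration stops at the shortest reader.
    """
    def as_tuple(x):
        return x if isinstance(x, tuple) else (x,)

    def reader():
        for items in zip(*(r() for r in readers)):
            yield sum((as_tuple(i) for i in items), ())
    return reader


def firstn_reader(reader, n):
    """Decorator sketch: limit the reader to at most n samples."""
    def new_reader():
        return itertools.islice(reader(), n)
    return new_reader


composed = compose_readers(make_reader([(1, 2)]), make_reader([3]), make_reader([(4, 5)]))
chained = chain_readers(make_reader([0, 0, 0]), make_reader([1, 1, 1]), make_reader([2, 2, 2]))
limited = firstn_reader(make_reader(range(10)), 3)
```

Because every decorator returns another no-argument function, the sketches can be nested freely, for example `firstn_reader(chain_readers(a, b), 100)`.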

--- a/doc/fluid/api_cn/data/dataset_cn.rst
+++ b/doc/fluid/api_cn/data/dataset_cn.rst
@@ -153,7 +153,7 @@ imdb
 IMDB数据集。
-本模块的数据集从 http://ai.stanford.edu/%7Eamaas/data/sentiment/IMDB 数据集。这个数据集包含了一组25000个用于训练的极性电影评论数据和25000个用于测试的评论数据。此外，该模块还提供了用于构建词典的API。
+本模块的数据集从 http://ai.stanford.edu/%7Eamaas/data/sentiment/IMDB 数据集。这个数据集包含了25000条训练用电影评论数据，25000条测试用评论数据，且这些评论带有明显情感倾向。此外，该模块还提供了用于构建词典的API。
 .. py:function:: paddle.dataset.imdb.build_dict(pattern, cutoff)

--- a/doc/fluid/api_cn/executor_cn.rst
+++ b/doc/fluid/api_cn/executor_cn.rst
@@ -144,7 +144,7 @@ feed map为该program提供输入数据。fetch_list提供program训练结束后
 global_scope
 -------------------------------
-.. py:function:: paddle.fluid.global_scope ()
+.. py:function:: paddle.fluid.executor.global_scope ()
 获取全局/默认作用域实例。很多api使用默认 ``global_scope`` ，例如 ``Executor.run`` 。
@@ -163,7 +163,7 @@ global_scope
 scope_guard
 -------------------------------
-.. py:function:: paddle.fluid.scope_guard (scope)
+.. py:function:: paddle.fluid.executor.scope_guard (scope)
 修改全局/默认作用域（scope）,  运行时中的所有变量都将分配给新的scope。

--- a/doc/fluid/api_cn/fluid_cn.rst
+++ b/doc/fluid/api_cn/fluid_cn.rst
@@ -162,7 +162,9 @@ BuildStrategy
 str类型。它表明了以graphviz格式向文件中写入SSA图的路径，有利于调试。 默认值为""。
+.. py:attribute:: enable_sequential_execution
+类型是BOOL。 如果设置为True，则ops的执行顺序将与program中的执行顺序相同。 默认为False。
 .. py:attribute:: fuse_elewise_add_act_ops
@@ -1015,10 +1017,10 @@ feed map为该program提供输入数据。fetch_list提供program训练结束后
 ..  code-block:: python
-	data = layers.data(name='X', shape=[1], dtype='float32')
+	data = fluid.layers.data(name='X', shape=[1], dtype='float32')
-	hidden = layers.fc(input=data, size=10)
+	hidden = fluid.layers.fc(input=data, size=10)
 	layers.assign(hidden, out)
-	loss = layers.mean(out)
+	loss = fluid.layers.mean(out)
 	adam = fluid.optimizer.Adam()
 	adam.minimize(loss)
@@ -1183,7 +1185,7 @@ LoDTensorArray
 .. py:method:: append(self: paddle.fluid.core.LoDTensorArray, tensor: paddle.fluid.core.LoDTensor) → None
-将LoDensor追加到LoDTensorArray后。
+将LoDTensor追加到LoDTensorArray后。

--- a/doc/fluid/api_cn/initializer_cn.rst
+++ b/doc/fluid/api_cn/initializer_cn.rst
@@ -116,7 +116,7 @@ force_init_on_cpu
 init_on_cpu
 -------------------------------
-.. py:function:: paddle.fluid.initializer.init_on_cpu(*args, **kwds)
+.. py:function:: paddle.fluid.initializer.init_on_cpu()
 强制变量在 cpu 上初始化。
@@ -125,7 +125,7 @@ init_on_cpu
 .. code-block:: python
        with init_on_cpu():
-                step = layers.create_global_var()
+                step = fluid.layers.create_global_var()

--- a/doc/fluid/api_cn/io_cn.rst
+++ b/doc/fluid/api_cn/io_cn.rst
@@ -9,9 +9,9 @@
 load_inference_model
 -------------------------------
-.. py:class:: paddle.fluid.io.load_inference_model(dirname, executor, model_filename=None, params_filename=None, pserver_endpoints=None)
+.. py:function:: paddle.fluid.io.load_inference_model(dirname, executor, model_filename=None, params_filename=None, pserver_endpoints=None)
-从指定目录中加载预测模型model(inference model)
+从指定目录中加载预测模型(inference model)。
 参数:
  - **dirname** (str) – model的路径
@@ -52,13 +52,13 @@ load_inference_model
 load_params
 -------------------------------
-.. py:class:: paddle.fluid.io.load_params(executor, dirname, main_program=None, filename=None)
+.. py:function:: paddle.fluid.io.load_params(executor, dirname, main_program=None, filename=None)
-该函数过滤掉 给定 ``main_program`` 中所有参数，然后将它们加载保存在到目录 ``dirname`` 中或文件中的参数。
+该函数从给定 ``main_program`` 中取出所有参数，然后从目录 ``dirname`` 中或 ``filename`` 指定的文件中加载这些参数。
-``dirname`` 用于指定保存变量的目录。如果变量保存在指定目录的若干文件中，设置文件名 None; 如果所有变量保存在一个文件中，请使用filename来指定它
+``dirname`` 用于存有变量的目录。如果变量保存在指定目录的若干文件中，设置文件名 None; 如果所有变量保存在一个文件中，请使用filename来指明这个文件。
-注意:有些变量不是参数，但它们对于训练是必要的。因此，您不能仅通过 ``save_params()`` 和 ``load_params()`` 保存并之后继续训练。可以使用 ``save_persistables()`` 和 ``load_persistables()`` 代替这两个函数
+注意:有些变量不是参数，但它们对于训练是必要的。因此，调用 ``save_params()`` 和 ``load_params()`` 来保存和加载参数是不够的，可以使用 ``save_persistables()`` 和 ``load_persistables()`` 代替这两个函数。
 参数:
 - **executor**  (Executor) – 加载变量的 executor
@@ -89,11 +89,11 @@ load_params
 load_persistables
 -------------------------------
-.. py:class:: paddle.fluid.io.load_persistables(executor, dirname, main_program=None, filename=None)
+.. py:function:: paddle.fluid.io.load_persistables(executor, dirname, main_program=None, filename=None)
-该函数过滤掉 给定 ``main_program`` 中所有参数，然后将它们加载保存在到目录 ``dirname`` 中或文件中的参数。
+该函数从给定 ``main_program`` 中取出所有 ``persistable==True`` 的变量（即长期变量），然后将它们从目录 ``dirname`` 中或 ``filename`` 指定的文件中加载出来。
-``dirname`` 用于指定保存变量的目录。如果变量保存在指定目录的若干文件中，设置文件名 None; 如果所有变量保存在一个文件中，请使用filename来指定它
+``dirname`` 用于指定存有长期变量的目录。如果变量保存在指定目录的若干文件中，设置文件名 None; 如果所有变量保存在一个文件中，请使用filename来指定它。
 参数:
    - **executor**  (Executor) – 加载变量的 executor
@@ -124,13 +124,13 @@ load_persistables
 load_vars
 -------------------------------
-.. py:class:: paddle.fluid.io.load_vars(executor, dirname, main_program=None, vars=None, predicate=None, filename=None)
+.. py:function:: paddle.fluid.io.load_vars(executor, dirname, main_program=None, vars=None, predicate=None, filename=None)
 ``executor`` 从指定目录加载变量。
 有两种方法来加载变量:方法一，``vars`` 为变量的列表。方法二，将已存在的 ``Program`` 赋值给 ``main_program`` ，然后将加载 ``Program`` 中的所有变量。第一种方法优先级更高。如果指定了 vars，那么忽略 ``main_program`` 和 ``predicate`` 。
-``dirname`` 用于指定加载变量的目录。如果变量保存在指定目录的若干文件中，设置文件名 None; 如果所有变量保存在一个文件中，请使用 ``filename`` 来指定它
+``dirname`` 用于指定加载变量的目录。如果变量保存在指定目录的若干文件中，设置文件名 None; 如果所有变量保存在一个文件中，请使用 ``filename`` 来指定它。
 参数:
 - **executor**  (Executor) – 加载变量的 executor
@@ -182,11 +182,12 @@ load_vars
 save_inference_model
 -------------------------------
-.. py:class:: paddle.fluid.io.save_inference_model(dirname, feeded_var_names, target_vars, executor, main_program=None, model_filename=None, params_filename=None, export_for_deployment=True)
+.. py:function:: paddle.fluid.io.save_inference_model(dirname, feeded_var_names, target_vars, executor, main_program=None, model_filename=None, params_filename=None, export_for_deployment=True)
-修改指定的 ``main_program`` ，构建一个专门用预测的 ``Program``，然后  ``executor`` 把它和所有相关参数保存到 ``dirname`` 中
+修改指定的 ``main_program`` ，构建一个专门用于预测的 ``Program``，然后  ``executor`` 把它和所有相关参数保存到 ``dirname`` 中。
-``dirname`` 用于指定保存变量的目录。如果变量保存在指定目录的若干文件中，设置文件名 None; 如果所有变量保存在一个文件中，请使用filename来指定它
+``dirname`` 用于指定保存变量的目录。如果变量保存在指定目录的若干文件中，设置文件名 None; 如果所有变量保存在一个文件中，请使用filename来指定它。
 参数:
  - **dirname** (str) – 保存预测model的路径
@@ -229,13 +230,13 @@ save_inference_model
 save_params
 -------------------------------
-.. py:class:: paddle.fluid.io.save_params(executor, dirname, main_program=None, filename=None)
+.. py:function:: paddle.fluid.io.save_params(executor, dirname, main_program=None, filename=None)
-该函数过滤掉 给定 ``main_program`` 中所有参数，然后将它们保存到目录 ``dirname`` 中或文件中。
+该函数从 ``main_program`` 中取出所有参数，然后将它们保存到 ``dirname`` 目录下或名为 ``filename`` 的文件中。
-``dirname`` 用于指定保存变量的目录。如果想将变量保存到指定目录的若干文件中，设置文件名 None; 如果想将所有变量保存在一个文件中，请使用filename来指定它
+``dirname`` 用于指定保存变量的目标目录。如果想将变量保存到多个独立文件中，设置 ``filename`` 为 None; 如果想将所有变量保存在单个文件中，请使用 ``filename`` 来指定该文件的命名。
-注意:有些变量不是参数，但它们对于训练是必要的。因此，您不能仅通过 ``save_params()`` 和 ``load_params()`` 保存并之后继续训练。可以使用 ``save_persistables()`` 和 ``load_persistables()`` 代替这两个函数
+注意:有些变量不是参数，但它们对于训练是必要的。因此，调用 ``save_params()`` 和 ``load_params()`` 来保存和加载参数是不够的，可以使用 ``save_persistables()`` 和 ``load_persistables()`` 代替这两个函数。
 参数:
@@ -243,7 +244,7 @@ save_params
 - **dirname**  (str) – 目录路径
 - **main_program**  (Program|None) – 需要保存变量的 Program。如果为 None，则使用 default_main_Program 。默认值: None
 - **vars**  (list[Variable]|None) –  要保存的所有变量的列表。 优先级高于main_program。默认值: None
- - **filename**  (str|None) – 保存变量的文件。如果想分开保存变量，设置 filename=None. 默认值: None
+ - **filename**  (str|None) – 保存变量的文件。如果想分不同独立文件来保存变量，设置 filename=None. 默认值: None
 返回: None
@@ -268,11 +269,11 @@ save_params
 save_persistables
 -------------------------------
-.. py:class:: paddle.fluid.io.save_persistables(executor, dirname, main_program=None, filename=None)
+.. py:function:: paddle.fluid.io.save_persistables(executor, dirname, main_program=None, filename=None)
-该函数过滤掉 给定 ``main_program`` 中所有参数，然后将它们保存到目录 ``dirname`` 中或文件中。
+该函数从给定 ``main_program`` 中取出所有 ``persistable==True`` 的变量，然后将它们保存到目录 ``dirname`` 中或 ``filename`` 指定的文件中。
-``dirname`` 用于指定保存变量的目录。如果想将变量保存到指定目录的若干文件中，设置 ``filename=None`` ; 如果想将所有变量保存在一个文件中，请使用 ``filename`` 来指定它
+``dirname`` 用于指定保存长期变量的目录。如果想将变量保存到指定目录的若干文件中，设置 ``filename=None`` ; 如果想将所有变量保存在一个文件中，请使用 ``filename`` 来指定它。
 参数:
 - **executor**  (Executor) – 保存变量的 executor
@@ -306,7 +307,7 @@ save_persistables
 save_vars
 -------------------------------
-.. py:class:: paddle.fluid.io.save_vars(executor, dirname, main_program=None, vars=None, predicate=None, filename=None)
+.. py:function:: paddle.fluid.io.save_vars(executor, dirname, main_program=None, vars=None, predicate=None, filename=None)
 通过 ``Executor`` ,此函数将变量保存到指定目录下。

--- a/doc/fluid/api_cn/layers_cn.rst
+++ b/doc/fluid/api_cn/layers_cn.rst
--- a/doc/fluid/api_cn/metrics_cn.rst
+++ b/doc/fluid/api_cn/metrics_cn.rst
@@ -105,7 +105,7 @@ ChunkEvaluator
        labels = fluid.layers.data(name="data", shape=[1], dtype="int32")
        data = fluid.layers.data(name="data", shape=[32, 32], dtype="int32")
        pred = fluid.layers.fc(input=data, size=1000, act="tanh")
-        precision, recall, f1_score, num_infer_chunks, num_label_chunks, num_correct_chunks = layers.chunk_eval(
+        precision, recall, f1_score, num_infer_chunks, num_label_chunks, num_correct_chunks = fluid.layers.chunk_eval(
        input=pred,
        label=label)
        metric = fluid.metrics.ChunkEvaluator()

--- a/doc/fluid/api_cn/optimizer_cn.rst
+++ b/doc/fluid/api_cn/optimizer_cn.rst
@@ -331,7 +331,7 @@ LarsMomentum
 LarsMomentumOptimizer
 -------------------------------
-.. py:function:: paddle.fluid.optimizer.LarsMomentumOptimizer(learning_rate, momentum, lars_coeff=0.001, lars_weight_decay=0.0005, regularization=None, name=None)
+.. py:class:: paddle.fluid.optimizer.LarsMomentumOptimizer(learning_rate, momentum, lars_coeff=0.001, lars_weight_decay=0.0005, regularization=None, name=None)
 LARS支持的Momentum优化器

--- a/doc/fluid/api_guides/index.rst
+++ b/doc/fluid/api_guides/index.rst
@@ -18,3 +18,6 @@ API使用指南分功能向您介绍PaddlePaddle Fluid的API体系和用法，
    low_level/memory_optimize.rst
    low_level/nets.rst
    low_level/parallel_executor.rst
+    low_level/backward.rst
+    low_level/parameter.rst
+    low_level/program.rst
--- a/doc/fluid/api_guides/index_en.rst
+++ b/doc/fluid/api_guides/index_en.rst
@@ -18,3 +18,6 @@ This section introduces the Fluid API structure and usage, to help you quickly g
    low_level/memory_optimize_en.rst
    low_level/nets_en.rst
    low_level/parallel_executor_en.rst
+    low_level/backward_en.rst
+    low_level/parameter_en.rst
+    low_level/program_en.rst
--- a/doc/fluid/api_guides/low_level/backward_en.rst
+++ b/doc/fluid/api_guides/low_level/backward_en.rst
+.. _api_guide_backward_en:
+################
+Back Propagation
+################
+The ability of a neural network to define a model depends on the optimization algorithm. Optimization is the process of repeatedly calculating gradients and adjusting the learnable parameters. Refer to :ref:`api_guide_optimizer_en` to learn more about the optimization algorithms in Fluid.
+During network training, gradient calculation is divided into two steps: forward computing and `back propagation <https://en.wikipedia.org/wiki/Backpropagation>`_ .
+Forward computing transfers the state of the input unit to the output unit according to the network structure you build.
+Back propagation calculates the derivatives of two or more compound functions by means of the `chain rule <https://en.wikipedia.org/wiki/Chain_rule>`_ . The gradient of the output unit is propagated back to the input unit, and the learnable parameters of the network are adjusted according to the calculated gradient.
+You can refer to `back propagation algorithm <http://deeplearning.stanford.edu/wiki/index.php/%E5%8F%8D%E5%90%91%E4%BC%A0%E5%AF%BC%E7%AE%97%E6%B3%95>`_ for the detailed implementation process.
+We do not recommend directly calling backpropagation-related APIs in  :code:`fluid` , as these are very low-level APIs. Consider using the relevant APIs in :ref:`api_guide_optimizer_en` instead. When you use optimizer APIs, Fluid automatically calculates the complex back-propagation for you.
+If you want to implement it yourself, you can also use :code:`callback` in :ref:`api_fluid_backward_append_backward` to define a customized gradient form for an Operator.
+For more information, please refer to: :ref:`api_fluid_backward_append_backward`
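
The forward computing and back propagation steps described in the added guide can be illustrated with a tiny hand-written example. This is a minimal sketch in plain Python, assuming a single linear unit `y = w * x + b` with a squared-error loss; the functions `forward` and `backward` are hypothetical and do not use any Fluid API.

```python
def forward(w, b, x, target):
    """Forward computing: propagate the input to the output and the loss."""
    y = w * x + b
    loss = (y - target) ** 2
    return y, loss


def backward(w, b, x, target):
    """Back propagation via the chain rule.

    dloss/dw = dloss/dy * dy/dw and dloss/db = dloss/dy * dy/db.
    """
    y, _ = forward(w, b, x, target)
    dloss_dy = 2.0 * (y - target)  # derivative of (y - target) ** 2
    dw = dloss_dy * x              # dy/dw = x
    db = dloss_dy * 1.0            # dy/db = 1
    return dw, db


y, loss = forward(2.0, 1.0, 3.0, 5.0)
dw, db = backward(2.0, 1.0, 3.0, 5.0)
```

An optimizer would then move `w` and `b` a small step against `dw` and `db`; in Fluid this bookkeeping is what the optimizer APIs do for you.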
--- a/doc/fluid/api_guides/low_level/parameter.rst
+++ b/doc/fluid/api_guides/low_level/parameter.rst
@@ -4,7 +4,7 @@
 模型参数
 #########
-模型参数为模型中的weight和bias统称，在fluid中对应fluid.Parameter类，继承自fluid.Variable，是一种可持久化的variable。模型的训练就是不断学习更新模型参数的过程。模型参数相关的属性可以通过 :ref:`cn_api_fluid_param_attr_ParamAttr` 来配置，可配置内容有：
+模型参数为模型中的weight和bias统称，在fluid中对应fluid.Parameter类，继承自fluid.Variable，是一种可持久化的variable。模型的训练就是不断学习更新模型参数的过程。模型参数相关的属性可以通过 :ref:`cn_api_fluid_ParamAttr` 来配置，可配置内容有：
 - 初始化方式
 - 正则化

--- a/doc/fluid/api_guides/low_level/parameter_en.rst
+++ b/doc/fluid/api_guides/low_level/parameter_en.rst
+..  _api_guide_parameter_en:
+##################
+Model Parameters
+##################
+Model parameters are the weights and biases in a model. In fluid, they are instances of the ``fluid.Parameter`` class, which inherits from ``fluid.Variable`` , and they are all persistable variables. Model training is a process of learning and updating model parameters. The attributes related to model parameters can be configured by :ref:`api_fluid_ParamAttr` . The configurable contents are as follows:
+- Initialization method
+- Regularization
+- Gradient clipping
+- Model averaging
+Initialization method
+========================
+Fluid initializes a single parameter by setting attributes of :code:`initializer` in :code:`ParamAttr` .
+Examples:
+  .. code-block:: python
+      param_attrs = fluid.ParamAttr(name="fc_weight",
+                                initializer=fluid.initializer.ConstantInitializer(1.0))
+      y_predict = fluid.layers.fc(input=x, size=10, param_attr=param_attrs)
+The following initialization methods are supported by fluid:
+1. BilinearInitializer
+-----------------------
+Linear initialization. The deconvolution operation initialized by this method can be used as a linear interpolation operation.
+Alias: Bilinear
+API reference: :ref:`api_fluid_initializer_BilinearInitializer`
+2. ConstantInitializer
+--------------------------
+Constant initialization. Initialize the parameter to the specified value.
+Alias: Constant
+API reference: :ref:`api_fluid_initializer_ConstantInitializer`
+3. MSRAInitializer
+----------------------
+Please refer to https://arxiv.org/abs/1502.01852 for initialization.
+Alias: MSRA
+API reference: :ref:`api_fluid_initializer_MSRAInitializer`
+4. NormalInitializer
+-------------------------
+Random initialization from a Gaussian distribution.
+Alias: Normal
+API reference: :ref:`api_fluid_initializer_NormalInitializer`
+5. TruncatedNormalInitializer
+---------------------------------
+Random initialization from a truncated Gaussian distribution.
+Alias: TruncatedNormal
+API reference: :ref:`api_fluid_initializer_TruncatedNormalInitializer`
+6. UniformInitializer
+------------------------
+Random initialization from a uniform distribution.
+Alias: Uniform
+API reference: :ref:`api_fluid_initializer_UniformInitializer`
+7. XavierInitializer
+------------------------
+Please refer to http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf for initialization.
+Alias: Xavier
+API reference: :ref:`api_fluid_initializer_XavierInitializer`
+Regularization
+=================
+Fluid regularizes a single parameter by setting attributes of :code:`regularizer` in :code:`ParamAttr` .
+  .. code-block:: python
+      param_attrs = fluid.ParamAttr(name="fc_weight",
+                                regularizer=fluid.regularizer.L1DecayRegularizer(0.1))
+      y_predict = fluid.layers.fc(input=x, size=10, param_attr=param_attrs)
+The following regularization approaches are supported by fluid:
+-  :ref:`api_fluid_regularizer_L1DecayRegularizer` (Alias: L1Decay)
+-  :ref:`api_fluid_regularizer_L2DecayRegularizer` (Alias: L2Decay)
+Clipping
+==========
+Fluid sets clipping method for a single parameter by setting attributes of :code:`gradient_clip` in :code:`ParamAttr` .
+  .. code-block:: python
+      param_attrs = fluid.ParamAttr(name="fc_weight",
+                                gradient_clip=fluid.clip.GradientClipByValue(max=1.0, min=-1.0))
+      y_predict = fluid.layers.fc(input=x, size=10, param_attr=param_attrs)
+The following clipping methods are supported by fluid:
+1. ErrorClipByValue
+----------------------
+Clips the values of a tensor to a specified range.
+API reference: :ref:`api_fluid_clip_ErrorClipByValue`
+2. GradientClipByGlobalNorm
+------------------------------
+Limits the global norm of multiple Tensors to :code:`clip_norm` .
+API reference: :ref:`api_fluid_clip_GradientClipByGlobalNorm`
+3. GradientClipByNorm
+------------------------
+Limits the L2-norm of a Tensor to :code:`max_norm` . If the Tensor's L2-norm exceeds :code:`max_norm` ,
+a :code:`scale` is calculated and every value of the Tensor is multiplied by this :code:`scale` .
+API reference: :ref:`api_fluid_clip_GradientClipByNorm`
+4. GradientClipByValue
+-------------------------
+Limits the value of the gradient on a parameter to [min, max].
+API reference: :ref:`api_fluid_clip_GradientClipByValue`
+Model Averaging
+================
+Fluid determines whether to average a single parameter by setting attributes of :code:`do_model_average` in :code:`ParamAttr` .
+Examples:
+  .. code-block:: python
+      param_attrs = fluid.ParamAttr(name="fc_weight",
+                                do_model_average=True)
+      y_predict = fluid.layers.fc(input=x, size=10, param_attr=param_attrs)
+In mini-batch training, parameters are updated once after each batch, and model averaging averages the parameters produced by the latest K updates.
+The averaged parameters are only used for testing and prediction, and they do not get involved in the actual training process.
+API reference: :ref:`api_fluid_optimizer_ModelAverage`
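
The clipping and model-averaging behaviour described in the added guide can be sketched in plain Python. This is an illustration of the stated rules only (clip each value to a range, rescale when the L2-norm exceeds `max_norm`, average the latest K parameter updates), not Fluid's implementation; `clip_by_value`, `clip_by_norm` and `LastKAverage` are hypothetical names.

```python
import math
from collections import deque


def clip_by_value(values, min_value, max_value):
    """Limit every value to the range [min_value, max_value]."""
    return [min(max(v, min_value), max_value) for v in values]


def clip_by_norm(values, max_norm):
    """Rescale the values if their L2-norm exceeds max_norm."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm <= max_norm:
        return list(values)
    scale = max_norm / norm
    return [v * scale for v in values]


class LastKAverage:
    """Keep the parameters from the latest k updates and average them."""

    def __init__(self, k):
        self.history = deque(maxlen=k)  # older updates are dropped automatically

    def update(self, params):
        self.history.append(list(params))

    def average(self):
        if not self.history:
            raise ValueError("no parameter updates recorded yet")
        n = len(self.history)
        return [sum(column) / n for column in zip(*self.history)]


clipped = clip_by_value([-2.0, 0.5, 3.0], -1.0, 1.0)
rescaled = clip_by_norm([3.0, 4.0], 1.0)

averager = LastKAverage(k=2)
for step_params in ([1.0], [2.0], [4.0]):
    averager.update(step_params)
averaged = averager.average()
```

As in the guide, the averaged values would only be used for testing and prediction; training continues from the latest, non-averaged parameters.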
--- a/doc/fluid/api_guides/low_level/program_en.rst
+++ b/doc/fluid/api_guides/low_level/program_en.rst
+.. _api_guide_Program_en:
+###############################
+Program/Block/Operator/Variable
+###############################
+==================
+Program
+==================
+:code:`Fluid` describes a neural network configuration as an abstract syntax tree, similar to that of a programming language, and the user's description of the computation is written into a Program. A Program in Fluid replaces the concept of a model in traditional frameworks. It can describe arbitrarily complex models through three execution structures: sequential execution, conditional selection and loop execution. Writing a :code:`Program` is very close to writing an ordinary program. If you have programmed before, you can naturally apply that experience to it.
+In brief:
+* A model is a Fluid :code:`Program`  and can contain more than one :code:`Program` ;
+* :code:`Program` consists of nested :code:`Block` , and the concept of :code:`Block` can be analogized to a pair of braces in C++ or Java, or an indentation block in Python.
+* Computing in :code:`Block` is composed of three ways: sequential execution, conditional selection or loop execution, which constitutes complex computational logic.
+* :code:`Block` contains descriptions of computation and computational objects. The description of computation is called Operator; the object of computation (or the input and output of Operator) is unified as Tensor. In Fluid, Tensor is represented by 0-leveled `LoD-Tensor <http://paddlepaddle.org/documentation/docs/zh/1.2/user_guides/howto/prepare_data/lod_tensor.html#permalink-4-lod-tensor>`_ .
+=========
+Block
+=========
+A :code:`Block` corresponds to the concept of variable scope in high-level languages. In a programming language, a block is a pair of braces that contains local variable definitions and a series of instructions or operators. The control flow structures :code:`if-else` and :code:`for` in programming languages have the following counterparts in deep learning:
+----------------------+-------------------------+
+| programming languages| Fluid                   |
+======================+=========================+
+| for, while loop      | RNN,WhileOP             |
+----------------------+-------------------------+
+| if-else, switch      | IfElseOp, SwitchOp      |
+----------------------+-------------------------+
+| execute sequentially | a series of layers      | 
+----------------------+-------------------------+
+As mentioned above,  :code:`Block` in Fluid describes a set of Operators that include sequential execution, conditional selection or loop execution, and the operating object of Operator: Tensor.
+=============
+Operator
+=============
+In Fluid, all operations of data are represented by :code:`Operator` . In Python, :code:`Operator` in Fluid is encapsulated into modules like :code:`paddle.fluid.layers` , :code:`paddle.fluid.nets` .
+This is because some common operations on Tensor may consist of more basic operations. For simplicity, some encapsulation of the basic Operator is carried out inside the framework, including the creation of learnable parameters relied by an Operator, the initialization details of learnable parameters, and so on, so as to reduce the cost of further development.
+For more information, refer to `Fluid Design Idea <../../advanced_usage/design_idea/fluid_design_idea.html>`_ .
+=========
+Variable
+=========
+In Fluid, a :code:`Variable` can contain any type of value -- in most cases a LoD-Tensor.
+All the learnable parameters in the model are kept in memory in the form of :code:`Variable` . In most cases, you do not need to create the learnable parameters in the network yourself. Fluid provides encapsulation for almost all common basic computing modules of neural networks. Taking the simplest fully connected model as an example, calling :code:`fluid.layers.fc` directly creates two learnable parameters for the fully connected layer, namely the connection weight (W) and the bias, without explicitly calling :code:`Variable` related interfaces to create learnable parameters.
+==================
+Related API
+==================
+* A single neural network configured by the user is called :ref:`api_fluid_Program` . It is noteworthy that when training neural networks, users often need to configure and operate multiple :code:`Program` . For example,  :code:`Program` for parameter initialization, :code:`Program` for training,  :code:`Program` for testing, etc.
+* Users can also use :ref:`api_fluid_program_guard` with :code:`with` to modify the configured :ref:`api_fluid_default_startup_program` and :ref:`api_fluid_default_main_program` .
+* In Fluid, the execution order in a Block is determined by control flow, such as :ref:`api_fluid_layers_IfElse` , :ref:`api_fluid_layers_While` and :ref:`api_fluid_layers_Switch` . For more information, please refer to :ref:`api_guide_control_flow_en`
--- a/doc/fluid/user_guides/howto/evaluation_and_debugging/debug/visualdl.md
+++ b/doc/fluid/user_guides/howto/evaluation_and_debugging/debug/visualdl.md
@@ -48,12 +48,19 @@ VisualDL 目前支持以下组件：
 可用于播放输入或生成的音频样本
 ### Graph
-兼容 ONNX(Open Neural Network Exchange)[https://github.com/onnx/onnx], 通过与 python SDK的结合，VisualDL可以兼容包括 PaddlePaddle, pytorch, mxnet在内的大部分主流DNN平台。
+VisualDL的graph支持paddle program的展示，同时兼容 ONNX(Open Neural Network Exchange)[https://github.com/onnx/onnx]，通过与 python SDK的结合，VisualDL可以兼容包括 PaddlePaddle, pytorch, mxnet在内的大部分主流DNN平台。
 <p align="center">
-  <img src="https://raw.githubusercontent.com/daming-lu/large_files/master/graph_demo.gif" width="60%" />
+  <img src="https://raw.githubusercontent.com/PaddlePaddle/VisualDL/develop/docs/images/graph_demo.gif" width="60%" />
 </p>
+要进行paddle模型的展示，需要进行以下两步操作：
+1. 在paddle代码中，调用`fluid.io.save_inference_model()`接口保存模型
+2. 在命令行界面，使用`visualdl --model_pb [paddle_model_dir]` 加载paddle模型
 ### High Dimensional
 用高维度数据映射在2D/3D来可视化嵌入
@@ -235,3 +242,8 @@ board 还支持一下参数来实现远程的访问：
 VisualDL 是由 [PaddlePaddle](http://www.paddlepaddle.org/) 和
 [ECharts](http://echarts.baidu.com/) 合作推出的开源项目。我们欢迎所有人使用，提意见以及贡献代码。
+## 更多细节
+想了解更多关于VisualDL的使用介绍，请查看[文档](https://github.com/PaddlePaddle/VisualDL/tree/develop/demo)
--- a/doc/fluid/user_guides/howto/evaluation_and_debugging/debug/visualdl_en.md
+++ b/doc/fluid/user_guides/howto/evaluation_and_debugging/debug/visualdl_en.md
@@ -52,14 +52,21 @@ Image can be used to visualize any tensor or intermediate generated image.
 Audio can be used to play input audio samples or generated audio samples.
 ### Graph
-Graph is compatible with ONNX ([Open Neural Network Exchange](https://github.com/onnx/onnx)),
+The VisualDL graph component can display Paddle models and is also compatible with ONNX ([Open Neural Network Exchange](https://github.com/onnx/onnx)).
 Cooperated with Python SDK, VisualDL can be compatible with most major DNN frameworks, including
 PaddlePaddle, PyTorch and MXNet.
 <p align="center">
-  <img src="https://raw.githubusercontent.com/daming-lu/large_files/master/graph_demo.gif" width="60%" />
+  <img src="https://raw.githubusercontent.com/PaddlePaddle/VisualDL/develop/docs/images/graph_demo.gif" width="60%" />
 </p>
+To display a Paddle model, follow these two steps:
+1. In your Paddle code, call the `fluid.io.save_inference_model()` interface to save the model.
+2. On the command line, run `visualdl --model_pb [paddle_model_dir]` to load the saved model.
 ### High Dimensional
 High Dimensional can be used to visualize data embeddings by projecting high-dimensional data into 2D / 3D.
@@ -251,3 +258,8 @@ visualDL also supports following optional parameters:
 VisualDL is initially created by [PaddlePaddle](http://www.paddlepaddle.org/) and
 [ECharts](http://echarts.baidu.com/).
 We welcome everyone to use, comment and contribute to Visual DL :)
+## More details
+For more details about how to use VisualDL, see the [documentation](https://github.com/PaddlePaddle/VisualDL/tree/develop/demo).
--- a/doc/fluid/user_guides/howto/prepare_data/feeding_data.rst
+++ b/doc/fluid/user_guides/howto/prepare_data/feeding_data.rst
 .. _user_guide_use_numpy_array_as_train_data:
-###########################
+##############
-使用Numpy Array作为训练数据
+同步数据读取
-###########################
+##############
 PaddlePaddle Fluid支持使用 :code:`fluid.layers.data()` 配置数据层；
 再使用 Numpy Array 或者直接使用Python创建C++的
@@ -84,7 +84,7 @@ PaddlePaddle Fluid支持使用 :code:`fluid.layers.data()` 配置数据层；
   exe.run(feed={
     "sentence": create_lod_tensor(
       data=numpy.array([1, 3, 4, 5, 3, 6, 8], dtype='int64').reshape(-1, 1),
-       lod=[4, 1, 2],
+       lod=[[4, 1, 2]],
       place=fluid.CPUPlace()
     )
   })
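
The nested ``lod=[[4, 1, 2]]`` in the example above can be read as one level of sequence lengths. The following is a minimal sketch in plain Python, not Fluid code, assuming the values are length-based (each entry is the number of rows in one sequence). The helper names ``lengths_to_offsets`` and ``split_by_lod`` are hypothetical and are not part of the Fluid API.

```python
def lengths_to_offsets(recursive_seq_lens):
    """Convert each length-based level into cumulative offsets."""
    offsets = []
    for level in recursive_seq_lens:
        level_offsets = [0]
        for length in level:
            level_offsets.append(level_offsets[-1] + length)
        offsets.append(level_offsets)
    return offsets


def split_by_lod(rows, recursive_seq_lens):
    """Split a flat list of rows into sequences using the innermost level."""
    innermost = recursive_seq_lens[-1]
    if sum(innermost) != len(rows):
        raise ValueError("innermost lengths must sum to the number of rows")
    bounds = lengths_to_offsets([innermost])[0]
    return [rows[start:end] for start, end in zip(bounds, bounds[1:])]


# Same 7 values as the feed example; lod is a list of levels, hence the nesting.
rows = [1, 3, 4, 5, 3, 6, 8]
lod = [[4, 1, 2]]
offsets = lengths_to_offsets(lod)
sequences = split_by_lod(rows, lod)
```

Under this reading, the seven rows form three sequences of 4, 1 and 2 rows. Each inner list is one level, so a single-level LoD still needs the outer list, which is why the flat ``[4, 1, 2]`` was replaced.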

--- a/doc/fluid/user_guides/howto/prepare_data/index.rst
+++ b/doc/fluid/user_guides/howto/prepare_data/index.rst
@@ -4,54 +4,104 @@
 准备数据
 ########
-PaddlePaddle Fluid支持两种传入数据的方式:
+使用PaddlePaddle Fluid准备数据分为两个步骤：
-1. Python Reader同步方式：用户需要使用 :code:`fluid.layers.data`
+Step1: 自定义Reader生成训练/预测数据
+###################################
+生成的数据类型可以为Numpy Array或LoDTensor。根据Reader返回的数据形式的不同，可分为Batch级的Reader和Sample（样本）级的Reader。
+Batch级的Reader每次返回一个Batch的数据，Sample级的Reader每次返回单个样本的数据
+如果您的数据是Sample级的数据，我们提供了一个可以组建batch及数据预处理的工具：:code:`Python Reader` 。
+Step2: 将数据送入网络进行训练/预测
+###################################
+Fluid提供两种方式，分别是同步Feed方式或异步py_reader接口方式，具体介绍如下：
+- 同步Feed方式
+用户需使用 :code:`fluid.layers.data`
 配置数据输入层，并在 :code:`fluid.Executor` 或 :code:`fluid.ParallelExecutor`
-中，使用 :code:`executor.run(feed=...)` 传入训练数据。
+中使用 :code:`executor.run(feed=...)` 传入训练数据。数据准备和模型训练/预测的过程是同步进行的，
+效率较低。
-2. py_reader接口异步方式：用户需要先使用 :code:`fluid.layers.py_reader` 配置数据输入层，然后使用
+- 异步py_reader接口方式
+用户需要先使用 :code:`fluid.layers.py_reader` 配置数据输入层，然后使用
 :code:`py_reader` 的 :code:`decorate_paddle_reader` 或 :code:`decorate_tensor_provider`
-方法配置数据源，再通过 :code:`fluid.layers.read_file` 读取数据。
+方法配置数据源，再通过 :code:`fluid.layers.read_file` 读取数据。数据传入与模型训练/预测过程是异步进行的，
+效率较高。
 这两种准备数据方法的比较如下:
 ========  =================================   =====================================
-对比项            Python Reader同步方式                py_reader接口异步方式
+对比项            同步Feed方式                          异步py_reader接口方式
 ========  =================================   =====================================
 API接口     :code:`executor.run(feed=...)`       :code:`fluid.layers.py_reader`
-数据格式              Numpy Array                   Numpy Array或LoDTensor
+数据格式         Numpy Array或LoDTensor               Numpy Array或LoDTensor
 数据增强          Python端使用其他库完成                  Python端使用其他库完成
 速度                     慢                                   快
 推荐用途                调试模型                              工业训练
 ========  =================================   =====================================
-Python Reader同步方式
+Reader数据类型对使用方式的影响
-#####################
+###############################
-Fluid提供Python Reader方式传入数据。
+根据Reader数据类型的不同，上述Step1和Step2的具体操作将有所不同，具体介绍如下:
-Python Reader是纯的Python端接口，数据传入与模型训练/预测过程是同步的。用户可通过Numpy Array传入
-数据，具体请参考:
-.. toctree::
+读取Sample级Reader数据
-   :maxdepth: 1
+++++++++++++++++++++
-   feeding_data.rst
+若自定义的Reader每次返回单个样本的数据，用户需通过以下步骤完成数据送入：
-Python Reader支持组batch、shuffle等高级功能，具体请参考：
+Step1. 组建数据
+=============================
+调用Fluid提供的Reader相关接口完成组batch和部分的数据预处理功能，具体请参见：
 .. toctree::
   :maxdepth: 1
   reader_cn.md
-py_reader接口异步方式
+Step2. 送入数据
-#####################
+=================================
+若使用同步Feed方式送入数据，请使用DataFeeder接口将Reader数据转换为LoDTensor格式后送入网络，具体请参见 :ref:`cn_api_fluid_DataFeeder`
+若使用异步py_reader接口方式送入数据，请调用 :code:`decorate_paddle_reader` 接口完成，具体请参见：
-Fluid提供PyReader异步数据传入方式，数据传入与模型训练/预测过程是异步的，效率较高。具体请参考：
+- :ref:`user_guides_use_py_reader`
+读取Batch级Reader数据
+++++++++++++++++++++++
+Step1. 组建数据
+=================
+由于Batch已经组好，已经满足了Step1的条件，可以直接进行Step2
+Step2. 送入数据
+=================================
+若使用同步Feed方式送入数据，具体请参见:
+.. toctree::
+   :maxdepth: 1
+   feeding_data.rst
+若使用异步py_reader接口方式送入数据，请调用py_reader的 :code:`decorate_tensor_provider` 接口完成，具体方式请参见:
 .. toctree::
   :maxdepth: 1
   use_py_reader.rst
--- a/doc/fluid/user_guides/howto/prepare_data/reader_cn.md
+++ b/doc/fluid/user_guides/howto/prepare_data/reader_cn.md
-# Python Reader
+# 数据预处理工具
-在模型训练和预测阶段，PaddlePaddle程序需要读取训练或预测数据。为了帮助用户编写数据读取的代码，我们提供了如下接口：
- *reader*: 用于读取数据的函数，数据可来自于文件、网络、随机数生成器等，函数每次返回一个数据项。
+在模型训练和预测阶段，PaddlePaddle程序需要读取训练或预测数据。为了帮助您编写数据读取的代码，我们提供了如下接口：
+- *reader*: 样本级的reader，用于读取数据的函数，数据可来自于文件、网络、随机数生成器等，函数每次返回一个样本数据项。
 - *reader creator*: 接受一个或多个reader作为参数、返回一个新reader的函数。
 - *reader decorator*: 一个函数，接受一个或多个reader，并返回一个reader。
 - *batch reader*: 用于读取数据的函数，数据可来自于文件、网络、随机数生成器等，函数每次返回一个batch大小的数据项。
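
The relationship between the interfaces listed above can be sketched in plain Python. This is a minimal illustration only, not PaddlePaddle's implementation: `sample_reader_creator`, `shuffle` and `batch` are hypothetical stand-ins, and their signatures are assumptions made for this example.

```python
import random


def sample_reader_creator(n):
    """Build a sample-level reader that yields n (feature, label) samples."""
    def reader():
        for i in range(n):
            yield [float(i)], i % 2
    return reader


def shuffle(reader, buf_size, seed=0):
    """Reader decorator: shuffle samples inside a buffer of buf_size."""
    def shuffled_reader():
        rng = random.Random(seed)
        buf = []
        for sample in reader():
            buf.append(sample)
            if len(buf) >= buf_size:
                rng.shuffle(buf)
                yield from buf
                buf = []
        # Flush the samples left in a partially filled buffer.
        rng.shuffle(buf)
        yield from buf
    return shuffled_reader


def batch(reader, batch_size):
    """Turn a sample-level reader into a batch reader."""
    def batch_reader():
        current = []
        for sample in reader():
            current.append(sample)
            if len(current) == batch_size:
                yield current
                current = []
        if current:  # keep the last, possibly smaller, batch
            yield current
    return batch_reader


train_reader = batch(shuffle(sample_reader_creator(10), buf_size=4), batch_size=3)
batches = list(train_reader())
```

Each wrapper takes a reader and returns a new reader, so wrappers can be chained freely. Only the outermost one changes the unit of iteration from a single sample to a batch.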
@@ -185,15 +186,4 @@ def image_reader_creator(image_path, label_path, n):
 # images_reader_creator创建一个reader
 reader = image_reader_creator("/path/to/image_file", "/path/to/label_file", 1024)
-paddle.train(paddle.batch(reader, 128), {"image":0, "label":1}, ...)
-```
-### `paddle.train`实现原理
-实现`paddle.train`的示例如下：
-```python
-def train(batch_reader, mapping, batch_size, total_pass):
-    for pass_idx in range(total_pass):
-        for mini_batch in batch_reader(): # this loop will never end in online learning.
-            do_forward_backward(mini_batch, mapping)
 ```
--- a/doc/fluid/user_guides/howto/prepare_data/use_py_reader.rst
+++ b/doc/fluid/user_guides/howto/prepare_data/use_py_reader.rst
 ..  _user_guides_use_py_reader:
-############################
+#############
-使用PyReader读取训练和测试数据
+异步数据读取
-############################
+#############
-除Python Reader方法外，我们提供了PyReader。PyReader的性能比 :ref:`user_guide_use_numpy_array_as_train_data` 更好，因为PyReader的数据读取和模型训练过程是异步进行的，且能与 :code:`double_buffer_reader` 配合以进一步提高数据读取性能。此外， :code:`double_buffer_reader` 负责异步完成CPU Tensor到GPU Tensor的转换，一定程度上提升了数据读取效率。
+除同步Feed方式外，我们提供了PyReader。PyReader的性能比 :ref:`user_guide_use_numpy_array_as_train_data` 更好，因为PyReader的数据读取和模型训练过程是异步进行的，且能与 :code:`double_buffer_reader` 配合以进一步提高数据读取性能。此外， :code:`double_buffer_reader` 负责异步完成CPU Tensor到GPU Tensor的转换，一定程度上提升了数据读取效率。
 创建PyReader对象
 ################################

--- a/doc/fluid/user_guides/howto/training/save_load_variables.rst
+++ b/doc/fluid/user_guides/howto/training/save_load_variables.rst
@@ -7,14 +7,14 @@
 模型变量分类
 ############
-在PaddlePaddle Fluid中，所有的模型变量都用 :code:`fluid.framework.Variable()` 作为基类进行表示。
+在PaddlePaddle Fluid中，所有的模型变量都用 :code:`fluid.framework.Variable()` 作为基类。
 在该基类之下，模型变量主要可以分为以下几种类别：
 1. 模型参数
-  模型参数是深度学习模型中被训练和学习的变量，在训练过程中，训练框架根据反向传播算法计算出每一个模型参数当前的梯度，
+  模型参数是深度学习模型中被训练和学习的变量，在训练过程中，训练框架根据反向传播(backpropagation)算法计算出每一个模型参数当前的梯度，
-  并用优化器根据梯度对参数进行更新。模型的训练过程本质上可以看做是模型参数不断迭代更新的过程。
+  并用优化器(optimizer)根据梯度对参数进行更新。模型的训练过程本质上可以看做是模型参数不断迭代更新的过程。
  在PaddlePaddle Fluid中，模型参数用 :code:`fluid.framework.Parameter` 来表示，
-  这是一个 :code:`fluid.framework.Variable()` 的派生类，除了 :code:`fluid.framework.Variable()` 具有的各项性质以外，
+  这是一个 :code:`fluid.framework.Variable()` 的派生类，除了具有 :code:`fluid.framework.Variable()` 的各项性质以外，
  :code:`fluid.framework.Parameter` 还可以配置自身的初始化方法、更新率等属性。
 2. 长期变量
@@ -33,7 +33,7 @@
 ################
 根据用途的不同，我们需要保存的模型变量也是不同的。例如，如果我们只是想保存模型用来进行以后的预测，
-那么只保存模型参数就够用了。但如果我们需要保存一个checkpoint以备将来恢复训练，
+那么只保存模型参数就够用了。但如果我们需要保存一个checkpoint（检查点，类似于存档，存有复现目前模型的必要信息）以备将来恢复训练，
 那么我们应该将各种长期变量都保存下来，甚至还需要记录一下当前的epoch和step的id。
 因为一些模型变量虽然不是参数，但对于模型的训练依然必不可少。
@@ -63,7 +63,8 @@
 如何载入模型变量
 ################
-与模型变量的保存相对应，我们提供了两套API来分别载入模型的参数和载入模型的长期变量。
+与模型变量的保存相对应，我们提供了两套API来分别载入模型的参数和载入模型的长期变量，分别为保存、加载模型参数的 ``save_params()`` 、 ``load_params()`` 和
+保存、加载长期变量的 ``save_persistables`` 、 ``load_persistables`` 。
 载入模型用于对新样本的预测
 ==========================
@@ -98,8 +99,9 @@
-预测所用的模型与参数的保存：
+预测模型的保存和加载
-##################
+##############################
 预测引擎提供了存储预测模型 :code:`fluid.io.save_inference_model` 和加载预测模型 :code:`fluid.io.load_inference_model` 两个接口。
 - :code:`fluid.io.save_inference_model`：请参考  :ref:`api_guide_inference`。
@@ -109,7 +111,8 @@
 增量训练
 ############
-增量训练指一个学习系统能不断地从新样本中学习新的知识，并能保存大部分以前已经学习到的知识。因此增量学习涉及到两点：在上一次训练结束的时候保存需要持久化的参数， 在下一次训练开始的时候加载上一次保存的持久化参数。 因此增量训练涉及到如下几个API:
+增量训练指一个学习系统能不断地从新样本中学习新的知识，并能保存大部分以前已经学习到的知识。因此增量学习涉及到两点：在上一次训练结束的时候保存需要的长期变量， 在下一次训练开始的时候加载上一次保存的这些长期变量。 因此增量训练涉及到如下几个API:
 :code:`fluid.io.save_persistables`、:code:`fluid.io.load_persistables` 。
 单机增量训练
@@ -152,14 +155,15 @@
-多机增量（不带分布式大规模稀疏矩阵）训练的一般步骤为：
+多机增量（不带分布式大规模稀疏矩阵）训练的一般步骤为
 ==========================
 多机增量训练和单机增量训练有若干不同点：
-1. 在训练的最后调用 :code:`fluid.io.save_persistables` 保存持久性参数时，不必要所有的trainer都调用这个方法，一般0号trainer来保存。
+1. 在训练的最后调用 :code:`fluid.io.save_persistables` 保存长期变量时，不必要所有的trainer都调用这个方法来保存，一般0号trainer来保存即可。
 2. 多机增量训练的参数加载在PServer端，trainer端不用加载参数。在PServer全部启动后，trainer会从PServer端同步参数。
-多机增量（不启用分布式大规模稀疏矩阵）训练的一般步骤为：
+多机增量（不带分布式大规模稀疏矩阵）训练的一般步骤为：
 1. 0号trainer在训练的最后调用 :code:`fluid.io.save_persistables` 保存持久性参数到指定的 :code:`path` 下。
 2. 通过HDFS等方式将0号trainer保存下来的所有的参数共享给所有的PServer(每个PServer都需要有完整的参数)。
@@ -186,7 +190,7 @@
    hadoop fs -mkdir /remote/$path
    hadoop fs -put $path /remote/$path
-上面的例子中，0号train通过调用 :code:`fluid.io.save_persistables` 函数，PaddlePaddle Fluid会从默认
+上面的例子中，0号trainer通过调用 :code:`fluid.io.save_persistables` 函数，PaddlePaddle Fluid会从默认
 :code:`fluid.Program` 也就是 :code:`prog` 的所有模型变量中找出长期变量，并将他们保存到指定的 :code:`path` 目录下。然后通过调用第三方的文件系统（如HDFS）将存储的模型进行上传到所有PServer都可访问的位置。
 对于训练过程中待载入参数的PServer， 例如：

--- a/doc/fluid/user_guides/models/index_cn.rst
+++ b/doc/fluid/user_guides/models/index_cn.rst
@@ -112,7 +112,7 @@ kaldi 的解码器完成解码。
 --------
 机器翻译（Machine
-Translation）将一种自然语言(源语言)转换成一种自然语言（目标语音），是自然语言处理中非常基础和重要的研究方向。在全球化的浪潮中，机器翻译在促进跨语言文明的交流中所起的重要作用是不言而喻的。其发展经历了统计机器翻译和基于神经网络的神经机器翻译(Nueural
+Translation）将一种自然语言(源语言)转换成一种自然语言（目标语言），是自然语言处理中非常基础和重要的研究方向。在全球化的浪潮中，机器翻译在促进跨语言文明的交流中所起的重要作用是不言而喻的。其发展经历了统计机器翻译和基于神经网络的神经机器翻译(Neural
 Machine Translation, NMT)等阶段。在 NMT
 成熟后，机器翻译才真正得以大规模应用。而早阶段的 NMT
 主要是基于循环神经网络 RNN

--- a/doc/fluid/user_guides/models/index_en.rst
+++ b/doc/fluid/user_guides/models/index_en.rst
@@ -92,7 +92,7 @@ Different from the end-to-end direct prediction for word distribution of the dee
 Machine Translation
 ---------------------
-Machine Translation transforms a natural language (source language) into another natural language (target speech), which is a very basic and important research direction in natural language processing. In the wave of globalization, the important role played by machine translation in promoting cross-language civilization communication is self-evident. Its development has gone through stages such as statistical machine translation and neural-network-based Neuro Machine Translation (NMT). After NMT matured, machine translation was really applied on a large scale. The early stage of NMT is mainly based on the recurrent neural network RNN. The current time step in the training process depends on the calculation of the previous time step, so it is difficult to parallelize the time steps to improve the training speed. Therefore, NMTs of non-RNN structures have emerged, such as structures based on convolutional neural networks CNN and structures based on Self-Attention.
+Machine Translation transforms a natural language (source language) into another natural language (target language), which is a very basic and important research direction in natural language processing. In the wave of globalization, the important role played by machine translation in promoting communication across languages and civilizations is self-evident. Its development has gone through stages such as statistical machine translation and neural-network-based Neural Machine Translation (NMT). Only after NMT matured was machine translation truly applied on a large scale. Early NMT was mainly based on recurrent neural networks (RNNs). In an RNN, the current time step in the training process depends on the calculation of the previous time step, so it is difficult to parallelize across time steps to improve the training speed. Therefore, NMT models with non-RNN structures have emerged, such as structures based on convolutional neural networks (CNNs) and structures based on Self-Attention.
 The Transformer implemented in this example is a machine translation model based on the self-attention mechanism, in which there is no more RNN or CNN structure, but fully utilizes Attention to learn the context dependency. Compared with RNN/CNN, in a single layer, this structure has lower computational complexity, easier parallelization, and easier modeling for long-range dependencies, and finally achieves the best translation effect among multiple languages.