[Dy2stat] Split Basic Usage and Architecture (#2642)

We split the usage and architecture documentation for clearer organization and easier future extension (more usage guides may be added).

f0e64cd0 · Huihuang Zheng · GitHub · cb3783a0 · f0e64cd0 · f0e64cd0
8 changed file
--- a/doc/paddle/guides/dygraph_to_static/basic_usage_cn.rst
+++ b/doc/paddle/guides/dygraph_to_static/basic_usage_cn.rst
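+Basic Usage
+==============
+PaddlePaddle's main way of converting dygraph to static graph is ProgramTranslator, which works by source-level conversion. It analyzes the Python code, rewrites the dygraph code into static graph code, and automatically runs it with the static graph executor under the hood. This lets users build neural network models flexibly with Python syntax and control flow. In addition, PaddlePaddle provides TracedLayer, a tracing-based conversion API. It can serve as a fallback for cases that ProgramTranslator does not support but TracedLayer can run.
+Source-code-based ProgramTranslator
+-----------------------------------
+ProgramTranslator converts dygraph to static graph by source code rewriting: it analyzes the Python code, rewrites the dygraph code into static graph code, and automatically runs it with the executor under the hood. Basic usage is simple: add the decorator ``@paddle.jit.to_static`` before the function to convert (which can also be the ``forward`` function of a user-defined dygraph Layer). An example follows; the decorated function can be called directly to get the result: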
+.. code-block:: python
+    import numpy as np
+    import paddle
+    @paddle.jit.to_static
+    def func(input_var):
+        # the if condition depends on the shape of input_var
+        if input_var.shape[0] > 1:
+            out = paddle.cast(input_var, "float64")
+        else:
+            out = paddle.cast(input_var, "int64")
+        return out
+    paddle.disable_static()
+    in_np = np.array([-2]).astype('int')
+    input_var = paddle.to_tensor(in_np)
+    func(input_var)
+To save the converted static graph model, call ``paddle.jit.save`` . We define a simple fully connected network, ``SimpleFcLayer`` , and add the decorator to its ``forward`` function:
+.. code-block:: python
+    import numpy as np
+    import paddle
+    class SimpleFcLayer(paddle.nn.Layer):
+        def __init__(self, feature_size, batch_size, fc_size):
+            super(SimpleFcLayer, self).__init__()
+            self._linear = paddle.nn.Linear(feature_size, fc_size)
+            self._offset = paddle.to_tensor(
+                np.random.random((batch_size, fc_size)).astype('float32'))
+        @paddle.jit.to_static
+        def forward(self, x):
+            fc = self._linear(x)
+            return fc + self._offset
+The model can be saved with the ``paddle.jit.save`` API:
+.. code-block:: python
+    import paddle
+    paddle.disable_static()
+    fc_layer = SimpleFcLayer(3, 4, 2)
+    in_np = np.random.random([3, 4]).astype('float32')
+    input_var = paddle.to_tensor(in_np)
+    out = fc_layer(input_var)
+    paddle.jit.save(fc_layer, "./fc_layer_dy2stat", input_spec=[input_var])
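+Tracing-based TracedLayer
+-------------------------
+Tracing means recording which operators are executed while a model runs. TracedLayer is based on this technique: during a single dygraph execution it records every operator that runs, then builds and saves a static graph model. A usage example follows:
+We again define a simple fully connected network as an example. Note that, unlike with ProgramTranslator, no decorator is needed on the ``forward`` function: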
+.. code-block:: python
+    import numpy as np
+    import paddle
+    class SimpleFcLayer(paddle.nn.Layer):
+        def __init__(self, feature_size, batch_size, fc_size):
+            super(SimpleFcLayer, self).__init__()
+            self._linear = paddle.nn.Linear(feature_size, fc_size)
+            self._offset = paddle.to_tensor(
+                np.random.random((batch_size, fc_size)).astype('float32'))
+        def forward(self, x):
+            fc = self._linear(x)
+            return fc + self._offset
+Next, save the model with TracedLayer:
+.. code-block:: python
+    import paddle
+    from paddle.jit import TracedLayer
+    paddle.disable_static()
+    fc_layer = SimpleFcLayer(3, 4, 2)
+    in_np = np.random.random([3, 4]).astype('float32')
+    # Convert the numpy ndarray into a Tensor
+    input_var = paddle.to_tensor(in_np)
+    # Convert the imperative model into a declarative model with TracedLayer.trace
+    out_dygraph, static_layer = TracedLayer.trace(fc_layer, inputs=[input_var])
+    save_dirname = './saved_infer_model'
+    # Save the converted model
+    static_layer.save_inference_model(save_dirname, feed=[0], fetch=[0])
+The loaded model can be run in static graph mode:
+.. code-block:: python
+    place = paddle.CPUPlace()
+    exe = paddle.Executor(place)
+    program, feed_vars, fetch_vars = paddle.io.load_inference_model(save_dirname, exe)
+    fetch, = exe.run(program, feed={feed_vars[0]: in_np}, fetch_list=fetch_vars)
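+However, as the principle above implies, tracing only records the operators involved in a single execution. If the user's model code contains control flow branches that depend on data (including input values or shapes), so that different operators run depending on the data, TracedLayer cannot work correctly. For example: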
+.. code-block:: python
+    import numpy as np
+    import paddle
+    def func(input_var):
+        # the if condition depends on the shape of input_var
+        if input_var.shape[0] > 1:
+            return paddle.cast(input_var, "float64")
+        else:
+            return paddle.cast(input_var, "int64")
+    paddle.disable_static()
+    in_np = np.array([-2]).astype('int')
+    input_var = paddle.to_tensor(in_np)
+    out = func(input_var)
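+If ``TracedLayer.trace(func, inputs=[input_var])`` is applied to ``func`` in the example above, tracing can only record the operators of the one if-else branch that actually ran, so the model cannot be saved with the if-else control flow on the shape of input_var that the user intended. The same applies to while/for loops.
+Comparing ProgramTranslator and TracedLayer
+-------------------------------------------
+Compared with the tracing-based TracedLayer, the source-code-based ProgramTranslator can handle control flow branches that depend on data. We therefore recommend ProgramTranslator, with TracedLayer as a fallback when problems arise.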
--- a/doc/paddle/guides/dygraph_to_static/basic_usage_en.rst
+++ b/doc/paddle/guides/dygraph_to_static/basic_usage_en.rst
+Basic Usage
+=============
+The recommended way to transform dygraph code into a static graph is ProgramTranslator, which is based on source code translation. It analyzes the Python source code, rewrites it into static graph code, and then runs that code with the Executor. Users can therefore build neural networks with ordinary Python syntax, including control flow. PaddlePaddle also provides a tracing-based API, TracedLayer, for the same purpose. You can use it as a fallback when ProgramTranslator does not support your code.
+ProgramTranslator
+-------------------
+ProgramTranslator is based on source code translation: it analyzes the Python source code, turns it into static graph code, and then runs that code with the Executor. Basic usage is simple: put the decorator ``@paddle.jit.to_static`` before the definition of the function to transform (the function can also be a method of a class, e.g., the ``forward`` function of a user-defined imperative Layer). An example:
+.. code-block:: python
+    import numpy as np
+    import paddle
+    @paddle.jit.to_static
+    def func(input_var):
+        # if condition depends on the shape of input_var
+        if input_var.shape[0] > 1:
+            out = paddle.cast(input_var, "float64")
+        else:
+            out = paddle.cast(input_var, "int64")
+        return out
+    paddle.disable_static()
+    in_np = np.array([-2]).astype('int')
+    input_var = paddle.to_tensor(in_np)
+    func(input_var)
+To save the transformed model, call ``paddle.jit.save`` . Take a fully connected network called ``SimpleFcLayer`` as an example; we put the decorator on its ``forward`` method:
+.. code-block:: python
+    import numpy as np
+    import paddle
+    class SimpleFcLayer(paddle.nn.Layer):
+        def __init__(self, feature_size, batch_size, fc_size):
+            super(SimpleFcLayer, self).__init__()
+            self._linear = paddle.nn.Linear(feature_size, fc_size)
+            self._offset = paddle.to_tensor(
+                np.random.random((batch_size, fc_size)).astype('float32'))
+        @paddle.jit.to_static
+        def forward(self, x):
+            fc = self._linear(x)
+            return fc + self._offset
+Call ``paddle.jit.save`` to save the above model:
+.. code-block:: python
+    import paddle
+    paddle.disable_static()
+    fc_layer = SimpleFcLayer(3, 4, 2)
+    in_np = np.random.random([3, 4]).astype('float32')
+    input_var = paddle.to_tensor(in_np)
+    out = fc_layer(input_var)
+    paddle.jit.save(fc_layer, "./fc_layer_dy2stat")
+TracedLayer
+-------------
+Tracing means recording the operators that are executed when a model runs. TracedLayer is based on this technique: it runs the dygraph program once and records all operators, then constructs a static graph model and saves it. A usage example follows:
+Define a simple fully connected network. Note that we don't add a decorator before the ``forward`` function as we did in the ProgramTranslator example:
+.. code-block:: python
+    import numpy as np
+    import paddle
+    class SimpleFcLayer(paddle.nn.Layer):
+        def __init__(self, feature_size, batch_size, fc_size):
+            super(SimpleFcLayer, self).__init__()
+            self._linear = paddle.nn.Linear(feature_size, fc_size)
+            self._offset = paddle.to_tensor(
+                np.random.random((batch_size, fc_size)).astype('float32'))
+        def forward(self, x):
+            fc = self._linear(x)
+            return fc + self._offset
+Save model by TracedLayer:
+.. code-block:: python
+    import paddle
+    from paddle.jit import TracedLayer
+    paddle.disable_static()
+    fc_layer = SimpleFcLayer(3, 4, 2)
+    in_np = np.random.random([3, 4]).astype('float32')
+    # Turn numpy ndarray into Tensor
+    input_var = paddle.to_tensor(in_np)
+    # Transform the imperative model into a declarative one with TracedLayer.trace
+    out_dygraph, static_layer = TracedLayer.trace(fc_layer, inputs=[input_var])
+    save_dirname = './saved_infer_model'
+    # Save the transformed model
+    static_layer.save_inference_model(save_dirname, feed=[0], fetch=[0])
+Load model and run it in static graph mode:
+.. code-block:: python
+    place = paddle.CPUPlace()
+    exe = paddle.Executor(place)
+    program, feed_vars, fetch_vars = paddle.io.load_inference_model(save_dirname, exe)
+    fetch, = exe.run(program, feed={feed_vars[0]: in_np}, fetch_list=fetch_vars)
+However, tracing records the operators of only a single run. If the user's code contains Tensor-dependent control flow (depending on a Tensor's value or shape), so that different Tensors cause different operators to be executed, TracedLayer cannot handle it. For instance:
+.. code-block:: python
+    import numpy as np
+    import paddle
+    def func(input_var):
+        # if condition depends on the shape of input_var
+        if input_var.shape[0] > 1:
+            return paddle.cast(input_var, "float64")
+        else:
+            return paddle.cast(input_var, "int64")
+    paddle.disable_static()
+    in_np = np.array([-2]).astype('int')
+    input_var = paddle.to_tensor(in_np)
+    out = func(input_var)
+If we apply ``TracedLayer.trace(func, inputs=[input_var])`` to the example above, tracing records the operators of only one branch of the if-else, so the model cannot be saved as the user originally intended. The same applies to while/for loops.
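The limitation can be seen without PaddlePaddle at all. The following is a toy sketch in plain Python, not TracedLayer's implementation: ``cast``, ``trace`` and ``recorded_ops`` are hypothetical stand-ins that only log operator names. A single run records one branch, so the recorded operator list depends on the example input.

```python
# Toy tracer: records the names of the "operators" executed in one run.
# Hypothetical illustration only; this is not how TracedLayer is implemented.
recorded_ops = []

def cast(values, dtype_name):
    """Stand-in for an operator: log its name, then convert each element."""
    recorded_ops.append("cast_to_" + dtype_name)
    convert = float if dtype_name == "float64" else int
    return [convert(v) for v in values]

def func(values):
    # The branch taken depends on the input's length (its "shape").
    if len(values) > 1:
        return cast(values, "float64")
    return cast(values, "int64")

def trace(fn, example_input):
    """Run fn once and return the operators that this single run executed."""
    recorded_ops.clear()
    fn(example_input)
    return list(recorded_ops)

trace_short = trace(func, [-2])       # only the int64 branch is recorded
trace_long = trace(func, [1, 2, 3])   # only the float64 branch is recorded
print(trace_short, trace_long)
```

Neither recorded list contains the condition itself, which is why a traced model replays whichever branch the example input happened to take.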
+Comparing ProgramTranslator and TracedLayer
+-------------------------------------------
+Compared with the tracing-based TracedLayer, the source-code-translation-based ProgramTranslator can handle Tensor-dependent control flow. We therefore recommend ProgramTranslator, with TracedLayer as a fallback when ProgramTranslator doesn't work.
--- a/doc/paddle/guides/dygraph_to_static/grammar_list_cn.rst
+++ b/doc/paddle/guides/dygraph_to_static/grammar_list_cn.rst
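-Grammars Supported by ProgramTranslator
+Supported Grammars
-=======================================
+==================
 ProgramTranslator essentially rewrites Python syntax into PaddlePaddle static graph code. However, Python syntax and PaddlePaddle static graphs differ in expressive power, so some code cannot be converted.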

--- a/doc/paddle/guides/dygraph_to_static/grammar_list_en.rst
+++ b/doc/paddle/guides/dygraph_to_static/grammar_list_en.rst
 Supported Grammars
-==================
+====================
 The key part of ProgramTranslator is transforming Python grammar into PaddlePaddle static graph code, but differences between Python and PaddlePaddle static graphs limit which code can be transformed.

--- a/doc/paddle/guides/dygraph_to_static/index_cn.rst
+++ b/doc/paddle/guides/dygraph_to_static/index_cn.rst
@@ -2,7 +2,21 @@
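 Dygraph to Static Graph
 #######################
- `Dygraph to Static Graph <program_translator_cn.html>`_ : Introduces the basic usage and the architecture of dygraph-to-static conversion
+Dygraph has many advantages, including easy-to-use APIs, a Pythonic programming experience, and a friendly interactive debugging mechanism. In dygraph mode, code executes in the order in which it is written. This matches the habits of Python programmers,
+makes it easy to turn ideas into working code quickly, and simplifies debugging. In terms of performance, however,
+Python has a high execution overhead and lags behind C++. Many industrial deployment scenarios (such as large recommender systems and mobile devices) therefore prefer to use C++ directly for speed.
+Compared with dygraph, static graph has a performance advantage in deployment. A static graph program first builds the model's
+neural network structure when it is compiled and executed, and then performs computation on that network. The pre-built network no longer depends on Python and can be parsed and executed again on the C++ side; having the whole network structure available also enables some structural optimizations.
+Dygraph code is easier to write and debug, while static graph has better deployment performance. We therefore added dygraph-to-static conversion, which lets users keep writing network code in dygraph. PaddlePaddle analyzes the user code and
+automatically converts it into a static graph network structure, combining the usability of dygraph with the deployment performance of static graph.
+The following links introduce each part of PaddlePaddle dygraph-to-static conversion:
+- `Basic Usage <basic_usage_cn.html>`_ : Introduces the basic usage of dygraph-to-static conversion
+- `Architecture <program_translator_cn.html>`_ : Introduces the architecture of dygraph-to-static conversion
 - `Supported Grammars <grammar_list_cn.html>`_ : Introduces the grammars supported by dygraph-to-static conversion and lists unsupported usages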
@@ -16,8 +30,10 @@
 ..  toctree::
    :hidden:
+   basic_usage_cn.rst
    program_translator_cn.rst
    grammar_list_cn.rst
    input_spec_cn.rst
    error_handling_cn.md
    debugging_cn.md
--- a/doc/paddle/guides/dygraph_to_static/index_en.rst
+++ b/doc/paddle/guides/dygraph_to_static/index_en.rst
@@ -2,25 +2,34 @@
 Dygraph to Static Graph
 #######################
- `Dygraph to Static Graph <program_translator_cn.html>`_ ：Introduce the basic usage for transforming dygraph code into static code and the architecture of ProgramTranslator.
+The imperative-style coding of PaddlePaddle offers flexibility, Pythonic coding, and an easy-to-debug interface. In dygraph mode, code executes kernels immediately and returns numerical results, so it runs in the familiar Python order. It is therefore efficient to turn ideas into real code and simple to debug. However, Python code is usually slower than C++, so many industrial systems (such as large recommender systems and mobile devices) prefer to deploy with a C++ implementation.
- `Supported Grammars <grammar_list_en.html>`_ ：Introduce the grammars supported by ProgramTranslator and list unsupported grammars.
+Static graph is better in speed and portability. It builds the network structure at compile time and then performs computation. The built intermediate representation of the network can be executed in C++ without any Python dependency.
- `Introduction of InputSpec <input_spec_en.html>`_ ：Introduce the usage of InputSpec to specify the input signature from dygraph to static program.
+Because dygraph offers usability and debugging benefits while static graph offers performance and deployment advantages, we added functionality to convert dygraph to static graph. Users write dygraph code in imperative mode, and PaddlePaddle analyzes the Python syntax and turns it into the network structure of static graph mode. This approach retains both the usability of dygraph and the portability of static graph.
- `Error Handling <error_handling_en.html>`_ ：Introduce the error handling by ProgramTranslator.
+We introduce the transformation of dygraph to static graph in the following links:
- `Debugging Methods <debugging_en.html>`_ ：Introduce the debugging methods when using ProgramTranslator.
+- `Basic Usage <basic_usage_en.html>`_ : Introduces the basic usage for transforming dygraph code into static code.
- `Error Handling <error_handling_en.html>`_ ：Introduce the error handling by ProgramTranslator.
+- `Architecture <program_translator_en.html>`_ : Introduces the architecture of ProgramTranslator.
+- `Supported Grammars <grammar_list_en.html>`_ : Introduces the grammars supported by ProgramTranslator and lists unsupported grammars.
+- `Introduction of InputSpec <input_spec_en.html>`_ : Introduces the usage of InputSpec to specify the input signature from dygraph to static program.
+- `Error Handling <error_handling_en.html>`_ : Introduces the error handling by ProgramTranslator.
+- `Debugging Methods <debugging_en.html>`_ : Introduces the debugging methods when using ProgramTranslator.
- `Debugging Methods <debugging_en.html>`_ ：Introduce the debugging methods when using ProgramTranslator.
 ..  toctree::
    :hidden:
+   basic_usage_en.rst
    program_translator_en.rst
    grammar_list_en.rst
    input_spec_en.rst
    error_handling_en.md
    debugging_en.md
--- a/doc/paddle/guides/dygraph_to_static/program_translator_cn.rst
+++ b/doc/paddle/guides/dygraph_to_static/program_translator_cn.rst
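-Dygraph to Static Graph
-=======================
-Dygraph has many advantages, including easy-to-use APIs, a Pythonic programming experience, and a friendly interactive debugging mechanism. In dygraph mode, code executes in the order in which it is written. This matches the habits of Python programmers, makes it easy to turn ideas into working code quickly, and simplifies debugging. In terms of performance, however, Python has a high execution overhead and lags behind C++. Many industrial deployment scenarios (such as large recommender systems and mobile devices) therefore prefer to use C++ directly for speed.
-Compared with dygraph, static graph has a performance advantage in deployment. A static graph program first builds the model's neural network structure when it is compiled and executed, and then performs computation on that network. The pre-built network no longer depends on Python and can be parsed and executed again on the C++ side; having the whole network structure available also enables some structural optimizations.
-Dygraph code is easier to write and debug, while static graph has better deployment performance. We therefore added dygraph-to-static conversion, which lets users keep writing network code in dygraph. PaddlePaddle analyzes the user code and automatically converts it into a static graph network structure, combining the usability of dygraph with the deployment performance of static graph.
-Basic Usage
--------------
-PaddlePaddle provides two ways to convert dygraph to static graph: the tracing-based TracedLayer and ProgramTranslator, which is based on source-level conversion.
-1. Tracing-based TracedLayer:
-Tracing means recording which operators are executed while a model runs. TracedLayer is based on this technique: during a single dygraph execution it records every operator that runs, then builds and saves a static graph model. A usage example follows:
-We first define a simple fully connected network: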
-.. code-block:: python
-    import numpy as np
-    import paddle
-    class SimpleFcLayer(paddle.nn.Layer):
-        def __init__(self, feature_size, batch_size, fc_size):
-            super(SimpleFCLayer, self).__init__()
-            self._linear = paddle.nn.Linear(feature_size, fc_size)
-            self._offset = paddle.to_tensor(
-                np.random.random((batch_size, fc_size)).astype('float32'))
-        def forward(self, x):
-            fc = self._linear(x)
-            return fc + self._offset
-Next, save the model with TracedLayer:
-.. code-block:: python
-    import paddle
-    from paddle.jit import TracedLayer
-    paddle.disable_static()
-    fc_layer = SimpleFcLayer(3, 4, 2)
-    in_np = np.random.random([3, 4]).astype('float32')
-    # Convert the numpy ndarray into a Tensor
-    input_var = paddle.to_tensor(in_np)
-    # Convert the imperative model into a declarative model with TracedLayer.trace
-    out_dygraph, static_layer = TracedLayer.trace(fc_layer, inputs=[input_var])
-    save_dirname = './saved_infer_model'
-    # Save the converted model
-    static_layer.save_inference_model(save_dirname, feed=[0], fetch=[0])
-The loaded model can be run in static graph mode:
-.. code-block:: python
-    place = paddle.CPUPlace()
-    exe = paddle.Executor(place)
-    program, feed_vars, fetch_vars = paddle.io.load_inference_model(save_dirname, exe)
-    fetch, = exe.run(program, feed={feed_vars[0]: in_np}, fetch_list=fetch_vars)
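-However, as the principle above implies, tracing only records the operators involved in a single execution. If the user's model code contains control flow branches that depend on data (including input values or shapes), so that different operators run depending on the data, TracedLayer cannot work correctly. For example: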
-.. code-block:: python
-    import paddle
-    def func(input_var)
-        # the if condition depends on the shape of input_var
-        if input_var.shape[0] > 1:
-            return paddle.cast(input_var, "float64")
-        else:
-            return paddle.cast(input_var, "int64")
-    paddle.disable_static()
-    in_np = np.array([-2]).astype('int')
-    input_var = paddle.to_tensor(in_np)
-    out = func(input_var)
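-If TracedLayer.trace(func, inputs=[input_var]) is applied to the example above, tracing can only record the operators of the one if-else branch that actually ran, so the model cannot be saved with the if-else control flow on the shape of input_var that the user intended. The same applies to while/for loops.
-2. Source-code-based ProgramTranslator
-For data-dependent control flow, we use the source-code-based ProgramTranslator to convert dygraph to static graph. It analyzes the Python code, rewrites the dygraph code into static graph code, and automatically runs it with the executor under the hood. Basic usage is simple: add the decorator ``@paddle.jit.to_static`` before the function to convert (which can also be the ``forward`` function of a user-defined dygraph Layer). The example above is converted as follows, and the function can still be called to get the result: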
-.. code-block:: python
-    import paddle
-    @paddle.jit.to_static
-    def func(input_var)
-        # the if condition depends on the shape of input_var
-        if input_var.shape[0] > 1:
-            out = paddle.cast(input_var, "float64")
-        else:
-            out = paddle.cast(input_var, "int64")
-    paddle.disable_static()
-    in_np = np.array([-2]).astype('int')
-    input_var = paddle.to_tensor(in_np)
-    func(input_var)
-To save the converted static graph model, call ``paddle.jit.save`` . Taking SimpleFcLayer as an example again, add the decorator to its ``forward`` function:
-.. code-block:: python
-    import numpy as np
-    import paddle
-    class SimpleFcLayer(paddle.nn.Layer):
-        def __init__(self, feature_size, batch_size, fc_size):
-            super(SimpleFCLayer, self).__init__()
-            self._linear = paddle.nn.Linear(feature_size, fc_size)
-            self._offset = paddle.to_tensor(
-                np.random.random((batch_size, fc_size)).astype('float32'))
-        @paddle.jit.to_static
-        def forward(self, x):
-            fc = self._linear(x)
-            return fc + self._offset
-The model can be saved with the paddle.jit.save API:
-.. code-block:: python
-    import paddle
-    paddle.disable_static()
-    fc_layer = SimpleFcLayer(3, 4, 2)
-    in_np = np.random.random([3, 4]).astype('float32')
-    input_var = paddle.to_tensor(in_np)
-    out = fc_layer(input_var)
-    paddle.jit.save(fc_layer, "./fc_layer_dy2stat", input_spec=[input_var])
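 Architecture
--------------
+==============
 TracedLayer works by tracing, which is relatively simple, so we do not elaborate on it here. This section focuses on how ProgramTranslator converts dygraph code into static graph code based on source code.
 The conversion is implemented in the decorator and happens when the user calls the decorated function. We divide the internal process into the following steps:
-1. Function and cache
+Function and cache
+--------------------
 The unit of dygraph-to-static conversion is the function. For PaddlePaddle APIs inside the function that only compute with operators, no extra conversion is needed, because the dygraph and static graph APIs are identical. In dygraph, however, these APIs run the computation and return results directly, whereas in static graph they build the network. If the converted function is called several times, the network would be built several times and the corresponding operators added repeatedly, which clearly causes problems. To solve this and to speed up conversion, we maintain a cache that maps the decorated function, together with its input shapes and data types (dtype), to the Program built by the conversion. If the function to convert hits the cache, we run the stored Program in static graph mode to get the result; otherwise we convert the statements and store the resulting Program in the cache.
-2. From dygraph source code to AST (Abstract Syntax Tree)
+From dygraph source code to AST (Abstract Syntax Tree)
+--------------------------------------------------------
 The core of dygraph-to-static conversion resembles a compiler: it parses the dygraph code into an AST, rewrites the AST, and finally turns it back into static graph code. The source code string of a function can be obtained with Python's ``inspect.getsource``. Python ships the `ast <https://docs.python.org/3/library/ast.html>`_ library for parsing a string into an AST, but the grammars of Python 2 and Python 3 differ slightly. To avoid handling these differences ourselves, we use the open source `gast <https://github.com/serge-sans-paille/gast>`_ library, which unifies AST handling across Python 2 and Python 3. With these APIs, converting a function into an AST poses no essential difficulty.
-3. Transform AST and turn it to static graph code
+Transform AST and turn it to static graph code
+------------------------------------------------
 This is the core of the conversion: we rewrite the AST for each supported grammar. The most important Python control flows, if-else, while, and for loops, are analyzed and converted into PaddlePaddle static graph APIs such as cond and while_loop. We create one Transformer for each major grammar to convert (Transformer here is the Python AST rewriting concept, not the Transformer of natural language processing). Each Transformer scans the AST once and rewrites it accordingly. Finally, we turn the converted AST back into source code with the API provided by gast.
-4. Running static graph code as part of dygraph
+Running static graph code as part of dygraph
+----------------------------------------------
 To make the conversion easier to use and to let the converted code be reused in dygraph, we run the generated source code to build a Program and wrap this Program as one large dygraph op. User code can thus be converted to static graph for speed or saved for deployment, and users who want to use the generated static graph code at the Python level as part of a dygraph program, to continue training or perform other dygraph computation, can do so directly.
-5. Usability and debug features in the conversion
+Usability and debug features in the conversion
+------------------------------------------------
 AST rewriting resembles a compiler, and compilers usually provide features such as debug breakpoints, error reporting, and printing intermediate code. During conversion, the user's dygraph code may contain errors, or the user may want to debug with breakpoints or check whether the converted static graph code meets expectations. We want to provide these usability features as a compiler does, so that the conversion offers usability as well as performance and deployment benefits. Here we list how these features are implemented.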

--- a/doc/paddle/guides/dygraph_to_static/program_translator_en.rst
+++ b/doc/paddle/guides/dygraph_to_static/program_translator_en.rst
-Dygraph to Static Graph
-=======================
-The imperative-style coding of PaddlePaddle takes advantage of flexibility, Pythonic coding, and easy-to-debug interface. In dygraph mode, code immediately executes kernels and gets numerical results, which allows users to enjoy traditional Pythonic code order. Therefore it is efficient to transform idea into real code and simple to debug. However, Python code is usually slower than C++ thus lots of industrial systems (such as large recommend system, mobile devices) prefer to deploy with C++ implementation.
-Static graph is better at speed and portability. Static graph builds the network structure during compiling time and then does computation. The built network intermediate representation can be executed in C++ and gets rids of Python dependency.
-While dygraph has usability and debug benefits and static graph yields performance and deployment advantage, we adds functionality to convert dygraph to static graph. Users use imperative mode to write dygraph code and PaddlePaddle will analyze the Python syntax and turn it into network structure of static graph mode. Our approach retains both the usability of dygraph and portability of static graph.
-Basic Usage
--------------
-PaddlePaddle has two ways to transform dygraph to static graph. TracedLayer extracts computation graph through tracing and ProgramTranslator gets computation graph through source code transformation.
-1. TracedLayer：
-Tracing means recording the operators when running a model. TracedLayer is based on this technique. It runs dygraph program once and records all operators, then constructs static graph model and saves it. Now take a glance at an usage example:
-Define a simple fully connected network:
-.. code-block:: python
-    import numpy as np
-    import paddle
-    class SimpleFcLayer(paddle.nn.Layer):
-        def __init__(self, feature_size, batch_size, fc_size):
-            super(SimpleFCLayer, self).__init__()
-            self._linear = paddle.nn.Linear(feature_size, fc_size)
-            self._offset = paddle.to_tensor(
-                np.random.random((batch_size, fc_size)).astype('float32'))
-        def forward(self, x):
-            fc = self._linear(x)
-            return fc + self._offset
-Save model by TracedLayer:
-.. code-block:: python
-    import paddle
-    from paddle.jit import TracedLayer
-    paddle.disable_static()
-    fc_layer = SimpleFcLayer(3, 4, 2)
-    in_np = np.random.random([3, 4]).astype('float32')
-    # Turn numpy ndarray into Tensor
-    input_var = paddle.to_tensor(in_np)
-    # Transforming imperative mode into declarative mode by TracerLayer.trace
-    out_dygraph, static_layer = TracedLayer.trace(fc_layer, inputs=[input_var])
-    save_dirname = './saved_infer_model'
-    # Save the transformed model
-    static_layer.save_inference_model(save_dirname, feed=[0], fetch=[0])
-Load model and run it in static graph mode:
-.. code-block:: python
-    place = paddle.CPUPlace()
-    exe = paddle.Executor(place)
-    program, feed_vars, fetch_vars = paddle.io.load_inference_model(save_dirname, exe)
-    fetch, = exe.run(program, feed={feed_vars[0]: in_np}, fetch_list=fetch_vars)
-However, as tracing only records operators once, if user's code contains Tensor-dependent (including Tensor value or Tensor shape) control flow, that is the Tensor can cause different operators being executed, then TracedLayer cannot handle this case. For instance:
-.. code-block:: python
-    import paddle
-    def func(input_var)
-        # if condition depends on the shape of input_var
-        if input_var.shape[0] > 1:
-            return paddle.cast(input_var, "float64")
-        else:
-            return paddle.cast(input_var, "int64")
-    paddle.disable_static()
-    in_np = np.array([-2]).astype('int')
-    input_var = paddle.to_tensor(in_np)
-    out = func(input_var)
-If we apply TracedLayer.trace(func, inputs=[input_var]) on above example, tracing can take record of operators in only one branch of if-else, then the model can not be saved as what user orignally means. The similar situations applies to while/for loop.
-2. ProgramTranslator
-For the Tensor-dependent control flow, we use source-code-translate based ProgramTranslator to convert dygraph into static graph. The basic idea is analyzing Python source code and turning into static graph code, then run the static graph code using Executor. The basic usage of ProgramTranslator is simple, put a decorator ``@paddle.jit.to_static`` before the definition of the function to transform (the function can also be a method of a class, e.g., the ``forward`` function of user-defined imperative Layer). Above Tensor-dependent example can be transformed correctly by ProgramTranslator as below:
-.. code-block:: python
-    import paddle
-    @paddle.jit.to_static
-    def func(input_var)
-        # if condition depends on the shape of input_var
-        if input_var.shape[0] > 1:
-            out = paddle.cast(input_var, "float64")
-        else:
-            out = paddle.cast(input_var, "int64")
-    paddle.disable_static()
-    in_np = np.array([-2]).astype('int')
-    input_var = paddle.to_tensor(in_np)
-    func(input_var)
-To save the transformed model, we can call ``paddle.jit.save`` . Let's take ``SimpleFcLayer`` as an example again, we put decorator at the ``forward`` method of ``SimpleFcLayer`` :
-.. code-block:: python
-    import numpy as np
-    import paddle
-    class SimpleFcLayer(paddle.nn.Layer):
-        def __init__(self, feature_size, batch_size, fc_size):
-            super(SimpleFCLayer, self).__init__()
-            self._linear = paddle.nn.Linear(feature_size, fc_size)
-            self._offset = paddle.to_tensor(
-                np.random.random((batch_size, fc_size)).astype('float32'))
-        @paddle.jit.to_static
-        def forward(self, x):
-            fc = self._linear(x)
-            return fc + self._offset
-Calling ``paddle.jit.save`` to save above model:
-.. code-block:: python
-    import paddle
-    paddle.disable_static()
-    fc_layer = SimpleFcLayer(3, 4, 2)
-    in_np = np.random.random([3, 4]).astype('float32')
-    input_var = paddle.to_tensor(in_np)
-    out = fc_layer(input_var)
-    paddle.jit.save(fc_layer, "./fc_layer_dy2stat")
 Architecture
--------------
+==============
 TracedLayer is based on tracing, which is relatively simple, so we won't expand on it here. This section covers the source code transformation performed by ProgramTranslator.
 The transformation is implemented in the decorator, so it happens when the user calls the decorated function. The procedure includes these steps:
-1. Function and cache.
+Function and cache
+--------------------
 The unit of transformation from dygraph to static graph is the decorated function. The PaddlePaddle APIs called in the function are written identically in dygraph and static graph mode, so that code needs no transformation. However, those APIs perform computation in dygraph mode, whereas in static graph mode they build the network. If a transformed function were called multiple times, those APIs would build the network multiple times in the static graph, which causes problems. To solve this and to speed up the transformation, we maintain a cache that maps the function, its input shapes, and its input data types to the Program built by the transformed function. If the function hits the cache, we run the stored Program in static graph mode to get the result; otherwise we transform the function's code and store the resulting Program in the cache.
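The caching idea can be sketched in plain Python. This is only an illustration under stated assumptions, not ProgramTranslator's actual data structure: ``program_cache``, ``cache_key``, ``build_program`` and ``get_program`` are hypothetical names, and each input is described by a plain ``(shape, dtype)`` pair instead of a Tensor.

```python
# Hypothetical sketch of a (function, input shapes, input dtypes) -> program cache.
program_cache = {}
stats = {"builds": 0}

def build_program(fn):
    """Stand-in for the expensive source transformation and network building."""
    stats["builds"] += 1
    return fn  # a real system would return a static-graph Program here

def cache_key(fn, inputs):
    # Each input is described by a (shape, dtype) pair of plain Python values.
    return (fn, tuple((tuple(shape), dtype) for shape, dtype in inputs))

def get_program(fn, inputs):
    key = cache_key(fn, inputs)
    if key not in program_cache:      # cache miss: transform and store
        program_cache[key] = build_program(fn)
    return program_cache[key]         # cache hit: reuse the stored program

def forward(x):
    return x

get_program(forward, [((3, 4), "float32")])
get_program(forward, [((3, 4), "float32")])   # same signature: cache hit
get_program(forward, [((5, 4), "float32")])   # new input shape: cache miss
print(stats["builds"], len(program_cache))
```

Keying on shapes and dtypes as well as the function means a call with a new input signature triggers a fresh build, while repeated calls with the same signature never rebuild the network.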
-2. From dygraph source code to AST (Abstract Syntax Tree)
+From dygraph source code to AST (Abstract Syntax Tree)
+--------------------------------------------------------
-The core of transforming dygraph to static graph is similar to a compiler, we parse the dygraph code into AST, change AST, then turn it back into static graph code. We use Python ``inspect.getsource`` to get the source code string of the function. Python provides ``ast`` library to parse string code into AST, but Python2, Python3 have slight grammar difference. To avoid the work to handle different grammars, we used an open source AST library `gast <https://github.com/serge-sans-paille/gast>`_ that provides compatibility AST among various Python versions. There is no essential difficulty to turn function into AST with these library.
+The core of transforming dygraph to static graph resembles a compiler: we parse the dygraph code into an AST, modify the AST, then turn it back into static graph code. We use Python's ``inspect.getsource`` to get the source code string of the function. Python provides the `ast <https://docs.python.org/3/library/ast.html>`_ library to parse a code string into an AST, but the grammars of Python 2 and Python 3 differ slightly. To avoid handling the different grammars ourselves, we use `gast <https://github.com/serge-sans-paille/gast>`_ , an open source AST library that provides a compatible AST across Python versions. With these libraries, turning a function into an AST poses no essential difficulty.
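As a rough illustration of the parsing step, the sketch below uses the standard-library ``ast`` module rather than ``gast``, and a literal source string rather than the result of ``inspect.getsource``, so that it runs without a source file. The function being parsed is a made-up example.

```python
import ast
import textwrap

# In the real pipeline the string would come from inspect.getsource(func);
# a literal is used here so the sketch does not depend on a source file.
source = textwrap.dedent("""
    def func(x):
        if x > 1:
            out = x * 2.0
        else:
            out = x + 1
        return out
""")

tree = ast.parse(source)
func_def = tree.body[0]
# Collect the statement types in the function body, in order.
statement_types = [type(node).__name__ for node in func_def.body]
print(func_def.name, statement_types)
```

The ``If`` node found here is the kind of node that a later rewriting pass would replace with a static graph conditional.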
-3. Transform AST and turn it to static graph code
+Transform AST and turn it to static graph code
+------------------------------------------------
 This is the key part of ProgramTranslator: we modify the AST for each supported grammar. Important Python control flows such as ``if-elif-else``, ``while``, and ``for`` loops are converted to PaddlePaddle static graph APIs such as ``cond`` and ``while_loop``. We create one Transformer (an AST-to-AST transformer in Python, not the Transformer of natural language processing) for each grammar. Every Transformer scans the AST and modifies it. Lastly, we turn the AST back into a source code string with the ``gast`` library.
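A minimal sketch of one such AST-to-AST pass is shown below, written against the standard-library ``ast`` module (Python 3.8 or later) instead of ``gast``. It is not ProgramTranslator's transformer: ``IfToCond`` is a hypothetical class that handles only the simplest ``if``/``else`` shape, and ``cond`` is a pure-Python stand-in for a static graph conditional API. The rewritten tree is compiled directly rather than being turned back into source text.

```python
import ast

def cond(pred, true_fn, false_fn):
    """Pure-Python stand-in for a static-graph conditional API."""
    return true_fn() if pred else false_fn()

class IfToCond(ast.NodeTransformer):
    """Rewrite `if t: v = a  else: v = b` into `v = cond(t, lambda: a, lambda: b)`."""

    def visit_If(self, node):
        self.generic_visit(node)  # rewrite nested statements first
        body, orelse = node.body, node.orelse
        simple = (
            len(body) == 1 and len(orelse) == 1
            and isinstance(body[0], ast.Assign)
            and isinstance(orelse[0], ast.Assign)
            and len(body[0].targets) == 1 and len(orelse[0].targets) == 1
            and ast.dump(body[0].targets[0]) == ast.dump(orelse[0].targets[0])
        )
        if not simple:
            return node  # leave anything more complex untouched
        no_args = ast.arguments(posonlyargs=[], args=[], kwonlyargs=[],
                                kw_defaults=[], defaults=[])
        call = ast.Call(
            func=ast.Name(id="cond", ctx=ast.Load()),
            args=[node.test,
                  ast.Lambda(args=no_args, body=body[0].value),
                  ast.Lambda(args=no_args, body=orelse[0].value)],
            keywords=[],
        )
        new_assign = ast.Assign(targets=body[0].targets, value=call)
        return ast.copy_location(new_assign, node)

source = """
def func(n):
    if n > 1:
        out = float(n)
    else:
        out = int(n) - 1
    return out
"""

tree = ast.fix_missing_locations(IfToCond().visit(ast.parse(source)))
namespace = {"cond": cond}
exec(compile(tree, "<transformed>", "exec"), namespace)
transformed_func = namespace["func"]
print(transformed_func(3), transformed_func(1))
```

Wrapping each branch in a lambda keeps both branches available to ``cond``, which is what lets a graph-building conditional record both sides instead of only the one taken.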
-4. Running static graph code as part of dygraph
+Running static graph code as part of dygraph
+----------------------------------------------
 To improve usability and to reuse the transformed static graph code in dygraph, we wrap the generated Program as a dygraph op that can run the forward and backward computation of the transformed Program. This not only speeds up dygraph code or saves it for deployment, but also lets users run part of their dygraph code in static graph mode and continue training or other dygraph computation around it.
-5. Error handling and Debug
+Error handling and Debug
+--------------------------
 Compilers usually support debugging functionality such as breakpoints, exceptions, and printing intermediate code. ProgramTranslator is similar to a compiler: users may want to set breakpoints for debugging or check whether the transformed static graph code is as expected. So we also implemented error handling and debugging functionality. Here we list those functions and their implementation.