Unverified commit ed7c4a38, author: Huihuang Zheng, committer: GitHub

[Dy2stat] Add Basic English Usage Guide (#2569)

We added a basic English usage guide. In addition, some Chinese sentences did not read well, so I modified them.

TODO: Yamei suggested splitting this file into basic usage and architecture. I will do it in upcoming PRs
Parent 5858c87f
Dygraph to Static Graph
- `Dygraph to Static Graph <program_translator_en.html>`_ : Introduces the basic usage of transforming dygraph code into static graph code and the architecture of ProgramTranslator.
.. toctree::
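1. Trace-based TracedLayer:
......@@ -88,7 +88,7 @@ Tracing means recording which operators a model runs during execution. TracedLayer
2. ProgramTranslator based on source code transformation
For data-dependent control flow, we use the source-code-transformation-based ProgramTranslator to convert dygraph to static graph. Its basic principle is to analyze the Python code, rewrite the dygraph code as static graph code, and automatically run it with the executor under the hood. Basic usage is simple: add the decorator ``@paddle.jit.to_static`` before the function to convert (the function can also be the forward function of a user-defined dygraph Layer). The example above is converted as follows, and the function can still be called to get the result: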
.. code-block:: python
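......@@ -108,7 +108,7 @@ Tracing means recording which operators a model runs during execution. TracedLayer
To save the converted static graph model, call ``paddle.jit.save`` . Taking SimpleFcLayer as an example again, add the decorator to the forward function of SimpleFcLayer: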
.. code-block:: python
......@@ -141,7 +141,7 @@ Tracing means recording which operators a model runs during execution. TracedLayer
input_var = paddle.to_tensor(in_np)
out = fc_layer(input_var)
paddle.jit.save(mnist, "./mnist_dy2stat", input_spec=[input_var])
paddle.jit.save(fc_layer, "./fc_layer_dy2stat", input_spec=[input_var])
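......@@ -157,21 +157,21 @@ The principle of TracedLayer is tracing, which is relatively simple, so we do not expand on it here
2. From dygraph source code to AST (Abstract Syntax Tree)
The core of dygraph-to-static conversion resembles a compiler: it parses dygraph code into an AST, rewrites the AST, and finally converts it back into static graph code. A function can be converted to a source string with Python's inspect.getsource. Python provides the built-in ast library to parse a string into an `AST <https://docs.python.org/3/library/ast.html>`_ , but python2 and python3 have slightly different syntax. To avoid handling these python2 and python3 differences ourselves, we use the open-source AST library `gast <https://github.com/serge-sans-paille/gast>`_ , which unifies python2 and python3. With these interfaces, converting a function to an AST poses no essential difficulty.
The core of dygraph-to-static conversion resembles a compiler: it parses dygraph code into an AST, rewrites the AST, and finally converts it back into static graph code. A function can be converted to a source string with Python's inspect.getsource. Python provides the built-in `ast <https://docs.python.org/3/library/ast.html>`_ library to parse a string into an AST, but Python2 and Python3 have slightly different syntax. To avoid handling these Python2 and Python3 differences ourselves, we use the open-source AST library `gast <https://github.com/serge-sans-paille/gast>`_ , which unifies Python2 and Python3. With these interfaces, converting a function to an AST poses no essential difficulty.
3. AST rewriting and conversion to static graph source code
This is the core part of dygraph-to-static conversion: we rewrite the ast for each supported syntax. The most important python control flows, if-else, while, and for loops, are analyzed and converted into PaddlePaddle static graph interfaces such as cond and while_loop. We create one Transformer for each main syntax to convert (here Transformer is a python ast rewriting concept, not the Transformer from natural language processing), and each Transformer scans the AST once and rewrites it accordingly. Finally, we use the interface provided by gast to turn the converted AST back into source code.
This is the core part of dygraph-to-static conversion: we rewrite the ast for each supported syntax. The most important Python control flows, if-else, while, and for loops, are analyzed and converted into PaddlePaddle static graph interfaces such as cond and while_loop. We create one Transformer for each main syntax to convert (here Transformer is a Python ast rewriting concept, not the Transformer from natural language processing), and each Transformer scans the AST once and rewrites it accordingly. Finally, we use the interface provided by gast to turn the converted AST back into source code.
4. Running static graph source code as part of dygraph
5. Usability and debugging features in the dygraph-to-static process
A. Map errors to dygraph code lines. Because the converted static graph code differs from the original dygraph code, python reports static graph errors when a run fails. We therefore attach information such as the original dygraph code line to the AST nodes during every AST rewrite, and convert static graph errors in the python error stack into errors on the corresponding dygraph source
A. Map errors to dygraph code lines. Because the converted static graph code differs from the original dygraph code, Python reports static graph errors when a run fails. We therefore attach information such as the original dygraph code line to the AST nodes during every AST rewrite, and convert static graph errors in the Python error stack into errors on the corresponding dygraph source
B. Breakpoints. We keep pdb.set_trace() calls in the converted code, so users can debug with breakpoints this way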
Dygraph to Static Graph
PaddlePaddle's imperative-style coding offers flexibility, Pythonic code, and an easy-to-debug interface. In dygraph mode, code executes kernels immediately and returns numerical results, so users can write code in the usual Python order. This makes it efficient to turn an idea into real code and simple to debug. However, Python code is usually slower than C++, so many industrial systems (such as large recommender systems and mobile devices) prefer to deploy with a C++ implementation.
Static graph is better in speed and portability. It builds the network structure at compile time and then performs the computation. The resulting intermediate representation of the network can be executed in C++, which gets rid of the Python dependency.
Because dygraph has usability and debugging benefits while static graph offers performance and deployment advantages, we add functionality to convert dygraph to static graph. Users write dygraph code in imperative mode, and PaddlePaddle analyzes the Python syntax and turns it into the network structure of static graph mode. This approach retains both the usability of dygraph and the portability of static graph.
Basic Usage
PaddlePaddle has two ways to transform dygraph to static graph: TracedLayer extracts the computation graph through tracing, and ProgramTranslator obtains the computation graph through source code transformation.
1. TracedLayer:
Tracing means recording the operators that are executed when a model runs. TracedLayer is based on this technique: it runs the dygraph program once, records all operators, then constructs a static graph model and saves it. Here is a usage example:
Define a simple fully connected network:
.. code-block:: python

    import numpy as np
    import paddle

    class SimpleFcLayer(paddle.nn.Layer):
        def __init__(self, feature_size, batch_size, fc_size):
            super(SimpleFcLayer, self).__init__()
            self._linear = paddle.nn.Linear(feature_size, fc_size)
            self._offset = paddle.to_tensor(
                np.random.random((batch_size, fc_size)).astype('float32'))

        def forward(self, x):
            fc = self._linear(x)
            return fc + self._offset

Save the model with TracedLayer:

.. code-block:: python

    import numpy as np
    import paddle
    from paddle.jit import TracedLayer

    # Arguments are feature_size=3, batch_size=4, fc_size=2
    fc_layer = SimpleFcLayer(3, 4, 2)
    # The input shape must be [batch_size, feature_size]
    in_np = np.random.random([4, 3]).astype('float32')
    # Turn numpy ndarray into Tensor
    input_var = paddle.to_tensor(in_np)
    # Transform imperative mode into declarative mode with TracedLayer.trace
    out_dygraph, static_layer = TracedLayer.trace(fc_layer, inputs=[input_var])
    save_dirname = './saved_infer_model'
    # Save the transformed model
    static_layer.save_inference_model(save_dirname, feed=[0], fetch=[0])

Load the model and run it in static graph mode:
.. code-block:: python
place = paddle.CPUPlace()
exe = paddle.Executor(place)
program, feed_vars, fetch_vars = paddle.io.load_inference_model(save_dirname, exe)
fetch, = exe.run(program, feed={feed_vars[0]: in_np}, fetch_list=fetch_vars)
However, tracing records operators only once. If the user's code contains Tensor-dependent control flow (depending on a Tensor value or a Tensor shape), meaning that the Tensor can cause different operators to be executed, TracedLayer cannot handle it. For instance:

.. code-block:: python

    import numpy as np
    import paddle

    def func(input_var):
        # The if condition depends on the shape of input_var
        if input_var.shape[0] > 1:
            return paddle.cast(input_var, "float64")
        return paddle.cast(input_var, "int64")

    in_np = np.array([-2]).astype('int')
    input_var = paddle.to_tensor(in_np)
    out = func(input_var)

If we apply ``TracedLayer.trace(func, inputs=[input_var])`` to the example above, tracing records the operators in only one branch of the if-else, so the saved model is not what the user originally intended. The same applies to while/for loops.
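The limitation can be illustrated with a toy tracer in plain Python. This is only a sketch and does not use PaddlePaddle: ``recorded_ops``, ``cast`` and ``func`` below are hypothetical stand-ins, not TracedLayer internals.

.. code-block:: python

    # Toy illustration only; every name here is hypothetical.
    recorded_ops = []

    def cast(values, dtype):
        # Record the operator name, then compute eagerly, as a tracer would.
        recorded_ops.append("cast_to_" + dtype)
        convert = float if dtype == "float64" else int
        return [convert(v) for v in values]

    def func(values):
        # Data-dependent branch: only the branch that runs gets recorded.
        if len(values) > 1:
            return cast(values, "float64")
        return cast(values, "int64")

    func([-2])                    # trace once with a single-element input
    trace = list(recorded_ops)    # the recorded "graph" is this fixed list

The trace holds only the ``int64`` cast, so replaying it for a two-element input would still run the ``int64`` branch.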
2. ProgramTranslator
For Tensor-dependent control flow, we use the source-code-transformation-based ProgramTranslator to convert dygraph into static graph. The basic idea is to analyze the Python source code, turn it into static graph code, and then run the static graph code with Executor. The basic usage of ProgramTranslator is simple: put the decorator ``@paddle.jit.to_static`` before the definition of the function to transform (the function can also be a method of a class, e.g., the ``forward`` function of a user-defined imperative Layer). The Tensor-dependent example above can be transformed correctly by ProgramTranslator as below:

.. code-block:: python

    import numpy as np
    import paddle

    @paddle.jit.to_static
    def func(input_var):
        # The if condition depends on the shape of input_var
        if input_var.shape[0] > 1:
            out = paddle.cast(input_var, "float64")
        else:
            out = paddle.cast(input_var, "int64")
        return out

    in_np = np.array([-2]).astype('int')
    input_var = paddle.to_tensor(in_np)
    out = func(input_var)

To save the transformed model, we can call ``paddle.jit.save`` . Taking ``SimpleFcLayer`` as an example again, we put the decorator on the ``forward`` method of ``SimpleFcLayer`` :

.. code-block:: python

    import numpy as np
    import paddle

    class SimpleFcLayer(paddle.nn.Layer):
        def __init__(self, feature_size, batch_size, fc_size):
            super(SimpleFcLayer, self).__init__()
            self._linear = paddle.nn.Linear(feature_size, fc_size)
            self._offset = paddle.to_tensor(
                np.random.random((batch_size, fc_size)).astype('float32'))

        @paddle.jit.to_static
        def forward(self, x):
            fc = self._linear(x)
            return fc + self._offset

Call ``paddle.jit.save`` to save the above model:

.. code-block:: python

    import numpy as np
    import paddle

    fc_layer = SimpleFcLayer(3, 4, 2)
    # The input shape must be [batch_size, feature_size]
    in_np = np.random.random([4, 3]).astype('float32')
    input_var = paddle.to_tensor(in_np)
    out = fc_layer(input_var)
    paddle.jit.save(fc_layer, "./fc_layer_dy2stat")

The basic idea of TracedLayer is tracing, which is relatively simple, so we won't expand on it here. This section covers the source code transformation of ProgramTranslator.
The transformation is implemented in the decorator, so it happens when the user calls the decorated function. The procedure includes these steps:
1. Function and cache.
The entity that is transformed from dygraph to static graph is the decorated function. PaddlePaddle APIs in the function are the same code in dygraph mode and static graph mode, so they need no transformation. However, those APIs perform computation in dygraph mode while they build the network in static graph mode. If a transformed function is called multiple times, those APIs build the network multiple times in static graph mode, which can cause problems. To solve this and to speed up the transformation, we maintain a cache that maps the function, input shapes, and input data types to the Program built by the transformed function. If a call hits the cache, we run the stored Program in static graph mode to get the result; otherwise we transform the function's code and store the resulting Program in the cache.
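The caching idea can be sketched in plain Python as follows. This is a minimal illustration under assumed names (``program_cache``, ``build_program``, ``get_program``), not the actual ProgramTranslator cache; each input is described here by a made-up ``(shape, dtype)`` pair.

.. code-block:: python

    # Hypothetical sketch; not the actual ProgramTranslator cache.
    program_cache = {}
    stats = {"builds": 0}

    def build_program(func):
        # Stand-in for the expensive code transformation and graph building.
        stats["builds"] += 1
        return func

    def cache_key(func, inputs):
        # The key combines the function with each input's shape and dtype.
        return (func, tuple((tuple(shape), dtype) for shape, dtype in inputs))

    def get_program(func, inputs):
        key = cache_key(func, inputs)
        if key not in program_cache:
            program_cache[key] = build_program(func)
        return program_cache[key]

    def double(x):
        return x * 2

    get_program(double, [((3, 4), "float32")])   # miss: program is built
    get_program(double, [((3, 4), "float32")])   # hit: nothing is rebuilt
    get_program(double, [((5, 4), "float32")])   # new shape: built again

Keying on shapes and data types means a call with a new input signature triggers a new build, while repeated calls with the same signature reuse the stored result.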
2. From dygraph source code to AST (Abstract Syntax Tree)
The core of transforming dygraph to static graph is similar to a compiler: we parse the dygraph code into an AST, modify the AST, then turn it back into static graph code. We use Python's ``inspect.getsource`` to get the source code string of the function. Python provides the ``ast`` library to parse a code string into an AST, but Python2 and Python3 have slight grammar differences. To avoid handling the different grammars ourselves, we use the open source AST library `gast <https://github.com/serge-sans-paille/gast>`_ , which provides a compatible AST across Python versions. With these libraries, there is no essential difficulty in turning a function into an AST.
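As a small illustration of this step, the sketch below parses a source string with the standard ``ast`` module (used here instead of ``gast`` so that the example needs no extra dependency). The function ``func`` inside the string is made up for the example.

.. code-block:: python

    import ast

    # For a function defined in a file, inspect.getsource(func) would
    # return a string like this one.
    source = (
        "def func(x):\n"
        "    if x > 1:\n"
        "        return float(x)\n"
        "    return int(x)\n"
    )

    tree = ast.parse(source)
    func_def = tree.body[0]
    # Names of the top-level statement nodes in the function body
    node_types = [type(stmt).__name__ for stmt in func_def.body]

Here ``node_types`` lists the statement nodes of the function body, and the ``If`` node is what a later control flow rewrite would look for.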
3. Transform AST and turn it to static graph code
This is the key part of ProgramTranslator: we modify the AST for each supported grammar. The important Python control flows, such as ``if-elif-else``, ``while``, and ``for`` loops, are converted to PaddlePaddle static graph APIs such as ``cond`` and ``while_loop``. We created a Transformer (an AST-to-AST Transformer in Python, not the Transformer in Natural Language Processing) for each grammar to transform. Every Transformer scans the AST and modifies it. Lastly, we turn the AST back into a source code string with the ``gast`` library.
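The Transformer pattern can be sketched with the standard ``ast.NodeTransformer``. The example below is a simplified assumption, not the ProgramTranslator implementation: it rewrites only conditional expressions (``a if c else b``) into calls to a hypothetical plain-Python ``cond`` helper, and it needs Python 3.9 or later for ``ast.unparse``.

.. code-block:: python

    import ast

    def cond(pred, true_fn, false_fn):
        # Hypothetical stand-in for a static graph conditional API.
        return true_fn() if pred else false_fn()

    class IfExpTransformer(ast.NodeTransformer):
        """Rewrite `a if c else b` into `cond(c, lambda: a, lambda: b)`."""

        def visit_IfExp(self, node):
            self.generic_visit(node)  # rewrite nested conditionals first
            call = ast.Call(
                func=ast.Name(id="cond", ctx=ast.Load()),
                args=[node.test, self._thunk(node.body), self._thunk(node.orelse)],
                keywords=[])
            return ast.copy_location(call, node)

        @staticmethod
        def _thunk(expr):
            # Build `lambda: <expr>` by parsing a template and swapping its body.
            thunk = ast.parse("lambda: None", mode="eval").body
            thunk.body = expr
            return thunk

    source = "def pick(x):\n    return x * 2 if x > 1 else x - 1\n"
    tree = ast.fix_missing_locations(IfExpTransformer().visit(ast.parse(source)))
    new_source = ast.unparse(tree)      # AST back to a source string

    namespace = {"cond": cond}
    exec(compile(tree, "<transformed>", "exec"), namespace)
    pick = namespace["pick"]

The rewritten ``pick`` behaves like the original function, but the branch now goes through ``cond``, which is the hook where a static graph API could build both branches.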
4. Running static graph code as part of dygraph
To increase usability and re-use the transformed static graph code in dygraph, we wrap the generated Program as a dygraph op that can run the forward and backward computation of the transformed Program. This not only speeds up dygraph code and allows saving it for deployment, but also lets users run part of their dygraph code in static graph mode and then continue training or other dygraph computation in the same dygraph code.
5. Error handling and debugging
A compiler usually supports debugging functionality such as breakpoints, throwing exceptions, and printing intermediate code. ProgramTranslator is similar to a compiler, and users may want to set breakpoints for debugging or check whether the transformed static graph code is as expected. So we also implemented error handling and debugging functionality. Here we list those functions and their implementation.
A. Report errors/exceptions on the dygraph code line. Because the transformed static graph code differs from the original dygraph code, exceptions raised while Python executes the static graph code are reported on the static graph code. To locate the corresponding dygraph code, we attach information such as the line number to the AST nodes when we transform the AST, and then rewrite the static graph exception into the corresponding dygraph code exception.
B. We support ``pdb.set_trace()`` when running ProgramTranslator; users can add this line to set breakpoints.
C. Check the transformed static graph code. Our transformed output is a Python class named ``StaticLayer``. This class can be called, and it also stores the transformed code string. Users can call ``StaticLayer.code`` to get the converted code.
D. Print intermediate transformed code, for example the code after transforming a ``for`` loop. We provide APIs to set the log level so that users can check the intermediate code.
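The line mapping described in item A can be sketched as follows. This is a hypothetical illustration in plain Python: the generated source, the ``<generated>`` file name, and the ``line_map`` values are all made up and do not reflect how ProgramTranslator stores this information.

.. code-block:: python

    import traceback

    # Hypothetical generated code; comments mark its own line numbers.
    generated_source = (
        "def func(x):\n"        # generated line 1
        "    y = x + 1\n"       # generated line 2
        "    return y / 0\n"    # generated line 3
    )
    # Made-up map from generated line numbers to original source lines
    line_map = {1: 10, 2: 11, 3: 13}

    namespace = {}
    exec(compile(generated_source, "<generated>", "exec"), namespace)

    original_line = None
    try:
        namespace["func"](1)
    except ZeroDivisionError as err:
        for frame in traceback.extract_tb(err.__traceback__):
            # Only frames from the generated code need to be translated.
            if frame.filename == "<generated>":
                original_line = line_map[frame.lineno]

After the exception, ``original_line`` holds the line of the original code that corresponds to the failing generated line, which is the information an error report would show to the user.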
......@@ -15,6 +15,7 @@ So far you have already been familiar with PaddlePaddle. And the next expectatio
.. toctree::