From e05264a231fed5b6d3fb16045aa0cf13481cffc3 Mon Sep 17 00:00:00 2001 From: Chen Long Date: Thu, 10 Sep 2020 15:38:57 +0800 Subject: [PATCH] add dynamic to static dir test=develop (#2584) --- .../guides/dygraph_to_static/debugging_cn.md | 192 ++++++++++++++++++ .../dygraph_to_static/error_handling_cn.md | 147 ++++++++++++++ .../dygraph_to_static/grammar_list_cn.rst | 16 +- .../dygraph_to_static/grammar_list_en.rst | 124 +++++++++++ .../guides/dygraph_to_static/index_cn.rst | 7 + .../guides/dygraph_to_static/index_en.rst | 14 ++ .../program_translator_cn.rst | 20 +- .../program_translator_en.rst | 178 ++++++++++++++++ doc/paddle/guides/index_en.rst | 3 +- 9 files changed, 685 insertions(+), 16 deletions(-) create mode 100644 doc/paddle/guides/dygraph_to_static/debugging_cn.md create mode 100644 doc/paddle/guides/dygraph_to_static/error_handling_cn.md create mode 100644 doc/paddle/guides/dygraph_to_static/grammar_list_en.rst create mode 100644 doc/paddle/guides/dygraph_to_static/index_en.rst create mode 100644 doc/paddle/guides/dygraph_to_static/program_translator_en.rst diff --git a/doc/paddle/guides/dygraph_to_static/debugging_cn.md b/doc/paddle/guides/dygraph_to_static/debugging_cn.md new file mode 100644 index 000000000..36586337b --- /dev/null +++ b/doc/paddle/guides/dygraph_to_static/debugging_cn.md @@ -0,0 +1,192 @@ +# 调试方法
+
+本节内容将介绍动态图转静态图(下文简称动转静)推荐的几种调试方法。
+
+注意:请确保转换前的动态图代码能够成功运行,建议使用[paddle.jit.ProgramTranslator().enable(False)](../../api_cn/dygraph_cn/ProgramTranslator_cn.html#enable)关闭动转静功能,直接运行动态图,如下:
+```python
+import paddle
+import numpy as np
+paddle.disable_static()
+# 关闭动转静功能
+paddle.jit.ProgramTranslator().enable(False)
+
+@paddle.jit.to_static
+def func(x):
+    x = paddle.to_tensor(x)
+    if x > 3:
+        x = x - 1
+    return x
+
+func(np.ones([3, 2]))
+```
+
+## 断点调试
+使用动转静功能时,您可以使用断点调试代码。
+例如,在代码中,调用`pdb.set_trace()`:
+```Python
+import pdb
+
+@paddle.jit.to_static
+def func(x):
+    x = paddle.to_tensor(x)
+    pdb.set_trace()
+    if x > 3:
+        x = x - 1
+ 
return x +``` +执行以下代码,将会在转化后的静态图代码中使用调试器: +```Python +func(np.ones([3, 2])) +``` + +运行结果: +```bash +> /tmp/tmpR809hf.py(6)func() +-> def true_fn_0(x): +(Pdb) n +> /tmp/tmpR809hf.py(6)func() +-> def false_fn_0(x): +... +``` + +如果您想在原始的动态图代码中使用调试器,请先调用[`paddle.jit.ProgramTranslator().enable(False)`](../../api_cn/dygraph_cn/ProgramTranslator_cn.html#enable),如下: +```python +paddle.jit.ProgramTranslator().enable(False) +func(np.ones([3, 2])) +``` +运行结果: +```bash +> (10)func() +-> if x > 3: +... + +``` + +## 打印转换后的代码 +您可以打印转换后的静态图代码,有2种方法: + +1. 使用被装饰函数的`code` 属性 + ```Python + @paddle.jit.to_static + def func(x): + x = paddle.to_tensor(x) + if x > 3: + x = x - 1 + return x + + print(func.code) + ``` + 运行结果: + + ```bash + + def func(x): + x = fluid.layers.assign(x) + + def true_fn_0(x): + x = x - 1 + return x + + def false_fn_0(x): + return x + x = fluid.dygraph.dygraph_to_static.convert_operators.convert_ifelse(x > + 3, true_fn_0, false_fn_0, (x,), (x,), (x,)) + return x + ``` + +2. 使用`set_code_level(level)`或环境变量`TRANSLATOR_CODE_LEVEL=level` + + 通过调用`set_code_level`或设置环境变量`TRANSLATOR_CODE_LEVEL`,可以在log中查看转换后的代码 +```python +@paddle.jit.to_static +def func(x): + x = paddle.to_tensor(x) + if x > 3: + x = x - 1 + return x + +paddle.jit.set_code_level() # 也可设置 os.environ["TRANSLATOR_CODE_LEVEL"] = '100',效果相同 +func(np.ones([1])) +``` + 运行结果: + +```bash +2020-XX-XX 00:00:00,980-INFO: After the level 100 ast transformer: 'All Transformers', the transformed code: +def func(x): + x = fluid.layers.assign(x) + + def true_fn_0(x): + x = x - 1 + return x + + def false_fn_0(x): + return x + x = fluid.dygraph.dygraph_to_static.convert_operators.convert_ifelse(x > + 3, true_fn_0, false_fn_0, (x,), (x,), (x,)) + return x +``` + `set_code_level` 函数可以设置查看不同的AST Transformer转化后的代码,详情请见[set_code_level]()。 + +## 使用 `print` +`print` 函数可以用来查看变量,该函数在动转静中会被转化。当仅打印Paddle Tensor时,实际运行时会被转换为Paddle算子[Print](../../api_cn/layers_cn/Print_cn.html),否则仍然运行`print`。 +```python +@paddle.jit.to_static +def 
func(x): + x = paddle.to_tensor(x) + # 打印x,x是Paddle Tensor,实际运行时会运行Paddle Print(x) + print(x) + # 打印注释,非Paddle Tensor,实际运行时仍运行print + print("Here call print function.") + if len(x) > 3: + x = x - 1 + else: + x = paddle.ones(shape=[1]) + return x +func(np.ones([1])) +``` + +运行结果: +```bash +Variable: assign_0.tmp_0 + - lod: {} + - place: CPUPlace + - shape: [1] + - layout: NCHW + - dtype: double + - data: [1] +Here call print function. +``` + +## 日志打印 +ProgramTranslator在日志中记录了额外的调试信息,以帮助您了解动转静过程中函数是否被成功转换。 +您可以调用`paddle.jit.set_verbosity(level)` 或设置环境变量`TRANSLATOR_VERBOSITY=level`来设置日志详细等级,并查看不同等级的日志信息。目前,`level`可以取值0-3: +- 0: 无日志 +- 1: 包括了动转静转化流程的信息,如转换前的源码、转换的可调用对象 +- 2: 包括以上信息,还包括更详细函数转化日志 +- 3: 包括以上信息,以及更详细的动转静日志 + + +可以在代码运行前调用`paddle.jit.set_verbosity()`: +```python +paddle.jit.set_verbosity(3) +``` +或者设置环境变量`TRANSLATOR_VERBOSITY`: +```python +import os +os.environ["TRANSLATOR_VERBOSITY"] = '3' +``` + +运行结果: +```bash +2020-XX-XX 00:00:00,123-Level 1: Source code: +@paddle.jit.to_static +def func(x): + x = paddle.to_tensor(x) + if len(x) > 3: + x = x - 1 + else: + x = paddle.ones(shape=[1]) + return x + +2020-XX-XX 00:00:00,152-Level 1: Convert callable object: convert . 
+``` diff --git a/doc/paddle/guides/dygraph_to_static/error_handling_cn.md b/doc/paddle/guides/dygraph_to_static/error_handling_cn.md new file mode 100644 index 000000000..be835d1b2 --- /dev/null +++ b/doc/paddle/guides/dygraph_to_static/error_handling_cn.md @@ -0,0 +1,147 @@ +# 报错信息处理 + +本节内容将介绍使用动态图转静态图(下文简称动转静)功能发生异常时,[ProgramTranslator](./program_translator_cn.html)对报错信息做的处理,以帮助您更好地理解动转静报错信息。使用动转静功能运行动态图代码时,内部可以分为2个步骤:动态图代码转换成静态图代码,运行静态图代码。接下来将分别介绍这2个步骤中的异常报错情况。 + +## 动转静过程中的异常 +在动态图代码转换成静态图代码的过程中,如果ProgramTranslator无法转换一个函数时,将会显示警告信息,并尝试直接运行该函数。 +如下代码中,函数`inner_func` 在调用前被转换成静态图代码,当`x = inner_func(data)`调用该函数时,不能重复转换,会给出警告信息: +```python +import paddle +import numpy as np + +paddle.disable_static() + +@paddle.jit.to_static +def func(): + def inner_func(x): + x_tensor = paddle.to_tensor(x) + return x_tensor + data = np.ones([3]).astype("int32") + x = inner_func(data) + return x +func() +``` +ProgramTranslator打印的警告信息如下: +```bash +WARNING: doesn't have to be transformed to static function because it has been transformed before, it will be run as-is. +``` +## 运行转换后的代码报错 +如果在动转静后的静态图代码中发生异常,ProgramTranslator会捕获该异常,增强异常报错信息,将静态图代码报错行映射到转换前的动态图代码,并重新抛出该异常。 +重新抛出的异常具有以下特点: + +- 隐藏了部分对用户无用的动转静过程调用栈; +- 转换前的代码会给出提示:"In User Code:"; +- 报错信息中包含了转换前的原始动态图代码; + +例如,运行以下代码,在静态图构建时,即编译期会抛出异常: +```python +import paddle +import numpy as np + +paddle.disable_static() + +@paddle.jit.to_static +def func(x): + x = paddle.to_tensor(x) + x = paddle.reshape(x, shape=[-1, -1]) + return x + +func(np.ones([3, 2])) +``` +运行结果: +```bash +Traceback (most recent call last): + in () + func(np.ones([3, 2])) + File "paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 332, in __call__ + raise new_exception +AssertionError: In user code: + + File "", line 7, in func + x = fluid.layers.reshape(x, shape=[-1, -1]) + File "paddle/fluid/layers/nn.py", line 6193, in reshape + attrs["shape"] = get_attr_shape(shape) + File "paddle/fluid/layers/nn.py", line 6169, in get_attr_shape + "be -1. 
But received shape[%d] is also -1." % dim_idx)
+ AssertionError: Only one dimension value of 'shape' in reshape can be -1. But received shape[1] is also -1.
+```
+
+上述报错信息可以分为3点:
+
+1. 报错栈中,涉及代码转换过程的信息栈默认会被隐藏,不对用户展示,以避免给用户带来困扰。
+2. ProgramTranslator处理后的报错信息中,会包含提示"In user code:",表示之后的报错栈中,包含动转静前的动态图代码,即用户写的代码:
+    ```bash
+    AssertionError: In user code:
+
+        File "", line 7, in func
+          x = fluid.layers.reshape(x, shape=[-1, -1])
+        File "paddle/fluid/layers/nn.py", line 6193, in reshape
+          attrs["shape"] = get_attr_shape(shape)
+        File "paddle/fluid/layers/nn.py", line 6169, in get_attr_shape
+          "be -1. But received shape[%d] is also -1." % dim_idx)
+    ```
+    其中,`File "", line 7, in func` 是转换前的代码位置信息,`x = fluid.layers.reshape(x, shape=[-1, -1])` 是转换前的代码。
+3. 新的异常中,包含原始报错中的报错信息,如下:
+    ```bash
+    AssertionError: Only one dimension value of 'shape' in reshape can be -1. But received shape[1] is also -1.
+    ```
+
+运行以下代码,在静态图运行时,即运行期会抛出异常:
+```Python
+@paddle.jit.to_static
+def func(x):
+    x = paddle.to_tensor(x)
+    two = paddle.fill_constant(shape=[1], value=2, dtype="int32")
+    x = paddle.reshape(x, shape=[1, two])
+    return x
+
+func(np.ones([3]).astype("int32"))
+```
+运行结果:
+```bash
+Traceback (most recent call last):
+  File "", line 10, in ()
+    func(np.ones([3]).astype("int32"))
+  File "paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 332, in __call__
+    raise new_exception
+
+EnforceNotMet: In user code:
+
+    File "", line 7, in func
+      x = paddle.reshape(x, shape=[1, two])
+    File "paddle/tensor/manipulation.py", line 1347, in reshape
+      return paddle.fluid.layers.reshape(x=x, shape=shape, name=name)
+    File "paddle/fluid/layers/nn.py", line 6209, in reshape
+      "XShape": x_shape})
+    File "paddle/fluid/layer_helper.py", line 43, in append_op
+      return self.main_program.current_block().append_op(*args, **kwargs)
+    File "paddle/fluid/framework.py", line 2880, in append_op
+      attrs=kwargs.get("attrs", None))
+    File "paddle/fluid/framework.py", line 1977, in __init__
+      for 
frame in traceback.extract_stack(): + +-------------------------------------- +C++ Traceback (most recent call last): +-------------------------------------- +0 paddle::imperative::Tracer::TraceOp(std::string const&, paddle::imperative::NameVarBaseMap const&, paddle::imperative::NameVarBaseMap const&, paddle::framework::AttributeMap, paddle::platform::Place const&, bool) +1 paddle::imperative::OpBase::Run(paddle::framework::OperatorBase const&, paddle::imperative::NameVarBaseMap const&, paddle::imperative::NameVarBaseMap const&, paddle::framework::AttributeMap const&, paddle::platform::Place const&) +2 paddle::imperative::PreparedOp::Run(paddle::imperative::NameVarBaseMap const&, paddle::imperative::NameVarBaseMap const&, paddle::framework::AttributeMap const&) +3 std::_Function_handler >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&) +4 paddle::operators::RunProgramOpKernel::Compute(paddle::framework::ExecutionContext const&) const +5 paddle::framework::Executor::RunPartialPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, long, long, bool, bool, bool) +6 paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&) +7 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const +8 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const +9 paddle::operators::ReshapeKernel::operator()(paddle::framework::ExecutionContext const&) const +10 paddle::operators::ReshapeOp::ValidateShape(std::vector >, paddle::framework::DDim const&) +11 paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int) +12 paddle::platform::GetCurrentTraceBackString() + +---------------------- +Error Message 
Summary: +---------------------- +InvalidArgumentError: The 'shape' in ReshapeOp is invalid. The input tensor X'size must be equal to the capacity of 'shape'. But received X's shape = [3], X's size = 3, 'shape' is [1, 2], the capacity of 'shape' is 2. + [Hint: Expected capacity == in_size, but received capacity:2 != in_size:3.] (at /paddle/paddle/fluid/operators/reshape_op.cc:206) + [operator < reshape2 > error] [operator < run_program > error] +``` +上述异常中,除了隐藏部分报错栈、报错定位到转换前的动态图代码外,报错信息中包含了C++报错栈`C++ Traceback`和`Error Message Summary`,这是Paddle的C++端异常信息,经处理后在Python的异常信息中显示。 diff --git a/doc/paddle/guides/dygraph_to_static/grammar_list_cn.rst b/doc/paddle/guides/dygraph_to_static/grammar_list_cn.rst index b067ecb7f..e1a0867bb 100644 --- a/doc/paddle/guides/dygraph_to_static/grammar_list_cn.rst +++ b/doc/paddle/guides/dygraph_to_static/grammar_list_cn.rst @@ -10,7 +10,7 @@ ProgramTranslator本质是把Python运行语法转写为PaddlePaddle静态图代 控制流相关关键词 ------------------ -控制流指if-elif-else,while等能够控制程序语句执行顺序的关键字。PaddlePaddle静态图通过cond,while_loop API来实现条件判断和循环,如果动态图Python控制流的判断条件/循环条件依赖 Paddle Tensor,动转静后会被转化为等价的Paddle控制流接口,否则仍然使用Python控制流逻辑运行。在动转静过程中这些关键字的转化情况为: +控制流指if-elif-else,while等能够控制程序语句执行顺序的关键字。PaddlePaddle静态图通过cond,while_loop API来实现条件判断和循环,如果动态图Python控制流的判断条件或循环条件依赖 PaddlePaddle Tensor,动转静后会被转化为等价的PaddlePaddle控制流接口,否则仍然使用Python控制流逻辑运行。在动转静过程中这些关键字的转化情况为: 1. if-elif-else 条件 @@ -47,7 +47,7 @@ ProgramTranslator 支持在循环,条件判断中return结果而不需要一 一些需要转化的运算类型 ------------------------ -1. +,-,*,/,** 等Python内置运算 +1. +,-,*,/,**, >, <, >= , <=, == 等Python内置运算 由于静态图有重载这些基本运算符,所以这些被ProgramTranslator转化后都适用相应重载的运算符,动转静支持此类运算。 @@ -76,7 +76,7 @@ Python 函数相关 4. 函数内再调用函数 -对于函数内调用其他函数的情况,ProgramTranslator也会对内部的函数递归地进行动转静,这样做的好处是可以在最外层函数加一次装饰器就能进行动转静,而不需要每个函数都加装饰器。 +对于函数内调用其他函数的情况,ProgramTranslator也会对内部的函数递归地进行动转静,这样做的好处是可以在最外层函数只需加一次装饰器即可,而不需要每个函数都加装饰器。但需要注意,动转静还不支持函数递归调用自己,详细原因请查看下文动转静无法正确运行的情况。 报错异常相关 -------------- @@ -96,7 +96,7 @@ Python基本容器 动转静无法正确运行的情况 -------------------------- -1. 
Reshape后的变量调用其shape作为PaddlePaddle API参数。
 
 具体表现比如 ``x = reshape(x, shape=shape_tensor)`` ,再使用 ``x.shape[0]`` 的值进行其他操作。这种情况会由于动态图和静态图的本质不同而使得动态图能够运行,但静态图运行失败。其原因是动态图情况下,API是直接返回运行结果,因此 ``x.shape`` 在经过reshape运算后是确定的。但是在转化为静态图后,因为静态图API只是组网,``shape_tensor`` 的值在组网时是不知道的,所以 ``reshape`` 接口组网完,静态图并不知道 ``x.shape`` 的值。PaddlePaddle静态图用-1表示未知的shape值,此时 ``x`` 的shape每个维度会被设为-1,而不是期望的值。
 
@@ -104,7 +104,7 @@ Python基本容器
 
 2. 多重list嵌套读写Tensor
 
-具体表现如 ``l = [[tensor1, tensor2], [tensor3, tensor4]]`` ,因为现在静态图将元素全是Tensor的list转化为TensorArray,而Paddle的TensorArray还不支持多维数组,因此这种情况无法动转静正确运行。
+具体表现如 ``l = [[tensor1, tensor2], [tensor3, tensor4]]`` ,因为现在动转静将元素全是Tensor的list转化为TensorArray,而PaddlePaddle的TensorArray还不支持多维数组,因此这种情况下,动转静无法正确运行。
 
 遇到这类情况我们建议尽量用一维list,或者自己使用PaddlePaddle的create_array,array_read,array_write接口编写为TensorArray。
 
@@ -114,3 +114,9 @@ Python基本容器
 
 遇到这种情况我们建议在动转静的函数中尽量使用PaddlePaddle接口替代numpy接口进行运算。
 
+4. 一个函数递归调用自己
+
+ProgramTranslator还无法支持一个函数递归调用自己,原因是递归常常会用 ``if-else`` 构造停止递归的条件。然而这样的停止条件在静态图下只是一个 ``cond`` 组网,组网并不能在编译阶段决定自己组多少次,会导致函数运行时一直组网递归直至栈溢出,因此ProgramTranslator还无法支持一个函数递归调用自己。
+
+遇到这种情况我们建议将代码改为非递归写法。
+ diff --git a/doc/paddle/guides/dygraph_to_static/grammar_list_en.rst b/doc/paddle/guides/dygraph_to_static/grammar_list_en.rst new file mode 100644 index 000000000..0c88a9971 --- /dev/null +++ b/doc/paddle/guides/dygraph_to_static/grammar_list_en.rst @@ -0,0 +1,124 @@ +Supported Grammars
+==================
+
+The key part of ProgramTranslator is transforming Python grammar into PaddlePaddle static graph code, but there are differences between Python and the PaddlePaddle static graph, which cause some limitations in the code transformation.
+
+In this section we will talk about the supported and unsupported grammars, and give some suggestions when a grammar is unsupported. 
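To make "transforming Python grammar" concrete, here is a minimal, self-contained sketch of the analysis step, using only Python's standard ``ast`` module; it is an illustration of the idea, not PaddlePaddle's actual implementation. It detects the control-flow statements that a source-code converter such as ProgramTranslator would need to rewrite:

```python
import ast

# A toy dygraph-style function, given as source text so the sketch is
# self-contained; a real converter would obtain this via inspect.getsource.
source = """
def func(x):
    if x > 3:
        x = x - 1
    while x > 0:
        x = x - 1
    return x
"""

tree = ast.parse(source)

# Walk the AST and collect the statements that correspond to the
# control flow keywords discussed in this section.
control_flow_nodes = [
    node for node in ast.walk(tree)
    if isinstance(node, (ast.If, ast.While, ast.For))
]
print([type(node).__name__ for node in control_flow_nodes])  # ['If', 'While']
```

ProgramTranslator performs this same kind of AST traversal (via the ``gast`` library for cross-version compatibility) before rewriting each detected node into the corresponding static graph API.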
+
+There are several kinds of supported grammars:
+
+Control flow keywords
+---------------------
+
+Control flow means those keywords that control the execution order of program statements, for example ``if-elif-else, while`` . Conditional operations and loops are implemented as the ``cond, while_loop`` APIs in PaddlePaddle static graph. If the condition of a Python dygraph control flow depends on a PaddlePaddle Tensor, ProgramTranslator will convert the control flow into equivalent PaddlePaddle control flow APIs; otherwise it will still be executed as Python control flow. The transformations of those control flow keywords are listed below:
+
+1. ``if-elif-else`` statements
+
+If the condition of ``if`` is a Tensor, ProgramTranslator will turn this ``if-elif-else`` statement into equivalent PaddlePaddle static graph ``cond`` statements, otherwise the ``if-elif-else`` statement is executed as a normal Python conditional statement. Note that the ``cond`` API only accepts an input conditional Tensor with numel equal to 1, so please use this kind of Tensor to write dygraph conditional statements; other Tensors will cause errors.
+
+2. ``while`` loop
+
+If the condition of ``while`` is a Tensor, ProgramTranslator will turn this ``while`` statement into equivalent PaddlePaddle static graph ``while_loop`` statements, otherwise the ``while`` statement is executed as a normal Python ``while`` loop statement. Note that the ``while_loop`` API only accepts an input conditional Tensor with numel equal to 1, so please use this kind of Tensor to write the dygraph loop condition statement; other Tensors will cause errors.
+
+3. ``for`` loop
+
+3.1 ``for _ in range(__)`` loop
+
+First, ProgramTranslator transforms it into an equivalent Python while loop, then converts dygraph to static graph by the same logic as the ``while`` loop.
+
+3.2 ``for _ in x`` loop
+
+If ``x`` is a Python container, iterator, or generator, it will be executed as the original Python statement. 
Otherwise, if ``x`` is a Tensor, ProgramTranslator will transform the loop into a PaddlePaddle static graph loop and fetch ``x[0], x[1], ...`` as the loop iteration variable in each loop iteration.
+
+3.3 ``for idx, val in enumerate(x)`` loop
+
+If ``x`` is a Python container, iterator, or generator, it will be executed as the original Python statement. Otherwise, if ``x`` is a Tensor, ProgramTranslator will transform the loop into a PaddlePaddle static graph loop. The ``idx`` will be transformed to a 1-D tensor with value ``0, 1, ...`` and the ``val`` will be transformed to ``x[0], x[1], ...`` in each loop iteration.
+
+4. ``break, continue``
+
+ProgramTranslator supports ``break, continue`` statements in loops. ProgramTranslator will add some PaddlePaddle static graph ``cond`` statements to skip execution of the corresponding part when the ``break, continue`` condition is met.
+
+5. ``return``
+
+ProgramTranslator supports ``return`` in a conditional block or loop body, not necessarily at the end of a function. It also supports returning a tuple of Tensors with various lengths and different dtypes. The implementation is adding some PaddlePaddle static graph ``cond`` statements to skip parts of code when ``return`` is triggered.
+
+
+Some Python basic operators
+---------------------------
+
+1. ``+, -, *, /, **, >, <, >= , <=, ==`` etc.
+
+Because PaddlePaddle static graph overrides those Python basic arithmetic operators and comparison operators, ProgramTranslator can support those operators.
+
+2. ``and, or, not`` logical operators
+
+Python has the ``and, or, not`` keywords as basic logical operators. ProgramTranslator will check whether the variables of the logical operators are Tensors; if they are Tensors, ProgramTranslator replaces the ``and, or, not`` statements with the corresponding PaddlePaddle static graph logical operators and runs them.
+
+3. Type casting
+
+In dygraph mode, users can use Python type casting grammar. 
For instance, if ``x`` is a Tensor, ``float(x)`` casts the data type of ``x`` to float. ProgramTranslator will check whether ``x`` is a Tensor at run time; if it is, the cast statement will be converted to the PaddlePaddle static graph ``cast`` API so that its dtype can be changed in the dygraph to static transformation.
+
+Python functions
+------------------------------
+
+1. ``print``
+
+In dygraph mode, ``print(x)`` will print the Tensor value if ``x`` is a Tensor. ProgramTranslator converts the built-in ``print`` to the PaddlePaddle static graph ``Print`` API during dygraph to static graph transformation if the arguments are Tensors, otherwise ProgramTranslator won't convert the ``print``.
+
+2. ``len``
+
+If ``x`` is a Tensor, ``len(x)`` gets the length of the 0th dimension of ``x`` in dygraph mode. ProgramTranslator turns it into the PaddlePaddle static graph ``shape`` API and returns the 0th dimension of the ``shape``; else if ``x`` is a TensorArray, then ``len(x)`` will be transformed to the static graph API ``control_flow.array_length`` to return the length of the TensorArray. In other cases, the ``len`` function will be executed as the Python built-in ``len``.
+
+3. lambda expression
+
+ProgramTranslator supports Python lambda expressions and modifies the code to return the expected result.
+
+
+4. Calling function
+
+If the transformed function calls another function, ProgramTranslator also transforms the called function. The benefit is that users can add one decorator to the outermost function to do the transformation, with no need to add a decorator to each inner function. Note that ProgramTranslator doesn't support a function that calls itself recursively; the details are in the unsupported grammars section below.
+
+
+Errors and Exceptions
+---------------------
+
+1. ``assert``
+
+If ``x`` is a Tensor, an ``assert x`` statement can assert ``x`` to be ``True`` or a non-zero value in dygraph mode. 
ProgramTranslator converts the statement into the PaddlePaddle static graph ``Assert`` API to support this grammar.
+
+
+Python containers
+-----------------
+
+1. ``list``: if all elements in a list are Tensors, then ProgramTranslator converts it to TensorArray. PaddlePaddle static graph TensorArray supports append, pop, and modify; other list operations such as sort are not supported. When not all elements in a list are Tensors, ProgramTranslator will treat it as a normal Python list.
+
+2. ``dict``: ProgramTranslator will add the Tensors in a dict into the PaddlePaddle static graph ``Program``, so ``dict`` is supported by ProgramTranslator.
+
+Unsupported grammars
+--------------------
+
+1. Use the shape of the output tensor of ``reshape``
+
+For example, ``x = reshape(x, shape=shape_tensor)`` , then use ``x.shape[0]`` to do other operations. Due to the difference between dygraph and static graph, it is okay in dygraph but it will fail in static graph. The reason is that APIs return computation results in dygraph mode, so ``x.shape`` has a deterministic value after calling ``reshape`` . However, static graph doesn't have the value of ``shape_tensor`` when building the network, so PaddlePaddle doesn't know the value of ``x.shape`` after calling ``reshape``. PaddlePaddle static graph will set -1 to represent an unknown shape value for each dimension of ``x.shape`` in this case, not the expected value.
+
+We suggest setting fixed shape values whenever possible and reducing reshape operations.
+
+2. List of list of Tensor
+
+For example: ``l = [[tensor1, tensor2], [tensor3, tensor4]]``. Because ProgramTranslator transforms a list whose elements are all Tensors into a PaddlePaddle static graph TensorArray, and TensorArray doesn't support multiple dimensions, ProgramTranslator cannot run this case.
+
+We suggest using a 1-D list in most cases, or using the PaddlePaddle APIs ``create_array, array_read, array_write`` to control the TensorArray.
+
+3. Convert Tensor to numpy array and do operation
+
+For example, the user doesn't return a Tensor in the decorated function but calls ``numpy.array(tensor)`` to convert the Tensor to a numpy array and then uses numpy APIs to compute on it. In dygraph mode, it is okay because the Tensor has a value, but in static graph mode a Tensor is a variable for building the network, which holds no value outside of static graph execution, so we cannot do numpy calculations on it.
+
+We suggest using PaddlePaddle APIs to replace the numpy APIs in this case.
+
+4. A function calls itself recursively
+
+ProgramTranslator doesn't support a function that calls itself recursively. The reason is that a recursive function usually uses ``if-else`` for a condition to stop the recursion, and the stop condition will be transformed to a ``cond`` in static graph mode. Since ``cond`` just builds the network, it cannot determine how many times to recursively build the network at the network building stage, so the function will recursively call itself and build the network until stack overflow. Due to the above reason, ProgramTranslator cannot support a function that calls itself recursively now.
+
+We suggest writing the function in a non-recursive way in this case. diff --git a/doc/paddle/guides/dygraph_to_static/index_cn.rst b/doc/paddle/guides/dygraph_to_static/index_cn.rst index 40ab04d3e..cd4ee8e11 100644 --- a/doc/paddle/guides/dygraph_to_static/index_cn.rst +++ b/doc/paddle/guides/dygraph_to_static/index_cn.rst @@ -6,8 +6,15 @@
 
 - `支持语法列表 `_ :介绍了动态图转静态图支持的语法以及罗列不支持的语法写法
 
+- `报错信息处理 `_ :介绍了动态图转静态图支持的报错信息处理方法
+
+- `调试方法 `_ :介绍了动态图转静态图支持的调试方法
+
+
 .. toctree::
    :hidden:

    grammar_list_cn.rst
    program_translator_cn.rst
+   error_handling_cn.md
+   debugging_cn.md diff --git a/doc/paddle/guides/dygraph_to_static/index_en.rst b/doc/paddle/guides/dygraph_to_static/index_en.rst new file mode 100644 index 000000000..1ce60c913 --- /dev/null +++ b/doc/paddle/guides/dygraph_to_static/index_en.rst @@ -0,0 +1,14 @@ +#######################
+Dygraph to Static Graph
+#######################
+
+- `Dygraph to Static Graph `_ :Introduce the basic usage for transforming dygraph code into static code and the architecture of ProgramTranslator.
+
+- `Supported Grammars `_ :Introduce the grammars supported by ProgramTranslator and list unsupported grammars.
+
+.. toctree::
+    :hidden:
+
+    grammar_list_en.rst
+    program_translator_en.rst
+ diff --git a/doc/paddle/guides/dygraph_to_static/program_translator_cn.rst b/doc/paddle/guides/dygraph_to_static/program_translator_cn.rst index 0475097fb..03852fc58 100644 --- a/doc/paddle/guides/dygraph_to_static/program_translator_cn.rst +++ b/doc/paddle/guides/dygraph_to_static/program_translator_cn.rst @@ -1,16 +1,16 @@ 动态图转静态图
 ================
 
-PaddlePadde的动态图具有接口易用、Python风格的编程体验、友好的debug交互等优点。在动态图模式下,代码是按照我们编写的顺序依次执行。这种机制更符合python程序员的习惯,可以很方便地将大脑中的想法快速地转化为实际代码,也更容易调试。但在部分性能方面,python速度负担太大,无法与C++相提并论。因此在工业界部署很多地方(如大型推荐系统、移动端)都倾向于直接使用C++来提速。
+动态图有诸多优点,包括易用的接口,Python风格的编程体验,友好的debug交互机制等。在动态图模式下,代码是按照我们编写的顺序依次执行。这种机制更符合Python程序员的习惯,可以很方便地将大脑中的想法快速地转化为实际代码,也更容易调试。但在性能方面,Python执行开销较大,与C++有一定差距。因此在工业界的许多部署场景中(如大型推荐系统、移动端)都倾向于直接使用C++来提速。
 
-此时,静态图在部署方面更具有性能的优势。静态图程序在编译执行时,先搭建模型的神经网络结构,然后再对神经网络执行计算操作。预先搭建好的神经网络可以脱离python依赖,在C++端被重新解析执行。
+相比动态图,静态图在部署方面更具有性能的优势。静态图程序在编译执行时,先搭建模型的神经网络结构,然后再对神经网络执行计算操作。预先搭建好的神经网络可以脱离Python依赖,在C++端被重新解析执行,而且由于拥有完整的网络结构,还能对网络结构进行一些优化。
 
 动态图代码更易编写和debug,但在部署性能上,静态图更具优势。因此我们新增了动态图转静态图的功能,支持用户依然使用动态图编写组网代码。PaddlePaddle会对用户代码进行分析,自动转换为静态图网络结构,兼顾了动态图易用性和静态图部署性能两方面优势。
 
 基本使用方法
 --------------
 
-PaddlePaddle提供了两种动态图转静态图的方式,基于动态图trace的转换与基于源代码级别的转换的ProgramTranslator。 
+PaddlePaddle提供了两种动态图转静态图的方式,基于动态图trace的TracedLayer与基于源代码级别转换的ProgramTranslator。 1. 基于trace的TracedLayer: @@ -88,7 +88,7 @@ trace是指在模型运行时记录下其运行过哪些算子。TracedLayer就 2. 基于源代码转写的ProgramTranslator -对于依赖数据的控制流,我们使用基于源代码转写的ProgramTranslator来进行动态图转静态图。其基本原理是通过分析python代码来将动态图代码转写为静态图代码,并在底层自动帮用户使用执行器运行。其基本使用方法十分简便,只需要在要转化的函数(该函数也可以是用户自定义动态图Layer的forward函数)前添加一个装饰器@paddle.jit.to_static,上面的例子转化如下,并且可以依旧使用该函数运行得到结果: +对于依赖数据的控制流,我们使用基于源代码转写的ProgramTranslator来进行动态图转静态图。其基本原理是通过分析Python代码来将动态图代码转写为静态图代码,并在底层自动帮用户使用执行器运行。其基本使用方法十分简便,只需要在要转化的函数(该函数也可以是用户自定义动态图Layer的forward函数)前添加一个装饰器 ``@paddle.jit.to_static`` ,上面的例子转化如下,并且可以依旧使用该函数运行得到结果: .. code-block:: python @@ -108,7 +108,7 @@ trace是指在模型运行时记录下其运行过哪些算子。TracedLayer就 func(input_var) -若要存储转化后的静态图模型,可以调用paddle.jit.save,我们再以SimpleFcLayer为例,需要在SimpleFcLayer的forward函数添加装饰器: +若要存储转化后的静态图模型,可以调用 ``paddle.jit.save`` ,我们再以SimpleFcLayer为例,需要在SimpleFcLayer的forward函数添加装饰器: .. code-block:: python @@ -141,7 +141,7 @@ trace是指在模型运行时记录下其运行过哪些算子。TracedLayer就 input_var = paddle.to_tensor(in_np) out = fc_layer(input_var) - paddle.jit.save(mnist, "./mnist_dy2stat", input_spec=[input_var]) + paddle.jit.save(fc_layer, "./fc_layer_dy2stat", input_spec=[input_var]) 内部架构原理 -------------- @@ -157,21 +157,21 @@ TracedLayer的原理就是trace,相对简单,因此我们在这里不展开 2. 动态图源码转AST(抽象语法树) -动态图转静态图的最核心部分类似一个编译器,解析动态图代码语句为AST,再对应AST进行改写,最后反转回成静态图代码。从函数转化为代码字符串可以使用Python的inspect.getsource。从字符串Python提供了自带的ast库来解析字符串为 `AST `_ ,但是由于python2,python3的语法略有不同,为了避免我们需要额外处理这些python2,python3的不同情况,我们使用了统一python2,python3的开源AST处理 `gast库 `_ 。这些接口使得函数转化为AST没有本质上的困难。 +动态图转静态图的最核心部分类似一个编译器,解析动态图代码语句为AST,再对应AST进行改写,最后反转回成静态图代码。从函数转化为代码字符串可以使用Python的inspect.getsource。从字符串Python提供了自带的 `ast `_ 库来解析字符串为AST,但是由于Python2,Python3的语法略有不同,为了避免我们需要额外处理这些Python2,Python3的不同情况,我们使用了统一Python2,Python3的开源AST处理 `gast库 `_ 。这些接口使得函数转化为AST没有本质上的困难。 3. 
AST改写和静态图源码转换
 
-这部分为动转静最核心的部分,我们对支持的各种语法进行ast转写。其中最重要的python控制流,if-else,while,for循环被分别分析转化为PaddlePaddle静态图接口cond,while_loop等接口实现。我们对想转化的每一种主要语法创建一个Transformer(这里的Transformer是python ast转写的概念,而不是自然语言处理NLP领域的Transformer),每个Transformer扫一遍AST并进行对应的改写。最后被转化完成的AST我们使用gast提供的接口转回成源码。
+这部分为动转静最核心的部分,我们对支持的各种语法进行ast转写。其中最重要的Python控制流,if-else,while,for循环被分别分析转化为PaddlePaddle静态图接口cond,while_loop等接口实现。我们对想转化的每一种主要语法创建一个Transformer(这里的Transformer是Python ast转写的概念,而不是自然语言处理NLP领域的Transformer),每个Transformer扫一遍AST并进行对应的改写。最后被转化完成的AST我们使用gast提供的接口转回成源码。
 
 4. 静态图源码作为动态图一部分运行的技术
 
-为了动静转化更加易用和被转化的代码能在动态图中复用,我们在拥有源码后运行生成Program,并将这个Program作为一个大op,包装成动态图的一个op,这样既能把用户的代码转为静态图提速或者保存部署,另一方面如果用户想在python层使用生成的静态图代码作为动态图的一部分继续训练或者别的动态图运算也是可以直接使用。
+为了动静转化更加易用和被转化的代码能在动态图中复用,我们在拥有源码后运行生成Program,并将这个Program作为一个大op,包装成动态图的一个op,这样既能把用户的代码转为静态图提速或者保存部署,另一方面,如果用户想在Python层将生成的静态图代码作为动态图的一部分继续训练或进行别的动态图运算,也可以直接使用。
 
 5. 易用性与Debug功能在动转静过程的实现
 
 正如AST转写类似编译器,而一般编译器都会提供debug断点,报错,输出一些中间代码等功能。我们在进行动转静时,万一用户的动态图代码出错,或者用户想断点调试,或者用户想看看被转化后的静态图代码是否符合其预期,我们也希望能够像编译器一样提供这些易用性功能,使得动转静兼顾性能和部署同时还具有易用性。我们这里将列出这些功能的实现方式
 
-A. 报错对应到动态图代码行。由于被转化后的静态图代码和原动态图代码不同,python运行出错时会报静态图的错误,因此我们在每一次AST转写时添加AST节点对应的原动态图代码行等信息,在python报错栈中将静态图的报错转化成对应的动态图源码报错
+A. 报错对应到动态图代码行。由于被转化后的静态图代码和原动态图代码不同,Python运行出错时会报静态图的错误,因此我们在每一次AST转写时添加AST节点对应的原动态图代码行等信息,在Python报错栈中将静态图的报错转化成对应的动态图源码报错
 
 B. 设置断点功能。我们保留了被转化后代码中的pdb.set_trace(), 用户可以使用这种方式进行断点调试
 diff --git a/doc/paddle/guides/dygraph_to_static/program_translator_en.rst b/doc/paddle/guides/dygraph_to_static/program_translator_en.rst new file mode 100644 index 000000000..573ddbb79 --- /dev/null +++ b/doc/paddle/guides/dygraph_to_static/program_translator_en.rst @@ -0,0 +1,178 @@ +Dygraph to Static Graph
+=======================
+
+The imperative-style coding of PaddlePaddle has the advantages of flexibility, Pythonic coding, and an easy-to-debug interface. In dygraph mode, code immediately executes kernels and gets numerical results, which allows users to enjoy traditional Pythonic code order. 
Therefore it is efficient to transform an idea into real code and simple to debug. However, Python code is usually slower than C++, so many industrial systems (such as large recommender systems, mobile devices) prefer to deploy with a C++ implementation.
+
+Static graph is better at speed and portability. A static graph program builds the network structure at compile time and then does the computation. The built network intermediate representation can be executed in C++ and gets rid of the Python dependency.
+
+While dygraph has usability and debug benefits and static graph yields performance and deployment advantages, we add functionality to convert dygraph to static graph. Users use imperative mode to write dygraph code and PaddlePaddle will analyze the Python syntax and turn it into the network structure of static graph mode. Our approach retains both the usability of dygraph and the portability of static graph.
+
+Basic Usage
+--------------
+
+PaddlePaddle has two ways to transform dygraph to static graph. TracedLayer extracts the computation graph through tracing and ProgramTranslator gets the computation graph through source code transformation.
+
+
+1. TracedLayer:
+
+Tracing means recording the operators when running a model. TracedLayer is based on this technique. It runs the dygraph program once, records all operators, then constructs the static graph model and saves it. Now take a glance at a usage example:
+
+Define a simple fully connected network:
+
+.. code-block:: python
+
+    import numpy as np
+    import paddle
+
+    class SimpleFcLayer(paddle.nn.Layer):
+        def __init__(self, feature_size, batch_size, fc_size):
+            super(SimpleFcLayer, self).__init__()
+            self._linear = paddle.nn.Linear(feature_size, fc_size)
+            self._offset = paddle.to_tensor(
+                np.random.random((batch_size, fc_size)).astype('float32'))
+
+        def forward(self, x):
+            fc = self._linear(x)
+            return fc + self._offset
+
+Save model by TracedLayer:
+
+.. code-block:: python
+
+    import numpy as np
+    import paddle
+    from paddle.jit import TracedLayer
+
+    paddle.disable_static()
+
+    fc_layer = SimpleFcLayer(3, 4, 2)
+    in_np = np.random.random([3, 4]).astype('float32')
+    # Turn numpy ndarray into Tensor
+    input_var = paddle.to_tensor(in_np)
+    # Transforming imperative mode into declarative mode by TracedLayer.trace
+    out_dygraph, static_layer = TracedLayer.trace(fc_layer, inputs=[input_var])
+    save_dirname = './saved_infer_model'
+    # Save the transformed model
+    static_layer.save_inference_model(save_dirname, feed=[0], fetch=[0])
+
+Load model and run it in static graph mode:
+
+.. code-block:: python
+
+    place = paddle.CPUPlace()
+    exe = paddle.Executor(place)
+    program, feed_vars, fetch_vars = paddle.io.load_inference_model(save_dirname, exe)
+    fetch, = exe.run(program, feed={feed_vars[0]: in_np}, fetch_list=fetch_vars)
+
+However, as tracing only records operators once, if the user's code contains Tensor-dependent (including Tensor value or Tensor shape) control flow, that is, the Tensor can cause different operators to be executed, then TracedLayer cannot handle this case. For instance:
+
+.. code-block:: python
+
+    import numpy as np
+    import paddle
+
+    def func(input_var):
+        # if condition depends on the shape of input_var
+        if input_var.shape[0] > 1:
+            return paddle.cast(input_var, "float64")
+        else:
+            return paddle.cast(input_var, "int64")
+
+    paddle.disable_static()
+    in_np = np.array([-2]).astype('int')
+    input_var = paddle.to_tensor(in_np)
+    out = func(input_var)
+
+If we apply TracedLayer.trace(func, inputs=[input_var]) on the above example, tracing only records the operators in one branch of the if-else, so the model cannot be saved as the user originally intended. The same applies to while/for loops.
+
+2. ProgramTranslator
+
+For the Tensor-dependent control flow, we use the source-code-translation based ProgramTranslator to convert dygraph into static graph. 
The basic idea is to analyze the Python source code, turn it into static graph code, then run the static graph code using an Executor. The basic usage of ProgramTranslator is simple: put the decorator ``@paddle.jit.to_static`` before the definition of the function to transform (the function can also be a method of a class, e.g., the ``forward`` method of a user-defined imperative Layer). The Tensor-dependent example above can be transformed correctly by ProgramTranslator as below:

.. code-block:: python

    import numpy as np
    import paddle

    @paddle.jit.to_static
    def func(input_var):
        # if condition depends on the shape of input_var
        if input_var.shape[0] > 1:
            out = paddle.cast(input_var, "float64")
        else:
            out = paddle.cast(input_var, "int64")
        return out

    paddle.disable_static()
    in_np = np.array([-2]).astype('int')
    input_var = paddle.to_tensor(in_np)
    func(input_var)

To save the transformed model, we can call ``paddle.jit.save``. Let's take ``SimpleFcLayer`` as an example again and put the decorator on the ``forward`` method of ``SimpleFcLayer``:

.. code-block:: python

    import numpy as np
    import paddle

    class SimpleFcLayer(paddle.nn.Layer):
        def __init__(self, feature_size, batch_size, fc_size):
            super(SimpleFcLayer, self).__init__()
            self._linear = paddle.nn.Linear(feature_size, fc_size)
            self._offset = paddle.to_tensor(
                np.random.random((batch_size, fc_size)).astype('float32'))

        @paddle.jit.to_static
        def forward(self, x):
            fc = self._linear(x)
            return fc + self._offset


Call ``paddle.jit.save`` to save the model above:

.. code-block:: python

    import paddle

    paddle.disable_static()

    fc_layer = SimpleFcLayer(3, 4, 2)
    in_np = np.random.random([3, 4]).astype('float32')
    input_var = paddle.to_tensor(in_np)
    out = fc_layer(input_var)

    paddle.jit.save(fc_layer, "./fc_layer_dy2stat")


Architecture
--------------

The basic idea of TracedLayer is tracing; it is relatively simple, so we won't expand on it here. 
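Still, to make the mechanism concrete, here is a tiny pure-Python sketch of the tracing idea (hypothetical and heavily simplified, not Paddle's implementation): run the function once on a wrapped value, record every primitive operation into a tape, and replay the tape later on new inputs.

```python
# Toy illustration of tracing (hypothetical sketch, not Paddle internals):
# running a function once records the operations it performs, and the
# recorded "tape" can then be replayed without re-running the Python logic.

class TracedValue:
    def __init__(self, value, tape):
        self.value = value
        self.tape = tape  # shared list of recorded operations

    def __add__(self, other):
        self.tape.append(('add', other))
        return TracedValue(self.value + other, self.tape)

    def __mul__(self, other):
        self.tape.append(('mul', other))
        return TracedValue(self.value * other, self.tape)


def trace(fn, x):
    """Run fn once on a traced input and return (result, recorded tape)."""
    tape = []
    out = fn(TracedValue(x, tape))
    return out.value, tape


def replay(tape, x):
    """Re-execute the recorded operations on a new input."""
    for op, operand in tape:
        x = x + operand if op == 'add' else x * operand
    return x


result, tape = trace(lambda v: v * 2 + 3, 10)
print(result)            # 23
print(replay(tape, 5))   # 13: same recorded graph, new input
```

Note that the tape records only the branch actually taken during the single recorded run, which is exactly why tracing cannot capture Tensor-dependent control flow, as the if-else example earlier showed.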
This section will talk about the source code transformation of ProgramTranslator.

The transformation is implemented in the decorator, so it happens when the user calls the decorated function. The procedure includes these steps:

1. Function and cache.

The entity transformed from dygraph to static graph is the decorated function. Since PaddlePaddle APIs in the function are the same code under dygraph mode and static graph mode, we don't have to transform them. However, those APIs perform computation in dygraph mode while they build the network in static graph mode; if the transformed function is called multiple times, those APIs would build the network multiple times in static graph mode, which can cause problems. To solve this, and to speed up the transformation, we maintain a cache that maps from (function, input shapes, input data types) to the Program built by the transformed function. If the function hits the cache, we run the stored Program in static graph mode to get the result; otherwise we do the code transformation on the function and store the transformed Program into the cache.

2. From dygraph source code to AST (Abstract Syntax Tree)

The core of transforming dygraph to static graph is similar to a compiler: we parse the dygraph code into an AST, change the AST, then turn it back into static graph code. We use Python's ``inspect.getsource`` to get the source code string of the function. Python provides the ``ast`` library to parse a code string into an AST, but Python 2 and Python 3 have slight grammar differences. To avoid handling the different grammars, we use the open source AST library ``gast``, which provides a compatible AST across various Python versions. There is no essential difficulty in turning a function into an AST with these libraries.

3. Transform AST and turn it into static graph code

This part is the key part of ProgramTranslator: we modify the AST for the supported grammars. 
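For intuition, the AST manipulation in steps 2 and 3 can be sketched with Python's standard ``ast`` module alone (a simplified, hypothetical stand-in; ProgramTranslator uses ``gast`` and far more elaborate transformers): parse a function's source into an AST, rewrite a node with a ``NodeTransformer``, and compile the modified tree back into a callable.

```python
import ast

# Hypothetical toy transformer, illustrative only: it rewrites every
# subtraction in the parsed source into an addition, mirroring (in spirit)
# how ProgramTranslator's transformers rewrite control-flow nodes into
# static graph API calls.
SOURCE = """
def func(x):
    return x - 1
"""

class SubToAdd(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)  # transform children first
        if isinstance(node.op, ast.Sub):
            node.op = ast.Add()
        return node

tree = ast.parse(SOURCE)
tree = SubToAdd().visit(tree)
ast.fix_missing_locations(tree)

# Execute the modified AST to obtain the transformed function
namespace = {}
exec(compile(tree, filename='<transformed>', mode='exec'), namespace)
print(namespace['func'](10))  # prints 11: the subtraction became an addition
```

In ProgramTranslator, the modified AST is turned back into a source code string rather than compiled directly, so that the generated static graph code can be inspected and debugged.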
Those important Python control flows, such as ``if-elif-else``, ``while``, and ``for`` loops, are converted to the PaddlePaddle static graph APIs ``cond``, ``while_loop``, and so on. We created a Transformer (an AST-to-AST Transformer in Python, not the Transformer in Natural Language Processing) for each grammar. Every Transformer scans the AST and modifies it. Lastly, we turn the AST back into a source code string with the ``gast`` library.

4. Running static graph code as part of dygraph

In order to increase usability and re-use the transformed static graph code in dygraph, we wrap the generated Program as a dygraph op. The op can run the forward and backward computation of the transformed Program. Then we can not only speed up dygraph code or save it for deployment, but also enable users to run part of their dygraph code in static graph mode, so that they can continue training or other dygraph computation in their dygraph code.

5. Error handling and debugging

A compiler usually supports debugging functionality such as breakpoints, throwing exceptions, and printing intermediate code. ProgramTranslator is similar to a compiler: users may want to set breakpoints for debugging, or to see whether the transformed static graph code is what they expected. So we also implemented error handling and debugging functionality. Here we list those functions and their implementation.

A. Report errors/exceptions on the dygraph code line. Because the transformed static graph code differs from the original dygraph code, when Python executes the static graph code, exceptions are reported at the static graph code. To locate the corresponding dygraph code, we attach information such as line numbers to AST nodes when we transform the AST; we can then rewrite a static graph exception into the corresponding dygraph code exception.

B. We support ``pdb.set_trace()`` when running ProgramTranslator; users can add this line to set breakpoints.

C. Check the transformed static graph code. 
Our transformed output is a Python class named ``StaticLayer``. This class can be called, and it also stores the transformed code string; users can access ``StaticLayer.code`` to get the converted code.

D. Print intermediate transformed code, such as the code after transforming a ``for`` loop. We provide APIs to set the log level so that users can check the intermediate code.


diff --git a/doc/paddle/guides/index_en.rst b/doc/paddle/guides/index_en.rst
index e0b9ac210..ff7bad589 100644
--- a/doc/paddle/guides/index_en.rst
+++ b/doc/paddle/guides/index_en.rst
@@ -10,9 +10,10 @@ Please refer to `PaddlePaddle Github `_
 Let's start with studying basic concept of PaddlePaddle:
 
 - `migration tools <./migration_en.html>`_:how to use migration tools to upgrade your code.
-
+- `dynamic to static <./dygraph_to_static/index_en.html>`_:how to convert your model from dynamic graph to static graph.
 
 .. toctree::
    :hidden:
 
    migration_en.rst
+   dynamic_to_static/index_en.rst
-- 
GitLab