basic_usage_en.rst 5.5 KB
Newer Older
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111
Basic Usage
=============

The recommended way to transform dygraph to static graph is source-code-translate based ProgramTranslator. The basic idea is analyzing Python source code and turning into static graph code, then run the static graph code using Executor. Users could use Python syntax including control flow to build neural networks. Besides, PaddlePaddle has another tracing-based API for transforming dygraph to static graph which called TracedLayer. You can use it as a back-up API in case ProgramTranslator has problem.

ProgramTranslator
-------------------

The basic idea of source-code-translate based ProgramTranslator is analyzing Python source code and turning it into static graph code, then run the static graph code using Executor. The basic usage of ProgramTranslator is simple, put a decorator ``@paddle.jit.to_static`` before the definition of the function to transform (the function can also be a method of a class, e.g., the ``forward`` function of user-defined imperative Layer). An example is:

.. code-block:: python

    import paddle

    @paddle.jit.to_static
    def func(input_var)
        # if condition depends on the shape of input_var
        if input_var.shape[0] > 1:
            out = paddle.cast(input_var, "float64")
        else:
            out = paddle.cast(input_var, "int64")

    paddle.disable_static()
    in_np = np.array([-2]).astype('int')
    input_var = paddle.to_tensor(in_np)
    func(input_var)

To save the transformed model, we can call ``paddle.jit.save`` . Let's take a fully connected network called ``SimpleFcLayer`` as an example, we put decorator at the ``forward`` method of ``SimpleFcLayer`` :

.. code-block:: python

    import numpy as np
    import paddle

    class SimpleFcLayer(paddle.nn.Layer):
        def __init__(self, feature_size, batch_size, fc_size):
            super(SimpleFCLayer, self).__init__()
            self._linear = paddle.nn.Linear(feature_size, fc_size)
            self._offset = paddle.to_tensor(
                np.random.random((batch_size, fc_size)).astype('float32'))

        @paddle.jit.to_static
        def forward(self, x):
            fc = self._linear(x)
            return fc + self._offset


Call ``paddle.jit.save`` to save above model:

.. code-block:: python

    import paddle

    paddle.disable_static()

    fc_layer = SimpleFcLayer(3, 4, 2)
    in_np = np.random.random([3, 4]).astype('float32')
    input_var = paddle.to_tensor(in_np)
    out = fc_layer(input_var)

    paddle.jit.save(fc_layer, "./fc_layer_dy2stat")


TracedLayer
-------------

Tracing means recording the operators when running a model. TracedLayer is based on this technique. It runs dygraph program once and records all operators, then constructs static graph model and saves it. Now take a glance at an usage example:

Define a simple fully connected network, note that we don't add a decorator before ``forward`` function as we did in ProgramTranslator example:

.. code-block:: python

    import numpy as np
    import paddle

    class SimpleFcLayer(paddle.nn.Layer):
        def __init__(self, feature_size, batch_size, fc_size):
            super(SimpleFCLayer, self).__init__()
            self._linear = paddle.nn.Linear(feature_size, fc_size)
            self._offset = paddle.to_tensor(
                np.random.random((batch_size, fc_size)).astype('float32'))

        def forward(self, x):
            fc = self._linear(x)
            return fc + self._offset

Save model by TracedLayer:

.. code-block:: python

    import paddle
    from paddle.jit import TracedLayer

    paddle.disable_static()

    fc_layer = SimpleFcLayer(3, 4, 2)
    in_np = np.random.random([3, 4]).astype('float32')
    # Turn numpy ndarray into Tensor
    input_var = paddle.to_tensor(in_np)
    # Transforming imperative mode into declarative mode by TracerLayer.trace
    out_dygraph, static_layer = TracedLayer.trace(fc_layer, inputs=[input_var])
    save_dirname = './saved_infer_model'
    # Save the transformed model
    static_layer.save_inference_model(save_dirname, feed=[0], fetch=[0])

Load model and run it in static graph mode:

.. code-block:: python

    place = paddle.CPUPlace()
    exe = paddle.Executor(place)
112
    program, feed_vars, fetch_vars = paddle.static.load_inference_model(save_dirname, exe)
113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139
    fetch, = exe.run(program, feed={feed_vars[0]: in_np}, fetch_list=fetch_vars)

However, as tracing only records operators once, if user's code contains Tensor-dependent (including Tensor value or Tensor shape) control flow, that is the Tensor can cause different operators being executed, then TracedLayer cannot handle this case. For instance:

.. code-block:: python

    import paddle

    def func(input_var)
        # if condition depends on the shape of input_var
        if input_var.shape[0] > 1:
            return paddle.cast(input_var, "float64")
        else:
            return paddle.cast(input_var, "int64")

    paddle.disable_static()
    in_np = np.array([-2]).astype('int')
    input_var = paddle.to_tensor(in_np)
    out = func(input_var)

If we apply TracedLayer.trace(func, inputs=[input_var]) on above example, tracing can take record of operators in only one branch of if-else, then the model can not be saved as what user orignally means. The similar situations applies to while/for loop.

Comparing ProgramTranslator and TracedLayer
-------------------------------------------

Compared to tracing-based TracedLayer, source-code-translate based ProgramTranslator can handle the Tensor-dependent control flow. So we recommend users to use ProgramTranslator, use TracedLayer as a back-up plan when ProgramTranslator doesn't work.