Commit 783c4b20 authored by Youwei Song, committed by Jiabin Yang

fix Dygraph cn sample code and doc format (#1508)

* fix Dygraph cn sample code, test=develop

* fix Dygraph doc format, test=develop
Parent cacbfe4e
@@ -22,35 +22,40 @@ PaddlePaddle DyGraph is a more flexible and easier-to-use mode that provides:
1. Upgrade to the latest PaddlePaddle 1.5:
```
pip install -q --upgrade paddlepaddle==1.5
```
2. Use the `fluid.dygraph.guard(place=None)` context:
```python
import paddle.fluid as fluid

with fluid.dygraph.guard():
    # write your executable dygraph code here
```
Now you can run networks in DyGraph mode inside the `fluid.dygraph.guard()` context. DyGraph changes the way PaddlePaddle used to execute: operations now run immediately and return their results to Python.
DyGraph works very well with NumPy: `fluid.dygraph.to_variable(x)` converts an `ndarray` into a `fluid.Variable`, and `fluid.Variable.numpy()` converts the computed result at any point back into a NumPy `ndarray`:
```python
import numpy as np
import paddle.fluid as fluid

x = np.ones([2, 2], np.float32)
with fluid.dygraph.guard():
    inputs = []
    for _ in range(10):
        inputs.append(fluid.dygraph.to_variable(x))
    ret = fluid.layers.sums(inputs)
    print(ret.numpy())
```
This produces the output:
```
[[10. 10.]
 [10. 10.]]
```
> Here we created a series of `ndarray` inputs, ran a `sum` op on them, and printed the result directly.
@@ -58,75 +63,75 @@
Then call `reduce_sum` on the result, run the backward pass with `Variable.backward()`, and use `Variable.gradient()` to obtain the gradients computed by the backward pass as an `ndarray`:
```python
loss = fluid.layers.reduce_sum(ret)
loss.backward()
print(loss.gradient())
```
This produces the output:
```
[1.]
```
## Building a network with DyGraph
1. Write object-oriented model code for DyGraph execution. PaddlePaddle model code mainly consists of the following **three parts**. **Note: if the layer you design contains parameters, it must be described by an object-oriented class that inherits from `fluid.dygraph.Layer`.**
1. Build an object-oriented network that can run in DyGraph mode. It must inherit from `fluid.dygraph.Layer`, call the base class's `__init__`, and implement an `__init__` constructor with a `name_scope` parameter (used to identify this layer's name). In the constructor we usually perform operations such as parameter initialization and sub-network initialization, which do not depend on the dynamic input:
```python
class MyLayer(fluid.dygraph.Layer):
    def __init__(self, name_scope):
        super(MyLayer, self).__init__(name_scope)
```
2. Implement a `forward(self, *inputs)` function that contains the actual execution logic of the network; it is called in every training/inference iteration. Here we run a simple `relu` -> `elementwise mul` -> `reduce sum`:
```python
    def forward(self, inputs):
        x = fluid.layers.relu(inputs)
        self._x_for_debug = x
        x = fluid.layers.elementwise_mul(x, x)
        x = fluid.layers.reduce_sum(x)
        return [x]
```
2. Execute inside `fluid.dygraph.guard()`:
1. Build the input with NumPy:
```python
np_inp = np.array([1.0, 2.0, -1.0], dtype=np.float32)
```
2. Convert the input `ndarray` to a `Variable` and run the forward network to get the result: `fluid.dygraph.to_variable(np_inp)` converts the NumPy input into the input DyGraph accepts; `my_layer(var_inp)[0]` calls the callable object and returns `x`; `x.numpy()` directly retrieves the resulting `x` as an `ndarray`.
```python
with fluid.dygraph.guard():
    var_inp = fluid.dygraph.to_variable(np_inp)
    my_layer = MyLayer("my_layer")
    x = my_layer(var_inp)[0]
    dy_out = x.numpy()
```
3. Compute gradients: automatic differentiation is useful for implementing machine learning algorithms such as back-propagation for training neural networks. `x.backward()` runs the backward network starting from a given `fluid.Variable`, and `my_layer._x_for_debug.gradient()` retrieves the gradient of `x` in the network as an `ndarray`:
```python
x.backward()
dy_grad = my_layer._x_for_debug.gradient()
```
The complete code is as follows:
```python
import paddle.fluid as fluid
import numpy as np


class MyLayer(fluid.dygraph.Layer):
    def __init__(self, name_scope):
        super(MyLayer, self).__init__(name_scope)

    # @@ -138,16 +143,18 @@ (lines elided in the diff view)
        return [x]


if __name__ == '__main__':
    np_inp = np.array([1.0, 2.0, -1.0], dtype=np.float32)
    with fluid.dygraph.guard():
        var_inp = fluid.dygraph.to_variable(np_inp)
        var_inp.stop_gradient = False
        my_layer = MyLayer("my_layer")
        x = my_layer(var_inp)[0]
        dy_out = x.numpy()
        x.backward()
        dy_grad = my_layer._x_for_debug.gradient()
        my_layer.clear_gradients()  # reset parameter gradients to keep the next round of training correct
```
## Training a model with DyGraph
@@ -158,13 +165,14 @@
1. Prepare the data. We use `paddle.dataset.mnist` as the training dataset:
```python
train_reader = paddle.batch(
    paddle.dataset.mnist.train(), batch_size=BATCH_SIZE, drop_last=True)
```
2. Build the network. Although you can define the entire network structure yourself as introduced above, you can also directly use some of the basic building blocks we provide as `fluid.dygraph.Layer` subclasses. Here we use `fluid.dygraph.Conv2D` and `fluid.dygraph.Pool2D` to build a basic `SimpleImgConvPool`:
```python
class SimpleImgConvPool(fluid.dygraph.Layer):
    def __init__(self,
                 name_scope,
                 # @@ -211,18 +219,13 @@ (lines elided in the diff view)
        x = self._conv2d(inputs)
        x = self._pool2d(x)
        return x
```
> Note: when building a network, define and set up sub-networks in `__init__`, and execute them in the `forward` function.
3. Use the `SimpleImgConvPool` built above to compose the final `MNIST` network:
```python
class MNIST(fluid.dygraph.Layer):
    def __init__(self, name_scope):
        super(MNIST, self).__init__(name_scope)
        # @@ -252,12 +255,11 @@ (lines elided in the diff view)
            return x, acc
        else:
            return x
```
4. Instantiate the configured `MNIST` network inside `fluid.dygraph.guard()`. Even without any training, you can call the model inside `fluid.dygraph.guard()` and inspect the output:
```python
with fluid.dygraph.guard():
    mnist = MNIST("mnist")
    id, data = list(enumerate(train_reader()))[0]
    # @@ -266,9 +268,11 @@ (lines elided in the diff view)
                           for x in data]).astype('float32')
    img = fluid.dygraph.to_variable(dy_x_data)
    print("result is: {}".format(mnist(img).numpy()))
```
Output:
```
result is: [[0.10135901 0.1051138 0.1027941 ... 0.0972859 0.10221873 0.10165327]
[0.09735426 0.09970362 0.10198303 ... 0.10134517 0.10179105 0.10025002]
[0.09539858 0.10213123 0.09543551 ... 0.10613529 0.10535969 0.097991 ]
@@ -276,11 +280,11 @@ (lines elided)
[0.10120598 0.0996111 0.10512722 ... 0.10067689 0.10088114 0.10071224]
[0.09889644 0.10033772 0.10151272 ... 0.10245881 0.09878646 0.101483 ]
[0.09097178 0.10078511 0.10198414 ... 0.10317434 0.10087223 0.09816764]]
```
5. Build the training loop. After each round of parameter updates we call `mnist.clear_gradients()` to reset the gradients:
```python
with fluid.dygraph.guard():
    epoch_num = 5
    BATCH_SIZE = 64
    # @@ -309,9 +313,7 @@ (lines elided in the diff view)
            avg_loss.backward()
            adam.minimize(avg_loss)
            mnist.clear_gradients()
```
6. Variables and the optimizer
@@ -319,6 +321,7 @@
After the backward pass, call the `minimize` method of the previously defined `Adam` optimizer object to update the parameters:
```python
with fluid.dygraph.guard():
    epoch_num = 5
    BATCH_SIZE = 64
    # @@ -360,9 +363,11 @@ (lines elided in the diff view)
    print("Final loss: {}".format(avg_loss.numpy()))
    print("_simple_img_conv_pool_1_conv2d W's mean is: {}".format(mnist._simple_img_conv_pool_1._conv2d._filter_param.numpy().mean()))
    print("_simple_img_conv_pool_1_conv2d Bias's mean is: {}".format(mnist._simple_img_conv_pool_1._conv2d._bias_param.numpy().mean()))
```
Output:
```
Loss at step 0: [2.302]
Loss at step 20: [1.616]
Loss at step 40: [1.244]
@@ -390,17 +395,19 @@ (lines elided)
Final loss: [0.164]
_simple_img_conv_pool_1_conv2d W's mean is: 0.00606656912714
_simple_img_conv_pool_1_conv2d Bias's mean is: -3.4576318285e-05
```
7. Performance

When using `fluid.dygraph.guard()`, you can pass in `fluid.CUDAPlace(0)` or `fluid.CPUPlace()` to choose the device on which DyGraph runs. Usually, if you do nothing, it will automatically adapt to your device; a minimal sketch of explicit device selection follows below.
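For example, a minimal sketch of choosing the device explicitly; the `fluid.is_compiled_with_cuda()` check is only an assumed convenience here, used to fall back to the CPU on non-GPU builds:
```python
import paddle.fluid as fluid

# Run on GPU 0 if this build supports CUDA, otherwise fall back to the CPU.
place = fluid.CUDAPlace(0) if fluid.is_compiled_with_cuda() else fluid.CPUPlace()

with fluid.dygraph.guard(place):
    # Any DyGraph code placed here executes on the selected device.
    pass
```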
## Training a model with multiple GPU cards
PaddlePaddle currently supports multi-card training through multiple processes, one process per card. During training, the first time a forward op that needs parameters is executed, the parameters on card 0 are broadcast to the other cards so that all cards hold identical parameters; after the backward pass, the resulting parameter gradients are aggregated across all cards; finally, the parameters are updated separately on each GPU card.
```python
place = fluid.CUDAPlace(fluid.dygraph.parallel.Env().dev_id)
with fluid.dygraph.guard(place):
    strategy = fluid.dygraph.parallel.prepare_context()
    mnist = MNIST("mnist")
    # @@ -436,100 +443,116 @@ (lines elided in the diff view)
            mnist.clear_gradients()
            if batch_id % 100 == 0 and batch_id != 0:
                print("epoch: {}, batch_id: {}, loss is: {}".format(epoch, batch_id, avg_loss.numpy()))
```
Converting dynamic-graph training from a single card to multiple cards requires changes in four main places:
1. Get the device ID from the environment variables, i.e.:
```python
place = fluid.CUDAPlace(fluid.dygraph.parallel.Env().dev_id)
```
2. Preprocess the original model, i.e.:
```python
strategy = fluid.dygraph.parallel.prepare_context()
mnist = MNIST("mnist")
adam = AdamOptimizer(learning_rate=0.001)
mnist = fluid.dygraph.parallel.DataParallel(mnist, strategy)
```
3. Data reading: make sure each process reads different data, i.e. the intersection of the data read by all processes is empty and their union is the complete dataset:
```python
train_reader = paddle.batch(
    paddle.dataset.mnist.train(), batch_size=BATCH_SIZE, drop_last=True)
train_reader = fluid.contrib.reader.distributed_batch_reader(
    train_reader)
```
4. Adjust the loss and aggregate the parameter gradients, i.e.:
```python
avg_loss = mnist.scale_loss(avg_loss)
avg_loss.backward()
mnist.apply_collective_grads()
```
When launching Paddle dynamic-graph multi-process multi-card training, you must specify the GPUs to use. For example, to use cards `0,1,2,3`, launch as follows:
```
python -m paddle.distributed.launch --selected_gpus=0,1,2,3 --log_dir ./mylog train.py
```
The output is:
```
----------- Configuration Arguments -----------
cluster_node_ips: 127.0.0.1
log_dir: ./mylog
node_ip: 127.0.0.1
print_config: True
selected_gpus: 0,1,2,3
started_port: 6170
training_script: train.py
training_script_args: ['--use_data_parallel', '1']
use_paddlecloud: True
------------------------------------------------
trainers_endpoints: 127.0.0.1:6170,127.0.0.1:6171,127.0.0.1:6172,127.0.0.1:6173 , node_id: 0 , current_node_ip: 127.0.0.1 , num_nodes: 1 , node_ips: ['127.0.0.1'] , nranks: 4
```
At this point, the program writes each process's output log under the ./mylog directory:
```
.
├── mylog
│ ├── workerlog.0
│ ├── workerlog.1
│ ├── workerlog.2
│ └── workerlog.3
└── train.py
```
If `--log_dir` is not specified, the program prints the output of all processes, i.e.:
```
----------- Configuration Arguments -----------
cluster_node_ips: 127.0.0.1
log_dir: None
node_ip: 127.0.0.1
print_config: True
selected_gpus: 0,1,2,3
started_port: 6170
training_script: train.py
training_script_args: ['--use_data_parallel', '1']
use_paddlecloud: True
------------------------------------------------
trainers_endpoints: 127.0.0.1:6170,127.0.0.1:6171,127.0.0.1:6172,127.0.0.1:6173 , node_id: 0 , current_node_ip: 127.0.0.1 , num_nodes: 1 , node_ips: ['127.0.0.1'] , nranks: 4
grep: warning: GREP_OPTIONS is deprecated; please use an alias or script
grep: warning: GREP_OPTIONS is deprecated; please use an alias or script
grep: warning: GREP_OPTIONS is deprecated; please use an alias or script
grep: warning: GREP_OPTIONS is deprecated; please use an alias or script
I0923 09:32:36.423513 56410 nccl_context.cc:120] init nccl context nranks: 4 local rank: 1 gpu id: 1
I0923 09:32:36.425287 56411 nccl_context.cc:120] init nccl context nranks: 4 local rank: 2 gpu id: 2
I0923 09:32:36.429337 56409 nccl_context.cc:120] init nccl context nranks: 4 local rank: 0 gpu id: 0
I0923 09:32:36.429440 56412 nccl_context.cc:120] init nccl context nranks: 4 local rank: 3 gpu id: 3
W0923 09:32:42.594097 56412 device_context.cc:198] Please NOTE: device: 3, CUDA Capability: 70, Driver API Version: 9.0, Runtime API Version: 9.0
W0923 09:32:42.605836 56412 device_context.cc:206] device: 3, cuDNN Version: 7.5.
W0923 09:32:42.632463 56410 device_context.cc:198] Please NOTE: device: 1, CUDA Capability: 70, Driver API Version: 9.0, Runtime API Version: 9.0
W0923 09:32:42.637948 56410 device_context.cc:206] device: 1, cuDNN Version: 7.5.
W0923 09:32:42.648674 56411 device_context.cc:198] Please NOTE: device: 2, CUDA Capability: 70, Driver API Version: 9.0, Runtime API Version: 9.0
W0923 09:32:42.654021 56411 device_context.cc:206] device: 2, cuDNN Version: 7.5.
W0923 09:32:43.048696 56409 device_context.cc:198] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 9.0, Runtime API Version: 9.0
W0923 09:32:43.053236 56409 device_context.cc:206] device: 0, cuDNN Version: 7.5.
start data reader (trainers_num: 4, trainer_id: 2)
start data reader (trainers_num: 4, trainer_id: 3)
start data reader (trainers_num: 4, trainer_id: 1)
start data reader (trainers_num: 4, trainer_id: 0)
Loss at epoch 0 step 0: [0.57390565]
Loss at epoch 0 step 0: [0.57523954]
Loss at epoch 0 step 0: [0.575606]
Loss at epoch 0 step 0: [0.5767452]
```
## Saving model parameters
@@ -549,8 +572,8 @@
The following code shows how, in the handwritten-digit-recognition task, to save parameters and load the saved parameters to continue training.
```python
with fluid.dygraph.guard():
    epoch_num = 5
    BATCH_SIZE = 64
    # @@ -601,11 +624,14 @@ (lines elided in the diff view)
        if (not np.array_equal(value.numpy(), dy_param_init_value[value.name])) or (not np.isfinite(value.numpy().all())) or (np.isnan(value.numpy().any())):
            success = False
    print("model save and load success? {}".format(success))
```
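Because most of the full example above is collapsed in this diff view, here is only a minimal sketch of the save/load calls themselves. The `fluid.dygraph.save_persistables(mnist.state_dict(), "save_dir")` usage is taken from this document; the `fluid.dygraph.load_persistables` and `Layer.load_dict` calls are assumptions about the matching load-side API in this PaddlePaddle version:
```python
import paddle.fluid as fluid

with fluid.dygraph.guard():
    mnist = MNIST("mnist")
    # ... train for some steps ...

    # Save the model's persistable parameters under "save_dir"
    # (confirmed usage elsewhere in this document).
    fluid.dygraph.save_persistables(mnist.state_dict(), "save_dir")

    # Load them back into a model instance.
    # load_persistables/load_dict are assumed APIs for this Paddle version.
    restore = fluid.dygraph.load_persistables("save_dir")
    mnist.load_dict(restore)
```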
Note that if you train with multiple cards, only one process needs to save the model parameters, so when saving you must specify which process's parameters to save, for example:
```python
if fluid.dygraph.parallel.Env().local_rank == 0:
    fluid.dygraph.save_persistables(mnist.state_dict(), "save_dir")
```
## Model evaluation
@@ -617,8 +643,8 @@
In `inference_mnist` we start another `fluid.dygraph.guard()` and, within its context, load the previously saved checkpoint to run inference. Likewise, before running inference you need to call `YourModel.eval()` to switch to evaluation mode.
```python
def test_mnist(reader, model, batch_size):
    acc_set = []
    avg_loss_set = []
    for batch_id, data in enumerate(reader()):
    # @@ -643,7 +669,7 @@ (lines elided in the diff view)
    return avg_loss_val_mean, acc_val_mean


def inference_mnist():
    with fluid.dygraph.guard():
        mnist_infer = MNIST("mnist")
        # load checkpoint
        # @@ -668,7 +694,7 @@ (lines elided in the diff view)
        lab = np.argsort(results.numpy())
        print("Inference result of image/infer_3.png is: %d" % lab[0][-1])


with fluid.dygraph.guard():
    epoch_num = 1
    BATCH_SIZE = 64
    mnist = MNIST("mnist")
    # @@ -717,50 +743,53 @@ (lines elided in the diff view)
    print("checkpoint saved")
    inference_mnist()
```
Output:
```
Loss at epoch 0 step 0: [2.2991252]
Loss at epoch 0 step 100: [0.15491392]
Loss at epoch 0 step 200: [0.13315125]
Loss at epoch 0 step 300: [0.10253005]
Loss at epoch 0 step 400: [0.04266362]
Loss at epoch 0 step 500: [0.08894891]
Loss at epoch 0 step 600: [0.08999012]
Loss at epoch 0 step 700: [0.12975612]
Loss at epoch 0 step 800: [0.15257305]
Loss at epoch 0 step 900: [0.07429226]
Loss at epoch 0 , Test avg_loss is: 0.05995981965082674, acc is: 0.9794671474358975
checkpoint saved
No optimizer loaded. If you didn't save optimizer, please ignore this. The program can still work with new optimizer.
checkpoint loaded
Inference result of image/infer_3.png is: 3
```
## Writing compatible model code
Taking the handwritten-digit-recognition example from the previous step, the dynamic-graph model code can be used directly as model code in static-graph mode; at execution time you simply use PaddlePaddle's static-graph execution. Using the static-graph `executor` as an example, the model code above can be reused as-is and executed with an `Executor`:
```python
epoch_num = 1
BATCH_SIZE = 64
exe = fluid.Executor(fluid.CPUPlace())

mnist = MNIST("mnist")
sgd = fluid.optimizer.SGDOptimizer(learning_rate=1e-3)
train_reader = paddle.batch(
    paddle.dataset.mnist.train(), batch_size=BATCH_SIZE, drop_last=True)

img = fluid.layers.data(
    name='pixel', shape=[1, 28, 28], dtype='float32')
label = fluid.layers.data(name='label', shape=[1], dtype='int64')
cost = mnist(img)
loss = fluid.layers.cross_entropy(cost, label)
avg_loss = fluid.layers.mean(loss)
sgd.minimize(avg_loss)

out = exe.run(fluid.default_startup_program())

for epoch in range(epoch_num):
    for batch_id, data in enumerate(train_reader()):
        static_x_data = np.array(
            [x[0].reshape(1, 28, 28)
        # @@ -779,6 +808,4 @@ (lines elided in the diff view)
        if batch_id % 100 == 0 and batch_id != 0:
            print("epoch: {}, batch_id: {}, loss: {}".format(epoch, batch_id, static_out))
```