!68 [Mixed-precision] Fix the runing error of mix_precision

Merge pull request !68 from Xiaoda/master

!68 [Mixed-precision] Fix the runing error of mix_precision
Merge pull request !68 from Xiaoda/master
22257173 · mindspore-ci-bot · Gitee · 6a17ca57 · 6bf02fb7 · 22257173
2 changed file
--- a/tutorials/source_en/advanced_use/mixed_precision.md
+++ b/tutorials/source_en/advanced_use/mixed_precision.md
@@ -15,6 +15,8 @@
 The mixed precision training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data formats, and maintains the network precision achieved by the single-precision training at the same time. 
 Mixed precision training can accelerate the computation process, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware.

+For FP16 operators, if the input data type is FP32, the backend of MindSpore will automatically handle it with reduced precision. Users could check the reduced-precision operators by enabling INFO log and then searching 'reduce precision'.
+
 ## Computation Process

 The following figure shows the typical computation process of mixed precision in MindSpore.
@@ -46,9 +48,20 @@ The procedure is as follows:
 A code example is as follows:

 ```python
+import numpy as np
+import mindspore.nn as nn
+import mindspore.common.dtype as mstype
+from mindspore import Tensor, context
+from mindspore.ops import operations as P
+from mindspore.nn import WithLossCell
+from mindspore.nn import Momentum
+from mindspore.nn.loss import MSELoss
 # The interface of Auto_mixed precision
 from mindspore.train import amp

+context.set_context(mode=context.GRAPH_MODE)
+context.set_context(device_target="Ascend")
+
 # Define network
 class LeNet5(nn.Cell):
    def __init__(self):
@@ -59,7 +72,7 @@ class LeNet5(nn.Cell):
        self.fc2 = nn.Dense(120, 84)
        self.fc3 = nn.Dense(84, 10)
        self.relu = nn.ReLU()
-        self.max_pool2d = nn.MaxPool2d(kernel_size=2)
+        self.max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2)
        self.flatten = P.Flatten()

    def construct(self, x):
@@ -95,15 +108,28 @@ output = train_network(predict, label, scaling_sens)
 MindSpore also supports manual mixed precision. It is assumed that only one dense layer in the network needs to be calculated by using FP32, and other layers are calculated by using FP16. The mixed precision is configured in the granularity of cell. The default format of a cell is FP32.

 The following is the procedure for implementing manual mixed precision:
-1. Define the network. This step is similar to step 2 in the automatic mixed precision. NoteThe fc3 operator in LeNet needs to be manually set to FP32.
+1. Define the network. This step is similar to step 2 in the automatic mixed precision. 

-2. Configure the mixed precision. Use net.to_float(mstype.float16) to set all operators of the cell and its sub-cells to FP16.
+2. Configure the mixed precision. Use net.to_float(mstype.float16) to set all operators of the cell and its sub-cells to FP16. Then, configure the fc3 to FP32.

 3. Use TrainOneStepWithLossScaleCell to encapsulate the network model and optimizer.

 A code example is as follows:

 ```python
+import numpy as np
+import mindspore.nn as nn
+import mindspore.common.dtype as mstype
+from mindspore import Tensor, context
+from mindspore.ops import operations as P
+from mindspore.nn import WithLossCell, TrainOneStepWithLossScaleCell
+from mindspore.nn import Momentum
+from mindspore.nn.loss import MSELoss
+
+
+context.set_context(mode=context.GRAPH_MODE)
+context.set_context(device_target="Ascend")
+
 # Define network
 class LeNet5(nn.Cell):
    def __init__(self):
@@ -112,9 +138,9 @@ class LeNet5(nn.Cell):
        self.conv2 = nn.Conv2d(6, 16, 5, pad_mode='valid')
        self.fc1 = nn.Dense(16 * 5 * 5, 120)
        self.fc2 = nn.Dense(120, 84)
-        self.fc3 = nn.Dense(84, 10).to_float(mstype.float32)
+        self.fc3 = nn.Dense(84, 10)
        self.relu = nn.ReLU()
-        self.max_pool2d = nn.MaxPool2d(kernel_size=2)
+        self.max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2)
        self.flatten = P.Flatten()

    def construct(self, x):
@@ -129,6 +155,7 @@ class LeNet5(nn.Cell):
 # Initialize network and set mixing precision
 net = LeNet5()
 net.to_float(mstype.float16)
+net.fc3.to_float(mstype.float32)

 # Define training data, label and sens
 predict = Tensor(np.ones([1, 1, 32, 32]).astype(np.float32) * 0.01)
@@ -141,6 +168,7 @@ loss = MSELoss()
 optimizer = Momentum(params=net.trainable_params(), learning_rate=0.1, momentum=0.9)
 net_with_loss = WithLossCell(net, loss)
 train_network = TrainOneStepWithLossScaleCell(net_with_loss, optimizer)
+train_network.set_train()

 # Run training
 output = train_network(predict, label, scaling_sens)

--- a/tutorials/source_zh_cn/advanced_use/mixed_precision.md
+++ b/tutorials/source_zh_cn/advanced_use/mixed_precision.md
@@ -14,6 +14,8 @@

 混合精度训练方法是通过混合使用单精度和半精度数据格式来加速深度神经网络训练的过程，同时保持了单精度训练所能达到的网络精度。混合精度训练能够加速计算过程，同时减少内存使用和存取，并使得在特定的硬件上可以训练更大的模型或batch size。

+对于FP16的算子，若给定的数据类型是FP32，MindSpore框架的后端会进行降精度处理。用户可以开启INFO日志，并通过搜索关键字“reduce precision”查看降精度处理的算子。
+
 ## 计算流程

 MindSpore混合精度典型的计算流程如下图所示：
@@ -45,9 +47,20 @@ MindSpore混合精度典型的计算流程如下图所示：
 代码样例如下：

 ```python
+import numpy as np
+import mindspore.nn as nn
+import mindspore.common.dtype as mstype
+from mindspore import Tensor, context
+from mindspore.ops import operations as P
+from mindspore.nn import WithLossCell
+from mindspore.nn import Momentum
+from mindspore.nn.loss import MSELoss
 # The interface of Auto_mixed precision
 from mindspore.train import amp

+context.set_context(mode=context.GRAPH_MODE)
+context.set_context(device_target="Ascend")
+
 # Define network
 class LeNet5(nn.Cell):
    def __init__(self):
@@ -58,7 +71,7 @@ class LeNet5(nn.Cell):
        self.fc2 = nn.Dense(120, 84)
        self.fc3 = nn.Dense(84, 10)
        self.relu = nn.ReLU()
-        self.max_pool2d = nn.MaxPool2d(kernel_size=2)
+        self.max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2)
        self.flatten = P.Flatten()

    def construct(self, x):
@@ -94,15 +107,28 @@ output = train_network(predict, label, scaling_sens)
 MindSpore还支持手动混合精度。假定在网络中只有一个Dense Layer要用FP32计算，其他Layer都用FP16计算。混合精度配置以Cell为粒度，Cell默认是FP32类型。

 以下是一个手动混合精度的实现步骤：
-1. 定义网络: 该步骤与自动混合精度中的步骤2类似；注意：在LeNet中的fc3算子，需要手动配置成FP32；
+1. 定义网络: 该步骤与自动混合精度中的步骤2类似；

-2. 配置混合精度: LeNet通过net.to_float(mstype.float16)，把该Cell及其子Cell中所有的算子都配置成FP16；
+2. 配置混合精度: LeNet通过net.to_float(mstype.float16)，把该Cell及其子Cell中所有的算子都配置成FP16；然后，将LeNet中的fc3算子手动配置成FP32；

 3. 使用TrainOneStepWithLossScaleCell封装网络模型和优化器。

 代码样例如下：

 ```python
+import numpy as np
+import mindspore.nn as nn
+import mindspore.common.dtype as mstype
+from mindspore import Tensor, context
+from mindspore.ops import operations as P
+from mindspore.nn import WithLossCell, TrainOneStepWithLossScaleCell
+from mindspore.nn import Momentum
+from mindspore.nn.loss import MSELoss
+
+
+context.set_context(mode=context.GRAPH_MODE)
+context.set_context(device_target="Ascend")
+
 # Define network
 class LeNet5(nn.Cell):
    def __init__(self):
@@ -111,9 +137,9 @@ class LeNet5(nn.Cell):
        self.conv2 = nn.Conv2d(6, 16, 5, pad_mode='valid')
        self.fc1 = nn.Dense(16 * 5 * 5, 120)
        self.fc2 = nn.Dense(120, 84)
-        self.fc3 = nn.Dense(84, 10).to_float(mstype.float32)
+        self.fc3 = nn.Dense(84, 10)
        self.relu = nn.ReLU()
-        self.max_pool2d = nn.MaxPool2d(kernel_size=2)
+        self.max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2)
        self.flatten = P.Flatten()

    def construct(self, x):
@@ -128,6 +154,7 @@ class LeNet5(nn.Cell):
 # Initialize network and set mixing precision
 net = LeNet5()
 net.to_float(mstype.float16)
+net.fc3.to_float(mstype.float32)

 # Define training data, label and sens
 predict = Tensor(np.ones([1, 1, 32, 32]).astype(np.float32) * 0.01)
@@ -140,6 +167,7 @@ loss = MSELoss()
 optimizer = Momentum(params=net.trainable_params(), learning_rate=0.1, momentum=0.9)
 net_with_loss = WithLossCell(net, loss)
 train_network = TrainOneStepWithLossScaleCell(net_with_loss, optimizer)
+train_network.set_train()

 # Run training
 output = train_network(predict, label, scaling_sens)