!5063 modify sgd and momentum and WithGradCell comments

Merge pull request !5063 from lijiaqi/momentum_and_sgd

15ae3702 · mindspore-ci-bot · Gitee · 776d0034 · d9573099 · 15ae3702
Showing 9 additions and 9 deletions

mindspore/nn/optim/momentum.py +4 -4

mindspore/nn/optim/sgd.py +4 -4

mindspore/nn/wrap/cell_wrapper.py +1 -1

--- a/mindspore/nn/optim/momentum.py
+++ b/mindspore/nn/optim/momentum.py
@@ -56,12 +56,12 @@ class Momentum(Optimizer):
    .. math::
            v_{t} = v_{t-1} \ast u + gradients

-        If use_nesterov is True:
-            .. math::
+    If use_nesterov is True:
+        .. math::
                p_{t} =  p_{t-1} - (grad \ast lr + v_{t} \ast u \ast lr)

-        If use_nesterov is Flase:
-            .. math::
+    If use_nesterov is Flase:
+        .. math::
                p_{t} = p_{t-1} - lr \ast v_{t}

    Here: where grad, lr, p, v and u denote the gradients, learning_rate, params, moments, and momentum respectively.
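
For reviewers checking the re-indented formulas, the two Momentum branches in the docstring can be sketched in plain Python. This is a minimal scalar illustration under stated assumptions, not MindSpore's kernel: the function name `momentum_step` and its argument names are hypothetical, and only the arithmetic is taken from the docstring.

```python
def momentum_step(param, grad, moment, lr, momentum, use_nesterov=False):
    """One scalar update following the Momentum docstring formulas.

    Hypothetical helper for illustration only; not part of MindSpore.
    """
    # v_t = v_{t-1} * u + gradients
    moment = moment * momentum + grad
    if use_nesterov:
        # p_t = p_{t-1} - (grad * lr + v_t * u * lr)
        param = param - (grad * lr + moment * momentum * lr)
    else:
        # p_t = p_{t-1} - lr * v_t
        param = param - lr * moment
    return param, moment


if __name__ == "__main__":
    # Same starting point for both branches so the difference is visible.
    print(momentum_step(1.0, 0.5, 0.0, lr=0.1, momentum=0.9))
    print(momentum_step(1.0, 0.5, 0.0, lr=0.1, momentum=0.9, use_nesterov=True))
```

With identical inputs, the Nesterov branch takes a larger step because it adds the extra `v_t * u * lr` term on top of `grad * lr`.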

--- a/mindspore/nn/optim/sgd.py
+++ b/mindspore/nn/optim/sgd.py
@@ -49,12 +49,12 @@ class SGD(Optimizer):
    .. math::
            v_{t+1} = u \ast v_{t} + gradient \ast (1-dampening)

-        If nesterov is True:
-            .. math::
+    If nesterov is True:
+        .. math::
                p_{t+1} = p_{t} - lr \ast (gradient + u \ast v_{t+1})

-        If nesterov is Flase:
-            .. math::
+    If nesterov is Flase:
+        .. math::
                p_{t+1} = p_{t} - lr \ast v_{t+1}

    To be noticed, for the first step, v_{t+1} = gradient
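
The SGD formulas in the docstring, including the first-step rule, can be sketched the same way. This is a scalar illustration only: `sgd_step` and its signature are hypothetical, and using `velocity=None` to mark the first step is an assumption made for this sketch, not how MindSpore tracks state.

```python
def sgd_step(param, grad, velocity, lr, momentum=0.0, dampening=0.0, nesterov=False):
    """One scalar update following the SGD docstring formulas.

    Hypothetical helper for illustration only; not part of MindSpore.
    Passing velocity=None marks the first step (an assumption of this sketch).
    """
    if velocity is None:
        # First step: v_{t+1} = gradient
        velocity = grad
    else:
        # v_{t+1} = u * v_t + gradient * (1 - dampening)
        velocity = momentum * velocity + grad * (1.0 - dampening)
    if nesterov:
        # p_{t+1} = p_t - lr * (gradient + u * v_{t+1})
        param = param - lr * (grad + momentum * velocity)
    else:
        # p_{t+1} = p_t - lr * v_{t+1}
        param = param - lr * velocity
    return param, velocity


if __name__ == "__main__":
    p, v = 1.0, None
    for _ in range(2):
        p, v = sgd_step(p, 0.5, v, lr=0.1, momentum=0.9, dampening=0.5)
        print(p, v)
```

Note that dampening has no effect on the first step, since the velocity is initialized directly from the gradient.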

--- a/mindspore/nn/wrap/cell_wrapper.py
+++ b/mindspore/nn/wrap/cell_wrapper.py
@@ -82,7 +82,7 @@ class WithGradCell(Cell):

    Wraps the network with backward cell to compute gradients. A network with a loss function is necessary
    as argument. If loss function in None, the network must be a wrapper of network and loss function. This
-    Cell accepts *inputs as inputs and returns gradients for each trainable parameter.
+    Cell accepts '*inputs' as inputs and returns gradients for each trainable parameter.

    Note:
        Run in PyNative mode.