magicwindyyd / mindspore
Forked from MindSpore / mindspore
Commit 2e1d2e8e
Authored on Jul 30, 2020 by mindspore-ci-bot; committed via Gitee on Jul 30, 2020

!3650 add optimizer formula and verification
Merge pull request !3650 from lijiaqi/add_calculation_formula
Parents: 71a3c363 dc7cc66b
Showing 2 changed files with 34 additions and 1 deletion (+34, -1):

mindspore/nn/optim/momentum.py (+13, -0)
mindspore/nn/optim/sgd.py (+21, -1)

mindspore/nn/optim/momentum.py @ 2e1d2e8e
...
...
@@ -53,6 +53,19 @@ class Momentum(Optimizer):
To improve parameter groups performance, the customized order of parameters is supported.

.. math::
    v_{t} = v_{t-1} \ast u + gradients

If use_nesterov is True:

.. math::
    p_{t} = grad \ast lr + v_{t} \ast u \ast lr

If use_nesterov is False:

.. math::
    p_{t} = lr \ast v_{t}

Here, grad, lr, p, v, and u denote the gradients, learning_rate, parameters, accum, and momentum, respectively.
Args:
params (Union[list[Parameter], list[dict]]): When the `params` is a list of `Parameter` which will be updated,
the element in `params` should be class `Parameter`. When the `params` is a list of `dict`, the "params",
...
...
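As a quick sanity check of the formulas in this hunk, here is a minimal NumPy sketch of one Momentum update. The helper name is ours, and we assume (as with the underlying ApplyMomentum operator) that the computed quantity is the step subtracted from the parameter; none of this is part of the commit itself:

    # Minimal sketch of the documented update, not MindSpore's ApplyMomentum kernel.
    import numpy as np

    def momentum_step(p, v, grad, lr, u, use_nesterov=False):
        v = v * u + grad                      # v_{t} = v_{t-1} * u + gradients
        if use_nesterov:
            step = grad * lr + v * u * lr     # p_{t} = grad * lr + v_{t} * u * lr
        else:
            step = lr * v                     # p_{t} = lr * v_{t}
        return p - step, v                    # the step is subtracted from the parameter

For example, with lr=0.1, u=0.9, a constant gradient of 1.0, and v starting at 0.0, the first non-Nesterov step moves the parameter by 0.1 and the second by 0.19.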
mindspore/nn/optim/sgd.py @ 2e1d2e8e
...
...
@@ -46,6 +46,21 @@ class SGD(Optimizer):
To improve parameter groups performance, the customized order of parameters is supported.

.. math::
    v_{t+1} = u \ast v_{t} + gradient \ast (1 - dampening)

If nesterov is True:

.. math::
    p_{t+1} = p_{t} - lr \ast (gradient + u \ast v_{t+1})

If nesterov is False:

.. math::
    p_{t+1} = p_{t} - lr \ast v_{t+1}

Note that for the first step, v_{t+1} = gradient.

Here, p, v, and u denote the parameters, accum, and momentum, respectively.
Args:
params (Union[list[Parameter], list[dict]]): When the `params` is a list of `Parameter` which will be updated,
the element in `params` should be class `Parameter`. When the `params` is a list of `dict`, the "params",
...
...
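For reference, a minimal NumPy sketch of the SGD update documented in the hunk above, including the first-step special case. The helper name is hypothetical; this is not the actual MindSpore SGD operator:

    import numpy as np

    def sgd_step(p, v, grad, lr, u, dampening=0.0, nesterov=False, first_step=False):
        # First step: v_{t+1} = gradient; afterwards: v_{t+1} = u * v_{t} + gradient * (1 - dampening)
        v = grad if first_step else u * v + grad * (1.0 - dampening)
        if nesterov:
            p = p - lr * (grad + u * v)       # p_{t+1} = p_{t} - lr * (gradient + u * v_{t+1})
        else:
            p = p - lr * v                    # p_{t+1} = p_{t} - lr * v_{t+1}
        return p, v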
@@ -74,7 +89,8 @@ class SGD(Optimizer):
        momentum (float): A floating point value of momentum. It should be at least 0.0. Default: 0.0.
        dampening (float): A floating point value of dampening for momentum. It should be at least 0.0. Default: 0.0.
        weight_decay (float): Weight decay (L2 penalty). It should be in range [0.0, 1.0]. Default: 0.0.
-       nesterov (bool): Enables the Nesterov momentum. Default: False.
+       nesterov (bool): Enables the Nesterov momentum. If nesterov is used, momentum must be greater than 0,
+           and dampening must equal 0. Default: False.
        loss_scale (float): A floating point value for the loss scale. It should not be less than 1.0. Default: 1.0.
Inputs:
...
...
@@ -118,6 +134,10 @@ class SGD(Optimizer):
        if isinstance(momentum, float) and momentum < 0.0:
            raise ValueError("momentum should be at least 0.0, but got momentum {}".format(momentum))

        if nesterov and (momentum <= 0 or dampening != 0):
            raise ValueError("If use nesterov, momentum must be positive and dampening must equal to 0, "
                             "but got momentum {}, dampening {}".format(momentum, dampening))

        if isinstance(dampening, int):
            dampening = float(dampening)
        if not isinstance(dampening, float):
...
...
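The new check can be exercised directly. A usage sketch under assumed setup (the parameter list here is illustrative, not from the commit):

    import numpy as np
    from mindspore import Parameter, Tensor, nn

    params = [Parameter(Tensor(np.zeros(3, np.float32)), name="w")]

    # nesterov=True with the defaults momentum=0.0 and dampening=0.0 now raises ValueError.
    try:
        nn.SGD(params, learning_rate=0.1, nesterov=True)
    except ValueError as e:
        print(e)  # "If use nesterov, momentum must be positive and dampening must equal to 0, ..."

    # Valid: positive momentum, dampening left at 0.0.
    opt = nn.SGD(params, learning_rate=0.1, momentum=0.9, nesterov=True)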