Crayon鑫 / Paddle (forked from PaddlePaddle / Paddle; in sync with the fork source)
Commit 977764f2
Authored Jul 19, 2018 by fengjiayi

Fix the other lr_decay

Parent: 381bacaa
Showing 1 changed file with 32 additions and 39 deletions

python/paddle/fluid/layers/learning_rate_scheduler.py (+32 -39)
@@ -62,10 +62,10 @@ def noam_decay(d_model, warmup_steps):
         The decayed learning rate.
     """
     global_step = _decay_step_counter(1)
-    with init_on_cpu():
-        a = global_step**-0.5
-        b = (warmup_steps**-1.5) * global_step
-        lr_value = (d_model**-0.5) * ops.elementwise_min(a, b)
+
+    a = global_step**-0.5
+    b = (warmup_steps**-1.5) * global_step
+    lr_value = (d_model**-0.5) * ops.elementwise_min(a, b)
 
     return lr_value
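Review note: the hunk above only drops the `init_on_cpu()` wrapper; the expression itself is unchanged. As a sanity check, here is a minimal plain-Python sketch of the same arithmetic. The function name `noam_lr` is hypothetical, and it assumes the step counter starts at 1, as the `_decay_step_counter(1)` call suggests. It is not the fluid implementation, which builds graph operators rather than computing floats.

```python
def noam_lr(step, d_model, warmup_steps):
    """Plain-Python mirror of the expression in noam_decay (hypothetical helper).

    Assumes `step` starts at 1; step 0 would raise on 0 ** -0.5.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    a = step ** -0.5                       # decay branch
    b = (warmup_steps ** -1.5) * step      # linear warmup branch
    return (d_model ** -0.5) * min(a, b)


if __name__ == "__main__":
    # The rate rises during warmup, peaks near step == warmup_steps, then decays.
    for s in (1, 2000, 4000, 8000):
        print(s, noam_lr(s, d_model=512, warmup_steps=4000))
```

The two branches meet at `step == warmup_steps`, which is where the schedule switches from linear warmup to inverse-square-root decay.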
@@ -108,12 +108,10 @@ def exponential_decay(learning_rate, decay_steps, decay_rate, staircase=False):
     """
     global_step = _decay_step_counter()
 
-    with init_on_cpu():
-        # update learning_rate
-        div_res = global_step / decay_steps
-        if staircase:
-            div_res = ops.floor(div_res)
-        decayed_lr = learning_rate * (decay_rate**div_res)
+    div_res = global_step / decay_steps
+    if staircase:
+        div_res = ops.floor(div_res)
+    decayed_lr = learning_rate * (decay_rate**div_res)
 
     return decayed_lr
@@ -138,11 +136,10 @@ def natural_exp_decay(learning_rate, decay_steps, decay_rate, staircase=False):
     """
     global_step = _decay_step_counter()
 
-    with init_on_cpu():
-        div_res = global_step / decay_steps
-        if staircase:
-            div_res = ops.floor(div_res)
-        decayed_lr = learning_rate * ops.exp(-1 * decay_rate * div_res)
+    div_res = global_step / decay_steps
+    if staircase:
+        div_res = ops.floor(div_res)
+    decayed_lr = learning_rate * ops.exp(-1 * decay_rate * div_res)
 
     return decayed_lr
@@ -184,12 +181,11 @@ def inverse_time_decay(learning_rate, decay_steps, decay_rate, staircase=False):
     """
     global_step = _decay_step_counter()
 
-    with init_on_cpu():
-        div_res = global_step / decay_steps
-        if staircase:
-            div_res = ops.floor(div_res)
+    div_res = global_step / decay_steps
+    if staircase:
+        div_res = ops.floor(div_res)
 
-        decayed_lr = learning_rate / (1 + decay_rate * div_res)
+    decayed_lr = learning_rate / (1 + decay_rate * div_res)
 
     return decayed_lr
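Review note: `exponential_decay`, `natural_exp_decay` and `inverse_time_decay` share the same `div_res` computation and differ only in the final expression. The sketch below restates the three formulas in plain Python for quick comparison. All function names here are hypothetical, and the code works on Python floats instead of fluid variables, so it illustrates the arithmetic only.

```python
import math


def _ratio(step, decay_steps, staircase):
    """global_step / decay_steps, floored when staircase is set."""
    ratio = step / decay_steps
    return math.floor(ratio) if staircase else ratio


def exponential_lr(lr, step, decay_steps, decay_rate, staircase=False):
    return lr * decay_rate ** _ratio(step, decay_steps, staircase)


def natural_exp_lr(lr, step, decay_steps, decay_rate, staircase=False):
    return lr * math.exp(-1 * decay_rate * _ratio(step, decay_steps, staircase))


def inverse_time_lr(lr, step, decay_steps, decay_rate, staircase=False):
    return lr / (1 + decay_rate * _ratio(step, decay_steps, staircase))


if __name__ == "__main__":
    # With staircase=True the rate only changes at multiples of decay_steps.
    for s in (0, 5, 10, 15, 20):
        print(s,
              exponential_lr(0.1, s, 10, 0.5, staircase=True),
              exponential_lr(0.1, s, 10, 0.5))
```

With `staircase=True` the rate is piecewise constant and drops only at multiples of `decay_steps`; otherwise it decays continuously.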
@@ -224,25 +220,22 @@ def polynomial_decay(learning_rate,
     """
     global_step = _decay_step_counter()
 
-    with init_on_cpu():
-        if cycle:
-            div_res = ops.ceil(global_step / decay_steps)
-            zero_var = tensor.fill_constant(
-                shape=[1], dtype='float32', value=0.0)
-            one_var = tensor.fill_constant(
-                shape=[1], dtype='float32', value=1.0)
-
-            with control_flow.Switch() as switch:
-                with switch.case(global_step == zero_var):
-                    tensor.assign(input=one_var, output=div_res)
-            decay_steps = decay_steps * div_res
-        else:
-            decay_steps_var = tensor.fill_constant(
-                shape=[1], dtype='float32', value=float(decay_steps))
-            global_step = ops.elementwise_min(x=global_step, y=decay_steps_var)
-
-        decayed_lr = (learning_rate - end_learning_rate) * \
-            ((1 - global_step / decay_steps) ** power) + end_learning_rate
+    if cycle:
+        div_res = ops.ceil(global_step / decay_steps)
+        zero_var = tensor.fill_constant(shape=[1], dtype='float32', value=0.0)
+        one_var = tensor.fill_constant(shape=[1], dtype='float32', value=1.0)
+
+        with control_flow.Switch() as switch:
+            with switch.case(global_step == zero_var):
+                tensor.assign(input=one_var, output=div_res)
+        decay_steps = decay_steps * div_res
+    else:
+        decay_steps_var = tensor.fill_constant(
+            shape=[1], dtype='float32', value=float(decay_steps))
+        global_step = ops.elementwise_min(x=global_step, y=decay_steps_var)
+
+    decayed_lr = (learning_rate - end_learning_rate) * \
+        ((1 - global_step / decay_steps) ** power) + end_learning_rate
     return decayed_lr
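Review note: the `cycle` branch of `polynomial_decay` is the least obvious part of this file, so a plain-Python restatement may help when checking the refactor. The function name `polynomial_lr` is hypothetical and every parameter is passed explicitly, because the full signature is truncated in the hunk header. The `step == 0` special case mirrors the `Switch`/`assign` block, which forces `div_res` to 1 so that `decay_steps` is not multiplied by zero.

```python
import math


def polynomial_lr(lr, end_lr, step, decay_steps, power, cycle):
    """Plain-Python mirror of the polynomial_decay arithmetic (hypothetical helper)."""
    if cycle:
        div_res = math.ceil(step / decay_steps)
        if step == 0:
            div_res = 1  # mirrors the Switch case: avoid decay_steps * 0
        decay_steps = decay_steps * div_res
    else:
        # Clamp so the rate stays at end_lr once decay_steps is reached.
        step = min(step, decay_steps)
    return (lr - end_lr) * ((1 - step / decay_steps) ** power) + end_lr


if __name__ == "__main__":
    for s in (0, 50, 100, 150, 200):
        print(s,
              polynomial_lr(0.1, 0.01, s, 100, 1.0, cycle=False),
              polynomial_lr(0.1, 0.01, s, 100, 1.0, cycle=True))
```

Without `cycle`, the rate reaches `end_lr` at `decay_steps` and stays there. With `cycle`, the decay horizon is stretched to the next multiple of `decay_steps`, so the rate rises again after each boundary.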