机器未来 / Paddle
Forked from PaddlePaddle / Paddle
Commit 5c12c5eb
Authored July 12, 2018 by qiaolongfei

update distribute transformer for adam and adamax optimizer

Parent: 6ff7f238
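For context, a rough usage sketch (not part of this commit; the network, endpoints, and hyperparameters are made up) of how the distribute transpiler is typically driven with an Adam optimizer in the Fluid API of roughly this era. Adam is what creates the beta1_pow_acc / beta2_pow_acc accumulators and the global scale ops this change stops special-casing:

import paddle.fluid as fluid

# toy regression network; names and sizes are illustrative only
x = fluid.layers.data(name="x", shape=[13], dtype="float32")
y = fluid.layers.data(name="y", shape=[1], dtype="float32")
y_pred = fluid.layers.fc(input=x, size=1)
loss = fluid.layers.mean(fluid.layers.square_error_cost(input=y_pred, label=y))

# Adam (or Adamax) appends beta1_pow_acc / beta2_pow_acc accumulators and
# the small global "scale" ops that update them each step
opt = fluid.optimizer.Adam(learning_rate=1e-3)
opt.minimize(loss)

# split the program into trainer / parameter-server programs
t = fluid.DistributeTranspiler()
t.transpile(trainer_id=0,
            pservers="127.0.0.1:6170,127.0.0.1:6171",
            trainers=2)
pserver_prog = t.get_pserver_program("127.0.0.1:6170")
trainer_prog = t.get_trainer_program()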
Showing 1 changed file with 5 additions and 30 deletions
python/paddle/fluid/transpiler/distribute_transpiler.py  (+5, −30)
@@ -377,11 +377,6 @@ class DistributeTranspiler(object):
         # append it into the sub program.
         global_ops = []
-        # HACK: optimization global ops only used to scale beta1 and beta2
-        # replace it with dependency engine.
-        for op in self.optimize_ops:
-            if self._is_adam_connected_op(op):
-                global_ops.append(op)
 
         def __append_optimize_op__(op, block, grad_to_block_id, merged_var, lr_ops):
@@ -1289,22 +1284,16 @@ class DistributeTranspiler(object):
         # If one op's input is another op's output or
         # one op's output is another op's input, we say
         # the two operator is connected.
-        def _append_inname_remove_beta(varname_list):
+        def _append_inname(varname_list):
             op_input_names = []
             for in_name in varname_list:
-                # HACK: remove beta1 and beta2 to avoid let all
-                # ops connected.
-                if in_name.startswith("beta2_pow_acc") or \
-                        in_name.startswith("beta1_pow_acc"):
-                    continue
-                else:
-                    op_input_names.append(in_name)
+                op_input_names.append(in_name)
             return op_input_names
 
-        op1_input_names = _append_inname_remove_beta(op1.desc.input_arg_names())
+        op1_input_names = _append_inname(op1.desc.input_arg_names())
         op1_output_names = op1.desc.output_arg_names()
 
-        op2_input_names = _append_inname_remove_beta(op2.desc.input_arg_names())
+        op2_input_names = _append_inname(op2.desc.input_arg_names())
         op2_output_names = op2.desc.output_arg_names()
 
         if set(op1_output_names) & set(op2_input_names) or \
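With the beta1_pow_acc / beta2_pow_acc filtering gone, the connectedness test in this block reduces to a plain set intersection over input and output argument names. A minimal standalone sketch of that idea (the dict-based ops below are stand-ins for Paddle operator descs, not the real API):

def is_connected(op1, op2):
    # connected iff one op's outputs overlap the other op's inputs
    return bool(set(op1["outputs"]) & set(op2["inputs"])) or \
           bool(set(op2["outputs"]) & set(op1["inputs"]))

scale_op = {"inputs": ["beta1_pow_acc_0"], "outputs": ["beta1_pow_acc_0"]}
adam_op = {"inputs": ["param_0", "grad_0", "beta1_pow_acc_0"],
           "outputs": ["param_0", "moment1_0"]}

# before this change beta*_pow_acc names were stripped from the inputs,
# so these two ops did not count as connected; now they do
assert is_connected(scale_op, adam_op)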
@@ -1413,7 +1402,7 @@ class DistributeTranspiler(object):
     def _get_optimize_pass(self):
         """
-        Get optimizer operators, paramters and gradients from origin_program
+        Get optimizer operators, parameters and gradients from origin_program
         Returns:
             opt_ops (list): optimize operators.
             params_grads (dict): paramter->gradient.
@@ -1436,20 +1425,6 @@ class DistributeTranspiler(object):
                         origin_var_dict[param_name],
                         origin_var_dict[input_name]
                     ])
-            elif self._is_adam_connected_op(op):
-                opt_ops.append(op)
             else:
                 pass
         return opt_ops, params_grads
-
-    def _is_adam_connected_op(self, op):
-        """
-        A hack function to determinate whether the input operator
-        is connected to optimize operator.
-        """
-        if op.type == "scale":
-            for in_name in op.input_arg_names:
-                if in_name.startswith("beta1_pow_acc") or \
-                        in_name.startswith("beta2_pow_acc"):
-                    return True
-        return False
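Background on what the removed hack was detecting (my reading, not stated in the diff): Adam keeps running powers beta1^t and beta2^t in the beta1_pow_acc / beta2_pow_acc variables, advanced once per step by standalone scale ops, and uses them for bias correction. A rough NumPy sketch of that math, not Paddle's actual kernel:

import numpy as np

beta1, beta2, eps, lr = 0.9, 0.999, 1e-8, 1e-3
m = v = 0.0
beta1_pow_acc = beta2_pow_acc = 1.0  # play the role of the *_pow_acc variables

def adam_step(param, grad):
    global m, v, beta1_pow_acc, beta2_pow_acc
    # the global "scale" ops: advance the accumulated powers of beta1 / beta2
    beta1_pow_acc *= beta1
    beta2_pow_acc *= beta2
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # bias correction relies on the accumulated powers
    m_hat = m / (1 - beta1_pow_acc)
    v_hat = v / (1 - beta2_pow_acc)
    return param - lr * m_hat / (np.sqrt(v_hat) + eps)

p = 1.0
for g in [0.5, -0.2, 0.1]:
    p = adam_step(p, g)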