机器未来 / Paddle (fork of PaddlePaddle / Paddle, in sync with the upstream project)
Commit 0627ee83 (unverified)
Authored Nov 08, 2018 by Wu Yi
Committed Nov 08, 2018 by GitHub
Merge pull request #14314 from typhoonzero/fix_pserver_weight_decay_multi_inputs
fix pserver weight decay multi inputs
Parents: 387610aa, f3eafec1
Showing 1 changed file with 36 additions and 16 deletions

python/paddle/fluid/transpiler/distribute_transpiler.py (+36, -16)
@@ -1706,13 +1706,27 @@ to transpile() call.")
             outputs=outputs,
             attrs=opt_op.all_attrs())
 
-    def _is_splited_grad_var(self, var, var_dict):
+    def _get_pserver_grad_param_var(self, var, var_dict):
+        """
+        Return pserver side grad/param variable, return None
+        if the variable is not grad/param, e.g.
+
+            a@GRAD -> a@GRAD.block0
+            a@GRAD -> a@GRAD (a is not splited)
+            fc_0.w_0 -> fc_0.w_0.block_0
+            fc_0.w_0 -> fc_0.w_0 (weight is not splited)
+            _generated_var_123 -> None
+        """
         grad_block = None
         for _, g in six.iteritems(var_dict):
             if self._orig_varname(g.name) == self._orig_varname(var.name):
+                # skip per trainer vars
                 if g.name.find(".trainer_") == -1:
-                    grad_block = g
-                    break
+                    # only param or grads have splited blocks
+                    if self._orig_varname(g.name) in self.grad_name_to_param_name or \
+                        self._orig_varname(g.name) in self.param_name_to_grad_name:
+                        grad_block = g
+                        break
         return grad_block
 
     def _clone_lr_op(self, program, block, op):
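The lookup introduced in the hunk above can be tried outside Paddle with plain stand-ins. The following is a minimal sketch, not the transpiler's code: `Var`, `orig_varname`, and the two name maps passed as arguments are hypothetical substitutes for the fluid `Variable`, `self._orig_varname`, `self.grad_name_to_param_name`, and `self.param_name_to_grad_name`. In particular, the suffix-stripping rule in `orig_varname` is an assumption inferred from the docstring examples.

```python
import re
from collections import namedtuple

# Hypothetical stand-in for a fluid Variable; only the name is needed here.
Var = namedtuple("Var", ["name"])


def orig_varname(name):
    """Assumed behaviour: drop a trailing .trainer_N, then a trailing .blockN."""
    name = re.sub(r"\.trainer_\d+$", "", name)
    return re.sub(r"\.block_?\d+$", "", name)


def get_pserver_grad_param_var(var, var_dict, grad_to_param, param_to_grad):
    """Return the pserver-side grad/param variable for `var`, or None."""
    for g in var_dict.values():
        if orig_varname(g.name) != orig_varname(var.name):
            continue
        if ".trainer_" in g.name:
            continue  # skip per-trainer copies
        base = orig_varname(g.name)
        # Only params and grads have split blocks; anything else maps to None.
        if base in grad_to_param or base in param_to_grad:
            return g
    return None


pserver_vars = {
    v.name: v
    for v in [
        Var("fc_0.w_0.block0"),
        Var("fc_0.w_0@GRAD.block0"),
        Var("fc_0.w_0@GRAD.block0.trainer_0"),
        Var("_generated_var_123"),
    ]
}
grad_to_param = {"fc_0.w_0@GRAD": "fc_0.w_0"}
param_to_grad = {"fc_0.w_0": "fc_0.w_0@GRAD"}

for query in ["fc_0.w_0@GRAD", "fc_0.w_0", "_generated_var_123"]:
    found = get_pserver_grad_param_var(
        Var(query), pserver_vars, grad_to_param, param_to_grad)
    print(query, "->", found.name if found else None)
```

With the extra membership check, a generated temporary such as `_generated_var_123` no longer matches itself and is reported as `None`, which is the case the renamed method's docstring calls out.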
@@ -1745,32 +1759,38 @@ to transpile() call.")
         for key, varlist in six.iteritems(inputs):
             if not isinstance(varlist, list):
                 varlist = [varlist]
-            for var in varlist:
-                # for ops like clipping and weight decay, get the splited var
+            for i in range(len(varlist)):
+                var = varlist[i]
+                # for ops like clipping and weight decay, get the splited var (xxx.block0)
                 # for inputs/outputs
-                grad_block = self._is_splited_grad_var(
+                grad_block = self._get_pserver_grad_param_var(
                     var, program.global_block().vars)
                 if grad_block:
-                    inputs[key] = grad_block
+                    varlist[i] = grad_block
                 elif var.name not in program.global_block().vars:
-                    program.global_block().create_var(
-                        name=var.name,
-                        persistable=var.persistable,
-                        dtype=var.dtype,
-                        shape=var.shape)
+                    tmpvar = program.global_block()._clone_variable(var)
+                    varlist[i] = tmpvar
+                else:
+                    varlist[i] = program.global_block().vars[var.name]
+            inputs[key] = varlist
 
         outputs = self._get_output_map_from_op(
             self.origin_program.global_block().vars, opt_op)
         for key, varlist in six.iteritems(outputs):
             if not isinstance(varlist, list):
                 varlist = [varlist]
-            for var in varlist:
-                grad_block = self._is_splited_grad_var(
+            for i in range(len(varlist)):
+                var = varlist[i]
+                grad_block = self._get_pserver_grad_param_var(
                     var, program.global_block().vars)
                 if grad_block:
-                    outputs[key] = grad_block
+                    varlist[i] = grad_block
                 elif var.name not in program.global_block().vars:
-                    program.global_block()._clone_variable(var)
+                    tmpvar = program.global_block()._clone_variable(var)
+                    varlist[i] = tmpvar
+                else:
+                    varlist[i] = program.global_block().vars[var.name]
+            outputs[key] = varlist
 
         return optimize_block.append_op(
             type=opt_op.type,
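Judging from the commit title, the second hunk addresses ops whose input slot holds several variables. The sketch below contrasts the two loop shapes on plain strings. It is a simplified illustration under stated assumptions: `remap_old`, `remap_new`, and `lookup` are hypothetical names, variables are represented by their names only, and the branches that clone or fetch variables from the global block are left out.

```python
def remap_old(inputs, lookup):
    """Pre-fix shape: a match overwrites the whole slot with one variable."""
    out = dict(inputs)
    for key, varlist in inputs.items():
        if not isinstance(varlist, list):
            varlist = [varlist]
        for var in varlist:
            hit = lookup(var)
            if hit:
                out[key] = hit  # earlier matches for this key are lost
    return out


def remap_new(inputs, lookup):
    """Post-fix shape: each entry is replaced in place, the slot stays a list."""
    out = {}
    for key, varlist in inputs.items():
        if not isinstance(varlist, list):
            varlist = [varlist]
        # Copy so the caller's list is untouched (a choice made for this sketch).
        varlist = list(varlist)
        for i in range(len(varlist)):
            hit = lookup(varlist[i])
            if hit:
                varlist[i] = hit
        out[key] = varlist
    return out


# Hypothetical mapping from trainer-side names to pserver-side block names.
blocks = {"fc_0.w_0": "fc_0.w_0.block0", "fc_1.w_0": "fc_1.w_0.block0"}
lookup = blocks.get
inputs = {"X": ["fc_0.w_0", "fc_1.w_0"]}

print(remap_old(inputs, lookup))  # {'X': 'fc_1.w_0.block0'}
print(remap_new(inputs, lookup))  # {'X': ['fc_0.w_0.block0', 'fc_1.w_0.block0']}
```

In the old shape only the last matching variable survives per key, so a two-input slot collapses to a single variable. Indexing into `varlist` and assigning the list back once per key keeps every input.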