BaiXuePrincess / Paddle
Forked from PaddlePaddle / Paddle
Commit b6ee59ae
Authored May 18, 2018 by tangwei12
optimize python checkpint dir config
Parent: ee91e48e
Showing 1 changed file with 14 additions and 16 deletions
--- a/python/paddle/fluid/transpiler/distribute_transpiler.py
+++ b/python/paddle/fluid/transpiler/distribute_transpiler.py
@@ -219,7 +219,8 @@ class DistributeTranspiler:
         # is_chief (no.0 triner) for checkpoint
         # the no.0 trainer will save all variables and its own reader offset to checkpoint
         # other trianers will save its own reader offset to checkpoint
-        self.is_chief = trainer_id == 0
+        self._is_chief = trainer_id == 0
+        self.checkpoint_dir = checkpoint_dir
         # process lookup_table_op
         # 1. check all lookup_table_op is distributed
@@ -327,7 +328,7 @@ class DistributeTranspiler:
                 "sync_mode": self.sync_mode})
-        if checkpoint_dir and self.is_chief:
+        if self.checkpoint_dir and self._is_chief:
             program.global_block().create_var(
                 name=SERIAL_VAR_NAME,
                 persistable=True,
@@ -342,7 +343,7 @@ class DistributeTranspiler:
                 type="checkpoint_save",
                 inputs={"X": save_vars},
                 attrs={"overwrite": True,
-                       "dir": checkpoint_dir})
+                       "dir": self.checkpoint_dir})
         # step4: Concat the parameters splits together after recv.
         for varname, splited_var in param_var_mapping.iteritems():
@@ -524,15 +525,15 @@ class DistributeTranspiler:
         pserver_program.sync_with_cpp()
         return pserver_program
-    def get_train_startup_program(self, checkpoint_load_dir=None):
+    def get_train_startup_program(self):
         """
         Get train startup program.
-        If checkpoint_load_dir is None, rerurn default startup program.
-        IF checkpoint_load_dir is Exist, add checkpoint_load op and load Var.
+        If self.checkpoint_dir is None, rerurn default startup program.
+        IF self.checkpoint_dir is Exist, add checkpoint_load op and load Var.
         """
         startup_prog = default_startup_program()
-        if not checkpoint_load_dir:
+        if not self.checkpoint_dir:
             return startup_prog
         load_vars = []
@@ -540,20 +541,17 @@ class DistributeTranspiler:
             if self._is_persistable(var):
                 load_vars.append(var.name)
-        serial_number = self._get_lastest_checkpoint_dir(checkpoint_load_dir)
+        serial_number = self._get_lastest_checkpoint_dir(self.checkpoint_dir)
         startup_prog.global_block().append_op(
             type="checkpoint_load",
             inputs={"X": load_vars},
             outputs={"Argv": []},
-            attrs={"dir": checkpoint_load_dir,
+            attrs={"dir": self.checkpoint_dir,
                    "Serial": serial_number})
         return startup_prog
-    def get_startup_program(self, endpoint, pserver_program, checkpoint_load_dir=None):
+    def get_startup_program(self, endpoint, pserver_program):
         """
         Get startup program for current parameter server.
         Modify operator input variables if there are variables that
@@ -609,16 +607,16 @@ class DistributeTranspiler:
         for var in new_outputs.values():
             load_vars.append(var.name)
         # add checkpoint op
-        if not checkpoint_load_dir:
+        if not self.checkpoint_dir:
             return s_prog
-        serial_number = self._get_lastest_checkpoint_dir(checkpoint_load_dir)
+        serial_number = self._get_lastest_checkpoint_dir(self.checkpoint_dir)
         s_prog.global_block().append_op(
             type="checkpoint_load",
             inputs={"X": load_vars},
             outputs={"Argv": []},
-            attrs={"dir": checkpoint_load_dir,
+            attrs={"dir": self.checkpoint_dir,
                    "Serial": serial_number})
         return s_prog
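Note on the change: the diff stops passing a separate checkpoint_load_dir argument to each startup-program getter and instead reads a single self.checkpoint_dir that is stored once, next to the chief flag. The sketch below illustrates that pattern in isolation. It is not Paddle code: TranspilerSketch, should_add_save_op, and get_train_startup_ops are hypothetical names, ops are modeled as plain dicts, and the "Serial" attribute is left out because the body of _get_lastest_checkpoint_dir is not shown in this diff.

```python
# Hypothetical stand-in for the pattern in this commit; not Paddle's real API.
class TranspilerSketch:
    def transpile(self, trainer_id, checkpoint_dir=None):
        # Trainer 0 is the "chief": per the source comments, it saves all
        # variables, while other trainers save only their own reader offset.
        self._is_chief = trainer_id == 0
        # Store the directory once so later methods need no extra argument.
        self.checkpoint_dir = checkpoint_dir

    def should_add_save_op(self):
        # Mirrors the diff's guard: a directory is set and this is the chief.
        return bool(self.checkpoint_dir) and self._is_chief

    def get_train_startup_ops(self, load_vars):
        # Without a checkpoint directory the startup program stays unchanged.
        if not self.checkpoint_dir:
            return []
        # Otherwise describe one load op that reads from the stored directory.
        return [{
            "type": "checkpoint_load",
            "inputs": {"X": list(load_vars)},
            "attrs": {"dir": self.checkpoint_dir},
        }]


if __name__ == "__main__":
    sketch = TranspilerSketch()
    sketch.transpile(trainer_id=0, checkpoint_dir="/tmp/ckpt")
    print(sketch.should_add_save_op())
    print(sketch.get_train_startup_ops(["w", "b"]))
```

One consequence of this design is that the save and load paths can no longer point at different directories by accident, since both read the same attribute; the cost is that transpile must run before either getter is called.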