PaddlePaddle / Paddle
Unverified commit 3682035f
Authored on Apr 07, 2018 by Wu Yi
Committed by GitHub on Apr 07, 2018
Merge pull request #9695 from emailweixu/round_robin
Fix a minor bug for distributed_spliter.round_robin
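
The bug referenced here is visible in the second diff below: round_robin asserted that len(varlist) must be strictly greater than len(pserver_endpoints), which rejects the valid case of exactly one variable per parameter-server endpoint. A minimal sketch of that boundary case (variable and endpoint names are hypothetical, not taken from the commit):

# Two variables, two pserver endpoints -- hypothetical names for illustration.
varlist = ["fc_0.w_0", "fc_0.b_0"]
pserver_endpoints = ["127.0.0.1:6174", "127.0.0.1:6175"]

# Old check: fails for the equal-length case above.
# assert (len(varlist) > len(pserver_endpoints))    # AssertionError
# Fixed check from this commit: one variable per endpoint is allowed.
assert (len(varlist) >= len(pserver_endpoints))     # passes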
Parents: 5bb7d59e, 560d960b

Showing 2 changed files with 17 additions and 10 deletions (+17 -10)
python/paddle/fluid/distribute_transpiler.py   +6 -6
python/paddle/fluid/distributed_splitter.py    +11 -4
python/paddle/fluid/distribute_transpiler.py (+6 -6)

@@ -17,7 +17,7 @@ import framework
 from framework import Program, default_main_program, default_startup_program, Parameter, Variable
 import optimizer
 from layer_helper import LayerHelper
-from distributed_spliter import *
+import distributed_splitter as splitter
 import math
 from . import core
 import debuger
@@ -36,7 +36,7 @@ class VarBlock:
 class UnionFind(object):
     """ Union-find data struct.
     Union-find is a data struct that keeps track of a set of elements partitioned
     into a number of disjoint (non-overlapping) subsets.
@@ -138,7 +138,7 @@ class DistributeTranspiler:
                   program=None,
                   pservers="127.0.0.1:6174",
                   trainers=1,
-                  split_method=round_robin):
+                  split_method=splitter.round_robin):
         """
         Transpile the program to distributed data-parallelism programs.
         The main_program will be transformed to use a remote parameter server
@@ -303,7 +303,7 @@ class DistributeTranspiler:
         # If two ops are connected, we could add these two ops
         # into one set.
         ufind = self._create_ufind(self.optimize_ops)
         # step 4.2
         # Iterate through the ops and append optimize op which
         # located on current pserver
         opt_op_on_pserver = []
@@ -312,7 +312,7 @@ class DistributeTranspiler:
                 opt_op_on_pserver.append(op)
         # step 4.3
         # Iterate through the ops, and if an op and the optimize ops
         # which located on current pserver are in one set, then
         # append it into the sub program.
         # We try to put optimization program run parallelly, assume
@@ -752,7 +752,7 @@ class DistributeTranspiler:
     def _is_opt_op(self, op):
         # NOTE: It's a HACK implement.
         # optimize op: SGDOptimize, MomentumOptimizer, AdamOptimizer and etc...
         if "Param" in op.input_names and \
                 "LearningRate" in op.input_names:
             return True
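
The transpiler change above swaps a wildcard import for a qualified one, so the default split method is now spelled splitter.round_robin. A minimal usage sketch under that naming (only program, pservers, trainers, and split_method appear in this diff; the full transpile signature in this version may take further arguments, so treat the call shape as an assumption):

from paddle.fluid.distribute_transpiler import DistributeTranspiler
from paddle.fluid import distributed_splitter as splitter
from paddle.fluid.framework import default_main_program

t = DistributeTranspiler()
# Call shape is a sketch; the argument defaults mirror those visible in the @@ -138 hunk above.
t.transpile(
    program=default_main_program(),
    pservers="127.0.0.1:6174,127.0.0.1:6175",  # comma-separated pserver endpoints (assumed format)
    trainers=2,
    split_method=splitter.round_robin)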
python/paddle/fluid/distributed_spliter.py → python/paddle/fluid/distributed_splitter.py (+11 -4)

@@ -17,8 +17,10 @@ def hash_name(varlist, pserver_endpoints):
     """
     hash variable names to several endpoints.
-    :param varlist: a list of Variables
-    :return: a map of pserver endpoint -> varname
+    Args:
+        varlist(list): a list of Variables
+    Returns(dict): a map of pserver endpoint -> varname
     """
     def _hash_block(block_str, total):
@@ -34,9 +36,14 @@ def hash_name(varlist, pserver_endpoints):
 def round_robin(varlist, pserver_endpoints):
     """
-    distribute variables to several endpoints.
+    Distribute variables to several endpoints.
+    Args:
+        varlist(list): a list of variables
+        pserver_endpoints(list): a list of pserver endpoints
+    Returns(list[int]): the endpoint for each variable
     """
-    assert (len(varlist) > len(pserver_endpoints))
+    assert (len(varlist) >= len(pserver_endpoints))
     eplist = []
     pserver_idx = 0
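
The renamed splitter module keeps the round-robin semantics; the only behavioral change is the relaxed assertion. An illustrative re-implementation of the cyclic assignment (not Paddle's code, just the idea the new docstring describes: one endpoint per variable, cycling through pserver_endpoints):

def round_robin_sketch(varlist, pserver_endpoints):
    # Relaxed check from this commit: equal lengths are now accepted.
    assert len(varlist) >= len(pserver_endpoints)
    eplist = []
    pserver_idx = 0
    for _ in varlist:
        # Hand each variable the next endpoint, wrapping around cyclically.
        eplist.append(pserver_endpoints[pserver_idx])
        pserver_idx = (pserver_idx + 1) % len(pserver_endpoints)
    return eplist

# Three variables over two endpoints -> ['ep0', 'ep1', 'ep0'].
print(round_robin_sketch(["w", "b", "m"], ["ep0", "ep1"]))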