Forked from PaddlePaddle / Paddle
Commit 3682035f
Authored on April 07, 2018 by Wu Yi
Committed by GitHub on April 07, 2018
Merge pull request #9695 from emailweixu/round_robin
Fix a minor bug for distributed_spliter.round_robin
Parents: 5bb7d59e, 560d960b
Showing 2 changed files with 17 additions and 10 deletions (+17, -10)
python/paddle/fluid/distribute_transpiler.py  +6 -6
python/paddle/fluid/distributed_splitter.py  +11 -4
python/paddle/fluid/distribute_transpiler.py @ 3682035f
@@ -17,7 +17,7 @@ import framework
 from framework import Program, default_main_program, default_startup_program, Parameter, Variable
 import optimizer
 from layer_helper import LayerHelper
-from distributed_spliter import *
+import distributed_splitter as splitter
 import math
 from . import core
 import debuger
@@ -36,7 +36,7 @@ class VarBlock:
 class UnionFind(object):
     """ Union-find data struct.
     Union-find is a data struct that keeps track of a set of elements partitioned
     into a number of disjoint (non-overlapping) subsets.
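Note on the hunk above: the docstring describes a union-find (disjoint-set) structure, but the class body falls outside the diff context. The following is a minimal sketch of the general technique, not Paddle's `UnionFind`; the class name `UnionFindSketch` and its method names are hypothetical.

```python
class UnionFindSketch(object):
    """Hypothetical minimal union-find; NOT the class from this commit."""

    def __init__(self, elements):
        # Every element starts as the root of its own singleton set.
        self._parent = {e: e for e in elements}

    def find(self, x):
        # First pass: walk up to the root of x's set.
        root = x
        while self._parent[root] != root:
            root = self._parent[root]
        # Second pass: point every node on the path directly at the root.
        while self._parent[x] != root:
            self._parent[x], x = root, self._parent[x]
        return root

    def union(self, a, b):
        # Merge the two sets by attaching one root under the other.
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self._parent[root_a] = root_b

    def is_connected(self, a, b):
        # Two elements are in the same subset iff they share a root.
        return self.find(a) == self.find(b)
```

For instance, after `union("sgd", "scale")` and `union("scale", "sum")`, the elements `"sgd"` and `"sum"` report as connected, while an element that was never merged stays in its own set.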
@@ -138,7 +138,7 @@ class DistributeTranspiler:
                   program=None,
                   pservers="127.0.0.1:6174",
                   trainers=1,
-                  split_method=round_robin):
+                  split_method=splitter.round_robin):
         """
             Transpile the program to distributed data-parallelism programs.
             The main_program will be transformed to use a remote parameter server
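Note on the hunk above: `split_method` is a callable passed as a keyword default, so any function with the same `(varlist, pserver_endpoints)` signature can stand in for `round_robin`. The sketch below illustrates that contract only. `transpile_sketch` and `first_endpoint_only` are hypothetical names, and treating `pservers` as a comma-separated string is an assumption, since the diff shows only a single-endpoint default.

```python
def first_endpoint_only(varlist, pserver_endpoints):
    # Hypothetical split method: place every variable on the first endpoint.
    return [pserver_endpoints[0] for _ in varlist]


def transpile_sketch(varlist,
                     pservers="127.0.0.1:6174",
                     split_method=first_endpoint_only):
    # Assumption: multiple endpoints are given as a comma-separated string.
    pserver_endpoints = pservers.split(",")
    # The split method returns one endpoint per variable, in order.
    eplist = split_method(varlist, pserver_endpoints)
    # Pair each variable with the endpoint chosen for it.
    return list(zip(varlist, eplist))
```

A caller could pass a different strategy, for example `transpile_sketch(names, "a:1,b:2", split_method=my_split)`, without touching the function body.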
@@ -303,7 +303,7 @@ class DistributeTranspiler:
         # If two ops are connected, we could add these two ops
         # into one set.
         ufind = self._create_ufind(self.optimize_ops)
         # step 4.2
         # Iterate through the ops and append optimize op which
         # located on current pserver
         opt_op_on_pserver = []
@@ -312,7 +312,7 @@ class DistributeTranspiler:
                 opt_op_on_pserver.append(op)
         # step 4.3
         # Iterate through the ops, and if an op and the optimize ops
         # which located on current pserver are in one set, then
         # append it into the sub program.
         # We try to put optimization program run parallelly, assume
@@ -752,7 +752,7 @@ class DistributeTranspiler:
     def _is_opt_op(self, op):
         # NOTE: It's a HACK implement.
         # optimize op: SGDOptimize, MomentumOptimizer, AdamOptimizer and etc...
         if "Param" in op.input_names and \
             "LearningRate" in op.input_names:
             return True
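Note on `_is_opt_op` above: the check classifies an op as an optimizer op purely by the names of its input slots. A standalone sketch of that heuristic follows. `FakeOp` is a hypothetical stand-in for a real operator object, and the slot names given to the example ops are illustrative assumptions.

```python
from collections import namedtuple

# Hypothetical stand-in for an operator; only `input_names` is inspected.
FakeOp = namedtuple("FakeOp", ["type", "input_names"])


def is_opt_op(op):
    # Heuristic from the diff: an op with both a "Param" and a
    # "LearningRate" input slot is treated as an optimizer op.
    return "Param" in op.input_names and "LearningRate" in op.input_names


# Illustrative ops; the slot lists here are assumptions for the example.
sgd_like = FakeOp("sgd", ["Param", "Grad", "LearningRate"])
mul_like = FakeOp("mul", ["X", "Y"])
```

Here `is_opt_op(sgd_like)` is true and `is_opt_op(mul_like)` is false. As the source comment admits, this is a hack: any op that happens to expose both slot names would be classified as an optimizer.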
python/paddle/fluid/distributed_spliter.py → python/paddle/fluid/distributed_splitter.py @ 3682035f
@@ -17,8 +17,10 @@ def hash_name(varlist, pserver_endpoints):
     """
     hash variable names to several endpoints.
-    :param varlist: a list of Variables
-    :return: a map of pserver endpoint -> varname
+    Args:
+        varlist(list): a list of Variables
+    Returns(dict): a map of pserver endpoint -> varname
     """
     def _hash_block(block_str, total):
@@ -34,9 +36,14 @@ def hash_name(varlist, pserver_endpoints):
 def round_robin(varlist, pserver_endpoints):
     """
-    distribute variables to several endpoints.
+    Distribute variables to several endpoints.
+    Args:
+        varlist(list): a list of variables
+        pserver_endpoints(list): a list of pserver endpoints
+    Returns(list[int]): the endpoint for each variable
     """
-    assert (len(varlist) > len(pserver_endpoints))
+    assert (len(varlist) >= len(pserver_endpoints))
     eplist = []
     pserver_idx = 0
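Note on the `round_robin` fix above: relaxing `>` to `>=` admits the case where there are exactly as many variables as parameter servers. The rest of the function is outside the diff context, so the loop below is a sketch of ordinary round-robin assignment under that assertion, not the committed body. `round_robin_sketch` is a hypothetical name, and it returns endpoint strings for plain name strings.

```python
def round_robin_sketch(var_names, pserver_endpoints):
    # The relaxed check from this commit: equal counts are now allowed.
    assert len(var_names) >= len(pserver_endpoints)
    eplist = []
    pserver_idx = 0
    for _ in var_names:
        # Assign the current endpoint, then advance and wrap around.
        eplist.append(pserver_endpoints[pserver_idx])
        pserver_idx = (pserver_idx + 1) % len(pserver_endpoints)
    return eplist
```

With two variables and two endpoints, each endpoint gets exactly one variable; under the old strict `>` check the same call would have failed the assertion.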