Unverified commit 4ba49af5
Authored by ShenLiang on Apr 26, 2021; committed by GitHub on Apr 26, 2021

add barrier for new group (#32572)

Parent: fcd18ef1
Showing 1 changed file with 42 additions and 41 deletions:

python/paddle/distributed/collective.py  +42 -41
python/paddle/distributed/collective.py @ 4ba49af5

@@ -160,6 +160,46 @@ def get_group(id=0):
     return gm[group] if group in gm else None
 
 
+def barrier(group=None):
+    """
+
+    Barrier among all participators in the group.
+
+    Args:
+        group (Group): The group instance return by new_group or None for global default group.
+
+    Returns:
+        None.
+
+    Examples:
+        .. code-block:: python
+
+            import paddle
+            from paddle.distributed import init_parallel_env
+
+            paddle.set_device('gpu:%d'%paddle.distributed.ParallelEnv().dev_id)
+            init_parallel_env()
+            paddle.distributed.barrier()
+    """
+    if group is not None and not group.is_member():
+        return
+
+    ring_id = 0 if group is None else group.id
+
+    op_type = 'barrier'
+    temp = fill_constant([1], dtype="int32", value="1")
+    if in_dygraph_mode():
+        return core.ops.barrier(temp, temp, 'ring_id', ring_id)
+    if not isinstance(ring_id, int):
+        raise ValueError("The type of 'group' for barrier must be int.")
+    helper = LayerHelper(op_type, **locals())
+    helper.append_op(
+        type=op_type,
+        inputs={'X': [temp]},
+        outputs={'Out': [temp]},
+        attrs={'ring_id': ring_id})
+
+
 def new_group(ranks=None, backend=None):
     """
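The docstring of the moved-up barrier only demonstrates a barrier over the default global group. As a minimal sketch of the same call with an explicit sub-group — the two-rank layout and launch command are assumptions for illustration, not part of the patch; barrier and new_group are the functions defined in this file:

import paddle
from paddle.distributed import init_parallel_env

# Assumed launch: python -m paddle.distributed.launch --gpus 0,1 script.py
paddle.set_device('gpu:%d' % paddle.distributed.ParallelEnv().dev_id)
init_parallel_env()

# Build a sub-group over ranks 0 and 1, then synchronize only its members;
# ranks outside the group return immediately via the is_member() check.
group = paddle.distributed.new_group(ranks=[0, 1])
paddle.distributed.barrier(group)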
@@ -220,7 +260,8 @@ def new_group(ranks=None, backend=None):
         core.NCCLParallelContext(strategy, place).init_with_ring_id(ring_id)
     else:
         assert False, ("no cuda device found")
-
+    # need to barrier to construct group
+    barrier(gp)
     return gp
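The two added lines above are the point of the commit: init_with_ring_id only initializes the local rank's NCCL communicator, so without the barrier a fast rank could return from new_group and issue a collective before its peers finish setup. Barriering on the freshly built group (gp) forces every member to complete construction first. A caller-side sketch of the behavior this guarantees — the tensor shape and all_reduce usage are illustrative, not from the diff:

import paddle
from paddle.distributed import init_parallel_env

paddle.set_device('gpu:%d' % paddle.distributed.ParallelEnv().dev_id)
init_parallel_env()

# With this patch, new_group() barriers internally, so the all_reduce
# below cannot race against a peer that is still initializing NCCL.
group = paddle.distributed.new_group(ranks=[0, 1])
x = paddle.ones([2], dtype='int32')
paddle.distributed.all_reduce(x, group=group)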
@@ -838,46 +879,6 @@ def _mp_allreduce(tensor,
         raise NotImplementedError("No support _mp_allreduce in dygraph mode.")
 
 
-def barrier(group=None):
-    """
-
-    Barrier among all participators in the group.
-
-    Args:
-        group (Group): The group instance return by new_group or None for global default group.
-
-    Returns:
-        None.
-
-    Examples:
-        .. code-block:: python
-
-            import paddle
-            from paddle.distributed import init_parallel_env
-
-            paddle.set_device('gpu:%d'%paddle.distributed.ParallelEnv().dev_id)
-            init_parallel_env()
-            paddle.distributed.barrier()
-    """
-    if group is not None and not group.is_member():
-        return
-
-    ring_id = 0 if group is None else group.id
-
-    op_type = 'barrier'
-    temp = fill_constant([1], dtype="int32", value="1")
-    if in_dygraph_mode():
-        return core.ops.barrier(temp, temp, 'ring_id', ring_id)
-    if not isinstance(ring_id, int):
-        raise ValueError("The type of 'group' for barrier must be int.")
-    helper = LayerHelper(op_type, **locals())
-    helper.append_op(
-        type=op_type,
-        inputs={'X': [temp]},
-        outputs={'Out': [temp]},
-        attrs={'ring_id': ring_id})
-
-
 def _parallel_linear(x,
                      num_rows,
                      num_cols,