magicwindyyd / mindspore (fork of MindSpore / mindspore, in sync with the fork source)
Commit 59e519e8
Authored Jul 17, 2020 by jinyaohui
Parent: bbcefa73

allreduce add ps filter
Showing 1 changed file with 6 additions and 4 deletions

mindspore/nn/wrap/grad_reducer.py (+6 -4) @ 59e519e8
@@ -50,8 +50,8 @@ def _init_allreduce_operators(length):
     return opt_list
-@reduce_opt.register("Number", "Bool", "Function", "Bool", "Tensor", "Function")
-def _tensors_allreduce(degree, mean, allgather, allreduce_filter, grad, allreduce):
+@reduce_opt.register("Number", "Bool", "Function", "Bool", "Tensor", "Function", "Bool")
+def _tensors_allreduce(degree, mean, allgather, allreduce_filter, grad, allreduce, ps_parameter):
     """
     Apply allreduce on gradient.
@@ -66,7 +66,7 @@ def _tensors_allreduce(degree, mean, allgather, allreduce_filter, grad, allreduc
     Returns:
         Tensor, the gradient tensor after operation.
     """
-    if allreduce_filter:
+    if not ps_parameter and allreduce_filter:
         grad = allreduce(grad)
         if mean:
             degree = F.scalar_cast(degree, F.dtype(grad))
@@ -257,6 +257,8 @@ class DistributedGradReducer(Cell):
         self.allreduce_filter = tuple(x.layerwise_parallel is False for x in parameters)
         self.opt_list = _init_allreduce_operators(len(parameters))
         self.allgather = AllGather(GlobalComm.WORLD_COMM_GROUP)
+        ps_filter = lambda x: x.is_param_ps
+        self.ps_parameters = tuple(ps_filter(x) for x in parameters)
     def construct(self, grads):
         """
@@ -273,7 +275,7 @@ class DistributedGradReducer(Cell):
         datatypes = self.map_(F.partial(_get_datatype), grads)
         grads = self.map_(F.partial(_cast_datatype, mstype.float32), grads)
         new_grad = self.map_(F.partial(reduce_opt, self.degree, self.mean, self.allgather),
-                             self.allreduce_filter, grads, self.opt_list)
+                             self.allreduce_filter, grads, self.opt_list, self.ps_parameters)
         new_grad = self.map_(F.partial(_cast_datatype), datatypes, new_grad)
         return new_grad
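
The effect of the change can be sketched in plain Python, without MindSpore. This is a minimal illustration, not the library's implementation. `FakeParam`, `reduce_one`, `reduce_all`, and `fake_allreduce` are hypothetical names introduced here. The averaging step is not visible in the diff, so dividing by `degree` is an assumption. The sketch shows how the per-parameter `is_param_ps` flag travels alongside `allreduce_filter` and causes parameter-server gradients to bypass the allreduce.

```python
from dataclasses import dataclass


@dataclass
class FakeParam:
    """Hypothetical stand-in for a MindSpore Parameter (illustration only)."""
    name: str
    layerwise_parallel: bool = False
    is_param_ps: bool = False


def reduce_one(degree, mean, allreduce_filter, grad, allreduce, ps_parameter):
    """Mirror the patched condition: skip allreduce for parameter-server params."""
    if not ps_parameter and allreduce_filter:
        grad = allreduce(grad)
        if mean:
            # Assumption: averaging divides the summed gradient by the device count.
            grad = grad / degree
    return grad


def reduce_all(params, grads, degree, mean, allreduce):
    """Build both per-parameter flag tuples and apply reduce_one elementwise."""
    allreduce_filter = tuple(p.layerwise_parallel is False for p in params)
    ps_parameters = tuple(p.is_param_ps for p in params)
    return tuple(
        reduce_one(degree, mean, flt, g, allreduce, ps)
        for flt, g, ps in zip(allreduce_filter, grads, ps_parameters)
    )


def fake_allreduce(grad):
    """Simulate a 2-device sum where the other device holds grad + 2.0."""
    return grad + (grad + 2.0)


params = (
    FakeParam("dense.weight"),
    FakeParam("embedding.table", is_param_ps=True),
    FakeParam("lp.weight", layerwise_parallel=True),
)
grads = (1.0, 3.0, 5.0)
result = reduce_all(params, grads, degree=2, mean=True, allreduce=fake_allreduce)
print(result)  # (2.0, 3.0, 5.0)
```

Only the first gradient is reduced and averaged. The second is left alone because its parameter is flagged as a parameter-server parameter, and the third because it is layerwise-parallel, which was already filtered before this commit.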