Commit 21105521 (unverified)
Authored Nov 21, 2022 by Jeff Rasley
Committed Nov 21, 2022 by GitHub
Fixes for torch 1.14 due to new torch.numel return type (#2522)

* fixes for new torch.numel return type
* address comment
Parent: 30c8d8a8

Showing 2 changed files with 17 additions and 17 deletions

deepspeed/profiling/flops_profiler/profiler.py  +16 -16
deepspeed/runtime/comm/nccl.py  +1 -1
--- a/deepspeed/profiling/flops_profiler/profiler.py
+++ b/deepspeed/profiling/flops_profiler/profiler.py
@@ -480,38 +480,38 @@ def _prod(dims):
 def _linear_flops_compute(input, weight, bias=None):
     out_features = weight.shape[0]
-    macs = torch.numel(input) * out_features
+    macs = input.numel() * out_features
     return 2 * macs, macs


 def _relu_flops_compute(input, inplace=False):
-    return torch.numel(input), 0
+    return input.numel(), 0


 def _prelu_flops_compute(input: Tensor, weight: Tensor):
-    return torch.numel(input), 0
+    return input.numel(), 0


 def _elu_flops_compute(input: Tensor, alpha: float = 1.0, inplace: bool = False):
-    return torch.numel(input), 0
+    return input.numel(), 0


 def _leaky_relu_flops_compute(input: Tensor, negative_slope: float = 0.01, inplace: bool = False):
-    return torch.numel(input), 0
+    return input.numel(), 0


 def _relu6_flops_compute(input: Tensor, inplace: bool = False):
-    return torch.numel(input), 0
+    return input.numel(), 0


 def _silu_flops_compute(input: Tensor, inplace: bool = False):
-    return torch.numel(input), 0
+    return input.numel(), 0


 def _gelu_flops_compute(input):
-    return torch.numel(input), 0
+    return input.numel(), 0


 def _pool_flops_compute(input,
@@ -523,7 +523,7 @@ def _pool_flops_compute(input,
                         count_include_pad=True,
                         divisor_override=None,
                         return_indices=None):
-    return torch.numel(input), 0
+    return input.numel(), 0


 def _conv_flops_compute(input,
@@ -625,8 +625,8 @@ def _batch_norm_flops_compute(
     has_affine = weight is not None
     if training:
         # estimation
-        return torch.numel(input) * (5 if has_affine else 4), 0
-    flops = torch.numel(input) * (2 if has_affine else 1)
+        return input.numel() * (5 if has_affine else 4), 0
+    flops = input.numel() * (2 if has_affine else 1)
     return flops, 0
@@ -639,7 +639,7 @@ def _layer_norm_flops_compute(
 ):
     has_affine = weight is not None
     # estimation
-    return torch.numel(input) * (5 if has_affine else 4), 0
+    return input.numel() * (5 if has_affine else 4), 0


 def _group_norm_flops_compute(input: Tensor,
@@ -649,7 +649,7 @@ def _group_norm_flops_compute(input: Tensor,
                               eps: float = 1e-5):
     has_affine = weight is not None
     # estimation
-    return torch.numel(input) * (5 if has_affine else 4), 0
+    return input.numel() * (5 if has_affine else 4), 0


 def _instance_norm_flops_compute(
@@ -664,7 +664,7 @@ def _instance_norm_flops_compute(
 ):
     has_affine = weight is not None
     # estimation
-    return torch.numel(input) * (5 if has_affine else 4), 0
+    return input.numel() * (5 if has_affine else 4), 0


 def _upsample_flops_compute(input,
@@ -678,7 +678,7 @@ def _upsample_flops_compute(input,
         else:
             return int(size), 0
     assert scale_factor is not None, "either size or scale_factor should be defined"
-    flops = torch.numel(input)
+    flops = input.numel()
     if isinstance(scale_factor, tuple) and len(scale_factor) == len(input):
         flops * int(_prod(scale_factor))
     else:
@@ -687,7 +687,7 @@ def _upsample_flops_compute(input,
 def _softmax_flops_compute(input, dim=None, _stacklevel=3, dtype=None):
-    return torch.numel(input), 0
+    return input.numel(), 0


 def _embedding_flops_compute(
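As a reading aid for the profiler.py hunks above, the counting rules they touch can be sketched in plain Python on shape tuples. This is only an illustration under assumptions: the helper names (`numel`, `linear_flops`, `elementwise_flops`, `norm_flops_estimate`) are hypothetical and not part of DeepSpeed, and `numel` here stands in for the element count that `input.numel()` yields for a tensor of that shape.

```python
import math


def numel(shape):
    """Element count for a tensor of the given shape (hypothetical stand-in
    for what input.numel() yields on a real tensor)."""
    return math.prod(shape)


def linear_flops(input_shape, out_features):
    """Mirrors the linear rule in the diff: one MAC per input element per
    output feature, and two FLOPs per MAC. Returns (flops, macs)."""
    macs = numel(input_shape) * out_features
    return 2 * macs, macs


def elementwise_flops(input_shape):
    """Mirrors the activation/pool/softmax rule: one FLOP per element, no MACs."""
    return numel(input_shape), 0


def norm_flops_estimate(input_shape, has_affine):
    """Mirrors the normalization estimate in the diff: 5 FLOPs per element
    with affine parameters, 4 without."""
    return numel(input_shape) * (5 if has_affine else 4), 0
```

For example, a (4, 8) input to a linear layer with 16 output features gives `linear_flops((4, 8), 16) == (1024, 512)`. Because every rule reduces to an integer element count times a constant, the results stay ordinary Python integers that multiply and compare as expected.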
--- a/deepspeed/runtime/comm/nccl.py
+++ b/deepspeed/runtime/comm/nccl.py
@@ -68,7 +68,7 @@ class NcclBackend(object):
             buffer_m = torch.cat([buffer_m, empty_tensor])

         buffer_m.add_(worker_error)
-        worker_scale = torch.norm(buffer_m) / np.sqrt(torch.numel(buffer_m))
+        worker_scale = torch.norm(buffer_m) / np.sqrt(buffer_m.numel())
         worker_error.set_(buffer_m - worker_scale *
                           buffer_m.sign().add_(1).bool().float().add_(-0.5).mul_(2.0))
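The arithmetic in the nccl.py hunk above can be traced with a small pure-Python sketch. It assumes the lines implement a sign-based compression step with an error residual, which is an interpretation of the variable names rather than something the diff states. The function name `sign_compress` is hypothetical, and plain lists stand in for tensors.

```python
import math


def sign_compress(buffer):
    """Illustrative stand-in for the worker_scale / worker_error lines.

    scale  = L2 norm / sqrt(element count), i.e. the root-mean-square value
    signs  = +1 for x >= 0, -1 otherwise; this matches the chain
             sign().add_(1).bool().float().add_(-0.5).mul_(2.0),
             which maps 0 to +1 rather than 0
    error  = what is lost when each element is replaced by scale * sign
    """
    n = len(buffer)  # plays the role of buffer_m.numel()
    scale = math.sqrt(sum(x * x for x in buffer)) / math.sqrt(n)
    signs = [1.0 if x >= 0 else -1.0 for x in buffer]
    error = [x - scale * s for x, s in zip(buffer, signs)]
    return scale, signs, error
```

With `buffer = [3.0, -4.0, 0.0, 0.0]` the norm is 5 and the element count is 4, so the scale is 2.5 and the residual is `[0.5, -1.5, -2.5, -2.5]`. The element count only enters through the square root, which is why the commit can swap `torch.numel(buffer_m)` for `buffer_m.numel()` without changing the formula.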