Unverified commit 429dfa6c authored by Olatunji Ruwase, committed by GitHub

Handle Norm allreduce when no mp (#1021)

Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Parent dad26428
......
@@ -2405,7 +2405,7 @@ class FP16_DeepSpeedZeroOptimizer_Stage3(object):
         """ Perform all reduce within model parallel group, if any.
         """
         if self.model_parallel_group is None:
-            torch.distributed.all_reduce(tensor=tensor, op=op)
+            pass
         else:
             torch.distributed.all_reduce(tensor=tensor,
                                          op=op,
......
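The pattern in the diff can be sketched in isolation. This is a minimal, hedged illustration, not DeepSpeed's actual method: `model_parallel_allreduce` and `allreduce_fn` are hypothetical names standing in for the optimizer's helper and `torch.distributed.all_reduce`, so the skip-when-no-group logic can be shown without initializing a process group.

```python
def model_parallel_allreduce(tensor, model_parallel_group, allreduce_fn):
    """Reduce `tensor` across the model-parallel group, if one exists.

    When there is no model-parallel group (model parallelism is not in
    use), the tensor is already complete on this rank, so the all-reduce
    is skipped -- this is the behavior the commit introduces, replacing
    an unconditional all-reduce over the default group.
    """
    if model_parallel_group is None:
        return tensor  # no model parallelism: nothing to reduce across
    # Hypothetical stand-in for torch.distributed.all_reduce(tensor, op=op,
    # group=model_parallel_group); reduces in place across the group.
    allreduce_fn(tensor, group=model_parallel_group)
    return tensor
```

Before this change, passing no `group` made the call reduce over the default (world) process group even when no model parallelism was configured, which is what the commit title "Handle Norm allreduce when no mp" refers to avoiding.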