Unverified commit 79c922d0, authored by Yuang Liu, committed by GitHub

fix sharding vpp overlap bug (#55366)

Parent 147e7a38
......@@ -264,12 +264,12 @@ class PipelineParallel(MetaParallelBase):
         act = HOOK_ACTION.ALL_REDUCE if dp else HOOK_ACTION.REDUCE
-        fused_parameter_group = {}
         for model in models:
             # For virtual pipeline. Will separate parameters in different chunk into
             # different groups to get the best performance.
+            fused_parameter_group = {}
             parameter_list = [
                 p for p in model.parameters() if not p.stop_gradient
             ]
......
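The change moves the `fused_parameter_group = {}` initialization inside the loop over model chunks, so each virtual pipeline chunk starts with an empty grouping. The sketch below is plain Python, not the Paddle implementation, and illustrates why that placement matters. The helper names `group_shared` and `group_per_chunk`, the parameter names, and the idea of keying groups by an owning rank are all assumptions made for illustration.

```python
from collections import defaultdict


def group_shared(chunks):
    """Pattern before the fix: one dict is reused across all chunks."""
    groups = defaultdict(list)  # created once, outside the loop
    per_chunk = []
    for chunk in chunks:
        for name, rank in chunk:
            groups[rank].append(name)
        # Snapshot the groups that would be fused for this chunk.
        per_chunk.append({r: list(ps) for r, ps in groups.items()})
    return per_chunk


def group_per_chunk(chunks):
    """Pattern after the fix: a fresh dict is created for every chunk."""
    per_chunk = []
    for chunk in chunks:
        groups = defaultdict(list)  # reset for each chunk
        for name, rank in chunk:
            groups[rank].append(name)
        per_chunk.append(dict(groups))
    return per_chunk


# Two hypothetical chunks; each entry is (parameter name, owning rank).
chunks = [
    [("c0.w", 0), ("c0.b", 1)],
    [("c1.w", 0), ("c1.b", 1)],
]
shared = group_shared(chunks)
fresh = group_per_chunk(chunks)

print(shared[1])  # the second chunk's groups still hold the first chunk's parameters
print(fresh[1])   # the second chunk's groups hold only its own parameters
```

With the shared dict, parameters from earlier chunks leak into the groups built for later chunks, which defeats the per-chunk separation described in the code comment.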