Unverified commit f1b8f0ef. Author: caozhou. Committer: GitHub

fix process group init bug (#47224)

Parent ab936d8a
@@ -647,12 +647,13 @@ class Engine:
         # Traverse different rank programs and traverse each op of them,
         # instantiate communication by process_mapping.
         all_process_groups = get_all_process_groups()
+        cur_rank = self._cur_rank
         # NOTE: After the implementation of the unified dynamic and static communication group initialization mode in the future, the initialization logic of full mode will be removed because port occupation error may occur.
         if self._strategy.auto_mode == "full":
             initialize_pg_in_full_mode(all_process_groups, cur_rank)
         else:
             for process_group in all_process_groups:
-                if self._cur_rank not in process_group.ranks:
+                if cur_rank not in process_group.ranks:
                     continue
                 process_group.instantiate()
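
The non-"full" branch above instantiates only the process groups that contain the current rank. A minimal, self-contained sketch of that filtering pattern follows. `ToyProcessGroup` and `instantiate_groups_for_rank` are hypothetical names used for illustration only; they are not Paddle's classes or functions, and the real `instantiate()` presumably sets up communication rather than flipping a flag.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyProcessGroup:
    """Hypothetical stand-in for a process group (not Paddle's class)."""
    group_id: int
    ranks: List[int]
    instantiated: bool = False

    def instantiate(self) -> None:
        # Assumption: a real implementation would create communicators here.
        self.instantiated = True


def instantiate_groups_for_rank(all_process_groups, cur_rank):
    """Instantiate only the groups containing cur_rank; return their ids."""
    touched = []
    for process_group in all_process_groups:
        # Skip groups this rank does not belong to.
        if cur_rank not in process_group.ranks:
            continue
        process_group.instantiate()
        touched.append(process_group.group_id)
    return touched


groups = [
    ToyProcessGroup(0, [0, 1, 2, 3]),
    ToyProcessGroup(1, [0, 1]),
    ToyProcessGroup(2, [2, 3]),
]
touched_ids = instantiate_groups_for_rank(groups, cur_rank=2)
print(touched_ids)  # [0, 2]
```

Reading the rank into a local variable once, as the commit does with `cur_rank`, keeps both branches working from the same value.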