    [Auto Parallel] Compatible new comm library upgrade (#56604) · ade51aa5
    Committed by Ghost Screaming
    * for verification

    fluid operators support the new comm library
    
    * u
    
    * u
    
    * u
    
    * Compatible new comm library upgrade for c_allgather, c_reduce, c_reduce_scatter and c_scatter.
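
    For context (this sketch is not part of the commit): the c_allgather, c_reduce,
    c_reduce_scatter and c_scatter operators roughly back the public collectives in
    paddle.distributed. A minimal Python sketch follows, assuming two GPUs and a launch
    via `python -m paddle.distributed.launch`; the tensor shapes and values are
    illustrative only.

        import paddle
        import paddle.distributed as dist

        dist.init_parallel_env()
        rank = dist.get_rank()
        nranks = dist.get_world_size()

        x = paddle.full([2], float(rank))

        # all_gather -> c_allgather
        gathered = []
        dist.all_gather(gathered, x)

        # reduce (sum onto rank 0) -> c_reduce
        dist.reduce(x, dst=0)

        # reduce_scatter -> c_reduce_scatter
        out = paddle.zeros_like(x)
        dist.reduce_scatter(out, [x.clone() for _ in range(nranks)])

        # scatter from rank 0 -> c_scatter
        recv = paddle.zeros_like(x)
        dist.scatter(recv, [x.clone() for _ in range(nranks)] if rank == 0 else None, src=0)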
    
    * Remove useless comments in process_group.py
    
    * Polish code style.
    
    * Fix some problems.
    
    * Remove use of the fluid API in phi comm_context_manager.
    
    * Add PADDLE_WITH_CUDA and PADDLE_WITH_NCCL macro checks.
    
    * Fix bug of HIP architecture.
    
    * Fix some problems.
    1. Remove useless logging.
    2. Fix conditional compilation for HIP.
    3. Fix problems in test_pass_generation_pipeline.py. The test calls paddle.distributed.init_parallel_env() first,
    then auto.Engine calls _init_comm(), which calls process_group.instantiate(). However, init_parallel_env() calls
    paddle.distributed.barrier(), which calls CreateNCCLEnvCache and creates the corresponding NCCLCommContext. But dev_id is not
    set at that point; as a result, NCCLCommContext's dev_ctx is not initialized (see the sketch below).
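
    Illustrative sketch of that call order (not code from the commit): the internal
    steps are summarized as comments, since CreateNCCLEnvCache, NCCLCommContext and
    process_group.instantiate() are Paddle internals rather than public API. Assumes
    a multi-GPU launch via `python -m paddle.distributed.launch`.

        import paddle.distributed as dist

        dist.init_parallel_env()
        # init_parallel_env() internally calls paddle.distributed.barrier(), which
        # reaches CreateNCCLEnvCache and creates the corresponding NCCLCommContext.
        # Before this fix, dev_id had not been set at that point, so the context's
        # dev_ctx was left uninitialized.

        # Later, the test builds an auto-parallel Engine; auto.Engine calls
        # _init_comm(), which calls process_group.instantiate() and ends up reusing
        # the half-initialized NCCLCommContext created above.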
    
    * Fix some problems.
    
    * Polish code.
    
    * Polish code.
    
    * Revert the compatibility upgrade for communication operators. Their upgrades
    will be submitted in another PR.
    
    * Remove StaticTCPStore.
    
    * Remove useless modification.
    
    * Remove useless set_cuda_device_id.
    
    * Polish code.
    
    * Remove fluid header files in phi files.
    
    * Remove useless comments.
    
    * Fix problems of the HIP architecture.
    
    * Fix some problems.
    
    * Polish code.
    
    * Polish code style.
    
    ---------
    Co-authored-by: hitywt <yuwentao126@126.com>