[Auto Parallel] Compatible new comm library upgrade (#56604)
* For verification: fluid operators support the new comm library.
* u
* u
* u
* Compatible new comm library upgrade for c_allgather, c_reduce, c_reduce_scatter, and c_scatter.
* Remove useless comments in process_group.py
* Polish code style.
* Fix some problems.
* Remove use of the fluid API in phi's comm_context_manager.
* Add PADDLE_WITH_CUDA and PADDLE_WITH_NCCL macro guards.
* Fix a bug in the HIP build.
* Fix some problems.
1. Remove useless logging.
2. Fix conditional compilation for HIP.
3. Fix test_pass_generation_pipeline.py. It calls paddle.distributed.init_parallel_env() first, and then auto.Engine calls _init_comm(), which calls process_group.instantiate(). However, init_parallel_env() calls paddle.distributed.barrier(), which calls CreateNCCLEnvCache and creates the corresponding NCCLCommContext. Because dev_id has not been set at that point, the NCCLCommContext's dev_ctx is left uninitialized.
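
The initialization-order hazard described in item 3 can be sketched with hypothetical stand-in classes (the names `NCCLCommContext`, `CommContextManager`, `set_device_id`, and `create_nccl_env_cache` mirror the report but are illustrative, not Paddle's actual API): a comm context built before the device id is known ends up with no device context.

```python
# Hypothetical sketch of the initialization-order bug; these classes
# are stand-ins, not PaddlePaddle's real implementation.

class NCCLCommContext:
    def __init__(self, dev_id):
        # dev_ctx can only be built once a concrete device id is known.
        self.dev_ctx = (
            f"CUDADeviceContext(dev={dev_id})" if dev_id is not None else None
        )

class CommContextManager:
    def __init__(self):
        self.dev_id = None   # must be set via set_device_id() before first use
        self._cache = {}

    def set_device_id(self, dev_id):
        self.dev_id = dev_id

    def create_nccl_env_cache(self, ring_id):
        # Lazily builds and caches the comm context for a ring,
        # using whatever dev_id is set at call time.
        if ring_id not in self._cache:
            self._cache[ring_id] = NCCLCommContext(self.dev_id)
        return self._cache[ring_id]

# Buggy order: barrier() triggers cache creation before dev_id is set,
# so dev_ctx is left uninitialized (None).
mgr = CommContextManager()
ctx = mgr.create_nccl_env_cache(ring_id=0)
assert ctx.dev_ctx is None

# Fixed order: set the device id first, then create the context.
mgr2 = CommContextManager()
mgr2.set_device_id(0)
ctx2 = mgr2.create_nccl_env_cache(ring_id=0)
assert ctx2.dev_ctx is not None
```

The fix, per the log, is to ensure the device id is registered before any collective (such as barrier) forces creation of the comm context cache.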
* Fix some problems.
* Polish code.
* Polish code.
* Revert the compatible upgrade for communication operators; their upgrades
will be submitted in a separate PR.
* Remove StaticTCPStore.
* Remove useless modification.
* Remove useless set_cuda_device_id.
* Polish code.
* Remove fluid header files in phi files.
* Remove useless comments.
* Fix problems with the HIP build.
* Fix some problems.
* Polish code.
* Polish code style.
---------
Co-authored-by: hitywt <yuwentao126@126.com>