    [Auto Parallel] Compatible new comm library upgrade (#56604) · ade51aa5
    Ghost Screaming committed
    * for verify
    
    fluid operator support new comm library
    
    * u
    
    * u
    
    * u
    
    * Compatible new comm library upgrade for c_allgather, c_reduce, c_reduce_scatter and c_scatter.
    
    * Remove useless comments in process_group.py
    
    * Polish code style.
    
    * Fix some problems.
    
    * Remove use of fluid API in phi comm_context_manager.
    
    * Add PADDLE_WITH_CUDA and PADDLE_WITH_NCCL macro checks.
    
    * Fix bug of HIP architecture.
    
    * Fix some problems.
    1. Remove useless logging.
    2. Fix conditional compilation for HIP.
    3. Fix problems in test_pass_generation_pipeline.py. It first calls paddle.distributed.init_parallel_env(),
    then auto.Engine calls _init_comm(), which calls process_group.instantiate(). However, init_parallel_env() calls
    paddle.distributed.barrier(), which calls CreateNCCLEnvCache and creates the corresponding NCCLCommContext. Because dev_id is not
    set at that point, NCCLCommContext's dev_ctx is not initialized.
    
    * Fix some problems.
    
    * Polish code.
    
    * Polish code.
    
    * Revert compatible upgrade for communication operators. Their upgrades
    will be submitted in another PR.
    
    * Remove StaticTCPStore.
    
    * Remove useless modification.
    
    * Remove useless set_cuda_device_id.
    
    * Polish code.
    
    * Remove fluid header files in phi files.
    
    * Remove useless comments.
    
    * Fix problems of hip arch.
    
    * Fix some problems.
    
    * Polish code.
    
    * Polish code style.
    
    ---------
    Co-authored-by: hitywt <yuwentao126@126.com>
communication.cc 4.7 KB