    [Auto para] Relaunch with auto mapping function (#37326) · 506e79d1
    Committed by Yulong Ao
    * [Auto Parallel] Add the unified cluster representation
    
    * [Auto Parallel] Add the graph class for physical mapping
    
    * [Auto Parallel] Add the simple physical mapper
    
    * Set the timeout of the mapper
    
    * Merge the upstream develop unittests cmake files
    
    * Fix a bug of the process group
    
    * Remove the mapper unittest from non-GPU platforms
    
    * Move the instantiation of process group after resharding
    
    * Add the local id for devices
    
    * Update the rank mapping format
    
    * [Auto Parallel] Relaunch with the rank mapping file
    
    * Remove the unnecessary json file
    
    * Avoid entering get_device_proc_info for auto mapping
    
    * Correct the mapper unit test
    
    * Add some comments
    
    * Remove the related files about mapping
    
    * Update the unittest for auto mapping
    
    * Remove unused rank_mapping unittest
    
    * Improve the unittest coverage
    
    * Improve the unittest coverage
    
    * Improve the unittest of relaunch
    
    * Fix the unittest problem in CI
    
    * Improve the unittest of relaunch
    
    * Remove unnecessary statements
    
    * Update the unittest cmakefile
    
    * Correct the cmakefile of auto parallel unittests
    
    * Modify codes based on the new elastic change
    
    * Use the GPUs exclusively in the unittest
    
    * Correct the cmakefile
    
    * Set the timeout of the unittest
launch_utils.py 67.8 KB