    [Cherry-Pick][AutoParallel] auto_parallel cherry-pick to release2.4 (#47145) · 90b31790
    Committed by zhaoyingli
    * [Auto Parallel] Make Engine class callable (#46416)
    
    * [Auto Parallel] Improve the user-defined fetches and logging
    
    * [Auto Parallel] Make Engine class callable
    
    * [Auto Parallel] Update the data loading of tuner
    
    * Print IPS in auto parallel Engine (#46554)
    
    * [AutoParallel] fix dist_split (#46505)
    
    * [AutoParallel] fix dist_split
    
    * add unittest
    
    * update cmakelist
    
    * [AutoParallel] fix sharding (#46572)
    
    * [AutoParallel] fix process_mesh (#46583)
    
    * [AutoParallel] fix reshard when train with eval (#46605)
    
    * [AutoParallel] fix reshard when train with eval
    
    * fix mppp
    
    * [AutoParallel] fix amp when predict (#46637)
    
    * [Auto Parallel] Update comp cost and completion for gpt auto search (#46387)
    
    * update comp cost and completion for gpt auto search
    
    * add unittest
    
    * [Auto Parallel] Fix bugs caused by the inconsistent outputs of Engine API (#46633)
    
    * [Auto Parallel] Unify the logger and outputs of Engine API
    
    * [Auto Parallel] Fix the bugs of to_static
    
    * [Auto Parallel] Adjust the test_to_static.py
    
    * [Auto Parallel] Improve the fine-grained APIs (#46552)
    
    * [Auto Parallel] Support different dataloaders
    
    * [Auto Parallel] Add num_shards config for dataset
    
    * [Auto Parallel] Unify the logger and outputs of Engine API
    
    * [Auto Parallel] Fix the bugs of to_static
    
    * [Auto Parallel] Adjust the test_to_static.py
    
    * [Auto Parallel] Add the prepare API and replace __call__ with run
    
    * [Auto Parallel] Improve the private implementations of Engine
    
    * [Auto Parallel] Set capacity of dataloader for opt tuning
    
    * [Auto Parallel] [WIP] Change the fine-grained API
    
    * [Auto Parallel] Improve APIs to support different use cases
    
    * [Auto Parallel] Add removed config
    
    * [Auto Parallel] Add imports
    
    * [Auto Parallel] Fix bugs for to_static
    
    * [Auto Parallel] Remove unnecessary imports
    
    * bugfix (#46921)
    
    * [Auto Parallel] Fix the bug for None labels (#46987)
    
    * [AutoParallel] adapt for gpt-gen (#46771)
    
    * for gpt-gen
    
    * fix reshard
    
    * adapt assign and shape op
    
    * add dist_assign & unittest
    
    * add conditional block unittest
    
    * rename unittest
    
    * [Auto Parallel] Fix the bug of completion (#47056)
    
    * [Auto Parallel] Fix the bug for None labels
    
    * [Auto Parallel] Fix the completion bug
    
    * [AutoParallel] add callbacks (#47014)
    
    * [AutoParallel] add callbacks
    
    * fix unittest
    
    * fix dist_context
    
    * fix engine
    
    * fix cmakelist
    
    * fix unittest's returns
    
    * fix cmakelist
    
    * [Auto Parallel] Add cost interface (#47043)
    
    * add cost interface
    
    * update interface and add unittest
    
    * update unittest
    
    * update interface
    
    * [Auto Parallel] Add parallel tuner (#46189)
    
    * add parallel tuner
    
    * add unittest
    
    * fix unittest
    
    * set timeout of unittest
    
    * set unittest timeout
    
    * fix auto_mode setting
    
    * update unittest
    
    * sync from develop and update unittest
    
    * remove unused import
    
    * update unittest
    
    * update cmakelist
    
    * add unittests
    Co-authored-by: Yulong Ao <aoyulong@baidu.com>
    Co-authored-by: Ruibiao Chen <chenruibiao@baidu.com>
    Co-authored-by: caozhou <48191911+Caozhou1995@users.noreply.github.com>
    Co-authored-by: JZ-LIANG <jianzhongliang10@gmail.com>
auto_parallel_data_parallel_optimization.py 23.4 KB