1. 27 Feb, 2020 · 2 commits
    • MPI 3.x support via mpi4py (#107) · 7e813283
      Jeff Rasley committed
      * add mpirun support for openmpi 4.0
      
      * add master addr support from args
      
      * switch mpi detection to use mpi4py
      
      * set constant for default distributed port
      
      * Make sure deepspeed_mpi exists in args
    • Init distributed torch only if needed (#108) · 5aa58b38
      Jeff Rasley committed
      * add auto-detect to torch dist init
      
      * update tests to infer distributed init status
      
      * prevent crash if dist_init_required is True but already initialized
      
      * only init if safe to do so (forgot to add this file in prev commit)
  2. 22 Feb, 2020 · 1 commit
  3. 21 Feb, 2020 · 1 commit
  4. 04 Feb, 2020 · 1 commit