1. 15 Dec, 2020 1 commit
    • NCCL-based 1-bit Adam + Code Refactor for Comm. Backends (#594) · a6dba72a
      Ammar Ahmad Awan committed
      * NCCL based 1-bit Implementation + Refactor to add communication backends (#593)
      
      * add nccl 1-bit optim.
      
      * temporary commit to save stuff.
      
      * Use dist collectives instead of mpi routines.
      
      * remove old code for comm.
      
      * Fix bugs. still does not work.
      
      * modify to test the nccl side code path
      
      * Initial gather impl. Works intra-node.
      
      * Updates to comm. phase 2. nccl comm. passed the tests.
      
      * refactor code to introduce nccl/mpi as backends for onebit adam.
      
      * Refactor updates to test/engine.
      
      * Fix compile/runtime errors.
      
      * simplify support for nccl/mpi backends.
      
      * Add missing file
      
      * Add compression backend in constructor. Revert later.
      
      * modify test with some perf counting.
      
      * Implement a true non-blocking gather for nccl side.
      
      * Revert "Add compression backend in constructor. Revert later."
      
      This reverts commit df8c40d3.
      
      * improve the 1-bit adam test.
      
      * Refactor comm. and compression backend in 1-bit adam.
      
      * Fix the test.
      
      * Fix runtime errors and typos in nccl backend
      
      * fix mpi backend. modify tests.
      
      * modify nccl perf test.
      
      * fix mpi side errors.
      
      * Add an mpi perf test
      
      * Sync DSE.
      
      * Remove old collectives file.
      
      * Undo a typo.
      
      * Graceful failure for torch versions that don't support nccl pt2pt.
  2. 10 Dec, 2020 4 commits
  3. 09 Dec, 2020 1 commit
  4. 08 Dec, 2020 2 commits
  5. 05 Dec, 2020 1 commit
  6. 03 Dec, 2020 5 commits
  7. 02 Dec, 2020 2 commits
  8. 28 Nov, 2020 1 commit
    • [doc] typo fix and clarification (#563) · 17f36f1b
      Stas Bekman committed
      This PR:
      * fixes a misspelled method name
      * clarifies `( () )`, which doesn't read well until one reads the code and understands that it's not a formatting bug. I proposed to simply say that it's a callable object.
  9. 26 Nov, 2020 4 commits
  10. 23 Nov, 2020 1 commit
  11. 25 Nov, 2020 6 commits
  12. 24 Nov, 2020 1 commit
  13. 23 Nov, 2020 1 commit
  14. 22 Nov, 2020 1 commit
  15. 21 Nov, 2020 1 commit
  16. 20 Nov, 2020 6 commits
  17. 19 Nov, 2020 2 commits