- 20 May, 2021 (1 commit)
-
-
Committed by Jeff Rasley
-
- 29 April, 2021 (1 commit)
-
-
Committed by Sean Naren
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 17 March, 2021 (1 commit)
-
-
Committed by Jeff Rasley
-
- 09 March, 2021 (1 commit)
-
-
Committed by Samyam Rajbhandari
* Squash stage3 v1 (#146)
Co-authored-by: Samyam <samyamr@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Samyam Rajbhandari <samyamr@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
Co-authored-by: Shaden Smith <ShadenTSmith@gmail.com>
Co-authored-by: eltonzheng <eltonz@microsoft.com>
* Fix correctness bug (#147)
* formatting fix (#150)
* stage3 bugfix (API) update and simplified FP16 Z3 tests (#151)
* fp16 Z3 API update and bugfix
* revert debug change
* ZeRO-3 detach and race condition bugfixes (#149)
* trying out ZeRO-3 race condition fix
* CUDA sync instead of stream
* reduction stream sync
* remove commented code
* Fix optimizer state_dict KeyError (#148)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* fix for smaller SGS sizes, ensures each grad is backed by unique tensors (#152)
* Simplifying the logic for getting averaged gradients (#153)
* skip for now
* Z3 Docs redux (#154)
* removing some TODOs and commented code (#155)
* New Z3 defaults (#156)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* formatting
* megatron external params
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
Co-authored-by: Shaden Smith <ShadenTSmith@gmail.com>
Co-authored-by: eltonzheng <eltonz@microsoft.com>
-
- 15 January, 2021 (1 commit)
-
-
Committed by Jeff Rasley
-
- 18 December, 2020 (1 commit)
-
-
Committed by Jeff Rasley
-
- 20 November, 2020 (1 commit)
-
-
Committed by Jeff Rasley
-
- 13 November, 2020 (1 commit)
-
-
Committed by Jeff Rasley
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
-
- 31 October, 2020 (1 commit)
-
-
Committed by Reza Yazdani
* add adamW to CPU-ADAM implementation
* supporting cpu-adam optimizer for zero-offload on deepspeed side
* bump DSE to match cpu-adam updates
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 10 September, 2020 (2 commits)
-
-
Committed by Shaden Smith
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Jeff Rasley
* ZeRO-Offload (squash) (#381)
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Jie <37380896+jren73@users.noreply.github.com>
Co-authored-by: Arash Ashari <arashari@microsoft.com>
Co-authored-by: Samyam Rajbhandari <samyamr@microsoft.com>
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
Co-authored-by: arashashari <arashashari@ArashMSLaptop.redmond.corp.microsoft.com>
Co-authored-by: RezaYazdaniAminabadi <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
-
- 02 September, 2020 (1 commit)
-
-
Committed by Jeff Rasley
* Sparse attn + ops/runtime refactor + v0.3.0
Co-authored-by: Arash Ashari <arashari@microsoft.com>
-
- 05 June, 2020 (1 commit)
-
-
Committed by Chunyang Wen
* Add log util
* replace all occurrences of print and logging
* address format
* disable propagate to avoid duplicate log
-
- 30 May, 2020 (1 commit)
-
-
Committed by Jeff Rasley
* Transformer kernels release
Co-authored-by: Shaden Smith <ShadenTSmith@gmail.com>
Co-authored-by: Elton Zheng <eltonz@microsoft.com>
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: RezaYazdaniAminabadi <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
Co-authored-by: Tunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
Co-authored-by: Samyam Rajbhandari <samyamr@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 19 May, 2020 (1 commit)
-
-
Committed by Jeff Rasley
Updates for ZeRO stage 2 + ZeRO stage 1 w. RS
Co-authored-by: Tunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Samyam Rajbhandari <samyamr@microsoft.com>
Co-authored-by: Shaden Smith <ShadenTSmith@gmail.com>
Co-authored-by: Elton Zheng <eltonz@microsoft.com>
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
Co-authored-by: yuxionghe <yuxhe@microsoft.com>
Co-authored-by: Arash Ashari <arashari@microsoft.com>
-
- 22 April, 2020 (1 commit)
-
-
Committed by Shaden Smith
-
- 27 February, 2020 (2 commits)
-
-
Committed by Jeff Rasley
* add mpirun support for openmpi 4.0
* add master addr support from args
* switch mpi detection to use mpi4py
* set constant for default distributed port
* Make sure deepspeed_mpi exists in args
-
Committed by Jeff Rasley
* add auto-detect to torch dist init
* update tests to infer distributed init status
* prevent crash if dist_init_required is True but already initialized
* only init if safe to do so (forgot to add this file in prev commit)
-
- 25 February, 2020 (1 commit)
-
-
Committed by Jeff Rasley
-
- 21 February, 2020 (1 commit)
-
-
Committed by Jeff Rasley
Also a fix for #94
-
- 20 February, 2020 (1 commit)
-
-
Committed by Shaden Smith
-
- 10 February, 2020 (1 commit)
-
-
Committed by Shaden Smith
-
- 07 February, 2020 (1 commit)
-
-
Committed by Olatunji Ruwase
Unit tests for add_XXX_arguments
-
- 04 February, 2020 (1 commit)
-
-
Committed by Shaden Smith
Fixing file permissions.
-
- 01 February, 2020 (1 commit)
-
-
Committed by Jeff Rasley
-