- 27 Jan, 2022, 1 commit
Committed by Jeff Rasley
- 09 Dec, 2021, 1 commit
Committed by Olatunji Ruwase
* Control ds_report output with two flags --hide_operators and --hide_errors_and_warnings
  Separate cli and function entry points to ds_report
* Formatting fixes
- 17 Jun, 2021, 1 commit
Committed by Jeff Rasley
- 03 Jun, 2021, 1 commit
Committed by Reza Yazdani
* Change the sparse attention API to be compatible with latest changes on the triton side
* remove compatibility checks for CUDA 11
* Update requirements-sparse_attn.txt
Co-authored-by: Arash Ashari <arashari@microsoft.com>
- 09 Mar, 2021, 1 commit
Committed by Samyam Rajbhandari
* Squash stage3 v1 (#146)
Co-authored-by: Samyam <samyamr@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Samyam Rajbhandari <samyamr@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
Co-authored-by: Shaden Smith <ShadenTSmith@gmail.com>
Co-authored-by: eltonzheng <eltonz@microsoft.com>
* Fix correctness bug (#147)
* formatting fix (#150)
* stage3 bugfix (API) update and simplified FP16 Z3 tests (#151)
* fp16 Z3 API update and bugfix
* revert debug change
* ZeRO-3 detach and race condition bugfixes (#149)
* trying out ZeRO-3 race condition fix
* CUDA sync instead of stream
* reduction stream sync
* remove commented code
* Fix optimizer state_dict KeyError (#148)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* fix for smaller SGS sizes, ensures each grad is backed by unique tensors (#152)
* Simplifying the logic for getting averaged gradients (#153)
* skip for now
* Z3 Docs redux (#154)
* removing some TODOs and commented code (#155)
* New Z3 defaults (#156)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* formatting
* megatron external params
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
Co-authored-by: Shaden Smith <ShadenTSmith@gmail.com>
Co-authored-by: eltonzheng <eltonz@microsoft.com>
- 05 Jan, 2021, 1 commit
Committed by Jeff Rasley
- 13 Nov, 2020, 1 commit
Committed by Jeff Rasley
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>