- 22 4月, 2021 3 次提交
-
-
由 sdtblck 提交于
-
由 Olatunji Ruwase 提交于
* Make reduce scatter optional for ZeRO-1 as workaround * Make allreduce default for ZeRO 1 Co-authored-by: NJeff Rasley <jerasley@microsoft.com>
-
由 Cheng Li 提交于
* use wierd shaped tensor to avoid silent failures when not registering externel params * fix typo Co-authored-by: NOlatunji Ruwase <olruwase@microsoft.com>
-
- 21 4月, 2021 5 次提交
-
-
由 Conglong Li 提交于
1-bit LAMB: Communication Efficient Large-Scale Large-Batch Training with LAMB's Convergence Speed. Author: @conglongli, @awan-10, @samyam, Hanlin Tang, Yuxiong He Paper: https://arxiv.org/abs/2104.06069Co-authored-by: Nsdtblck <46172032+sdtblck@users.noreply.github.com> Co-authored-by: NJeff Rasley <jerasley@microsoft.com>
-
由 Stas Bekman 提交于
-
由 Stas Bekman 提交于
Co-authored-by: NOlatunji Ruwase <olruwase@microsoft.com>
-
由 Sean Naren 提交于
* Add check to see if json file is already loaded * Update doc * Address review * Remove doc comment Co-authored-by: NOlatunji Ruwase <olruwase@microsoft.com>
-
由 Jeff Rasley 提交于
-
- 20 4月, 2021 1 次提交
-
-
由 Shaden Smith 提交于
-
- 19 4月, 2021 2 次提交
-
-
由 Jeff Rasley 提交于
-
由 Jeff Rasley 提交于
-
- 20 4月, 2021 1 次提交
-
-
由 Shaden Smith 提交于
* zinf tutorial * more megatron integration docs * ZInf + tiling docs
-
- 19 4月, 2021 5 次提交
-
-
由 Jeff Rasley 提交于
-
由 Jeff Rasley 提交于
-
由 Shaden Smith 提交于
* zinf tutorial * more megatron integration docs
-
由 Jeff Rasley 提交于
-
由 Jeff Rasley 提交于
Co-authored-by: NOlatunji Ruwase <olruwase@microsoft.com> Co-authored-by: NSamyam Rajbhandari <samyamr@microsoft.com> Co-authored-by: NShaden Smith <Shaden.Smith@microsoft.com>
-
- 17 4月, 2021 1 次提交
-
-
由 Olatunji Ruwase 提交于
* Fix UnboundLocalError * Get full partition size
-
- 15 4月, 2021 3 次提交
-
-
由 Cheng Li 提交于
* update lr scheduler doc for doing per step or epoch update * work * trigger build Co-authored-by: NOlatunji Ruwase <olruwase@microsoft.com>
-
由 Stas Bekman 提交于
* faster flatten/unflatten with apex * switch to cpp flatten/unflatten * style * better comment * missing import * switch to build ops at run time * fixes Co-authored-by: NOlatunji Ruwase <olruwase@microsoft.com>
-
由 Stas Bekman 提交于
-
- 14 4月, 2021 3 次提交
-
-
由 Stas Bekman 提交于
* e-notation for large floats * handle ints too * readability * handle bool Co-authored-by: NOlatunji Ruwase <olruwase@microsoft.com>
-
由 Stas Bekman 提交于
-
由 Takuya Makino 提交于
-
- 09 4月, 2021 2 次提交
-
-
由 Jeff Rasley 提交于
-
由 Jeff Rasley 提交于
-
- 08 4月, 2021 9 次提交
-
-
由 Reza Yazdani 提交于
Co-authored-by: NJeff Rasley <jerasley@microsoft.com>
-
由 Stas Bekman 提交于
Co-authored-by: NOlatunji Ruwase <olruwase@microsoft.com> Co-authored-by: NJeff Rasley <jerasley@microsoft.com>
-
由 Samyam Rajbhandari 提交于
Co-authored-by: NJeff Rasley <jerasley@microsoft.com>
-
由 Stas Bekman 提交于
We really really really need those params to be validated... Co-authored-by: NOlatunji Ruwase <olruwase@microsoft.com> Co-authored-by: NJeff Rasley <jerasley@microsoft.com>
-
由 Stas Bekman 提交于
Co-authored-by: NOlatunji Ruwase <olruwase@microsoft.com> Co-authored-by: NJeff Rasley <jerasley@microsoft.com>
-
由 Stas Bekman 提交于
Co-authored-by: NJeff Rasley <jerasley@microsoft.com>
-
由 Stas Bekman 提交于
Co-authored-by: NOlatunji Ruwase <olruwase@microsoft.com> Co-authored-by: NReza Yazdani <reyazda@microsoft.com> Co-authored-by: NJeff Rasley <jerasley@microsoft.com>
-
由 Stas Bekman 提交于
Co-authored-by: NOlatunji Ruwase <olruwase@microsoft.com> Co-authored-by: NJeff Rasley <jerasley@microsoft.com>
-
由 Samyam Rajbhandari 提交于
Co-authored-by: NOlatunji Ruwase <olruwase@microsoft.com> Co-authored-by: NJeff Rasley <jerasley@microsoft.com>
-
- 07 4月, 2021 1 次提交
-
-
由 Takuya Makino 提交于
-
- 03 4月, 2021 2 次提交
-
-
由 Ammar Ahmad Awan 提交于
Co-authored-by: NJeff Rasley <jerasley@microsoft.com>
-
由 Jeff Rasley 提交于
This test has been giving us trouble for a bit, seeing nondeterministic failures, skipping for now to not break out CI. Need to revisit soon though.
-
- 02 4月, 2021 1 次提交
-
-
由 Stas Bekman 提交于
* zero.Init() clarification clarify that if `model.half()` can't fit into gpu memory `zero.Init()` is a must. this proposal is via @samyam's clarification shared elsewhere. Thank you. * style * add clarity * style Co-authored-by: NOlatunji Ruwase <olruwase@microsoft.com>
-
- 31 3月, 2021 1 次提交
-
-
由 Jeff Rasley 提交于
-