- 09 Jul, 2022 1 commit
-
-
Committed by Sam Ade Jacobs
* Add github username to CODEOWNERS
* add import torch.distributed to small model debugging test script
* Replace torch.dist with deepspeed.comm
-
- 08 Jul, 2022 1 commit
-
-
Committed by Siddharth Singh
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 07 Jul, 2022 4 commits
-
-
Committed by Alex Hedges
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
Committed by Jeff Rasley
-
Committed by kisseternity
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Jeff Rasley
-
- 30 Jun, 2022 2 commits
-
-
Committed by Quentin Anthony
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Alex Hedges
* Fix missing import in replace_module.py
* Change import from torch.distributed to deepspeed.comm
-
- 28 Jun, 2022 1 commit
-
-
Committed by Siddharth Singh
-
- 24 Jun, 2022 1 commit
-
-
Committed by Michael Wyatt
* assert no FP16 with AMD CPUs
* add unit test for AMD assert error
* missing import
* downgrade assert to warning
-
- 23 Jun, 2022 3 commits
-
-
Committed by Reza Yazdani
* Fix the half-precision version of CPU-Adam
* remove unexpected return
* fix the increase width (fp32/fp16)
* support fp16 tests for cpu-adam
* fix the fp16 data-loading
* change unit-test for fp16 check & slight change to parameter size
* fix for numpy error
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
Committed by Conglong Li
-
Committed by Michael Wyatt
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 22 Jun, 2022 2 commits
-
-
Committed by Olatunji Ruwase
* Split parameter offload from z3
* Format fixes
* Bug fixes
* Cleanup
* Remove dead code
-
Committed by Olatunji Ruwase
* Retain prefetched params until last use
* Unit tests fixes
-
- 21 Jun, 2022 2 commits
-
-
Committed by Michael Wyatt
* fix to catch assert error for inference test imports
* fix wrong syntax
* changed to sequential inf tests
* fix for lm_eval import
* added environment check fixture
* added expected torch and cuda version
* check various version depth for cuda/torch
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Karim Foda
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
- 20 Jun, 2022 1 commit
-
-
Committed by Olatunji Ruwase
-
- 16 Jun, 2022 5 commits
-
-
Committed by Quentin Anthony
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Aman Sanger
-
Committed by Reza Yazdani
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Jeff Rasley
-
Committed by Jeff Rasley
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
-
- 15 Jun, 2022 2 commits
-
-
Committed by Jeff Rasley
-
Committed by Conglong Li
-
- 14 Jun, 2022 1 commit
-
-
Committed by Quentin Anthony
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 11 Jun, 2022 1 commit
-
-
Committed by Ammar Ahmad Awan
Co-authored-by: Quentin Anthony <qganthony@yahoo.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 08 Jun, 2022 2 commits
-
-
Committed by Jeff Rasley
-
Committed by Jerry Mannil
Add '-S' argument to pdsh command to return the largest error code from the ssh sessions
-
- 07 Jun, 2022 2 commits
-
-
Committed by Michael Wyatt
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Michael Wyatt
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 01 Jun, 2022 4 commits
-
-
Committed by Cheng Li
-
Committed by Michael Wyatt
-
Committed by Michael Wyatt
* added unit test for various HF model families and tasks
* formatting
* added missing import
* fixed broken pytest global vars
* modified test to conform to other test structure
* removed gpt-j. it cannot run on V100s (OOM)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Reza Yazdani
-
- 26 May, 2022 2 commits
-
-
Committed by Jeff Rasley
-
Committed by Jeff Rasley
-
- 25 May, 2022 1 commit
-
-
Committed by Reza Yazdani
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 23 May, 2022 1 commit
-
-
Committed by Mikhail Druzhinin
* Fix sparse grads not being updated
* Remove .data call for sparse grads
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 21 May, 2022 1 commit
-
-
Committed by Quentin Anthony
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-