- 21 Apr, 2021 1 commit
Committed by Conglong Li
1-bit LAMB: Communication Efficient Large-Scale Large-Batch Training with LAMB's Convergence Speed. Author: @conglongli, @awan-10, @samyam, Hanlin Tang, Yuxiong He Paper: https://arxiv.org/abs/2104.06069 Co-authored-by: sdtblck <46172032+sdtblck@users.noreply.github.com> Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 11 Mar, 2021 1 commit
Committed by Stas Bekman
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 17 Sep, 2020 1 commit
Committed by Shaden Smith
* Switches fused_optimizer overflow calculation
-
- 10 Sep, 2020 2 commits
Committed by Shaden Smith
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Ammar Ahmad Awan
* 1-bit adam (#353) Co-authored-by: Jeff Rasley <jerasley@microsoft.com> Co-authored-by: Your Name <you@example.com> Co-authored-by: tanghl1994 <htang14@ur.rochester.edu> Co-authored-by: Hank <tanghl1994@gmail.com> Co-authored-by: root <root@node2x12b.cs.rochester.edu> Co-authored-by: Ammar Ahmad Awan <awan.ammar@microsoft.com>
-
- 02 Sep, 2020 1 commit
Committed by Jeff Rasley
* Sparse attn + ops/runtime refactor + v0.3.0 Co-authored-by: Arash Ashari <arashari@microsoft.com> Co-authored-by: Arash Ashari <arashari@microsoft.com>
-
- 24 Jun, 2020 1 commit
Committed by Olatunji Ruwase
* Load non-DeepSpeed checkpoints into ZeRO optimizer * Handle parameters smaller than DP * Formatting fixes
-
- 20 Jun, 2020 2 commits
Committed by Shaden Smith
This reverts commit 54c0267e.
-
Committed by Tunji Ruwase
-
- 06 Jun, 2020 1 commit
Committed by Olatunji Ruwase
* Debugging * Fix step() bug; Make step timing optional * Remove unnecessary changes * Format fixes * Replace list with scalar variable * Remove redundant code * Fix typo
-
- 05 Jun, 2020 1 commit
Committed by Chunyang Wen
* Add log util * replace all occurrences of print and logging * address format * disable propagate to avoid duplicate log
-
- 12 May, 2020 1 commit
Committed by Olatunji Ruwase
* Support dynamic loss scale args in fp16 optimizers * Update names
-
- 03 Apr, 2020 1 commit
Committed by kouml
-
- 11 Mar, 2020 1 commit
Committed by Samyam Rajbhandari
* Enhancement: Ability to load a checkpoint without loading the optimizer states. Unit test covering saving and loading checkpoints with the fused, unfused, and ZeRO optimizers. The unit test takes about 165s.
-
- 04 Feb, 2020 1 commit
Committed by Samyam Rajbhandari
Different Optimizers in DeepSpeed.
-