- 16 May 2023, 2 commits
-
-
Committed by digger yu
* fix spelling error with deepspeed/runtime/
* fix typo docs/
* fix typo in comments with deepspeed/
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Joe Mayer
* Updating fused adam with new Apex bf16 support.
* Removing capturable and master weight configs.
* resolving pr comments
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 15 May 2023, 2 commits
-
-
Committed by Heyang Qin
* share inflight registry between PartitionedParameterCoordinator
* bound registry to model
* make InflightParamRegistry standalone
* fix format
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Yizhou Wang
* try to fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
* fix format error
* fix format issue
* add TODO for integrated testing of TP and ZeRO 1/2/3
* fix default pg error
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 13 May 2023, 4 commits
-
-
Committed by mmhab
DeepSpeedZeRoOffload
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Michael Wyatt
-
Committed by Gavin Goodship
* Update pytorch-profiler.md
* Update one-cycle.md
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by digger-yu
* fix spelling error with deepspeed/runtime/
* fix typo docs/
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
- 12 May 2023, 4 commits
-
-
Committed by digger-yu
-
Committed by Joe Mayer
* Changing monitor loss to aggregate loss over gas.
* Adding self.losses to engine constructor.
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Molly Smith
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by digger-yu
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
- 11 May 2023, 3 commits
-
-
Committed by Lev Kurilenko
Co-authored-by: Connor Holmes <connorholmes@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
-
Committed by Gavin Goodship
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Tian, Feng
Add snip_momentum structured pruning which can support higher sparse ratio with minor accuracy loss (#3300)
Signed-off-by: Tian, Feng <feng.tian@intel.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 10 May 2023, 5 commits
-
-
Committed by Wang, Yi
fix regression in shard checkpoint loading in AutoTP Path caused by qkv_copy() is deleted and add UT case for shard checkpoint loading in AutoTP (#3457)
* add UT case for shard checkpoint loading in AutoTP
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* autoTP path also support shard loading
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
---------
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
-
Committed by Lev Kurilenko
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
-
Committed by MrZhengXin
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
Committed by digger-yu
* Update index.md fix spelling error
* Update training.md fix spelling error
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by digger-yu
-
- 09 May 2023, 5 commits
-
-
Committed by YiSheng5
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by LiYu Lu
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by digger-yu
* fix some spelling error under doc/
* fix spelling error deepspeed/
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Gavin Goodship
* Update 2020-09-09-sparse-attention.md
* Update MoQ-tutorial.md
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Pablo Emídio S.S
-
- 08 May 2023, 1 commit
-
-
Committed by Wang, Yi
remove bloom from unsupported Models
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
- 06 May 2023, 1 commit
-
-
Committed by Yizhou Wang
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 05 May 2023, 1 commit
-
-
Committed by Wang, Yi
* add sharded checkpoint loading for AutoTP path to reduce the peak memory in initialization stage
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* fix gptj sharded checkpoint loading problem
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
---------
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 04 May 2023, 5 commits
-
-
Committed by Connor Holmes
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Michael Wyatt
-
Committed by jianan-gu
* Enable auto TP policy for llama model
* Update automatic-tensor-parallelism.md
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Molly Smith <112220543+molly-smith@users.noreply.github.com>
-
Committed by Jeff Rasley
-
Committed by Gavin Goodship
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
- 03 May 2023, 3 commits
-
-
Committed by Joe Mayer
* Adding torch.optim.Adagrad
* adding adagrad for zero 1 2
* Adding Adagrad support to zero 3.
* Adding documentation and DeepSpeedCPUAdagrad to list.
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Stas Bekman
* [zero_to_fp32] fix shared param recovery
* cleanup
* cleanup
* better naming
* not all params have ds_id it seems
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Joe Mayer
* Add ZeRO 1 support to PP for BF16.
* Switching enum.
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 02 May 2023, 4 commits
-
-
Committed by Connor Holmes
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Zhewei Yao <zheweiyao@gmail.com>
-
Committed by Nr Wu
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Logan Adams
-
Committed by Reza Yazdani
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-