- 03 May, 2023: 3 commits
-
-
Committed by Joe Mayer
* Adding torch.optim.Adagrad
* adding adagrad for zero 1 2
* Adding Adagrad support to zero 3.
* Adding documentation and DeepSpeedCPUAdagrad to list.
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Stas Bekman
* [zero_to_fp32] fix shared param recovery
* cleanup
* cleanup
* better naming
* not all params have ds_id it seems
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Joe Mayer
* Add ZeRO 1 support to PP for BF16.
* Switching enum.
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 02 May, 2023: 6 commits
-
-
Committed by Connor Holmes
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Zhewei Yao <zheweiyao@gmail.com>
-
Committed by Nr Wu
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Logan Adams
-
Committed by Reza Yazdani
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Masahiro Tanaka
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Gavin Goodship
-
- 30 April, 2023: 1 commit
-
-
Committed by hablb
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 29 April, 2023: 1 commit
-
-
Committed by Jeff Rasley
* remove megatron-lm, no longer pip installable
* Add skips to tests that require megatron-lm and can't be run currently.
* formatting
* Formatting
Co-authored-by: Logan Adams <loadams@microsoft.com>
-
- 27 April, 2023: 1 commit
-
-
Committed by Jeff Rasley
-
- 26 April, 2023: 5 commits
-
-
Committed by hablb
extra_large_param_to_reduce is not used if contiguous_gradients is False, yet it keeps a reference to the param for the lifetime of the application.
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by 郭叶军
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Zhen Zhang
* include mics config and optimizer
* change private vars to public vars so the child class can initialize these vars
* Port the init function from stage3
* adding a model test file for mics
* adapt to get_accelerator api and fp16 group defrag
* WIP: porting mics modification to ms master
* WIP: included gradient all-reduce among replication groups
* WIP: ported hierarchical all gather part; did basic loss test on a simple MLP model
* [Bug fix] using the comm group attached on the param
* torch2.0 support
* remove print
* delegate wait op
* [Bug] fix naming
* adding doc string
* resolving recursive import
* fix formatting, typo and license
* fix license and unit test error
Co-authored-by: Ubuntu <ubuntu@ip-172-31-14-191.us-west-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-7-70.us-west-2.compute.internal>
Co-authored-by: Zhen Zhang <zhzhn@amazon.com>
Co-authored-by: zhzhn <zhzhn@ip-10-2-57-114.us-west-2.compute.internal>
-
Committed by Molly Smith
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Alexander Jipa
Co-authored-by: Alexander Jipa <azzhipa@amazon.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
- 25 April, 2023: 5 commits
-
-
Committed by ShijieZZZZ
* submit changes
* update format
* fix format
* revert
* test
* add top
* treat z1 as z2
* fix shared
* remove old changes
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Conglong Li
-
Committed by Michael Wyatt
* request log output
* add more details
-
Committed by Wang, Yi
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Adam Moody
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
- 24 April, 2023: 1 commit
-
-
Committed by Kobie Crawford
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 22 April, 2023: 5 commits
-
-
Committed by Dino Chen
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Michael Wyatt
-
Committed by Molly Smith
* diffusers 0.15.0 cross attention class check
* revert diffusers_attention.py
-
Committed by Ramya Ramineni
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Michael Wyatt
* formatting
* fixing clang-format version
* update pre-commit URL
-
- 21 April, 2023: 7 commits
-
-
Committed by Jeff Rasley
-
Committed by Connor Holmes
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Connor Holmes
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
Committed by Michael Wyatt
* move dist init out of Engine
-
Committed by Olatunji Ruwase
* zero3 checkpoint frozen params
* Remove debug prints
* Move to cpu
* WIP
* WIP
* WIP
* Cleanup
* Cleanup
* Extend unit test for frozen params
* API fix
-
Committed by bobowwb
Line 98 should be "curl -O https://bootstrap.pypa.io/pip/3.6/get-pip.py && \" to avoid the error "#16 106.9 ERROR: This script does not work on Python 3.6 The minimum supported Python version is 3.7. Please use https://bootstrap.pypa.io/pip/3.6/get-pip.py instead."
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by digger-yu
Code optimization: GitHub's image caching mechanism caches images, so add a random number after the last modified link. That way, the contributor's image is refreshed in real time on every visit to that link.
-
- 19 April, 2023: 5 commits
-
-
Committed by Logan Adams
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Logan Adams
-
Committed by Jinzhen Lin
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by digger-yu
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Logan Adams
-