- 03 May, 2022 (1 commit)
-
-
Committed by Jeff Rasley
-
- 30 April, 2022 (1 commit)
-
-
Committed by kisseternity
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 29 April, 2022 (2 commits)
-
-
Committed by Olatunji Ruwase
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Ramya Ramineni
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 28 April, 2022 (2 commits)
-
-
Committed by Jeff Rasley
-
Committed by Michael Wyatt
-
- 27 April, 2022 (4 commits)
-
-
Committed by Jeff Rasley
-
Committed by Michael Wyatt
-
Committed by Jeff Rasley
-
Committed by Jeff Rasley
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
-
- 26 April, 2022 (1 commit)
-
-
Committed by Olatunji Ruwase
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 22 April, 2022 (2 commits)
-
-
Committed by Shuai Zheng
Co-authored-by: Shuai Zheng <shzheng@amazon.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Michael Wyatt
-
- 21 April, 2022 (3 commits)
-
-
Committed by Olatunji Ruwase
* Fix zero3 tracing issues
* Remove debug prints
* Code clarity
-
Committed by Olatunji Ruwase
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Conglong Li
-
- 20 April, 2022 (4 commits)
-
-
Committed by Stas Bekman
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Manuel R. Ciosici
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Shuai Zheng
Co-authored-by: Shuai Zheng <shzheng@amazon.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Olatunji Ruwase
* bf16 updates
* Got bf16 working
* fp32 reduction; flattened tensors
* bf16+zero_stage_1 first cut
* finish zero_stage 1 sharding
* Matching fp16 with debugging codes
* Matching loss with fp16
* Fix gradient clipping
* bf16 gradient clipping fix bf16 checkpoint save/load
* Unscale grad norm
* Fix grad norm scaling
* Enable loading fp16_zero_1 into bf16_zero_1 engine and vice versa
* Fix clip_grad key error
* Reduce tied weight gradients
* Fix grad norm for moe
* Reduce specified gradients
* Use O(n) instead of O(n^2)
* Remove optimizer restriction for bf16
* Link bf16 & fp32 params
* Clip gradients of last stage tied weights
* Simplify tied weights reduction logic
* Also clip all tp rank parameters
* lp to hp mapping
* Link lp/hp/optim state; Refresh links after checkpoint load
* Remove debug print
* Remove debug print
* Simplify zero_grad logic
* fp32 accessors
* Fix update bug
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 16 April, 2022 (2 commits)
-
-
Committed by Jeff Rasley
-
Committed by dependabot[bot]
Bumps [nokogiri](https://github.com/sparklemotion/nokogiri) from 1.13.3 to 1.13.4.
- [Release notes](https://github.com/sparklemotion/nokogiri/releases)
- [Changelog](https://github.com/sparklemotion/nokogiri/blob/v1.13.4/CHANGELOG.md)
- [Commits](https://github.com/sparklemotion/nokogiri/compare/v1.13.3...v1.13.4)

---
updated-dependencies:
- dependency-name: nokogiri
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
- 09 April, 2022 (1 commit)
-
-
Committed by TongXU
-
- 06 April, 2022 (1 commit)
-
-
Committed by Stas Bekman
-
- 02 April, 2022 (1 commit)
-
-
Committed by Jeff Rasley
-
- 31 March, 2022 (4 commits)
-
-
Committed by Samyam Rajbhandari
Supporting multiple modules injection with a single policy when they have identical architectures (#1869)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Jeff Rasley
-
Committed by Jeff Rasley
-
Committed by Olatunji Ruwase
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 30 March, 2022 (2 commits)
-
-
Committed by matherit
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Blaine Rogers
Co-authored-by: Blaine Rogers <blaine.rogers@five.ai>
-
- 29 March, 2022 (1 commit)
-
-
Committed by Jeff Rasley
-
- 24 March, 2022 (2 commits)
-
-
Committed by Samyam Rajbhandari
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Karthikeyan Singaravelan
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 23 March, 2022 (3 commits)
-
-
Committed by Jeff Rasley
-
Committed by Sayed Hadi Hashemi
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Ammar Ahmad Awan
-
- 22 March, 2022 (1 commit)
-
-
Committed by Jeff Rasley
-
- 19 March, 2022 (1 commit)
-
-
Committed by Michael Wyatt
* added concurrency to github actions
* fixed problem where one workflow can cancel another workflow
* added tmp file to help test concurrency policy on CI
* removed tmp file to finish testing concurrency policy
-
- 18 March, 2022 (1 commit)
-
-
Committed by Olatunji Ruwase
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-