- 06 Sep, 2023: 4 commits
-
-
Committed by Ammar Ahmad Awan
-
Committed by Alexander Jipa
Co-authored-by: Alexander Jipa <azzhipa@amazon.com>
-
Committed by Dom
* use dict update
* use dictionary update
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Logan Adams
* Add check that opening issues on CI failure requires build to be scheduled
* Update ()
-
- 04 Sep, 2023: 1 commit
-
-
Committed by Satpal Singh Rathore
* check injection policy
* transformers v4
* move check_inference_tuple
* user injection policy check in infer engine
* fix pre-commit format
* fix formatting
* fix clang format
---------
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Lev Kurilenko <113481193+lekurile@users.noreply.github.com>
-
- 02 Sep, 2023: 2 commits
-
-
Committed by Olatunji Ruwase
* Modify zero parameters
* Docs
* py3.6 compatibility
* Update docs
* Update deepspeed/runtime/zero/stage3.py
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
* Add TODO
* Formatting
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
Committed by Lev Kurilenko
* Pin Triton version to 2.0.0
* Pin Triton version to < 2.1.0
* Add >=2.0.0
* pin transformers version
---------
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
- 01 Sep, 2023: 4 commits
-
-
Committed by Ammar Ahmad Awan
-
Committed by Ammar Ahmad Awan
Co-authored-by: HeyangQin <heyangqin@microsoft.com>
Co-authored-by: Lev Kurilenko <lekurile@microsoft.com>
Co-authored-by: Molly Smith <mosm@microsoft.com>
-
Committed by Heyang Qin
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
-
Committed by Jeff Rasley
-
- 31 Aug, 2023: 4 commits
-
-
Committed by Heyang Qin
* enable hpz when running with torch.no_grad
* change the way to detect no_grad
* fix format
---------
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
-
Committed by Dino Chen
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
-
Committed by Logan Adams
-
Committed by Maxime
* fix: linker issues in conda environments #3929
* ignore: re-ordering
* Update builder.py
-
- 30 Aug, 2023: 2 commits
-
-
Committed by Joe Mayer
* Size for transformer engine.
* adding kwargs
* args tuple
* format updates
-
Committed by Hiromasa
* added port argument for ssh
* changed arg name; moved PDSH arg; added if-else for arg
* fixed missing key error
* updated test code to correspond to the change in multinode_re_runner.py
* changed default ssh port to None
* Update deepspeed/launcher/runner.py
* formatting
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Michael Wyatt <mrwyattii@gmail.com>
-
- 29 Aug, 2023: 3 commits
-
-
Committed by Molly Smith
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by CurryRice233
Co-authored-by: jializheng <jializheng@huawei.com>
-
Committed by Hugh Pu
feat(activation_checkpointing): add `non_reentrant_checkpoint` to support inputs require no grad (#4118)
* feat: add `non_reentrant_checkpoint`
* feat: add missing output postprocess and change the hook to record leaf forward tensor refs
* fix: make the multi_grad_hook registered after graph construction
* fix: backward compatibility for multi_tensor_hook
* fix: nonlocal reference error of deepspeed_saved_tensors
* fix: reduce repeating hook registration
* test: add test for `activation_checkpointing.checkpointing.non_reentrant_checkpoint`
* Pass correct node size for ZeRO++ (#4085)
* Pass correct node size
* formatting
---------
Co-authored-by: Connor Holmes <development@cmikeh2.me>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
* add deepspeed chat arxiv report (#4110)
* add deepspeed chat arxiv report
* add zeroquant v2 and fp
* add selective enhencement
* add ignore for 'Youn' in spell checker
---------
Co-authored-by: yaozhewei <zheweiy@berkeley.edu>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
* style: change flake8 detected style missmatch
* test: hack to clone the `test_activation_checkpointing` module for reuse and add regression tests
* doc: explain the introduction of `non_reentrant_checkpoint`
* doc: explain the test of `non_reentrant_checkpoint`
---------
Co-authored-by: Connor Holmes <connorholmes@microsoft.com>
Co-authored-by: Connor Holmes <development@cmikeh2.me>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Conglong Li <conglong.li@gmail.com>
Co-authored-by: yaozhewei <zheweiy@berkeley.edu>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 26 Aug, 2023: 1 commit
-
-
Committed by hamlet
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
-
- 25 Aug, 2023: 8 commits
-
-
Committed by Björn Plüster
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Dino Chen
Co-authored-by: Molly Smith <112220543+molly-smith@users.noreply.github.com>
-
Committed by Joe Mayer
* name changes
* formatting changes
-
Committed by Michael Wyatt
* added paths for mup optimizers
* added tests
* formatting
* Add license, fix missing distributed test, formatting
* Add mpi4py to confirm tests work
* Undo requirements change
* Move to runtime folder
* Rework to match new format
* missing comma
* hidden dim fix
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Logan Adams <loadams@microsoft.com>
-
Committed by Conglong Li
-
Committed by Heyang Qin
* Chinese translation with Conglong's feedback
* fix format
---------
Co-authored-by: Conglong Li <conglong.li@gmail.com>
-
Committed by Masahiro Tanaka
* add Japanese blog of DS-Ulysses
* fix fig
---------
Co-authored-by: Conglong Li <conglong.li@gmail.com>
-
Committed by Sam Ade Jacobs
-
- 24 Aug, 2023: 8 commits
-
-
Committed by Sam Ade Jacobs
* Update README.md
* Update README.md
* Format fix
---------
Co-authored-by: Logan Adams <loadams@microsoft.com>
-
Committed by Sam Ade Jacobs
* fix identation
* fix formatting
---------
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Jeff Rasley
-
Committed by Sam Ade Jacobs
-
Committed by Sam Ade Jacobs
-
Committed by Sam Ade Jacobs
Co-authored-by: Masahiro Tanaka <mtanaka@microsoft.com>
-
Committed by Olatunji Ruwase
* Load z3 checkpoints for inference
* PR feedback
* Fix API bugs
* Fix typo
-
Committed by Minjia Zhang
* add tutorial file from Minjia.
* fix format.
---------
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
-
- 23 Aug, 2023: 3 commits
-
-
Committed by Xuehai Pan
* Fix ZeRO parameter initialization for tensors with `requires_grad=True`
* Simplify detach logic
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Logan Adams
* Add unittest to check huggingface low_cpu_mem_usageflag
* change lag to true
* Formatting has changes
* Indentation fix
* Fix chanves
* final format fix
* Accidently dropped pytestmark from other test
* Remove invalid model test config as that was removed.
* Whitespace and PR feedback
* Format and PR feedback means that we can remove the import we added.
* Update tests/unit/inference/test_inference.py
* Update tests/unit/inference/test_inference.py
---------
Co-authored-by: Lok Chand Koppaka <lokoppak@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
Committed by Kuan-Ying Lai
-