- 30 Aug, 2023 (1 commit)
-
-
Committed by Hiromasa
* added port argument for ssh
* changed arg name; moved PDSH arg; added if-else for arg
* fixed missing key error
* updated test code to correspond to the change in multinode_re_runner.py
* changed default ssh port to None
* Update deepspeed/launcher/runner.py
* formatting
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Michael Wyatt <mrwyattii@gmail.com>
-
- 29 Aug, 2023 (3 commits)
-
-
Committed by Molly Smith
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by CurryRice233
Co-authored-by: jializheng <jializheng@huawei.com>
-
Committed by Hugh Pu
feat(activation_checkpointing): add `non_reentrant_checkpoint` to support inputs require no grad (#4118)
* feat: add `non_reentrant_checkpoint`
* feat: add missing output postprocess and change the hook to record leaf forward tensor refs
* fix: make the multi_grad_hook registered after graph construction
* fix: backward compatibility for multi_tensor_hook
* fix: nonlocal reference error of deepspeed_saved_tensors
* fix: reduce repeating hook registration
* test: add test for `activation_checkpointing.checkpointing.non_reentrant_checkpoint`
* Pass correct node size for ZeRO++ (#4085)
* Pass correct node size
* formatting
---------
Co-authored-by: Connor Holmes <development@cmikeh2.me>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
* add deepspeed chat arxiv report (#4110)
* add deepspeed chat arxiv report
* add zeroquant v2 and fp
* add selective enhencement
* add ignore for 'Youn' in spell checker
---------
Co-authored-by: yaozhewei <zheweiy@berkeley.edu>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
* style: change flake8 detected style missmatch
* test: hack to clone the `test_activation_checkpointing` module for reuse and add regression tests
* doc: explain the introduction of `non_reentrant_checkpoint`
* doc: explain the test of `non_reentrant_checkpoint`
---------
Co-authored-by: Connor Holmes <connorholmes@microsoft.com>
Co-authored-by: Connor Holmes <development@cmikeh2.me>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Conglong Li <conglong.li@gmail.com>
Co-authored-by: yaozhewei <zheweiy@berkeley.edu>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 26 Aug, 2023 (1 commit)
-
-
Committed by hamlet
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
-
- 25 Aug, 2023 (8 commits)
-
-
Committed by Björn Plüster
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Dino Chen
Co-authored-by: Molly Smith <112220543+molly-smith@users.noreply.github.com>
-
Committed by Joe Mayer
* name changes
* formatting changes
-
Committed by Michael Wyatt
* added paths for mup optimizers
* added tests
* formatting
* Add license, fix missing distributed test, formatting
* Add mpi4py to confirm tests work
* Undo requirements change
* Move to runtime folder
* Rework to match new format
* missing comma
* hidden dim fix
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Logan Adams <loadams@microsoft.com>
-
Committed by Conglong Li
-
Committed by Heyang Qin
* Chinese translation with Conglong's feedback
* fix format
---------
Co-authored-by: Conglong Li <conglong.li@gmail.com>
-
Committed by Masahiro Tanaka
* add Japanese blog of DS-Ulysses
* fix fig
---------
Co-authored-by: Conglong Li <conglong.li@gmail.com>
-
Committed by Sam Ade Jacobs
-
- 24 Aug, 2023 (8 commits)
-
-
Committed by Sam Ade Jacobs
* Update README.md
* Update README.md
* Format fix
---------
Co-authored-by: Logan Adams <loadams@microsoft.com>
-
Committed by Sam Ade Jacobs
* fix identation
* fix formatting
---------
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Jeff Rasley
-
Committed by Sam Ade Jacobs
-
Committed by Sam Ade Jacobs
-
Committed by Sam Ade Jacobs
Co-authored-by: Masahiro Tanaka <mtanaka@microsoft.com>
-
Committed by Olatunji Ruwase
* Load z3 checkpoints for inference
* PR feedback
* Fix API bugs
* Fix typo
-
Committed by Minjia Zhang
* add tutorial file from Minjia.
* fix format.
---------
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
-
- 23 Aug, 2023 (4 commits)
-
-
Committed by Xuehai Pan
* Fix ZeRO parameter initialization for tensors with `requires_grad=True`
* Simplify detach logic
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Logan Adams
* Add unittest to check huggingface low_cpu_mem_usageflag
* change lag to true
* Formatting has changes
* Indentation fix
* Fix chanves
* final format fix
* Accidently dropped pytestmark from other test
* Remove invalid model test config as that was removed.
* Whitespace and PR feedback
* Format and PR feedback means that we can remove the import we added.
* Update tests/unit/inference/test_inference.py
* Update tests/unit/inference/test_inference.py
---------
Co-authored-by: Lok Chand Koppaka <lokoppak@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
Committed by Kuan-Ying Lai
-
Committed by Michael Wyatt
* Disable nv-nightly workflow since it doesn't work
* Run on PRs to debug
* fix for nv-nightly
* fix
* OOM fix?
* Update nv-nightly.yml
---------
Co-authored-by: Logan Adams <loadams@microsoft.com>
-
- 22 Aug, 2023 (2 commits)
-
-
Committed by Wang, Yi
see https://github.com/huggingface/transformers/tree/main/src/transformers/models/mpt
Co-authored-by: Molly Smith <112220543+molly-smith@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Logan Adams
-
- 21 Aug, 2023 (2 commits)
-
-
Committed by mzl
* skip all-gather
* add notes
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Heyang Qin
* zero++ tutorial PR (#3783)
* [Fix] _conv_flops_compute when padding is a str and stride=1 (#3169)
* fix conv_flops_compute when padding is a str when stride=1
* fix error
* change type of paddings to tuple
* fix padding calculation
* apply formatting check
---------
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
* fix interpolate flops compute (#3782)
* use `Flops Profiler` to test `model.generate()` (#2515)
* Update profiler.py
* pre-commit run --all-files
* Delete .DS_Store
* Delete .DS_Store
* Delete .DS_Store
---------
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Cheng Li <pistasable@gmail.com>
* revert PR #3611 (#3786)
* bump to 0.9.6
* ZeRO++ chinese blog (#3793)
* zeropp chinese blog
* try better quality images
* make title larger
* even larger...
* various fix
* center captions
* more fixes
* fix format
* remove staging trigger (#3792)
* DeepSpeed-Triton for Inference (#3748)
Co-authored-by: Stephen Youn <styoun@microsoft.com>
Co-authored-by: Arash Bakhtiari <arash@bakhtiari.org>
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Ethan Doe <yidoe@microsoft.com>
Co-authored-by: yidoe <68296935+yidoe@users.noreply.github.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* ZeRO++ (#3784)
Co-authored-by: HeyangQin <heyangqin@microsoft.com>
Co-authored-by: GuanhuaWang <alexwgh333@gmail.com>
Co-authored-by: cmikeh2 <connorholmes@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
* adding zero++ to navigation panel of deepspeed.ai (#3796)
* Add ZeRO++ Japanese blog (#3797)
* zeropp chinese blog
* try better quality images
* make title larger
* even larger...
* various fix
* center captions
* more fixes
* fix format
* add ZeRO++ Japanese blog
* add links
---------
Co-authored-by: HeyangQin <heyangqin@microsoft.com>
Co-authored-by: Conglong Li <conglong.li@gmail.com>
* Bug Fixes for autotuner and flops profiler (#1880)
* fix autotuner when backward is not called
* fix format
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
* Missing strided copy for gated MLP (#3788)
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
* Requires grad checking. (#3789)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* bump to 0.10.0
* Fix Bug in transform.cu (#3534)
* Bug fix
* Fixed formatting error
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
* bug fix: triton importing error (#3799)
Co-authored-by: Stephen Youn <styoun@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* init commit for mixed precision lora
* fix format
* patch _allgather_params & minor fixes
* make sure initial quantization are finished
* make sure dequantization is finished
* skip quantization for small parameters
* fix format
* remove unused async_op
* lazy load of quantizer kernels
* add mixed precision lora tutorial
* cleanup mics
* cleanup mics
* replace get_accelerator().current_device()
* add kwargs to mics
* fix format
* seperate code and tutorial
* fix _all_gather in zero3
---------
Co-authored-by: Bill Luo <50068224+zhiruiluo@users.noreply.github.com>
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Guorun <84232793+CaffreyR@users.noreply.github.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: stephen youn <13525892+stephen-youn@users.noreply.github.com>
Co-authored-by: Stephen Youn <styoun@microsoft.com>
Co-authored-by: Arash Bakhtiari <arash@bakhtiari.org>
Co-authored-by: Ethan Doe <yidoe@microsoft.com>
Co-authored-by: yidoe <68296935+yidoe@users.noreply.github.com>
Co-authored-by: GuanhuaWang <alexwgh333@gmail.com>
Co-authored-by: cmikeh2 <connorholmes@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com>
Co-authored-by: Conglong Li <conglong.li@gmail.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Joe Mayer <114769929+jomayeri@users.noreply.github.com>
Co-authored-by: Ramya Ramineni <62723901+rraminen@users.noreply.github.com>
-
- 19 Aug, 2023 (3 commits)
-
-
Committed by Jeff Rasley
-
Committed by Michael Wyatt
-
Committed by Lev Kurilenko
* Add DSE branch input to nv-ds-chat
* Use provided DSE branch
* Echo DSE branch
-
- 17 Aug, 2023 (4 commits)
-
-
Committed by Ma, Guokai
* distinguish shm name with uid and addr_port
* fix formatting
-
Committed by Lev Kurilenko
* Add DS Chat CI workflow
* Add CRITIC_CKPT_DIR env variable to actions.yml
* Update step 2 opt 125m ckpt dir name
* Update test dir
* Add workflow_dispatch
* Add :
* Add nv-ds-chat badge to main README
* Open GH issue if DS Chat CI fails
* Remove pull_request and merge_group conditions
* Update and test torch version
* Remove PR trigger
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Michael Wyatt
-
Committed by Logan Adams
-
- 16 Aug, 2023 (3 commits)
-
-
Committed by Sam Foreman
-
Committed by Molly Smith
* Return nn.parameter type for weights and biases
* whitespace
* Fix bias tensor size
-
Committed by Logan Adams
* Update library installed checker to use check_cmd
* This code was used for checking if aio was installed but this was refactored and this code was left
-
- 15 Aug, 2023 (1 commit)
-
-
Committed by Olatunji Ruwase
* Respect memory pinning config
* Bug fix
-