- 29 Aug, 2023 1 commit
-
-
Committed by Hugh Pu
feat(activation_checkpointing): add `non_reentrant_checkpoint` to support inputs require no grad (#4118) * feat: add `non_reentrant_checkpoint` * feat: add missing output postprocess and change the hook to record leaf forward tensor refs * fix: make the multi_grad_hook registered after graph construction * fix: backward compatibility for multi_tensor_hook * fix: nonlocal reference error of deepspeed_saved_tensors * fix: reduce repeating hook registration * test: add test for `activation_checkpointing.checkpointing.non_reentrant_checkpoint` * Pass correct node size for ZeRO++ (#4085) * Pass correct node size * formatting --------- Co-authored-by: Connor Holmes <development@cmikeh2.me> Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com> * add deepspeed chat arxiv report (#4110) * add deepspeed chat arxiv report * add zeroquant v2 and fp * add selective enhencement * add ignore for 'Youn' in spell checker --------- Co-authored-by: yaozhewei <zheweiy@berkeley.edu> Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com> * style: change flake8 detected style missmatch * test: hack to clone the `test_activation_checkpointing` module for reuse and add regression tests * doc: explain the introduction of `non_reentrant_checkpoint` * doc: explain the test of `non_reentrant_checkpoint` --------- Co-authored-by: Connor Holmes <connorholmes@microsoft.com> Co-authored-by: Connor Holmes <development@cmikeh2.me> Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com> Co-authored-by: Conglong Li <conglong.li@gmail.com> Co-authored-by: yaozhewei <zheweiy@berkeley.edu> Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 26 Aug, 2023 1 commit
-
-
Committed by hamlet
Co-authored-by: Jeff Rasley <jerasley@microsoft.com> Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
-
- 25 Aug, 2023 8 commits
-
-
Committed by Björn Plüster
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Dino Chen
Co-authored-by: Molly Smith <112220543+molly-smith@users.noreply.github.com>
-
Committed by Joe Mayer
* name changes * formatting changes
-
Committed by Michael Wyatt
* added paths for mup optimizers * added tests * formatting * Add license, fix missing distributed test, formatting * Add mpi4py to confirm tests work * Undo requirements change * Move to runtime folder * Rework to match new format * missing comma * hidden dim fix --------- Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Co-authored-by: Logan Adams <loadams@microsoft.com>
-
Committed by Conglong Li
-
Committed by Heyang Qin
* Chinese translation with Conglong's feedback * fix format --------- Co-authored-by: Conglong Li <conglong.li@gmail.com>
-
Committed by Masahiro Tanaka
* add Japanese blog of DS-Ulysses * fix fig --------- Co-authored-by: Conglong Li <conglong.li@gmail.com>
-
Committed by Sam Ade Jacobs
-
- 24 Aug, 2023 8 commits
-
-
Committed by Sam Ade Jacobs
* Update README.md * Update README.md * Format fix --------- Co-authored-by: Logan Adams <loadams@microsoft.com>
-
Committed by Sam Ade Jacobs
* fix identation * fix formatting --------- Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Jeff Rasley
-
Committed by Sam Ade Jacobs
-
Committed by Sam Ade Jacobs
-
Committed by Sam Ade Jacobs
Co-authored-by: Masahiro Tanaka <mtanaka@microsoft.com>
-
Committed by Olatunji Ruwase
* Load z3 checkpoints for inference * PR feedback * Fix API bugs * Fix typo
-
Committed by Minjia Zhang
* add tutorial file from Minjia. * fix format. --------- Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
-
- 23 Aug, 2023 4 commits
-
-
Committed by Xuehai Pan
* Fix ZeRO parameter initialization for tensors with `requires_grad=True` * Simplify detach logic --------- Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Logan Adams
* Add unittest to check huggingface low_cpu_mem_usageflag * change lag to true * Formatting has changes * Indentation fix * Fix chanves * final format fix * Accidently dropped pytestmark from other test * Remove invalid model test config as that was removed. * Whitespace and PR feedback * Format and PR feedback means that we can remove the import we added. * Update tests/unit/inference/test_inference.py * Update tests/unit/inference/test_inference.py --------- Co-authored-by: Lok Chand Koppaka <lokoppak@microsoft.com> Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
Committed by Kuan-Ying Lai
-
Committed by Michael Wyatt
* Disable nv-nightly workflow since it doesn't work * Run on PRs to debug * fix for nv-nightly * fix * OOM fix? * Update nv-nightly.yml --------- Co-authored-by: Logan Adams <loadams@microsoft.com>
-
- 22 Aug, 2023 2 commits
-
-
Committed by Wang, Yi
see https://github.com/huggingface/transformers/tree/main/src/transformers/models/mpt Co-authored-by: Molly Smith <112220543+molly-smith@users.noreply.github.com> Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Logan Adams
-
- 21 Aug, 2023 2 commits
-
-
Committed by mzl
* skip all-gather * add notes --------- Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Heyang Qin
* zero++ tutorial PR (#3783) * [Fix] _conv_flops_compute when padding is a str and stride=1 (#3169) * fix conv_flops_compute when padding is a str when stride=1 * fix error * change type of paddings to tuple * fix padding calculation * apply formatting check --------- Co-authored-by: Cheng Li <pistasable@gmail.com> Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> * fix interpolate flops compute (#3782) * use `Flops Profiler` to test `model.generate()` (#2515) * Update profiler.py * pre-commit run --all-files * Delete .DS_Store * Delete .DS_Store * Delete .DS_Store --------- Co-authored-by: Jeff Rasley <jerasley@microsoft.com> Co-authored-by: Cheng Li <pistasable@gmail.com> * revert PR #3611 (#3786) * bump to 0.9.6 * ZeRO++ chinese blog (#3793) * zeropp chinese blog * try better quality images * make title larger * even larger... * various fix * center captions * more fixes * fix format * remove staging trigger (#3792) * DeepSpeed-Triton for Inference (#3748) Co-authored-by: Stephen Youn <styoun@microsoft.com> Co-authored-by: Arash Bakhtiari <arash@bakhtiari.org> Co-authored-by: Cheng Li <pistasable@gmail.com> Co-authored-by: Ethan Doe <yidoe@microsoft.com> Co-authored-by: yidoe <68296935+yidoe@users.noreply.github.com> Co-authored-by: Jeff Rasley <jerasley@microsoft.com> * ZeRO++ (#3784) Co-authored-by: HeyangQin <heyangqin@microsoft.com> Co-authored-by: GuanhuaWang <alexwgh333@gmail.com> Co-authored-by: cmikeh2 <connorholmes@microsoft.com> Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com> Co-authored-by: Jeff Rasley <jerasley@microsoft.com> Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com> Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> Co-authored-by: Reza Yazdani <reyazda@microsoft.com> * adding zero++ to navigation panel of deepspeed.ai (#3796) * Add ZeRO++ Japanese blog (#3797) * zeropp chinese blog * try better quality images * make title larger * even larger... * various fix * center captions * more fixes * fix format * add ZeRO++ Japanese blog * add links --------- Co-authored-by: HeyangQin <heyangqin@microsoft.com> Co-authored-by: Conglong Li <conglong.li@gmail.com> * Bug Fixes for autotuner and flops profiler (#1880) * fix autotuner when backward is not called * fix format --------- Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> * Missing strided copy for gated MLP (#3788) Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com> Co-authored-by: Jeff Rasley <jerasley@microsoft.com> Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> * Requires grad checking. (#3789) Co-authored-by: Jeff Rasley <jerasley@microsoft.com> * bump to 0.10.0 * Fix Bug in transform.cu (#3534) * Bug fix * Fixed formatting error --------- Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> * bug fix: triton importing error (#3799) Co-authored-by: Stephen Youn <styoun@microsoft.com> Co-authored-by: Jeff Rasley <jerasley@microsoft.com> * init commit for mixed precision lora * fix format * patch _allgather_params & minor fixes * make sure initial quantization are finished * make sure dequantization is finished * skip quantization for small parameters * fix format * remove unused async_op * lazy load of quantizer kernels * add mixed precision lora tutorial * cleanup mics * cleanup mics * replace get_accelerator().current_device() * add kwargs to mics * fix format * seperate code and tutorial * fix _all_gather in zero3 --------- Co-authored-by: Bill Luo <50068224+zhiruiluo@users.noreply.github.com> Co-authored-by: Cheng Li <pistasable@gmail.com> Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> Co-authored-by: Guorun <84232793+CaffreyR@users.noreply.github.com> Co-authored-by: Jeff Rasley <jerasley@microsoft.com> Co-authored-by: stephen youn <13525892+stephen-youn@users.noreply.github.com> Co-authored-by: Stephen Youn <styoun@microsoft.com> Co-authored-by: Arash Bakhtiari <arash@bakhtiari.org> Co-authored-by: Ethan Doe <yidoe@microsoft.com> Co-authored-by: yidoe <68296935+yidoe@users.noreply.github.com> Co-authored-by: GuanhuaWang <alexwgh333@gmail.com> Co-authored-by: cmikeh2 <connorholmes@microsoft.com> Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com> Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com> Co-authored-by: Reza Yazdani <reyazda@microsoft.com> Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com> Co-authored-by: Conglong Li <conglong.li@gmail.com> Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Co-authored-by: Joe Mayer <114769929+jomayeri@users.noreply.github.com> Co-authored-by: Ramya Ramineni <62723901+rraminen@users.noreply.github.com>
-
- 19 Aug, 2023 3 commits
-
-
Committed by Jeff Rasley
-
Committed by Michael Wyatt
-
Committed by Lev Kurilenko
* Add DSE branch input to nv-ds-chat * Use provided DSE branch * Echo DSE branch
-
- 17 Aug, 2023 4 commits
-
-
Committed by Ma, Guokai
* distinguish shm name with uid and addr_port * fix formatting
-
Committed by Lev Kurilenko
* Add DS Chat CI workflow * Add CRITIC_CKPT_DIR env variable to actions.yml * Update step 2 opt 125m ckpt dir name * Update test dir * Add workflow_dispatch * Add : * Add nv-ds-chat badge to main README * Open GH issue if DS Chat CI fails * Remove pull_request and merge_group conditions * Update and test torch version * Remove PR trigger --------- Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Michael Wyatt
-
Committed by Logan Adams
-
- 16 Aug, 2023 3 commits
-
-
Committed by Sam Foreman
-
Committed by Molly Smith
* Return nn.parameter type for weights and biases * whitespace * Fix bias tensor size
-
Committed by Logan Adams
* Update library installed checker to use check_cmd * This code was used for checking if aio was installed but this was refactored and this code was left
-
- 15 Aug, 2023 3 commits
-
-
Committed by Olatunji Ruwase
* Respect memory pinning config * Bug fix
-
Committed by Olatunji Ruwase
* Fix unit test * Fix unit test
-
Committed by Chris M
* Update engine.py This branch includes changes to handle potential exceptions that may occur when attempting to change file permissions using the os.chmod function within the DeepSpeed engine. The specific issue addressed is the PermissionError that may arise when working with certain filesystems or under restricted permissions. * Change to use logger * Split permissions out and add unit test * UnitTest(use DistTestClass) + trailing whitespace * update unit test * UT parametrize 1, 2 ,3 * trim white space from unit test * change to PermissionError * run pre-commit formats * Catch FileNotFoundError & PermissionError
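The pattern this commit message describes, catching `FileNotFoundError` and `PermissionError` around `os.chmod` and logging instead of failing, can be sketched roughly as follows. This is an illustrative sketch only, not the code from the commit: the helper name `try_make_executable` and the module-level `logger` are hypothetical, and the choice of the executable bit is an assumption.

```python
import logging
import os
import stat

# Hypothetical module-level logger; the commit only says it "uses logger".
logger = logging.getLogger(__name__)


def try_make_executable(path: str) -> bool:
    """Hypothetical helper: add the owner-executable bit to `path`.

    Returns True on success. If the file is missing or the filesystem
    refuses the permission change, logs a warning and returns False
    instead of raising.
    """
    try:
        current_mode = os.stat(path).st_mode
        os.chmod(path, current_mode | stat.S_IEXEC)
        return True
    except (FileNotFoundError, PermissionError) as exc:
        logger.warning("Could not change permissions on %s: %s", path, exc)
        return False
```

Returning a boolean rather than re-raising lets the caller carry on (for example, continue saving a checkpoint) when only the permission change fails.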
-
- 11 Aug, 2023 1 commit
-
-
Committed by Logan Adams
* Fix torch19 tests * test pip list and --no-build-isolation * Enable verbosity * pin to older accelerate version * Update oldest tested torch to 1.10 * Properly rename directories * Return PR tests to CI again. * Remove -vv
-