- 22 Aug, 2023 2 commits
-
-
Authored by Wang, Yi
see https://github.com/huggingface/transformers/tree/main/src/transformers/models/mpt
Co-authored-by: Molly Smith <112220543+molly-smith@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Authored by Logan Adams
-
- 21 Aug, 2023 2 commits
-
-
Authored by mzl
* skip all-gather
* add notes
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Authored by Heyang Qin
* zero++ tutorial PR (#3783)
* [Fix] _conv_flops_compute when padding is a str and stride=1 (#3169)
* fix conv_flops_compute when padding is a str when stride=1
* fix error
* change type of paddings to tuple
* fix padding calculation
* apply formatting check
---------
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
* fix interpolate flops compute (#3782)
* use `Flops Profiler` to test `model.generate()` (#2515)
* Update profiler.py
* pre-commit run --all-files
* Delete .DS_Store
* Delete .DS_Store
* Delete .DS_Store
---------
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Cheng Li <pistasable@gmail.com>
* revert PR #3611 (#3786)
* bump to 0.9.6
* ZeRO++ chinese blog (#3793)
* zeropp chinese blog
* try better quality images
* make title larger
* even larger...
* various fix
* center captions
* more fixes
* fix format
* remove staging trigger (#3792)
* DeepSpeed-Triton for Inference (#3748)
Co-authored-by: Stephen Youn <styoun@microsoft.com>
Co-authored-by: Arash Bakhtiari <arash@bakhtiari.org>
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Ethan Doe <yidoe@microsoft.com>
Co-authored-by: yidoe <68296935+yidoe@users.noreply.github.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* ZeRO++ (#3784)
Co-authored-by: HeyangQin <heyangqin@microsoft.com>
Co-authored-by: GuanhuaWang <alexwgh333@gmail.com>
Co-authored-by: cmikeh2 <connorholmes@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
* adding zero++ to navigation panel of deepspeed.ai (#3796)
* Add ZeRO++ Japanese blog (#3797)
* zeropp chinese blog
* try better quality images
* make title larger
* even larger...
* various fix
* center captions
* more fixes
* fix format
* add ZeRO++ Japanese blog
* add links
---------
Co-authored-by: HeyangQin <heyangqin@microsoft.com>
Co-authored-by: Conglong Li <conglong.li@gmail.com>
* Bug Fixes for autotuner and flops profiler (#1880)
* fix autotuner when backward is not called
* fix format
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
* Missing strided copy for gated MLP (#3788)
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
* Requires grad checking. (#3789)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* bump to 0.10.0
* Fix Bug in transform.cu (#3534)
* Bug fix
* Fixed formatting error
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
* bug fix: triton importing error (#3799)
Co-authored-by: Stephen Youn <styoun@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* init commit for mixed precision lora
* fix format
* patch _allgather_params & minor fixes
* make sure initial quantization is finished
* make sure dequantization is finished
* skip quantization for small parameters
* fix format
* remove unused async_op
* lazy load of quantizer kernels
* add mixed precision lora tutorial
* cleanup mics
* cleanup mics
* replace get_accelerator().current_device()
* add kwargs to mics
* fix format
* separate code and tutorial
* fix _all_gather in zero3
---------
Co-authored-by: Bill Luo <50068224+zhiruiluo@users.noreply.github.com>
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Guorun <84232793+CaffreyR@users.noreply.github.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: stephen youn <13525892+stephen-youn@users.noreply.github.com>
Co-authored-by: Stephen Youn <styoun@microsoft.com>
Co-authored-by: Arash Bakhtiari <arash@bakhtiari.org>
Co-authored-by: Ethan Doe <yidoe@microsoft.com>
Co-authored-by: yidoe <68296935+yidoe@users.noreply.github.com>
Co-authored-by: GuanhuaWang <alexwgh333@gmail.com>
Co-authored-by: cmikeh2 <connorholmes@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com>
Co-authored-by: Conglong Li <conglong.li@gmail.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Joe Mayer <114769929+jomayeri@users.noreply.github.com>
Co-authored-by: Ramya Ramineni <62723901+rraminen@users.noreply.github.com>
-
- 19 Aug, 2023 3 commits
-
-
Authored by Jeff Rasley
-
Authored by Michael Wyatt
-
Authored by Lev Kurilenko
* Add DSE branch input to nv-ds-chat
* Use provided DSE branch
* Echo DSE branch
-
- 17 Aug, 2023 4 commits
-
-
Authored by Ma, Guokai
* distinguish shm name with uid and addr_port
* fix formatting
-
Authored by Lev Kurilenko
* Add DS Chat CI workflow
* Add CRITIC_CKPT_DIR env variable to actions.yml
* Update step 2 opt 125m ckpt dir name
* Update test dir
* Add workflow_dispatch
* Add :
* Add nv-ds-chat badge to main README
* Open GH issue if DS Chat CI fails
* Remove pull_request and merge_group conditions
* Update and test torch version
* Remove PR trigger
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Authored by Michael Wyatt
-
Authored by Logan Adams
-
- 16 Aug, 2023 3 commits
-
-
Authored by Sam Foreman
-
Authored by Molly Smith
* Return nn.parameter type for weights and biases
* whitespace
* Fix bias tensor size
-
Authored by Logan Adams
* Update library installed checker to use check_cmd
* This code was used to check whether aio was installed; that check was refactored and this code was left behind
-
- 15 Aug, 2023 3 commits
-
-
Authored by Olatunji Ruwase
* Respect memory pinning config
* Bug fix
-
Authored by Olatunji Ruwase
* Fix unit test
* Fix unit test
-
Authored by Chris M
* Update engine.py
  This branch handles exceptions that may occur when changing file permissions with the os.chmod function within the DeepSpeed engine. The specific issue addressed is the PermissionError that may arise on certain filesystems or under restricted permissions.
* Change to use logger
* Split permissions out and add unit test
* UnitTest(use DistTestClass) + trailing whitespace
* update unit test
* UT parametrize 1, 2, 3
* trim white space from unit test
* change to PermissionError
* run pre-commit formats
* Catch FileNotFoundError & PermissionError
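The commit message above describes catching FileNotFoundError and PermissionError around os.chmod and reporting the failure through a logger. A minimal sketch of that pattern follows. It is not the code from engine.py: the helper name try_make_executable, the choice of the owner-execute bit, and the log wording are all assumptions made for illustration.

```python
import logging
import os
import stat

logger = logging.getLogger(__name__)


def try_make_executable(path: str) -> bool:
    """Hypothetical helper: add the owner-execute bit to `path`.

    Returns True on success. If the file is missing or the filesystem
    refuses the change, logs a warning and returns False instead of
    letting the exception propagate.
    """
    try:
        current_mode = os.stat(path).st_mode
        os.chmod(path, current_mode | stat.S_IXUSR)
        return True
    except (FileNotFoundError, PermissionError) as exc:
        # Both exceptions are treated as non-fatal in this sketch.
        logger.warning("Could not change permissions on %s: %s", path, exc)
        return False
```

Returning a boolean rather than re-raising lets the caller continue (for example, finish saving a checkpoint) when the permission change is only a convenience.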
-
- 11 Aug, 2023 1 commit
-
-
Authored by Logan Adams
* Fix torch19 tests
* test pip list and --no-build-isolation
* Enable verbosity
* pin to older accelerate version
* Update oldest tested torch to 1.10
* Properly rename directories
* Return PR tests to CI again.
* Remove -vv
-
- 10 Aug, 2023 3 commits
-
-
Authored by Logan Adams
* Update H100 workflow to open an issue if nightly CI fails
* Test running as not CI
* Add all nightly/switch envvar name
* Test with AMD
* Add way to get url, switch path of template
* Add additional checkout step
* Move actions checkout step
* Try absolute path with github workspace
* Create issue without template/path
* Re-enable and add debug logic
* add if failed()
* More debug
* Try without checkout action uses
* Rename file
* Update variables
* Update issue template
* Confirm removing permissions still work
* Revert "Confirm removing permissions still work"
  This reverts commit e7c2915a.
* Re-enable permissions
* Remove PR trigger for AMD MI200 tests
* Revert "Remove PR trigger for AMD MI200 tests"
  This reverts commit 5c5c5fd6.
* Test update_existing
* Switch to composite action
* Fix line ending encoding issue
* Switch failure to be a variable
* Test with second workflow
* Format fix
* Switch failure to always
* Switch back to previously working way
* Test permission changes
* Revert "Test permission changes"
  This reverts commit e051da75.
* Update existing bugs with newest build failure link
* Remove PR triggers that were used for testing.
-
Authored by Logan Adams
-
Authored by Joe Mayer
* removing bad check
* adding offload check for bf16 optimizer
* grad reduce for extra large param
* check grad_accum exists before converting
---------
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
- 09 Aug, 2023 6 commits
-
-
Authored by leiwen83
On machines with limited CPU RAM, loading a checkpoint at startup may cause an OOM, because all ranks on the same node load the optimizer state at the same time. For this scenario, checkpoint loading can optionally be done in a pipelined way.
Signed-off-by: Lei Wen <wenlei03@qiyi.com>
Co-authored-by: Lei Wen <wenlei03@qiyi.com>
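The idea in the message above, ranks on one node taking turns instead of loading simultaneously, can be sketched as a schedule of loading "waves". This is only an illustration under stated assumptions, not the implementation in this commit: the function names loading_waves and wave_index and the max_concurrent parameter are hypothetical.

```python
from typing import Iterable, List


def loading_waves(local_ranks: Iterable[int], max_concurrent: int = 1) -> List[List[int]]:
    """Hypothetical helper: split a node's ranks into consecutive waves.

    Only the ranks in one wave load their optimizer state at a time, so
    peak host memory is bounded by `max_concurrent` loads, not by the
    number of ranks on the node.
    """
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    ranks = sorted(local_ranks)
    return [ranks[i:i + max_concurrent] for i in range(0, len(ranks), max_concurrent)]


def wave_index(local_rank: int, local_ranks: Iterable[int], max_concurrent: int = 1) -> int:
    """Return the index of the wave in which `local_rank` is allowed to load."""
    for index, wave in enumerate(loading_waves(local_ranks, max_concurrent)):
        if local_rank in wave:
            return index
    raise ValueError(f"rank {local_rank} is not one of the local ranks")
```

In a real job each rank would step through the waves in order, load only during its own wave, and synchronize with the other local ranks (for example with a barrier) before the next wave starts.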
-
Authored by Conglong Li
* add deepspeed chat arxiv report
* add zeroquant v2 and fp
* add selective enhancement
* add ignore for 'Youn' in spell checker
---------
Co-authored-by: yaozhewei <zheweiy@berkeley.edu>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
Authored by Connor Holmes
* Pass correct node size
* formatting
---------
Co-authored-by: Connor Holmes <development@cmikeh2.me>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
Authored by Olatunji Ruwase
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
* base_dir may not always be present, which results in an incorrect path
* Update replace_module.py
* Update config.py
---------
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
Authored by Michael Wyatt
-
- 08 Aug, 2023 2 commits
-
-
Authored by Earlee
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Authored by Earlee
-
- 05 Aug, 2023 2 commits
-
-
Authored by mzl
* update ut/doc for glm/codegen
* formatting/spacing on docs
* re-order/alphabetize the models
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Logan Adams <loadams@microsoft.com>
-
Authored by digger yu
-
- 04 Aug, 2023 2 commits
-
-
Authored by marcobellagente93
* update partition_uniform util function
* formatting
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Authored by Lev Kurilenko
* Initial commit
* Clean up
* Fix formatting
-
- 03 Aug, 2023 1 commit
-
-
Authored by Michael Wyatt
-
- 01 Aug, 2023 3 commits
-
-
Authored by Molly Smith
* Refactor autoTP inference for HE
* Formatting
* Move redundant functions to autotp
* Remove self from loading class
* formatting
* Some gpt2 autotp path fixes
* precommit
-
Authored by Hugh Pu
-
Authored by Xie Zejian
* add reproducible compilation environment
* fix ci
* fix typo for formatting check
* Fix casing for format
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Michael Wyatt <mrwyattii@gmail.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Logan Adams <loadams@microsoft.com>
-
- 29 Jul, 2023 1 commit
-
-
Authored by Zhen Zhang
Co-authored-by: Zhen Zhang <zhzhn@amazon.com>
-
- 28 Jul, 2023 2 commits
-
-
Authored by Ma, Guokai
* Fix deadlock when allreduce spins too fast
* Change state to enum to increase readability
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Authored by Olatunji Ruwase
* Option to override module apply
* Removing early partitioning in override
* Unit tests
* Cleanup
* Adapt unit test to succeed
* Handle missed params
* Add accelerate
* Code cleanup
* Add doc
* Add doc
* Add doc
-