- 07 Jul, 2023 1 commit
-
-
Committed by Michael Wyatt
* update workflow triggers for merge queue
* add branch specifier to trigger
-
- 06 Jul, 2023 10 commits
-
-
Committed by Pinstripe Potoroo
Co-authored-by: Guillaume Sautiere <gsautie@qti.qualcomm.com>
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
Committed by digger yu
* fix error: Dictionary expression not allowed in type annotation Pylance
* formatting
* formatting
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Michael Wyatt <mrwyattii@gmail.com>
-
Committed by Lev Kurilenko
This PR fixes Meta Tensor checkpoint loading for BLOOM models where the SD keys start with `transformer.`.
-
Committed by Yejing-Lai
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Molly Smith <112220543+molly-smith@users.noreply.github.com>
-
Committed by Heyang Qin
* create standalone registries for training and eval respectively
---------
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
-
Committed by Ammar Ahmad Awan
* extend the test and fix fp16 typo.
* guard reset params with z3 enabled check.
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Logan Adams
-
Committed by Reza Yazdani
* Add FALCON auto-tp support
* added (skipped) unit test, refactored code to be more readable
---------
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
Committed by Mashiro
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Michael Wyatt
-
- 05 Jul, 2023 1 commit
-
-
Committed by Xingjian Shi
* fix lora fuse unfuse in hybrid_engine
* fix name
* fix typo
* remove empty lines
* Update gptj.py
* add lora test-case + fix gptneo implementation
* try to fix format
* try to accelerate testcase by reducing max length
* reduce test runtime
* Fix bloom / gpt-neox and add test for bloom
* fix CI + fix issue in engine
---------
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 04 Jul, 2023 1 commit
-
-
Committed by Pinstripe Potoroo
Co-authored-by: Pinstripe Potoroo <pinstripe-potoroo@users.github.com>
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 03 Jul, 2023 2 commits
-
-
Committed by Jeff Rasley
* zero++ tutorial PR (#3783)
* [Fix] _conv_flops_compute when padding is a str and stride=1 (#3169)
* fix conv_flops_compute when padding is a str when stride=1
* fix error
* change type of paddings to tuple
* fix padding calculation
* apply formatting check
---------
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
* fix interpolate flops compute (#3782)
* use `Flops Profiler` to test `model.generate()` (#2515)
* Update profiler.py
* pre-commit run --all-files
* Delete .DS_Store
* Delete .DS_Store
* Delete .DS_Store
---------
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Cheng Li <pistasable@gmail.com>
* revert PR #3166, it disabled grad clip for bf16
* ensure no loss scaling for non-fp16 dtypes
* revert PR #3611 (#3786)
* bump to 0.9.6
* ZeRO++ chinese blog (#3793)
* zeropp chinese blog
* try better quality images
* make title larger
* even larger...
* various fix
* center captions
* more fixes
* fix format
* remove staging trigger (#3792)
* DeepSpeed-Triton for Inference (#3748)
Co-authored-by: Stephen Youn <styoun@microsoft.com>
Co-authored-by: Arash Bakhtiari <arash@bakhtiari.org>
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Ethan Doe <yidoe@microsoft.com>
Co-authored-by: yidoe <68296935+yidoe@users.noreply.github.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* ZeRO++ (#3784)
Co-authored-by: HeyangQin <heyangqin@microsoft.com>
Co-authored-by: GuanhuaWang <alexwgh333@gmail.com>
Co-authored-by: cmikeh2 <connorholmes@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
* adding zero++ to navigation panel of deepspeed.ai (#3796)
* Add ZeRO++ Japanese blog (#3797)
* zeropp chinese blog
* try better quality images
* make title larger
* even larger...
* various fix
* center captions
* more fixes
* fix format
* add ZeRO++ Japanese blog
* add links
---------
Co-authored-by: HeyangQin <heyangqin@microsoft.com>
Co-authored-by: Conglong Li <conglong.li@gmail.com>
* Bug Fixes for autotuner and flops profiler (#1880)
* fix autotuner when backward is not called
* fix format
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
* Missing strided copy for gated MLP (#3788)
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
* Requires grad checking. (#3789)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* bump to 0.10.0
* Fix Bug in transform.cu (#3534)
* Bug fix
* Fixed formatting error
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
* bug fix: triton importing error (#3799)
Co-authored-by: Stephen Youn <styoun@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
---------
Co-authored-by: Heyang Qin <heyangqin@microsoft.com>
Co-authored-by: Bill Luo <50068224+zhiruiluo@users.noreply.github.com>
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Guorun <84232793+CaffreyR@users.noreply.github.com>
Co-authored-by: stephen youn <13525892+stephen-youn@users.noreply.github.com>
Co-authored-by: Stephen Youn <styoun@microsoft.com>
Co-authored-by: Arash Bakhtiari <arash@bakhtiari.org>
Co-authored-by: Ethan Doe <yidoe@microsoft.com>
Co-authored-by: yidoe <68296935+yidoe@users.noreply.github.com>
Co-authored-by: GuanhuaWang <alexwgh333@gmail.com>
Co-authored-by: cmikeh2 <connorholmes@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com>
Co-authored-by: Conglong Li <conglong.li@gmail.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Joe Mayer <114769929+jomayeri@users.noreply.github.com>
Co-authored-by: Ramya Ramineni <62723901+rraminen@users.noreply.github.com>
-
Committed by hablb
Grad tensors that don't fit in the bucket flat buffer are not added to it but are still added to params_in_ipg_bucket. If such tensors exist, use reduce_scatter of params_in_ipg_bucket instead of allreduce, since allreduce assumes all grads are in ipg_bucket_flat_buffer.
Add test for reduce_scatter=false.
Fix padding to use zeros instead of undefined values.
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 01 Jul, 2023 1 commit
-
-
Committed by 郭叶军
.cpp files are excluded
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 30 Jun, 2023 4 commits
-
-
Committed by Alexander Jipa
Co-authored-by: Alexander Jipa <azzhipa@amazon.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Ma, Guokai
* add show_straggler argument to log_summary()
* Show straggler effect logging in separate table
* fix formatting
* add docs for log_summary with straggler effect
* fix typo
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Michael Wyatt
* utilize shorter tests for MII
* use cached torch download
* rework zero++ unit tests
* formatting
---------
Co-authored-by: HeyangQin <heyangqin@microsoft.com>
-
Committed by Logan Adams
* Disable AMD workflows in the YML
* Switch from PR to nightly so we can enable the flows here
-
- 29 Jun, 2023 4 commits
-
-
Committed by 郭叶军
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Heyang Qin
* Fix race condition in GatheredParameters
-
Committed by Michael Wyatt
-
Committed by Michael Wyatt
* move torch19 tests to nightly
* make megatron apex install persistent on blob storage
-
- 27 Jun, 2023 5 commits
-
-
Committed by Sam Ade Jacobs
Set reduce_scatter flag to true
-
Committed by Masahiro Tanaka
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Nadav Timor
July 2022 update: chrome://tracing is deprecated and by default redirects to https://ui.perfetto.dev (see https://chromium.googlesource.com/catapult/+/refs/heads/main/tracing/docs/perfetto.md). (#3805)
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Jeff Rasley
This reverts commit 2b2be85f.
-
Committed by Michael Wyatt
-
- 24 Jun, 2023 11 commits
-
-
Committed by kisseternity
Co-authored-by: Conglong Li <conglong.li@gmail.com>
-
Committed by stephen youn
Co-authored-by: Stephen Youn <styoun@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Ramya Ramineni
* Bug fix
* Fixed formatting error
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Jeff Rasley
-
Committed by Joe Mayer
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Connor Holmes
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Cheng Li
* fix autotuner when backward is not called
* fix format
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Masahiro Tanaka
* zeropp chinese blog
* try better quality images
* make title larger
* even larger...
* various fix
* center captions
* more fixes
* fix format
* add ZeRO++ Japanese blog
* add links
---------
Co-authored-by: HeyangQin <heyangqin@microsoft.com>
Co-authored-by: Conglong Li <conglong.li@gmail.com>
-
Committed by Heyang Qin
-
Committed by Heyang Qin
Co-authored-by: Sam Abe Jacobs <samjacobs@microsoft.com>
Co-authored-by: GuanhuaWang <alexwgh333@gmail.com>
Co-authored-by: cmikeh2 <connorholmes@microsoft.com>
Co-authored-by: Samyam Rajbhandari <samyamr@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
Committed by stephen youn
Co-authored-by: Stephen Youn <styoun@microsoft.com>
Co-authored-by: Arash Bakhtiari <arash@bakhtiari.org>
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Ethan Doe <yidoe@microsoft.com>
Co-authored-by: yidoe <68296935+yidoe@users.noreply.github.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-