- 01 Feb, 2023 1 commit
-
-
Committed by Logan Adams
* Remove hardcoded instances to fp16 in log messages.
* Add model_dtype to print the correct format
* Respond to PR feedback
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 31 Jan, 2023 2 commits
-
-
Committed by cassieesvelt
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Bing Xie
* modify engine.py for formatting
* commit formatting changes on engine.py
-
- 29 Jan, 2023 1 commit
-
-
Committed by Connor Holmes
-
- 27 Jan, 2023 5 commits
-
-
Committed by Lev Kurilenko
This PR adds a torch version check in the test_bias_gelu unit test to skip if the torch version < 1.12. This is due to gelu implementation differences in versions prior to 1.12.
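The version gate described in this commit message can be sketched as follows. This is a minimal illustration and not the code from the PR: the helper names `parse_release` and `older_than` are hypothetical, and the comparison looks only at the leading numeric part of the version string, so local suffixes such as `+cu113` are ignored.

```python
import re


def parse_release(version: str) -> tuple:
    """Return the leading numeric release segment, e.g. '1.12.0+cu113' -> (1, 12, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def older_than(installed: str, minimum: str) -> bool:
    """True when `installed` is strictly older than `minimum` (hypothetical helper)."""
    a, b = parse_release(installed), parse_release(minimum)
    # Pad with zeros so that '1.12' and '1.12.0' compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b
```

In a pytest suite, a check like this would typically feed a skip condition, for example `pytest.mark.skipif(older_than(torch.__version__, "1.12"), reason="gelu implementation differs before torch 1.12")`.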
-
Committed by Reza Yazdani
* Reset KV-cache at the beginning of text-generation
* Add new backward kernel to handle large softmax-length
* remove unrelated changes
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Connor Holmes <connorholmes@microsoft.com>
-
Committed by Jeff Rasley
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Stas Bekman
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Dashiell Stander
* Fix how autotuning reports TFLOPS so that they are reported in FLOPS per second, not millisecond
  Co-authored-by: Nick Sarkauskas <nsarka00@gmail.com>
  Co-authored-by: Quentin Anthony <anthony.301@osu.edu>
  Signed-off-by: Dashiell Stander <dstander@protonmail.com>
* Actually it is microseconds -> seconds
  Signed-off-by: Dashiell Stander <dstander@protonmail.com>
* Actually it is microseconds -> seconds
  Signed-off-by: Dashiell Stander <dstander@protonmail.com>
Signed-off-by: Dashiell Stander <dstander@protonmail.com>
Co-authored-by: Nick Sarkauskas <nsarka00@gmail.com>
Co-authored-by: Quentin Anthony <anthony.301@osu.edu>
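The unit fix this commit describes (latency measured in microseconds, throughput reported per second) comes down to the arithmetic below. This is a sketch only: the function name `tflops_from_microseconds` and its parameters are hypothetical and are not taken from the autotuning code.

```python
def tflops_from_microseconds(flops: float, latency_us: float) -> float:
    """Convert a FLOP count and a latency in microseconds into TFLOPS.

    Hypothetical helper illustrating the unit conversion, not DeepSpeed's code.
    """
    if latency_us <= 0:
        raise ValueError("latency_us must be positive")
    seconds = latency_us / 1e6      # microseconds -> seconds
    flops_per_second = flops / seconds
    return flops_per_second / 1e12  # FLOPS -> TFLOPS
```

Dividing by the raw microsecond value instead of by seconds would understate the result by a factor of one million, which is the kind of scaling error the commit message refers to.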
-
- 26 Jan, 2023 2 commits
-
-
Committed by Ma, Guokai
* Integrate accelerator abstraction interface into deepspeed/
* Fix error message in fp16/fused_optimizer
* fix error message in fp16/unfused_optimizer.py
* assign get_accelerator().pin_memory() result to input Tensor name
* no need to check cuda and whether nvtx supported
* move try-except into inner most block
* call Event() and Stream() in get_accelerator() for data type
* Make Stream and Event as properties of abstract interface so they can be used as data type in deepspeed
* Apply op_builder backend api change from #2705 from @jeffra
* fix tests where Builder NAME is used
* keep original ...Builder.NAME interface instead of ...Builder().NAME interface
* fix builder closure for installation
* fix randomltd builder
* add comments to clarify create_op_builder and get_op_builder
* fix compatibility with pip install -e
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Stas Bekman
* [GatheredParameters] fix memory leak
* simplify
* cleanup and move
* style
* Formatting
* fix test
* fix test
* fix test take 2
* Trigger CI
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Joe Mayer <114769929+jomayeri@users.noreply.github.com>
-
- 25 Jan, 2023 3 commits
-
-
Committed by Joe Mayer
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Molly Smith
* loop through pipe.model
* tp_parser first draft
* client_module must be type object
* Simplify layernorm tracking. Add unittest.
* cleanup
* Add more models to unittest
* cleanup inference pytest for merging
* Add unittest
* cleanup
* pre-commit
* unittest id and pytest marker
* try marian for unittest
* precommit
* Move tp code to separate file
* Add new auto tp file
* pre-commit and type
* Update deepspeed/module_inject/auto_tp.py
  Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
* Update deepspeed/module_inject/auto_tp.py
  Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
* Update tests/unit/inference/test_inference.py
  Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
* remove unused fillmask function
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
Committed by loadams
-
- 20 Jan, 2023 1 commit
-
-
Committed by Ammar Ahmad Awan
Co-authored-by: Lev Kurilenko <lekurile@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 19 Jan, 2023 4 commits
-
-
Committed by Michael Wyatt
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Joe Mayer
* BF16 optimizer only with ZeRO stage 1.
* Updating to grad accum of fp32 for BF16 ZeRO1 case.
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Michael Wyatt
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Jeff Rasley
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 18 Jan, 2023 6 commits
-
-
Committed by Michael Wyatt
-
Committed by Olatunji Ruwase
* CPU-Adam: add compile-flag to enable param-copy from CPU to GPU
* guard the CUDA-related include files and variables
* remove CUDA dependency from op_builder when building against CPU
* fixing the builder issues
* fix formatting
* return true when there is no mismatch on the cuda version
* guard for when cuda is not available & test with cpu-only environment
* Update cpu_adam and cpu_adagrad
* Format fixes
* Add configurable half precision type; Build/run in CUDA environment
* Run cpu_adam and cpu_adagrad in cpu only environment
* Mark CUDA only unit tests
* CPU environment CI
* Format fixes
* Remove --forked
* Add --forked
* CPU only CI should pass
* Format fixes
* Format fixes
* Remove scattered pytest.skip
* Fix cpu_adam unit test
* Update .github/workflows/nv-torch-latest-cpu.yml
  Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
* Update .github/workflows/nv-torch-latest-cpu.yml
  Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
* Address PR feedback
* OpenMP linking
* Fix unit tests
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
Committed by Jeff Rasley
-
Committed by Olatunji Ruwase
-
Committed by Jeff Rasley
-
Committed by Jeff Rasley
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 14 Jan, 2023 4 commits
-
-
Committed by Joe Mayer
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Jeff Rasley
-
Committed by Michael Wyatt
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Stas Bekman
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 13 Jan, 2023 1 commit
-
-
Committed by LOK CHAND KOPPAKA
* Extend quantization utils features
* remove unwanted files
* fix cache setting
Co-authored-by: Connor Holmes <connorholmes@microsoft.com>
-
- 12 Jan, 2023 2 commits
-
-
Committed by LOK CHAND KOPPAKA
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
-
Committed by Masahiro Tanaka
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 11 Jan, 2023 2 commits
-
-
Committed by cassieesvelt
* add logging changes
* try w/out abspath
* undo last change
* start mlflow debug
* remove mlflow from export_envs
* add mlflow logging for reversed
* remove mlflow.start_run
* add back start run
* don't clean cmd
* print os environment variables
* remove first start run
* add run_id to mlflow star
* remove context managers
* move last end run
* add extra parent start_runs
* add run id logging
* add logging to run_ds_config
* change run_id to run_name
* add back context managers and run_id logs
* remove context mng
* debug environment variable
* reset environment variables
* add env variable deletion
* clean up
* remove unused import
* fix yapf/whitespace errors
Co-authored-by: Cheng Li <pistasable@gmail.com>
-
Committed by JackieWu
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 10 Jan, 2023 2 commits
-
-
Committed by Ma, Guokai
-
Committed by Jeff Rasley
-
- 09 Jan, 2023 4 commits
-
-
Committed by Xiaoxia (Shirley) Wu
double check the unit tests
-
Committed by Stas Bekman
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by JackieWu
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by li-yi-dong
* Remove unnecessary device synchronization for stage 2
* Remove unnecessary device synchronization for stage 2
Co-authored-by: liyidong.lyd <liyidong.lyd@alibaba-inc.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Joe Mayer <114769929+jomayeri@users.noreply.github.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-