- 25 Jan, 2023 3 commits
-
-
Committed by Joe Mayer
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Molly Smith
* loop through pipe.model
* tp_parser first draft
* client_module must be type object
* Simplify layernorm tracking. Add unittest.
* cleanup
* Add more models to unittest
* cleanup inference pytest for merging
* Add unittest
* cleanup
* pre-commit
* unittest id and pytest marker
* try marian for unittest
* precommit
* Move tp code to separate file
* Add new auto tp file
* pre-commit and type
* Update deepspeed/module_inject/auto_tp.py
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
* Update deepspeed/module_inject/auto_tp.py
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
* Update tests/unit/inference/test_inference.py
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
* remove unused fillmask function
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
Committed by loadams
-
- 20 Jan, 2023 1 commit
-
-
Committed by Ammar Ahmad Awan
Co-authored-by: Lev Kurilenko <lekurile@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 19 Jan, 2023 4 commits
-
-
Committed by Michael Wyatt
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Joe Mayer
* BF16 optimizer only with ZeRO stage 1.
* Updating to grad accum of fp32 for BF16 ZeRO1 case.
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Michael Wyatt
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Jeff Rasley
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 18 Jan, 2023 6 commits
-
-
Committed by Michael Wyatt
-
Committed by Olatunji Ruwase
* CPU-Adam: add compile-flag to enable param-copy from CPU to GPU
* guard the CUDA-related include files and variables
* remove CUDA dependency from op_builder when building against CPU
* fixing the builder issues
* fix formatting
* return true when there is no mismatch on the cuda version
* guard for when cuda is not available & test with cpu-only environment
* Update cpu_adam and cpu_adagrad
* Format fixes
* Add configurable half precision type; Build/run in CUDA environment
* Run cpu_adam and cpu_adagrad in cpu only environment
* Mark CUDA only unit tests
* CPU environment CI
* Format fixes
* Remove --forked
* Add --forked
* CPU only CI should pass
* Format fixes
* Format fixes
* Remove scattered pytest.skip
* Fix cpu_adam unit test
* Update .github/workflows/nv-torch-latest-cpu.yml
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
* Update .github/workflows/nv-torch-latest-cpu.yml
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
* Address PR feedback
* OpenMP linking
* Fix unit tests
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
Committed by Jeff Rasley
-
Committed by Olatunji Ruwase
-
Committed by Jeff Rasley
-
Committed by Jeff Rasley
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 14 Jan, 2023 4 commits
-
-
Committed by Joe Mayer
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Jeff Rasley
-
Committed by Michael Wyatt
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Stas Bekman
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 13 Jan, 2023 1 commit
-
-
Committed by LOK CHAND KOPPAKA
* Extend quantization utils features
* remove unwanted files
* fix cache setting
Co-authored-by: Connor Holmes <connorholmes@microsoft.com>
-
- 12 Jan, 2023 2 commits
-
-
Committed by LOK CHAND KOPPAKA
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
-
Committed by Masahiro Tanaka
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 11 Jan, 2023 2 commits
-
-
Committed by cassieesvelt
* add logging changes
* try w/out abspath
* undo last change
* start mlflow debug
* remove mlflow from export_envs
* add mlflow logging for reversed
* remove mlflow.start_run
* add back start run
* don't clean cmd
* print os environment variables
* remove first start run
* add run_id to mlflow start
* remove context managers
* move last end run
* add extra parent start_runs
* add run id logging
* add logging to run_ds_config
* change run_id to run_name
* add back context managers and run_id logs
* remove context mng
* debug environment variable
* reset environment variables
* add env variable deletion
* clean up
* remove unused import
* fix yapf/whitespace errors
Co-authored-by: Cheng Li <pistasable@gmail.com>
-
Committed by JackieWu
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 10 Jan, 2023 2 commits
-
-
Committed by Ma, Guokai
-
Committed by Jeff Rasley
-
- 09 Jan, 2023 4 commits
-
-
Committed by Xiaoxia (Shirley) Wu
double check the unit tests
-
Committed by Stas Bekman
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by JackieWu
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by li-yi-dong
* Remove unnecessary device synchronization for stage 2
* Remove unnecessary device synchronization for stage 2
Co-authored-by: liyidong.lyd <liyidong.lyd@alibaba-inc.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Joe Mayer <114769929+jomayeri@users.noreply.github.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 07 Jan, 2023 2 commits
-
-
Committed by Ma, Guokai
* Abstract accelerator (step 2)
* more flex op_builder path for both installation and runtime
* add SpatialInferenceBuilder into cuda_accelerator.py
* use reflection to make cuda_accelerator adapt to CUDA op builder change automatically
* clean up deepspeed/__init__.py
* add comments in cuda_accelerator for no torch path
* Update deepspeed/env_report.py
Change env_report.py according to suggestion
Co-authored-by: Michael Wyatt <mrwyattii@gmail.com>
* reduce the range of try...except for better code clarity
* Add porting for deepspeed/ops/random_ltd/dropping_utils.py
* move accelerator to top directory and create symlink under deepspeed
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Michael Wyatt <mrwyattii@gmail.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Reza Yazdani
* fix Opt injection & add injection verification check at inference test
* fix several issues
* remove fixture
* remove check_injection when no kernel is injected
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 04 Jan, 2023 2 commits
-
-
Committed by Jeff Rasley
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Stas Bekman
* [doc] fix `min_loss_scale` default
* align
-
- 29 Dec, 2022 1 commit
-
-
Committed by Jeff Rasley
-
- 23 Dec, 2022 4 commits
-
-
Committed by Guanhua Wang
-
Committed by Jeff Rasley
-
Committed by Jeff Rasley
-
Committed by Samyam Rajbhandari
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 22 Dec, 2022 1 commit
-
-
Committed by Ikko Ashimine
-
- 21 Dec, 2022 1 commit
-
-
Committed by mzl
-