Commits · 4be8df721ae631264abb62216692c7eb64b49e20 · Greenplum / DeepSpeed

25 Jan, 2023 · 3 commits

fixing optimizer sanity check (#2742) · 4be8df72
Committed by Joe Mayer on Jan 25, 2023
```
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
```

Automatic tensor parallelism v2 (#2670) · d59b5729

Committed by Molly Smith on Jan 24, 2023

* loop through pipe.model

* tp_parser first draft

* client_module must be type object

* Simplify layernorm tracking. Add unittest.

* cleanup

* Add more models to unittest

* cleanup inference pytest for merging

* Add unittest

* cleanup

* pre-commit

* unittest id and pytest marker

* try marian for unittest

* precommit

* Move tp code to separate file

* Add new auto tp file

* pre-commit and type

* Update deepspeed/module_inject/auto_tp.py
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>

* Update deepspeed/module_inject/auto_tp.py
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>

* Update tests/unit/inference/test_inference.py
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>

* remove unused fillmask function
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>

Change zero_grad() argument to match pytorch (#2741) · 34a11688
Committed by loadams on Jan 24, 2023

20 Jan, 2023 · 1 commit

Inference Refactor (replace_with_policy, model_implementations) (#2554) · 867da307

Committed by Ammar Ahmad Awan on Jan 19, 2023

Co-authored-by: Lev Kurilenko <lekurile@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>

19 Jan, 2023 · 4 commits
- fix typo (#2718) · 8df50a26
  Committed by Michael Wyatt on Jan 18, 2023
  ```
  Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
  ```
- BF16 optimizer for BF16+ZeRO Stage 1 (#2706) · 8d87c89e
  Committed by Joe Mayer on Jan 18, 2023
  ```
  * BF16 optimizer only with ZeRO stage 1.

  * Updating to grad accum of fp32 for BF16 ZeRO1 case.
  Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
  ```
- update for lm-eval==0.3.0 (#2713) · 23e5133c
  Committed by Michael Wyatt on Jan 18, 2023
  ```
  Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
  ```
- [install] only add deepspeed pkg at install (#2714) · 0b549ad7
  Committed by Jeff Rasley on Jan 18, 2023
  ```
  Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
  ```

18 Jan, 2023 · 6 commits

remove master branch from CI triggers (#2712) · df2495ca
Committed by Michael Wyatt on Jan 17, 2023

CUDA optional deepspeed ops (#2507) · 3f210c97

Committed by Olatunji Ruwase on Jan 17, 2023

* CPU-Adam: add compile-flag to enable param-copy from CPU to GPU

* guard the CUDA-related include files and variables

* remove CUDA dependency from op_builder when building against CPU

* fixing the builder issues

* fix formatting

* return true when there is no mismatch on the cuda version

* guard for when cuda is not available & test with cpu-only environment

* Update cpu_adam and cpu_adagrad

* Format fixes

* Add configurable half precision type; Build/run in CUDA environment

* Run cpu_adam and cpu_adagrad in cpu only environment

* Mark CUDA only unit tests

* CPU environment CI

* Format fixes

* Remove --forked

* Add --forked

* CPU only CI should pass

* Format fixes

* Format fixes

* Remove scattered pytest.skip

* Fix cpu_adam unit test

* Update .github/workflows/nv-torch-latest-cpu.yml
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>

* Update .github/workflows/nv-torch-latest-cpu.yml
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>

* Address PR feedback

* OpenMP linking

* Fix unit tests
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>

bump to 0.8.1 · 7d0e4270
Committed by Jeff Rasley on Jan 17, 2023

ZeRO3 handling frozen weights (#2653) · bf6b9802
Committed by Olatunji Ruwase on Jan 17, 2023

remove print side effect from importing deepspeed (#2704) · 35575bce
Committed by Jeff Rasley on Jan 17, 2023

non-MoE stage 1 requires CG disabled (#2703) · e4ba7222
Committed by Jeff Rasley on Jan 17, 2023
```
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
```

14 Jan, 2023 · 4 commits
- using correct loss scale in zero step (#2695) · fe728e3e
  Committed by Joe Mayer on Jan 14, 2023
  ```
  Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
  ```
- exclude benchmarks during install (#2698) · cd271a4a
  Committed by Jeff Rasley on Jan 13, 2023
- fix for latest diffusers (#2699) · c9c6ab9e
  Committed by Michael Wyatt on Jan 13, 2023
  ```
  Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
  ```
- [GatheredParameters] add support for any iterator (#2664) · 217cc07b
  Committed by Stas Bekman on Jan 13, 2023
  ```
  Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
  ```

13 Jan, 2023 · 1 commit

Extend quantization utils features (#2683) · aef8a856

Committed by LOK CHAND KOPPAKA on Jan 12, 2023

* Extend quantization utils features

* remove unwanted files

* fix cache setting
Co-authored-by: Connor Holmes <connorholmes@microsoft.com>

12 Jan, 2023 · 2 commits
- Pass training flag to forward call from Eval (#2604) · e7c14026
  Committed by LOK CHAND KOPPAKA on Jan 11, 2023
  ```
  Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
  Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
  ```
- fix import path to op_builder (#2687) · cf9e433f
  Committed by Masahiro Tanaka on Jan 11, 2023
  ```
  Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
  ```

11 Jan, 2023 · 2 commits

Add mlflow logging for aml (#2495) · a3d7f106

Committed by cassieesvelt on Jan 10, 2023

* add logging changes

* try w/out abspath

* undo last change

* start mlflow debug

* remove mlflow from export_envs

* add mlflow logging for reversed

* remove mlflow.start_run

* add back start run

* don't clean cmd

* print os environment variables

* remove first start run

* add run_id to mlflow star

* remove context managers

* move last end run

* add extra parent start_runs

* add run id logging

* add logging to run_ds_config

* change run_id to run_name

* add back context managers and run_id logs

* remove context mng

* debug environment variable

* reset environment variables

* add env variable deletion

* clean up

* remove unused import

* fix yapf/whitespace errors
Co-authored-by: Cheng Li <pistasable@gmail.com>

remove duplicated code in ZeRO (#2655) · 89da037e
Committed by JackieWu on Jan 11, 2023
```
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
```

10 Jan, 2023 · 2 commits
- real_accelerator validation check for both accelerator and deepspeed.accelerator path (#2685) · 62c071e0
  Committed by Ma, Guokai on Jan 10, 2023
- [inference] ds-mlp refactor w.r.t. ops (#2668) · c702b64c
  Committed by Jeff Rasley on Jan 09, 2023

09 Jan, 2023 · 4 commits

fix Tensor contiguous bug in model_compression (#2671) · be6d19f0
Committed by Xiaoxia (Shirley) Wu on Jan 09, 2023
```
double check the unit tests
```

[fp16] lower initial_scale_power (#2663) · f30a0308
Committed by Stas Bekman on Jan 09, 2023
```
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
```

[Bug Fixed] use torch.cuda.is_available() (#2661) · 323c266c
Committed by JackieWu on Jan 09, 2023
```
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
```

Remove unnecessary device synchronization for stage 2 (#2500) · 97deaaec

Committed by li-yi-dong on Jan 09, 2023

* Remove unnecessary device synchronization for stage 2

* Remove unnecessary device synchronization for stage 2
Co-authored-by: liyidong.lyd <liyidong.lyd@alibaba-inc.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Joe Mayer <114769929+jomayeri@users.noreply.github.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>

07 Jan, 2023 · 2 commits

Abstract accelerator (step 2) (#2560) · 9548d48f

Committed by Ma, Guokai on Jan 07, 2023

* Abstract accelerator (step 2)

* more flex op_builder path for both installation and runtime

* add SpatialInferenceBuilder into cuda_accelerator.py

* use reflection to make cuda_accelerator adapt to CUDA op builder change automatically

* clean up deepspeed/__init__.py

* add comments in cuda_accelerator for no torch path

* Update deepspeed/env_report.py

Change env_report.py according to suggestion
Co-authored-by: Michael Wyatt <mrwyattii@gmail.com>

* reduce the range of try...except for better code clarity

* Add porting for deepspeed/ops/random_ltd/dropping_utils.py

* move accelerator to top directory and create symlink under deepspeed
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Michael Wyatt <mrwyattii@gmail.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>

Fix Opt injection (#2541) · 95d9a1b6

Committed by Reza Yazdani on Jan 06, 2023

* fix Opt injection & add injection verification check at inference test

* fix several issues

* remove fixture

* remove check_injection when no kernel is injected
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>

04 Jan, 2023 · 2 commits
- [launcher] fail gracefully if hostname -i doesn't work as expected (#2631) · a091bc22
  Committed by Jeff Rasley on Jan 03, 2023
  ```
  Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
  ```
- [doc] fix `min_loss_scale` default (#2660) · f2ea7a38
  Committed by Stas Bekman on Jan 03, 2023
  ```
  * [doc] fix `min_loss_scale` default

  * align
  ```

29 Dec, 2022 · 1 commit
- tweaks to ds-attn, distilbert policy, and mup (#2649) · d9b788d7
  Committed by Jeff Rasley on Dec 28, 2022

23 Dec, 2022 · 4 commits
- fix assertion error in zero stage 3 (#2647) · 6375cb3f
  Committed by Guanhua Wang on Dec 23, 2022
- Fix issue w. bloom when changing tp size (#2645) · e0aa84c5
  Committed by Jeff Rasley on Dec 22, 2022
- [inference] ds-attention refactor w.r.t. ops (#2623) · bb68c526
  Committed by Jeff Rasley on Dec 22, 2022
- [zero-3] Handle forward parameter return correctly in nested cases (#2642) · a298a43a
  Committed by Samyam Rajbhandari on Dec 22, 2022
  ```
  Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
  Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
  Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
  ```

22 Dec, 2022 · 1 commit
- Fix typo in autotuner.py (#2639) · 6273dffc
  Committed by Ikko Ashimine on Dec 22, 2022

21 Dec, 2022 · 1 commit
- add enable_each_rank_log to deepspeed/launcher/runner.py (#2571) · 11f5daba
  Committed by mzl on Dec 21, 2022

Greenplum / DeepSpeed · last synced about 1 year ago
