Commits · 43bf035cfce6eec7b93e534a8a44d2aefd420ffd · Greenplum / DeepSpeed

16 Nov, 2022 1 commit
- Update docs to autogenerate pydantic config model docs (#2509) · 43bf035c
  Committed by Michael Wyatt on Nov 15, 2022
```
* update zero config docs
* add autogenerated docs for pydantic models used in ZeRO and Inference configs
```
15 Nov, 2022 2 commits

DeepSpeed inference config. (#2459) (#2472) · b5d18a6a

Committed by Ammar Ahmad Awan on Nov 14, 2022

Changes to inference API to accept a config dict and cleaning up Inference Engine to utilize the newly added inference config.
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>

bump to 0.7.6 · a4ceabb6
Committed by Jeff Rasley on Nov 14, 2022

14 Nov, 2022 1 commit
- Fix typos: deepseed -> deepspeed (#2499) · 06e00f61
  Committed by iLeGend on Nov 14, 2022
12 Nov, 2022 1 commit
- Make data contiguous before the inplace reshape-copy_ function (#2489) · f2710bbe
  Committed by lokoppakmsft on Nov 11, 2022
```
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
```
11 Nov, 2022 2 commits
- Fix nightly CI tests (#2493) · be5ec506
  Committed by Michael Wyatt on Nov 10, 2022
```
* fix for lm-eval nightly tests and add gpt-j to MPtest because OOM on single GPU

* add nv-nightly badge
```
- Make bf16_optimizer work for non pipeline (#2470) · ee39187d
  Committed by Olatunji Ruwase on Nov 10, 2022
10 Nov, 2022 5 commits
- stage_1_and_2.py: no allreduce needed when mp size is 1 (#2494) · 3ca9878d
  Committed by 郭叶军 on Nov 10, 2022
- Stable Diffusion Enhancements (#2491) · e7e75955
  Committed by Connor Holmes on Nov 09, 2022
```
Co-authored-by: cmikeh2 <connorholmes@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
```
- Add `scale_attn_by_inverse_layer_idx` feature (#2486) · 6f77da1b
  Committed by Kevin Ko on Nov 10, 2022
```
* Add scale_attn_by_inverse_layer_idx feature

* Fix layer_id bug

* Fix scaling value
Co-authored-by: Connor Holmes <connorholmes@microsoft.com>
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
```
- [docs] add SD tutorial to deepspeed.ai news · d2d1b4c3
  Committed by Jeff Rasley on Nov 09, 2022
- [docs] add SD tutorial to news · a63cb07e
  Committed by Jeff Rasley on Nov 09, 2022
09 Nov, 2022 1 commit

Fix CI issues related to cupy install (#2483) · 521d329b

Committed by Michael Wyatt on Nov 08, 2022

* remove any cupy install when setting up environments

* revert previous changes to run on cu111 runners

* fix for when no cupy is installed

* remove cupy uninstall for workflows not using latest torch version

* update to cu116 for inference tests

* fix pip uninstall line

* move python environment list to after DS install

* remove cupy uninstall

* re-add --forked

* fix how we get cupy version (should be based on nvcc version)

08 Nov, 2022 2 commits
- Add correct memory-allocation at DeepSpeed-Attention (#2474) · 9cfcf743
  Committed by Reza Yazdani on Nov 07, 2022
```
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Connor Holmes <connorholmes@microsoft.com>
```
- fix accelerate link (#2481) · a47c3e03
  Committed by kyoto7250 on Nov 08, 2022
05 Nov, 2022 2 commits
- Added MLFLOW environment variables for logging metrics within trainig… (#2477) · ffb6d987
  Committed by savitamittal1 on Nov 04, 2022
```
* Added MLFLOW environment variables for logging metrics within training script

* exporting MLFlow env variables from AML env
Co-authored-by: Cheng Li <pistasable@gmail.com>
```
- Updating autotune json default in docs. (#2476) · 4a06ecf6
  Committed by Joe Mayer on Nov 04, 2022
```
* Updating autotune default in docs.

* Running pre-commit.
```
04 Nov, 2022 2 commits
- don't gather partitioned activations for mp size 1 (#2454) · f74ee318
  Committed by 郭叶军 on Nov 04, 2022
```
* don't gather partitioned activations for mp size 1

* add inline comment for the change
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
```
- Create a new folder structure to isolate model-specific code in DS (#2464) · 35458da0
  Committed by Ammar Ahmad Awan on Nov 03, 2022
03 Nov, 2022 2 commits
- fixing the checkpoint loading at inference-engine (#2429) · 39bdc141
  Committed by Reza Yazdani on Nov 02, 2022
```
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
```
- Cache Allocation and Softmax Fixes (#2433) · 10e9d04c
  Committed by Connor Holmes on Nov 02, 2022
```
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
```
02 Nov, 2022 1 commit

Fixes for various CI problems (#2457) · 825f9d48

Committed by Michael Wyatt on Nov 01, 2022

* check only major CUDA version in CI

* update expected torch latest version

* pin torch latest to 1.12 until issues with 1.13 are resolved

* wrong expected torch version

* Update nv-torch18-v100.yml

* remove forked from pytest option due to cuda re-initialization errors

* removed expected torch version from inference tests, causing errors currently

* fix various bugs that popped up

* move all tests over to cu111 runners, cu113 runners having problems

28 Oct, 2022 2 commits
- deepspeed/launcher/launch.py: add option '--enable_each_rank_log logdir' (#2409) · 3432c740
  Committed by 郭叶军 on Oct 28, 2022
- Reduction Kernel Utility (#2436) · be4ffb82
  Committed by Connor Holmes on Oct 27, 2022
```
* Initial reduction_utils.h implementation

* Add initialization helper, ensures correct min/max behavior

* Remove unnecessary warp sync
```
27 Oct, 2022 3 commits

Fixing a mismatch in basic adam test. (#2447) · 3b3ba3c2
Committed by Joe Mayer on Oct 26, 2022

Use CUDA events for inference model profiling (#2371) · e772f166
Committed by Michael Wyatt on Oct 26, 2022
```
* use cuda event timers for model profiling
```

rollback ds config changes (#2395) · 8da0238b

Committed by Cheng Li on Oct 26, 2022

* rollback ds config changes

* fix format

* Fix error when output_file is a relative path without a prefix (#2397)
Co-authored-by: Benjamin Steenhoek <benjaminjsteenhoek@gmail.com>

* fix results and exprs path to use absolute path

* write out optimal config after tuning

* fix format

* assert tuning result dir creation
Co-authored-by: Benjamin Steenhoek <benjaminjsteenhoek@gmail.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>

26 Oct, 2022 2 commits

Fix build issues on Windows (#2428) · b85eb3b9

Committed by eltonzheng on Oct 25, 2022

* Fix build issues on Windows

* small fix to compile with new version of Microsoft C++ Build Tools
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>

update pytorch pool operator function signiture (#2443) · 5d1f595c
Committed by Cheng Li on Oct 25, 2022
```
* update pytorch pool operator function signiture

* fix the case where kwargs is None
```

25 Oct, 2022 1 commit
- Fix Bug #2319 (#2438) · 7d113633
  Committed by Joe Mayer on Oct 24, 2022
```
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
```
22 Oct, 2022 3 commits

bump to 0.7.5 · a5248643
Committed by Jeff Rasley on Oct 21, 2022

Fix broken link to DeepSpeed Megatron fork (#2440) · 877a8818
Committed by lekurile on Oct 21, 2022
```
Co-authored-by: Lev Kurilenko <lekurile@microsoft.com>
```

parallelize writing of layer checkpoint files across data parallel instances (#1419) · b8fb9c3f

Committed by Adam Moody on Oct 21, 2022

* parallelize layer checkpoints across data parallel groups

* use partition_uniform to determine start/end index values

* formatting fix

* config: add option for parallel write of layer checkpoints in pipeline stage

* yapf fixes

* enable parallel layer write according to config param

* avoid extraneous makedir when rank 0 writes all layers
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>

20 Oct, 2022 1 commit

[memory estimators] new config args sync (#2431) · 99fde3b7

Committed by Stas Bekman on Oct 19, 2022

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>

19 Oct, 2022 2 commits

Add TestInjectionPolicy inference unittest class for testing custom injection policies (#2426) · b2a724e2

Committed by lekurile on Oct 18, 2022

This PR adds a TestInjectionPolicy inference unittest class for testing custom injection policies.

This test differs from the existing tests in that the injection_policy dictionary is explicitly specified when calling the DeepSpeed init_inference API.

The google/t5-v1_1-small text2text-generation model and the roberta-large fill-mask model are added as tests with the injection policy explicitly specified.

This is done to expand our unittest coverage to test the path where the replace_wo_policy function is invoked (see GH-2387).
Co-authored-by: Lev Kurilenko <lekurile@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>

only add deps if extra is explictly called (#2432) · 1b7c6791
Committed by Jeff Rasley on Oct 18, 2022

18 Oct, 2022 3 commits

Universal checkpoint for zero stage 1 (#2284) · 799120e7

Committed by Olatunji Ruwase on Oct 18, 2022

* Refactor universal checkpointing and tensor fragments

* Formatting

* Support zero stage1; Expand TP dim

* Remove debug prints

* Detect sharded optimizer state

* Format fixes

* Encode reshaping guide

* More symbolic constants
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>

Fixing bug 2361 (#2410) · 906b4a02

Committed by Joe Mayer on Oct 17, 2022

* fixing bug 2361

* adding pytest for config initialization

* changing expected output to FusedAdam

* remove print statement

* running yapf on modified files

* running pre-commit formatting
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>

Fix for inference gpt-j test (#2430) · 34fb6d19

Committed by Michael Wyatt on Oct 17, 2022

* fix for gpt-j failing due to tokenizer error

* limit number of gpt-j tokens generated due to low memory

15 Oct, 2022 1 commit

fixes #2389 (#2411) · cfead551

Committed by Alexander Jipa on Oct 14, 2022

truncating expert param storage for checkpointing
Co-authored-by: Alexander Jipa <azzhipa@amazon.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>

Greenplum / DeepSpeed · last synced about 1 year ago