Commits · c14b839d9898f4c84e372e896e3ce8fa2e169a79 · Greenplum / DeepSpeed

09 Jan, 2021 4 commits
- version bump to 0.3.10 · c14b839d
  Committed by Jeff Rasley on Jan 08, 2021
  
  c14b839d
- Remove a very verbose print statement. (#649) · af212f66
  Committed by Ammar Ahmad Awan on Jan 08, 2021
```
* Remove a very verbose print statement.

* Update engine.py
```
  af212f66
- add additional validation checks in elastic config (#646) · bc046dc4
  Committed by Jeff Rasley on Jan 08, 2021
  
  bc046dc4
- document deepspeed.initialize() (#644) · 828d75ba
  Committed by Stas Bekman on Jan 08, 2021
```
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
```
  828d75ba
08 Jan, 2021 2 commits

Add deepspeed.init_distributed to RTD page (#645) · 4e2dc4e4
Committed by Jeff Rasley on Jan 07, 2021
```
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
```
4e2dc4e4

Bump nokogiri from 1.10.10 to 1.11.0 in /docs (#630) · 8cea96dd

Committed by dependabot[bot] on Jan 07, 2021

Bumps [nokogiri](https://github.com/sparklemotion/nokogiri) from 1.10.10 to 1.11.0.
- [Release notes](https://github.com/sparklemotion/nokogiri/releases)
- [Changelog](https://github.com/sparklemotion/nokogiri/blob/master/CHANGELOG.md)
- [Commits](https://github.com/sparklemotion/nokogiri/compare/v1.10.10...v1.11.0)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>

8cea96dd

07 Jan, 2021 2 commits
- Update builder.py (#642) · 64461da4
  Committed by Xingjian Shi on Jan 06, 2021
  
  64461da4
- Module replacement support (#586) · 44bd538b
  Committed by Jeff Rasley on Jan 06, 2021
```
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
```
  44bd538b
06 Jan, 2021 4 commits
- Fix docstring format (#640) · 5ab12795
  Committed by Olatunji Ruwase on Jan 05, 2021
  
  5ab12795
- docs: minor spelling tweaks (#623) · 46d2e287
  Committed by brett koonce on Jan 05, 2021
```
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
```
  46d2e287
- change dist to torch.distributed to fix bug in assert. (#638) · d38ad6a1
  Committed by Ammar Ahmad Awan on Jan 05, 2021
  
  d38ad6a1
- Allow DeepSpeed models to be initialized with optimizer=None (#469) · a9a83a6f
  Committed by gcooper-isi on Jan 05, 2021
```
Allow DeepSpeed models to be initialized with optimizer=None
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
```
  a9a83a6f
05 Jan, 2021 2 commits
- Support initialization with dict configuration (#632) · e6ac7311
  Committed by Olatunji Ruwase on Jan 04, 2021
  
  e6ac7311
- update SA comp check to fix torch-cpu issue (#631) · 24e07399
  Committed by Jeff Rasley on Jan 04, 2021
  
  24e07399
23 Dec, 2020 1 commit
- Elastic training support (#602) · 81aeea36
  Committed by Jeff Rasley on Dec 22, 2020
```
Co-authored-by: Samyam Rajbhandari <samyamr@microsoft.com>
```
  81aeea36
18 Dec, 2020 2 commits
- Ability to initialize distributed backend outside deepspeed runtime (#608) · 7435b2f1
  Committed by Jeff Rasley on Dec 17, 2020
  
  7435b2f1
- Transformer-kernel - supporting any arbitrary sequence-length (#587) · fd2f970b
  Committed by Reza Yazdani on Dec 17, 2020
```
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
```
  fd2f970b
16 Dec, 2020 2 commits

Fixes for RTD build errors (#606) · 6380ee35
Committed by Jeff Rasley on Dec 15, 2020
```
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
```
6380ee35

[doc] xref to hostfile discussion (#604) · 007466e5

Committed by Stas Bekman on Dec 15, 2020

* [doc] xref to hostfile discussion

wasn't clear where to find what was meant by `hostfile` - so adding a link to where it's discussed.

* remove whitespace

007466e5

15 Dec, 2020 1 commit
- implement missing get_last_lr (#595) · 9f8e8f38
  Committed by Stas Bekman on Dec 14, 2020
```
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
```
  9f8e8f38
12 Dec, 2020 5 commits
- Update launcher to set local rank environ variable (#597) · c5a449f9
  Committed by Jeff Rasley on Dec 11, 2020
```
* Update launch.py

* formatting
```
  c5a449f9
- Supported customizing kwargs for lr_scheduler (#584) · a4763f55
  Committed by carefree0910 on Dec 12, 2020
```
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
```
  a4763f55
- add DeepSpeedZeroConfig repr method (#596) · 66268bd3
  Committed by Stas Bekman on Dec 11, 2020
```
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
```
  66268bd3
- [build] fix computer capability arch flags, add PTX, handle PTX (#591) · 8a184b6b
  Committed by Stas Bekman on Dec 11, 2020
```
* fix arch flags, add PTX

* bug fix
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
```
  8a184b6b
- add manual workflow to run tests with precompiled ops · 0518252d
  Committed by Jeff Rasley on Dec 11, 2020
  
  0518252d
10 Dec, 2020 4 commits
- Add AML video link · 7300f3e3
  Committed by Jeff Rasley on Dec 09, 2020
  
  7300f3e3
- Add papers/videos to readme/website (#592) · 19acd6cf
  Committed by Jeff Rasley on Dec 09, 2020
  
  19acd6cf
- bump to 0.3.8 · cb7c7da6
  Committed by Jeff Rasley on Dec 09, 2020
  
  cb7c7da6
- Pin triton to 0.2.3 for now, 0.3.0 is broken · d901a6d2
  Committed by Jeff Rasley on Dec 09, 2020
  
  d901a6d2
09 Dec, 2020 1 commit
- Pipeline warnings and checkpoint portability (#588) · 2f626978
  Committed by Shaden Smith on Dec 08, 2020
```
* Switch from deprecated allreduce interface.

* Make pipeline checkpoint files portable.
```
  2f626978
08 Dec, 2020 2 commits

[build] add compute_86 (#577) · e8b126d9

Committed by Stas Bekman on Dec 07, 2020

RTX-30 series are compute_86
```
python -c "import torch; print(torch.cuda.get_device_capability())"
```
This PR adds support for this compute capability.

Reference: https://developer.nvidia.com/cuda-gpus
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
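
For illustration, here is a minimal sketch of how a compute capability tuple such as the `(8, 6)` reported for RTX-30 cards could be turned into `nvcc` architecture flags. It assumes the common `-gencode=arch=compute_XY,code=sm_XY` flag form; the helper name `gencode_flags` and its `add_ptx` parameter are hypothetical and are not taken from DeepSpeed's builder code.

```python
def gencode_flags(capability, add_ptx=False):
    """Build nvcc -gencode flags for a (major, minor) compute capability.

    Hypothetical helper for illustration only, not DeepSpeed's builder logic.
    """
    major, minor = capability
    num = f"{major}{minor}"  # (8, 6) -> "86"
    # Native machine code (SASS) for exactly this architecture.
    flags = [f"-gencode=arch=compute_{num},code=sm_{num}"]
    if add_ptx:
        # Also embed PTX so newer GPUs can JIT-compile the kernels.
        flags.append(f"-gencode=arch=compute_{num},code=compute_{num}")
    return flags


if __name__ == "__main__":
    # The tuple would normally come from torch.cuda.get_device_capability().
    print(gencode_flags((8, 6), add_ptx=True))
```

Passing the tuple in as a parameter keeps the sketch runnable on machines without a GPU or a PyTorch install.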

e8b126d9

[build] make builder smarter and configurable wrt compute capabilities + docs (#578) · ce363d0e
Committed by Stas Bekman on Dec 07, 2020

ce363d0e

05 Dec, 2020 1 commit

Fix potential random layout inconsistency issues in sparse attention modules (#534) · 1e44d48d

Committed by Zhun on Dec 04, 2020

* 1) Register layout as buffer of module so that we can save/load checkpoint; 2) Add a broadcast of layout at the beginning to ensure different processes will have consistent layout during distributed training.

* Add docstring for max_seq_length argument in SparseSelfAttention
Co-authored-by: Zhun Liu <zhunliu@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>

1e44d48d

03 Dec, 2020 5 commits
- [build] build against installed cuda-11.1 while torch built w/ cuda-11.0 (#570) · ff58fa7e
  Committed by Stas Bekman on Dec 02, 2020
  
  ff58fa7e
- Add compute capability 8.0 if on cuda 11+ (#572) · be33bea4
  Committed by Jeff Rasley on Dec 02, 2020
  
  be33bea4
- [engine] train should be able to get `mode` arg (#571) · 2d1f7c01
  Committed by Stas Bekman on Dec 02, 2020
  
  2d1f7c01
- Add 'latest' checkpoint save/load support (#569) · 845921b3
  Committed by Jeff Rasley on Dec 02, 2020
  
  845921b3
- [cifar tutorial] improve readability (#567) · 7a75f8b3
  Committed by Stas Bekman on Dec 02, 2020
```
* [cifar tutorial] improve readability 
```
  7a75f8b3
02 Dec, 2020 2 commits

tracking optimizer step in cpu-adam when loading checkpoint (#564) · 9f52a36f

Committed by Reza Yazdani on Dec 01, 2020

* tracking optimizer step in cpu-adam when loading checkpoint

* add warning/error message for updating optimizer step count

* resolve build issue

* supporting state update from the python side

* track step from python in all cases

* remove comma

9f52a36f

supporting different hidden dimensions (#559) · c78c29f9

Committed by Reza Yazdani on Dec 01, 2020

* supporting different hidden dimensions

* add support for larger hidden dimensions (greater than 8K)

* remove empty line

* add loop unrolling factor for dropout kernels

* update different kernels based on the reviews
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>

c78c29f9

Greenplum / DeepSpeed · last synced about 1 year ago