- 09 Jan, 2021 4 commits
  - Committed by Jeff Rasley
  - Committed by Ammar Ahmad Awan
    * Remove a very verbose print statement.
    * Update engine.py
  - Committed by Jeff Rasley
  - Committed by Stas Bekman
    Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
- 08 Jan, 2021 2 commits
  - Committed by Jeff Rasley
    Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
  - Committed by dependabot[bot]
    Bumps [nokogiri](https://github.com/sparklemotion/nokogiri) from 1.10.10 to 1.11.0.
    - [Release notes](https://github.com/sparklemotion/nokogiri/releases)
    - [Changelog](https://github.com/sparklemotion/nokogiri/blob/master/CHANGELOG.md)
    - [Commits](https://github.com/sparklemotion/nokogiri/compare/v1.10.10...v1.11.0)

    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
- 07 Jan, 2021 2 commits
  - Committed by Xingjian Shi
  - Committed by Jeff Rasley
    Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
    Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
- 06 Jan, 2021 4 commits
  - Committed by Olatunji Ruwase
  - Committed by brett koonce
    Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
  - Committed by Ammar Ahmad Awan
  - Committed by gcooper-isi
    Allow DeepSpeed models to be initialized with optimizer=None
    Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
- 05 Jan, 2021 2 commits
  - Committed by Olatunji Ruwase
  - Committed by Jeff Rasley
- 23 Dec, 2020 1 commit
  - Committed by Jeff Rasley
    Co-authored-by: Samyam Rajbhandari <samyamr@microsoft.com>
- 18 Dec, 2020 2 commits
  - Committed by Jeff Rasley
  - Committed by Reza Yazdani
    Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
- 16 Dec, 2020 2 commits
  - Committed by Jeff Rasley
    Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
  - Committed by Stas Bekman
    * [doc] xref to hostfile discussion. It wasn't clear where to find what was meant by `hostfile`, so adding a link to where it's discussed.
    * remove whitespace
- 15 Dec, 2020 1 commit
  - Committed by Stas Bekman
    Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
- 12 Dec, 2020 5 commits
  - Committed by Jeff Rasley
    * Update launch.py
    * formatting
  - Committed by carefree0910
    Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
  - Committed by Stas Bekman
    Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
  - Committed by Stas Bekman
    * fix arch flags, add PTX
    * bug fix

    Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
  - Committed by Jeff Rasley
- 10 Dec, 2020 4 commits
  - Committed by Jeff Rasley
  - Committed by Jeff Rasley
  - Committed by Jeff Rasley
  - Committed by Jeff Rasley
- 09 Dec, 2020 1 commit
  - Committed by Shaden Smith
    * Switch from deprecated allreduce interface.
    * Make pipeline checkpoint files portable.
- 08 Dec, 2020 2 commits
  - Committed by Stas Bekman
    RTX-30 series are compute_86

    ```
    python -c "import torch; print(torch.cuda.get_device_capability())"
    ```

    This PR adds support for this compute capability.
    Reference: https://developer.nvidia.com/cuda-gpus
    Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
  - Committed by Stas Bekman
- 05 Dec, 2020 1 commit
  - Committed by Zhun
    * 1) Register layout as buffer of module so that we can save/load checkpoint; 2) Add a broadcast of layout at the beginning to ensure different processes will have consistent layout during distributed training.
    * Add docstring for max_seq_length argument in SparseSelfAttention

    Co-authored-by: Zhun Liu <zhunliu@microsoft.com>
    Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
- 03 Dec, 2020 5 commits
  - Committed by Stas Bekman
  - Committed by Jeff Rasley
  - Committed by Stas Bekman
  - Committed by Jeff Rasley
  - Committed by Stas Bekman
    * [cifar tutorial] improve readability
- 02 Dec, 2020 2 commits
  - Committed by Reza Yazdani
    * tracking optimizer step in cpu-adam when loading checkpoint
    * add warning/error message for updating optimizer step count
    * resolve build issue
    * supporting state update from the python side
    * track step from python in all cases
    * remove comma
  - Committed by Reza Yazdani
    * supporting different hidden dimensions
    * add support for larger hidden dimensions (greater than 8K)
    * remove empty line
    * add loop unrolling factor for dropout kernels
    * update different kernels based on the reviews

    Co-authored-by: Jeff Rasley <jerasley@microsoft.com>