- 19 Sep, 2020 3 commits
-
-
Committed by Shaden Smith
-
Committed by Jeff Rasley
This reverts commit 01b6e27e. Co-authored-by: Shaden Smith <ShadenTSmith@gmail.com>
-
Committed by Shaden Smith
* Activation checkpointing bugfix and unit tests. Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 18 Sep, 2020 1 commit
-
-
Committed by Jeff Rasley
-
- 17 Sep, 2020 6 commits
-
-
Committed by Gowtham Prudhvi
-
Committed by Shaden Smith
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Haibin Lin
* Update stage2.py Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Shaden Smith
* Switches fused_optimizer overflow calculation
-
Committed by Olatunji Ruwase
* Update installation instructions * Format fix * ZeRO tutorial * Format fixes * ZeRO-Offload * ZeRO and ZeRO-Offload tutorials * Update navigation page * Format fixes * Add yuxhe feedback * Fix blog post link * Fix OneBit-Adam link Tweak scheduler example * Fix date link Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com> Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Olatunji Ruwase
Update lr schedule unit tests
-
- 16 Sep, 2020 2 commits
-
-
Committed by Jeff Rasley
-
Committed by Jeff Rasley
-
- 15 Sep, 2020 1 commit
-
-
Committed by Jeff Rasley
* Add pytest skips around tests that require certain ops to be installed
-
- 14 Sep, 2020 1 commit
-
-
Committed by Shaden Smith
-
- 12 Sep, 2020 1 commit
-
-
Committed by Jeff Rasley
This reverts commit e549be60.
-
- 11 Sep, 2020 11 commits
-
-
Committed by RezaYazdaniAminabadi
* Support intermediate sizes other than 4*hidden_dim * Run precommit * Uncomment the unit tests Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Shaden Smith
-
Committed by Jeff Rasley
-
Committed by Olatunji Ruwase
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com> Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Jeff Rasley
-
Committed by Jeff Rasley
-
Committed by Jeff Rasley
-
Committed by Jeff Rasley
-
Committed by Jeff Rasley
-
Committed by Jeff Rasley
-
Committed by Shaden Smith
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 10 Sep, 2020 14 commits
-
-
Committed by Jeff Rasley
-
Committed by Jeff Rasley
Co-authored-by: Shaden Smith <ShadenTSmith@gmail.com> Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
-
Committed by Arash Ashari
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Minjia Zhang
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Arash Ashari
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Jeff Rasley
-
Committed by Olatunji Ruwase
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com> Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Jeff Rasley
-
Committed by Jeff Rasley
Fixes a datatype issue with softmax where the number of blocks passed to the Triton kernel source was a torch.Tensor but should have been a Python integer. In some environments (e.g., conda) this left Triton unable to serialize the input, which crashed our tests. Switching to the datatype that Triton expects appears to have resolved the issue. Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
-
Committed by Ammar Ahmad Awan
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Shaden Smith
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Jeff Rasley
* ZeRO-Offload (squash) (#381) Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> Co-authored-by: Reza Yazdani <reyazda@microsoft.com> Co-authored-by: Jeff Rasley <jerasley@microsoft.com> Co-authored-by: Jie <37380896+jren73@users.noreply.github.com> Co-authored-by: Arash Ashari <arashari@microsoft.com> Co-authored-by: Samyam Rajbhandari <samyamr@microsoft.com> Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com> Co-authored-by: arashashari <arashashari@ArashMSLaptop.redmond.corp.microsoft.com> Co-authored-by: RezaYazdaniAminabadi <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
-
Committed by Jeff Rasley
-
Committed by Arash Ashari
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-