- 02 Sep, 2021 3 commits
-
-
Committed by Olatunji Ruwase
-
Committed by Olatunji Ruwase
-
Committed by Hari Prasad
* Added drop_last to DeepSpeedDataLoader. This solves issue #326
* Updated drop_last in engine.py; added drop_last as a ds_config as mentioned by @tjruwase
* Update engine.py
* Update engine.py
* updated config.py and constants.py
* Update constants.py
* added dataloader_ prefix
* Update dataloader.py
* corrected yapf test errors
* Update test_data.py: added dataloader_drop_last unit test
* Corrected yapf and formatting issues
* updated simple_model.py and test_data.py
* Update simple_model.py
* pre-commit fix
* corrected issues
* Update test_data.py
* Update test_data.py
* Update test_data.py
* Update test_data.py
* removed batch_size from test_data.py
* Update simple_model.py
* Update test_data.py
* Update test_data.py
* Fix unit test issues
* Use fp32 to make things work
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 31 Aug, 2021 2 commits
-
-
Committed by Ammar Ahmad Awan
* Remove the wrong function with duplicate name
* fix format.
* add mpu check. fix tests.
-
Committed by Stas Bekman
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 28 Aug, 2021 4 commits
-
-
Committed by Olatunji Ruwase
-
Committed by Olatunji Ruwase
* Rename PA_TO_cpu
* Code cleanup
* Revert accidental change
-
Committed by Reza Yazdani
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Reza Yazdani
* add more synchronizations and barriers for resolving gpu-halt issue
* removing unnecessary broadcasts
-
- 27 Aug, 2021 2 commits
-
-
Committed by Jeff Rasley
-
Committed by Reza Yazdani
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 26 Aug, 2021 1 commit
-
-
Committed by Olatunji Ruwase
* Callable option for optimizer and scheduler
* Add unit test
* Formatting
* Disable debug prints
* Use base optimizer to construct lr scheduler
* Formatting
* Remove dead import
-
- 25 Aug, 2021 1 commit
-
-
Committed by Jeff Rasley
* restore fp16 params if no zero ckpts available
* formatting
-
- 20 Aug, 2021 1 commit
-
-
Committed by Jeff Rasley
-
- 19 Aug, 2021 3 commits
-
-
Committed by Jeff Rasley
-
Committed by Jeff Rasley
-
Committed by Pruthvi Madugundu
-
- 18 Aug, 2021 3 commits
-
-
Committed by Jeff Rasley
-
Committed by Jeff Rasley
-
Committed by Jeff Rasley
-
- 17 Aug, 2021 2 commits
-
-
Committed by Ammar Ahmad Awan
Co-authored-by: Alex Muzio <Alex.Muzio@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Conglong Li <conglong.li@gmail.com>
Co-authored-by: Felipe Cruz Salinas <Andres.Cruz@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Samyam Rajbhandari <samyamr@microsoft.com>
Co-authored-by: Shaden Smith <shaden.smith@microsoft.com>
Co-authored-by: Young Jin Kim <youki@microsoft.com>
Co-authored-by: bapatra <bapatra@microsoft.com>
-
Committed by Conglong Li
Co-authored-by: Conglong Li <conglong.li@gmail.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 11 Aug, 2021 1 commit
-
-
Committed by Jeff Rasley
-
- 07 Aug, 2021 2 commits
-
-
Committed by Olatunji Ruwase
* Use correct input size for splits
* Use smarter partitioning
-
Committed by Olatunji Ruwase
-
- 06 Aug, 2021 1 commit
-
-
Committed by Denis Tarasov
Make the add operation in-place. Without it, momentum decays to zero and training has no effect on the corresponding parameters.
-
- 03 Aug, 2021 1 commit
-
-
Committed by Jeff Rasley
* fix empty grad zero tests
* don't clear grads in stage 1 code path
* prevent none grads from being reduced
-
- 31 Jul, 2021 1 commit
-
-
Committed by Jeff Rasley
-
- 30 Jul, 2021 1 commit
-
-
Committed by Olatunji Ruwase
* Fix docstring
* Make screenshots clickable for easier viewing
* Navigation menu in alphabetical order; more clickable screenshots
* Rename 1Cycle doc
* Tweak naming
* Remove no longer used flag
* ZeRO3 Offload release
* Single GPU results
* Rearrange figures
* Single GPU text
* tweak intro
* zero3-offload section
* Add asynchronous i/o docs
* Fix print_per_steps doc
* Document round_robin_gradients
* Tweak description
* Trigger CI
-
- 29 Jul, 2021 4 commits
-
-
Committed by Adam Moody
* aio: test for libaio with various package managers
* aio: note typical tool used to install libaio package
* setup: abort with error if cannot build requested op
* setup: define op_envvar to return op build environment variable
* setup: call is_compatible once for each op
* setup: only print suggestion to disable op when its envvar not set
* setup: add method to abort from fatal error
* Revert "setup: add method to abort from fatal error". This reverts commit 0e4cde6b0a650591c3fafface7e27b4efd9aad4f.
* setup: add method to abort from fatal error
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Olatunji Ruwase
* Make round robin gradient partitioning configurable (default False)
* Use the correct default
* Log config setting
-
Committed by Ivan Komarov
Co-authored-by: Ivan Komarov <dfyz@yandex-team.ru>
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
-
Committed by Olatunji Ruwase
-
- 27 Jul, 2021 2 commits
-
-
Committed by Jeff Rasley
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Adam Moody
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 25 Jul, 2021 1 commit
-
-
Committed by Adam Moody
-
- 21 Jul, 2021 1 commit
-
-
Committed by Reza Yazdani
* fixing inference api for FP32 and non-masking GPT-based models
* use a dummy tensor if input_mask is none
* fix input_mask
* minor fix
* send input_mask to compute_attn func for checking
-
- 20 Jul, 2021 1 commit
-
-
Committed by Stas Bekman
* zero_param_shapes: switch to round_robin_fp16_groups
* add test
* old torch workaround
-
- 16 Jul, 2021 2 commits
-
-
Committed by Stas Bekman
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Adam Moody
* enable async io op on powerpc architectures
* drop any empty strings returned by cxx_args
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-