- 24 Aug, 2022 3 commits
-
-
Committed by Reza Yazdani
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Arash Bakhtiari
-
Committed by Olatunji Ruwase
Refactor distributed tests: checkpointing
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
- 23 Aug, 2022 1 commit
-
-
Committed by Jeff Rasley
-
- 22 Aug, 2022 1 commit
-
-
Committed by Olatunji Ruwase
* Correctly detect offload configuration
* Correctly detect offload configuration
* Handle deprecated cpu offload setting
* Correctly detect zero_offload setting
* Minor tweak
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
-
- 20 Aug, 2022 1 commit
-
-
Committed by Jeff Rasley
-
- 18 Aug, 2022 2 commits
-
-
Committed by Reza Yazdani
* Fix the tensor-slicing copy for qkv parameters
* Remove the random-generator from context during inference
* Formatting
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Conglong Li
-
- 17 Aug, 2022 1 commit
-
-
Committed by Jeff Rasley
-
- 16 Aug, 2022 1 commit
-
-
Committed by Zhihong Chen
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 15 Aug, 2022 1 commit
-
-
Committed by Arash Bakhtiari
* add opt replace policy
* simplify inf. api
* fix opt replace policy
* fix use-cache & add relu
* Add support of custom MLP act. function
* Revert "simplify inf. api" This reverts commit 9e910fcbd5471dec9b3c92008426f5ba590bf0b6.
* fix the inference API (temp. solution)
* fix code formatting
* add unit tests for OPT models.
* refactor pre-attention layer norm configuration
* add support of opt-350m model
* refactor the HF model config initialization
* fix hf model config issue
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
-
- 13 Aug, 2022 2 commits
-
-
Committed by Ammar Ahmad Awan
* print warning only once.
* add support for torch param and only warn on gpu 0
* remove type checking. will be done on a new PR with more tests.
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Jeff Rasley
-
- 12 Aug, 2022 2 commits
-
-
Committed by Jeff Rasley
* add cuda 11.7
* formatting
-
Committed by Olatunji Ruwase
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 11 Aug, 2022 3 commits
-
-
Committed by Kamal Raj
Co-authored-by: Conglong Li <conglong.li@gmail.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Michael Wyatt
Refactor Distributed unit tests
-
Committed by Reza Yazdani
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 10 Aug, 2022 3 commits
-
-
Committed by Olatunji Ruwase
-
Committed by Minjia Zhang
Adding additional instructions in the compression tutorial on pre-training distillation and quantization for GPT (#2197)
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Mikhail Druzhinin
* Add gradient_average flag support for sparse grads
* formatting fixes
* Add tests
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 09 Aug, 2022 1 commit
-
-
Committed by Reza Yazdani
-
- 08 Aug, 2022 1 commit
-
-
Committed by Tiago De Gaspari
Fix typos.
-
- 06 Aug, 2022 2 commits
-
-
Committed by Jeff Rasley
-
Committed by Jeff Rasley
-
- 05 Aug, 2022 3 commits
-
-
Committed by Rahil Bathwal
-
Committed by Hanlin Tang
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Ramya Ramineni
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
- 04 Aug, 2022 4 commits
-
-
Committed by Olatunji Ruwase
* Match compute and reduce dtype
* Unit tests
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Reza Yazdani
-
Committed by Jeff Rasley
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Michael Wyatt
Re-enable AMD CI with some modifications
-
- 03 Aug, 2022 3 commits
-
-
Committed by Jeff Rasley
-
Committed by Zion Wu
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Jeff Rasley
-
- 02 Aug, 2022 2 commits
-
-
Committed by Michael Wyatt
Fix for distributed tests
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Jeff Rasley
-
- 01 Aug, 2022 2 commits
-
-
Committed by Siddharth Singh
* tensor parallelism for mixture of experts
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
-
Committed by Olatunji Ruwase
* Split parameter offload from z3
* Format fixes
* Bug fixes
* Cleanup
* Remove dead code
* Release swap buffers for persisted params
* Format fixes
* Format fixes
* Pass args correctly
* Use pinned memory for nvme offload
* Merge with master
* Fix missing import
* model persistence params
* Fix merge issues
* Handle none device
* Use log_dist
-
- 31 Jul, 2022 1 commit
-
-
Committed by Jeff Rasley
-