- 21 Dec, 2022 1 commit
-
-
Authored by Michael Wyatt
Add a reusable workflow that sets up a fresh venv for each test and prints relevant environment info
-
- 20 Dec, 2022 2 commits
-
-
Authored by Jeff Rasley
-
Authored by Jeff Rasley
-
- 18 Dec, 2022 1 commit
-
-
Authored by Michael Wyatt
* added megatron unit test
* Update nv-megatron.yml
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 17 Dec, 2022 4 commits
-
-
Authored by Connor Holmes
* Update cpuinfo AVX512 detection
* Missing conversion from `_mm256` to `_mm256i`
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Authored by Alexander Jipa
taking gradient accumulation steps into account for throughput calculation
Co-authored-by: Alexander Jipa <azzhipa@amazon.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
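One plausible reading of the commit above is that a throughput figure should count every micro-batch processed between optimizer steps, not just one. The following is a minimal sketch of that arithmetic, not DeepSpeed's actual code; the function name and all parameter names are hypothetical.

```python
def samples_per_second(micro_batch_size: int,
                       gradient_accumulation_steps: int,
                       world_size: int,
                       step_time_s: float) -> float:
    """Samples processed per second across one optimizer step.

    Hypothetical helper: each of `world_size` ranks runs
    `gradient_accumulation_steps` micro-batches per optimizer step,
    and `step_time_s` is the wall-clock time of that whole step.
    """
    if step_time_s <= 0:
        raise ValueError("step_time_s must be positive")
    samples_per_step = micro_batch_size * gradient_accumulation_steps * world_size
    return samples_per_step / step_time_s
```

Leaving out the accumulation factor would under-report throughput by exactly that factor whenever it is greater than 1.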
-
Authored by Lev Kurilenko
This PR removes the zero-inference GatheredParameters context from replace_with_policy, because zero-inference is no longer needed after the introduction of meta tensor support for BLOOM.
-
Authored by 郭叶军
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 16 Dec, 2022 3 commits
-
-
Authored by 郭叶军
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Authored by Jithun Nair
-
Authored by Michael Wyatt
-
- 14 Dec, 2022 3 commits
-
-
Authored by Rahil Bathwal
* bug fix for binary search for batch size
* fix binary search termination condition
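The commit above does not show the code it fixes. As a general illustration of why the termination condition matters in this kind of search, here is a minimal sketch that finds the largest batch size accepted by a predicate. The function name and the `fits` callback are hypothetical, and the sketch assumes `low >= 1` and that `fits` is monotone (once a size fails, every larger size fails too).

```python
from typing import Callable


def max_fitting_batch_size(low: int, high: int,
                           fits: Callable[[int], bool]) -> int:
    """Return the largest b in [low, high] with fits(b) true, or 0 if none.

    Hypothetical helper; assumes low >= 1 and a monotone `fits`.
    """
    best = 0
    # `low <= high` (not `low < high`) ensures the last candidate is tested.
    while low <= high:
        mid = (low + high) // 2
        if fits(mid):
            best = mid       # mid works, so look for something larger
            low = mid + 1
        else:
            high = mid - 1   # mid fails, so only smaller sizes can work
    return best
```

Moving `low` and `high` strictly past `mid` on every iteration shrinks the interval each time, so the loop always terminates.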
-
Authored by lokoppakmsft
* Move layer norm to new schedule
* Pre-commit fixes
* fix comments
* format fixes
* Merge unrolls
* format fixes
* camelCase
* format fixes
* revert unwanted file
* move pow2 function
* format fixes
Co-authored-by: Connor Holmes <connorholmes@microsoft.com>
-
Authored by Connor Holmes
* Migrate ops tests to new inference_ops marker
* Disable by default
* Add missing test cases
* Reorder such that inference_ops will run[fail] first
-
- 13 Dec, 2022 3 commits
-
-
Authored by Conglong Li
-
Authored by Conglong Li
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Authored by Jeff Rasley
-
- 10 Dec, 2022 2 commits
-
-
Authored by Jeff Rasley
-
Authored by Jeff Rasley
-
- 09 Dec, 2022 3 commits
-
-
Authored by Joe Mayer
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Authored by Joe Mayer
* Updating docs README with API update procedure.
* Addressing comments.
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Authored by Michael Wyatt
* added checkpoint sharding tests
-
- 08 Dec, 2022 3 commits
-
-
Authored by lokoppakmsft
* Add support for inputs > 2D
* use vec
* Add N-Dim support to Dequant kernel
* merge master and fix format
* Bug Fix
* fix num_bits
* Fix dequant
Co-authored-by: Connor Holmes <connorholmes@microsoft.com>
-
Authored by Quentin Anthony
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Authored by Lev Kurilenko
This PR updates the MegatronLayerPolicy to set megatron_v2=True, which is required to transpose properly in the replace_with_policy() function. With this change, together with PR #99 in the Megatron-DeepSpeed fork, the Megatron text-generation example works with DS inference.
-
- 07 Dec, 2022 2 commits
-
-
Authored by Reza Yazdani
* fix checkpoint loading when it is a dictionary
* fix some issues with saving ckpt & int8 inference
* fix quantized-inference & add generic support of checkpoint loading
* remove int8 hard-coded flag
* fix mlp return tensors
* fix several issues when loading checkpoints of GPT-J, GPT-NEOX, and OPT with different TP-size
* add more comments & description for checkpoint-loading module
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
Authored by Connor Holmes
* Officially drop Maxwell support
* Formatting
* Comparison mismatch fix
-
- 06 Dec, 2022 2 commits
-
-
Authored by Ma, Guokai
* allow bf16 model with fp32 gradient accumulation datatype
* allow fp32 gradient accumulation and bfloat16 model in amp mode
* alternative fix for grad accumulation type mismatch. In the case of zero optimizer we should have grad accum type == model data type
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Authored by Hayden
-
- 03 Dec, 2022 1 commit
-
-
Authored by Jeff Rasley
-
- 02 Dec, 2022 2 commits
-
-
Authored by Jeongseok Kang
-
Authored by Jeff Rasley
-
- 01 Dec, 2022 1 commit
-
-
Authored by AGUL
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 30 Nov, 2022 3 commits
-
-
Authored by Ma, Guokai
* Establish building block of abstract accelerator
* Change .*Tensor variable to @property
* [op builder] add op builder reflection to allow enumeration of builders in all_ops.py and builder_names.py
* change @abstractproperty to @property @abstractmethod
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Authored by Michael Wyatt
-
Authored by Cheng Li
* rollback ds config changes
* fix format
* Fix error when output_file is a relative path without a prefix (#2397)
Co-authored-by: Benjamin Steenhoek <benjaminjsteenhoek@gmail.com>
* fix results and exprs path to use absolute path
* use base64 encoded ds config as cmd arg
* fix format
* remove assert
* write out optimal config after tuning
* fix format
* no need to update ds config path when encoding ds config
* update
* do not use abs path for result and expr dir
* fix conflicts
* fix run mode
* fix format
* fix format
Co-authored-by: Benjamin Steenhoek <benjaminjsteenhoek@gmail.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
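The commit above mentions passing a base64-encoded ds config as a command-line argument. The general technique can be sketched as below; this is an illustration using only the Python standard library, not the project's actual implementation. The helper names and the sample config keys are hypothetical.

```python
import base64
import json


def encode_config(config: dict) -> str:
    """Serialize a config dict to JSON and encode it as URL-safe base64."""
    raw = json.dumps(config).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_config(encoded: str) -> dict:
    """Inverse of encode_config: decode base64 and parse the JSON."""
    raw = base64.urlsafe_b64decode(encoded.encode("ascii"))
    return json.loads(raw.decode("utf-8"))


# Hypothetical config used only for this round-trip demonstration.
example = {"train_batch_size": 32, "fp16": {"enabled": True}}
token = encode_config(example)
restored = decode_config(token)
```

The encoded string contains no quotes or spaces, so it can be passed as a single argument without shell escaping.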
-
- 29 Nov, 2022 1 commit
-
-
Authored by ShijieZZZZ
* report progress at gradient accumulation boundary
* format
* format
-
- 28 Nov, 2022 1 commit
-
-
Authored by Joe Mayer
* Adding gradient accumulation dtype config.
* Switching to new DtypeEnum
* Adding standalone check function, and unit tests
* Variable disambiguation
* Adding checks for unsupported states.
* Updating for PR comments.
* Reorganizing unit test.
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 24 Nov, 2022 2 commits
-
-
Authored by Ammar Ahmad Awan
* pass down the new DS inference config to replace_transformer_layer.
* remove quantize_settings and rename the ep_mp_group.
* Fix model_config passing. Fixes gptj issue with wrong output.
* fix small bug in gpt-neo.
Co-authored-by: Reza Yazdani and Michael Wyatt
-
Authored by Connor Holmes
* Change utilization of DS/Triton kernels
* add config at Clip-encoder
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
-