update ut/doc for glm/codegen (#4057)

* update ut/doc for glm/codegen
* formatting/spacing on docs
* re-order/alphabetize the models

---------

Co-authored-by: N Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: N Logan Adams <loadams@microsoft.com>

85dc854b · mzl · GitHub · 4cde5da8

Showing 2 changed files with 6 additions and 9 deletions

docs/_tutorials/automatic-tensor-parallelism.md +4 -3

tests/unit/inference/test_inference.py +2 -6

--- a/docs/_tutorials/automatic-tensor-parallelism.md
+++ b/docs/_tutorials/automatic-tensor-parallelism.md
@@ -123,11 +123,14 @@ The following model families have been successfully tested with automatic tensor
 - albert
 - bert
 - bigbird_pegasus
+- bloom
 - camembert
+- codegen
 - deberta_v2
 - electra
 - ernie
 - esm
+- glm
 - gpt-j
 - gpt-neo
 - gpt-neox
@@ -136,6 +139,7 @@ The following model families have been successfully tested with automatic tensor
 - llama
 - m2m_100
 - marian
+- mpt
 - mvp
 - nezha
 - openai
@@ -151,14 +155,11 @@ The following model families have been successfully tested with automatic tensor
 - xglm
 - xlm_roberta
 - yoso
-- bloom
-- mpt

 # Unsupported Models

 The following models are not currently supported with automatic tensor parallelism. They may still be compatible with other DeepSpeed features (e.g., kernel injection for Bloom):

-- codegen
 - deberta
 - flaubert
 - fsmt

--- a/tests/unit/inference/test_inference.py
+++ b/tests/unit/inference/test_inference.py
@@ -478,12 +478,8 @@ class TestInjectionPolicy(DistributedTest):
 @pytest.mark.seq_inference
 @pytest.mark.parametrize(
    "model_w_task",
-    [
-        ("Helsinki-NLP/opus-mt-en-de", "translation"),
-    ],
-    ids=[
-        "marian",
-    ],
+    [("Helsinki-NLP/opus-mt-en-de", "translation"), ("Salesforce/codegen-350M-mono", "text-generation")],
+    ids=["marian", "codegen"],  #codegen has fusedqkv weight.
 )
 @pytest.mark.parametrize("dtype", [torch.float16], ids=["fp16"])
 class TestAutoTensorParallelism(DistributedTest):
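
The inline comment on the new `codegen` case notes that the model has a fused QKV weight. As a rough illustration of why that matters for tensor parallelism, the sketch below splits a fused weight across ranks. It is only a sketch: it assumes a plain row-stacked `[Q; K; V]` layout, which is not necessarily CodeGen's real layout or how DeepSpeed shards it, and `shard_fused_qkv` and `naive_shard` are hypothetical helper names, not DeepSpeed APIs.

```python
def shard_fused_qkv(weight, world_size, rank):
    """Return the rows of a fused [Q; K; V] weight owned by `rank`.

    `weight` is a list of rows with all Q rows first, then K, then V.
    This layout is an assumption made for illustration only.
    """
    if len(weight) % 3 != 0:
        raise ValueError("fused weight must stack equally sized Q, K and V blocks")
    hidden = len(weight) // 3
    if hidden % world_size != 0:
        raise ValueError("hidden size must be divisible by world_size")
    per_rank = hidden // world_size

    shard = []
    for block in range(3):  # 0 = Q, 1 = K, 2 = V
        start = block * hidden + rank * per_rank
        shard.extend(weight[start:start + per_rank])
    return shard


def naive_shard(weight, world_size, rank):
    """Split the fused weight into contiguous chunks, ignoring the Q/K/V blocks."""
    per_rank = len(weight) // world_size
    return weight[rank * per_rank:(rank + 1) * per_rank]


# Toy fused weight with hidden size 4; each "row" is just a label.
fused = [f"{name}{i}" for name in "QKV" for i in range(4)]

print(shard_fused_qkv(fused, world_size=2, rank=0))  # ['Q0', 'Q1', 'K0', 'K1', 'V0', 'V1']
print(naive_shard(fused, world_size=2, rank=0))      # ['Q0', 'Q1', 'Q2', 'Q3', 'K0', 'K1']
```

Under these assumptions, the contiguous split hands rank 0 all of Q and half of K but none of V, whereas the block-aware split gives every rank a matching slice of each projection. That difference is one reason a fused-QKV model is a useful extra test case.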