Unverified commit bbe030c5, authored by Jeongseok Kang, committed by GitHub

docs: Update the recent url for Megatron-LM (#2564)

Parent c77d42dc
......@@ -184,7 +184,7 @@ When using DeepSpeed for model training, the profiler can be configured in the d
#### Example: Megatron-LM
-For information on running Megatron-LM with DeepSpeed, please refer to our tutorial [Megatron-LM](https://github.com/microsoft/DeepSpeedExamples/tree/master/Megatron-LM).
+For information on running Megatron-LM with DeepSpeed, please refer to our tutorial [Megatron-LM](https://github.com/microsoft/DeepSpeedExamples/tree/master/megatron/Megatron-LM).
An example output of 12-layer Megatron-LM model (`hidden_size = 8192, num_attention_heads = 32, batch_size = 1024, seq_length = 1024`) is shown below.
......