Unverified commit 2abef1ef, authored by: Shaden Smith, committed by: GitHub

Updating MPU docs (#92)

Parent: bca23057
@@ -54,7 +54,7 @@ def initialize(args,
             step(), state_dict(), and load_state_dict() methods
         mpu: Optional: A model parallelism unit object that implements
-            get_model/data_parallel_group/rank/size()
+            get_{model,data}_parallel_{rank,group,world_size}()
         dist_init_required: Optional: Initializes torch.distributed
@@ -68,10 +68,11 @@ mpu.get_model_parallel_rank()
 mpu.get_model_parallel_group()
 mpu.get_model_parallel_world_size()
-mpu.get_data_parallel_rank/group/world_size()
+mpu.get_data_parallel_rank()
+mpu.get_data_parallel_group()
+mpu.get_data_parallel_world_size()
```
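The MPU interface renamed above can be sketched as a minimal object. This is a hypothetical `SimpleMPU` class, not DeepSpeed's or Megatron-LM's implementation; a real MPU builds its groups with `torch.distributed` and returns process-group handles, while this sketch returns plain rank lists so the group math is visible.

```python
class SimpleMPU:
    """Hypothetical model parallelism unit exposing the methods DeepSpeed
    expects: get_{model,data}_parallel_{rank,group,world_size}().
    Adjacent ranks form a model-parallel group; data-parallel groups
    stride across them."""

    def __init__(self, model_parallel_size, world_size, rank):
        assert world_size % model_parallel_size == 0
        self.model_parallel_size = model_parallel_size
        self.data_parallel_size = world_size // model_parallel_size
        self.world_size = world_size
        self.rank = rank

    # --- model-parallel accessors ---
    def get_model_parallel_rank(self):
        return self.rank % self.model_parallel_size

    def get_model_parallel_world_size(self):
        return self.model_parallel_size

    def get_model_parallel_group(self):
        # A real MPU returns a torch.distributed ProcessGroup; here we
        # return the member ranks of this process's model-parallel group.
        start = (self.rank // self.model_parallel_size) * self.model_parallel_size
        return list(range(start, start + self.model_parallel_size))

    # --- data-parallel accessors ---
    def get_data_parallel_rank(self):
        return self.rank // self.model_parallel_size

    def get_data_parallel_world_size(self):
        return self.data_parallel_size

    def get_data_parallel_group(self):
        # Ranks holding the same model partition, one per data-parallel replica.
        return list(range(self.rank % self.model_parallel_size,
                          self.world_size,
                          self.model_parallel_size))


mpu = SimpleMPU(model_parallel_size=2, world_size=8, rank=5)
print(mpu.get_model_parallel_rank())   # 1
print(mpu.get_data_parallel_rank())    # 2
print(mpu.get_model_parallel_group())  # [4, 5]
print(mpu.get_data_parallel_group())   # [1, 3, 5, 7]
```

An object with these six methods is what the `mpu` argument to `deepspeed.initialize` consumes, letting DeepSpeed restrict its data-parallel communication (e.g. gradient all-reduce) to the correct subgroup.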
### Integration with Megatron-LM
DeepSpeed is fully compatible with [Megatron](https://github.com/NVIDIA/Megatron-LM).
Please see the [Megatron-LM tutorial](tutorials/MegatronGPT2Tutorial.md) for details.