Unverified commit 22d5a1f3, authored by Stas Bekman, committed by GitHub

[doc] pipeline (#888)

* [doc] pipeline

As @g-karthik flagged in https://github.com/microsoft/DeepSpeed/pull/659#discussion_r600132598, my previous correction PR had one sentence that said the wrong thing, so this PR attempts to rectify that.

Thank you!

* tweak
Parent 9e9f8cbe
@@ -276,9 +276,9 @@ For example, a machine with 16 GPUs must have as much local CPU memory as 16 tim
DeepSpeed provides a `LayerSpec` class that delays the construction of
modules until the model layers have been partitioned across workers.
-Then each worker will allocate only the layers it's assigned to. So, continuing the
-example from the previous paragraph, a machine with 16 GPUs will need to allocate a
-total of 1x model size on its CPU, compared to 16x in the LayerSpec example.
+Then each worker will allocate only the layers it's assigned to. So, comparing to the
+example from the previous paragraph, using `LayerSpec` a machine with 16 GPUs will need to
+allocate a total of 1x model size on its CPU memory and not 16x.
Here is an example of the abbreviated AlexNet model, but expressed only
with `LayerSpec`s. Note that the syntax is almost unchanged: `nn.ReLU(inplace=True)`
......
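
The 1x versus 16x claim in the corrected sentence can be checked with a small, dependency-free sketch. It does not use DeepSpeed: `LazySpec`, `FakeLayer`, and `partition` are hypothetical stand-ins invented for illustration, and memory is approximated by a toy parameter count. The only idea taken from the doc is that a spec records how to build a layer, and each worker builds only the layers assigned to it.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class LazySpec:
    """Hypothetical stand-in for a deferred layer spec (not DeepSpeed's class).

    It stores how to build a layer instead of the layer itself.
    """
    factory: Callable[..., Any]
    args: tuple = ()
    kwargs: dict = field(default_factory=dict)

    def build(self):
        # Construction (and therefore allocation) happens only here.
        return self.factory(*self.args, **self.kwargs)


class FakeLayer:
    """Toy layer whose 'memory footprint' is just a parameter count."""

    def __init__(self, num_params):
        self.num_params = num_params


def partition(num_layers, num_workers):
    """Split layer indices into contiguous, near-equal (start, stop) chunks."""
    base, extra = divmod(num_layers, num_workers)
    bounds, start = [], 0
    for rank in range(num_workers):
        stop = start + base + (1 if rank < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


NUM_WORKERS = 16              # one worker per GPU, as in the doc's example
LAYER_PARAMS = [1000] * 32    # assumed toy model: 32 equally sized layers
model_size = sum(LAYER_PARAMS)

# Eager construction: every worker builds the whole model before partitioning.
eager_total = sum(
    sum(FakeLayer(p).num_params for p in LAYER_PARAMS)
    for _ in range(NUM_WORKERS)
)

# Deferred construction: every worker builds only its own slice of specs.
specs = [LazySpec(FakeLayer, (p,)) for p in LAYER_PARAMS]
lazy_total = 0
for start, stop in partition(len(specs), NUM_WORKERS):
    local_layers = [spec.build() for spec in specs[start:stop]]
    lazy_total += sum(layer.num_params for layer in local_layers)

print(eager_total // model_size)  # 16
print(lazy_total // model_size)   # 1
```

Summed over all 16 workers on the machine, eager construction allocates 16 copies of the model, while deferred construction allocates each layer exactly once, which is the 1x figure the corrected sentence states.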