Unverified commit e976accb, author: Manuel R. Ciosici, committer: GitHub

Fix typo (#1501)

Parent 56635d5b
......@@ -119,7 +119,7 @@ The following calculations show how much memory is required by model params, gra
The optimizer states assume that ``Adam`` is used, where 4 bytes per parameter are used by momentum and another 4 by variance (8 in total).
-Gradients at ``fp32`` take 4 bytes, and parameters take 2 bytes at ``fp16` and 4 bytes at ``fp32``.
+Gradients at ``fp32`` take 4 bytes, and parameters take 2 bytes at ``fp16`` and 4 bytes at ``fp32``.
**GPU RAM**
......
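The per-parameter figures quoted in the hunk above can be added up directly. The sketch below is only an illustration of that arithmetic: the names `bytes_per_param` and `total_gib` are hypothetical, and it counts only the three items named in the visible lines (parameters, ``fp32`` gradients, ``Adam`` states). The elided parts of the documentation may account for further buffers, so treat the result as a lower bound rather than the document's full formula.

```python
# Byte counts taken from the diff context above; nothing else is assumed.
ADAM_STATE_BYTES = 4 + 4            # momentum (4) + variance (4)
GRAD_BYTES_FP32 = 4                 # gradients kept at fp32
PARAM_BYTES = {"fp16": 2, "fp32": 4}


def bytes_per_param(param_dtype: str) -> int:
    """Sum of parameter, gradient, and Adam-state bytes for one parameter."""
    return PARAM_BYTES[param_dtype] + GRAD_BYTES_FP32 + ADAM_STATE_BYTES


def total_gib(num_params: int, param_dtype: str) -> float:
    """Memory in GiB for num_params parameters (hypothetical helper)."""
    return num_params * bytes_per_param(param_dtype) / 2**30


if __name__ == "__main__":
    for dtype in ("fp16", "fp32"):
        print(dtype, bytes_per_param(dtype), "bytes per parameter")
```

Under these assumptions, a model with ``fp16`` parameters needs 14 bytes per parameter and one with ``fp32`` parameters needs 16.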