Fix ZeRO parameter initialization for tensors with `requires_grad=True` (#4138)
* Fix ZeRO parameter initialization for tensors with `requires_grad=True`
* Simplify detach logic
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>