Call multiply_ instead of scale_ to avoid multiple DtoH copy. (#55589)
* Call multiply_ instead of scale_ to avoid multiple DtoH copy.
* Call _squared_l2_norm to calculate grad_clip.
* Fix import error.
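
For readers unfamiliar with the pattern these changes touch, here is a minimal plain-Python sketch of gradient clipping by global norm built from per-gradient squared L2 norms. It is not the code from this commit and does not use Paddle: `squared_l2_norm` and `clip_by_global_norm` are hypothetical helpers, and gradients are modelled as flat lists of floats. The point it illustrates is that a single clip coefficient is computed once and then multiplied into every gradient. As the commit title suggests, in the real code that coefficient would presumably stay a device tensor passed to `multiply_`, rather than being read back to the host for each `scale_` call.

```python
import math


def squared_l2_norm(grad):
    """Return the sum of squares of one gradient (a flat list of floats)."""
    return sum(g * g for g in grad)


def clip_by_global_norm(grads, clip_norm):
    """Scale all gradients in place so their global L2 norm is at most clip_norm.

    Returns the global norm measured before clipping.
    """
    # Global norm = sqrt of the sum of every gradient's squared L2 norm.
    global_norm = math.sqrt(sum(squared_l2_norm(g) for g in grads))
    # One coefficient, computed once and reused for every gradient.
    # It equals 1.0 when the global norm is already within the limit.
    coeff = clip_norm / max(global_norm, clip_norm)
    for grad in grads:
        for i in range(len(grad)):
            grad[i] *= coeff
    return global_norm


# Hypothetical example data: two "parameters" with two gradient entries each.
grads = [[3.0, 0.0], [0.0, 4.0]]
norm_before = clip_by_global_norm(grads, clip_norm=1.0)
norm_after = math.sqrt(sum(squared_l2_norm(g) for g in grads))
```

Summing squared norms and taking one square root at the end avoids a square root per gradient, and computing the coefficient once means the per-gradient step is a single elementwise multiply.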