Add MultiTensorApply to calculate L2-Norm in DistributedFusedLamb optimizer (#39900)
* add multi tensor apply l2 norm
* add multi_tensor_apply code
* make sizeof(TensorMeta) smaller
* move code to distributed_fused_lamb_op.cu
* remove useless FLAGS