Add gradient merge for DistributedFusedLamb optimizer (#40177)
* add gradient merge for DistributedFusedLamb
* use master acc gradient
* fix CI ut
* polish
* remove math_function_impl.h change
* fix test_update_loss_scaling_op.py
* try to fix XPU/NPU CI
* add gm ut