[Auto Parallel Performance] Optimizing data parallel Fuse-Allreduce-Overlapping (#48092)
* add depend
* add origin amp files
* fp16 distinguish None & False
* engine log
* dp add deps for graph exe
* add dep for grad clip
* dep ops in comm stream
* unittest