optimize elementwise_mul_grad using new interfaces (#37728)
* init commit: new elem_mul_grad
* add template specialization for complex in multiply
* address review comments
* correct dx and dy computation when T is complex
* address review comments
* update to new ReduceFunctor
* mul-output broadcast
* call functions
* call functions with comments
* remove comments
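
For readers unfamiliar with what the gradient of an elementwise multiply has to compute, the following is a minimal pure-Python sketch, not the kernel code from this commit. It assumes the common convention that for `out = x * y` the gradients are `dx = dout * conj(y)` and `dy = dout * conj(x)` (the conjugate is a no-op for real types), and that `y` is a single row broadcast across the rows of `x`, so `dy` must be reduced over the broadcast axis. The function name `mul_grad` and the list-of-rows layout are hypothetical and chosen only for illustration.

```python
def mul_grad(x, y, dout):
    """Sketch of the backward pass of out = x * y with y broadcast over rows.

    x, dout: list of rows (each row a list of numbers, real or complex).
    y:       a single row, broadcast across every row of x.
    Assumed convention: dx = dout * conj(y), dy = sum_rows(dout * conj(x)).
    """
    def conj(v):
        # For real inputs this leaves the value unchanged (imaginary part is 0).
        return complex(v).conjugate()

    # dx has the shape of x: scale each dout element by the conjugate of
    # the y value that was broadcast to that column.
    dx = [[d * conj(yj) for d, yj in zip(drow, y)] for drow in dout]

    # dy has the shape of y: every row contributed to the same y element,
    # so the per-row products are summed over the broadcast axis.
    dy = [
        sum(drow[j] * conj(xrow[j]) for drow, xrow in zip(dout, x))
        for j in range(len(y))
    ]
    return dx, dy


# Real-valued usage: y = [10, 20] is broadcast over two rows of x.
dx, dy = mul_grad([[1, 2], [3, 4]], [10, 20], [[1, 1], [1, 1]])
```

The reduction in `dy` is the step that a reduce functor would handle in a real kernel; here it is simply a Python `sum` over rows.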