implementation of broadcast div backward by reduce (#38044)
* add elementwise div
* move mul and div grad functor
* Combine multiple CUDA kernels
* Update the reduce interface call
* add multi-output
* add multi-output div
* add branch judge
* Package branch
* Combine the x and y functions into one
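
As a rough illustration of what "broadcast div backward by reduce" means mathematically, here is a minimal pure-Python sketch. It is not the CUDA kernel from this commit. The function name `div_backward_broadcast` and the fixed layout (a 2-D `x` divided by a 1-D `y` broadcast over rows) are assumptions made for illustration only. For `out = x / y`, the gradient with respect to `x` is elementwise, while the gradient with respect to `y` is computed at full shape and then reduced (summed) over the broadcast axis.

```python
def div_backward_broadcast(x, y, dout):
    """Gradients of out = x / y, where y (length cols) is broadcast over the rows of x.

    x, dout: nested lists of shape rows x cols
    y:       list of length cols
    Hypothetical helper for illustration; not part of any library API.
    """
    rows, cols = len(x), len(y)

    # dx keeps the shape of x: dx[i][j] = dout[i][j] / y[j]
    dx = [[dout[i][j] / y[j] for j in range(cols)] for i in range(rows)]

    # dy: the elementwise term -dout * x / y^2 has the full rows x cols shape,
    # so it is summed over the broadcast axis (rows) to recover y's shape.
    dy = [
        sum(-dout[i][j] * x[i][j] / (y[j] * y[j]) for i in range(rows))
        for j in range(cols)
    ]
    return dx, dy


x = [[2.0, 4.0], [6.0, 8.0]]
y = [2.0, 4.0]
dout = [[1.0, 1.0], [1.0, 1.0]]
dx, dy = div_backward_broadcast(x, y, dout)
```

The point the sketch is meant to show is that both gradients are derived from the same inputs in a single pass, with only `dy` needing the extra reduction step.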