Optimization of kernels related to DeepLabv3+ (#13534)
* refine reduce by cub
* optimize KernelDepthwiseConvFilterGrad
* optimize depthwise conv and reduce mean and reduce sum
* fix bug: dilation
* cuda arch and cuda 8 compatible

test=release/1.0.0