Forked from PaddlePaddle / Paddle
* refine reduce by cub
* optimize KernelDepthwiseConvFilterGrad
* optimize depthwise conv and reduce mean and reduce sum
* fix bug: dilation
* cuda arch and cuda 8 compatible