* refine reduce by cub
* optimize KernelDepthwiseConvFilterGrad
* optimize depthwise conv and reduce mean and reduce sum
* fix bug: dilation
* cuda arch and cuda 8 compatible

test=release/1.0.0