conv_2d cuDNN operator slow (#7902) · Issue · PaddlePaddle / Paddle

conv_2d cuDNN operator slow

Created by: tonyyang-svail

While benchmarking https://github.com/dzhwinter/benchmark/blob/master/fluid/mnist.py, I found that conv_2d with cuDNN is slow. It turns out that the ~~element_wise_add~~ element_wise_add_grad for the bias takes about 80% of the time, while the actual cudnn_conv takes only 10%.
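To make it concrete what the bias gradient has to compute, here is a minimal pure-Python sketch. It assumes an NCHW layout and a per-channel bias; the function names are hypothetical and this is not the Fluid/Eigen kernel. The forward pass is a plain broadcast add, while the gradient with respect to the bias is a reduction over N, H, and W, which may be part of why the grad kernel is so much more expensive (that explanation is an assumption, not something confirmed by the profile).

```python
def bias_add_forward(x, bias):
    """x: nested list shaped [N][C][H][W]; bias: list of length C."""
    return [[[[v + bias[c] for v in row] for row in plane]
             for c, plane in enumerate(sample)]
            for sample in x]


def bias_add_grad(dout, num_channels):
    """Gradient w.r.t. bias: sum dout over N, H, W for each channel."""
    dbias = [0.0] * num_channels
    for sample in dout:
        for c, plane in enumerate(sample):
            for row in plane:
                dbias[c] += sum(row)
    return dbias


# Tiny example: N=1, C=2, H=W=2
x = [[[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]]
out = bias_add_forward(x, [10.0, 20.0])

# Upstream gradient with the same shape as the output
dout = [[[[1.0, 1.0], [1.0, 1.0]], [[0.5, 0.5], [0.5, 0.5]]]]
dbias = bias_add_grad(dout, 2)
```

The gradient with respect to the conv output is just `dout` passed through unchanged, so the reduction into `dbias` is the only real work in the backward pass of the add.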

@dzhwinter This looks pretty bad to me. Could you confirm whether you've seen similar results in your benchmarking? I am using nvprof and the NVIDIA Visual Profiler.

I am wondering if we can do one of the following:

- See if we can improve elementwise_add, though I am not sure how hard that is in Eigen. @reyoung
- Combine conv and add bias into one operator, like CudnnConvBaseLayer.cpp in v2.
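The second option can be illustrated with a naive pure-Python sketch. It assumes a single sample, stride 1, and no padding; `conv2d` and `add_bias` are hypothetical helpers, and this does not reproduce CudnnConvBaseLayer.cpp or any cuDNN call. It only shows that seeding the accumulator with the bias gives the same result as a separate elementwise add, without a second pass over the output.

```python
def conv2d(x, w, bias=None):
    """Naive cross-correlation.

    x: [C_in][H][W], w: [C_out][C_in][kH][kW], bias: [C_out] or None.
    When bias is given, it is fused into the accumulator.
    """
    c_in, h, wd = len(x), len(x[0]), len(x[0][0])
    kh, kw = len(w[0][0]), len(w[0][0][0])
    out = []
    for co in range(len(w)):
        plane = []
        for i in range(h - kh + 1):
            row = []
            for j in range(wd - kw + 1):
                # Fused path: start from the bias instead of zero
                acc = bias[co] if bias is not None else 0.0
                for ci in range(c_in):
                    for di in range(kh):
                        for dj in range(kw):
                            acc += x[ci][i + di][j + dj] * w[co][ci][di][dj]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def add_bias(y, bias):
    """Separate elementwise add: one extra pass over the conv output."""
    return [[[v + bias[co] for v in row] for row in plane]
            for co, plane in enumerate(y)]


x = [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]]       # C_in=1, 3x3
w = [[[[1.0, 0.0], [0.0, 1.0]]], [[[0.0, 1.0], [1.0, 0.0]]]]    # C_out=2, 2x2
bias = [0.5, -1.0]

fused = conv2d(x, w, bias)
unfused = add_bias(conv2d(x, w), bias)
```

Note that this covers only the forward pass. The hot spot reported above is the gradient, so a fused operator would also need to compute the bias gradient itself rather than delegating to element_wise_add_grad.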

We need to improve this, given how important this operator is for vision tasks.
