CUDA code should be compiled with -O0 in debug builds
Created by: emailweixu
Currently, the CUDA source code (.cu files) is always compiled with the -O3 optimization option (https://github.com/baidu/Paddle/blob/develop/CMakeLists.txt#L109), which makes debugging difficult. The flag should be chosen according to the CMAKE_BUILD_TYPE option.
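
A minimal sketch of what this could look like, assuming the build passes nvcc options through the `CUDA_NVCC_FLAGS` variable of CMake's FindCUDA module (I have not checked how the current CMakeLists.txt is organized around that line, so treat the placement and the exact flag set as assumptions):

```cmake
# Sketch only: pick the nvcc optimization level from CMAKE_BUILD_TYPE.
# Assumes nvcc flags are collected in CUDA_NVCC_FLAGS (FindCUDA module).
if(CMAKE_BUILD_TYPE STREQUAL "Debug")
  # -O0 turns off host-code optimization; -g adds host debug symbols.
  # Adding -G as well would emit device debug info and disable device
  # optimizations, at a noticeable runtime cost.
  list(APPEND CUDA_NVCC_FLAGS "-O0" "-g")
else()
  # Keep the current behavior for all other build types.
  list(APPEND CUDA_NVCC_FLAGS "-O3")
endif()
```

With something like this in place, configuring with `cmake -DCMAKE_BUILD_TYPE=Debug ..` would build the .cu files unoptimized, while other build types keep -O3.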