Forked from PaddlePaddle / PaddleDetection
* Always use `dynload::` for CUDA functions.
* Fix `cublas.h` usage that bypassed DSO loading.