Running 02.recognize_digits from the book examples on GPU fails with "CUDA error: invalid device function"
Created by: shiyazhou121
Problem description: I followed the CentOS 6.3 instructions for installing the GPU version of paddle from the course "[AI Learning] PaddlePaddle Deep Learning in Practice: Installing PaddlePaddle on Different Platforms" (http://learn.baidu.com/pages/index.html#/courseInfo/13655?courseId=13655&_k=usdv7x). I first installed python27-gcc482, then configured the GPU as shown in the video. Below is the environment variable I set for cudnn and cuda:
LD_LIBRARY_PATH=/usr/local/cuda/lib64:/home/work/cudnn/cudnn_v5/cuda/lib64:/usr/local/ganglia/lib64:/usr/local/apr/lib:/usr/local/cuda/lib64:/usr/local/cuda/lib:/usr/local/ganglia/lib64:/usr/local/apr/lib:/usr/local/cuda/lib64:/usr/local/cuda/lib::/home/work/cuda-8.0/lib64:/home/work/cuda-8.0/lib:/home/HGCP_Program/software-install/hadoop-v2/hadoop/lib:/home/HGCP_Program/software-install/hadoop-v2/hadoop/libhce:/home/HGCP_Program/software-install/hadoop-v2/hadoop/libhdfs:/home/HGCP_Program/software-install/openmpi-1.8.5/lib:/home/work/cuda-8.0/lib64:/home/work/cuda-8.0/lib:/home/HGCP_Program/software-install/hadoop-v2/hadoop/lib:/home/HGCP_Program/software-install/hadoop-v2/hadoop/libhce:/home/HGCP_Program/software-install/hadoop-v2/hadoop/libhdfs:/home/HGCP_Program/software-install/openmpi-1.8.5/lib
After this configuration, running 02.recognize_digits from the book fails. The full log is below:
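The LD_LIBRARY_PATH value above repeats several entries and lists two different CUDA locations (/usr/local/cuda and /home/work/cuda-8.0). As a possible aid when reporting or debugging this, here is a minimal Python sketch that prints the unique CUDA-related entries in search order and whether each directory exists. The helper names `unique_entries` and `cuda_related` are hypothetical and not part of PaddlePaddle; the script only inspects the variable and does not by itself explain the error.

```python
import os


def unique_entries(path_value):
    """Split a colon-separated path list, dropping empty and repeated entries
    while keeping the order of first occurrence (the loader's search order)."""
    seen = set()
    result = []
    for entry in path_value.split(":"):
        if entry and entry not in seen:
            seen.add(entry)
            result.append(entry)
    return result


def cuda_related(entries):
    """Keep only entries whose path mentions cuda or cudnn (case-insensitive)."""
    return [e for e in entries if "cuda" in e.lower() or "cudnn" in e.lower()]


if __name__ == "__main__":
    entries = unique_entries(os.environ.get("LD_LIBRARY_PATH", ""))
    for entry in cuda_related(entries):
        status = "exists" if os.path.isdir(entry) else "missing"
        print(entry, "(" + status + ")")
```

Running it in the same shell that launches the training script shows which CUDA directory comes first in the search order, which may help whoever answers compare it with the CUDA version the installed paddle package was built against.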
I1114 14:56:59.516850 4275 Util.cpp:166] commandline: --use_gpu=1 --trainer_count=1
W1114 14:57:08.683694 4275 CpuId.h:112] PaddlePaddle wasn't compiled to use avx instructions, but these are available on your machine and could speed up CPU computations via CMAKE .. -DWITH_AVX=ON
[INFO 2017-11-14 14:57:08,688 layers.py:2539] output for __conv_pool_0___conv: c = 20, h = 24, w = 24, size = 11520
[INFO 2017-11-14 14:57:08,689 layers.py:2667] output for __conv_pool_0___pool: c = 20, h = 12, w = 12, size = 2880
[INFO 2017-11-14 14:57:08,690 layers.py:2539] output for __conv_pool_1___conv: c = 50, h = 8, w = 8, size = 3200
[INFO 2017-11-14 14:57:08,691 layers.py:2667] output for __conv_pool_1___pool: c = 50, h = 4, w = 4, size = 800
F1114 14:57:08.697180 4275 hl_gpu_matrix_kernel.cuh:181] Check failed: cudaSuccess == err (0 vs. 8) [hl_gpu_apply_unary_op failed] CUDA error: invalid device function
*** Check failure stack trace: ***
@ 0x7fe360c605ed google::LogMessage::Fail()
@ 0x7fe360c6409c google::LogMessage::SendToLog()
@ 0x7fe360c600e3 google::LogMessage::Flush()
@ 0x7fe360c655ae google::LogMessageFatal::~LogMessageFatal()
@ 0x7fe360aeaec4 hl_gpu_apply_unary_op<>()
@ 0x7fe360aeb205 paddle::BaseMatrixT<>::applyUnary<>()
@ 0x7fe360aeb433 paddle::BaseMatrixT<>::zero()
@ 0x7fe3609868d1 paddle::Parameter::enableType()
@ 0x7fe3609821cc paddle::parameterInitNN()
@ 0x7fe36098491a paddle::NeuralNetwork::init()
@ 0x7fe3609ad491 paddle::GradientMachine::create()
@ 0x7fe360c3d3b3 GradientMachine::createFromPaddleModelPtr()
@ 0x7fe360c3d58f GradientMachine::createByConfigProtoStr()
@ 0x7fe36084c4cd _wrap_GradientMachine_createByConfigProtoStr
@ 0x4b4cb9 PyEval_EvalFrameEx
@ 0x4b6b28 PyEval_EvalCodeEx
@ 0x4b5d10 PyEval_EvalFrameEx
@ 0x4b6b28 PyEval_EvalCodeEx
@ 0x4b5d10 PyEval_EvalFrameEx
@ 0x4b6b28 PyEval_EvalCodeEx
@ 0x52940f function_call
@ 0x422cba PyObject_Call
@ 0x4271ad instancemethod_call
@ 0x422cba PyObject_Call
@ 0x48121f slot_tp_init
@ 0x47eb1a type_call
@ 0x422cba PyObject_Call
@ 0x4b31dd PyEval_EvalFrameEx
@ 0x4b6b28 PyEval_EvalCodeEx
@ 0x4b5d10 PyEval_EvalFrameEx
@ 0x4b6b28 PyEval_EvalCodeEx
@ 0x4b6c52 PyEval_EvalCode
Aborted
I then tried the other book examples, and all of them fail with the same error. What causes this, and how can it be fixed?