Error when training scene text recognition: Check failed: cudaSuccess == cudaStat (0 vs. 38)
Created by: yeyupiaoling
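I am using Baidu's deep learning GPU cluster. It used to work without problems, but a cluster I newly created today fails with the following error: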
F0529 09:31:53.699651 39220 hl_cuda_device.cc:453] Check failed: cudaSuccess == cudaStat (0 vs. 38) Cuda Error: no CUDA-capable device is detected
*** Check failure stack trace: ***
@ 0x7f253560f18d google::LogMessage::Fail()
@ 0x7f25356114d8 google::LogMessage::SendToLog()
@ 0x7f253560ec9b google::LogMessage::Flush()
@ 0x7f25356123ae google::LogMessageFatal::~LogMessageFatal()
@ 0x7f25355bce40 hl_specify_devices_start()
@ 0x7f25355bd04d hl_start()
@ 0x7f25355496fe paddle::initMain()
@ 0x7f25355f5761 initPaddle()
@ 0x7f25351690b7 _wrap_initPaddle
@ 0x4c45fa PyEval_EvalFrameEx
@ 0x4c9d7f PyEval_EvalFrameEx
@ 0x4c2705 PyEval_EvalCodeEx
@ 0x4de69e (unknown)
@ 0x4b0c93 PyObject_Call
@ 0x4c6ef6 PyEval_EvalFrameEx
@ 0x4c2705 PyEval_EvalCodeEx
@ 0x4ca7df PyEval_EvalFrameEx
@ 0x4c2705 PyEval_EvalCodeEx
@ 0x4ca7df PyEval_EvalFrameEx
@ 0x4c2705 PyEval_EvalCodeEx
@ 0x4c24a9 PyEval_EvalCode
@ 0x4f19ef (unknown)
@ 0x4ec372 PyRun_FileExFlags
@ 0x4eaaf1 PyRun_SimpleFileExFlags
@ 0x49e208 Py_Main
@ 0x7f2538a30830 __libc_start_main
@ 0x49da59 _start
@ (nil) (unknown)
Aborted (core dumped)
Is this a PaddlePaddle problem, or a problem with the server?
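
The message "no CUDA-capable device is detected" is produced by the CUDA runtime itself, so one way to narrow this down might be to check whether the node exposes any GPU at all before PaddlePaddle is involved. The following is only a rough sketch, assuming a Linux node with Python 3 and the standard NVIDIA driver tools; the function name `probe_gpu_visibility` is made up for illustration and is not part of PaddlePaddle.

```python
import glob
import os
import shutil
import subprocess


def probe_gpu_visibility():
    """Collect basic facts about GPU visibility without importing PaddlePaddle.

    Hypothetical helper for illustration only; assumes a Linux host.
    """
    report = {
        # If this is set to an empty string, the CUDA runtime sees no GPUs.
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
        # The NVIDIA driver normally exposes GPUs as /dev/nvidia0, /dev/nvidia1, ...
        "device_nodes": sorted(glob.glob("/dev/nvidia[0-9]*")),
        # Location of nvidia-smi on PATH, or None if it is not installed.
        "nvidia_smi_path": shutil.which("nvidia-smi"),
        "nvidia_smi_output": None,
    }
    if report["nvidia_smi_path"]:
        try:
            # "nvidia-smi -L" lists the GPUs that the driver can see.
            result = subprocess.run(
                [report["nvidia_smi_path"], "-L"],
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                universal_newlines=True,
                timeout=30,
            )
            report["nvidia_smi_output"] = result.stdout.strip()
        except (OSError, subprocess.SubprocessError) as exc:
            report["nvidia_smi_output"] = "failed to run: {}".format(exc)
    return report


if __name__ == "__main__":
    for key, value in probe_gpu_visibility().items():
        print("{}: {}".format(key, value))
```

If this reports no device nodes and `nvidia-smi` is missing or lists no GPUs, the problem more likely lies with the newly created cluster (driver or GPU allocation) than with PaddlePaddle. If GPUs do show up here, the PaddlePaddle build or its CUDA version would be the next thing to look at.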