Error during training
Created by: neverland0621
Training on a single GPU fails with the following error:
Traceback (most recent call last):
File "train.py", line 183, in <module>
train(args, config, train_file_list, optimizer_method="momentum")
File "train.py", line 92, in train
exe.run(fluid.default_startup_program())
File "/home/jw/.conda/envs/Paddle/lib/python2.7/site-packages/paddle/fluid/executor.py", line 470, in run
self.executor.run(program.desc, scope, 0, True, True)
paddle.fluid.core.EnforceNotMet: Enforce failed. Expected allocating <= available, but received allocating:10243394109 > available:426114816.
Insufficient GPU memory to allocation. at [/paddle/paddle/fluid/platform/gpu_info.cc:120]
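For scale, the two numbers in the message are byte counts. A quick conversion in plain Python (nothing Paddle-specific; the helper name `to_gib` is just for illustration) shows the startup program asked for roughly 9.5 GiB while only about 0.4 GiB was free on the card:

```python
# Byte counts copied from the EnforceNotMet message above.
ALLOCATING = 10243394109
AVAILABLE = 426114816


def to_gib(num_bytes):
    """Convert a byte count to GiB (1 GiB = 2**30 bytes)."""
    return num_bytes / float(2 ** 30)


print("requested: {:.2f} GiB".format(to_gib(ALLOCATING)))
print("available: {:.2f} GiB".format(to_gib(AVAILABLE)))
print("ratio:     {:.1f}x".format(ALLOCATING / float(AVAILABLE)))
```

So little free memory may mean another process was already occupying most of the card, though the log alone does not confirm that.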
This looks like an out-of-memory problem, so I switched to 3 GPUs, but it still fails:
Traceback (most recent call last):
File "train.py", line 183, in <module>
train(args, config, train_file_list, optimizer_method="momentum")
File "train.py", line 92, in train
exe.run(fluid.default_startup_program())
File "/home/jw/.conda/envs/Paddle/lib/python2.7/site-packages/paddle/fluid/executor.py", line 470, in run
self.executor.run(program.desc, scope, 0, True, True)
paddle.fluid.core.EnforceNotMet: cudaGetDeviceCount failed in paddle::platform::GetCUDADeviceCount: no CUDA-capable device is detected at [/paddle/paddle/fluid/platform/gpu_info.cc:33]
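One thing that may be worth checking for this second error is which GPUs the process is allowed to see. The sketch below only reads the standard `CUDA_VISIBLE_DEVICES` environment variable. It is an assumption that this variable is how the 3 cards were selected, since the launch command is not shown here, and `visible_gpu_ids` is a hypothetical helper, not part of Paddle:

```python
import os


def visible_gpu_ids(env=None):
    """Return the GPU ids listed in CUDA_VISIBLE_DEVICES.

    Returns None when the variable is unset (CUDA then exposes all GPUs)
    and an empty list when it is set but empty (no GPU is exposed).
    """
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return None
    # Split on commas and drop blank entries such as a trailing comma.
    return [item.strip() for item in value.split(",") if item.strip()]


print("CUDA_VISIBLE_DEVICES -> {}".format(visible_gpu_ids()))
```

If this prints an empty list, or ids that do not exist on the machine, that would be consistent with "no CUDA-capable device is detected"; comparing the ids against the output of `nvidia-smi` is a reasonable next step.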
Could you please help? Thanks.