Testing images with the rcnn code fails with "Cannot malloc 382.813 MB GPU memory"
Created by: SunAhong1993
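When I run detection on images with a trained faster rcnn model, the error in the log below appears whenever the long side of the image exceeds a certain value (around 1900 pixels); images below that size are detected normally. The same error also appears if I feed multiple images in a single exe.run call. I have already set "export CUDA_VISIBLE_DEVICES=3", but the error is still reported on "device: 0".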
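On the device numbering: CUDA renumbers the visible devices starting from 0, so with CUDA_VISIBLE_DEVICES=3 the process sees only physical GPU 3, and the log reports it as "device: 0". The sketch below sets the variables from Python and prints that mapping. It assumes the variables are set before paddle is imported. The value 0.5 for FLAGS_fraction_of_gpu_memory_to_use (the flag named in the log below) is a hypothetical setting, and it is not confirmed to resolve this error.

```python
import os

# Set these before `import paddle` so the framework reads them at start-up.
os.environ["CUDA_VISIBLE_DEVICES"] = "3"
# Hypothetical value; the log below reports the current value as 0.
os.environ["FLAGS_fraction_of_gpu_memory_to_use"] = "0.5"

# CUDA enumerates the visible devices in the listed order, starting at 0.
visible = os.environ["CUDA_VISIBLE_DEVICES"].split(",")
logical_to_physical = {i: int(gpu) for i, gpu in enumerate(visible)}
print(logical_to_physical)  # {0: 3}
```

In other words, "device: 0" in the log does not by itself mean that the export was ignored.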
W0401 09:56:19.963274 30457 device_context.cc:263] Please NOTE: device: 0, CUDA Capability: 35, Driver API Version: 9.2, Runtime API Version: 9.0
W0401 09:56:19.963361 30457 device_context.cc:271] device: 0, cuDNN Version: 7.0.
W0401 09:56:19.963380 30457 device_context.cc:295] WARNING: device: 0. The installed Paddle is compiled with CUDNN 7.3, but CUDNN version in your machine is 7.1, which may cause serious incompatible bug. Please recompile or reinstall Paddle with compatible CUDNN version.
W0401 09:56:22.391635 30457 system_allocator.cc:122] Cannot malloc 382.813 MB GPU memory. Please shrink FLAGS_fraction_of_gpu_memory_to_use environment variable to a lower value. Current value is 0
W0401 09:56:22.391923 30457 legacy_allocator.cc:191] Cannot allocate 382.812500MB in GPU 0, available 177.562500MB
W0401 09:56:22.391957 30457 legacy_allocator.cc:194] total 11996954624
W0401 09:56:22.391997 30457 legacy_allocator.cc:195] GpuMinChunkSize 256.000000B
W0401 09:56:22.392019 30457 legacy_allocator.cc:198] GpuMaxChunkSize 0.000000B
W0401 09:56:22.392050 30457 legacy_allocator.cc:201] GPU memory used: 0.000000B
Traceback (most recent call last):
File "infer2.py", line 117, in <module>
infer()
File "infer2.py", line 98, in infer
return_numpy=False)
File "/home/XX/anaconda3/envs/icaffe/lib/python3.5/site-packages/paddle/fluid/executor.py", line 525, in run
use_program_cache=use_program_cache)
File "/home/XX/anaconda3/envs/icaffe/lib/python3.5/site-packages/paddle/fluid/executor.py", line 591, in _run
exe.run(program.desc, scope, 0, True, True)
RuntimeError: parallel_for failed: out of memory
terminate called after throwing an instance of 'paddle::platform::EnforceNotMet'
what(): cudaFree{Host} failed in GPUAllocator::Free.: an illegal memory access was encountered at [/paddle/paddle/fluid/memory/detail/system_allocator.cc:150]
PaddlePaddle Call Stacks:
0 0x7f764b67a885p void paddle::platform::EnforceNotMet::Init<char const*>(char const*, char const*, int) + 357
1 0x7f764b67ac09p paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int) + 137
2 0x7f764d13b7abp paddle::memory::detail::GPUAllocator::Free(void*, unsigned long, unsigned long) + 187
3 0x7f764d139852p paddle::memory::detail::BuddyAllocator::Free(void*) + 1122
4 0x7f764d135177p void paddle::memory::legacy::Free<paddle::platform::CUDAPlace>(paddle::platform::CUDAPlace const&, void*, unsigned long) + 39
5 0x7f764d1351edp paddle::memory::allocation::LegacyAllocator::Free(paddle::memory::allocation::Allocation*) + 77
6 0x7f764b67d2d9p std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 57
7 0x7f764b67e308p paddle::framework::Variable::PlaceholderImpl<paddle::framework::LoDTensor>::~PlaceholderImpl() + 56
8 0x7f764d0da49dp paddle::framework::Scope::~Scope() + 157
9 0x7f764d0da3b1p paddle::framework::Scope::DropKids() + 65
10 0x7f764d0da41dp paddle::framework::Scope::~Scope() + 29
11 0x7f764b7d1306p paddle::framework::ScopePool::DeleteScope(paddle::framework::Scope*) + 22
12 0x7f764b7d1361p paddle::framework::ScopePool::Clear() + 65
13 0x7f767b1343e0p
14 0x7f767b1d2112p
15 0x7f767b2950abp
16 0x7f767b1d79a4p
17 0x7f767b2e032ap _PyGC_CollectNoFail + 42
18 0x7f767b2700c0p PyImport_Cleanup + 608
19 0x7f767b2e03a6p Py_Finalize + 86
20 0x7f767b2f15c1p Py_Main + 897
21 0x7f767b1bb571p main + 225
22 0x7f767a901b45p __libc_start_main + 245
23 0x7f767b293f38p
*** Aborted at 1554083783 (unix time) try "date -d @1554083783" if you are using GNU date ***
PC: @ 0x0 (unknown)
*** SIGABRT (@0x1ff000076f9) received by PID 30457 (TID 0x7f767b0c1740) from PID 30457; stack trace: ***
@ 0x7f767acb0130 (unknown)
@ 0x7f767a9159d9 __GI_raise
@ 0x7f767a9170e8 __GI_abort
@ 0x7f76666c93df __gnu_cxx::__verbose_terminate_handler()
@ 0x7f76666c7b16 __cxxabiv1::__terminate()
@ 0x7f76666c6f91 __cxa_call_terminate
@ 0x7f76666c779d __gxx_personality_v0
@ 0x7f7672f3cf56 _Unwind_RaiseException_Phase2
@ 0x7f7672f3d3e9 _Unwind_Resume
@ 0x7f764d139ba5 paddle::memory::detail::BuddyAllocator::Free()
@ 0x7f764d135177 paddle::memory::legacy::Free<>()
@ 0x7f764d1351ed paddle::memory::allocation::LegacyAllocator::Free()
@ 0x7f764b67d2d9 std::_Sp_counted_base<>::_M_release()
@ 0x7f764b67e308 paddle::framework::Variable::PlaceholderImpl<>::~PlaceholderImpl()
@ 0x7f764d0da49d paddle::framework::Scope::~Scope()
@ 0x7f764d0da3b1 paddle::framework::Scope::DropKids()
@ 0x7f764d0da41d paddle::framework::Scope::~Scope()
@ 0x7f764b7d1306 paddle::framework::ScopePool::DeleteScope()
@ 0x7f764b7d1361 paddle::framework::ScopePool::Clear()
@ 0x7f767b1343e0 capsule_dealloc.cold.413
@ 0x7f767b1d2112 dict_dealloc
@ 0x7f767b2950ab module_clear
@ 0x7f767b1d79a4 collect
@ 0x7f767b2e032a _PyGC_CollectNoFail
@ 0x7f767b2700c0 PyImport_Cleanup
@ 0x7f767b2e03a6 Py_Finalize
@ 0x7f767b2f15c1 Py_Main
@ 0x7f767b1bb571 main
@ 0x7f767a901b45 __libc_start_main
@ 0x7f767b293f38 (unknown)
Aborted
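Since the failure depends on the image size, one possible workaround is to cap the long side of each image before feeding it, and to feed one image per exe.run call. The following is a minimal sketch of the resize rule commonly used in Faster R-CNN preprocessing: scale the short side to a target size, but never let the long side exceed a maximum. The function name resized_shape and the values 800 and 1333 are assumptions for illustration; they are not taken from infer2.py.

```python
def resized_shape(height, width, target_size=800, max_size=1333):
    """Return the (height, width) an image would have after resizing.

    The short side is scaled to target_size unless that would push the
    long side past max_size, in which case the long side is scaled to
    max_size instead. Both defaults are hypothetical.
    """
    short_side = min(height, width)
    long_side = max(height, width)
    scale = target_size / short_side
    if long_side * scale > max_size:
        # Cap the long side so memory use stays bounded.
        scale = max_size / long_side
    return int(round(height * scale)), int(round(width * scale))


# A 1080 x 1920 image has its long side capped at 1333.
print(resized_shape(1080, 1920))  # (750, 1333)
# A 600 x 800 image is scaled up so its short side becomes 800.
print(resized_shape(600, 800))    # (800, 1067)
```

The actual resize would then be done with whatever image library the inference script already uses, and any box coordinates returned by the model would need to be divided by the same scale to map back to the original image.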