Problems running PyramidBox in the Docker images, and some suggestions
Created by: fengyuentau
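Related issue: https://github.com/PaddlePaddle/Paddle/issues/20667
Because of the Docker version on my machine, I can only use Docker images built for CUDA 9.0 or lower. The GPU is a P100 with 16 GB of memory.
First I pulled paddlepaddle/paddle:latest-gpu-cuda9.0-cudnn7 and tried to run the latest PyramidBox code from this repo, but hit the following error: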
root@df3a17988b31:~/pyramidbox# python3.6 -u widerface_eval.py --model_dir=/root/pyramidbox/models/PyramidBox_WiderFace
----------- Configuration Arguments -----------
confs_threshold: 0.15
data_dir: data/WIDER_val/images/
file_list: data/wider_face_split/wider_face_val_bbx_gt.txt
image_path:
infer: False
model_dir: /root/pyramidbox/models/PyramidBox_WiderFace
pred_dir: pred
use_gpu: True
use_pyramidbox: True
------------------------------------------------
Traceback (most recent call last):
File "widerface_eval.py", line 328, in <module>
exe, args.model_dir, main_program=infer_program)
File "/usr/local/lib/python3.6/site-packages/paddle/fluid/io.py", line 803, in load_persistables
filename=filename)
File "/usr/local/lib/python3.6/site-packages/paddle/fluid/io.py", line 643, in load_vars
filename=filename)
File "/usr/local/lib/python3.6/site-packages/paddle/fluid/io.py", line 664, in load_vars
assert var_temp != None, "can't not find var: " + each_var.name
AssertionError: can't not find var: conv2d_61.w_0
I can confirm that the model files are in the specified path, and that the file conv2d_61.w_0 exists there.
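One quick way to double-check this from inside the container is to test for the variable file directly. This is only a minimal sketch: `missing_vars` is a hypothetical helper, and it assumes each persistable variable is saved as a separate file named after the variable, as the file conv2d_61.w_0 suggests.

```python
import os

def missing_vars(model_dir, var_names):
    """Return the variable names that have no matching file in model_dir."""
    return [
        name for name in var_names
        if not os.path.isfile(os.path.join(model_dir, name))
    ]

# Path and variable name taken from the command and traceback above.
model_dir = "/root/pyramidbox/models/PyramidBox_WiderFace"
print(missing_vars(model_dir, ["conv2d_61.w_0"]))
```

An empty list means the file is present at the path the script was given.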
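From the related issue I learned that latest is built from the develop branch. Suspecting a problem in develop, I pulled paddlepaddle/paddle:1.5.0-cuda9.0-cudnn7 instead and ran the same PyramidBox code with the same steps. This time the variable was found, but the run failed with insufficient GPU memory: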
root@fe31a5b0b0bd:~/pyramidbox# python widerface_eval.py --model_dir=models/PyramidBox_WiderFace
----------- Configuration Arguments -----------
confs_threshold: 0.15
data_dir: data/WIDER_val/images/
file_list: data/wider_face_split/wider_face_val_bbx_gt.txt
image_path:
infer: False
model_dir: models/PyramidBox_WiderFace
pred_dir: pred
use_gpu: True
use_pyramidbox: True
------------------------------------------------
W1016 11:56:06.097822 14 device_context.cc:259] Please NOTE: device: 0, CUDA Capability: 60, Driver API Version: 10.1, Runtime API Version: 9.0
W1016 11:56:06.101693 14 device_context.cc:267] device: 0, cuDNN Version: 7.4.
Traceback (most recent call last):
File "widerface_eval.py", line 328, in <module>
exe, args.model_dir, main_program=infer_program)
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/io.py", line 742, in load_persistables
filename=filename)
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/io.py", line 608, in load_vars
filename=filename)
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/io.py", line 645, in load_vars
executor.run(load_prog)
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/executor.py", line 650, in run
use_program_cache=use_program_cache)
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/executor.py", line 748, in _run
exe.run(program.desc, scope, 0, True, True, fetch_var_name)
paddle.fluid.core_avx.EnforceNotMet: Invoke operator load error.
Python Callstacks:
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/framework.py", line 1748, in append_op
attrs=kwargs.get("attrs", None))
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/io.py", line 630, in load_vars
'file_path': os.path.join(load_dirname, new_var.name)
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/io.py", line 608, in load_vars
filename=filename)
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/io.py", line 742, in load_persistables
filename=filename)
File "widerface_eval.py", line 328, in <module>
exe, args.model_dir, main_program=infer_program)
C++ Callstacks:
Enforce failed. Expected allocating <= available, but received allocating:14920696460 > available:13418495744.
Insufficient GPU memory to allocation. at [/paddle/paddle/fluid/platform/gpu_info.cc:262]
PaddlePaddle Call Stacks:
0 0x7f23f14bfbc8p void paddle::platform::EnforceNotMet::Init<std::string>(std::string, char const*, int) + 360
1 0x7f23f14bff17p paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int) + 87
2 0x7f23f34d7996p paddle::platform::GpuMaxChunkSize() + 630
3 0x7f23f34abc8ap
4 0x7f24deb1ea99p
5 0x7f23f34ab32dp paddle::memory::legacy::GetGPUBuddyAllocator(int) + 109
6 0x7f23f34ac175p void* paddle::memory::legacy::Alloc<paddle::platform::CUDAPlace>(paddle::platform::CUDAPlace const&, unsigned long) + 37
7 0x7f23f34ac6b5p paddle::memory::allocation::LegacyAllocator::AllocateImpl(unsigned long) + 421
8 0x7f23f34a07d5p paddle::memory::allocation::AllocatorFacade::Alloc(boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, unsigned long) + 181
9 0x7f23f34a095ap paddle::memory::allocation::AllocatorFacade::AllocShared(boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, unsigned long) + 26
10 0x7f23f30adfccp paddle::memory::AllocShared(boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, unsigned long) + 44
11 0x7f23f3473204p paddle::framework::Tensor::mutable_data(boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>, paddle::framework::proto::VarType_Type, unsigned long) + 148
12 0x7f23f34768a4p paddle::framework::TensorCopy(paddle::framework::Tensor const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, paddle::platform::DeviceContext const&, paddle::framework::Tensor*) + 452
13 0x7f23f347a49bp paddle::framework::TensorFromStream(std::istream&, paddle::framework::Tensor*, paddle::platform::DeviceContext const&) + 699
14 0x7f23f30698d0p paddle::framework::DeserializeFromStream(std::istream&, paddle::framework::LoDTensor*, paddle::platform::DeviceContext const&) + 576
15 0x7f23f1f72f99p paddle::operators::LoadOpKernel<paddle::platform::CUDADeviceContext, float>::LoadLodTensor(std::istream&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, paddle::framework::Variable*, paddle::framework::ExecutionContext const&) const + 89
16 0x7f23f1f734c0p paddle::operators::LoadOpKernel<paddle::platform::CUDADeviceContext, float>::Compute(paddle::framework::ExecutionContext const&) const + 432
17 0x7f23f1f73883p std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::LoadOpKernel<paddle::platform::CUDADeviceContext, float>, paddle::operators::LoadOpKernel<paddle::platform::CUDADeviceContext, double>, paddle::operators::LoadOpKernel<paddle::platform::CUDADeviceContext, int>, paddle::operators::LoadOpKernel<paddle::platform::CUDADeviceContext, signed char>, paddle::operators::LoadOpKernel<paddle::platform::CUDADeviceContext, long> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&) + 35
18 0x7f23f341c907p paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, paddle::framework::RuntimeContext*) const + 375
19 0x7f23f341cce1p paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) const + 529
20 0x7f23f341a2dcp paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) + 332
21 0x7f23f164b38ep paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool) + 382
22 0x7f23f164e42fp paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocator<std::string> > const&, bool) + 143
23 0x7f23f14b0b2dp
24 0x7f23f14f25c6p
25 0x4c5326p PyEval_EvalFrameEx + 37958
26 0x4b9b66p PyEval_EvalCodeEx + 774
27 0x4c1f56p PyEval_EvalFrameEx + 24694
28 0x4b9b66p PyEval_EvalCodeEx + 774
29 0x4c17c6p PyEval_EvalFrameEx + 22758
30 0x4b9b66p PyEval_EvalCodeEx + 774
31 0x4c17c6p PyEval_EvalFrameEx + 22758
32 0x4b9b66p PyEval_EvalCodeEx + 774
33 0x4c17c6p PyEval_EvalFrameEx + 22758
34 0x4b9b66p PyEval_EvalCodeEx + 774
35 0x4c17c6p PyEval_EvalFrameEx + 22758
36 0x4b9b66p PyEval_EvalCodeEx + 774
37 0x4eb69fp
38 0x4e58f2p PyRun_FileExFlags + 130
39 0x4e41a6p PyRun_SimpleFileExFlags + 390
40 0x4938cep Py_Main + 1358
41 0x7f24de766830p __libc_start_main + 240
42 0x493299p _start + 41
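The two byte counts in the "Enforce failed" line can be compared directly. The sketch below parses that line and reports the shortfall. The helper name `parse_enforce_line` is hypothetical, and the regex is written only for this one message format.

```python
import re

# Error line copied from the C++ call stack above.
ERROR_LINE = (
    "Enforce failed. Expected allocating <= available, but received "
    "allocating:14920696460 > available:13418495744."
)

def parse_enforce_line(line):
    """Return (allocating, available) in bytes from the Enforce message."""
    match = re.search(r"allocating:(\d+) > available:(\d+)", line)
    if match is None:
        raise ValueError("line does not look like the Enforce message")
    return int(match.group(1)), int(match.group(2))

GIB = float(1024 ** 3)
allocating, available = parse_enforce_line(ERROR_LINE)
shortfall = allocating - available
ratio = available / float(allocating)

print("requested: %.2f GiB" % (allocating / GIB))
print("available: %.2f GiB" % (available / GIB))
print("shortfall: %.2f GiB" % (shortfall / GIB))
print("available / requested: %.3f" % ratio)
```

The request is about 1.4 GiB larger than what was free when the load op ran. If the size of that request is governed by Paddle's FLAGS_fraction_of_gpu_memory_to_use setting, lowering the fraction by at least this ratio would be one thing to try. That is an assumption; the log does not confirm it.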
From https://github.com/PaddlePaddle/models/issues/1259 and https://github.com/PaddlePaddle/models/issues/1262#issuecomment-422724707 I learned that the model may run out of GPU memory, and that adding a memory optimization pass can mitigate the problem:
infer_program, nmsed_out = network.infer(main_program)
fluid.memory_optimize(infer_program)
After adding the fluid.memory_optimize line, I could indeed run the WIDER FACE val and test sets, although Paddle 1.5.0 warns that this API is deprecated.
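Since fluid.memory_optimize is deprecated, one alternative that may be worth trying is Paddle's environment flags, which Paddle 1.x reads when `paddle` is imported. The sketch below only builds the environment for a child process. `FLAGS_eager_delete_tensor_gb` and `FLAGS_fraction_of_gpu_memory_to_use` are Paddle 1.x flags, but the values chosen here and the helper name `paddle_env` are assumptions, and it is untested whether they are enough for PyramidBox on this GPU.

```python
import os

def paddle_env(fraction=0.8, base=None):
    """Return a copy of the environment with memory-related Paddle flags set.

    fraction is an illustrative value, not a documented recommendation.
    """
    env = dict(os.environ if base is None else base)
    # Free tensors as soon as they are no longer needed (garbage collection).
    env["FLAGS_eager_delete_tensor_gb"] = "0.0"
    # Cap the share of GPU memory Paddle tries to reserve up front.
    env["FLAGS_fraction_of_gpu_memory_to_use"] = str(fraction)
    return env

env = paddle_env()
# Command taken from the run above; pass `env` to subprocess to launch it.
cmd = ["python", "widerface_eval.py", "--model_dir=models/PyramidBox_WiderFace"]
print(" ".join("%s=%s" % (k, env[k]) for k in sorted(env) if k.startswith("FLAGS_")))
print(" ".join(cmd))
```

Calling `subprocess.call(cmd, env=env)` inside the container would then be equivalent to exporting the two variables before invoking the script.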
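My suggestions:
- Could the PyramidBox Readme state which Paddle versions are supported?
- Could the PyramidBox Readme state how much GPU memory the model needs?
- Could you document how to apply memory optimization with the new API?
This would save users a great deal of time and effort. Thanks!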