deeplabv3训练报错
Created by: pele228
AIStudio 跑deeplabv3p_xception65模型, 完全相同的训练代码和数据,昨天可以跑。今天报错如下: /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py:782: UserWarning: The following exception is not an EOF exception. "The following exception is not an EOF exception.") Traceback (most recent call last): File "pdseg/train.py", line 518, in main(args) File "pdseg/train.py", line 505, in main train(cfg) File "pdseg/train.py", line 261, in train exe.run(startup_prog) File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 783, in run six.reraise(*sys.exc_info()) File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/six.py", line 693, in reraise raise value File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 778, in run use_program_cache=use_program_cache) File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 831, in _run_impl use_program_cache=use_program_cache) File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 905, in _run_program fetch_var_name) paddle.fluid.core_avx.EnforceNotMet:
C++ Call Stacks (More useful to developers):
0 std::string paddle::platform::GetTraceBackString<char const*>(char const*&&, char const*, int) 1 paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int) 2 paddle::platform::CUDADeviceContext::CUDADeviceContext(paddle::platform::CUDAPlace) 3 std::_Function_handler<std::unique_ptr<paddle::platform::DeviceContext, std::default_deletepaddle::platform::DeviceContext > (), std::reference_wrapper<std::_Bind_simple<paddle::platform::EmplaceDeviceContext<paddle::platform::CUDADeviceContext, paddle::platform::CUDAPlace>(std::map<paddle::platform::Place, std::shared_future<std::unique_ptr<paddle::platform::DeviceContext, std::default_deletepaddle::platform::DeviceContext > >, std::lesspaddle::platform::Place, std::allocator<std::pair<paddle::platform::Place const, std::shared_future<std::unique_ptr<paddle::platform::DeviceContext, std::default_deletepaddle::platform::DeviceContext > > > > >, paddle::platform::Place)::{lambda()#1} ()> > >::_M_invoke(std::_Any_data const&) 4 std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<std::unique_ptr<paddle::platform::DeviceContext, std::default_deletepaddle::platform::DeviceContext > >, std::__future_base::_Result_base::_Deleter>, std::unique_ptr<paddle::platform::DeviceContext, std::default_deletepaddle::platform::DeviceContext > > >::_M_invoke(std::_Any_data const&) 5 std::__future_base::_State_base::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>&, bool&) 6 std::__future_base::_Deferred_state<std::_Bind_simple<paddle::platform::EmplaceDeviceContext<paddle::platform::CUDADeviceContext, paddle::platform::CUDAPlace>(std::map<paddle::platform::Place, std::shared_future<std::unique_ptr<paddle::platform::DeviceContext, std::default_deletepaddle::platform::DeviceContext > >, std::lesspaddle::platform::Place, std::allocator<std::pair<paddle::platform::Place const, std::shared_future<std::unique_ptr<paddle::platform::DeviceContext, std::default_deletepaddle::platform::DeviceContext > > > > >, paddle::platform::Place)::{lambda()#1} ()>, std::unique_ptr<paddle::platform::DeviceContext, std::default_deletepaddle::platform::DeviceContext > >::_M_run_deferred() 7 paddle::platform::DeviceContextPool::Get(paddle::platform::Place const&) 8 paddle::framework::GarbageCollector::GarbageCollector(paddle::platform::Place const&, unsigned long) 9 paddle::framework::UnsafeFastGPUGarbageCollector::UnsafeFastGPUGarbageCollector(paddle::platform::CUDAPlace const&, unsigned long) 10 paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool) 11 paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocatorstd::string > const&, bool, bool)
Error Message Summary:
Error: An error occurred here. There is no accurate error hint for this error yet. We are continuously in the process of increasing hint for this kind of error check. It would be helpful if you could inform us of how this conversion went by opening a github issue. And we will resolve it with high priority.
- New issue link: https://github.com/PaddlePaddle/Paddle/issues/new
- Recommended issue content: all error stack information: out of memory at (/paddle/paddle/fluid/platform/device_context.cc:221)