Fruit recognition training fails with an error
Created by: KernelErr
Environment:
- Ubuntu 19.10
- Cudatoolkit 10.2
- cudnn 7.6.5
The following error occurs during training:
2020-03-24 10:30:43,998-INFO: Test finish iter 60
2020-03-24 10:30:43,998-INFO: Total number of images: 60, inference time: 45.83638231372806 fps.
2020-03-24 10:30:43,998-INFO: Start evaluate...
2020-03-24 10:30:44,049-INFO: Accumulating evaluatation results...
2020-03-24 10:30:44,053-INFO: mAP(0.50, 11point) = 8.63
2020-03-24 10:30:44,053-INFO: Save model to output/yolov3_mobilenet_v1_fruit/best_model.
2020-03-24 10:30:45,546-INFO: Best test box ap: 8.631082580643715, in iter: 200
/home/li/anaconda3/envs/Paddle/lib/python3.7/site-packages/paddle/fluid/executor.py:782: UserWarning: The following exception is not an EOF exception.
"The following exception is not an EOF exception.")
Traceback (most recent call last):
File "tools/train.py", line 323, in <module>
main()
File "tools/train.py", line 233, in main
outs = exe.run(compiled_train_prog, fetch_list=train_values)
File "/home/li/anaconda3/envs/Paddle/lib/python3.7/site-packages/paddle/fluid/executor.py", line 783, in run
six.reraise(*sys.exc_info())
File "/home/li/anaconda3/envs/Paddle/lib/python3.7/site-packages/six.py", line 703, in reraise
raise value
File "/home/li/anaconda3/envs/Paddle/lib/python3.7/site-packages/paddle/fluid/executor.py", line 778, in run
use_program_cache=use_program_cache)
File "/home/li/anaconda3/envs/Paddle/lib/python3.7/site-packages/paddle/fluid/executor.py", line 843, in _run_impl
return_numpy=return_numpy)
File "/home/li/anaconda3/envs/Paddle/lib/python3.7/site-packages/paddle/fluid/executor.py", line 677, in _run_parallel
tensors = exe.run(fetch_var_names)._move_to_list()
paddle.fluid.core_avx.EnforceNotMet:
--------------------------------------------
C++ Call Stacks (More useful to developers):
--------------------------------------------
0 std::string paddle::platform::GetTraceBackString<std::string const&>(std::string const&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int)
2 paddle::operators::reader::BlockingQueue<std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> > >::Receive(std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> >*)
3 paddle::operators::reader::PyReader::ReadNext(std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> >*)
4 std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<unsigned long>, std::__future_base::_Result_base::_Deleter>, unsigned long> >::_M_invoke(std::_Any_data const&)
5 std::__future_base::_State_base::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>&, bool&)
6 ThreadPool::ThreadPool(unsigned long)::{lambda()#1}::operator()() const
------------------------------------------
Python Call Stacks (More useful to users):
------------------------------------------
File "/home/li/anaconda3/envs/Paddle/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2525, in append_op
attrs=kwargs.get("attrs", None))
File "/home/li/anaconda3/envs/Paddle/lib/python3.7/site-packages/paddle/fluid/reader.py", line 733, in _init_non_iterable
outputs={'Out': self._feed_list})
File "/home/li/anaconda3/envs/Paddle/lib/python3.7/site-packages/paddle/fluid/reader.py", line 646, in __init__
self._init_non_iterable()
File "/home/li/anaconda3/envs/Paddle/lib/python3.7/site-packages/paddle/fluid/reader.py", line 280, in from_generator
iterable, return_list)
File "/home/li/development/repository/PaddleDetection/ppdet/modeling/architectures/yolov3.py", line 152, in build_inputs
iterable=iterable) if use_dataloader else None
File "tools/train.py", line 115, in main
feed_vars, train_loader = model.build_inputs(**inputs_def)
File "tools/train.py", line 323, in <module>
main()
----------------------
Error Message Summary:
----------------------
Error: Blocking queue is killed because the data reader raises an exception
[Hint: Expected killed_ != true, but received killed_:1 == true:1.] at (/paddle/paddle/fluid/operators/reader/blocking_queue.h:141)
[operator < read > error]
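
The summary only says that the blocking queue was killed because the data reader raised an exception. The original exception from the reader thread does not appear in this log. One possible way to surface it is to iterate the reader directly in the main thread, outside the executor, and print the first exception it throws. The sketch below is framework-independent and only an illustration: `find_reader_failure` and `broken_reader` are hypothetical names, and `reader` is assumed to be a zero-argument callable returning an iterator of samples, which may not match how the reader is constructed in this project.

```python
import traceback


def find_reader_failure(reader, max_batches=None):
    """Drain `reader` in the main thread and capture the first exception.

    `reader` is assumed to be a zero-argument callable that returns an
    iterator of samples or batches (a hypothetical stand-in for the
    training data reader). Returns (items_read_ok, traceback_text_or_None).
    """
    n_ok = 0
    try:
        for _ in reader():
            n_ok += 1
            if max_batches is not None and n_ok >= max_batches:
                break
    except Exception:
        # The full traceback points at the sample or transform that failed.
        return n_ok, traceback.format_exc()
    return n_ok, None


def broken_reader():
    # Hypothetical reader that fails on its third sample,
    # for example because of a corrupt image or a bad annotation.
    for i in range(5):
        if i == 2:
            raise ValueError("bad sample at index %d" % i)
        yield i


count, err = find_reader_failure(broken_reader)
print("samples read before failure:", count)
print(err)
```

If the real reader fails in the same way, the printed traceback should name the file or preprocessing step involved, which is more specific than the `Blocking queue is killed` summary.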