training error 2
Created by: toplinuxsir
2020-09-02 16:20:21,166-WARNING: consumer[8977] exit abnormally with exitcode[-9]
2020-09-02 16:20:21,198-WARNING: 1 consumers have exited abnormally!!!
2020-09-02 16:20:21,199-WARNING: consumer[8977] exit abnormally with exitcode[-9]
2020-09-02 16:20:21,200-WARNING: 1 consumers have exited abnormally!!!
2020-09-02 16:20:21,223-WARNING: Your reader has raised an exception!
Exception in thread Thread-11:
Traceback (most recent call last):
  File "/usr/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/lhs/python-envs/devpy37/lib/python3.7/site-packages/paddle/fluid/reader.py", line 1145, in __thread_main__
    six.reraise(*sys.exc_info())
  File "/home/lhs/python-envs/devpy37/lib/python3.7/site-packages/six.py", line 703, in reraise
    raise value
  File "/home/lhs/python-envs/devpy37/lib/python3.7/site-packages/paddle/fluid/reader.py", line 1125, in __thread_main__
    for tensors in self._tensor_reader():
  File "/home/lhs/python-envs/devpy37/lib/python3.7/site-packages/paddle/fluid/reader.py", line 1195, in __tensor_reader_impl__
    for slots in paddle_reader():
  File "/home/lhs/python-envs/devpy37/lib/python3.7/site-packages/paddle/fluid/data_feeder.py", line 506, in __reader_creator__
    for item in reader():
  File "/data/trainings/paddle_detection/PaddleDetection/ppdet/data/reader.py", line 445, in _reader
    reader.reset()
  File "/data/trainings/paddle_detection/PaddleDetection/ppdet/data/parallel_map.py", line 267, in reset
    assert self._consumer_healthy(), "cannot start another pass of data" \
AssertionError: cannot start another pass of data for some consumers exited abnormally before!!!
/home/lhs/python-envs/devpy37/lib/python3.7/site-packages/paddle/fluid/executor.py:1070: UserWarning: The following exception is not an EOF exception.
"The following exception is not an EOF exception.")
Traceback (most recent call last):
  File "tools/train.py", line 370, in <module>
    main()
  File "tools/train.py", line 243, in main
    outs = exe.run(compiled_train_prog, fetch_list=train_values)
  File "/home/lhs/python-envs/devpy37/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1071, in run
    six.reraise(*sys.exc_info())
  File "/home/lhs/python-envs/devpy37/lib/python3.7/site-packages/six.py", line 703, in reraise
    raise value
  File "/home/lhs/python-envs/devpy37/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1066, in run
    return_merged=return_merged)
  File "/home/lhs/python-envs/devpy37/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1167, in _run_impl
    return_merged=return_merged)
  File "/home/lhs/python-envs/devpy37/lib/python3.7/site-packages/paddle/fluid/executor.py", line 879, in _run_parallel
    tensors = exe.run(fetch_var_names, return_merged)._move_to_list()
paddle.fluid.core_avx.EnforceNotMet:
--------------------------------------------
C++ Call Stacks (More useful to developers):
--------------------------------------------
0 std::string paddle::platform::GetTraceBackString<std::string const&>(std::string const&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int)
2 paddle::operators::reader::BlockingQueue<std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> > >::Receive(std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> >*)
3 paddle::operators::reader::PyReader::ReadNext(std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> >*)
4 std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<unsigned long>, std::__future_base::_Result_base::_Deleter>, unsigned long> >::_M_invoke(std::_Any_data const&)
5 std::__future_base::_State_base::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>&, bool&)
6 ThreadPool::ThreadPool(unsigned long)::{lambda()#1}::operator()() const
------------------------------------------
Python Call Stacks (More useful to users):
------------------------------------------
  File "/home/lhs/python-envs/devpy37/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2610, in append_op
    attrs=kwargs.get("attrs", None))
  File "/home/lhs/python-envs/devpy37/lib/python3.7/site-packages/paddle/fluid/reader.py", line 1080, in _init_non_iterable
    attrs={'drop_last': self._drop_last})
  File "/home/lhs/python-envs/devpy37/lib/python3.7/site-packages/paddle/fluid/reader.py", line 978, in __init__
    self._init_non_iterable()
  File "/home/lhs/python-envs/devpy37/lib/python3.7/site-packages/paddle/fluid/reader.py", line 620, in from_generator
    iterable, return_list, drop_last)
  File "/data/trainings/paddle_detection/PaddleDetection/ppdet/modeling/architectures/yolo.py", line 155, in build_inputs
    iterable=iterable) if use_dataloader else None
  File "tools/train.py", line 115, in main
    feed_vars, train_loader = model.build_inputs(**inputs_def)
  File "tools/train.py", line 370, in <module>
    main()
----------------------
Error Message Summary:
----------------------
Error: Blocking queue is killed because the data reader raises an exception
[Hint: Expected killed_ != true, but received killed_:1 == true:1.] at (/paddle/paddle/fluid/operators/reader/blocking_queue.h:141)
[operator < read > error]
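
A note on reading the first warnings: if the consumers are Python multiprocessing workers (an assumption, the log does not say so), a negative exit code -N means the process was terminated by signal N. On Linux, signal 9 is SIGKILL, which is often (though not always) sent by the kernel's out-of-memory killer. The helper below, describe_exitcode, is a hypothetical name used only for illustration and is not part of Paddle or PaddleDetection; it is a minimal sketch of how such an exit code can be decoded.

```python
import signal


def describe_exitcode(exitcode):
    """Describe an exit code that follows the multiprocessing convention.

    Assumption: negative values mean "terminated by signal -exitcode",
    as with multiprocessing.Process.exitcode on POSIX systems.
    """
    if exitcode is None:
        return "still running"
    if exitcode < 0:
        # Map the signal number back to its symbolic name (e.g. 9 -> SIGKILL).
        return "killed by " + signal.Signals(-exitcode).name
    return "exited with status {}".format(exitcode)


# The value reported in the log above.
print(describe_exitcode(-9))
```

If the result is SIGKILL, checking system memory usage while the reader workers run is a reasonable first step, since the later AssertionError and the "Blocking queue is killed" message only follow from the consumer having died.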