Training on CPU works, but training on GPU fails with the following error:
Created by: YanDingXin
Process Process-1:
Traceback (most recent call last):
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/paddle/reader/decorator.py", line 556, in _read_into_queue
    six.reraise(*sys.exc_info())
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/six.py", line 703, in reraise
    raise value
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/paddle/reader/decorator.py", line 549, in _read_into_queue
    for sample in reader():
  File "tools/../ppocr/data/rec/dataset_traversal.py", line 303, in batch_iter_reader
    for outs in sample_iter_reader():
  File "tools/../ppocr/data/rec/dataset_traversal.py", line 272, in sample_iter_reader
    ) * self.num_workers > img_num:
  File "tools/../ppocr/data/rec/dataset_traversal.py", line 241, in get_device_num
    gpu_num = len(gpus.split(','))
AttributeError: 'int' object has no attribute 'split'
Process Process-2:
Traceback (most recent call last):
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/paddle/reader/decorator.py", line 556, in _read_into_queue
    six.reraise(*sys.exc_info())
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/six.py", line 703, in reraise
    raise value
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/paddle/reader/decorator.py", line 549, in _read_into_queue
    for sample in reader():
  File "tools/../ppocr/data/rec/dataset_traversal.py", line 303, in batch_iter_reader
    for outs in sample_iter_reader():
  File "tools/../ppocr/data/rec/dataset_traversal.py", line 272, in sample_iter_reader
    ) * self.num_workers > img_num:
  File "tools/../ppocr/data/rec/dataset_traversal.py", line 241, in get_device_num
    gpu_num = len(gpus.split(','))
AttributeError: 'int' object has no attribute 'split'
Process Process-3:
Traceback (most recent call last):
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/paddle/reader/decorator.py", line 556, in _read_into_queue
    six.reraise(*sys.exc_info())
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/six.py", line 703, in reraise
    raise value
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/paddle/reader/decorator.py", line 549, in _read_into_queue
    for sample in reader():
  File "tools/../ppocr/data/rec/dataset_traversal.py", line 303, in batch_iter_reader
    for outs in sample_iter_reader():
  File "tools/../ppocr/data/rec/dataset_traversal.py", line 272, in sample_iter_reader
    ) * self.num_workers > img_num:
  File "tools/../ppocr/data/rec/dataset_traversal.py", line 241, in get_device_num
    gpu_num = len(gpus.split(','))
AttributeError: 'int' object has no attribute 'split'
2020-08-19 13:58:35,693-WARNING: Your reader has raised an exception!
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/paddle/fluid/reader.py", line 805, in thread_main
    six.reraise(*sys.exc_info())
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/six.py", line 703, in reraise
    raise value
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/paddle/fluid/reader.py", line 785, in thread_main
    for tensors in self._tensor_reader():
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/paddle/fluid/reader.py", line 853, in tensor_reader_impl
    for slots in paddle_reader():
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/paddle/fluid/data_feeder.py", line 488, in reader_creator
    for item in reader():
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/paddle/reader/decorator.py", line 572, in queue_reader
    raise ValueError("multiprocess reader raises an exception")
ValueError: multiprocess reader raises an exception
Process Process-4:
Traceback (most recent call last):
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/paddle/reader/decorator.py", line 556, in _read_into_queue
    six.reraise(*sys.exc_info())
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/six.py", line 703, in reraise
    raise value
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/paddle/reader/decorator.py", line 549, in _read_into_queue
    for sample in reader():
  File "tools/../ppocr/data/rec/dataset_traversal.py", line 303, in batch_iter_reader
    for outs in sample_iter_reader():
  File "tools/../ppocr/data/rec/dataset_traversal.py", line 272, in sample_iter_reader
    ) * self.num_workers > img_num:
  File "tools/../ppocr/data/rec/dataset_traversal.py", line 241, in get_device_num
    gpu_num = len(gpus.split(','))
AttributeError: 'int' object has no attribute 'split'
/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/paddle/fluid/executor.py:789: UserWarning: The following exception is not an EOF exception.
  "The following exception is not an EOF exception.")
Traceback (most recent call last):
  File "tools/train.py", line 123, in <module>
    main()
  File "tools/train.py", line 100, in main
    program.train_eval_rec_run(config, exe, train_info_dict, eval_info_dict)
  File "tools/../tools/program.py", line 336, in train_eval_rec_run
    return_numpy=False)
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/paddle/fluid/executor.py", line 790, in run
    six.reraise(*sys.exc_info())
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/six.py", line 703, in reraise
    raise value
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/paddle/fluid/executor.py", line 785, in run
    use_program_cache=use_program_cache)
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/paddle/fluid/executor.py", line 850, in _run_impl
    return_numpy=return_numpy)
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/paddle/fluid/executor.py", line 684, in _run_parallel
    tensors = exe.run(fetch_var_names)._move_to_list()
paddle.fluid.core_avx.EnforceNotMet:
C++ Call Stacks (More useful to developers):
0   std::string paddle::platform::GetTraceBackString<std::string const&>(std::string const&, char const*, int)
1   paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int)
2   paddle::operators::reader::BlockingQueue<std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> > >::Receive(std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> >)
3   paddle::operators::reader::PyReader::ReadNext(std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> >)
4   std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result, std::__future_base::_Result_base::_Deleter>, unsigned long> >::_M_invoke(std::_Any_data const&)
5   std::__future_base::_State_base::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>&, bool&)
6   ThreadPool::ThreadPool(unsigned long)::{lambda()#1}::operator()() const
Python Call Stacks (More useful to users):
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2525, in append_op
    attrs=kwargs.get("attrs", None))
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/paddle/fluid/reader.py", line 733, in _init_non_iterable
    outputs={'Out': self._feed_list})
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/paddle/fluid/reader.py", line 646, in __init__
    self._init_non_iterable()
  File "/home/aisrv/anaconda3/envs/ydx/lib/python3.7/site-packages/paddle/fluid/reader.py", line 280, in from_generator
    iterable, return_list)
  File "tools/../ppocr/modeling/architectures/rec_model.py", line 135, in create_feed
    iterable=False)
  File "tools/../ppocr/modeling/architectures/rec_model.py", line 188, in __call__
    image, labels, loader = self.create_feed(mode)
  File "tools/../tools/program.py", line 170, in build
    dataloader, outputs = model(mode=mode)
  File "tools/train.py", line 50, in main
    config, train_program, startup_program, mode='train')
  File "tools/train.py", line 123, in <module>
    main()
Error Message Summary:
Error: Blocking queue is killed because the data reader raises an exception
  [Hint: Expected killed_ != true, but received killed_:1 == true:1.] at (/paddle/paddle/fluid/operators/reader/blocking_queue.h:141)
  [operator < read > error]
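
All four reader processes fail at the same place: `get_device_num` calls `gpus.split(',')` while `gpus` is an `int`, and the blocking-queue error at the end looks like a consequence of that. Below is a minimal sketch of how this could happen and one defensive way to count devices. It assumes (the source of `dataset_traversal.py` is not shown here, so this is a guess) that `gpus` is read from the `CUDA_VISIBLE_DEVICES` environment variable with an integer fallback; `count_gpus` is a hypothetical helper name, not part of PaddleOCR.

```python
import os


def count_gpus(env=None):
    """Count comma-separated device ids in CUDA_VISIBLE_DEVICES.

    Hypothetical helper for illustration only; it is not PaddleOCR code.
    """
    env = os.environ if env is None else env
    # Assumption: the variable holds a list such as "0,1".
    # A string default (and str()) guarantees that .split() exists.
    gpus = str(env.get("CUDA_VISIBLE_DEVICES", "0"))
    return len(gpus.split(","))


# Reproduce the reported error: with an int default and the variable unset,
# .get() returns an int, which has no .split() method.
try:
    len({}.get("CUDA_VISIBLE_DEVICES", 1).split(","))
except AttributeError as exc:
    print(exc)

print(count_gpus({"CUDA_VISIBLE_DEVICES": "0,1"}))
```

If that assumption holds, setting `CUDA_VISIBLE_DEVICES` in the shell before launching `tools/train.py` may also avoid the error without touching the code.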