fluid.io.DataLoader.from_generator hangs on reset
Created by: littletomatodonkey
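Environment: k8s cluster, paddle 1.6.1.
The reader is a custom one, and batches are assembled with from_generator. Iteration is driven by a fixed number of minibatches rather than by epoch, so when iteration ends the data of the current epoch may not have been fully consumed. All threads inside the reader are daemon threads, so I suspect the fluid reader is stuck in join and never exits. The thread dump printed by pystack is as follows: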
Dumping Threads....
File "/opt/_internal/cpython-2.7.11-ucs4/lib/python2.7/threading.py", line 774, in __bootstrap
self.__bootstrap_inner()
File "/opt/_internal/cpython-2.7.11-ucs4/lib/python2.7/threading.py", line 801, in __bootstrap_inner
self.run()
File "/opt/_internal/cpython-2.7.11-ucs4/lib/python2.7/threading.py", line 754, in run
self.__target(*self.__args, **self.__kwargs)
File "./aiflowlib/dep/visreader/pipeline/decorator.py", line 98, in _fetcher
if isinstance(inq.get(), ReaderEndSignal):
File "/opt/_internal/cpython-2.7.11-ucs4/lib/python2.7/Queue.py", line 168, in get
self.not_empty.wait()
File "/opt/_internal/cpython-2.7.11-ucs4/lib/python2.7/threading.py", line 340, in wait
waiter.acquire()
---------------
File "/opt/_internal/cpython-2.7.11-ucs4/lib/python2.7/threading.py", line 774, in __bootstrap
self.__bootstrap_inner()
File "/opt/_internal/cpython-2.7.11-ucs4/lib/python2.7/threading.py", line 801, in __bootstrap_inner
self.run()
File "/opt/_internal/cpython-2.7.11-ucs4/lib/python2.7/threading.py", line 754, in run
self.__target(*self.__args, **self.__kwargs)
File "/opt/_internal/cpython-2.7.11-ucs4/lib/python2.7/site-packages/paddle/fluid/reader.py", line 479, in __thread_main__
if not self._queue.push(array):
---------------
File "train_aiflow.py", line 76, in <module>
trainmain()
File "train_aiflow.py", line 73, in trainmain
train.main()
File "./train.py", line 307, in main
trainmain(FLAGS)
File "./train.py", line 299, in trainmain
train_loader.reset()
File "/opt/_internal/cpython-2.7.11-ucs4/lib/python2.7/site-packages/paddle/fluid/reader.py", line 452, in reset
self._reset()
File "/opt/_internal/cpython-2.7.11-ucs4/lib/python2.7/site-packages/paddle/fluid/reader.py", line 498, in _reset
thread.join()
File "/opt/_internal/cpython-2.7.11-ucs4/lib/python2.7/threading.py", line 940, in join
self.__block.wait()
File "/opt/_internal/cpython-2.7.11-ucs4/lib/python2.7/threading.py", line 340, in wait
waiter.acquire()
File "<string>", line 1, in <module>
File "<string>", line 1, in <module>
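One possible reading of the dump above is that the feeding thread is blocked pushing into a full bounded queue (nothing consumes it once the minibatch loop stops early), while the main thread waits in join() for that same thread. The sketch below reproduces that general pattern with the Python 3 standard library only; it is not Paddle's implementation. The names `blocking_producer`, `stoppable_producer`, `run_naive` and `run_with_stop` are hypothetical, and the stop flag is only one way such a hang could be avoided.

```python
import queue
import threading


def blocking_producer(q, n_items):
    """Hypothetical feeder: blocks in put() once the bounded queue is full."""
    for i in range(n_items):
        q.put(i)


def stoppable_producer(q, stop, n_items):
    """Hypothetical feeder that polls a stop flag instead of blocking forever."""
    for i in range(n_items):
        while not stop.is_set():
            try:
                q.put(i, timeout=0.05)
                break
            except queue.Full:
                continue
        if stop.is_set():
            return


def run_naive(max_batches=3):
    """Consume a few batches, then try to join the feeder thread."""
    q = queue.Queue(maxsize=2)
    t = threading.Thread(target=blocking_producer, args=(q, 100))
    t.daemon = True
    t.start()
    for _ in range(max_batches):
        q.get()
    # A plain join() would hang here; a timeout is used so the sketch returns.
    t.join(timeout=0.5)
    return t.is_alive()


def run_with_stop(max_batches=3):
    """Same loop, but the feeder is told to stop before it is joined."""
    q = queue.Queue(maxsize=2)
    stop = threading.Event()
    t = threading.Thread(target=stoppable_producer, args=(q, stop, 100))
    t.daemon = True
    t.start()
    for _ in range(max_batches):
        q.get()
    stop.set()
    t.join(timeout=2.0)
    return t.is_alive()


if __name__ == "__main__":
    print("naive feeder still alive after join:", run_naive())
    print("stoppable feeder still alive after join:", run_with_stop())
```

In the naive variant the feeder is still alive after the timed join, which matches the shape of the hang reported here. Whether the same mechanism explains the behaviour in paddle 1.6.1 would need to be confirmed against the reader.py source.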