PaddleDetection: training aborts with an error
Created by: CrazyRockWZJ
Partway through training, the following error is reported: ValueError: all consumers exited, no more samples
2019-10-28 12:36:37,664-INFO: iter: 920, lr: 0.001000, 'loss_cls_0': '0.484241', 'loss_cls_2': '0.106703', 'loss_rpn_bbox': 'nan', 'loss_cls_1': '0.233949', 'loss_rpn_cls': '0.246742', 'loss_loc_1': '0.219890', 'loss': 'nan', 'loss_loc_2': '0.042145', 'loss_loc_0': '0.321073', time: 0.246, eta: 6:05:22
2019-10-28 12:36:42,503-INFO: iter: 940, lr: 0.001000, 'loss_cls_0': '0.667223', 'loss_cls_2': '0.101810', 'loss_rpn_bbox': '0.195698', 'loss_cls_1': '0.243394', 'loss_rpn_cls': '0.176546', 'loss_loc_1': '0.315209', 'loss': '2.159427', 'loss_loc_2': '0.075595', 'loss_loc_0': '0.365450', time: 0.238, eta: 5:52:37
2019-10-28 12:36:47,423-INFO: iter: 960, lr: 0.001000, 'loss_cls_0': '0.503310', 'loss_cls_2': '0.108281', 'loss_rpn_bbox': 'nan', 'loss_cls_1': '0.239438', 'loss_rpn_cls': '0.143339', 'loss_loc_1': '0.274694', 'loss': 'nan', 'loss_loc_2': '0.095471', 'loss_loc_0': '0.394992', time: 0.244, eta: 6:02:17
Traceback (most recent call last):
  File "tools/train.py", line 338, in <module>
    main()
  File "tools/train.py", line 244, in main
    outs = exe.run(compiled_train_prog, fetch_list=train_values)
  File "/home/wzj/.local/lib/python3.5/site-packages/paddle/fluid/executor.py", line 672, in run
    return_numpy=return_numpy)
  File "/home/wzj/.local/lib/python3.5/site-packages/paddle/fluid/executor.py", line 534, in _run_parallel
    exe.run(fetch_var_names, fetch_var_name)
paddle.fluid.core_avx.EOFException: There is no next data. at [/paddle/paddle/fluid/operators/reader/read_op.cc:92]
I checked my dataset and it appears to be fine. I don't know why this happens.
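Since the log shows `loss_rpn_bbox` going to `nan` at iterations 920 and 960, one thing a dataset check could look for is degenerate bounding boxes. The sketch below is only an illustration under stated assumptions: it assumes COCO-style annotations (dicts with a `bbox` field in `[x, y, width, height]` order), which the post does not confirm, and `find_degenerate_boxes` and `min_size` are hypothetical names, not part of PaddleDetection.

```python
import math


def find_degenerate_boxes(annotations, min_size=1e-3):
    """Return (index, reason) pairs for boxes that may destabilize box regression.

    Assumes each annotation is a dict with a COCO-style "bbox" entry:
    [x, y, width, height]. This layout is an assumption, not taken from the post.
    """
    problems = []
    for idx, ann in enumerate(annotations):
        bbox = ann.get("bbox")
        if bbox is None or len(bbox) != 4:
            problems.append((idx, "missing or malformed bbox"))
            continue
        # Reject NaN/inf and non-numeric values before comparing sizes.
        if not all(isinstance(v, (int, float)) and math.isfinite(v) for v in bbox):
            problems.append((idx, "non-finite coordinate"))
            continue
        _, _, width, height = bbox
        if width < min_size or height < min_size:
            problems.append((idx, "zero or negative width/height"))
    return problems
```

If the annotations are stored in a COCO JSON file, the list under its `"annotations"` key could be passed in directly after loading the file with `json.load`. An empty result would not prove the data is clean, only that this particular class of problem is absent.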