ValueError when training yolov3_darknet_voc on PaddleDetection with a self-made VOC dataset
Created by: LongLivecn
Any advice would be much appreciated. I annotated images myself with labelImg to build a VOC-format dataset and trained yolov3_darknet on PaddleDetection with it. After 3000+ iterations, training aborts with: fail to map op [RandomExpand_d28600] with error: could not broadcast input array from shape (600,400,3) into shape (400,600,3), plus the stack trace below.
Here is the full log:

aistudio@jupyter-213340-333670:~/PaddleDetection$ python -u tools/train.py -c configs/yolov3_darknet_voc.yml --eval
DarkNet:
  norm_decay: 0.99
  norm_type: sync_bn
  depth: 53
  weight_prefix_name: ''
EvalReader:
  batch_size: 8
  bufsize: 32
  dataset: !VOCDataSet
    anno_path: val.txt
    dataset_dir: dataset/chong
    image_dir: ''
    label_list: label_list.txt
    sample_num: -1
    use_default_label: false
    with_background: false
  drop_empty: false
  inputs_def:
    fields: [image, im_size, im_id, gt_bbox, gt_class, is_difficult]
    num_max_boxes: 50
  sample_transforms:
  - !DecodeImage
    to_rgb: true
    with_mixup: false
  - !ResizeImage
    interp: 2
    max_size: 0
    target_size: 608
    use_cv2: true
  - !NormalizeImage
    is_channel_first: false
    is_scale: true
    mean: [0.485, 0.456, 0.406]
    std: [0.229, 0.224, 0.225]
  - !PadBox
    num_max_boxes: 50
  - !Permute
    channel_first: true
    to_bgr: false
  worker_num: 8
LearningRate:
  base_lr: 0.0001
  schedulers:
  - !PiecewiseDecay
    gamma: 0.1
    milestones: [5000, 10000, 15000]
    values: null
  - !LinearWarmup
    start_factor: 0.0
    steps: 1000
OptimizerBuilder:
  regularizer:
    factor: 0.0005
    type: L2
  optimizer:
    momentum: 0.9
    type: Momentum
TestReader:
  batch_size: 1
  dataset: !ImageFolder
    anno_path: label_list.txt
    dataset_dir: dataset/chong
    image_dir: ''
    sample_num: -1
    use_default_label: false
    with_background: false
  inputs_def:
    fields: [image, im_size, im_id]
    image_shape: [3, 608, 608]
  sample_transforms:
  - !DecodeImage
    to_rgb: true
    with_mixup: false
  - !ResizeImage
    interp: 2
    max_size: 0
    target_size: 608
    use_cv2: true
  - !NormalizeImage
    is_channel_first: false
    is_scale: true
    mean: [0.485, 0.456, 0.406]
    std: [0.229, 0.224, 0.225]
  - !Permute
    channel_first: true
    to_bgr: false
TrainReader:
  batch_size: 8
  batch_transforms:
  - !RandomShape
    random_inter: true
    sizes: [320, 352, 384, 416, 448, 480, 512, 544, 576, 608]
  - !NormalizeImage
    is_channel_first: false
    is_scale: true
    mean: [0.485, 0.456, 0.406]
    std: [0.229, 0.224, 0.225]
  - !Permute
    channel_first: true
    to_bgr: false
  - !Gt2YoloTarget
    anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
    anchors: [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45],
              [59, 119], [116, 90], [156, 198], [373, 326]]
    downsample_ratios: [32, 16, 8]
    num_classes: 80
  bufsize: 32
  dataset: !VOCDataSet
    anno_path: train.txt
    dataset_dir: dataset/chong
    image_dir: ''
    label_list: label_list.txt
    sample_num: -1
    use_default_label: false
    with_background: false
  drop_last: true
  inputs_def:
    fields: [image, gt_bbox, gt_class, gt_score]
    num_max_boxes: 50
  mixup_epoch: 250
  sample_transforms:
  - !DecodeImage
    to_rgb: true
    with_mixup: true
  - !MixupImage
    alpha: 1.5
    beta: 1.5
  - !ColorDistort
    brightness: [0.5, 1.5, 0.5]
    contrast: [0.5, 1.5, 0.5]
    hue: [-18, 18, 0.5]
    random_apply: true
    saturation: [0.5, 1.5, 0.5]
  - !RandomExpand
    fill_value: !!python/tuple [123.675, 116.28, 103.53]
    prob: 0.5
    ratio: 4.0
  - !RandomCrop
    allow_no_crop: true
    aspect_ratio: [0.5, 2.0]
    cover_all_box: false
    num_attempts: 50
    scaling: [0.3, 1.0]
    thresholds: [0.0, 0.1, 0.3, 0.5, 0.7, 0.9]
  - !RandomFlipImage
    is_mask_flip: false
    is_normalized: false
    prob: 0.5
  - !NormalizeBox {}
  - !PadBox
    num_max_boxes: 50
  - !BboxXYXY2XYWH {}
  shuffle: true
  use_process: true
  worker_num: 8
YOLOv3:
  backbone: DarkNet
  use_fine_grained_loss: false
  yolo_head: YOLOv3Head
YOLOv3Head:
  nms:
    background_label: -1
    keep_top_k: 100
    nms_threshold: 0.45
    nms_top_k: 1000
    normalized: false
    score_threshold: 0.01
  norm_decay: 0.99
  anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
  anchors: [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45],
            [59, 119], [116, 90], [156, 198], [373, 326]]
  block_size: 3
  drop_block: false
  keep_prob: 0.9
  num_classes: 80
  weight_prefix_name: ''
  yolo_loss: YOLOv3Loss
YOLOv3Loss:
  batch_size: 4
  label_smooth: false
  ignore_thresh: 0.7
  iou_loss: null
  use_fine_grained_loss: false
architecture: YOLOv3
log_smooth_window: 20
map_type: 11point
max_iters: 17000
metric: VOC
num_classes: 4
pretrain_weights: output/yolov3_darknet_voc/3000
save_dir: output
snapshot_iter: 1000
use_gpu: true
weights: output/yolov3_darknet_voc/model_final
2020-04-14 17:22:25,269-INFO: 47 samples in file dataset/chong/val.txt
2020-04-14 17:22:25,269-INFO: places would be ommited when DataLoader is not iterable
W0414 17:22:26.379868  1124 device_context.cc:237] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 9.2, Runtime API Version: 9.0
W0414 17:22:26.384959  1124 device_context.cc:245] device: 0, cuDNN Version: 7.3.
2020-04-14 17:22:28,038-INFO: Loading parameters from output/yolov3_darknet_voc/3000...
2020-04-14 17:22:31,307-INFO: 107 samples in file dataset/chong/train.txt
2020-04-14 17:22:40,309-INFO: places would be ommited when DataLoader is not iterable
I0414 17:22:40.341902  1124 parallel_executor.cc:440] The Program will be executed on CUDA using ParallelExecutor, 1 cards are used, so 1 programs are executed in parallel.
I0414 17:22:40.439931  1124 build_strategy.cc:365] SeqOnlyAllReduceOps:0, num_trainers:1
I0414 17:22:40.578917  1124 parallel_executor.cc:307] Inplace strategy is enabled, when build_strategy.enable_inplace = True
I0414 17:22:40.641772  1124 parallel_executor.cc:375] Garbage collection strategy is enabled, when FLAGS_eager_delete_tensor_gb = 0
2020-04-14 17:22:43,533-INFO: iter: 0, lr: 0.000000, 'loss': '12.386612', time: 0.000, eta: 0:00:00
2020-04-14 17:22:50,253-INFO: iter: 20, lr: 0.000002, 'loss': '12.334382', time: 0.486, eta: 2:17:30
2020-04-14 17:22:56,194-INFO: iter: 40, lr: 0.000004, 'loss': '13.534571', time: 0.292, eta: 1:22:36
2020-04-14 17:23:02,139-INFO: iter: 60, lr: 0.000006, 'loss': '11.932161', time: 0.299, eta: 1:24:30
2020-04-14 17:23:10,442-INFO: iter: 80, lr: 0.000008, 'loss': '12.132146', time: 0.416, eta: 1:57:20
2020-04-14 17:23:16,190-INFO: iter: 100, lr: 0.000010, 'loss': '11.832668', time: 0.294, eta: 1:22:41
2020-04-14 17:23:22,581-INFO: iter: 120, lr: 0.000012, 'loss': '11.318932', time: 0.305, eta: 1:25:47
2020-04-14 17:23:28,009-INFO: iter: 140, lr: 0.000014, 'loss': '11.688734', time: 0.280, eta: 1:18:42
2020-04-14 17:23:35,852-INFO: iter: 160, lr: 0.000016, 'loss': '11.206422', time: 0.384, eta: 1:47:38
2020-04-14 17:23:41,959-INFO: iter: 180, lr: 0.000018, 'loss': '12.336928', time: 0.305, eta: 1:25:36
2020-04-14 17:23:46,768-INFO: iter: 200, lr: 0.000020, 'loss': '11.783538', time: 0.251, eta: 1:10:13
2020-04-14 17:23:54,457-INFO: iter: 220, lr: 0.000022, 'loss': '12.842199', time: 0.360, eta: 1:40:42
2020-04-14 17:24:00,513-INFO: iter: 240, lr: 0.000024, 'loss': '13.006575', time: 0.331, eta: 1:32:27
2020-04-14 17:24:05,719-INFO: iter: 260, lr: 0.000026, 'loss': '12.349201', time: 0.257, eta: 1:11:37
2020-04-14 17:24:10,590-INFO: iter: 280, lr: 0.000028, 'loss': '11.261145', time: 0.247, eta: 1:08:42
2020-04-14 17:24:16,218-INFO: iter: 300, lr: 0.000030, 'loss': '12.429997', time: 0.281, eta: 1:18:08
2020-04-14 17:24:22,094-INFO: iter: 320, lr: 0.000032, 'loss': '11.563356', time: 0.284, eta: 1:18:56
2020-04-14 17:24:27,274-INFO: iter: 340, lr: 0.000034, 'loss': '11.301025', time: 0.266, eta: 1:13:48
2020-04-14 17:24:32,359-INFO: iter: 360, lr: 0.000036, 'loss': '11.497587', time: 0.252, eta: 1:09:47
2020-04-14 17:24:37,266-INFO: iter: 380, lr: 0.000038, 'loss': '11.821287', time: 0.249, eta: 1:08:59
2020-04-14 17:24:42,424-INFO: iter: 400, lr: 0.000040, 'loss': '11.989442', time: 0.255, eta: 1:10:35
2020-04-14 17:24:47,496-INFO: iter: 420, lr: 0.000042, 'loss': '12.935690', time: 0.248, eta: 1:08:25
2020-04-14 17:24:54,332-INFO: iter: 440, lr: 0.000044, 'loss': '11.765289', time: 0.342, eta: 1:34:25
2020-04-14 17:24:59,725-INFO: iter: 460, lr: 0.000046, 'loss': '12.556597', time: 0.269, eta: 1:14:08
2020-04-14 17:25:04,819-INFO: iter: 480, lr: 0.000048, 'loss': '11.588726', time: 0.256, eta: 1:10:34
2020-04-14 17:25:09,801-INFO: iter: 500, lr: 0.000050, 'loss': '12.175614', time: 0.259, eta: 1:11:12
2020-04-14 17:25:15,391-INFO: iter: 520, lr: 0.000052, 'loss': '12.457203', time: 0.280, eta: 1:16:48
2020-04-14 17:25:21,748-INFO: iter: 540, lr: 0.000054, 'loss': '13.053288', time: 0.311, eta: 1:25:21
2020-04-14 17:25:26,825-INFO: iter: 560, lr: 0.000056, 'loss': '11.819932', time: 0.248, eta: 1:08:03
2020-04-14 17:25:32,110-INFO: iter: 580, lr: 0.000058, 'loss': '12.345695', time: 0.263, eta: 1:12:02
2020-04-14 17:25:36,731-INFO: iter: 600, lr: 0.000060, 'loss': '10.779097', time: 0.244, eta: 1:06:43
2020-04-14 17:25:41,836-INFO: iter: 620, lr: 0.000062, 'loss': '11.018574', time: 0.245, eta: 1:06:45
2020-04-14 17:25:47,944-INFO: iter: 640, lr: 0.000064, 'loss': '11.866978', time: 0.306, eta: 1:23:24
2020-04-14 17:25:54,130-INFO: iter: 660, lr: 0.000066, 'loss': '12.065427', time: 0.306, eta: 1:23:27
2020-04-14 17:26:00,004-INFO: iter: 680, lr: 0.000068, 'loss': '11.948149', time: 0.308, eta: 1:23:49
2020-04-14 17:26:05,653-INFO: iter: 700, lr: 0.000070, 'loss': '12.026233', time: 0.279, eta: 1:15:44
2020-04-14 17:26:11,537-INFO: iter: 720, lr: 0.000072, 'loss': '11.517006', time: 0.293, eta: 1:19:29
2020-04-14 17:26:16,502-INFO: iter: 740, lr: 0.000074, 'loss': '12.272764', time: 0.250, eta: 1:07:44
2020-04-14 17:26:22,029-INFO: iter: 760, lr: 0.000076, 'loss': '13.204546', time: 0.278, eta: 1:15:09
2020-04-14 17:26:27,706-INFO: iter: 780, lr: 0.000078, 'loss': '12.125725', time: 0.286, eta: 1:17:15
2020-04-14 17:26:31,993-INFO: iter: 800, lr: 0.000080, 'loss': '11.653991', time: 0.214, eta: 0:57:53
2020-04-14 17:26:36,617-INFO: iter: 820, lr: 0.000082, 'loss': '10.115778', time: 0.227, eta: 1:01:15
2020-04-14 17:26:41,877-INFO: iter: 840, lr: 0.000084, 'loss': '11.494898', time: 0.262, eta: 1:10:30
2020-04-14 17:26:46,661-INFO: iter: 860, lr: 0.000086, 'loss': '11.281583', time: 0.238, eta: 1:04:04
2020-04-14 17:26:52,556-INFO: iter: 880, lr: 0.000088, 'loss': '12.957414', time: 0.296, eta: 1:19:38
2020-04-14 17:26:57,869-INFO: iter: 900, lr: 0.000090, 'loss': '11.624809', time: 0.265, eta: 1:11:09
2020-04-14 17:27:03,505-INFO: iter: 920, lr: 0.000092, 'loss': '12.290806', time: 0.274, eta: 1:13:29
2020-04-14 17:27:09,336-INFO: iter: 940, lr: 0.000094, 'loss': '11.739902', time: 0.289, eta: 1:17:25
2020-04-14 17:27:15,044-INFO: iter: 960, lr: 0.000096, 'loss': '12.357161', time: 0.291, eta: 1:17:42
2020-04-14 17:27:19,977-INFO: iter: 980, lr: 0.000098, 'loss': '11.461915', time: 0.254, eta: 1:07:55
2020-04-14 17:27:25,602-INFO: iter: 1000, lr: 0.000100, 'loss': '11.596880', time: 0.268, eta: 1:11:32
2020-04-14 17:27:25,602-INFO: Save model to output/yolov3_darknet_voc/1000.
I0414 17:27:37.677040  1124 parallel_executor.cc:440] The Program will be executed on CUDA using ParallelExecutor, 1 cards are used, so 1 programs are executed in parallel.
I0414 17:27:37.693418  1124 build_strategy.cc:365] SeqOnlyAllReduceOps:0, num_trainers:1
2020-04-14 17:27:38,125-INFO: Test iter 0
2020-04-14 17:27:38,600-INFO: Test finish iter 6
2020-04-14 17:27:38,600-INFO: Total number of images: 47, inference time: 50.086242462569146 fps.
2020-04-14 17:27:38,600-INFO: Start evaluate...
2020-04-14 17:27:38,605-INFO: Accumulating evaluatation results...
2020-04-14 17:27:38,605-INFO: mAP(0.50, 11point) = 32.19
2020-04-14 17:27:38,605-INFO: Save model to output/yolov3_darknet_voc/best_model.
2020-04-14 17:27:50,166-INFO: Best test box ap: 32.186277553938844, in iter: 1000
...... (normal training log omitted)
2020-04-14 17:37:31,827-INFO: iter: 3020, lr: 0.000100, 'loss': '11.671132', time: 1.480, eta: 5:44:48
2020-04-14 17:37:37,486-INFO: iter: 3040, lr: 0.000100, 'loss': '11.391199', time: 0.283, eta: 1:05:47
2020-04-14 17:37:42,637-INFO: iter: 3060, lr: 0.000100, 'loss': '10.152473', time: 0.271, eta: 1:02:59
2020-04-14 17:37:47,454-INFO: iter: 3080, lr: 0.000100, 'loss': '11.135466', time: 0.238, eta: 0:55:13
2020-04-14 17:37:52,518-INFO: iter: 3100, lr: 0.000100, 'loss': '11.474866', time: 0.251, eta: 0:58:05
2020-04-14 17:37:58,244-INFO: iter: 3120, lr: 0.000100, 'loss': '11.714984', time: 0.291, eta: 1:07:16
2020-04-14 17:38:03,671-INFO: iter: 3140, lr: 0.000100, 'loss': '13.207261', time: 0.258, eta: 0:59:31
2020-04-14 17:38:08,578-INFO: iter: 3160, lr: 0.000100, 'loss': '12.236403', time: 0.260, eta: 1:00:04
2020-04-14 17:38:13,323-INFO: iter: 3180, lr: 0.000100, 'loss': '12.024139', time: 0.225, eta: 0:51:52
2020-04-14 17:38:18,661-INFO: iter: 3200, lr: 0.000100, 'loss': '11.605931', time: 0.266, eta: 1:01:10
2020-04-14 17:38:22,004-INFO: fail to map op [RandomExpand_d28600] with error: could not broadcast input array from shape (600,400,3) into shape (400,600,3) and stack:
Traceback (most recent call last):
  File "/home/aistudio/PaddleDetection/ppdet/data/reader.py", line 45, in __call__
    data = f(data, ctx)
  File "/home/aistudio/PaddleDetection/ppdet/data/transform/operators.py", line 1337, in __call__
    canvas[y:y + height, x:x + width, :] = img.astype(np.uint8)
ValueError: could not broadcast input array from shape (600,400,3) into shape (400,600,3)
2020-04-14 17:38:22,007-WARNING: recv endsignal from outq with errmsg[consumer[consumer-7bd-0] failed to map with error:[could not broadcast input array from shape (600,400,3) into shape (400,600,3)]]
2020-04-14 17:38:22,010-INFO: fail to map op [RandomExpand_d28600] with error: could not broadcast input array from shape (188,208,3) into shape (208,188,3) and stack:
Traceback (most recent call last):
  File "/home/aistudio/PaddleDetection/ppdet/data/reader.py", line 45, in __call__
    data = f(data, ctx)
  File "/home/aistudio/PaddleDetection/ppdet/data/transform/operators.py", line 1337, in __call__
    canvas[y:y + height, x:x + width, :] = img.astype(np.uint8)
ValueError: could not broadcast input array from shape (188,208,3) into shape (208,188,3)
2020-04-14 17:38:22,012-WARNING: recv endsignal from outq with errmsg[consumer[consumer-7bd-5] failed to map with error:[could not broadcast input array from shape (188,208,3) into shape (208,188,3)]]
2020-04-14 17:38:22,059-INFO: fail to map op [RandomExpand_d28600] with error: could not broadcast input array from shape (400,280,3) into shape (280,400,3) and stack:
Traceback (most recent call last):
  File "/home/aistudio/PaddleDetection/ppdet/data/reader.py", line 45, in __call__
    data = f(data, ctx)
  File "/home/aistudio/PaddleDetection/ppdet/data/transform/operators.py", line 1337, in __call__
    canvas[y:y + height, x:x + width, :] = img.astype(np.uint8)
ValueError: could not broadcast input array from shape (400,280,3) into shape (280,400,3)
2020-04-14 17:38:22,060-WARNING: recv endsignal from outq with errmsg[consumer[consumer-7bd-6] failed to map with error:[could not broadcast input array from shape (400,280,3) into shape (280,400,3)]]
2020-04-14 17:38:22,073-INFO: fail to map op [RandomExpand_d28600] with error: could not broadcast input array from shape (350,273,3) into shape (273,350,3) and stack:
Traceback (most recent call last):
  File "/home/aistudio/PaddleDetection/ppdet/data/reader.py", line 45, in __call__
    data = f(data, ctx)
  File "/home/aistudio/PaddleDetection/ppdet/data/transform/operators.py", line 1337, in __call__
    canvas[y:y + height, x:x + width, :] = img.astype(np.uint8)
ValueError: could not broadcast input array from shape (350,273,3) into shape (273,350,3)
2020-04-14 17:38:22,074-WARNING: recv endsignal from outq with errmsg[consumer[consumer-7bd-2] failed to map with error:[could not broadcast input array from shape (350,273,3) into shape (273,350,3)]]
2020-04-14 17:38:22,612-INFO: fail to map op [RandomExpand_d28600] with error: could not broadcast input array from shape (800,610,3) into shape (610,800,3) and stack:
Traceback (most recent call last):
  File "/home/aistudio/PaddleDetection/ppdet/data/reader.py", line 45, in __call__
    data = f(data, ctx)
  File "/home/aistudio/PaddleDetection/ppdet/data/transform/operators.py", line 1337, in __call__
    canvas[y:y + height, x:x + width, :] = img.astype(np.uint8)
ValueError: could not broadcast input array from shape (800,610,3) into shape (610,800,3)
2020-04-14 17:38:22,617-WARNING: recv endsignal from outq with errmsg[consumer[consumer-7bd-1] failed to map with error:[could not broadcast input array from shape (800,610,3) into shape (610,800,3)]]
2020-04-14 17:38:22,696-INFO: fail to map op [RandomExpand_d28600] with error: could not broadcast input array from shape (583,440,3) into shape (440,583,3) and stack:
Traceback (most recent call last):
  File "/home/aistudio/PaddleDetection/ppdet/data/reader.py", line 45, in __call__
    data = f(data, ctx)
  File "/home/aistudio/PaddleDetection/ppdet/data/transform/operators.py", line 1337, in __call__
    canvas[y:y + height, x:x + width, :] = img.astype(np.uint8)
ValueError: could not broadcast input array from shape (583,440,3) into shape (440,583,3)
2020-04-14 17:38:22,697-WARNING: recv endsignal from outq with errmsg[consumer[consumer-7bd-3] failed to map with error:[could not broadcast input array from shape (583,440,3) into shape (440,583,3)]]
2020-04-14 17:38:23,098-INFO: fail to map op [RandomExpand_d28600] with error: could not broadcast input array from shape (1024,775,3) into shape (775,1024,3) and stack:
Traceback (most recent call last):
  File "/home/aistudio/PaddleDetection/ppdet/data/reader.py", line 45, in __call__
    data = f(data, ctx)
  File "/home/aistudio/PaddleDetection/ppdet/data/transform/operators.py", line 1337, in __call__
    canvas[y:y + height, x:x + width, :] = img.astype(np.uint8)
ValueError: could not broadcast input array from shape (1024,775,3) into shape (775,1024,3)
2020-04-14 17:38:23,100-WARNING: recv endsignal from outq with errmsg[consumer[consumer-7bd-4] failed to map with error:[could not broadcast input array from shape (1024,775,3) into shape (775,1024,3)]]
2020-04-14 17:38:24,365-INFO: iter: 3220, lr: 0.000100, 'loss': '12.207880', time: 0.283, eta: 1:05:02
2020-04-14 17:38:25,106-WARNING: recv endsignal from outq with errmsg[consumer[consumer-7bd-7] exits for reason[consumer[consumer-7bd-0] failed to map with error:[could not broadcast input array from shape (600,400,3) into shape (400,600,3)]]]
2020-04-14 17:38:25,106-WARNING: Your reader has raised an exception!
Exception in thread Thread-11:
Traceback (most recent call last):
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/reader.py", line 805, in thread_main
    six.reraise(*sys.exc_info())
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/six.py", line 693, in reraise
    raise value
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/reader.py", line 785, in thread_main
    for tensors in self._tensor_reader():
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/reader.py", line 853, in __tensor_reader_impl__
    for slots in paddle_reader():
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/data_feeder.py", line 488, in __reader_creator__
    for item in reader():
  File "/home/aistudio/PaddleDetection/ppdet/data/reader.py", line 417, in _reader
    reader.reset()
  File "/home/aistudio/PaddleDetection/ppdet/data/parallel_map.py", line 253, in reset
    assert not self._exit, "cannot reset for already stopped dataset"
AssertionError: cannot reset for already stopped dataset
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py:782: UserWarning: The following exception is not an EOF exception.
  "The following exception is not an EOF exception.")
Traceback (most recent call last):
  File "tools/train.py", line 342, in <module>
    main()
  File "tools/train.py", line 240, in main
    outs = exe.run(compiled_train_prog, fetch_list=train_values)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 783, in run
    six.reraise(*sys.exc_info())
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/six.py", line 693, in reraise
    raise value
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 778, in run
    use_program_cache=use_program_cache)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 843, in _run_impl
    return_numpy=return_numpy)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 677, in _run_parallel
    tensors = exe.run(fetch_var_names)._move_to_list()
paddle.fluid.core_avx.EnforceNotMet:
C++ Call Stacks (More useful to developers):
0   std::string paddle::platform::GetTraceBackString<std::string const&>(std::string const&, char const*, int)
1   paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int)
2   paddle::operators::reader::BlockingQueue<std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> > >::Receive(std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> >*)
3   paddle::operators::reader::PyReader::ReadNext(std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> >*)
4   std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result, std::__future_base::_Result_base::_Deleter>, unsigned long> >::_M_invoke(std::_Any_data const&)
5   std::__future_base::_State_base::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>&, bool&)
6   ThreadPool::ThreadPool(unsigned long)::{lambda()#1}::operator()() const
Python Call Stacks (More useful to users):
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2525, in append_op
    attrs=kwargs.get("attrs", None))
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/reader.py", line 733, in _init_non_iterable
    outputs={'Out': self._feed_list})
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/reader.py", line 646, in __init__
    self._init_non_iterable()
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/reader.py", line 280, in from_generator
    iterable, return_list)
  File "/home/aistudio/PaddleDetection/ppdet/modeling/architectures/yolov3.py", line 152, in build_inputs
    iterable=iterable) if use_dataloader else None
  File "tools/train.py", line 122, in main
    feed_vars, train_loader = model.build_inputs(**inputs_def)
  File "tools/train.py", line 342, in <module>
    main()
Error Message Summary:
Error: Blocking queue is killed because the data reader raises an exception
  [Hint: Expected killed_ != true, but received killed_:1 == true:1.] at (/paddle/paddle/fluid/operators/reader/blocking_queue.h:141)
  [operator < read > error]
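Every failed sample reports two shapes that are exact transposes of each other, which points at sample metadata whose height and width are swapped relative to the decoded image: RandomExpand sizes its paste region from the recorded height/width, then tries to paste the actual image into it. A minimal NumPy sketch of that failing assignment (numbers taken from the first error; nothing here is PaddleDetection-specific):

```python
import numpy as np

# Decoded image: 600 rows (height) x 400 columns (width).
img = np.random.randint(0, 255, size=(600, 400, 3), dtype=np.uint8)

# Suppose the recorded metadata has the two dimensions transposed.
height, width = 400, 600

# RandomExpand builds an enlarged canvas from the recorded size,
# then pastes the image at a random offset (0, 0 here).
ratio = 2.0
canvas = np.full((int(height * ratio), int(width * ratio), 3), 128, dtype=np.uint8)
y, x = 0, 0

try:
    canvas[y:y + height, x:x + width, :] = img  # target slice is (400, 600, 3)
except ValueError as err:
    message = str(err)

print(message)  # could not broadcast input array from shape ... into shape ...
```

If the recorded height/width matched the decoded array, the slice and the image would have identical shapes and the assignment would succeed.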
Here is the config file:

architecture: YOLOv3
use_gpu: true
max_iters: 17000
log_smooth_window: 20
save_dir: output
snapshot_iter: 1000
metric: VOC
map_type: 11point
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/DarkNet53_pretrained.tar
weights: output/yolov3_darknet_voc/model_final
num_classes: 4
use_fine_grained_loss: false

YOLOv3:
  backbone: DarkNet
  yolo_head: YOLOv3Head

DarkNet:
  norm_type: sync_bn
  norm_decay: 0.99
  depth: 53

YOLOv3Head:
  anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
  anchors: [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45],
            [59, 119], [116, 90], [156, 198], [373, 326]]
  norm_decay: 0.99
  yolo_loss: YOLOv3Loss
  nms:
    background_label: -1
    keep_top_k: 100
    nms_threshold: 0.45
    nms_top_k: 1000
    normalized: false
    score_threshold: 0.01

YOLOv3Loss:
  # batch_size here is only used for fine grained loss, not used
  # for training batch_size setting, training batch_size setting
  # is in configs/yolov3_reader.yml TrainReader.batch_size, batch
  # size here should be set as same value as TrainReader.batch_size
  batch_size: 4
  ignore_thresh: 0.7
  label_smooth: false

LearningRate:
  base_lr: 0.0001
  schedulers:
  - !PiecewiseDecay
    gamma: 0.1
    milestones:
    - 5000
    - 10000
    - 15000
  - !LinearWarmup
    start_factor: 0.
    steps: 1000

OptimizerBuilder:
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.0005
    type: L2

_READER_: 'yolov3_reader.yml'

TrainReader:
  inputs_def:
    fields: ['image', 'gt_bbox', 'gt_class', 'gt_score']
    num_max_boxes: 50
  dataset:
    !VOCDataSet
    dataset_dir: dataset/chong
    anno_path: train.txt
    use_default_label: false
    with_background: false

EvalReader:
  inputs_def:
    fields: ['image', 'im_size', 'im_id', 'gt_bbox', 'gt_class', 'is_difficult']
    num_max_boxes: 50
  dataset:
    !VOCDataSet
    dataset_dir: dataset/chong
    anno_path: val.txt
    use_default_label: false
    with_background: false

TestReader:
  dataset:
    !ImageFolder
    dataset_dir: dataset/chong
    anno_path: label_list.txt
    use_default_label: false
    with_background: false
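Since every mismatch above is a clean height/width swap, one thing worth ruling out is annotation files whose `<width>`/`<height>` disagree with the image files they describe (hand-edited size fields, or images rotated after labeling). Below is a hypothetical sanity-check sketch, not part of PaddleDetection; it assumes PNG images so it can read the size from the file header with only the standard library (with Pillow installed, `Image.open(path).size` would cover JPEG as well), and the directory names are guesses at the usual VOC layout:

```python
import os
import struct
import xml.etree.ElementTree as ET


def png_size(path):
    """Read (width, height) straight from a PNG file header."""
    with open(path, 'rb') as f:
        head = f.read(24)
    assert head[:8] == b'\x89PNG\r\n\x1a\n', 'not a PNG file'
    return struct.unpack('>II', head[16:24])  # IHDR width, height


def check_voc_sizes(ann_dir, img_dir):
    """Return annotation files whose <size> disagrees with the image file."""
    mismatched = []
    for name in sorted(os.listdir(ann_dir)):
        if not name.endswith('.xml'):
            continue
        root = ET.parse(os.path.join(ann_dir, name)).getroot()
        recorded = (int(root.findtext('size/width')),
                    int(root.findtext('size/height')))
        actual = png_size(os.path.join(img_dir, root.findtext('filename')))
        if recorded != actual:
            mismatched.append((name, recorded, actual))
    return mismatched


# Example (paths are guesses for this dataset):
#   for name, rec, act in check_voc_sizes('dataset/chong/Annotations',
#                                         'dataset/chong/JPEGImages'):
#       print(name, 'XML says', rec, 'image is', act)
```

Any file this flags would hand RandomExpand a transposed height/width, producing exactly the broadcast error in the log.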