Opened Sep 10, 2020 by saxon_zh (@saxon_zh, Guest)

pp yolo error during training

Created by: sukkyusun1

The following error occurs repeatedly after about 11,000 iterations:

2020-09-09 16:54:54,508-INFO: iter: 11300, lr: 0.000100, 'loss_xy': '5.475853', 'loss_wh': '9.381088', 'loss_obj': '39.984276', 'loss_cls': '11.378154', 'loss_iou': '35.723583', 'loss_iou_aware': '0.037410', 'loss': '102.309998', time: 0.633, eta: 15:35:59
2020-09-09 16:55:57,974-INFO: iter: 11400, lr: 0.000100, 'loss_xy': '5.445478', 'loss_wh': '9.870771', 'loss_obj': '38.965660', 'loss_cls': '11.542593', 'loss_iou': '36.368202', 'loss_iou_aware': '0.035767', 'loss': '103.814850', time: 0.632, eta: 15:33:56
2020-09-09 16:57:04,996-INFO: iter: 11500, lr: 0.000100, 'loss_xy': '6.019155', 'loss_wh': '9.661630', 'loss_obj': '39.524551', 'loss_cls': '11.500303', 'loss_iou': '37.492279', 'loss_iou_aware': '0.033797', 'loss': '105.630592', time: 0.671, eta: 16:29:15
2020-09-09 16:58:13,333-INFO: iter: 11600, lr: 0.000100, 'loss_xy': '5.562244', 'loss_wh': '10.271358', 'loss_obj': '38.939468', 'loss_cls': '10.803724', 'loss_iou': '34.840134', 'loss_iou_aware': '0.033829', 'loss': '100.110817', time: 0.685, eta: 16:48:35
2020-09-09 16:58:26,631-WARNING: consumer[31433] exit abnormally with exitcode[-9]
2020-09-09 16:58:26,631-WARNING: 1 consumers have exited abnormally!!!
2020-09-09 16:58:26,631-WARNING: consumer[31433] exit abnormally with exitcode[-9]
2020-09-09 16:58:26,631-WARNING: 1 consumers have exited abnormally!!!
2020-09-09 16:58:26,664-WARNING: Your reader has raised an exception!
Exception in thread Thread-11:
Traceback (most recent call last):
  File "/home/sk/anaconda3/envs/paddle3.6/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/home/sk/anaconda3/envs/paddle3.6/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/home/sk/anaconda3/envs/paddle3.6/lib/python3.6/site-packages/paddle/fluid/reader.py", line 1145, in thread_main
    six.reraise(*sys.exc_info())
  File "/home/sk/anaconda3/envs/paddle3.6/lib/python3.6/site-packages/six.py", line 703, in reraise
    raise value
  File "/home/sk/anaconda3/envs/paddle3.6/lib/python3.6/site-packages/paddle/fluid/reader.py", line 1125, in thread_main
    for tensors in self._tensor_reader():
  File "/home/sk/anaconda3/envs/paddle3.6/lib/python3.6/site-packages/paddle/fluid/reader.py", line 1195, in tensor_reader_impl
    for slots in paddle_reader():
  File "/home/sk/anaconda3/envs/paddle3.6/lib/python3.6/site-packages/paddle/fluid/data_feeder.py", line 506, in reader_creator
    for item in reader():
  File "/home/sk/PaddleDetection/ppdet/data/reader.py", line 445, in _reader
    reader.reset()
  File "/home/sk/PaddleDetection/ppdet/data/parallel_map.py", line 267, in reset
    assert self._consumer_healthy(), "cannot start another pass of data"
AssertionError: cannot start another pass of data for some consumers exited abnormally before!!!

/home/sk/anaconda3/envs/paddle3.6/lib/python3.6/site-packages/paddle/fluid/executor.py:1070: UserWarning: The following exception is not an EOF exception.
  "The following exception is not an EOF exception.")
Traceback (most recent call last):
  File "tools/train.py", line 368, in <module>
    main()
  File "tools/train.py", line 241, in main
    outs = exe.run(compiled_train_prog, fetch_list=train_values)
  File "/home/sk/anaconda3/envs/paddle3.6/lib/python3.6/site-packages/paddle/fluid/executor.py", line 1071, in run
    six.reraise(*sys.exc_info())
  File "/home/sk/anaconda3/envs/paddle3.6/lib/python3.6/site-packages/six.py", line 703, in reraise
    raise value
  File "/home/sk/anaconda3/envs/paddle3.6/lib/python3.6/site-packages/paddle/fluid/executor.py", line 1066, in run
    return_merged=return_merged)
  File "/home/sk/anaconda3/envs/paddle3.6/lib/python3.6/site-packages/paddle/fluid/executor.py", line 1167, in _run_impl
    return_merged=return_merged)
  File "/home/sk/anaconda3/envs/paddle3.6/lib/python3.6/site-packages/paddle/fluid/executor.py", line 879, in _run_parallel
    tensors = exe.run(fetch_var_names, return_merged)._move_to_list()
paddle.fluid.core_avx.EnforceNotMet:


C++ Call Stacks (More useful to developers):

0 std::string paddle::platform::GetTraceBackString<std::string const&>(std::string const&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int)
2 paddle::operators::reader::BlockingQueue<std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> > >::Receive(std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> >)
3 paddle::operators::reader::PyReader::ReadNext(std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> >)
4 std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result, std::__future_base::_Result_base::_Deleter>, unsigned long> >::_M_invoke(std::_Any_data const&)
5 std::__future_base::_State_base::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>&, bool&)
6 ThreadPool::ThreadPool(unsigned long)::{lambda()#1}::operator()() const


Python Call Stacks (More useful to users):

  File "/home/sk/anaconda3/envs/paddle3.6/lib/python3.6/site-packages/paddle/fluid/framework.py", line 2610, in append_op
    attrs=kwargs.get("attrs", None))
  File "/home/sk/anaconda3/envs/paddle3.6/lib/python3.6/site-packages/paddle/fluid/reader.py", line 1080, in _init_non_iterable
    attrs={'drop_last': self._drop_last})
  File "/home/sk/anaconda3/envs/paddle3.6/lib/python3.6/site-packages/paddle/fluid/reader.py", line 978, in __init__
    self._init_non_iterable()
  File "/home/sk/anaconda3/envs/paddle3.6/lib/python3.6/site-packages/paddle/fluid/reader.py", line 620, in from_generator
    iterable, return_list, drop_last)
  File "/home/sk/PaddleDetection/ppdet/modeling/architectures/yolo.py", line 155, in build_inputs
    iterable=iterable) if use_dataloader else None
  File "tools/train.py", line 113, in main
    feed_vars, train_loader = model.build_inputs(**inputs_def)
  File "tools/train.py", line 368, in <module>
    main()


Error Message Summary:

Error: Blocking queue is killed because the data reader raises an exception [Hint: Expected killed_ != true, but received killed_:1 == true:1.] at (/paddle/paddle/fluid/operators/reader/blocking_queue.h:141) [operator < read > error]
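
The `exitcode[-9]` in the warnings above matches the convention of Python's `multiprocessing` module, where a negative exit code -N means the child process was terminated by signal N. The following is a minimal sketch for decoding such a code. The helper name `describe_exitcode` is hypothetical, and it assumes the reader consumers are `multiprocessing` processes (suggested by `use_process: true` in the reader config, but not confirmed by the log itself).

```python
import signal


def describe_exitcode(exitcode):
    """Turn a multiprocessing-style exit code into a readable string.

    Hypothetical helper for illustration; not part of PaddleDetection.
    """
    if exitcode is None:
        return "still running"
    if exitcode < 0:
        # A negative value means the process was terminated by signal -exitcode.
        signum = -exitcode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # The signal number is not defined on this platform.
            name = "unknown signal"
        return "killed by signal {} ({})".format(signum, name)
    return "exited with status {}".format(exitcode)


# The code reported for consumer[31433] in the log above.
print(describe_exitcode(-9))
```

On Linux, signal 9 is SIGKILL, which a process cannot catch or handle. It is often sent by the kernel's out-of-memory killer, although the log alone does not confirm that this is what happened here.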

ppyolo.yml

architecture: YOLOv3
use_gpu: true
max_iters: 100000
log_smooth_window: 100
log_iter: 100
save_dir: output
snapshot_iter: 10000
metric: VOC
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_ssld_pretrained.tar
weights: output/ppyolo/model_final
num_classes: 5
use_fine_grained_loss: true
use_ema: true
ema_decay: 0.9998

YOLOv3:
  backbone: ResNet
  yolo_head: YOLOv3Head
  use_fine_grained_loss: true

ResNet:
  norm_type: sync_bn
  freeze_at: 0
  freeze_norm: false
  norm_decay: 0.
  depth: 50
  feature_maps: [3, 4, 5]
  variant: d
  dcn_v2_stages: [5]

YOLOv3Head:
  anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
  anchors: [[10, 13], [16, 30], [33, 23],
            [30, 61], [62, 45], [59, 119],
            [116, 90], [156, 198], [373, 326]]
  norm_decay: 0.
  coord_conv: true
  iou_aware: true
  iou_aware_factor: 0.4
  scale_x_y: 1.05
  spp: true
  yolo_loss: YOLOv3Loss
  nms:
    background_label: -1
    keep_top_k: 100
    nms_threshold: 0.45
    nms_top_k: 1000
    normalized: false
    score_threshold: 0.01
  drop_block: true

YOLOv3Loss:
  batch_size: 4
  ignore_thresh: 0.7
  scale_x_y: 1.05
  label_smooth: false
  use_fine_grained_loss: true
  iou_loss: IouLoss
  iou_aware_loss: IouAwareLoss

IouLoss:
  loss_weight: 2.5
  max_height: 608
  max_width: 608

IouAwareLoss:
  loss_weight: 1.0
  max_height: 608
  max_width: 608

MatrixNMS:
  background_label: -1
  keep_top_k: 100
  normalized: false
  score_threshold: 0.01
  post_threshold: 0.01

LearningRate:
  base_lr: 0.0001
  schedulers:
  - !PiecewiseDecay
    gamma: 0.1
    milestones:
    - 150000
    - 200000
  - !LinearWarmup
    start_factor: 0.
    steps: 4000

OptimizerBuilder:
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.0005
    type: L2

_READER_: 'ppyolo_reader.yml'
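
As a cross-check of the `LearningRate` section against the training log, the sketch below computes the learning rate implied by the config at a given iteration. It assumes the usual reading of the two schedulers: linear warmup from `start_factor * base_lr` to `base_lr` over `steps` iterations, then multiplication by `gamma` at each milestone. The function name `lr_at` is hypothetical and this is not PaddleDetection's own implementation.

```python
def lr_at(iteration, base_lr=0.0001, warmup_steps=4000, start_factor=0.0,
          milestones=(150000, 200000), gamma=0.1):
    """Learning rate implied by the config at a given iteration (sketch)."""
    if iteration < warmup_steps:
        # Linear warmup: interpolate the factor from start_factor up to 1.
        alpha = iteration / float(warmup_steps)
        factor = start_factor * (1.0 - alpha) + alpha
        return base_lr * factor
    # Piecewise decay: multiply by gamma once per milestone already passed.
    passed = sum(1 for m in milestones if iteration >= m)
    return base_lr * gamma ** passed


# Iteration 11300 is past warmup and before any milestone.
print("%.6f" % lr_at(11300))
```

Under these assumptions the value at iteration 11300 is 0.0001, which agrees with the `lr: 0.000100` printed in the log. Note also that with `max_iters: 100000`, neither milestone (150000, 200000) is reached, so the decay would never take effect in this run.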

ppyolo.reader.yml

TrainReader:
  inputs_def:
    fields: ['image', 'gt_bbox', 'gt_class', 'gt_score']
    num_max_boxes: 200
  dataset:
    !VOCDataSet
      #image_dir: train2017
      anno_path: /home/sk/PaddleDetection/dataset/voc/trainval.txt
      dataset_dir: /home/sk/PaddleDetection/dataset/voc
      with_background: false
      use_default_label: false
  sample_transforms:
    - !DecodeImage
      to_rgb: True
      with_mixup: True
    - !MixupImage
      alpha: 1.5
      beta: 1.5
    - !ColorDistort {}
    - !RandomExpand
      fill_value: [123.675, 116.28, 103.53]
    - !RandomCrop {}
    - !RandomFlipImage
      is_normalized: false
    - !NormalizeBox {}
    - !PadBox
      num_max_boxes: 200
    - !BboxXYXY2XYWH {}
  batch_transforms:
  - !RandomShape
    sizes: [320, 352, 384, 416, 448, 480, 512, 544, 576, 608]
    random_inter: True
  - !NormalizeImage
    mean: [0.485, 0.456, 0.406]
    std: [0.229, 0.224, 0.225]
    is_scale: True
    is_channel_first: false
  - !Permute
    to_bgr: false
    channel_first: True
  # Gt2YoloTarget is only used when use_fine_grained_loss set as true,
  # this operator will be deleted automatically if use_fine_grained_loss
  # is set as false
  - !Gt2YoloTarget
    anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
    anchors: [[10, 13], [16, 30], [33, 23],
              [30, 61], [62, 45], [59, 119],
              [116, 90], [156, 198], [373, 326]]
    downsample_ratios: [32, 16, 8]
  batch_size: 4
  shuffle: true
  mixup_epoch: 25000
  drop_last: true
  worker_num: 8
  bufsize: 4
  use_process: true

EvalReader:
  inputs_def:
    #fields: ['image', 'im_size', 'im_id']
    fields: ['image', 'im_size', 'im_id', 'gt_bbox', 'gt_class', 'is_difficult']
    num_max_boxes: 200
  dataset:
    !VOCDataSet
      #image_dir: val2017
      anno_path: /home/sk/PaddleDetection/dataset/voc/test.txt
      dataset_dir: /home/sk/PaddleDetection/dataset/voc
      with_background: false
      use_default_label: false
  sample_transforms:
    - !DecodeImage
      to_rgb: True
    - !ResizeImage
      target_size: 1728
      interp: 2
    - !NormalizeImage
      mean: [0.485, 0.456, 0.406]
      std: [0.229, 0.224, 0.225]
      is_scale: True
      is_channel_first: false
    - !PadBox
      num_max_boxes: 200
    - !Permute
      to_bgr: false
      channel_first: True
  batch_size: 4
  drop_empty: false
  worker_num: 8
  bufsize: 4

TestReader:
  inputs_def:
    image_shape: [3, 1728, 1728]
    fields: ['image', 'im_size', 'im_id']
  dataset:
    !ImageFolder
      anno_path: /home/sk/PaddleDetection/dataset/voc/label_list.txt
      with_background: false
      use_default_label: false
  sample_transforms:
    - !DecodeImage
      to_rgb: True
    - !ResizeImage
      target_size: 1728
      interp: 2
    - !NormalizeImage
      mean: [0.485, 0.456, 0.406]
      std: [0.229, 0.224, 0.225]
      is_scale: True
      is_channel_first: false
    - !Permute
      to_bgr: false
      channel_first: True
  batch_size: 1

Reference: paddlepaddle/PaddleDetection#1381