PaddlePaddle / PaddleDetection
Opened Apr 07, 2020 by saxon_zh (@saxon_zh, Guest)

ValueError when training mask_rcnn_r50_1x on a custom dataset

Created by: Magsun

@Baidu experts: a newcomer asking for help.

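I annotated my data with labelme and converted it to a COCO-format dataset with the labelme2coco script that ships with labelme. The dataset has three annotated classes (not counting background). I changed data_dir and drop_last in mask_reader.yml, and set num_classes to 4 in mask_rcnn_r50_1x.yml.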
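The class count described above (three annotated classes plus background, giving 4) can be sketched as follows. This is only an illustration, assuming the annotation file follows the standard COCO layout with a top-level `categories` list; the helper name `num_classes_from_coco` and the sample category names are hypothetical and not part of PaddleDetection.

```python
import json


def num_classes_from_coco(annotation, with_background=True):
    """Count distinct category ids in a COCO-style dict, plus one for background."""
    category_ids = {cat["id"] for cat in annotation["categories"]}
    return len(category_ids) + (1 if with_background else 0)


# Hypothetical annotation fragment with three categories, as in the post.
example = json.loads(
    '{"categories": ['
    '{"id": 1, "name": "class_a"}, '
    '{"id": 2, "name": "class_b"}, '
    '{"id": 3, "name": "class_c"}]}'
)

print(num_classes_from_coco(example))
```

With `with_background: true`, as in the dumped reader config, the result matches the value of 4 set in the config.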

Training fails with a broadcast error: an array of shape (3,800,1067) cannot be broadcast into shape (3,800). ValueError: could not broadcast input array from shape (3,800,1067) into shape (3,800)

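I searched online, and some people say the input size is wrong and does not match what the model expects. I could not find anywhere that fixes the input size, and since the framework resizes the data itself, I would not expect a size problem. Any help would be much appreciated.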
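To make the resize question concrete, here is a minimal sketch of an aspect-preserving resize rule, assuming `target_size: 800` and `max_size: 1333` mean "scale the shorter side to 800, but cap the longer side at 1333" (the usual convention for this kind of detector; the exact rounding inside the framework may differ). The function name `resized_shape` is hypothetical. Under that assumption, images with different aspect ratios come out with different widths, so resizing alone does not give every image the same shape.

```python
def resized_shape(height, width, target_size=800, max_size=1333):
    """Return (new_height, new_width) for a shorter-side resize with a longer-side cap."""
    short_side = min(height, width)
    long_side = max(height, width)
    scale = target_size / short_side
    # If the longer side would exceed max_size, shrink the scale to fit.
    if round(scale * long_side) > max_size:
        scale = max_size / long_side
    return int(round(height * scale)), int(round(width * scale))


for h, w in [(600, 800), (800, 800), (720, 1280)]:
    print((h, w), "->", resized_shape(h, w))
```

A 600x800 input maps to 800x1067 under this rule, which matches the (3,800,1067) shape in the error message, while a square or wider input maps to a different width.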

The log is as follows:

(base) root@8bdbc88b7a0b:/workspace/PaddleDetection# python tools/train.py -c configs/mask_rcnn_r50_1x.yml
BBoxAssigner:
  batch_size_per_im: 512
  bbox_reg_weights:
  - 0.1
  - 0.1
  - 0.2
  - 0.2
  bg_thresh_hi: 0.5
  bg_thresh_lo: 0.0
  fg_fraction: 0.25
  fg_thresh: 0.5
  num_classes: 81
  shuffle_before_sample: true
BBoxHead:
  head: ResNetC5
  nms:
    keep_top_k: 100
    nms_threshold: 0.5
    normalized: false
    score_threshold: 0.05
  bbox_loss:
    sigma: 1.0
  box_coder:
    axis: 1
    box_normalized: false
    code_type: decode_center_size
    prior_box_var:
    - 0.1
    - 0.1
    - 0.2
    - 0.2
  num_classes: 81
EvalReader:
  batch_size: 1
  dataset: !COCODataSet
    anno_path: annotations/instances_val2017.json
    dataset_dir: /data/cloudcover_is/20200331/coco/
    image_dir: val2017
    sample_num: -1
    with_background: true
  drop_empty: false
  drop_last: false
  inputs_def:
    fields:
    - image
    - im_info
    - im_id
    - im_shape
  sample_transforms:
  - !DecodeImage
    to_rgb: true
    with_mixup: false
  - !NormalizeImage
    is_channel_first: false
    is_scale: true
    mean:
    - 0.485
    - 0.456
    - 0.406
    std:
    - 0.229
    - 0.224
    - 0.225
  - !ResizeImage
    interp: 1
    max_size: 1333
    target_size: 800
    use_cv2: true
  - !Permute
    channel_first: true
    to_bgr: false
  shuffle: false
  worker_num: 4
LearningRate:
  schedulers:
  - !PiecewiseDecay
    gamma: 0.1
    milestones:
    - 120000
    - 160000
    values: null
  - !LinearWarmup
    start_factor: 0.3333333333333333
    steps: 500
  base_lr: 0.01
MaskAssigner:
  num_classes: 81
  resolution: 14
MaskHead:
  conv_dim: 256
  dilation: 1
  norm_type: null
  num_classes: 81
  num_convs: 0
  resolution: 14
MaskRCNN:
  backbone: ResNet
  rpn_head: RPNHead
  bbox_assigner: BBoxAssigner
  bbox_head: BBoxHead
  fpn: null
  mask_assigner: MaskAssigner
  mask_head: MaskHead
  roi_extractor: RoIAlign
  rpn_only: false
OptimizerBuilder:
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.0001
    type: L2
RPNHead:
  rpn_target_assign:
    rpn_batch_size_per_im: 256
    rpn_fg_fraction: 0.5
    rpn_negative_overlap: 0.3
    rpn_positive_overlap: 0.7
    rpn_straddle_thresh: 0.0
  test_proposal:
    min_size: 0.0
    nms_thresh: 0.7
    post_nms_top_n: 1000
    pre_nms_top_n: 6000
  train_proposal:
    min_size: 0.0
    nms_thresh: 0.7
    post_nms_top_n: 2000
    pre_nms_top_n: 12000
  anchor_generator:
    anchor_sizes:
    - 32
    - 64
    - 128
    - 256
    - 512
    aspect_ratios:
    - 0.5
    - 1.0
    - 2.0
    stride:
    - 16.0
    - 16.0
    variance:
    - 1.0
    - 1.0
    - 1.0
    - 1.0
  num_classes: 1
ResNet:
  feature_maps: 4
  norm_type: affine_channel
  dcn_v2_stages: []
  depth: 50
  freeze_at: 2
  freeze_norm: true
  gcb_params: {}
  gcb_stages: []
  nonlocal_stages: []
  norm_decay: 0.0
  variant: b
  weight_prefix_name: ''
ResNetC5:
  norm_type: affine_channel
  depth: 50
  feature_maps:
  - 5
  freeze_at: 2
  freeze_norm: true
  norm_decay: 0.0
  variant: b
  weight_prefix_name: ''
RoIAlign:
  resolution: 14
  sampling_ratio: 0
  spatial_scale: 0.0625
TestReader:
  batch_size: 1
  dataset: !ImageFolder
    anno_path: annotations/instances_val2017.json
    dataset_dir: ''
    image_dir: ''
    sample_num: -1
    use_default_label: null
    with_background: true
  drop_last: false
  inputs_def:
    fields:
    - image
    - im_info
    - im_id
    - im_shape
  sample_transforms:
  - !DecodeImage
    to_rgb: true
    with_mixup: false
  - !NormalizeImage
    is_channel_first: false
    is_scale: true
    mean:
    - 0.485
    - 0.456
    - 0.406
    std:
    - 0.229
    - 0.224
    - 0.225
  - !ResizeImage
    interp: 1
    max_size: 1333
    target_size: 800
    use_cv2: true
  - !Permute
    channel_first: true
    to_bgr: false
  shuffle: false
TrainReader:
  batch_size: 4
  dataset: !COCODataSet
    anno_path: annotations/instances_train2017.json
    dataset_dir: /data/cloudcover_is/20200331/coco/
    image_dir: train2017
    sample_num: -1
    with_background: true
  drop_last: true
  inputs_def:
    fields:
    - image
    - im_info
    - im_id
    - gt_bbox
    - gt_class
    - is_crowd
    - gt_mask
  sample_transforms:
  - !DecodeImage
    to_rgb: true
    with_mixup: false
  - !RandomFlipImage
    is_mask_flip: true
    is_normalized: false
    prob: 0.5
  - !NormalizeImage
    is_channel_first: false
    is_scale: true
    mean:
    - 0.485
    - 0.456
    - 0.406
    std:
    - 0.229
    - 0.224
    - 0.225
  - !ResizeImage
    interp: 1
    max_size: 1333
    target_size: 800
    use_cv2: true
  - !Permute
    channel_first: true
    to_bgr: false
  shuffle: true
  use_process: false
  worker_num: 4
architecture: MaskRCNN
log_smooth_window: 20
max_iters: 180000
metric: COCO
num_classes: 4
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
save_dir: output
snapshot_iter: 1000
use_gpu: true
weights: /data/cloudcover_is/20200331/output/mask_rcnn_r50_1x/model_final

W0407 08:11:58.271080 3546 device_context.cc:237] Please NOTE: device: 0, CUDA Capability: 61, Driver API Version: 10.1, Runtime API Version: 10.0
W0407 08:11:58.277384 3546 device_context.cc:245] device: 0, cuDNN Version: 7.6.
2020-04-07 08:12:02,633-INFO: Load model and fuse batch norm if have from https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar...
2020-04-07 08:12:02,634-INFO: Found /root/.cache/paddle/weights/ResNet50_cos_pretrained
2020-04-07 08:12:02,641-INFO: Loading parameters from /root/.cache/paddle/weights/ResNet50_cos_pretrained...
2020-04-07 08:12:02,641-WARNING: /root/.cache/paddle/weights/ResNet50_cos_pretrained.pdparams not found, try to load model file saved with [ save_params, save_persistables, save_vars ]
2020-04-07 08:12:02,641-WARNING: /root/.cache/paddle/weights/ResNet50_cos_pretrained.pdparams not found, try to load model file saved with [ save_params, save_persistables, save_vars ]
2020-04-07 08:12:02,650-WARNING: variable file [ /root/.cache/paddle/weights/ResNet50_cos_pretrained/fc_0.w_0 /root/.cache/paddle/weights/ResNet50_cos_pretrained/fc_0.b_0 ] not used
2020-04-07 08:12:02,650-WARNING: variable file [ /root/.cache/paddle/weights/ResNet50_cos_pretrained/fc_0.w_0 /root/.cache/paddle/weights/ResNet50_cos_pretrained/fc_0.b_0 ] not used
loading annotations into memory...
Done (t=1.09s)
creating index...
index created!
2020-04-07 08:12:05,022-INFO: 2960 samples in file /data/cloudcover_is/20200331/coco/annotations/instances_train2017.json
2020-04-07 08:12:05,047-INFO: places would be ommited when DataLoader is not iterable
I0407 08:12:05.064105 3546 parallel_executor.cc:440] The Program will be executed on CUDA using ParallelExecutor, 1 cards are used, so 1 programs are executed in parallel.
I0407 08:12:05.089402 3546 build_strategy.cc:365] SeqOnlyAllReduceOps:0, num_trainers:1
I0407 08:12:05.130786 3546 parallel_executor.cc:307] Inplace strategy is enabled, when build_strategy.enable_inplace = True
I0407 08:12:05.149138 3546 parallel_executor.cc:375] Garbage collection strategy is enabled, when FLAGS_eager_delete_tensor_gb = 0
2020-04-07 08:12:07,916-WARNING: Your reader has raised an exception!
Exception in thread Thread-6:
Traceback (most recent call last):
  File "/root/anaconda3/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/root/anaconda3/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/root/anaconda3/lib/python3.7/site-packages/paddle/fluid/reader.py", line 805, in thread_main
    six.reraise(*sys.exc_info())
  File "/root/anaconda3/lib/python3.7/site-packages/six.py", line 703, in reraise
    raise value
  File "/root/anaconda3/lib/python3.7/site-packages/paddle/fluid/reader.py", line 785, in thread_main
    for tensors in self._tensor_reader():
  File "/root/anaconda3/lib/python3.7/site-packages/paddle/fluid/reader.py", line 853, in tensor_reader_impl
    for slots in paddle_reader():
  File "/root/anaconda3/lib/python3.7/site-packages/paddle/fluid/data_feeder.py", line 489, in reader_creator
    yield self.feed(item)
  File "/root/anaconda3/lib/python3.7/site-packages/paddle/fluid/data_feeder.py", line 330, in feed
    ret_dict[each_name] = each_converter.done()
  File "/root/anaconda3/lib/python3.7/site-packages/paddle/fluid/data_feeder.py", line 139, in done
    arr = numpy.array(self.data, dtype=self.dtype)
ValueError: could not broadcast input array from shape (3,800,1067) into shape (3,800)

/root/anaconda3/lib/python3.7/site-packages/paddle/fluid/executor.py:782: UserWarning: The following exception is not an EOF exception.
  "The following exception is not an EOF exception.")
Traceback (most recent call last):
  File "tools/train.py", line 323, in <module>
    main()
  File "tools/train.py", line 233, in main
    outs = exe.run(compiled_train_prog, fetch_list=train_values)
  File "/root/anaconda3/lib/python3.7/site-packages/paddle/fluid/executor.py", line 783, in run
    six.reraise(*sys.exc_info())
  File "/root/anaconda3/lib/python3.7/site-packages/six.py", line 703, in reraise
    raise value
  File "/root/anaconda3/lib/python3.7/site-packages/paddle/fluid/executor.py", line 778, in run
    use_program_cache=use_program_cache)
  File "/root/anaconda3/lib/python3.7/site-packages/paddle/fluid/executor.py", line 843, in _run_impl
    return_numpy=return_numpy)
  File "/root/anaconda3/lib/python3.7/site-packages/paddle/fluid/executor.py", line 677, in _run_parallel
    tensors = exe.run(fetch_var_names)._move_to_list()
paddle.fluid.core_avx.EnforceNotMet:


C++ Call Stacks (More useful to developers):

0   std::string paddle::platform::GetTraceBackString<std::string const&>(std::string const&, char const*, int)
1   paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int)
2   paddle::operators::reader::BlockingQueue<std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> > >::Receive(std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> >)
3   paddle::operators::reader::PyReader::ReadNext(std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> >)
4   std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result, std::__future_base::_Result_base::_Deleter>, unsigned long> >::_M_invoke(std::_Any_data const&)
5   std::__future_base::_State_base::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>&, bool&)
6   ThreadPool::ThreadPool(unsigned long)::{lambda()#1}::operator()() const


Python Call Stacks (More useful to users):

  File "/root/anaconda3/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2525, in append_op
    attrs=kwargs.get("attrs", None))
  File "/root/anaconda3/lib/python3.7/site-packages/paddle/fluid/reader.py", line 733, in _init_non_iterable
    outputs={'Out': self._feed_list})
  File "/root/anaconda3/lib/python3.7/site-packages/paddle/fluid/reader.py", line 646, in __init__
    self._init_non_iterable()
  File "/root/anaconda3/lib/python3.7/site-packages/paddle/fluid/reader.py", line 280, in from_generator
    iterable, return_list)
  File "/workspace/PaddleDetection/ppdet/modeling/architectures/mask_rcnn.py", line 329, in build_inputs
    iterable=iterable) if use_dataloader else None
  File "tools/train.py", line 115, in main
    feed_vars, train_loader = model.build_inputs(**inputs_def)
  File "tools/train.py", line 323, in <module>
    main()


Error Message Summary:

Error: Blocking queue is killed because the data reader raises an exception
  [Hint: Expected killed_ != true, but received killed_:1 == true:1.] at (/paddle/paddle/fluid/operators/reader/blocking_queue.h:141)
  [operator < read > error]
terminate called without an active exception
W0407 08:12:08.303510 3594 init.cc:209] Warning: PaddlePaddle catches a failure signal, it may not work properly
W0407 08:12:08.303535 3594 init.cc:211] You could check whether you killed PaddlePaddle thread/process accidentally or report the case to PaddlePaddle
W0407 08:12:08.303544 3594 init.cc:214] The detail failure signal is:

W0407 08:12:08.303556 3594 init.cc:217] *** Aborted at 1586247128 (unix time) try "date -d @1586247128" if you are using GNU date ***
W0407 08:12:08.306300 3594 init.cc:217] PC: @ 0x0 (unknown)
W0407 08:12:08.306440 3594 init.cc:217] *** SIGABRT (@0xdda) received by PID 3546 (TID 0x7f445d9fd700) from PID 3546; stack trace: ***
W0407 08:12:08.308898 3594 init.cc:217] @ 0x7f454f537390 (unknown)
W0407 08:12:08.311234 3594 init.cc:217] @ 0x7f454f191428 gsignal
W0407 08:12:08.313705 3594 init.cc:217] @ 0x7f454f19302a abort
W0407 08:12:08.317917 3594 init.cc:217] @ 0x7f45267d084a __gnu_cxx::__verbose_terminate_handler()
W0407 08:12:08.319195 3594 init.cc:217] @ 0x7f45267cef47 __cxxabiv1::__terminate()
W0407 08:12:08.320966 3594 init.cc:217] @ 0x7f45267cef7d std::terminate()
W0407 08:12:08.322532 3594 init.cc:217] @ 0x7f45267cec5a __gxx_personality_v0
W0407 08:12:08.324553 3594 init.cc:217] @ 0x7f454e796b97 _Unwind_ForcedUnwind_Phase2
W0407 08:12:08.326314 3594 init.cc:217] @ 0x7f454e796e7d _Unwind_ForcedUnwind
W0407 08:12:08.327750 3594 init.cc:217] @ 0x7f454f536070 __GI___pthread_unwind
W0407 08:12:08.329143 3594 init.cc:217] @ 0x7f454f52e845 __pthread_exit
W0407 08:12:08.329684 3594 init.cc:217] @ 0x556edfe321c9 PyThread_exit_thread
W0407 08:12:08.329820 3594 init.cc:217] @ 0x556edfcc4cb1 PyEval_RestoreThread.cold.787
W0407 08:12:08.330205 3594 init.cc:217] @ 0x7f450c77bcde (unknown)
W0407 08:12:08.330754 3594 init.cc:217] @ 0x556edfdbc114 _PyMethodDef_RawFastCallKeywords
W0407 08:12:08.331284 3594 init.cc:217] @ 0x556edfdbc231 _PyCFunction_FastCallKeywords
W0407 08:12:08.331825 3594 init.cc:217] @ 0x556edfe20a5d _PyEval_EvalFrameDefault
W0407 08:12:08.332321 3594 init.cc:217] @ 0x556edfd756f9 _PyEval_EvalCodeWithName
W0407 08:12:08.332814 3594 init.cc:217] @ 0x556edfd76805 _PyFunction_FastCallDict
W0407 08:12:08.333302 3594 init.cc:217] @ 0x556edfd91943 _PyObject_Call_Prepend
W0407 08:12:08.333562 3594 init.cc:217] @ 0x556edfdd012a slot_tp_call
W0407 08:12:08.334058 3594 init.cc:217] @ 0x556edfdd118b _PyObject_FastCallKeywords
W0407 08:12:08.334596 3594 init.cc:217] @ 0x556edfe20626 _PyEval_EvalFrameDefault
W0407 08:12:08.335088 3594 init.cc:217] @ 0x556edfd7673b _PyFunction_FastCallDict
W0407 08:12:08.335577 3594 init.cc:217] @ 0x556edfd91943 _PyObject_Call_Prepend
W0407 08:12:08.335862 3594 init.cc:217] @ 0x556edfdd012a slot_tp_call
W0407 08:12:08.336364 3594 init.cc:217] @ 0x556edfdd118b _PyObject_FastCallKeywords
W0407 08:12:08.336897 3594 init.cc:217] @ 0x556edfe20e8f _PyEval_EvalFrameDefault
W0407 08:12:08.337393 3594 init.cc:217] @ 0x556edfd756f9 _PyEval_EvalCodeWithName
W0407 08:12:08.337884 3594 init.cc:217] @ 0x556edfd76805 _PyFunction_FastCallDict
W0407 08:12:08.338369 3594 init.cc:217] @ 0x556edfd91943 _PyObject_Call_Prepend
W0407 08:12:08.338919 3594 init.cc:217] @ 0x556edfd84b9e PyObject_Call
Aborted (core dumped)
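The reader-thread traceback in the log fails while converting one batch of samples into a single array, and the dumped TrainReader uses `batch_size: 4` with an aspect-preserving resize. One plausible reading, not confirmed anywhere in this thread, is that images in the same batch have different widths. The plain NumPy sketch below shows that such arrays cannot be stacked and that zero-padding to a common size makes the batch rectangular. The helper `pad_batch` is hypothetical and is not PaddleDetection's own padding mechanism; the second image width (1200) is an invented example value.

```python
import numpy as np


def pad_batch(images):
    """Zero-pad CHW arrays to the largest height and width found in the batch."""
    channels = images[0].shape[0]
    max_h = max(im.shape[1] for im in images)
    max_w = max(im.shape[2] for im in images)
    batch = np.zeros((len(images), channels, max_h, max_w), dtype=images[0].dtype)
    for i, im in enumerate(images):
        _, h, w = im.shape
        batch[i, :, :h, :w] = im  # copy into the top-left corner, rest stays zero
    return batch


# Two CHW images with the same height but different widths.
images = [
    np.ones((3, 800, 1067), dtype=np.float32),
    np.ones((3, 800, 1200), dtype=np.float32),
]

try:
    np.stack(images)
    stacked = True
except ValueError:
    stacked = False  # shapes differ, so a single dense array cannot be built

batch = pad_batch(images)
print(stacked, batch.shape)
```

Under the same assumption, a batch size of 1 would also sidestep the mismatch, since there is nothing to stack against.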

Reference: paddlepaddle/PaddleDetection#454