Error obtaining the mAP value during SSD validation/evaluation with a single-class object detection dataset
Created by: lxwzafu
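- Version and environment: 1) PaddlePaddle version: paddlepaddle-1.5.2 2) CPU: Intel Core i7 3) GPU: AMD (training runs on the CPU) 4) System: Windows10, VS2017, Python36_64
- Model information: 1) Model name: SSD 2) Dataset: pascalvoc (2007+2012) 3) Algorithm name: Course 9 - Advanced Deep Learning CV - Object Detection 4) Model link: https://aistudio.baidu.com/aistudio/projectdetail/78972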
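- Problem description:
First, I changed the multi-threaded reading of training data in the algorithm to single-threaded reading, because the multi-threaded version fails on Windows.
I then trained and validated on the original dataset, and everything worked.
Next I modified the dataset: I filtered train.txt down to the records of the "person" class to form a new train.txt, and edited the label_list file so that it keeps only "person".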
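The filtering step could be sketched roughly as follows. The record layout here is an assumption, since the layout of train.txt is not shown in this report: one record per line, tab-separated, with the image path first and then one JSON object per box whose class name sits under a "value" key. `filter_records` is a hypothetical helper name, and dropping the non-person boxes from the kept records is also an assumption.

```python
import json


def filter_records(lines, keep_class="person"):
    """Keep records that contain at least one box of keep_class.

    Assumed (hypothetical) layout of each line:
    image_path<TAB>{"value": "person", ...}<TAB>{"value": "dog", ...}
    """
    kept = []
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        image_path, raw_boxes = parts[0], parts[1:]
        # Parse every non-empty box field as JSON.
        boxes = [json.loads(b) for b in raw_boxes if b.strip()]
        # Keep only the boxes of the wanted class.
        wanted = [b for b in boxes if b.get("value") == keep_class]
        if wanted:
            fields = [image_path] + [json.dumps(b) for b in wanted]
            kept.append("\t".join(fields))
    return kept
```

Writing the returned lines to a new list file, one per line, would give a person-only train.txt under these assumptions.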
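I then trained and validated again. Training ran without problems, but validation failed at the point where the mAP value is obtained. The failing code is: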
map_eval = fluid.metrics.DetectionMAP(nmsed_out, gt_label, gt_box, difficult,
    train_parameters['class_dim'], overlap_threshold=0.5,
    evaluate_difficult=False, ap_version='11point')

The error message is:

Invoke operator detection_map error.
Python Call stacks:
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\paddle\fluid\framework.py", line 1774, in append_op
    attrs=kwargs.get("attrs", None))
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\paddle\fluid\layer_helper.py", line 43, in append_op
    return self.main_program.current_block().append_op(*kwargs)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\paddle\fluid\layers\detection.py", line 1058, in detection_map
    'class_num': class_num,
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\paddle\fluid\metrics.py", line 802, in __init__
    ap_version=ap_version)
  File "C:\Users\lxw\source\repos\PythonApplication1\PythonApplication1\module2.py", line 663, in build_eval_program_with_feeder
    evaluate_difficult=False, ap_version='11point')
  File "C:\Users\lxw\source\repos\PythonApplication1\PythonApplication1\module2.py", line 765, in <module>
    eval_feeder, eval_reader, cur_map, accum_map, nmsed_out = build_eval_program_with_feeder(eval_program, start_program)
  File "c:\program files (x86)\microsoft visual studio\2017\enterprise\common7\ide\extensions\microsoft\python\core\Packages\ptvsd_vendored\pydevd_pydev_imps_pydev_execfile.py", line 25, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "c:\program files (x86)\microsoft visual studio\2017\enterprise\common7\ide\extensions\microsoft\python\core\Packages\ptvsd_vendored\pydevd\pydevd.py", line 1106, in _exec
    pydev_imports.execfile(file, globals, locals) # execute the script
  File "c:\program files (x86)\microsoft visual studio\2017\enterprise\common7\ide\extensions\microsoft\python\core\Packages\ptvsd_vendored\pydevd\pydevd.py", line 1099, in run
    return self._exec(is_module, entry_point_fn, module_name, file, globals, locals)
  File "c:\program files (x86)\microsoft visual studio\2017\enterprise\common7\ide\extensions\microsoft\python\core\Packages\ptvsd_vendored\pydevd\pydevd.py", line 1752, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "c:\program files (x86)\microsoft visual studio\2017\enterprise\common7\ide\extensions\microsoft\python\core\Packages\ptvsd_local.py", line 125, in _run
    _pydevd.main()
  File "c:\program files (x86)\microsoft visual studio\2017\enterprise\common7\ide\extensions\microsoft\python\core\Packages\ptvsd_local.py", line 64, in run_file
    run(argv, addr, kwargs)
  File "c:\program files (x86)\microsoft visual studio\2017\enterprise\common7\ide\extensions\microsoft\python\core\Packages\ptvsd\debugger.py", line 37, in debug
    run(address, filename, *args, kwargs)
  File "c:\program files (x86)\microsoft visual studio\2017\enterprise\common7\ide\extensions\microsoft\python\core\ptvsd_launcher.py", line 119, in <module>
    vspd.debug(filename, port_num, debug_id, debug_options, run_as)
C++ Call stacks:
Enforce failed. Expected det_dims[1] == 6UL, but received det_dims[1]:1 != 6UL:6.
The shape is of Input(DetectRes) [N, 6]. at [D:\1.5.2\paddle\paddle\fluid\operators\detection_map_op.cc:49]
PaddlePaddle Call Stacks:
Windows not support stack backtrace yet.
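The enforce message says that the detection input reached detection_map with a second dimension of 1 instead of 6. Each detection row is expected to carry six values: label, score, xmin, ymin, xmax, ymax. The following plain-Python check only illustrates that layout; `check_detection_rows` is a hypothetical helper and not part of PaddlePaddle.

```python
EXPECTED_FIELDS = 6  # label, score, xmin, ymin, xmax, ymax


def check_detection_rows(rows):
    """Raise ValueError if any row does not have exactly 6 fields.

    Returns the number of rows checked.
    """
    for i, row in enumerate(rows):
        if len(row) != EXPECTED_FIELDS:
            raise ValueError(
                "row %d has %d fields, expected %d"
                % (i, len(row), EXPECTED_FIELDS)
            )
    return len(rows)
```

For example, `check_detection_rows([[0, 0.9, 0.1, 0.1, 0.5, 0.5]])` passes, while a row holding a single value, which matches the shape reported in the error, raises ValueError.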
Execution stops at the following location:
except fluid.core.EOFException:
    train_reader.reset()
The log output is:
2019-09-25 23:02:43,359 - module2.py[line:750] - INFO: start ssd, train params: {'input_size': [3, 300, 300], 'class_dim': 1, 'label_dict': {'person': 0}, 'image_count': 128, 'log_feed_image': False, 'pretrained': True, 'pretrained_model_dir': './pretrained-model', 'continue_train': True, 'save_model_dir': './ssd-model', 'model_prefix': 'mobilenet-ssd', 'data_dir': 'c:/pascalvoc', 'mean_rgb': [127.5, 127.5, 127.5], 'file_list': 'train_person.txt', 'eval_file_list': 'eval_person1111.txt', 'label_list': 'label_list11111', 'mode': 'train', 'num_epochs': 400, 'train_batch_size': 64, 'use_gpu': False, 'apply_distort': True, 'apply_expand': True, 'apply_corp': True, 'image_distort_strategy': {'expand_prob': 0.5, 'expand_max_ratio': 4, 'hue_prob': 0.5, 'hue_delta': 18, 'contrast_prob': 0.5, 'contrast_delta': 0.5, 'saturation_prob': 0.5, 'saturation_delta': 0.5, 'brightness_prob': 0.5, 'brightness_delta': 0.125}, 'rsm_strategy': {'learning_rate': 0.001, 'lr_epochs': [40, 60, 80, 100], 'lr_decay': [1, 0.5, 0.25, 0.1, 0.01]}, 'momentum_strategy': {'learning_rate': 0.1, 'decay_steps': 128, 'decay_rate': 0.8}, 'early_stop': {'sample_frequency': 50, 'successive_limit': 3, 'min_loss': 1.28, 'min_curr_map': 0.86}}
2019-09-25 23:02:43,360 - module2.py[line:753] - INFO: create place, use gpu:False
2019-09-25 23:02:43,360 - module2.py[line:757] - INFO: build network and program
2019-09-25 23:02:49,436 - module2.py[line:768] - INFO: build executor and init params
2019-09-25 23:02:49,736 - module2.py[line:729] - INFO: load param from pretrained model
2019-09-25 23:02:49,966 - module2.py[line:790] - INFO: current pass: 0, start read image
2019-09-25 23:03:28,768 - module2.py[line:803] - INFO: Pass 0, trainbatch 1, loss 4.720804214477539 time 38.77 sec
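Diagnosis (how I tried to locate the problem):
1. Modify the label_list file so that it contains more than one class, for example by adding a "cat" class. In fact, a class with any arbitrary name, such as "aaa", has the same effect. The program then trains and validates normally without errors.
2. Comment out the statement
map_eval = fluid.metrics.DetectionMAP(nmsed_out, gt_label, gt_box, difficult,
    train_parameters['class_dim'], overlap_threshold=0.5,
    evaluate_difficult=False, ap_version='11point')
and comment out or modify the several statements that depend on its return value, so that validation no longer uses cur_map_v and accum_map_v. The program then trains and validates normally without errors.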
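One reading that would be consistent with both observations, though nothing in this report confirms it: `fluid.metrics.DetectionMAP` takes a `background_label` argument that defaults to 0, so a configuration with 'label_dict': {'person': 0} and 'class_dim': 1 leaves no class besides the background slot. Under that assumption, a label map that reserves index 0 for background could be built as in the sketch below. `build_label_dict` and the "background" name are hypothetical, and whether the rest of the course code expects class_dim to include background is not known from this report.

```python
def build_label_dict(class_names, background_name="background"):
    """Map class names to indices, reserving index 0 for background.

    Returns (label_dict, class_dim), where class_dim counts the
    background slot as well.
    """
    names = [background_name] + [
        n for n in class_names if n != background_name
    ]
    label_dict = {name: idx for idx, name in enumerate(names)}
    return label_dict, len(names)
```

With `build_label_dict(["person"])`, "person" maps to 1 and class_dim becomes 2, which matches the situation in step 1 where adding any second class name made the error disappear.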