Adding more data augmentation methods to the config causes training errors and training hangs
Created by: shuxsu
Training with the Mask R-CNN model (ResNet50 + FPN) hangs and makes no progress. Console output:
Done (t=0.07s)
creating index...
index created!
{1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6}
{'sky': 1, 'building': 2, 'terrain': 3, 'person': 4, 'vegetation': 5, 'car': 6}
2019-12-12 11:44:38,129-INFO: 139 samples in file /home/aistudio/data/data17467/cococo/annotations/instance_train.json
2019-12-12 11:44:38,131-INFO: places would be ommited when DataLoader is not iterable
I1212 11:44:38.156224 4327 parallel_executor.cc:421] The number of CUDAPlace, which is used in ParallelExecutor, is 1. And the Program will be copied 1 copies
I1212 11:44:38.196641 4327 graph_pattern_detector.cc:96] --- detected 28 subgraphs
I1212 11:44:38.215721 4327 graph_pattern_detector.cc:96] --- detected 25 subgraphs
I1212 11:44:38.257105 4327 build_strategy.cc:363] SeqOnlyAllReduceOps:0, num_trainers:1
I1212 11:44:38.307612 4327 parallel_executor.cc:285] Inplace strategy is enabled, when build_strategy.enable_inplace = True
I1212 11:44:38.331254 4327 parallel_executor.cc:368] Garbage collection strategy is enabled, when FLAGS_eager_delete_tensor_gb = 0