Segmentation fault (core dumped)
Created by: chenyangMl
Hello, I ran into the following problem when running the training command and would like to ask for help. The problem occurs both when I specify the GPU through FLAGS_selected_gpus and when I change the line place = fluid.CUDAPlace(3) if use_gpu else fluid.CPUPlace() in train.py. It also occurs when running with use_gpu=False.

FLAGS_selected_gpus=3 && python3 tools/train.py -c configs/det/det_mv3_db.yml -o Optimizer.base_lr=0.0001
grep: warning: GREP_OPTIONS is deprecated; please use an alias or script
2020-05-18 09:29:29,797-INFO: {'Architecture': {'function': 'ppocr.modeling.architectures.det_model,DetModel'}, 'TestReader': {'reader_function': 'ppocr.data.det.dataset_traversal,EvalTestReader', 'single_img_path': None, 'img_set_dir': './train_data/icdar2015/text_localization/', 'label_file_path': './train_data/icdar2015/text_localization/test_icdar2015_label.txt', 'test_image_shape': [736, 1280], 'process_function': 'ppocr.data.det.db_process,DBProcessTest', 'do_eval': True}, 'Backbone': {'function': 'ppocr.modeling.backbones.det_mobilenet_v3,MobileNetV3', 'model_name': 'large', 'scale': 0.5}, 'Head': {'k': 50, 'function': 'ppocr.modeling.heads.det_db_head,DBHead', 'inner_channels': 96, 'model_name': 'large', 'out_channels': 2}, 'Optimizer': {'beta1': 0.9, 'function': 'ppocr.optimizer,AdamDecay', 'base_lr': 0.0001, 'beta2': 0.999}, 'EvalReader': {'test_image_shape': [736, 1280], 'reader_function': 'ppocr.data.det.dataset_traversal,EvalTestReader', 'img_set_dir': './train_data/icdar2015/text_localization/', 'process_function': 'ppocr.data.det.db_process,DBProcessTest', 'label_file_path': './train_data/icdar2015/text_localization/test_icdar2015_label.txt'}, 'Loss': {'function': 'ppocr.modeling.losses.det_db_loss,DBLoss', 'balance_loss': True, 'beta': 10, 'alpha': 5, 'ohem_ratio': 3, 'main_loss_type': 'DiceLoss'}, 'TrainReader': {'reader_function': 'ppocr.data.det.dataset_traversal,TrainReader', 'num_workers': 8, 'img_set_dir': './train_data/icdar2015/text_localization/', 'process_function': 'ppocr.data.det.db_process,DBProcessTrain', 'label_file_path': './train_data/icdar2015/text_localization/train_icdar2015_label.txt'},
'PostProcess': {'unclip_ratio': 1.5, 'max_candidates': 1000, 'function': 'ppocr.postprocess.db_postprocess,DBPostProcess', 'thresh': 0.3, 'box_thresh': 0.7}, 'Global': {'save_epoch_step': 200, 'save_inference_dir': None, 'eval_batch_step': 5000, 'log_smooth_window': 20, 'algorithm': 'DB', 'epoch_num': 1200, 'use_gpu': True, 'train_batch_size_per_card': 16, 'image_shape': [3, 640, 640], 'save_model_dir': './output/det_db/', 'save_res_path': './output/det_db/predicts_db.txt', 'checkpoints': None, 'pretrain_weights': './pretrain_models/MobileNetV3_large_x0_5_pretrained/', 'test_batch_size_per_card': 16, 'reader_yml': './configs/det/det_db_icdar15_reader.yml', 'print_batch_step': 2}}
3 640 640 3 640 640
import ujson error: No module named 'ujson' use json
2020-05-18 09:29:33,067-INFO: places would be ommited when DataLoader is not iterable
W0518 09:29:33.928460 5607 device_context.cc:237] Please NOTE: device: 3, CUDA Capability: 75, Driver API Version: 10.2, Runtime API Version: 10.0
W0518 09:29:33.932370 5607 device_context.cc:245] device: 3, cuDNN Version: 7.5.
W0518 09:29:33.932396 5607 device_context.cc:271] WARNING: device: 3. The installed Paddle is compiled with CUDNN 7.6, but CUDNN version in your machine is 7.5, which may cause serious incompatible bug. Please recompile or reinstall Paddle with compatible CUDNN version.
2020-05-18 09:29:35,015-INFO: Loading parameters from ./pretrain_models/MobileNetV3_large_x0_5_pretrained/...
2020-05-18 09:29:35,015-WARNING: ./pretrain_models/MobileNetV3_large_x0_5_pretrained/.pdparams not found, try to load model file saved with [ save_params, save_persistables, save_vars ]
2020-05-18 09:29:35,015-WARNING: ./pretrain_models/MobileNetV3_large_x0_5_pretrained/.pdparams not found, try to load model file saved with [ save_params, save_persistables, save_vars ]
2020-05-18 09:29:35,206-INFO: Finish initing model from ./pretrain_models/MobileNetV3_large_x0_5_pretrained/
I0518 09:29:35.251972 5607 parallel_executor.cc:440] The Program will be executed on CUDA using ParallelExecutor, 8 cards are used, so 8 programs are executed in parallel.
W0518 09:29:48.223284 5607 init.cc:209] Warning: PaddlePaddle catches a failure signal, it may not work properly
W0518 09:29:48.223331 5607 init.cc:211] You could check whether you killed PaddlePaddle thread/process accidentally or report the case to PaddlePaddle
W0518 09:29:48.223345 5607 init.cc:214] The detail failure signal is: