Out-of-memory problem!
Created by: NextGuido
When training the DB text detection network, I frequently run into out-of-memory errors.
The contents of the configuration file det_r50_vd_db.yml are as follows:
Global:
  algorithm: DB
  use_gpu: true
  epoch_num: 1200
  log_smooth_window: 20
  print_batch_step: 30
  save_model_dir: ./output/det_db/
  save_epoch_step: 200
  eval_batch_step: 10000
  train_batch_size_per_card: 2
  test_batch_size_per_card: 1
  image_shape: [3, 640, 640]
  reader_yml: ./configs/det/det_db_chinese_reader.yml
  pretrain_weights: ./pretrain_models/ResNet50_vd_ssld_pretrained/
  save_res_path: ./output/det_db/predicts_db.txt
  checkpoints:
  save_inference_dir:
The contents of the configuration file det_db_chinese_reader.yml are as follows:
TrainReader:
  reader_function: ppocr.data.det.dataset_traversal,TrainReader
  process_function: ppocr.data.det.db_process,DBProcessTrain
  num_workers: 4
  img_set_dir: ""
  label_file_path: /home/aistudio/data/data39969/mtwi_2018_split/train.txt
EvalReader:
  reader_function: ppocr.data.det.dataset_traversal,EvalTestReader
  process_function: ppocr.data.det.db_process,DBProcessTest
  img_set_dir: ""
  label_file_path: /home/aistudio/data/data39969/mtwi_2018_split/test.txt
  test_image_shape: [736, 1280]
TestReader:
  reader_function: ppocr.data.det.dataset_traversal,EvalTestReader
  process_function: ppocr.data.det.db_process,DBProcessTest
  infer_img:
  img_set_dir: ""
  label_file_path: /home/aistudio/data/data39969/icpr_mtwi_task2/test.txt
  test_image_shape: [736, 1280]
  do_eval: True
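The training data comes from https://tianchi.aliyun.com/competition/entrance/231685/information. I split it manually into a training set and a validation set at a 9:1 ratio (9043:1005). I have tried batch_size values from 2 to 16, and the out-of-memory problem occurs every time. With num_workers=1 training does run, but the iterations are far too slow. Is there a good way to solve this?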
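For reference, the 9:1 split described above could be reproduced with a small script along these lines. This is only a sketch, not the procedure actually used (the split was done manually): the function name `split_labels`, the fixed seed, and the placeholder label lines are all hypothetical, and it assumes one annotation per line in the label file.

```python
import random


def split_labels(lines, train_ratio=0.9, seed=0):
    """Shuffle label lines and split them into (train, validation) lists."""
    shuffled = list(lines)                   # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)    # seeded shuffle for a repeatable split
    n_train = int(len(shuffled) * train_ratio)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical placeholder lines standing in for the 10048 real annotations.
labels = [f"img_{i}.jpg\t[]" for i in range(10048)]
train, val = split_labels(labels)
print(len(train), len(val))  # 9043 1005
```

With 10048 annotated images in total, truncating 90% gives 9043 training lines and leaves 1005 for validation, which matches the counts quoted above.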