Training BlazeFace on a single GPU takes too long; training on 4 GPUs core dumps
Created by: yja1
When I run CUDA_VISIBLE_DEVICES=1 python tools/train.py -c configs/face_detection/blazeface_nas_v2.yml, training takes too long, and it gets slower as training continues. Watching htop, the CPU appears to be blocking.
When I run CUDA_VISIBLE_DEVICES=1,2,3,4 python tools/train.py -c configs/face_detection/blazeface_nas_v2.yml, the process core dumps.