Created by: sunxl1988
GPU: one P40
- Faster R-CNN (batch size 1): 0.298 <-> 0.336 (previous)
- YOLOv3 (batch size 8): 0.1883 <-> 0.201 (previous)
Changes:
- Add prefetch for the data loader
- Split the config yml into 3 parts (reader, architecture, optimize)
- Add an incremental config method
- Adapt Mask R-CNN and Mask R-CNN FPN
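
To illustrate the idea of a three-part config with incremental overrides, here is a minimal sketch. It is not the code from this PR: the `merge_config` helper, the key names, and all values are hypothetical, and plain dicts stand in for the parsed yml files.

```python
from copy import deepcopy


def merge_config(base, override):
    """Return a new dict where `override` is recursively layered on `base`."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars, lists, and new keys replace whatever was there.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical contents of the three parts (reader, architecture, optimize).
reader = {"reader": {"batch_size": 1, "prefetch": True}}
architecture = {"architecture": {"name": "MaskRCNN", "fpn": False}}
optimize = {"optimize": {"learning_rate": 0.01}}

# Build the full config by layering the three parts.
config = {}
for part in (reader, architecture, optimize):
    config = merge_config(config, part)

# An incremental config only states what differs from the base.
fpn_variant = merge_config(config, {"architecture": {"fpn": True}})
```

Under this assumption, a variant such as the FPN model only needs to list the keys it changes, and the base config is left unmodified.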