Created by: heavengate
Refine DataLoader to support multi-processing
- add `paddle.io.Dataset` base class
- add `paddle.io.BatchSampler`
- add `paddle.io.DataLoader`, which supports multi-process loading when initialized through `__init__`
  - `num_workers = 0` for single-process mode
  - `num_workers > 0` for multi-process mode
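
As a rough illustration of how a map-style dataset and a batch sampler fit together in the single-process case, here is a minimal pure-Python sketch. `SquaresDataset`, `SimpleBatchSampler` and `load_single_process` are hypothetical stand-ins written for this example; they are not the `paddle.io` classes and do not reproduce their signatures.

```python
class SquaresDataset:
    """Hypothetical map-style dataset: maps an index to one sample."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, idx):
        return idx, idx * idx

    def __len__(self):
        return self.n


class SimpleBatchSampler:
    """Hypothetical sampler: yields one list of sample indices per batch."""

    def __init__(self, dataset, batch_size, drop_last=False):
        self.dataset = dataset
        self.batch_size = batch_size
        self.drop_last = drop_last

    def __iter__(self):
        indices = list(range(len(self.dataset)))
        for start in range(0, len(indices), self.batch_size):
            batch = indices[start:start + self.batch_size]
            # Skip the trailing short batch when drop_last is requested.
            if len(batch) < self.batch_size and self.drop_last:
                break
            yield batch


def load_single_process(dataset, batch_sampler):
    """Single-process path: fetch every batch in the calling process."""
    for batch_indices in batch_sampler:
        yield [dataset[i] for i in batch_indices]


dataset = SquaresDataset(10)
batches = list(load_single_process(dataset, SimpleBatchSampler(dataset, batch_size=4)))
```

In a multi-process loader the same index lists would be handed to worker processes instead of being resolved in the caller, which is why the sampler is kept separate from the dataset.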
- multi-process workflow
data order keeping
- multi-process workers keep data order using `reorder_dict` and an `index` check
- `py_reader` can keep order by setting `is_ordered=True` once https://github.com/PaddlePaddle/Paddle/pull/22699 is merged
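
The general idea of restoring order with a reorder dictionary and an index check can be sketched as follows. This is an assumption about the technique, not the code in this PR: only the name `reorder_dict` comes from the description above, while `in_order`, `next_index` and the sample data are hypothetical.

```python
def in_order(arrivals):
    """Yield batches in index order from (batch_index, batch) pairs
    that may arrive in any order, as they would from parallel workers."""
    reorder_dict = {}  # batches that arrived before their turn
    next_index = 0     # index of the batch the consumer expects next
    for batch_index, batch in arrivals:
        reorder_dict[batch_index] = batch
        # Index check: release every buffered batch that is now contiguous.
        while next_index in reorder_dict:
            yield reorder_dict.pop(next_index)
            next_index += 1


# Simulated out-of-order arrival from workers.
arrivals = [(1, "b1"), (0, "b0"), (3, "b3"), (2, "b2")]
ordered = list(in_order(arrivals))
```

A batch that arrives early waits in the dictionary until all earlier indices have been released, so the consumer sees the same order as a single-process loader.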
speed test
- CUDA Driver version: 418.39
- CUDA: V9.0.176
- cuDNN: 7.5.1
- GPU: 8 * P40
- CPU: 24 Core Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
- original DataLoader configs:
  - static graph mode: trained with ParallelExecutor, with the multiprocess implementation in the model readers disabled
  - dynamic graph mode: trained with `paddle.distributed.launch`, which initializes a DataLoader for each GPU; each DataLoader starts a subprocess
- multiprocess DataLoader configs:
  - static graph mode: trained with ParallelExecutor, with the multiprocess implementation in the model readers disabled and the multiprocess DataLoader set to `num_workers=16`
  - dynamic graph mode: trained with `paddle.distributed.launch`, which initializes a DataLoader for each GPU, with `num_workers=2` set for each DataLoader
- static graph
Model | batch_size | original DataLoader | multiprocess DataLoader | speed up ratio |
---|---|---|---|---|
ResNet50 | 8*32 | 711ms/step | 266ms/step | 167.3% |
YOLOv3-DarkNet53 | 8*8 | 3155ms/step | 708ms/step | 345.6% |
TSM-ResNet50 | 8*16 | 8665ms/step | 1669ms/step | 419.2% |
BMN | 8*16 | 2393ms/step | 772ms/step | 209.9% |
- dynamic graph
Model | batch_size | original DataLoader | multiprocess DataLoader | speed up ratio |
---|---|---|---|---|
ResNet50 | 8*32 | 269ms/step | 267ms/step | - |
YOLOv3-DarkNet53 | 8*8 | 751ms/step | 623ms/step | 20.5% |
TSM-ResNet50 | 8*16 | 3550ms/step | 1767ms/step | 89.6% |
BMN | 8*16 | 831ms/step | 785ms/step | 5.8% |
- ResNet50 and TSM used `xmap_reader`, while YOLOv3 and BMN did not
cn doc: https://github.com/PaddlePaddle/FluidDoc/pull/1952
Document
- DataLoader
- Dataset
- BatchSampler