Why Larger Batch Size Slows Training
Created by: sbl1996
I am training WRN-28-10 on CIFAR10 using PaddleClas. Once the batch size exceeds 128, training gets slower as the batch size grows. A detailed comparison is shown below.
| Batch Size | Time per Epoch (s) |
|---|---|
| 32 | 82.2 |
| 64 | 72.8 |
| 128 | 68.5 |
| 256 | 74.1 |
| 512 | 86.4 |
| 1024 | 110.5 |
The time of the 2nd epoch is reported, so warm-up time is not counted. Repeated runs gave consistent results.
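For reference, the per-epoch time is measured roughly like this (a simplified sketch, not the exact repro script: the real run uses WRN-28-10 built by PaddleClas, and the model, optimizer, and `num_workers` below are placeholders):

```python
import time

import paddle
from paddle.io import DataLoader
from paddle.vision.datasets import Cifar10
from paddle.vision.transforms import ToTensor

# Placeholder model; the actual run uses WRN-28-10 from PaddleClas.
model = paddle.vision.models.resnet18(num_classes=10)
loss_fn = paddle.nn.CrossEntropyLoss()
opt = paddle.optimizer.Momentum(learning_rate=0.1, parameters=model.parameters())

train_set = Cifar10(mode="train", transform=ToTensor())

for batch_size in [32, 64, 128, 256, 512, 1024]:
    loader = DataLoader(train_set, batch_size=batch_size,
                        shuffle=True, num_workers=4)
    model.train()
    for epoch in range(2):
        start = time.perf_counter()
        for images, labels in loader:
            loss = loss_fn(model(images), labels)
            loss.backward()
            opt.step()
            opt.clear_grad()
        elapsed = time.perf_counter() - start
    # Only the 2nd epoch is reported, so warm-up is excluded.
    print(f"batch_size={batch_size}: 2nd epoch took {elapsed:.1f}s")
```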
This behavior is strange and unexpected. Could you help me find the reason?
Code to reproduce is here.
Thank you very much!