Abnormal learning curve bumping at early batches of each epoch during DS2 training. (#100) · Issue · PaddlePaddle / models

Created by: xinghai-sun

After merging PR #74, we observed the following abnormal learning curve:

The figure plots the training cost. Note that the tail of the curve contains many spikes, each located exactly at the first batch of an epoch.
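To check whether the spikes really line up with epoch boundaries, one option is to compare the first-batch cost of each epoch with the mean cost of the remaining batches in that epoch. The snippet below is only a rough sketch: the helper `first_batch_spike_ratios`, the `batches_per_epoch` parameter, and the cost values are hypothetical and are not taken from the DS2 training code or logs.

```python
from statistics import mean


def first_batch_spike_ratios(costs, batches_per_epoch):
    """For each epoch, return first-batch cost / mean cost of the other batches.

    `costs` is a flat list of per-batch training costs in training order.
    Assumes the mean cost of the remaining batches is non-zero.
    """
    ratios = []
    for start in range(0, len(costs), batches_per_epoch):
        epoch = costs[start:start + batches_per_epoch]
        if len(epoch) < 2:
            continue  # trailing partial epoch with nothing to compare against
        ratios.append(epoch[0] / mean(epoch[1:]))
    return ratios


# Synthetic costs (NOT data from this issue): 3 epochs of 4 batches each,
# with the first batch of every epoch twice as costly as the rest.
costs = [8.0, 4.0, 4.0, 4.0,
         6.0, 3.0, 3.0, 3.0,
         4.0, 2.0, 2.0, 2.0]
ratios = first_batch_spike_ratios(costs, batches_per_epoch=4)
print(ratios)
```

Ratios consistently well above 1 would suggest that the spikes are tied to the start of each epoch rather than scattered at random.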

In addition, the phenomenon is hard to reproduce on a small dataset.