Training stops with an out-of-memory error at [/root/paddle_paddle/paddle/fluid/platform/profiler.cc:114]
Created by: ddokupil
When training on a GPU, training terminates after a couple of thousand iterations. In this example it stopped after the 17518th iteration. On a different machine it stops after some other iteration, but on any given machine it always stops after the same one. You can see what happened in the attached log: tail_paddleresnet50v2.log