We Should Have The Alternative BN Implementation by Our Own (#14580) · Issue · PaddlePaddle / Paddle

We Should Have The Alternative BN Implementation by Our Own

Created by: KaiyuYue

Hi there,

This issue is tracked by the project Simple Baselines for Human Pose Estimation in Fluid.

If cuDNN's batch normalization (BN) function is used when training for the Human Pose Estimation task, training fails to converge on Tesla P40 / P100 / V100 GPU cards. This is a known issue, and it occurs in both PyTorch and PaddlePaddle.

PyTorch has its own BN implementation, so the problem can be worked around there: setting torch.backends.cudnn.enabled to False disables the use of cuDNN BN.
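For reference, the PyTorch workaround looks roughly like the sketch below. The layer size and input shape are made up for illustration. Note that this flag turns cuDNN off for every op in the process, not only for BN.

```python
import torch
import torch.nn as nn

# Disable cuDNN globally; BatchNorm then runs on PyTorch's native kernels.
# This affects all cuDNN-backed ops (convolutions too), not just BN.
torch.backends.cudnn.enabled = False

# Illustrative layer and input (sizes are arbitrary, not from the project).
bn = nn.BatchNorm2d(num_features=16)
x = torch.randn(4, 16, 32, 32)

# Move to the GPU when one is present; the snippet also runs on CPU.
if torch.cuda.is_available():
    bn, x = bn.cuda(), x.cuda()

y = bn(x)  # forward pass in training mode, without cuDNN
```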

PaddlePaddle doesn't allow us to make this change. Luckily, training with 1 image on each GPU card eases the problem. So I think we should have our own alternative BN implementation.
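To make the request concrete, here is a minimal NumPy sketch of the training-mode BN forward computation that a non-cuDNN implementation would have to reproduce. It is only a reference for the math, not a proposed Paddle kernel; the function name batch_norm_forward, the NCHW layout, and the eps value are my assumptions.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Training-mode BN for an NCHW array (hypothetical reference helper).

    Statistics are taken per channel over the batch and spatial axes.
    With 1 image per card they reduce to statistics over H x W only.
    """
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)  # biased variance
    x_hat = (x - mean) / np.sqrt(var + eps)
    # Per-channel scale and shift, broadcast over N, H and W.
    return gamma.reshape(1, -1, 1, 1) * x_hat + beta.reshape(1, -1, 1, 1)

# Illustrative input: 8 images, 4 channels, 16 x 16 feature maps.
rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(8, 4, 16, 16))
y = batch_norm_forward(x, gamma=np.ones(4), beta=np.zeros(4))
```

With gamma set to 1 and beta set to 0, each output channel should have approximately zero mean and unit variance, which is an easy check to run against any alternative kernel.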

Related issues in PyTorch: Issue-0 | Issue-1
