softmax_with_cross_entropy has bugs.
Created by: lcy-seso
The softmax with cross entropy operator does not correctly handle the gradients it receives, so it fails gradient checks after this PR: https://github.com/PaddlePaddle/Paddle/pull/5027.
Related to https://github.com/PaddlePaddle/Paddle/issues/5101
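
For reference, here is a minimal sketch of how a gradient check can expose this kind of problem. It is plain Python and does not use Paddle's operator code or its test framework. It assumes, without confirming that this is the actual cause here, that the backward pass ignores the gradient arriving from the downstream operator. All function names (`forward`, `backward`, `backward_ignoring_upstream`, `numeric_grad`) are hypothetical and exist only for this illustration.

```python
import math


def softmax(logits):
    # Subtract the max for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(logits, label):
    # Cross entropy of softmax(logits) against a hard label index.
    return -math.log(softmax(logits)[label])


def backward(logits, label, upstream):
    # d(loss)/d(logits) = softmax - onehot, scaled by the gradient
    # received from the downstream operator.
    probs = softmax(logits)
    return [upstream * (p - (1.0 if i == label else 0.0))
            for i, p in enumerate(probs)]


def backward_ignoring_upstream(logits, label, upstream):
    # Hypothetical buggy variant: behaves as if the received gradient were 1.
    return backward(logits, label, 1.0)


def numeric_grad(logits, label, upstream, eps=1e-5):
    # Central differences on upstream * loss, used as the reference.
    grads = []
    for i in range(len(logits)):
        plus = list(logits)
        minus = list(logits)
        plus[i] += eps
        minus[i] -= eps
        diff = forward(plus, label) - forward(minus, label)
        grads.append(upstream * diff / (2 * eps))
    return grads


def max_abs_diff(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


logits = [0.5, -1.2, 2.0]
label = 2
upstream = 0.25  # any value other than 1.0 exposes the difference

reference = numeric_grad(logits, label, upstream)
good = max_abs_diff(backward(logits, label, upstream), reference)
bad = max_abs_diff(backward_ignoring_upstream(logits, label, upstream), reference)
print("correct backward, max abs diff:", good)
print("ignoring upstream, max abs diff:", bad)
```

With an upstream gradient of 1.0 both variants agree with the numeric reference, so a check that only uses the loss as the final output would not catch the difference.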