Issue with BatchNorm layers
Created by: NTU-P04922004
Hi,
Recently, I found that a model with BatchNorm layers can have weird results in MDL.
I have two CNN models trained on CIFAR-10: one with a BatchNorm layer following each convolutional layer, and another one without any BatchNorm layer.
In the experiment, I used each model to run inference on the same CIFAR-10 image several times (a minimal sketch of this check is included at the end of this post). The model without BatchNorm layers always produces the same Softmax probabilities, as expected; however, the model with BatchNorm layers produces a different result on each inference, and the Softmax probability of one class converges to 1.0.
Is this a known issue? Any response would be appreciated, thanks!
For your reference:
- Model definitions of both models can be found here: https://gist.github.com/NTU-P04922004/6c731d6ab978fa266d09540c82c12fe1
- The model with no BatchNorm layer produces the following result:

  | Class | Softmax Probability |
  | --- | --- |
  | 1 | 0.001315519 |
  | 2 | 1.69E-04 |
  | 3 | 0.9699939 |
  | 4 | 0.003979825 |
  | 5 | 2.60E-04 |
  | 6 | 6.70E-04 |
  | 7 | 0.020736642 |
  | 8 | 7.11E-04 |
  | 9 | 0.001030057 |
  | 10 | 0.00113452 |
- The model with BatchNorm layers produces the following results:

  First Inference:

  | Class | Softmax Probability |
  | --- | --- |
  | 1 | 1.29E-06 |
  | 2 | 6.46E-06 |
  | 3 | 4.02E-06 |
  | 4 | 0.012168308 |
  | 5 | 0.48621887 |
  | 6 | 9.40E-04 |
  | 7 | 0.49521634 |
  | 8 | 0.001361827 |
  | 9 | 1.84E-06 |
  | 10 | 0.004080651 |

  Second Inference:

  | Class | Softmax Probability |
  | --- | --- |
  | 1 | 9.34E-34 |
  | 2 | 9.49E-30 |
  | 3 | 9.49E-30 |
  | 4 | 2.08E-11 |
  | 5 | 0.006708719 |
  | 6 | 1.14E-17 |
  | 7 | 0.99329126 |
  | 8 | 1.07E-17 |
  | 9 | 7.82E-33 |
  | 10 | 7.24E-14 |

  Third Inference:

  | Class | Softmax Probability |
  | --- | --- |
  | 1 | 0 |
  | 2 | 0 |
  | 3 | 0 |
  | 4 | 0 |
  | 5 | 3.50E-22 |
  | 6 | 0 |
  | 7 | 1 |
  | 8 | 0 |
  | 9 | 0 |
  | 10 | 0 |
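In case it helps, here is a minimal sketch of the kind of repeated-inference check described above. It uses PyTorch rather than MDL, and the tiny model below is only an illustrative stand-in for the real model definitions in the gist, but it shows the behavior I would expect: with BatchNorm in inference mode, running the same image repeatedly gives identical Softmax probabilities.

```python
# Minimal sketch (PyTorch, not MDL): run the same image through a small
# CNN with BatchNorm several times and compare the Softmax outputs.
# The model here is a stand-in; the real definitions are in the gist above.
import torch
import torch.nn as nn


class TinyBNNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(16, num_classes)

    def forward(self, x):
        x = self.features(x).flatten(1)
        return self.classifier(x)


model = TinyBNNet()
model.eval()  # inference mode: BatchNorm uses fixed running statistics

image = torch.randn(1, 3, 32, 32)  # stand-in for a CIFAR-10 image

with torch.no_grad():
    for i in range(3):
        probs = torch.softmax(model(image), dim=1)
        print(f"Inference {i + 1}:", probs.squeeze().tolist())

# All three printouts are identical here; the MDL results above are not.
```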