📚 batch norm notebook

03197f43 · Varuna Jayasiri · 983286e2
6 changed files
--- a/docs/normalization/batch_norm/index.html
+++ b/docs/normalization/batch_norm/index.html
--- a/docs/normalization/batch_norm/mnist.html
+++ b/docs/normalization/batch_norm/mnist.html
--- a/labml_nn/normalization/batch_norm/__init__.py
+++ b/labml_nn/normalization/batch_norm/__init__.py
@@ -76,6 +76,9 @@ like $Wu + b$ the bias parameter $b$ gets cancelled due to normalization.
 So you can and should omit bias parameter in linear transforms right before the
 batch normalization.

+Batch normalization also makes the back propagation invariant to the scale of the weights.
+And empirically it improves generalization, so it has regularization effects too.
+
 ## Inference

 We need to know $\mathbb{E}[x^{(k)}]$ and $Var[x^{(k)}]$ in order to
@@ -84,6 +87,12 @@ So during inference, you either need to go through the whole (or part of) datase
 and find the mean and variance, or you can use an estimate calculated during training.
 The usual practice is to calculate an exponential moving average of
 mean and variance during training phase and use that for inference.
+
+Here's [the training code](mnist.html) and a notebook for training
+a CNN classifier that uses batch normalization for the MNIST dataset.
+
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/normalization/batch_norm/mnist.ipynb)
+[![View Run](https://img.shields.io/badge/labml-experiment-brightgreen)](https://web.lab-ml.com/run?uuid=011254fe647011ebbb8e0242ac1c0002)
 """

 import torch
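
Review note on the docstring additions in this file: the claims about bias cancellation, weight-scale invariance, and the moving-average estimate for inference can be checked with a small sketch. It is plain Python rather than the repository's PyTorch module. The names `batch_norm`, `linear`, `max_gap` and `ema_update`, the sample inputs, and the `momentum=0.1` convention are assumptions made for illustration, not code from this commit. It only shows that the normalized output does not change when the weight is rescaled, which is the property the gradient-scale claim rests on.

```python
import math


def batch_norm(xs, eps=1e-5):
    """Normalize one feature over a batch to zero mean and unit variance."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)  # biased batch variance
    return [(x - mean) / math.sqrt(var + eps) for x in xs]


def linear(us, w, b=0.0):
    """Scalar linear transform w * u + b applied to every item in the batch."""
    return [w * u + b for u in us]


def max_gap(a, b):
    """Largest absolute element-wise difference between two batches."""
    return max(abs(p - q) for p, q in zip(a, b))


u = [0.5, -1.2, 3.0, 0.7]  # hypothetical batch of inputs
base = batch_norm(linear(u, w=2.0))

# A bias shifts the batch mean by the same amount, so it cancels out.
bias_gap = max_gap(base, batch_norm(linear(u, w=2.0, b=5.0)))

# Scaling the weight scales the batch standard deviation equally, so the
# output is unchanged up to the small effect of eps.
scale_gap = max_gap(base, batch_norm(linear(u, w=20.0)))


def ema_update(running, batch_stat, momentum=0.1):
    """Exponential moving average of a batch statistic (assumed convention)."""
    return (1.0 - momentum) * running + momentum * batch_stat


# Track a running mean over three hypothetical training batches.
running_mean = 0.0
for batch_mean in [1.0, 1.0, 1.0]:
    running_mean = ema_update(running_mean, batch_mean)

print(bias_gap, scale_gap, running_mean)
```

Both gaps come out close to zero. The scale gap is not exactly zero only because `eps` is fixed while the variance grows with the weight.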

--- a/labml_nn/normalization/batch_norm/mnist.ipynb
+++ b/labml_nn/normalization/batch_norm/mnist.ipynb
--- a/labml_nn/normalization/batch_norm/mnist.py
+++ b/labml_nn/normalization/batch_norm/mnist.py
@@ -2,7 +2,8 @@
 ---
 title: MNIST Experiment to try Batch Normalization
 summary: >
-  This is a simple model for MNIST digit classification that uses batch normalization
+  This trains a simple convolutional neural network that uses batch normalization
+  to classify MNIST digits.
 ---

 # MNIST Experiment for Batch Normalization

--- a/setup.py
+++ b/setup.py
@@ -5,7 +5,7 @@ with open("readme.md", "r") as f:

 setuptools.setup(
    name='labml-nn',
-    version='0.4.84',
+    version='0.4.85',
    author="Varuna Jayasiri, Nipun Wijerathne",
    author_email="vpjayasiri@gmail.com, hnipun@gmail.com",
    description="A collection of PyTorch implementations of neural network architectures and layers.",