docs(mge/module): add note about definition of momentum in BatchNorm

GitOrigin-RevId: 30a9aec20f3b6a8518aa5bddc0817fc4cbfaf00b

8d507cc3 · Megvii Engine Team · 056fd6bc · 8d507cc3

Showing 11 additions and 0 deletions

imperative/python/megengine/module/batchnorm.py +11 -0

--- a/imperative/python/megengine/module/batchnorm.py
+++ b/imperative/python/megengine/module/batchnorm.py
@@ -280,6 +280,17 @@ class BatchNorm2d(_BatchNorm):
    statistics on `(N, H, W)` slices, it's common terminology to call this
    Spatial Batch Normalization.

+    .. note::
+
+        The update formula for ``running_mean`` and ``running_var`` (taking ``running_mean`` as an example) is
+
+        .. math::
+
+            \textrm{running_mean} = \textrm{momentum} \times \textrm{running_mean} + (1 - \textrm{momentum}) \times \textrm{batch_mean}
+
+        which may be defined differently in other frameworks. Most notably, a ``momentum`` of 0.1 in PyTorch
+        is equivalent to a ``momentum`` of 0.9 here.
+
    Args:
        num_features: usually :math:`C` from an input of shape
            :math:`(N, C, H, W)` or the highest ranked dimension of an input
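
Reviewer's sketch, not part of the commit: the note added in this diff can be checked numerically with plain Python floats. The two helper functions below are hypothetical names introduced only for illustration; they are not MegEngine or PyTorch APIs. The first follows the formula in the note, the second follows the convention documented for PyTorch's batch norm layers, where ``momentum`` weights the new batch statistic.

```python
import math


def update_note_convention(running_mean, batch_mean, momentum):
    """Hypothetical helper: the formula from the note, where momentum weights the OLD running value."""
    return momentum * running_mean + (1 - momentum) * batch_mean


def update_pytorch_convention(running_mean, batch_mean, momentum):
    """Hypothetical helper: PyTorch's documented convention, where momentum weights the NEW batch value."""
    return (1 - momentum) * running_mean + momentum * batch_mean


# Arbitrary illustrative values, not taken from any real model.
running_mean, batch_mean = 0.0, 1.0

here = update_note_convention(running_mean, batch_mean, momentum=0.9)
torch_like = update_pytorch_convention(running_mean, batch_mean, momentum=0.1)

# The two conventions agree up to floating-point rounding when the momenta sum to 1.
same_update = math.isclose(here, torch_like)
print(same_update)
```

When porting a configuration between the two conventions, this suggests converting with ``momentum_here = 1 - momentum_pytorch`` rather than copying the value unchanged.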