<li><strong>input</strong> (<em>LayerOutput</em>) – batch normalization input. It should use a linear activation,
because batch_normalization applies its own activation internally.</li>
<li><strong>batch_norm_type</strong> (<em>None|string, None or "batch_norm" or "cudnn_batch_norm"</em>) – Two implementations are available: batch_norm and
cudnn_batch_norm. batch_norm supports both CPU and GPU, while
cudnn_batch_norm requires cuDNN v4 or later (>=v4).
However, cudnn_batch_norm is faster and needs less memory
...
...
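Conceptually, a batch normalization layer normalizes its input to zero mean and unit variance over the batch, applies a learned scale and shift, and only then applies the activation. The following is a minimal pure-Python sketch of that computation (the helper name <code>batch_norm</code> and its defaults are illustrative, not the library's API); it also shows why the layer's input should itself be a linear activation: the nonlinearity is applied inside batch normalization.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5, act=lambda x: x):
    # Normalize a 1-D batch to zero mean / unit variance, then scale by
    # gamma, shift by beta, and finally apply the activation `act`.
    # The activation comes *after* normalization, mirroring the note above
    # that the layer's input should use a linear activation.
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n
    return [act(gamma * (x - mean) / math.sqrt(var + eps) + beta)
            for x in batch]

out = batch_norm([1.0, 2.0, 3.0, 4.0])
```

With the default gamma=1, beta=0, the output batch has (approximately) zero mean and unit variance.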
and <span class="math">\(out\)</span> is a (batchSize x dataDim) output vector.<
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>output_max_index</strong> (<em>bool|None</em>) – True to output the sequence's max index instead of the max
value. None means use the default value in the proto.</td>
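The <code>output_max_index</code> flag can be summarized with a small sketch (the helper name <code>seq_max</code> is hypothetical, not part of the library):

```python
def seq_max(seq, output_max_index=True):
    # When output_max_index is True, return the *index* of the maximum
    # element; otherwise return the maximum *value* itself.
    if output_max_index:
        return max(range(len(seq)), key=seq.__getitem__)
    return max(seq)

idx = seq_max([0.1, 0.9, 0.3])          # index of the largest element
val = seq_max([0.1, 0.9, 0.3], False)   # the largest value itself
```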