Initializer¶
Initializer¶
class paddle.v2.fluid.initializer.Initializer

Base class for variable initializers.

Defines the common interface of variable initializers. Initializers add operations to the init program that are used to initialize variables. Users should not use this class directly, but should use one of its implementations instead.
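Initializers are normally not instantiated on their own but attached to a layer's parameters. The minimal sketch below shows that pattern; the fluid.ParamAttr(initializer=...) wiring and the fluid.layers.data/fc argument names are assumptions about the surrounding fluid API, not something defined in this reference:

    import paddle.v2.fluid as fluid

    # Assumed layer wiring: attach an initializer to the weight of a
    # fully connected layer through its param_attr.
    x = fluid.layers.data(name='x', shape=[32], dtype='float32')
    y = fluid.layers.fc(
        input=x,
        size=64,
        param_attr=fluid.ParamAttr(
            initializer=fluid.initializer.UniformInitializer(low=-0.5, high=0.5)))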
ConstantInitializer¶
class paddle.v2.fluid.initializer.ConstantInitializer(value=0.0)

Implements the constant initializer.
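For example, a constant initializer is commonly used for biases. The sketch below only constructs the initializer with the documented value argument; the commented-out ParamAttr wiring is an assumption about the surrounding fluid API:

    import paddle.v2.fluid as fluid

    # Every element of the parameter is set to `value` (default 0.0).
    zero_bias = fluid.initializer.ConstantInitializer(value=0.0)

    # Assumed wiring: use it for a layer's bias through ParamAttr.
    # bias_attr = fluid.ParamAttr(initializer=zero_bias)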
UniformInitializer¶
class paddle.v2.fluid.initializer.UniformInitializer(low=-1.0, high=1.0, seed=0)

Implements the random uniform distribution initializer.
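A sketch of constructing the uniform initializer with an explicit range and a fixed seed, using only the arguments in the signature above:

    import paddle.v2.fluid as fluid

    # Draw initial values from U(low, high); a non-zero seed makes the
    # draw reproducible across runs.
    uniform_init = fluid.initializer.UniformInitializer(low=-0.1, high=0.1, seed=1)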
NormalInitializer¶
class paddle.v2.fluid.initializer.NormalInitializer(loc=0.0, scale=1.0, seed=0)

Implements the random normal (Gaussian) distribution initializer.
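A sketch of constructing the normal initializer, reading loc and scale in the usual sense of mean and standard deviation (an assumption based on the common loc/scale convention):

    import paddle.v2.fluid as fluid

    # Draw initial values from a Gaussian with mean `loc` and
    # standard deviation `scale`.
    normal_init = fluid.initializer.NormalInitializer(loc=0.0, scale=0.02, seed=1)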
XavierInitializer¶
class paddle.v2.fluid.initializer.XavierInitializer(uniform=True, fan_in=None, fan_out=None, seed=0)

Implements the Xavier initializer.

This class implements the Xavier weight initializer from the paper Understanding the difficulty of training deep feedforward neural networks [1] by Xavier Glorot and Yoshua Bengio.

This initializer is designed to keep the scale of the gradients approximately the same in all layers. For the uniform distribution, values are drawn from the range [-x, x], where x = sqrt(6 / (fan_in + fan_out)). For the normal distribution, the mean is 0 and the standard deviation is sqrt(2 / (fan_in + fan_out)).
References

[1] Understanding the difficulty of training deep feedforward neural networks. International Conference on Artificial Intelligence and Statistics. (http://proceedings.mlr.press/v9/glorot10a.html)
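The bounds above can be checked numerically; in the sketch below fan_in and fan_out are passed only for the hand computation, since when left as None they are presumably derived from the parameter's shape:

    import math
    import paddle.v2.fluid as fluid

    fan_in, fan_out = 256, 128
    x = math.sqrt(6.0 / (fan_in + fan_out))    # uniform case: range is [-x, x]
    std = math.sqrt(2.0 / (fan_in + fan_out))  # normal case: mean 0, this std
    print(x, std)                              # 0.125 and ~0.0722

    # Uniform Xavier initialization with a fixed seed.
    xavier_init = fluid.initializer.XavierInitializer(uniform=True, seed=1)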
MSRAInitializer¶
class paddle.v2.fluid.initializer.MSRAInitializer(uniform=True, fan_in=None, seed=0)

Implements the MSRA initializer, a.k.a. the Kaiming initializer.

This class implements the weight initialization from the paper Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification [1] by Kaiming He, Xiangyu Zhang, Shaoqing Ren and Jian Sun. It is a robust initialization method that particularly accounts for the rectifier nonlinearities. For the uniform distribution, values are drawn from the range [-x, x], where x = sqrt(6 / fan_in). For the normal distribution, the mean is 0 and the standard deviation is sqrt(2 / fan_in).
References

[1] Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. (https://arxiv.org/abs/1502.01852)
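As with the Xavier case, the bounds can be checked numerically; only fan_in enters the computation, and it is presumably inferred from the parameter shape when left as None:

    import math
    import paddle.v2.fluid as fluid

    fan_in = 256
    x = math.sqrt(6.0 / fan_in)    # uniform case: range is [-x, x]
    std = math.sqrt(2.0 / fan_in)  # normal case: mean 0, this std
    print(x, std)                  # ~0.1531 and ~0.0884

    # Normal (Gaussian) MSRA initialization with a fixed seed.
    msra_init = fluid.initializer.MSRAInitializer(uniform=False, seed=1)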