
..  _api_guide_parameter_en:

##################
Model Parameters
##################

Model parameters are the weights and biases of a model. In fluid, they are instances of the ``fluid.Parameter`` class, which inherits from ``fluid.Variable``, and they are all persistable variables. Model training is the process of learning and updating these parameters. The attributes of a model parameter are configured through :ref:`api_fluid_ParamAttr`. The configurable items are:


- Initialization method

- Regularization

- Gradient clipping

- Model averaging



Initialization method
========================

Fluid sets the initialization method of an individual parameter through the :code:`initializer` attribute of :code:`ParamAttr`.

Example:

  .. code-block:: python

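      import paddle.fluid as fluid

      x = fluid.layers.data(name="x", shape=[13], dtype="float32")  # illustrative input
      param_attrs = fluid.ParamAttr(name="fc_weight",
                                    initializer=fluid.initializer.ConstantInitializer(1.0))
      y_predict = fluid.layers.fc(input=x, size=10, param_attr=param_attrs)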



Fluid supports the following initialization methods:

1. BilinearInitializer
-----------------------

Bilinear initialization. A deconvolution (transposed convolution) operation initialized with this method can be used as a bilinear interpolation operation.

Alias: Bilinear

API reference: :ref:`api_fluid_initializer_BilinearInitializer`
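
As a rough illustration of why such a deconvolution acts as interpolation, the sketch below builds a bilinear kernel in plain Python. The function name ``bilinear_kernel`` is hypothetical and the formula is the one commonly used for bilinear upsampling kernels; it is not taken from fluid's source.

  .. code-block:: python

      def bilinear_kernel(size):
          """Return a size x size bilinear interpolation kernel as nested lists.

          Hypothetical helper for illustration only, not a fluid API.
          """
          factor = (size + 1) // 2
          # The kernel centre lies on a pixel for odd sizes and between pixels for even sizes.
          center = factor - 1 if size % 2 == 1 else factor - 0.5
          # 1-D weights fall off linearly with distance from the centre.
          weights_1d = [1 - abs(i - center) / factor for i in range(size)]
          # The 2-D kernel is the outer product of the 1-D weights.
          return [[wy * wx for wx in weights_1d] for wy in weights_1d]

      kernel = bilinear_kernel(4)

For ``size=4`` the 1-D weights are 0.25, 0.75, 0.75, 0.25, so the corner entry of the kernel is 0.0625.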

2. ConstantInitializer
--------------------------

Constant initialization. Initialize the parameter to the specified value.

Alias: Constant

API reference: :ref:`api_fluid_initializer_ConstantInitializer`

3. MSRAInitializer
----------------------

The initialization method proposed by He et al., also known as He or MSRA initialization. See https://arxiv.org/abs/1502.01852 for details.

Alias: MSRA

API reference: :ref:`api_fluid_initializer_MSRAInitializer`
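
The idea in the referenced paper can be sketched in plain Python: weights are drawn from a zero-mean Gaussian whose standard deviation is ``sqrt(2 / fan_in)``. The names ``msra_std`` and ``msra_normal`` are hypothetical helpers, and the sketch does not reproduce how fluid derives ``fan_in`` from a parameter's shape.

  .. code-block:: python

      import math
      import random

      def msra_std(fan_in):
          """Standard deviation used by the normal variant of MSRA (He) initialization."""
          return math.sqrt(2.0 / fan_in)

      def msra_normal(fan_in, fan_out, seed=0):
          """Draw a fan_in x fan_out weight matrix (illustrative helper, not a fluid API)."""
          rng = random.Random(seed)
          std = msra_std(fan_in)
          return [[rng.gauss(0.0, std) for _ in range(fan_out)] for _ in range(fan_in)]

      weights = msra_normal(fan_in=8, fan_out=4)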

4. NormalInitializer
-------------------------

Random initialization from a Gaussian (normal) distribution.

Alias: Normal

API reference: :ref:`api_fluid_initializer_NormalInitializer`

5. TruncatedNormalInitializer
---------------------------------

Random initialization from a truncated Gaussian (normal) distribution.

Alias: TruncatedNormal

API reference: :ref:`api_fluid_initializer_TruncatedNormalInitializer`
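
One simple way to sample from a truncated Gaussian is rejection sampling, sketched below in plain Python. The helper ``truncated_normal`` is hypothetical, and the cut-off of two standard deviations is an assumption based on common convention; fluid's operator may use a different scheme.

  .. code-block:: python

      import random

      def truncated_normal(mean, std, n, num_std=2.0, seed=0):
          """Draw n samples, rejecting any that fall outside mean +/- num_std * std.

          Illustrative helper; the num_std=2.0 default is an assumption.
          """
          rng = random.Random(seed)
          samples = []
          while len(samples) < n:
              value = rng.gauss(mean, std)
              if abs(value - mean) <= num_std * std:
                  samples.append(value)
          return samples

      samples = truncated_normal(mean=0.0, std=1.0, n=1000)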

6. UniformInitializer
------------------------

Random initialization from a uniform distribution.

Alias: Uniform

API reference: :ref:`api_fluid_initializer_UniformInitializer`

7. XavierInitializer
------------------------

The initialization method proposed by Glorot and Bengio, also known as Glorot initialization. See http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf for details.

Alias: Xavier

API reference: :ref:`api_fluid_initializer_XavierInitializer`
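
The uniform variant described in the referenced paper can be sketched in plain Python: weights are drawn from ``[-limit, limit]`` with ``limit = sqrt(6 / (fan_in + fan_out))``. The helper names are hypothetical and the sketch does not show how fluid infers ``fan_in`` and ``fan_out``.

  .. code-block:: python

      import math
      import random

      def xavier_uniform_limit(fan_in, fan_out):
          """Bound of the uniform range used by Xavier (Glorot) initialization."""
          return math.sqrt(6.0 / (fan_in + fan_out))

      def xavier_uniform(fan_in, fan_out, seed=0):
          """Draw a fan_in x fan_out weight matrix (illustrative helper, not a fluid API)."""
          rng = random.Random(seed)
          limit = xavier_uniform_limit(fan_in, fan_out)
          return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
                  for _ in range(fan_in)]

      weights = xavier_uniform(fan_in=3, fan_out=3)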

Regularization
=================

Fluid sets the regularization method of an individual parameter through the :code:`regularizer` attribute of :code:`ParamAttr`.

  .. code-block:: python

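      import paddle.fluid as fluid

      x = fluid.layers.data(name="x", shape=[13], dtype="float32")  # illustrative input
      param_attrs = fluid.ParamAttr(name="fc_weight",
                                    regularizer=fluid.regularizer.L1DecayRegularizer(0.1))
      y_predict = fluid.layers.fc(input=x, size=10, param_attr=param_attrs)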

Fluid supports the following regularization methods:

-  :ref:`api_fluid_regularizer_L1DecayRegularizer` (Alias: L1Decay)
-  :ref:`api_fluid_regularizer_L2DecayRegularizer` (Alias: L2Decay)
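
To make the difference between the two penalties concrete, the plain-Python sketch below computes the term each one adds to a parameter's gradient. The helper names are hypothetical, and the L2 penalty is assumed to be ``0.5 * coeff * sum(w ** 2)``, so its gradient is ``coeff * w``.

  .. code-block:: python

      def l1_decay_gradient(weights, coeff):
          """Gradient of coeff * sum(|w|): coeff times the sign of each weight."""
          return [coeff * ((w > 0) - (w < 0)) for w in weights]

      def l2_decay_gradient(weights, coeff):
          """Gradient of 0.5 * coeff * sum(w ** 2) (assumed form): coeff times each weight."""
          return [coeff * w for w in weights]

      weights = [2.0, -3.0, 0.0]
      l1_term = l1_decay_gradient(weights, coeff=0.5)
      l2_term = l2_decay_gradient(weights, coeff=0.5)

L1 decay pushes every non-zero weight toward zero by the same amount, while L2 decay shrinks each weight in proportion to its magnitude.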

Clipping
==========

Fluid sets the clipping method of an individual parameter through the :code:`gradient_clip` attribute of :code:`ParamAttr`.

  .. code-block:: python

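      import paddle.fluid as fluid

      x = fluid.layers.data(name="x", shape=[13], dtype="float32")  # illustrative input
      param_attrs = fluid.ParamAttr(name="fc_weight",
                                    gradient_clip=fluid.clip.GradientClipByValue(max=1.0, min=-1.0))
      y_predict = fluid.layers.fc(input=x, size=10, param_attr=param_attrs)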



Fluid supports the following clipping methods:

1. ErrorClipByValue
----------------------

Clips the values of a tensor to a specified range.

API reference: :ref:`api_fluid_clip_ErrorClipByValue`

2. GradientClipByGlobalNorm
------------------------------

Limits the global norm of multiple Tensors to :code:`clip_norm`.

API reference: :ref:`api_fluid_clip_GradientClipByGlobalNorm`
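
A minimal plain-Python sketch of the idea, assuming the usual definition in which the global norm is the L2-norm over all elements of all tensors and every tensor is scaled by the same factor. The helper ``clip_by_global_norm`` is hypothetical and works on nested lists rather than fluid Tensors.

  .. code-block:: python

      import math

      def clip_by_global_norm(tensors, clip_norm):
          """Scale all tensors jointly so that their global L2-norm is at most clip_norm."""
          global_norm = math.sqrt(sum(v * v for t in tensors for v in t))
          # The scale is 1.0 when the global norm is already within the limit.
          scale = clip_norm / max(global_norm, clip_norm)
          return [[v * scale for v in t] for t in tensors]

      # Global norm is sqrt(3**2 + 4**2) = 5, so both tensors are scaled by 1/5.
      clipped = clip_by_global_norm([[3.0], [4.0]], clip_norm=1.0)

Because one shared factor is applied, the relative magnitudes of the gradients are preserved.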

3. GradientClipByNorm
------------------------

Limits the L2-norm of a Tensor to :code:`max_norm`. If the Tensor's L2-norm exceeds :code:`max_norm`, a :code:`scale` factor (:code:`max_norm` divided by the L2-norm) is computed and every value of the Tensor is multiplied by it.

API reference: :ref:`api_fluid_clip_GradientClipByNorm`
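
The rule can be sketched in plain Python as follows. The helper ``clip_by_norm`` is hypothetical and operates on a flat list instead of a fluid Tensor.

  .. code-block:: python

      import math

      def clip_by_norm(tensor, max_norm):
          """Rescale tensor so that its L2-norm does not exceed max_norm."""
          norm = math.sqrt(sum(v * v for v in tensor))
          if norm <= max_norm:
              return list(tensor)  # already within the limit, left unchanged
          scale = max_norm / norm
          return [v * scale for v in tensor]

      # The L2-norm of [3, 4] is 5, so each value is multiplied by 1/5.
      clipped = clip_by_norm([3.0, 4.0], max_norm=1.0)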

4. GradientClipByValue
-------------------------

Limits each value of a parameter's gradient to the range [min, max].

API reference: :ref:`api_fluid_clip_GradientClipByValue`
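
Element-wise clipping can be sketched in plain Python as shown below. The helper ``clip_by_value`` is hypothetical and works on a flat list of gradient values.

  .. code-block:: python

      def clip_by_value(gradient, min_value, max_value):
          """Clamp every gradient value into [min_value, max_value]."""
          return [min(max(g, min_value), max_value) for g in gradient]

      clipped = clip_by_value([-2.5, 0.3, 4.0], min_value=-1.0, max_value=1.0)

Unlike norm-based clipping, this changes the direction of the gradient whenever only some of its values are clamped.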

Model Averaging
================

Fluid determines whether to average an individual parameter through the :code:`do_model_average` attribute of :code:`ParamAttr`.

Example:

  .. code-block:: python

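      import paddle.fluid as fluid

      x = fluid.layers.data(name="x", shape=[13], dtype="float32")  # illustrative input
      param_attrs = fluid.ParamAttr(name="fc_weight",
                                    do_model_average=True)
      y_predict = fluid.layers.fc(input=x, size=10, param_attr=param_attrs)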

During mini-batch training, parameters are updated once after each batch. Model averaging computes the average of the parameter values produced by the most recent K updates.

The averaged parameters are only used for testing and prediction, and they do not get involved in the actual training process.
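
The averaging over the latest K updates can be pictured with the simplified plain-Python sketch below. The class ``SlidingAverage`` is hypothetical; it stores the last K parameter snapshots explicitly, which is an assumption made for clarity and not how fluid's ``ModelAverage`` optimizer accumulates values.

  .. code-block:: python

      from collections import deque

      class SlidingAverage:
          """Keep the parameter values of the latest `window` updates and average them."""

          def __init__(self, window):
              # deque(maxlen=...) silently drops the oldest snapshot when full.
              self.history = deque(maxlen=window)

          def update(self, params):
              """Record the parameter values produced by one training update."""
              self.history.append(list(params))

          def average(self):
              """Return the element-wise mean over the stored snapshots."""
              count = len(self.history)
              return [sum(values) / count for values in zip(*self.history)]

      averager = SlidingAverage(window=2)
      for params in ([1.0, 10.0], [2.0, 20.0], [4.0, 40.0]):
          averager.update(params)
      averaged = averager.average()  # mean of the last two snapshots

The training loop keeps using the raw parameters; only evaluation would read the averaged values.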

API reference: :ref:`api_fluid_optimizer_ModelAverage`