..  _api_guide_parameter_en:

##################
Model Parameters
##################

Model parameters are the weights and biases of a model. In fluid, they are instances of the ``fluid.Parameter`` class, which inherits from ``fluid.Variable``, and they are all persistable variables. Model training is the process of learning and updating these parameters. Parameter-related attributes are configured through :ref:`api_fluid_ParamAttr` . The configurable items are:


- Initialization method

- Regularization

- Gradient clipping

- Model averaging



Initialization method
========================

Fluid sets the initialization method of a single parameter through the :code:`initializer` attribute of :code:`ParamAttr` .

Example:

  .. code-block:: python

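      import paddle.fluid as fluid

      x = fluid.layers.data(name="x", shape=[13], dtype="float32")  # placeholder input
      param_attrs = fluid.ParamAttr(name="fc_weight",
                                initializer=fluid.initializer.ConstantInitializer(1.0))
      y_predict = fluid.layers.fc(input=x, size=10, param_attr=param_attrs)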



Fluid supports the following initialization methods:

1. BilinearInitializer
-----------------------

Bilinear initialization. A deconvolution operation initialized with this method can be used as a bilinear interpolation operation.

Alias: Bilinear

API reference: :ref:`api_fluid_initializer_BilinearInitializer`
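
To illustrate why a deconvolution initialized this way behaves like interpolation, the following sketch builds the 2-D bilinear kernel commonly used for upsampling. The helper name ``bilinear_kernel`` is hypothetical, and the formula is the widely used triangular-profile construction. It is an assumption here and may differ in detail from what ``BilinearInitializer`` computes internally.

```python
def bilinear_kernel(size):
    """Return a size x size bilinear interpolation kernel as nested lists."""
    factor = (size + 1) // 2
    # The peak sits on a pixel for odd sizes and between two pixels for even sizes.
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    # 1-D triangular profile that falls off linearly from the center.
    profile = [1 - abs(i - center) / factor for i in range(size)]
    # The 2-D kernel is the outer product of the profile with itself.
    return [[a * b for b in profile] for a in profile]


kernel = bilinear_kernel(4)
```

Each row and column of the kernel falls off linearly from the center, which is what makes the resulting deconvolution interpolate between neighboring input pixels.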

2. ConstantInitializer
--------------------------

Constant initialization. Initializes the parameter to the specified value.

Alias: Constant

API reference: :ref:`api_fluid_initializer_ConstantInitializer`

3. MSRAInitializer
----------------------

MSRA initialization. For details of the method, please refer to https://arxiv.org/abs/1502.01852 .

Alias: MSRA

API reference: :ref:`api_fluid_initializer_MSRAInitializer`
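
As a minimal sketch of the idea in the referenced paper, the code below samples weights from a zero-mean Gaussian whose standard deviation is ``sqrt(2 / fan_in)``. The function name ``msra_normal`` and its arguments are hypothetical and use only the Python standard library. This is not the implementation of ``MSRAInitializer``, which may offer other variants (for example a uniform one).

```python
import math
import random


def msra_normal(fan_in, fan_out, seed=0):
    """Sample a fan_in x fan_out weight matrix with std = sqrt(2 / fan_in)."""
    rng = random.Random(seed)
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)] for _ in range(fan_in)]


weights = msra_normal(fan_in=256, fan_out=128)
```

The standard deviation depends only on the number of inputs, so layers with more inputs start with proportionally smaller weights.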

4. NormalInitializer
-------------------------

Random Gaussian initialization. Initializes the parameter with values drawn from a Gaussian (normal) distribution.

Alias: Normal

API reference: :ref:`api_fluid_initializer_NormalInitializer`

5. TruncatedNormalInitializer
---------------------------------

Random truncated Gaussian initialization. Initializes the parameter with values drawn from a truncated Gaussian distribution.

Alias: TruncatedNormal

API reference: :ref:`api_fluid_initializer_TruncatedNormalInitializer`
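
One simple way to picture a truncated Gaussian is rejection sampling: draw from a normal distribution and discard values that fall too far from the mean. The sketch below assumes a cutoff of two standard deviations, which is a common convention but is not confirmed by this guide. The name ``truncated_normal`` is hypothetical.

```python
import random


def truncated_normal(n, mean=0.0, std=1.0, seed=0):
    """Draw n samples from N(mean, std^2), rejecting any beyond 2 * std."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        value = rng.gauss(mean, std)
        # Assumed cutoff: keep only values within two standard deviations.
        if abs(value - mean) <= 2.0 * std:
            samples.append(value)
    return samples


values = truncated_normal(1000, mean=0.0, std=0.02)
```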

6. UniformInitializer
------------------------

Random uniform initialization. Initializes the parameter with values drawn from a uniform distribution.

Alias: Uniform

API reference: :ref:`api_fluid_initializer_UniformInitializer`

7. XavierInitializer
------------------------

Xavier initialization. For details of the method, please refer to http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf .

Alias: Xavier

API reference: :ref:`api_fluid_initializer_XavierInitializer`
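
The uniform variant described in the referenced paper can be sketched as follows: weights are drawn from ``[-limit, limit]`` with ``limit = sqrt(6 / (fan_in + fan_out))``. The function name ``xavier_uniform`` is hypothetical, and the code is an illustration with the standard library rather than the implementation of ``XavierInitializer``.

```python
import math
import random


def xavier_uniform(fan_in, fan_out, seed=0):
    """Sample a fan_in x fan_out matrix uniformly from [-limit, limit]."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    rng = random.Random(seed)
    weights = [[rng.uniform(-limit, limit) for _ in range(fan_out)]
               for _ in range(fan_in)]
    return weights, limit


weights, limit = xavier_uniform(fan_in=64, fan_out=32)
```

Unlike the MSRA sketch, the range here depends on both the number of inputs and the number of outputs of the layer.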

Regularization
=================

Fluid sets the regularization method of a single parameter through the :code:`regularizer` attribute of :code:`ParamAttr` .

  .. code-block:: python

      param_attrs = fluid.ParamAttr(name="fc_weight",
                                regularizer=fluid.regularizer.L1DecayRegularizer(0.1))
      y_predict = fluid.layers.fc(input=x, size=10, param_attr=param_attrs)

Fluid supports the following regularization methods:

-  :ref:`api_fluid_regularizer_L1DecayRegularizer` (Alias: L1Decay)
-  :ref:`api_fluid_regularizer_L2DecayRegularizer` (Alias: L2Decay)
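
To make the difference between the two regularizers concrete, the sketch below computes the extra term each one adds to a parameter's gradient. It assumes the usual definitions: an L1 penalty contributes ``coeff * sign(w)``, and an L2 penalty of the form ``0.5 * coeff * w^2`` contributes ``coeff * w``. The helper names are hypothetical and this is not fluid's implementation.

```python
def sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def l1_decay_grad(weights, coeff):
    """Gradient term added by an L1 penalty: coeff * sign(w)."""
    return [coeff * sign(w) for w in weights]


def l2_decay_grad(weights, coeff):
    """Gradient term added by an L2 penalty: coeff * w."""
    return [coeff * w for w in weights]


weights = [0.5, -2.0, 0.0]
l1_term = l1_decay_grad(weights, coeff=0.1)
l2_term = l2_decay_grad(weights, coeff=0.1)
```

The L1 term has the same magnitude for every nonzero weight, which pushes small weights toward exactly zero, while the L2 term shrinks each weight in proportion to its size.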

Clipping
==========

Fluid sets the clipping method of a single parameter through the :code:`gradient_clip` attribute of :code:`ParamAttr` .

  .. code-block:: python

      param_attrs = fluid.ParamAttr(name="fc_weight",
                                gradient_clip=fluid.clip.GradientClipByValue(max=1.0, min=-1.0))
      y_predict = fluid.layers.fc(input=x, size=10, param_attr=param_attrs)



Fluid supports the following clipping methods:

1. ErrorClipByValue
----------------------

Clips the values of a tensor to a specified range.

API reference: :ref:`api_fluid_clip_ErrorClipByValue`

2. GradientClipByGlobalNorm
------------------------------

Limits the global norm of multiple Tensors to :code:`clip_norm` .

API reference: :ref:`api_fluid_clip_GradientClipByGlobalNorm`
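
Global-norm clipping treats all gradients as one long vector. The sketch below assumes the common formulation in which the global norm is the square root of the sum of all squared elements, and every gradient is multiplied by ``clip_norm / max(global_norm, clip_norm)``. The function name ``clip_by_global_norm`` is hypothetical.

```python
import math


def clip_by_global_norm(grads, clip_norm):
    """Scale a list of gradient vectors so their joint L2 norm is at most clip_norm."""
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    # The scale is 1.0 when the global norm is already within the limit.
    scale = clip_norm / max(global_norm, clip_norm)
    return [[g * scale for g in grad] for grad in grads], global_norm


grads = [[3.0, 4.0], [12.0]]
clipped, norm = clip_by_global_norm(grads, clip_norm=6.5)
```

Because all gradients share one scale factor, their relative directions are preserved, which is the main difference from clipping each Tensor separately.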

3. GradientClipByNorm
------------------------

Limits the L2-norm of a Tensor to :code:`max_norm` . If the L2-norm of the Tensor exceeds :code:`max_norm` ,
a :code:`scale` is calculated and every value of the Tensor is multiplied by this :code:`scale` .

API reference: :ref:`api_fluid_clip_GradientClipByNorm`
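
The rule above can be sketched in a few lines. The sketch assumes that the scale is ``max_norm / norm``, which is the value that brings the L2-norm back to exactly ``max_norm``. The function name ``clip_by_norm`` is hypothetical.

```python
import math


def clip_by_norm(grad, max_norm):
    """Rescale grad so that its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm:
        # Already within the limit: return the values unchanged.
        return list(grad)
    scale = max_norm / norm
    return [g * scale for g in grad]


clipped = clip_by_norm([3.0, 4.0], max_norm=1.0)
```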

4. GradientClipByValue
-------------------------

Limits the gradient values of a parameter to the range [min, max].

API reference: :ref:`api_fluid_clip_GradientClipByValue`
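
Value clipping works element by element, as in this small sketch. The function name ``clip_by_value`` and its argument names are hypothetical.

```python
def clip_by_value(grad, min_value, max_value):
    """Clamp every gradient element into [min_value, max_value]."""
    return [min(max(g, min_value), max_value) for g in grad]


clipped = clip_by_value([-3.0, 0.2, 5.0], min_value=-1.0, max_value=1.0)
```

Unlike norm-based clipping, this can change the direction of the gradient, because only the out-of-range elements are modified.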

Model Averaging
================

Fluid determines whether to average a single parameter through the :code:`do_model_average` attribute of :code:`ParamAttr` .

Example:

  .. code-block:: python

      param_attrs = fluid.ParamAttr(name="fc_weight",
                                do_model_average=True)
      y_predict = fluid.layers.fc(input=x, size=10, param_attr=param_attrs)

In mini-batch training, parameters are updated once after each batch. Model averaging computes the average of the parameters produced by the latest K updates.

The averaged parameters are used only for testing and prediction; they do not take part in the actual training process.

API reference: :ref:`api_fluid_optimizer_ModelAverage`
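
The idea of averaging the parameters from the latest K updates can be sketched with a fixed-size window. The class name ``SlidingAverage`` is hypothetical, and storing full snapshots is a simplification chosen for clarity. The actual ``ModelAverage`` optimizer may accumulate sums and choose its window differently.

```python
from collections import deque


class SlidingAverage:
    """Keep the latest k parameter snapshots and return their element-wise mean."""

    def __init__(self, k):
        # A deque with maxlen drops the oldest snapshot automatically.
        self.window = deque(maxlen=k)

    def update(self, params):
        # Store a copy so later in-place updates do not alter the history.
        self.window.append(list(params))

    def average(self):
        count = len(self.window)
        return [sum(column) / count for column in zip(*self.window)]


averager = SlidingAverage(k=3)
for step_params in ([1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]):
    averager.update(step_params)
averaged = averager.average()
```

In this sketch the training loop keeps updating its own parameters as usual, and the averaged values are read out only when the model is evaluated.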