Unverified commit 8da2b16d, authored by littletomatodonkey, committed by GitHub

fix reg (#27647)

* fix reg

* fix code example and doc

* remove disable_static

* fix doc

* fix l2decay
Parent: cc780b19
@@ -21,18 +21,18 @@ class L1Decay(fluid.regularizer.L1Decay):
 """
 Implement the L1 Weight Decay Regularization, which encourages the weights to be sparse.
-It can be set in :ref:`api_fluid_ParamAttr` or ``optimizer`` (such as :ref:`api_paddle_optimizer_Momentum` ).
+It can be set in :ref:`api_paddle_ParamAttr` or ``optimizer`` (such as :ref:`api_paddle_optimizer_Momentum` ).
 When set in ``ParamAttr`` , it only takes effect for trainable parameters in this layer. When set in
 ``optimizer`` , it takes effect for all trainable parameters. When set together, ``ParamAttr`` has
 higher priority than ``optimizer`` , which means that for a trainable parameter, if regularizer is defined
 in its ParamAttr, then the regularizer in Optimizer will be ignored. Otherwise the regularizer
 in Optimizer will be used.
-In the implementation, the formula of L1 Weight Decay Regularization is as follows:
+In the implementation, the loss function of L1 Weight Decay Regularization is as follows:
 .. math::
-    L1WeightDecay = reg\_coeff * sign(parameter)
+    loss = coeff * reduce\_sum(abs(x))
 Args:
     coeff(float, optional): regularization coeff. Default:0.0.
@@ -44,10 +44,8 @@ class L1Decay(fluid.regularizer.L1Decay):
     import paddle
     from paddle.regularizer import L1Decay
     import numpy as np
-    paddle.disable_static()
-    inp = np.random.uniform(-0.1, 0.1, [10, 10]).astype("float32")
     linear = paddle.nn.Linear(10, 10)
-    inp = paddle.to_tensor(inp)
+    inp = paddle.rand(shape=[10, 10], dtype="float32")
     out = linear(inp)
     loss = paddle.mean(out)
     beta1 = paddle.to_tensor([0.9], dtype="float32")
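
The right-hand column of this hunk is cut off after ``beta1``. For context, here is a minimal sketch of how such an L1Decay example typically finishes under the Paddle 2.0 dygraph API; the ``beta1``/``beta2`` tensors in the original snippet suggest an Adam-style optimizer, but the Momentum variant, learning rate, and coefficient below are illustrative choices, not values taken from this commit:

    import paddle
    from paddle.regularizer import L1Decay

    linear = paddle.nn.Linear(10, 10)
    inp = paddle.rand(shape=[10, 10], dtype="float32")
    out = linear(inp)
    loss = paddle.mean(out)

    # Attach L1 weight decay to all trainable parameters through the optimizer.
    # The learning rate and coefficient are illustrative, not from the commit.
    momentum = paddle.optimizer.Momentum(
        learning_rate=0.1,
        momentum=0.9,
        parameters=linear.parameters(),
        weight_decay=L1Decay(0.01))
    loss.backward()
    momentum.step()
    momentum.clear_grad()
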
@@ -85,18 +83,18 @@ class L2Decay(fluid.regularizer.L2Decay):
 """
 Implement the L2 Weight Decay Regularization, which helps to prevent the model over-fitting.
-It can be set in :ref:`api_fluid_ParamAttr` or ``optimizer`` (such as :ref:`api_paddle_optimizer_Momentum` ).
+It can be set in :ref:`api_paddle_ParamAttr` or ``optimizer`` (such as :ref:`api_paddle_optimizer_Momentum` ).
 When set in ``ParamAttr`` , it only takes effect for trainable parameters in this layer. When set in
 ``optimizer`` , it takes effect for all trainable parameters. When set together, ``ParamAttr`` has
 higher priority than ``optimizer`` , which means that for a trainable parameter, if regularizer is defined
 in its ParamAttr, then the regularizer in Optimizer will be ignored. Otherwise the regularizer
 in Optimizer will be used.
-In the implementation, the formula of L2 Weight Decay Regularization is as follows:
+In the implementation, the loss function of L2 Weight Decay Regularization is as follows:
 .. math::
-    L2WeightDecay = reg\_coeff * parameter
+    loss = 0.5 * coeff * reduce\_sum(square(x))
 Args:
     regularization_coeff(float, optional): regularization coeff. Default:0.0
@@ -108,10 +106,8 @@ class L2Decay(fluid.regularizer.L2Decay):
     import paddle
     from paddle.regularizer import L2Decay
     import numpy as np
-    paddle.disable_static()
-    inp = np.random.uniform(-0.1, 0.1, [10, 10]).astype("float32")
    linear = paddle.nn.Linear(10, 10)
-    inp = paddle.to_tensor(inp)
+    inp = paddle.rand(shape=[10, 10], dtype="float32")
     out = linear(inp)
     loss = paddle.mean(out)
     beta1 = paddle.to_tensor([0.9], dtype="float32")
...
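
Both docstrings make the same point about precedence: a regularizer set in ``ParamAttr`` wins over one set in the optimizer for that particular parameter. A minimal sketch of that behaviour, assuming the Paddle 2.0 API (the layer sizes and coefficients are illustrative, not taken from this commit):

    import paddle
    from paddle.regularizer import L1Decay, L2Decay

    # Per-parameter regularizer: L2 decay on this layer's weight only.
    weight_attr = paddle.ParamAttr(regularizer=L2Decay(1e-4))
    linear = paddle.nn.Linear(10, 10, weight_attr=weight_attr)

    inp = paddle.rand(shape=[10, 10], dtype="float32")
    loss = paddle.mean(linear(inp))

    # Optimizer-level regularizer: applies to trainable parameters that do not
    # define their own. Here the weight keeps its L2Decay from ParamAttr,
    # while the bias falls back to the optimizer's L1Decay.
    opt = paddle.optimizer.Momentum(
        learning_rate=0.1,
        momentum=0.9,
        parameters=linear.parameters(),
        weight_decay=L1Decay(1e-4))
    loss.backward()
    opt.step()
    opt.clear_grad()
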