.. _cn_api_fluid_layers_noam_decay:

noam_decay
-------------------------------

.. py:function:: paddle.fluid.layers.noam_decay(d_model, warmup_steps, learning_rate=1.0)

Noam learning rate decay.

The NumPy implementation of Noam decay is as follows:

.. code-block:: python

    import paddle.fluid as fluid
    import numpy as np
    # Set hyperparameters
    base_lr = 0.01
    d_model = 2
    current_steps = 20
    warmup_steps = 200
    # Compute the learning rate for the current step
    lr_value = base_lr * np.power(d_model, -0.5) * np.min([
                           np.power(current_steps, -0.5),
                           np.power(warmup_steps, -1.5) * current_steps])

For details, see `Attention Is All You Need <https://arxiv.org/pdf/1706.03762.pdf>`_
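
As a rough illustration of the schedule's shape, the sketch below wraps the NumPy formula above in a helper and evaluates it at a few steps. The name ``noam_lr`` and the sample step values are assumptions made for this example and are not part of the Paddle API. The rate grows linearly until ``current_steps`` reaches ``warmup_steps``, then decays in proportion to the inverse square root of the step.

.. code-block:: python

    import numpy as np

    def noam_lr(step, d_model, warmup_steps, base_lr=1.0):
        """Hypothetical helper that mirrors the NumPy formula above."""
        step = np.asarray(step, dtype=np.float64)
        return base_lr * d_model ** -0.5 * np.minimum(
            step ** -0.5,                  # decay branch, active after warmup
            step * warmup_steps ** -1.5)   # linear warmup branch

    # Sample steps before, at, and after the end of warmup (illustrative values)
    steps = np.array([1, 20, 200, 800])
    lrs = noam_lr(steps, d_model=2, warmup_steps=200, base_lr=0.01)
    for s, lr in zip(steps, lrs):
        print(s, lr)

With these hyperparameters the largest value occurs at step 200, where both branches of the minimum coincide.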

Parameters:
    - **d_model** (Variable|int) - Feature dimension of the model's input and output vectors. Can be a scalar Tensor or a Python int.
    - **warmup_steps** (Variable|int) - Number of warmup steps. Can be a scalar Tensor or a Python int.
    - **learning_rate** (Variable|float|int, optional) - Initial learning rate. If a Variable, it must be a Tensor of shape [1] with data type float32 or float64; a Python float or int is also accepted. Default: 1.0.

Returns: The decayed learning rate.

Return type: Variable

**Code example**:

.. code-block:: python

        import paddle.fluid as fluid
        warmup_steps = 100
        learning_rate = 0.01
        lr = fluid.layers.learning_rate_scheduler.noam_decay(
                       1/(warmup_steps *(learning_rate ** 2)),
                       warmup_steps,
                       learning_rate)
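
To see what the arguments in the example evaluate to, the plain-Python sketch below applies the NumPy formula given earlier on this page to the same values. It assumes the operator follows that formula with ``learning_rate`` in the role of ``base_lr``. The helper ``noam_value`` is hypothetical and is not a Paddle function.

.. code-block:: python

    import math

    warmup_steps = 100
    learning_rate = 0.01
    # Same expression as the first argument in the example; about 100
    d_model = 1 / (warmup_steps * (learning_rate ** 2))

    def noam_value(step):
        """Hypothetical helper: the formula from this page in plain Python."""
        return learning_rate * d_model ** -0.5 * min(
            step ** -0.5, step * warmup_steps ** -1.5)

    # The schedule peaks at the end of warmup
    peak = noam_value(warmup_steps)
    print(d_model, peak)

Under that assumption, ``d_model ** -0.5`` equals ``learning_rate * sqrt(warmup_steps)``, so the peak value at step 100 is about ``learning_rate ** 2``, that is, 1e-4.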