Commit 8d428bd9
Authored on Dec 04, 2017 by ranqiu

Update annotations of layers.py

Parent: 85e6906f
Showing 1 changed file with 66 additions and 55 deletions (+66 -55)

python/paddle/trainer_config_helpers/layers.py
...
@@ -1516,34 +1516,33 @@ def lstmemory(input,

     NOTE: This is a low level user interface. You can use network.simple_lstm
     to config a simple plain lstm layer.

-    Please refer to **Generating Sequences With Recurrent Neural Networks** for
-    more details about LSTM.
-
-    Link_ goes as below.
-
-    .. _Link: http://arxiv.org/abs/1308.0850
+    Reference:
+        `Generating Sequences With Recurrent Neural Networks
+        <https://arxiv.org/pdf/1308.0850.pdf>`_

-    :param name: The lstmemory layer name.
+    :param name: The name of this layer. It is optional.
     :type name: basestring
-    :param size: DEPRECATED. size of the lstm cell
+    :param size: DEPRECATED. The dimension of the lstm cell.
     :type size: int
     :param input: The input of this layer.
     :type input: LayerOutput
-    :param reverse: is sequence process reversed or not.
+    :param reverse: Whether the input sequence is processed in a reverse order.
     :type reverse: bool
     :param act: Activation type. TanhActivation is the default activation.
     :type act: BaseActivation
-    :param gate_act: gate activation type, SigmoidActivation by default.
+    :param gate_act: Activation type of this layer's gates. SigmoidActivation is the
+                     default activation.
     :type gate_act: BaseActivation
-    :param state_act: state activation type, TanhActivation by default.
+    :param state_act: Activation type of the state. TanhActivation is the default
+                      activation.
     :type state_act: BaseActivation
     :param bias_attr: The bias attribute. If the parameter is set to False or an object
                       whose type is not ParameterAttribute, no bias is defined. If the
                       parameter is set to True, the bias is initialized to zero.
     :type bias_attr: ParameterAttribute | None | bool | Any
-    :param param_attr: Parameter Attribute.
-    :type param_attr: ParameterAttribute | None | False
-    :param layer_attr: Extra Layer attribute
+    :param param_attr: The parameter attribute. See ParameterAttribute for details.
+    :type param_attr: ParameterAttribute
+    :param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
+                       details.
     :type layer_attr: ExtraLayerAttribute | None
     :return: LayerOutput object.
     :rtype: LayerOutput
...
@@ -1632,14 +1631,14 @@ def grumemory(input,

         h_t = (1 - z_t) h_{t-1} + z_t {\\tilde{h_t}}

     NOTE: In PaddlePaddle's implementation, the multiplication operations
-    :math:`W_{r}x_{t}`, :math:`W_{z}x_{t}` and :math:`W x_t` are not computed in
+    :math:`W_{r}x_{t}`, :math:`W_{z}x_{t}` and :math:`W x_t` are not performed in
     gate_recurrent layer. Consequently, an additional mixed_layer with
     full_matrix_projection or a fc_layer must be included before grumemory
     is called.

-    More details can be found by referring to `Empirical Evaluation of Gated
-    Recurrent Neural Networks on Sequence Modeling.
-    <https://arxiv.org/abs/1412.3555>`_
+    Reference:
+        `Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
+        <https://arxiv.org/abs/1412.3555>`_

     The simple usage is:

...
@@ -1647,28 +1646,29 @@ def grumemory(input,

         gru = grumemory(input)

-    :param name: The gru layer name.
-    :type name: None | basestring
+    :param name: The name of this layer. It is optional.
+    :type name: basestring
     :param input: The input of this layer.
     :type input: LayerOutput.
-    :param size: DEPRECATED. size of the gru cell
+    :param size: DEPRECATED. The dimension of the gru cell.
     :type size: int
-    :param reverse: Whether sequence process is reversed or not.
+    :param reverse: Whether the input sequence is processed in a reverse order.
     :type reverse: bool
     :param act: Activation type, TanhActivation is the default. This activation
                 affects the :math:`{\\tilde{h_t}}`.
     :type act: BaseActivation
-    :param gate_act: gate activation type, SigmoidActivation by default.
-                     This activation affects the :math:`z_t` and :math:`r_t`. It is the
-                     :math:`\\sigma` in the above formula.
+    :param gate_act: Activation type of this layer's two gates. SigmoidActivation is
+                     the default activation. This activation affects the :math:`z_t`
+                     and :math:`r_t`. It is the :math:`\\sigma` in the above formula.
     :type gate_act: BaseActivation
     :param bias_attr: The bias attribute. If the parameter is set to False or an object
                       whose type is not ParameterAttribute, no bias is defined. If the
                       parameter is set to True, the bias is initialized to zero.
     :type bias_attr: ParameterAttribute | None | bool | Any
-    :param param_attr: Parameter Attribute.
-    :type param_attr: ParameterAttribute | None | False
-    :param layer_attr: Extra Layer attribute
+    :param param_attr: The parameter attribute. See ParameterAttribute for details.
+    :type param_attr: ParameterAttribute
+    :param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
+                       details.
     :type layer_attr: ExtraLayerAttribute | None
     :return: LayerOutput object.
     :rtype: LayerOutput
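The GRU update equation quoted in the docstring above can be sketched in plain Python. This is an illustration of the formula only, not PaddlePaddle code; plain lists stand in for the layer's tensor types, and `z` and `h_tilde` are assumed to be precomputed gate and candidate values.

```python
def gru_state_update(h_prev, z, h_tilde):
    """Elementwise GRU state update: h_t = (1 - z_t) * h_{t-1} + z_t * h~_t.

    h_prev  -- previous hidden state h_{t-1}
    z       -- update gate z_t (each entry in [0, 1])
    h_tilde -- candidate state h~_t
    """
    return [(1.0 - zi) * hp + zi * ht
            for hp, zi, ht in zip(h_prev, z, h_tilde)]
```

With `z = 0` the previous state is kept unchanged; with `z = 1` it is fully replaced by the candidate.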
...
@@ -1712,10 +1712,10 @@ def last_seq(input,
     """
     Get Last Timestamp Activation of a sequence.

-    If stride > 0, this layer slides a window whose size is determined by stride,
-    and return the last value of the window as the output. Thus, a long sequence
-    will be shorten. Note that for sequence with sub-sequence, the default value
-    of stride is -1.
+    If stride > 0, this layer will slide a window whose size is determined by stride,
+    and return the last value of the sequence in the window as the output. Thus, a
+    long sequence will be shortened. Note that for sequence with sub-sequence, the
+    default value of stride is -1.

     The simple usage is:

...
@@ -1724,14 +1724,16 @@ def last_seq(input,

         seq = last_seq(input=layer)

     :param agg_level: Aggregated level
     :type agg_level: AggregateLevel
     :param name: The name of this layer. It is optional.
     :type name: basestring
     :param input: The input of this layer.
     :type input: LayerOutput
     :param stride: The step size between successive pooling regions.
-    :type stride: Int
-    :param layer_attr: extra layer attributes.
-    :type layer_attr: ExtraLayerAttribute.
+    :type stride: int
+    :param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
+                       details.
+    :type layer_attr: ExtraLayerAttribute
     :return: LayerOutput object.
     :rtype: LayerOutput
     """
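One way to read the stride behaviour described above is as chunking the sequence into consecutive windows of `stride` elements and keeping the last element of each window. The sketch below is an illustrative interpretation in plain Python, not the layer's actual tensor implementation, and the handling of a trailing partial window is an assumption.

```python
def last_in_windows(seq, stride):
    """Keep the last element of each stride-sized window of a sequence.

    stride <= 0 (e.g. the documented default of -1) is taken to mean
    "keep only the final timestep of the whole sequence".
    """
    if stride <= 0:
        return [seq[-1]]
    # For each window start i, take the last index inside the window,
    # clamped so a short trailing window still contributes its last element.
    return [seq[min(i + stride, len(seq)) - 1]
            for i in range(0, len(seq), stride)]
```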
...
@@ -1768,10 +1770,10 @@ def first_seq(input,
     """
     Get First Timestamp Activation of a sequence.

-    If stride > 0, this layer slides a window whose size is determined by stride,
-    and return the first value of the window as the output. Thus, a long sequence
-    will be shorten. Note that for sequence with sub-sequence, the default value
-    of stride is -1.
+    If stride > 0, this layer will slide a window whose size is determined by stride,
+    and return the first value of the sequence in the window as the output. Thus, a
+    long sequence will be shortened. Note that for sequence with sub-sequence, the
+    default value of stride is -1.

     The simple usage is:

...
@@ -1780,13 +1782,15 @@ def first_seq(input,

         seq = first_seq(input=layer)

     :param agg_level: aggregation level
     :type agg_level: AggregateLevel
     :param name: The name of this layer. It is optional.
     :type name: basestring
     :param input: The input of this layer.
     :type input: LayerOutput
     :param stride: The step size between successive pooling regions.
-    :type stride: Int
-    :param layer_attr: extra layer attributes.
+    :type stride: int
+    :param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
+                       details.
     :type layer_attr: ExtraLayerAttribute.
     :return: LayerOutput object.
     :rtype: LayerOutput
...
@@ -1844,8 +1848,8 @@ def expand_layer(input,
                  expand_level=ExpandLevel.FROM_NO_SEQUENCE,
                  layer_attr=None):
     """
-    A layer for "Expand Dense data or (sequence data where the length of each
-    sequence is one) to sequence data."
+    A layer for expanding dense data or (sequence data where the length of each
+    sequence is one) to sequence data.

     The example usage is:
...
@@ -1857,7 +1861,9 @@ def expand_layer(input,

     :param input: The input of this layer.
     :type input: LayerOutput
-    :param expand_as: Expand as this layer's sequence info.
+    :param expand_as: Expand the input according to this layer's sequence information.
+                      After the operation, the expanded input will have the same number
+                      of elements as this layer.
     :type expand_as: LayerOutput
     :param name: The name of this layer. It is optional.
     :type name: basestring
...
@@ -1865,9 +1871,10 @@ def expand_layer(input,

                       whose type is not ParameterAttribute, no bias is defined. If the
                       parameter is set to True, the bias is initialized to zero.
     :type bias_attr: ParameterAttribute | None | bool | Any
-    :param expand_level: whether input layer is timestep(default) or sequence.
+    :param expand_level: Whether the input layer is a sequence or the element of a
+                         sequence.
     :type expand_level: ExpandLevel
-    :param layer_attr: extra layer attributes.
+    :param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
+                       details.
     :type layer_attr: ExtraLayerAttribute.
     :return: LayerOutput object.
     :rtype: LayerOutput
...
@@ -3294,7 +3301,7 @@ def row_l2_norm_layer(input, name=None, layer_attr=None):
     A layer for L2-normalization in each row.

     .. math::
-       out[i] = \frac{in[i]}{\sqrt{\sum_{k=1}^N in[k]^{2}}}
+       out[i] = \\frac{in[i]} {\\sqrt{\\sum_{k=1}^N in[k]^{2}}}

     where the size of :math:`in` is (batchSize x dataDim) ,
     and the size of :math:`out` is a (batchSize x dataDim) .
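The row L2-normalization formula above, out[i] = in[i] / sqrt(sum_k in[k]^2), can be sketched in plain Python for a (batchSize x dataDim) matrix represented as a list of rows. This is an illustration of the math only, not the layer's implementation; it does not guard against an all-zero row.

```python
import math

def row_l2_norm(rows):
    """Divide every element of each row by that row's L2 norm."""
    out = []
    for row in rows:
        norm = math.sqrt(sum(x * x for x in row))  # sqrt(sum_k in[k]^2)
        out.append([x / norm for x in row])
    return out
```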
...
@@ -6161,9 +6168,11 @@ def huber_regression_cost(input,

     Given a prediction f(x), a label y and :math:`\delta`, the loss function
     is defined as:

-    .. math:
-       loss = 0.5*\left ( y-f(x) \right )^2, \left | y-f(x) \right |\leq \delta
-       loss = \delta \left | y-f(x) \right |-0.5\delta ^2, otherwise
+    .. math::
+       loss = 0.5*(y-f(x))^{2}, | y-f(x) | < \delta
+       loss = \delta | y-f(x) | - 0.5 \delta ^2, otherwise

     The example usage is:
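The two-branch Huber regression loss above is easy to check numerically. The sketch below evaluates it for a single scalar prediction; the actual layer operates on tensors, so this is only an illustration of the formula.

```python
def huber_regression_loss(y, fx, delta):
    """Huber regression loss:
       0.5 * (y - f(x))^2                 if |y - f(x)| < delta
       delta * |y - f(x)| - 0.5 * delta^2 otherwise
    """
    r = abs(y - fx)  # absolute residual |y - f(x)|
    if r < delta:
        return 0.5 * r * r
    return delta * r - 0.5 * delta * delta
```

The two branches meet at `|y - f(x)| = delta`, which is what makes the loss continuously differentiable there.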
...
@@ -6210,12 +6219,14 @@ def huber_classification_cost(input,
     """
     For classification purposes, a variant of the Huber loss called modified Huber
     is sometimes used. Given a prediction f(x) (a real-valued classifier score) and
-    a true binary class label :math:`y\in \left \{-1, 1 \right \}`, the modified Huber
+    a true binary class label :math:`y\in \{-1, 1\}`, the modified Huber
     loss is defined as:

     .. math:
-       loss = \max \left ( 0, 1-yf(x) \right )^2, yf(x)\geq 1
-       loss = -4yf(x), \text{otherwise}
+       loss = \max ( 0, 1-yf(x) )^2, yf(x) \geq -1
+       loss = -4yf(x), otherwise

     The example usage is:
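The corrected modified Huber formula above (note the condition is yf(x) >= -1, which the commit fixes) can be sketched for a scalar score. Again, an illustration of the formula rather than the layer's tensor implementation.

```python
def modified_huber_loss(y, fx):
    """Modified Huber loss for a label y in {-1, 1} and score f(x):
       max(0, 1 - y*f(x))^2  if y*f(x) >= -1
       -4 * y*f(x)           otherwise
    """
    m = y * fx  # the margin y*f(x)
    if m >= -1.0:
        return max(0.0, 1.0 - m) ** 2
    return -4.0 * m
```

A confidently correct prediction (margin >= 1) costs nothing; badly wrong predictions (margin < -1) switch to a linear penalty, which bounds the gradient.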
...
@@ -6959,7 +6970,7 @@ def clip_layer(input, min, max, name=None):

     .. math::

-       out[i] = \min \left(\max\left(in[i],p_{1}\right),p_{2}\right)
+       out[i] = \min (\max (in[i],p_{1} ),p_{2} )

     .. code-block:: python
...
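The clip formula above, out[i] = min(max(in[i], p1), p2), is a plain elementwise clamp. A minimal sketch (illustration only; in the layer, p1 and p2 are the `min` and `max` arguments, and p1 <= p2 is assumed):

```python
def clip_values(values, p1, p2):
    """Clamp every element into the interval [p1, p2]."""
    return [min(max(v, p1), p2) for v in values]
```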