PaddlePaddle
Paddle
Commit 0e73967a
Authored on Nov 09, 2017
Author: ranqiu
Update the annotations of layers.py
Parent: 7d343fca
Showing 1 changed file with 117 additions and 104 deletions
python/paddle/trainer_config_helpers/layers.py
@@ -5135,12 +5135,19 @@ def block_expand_layer(input,
  @layer_support()
  def maxout_layer(input, groups, num_channels=None, name=None, layer_attr=None):
  """
- A layer to do max out on conv layer output.
+ A layer to do max out on convolutional layer output.
- - Input: output of a conv layer.
- - Output: feature map size same as input. Channel is (input channel) / groups.
+ - Input: the output of a convolutional layer.
+ - Output: feature map size same as the input's, and its channel number is
+ (input channel) / groups.
  So groups should be larger than 1, and the num of channels should be able
- to devided by groups.
+ to be devided by groups.
+ Reference:
+ Maxout Networks
+ http://www.jmlr.org/proceedings/papers/v28/goodfellow13.pdf
+ Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks
+ https://arxiv.org/pdf/1312.6082v4.pdf
  .. math::
  y_{si+j} = \max_k x_{gsi + sk + j}
@@ -5150,12 +5157,6 @@ def maxout_layer(input, groups, num_channels=None, name=None, layer_attr=None):
  0 \le j < s
  0 \le k < groups
- Please refer to Paper:
- - Maxout Networks: http://www.jmlr.org/proceedings/papers/v28/goodfellow13.pdf
- - Multi-digit Number Recognition from Street View \
- Imagery using Deep Convolutional Neural Networks: \
- https://arxiv.org/pdf/1312.6082v4.pdf
  The simple usage is:
  .. code-block:: python
@@ -5166,14 +5167,16 @@ def maxout_layer(input, groups, num_channels=None, name=None, layer_attr=None):
  :param input: The input of this layer.
  :type input: LayerOutput
- :param num_channels: The channel number of input layer. If None will be set
- automatically from previous output.
- :type num_channels: int | None
+ :param num_channels: The number of input channels. If the parameter is not set or
+ set to None, its actual value will be automatically set to
+ the channels number of the input.
+ :type num_channels: int
  :param groups: The group number of input layer.
  :type groups: int
  :param name: The name of this layer. It is optional.
- :type name: None | basestring.
- :param layer_attr: Extra Layer attribute.
+ :type name: basestring
+ :param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
+ details.
  :type layer_attr: ExtraLayerAttribute
  :return: LayerOutput object.
  :rtype: LayerOutput
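The maxout formula in the docstring above can be illustrated with a small pure-Python sketch. This is not PaddlePaddle's implementation: the function name `maxout`, the flat channel-major input layout, and the reading of `s` as the per-channel feature map size (input size / num_channels) and `g` as `groups` are assumptions made for illustration.

```python
def maxout(x, num_channels, groups):
    """Illustrative maxout over a flat, channel-major input.

    Assumes x has length num_channels * s, where s is the size of one
    feature map, and computes y[s*i + j] = max_k x[groups*s*i + s*k + j].
    """
    if groups <= 1:
        raise ValueError("groups should be larger than 1")
    if num_channels % groups != 0:
        raise ValueError("num_channels must be divisible by groups")
    if len(x) % num_channels != 0:
        raise ValueError("input size must be a multiple of num_channels")

    s = len(x) // num_channels            # feature map size per channel
    out_channels = num_channels // groups  # (input channel) / groups
    y = [None] * (out_channels * s)
    for i in range(out_channels):
        for j in range(s):
            # Take the maximum over the `groups` consecutive input channels
            # that feed output channel i, at spatial position j.
            y[s * i + j] = max(x[groups * s * i + s * k + j]
                               for k in range(groups))
    return y
```

With 4 input channels of size 2 and `groups=2`, the output keeps the feature map size and halves the channel count, as the docstring states.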
@@ -5205,20 +5208,20 @@ def ctc_layer(input,
  layer_attr=None):
  """
  Connectionist Temporal Classification (CTC) is designed for temporal
- classication task. That is, for sequence labeling problems where the
+ classication task. e.g. sequence labeling problems where the
  alignment between the inputs and the target labels is unknown.
- More details can be found by referring to `Connectionist Temporal
- Classification: Labelling Unsegmented Sequence Data with Recurrent
- Neural Networks <http://machinelearning.wustl.edu/mlpapers/paper_files/
- icml2006_GravesFGS06.pdf>`_
+ Reference:
+ Connectionist Temporal Classification: Labelling Unsegmented Sequence Data
+ with Recurrent Neural Networks
+ http://machinelearning.wustl.edu/mlpapers/paper_files/icml2006_GravesFGS06.pdf
  Note:
- Considering the 'blank' label needed by CTC, you need to use
- (num_classes + 1) as the input size. num_classes is the category number.
- And the 'blank' is the last category index. So the size of 'input' layer, such as
- fc_layer with softmax activation, should be num_classes + 1. The size of ctc_layer
- should also be num_classes + 1.
+ Considering the 'blank' label needed by CTC, you need to use (num_classes + 1)
+ as the size of the input, where num_classes is the category number.
+ And the 'blank' is the last category index. So the size of 'input' layer (e.g.
+ fc_layer with softmax activation) should be (num_classes + 1). The size of
+ ctc_layer should also be (num_classes + 1).
  The example usage is:
@@ -5231,16 +5234,17 @@ def ctc_layer(input,
  :param input: The input of this layer.
  :type input: LayerOutput
- :param label: The data layer of label with variable length.
+ :param label: The input label.
  :type label: LayerOutput
- :param size: category numbers + 1.
+ :param size: The dimension of this layer, which must be equal to (category number + 1).
  :type size: int
  :param name: The name of this layer. It is optional.
- :type name: basestring | None
- :param norm_by_times: Whether to normalization by times. False by default.
+ :type name: basestring
+ :param norm_by_times: Whether to do normalization by times. False is the default.
  :type norm_by_times: bool
- :param layer_attr: Extra Layer config.
- :type layer_attr: ExtraLayerAttribute | None
+ :param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
+ details.
+ :type layer_attr: ExtraLayerAttribute
  :return: LayerOutput object.
  :rtype: LayerOutput
  """
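The ctc_layer note says the layer size is (num_classes + 1) and that 'blank' is the last category index. A minimal sketch of why the extra class matters is greedy (best-path) CTC decoding, shown below in plain Python. It is an illustration only, not PaddlePaddle code; the function name `ctc_greedy_decode` and the list-of-lists input format are assumptions.

```python
def ctc_greedy_decode(frame_probs, num_classes):
    """Best-path decoding: argmax per frame, merge repeats, drop blanks.

    Each frame must hold num_classes + 1 scores; the extra, last index
    is treated as the 'blank' label, following the docstring's convention.
    """
    blank = num_classes
    decoded = []
    previous = None
    for frame in frame_probs:
        if len(frame) != num_classes + 1:
            raise ValueError("each frame needs num_classes + 1 scores")
        best = max(range(len(frame)), key=frame.__getitem__)
        # Emit a label only when it differs from the previous frame's
        # label and is not the blank symbol.
        if best != previous and best != blank:
            decoded.append(best)
        previous = best
    return decoded
```

The blank frame between two identical labels is what lets the decoder emit the same label twice in a row.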
@@ -5281,20 +5285,19 @@ def warp_ctc_layer(input,
  building process, PaddlePaddle will clone the source codes, build and
  install it to :code:`third_party/install/warpctc` directory.
- More details of CTC can be found by referring to `Connectionist Temporal
- Classification: Labelling Unsegmented Sequence Data with Recurrent
- Neural Networks <http://machinelearning.wustl.edu/mlpapers/paper_files/
- icml2006_GravesFGS06.pdf>`_.
+ Reference:
+ Connectionist Temporal Classification: Labelling Unsegmented Sequence Data
+ with Recurrent Neural Networks
+ http://machinelearning.wustl.edu/mlpapers/paper_files/icml2006_GravesFGS06.pdf
  Note:
- - Let num_classes represent the category number. Considering the 'blank'
- label needed by CTC, you need to use (num_classes + 1) as the input size.
- Thus, the size of both warp_ctc layer and 'input' layer should be set to
- num_classes + 1.
+ - Let num_classes represents the category number. Considering the 'blank'
+ label needed by CTC, you need to use (num_classes + 1) as the size of
+ warp_ctc layer.
  - You can set 'blank' to any value ranged in [0, num_classes], which
- should be consistent as that used in your labels.
+ should be consistent with those used in your labels.
  - As a native 'softmax' activation is interated to the warp-ctc library,
- 'linear' activation is expected instead in the 'input' layer.
+ 'linear' activation is expected to be used instead in the 'input' layer.
  The example usage is:
@@ -5308,18 +5311,19 @@ def warp_ctc_layer(input,
  :param input: The input of this layer.
  :type input: LayerOutput
- :param label: The data layer of label with variable length.
+ :param label: The input label.
  :type label: LayerOutput
- :param size: category numbers + 1.
+ :param size: The dimension of this layer, which must be equal to (category number + 1).
  :type size: int
  :param name: The name of this layer. It is optional.
- :type name: basestring | None
- :param blank: the 'blank' label used in ctc
+ :type name: basestring
+ :param blank: The 'blank' label used in ctc.
  :type blank: int
- :param norm_by_times: Whether to normalization by times. False by default.
+ :param norm_by_times: Whether to do normalization by times. False is the default.
  :type norm_by_times: bool
- :param layer_attr: Extra Layer config.
- :type layer_attr: ExtraLayerAttribute | None
+ :param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
+ details.
+ :type layer_attr: ExtraLayerAttribute
  :return: LayerOutput object.
  :rtype: LayerOutput
  """
@@ -5365,23 +5369,25 @@ def crf_layer(input,
  label=label,
  size=label_dim)
- :param input: The first input layer is the feature.
+ :param input: The first input layer.
  :type input: LayerOutput
- :param label: The second input layer is label.
+ :param label: The input label.
  :type label: LayerOutput
  :param size: The category number.
  :type size: int
- :param weight: The third layer is "weight" of each sample, which is an
- optional argument.
+ :param weight: The scale of the cost of each sample. It is optional.
  :type weight: LayerOutput
- :param param_attr: Parameter attribute. None means default attribute
+ :param param_attr: The parameter attribute. See ParameterAttribute for
+ details.
  :type param_attr: ParameterAttribute
  :param name: The name of this layer. It is optional.
- :type name: None | basestring
- :param coeff: The coefficient affects the gradient in the backward.
+ :type name: basestring
+ :param coeff: The weight of the gradient in the back propagation.
+ 1.0 is the default.
  :type coeff: float
- :param layer_attr: Extra Layer config.
- :type layer_attr: ExtraLayerAttribute | None
+ :param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
+ details.
+ :type layer_attr: ExtraLayerAttribute
  :return: LayerOutput object.
  :rtype: LayerOutput
  """
@@ -5427,9 +5433,9 @@ def crf_decoding_layer(input,
  """
  A layer for calculating the decoding sequence of sequential conditional
  random field model. The decoding sequence is stored in output.ids.
- If a second input is provided, it is treated as the ground-truth label, and
- this layer will also calculate error. output.value[i] is 1 for incorrect
- decoding or 0 for correct decoding.
+ If the input 'label' is provided, it is treated as the ground-truth label, and
+ this layer will also calculate error. output.value[i] is 1 for an incorrect
+ decoding and 0 for the correct.
  The example usage is:
@@ -5440,16 +5446,18 @@ def crf_decoding_layer(input,
  :param input: The first input layer.
  :type input: LayerOutput
- :param size: size of this layer.
+ :param size: The dimension of this layer.
  :type size: int
- :param label: None or ground-truth label.
- :type label: LayerOutput or None
- :param param_attr: Parameter attribute. None means default attribute
+ :param label: The input label.
+ :type label: LayerOutput | None
+ :param param_attr: The parameter attribute. See ParameterAttribute for
+ details.
  :type param_attr: ParameterAttribute
  :param name: The name of this layer. It is optional.
- :type name: None | basestring
- :param layer_attr: Extra Layer config.
- :type layer_attr: ExtraLayerAttribute | None
+ :type name: basestring
+ :param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
+ details.
+ :type layer_attr: ExtraLayerAttribute
  :return: LayerOutput object.
  :rtype: LayerOutput
  """
@@ -5494,8 +5502,10 @@ def nce_layer(input,
  layer_attr=None):
  """
  Noise-contrastive estimation.
- Implements the method in the following paper:
- A fast and simple algorithm for training neural probabilistic language models.
+ Reference:
+ A fast and simple algorithm for training neural probabilistic language models.
+ http://www.icml.cc/2012/papers/855.pdf
  The example usage is:
@@ -5507,31 +5517,33 @@ def nce_layer(input,
  :param name: The name of this layer. It is optional.
  :type name: basestring
- :param input: The input layers. It could be a LayerOutput of list/tuple of LayerOutput.
+ :param input: The first input of this layer.
  :type input: LayerOutput | list | tuple | collections.Sequence
- :param label: label layer
+ :param label: The input label.
  :type label: LayerOutput
- :param weight: weight layer, can be None(default)
+ :param weight: The scale of the cost. It is optional.
  :type weight: LayerOutput
- :param num_classes: number of classes.
+ :param num_classes: The number of classes.
  :type num_classes: int
  :param act: Activation type. SigmoidActivation is the default.
  :type act: BaseActivation
- :param param_attr: The Parameter Attribute|list.
+ :param param_attr: The parameter attribute. See ParameterAttribute for
+ details.
  :type param_attr: ParameterAttribute
- :param num_neg_samples: number of negative samples. Default is 10.
+ :param num_neg_samples: The number of negative samples. 10 is the default.
  :type num_neg_samples: int
- :param neg_distribution: The distribution for generating the random negative labels.
- A uniform distribution will be used if not provided.
- If not None, its length must be equal to num_classes.
+ :param neg_distribution: The probability distribution for generating the random negative
+ labels. If this parameter is not set, a uniform distribution will
+ be used. If not None, its length must be equal to num_classes.
  :type neg_distribution: list | tuple | collections.Sequence | None
  :param bias_attr: The bias attribute. If the parameter is set to False or an object
  whose type is not ParameterAttribute, no bias is defined. If the
  parameter is set to True, the bias is initialized to zero.
  :type bias_attr: ParameterAttribute | None | bool | Any
- :param layer_attr: Extra Layer Attribute.
+ :param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
+ details.
  :type layer_attr: ExtraLayerAttribute
- :return: layer name.
+ :return: LayerOutput object.
  :rtype: LayerOutput
  """
  if isinstance(input, LayerOutput):
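The `num_neg_samples` and `neg_distribution` parameters documented above can be pictured with a short sampling sketch. It only mirrors the documented contract (10 samples by default, a uniform distribution when none is given, and a length check against `num_classes`); the helper name `sample_negative_labels` and its `rng` argument are hypothetical and do not come from PaddlePaddle, whose actual sampling code may differ.

```python
import random


def sample_negative_labels(num_classes, num_neg_samples=10,
                           neg_distribution=None, rng=None):
    """Draw negative class labels for noise-contrastive estimation.

    Hypothetical helper: uniform sampling when neg_distribution is None,
    otherwise sampling proportional to the given per-class weights.
    """
    rng = rng or random.Random()
    if neg_distribution is None:
        # No distribution provided: every class is equally likely.
        return [rng.randrange(num_classes) for _ in range(num_neg_samples)]
    if len(neg_distribution) != num_classes:
        raise ValueError("neg_distribution length must equal num_classes")
    # random.choices samples with replacement according to the weights.
    return rng.choices(range(num_classes), weights=neg_distribution,
                       k=num_neg_samples)
```

Passing a seeded `random.Random` instance makes the draw reproducible, which is convenient when checking a configuration by hand.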
@@ -5605,11 +5617,11 @@ def rank_cost(left,
  coeff=1.0,
  layer_attr=None):
  """
- A cost Layer for learning to rank using gradient descent. Details can refer
- to `papers <http://research.microsoft.com/en-us/um/people/cburges/papers/
- ICML_ranking.pdf>`_.
- This layer contains at least three inputs. The weight is an optional
- argument, which affects the cost.
+ A cost Layer for learning to rank using gradient descent.
+ Reference:
+ Learning to Rank using Gradient Descent
+ http://research.microsoft.com/en-us/um/people/cburges/papers/ICML_ranking.pdf
  .. math::
@@ -5640,14 +5652,15 @@ def rank_cost(left,
  :type right: LayerOutput
  :param label: Label is 1 or 0, means positive order and reverse order.
  :type label: LayerOutput
- :param weight: The weight affects the cost, namely the scale of cost.
- It is an optional argument.
+ :param weight: The scale of cost. It is optional.
  :type weight: LayerOutput
  :param name: The name of this layer. It is optional.
- :type name: None | basestring
- :param coeff: The coefficient affects the gradient in the backward.
+ :type name: basestring
+ :param coeff: The weight of the gradient in the back propagation.
+ 1.0 is the default.
  :type coeff: float
- :param layer_attr: Extra Layer Attribute.
+ :param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
+ details.
  :type layer_attr: ExtraLayerAttribute
  :return: LayerOutput object.
  :rtype: LayerOutput
@@ -5692,25 +5705,25 @@ def lambda_cost(input,
  NDCG_num=8,
  max_sort_size=-1)
- :param input: Samples of the same query should be loaded as sequence.
+ :param input: The first input of this layer, which is often a document
+ samples list of the same query and whose type must be sequence.
  :type input: LayerOutput
- :param score: The 2nd input. Score of each sample.
+ :param score: The scores of the samples.
  :type input: LayerOutput
  :param NDCG_num: The size of NDCG (Normalized Discounted Cumulative Gain),
  e.g., 5 for NDCG@5. It must be less than or equal to the
- minimum size of lists.
+ minimum size of the list.
  :type NDCG_num: int
- :param max_sort_size: The size of partial sorting in calculating gradient.
- If max_sort_size = -1, then for each list, the
- algorithm will sort the entire list to get gradient.
- In other cases, max_sort_size must be greater than or
- equal to NDCG_num. And if max_sort_size is greater
- than the size of a list, the algorithm will sort the
- entire list of get gradient.
+ :param max_sort_size: The size of partial sorting in calculating gradient. If
+ max_sort_size is equal to -1 or greater than the number
+ of the samples in the list, then the algorithm will sort
+ the entire list to compute the gradient. In other cases,
+ max_sort_size must be greater than or equal to NDCG_num.
  :type max_sort_size: int
  :param name: The name of this layer. It is optional.
- :type name: None | basestring
- :param layer_attr: Extra Layer Attribute.
+ :type name: basestring
+ :param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
+ details.
  :type layer_attr: ExtraLayerAttribute
  :return: LayerOutput object.
  :rtype: LayerOutput
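The rules relating `NDCG_num`, `max_sort_size`, and the list size in the lambda_cost docstring can be restated as a small validation helper. This is a reading of the documented constraints only, not code from layers.py; the function name `effective_sort_size` and its error messages are hypothetical.

```python
def effective_sort_size(list_size, ndcg_num, max_sort_size=-1):
    """Return how many samples would be sorted for one query list.

    Encodes the documented rules: NDCG_num must not exceed the list size;
    -1 or a value larger than the list means "sort the entire list";
    any other value must be at least NDCG_num.
    """
    if ndcg_num > list_size:
        raise ValueError("NDCG_num must be <= the size of the list")
    if max_sort_size == -1 or max_sort_size > list_size:
        return list_size  # sort the entire list
    if max_sort_size < ndcg_num:
        raise ValueError("max_sort_size must be >= NDCG_num")
    return max_sort_size  # partial sort of the top entries
```

For a list of 10 samples with `NDCG_num=5`, the values -1 and 20 both resolve to a full sort of 10, while 7 resolves to a partial sort of 7.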
@@ -6830,8 +6843,8 @@ def img_conv3d_layer(input,
  parameter is set to True, the bias is initialized to zero.
  :type bias_attr: ParameterAttribute | None | bool | Any
  :param num_channels: The number of input channels. If the parameter is not set or
- set to None, its actual value will be automatically set to
- the channels number of the input.
+ set to None, its actual value will be automatically set to
+ the channels number of the input.
  :type num_channels: int
  :param param_attr: The parameter attribute of the convolution. See ParameterAttribute for
  details.