Commit 0e73967a, authored by ranqiu

Update the annotations of layers.py

Parent 7d343fca
...@@ -5135,12 +5135,19 @@ def block_expand_layer(input, ...@@ -5135,12 +5135,19 @@ def block_expand_layer(input,
@layer_support() @layer_support()
def maxout_layer(input, groups, num_channels=None, name=None, layer_attr=None): def maxout_layer(input, groups, num_channels=None, name=None, layer_attr=None):
""" """
A layer to do max out on conv layer output. A layer to do max out on convolutional layer output.
- Input: output of a conv layer. - Input: the output of a convolutional layer.
- Output: feature map size same as input. Channel is (input channel) / groups. - Output: feature map size same as the input's, and its channel number is
(input channel) / groups.
So groups should be larger than 1, and the num of channels should be able So groups should be larger than 1, and the num of channels should be able
to devided by groups. to be devided by groups.
Reference:
Maxout Networks
http://www.jmlr.org/proceedings/papers/v28/goodfellow13.pdf
Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks
https://arxiv.org/pdf/1312.6082v4.pdf
.. math:: .. math::
y_{si+j} = \max_k x_{gsi + sk + j} y_{si+j} = \max_k x_{gsi + sk + j}
...@@ -5150,12 +5157,6 @@ def maxout_layer(input, groups, num_channels=None, name=None, layer_attr=None): ...@@ -5150,12 +5157,6 @@ def maxout_layer(input, groups, num_channels=None, name=None, layer_attr=None):
0 \le j < s 0 \le j < s
0 \le k < groups 0 \le k < groups
Please refer to Paper:
- Maxout Networks: http://www.jmlr.org/proceedings/papers/v28/goodfellow13.pdf
- Multi-digit Number Recognition from Street View \
Imagery using Deep Convolutional Neural Networks: \
https://arxiv.org/pdf/1312.6082v4.pdf
The simple usage is: The simple usage is:
.. code-block:: python .. code-block:: python
...@@ -5166,14 +5167,16 @@ def maxout_layer(input, groups, num_channels=None, name=None, layer_attr=None): ...@@ -5166,14 +5167,16 @@ def maxout_layer(input, groups, num_channels=None, name=None, layer_attr=None):
:param input: The input of this layer. :param input: The input of this layer.
:type input: LayerOutput :type input: LayerOutput
:param num_channels: The channel number of input layer. If None will be set :param num_channels: The number of input channels. If the parameter is not set or
automatically from previous output. set to None, its actual value will be automatically set to
:type num_channels: int | None the channels number of the input.
:type num_channels: int
:param groups: The group number of input layer. :param groups: The group number of input layer.
:type groups: int :type groups: int
:param name: The name of this layer. It is optional. :param name: The name of this layer. It is optional.
:type name: None | basestring. :type name: basestring
:param layer_attr: Extra Layer attribute. :param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
details.
:type layer_attr: ExtraLayerAttribute :type layer_attr: ExtraLayerAttribute
:return: LayerOutput object. :return: LayerOutput object.
:rtype: LayerOutput :rtype: LayerOutput
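Reviewer note: the index formula in the maxout_layer docstring, y_{si+j} = \max_k x_{gsi + sk + j}, is easier to check with a small reference computation. The sketch below is plain Python and is not PaddlePaddle code. It assumes that g is `groups`, that s is the per-channel feature map size (input size divided by `num_channels`), and that i ranges over the output channels; those definitions sit in the part of the docstring that the hunk does not show. The function name `maxout_reference` is hypothetical.

```python
def maxout_reference(x, groups, num_channels):
    """Apply the docstring's maxout index formula to a flat, channel-major list.

    Assumptions (not shown in the diff): s = len(x) // num_channels and
    0 <= i < num_channels // groups.
    """
    if groups <= 1:
        raise ValueError("groups should be larger than 1")
    if num_channels % groups != 0:
        raise ValueError("num_channels must be divisible by groups")
    if len(x) % num_channels != 0:
        raise ValueError("len(x) must be a multiple of num_channels")

    s = len(x) // num_channels            # feature map size per channel
    out_channels = num_channels // groups  # (input channel) / groups
    y = [None] * (out_channels * s)
    for i in range(out_channels):
        for j in range(s):
            # y_{si+j} = max_k x_{gsi + sk + j}, with 0 <= k < groups
            y[s * i + j] = max(x[groups * s * i + s * k + j] for k in range(groups))
    return y


# 4 input channels of size 2, reduced to 2 output channels of size 2.
print(maxout_reference([0, 1, 2, 3, 4, 5, 6, 7], groups=2, num_channels=4))
```

The output keeps the feature map size of the input and has `num_channels / groups` channels, which matches the "Output" line of the docstring.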
...@@ -5205,20 +5208,20 @@ def ctc_layer(input, ...@@ -5205,20 +5208,20 @@ def ctc_layer(input,
layer_attr=None): layer_attr=None):
""" """
Connectionist Temporal Classification (CTC) is designed for temporal Connectionist Temporal Classification (CTC) is designed for temporal
classication task. That is, for sequence labeling problems where the classication task. e.g. sequence labeling problems where the
alignment between the inputs and the target labels is unknown. alignment between the inputs and the target labels is unknown.
More details can be found by referring to `Connectionist Temporal Reference:
Classification: Labelling Unsegmented Sequence Data with Recurrent Connectionist Temporal Classification: Labelling Unsegmented Sequence Data
Neural Networks <http://machinelearning.wustl.edu/mlpapers/paper_files/ with Recurrent Neural Networks
icml2006_GravesFGS06.pdf>`_ http://machinelearning.wustl.edu/mlpapers/paper_files/icml2006_GravesFGS06.pdf
Note: Note:
Considering the 'blank' label needed by CTC, you need to use Considering the 'blank' label needed by CTC, you need to use (num_classes + 1)
(num_classes + 1) as the input size. num_classes is the category number. as the size of the input, where num_classes is the category number.
And the 'blank' is the last category index. So the size of 'input' layer, such as And the 'blank' is the last category index. So the size of 'input' layer (e.g.
fc_layer with softmax activation, should be num_classes + 1. The size of ctc_layer fc_layer with softmax activation) should be (num_classes + 1). The size of
should also be num_classes + 1. ctc_layer should also be (num_classes + 1).
The example usage is: The example usage is:
...@@ -5231,16 +5234,17 @@ def ctc_layer(input, ...@@ -5231,16 +5234,17 @@ def ctc_layer(input,
:param input: The input of this layer. :param input: The input of this layer.
:type input: LayerOutput :type input: LayerOutput
:param label: The data layer of label with variable length. :param label: The input label.
:type label: LayerOutput :type label: LayerOutput
:param size: category numbers + 1. :param size: The dimension of this layer, which must be equal to (category number + 1).
:type size: int :type size: int
:param name: The name of this layer. It is optional. :param name: The name of this layer. It is optional.
:type name: basestring | None :type name: basestring
:param norm_by_times: Whether to normalization by times. False by default. :param norm_by_times: Whether to do normalization by times. False is the default.
:type norm_by_times: bool :type norm_by_times: bool
:param layer_attr: Extra Layer config. :param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
:type layer_attr: ExtraLayerAttribute | None details.
:type layer_attr: ExtraLayerAttribute
:return: LayerOutput object. :return: LayerOutput object.
:rtype: LayerOutput :rtype: LayerOutput
""" """
...@@ -5281,20 +5285,19 @@ def warp_ctc_layer(input, ...@@ -5281,20 +5285,19 @@ def warp_ctc_layer(input,
building process, PaddlePaddle will clone the source codes, build and building process, PaddlePaddle will clone the source codes, build and
install it to :code:`third_party/install/warpctc` directory. install it to :code:`third_party/install/warpctc` directory.
More details of CTC can be found by referring to `Connectionist Temporal Reference:
Classification: Labelling Unsegmented Sequence Data with Recurrent Connectionist Temporal Classification: Labelling Unsegmented Sequence Data
Neural Networks <http://machinelearning.wustl.edu/mlpapers/paper_files/ with Recurrent Neural Networks
icml2006_GravesFGS06.pdf>`_. http://machinelearning.wustl.edu/mlpapers/paper_files/icml2006_GravesFGS06.pdf
Note: Note:
- Let num_classes represent the category number. Considering the 'blank' - Let num_classes represents the category number. Considering the 'blank'
label needed by CTC, you need to use (num_classes + 1) as the input size. label needed by CTC, you need to use (num_classes + 1) as the size of
Thus, the size of both warp_ctc layer and 'input' layer should be set to warp_ctc layer.
num_classes + 1.
- You can set 'blank' to any value ranged in [0, num_classes], which - You can set 'blank' to any value ranged in [0, num_classes], which
should be consistent as that used in your labels. should be consistent with those used in your labels.
- As a native 'softmax' activation is interated to the warp-ctc library, - As a native 'softmax' activation is interated to the warp-ctc library,
'linear' activation is expected instead in the 'input' layer. 'linear' activation is expected to be used instead in the 'input' layer.
The example usage is: The example usage is:
...@@ -5308,18 +5311,19 @@ def warp_ctc_layer(input, ...@@ -5308,18 +5311,19 @@ def warp_ctc_layer(input,
:param input: The input of this layer. :param input: The input of this layer.
:type input: LayerOutput :type input: LayerOutput
:param label: The data layer of label with variable length. :param label: The input label.
:type label: LayerOutput :type label: LayerOutput
:param size: category numbers + 1. :param size: The dimension of this layer, which must be equal to (category number + 1).
:type size: int :type size: int
:param name: The name of this layer. It is optional. :param name: The name of this layer. It is optional.
:type name: basestring | None :type name: basestring
:param blank: the 'blank' label used in ctc :param blank: The 'blank' label used in ctc.
:type blank: int :type blank: int
:param norm_by_times: Whether to normalization by times. False by default. :param norm_by_times: Whether to do normalization by times. False is the default.
:type norm_by_times: bool :type norm_by_times: bool
:param layer_attr: Extra Layer config. :param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
:type layer_attr: ExtraLayerAttribute | None details.
:type layer_attr: ExtraLayerAttribute
:return: LayerOutput object. :return: LayerOutput object.
:rtype: LayerOutput :rtype: LayerOutput
""" """
...@@ -5365,23 +5369,25 @@ def crf_layer(input, ...@@ -5365,23 +5369,25 @@ def crf_layer(input,
label=label, label=label,
size=label_dim) size=label_dim)
:param input: The first input layer is the feature. :param input: The first input layer.
:type input: LayerOutput :type input: LayerOutput
:param label: The second input layer is label. :param label: The input label.
:type label: LayerOutput :type label: LayerOutput
:param size: The category number. :param size: The category number.
:type size: int :type size: int
:param weight: The third layer is "weight" of each sample, which is an :param weight: The scale of the cost of each sample. It is optional.
optional argument.
:type weight: LayerOutput :type weight: LayerOutput
:param param_attr: Parameter attribute. None means default attribute :param param_attr: The parameter attribute. See ParameterAttribute for
details.
:type param_attr: ParameterAttribute :type param_attr: ParameterAttribute
:param name: The name of this layer. It is optional. :param name: The name of this layer. It is optional.
:type name: None | basestring :type name: basestring
:param coeff: The coefficient affects the gradient in the backward. :param coeff: The weight of the gradient in the back propagation.
1.0 is the default.
:type coeff: float :type coeff: float
:param layer_attr: Extra Layer config. :param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
:type layer_attr: ExtraLayerAttribute | None details.
:type layer_attr: ExtraLayerAttribute
:return: LayerOutput object. :return: LayerOutput object.
:rtype: LayerOutput :rtype: LayerOutput
""" """
...@@ -5427,9 +5433,9 @@ def crf_decoding_layer(input, ...@@ -5427,9 +5433,9 @@ def crf_decoding_layer(input,
""" """
A layer for calculating the decoding sequence of sequential conditional A layer for calculating the decoding sequence of sequential conditional
random field model. The decoding sequence is stored in output.ids. random field model. The decoding sequence is stored in output.ids.
If a second input is provided, it is treated as the ground-truth label, and If the input 'label' is provided, it is treated as the ground-truth label, and
this layer will also calculate error. output.value[i] is 1 for incorrect this layer will also calculate error. output.value[i] is 1 for an incorrect
decoding or 0 for correct decoding. decoding and 0 for the correct.
The example usage is: The example usage is:
...@@ -5440,16 +5446,18 @@ def crf_decoding_layer(input, ...@@ -5440,16 +5446,18 @@ def crf_decoding_layer(input,
:param input: The first input layer. :param input: The first input layer.
:type input: LayerOutput :type input: LayerOutput
:param size: size of this layer. :param size: The dimension of this layer.
:type size: int :type size: int
:param label: None or ground-truth label. :param label: The input label.
:type label: LayerOutput or None :type label: LayerOutput | None
:param param_attr: Parameter attribute. None means default attribute :param param_attr: The parameter attribute. See ParameterAttribute for
details.
:type param_attr: ParameterAttribute :type param_attr: ParameterAttribute
:param name: The name of this layer. It is optional. :param name: The name of this layer. It is optional.
:type name: None | basestring :type name: basestring
:param layer_attr: Extra Layer config. :param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
:type layer_attr: ExtraLayerAttribute | None details.
:type layer_attr: ExtraLayerAttribute
:return: LayerOutput object. :return: LayerOutput object.
:rtype: LayerOutput :rtype: LayerOutput
""" """
...@@ -5494,8 +5502,10 @@ def nce_layer(input, ...@@ -5494,8 +5502,10 @@ def nce_layer(input,
layer_attr=None): layer_attr=None):
""" """
Noise-contrastive estimation. Noise-contrastive estimation.
Implements the method in the following paper:
Reference:
A fast and simple algorithm for training neural probabilistic language models. A fast and simple algorithm for training neural probabilistic language models.
http://www.icml.cc/2012/papers/855.pdf
The example usage is: The example usage is:
...@@ -5507,31 +5517,33 @@ def nce_layer(input, ...@@ -5507,31 +5517,33 @@ def nce_layer(input,
:param name: The name of this layer. It is optional. :param name: The name of this layer. It is optional.
:type name: basestring :type name: basestring
:param input: The input layers. It could be a LayerOutput of list/tuple of LayerOutput. :param input: The first input of this layer.
:type input: LayerOutput | list | tuple | collections.Sequence :type input: LayerOutput | list | tuple | collections.Sequence
:param label: label layer :param label: The input label.
:type label: LayerOutput :type label: LayerOutput
:param weight: weight layer, can be None(default) :param weight: The scale of the cost. It is optional.
:type weight: LayerOutput :type weight: LayerOutput
:param num_classes: number of classes. :param num_classes: The number of classes.
:type num_classes: int :type num_classes: int
:param act: Activation type. SigmoidActivation is the default. :param act: Activation type. SigmoidActivation is the default.
:type act: BaseActivation :type act: BaseActivation
:param param_attr: The Parameter Attribute|list. :param param_attr: The parameter attribute. See ParameterAttribute for
details.
:type param_attr: ParameterAttribute :type param_attr: ParameterAttribute
:param num_neg_samples: number of negative samples. Default is 10. :param num_neg_samples: The number of negative samples. 10 is the default.
:type num_neg_samples: int :type num_neg_samples: int
:param neg_distribution: The distribution for generating the random negative labels. :param neg_distribution: The probability distribution for generating the random negative
A uniform distribution will be used if not provided. labels. If this parameter is not set, a uniform distribution will
If not None, its length must be equal to num_classes. be used. If not None, its length must be equal to num_classes.
:type neg_distribution: list | tuple | collections.Sequence | None :type neg_distribution: list | tuple | collections.Sequence | None
:param bias_attr: The bias attribute. If the parameter is set to False or an object :param bias_attr: The bias attribute. If the parameter is set to False or an object
whose type is not ParameterAttribute, no bias is defined. If the whose type is not ParameterAttribute, no bias is defined. If the
parameter is set to True, the bias is initialized to zero. parameter is set to True, the bias is initialized to zero.
:type bias_attr: ParameterAttribute | None | bool | Any :type bias_attr: ParameterAttribute | None | bool | Any
:param layer_attr: Extra Layer Attribute. :param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
details.
:type layer_attr: ExtraLayerAttribute :type layer_attr: ExtraLayerAttribute
:return: layer name. :return: LayerOutput object.
:rtype: LayerOutput :rtype: LayerOutput
""" """
if isinstance(input, LayerOutput): if isinstance(input, LayerOutput):
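Reviewer note: the nce_layer docstring states a small contract for `neg_distribution`: a uniform distribution is used when it is not set, and otherwise its length must equal `num_classes`. The sketch below illustrates that contract only, in plain Python. How PaddlePaddle actually draws negative samples is not shown in this diff, and the helper name `sample_negative_labels` and its `rng` parameter are hypothetical.

```python
import random


def sample_negative_labels(num_classes, num_neg_samples=10,
                           neg_distribution=None, rng=None):
    """Draw negative class labels following the documented parameter contract.

    Illustration only; not the sampler used inside PaddlePaddle.
    """
    rng = rng or random.Random()
    if neg_distribution is None:
        # Not set: fall back to a uniform distribution over all classes.
        return [rng.randrange(num_classes) for _ in range(num_neg_samples)]
    if len(neg_distribution) != num_classes:
        raise ValueError("neg_distribution length must equal num_classes")
    # Weighted sampling with replacement over class indices.
    return rng.choices(range(num_classes), weights=neg_distribution,
                       k=num_neg_samples)


labels = sample_negative_labels(5, neg_distribution=[0.1, 0.2, 0.3, 0.2, 0.2])
print(len(labels))
```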
...@@ -5605,11 +5617,11 @@ def rank_cost(left, ...@@ -5605,11 +5617,11 @@ def rank_cost(left,
coeff=1.0, coeff=1.0,
layer_attr=None): layer_attr=None):
""" """
A cost Layer for learning to rank using gradient descent. Details can refer A cost Layer for learning to rank using gradient descent.
to `papers <http://research.microsoft.com/en-us/um/people/cburges/papers/
ICML_ranking.pdf>`_. Reference:
This layer contains at least three inputs. The weight is an optional Learning to Rank using Gradient Descent
argument, which affects the cost. http://research.microsoft.com/en-us/um/people/cburges/papers/ICML_ranking.pdf
.. math:: .. math::
...@@ -5640,14 +5652,15 @@ def rank_cost(left, ...@@ -5640,14 +5652,15 @@ def rank_cost(left,
:type right: LayerOutput :type right: LayerOutput
:param label: Label is 1 or 0, means positive order and reverse order. :param label: Label is 1 or 0, means positive order and reverse order.
:type label: LayerOutput :type label: LayerOutput
:param weight: The weight affects the cost, namely the scale of cost. :param weight: The scale of cost. It is optional.
It is an optional argument.
:type weight: LayerOutput :type weight: LayerOutput
:param name: The name of this layer. It is optional. :param name: The name of this layer. It is optional.
:type name: None | basestring :type name: basestring
:param coeff: The coefficient affects the gradient in the backward. :param coeff: The weight of the gradient in the back propagation.
1.0 is the default.
:type coeff: float :type coeff: float
:param layer_attr: Extra Layer Attribute. :param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
details.
:type layer_attr: ExtraLayerAttribute :type layer_attr: ExtraLayerAttribute
:return: LayerOutput object. :return: LayerOutput object.
:rtype: LayerOutput :rtype: LayerOutput
...@@ -5692,25 +5705,25 @@ def lambda_cost(input, ...@@ -5692,25 +5705,25 @@ def lambda_cost(input,
NDCG_num=8, NDCG_num=8,
max_sort_size=-1) max_sort_size=-1)
:param input: Samples of the same query should be loaded as sequence. :param input: The first input of this layer, which is often a document
samples list of the same query and whose type must be sequence.
:type input: LayerOutput :type input: LayerOutput
:param score: The 2nd input. Score of each sample. :param score: The scores of the samples.
:type input: LayerOutput :type input: LayerOutput
:param NDCG_num: The size of NDCG (Normalized Discounted Cumulative Gain), :param NDCG_num: The size of NDCG (Normalized Discounted Cumulative Gain),
e.g., 5 for NDCG@5. It must be less than or equal to the e.g., 5 for NDCG@5. It must be less than or equal to the
minimum size of lists. minimum size of the list.
:type NDCG_num: int :type NDCG_num: int
:param max_sort_size: The size of partial sorting in calculating gradient. :param max_sort_size: The size of partial sorting in calculating gradient. If
If max_sort_size = -1, then for each list, the max_sort_size is equal to -1 or greater than the number
algorithm will sort the entire list to get gradient. of the samples in the list, then the algorithm will sort
In other cases, max_sort_size must be greater than or the entire list to compute the gradient. In other cases,
equal to NDCG_num. And if max_sort_size is greater max_sort_size must be greater than or equal to NDCG_num.
than the size of a list, the algorithm will sort the
entire list of get gradient.
:type max_sort_size: int :type max_sort_size: int
:param name: The name of this layer. It is optional. :param name: The name of this layer. It is optional.
:type name: None | basestring :type name: basestring
:param layer_attr: Extra Layer Attribute. :param layer_attr: The extra layer attribute. See ExtraLayerAttribute for
details.
:type layer_attr: ExtraLayerAttribute :type layer_attr: ExtraLayerAttribute
:return: LayerOutput object. :return: LayerOutput object.
:rtype: LayerOutput :rtype: LayerOutput
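Reviewer note: the reworded `max_sort_size` description for lambda_cost folds three cases into one sentence. A minimal sketch of the rule as documented, in plain Python, may help when checking the wording. The helper name `effective_sort_size` and its `list_size` argument are hypothetical; the function only restates the docstring and does not reflect the layer's internal code.

```python
def effective_sort_size(list_size, ndcg_num, max_sort_size=-1):
    """Return how many samples get sorted for one list, per the docstring."""
    if ndcg_num > list_size:
        # NDCG_num must be less than or equal to the minimum size of the list.
        raise ValueError("NDCG_num must not exceed the list size")
    if max_sort_size == -1 or max_sort_size > list_size:
        # -1, or larger than the list: sort the entire list.
        return list_size
    if max_sort_size < ndcg_num:
        # In other cases max_sort_size must be >= NDCG_num.
        raise ValueError("max_sort_size must be >= NDCG_num")
    return max_sort_size


print(effective_sort_size(list_size=10, ndcg_num=5))                   # whole list
print(effective_sort_size(list_size=10, ndcg_num=5, max_sort_size=7))  # partial sort
```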
...@@ -6831,7 +6844,7 @@ def img_conv3d_layer(input, ...@@ -6831,7 +6844,7 @@ def img_conv3d_layer(input,
:type bias_attr: ParameterAttribute | None | bool | Any :type bias_attr: ParameterAttribute | None | bool | Any
:param num_channels: The number of input channels. If the parameter is not set or :param num_channels: The number of input channels. If the parameter is not set or
set to None, its actual value will be automatically set to set to None, its actual value will be automatically set to
the channels number of the input . the channels number of the input.
:type num_channels: int :type num_channels: int
:param param_attr: The parameter attribute of the convolution. See ParameterAttribute for :param param_attr: The parameter attribute of the convolution. See ParameterAttribute for
details. details.
......
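Reviewer note on the CTC hunks above: the Notes for ctc_layer and warp_ctc_layer both reduce to simple size arithmetic. For ctc_layer the layer size is (num_classes + 1) and the 'blank' is the last category index. For warp_ctc_layer the size is also (num_classes + 1), and 'blank' may be any value in [0, num_classes]. The sketch below restates those rules in plain Python; both helper names are hypothetical and nothing here calls PaddlePaddle.

```python
def ctc_layer_sizes(num_classes):
    """Return (size, blank) for ctc_layer as described in its Note."""
    if num_classes < 1:
        raise ValueError("num_classes must be positive")
    size = num_classes + 1   # one extra slot for the 'blank' label
    blank = num_classes      # 'blank' is the last category index
    return size, blank


def check_warp_ctc_blank(blank, num_classes):
    """Validate a 'blank' index for warp_ctc_layer: any value in [0, num_classes]."""
    if not 0 <= blank <= num_classes:
        raise ValueError("blank must lie in [0, num_classes]")
    return blank


size, blank = ctc_layer_sizes(num_classes=26)
print(size, blank)
print(check_warp_ctc_blank(0, num_classes=26))
```

Whatever 'blank' value is chosen for warp_ctc_layer has to be consistent with the one used in the labels, as the Note says.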