    epsilon(float): A value added to the denominator for
        numerical stability. Default: 1e-05.
    param_attr(ParamAttr|None): The parameter attribute for Parameter `scale`
        of batch_norm. If it is set to None or one attribute of ParamAttr, batch_norm
...
@@ -1034,28 +1033,29 @@ class BatchNorm(layers.Layer):
        If it is set to None or one attribute of ParamAttr, batch_norm
        will create ParamAttr as bias_attr. If the Initializer of the bias_attr
        is not set, the bias is initialized to zero. Default: None.
    data_layout(string): NCHW|NHWC. Default: NCHW.
    in_place(bool): Make the input and output of batch norm reuse memory. Default: False.
    moving_mean_name(string|None): The name of moving_mean which stores the global Mean. Default: None.
    moving_variance_name(string|None): The name of the moving_variance which stores the global Variance. Default: None.
    do_model_average_for_mean_and_var(bool): Whether to do model average for mean and variance. Default: False.
    fuse_with_relu (bool): if True, this OP performs relu after batch norm. Default: False.
    use_global_stats(bool): Whether to use global mean and
        variance. In inference or test mode, set use_global_stats to true
        or is_test to true, and the behavior is equivalent.
        In train mode, when setting use_global_stats True, the global mean
        and variance are also used during train period. Default: False.
        (See the second example below.)
    trainable_statistics(bool): Whether to calculate mean and var in eval mode. In eval mode, when
        setting trainable_statistics True, mean and variance will be calculated by current batch statistics. Default: False.
Returns:
    Variable: A tensor variable which is the result after applying batch normalization on the input.
Examples:
    .. code-block:: python

        import paddle.fluid as fluid
        import numpy as np

        with fluid.dygraph.guard():
            x = fluid.dygraph.to_variable(
                np.random.random((3, 32)).astype('float32'))
            fc = fluid.dygraph.FC('fc', size=200, param_attr='fc1.w')
            hidden1 = fc(x)
            batch_norm = fluid.dygraph.BatchNorm("batch_norm", 200)
            hidden2 = batch_norm(hidden1)
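    A second, hedged sketch of the ``use_global_stats`` behavior described in Args
    (assuming the ``fluid.dygraph`` exports of this version; shapes and layer names
    are illustrative, not from the original docstring):

    .. code-block:: python

        import paddle.fluid as fluid
        import numpy as np

        with fluid.dygraph.guard():
            x = fluid.dygraph.to_variable(
                np.random.random((3, 10)).astype('float32'))
            # with use_global_stats=True, the accumulated global mean and
            # variance are used for normalization even in train mode
            bn = fluid.dygraph.BatchNorm("bn", 10, use_global_stats=True)
            y = bn(x)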
...
@@ -1196,14 +1196,16 @@ class Embedding(layers.Layer):
All the input variables are passed in as local variables to the LayerHelper constructor
Args:
    name_scope(str): The name scope of this layer.
    size(tuple|list): The shape of the look up table parameter. It should have two elements which indicate the size
        of the dictionary of embeddings and the size of each embedding vector respectively.
    is_sparse(bool): The flag indicating whether to use sparse update. Default: False.
    is_distributed(bool): Whether to run lookup table from remote parameter server. Default: False.
    padding_idx(int|long|None): If :attr:`None`, it has no effect on lookup.
        Otherwise the given :attr:`padding_idx` indicates padding the output with zeros whenever lookup encounters
        it in :attr:`input`. If :math:`padding_idx < 0`, the :attr:`padding_idx` to use in lookup is :math:`size[0] + dim`. Default: None.
        (See the second example below.)
    param_attr(ParamAttr): Parameters for this layer. Default: None.
    dtype(np.dtype|core.VarDesc.VarType|str): The type of data: float32, float_16, int etc. Default: 'float32'.
Returns:
    Variable: The tensor variable storing the embeddings of the \
...
@@ -1213,15 +1215,19 @@ class Embedding(layers.Layer):
    .. code-block:: python

        import paddle.fluid as fluid
        import paddle.fluid.dygraph.base as base
        import numpy as np

        inp_word = np.array([[[1]]]).astype('int64')
        dict_size = 20
        with fluid.dygraph.guard():
            emb = fluid.dygraph.Embedding(
                name_scope='embedding',
                size=[dict_size, 32],
                param_attr='emb.w',
                is_sparse=False)
            static_rlt3 = emb(base.to_variable(inp_word))
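    A further hedged sketch of the :attr:`padding_idx` behavior described in Args
    (the name scope and shapes here are illustrative only):

    .. code-block:: python

        import paddle.fluid as fluid
        import paddle.fluid.dygraph.base as base
        import numpy as np

        # ids equal to padding_idx are looked up as all-zero rows
        inp_word = np.array([[[0], [1]]]).astype('int64')
        with fluid.dygraph.guard():
            emb = fluid.dygraph.Embedding(
                name_scope='embedding_pad',  # hypothetical name scope
                size=[20, 32],
                padding_idx=1)
            out = emb(base.to_variable(inp_word))  # out[0, 1, :] is all zeros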
"""
"""
def__init__(self,
def__init__(self,
...
@@ -1232,7 +1238,6 @@ class Embedding(layers.Layer):
...
@@ -1232,7 +1238,6 @@ class Embedding(layers.Layer):
padding_idx=None,
padding_idx=None,
param_attr=None,
param_attr=None,
dtype='float32'):
dtype='float32'):
super(Embedding,self).__init__(name_scope,dtype)
super(Embedding,self).__init__(name_scope,dtype)
self._size=size
self._size=size
self._is_sparse=is_sparse
self._is_sparse=is_sparse
...
@@ -1299,28 +1304,28 @@ class LayerNorm(layers.Layer):
* :math:`b`: the trainable bias parameter.
Args:
    name_scope(str): The name scope of this layer.
    scale(bool): Whether to learn the adaptive gain :math:`g` after
        normalization. Default: True.
    shift(bool): Whether to learn the adaptive bias :math:`b` after
        normalization. Default: True.
    begin_norm_axis(int): The normalization will be performed along
        dimensions from :attr:`begin_norm_axis` to :attr:`rank(input)`.
        Default: 1.
    epsilon(float): The small value added to the variance to prevent
        division by zero. Default: 1e-05.
    param_attr(ParamAttr|None): The parameter attribute for the learnable
        gain :math:`g`. If :attr:`scale` is False, :attr:`param_attr` is
        omitted. If :attr:`scale` is True and :attr:`param_attr` is None,
        a default :code:`ParamAttr` would be added as scale. The
        :attr:`param_attr` is initialized as 1 if it is added. Default: None.
    bias_attr(ParamAttr|None): The parameter attribute for the learnable
        bias :math:`b`. If :attr:`shift` is False, :attr:`bias_attr` is
        omitted. If :attr:`shift` is True and :attr:`bias_attr` is None,
        a default :code:`ParamAttr` would be added as bias. The
        :attr:`bias_attr` is initialized as 0 if it is added. Default: None.
    act(str): Activation to be applied to the output of layer normalization.
        Default: None.
Returns:
    Result after normalization
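A minimal usage sketch, assuming the ``fluid.dygraph`` exports of this version
(the input shape is illustrative):

    .. code-block:: python

        import paddle.fluid as fluid
        import numpy as np

        with fluid.dygraph.guard():
            x = fluid.dygraph.to_variable(
                np.random.random((3, 32, 32)).astype('float32'))
            # normalize over all dimensions from begin_norm_axis onwards
            layer_norm = fluid.dygraph.LayerNorm('LayerNorm', begin_norm_axis=1)
            ret = layer_norm(x)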
...
@@ -1414,7 +1419,7 @@ class GRUUnit(layers.Layer):
if origin_mode is True, then the equation of a GRU step is from the paper
`Learning Phrase Representations using RNN Encoder-Decoder for Statistical
...
@@ -2040,8 +2060,6 @@ class Conv2DTranspose(layers.Layer):
        library is installed. Default: True.
    act (str): Activation type, if it is set to None, activation is not appended.
        Default: None.
Returns:
    Variable: The tensor variable storing the convolution transpose result.
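A minimal usage sketch, assuming the ``fluid.dygraph`` exports of this version
(the NCHW input shape below is illustrative):

    .. code-block:: python

        import paddle.fluid as fluid
        import numpy as np

        with fluid.dygraph.guard():
            data = np.random.random((3, 32, 32, 5)).astype('float32')
            conv2d_transpose = fluid.dygraph.Conv2DTranspose(
                'Conv2DTranspose', num_filters=2, filter_size=3)
            ret = conv2d_transpose(fluid.dygraph.to_variable(data))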
...
@@ -2173,11 +2191,11 @@ class SequenceConv(layers.Layer):
in the input parameters to the function.
Args:
    name_scope(str): The name scope of this layer.
    num_filters (int): number of filters.
    filter_size (int): the filter size (H and W). Default: 3.
    filter_stride (int): stride of the filter. Default: 1.
    padding (bool|None): if True, add paddings. Default: None.
    bias_attr (ParamAttr|bool|None): The parameter attribute for the bias of sequence_conv.
        If it is set to False, no bias will be added to the output units.
        If it is set to None or one attribute of ParamAttr, sequence_conv
...
@@ -2189,8 +2207,6 @@ class SequenceConv(layers.Layer):
        is not set, the parameter is initialized with Xavier. Default: None.
    act (str): Activation type, if it is set to None, activation is not appended.
        Default: None.
Returns:
    Variable: output of sequence_conv
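The dygraph SequenceConv consumes LoD-carrying sequence inputs, which are awkward
to build by hand in dygraph mode; as a hedged illustration, this sketch exercises
the underlying op through the static-graph ``fluid.layers.sequence_conv`` instead
(variable names are hypothetical):

    .. code-block:: python

        import paddle.fluid as fluid

        # a variable-length sequence input (lod_level=1), 10 features per step
        x = fluid.layers.data(name='x', shape=[10], dtype='float32', lod_level=1)
        out = fluid.layers.sequence_conv(input=x, num_filters=2, filter_size=3)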
...
@@ -2259,15 +2275,16 @@ class RowConv(layers.Layer):
For more details about row_conv, please refer to the design document https://github.com/PaddlePaddle/Paddle/issues/2228#issuecomment-303903645 .
Args:
    name_scope(str): The name scope of this layer.
    future_context_size (int): Future context size. Please note, the shape
        of the convolution kernel is [future_context_size + 1, D].
    param_attr (ParamAttr): Attributes of parameters, including
        name, initializer etc. Default: None.
    act (str): Non-linear activation to be applied to output variable. Default: None.
Returns:
    the output(Out) is a LodTensor, which supports variable time-length input sequences.
    The underlying tensor in this LodTensor is a matrix with shape T x N, i.e., the same shape as X.
Examples:
    .. code-block:: python
...
@@ -2321,10 +2338,10 @@ class GroupNorm(layers.Layer):
Refer to `Group Normalization <https://arxiv.org/abs/1803.08494>`_ .
Args:
    name_scope(str): The name scope of this layer.
    groups(int): The number of groups into which the channels are divided.
    epsilon(float): The small value added to the variance to prevent
        division by zero. Default: 1e-05.
    param_attr(ParamAttr|None): The parameter attribute for the learnable
        scale :math:`g`. If it is set to False, no scale will be added to the output units.
        If it is set to None, the scale is initialized to one. Default: None.
...
@@ -2333,7 +2350,6 @@ class GroupNorm(layers.Layer):
        If it is set to None, the bias is initialized to zero. Default: None.
    act(str): Activation to be applied to the output of group normalization.
    data_layout(string): Only NCHW is supported. Default: NCHW.
Returns:
    Variable: A tensor variable which is the result after applying group normalization on the input.
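A minimal usage sketch, assuming the ``fluid.dygraph`` exports of this version
(channels must be divisible by ``groups``; the shapes are illustrative):

    .. code-block:: python

        import paddle.fluid as fluid
        import numpy as np

        with fluid.dygraph.guard():
            x = np.random.random((8, 32, 32)).astype('float32')
            group_norm = fluid.dygraph.GroupNorm('GroupNorm', groups=4)
            ret = group_norm(fluid.dygraph.to_variable(x))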
...
@@ -2450,10 +2466,10 @@ class SpectralNorm(layers.Layer):
Refer to `Spectral Normalization <https://arxiv.org/abs/1802.05957>`_ .
Args:
    name_scope(str): The name scope of this layer.
    dim(int): The index of the dimension which should be permuted to the first before reshaping Input(Weight)
        to a matrix. It should be set to 0 if Input(Weight) is the weight of an fc layer, and to 1 if
        Input(Weight) is the weight of a conv layer. Default: 0.
    power_iters(int): The number of power iterations to calculate spectral norm. Default: 1.
    eps(float): The epsilon for numerical stability in calculating norms. Default: 1e-12.
    name (str): The name of this layer. It is optional.
Returns:
...
@@ -2527,20 +2543,22 @@ class TreeConv(layers.Layer):
Args:
    name_scope(str): The name scope of this layer.
    output_size(int): output feature width.
    num_filters(int): number of filters. Default: 1.
    max_depth(int): max depth of filters. Default: 2.
    act(str): activation function. Default: tanh.
    param_attr(ParamAttr): the parameter attribute for the filters. Default: None.
    bias_attr(ParamAttr): the parameter attribute for the bias of this layer. Default: None.
    name(str): a name for this layer (optional). If set to None, the layer will be named automatically. Default: None.
Returns:
    out(Variable): (Tensor) The feature vector of subtrees. The shape of the output tensor is [max_tree_node_size, output_size, num_filters]. The output tensor could be a new feature vector for the next tree convolution layers.
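A minimal usage sketch, assuming the ``fluid.dygraph`` exports of this version
(the node/edge shapes are illustrative; each edge_set pair indexes into the
node dimension, and zero-valued pairs act as padding):

    .. code-block:: python

        import paddle.fluid as fluid
        import numpy as np

        with fluid.dygraph.guard():
            nodes_vector = np.random.random((1, 10, 5)).astype('float32')
            edge_set = np.random.random((1, 9, 2)).astype('int32')
            tree_conv = fluid.dygraph.TreeConv(
                'TreeConv', output_size=6, num_filters=1, max_depth=2)
            ret = tree_conv(fluid.dygraph.to_variable(nodes_vector),
                            fluid.dygraph.to_variable(edge_set))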