Commit de89b472 (unverified)
Authored by Yang yaming on Jan 23, 2018; committed by GitHub on Jan 23, 2018
Merge pull request #7575 from pkuyym/fix-7555
Add Python wrapper for row conv operator.
Parents: 9609c17a, 630a8646
Showing 3 changed files with 88 additions and 21 deletions
doc/api/v2/fluid/layers.rst  +5 -0
python/paddle/v2/fluid/layers/nn.py  +75 -21
python/paddle/v2/fluid/tests/test_layers.py  +8 -0
doc/api/v2/fluid/layers.rst
@@ -529,3 +529,8 @@ sequence_reshape
 ----------------
 .. autofunction:: paddle.v2.fluid.layers.sequence_reshape
     :noindex:
+row_conv
+--------
+.. autofunction:: paddle.v2.fluid.layers.row_conv
+    :noindex:
python/paddle/v2/fluid/layers/nn.py
@@ -62,6 +62,7 @@ __all__ = [
     'im2sequence',
     'nce',
     'beam_search',
+    'row_conv',
 ]
@@ -193,7 +194,7 @@ def embedding(input,
     """
     **Embedding Layer**
     This layer is used to lookup embeddings of IDs, provided by :attr:`input`, in
     a lookup table. The result of this lookup is the embedding of each ID in the
     :attr:`input`.
@@ -208,8 +209,8 @@ def embedding(input,
         is_sparse(bool): The flag indicating whether to use sparse update.
         padding_idx(int|long|None): If :attr:`None`, it makes no effect to lookup.
             Otherwise the given :attr:`padding_idx` indicates padding the output
             with zeros whenever lookup encounters it in :attr:`input`. If
             :math:`padding_idx < 0`, the padding_idx to use in lookup is
             :math:`size[0] + dim`.
         param_attr(ParamAttr): Parameters for this layer
         dtype(np.dtype|core.DataType|str): The type of data : float32, float_16, int etc
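The `padding_idx` behavior described in this docstring can be illustrated with a small pure-Python sketch. It is not the Paddle implementation: `embedding_lookup` and its arguments are hypothetical names, and the sketch assumes that `size[0] + dim` in the docstring means a negative `padding_idx` is offset by the number of rows in the table (`size[0] + padding_idx`).

```python
def embedding_lookup(table, ids, padding_idx=None):
    """Return one row of `table` per id, or a zero row where id == padding_idx."""
    vocab_size = len(table)   # plays the role of size[0]
    dim = len(table[0])       # embedding width
    if padding_idx is not None and padding_idx < 0:
        # Assumption: a negative padding_idx counts back from the end of the table.
        padding_idx = vocab_size + padding_idx
    out = []
    for i in ids:
        if padding_idx is not None and i == padding_idx:
            out.append([0.0] * dim)      # padded position: all zeros
        else:
            out.append(list(table[i]))   # ordinary lookup
    return out


table = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
# padding_idx=-1 resolves to row 2 under the assumption above.
result = embedding_lookup(table, [0, 2, 1], padding_idx=-1)
```

With `padding_idx=None` the function reduces to a plain row lookup, which matches the "makes no effect" case in the docstring.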
@@ -396,9 +397,9 @@ def dynamic_gru(input,
     """
     **Dynamic GRU Layer**
     Refer to `Empirical Evaluation of Gated Recurrent Neural Networks on
     Sequence Modeling <https://arxiv.org/abs/1412.3555>`_
     The formula is as follows:
     .. math::
@@ -408,47 +409,47 @@ def dynamic_gru(input,
         r_t & = act_g(W_{rx}x_{t} + W_{rh}h_{t-1} + b_r)
         \\tilde{h_t} & = act_c(W_{cx}x_{t} + W_{ch}(r_t \odot h_{t-1}) + b_c)
         h_t & = (1-u_t) \odot h_{t-1} + u_t \odot \\tilde{h_t}
     The :math:`\odot` is the element-wise product of the vectors. :math:`act_g`
     is the update gate and reset gate activation function and :math:`sigmoid`
     is usually used for it. :math:`act_c` is the activation function for
     candidate hidden state and :math:`tanh` is usually used for it.
     Note that these :math:`W_{ux}x_{t}, W_{rx}x_{t}, W_{cx}x_{t}` operations on
     the input :math:`x_{t}` are NOT included in this operator. Users can choose
     to use fully-connect layer before GRU layer.
     Args:
         input(Variable): The input of dynamic_gru layer, which supports
             variable-time length input sequence. The underlying tensor in this
             Variable is a matrix with shape :math:`(T \\times 3D)`, where
             :math:`T` is the total time steps in this mini-batch, :math:`D`
             is the hidden size.
         size(int): The dimension of the gru cell.
         param_attr(ParamAttr|None): The parameter attribute for the learnable
             hidden-hidden weight matrix. Note:
             - The shape of the weight matrix is :math:`(T \\times 3D)`, where
               :math:`D` is the hidden size.
             - All elements in the weight matrix can be divided into two parts.
               The first part are weights of the update gate and reset gate with
               shape :math:`(D \\times 2D)`, and the second part are weights for
               candidate hidden state with shape :math:`(D \\times D)`.
         bias_attr(ParamAttr): The parameter attribute for learnable the
             hidden-hidden bias.
         is_reverse(bool): Whether to compute reversed GRU, default
             :attr:`False`.
         gate_activation(str): The activation for update gate and reset gate.
             Choices = ["sigmoid", "tanh", "relu", "identity"], default "sigmoid".
         activation(str): The activation for candidate hidden state.
             Choices = ["sigmoid", "tanh", "relu", "identity"], default "tanh".
     Returns:
         Variable: The hidden state of GRU. The shape is (T \\times D), and lod \
             is the same with the input.
     Examples:
         .. code-block:: python
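The recurrence in this docstring can be sketched for a single time step in pure Python. This is an illustration, not the operator's kernel. It makes three assumptions: the update gate follows the usual form `u_t = act_g(W_{ux}x_t + W_{uh}h_{t-1} + b_u)` (that line falls outside the hunk shown), `x_proj` already holds the three input projections (which the docstring says the operator does not compute), and those projections are laid out in update, reset, candidate order. All function and argument names are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(W, v):
    """Multiply a row-major matrix W (list of rows) by a vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gru_step(x_proj, h_prev, W_uh, W_rh, W_ch, b_u, b_r, b_c):
    """One GRU step; x_proj has width 3*D and holds the projected inputs."""
    D = len(h_prev)
    # Assumed slice order: update gate, reset gate, candidate state.
    xu, xr, xc = x_proj[:D], x_proj[D:2 * D], x_proj[2 * D:]
    u = [sigmoid(a + b + c) for a, b, c in zip(xu, matvec(W_uh, h_prev), b_u)]
    r = [sigmoid(a + b + c) for a, b, c in zip(xr, matvec(W_rh, h_prev), b_r)]
    # The reset gate scales the previous state before the candidate projection.
    rh = [ri * hi for ri, hi in zip(r, h_prev)]
    c = [math.tanh(a + b + d) for a, b, d in zip(xc, matvec(W_ch, rh), b_c)]
    # h_t = (1 - u_t) * h_{t-1} + u_t * candidate, element-wise.
    return [(1.0 - ui) * hi + ui * ci for ui, hi, ci in zip(u, h_prev, c)]


zeros2x2 = [[0.0, 0.0], [0.0, 0.0]]
h_next = gru_step([0.0] * 6, [0.5, -0.5],
                  zeros2x2, zeros2x2, zeros2x2,
                  [0.0, 0.0], [0.0, 0.0], [0.0, 0.0])
```

With all weights, biases, and inputs set to zero, the gates sit at 0.5 and the candidate at 0, so the new state is half of the previous one.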
@@ -2564,3 +2565,56 @@ def im2sequence(input, filter_size=1, stride=1, padding=0, name=None):
             'paddings': padding,
         })
     return out
+def row_conv(input, future_context_size, param_attr=None, act=None):
+    """Row Conv Operator. This layer will apply lookahead convolution to
+    **input**. The input variable should be a 2D LoDTensor with shape [T, D].
+    Parameters with shape [future_context_size + 1, D] will be created. The math
+    equation of row convolution is as follows:
+    .. math::
+        Out_{i} = \sum_{j = i} ^ {i + \\tau} X_{j} \odot W_{i - j}
+    In the above equation:
+    * :math:`Out_{i}`: The i-th row of output variable with shape [1, D].
+    * :math:`\\tau`: Future context size.
+    * :math:`X_{j}`: The j-th row of input variable with shape [1, D].
+    * :math:`W_{i-j}`: The (i-j)-th row of parameters with shape [1, D].
+    More details about row_conv please refer to the paper \
+    (http://www.cs.cmu.edu/~dyogatam/papers/wang+etal.iclrworkshop2016.pdf) and
+    the design document \
+    (https://github.com/PaddlePaddle/Paddle/issues/2228#issuecomment-303903645).
+    Args:
+        input (Variable): Input variable, a 2D LoDTensor with shape [T, D].
+        future_context_size (int): Future context size. Please note, the shape
+            of convolution kernel is [future_context_size + 1, D].
+        param_attr (ParamAttr): Attributes of parameters, including
+            name, initializer etc.
+        act (str): Non-linear activation to be applied to output variable.
+    Returns:
+        Variable: The output tensor with same shape as input tensor.
+    Examples:
+        .. code-block:: python
+          x = fluid.layers.data(name='x', shape=[16],
+                                dtype='float32', lod_level=1)
+          out = fluid.layers.row_conv(input=x, future_context_size=2)
+    """
+    helper = LayerHelper('row_conv', **locals())
+    dtype = helper.input_dtype()
+    filter_shape = [future_context_size + 1, input.shape[1]]
+    filter_param = helper.create_parameter(
+        attr=helper.param_attr, shape=filter_shape, dtype=dtype)
+    out = helper.create_tmp_variable(dtype)
+    helper.append_op(
+        type='row_conv',
+        inputs={'X': [input],
+                'Filter': [filter_param]},
+        outputs={'Out': [out]})
+    return helper.append_activation(out)
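To make the lookahead convolution in the `row_conv` docstring concrete, here is a minimal pure-Python reference for one sequence. It does not use Paddle, and `row_conv_reference` is a hypothetical name. Two assumptions are made: the filter row is selected by the offset `j - i` (the docstring writes `W_{i - j}`, which would be non-positive for `j >= i`), and the sum is truncated at the end of the sequence. A LoDTensor holding several sequences would need this applied to each sequence separately.

```python
def row_conv_reference(x, w):
    """Lookahead convolution for one sequence.

    x: list of T rows, each a list of D floats (the input sequence).
    w: list of (future_context_size + 1) rows, each a list of D floats.
    Returns a list of T rows with the same shape as x.
    """
    T = len(x)
    D = len(x[0]) if T else 0
    context = len(w)  # future_context_size + 1
    out = []
    for i in range(T):
        row = [0.0] * D
        # Combine the current step with up to `future_context_size` later steps.
        for k in range(context):
            j = i + k
            if j >= T:
                break  # assumption: steps past the sequence end contribute nothing
            for d in range(D):
                row[d] += x[j][d] * w[k][d]  # element-wise product, filter row k = j - i
        out.append(row)
    return out


x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
w = [[1.0, 1.0], [0.5, 0.0], [0.0, 2.0]]  # future_context_size = 2
out = row_conv_reference(x, w)
# out == [[2.5, 14.0], [5.5, 4.0], [5.0, 6.0]]
```

The output has the same shape as the input, in line with the `Returns` entry of the docstring, and the filter has `future_context_size + 1` rows, in line with `filter_shape` in the wrapper.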
python/paddle/v2/fluid/tests/test_layers.py
@@ -271,6 +271,14 @@ class TestBook(unittest.TestCase):
             self.assertIsNotNone(avg_loss)
         print(str(default_main_program()))
+    def test_row_conv(self):
+        program = Program()
+        with program_guard(program):
+            x = layers.data(name='x', shape=[16], dtype='float32', lod_level=1)
+            out = layers.row_conv(input=x, future_context_size=2)
+            self.assertIsNotNone(out)
+        print(str(program))
 if __name__ == '__main__':
     unittest.main()