机器未来 / Paddle (forked from PaddlePaddle / Paddle; in sync with the fork source)
Commit a741056d
Authored Oct 18, 2016 by Travis CI
Deploy to GitHub Pages: 45280a07
Parent: aafb7043
Showing 3 changed files with 287 additions and 4 deletions (+287, -4)
doc_cn/_sources/algorithm/rnn/hierarchical-rnn.txt (+137, -1)
doc_cn/algorithm/rnn/hierarchical-rnn.html (+149, -2)
doc_cn/searchindex.js (+1, -1)
doc_cn/_sources/algorithm/rnn/hierarchical-rnn.txt
...
...
@@ -260,7 +260,143 @@ out = recurrent_group(step=outer_step, input=SubsequenceInput(emb))
## Example 3: Two Inputs, Two Outputs, with Unequal-Length Inputs
TBD
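**Unequal-length inputs** means that the inputs to a recurrent_group may differ in length at each time step; however, one input must be designated whose length the output follows, and it is specified with <font color="red">targetInlink</font>. Reference configurations: single-layer RNN (`sequence_rnn_multi_unequalength_inputs.conf`) and two-layer RNN (`sequence_nest_rnn_multi_unequalength_inputs.conf`).
### Reading Two-Layer Sequences
Let us look at how single-layer and two-layer sequence data are organized and at the corresponding dataprovider (see `rnn_data_provider.py`).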
```python
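# reduce is a builtin in Python 2; importing it keeps the flattening code portable.
from functools import reduce
# Assumed import: the legacy PaddlePaddle data-provider module that supplies
# provider, integer_value, integer_value_sequence and integer_value_sub_sequence.
from paddle.trainer.PyDataProvider2 import *

# Each sample is [fea1, fea2, label]; fea1 and fea2 are two-layer (nested) sequences.
data2 = [
    [[[1, 2], [4, 5, 2]], [[5, 4, 1], [3, 1]], 0],
    [[[0, 2], [2, 5], [0, 1, 2]], [[1, 5], [4], [2, 3, 6, 1]], 1],
]

@provider(input_types=[integer_value_sub_sequence(10),
                       integer_value_sub_sequence(10),
                       integer_value(2)],
          should_shuffle=False)
def process_unequalength_subseq(settings, file_name):  # dataprovider for the two-layer RNN
    for d in data2:
        yield d

@provider(input_types=[integer_value_sequence(10),
                       integer_value_sequence(10),
                       integer_value(2)],
          should_shuffle=False)
def process_unequalength_seq(settings, file_name):  # dataprovider for the single-layer RNN
    for d in data2:
        # Concatenate the sub-sequences of each feature into one flat sequence.
        words1 = reduce(lambda x, y: x + y, d[0])
        words2 = reduce(lambda x, y: x + y, d[1])
        yield words1, words2, d[2]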
```
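data2 contains two samples, each with two features, denoted fea1 and fea2.
- Single-layer sequences: the two samples are [[1, 2, 4, 5, 2], [5, 4, 1, 3, 1]] and [[0, 2, 2, 5, 0, 1, 2], [1, 5, 4, 2, 3, 6, 1]].
- Two-layer sequences: the two samples are
    - **Sample 1**: [[[1, 2], [4, 5, 2]], [[5, 4, 1], [3, 1]]]. fea1 and fea2 each have 2 sub-sequences: fea1=[[1, 2], [4, 5, 2]], fea2=[[5, 4, 1], [3, 1]].
    - **Sample 2**: [[[0, 2], [2, 5], [0, 1, 2]],[[1, 5], [4], [2, 3, 6, 1]]]. fea1 and fea2 each have 3 sub-sequences: fea1=[[0, 2], [2, 5], [0, 1, 2]], fea2=[[1, 5], [4], [2, 3, 6, 1]].<br/>
    - **Note**: within each sample, all features must have the same number of sub-sequences. "Two inputs, two outputs, with unequal-length inputs" here means that the length of fea1's input at time step i may differ from the length of fea2's input at time step i. For example, in sample 1 at time step i=2 (counting from 1), fea1[2]=[4, 5, 2] and fea2[2]=[3, 1], and 3≠2.
- In both the single-layer and the two-layer case, the labels of the two samples are 0 and 1, respectively.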
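
The relationship between the two layouts described above can be checked with a short standalone sketch. It uses only the standard library and does not depend on PaddlePaddle; the helper names `flatten` and `subseq_lengths` are made up for this illustration and do not appear in `rnn_data_provider.py`.

```python
from functools import reduce

# Same toy data as above: each sample is [fea1, fea2, label].
data2 = [
    [[[1, 2], [4, 5, 2]], [[5, 4, 1], [3, 1]], 0],
    [[[0, 2], [2, 5], [0, 1, 2]], [[1, 5], [4], [2, 3, 6, 1]], 1],
]


def flatten(nested):
    """Concatenate the sub-sequences of a nested sequence into one flat list."""
    return reduce(lambda x, y: x + y, nested)


def subseq_lengths(nested):
    """Return the length of each sub-sequence, in order."""
    return [len(sub) for sub in nested]


for fea1, fea2, label in data2:
    # Both features must contain the same number of sub-sequences ...
    assert len(fea1) == len(fea2)
    # ... but sub-sequences at the same step may differ in length,
    # while the flattened (single-layer) features end up equally long.
    print(subseq_lengths(fea1), subseq_lengths(fea2),
          len(flatten(fea1)), len(flatten(fea2)), label)
```

For this particular data set the flattened features of each sample have equal length even though the individual sub-sequences do not, which is what allows the same data to feed the single-layer configuration.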
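### Model Configuration
The two model configurations, single-layer RNN (`sequence_rnn_multi_unequalength_inputs.conf`) and two-layer RNN (`sequence_nest_rnn_multi_unequalength_inputs.conf`), produce exactly the same result; they differ only in whether the input is a single-layer or a two-layer sequence. Let us see how each is implemented internally.
- Single-layer sequence:
    - The input goes through a simple recurrent_group. At each time step, the current input y and the previous step's output rnn_state pass through a fully connected layer, which does exactly what the `step` function of `sequence_rnn.conf` in Example 2 does. Here the two inputs x1 and x2 are each passed through calrnn, which returns the state at the latest time step. The resulting encoder1_rep and encoder2_rep are both single-layer sequences; finally, the last time step of encoder1_rep is added to every time step of encoder2_rep to obtain context.
    - Note that in every sample fed to this recurrent_group, fea1 and fea2 have the same length. This is no coincidence: when its inputs are single-layer sequences, recurrent_group requires all inputs to have equal length.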
```python
# Config fragment: hidden_dim, emb1 and emb2 are defined earlier in the .conf file.
def step(x1, x2):
    def calrnn(y):
        # The memory reads the previous step's output of the layer with the same name.
        mem = memory(name = 'rnn_state_' + y.name, size = hidden_dim)
        out = fc_layer(input = [y, mem],
                       size = hidden_dim,
                       act = TanhActivation(),
                       bias_attr = True,
                       name = 'rnn_state_' + y.name)
        return out

    encoder1 = calrnn(x1)
    encoder2 = calrnn(x2)
    return [encoder1, encoder2]

encoder1_rep, encoder2_rep = recurrent_group(
    name="stepout",
    step=step,
    input=[emb1, emb2])

# Last step of encoder1_rep, repeated to the length of encoder2_rep, then summed with it.
encoder1_last = last_seq(input = encoder1_rep)
encoder1_expandlast = expand_layer(input = encoder1_last,
                                   expand_as = encoder2_rep)
context = mixed_layer(input = [identity_projection(encoder1_expandlast),
                               identity_projection(encoder2_rep)],
                      size = hidden_dim)
```
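
What the last three layers of this configuration compute can be sketched in plain Python, outside PaddlePaddle. The function `combine_last_with_all` below is a hypothetical stand-in for the `last_seq`, `expand_layer`, `mixed_layer` chain: it takes the last vector of one sequence and adds it element-wise to every vector of the other, following the description above that the two are summed to obtain context. It illustrates the data flow only and is not PaddlePaddle's implementation.

```python
def combine_last_with_all(seq1, seq2):
    """Add the last vector of seq1 to every vector of seq2.

    seq1, seq2: lists of equal-width vectors (lists of numbers), one per time step.
    """
    last = seq1[-1]                    # role of last_seq
    expanded = [last for _ in seq2]    # role of expand_layer(expand_as=seq2)
    # role of mixed_layer with two identity projections: element-wise sum
    return [[a + b for a, b in zip(e, v)] for e, v in zip(expanded, seq2)]


# Toy values standing in for the recurrent_group outputs (hidden size 2, 3 steps).
encoder1_rep = [[1, 2], [3, 4], [5, 6]]
encoder2_rep = [[10, 10], [20, 20], [30, 30]]
context = combine_last_with_all(encoder1_rep, encoder2_rep)
print(context)
```

The result has one vector per time step of `encoder2_rep`, which is why `expand_layer` is given `expand_as = encoder2_rep`.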
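- Two-layer sequence:
    - In the two-layer RNN, each of the two input features goes through a chain of fully connected layers over time (`inner_step1` and `inner_step2` handle fea1 and fea2 respectively), which does exactly what the `outer_step` function of `sequence_nest_rnn.conf` in Example 2 does. The difference is that the inputs `[SubsequenceInput(emb1), SubsequenceInput(emb2)]` are no longer of equal length at each time step.
    - `outer_step` can process the two features separately, but the output layout of the recurrent_group (the length of each sub-sequence) can follow only one of them, and <font color=red>targetInlink</font> specifies which one; here it is set to follow emb2.
    - Finally, as before, the last time step of encoder1_rep is added to every time step of encoder2_rep to obtain context.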
```python
# Config fragment: hidden_dim, emb1 and emb2 are defined earlier in the .conf file.
def outer_step(x1, x2):
    # One outer memory per feature, carrying state from one sub-sequence to the next.
    outer_mem1 = memory(name = "outer_rnn_state1", size = hidden_dim)
    outer_mem2 = memory(name = "outer_rnn_state2", size = hidden_dim)

    def inner_step1(y):
        inner_mem = memory(name = 'inner_rnn_state_' + y.name,
                           size = hidden_dim,
                           boot_layer = outer_mem1)
        out = fc_layer(input = [y, inner_mem],
                       size = hidden_dim,
                       act = TanhActivation(),
                       bias_attr = True,
                       name = 'inner_rnn_state_' + y.name)
        return out

    def inner_step2(y):
        inner_mem = memory(name = 'inner_rnn_state_' + y.name,
                           size = hidden_dim,
                           boot_layer = outer_mem2)
        out = fc_layer(input = [y, inner_mem],
                       size = hidden_dim,
                       act = TanhActivation(),
                       bias_attr = True,
                       name = 'inner_rnn_state_' + y.name)
        return out

    encoder1 = recurrent_group(
        step = inner_step1,
        name = 'inner1',
        input = x1)

    encoder2 = recurrent_group(
        step = inner_step2,
        name = 'inner2',
        input = x2)

    # These layers share their names with the outer memories declared above.
    sentence_last_state1 = last_seq(input = encoder1, name = 'outer_rnn_state1')
    sentence_last_state2_ = last_seq(input = encoder2, name = 'outer_rnn_state2')

    encoder1_expand = expand_layer(input = sentence_last_state1,
                                   expand_as = encoder2)

    return [encoder1_expand, encoder2]

encoder1_rep, encoder2_rep = recurrent_group(
    name="outer",
    step=outer_step,
    input=[SubsequenceInput(emb1), SubsequenceInput(emb2)],
    targetInlink=emb2)

encoder1_last = last_seq(input = encoder1_rep)
encoder1_expandlast = expand_layer(input = encoder1_last,
                                   expand_as = encoder2_rep)
context = mixed_layer(input = [identity_projection(encoder1_expandlast),
                               identity_projection(encoder2_rep)],
                      size = hidden_dim)
```
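
The effect of `targetInlink` on the output layout can be illustrated without PaddlePaddle. The sketch below is a simplified model rather than the framework's implementation: `outer_output_lengths` is a hypothetical helper that tracks only sub-sequence lengths, assuming (as stated above) that the outputs of the outer recurrent_group follow the layout of the input chosen as the target, here the second feature.

```python
def outer_output_lengths(fea1, fea2):
    """Sub-sequence lengths of the two outputs when fea2 is the target input.

    fea1, fea2: nested sequences with the same number of sub-sequences.
    """
    if len(fea1) != len(fea2):
        raise ValueError("both inputs need the same number of sub-sequences")
    out1, out2 = [], []
    for _sub1, sub2 in zip(fea1, fea2):
        # The last state of encoder1 is expanded to the length of encoder2
        # (expand_layer(..., expand_as = encoder2)), so both outputs take len(sub2).
        out1.append(len(sub2))
        out2.append(len(sub2))
    return out1, out2


# Sample 2 from data2: sub-sequence lengths are 2, 2, 3 for fea1 and 2, 1, 4 for fea2.
fea1 = [[0, 2], [2, 5], [0, 1, 2]]
fea2 = [[1, 5], [4], [2, 3, 6, 1]]
print(outer_output_lengths(fea1, fea2))
```

Under this model both outputs carry the sub-sequence lengths of fea2, regardless of how long the sub-sequences of fea1 are.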
## Example 4: Generation with beam_search
...
...
doc_cn/algorithm/rnn/hierarchical-rnn.html
...
...
@@ -333,7 +333,150 @@ var _hmt = _hmt || [];
</div>
<div
class=
"section"
id=
""
>
<span
id=
"id6"
></span><h2>
Example 3: Two Inputs, Two Outputs, with Unequal-Length Inputs
<a
class=
"headerlink"
href=
"#"
title=
"Permalink to this headline"
>
¶
</a></h2>
<p>
TBD
</p>
<p><strong>
Unequal-length inputs
</strong>
means that the inputs to a recurrent_group may differ in length at each time step; however, one input must be designated whose length the output follows, and it is specified with
<font
color=
"red"
>
targetInlink
</font>
. Reference configurations: single-layer RNN (
<code
class=
"docutils literal"
><span
class=
"pre"
>
sequence_rnn_multi_unequalength_inputs.conf
</span></code>
) and two-layer RNN (
<code
class=
"docutils literal"
><span
class=
"pre"
>
sequence_nest_rnn_multi_unequalength_inputs.conf
</span></code>
).
</p>
<div
class=
"section"
id=
""
>
<span
id=
"id7"
></span><h3>
Reading Two-Layer Sequences
<a
class=
"headerlink"
href=
"#"
title=
"Permalink to this headline"
>
¶
</a></h3>
<p>
Let us look at how single-layer and two-layer sequence data are organized and at the corresponding dataprovider (see
<code
class=
"docutils literal"
><span
class=
"pre"
>
rnn_data_provider.py
</span></code>
).
</p>
<div
class=
"highlight-python"
><div
class=
"highlight"
><pre><span></span><span
class=
"n"
>
data2
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"p"
>
[
</span>
<span
class=
"p"
>
[[[
</span><span
class=
"mi"
>
1
</span><span
class=
"p"
>
,
</span>
<span
class=
"mi"
>
2
</span><span
class=
"p"
>
],
</span>
<span
class=
"p"
>
[
</span><span
class=
"mi"
>
4
</span><span
class=
"p"
>
,
</span>
<span
class=
"mi"
>
5
</span><span
class=
"p"
>
,
</span>
<span
class=
"mi"
>
2
</span><span
class=
"p"
>
]],
</span>
<span
class=
"p"
>
[[
</span><span
class=
"mi"
>
5
</span><span
class=
"p"
>
,
</span>
<span
class=
"mi"
>
4
</span><span
class=
"p"
>
,
</span>
<span
class=
"mi"
>
1
</span><span
class=
"p"
>
],
</span>
<span
class=
"p"
>
[
</span><span
class=
"mi"
>
3
</span><span
class=
"p"
>
,
</span>
<span
class=
"mi"
>
1
</span><span
class=
"p"
>
]]
</span>
<span
class=
"p"
>
,
</span><span
class=
"mi"
>
0
</span><span
class=
"p"
>
],
</span>
<span
class=
"p"
>
[[[
</span><span
class=
"mi"
>
0
</span><span
class=
"p"
>
,
</span>
<span
class=
"mi"
>
2
</span><span
class=
"p"
>
],
</span>
<span
class=
"p"
>
[
</span><span
class=
"mi"
>
2
</span><span
class=
"p"
>
,
</span>
<span
class=
"mi"
>
5
</span><span
class=
"p"
>
],
</span>
<span
class=
"p"
>
[
</span><span
class=
"mi"
>
0
</span><span
class=
"p"
>
,
</span>
<span
class=
"mi"
>
1
</span><span
class=
"p"
>
,
</span>
<span
class=
"mi"
>
2
</span><span
class=
"p"
>
]],[[
</span><span
class=
"mi"
>
1
</span><span
class=
"p"
>
,
</span>
<span
class=
"mi"
>
5
</span><span
class=
"p"
>
],
</span>
<span
class=
"p"
>
[
</span><span
class=
"mi"
>
4
</span><span
class=
"p"
>
],
</span>
<span
class=
"p"
>
[
</span><span
class=
"mi"
>
2
</span><span
class=
"p"
>
,
</span>
<span
class=
"mi"
>
3
</span><span
class=
"p"
>
,
</span>
<span
class=
"mi"
>
6
</span><span
class=
"p"
>
,
</span>
<span
class=
"mi"
>
1
</span><span
class=
"p"
>
]],
</span>
<span
class=
"mi"
>
1
</span><span
class=
"p"
>
],
</span>
<span
class=
"p"
>
]
</span>
<span
class=
"nd"
>
@provider
</span><span
class=
"p"
>
(
</span><span
class=
"n"
>
input_types
</span><span
class=
"o"
>
=
</span><span
class=
"p"
>
[
</span><span
class=
"n"
>
integer_value_sub_sequence
</span><span
class=
"p"
>
(
</span><span
class=
"mi"
>
10
</span><span
class=
"p"
>
),
</span>
<span
class=
"n"
>
integer_value_sub_sequence
</span><span
class=
"p"
>
(
</span><span
class=
"mi"
>
10
</span><span
class=
"p"
>
),
</span>
<span
class=
"n"
>
integer_value
</span><span
class=
"p"
>
(
</span><span
class=
"mi"
>
2
</span><span
class=
"p"
>
)],
</span>
<span
class=
"n"
>
should_shuffle
</span><span
class=
"o"
>
=
</span><span
class=
"bp"
>
False
</span><span
class=
"p"
>
)
</span>
<span
class=
"k"
>
def
</span>
<span
class=
"nf"
>
process_unequalength_subseq
</span><span
class=
"p"
>
(
</span><span
class=
"n"
>
settings
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
file_name
</span><span
class=
"p"
>
):
</span>
<span
class=
"c1"
>
# dataprovider for the two-layer RNN
</span>
<span
class=
"k"
>
for
</span>
<span
class=
"n"
>
d
</span>
<span
class=
"ow"
>
in
</span>
<span
class=
"n"
>
data2
</span><span
class=
"p"
>
:
</span>
<span
class=
"k"
>
yield
</span>
<span
class=
"n"
>
d
</span>
<span
class=
"nd"
>
@provider
</span><span
class=
"p"
>
(
</span><span
class=
"n"
>
input_types
</span><span
class=
"o"
>
=
</span><span
class=
"p"
>
[
</span><span
class=
"n"
>
integer_value_sequence
</span><span
class=
"p"
>
(
</span><span
class=
"mi"
>
10
</span><span
class=
"p"
>
),
</span>
<span
class=
"n"
>
integer_value_sequence
</span><span
class=
"p"
>
(
</span><span
class=
"mi"
>
10
</span><span
class=
"p"
>
),
</span>
<span
class=
"n"
>
integer_value
</span><span
class=
"p"
>
(
</span><span
class=
"mi"
>
2
</span><span
class=
"p"
>
)],
</span>
<span
class=
"n"
>
should_shuffle
</span><span
class=
"o"
>
=
</span><span
class=
"bp"
>
False
</span><span
class=
"p"
>
)
</span>
<span
class=
"k"
>
def
</span>
<span
class=
"nf"
>
process_unequalength_seq
</span><span
class=
"p"
>
(
</span><span
class=
"n"
>
settings
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
file_name
</span><span
class=
"p"
>
):
</span>
<span
class=
"c1"
>
# dataprovider for the single-layer RNN
</span>
<span
class=
"k"
>
for
</span>
<span
class=
"n"
>
d
</span>
<span
class=
"ow"
>
in
</span>
<span
class=
"n"
>
data2
</span><span
class=
"p"
>
:
</span>
<span
class=
"n"
>
words1
</span><span
class=
"o"
>
=
</span><span
class=
"nb"
>
reduce
</span><span
class=
"p"
>
(
</span><span
class=
"k"
>
lambda
</span>
<span
class=
"n"
>
x
</span><span
class=
"p"
>
,
</span><span
class=
"n"
>
y
</span><span
class=
"p"
>
:
</span>
<span
class=
"n"
>
x
</span><span
class=
"o"
>
+
</span><span
class=
"n"
>
y
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
d
</span><span
class=
"p"
>
[
</span><span
class=
"mi"
>
0
</span><span
class=
"p"
>
])
</span>
<span
class=
"n"
>
words2
</span><span
class=
"o"
>
=
</span><span
class=
"nb"
>
reduce
</span><span
class=
"p"
>
(
</span><span
class=
"k"
>
lambda
</span>
<span
class=
"n"
>
x
</span><span
class=
"p"
>
,
</span><span
class=
"n"
>
y
</span><span
class=
"p"
>
:
</span>
<span
class=
"n"
>
x
</span><span
class=
"o"
>
+
</span><span
class=
"n"
>
y
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
d
</span><span
class=
"p"
>
[
</span><span
class=
"mi"
>
1
</span><span
class=
"p"
>
])
</span>
<span
class=
"k"
>
yield
</span>
<span
class=
"n"
>
words1
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
words2
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
d
</span><span
class=
"p"
>
[
</span><span
class=
"mi"
>
2
</span><span
class=
"p"
>
]
</span>
</pre></div>
</div>
<p>
data2 contains two samples, each with two features, denoted fea1 and fea2.
</p>
<ul
class=
"simple"
>
<li>
Single-layer sequences: the two samples are [[1, 2, 4, 5, 2], [5, 4, 1, 3, 1]] and [[0, 2, 2, 5, 0, 1, 2], [1, 5, 4, 2, 3, 6, 1]].
</li>
<li>
Two-layer sequences: the two samples are
<ul>
<li><strong>
Sample 1
</strong>
: [[[1, 2], [4, 5, 2]], [[5, 4, 1], [3, 1]]]. fea1 and fea2 each have 2 sub-sequences: fea1=[[1, 2], [4, 5, 2]], fea2=[[5, 4, 1], [3, 1]].
</li>
<li><strong>
Sample 2
</strong>
: [[[0, 2], [2, 5], [0, 1, 2]],[[1, 5], [4], [2, 3, 6, 1]]]. fea1 and fea2 each have 3 sub-sequences: fea1=[[0, 2], [2, 5], [0, 1, 2]], fea2=[[1, 5], [4], [2, 3, 6, 1]].
<br/></li>
<li><strong>
Note
</strong>
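: within each sample, all features must have the same number of sub-sequences. "Two inputs, two outputs, with unequal-length inputs" here means that the length of fea1's input at time step i may differ from the length of fea2's input at time step i. For example, in sample 1 at time step i=2 (counting from 1), fea1[2]=[4, 5, 2] and fea2[2]=[3, 1], and 3≠2.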
</li>
</ul>
</li>
<li>
In both the single-layer and the two-layer case, the labels of the two samples are 0 and 1, respectively.
</li>
</ul>
</div>
<div
class=
"section"
id=
""
>
<span
id=
"id8"
></span><h3>
Model Configuration
<a
class=
"headerlink"
href=
"#"
title=
"Permalink to this headline"
>
¶
</a></h3>
<p>
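The two model configurations, single-layer RNN (
<code
class=
"docutils literal"
><span
class=
"pre"
>
sequence_rnn_multi_unequalength_inputs.conf
</span></code>
) and two-layer RNN (
<code
class=
"docutils literal"
><span
class=
"pre"
>
sequence_nest_rnn_multi_unequalength_inputs.conf
</span></code>
), produce exactly the same result; they differ only in whether the input is a single-layer or a two-layer sequence. Let us see how each is implemented internally.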
</p>
<ul
class=
"simple"
>
<li>
Single-layer sequence:
<ul>
<li>
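The input goes through a simple recurrent_group. At each time step, the current input y and the previous step's output rnn_state pass through a fully connected layer. Functionally this is identical to Example 2's
<code
class=
"docutils literal"
><span
class=
"pre"
>
sequence_rnn.conf
</span></code>
, specifically its
<code
class=
"docutils literal"
><span
class=
"pre"
>
step
</span></code>
function. Here the two inputs x1 and x2 are each passed through calrnn, which returns the state at the latest time step. The resulting encoder1_rep and encoder2_rep are both single-layer sequences; finally, the last time step of encoder1_rep is added to every time step of encoder2_rep to obtain context.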
</li>
<li>
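Note that in every sample fed to this recurrent_group, fea1 and fea2 have the same length. This is no coincidence: when its inputs are single-layer sequences, recurrent_group requires all inputs to have equal length.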
</li>
</ul>
</li>
</ul>
<div
class=
"highlight-python"
><div
class=
"highlight"
><pre><span></span><span
class=
"k"
>
def
</span>
<span
class=
"nf"
>
step
</span><span
class=
"p"
>
(
</span><span
class=
"n"
>
x1
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
x2
</span><span
class=
"p"
>
):
</span>
<span
class=
"k"
>
def
</span>
<span
class=
"nf"
>
calrnn
</span><span
class=
"p"
>
(
</span><span
class=
"n"
>
y
</span><span
class=
"p"
>
):
</span>
<span
class=
"n"
>
mem
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
memory
</span><span
class=
"p"
>
(
</span><span
class=
"n"
>
name
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"s1"
>
'
rnn_state_
'
</span>
<span
class=
"o"
>
+
</span>
<span
class=
"n"
>
y
</span><span
class=
"o"
>
.
</span><span
class=
"n"
>
name
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
size
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
hidden_dim
</span><span
class=
"p"
>
)
</span>
<span
class=
"n"
>
out
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
fc_layer
</span><span
class=
"p"
>
(
</span><span
class=
"nb"
>
input
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"p"
>
[
</span><span
class=
"n"
>
y
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
mem
</span><span
class=
"p"
>
],
</span>
<span
class=
"n"
>
size
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
hidden_dim
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
act
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
TanhActivation
</span><span
class=
"p"
>
(),
</span>
<span
class=
"n"
>
bias_attr
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"bp"
>
True
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
name
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"s1"
>
'
rnn_state_
'
</span>
<span
class=
"o"
>
+
</span>
<span
class=
"n"
>
y
</span><span
class=
"o"
>
.
</span><span
class=
"n"
>
name
</span><span
class=
"p"
>
)
</span>
<span
class=
"k"
>
return
</span>
<span
class=
"n"
>
out
</span>
<span
class=
"n"
>
encoder1
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
calrnn
</span><span
class=
"p"
>
(
</span><span
class=
"n"
>
x1
</span><span
class=
"p"
>
)
</span>
<span
class=
"n"
>
encoder2
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
calrnn
</span><span
class=
"p"
>
(
</span><span
class=
"n"
>
x2
</span><span
class=
"p"
>
)
</span>
<span
class=
"k"
>
return
</span>
<span
class=
"p"
>
[
</span><span
class=
"n"
>
encoder1
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
encoder2
</span><span
class=
"p"
>
]
</span>
<span
class=
"n"
>
encoder1_rep
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
encoder2_rep
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
recurrent_group
</span><span
class=
"p"
>
(
</span>
<span
class=
"n"
>
name
</span><span
class=
"o"
>
=
</span><span
class=
"s2"
>
"
stepout
"
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
step
</span><span
class=
"o"
>
=
</span><span
class=
"n"
>
step
</span><span
class=
"p"
>
,
</span>
<span
class=
"nb"
>
input
</span><span
class=
"o"
>
=
</span><span
class=
"p"
>
[
</span><span
class=
"n"
>
emb1
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
emb2
</span><span
class=
"p"
>
])
</span>
<span
class=
"n"
>
encoder1_last
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
last_seq
</span><span
class=
"p"
>
(
</span><span
class=
"nb"
>
input
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
encoder1_rep
</span><span
class=
"p"
>
)
</span>
<span
class=
"n"
>
encoder1_expandlast
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
expand_layer
</span><span
class=
"p"
>
(
</span><span
class=
"nb"
>
input
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
encoder1_last
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
expand_as
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
encoder2_rep
</span><span
class=
"p"
>
)
</span>
<span
class=
"n"
>
context
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
mixed_layer
</span><span
class=
"p"
>
(
</span><span
class=
"nb"
>
input
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"p"
>
[
</span><span
class=
"n"
>
identity_projection
</span><span
class=
"p"
>
(
</span><span
class=
"n"
>
encoder1_expandlast
</span><span
class=
"p"
>
),
</span>
<span
class=
"n"
>
identity_projection
</span><span
class=
"p"
>
(
</span><span
class=
"n"
>
encoder2_rep
</span><span
class=
"p"
>
)],
</span>
<span
class=
"n"
>
size
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
hidden_dim
</span><span
class=
"p"
>
)
</span>
</pre></div>
</div>
<ul
class=
"simple"
>
<li>
Two-layer sequence:
<ul>
<li>
In the two-layer RNN, each of the two input features goes through a chain of fully connected layers over time (
<code
class=
"docutils literal"
><span
class=
"pre"
>
inner_step1
</span></code>
and
<code
class=
"docutils literal"
><span
class=
"pre"
>
inner_step2
</span></code>
handle fea1 and fea2 respectively). Functionally this is identical to Example 2's
<code
class=
"docutils literal"
><span
class=
"pre"
>
sequence_nest_rnn.conf
</span></code>
, specifically its
<code
class=
"docutils literal"
><span
class=
"pre"
>
outer_step
</span></code>
function. The difference is that the inputs
<code
class=
"docutils literal"
><span
class=
"pre"
>
[SubsequenceInput(emb1),
</span>
<span
class=
"pre"
>
SubsequenceInput(emb2)]
</span></code>
are no longer of equal length at each time step.
</li>
<li>
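The
<code
class=
"docutils literal"
><span
class=
"pre"
>
outer_step
</span></code>
function can process the two features separately, but the output layout of the recurrent_group (the length of each sub-sequence) can follow only one of them;
<font
color=
red
>
targetInlink
</font>
specifies which one, and here it is set to follow emb2.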
</li>
<li>
Finally, as before, the last time step of encoder1_rep is added to every time step of encoder2_rep to obtain context.
</li>
</ul>
</li>
</ul>
<div
class=
"highlight-python"
><div
class=
"highlight"
><pre><span></span><span
class=
"k"
>
def
</span>
<span
class=
"nf"
>
outer_step
</span><span
class=
"p"
>
(
</span><span
class=
"n"
>
x1
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
x2
</span><span
class=
"p"
>
):
</span>
<span
class=
"n"
>
outer_mem1
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
memory
</span><span
class=
"p"
>
(
</span><span
class=
"n"
>
name
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"s2"
>
"
outer_rnn_state1
"
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
size
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
hidden_dim
</span><span
class=
"p"
>
)
</span>
<span
class=
"n"
>
outer_mem2
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
memory
</span><span
class=
"p"
>
(
</span><span
class=
"n"
>
name
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"s2"
>
"
outer_rnn_state2
"
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
size
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
hidden_dim
</span><span
class=
"p"
>
)
</span>
<span
class=
"k"
>
def
</span>
<span
class=
"nf"
>
inner_step1
</span><span
class=
"p"
>
(
</span><span
class=
"n"
>
y
</span><span
class=
"p"
>
):
</span>
<span
class=
"n"
>
inner_mem
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
memory
</span><span
class=
"p"
>
(
</span><span
class=
"n"
>
name
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"s1"
>
'
inner_rnn_state_
'
</span>
<span
class=
"o"
>
+
</span>
<span
class=
"n"
>
y
</span><span
class=
"o"
>
.
</span><span
class=
"n"
>
name
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
size
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
hidden_dim
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
boot_layer
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
outer_mem1
</span><span
class=
"p"
>
)
</span>
<span
class=
"n"
>
out
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
fc_layer
</span><span
class=
"p"
>
(
</span><span
class=
"nb"
>
input
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"p"
>
[
</span><span
class=
"n"
>
y
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
inner_mem
</span><span
class=
"p"
>
],
</span>
<span
class=
"n"
>
size
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
hidden_dim
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
act
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
TanhActivation
</span><span
class=
"p"
>
(),
</span>
<span
class=
"n"
>
bias_attr
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"bp"
>
True
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
name
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"s1"
>
'
inner_rnn_state_
'
</span>
<span
class=
"o"
>
+
</span>
<span
class=
"n"
>
y
</span><span
class=
"o"
>
.
</span><span
class=
"n"
>
name
</span><span
class=
"p"
>
)
</span>
<span
class=
"k"
>
return
</span>
<span
class=
"n"
>
out
</span>
<span
class=
"k"
>
def
</span>
<span
class=
"nf"
>
inner_step2
</span><span
class=
"p"
>
(
</span><span
class=
"n"
>
y
</span><span
class=
"p"
>
):
</span>
<span
class=
"n"
>
inner_mem
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
memory
</span><span
class=
"p"
>
(
</span><span
class=
"n"
>
name
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"s1"
>
'
inner_rnn_state_
'
</span>
<span
class=
"o"
>
+
</span>
<span
class=
"n"
>
y
</span><span
class=
"o"
>
.
</span><span
class=
"n"
>
name
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
size
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
hidden_dim
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
boot_layer
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
outer_mem2
</span><span
class=
"p"
>
)
</span>
<span
class=
"n"
>
out
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
fc_layer
</span><span
class=
"p"
>
(
</span><span
class=
"nb"
>
input
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"p"
>
[
</span><span
class=
"n"
>
y
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
inner_mem
</span><span
class=
"p"
>
],
</span>
<span
class=
"n"
>
size
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
hidden_dim
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
act
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
TanhActivation
</span><span
class=
"p"
>
(),
</span>
<span
class=
"n"
>
bias_attr
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"bp"
>
True
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
name
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"s1"
>
'
inner_rnn_state_
'
</span>
<span
class=
"o"
>
+
</span>
<span
class=
"n"
>
y
</span><span
class=
"o"
>
.
</span><span
class=
"n"
>
name
</span><span
class=
"p"
>
)
</span>
<span
class=
"k"
>
return
</span>
<span
class=
"n"
>
out
</span>
<span
class=
"n"
>
encoder1
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
recurrent_group
</span><span
class=
"p"
>
(
</span>
<span
class=
"n"
>
step
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
inner_step1
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
name
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"s1"
>
'
inner1
'
</span><span
class=
"p"
>
,
</span>
<span
class=
"nb"
>
input
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
x1
</span><span
class=
"p"
>
)
</span>
<span
class=
"n"
>
encoder2
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
recurrent_group
</span><span
class=
"p"
>
(
</span>
<span
class=
"n"
>
step
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
inner_step2
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
name
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"s1"
>
'
inner2
'
</span><span
class=
"p"
>
,
</span>
<span
class=
"nb"
>
input
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
x2
</span><span
class=
"p"
>
)
</span>
<span
class=
"n"
>
sentence_last_state1
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
last_seq
</span><span
class=
"p"
>
(
</span><span
class=
"nb"
>
input
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
encoder1
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
name
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"s1"
>
'
outer_rnn_state1
'
</span><span
class=
"p"
>
)
</span>
<span
class=
"n"
>
sentence_last_state2_
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
last_seq
</span><span
class=
"p"
>
(
</span><span
class=
"nb"
>
input
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
encoder2
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
name
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"s1"
>
'
outer_rnn_state2
'
</span><span
class=
"p"
>
)
</span>
<span
class=
"n"
>
encoder1_expand
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
expand_layer
</span><span
class=
"p"
>
(
</span><span
class=
"nb"
>
input
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
sentence_last_state1
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
expand_as
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
encoder2
</span><span
class=
"p"
>
)
</span>
<span
class=
"k"
>
return
</span>
<span
class=
"p"
>
[
</span><span
class=
"n"
>
encoder1_expand
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
encoder2
</span><span
class=
"p"
>
]
</span>
<span
class=
"n"
>
encoder1_rep
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
encoder2_rep
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
recurrent_group
</span><span
class=
"p"
>
(
</span>
<span
class=
"n"
>
name
</span><span
class=
"o"
>
=
</span><span
class=
"s2"
>
"
outer
"
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
step
</span><span
class=
"o"
>
=
</span><span
class=
"n"
>
outer_step
</span><span
class=
"p"
>
,
</span>
<span
class=
"nb"
>
input
</span><span
class=
"o"
>
=
</span><span
class=
"p"
>
[
</span><span
class=
"n"
>
SubsequenceInput
</span><span
class=
"p"
>
(
</span><span
class=
"n"
>
emb1
</span><span
class=
"p"
>
),
</span>
<span
class=
"n"
>
SubsequenceInput
</span><span
class=
"p"
>
(
</span><span
class=
"n"
>
emb2
</span><span
class=
"p"
>
)],
</span>
<span
class=
"n"
>
targetInlink
</span><span
class=
"o"
>
=
</span><span
class=
"n"
>
emb2
</span><span
class=
"p"
>
)
</span>
<span
class=
"n"
>
encoder1_last
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
last_seq
</span><span
class=
"p"
>
(
</span><span
class=
"nb"
>
input
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
encoder1_rep
</span><span
class=
"p"
>
)
</span>
<span
class=
"n"
>
encoder1_expandlast
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
expand_layer
</span><span
class=
"p"
>
(
</span><span
class=
"nb"
>
input
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
encoder1_last
</span><span
class=
"p"
>
,
</span>
<span
class=
"n"
>
expand_as
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
encoder2_rep
</span><span
class=
"p"
>
)
</span>
<span
class=
"n"
>
context
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
mixed_layer
</span><span
class=
"p"
>
(
</span><span
class=
"nb"
>
input
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"p"
>
[
</span><span
class=
"n"
>
identity_projection
</span><span
class=
"p"
>
(
</span><span
class=
"n"
>
encoder1_expandlast
</span><span
class=
"p"
>
),
</span>
<span
class=
"n"
>
identity_projection
</span><span
class=
"p"
>
(
</span><span
class=
"n"
>
encoder2_rep
</span><span
class=
"p"
>
)],
</span>
<span
class=
"n"
>
size
</span>
<span
class=
"o"
>
=
</span>
<span
class=
"n"
>
hidden_dim
</span><span
class=
"p"
>
)
</span>
</pre></div>
</div>
</div>
</div>
<div
class=
"section"
id=
"beam-search"
>
<span
id=
"beam-search"
></span><h2>
Example 4: Generation with beam_search
<a
class=
"headerlink"
href=
"#beam-search"
title=
"Permalink to this headline"
>
¶
</a></h2>
...
...
@@ -360,7 +503,11 @@ var _hmt = _hmt || [];
<li><a
class=
"reference internal"
href=
"#"
>
Model Configuration
</a></li>
</ul>
</li>
<li><a
class=
"reference internal"
href=
"#"
>
Example 3: Two Inputs, Two Outputs, with Unequal-Length Inputs
</a></li>
<li><a
class=
"reference internal"
href=
"#"
>
Example 3: Two Inputs, Two Outputs, with Unequal-Length Inputs
</a><ul>
<li><a
class=
"reference internal"
href=
"#"
>
Reading Two-Layer Sequences
</a></li>
<li><a
class=
"reference internal"
href=
"#"
>
Model Configuration
</a></li>
</ul>
</li>
<li><a
class=
"reference internal"
href=
"#beam-search"
>
Example 4: Generation with beam_search
</a></li>
</ul>
</li>
...
...
doc_cn/searchindex.js
The source diff is too large to display; view the blob instead.