Commit 9cc50983
Authored Mar 16, 2017 by Eugene Brevdo
Committed by TensorFlower Gardener on Mar 16, 2017
Initial cut of documentation for tf.contrib.seq2seq
Change: 150400474
Parent: 5135546b
Showing 4 changed files with 114 additions and 21 deletions (+114 −21)
tensorflow/contrib/seq2seq/README.md  +0 −9
tensorflow/contrib/seq2seq/__init__.py  +4 −10
tensorflow/contrib/seq2seq/python/ops/dynamic_attention_wrapper.py  +2 −2
tensorflow/docs_src/api_guides/python/contrib.seq2seq.md  +108 −0
tensorflow/contrib/seq2seq/README.md
deleted 100644 → 0
# TensorFlow contrib seq2seq layers and losses
## Layers
Information to be added.
## Losses
Information to be added.
tensorflow/contrib/seq2seq/__init__.py
...
...
@@ -15,15 +15,14 @@
"""Ops for building neural network seq2seq decoders and losses.
## Decoder base class and functions
See the @{$python/contrib.seq2seq} guide.
@@Decoder
@@dynamic_decode
## Basic Decoder
@@BasicDecoderOutput
@@BasicDecoder
## Decoder Helpers
@@Helper
@@CustomHelper
@@GreedyEmbeddingHelper
...
...
@@ -31,16 +30,11 @@
@@ScheduledOutputTrainingHelper
@@TrainingHelper
## Attention
### Scorers
@@BahdanauScorer
@@LuongScorer
@@BahdanauAttention
@@LuongAttention
### Helper functions
@@hardmax
### RNNCells
@@DynamicAttentionWrapperState
@@DynamicAttentionWrapper
"""
...
...
tensorflow/contrib/seq2seq/python/ops/dynamic_attention_wrapper.py
...
...
@@ -425,8 +425,8 @@ class DynamicAttentionWrapper(core_rnn_cell.RNNCell):
  cell_input_fn: (optional) A `callable`. The default is:
    `lambda inputs, attention: array_ops.concat([inputs, attention], -1)`.
  probability_fn: (optional) A `callable`. Converts the score to
-   probabilities. The default is `tf.nn.softmax`. Other options include
-   `tf.contrib.seq2seq.hardmax` and `tf.contrib.sparsemax.sparsemax`.
+   probabilities. The default is @{tf.nn.softmax}. Other options include
+   @{tf.contrib.seq2seq.hardmax} and @{tf.contrib.sparsemax.sparsemax}.
output_attention: Python bool. If `True` (default), the output at each
time step is the attention value. This is the behavior of Luong-style
attention mechanisms. If `False`, the output at each time step is
...
...
tensorflow/docs_src/api_guides/python/contrib.seq2seq.md
new file 0 → 100644
# Seq2seq Library (contrib)
[TOC]
Module for constructing seq2seq models and dynamic decoding. Builds on top of
libraries in @{tf.contrib.rnn}.
This library is composed of two primary components:
*   New attention wrappers for @{tf.contrib.rnn.RNNCell} objects.
*   A new object-oriented dynamic decoding framework.
## Attention
Attention wrappers are `RNNCell` objects that wrap other `RNNCell` objects and
implement attention. The form of attention is determined by a subclass of
@{tf.contrib.seq2seq.AttentionMechanism}. These subclasses describe the form
of attention (e.g. additive vs. multiplicative) to use when creating the
wrapper. An instance of an `AttentionMechanism` is constructed with a `memory`
tensor, from which lookup keys and values tensors are created.
### Attention Mechanisms
The two basic attention mechanisms are:
*   @{tf.contrib.seq2seq.BahdanauAttention} (additive attention,
    [ref.](https://arxiv.org/abs/1409.0473))
*   @{tf.contrib.seq2seq.LuongAttention} (multiplicative attention,
    [ref.](https://arxiv.org/abs/1508.04025))
The `memory` tensor passed to the attention mechanism's constructor is expected
to be shaped `[batch_size, memory_max_time, memory_depth]`; and often an
additional `memory_sequence_length` vector is accepted. If provided, the
`memory` tensors' rows are masked with zeros past their true sequence lengths.
Attention mechanisms also have a concept of depth, usually determined as a
construction parameter `num_units`. For some kinds of attention (like
`BahdanauAttention`), both queries and memory are projected to tensors of
depth `num_units`. For other kinds (like `LuongAttention`), `num_units` should
match the depth of the queries; and the `memory` tensor will be projected to
this depth.
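For example, a Luong-style mechanism might be constructed as follows. This is
a minimal sketch: `encoder_outputs` and `source_lengths` are hypothetical
tensors produced by an encoder, not names from this library.

```python
# A minimal sketch; `encoder_outputs` (shaped
# [batch_size, memory_max_time, memory_depth]) and `source_lengths`
# (a [batch_size] vector of true lengths) are hypothetical tensors.
attention_mechanism = tf.contrib.seq2seq.LuongAttention(
    num_units=512,                  # for Luong-style attention, match the query depth
    memory=encoder_outputs,         # lookup keys and values are created from this
    memory_sequence_length=source_lengths)  # mask rows past the true lengths
```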
### Attention Wrappers
The basic attention wrapper is @{tf.contrib.seq2seq.DynamicAttentionWrapper}.
This wrapper accepts an `RNNCell` instance, an instance of
`AttentionMechanism`, and an attention depth parameter (`attention_size`); as
well as several optional arguments that allow one to customize intermediate
calculations.
At each time step, the basic calculation performed by this wrapper is:
```python
cell_inputs = concat([inputs, prev_state.attention], -1)
cell_output, next_cell_state = cell(cell_inputs, prev_state.cell_state)
score = attention_mechanism(cell_output)
alignments = softmax(score)
context = matmul(alignments, attention_mechanism.values)
attention = tf.layers.Dense(attention_size)(concat([cell_output, context], 1))
next_state = DynamicAttentionWrapperState(
    cell_state=next_cell_state,
    attention=attention)
output = attention
return output, next_state
```
In practice, a number of the intermediate calculations are configurable.
For example, the initial concatenation of `inputs` and `prev_state.attention`
can be replaced with another mixing function. The function `softmax` can
be replaced with alternative options when calculating `alignments` from the
`score`. Finally, the outputs returned by the wrapper can be configured to
be the value `cell_output` instead of `attention`.
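These customizations map onto constructor arguments of the wrapper; the
docstring diff above names `cell_input_fn`, `probability_fn`, and
`output_attention`. A minimal sketch, assuming `cell` and
`attention_mechanism` have already been built:

```python
# A sketch of overriding the configurable pieces; `cell` and
# `attention_mechanism` are assumed to exist already.
attn_cell = tf.contrib.seq2seq.DynamicAttentionWrapper(
    cell,
    attention_mechanism,
    attention_size=256,
    # Swap softmax for hardmax when turning scores into alignments.
    probability_fn=tf.contrib.seq2seq.hardmax,
    # Emit the raw cell output at each step instead of the attention value.
    output_attention=False)
```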
The benefit of using a `DynamicAttentionWrapper` is that it plays nicely with
other wrappers and the dynamic decoder described below. For example, one can
write:
```python
cell = tf.contrib.rnn.DeviceWrapper(LSTMCell(512), "/gpu:0")
attention_mechanism = tf.contrib.seq2seq.LuongAttention(512, encoder_outputs)
attn_cell = tf.contrib.seq2seq.DynamicAttentionWrapper(
    cell, attention_mechanism, attention_size=256)
attn_cell = tf.contrib.rnn.DeviceWrapper(attn_cell, "/gpu:1")
top_cell = tf.contrib.rnn.DeviceWrapper(LSTMCell(512), "/gpu:1")
multi_cell = MultiRNNCell([attn_cell, top_cell])
```
The `multi_cell` will perform the bottom layer calculations on GPU 0;
attention calculations will be performed on GPU 1 and immediately passed
up to the top layer which is also calculated on GPU 1. The attention is
also passed forward in time to the next time step and copied to GPU 0 for the
next time step of `cell`. (*Note*: This is just an example of use,
not a suggested device partitioning strategy.)
## Dynamic Decoding
### Decoder base class and functions
*   @{tf.contrib.seq2seq.Decoder}
*   @{tf.contrib.seq2seq.dynamic_decode}

### Basic Decoder

*   @{tf.contrib.seq2seq.BasicDecoderOutput}
*   @{tf.contrib.seq2seq.BasicDecoder}

### Decoder Helpers

*   @{tf.contrib.seq2seq.Helper}
*   @{tf.contrib.seq2seq.CustomHelper}
*   @{tf.contrib.seq2seq.GreedyEmbeddingHelper}
*   @{tf.contrib.seq2seq.ScheduledEmbeddingTrainingHelper}
*   @{tf.contrib.seq2seq.ScheduledOutputTrainingHelper}
*   @{tf.contrib.seq2seq.TrainingHelper}
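Putting these pieces together, a minimal training-time decoding graph might be
assembled as follows. This is only a sketch: `decoder_cell`, `decoder_inputs`,
`sequence_lengths`, and `batch_size` are hypothetical placeholders, and the
exact return signature of `dynamic_decode` varied across early releases.

```python
# A minimal sketch of dynamic decoding at training time. `decoder_inputs` is
# shaped [batch_size, max_time, input_depth] and `sequence_lengths` is a
# [batch_size] vector; both are hypothetical placeholders.
helper = tf.contrib.seq2seq.TrainingHelper(decoder_inputs, sequence_lengths)
decoder = tf.contrib.seq2seq.BasicDecoder(
    decoder_cell, helper,
    initial_state=decoder_cell.zero_state(batch_size, tf.float32))
outputs, final_state = tf.contrib.seq2seq.dynamic_decode(decoder)
# `outputs` is a `BasicDecoderOutput` with fields `rnn_output` and `sample_id`.
```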