PaddlePaddle / DeepSpeech
Commit e8184927, authored Jul 08, 2022 by Hui Zhang

att cache for streaming asr

Parent: 5ca05fea
Showing 5 changed files with 41 additions and 30 deletions.
Changed files:
- demos/streaming_asr_server/local/rtf_from_log.py (+1, -1)
- paddlespeech/s2t/models/u2_st/u2_st.py (+30, -17)
- paddlespeech/s2t/modules/attention.py (+3, -4)
- paddlespeech/s2t/modules/encoder.py (+2, -1)
- paddlespeech/server/engine/asr/online/python/asr_engine.py (+5, -7)
demos/streaming_asr_server/local/rtf_from_log.py

```diff
@@ -38,4 +38,4 @@ if __name__ == '__main__':
         T += m['T']
         P += m['P']
-    print(f"RTF: {P / T}")
+    print(f"RTF: {P / T}, utts: {n}")
```
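The patched script accumulates total audio duration `T` and total processing time `P` across log records and reports their ratio as the real-time factor; the change additionally reports the utterance count. A minimal sketch of the same accumulation (the record format here is hypothetical; the real script parses server log lines):

```python
# Sketch of the RTF accumulation in rtf_from_log.py; the dict-based record
# format is an assumption for illustration, not the script's real log parser.
def rtf_from_records(records):
    """Return (rtf, utts): real-time factor and utterance count."""
    T = 0.0  # total audio duration (seconds)
    P = 0.0  # total processing time (seconds)
    n = 0    # number of utterances
    for m in records:
        T += m['T']
        P += m['P']
        n += 1
    return P / T, n

rtf, n = rtf_from_records([{'T': 10.0, 'P': 2.0}, {'T': 30.0, 'P': 6.0}])
print(f"RTF: {rtf}, utts: {n}")  # RTF: 0.2, utts: 2
```

An RTF below 1.0 means the server processes audio faster than real time.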
paddlespeech/s2t/models/u2_st/u2_st.py

```diff
@@ -401,29 +401,42 @@ class U2STBaseModel(nn.Layer):
             xs: paddle.Tensor,
             offset: int,
             required_cache_size: int,
-            subsampling_cache: Optional[paddle.Tensor]=None,
-            elayers_output_cache: Optional[List[paddle.Tensor]]=None,
-            conformer_cnn_cache: Optional[List[paddle.Tensor]]=None,
-    ) -> Tuple[paddle.Tensor, paddle.Tensor, List[paddle.Tensor],
-               List[paddle.Tensor]]:
+            att_cache: paddle.Tensor=paddle.zeros([0, 0, 0, 0]),
+            cnn_cache: paddle.Tensor=paddle.zeros([0, 0, 0, 0]),
+    ) -> Tuple[paddle.Tensor, paddle.Tensor, paddle.Tensor]:
         """ Export interface for c++ call, give input chunk xs, and return
             output from time 0 to current chunk.
         Args:
-            xs (paddle.Tensor): chunk input
-            subsampling_cache (Optional[paddle.Tensor]): subsampling cache
-            elayers_output_cache (Optional[List[paddle.Tensor]]):
-                transformer/conformer encoder layers output cache
-            conformer_cnn_cache (Optional[List[paddle.Tensor]]): conformer
-                cnn cache
+            xs (paddle.Tensor): chunk input, with shape (b=1, time, mel-dim),
+                where `time == (chunk_size - 1) * subsample_rate + \
+                        subsample.right_context + 1`
+            offset (int): current offset in encoder output time stamp
+            required_cache_size (int): cache size required for next chunk
+                computation
+                >=0: actual cache size
+                <0: means all history cache is required
+            att_cache (paddle.Tensor): cache tensor for KEY & VALUE in
+                transformer/conformer attention, with shape
+                (elayers, head, cache_t1, d_k * 2), where
+                `head * d_k == hidden-dim` and
+                `cache_t1 == chunk_size * num_decoding_left_chunks`.
+                `d_k * 2` for att key & value.
+            cnn_cache (paddle.Tensor): cache tensor for cnn_module in conformer,
+                (elayers, b=1, hidden-dim, cache_t2), where
+                `cache_t2 == cnn.lorder - 1`
         Returns:
-            paddle.Tensor: output, it ranges from time 0 to current chunk.
-            paddle.Tensor: subsampling cache
-            List[paddle.Tensor]: attention cache
-            List[paddle.Tensor]: conformer cnn cache
+            paddle.Tensor: output of current input xs,
+                with shape (b=1, chunk_size, hidden-dim).
+            paddle.Tensor: new attention cache required for next chunk, with
+                dynamic shape (elayers, head, T(?), d_k * 2)
+                depending on required_cache_size.
+            paddle.Tensor: new conformer cnn cache required for next chunk, with
+                same shape as the original cnn_cache.
         """
-        return self.encoder.forward_chunk(
-            xs, offset, required_cache_size, subsampling_cache,
-            elayers_output_cache, conformer_cnn_cache)
+        return self.encoder.forward_chunk(xs, offset, required_cache_size,
+                                          att_cache, cnn_cache)

     # @jit.to_static
     def ctc_activation(self, xs: paddle.Tensor) -> paddle.Tensor:
         ...
```
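The new interface replaces three heterogeneous Optional caches with two fixed-rank tensors, where an empty `[0, 0, 0, 0]` tensor means "no history yet". A sketch of the `att_cache` shape contract described in the docstring above, using numpy in place of paddle; all dimension values are illustrative, not PaddleSpeech defaults:

```python
import numpy as np

# Shape contract for att_cache: (elayers, head, cache_t1, d_k * 2).
# The numbers below are illustrative assumptions for the sketch.
elayers, head, d_k = 12, 4, 64        # head * d_k == hidden-dim (256 here)
chunk_size, left_chunks = 16, 4
cache_t1 = chunk_size * left_chunks   # attention history frames kept per layer

att_cache = np.zeros((0, 0, 0, 0))    # empty cache: first chunk, no history
assert att_cache.shape[0] == 0        # dim-0 == 0 signals "no cache"

# After a chunk, the encoder returns a refreshed cache for the next call;
# the last dim stores key and value halves side by side (d_k * 2).
new_cache = np.zeros((elayers, head, cache_t1, d_k * 2))
key, value = np.split(new_cache, 2, axis=-1)
print(key.shape)  # (12, 4, 64, 64)
```

Packing all layers into one tensor (rather than a Python list per layer) is what makes the interface exportable for the C++ runtime, since the signature now contains only plain tensors.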
paddlespeech/s2t/modules/attention.py

```diff
@@ -181,8 +181,7 @@ class MultiHeadedAttention(nn.Layer):
         # >>> torch.equal(d[0], d[1]) # True
         if paddle.shape(cache)[0] > 0:
             # last dim `d_k * 2` for (key, val)
-            key_cache, value_cache = paddle.split(
-                cache, paddle.shape(cache)[-1] // 2, axis=-1)
+            key_cache, value_cache = paddle.split(cache, 2, axis=-1)
             k = paddle.concat([key_cache, k], axis=2)
             v = paddle.concat([value_cache, v], axis=2)
         # We do cache slicing in encoder.forward_chunk, since it's
@@ -289,8 +288,8 @@ class RelPositionMultiHeadedAttention(MultiHeadedAttention):
         # >>> d = torch.split(a, 2, dim=-1)
         # >>> torch.equal(d[0], d[1]) # True
         if paddle.shape(cache)[0] > 0:
-            key_cache, value_cache = paddle.split(
-                cache, paddle.shape(cache)[-1] // 2, axis=-1)
+            # last dim `d_k * 2` for (key, val)
+            key_cache, value_cache = paddle.split(cache, 2, axis=-1)
             k = paddle.concat([key_cache, k], axis=2)
             v = paddle.concat([value_cache, v], axis=2)
         # We do cache slicing in encoder.forward_chunk, since it's
```
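Both hunks change `paddle.split(cache, paddle.shape(cache)[-1] // 2, axis=-1)` to `paddle.split(cache, 2, axis=-1)`. The likely reason: in `paddle.split` (as in `numpy.split`) an integer second argument is the NUMBER of equal sections, while in `torch.split` it is the SIZE of each section, so a torch-style call ported literally produces the wrong pieces. A numpy demonstration of the difference (d_k is an illustrative value):

```python
import numpy as np

d_k = 4
# Cache with last dim d_k * 2: cached keys and values concatenated.
cache = np.arange(1 * 2 * 3 * (d_k * 2)).reshape(1, 2, 3, d_k * 2)

# paddle/numpy semantics: 2 means "two equal sections", giving the
# key half and the value half, each of width d_k.
key_cache, value_cache = np.split(cache, 2, axis=-1)

# The torch-style literal (shape[-1] // 2 == d_k sections) instead yields
# d_k arrays of width 2; unpacking that into two names raises ValueError.
pieces = np.split(cache, d_k, axis=-1)
print(len(pieces), pieces[0].shape[-1])  # 4 2
```

So the one-line fix restores the intended key/value split under paddle's "number of sections" semantics.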
paddlespeech/s2t/modules/encoder.py

```diff
@@ -230,7 +230,8 @@ class BaseEncoder(nn.Layer):
         xs, pos_emb, _ = self.embed(xs, tmp_masks, offset=offset)
         # after embed, xs=(B=1, chunk_size, hidden-dim)
-        elayers, cache_t1 = paddle.shape(att_cache)[0], paddle.shape(att_cache)[2]
+        elayers = paddle.shape(att_cache)[0]
+        cache_t1 = paddle.shape(att_cache)[2]
         chunk_size = paddle.shape(xs)[1]
         attention_key_size = cache_t1 + chunk_size
```
paddlespeech/server/engine/asr/online/python/asr_engine.py

```diff
@@ -130,9 +130,9 @@ class PaddleASRConnectionHanddler:
         ## conformer
         # cache for conformer online
-        self.subsampling_cache = None
-        self.elayers_output_cache = None
-        self.conformer_cnn_cache = None
+        self.att_cache = paddle.zeros([0, 0, 0, 0])
+        self.cnn_cache = paddle.zeros([0, 0, 0, 0])
         self.encoder_out = None
         # conformer decoding state
         self.offset = 0  # global offset in decoding frame unit
@@ -474,11 +474,9 @@ class PaddleASRConnectionHanddler:
             # cur chunk
             chunk_xs = self.cached_feat[:, cur:end, :]
             # forward chunk
-            (y, self.subsampling_cache, self.elayers_output_cache,
-             self.conformer_cnn_cache) = self.model.encoder.forward_chunk(
-                 chunk_xs, self.offset, required_cache_size,
-                 self.subsampling_cache, self.elayers_output_cache,
-                 self.conformer_cnn_cache)
+            (y, self.att_cache,
+             self.cnn_cache) = self.model.encoder.forward_chunk(
+                 chunk_xs, self.offset, required_cache_size,
+                 self.att_cache, self.cnn_cache)
             outputs.append(y)
             # update the global offset, in decoding frame unit
```
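The hunk above rewires the connection handler to thread the two new cache tensors through successive `forward_chunk` calls: each call consumes the caches from the previous chunk and returns refreshed ones. A toy sketch of that feedback loop; `forward_chunk` here is a stand-in with placeholder arithmetic, not the real encoder:

```python
# Hedged sketch of the streaming loop in asr_engine.py: caches returned by
# one chunk are the inputs to the next. All logic inside forward_chunk is
# a placeholder; only the cache-threading pattern mirrors the real code.
def forward_chunk(chunk, offset, required_cache_size, att_cache, cnn_cache):
    y = [x * 2 for x in chunk]                # placeholder "encoder output"
    new_att = att_cache + chunk               # placeholder cache update
    return y, new_att[-required_cache_size:], cnn_cache

def stream(chunks, required_cache_size=4):
    att_cache, cnn_cache = [], []             # empty caches before first chunk
    offset, outputs = 0, []
    for chunk in chunks:
        y, att_cache, cnn_cache = forward_chunk(
            chunk, offset, required_cache_size, att_cache, cnn_cache)
        outputs.append(y)
        offset += len(y)                      # global offset, frame units
    return outputs

print(stream([[1, 2], [3, 4]]))  # [[2, 4], [6, 8]]
```

Because the caches are now plain tensors with a well-defined empty state, the handler's reset path shrinks from three `None` assignments to two zero-tensor assignments, which is what the first hunk shows.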