PaddlePaddle / DeepSpeech
Commit 260752aa
Authored Sep 19, 2022 by Hui Zhang

using forward_attention_decoder
Parent: 0d7d8712
Showing 2 changed files with 9 additions and 13 deletions
paddlespeech/s2t/exps/u2/bin/test_wav.py    +3 -5
paddlespeech/s2t/models/u2/u2.py            +6 -8

paddlespeech/s2t/exps/u2/bin/test_wav.py @ 260752aa
@@ -69,8 +69,7 @@ class U2Infer():
         with paddle.no_grad():
             # read
             audio, sample_rate = soundfile.read(
-                self.audio_file, dtype="int16", always_2d=True)
+                self.audio_file, dtype="int16", always_2d=True)
             audio = audio[:, 0]
             logger.info(f"audio shape: {audio.shape}")
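
A note on the read step in this hunk: with `always_2d=True` the audio is loaded as a (frames, channels) array, and `audio[:, 0]` keeps only the first channel. The sketch below illustrates the same channel selection using the standard-library `wave` module instead of `soundfile`, so that it runs without extra dependencies. The helper names `write_stereo_wav` and `read_first_channel` are hypothetical and are not part of PaddleSpeech.

```python
import array
import os
import tempfile
import wave


def write_stereo_wav(path, left, right, sample_rate=16000):
    """Write two equal-length int16 channels as an interleaved stereo WAV."""
    interleaved = array.array("h")
    for l_sample, r_sample in zip(left, right):
        interleaved.extend((l_sample, r_sample))
    with wave.open(path, "wb") as wf:
        wf.setnchannels(2)
        wf.setsampwidth(2)  # 2 bytes per sample = int16
        wf.setframerate(sample_rate)
        wf.writeframes(interleaved.tobytes())


def read_first_channel(path):
    """Return (channel-0 samples, sample_rate); analogous to audio[:, 0]."""
    with wave.open(path, "rb") as wf:
        assert wf.getsampwidth() == 2, "this sketch only handles int16 PCM"
        n_channels = wf.getnchannels()
        sample_rate = wf.getframerate()
        frames = array.array("h")
        frames.frombytes(wf.readframes(wf.getnframes()))
    # Samples are interleaved per frame: ch0, ch1, ch0, ch1, ...
    return list(frames[0::n_channels]), sample_rate


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        wav_path = os.path.join(tmp, "stereo.wav")
        write_stereo_wav(wav_path, [1, 2, 3], [-1, -2, -3])
        print(read_first_channel(wav_path))
```

Keeping a single channel means the downstream feature extraction sees a 1-D signal regardless of how many channels the input file has.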
@@ -78,11 +77,10 @@ class U2Infer():
             feat = self.preprocessing(audio, **self.preprocess_args)
             logger.info(f"feat shape: {feat.shape}")
             np.savetxt("feat.transform.txt", feat)
             ilen = paddle.to_tensor(feat.shape[0])
-            xs = paddle.to_tensor(feat, dtype='float32').unsqueeze(axis=0)
+            xs = paddle.to_tensor(feat, dtype='float32').unsqueeze(0)
             decode_config = self.config.decode
             logger.debug(f"decode cfg: {decode_config}")
             result_transcripts = self.model.decode(
                 xs,
                 ilen,

paddlespeech/s2t/models/u2/u2.py @ 260752aa
@@ -545,17 +545,11 @@ class U2BaseModel(ASRInterface, nn.Layer):
             [len(hyp[0]) for hyp in hyps], place=device, dtype=paddle.long)  # (beam_size,)
         hyps_pad, _ = add_sos_eos(hyps_pad, self.sos, self.eos, self.ignore_id)
         logger.debug(f"hyps pad: {hyps_pad} {self.sos} {self.eos} {self.ignore_id}")
         hyps_lens = hyps_lens + 1  # Add <sos> at begining
-        encoder_out = encoder_out.repeat(beam_size, 1, 1)
-        encoder_mask = paddle.ones((beam_size, 1, encoder_out.shape[1]), dtype=paddle.bool)
-        decoder_out, _ = self.decoder(encoder_out, encoder_mask, hyps_pad, hyps_lens)  # (beam_size, max_hyps_len, vocab_size)
-        # ctc score in ln domain
-        decoder_out = paddle.nn.functional.log_softmax(decoder_out, axis=-1)
-        decoder_out = decoder_out.numpy()
+        decoder_out = self.forward_attention_decoder(hyps_pad, hyps_lens, encoder_out)
         # Only use decoder score for rescoring
         best_score = -float('inf')
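
For context on what the removed lines computed: the decoder logits were normalized with `log_softmax` over the last (vocabulary) axis, so every entry of `decoder_out` is a natural-log probability. The body of `forward_attention_decoder` is not shown in this diff, so whether it applies the same normalization internally is an assumption here. The following is a minimal pure-Python sketch of the normalization itself; the function names are illustrative and the nested-list layout `[beam][step][vocab]` only mirrors the shape comment in the diff.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a 1-D list of logits."""
    peak = max(logits)
    # Subtracting the maximum keeps exp() from overflowing on large logits.
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_sum for x in logits]


def log_softmax_last_axis(batch):
    """Apply log_softmax along the last axis of a [beam][step][vocab] nested list."""
    return [[log_softmax(step) for step in hyp] for hyp in batch]


if __name__ == "__main__":
    scores = log_softmax_last_axis([[[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]])
    # Exponentiating each row should give probabilities that sum to 1.
    for row in scores[0]:
        print(sum(math.exp(v) for v in row))
```

Working in the log domain lets the rescoring step add per-token scores instead of multiplying probabilities, which avoids underflow on long hypotheses.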
@@ -567,11 +561,15 @@ class U2BaseModel(ASRInterface, nn.Layer):
                 score += decoder_out[i][j][w]
             # last decoder output token is `eos`, for laste decoder input token.
             score += decoder_out[i][len(hyp[0])][self.eos]
+            logger.debug(f"hyp {i} len {len(hyp[0])} l2r rescore_score: {score} ctc_score: {hyp[1]}")
             # add ctc score (which in ln domain)
             score += hyp[1] * ctc_weight
             if score > best_score:
                 best_score = score
                 best_index = i
+        logger.debug(f"result: {hyps[best_index]}")
         return hyps[best_index][0]

     @jit.to_static(property=True)
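
The rescoring loop touched by this hunk can be read as: sum the decoder log-probabilities of each hypothesis token, add the log-probability of `eos` at the step after the last token, add the weighted CTC score, and keep the best total. Below is a self-contained sketch of that logic on plain Python lists. The enclosing `for` loops and the `score = 0.0` initialization are not visible in the diff and are assumed here; the function name `attention_rescore` and the toy numbers are purely illustrative.

```python
def attention_rescore(hyps, decoder_out, eos, ctc_weight=0.0):
    """Pick the best hypothesis by decoder score plus weighted CTC score.

    hyps: list of (tokens, ctc_score) pairs, scores in the natural-log domain.
    decoder_out[i][j][w]: log-probability of token w at step j for hypothesis i.
    """
    best_score = -float("inf")
    best_index = 0
    for i, (tokens, ctc_score) in enumerate(hyps):
        score = 0.0
        for j, w in enumerate(tokens):
            score += decoder_out[i][j][w]
        # The step after the last token should predict `eos`.
        score += decoder_out[i][len(tokens)][eos]
        # Add the CTC score (also in the log domain), scaled by ctc_weight.
        score += ctc_score * ctc_weight
        if score > best_score:
            best_score = score
            best_index = i
    return hyps[best_index][0], best_score


if __name__ == "__main__":
    # Two one-token hypotheses over a 3-token vocabulary where id 2 is eos.
    toy_hyps = [([0], -4.0), ([1], -0.5)]
    toy_decoder_out = [
        [[-0.1, -2.0, -3.0], [-3.0, -3.0, -0.2]],
        [[-2.0, -0.5, -3.0], [-3.0, -3.0, -0.4]],
    ]
    print(attention_rescore(toy_hyps, toy_decoder_out, eos=2, ctc_weight=0.0))
    print(attention_rescore(toy_hyps, toy_decoder_out, eos=2, ctc_weight=0.5))
```

In the toy data the decoder alone prefers the first hypothesis, while a CTC weight of 0.5 shifts the choice to the second, which is the trade-off `ctc_weight` controls.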