PaddlePaddle / DeepSpeech
Commit 6a7e0265
Authored Nov 05, 2021 by Hui Zhang
add josn global cmvn
Parent: 9cdd2643
Showing 4 changed files with 45 additions and 9 deletions (+45 -9)
examples/tiny/s1/conf/preprocess.yaml (+7 -5)
examples/tiny/s1/conf/transformer.yaml (+2 -2)
paddlespeech/s2t/transform/cmvn.py (+35 -2)
paddlespeech/s2t/transform/transformation.py (+1 -0)

examples/tiny/s1/conf/preprocess.yaml @ 6a7e0265
 process:
   # extract kaldi fbank from PCM
-  - type: "fbank_kaldi"
+  - type: fbank_kaldi
     fs: 16000
     n_mels: 80
     n_shift: 160
     win_length: 400
     dither: true
+  - type: cmvn_json
+    cmvn_path: data/mean_std.json
   # these three processes are a.k.a. SpecAugument
-  - type: "time_warp"
+  - type: time_warp
     max_time_warp: 5
     inplace: true
-    mode: "PIL"
+    mode: PIL
-  - type: "freq_mask"
+  - type: freq_mask
     F: 30
     n_mask: 2
     inplace: true
     replace_with_zero: false
-  - type: "time_mask"
+  - type: time_mask
     T: 40
     n_mask: 2
     inplace: true
 ...

examples/tiny/s1/conf/transformer.yaml @ 6a7e0265
@@ -11,7 +11,7 @@ data:
     max_output_input_ratio: 10.0
 collator:
-    mean_std_filepath: ""
+    mean_std_filepath: data/mean_std.json
     vocab_filepath: data/vocab.txt
     unit_type: 'spm'
     spm_model_prefix: 'data/bpe_unigram_200'
@@ -37,7 +37,7 @@ collator:
 # network architecture
 model:
-    cmvn_file: "data/mean_std.json"
+    cmvn_file:
     cmvn_file_type: "json"
     # encoder related
     encoder: transformer
 ...

paddlespeech/s2t/transform/cmvn.py @ 6a7e0265
@@ -13,12 +13,11 @@
 # limitations under the License.
 # Modified from espnet(https://github.com/espnet/espnet)
 import io
+import json
 import h5py
 import kaldiio
 import numpy as np
 class CMVN():
     "Apply Global/Spk CMVN/iverserCMVN."
@@ -157,3 +156,37 @@ class UtteranceCMVN():
             x = np.divide(x, std)
         return x
+
+
+class GlobalCMVN():
+    "Apply Global CMVN"
+
+    def __init__(self,
+                 cmvn_path,
+                 norm_means=True,
+                 norm_vars=True,
+                 std_floor=1.0e-20):
+        self.cmvn_path = cmvn_path
+        self.norm_means = norm_means
+        self.norm_vars = norm_vars
+        self.std_floor = std_floor
+
+        with open(cmvn_path) as f:
+            cmvn_stats = json.load(f)
+        self.count = cmvn_stats['frame_num']
+        self.mean = np.array(cmvn_stats['mean_stat']) / self.count
+        self.square_sums = np.array(cmvn_stats['var_stat'])
+        self.var = self.square_sums / self.count - self.mean**2
+        self.std = np.maximum(np.sqrt(self.var), self.std_floor)
+
+    def __repr__(self):
+        return f"""{self.__class__.__name__}(
+                cmvn_path={self.cmvn_path},
+                norm_means={self.norm_means},
+                norm_vars={self.norm_vars},)"""
+
+    def __call__(self, x, uttid=None):
+        # x: [Time, Dim]
+        if self.norm_means:
+            x = np.subtract(x, self.mean)
+
+        if self.norm_vars:
+            x = np.divide(x, self.std)
+        return x
\ No newline at end of file
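
Review note: GlobalCMVN reads accumulated sums (`frame_num`, `mean_stat`, `var_stat`) and derives the statistics itself, using var = E[x^2] - (E[x])^2. The sketch below is a minimal, self-contained check of that arithmetic, not code from this commit. The helper names `write_cmvn_stats` and `load_mean_std` and the random features are assumptions made for illustration. Only the three JSON keys and the formulas come from the diff.

```python
import json
import os
import tempfile

import numpy as np


def write_cmvn_stats(feats, path):
    """Accumulate per-dimension sums over frames and save them as JSON.

    feats has shape [Time, Dim]. The key names mirror those read by
    GlobalCMVN in the diff; this writer itself is hypothetical.
    """
    stats = {
        "frame_num": int(feats.shape[0]),
        "mean_stat": feats.sum(axis=0).tolist(),        # sum of x
        "var_stat": (feats ** 2).sum(axis=0).tolist(),  # sum of x^2
    }
    with open(path, "w") as f:
        json.dump(stats, f)


def load_mean_std(path, std_floor=1.0e-20):
    """Recover mean and std from the accumulated sums, as GlobalCMVN does."""
    with open(path) as f:
        stats = json.load(f)
    count = stats["frame_num"]
    mean = np.array(stats["mean_stat"]) / count
    var = np.array(stats["var_stat"]) / count - mean ** 2
    std = np.maximum(np.sqrt(var), std_floor)
    return mean, std


# Synthetic features standing in for fbank output (assumption).
rng = np.random.default_rng(0)
feats = rng.normal(loc=3.0, scale=2.0, size=(1000, 4))

with tempfile.TemporaryDirectory() as tmp:
    stats_path = os.path.join(tmp, "mean_std.json")
    write_cmvn_stats(feats, stats_path)
    mean, std = load_mean_std(stats_path)

# Same operations as GlobalCMVN.__call__ with both flags enabled.
normalized = np.divide(np.subtract(feats, mean), std)
```

When the statistics are computed from the same data they are applied to, each dimension of `normalized` should have mean close to 0 and population standard deviation close to 1. One caveat worth raising in review: `np.sqrt` returns NaN if rounding makes the computed variance slightly negative, and `np.maximum` propagates NaN, so `std_floor` does not protect that case.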

paddlespeech/s2t/transform/transformation.py @ 6a7e0265
@@ -46,6 +46,7 @@ import_alias = dict(
     wpe="paddlespeech.s2t.transform.wpe:WPE",
     channel_selector="paddlespeech.s2t.transform.channel_selector:ChannelSelector",
     fbank_kaldi="paddlespeech.s2t.transform.spectrogram:LogMelSpectrogramKaldi",
+    cmvn_json="paddlespeech.s2t.transform.cmvn:GlobalCMVN"
 )
 ...
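
Review note: the new `cmvn_json` alias is what lets `type: cmvn_json` in preprocess.yaml select `GlobalCMVN`, with `cmvn_path` presumably passed through as a constructor argument. The resolver is not part of this diff, so the sketch below only illustrates the general "module.path:ClassName" pattern with `importlib`. The functions `resolve` and `build` and the stdlib alias targets are hypothetical stand-ins, not PaddleSpeech APIs.

```python
import importlib

# Hypothetical alias table in the same "module.path:ClassName" style as
# import_alias; stdlib targets are used so the sketch runs anywhere.
alias = dict(
    ordered="collections:OrderedDict",
    fraction="fractions:Fraction",
)


def resolve(type_name, table=alias):
    """Map a short type name to the class its alias string points at."""
    module_name, class_name = table[type_name].split(":")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def build(process_conf):
    """Instantiate one entry of a `process` list.

    The `type` key selects the class; the remaining keys become keyword
    arguments, which is how `cmvn_path` would reach a constructor.
    """
    conf = dict(process_conf)  # copy so the caller's dict is untouched
    cls = resolve(conf.pop("type"))
    return cls(**conf)


half = build({"type": "fraction", "numerator": 1, "denominator": 2})
```

Under this pattern, a config entry whose `type` has no alias fails with a `KeyError` at build time, which is why the one-line addition to `import_alias` has to ship together with the preprocess.yaml change.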