PaddlePaddle
DeepSpeech
Commit 0ffe1f91
Authored Mar 28, 2022 by huangyuxin

replace kaidi_fbank with paddleaudio

Parent: 2177a19d
Showing 3 changed files with 49 additions and 6 deletions
examples/aishell/asr1/conf/preprocess.yaml  +3 -6
paddlespeech/s2t/transform/spectrogram.py  +45 -0
paddlespeech/s2t/transform/transformation.py  +1 -0
examples/aishell/asr1/conf/preprocess.yaml @ 0ffe1f91
...
@@ -3,8 +3,9 @@ process:
  - type: fbank_kaldi
    fs: 16000
    n_mels: 80
    n_shift: 160
    win_length: 400
    n_frame_length: 25
    n_frame_shift: 10
    energy_floor: 0.0
    dither: 0.1
  - type: cmvn_json
    cmvn_path: data/mean_std.json
...
@@ -23,7 +24,3 @@ process:
    n_mask: 2
    inplace: true
    replace_with_zero: false
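The first hunk of this file lists both sample-count keys (`n_shift: 160`, `win_length: 400`) and millisecond keys (`n_frame_length: 25`, `n_frame_shift: 10`). The short sketch below, which assumes the millisecond values are interpreted Kaldi-style at `fs: 16000`, shows that the two pairs describe the same framing. The helper name `ms_to_samples` is hypothetical and does not appear in the commit.

```python
def ms_to_samples(duration_ms: float, sample_rate_hz: int) -> int:
    """Convert a duration in milliseconds to a whole number of samples."""
    return int(round(duration_ms * sample_rate_hz / 1000))


fs = 16000  # value of `fs` in preprocess.yaml

# Millisecond keys from the config, converted to sample counts.
win_length = ms_to_samples(25, fs)  # n_frame_length: 25
n_shift = ms_to_samples(10, fs)     # n_frame_shift: 10

print(win_length, n_shift)
```

At 16 kHz the result matches the sample-based values in the same hunk (400 and 160), so the millisecond keys leave the window and hop sizes unchanged.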
paddlespeech/s2t/transform/spectrogram.py @ 0ffe1f91
...
@@ -14,8 +14,11 @@
# Modified from espnet(https://github.com/espnet/espnet)
import librosa
import numpy as np
import paddle
from python_speech_features import logfbank

import paddleaudio.compliance.kaldi as kaldi


def stft(x,
         n_fft,
...
@@ -309,6 +312,48 @@ class IStft():
class LogMelSpectrogramKaldi():
    def __init__(
            self,
            fs=16000,
            n_mels=80,
            n_frame_length=25,
            n_frame_shift=10,
            energy_floor=0.0,
            dither=0.1):
        self.fs = fs
        self.n_mels = n_mels
        self.n_frame_length = n_frame_length
        self.n_frame_shift = n_frame_shift
        self.energy_floor = energy_floor
        self.dither = dither

    def __repr__(self):
        return (
            "{name}(fs={fs}, n_mels={n_mels}, "
            "n_frame_shift={n_frame_shift}, n_frame_length={n_frame_length}, "
            "dither={dither}))".format(
                name=self.__class__.__name__,
                fs=self.fs,
                n_mels=self.n_mels,
                n_frame_shift=self.n_frame_shift,
                n_frame_length=self.n_frame_length,
                dither=self.dither, ))

    def __call__(self, x, train):
        dither = self.dither if train else 0.0
        waveform = paddle.to_tensor(np.expand_dims(x, 0), dtype=paddle.float32)
        mat = kaldi.fbank(
            waveform,
            n_mels=self.n_mels,
            frame_length=self.n_frame_length,
            frame_shift=self.n_frame_shift,
            dither=dither,
            energy_floor=self.energy_floor,
            sr=self.fs)
        mat = np.squeeze(mat.numpy())
        return mat


class LogMelSpectrogramKaldi_decay():
    def __init__(
            self,
            fs=16000,
...
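As a rough illustration of what `LogMelSpectrogramKaldi.__call__` does with its arguments, the sketch below mirrors two pieces of its logic in plain Python: dither is applied only when `train` is true, and the number of output frames follows from the frame length and shift. The frame-count formula assumes Kaldi-style framing with edge snipping (no padding), which this diff does not confirm. The helper names `effective_dither` and `expected_num_frames` are hypothetical.

```python
def effective_dither(configured: float, train: bool) -> float:
    """Mirror the gating in __call__: dither only while training."""
    return configured if train else 0.0


def expected_num_frames(num_samples: int,
                        fs: int = 16000,
                        frame_length_ms: int = 25,
                        frame_shift_ms: int = 10) -> int:
    """Frame count under the assumed snip-edges framing (no padding)."""
    win = fs * frame_length_ms // 1000    # window size in samples
    shift = fs * frame_shift_ms // 1000   # hop size in samples
    if num_samples < win:
        return 0
    return 1 + (num_samples - win) // shift


# One second of 16 kHz audio with the defaults from __init__.
frames = expected_num_frames(16000)
print(frames, "frames, each with n_mels=80 bins")
print(effective_dither(0.1, train=True), effective_dither(0.1, train=False))
```

Under these assumptions one second of audio gives 98 frames, so the squeezed matrix returned by `__call__` would have shape `(98, 80)` with the default `n_mels`.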
paddlespeech/s2t/transform/transformation.py @ 0ffe1f91
...
@@ -31,6 +31,7 @@ import_alias = dict(
    freq_mask="paddlespeech.s2t.transform.spec_augment:FreqMask",
    spec_augment="paddlespeech.s2t.transform.spec_augment:SpecAugment",
    speed_perturbation="paddlespeech.s2t.transform.perturb:SpeedPerturbation",
    speed_perturbation_sox="paddlespeech.s2t.transform.perturb:SpeedPerturbationSox",
    volume_perturbation="paddlespeech.s2t.transform.perturb:VolumePerturbation",
    noise_injection="paddlespeech.s2t.transform.perturb:NoiseInjection",
    bandpass_perturbation="paddlespeech.s2t.transform.perturb:BandpassPerturbation",
...
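The `import_alias` entries map a short `type` name from `preprocess.yaml` to a `"module.path:ClassName"` string. A minimal sketch of how such a string could be resolved at runtime is shown below. It is an assumption about how the table is consumed, not code from this commit. The function name `resolve_alias` and the `demo_alias` table are hypothetical, and the targets are standard-library objects so the example runs without PaddleSpeech installed.

```python
import importlib


def resolve_alias(spec: str):
    """Resolve a 'package.module:Attribute' string to the named object."""
    module_name, sep, attr_name = spec.partition(":")
    if not sep or not module_name or not attr_name:
        raise ValueError(f"expected 'module:Attribute', got {spec!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Stand-in registry shaped like import_alias, pointing at stdlib targets.
demo_alias = dict(
    ordered="collections:OrderedDict",
    sqrt="math:sqrt", )

cls = resolve_alias(demo_alias["ordered"])
print(cls.__name__)
```

With a lookup of this kind, adding a transform only requires a new entry in the alias table plus a matching `type` value in the YAML config.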