PaddlePaddle
DeepSpeech
Commit 18d9abc7
Authored Nov 05, 2021 by Hui Zhang
add sox speed pertrub
Parent: 6a7e0265
Showing 1 changed file with 104 additions and 2 deletions (+104 -2)
paddlespeech/s2t/transform/perturb.py
@@ -16,7 +16,7 @@ import librosa
import numpy
import scipy
import soundfile
import soxbindings as sox

from paddlespeech.s2t.io.reader import SoundHDF5File
@@ -82,7 +82,6 @@ class SpeedPerturbation():
    def __call__(self, x, uttid=None, train=True):
        if not train:
            return x
        x = x.astype(numpy.float32)
        if self.accept_uttid:
            ratio = self.utt2ratio[uttid]
@@ -108,6 +107,109 @@ class SpeedPerturbation():
        return y


class SpeedPerturbationSox():
    """SpeedPerturbationSox

    The speed perturbation in kaldi uses sox-speed instead of sox-tempo,
    and sox-speed just to resample the input,
    i.e pitch and tempo are changed both.

    To speed up or slow down the sound of a file,
    use speed to modify the pitch and the duration of the file.
    This raises the speed and reduces the time.
    The default factor is 1.0 which makes no change to the audio.
    2.0 doubles speed, thus time length is cut by a half and pitch is one interval higher.

    "Why use speed option instead of tempo -s in SoX for speed perturbation"
    https://groups.google.com/forum/#!topic/kaldi-help/8OOG7eE4sZ8

    tempo option:
    sox -t wav input.wav -t wav output.tempo0.9.wav tempo -s 0.9

    speed option:
    sox -t wav input.wav -t wav output.speed0.9.wav speed 0.9

    If we use speed option like above, the pitch of audio also will be changed,
    but the tempo option does not change the pitch.
    """

    def __init__(
            self,
            lower=0.9,
            upper=1.1,
            utt2ratio=None,
            keep_length=True,
            sr=16000,
            seed=None, ):
        self.sr = sr
        self.keep_length = keep_length
        self.state = numpy.random.RandomState(seed)

        if utt2ratio is not None:
            self.utt2ratio = {}
            # Use the scheduled ratio for each utterances
            self.utt2ratio_file = utt2ratio
            self.lower = None
            self.upper = None
            self.accept_uttid = True

            with open(utt2ratio, "r") as f:
                for line in f:
                    utt, ratio = line.rstrip().split(None, 1)
                    ratio = float(ratio)
                    self.utt2ratio[utt] = ratio
        else:
            self.utt2ratio = None
            # The ratio is given on runtime randomly
            self.lower = lower
            self.upper = upper

    def __repr__(self):
        if self.utt2ratio is None:
            return f"""{self.__class__.__name__}(
                lower={self.lower},
                upper={self.upper},
                keep_length={self.keep_length},
                sample_rate={self.sr})"""
        else:
            return f"""{self.__class__.__name__}(
                utt2ratio={self.utt2ratio_file},
                sample_rate={self.sr})"""

    def __call__(self, x, uttid=None, train=True):
        if not train:
            return x

        x = x.astype(numpy.float32)
        if self.accept_uttid:
            ratio = self.utt2ratio[uttid]
        else:
            ratio = self.state.uniform(self.lower, self.upper)

        tfm = sox.Transformer()
        tfm.set_globals(multithread=False)
        tfm.speed(ratio)
        y = tfm.build_array(input_array=x, sample_rate_in=self.sr)

        if self.keep_length:
            diff = abs(len(x) - len(y))
            if len(y) > len(x):
                # Truncate noise
                y = y[diff // 2:-((diff + 1) // 2)]
            elif len(y) < len(x):
                # Assume the time-axis is the first: (Time, Channel)
                pad_width = [(diff // 2, (diff + 1) // 2)] + [
                    (0, 0) for _ in range(y.ndim - 1)
                ]
                y = numpy.pad(
                    y, pad_width=pad_width, constant_values=0, mode="constant")

        if y.ndim == 2 and x.ndim == 1:
            # (T, C) -> (T)
            # Note: numpy.ndarray has no sequence() method; squeeze(1) appears to be intended.
            y = y.sequence(1)
        return y


class BandpassPerturbation():
    """BandpassPerturbation
...
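
To make the docstring's speed-versus-tempo point and the `keep_length` branch easier to follow, here is a minimal sketch in plain Python. It does not call sox or numpy. The helper names `speed_effect` and `match_length` are hypothetical and are not part of the commit. It assumes that a speed factor `ratio` scales the duration by roughly `1 / ratio` and the pitch by `ratio`; the exact number of samples sox returns may differ slightly. The length alignment mirrors the diff above: surplus samples are trimmed evenly from both ends, and missing samples are zero-padded evenly on both ends.

```python
import math


def speed_effect(num_samples, ratio):
    """Estimate the result of a sox-style `speed` effect (hypothetical helper).

    Returns (approximate output length, pitch shift in semitones).
    Assumption: speed resamples the signal, so duration scales by 1 / ratio
    and pitch scales by ratio.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    out_len = int(round(num_samples / ratio))
    semitones = 12.0 * math.log2(ratio)
    return out_len, semitones


def match_length(y, target_len):
    """Center-crop or zero-pad a 1-D sequence to target_len (hypothetical helper)."""
    y = list(y)
    diff = abs(target_len - len(y))
    if len(y) > target_len:
        # Drop diff // 2 samples from the front and the rest from the back.
        return y[diff // 2:len(y) - (diff + 1) // 2]
    if len(y) < target_len:
        # Pad diff // 2 zeros in front and the rest at the back.
        return [0.0] * (diff // 2) + y + [0.0] * ((diff + 1) // 2)
    return y


if __name__ == "__main__":
    # One second of 16 kHz audio at speed 2.0: about half as long, 12 semitones up.
    print(speed_effect(16000, 2.0))
    # A perturbed signal that came out 3 samples too long is trimmed back.
    print(match_length(range(10), 7))
```

With `keep_length=True` the perturbed utterance therefore keeps its original number of samples. Its content is still sped up or slowed down, so part of the signal is cut off or replaced by silence at the edges.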