PaddlePaddle / DeepSpeech
Commit 852d0ab9
Authored on Feb 25, 2022 by Hui Zhang
dtw metric for tts, test=doc
Parent: 54f06041

Showing 5 changed files with 94 additions and 0 deletions
paddleaudio/CHANGELOG.md  +1 -0
paddleaudio/paddleaudio/metric/__init__.py  +2 -0
paddleaudio/paddleaudio/metric/dtw.py  +42 -0
paddleaudio/paddleaudio/metric/mcd.py  +47 -0
paddleaudio/setup.py  +2 -0

paddleaudio/CHANGELOG.md
@@ -2,3 +2,4 @@
 Date: 2022-2-25, Author: Hui Zhang.
  - Refactor architecture.
+ - dtw distance and mcd style dtw

paddleaudio/paddleaudio/metric/__init__.py
@@ -11,3 +11,5 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
+from .dtw import dtw_distance
+from .mcd import mcd_distance

paddleaudio/paddleaudio/metric/dtw.py
new file, mode 0 → 100644

# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import numpy as np
from dtaidistance import dtw_ndim

__all__ = [
    'dtw_distance',
]


def dtw_distance(xs: np.ndarray, ys: np.ndarray) -> float:
    """dtw distance

    Dynamic Time Warping.
    This function keeps a compact matrix, not the full warping paths matrix.
    Uses dynamic programming to compute:

        wps[i, j] = (s1[i]-s2[j])**2 + min(
                        wps[i-1, j  ] + penalty,  // vertical   / insertion / expansion
                        wps[i  , j-1] + penalty,  // horizontal / deletion  / compression
                        wps[i-1, j-1])            // diagonal   / match
        dtw = sqrt(wps[-1, -1])

    Args:
        xs (np.ndarray): ref sequence, [T,D]
        ys (np.ndarray): hyp sequence, [T,D]

    Returns:
        float: dtw distance
    """
    return dtw_ndim.distance(xs, ys)
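
The recurrence in the docstring above can be written out directly. The following is a minimal sketch, not the `dtaidistance` implementation: `dtw_distance_sketch` is a hypothetical name, it works on plain lists of frames rather than `np.ndarray` so it has no dependencies, it keeps the full `wps` matrix instead of a compact one, and it assumes a default `penalty` of 0. Whether its output matches `dtw_ndim.distance` under that library's default settings is an assumption, not something the commit states.

```python
import math


def dtw_distance_sketch(xs, ys, penalty=0.0):
    """Hypothetical reference DTW over two sequences of D-dimensional frames.

    xs and ys are lists of equal-width lists of floats, shaped [T, D].
    """
    n, m = len(xs), len(ys)
    # wps[i][j] is the best accumulated squared cost aligning xs[:i] with ys[:j].
    wps = [[math.inf] * (m + 1) for _ in range(n + 1)]
    wps[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Squared Euclidean distance between the two frames.
            d = sum((a - b) ** 2 for a, b in zip(xs[i - 1], ys[j - 1]))
            wps[i][j] = d + min(
                wps[i - 1][j] + penalty,  # vertical / insertion / expansion
                wps[i][j - 1] + penalty,  # horizontal / deletion / compression
                wps[i - 1][j - 1],        # diagonal / match
            )
    return math.sqrt(wps[n][m])
```

Because a frame may be matched more than once, a sequence and a time-stretched copy of it have distance zero, which is the property that makes DTW useful for comparing reference and synthesized TTS features of different lengths.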

paddleaudio/paddleaudio/metric/mcd.py
new file, mode 0 → 100644

# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import mcd.metrics_fast as mt
import numpy as np  # needed for the np.ndarray annotations below
from mcd import dtw

__all__ = [
    'mcd_distance',
]


def mcd_distance(xs: np.ndarray, ys: np.ndarray, cost_fn=mt.logSpecDbDist):
    """Mel cepstral distortion (MCD), dtw distance.

    Dynamic Time Warping.
    Uses dynamic programming to compute:

        wps[i, j] = cost_fn(xs[i], ys[j]) + min(
                        wps[i-1, j  ],  // vertical   / insertion / expansion
                        wps[i  , j-1],  // horizontal / deletion  / compression
                        wps[i-1, j-1])  // diagonal   / match
        dtw = sqrt(wps[-1, -1])

    Cost Function:
        logSpecDbConst = 10.0 / math.log(10.0) * math.sqrt(2.0)

        def logSpecDbDist(x, y):
            diff = x - y
            return logSpecDbConst * math.sqrt(np.inner(diff, diff))

    Args:
        xs (np.ndarray): ref sequence, [T,D]
        ys (np.ndarray): hyp sequence, [T,D]

    Returns:
        float: dtw distance
    """
    min_cost, path = dtw.dtw(xs, ys, cost_fn)
    return min_cost
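
To make the cost function and the `(min_cost, path)` pair concrete, here is a dependency-free sketch of the recurrence described in the docstring above. It does not call the `mcd` package: `log_spec_db_dist` and `dtw_with_cost` are hypothetical names, frames are plain lists, and the sketch returns the accumulated cost without taking a square root, which is an assumption about what `dtw.dtw` reports as `min_cost`.

```python
import math

# Constant from the docstring: converts a cepstral Euclidean distance to dB.
LOG_SPEC_DB_CONST = 10.0 / math.log(10.0) * math.sqrt(2.0)


def log_spec_db_dist(x, y):
    """Scaled Euclidean distance between two frames (hypothetical stand-in)."""
    return LOG_SPEC_DB_CONST * math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def dtw_with_cost(xs, ys, cost_fn=log_spec_db_dist):
    """Return (accumulated cost, alignment path) for two non-empty sequences."""
    n, m = len(xs), len(ys)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    wps = [[math.inf] * (m + 1) for _ in range(n + 1)]
    wps[0][0] = 0.0
    back = {}  # back[(i, j)] is the predecessor cell chosen for (i, j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Diagonal is listed first so that ties prefer a plain match.
            candidates = [
                (wps[i - 1][j - 1], (i - 1, j - 1)),
                (wps[i - 1][j], (i - 1, j)),
                (wps[i][j - 1], (i, j - 1)),
            ]
            best, prev = min(candidates, key=lambda c: c[0])
            wps[i][j] = cost_fn(xs[i - 1], ys[j - 1]) + best
            back[(i, j)] = prev
    # Walk the predecessors from the last cell back to the origin.
    path, cell = [], (n, m)
    while cell != (0, 0):
        path.append((cell[0] - 1, cell[1] - 1))
        cell = back[cell]
    path.reverse()
    return wps[n][m], path
```

The path is a list of `(index into xs, index into ys)` pairs. For example, `dtw_with_cost([[0.0], [1.0]], [[0.0], [0.0], [1.0]])` aligns the first reference frame with the first two hypothesis frames at zero total cost.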

paddleaudio/setup.py
@@ -59,6 +59,8 @@ setuptools.setup(
         'resampy >= 0.2.2',
         'soundfile >= 0.9.0',
         'colorlog',
+        'dtaidistance >= 2.3.6',
+        'mcd >= 0.4',
     ],
 )
 remove_version_py()