PaddlePaddle / DeepSpeech
Commit cfc390e0, authored April 04, 2022 by xiongxinlei
add speaker verification method, test=doc
Parent: a2c0fbf2
Showing 3 changed files with 130 additions and 6 deletions
demos/speaker_verification/README.md (+64 -3)
demos/speaker_verification/README_cn.md (+61 -1)
demos/speaker_verification/run.sh (+5 -2)
demos/speaker_verification/README.md
...
...
@@ -30,6 +30,11 @@ wget -c https://paddlespeech.bj.bcebos.com/vector/audio/85236145389.wav
paddlespeech vector --task spk --input vec.job

echo -e "demo2 85236145389.wav \n demo3 85236145389.wav" | paddlespeech vector --task spk

paddlespeech vector --task score --input "./85236145389.wav ./123456789.wav"

echo -e "demo4 85236145389.wav 85236145389.wav \n demo5 85236145389.wav 123456789.wav" > vec.job
paddlespeech vector --task score --input vec.job
```
Usage:
...
...
@@ -38,6 +43,7 @@ wget -c https://paddlespeech.bj.bcebos.com/vector/audio/85236145389.wav
```
Arguments:
- `input` (required): Audio file to recognize.
- `task` (required): Specify `vector` task. Default `spk`。
- `model`: Model type of vector task. Default: `ecapatdnn_voxceleb12`.
- `sample_rate`: Sample rate of the model. Default: `16000`.
- `config`: Config of vector task. Use pretrained model when it is None. Default: `None`.
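The commands in this diff pass `--input` either a path, a quoted pair of paths, or a job file such as `vec.job`. As a minimal sketch, assuming each job-file line is an utterance id followed by one audio path (`spk`) or two audio paths (`score`), which is inferred from the `echo -e` commands and not from the CLI's own parser, the file could be written and read back like this. The helper names `write_job_file` and `read_job_file` are hypothetical and are not part of PaddleSpeech.

```python
import tempfile
from pathlib import Path


def write_job_file(path, entries):
    """Write one job per line: an utterance id followed by its audio path(s)."""
    lines = [" ".join([utt_id, *audio_paths]) for utt_id, audio_paths in entries]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def read_job_file(path):
    """Parse a job file into (utt_id, [audio paths]) tuples, skipping blank lines."""
    jobs = []
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        fields = raw.split()
        if not fields:
            continue
        jobs.append((fields[0], fields[1:]))
    return jobs


# The same two scoring jobs as the `echo -e ... > vec.job` command in the diff.
score_jobs = [
    ("demo4", ["85236145389.wav", "85236145389.wav"]),
    ("demo5", ["85236145389.wav", "123456789.wav"]),
]

with tempfile.TemporaryDirectory() as tmp:
    job_path = Path(tmp) / "vec.job"
    write_job_file(job_path, score_jobs)
    parsed = read_job_file(job_path)

print(parsed)
```

Splitting on whitespace means this sketch would not handle audio paths that contain spaces; whether the real CLI does is not shown in the diff.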
...
...
@@ -97,17 +103,29 @@ wget -c https://paddlespeech.bj.bcebos.com/vector/audio/85236145389.wav
audio_emb = vector_executor(
    model='ecapatdnn_voxceleb12',
    sample_rate=16000,
    config=None,
    config=None,  # Set `config` and `ckpt_path` to None to use pretrained model.
    ckpt_path=None,
    audio_file='./85236145389.wav',
    force_yes=False,
    device=paddle.get_device())
print('Audio embedding Result: \n{}'.format(audio_emb))

test_emb = vector_executor(
    model='ecapatdnn_voxceleb12',
    sample_rate=16000,
    config=None,  # Set `config` and `ckpt_path` to None to use pretrained model.
    ckpt_path=None,
    audio_file='./123456789.wav',
    device=paddle.get_device())
print('Test embedding Result: \n{}'.format(test_emb))

score = vector_executor.get_embeddings_score(audio_emb, test_emb)
print(f"Eembeddings Score: {score}")
```
Output:
Output:
```bash
# Vector Result:
Audio embedding Result:
[ -5.749211 9.505463 -8.200284 -5.2075014 5.3940268
  -3.04878 1.611095 10.127234 -10.534177 -15.821609
  1.2032688 -0.35080156 1.2629458 -12.643498 -2.5758228
...
...
@@ -147,6 +165,49 @@ wget -c https://paddlespeech.bj.bcebos.com/vector/audio/85236145389.wav
  -6.417456 1.3333273 11.872697 -0.30664724 8.8845
  6.5569253 4.7948146 0.03662816 -8.704245 6.224871
  -3.2701402 -11.508579 ]
# get the test embedding
Test embedding Result:
[ -1.9617152 4.2184057 -5.4289927 3.8006616 7.400566
  12.844175 1.4330423 0.4860911 -15.927942 -13.081303
  -4.585545 2.378477 5.5894523 -13.060747 18.578707
  -9.107497 -9.904055 0.7032993 0.7945765 -1.4118854
  -6.4434266 -2.7688267 5.4320455 2.9636188 23.857662
  -4.797293 22.821133 -1.6718386 0.80379957 -10.28131
  -1.0586771 5.840774 -11.794188 0.9715659 -10.794272
  -9.9839325 11.916608 -19.614918 -7.38727 12.361765
  -15.568076 3.796782 1.4648503 -9.617965 1.8912128
  5.5519567 4.1027875 9.565811 1.6652825 -0.06557167
  7.3765106 6.91407 -3.4179301 4.676896 2.4507313
  21.415924 -1.5271066 0.7630236 -15.634208 -24.682417
  12.035311 1.9669697 -13.733474 11.616938 -16.630692
  -16.287516 -7.4265285 -6.4809394 5.4794173 -8.481719
  2.0745668 -7.50969 1.8279544 -15.189501 -4.000386
  -1.5209727 6.975059 4.518711 3.0962887 -6.8465433
  1.3825562 7.6983547 -9.399815 -7.3269534 -2.6540608
  1.3231711 5.0338726 -5.9562182 -10.437971 19.123528
  12.213971 -2.8820174 -20.65914 15.071251 8.114322
  -4.045127 7.5128584 -3.3306584 6.822803 -0.05004288
  -4.4368496 18.926466 14.04377 -5.9657135 4.714744
  10.24277 -3.848245 14.494125 5.3582125 -6.30404
  -14.122616 2.1969411 -5.90989 9.3047 -8.431231
  10.438023 -11.987487 20.954517 -4.279951 -0.3756797
  13.041809 -6.051407 -10.529183 3.7894943 -1.6330183
  6.743382 -0.19549051 7.315633 -19.438568 0.6115422
  4.5697403 2.1208212 0.52282465 -6.9142766 -5.8893275
  0.5135903 0.92921656 -3.0571883 -7.4849505 2.2382743
  -3.0478394 0.08785366 6.810543 -5.1137877 15.182398
  -6.9418297 -8.922732 -2.4528694 7.324874 19.77244
  13.997188 -5.08692 -14.329076 -6.1807523 -1.8777156
  -3.6879017 6.3892293 -3.78877 -13.009837 -16.838747
  -4.1660237 -7.4346085 0.5579437 -2.8482168 -13.509024
  9.329142 8.1292095 -8.064337 -4.002228 -18.78694
  7.7969575 -13.585645 -5.8225474 15.266658 -8.57028
  -7.449079 2.2094946 28.004955 -3.0901644 11.932054
  -1.5897936 -4.826059 6.9232755 -11.169697 -5.235409
  11.251503 2.105524 4.0860977 -0.5384147 19.023642
  1.6203141 -10.608387 ]
# get the score between enroll and test
Eembeddings Score: 0.3965281546115875
```
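The diff does not show how `get_embeddings_score` computes its result. A common choice for comparing speaker embeddings is cosine similarity, so the following is a minimal sketch under that assumption, written in plain Python and not taken from the library. The function name `cosine_similarity` is illustrative, and the inputs are only the first five values of each embedding printed above, so the result will not reproduce the `0.3965...` score obtained from the full vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors, in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy inputs: the first five values of the enroll and test embeddings.
audio_head = [-5.749211, 9.505463, -8.200284, -5.2075014, 5.3940268]
test_head = [-1.9617152, 4.2184057, -5.4289927, 3.8006616, 7.400566]

same = cosine_similarity(audio_head, audio_head)   # a vector against itself
cross = cosine_similarity(audio_head, test_head)   # enroll against test
print(f"same: {same:.4f}, cross: {cross:.4f}")
```

Under this assumption a higher score means the two utterances are more likely to come from the same speaker; the decision threshold is not given in the diff.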
### 4.Pretrained Models
...
...
demos/speaker_verification/README_cn.md
...
...
@@ -29,6 +29,11 @@ wget -c https://paddlespeech.bj.bcebos.com/vector/audio/85236145389.wav
paddlespeech vector --task spk --input vec.job

echo -e "demo2 85236145389.wav \n demo3 85236145389.wav" | paddlespeech vector --task spk

paddlespeech vector --task score --input "./85236145389.wav ./123456789.wav"

echo -e "demo4 85236145389.wav 85236145389.wav \n demo5 85236145389.wav 123456789.wav" > vec.job
paddlespeech vector --task score --input vec.job
```
Usage:
...
...
@@ -37,6 +42,7 @@ wget -c https://paddlespeech.bj.bcebos.com/vector/audio/85236145389.wav
```
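Arguments:
- `input` (required): Audio file to recognize.
- `task` (required): Specifies which task `vector` performs. Default: `spk`.
- `model`: Model for the voiceprint task. Default: `ecapatdnn_voxceleb12`.
- `sample_rate`: Audio sample rate. Default: `16000`.
- `config`: Config file for the voiceprint task. If unset, the default config of the pretrained model is used. Default: `None`.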
...
...
@@ -98,14 +104,25 @@ wget -c https://paddlespeech.bj.bcebos.com/vector/audio/85236145389.wav
    config=None,  # Set `config` and `ckpt_path` to None to use pretrained model.
    ckpt_path=None,
    audio_file='./85236145389.wav',
    force_yes=False,
    device=paddle.get_device())
print('Audio embedding Result: \n{}'.format(audio_emb))

test_emb = vector_executor(
    model='ecapatdnn_voxceleb12',
    sample_rate=16000,
    config=None,  # Set `config` and `ckpt_path` to None to use pretrained model.
    ckpt_path=None,
    audio_file='./123456789.wav',
    device=paddle.get_device())
print('Test embedding Result: \n{}'.format(test_emb))

score = vector_executor.get_embeddings_score(audio_emb, test_emb)
print(f"Eembeddings Score: {score}")
```
Output:
```bash
# Vector Result:
Audio embedding Result:
[ -5.749211 9.505463 -8.200284 -5.2075014 5.3940268
  -3.04878 1.611095 10.127234 -10.534177 -15.821609
  1.2032688 -0.35080156 1.2629458 -12.643498 -2.5758228
...
...
@@ -145,6 +162,49 @@ wget -c https://paddlespeech.bj.bcebos.com/vector/audio/85236145389.wav
  -6.417456 1.3333273 11.872697 -0.30664724 8.8845
  6.5569253 4.7948146 0.03662816 -8.704245 6.224871
  -3.2701402 -11.508579 ]
# get the test embedding
Test embedding Result:
[ -1.9617152 4.2184057 -5.4289927 3.8006616 7.400566
  12.844175 1.4330423 0.4860911 -15.927942 -13.081303
  -4.585545 2.378477 5.5894523 -13.060747 18.578707
  -9.107497 -9.904055 0.7032993 0.7945765 -1.4118854
  -6.4434266 -2.7688267 5.4320455 2.9636188 23.857662
  -4.797293 22.821133 -1.6718386 0.80379957 -10.28131
  -1.0586771 5.840774 -11.794188 0.9715659 -10.794272
  -9.9839325 11.916608 -19.614918 -7.38727 12.361765
  -15.568076 3.796782 1.4648503 -9.617965 1.8912128
  5.5519567 4.1027875 9.565811 1.6652825 -0.06557167
  7.3765106 6.91407 -3.4179301 4.676896 2.4507313
  21.415924 -1.5271066 0.7630236 -15.634208 -24.682417
  12.035311 1.9669697 -13.733474 11.616938 -16.630692
  -16.287516 -7.4265285 -6.4809394 5.4794173 -8.481719
  2.0745668 -7.50969 1.8279544 -15.189501 -4.000386
  -1.5209727 6.975059 4.518711 3.0962887 -6.8465433
  1.3825562 7.6983547 -9.399815 -7.3269534 -2.6540608
  1.3231711 5.0338726 -5.9562182 -10.437971 19.123528
  12.213971 -2.8820174 -20.65914 15.071251 8.114322
  -4.045127 7.5128584 -3.3306584 6.822803 -0.05004288
  -4.4368496 18.926466 14.04377 -5.9657135 4.714744
  10.24277 -3.848245 14.494125 5.3582125 -6.30404
  -14.122616 2.1969411 -5.90989 9.3047 -8.431231
  10.438023 -11.987487 20.954517 -4.279951 -0.3756797
  13.041809 -6.051407 -10.529183 3.7894943 -1.6330183
  6.743382 -0.19549051 7.315633 -19.438568 0.6115422
  4.5697403 2.1208212 0.52282465 -6.9142766 -5.8893275
  0.5135903 0.92921656 -3.0571883 -7.4849505 2.2382743
  -3.0478394 0.08785366 6.810543 -5.1137877 15.182398
  -6.9418297 -8.922732 -2.4528694 7.324874 19.77244
  13.997188 -5.08692 -14.329076 -6.1807523 -1.8777156
  -3.6879017 6.3892293 -3.78877 -13.009837 -16.838747
  -4.1660237 -7.4346085 0.5579437 -2.8482168 -13.509024
  9.329142 8.1292095 -8.064337 -4.002228 -18.78694
  7.7969575 -13.585645 -5.8225474 15.266658 -8.57028
  -7.449079 2.2094946 28.004955 -3.0901644 11.932054
  -1.5897936 -4.826059 6.9232755 -11.169697 -5.235409
  11.251503 2.105524 4.0860977 -0.5384147 19.023642
  1.6203141 -10.608387 ]
# get the score between enroll and test
Eembeddings Score: 0.3965281546115875
```
### 4.Pretrained Models
...
...
demos/speaker_verification/run.sh
#!/bin/bash

wget -c https://paddlespeech.bj.bcebos.com/vector/audio/85236145389.wav
wget -c https://paddlespeech.bj.bcebos.com/vector/audio/123456789.wav

# asr
paddlespeech vector --task spk --input ./85236145389.wav
\ No newline at end of file
# vector
paddlespeech vector --task spk --input ./85236145389.wav

paddlespeech vector --task score --input "./85236145389.wav ./123456789.wav"