PaddlePaddle / DeepSpeech
Commit 940602ad, authored Jan 29, 2022 by Honei_X
convert voxceleb trial to kaldi format trial
Parent: 8891621e
Showing 2 changed files with 89 additions and 0 deletions (+89 -0)
examples/voxceleb/README.md (+8 -0)
examples/voxceleb/sv0/local/make_voxceleb_kaldi_trial.py (+81 -0)
examples/voxceleb/README.md
0 → 100644
For dataset information, refer to [VoxCeleb](https://www.robots.ox.ac.uk/~vgg/data/voxceleb/index.html#about).

sv0 - speaker verification with a softmax backend etc.; all Python code.
      For more information, refer to sv0/readme.txt.

sv1 - depends on kaldi; speaker verification with a plda/sc backend.
      For more information, refer to sv1/readme.txt.
examples/voxceleb/sv0/local/make_voxceleb_kaldi_trial.py
0 → 100644
#!/usr/bin/python3
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Make VoxCeleb1 trial of kaldi format
This script creates the test trial in kaldi trial format from the kaldi trial
voxceleb1_test_v2.txt or the official trial veri_test2.txt.
"""
import argparse
import codecs
import os

parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument("--voxceleb_trial",
                    default="voxceleb1_test_v2",
                    type=str,
                    help="VoxCeleb trial file. By default we use the kaldi trial voxceleb1_test_v2.txt")
parser.add_argument("--trial",
                    default="data/test/trial",
                    type=str,
                    help="Kaldi format trial file")
args = parser.parse_args()


def main(voxceleb_trial, trial):
    """
    VoxCeleb provides several trial files, whose format differs from the kaldi format.

    VoxCeleb format's meaning is as following:
    --------------------------------
    target_or_nontarget path1 path2
    --------------------------------
    target_or_nontarget is an integer: 1 target    path1 is equal to path2
                                       0 nontarget path1 is unequal to path2
    path1: spkr_id/rec_id/name
    path2: spkr_id/rec_id/name

    Kaldi format's meaning is as following:
    ---------------------------------------
    utt_id1 utt_id2 target_or_nontarget
    ---------------------------------------
    utt_id1: utterance identification or speaker identification
    utt_id2: utterance identification or speaker identification
    target_or_nontarget is a string: 'target'    utt_id1 is equal to utt_id2
                                     'nontarget' utt_id1 is unequal to utt_id2
    """
    print("Start convert the voxceleb trial to kaldi format")
    if not os.path.exists(voxceleb_trial):
        raise RuntimeError(
            "{} does not exist. Please input the correct file path".format(voxceleb_trial))

    # Create the output directory if needed. makedirs also handles nested
    # paths such as the default data/test, and the emptiness check covers a
    # bare file name with no directory part.
    trial_dirname = os.path.dirname(trial)
    if trial_dirname and not os.path.exists(trial_dirname):
        os.makedirs(trial_dirname)

    with codecs.open(voxceleb_trial, 'r', encoding='utf-8') as f, \
            codecs.open(trial, 'w', encoding='utf-8') as w:
        for line in f:
            # Each input line is: <0|1> <path1> <path2>
            target_or_nontarget, path1, path2 = line.strip().split()

            # Build the utterance ids by replacing "/" with "-".
            utt_id1 = "-".join(path1.split("/"))
            utt_id2 = "-".join(path2.split("/"))

            # Map the integer label to the kaldi string label.
            target = "nontarget"
            if int(target_or_nontarget):
                target = "target"
            w.write("{} {} {}\n".format(utt_id1, utt_id2, target))
    print("Convert the voxceleb trial to kaldi format successfully")


if __name__ == "__main__":
    main(args.voxceleb_trial, args.trial)
\ No newline at end of file
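
The per-line conversion described in the docstring of `main` can be illustrated on a single trial line. The following is a minimal sketch and not part of the commit: the helper name `convert_trial_line` is hypothetical, and the sample paths are made up purely to show how a VoxCeleb line `label path1 path2` maps to a Kaldi line `utt_id1 utt_id2 target|nontarget`.

```python
def convert_trial_line(line):
    """Convert one VoxCeleb trial line to a Kaldi-format trial line.

    Hypothetical helper for illustration only; it mirrors the loop body of
    make_voxceleb_kaldi_trial.py rather than reproducing the script.
    """
    label, path1, path2 = line.strip().split()
    # As in the script, "/" in each path is replaced with "-" to form the id.
    utt_id1 = path1.replace("/", "-")
    utt_id2 = path2.replace("/", "-")
    # A non-zero integer label means "target"; zero means "nontarget".
    target = "target" if int(label) else "nontarget"
    return "{} {} {}".format(utt_id1, utt_id2, target)


if __name__ == "__main__":
    # Made-up sample lines in the VoxCeleb layout (assumed for this sketch).
    samples = [
        "1 id00001/rec_a/00001.wav id00001/rec_b/00002.wav",
        "0 id00001/rec_a/00001.wav id00002/rec_c/00003.wav",
    ]
    for sample in samples:
        print(convert_trial_line(sample))
```

Note that, as in the script, any file extension in the path (such as `.wav` in these samples) is carried into the utterance id unchanged; whether that matches the ids used elsewhere in a given pipeline is something to check separately.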