PaddlePaddle
DeepSpeech
Commit f43d0260 (unverified)
Authored Nov 09, 2022 by HuangLiangJie
Committed by GitHub, Nov 09, 2022
Add rhythm tags for MFA, test=tts (#2615)
* Add rhythm tags for MFA, test=tts
Parent: fd73a184
Showing 3 changed files with 78 additions and 3 deletions
examples/other/mfa/README.md (+3 -0)
examples/other/mfa/local/generate_lexicon.py (+5 -0)
examples/other/mfa/local/reorganize_baker.py (+70 -3)
examples/other/mfa/README.md
@@ -4,3 +4,6 @@ Run the following script to get started, for more detail, please see `run.sh`.
 ```bash
 ./run.sh
 ```
+# Rhythm tags for MFA
+If you want to get rhythm tags with duration through the MFA tool, add the flag `--rhy-with-duration` to the first two commands in `run.sh`.
+Note that only the CSMSC dataset is supported so far, and `#` is replaced with `sp` in the rhythm tags for MFA.
examples/other/mfa/local/generate_lexicon.py
@@ -182,12 +182,17 @@ if __name__ == "__main__":
         "--with-tone", action="store_true", help="whether to consider tone.")
     parser.add_argument(
         "--with-r", action="store_true", help="whether to consider erhua.")
+    parser.add_argument(
+        "--rhy-with-duration",
+        action="store_true", )
     args = parser.parse_args()

     lexicon = generate_lexicon(args.with_tone, args.with_r)
     symbols = generate_symbols(lexicon)

     with open(args.output + ".lexicon", 'wt') as f:
+        if args.rhy_with_duration:
+            f.write("sp1 sp1\nsp2 sp2\nsp3 sp3\nsp4 sp4\n")
         for k, v in lexicon.items():
             f.write(f"{k} {v}\n")
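The change above prepends four rhythm entries to the lexicon file, so that `sp1` to `sp4` are treated as words that map to themselves. A minimal sketch of the resulting file layout follows. The helper name `write_lexicon` and the two toy lexicon entries are assumptions made for illustration and are not part of the commit.

```python
import io


def write_lexicon(f, lexicon, rhy_with_duration=False):
    """Write 'word pronunciation' lines, optionally preceded by rhythm entries."""
    if rhy_with_duration:
        # Each rhythm tag is listed as a word whose pronunciation is itself.
        for level in range(1, 5):
            f.write(f"sp{level} sp{level}\n")
    for word, phones in lexicon.items():
        f.write(f"{word} {phones}\n")


# Toy entries, invented for this example only.
toy_lexicon = {"ni3": "n i3", "hao3": "h ao3"}

buf = io.StringIO()
write_lexicon(buf, toy_lexicon, rhy_with_duration=True)
print(buf.getvalue(), end="")
```

Writing to an in-memory buffer keeps the example free of file-system side effects; the commit itself writes to `args.output + ".lexicon"`.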
examples/other/mfa/local/reorganize_baker.py
@@ -23,6 +23,7 @@ for more details.
 """
 import argparse
 import os
+import re
 import shutil
 from concurrent.futures import ThreadPoolExecutor
 from pathlib import Path
@@ -32,6 +33,22 @@ import librosa
 import soundfile as sf
 from tqdm import tqdm

+repalce_dict = {
+    ";": "",
+    "。": "",
+    ":": "",
+    "—": "",
+    ")": "",
+    ",": "",
+    "“": "",
+    "(": "",
+    "、": "",
+    "…": "",
+    "!": "",
+    "?": "",
+    "”": ""
+}
+

 def get_transcripts(path: Union[str, Path]):
     transcripts = {}
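The punctuation table added in this hunk is later applied through `str.translate`. A minimal sketch of that mechanism is shown here, using a shortened, hypothetical table named `PUNCT_TO_DROP` rather than the commit's full `repalce_dict`.

```python
# Hypothetical, shortened table; the commit's repalce_dict lists more marks.
PUNCT_TO_DROP = {",": "", "。": "", "?": ""}

# str.maketrans accepts a dict whose keys are single characters;
# mapping a key to "" deletes that character during translate().
table = str.maketrans(PUNCT_TO_DROP)

cleaned = "你好,世界。".translate(table)
print(cleaned)  # 你好世界
```

Because `str.maketrans` requires single-character keys in a dict, this approach suits a fixed list of punctuation marks but not multi-character tokens such as `#1`.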
@@ -55,8 +72,12 @@ def resample_and_save(source, target, sr=16000):
 def reorganize_baker(root_dir: Union[str, Path],
                      output_dir: Union[str, Path]=None,
-                     resample_audio=False):
+                     resample_audio=False,
+                     rhy_dur=False):
     root_dir = Path(root_dir).expanduser()
+    if rhy_dur:
+        transcript_path = root_dir / "ProsodyLabeling" / "000001-010000_rhy.txt"
+    else:
         transcript_path = root_dir / "ProsodyLabeling" / "000001-010000.txt"
     transcriptions = get_transcripts(transcript_path)
@@ -92,6 +113,46 @@ def reorganize_baker(root_dir: Union[str, Path],
     print("Done!")


+def insert_rhy(sentence_first, sentence_second):
+    sub = '#'
+    return_words = []
+    sentence_first = sentence_first.translate(str.maketrans(repalce_dict))
+    rhy_idx = [substr.start() for substr in re.finditer(sub, sentence_first)]
+    re_rhy_idx = []
+    sentence_first_ = sentence_first.replace("#1", "").replace(
+        "#2", "").replace("#3", "").replace("#4", "")
+    sentence_seconds = sentence_second.split(" ")
+    for i, w in enumerate(rhy_idx):
+        re_rhy_idx.append(w - i * 2)
+    i = 0
+    # print("re_rhy_idx: ", re_rhy_idx)
+    for sentence_s in (sentence_seconds):
+        return_words.append(sentence_s)
+        if i < len(re_rhy_idx) and len(return_words) - i == re_rhy_idx[i]:
+            return_words.append("sp" + sentence_first[rhy_idx[i] + 1:rhy_idx[i] + 2])
+            i = i + 1
+    return return_words
+
+
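To make the effect of `insert_rhy` concrete, here is a simplified re-implementation of the same idea, not the commit's code. It assumes that punctuation has already been stripped, that every character outside a `#N` tag corresponds to exactly one pinyin syllable, and that tag levels run from 1 to 4. The name `interleave_rhythm` and the sample sentence are illustrative.

```python
import re


def interleave_rhythm(annotated: str, pinyin: str) -> list:
    """Insert 'spN' after the syllable that precedes each '#N' tag."""
    syllables = pinyin.split()
    out = []
    pos = 0  # index of the next syllable to emit
    # Tokenize into '#N' tags and single characters, in order of appearance.
    for token in re.findall(r"#[1-4]|.", annotated):
        if len(token) == 2 and token.startswith("#"):
            out.append("sp" + token[1])
        elif pos < len(syllables):
            out.append(syllables[pos])
            pos += 1
    return out


result = interleave_rhythm(
    "卡尔普#2陪外孙#1玩滑梯#4",
    "ka2 er2 pu3 pei2 wai4 sun1 wan2 hua2 ti1")
print(" ".join(result))
# ka2 er2 pu3 sp2 pei2 wai4 sun1 sp1 wan2 hua2 ti1 sp4
```

The commit reaches the same alignment differently: it records the character offset of each `#` and subtracts two characters for every earlier tag. Walking the tokens in order, as in this sketch, avoids that offset bookkeeping.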
+def normalize_rhy(root_dir: Union[str, Path]):
+    root_dir = Path(root_dir).expanduser()
+    transcript_path = root_dir / "ProsodyLabeling" / "000001-010000.txt"
+    target_transcript_path = root_dir / "ProsodyLabeling" / "000001-010000_rhy.txt"
+
+    with open(transcript_path) as f:
+        lines = f.readlines()
+
+    with open(target_transcript_path, 'wt') as f:
+        for i in range(0, len(lines), 2):
+            sentence_first = lines[i]  # save the first line as is
+            f.write(sentence_first)
+            transcription = lines[i + 1].strip()
+            f.write("\t" + " ".join(
+                insert_rhy(sentence_first.split('\t')[1], transcription)) + "\n")
+
+
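`normalize_rhy` reads the transcript two lines at a time: a header line holding an utterance ID and annotated text separated by a tab, followed by a tab-indented pinyin line. The sketch below shows one way to iterate over such pairs. The generator name `iter_transcript_pairs` and the sample lines are assumptions made for illustration.

```python
def iter_transcript_pairs(lines):
    """Yield (utterance_id, annotated_text, pinyin) for each two-line record."""
    # zip() stops at the shorter slice, so a dangling final line is skipped
    # instead of raising IndexError.
    for header, pinyin in zip(lines[0::2], lines[1::2]):
        utt_id, text = header.rstrip("\n").split("\t", 1)
        yield utt_id, text, pinyin.strip()


# Sample lines in the two-line layout described above (illustrative).
sample = [
    "000001\t卡尔普#2陪外孙#1玩滑梯#4。\n",
    "\tka2 er2 pu3 pei2 wai4 sun1 wan2 hua2 ti1\n",
]

records = list(iter_transcript_pairs(sample))
for utt_id, text, pinyin in records:
    print(utt_id, text, pinyin)
```

Pairing the slices with `zip` is a design choice for this sketch only. The commit indexes `lines[i + 1]` directly, which would fail on a file with an odd number of lines.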
 if __name__ == "__main__":
     parser = argparse.ArgumentParser(
         description="Reorganize Baker dataset for MFA")
@@ -104,6 +165,12 @@ if __name__ == "__main__":
         "--resample-audio",
         action="store_true",
         help="To resample audio files or just copy them")
+    parser.add_argument(
+        "--rhy-with-duration",
+        action="store_true", )
     args = parser.parse_args()

-    reorganize_baker(args.root_dir, args.output_dir, args.resample_audio)
+    if args.rhy_with_duration:
+        normalize_rhy(args.root_dir)
+    reorganize_baker(args.root_dir, args.output_dir, args.resample_audio,
+                     args.rhy_with_duration)
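Both scripts expose the new behaviour through a `store_true` flag. The following sketch shows how `argparse` turns `--rhy-with-duration` into the attribute `rhy_with_duration`, which defaults to `False` when the flag is absent. The parser description and help text are placeholders, not taken from the commit.

```python
import argparse

parser = argparse.ArgumentParser(description="Flag demo (placeholder text)")
parser.add_argument(
    "--rhy-with-duration",
    action="store_true",
    help="placeholder help text: enable rhythm tags")

# argparse replaces dashes with underscores in the attribute name.
with_flag = parser.parse_args(["--rhy-with-duration"])
without_flag = parser.parse_args([])

print(with_flag.rhy_with_duration)     # True
print(without_flag.rhy_with_duration)  # False
```

Since the default is `False`, running either script without the flag passes `False` through to the new code paths.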