PaddlePaddle / DeepSpeech
Commit 71bda244 (unverified), authored on Feb 15, 2023 by lance6716; committed via GitHub on Feb 15, 2023.
[TTS]Fix canton (#2924)
* Update run.sh
* Update README.md
Parent: 9db75af2

2 changed files, with 1 addition and 74 deletions (+1, -74):
- examples/canton/tts3/README.md (+1, -41)
- examples/canton/tts3/run.sh (+0, -33)
examples/canton/tts3/README.md

````diff
@@ -74,44 +74,4 @@ Also, there is a `metadata.jsonl` in each subfolder. It is a table-like file tha
 ### Training details can refer to the script of [examples/aishell3/tts3](../../aishell3/tts3).
-## Pretrained Model(Waiting========)
+## Pretrained Model
-Pretrained FastSpeech2 model with no silence in the edge of audios:
-- [fastspeech2_aishell3_ckpt_1.1.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_aishell3_ckpt_1.1.0.zip)
-- [fastspeech2_conformer_aishell3_ckpt_0.2.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_conformer_aishell3_ckpt_0.2.0.zip) (Thanks for [@awmmmm](https://github.com/awmmmm)'s contribution)
-
-FastSpeech2 checkpoint contains files listed below.
-```text
-fastspeech2_aishell3_ckpt_1.1.0
-├── default.yaml             # default config used to train fastspeech2
-├── energy_stats.npy         # statistics used to normalize energy when training fastspeech2
-├── phone_id_map.txt         # phone vocabulary file when training fastspeech2
-├── pitch_stats.npy          # statistics used to normalize pitch when training fastspeech2
-├── snapshot_iter_96400.pdz  # model parameters and optimizer states
-├── speaker_id_map.txt       # speaker id map file when training a multi-speaker fastspeech2
-└── speech_stats.npy         # statistics used to normalize spectrogram when training fastspeech2
-```
-You can use the following scripts to synthesize for `${BIN_DIR}/../sentences.txt` using pretrained fastspeech2 and parallel wavegan models.
-```bash
-source path.sh
-
-FLAGS_allocator_strategy=naive_best_fit \
-FLAGS_fraction_of_gpu_memory_to_use=0.01 \
-python3 ${BIN_DIR}/../synthesize_e2e.py \
-    --am=fastspeech2_aishell3 \
-    --am_config=fastspeech2_aishell3_ckpt_1.1.0/default.yaml \
-    --am_ckpt=fastspeech2_aishell3_ckpt_1.1.0/snapshot_iter_96400.pdz \
-    --am_stat=fastspeech2_aishell3_ckpt_1.1.0/speech_stats.npy \
-    --voc=pwgan_aishell3 \
-    --voc_config=pwg_aishell3_ckpt_0.5/default.yaml \
-    --voc_ckpt=pwg_aishell3_ckpt_0.5/snapshot_iter_1000000.pdz \
-    --voc_stat=pwg_aishell3_ckpt_0.5/feats_stats.npy \
-    --lang=zh \
-    --text=${BIN_DIR}/../sentences.txt \
-    --output_dir=exp/default/test_e2e \
-    --phones_dict=fastspeech2_aishell3_ckpt_1.1.0/phone_id_map.txt \
-    --speaker_dict=fastspeech2_aishell3_ckpt_1.1.0/speaker_id_map.txt \
-    --spk_id=0 \
-    --inference_dir=exp/default/inference
-```
````
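The removed synthesis command relies on Bash's per-command environment assignments: each `FLAGS_*=value` prefix exports that variable only into the environment of the single `python3` invocation that follows, leaving the calling shell untouched. A minimal demonstration of the pattern, with `bash -c 'echo …'` standing in for the real `synthesize_e2e.py`:

```shell
#!/usr/bin/env bash
# Per-command environment assignments: VAR=value cmd exports VAR
# only into the environment of that one command.
FLAGS_allocator_strategy=naive_best_fit \
FLAGS_fraction_of_gpu_memory_to_use=0.01 \
bash -c 'echo "strategy=${FLAGS_allocator_strategy} frac=${FLAGS_fraction_of_gpu_memory_to_use}"'

# The calling shell itself never sees the variables:
echo "outside: ${FLAGS_allocator_strategy:-unset}"
```

The first `echo` prints `strategy=naive_best_fit frac=0.01`; the second prints `outside: unset`, confirming the assignments were scoped to the one command.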
examples/canton/tts3/run.sh

```diff
@@ -35,36 +35,3 @@ if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
     # synthesize_e2e, vocoder is pwgan by default
     CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
 fi
-
-if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then
-    # inference with static model, vocoder is pwgan by default
-    CUDA_VISIBLE_DEVICES=${gpus} ./local/inference.sh ${train_output_path} || exit -1
-fi
-
-if [ ${stage} -le 5 ] && [ ${stop_stage} -ge 5 ]; then
-    # install paddle2onnx
-    version=$(echo `pip list |grep "paddle2onnx"` |awk -F" " '{print $2}')
-    if [[ -z "$version" || ${version} != '1.0.0' ]]; then
-        pip install paddle2onnx==1.0.0
-    fi
-    ./local/paddle2onnx.sh ${train_output_path} inference inference_onnx fastspeech2_aishell3
-    # considering the balance between speed and quality, we recommend that you use hifigan as vocoder
-    ./local/paddle2onnx.sh ${train_output_path} inference inference_onnx pwgan_aishell3
-    # ./local/paddle2onnx.sh ${train_output_path} inference inference_onnx hifigan_aishell3
-fi
-
-# inference with onnxruntime, use fastspeech2 + pwgan by default
-if [ ${stage} -le 6 ] && [ ${stop_stage} -ge 6 ]; then
-    ./local/ort_predict.sh ${train_output_path}
-fi
-
-if [ ${stage} -le 7 ] && [ ${stop_stage} -ge 7 ]; then
-    ./local/export2lite.sh ${train_output_path} inference pdlite fastspeech2_aishell3 x86
-    ./local/export2lite.sh ${train_output_path} inference pdlite pwgan_aishell3 x86
-    # ./local/export2lite.sh ${train_output_path} inference pdlite hifigan_aishell3 x86
-fi
-
-if [ ${stage} -le 8 ] && [ ${stop_stage} -ge 8 ]; then
-    CUDA_VISIBLE_DEVICES=${gpus} ./local/lite_predict.sh ${train_output_path} || exit -1
-fi
```
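All of the removed stages use the same `stage`/`stop_stage` gating as the rest of `run.sh`: stage N executes only when `stage <= N <= stop_stage`, so a contiguous range of stages can be selected from the command line. A minimal sketch of how that selection works (the `stage=4`, `stop_stage=6` values here are illustrative, not taken from the script):

```shell
#!/usr/bin/env bash
# Stage gating as in run.sh: stage N runs iff stage <= N <= stop_stage.
stage=4        # first stage to run (illustrative value)
stop_stage=6   # last stage to run (illustrative value)

ran=""
for n in 3 4 5 6 7 8; do
    if [ ${stage} -le ${n} ] && [ ${stop_stage} -ge ${n} ]; then
        ran="${ran} ${n}"
    fi
done
echo "stages run:${ran}"
```

Running it prints `stages run: 4 5 6`: stages below `stage` and above `stop_stage` are skipped, which is why deleting the stage-4-through-8 blocks caps this recipe at the synthesize_e2e stage.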