PaddlePaddle / DeepSpeech
Commit d21e03c0 (unverified)
Authored Aug 26, 2022 by 小湉湉; committed via GitHub on Aug 26, 2022

update tts3 readme, test=doc (#2315)

Parent: c7163abf

Showing 6 changed files with 33 additions and 26 deletions (+33 −26):

- docs/source/released_model.md (+4 −2)
- examples/aishell3/tts3/README.md (+8 −7)
- examples/aishell3/tts3/local/synthesize_e2e.sh (+3 −3)
- examples/other/g2p/README.md (+1 −1)
- examples/vctk/tts3/README.md (+9 −7)
- examples/zh_en_tts/tts3/README.md (+8 −6)
docs/source/released_model.md

@@ -42,9 +42,11 @@ SpeedySpeech| CSMSC | [speedyspeech-csmsc](https://github.com/PaddlePaddle/Paddl
FastSpeech2| CSMSC | [fastspeech2-csmsc](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/csmsc/tts3)|[fastspeech2_nosil_baker_ckpt_0.4.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_nosil_baker_ckpt_0.4.zip)|[fastspeech2_csmsc_static_0.2.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_csmsc_static_0.2.0.zip)</br>[fastspeech2_csmsc_onnx_0.2.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_csmsc_onnx_0.2.0.zip)|157MB|
FastSpeech2-Conformer| CSMSC | [fastspeech2-csmsc](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/csmsc/tts3)|[fastspeech2_conformer_baker_ckpt_0.5.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_conformer_baker_ckpt_0.5.zip)|||
FastSpeech2-CNNDecoder| CSMSC| [fastspeech2-csmsc](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/csmsc/tts3)|[fastspeech2_cnndecoder_csmsc_ckpt_1.0.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_cnndecoder_csmsc_ckpt_1.0.0.zip)|[fastspeech2_cnndecoder_csmsc_static_1.0.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_cnndecoder_csmsc_static_1.0.0.zip)</br>[fastspeech2_cnndecoder_csmsc_streaming_static_1.0.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_cnndecoder_csmsc_streaming_static_1.0.0.zip)</br>[fastspeech2_cnndecoder_csmsc_onnx_1.0.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_cnndecoder_csmsc_onnx_1.0.0.zip)</br>[fastspeech2_cnndecoder_csmsc_streaming_onnx_1.0.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_cnndecoder_csmsc_streaming_onnx_1.0.0.zip)| 84MB|
-FastSpeech2| AISHELL-3 | [fastspeech2-aishell3](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/aishell3/tts3)|[fastspeech2_nosil_aishell3_ckpt_0.4.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_nosil_aishell3_ckpt_0.4.zip)|[fastspeech2_aishell3_static_1.1.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_aishell3_static_1.1.0.zip)</br>[fastspeech2_aishell3_onnx_1.1.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_aishell3_onnx_1.1.0.zip)|147MB|
+FastSpeech2| AISHELL-3 | [fastspeech2-aishell3](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/aishell3/tts3)|[fastspeech2_aishell3_ckpt_1.1.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_aishell3_ckpt_1.1.0.zip)|[fastspeech2_aishell3_static_1.1.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_aishell3_static_1.1.0.zip)</br>[fastspeech2_aishell3_onnx_1.1.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_aishell3_onnx_1.1.0.zip)|147MB|
FastSpeech2| LJSpeech | [fastspeech2-ljspeech](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/ljspeech/tts3)|[fastspeech2_nosil_ljspeech_ckpt_0.5.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_nosil_ljspeech_ckpt_0.5.zip)|[fastspeech2_ljspeech_static_1.1.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_ljspeech_static_1.1.0.zip)</br>[fastspeech2_ljspeech_onnx_1.1.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_ljspeech_onnx_1.1.0.zip)|145MB|
-FastSpeech2| VCTK | [fastspeech2-vctk](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/vctk/tts3)|[fastspeech2_nosil_vctk_ckpt_0.5.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_nosil_vctk_ckpt_0.5.zip)|[fastspeech2_vctk_static_1.1.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_vctk_static_1.1.0.zip)</br>[fastspeech2_vctk_onnx_1.1.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_vctk_onnx_1.1.0.zip)| 145MB|
+FastSpeech2| VCTK | [fastspeech2-vctk](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/vctk/tts3)|[fastspeech2_vctk_ckpt_1.2.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_vctk_ckpt_1.2.0.zip)|[fastspeech2_vctk_static_1.1.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_vctk_static_1.1.0.zip)</br>[fastspeech2_vctk_onnx_1.1.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_vctk_onnx_1.1.0.zip)| 145MB|
FastSpeech2| ZH_EN | [fastspeech2-zh_en](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/zh_en_tts/tts3)|[fastspeech2_mix_ckpt_1.2.0.zip](https://paddlespeech.bj.bcebos.com/t2s/chinse_english_mixed/models/fastspeech2_mix_ckpt_1.2.0.zip)|[fastspeech2_mix_static_0.2.0.zip](https://paddlespeech.bj.bcebos.com/t2s/chinse_english_mixed/models/fastspeech2_mix_static_0.2.0.zip)</br>[fastspeech2_mix_onnx_0.2.0.zip](https://paddlespeech.bj.bcebos.com/t2s/chinse_english_mixed/models/fastspeech2_mix_onnx_0.2.0.zip)| 145MB|

### Vocoders
Model Type | Dataset| Example Link | Pretrained Models| Static/ONNX Models|Size (static)
examples/aishell3/tts3/README.md

@@ -229,9 +229,11 @@ The ONNX model can be downloaded here:
FastSpeech2 checkpoint contains files listed below.
-fastspeech2_nosil_aishell3_ckpt_0.4
+fastspeech2_aishell3_ckpt_1.1.0
├── default.yaml             # default config used to train fastspeech2
├── energy_stats.npy         # statistics used to normalize energy when training fastspeech2
├── phone_id_map.txt         # phone vocabulary file when training fastspeech2
├── pitch_stats.npy          # statistics used to normalize pitch when training fastspeech2
├── snapshot_iter_96400.pdz  # model parameters and optimizer states
├── speaker_id_map.txt       # speaker id map file when training a multi-speaker fastspeech2
└── speech_stats.npy         # statistics used to normalize spectrogram when training fastspeech2
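The `*_stats.npy` files in a checkpoint hold the statistics used to normalize energy, pitch, and spectrogram features. A minimal sketch of how such statistics could be applied; the stacked mean/std layout here is an assumption for illustration, not PaddleSpeech's exact scaler format:

```python
import numpy as np

def normalize(feats: np.ndarray, stats_path: str) -> np.ndarray:
    """Z-score normalize features with precomputed statistics.

    Assumes (for illustration) the .npy file stores two stacked
    vectors: a mean row and a std row.
    """
    mean, std = np.load(stats_path)
    return (feats - mean) / std

# Toy round-trip with synthetic statistics for an 80-bin mel feature.
stats = np.stack([np.full(80, 2.0), np.full(80, 0.5)])
np.save("speech_stats.npy", stats)
mel = np.full((10, 80), 3.0)
out = normalize(mel, "speech_stats.npy")
print(out[0, 0])  # (3.0 - 2.0) / 0.5 = 2.0
```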
@@ -244,9 +246,9 @@ FLAGS_allocator_strategy=naive_best_fit \
FLAGS_fraction_of_gpu_memory_to_use=0.01 \
python3 ${BIN_DIR}/../synthesize_e2e.py \
  --am=fastspeech2_aishell3 \
-  --am_config=fastspeech2_nosil_aishell3_ckpt_0.4/default.yaml \
-  --am_ckpt=fastspeech2_nosil_aishell3_ckpt_0.4/snapshot_iter_96400.pdz \
-  --am_stat=fastspeech2_nosil_aishell3_ckpt_0.4/speech_stats.npy \
+  --am_config=fastspeech2_aishell3_ckpt_1.1.0/default.yaml \
+  --am_ckpt=fastspeech2_aishell3_ckpt_1.1.0/snapshot_iter_96400.pdz \
+  --am_stat=fastspeech2_aishell3_ckpt_1.1.0/speech_stats.npy \
  --voc=pwgan_aishell3 \
  --voc_config=pwg_aishell3_ckpt_0.5/default.yaml \
  --voc_ckpt=pwg_aishell3_ckpt_0.5/snapshot_iter_1000000.pdz \
@@ -254,9 +256,8 @@ python3 ${BIN_DIR}/../synthesize_e2e.py \
  --lang=zh \
  --text=${BIN_DIR}/../sentences.txt \
  --output_dir=exp/default/test_e2e \
-  --phones_dict=fastspeech2_nosil_aishell3_ckpt_0.4/phone_id_map.txt \
-  --speaker_dict=fastspeech2_nosil_aishell3_ckpt_0.4/speaker_id_map.txt \
+  --phones_dict=fastspeech2_aishell3_ckpt_1.1.0/phone_id_map.txt \
+  --speaker_dict=fastspeech2_aishell3_ckpt_1.1.0/speaker_id_map.txt \
  --spk_id=0 \
  --inference_dir=exp/default/inference
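The `--phones_dict` and `--speaker_dict` arguments point at plain-text vocabulary files from the checkpoint. A small parser sketch, assuming one `symbol id` pair per line (a layout assumed for illustration; verify against the actual checkpoint files):

```python
def load_id_map(path: str) -> dict:
    """Parse a `*_id_map.txt`-style file into {symbol: id}.

    Assumes one whitespace-separated `symbol id` pair per line.
    """
    mapping = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            symbol, idx = line.split()
            mapping[symbol] = int(idx)
    return mapping

# Toy file standing in for a phone_id_map.txt; entries are hypothetical.
with open("phone_id_map.txt", "w", encoding="utf-8") as f:
    f.write("<pad> 0\n<unk> 1\nii1 2\n")
print(load_id_map("phone_id_map.txt"))
```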
examples/aishell3/tts3/local/synthesize_e2e.sh

@@ -38,7 +38,7 @@ if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
  --am=fastspeech2_aishell3 \
  --am_config=${config_path} \
  --am_ckpt=${train_output_path}/checkpoints/${ckpt_name} \
-  --am_stat=fastspeech2_nosil_aishell3_ckpt_0.4/speech_stats.npy \
+  --am_stat=dump/train/speech_stats.npy \
  --voc=hifigan_aishell3 \
  --voc_config=hifigan_aishell3_ckpt_0.2.0/default.yaml \
  --voc_ckpt=hifigan_aishell3_ckpt_0.2.0/snapshot_iter_2500000.pdz \
@@ -46,8 +46,8 @@ if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
  --lang=zh \
  --text=${BIN_DIR}/../sentences.txt \
  --output_dir=${train_output_path}/test_e2e \
-  --phones_dict=fastspeech2_nosil_aishell3_ckpt_0.4/phone_id_map.txt \
-  --speaker_dict=fastspeech2_nosil_aishell3_ckpt_0.4/speaker_id_map.txt \
+  --phones_dict=dump/phone_id_map.txt \
+  --speaker_dict=dump/speaker_id_map.txt \
  --spk_id=0 \
  --inference_dir=${train_output_path}/inference
fi
examples/other/g2p/README.md

@@ -12,7 +12,7 @@ Run the command below to get the results of the test.
./run.sh
-The `avg WER` of g2p is: 0.024219452438490413
+The `avg WER` of g2p is: 0.024169315564825305
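The `avg WER` figure above is an edit-distance metric over predicted vs. reference phone sequences. A minimal Levenshtein-based sketch of how WER is computed (an illustration, not the repo's actual scoring script):

```python
def wer(ref: list, hyp: list) -> float:
    """Word error rate: (substitutions + insertions + deletions) / len(ref)."""
    # d[i][j] = edit distance between ref[:i] and hyp[:j].
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # delete all of ref[:i]
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insert all of hyp[:j]
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution / match
    return d[len(ref)][len(hyp)] / len(ref)

# One substituted phone out of three (toy pinyin-style tokens).
print(wer("ka3 er3 pu3".split(), "ka3 er2 pu3".split()))  # 0.333...
```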
examples/vctk/tts3/README.md

@@ -216,7 +216,7 @@ optional arguments:
## Pretrained Model
Pretrained FastSpeech2 model with no silence in the edge of audios:
-- [fastspeech2_nosil_vctk_ckpt_0.5.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_nosil_vctk_ckpt_0.5.zip)
+- [fastspeech2_vctk_ckpt_1.2.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_vctk_ckpt_1.2.0.zip)
The static model can be downloaded here:
- [fastspeech2_vctk_static_1.1.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_vctk_static_1.1.0.zip)

@@ -226,9 +226,11 @@ The ONNX model can be downloaded here:
FastSpeech2 checkpoint contains files listed below.
-fastspeech2_nosil_vctk_ckpt_0.5
+fastspeech2_vctk_ckpt_1.2.0
├── default.yaml             # default config used to train fastspeech2
├── energy_stats.npy         # statistics used to normalize energy when training fastspeech2
├── phone_id_map.txt         # phone vocabulary file when training fastspeech2
├── pitch_stats.npy          # statistics used to normalize pitch when training fastspeech2
├── snapshot_iter_66200.pdz  # model parameters and optimizer states
├── speaker_id_map.txt       # speaker id map file when training a multi-speaker fastspeech2
└── speech_stats.npy         # statistics used to normalize spectrogram when training fastspeech2

@@ -241,9 +243,9 @@ FLAGS_allocator_strategy=naive_best_fit \
FLAGS_fraction_of_gpu_memory_to_use=0.01 \
python3 ${BIN_DIR}/../synthesize_e2e.py \
  --am=fastspeech2_vctk \
-  --am_config=fastspeech2_nosil_vctk_ckpt_0.5/default.yaml \
-  --am_ckpt=fastspeech2_nosil_vctk_ckpt_0.5/snapshot_iter_66200.pdz \
-  --am_stat=fastspeech2_nosil_vctk_ckpt_0.5/speech_stats.npy \
+  --am_config=fastspeech2_vctk_ckpt_1.2.0/default.yaml \
+  --am_ckpt=fastspeech2_vctk_ckpt_1.2.0/snapshot_iter_66200.pdz \
+  --am_stat=fastspeech2_vctk_ckpt_1.2.0/speech_stats.npy \
  --voc=pwgan_vctk \
  --voc_config=pwg_vctk_ckpt_0.1.1/default.yaml \
  --voc_ckpt=pwg_vctk_ckpt_0.1.1/snapshot_iter_1500000.pdz \
@@ -251,8 +253,8 @@ python3 ${BIN_DIR}/../synthesize_e2e.py \
  --lang=en \
  --text=${BIN_DIR}/../sentences_en.txt \
  --output_dir=exp/default/test_e2e \
-  --phones_dict=fastspeech2_nosil_vctk_ckpt_0.5/phone_id_map.txt \
-  --speaker_dict=fastspeech2_nosil_vctk_ckpt_0.5/speaker_id_map.txt \
+  --phones_dict=fastspeech2_vctk_ckpt_1.2.0/phone_id_map.txt \
+  --speaker_dict=fastspeech2_vctk_ckpt_1.2.0/speaker_id_map.txt \
  --spk_id=0 \
  --inference_dir=exp/default/inference
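Checkpoint names such as `snapshot_iter_66200.pdz` encode the training iteration, which is why the `--am_ckpt` paths change with each released model. A small helper (hypothetical, for illustration; the name pattern is taken from the listings above) to recover that number, e.g. to pick the most-trained snapshot in a directory:

```python
import re

def snapshot_iteration(filename: str) -> int:
    """Extract N from a checkpoint named snapshot_iter_<N>.pdz."""
    m = re.fullmatch(r"snapshot_iter_(\d+)\.pdz", filename)
    if m is None:
        raise ValueError(f"unexpected checkpoint name: {filename}")
    return int(m.group(1))

# Pick the most-trained checkpoint from a list of names.
names = ["snapshot_iter_66200.pdz", "snapshot_iter_96400.pdz"]
latest = max(names, key=snapshot_iteration)
print(latest)  # snapshot_iter_96400.pdz
```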
examples/zh_en_tts/tts3/README.md

@@ -262,9 +262,11 @@ The ONNX model can be downloaded here:
FastSpeech2 checkpoint contains files listed below.
-fastspeech2_mix_ckpt_0.2.0
+fastspeech2_mix_ckpt_1.2.0
├── default.yaml             # default config used to train fastspeech2
├── energy_stats.npy         # statistics used to energy spectrogram when training fastspeech2
├── phone_id_map.txt         # phone vocabulary file when training fastspeech2
├── pitch_stats.npy          # statistics used to normalize pitch when training fastspeech2
├── snapshot_iter_99200.pdz  # model parameters and optimizer states
├── speaker_id_map.txt       # speaker id map file when training a multi-speaker fastspeech2
└── speech_stats.npy         # statistics used to normalize spectrogram when training fastspeech2

@@ -281,9 +283,9 @@ FLAGS_allocator_strategy=naive_best_fit \
FLAGS_fraction_of_gpu_memory_to_use=0.01 \
python3 ${BIN_DIR}/../synthesize_e2e.py \
  --am=fastspeech2_mix \
-  --am_config=fastspeech2_mix_ckpt_0.2.0/default.yaml \
-  --am_ckpt=fastspeech2_mix_ckpt_0.2.0/snapshot_iter_99200.pdz \
-  --am_stat=fastspeech2_mix_ckpt_0.2.0/speech_stats.npy \
+  --am_config=fastspeech2_mix_ckpt_1.2.0/default.yaml \
+  --am_ckpt=fastspeech2_mix_ckpt_1.2.0/snapshot_iter_99200.pdz \
+  --am_stat=fastspeech2_mix_ckpt_1.2.0/speech_stats.npy \
  --voc=pwgan_aishell3 \
  --voc_config=pwg_aishell3_ckpt_0.5/default.yaml \
  --voc_ckpt=pwg_aishell3_ckpt_0.5/snapshot_iter_1000000.pdz \
@@ -291,8 +293,8 @@ python3 ${BIN_DIR}/../synthesize_e2e.py \
  --lang=mix \
  --text=${BIN_DIR}/../sentences_mix.txt \
  --output_dir=exp/default/test_e2e \
-  --phones_dict=fastspeech2_mix_ckpt_0.2.0/phone_id_map.txt \
-  --speaker_dict=fastspeech2_mix_ckpt_0.2.0/speaker_id_map.txt \
+  --phones_dict=fastspeech2_mix_ckpt_1.2.0/phone_id_map.txt \
+  --speaker_dict=fastspeech2_mix_ckpt_1.2.0/speaker_id_map.txt \
  --spk_id=174 \
  --inference_dir=exp/default/inference
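`--lang=mix` tells synthesize_e2e.py that the input sentences are code-switched Chinese/English. As a toy illustration only (not PaddleSpeech's actual text frontend), one way to segment such text into per-language runs is by Unicode range:

```python
def split_mix(text: str) -> list:
    """Split a code-switched sentence into (lang, run) chunks.

    Toy heuristic: CJK Unified Ideographs are tagged "zh",
    everything else "en". Real frontends are far more careful.
    """
    chunks = []
    for ch in text:
        lang = "zh" if "\u4e00" <= ch <= "\u9fff" else "en"
        if chunks and chunks[-1][0] == lang:
            chunks[-1] = (lang, chunks[-1][1] + ch)  # extend current run
        else:
            chunks.append((lang, ch))  # start a new run
    return chunks

print(split_mix("我爱speech合成"))
```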