diff --git a/README.md b/README.md
index 5c62925e20ae1e2f1cc14909fc45b127201f0e3f..2fb281e7f83637a5cbc90bd1d75358870c3c41eb 100644
--- a/README.md
+++ b/README.md
@@ -159,15 +159,20 @@ Via the easy-to-use, efficient, flexible and scalable implementation, our vision
 - 🧩 *Cascaded models application*: as an extension of the typical traditional audio tasks, we combine the workflows of the aforementioned tasks with other fields like Natural language processing (NLP) and Computer Vision (CV).
 ### Recent Update
-- 👑 2022.05.13: Release [PP-ASR](./docs/source/asr/PPASR.md)、[PP-TTS](./docs/source/tts/PPTTS.md)、[PP-VPR](docs/source/vpr/PPVPR.md)
-- 👏🏻 2022.05.06: `Streaming ASR` with `Punctuation Restoration` and `Token Timestamp`.
-- 👏🏻 2022.05.06: `Server` is available for `Speaker Verification`, and `Punctuation Restoration`.
-- 👏🏻 2022.04.28: `Streaming Server` is available for `Automatic Speech Recognition` and `Text-to-Speech`.
-- 👏🏻 2022.03.28: `Server` is available for `Audio Classification`, `Automatic Speech Recognition` and `Text-to-Speech`.
-- 👏🏻 2022.03.28: `CLI` is available for `Speaker Verification`.
+- ⚡ 2022.08.25: Release TTS [finetune](./examples/other/tts_finetune/tts3) example.
+- 🔥 2022.08.22: Add ERNIE-SAT models: [ERNIE-SAT-vctk](./examples/vctk/ernie_sat), [ERNIE-SAT-aishell3](./examples/aishell3/ernie_sat), [ERNIE-SAT-zh_en](./examples/aishell3_vctk/ernie_sat).
+- 🔥 2022.08.15: Add [g2pW](https://github.com/GitYCC/g2pW) to the TTS Chinese text frontend.
+- 🔥 2022.08.09: Release [Chinese-English mixed TTS](./examples/zh_en_tts/tts3).
+- ⚡ 2022.08.03: Add ONNXRuntime inference for TTS CLI.
+- 🎉 2022.07.18: Release VITS: [VITS-csmsc](./examples/csmsc/vits), [VITS-aishell3](./examples/aishell3/vits), [VITS-VC](./examples/aishell3/vits-vc).
+- 🎉 2022.06.22: All TTS models support the ONNX format.
+- 🍀 2022.06.17: Add [PaddleSpeech Web Demo](./demos/speech_web).
+- 👑 2022.05.13: Release [PP-ASR](./docs/source/asr/PPASR.md), [PP-TTS](./docs/source/tts/PPTTS.md), [PP-VPR](docs/source/vpr/PPVPR.md).
+- 👏🏻 2022.05.06: `PaddleSpeech Streaming Server` is available for `Streaming ASR` (with `Punctuation Restoration` and `Token Timestamp`) and `Text-to-Speech`.
+- 👏🏻 2022.05.06: `PaddleSpeech Server` is available for `Audio Classification`, `Automatic Speech Recognition`, `Text-to-Speech`, `Speaker Verification` and `Punctuation Restoration`.
+- 👏🏻 2022.03.28: `PaddleSpeech CLI` is available for `Speaker Verification`.
 - 🤗 2021.12.14: [ASR](https://huggingface.co/spaces/KPatrick/PaddleSpeechASR) and [TTS](https://huggingface.co/spaces/KPatrick/PaddleSpeechTTS) Demos on Hugging Face Spaces are available!
-- 👏🏻 2021.12.10: `CLI` is available for `Audio Classification`, `Automatic Speech Recognition`, `Speech Translation (English to Chinese)` and `Text-to-Speech`.
-
+- 👏🏻 2021.12.10: `PaddleSpeech CLI` is available for `Audio Classification`, `Automatic Speech Recognition`, `Speech Translation (English to Chinese)` and `Text-to-Speech`.
 ### Community
 - Scan the QR code below with your Wechat, you can access to official technical exchange group and get the bonus ( more than 20GB learning materials, such as papers, codes and videos ) and the live link of the lessons. Look forward to your participation.
@@ -599,49 +604,56 @@ PaddleSpeech supports a series of most popular models. They are summarized in [r
- HiFiGAN - LJSpeech / VCTK / CSMSC / AISHELL-3 + HiFiGAN + LJSpeech / VCTK / CSMSC / AISHELL-3 HiFiGAN-ljspeech / HiFiGAN-vctk / HiFiGAN-csmsc / HiFiGAN-aishell3 - WaveRNN - CSMSC + WaveRNN + CSMSC WaveRNN-csmsc - Voice Cloning + Voice Cloning GE2E Librispeech, etc. 
- ge2e + GE2E + + + + SV2TTS (GE2E + Tacotron2) + AISHELL-3 + + VC0 - GE2E + Tacotron2 + SV2TTS (GE2E + FastSpeech2) AISHELL-3 - ge2e-tacotron2-aishell3 + VC1 - GE2E + FastSpeech2 + SV2TTS (ECAPA-TDNN + FastSpeech2) AISHELL-3 - ge2e-fastspeech2-aishell3 + VC2 GE2E + VITS AISHELL-3 - ge2e-vits-aishell3 + VITS-VC - + End-to-End VITS CSMSC / AISHELL-3 @@ -876,8 +888,9 @@ You are warmly welcome to submit questions in [discussions](https://github.com/P

 ## Acknowledgement
-- Many thanks to [david-95](https://github.com/david-95) improved TTS, fixed multi-punctuation bug, and contributed to multiple program and data.
-- Many thanks to [BarryKCL](https://github.com/BarryKCL) improved TTS Chinses frontend based on [G2PW](https://github.com/GitYCC/g2pW)
+- Many thanks to [HighCWu](https://github.com/HighCWu) for adding [VITS-aishell3](./examples/aishell3/vits) and [VITS-VC](./examples/aishell3/vits-vc) examples.
+- Many thanks to [david-95](https://github.com/david-95) for improving TTS, fixing the multi-punctuation bug, and contributing multiple programs and data.
+- Many thanks to [BarryKCL](https://github.com/BarryKCL) for improving the TTS Chinese frontend based on [G2PW](https://github.com/GitYCC/g2pW).
 - Many thanks to [yeyupiaoling](https://github.com/yeyupiaoling)/[PPASR](https://github.com/yeyupiaoling/PPASR)/[PaddlePaddle-DeepSpeech](https://github.com/yeyupiaoling/PaddlePaddle-DeepSpeech)/[VoiceprintRecognition-PaddlePaddle](https://github.com/yeyupiaoling/VoiceprintRecognition-PaddlePaddle)/[AudioClassification-PaddlePaddle](https://github.com/yeyupiaoling/AudioClassification-PaddlePaddle) for years of attention, constructive advice and great help.
 - Many thanks to [mymagicpower](https://github.com/mymagicpower) for the Java implementation of ASR upon [short](https://github.com/mymagicpower/AIAS/tree/main/3_audio_sdks/asr_sdk) and [long](https://github.com/mymagicpower/AIAS/tree/main/3_audio_sdks/asr_long_audio_sdk) audio files.
 - Many thanks to [JiehangXie](https://github.com/JiehangXie)/[PaddleBoBo](https://github.com/JiehangXie/PaddleBoBo) for developing Virtual Uploader(VUP)/Virtual YouTuber(VTuber) with PaddleSpeech TTS function.
diff --git a/README_cn.md b/README_cn.md index 21cd00a99a7b85103ac85de63c6fa6ae8a9ac2ba..590124648bc6a133931b6248ebb9d07064781317 100644 --- a/README_cn.md +++ b/README_cn.md @@ -181,12 +181,20 @@ ### 近期更新 - +- ⚡ 2022.08.25: 发布 TTS [finetune](./examples/other/tts_finetune/tts3) 示例。 +- 🔥 2022.08.22: 新增 ERNIE-SAT 模型: [ERNIE-SAT-vctk](./examples/vctk/ernie_sat)、[ERNIE-SAT-aishell3](./examples/aishell3/ernie_sat)、[ERNIE-SAT-zh_en](./examples/aishell3_vctk/ernie_sat)。 +- 🔥 2022.08.15: 将 [g2pW](https://github.com/GitYCC/g2pW) 引入 TTS 中文文本前端。 +- 🔥 2022.08.09: 发布[中英文混合 TTS](./examples/zh_en_tts/tts3)。 +- ⚡ 2022.08.03: TTS CLI 新增 ONNXRuntime 推理方式。 +- 🎉 2022.07.18: 发布 VITS 模型: [VITS-csmsc](./examples/csmsc/vits)、[VITS-aishell3](./examples/aishell3/vits)、[VITS-VC](./examples/aishell3/vits-vc)。 +- 🎉 2022.06.22: 所有 TTS 模型支持了 ONNX 格式。 +- 🍀 2022.06.17: 新增 [PaddleSpeech 网页应用](./demos/speech_web)。 - 👑 2022.05.13: PaddleSpeech 发布 [PP-ASR](./docs/source/asr/PPASR_cn.md) 流式语音识别系统、[PP-TTS](./docs/source/tts/PPTTS_cn.md) 流式语音合成系统、[PP-VPR](docs/source/vpr/PPVPR_cn.md) 全链路声纹识别系统 -- 👏🏻 2022.05.06: PaddleSpeech Streaming Server 上线! 覆盖了语音识别(标点恢复、时间戳),和语音合成。 -- 👏🏻 2022.05.06: PaddleSpeech Server 上线! 覆盖了声音分类、语音识别、语音合成、声纹识别,标点恢复。 -- 👏🏻 2022.03.28: PaddleSpeech CLI 覆盖声音分类、语音识别、语音翻译(英译中)、语音合成,声纹验证。 -- 🤗 2021.12.14: PaddleSpeech [ASR](https://huggingface.co/spaces/KPatrick/PaddleSpeechASR) and [TTS](https://huggingface.co/spaces/KPatrick/PaddleSpeechTTS) Demos on Hugging Face Spaces are available! +- 👏🏻 2022.05.06: PaddleSpeech Streaming Server 上线!覆盖了语音识别(标点恢复、时间戳)和语音合成。 +- 👏🏻 2022.05.06: PaddleSpeech Server 上线!覆盖了声音分类、语音识别、语音合成、声纹识别,标点恢复。 +- 👏🏻 2022.03.28: PaddleSpeech CLI 覆盖声音分类、语音识别、语音翻译(英译中)、语音合成和声纹验证。 +- 🤗 2021.12.14: PaddleSpeech [ASR](https://huggingface.co/spaces/KPatrick/PaddleSpeechASR) 和 [TTS](https://huggingface.co/spaces/KPatrick/PaddleSpeechTTS) 可在 Hugging Face Spaces 上体验! 
+- 👏🏻 2021.12.10: PaddleSpeech CLI 支持语音分类, 语音识别, 语音翻译(英译中)和语音合成。 ### 🔥 加入技术交流群获取入群福利 @@ -237,7 +245,6 @@ pip install . ## 快速开始 - 安装完成后,开发者可以通过命令行或者 Python 快速开始,命令行模式下改变 `--input` 可以尝试用自己的音频或文本测试,支持 16k wav 格式音频。 你也可以在 `aistudio` 中快速体验 👉🏻[一键预测,快速上手 Speech 开发任务](https://aistudio.baidu.com/aistudio/projectdetail/4353348?sUid=2470186&shared=1&ts=1660878142250)。 @@ -624,34 +631,40 @@ PaddleSpeech 的 **语音合成** 主要包含三个模块:文本前端、声 - 声音克隆 + 声音克隆 GE2E Librispeech, etc. - ge2e + GE2E - GE2E + Tacotron2 + SV2TTS (GE2E + Tacotron2) AISHELL-3 - ge2e-tacotron2-aishell3 + VC0 - GE2E + FastSpeech2 + SV2TTS (GE2E + FastSpeech2) AISHELL-3 - ge2e-fastspeech2-aishell3 + VC1 - GE2E + VITS + SV2TTS (ECAPA-TDNN + FastSpeech2) AISHELL-3 - ge2e-vits-aishell3 + VC2 + + GE2E + VITS + AISHELL-3 + + VITS-VC + 端到端 @@ -896,8 +909,9 @@ PaddleSpeech 的 **语音合成** 主要包含三个模块:文本前端、声

 ## 致谢
-- 非常感谢 [david-95](https://github.com/david-95)修复句尾多标点符号出错的问题,补充frontend语音polyphonic 数据,贡献补充多条程序和数据
-- 非常感谢 [BarryKCL](https://github.com/BarryKCL)基于[G2PW](https://github.com/GitYCC/g2pW)对TTS中文文本前端的优化。
+- 非常感谢 [HighCWu](https://github.com/HighCWu) 新增 [VITS-aishell3](./examples/aishell3/vits) 和 [VITS-VC](./examples/aishell3/vits-vc) 代码示例。
+- 非常感谢 [david-95](https://github.com/david-95) 修复句尾多标点符号出错的问题,贡献补充多条程序和数据。
+- 非常感谢 [BarryKCL](https://github.com/BarryKCL) 基于 [G2PW](https://github.com/GitYCC/g2pW) 对 TTS 中文文本前端的优化。
 - 非常感谢 [yeyupiaoling](https://github.com/yeyupiaoling)/[PPASR](https://github.com/yeyupiaoling/PPASR)/[PaddlePaddle-DeepSpeech](https://github.com/yeyupiaoling/PaddlePaddle-DeepSpeech)/[VoiceprintRecognition-PaddlePaddle](https://github.com/yeyupiaoling/VoiceprintRecognition-PaddlePaddle)/[AudioClassification-PaddlePaddle](https://github.com/yeyupiaoling/AudioClassification-PaddlePaddle) 多年来的关注和建议,以及在诸多问题上的帮助。
 - 非常感谢 [mymagicpower](https://github.com/mymagicpower) 采用PaddleSpeech 对 ASR 的[短语音](https://github.com/mymagicpower/AIAS/tree/main/3_audio_sdks/asr_sdk)及[长语音](https://github.com/mymagicpower/AIAS/tree/main/3_audio_sdks/asr_long_audio_sdk)进行 Java 实现。
 - 非常感谢 [JiehangXie](https://github.com/JiehangXie)/[PaddleBoBo](https://github.com/JiehangXie/PaddleBoBo) 采用 PaddleSpeech 语音合成功能实现 Virtual Uploader(VUP)/Virtual YouTuber(VTuber) 虚拟主播。
diff --git a/demos/text_to_speech/README.md b/demos/text_to_speech/README.md
index 3288ecf2f07ddeb028a83b2f33b1f13a62975928..41dcf820b08cfbe894a8b68f16ae552ea73609c9 100644
--- a/demos/text_to_speech/README.md
+++ b/demos/text_to_speech/README.md
@@ -16,8 +16,8 @@ You can choose one way from easy, medium and hard to install paddlespeech.
 The input of this demo should be a text of the specific language that can be passed via argument.
 ### 3. Usage
 - Command Line (Recommended)
+  The default acoustic model is `Fastspeech2`, the default vocoder is `HiFiGAN`, and the default inference method is dygraph inference.
- Chinese - The default acoustic model is `Fastspeech2`, and the default vocoder is `Parallel WaveGAN`. ```bash paddlespeech tts --input "你好,欢迎使用百度飞桨深度学习框架!" ``` @@ -58,6 +58,20 @@ The input of this demo should be a text of the specific language that can be pas paddlespeech tts --am fastspeech2_mix --voc pwgan_csmsc --lang mix --input "我们的声学模型使用了 Fast Speech Two, 声码器使用了 Parallel Wave GAN and Hifi GAN." --spk_id 175 --output mix_spk175_pwgan.wav paddlespeech tts --am fastspeech2_mix --voc hifigan_csmsc --lang mix --input "我们的声学模型使用了 Fast Speech Two, 声码器使用了 Parallel Wave GAN and Hifi GAN." --spk_id 175 --output mix_spk175.wav ``` + - Use ONNXRuntime infer: + ```bash + paddlespeech tts --input "你好,欢迎使用百度飞桨深度学习框架!" --output default.wav --use_onnx True + paddlespeech tts --am speedyspeech_csmsc --input "你好,欢迎使用百度飞桨深度学习框架!" --output ss.wav --use_onnx True + paddlespeech tts --voc mb_melgan_csmsc --input "你好,欢迎使用百度飞桨深度学习框架!" --output mb.wav --use_onnx True + paddlespeech tts --voc pwgan_csmsc --input "你好,欢迎使用百度飞桨深度学习框架!" --output pwgan.wav --use_onnx True + paddlespeech tts --am fastspeech2_aishell3 --voc pwgan_aishell3 --input "你好,欢迎使用百度飞桨深度学习框架!" --spk_id 0 --output aishell3_fs2_pwgan.wav --use_onnx True + paddlespeech tts --am fastspeech2_aishell3 --voc hifigan_aishell3 --input "你好,欢迎使用百度飞桨深度学习框架!" --spk_id 0 --output aishell3_fs2_hifigan.wav --use_onnx True + paddlespeech tts --am fastspeech2_ljspeech --voc pwgan_ljspeech --lang en --input "Life was like a box of chocolates, you never know what you're gonna get." --output lj_fs2_pwgan.wav --use_onnx True + paddlespeech tts --am fastspeech2_ljspeech --voc hifigan_ljspeech --lang en --input "Life was like a box of chocolates, you never know what you're gonna get." --output lj_fs2_hifigan.wav --use_onnx True + paddlespeech tts --am fastspeech2_vctk --voc pwgan_vctk --input "Life was like a box of chocolates, you never know what you're gonna get." 
--lang en --spk_id 0 --output vctk_fs2_pwgan.wav --use_onnx True + paddlespeech tts --am fastspeech2_vctk --voc hifigan_vctk --input "Life was like a box of chocolates, you never know what you're gonna get." --lang en --spk_id 0 --output vctk_fs2_hifigan.wav --use_onnx True + ``` + Usage: ```bash @@ -80,6 +94,8 @@ The input of this demo should be a text of the specific language that can be pas - `lang`: Language of tts task. Default: `zh`. - `device`: Choose device to execute model inference. Default: default device of paddlepaddle in current environment. - `output`: Output wave filepath. Default: `output.wav`. + - `use_onnx`: whether to use ONNXRuntime inference. + - `fs`: sample rate for ONNX models when using specified model files. Output: ```bash @@ -87,38 +103,50 @@ The input of this demo should be a text of the specific language that can be pas ``` - Python API - ```python - import paddle - from paddlespeech.cli.tts import TTSExecutor - - tts_executor = TTSExecutor() - wav_file = tts_executor( - text='今天的天气不错啊', - output='output.wav', - am='fastspeech2_csmsc', - am_config=None, - am_ckpt=None, - am_stat=None, - spk_id=0, - phones_dict=None, - tones_dict=None, - speaker_dict=None, - voc='pwgan_csmsc', - voc_config=None, - voc_ckpt=None, - voc_stat=None, - lang='zh', - device=paddle.get_device()) - print('Wave file has been generated: {}'.format(wav_file)) - ``` - + - Dygraph infer: + ```python + import paddle + from paddlespeech.cli.tts import TTSExecutor + tts_executor = TTSExecutor() + wav_file = tts_executor( + text='今天的天气不错啊', + output='output.wav', + am='fastspeech2_csmsc', + am_config=None, + am_ckpt=None, + am_stat=None, + spk_id=0, + phones_dict=None, + tones_dict=None, + speaker_dict=None, + voc='pwgan_csmsc', + voc_config=None, + voc_ckpt=None, + voc_stat=None, + lang='zh', + device=paddle.get_device()) + print('Wave file has been generated: {}'.format(wav_file)) + ``` + - ONNXRuntime infer: + ```python + from paddlespeech.cli.tts import TTSExecutor + 
tts_executor = TTSExecutor() + wav_file = tts_executor( + text='对数据集进行预处理', + output='output.wav', + am='fastspeech2_csmsc', + voc='hifigan_csmsc', + lang='zh', + use_onnx=True, + cpu_threads=2) + ``` + Output: ```bash Wave file has been generated: output.wav ``` ### 4. Pretrained Models - Here is a list of pretrained models released by PaddleSpeech that can be used by command and python API: - Acoustic model diff --git a/demos/text_to_speech/README_cn.md b/demos/text_to_speech/README_cn.md index ec5eb5ae92d421c6fb3790e9df9ddd9480ae9026..4a4132238f63feb5fe86aa2f0821cc481cd99e6a 100644 --- a/demos/text_to_speech/README_cn.md +++ b/demos/text_to_speech/README_cn.md @@ -1,26 +1,23 @@ (简体中文|[English](./README.md)) # 语音合成 - ## 介绍 语音合成是一种自然语言建模过程,其将文本转换为语音以进行音频演示。 这个 demo 是一个从给定文本生成音频的实现,它可以通过使用 `PaddleSpeech` 的单个命令或 python 中的几行代码来实现。 - ## 使用方法 ### 1. 安装 请看[安装文档](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install_cn.md)。 -你可以从 easy,medium,hard 三中方式中选择一种方式安装。 +你可以从 easy,medium,hard 三种方式中选择一种方式安装。 ### 2. 准备输入 这个 demo 的输入是通过参数传递的特定语言的文本。 ### 3. 使用方法 - 命令行 (推荐使用) + 默认的声学模型是 `Fastspeech2`,默认的声码器是 `HiFiGAN`,默认推理方式是动态图推理。 - 中文 - - 默认的声学模型是 `Fastspeech2`,默认的声码器是 `Parallel WaveGAN`. ```bash paddlespeech tts --input "你好,欢迎使用百度飞桨深度学习框架!" ``` @@ -61,6 +58,19 @@ paddlespeech tts --am fastspeech2_mix --voc pwgan_csmsc --lang mix --input "我们的声学模型使用了 Fast Speech Two, 声码器使用了 Parallel Wave GAN and Hifi GAN." --spk_id 175 --output mix_spk175_pwgan.wav paddlespeech tts --am fastspeech2_mix --voc hifigan_csmsc --lang mix --input "我们的声学模型使用了 Fast Speech Two, 声码器使用了 Parallel Wave GAN and Hifi GAN." --spk_id 175 --output mix_spk175.wav ``` + - 使用 ONNXRuntime 推理: + ```bash + paddlespeech tts --input "你好,欢迎使用百度飞桨深度学习框架!" --output default.wav --use_onnx True + paddlespeech tts --am speedyspeech_csmsc --input "你好,欢迎使用百度飞桨深度学习框架!" --output ss.wav --use_onnx True + paddlespeech tts --voc mb_melgan_csmsc --input "你好,欢迎使用百度飞桨深度学习框架!" 
--output mb.wav --use_onnx True + paddlespeech tts --voc pwgan_csmsc --input "你好,欢迎使用百度飞桨深度学习框架!" --output pwgan.wav --use_onnx True + paddlespeech tts --am fastspeech2_aishell3 --voc pwgan_aishell3 --input "你好,欢迎使用百度飞桨深度学习框架!" --spk_id 0 --output aishell3_fs2_pwgan.wav --use_onnx True + paddlespeech tts --am fastspeech2_aishell3 --voc hifigan_aishell3 --input "你好,欢迎使用百度飞桨深度学习框架!" --spk_id 0 --output aishell3_fs2_hifigan.wav --use_onnx True + paddlespeech tts --am fastspeech2_ljspeech --voc pwgan_ljspeech --lang en --input "Life was like a box of chocolates, you never know what you're gonna get." --output lj_fs2_pwgan.wav --use_onnx True + paddlespeech tts --am fastspeech2_ljspeech --voc hifigan_ljspeech --lang en --input "Life was like a box of chocolates, you never know what you're gonna get." --output lj_fs2_hifigan.wav --use_onnx True + paddlespeech tts --am fastspeech2_vctk --voc pwgan_vctk --input "Life was like a box of chocolates, you never know what you're gonna get." --lang en --spk_id 0 --output vctk_fs2_pwgan.wav --use_onnx True + paddlespeech tts --am fastspeech2_vctk --voc hifigan_vctk --input "Life was like a box of chocolates, you never know what you're gonna get." 
--lang en --spk_id 0 --output vctk_fs2_hifigan.wav --use_onnx True + ``` 使用方法: @@ -84,6 +94,8 @@ - `lang`:TTS 任务的语言, 默认值:`zh`。 - `device`:执行预测的设备, 默认值:当前系统下 paddlepaddle 的默认 device。 - `output`:输出音频的路径, 默认值:`output.wav`。 + - `use_onnx`: 是否使用 ONNXRuntime 进行推理。 + - `fs`: 使用特定 ONNX 模型时的采样率。 输出: ```bash @@ -91,31 +103,44 @@ ``` - Python API - ```python - import paddle - from paddlespeech.cli.tts import TTSExecutor - - tts_executor = TTSExecutor() - wav_file = tts_executor( - text='今天的天气不错啊', - output='output.wav', - am='fastspeech2_csmsc', - am_config=None, - am_ckpt=None, - am_stat=None, - spk_id=0, - phones_dict=None, - tones_dict=None, - speaker_dict=None, - voc='pwgan_csmsc', - voc_config=None, - voc_ckpt=None, - voc_stat=None, - lang='zh', - device=paddle.get_device()) - print('Wave file has been generated: {}'.format(wav_file)) - ``` - + - 动态图推理: + ```python + import paddle + from paddlespeech.cli.tts import TTSExecutor + tts_executor = TTSExecutor() + wav_file = tts_executor( + text='今天的天气不错啊', + output='output.wav', + am='fastspeech2_csmsc', + am_config=None, + am_ckpt=None, + am_stat=None, + spk_id=0, + phones_dict=None, + tones_dict=None, + speaker_dict=None, + voc='pwgan_csmsc', + voc_config=None, + voc_ckpt=None, + voc_stat=None, + lang='zh', + device=paddle.get_device()) + print('Wave file has been generated: {}'.format(wav_file)) + ``` + - ONNXRuntime 推理: + ```python + from paddlespeech.cli.tts import TTSExecutor + tts_executor = TTSExecutor() + wav_file = tts_executor( + text='对数据集进行预处理', + output='output.wav', + am='fastspeech2_csmsc', + voc='hifigan_csmsc', + lang='zh', + use_onnx=True, + cpu_threads=2) + ``` + 输出: ```bash Wave file has been generated: output.wav