This demo is an implementation of starting the streaming speech synthesis service and accessing the service. It can be achieved with a single command using `paddlespeech_server` and `paddlespeech_client` or a few lines of code in python.
For service interface definition, please check:
-[PaddleSpeech Server RESTful API](https://github.com/PaddlePaddle/PaddleSpeech/wiki/PaddleSpeech-Server-RESTful-API)
-[PaddleSpeech Streaming Server WebSocket API](https://github.com/PaddlePaddle/PaddleSpeech/wiki/PaddleSpeech-Server-WebSocket-API)
## Usage
### 1. Installation
see [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
It is recommended to use **paddlepaddle 2.2.2** or above.
It is recommended to use **paddlepaddle 2.3.1** or above.
You can choose one way from easy, meduim and hard to install paddlespeech.
**If you install in simple mode, you need to prepare the yaml file by yourself, you can refer to the yaml file in the conf directory.**
**If you install in easy mode, you need to prepare the yaml file by yourself, you can refer to the yaml file in the conf directory.**
### 2. Prepare config File
The configuration file can be found in `conf/tts_online_application.yaml`.
...
...
@@ -29,11 +33,10 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
- Both hifigan and mb_melgan support streaming voc inference.
- When the voc model is mb_melgan, when voc_pad=14, the synthetic audio for streaming inference is consistent with the non-streaming synthetic audio; the minimum voc_pad can be set to 7, and the synthetic audio has no abnormal hearing. If the voc_pad is less than 7, the synthetic audio sounds abnormal.
- When the voc model is hifigan, when voc_pad=19, the streaming inference synthetic audio is consistent with the non-streaming synthetic audio; when voc_pad=14, the synthetic audio has no abnormal hearing.
- Pad calculation method of streaming vocoder in PaddleSpeech: [AIStudio tutorial](https://aistudio.baidu.com/aistudio/projectdetail/4151335)
-**Note:** If the service can be started normally in the container, but the client access IP is unreachable, you can try to replace the `host` address in the configuration file with the local IP address.
### 3. Streaming speech synthesis server and client using http protocol
#### 3.1 Server Usage
- Command Line (Recommended)
...
...
@@ -53,7 +56,7 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
[2022-04-24 21:00:17] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
```
#### 3.2 Streaming TTS client Usage
...
...
@@ -125,7 +126,7 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
- Currently, only the single-speaker model is supported in the code, so `spk_id` does not take effect. Streaming TTS does not support changing sample rate, variable speed and volume.