Merge pull request #2189 from yt605155624/fix_name_bug

[doc]update server readme

663cfc01 · 小湉湉 · GitHub · bc2613b7 · dfb09eee · 663cfc01
8 changed files
--- a/demos/speech_server/README.md
+++ b/demos/speech_server/README.md
--- a/demos/speech_server/README_cn.md
+++ b/demos/speech_server/README_cn.md
--- a/demos/speech_web/接口文档.md
+++ b/demos/speech_web/接口文档.md
@@ -8,7 +8,7 @@ http://0.0.0.0:8010/docs
 ### 【POST】/asr/offline
-说明：上传16k,16bit wav文件，返回 offline 语音识别模型识别结果
+说明：上传 16k, 16bit wav 文件，返回 offline 语音识别模型识别结果
 返回: JSON
@@ -26,11 +26,11 @@ http://0.0.0.0:8010/docs
 ### 【POST】/asr/offlinefile
-说明：上传16k,16bit wav文件，返回 offline 语音识别模型识别结果 + wav数据的base64
+说明：上传16k,16bit wav文件，返回 offline 语音识别模型识别结果 + wav 数据的 base64
 返回: JSON
-前端接口： 音频文件识别(播放这段base64还原后记得添加wav头，采样率16k, int16，添加后才能播放)
+前端接口： 音频文件识别(播放这段base64还原后记得添加 wav 头，采样率 16k, int16，添加后才能播放)
 示例:
@@ -48,7 +48,7 @@ http://0.0.0.0:8010/docs
 ### 【POST】/asr/collectEnv
-说明： 通过采集环境噪音，上传16k, int16 wav文件，来生成后台VAD的能量阈值， 返回阈值结果
+说明： 通过采集环境噪音，上传 16k, int16 wav 文件，来生成后台 VAD 的能量阈值， 返回阈值结果
 前端接口：ASR-环境采样
@@ -64,9 +64,9 @@ http://0.0.0.0:8010/docs
 ### 【GET】/asr/stopRecord
-说明：通过 GET 请求 /asr/stopRecord, 后台停止接收 offlineStream 中通过 WS协议 上传的数据
+说明：通过 GET 请求 /asr/stopRecord, 后台停止接收 offlineStream 中通过 WS 协议 上传的数据
-前端接口：语音聊天-暂停录音（获取NLP，播放TTS时暂停）
+前端接口：语音聊天-暂停录音（获取 NLP，播放 TTS 时暂停）
 返回: JSON
@@ -80,9 +80,9 @@ http://0.0.0.0:8010/docs
 ### 【GET】/asr/resumeRecord
-说明：通过 GET 请求 /asr/resumeRecord, 后台停止接收 offlineStream 中通过 WS协议 上传的数据
+说明：通过 GET 请求 /asr/resumeRecord, 后台停止接收 offlineStream 中通过 WS 协议 上传的数据
-前端接口：语音聊天-恢复录音（TTS播放完毕时，告诉后台恢复录音）
+前端接口：语音聊天-恢复录音（ TTS 播放完毕时，告诉后台恢复录音）
 返回: JSON
@@ -100,16 +100,16 @@ http://0.0.0.0:8010/docs
 前端接口：语音聊天-开始录音，持续将麦克风语音传给后端，后端推送语音识别结果
-返回：后端返回识别结果，offline模型识别结果， 由WS推送
+返回：后端返回识别结果，offline 模型识别结果， 由WS推送
 ### 【Websocket】/ws/asr/onlineStream
-说明：通过 WS 协议，将前端音频持续上传到后台，前端采集 16k，Int16 类型的PCM片段，持续上传到后端
+说明：通过 WS 协议，将前端音频持续上传到后台，前端采集 16k，Int16 类型的 PCM 片段，持续上传到后端
 前端接口：ASR-流式识别开始录音，持续将麦克风语音传给后端，后端推送语音识别结果
-返回：后端返回识别结果，online模型识别结果， 由WS推送
+返回：后端返回识别结果，online 模型识别结果， 由 WS 推送
 ## NLP
@@ -202,7 +202,7 @@ http://0.0.0.0:8010/docs
 ### 【POST】/tts/offline
-说明：获取TTS离线模型音频
+说明：获取 TTS 离线模型音频
 前端接口：TTS-端到端合成
@@ -272,7 +272,7 @@ curl -X 'POST' \
 ### 【POST】/vpr/recog
-说明：声纹识别，识别文件，提取文件的声纹信息做比对 音频 16k, int 16 wav格式
+说明：声纹识别，识别文件，提取文件的声纹信息做比对 音频 16k, int 16 wav 格式
 前端接口：声纹识别-上传音频，返回声纹识别结果
@@ -383,9 +383,9 @@ curl -X 'GET' \
 ### 【GET】/vpr/database64
-说明： 根据 vpr_id 获取用户vpr时注册使用音频转换成 16k, int16 类型的数组，返回base64编码
+说明： 根据 vpr_id 获取用户 vpr 时注册使用音频转换成 16k, int16 类型的数组，返回 base64 编码
-前端接口：声纹识别-获取vpr对应的音频（注意：播放时需要添加 wav头，16k,int16, 可参考tts播放时添加wav的方式，注意更改采样率）
+前端接口：声纹识别-获取 vpr 对应的音频（注意：播放时需要添加 wav头，16k,int16, 可参考 tts 播放时添加 wav 的方式，注意更改采样率）
 访问示例：
@@ -401,6 +401,4 @@ curl -X 'GET' \
  "code": 0,
  "result":"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",
  "message": "ok"
 ```
\ No newline at end of file
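Reviewer note: the `/asr/offlinefile` and `/vpr/database64` entries in this file say the base64 payload is raw 16k, int16 audio that needs a wav header before it can be played. A minimal Python sketch of that step follows. It assumes mono audio and that the decoded bytes are plain PCM samples, neither of which the diff states; the helper name `pcm_base64_to_wav` is hypothetical and not part of PaddleSpeech.

```python
import base64
import io
import wave


def pcm_base64_to_wav(b64_pcm: str, sample_rate: int = 16000, channels: int = 1) -> bytes:
    """Wrap base64-encoded raw int16 PCM in a WAV container and return the file bytes."""
    pcm = base64.b64decode(b64_pcm)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)     # assumption: mono
        wav.setsampwidth(2)            # int16 means 2 bytes per sample
        wav.setframerate(sample_rate)  # 16k, as stated in the interface doc
        wav.writeframes(pcm)           # header sizes are patched when the writer closes
    return buf.getvalue()
```

The returned bytes can be written to a `.wav` file or handed to an audio player. If the service ever returns a different sample rate, only the `sample_rate` argument needs to change.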
--- a/demos/speech_web/README_cn.md
+++ b/demos/speech_web/README_cn.md
 # Paddle Speech Demo
-PaddleSpeechDemo是一个以PaddleSpeech的语音交互功能为主体开发的Demo展示项目，用于帮助大家更好的上手PaddleSpeech以及使用PaddleSpeech构建自己的应用。
+PaddleSpeechDemo 是一个以 PaddleSpeech 的语音交互功能为主体开发的 Demo 展示项目，用于帮助大家更好的上手 PaddleSpeech 以及使用 PaddleSpeech 构建自己的应用。
-智能语音交互部分使用PaddleSpeech，对话以及信息抽取部分使用PaddleNLP，网页前端展示部分基于Vue3进行开发
+智能语音交互部分使用 PaddleSpeech，对话以及信息抽取部分使用 PaddleNLP，网页前端展示部分基于 Vue3 进行开发
 主要功能：
-+ 语音聊天：PaddleSpeech的语音识别能力+语音合成能力，对话部分基于PaddleNLP的闲聊功能
+ 语音聊天：PaddleSpeech 的语音识别能力+语音合成能力，对话部分基于 PaddleNLP 的闲聊功能
-+ 声纹识别：PaddleSpeech的声纹识别功能展示
+ 声纹识别：PaddleSpeech 的声纹识别功能展示
 + 语音识别：支持【实时语音识别】，【端到端识别】，【音频文件识别】三种模式
 + 语音合成：支持【流式合成】与【端到端合成】两种方式
-+ 语音指令：基于PaddleSpeech的语音识别能力与PaddleNLP的信息抽取，实现交通费的智能报销
+ 语音指令：基于 PaddleSpeech 的语音识别能力与 PaddleNLP 的信息抽取，实现交通费的智能报销
 运行效果：
@@ -32,23 +32,21 @@ cd model
 wget https://bj.bcebos.com/paddlenlp/applications/speech-cmd-analysis/finetune/model_state.pdparams
 ```
 ### 前端环境安装
-前端依赖node.js ，需要提前安装，确保npm可用，npm测试版本8.3.1，建议下载[官网](https://nodejs.org/en/)稳定版的node.js
+前端依赖 `node.js` ，需要提前安装，确保 `npm` 可用，`npm` 测试版本 `8.3.1`，建议下载[官网](https://nodejs.org/en/)稳定版的 `node.js`
 ```
 # 进入前端目录
 cd web_client
-# 安装yarn，已经安装可跳过
+# 安装 `yarn`，已经安装可跳过
 npm install -g yarn
 # 使用yarn安装前端依赖
 yarn install
 ```
 ## 启动服务
 ### 开启后端服务
@@ -66,18 +64,18 @@ cd web_client
 yarn dev --port 8011
 ```
-默认配置下，前端中配置的后台地址信息是localhost，确保后端服务器和打开页面的游览器在同一台机器上，不在一台机器的配置方式见下方的FAQ：【后端如果部署在其它机器或者别的端口如何修改】
+默认配置下，前端中配置的后台地址信息是 localhost，确保后端服务器和打开页面的游览器在同一台机器上，不在一台机器的配置方式见下方的 FAQ：【后端如果部署在其它机器或者别的端口如何修改】
 ## FAQ 
 #### Q: 如何安装node.js
-A： node.js的安装可以参考[【菜鸟教程】](https://www.runoob.com/nodejs/nodejs-install-setup.html), 确保npm可用
+A： node.js的安装可以参考[【菜鸟教程】](https://www.runoob.com/nodejs/nodejs-install-setup.html), 确保 npm 可用
 #### Q：后端如果部署在其它机器或者别的端口如何修改
 A：后端的配置地址有分散在两个文件中
-修改第一个文件`PaddleSpeechWebClient/vite.config.js`
+修改第一个文件 `PaddleSpeechWebClient/vite.config.js`
 ```
 server: {
@@ -92,7 +90,7 @@ server: {
  }
 ```
-修改第二个文件`PaddleSpeechWebClient/src/api/API.js`（Websocket代理配置失败，所以需要在这个文件中修改）
+修改第二个文件 `PaddleSpeechWebClient/src/api/API.js`（ Websocket 代理配置失败，所以需要在这个文件中修改）
 ```
 // websocket （这里改成后端所在的接口）
@@ -107,9 +105,6 @@ A：这里主要是游览器安全策略的限制，需要配置游览器后重
 chrome设置地址: chrome://flags/#unsafely-treat-insecure-origin-as-secure
 ## 参考资料
 vue实现录音参考资料：https://blog.csdn.net/qq_41619796/article/details/107865602#t1

--- a/demos/streaming_asr_server/README.md
+++ b/demos/streaming_asr_server/README.md
--- a/demos/streaming_asr_server/README_cn.md
+++ b/demos/streaming_asr_server/README_cn.md
--- a/demos/streaming_tts_server/README.md
+++ b/demos/streaming_tts_server/README.md
@@ -5,15 +5,19 @@
 ## Introduction
 This demo is an implementation of starting the streaming speech synthesis service and accessing the service. It can be achieved with a single command using `paddlespeech_server` and `paddlespeech_client` or a few lines of code in python.
+For service interface definition, please check:
+- [PaddleSpeech Server RESTful API](https://github.com/PaddlePaddle/PaddleSpeech/wiki/PaddleSpeech-Server-RESTful-API)
+- [PaddleSpeech Streaming Server WebSocket API](https://github.com/PaddlePaddle/PaddleSpeech/wiki/PaddleSpeech-Server-WebSocket-API)
 ## Usage
 ### 1. Installation
 see [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
-It is recommended to use **paddlepaddle 2.2.2** or above.
+It is recommended to use **paddlepaddle 2.3.1** or above.
 You can choose one of the easy, medium, or hard ways to install paddlespeech.
-**If you install in simple mode, you need to prepare the yaml file by yourself, you can refer to the yaml file in the conf directory.**
+**If you install in easy mode, you need to prepare the yaml file by yourself, you can refer to the yaml file in the conf directory.**
 ### 2. Prepare config File
 The configuration file can be found in `conf/tts_online_application.yaml`.
@@ -29,11 +33,10 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
    - Both hifigan and mb_melgan support streaming voc inference.
    - With the mb_melgan vocoder, voc_pad=14 makes the streaming synthesized audio identical to the non-streaming audio. voc_pad can be lowered to 7 without audible artifacts; below 7 the synthesized audio sounds abnormal.
    - With the hifigan vocoder, voc_pad=19 makes the streaming synthesized audio identical to the non-streaming audio; with voc_pad=14 the synthesized audio has no audible artifacts.
+    - Pad calculation method of streaming vocoder in PaddleSpeech: [AIStudio tutorial](https://aistudio.baidu.com/aistudio/projectdetail/4151335)
 - Inference speed: mb_melgan > hifigan; Audio quality: mb_melgan < hifigan
 - **Note:** If the service can be started normally in the container, but the client access IP is unreachable, you can try to replace the `host` address in the configuration file with the local IP address.
 ### 3. Streaming speech synthesis server and client using http protocol
 #### 3.1 Server Usage
 - Command Line (Recommended)
@@ -53,7 +56,7 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
  - `log_file`: log file. Default: ./log/paddlespeech.log
  Output:
-  ```bash
+  ```text
  [2022-04-24 20:05:27,887] [    INFO] - The first response time of the 0 warm up: 1.0123658180236816 s
  [2022-04-24 20:05:28,038] [    INFO] - The first response time of the 1 warm up: 0.15108466148376465 s
  [2022-04-24 20:05:28,191] [    INFO] - The first response time of the 2 warm up: 0.15317344665527344 s
@@ -79,8 +82,8 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
      log_file="./log/paddlespeech.log")
  ```
- Output:
+  Output:
-  ```bash
+  ```text
  [2022-04-24 21:00:16,934] [    INFO] - The first response time of the 0 warm up: 1.268730878829956 s
  [2022-04-24 21:00:17,046] [    INFO] - The first response time of the 1 warm up: 0.11168622970581055 s
  [2022-04-24 21:00:17,151] [    INFO] - The first response time of the 2 warm up: 0.10413002967834473 s
@@ -93,8 +96,6 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
  [2022-04-24 21:00:17] [INFO] [on.py:59] Application startup complete.
  INFO:     Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
  [2022-04-24 21:00:17] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
  ```
 #### 3.2 Streaming TTS client Usage
@@ -125,7 +126,7 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
    - Currently, only the single-speaker model is supported in the code, so `spk_id` does not take effect. Streaming TTS does not support changing sample rate, variable speed and volume.
    Output:
-    ```bash
+    ```text
    [2022-04-24 21:08:18,559] [    INFO] - tts http client start
    [2022-04-24 21:08:21,702] [    INFO] - 句子：您好，欢迎使用百度飞桨语音合成服务。
    [2022-04-24 21:08:21,703] [    INFO] - 首包响应：0.18863153457641602 s
@@ -154,7 +155,7 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
  ```
  Output:
-  ```bash
+  ```text
  [2022-04-24 21:11:13,798] [    INFO] - tts http client start
  [2022-04-24 21:11:16,800] [    INFO] - 句子：您好，欢迎使用百度飞桨语音合成服务。
  [2022-04-24 21:11:16,801] [    INFO] - 首包响应：0.18234872817993164 s
@@ -164,7 +165,6 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
  [2022-04-24 21:11:16,837] [    INFO] - 音频保存至：./output.wav
  ```
 ### 4. Streaming speech synthesis server and client using websocket protocol
 #### 4.1 Server Usage
 - Command Line (Recommended)
@@ -184,21 +184,19 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
  - `log_file`: log file. Default: ./log/paddlespeech.log
  Output:
-  ```bash
+  ```text
-    [2022-04-27 10:18:09,107] [    INFO] - The first response time of the 0 warm up: 1.1551103591918945 s
+  [2022-04-27 10:18:09,107] [    INFO] - The first response time of the 0 warm up: 1.1551103591918945 s
-    [2022-04-27 10:18:09,219] [    INFO] - The first response time of the 1 warm up: 0.11204338073730469 s
+  [2022-04-27 10:18:09,219] [    INFO] - The first response time of the 1 warm up: 0.11204338073730469 s
-    [2022-04-27 10:18:09,324] [    INFO] - The first response time of the 2 warm up: 0.1051797866821289 s
+  [2022-04-27 10:18:09,324] [    INFO] - The first response time of the 2 warm up: 0.1051797866821289 s
-    [2022-04-27 10:18:09,325] [    INFO] - **********************************************************************
+  [2022-04-27 10:18:09,325] [    INFO] - **********************************************************************
-    INFO:     Started server process [17600]
+  INFO:     Started server process [17600]
-    [2022-04-27 10:18:09] [INFO] [server.py:75] Started server process [17600]
+  [2022-04-27 10:18:09] [INFO] [server.py:75] Started server process [17600]
-    INFO:     Waiting for application startup.
+  INFO:     Waiting for application startup.
-    [2022-04-27 10:18:09] [INFO] [on.py:45] Waiting for application startup.
+  [2022-04-27 10:18:09] [INFO] [on.py:45] Waiting for application startup.
-    INFO:     Application startup complete.
+  INFO:     Application startup complete.
-    [2022-04-27 10:18:09] [INFO] [on.py:59] Application startup complete.
+  [2022-04-27 10:18:09] [INFO] [on.py:59] Application startup complete.
-    INFO:     Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
+  INFO:     Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
-    [2022-04-27 10:18:09] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
+  [2022-04-27 10:18:09] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
  ```
 - Python API
@@ -212,20 +210,19 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
  ```
  Output:
-  ```bash
+  ```text
-    [2022-04-27 10:20:16,660] [    INFO] - The first response time of the 0 warm up: 1.0945196151733398 s
+  [2022-04-27 10:20:16,660] [    INFO] - The first response time of the 0 warm up: 1.0945196151733398 s
-    [2022-04-27 10:20:16,773] [    INFO] - The first response time of the 1 warm up: 0.11222052574157715 s
+  [2022-04-27 10:20:16,773] [    INFO] - The first response time of the 1 warm up: 0.11222052574157715 s
-    [2022-04-27 10:20:16,878] [    INFO] - The first response time of the 2 warm up: 0.10494542121887207 s
+  [2022-04-27 10:20:16,878] [    INFO] - The first response time of the 2 warm up: 0.10494542121887207 s
-    [2022-04-27 10:20:16,878] [    INFO] - **********************************************************************
+  [2022-04-27 10:20:16,878] [    INFO] - **********************************************************************
-    INFO:     Started server process [23466]
+  INFO:     Started server process [23466]
-    [2022-04-27 10:20:16] [INFO] [server.py:75] Started server process [23466]
+  [2022-04-27 10:20:16] [INFO] [server.py:75] Started server process [23466]
-    INFO:     Waiting for application startup.
+  INFO:     Waiting for application startup.
-    [2022-04-27 10:20:16] [INFO] [on.py:45] Waiting for application startup.
+  [2022-04-27 10:20:16] [INFO] [on.py:45] Waiting for application startup.
-    INFO:     Application startup complete.
+  INFO:     Application startup complete.
-    [2022-04-27 10:20:16] [INFO] [on.py:59] Application startup complete.
+  [2022-04-27 10:20:16] [INFO] [on.py:59] Application startup complete.
-    INFO:     Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
+  INFO:     Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
-    [2022-04-27 10:20:16] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
+  [2022-04-27 10:20:16] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
  ```
 #### 4.2 Streaming TTS client Usage
@@ -258,7 +255,7 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
    Output:
-    ```bash
+    ```text
    [2022-04-27 10:21:04,262] [    INFO] - tts websocket client start
    [2022-04-27 10:21:04,496] [    INFO] - 句子：您好，欢迎使用百度飞桨语音合成服务。
    [2022-04-27 10:21:04,496] [    INFO] - 首包响应：0.2124948501586914 s
@@ -266,7 +263,6 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
    [2022-04-27 10:21:07,484] [    INFO] - 音频时长：3.825 s
    [2022-04-27 10:21:07,484] [    INFO] - RTF: 0.8363677006141812
    [2022-04-27 10:21:07,516] [    INFO] - 音频保存至：output.wav
    ```
 - Python API
@@ -283,21 +279,15 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
      spk_id=0,
      output="./output.wav",
      play=False)
  ```
  Output:
-  ```bash
+  ```text
-    [2022-04-27 10:22:48,852] [    INFO] - tts websocket client start
+  [2022-04-27 10:22:48,852] [    INFO] - tts websocket client start
-    [2022-04-27 10:22:49,080] [    INFO] - 句子：您好，欢迎使用百度飞桨语音合成服务。
+  [2022-04-27 10:22:49,080] [    INFO] - 句子：您好，欢迎使用百度飞桨语音合成服务。
-    [2022-04-27 10:22:49,080] [    INFO] - 首包响应：0.21017956733703613 s
+  [2022-04-27 10:22:49,080] [    INFO] - 首包响应：0.21017956733703613 s
-    [2022-04-27 10:22:52,100] [    INFO] - 尾包响应：3.2304444313049316 s
+  [2022-04-27 10:22:52,100] [    INFO] - 尾包响应：3.2304444313049316 s
-    [2022-04-27 10:22:52,101] [    INFO] - 音频时长：3.825 s
+  [2022-04-27 10:22:52,101] [    INFO] - 音频时长：3.825 s
-    [2022-04-27 10:22:52,101] [    INFO] - RTF: 0.8445606356352762
+  [2022-04-27 10:22:52,101] [    INFO] - RTF: 0.8445606356352762
-    [2022-04-27 10:22:52,134] [    INFO] - 音频保存至：./output.wav
+  [2022-04-27 10:22:52,134] [    INFO] - 音频保存至：./output.wav
  ```
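Reviewer note: the client log in this hunk reports the final-packet response time (尾包响应), the audio duration (音频时长) and an RTF. The logged values are consistent with the RTF being the final-packet response time divided by the audio duration. That relation is inferred from the numbers shown here, not taken from the client source, and the function name below is hypothetical.

```python
def real_time_factor(final_response_s: float, audio_duration_s: float) -> float:
    """Return synthesis time divided by audio duration (assumed RTF definition)."""
    if audio_duration_s <= 0:
        raise ValueError("audio duration must be positive")
    return final_response_s / audio_duration_s


# Values copied from the websocket client log above.
rtf = real_time_factor(3.2304444313049316, 3.825)
print(f"RTF: {rtf:.6f}")  # RTF: 0.844561
```

An RTF below 1 means the audio was synthesized faster than it takes to play back, which is what keeps streaming playback from stalling.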
--- a/demos/streaming_tts_server/README_cn.md
+++ b/demos/streaming_tts_server/README_cn.md
@@ -3,15 +3,19 @@
 # 流式语音合成服务
 ## 介绍
-这个demo是一个启动流式语音合成服务和访问该服务的实现。 它可以通过使用`paddlespeech_server` 和 `paddlespeech_client`的单个命令或 python 的几行代码来实现。
+这个 demo 是一个启动流式语音合成服务和访问该服务的实现。 它可以通过使用 `paddlespeech_server` 和 `paddlespeech_client` 的单个命令或 python 的几行代码来实现。
+服务接口定义请参考:
+- [PaddleSpeech Server RESTful API](https://github.com/PaddlePaddle/PaddleSpeech/wiki/PaddleSpeech-Server-RESTful-API)
+- [PaddleSpeech Streaming Server WebSocket API](https://github.com/PaddlePaddle/PaddleSpeech/wiki/PaddleSpeech-Server-WebSocket-API)
 ## 使用方法
 ### 1. 安装
 请看 [安装文档](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
-推荐使用 **paddlepaddle 2.2.2** 或以上版本。
+推荐使用 **paddlepaddle 2.3.1** 或以上版本。
-你可以从简单，中等，困难几种方式中选择一种方式安装 PaddleSpeech。
+你可以从简单，中等，困难 几种方式中选择一种方式安装 PaddleSpeech。
 **如果使用简单模式安装，需要自行准备 yaml 文件，可参考 conf 目录下的 yaml 文件。**
 ### 2. 准备配置文件
@@ -20,19 +24,20 @@
 - `engine_list` 表示即将启动的服务将会包含的语音引擎，格式为 <语音任务>_<引擎类型>。
    - 该 demo 主要介绍流式语音合成服务，因此语音任务应设置为 tts。
    - 目前引擎类型支持两种形式：**online** 表示使用python进行动态图推理的引擎；**online-onnx** 表示使用 onnxruntime 进行推理的引擎。其中，online-onnx 的推理速度更快。
- 流式 TTS 引擎的 AM 模型支持：**fastspeech2 以及fastspeech2_cnndecoder**; Voc 模型支持：**hifigan, mb_melgan**
+- 流式 TTS 引擎的 AM 模型支持：**fastspeech2 以及 fastspeech2_cnndecoder**; Voc 模型支持：**hifigan, mb_melgan**
 - 流式 am 推理中，每次会对一个 chunk 的数据进行推理以达到流式的效果。其中 `am_block` 表示 chunk 中的有效帧数，`am_pad` 表示一个 chunk 中 am_block 前后各加的帧数。am_pad 的存在用于消除流式推理产生的误差，避免由流式推理对合成音频质量的影响。
-    - fastspeech2 不支持流式 am 推理，因此 am_pad 与 m_block 对它无效
+    - fastspeech2 不支持流式 am 推理，因此 am_pad 与 am_block 对它无效
    - fastspeech2_cnndecoder 支持流式推理，当 am_pad=12 时，流式推理合成音频与非流式合成音频一致
- 流式 voc 推理中，每次会对一个 chunk 的数据进行推理以达到流式的效果。其中 `voc_block` 表示chunk中的有效帧数，`voc_pad` 表示一个 chunk 中 voc_block 前后各加的帧数。voc_pad 的存在用于消除流式推理产生的误差，避免由流式推理对合成音频质量的影响。
+- 流式 voc 推理中，每次会对一个 chunk 的数据进行推理以达到流式的效果。其中 `voc_block` 表示 chunk 中的有效帧数，`voc_pad` 表示一个 chunk 中 voc_block 前后各加的帧数。voc_pad 的存在用于消除流式推理产生的误差，避免由流式推理对合成音频质量的影响。
    - hifigan, mb_melgan 均支持流式 voc 推理
    - 当 voc 模型为 mb_melgan，当 voc_pad=14 时，流式推理合成音频与非流式合成音频一致；voc_pad 最小可以设置为7，合成音频听感上没有异常，若 voc_pad 小于7，合成音频听感上存在异常。
    - 当 voc 模型为 hifigan，当 voc_pad=19 时，流式推理合成音频与非流式合成音频一致；当 voc_pad=14 时，合成音频听感上没有异常。
+    - PaddleSpeech 中流式声码器 Pad 计算方法: [AIStudio 教程](https://aistudio.baidu.com/aistudio/projectdetail/4151335)
 - 推理速度：mb_melgan > hifigan; 音频质量：mb_melgan < hifigan
 - **注意：** 如果在容器里可正常启动服务，但客户端访问 ip 不可达，可尝试将配置文件中 `host` 地址换成本地 ip 地址。
-### 3. 使用http协议的流式语音合成服务端及客户端使用方法
+### 3. 使用 http 协议的流式语音合成服务端及客户端使用方法
 #### 3.1 服务端使用方法
 - 命令行 (推荐使用)
@@ -51,7 +56,7 @@
  - `log_file`: log 文件. 默认：./log/paddlespeech.log
  输出:
-  ```bash
+  ```text
  [2022-04-24 20:05:27,887] [    INFO] - The first response time of the 0 warm up: 1.0123658180236816 s
  [2022-04-24 20:05:28,038] [    INFO] - The first response time of the 1 warm up: 0.15108466148376465 s
  [2022-04-24 20:05:28,191] [    INFO] - The first response time of the 2 warm up: 0.15317344665527344 s
@@ -64,7 +69,6 @@
  [2022-04-24 20:05:28] [INFO] [on.py:59] Application startup complete.
  INFO:     Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
  [2022-04-24 20:05:28] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
  ```
 - Python API
@@ -77,8 +81,8 @@
      log_file="./log/paddlespeech.log")
  ```
-  输出：
+  输出:
-  ```bash
+  ```text
  [2022-04-24 21:00:16,934] [    INFO] - The first response time of the 0 warm up: 1.268730878829956 s
  [2022-04-24 21:00:17,046] [    INFO] - The first response time of the 1 warm up: 0.11168622970581055 s
  [2022-04-24 21:00:17,151] [    INFO] - The first response time of the 2 warm up: 0.10413002967834473 s
@@ -91,8 +95,6 @@
  [2022-04-24 21:00:17] [INFO] [on.py:59] Application startup complete.
  INFO:     Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
  [2022-04-24 21:00:17] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
  ```
 #### 3.2 客户端使用方法
@@ -124,7 +126,7 @@
    输出:
-    ```bash
+    ```text
    [2022-04-24 21:08:18,559] [    INFO] - tts http client start
    [2022-04-24 21:08:21,702] [    INFO] - 句子：您好，欢迎使用百度飞桨语音合成服务。
    [2022-04-24 21:08:21,703] [    INFO] - 首包响应：0.18863153457641602 s
@@ -162,9 +164,8 @@
  [2022-04-24 21:11:16,802] [    INFO] - RTF: 0.7846773683635238
  [2022-04-24 21:11:16,837] [    INFO] - 音频保存至：./output.wav
  ```
-### 4. 使用websocket协议的流式语音合成服务端及客户端使用方法
+### 4. 使用 websocket 协议的流式语音合成服务端及客户端使用方法
 #### 4.1 服务端使用方法
 - 命令行 (推荐使用)
  首先修改配置文件 `conf/tts_online_application.yaml`， **将 `protocol` 设置为 `websocket`**。
@@ -183,21 +184,19 @@
  - `log_file`: log 文件. 默认：./log/paddlespeech.log
  输出:
-  ```bash
+  ```text
-    [2022-04-27 10:18:09,107] [    INFO] - The first response time of the 0 warm up: 1.1551103591918945 s
+  [2022-04-27 10:18:09,107] [    INFO] - The first response time of the 0 warm up: 1.1551103591918945 s
-    [2022-04-27 10:18:09,219] [    INFO] - The first response time of the 1 warm up: 0.11204338073730469 s
+  [2022-04-27 10:18:09,219] [    INFO] - The first response time of the 1 warm up: 0.11204338073730469 s
-    [2022-04-27 10:18:09,324] [    INFO] - The first response time of the 2 warm up: 0.1051797866821289 s
+  [2022-04-27 10:18:09,324] [    INFO] - The first response time of the 2 warm up: 0.1051797866821289 s
-    [2022-04-27 10:18:09,325] [    INFO] - **********************************************************************
+  [2022-04-27 10:18:09,325] [    INFO] - **********************************************************************
-    INFO:     Started server process [17600]
+  INFO:     Started server process [17600]
-    [2022-04-27 10:18:09] [INFO] [server.py:75] Started server process [17600]
+  [2022-04-27 10:18:09] [INFO] [server.py:75] Started server process [17600]
-    INFO:     Waiting for application startup.
+  INFO:     Waiting for application startup.
-    [2022-04-27 10:18:09] [INFO] [on.py:45] Waiting for application startup.
+  [2022-04-27 10:18:09] [INFO] [on.py:45] Waiting for application startup.
-    INFO:     Application startup complete.
+  INFO:     Application startup complete.
-    [2022-04-27 10:18:09] [INFO] [on.py:59] Application startup complete.
+  [2022-04-27 10:18:09] [INFO] [on.py:59] Application startup complete.
-    INFO:     Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
+  INFO:     Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
-    [2022-04-27 10:18:09] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
+  [2022-04-27 10:18:09] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
  ```
 - Python API
@@ -210,27 +209,26 @@
      log_file="./log/paddlespeech.log")
  ```
-  输出：
+  输出:
-  ```bash
+  ```text
-    [2022-04-27 10:20:16,660] [    INFO] - The first response time of the 0 warm up: 1.0945196151733398 s
+  [2022-04-27 10:20:16,660] [    INFO] - The first response time of the 0 warm up: 1.0945196151733398 s
-    [2022-04-27 10:20:16,773] [    INFO] - The first response time of the 1 warm up: 0.11222052574157715 s
+  [2022-04-27 10:20:16,773] [    INFO] - The first response time of the 1 warm up: 0.11222052574157715 s
-    [2022-04-27 10:20:16,878] [    INFO] - The first response time of the 2 warm up: 0.10494542121887207 s
+  [2022-04-27 10:20:16,878] [    INFO] - The first response time of the 2 warm up: 0.10494542121887207 s
-    [2022-04-27 10:20:16,878] [    INFO] - **********************************************************************
+  [2022-04-27 10:20:16,878] [    INFO] - **********************************************************************
-    INFO:     Started server process [23466]
+  INFO:     Started server process [23466]
-    [2022-04-27 10:20:16] [INFO] [server.py:75] Started server process [23466]
+  [2022-04-27 10:20:16] [INFO] [server.py:75] Started server process [23466]
-    INFO:     Waiting for application startup.
+  INFO:     Waiting for application startup.
-    [2022-04-27 10:20:16] [INFO] [on.py:45] Waiting for application startup.
+  [2022-04-27 10:20:16] [INFO] [on.py:45] Waiting for application startup.
-    INFO:     Application startup complete.
+  INFO:     Application startup complete.
-    [2022-04-27 10:20:16] [INFO] [on.py:59] Application startup complete.
+  [2022-04-27 10:20:16] [INFO] [on.py:59] Application startup complete.
-    INFO:     Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
+  INFO:     Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
-    [2022-04-27 10:20:16] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
+  [2022-04-27 10:20:16] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
  ```
 #### 4.2 客户端使用方法
 - 命令行 (推荐使用)
-    访问 websocket 流式TTS服务：
+    访问 websocket 流式 TTS 服务：
    若 `127.0.0.1` 不能访问，则需要使用实际服务 IP 地址
@@ -255,9 +253,8 @@
    - 目前代码中只支持单说话人的模型，因此 spk_id 的选择并不生效。流式 TTS 不支持更换采样率，变速和变音量等功能。
    输出:
-    ```bash
+    ```text
    [2022-04-27 10:21:04,262] [    INFO] - tts websocket client start
    [2022-04-27 10:21:04,496] [    INFO] - 句子：您好，欢迎使用百度飞桨语音合成服务。
    [2022-04-27 10:21:04,496] [    INFO] - 首包响应：0.2124948501586914 s
@@ -265,7 +262,6 @@
    [2022-04-27 10:21:07,484] [    INFO] - 音频时长：3.825 s
    [2022-04-27 10:21:07,484] [    INFO] - RTF: 0.8363677006141812
    [2022-04-27 10:21:07,516] [    INFO] - 音频保存至：output.wav
    ```
 - Python API
@@ -282,11 +278,10 @@
      spk_id=0,
      output="./output.wav",
      play=False)
  ```
  输出:
-  ```bash
+  ```text
    [2022-04-27 10:22:48,852] [    INFO] - tts websocket client start
    [2022-04-27 10:22:49,080] [    INFO] - 句子：您好，欢迎使用百度飞桨语音合成服务。
    [2022-04-27 10:22:49,080] [    INFO] - 首包响应：0.21017956733703613 s
@@ -294,8 +289,4 @@
    [2022-04-27 10:22:52,101] [    INFO] - 音频时长：3.825 s
    [2022-04-27 10:22:52,101] [    INFO] - RTF: 0.8445606356352762
    [2022-04-27 10:22:52,134] [    INFO] - 音频保存至：./output.wav
  ```
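Reviewer note: the configuration section of this README describes `voc_block` as the number of valid frames in a chunk and `voc_pad` as the number of frames added before and after the block. The sketch below illustrates that chunking scheme in plain Python. It is not PaddleSpeech's implementation: the function names are hypothetical, `block=36` is an arbitrary illustrative value, and the model is assumed to produce one output item per input frame (a real vocoder upsamples frames to audio samples).

```python
from typing import Callable, Iterator, List, Sequence, Tuple


def padded_chunks(n_frames: int, block: int, pad: int) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (chunk_start, chunk_end, valid_start, valid_end) frame indices.

    chunk_* bounds include up to `pad` context frames on each side;
    valid_* bounds mark the `block` frames whose output is kept.
    """
    for valid_start in range(0, n_frames, block):
        valid_end = min(valid_start + block, n_frames)
        chunk_start = max(0, valid_start - pad)
        chunk_end = min(n_frames, valid_end + pad)
        yield chunk_start, chunk_end, valid_start, valid_end


def stream_infer(frames: Sequence, block: int, pad: int,
                 model: Callable[[Sequence], List]) -> List:
    """Run `model` chunk by chunk and keep only the output of the valid frames."""
    out: List = []
    for c0, c1, v0, v1 in padded_chunks(len(frames), block, pad):
        chunk_out = model(frames[c0:c1])          # inference sees the padded chunk
        out.extend(chunk_out[v0 - c0:v1 - c0])    # drop the output of the pad frames
    return out


# With an identity "model", streaming output equals the non-streaming output.
frames = list(range(100))
assert stream_infer(frames, block=36, pad=14, model=list) == frames
```

The pad frames give the model context at chunk boundaries, and their outputs are discarded. That is why a larger pad brings the streaming result closer to the non-streaming one, at the cost of extra computation per chunk.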