Unverified · Commit b66838fa · Authored by: Hui Zhang · Committed by: GitHub

Merge pull request #1811 from Honei/v0.3

[R1.0]update the streaming asr server readme
@@ -5,6 +5,7 @@
## Introduction
This demo is an implementation of starting the streaming speech service and accessing the service. It can be achieved with a single command using `paddlespeech_server` and `paddlespeech_client`, or with a few lines of Python code.
The streaming ASR server only supports the `websocket` protocol; it does not support the `http` protocol.
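Because the server is websocket-only, a client streams audio to it in small consecutive chunks over one persistent connection rather than posting the whole file at once. A minimal sketch of the chunking step (the helper name and the 100 ms chunk size are illustrative assumptions, not the PaddleSpeech API):

```python
# Sketch: split a raw mono PCM buffer into fixed-duration chunks for streaming.
# The helper name and chunk duration are illustrative, not the PaddleSpeech API.

def pcm_chunks(pcm: bytes, sample_rate: int = 16000,
               sample_width: int = 2, chunk_ms: int = 100):
    """Yield successive chunk_ms-sized slices of a PCM byte buffer."""
    step = sample_rate * sample_width * chunk_ms // 1000  # bytes per chunk
    for start in range(0, len(pcm), step):
        yield pcm[start:start + step]

# One second of silence at 16 kHz, 16-bit mono -> ten 100 ms chunks.
chunks = list(pcm_chunks(b"\x00" * 32000))
print(len(chunks), len(chunks[0]))  # 10 3200
```

Each chunk would then be sent as one websocket message, with the server returning partial transcriptions as they become available.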
## Usage
### 1. Installation
@@ -114,7 +115,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
server_executor = ServerExecutor()
server_executor(
config_file="./conf/ws_conformer_application.yaml",
log_file="./log/paddlespeech.log")
```
@@ -188,7 +189,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
**Note:** The response time will be slightly longer when using the client for the first time
- Command Line (Recommended)
```
-paddlespeech_client asr_online --server_ip 127.0.0.1 --port 8090 --input ./zh.wav
+paddlespeech_client asr --server_ip 127.0.0.1 --port 8090 --input ./zh.wav --protocol websocket
```
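When transcribing many files it can be convenient to assemble this command programmatically. This small sketch is an illustration (the helper function is hypothetical; the flags are only those shown above):

```python
# Hypothetical helper: build the argument list for the client command above,
# e.g. for driving batch transcription with subprocess. Not a PaddleSpeech API.

def asr_client_cmd(wav_path, server_ip="127.0.0.1", port=8090):
    return ["paddlespeech_client", "asr",
            "--server_ip", server_ip,
            "--port", str(port),
            "--input", wav_path,
            "--protocol", "websocket"]

print(" ".join(asr_client_cmd("./zh.wav")))
```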
Usage:
@@ -284,8 +285,9 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
port=8090,
sample_rate=16000,
lang="zh_cn",
-audio_format="wav")
-print(res.json())
+audio_format="wav",
+protocol="websocket")
+print(res)
```
Output:
......
@@ -5,13 +5,14 @@
## Introduction
This demo is an implementation of starting the streaming speech service and accessing the service. It can be achieved with a single command using `paddlespeech_server` and `paddlespeech_client`, or with a few lines of Python code.
The streaming ASR service only supports the `websocket` protocol; it does not support the `http` protocol.
## Usage
### 1. Installation
See the [installation documentation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
**paddlepaddle 2.2.1** or above is recommended.
-You can choose one of three methods (medium, hard) to install PaddleSpeech.
+You can choose one of two methods, medium or hard, to install PaddleSpeech.
### 2. Prepare the configuration file
@@ -187,7 +188,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
**Note:** The response time will be slightly longer when using the client for the first time
- Command Line (Recommended)
```
-paddlespeech_client asr_online --server_ip 127.0.0.1 --port 8090 --input ./zh.wav
+paddlespeech_client asr --server_ip 127.0.0.1 --port 8090 --input ./zh.wav --protocol websocket
```
@@ -275,18 +276,19 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
- Python API
```python
-from paddlespeech.server.bin.paddlespeech_client import ASROnlineClientExecutor
+from paddlespeech.server.bin.paddlespeech_client import ASRClientExecutor
import json
-asrclient_executor = ASROnlineClientExecutor()
+asrclient_executor = ASRClientExecutor()
res = asrclient_executor(
input="./zh.wav",
server_ip="127.0.0.1",
port=8090,
sample_rate=16000,
lang="zh_cn",
-audio_format="wav")
-print(res.json())
+audio_format="wav",
+protocol="websocket")
+print(res)
```
Output:
......
# This is the parameter configuration file for PaddleSpeech Serving.
#################################################################################
# SERVER SETTING #
#################################################################################
host: 0.0.0.0
port: 8090
# The task format in the engine_list is: <speech task>_<engine type>
# task choices = ['asr_online']
# protocol = ['websocket'] (only one can be selected).
# websocket only supports the online engine type.
protocol: 'websocket'
engine_list: ['asr_online']
#################################################################################
# ENGINE CONFIG #
#################################################################################
################################### ASR #########################################
################### speech task: asr; engine_type: online #######################
asr_online:
model_type: 'conformer_online_multicn'
am_model: # the pdmodel file of am static model [optional]
am_params: # the pdiparams file of am static model [optional]
lang: 'zh'
sample_rate: 16000
cfg_path:
decode_method:
force_yes: True
device: # cpu or gpu:id
am_predictor_conf:
device: # set 'gpu:id' or 'cpu'
switch_ir_optim: True
glog_info: False # True -> print glog
summary: True # False -> do not show predictor config
chunk_buffer_conf:
window_n: 7 # frame
shift_n: 4 # frame
window_ms: 25 # ms
shift_ms: 10 # ms
sample_rate: 16000
sample_width: 2
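As a sanity check, the `chunk_buffer_conf` values above imply the following frame sizes (plain arithmetic on the config values, not a PaddleSpeech API):

```python
# Frame sizes implied by chunk_buffer_conf: 25 ms windows with a 10 ms shift
# at 16 kHz, 16-bit (2-byte) samples.
sample_rate = 16000    # Hz
sample_width = 2       # bytes per sample
window_ms = 25
shift_ms = 10

window_samples = sample_rate * window_ms // 1000   # samples per analysis window
shift_samples = sample_rate * shift_ms // 1000     # samples per hop
window_bytes = window_samples * sample_width
shift_bytes = shift_samples * sample_width
print(window_samples, shift_samples, window_bytes, shift_bytes)  # 400 160 800 320
```

So each analysis window covers 400 samples (800 bytes) and consecutive windows overlap by 15 ms.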
@@ -7,8 +7,8 @@ host: 0.0.0.0
port: 8090
# The task format in the engine_list is: <speech task>_<engine type>
-# task choices = ['asr_online', 'tts_online']
-# protocol = ['websocket', 'http'] (only one can be selected).
+# task choices = ['asr_online']
+# protocol = ['websocket'] (only one can be selected).
# websocket only supports the online engine type.
protocol: 'websocket'
engine_list: ['asr_online']
......
@@ -7,8 +7,8 @@ host: 0.0.0.0
port: 8090
# The task format in the engine_list is: <speech task>_<engine type>
-# task choices = ['asr_online', 'tts_online']
-# protocol = ['websocket', 'http'] (only one can be selected).
+# task choices = ['asr_online']
+# protocol = ['websocket'] (only one can be selected).
# websocket only supports the online engine type.
protocol: 'websocket'
engine_list: ['asr_online']
......
@@ -142,7 +142,7 @@ using the `tar` scripts to unpack the model and then you can use the script to t
For example:
```
wget https://paddlespeech.bj.bcebos.com/vector/voxceleb/sv0_ecapa_tdnn_voxceleb12_ckpt_0_2_0.tar.gz
-tar xzvf sv0_ecapa_tdnn_voxceleb12_ckpt_0_2_0.tar.gz
+tar -xvf sv0_ecapa_tdnn_voxceleb12_ckpt_0_2_0.tar.gz
source path.sh
# If you have processed the data and get the manifest file, you can skip the following 2 steps
......
@@ -42,15 +42,25 @@ device="cpu"
if ${use_gpu}; then
device="gpu"
fi
if [ $ngpu -le 0 ]; then
echo "no gpu, training in cpu mode"
device='cpu'
use_gpu=false
fi
if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
# train the speaker identification task with voxceleb data
# and we will create the trained model parameters in ${exp_dir}/model.pdparams as the soft link
# Note: we will store the log file in exp/log directory
-    python3 -m paddle.distributed.launch --gpus=$CUDA_VISIBLE_DEVICES \
-        ${BIN_DIR}/train.py --device ${device} --checkpoint-dir ${exp_dir} \
-        --data-dir ${dir} --config ${conf_path}
+    if $use_gpu; then
+        python3 -m paddle.distributed.launch --gpus=$CUDA_VISIBLE_DEVICES \
+            ${BIN_DIR}/train.py --device ${device} --checkpoint-dir ${exp_dir} \
+            --data-dir ${dir} --config ${conf_path}
+    else
+        python3 \
+            ${BIN_DIR}/train.py --device ${device} --checkpoint-dir ${exp_dir} \
+            --data-dir ${dir} --config ${conf_path}
+    fi
fi
if [ $? -ne 0 ]; then
......
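The CPU fallback above assumes `ngpu` has already been computed earlier in the script. One common way to derive it (an illustrative assumption, not taken from this repository) is to count the device ids listed in `CUDA_VISIBLE_DEVICES`:

```shell
#!/usr/bin/env bash
# Illustrative helper, not part of the repo's script: count visible GPUs
# from CUDA_VISIBLE_DEVICES, a comma-separated list of device ids.
CUDA_VISIBLE_DEVICES="0,1,3"

if [ -z "${CUDA_VISIBLE_DEVICES}" ]; then
    ngpu=0   # nothing visible -> train in CPU mode
else
    # NF counts comma-separated fields, i.e. the number of device ids
    ngpu=$(printf '%s' "${CUDA_VISIBLE_DEVICES}" | awk -F',' '{print NF}')
fi
echo "ngpu=${ngpu}"   # ngpu=3
```

With `ngpu` set this way, the `[ $ngpu -le 0 ]` guard above cleanly forces `device='cpu'` on GPU-less machines.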