PaddlePaddle
DeepSpeech
Commit cdb9a1b2
Authored Apr 27, 2022 by Hui Zhang; committed by GitHub on Apr 27, 2022
Merge pull request #1813 from Honei/v0.3
[R1.0]update the paddlespeech_client asr_online cli
Parents: bb8785c6, ff7dbcc2
Showing 5 changed files with 123 additions and 38 deletions (+123, -38)
demos/streaming_asr_server/README.md  +10 -9
demos/streaming_asr_server/README_cn.md  +21 -14
examples/voxceleb/sv0/README.md  +1 -1
examples/voxceleb/sv0/local/test.sh  +17 -1
paddlespeech/server/bin/paddlespeech_client.py  +74 -13
demos/streaming_asr_server/README.md @ cdb9a1b2
@@ -31,7 +31,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
 - Command Line (Recommended)
   ```bash
-  # start the service
+  # in PaddleSpeech/demos/streaming_asr_server start the service
   paddlespeech_server start --config_file ./conf/ws_conformer_application.yaml
   ```
@@ -111,6 +111,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
 - Python API
   ```python
+  # in PaddleSpeech/demos/streaming_asr_server directory
   from paddlespeech.server.bin.paddlespeech_server import ServerExecutor
   server_executor = ServerExecutor()
@@ -186,10 +187,11 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
 ### 4. ASR Client Usage
 **Note:** The response time will be slightly longer when using the client for the first time
 - Command Line (Recommended)
   ```
-  paddlespeech_client asr --server_ip 127.0.0.1 --port 8090 --input ./zh.wav --protocol websocket
+  paddlespeech_client asr_online --server_ip 127.0.0.1 --port 8090 --input ./zh.wav
   ```
   Usage:
@@ -204,6 +206,8 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
   - `sample_rate`: Audio ampling rate, default: 16000.
   - `lang`: Language. Default: "zh_cn".
   - `audio_format`: Audio format. Default: "wav".
+  - `punc.server_ip`: punctuation server ip. Default: None.
+  - `punc.server_port`: punctuation server port. Default: None.
   Output:
   ```bash
@@ -275,18 +279,16 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
 - Python API
   ```python
-  from paddlespeech.server.bin.paddlespeech_client import ASRClientExecutor
-  import json
+  from paddlespeech.server.bin.paddlespeech_client import ASROnlineClientExecutor
-  asrclient_executor = ASRClientExecutor()
+  asrclient_executor = ASROnlineClientExecutor()
   res = asrclient_executor(
       input="./zh.wav",
       server_ip="127.0.0.1",
       port=8090,
       sample_rate=16000,
       lang="zh_cn",
-      audio_format="wav",
-      protocol="websocket")
+      audio_format="wav")
   print(res)
   ```
@@ -353,5 +355,4 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
 [2022-04-21 15:59:08,016] [INFO] - receive msg={'asr_results': '我认为跑步最重要的就是给我带来了身体健康'}
 [2022-04-21 15:59:08,024] [INFO] - receive msg={'asr_results': '我认为跑步最重要的就是给我带来了身体健康'}
 [2022-04-21 15:59:12,883] [INFO] - final receive msg={'status': 'ok', 'signal': 'finished', 'asr_results': '我认为跑步最重要的就是给我带来了身体健康'}
 [2022-04-21 15:59:12,884] [INFO] - 我认为跑步最重要的就是给我带来了身体健康
```
```
\ No newline at end of file
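The client output above interleaves partial `receive msg` lines with a single `final receive msg` line. The following is a minimal sketch, not part of this commit, of how a caller might pull the final transcription out of such a log. It assumes the payload is printed as a Python dict literal in exactly the format shown; the helper name `final_transcription` is hypothetical.

```python
import ast
import re
from typing import Optional

# Matches lines such as "[...] [INFO] - final receive msg={...}" (format assumed from the log above).
FINAL_MSG = re.compile(r"final receive msg=(\{.*\})\s*$")


def final_transcription(log_text: str) -> Optional[str]:
    """Return 'asr_results' from the last 'final receive msg' line, or None if absent."""
    result = None
    for line in log_text.splitlines():
        match = FINAL_MSG.search(line)
        if match:
            # The payload looks like a Python dict literal, so literal_eval can read it safely.
            payload = ast.literal_eval(match.group(1))
            result = payload.get("asr_results")
    return result


sample_log = """\
[2022-04-21 15:59:08,024] [INFO] - receive msg={'asr_results': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-04-21 15:59:12,883] [INFO] - final receive msg={'status': 'ok', 'signal': 'finished', 'asr_results': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-04-21 15:59:12,884] [INFO] - 我认为跑步最重要的就是给我带来了身体健康
"""
print(final_transcription(sample_log))
```

Partial `receive msg` lines are skipped on purpose, since only the final message carries the `status` and `signal` fields.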
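demos/streaming_asr_server/README_cn.md @ cdb9a1b2
@@ -5,19 +5,26 @@
 ## Introduction
 This demo shows how to start a streaming speech service and access it. It takes a single command with `paddlespeech_server` and `paddlespeech_client`, or a few lines of Python code.
-The streaming ASR service only supports the `websocket` protocol, not the `http` protocol.
+**The streaming ASR service only supports the `websocket` protocol, not the `http` protocol.**
 ## Usage
 ### 1. Installation
-See the [installation documentation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
+For the detailed PaddleSpeech installation process, see the [installation documentation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
 **paddlepaddle 2.2.1** or above is recommended.
-You can install PaddleSpeech in one of three ways: medium, hard.
+You can install PaddleSpeech in one of two ways: medium, hard.
 ### 2. Prepare the Config File
-For the config files, see `conf/ws_application.yaml` and `conf/ws_conformer_application.yaml`.
-The models currently integrated into the service are DeepSpeech2 and conformer.
+The startup and test scripts for the streaming ASR service are stored in the `PaddleSpeech/demos/streaming_asr_server` directory.
+After downloading `PaddleSpeech`, enter the `PaddleSpeech/demos/streaming_asr_server` directory.
+For the config files, see `conf/ws_application.yaml` and `conf/ws_conformer_application.yaml` in that directory.
+The models currently integrated into the service are DeepSpeech2 and conformer; their config files are:
+* DeepSpeech: `conf/ws_application.yaml`
+* conformer: `conf/ws_conformer_application.yaml`
 The input of this ASR client should be a WAV file (`.wav`), and its sample rate must match the model's sample rate.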
@@ -31,7 +38,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
 - Command Line (Recommended)
   ```bash
-  # start the service
+  # in the PaddleSpeech/demos/streaming_asr_server directory, start the service
   paddlespeech_server start --config_file ./conf/ws_conformer_application.yaml
   ```
@@ -111,6 +118,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
 - Python API
   ```python
+  # in the PaddleSpeech/demos/streaming_asr_server directory
   from paddlespeech.server.bin.paddlespeech_server import ServerExecutor
   server_executor = ServerExecutor()
@@ -185,11 +193,11 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
   ```
 ### 4. ASR Client Usage
 **Note:** The response time will be slightly longer when using the client for the first time
 - Command Line (Recommended)
   ```
-  paddlespeech_client asr --server_ip 127.0.0.1 --port 8090 --input ./zh.wav --protocol websocket
+  paddlespeech_client asr_online --server_ip 127.0.0.1 --port 8090 --input ./zh.wav
   ```
   Usage:
@@ -205,6 +213,8 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
   - `sample_rate`: Audio sampling rate. Default: 16000.
   - `lang`: Model language. Default: zh_cn.
   - `audio_format`: Audio format. Default: wav.
+  - `punc.server_ip`: IP of the punctuation prediction service. Default: None.
+  - `punc.server_port`: Port of the punctuation prediction service. Default: None.
   Output:
@@ -276,18 +286,16 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
 - Python API
   ```python
-  from paddlespeech.server.bin.paddlespeech_client import ASRClientExecutor
-  import json
+  from paddlespeech.server.bin.paddlespeech_client import ASROnlineClientExecutor
-  asrclient_executor = ASRClientExecutor()
+  asrclient_executor = ASROnlineClientExecutor()
   res = asrclient_executor(
       input="./zh.wav",
       server_ip="127.0.0.1",
       port=8090,
       sample_rate=16000,
       lang="zh_cn",
-      audio_format="wav",
-      protocol="websocket")
+      audio_format="wav")
   print(res)
   ```
@@ -354,5 +362,4 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
 [2022-04-21 15:59:08,016] [INFO] - receive msg={'asr_results': '我认为跑步最重要的就是给我带来了身体健康'}
 [2022-04-21 15:59:08,024] [INFO] - receive msg={'asr_results': '我认为跑步最重要的就是给我带来了身体健康'}
 [2022-04-21 15:59:12,883] [INFO] - final receive msg={'status': 'ok', 'signal': 'finished', 'asr_results': '我认为跑步最重要的就是给我带来了身体健康'}
 [2022-04-21 15:59:12,884] [INFO] - 我认为跑步最重要的就是给我带来了身体健康
 ```
examples/voxceleb/sv0/README.md @ cdb9a1b2
@@ -146,6 +146,6 @@ tar -xvf sv0_ecapa_tdnn_voxceleb12_ckpt_0_2_0.tar.gz
 source path.sh
 # If you have processed the data and get the manifest file, you can skip the following 2 steps
-CUDA_VISIBLE_DEVICES= ./local/test.sh ./data sv0_ecapa_tdnn_voxceleb12_ckpt_0_1_2 conf/ecapa_tdnn.yaml
+CUDA_VISIBLE_DEVICES= bash ./local/test.sh ./data sv0_ecapa_tdnn_voxceleb12_ckpt_0_1_2/model/ conf/ecapa_tdnn.yaml
 ```
 The performance of the released models are shown in [this](./RESULTS.md)
examples/voxceleb/sv0/local/test.sh @ cdb9a1b2
@@ -33,10 +33,26 @@ dir=$1
 exp_dir=$2
 conf_path=$3
+# get the gpu nums for training
+ngpu=$(echo $CUDA_VISIBLE_DEVICES | awk -F "," '{print NF}')
+echo "using $ngpu gpus..."
+# setting training device
+device="cpu"
+if ${use_gpu}; then
+    device="gpu"
+fi
+if [ $ngpu -le 0 ]; then
+    echo "no gpu, training in cpu mode"
+    device='cpu'
+    use_gpu=false
+fi
 if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
     # test the model and compute the eer metrics
     python3 ${BIN_DIR}/test.py \
         --data-dir ${dir} \
         --load-checkpoint ${exp_dir} \
-        --config ${conf_path}
+        --config ${conf_path} \
+        --device ${device}
 fi
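The device-selection logic added to `test.sh` counts the comma-separated fields of `CUDA_VISIBLE_DEVICES` with `awk` and falls back to CPU when that count is zero. As an illustration only, the same decision can be sketched in Python; the function names `count_visible_gpus` and `select_device` are hypothetical and do not exist in the repository.

```python
import os


def count_visible_gpus(cuda_visible_devices: str) -> int:
    """Count comma-separated fields, like `awk -F "," '{print NF}'`; empty input gives 0."""
    if not cuda_visible_devices:
        return 0
    return len(cuda_visible_devices.split(","))


def select_device(cuda_visible_devices: str, use_gpu: bool) -> str:
    """Mirror the shell logic: start from use_gpu, then force cpu when no GPU is listed."""
    device = "gpu" if use_gpu else "cpu"
    if count_visible_gpus(cuda_visible_devices) <= 0:
        device = "cpu"
    return device


# An unset variable behaves like an empty string, as it does for `echo` in the script.
print(select_device(os.environ.get("CUDA_VISIBLE_DEVICES", ""), use_gpu=True))
```

This is why the README invocation `CUDA_VISIBLE_DEVICES= bash ./local/test.sh ...` (an empty value) ends up passing `--device cpu` to `test.py`.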
paddlespeech/server/bin/paddlespeech_client.py @ cdb9a1b2
@@ -35,7 +35,7 @@ from paddlespeech.server.utils.util import wav2base64
 __all__ = [
     'TTSClientExecutor', 'TTSOnlineClientExecutor', 'ASRClientExecutor',
-    'CLSClientExecutor'
+    'ASROnlineClientExecutor', 'CLSClientExecutor'
 ]
@@ -370,6 +370,8 @@ class ASRClientExecutor(BaseExecutor):
             str: The ASR results
         """
         # we use the asr server to recognize the audio text content
+        # and paddlespeech_client asr only support http protocol
+        protocol = "http"
         if protocol.lower() == "http":
             from paddlespeech.server.utils.audio_handler import ASRHttpHandler
             logger.info("asr http client start")
@@ -377,18 +379,6 @@ class ASRClientExecutor(BaseExecutor):
             res = handler.run(input, audio_format, sample_rate, lang)
             res = res['result']['transcription']
             logger.info("asr http client finished")
-        elif protocol.lower() == "websocket":
-            logger.info("asr websocket client start")
-            handler = ASRWsAudioHandler(
-                server_ip,
-                port,
-                punc_server_ip=punc_server_ip,
-                punc_server_port=punc_server_port)
-            loop = asyncio.get_event_loop()
-            res = loop.run_until_complete(handler.run(input))
-            res = res['result']
-            logger.info("asr websocket client finished")
         else:
             logger.error(f"Sorry, we have not support protocol: {protocol},"
                          "please use http or websocket protocol")
@@ -397,6 +387,77 @@ class ASRClientExecutor(BaseExecutor):
         return res
+@cli_client_register(
+    name='paddlespeech_client.asr_online',
+    description='visit asr online service')
+class ASROnlineClientExecutor(BaseExecutor):
+    def __init__(self):
+        super(ASROnlineClientExecutor, self).__init__()
+        self.parser = argparse.ArgumentParser(
+            prog='paddlespeech_client.asr_online', add_help=True)
+        self.parser.add_argument(
+            '--server_ip', type=str, default='127.0.0.1', help='server ip')
+        self.parser.add_argument(
+            '--port', type=int, default=8091, help='server port')
+        self.parser.add_argument(
+            '--input',
+            type=str,
+            default=None,
+            help='Audio file to be recognized',
+            required=True)
+        self.parser.add_argument(
+            '--sample_rate', type=int, default=16000, help='audio sample rate')
+        self.parser.add_argument(
+            '--lang', type=str, default="zh_cn", help='language')
+        self.parser.add_argument(
+            '--audio_format', type=str, default="wav", help='audio format')
+
+    def execute(self, argv: List[str]) -> bool:
+        args = self.parser.parse_args(argv)
+        input_ = args.input
+        server_ip = args.server_ip
+        port = args.port
+        sample_rate = args.sample_rate
+        lang = args.lang
+        audio_format = args.audio_format
+        try:
+            time_start = time.time()
+            res = self(
+                input=input_,
+                server_ip=server_ip,
+                port=port,
+                sample_rate=sample_rate,
+                lang=lang,
+                audio_format=audio_format)
+            time_end = time.time()
+            logger.info(res)
+            logger.info("Response time %f s." % (time_end - time_start))
+            return True
+        except Exception as e:
+            logger.error("Failed to speech recognition.")
+            logger.error(e)
+            return False
+
+    @stats_wrapper
+    def __call__(self,
+                 input: str,
+                 server_ip: str="127.0.0.1",
+                 port: int=8091,
+                 sample_rate: int=16000,
+                 lang: str="zh_cn",
+                 audio_format: str="wav"):
+        """
+        Python API to call an executor.
+        """
+        logger.info("asr websocket client start")
+        handler = ASRWsAudioHandler(server_ip, port)
+        loop = asyncio.get_event_loop()
+        res = loop.run_until_complete(handler.run(input))
+        logger.info("asr websocket client finished")
+        return res['result']
+
 @cli_client_register(
     name='paddlespeech_client.cls', description='visit cls service')
 class CLSClientExecutor(BaseExecutor):
...
...
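The new `ASROnlineClientExecutor.__call__` drives an async handler to completion from synchronous code and returns the `'result'` field of its response. The sketch below illustrates that pattern in isolation. It is not the code from this commit: `FakeWsHandler` is a hypothetical stand-in for `ASRWsAudioHandler` that opens no connection, the `ws://` URL string is an assumption made only for display, and `asyncio.run` is swapped in for the `get_event_loop().run_until_complete(...)` call seen in the diff.

```python
import asyncio


class FakeWsHandler:
    """Hypothetical stand-in for ASRWsAudioHandler; it opens no network connection."""

    def __init__(self, server_ip: str, port: int) -> None:
        self.url = f"ws://{server_ip}:{port}"  # assumed format, for illustration only

    async def run(self, wav_path: str) -> dict:
        await asyncio.sleep(0)  # placeholder for streaming audio chunks to a server
        return {"result": f"transcript of {wav_path} via {self.url}"}


def recognize(wav_path: str, server_ip: str = "127.0.0.1", port: int = 8091) -> str:
    """Run the async handler from synchronous code and return its 'result' field."""
    handler = FakeWsHandler(server_ip, port)
    res = asyncio.run(handler.run(wav_path))
    return res["result"]


print(recognize("./zh.wav", port=8090))
```

The default `port=8091` mirrors the default in the diff, while the README examples pass `--port 8090` explicitly, so the sketch overrides it the same way.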