PaddlePaddle / DeepSpeech
Commit b3e4e26c
Authored Dec 08, 2021 by KP

Update asr and audio tagging demo.

Parent: f053dce3
Changes: 4
Showing 4 changed files with 49 additions and 90 deletions
demos/audio_tagging/README.md (+24 -8)
demos/audio_tagging/tag.py (+0 -37)
demos/speech_recognition/README.md (+25 -8)
demos/speech_recognition/asr.py (+0 -37)
demos/audio_tagging/README.md @ b3e4e26c
@@ -7,7 +7,7 @@ This demo is an implementation to tag an audio file with 527 [AudioSet](https://
 ## Usage
 ### 1. Installation
-```sh
+```bash
 pip install paddlespeech
 ```
@@ -15,16 +15,20 @@ pip install paddlespeech
 Input of this demo should be a WAV file(`.wav`).
 Here are sample files for this demo that can be downloaded:
-```sh
+```bash
 wget https://paddlespeech.bj.bcebos.com/PaddleAudio/cat.wav https://paddlespeech.bj.bcebos.com/PaddleAudio/dog.wav
 ```
 ### 3. Usage
 - Command Line(Recommended)
-  ```sh
+  ```bash
   paddlespeech cls --input ~/cat.wav --topk 10
   ```
-  Command usage:
+  Usage:
+  ```bash
+  paddlespeech cls --help
+  ```
+  Arguments:
   - `input`(required): Audio file to tag.
   - `model`: Model type of tagging task. Default: `panns_cnn14`.
   - `config`: Config of tagging task. Use pretrained model when it is None. Default: `None`.
@@ -34,7 +38,7 @@ wget https://paddlespeech.bj.bcebos.com/PaddleAudio/cat.wav https://paddlespeech
   - `device`: Choose device to execute model inference. Default: default device of paddlepaddle in current environment.
   Output:
-  ```sh
+  ```bash
   [2021-12-08 14:49:40,671] [    INFO] [utils.py] [L225] - CLS Result:
   Cat: 0.8991316556930542
   Domestic animals, pets: 0.8806838393211365
@@ -49,11 +53,23 @@ wget https://paddlespeech.bj.bcebos.com/PaddleAudio/cat.wav https://paddlespeech
   ```
 - Python API
-  ```sh
-  python tag.py --input ~/cat.wav
+  ```bash
+  import paddle
+  from paddlespeech.cli import CLSExecutor
+
+  cls_executor = CLSExecutor()
+  result = cls_executor(
+      model_type='panns_cnn14',
+      cfg_path=None,  # Set `cfg_path` and `ckpt_path` to None to use pretrained model.
+      label_file=None,
+      ckpt_path=None,
+      audio_file='./cat.wav',
+      topk=10,
+      device=paddle.get_device(), )
+  print('CLS Result: \n{}'.format(result))
   ```
   Output:
-  ```sh
+  ```bash
   CLS Result:
   Cat: 0.8991316556930542
   Domestic animals, pets: 0.8806838393211365
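The sample output in this README is plain text with one `label: score` pair per line. As a rough sketch of how a caller might turn that text into a dictionary, assuming the value returned by `CLSExecutor` is a string in exactly the format shown in the sample (the diff does not confirm this), a hypothetical helper `parse_cls_result` could look like this:

```python
from typing import Dict


def parse_cls_result(result: str) -> Dict[str, float]:
    """Parse lines such as 'Cat: 0.89' into {'Cat': 0.89}.

    Hypothetical helper; assumes the 'label: score' layout shown in the README sample.
    """
    scores: Dict[str, float] = {}
    for line in result.splitlines():
        # Split on the LAST colon so labels that contain commas or colons stay intact.
        label, separator, score = line.rpartition(":")
        if not separator:
            continue  # no colon on this line
        try:
            scores[label.strip()] = float(score)
        except ValueError:
            continue  # e.g. the 'CLS Result:' heading, which has no score
    return scores


sample = (
    "CLS Result:\n"
    "Cat: 0.8991316556930542\n"
    "Domestic animals, pets: 0.8806838393211365"
)
parsed = parse_cls_result(sample)
print(parsed)
```

Splitting on the last colon is a deliberate choice here, since AudioSet labels such as `Domestic animals, pets` contain punctuation of their own.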
demos/audio_tagging/tag.py (deleted, 100644 → 0) @ f053dce3
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import argparse

import paddle
from paddlespeech.cli import CLSExecutor

# yapf: disable
parser = argparse.ArgumentParser()
parser.add_argument('--input', type=str, required=True, help='Audio file to recognize.')
args = parser.parse_args()
# yapf: enable

if __name__ == '__main__':
    cls_executor = CLSExecutor()
    result = cls_executor(
        model_type='panns_cnn14',
        cfg_path=None,  # Set `cfg_path` and `ckpt_path` to None to use pretrained model.
        label_file=None,
        ckpt_path=None,
        audio_file=args.input,
        topk=10,
        device=paddle.get_device(), )
    print('CLS Result: \n{}'.format(result))
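With `tag.py` removed, anything that used to call `python tag.py --input ...` would presumably call the `paddlespeech cls` command instead. The following is only a sketch of building that command for `subprocess`; the helper name `build_cls_command` is hypothetical, and only the `--input` and `--topk` flags shown in the README are used:

```python
import os
from typing import List


def build_cls_command(audio_file: str, topk: int = 10) -> List[str]:
    """Build the argument list for `paddlespeech cls` (hypothetical helper)."""
    # subprocess does not go through a shell by default, so `~` is expanded here.
    audio_path = os.path.expanduser(audio_file)
    return ["paddlespeech", "cls", "--input", audio_path, "--topk", str(topk)]


command = build_cls_command("~/cat.wav")
print(command)

# To actually run it (requires `pip install paddlespeech`):
# import subprocess
# subprocess.run(command, check=True)
```

Passing a list rather than a single string avoids shell quoting problems with file names that contain spaces.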
demos/speech_recognition/README.md @ b3e4e26c
@@ -7,7 +7,7 @@ This demo is an implementation to recognize text from a specific audio file. It
 ## Usage
 ### 1. Installation
-```sh
+```bash
 pip install paddlespeech
 ```
@@ -15,16 +15,20 @@ pip install paddlespeech
 Input of this demo should be a WAV file(`.wav`), and the sample rate must be same as the model's.
 Here are sample files for this demo that can be downloaded:
-```sh
+```bash
 wget https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespeech.bj.bcebos.com/PaddleAudio/en.wav
 ```
 ### 3. Usage
 - Command Line(Recommended)
-  ```sh
+  ```bash
   paddlespeech asr --input ~/zh.wav
   ```
-  Command usage:
+  Usage:
+  ```bash
+  paddlespeech asr --help
+  ```
+  Arguments:
   - `input`(required): Audio file to recognize.
   - `model`: Model type of asr task. Default: `conformer_wenetspeech`.
   - `lang`: Model language. Default: `zh`.
@@ -34,16 +38,29 @@ wget https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespeech.
   - `device`: Choose device to execute model inference. Default: default device of paddlepaddle in current environment.
   Output:
-  ```sh
+  ```bash
   [2021-12-08 13:12:34,063] [    INFO] [utils.py] [L225] - ASR Result: 我认为跑步最重要的就是给我带来了身体健康
   ```
 - Python API
-  ```sh
-  python asr.py --input ~/zh.wav
+  ```python
+  import paddle
+  from paddlespeech.cli import ASRExecutor
+
+  asr_executor = ASRExecutor()
+  text = asr_executor(
+      model='conformer_wenetspeech',
+      lang='zh',
+      sample_rate=16000,
+      config=None,  # Set `conf` and `ckpt_path` to None to use pretrained model.
+      ckpt_path=None,
+      audio_file='./zh.wav',
+      device=paddle.get_device())
+  print('ASR Result: \n{}'.format(text))
   ```
   Output:
-  ```sh
+  ```bash
   ASR Result:
   我认为跑步最重要的就是给我带来了身体健康
   ```
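The README requires the input WAV to have the same sample rate as the model, and the Python API example passes `sample_rate=16000`. One way to check a file before recognition, sketched here with only the standard-library `wave` module, is shown below. The helper names are hypothetical, and the check only works for uncompressed PCM WAV files that `wave` can read:

```python
import io
import wave


def wav_sample_rate(source) -> int:
    """Return the sample rate (Hz) of a PCM WAV given a path or a binary file object."""
    with wave.open(source, "rb") as wav_file:
        return wav_file.getframerate()


def matches_model_rate(source, expected_rate: int = 16000) -> bool:
    """True if the WAV sample rate equals the rate the model expects."""
    return wav_sample_rate(source) == expected_rate


# Demo input: 0.1 s of 16 kHz, 16-bit mono silence built in memory,
# so the sketch runs without downloading zh.wav.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as writer:
    writer.setnchannels(1)
    writer.setsampwidth(2)
    writer.setframerate(16000)
    writer.writeframes(b"\x00\x00" * 1600)
buffer.seek(0)

print(matches_model_rate(buffer))
```

For a real file, the call would be `matches_model_rate("./zh.wav")`; a mismatch means the audio has to be resampled before it is passed to the executor.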
demos/speech_recognition/asr.py (deleted, 100644 → 0) @ f053dce3
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import argparse

import paddle
from paddlespeech.cli import ASRExecutor

# yapf: disable
parser = argparse.ArgumentParser()
parser.add_argument('--input', type=str, required=True, help='Audio file to recognize.')
args = parser.parse_args()
# yapf: enable

if __name__ == '__main__':
    asr_executor = ASRExecutor()
    text = asr_executor(
        model='conformer_wenetspeech',
        lang='zh',
        sample_rate=16000,
        config=None,  # Set `conf` and `ckpt_path` to None to use pretrained model.
        ckpt_path=None,
        audio_file=args.input,
        device=paddle.get_device(), )
    print('ASR Result: \n{}'.format(text))
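Callers that used to read the output of `asr.py` may see a different shape from the CLI. In the README samples, the Python API prints the text on the line after `ASR Result:`, while `paddlespeech asr` logs it on the same line as a timestamped prefix. A minimal sketch that accepts both shapes follows; the helper name `extract_asr_text` is hypothetical, and the exact spacing of the log prefix is assumed from the sample rather than confirmed:

```python
import re
from typing import Optional

# Matches the marker, any whitespace (including a newline), then the rest of that line.
_ASR_RESULT = re.compile(r"ASR Result:\s*(.+)")


def extract_asr_text(output: str) -> Optional[str]:
    """Return the recognized text from CLI log output or script output, else None."""
    match = _ASR_RESULT.search(output)
    return match.group(1).strip() if match else None


cli_line = (
    "[2021-12-08 13:12:34,063] [    INFO] [utils.py] [L225] - "
    "ASR Result: 我认为跑步最重要的就是给我带来了身体健康"
)
api_output = "ASR Result: \n我认为跑步最重要的就是给我带来了身体健康"

print(extract_asr_text(cli_line))
print(extract_asr_text(api_output))
```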