PaddlePaddle
DeepSpeech
Commit 33f0e762
Authored Nov 25, 2021 by KP

Add paddlespeech.cls and esc50 example.

Parent: 476f05c4

Showing 14 changed files with 100 additions and 24 deletions (+100, -24).
Changed files:

examples/esc50/README.md (+16, -8)
examples/esc50/cls0/local/export.sh (+8, -0)
examples/esc50/cls0/local/static_model_infer.sh (+11, -0)
examples/esc50/cls0/run.sh (+11, -1)
paddlespeech/cls/exps/PANNs/__init__.py (+0, -1)
paddlespeech/cls/exps/PANNs/deploy/__init__.py (+13, -0)
paddlespeech/cls/exps/PANNs/deploy/predict.py (+6, -8)
paddlespeech/cls/exps/PANNs/export_model.py (+2, -2)
paddlespeech/cls/exps/PANNs/predict.py (+2, -2)
paddlespeech/cls/exps/PANNs/train.py (+2, -2)
paddlespeech/cls/models/PANNs/__init__.py (+15, -0)
paddlespeech/cls/models/PANNs/classifier.py (+0, -0)
paddlespeech/cls/models/PANNs/panns.py (+0, -0)
paddlespeech/cls/models/__init__.py (+14, -0)
examples/esc50/README.md @ 33f0e762

    @@ -30,7 +30,7 @@ $ CUDA_VISIBLE_DEVICES=0 ./run.sh 1
     Configurable parameters of the `paddlespeech/cls/exps/PANNs/train.py` script:
    -- `device`: Device to train on, either cpu or gpu; defaults to gpu. When training on gpu, the `gpus` parameter specifies the GPU card IDs.
    +- `device`: Device used for model prediction.
     - `feat_backend`: Backend for feature extraction, either `'numpy'` or `'paddle'`; defaults to `'numpy'`.
     - `epochs`: Number of training epochs; defaults to 50.
     - `learning_rate`: Learning rate for fine-tuning; defaults to 5e-5.

    @@ -42,8 +42,8 @@ $ CUDA_VISIBLE_DEVICES=0 ./run.sh 1
     The example code uses `CNN14` as the pretrained model. To switch to a different pretrained model, proceed as follows:
     ```python
    -from model import SoundClassifier
    -from paddlespeech.cls.datasets import ESC50
    +from paddleaudio.datasets import ESC50
    +from paddlespeech.cls.models import SoundClassifier
     from paddlespeech.cls.models import cnn14, cnn10, cnn6
     # CNN14

    @@ -67,7 +67,7 @@ $ CUDA_VISIBLE_DEVICES=0 ./run.sh 2
     Configurable parameters of the `paddlespeech/cls/exps/PANNs/predict.py` script:
    -- `device`: Device to train on, either cpu or gpu; defaults to gpu. When training on gpu, the `gpus` parameter specifies the GPU card IDs.
    +- `device`: Device used for model prediction.
     - `wav`: Audio file to run prediction on.
     - `feat_backend`: Backend for feature extraction, either `'numpy'` or `'paddle'`; defaults to `'numpy'`.
     - `top_k`: Number of top-scoring labels whose scores are displayed; defaults to 1.

    @@ -88,10 +88,10 @@ Cat: 6.579841738130199e-06
     After training finishes, the saved dynamic-graph parameters can be exported as a static-graph model and parameters, which can then be deployed as a static graph.
     ```shell
    -python -u export_model.py --checkpoint ./checkpoint/epoch_50/model.pdparams --output_dir ./export
    +$ CUDA_VISIBLE_DEVICES=0 ./run.sh 3
     ```
    -Configurable parameters:
    +Configurable parameters of the `paddlespeech/cls/exps/PANNs/export_model.py` script:
     - `checkpoint`: Checkpoint file holding the model parameters.
     - `output_dir`: Directory where the exported static-graph model and parameter files are saved.

    @@ -106,8 +106,16 @@ export
     #### 2. Model deployment and prediction
    -The `deploy/python/predict.py` script uses the APIs in the `paddle.inference` module and provides an example of Python-side deployment:
    +The `paddlespeech/cls/exps/PANNs/deploy/predict.py` script uses the APIs in the `paddle.inference` module and provides an example of Python-side deployment:
    +```shell
    +$ CUDA_VISIBLE_DEVICES=0 ./run.sh 3
    +```
     ```sh
    -python deploy/python/predict.py --model_dir ./export --device gpu
    +python paddlespeech/cls/exps/PANNs/deploy/predict.py --model_dir ./export --device gpu
     ```
    +Main configurable parameters of the `paddlespeech/cls/exps/PANNs/deploy/predict.py` script:
    +- `device`: Device used for model prediction.
    +- `model_dir`: Directory where the exported static-graph model and parameter files are saved.
    +- `wav`: Audio file to run prediction on.
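
The README describes `top_k` as the number of top-scoring labels to display, and the sample output in the hunk context (`Cat: 6.579841738130199e-06`) suggests one `label: score` line per label. The sketch below shows one way such a ranking could be produced from raw model outputs. It is not the code in `predict.py`: the softmax step is an assumption (based on `deploy/predict.py` importing `scipy.special.softmax`), and the labels and logits are hypothetical values chosen for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_scores(logits, labels, k=1):
    """Return the k highest-scoring (label, probability) pairs."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical labels and logits, not taken from the ESC50 dataset.
    labels = ["Cat", "Dog", "Rain"]
    logits = [4.0, 1.0, -2.0]
    for label, score in top_k_scores(logits, labels, k=2):
        print(f"{label}: {score}")
```

Subtracting the maximum logit before exponentiating keeps the computation stable for large values without changing the resulting probabilities.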
examples/esc50/cls0/local/export.sh (new file, mode 100755) @ 33f0e762

    #!/bin/bash

    ckpt_dir=$1
    output_dir=$2

    python3 ${BIN_DIR}/export_model.py \
    --checkpoint ${ckpt_dir}/model.pdparams \
    --output_dir ${output_dir}

examples/esc50/cls0/local/static_model_infer.sh (new file, mode 100755) @ 33f0e762

    #!/bin/bash

    device=$1
    model_dir=$2
    audio_file=$3

    python3 ${BIN_DIR}/deploy/predict.py \
    --device ${device} \
    --model_dir ${model_dir} \
    --wav ${audio_file}

examples/esc50/cls0/run.sh @ 33f0e762

    @@ -15,13 +15,23 @@ feat_backend=numpy
     if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
         ./local/train.sh ${ngpu} ${device} ${feat_backend} || exit -1
    +    exit 0
     fi
     audio_file=~/cat.wav
     ckpt_dir=./checkpoint/epoch_50
     if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
         ./local/infer.sh ${device} ${audio_file} ${ckpt_dir} ${feat_backend} || exit -1
    +    exit 0
     fi
    +output_dir=./export
    +if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
    +    ./local/export.sh ${ckpt_dir} ${output_dir} || exit -1
    +    exit 0
    +fi
    -exit 0
    \ No newline at end of file
    +if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then
    +    ./local/static_model_infer.sh ${device} ${output_dir} ${audio_file} || exit -1
    +    exit 0
    +fi
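
A note on control flow in this hunk: every stage block now ends with `exit 0`, so a single invocation of `run.sh` runs at most one stage, namely the first one whose guard passes. The Python sketch below models that behavior. It assumes the stage command succeeds (a failure takes the `|| exit -1` path instead) and that nothing earlier in the script exits. How `stage` and `stop_stage` are set is not visible in the hunk, so they are plain function arguments here.

```python
def stages_executed(stage, stop_stage, num_stages=4):
    """Model run.sh after this commit.

    Each stage block ends with `exit 0`, so only the first stage whose
    guard passes is executed before the script terminates.
    """
    names = {1: "train", 2: "infer", 3: "export", 4: "static_model_infer"}
    for s in range(1, num_stages + 1):
        # Mirrors: [ ${stage} -le s ] && [ ${stop_stage} -ge s ]
        if stage <= s <= stop_stage:
            return [names[s]]  # the trailing `exit 0` stops the script here
    return []


if __name__ == "__main__":
    print(stages_executed(1, 4))
    print(stages_executed(3, 3))
```

Under this model, asking for stages 1 through 4 in one call still runs only training; the later stages each need their own invocation.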
paddlespeech/cls/exps/PANNs/__init__.py @ 33f0e762

    @@ -11,4 +11,3 @@
     # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     # See the License for the specific language governing permissions and
     # limitations under the License.
    -from .panns import *

paddlespeech/cls/exps/PANNs/deploy/__init__.py (new file, mode 100644) @ 33f0e762

    # Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
    #
    # Licensed under the Apache License, Version 2.0 (the "License");
    # you may not use this file except in compliance with the License.
    # You may obtain a copy of the License at
    #
    #     http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.

paddlespeech/cls/exps/PANNs/deploy/python/predict.py → paddlespeech/cls/exps/PANNs/deploy/predict.py @ 33f0e762

    @@ -18,15 +18,16 @@ import numpy as np
     from paddle import inference
     from scipy.special import softmax
    -from paddlespeech.cls.backends import load as load_audio
    -from paddlespeech.cls.datasets import ESC50
    -from paddlespeech.cls.features import melspectrogram
    +from paddleaudio.backends import load as load_audio
    +from paddleaudio.datasets import ESC50
    +from paddleaudio.features import melspectrogram
     # yapf: disable
     parser = argparse.ArgumentParser()
     parser.add_argument("--model_dir", type=str, required=True, default="./export", help="The directory to static model.")
    -parser.add_argument("--batch_size", type=int, default=2, help="Batch size per GPU/CPU for training.")
     parser.add_argument('--device', choices=['cpu', 'gpu', 'xpu'], default="gpu", help="Select which device to train model, defaults to gpu.")
    +parser.add_argument("--wav", type=str, required=True, help="Audio file to infer.")
    +parser.add_argument("--batch_size", type=int, default=1, help="Batch size per GPU/CPU for training.")
     parser.add_argument('--use_tensorrt', type=eval, default=False, choices=[True, False], help='Enable to use tensorrt to speed up.')
     parser.add_argument("--precision", type=str, default="fp32", choices=["fp32", "fp16"], help='The tensorrt precision.')
     parser.add_argument('--cpu_threads', type=int, default=10, help='Number of threads to predict when using cpu.')

    @@ -132,10 +133,7 @@ if __name__ == "__main__":
             args.use_tensorrt, args.precision, args.cpu_threads,
             args.enable_mkldnn)
    -    wavs = [
    -        '~/audio_demo_resource/cat.wav',
    -        '~/audio_demo_resource/dog.wav',
    -    ]
    +    wavs = [args.wav]
         for i in range(len(wavs)):
             wavs[i] = os.path.abspath(os.path.expanduser(wavs[i]))
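
This hunk replaces the hard-coded demo files with a required `--wav` argument, while the unchanged `--use_tensorrt` option still parses its value with `type=eval`. The sketch below reproduces a subset of the parser using only the standard library and swaps `eval` for an explicit boolean converter. The `str2bool`, `build_parser`, and `resolve_wavs` helpers are hypothetical names introduced here, not part of the commit, and the converter is a suggested alternative rather than what `deploy/predict.py` does.

```python
import argparse
import os


def str2bool(value):
    """Parse common true/false spellings without calling eval()."""
    lowered = value.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Subset of the deploy/predict.py options shown in the diff."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_dir", type=str, default="./export",
                        help="The directory to static model.")
    parser.add_argument("--device", choices=["cpu", "gpu", "xpu"], default="gpu")
    parser.add_argument("--wav", type=str, required=True,
                        help="Audio file to infer.")
    parser.add_argument("--use_tensorrt", type=str2bool, default=False)
    return parser


def resolve_wavs(args):
    """Same normalization as the diff: expand `~`, then make the path absolute."""
    return [os.path.abspath(os.path.expanduser(args.wav))]


if __name__ == "__main__":
    demo = build_parser().parse_args(["--wav", "~/cat.wav", "--device", "cpu"])
    print(resolve_wavs(demo))
```

The `expanduser` call matters because a quoted or programmatically supplied `~/cat.wav` reaches Python unexpanded; only an unquoted `~` is expanded by the shell.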
paddlespeech/cls/exps/PANNs/export_model.py @ 33f0e762

    @@ -16,9 +16,9 @@ import os
     import paddle
    -from .model import SoundClassifier
    -from .panns import cnn14
     from paddleaudio.datasets import ESC50
    +from paddlespeech.cls.models import cnn14
    +from paddlespeech.cls.models import SoundClassifier
     # yapf: disable
     parser = argparse.ArgumentParser(__doc__)

paddlespeech/cls/exps/PANNs/predict.py @ 33f0e762

    @@ -16,13 +16,13 @@ import argparse
     import numpy as np
     import paddle
     import paddle.nn.functional as F
    -from model import SoundClassifier
    -from panns import cnn14
     from paddleaudio.backends import load as load_audio
     from paddleaudio.datasets import ESC50
     from paddleaudio.features import LogMelSpectrogram
     from paddleaudio.features import melspectrogram
    +from paddlespeech.cls.models import cnn14
    +from paddlespeech.cls.models import SoundClassifier
     # yapf: disable
     parser = argparse.ArgumentParser(__doc__)

paddlespeech/cls/exps/PANNs/train.py @ 33f0e762

    @@ -15,13 +15,13 @@ import argparse
     import os
     import paddle
    -from model import SoundClassifier
    -from panns import cnn14
     from paddleaudio.datasets import ESC50
     from paddleaudio.features import LogMelSpectrogram
     from paddleaudio.utils import logger
     from paddleaudio.utils import Timer
    +from paddlespeech.cls.models import cnn14
    +from paddlespeech.cls.models import SoundClassifier
     # yapf: disable
     parser = argparse.ArgumentParser(__doc__)

paddlespeech/cls/models/PANNs/__init__.py (new file, mode 100644) @ 33f0e762

    # Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
    #
    # Licensed under the Apache License, Version 2.0 (the "License");
    # you may not use this file except in compliance with the License.
    # You may obtain a copy of the License at
    #
    #     http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    from .classifier import *
    from .panns import *

paddlespeech/cls/exps/PANNs/model.py → paddlespeech/cls/models/PANNs/classifier.py @ 33f0e762
File moved.

paddlespeech/cls/exps/PANNs/panns.py → paddlespeech/cls/models/PANNs/panns.py @ 33f0e762
File moved.

paddlespeech/cls/models/__init__.py (new file, mode 100644) @ 33f0e762

    # Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
    #
    # Licensed under the Apache License, Version 2.0 (the "License");
    # you may not use this file except in compliance with the License.
    # You may obtain a copy of the License at
    #
    #     http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    from .PANNs import *
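
The two new `__init__.py` files chain wildcard imports (`models/__init__.py` pulls from `.PANNs`, which pulls from `.classifier` and `.panns`), which is what lets the scripts write `from paddlespeech.cls.models import cnn14` and `from paddlespeech.cls.models import SoundClassifier`. The sketch below rebuilds that layout in a temporary directory to show the mechanism. The package name `demo_cls` and the stub class and function bodies are hypothetical stand-ins, and the sketch assumes the real modules do not restrict their exports with `__all__`.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root):
    """Create a throwaway package mirroring the layout added in this commit."""
    pkg = Path(root) / "demo_cls"
    panns = pkg / "models" / "PANNs"
    panns.mkdir(parents=True)
    (pkg / "__init__.py").write_text("")
    (pkg / "models" / "__init__.py").write_text("from .PANNs import *\n")
    (panns / "__init__.py").write_text(
        "from .classifier import *\nfrom .panns import *\n")
    # Stub bodies: the real classifier.py and panns.py are not shown in the diff.
    (panns / "classifier.py").write_text("class SoundClassifier:\n    pass\n")
    (panns / "panns.py").write_text("def cnn14():\n    return 'cnn14 stub'\n")


def load_flat_imports():
    """Import the stubs through the top-level `models` package."""
    with tempfile.TemporaryDirectory() as root:
        build_demo_package(root)
        sys.path.insert(0, root)
        try:
            importlib.invalidate_caches()
            models = importlib.import_module("demo_cls.models")
            return models.SoundClassifier.__name__, models.cnn14()
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(root)
            for name in [m for m in sys.modules if m.split(".")[0] == "demo_cls"]:
                del sys.modules[name]


if __name__ == "__main__":
    print(load_flat_imports())
```

One trade-off of wildcard re-exports is that every public name in the submodules becomes part of the package namespace unless the submodules define `__all__`.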