PaddlePaddle / DeepSpeech
Commit 7ec0ed4a
Committed on Nov 23, 2021
Author: Hui Zhang

kaldi feat dither when train

Parent: 6750770e

3 changed files with 15 additions and 11 deletions:
- docs/source/released_model.md (+8, -8)
- paddlespeech/s2t/frontend/featurizer/text_featurizer.py (+2, -0)
- paddlespeech/s2t/transform/spectrogram.py (+5, -3)
docs/source/released_model.md

@@ -5,13 +5,13 @@
 ### Acoustic Model Released in paddle 2.X
 Acoustic Model | Training Data | Token-based | Size | Descriptions | CER | WER | Hours of speech | example link
 :-------------:| :------------:| :-----: | -----: | :----------------- |:--------- | :---------- | :--------- | :-----------
-[Ds2 Online Aishell S0 Model](https://deepspeech.bj.bcebos.com/release2.2/aishell/s0/ds2_online_aishll_CER8.02_release.tar.gz) | Aishell Dataset | Char-based | 345 MB | 2 Conv + 5 LSTM layers with only forward direction | 0.080218 |-| 151 h | [D2 Online Aishell S0 Example](../../examples/aishell/s0)
+[Ds2 Online Aishell ASR0 Model](https://deepspeech.bj.bcebos.com/release2.2/aishell/s0/ds2_online_aishll_CER8.02_release.tar.gz) | Aishell Dataset | Char-based | 345 MB | 2 Conv + 5 LSTM layers with only forward direction | 0.080218 |-| 151 h | [D2 Online Aishell S0 Example](../../examples/aishell/asr0)
-[Ds2 Offline Aishell S0 Model](https://deepspeech.bj.bcebos.com/release2.1/aishell/s0/aishell.s0.ds2.offline.cer6p65.release.tar.gz) | Aishell Dataset | Char-based | 306 MB | 2 Conv + 3 bidirectional GRU layers| 0.065 |-| 151 h | [Ds2 Offline Aishell S0 Example](../../examples/aishell/s0)
+[Ds2 Offline Aishell ASR0 Model](https://deepspeech.bj.bcebos.com/release2.1/aishell/s0/aishell.s0.ds2.offline.cer6p65.release.tar.gz) | Aishell Dataset | Char-based | 306 MB | 2 Conv + 3 bidirectional GRU layers| 0.065 |-| 151 h | [Ds2 Offline Aishell S0 Example](../../examples/aishell/asr0)
-[Conformer Online Aishell S1 Model](https://deepspeech.bj.bcebos.com/release2.1/aishell/s1/aishell.chunk.release.tar.gz) | Aishell Dataset | Char-based | 283 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention rescoring | 0.0594 |-| 151 h | [Conformer Online Aishell S1 Example](../../examples/aishell/s1)
+[Conformer Online Aishell ASR1 Model](https://deepspeech.bj.bcebos.com/release2.1/aishell/s1/aishell.chunk.release.tar.gz) | Aishell Dataset | Char-based | 283 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention rescoring | 0.0594 |-| 151 h | [Conformer Online Aishell S1 Example](../../examples/aishell/s1)
-[Conformer Offline Aishell S1 Model](https://deepspeech.bj.bcebos.com/release2.1/aishell/s1/aishell.release.tar.gz) | Aishell Dataset | Char-based | 284 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention rescoring | 0.0547 |-| 151 h | [Conformer Offline Aishell S1 Example](../../examples/aishell/s1)
+[Conformer Offline Aishell ASR1 Model](https://deepspeech.bj.bcebos.com/release2.1/aishell/s1/aishell.release.tar.gz) | Aishell Dataset | Char-based | 284 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention rescoring | 0.0547 |-| 151 h | [Conformer Offline Aishell S1 Example](../../examples/aishell/s1)
-[Conformer Librispeech S1 Model](https://deepspeech.bj.bcebos.com/release2.1/librispeech/s1/conformer.release.tar.gz) | Librispeech Dataset | subword-based | 287 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention rescoring |-| 0.0325 | 960 h | [Conformer Librispeech S1 example](../../example/librispeech/s1)
+[Conformer Librispeech ASR1 Model](https://deepspeech.bj.bcebos.com/release2.1/librispeech/s1/conformer.release.tar.gz) | Librispeech Dataset | subword-based | 287 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention rescoring |-| 0.0325 | 960 h | [Conformer Librispeech S1 example](../../example/librispeech/s1)
-[Transformer Librispeech S1 Model](https://deepspeech.bj.bcebos.com/release2.2/librispeech/s1/librispeech.s1.transformer.all.wer5p62.release.tar.gz) | Librispeech Dataset | subword-based | 131 MB | Encoder:Transformer, Decoder:Transformer, Decoding method: Attention rescoring |-| 0.0456 | 960 h | [Transformer Librispeech S1 example](../../example/librispeech/s1)
+[Transformer Librispeech ASR1 Model](https://deepspeech.bj.bcebos.com/release2.2/librispeech/s1/librispeech.s1.transformer.all.wer5p62.release.tar.gz) | Librispeech Dataset | subword-based | 131 MB | Encoder:Transformer, Decoder:Transformer, Decoding method: Attention rescoring |-| 0.0456 | 960 h | [Transformer Librispeech S1 example](../../example/librispeech/s1)
-[Transformer Librispeech S2 Model](https://deepspeech.bj.bcebos.com/release2.2/librispeech/s2/libri_transformer_espnet_wer3p84.release.tar.gz) | Librispeech Dataset | subword-based | 131 MB | Encoder:Transformer, Decoder:Transformer, Decoding method: Attention |-| 0.0384 | 960 h | [Transformer Librispeech S2 example](../../example/librispeech/s2)
+[Transformer Librispeech ASR2 Model](https://deepspeech.bj.bcebos.com/release2.2/librispeech/s2/libri_transformer_espnet_wer3p84.release.tar.gz) | Librispeech Dataset | subword-based | 131 MB | Encoder:Transformer, Decoder:Transformer, Decoding method: Attention |-| 0.0384 | 960 h | [Transformer Librispeech S2 example](../../example/librispeech/s2)
 ### Acoustic Model Transformed from paddle 1.8
 Acoustic Model | Training Data | Token-based | Size | Descriptions | CER | WER | Hours of speech

@@ -32,7 +32,7 @@ Language Model | Training Data | Token-based | Size | Descriptions
 ### Acoustic Models
 Model Type | Dataset| Example Link | Pretrained Models|Static Models|Siize(static)
 :-------------:| :------------:| :-----: | :-----:| :-----:| :-----:
-Tacotron2|LJSpeech|[tacotron2-vctk](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/ljspeech/tts0)|[tacotron2_ljspeech_ckpt_0.3.zip](https://paddlespeech.bj.bcebos.com/Parakeet/tacotron2_ljspeech_ckpt_0.3.zip)|||
+Tacotron2|LJSpeech|[tacotron2-vctk](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/ljspeech/ttasr0)|[tacotron2_ljspeech_ckpt_0.3.zip](https://paddlespeech.bj.bcebos.com/Parakeet/tacotron2_ljspeech_ckpt_0.3.zip)|||
 TransformerTTS| LJSpeech|[transformer-ljspeech](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/ljspeech/tts1)|[transformer_tts_ljspeech_ckpt_0.4.zip](https://paddlespeech.bj.bcebos.com/Parakeet/transformer_tts_ljspeech_ckpt_0.4.zip)|||
 SpeedySpeech| CSMSC |[speedyspeech-csmsc](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/csmsc/tts2)|[speedyspeech_nosil_baker_ckpt_0.5.zip](https://paddlespeech.bj.bcebos.com/Parakeet/speedyspeech_nosil_baker_ckpt_0.5.zip)|[speedyspeech_nosil_baker_static_0.5.zip](https://paddlespeech.bj.bcebos.com/Parakeet/speedyspeech_nosil_baker_static_0.5.zip)|12MB|
 FastSpeech2| CSMSC |[fastspeech2-csmsc](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/csmsc/tts3)|[fastspeech2_nosil_baker_ckpt_0.4.zip](https://paddlespeech.bj.bcebos.com/Parakeet/fastspeech2_nosil_baker_ckpt_0.4.zip)|[fastspeech2_nosil_baker_static_0.4.zip](https://paddlespeech.bj.bcebos.com/Parakeet/fastspeech2_nosil_baker_static_0.4.zip)|157MB|
paddlespeech/s2t/frontend/featurizer/text_featurizer.py

@@ -56,6 +56,8 @@ class TextFeaturizer():
             self.vocab_dict, self._id2token, self.vocab_list, self.unk_id, self.eos_id, self.blank_id = self._load_vocabulary_from_file(
                 vocab_filepath, maskctc)
             self.vocab_size = len(self.vocab_list)
+        else:
+            logger.warning(f"TextFeaturizer: not have vocab file.")

         if unit_type == 'spm':
             spm_model = spm_model_prefix + '.model'
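The hunk above turns the missing-vocab case into a logged warning rather than an unconditional load. A minimal sketch of that guard (the class name `TinyTextFeaturizer` and its reduced constructor are hypothetical, not the real `TextFeaturizer`):

```python
import logging

logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger("text_featurizer_demo")


class TinyTextFeaturizer:
    """Toy stand-in for TextFeaturizer's constructor logic: load the
    vocabulary only when a path is given, otherwise warn and continue
    (the behavior this hunk adds)."""

    def __init__(self, vocab_filepath=None):
        self.vocab_list = []
        if vocab_filepath:
            with open(vocab_filepath) as f:
                self.vocab_list = [line.strip() for line in f]
            # as in the real class, vocab_size is only set on this branch
            self.vocab_size = len(self.vocab_list)
        else:
            logger.warning("TextFeaturizer: not have vocab file.")


# No vocab file: the featurizer warns instead of raising.
feat = TinyTextFeaturizer()
assert feat.vocab_list == []
```

Note that, mirroring the real code, `vocab_size` is left unset when no file is provided, so downstream code that needs it must still supply a vocabulary.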
paddlespeech/s2t/transform/spectrogram.py

@@ -341,7 +341,7 @@ class LogMelSpectrogramKaldi():
         self.eps = eps
         self.remove_dc_offset = True
         self.preemph = 0.97
-        self.dither = dither
+        self.dither = dither  # only work in train mode

     def __repr__(self):
         return (

@@ -361,11 +361,12 @@ class LogMelSpectrogramKaldi():
             eps=self.eps,
             dither=self.dither, ))

-    def __call__(self, x):
+    def __call__(self, x, train):
         """
         Args:
             x (np.ndarray): shape (Ti,)
+            train (bool): True, train mode.

         Raises:
             ValueError: not support (Ti, C)

@@ -373,6 +374,7 @@ class LogMelSpectrogramKaldi():
         Returns:
             np.ndarray: (T, D)
         """
+        dither = self.dither if train else False
         if x.ndim != 1:
             raise ValueError("Not support x: [Time, Channel]")

@@ -391,7 +393,7 @@ class LogMelSpectrogramKaldi():
             nfft=self.n_fft,
             lowfreq=self.fmin,
             highfreq=self.fmax,
-            dither=self.dither,
+            dither=dither,
             remove_dc_offset=self.remove_dc_offset,
             preemph=self.preemph,
             wintype=self.window)
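The net effect of this commit is that dithering (small random noise added before the fbank computation) is applied only when `train` is true, so evaluation features stay deterministic. A toy sketch of that gating, assuming a simplified stand-in rather than the real `LogMelSpectrogramKaldi` (the function `toy_kaldi_feat` and its noise model are illustrative only):

```python
import numpy as np


def toy_kaldi_feat(x, dither=1.0, train=True):
    """Hypothetical stand-in for LogMelSpectrogramKaldi.__call__:
    dither perturbs the waveform with uniform noise, and, as in this
    commit, it is applied only in train mode."""
    rng = np.random.default_rng()
    # mirrors the commit: dither = self.dither if train else False
    d = dither if train else 0.0
    if x.ndim != 1:
        raise ValueError("Not support x: [Time, Channel]")
    # the real code would compute Kaldi fbank features here; we only
    # demonstrate the dither gating on the raw waveform
    return x + d * rng.uniform(-1.0, 1.0, size=x.shape)


x = np.zeros(16000, dtype=np.float32)  # 1 s of silence at 16 kHz
eval_a = toy_kaldi_feat(x, train=False)
eval_b = toy_kaldi_feat(x, train=False)
train_a = toy_kaldi_feat(x, train=True)

# eval-mode features are reproducible; train-mode features carry noise
assert np.array_equal(eval_a, eval_b)
assert not np.array_equal(train_a, eval_a)
```

Gating dither on `train` keeps the stochastic regularization during training while making decoding and test-time CER/WER numbers repeatable run to run.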