PaddlePaddle / DeepSpeech
Commit 467e8235
Authored May 18, 2021 by Hui Zhang
parallel data scripts; more mask test; need pybind11 repo
Parent: 0fe80f0f
Showing 7 changed files with 37 additions and 28 deletions
.gitignore  +1 -1
deepspeech/exps/deepspeech2/model.py  +1 -1
examples/aishell/s0/local/data.sh  +14 -11
examples/aishell/s1/local/data.sh  +17 -14
examples/tiny/s1/run.sh  +0 -0
requirements.txt  +2 -1
tests/mask_test.py  +2 -0
.gitignore

 .DS_Store
 *.pyc
 .vscode
-*.log
+*log
 *.pdmodel
 *.pdiparams*
 *.zip
deepspeech/exps/deepspeech2/model.py

@@ -170,7 +170,7 @@ class DeepSpeech2Trainer(Trainer):
             train_dataset,
             batch_sampler=batch_sampler,
             collate_fn=collate_fn,
-            num_workers=config.data.num_workers, )
+            num_workers=config.data.num_workers)
         self.valid_loader = DataLoader(
             dev_dataset,
             batch_size=config.data.batch_size,
examples/aishell/s0/local/data.sh

@@ -66,19 +66,22 @@ fi
 if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
     # format manifest with tokenids, vocab size
     for dataset in train dev test; do
+    {
         python3 ${MAIN_ROOT}/utils/format_data.py \
-        --feat_type "raw" \
-        --cmvn_path "data/mean_std.json" \
-        --unit_type "char" \
-        --vocab_path="data/vocab.txt" \
-        --manifest_path="data/manifest.${dataset}.raw" \
-        --output_path="data/manifest.${dataset}"
-    done
+            --feat_type "raw" \
+            --cmvn_path "data/mean_std.json" \
+            --unit_type "char" \
+            --vocab_path="data/vocab.txt" \
+            --manifest_path="data/manifest.${dataset}.raw" \
+            --output_path="data/manifest.${dataset}"
-    if [ $? -ne 0 ]; then
-        echo "Formt mnaifest failed. Terminated."
-        exit 1
-    fi
+        if [ $? -ne 0 ]; then
+            echo "Formt mnaifest failed. Terminated."
+            exit 1
+        fi
+    } &
+    done
+    wait
 fi
 echo "Aishell data preparation done."
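The hunk above wraps each `format_data.py` call in `{ ... } &` and adds a `wait` after the loop, so the three splits are formatted concurrently instead of one after another. A minimal sketch of that pattern follows; `format_split` is a hypothetical stand-in for the real `format_data.py` invocation and only writes a marker file per split.

```shell
#!/usr/bin/env bash
# Sketch only: format_split is a hypothetical stand-in for the
# format_data.py call; it just writes one marker file per split.
out_dir=$(mktemp -d)

format_split() {
    local dataset=$1
    echo "formatted ${dataset}" > "${out_dir}/manifest.${dataset}"
}

for dataset in train dev test; do
{
    format_split "${dataset}"
} &        # run this loop iteration as a background job
done
wait       # block until every background job has finished

ls "${out_dir}"
```

Without the `wait`, the script could reach its final `echo` while some splits are still being written.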
examples/aishell/s1/local/data.sh

@@ -14,7 +14,7 @@ if [ ${stage} -le -1 ] && [ ${stop_stage} -ge -1 ]; then
     python3 ${TARGET_DIR}/aishell/aishell.py \
     --manifest_prefix="data/manifest" \
     --target_dir="${TARGET_DIR}/aishell"
     if [ $? -ne 0 ]; then
         echo "Prepare Aishell failed. Terminated."
         exit 1
@@ -33,7 +33,7 @@ if [ ${stage} -le 0 ] && [ ${stop_stage} -ge 0 ]; then
     --count_threshold=0 \
     --vocab_path="data/vocab.txt" \
     --manifest_paths "data/manifest.train.raw"
     if [ $? -ne 0 ]; then
         echo "Build vocabulary failed. Terminated."
         exit 1
@@ -56,7 +56,7 @@ if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
     --num_samples=-1 \
     --num_workers=${num_workers} \
     --output_path="data/mean_std.json"
     if [ $? -ne 0 ]; then
         echo "Compute mean and stddev failed. Terminated."
         exit 1
@@ -67,19 +67,22 @@ fi
 if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
     # format manifest with tokenids, vocab size
     for dataset in train dev test; do
+    {
         python3 ${MAIN_ROOT}/utils/format_data.py \
-        --feat_type "raw" \
-        --cmvn_path "data/mean_std.json" \
-        --unit_type "char" \
-        --vocab_path="data/vocab.txt" \
-        --manifest_path="data/manifest.${dataset}.raw" \
-        --output_path="data/manifest.${dataset}"
+            --feat_type "raw" \
+            --cmvn_path "data/mean_std.json" \
+            --unit_type "char" \
+            --vocab_path="data/vocab.txt" \
+            --manifest_path="data/manifest.${dataset}.raw" \
+            --output_path="data/manifest.${dataset}"
+        if [ $? -ne 0 ]; then
+            echo "Formt mnaifest failed. Terminated."
+            exit 1
+        fi
+    } &
     done
-    if [ $? -ne 0 ]; then
-        echo "Formt mnaifest failed. Terminated."
-        exit 1
-    fi
+    wait
 fi
 echo "Aishell data preparation done."
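One thing to keep in mind with this pattern: `exit 1` inside `{ ... } &` ends only that background subshell, and a bare `wait` returns 0 regardless of how the jobs finished. If the script should still abort when a split fails, one option (not part of this commit) is to record each job's PID and wait on them individually. A sketch, where `fake_format` is a hypothetical stand-in that fails on purpose for the `dev` split:

```shell
#!/usr/bin/env bash
# Sketch only: fake_format is hypothetical and fails on purpose for "dev".
fake_format() {
    [ "$1" != "dev" ]
}

pids=()
for dataset in train dev test; do
{
    fake_format "${dataset}" || exit 1   # exits only this background subshell
} &
pids+=("$!")                             # remember the PID of the job just started
done

failed=0
for pid in "${pids[@]}"; do
    # `wait PID` returns that job's exit status, unlike a bare `wait`
    wait "${pid}" || failed=1
done

if [ "${failed}" -ne 0 ]; then
    echo "Format manifest failed."
fi
```

In a real data script the final `if` would typically end with `exit 1` so later stages do not run on incomplete manifests.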
examples/tiny/s1/run.sh

File mode changed from 100644 to 100755
requirements.txt

@@ -8,4 +8,5 @@ SoundFile==0.9.0.post1
 sox
 tensorboardX
 typeguard
-yacs
\ No newline at end of file
+yacs
+pybind11
tests/mask_test.py

@@ -48,7 +48,9 @@ class TestU2Model(unittest.TestCase):
     def test_make_pad_mask(self):
         res = make_pad_mask(self.lengths)
+        res1 = make_non_pad_mask(self.lengths).logical_not()
         self.assertSequenceEqual(res.numpy().tolist(), self.pad_masks.tolist())
+        self.assertSequenceEqual(res.numpy().tolist(), res1.tolist())
 if __name__ == '__main__':