PaddlePaddle / DeepSpeech
Commit 9c27b1d1
Authored on June 12, 2017 by dangqingqing

add more comments and update train.py

Parent: bf735400
Showing 2 changed files with 23 additions and 13 deletions
audio_data_utils.py  +20 -10
train.py  +3 -3

audio_data_utils.py @ 9c27b1d1
@@ -247,25 +247,34 @@ class DataGenerator(object):
             new_batch.append((padded_audio, text))
         return new_batch
 
-    def __batch_shuffle__(self, manifest, batch_shuffle_size):
+    def __batch_shuffle__(self, manifest, batch_size):
         """
+        The instances have different lengths and they cannot be
+        combined into a single matrix multiplication. It usually
+        sorts the training examples by length and combines only
+        similarly-sized instances into minibatches, pads with
+        silence when necessary so that all instances in a batch
+        have the same length. This batch shuffle fuction is used
+        to make similarly-sized instances into minibatches and
+        make a batch-wise shuffle.
+
         1. Sort the audio clips by duration.
-        2. Generate a random number `k`, k in [0, batch_shuffle_size).
+        2. Generate a random number `k`, k in [0, batch_size).
         3. Randomly remove `k` instances in order to make different mini-batches,
-           then make minibatches and each minibatch size is batch_shuffle_size.
+           then make minibatches and each minibatch size is batch_size.
         4. Shuffle the minibatches.
 
         :param manifest: manifest file.
         :type manifest: list
-        :param batch_shuffle_size: This size is uesed to generate a random number,
-                                   it usually equals to batch size.
-        :type batch_shuffle_size: int
+        :param batch_size: Batch size. This size is also used for generate
+                           a random number for batch shuffle.
+        :type batch_size: int
         :return: batch shuffled mainifest.
         :rtype: list
         """
         manifest.sort(key=lambda x: x["duration"])
-        shift_len = self.__random__.randint(0, batch_shuffle_size - 1)
-        batch_manifest = zip(*[iter(manifest[shift_len:])] * batch_shuffle_size)
+        shift_len = self.__random__.randint(0, batch_size - 1)
+        batch_manifest = zip(*[iter(manifest[shift_len:])] * batch_size)
         self.__random__.shuffle(batch_manifest)
         batch_manifest = list(sum(batch_manifest, ()))
         res_len = len(manifest) - shift_len - len(batch_manifest)
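
The four steps in the `__batch_shuffle__` docstring can be sketched as a standalone Python 3 function. This is a rough illustration, not the repository's code: the name `batch_shuffle`, the `rng` parameter, and the way leftover entries are appended at the end are assumptions, since the hunk stops at the `res_len` line. The committed code is Python 2 style, where `zip` returns a list; in Python 3 the result has to be wrapped in `list(...)` before it can be shuffled.

```python
import random


def batch_shuffle(manifest, batch_size, rng=None):
    """Batch-wise shuffle of a manifest (a list of dicts with a "duration" key).

    Hypothetical standalone sketch of the steps described in the docstring.
    """
    if rng is None:
        rng = random.Random()

    # 1. Sort by duration so that neighbouring entries have similar lengths.
    ordered = sorted(manifest, key=lambda x: x["duration"])

    # 2. Draw the shift k; randint is inclusive, so k is in [0, batch_size - 1].
    shift_len = rng.randint(0, batch_size - 1)

    # 3. Skip the first k entries and group the rest into full batches.
    #    zip(*[iter(seq)] * n) yields consecutive n-tuples and drops a short tail.
    batches = list(zip(*[iter(ordered[shift_len:])] * batch_size))

    # 4. Shuffle whole batches; the order inside each batch is preserved.
    rng.shuffle(batches)
    shuffled = [entry for batch in batches for entry in batch]

    # Assumption: entries that did not fit into a full batch (the short tail
    # and the k skipped entries) are appended so that nothing is lost.
    res_len = len(ordered) - shift_len - len(shuffled)
    tail = ordered[len(ordered) - res_len:]
    return shuffled + tail + ordered[:shift_len]
```

Because the shift changes on every call, the batch boundaries move between epochs, so different clips end up grouped together even though each batch still contains clips of similar duration.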
@@ -327,8 +336,9 @@ class DataGenerator(object):
                           if set True.
         :type sortagrad: bool
         :param batch_shuffle: Shuffle the audio clips if set True. It is
-                              not a thorough instance-wise shuffle,
-                              but a specific batch-wise shuffle.
+                              not a thorough instance-wise shuffle, but a
+                              specific batch-wise shuffle. For more details,
+                              please see `__batch_shuffle__` function.
         :type batch_shuffle: bool
         :return: Batch reader function, producing batches of data when called.
         :rtype: callable

train.py @ 9c27b1d1
@@ -143,12 +143,12 @@ def train():
     train_batch_reader = train_generator.batch_reader_creator(
         manifest_path=args.train_manifest_path,
         batch_size=args.batch_size,
-        sortagrad=True,
-        shuffle=True)
+        sortagrad=True if args.init_model_path is None else False,
+        batch_shuffle=True)
     test_batch_reader = test_generator.batch_reader_creator(
         manifest_path=args.dev_manifest_path,
         batch_size=args.batch_size,
-        shuffle=False)
+        batch_shuffle=False)
     feeding = train_generator.data_name_feeding()
 
     # create event handler
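
With the new `sortagrad` argument in `train.py`, duration-sorted batching is enabled only when no initial model is given, that is, when training starts from scratch. A minimal sketch of that condition follows; the helper name `use_sortagrad` is hypothetical and does not appear in the commit.

```python
def use_sortagrad(init_model_path):
    """Return True only when training starts without a pretrained model.

    Equivalent to `True if init_model_path is None else False`.
    """
    return init_model_path is None
```

The comparison `init_model_path is None` already evaluates to a `bool`, so the conditional expression in the diff could be shortened to it without changing behavior.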