PaddlePaddle / models
Commit e8266790
Authored Feb 08, 2018 by Yibing Liu
Committed Feb 08, 2018 by GitHub
Merge pull request #660 from kuke/rm_print
Remove uncessary print
Parents: 11841096, f1b40e0b
Showing 1 changed file with 18 additions and 19 deletions
fluid/DeepASR/data_utils/data_reader.py (+18 -19)
Changed lines whose old and new text look identical below appear to differ only in whitespace, which this rendering does not preserve; the one substantive change is the removed print call.
...
@@ -28,7 +28,7 @@ class SampleInfo(object):
         feature_frame_num (int): Time length of the sample.
         feature_dim (int): Feature dimension of one frame.
         label_bin_path (str): File containing the label data.
-        label_size (int): Byte count of the sample's label data.
+        label_size (int): Byte count of the sample's label data.
         label_frame_num (int): Label number of the sample.
     """
...
@@ -49,24 +49,24 @@ class SampleInfo(object):
 class SampleInfoBucket(object):
     """SampleInfoBucket contains paths of several description files. Feature
-    description file contains necessary information (including path of binary
-    data, sample start position, sample byte number etc.) to access samples'
-    feature data and the same with the label description file. SampleInfoBucket
+    description file contains necessary information (including path of binary
+    data, sample start position, sample byte number etc.) to access samples'
+    feature data and the same with the label description file. SampleInfoBucket
     is the minimum unit to do shuffle.
     Args:
-        feature_bin_paths (list|tuple): Files containing the binary feature
+        feature_bin_paths (list|tuple): Files containing the binary feature
             data.
-        feature_desc_paths (list|tuple): Files containing the description of
-            samples' feature data.
+        feature_desc_paths (list|tuple): Files containing the description of
+            samples' feature data.
         label_bin_paths (list|tuple): Files containing the binary label data.
         label_desc_paths (list|tuple): Files containing the description of
             samples' label data.
-        split_perturb(int): Maximum perturbation value for length of
+        split_perturb(int): Maximum perturbation value for length of
             sub-sentence when splitting long sentence.
-        split_sentence_threshold(int): Sentence whose length larger than
+        split_sentence_threshold(int): Sentence whose length larger than
             the value will trigger split operation.
-        split_sub_sentence_len(int): sub-sentence length is equal to
+        split_sub_sentence_len(int): sub-sentence length is equal to
             (split_sub_sentence_len + rand() % split_perturb).
     """
...
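The SampleInfoBucket docstring above says a description file records, for each sample, the path of the binary data, the sample's start position, and its byte count. A minimal sketch of how such a triple could be used to fetch one sample follows. The helper name `read_sample_bytes` and its parameter names are hypothetical and are not taken from data_reader.py.

```python
import os
import tempfile


def read_sample_bytes(bin_path, start, size):
    """Return `size` bytes of `bin_path`, beginning at byte offset `start`.

    Hypothetical helper: it only illustrates the (path, start, size)
    access pattern described in the docstring.
    """
    with open(bin_path, "rb") as f:
        f.seek(start)
        data = f.read(size)
    if len(data) != size:
        raise ValueError("expected %d bytes, got %d" % (size, len(data)))
    return data


# Demo: two fake samples packed back to back in one binary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"AAAA" + b"BBBBBB")
    demo_path = tmp.name

first = read_sample_bytes(demo_path, start=0, size=4)
second = read_sample_bytes(demo_path, start=4, size=6)
os.remove(demo_path)
```

Storing offsets rather than one file per sample lets many samples share a single binary file while each one stays individually addressable.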
@@ -129,7 +129,7 @@ class SampleInfoBucket(object):
                                    feature_size, feature_frame_num, feature_dim,
                                    label_bin_path, label_start, label_size,
                                    label_frame_num))
-                #split sentence
+                #split sentence
                 else:
                     cur_frame_pos = 0
                     cur_frame_len = 0
...
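The split branch that begins in the hunk above is only partly visible here. As an illustration of the rule stated in the SampleInfoBucket docstring (a sentence longer than split_sentence_threshold is cut into pieces of length split_sub_sentence_len + rand() % split_perturb), here is a minimal sketch. The function `split_frames`, its `rng` argument, and the way the final remainder is handled are assumptions made for this example and may differ from the repository's actual logic.

```python
import random


def split_frames(frame_num, threshold, sub_len, perturb, rng=None):
    """Return (start, length) pairs that together cover `frame_num` frames.

    Sentences no longer than `threshold` are returned whole. Longer ones
    are cut into pieces of length sub_len + (random value in [0, perturb)).
    Assumption: the last piece simply takes whatever frames remain.
    """
    if sub_len <= 0:
        raise ValueError("sub_len must be positive")
    if rng is None:
        rng = random.Random()
    if frame_num <= threshold:
        return [(0, frame_num)]
    pieces = []
    pos = 0
    while pos < frame_num:
        jitter = rng.randrange(perturb) if perturb > 0 else 0
        length = min(sub_len + jitter, frame_num - pos)
        pieces.append((pos, length))
        pos += length
    return pieces


# A 1000-frame sentence with threshold 400 is split; a 300-frame one is not.
long_pieces = split_frames(1000, 400, 200, 50, random.Random(0))
short_pieces = split_frames(300, 400, 200, 50)
```

Passing a seeded `random.Random` keeps the split reproducible, which is convenient when comparing runs.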
@@ -156,7 +156,6 @@ class SampleInfoBucket(object):
                     if remain_frame_num <= 0:
                         break
-        print("generate_sample_info_list size ", len(sample_info_list))
         return sample_info_list
...
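The commit simply deletes the diagnostic print. If the list size were still wanted during debugging, one common alternative is to route it through the standard `logging` module at DEBUG level so it stays silent by default. This is a sketch of that alternative, not what the commit does; the logger name and the `summarize` wrapper are hypothetical.

```python
import logging

# Hypothetical logger name, chosen for this example only.
logger = logging.getLogger("data_reader_sketch")


def summarize(sample_info_list):
    """Return the list unchanged, logging its size at DEBUG level."""
    logger.debug("generate_sample_info_list size %d", len(sample_info_list))
    return sample_info_list
```

Calling `logging.basicConfig(level=logging.DEBUG)` before `summarize(...)` would make the message visible; without it, nothing is printed.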
@@ -171,22 +170,22 @@ class DataReader(object):
     Args:
         feature_file_list (str): File containing paths of feature data file and
             corresponding description file.
-        label_file_list (str): File containing paths of label data file and
+        label_file_list (str): File containing paths of label data file and
             corresponding description file.
         drop_frame_len (int): Samples whose label length above the value will be
             dropped.(Using '-1' to disable the policy)
         process_num (int): Number of processes for processing data.
-        sample_buffer_size (int): Buffer size to indicate the maximum samples
+        sample_buffer_size (int): Buffer size to indicate the maximum samples
             cached.
-        sample_info_buffer_size (int): Buffer size to indicate the maximum
+        sample_info_buffer_size (int): Buffer size to indicate the maximum
             sample information cached.
-        batch_buffer_size (int): Buffer size to indicate the maximum batch
+        batch_buffer_size (int): Buffer size to indicate the maximum batch
             cached.
-        shuffle_block_num (int): Block number indicating the minimum unit to do
+        shuffle_block_num (int): Block number indicating the minimum unit to do
             shuffle.
         random_seed (int): Random seed.
-        verbose (int): If set to 0, complaints including exceptions and signal
-            traceback from sub-process will be suppressed. If set
+        verbose (int): If set to 0, complaints including exceptions and signal
+            traceback from sub-process will be suppressed. If set
             to 1, all complaints will be printed.
     """
...