Crayon鑫 / Paddle (forked from PaddlePaddle / Paddle, in sync with the fork source)
Commit 3fbf33d4
Authored Dec 02, 2017 by Travis CI
Deploy to GitHub Pages: d89061c3
Parent: 00b17acd
Changes: 4
Showing 4 changed files with 166 additions and 2 deletions
develop/doc/api/v2/data/data_reader.html +82 -0
develop/doc/searchindex.js +1 -1
develop/doc_cn/api/v2/data/data_reader.html +82 -0
develop/doc_cn/searchindex.js +1 -1
develop/doc/api/v2/data/data_reader.html
View file @
3fbf33d4
...
...
@@ -710,6 +710,88 @@ And this function contains a buffered decorator.
:rtype: callable
</p>
</dd></dl>
<dl
class=
"function"
>
<dt>
<code
class=
"descclassname"
>
paddle.v2.reader.
</code><code
class=
"descname"
>
pipe_reader
</code><span
class=
"sig-paren"
>
(
</span><em>
left_cmd
</em>
,
<em>
parser
</em>
,
<em>
bufsize=8192
</em>
,
<em>
file_type='plain'
</em>
,
<em>
cut_lines=True
</em>
,
<em>
line_break='\n'
</em><span
class=
"sig-paren"
>
)
</span></dt>
<dd><blockquote>
<div><p>
pipe_reader reads data as a stream from a command, takes its
stdout into a pipe buffer, and redirects it to the parser to
parse, then yields data in your desired format.
</p>
<p>
You can use a standard Linux command or call another program
to read data from HDFS, Ceph, a URL, AWS S3, etc.:
</p>
<p>
cmd = "hadoop fs -cat /path/to/some/file"
cmd = "cat sample_file.tar.gz"
cmd = "curl <a class="reference external" href="http://someurl">http://someurl</a>"
cmd = "python print_s3_bucket.py"
</p>
<p>
A sample parser:
</p>
<dl
class=
"docutils"
>
<dt>
def sample_parser(lines):
</dt>
<dd><p
class=
"first"
>
# parse each line as one sample data,
# return a list of samples as batches.
ret = []
for l in lines:
</p>
<blockquote>
<div>
ret.append(l.split(" ")[1:5])
</div></blockquote>
<p
class=
"last"
>
return ret
</p>
</dd>
</dl>
<table
class=
"docutils field-list"
frame=
"void"
rules=
"none"
>
<col
class=
"field-name"
/>
<col
class=
"field-body"
/>
<tbody
valign=
"top"
>
<tr
class=
"field-odd field"
><th
class=
"field-name"
>
param left_cmd:
</th><td
class=
"field-body"
>
command to execute to get stdout from.
</td>
</tr>
<tr
class=
"field-even field"
><th
class=
"field-name"
>
type left_cmd:
</th><td
class=
"field-body"
>
string
</td>
</tr>
<tr
class=
"field-odd field"
><th
class=
"field-name"
>
param parser:
</th><td
class=
"field-body"
>
parser function to parse lines of data.
If cut_lines is True, the parser receives a list
of lines.
If cut_lines is False, the parser receives a
raw buffer each time.
The parser should return a list of parsed values.
</td>
</tr>
<tr
class=
"field-even field"
><th
class=
"field-name"
>
type parser:
</th><td
class=
"field-body"
>
callable
</td>
</tr>
<tr
class=
"field-odd field"
><th
class=
"field-name"
>
param bufsize:
</th><td
class=
"field-body"
>
the buffer size used for the stdout pipe.
</td>
</tr>
<tr
class=
"field-even field"
><th
class=
"field-name"
>
type bufsize:
</th><td
class=
"field-body"
>
int
</td>
</tr>
<tr
class=
"field-odd field"
><th
class=
"field-name"
colspan=
"2"
>
param file_type:
</th></tr>
<tr
class=
"field-odd field"
><td>
 
</td><td
class=
"field-body"
>
data type of the stream buffer; can be plain or gzip.
</td>
</tr>
<tr
class=
"field-even field"
><th
class=
"field-name"
>
type file_type:
</th><td
class=
"field-body"
>
string
</td>
</tr>
<tr
class=
"field-odd field"
><th
class=
"field-name"
colspan=
"2"
>
param cut_lines:
</th></tr>
<tr
class=
"field-odd field"
><td>
 
</td><td
class=
"field-body"
>
whether to pass lines instead of a raw buffer
to the parser.
</td>
</tr>
<tr
class=
"field-even field"
><th
class=
"field-name"
>
type cut_lines:
</th><td
class=
"field-body"
>
bool
</td>
</tr>
<tr
class=
"field-odd field"
><th
class=
"field-name"
colspan=
"2"
>
param line_break:
</th></tr>
<tr
class=
"field-odd field"
><td>
 
</td><td
class=
"field-body"
>
line break of the file, like \n
</td>
</tr>
</tbody>
</table>
</div></blockquote>
<dl
class=
"docutils"
>
<dt>
or
</dt>
<dd><table
class=
"first last docutils field-list"
frame=
"void"
rules=
"none"
>
<col
class=
"field-name"
/>
<col
class=
"field-body"
/>
<tbody
valign=
"top"
>
<tr
class=
"field-odd field"
><th
class=
"field-name"
colspan=
"2"
>
type line_break:
</th></tr>
<tr
class=
"field-odd field"
><td>
 
</td><td
class=
"field-body"
>
string
</td>
</tr>
<tr
class=
"field-even field"
><th
class=
"field-name"
>
return:
</th><td
class=
"field-body"
>
the reader generator.
</td>
</tr>
<tr
class=
"field-odd field"
><th
class=
"field-name"
>
rtype:
</th><td
class=
"field-body"
>
callable
</td>
</tr>
</tbody>
</table>
</dd>
</dl>
</dd></dl>
</div>
<p>
The creator package contains some simple reader creators, which can
be used in user programs.
</p>
...
...
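To make the behavior documented above concrete, here is a minimal sketch of a reader creator in the spirit of pipe_reader. It is not Paddle's implementation: `pipe_reader_sketch` is a hypothetical name, it covers only the `file_type='plain'`, `cut_lines=True` case, and splitting the command string with `shlex.split` is an assumption of this sketch. `sample_parser` is the parser from the docstring, written out with its intended indentation.

```python
import shlex
import subprocess


def sample_parser(lines):
    # Parse each line as one sample: split on spaces and keep fields 1..4.
    ret = []
    for l in lines:
        ret.append(l.split(" ")[1:5])
    return ret


def pipe_reader_sketch(left_cmd, parser, bufsize=8192, line_break="\n"):
    """Hypothetical stand-in for pipe_reader (plain text, line mode only)."""
    sep = line_break.encode("utf-8")

    def reader():
        # Assumption: the command string is tokenized with shlex.split.
        proc = subprocess.Popen(shlex.split(left_cmd), stdout=subprocess.PIPE)
        remained = b""
        try:
            while True:
                chunk = proc.stdout.read(bufsize)
                if not chunk:
                    break
                pieces = (remained + chunk).split(sep)
                # The last piece may be an incomplete line; keep it for later.
                remained = pieces.pop()
                lines = [p.decode("utf-8") for p in pieces]
                for sample in parser(lines):
                    yield sample
            if remained:
                # Flush a final line that had no trailing line break.
                for sample in parser([remained.decode("utf-8")]):
                    yield sample
        finally:
            proc.stdout.close()
            proc.wait()

    return reader
```

Under these assumptions, `reader = pipe_reader_sketch("cat some_file.txt", sample_parser)` returns a callable, and iterating over `reader()` yields one parsed sample per input line; `some_file.txt` is a placeholder path.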
develop/doc/searchindex.js
View file @
3fbf33d4
The source diff is too large to display. You can view the blob instead.
develop/doc_cn/api/v2/data/data_reader.html
View file @
3fbf33d4
...
...
@@ -724,6 +724,88 @@ And this function contains a buffered decorator.
:rtype: callable
</p>
</dd></dl>
<dl
class=
"function"
>
<dt>
<code
class=
"descclassname"
>
paddle.v2.reader.
</code><code
class=
"descname"
>
pipe_reader
</code><span
class=
"sig-paren"
>
(
</span><em>
left_cmd
</em>
,
<em>
parser
</em>
,
<em>
bufsize=8192
</em>
,
<em>
file_type='plain'
</em>
,
<em>
cut_lines=True
</em>
,
<em>
line_break='\n'
</em><span
class=
"sig-paren"
>
)
</span></dt>
<dd><blockquote>
<div><p>
pipe_reader reads data as a stream from a command, takes its
stdout into a pipe buffer, and redirects it to the parser to
parse, then yields data in your desired format.
</p>
<p>
You can use a standard Linux command or call another program
to read data from HDFS, Ceph, a URL, AWS S3, etc.:
</p>
<p>
cmd = "hadoop fs -cat /path/to/some/file"
cmd = "cat sample_file.tar.gz"
cmd = "curl <a class="reference external" href="http://someurl">http://someurl</a>"
cmd = "python print_s3_bucket.py"
</p>
<p>
A sample parser:
</p>
<dl
class=
"docutils"
>
<dt>
def sample_parser(lines):
</dt>
<dd><p
class=
"first"
>
# parse each line as one sample data,
# return a list of samples as batches.
ret = []
for l in lines:
</p>
<blockquote>
<div>
ret.append(l.split(" ")[1:5])
</div></blockquote>
<p
class=
"last"
>
return ret
</p>
</dd>
</dl>
<table
class=
"docutils field-list"
frame=
"void"
rules=
"none"
>
<col
class=
"field-name"
/>
<col
class=
"field-body"
/>
<tbody
valign=
"top"
>
<tr
class=
"field-odd field"
><th
class=
"field-name"
>
param left_cmd:
</th><td
class=
"field-body"
>
command to execute to get stdout from.
</td>
</tr>
<tr
class=
"field-even field"
><th
class=
"field-name"
>
type left_cmd:
</th><td
class=
"field-body"
>
string
</td>
</tr>
<tr
class=
"field-odd field"
><th
class=
"field-name"
>
param parser:
</th><td
class=
"field-body"
>
parser function to parse lines of data.
If cut_lines is True, the parser receives a list
of lines.
If cut_lines is False, the parser receives a
raw buffer each time.
The parser should return a list of parsed values.
</td>
</tr>
<tr
class=
"field-even field"
><th
class=
"field-name"
>
type parser:
</th><td
class=
"field-body"
>
callable
</td>
</tr>
<tr
class=
"field-odd field"
><th
class=
"field-name"
>
param bufsize:
</th><td
class=
"field-body"
>
the buffer size used for the stdout pipe.
</td>
</tr>
<tr
class=
"field-even field"
><th
class=
"field-name"
>
type bufsize:
</th><td
class=
"field-body"
>
int
</td>
</tr>
<tr
class=
"field-odd field"
><th
class=
"field-name"
colspan=
"2"
>
param file_type:
</th></tr>
<tr
class=
"field-odd field"
><td>
 
</td><td
class=
"field-body"
>
data type of the stream buffer; can be plain or gzip.
</td>
</tr>
<tr
class=
"field-even field"
><th
class=
"field-name"
>
type file_type:
</th><td
class=
"field-body"
>
string
</td>
</tr>
<tr
class=
"field-odd field"
><th
class=
"field-name"
colspan=
"2"
>
param cut_lines:
</th></tr>
<tr
class=
"field-odd field"
><td>
 
</td><td
class=
"field-body"
>
whether to pass lines instead of a raw buffer
to the parser.
</td>
</tr>
<tr
class=
"field-even field"
><th
class=
"field-name"
>
type cut_lines:
</th><td
class=
"field-body"
>
bool
</td>
</tr>
<tr
class=
"field-odd field"
><th
class=
"field-name"
colspan=
"2"
>
param line_break:
</th></tr>
<tr
class=
"field-odd field"
><td>
 
</td><td
class=
"field-body"
>
line break of the file, like \n
</td>
</tr>
</tbody>
</table>
</div></blockquote>
<dl
class=
"docutils"
>
<dt>
or
</dt>
<dd><table
class=
"first last docutils field-list"
frame=
"void"
rules=
"none"
>
<col
class=
"field-name"
/>
<col
class=
"field-body"
/>
<tbody
valign=
"top"
>
<tr
class=
"field-odd field"
><th
class=
"field-name"
colspan=
"2"
>
type line_break:
</th></tr>
<tr
class=
"field-odd field"
><td>
 
</td><td
class=
"field-body"
>
string
</td>
</tr>
<tr
class=
"field-even field"
><th
class=
"field-name"
>
return:
</th><td
class=
"field-body"
>
the reader generator.
</td>
</tr>
<tr
class=
"field-odd field"
><th
class=
"field-name"
>
rtype:
</th><td
class=
"field-body"
>
callable
</td>
</tr>
</tbody>
</table>
</dd>
</dl>
</dd></dl>
</div>
<p>
The creator package contains some simple reader creators, which can
be used in user programs.
</p>
...
...
develop/doc_cn/searchindex.js
View file @
3fbf33d4
This diff is collapsed.