Commit 3fbf33d4, authored by Travis CI

Deploy to GitHub Pages: d89061c3

Parent 00b17acd
......@@ -710,6 +710,88 @@ And this function contains a buffered decorator.
:rtype: callable</p>
</dd></dl>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.reader.</code><code class="descname">pipe_reader</code><span class="sig-paren">(</span><em>left_cmd</em>, <em>parser</em>, <em>bufsize=8192</em>, <em>file_type='plain'</em>, <em>cut_lines=True</em>, <em>line_break='\n'</em><span class="sig-paren">)</span></dt>
<dd><blockquote>
<div><p>pipe_reader reads data as a stream from a command: it takes the
command's stdout into a pipe buffer, passes it to the parser,
and yields the parsed data in your desired format.</p>
<p>You can use a standard Linux command or call another program
to read data from HDFS, Ceph, a URL, AWS S3, etc.:</p>
<p>cmd = &#8220;hadoop fs -cat /path/to/some/file&#8221;
cmd = &#8220;cat sample_file.tar.gz&#8221;
cmd = &#8220;curl <a class="reference external" href="http://someurl">http://someurl</a>&#8221;
cmd = &#8220;python print_s3_bucket.py&#8221;</p>
<p>A sample parser:</p>
<pre class="literal-block">def sample_parser(lines):
    # Parse each line as one sample and
    # return a list of samples as a batch.
    ret = []
    for l in lines:
        ret.append(l.split(" ")[1:5])
    return ret</pre>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">param left_cmd:</th><td class="field-body">command to execute; its stdout is read as the data stream.</td>
</tr>
<tr class="field-even field"><th class="field-name">type left_cmd:</th><td class="field-body">string</td>
</tr>
<tr class="field-odd field"><th class="field-name">param parser:</th><td class="field-body">parser function used to parse the data.
If cut_lines is True, the parser receives a list
of lines.
If cut_lines is False, the parser receives a
raw buffer each time.
The parser should return a list of parsed values.</td>
</tr>
<tr class="field-even field"><th class="field-name">type parser:</th><td class="field-body">callable</td>
</tr>
<tr class="field-odd field"><th class="field-name">param bufsize:</th><td class="field-body">the buffer size used for the stdout pipe.</td>
</tr>
<tr class="field-even field"><th class="field-name">type bufsize:</th><td class="field-body">int</td>
</tr>
<tr class="field-odd field"><th class="field-name" colspan="2">param file_type:</th></tr>
<tr class="field-odd field"><td>&#160;</td><td class="field-body">can be plain/gzip, stream buffer data type.</td>
</tr>
<tr class="field-even field"><th class="field-name">type file_type:</th><td class="field-body">string</td>
</tr>
<tr class="field-odd field"><th class="field-name" colspan="2">param cut_lines:</th></tr>
<tr class="field-odd field"><td>&#160;</td><td class="field-body">whether to pass lines instead of raw buffer
to the parser</td>
</tr>
<tr class="field-even field"><th class="field-name">type cut_lines:</th><td class="field-body">bool</td>
</tr>
<tr class="field-odd field"><th class="field-name" colspan="2">param line_break:</th></tr>
<tr class="field-odd field"><td>&#160;</td><td class="field-body">line break of the file, like</td>
</tr>
</tbody>
</table>
</div></blockquote>
<dl class="docutils">
<dt>or</dt>
<dd><table class="first last docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name" colspan="2">type line_break:</th></tr>
<tr class="field-odd field"><td>&#160;</td><td class="field-body">string</td>
</tr>
<tr class="field-even field"><th class="field-name">return:</th><td class="field-body">the reader generator.</td>
</tr>
<tr class="field-odd field"><th class="field-name">rtype:</th><td class="field-body">callable</td>
</tr>
</tbody>
</table>
</dd>
</dl>
</dd></dl>
</div>
<p>The creator package contains some simple reader creators, which can
be used in user programs.</p>
......
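The behaviour documented for pipe_reader in the hunk above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the PaddlePaddle implementation: `pipe_reader_sketch` is a hypothetical name, it handles only plain (non-gzip) output, it assumes the command prints UTF-8 text, and it splits the command string with `shlex`. It shows how the last, possibly incomplete, line of each buffer is carried over to the next read when cut_lines is True.

```python
import codecs
import shlex
import subprocess
import sys


def sample_parser(lines):
    # Parse each line as one sample and return the list of samples.
    ret = []
    for l in lines:
        ret.append(l.split(" ")[1:5])
    return ret


def pipe_reader_sketch(left_cmd, parser, bufsize=8192, cut_lines=True,
                       line_break="\n"):
    """Hypothetical stand-in for pipe_reader (plain text only, no gzip)."""

    def reader():
        # Run the command and read its stdout through a pipe.
        process = subprocess.Popen(shlex.split(left_cmd),
                                   stdout=subprocess.PIPE)
        # Incremental decoding copes with multi-byte characters that
        # happen to be split across two buffers (assumes UTF-8 output).
        decoder = codecs.getincrementaldecoder("utf-8")()
        remained = ""
        try:
            while True:
                chunk = process.stdout.read(bufsize)
                if not chunk:
                    break
                text = decoder.decode(chunk)
                if not cut_lines:
                    # Hand the raw (decoded) buffer to the parser.
                    if text:
                        for sample in parser(text):
                            yield sample
                    continue
                pieces = (remained + text).split(line_break)
                # The last piece may be an incomplete line; keep it
                # and prepend it to the next buffer.
                remained = pieces.pop()
                if pieces:
                    for sample in parser(pieces):
                        yield sample
            # Flush a final line that had no trailing line break.
            if cut_lines and remained:
                for sample in parser([remained]):
                    yield sample
        finally:
            process.stdout.close()
            process.wait()

    return reader


if __name__ == "__main__":
    # Illustrative command: a child Python process printing two lines.
    code = 'print("a 1 2 3 4 5"); print("b 6 7 8 9 10")'
    cmd = shlex.quote(sys.executable) + " -c " + shlex.quote(code)
    for sample in pipe_reader_sketch(cmd, sample_parser, bufsize=4)():
        print(sample)
```

The deliberately small `bufsize=4` in the usage example forces every line to span several reads, which exercises the carry-over logic.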
The source diff is too large to display. You can view the blob instead.
......@@ -724,6 +724,88 @@ And this function contains a buffered decorator.
:rtype: callable</p>
</dd></dl>
<dl class="function">
<dt>
<code class="descclassname">paddle.v2.reader.</code><code class="descname">pipe_reader</code><span class="sig-paren">(</span><em>left_cmd</em>, <em>parser</em>, <em>bufsize=8192</em>, <em>file_type='plain'</em>, <em>cut_lines=True</em>, <em>line_break='\n'</em><span class="sig-paren">)</span></dt>
<dd><blockquote>
<div><p>pipe_reader reads data as a stream from a command: it takes the
command's stdout into a pipe buffer, passes it to the parser,
and yields the parsed data in your desired format.</p>
<p>You can use a standard Linux command or call another program
to read data from HDFS, Ceph, a URL, AWS S3, etc.:</p>
<p>cmd = &#8220;hadoop fs -cat /path/to/some/file&#8221;
cmd = &#8220;cat sample_file.tar.gz&#8221;
cmd = &#8220;curl <a class="reference external" href="http://someurl">http://someurl</a>&#8221;
cmd = &#8220;python print_s3_bucket.py&#8221;</p>
<p>A sample parser:</p>
<pre class="literal-block">def sample_parser(lines):
    # Parse each line as one sample and
    # return a list of samples as a batch.
    ret = []
    for l in lines:
        ret.append(l.split(" ")[1:5])
    return ret</pre>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">param left_cmd:</th><td class="field-body">command to execute; its stdout is read as the data stream.</td>
</tr>
<tr class="field-even field"><th class="field-name">type left_cmd:</th><td class="field-body">string</td>
</tr>
<tr class="field-odd field"><th class="field-name">param parser:</th><td class="field-body">parser function used to parse the data.
If cut_lines is True, the parser receives a list
of lines.
If cut_lines is False, the parser receives a
raw buffer each time.
The parser should return a list of parsed values.</td>
</tr>
<tr class="field-even field"><th class="field-name">type parser:</th><td class="field-body">callable</td>
</tr>
<tr class="field-odd field"><th class="field-name">param bufsize:</th><td class="field-body">the buffer size used for the stdout pipe.</td>
</tr>
<tr class="field-even field"><th class="field-name">type bufsize:</th><td class="field-body">int</td>
</tr>
<tr class="field-odd field"><th class="field-name" colspan="2">param file_type:</th></tr>
<tr class="field-odd field"><td>&#160;</td><td class="field-body">can be plain/gzip, stream buffer data type.</td>
</tr>
<tr class="field-even field"><th class="field-name">type file_type:</th><td class="field-body">string</td>
</tr>
<tr class="field-odd field"><th class="field-name" colspan="2">param cut_lines:</th></tr>
<tr class="field-odd field"><td>&#160;</td><td class="field-body">whether to pass lines instead of raw buffer
to the parser</td>
</tr>
<tr class="field-even field"><th class="field-name">type cut_lines:</th><td class="field-body">bool</td>
</tr>
<tr class="field-odd field"><th class="field-name" colspan="2">param line_break:</th></tr>
<tr class="field-odd field"><td>&#160;</td><td class="field-body">line break of the file, like</td>
</tr>
</tbody>
</table>
</div></blockquote>
<dl class="docutils">
<dt>or</dt>
<dd><table class="first last docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name" colspan="2">type line_break:</th></tr>
<tr class="field-odd field"><td>&#160;</td><td class="field-body">string</td>
</tr>
<tr class="field-even field"><th class="field-name">return:</th><td class="field-body">the reader generator.</td>
</tr>
<tr class="field-odd field"><th class="field-name">rtype:</th><td class="field-body">callable</td>
</tr>
</tbody>
</table>
</dd>
</dl>
</dd></dl>
</div>
<p>The creator package contains some simple reader creators, which can
be used in user programs.</p>
......
This diff is collapsed.