Commit 84d505de, authored by Jane Beckman, committed by David Yozie

Docs: Add COPY ON SEGMENT (#2681)

* Updated file for COPY ON SEGMENT

* Extra note about FROM and STDOUT

* Incorporate comments from David and Mel

* Revise COPY FROM note

* Copying note from line 65
Parent 50f812c6
......@@ -24,6 +24,7 @@
COPY {table [(<varname>column</varname> [, ...])] | (<varname>query</varname>)} TO {'<varname>file</varname>' | STDOUT}
      [ [WITH]
[ON SEGMENT]
[BINARY]
        [OIDS]
        [HEADER]
......@@ -37,19 +38,38 @@ COPY {table [(<varname>column</varname> [, ...])] | (<varname>query</varname>)}
<section id="section3">
<title>Description</title>
<p><codeph>COPY</codeph> moves data between Greenplum Database tables and standard file-system
files. <codeph>COPY TO</codeph> copies the contents of a table to a file, while <codeph>COPY
files. <codeph>COPY TO</codeph> copies the contents of a table to a file (or multiple files
based on the segment ID if copying <codeph>ON SEGMENT</codeph>), while <codeph>COPY
FROM</codeph> copies data from a file to a table (appending the data to whatever is in the
table already). <codeph>COPY TO</codeph> can also copy the results of a
<codeph>SELECT</codeph> query. </p>
<codeph>SELECT</codeph> query. <note><codeph>COPY
FROM</codeph> currently does not support copying data <codeph>FROM</codeph> the segment
files generated by <codeph>COPY TO</codeph> with the <codeph>ON SEGMENT</codeph> option,
but other tools can be used to restore data.</note></p>
<p>If a list of columns is specified, <codeph>COPY</codeph> will only copy the data in the
specified columns to or from the file. If there are any columns in the table that are not in
the column list, <codeph>COPY FROM</codeph> will insert the default values for those
columns. </p>
<p><codeph>COPY</codeph> with a file name instructs the Greenplum Database master host to
directly read from or write to a file. The file must be accessible to the master host and
the name must be specified from the viewpoint of the master host. When
<codeph>STDIN</codeph> or <codeph>STDOUT</codeph> is specified, data is transmitted via
the connection between the client and the master.</p>
the name must be specified from the viewpoint of the master host. </p>
<p>When <codeph>COPY TO</codeph> is used with the <codeph>ON SEGMENT</codeph> option, each
segment writes its own segment-oriented file, which remains on the segment host. The
<varname>file</varname> argument for <codeph>ON SEGMENT</codeph> must contain the string
literal <codeph>&lt;SEGID></codeph> and can use either an absolute path or the
<codeph>&lt;SEG_DATA_DIR></codeph> string literal. When the <codeph>COPY</codeph>
operation is run, each segment's content ID and data directory path are substituted for
the string literals. </p>
<p>The <codeph>ON SEGMENT</codeph> option allows you to copy table data to files on segments
for use in operations such as migrating data between clusters or performing a backup.
Segment data created by the <codeph>ON SEGMENT</codeph> option can be restored by tools such
as <codeph>gpfdist</codeph>, which is useful for high speed data loading. <note><codeph>COPY
FROM</codeph> currently does not support copying data <codeph>FROM</codeph> the segment
files generated by <codeph>COPY TO</codeph> with the <codeph>ON SEGMENT</codeph> option,
but other tools can be used to restore data.</note></p>
<p>When <codeph>STDIN</codeph> or <codeph>STDOUT</codeph> is specified, data is transmitted
via the connection between the client and the master. <codeph>STDOUT</codeph> cannot be used
with the <codeph>ON SEGMENT</codeph> option.</p>
<p>If <codeph>SEGMENT REJECT LIMIT</codeph> is used, then a <codeph>COPY FROM</codeph>
operation will operate in single row error isolation mode. In this release, single row error
isolation mode only applies to rows in the input file with format errors — for example,
......@@ -114,6 +134,35 @@ COPY {table [(<varname>column</varname> [, ...])] | (<varname>query</varname>)}
<pt>STDOUT</pt>
<pd>Specifies that output goes to the client application. </pd>
</plentry>
<plentry>
<pt>ON SEGMENT</pt>
<pd>Copy table data to create individual segment-oriented files, which remain on the
segment hosts. <codeph>COPY FROM</codeph> the <codeph>ON SEGMENT</codeph> output is
currently not supported. <codeph>COPY TO STDOUT</codeph> cannot be used with <codeph>ON
SEGMENT</codeph>. The <codeph>&lt;SEG_DATA_DIR></codeph> and
<codeph>&lt;SEGID></codeph> string literals are used with <codeph>ON SEGMENT</codeph>,
with the following
syntax:<codeblock>COPY <varname>table</varname> TO '&lt;SEG_DATA_DIR>/<varname>gpdumpname</varname>&lt;SEGID>_<varname>suffix</varname>' ON SEGMENT; </codeblock>Mirror
segments do not copy their data into the segment files.<parml>
<plentry>
<pt>&lt;SEG_DATA_DIR></pt>
<pd>The string literal representing the full path of the segment data directory for
<codeph>ON SEGMENT</codeph> copying. The angle brackets are part of the string literal
used to specify the path. <codeph>COPY</codeph> replaces the string literal with the
segment path(s) when <codeph>COPY</codeph> is run. (Optional. An absolute path can be used
in place of the <codeph>&lt;SEG_DATA_DIR></codeph> string literal.)</pd>
</plentry>
</parml><parml>
<plentry>
<pt>&lt;SEGID></pt>
<pd>The string literal representing the content ID number of the segment to be
copied when copying <codeph>ON SEGMENT</codeph>. <codeph>&lt;SEGID></codeph> is a
required part of the <codeph>ON SEGMENT</codeph> option. The angle brackets are part of the
string literal used to specify the path. <codeph>COPY</codeph> replaces the string
literal with the content ID when <codeph>COPY</codeph> is run.</pd>
</plentry>
</parml></pd>
</plentry>
<plentry>
<pt>BINARY</pt>
<pd>Causes all data to be stored or read in binary format rather than as text. You cannot
......@@ -502,6 +551,45 @@ COPY {table [(<varname>column</varname> [, ...])] | (<varname>query</varname>)}
isolation mode and log errors:</p>
<codeblock>COPY sales FROM '/home/usr1/sql/sales_data' LOG ERRORS
SEGMENT REJECT LIMIT 10 ROWS;</codeblock>
<p>To copy segment data for later use, use the <codeph>ON SEGMENT</codeph> argument. Use of
the <codeph>ON SEGMENT</codeph> argument takes the form:</p>
<p>COPY <varname>table</varname> TO
'&lt;SEG_DATA_DIR>/<varname>gpdumpname</varname>&lt;SEGID>_<varname>suffix</varname>' ON
SEGMENT; </p>
<p>The <codeph>&lt;SEGID></codeph> is required. However, you can substitute an absolute path
for the <codeph>&lt;SEG_DATA_DIR></codeph> string literal in the path. </p>
<p>When you pass the string literals <codeph>&lt;SEG_DATA_DIR></codeph> and
<codeph>&lt;SEGID></codeph> to <codeph>COPY</codeph>, <codeph>COPY</codeph> fills in the
appropriate values when the operation is run.</p>
<p>For example, if <codeph>mytable</codeph> is stored on the following segments and mirror
segments:<codeblock>contentid | dbid | file segment location
    0     |  1   | /home/usr1/data1/gpsegdir0
    0     |  3   | /home/usr1/data_mirror1/gpsegdir0
    1     |  4   | /home/usr1/data2/gpsegdir1
    1     |  2   | /home/usr1/data_mirror2/gpsegdir1
</codeblock>running
the
command:<codeblock>COPY mytable TO '&lt;SEG_DATA_DIR>/gpbackup&lt;SEGID>.txt' ON SEGMENT;</codeblock>
would result in the following
files:<codeblock>/home/usr1/data1/gpsegdir0/gpbackup0.txt
/home/usr1/data2/gpsegdir1/gpbackup1.txt</codeblock></p>
<p>The content ID in the first column is the identifier inserted into the file path (for
example, <codeph>gpsegdir0/gpbackup0.txt</codeph> above). Files are created on the segment
hosts, rather than on the master, as they would be in a standard <codeph>COPY</codeph>
operation. No data files are created for the mirror segments when using <codeph>ON
SEGMENT</codeph> copying.</p>
<p>If an absolute path is specified instead of <codeph>&lt;SEG_DATA_DIR></codeph>, as in
the statement
<codeblock>COPY mytable TO '/tmp/gpdir/gpbackup_&lt;SEGID>.txt' ON SEGMENT;</codeblock>
the files are placed in <codeph>/tmp/gpdir</codeph> on every segment.<note>The
<codeph>COPY FROM</codeph> operation does not currently support the <codeph>ON
SEGMENT</codeph> argument. Tools such as <codeph>gpfdist</codeph> can be used to restore
data.</note>
</p>
</section>
<section id="section12">
<title>Compatibility</title>
......
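
The placeholder substitution described in the `ON SEGMENT` example of this diff can be sketched in a few lines of Python. This only mimics the documented behavior and is not Greenplum's implementation; the function name `expand_on_segment_path` and the `primaries` mapping are hypothetical, introduced purely for illustration.

```python
# Illustrative sketch only: mimics the documented placeholder substitution,
# not Greenplum's actual implementation.

def expand_on_segment_path(template, primaries):
    """Return a dict of content ID -> expanded file path.

    template  -- the file argument given to COPY ... ON SEGMENT
    primaries -- hypothetical mapping of content ID -> segment data directory
    """
    if "<SEGID>" not in template:
        # The documentation states that <SEGID> is required.
        raise ValueError("<SEGID> is required in the file path")
    paths = {}
    for content_id, data_dir in sorted(primaries.items()):
        # <SEG_DATA_DIR> is optional; an absolute path is left unchanged.
        path = template.replace("<SEG_DATA_DIR>", data_dir)
        path = path.replace("<SEGID>", str(content_id))
        paths[content_id] = path
    return paths


# Primary segments from the example in the diff; mirror segments are
# omitted because they do not write files.
primaries = {
    0: "/home/usr1/data1/gpsegdir0",
    1: "/home/usr1/data2/gpsegdir1",
}

for path in expand_on_segment_path(
    "<SEG_DATA_DIR>/gpbackup<SEGID>.txt", primaries
).values():
    print(path)
```

With the two primary segments from the example, the loop prints the same two paths listed in the diff. Passing an absolute template such as `/tmp/gpdir/gpbackup_<SEGID>.txt` leaves the directory untouched and only fills in the content ID.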