提交 58b61bd3 编写于 作者: M Mel Kiyama 提交者: dyozie

docs: COPY command - add PROGRAM clause (#3297)

* docs: COPY command add PROGRAM clause

* docs: copy - edits from review comments.
上级 f44db8da
...@@ -1166,8 +1166,7 @@ ...@@ -1166,8 +1166,7 @@
<tbody> <tbody>
<row> <row>
<entry colname="col1">DEBUG5 <entry colname="col1">DEBUG5
<p>DEBUG4</p><p>DEBUG3</p><p>DEBUG2</p><p>DEBUG1</p><p>LOG <p>DEBUG4</p><p>DEBUG3</p><p>DEBUG2</p><p>DEBUG1</p><p>LOG</p><p>NOTICE</p><p>WARNING</p><p>ERROR</p><p>FATAL</p><p>PANIC</p></entry>
NOTICE</p><p>WARNING</p><p>ERROR</p><p>FATAL</p><p>PANIC</p></entry>
<entry colname="col2">NOTICE</entry> <entry colname="col2">NOTICE</entry>
<entry colname="col3">master<p>session</p><p>reload</p></entry> <entry colname="col3">master<p>session</p><p>reload</p></entry>
</row> </row>
...@@ -3616,7 +3615,8 @@ ...@@ -3616,7 +3615,8 @@
</ul></p> </ul></p>
<p>If the value is <codeph>false</codeph>, the distribution policy is not checked. The data <p>If the value is <codeph>false</codeph>, the distribution policy is not checked. The data
added to the table might violate the table distribution policy for the segment instance. added to the table might violate the table distribution policy for the segment instance.
Manual redistribution of table data might be required. </p> Manual redistribution of table data might be required. See the <codeph>ALTER TABLE</codeph>
clause <codeph>WITH REORGANIZE</codeph>.</p>
<p>The parameter can be set for a database system or a session. The parameter cannot be set <p>The parameter can be set for a database system or a session. The parameter cannot be set
for a specific database.</p> for a specific database.</p>
<table id="table_gxf_hpm_z1b"> <table id="table_gxf_hpm_z1b">
...@@ -4878,7 +4878,9 @@ ...@@ -4878,7 +4878,9 @@
<topic id="gp_resource_group_cpu_limit"> <topic id="gp_resource_group_cpu_limit">
<title>gp_resource_group_cpu_limit</title> <title>gp_resource_group_cpu_limit</title>
<body> <body>
<note type="warning">Resource group-based workload management is an experimental feature and is not intended for use in a production environment. Experimental features are subject to change without notice in future releases.</note> <note type="warning">Resource group-based workload management is an experimental feature and
is not intended for use in a production environment. Experimental features are subject to
change without notice in future releases.</note>
<p>Identifies the maximum percentage of system CPU resources to allocate to resource groups on <p>Identifies the maximum percentage of system CPU resources to allocate to resource groups on
each Greenplum Database segment node.</p> each Greenplum Database segment node.</p>
<table id="gp_resource_group_cpu_limit_table"> <table id="gp_resource_group_cpu_limit_table">
...@@ -4907,7 +4909,9 @@ ...@@ -4907,7 +4909,9 @@
<topic id="gp_resource_group_memory_limit"> <topic id="gp_resource_group_memory_limit">
<title>gp_resource_group_memory_limit</title> <title>gp_resource_group_memory_limit</title>
<body> <body>
<note type="warning">Resource group-based workload management is an experimental feature and is not intended for use in a production environment. Experimental features are subject to change without notice in future releases.</note> <note type="warning">Resource group-based workload management is an experimental feature and
is not intended for use in a production environment. Experimental features are subject to
change without notice in future releases.</note>
<p>Identifies the maximum percentage of system memory resources to allocate to resource groups <p>Identifies the maximum percentage of system memory resources to allocate to resource groups
on each Greenplum Database segment node.</p> on each Greenplum Database segment node.</p>
<table id="gp_resource_group_memory_limit_table"> <table id="gp_resource_group_memory_limit_table">
...@@ -4936,7 +4940,9 @@ ...@@ -4936,7 +4940,9 @@
<topic id="gp_resource_manager"> <topic id="gp_resource_manager">
<title>gp_resource_manager</title> <title>gp_resource_manager</title>
<body> <body>
<note type="warning">Resource group-based workload management is an experimental feature and is not intended for use in a production environment. Experimental features are subject to change without notice in future releases.</note> <note type="warning">Resource group-based workload management is an experimental feature and
is not intended for use in a production environment. Experimental features are subject to
change without notice in future releases.</note>
<p>Identifies the resource management scheme currently enabled in the Greenplum Database <p>Identifies the resource management scheme currently enabled in the Greenplum Database
cluster. The default scheme is workload management using resource queues.</p> cluster. The default scheme is workload management using resource queues.</p>
<table id="gp_resource_manager_table"> <table id="gp_resource_manager_table">
...@@ -7213,7 +7219,9 @@ ...@@ -7213,7 +7219,9 @@
<topic id="max_resource_groups"> <topic id="max_resource_groups">
<title>max_resource_groups</title> <title>max_resource_groups</title>
<body> <body>
<note type="warning">Resource group-based workload management is an experimental feature and is not intended for use in a production environment. Experimental features are subject to change without notice in future releases.</note> <note type="warning">Resource group-based workload management is an experimental feature and
is not intended for use in a production environment. Experimental features are subject to
change without notice in future releases.</note>
<p>Sets the maximum number of resource groups that you can create in a Greenplum Database <p>Sets the maximum number of resource groups that you can create in a Greenplum Database
system. Resource groups are defined system-wide.</p> system. Resource groups are defined system-wide.</p>
<table id="max_resource_queues_table"> <table id="max_resource_queues_table">
......
...@@ -7,7 +7,7 @@ ...@@ -7,7 +7,7 @@
<p id="sql_command_desc">Copies data between a file and a table.</p> <p id="sql_command_desc">Copies data between a file and a table.</p>
<section id="section2"> <section id="section2">
<title>Synopsis</title> <title>Synopsis</title>
<codeblock id="sql_command_synopsis">COPY <varname>table</varname> [(<varname>column</varname> [, ...])] FROM {'<varname>file</varname>' | STDIN} <codeblock id="sql_command_synopsis">COPY <varname>table</varname> [(<varname>column</varname> [, ...])] FROM {'<varname>file</varname>' | PROGRAM '<varname>command</varname>' | STDIN}
     [ [WITH]      [ [WITH]
[ON SEGMENT] [ON SEGMENT]
[BINARY] [BINARY]
...@@ -23,7 +23,7 @@ ...@@ -23,7 +23,7 @@
     [[LOG ERRORS]      [[LOG ERRORS]
       SEGMENT REJECT LIMIT <varname>count</varname> [ROWS | PERCENT] ]        SEGMENT REJECT LIMIT <varname>count</varname> [ROWS | PERCENT] ]
COPY {table [(<varname>column</varname> [, ...])] | (<varname>query</varname>)} TO {'<varname>file</varname>' | STDOUT} COPY {table [(<varname>column</varname> [, ...])] | (<varname>query</varname>)} TO {'<varname>file</varname>' | PROGRAM '<varname>command</varname>' | STDOUT}
      [ [WITH]       [ [WITH]
[ON SEGMENT] [ON SEGMENT]
[BINARY] [BINARY]
...@@ -123,6 +123,26 @@ COPY {table [(<varname>column</varname> [, ...])] | (<varname>query</varname>)} ...@@ -123,6 +123,26 @@ COPY {table [(<varname>column</varname> [, ...])] | (<varname>query</varname>)}
</pt> </pt>
<pd>The absolute path name of the input or output file.</pd> <pd>The absolute path name of the input or output file.</pd>
</plentry> </plentry>
<plentry>
<pt>PROGRAM '<varname>command</varname>'</pt>
<pd>Specify a command to execute. The <varname>command</varname> must be specified from
the viewpoint of the Greenplum Database master host system, and must be executable by
the Greenplum Database administrator user (<codeph>gpadmin</codeph>). The <codeph>COPY
FROM</codeph> command reads the input from the standard output of the command, and for
the <codeph>COPY TO</codeph> command, the output is written to the standard input of the
command.</pd>
<pd>The <varname>command</varname> is invoked by a shell. When passing arguments to the
shell, strip or escape any special characters that have a special meaning for the shell.
For security reasons, it is best to use a fixed command string, or at least avoid
passing any user input in the string. </pd>
<pd>When <codeph>ON SEGMENT</codeph> is specified, the command must be executable on all
Greenplum Database primary segment hosts by the Greenplum Database administrator user
(<codeph>gpadmin</codeph>). The command is executed by each Greenplum segment
instance. The <codeph>&lt;SEGID></codeph> is required in the
<varname>command</varname>.</pd>
<pd>See the <codeph>ON SEGMENT</codeph> clause for information about command syntax
requirements and he data that is copied when the clause is specified. </pd>
</plentry>
<plentry> <plentry>
<pt>STDIN</pt> <pt>STDIN</pt>
<pd>Specifies that input comes from the client application. The <codeph>ON <pd>Specifies that input comes from the client application. The <codeph>ON
...@@ -170,6 +190,10 @@ COPY {table [(<varname>column</varname> [, ...])] | (<varname>query</varname>)} ...@@ -170,6 +190,10 @@ COPY {table [(<varname>column</varname> [, ...])] | (<varname>query</varname>)}
instance. </pd> instance. </pd>
</plentry> </plentry>
</parml></pd> </parml></pd>
<pd>When the <codeph>PROGRAM <varname>command</varname></codeph> clause is specified, the
<codeph>&lt;SEGID></codeph> string literal is required in the
<varname>command</varname>, the <codeph>&lt;SEG_DATA_DIR></codeph> string literal is
optional. See <xref href="#topic1/section11" format="dita">Examples</xref>.</pd>
<pd>For a <codeph>COPY FROM...ON SEGMENT</codeph> command, the table distribution policy <pd>For a <codeph>COPY FROM...ON SEGMENT</codeph> command, the table distribution policy
is checked when data is copied into the table. By default, an error is returned if a is checked when data is copied into the table. By default, an error is returned if a
data row violates the table distribution policy. You can disable the distribution policy data row violates the table distribution policy. You can disable the distribution policy
...@@ -378,7 +402,8 @@ COPY {table [(<varname>column</varname> [, ...])] | (<varname>query</varname>)} ...@@ -378,7 +402,8 @@ COPY {table [(<varname>column</varname> [, ...])] | (<varname>query</varname>)}
<codeph>COPY FROM...ON SEGMENT</codeph> was run.</p> <codeph>COPY FROM...ON SEGMENT</codeph> was run.</p>
<note>If you run <codeph>COPY FROM...ON SEGMENT</codeph>and the server configuration parameter <note>If you run <codeph>COPY FROM...ON SEGMENT</codeph>and the server configuration parameter
<codeph>gp_enable_segment_copy_checking</codeph> is <codeph>false</codeph>, manual <codeph>gp_enable_segment_copy_checking</codeph> is <codeph>false</codeph>, manual
redistribution of table data might be required.</note> redistribution of table data might be required. See the <codeph>ALTER TABLE</codeph> clause
<codeph>WITH REORGANIZE</codeph>.</note>
<p>When you specify the <codeph>LOG ERRORS</codeph> clause, Greenplum Database captures errors <p>When you specify the <codeph>LOG ERRORS</codeph> clause, Greenplum Database captures errors
that occur while reading the external table data. You can view and manage the captured error that occur while reading the external table data. You can view and manage the captured error
log data. </p> log data. </p>
...@@ -583,12 +608,9 @@ COPY {table [(<varname>column</varname> [, ...])] | (<varname>query</varname>)} ...@@ -583,12 +608,9 @@ COPY {table [(<varname>column</varname> [, ...])] | (<varname>query</varname>)}
isolation mode and log errors:</p> isolation mode and log errors:</p>
<codeblock>COPY sales FROM '/home/usr1/sql/sales_data' LOG ERRORS <codeblock>COPY sales FROM '/home/usr1/sql/sales_data' LOG ERRORS
SEGMENT REJECT LIMIT 10 ROWS;</codeblock> SEGMENT REJECT LIMIT 10 ROWS;</codeblock>
<p>To copy segment data for later use, use the <codeph>ON SEGMENT</codeph> argument. Use of <p>To copy segment data for later use, use the <codeph>ON SEGMENT</codeph> clause. Use of the
the <codeph>COPY TO ON SEGMENT</codeph> argument takes the form:</p> <codeph>COPY TO ON SEGMENT</codeph> command takes the form:</p>
<p><codeph>COPY</codeph> <codeblock>COPY <varname>table</varname> TO '&lt;SEG_DATA_DIR>/<varname>gpdumpname</varname>&lt;SEGID>_<varname>suffix</varname>' ON SEGMENT; </codeblock>
<varname>table</varname> TO
'&lt;SEG_DATA_DIR>/<varname>gpdumpname</varname>&lt;SEGID>_<varname>suffix</varname>' ON
SEGMENT; </p>
<p>The <codeph>&lt;SEGID></codeph> is required. However, you can substitute an absolute path <p>The <codeph>&lt;SEGID></codeph> is required. However, you can substitute an absolute path
for the <codeph>&lt;SEG_DATA_DIR></codeph> string literal in the path. </p> for the <codeph>&lt;SEG_DATA_DIR></codeph> string literal in the path. </p>
<p>When you pass in the string literal <codeph>&lt;SEG_DATA_DIR></codeph> and <p>When you pass in the string literal <codeph>&lt;SEG_DATA_DIR></codeph> and
...@@ -597,14 +619,10 @@ COPY {table [(<varname>column</varname> [, ...])] | (<varname>query</varname>)} ...@@ -597,14 +619,10 @@ COPY {table [(<varname>column</varname> [, ...])] | (<varname>query</varname>)}
<p>For example, if you have <codeph>mytable</codeph> with the segments and mirror segments <p>For example, if you have <codeph>mytable</codeph> with the segments and mirror segments
like like
this:<codeblock>contentid | dbid | file segment location this:<codeblock>contentid | dbid | file segment location
0 | 1 |/home/usr1/data1/gpsegdir0 0 | 1 | /home/usr1/data1/gpsegdir0
0 | 3 | /home/usr1/data_mirror1/gpsegdir0 0 | 3 | /home/usr1/data_mirror1/gpsegdir0
1 | 4 | /home/usr1/data2/gpsegdir1 1 | 4 | /home/usr1/data2/gpsegdir1
1 | 2 | /home/usr1/data_mirror2/gpsegdir1 </codeblock>running
1 | 2 | /home/usr1/data_mirror2/gpsegdir1
</codeblock>running
the the
command:<codeblock>COPY mytable TO '&lt;SEG_DATA_DIR>/gpbackup&lt;SEGID>.txt' ON SEGMENT;</codeblock> command:<codeblock>COPY mytable TO '&lt;SEG_DATA_DIR>/gpbackup&lt;SEGID>.txt' ON SEGMENT;</codeblock>
would result in the following would result in the following
...@@ -624,6 +642,22 @@ COPY {table [(<varname>column</varname> [, ...])] | (<varname>query</varname>)} ...@@ -624,6 +642,22 @@ COPY {table [(<varname>column</varname> [, ...])] | (<varname>query</varname>)}
necessary.<note>Tools such as <codeph>gpfdist</codeph> can be used to restore data. The necessary.<note>Tools such as <codeph>gpfdist</codeph> can be used to restore data. The
backup/restore tools will not work with files that were manually generated with backup/restore tools will not work with files that were manually generated with
<codeph>COPY TO ON SEGMENT</codeph>. </note></p> <codeph>COPY TO ON SEGMENT</codeph>. </note></p>
<p>This example copies the data from the <codeph>lineitem</codeph> table and uses the
<codeph>PROGRAM</codeph> clause to add the data to the
<codeph>/tmp/lineitem_program.csv</codeph> file with <codeph>cat</codeph> utility. The
file is placed on the Greenplum Database
master.<codeblock>COPY LINEITEM TO PROGRAM 'cat > /tmp/lineitem.csv' CSV; </codeblock></p>
<p>This example uses the <codeph>PROGRAM</codeph> and <codeph>ON SEGEMENT</codeph> clauses to
copy data to files on the segment hosts. On the segment hosts, the <codeph>COPY</codeph>
command replaces <codeph>&lt;SEGID></codeph> with the segment content ID to create a file
for each segment instance on the segment
host.<codeblock>COPY LINEITEM TO PROGRAM 'cat > /tmp/lineitem_program&lt;SEGID>.csv' ON SEGMENT CSV; </codeblock></p>
<p>This example uses the <codeph>PROGRAM</codeph> and <codeph>ON SEGEMENT</codeph> clauses to
copy data from files on the segment hosts. The <codeph>COPY</codeph> command replaces
<codeph>&lt;SEGID></codeph> with the segment content ID when copying data from the files.
On the segment hosts, there must be a file for each segment instance where the file name
contains the segment content ID on the segment host.
<codeblock>COPY LINEITEM_4 FROM PROGRAM 'cat /tmp/lineitem_program&lt;SEGID>.csv' ON SEGMENT CSV;</codeblock></p>
</section> </section>
<section id="section12"> <section id="section12">
<title>Compatibility</title> <title>Compatibility</title>
......
...@@ -336,9 +336,9 @@ testdb=#</codeblock> ...@@ -336,9 +336,9 @@ testdb=#</codeblock>
</plentry> </plentry>
<plentry> <plentry>
<pt>\copy {<varname>table</varname> [(<varname>column_list</varname>)] | <pt>\copy {<varname>table</varname> [(<varname>column_list</varname>)] |
(<varname>query</varname>)} {from | to} {<varname>filename</varname> | stdin | stdout (<varname>query</varname>)} {from | to} {'<varname>filename</varname>' | stdin |
| pstdin | pstdout} [with] [binary] [oids] [delimiter [as] stdout | pstdin | pstdout} [with] [binary] [oids] [delimiter [as]
'<varname>character</varname>'] [null [as] '<varname>string</varname>'] [csv [header] '<varname>character</varname>'] [null [as] '<varname>string</varname>'] [csv [header]
[quote [as] 'character'] [escape [as] '<varname>character</varname>'] [force quote [quote [as] 'character'] [escape [as] '<varname>character</varname>'] [force quote
column_list] [force not null column_list]]</pt> column_list] [force not null column_list]]</pt>
<pd>Performs a frontend (client) copy. This is an operation that runs an SQL <pd>Performs a frontend (client) copy. This is an operation that runs an SQL
...@@ -497,8 +497,10 @@ testdb=#</codeblock> ...@@ -497,8 +497,10 @@ testdb=#</codeblock>
<pd>Lists all database roles, or only those that match pattern.</pd> <pd>Lists all database roles, or only those that match pattern.</pd>
</plentry> </plentry>
<plentry> <plentry>
<pt>\dx [<varname>extension_pattern</varname>] | \dx+ [<varname>extension_pattern</varname>]</pt> <pt>\dx [<varname>extension_pattern</varname>] | \dx+
<pd>Lists all installed extensions, or only those that match the pattern. <codeph>\dx</codeph> and <codeph>\dx+</codeph> are functionally equivalent.</pd> [<varname>extension_pattern</varname>]</pt>
<pd>Lists all installed extensions, or only those that match the pattern.
<codeph>\dx</codeph> and <codeph>\dx+</codeph> are functionally equivalent.</pd>
</plentry> </plentry>
<plentry> <plentry>
<pt>\e | \edit [<varname>filename</varname>]</pt> <pt>\e | \edit [<varname>filename</varname>]</pt>
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册