From 2bc5401c3f8ce48cc168252eacc26668e67f0ec1 Mon Sep 17 00:00:00 2001
From: Mel Kiyama
Date: Tue, 26 Sep 2017 12:26:24 -0700
Subject: [PATCH] docs: COPY... ON SEGMENT w/ SELECT is not supported. (#3348)

---
 gpdb-doc/dita/ref_guide/sql_commands/COPY.xml | 145 +++++++++---------
 1 file changed, 71 insertions(+), 74 deletions(-)

diff --git a/gpdb-doc/dita/ref_guide/sql_commands/COPY.xml b/gpdb-doc/dita/ref_guide/sql_commands/COPY.xml
index 8735f1dc20..fa4b81dee6 100644
--- a/gpdb-doc/dita/ref_guide/sql_commands/COPY.xml
+++ b/gpdb-doc/dita/ref_guide/sql_commands/COPY.xml
@@ -340,74 +340,73 @@ COPY {table [(column [, ...])] | (query)}
-
- Notes -

COPY can only be used with tables, not with external tables or views. - However, you can write COPY (SELECT * FROM viewname) TO ... -

-

To copy data from a partitioned table with a leaf child partition that is an external - table, use an SQL query to copy the data. For example, if the table - my_sales contains a with a leaf child partition that is an external - table, this command COPY my_sales TO stdout returns an error. This command - sends the data to stdout:COPY (SELECT * from my_sales ) TO stdout

-

The BINARY key word causes all data to be stored/read as binary format - rather than as text. It is somewhat faster than the normal text mode, but a binary-format - file is less portable across machine architectures and Greenplum Database versions. Also, - you cannot run COPY FROM in single row error isolation mode if the data is - in binary format.

-

You must have SELECT privilege on the table whose values are read by - COPY TO, and insert privilege on the table into which values are inserted - by COPY FROM.

-

Files named in a COPY command are read or written directly by the database - server, not by the client application. Therefore, they must reside on or be accessible to - the Greenplum Database master host machine, not the client. They must be accessible to and - readable or writable by the Greenplum Database system user (the user ID the server runs as), - not the client. COPY naming a file is only allowed to database superusers, - since it allows reading or writing any file that the server has privileges to access.

-

COPY FROM will invoke any triggers and check constraints on the - destination table. However, it will not invoke rewrite rules. Note that in this release, - violations of constraints are not evaluated for single row error isolation mode.

-

COPY input and output is affected by DateStyle. To ensure - portability to other Greenplum Database installations that might use non-default - DateStyle settings, DateStyle should be set to ISO - before using COPY TO.

-

When copying XML data from a file in text mode, the server configuration parameter - xmloption - affects the validation of the XML data that is copied. If the value is - content (the default), XML data is validated as an XML content fragment. - If the parameter value is document, XML data is validated as an XML - document. If the XML data is not valid, COPY returns an error.

-

By default, COPY stops operation at the first error. This should not lead - to problems in the event of a COPY TO, but the target table will already - have received earlier rows in a COPY FROM. These rows will not be visible - or accessible, but they still occupy disk space. This may amount to a considerable amount of +

Notes

COPY can only be used with + tables, not with external tables or views. However, you can write COPY (SELECT * + FROM viewname) TO ... +

When the ON SEGMENT clause is specified, the COPY + TO command does not support a SELECT statement as the data + source. For example, this command is not + supported: COPY (SELECT * FROM testtbl) TO '/tmp/mytst<SEGID>' ON SEGMENT
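For contrast with the unsupported query form, a minimal sketch of the table form of the same command, assuming `testtbl` is an ordinary table and the path is writable on every segment host (both are taken from the example in the note, not verified here):

```sql
-- Table form: each segment writes its own file, with <SEGID>
-- replaced by that segment's content ID.
COPY testtbl TO '/tmp/mytst<SEGID>' ON SEGMENT;
```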

To + copy data from a partitioned table with a leaf child partition that is an external table, + use an SQL query to copy the data. For example, if the table my_sales + contains a leaf child partition that is an external table, this command COPY + my_sales TO stdout returns an error. This command sends the data to + stdout: COPY (SELECT * from my_sales ) TO stdout

The + BINARY key word causes all data to be stored/read as binary format rather + than as text. It is somewhat faster than the normal text mode, but a binary-format file is + less portable across machine architectures and Greenplum Database versions. Also, you cannot + run COPY FROM in single row error isolation mode if the data is in binary + format.

You must have SELECT privilege on the table whose values are + read by COPY TO, and insert privilege on the table into which values are + inserted by COPY FROM.

Files named in a COPY + command are read or written directly by the database server, not by the client application. + Therefore, they must reside on or be accessible to the Greenplum Database master host + machine, not the client. They must be accessible to and readable or writable by the + Greenplum Database system user (the user ID the server runs as), not the client. + COPY naming a file is only allowed to database superusers, since it + allows reading or writing any file that the server has privileges to + access.

COPY FROM will invoke any triggers and check constraints on + the destination table. However, it will not invoke rewrite rules. Note that in this release, + violations of constraints are not evaluated for single row error isolation + mode.

COPY input and output is affected by + DateStyle. To ensure portability to other Greenplum Database + installations that might use non-default DateStyle settings, + DateStyle should be set to ISO before using COPY + TO.
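A short sketch of the DateStyle recommendation, reusing the `my_sales` table name from the earlier example as a placeholder:

```sql
-- Force ISO date output for this session before exporting,
-- so the data loads cleanly on installations with other DateStyle settings.
SET DateStyle TO ISO;
COPY my_sales TO STDOUT;
```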

When copying XML data from a file in text mode, the server configuration + parameter xmloption affects the validation of the XML data that is copied. If the + value is content (the default), XML data is validated as an XML content + fragment. If the parameter value is document, XML data is validated as an + XML document. If the XML data is not valid, COPY returns an error.

By + default, COPY stops operation at the first error. This should not lead to + problems in the event of a COPY TO, but the target table will already have + received earlier rows in a COPY FROM. These rows will not be visible or + accessible, but they still occupy disk space. This may amount to a considerable amount of wasted disk space if the failure happened well into a large COPY FROM operation. You may wish to invoke VACUUM to recover the wasted space. Another option would be to use single row error isolation mode to filter out error rows - while still loading good rows.
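The two recovery options in this note might look like the following. This is a sketch only: `my_sales` is a placeholder table, the reject limit of 10 is an arbitrary illustrative value, and the `SEGMENT REJECT LIMIT` clause is assumed to follow the form given in the COPY synopsis.

```sql
-- Option 1: reclaim the space held by invisible rows from a failed load.
VACUUM my_sales;

-- Option 2: load in single row error isolation mode, so that malformed rows
-- are logged and skipped (up to the limit) instead of aborting the load.
COPY my_sales FROM STDIN LOG ERRORS SEGMENT REJECT LIMIT 10 ROWS;
```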

-

When a COPY FROM...ON SEGMENT command is run, the server configuration - parameter gp_enable_segment_copy_checking controls whether the table - distribution policy (from the table DISTRIBUTED clause) is checked when - data is copied into the table. The default is to check the distribution policy. An error is - returned if the row of data violates the distribution policy for the segment instance. For a - partitioned table, if the distribution policy of the child leaf partitioned table is not the - same as the root table, an error is returned for all data. For information about the - parameter, see .

-

Data from a table that is generated by a COPY TO...ON SEGMENT command can - be used to restore table data with COPY FROM...ON SEGMENT. However, data - restored to the segments is distributed according to the table distribution policy at the - time the files were generated with the COPY TO command. The - COPY command might return table distribution policy errors, if you - attempt to restore table data and the table distribution policy was changed after the - COPY FROM...ON SEGMENT was run.

- If you run COPY FROM...ON SEGMENTand the server configuration parameter - gp_enable_segment_copy_checking is false, manual + while still loading good rows.

When a COPY FROM...ON SEGMENT command + is run, the server configuration parameter gp_enable_segment_copy_checking + controls whether the table distribution policy (from the table DISTRIBUTED + clause) is checked when data is copied into the table. The default is to check the + distribution policy. An error is returned if the row of data violates the distribution + policy for the segment instance. For a partitioned table, if the distribution policy of the + child leaf partitioned table is not the same as the root table, an error is returned for all + data. For information about the parameter, see .

Data from a table that is + generated by a COPY TO...ON SEGMENT command can be used to restore table + data with COPY FROM...ON SEGMENT. However, data restored to the segments is + distributed according to the table distribution policy at the time the files were generated + with the COPY TO command. The COPY command might return + table distribution policy errors, if you attempt to restore table data and the table + distribution policy was changed after the COPY FROM...ON SEGMENT was + run.

If you run COPY FROM...ON SEGMENT and the server configuration + parameter gp_enable_segment_copy_checking is false, manual redistribution of table data might be required. See the ALTER TABLE clause - WITH REORGANIZE. -

When you specify the LOG ERRORS clause, Greenplum Database captures errors - that occur while reading the external table data. You can view and manage the captured error - log data.

-
    + WITH REORGANIZE.
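As a hedged illustration of the manual redistribution mentioned above, assuming a placeholder table named `my_sales` whose rows were loaded with distribution checking disabled:

```sql
-- Rewrite the table so every row is placed on the segment that the
-- current distribution policy assigns it to.
ALTER TABLE my_sales SET WITH (REORGANIZE=true);
```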

    When you specify the LOG + ERRORS clause, Greenplum Database captures errors that occur while reading the + external table data. You can view and manage the captured error log data.

    • Use the built-in SQL function gp_read_error_log('table_name'). It requires SELECT privilege on table_name. This example @@ -434,16 +433,14 @@ COPY {table [(column [, ...])] | (query)} information that was not deleted due to previous database issues. If * is specified, database owner privilege is required. If *.* is specified, operating system super-user privilege is required.

    • -
    -

    When a Greenplum Database user who is not a superuser runs a COPY command, - the command can be controlled by a resource queue. The resource queue must be configured - with the ACTIVE_STATEMENTS parameter that specifies a maximum limit on the - number of queries that can be executed by roles assigned to that queue. Greenplum Database - does not apply a cost value or memory value to a COPY command, resource - queues with only cost or memory limits do not affect the running of COPY - commands.

    -

    A non-superuser can run only these types of COPY commands:

      +

    When a Greenplum Database user who is not a superuser runs a COPY + command, the command can be controlled by a resource queue. The resource queue must be + configured with the ACTIVE_STATEMENTS parameter that specifies a maximum + limit on the number of queries that can be executed by roles assigned to that queue. + Because Greenplum Database does not apply a cost value or memory value to a COPY + command, resource queues with only cost or memory limits do not affect the running of + COPY commands.

    A non-superuser can run only these types of + COPY commands:

    • COPY FROM command where the source is stdin
    • COPY TO command where the destination is stdout

    -- GitLab