Commit 3ddc470f, authored by Mel Kiyama, committed by dyozie

docs - gpcopy new options --dry-run, --no-distribution-check (#5264)

* docs - gpcopy new options --dry-run, --no-distribution-check

--Add limitation and warning about copying some types of partitioned tables with --no-distribution-check.
--Also added limitation and warning to Admin Guide.

* docs - gpcopy fix typo for new option --no-distribution-check

* docs - gpcopy new options. Removed unneeded xref.
Parent 3aa3c9e5
......@@ -35,8 +35,9 @@
<codeph>--truncate-source-after</codeph> option to help migrate data from one Pivotal
Greenplum Database system to another on the same hardware, requiring minimal free space
available.</li>
</ul></p></body>
<topic id="topic_psq_dsp_zdb">
</ul></p>
</body>
<topic id="topic_psq_dsp_zdb">
<title>Prerequisites</title>
<body>
<p>The source and destination Greenplum Database systems must already exist, have network
......@@ -46,9 +47,10 @@
<codeph>pg_dumpall</codeph>, and <codeph>psql</codeph> utilities installed with Greenplum
Database. In most cases, you run <codeph>gpcopy</codeph> from a Greenplum Database cluster,
so the dependencies are automatically met. If you need to run <codeph>gpcopy</codeph> on a
remote server, such as an ETL system, copy the <codeph>gpcopy</codeph> binary and install a
remote server, such as an ETL system, copy the <codeph>gpcopy</codeph> binary and install a
compatible <xref href="https://network.pivotal.io/products/pivotal-gpdb" scope="external"
>Greenplum Clients</xref> package to meet the <codeph>gpcopy</codeph> dependencies.</p>
format="html">Greenplum Clients</xref> package to meet the <codeph>gpcopy</codeph>
dependencies.</p>
<p><codeph>gpcopy</codeph> supports migrating data from a Greenplum Database 4.3.26 or later
cluster to a Greenplum Database 5.9 or later cluster. However Greenplum Database 4.3.26 and
later do not include the actual <codeph>gpcopy</codeph> utility. You must manually copy the
......@@ -57,47 +59,58 @@
example:<codeblock>$ cp /usr/local/greenplum-db-5.8.0/bin/gpcopy /usr/local/greenplum-db-4.3.26.0/bin/</codeblock></p>
</body>
</topic>
<topic id="topic_qwl_2rp_zdb">
<title>Limitations for the Source and Destination Systems</title>
<body>
<topic id="topic_qwl_2rp_zdb">
<title>Limitations for the Source and Destination Systems</title>
<body>
<p><codeph>gpcopy</codeph> cannot copy data from one database to another in the same Greenplum
Database system. The destination system must be a separate Greenplum Database cluster.</p>
<p dir="ltr" id="docs-internal-guid-872b07d9-60bd-1131-d5a6-3fb4dc225771">The
<p dir="ltr" id="docs-internal-guid-872b07d9-60bd-1131-d5a6-3fb4dc225771">The
<codeph>gpcopy</codeph> utility supports copying only between source and destination
Greenplum Database systems with the same number of segments. If you need to transfer data to
a Greenplum Database system having a different number of segments, you must use
<codeph>gptransfer</codeph> instead. See <xref href="gptransfer.xml#topic_gptransfer"
/>.</p>
<p>If you are copying data between Greenplum Database clusters having different versions,
each cluster must have <codeph>gpcopy</codeph> installed locally. <codeph>gpcopy</codeph> is
<p>If you are copying data between Greenplum Database clusters having different versions, each
cluster must have <codeph>gpcopy</codeph> installed locally. <codeph>gpcopy</codeph> is
installed with Pivotal Greenplum Database starting with versions 5.9.0 and 4.3.26.0.</p>
<p><codeph>gpcopy</codeph> transfers data from user databases only; the
<codeph>postgres</codeph>, <codeph>template0</codeph>, and <codeph>template1</codeph>
databases cannot be transferred. Administrators must transfer configuration files manually
and install extensions into the destination database with <codeph>gppkg</codeph>.</p>
<p><codeph>gpcopy</codeph> cannot copy a row that is larger than 1GB in size.</p>
<p>When transferring data between databases, you can run only one instance of
<codeph>gpcopy</codeph> at a time. Running multiple, concurrent instances of
<p><codeph>gpcopy</codeph> transfers data from user databases only; the
<codeph>postgres</codeph>, <codeph>template0</codeph>, and <codeph>template1</codeph>
databases cannot be transferred. Administrators must transfer configuration files manually
and install extensions into the destination database with <codeph>gppkg</codeph>.</p>
<p><codeph>gpcopy</codeph> cannot copy a row that is larger than 1GB in size.</p>
<p><codeph>gpcopy</codeph> does not support table data distribution checking when copying a
partitioned table that is defined with a leaf table that is an external table or if a leaf
table is defined with a distribution policy that is different from the root partitioned
table. You can copy those tables in a <codeph>gpcopy</codeph> operation and specify the
<codeph>--no-distribution-check</codeph> option to disable checking of data distribution. </p>
<note type="warning">Before you perform a <codeph>gpcopy</codeph> operation with the
<codeph>--no-distribution-check</codeph> option, ensure that you have a backup of the
destination database and that the distribution policies of the tables that are being copied
are the same in the source and destination database. Copying data into segment instances
with incorrect data distribution can cause incorrect query results and can cause database
corruption.</note>
<p>When transferring data between databases, you can run only one instance of
<codeph>gpcopy</codeph> at a time. Running multiple, concurrent instances of
<codeph>gpcopy</codeph> is not supported.</p>
</body>
</topic>
<topic id="topic_ay1_frp_zdb">
<title>Configuring Parallel Jobs</title>
<body>
<p>The degree of parallelism when running <codeph>gpcopy</codeph> is determined the option
</body>
</topic>
<topic id="topic_ay1_frp_zdb">
<title>Configuring Parallel Jobs</title>
<body>
<p>The degree of parallelism when running <codeph>gpcopy</codeph> is determined the option
<codeph>--jobs</codeph>. The option controls the number processes that
<codeph>gpcopy</codeph> runs in parallel. The default is 4. The range is from 1 to 64. </p>
<p>The <codeph>--jobs</codeph> value, <varname>n</varname>, produces
<codeph>2*<varname>n</varname>+1</codeph> database connections. For example, the default
<codeph>--jobs</codeph> value of 4 creates 9 connections.</p>
<p>If you increase this option, ensure that the Greenplum Database systems are configured
with a sufficient maximum concurrent connection value to accommodate the
<codeph>gpcopy</codeph> connections and any other concurrent connections (such as user
connections) that you require. See the Greenplum Database server configuration parameter
<p>If you increase this option, ensure that the Greenplum Database systems are configured with
a sufficient maximum concurrent connection value to accommodate the <codeph>gpcopy</codeph>
connections and any other concurrent connections (such as user connections) that you
require. See the Greenplum Database server configuration parameter
<codeph>max_connections</codeph>.</p>
</body>
</topic>
<topic id="topic_nd3_2sp_zdb">
</body>
</topic>
<topic id="topic_nd3_2sp_zdb">
<title>Validating Copied Data</title>
<body>
<p>By default, <codeph>gpcopy</codeph> does not validate the data transferred. You can request
......@@ -109,7 +122,7 @@
tables.</li>
<li><codeph>md5xor</codeph> - validates by selecting all rows of the source and
destination tables, converting all columns in a row to text, and then calculating the
md5 value of each row. <codeph>gpcopy</codeph> then performs an XOR over the MD5 values
md5 value of each row. <codeph>gpcopy</codeph> then performs an XOR over the MD5 values
to ensure that all rows were successfully copied for the table.</li>
</ul></p>
<note>Avoid using <codeph>--append</codeph> with either validation option. If you use
......@@ -118,7 +131,7 @@
destination tables. </note>
</body>
</topic>
<topic id="topic_ytw_2sp_zdb">
<topic id="topic_ytw_2sp_zdb">
<title>Addressing Failed Data Transfers</title>
<body>
<p>When <codeph>gpcopy</codeph> encounters errors and quits or is cancelled by the user,
......@@ -187,13 +200,13 @@
<li>The <systemoutput>gpcopy</systemoutput> utility does not copy external objects such as
Greenplum Database extensions, third party jar files, and shared object files. You must
recreate these external objects as necessary to match the source system. </li>
<li>Greenplum Database 5.x removes automatic implicit casts between the text type and other
data types. After you migrate from Greenplum Database version 4.3.x to version 5.x, this
change in behavior may impact existing applications and queries. Refer to <xref
href="../../install_guide/43x_to_5x.xml" format="dita" scope="peer">About Implicit
Text Casting in Greenplum Database</xref> in the <cite>Greenplum Database Installation
Guide</cite> for information, including a discussion about supported and unsupported
workarounds.</li>
<li>Greenplum Database 5.x removes automatic implicit casts between the text type and
other data types. After you migrate from Greenplum Database version 4.3.x to version
5.x, this change in behavior may impact existing applications and queries. Refer to
<xref href="../../install_guide/43x_to_5x.xml" format="dita" scope="peer">About
Implicit Text Casting in Greenplum Database</xref> in the <cite>Greenplum Database
Installation Guide</cite> for information, including a discussion about supported and
unsupported workarounds.</li>
<li>After migrating data you may need to modify SQL scripts, administration scripts, and
user-defined functions as necessary to account for changes in Greenplum Database version
5.x. Look for <b>Upgrade Action Required</b> entries in the <xref scope="external"
......@@ -302,13 +315,13 @@ ORDER BY 4 DESC LIMIT 5;
<li>The <systemoutput>gpcopy</systemoutput> utility does not copy external objects such as
Greenplum Database extensions, third party jar files, and shared object files. You must
recreate these external objects as necessary to match the source system. </li>
<li>Greenplum Database 5.x removes automatic implicit casts between the text type and other
data types. After you migrate from Greenplum Database version 4.3.x to version 5.x, this
change in behavior may impact existing applications and queries. Refer to <xref
href="../../install_guide/43x_to_5x.xml" format="dita" scope="peer">About Implicit
Text Casting in Greenplum Database</xref> in the <cite>Greenplum Database Installation
Guide</cite> for information, including a discussion about supported and unsupported
workarounds.</li>
<li>Greenplum Database 5.x removes automatic implicit casts between the text type and
other data types. After you migrate from Greenplum Database version 4.3.x to version
5.x, this change in behavior may impact existing applications and queries. Refer to
<xref href="../../install_guide/43x_to_5x.xml" format="dita" scope="peer">About
Implicit Text Casting in Greenplum Database</xref> in the <cite>Greenplum Database
Installation Guide</cite> for information, including a discussion about supported and
unsupported workarounds.</li>
<li>After migrating data you may need to modify SQL scripts, administration scripts, and
user-defined functions as necessary to account for changes in Greenplum Database version
5.x. Look for <b>Upgrade Action Required</b> entries in the <xref scope="external"
......
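Note on the `md5xor` description in the Validating Copied Data hunk above: the following is a minimal Python sketch of the idea as the text states it (convert each row's columns to text, take the MD5 of each row, XOR the digests). It is not gpcopy's implementation. The column separator, the NULL handling, and the function name `md5xor` are assumptions made for illustration.

```python
import hashlib


def md5xor(rows):
    """XOR together the MD5 digests of each row's text form.

    Illustrative only: the separator and NULL representation below are
    assumptions, not what gpcopy actually uses.
    """
    acc = 0
    for row in rows:
        # Convert every column to text, then join into one string per row.
        text = "\x1f".join("" if col is None else str(col) for col in row)
        digest = hashlib.md5(text.encode("utf-8")).digest()
        # XOR is commutative, so the result does not depend on row order.
        acc ^= int.from_bytes(digest, "big")
    return acc


source = [(1, "alpha", 3.5), (2, "beta", None)]
dest_reordered = [(2, "beta", None), (1, "alpha", 3.5)]
dest_missing = [(1, "alpha", 3.5)]

same = md5xor(source) == md5xor(dest_reordered)    # rows match, order differs
differs = md5xor(source) != md5xor(dest_missing)   # one row was not copied
```

Because XOR ignores order, the check does not require the two systems to return rows in the same sequence. One property of any XOR-based checksum is that two identical rows cancel each other out, so in this sketch a pair of duplicated rows would go undetected.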
......@@ -21,8 +21,10 @@
[<b>--exclude-table-file</b> <varname>table-file1</varname>]
[ <b>--exclude-table-file</b> <varname>table-file2</varname>] ... ]]
[<b>--skip-existing</b> | <b>--truncate</b> | <b>--drop</b> | <b>--append</b> ]
[<b>--dry-run</b>]
[<b>--analyze</b>]
[<b>--validate</b> <varname>type</varname>]
[<b>--no-distribution-check</b>]
[<b>--truncate-source-after</b> [<b>--yes</b>]]
[<b>--metadata-only</b>]
{ <b>--dest-host</b> <varname>dest_host</varname> [<b>--dest-port</b> <varname>dest_port</varname>]
......@@ -116,6 +118,14 @@
<codeph>--skip-existing</codeph>, <codeph>--truncate</codeph>,
<codeph>--drop</codeph>, or <codeph>--append</codeph>.</pd>
</plentry>
<plentry>
<pt>--dry-run</pt>
<pd>When you specify this option, <codeph>gpcopy</codeph> generates a list of
the migration operations that would have been performed with the specified
options. The data is not migrated. </pd>
<pd>The information is displayed at the command line and written to the log
file.</pd>
</plentry>
<plentry>
<pt>--jobs <varname>int</varname></pt>
<pd>Specify the number processes that <codeph>gpcopy</codeph> runs in parallel.
......@@ -285,6 +295,24 @@
to the destination database when copying data to a different host.</pd>
<pd>The utility does not compress data when copying data to the same host.</pd>
</plentry>
<plentry>
<pt>--no-distribution-check</pt>
<pd>Specify this option to disable table data distribution checking. By default,
<codeph>gpcopy</codeph> performs data distribution checking to ensure
data is distributed to segment instances correctly. If distribution checking
fails, the table copy fails. </pd>
<pd>The utility does not support table data distribution checking when copying a
partitioned table that is defined with a leaf table that is an external
table or if a leaf table is defined with a distribution policy that is
different from the root partitioned table. <note type="warning">Before you
perform a <codeph>gpcopy</codeph> operation with the
<codeph>--no-distribution-check</codeph> option, ensure that you
have a backup of the destination database and that the distribution
policies of the tables that are being copied are the same in the source
and destination database. Copying data into segment instances with
incorrect data distribution can cause incorrect query results and can
cause database corruption.</note></pd>
</plentry>
<plentry>
<pt>--quiet</pt>
<pd>If specified, suppress status messages at the command prompt. The messages
......@@ -380,7 +408,7 @@
</plentry>
</parml>
</section>
<section>
<section id="notes_gpcopy">
<title>Notes</title>
<p>If a <codeph>gpcopy</codeph> command specifies an invalid option, or specifies a
source table or database that does not exist, the utility returns an error and
......@@ -441,6 +469,19 @@
with an external table, that leaf partition is created, but data is not copied. </p>
<p>You cannot copy an individual leaf partition, you must copy the entire
partitioned table.</p>
<p><codeph>gpcopy</codeph> does not support table data distribution checking when
copying a partitioned table that is defined with a leaf table that is an
external table or if a leaf table is defined with a distribution policy that is
different from the root partitioned table. You can copy those tables in a
<codeph>gpcopy</codeph> operation and specify the option
<codeph>--no-distribution-check</codeph> to disable checking of data
distribution. </p>
<note type="warning">Before you perform a <codeph>gpcopy</codeph> operation with the
<codeph>--no-distribution-check</codeph> option, ensure that you have a
backup of the destination database and that the distribution policies of the
tables that are being copied are the same in the source and destination
database. Copying data into segment instances with incorrect data distribution
can cause incorrect query results and can cause database corruption.</note>
</sectiondiv>
<sectiondiv>
<p><b>Handling gpcopy Errors</b></p>
......
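Note on the new options: a small Python sketch of how a wrapper script might assemble a `--dry-run` invocation and estimate the connection count from the documented `2*n+1` rule. The helper names, the host names, and the `--source-host` and `--dbname` arguments are assumptions for illustration; only `--dest-host`, `--jobs`, `--dry-run`, and `--no-distribution-check` appear in this diff. Check the full gpcopy reference before running anything like this.

```python
def gpcopy_connections(jobs):
    """Database connections for a given --jobs value (2*n+1, per the docs)."""
    if not 1 <= jobs <= 64:
        raise ValueError("--jobs must be between 1 and 64")
    return 2 * jobs + 1


def dry_run_argv(dest_host, selection_args, jobs=4, no_distribution_check=False):
    """Build an argv list for a gpcopy dry run (hypothetical helper).

    selection_args holds the source and database/table selection options,
    which are outside this diff and supplied by the caller.
    """
    argv = ["gpcopy", "--dest-host", dest_host, "--jobs", str(jobs), "--dry-run"]
    if no_distribution_check:
        # Only needed for partitioned tables whose leaf tables are external
        # or use a distribution policy different from the root table.
        argv.append("--no-distribution-check")
    return argv + list(selection_args)


# Hypothetical hosts and database name, for illustration only.
argv = dry_run_argv(
    "dest-mdw",
    ["--source-host", "src-mdw", "--dbname", "sales"],
    jobs=8,
    no_distribution_check=True,
)
needed = gpcopy_connections(8)  # 17 connections for --jobs 8
```

Running with `--dry-run` first lists the planned operations without moving data, which is a reasonable precaution before using `--no-distribution-check`, given the warning about incorrect data distribution.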