Commit 77c0d584 · Author: Lisa Owen · Committer: David Yozie

docs - misc updates to gptransfer (#5048)

* docs - misc updates to gptransfer

- conref from best practices to admin guide
- qualify use for migration to diff number of segments
- misc edits

* conditionalize
Parent dbb4b61a
......@@ -5,25 +5,34 @@
<shortdesc>This topic describes how to use the <codeph>gptransfer</codeph> utility to transfer
data between databases.</shortdesc>
<body>
<p>The <codeph>gptransfer</codeph> migration utility transfers Greenplum Database metadata and
<note id="note1" otherprops="pivotal">Greenplum Database provides two utilities for migrating data
between Greenplum Database installations, <codeph>gpcopy</codeph> and
<codeph>gptransfer</codeph>. Use <codeph>gpcopy</codeph> to migrate data
to a Greenplum Database cluster version 5.9.0 and later when the destination
Greenplum installation has the same number of segments as the source installation.
Use the <codeph>gptransfer</codeph> utility to migrate data between Greenplum
Database installations running software version 5.8.0 or earlier, or when you
migrate data between Greenplum Database installations with differing numbers
of segments (any versions).</note>
<p id="intro1">The <codeph>gptransfer</codeph> migration utility transfers Greenplum Database metadata and
data from one Greenplum database to another Greenplum database, allowing you to migrate the
entire contents of a database, or just selected tables, to another database. The source and
destination databases may be in the same or a different cluster. Data is transferred in
parallel across all the segments, using the <codeph>gpfdist</codeph> data loading utility to
attain the highest transfer rates. </p>
<p><codeph>gptransfer</codeph> handles the setup and execution of the data transfer.
<p id="intro2"><codeph>gptransfer</codeph> handles the setup and execution of the data transfer.
Participating clusters must already exist, have network access between all hosts in both
clusters, and have certificate-authenticated ssh access between all hosts in both clusters. </p>
<p>The interface includes options to transfer one or more full databases, or one or more
<p id="intro3">The interface includes options to transfer one or more full databases, or one or more
database tables. A full database transfer includes the database schema, table data, indexes,
views, roles, user-defined functions, and resource queues. Configuration files, including
<codeph>postgres.conf</codeph> and <codeph>pg_hba.conf</codeph>, must be transferred
<codeph>postgresql.conf</codeph> and <codeph>pg_hba.conf</codeph>, must be transferred
manually by an administrator. Extensions installed in the database with
<codeph>gppkg</codeph>, such as MADlib, must be installed in the destination database by an
administrator. </p>
<p>See the <cite>Greenplum Database Utility Guide</cite> for complete syntax and usage
<p id="intro4">See the <cite>Greenplum Database Utility Guide</cite> for complete syntax and usage
information for the <codeph>gptransfer</codeph> utility. </p>
<section>
<section id="prereqs">
<title>Prerequisites</title>
<ul id="ul_wkb_xqd_cp">
<li>The <codeph>gptransfer</codeph> utility can only be used with Greenplum Database. Apache
......@@ -45,7 +54,7 @@
exchange public keys between the hosts of both clusters.</li>
</ul>
</section>
<section>
<section id="whatgptdoes">
<title>What gptransfer Does</title>
<p><codeph>gptransfer</codeph> uses writable and readable external tables, the Greenplum
<codeph>gpfdist</codeph> parallel data-loading utility, and named pipes to transfer data
......@@ -72,7 +81,7 @@
processes</li>
</ul></p>
</section>
<section>
<section id="fastslow">
<title>Fast Mode and Slow Mode</title>
<p><codeph>gptransfer</codeph> sets up data transfer using the <codeph>gpfdist</codeph>
parallel file serving utility, which serves the data evenly to the destination segments.
......@@ -105,7 +114,7 @@
readable external table into the destination table. The data is distributed evenly to all
the segments in the destination cluster. </p>
</section>
<section>
<section id="batch">
<title>Batch Size and Sub-batch Size</title>
<p>The degree of parallelism of a <codeph>gptransfer</codeph> execution is determined by two
command-line options: <codeph>--batch-size</codeph> and <codeph>--sub-batch-size</codeph>.
......@@ -120,7 +129,7 @@
values too high can cause a Python Out of Memory error. For this reason, the batch sizes
should be tuned for your environment. </p>
</section>
<section>
<section id="prephosts">
<title>Preparing Hosts for gptransfer</title>
<p>When you install a Greenplum Database cluster, you set up all the master and segment hosts
so that the Greenplum Database administrative user (<codeph>gpadmin</codeph>) can connect
......@@ -144,7 +153,7 @@ host2_name,host2_ipaddr
uses IP addresses instead of host names to avoid any problems with name resolution between
the clusters. </p>
</section>
<section>
<section id="limitations">
<title>Limitations</title>
<p><codeph>gptransfer</codeph> transfers data from user databases only; the
<codeph>postgres</codeph>, <codeph>template0</codeph>, and <codeph>template1</codeph>
......@@ -163,7 +172,7 @@ host2_name,host2_ipaddr
<codeph>gptransfer</codeph> at a time.<ph otherprops="op-pivotal"> Running multiple,
concurrent instances of <codeph>gptransfer</codeph> is not supported.</ph></p>
</section>
<section>
<section id="fulltblmode">
<title>Full Mode and Table Mode</title>
<p>When run with the <codeph>--full</codeph> option, <codeph>gptransfer</codeph> copies all
user-created databases, tables, views, indexes, roles, user-defined functions, and resource
......@@ -225,7 +234,7 @@ host2_name,host2_ipaddr
</row>
<row>
<entry>
<codeph>postgres.conf</codeph>
<codeph>postgresql.conf</codeph>
</entry>
<entry>No</entry>
<entry>No</entry>
......@@ -256,12 +265,12 @@ host2_name,host2_ipaddr
tables to prevent the transfer from failing because the table already exists at the
destination.</p>
</section>
<section>
<section id="locking">
<title>Locking</title>
<p>The <codeph>-x</codeph> option enables table locking. An exclusive lock is placed on the
source table until the copy and validation, if requested, are complete. </p>
</section>
<section>
<section id="validation">
<title>Validation</title>
<p>By default, <codeph>gptransfer</codeph> does not validate the data transferred. You can
request validation using the <codeph>--validate=<i>type</i></codeph> option. The validation
......@@ -275,7 +284,7 @@ host2_name,host2_ipaddr
option to lock the table. Otherwise, the table could be modified during the transfer,
causing validation to fail.</p>
</section>
<section>
<section id="failedtran">
<title>Failed Transfers</title>
<p>A failure on a table does not end the <codeph>gptransfer</codeph> job. When a transfer
fails, <codeph>gptransfer</codeph> displays an error message and adds the table name to a
......
......@@ -277,7 +277,7 @@
<li>Ensure any roles, functions, and resource queues are created in the destination
database. These objects are not transferred when you use the <codeph>gptransfer
-t</codeph> option. </li>
<li>Copy the <codeph>postgres.conf</codeph> and <codeph>pg_hba.conf</codeph> configuration
<li>Copy the <codeph>postgresql.conf</codeph> and <codeph>pg_hba.conf</codeph> configuration
files from the source to the destination cluster. </li>
<li>Install needed extensions in the destination database with <codeph>gppkg</codeph>.</li>
</ul>
......
......@@ -7,6 +7,8 @@
<body>
<p>The <codeph>gptransfer</codeph> utility copies objects from databases in a source
Greenplum Database system to databases in a destination Greenplum Database system. </p>
<!-- conref'ing to content from admin_guide. -->
<note conref="../../admin_guide/managing/gptransfer.xml#note1"/>
<section id="section2">
<title>Synopsis</title>
<codeblock><b>gptransfer</b>
......