Commit 443bef12 authored by C.J. Jameson, committed by Larry Hamel

Refresh gpexpand documentation

Just general freshening -- no major changes
Signed-off-by: Chumki Roy <croy@pivotal.io>
Parent ef699d2f
@@ -179,7 +179,7 @@ sdw5:sdw5-1:60012:/gpdata/mirror/gp10:14:10:m:63011</codeblock>
<codeph>p | m</codeph>
</entry>
<entry colname="col3">Determines whether this segment is a primary or mirror.
Specify <codeph>p</codeph>for primary and <codeph>m</codeph>for mirror.</entry>
Specify <codeph>p</codeph> for primary and <codeph>m</codeph> for mirror.</entry>
</row>
<row>
<entry colname="col1">replication_port</entry>
@@ -30,22 +30,22 @@
<li>Transparency of process. The expansion process employs standard Greenplum Database
mechanisms, so administrators can diagnose and troubleshoot any problems.</li>
<li>Configurable process. Expansion can be a long running process, but it can be fit into a
schedule of ongoing operations. Expansion control tables allow adminstrators to prioritize
the order in which tables are redistributed and the expansion activity can be paused and
resumed.</li>
schedule of ongoing operations. The expansion schema's tables allow administrators to
prioritize the order in which tables are redistributed, and the expansion activity can be
paused and resumed.</li>
</ul></p>
<p>The planning and physical aspects of an expansion project are a greater share of the work
than expanding the database itself. It will take a multi-discipline team to plan and execute
the project. Space must be acquired and prepared for the new servers. The servers must be
specified, acquired, installed, cabled, configured, and tested. Consulting Greenplum Database
platform engineers in the planning stages will help to ensure a successful expansion project.
<xref href="expand-planning.xml#topic5" type="topic" format="dita"/> describes general
the project. For on-premises installations, space must be acquired and prepared for the new
servers. The servers must be specified, acquired, installed, cabled, configured, and tested.
For cloud deployments, similar planning applies. <xref
href="expand-planning.xml#topic5" type="topic" format="dita"/> describes general
considerations for deploying new hardware. </p>
<p>After you provision the new hardware platforms and set up their networks, configure the
operating systems and run performance tests using Greenplum utilities.The Greenplum Database
operating systems and run performance tests using Greenplum utilities. The Greenplum Database
software distribution includes utilities that are helpful to test and burn-in the new servers
before beginning the software phase of the expansion. See <xref href="expand-nodes.xml#topic1"
/> for steps to ready the new hosts for Greenplum Database.</p>
/> for steps to prepare the new hosts for Greenplum Database.</p>
<p>Once the new servers are installed and tested, the software phase of the Greenplum Database
expansion process begins. The software phase is designed to be minimally disruptive,
transactionally consistent, reliable, and flexible. <ul id="ul_zwt_png_tt">
@@ -41,7 +41,7 @@
id="image_gr2_s1m_2r"/>
</entry>
<entry>Devise and execute a plan for ordering, building, and networking new hardware
platforms. </entry>
platforms, or provisioning cloud resources. </entry>
</row>
<row>
<entry>
@@ -49,7 +49,7 @@
id="image_ryl_s1m_2r"/>
</entry>
<entry>Devise a database expansion plan. Map the number of segments per host,
schedule the offline period for testing performance and creating the expansion
schedule the downtime period for testing performance and creating the expansion
schema, and schedule the intervals for table redistribution.</entry>
</row>
<row>
@@ -78,16 +78,16 @@
<image href="../../graphics/green-checkbox.jpg" width="29px" height="28px"
id="image_zrz_s1m_2r"/>
</entry>
<entry>Validate the operating system environment of the new hardware
(<codeph>gpcheck</codeph>).</entry>
<entry>Validate the operating system environment of the new hardware or cloud
resources (<codeph>gpcheck</codeph>).</entry>
</row>
<row>
<entry>
<image href="../../graphics/green-checkbox.jpg" width="29px" height="28px"
id="image_qkb_t1m_2r"/>
</entry>
<entry>Validate disk I/O and memory bandwidth of the new hardware
(<codeph>gpcheckperf</codeph>).</entry>
<entry>Validate disk I/O and memory bandwidth of the new hardware or cloud resources
(<codeph>gpcheckperf</codeph>).</entry>
</row>
<row>
<entry>
@@ -125,7 +125,7 @@
id="image_hgm_t1m_2r"/>
</entry>
<entry>Validate the operating system environment of the combined existing and new
hardware (<codeph>gpcheck</codeph>). </entry>
hardware or cloud resources (<codeph>gpcheck</codeph>). </entry>
</row>
<row>
<entry>
@@ -133,7 +133,7 @@
id="image_q3q_t1m_2r"/>
</entry>
<entry>Validate disk I/O and memory bandwidth of the combined existing and new
hardware (<codeph>gpcheckperf</codeph>). </entry>
hardware or cloud resources (<codeph>gpcheckperf</codeph>). </entry>
</row>
<row>
<entry>
@@ -217,7 +217,7 @@
<topic id="topic6" xml:lang="en">
<title>Planning New Segment Initialization</title>
<body>
<p>Expanding Greenplum Database requires a limited period of system down time. During this
<p>Expanding Greenplum Database requires a limited period of system downtime. During this
period, run <codeph>gpexpand</codeph> to initialize new segments into the array and create
an expansion schema.</p>
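Both parts of this step are typically driven from the command line. The following is a hedged sketch, not the documented procedure for any specific release: the host-file and input-file paths and the database name are placeholders, and the flag spellings should be checked against the gpexpand reference for your version.

```shell
# Interactive mode: prompts for expansion options and generates an input file.
# /home/gpadmin/new_hosts_file is a placeholder listing one new host per line.
gpexpand -f /home/gpadmin/new_hosts_file

# Non-interactive mode: initialize the new segments and create the expansion
# schema from a previously generated (or hand-edited) input file.
gpexpand -i /home/gpadmin/gpexpand_inputfile -D postgres
```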
<p>The time required depends on the number of schema objects in the Greenplum system and other
@@ -255,10 +255,10 @@
<p>For example, if existing hosts currently have two segments per host, you can use
<codeph>gpexpand</codeph> to initialize two additional segments on existing hosts for a
total of four segments and four new segments on new hosts. </p>
<p>The interactive process for creating an expansion input file prompts for this option; the
input file format allows you to specify new segment directories manually, also. For more
information, see <xref href="expand-initialize.xml#topic23" type="topic" format="dita"
/>.</p>
<p>The interactive process for creating an expansion input file prompts for this option; you
can also specify new segment directories manually in the input configuration file. For
more information, see <xref href="expand-initialize.xml#topic23" type="topic"
format="dita" />.</p>
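Based on the colon-delimited line visible in the first hunk (hostname:address:port:data_directory:dbid:content_ID:p|m:replication_port), a manually edited input file entry might look like the lines below. The mirror line is taken from that hunk; the primary line is illustrative only.

```
sdw5:sdw5-1:50011:/gpdata/primary/gp10:13:10:p:62011
sdw5:sdw5-1:60012:/gpdata/mirror/gp10:14:10:m:63011
```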
</body>
</topic>
<topic id="topic9" xml:lang="en">
@@ -295,17 +295,17 @@
<p>Table redistribution is performed while the system is online. For many Greenplum systems,
table redistribution completes in a single <codeph>gpexpand</codeph> session scheduled
during a low-use period. Larger systems may require multiple sessions and setting the order
of table redistribution to minimize performance impact. Complete the
table redistribution in one session if possible.</p>
of table redistribution to minimize performance impact. Complete the table redistribution in
one session if possible.</p>
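Between sessions, progress can be checked from the expansion schema. A minimal sketch, assuming psql access as a Greenplum administrator and that the expansion schema was created while connected to the postgres database:

```shell
# Summary of expansion milestones recorded by gpexpand.
psql -d postgres -c "SELECT status, updated FROM gpexpand.status ORDER BY updated;"

# Count of tables by redistribution status.
psql -d postgres -c "SELECT status, count(*) FROM gpexpand.status_detail GROUP BY status;"
```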
<note type="important">To perform table redistribution, your segment hosts must have enough
disk space to temporarily hold a copy of your largest table. All tables are unavailable for
read and write operations during redistribution. </note>
<p>The performance impact of table redistribution depends on the size, storage type, and
partitioning design of a table. Per table, redistributing a table with
<codeph>gpexpand</codeph> takes as much time as a <codeph>CREATE TABLE AS SELECT</codeph>
operation does. When redistributing a terabyte-scale fact table, the expansion utility can
use much of the available system resources, with resulting impact on query performance or
other database workloads.</p>
partitioning design of a table. For any given table, redistributing it with
<codeph>gpexpand</codeph> takes as much time as a <codeph>CREATE TABLE AS SELECT</codeph>
operation would. When redistributing a terabyte-scale fact table, the expansion utility can
use much of the available system resources, which could affect query performance or other
database workloads.</p>
</body>
<topic id="topic11" xml:lang="en">
<title id="no167000">Managing Redistribution in Large-Scale Greenplum Systems</title>
@@ -313,10 +313,11 @@
<p>You can manage the order in which tables are redistributed by adjusting their ranking.
See <xref href="expand-redistribute.xml#topic29" type="topic" format="dita"/>.
Manipulating the redistribution order can help adjust for limited disk space and restore
optimal query performance. </p>
optimal query performance for high-priority queries sooner. </p>
<p>When planning the redistribution phase, consider the impact of the exclusive lock taken
on each table during redistribution. User activity on a table can delay its
redistribution. Tables are unavailable during redistribution. </p>
redistribution, and tables are unavailable for user activity while they are being
redistributed. </p>
<section>
<title>Systems with Abundant Free Disk Space</title>
<p>In systems with abundant free disk space (required to store a copy of the largest
@@ -7,14 +7,14 @@
<shortdesc>Redistribute tables to balance existing data over the newly expanded cluster. </shortdesc>
<body>
<p>After creating an expansion schema, you can bring Greenplum Database back online and
redistribute tables across the entire array with <codeph>gpexpand</codeph>. Target low-use
hours when the utility's CPU usage and table locks have minimal impact on operations. Rank
tables to redistribute the largest or most critical tables in preferential order.</p>
redistribute tables across the entire array with <codeph>gpexpand</codeph>. Aim to run this
during low-use hours when the utility's CPU usage and table locks have minimal impact on
operations. Rank tables to redistribute the largest or most critical tables first.</p>
<note>When redistributing data, Greenplum Database must be running in production mode. Greenplum
Database cannot be in restricted mode or in master mode. The <codeph>gpstart</codeph> options
<codeph>-R</codeph> or <codeph>-m</codeph> cannot be specified to start Greenplum
Database.</note>
<p>While table redistribution is underway any new tables or partitions created are distributed
<p>While table redistribution is underway, any new tables or partitions created are distributed
across all segments exactly as they would be under normal operating conditions. Queries can
access all segments, even before the relevant data is redistributed to tables on the new
segments. The table or partition being redistributed is locked and unavailable for read or
@@ -34,14 +34,14 @@
<topic id="topic29" xml:lang="en">
<title id="no161370">Ranking Tables for Redistribution</title>
<body>
<p>For large systems, control the table redistribution order. Adjust
<p>For large systems, you can control the table redistribution order. Adjust
tables' <codeph>rank</codeph> values in the expansion schema to prioritize heavily-used
tables and minimize performance impact. Available free disk space can affect table ranking;
see <xref href="expand-planning.xml#topic11" type="topic" format="dita"/>.</p>
<p>To rank tables for redistribution by updating <codeph>rank</codeph> values in
<i>gpexpand.status_detail</i>, connect to Greenplum Database using <codeph>psql</codeph>
or another supported client. Update <i>gpexpand.status_detail</i> with commands such as:</p>
<codeblock>=&gt; UPDATE gpexpand.status_detail SET rank=10;
=&gt; UPDATE gpexpand.status_detail SET rank=1 WHERE fq_name = 'public.lineitem';
=&gt; UPDATE gpexpand.status_detail SET rank=2 WHERE fq_name = 'public.orders';</codeblock>
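After ranks are set, redistribution itself is launched from the shell rather than from SQL. A hedged sketch, with an illustrative duration value and database name: <codeph>-d</codeph> bounds a session by elapsed time, and <codeph>-c</codeph> removes the expansion schema once every table shows a completed status; confirm both flags against the gpexpand reference for your release.

```shell
# Redistribute tables for at most 60 hours in this session; rerun to resume.
gpexpand -d 60:00:00 -D postgres

# When every table has been redistributed, remove the expansion schema.
gpexpand -c -D postgres
```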
@@ -112,7 +112,7 @@
Database using <codeph>psql</codeph> or another supported client and query
<i>gpexpand.status_detail</i>:</p>
<p>
<codeblock>=&gt; SELECT status, expansion_started, source_bytes FROM
gpexpand.status_detail WHERE fq_name = 'public.sales';
status | expansion_started | source_bytes
-----------+----------------------------+--------------