<p>Partitioning does not change the physical distribution of table data across the segments.
Table distribution is physical: Greenplum Database physically divides
partitioned tables and non-partitioned tables across segments to enable parallel query
processing. Table <i>partitioning</i> is logical: Greenplum Database logically
divides big tables to improve query performance and facilitate data warehouse maintenance
tasks, such as rolling old data out of the data warehouse. </p>
Table distribution is physical: Greenplum Database physically divides partitioned tables and
non-partitioned tables across segments to enable parallel query processing. Table
<i>partitioning</i> is logical: Greenplum Database logically divides big tables to improve
query performance and facilitate data warehouse maintenance tasks, such as rolling old data
out of the data warehouse. </p>
<p>Greenplum Database supports:</p>
<ul id="ul_ohc_2wy_sp">
<li id="im204035"><i>range partitioning</i>: division of data based on a numerical range,
...
...
@@ -44,22 +44,21 @@
<topic id="topic64" xml:lang="en">
<title>Table Partitioning in Greenplum Database</title>
<body>
<p>Greenplum Database divides tables into parts (also known as partitions) to
enable massively parallel processing. Tables are partitioned during <codeph>CREATE
TABLE</codeph> using the <codeph>PARTITION BY</codeph> (and optionally the
<codeph>SUBPARTITION BY</codeph>) clause. Partitioning creates a top-level (or parent)
table with one or more levels of sub-tables (or child tables). Internally, Greenplum Database creates an inheritance relationship between the top-level table
and its underlying partitions, similar to the functionality of the <codeph>INHERITS</codeph>
clause of PostgreSQL.</p>
<p>Greenplum uses the partition criteria defined during table
creation to create each partition with a distinct <codeph>CHECK</codeph> constraint, which
limits the data that table can contain. The query optimizer uses <codeph>CHECK</codeph>
constraints to determine which table partitions to scan to satisfy a given query
predicate.</p>
<p>The Greenplum system catalog stores partition hierarchy
information so that rows inserted into the top-level parent table propagate correctly to the
child table partitions. To change the partition design or table structure, alter the parent
table using <codeph>ALTER TABLE</codeph> with the <codeph>PARTITION</codeph> clause.</p>
<p>Greenplum Database divides tables into parts (also known as partitions) to enable massively
parallel processing. Tables are partitioned during <codeph>CREATE TABLE</codeph> using the
<codeph>PARTITION BY</codeph> (and optionally the <codeph>SUBPARTITION BY</codeph>)
clause. Partitioning creates a top-level (or parent) table with one or more levels of
sub-tables (or child tables). Internally, Greenplum Database creates an inheritance
relationship between the top-level table and its underlying partitions, similar to the
functionality of the <codeph>INHERITS</codeph> clause of PostgreSQL.</p>
<p>Greenplum uses the partition criteria defined during table creation to create each
partition with a distinct <codeph>CHECK</codeph> constraint, which limits the data that
table can contain. The query optimizer uses <codeph>CHECK</codeph> constraints to determine
which table partitions to scan to satisfy a given query predicate.</p>
<p>The Greenplum system catalog stores partition hierarchy information so that rows inserted
into the top-level parent table propagate correctly to the child table partitions. To change
the partition design or table structure, alter the parent table using <codeph>ALTER
TABLE</codeph> with the <codeph>PARTITION</codeph> clause.</p>
<p>To insert data into a partitioned table, you specify the root partitioned table, the table
created with the <codeph>CREATE TABLE</codeph> command. You also can specify a leaf child
table of the partitioned table in an <codeph>INSERT</codeph> command. An error is returned
...
...
@@ -166,8 +165,8 @@
partitions, rather than partition by year then subpartition by month then subpartition by
day. A multi-level design can reduce query planning time, but a flat partition design runs
faster.</p>
<p>You can have Greenplum Database automatically generate partitions by giving
a<codeph>START</codeph> value, an <codeph>END</codeph> value, and an
<p>You can have Greenplum Database automatically generate partitions by giving a
<codeph>START</codeph> value, an <codeph>END</codeph> value, and an
<codeph>EVERY</codeph> clause that defines the partition increment value. By default,
<codeph>START</codeph> values are always inclusive and <codeph>END</codeph> values are
always exclusive. For example:</p>
...
...
@@ -241,10 +240,10 @@ PARTITION BY LIST (gender)
DEFAULT PARTITION other );
</codeblock>
</p>
<note>The current Greenplum Database legacy optimizer allows list partitions
with multi-column (composite) partition keys. A range partition only allows a single
column as the partition key. The Greenplum Query Optimizer does not support composite keys,
so you should not use composite partition keys.</note>
<note>The current Greenplum Database legacy optimizer allows list partitions with
multi-column (composite) partition keys. A range partition only allows a single column as
the partition key. The Greenplum Query Optimizer does not support composite keys, so you
should not use composite partition keys.</note>
<p>For more information about default partitions, see <xref href="#topic80" type="topic"
format="dita"/>.</p>
</body>
...
...
@@ -336,7 +335,9 @@ GRANT SELECT ON sales TO guest;
partitioning columns. A unique index can omit the partitioning columns; however, it is
enforced only on the parts of the partitioned table, not on the partitioned table as a
whole.</p>
<p>GPORCA, the Greenplum next generation query optimizer, supports uniform multi-level partitioned tables. If GPORCA is enabled and the multi-level partitioned table is not uniform, Greenplum Database executes queries against the table with the legacy query
<p>GPORCA, the Greenplum next generation query optimizer, supports uniform multi-level
partitioned tables. If GPORCA is enabled and the multi-level partitioned table is not
uniform, Greenplum Database executes queries against the table with the legacy query
optimizer. For information about uniform multi-level partitioned tables, see <xref
Query Optimizer enabled, you must collect statistics on the partitioned table root partition
with the <cmdname>ANALYZE ROOTPARTITION</cmdname> command. The command <codeph>ANALYZE
ROOTPARTITION</codeph> collects statistics on the root partition of a partitioned table
without collecting statistics on the leaf partitions. If you specify a list of column names
for a partitioned table, the statistics for the columns and the root partition are collected.
For information on the <cmdname>ANALYZE</cmdname> command, see the <cite>Greenplum Database Reference Guide</cite>.<p>You can also use the Greenplum Database utility <codeph>analyzedb</codeph> to update table statistics. The
based on the data in the table. Statistics are not collected on the leaf partitions; leaf
partition data is only sampled. If you specify a list of column names for a partitioned table,
the statistics for the columns and the root partition are collected. For information on the
<cmdname>ANALYZE</cmdname> command, see the <cite>Greenplum Database Reference Guide</cite>.<p>You can also use the Greenplum Database utility <codeph>analyzedb</codeph> to update table statistics. The
Greenplum Database utility <codeph>analyzedb</codeph> can update statistics
for multiple tables in parallel. The utility can also check table statistics and update
statistics only if the statistics are not current or do not exist. For information about the
<codeph>analyzedb</codeph> utility, see the <cite>Greenplum Database Utility
Guide</cite>.</p><p>As part of routine database maintenance, refresh
statistics on the root partition when there are significant changes to child leaf partition
data.</p></note>
Guide</cite>.</p><p>As part of routine database maintenance, refresh statistics on
the root partition when there are significant changes to leaf partition data.</p></note>
</body>
<topic id="topic_r5d_hv1_kr">
<title>Setting the optimizer_analyze_root_partition Parameter</title>
<li><xref href="guc_category-list.xml#topic3">Configuration Parameters</xref> lists the
parameter descriptions in alphabetic order.</li>
<li><xref href="guc-list.xml#topic3">Configuration Parameters</xref> lists the parameter
descriptions in alphabetic order.</li>
</ul>
</body>
<topic id="topic_vsn_22l_z4">
...
...
@@ -44,12 +43,13 @@
<title>Setting Parameters</title>
<body>
<p>Many of the configuration parameters have limitations on who can change them and where or
when they can be set. For example, to change certain parameters, you must be a Greenplum Database superuser. Other parameters require a restart of the system for
the changes to take effect. A parameter that is classified as <i>session</i> can be set at
the system level (in the <codeph>postgresql.conf</codeph> file), at the database level
(using <codeph>ALTER DATABASE</codeph>), at the role level (using <codeph>ALTER
ROLE</codeph>), or at the session level (using <codeph>SET</codeph>). System parameters
can only be set in the <codeph>postgresql.conf</codeph> file.</p>
when they can be set. For example, to change certain parameters, you must be a Greenplum
Database superuser. Other parameters require a restart of the system for the changes to take
effect. A parameter that is classified as <i>session</i> can be set at the system level (in
the <codeph>postgresql.conf</codeph> file), at the database level (using <codeph>ALTER
DATABASE</codeph>), at the role level (using <codeph>ALTER ROLE</codeph>), or at the
session level (using <codeph>SET</codeph>). System parameters can only be set in the
<codeph>postgresql.conf</codeph> file.</p>
<p>In Greenplum Database, the master and each segment instance have their own
<codeph>postgresql.conf</codeph> file (located in their respective data directories). Some
parameters are considered <i>local</i> parameters, meaning that each segment instance looks
...
...
@@ -74,12 +74,12 @@
<row>
<entry colname="col1">master or local</entry>
<entry colname="col2">A <i>master</i> parameter only needs to be set in the
<codeph>postgresql.conf</codeph> file of the Greenplum master instance. The value for this parameter is then either passed to (or
ignored by) the segments at run time.<p>A local parameter must be set in the
<codeph>postgresql.conf</codeph> file of the master AND each segment instance.
Each segment instance looks to its own configuration to get the value for the
parameter. Local parameters always require a system restart for changes to take
effect.</p></entry>
<codeph>postgresql.conf</codeph> file of the Greenplum master instance. The value
for this parameter is then either passed to (or ignored by) the segments at run
time.<p>A local parameter must be set in the <codeph>postgresql.conf</codeph> file
of the master AND each segment instance. Each segment instance looks to its own
configuration to get the value for the parameter. Local parameters always require
a system restart for changes to take effect.</p></entry>
</row>
<row>
<entry colname="col1">session or system</entry>
...
...
@@ -96,10 +96,10 @@
<row>
<entry colname="col1">restart or reload</entry>
<entry colname="col2">When changing parameter values in the postgresql.conf file(s),
some require a <i>restart</i> of Greenplum Database for the change to
take effect. Other parameter values can be refreshed by just reloading the server
configuration file (using <codeph>gpstop -u</codeph>), and do not require stopping
the system.</entry>
some require a <i>restart</i> of Greenplum Database for the change to take effect.
Other parameter values can be refreshed by just reloading the server configuration
file (using <codeph>gpstop -u</codeph>), and do not require stopping the