Commits · a53c05d8063077540af58711b39aaf3fd5c47395 · Greenplum / Gpdb

01 Apr, 2017 1 commit

Rule based partition selection for list (sub)partitions (#2076) · 5cecfcd1

Committed by foyzur on Mar 31, 2017

GPDB supports range and list partitions. Range partitions are represented as a set of rules. Each rule defines the boundaries of a part. E.g., a rule might say that a part contains all values in (0, 5], where the left bound 0 is exclusive and the right bound 5 is inclusive. List partitions are defined by a list of values that the part will contain.
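
As a rough illustration of the two rule kinds, the following Python sketch models a range rule with an exclusive lower bound and an inclusive upper bound, and a list rule with an explicit value set. The class names `RangeRule` and `ListRule` are hypothetical and do not correspond to GPDB's catalog structures.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class RangeRule:
    """Hypothetical model of a range rule; not GPDB's actual structure."""
    lower: int
    upper: int
    lower_inclusive: bool = False
    upper_inclusive: bool = True

    def contains(self, value: int) -> bool:
        # Pick the strict or non-strict comparison for each bound.
        above = value >= self.lower if self.lower_inclusive else value > self.lower
        below = value <= self.upper if self.upper_inclusive else value < self.upper
        return above and below


@dataclass(frozen=True)
class ListRule:
    """Hypothetical model of a list rule: the part holds exactly these values."""
    values: FrozenSet[int]

    def contains(self, value: int) -> bool:
        return value in self.values


# The (0, 5] example from the text: 0 is excluded, 5 is included.
rule = RangeRule(lower=0, upper=5)
# A list part holding three explicit values.
part = ListRule(frozenset({1, 2, 3}))
```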

ORCA uses the above rule definition to generate expressions that determine which partitions need to be scanned. These expressions are of the following types:

1. Equality predicate, as in PartitionSelectorState->levelEqExpressions: used when we have a simple equality on the partitioning key (e.g., part_key = 1).

2. General predicate, as in PartitionSelectorState->levelExpressions: used when we need a more complex composition, including non-equality such as part_key > 1.

Note: We also have a residual predicate, which the optimizer currently doesn't use. We are planning to remove this dead code soon.

Prior to this PR, ORCA treated both range and list partitions as range partitions. This meant that each list part was converted to a set of list values, and each of these values became a single-point range partition.

E.g., consider the DDL:

```sql
CREATE TABLE DATE_PARTS (id int, year int, month int, day int, region text)
DISTRIBUTED BY (id)
PARTITION BY RANGE (year)
    SUBPARTITION BY LIST (month)
        SUBPARTITION TEMPLATE (
        SUBPARTITION Q1 VALUES (1, 2, 3),
        SUBPARTITION Q2 VALUES (4, 5, 6),
        SUBPARTITION Q3 VALUES (7, 8, 9),
        SUBPARTITION Q4 VALUES (10, 11, 12),
        DEFAULT SUBPARTITION other_months )
( START (2002) END (2012) EVERY (1),
  DEFAULT PARTITION outlying_years );
```

Here we partition the months as a list partition using quarters, so each list part contains three months. Now consider a query on this table:

```sql
select * from DATE_PARTS where month between 1 and 3;
```

Prior to this PR, the ORCA-generated plan would consider each value of Q1 as a separate range part with just a single-point range. I.e., we would have 3 virtual parts to evaluate for just one Q1: [1], [2], [3]. This approach is inefficient. The problem is further exacerbated when we have multi-level partitioning. Consider the list part of the above example. We have only 4 rules for 4 different quarters, but we would have 12 different virtual rules (aka constraints). For each such constraint, we would then evaluate the entire subtree of partitions.

After this PR, we no longer decompose rules into constraints for list parts and then derive single-point virtual range partitions based on those constraints. Rather, the new ORCA changes use ScalarArrayOp to express selectivity on a list of values. So, the expression for the above SQL will look like 1 <= ANY {month_part} AND 3 >= ANY {month_part}, where month_part is substituted at runtime with a different list of values for each of the quarterly partitions. We end up evaluating that expression 4 times with the following lists of values:

Q1: 1 <= ANY {1,2,3} AND 3 >= ANY {1,2,3}
Q2: 1 <= ANY {4,5,6} AND 3 >= ANY {4,5,6}
...

Compare this to the previous approach, where we would end up evaluating 12 different expressions, each for a single point value:

First constraint of Q1: 1 <= 1 AND 3 >= 1
Second constraint of Q1: 1 <= 2 AND 3 >= 2
Third constraint of Q1: 1 <= 3 AND 3 >= 3
First constraint of Q2: 1 <= 4 AND 3 >= 4
...

The ScalarArrayOp depends on a new type of expression, PartListRuleExpr, that can convert a list rule to an array of values. ORCA-specific changes can be found here: https://github.com/greenplum-db/gporca/pull/149
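
The difference in evaluation counts can be sketched in Python. This is a simulation of the idea only, not ORCA or executor code: the function names are hypothetical, the default subpartition is ignored, and Python's `any()` stands in for the `ANY` array comparison.

```python
from typing import Dict, List, Tuple

# List values for each quarterly subpartition, taken from the DDL above.
QUARTERS: Dict[str, List[int]] = {
    "Q1": [1, 2, 3],
    "Q2": [4, 5, 6],
    "Q3": [7, 8, 9],
    "Q4": [10, 11, 12],
}


def select_with_array_op(low: int, high: int) -> Tuple[List[str], int]:
    """One evaluation per list rule: low <= ANY(values) AND high >= ANY(values)."""
    selected: List[str] = []
    evaluations = 0
    for name, values in QUARTERS.items():
        evaluations += 1
        if any(low <= v for v in values) and any(high >= v for v in values):
            selected.append(name)
    return selected, evaluations


def select_with_point_ranges(low: int, high: int) -> Tuple[List[str], int]:
    """Previous approach: one evaluation per list value (single-point range)."""
    selected: List[str] = []
    evaluations = 0
    for name, values in QUARTERS.items():
        for v in values:
            evaluations += 1
            if low <= v and high >= v and name not in selected:
                selected.append(name)
    return selected, evaluations


# "month between 1 and 3" from the example query.
print(select_with_array_op(1, 3))      # (['Q1'], 4)
print(select_with_point_ranges(1, 3))  # (['Q1'], 12)
```

Both functions pick the same part for this query; only the number of expression evaluations differs (4 versus 12).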


24 Feb, 2017 3 commits

Partitioning code cleanup · 32745494

Committed by Daniel Gustafsson on Feb 24, 2017

This applies minor cosmetic cleanup to the partition code stemming
from a read-through. There are no functional changes from this:

  * Remove stale comments and reflow existing ones as well as fix
    some typos
  * Remove spurious whitespace
  * Re-indent and rewrite where the logic isn't clear to improve
    readability (removing !!(foo) construction).


Remove useless StringInfo and truncate calls · 3d10d102

Committed by Daniel Gustafsson on Feb 24, 2017

There is no gain in calling truncateStringInfo() on a StringInfo
which was just inited; it will always be a no-op. Remove truncates
from the partitioning codepath and save a cycle or two. Also
remove two StringInfos where one was unused and the other could
be replaced with a pstrdup() call.


Fix typo in code/doc comments · 1069bc97

Committed by Daniel Gustafsson on Feb 24, 2017

occurances was a surprisingly common typo, fix all findings except
one in pg_regress.c which will be fixed in a much larger doc patch
as we merge upstream.


04 Feb, 2017 1 commit

[#138767899] Prune system cols for appendonly partition tables · 8e001fac

Committed by Omer Arap on Feb 01, 2017

Previously the gporca translator was only pruning the non-visible system columns from
the table descriptor for non-partitioned `appendonly` tables, or if the
partitioned table is marked as `appendonly` at the root level.

If one of the leaf partitions is marked as `appendonly` but the root
is not, the system columns still appear in the table descriptor.

This commit fixes the issue by checking if the root table has
`appendonly` partitions and pruning system columns if it does.


18 Jan, 2017 1 commit

Fix getting the list of leaf relids in part table from a rule · 22647157

Committed by Jesse Zhang and Omer Arap on Jan 10, 2017

An uninitialized variable causes release and debug builds to behave
differently for analyze. This commit fixes the issue.

01 Dec, 2016 1 commit

Clean up palloc/pfree usage · 35aa3841

Committed by Daniel Gustafsson on Dec 01, 2016

palloc() will never return on allocation failure so checking for
NULL is at best pointless. Remove NULL checks on allocations and
before pfree() where we know beforehand that it must be non-NULL.
Also remove some unneeded inclusions of palloc.h.


21 Nov, 2016 1 commit

Avoid hashing partitions without rules set · 92758a16

Committed by Daniel Gustafsson on Nov 16, 2016

If there are no rules on the PartitionInfo the hashing will perform
a (foo % 0), which is an undefined operation.

Per report from Coverity.
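
The guard amounts to checking the rule count before taking the modulus. In C, `x % 0` is undefined behavior; the Python sketch below (where the same expression would raise `ZeroDivisionError` instead) only illustrates the shape of the check. The function name `bucket_for` and its arguments are hypothetical, not the names used in the GPDB source.

```python
from typing import Optional


def bucket_for(hash_value: int, num_rules: int) -> Optional[int]:
    """Return a bucket index, or None when there are no rules to hash into."""
    if num_rules <= 0:
        # Skip hashing entirely rather than computing hash_value % 0.
        return None
    return hash_value % num_rules
```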


08 Nov, 2016 1 commit

Remove unnecessary ReleaseOperator() abstraction. · 8d77fb99

Committed by Heikki Linnakangas on Nov 08, 2016

The upstream doesn't have it, and it's an odd one out, when none of the
other syscaches have such macros. This reduces our diff vs. upstream, which
makes diffing and merging easier.


07 Nov, 2016 1 commit

Revamp the way OIDs are dispatched to segments on CREATE statements. · f9016da2

Committed by Heikki Linnakangas on Nov 07, 2016

Instead of carrying a "new OID" field in all the structs that represent
CREATE statements, introduce a generic mechanism for capturing the OIDs
of all created objects, dispatching them to the QEs, and using those same
OIDs when the corresponding objects are created in the QEs. This allows
removing a lot of scattered changes in DDL command handling that were
previously needed to ensure that objects are assigned the same OIDs in
all the nodes.

This also provides the groundwork for pg_upgrade to dictate the OIDs to use
for upgraded objects. The upstream has mechanisms for pg_upgrade to dictate
the OIDs for a few objects (relations and types, at least), but in GPDB,
we need to preserve the OIDs of almost all object types.


25 Aug, 2016 6 commits

Replace CaQL calls with systable scans. · 778205a7

Committed by Heikki Linnakangas on Aug 25, 2016

Replace some straightforward CaQL calls with Syscache lookups. · 37380b5f

Committed by Heikki Linnakangas on Aug 25, 2016

Eliminate caql_getcount(). · 9bf5a8e3

Committed by Heikki Linnakangas on Aug 25, 2016

It was used for DELETEs, and SELECT COUNT(*). Replace with the usual
syscache lookups, systable scans, and simple_heap_update/delete calls,
like these things are done in the upstream.

Get rid of caql_getattname and similar functions. · e0f59a56

Committed by Heikki Linnakangas on Aug 25, 2016

There were only a few callers using it. Use the upstream
SearchSysCacheAttName() function instead.

Remove CaQL update support. · 8df78e67

Committed by Heikki Linnakangas on Aug 25, 2016

This removes all the code for inserting, updating, or deleting through
CaQL context. All the code surrounding the updates is also converted to
use regular SysCache lookups or systable scans.

Note that this leaves the support for deleting rows with
caql_getcount("DELETE ...") alone. For now. This commit is all about
removing the functions that modify the current row of a caql scan.

Remove backward-scan support from CaQL. · 558adb00

Committed by Heikki Linnakangas on Aug 25, 2016

There was only one caller of that facility. Replace it with direct
index_beginscan/getnext calls.

18 Aug, 2016 1 commit

Remove dead code. · 6a1c4299

Committed by Heikki Linnakangas on Aug 18, 2016

I found these with "callcatcher", written by Caolán McNamara. Many thanks
for the tool! See https://www.skynet.ie/~caolan/Packages/callcatcher.html

16 Aug, 2016 1 commit

Remove dead function. · a899bc6a

Committed by Heikki Linnakangas on Aug 16, 2016

I'm not sure why it ever existed, but it's been dead for a long time.

09 Jul, 2016 1 commit

Remove dead code in partition coalesce functionality · c309c921

Committed by Daniel Gustafsson on Jul 07, 2016

The ALTER TABLE .. COALESCE PARTITION feature is partially
implemented but not supported. Removing all the scaffolding around the
parsing might be worthwhile as well, but at the very least it seems
reasonable to remove the completely dead code in ATPExecPartCoalesce().
As this was the only external caller of parruleord_open_gap(), make the
function static.

13 May, 2016 1 commit

Fix typo in get_parts() comment documentation · da66ec51

Committed by Daniel Gustafsson on May 11, 2016

25 Apr, 2016 1 commit

Simplify the parsed-representation of ALTER TABLE ADD PARTITION. · 08db9061

Committed by Heikki Linnakangas on Apr 25, 2016

atpxPartAddList() needs a CreateStmt that represents the parent table,
but instead of creating it already in the parser, and adding more details
to it in analyze.c, it's simpler to create it later, in atpxPartAddList(),
where it's actually needed.

22 Mar, 2016 1 commit

Convert a few Value fields that are always of type String into "char *". · fce4fcc9

Committed by Heikki Linnakangas on Mar 21, 2016

This saves a little bit of memory when parsing massively partitioned
CREATE TABLE statements.

12 Feb, 2016 1 commit

Remove unnecessary MemoryContext arguments from cdbpartition functions. · d1765734

Committed by Heikki Linnakangas on Feb 11, 2016

We always passed CurrentMemoryContext for them, so might as well remove
the parameter, making the code more readable, and always allocate the
return values in CurrentMemoryContext.


18 Jan, 2016 1 commit

Move functions related to parsing of PARTITION BY clause to separate file. · 14fc7182

Committed by Heikki Linnakangas on Jan 18, 2016

parser/analyze.c is bloated; moving larger chunks of GPDB-specific
functionality like this elsewhere makes it more readable. It makes diffing
and merging of analyze.c with upstream easier, too.

I also ran pgindent on parse_partition.c.

07 Jan, 2016 2 commits

Move a couple of structs closer to where they're used · d1b2ff5d

Committed by Heikki Linnakangas on Jan 07, 2016

These were both only used by a single, static, function.

Remove unused functions. · 3e8c5afa

Committed by Heikki Linnakangas on Jan 07, 2016

28 Oct, 2015 1 commit

Import Greenplum source code. · 6b0e52be

Committed by Initial Greenplum code dump on Oct 23, 2015