1. 01 Apr, 2017 1 commit
    • F
      Rule based partition selection for list (sub)partitions (#2076) · 5cecfcd1
      foyzur committed
      GPDB supports range and list partitions. Range partitions are represented as a set of rules. Each rule defines the boundaries of a part. E.g., a rule might say that a part contains all values in (0, 5], where the left bound 0 is exclusive and the right bound 5 is inclusive. List partitions are defined by the list of values that the part will contain.
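
To make the two rule kinds concrete, here is a minimal Python sketch. The class names `RangeRule` and `ListRule` and their fields are hypothetical and chosen for illustration only; they are not GPDB or ORCA data structures.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class RangeRule:
    """Hypothetical range rule; defaults model (lo, hi] as in the text above."""
    lo: int
    hi: int
    lo_inclusive: bool = False
    hi_inclusive: bool = True

    def contains(self, value: int) -> bool:
        above = value >= self.lo if self.lo_inclusive else value > self.lo
        below = value <= self.hi if self.hi_inclusive else value < self.hi
        return above and below


@dataclass(frozen=True)
class ListRule:
    """Hypothetical list rule: the part holds exactly the listed values."""
    values: FrozenSet[int]

    def contains(self, value: int) -> bool:
        return value in self.values


rule = RangeRule(lo=0, hi=5)
print(rule.contains(0), rule.contains(5))          # False True
print(ListRule(frozenset({1, 2, 3})).contains(2))  # True
```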
      
      ORCA uses the above rule definition to generate expressions that determine which partitions need to be scanned. These expressions are of the following types:
      
      1. Equality predicate as in PartitionSelectorState->levelEqExpressions: If we have a simple equality on partitioning key (e.g., part_key = 1).
      
      2. General predicate as in PartitionSelectorState->levelExpressions: If we need more complex composition, including non-equality such as part_key > 1.
      
      Note: We also have a residual predicate, which the optimizer currently doesn't use. We are planning to remove this dead code soon.
      
      Prior to this PR, ORCA treated both range and list partitions as range partitions. Each list part was converted to a set of list values, and each of these values became a single-point range partition.
      
      E.g., consider the DDL:
      
      ```sql
      CREATE TABLE DATE_PARTS (id int, year int, month int, day int, region text)
      DISTRIBUTED BY (id)
      PARTITION BY RANGE (year)
          SUBPARTITION BY LIST (month)
             SUBPARTITION TEMPLATE (
              SUBPARTITION Q1 VALUES (1, 2, 3),
              SUBPARTITION Q2 VALUES (4, 5, 6),
              SUBPARTITION Q3 VALUES (7, 8, 9),
              SUBPARTITION Q4 VALUES (10, 11, 12),
              DEFAULT SUBPARTITION other_months )
      ( START (2002) END (2012) EVERY (1), 
        DEFAULT PARTITION outlying_years );
      ```
      
      Here we partition the months as list partitions by quarter, so each list part contains three months. Now consider a query on this table:
      
      ```sql
      select * from DATE_PARTS where month between 1 and 3;
      ```
      
      Prior to this PR, the ORCA-generated plan considered each value of Q1 as a separate range part with a single-point range. I.e., we had 3 virtual parts to evaluate for Q1 alone: [1], [2], [3]. This approach is inefficient, and the problem is exacerbated with multi-level partitioning. Consider the list part of the above example. We have only 4 rules for the 4 quarters, but we end up with 12 different virtual rules (aka constraints). For each such constraint, we then evaluate the entire subtree of partitions.
      
      After this PR, we no longer decompose rules into constraints for list parts and then derive single-point virtual range partitions from those constraints. Rather, the new ORCA changes use ScalarArrayOp to express selectivity on a list of values. So, the expression for the above SQL will look like 1 <= ANY {month_part} AND 3 >= ANY {month_part}, where month_part is substituted at runtime with the list of values of each quarterly partition. We end up evaluating that expression 4 times, with the following lists of values:
      
      Q1: 1 <= ANY {1,2,3} AND 3 >= ANY {1,2,3}
      Q2: 1 <= ANY {4,5,6} AND 3 >= ANY {4,5,6}
      ...
      
      Compare this to the previous approach, where we ended up evaluating 12 different expressions, each for a single-point value:
      
      First constraint of Q1: 1 <= 1 AND 3 >= 1
      Second constraint of Q1: 1 <= 2 AND 3 >= 2
      Third constraint of Q1: 1 <= 3 AND 3 >= 3
      First constraint of Q2: 1 <= 4 AND 3 >= 4
      ...
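
The difference in evaluation counts can be simulated with a small Python sketch. It is only a model of the two strategies described above, not ORCA or executor code: `LIST_RULES`, `any_op`, `select_by_rule`, and `select_by_point_constraints` are hypothetical names, and the default subpartition is left out for brevity.

```python
import operator

# Hypothetical in-memory copy of the list subpartition rules from the DDL above.
LIST_RULES = {
    "Q1": [1, 2, 3],
    "Q2": [4, 5, 6],
    "Q3": [7, 8, 9],
    "Q4": [10, 11, 12],
}


def any_op(const, op, values):
    """Mimic `const op ANY {values}`: true if the comparison holds for at least one value."""
    return any(op(const, v) for v in values)


def select_by_rule(lo, hi):
    """New approach: one array-based evaluation per list rule."""
    selected, evaluations = [], 0
    for part, values in LIST_RULES.items():
        evaluations += 1
        if any_op(lo, operator.le, values) and any_op(hi, operator.ge, values):
            selected.append(part)
    return selected, evaluations


def select_by_point_constraints(lo, hi):
    """Old approach: one evaluation per value, each treated as a single-point range."""
    selected, evaluations = [], 0
    for part, values in LIST_RULES.items():
        for v in values:
            evaluations += 1
            if lo <= v and hi >= v and part not in selected:
                selected.append(part)
    return selected, evaluations


print(select_by_rule(1, 3))               # (['Q1'], 4)
print(select_by_point_constraints(1, 3))  # (['Q1'], 12)
```

Both strategies pick Q1 for `month between 1 and 3`, but the rule-based one evaluates 4 expressions instead of 12.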
      
      The ScalarArrayOp depends on a new type of expression, PartListRuleExpr, that can convert a list rule to an array of values. ORCA-specific changes can be found here: https://github.com/greenplum-db/gporca/pull/149
  2. 24 Feb, 2017 3 commits
    • D
      Partitioning code cleanup · 32745494
      Daniel Gustafsson committed
      This applies minor cosmetic cleanup to the partition code stemming
      from a read-through. There are no functional changes from this:
      
        * Remove stale comments and reflow existing ones as well as fix
          some typos
        * Remove spurious whitespace
        * Re-indent and rewrite where the logic isn't clear to improve
          readability (removing !!(foo) construction).
    • D
      Remove useless StringInfo and truncate calls · 3d10d102
      Daniel Gustafsson committed
      There is no gain in calling truncateStringInfo() on a StringInfo
      which was just initialized; it will always be a no-op. Remove truncates
      from the partitioning codepath and save a cycle or two. Also
      remove two StringInfos, where one was unused and the other could
      be replaced with a pstrdup() call.
    • D
      Fix typo in code/doc comments · 1069bc97
      Daniel Gustafsson committed
      "occurances" was a surprisingly common typo; fix all findings except
      one in pg_regress.c which will be fixed in a much larger doc patch
      as we merge upstream.
  3. 04 Feb, 2017 1 commit
    • O
      [#138767899] Prune system cols for appendonly partition tables · 8e001fac
      Omer Arap committed
      Previously the gporca translator only pruned the non-visible system columns from
      the table descriptor for non-partitioned `appendonly` tables, or if the
      partitioned table was marked as `appendonly` at the root level.
      
      If one of the leaf partitions is marked as `appendonly` but the root
      is not, the system columns still appear in the table descriptor.
      
      This commit fixes the issue by checking whether the root table has
      `appendonly` partitions and pruning the system columns if it does.
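
The check described above can be sketched roughly as follows. This is a hypothetical Python model, not the translator's code: `Rel`, `has_appendonly_parts`, and `visible_columns` are made-up names, and the column names in the usage lines are placeholders rather than the actual set of columns the translator prunes.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Rel:
    """Hypothetical relation node: a root table or one of its partitions."""
    name: str
    appendonly: bool = False
    children: List["Rel"] = field(default_factory=list)


def has_appendonly_parts(rel: Rel) -> bool:
    """True if the relation itself or any partition below it is appendonly."""
    return rel.appendonly or any(has_appendonly_parts(c) for c in rel.children)


def visible_columns(root: Rel, user_cols: List[str],
                    system_cols: List[str], ao_hidden: Set[str]) -> List[str]:
    """Drop the system columns in `ao_hidden` if any part of the table is appendonly."""
    if has_appendonly_parts(root):
        system_cols = [c for c in system_cols if c not in ao_hidden]
    return user_cols + system_cols


# Root is a heap table, but one leaf is appendonly (the case fixed by this commit).
root = Rel("t", children=[Rel("t_1"), Rel("t_2", appendonly=True)])
print(visible_columns(root, ["id"], ["ctid", "xmin", "xmax"], {"xmin", "xmax"}))
# ['id', 'ctid']
```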
  4. 18 Jan, 2017 1 commit
  5. 01 Dec, 2016 1 commit
    • D
      Clean up palloc/pfree usage · 35aa3841
      Daniel Gustafsson committed
      palloc() will never return on allocation failure, so checking for
      NULL is at best pointless. Remove NULL checks on allocations and
      before pfree() where we know beforehand that it must be non-NULL.
      Also remove some unneeded inclusions of palloc.h.
  6. 21 Nov, 2016 1 commit
  7. 08 Nov, 2016 1 commit
  8. 07 Nov, 2016 1 commit
    • H
      Revamp the way OIDs are dispatched to segments on CREATE statements. · f9016da2
      Heikki Linnakangas committed
      Instead of carrying a "new OID" field in all the structs that represent
      CREATE statements, introduce a generic mechanism for capturing the OIDs
      of all created objects, dispatching them to the QEs, and using those same
      OIDs when the corresponding objects are created in the QEs. This allows
      removing a lot of scattered changes in DDL command handling that were
      previously needed to ensure that objects are assigned the same OIDs in
      all the nodes.
      
      This also provides the groundwork for pg_upgrade to dictate the OIDs to use
      for upgraded objects. The upstream has mechanisms for pg_upgrade to dictate
      the OIDs for a few objects (relations and types, at least), but in GPDB,
      we need to preserve the OIDs of almost all object types.
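
The capture-and-dispatch idea can be illustrated with a toy Python model. Everything here is an assumption made for illustration: the `Dispatcher` and `Segment` classes, keying objects by `(kind, name)`, the arbitrary starting OID, and raising an error when no OID was dispatched are not taken from the GPDB implementation.

```python
import itertools


class Dispatcher:
    """Toy stand-in for the QD: assigns OIDs and records every one it hands out."""

    def __init__(self, first_oid=16384):  # arbitrary starting value for the sketch
        self._next_oid = itertools.count(first_oid)
        self.catalog = {}
        self.captured = []  # (kind, name, oid) tuples to send to the segments

    def create(self, kind, name):
        oid = next(self._next_oid)
        self.catalog[(kind, name)] = oid
        self.captured.append((kind, name, oid))
        return oid


class Segment:
    """Toy stand-in for a QE: creates objects using the dispatched OIDs only."""

    def __init__(self):
        self.preassigned = {}
        self.catalog = {}

    def receive(self, captured):
        for kind, name, oid in captured:
            self.preassigned[(kind, name)] = oid

    def create(self, kind, name):
        # Sketch choice: fail loudly (KeyError) if no OID was dispatched for this object.
        oid = self.preassigned.pop((kind, name))
        self.catalog[(kind, name)] = oid
        return oid


qd = Dispatcher()
qd.create("relation", "t")
qd.create("index", "t_idx")

segments = [Segment(), Segment()]
for seg in segments:
    seg.receive(qd.captured)
    seg.create("relation", "t")
    seg.create("index", "t_idx")

print(all(seg.catalog == qd.catalog for seg in segments))  # True
```

The point of the sketch is that no CREATE-specific struct carries an OID field: the generic `captured` list is the only channel between the two sides.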
  9. 25 Aug, 2016 6 commits
  10. 18 Aug, 2016 1 commit
  11. 16 Aug, 2016 1 commit
  12. 09 Jul, 2016 1 commit
    • D
      Remove dead code in partition coalesce functionality · c309c921
      Daniel Gustafsson committed
      The ALTER TABLE .. COALESCE PARTITION feature is, while partially
      implemented, not supported. Removing all the scaffolding around the
      parsing might be worthwhile as well, but at least it seems reasonable
      to kill the completely dead code in ATPExecPartCoalesce(). As this
      was the only external caller of parruleord_open_gap(), make the
      function static.
  13. 13 May, 2016 1 commit
  14. 25 Apr, 2016 1 commit
  15. 22 Mar, 2016 1 commit
  16. 12 Feb, 2016 1 commit
  17. 18 Jan, 2016 1 commit
  18. 07 Jan, 2016 2 commits
  19. 28 Oct, 2015 1 commit