1. 21 Sep, 2018 1 commit
  2. 21 Aug, 2018 1 commit
    • Do not create split update for relations excluded by constraints · 9b8dd4f4
      Committed by Taylor Vesely
      When the query_planner determines that a relation does not need
      scanning due to constraint exclusion, it creates a 'dummy' plan for
      that operation. The split update planning code does not understand this
      'dummy' plan shape and fails with an assertion.
      
      Instead, because an excluded relation will never return tuples, do not
      attempt to create a split update at all.
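
The guard described above can be sketched as a small Python model. This is only an illustration, not the GPDB C code: the `Plan` class, the `is_dummy` flag, and `add_split_update` are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Plan:
    kind: str                                   # e.g. "SeqScan", "Result", "SplitUpdate"
    children: List["Plan"] = field(default_factory=list)
    is_dummy: bool = False                      # hypothetical marker for a constraint-excluded relation


def add_split_update(subplan: Plan) -> Plan:
    """Wrap the subplan in a SplitUpdate node unless the relation was excluded."""
    if subplan.is_dummy:
        # An excluded relation never returns tuples, so there is nothing to split.
        return subplan
    return Plan("SplitUpdate", [subplan])


normal = add_split_update(Plan("SeqScan"))
excluded = add_split_update(Plan("Result", is_dummy=True))
print(normal.kind, excluded.kind)
```

In this model the dummy plan is returned unchanged, so later code never has to interpret a SplitUpdate sitting on top of a plan shape it does not expect.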
  3. 03 Aug, 2018 1 commit
  4. 02 Aug, 2018 1 commit
    • Merge with PostgreSQL 9.2beta2. · 4750e1b6
      Committed by Richard Guo
      This is the final batch of commits from PostgreSQL 9.2 development,
      up to the point where the REL9_2_STABLE branch was created, and 9.3
      development started on the PostgreSQL master branch.
      
      Notable upstream changes:
      
      * Index-only scan was included in the batch of upstream commits. It
        allows queries to retrieve data only from indexes, avoiding heap access.
      
      * Group commit was added to work effectively under heavy load. Previously,
        batching of commits became ineffective as the write workload increased,
        because of internal lock contention.
      
      * A new fast-path lock mechanism was added to reduce the overhead of
        taking and releasing certain types of locks which are taken and released
        very frequently but rarely conflict.
      
      * The new "parameterized path" mechanism was added. It allows inner index
        scans to use values from relations that are more than one join level up
        from the scan. This can greatly improve performance in situations where
        semantic restrictions (such as outer joins) limit the allowed join orderings.
      
      * SP-GiST (Space-Partitioned GiST) index access method was added to support
        unbalanced partitioned search structures. For suitable problems, SP-GiST can
        be faster than GiST in both index build time and search time.
      
      * Checkpoints now are performed by a dedicated background process. Formerly
        the background writer did both dirty-page writing and checkpointing. Separating
        this into two processes allows each goal to be accomplished more predictably.
      
      * Custom plans are now supported for specific parameter values, even when
        using prepared statements.
      
      * The FDW API was improved so that foreign data wrappers can provide multiple
        access "paths" for their tables, allowing more flexibility in join planning.
      
      * A security_barrier option was added for views to prevent optimizations that
        might allow view-protected data to be exposed to users.
      
      * Range data type was added to store a lower and upper bound belonging to its
        base data type.
      
      * CTAS (CREATE TABLE AS/SELECT INTO) is now treated as a utility statement. The
        SELECT query is planned during the execution of the utility. To conform to
        this change, GPDB executes the utility statement only on the QD and dispatches
        the plan of the SELECT query to the QEs.
      Co-authored-by: Adam Lee <ali@pivotal.io>
      Co-authored-by: Alexandra Wang <lewang@pivotal.io>
      Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
      Co-authored-by: Asim R P <apraveen@pivotal.io>
      Co-authored-by: Daniel Gustafsson <dgustafsson@pivotal.io>
      Co-authored-by: Gang Xiong <gxiong@pivotal.io>
      Co-authored-by: Haozhou Wang <hawang@pivotal.io>
      Co-authored-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
      Co-authored-by: Jesse Zhang <sbjesse@gmail.com>
      Co-authored-by: Jinbao Chen <jinchen@pivotal.io>
      Co-authored-by: Joao Pereira <jdealmeidapereira@pivotal.io>
      Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
      Co-authored-by: Paul Guo <paulguo@gmail.com>
      Co-authored-by: Richard Guo <guofenglinux@gmail.com>
      Co-authored-by: Shujie Zhang <shzhang@pivotal.io>
      Co-authored-by: Taylor Vesely <tvesely@pivotal.io>
      Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
  5. 23 Jul, 2018 1 commit
    • Enable update on distribution column in legacy planner. · 6be0a32a
      Committed by Zhenghua Lyu
      Previously, the legacy planner could not update a distribution column, because the OLD
      and NEW tuples may belong to different segments. We enable this by borrowing ORCA's
      logic: each update operation is split into a delete and an insert. The delete is hashed
      by the OLD tuple's attributes, and the insert is hashed by the NEW tuple's attributes. This change
      includes the following items:
      * Missing OLD attributes must be pushed down to the subplan tree so that they can be passed up to the top Motion.
      * In addition, if the result relation has OIDs, the OID must also be added to the targetlist.
      * If the result relation is partitioned, it needs special treatment, because resultRelations holds the partition tables instead of the root table, but that is true for a normal Insert.
      * Special treatment for update triggers, because a trigger cannot be executed across segments.
      * Special treatment in nodeModifyTable, so that it can process Insert/Delete for update purposes.
      * Proper initialization of SplitUpdate.
      
      There are still TODOs:
      * Cost is not handled gracefully, because the SplitUpdate node is added after the plan has been generated. A FIXME has already been added for this.
      * For deletion, we could optimize by sending only the distribution columns instead of all columns.
      
      
      Author: Xiaoran Wang <xiwang@pivotal.io>
      Author: Max Yang <myang@pivotal.io>
      Author: Shujie Zhang <shzhang@pivotal.io>
      Author: Zhenghua Lyu <zlv@pivotal.io>
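
The delete-plus-insert split described in this commit can be illustrated with a minimal Python sketch. It is a toy model, not the planner code: `NUM_SEGMENTS`, `segment_for`, and `split_update` are hypothetical names, and a plain modulo stands in for the real distribution hash.

```python
from typing import Dict, List, Tuple

NUM_SEGMENTS = 3  # assumed cluster size for this example

Row = Dict[str, int]


def segment_for(key: int) -> int:
    """Stand-in for the real distribution hash: plain modulo."""
    return key % NUM_SEGMENTS


def split_update(old: Row, new: Row, dist_col: str) -> List[Tuple[str, int, Row]]:
    """Turn one UPDATE into a DELETE routed by OLD and an INSERT routed by NEW."""
    return [
        ("DELETE", segment_for(old[dist_col]), old),
        ("INSERT", segment_for(new[dist_col]), new),
    ]


# Updating the distribution column `id` from 4 to 5 moves the row between segments.
ops = split_update({"id": 4, "v": 10}, {"id": 5, "v": 10}, "id")
for action, segment, row in ops:
    print(action, "on segment", segment, row)
```

Because the two halves are routed independently, the old row is removed on the segment that currently stores it and the new row lands on the segment its new key hashes to.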
  6. 28 Mar, 2018 1 commit
    • Add GUC to enable / disable Join Associativity in ORCA · bb68b5c6
      Committed by Bhuvnesh Chaudhary
      This commit introduces a GUC `optimizer_enable_associativity` to enable
      or disable join associativity. Join associativity increases the search
      space, because it increases the number of groups needed to represent a
      join and its associative counterpart, i.e. (A X B) X C ~ A X (B X C).
      
      This patch disables the join associativity transform by default; users
      can enable it if required. A few plan changes were observed due to this
      change. However, further evaluation of those plan changes revealed that
      even though the cost of the resulting plan increased, the execution
      time went down by 1-2 seconds.
      
      The queries with plan changes join 3 tables, A, B and C. If we increase
      the number of tuples returned by the subquery that forms A', we see the
      old plan. But if the number of tuples in relations B and C is
      significantly higher, the plan changes introduced by the patch yield
      faster execution times. This suggests that we may need to tune the cost
      model to adapt to such cases.
      
      The plan cost is about 1000x higher than that of the old plans. This 1000x
      factor is due to the setting `optimizer_nestloop_factor=1024`; if you
      set the GUC `optimizer_nestloop_factor=1`, the plan is the same before
      and after the patch.
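
The equivalence (A X B) X C ~ A X (B X C) mentioned above can be checked on toy data. The sketch below is a plain Python model of a natural inner join, not ORCA code; the helper names `join` and `canon` and the sample relations are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Row = Dict[str, int]


def join(left: List[Row], right: List[Row]) -> List[Row]:
    """Natural inner join: match rows on every column name the two sides share."""
    out = []
    for l in left:
        for r in right:
            shared = l.keys() & r.keys()
            if all(l[c] == r[c] for c in shared):
                out.append({**l, **r})
    return out


def canon(rows: List[Row]) -> List[Tuple[Tuple[str, int], ...]]:
    """Order-independent form of a result set, so two results can be compared."""
    return sorted(tuple(sorted(r.items())) for r in rows)


A = [{"a": 1, "x": 1}, {"a": 2, "x": 2}]
B = [{"x": 1, "y": 7}, {"x": 2, "y": 8}]
C = [{"y": 7, "c": 100}]

left_first = join(join(A, B), C)    # (A X B) X C
right_first = join(A, join(B, C))   # A X (B X C)

print(canon(left_first) == canon(right_first))
print(len(join(A, B)), len(join(B, C)))  # intermediate result sizes differ
```

Both groupings return the same rows, but their intermediate results differ in size. An optimizer that considers both alternatives has more candidates to cost, which is the search-space growth the commit message refers to.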
  7. 27 Sep, 2017 1 commit
  8. 25 Sep, 2017 1 commit
    • Remove the concept of window "key levels". · b1651a43
      Committed by Heikki Linnakangas
      It wasn't very useful. ORCA and Postgres both just stack WindowAgg nodes
      on top of each other, and no-one's been unhappy about that, so we might as
      well do that, too. This reduces the difference between GPDB and the upstream
      implementation, and will hopefully make it smoother to switch.
      
      Rename the Window Plan node type to WindowAgg, to match upstream, now
      that it is fairly close to the upstream version.
  9. 15 Sep, 2017 1 commit
    • Only request stats of columns needed for cardinality estimation [#150424379] · c5ade96d
      Committed by Omer Arap
      GPORCA should not spend time extracting column statistics that are not
      needed for cardinality estimation. This commit eliminates the overhead
      of requesting and generating statistics for columns that are not
      used in cardinality estimation.
      
      E.g:
      `CREATE TABLE foo (a int, b int, c int);`
      
      For table foo, the query below only needs stats for column `a`, which
      is the distribution column, and column `c`, which is the column used in
      the where clause.
      `select * from foo where c=2;`
      
      However, prior to this commit, the column statistics for column `b` were
      also calculated and passed for cardinality estimation. The only
      information the optimizer needs about column `b` is its `width`. For
      this small piece of information, we transferred all of the stats for that
      column.
      
      This commit and its counterpart commit in GPORCA ensure that the column
      width information is passed and extracted in the `dxl:Relation` metadata
      information.
      
      Preliminary results for short-running queries show up to a 65x
      performance improvement.
      Signed-off-by: Jemish Patel <jpatel@pivotal.io>
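
The selection rule from the `foo` example can be sketched in Python. This is an illustrative model only: the function names and the "full" / "width-only" labels are hypothetical and do not correspond to GPORCA or DXL identifiers.

```python
from typing import Dict, Iterable, List, Set


def columns_needing_full_stats(distribution_cols: Iterable[str],
                               predicate_cols: Iterable[str]) -> Set[str]:
    """Columns whose full statistics feed cardinality estimation."""
    return set(distribution_cols) | set(predicate_cols)


def plan_stats_requests(all_cols: List[str],
                        distribution_cols: Iterable[str],
                        predicate_cols: Iterable[str]) -> Dict[str, str]:
    """Request full stats only where needed; every other column needs just its width."""
    full = columns_needing_full_stats(distribution_cols, predicate_cols)
    return {c: ("full" if c in full else "width-only") for c in all_cols}


# CREATE TABLE foo (a int, b int, c int);  distributed by a
# select * from foo where c=2;
requests = plan_stats_requests(["a", "b", "c"], ["a"], ["c"])
print(requests)
```

For this query only `a` and `c` get a full statistics request, while `b` is reduced to a width lookup, matching the case described in the commit message.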
  10. 12 Sep, 2017 1 commit
    • Refactor adding explicit distribution motion logic · 8f01bf79
      Committed by Bhuvnesh Chaudhary
      nMotionNodes tracks the number of Motion nodes in a plan, and each
      plan node maintains its own nMotionNodes. Counting the Motions in a plan node by
      traversing the tree and adding up the nMotionNodes found in nested plans gives
      an incorrect number of Motion nodes. So instead of using nMotionNodes, use
      a boolean flag to track whether the subtree, excluding the initplans,
      contains a Motion node.
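
A boolean check of this kind could look like the following Python sketch. It models a plan tree with a hypothetical `PlanNode` class and a hypothetical `contains_motion` helper; it is not the GPDB implementation, only an illustration of walking the subtree while skipping initplans.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    kind: str
    children: List["PlanNode"] = field(default_factory=list)
    initplans: List["PlanNode"] = field(default_factory=list)


def contains_motion(node: PlanNode) -> bool:
    """True if the subtree has a Motion node; initplans are deliberately not visited."""
    if node.kind == "Motion":
        return True
    return any(contains_motion(child) for child in node.children)


# A Motion that lives only inside an initplan does not set the flag.
only_in_initplan = PlanNode(
    "Result",
    children=[PlanNode("SeqScan")],
    initplans=[PlanNode("Motion", children=[PlanNode("SeqScan")])],
)
# A Motion in the main subtree does.
in_subtree = PlanNode("HashJoin", children=[
    PlanNode("SeqScan"),
    PlanNode("Hash", children=[PlanNode("Motion", children=[PlanNode("SeqScan")])]),
])

print(contains_motion(only_in_initplan), contains_motion(in_subtree))
```

A yes/no answer avoids the double counting that can happen when per-node counters are summed across nested plans.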