1. 31 August 2018, 1 commit
    • Rename "prelim function" to "combine function", to match upstream. · b8545d57
      Committed by Heikki Linnakangas
      The GPDB "prelim" functions did the same thing as the "combine"
      functions introduced in PostgreSQL 9.6. This commit includes just the
      catalog changes, to essentially search & replace "prelim" with
      "combine". I did not yet pick the planner and executor changes that
      were made as part of this in the upstream.
      
      Also replace the GPDB implementation of float8_amalg() and
      float8_regr_amalg(), with the upstream float8_combine() and
      float8_regr_combine(). They do the same thing, but let's use upstream
      functions where possible.
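      
      As a minimal sketch of the renamed concept, in PostgreSQL 9.6-style
      CREATE AGGREGATE syntax (the aggregate name and functions here are
      illustrative, not from this commit):
      
          -- Two-stage aggregate: COMBINEFUNC (the renamed "prelim" function)
          -- merges two partial transition states from different processes.
          CREATE AGGREGATE my_sum (int8) (
              SFUNC = int8pl,        -- per-row transition function
              STYPE = int8,          -- transition state type
              COMBINEFUNC = int8pl,  -- combines partial states
              INITCOND = '0'
          );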
      
      Upstream commits:
      commit a7de3dc5
      Author: Robert Haas <rhaas@postgresql.org>
      Date:   Wed Jan 20 13:46:50 2016 -0500
      
          Support multi-stage aggregation.
      
          Aggregate nodes now have two new modes: a "partial" mode where they
          output the unfinalized transition state, and a "finalize" mode where
          they accept unfinalized transition states rather than individual
          values as input.
      
          These new modes are not used anywhere yet, but they will be necessary
          for parallel aggregation.  The infrastructure also figures to be
          useful for cases where we want to aggregate local data and remote
          data via the FDW interface, and want to bring back partial aggregates
          from the remote side that can then be combined with locally generated
          partial aggregates to produce the final value.  It may also be useful
          even when neither FDWs nor parallelism are in play, as explained in
          the comments in nodeAgg.c.
      
          David Rowley and Simon Riggs, reviewed by KaiGai Kohei, Heikki
          Linnakangas, Haribabu Kommi, and me.
      
      commit af025eed
      Author: Robert Haas <rhaas@postgresql.org>
      Date:   Fri Apr 8 13:44:50 2016 -0400
      
          Add combine functions for various floating-point aggregates.
      
          This allows parallel aggregation to use them.  It may seem surprising
          that we use float8_combine for both float4_accum and float8_accum
          transition functions, but that's because those functions differ only
          in the type of the non-transition-state argument.
      
          Haribabu Kommi, reviewed by David Rowley and Tomas Vondra
  2. 03 August 2018, 1 commit
  3. 02 August 2018, 1 commit
    • Merge with PostgreSQL 9.2beta2. · 4750e1b6
      Committed by Richard Guo
      This is the final batch of commits from PostgreSQL 9.2 development,
      up to the point where the REL9_2_STABLE branch was created, and 9.3
      development started on the PostgreSQL master branch.
      
      Notable upstream changes:
      
      * Index-only scan was included in the batch of upstream commits. It
        allows queries to retrieve data only from indexes, avoiding heap
        access (see the sketch after this list).
      
      * Group commit was added to work effectively under heavy load. Previously,
        batching of commits became ineffective as the write workload increased,
        because of internal lock contention.
      
      * A new fast-path lock mechanism was added to reduce the overhead of
        taking and releasing certain types of locks which are taken and released
        very frequently but rarely conflict.
      
      * The new "parameterized path" mechanism was added. It allows inner index
        scans to use values from relations that are more than one join level up
        from the scan. This can greatly improve performance in situations where
        semantic restrictions (such as outer joins) limit the allowed join orderings.
      
      * SP-GiST (Space-Partitioned GiST) index access method was added to support
        unbalanced partitioned search structures. For suitable problems, SP-GiST can
        be faster than GiST in both index build time and search time.
      
      * Checkpoints are now performed by a dedicated background process. Formerly
        the background writer did both dirty-page writing and checkpointing. Separating
        this into two processes allows each goal to be accomplished more predictably.
      
      * Custom plans for specific parameter values are now supported even when
        using prepared statements.
      
      * The FDW API was improved to let FDWs provide multiple access "paths" for
        their tables, allowing more flexibility in join planning.
      
      * A security_barrier option was added for views, to prevent optimizations
        that might allow view-protected data to be exposed to users.
      
      * Range data types were added, storing a lower and an upper bound belonging
        to a base data type.
      
      * CTAS (CREATE TABLE AS/SELECT INTO) is now treated as a utility statement. The
        SELECT query is planned during the execution of the utility. To conform to
        this change, GPDB executes the utility statement only on QD and dispatches
        the plan of the SELECT query to QEs.
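      
      For example, an index-only scan shows up in a plan like the following
      sketch (table and index names are illustrative; the plan line is
      abridged):
      
          CREATE TABLE m (id int, v float8);
          CREATE INDEX m_id_idx ON m (id);
          VACUUM m;   -- set visibility-map bits so the heap can be skipped
          EXPLAIN SELECT id FROM m WHERE id < 100;
          --  ... Index Only Scan using m_id_idx on m ...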
      Co-authored-by: Adam Lee <ali@pivotal.io>
      Co-authored-by: Alexandra Wang <lewang@pivotal.io>
      Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
      Co-authored-by: Asim R P <apraveen@pivotal.io>
      Co-authored-by: Daniel Gustafsson <dgustafsson@pivotal.io>
      Co-authored-by: Gang Xiong <gxiong@pivotal.io>
      Co-authored-by: Haozhou Wang <hawang@pivotal.io>
      Co-authored-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
      Co-authored-by: Jesse Zhang <sbjesse@gmail.com>
      Co-authored-by: Jinbao Chen <jinchen@pivotal.io>
      Co-authored-by: Joao Pereira <jdealmeidapereira@pivotal.io>
      Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
      Co-authored-by: Paul Guo <paulguo@gmail.com>
      Co-authored-by: Richard Guo <guofenglinux@gmail.com>
      Co-authored-by: Shujie Zhang <shzhang@pivotal.io>
      Co-authored-by: Taylor Vesely <tvesely@pivotal.io>
      Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
  4. 09 July 2018, 1 commit
  5. 12 May 2018, 1 commit
  6. 03 May 2018, 1 commit
    • Add Global Deadlock Detector. · 03915d65
      Committed by Zhenghua Lyu
      To prevent distributed deadlocks, Greenplum DB holds an exclusive table
      lock for UPDATE and DELETE commands, so concurrent updates of the same
      table are effectively disabled.
      
      We add a backend process that performs global deadlock detection, so
      that we no longer lock the whole table for UPDATE/DELETE; this helps
      improve the concurrency of Greenplum DB.
      
      The core idea of the algorithm is to divide locks into two types:
      
      - Persistent: the lock can only be released after the transaction is
        over (abort/commit)
      - Non-persistent: all other cases
      
      This PR's implementation adds a persistent flag to the LOCK, set
      according to these rules:
      
      - An xid lock is always persistent
      - A tuple lock is never persistent
      - A relation lock is persistent if the relation has been closed with
        the NoLock parameter; otherwise it is not persistent
      - Other types of locks are not persistent
      
      For more details, please refer to the code and the README.
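      
      As a sketch of the behavior this enables (the GUC name is from the GPDB
      docs; the table and rows are illustrative), two sessions can now update
      different rows of the same table concurrently, and a genuine
      cross-segment deadlock is detected and cancelled instead of waiting
      forever:
      
          -- Enable the detector, then restart the cluster:
          --   gpconfig -c gp_enable_global_deadlock_detector -v on
          -- Session 1:                            -- Session 2:
          BEGIN;                                   BEGIN;
          UPDATE t SET v = v + 1 WHERE id = 1;     UPDATE t SET v = v + 1 WHERE id = 2;
          UPDATE t SET v = v + 1 WHERE id = 2;     UPDATE t SET v = v + 1 WHERE id = 1;
          -- The detector cancels one of the transactions; the other commits.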
      
      There are several known issues to pay attention to:
      
      - This PR's implementation only considers locks that can be shown in
        the pg_locks view.
      - This PR's implementation does not support AO tables; we keep
        upgrading the locks for AO tables.
      - This PR's implementation does not take network waits into account,
        so we cannot detect the deadlock in GitHub issue #2837.
      - SELECT FOR UPDATE still locks the whole table.
      Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
      Co-authored-by: Ning Yu <nyu@pivotal.io>
  7. 02 May 2018, 1 commit
    • Re-enable MIN/MAX optimization. · 362fc756
      Committed by Heikki Linnakangas
      I'm not sure why it was disabled. It's not very hard to make it work,
      so let's do it. It might not be a very common query type, but if you
      happen to have a query where it helps, it helps a lot.
      
      This adds a GUC, gp_enable_minmax_optimization, to enable/disable the
      optimization. There's no such GUC in upstream, but we need at least a flag
      in PlannerConfig for it, so that we can disable the optimization for
      correlated subqueries, along with some other optimizer tricks. Seems best
      to also have a GUC for it, for consistency with other flags in
      PlannerConfig.
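      
      A sketch of the kind of query the optimization helps (the table and
      index are illustrative):
      
          SET gp_enable_minmax_optimization = on;
          -- With an index on events(id), MIN/MAX no longer scans the whole
          -- table; expect a plan that reads a single tuple from one end of
          -- the index, e.g. a Limit over an Index Scan:
          EXPLAIN SELECT max(id) FROM events;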
  8. 29 March 2018, 2 commits
    • Support replicated table in GPDB · 7efe3204
      Committed by Pengzhou Tang
      * Support replicated table in GPDB
      
      Currently, tables in GPDB are distributed across all segments by hash or
      randomly. There are requirements for a new table type, called a replicated
      table, where all segments hold a full, duplicate copy of the table data.
      
      To implement it, we added a new distribution policy named POLICYTYPE_REPLICATED
      to mark a replicated table, and a new locus type named CdbLocusType_SegmentGeneral
      to describe the distribution of tuples of a replicated table.
      CdbLocusType_SegmentGeneral implies that data is generally available on all
      segments but not available on the qDisp, so a plan node with this locus type can
      be flexibly planned to execute on either a single QE or all QEs. It is similar to
      CdbLocusType_General; the only difference is that a CdbLocusType_SegmentGeneral
      node can't be executed on the qDisp. To guarantee this, we try our best to add a
      gather motion on top of a CdbLocusType_SegmentGeneral node when planning motion
      for a join, even if the other rel has a bottleneck locus type. A problem is that
      such a motion may be redundant if the single QE is not eventually promoted to
      execute on the qDisp, so we need to detect such cases and omit the redundant
      motion at the end of apply_motion(). We don't reuse CdbLocusType_Replicated since
      it always implies a broadcast motion below it, and it's not easy to plan such a
      node as direct dispatch to avoid getting duplicate data.
      
      We don't support replicated tables with an inherit/partition-by clause yet;
      the main problem is that update/delete on multiple result relations can't
      work correctly yet. We can fix this later.
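      
      A sketch of the user-facing syntax this introduces (table names are
      illustrative):
      
          CREATE TABLE dim_country (code text, name text) DISTRIBUTED REPLICATED;
          -- Every segment holds a full copy, so a join against it needs no
          -- Motion on the replicated side:
          EXPLAIN SELECT * FROM facts f JOIN dim_country d ON f.code = d.code;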
      
      * Allow spi_* to access replicated tables on QEs
      
      Previously, GPDB didn't allow a QE to access non-catalog tables, because
      the data there is incomplete. We can remove this limitation now, as long
      as only replicated tables are accessed.
      
      One problem is that the QE needs to know whether a table is replicated.
      Previously, QEs didn't maintain the gp_distribution_policy catalog, so we
      need to pass the policy info to the QEs for replicated tables.
      
      * Change schema of gp_distribution_policy to identify replicated tables
      
      Previously, we used the magic number -128 in the gp_distribution_policy
      table to identify replicated tables, which was quite a hack, so we add a
      new column in gp_distribution_policy to identify replicated and
      partitioned tables.
      
      This commit also abandons the old way of using a 1-length NULL list and
      a 2-length NULL list to identify the DISTRIBUTED RANDOMLY and DISTRIBUTED
      FULLY clauses.
      
      Besides, this commit refactors the code to make the decision-making
      around distribution policy clearer.
      
      * Support COPY for replicated tables
      
      * Disable the row-ctid unique path for replicated tables.
        Previously, GPDB used a special Unique path on rowid to address
        queries like "x IN (subquery)". For example, for
        select * from t1 where t1.c2 in (select c2 from t2), the plan looks
        like:
         ->  HashAggregate
               Group By: t1.ctid, t1.gp_segment_id
               ->  Hash Join
                     Hash Cond: t2.c2 = t1.c2
                     ->  Seq Scan on t2
                     ->  Hash
                           ->  Seq Scan on t1
      
        Obviously, the plan is wrong if t1 is a replicated table, because
        ctid + gp_segment_id can't identify a tuple: in a replicated table, a
        logical row may have a different ctid and gp_segment_id on each
        segment. So we disable such plans for replicated tables temporarily.
        This is not the best way, because the rowid unique path may be cheaper
        than a normal hash semi join, so we left a FIXME for later optimization.
      
      * ORCA-related fix
        Reported and added by Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
        Fall back to the legacy query optimizer for queries over replicated tables
      
      * Adapt pg_dump/gpcheckcat to replicated tables
        gp_distribution_policy is no longer a master-only catalog; do the
        same check as for other catalogs.
      
      * Support gpexpand on replicated tables && altering the dist policy of replicated tables
    • Remove FIXME about group_id in Distinct HashAgg · 2b25c663
      Committed by Dhanashree Kashid
      With the 8.4 merge, planner considers using HashAgg to implement
      DISTINCT. At the end of planning, we replace the expressions in the
      targetlist of certain operators (including Agg) into OUTER references
      in targetlist of its lefttree (see set_plan_refs() >
      set_upper_references()).
      But, as per the code, in the case when grouping() or group_id() are
      present in the target list of Agg, it skips the replacement and this is
      problematic in case the Agg is implementing DISTINCT.
      
      It seems that the Agg's targetlist need not compute grouping() or
      group_id() when its lefttree is computing it. In that case, it may
      simply refer to it. This would then also apply to other operators
      WindowAgg, Result & PartitionSelector.
      
      However, the Repeat node needs to compute these functions at each stage
      because group_id is derived from RepeatState::repeat_count. Thus, it
      connot be replaced by an OUTER reference.
      
      Hence, this commit removes the special case for these functions for all
      operators except Repeat. Then, a DISTINCT HashAgg produces the correct
      results.
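      
      A query of the affected shape, as a sketch (table and columns are
      illustrative): a DISTINCT implemented by HashAgg on top of an Agg that
      already computes the grouping functions.
      
          SELECT DISTINCT grouping(a), group_id(), a, sum(b)
          FROM t
          GROUP BY ROLLUP (a);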
      Signed-off-by: Shreedhar Hardikar <shardikar@pivotal.io>
  9. 16 March 2018, 1 commit
    • Remove GPDB_84_MERGE_FIXME in planner.c and prepunion.c · 74546663
      Committed by Shreedhar Hardikar
      These were related to choosing the right arguments to send to GPDB's
      make_agg() and cost_agg() methods for queries containing DISTINCT or set
      operations.
      
      Hash aggregation, when used to implement a DISTINCT (in either form) in
      the query, is not related to grouping sets, and thus the arguments
      num_nullcols, input_grouping, grouping and rollup_gs_times should be 0.
      
      However, since SetOp uses the upstream TupleHashTable while HashAgg uses
      GPDB's HHashTable implementation, the hash table size calculations
      should be computed differently. This is fixed in this commit.
      Signed-off-by: Sambitesh Dash <sdash@pivotal.io>
  10. 09 February 2018, 1 commit
    • Refactor the way Semi-Join plans are constructed. · d4ce0921
      Committed by Heikki Linnakangas
      This removes much of the GPDB machinery to handle "deduplication paths"
      within the planner. We will now use the upstream code to build JOIN_SEMI
      paths, as well as paths where the outer side of the join is first
      deduplicated (JOIN_UNIQUE_OUTER/INNER).
      
      The old style "join first and deduplicate later" plans can be better in
      some cases, however. To still be able to generate such plan, add new
      JOIN_DEDUP_SEMI join type, which is transformed into JOIN_INNER followed
      by the deduplication step after the join, during planning.
      
      This new way of constructing these plans is simpler, and allows removing
      a bunch of code, and reverting some more code to the way it is in the
      upstream.
      
      I'm not sure if this can generate the same plans that the old code
      could, in all cases. In particular, I think the old "late deduplication"
      mechanism could delay the deduplication further, all the way to the top
      of the join tree. I'm not sure when that would be useful, though, and
      the regression suite doesn't seem to contain any such cases (with
      EXPLAIN). Or maybe I misunderstood the old code. In any case, I think
      this is good enough.
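      
      A sketch of the query shape involved (illustrative tables): a semi join
      that can either deduplicate the inner side first (JOIN_UNIQUE_INNER),
      or, with the new JOIN_DEDUP_SEMI, be planned as an inner join followed
      by a deduplication step:
      
          EXPLAIN SELECT * FROM t1 WHERE t1.c2 IN (SELECT c2 FROM t2);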
  11. 02 February 2018, 1 commit
    • Remove extra planner pass to remove "trivial" Result nodes. · c613cabf
      Committed by Heikki Linnakangas
      Instead, avoid creating such Result nodes in the first place, by making
      plan_pushdown_tlist() check if the Result node would have any work to do.
      
      With this, you get Result nodes in some cases where the old code could
      zap them away. But on the other hand, this can avoid inserting Result
      nodes not only on top of Appends, but on top of any node. This can be
      seen in the included expected output changes: some test queries lose a
      Result, some gain one. So performance-wise this is about a wash, but
      this is simpler.
      
      The reason to do this right now is that we ran into issues with the
      "zapping" code while working on the 9.0 merge. I'm sure we could fix those
      issues, but let's do this rather than spend time debugging and fixing the
      zapping code with the merge.
  12. 13 December 2017, 4 commits
    • Reword comment to avoid nested comments · 8105f067
      Committed by Daniel Gustafsson
      The comment added in 916f460f created a nested comment structure
      by accident, which triggered a warning in clang for -Wcomment. Reword
      the comment slightly to make the compiler happy.
      
      planner.c:194:15: warning: '/*' within block comment [-Wcomment]
               * support pl/* statements (relevant when they are planned on the segments).
                           ^
    • Fix storage test failures caused by 916f460f · 0d3ae2a0
      Committed by Shreedhar Hardikar
      The default value of Gp_role is set to GP_ROLE_DISPATCH, which means
      auxiliary processes inherit this value. FileRep does the same, but also
      executes queries using SPI on the segments, which means Gp_role ==
      GP_ROLE_DISPATCH is not a sufficient check for being the master QD.
      
      So, bring back the check on GpIdentity.
      
      Author: Asim R P <apraveen@pivotal.io>
      Author: Shreedhar Hardikar <shardikar@pivotal.io>
    • Rename querytree_safe_for_segment to querytree_safe_for_qe · 32f099fd
      Committed by Shreedhar Hardikar
      The original name was deceptive, because this check is also done for QE
      slices that run on the master. For example:
      
      EXPLAIN SELECT * FROM func1_nosql_vol(5), foo;
      
                                               QUERY PLAN
      --------------------------------------------------------------------------------------------
       Gather Motion 3:1  (slice2; segments: 3)  (cost=0.30..1.37 rows=4 width=12)
         ->  Nested Loop  (cost=0.30..1.37 rows=2 width=12)
               ->  Seq Scan on foo  (cost=0.00..1.01 rows=1 width=8)
               ->  Materialize  (cost=0.30..0.33 rows=1 width=4)
                     ->  Broadcast Motion 1:3  (slice1)  (cost=0.00..0.30 rows=3 width=4)
                           ->  Function Scan on func1_nosql_vol  (cost=0.00..0.26 rows=1 width=4)
       Settings:  optimizer=off
       Optimizer status: legacy query optimizer
      (8 rows)
      
      Note that in the plan, the function func1_nosql_vol() will be executed on a
      master slice with Gp_role as GP_ROLE_EXECUTE.
      
      Also, update output files
      Signed-off-by: Jesse Zhang <sbjesse@gmail.com>
    • Ensure that ORCA is not called on any process other than the master QD · 916f460f
      Committed by Shreedhar Hardikar
      We don't want to use the optimizer for planning queries in SQL, pl/pgSQL
      etc. functions when that is done on the segments.
      
      ORCA excels in complex queries, most of which will access distributed
      tables. We can't run such queries from the segments slices anyway
      because they require dispatching a query within another - which is not
      allowed in GPDB. Note that this restriction also applies to non-QD
      master slices.  Furthermore, ORCA doesn't currently support pl/*
      statements (relevant when they are planned on the segments).
      
      For these reasons, restrict to using ORCA on the master QD processes
      only.
      
      Also revert commit d79a2c7f ("Fix pipeline failures caused by 0dfd0ebc.")
      and separate out the gporca fault injector tests into the newly added
      gporca_faults.sql, so that the rest can run in a parallel group.
      Signed-off-by: Jesse Zhang <sbjesse@gmail.com>
  13. 12 December 2017, 1 commit
    • Replace usage of deprecated error codes · fd0a1b75
      Committed by Daniel Gustafsson
      These error codes were marked as deprecated in September 2007, but the
      code didn't get the memo. Extend the deprecation into the code and
      actually replace the usage. Ten years seems like long enough notice, so
      also remove the renames; the odds of anyone using these in code that
      compiles against a 6X tree should be low (and easily fixed).
  14. 30 November 2017, 1 commit
  15. 24 November 2017, 7 commits
    • Backport upstream comment updates · 122e817b
      Committed by Heikki Linnakangas
      commit 96f990e2
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Wed Jul 13 20:23:09 2011 -0400
      
          Update some comments to clarify who does what in targetlist creation.
      
          No code changes; just avoid blaming query_planner for things it doesn't
          really do.
    • Backport upstream bugfix related to Window functions. · 411a033c
      Committed by Heikki Linnakangas
      The test case added to the regression suite actually seems to work on
      GPDB even without this, but it nevertheless seems like a good idea to
      pick it now, since we have the code it affected. Also, I'm about to
      backport more stuff that depends on this.
      
      commit c1d9579d
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Tue Jul 12 18:23:55 2011 -0400
      
          Avoid listing ungrouped Vars in the targetlist of Agg-underneath-Window.
      
          Regular aggregate functions in combination with, or within the arguments
          of, window functions are OK per spec; they have the semantics that the
          aggregate output rows are computed and then we run the window functions
          over that row set.  (Thus, this combination is not really useful unless
          there's a GROUP BY so that more than one aggregate output row is possible.)
          The case without GROUP BY could fail, as recently reported by Jeff Davis,
          because sloppy construction of the Agg node's targetlist resulted in extra
          references to possibly-ungrouped Vars appearing outside the aggregate
          function calls themselves.  See the added regression test case for an
          example.
      
          Fixing this requires modifying the API of flatten_tlist and its underlying
          function pull_var_clause.  I chose to make pull_var_clause's API for
          aggregates identical to what it was already doing for placeholders, since
          the useful behaviors turn out to be the same (error, report node as-is, or
          recurse into it).  I also tightened the error checking in this area a bit:
          if it was ever valid to see an uplevel Var, Aggref, or PlaceHolderVar here,
          that was a long time ago, so complain instead of ignoring them.
      
          Backpatch into 9.1.  The failure exists in 8.4 and 9.0 as well, but seeing
          that it only occurs in a basically-useless corner case, it doesn't seem
          worth the risks of changing a function API in a minor release.  There might
          be third-party code using pull_var_clause.
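      
      The corner case being fixed, as a sketch (table name is illustrative):
      a plain aggregate used inside a window function without GROUP BY, so
      the input collapses to a single aggregate row first:
      
          SELECT sum(salary), rank() OVER (ORDER BY sum(salary)) FROM emp;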
    • Cherry-pick change to pull_var_clause() API. · bd3ab7bd
      Committed by Heikki Linnakangas
      We would get this later in PostgreSQL 8.4, but I'm about to cherry-pick
      more commits now that depend on this.
      
      Upstream commit:
      
      commit 1d97c19a
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Sun Apr 19 19:46:33 2009 +0000
      
          Fix estimate_num_groups() to not fail on PlaceHolderVars, per report from
          Stefan Kaltenbrunner.  The most reasonable behavior (at least for the near
          term) seems to be to ignore the PlaceHolderVar and examine its argument
          instead.  In support of this, change the API of pull_var_clause() to allow
          callers to request recursion into PlaceHolderVars.  Currently
          estimate_num_groups() is the only customer for that behavior, but where
          there's one there may be others.
    • Re-implement RANGE PRECEDING/FOLLOWING. · 14a9108a
      Committed by Heikki Linnakangas
      This is similar to the old implementation, in that we use "+" and "-"
      to compute the boundaries.
      
      Unfortunately it seems unlikely that this would be accepted in the
      upstream, but at least we have that feature back in GPDB now, the way it
      used to be. See discussion on pgsql-hackers about that:
      https://www.postgresql.org/message-id/26801.1265656635@sss.pgh.pa.us
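      
      A sketch of the re-enabled syntax (illustrative table; the frame bounds
      are computed with the type's "+"/"-" operators):
      
          SELECT ts,
                 avg(v) OVER (ORDER BY ts
                              RANGE BETWEEN '1 hour'::interval PRECEDING
                                    AND CURRENT ROW)
          FROM readings;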
    • Backport implementation of ORDER BY within aggregates, from PostgreSQL 9.0. · 4319b7bb
      Committed by Heikki Linnakangas
      This is functionality that was lost by the ripout & replace.
      
      commit 34d26872
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Tue Dec 15 17:57:48 2009 +0000
      
          Support ORDER BY within aggregate function calls, at long last providing a
          non-kluge method for controlling the order in which values are fed to an
          aggregate function.  At the same time eliminate the old implementation
          restriction that DISTINCT was only supported for single-argument aggregates.
      
          Possibly release-notable behavioral change: formerly, agg(DISTINCT x)
          dropped null values of x unconditionally.  Now, it does so only if the
          agg transition function is strict; otherwise nulls are treated as DISTINCT
          normally would, ie, you get one copy.
      
          Andrew Gierth, reviewed by Hitoshi Harada
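      
      For example (illustrative table), the backported syntax feeds values to
      the aggregate in a well-defined order:
      
          SELECT array_agg(name ORDER BY name DESC) FROM people;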
    • Remove PercentileExpr. · bb6a757e
      Committed by Heikki Linnakangas
      This loses the functionality, and leaves all the regression tests that used
      those functions failing.
      
      The plan is to later backport the upstream implementation of those
      functions from PostgreSQL 9.4. The feature is called "ordered set
      aggregates" there.
    • Wholesale rip out and replace Window planner and executor code · f62bd1c6
      Committed by Heikki Linnakangas
      This adds some limitations, and removes some functionality that the old
      implementation had. These limitations will be lifted, and the missing
      functionality will be added back, in subsequent commits:
      
      * You can no longer have variables in start/end offsets
      
      * RANGE is not implemented (except for UNBOUNDED)
      
      * If you have multiple window functions that require a different sort
        ordering, the planner is not smart about placing them in a way that
        minimizes the number of sorts.
      
      This also lifts some limitations that the GPDB implementation had:
      
      * LEAD/LAG offsets can now be negative. In qp_olap_windowerr, a lot of
        queries that used to throw a "ROWS parameter cannot be negative" error
        are now passing. That error was an artifact of the way LEAD/LAG were
        implemented. Those queries contain window function calls like
        "LEAD(col1, col2 - col3)", and sometimes, with suitable values in col2
        and col3, the second argument went negative. That caused the error.
        The new implementation of LEAD/LAG is OK with a negative argument
        (see the sketch after this list).
      
      * Aggregate functions with no prelimfn or invprelimfn are now supported as
        window functions
      
      * Window functions, e.g. rank(), no longer require an ORDER BY. (The
        output will vary from one invocation to another, though, because the
        order is then not well defined. This is more annoying on GPDB than on
        PostgreSQL, because in GPDB the row order tends to vary, as the rows
        are spread out across the cluster and arrive at the master in
        unpredictable order.)
      
      * NTILE doesn't require the argument expression to be in PARTITION BY
      
      * A window function's arguments may contain references to an outer query.
      
      This changes the OIDs of the built-in window functions to match upstream.
      Unfortunately, the OIDs had been hard-coded in ORCA, so to work around that
      until those hard-coded values are fixed in ORCA, the ORCA translator code
      contains a hack to map the old OIDs to the new ones.
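      
      A sketch of one lifted limitation (illustrative table): a negative LEAD
      offset now simply reads in the other direction, like LAG, instead of
      raising "ROWS parameter cannot be negative":
      
          SELECT day, lead(price, -1) OVER (ORDER BY day) FROM quotes;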
  16. 23 November 2017, 2 commits
    • Make 'must_gather' logic when planning DISTINCT and ORDER BY more robust. · a5610212
      Committed by Heikki Linnakangas
      The old logic was:
      
      1. Decide if we need to put a Gather motion on top of the plan
      2. Add nodes to handle DISTINCT
      3. Add nodes to handle ORDER BY.
      4. Add Gather node, if we decided so in step 1.
      
      In step 1, if the result was already focused on a single segment, we
      would note that no Gather is needed, and not add one in step 4. However,
      the DISTINCT processing might add a Redistribute Motion node, so that
      the final result is no longer focused on a single node.
      
      I couldn't come up with a query where that would happen, as the code
      stands, but we saw such a case on the "window functions rewrite" branch
      we've been working on. There, the sort order/distribution of the input
      can be changed to process window functions. But even if this isn't
      actively broken right now, it seems more robust to change the logic so
      that 'must_gather' means 'at the end, the result must end up on a single
      node', instead of 'we must add a Gather node'. The test that this adds
      exercises this issue after the window functions rewrite, but right now
      it passes with or without these code changes. But we might as well add
      it now.
    • Fix DISTINCT with window functions. · 898ced7c
      Committed by Heikki Linnakangas
      The last 8.4 merge commit introduced support for DISTINCT with hashing,
      and refactored the way grouping_planner() works with the path keys. That
      broke DISTINCT with window functions, because the new distinct_pathkeys
      field was not set correctly.
      
      In commit 474f1db0, I moved some GPDB-added tests from the 'aggregates'
      test to a new 'gp_aggregates' test. But I forgot to add the new test
      file to the test schedule, so it was not run. Oops. Add it to the
      schedule now. The tests in 'gp_aggregates' cover this bug.
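      
      The broken query shape, roughly (illustrative table and columns):
      DISTINCT applied over a window function's output.
      
          SELECT DISTINCT dept, rank() OVER (PARTITION BY dept ORDER BY salary)
          FROM emp;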
  17. 21 October 2017, 1 commit
    • Fix distribution of rows in CREATE TABLE AS and ORDER BY. · c159ec72
      Committed by Heikki Linnakangas
      If a CREATE TABLE AS query contained an ORDER BY, the planner put a
      Motion node on top of the plan that focuses all the rows on a single
      node. However, that was confused with the redistribute motion that
      CREATE TABLE AS is supposed to put at the top, to distribute the rows
      according to the DISTRIBUTED BY of the table. This used to work before
      commit 7e268107, because we used to not add an explicit Motion node on
      top of the plan for ORDER BY; we just changed the sort-order information
      in the Flow.
      
      I have a nagging feeling that the apply_motion code isn't dealing with a
      Motion on top of a Motion node correctly, because I would've expected to
      get a plan like that without this fix. Perhaps apply_motion silently
      refuses to add a Motion node on top of an existing Motion? That'd be a
      silly plan, of course, and fortunately the planner doesn't create such
      plans, so I'm not going to dig deeper into that right now.
      
      The test case is a simplified version of one of the
      "mpp21090_drop_col_oids_dml_*" TINC tests. I noticed this while moving
      those tests over from TINC to the main suite. We only run those tests
      in the concourse pipeline with "set optimizer=on", so it didn't catch
      this issue with optimizer=off.
      
      Fixes github issue #3577.
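      
      The failing shape, as a sketch (illustrative table): the final Motion
      must redistribute by the target table's distribution key, not gather
      the rows for the ORDER BY:
      
          CREATE TABLE t2 AS SELECT * FROM t1 ORDER BY b DISTRIBUTED BY (a);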
  18. 13 October 2017, 1 commit
    • Remove superfluous pathkey canonicalization · 7913e231
      Committed by Jesse Zhang
      `make_pathkeys_for_sortclauses` with a `true` last argument promises to
      canonicalize the returned path keys. We somehow cargo-culted a few
      unnecessary `canonicalize_pathkeys` immediately after those calls.
      
      This commit removes such superfluous calls to `canonicalize_pathkeys`.
      Signed-off-by: Max Yang <myang@pivotal.io>
  19. 12 October 2017, 1 commit
  20. 27 September 2017, 8 commits
    • Remove dead code around JoinExpr::subqfromlist. · f16deabd
      Committed by Shreedhar Hardikar
      This was used to keep information about the subquery join tree of
      pulled-up sublinks, for use later in deconstruct_recurse(). With the
      upstream subselect merge, a JoinExpr is constructed at pull-up time
      itself, so this is no longer needed: the subquery join tree information
      is available in the constructed JoinExpr.
      
      Also with the merge, deconstruct_recurse() handles JOIN_SEMI JoinExprs.
      However, since GPDB differs from upstream by treating SEMI joins as
      INNER joins for internal join planning, this commit also updates
      inner_join_rels correctly for SEMI joins (see the regression test).
      
      Also remove unused function declaration for not_null_inner_vars().
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
    • Improve pull_up_subqueries logic w.r.t PlaceHolderVar · da29e67a
      Committed by Ekta Khanna
      commit c59d8dd4
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Tue Apr 28 21:31:16 2009 +0000
      
          Improve pull_up_subqueries logic so that it doesn't insert unnecessary
          PlaceHolderVar nodes in join quals appearing in or below the lowest
          outer join that could null the subquery being pulled up.  This improves
          the planner's ability to recognize constant join quals, and probably
          helps with detection of common sort keys (equivalence classes) as well.
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
    • Refrain from creating the planner's placeholder_list · 695c9fdf
      Committed by Ekta Khanna
      commit 31468d05
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Wed Oct 22 20:17:52 2008 +0000
      
          Dept of better ideas: refrain from creating the planner's placeholder_list
          until vars are distributed to rels during query_planner() startup.  We don't
          really need it before that, and not building it early has some advantages.
          First, we don't need to put it through the various preprocessing steps, which
          saves some cycles and eliminates the need for a number of routines to support
          PlaceHolderInfo nodes at all.  Second, this means one less unused plan for any
          sub-SELECT appearing in a placeholder's expression, since we don't build
          placeholder_list until after sublink expansion is complete.
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
    • Add a concept of "placeholder" variables to the planner · 2b5c8201
      Committed by Bhuvnesh Chaudhary
      commit e6ae3b5d
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Tue Oct 21 20:42:53 2008 +0000
      
          Add a concept of "placeholder" variables to the planner.  These are variables
          that represent some expression that we desire to compute below the top level
          of the plan, and then let that value "bubble up" as though it were a plain
          Var (ie, a column value).
      
          The immediate application is to allow sub-selects to be flattened even when
          they are below an outer join and have non-nullable output expressions.
          Formerly we couldn't flatten because such an expression wouldn't properly
          go to NULL when evaluated above the outer join.  Now, we wrap it in a
          PlaceHolderVar and arrange for the actual evaluation to occur below the outer
          join.  When the resulting Var bubbles up through the join, it will be set to
          NULL if necessary, yielding the correct results.  This fixes a planner
          limitation that's existed since 7.1.
      
          In future we might want to use this mechanism to re-introduce some form of
          Hellerstein's "expensive functions" optimization, ie place the evaluation of
          an expensive function at the most suitable point in the plan tree.
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
    • Improve sublink pullup code to handle ANY/EXISTS sublinks · 1ddcb97e
      Committed by Ekta Khanna
      commit 19e34b62
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Sun Aug 17 01:20:00 2008 +0000
      
          Improve sublink pullup code to handle ANY/EXISTS sublinks that are at top
          level of a JOIN/ON clause, not only at top level of WHERE.  (However, we
          can't do this in an outer join's ON clause, unless the ANY/EXISTS refers
          only to the nullable side of the outer join, so that it can effectively
          be pushed down into the nullable side.)  Per request from Kevin Grittner.
      
          In passing, fix a bug in the initial implementation of EXISTS pullup:
          it would Assert if the EXIST's WHERE clause used a join alias variable.
          Since we haven't yet flattened join aliases when this transformation
          happens, it's necessary to include join relids in the computed set of
          RHS relids.
      
      Ref [#142356521]
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
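      
      A sketch of the newly handled shape (illustrative tables): an ANY
      sublink at the top level of a JOIN/ON clause can now be pulled up into
      a join:
      
          SELECT *
          FROM a JOIN b
            ON a.x = ANY (SELECT c.y FROM c WHERE c.k = b.k);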
    • Replace JOIN_LASJ by JOIN_ANTI · 6e7b4722
      Committed by Ekta Khanna
      After merging with e006a24a, Anti Semi Joins are denoted by `JOIN_ANTI`
      instead of `JOIN_LASJ`.
      
      Ref [#142355175]
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
    • CDBlize the cherry-pick e006a24a · 0feb1bd9
      Committed by Ekta Khanna
      Original Flow:
      cdb_flatten_sublinks
      	+--> pull_up_IN_clauses
      		+--> convert_sublink_to_join
      
      New Flow:
      cdb_flatten_sublinks
      	+--> pull_up_sublinks
      
      This commit contains relevant changes for the above flow.
      
      Previously, `try_join_unique` was part of `InClauseInfo`. It was getting
      set in `convert_IN_to_join()` and used in `cdb_make_rel_dedup_info()`.
      Now, `InClauseInfo` is not present, and we construct `FlattenedSublink`
      instead in `convert_ANY_sublink_to_join()`. Later in the flow, we
      construct `SpecialJoinInfo` from `FlattenedSublink` in
      `deconstruct_sublink_quals_to_rel()`. Hence, we add `try_join_unique`
      to both `FlattenedSublink` and `SpecialJoinInfo`.
      
      Ref [#142355175]
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
    • Implement SEMI and ANTI joins in the planner and executor. · fe2eb2c9
      Committed by Ekta Khanna
      commit e006a24a
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Thu Aug 14 18:48:00 2008 +0000
      
          Implement SEMI and ANTI joins in the planner and executor.  (Semijoins replace
          the old JOIN_IN code, but antijoins are new functionality.)  Teach the planner
          to convert appropriate EXISTS and NOT EXISTS subqueries into semi and anti
          joins respectively.  Also, LEFT JOINs with suitable upper-level IS NULL
          filters are recognized as being anti joins.  Unify the InClauseInfo and
          OuterJoinInfo infrastructure into "SpecialJoinInfo".  With that change,
          it becomes possible to associate a SpecialJoinInfo with every join attempt,
          which permits some cleanup of join selectivity estimation.  That needs to be
          taken much further than this patch does, but the next step is to change the
          API for oprjoin selectivity functions, which seems like material for a
          separate patch.  So for the moment the output size estimates for semi and
          especially anti joins are quite bogus.
      
      Ref [#142355175]
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
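      
      A sketch of what the planner now recognizes (illustrative tables):
      
          -- Becomes a semi join:
          SELECT * FROM a WHERE EXISTS (SELECT 1 FROM b WHERE b.id = a.id);
          -- Becomes an anti join:
          SELECT * FROM a WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.id = a.id);
          -- A LEFT JOIN with a suitable IS NULL filter is also recognized
          -- as an anti join:
          SELECT * FROM a LEFT JOIN b ON a.id = b.id WHERE b.id IS NULL;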
  21. 25 September 2017, 1 commit
    • Remove row order information from Flow. · 7e268107
      Committed by Heikki Linnakangas
      A Motion node often needs to "merge" the incoming streams, to preserve the
      overall sort order. Instead of carrying sort order information throughout
      the later stages of planning, in the Flow struct, pass it as argument
      directly to make_motion() and other functions, where a Motion node is
      created. This simplifies things.
      
      To make that work, we can no longer rely on apply_motion() to add the final
      Motion on top of the plan, when the (sub-)query contains an ORDER BY. That's
      because we no longer have that information available at apply_motion(). Add
      the Motion node in grouping_planner() instead, where we still have that
      information, as a path key.
      
      When I started to work on this, it also fixed a bug where the sortColIdx
      of a plan's Flow node could refer to the wrong resno. A test case for
      that is included. However, that case has since been fixed by other
      coincidental changes to partition elimination, so now this is just
      refactoring.
  22. 21 September 2017, 1 commit
    • Fix CURRENT OF to work with PL/pgSQL cursors. · 91411ac4
      Committed by Heikki Linnakangas
      Before, it only worked for cursors declared with DECLARE CURSOR. You got
      a "there is no parameter $0" error if you tried it with anything else.
      This moves the decision on whether a plan is "simply updatable" from the
      parser to the planner. Doing it in the parser was awkward, because we
      only want to do it for queries that are used in a cursor, and for SPI
      queries, we don't know that yet at that time.
      For some reason, the copy, out and read functions of CurrentOfExpr were
      missing the cursor_param field. While we're at it, reorder the code to
      match upstream.
      
      This only makes the required changes to the Postgres planner. ORCA has never
      supported updatable cursors. In fact, it will fall back to the Postgres
      planner on any DECLARE CURSOR command, so that's why the existing tests
      have passed even with optimizer=off.
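      
      A sketch of what now works (illustrative table and function), with the
      cursor declared in PL/pgSQL rather than with DECLARE CURSOR:
      
          CREATE FUNCTION bump_first() RETURNS void AS $$
          DECLARE
              cur CURSOR FOR SELECT id, v FROM t;
              r   record;
          BEGIN
              OPEN cur;
              FETCH cur INTO r;
              -- This used to fail with: there is no parameter $0
              UPDATE t SET v = r.v + 1 WHERE CURRENT OF cur;
              CLOSE cur;
          END
          $$ LANGUAGE plpgsql;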