1. 30 Nov, 2017 2 commits
  2. 24 Nov, 2017 11 commits
    •
      Backport upstream comment updates · 122e817b
      Heikki Linnakangas committed
      commit 96f990e2
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Wed Jul 13 20:23:09 2011 -0400
      
          Update some comments to clarify who does what in targetlist creation.
      
          No code changes; just avoid blaming query_planner for things it doesn't
          really do.
    •
      Backport upstream bugfix related to Window functions. · 411a033c
      Heikki Linnakangas committed
      The test case added to the regression suite actually seems to work on
      GPDB even without this, but nevertheless seems like a good idea to pick
      it now, since we have the code it affected. Also, I'm about to backport
      more stuff that depends on this.
      
      commit c1d9579d
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Tue Jul 12 18:23:55 2011 -0400
      
          Avoid listing ungrouped Vars in the targetlist of Agg-underneath-Window.
      
          Regular aggregate functions in combination with, or within the arguments
          of, window functions are OK per spec; they have the semantics that the
          aggregate output rows are computed and then we run the window functions
          over that row set.  (Thus, this combination is not really useful unless
          there's a GROUP BY so that more than one aggregate output row is possible.)
          The case without GROUP BY could fail, as recently reported by Jeff Davis,
          because sloppy construction of the Agg node's targetlist resulted in extra
          references to possibly-ungrouped Vars appearing outside the aggregate
          function calls themselves.  See the added regression test case for an
          example.
      
          Fixing this requires modifying the API of flatten_tlist and its underlying
          function pull_var_clause.  I chose to make pull_var_clause's API for
          aggregates identical to what it was already doing for placeholders, since
          the useful behaviors turn out to be the same (error, report node as-is, or
          recurse into it).  I also tightened the error checking in this area a bit:
          if it was ever valid to see an uplevel Var, Aggref, or PlaceHolderVar here,
          that was a long time ago, so complain instead of ignoring them.
      
          Backpatch into 9.1.  The failure exists in 8.4 and 9.0 as well, but seeing
          that it only occurs in a basically-useless corner case, it doesn't seem
          worth the risks of changing a function API in a minor release.  There might
          be third-party code using pull_var_clause.
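
The three aggregate-handling behaviors named above (error, report node as-is, or recurse into it) can be modeled in a few lines. This is an illustrative Python sketch only, not PostgreSQL's C code: the node classes, the `AggBehavior` enum, and the `pull_vars` function are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from enum import Enum, auto


class AggBehavior(Enum):
    REJECT = auto()    # raise an error if an aggregate is found
    INCLUDE = auto()   # report the aggregate node as-is
    RECURSE = auto()   # descend into the aggregate's arguments


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Aggref:
    func: str
    args: tuple


@dataclass(frozen=True)
class OpExpr:
    op: str
    args: tuple


def pull_vars(node, agg_behavior):
    """Collect the leaf nodes of an expression, treating aggregates as requested."""
    if isinstance(node, Var):
        return [node]
    if isinstance(node, Aggref):
        if agg_behavior is AggBehavior.REJECT:
            raise ValueError("unexpected aggregate in expression")
        if agg_behavior is AggBehavior.INCLUDE:
            return [node]
        # RECURSE: fall through and walk the aggregate's arguments
    if isinstance(node, (Aggref, OpExpr)):
        return [found for arg in node.args for found in pull_vars(arg, agg_behavior)]
    return []


# sum(x) + g
expr = OpExpr("+", (Aggref("sum", (Var("x"),)), Var("g")))
print(pull_vars(expr, AggBehavior.INCLUDE))
print(pull_vars(expr, AggBehavior.RECURSE))
```

With the "report as-is" behavior the ungrouped `x` never shows up outside the aggregate call, which is the property the fix relies on for the Agg node's targetlist.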
    •
      Cherry-pick change to pull_var_clause() API. · bd3ab7bd
      Heikki Linnakangas committed
      We would get this later in PostgreSQL 8.4, but I'm about to cherry-pick
      more commits now that depend on this.

      Upstream commit:
      
      commit 1d97c19a
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Sun Apr 19 19:46:33 2009 +0000
      
          Fix estimate_num_groups() to not fail on PlaceHolderVars, per report from
          Stefan Kaltenbrunner.  The most reasonable behavior (at least for the near
          term) seems to be to ignore the PlaceHolderVar and examine its argument
          instead.  In support of this, change the API of pull_var_clause() to allow
          callers to request recursion into PlaceHolderVars.  Currently
          estimate_num_groups() is the only customer for that behavior, but where
          there's one there may be others.
    •
      Fix assertion failure from calling contain_agg_clause on raw parse tree · 7c0ceea1
      Heikki Linnakangas committed
      It assumes that any SubLinks have been processed already.
    •
      Re-implement RANGE PRECEDING/FOLLOWING. · 14a9108a
      Heikki Linnakangas committed
      This is similar to the old implementation, in that we use "+", "-" to
      compute the boundaries.
      
      Unfortunately it seems unlikely that this would be accepted in the
      upstream, but at least we have that feature back in GPDB now, the way it
      used to be. See discussion on pgsql-hackers about that:
      https://www.postgresql.org/message-id/26801.1265656635@sss.pgh.pa.us
    •
      Support ordered-set (WITHIN GROUP) aggregates. · fd6212ce
      Heikki Linnakangas committed
      This is a backport from PostgreSQL 9.4. It brings back functionality that we
      lost with the ripout & replace of the window function implementation.
      
      I left out all the code and tests related to COLLATE, because we don't have
      that feature. Will need to put that back when we merge collation support, in
      9.1.
      
      commit 8d65da1f
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Mon Dec 23 16:11:35 2013 -0500
      
          Support ordered-set (WITHIN GROUP) aggregates.
      
          This patch introduces generic support for ordered-set and hypothetical-set
          aggregate functions, as well as implementations of the instances defined in
          SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(),
          percent_rank(), cume_dist()).  We also added mode() though it is not in the
          spec, as well as versions of percentile_cont() and percentile_disc() that
          can compute multiple percentile values in one pass over the data.
      
          Unlike the original submission, this patch puts full control of the sorting
          process in the hands of the aggregate's support functions.  To allow the
          support functions to find out how they're supposed to sort, a new API
          function AggGetAggref() is added to nodeAgg.c.  This allows retrieval of
          the aggregate call's Aggref node, which may have other uses beyond the
          immediate need.  There is also support for ordered-set aggregates to
          install cleanup callback functions, so that they can be sure that
          infrastructure such as tuplesort objects gets cleaned up.
      
          In passing, make some fixes in the recently-added support for variadic
          aggregates, and make some editorial adjustments in the recent FILTER
          additions for aggregates.  Also, simplify use of IsBinaryCoercible() by
          allowing it to succeed whenever the target type is ANY or ANYELEMENT.
          It was inconsistent that it dealt with other polymorphic target types
          but not these.
      
          Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing,
          and rather heavily editorialized upon by Tom Lane
      
      Also includes this fixup:
      
      commit cf63c641
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Mon Dec 23 20:24:07 2013 -0500
      
          Fix portability issue in ordered-set patch.
      
          Overly compact coding in makeOrderedSetArgs() led to a platform dependency:
          if the compiler chose to execute the subexpressions in the wrong order,
          list_length() might get applied to an already-modified List, giving a
          value we didn't want.  Per buildfarm.
    •
      Add infrastructure for storing a VARIADIC ANY function's VARIADIC flag. · c70617a2
      Heikki Linnakangas committed
      This is a backport from the following commit from PostgreSQL 9.3. Needed
      now, because subsequent backported commits depend on it.
      
      commit 75b39e79
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Mon Jan 21 20:25:26 2013 -0500
      
          Add infrastructure for storing a VARIADIC ANY function's VARIADIC flag.
      
          Originally we didn't bother to mark FuncExprs with any indication whether
          VARIADIC had been given in the source text, because there didn't seem to be
          any need for it at runtime.  However, because we cannot fold a VARIADIC ANY
          function's arguments into an array (since they're not necessarily all the
          same type), we do actually need that information at runtime if VARIADIC ANY
          functions are to respond unsurprisingly to use of the VARIADIC keyword.
          Add the missing field, and also fix ruleutils.c so that VARIADIC ANY
          function calls are dumped properly.
      
          Extracted from a larger patch that also fixes concat() and format() (the
          only two extant VARIADIC ANY functions) to behave properly when VARIADIC is
          specified.  This portion seems appropriate to review and commit separately.
      
          Pavel Stehule
    •
      Centralize the logic for detecting misplaced aggregates, window funcs, etc. · fc8f849d
      Heikki Linnakangas committed
      This cherry-picks the following commit. This is needed because subsequent
      commits depend on this one.
      
      I took the EXPR_KIND_PARTITION_EXPRESSION value from PostgreSQL v10, where
      it's also for partition-related things. Seems like a good idea, even though
      our partitioning implementation is completely different.
      
      commit eaccfded
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Fri Aug 10 11:35:33 2012 -0400
      
          Centralize the logic for detecting misplaced aggregates, window funcs, etc.
      
          Formerly we relied on checking after-the-fact to see if an expression
          contained aggregates, window functions, or sub-selects when it shouldn't.
          This is grotty, easily forgotten (indeed, we had forgotten to teach
          DefineIndex about rejecting window functions), and none too efficient
          since it requires extra traversals of the parse tree.  To improve matters,
          define an enum type that classifies all SQL sub-expressions, store it in
          ParseState to show what kind of expression we are currently parsing, and
          make transformAggregateCall, transformWindowFuncCall, and transformSubLink
          check the expression type and throw error if the type indicates the
          construct is disallowed.  This allows removal of a large number of ad-hoc
          checks scattered around the code base.  The enum type is sufficiently
          fine-grained that we can still produce error messages of at least the
          same specificity as before.
      
          Bringing these error checks together revealed that we'd been none too
          consistent about phrasing of the error messages, so standardize the wording
          a bit.
      
          Also, rewrite checking of aggregate arguments so that it requires only one
          traversal of the arguments, rather than up to three as before.
      
          In passing, clean up some more comments left over from add_missing_from
          support, and annotate some tests that I think are dead code now that that's
          gone.  (I didn't risk actually removing said dead code, though.)
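
The approach described above, recording the kind of expression being parsed and rejecting misplaced constructs at transform time, might look roughly like this. It is a minimal Python model under stated assumptions: `ExprKind`, `ParseState`, `ParseError`, `transform_aggregate_call`, and the message texts are hypothetical stand-ins for the example, not the actual C definitions.

```python
from dataclasses import dataclass
from enum import Enum, auto


class ExprKind(Enum):
    SELECT_TARGET = auto()
    WHERE = auto()
    INDEX_EXPRESSION = auto()


class ParseError(Exception):
    pass


# One table of per-context messages replaces scattered after-the-fact checks.
AGG_NOT_ALLOWED = {
    ExprKind.WHERE: "aggregate functions are not allowed in WHERE",
    ExprKind.INDEX_EXPRESSION: "aggregate functions are not allowed in index expressions",
}


@dataclass
class ParseState:
    expr_kind: ExprKind  # what kind of expression is currently being parsed


def transform_aggregate_call(pstate, agg_name):
    """Reject the aggregate immediately if the current context disallows it."""
    message = AGG_NOT_ALLOWED.get(pstate.expr_kind)
    if message is not None:
        raise ParseError(message)
    return ("Aggref", agg_name)


print(transform_aggregate_call(ParseState(ExprKind.SELECT_TARGET), "sum"))
```

Because the check happens while the aggregate call is being transformed, no extra traversal of the finished parse tree is needed.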
      
      Author: Heikki Linnakangas <hlinnakangas@pivotal.io>
      Author: Ekta Khanna <ekhanna@pivotal.io>
    •
      Backport implementation of ORDER BY within aggregates, from PostgreSQL 9.0. · 4319b7bb
      Heikki Linnakangas committed
      This is functionality that was lost by the ripout & replace.
      
      commit 34d26872
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Tue Dec 15 17:57:48 2009 +0000
      
          Support ORDER BY within aggregate function calls, at long last providing a
          non-kluge method for controlling the order in which values are fed to an
          aggregate function.  At the same time eliminate the old implementation
          restriction that DISTINCT was only supported for single-argument aggregates.
      
          Possibly release-notable behavioral change: formerly, agg(DISTINCT x)
          dropped null values of x unconditionally.  Now, it does so only if the
          agg transition function is strict; otherwise nulls are treated as DISTINCT
          normally would, ie, you get one copy.
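
The behavioral change for agg(DISTINCT x) can be illustrated with a small model of which inputs reach the transition function. This is a hedged Python sketch: `distinct_inputs` and its `transfn_is_strict` parameter are hypothetical names for the example, and Python's `None` stands in for SQL NULL.

```python
def distinct_inputs(values, transfn_is_strict):
    """Return the values that would be fed to the aggregate's transition function."""
    seen = set()
    seen_null = False
    fed = []
    for value in values:
        if value is None:
            # A strict transition function never sees nulls; otherwise
            # DISTINCT keeps exactly one copy of the null.
            if transfn_is_strict or seen_null:
                continue
            seen_null = True
            fed.append(None)
        elif value not in seen:
            seen.add(value)
            fed.append(value)
    return fed


print(distinct_inputs([1, None, 1, None, 2], transfn_is_strict=True))
print(distinct_inputs([1, None, 1, None, 2], transfn_is_strict=False))
```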
      
          Andrew Gierth, reviewed by Hitoshi Harada
    •
      Remove PercentileExpr. · bb6a757e
      Heikki Linnakangas committed
      This loses the functionality, and leaves all the regression tests that used
      those functions failing.
      
      The plan is to later backport the upstream implementation of those
      functions from PostgreSQL 9.4. The feature is called "ordered set
      aggregates" there.
    •
      Wholesale rip out and replace Window planner and executor code · f62bd1c6
      Heikki Linnakangas committed
      This adds some limitations, and removes some functionality that the old
      implementation had. These limitations will be lifted, and missing
      functionality will be added back, in subsequent commits:
      
      * You can no longer have variables in start/end offsets
      
      * RANGE is not implemented (except for UNBOUNDED)
      
      * If you have multiple window functions that require a different sort
        ordering, the planner is not smart about placing them in a way that
        minimizes the number of sorts.
      
      This also lifts some limitations that the GPDB implementation had:
      
      * LEAD/LAG offset can now be negative. In the qp_olap_windowerr, a lot of
        queries that used to throw a "ROWS parameter cannot be negative" error
        are now passing. That error was an artifact of the way LEAD/LAG were
        implemented. Those queries contain window function calls like "LEAD(col1,
        col2 - col3)", and sometimes with suitable values in col2 and col3, the
        second argument went negative. That caused the error. The new
        implementation of LEAD/LAG is OK with a negative argument.
      
      * Aggregate functions with no prelimfn or invprelimfn are now supported as
        window functions
      
      * Window functions, e.g. rank(), no longer require an ORDER BY. (The output
        will vary from one invocation to another, though, because the order is
        then not well defined. This is more annoying on GPDB than on PostgreSQL,
        because in GPDB the row order tends to vary because the rows are spread
        out across the cluster and will arrive in the master in unpredictable
        order)
      
      * NTILE doesn't require the argument expression to be in PARTITION BY
      
      * A window function's arguments may contain references to an outer query.
      
      This changes the OIDs of the built-in window functions to match upstream.
      Unfortunately, the OIDs had been hard-coded in ORCA, so to work around that
      until those hard-coded values are fixed in ORCA, the ORCA translator code
      contains a hack to map the old OID to the new ones.
  3. 23 Nov, 2017 2 commits
    •
      Make 'must_gather' logic when planning DISTINCT and ORDER BY more robust. · a5610212
      Heikki Linnakangas committed
      The old logic was:
      
      1. Decide if we need to put a Gather motion on top of the plan
      2. Add nodes to handle DISTINCT
      3. Add nodes to handle ORDER BY.
      4. Add Gather node, if we decided so in step 1.
      
      If, in step 1, the result was already focused on a single segment, we
      would make note that no Gather is needed, and not add one in step 4.
      However, the DISTINCT processing might add a Redistribute Motion node, so
      that the final result is not focused on a single node.
      
      I couldn't come up with a query where that would happen, as the code stands,
      but we saw such a case on the "window functions rewrite" branch we've been
      working on. There, the sort order/distribution of the input can be changed
      to process window functions. But even if this isn't actively broken right
      now, it seems more robust to change the logic so that 'must_gather' means
      'at the end, the result must end up on a single node', instead of 'we must
      add a Gather node'. The test that this adds exercises this issue after the
      window functions rewrite, but right now it passes with or without these
      code changes. But might as well add it now.
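
The revised meaning of 'must_gather' can be sketched as follows. This is a simplified Python model, not the planner's C code: the `Plan` class, its `single_node` flag, and the helper names are hypothetical. The point it shows is that the decision to add a Gather is made against the plan as it looks at the end, after DISTINCT processing may have redistributed the rows.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    nodes: list          # plan nodes, bottom to top
    single_node: bool    # is the result focused on a single node?


def add_distinct(plan, needs_redistribute):
    nodes = list(plan.nodes)
    if needs_redistribute:
        nodes.append("Redistribute Motion")
    nodes.append("Unique")
    # A redistribute spreads the rows out again, so the focus is lost.
    return Plan(nodes, plan.single_node and not needs_redistribute)


def add_order_by(plan):
    return Plan(plan.nodes + ["Sort"], plan.single_node)


def finish_plan(plan, must_gather, distinct_needs_redistribute):
    plan = add_distinct(plan, distinct_needs_redistribute)
    plan = add_order_by(plan)
    # 'must_gather' means: at the end, the result must be on a single node.
    if must_gather and not plan.single_node:
        plan = Plan(plan.nodes + ["Gather Motion"], True)
    return plan


start = Plan(["Seq Scan"], single_node=True)
print(finish_plan(start, must_gather=True, distinct_needs_redistribute=True).nodes)
```

Under the old four-step logic, the same input would have skipped the Gather because the scan was already focused on one segment in step 1.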
    •
      Fix DISTINCT with window functions. · 898ced7c
      Heikki Linnakangas committed
      The last 8.4 merge commit introduced support for DISTINCT with hashing,
      and refactored the way grouping_planner() works with the path keys. That
      broke DISTINCT with window functions, because the new distinct_pathkeys
      field was not set correctly.
      
      In commit 474f1db0, I moved some GPDB-added tests from the 'aggregates'
      test, to a new 'gp_aggregates' test. But I forgot to add the new test file
      to the test schedule, so it was not run. Oops. Add it to the schedule now.
      The tests in 'gp_aggregates' cover this bug.
  4. 21 Nov, 2017 1 commit
    •
      Refactor dynamic index scans and bitmap scans, to reduce diff vs. upstream. · 198f701e
      Heikki Linnakangas committed
      Much of the code and structs used by index scans and bitmap index scans had
      been fused together and refactored in GPDB, to share code between dynamic
      index scans and regular ones. However, it would be nice to keep upstream
      code unchanged as much as possible. To that end, refactor the executor code
      for dynamic index scans and dynamic bitmap index scans, to reduce the diff
      vs upstream.
      
      The Dynamic Index Scan executor node is now a thin wrapper around the
      regular Index Scan node, even thinner than before. When a new Dynamic Index
      Scan begins, we don't do much initialization at that point. When the scan
      begins, we initialize an Index Scan node for the first partition, and
      return rows from it until it's exhausted. On next call, the underlying
      Index Scan is destroyed, and a new Index Scan node is created, for the next
      partition, and so on. Creating and destroying the IndexScanState for every
      partition adds some overhead, but it's not significant compared to all the
      other overhead of opening and closing the relations, building scan keys
      etc.
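
The thin-wrapper behavior described above can be modeled as an outer scan that creates one inner scan per partition and replaces it when it runs dry. This is an illustrative Python sketch; the class and method names are hypothetical and partitions are plain lists of rows, whereas the real executor nodes are C structs.

```python
class IndexScan:
    """Stand-in for a regular index scan over one partition."""

    def __init__(self, partition_rows):
        self._rows = iter(partition_rows)
        self.closed = False

    def next_row(self):
        return next(self._rows, None)  # None signals end of scan

    def close(self):
        self.closed = True


class DynamicIndexScan:
    """Thin wrapper: no per-partition setup until rows are actually requested."""

    def __init__(self, partitions):
        self._partitions = iter(partitions)
        self._inner = None

    def next_row(self):
        while True:
            if self._inner is None:
                partition = next(self._partitions, None)
                if partition is None:
                    return None  # all partitions exhausted
                self._inner = IndexScan(partition)
            row = self._inner.next_row()
            if row is not None:
                return row
            # Current partition is exhausted: destroy its scan, move on.
            self._inner.close()
            self._inner = None


scan = DynamicIndexScan([[1, 2], [], [3]])
rows = list(iter(scan.next_row, None))
print(rows)
```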
      
      Similarly, a Dynamic Bitmap Index Scan executor node is just a thin wrapper
      for regular Bitmap Index Scan. When MultiExecDynamicBitmapIndexScan() is
      called, it initializes a BitmapIndexScanState for the current partition,
      and calls it. On ReScan, the BitmapIndexScan executor node for the old
      partition is shut down. A Dynamic Bitmap Index Scan differs from Dynamic
      Index Scan in that a Dynamic Index Scan is responsible for iterating
      through all the active partitions, while a Dynamic Bitmap Index Scan works
      as a slave for the Dynamic Bitmap Heap Scan node above it.
      
      It'd be nice to do a similar refactoring for heap scans, but that's for
      another day.
  5. 11 Nov, 2017 1 commit
    •
      Align simplify_EXISTS_query with upstream · c823e7c6
      Dhanashree Kashid committed
      This function had diverged a lot from upstream after the subselect merge.
      One of the main reasons is that upstream has a lot of restrictive checks
      which prevent pull-up of EXISTS/NOT EXISTS. GPDB handles them
      differently, producing a join/initplan or a one-time filter.
      
      The cases that GPDB handles and for which we have not ported the checks
      from upstream are as follows:
      
      - AGG with limit count with/without offset
      - HAVING clause without AGG
      - AGG without HAVING clause
      
      For other conditions, we bail out as upstream does. Hence we have added
      the checks for HAVING and aggregates inside simplify_EXISTS_query
      differently from upstream. The rest of the code is similar to upstream.
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
  6. 30 Oct, 2017 1 commit
  7. 21 Oct, 2017 1 commit
    •
      Fix distribution of rows in CREATE TABLE AS and ORDER BY. · c159ec72
      Heikki Linnakangas committed
      If a CREATE TABLE AS query contained an ORDER BY, the planner put a Motion
      node on top of the plan that focuses all the rows to a single node.
      However, that was confused with the redistribute Motion that CREATE TABLE
      AS is supposed to put on top of the plan, to distribute the rows according to
      the DISTRIBUTED BY of the table. This used to work before commit
      7e268107, because we used to not add an explicit Motion node on top of
      the plan for ORDER BY, but we just changed the sort-order information in
      the Flow.
      
      I have a nagging feeling that the apply_motion code isn't dealing with
      Motion on top of a Motion node correctly, because I would've expected to
      get a plan like that without this fix. Perhaps apply_motion silently
      refuses to add a Motion node on top of an existing Motion? That'd be a
      silly plan, of course, and fortunately the planner doesn't create such
      plans, so I'm not going to dig deeper into that right now.
      
      The test case is a simplified version from one of the
      "mpp21090_drop_col_oids_dml_*" TINC tests. I noticed this while moving
      those tests over from TINC to the main suite. We only run those tests
      in the concourse pipeline with "set optimizer=on", so it didn't catch
      this issue with optimizer=off.
      
      Fixes github issue #3577.
  8. 20 Oct, 2017 1 commit
    •
      Refactor to use AOCS totalbyte code · 70feac1c
      Daniel Gustafsson committed
      We had code for getting the total bytes consumed by an AOCS table,
      as well as other code implementing the very same thing, which calls
      for a common function. Extend GetAOCSTotalBytes() to deal with uncompressed or
      compressed data and refactor callsites to use this function.
      
      This also removes the need for a memory allocation in the codepaths
      which only want to know the number of bytes.
  9. 17 Oct, 2017 1 commit
    •
      Fix assertion failure if a window function's PARTITION BY is constant. · e49c5e70
      Heikki Linnakangas committed
      If a window function had a PARTITION BY clause, but the planner was able
      to deduce that it's a constant at runtime, we would still try to distribute
      the rows according to the non-existent hash expression. Creating a hash
      locus with no hash expressions tripped an assertion.
      
      Fixes github issues #3423 and #3446. Backpatch to 5X_STABLE.
  10. 13 Oct, 2017 1 commit
    •
      Remove superfluous pathkey canonicalization · 7913e231
      Jesse Zhang committed
      `make_pathkeys_for_sortclauses` with a `true` last argument promises to
      canonicalize the returned path keys. We somehow cargo-culted a few
      unnecessary `canonicalize_pathkeys` calls immediately after those calls.
      
      This commit removes such superfluous calls to `canonicalize_pathkeys`.
      Signed-off-by: Max Yang <myang@pivotal.io>
  11. 12 Oct, 2017 1 commit
  12. 11 Oct, 2017 1 commit
    •
      Fix crash with a ROLLUP query. · 281ad5e9
      Heikki Linnakangas committed
      This was broken by commit 7e268107, which refactored the code that deals
      with path keys and sorts in plangroupext.c. The new function
      make_sort_from_pathkeys_and_groupingcol(), which replaced the old
      make_sort_from_reordered_groupcols() function, didn't work quite the same
      as the old function. I'm not sure what exactly went wrong there, but the
      caller already has the column number and operator information at hand, so
      we can use it to construct the Sort directly, without trying to re-find the
      original target list entries of the sort columns.
      
      Commit 7e268107 also neglected the comments in
      make_sort_from_pathkeys_and_groupingcol(), but this commit removes the whole
      function.
      
      Fixes github issue #3447.
  13. 10 Oct, 2017 1 commit
    •
      Hide the two tuplesort implementations behind a common facade. · bbf40a8c
      Heikki Linnakangas committed
      We have two implementations of tuplesort: the "regular" one inherited
      from upstream, in tuplesort.c, and a GPDB-specific tuplesort_mk.c. We had
      modified all the callers to check the gp_enable_mk_sort GUC, and deal with
      both of them. However, that makes merging with upstream difficult, and
      litters the code with the boilerplate to check the GUC and call one of
      the two implementations.
      
      Simplify the callers, by providing a single API that hides the two
      implementations from the rest of the system. The API is the tuplesort_*
      functions, as in upstream. This requires some preprocessor trickery,
      so that tuplesort.c can use the tuplesort_* function names as is, but in
      the rest of the codebase, calling tuplesort_*() will call a "switcheroo"
      function that decides which implementation to actually call. While this
      is more lines of code overall, it keeps all the ugliness confined in
      tuplesort.h, not littered throughout the codebase.
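
The facade idea can be shown in miniature: callers use one entry point, and a single dispatch function picks the implementation based on the setting. This is only a Python model of the dispatch, since the real change relies on C preprocessor tricks in tuplesort.h. The `SETTINGS` dictionary, the two `_sort` helpers, and the `calls` log are hypothetical; both helpers simply delegate to Python's `sorted`.

```python
# Stand-in for the gp_enable_mk_sort GUC named in the commit message.
SETTINGS = {"gp_enable_mk_sort": True}

calls = []  # records which implementation ran, for demonstration only


def _regular_sort(rows, key):
    calls.append("regular")
    return sorted(rows, key=key)


def _mk_sort(rows, key):
    calls.append("mk")
    return sorted(rows, key=key)


def tuplesort(rows, key=None):
    """Single entry point: callers never check the setting themselves."""
    impl = _mk_sort if SETTINGS["gp_enable_mk_sort"] else _regular_sort
    return impl(rows, key)


print(tuplesort([3, 1, 2]), calls)
```

The design trade-off is the one the message states: slightly more code in one place, in exchange for call sites that look the same as upstream.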
  14. 30 Sep, 2017 1 commit
    •
      Don't transform large array Const into ArrayExpr for Orca (#3406) · 1c82fac2
      Haisheng Yuan committed
      If the number of elements in the array Const is greater than
      optimizer_array_expansion_threshold, return the original Const unmodified.
      Otherwise, expanding it causes a severe performance issue in the Orca
      optimizer for arrays with a very large number of elements, e.g. 50K.
      
      Fixes issue #3355
      [#151340976]
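
The threshold rule is simple enough to state as code. This is a hedged Python sketch: `ArrayConst`, `ArrayExpr`, and `maybe_expand` are hypothetical names for the example, and the `threshold` argument plays the role of optimizer_array_expansion_threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayConst:
    elements: tuple  # a single constant holding the whole array


@dataclass(frozen=True)
class ArrayExpr:
    elements: tuple  # one expression node per element


def maybe_expand(const, threshold):
    """Expand a constant array into per-element form unless it is too large."""
    if len(const.elements) > threshold:
        return const  # leave large arrays as one Const
    return ArrayExpr(const.elements)


small = ArrayConst((1, 2))
large = ArrayConst(tuple(range(50_000)))
print(type(maybe_expand(small, threshold=25)).__name__)
print(type(maybe_expand(large, threshold=25)).__name__)
```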
  15. 28 Sep, 2017 1 commit
  16. 27 Sep, 2017 13 commits
    •
      Disable flattening of IN/EXISTS sublinks inside outer joins · fb1448d0
      Tom Lane committed
      commit 07b9936a
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Fri Feb 27 23:30:29 2009 +0000
      
          Temporarily (I hope) disable flattening of IN/EXISTS sublinks that are within
          the ON clause of an outer join.  Doing so is semantically correct but results
          in de-optimizing queries that were structured to take advantage of the sublink
          style of execution, as seen in recent complaint from Kevin Grittner.  Since
          the user can get the other behavior by reorganizing his query, having the
          flattening happen automatically is just a convenience, and that doesn't
          justify breaking existing applications.  Eventually it would be nice to
          re-enable this, but that seems to require a significantly different approach
          to outer joins in the executor.
      
      Added relevant test case.
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
    •
      Don't assume a subquery's output is unique if there's a SRF in its tlist · e7ff3ef1
      Ekta Khanna and Jemish Patel committed
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Tue Jul 8 14:03:32 2014 -0400
      
          While the x output of "select x from t group by x" can be presumed unique,
          this does not hold for "select x, generate_series(1,10) from t group by x",
          because we may expand the set-returning function after the grouping step.
          (Perhaps that should be re-thought; but considering all the other oddities
          involved with SRFs in targetlists, it seems unlikely we'll change it.)
          Put a check in query_is_distinct_for() so it's not fooled by such cases.
      
          Back-patch to all supported branches.
      
          David Rowley
      
      (cherry picked from commit 2e7469dc8b3bac4fe0f9bd042aaf802132efde85)
    •
      Fix possible crash with nested SubLinks. · cb7e418d
      Ekta Khanna committed
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Tue Dec 10 16:10:36 2013 -0500
      
      An expression such as WHERE (... x IN (SELECT ...) ...) IN (SELECT ...)
      could produce an invalid plan that results in a crash at execution time,
      if the planner attempts to flatten the outer IN into a semi-join.
      This happens because convert_testexpr() was not expecting any nested
      SubLinks and would wrongly replace any PARAM_SUBLINK Params belonging
      to the inner SubLink.  (I think the comment denying that this case could
      happen was wrong when written; it's certainly been wrong for quite a long
      time, since very early versions of the semijoin flattening logic.)
      
      Per report from Teodor Sigaev.  Back-patch to all supported branches.
      
      (cherry picked from commit 884c6384a2db34f6a65573e6bfd4b71dfba0de90)
    •
      Fix planner's handling of outer PlaceHolderVars within subqueries. · 45cbf64a
      Ekta Khanna committed
      commit 0a0ca1cb18a34e92ab549df171e174dcce7bf7a3
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Sat Mar 24 16:22:00 2012 -0400
      
          Fix planner's handling of outer PlaceHolderVars within subqueries.
      
          For some reason, in the original coding of the PlaceHolderVar mechanism
          I had supposed that PlaceHolderVars couldn't propagate into subqueries.
          That is of course entirely possible.  When it happens, we need to treat
          an outer-level PlaceHolderVar much like an outer Var or Aggref, that is
          SS_replace_correlation_vars() needs to replace the PlaceHolderVar with
          a Param, and then when building the finished SubPlan we have to provide
          the PlaceHolderVar expression as an actual parameter for the SubPlan.
          The handling of the contained expression is a bit delicate but it can be
          treated exactly like an Aggref's expression.
      
          In addition to the missing logic in subselect.c, prepjointree.c was failing
          to search subqueries for PlaceHolderVars that need their relids adjusted
          during subquery pullup.  It looks like everyplace else that touches
          PlaceHolderVars got it right, though.
      
          Per report from Mark Murawski.  In 9.1 and HEAD, queries affected by this
          oversight would fail with "ERROR: Upper-level PlaceHolderVar found where
          not expected".  But in 9.0 and 8.4, you'd silently get possibly-wrong
          answers, since the value transmitted into the subquery wouldn't go to null
          when it should.
    •
      Remove is_simple_subquery() check in simplify_EXISTS_query() · 77f804f5
      Shreedhar Hardikar committed
      GPDB handles a lot of the cases that are restricted by
      is_simple_subquery, and the restrictions that are not handled are checked
      for separately in convert_EXISTS_sublink_to_join().
      
      Resulting from cascading ICG failures, we also fixed the following:
      
      - initialize all the members of IncrementVarSublevelsUp_context
        properly.
      - remove incorrect assertions brought in from upstream. In GPDB, these
        cases are handled.
      - improve plans for NOT EXISTS sub-queries containing an aggregation
        without limits by creating a "false" one-time filter.
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
    •
      Remove dead code around JoinExpr::subqfromlist. · f16deabd
      Shreedhar Hardikar committed
      This was used to keep information about the subquery join tree for
      pulled-up sublinks for use later in deconstruct_recurse().  With the
      upstream subselect merge, a JoinExpr is constructed at pull-up time
      itself, so this is no longer needed since the subquery join tree
      information is available in the constructed JoinExpr.
      
      Also with the merge, deconstruct_recurse() handles JOIN_SEMI JoinExprs.
      However, since GPDB differs from upstream by treating SEMI joins as
      INNER join for internal join planning, this commit also updates
      inner_join_rels correctly for SEMI joins (see regression test).
      
      Also remove unused function declaration for not_null_inner_vars().
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
      f16deabd
    • S
      Do not commute inner/outer rels of JOIN_ANTI and JOIN_LASJ_NOTIN · 70be2285
      Shreedhar Hardikar committed
      This issue was discovered during the subselect merge, wherein the
      planner incorrectly commutes anti joins.
      `cdb_add_subquery_join_paths()` creates join paths for (rel1, rel2) and
      (rel2, rel1) for all join types including JOIN_ANTI and JOIN_LASJ_NOTIN.
      This produces wrong results since these joins are order-sensitive w.r.t
      inner and outer relations (see new regression tests). So, do not add
      (rel2, rel1) for JOIN_ANTI and JOIN_LASJ_NOTIN.
      
      This commit also refactors cdb_add_subquery_join_paths() and
      make_join_rel() to make it easier to control the commuting.
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
      70be2285
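To see why commit 70be2285 refuses to swap the inner and outer relations of an anti join, it may help to model the operation directly. The sketch below is an illustration only: `anti_join` is a hypothetical helper, the sample relations are invented, and NULL handling (which is what distinguishes JOIN_LASJ_NOTIN from a plain anti join) is deliberately ignored.

```python
def anti_join(outer, inner, key):
    """Return the rows of `outer` that have no match in `inner` on `key`."""
    inner_keys = {row[key] for row in inner}
    return [row for row in outer if row[key] not in inner_keys]


# Hypothetical sample relations.
rel1 = [{"id": 1}, {"id": 2}, {"id": 3}]
rel2 = [{"id": 2}, {"id": 4}]

# (rel1, rel2): rows of rel1 without a partner in rel2.
forward = anti_join(rel1, rel2, "id")
# (rel2, rel1): a different question, so a different answer.
swapped = anti_join(rel2, rel1, "id")

assert forward == [{"id": 1}, {"id": 3}]
assert swapped == [{"id": 4}]
assert forward != swapped
```

An inner join over the same inputs would produce the same matches in either order, which is why generating both (rel1, rel2) and (rel2, rel1) paths is safe there but not here.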
    • S
      Handle pending merge FIXMEs from merging e549722a · cc208986
      Shreedhar Hardikar committed
      1. convert_IN_to_antijoin() should fail pull-up when left relids are not
         a subset of available_rels, otherwise we get wrong results. See
         regression tests in qp_correlated_query.sql.
      2. convert_EXPR_to_join() is a GPDB-only function that already handles
         this case via ProcessSubqueryToJoin().
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
      cc208986
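The guard described in item 1 of commit cc208986 amounts to a subset test: the pull-up is attempted only when every relation referenced on the left-hand side is among the relations available at that point in the join tree. A rough sketch follows, with relids modeled as Python sets of integer range-table indexes; `can_pull_up` is a hypothetical name, and the real planner operates on its own relid sets rather than Python sets.

```python
def can_pull_up(left_relids, available_rels):
    """Allow pull-up only if every left-hand relation is available."""
    return set(left_relids) <= set(available_rels)


# Hypothetical relid sets for illustration.
assert can_pull_up({1, 2}, {1, 2, 3})      # all left rels are available
assert not can_pull_up({1, 4}, {1, 2, 3})  # rel 4 is out of scope: keep the sublink
```

When the test fails, the sublink is left in place, which costs plan quality but avoids the wrong results the commit message describes.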
    • S
      Partial cherry-pick of upstream commit 0dec322. · cdfc5616
      Shreedhar Hardikar committed
      commit 0dec3226ee905f94d0b9d6e2f274e13bbcaf5370
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Mon Jun 20 14:33:20 2011 -0400
      
          Fix thinko in previous patch for optimizing EXISTS-within-EXISTS.
      
          When recursing after an optimization in pull_up_sublinks_qual_recurse, the
          available_rels value passed down must include only the relations that are
          in the righthand side of the new SEMI or ANTI join; it's incorrect to pull
          up a sub-select that refers to other relations, as seen in the added test
          case.  Per report from BangarRaju Vadapalli.
      
      NOTE: The second part of the upstream commit is not pulled in because that
      produces inferior plans in GPDB by not pulling nested sublinks below NOT
      EXISTS. That part is reverted later upstream in 9.2 anyway.
      
      Also update regression tests.
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
      cdfc5616
    • T
      Fix pull_up_sublinks' failure to handle nested pull-up opportunities · 40082bd2
      Tom Lane committed
      commit f3f0f37068e06d01e88abbf3ed596664b139f7e2
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Mon May 2 15:56:47 2011 -0400
      
          Fix pull_up_sublinks' failure to handle nested pull-up opportunities.
      
          After finding an EXISTS or ANY sub-select that can be converted to a
          semi-join or anti-join, we should recurse into the body of the sub-select.
          This allows cases such as EXISTS-within-EXISTS to be optimized properly.
          The original coding would leave the lower sub-select as a SubLink, which
          is no better and often worse than what we can do with a join.  Per example
          from Wayne Conrad.
      
          Back-patch to 8.4.  There is a related issue in older versions' handling
          of pull_up_IN_clauses, but they're lame enough anyway about the whole area
          that it seems not worth the extra work to try to fix.
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
      40082bd2
    • T
      Fix mishandling of whole-row Vars referencing a view or sub-select · 385bb3cb
      Tom Lane committed
      commit c4ac2ff765d9b68a3ff2a3461804489721770d06
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Mon Jun 21 00:14:54 2010 +0000
      
          Fix mishandling of whole-row Vars referencing a view or sub-select.
          If such a Var appeared within a nested sub-select, we failed to translate it
          correctly during pullup of the view, because the recursive call to
          replace_rte_variables_mutator was looking for the wrong sublevels_up value.
          Bug was introduced during the addition of the PlaceHolderVar mechanism.
          Per bug #5514 from Marcos Castedo.
      385bb3cb
    • D
      Fix an oversight in convert_EXISTS_sublink_to_join · d3ff95a1
      Dhanashree Kashid committed
      commit dcd647d7cf98e3393f919135f6e113e896781f60
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Mon Jan 18 18:17:52 2010 +0000
      
          Fix an oversight in convert_EXISTS_sublink_to_join: we can't convert an
          EXISTS that contains a WITH clause.  This would usually lead to a
          "could not find CTE" error later in planning, because the WITH wouldn't
          get processed at all.  Noted while playing with an example from Ken Marshall.
      d3ff95a1
    • T
      Move exprType,exprTypmod,expression_tree_walker and related routines · e65f963b
      Tom Lane committed
        commit e5536e77
        Author: Tom Lane <tgl@sss.pgh.pa.us>
        Date:   Mon Aug 25 22:42:34 2008 +0000
      
            Move exprType(), exprTypmod(), expression_tree_walker(), and related routines
            into nodes/nodeFuncs, so as to reduce wanton cross-subsystem #includes inside
            the backend.  There's probably more that should be done along this line,
            but this is a start anyway
      Signed-off-by: Shreedhar Hardikar <shardikar@pivotal.io>
      e65f963b