1. 15 Mar 2016, 2 commits
    • Provide a planner hook at a suitable place for creating upper-rel Paths. · 5864d6a4
      Committed by Tom Lane
      In the initial revision of the upper-planner pathification work, the only
      available way for an FDW or custom-scan provider to inject Paths
      representing post-scan-join processing was to insert them during scan-level
      GetForeignPaths or similar processing.  While that's not impossible, it'd
      require quite a lot of duplicative processing to look forward and see if
      the extension would be capable of implementing the whole query.  To improve
      matters for custom-scan providers, provide a hook function at the point
      where the core code is about to start filling in upperrel Paths.  At this
      point Paths are available for the whole scan/join tree, which should reduce
      the amount of redundant effort considerably.
      
      (An alternative design that was suggested was to provide a separate hook
      for each post-scan-join processing step, but that seems messy and not
      clearly more useful.)
      
      Following our time-honored tradition, there's no documentation for this
      hook outside the source code.
      
      As-is, this hook is only meant for custom scan providers, which we can't
      assume very much about.  A followon patch will implement an FDW callback
      to let FDWs do the same thing in a somewhat more structured fashion.
    • Rethink representation of PathTargets. · 307c7885
      Committed by Tom Lane
      In commit 19a54114 I did not make PathTarget a subtype of Node,
      and embedded a RelOptInfo's reltarget directly into it rather than having
      a separately-allocated Node.  In hindsight that was misguided
      micro-optimization, enabled by the fact that at that point we didn't have
      any Paths with custom PathTargets.  Now that PathTarget processing has
      been fleshed out some more, it's easier to see that it's better to have
      PathTarget as an independent Node type, even if it does cost us one more
      palloc to create a RelOptInfo.  So change it while we still can.
      
      This commit just changes the representation, without doing anything more
      interesting than that.
  2. 12 Mar 2016, 1 commit
    • When appropriate, postpone SELECT output expressions till after ORDER BY. · 9118d03a
      Committed by Tom Lane
      It is frequently useful for volatile, set-returning, or expensive functions
      in a SELECT's targetlist to be postponed till after ORDER BY and LIMIT are
      done.  Otherwise, the functions might be executed for every row of the
      table despite the presence of LIMIT, and/or be executed in an unexpected
      order.  For example, in
      	SELECT x, nextval('seq') FROM tab ORDER BY x LIMIT 10;
      it's probably desirable that the nextval() values are ordered the same
      as x, and that nextval() is not run more than 10 times.
      
      In the past, Postgres was inconsistent in this area: you would get the
      desirable behavior if the ordering were performed via an indexscan, but
      not if it had to be done by an explicit sort step.  Getting the desired
      behavior reliably required contortions like
      	SELECT x, nextval('seq')
      	  FROM (SELECT x FROM tab ORDER BY x) ss LIMIT 10;
      
      This patch conditionally postpones evaluation of pure-output target
      expressions (that is, those that are not used as DISTINCT, ORDER BY, or
      GROUP BY columns) so that they effectively occur after sorting, even if an
      explicit sort step is necessary.  Volatile expressions and set-returning
      expressions are always postponed, so as to provide consistent semantics.
      Expensive expressions (costing more than 10 times typical operator cost,
      which by default would include any user-defined function) are postponed
      if there is a LIMIT or if there are expressions that must be postponed.
      
      We could be more aggressive and postpone any nontrivial expression, but
      there are costs associated with doing so: it requires an extra Result plan
      node which adds some overhead, and postponement changes the volume of data
      going through the sort step, perhaps for the worse.  Since we tend not to
      have very good estimates of the output width of nontrivial expressions,
      it's hard to have much confidence in our ability to predict whether
      postponement would increase or decrease the cost of the sort; therefore
      this patch doesn't attempt to make decisions conditionally on that.
      Between these factors and a general desire not to change query behavior
      when there's not a demonstrable benefit, it seems best to be conservative
      about applying postponement.  We might tweak the decision rules in the
      future, though.
      
      Konstantin Knizhnik, heavily rewritten by me
  3. 11 Mar 2016, 3 commits
    • Minor additional refactoring of planner.c's PathTarget handling. · 49635d7b
      Committed by Tom Lane
      Teach make_group_input_target() and make_window_input_target() to work
      entirely with the PathTarget representation of tlists, rather than
      constructing a tlist and immediately deconstructing it into PathTarget
      format.  In itself this only saves a few palloc's; the bigger picture is
      that it opens the door for sharing cost_qual_eval work across all of
      planner.c's constructions of PathTargets.  I'll come back to that later.
      
      In support of this, flesh out tlist.c's infrastructure for PathTargets
      a bit more.
    • Give pull_var_clause() reject/recurse/return behavior for WindowFuncs too. · c82c92b1
      Committed by Tom Lane
      All along, this function should have treated WindowFuncs in a manner
      similar to Aggrefs, ie with an option whether or not to recurse into them.
      By not considering the case, it was always recursing, which is OK for most
      callers (although I suspect that the case in prepare_sort_from_pathkeys
      might represent a bug).  But now we need return-without-recursing behavior
      as well.  There are also more than a few callers that should never see a
      WindowFunc, and now we'll get some error checking on that.
    • Refactor pull_var_clause's API to make it less tedious to extend. · 364a9f47
      Committed by Tom Lane
      In commit 1d97c19a and later c1d9579d, we extended
      pull_var_clause's API by adding enum-type arguments.  That's sort of a pain
      to maintain, though, because it means every time we add a new behavior we
      must touch every last one of the call sites, even if there's a reasonable
      default behavior that most of them could use.  Let's switch over to using a
      bitmask of flags, instead; that seems more maintainable and might save a
      nanosecond or two as well.  This commit changes no behavior in itself,
      though I'm going to follow it up with one that does add a new behavior.
      
      In passing, remove flatten_tlist(), which has not been used since 9.1
      and would otherwise need the same API changes.
      
      Removing these enums means that optimizer/tlist.h no longer needs to
      depend on optimizer/var.h.  Changing that caused a number of C files to
      need addition of #include "optimizer/var.h" (probably we can thank old
      runs of pgrminclude for that); but on balance it seems like a good change
      anyway.
  4. 09 Mar 2016, 3 commits
    • Improve handling of pathtargets in planner.c. · 51c0f63e
      Committed by Tom Lane
      Refactor so that the internal APIs in planner.c deal in PathTargets not
      targetlists, and establish a more regular structure for deriving the
      targets needed for successive steps.
      
      There is more that could be done here; calculating the eval costs of each
      successive target independently is both inefficient and wrong in detail,
      since we won't actually recompute values available from the input node's
      tlist.  But it's no worse than what happened before the pathification
      rewrite.  In any case this seems like a good starting point for considering
      how to handle Konstantin Knizhnik's function-evaluation-postponement patch.
    • Improve handling of group-column indexes in GroupingSetsPath. · 9e8b9942
      Committed by Tom Lane
      Instead of having planner.c compute a groupColIdx array and store it in
      GroupingSetsPaths, make create_groupingsets_plan() find the grouping
      columns by searching in the child plan node's tlist.  Although that's
      probably a bit slower for create_groupingsets_plan(), it's more like
      the way every other plan node type does this, and it provides positive
      confirmation that we know which child output columns we're supposed to be
      grouping on.  (Indeed, looking at this now, I'm not at all sure that it
      wasn't broken before, because create_groupingsets_plan() isn't demanding
      an exact tlist match from its child node.)  Also, this allows substantial
      simplification in planner.c, because it no longer needs to compute the
      groupColIdx array at all; no other cases were using it.
      
      I'd intended to put off this refactoring until later (like 9.7), but
      in view of the likely bug fix and the need to rationalize planner.c's
      tlist handling so we can do something sane with Konstantin Knizhnik's
      function-evaluation-postponement patch, I think it can't wait.
    • Finish refactoring make_foo() functions in createplan.c. · 8c314b98
      Committed by Tom Lane
      This patch removes some redundant cost calculations that I left for later
      cleanup in commit 3fc6e2d7.  There's now a uniform policy that the
      make_foo() convenience functions don't do any cost calculations.  Most of
      their callers copy costs from the source Path node, and for those that
      don't, the calculation in the make_foo() function wasn't necessarily right
      anyhow.  (make_result() was particularly a mess, as it was serving multiple
      callers using cost calcs designed for only the first one or two that had
      ever existed.)  Aside from saving a few cycles, this ensures that what
      EXPLAIN prints matches the costs we used for planning purposes.  It does
      not change any planner decisions, since the decisions are already made.
  5. 08 Mar 2016, 1 commit
    • Make the upper part of the planner work by generating and comparing Paths. · 3fc6e2d7
      Committed by Tom Lane
      I've been saying we needed to do this for more than five years, and here it
      finally is.  This patch removes the ever-growing tangle of spaghetti logic
      that grouping_planner() used to use to try to identify the best plan for
      post-scan/join query steps.  Now, there is (nearly) independent
      consideration of each execution step, and entirely separate construction of
      Paths to represent each of the possible ways to do that step.  We choose
      the best Path or set of Paths using the same add_path() logic that's been
      used inside query_planner() for years.
      
      In addition, this patch removes the old restriction that subquery_planner()
      could return only a single Plan.  It now returns a RelOptInfo containing a
      set of Paths, just as query_planner() does, and the parent query level can
      use each of those Paths as the basis of a SubqueryScanPath at its level.
      This allows finding some optimizations that we missed before, wherein a
      subquery was capable of returning presorted data and thereby avoiding a
      sort in the parent level, making the overall cost cheaper even though
      delivering sorted output was not the cheapest plan for the subquery in
      isolation.  (A couple of regression test outputs change in consequence of
      that.  However, there is very little change in visible planner behavior
      overall, because the point of this patch is not to get immediate planning
      benefits but to create the infrastructure for future improvements.)
      
      There is a great deal left to do here.  This patch unblocks a lot of
      planner work that was basically impractical in the old code structure,
      such as allowing FDWs to implement remote aggregation, or rewriting
      plan_set_operations() to allow consideration of multiple implementation
      orders for set operations.  (The latter will likely require a full
      rewrite of plan_set_operations(); what I've done here is only to fix it
      to return Paths not Plans.)  I have also left unfinished some localized
      refactoring in createplan.c and planner.c, because it was not necessary
      to get this patch to a working state.
      
      Thanks to Robert Haas, David Rowley, and Amit Kapila for review.
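
      One consequence, sketched in SQL terms (hypothetical table tab; the
      DISTINCT subquery, if implemented by sorting, produces presorted output
      that the parent level can now reuse, skipping its own sort even when
      that wasn't the subquery's cheapest plan in isolation):
      	SELECT * FROM (SELECT DISTINCT x FROM tab) ss ORDER BY x;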
  6. 19 Feb 2016, 1 commit
    • Add an explicit representation of the output targetlist to Paths. · 19a54114
      Committed by Tom Lane
      Up to now, there's been an assumption that all Paths for a given relation
      compute the same output column set (targetlist).  However, there are good
      reasons to remove that assumption.  For example, an indexscan on an
      expression index might be able to return the value of an expensive function
      "for free".  While we have the ability to generate such a plan today in
      simple cases, we don't have a way to model that it's cheaper than a plan
      that computes the function from scratch, nor a way to create such a plan
      in join cases (where the function computation would normally happen at
      the topmost join node).  Also, we need this so that we can have Paths
      representing post-scan/join steps, where the targetlist may well change
      from one step to the next.  Therefore, invent a "struct PathTarget"
      representing the columns we expect a plan step to emit.  It's convenient
      to include the output tuple width and tlist evaluation cost in this struct,
      and there will likely be additional fields in future.
      
      While Path nodes that actually do have custom outputs will need their own
      PathTargets, it will still be true that most Paths for a given relation
      will compute the same tlist.  To reduce the overhead added by this patch,
      keep a "default PathTarget" in RelOptInfo, and allow Paths that compute
      that column set to just point to their parent RelOptInfo's reltarget.
      (In the patch as committed, actually every Path is like that, since we
      do not yet have any cases of custom PathTargets.)
      
      I took this opportunity to provide some more-honest costing of
      PlaceHolderVar evaluation.  Up to now, the assumption that "scan/join
      reltargetlists have cost zero" was applied not only to Vars, where it's
      reasonable, but also PlaceHolderVars where it isn't.  Now, we add the eval
      cost of a PlaceHolderVar's expression to the first plan level where it can
      be computed, by including it in the PathTarget cost field and adding that
      to the cost estimates for Paths.  This isn't perfect yet but it's much
      better than before, and there is a way forward to improve it more.  This
      costing change affects the join order chosen for a couple of the regression
      tests, changing expected row ordering.
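
      The motivating example in SQL terms (a sketch; slow_fn is a hypothetical
      expensive IMMUTABLE function, tab a hypothetical table):
      	CREATE INDEX ON tab (slow_fn(x));
      	SELECT slow_fn(x) FROM tab ORDER BY slow_fn(x);
      An indexscan on the expression index can return slow_fn(x) "for free";
      PathTarget is the piece needed to model that this is cheaper than a plan
      that recomputes the function.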
  7. 12 Feb 2016, 2 commits
    • Remove GROUP BY columns that are functionally dependent on other columns. · d4c3a156
      Committed by Tom Lane
      If a GROUP BY clause includes all columns of a non-deferred primary key,
      as well as other columns of the same relation, those other columns are
      redundant and can be dropped from the grouping; the pkey is enough to
      ensure that each row of the table corresponds to a separate group.
      Getting rid of the excess columns will reduce the cost of the sorting or
      hashing needed to implement GROUP BY, and can indeed remove the need for
      a sort step altogether.
      
      This seems worth testing for since many query authors are not aware of
      the GROUP-BY-primary-key exception to the rule about queries not being
      allowed to reference non-grouped-by columns in their targetlists or
      HAVING clauses.  Thus, redundant GROUP BY items are not uncommon.  Also,
      we can make the test pretty cheap in most queries where it won't help
      by not looking up a rel's primary key until we've found that at least
      two of its columns are in GROUP BY.
      
      David Rowley, reviewed by Julien Rouhaud
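
      A minimal illustration (hypothetical table; since id is the primary key,
      name is functionally dependent on it and can be dropped from the grouping):
      	CREATE TABLE emp (id int PRIMARY KEY, name text, salary numeric);
      	SELECT id, name, sum(salary) FROM emp GROUP BY id, name;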
    • Fix typo in comment. · 2564be36
      Committed by Tom Lane
  8. 08 Feb 2016, 2 commits
    • Fix overeager pushdown of HAVING clauses when grouping sets are used. · a6897efa
      Committed by Andres Freund
      In 61444bfb we started to allow HAVING clauses to be fully pushed down
      into WHERE, even when grouping sets are in use. That turns out not to
      work correctly, because grouping sets can "produce" NULLs, meaning that
      filtering in WHERE and HAVING can have different results, even when no
      aggregates or volatile functions are involved.
      
      Instead only allow pushdown of empty grouping sets.
      
      It'd be nice to do better, but the exact mechanics of deciding which
      cases are safe are still being debated. It's important to give correct
      results till we find a good solution, and such a solution might not be
      appropriate for backpatching anyway.
      
      Bug: #13863
      Reported-By: 'wrb'
      Diagnosed-By: Dean Rasheed
      Author: Andrew Gierth
      Reviewed-By: Dean Rasheed and Andres Freund
      Discussion: 20160113183558.12989.56904@wrigleys.postgresql.org
      Backpatch: 9.5, where grouping sets were introduced
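
      A sketch of the hazard (hypothetical table tab):
      	SELECT a, b, count(*) FROM tab
      	  GROUP BY GROUPING SETS ((a), (b)) HAVING a IS NOT NULL;
      In the (b) grouping set, a is output as NULL, so the HAVING clause
      removes those result rows; the same clause evaluated in WHERE would
      merely filter input rows and leave the (b) rows visible.  The two
      placements give different answers even though no aggregates or volatile
      functions are involved.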
    • Introduce a new GUC force_parallel_mode for testing purposes. · 7c944bd9
      Committed by Robert Haas
      When force_parallel_mode = true, we enable the parallel mode restrictions
      for all queries for which this is believed to be safe.  For the subset of
      those queries believed to be safe to run entirely within a worker, we spin
      up a worker and run the query there instead of running it in the
      original process.  When force_parallel_mode = regress, make additional
      changes to allow the regression tests to run cleanly even though parallel
      workers have been injected under the hood.
      
      Taken together, this facilitates both better user testing and better
      regression testing of the parallelism code.
      
      Robert Haas, with help from Amit Kapila and Rushabh Lathia.
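
      Example usage for testing (the accepted settings are off, on, and regress):
      	SET force_parallel_mode = on;
      	EXPLAIN (COSTS OFF) SELECT 1;
      	-- when the query is judged parallel-safe, the plan is wrapped in a
      	-- Gather node and executed in a worker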
  9. 29 Jan 2016, 1 commit
    • Only try to push down foreign joins if the user mapping OIDs match. · fbe5a3fb
      Committed by Robert Haas
      Previously, the foreign join pushdown infrastructure left the question
      of security entirely up to individual FDWs, but it would be easy for
      a foreign data wrapper to inadvertently open up subtle security holes
      that way.  So, make it the core code's job to determine which user
      mapping OID is relevant, and don't attempt join pushdown unless it's
      the same for all relevant relations.
      
      Per a suggestion from Tom Lane.  Shigeru Hanada and Ashutosh Bapat,
      reviewed by Etsuro Fujita and KaiGai Kohei, with some further
      changes by me.
  10. 21 Jan 2016, 1 commit
    • Support multi-stage aggregation. · a7de3dc5
      Committed by Robert Haas
      Aggregate nodes now have two new modes: a "partial" mode where they
      output the unfinalized transition state, and a "finalize" mode where
      they accept unfinalized transition states rather than individual
      values as input.
      
      These new modes are not used anywhere yet, but they will be necessary
      for parallel aggregation.  The infrastructure also figures to be
      useful for cases where we want to aggregate local data and remote
      data via the FDW interface, and want to bring back partial aggregates
      from the remote side that can then be combined with locally generated
      partial aggregates to produce the final value.  It may also be useful
      even when neither FDWs nor parallelism are in play, as explained in
      the comments in nodeAgg.c.
      
      David Rowley and Simon Riggs, reviewed by KaiGai Kohei, Heikki
      Linnakangas, Haribabu Kommi, and me.
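
      Conceptually (a hypothetical illustration, not the executor's actual
      mechanism): for avg(x), workers would ship the unfinalized transition
      state (sum, count) rather than finished averages, because finished
      averages cannot be combined; the finalize step then amounts to
      	SELECT sum(s) / sum(n) FROM partial_states;  -- hypothetical table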
  11. 15 Jan 2016, 1 commit
    • Fix build_grouping_chain() to not clobber its input lists. · a923af38
      Committed by Tom Lane
      There's no good reason for stomping on the input data; it makes the logic
      in this function no simpler, in fact probably the reverse.  And it makes
      it impossible to separate path generation from plan generation, as I'm
      working towards doing; that will require more than one traversal of these
      lists.
  12. 11 Jan 2016, 1 commit
  13. 08 Jan 2016, 2 commits
    • Marginal cleanup of GROUPING SETS code in grouping_planner(). · a54676ac
      Committed by Tom Lane
      Improve comments and make it a shade less messy.  I think we might want
      to move all of this somewhere else later, but it needs to be more
      readable first.
      
      In passing, re-pgindent the file, affecting some recently-added comments
      concerning parallel query planning.
    • Delay creation of subplan tlist until after create_plan(). · c44d0138
      Committed by Tom Lane
      Once upon a time it was necessary for grouping_planner() to determine
      the tlist it wanted from the scan/join plan subtree before it called
      query_planner(), because query_planner() would actually make a Plan using
      that.  But we refactored things a long time ago to delay construction of
      the Plan tree till later, so there's no need to build that tlist until
      (and indeed unless) we're ready to plaster it onto the Plan.  The only
      thing query_planner() cares about is what Vars are going to be needed for
      the tlist, and it can perfectly well get that by looking at the real tlist
      rather than some masticated version.
      
      Well, actually, there is one minor glitch in that argument, which is that
      make_subplanTargetList also adds Vars appearing only in HAVING to the
      tlist it produces.  So now we have to account for HAVING explicitly in
      build_base_rel_tlists.  But that just adds a few lines of code, and
      I doubt it moves the needle much on processing time; we might be doing
      pull_var_clause() twice on the havingQual, but before we had it scanning
      dummy tlist entries instead.
      
      This is a very small down payment on rationalizing grouping_planner
      enough so it can be refactored.
  14. 03 Jan 2016, 1 commit
  15. 15 Dec 2015, 1 commit
    • Collect the global OR of hasRowSecurity flags for plancache · e5e11c8c
      Committed by Stephen Frost
      We carry around information about whether a given query has row security,
      so that the plancache can use that information to invalidate a planned
      query in the event that the environment changes.
      
      Previously, the flag of one of the subqueries was simply being copied
      into place to indicate if the query overall included RLS components.
      That's wrong, as we need the global OR of all subqueries.  Fix by
      changing the code to match how fireRIRules works, which results in
      OR'ing all of the flags.
      
      Noted by Tom.
      
      Back-patch to 9.5 where RLS was introduced.
  16. 12 Dec 2015, 1 commit
    • Get rid of the planner's LateralJoinInfo data structure. · 4fcf4845
      Committed by Tom Lane
      I originally modeled this data structure on SpecialJoinInfo, but after
      commit acfcd45c that looks like a pretty poor decision.
      All we really need is relid sets identifying laterally-referenced rels;
      and most of the time, what we want to know about includes indirect lateral
      references, a case the LateralJoinInfo data was unsuited to compute with
      any efficiency.  The previous commit redefined RelOptInfo.lateral_relids
      as the transitive closure of lateral references, so that it easily supports
      checking indirect references.  For the places where we really do want just
      direct references, add a new RelOptInfo field direct_lateral_relids, which
      is easily set up as a copy of lateral_relids before we perform the
      transitive closure calculation.  Then we can just drop lateral_info_list
      and LateralJoinInfo and the supporting code.  This makes the planner's
      handling of lateral references noticeably more efficient, and shorter too.
      
      Such a change can't be back-patched into stable branches for fear of
      breaking extensions that might be looking at the planner's data structures;
      but it seems not too late to push it into 9.5, so I've done so.
  17. 11 Nov 2015, 2 commits
    • Generate parallel sequential scan plans in simple cases. · 80558c1f
      Committed by Robert Haas
      Add a new flag, consider_parallel, to each RelOptInfo, indicating
      whether a plan for that relation could conceivably be run inside of
      a parallel worker.  Right now, we're pretty conservative: for example,
      it might be possible to defer applying a parallel-restricted qual
      in a worker, and later do it in the leader, but right now we just
      don't try to parallelize access to that relation.  That's probably
      the right decision in most cases, anyway.
      
      Using the new flag, generate parallel sequential scan plans for plain
      baserels, meaning that we now have parallel sequential scan in
      PostgreSQL.  The logic here is pretty unsophisticated right now: the
      costing model probably isn't right in detail, and we can't push joins
      beneath Gather nodes, so the number of plans that can actually benefit
      from this is pretty limited right now.  Lots more work is needed.
      Nevertheless, it seems time to enable this functionality so that all
      this code can actually be tested easily by users and developers.
      
      Note that, if you wish to test this functionality, it will be
      necessary to set max_parallel_degree to a value greater than the
      default of 0.  Once a few more loose ends have been tidied up here, we
      might want to consider changing the default value of this GUC, but
      I'm leaving it alone for now.
      
      Along the way, fix a bug in cost_gather: the previous coding thought
      that a Gather node's transfer overhead should be costed on the basis of
      the relation size rather than the number of tuples that actually need
      to be passed off to the leader.
      
      Patch by me, reviewed in earlier versions by Amit Kapila.
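
      To try it (hypothetical table big_tab; expect a Gather node above a
      Parallel Seq Scan when the planner chooses a parallel plan):
      	SET max_parallel_degree = 2;  -- the default of 0 disables parallel plans
      	EXPLAIN SELECT * FROM big_tab WHERE col = 42;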
    • Make sequential scans parallel-aware. · f0661c4e
      Committed by Robert Haas
      In addition, this patch fills in a number of missing bits and pieces in
      the parallel infrastructure.  Paths and plans now have a parallel_aware
      flag indicating whether whatever parallel-aware logic they have should
      be engaged.  It is believed that we will need this flag for a number of
      path/plan types, not just sequential scans, which is why the flag is
      generic rather than part of the SeqScan structures specifically.
      Also, execParallel.c now gives parallel nodes a chance to initialize
      their PlanState nodes from the DSM during parallel worker startup.
      
      Amit Kapila, with a fair amount of adjustment by me.  Review of previous
      patch versions by Haribabu Kommi and others.
  18. 16 Oct 2015, 1 commit
    • Prohibit parallel query when the isolation level is serializable. · a53c06a1
      Committed by Robert Haas
      In order for this to be safe, the code which handles true serializability
      will need to be taught that the SIRead locks taken by a parallel worker
      pertain to the same transaction as those taken by the parallel leader.
      Some further changes may be needed as well.  Until the necessary
      adaptations are made, don't generate parallel plans in serializable
      mode, and if a previously-generated parallel plan is used after
      serializable mode has been activated, run it serially.
      
      This fixes a bug in commit 7aea8e4f.
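
      Illustration (hypothetical table big_tab, with parallel plans otherwise
      enabled):
      	BEGIN ISOLATION LEVEL SERIALIZABLE;
      	EXPLAIN SELECT * FROM big_tab WHERE col = 42;  -- no Gather node here
      	COMMIT;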
  19. 29 Sep 2015, 1 commit
    • Parallel executor support. · d1b7c1ff
      Committed by Robert Haas
      This code provides infrastructure for a parallel leader to start up
      parallel workers to execute subtrees of the plan tree being executed
      in the master.  User-supplied parameters from ParamListInfo are passed
      down, but PARAM_EXEC parameters are not.  Various other constructs,
      such as initplans, subplans, and CTEs, are also not currently shared.
      Nevertheless, there's enough here to support a basic implementation of
      parallel query, and we can lift some of the current restrictions as
      needed.
      
      Amit Kapila and Robert Haas
  20. 17 Sep 2015, 1 commit
    • Determine whether it's safe to attempt a parallel plan for a query. · 7aea8e4f
      Committed by Robert Haas
      Commit 924bcf4f introduced a framework
      for parallel computation in PostgreSQL that makes most but not all
      built-in functions safe to execute in parallel mode.  In order to have
      parallel query, we'll need to be able to determine whether that query
      contains functions (either built-in or user-defined) that cannot be
      safely executed in parallel mode.  This requires those functions to be
      labeled, so this patch introduces an infrastructure for that.  Some
      functions currently labeled as safe may need to be revised depending on
      how pending issues related to heavyweight locking under parallelism
      are resolved.
      
      Parallel plans can't be used except for the case where the query will
      run to completion.  If portal execution were suspended, the parallel
      mode restrictions would need to remain in effect during that time, but
      that might make other queries fail.  Therefore, this patch introduces
      a framework that enables consideration of parallel plans only when it
      is known that the plan will be run to completion.  This probably needs
      some refinement; for example, at bind time, we do not know whether a
      query run via the extended protocol will be executed to completion or
      run with a limited fetch count.  Having the client indicate its
      intentions at bind time would constitute a wire protocol break.  Some
      contexts in which parallel mode would be safe are not adjusted by this
      patch; the default is not to try parallel plans except from call sites
      that have been updated to say that such plans are OK.
      
      This commit doesn't introduce any parallel paths or plans; it just
      provides a way to determine whether they could potentially be used.
      I'm committing it on the theory that the remaining parallel sequential
      scan patches will also get committed to this release, hopefully in the
      not-too-distant future.
      
      Robert Haas and Amit Kapila.  Reviewed (in earlier versions) by Noah
      Misch.
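
      A sketch of the labeling, assuming the PARALLEL SAFE / RESTRICTED /
      UNSAFE markings this infrastructure provides for CREATE FUNCTION:
      	CREATE FUNCTION square(int) RETURNS int
      	  LANGUAGE sql PARALLEL SAFE
      	  AS 'SELECT $1 * $1';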
  21. 12 Aug 2015, 1 commit
    • Postpone extParam/allParam calculations until the very end of planning. · 68fa28f7
      Committed by Tom Lane
      Until now we computed these Param ID sets at the end of subquery_planner,
      but that approach depends on subquery_planner returning a concrete Plan
      tree.  We would like to switch over to returning one or more Paths for a
      subquery, and in that representation the necessary details aren't fully
      fleshed out (not to mention that we don't really want to do this work for
      Paths that end up getting discarded).  Hence, refactor so that we can
      compute the param ID sets at the end of planning, just before
      set_plan_references is run.
      
      The main change necessary to make this work is that we need to capture
      the set of outer-level Param IDs available to the current query level
      before exiting subquery_planner, since the outer levels' plan_params lists
      are transient.  (That's not going to pose a problem for returning Paths,
      since all the work involved in producing that data is part of expression
      preprocessing, which will continue to happen before Paths are produced.)
      On the plus side, this change gets rid of several existing kluges.
      
      Eventually I'd like to get rid of SS_finalize_plan altogether in favor of
      doing this work during set_plan_references, but that will require some
      complex rejiggering because SS_finalize_plan needs to visit subplans and
      initplans before the main plan.  So leave that idea for another day.
  22. 31 Jul 2015, 1 commit
    • Avoid some zero-divide hazards in the planner. · 8693ebe3
      Committed by Tom Lane
      Although I think on all modern machines floating division by zero
      results in Infinity not SIGFPE, we still don't want infinities
      running around in the planner's costing estimates; too much risk
      of that leading to insane behavior.
      
      grouping_planner() failed to consider the possibility that final_rel
      might be known dummy and hence have zero rowcount.  (I wonder if it
      would be better to set a rows estimate of 1 for dummy relations?
      But at least in the back branches, changing this convention seems
      like a bad idea, so I'll leave that for another day.)
      
      Make certain that get_variable_numdistinct() produces a nonzero result.
      The case that can be shown to be broken is with stadistinct < 0.0 and
      small ntuples; we did not prevent the result from rounding to zero.
      For good luck I applied clamp_row_est() to all the nonconstant return
      values.
      
      In ExecChooseHashTableSize(), Assert that we compute positive nbuckets
      and nbatch.  I know of no reason to think this isn't the case, but it
      seems like a good safety check.
      
      Per reports from Piotr Stefaniak.  Back-patch to all active branches.
  23. 26 Jul 2015, 3 commits
    • Allow to push down clauses from HAVING to WHERE when grouping sets are used. · 61444bfb
      Committed by Andres Freund
      Previously we disallowed pushing down quals to WHERE in the presence of
      grouping sets. That's overly restrictive.
      
      We now instead copy quals to WHERE if applicable, leaving the
      one in HAVING in place. That's because, at that stage of the planning
      process, it's nontrivial to determine if it's safe to remove the one in
      HAVING.
      
      Author: Andrew Gierth
      Discussion: 874mkt3l59.fsf@news-spur.riddles.org.uk
      Backpatch: 9.5, where grouping sets were introduced. This isn't exactly
          a bugfix, but it seems better to keep the branches in sync at this point.
    • Build column mapping for grouping sets in all required cases. · 144666f6
      Committed by Andres Freund
      The previous coding frequently failed to fail because for one it's
      unusual to have rollup clauses with one column, and for another
      sometimes the wrong mapping didn't cause obvious problems.
      
      Author: Jeevan Chalke
      Reviewed-By: Andrew Gierth
      Discussion: CAM2+6=W=9=hQOipH0HAPbkun3Z3TFWij_EiHue0_6UX=oR=1kw@mail.gmail.com
      Backpatch: 9.5, where grouping sets were introduced
    • Redesign tablesample method API, and do extensive code review. · dd7a8f66
      Committed by Tom Lane
      The original implementation of TABLESAMPLE modeled the tablesample method
      API on index access methods, which wasn't a good choice because, without
      specialized DDL commands, there's no way to build an extension that can
      implement a TSM.  (Raw inserts into system catalogs are not an acceptable
      thing to do, because we can't undo them during DROP EXTENSION, nor will
      pg_upgrade behave sanely.)  Instead adopt an API more like procedural
      language handlers or foreign data wrappers, wherein the only SQL-level
      support object needed is a single handler function identified by having
      a special return type.  This lets us get rid of the supporting catalog
      altogether, so that no custom DDL support is needed for the feature.
      
      Adjust the API so that it can support non-constant tablesample arguments
      (the original coding assumed we could evaluate the argument expressions at
      ExecInitSampleScan time, which is undesirable even if it weren't outright
      unsafe), and discourage sampling methods from looking at invisible tuples.
      Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable
      within and across queries, as required by the SQL standard, and deal more
      honestly with methods that can't support that requirement.
      
      Make a full code-review pass over the tablesample additions, and fix
      assorted bugs, omissions, infelicities, and cosmetic issues (such as
      failure to put the added code stanzas in a consistent ordering).
      Improve EXPLAIN's output of tablesample plans, too.
      
      Back-patch to 9.5 so that we don't have to support the original API
      in production.
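
      With this redesign, REPEATABLE is honored within and across queries
      (hypothetical table tab):
      	SELECT count(*) FROM tab TABLESAMPLE BERNOULLI (10) REPEATABLE (42);
      	-- rerunning the same query yields the same sample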
  24. 23 Jun 2015, 1 commit
    • Improve inheritance_planner()'s performance for large inheritance sets. · 2cb9ec1b
      Committed by Tom Lane
      Commit c03ad560 introduced a planner
      performance regression for UPDATE/DELETE on large inheritance sets.
      It required copying the append_rel_list (which is of size proportional to
      the number of inherited tables) once for each inherited table, thus
      resulting in O(N^2) time and memory consumption.  While it's difficult to
      avoid that in general, the extra work only has to be done for
      append_rel_list entries that actually reference subquery RTEs, which
      inheritance-set entries will not.  So we can buy back essentially all of
      the loss in cases without subqueries in FROM; and even for those, the added
      work is mainly proportional to the number of UNION ALL subqueries.
      
      Back-patch to 9.2, like the previous commit.
      
      Tom Lane and Dean Rasheed, per a complaint from Thomas Munro.
  25. 25 May 2015, 1 commit
    • Manual cleanup of pgindent results. · 2aa0476d
      Committed by Tom Lane
      Fix some places where pgindent did silly stuff, often because project
      style wasn't followed to begin with.  (I've not touched the atomics
      headers, though.)
  26. 24 May 2015, 1 commit
  27. 23 May 2015, 1 commit
    • Remove the new UPSERT command tag and use INSERT instead. · 631d7490
      Committed by Andres Freund
      Previously, INSERT with ON CONFLICT DO UPDATE specified used a new
      command tag -- UPSERT.  It was introduced out of concern that INSERT as
      a command tag would be a misrepresentation for ON CONFLICT DO UPDATE, as
      some affected rows may actually have been updated.
      
      Alvaro Herrera noticed that the implementation of that new command tag
      was incomplete; in subsequent discussion we concluded that having it
      doesn't provide benefits that are in line with the compatibility breaks
      it requires.
      
      Catversion bump due to the removal of PlannedStmt->isUpsert.
      
      Author: Peter Geoghegan
      Discussion: 20150520215816.GI5885@postgresql.org
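
      Example (hypothetical table tab with unique column id; the reported
      command tag is now e.g. INSERT 0 1 even when the row was updated):
      	INSERT INTO tab (id, val) VALUES (1, 'x')
      	  ON CONFLICT (id) DO UPDATE SET val = EXCLUDED.val;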
  28. 16 May 2015, 2 commits
    • Support GROUPING SETS, CUBE and ROLLUP. · f3d31185
      Committed by Andres Freund
      This SQL-standard functionality allows data to be aggregated by several
      different GROUP BY clauses at once.  Each grouping set returns rows in
      which the columns grouped by only in other sets are set to NULL.
      
      This could previously be achieved by doing each grouping as a separate
      query, conjoined by UNION ALLs. Besides being considerably more concise,
      grouping sets will in many cases be faster, requiring only one scan over
      the underlying data.
      
      The current implementation of grouping sets only supports using sorting
      for input. Individual sets that share a sort order are computed in one
      pass. If there are sets that don't share a sort order, additional sort &
      aggregation steps are performed. These additional passes are sourced by
      the previous sort step; thus avoiding repeated scans of the source data.
      
      The code is structured in a way that adding support for purely using
      hash aggregation or a mix of hashing and sorting is possible. Sorting
      was chosen to be supported first, as it is the most generic method of
      implementation.
      
      Instead of representing the chain of sort and aggregation steps as
      full-blown planner and executor nodes, as earlier versions of the patch
      did, all but the first sort are performed inside the aggregation node
      itself. This avoids the need to do some unusual gymnastics to handle
      having to return aggregated and non-aggregated tuples from underlying
      nodes, as well as having to shut down underlying nodes early to limit
      memory usage.  The optimizer still builds Sort/Agg nodes to describe each
      phase, but they're not part of the plan tree; instead they are additional
      data for the aggregation node.  They're a convenient and preexisting way
      to describe aggregation and sorting.  The first (and possibly only) sort
      step is still performed as a separate execution step.  That retains
      similarity with existing GROUP BY plans, makes rescans fairly simple,
      avoids very deep plans (leading to slow EXPLAINs), and makes it easy to
      skip the sorting step if the underlying data is sorted by other means.
      
      A somewhat ugly side of this patch is having to deal with a grammar
      ambiguity between the new CUBE keyword and the cube extension/functions
      named cube (and rollup). To avoid breaking existing deployments of the
      cube extension it has not been renamed, neither has cube been made a
      reserved keyword. Instead precedence hacking is used to make GROUP BY
      cube(..) refer to the CUBE grouping sets feature, and not the function
      cube(). To actually group by a function cube(), unlikely as that might
      be, the function name has to be quoted.
      
      Needs a catversion bump because stored rules may change.
      
      Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund
      Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas
          Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule
      Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
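
      Example (hypothetical table items_sold; a single scan replaces a UNION
      ALL of three separate GROUP BY queries):
      	SELECT brand, size, sum(sales) FROM items_sold
      	  GROUP BY GROUPING SETS ((brand), (size), ());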
    • TABLESAMPLE, SQL Standard and extensible · f6d208d6
      Committed by Simon Riggs
      Add a TABLESAMPLE clause to SELECT statements that allows the
      user to specify random BERNOULLI sampling or block-level
      SYSTEM sampling.  The implementation allows extensible
      sampling functions to be written, using a standard API.
      The basic version follows the SQL standard exactly.  Usable
      concrete use cases for the sampling API follow in later
      commits.
      
      Petr Jelinek
      
      Reviewed by Michael Paquier and Simon Riggs
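
      Example usage (hypothetical table big_tab):
      	SELECT * FROM big_tab TABLESAMPLE SYSTEM (10);     -- ~10% of blocks
      	SELECT * FROM big_tab TABLESAMPLE BERNOULLI (10);  -- ~10% of rows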