1. 04 Aug, 2008 1 commit
  2. 03 Aug, 2008 1 commit
    • Rearrange the querytree representation of ORDER BY/GROUP BY/DISTINCT items · 95113047
      Tom Lane committed
      as per my recent proposal:
      
      1. Fold SortClause and GroupClause into a single node type SortGroupClause.
      We were already relying on them to be struct-equivalent, so using two node
      tags wasn't accomplishing much except to get in the way of comparing items
      with equal().
      
      2. Add an "eqop" field to SortGroupClause to carry the associated equality
      operator.  This is cheap for the parser to get at the same time it's looking
      up the sort operator, and storing it eliminates the need for repeated
      not-so-cheap lookups during planning.  In future this will also let us
      represent GROUP/DISTINCT operations on datatypes that have hash opclasses
      but no btree opclasses (ie, they have equality but no natural sort order).
      The previous representation simply didn't work for that, since its only
      indicator of comparison semantics was a sort operator.
      
      3. Add a hasDistinctOn boolean to struct Query to explicitly record whether
      the distinctClause came from DISTINCT or DISTINCT ON.  This allows removing
      some complicated and not 100% bulletproof code that attempted to figure
      that out from the distinctClause alone.
      
      This patch doesn't in itself create any new capability, but it's necessary
      infrastructure for future attempts to use hash-based grouping for DISTINCT
      and UNION/INTERSECT/EXCEPT.
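Illustrative sketch of item 1 and item 2 above, in Python rather than the project's C. The field names follow the commit message where it gives them (`eqop`); `tle_sort_group_ref`, `sortop`, `nulls_first`, and the operator numbers are assumptions made for the example, not a copy of the real struct. It shows why a single node type makes ORDER BY and GROUP BY items directly comparable, and how an equality-only datatype could be represented once `eqop` is stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SortGroupClause:
    """Stand-in for the merged node type; not the real C struct."""
    tle_sort_group_ref: int    # assumed: reference into the target list
    eqop: int                  # equality operator carried with the item
    sortop: int                # ordering operator, 0 when none exists
    nulls_first: bool = False  # assumed flag


# Placeholder operator identifiers, hypothetical values for illustration.
EQ_OP, LT_OP = 1001, 1002

order_by_item = SortGroupClause(1, EQ_OP, LT_OP)
group_by_item = SortGroupClause(1, EQ_OP, LT_OP)

# One node type means a plain structural comparison is enough.
items_match = order_by_item == group_by_item

# A type with equality but no natural sort order: only eqop is meaningful.
hash_only_item = SortGroupClause(2, EQ_OP, 0)
```

With two separate node tags, the same comparison would first have to look past the tag difference, which is the obstacle the commit message describes.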
  3. 01 Aug, 2008 1 commit
    • Fix parser so that we don't modify the user-written ORDER BY list in order · 63247bec
      Tom Lane committed
      to represent DISTINCT or DISTINCT ON.  This gets rid of a longstanding
      annoyance that a view or rule using SELECT DISTINCT will be dumped out
      with an overspecified ORDER BY list, and is one small step along the way
      to decoupling DISTINCT and ORDER BY enough so that hash-based implementation
      of DISTINCT will be possible.  In passing, improve transformDistinctClause
      so that it doesn't reject duplicate DISTINCT ON items, as was reported by
      Steve Midgley a couple weeks ago.
  4. 10 Jul, 2008 2 commits
    • Tighten up SS_finalize_plan's computation of valid_params to exclude Params of · eaf1b5d3
      Tom Lane committed
      the current query level that aren't in fact output parameters of the current
      initPlans.  (This means, for example, output parameters of regular subplans.)
      To make this work correctly for output parameters coming from sibling
      initplans requires rejiggering the API of SS_finalize_plan just a bit:
      we need the siblings to be visible to it, rather than hidden as
      SS_make_initplan_from_plan had been doing.  This is really part of my response
      to bug #4290, but I concluded this part probably shouldn't be back-patched,
      since all that it's doing is to make a debugging cross-check tighter.
    • Fix mis-calculation of extParam/allParam sets for plan nodes, as seen in · 772a6d45
      Tom Lane committed
      bug #4290.  The fundamental bug is that masking extParam by outer_params,
      as finalize_plan had been doing, caused us to lose the information that
      an initPlan depended on the output of a sibling initPlan.  On reflection
      the best thing to do seemed to be not to try to adjust outer_params for
      this case but get rid of it entirely.  The only thing it was really doing
      for us was to filter out param IDs associated with SubPlan nodes, and that
      can be done (with greater accuracy) while processing individual SubPlan
      nodes in finalize_primnode.  This approach was vindicated by the discovery
      that the masking method was hiding a second bug: SS_finalize_plan failed to
      remove extParam bits for initPlan output params that were referenced in the
      main plan tree (it only got rid of those referenced by other initPlans).
      It's not clear that this caused any real problems, given the limited use
      of extParam by the executor, but it's certainly not what was intended.
      
      I originally thought that there was also a problem with needing to include
      indirect dependencies on external params in initPlans' param sets, but it
      turns out that the executor handles this correctly so long as the depended-on
      initPlan is earlier in the initPlans list than the one using its output.
      That seems a bit of a fragile assumption, but it is true at the moment,
      so I just documented it in some code comments rather than making what would
      be rather invasive changes to remove the assumption.
      
      Back-patch to 8.1.  Previous versions don't have the case of initPlans
      referring to other initPlans' outputs, so while the existing logic is still
      questionable for them, there are not any known bugs to be fixed.  So I'll
      refrain from changing them for now.
  5. 28 Jun, 2008 1 commit
    • Consider a clause to be outerjoin_delayed if it references the nullable side · dcc23347
      Tom Lane committed
      of any lower outer join, even if it also references the non-nullable side and
      so could not get pushed below the outer join anyway.  We need this in case
      the clause is an OR clause: if it doesn't get marked outerjoin_delayed,
      create_or_index_quals() could pull an indexable restriction for the nullable
      side out of it, leading to wrong results as demonstrated by today's bug
      report from toruvinn.  (See added regression test case for an example.)
      
      In principle this has been wrong for quite a while.  In practice I don't
      think any branch before 8.3 can really show the failure, because
      create_or_index_quals() will only pull out indexable conditions, and before
      8.3 those were always strict.  So though we might have improperly generated
      null-extended rows in the outer join, they'd get discarded from the result
      anyway.  The gating factor that makes the failure visible is that 8.3
      considers "col IS NULL" to be indexable.  Hence I'm not going to risk
      back-patching further than 8.3.
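A small simulation of the failure mode described in this commit, written in Python with made-up tables `a` and `b` (this is not the regression test added by the commit). The query modelled is `a LEFT JOIN b ON a.x = b.y WHERE (b.col IS NULL AND a.p = 1) OR (b.col = 5 AND a.p = 2)`. Extracting `b.col IS NULL OR b.col = 5` from the OR clause and applying it to the scan of `b` removes the matching row, so the outer row becomes null-extended and then wrongly passes the WHERE clause.

```python
def left_join(outer, inner, on):
    """Nested-loop LEFT JOIN: unmatched outer rows pair with None."""
    result = []
    for o in outer:
        matches = [i for i in inner if on(o, i)]
        if matches:
            result.extend((o, i) for i in matches)
        else:
            result.append((o, None))  # null-extended row
    return result


def join_cond(o, i):
    return o["x"] == i["y"]


def where(o, i):
    # (b.col IS NULL AND a.p = 1) OR (b.col = 5 AND a.p = 2)
    col = None if i is None else i["col"]
    return (col is None and o["p"] == 1) or (col == 5 and o["p"] == 2)


a = [{"p": 1, "x": 10}]
b = [{"y": 10, "col": 7}]

# Correct: join first, then apply the whole OR clause above the join.
correct = [row for row in left_join(a, b, join_cond) if where(*row)]

# Incorrect: apply the extracted restriction to b's scan below the join.
b_restricted = [i for i in b if i["col"] is None or i["col"] == 5]
wrong = [row for row in left_join(a, b_restricted, join_cond) if where(*row)]
```

Here `correct` is empty while `wrong` contains one spurious row, which matches the commit's point that a non-strict condition such as `col IS NULL` is what makes the problem visible.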
  6. 27 Jun, 2008 1 commit
    • Improve planner's estimation of the size of an append relation: rather than · 2c2161a4
      Tom Lane committed
      taking the maximum of any child rel's width, we should weight the widths
      proportionally to the number of rows expected from each child.  In hindsight
      this is obviously correct because row width is really a proxy for the total
      physical size of the relation.  Per discussion with Scott Carey (bug #4264).
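The arithmetic behind this change can be sketched as follows. The function name and the child statistics are hypothetical; the point is only the difference between taking the maximum child width and weighting each width by its expected row count.

```python
def append_rel_width(children):
    """Row-weighted average width over (rows, width) pairs."""
    total_rows = sum(rows for rows, _ in children)
    if total_rows <= 0:
        return 0.0
    return sum(rows * width for rows, width in children) / total_rows


# Hypothetical children: one large narrow table, one tiny wide table.
children = [(1_000_000, 40), (10, 2000)]

max_width = max(width for _, width in children)  # old approach
weighted_width = append_rel_width(children)      # approach described above
```

With these numbers the maximum is 2000 bytes, while the weighted figure stays just above 40, which is a far better proxy for the total physical size of the appended relation.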
  7. 19 Jun, 2008 1 commit
  8. 17 Jun, 2008 1 commit
    • Fix the code that adds regclass constants to a plan's list of relation OIDs · 2e835a49
      Tom Lane committed
      that it depends on for replan-forcing purposes.  We need to consider plain OID
      constants too, because eval_const_expressions folds a RelabelType atop a Const
      to just a Const.  This change could result in OID values that aren't really
      for tables getting added to the dependency list, but the worst-case
      consequence would be occasional useless replans.  Per report from Gabriele
      Messineo.
  9. 03 May, 2008 1 commit
  10. 22 Apr, 2008 1 commit
    • Fix convert_IN_to_join to properly handle the case where the subselect's · ff673f55
      Tom Lane committed
      output is not of the same type that's needed for the IN comparison (ie,
      where the parser inserted an implicit coercion above the subselect result).
      We should record the coerced expression, not just a raw Var referencing
      the subselect output, as the quantity that needs to be unique-ified if
      we choose to implement the IN as Unique followed by a plain join.
      
      As of 8.3 this error was causing crashes, as seen in bug #4113 from Javier
      Hernandez, because the executor was being told to hash or sort the raw
      subselect output column using operators appropriate to the coerced type.
      
      In prior versions there was no crash because the executor chose the
      hash or sort operators for itself based on the column type it saw.
      However, that's still not really right, because what's unique for one data
      type might not be unique for another.  In corner cases we could get multiple
      outputs of a row that should appear only once, as demonstrated by the
      regression test case included in this commit.
      
      However, this patch doesn't apply cleanly to 8.2 or before, and the code
      involved has shifted enough over time that I'm hesitant to try to back-patch.
      Given the lack of complaints from the field about such corner cases, I think
      the bug may not be important enough to risk breaking other things with a
      back-patch.
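The remark that "what's unique for one data type might not be unique for another" can be illustrated with a short Python sketch. The data and the use of `int()` as a stand-in for the parser's implicit coercion are assumptions for the example, not the regression test from the commit.

```python
def in_as_unique_then_join(outer_rows, unique_sub_values, coerce):
    """Implement IN as a plain join against an already unique-ified input."""
    return [o for o in outer_rows
            for s in unique_sub_values
            if o == coerce(s)]


coerce = int             # stand-in for the implicit coercion
outer_rows = [1]
sub_output = [1.2, 1.4]  # distinct as raw values, equal once coerced

# Unique-ifying the raw column leaves two rows that coerce to the same value.
unique_raw = sorted(set(sub_output))
wrong = in_as_unique_then_join(outer_rows, unique_raw, coerce)

# Unique-ifying the coerced expression gives each outer row at most one match.
unique_coerced = sorted({coerce(s) for s in sub_output})
right = in_as_unique_then_join(outer_rows, unique_coerced, lambda v: v)
```

The first variant returns the outer row twice, the second returns it once, which is the behaviour IN is supposed to have.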
  11. 21 Apr, 2008 1 commit
    • Allow float8, int8, and related datatypes to be passed by value on machines · 8472bf7a
      Tom Lane committed
      where Datum is 8 bytes wide.  Since this will break old-style C functions
      (those still using version 0 calling convention) that have arguments or
      results of these types, provide a configure option to disable it and retain
      the old pass-by-reference behavior.  Likewise, provide a configure option
      to disable the recently-committed float4 pass-by-value change.
      
      Zoltan Boszormenyi, plus configurability stuff by me.
  12. 18 Apr, 2008 1 commit
    • Fix a couple of oversights associated with the "physical tlist" optimization: · 25e46a50
      Tom Lane committed
      we had several code paths where a physical tlist could be used for the input
      to a Sort node, which is a dumb idea because any unneeded table columns will
      increase the volume of data the sort has to push around.
      
      (Unfortunately the easy-looking fix of calling disuse_physical_tlist during
      make_sort_xxx doesn't work because in most cases we're already committed to
      the current input tlist --- it's been marked with sort column numbers, or
      we've built grouping column numbers using it, etc.  The tlist has to be
      selected properly at the calling level before we start constructing sort-col
      information.  This is easy enough to do, we were just failing to take the
      point into consideration.)
      
      Back-patch to 8.3.  I believe the problem probably exists clear back to 7.4
      when the physical tlist optimization was added, but I'm afraid to back-patch
      further than 8.3 without a great deal more study than I want to put into it.
      The code in this area has drifted a lot over time.  The real-world importance
      of these code paths is uncertain anyway --- I think in many cases we'd
      probably prefer hash-based methods.
  13. 14 Apr, 2008 2 commits
    • Since createplan.c no longer cares whether index operators are lossy, it has · 226837e5
      Tom Lane committed
      no particular need to do get_op_opfamily_properties() while building an
      indexscan plan.  Postpone that lookup until executor start.  This simplifies
      createplan.c a lot more than it complicates nodeIndexscan.c, and makes things
      more uniform since we already had to do it that way for RowCompare
      expressions.  Should be a bit faster too, at least for plans that aren't
      re-used many times, since we avoid palloc'ing and perhaps copying the
      intermediate list data structure.
    • Phase 2 of project to make index operator lossiness be determined at runtime · 24558da1
      Tom Lane committed
      instead of plan time.  Extend the amgettuple API so that the index AM returns
      a boolean indicating whether the indexquals need to be rechecked, and make
      that rechecking happen in nodeIndexscan.c (currently the only place where
      it's expected to be needed; other callers of index_getnext are just erroring
      out for now).  For the moment, GIN and GIST have stub logic that just always
      sets the recheck flag to TRUE --- I'm hoping to get Teodor to handle pushing
      that control down to the opclass consistent() functions.  The planner no
      longer pays any attention to amopreqcheck, and that catalog column will go
      away in due course.
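A rough Python model of the runtime recheck idea, under stated assumptions: the two scan functions and the scan node are hypothetical and do not mirror the actual amgettuple signature. Each scan yields a candidate together with a boolean saying whether the qual must be rechecked, and the scan node re-evaluates the qual only when asked to.

```python
def lossy_index_scan(rows, key):
    """Hypothetical lossy index: matches on key // 10 only, so it asks for a recheck."""
    for row in rows:
        if row // 10 == key // 10:
            yield row, True


def exact_index_scan(rows, key):
    """Hypothetical exact index: its matches never need a recheck."""
    for row in rows:
        if row == key:
            yield row, False


def index_scan_node(scan, qual):
    """Re-evaluate the qual only for candidates flagged for recheck."""
    return [row for row, recheck in scan if not recheck or qual(row)]


rows = [40, 42, 47, 42, 50]
key = 42


def qual(row):
    return row == key


from_lossy = index_scan_node(lossy_index_scan(rows, key), qual)
from_exact = index_scan_node(exact_index_scan(rows, key), qual)
```

Both paths return the same rows; the difference is that the decision to recheck is made per returned tuple at run time rather than fixed when the plan is built. An access method that always sets the flag, like the stub logic mentioned above, is correct but does redundant work.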
  14. 01 Apr, 2008 2 commits
    • Fix an oversight I made in a cleanup patch over a year ago: · 6b73d7e5
      Tom Lane committed
      eval_const_expressions needs to be passed the PlannerInfo ("root") structure,
      because in some cases we want it to substitute values for Param nodes.
      (So "constant" is not so constant as all that ...)  This mistake partially
      disabled optimization of unnamed extended-Query statements in 8.3: in
      particular the LIKE-to-indexscan optimization would never be applied if the
      LIKE pattern was passed as a parameter, and constraint exclusion depending
      on a parameter value didn't work either.
    • Apply my original fix for Taiki Yamaguchi's bug report about DISTINCT MAX(). · d3441155
      Tom Lane committed
      Add some regression tests for plausible failures in this area.
  15. 29 Mar, 2008 1 commit
  16. 28 Mar, 2008 2 commits
    • Department of second thoughts: the rule that ORDER BY and DISTINCT are · 2e4094da
      Tom Lane committed
      useless for an ungrouped-aggregate query holds regardless of whether
      optimize_minmax_aggregates succeeds.  So we might as well apply the
      optimization in any case.
      
      I'll leave 8.3 as it was, since this version is a tad more invasive
      than my earlier patch.
    • When we have successfully optimized a MIN or MAX aggregate into an indexscan, · ff72280c
      Tom Lane committed
      the query result must be exactly one row (since we don't do this when there's
      any GROUP BY).  Therefore any ORDER BY or DISTINCT attached to the query is
      useless and can be dropped.  Aside from saving useless cycles, this protects
      us against problems with matching the hacked-up tlist entries to sort clauses,
      as seen in a bug report from Taiki Yamaguchi.  We might need to work harder
      if we ever try to optimize grouped queries with this approach, but this
      solution will do for now.
  17. 21 Mar, 2008 2 commits
  18. 19 Mar, 2008 1 commit
  19. 19 Feb, 2008 1 commit
  20. 18 Jan, 2008 1 commit
    • Fix subselect.c to avoid assuming that a SubLink's testexpr references each · a44174cf
      Tom Lane committed
      subquery output column exactly once left-to-right.  Although this is the case
      in the original parser output, it might not be so after rewriting and
      constant-folding, as illustrated by bug #3882 from Jan Mate.  Instead
      scan the subquery's target list to obtain needed per-column information;
      this is duplicative of what the parser did, but only a couple dozen lines
      need be copied, and we can clean up a couple of notational uglinesses.
      Bug was introduced in 8.2 as part of revision of SubLink representation.
  21. 11 Jan, 2008 1 commit
    • Fix a conceptual error in my patch of 2007-10-26 that avoided considering · 59fc64ac
      Tom Lane committed
      clauseless joins of relations that have unexploited join clauses.  Rather
      than looking at every other base relation in the query, the correct thing is
      to examine the other relations in the "initial_rels" list of the current
      make_rel_from_joinlist() invocation, because those are what we actually have
      the ability to join against.  This might be a subset of the whole query in
      cases where join_collapse_limit or from_collapse_limit or full joins have
      prevented merging the whole query into a single join problem.  This is a bit
      untidy because we have to pass those rels down through a new PlannerInfo
      field, but it's necessary.  Per bug #3865 from Oleg Kharin.
  22. 10 Jan, 2008 1 commit
    • Fix some planner issues found while investigating Kevin Grittner's report · 6a652252
      Tom Lane committed
      of poorer planning in 8.3 than 8.2:
      
      1. After pushing a constant across an outer join --- ie, given
      "a LEFT JOIN b ON (a.x = b.y) WHERE a.x = 42", we can deduce that b.y is
      sort of equal to 42, in the sense that we needn't fetch any b rows where
      it isn't 42 --- loop to see if any additional deductions can be made.
      Previous releases did that by recursing, but I had mistakenly thought that
      this was no longer necessary given the EquivalenceClass machinery.
      
      2. Allow pushing constants across outer join conditions even if the
      condition is outerjoin_delayed due to a lower outer join.  This is safe
      as long as the condition is strict and we re-test it at the upper join.
      
      3. Keep the outer-join clause even if we successfully push a constant
      across it.  This is *necessary* in the outerjoin_delayed case, but
      even in the simple case, it seems better to do this to ensure that the
      join search order heuristics will consider the join as reasonable to
      make.  Mark such a clause as having selectivity 1.0, though, since it's
      not going to eliminate very many rows after application of the constant
      condition.
      
      4. Tweak have_relevant_eclass_joinclause to report that two relations
      are joinable when they have vars that are equated to the same constant.
      We won't actually generate any joinclause from such an EquivalenceClass,
      but again it seems that in such a case it's a good idea to consider
      the join as worth costing out.
      
      5. Fix a bug in select_mergejoin_clauses that was exposed by these
      changes: we have to reject candidate mergejoin clauses if either side was
      equated to a constant, because we can't construct a canonical pathkey list
      for such a clause.  This is an implementation restriction that might be
      worth fixing someday, but it doesn't seem critical to get it done for 8.3.
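Item 1 of this commit can be checked with a toy simulation. The tables are invented for the example. For `a LEFT JOIN b ON (a.x = b.y) WHERE a.x = 42`, rows of `b` with `y` other than 42 can only ever pair with rows of `a` that the WHERE clause discards, so restricting the scan of `b` to `y = 42` leaves the final result unchanged.

```python
def left_join(outer, inner, on):
    """Nested-loop LEFT JOIN: unmatched outer rows pair with None."""
    result = []
    for o in outer:
        matches = [i for i in inner if on(o, i)]
        if matches:
            result.extend((o, i) for i in matches)
        else:
            result.append((o, None))
    return result


def join_cond(o, i):
    return o["x"] == i["y"]


a = [{"id": 1, "x": 42}, {"id": 2, "x": 7}]
b = [{"y": 42, "v": "first"}, {"y": 7, "v": "other"}, {"y": 42, "v": "second"}]

# Plan 1: scan all of b, join, then apply WHERE a.x = 42.
full = [row for row in left_join(a, b, join_cond) if row[0]["x"] == 42]

# Plan 2: additionally apply the deduced restriction b.y = 42 to b's scan.
b_pruned = [i for i in b if i["y"] == 42]
pruned = [row for row in left_join(a, b_pruned, join_cond) if row[0]["x"] == 42]
```

Before the WHERE filter the two joins differ (the row with `x = 7` is null-extended in the second plan), which is why `b.y` is only "sort of" equal to 42. After the filter both plans return the same two rows.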
  23. 02 Jan, 2008 1 commit
  24. 04 Dec, 2007 1 commit
  25. 24 Nov, 2007 1 commit
    • Change fix_scan_expr() to avoid copying the input node tree in the common case · a36436ea
      Tom Lane committed
      where rtoffset == 0.  In that case there is no need to change Var nodes,
      and since filling in unset opfuncid fields is always safe, scribbling on the
      input tree to that extent is not objectionable.  This brings the cost of this
      operation back down to what it was in 8.2 for simple queries.  Per
      investigation of performance gripe from Guillaume Smet.
  26. 16 Nov, 2007 2 commits
  27. 09 Nov, 2007 2 commits
    • Fix EquivalenceClass code to handle volatile sort expressions in a more · c291203c
      Tom Lane committed
      predictable manner; in particular that if you say ORDER BY output-column-ref,
      it will in fact sort by that specific column even if there are multiple
      syntactic matches.  An example is
      	SELECT random() AS a, random() AS b FROM ... ORDER BY b, a;
      While the use-case for this might be a bit debatable, it worked as expected
      in earlier releases, so we should preserve the behavior for 8.3.  Per my
      recent proposal.
      
      While at it, fix convert_subquery_pathkeys() to handle RelabelType stripping
      in both directions; it needs this for the same reasons make_sort_from_pathkeys
      does.
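The expected behaviour for the `ORDER BY b, a` example can be modelled in Python as follows. This is only an analogy: two output columns are produced by the same volatile expression, and the sort must use the values actually stored in column `b`, then `a`, rather than any column that happens to be written as the same expression. The seed and row count are arbitrary.

```python
import random

rng = random.Random(0)  # seeded only to make the sketch repeatable

# SELECT random() AS a, random() AS b: same expression text, independent values.
rows = [{"a": rng.random(), "b": rng.random()} for _ in range(5)]

# ORDER BY b, a: sort on the specific output columns named by the user.
ordered = sorted(rows, key=lambda r: (r["b"], r["a"]))

b_values = [r["b"] for r in ordered]
```

After the sort, `b_values` is in ascending order, whereas sorting on `a` instead would generally produce a different row order even though both columns come from `random()`.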
    • Last week's patch for make_sort_from_pathkeys wasn't good enough: it has · 1be06016
      Tom Lane committed
      to be able to discard top-level RelabelType nodes on *both* sides of the
      equivalence-class-to-target-list comparison, since make_pathkey_from_sortinfo
      might either add or remove a RelabelType.  Also fix the latter to do the
      removal case cleanly.  Per example from Peter.
  28. 03 Nov, 2007 1 commit
    • Ensure that EquivalenceClasses generated from ORDER BY keys contain proper · 97ddfc96
      Tom Lane committed
      RelabelType nodes when the sort key is binary-compatible with the sort
      operator rather than having exactly its input type.  We did this correctly
      for index columns but not sort keys, leading to failure to notice that
      a varchar index matches an ORDER BY request.  This requires a bit more work
      in make_sort_from_pathkeys, but not anyplace else that I can find.
      Per bug report and subsequent discussion.
  29. 25 Oct, 2007 1 commit
    • Fix an error in make_outerjoininfo introduced by my patch of 30-Aug: the code · 3ef18797
      Tom Lane committed
      neglected to test whether an outer join's join-condition actually refers to
      the lower outer join it is looking at.  (The comment correctly described what
      was supposed to happen, but the code didn't do it...)  This often resulted in
      adding an unnecessary constraint on the join order of the two outer joins,
      which was bad enough.  However, it also seems to expose a performance
      problem in an older patch (from 15-Feb): once we've decided that there is a
      join ordering constraint, we will start trying clauseless joins between every
      combination of rels within the constraint, which pointlessly eats up lots of
      time and space if there are numerous rels below the outer join.  That probably
      needs to be revisited :-(.  Per gripe from Jakub Ouhrabka.
  30. 13 Oct, 2007 1 commit
    • Teach planagg.c that partial indexes specifying WHERE foo IS NOT NULL can be · 106264ca
      Tom Lane committed
      used to perform MIN(foo) or MAX(foo), since we want to discard null rows in
      the indexscan anyway.  (This would probably fall out for free if we were
      injecting the IS NOT NULL clause somewhere earlier, but given the current
      anatomy of the MIN/MAX optimization code we have to do it explicitly.
      Fortunately, very little added code is needed.)  Per a discussion with
      Henk de Wit.
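Why such a partial index is sufficient can be shown with a few lines of Python. The column values are made up, and a sorted list stands in for the index. MIN and MAX ignore nulls, so an ordered structure containing only the non-null values already holds every row the aggregate could return.

```python
foo_column = [5, None, 3, None, 9]

# Stand-in for an index on foo with the predicate WHERE foo IS NOT NULL.
partial_index = sorted(v for v in foo_column if v is not None)

# MIN(foo) / MAX(foo) read one entry from either end of the ordered index.
min_from_index = partial_index[0]
max_from_index = partial_index[-1]

# Reference answers: aggregate over the non-null values of the full column.
min_reference = min(v for v in foo_column if v is not None)
max_reference = max(v for v in foo_column if v is not None)
```

Both approaches agree, since the rows excluded by the index predicate are exactly the rows the aggregate would skip anyway.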
  31. 12 Oct, 2007 1 commit
    • Fix the plan-invalidation mechanism to treat regclass constants that refer to · 82d8ab6f
      Tom Lane committed
      a relation as a reason to invalidate a plan when the relation changes.  This
      handles scenarios such as dropping/recreating a sequence that is referenced by
      nextval('seq') in a cached plan.  Rather than teach plancache.c all about
      digging through plan trees to find regclass Consts, we charge the planner's
      setrefs.c with making a list of the relation OIDs on which each plan depends.
      That way the list can be built cheaply during a plan tree traversal that has
      to happen anyway.  Per bug #3662 and subsequent discussion.
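The division of labour described here can be sketched in Python. The class, its methods, and the OID values are hypothetical and say nothing about how plancache.c is actually structured. The planner hands over a plan together with the set of relation OIDs it depends on, and the cache only needs to compare OIDs when a relation changes.

```python
class PlanCache:
    """Toy cache keyed by statement name; each plan carries its relation OIDs."""

    def __init__(self):
        self._plans = {}  # name -> (plan, frozenset of relation OIDs)

    def store(self, name, plan, relation_oids):
        self._plans[name] = (plan, frozenset(relation_oids))

    def get(self, name):
        entry = self._plans.get(name)
        return None if entry is None else entry[0]

    def invalidate_relation(self, oid):
        """Drop every plan whose dependency list mentions the changed relation."""
        stale = [n for n, (_, oids) in self._plans.items() if oid in oids]
        for name in stale:
            del self._plans[name]
        return stale


SEQ_OID, TABLE_OID = 16384, 16390  # made-up OIDs

cache = PlanCache()
cache.store("next_id", "plan using nextval('seq')", {SEQ_OID})
cache.store("lookup", "plan scanning a table", {TABLE_OID})

dropped = cache.invalidate_relation(SEQ_OID)  # e.g. sequence dropped and recreated
```

An OID that is listed but does not really name a relation would at worst cause an unnecessary entry in `dropped`, that is, a useless replan.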
  32. 05 Oct, 2007 1 commit
    • Keep the planner from failing on "WHERE false AND something IN (SELECT ...)". · 89db887b
      Tom Lane committed
      eval_const_expressions simplifies this to just "WHERE false", but we have
      already done pull_up_IN_clauses so the IN join will be done, or at least
      planned, anyway.  The trouble case comes when the sub-SELECT is itself a join
      and we decide to implement the IN by unique-ifying the sub-SELECT outputs:
      with no remaining reference to the output Vars in WHERE, we won't have
      propagated the Vars up to the upper join point, leading to "variable not found
      in subplan target lists" error.  Fix by adding an extra scan of in_info_list
      and forcing all Vars mentioned therein to be propagated up to the IN join
      point.  Per bug report from Miroslav Sulc.
  33. 23 Sep, 2007 1 commit
    • Fix cost estimates for EXISTS subqueries that are evaluated as initPlans · 71256875
      Tom Lane committed
      (because they are uncorrelated with the immediate parent query).  We were
      charging the full run cost to the parent node, disregarding the fact that
      only one row need be fetched for EXISTS.  While this would only be a
      cosmetic issue in most cases, it might possibly affect planning outcomes
      if the parent query were itself a subquery to some upper query.
      Per recent discussion with Steve Crawford.
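One plausible way to express the corrected estimate, assuming run cost is spread evenly over the expected rows. The function name and numbers are hypothetical, and this is not claimed to be the exact formula used in the commit. EXISTS stops after the first row, so the parent should be charged the startup cost plus roughly one row's share of the run cost rather than the whole run cost.

```python
def exists_initplan_cost(startup_cost, total_cost, plan_rows):
    """Estimated cost of fetching only the first row of a subplan."""
    run_cost = total_cost - startup_cost
    if plan_rows <= 1:
        return total_cost  # nothing to save when at most one row is expected
    return startup_cost + run_cost / plan_rows


full_charge = 1010.0  # charging the whole subplan to the parent
exists_charge = exists_initplan_cost(startup_cost=10.0,
                                     total_cost=1010.0,
                                     plan_rows=1000)
```

With these figures the charge drops from 1010 to 11, a difference large enough to change plan choices when the parent query is itself a subquery of some upper query.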