1. 07 Sep, 2017 21 commits
    • H
      Fix a compilation error and some warnings introduced by the recursive CTEs. · 7cb69995
      Heikki Linnakangas committed
      * In ruleutils.c, the ereport() was broken. Use elog() instead, like in
        the upstream. (elog() is fine for "can't happen" kind of sanity checks)
      
      * Remove a few unused local variables.
      
      * Add a missing cast from Plan * to Node *.
      7cb69995
    • T
      062a0d07
    • J
      Un-hide recursive CTE on master [#150861534] · 20152cbf
      Jesse Zhang committed
      We will be less conservative and enable by default recursive CTE on
      master, while keeping recursive CTE hidden as we progress on developing
      the feature.
      
      This reverts the following two commits:
      * 280c577a "Set gp_recursive_cte_prototype GUC to true in test"
      * 4d5f8087 "Guard Recursive CTE behind a GUC"
      Signed-off-by: Haisheng Yuan <hyuan@pivotal.io>
      Signed-off-by: Jesse Zhang <sbjesse@gmail.com>
      20152cbf
    • H
      Force a stand-alone backend to run in utility mode. · 814d82d6
      Heikki Linnakangas committed
      In a stand-alone backend ("postgres --single"), you cannot realistically
      expect any of the infrastructure needed for MPP processing to be present.
      Let's force a stand-alone backend to run in utility mode, to make sure
      that we don't try to dispatch queries, participate in distributed
      transactions, or anything like that, in a stand-alone backend.
      
      Fixes github issue #3172, which was one such case where we tried to
      dispatch a SET command in single-user mode, and got all confused.
      814d82d6
    • J
      Evaluate lesser joins to produce best join tree · 6ad94ff2
      Jemish Patel committed
      Previously we set the value of
      `optimizer_join_arity_for_associativity_commutativity` to a very large
      number, so ORCA would spend a very long time evaluating all possible n_way_join
      combinations to come up with the cheapest join tree to use in the plan.
      
      We are reducing this value to `7` as it does not prove to be beneficial
      to spend time and resources to evaluate any more than 7_way_joins in
      trying to find the cheapest join tree.
      6ad94ff2
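To give a feel for why exhaustive join-order search becomes expensive, here is a minimal sketch that counts ordered binary join trees over n relations using the textbook formula (2(n-1))! / (n-1)!. This is only an illustration of combinatorial growth; it is not ORCA's actual search space or cost model, and the function name `bushy_join_trees` is hypothetical.

```python
from math import factorial


def bushy_join_trees(n: int) -> int:
    """Count ordered binary join trees over n distinct relations.

    Textbook count: n! leaf orderings times Catalan(n-1) tree shapes,
    which simplifies to (2(n-1))! / (n-1)!.
    This is an illustrative upper bound, not what ORCA enumerates.
    """
    if n < 1:
        raise ValueError("need at least one relation")
    return factorial(2 * (n - 1)) // factorial(n - 1)


if __name__ == "__main__":
    # Print the count for a few join sizes to show how fast it grows.
    for n in (2, 4, 7, 10):
        print(n, bushy_join_trees(n))
```

Under this count, a 7-way join already has 665280 candidate trees, which is consistent with the commit's argument that searching beyond that arity rarely pays off.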
    • K
      Update pr resource to new upstream location · 5469cfee
      Kris Macoskey committed
      The upstream repository for the resource type has been renamed. There
      are no changes to the functionality, just a change of name.
      5469cfee
    • H
      Fix check for using window functions in WHERE clause. · ad093812
      Heikki Linnakangas committed
      In commit 563c8c6b, I all but removed the WindowRef.winlevelsup field, but
      missed that checkExprHasWindFuncs() was relying on it in a subtle way. Even
      though winlevelsup was always 0 during parsing, checkExprHasWindFuncs()
      compared it against the "current" nesting level, when it recursed into
      subqueries. Before commit 563c8c6b, it could never actually return true
      within subqueries, because the node->winlevelsup == context_sublevels_up
      condition would never be true. But when I removed the winlevelsup field,
      I erroneously removed that condition altogether, making it always return true
      in subqueries.
      
      To fix, don't recurse into subqueries in checkExprHasWindFuncs(). There's no
      point in recursing, if it can never return true in a subquery. It doesn't
      recurse in the upstream either.
      
      Add a test case for the failing case (reduced from TPC-DS query 70), to
      the 'bfv_olap' test. While we're at it, remove alternative expected output
      file for 'bfv_olap', because there was no meaningful (i.e. not-ignored)
      difference between that and the main expected output file.
      ad093812
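The fix described above can be sketched as a tree walker that stops at subquery boundaries. This is a simplified Python model, not the C code in GPDB: the `Node` class and the `has_window_func` function are hypothetical stand-ins for the parse-tree nodes and for checkExprHasWindFuncs().

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical expression node; 'kind' mimics a parse-node tag."""
    kind: str                                   # e.g. "WindowRef", "OpExpr", "SubLink"
    children: List["Node"] = field(default_factory=list)


def has_window_func(node: Node) -> bool:
    """Return True if a window function appears at the current query level."""
    if node.kind == "WindowRef":
        return True
    if node.kind == "SubLink":
        # A subquery is its own query level. Window functions inside it are
        # legal there, so do not recurse and do not report them for the
        # enclosing WHERE clause.
        return False
    return any(has_window_func(child) for child in node.children)


# WHERE a = (SELECT rank() OVER (...) ...): window function only in a subquery.
where_with_subquery = Node("OpExpr", [Node("Var"), Node("SubLink", [Node("WindowRef")])])
# WHERE rank() OVER (...) = 1: window function directly in the WHERE clause.
where_direct = Node("OpExpr", [Node("WindowRef"), Node("Const")])
```

With this walker, `where_with_subquery` is accepted while `where_direct` is flagged, which is the behavior the commit restores.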
    • C
      Remove proiswindow column from gptransfer verification query · 8539adcd
      Chris Hajas committed
      This column has a different name between master and previous GPDB
      versions.
      Signed-off-by: Karen Huddleston <khuddleston@pivotal.io>
      8539adcd
    • H
      Remove negative test case about WITH RECURSIVE · e969767f
      HaiSheng Yuan and Jesse Zhang committed
      We support WITH RECURSIVE now (albeit hidden behind a hidden GUC).
      [#149915203]
      e969767f
    • J
      Set gp_recursive_cte_prototype GUC to true in test · 280c577a
      Jesse Zhang committed
      Plus minor corrections in spelling and comments.
      Signed-off-by: Sam Dash <sdash@pivotal.io>
      280c577a
    • K
      Guard Recursive CTE behind a GUC · 4d5f8087
      Kavinder Dhaliwal committed
      While Recursive CTE is still being developed, it will be hidden from users by
      the GUC gp_recursive_cte_prototype.
      Signed-off-by: Sambitesh Dash <sdash@pivotal.io>
      4d5f8087
    • T
      Fix handling of changed-Param signaling for CteScan plan nodes. We were using · 2e598035
      Tom Lane committed
      the "cteParam" as a proxy for the possibility that the underlying CTE plan
      depends on outer-level variables or Params, but that doesn't work very well
      because it sometimes causes calling subqueries to be treated as SubPlans when
      they could be InitPlans.  This is inefficient and also causes the outright
      failure exhibited in bug #4902.  Instead, leave the cteParam out of it and
      copy the underlying CTE plan's extParams directly.  Per bug #4902 from
      Marko Tiikkaja.
      
      (cherry picked from commit 9298d2ff)
      2e598035
    • T
      Fix up ruleutils.c for CTE features. The main problem was that · 8e4b2f67
      Tom Lane committed
      get_name_for_var_field didn't have enough context to interpret a reference to
      a CTE query's output.  Fixing this requires separate hacks for the regular
      deparse case (pg_get_ruledef) and for the EXPLAIN case, since the available
      context information is quite different.  It's pretty nearly parallel to the
      existing code for SUBQUERY RTEs, though.  Also, add code to make sure we
      qualify a relation name that matches a CTE name; else the CTE will mistakenly
      capture the reference when reloading the rule.
      
      In passing, fix a pre-existing problem with get_name_for_var_field not working
      on variables in targetlists of SubqueryScan plan nodes.  Although latent all
      along, this wasn't a problem until we made EXPLAIN VERBOSE try to print
      targetlists.  To do this, refactor the deparse_context_for_plan API so that
      the special case for SubqueryScan is all on ruleutils.c's side.
      
      (cherry picked from commit 742fd06d)
      8e4b2f67
    • F
      Supporting ReScan of HashJoin with Spilled HashTable (#2770) · 391e9ea7
      foyzur committed
      To support RecursiveCTE we need to be able to ReScan a HashJoin as many times as the recursion depth. The HashJoin was previously ReScannable only if it has one memory-resident batch. Now, we support ReScannability for more than one batch. The approach that we took is to keep the inner batch files around for more than the duration of a single iteration of join if we detect that we need to reuse the batch files for rescanning. This can also improve the performance of the subplan as we no longer need to materialize and rebuild the hash table. Rather, we can just reload the batches from their corresponding batch files.
      
      To accomplish reloading of inner batch files, we keep the inner batch files around even if the outer is joined as we wait for the reuse in subsequent rescan (if rescannability is desired).
      
      The corresponding mail thread is here: https://groups.google.com/a/greenplum.org/forum/#!searchin/gpdb-dev/Rescannability$20of$20HashJoin%7Csort:relevance/gpdb-dev/E5kYU0FwJLg/Cqcxx0fOCQAJ
      
      Contributed by Haisheng Yuan, Kavinder Dhaliwal and Foyzur Rahman
      391e9ea7
    • K
      Error out when parsing certain keywords in a recursive CTE · 3f5cf5c7
      Kavinder Dhaliwal committed
      Currently, recursive CTEs do not support the following operations in the
      recursive term:
      
      - Group By
      - Window Functions
      - Subqueries with a self-reference
      - Distinct
      
      This commit produces an error in the parsing stage whenever any of the
      above is found in the recursive term of a CTE definition
      3f5cf5c7
    • K
      Improve behavior of WITH RECURSIVE with an untyped literal in the · 5c3d4f55
      Kavinder Dhaliwal committed
      non-recursive term.  Per an example from Dickson S. Guedes.
      5c3d4f55
    • K
      Error out when self-ref set operation in recursive term · 2168ecc5
      Kavinder Dhaliwal committed
      This commit ensures that if there is ever a self-reference to a
      recursive CTE within a set operation in the recursive term, an error will
      be produced.
      
      For example
      
      WITH RECURSIVE x(n) AS (
      	SELECT 1
      	UNION ALL
      	SELECT n+1 FROM (SELECT * FROM x UNION SELECT * FROM z)foo)
      SELECT * FROM x;
      
      Will produce an error, while
      
      WITH RECURSIVE x(n) AS (
      	SELECT 1
      	UNION ALL
      	SELECT n+1 FROM (SELECT * from z UNION SELECT * FROM u)foo, x where foo.x = x.n)
      SELECT * FROM x;
      
      Will not because the set operation does not have a self reference to its
      cte.
      2168ecc5
    • H
      Bring in recursive CTE to GPDB · fd61a4ca
      Haisheng Yuan committed
      The planner generates a plan that doesn't insert any motion between WorkTableScan and
      its corresponding RecursiveUnion, because currently in GPDB motions are not
      rescannable. For example, an MPP plan for a recursive CTE query may look like:
      ```
      Gather Motion 3:1
         ->  Recursive Union
               ->  Seq Scan on department
                     Filter: name = 'A'::text
               ->  Nested Loop
                     Join Filter: d.parent_department = sd.id
                     ->  WorkTable Scan on subdepartment sd
                     ->  Materialize
                           ->  Broadcast Motion 3:3
                                 ->  Seq Scan on department d
      ```
      
      For the current solution, the WorkTableScan is always put on the outer side of
      the topmost Join (the recursive part of RecursiveUnion), so that we can safely
      rescan the inner child of the join without worrying about the materialization of a
      potential underlying motion. This is a heuristic-based plan, not a cost-based
      plan.
      
      Ideally, the WorkTableScan can be placed on either side of the join with any
      depth, and the plan should be chosen based on the cost of the recursive plan
      and the number of recursions. But we will leave it for later work.
      
      Note: The hash join is temporarily disabled for plan generation of the recursive
      part, because if the hash table spills, the batch file is going to be removed
      as it executes. We have a follow-up story to make a spilled hash table
      rescannable.
      
      See discussion at gpdb-dev mailing list:
      https://groups.google.com/a/greenplum.org/forum/#!topic/gpdb-dev/s_SoXKlwd6I
      fd61a4ca
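The plan shape above can be read as an iteration over a work table. The following is a minimal single-process sketch of UNION ALL style recursive evaluation, assuming a tiny in-memory `departments` table; the names `recursive_union`, `expand`, `seed`, and the sample rows are hypothetical and say nothing about how GPDB's executor or motions actually work.

```python
from typing import Callable, List, Optional, Tuple

Row = Tuple[int, Optional[int], str]   # (id, parent_department, name)


def recursive_union(seed_rows: List[Row],
                    recursive_step: Callable[[List[Row]], List[Row]]) -> List[Row]:
    """Evaluate seed UNION ALL recursive term until the work table is empty."""
    result = list(seed_rows)
    work_table = list(seed_rows)
    while work_table:
        # Each iteration reads only the rows produced by the previous one.
        work_table = recursive_step(work_table)
        result.extend(work_table)
    return result


# Hypothetical sample data, made up for this sketch.
departments: List[Row] = [(1, None, "A"), (2, 1, "B"), (3, 2, "C"), (4, None, "D")]

# Non-recursive term: Seq Scan on department with Filter name = 'A'.
seed = [d for d in departments if d[2] == "A"]


def expand(work_table: List[Row]) -> List[Row]:
    """Nested loop: work table on the outer side, department rescanned inside."""
    return [d for sd in work_table for d in departments if d[1] == sd[0]]


subdepartments = recursive_union(seed, expand)
```

Note that the inner side (`departments`) is rescanned once per outer row, which mirrors why the plan keeps the WorkTableScan on the outer side and materializes the inner motion.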
    • M
      gp_era: change usage from md5 to sha256 · c13a9177
      Marbin Tan committed
      There is a bug with python 2.7 where you can't use hashlib.md5() with a
      system that has fips mode on. python 2.7 will segfault if you run the
      following
      `python -c "import ssl; import hashlib; m = hashlib.md5(); m.update('abc');"`
      
      Use sha256 instead as a workaround of the python 2.7 md5 issue.
      
      gp_era saves the hashed value into a file, which is read when creating
      a new mirror. It's mainly used to see if any segments get out of
      sync with the new era file.
      c13a9177
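The workaround amounts to swapping the digest constructor. A minimal sketch, assuming the era value is derived from some byte payload; the function name `era_digest` and its input are hypothetical and not taken from the gp_era source.

```python
import hashlib


def era_digest(payload: bytes) -> str:
    """Return a hex digest of the payload using SHA-256 instead of MD5.

    SHA-256 is allowed in FIPS mode, so this avoids calling hashlib.md5().
    """
    return hashlib.sha256(payload).hexdigest()


if __name__ == "__main__":
    print(era_digest(b"abc"))
```

One consequence worth keeping in mind is that a SHA-256 hex digest is 64 characters long rather than MD5's 32, so any file written with the old digest will not match the new one.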
    • H
      Add missing subselect test case with CTE [#150338742] · 4765e971
      Haisheng Yuan and Jesse Zhang committed
      Commit 038c36b6 from Postgres 8.3 was
      merged into Greenplum in a453004e. Commit 038c36b6 is a partial back
      port of commit 688aafa1 from Postgres 8.4. What's partial about 038c36b6
      is the omission of a test case containing CTE: a whole-row variable can
      refer to either an aliased `FROM` clause, or it can refer to a CTE. The
      CTE case was omitted because upstream 8.3 didn't have CTE.
      
      The non-CTE test case was slightly modified to add an `ORDER BY` clause
      because atmsort is confused by the `ORDER BY` inside the subselect:
      semantically we expect the differ to canonicalize (sort) the output
      before comparison, because sorted order of a subselect is not preserved
      according to SQL standard, but in this case atmsort believes the output
      is already sorted (by virtue of the presence of `ORDER BY`, even though
      it's within the subselect).
      
      Original commit message of 688aafa1 is enclosed:
      
      > Fix whole-row Var evaluation to cope with resjunk columns (again).
      >
      > When a whole-row Var is reading the result of a subquery, we need it to
      > ignore any "resjunk" columns that the subquery might have evaluated for
      > GROUP BY or ORDER BY purposes.  We've hacked this area before, in commit
      > 68e40998, but that fix only covered
      > whole-row Vars of named composite types, not those of RECORD type; and it
      > was mighty klugy anyway, since it just assumed without checking that any
      > extra columns in the result must be resjunk.  A proper fix requires getting
      > hold of the subquery's targetlist so we can actually see which columns are
      > resjunk (whereupon we can use a JunkFilter to get rid of them).  So bite
      > the bullet and add some infrastructure to make that possible.
      >
      > Per report from Andrew Dunstan and additional testing by Merlin Moncure.
      > Back-patch to all supported branches.  In 8.3, also back-patch commit
      > 292176a1, which for some reason I had
      > not done at the time, but it's a prerequisite for this change.
      
      (cherry picked from commit 688aafa15d8d83077c686d2b5b88226528e29840)
      4765e971
    • T
  2. 06 Sep, 2017 13 commits
    • H
      Ensure that stable functions in a prepared statement are re-evaluated. · ccca0af2
      Heikki Linnakangas committed
      If a prepared statement, or a cached plan for an SPI query e.g. from a
      PL/pgSQL function, contains stable functions, the stable functions were
      incorrectly evaluated only once at plan time, instead of on every execution
      of the plan. This happened to not be a problem in queries that contain any
      parameters, because in GPDB, they are re-planned on every invocation
      anyway, but non-parameter queries were broken.
      
      In the planner, before this commit, when simplifying expressions, we set
      the transform_stable_funcs flag to true for every query, and evaluated all
      stable functions at planning time. Change it to false, and also rename it
      back to 'estimate', as it's called in the upstream. That flag was changed
      back in 2010, in order to allow partition pruning to work with quals
      containing stable functions, like TO_DATE. I think back then, we always
      re-planned every query, so that was OK, but we do cache plans now.
      
      To avoid regressing to worse plans, change eval_const_expressions() so that
      it still does evaluate stable functions, even when the 'estimate' flag is
      off. But when it does so, mark the plan as "one-off", meaning that it must
      be re-planned on every execution. That gives the old, intended, behavior,
      that such plans are indeed re-planned, but it still allows plans that don't
      use stable functions to be cached.
      
      This seems to fix github issue #2661. Looking at the direct dispatch code
      in apply_motion(), I suspect there are more issues like this lurking there.
      There's a call to planner_make_plan_constant(), modifying the target list
      in place, and that happens during planning. But this at least fixes the
      non-direct dispatch cases, and is a necessary step for fixing any remaining
      issues.
      
      For some reason, the query now gets planned *twice* for every invocation.
      That's not ideal, but it was an existing issue for prepared statements with
      parameters, already. So let's deal with that separately.
      ccca0af2
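The "one-off" idea can be illustrated with a toy plan cache: plans that folded a stable function at plan time are never reused, while other plans are cached. This is a sketch under simplifying assumptions, not the GPDB plan cache; `Plan`, `PlanCache`, `planner`, and the `clock` stand-in for a stable function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Plan:
    value: int      # stand-in for a plan containing a folded constant
    one_off: bool   # True when a stable function was evaluated at plan time


class PlanCache:
    """Cache plans by query text, but never reuse a one-off plan."""

    def __init__(self, planner: Callable[[str], Plan]) -> None:
        self._planner = planner
        self._plans: Dict[str, Plan] = {}
        self.plan_count = 0   # how many times the planner actually ran

    def get_plan(self, query: str) -> Plan:
        cached = self._plans.get(query)
        if cached is not None:
            return cached
        self.plan_count += 1
        plan = self._planner(query)
        if not plan.one_off:
            # Only plans without folded stable functions are safe to reuse.
            self._plans[query] = plan
        return plan


# Hypothetical stable function: its result changes between executions.
clock = {"now": 0}


def planner(query: str) -> Plan:
    """Toy planner: folds now() into a constant and marks the plan one-off."""
    if "now()" in query:
        return Plan(value=clock["now"], one_off=True)
    return Plan(value=42, one_off=False)
```

With this structure, a query using `now()` is re-planned on each call and sees the current clock value, while a query without stable functions is planned once and served from the cache afterwards.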
    • H
      Fix reuse of cached plans in user-defined functions. · 2f4d8554
      Heikki Linnakangas committed
      CdbDispatchPlan() was making a copy of the plan tree, in the same memory
      context as the old plan tree was in. If the plan came from the plan cache,
      the copy will also be stored in the CachedPlan context. That means that
      every execution of the cached plan will leak a copy of the plan tree in
      the long-lived memory context.
      
      Commit 8b693868 fixed this for cached plans being used directly with
      the extended query protocol, but it did not fix the same issue with plans
      being cached as part of a user-defined function. To fix this properly,
      revert the changes to exec_bind_message, and instead in CdbDispatchPlan,
      make the copy of the plan tree in a short-lived memory context.
      
      Aside from the memory leak, it was never a good idea to change the original
      PlannedStmt's planTree pointer to point to the modified copy of the plan
      tree. That copy has had all the parameters replaced with their current
      values, but on the next execution, we should do that replacement again. I
      think that happened to not be an issue, because we had code elsewhere that
      forced re-planning of all queries anyway. Or maybe it was in fact broken.
      But in any case, stop scribbling on the original PlannedStmt, which might
      live in the plan cache, and make a temporary copy that we can freely
      scribble on in CdbDispatchPlan, that's only used for the dispatch.
      2f4d8554
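The principle of not scribbling on a cached plan can be sketched as follows: copy the plan for each dispatch, substitute parameters in the copy, and let the copy go out of scope afterwards. This is a loose Python analogy; `dispatch_plan` and the dict-based plan are hypothetical and do not model memory contexts or CdbDispatchPlan itself.

```python
import copy
from typing import Any, Dict


def dispatch_plan(cached_plan: Dict[str, Any], params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a per-execution copy of the plan with parameters substituted.

    The cached plan is left untouched so the next execution can substitute
    fresh parameter values.
    """
    working = copy.deepcopy(cached_plan)   # short-lived copy for this dispatch only
    working["quals"] = [params.get(q, q) for q in working["quals"]]
    return working


# Hypothetical cached plan with one parameter placeholder.
cached = {"quals": ["$1", "x"]}
first = dispatch_plan(cached, {"$1": 10})
second = dispatch_plan(cached, {"$1": 20})
```

Because each call works on its own copy, the second execution still sees the `$1` placeholder and substitutes the new value rather than inheriting the first execution's constant.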
    • K
      2f15ab8c
    • H
      Refactor the way seqserver host and port are stored. · 208a3cad
      Heikki Linnakangas committed
      They're not really per-portal settings, so it doesn't make much sense
      to pass them to PortalStart. And most of the callers were passing
      savedSeqServerHost/Port anyway. Instead, set the "current" host and port
      in postgres.c, when we receive them from the QD.
      208a3cad
    • H
      Remove useless system_catalog TINC tests. · 0e9380b3
      Heikki Linnakangas committed
      All of these queries were wrapped in gpdiff ignore-blocks. What's the
      point?
      0e9380b3
    • H
      Mark Abort/Commit/Transaction as static again. · 5fac1a58
      Heikki Linnakangas committed
      We don't care about old versions of dtrace anymore. Revert the code to
      the way it's in the upstream, to reduce our diff footprint.
      5fac1a58
    • C
      66842386
    • J
      Add migrated cs_walrep CCP tests to pipeline ALL group · cdba4245
      Jimmy Yih committed
      [ci skip]
      cdba4245
    • J
      Migrate cs-walrepl-multinode from Pulse to CCP · 1b960a73
      Jimmy Yih committed
      1b960a73
    • J
      Reorder TINC walrep_2 to fix ordering test failure · 1a6797e9
      Jimmy Yih committed
      Also remove some useless Makefile targets.
      1a6797e9
    • J
      Add TINC support with CCP · 2aea56b7
      Jimmy Yih committed
      TINC tests are planned to be migrated over to run natively in
      Concourse using CCP. This commit adds the task and script files needed
      to create the new TINC jobs.
      2aea56b7
    • H
      Don't initialize random seed when creating a temporary file. · be894afd
      Heikki Linnakangas committed
      That seems like a very random place to do it (sorry for the pun). The
      random seed is initialized at backend startup anyway, that ought to be
      good enough, so just remove the spurious initialization from bfz.c.
      
      In passing, improve the debug message to mention which compression
      algorithm was used.
      be894afd
    • H
      Remove unnecessary parse-analysis error position callback. · b325dc8e
      Heikki Linnakangas committed
      I guess once upon a time this was needed to get better error messages,
      with error positions, but we rely on the 'location' fields in the parse
      nodes nowadays. Removing this doesn't affect any of the error messages
      memorized in the regression tests, so it's not needed anymore.
      b325dc8e
  3. 05 Sep, 2017 6 commits