1. 16 Aug, 2019 1 commit
  2. 03 Aug, 2019 1 commit
  3. 24 Jul, 2019 1 commit
    • A
      GPDB changes for supporting equi- full merge joins in ORCA (#7814) · 452a463f
      Authored by Ashuka Xue
      This commit corresponds with ORCA commit "Implement Full Merge Join in ORCA".
      It also bumps ORCA version to v3.59.0.
      
      This commit includes the following changes to support merge join in
      ORCA:
      1. Update the optimizer_expand_fulljoin GUC to use traceflags instead of
      disabling the transform.
      2. Translator changes for Merge Join.
      3. Add IsOpMergeJoinable() and GetMergeJoinOpFamilies() wrappers.
      4. Introduce the optimizer_enable_mergejoin GUC.
      452a463f
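A merge join consumes inputs already sorted on the join keys; the full variant additionally emits NULL-padded rows for unmatched tuples on either side. A toy Python sketch of the idea (purely illustrative, not ORCA's C++ implementation):

```python
def full_merge_join(left, right):
    """Full equi merge join of two lists sorted by key; rows are (key, payload).
    Unmatched rows from either side are emitted padded with None."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk == rk:
            # gather the equal-key runs on both sides, emit their cross product
            li, rj = i, j
            while i < len(left) and left[i][0] == lk:
                i += 1
            while j < len(right) and right[j][0] == lk:
                j += 1
            for l in left[li:i]:
                for r in right[rj:j]:
                    out.append((l, r))
        elif lk < rk:
            out.append((left[i], None))   # left row with no match
            i += 1
        else:
            out.append((None, right[j]))  # right row with no match
            j += 1
    out += [(l, None) for l in left[i:]]
    out += [(None, r) for r in right[j:]]
    return out
```

IsOpMergeJoinable() and GetMergeJoinOpFamilies() exist precisely so ORCA can check that the equality operator has the sort semantics this single-pass merge relies on.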
  4. 28 Jun, 2019 1 commit
    • S
      Fix wrong results for aggs containing a correlated subquery · 96d92667
      Authored by Shreedhar Hardikar
      This commit handles a missed case in the previous commit: "Fix
      algebrization of subqueries in queries with complex GROUP BYs".
      
      The logic inside RunExtractAggregatesMutator's Var case was intended to
      fix top-level Vars inside subqueries in the target list, but it also
      incorrectly fixed top-level Vars in subqueries inside of aggregates.
      96d92667
  5. 18 May, 2019 1 commit
    • S
      Fix algebrization of subqueries in queries with complex GROUP BYs · 90d1ec44
      Authored by Shreedhar Hardikar
      ORCA's algebrizer must first normalize GROUP BYs in a query to a form
      usable in ORCA. It must flatten expressions in the project list to
      contain only aggregates and grouping columns.
      For example:
      
      ORIGINAL QUERY:
        SELECT * from r where r.a > (SELECT max(c) + min(d)
                                     FROM t where r.b = t.e)
      NEW QUERY:
        SELECT * from r where r.a > (SELECT x1+x2 as x3
        FROM (SELECT max(c) as x1, min(d) as x2
              FROM t where r.b = t.e) t2)
      
      However this process did not support subqueries in the target list that
      may contain outer references, sometimes in other (nested) subqueries. It
      also did not support CTEs. All these would produce a normalization error
      and fall back.
      
      This commit fixes that by supporting subqueries & CTEs.
      
      It also includes some refactors in related areas:
      
      - Rename IncrLevelsUpInVar to IncrLevelsUpIfOuterRef, to better reflect
        its implementation.
      - Remove SContextHavingQualMutator, after realizing that it has almost
        the same members as SContextGrpbyPlMutator.
      - Use MakeVarInDerivedTable & RunGroupByProjListMutator in
        RunHavingQualMutator to reduce code drift.
      - Merge RunGroupByProjListMutator & RunHavingQualMutator (see below)
      
      RunGroupByProjListMutator() was implemented later than
      RunHavingQualMutator, and did not handle subqueries and CTEs correctly.
      After understanding its purpose, I think the functionality of both the
      above methods should be exactly the same, since they're trying to
      achieve the same goal.
      
      So, this commit just merges the two functions together into a new
      function - RunExtractAggregatesMutator. In this process, I also
      discovered a bug in the old RunHavingQualMutator, that is now fixed:
      
        create table fooh1 (a int, b int, c int);
        insert into fooh1 select i%4, i%3, i from generate_series(1,20) i;
        select f1.a + 1 from fooh1 f1 group by f1.a+1 having sum(f1.a+1) + 1 > 20;
      
      Finally, the earlier code deduplicated AGGREFs when possible for HAVING
      clauses, but not for GROUP BY target lists when moving them into the
      derived query. Not only is that inconsistent, it may also give
      incorrect results in the case of volatile functions. The executor
      already handles de-duplicating AGGREFs in the right circumstances, so
      doing this in the algebrizer doesn't provide much of a benefit.
      90d1ec44
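The normalization described above can be checked end-to-end on the fooh1 example, with SQLite standing in for GPDB (a sketch; the derived-table aliases x1/x2 are hypothetical, mirroring the commit's naming convention):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE fooh1 (a INT, b INT, c INT)")
cur.executemany("INSERT INTO fooh1 VALUES (?, ?, ?)",
                [(i % 4, i % 3, i) for i in range(1, 21)])

# Original query, with an expression in both GROUP BY and HAVING
orig = cur.execute("""
    SELECT f1.a + 1 FROM fooh1 f1
    GROUP BY f1.a + 1 HAVING sum(f1.a + 1) + 1 > 20
""").fetchall()

# Normalized form: the derived table computes only grouping columns and
# aggregates; the outer query references them as plain columns
norm = cur.execute("""
    SELECT x1 FROM (SELECT f1.a + 1 AS x1, sum(f1.a + 1) AS x2
                    FROM fooh1 f1 GROUP BY f1.a + 1) t2
    WHERE x2 + 1 > 20
""").fetchall()

assert orig == norm  # both return the single group where a + 1 = 4
```

Both forms return the same single row, which is exactly the invariant the algebrizer's rewrite must preserve.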
  6. 03 May, 2019 2 commits
  7. 25 Apr, 2019 1 commit
  8. 18 Apr, 2019 1 commit
    • S
      Disable FULL JOIN by default for ORCA · a64ca852
      Authored by Sambitesh Dash
      Full joins are sub-optimal in ORCA, as they are implemented as a UNION
      of a Left Outer Join and a Left Anti-Semi Join. However, GPDB provides a
      full outer join operator. Therefore, until ORCA implements a more
      optimal FULL JOIN, it will fall back to the Postgres legacy query
      optimizer for queries with FULL JOINs.
      
      Co-authored-by: Sambitesh Dash <sdash@pivotal.io>
      Co-authored-by: Ashuka Xue <axue@pivotal.io>
      a64ca852
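The UNION-based expansion mentioned above can be illustrated with SQLite (a sketch of the logical rewrite, not GPDB's executor): a FULL JOIN is the left outer join plus the NULL-padded anti-join of the right side.

```python
import sqlite3

cur = sqlite3.connect(":memory:").cursor()
cur.executescript("""
    CREATE TABLE r (a INT);
    CREATE TABLE s (b INT);
    INSERT INTO r VALUES (1), (2);
    INSERT INTO s VALUES (2), (3);
""")

# FULL JOIN expanded as: Left Outer Join UNION ALL Left Anti-Semi Join
# (the anti-join keeps right rows with no match, padded with NULLs)
rows = cur.execute("""
    SELECT r.a, s.b FROM r LEFT JOIN s ON r.a = s.b
    UNION ALL
    SELECT NULL, s.b FROM s
    WHERE NOT EXISTS (SELECT 1 FROM r WHERE r.a = s.b)
""").fetchall()

assert set(rows) == {(1, None), (2, 2), (None, 3)}
```

The extra scan of the right side is what makes this expansion sub-optimal compared to a native full outer join operator.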
  9. 14 Mar, 2019 1 commit
  10. 12 Mar, 2019 1 commit
  11. 15 Feb, 2019 1 commit
    • T
      Recursively create partitioned indexes · f27b2a50
      Authored by Taylor Vesely
      Pull from upstream Postgres to make DefineIndex recursively create partitioned
      indexes. Instead of creating an individual IndexStmt for every partition,
      create indexes by recursing on the partition children.  This aligns index
      creation with upstream in preparation for adding INTERNAL_AUTO relationships
      between partition indexes.
      
       * The QD will now choose the same name for partition indexes as Postgres.
       * Update tests to reflect the partition index names changes.
       * The changes to DefineIndex are mostly cherry-picked from Postgres commit:
         8b08f7d4
       * transformIndexStmt and its callers have been aligned with Postgres
         REL9_4_STABLE
      Co-authored-by: Kalen Krempely <kkrempely@pivotal.io>
      f27b2a50
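The structural change can be pictured with a toy model: instead of preparing one IndexStmt per partition up front, index creation recurses into the partition children. Everything here (define_index, the partition map) is hypothetical illustration, not GPDB's API:

```python
def define_index(rel, idx_name, children):
    """Create an index on rel, then recurse into its partition children,
    rather than building a separate IndexStmt for every partition up front."""
    created = [f"{idx_name} ON {rel}"]
    for child in children.get(rel, []):
        # derive a child index name (the naming scheme here is illustrative)
        created.extend(define_index(child, f"{child}_{idx_name}", children))
    return created
```

Because the recursion chooses each child's index name as it descends, the QD naturally picks the same names Postgres would, which is the behavior change the tests had to be updated for.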
  12. 19 Jan, 2019 1 commit
  13. 12 Jan, 2019 1 commit
  14. 29 Dec, 2018 1 commit
    • H
      Call executor nodes the same, whether generated by planner or ORCA. · 455b9a19
      Authored by Heikki Linnakangas
      We used to call some node types by different names in EXPLAIN output,
      depending on whether the plan was generated by ORCA or the Postgres
      planner. Also, a Bitmap Heap Scan used to be called differently when the
      table was an AO or AOCS table, but only in planner-generated plans.
      There was some historical justification for this, because they used to
      be different executor node types, but commit db516347 removed the last
      such differences.
      
      Full list of renames:
      
      Table Scan -> Seq Scan
      Append-only Scan -> Seq Scan
      Append-only Columnar Scan -> Seq Scan
      Dynamic Table Scan -> Dynamic Seq Scan
      Bitmap Table Scan -> Bitmap Heap Scan
      Bitmap Append-Only Row-Oriented Scan -> Bitmap Heap Scan
      Bitmap Append-Only Column-Oriented Scan -> Bitmap Heap Scan
      Dynamic Bitmap Table Scan -> Dynamic Bitmap Heap Scan
      455b9a19
  15. 15 Dec, 2018 2 commits
    • H
      Refactor executor code for TableScan, DynamicTableScan, BitmapHeapScan. · db516347
      Authored by Heikki Linnakangas
      This removes a lot of GPDB-specific code that was used to deal with
      dynamic scans, and code duplication between nodes dealing with Heap, AO
      and AOCS tables.
      
      * Resurrect SeqScan node. We had replaced it with TableScan in GPDB.
        Teach SeqScan to also work on append-only and AOCS tables, and remove
        TableScan and all the code changes that were made in GPDB earlier to
        deal with all three table types.
      
      * Merge BitmapHeapScan, BitmapAppendOnlyScan, and BitmapTableScan node
        types. They're all BitmapHeapScans now. We used to use BitmapTableScans
        in ORCA-generated plans, and BitmapHeapScan/BitmapAppendOnlyScan in
        planner-generated plans, and there was no good reason for the
        difference. The "heap" part in the name is a bit misleading, but I
        prefer to keep the upstream name, even though it now handles AO tables
        as well. It's more like the old BitmapTableScan now, which also handled
        all three table types, but the code is refactored to stay as close to
        upstream as possible.
      
      * Introduce DynamicBitmapHeapScan. BitmapTableScan used to perform Dynamic
        scans too, now it's the responsibility of the new DynamicBitmapHeapScan
        plan node, just like we have DynamicTableScan and DynamicIndexScan as
        wrappers around SeqScans and IndexScans.
      
      * Get rid of BitmapAppendOnlyPath in the planner, too. Use BitmapHeapPath
        also for AO tables.
      
      * Refactor the way Dynamic Table Scan works. A Dynamic Table Scan node
        is now just a thin wrapper around SeqScan. It initializes a new
        SeqScan executor node for every partition, and lets it do the actual
        scanning. It now works the same way that I refactored Dynamic Index
        Scans to work in commit 198f701e. This allowed removing a lot of code
        that we used to use for both Dynamic Index Scans and Dynamic Table
        Scans, but is no longer used.
      
      There's now some duplication in the Dynamic* nodes, to walk through the
      partitions. They all have a function called setPidIndex(), for example,
      which does the same thing. But I think it's much more clear this way,
      than the previous DynamicController stuff. We could perhaps extract some
      of the code to common helper functions, but I think this is OK for now.
      
      This also fixes issue #6274. I'm not sure what exactly the bug was, but it
      was clearly in the Bitmap Table Scan code that is used with ORCA-generated
      plans. Now that we use the same code for plans generated with the Postgres
      planner and ORCA, it's not surprising that the bug is gone.
      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
      Reviewed-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
      db516347
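The "thin wrapper" structure described above can be sketched in Python (a toy model of the executor shape, not the actual C code):

```python
class SeqScan:
    """Scans a single relation; after the refactor the same node handles
    heap, append-only, and AOCS tables alike."""
    def __init__(self, rows):
        self.rows = rows

    def __iter__(self):
        return iter(self.rows)


class DynamicSeqScan:
    """Thin wrapper: initializes a fresh SeqScan for every partition and
    lets it do the actual scanning."""
    def __init__(self, partitions):
        self.partitions = partitions

    def __iter__(self):
        for part in self.partitions:
            yield from SeqScan(part)
```

DynamicIndexScan and DynamicBitmapHeapScan follow the same pattern around their respective non-dynamic nodes, which is where the small duplication (e.g. the per-node setPidIndex()) comes from.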
    • S
      ICG updates for ORCA commit: "Include binary coercible casts for predicate inference" · dbece3da
      Authored by Shreedhar Hardikar
      Also bump ORCA's version to 3.16.0.
      Co-authored-by: Hans Zeller <hzeller@pivotal.io>
      dbece3da
  16. 13 Dec, 2018 1 commit
    • D
      Reporting cleanup for GPDB specific errors/messages · 56540f11
      Authored by Daniel Gustafsson
      The Greenplum-specific error handling via ereport()/elog() calls was
      in need of a unification effort, as some parts of the code were using a
      different messaging style from others (and from upstream). This aims at
      bringing many of the GPDB error calls in line with the upstream error
      message writing guidelines and thus makes the user experience of
      Greenplum more consistent.
      
      The main contributions of this patch are:
      
      * errmsg() messages shall start with a lowercase letter, and not end
        with a period. errhint() and errdetail() shall be complete sentences
        starting with capital letter and ending with a period. This attempts
        to fix this on as many ereport() calls as possible, with too detailed
        errmsg() content broken up into details and hints where possible.
      
      * Reindent ereport() calls to be more consistent with the common style
        used in upstream and most parts of Greenplum:
      
      	ereport(ERROR,
      			(errcode(<CODE>),
      			 errmsg("short message describing error"),
      			 errhint("Longer message as a complete sentence.")));
      
      * Avoid breaking messages due to long lines since it makes grepping
        for error messages harder when debugging. This is also the de facto
        standard in upstream code.
      
      * Convert a few internal error ereport() calls to elog(). There are
        no doubt more that can be converted, but the low hanging fruit has
        been dealt with. Also convert a few elog() calls which are user
        facing to ereport().
      
      * Update the testfiles to match the new messages.
      
      Spelling and wording is mostly left for a follow-up commit, as this was
      getting big enough as it was. The most obvious cases have been handled
      but there is work left to be done here.
      
      Discussion: https://github.com/greenplum-db/gpdb/pull/6378
      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
      56540f11
  17. 03 Nov, 2018 2 commits
  18. 24 Oct, 2018 1 commit
  19. 12 Oct, 2018 1 commit
    • H
      Improve the INFO message on ORCA fallback. · d704245e
      Authored by Heikki Linnakangas
      If optimizer_trace_fallback is enabled, report the ORCA exception's message
      to the user. It usually says something like "Feature not supported by the
      Pivotal Query Optimizer: Non-default collation", which is quite helpful.
      
      While we're at it, simplify setting the "had unexpected failure" flag,
      by putting it in SOptContext. That allows removing one layer of catching and
      re-throwing the exception.
      d704245e
  20. 11 Oct, 2018 1 commit
  21. 28 Sep, 2018 1 commit
    • Z
      Allow tables to be distributed on a subset of segments · 4eb65a53
      Authored by ZhangJackey
      There was an assumption in GPDB that a table's data is always
      distributed on all segments. However, this is not always true: for
      example, when a cluster is expanded from M segments to N (N > M), all
      the tables are still on M segments. To work around the problem, we used
      to have to alter all the hash-distributed tables to randomly distributed
      to get correct query results, at the cost of bad performance.
      
      Now we support table data to be distributed on a subset of segments.
      
      A new column `numsegments` is added to catalog table
      `gp_distribution_policy` to record how many segments a table's data is
      distributed on.  By doing so we can allow DMLs on M tables; joins
      between M and N tables are also supported.
      
      ```sql
      -- t1 and t2 are both distributed on (c1, c2),
      -- one on 1 segment, the other on 2 segments
      select localoid::regclass, attrnums, policytype, numsegments
          from gp_distribution_policy;
       localoid | attrnums | policytype | numsegments
      ----------+----------+------------+-------------
       t1       | {1,2}    | p          |           1
       t2       | {1,2}    | p          |           2
      (2 rows)
      
      -- t1 and t1 have exactly the same distribution policy,
      -- join locally
      explain select * from t1 a join t1 b using (c1, c2);
                         QUERY PLAN
      ------------------------------------------------
       Gather Motion 1:1  (slice1; segments: 1)
         ->  Hash Join
               Hash Cond: a.c1 = b.c1 AND a.c2 = b.c2
               ->  Seq Scan on t1 a
               ->  Hash
                     ->  Seq Scan on t1 b
       Optimizer: legacy query optimizer
      
      -- t1 and t2 are both distributed on (c1, c2),
      -- but as they have different numsegments,
      -- one has to be redistributed
      explain select * from t1 a join t2 b using (c1, c2);
                                QUERY PLAN
      ------------------------------------------------------------------
       Gather Motion 1:1  (slice2; segments: 1)
         ->  Hash Join
               Hash Cond: a.c1 = b.c1 AND a.c2 = b.c2
               ->  Seq Scan on t1 a
               ->  Hash
                     ->  Redistribute Motion 2:1  (slice1; segments: 2)
                           Hash Key: b.c1, b.c2
                           ->  Seq Scan on t2 b
       Optimizer: legacy query optimizer
      ```
      4eb65a53
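The planner's decision illustrated by the two EXPLAIN outputs boils down to comparing both the distribution columns and numsegments (an illustrative predicate, not GPDB's actual function):

```python
def can_join_locally(policy_a, policy_b):
    """A co-located join needs no Redistribute Motion only if both sides
    hash on the same columns AND span the same number of segments."""
    return (policy_a["attrnums"] == policy_b["attrnums"]
            and policy_a["numsegments"] == policy_b["numsegments"])


# Policies mirroring the gp_distribution_policy rows shown above
t1 = {"attrnums": (1, 2), "policytype": "p", "numsegments": 1}
t2 = {"attrnums": (1, 2), "policytype": "p", "numsegments": 2}

assert can_join_locally(t1, t1)       # self-join: local hash join
assert not can_join_locally(t1, t2)   # t2 must be redistributed 2:1
```

This matches the plans above: the t1-t1 join runs locally, while the t1-t2 join inserts a Redistribute Motion on t2 despite identical distribution columns.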
  22. 25 Sep, 2018 1 commit
  23. 22 Sep, 2018 1 commit
    • H
      Change pretty-printing of expressions in EXPLAIN to match upstream. · 4c54c894
      Authored by Heikki Linnakangas
      We had changed this in GPDB, to print fewer parens. That's fine and
      dandy, but it hardly seems worth it to carry a diff vs. upstream for
      this. Which format is better is a matter of taste. The extra parens make
      some expressions more clear, but OTOH, it's unnecessarily verbose for
      simple expressions. Let's follow the upstream on this.
      
      These changes were made to GPDB back in 2006, as part of backporting
      EXPLAIN-related patches from PostgreSQL 8.2. But I didn't see any
      explanation for this particular change in output in that commit message.
      
      It's nice to match upstream, to make merging easier. However, this won't
      make much difference to that: almost all EXPLAIN plans in regression
      tests are different from upstream anyway, because GPDB needs Motion nodes
      for most queries. But every little helps.
      4c54c894
  24. 21 Sep, 2018 2 commits
  25. 11 Sep, 2018 1 commit
  26. 08 Sep, 2018 1 commit
    • D
      Introduce optimizer guc to enable generating streaming material · 635c2e0f
      Authored by Dhanashree Kashid
      Previously, while optimizing nestloop joins, ORCA always generated a
      blocking materialize node (cdb_strict=true). Though this conservative
      approach ensured that the join node produced by ORCA would always be
      deadlock safe, we sometimes produced slow-running plans.
      
      ORCA now has the capability of producing a blocking materialize only
      when needed, by detecting motion hazard in the nestloop join. A
      streaming materialize will be generated when there is no motion hazard.
      
      This commit adds a GUC to control this behavior. When set to off, we
      fall back to the old behavior of always producing a blocking
      materialize.
      
      Also bump the statement_mem for a test in segspace. After this change,
      for the test query, we produce a streaming spool, which changes the
      number of operator groups in memory quota calculation, and the query
      fails with: `ERROR:  insufficient memory reserved for statement`. Bump
      the statement_mem by 1MB to test the fault injection.
      
      Also bump the ORCA version to 2.72.0.
      Signed-off-by: Abhijit Subramanya <asubramanya@pivotal.io>
      635c2e0f
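The blocking-vs-streaming distinction can be sketched with generators (a toy model; the cdb_strict name follows the commit, everything else is illustrative):

```python
def materialize(child, cdb_strict):
    """cdb_strict=True: blocking spool -- drain the child completely before
    emitting anything, which is deadlock safe under a motion hazard.
    cdb_strict=False: streaming -- pass rows through as they arrive."""
    if cdb_strict:
        buffered = list(child)  # full spool first
        yield from buffered
    else:
        yield from child


def traced(rows, log):
    """Instrumented child node: records each row pulled from it."""
    for r in rows:
        log.append(("read", r))
        yield r


# Blocking: both input rows are pulled before the first output row appears
log = []
first = next(materialize(traced([1, 2], log), cdb_strict=True))
assert first == 1 and log == [("read", 1), ("read", 2)]

# Streaming: only one input row has been pulled so far
log2 = []
first2 = next(materialize(traced([1, 2], log2), cdb_strict=False))
assert first2 == 1 and log2 == [("read", 1)]
```

Both variants yield identical rows; the difference is purely when the child is drained, which is exactly what matters for deadlock safety versus plan speed.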
  27. 06 Sep, 2018 1 commit
  28. 01 Sep, 2018 1 commit
    • S
      Add test to ensure ORCA generates a correct equivalence class · 27127b47
      Authored by Sambitesh Dash
      Given a query like below:
      
      SELECT Count(*)
      FROM   (SELECT *
              FROM   (SELECT tab_2.cd AS CD1,
                             tab_2.cd AS CD2
                      FROM   tab_1
                             LEFT JOIN tab_2
                                    ON tab_1.id = tab_2.id) f
              UNION ALL
              SELECT region,
                     code
              FROM   tab_3)a;
      
      Previously, ORCA produced an incorrect filter, (cd2 = cd), on top of the
      project list generated for producing an alias. This led to incorrect
      results, as column 'cd' is produced by the nullable side of the LOJ
      (tab_2); on NULL-extended rows such a filter evaluates to NULL and
      incorrectly removes them.
      Ensure ORCA produces a correct equivalence class by considering the
      nullable columns.
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
      27127b47
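The wrong-results scenario is easy to reproduce outside ORCA (SQLite here; the table contents are made up for illustration): the duplicated alias must not induce a cd1 = cd2 filter, because on NULL-extended LOJ rows that predicate evaluates to NULL and drops them.

```python
import sqlite3

cur = sqlite3.connect(":memory:").cursor()
cur.executescript("""
    CREATE TABLE tab_1 (id INT);
    CREATE TABLE tab_2 (id INT, cd TEXT);
    INSERT INTO tab_1 VALUES (1), (2);
    INSERT INTO tab_2 VALUES (2, 'x');
""")

# Correct result: two rows, one NULL-extended for the unmatched id = 1
correct = cur.execute("""
    SELECT tab_2.cd AS cd1, tab_2.cd AS cd2
    FROM tab_1 LEFT JOIN tab_2 ON tab_1.id = tab_2.id
""").fetchall()
assert len(correct) == 2 and (None, None) in correct

# The buggy plan effectively added a (cd2 = cd) filter; NULL = NULL is
# NULL, so the NULL-extended row is incorrectly discarded
buggy = cur.execute("""
    SELECT tab_2.cd AS cd1, tab_2.cd AS cd2
    FROM tab_1 LEFT JOIN tab_2 ON tab_1.id = tab_2.id
    WHERE tab_2.cd = tab_2.cd
""").fetchall()
assert buggy == [('x', 'x')]
```

With a COUNT(*) on top, as in the commit's query, the lost row directly changes the count, which is how the bug surfaced.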
  29. 31 Aug, 2018 2 commits
    • H
      Replace GPDB versions of some numeric aggregates with upstream's. · 325e6fcd
      Authored by Heikki Linnakangas
      Among other things, this fixes the inaccuracy of integer avg() and sum()
      functions. (i.e. fixes https://github.com/greenplum-db/gpdb/issues/5525)
      
      The upstream versions are from PostgreSQL 9.6, using the 128-bit math
      from the following commit:
      
      commit 959277a4
      Author: Andres Freund <andres@anarazel.de>
      Date:   Fri Mar 20 10:26:17 2015 +0100
      
          Use 128-bit math to accelerate some aggregation functions.
      
          On platforms where we support 128bit integers, use them to implement
          faster transition functions for sum(int8), avg(int8),
          var_*(int2/int4),stdev_*(int2/int4). Where not supported continue to use
          numeric as a transition type.
      
          In some synthetic benchmarks this has been shown to provide significant
          speedups.
      
          Bumps catversion.
      
          Discussion: 544BB5F1.50709@proxel.se
          Author: Andreas Karlsson
          Reviewed-By: Peter Geoghegan, Petr Jelinek, Andres Freund,
              Oskari Saarenmaa, David Rowley
      325e6fcd
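The inaccuracy this fixes comes from accumulating integer aggregates in a float8 transition state: once values exceed 2^53, float64 can no longer represent every integer. A minimal Python illustration (the float loop models the old inexact path; Python's arbitrary-precision ints stand in for the exact 128-bit path):

```python
# Adding 1 to 2**60 is lost to rounding in float64 arithmetic, because the
# mantissa has only 53 bits; the spacing between floats near 2**60 is 256.
vals = [2**60, 1, -2**60]

float_state = 0.0
for v in vals:
    float_state += v  # old-style float8 transition state

exact_state = 0
for v in vals:
    exact_state += v  # exact integer accumulation (the 128-bit path)

assert float_state == 0.0  # the +1 vanished
assert exact_state == 1    # exact accumulation keeps it
```

sum() and avg() over large int8 values showed exactly this kind of drift before the upstream versions were adopted.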
    • H
      Rename "prelim function" to "combine function", to match upstream. · b8545d57
      Authored by Heikki Linnakangas
      The GPDB "prelim" functions did the same things as the "combine"
      functions introduced in PostgreSQL 9.6. This commit includes just the
      catalog changes, to essentially search & replace "prelim" with
      "combine". I did not pick up the planner and executor changes that were
      made as part of this in the upstream, yet.
      
      Also replace the GPDB implementation of float8_amalg() and
      float8_regr_amalg(), with the upstream float8_combine() and
      float8_regr_combine(). They do the same thing, but let's use upstream
      functions where possible.
      
      Upstream commits:
      commit a7de3dc5
      Author: Robert Haas <rhaas@postgresql.org>
      Date:   Wed Jan 20 13:46:50 2016 -0500
      
          Support multi-stage aggregation.
      
          Aggregate nodes now have two new modes: a "partial" mode where they
          output the unfinalized transition state, and a "finalize" mode where
          they accept unfinalized transition states rather than individual
          values as input.
      
          These new modes are not used anywhere yet, but they will be necessary
          for parallel aggregation.  The infrastructure also figures to be
          useful for cases where we want to aggregate local data and remote
          data via the FDW interface, and want to bring back partial aggregates
          from the remote side that can then be combined with locally generated
          partial aggregates to produce the final value.  It may also be useful
          even when neither FDWs nor parallelism are in play, as explained in
          the comments in nodeAgg.c.
      
          David Rowley and Simon Riggs, reviewed by KaiGai Kohei, Heikki
          Linnakangas, Haribabu Kommi, and me.
      
      commit af025eed
      Author: Robert Haas <rhaas@postgresql.org>
      Date:   Fri Apr 8 13:44:50 2016 -0400
      
          Add combine functions for various floating-point aggregates.
      
          This allows parallel aggregation to use them.  It may seem surprising
          that we use float8_combine for both float4_accum and float8_accum
          transition functions, but that's because those functions differ only
          in the type of the non-transition-state argument.
      
          Haribabu Kommi, reviewed by David Rowley and Tomas Vondra
      b8545d57
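The partial/combine/finalize protocol described in the upstream commit can be sketched for avg() (an illustration of the protocol, not the C implementation):

```python
def trans(state, value):
    """Partial mode: accumulate into an unfinalized transition state."""
    total, count = state
    return (total + value, count + 1)


def combine(a, b):
    """Merge two partial transition states (the old GPDB 'prelim' role)."""
    return (a[0] + b[0], a[1] + b[1])


def finalize(state):
    """Finalize mode: turn the combined transition state into a result."""
    total, count = state
    return total / count


# Each "segment" aggregates its slice; the QD combines and finalizes
seg1 = seg2 = (0, 0)
for v in [1, 2, 3]:
    seg1 = trans(seg1, v)
for v in [4, 5, 6]:
    seg2 = trans(seg2, v)

assert finalize(combine(seg1, seg2)) == 3.5
```

Note that a correct combine function needs the unfinalized state: combining two already-finalized averages (2.0 and 5.0) would require the counts that finalize discards.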
  30. 25 Aug, 2018 1 commit
    • D
      Add tests ensuring correct handling of Full and left outer joins · 17de967d
      Authored by Dhanashree Kashid
      1. Add a test for a full outer join query on varchar columns.
      In such a scenario, the planner expects a RelabelType node on top of the
      varchar column while looking up a Sort operator. Please refer to commit
      fab435e for more details. Add a test for such queries and disable
      hashjoin to make sure that the planner is able to generate a plan with a
      merge join successfully.
      
      2. Add a test for a query with an Agg and a left outer join.
      This test ensures that ORCA produces correct results by performing a
      two-stage aggregation on top of a co-located join. A corresponding plan
      test has been added in the ORCA test suite.
      17de967d
  31. 15 Aug, 2018 1 commit
  32. 03 Aug, 2018 1 commit
  33. 02 Aug, 2018 1 commit
    • R
      Merge with PostgreSQL 9.2beta2. · 4750e1b6
      Authored by Richard Guo
      This is the final batch of commits from PostgreSQL 9.2 development,
      up to the point where the REL9_2_STABLE branch was created, and 9.3
      development started on the PostgreSQL master branch.
      
      Notable upstream changes:
      
      * Index-only scan was included in the batch of upstream commits. It
        allows queries to retrieve data only from indexes, avoiding heap access.
      
      * Group commit was added to work effectively under heavy load. Previously,
        batching of commits became ineffective as the write workload increased,
        because of internal lock contention.
      
      * A new fast-path lock mechanism was added to reduce the overhead of
        taking and releasing certain types of locks which are taken and released
        very frequently but rarely conflict.
      
      * The new "parameterized path" mechanism was added. It allows inner index
        scans to use values from relations that are more than one join level up
        from the scan. This can greatly improve performance in situations where
        semantic restrictions (such as outer joins) limit the allowed join orderings.
      
      * SP-GiST (Space-Partitioned GiST) index access method was added to support
        unbalanced partitioned search structures. For suitable problems, SP-GiST can
        be faster than GiST in both index build time and search time.
      
      * Checkpoints now are performed by a dedicated background process. Formerly
        the background writer did both dirty-page writing and checkpointing. Separating
        this into two processes allows each goal to be accomplished more predictably.
      
      * Custom plan was supported for specific parameter values even when using
        prepared statements.
      
      * API for FDW was improved to provide multiple access "paths" for their tables,
        allowing more flexibility in join planning.
      
      * Security_barrier option was added for views to prevent optimizations
        that might allow view-protected data to be exposed to users.
      
      * Range data type was added to store a lower and upper bound belonging to its
        base data type.
      
      * CTAS (CREATE TABLE AS/SELECT INTO) is now treated as a utility
        statement. The SELECT query is planned during the execution of the
        utility. To conform to this change, GPDB executes the utility
        statement only on the QD and dispatches the plan of the SELECT query
        to QEs.
      Co-authored-by: Adam Lee <ali@pivotal.io>
      Co-authored-by: Alexandra Wang <lewang@pivotal.io>
      Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
      Co-authored-by: Asim R P <apraveen@pivotal.io>
      Co-authored-by: Daniel Gustafsson <dgustafsson@pivotal.io>
      Co-authored-by: Gang Xiong <gxiong@pivotal.io>
      Co-authored-by: Haozhou Wang <hawang@pivotal.io>
      Co-authored-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
      Co-authored-by: Jesse Zhang <sbjesse@gmail.com>
      Co-authored-by: Jinbao Chen <jinchen@pivotal.io>
      Co-authored-by: Joao Pereira <jdealmeidapereira@pivotal.io>
      Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
      Co-authored-by: Paul Guo <paulguo@gmail.com>
      Co-authored-by: Richard Guo <guofenglinux@gmail.com>
      Co-authored-by: Shujie Zhang <shzhang@pivotal.io>
      Co-authored-by: Taylor Vesely <tvesely@pivotal.io>
      Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
      4750e1b6
  34. 01 Aug, 2018 2 commits