Commits · b4e2e3e2280ca81d1a3a1b20d9659e229238ed18 · Greenplum / Gpdb

14 Mar, 2019 1 commit

Rename legacy planner to Postgres planner · b4e2e3e2

Committed by Daniel Gustafsson on Mar 14, 2019

As we merge with upstream, and thereby keep refining the Postgres
planner, "legacy planner" is no longer a suitable name. This changes
all variations of the spelling (legacy planner, legacy optimizer,
legacy query optimizer, etc.) to say "Postgres" rather than "legacy".

Reviewed-by: Venkatesh Raghavan <vraghavan@pivotal.io>
Reviewed-by: David Yozie <dyozie@pivotal.io>
Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>

b4e2e3e2

11 Sep, 2018 1 commit

Make planner generate redistribute-motion. · a4cbf586

Committed by ZhangJackey on Sep 11, 2018

When doing an inner join, we test whether we can use a redistribute motion
by calling the function cdbpath_partkeys_from_preds. But if a_partkey is NIL
(it is NIL at the beginning of the function), we append nothing to it, so the
function can only return false. As a result, the planner can only generate a
broadcast motion for the inner relation.

We fix this by applying the same logic as for an outer join.

A WTS node is immovable; this commit adds some code to handle it.

Co-authored-by: Shujie Zhang <shzhang@pivotal.io>
Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>

a4cbf586

07 Sep, 2018 1 commit

Remove start_equiv/end_equiv support from gpdiff. · 2a5b7355

Committed by Heikki Linnakangas on Sep 07, 2018

We can manage without it. Convert them into human-oriented comments, and
rely on the usual "compare with expected output" method for all of these
tests.

Discussion: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/lrJFgQR-KhI/KFTnrJj2BQAJ

2a5b7355

03 Aug, 2018 1 commit

Revert "Merge with PostgreSQL 9.2beta2." · e0aa3ef2

Committed by Karen Huddleston on Aug 02, 2018

This reverts commit 4750e1b6.

e0aa3ef2

02 Aug, 2018 1 commit

Merge with PostgreSQL 9.2beta2. · 4750e1b6

Committed by Richard Guo on Aug 02, 2018

This is the final batch of commits from PostgreSQL 9.2 development,
up to the point where the REL9_2_STABLE branch was created, and 9.3
development started on the PostgreSQL master branch.

Notable upstream changes:

* Index-only scan was included in the batch of upstream commits. It
  allows queries to retrieve data only from indexes, avoiding heap access.

* Group commit was added to work effectively under heavy load. Previously,
  batching of commits became ineffective as the write workload increased,
  because of internal lock contention.

* A new fast-path lock mechanism was added to reduce the overhead of
  taking and releasing certain types of locks which are taken and released
  very frequently but rarely conflict.

* The new "parameterized path" mechanism was added. It allows inner index
  scans to use values from relations that are more than one join level up
  from the scan. This can greatly improve performance in situations where
  semantic restrictions (such as outer joins) limit the allowed join orderings.

* SP-GiST (Space-Partitioned GiST) index access method was added to support
  unbalanced partitioned search structures. For suitable problems, SP-GiST can
  be faster than GiST in both index build time and search time.

* Checkpoints now are performed by a dedicated background process. Formerly
  the background writer did both dirty-page writing and checkpointing. Separating
  this into two processes allows each goal to be accomplished more predictably.

* Custom plans are now supported for specific parameter values even when using
  prepared statements.

* API for FDW was improved to provide multiple access "paths" for their tables,
  allowing more flexibility in join planning.

* Security_barrier option was added for views to prevent optimizations that
  might allow view-protected data to be exposed to users.

* Range data type was added to store a lower and upper bound belonging to its
  base data type.

* CTAS (CREATE TABLE AS/SELECT INTO) is now treated as utility statement. The
  SELECT query is planned during the execution of the utility. To conform to
  this change, GPDB executes the utility statement only on QD and dispatches
  the plan of the SELECT query to QEs.

Co-authored-by: Adam Lee <ali@pivotal.io>
Co-authored-by: Alexandra Wang <lewang@pivotal.io>
Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
Co-authored-by: Asim R P <apraveen@pivotal.io>
Co-authored-by: Daniel Gustafsson <dgustafsson@pivotal.io>
Co-authored-by: Gang Xiong <gxiong@pivotal.io>
Co-authored-by: Haozhou Wang <hawang@pivotal.io>
Co-authored-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
Co-authored-by: Jesse Zhang <sbjesse@gmail.com>
Co-authored-by: Jinbao Chen <jinchen@pivotal.io>
Co-authored-by: Joao Pereira <jdealmeidapereira@pivotal.io>
Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
Co-authored-by: Paul Guo <paulguo@gmail.com>
Co-authored-by: Richard Guo <guofenglinux@gmail.com>
Co-authored-by: Shujie Zhang <shzhang@pivotal.io>
Co-authored-by: Taylor Vesely <tvesely@pivotal.io>
Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>

4750e1b6

09 Jul, 2018 1 commit

Use a penalty cost to implement enable_* planner GUCs, like in upstream. · 8fcd3fdd

Committed by Heikki Linnakangas on Feb 05, 2018

Instead of completely disabling the generation of Paths with disabled
plan types, add a high penalty to their cost estimates, like in the
upstream. This reduces our diff vs. upstream, making future merges more
straightforward.
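
The difference between the two approaches can be illustrated with a small toy model. This is only a sketch, not Greenplum or PostgreSQL source code: the `Path` class, the `choose_by_pruning` and `choose_by_penalty` helpers, and the `DISABLE_COST` value are all hypothetical names and numbers chosen for illustration.

```python
# Toy model of "disable by pruning" versus "disable by penalty cost".
# Nothing here is taken from the Greenplum or PostgreSQL code base.
from dataclasses import dataclass

# Illustrative penalty; assumed to be far larger than any real cost estimate.
DISABLE_COST = 1.0e10


@dataclass
class Path:
    kind: str    # plan type, e.g. "seqscan" or "indexscan"
    cost: float  # estimated total cost


def choose_by_pruning(paths, enabled):
    """Disabled plan types are dropped before the cheapest path is picked."""
    candidates = [p for p in paths if enabled.get(p.kind, True)]
    return min(candidates, key=lambda p: p.cost) if candidates else None


def choose_by_penalty(paths, enabled):
    """Every path is kept; disabled plan types get a large cost penalty."""
    def effective_cost(p):
        penalty = 0.0 if enabled.get(p.kind, True) else DISABLE_COST
        return p.cost + penalty
    return min(paths, key=effective_cost) if paths else None


paths = [Path("seqscan", 100.0), Path("indexscan", 250.0)]

# Only the sequential scan is disabled: both strategies agree.
only_seq_off = {"seqscan": False}
# Every available plan type is disabled: pruning leaves no candidate,
# while the penalty approach still returns the cheapest path.
all_off = {"seqscan": False, "indexscan": False}

print(choose_by_pruning(paths, only_seq_off))
print(choose_by_penalty(paths, only_seq_off))
print(choose_by_pruning(paths, all_off))
print(choose_by_penalty(paths, all_off))
```

In this sketch the two strategies differ only when every candidate plan type is disabled: pruning yields no path at all, whereas the penalty approach still produces one.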

Discussion: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/Az2cDcqf73g/_tY6Yv1kBgAJ

Co-authored-by: Daniel Gustafsson <dgustafsson@pivotal.io>
Reviewed-by: Venkatesh Raghavan <vraghavan@pivotal.io>
Reviewed-by: Richard Guo <riguo@pivotal.io>

8fcd3fdd

27 Sep, 2017 1 commit

Update test output files and indentation · 102295b8

Committed by Ekta Khanna and Omer Arap on May 15, 2017

This commit updates test answer files after merging with e006a24a.

1. Replace `EXISTS Join` with `Semi Join`.
2. Replace `Left Anti Semi Join` with `Anti Join`.
3. Update the plan for `table_functions` for IN queries, as we do not pull
up the sublink and convert it into a join when the sublink testexpr does not
contain Vars of the parent query.
4. Update the error in `rangefuncs.out`, since we now create a different
plan for the following query:
```
CREATE TABLE foorescan (fooid int, foosubid int, fooname text, primary key(fooid,foosubid));
INSERT INTO foorescan values(5000,1,'abc.5000.1');
INSERT INTO foorescan values(5001,1,'abc.5001.1');
CREATE FUNCTION foorescan(int,int) RETURNS setof foorescan AS 'SELECT * FROM foorescan WHERE fooid >= $1 and fooid < $2 ;' LANGUAGE SQL;
SELECT * FROM foorescan f WHERE f.fooid IN (SELECT fooid FROM foorescan(5002,5004)) ORDER BY 1,2;
```
Plan before fix:
```
                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Sort  (cost=270.41..270.54 rows=50 width=50)
   Sort Key: f.fooid, f.foosubid
   ->  HashAggregate  (cost=268.50..269.00 rows=50 width=50)
         Group By: f.ctid::bigint, f.gp_segment_id
         ->  Hash Join  (cost=5.12..268.25 rows=50 width=29)
               Hash Cond: foorescan.fooid = f.fooid
               ->  Function Scan on foorescan  (cost=0.00..260.00 rows=1000 width=4)
               ->  Hash  (cost=4.50..4.50 rows=17 width=29)
                     ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..4.50 rows=50 width=29)
                           ->  Seq Scan on foorescan f  (cost=0.00..3.50 rows=17 width=29)
 Settings:  optimizer=off
 Optimizer status: legacy query optimizer
(12 rows)

```
Here the function scan is done on the master, and since the function
accesses a distributed relation, `cdbdisp_dispatchToGang()`
errors out.

Plan after fix:
```
 explain SELECT * FROM foorescan f WHERE f.fooid IN (SELECT fooid FROM foorescan(5002,5004)) ORDER BY 1,2;
                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Gather Motion 3:1  (slice3; segments: 3)  (cost=299.16..299.29 rows=50 width=19)
   Merge Key: f.fooid, f.foosubid
   ->  Sort  (cost=299.16..299.29 rows=17 width=19)
         Sort Key: f.fooid, f.foosubid
         ->  Hash Semi Join  (cost=292.50..297.75 rows=17 width=19)
               Hash Cond: f.fooid = foorescan.fooid
               ->  Redistribute Motion 3:3  (slice1; segments: 3)  (cost=0.00..4.50 rows=17 width=19)
                     Hash Key: f.fooid
                     ->  Seq Scan on foorescan f  (cost=0.00..3.50 rows=17 width=19)
               ->  Hash  (cost=280.00..280.00 rows=334 width=4)
                     ->  Redistribute Motion 1:3  (slice2)  (cost=0.00..280.00 rows=1000 width=4)
                           Hash Key: foorescan.fooid
                           ->  Function Scan on foorescan  (cost=0.00..260.00 rows=1000 width=4)
 Settings:  optimizer=off
 Optimizer status: legacy query optimizer
(15 rows)
```
With this new plan, the function scan is executed on a segment, in which case
`init_sql_fcache()` first walks the query tree and checks whether it is safe
to be planned and executed on the segment using `querytree_safe_for_segment_walker()`.
`querytree_safe_for_segment_walker()` errors out since the function is
accessing a distributed table. Both the new and old errors test the same scenario,
but because of the plan change, the place where we bail out is different.

Ref [#142355175]
Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>

102295b8

28 Oct, 2015 1 commit

Import Greenplum source code. · 6b0e52be

Committed by Initial Greenplum code dump on Oct 23, 2015

6b0e52be