1. 27 Sep 2017 (1 commit)
    • Update test output files and indentation · 102295b8
      Committed by Ekta Khanna and Omer Arap
      This commit updates test answer files after merging with e006a24a.
      
      1. Replace `EXISTS Join` with `Semi Join`.
      2. Replace `Left Anti Semi Join` with `Anti Join`.
      3. Updated the plan for `table_functions` for IN queries, as we do not pull
      up the sublink and convert it into a join when the sublink's testexpr does not
      contain Vars of the parent query (see the sketch after the plans below).
      4. Updated the error in `rangefuncs.out`, since we now create a different
      plan for the following query:
      ```
      CREATE TABLE foorescan (fooid int, foosubid int, fooname text, primary key(fooid,foosubid));
      INSERT INTO foorescan values(5000,1,'abc.5000.1');
      INSERT INTO foorescan values(5001,1,'abc.5001.1');
      CREATE FUNCTION foorescan(int,int) RETURNS setof foorescan AS 'SELECT * FROM foorescan WHERE fooid >= $1 and fooid < $2 ;' LANGUAGE SQL;
      SELECT * FROM foorescan f WHERE f.fooid IN (SELECT fooid FROM foorescan(5002,5004)) ORDER BY 1,2;
      ```
      Plan before fix :
      ```
                                                    QUERY PLAN
      ------------------------------------------------------------------------------------------------------
       Sort  (cost=270.41..270.54 rows=50 width=50)
         Sort Key: f.fooid, f.foosubid
         ->  HashAggregate  (cost=268.50..269.00 rows=50 width=50)
               Group By: f.ctid::bigint, f.gp_segment_id
               ->  Hash Join  (cost=5.12..268.25 rows=50 width=29)
                     Hash Cond: foorescan.fooid = f.fooid
                     ->  Function Scan on foorescan  (cost=0.00..260.00 rows=1000 width=4)
                     ->  Hash  (cost=4.50..4.50 rows=17 width=29)
                           ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..4.50 rows=50 width=29)
                                 ->  Seq Scan on foorescan f  (cost=0.00..3.50 rows=17 width=29)
       Settings:  optimizer=off
       Optimizer status: legacy query optimizer
      (12 rows)
      
      ```
      Here the function scan is done on the master, and since the function is
      accessing a distributed relation, `cdbdisp_dispatchToGang()` errors out.
      
      Plan after fix:
      ```
       explain SELECT * FROM foorescan f WHERE f.fooid IN (SELECT fooid FROM foorescan(5002,5004)) ORDER BY 1,2;
                                                    QUERY PLAN
      ------------------------------------------------------------------------------------------------------
       Gather Motion 3:1  (slice3; segments: 3)  (cost=299.16..299.29 rows=50 width=19)
         Merge Key: f.fooid, f.foosubid
         ->  Sort  (cost=299.16..299.29 rows=17 width=19)
               Sort Key: f.fooid, f.foosubid
               ->  Hash Semi Join  (cost=292.50..297.75 rows=17 width=19)
                     Hash Cond: f.fooid = foorescan.fooid
                     ->  Redistribute Motion 3:3  (slice1; segments: 3)  (cost=0.00..4.50 rows=17 width=19)
                           Hash Key: f.fooid
                           ->  Seq Scan on foorescan f  (cost=0.00..3.50 rows=17 width=19)
                     ->  Hash  (cost=280.00..280.00 rows=334 width=4)
                           ->  Redistribute Motion 1:3  (slice2)  (cost=0.00..280.00 rows=1000 width=4)
                                 Hash Key: foorescan.fooid
                                 ->  Function Scan on foorescan  (cost=0.00..260.00 rows=1000 width=4)
       Settings:  optimizer=off
       Optimizer status: legacy query optimizer
      (15 rows)
      ```
      With this new plan, the function scan is executed on a segment, in which case
      `init_sql_fcache()` first walks the query tree and checks whether it is safe
      to be planned and executed on the segment, using `querytree_safe_for_segment_walker()`.
      `querytree_safe_for_segment_walker()` errors out since the function is
      accessing a distributed table. Both the new and old errors test the same scenario;
      only the place where we bail out differs, because of the plan change.
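      
      For contrast with the plans above, here is a minimal sketch of the case described
      in item 3, where the sublink's testexpr contains no Vars of the parent query; the
      constant left-hand side is an assumption chosen purely for illustration and is not
      one of the updated test queries:
      ```
      -- The left-hand side of IN is a constant, so the sublink's testexpr contains
      -- no Vars of the parent query; such a sublink is not pulled up and converted
      -- into a join, and is planned as a subplan instead.
      EXPLAIN SELECT * FROM foorescan f
      WHERE 5000 IN (SELECT fooid FROM foorescan(5002, 5004));
      ```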
      
      Ref [#142355175]
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
  2. 25 Sep 2017 (1 commit)
    • Remove the concept of window "key levels". · b1651a43
      Committed by Heikki Linnakangas
      It wasn't very useful. ORCA and Postgres both just stack WindowAgg nodes
      on top of each other, and no-one's been unhappy about that, so we might as
      well do that, too. This reduces the difference between GPDB and the upstream
      implementation, and will hopefully make it smoother to switch.
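      
      As a hedged illustration, not taken from the commit: a query with two distinct
      window specifications, assuming a table t(id int, a int); each window
      specification is expected to get its own WindowAgg node, one stacked on top of
      the other:
      ```
      -- Two different window specifications over one SELECT; with this change the
      -- legacy planner, like ORCA and upstream Postgres, simply stacks one
      -- WindowAgg node on top of another instead of grouping them into key levels.
      SELECT id,
             sum(a) OVER (PARTITION BY id),
             rank() OVER (ORDER BY a)
      FROM t;
      ```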
      
      Rename the Window Plan node type to WindowAgg, to match upstream, now
      that it is fairly close to the upstream version.
  3. 10 May 2017 (1 commit)
    • Fix bug that partition selector may error out when its child is empty · e60b15d8
      Committed by Heikki Linnakangas
      The problem is that if ExecPartitionSelector() gets no tuples from the child
      node, it doesn't call InsertPidIntoDynamicTableScanInfo() at all. It should
      call InsertPidIntoDynamicTableScanInfo() in that case, with InvalidOid as the
      partition OID, to signal to PartSelectedExpr that it was executed and no
      partitions are selected.
      
      With this change, it will call InsertPidIntoDynamicTableScanInfo() after the
      last tuple has been read, always, even when there were no tuples at all.
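      
      A hedged reproduction sketch; the tables, data and object names below are
      assumptions for illustration, not taken from the commit, and whether a
      PartitionSelector is actually planned over the empty side depends on the chosen
      plan:
      ```
      -- dim is left empty, so if the planner places a PartitionSelector over its
      -- scan, ExecPartitionSelector() sees no tuples at all. Before the fix it then
      -- never called InsertPidIntoDynamicTableScanInfo(), and the PartSelected
      -- one-time filters over pt2's partitions could error out instead of simply
      -- selecting no partitions.
      CREATE TABLE dim (a int) DISTRIBUTED BY (a);
      CREATE TABLE pt2 (id int, b int) DISTRIBUTED BY (id)
        PARTITION BY RANGE (b) (START (0) END (5) EVERY (1));
      INSERT INTO pt2 SELECT i, i FROM generate_series(0, 4) i;
      SELECT * FROM dim, pt2 WHERE dim.a = pt2.b;
      ```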
  4. 04 May 2017 (1 commit)
    • Fix bug that partition selector may generate incomplete results for NLJ · befc063b
      Committed by Haisheng Yuan
      Fixed by setting material->cdb_strict to true if the outer child of the Material
      is a PartitionSelector. Before this patch, the following query returned
      incomplete results quite often:
      
      create table t(id int, a int);
      create table pt(id int, b int) DISTRIBUTED BY (id)
      PARTITION BY RANGE(b) (START (0) END (5) EVERY (1));
      insert into t select i, i from generate_series(0,4) i;
      insert into pt select i, i from generate_series(0,4) i;
      analyze t;
      analyze pt;
      set enable_hashjoin=off;
      set enable_nestloop=on;
      select * from t, pt where a = b;
      
      In a 3-segment cluster, it may return different results, as shown below:
      hyuan=# select * from t, pt where a = b;
       id | a | id | b
      ----+---+----+---
        0 | 0 |  0 | 0
        1 | 1 |  1 | 1
        2 | 2 |  2 | 2
      (3 rows)
      
      hyuan=# select * from t, pt where a = b;
       id | a | id | b
      ----+---+----+---
        3 | 3 |  3 | 3
        4 | 4 |  4 | 4
      (2 rows)
      
      hyuan=# select * from t, pt where a = b;
       id | a | id | b
      ----+---+----+---
        3 | 3 |  3 | 3
        4 | 4 |  4 | 4
        0 | 0 |  0 | 0
        1 | 1 |  1 | 1
        2 | 2 |  2 | 2
      (5 rows)
      
      Only the last one is the correct result.
      
      The plan for the above query is:
      -------------------------------------------------------------------
       Gather Motion 3:1  (slice2; segments: 3)
         ->  Nested Loop  (cost=2.27..9.00 rows=2 width=16)
               Join Filter: t.a = public.pt.b
               ->  Append  (cost=0.00..5.05 rows=2 width=8)
                     ->  Result  (cost=0.00..1.01 rows=1 width=8)
                           One-Time Filter: PartSelected
                           ->  Seq Scan on pt_1_prt_1 pt
                     ->  Result  (cost=0.00..1.01 rows=1 width=8)
                           One-Time Filter: PartSelected
                           ->  Seq Scan on pt_1_prt_2 pt
                     ->  Result  (cost=0.00..1.01 rows=1 width=8)
                           One-Time Filter: PartSelected
                           ->  Seq Scan on pt_1_prt_3 pt
                     ->  Result  (cost=0.00..1.01 rows=1 width=8)
                           One-Time Filter: PartSelected
                           ->  Seq Scan on pt_1_prt_4 pt
                     ->  Result  (cost=0.00..1.01 rows=1 width=8)
                           One-Time Filter: PartSelected
                           ->  Seq Scan on pt_1_prt_5 pt
               ->  Materialize  (cost=2.27..2.42 rows=5 width=8)
                     ->  Partition Selector for pt (dynamic scan id: 1)
                           Filter: t.a
                           ->  Broadcast Motion 3:3  (slice1; segments: 3)
                                 ->  Seq Scan on t
       Settings:  enable_hashjoin=off; enable_nestloop=on
       Optimizer status: legacy query optimizer
      
      The data distribution for tables t and pt in a 3-segment environment is:
      hyuan=# select gp_segment_id, * from t;
       gp_segment_id | id | a
      ---------------+----+---
                   1 |  3 | 3
                   1 |  4 | 4
                   0 |  0 | 0
                   0 |  1 | 1
                   0 |  2 | 2
      (5 rows)
      
      hyuan=# select gp_segment_id, * from pt;
       gp_segment_id | id | b
      ---------------+----+---
                   0 |  0 | 0
                   0 |  1 | 1
                   0 |  2 | 2
                   1 |  3 | 3
                   1 |  4 | 4
      (5 rows)
      
      Tuples {0,1,2} of t and pt are in segment 0, tuples {3,4} of t and pt are in
      segment 1. Segment 2 has no data for t and pt.
      
      In this query, the planner decides to prefetch the inner child to avoid a
      deadlock hazard, and cdb_strict of the Material is set to false. Let's see how
      the query executes on segment 0.
      
      1. The inner child of the nestloop join, the Material, fetches one tuple from
      the partition selector and then outputs it. Let's assume the output order of the
      partition selector/broadcast motion is {0,1,2,3,4}, so the first tuple output by
      the partition selector and the Material is 0.
      
      2. The partition selector decides that the selected partition for table pt is
      pt_1_prt_1, because t.a = pt.b = 0 falls into this partition. The outer child of
      the nestloop join, the Append, fetches one tuple from that partition, with pt.b=0.
      
      3. The nestloop join continues to execute the Material on the inner side to
      fetch the remaining tuples, 1, 2, 3 and 4, but none of these tuples from t
      matches the join condition, because pt.b = 0. No more tuples are output by the
      nestloop join for this round of the loop, but all the partitions of pt get
      matched and selected along the way.
      
      4. The nestloop join then fetches another tuple from the outer child, this time
      from pt_1_prt_2, which is 1; it matches a tuple from the inner child, so 1 is
      output. It next fetches the tuple from pt_1_prt_3, which is 2, which also
      matches, so 2 is output. But pt_1_prt_4 and pt_1_prt_5 have no data on this
      segment, so the output ends up as {0,1,2} on segment 0.
      
      On segment 1, let's again assume the tuple output order of the partition
      selector/broadcast motion is {0,1,2,3,4}. Since the first tuple output by the
      inner child is 0, only pt_1_prt_1 is selected at that point. But when the
      nestloop join tries to fetch a tuple from the outer child, which in this case
      fetches from pt_1_prt_1, it gets no tuple, because pt_1_prt_1 is empty on this
      segment. The nestloop join then decides that, since it can't fetch any tuple
      from the outer child, the outer side must be empty, so there is no need to
      execute the join; it returns NULL and finishes directly.
      
      Segment 2 has no data for t and pt, so it outputs no tuples. The final result
      gathered on the master is therefore {0,1,2} in this case. If the broadcast
      motion's output tuple order is {3,4,0,1,2} instead, the final result may be
      {3,4}. If the broadcast motion's output tuple order on segment 0 is {0,1,2,3,4}
      and on segment 1 is {3,4,0,1,2}, then the final result on the master is
      {0,1,2,3,4}, which happens to be correct.
      
      The bug is fixed by setting cdb_strict of the Material to true when the planner
      generates a partition selector for the inner child of a nestloop join. The
      Material then fetches all the tuples from its child and materializes them before
      emitting any tuple, which ensures that the partitions of pt are selected
      correctly.
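      
      A minimal way to sanity-check the fix, reusing t and pt from the reproduction
      above; running the query repeatedly and expecting the same five rows each time
      is just an assumed verification step, not part of the commit:
      ```
      -- With cdb_strict set on the Material above the partition selector, the
      -- Material drains its child before emitting anything, so every partition of
      -- pt that can match gets selected before the Append is scanned. The query
      -- should now return all 5 matching rows on every execution, regardless of
      -- the order in which the broadcast motion delivers tuples on each segment.
      set enable_hashjoin = off;
      set enable_nestloop = on;
      select * from t, pt where a = b;
      ```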
      
      RCA by Lirong Jian <jianlirong@gmail.com>
  5. 01 Apr 2017 (1 commit)
    • Use PartitionSelectors for partition elimination, even without ORCA. · e378d84b
      Committed by Heikki Linnakangas
      The old mechanism was to scan the complete plan, searching for a pattern
      with a Join, where the outer side included an Append node. The inner
      side was duplicated into an InitPlan, with the pg_partition_oid aggregate
      to collect the Oids of all the partitions that can match. That was
      inefficient and broken: if the duplicated plan was volatile, you might
      choose wrong partitions. And scanning the inner side twice can obviously
      be slow, if there are a lot of tuples.
      
      Rewrite the way such plans are generated. Instead of using an InitPlan,
      inject a PartitionSelector node into the inner side of the join.
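      
      As a sketch of what to look for, with the exact plan shape being an assumption
      rather than a quote from the commit: an EXPLAIN of a join between a
      non-partitioned and a partitioned table, such as the t and pt tables used in the
      partition-selector commit above, should now show a Partition Selector node
      injected into the inner side of the join instead of an InitPlan computing
      pg_partition_oid:
      ```
      -- Expected to show a "Partition Selector for pt" node on the inner side of
      -- the join rather than an InitPlan with the pg_partition_oid aggregate.
      EXPLAIN SELECT * FROM t, pt WHERE t.a = pt.b;
      ```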
      
      Fixes GitHub issues #2100 and #2116.
  6. 16 Feb 2017 (1 commit)
  7. 14 Feb 2017 (1 commit)
  8. 28 Oct 2015 (1 commit)