Commit 73fd01ca, author: Heikki Linnakangas

Relax assertions in setop planning, to accept execution on particular QE.

In setop planning, we had assertions that checked that FLOW_SINGLETON
flows had segindex=0. I'm not sure what segindex 0 means; is it "any"?
In any case, it's possible to have an input that resides on a single QE,
different from 0, as evidenced by the new test query.

Fixes https://github.com/greenplum-db/gpdb/issues/3807
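
The effect of relaxing the assertion can be illustrated with a small standalone sketch. It does not use the Greenplum sources: the `Flow` struct and both helper functions below are simplified, hypothetical stand-ins, and only the `FLOW_SINGLETON`, `FLOW_PARTITIONED`, `flotype` and `segindex` names are borrowed from the diff.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the planner's flow description. */
typedef enum { FLOW_SINGLETON, FLOW_PARTITIONED } FlowType;

typedef struct
{
    FlowType flotype;   /* how the rows are spread over processes */
    int      segindex;  /* segment a singleton flow runs on */
} Flow;

/* Condition of the old assertion: a singleton flow had to be on segindex 0. */
static bool
gather_input_ok_strict(const Flow *flow)
{
    return flow->flotype == FLOW_PARTITIONED ||
           (flow->flotype == FLOW_SINGLETON && flow->segindex == 0);
}

/* Condition of the relaxed assertion: any singleton flow is accepted. */
static bool
gather_input_ok_relaxed(const Flow *flow)
{
    return flow->flotype == FLOW_PARTITIONED ||
           flow->flotype == FLOW_SINGLETON;
}
```

Under these assumptions, a flow `{ FLOW_SINGLETON, 1 }` (a result focused on a single QE other than 0) fails the strict condition and passes the relaxed one, which is the situation the new test query exercises.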
Parent 20f2c007
......@@ -169,7 +169,15 @@ adjust_setop_arguments(PlannerInfo *root, List *planlist, GpSetOpType setop_type
break;
case CdbLocusType_SingleQE:
-Assert(subplanflow->flotype == FLOW_SINGLETON && subplanflow->segindex == 0);
+Assert(subplanflow->flotype == FLOW_SINGLETON);
+/*
+ * The input was focused on a single QE, but we need it in the QD.
+ * It's a bit silly to add a Motion to just move the whole result from
+ * single QE to QD, it would be better to produce the result in the
+ * QD in the first place, and avoid the Motion. But it's too late
+ * to modify the subplan.
+ */
+adjusted_plan = (Plan *) make_motion_gather_to_QD(root, subplan, NULL);
break;
......@@ -328,7 +336,7 @@ make_motion_gather(PlannerInfo *root, Plan *subplan, int segindex, List *sortPat
Assert(subplan->flow != NULL);
Assert(subplan->flow->flotype == FLOW_PARTITIONED ||
-(subplan->flow->flotype == FLOW_SINGLETON && subplan->flow->segindex == 0));
+subplan->flow->flotype == FLOW_SINGLETON);
if (sortPathKeys)
{
......
......@@ -211,6 +211,48 @@ select distinct a from (select distinct 'A' from (select 'C' from (select disti
B
(2 rows)
+-- Test case where input to one branch of UNION resides on a single segment, and another on the QD.
+-- The external table resides on QD, and the LIMIT on the test1 table forces the plan to be focused
+-- on a single QE.
+--
+CREATE TABLE test1 (id int);
+NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.
+HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
+insert into test1 values (1);
+CREATE EXTERNAL WEB TABLE test2 (id int) EXECUTE 'echo 2' ON MASTER FORMAT 'csv';
+(SELECT 'test1' as branch, id FROM test1 LIMIT 1)
+union
+(SELECT 'test2' as branch, id FROM test2);
+branch | id
+--------+----
+test1 | 1
+test2 | 2
+(2 rows)
+-- The plan you currently get for this has a Motion to move the data from the single QE to
+-- QD. That's a bit silly, it would probably make more sense to pull all the data to the QD
+-- in the first place, and execute the Limit in the QD, to avoid the extra Motion. But this
+-- is hopefully a pretty rare case.
+explain (SELECT 'test1' as branch, id FROM test1 LIMIT 1)
+union
+(SELECT 'test2' as branch, id FROM test2);
+QUERY PLAN
+----------------------------------------------------------------------------------------------------------------
+Unique (cost=1.06..1.07 rows=2 width=4)
+Group Key: "outer".branch, "*SELECT* 1".id
+-> Sort (cost=1.06..1.06 rows=2 width=4)
+Sort Key (Distinct): "outer".branch, "*SELECT* 1".id
+-> Append (cost=0.00..1.05 rows=2 width=4)
+-> Gather Motion 1:1 (slice2; segments: 1) (cost=0.00..1.04 rows=1 width=4)
+-> Subquery Scan on "*SELECT* 1" (cost=0.00..1.04 rows=1 width=4)
+-> Limit (cost=0.00..1.03 rows=1 width=4)
+-> Gather Motion 3:1 (slice1; segments: 3) (cost=0.00..1.03 rows=1 width=4)
+-> Limit (cost=0.00..1.01 rows=1 width=4)
+-> Seq Scan on test1 (cost=0.00..1.01 rows=1 width=4)
+-> External Scan on test2 (cost=0.00..0.00 rows=1 width=4)
+Optimizer: legacy query optimizer
+(13 rows)
--
-- Setup
--
......
......@@ -61,6 +61,26 @@ select distinct a from (select 'A' from (select distinct 'C' ) as bar union sel
select distinct a from (select distinct 'A' from (select distinct 'C' ) as bar union select distinct 'B') as foo(a);
select distinct a from (select distinct 'A' from (select 'C' from (select distinct 'D') as bar1 ) as bar union select distinct 'B') as foo(a);
+-- Test case where input to one branch of UNION resides on a single segment, and another on the QD.
+-- The external table resides on QD, and the LIMIT on the test1 table forces the plan to be focused
+-- on a single QE.
+--
+CREATE TABLE test1 (id int);
+insert into test1 values (1);
+CREATE EXTERNAL WEB TABLE test2 (id int) EXECUTE 'echo 2' ON MASTER FORMAT 'csv';
+(SELECT 'test1' as branch, id FROM test1 LIMIT 1)
+union
+(SELECT 'test2' as branch, id FROM test2);
+-- The plan you currently get for this has a Motion to move the data from the single QE to
+-- QD. That's a bit silly, it would probably make more sense to pull all the data to the QD
+-- in the first place, and execute the Limit in the QD, to avoid the extra Motion. But this
+-- is hopefully a pretty rare case.
+explain (SELECT 'test1' as branch, id FROM test1 LIMIT 1)
+union
+(SELECT 'test2' as branch, id FROM test2);
--
-- Setup
--
......