Unverified commit fa86f160, author: Ning Yu, committer: GitHub

Fix numsegments when appending multiple SingleQEs

When an Append node contained a SingleQE subpath, we used to put the Append
on ALL the segments. However, if the SingleQE is only partially distributed,
it cannot be placed on ALL the segments, and this conflict could result in
runtime errors or incorrect results.

To fix this, we should put the Append on the SingleQE's segments.

On the other hand, when there are multiple SingleQE subpaths, we should
put the Append on the common segments of the SingleQEs.
Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
Parent 2eef2ba2
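
The rule described in the commit message (start from ALL segments, then shrink to the smallest SingleQE subpath) can be sketched in isolation as follows. This is only an illustration under stated assumptions: `Locus`, `LocusType`, `ALL_NUMSEGMENTS`, and `append_numsegments` are hypothetical stand-ins invented for this sketch, not the real planner definitions, and the cluster size of 3 is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for GP_POLICY_ALL_NUMSEGMENTS; assumes a 3-segment cluster. */
#define ALL_NUMSEGMENTS 3

/* Hypothetical, simplified locus types; the real planner has more kinds. */
typedef enum { LOCUS_SINGLEQE, LOCUS_HASHED } LocusType;

typedef struct
{
	LocusType	type;
	int			numsegments;	/* segments the subpath is distributed on */
} Locus;

/*
 * Pick the numsegments for an Append over the given subpaths.
 * Default to all segments; if any subpath is SingleQE, take the
 * minimum numsegments among the SingleQE subpaths, i.e. the
 * segments they all have in common.
 */
static int
append_numsegments(const Locus *subpaths, size_t n)
{
	int			numsegments = ALL_NUMSEGMENTS;

	assert(subpaths != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
	{
		/* Non-SingleQE subpaths do not constrain the Append here. */
		if (subpaths[i].type == LOCUS_SINGLEQE &&
			subpaths[i].numsegments < numsegments)
			numsegments = subpaths[i].numsegments;
	}
	return numsegments;
}
```

Under these assumptions, two SingleQE subpaths on 2 and 3 segments yield 2 for the Append, while a list with no SingleQE subpath keeps the default of all segments.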
......@@ -1388,6 +1388,7 @@ set_append_path_locus(PlannerInfo *root, Path *pathnode, RelOptInfo *rel,
ListCell *l;
bool fIsNotPartitioned = false;
bool fIsPartitionInEntry = false;
int numsegments;
List *subpaths;
List **subpaths_out;
List *new_subpaths;
......@@ -1410,6 +1411,21 @@ set_append_path_locus(PlannerInfo *root, Path *pathnode, RelOptInfo *rel,
return;
}
/* By default put Append node on all the segments */
numsegments = GP_POLICY_ALL_NUMSEGMENTS;
foreach(l, subpaths)
{
Path *subpath = (Path *) lfirst(l);
/* If any subplan is SingleQE, align Append numsegments with it */
if (CdbPathLocus_IsSingleQE(subpath->locus))
{
/* When there are multiple SingleQE, use the common segments */
numsegments = Min(numsegments,
CdbPathLocus_NumSegments(subpath->locus));
}
}
/*
* Do a first pass over the children to determine if there's any child
* which is not partitioned, i.e. is a bottleneck or replicated.
......@@ -1465,28 +1481,17 @@ set_append_path_locus(PlannerInfo *root, Path *pathnode, RelOptInfo *rel,
if (!CdbPathLocus_IsSingleQE(subpath->locus))
{
CdbPathLocus singleQE;
/*
* It's important to ensure that all the subpaths can be
* gathered to the SAME segment, we must set the same
* numsegments for all the SingleQE, there are many
* options:
*
* 1. a constant 1;
* 2. Min(numsegments of all subpaths);
* 3. Max(numsegments of all subpaths);
* 4. ALL;
*
* Options 2 & 3 need to decide the value with an extra
* scan, option 1 puts all the SingleQE on segment 0
* which makes segment 0 a bottle neck. So we choose
* option 4, ALL helps to balance the load on all the
* segments and no extra scan is needed.
*/
int numsegments = GP_POLICY_ALL_NUMSEGMENTS;
/* Gather to SingleQE */
CdbPathLocus_MakeSingleQE(&singleQE, numsegments);
subpath = cdbpath_create_motion_path(root, subpath, subpath->pathkeys, false, singleQE);
}
else
{
/* Align all SingleQE to the common segments */
subpath->locus.numsegments = numsegments;
}
}
}
......
......@@ -97,9 +97,14 @@ begin;
abort;
-- restore the analyze information
analyze t1;
--
-- regression tests
--
-- append SingleQE of different sizes
select max(c1) as v, 1 as r from t2 union all select 1 as v, 2 as r;
v | r
---+---
| 1
1 | 2
(2 rows)
-- append node should use the max numsegments of all the subpaths
begin;
-- insert enough data to ensure executors got reached on segments
......
......@@ -55,9 +55,8 @@ abort;
-- restore the analyze information
analyze t1;
--
-- regression tests
--
-- append SingleQE of different sizes
select max(c1) as v, 1 as r from t2 union all select 1 as v, 2 as r;
-- append node should use the max numsegments of all the subpaths
begin;
......