Committed by Hao Wu
After some refactors of the path/planner code, the planner assumes that the modulus of the hash function used by a Motion equals the Gang size of the parent slice. This is not always true: some tables may be distributed on only a subset of the segments, which can happen during gpexpand. Consider a GPDB cluster with 2 segments that is running `gpexpand` to add a third segment. t1 and t2 are still distributed on the first 2 segments, while t3 has finished data transfer, i.e. t3 is distributed on all 3 segments. See the following plan:

gpadmin=# explain select t1.a, t2.b from t1 join t2 on t1.a = t2.b union all select * from t3;
                                                 QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Gather Motion 3:1  (slice1; segments: 3)  (cost=2037.25..1230798.15 rows=7499310 width=8)
   ->  Append  (cost=2037.25..1080811.95 rows=2499770 width=8)
         ->  Hash Join  (cost=2037.25..1005718.85 rows=2471070 width=8)
               Hash Cond: (t2.b = t1.a)
               ->  Redistribute Motion 2:3  (slice2; segments: 2)  (cost=0.00..2683.00 rows=43050 width=4)
                     Hash Key: t2.b
                     ->  Seq Scan on t2  (cost=0.00..961.00 rows=43050 width=4)
               ->  Hash  (cost=961.00..961.00 rows=28700 width=4)
                     ->  Seq Scan on t1  (cost=0.00..961.00 rows=28700 width=4)
         ->  Seq Scan on t3  (cost=0.00..961.00 rows=28700 width=8)
 Optimizer: Postgres query optimizer

Slice2 shows that t2 is redistributed to all 3 segments and then joined with t1. Since t1 is distributed only on the first 2 segments, any t2 rows redistributed to the third segment can never find a match, so the query returns wrong results. The root cause is that the modulus of the cdb hash used to redistribute t2 is 3, i.e. the Gang size of the parent slice. To fix this issue, we add a field to Motion that records the number of receivers.
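The mis-routing described above can be illustrated with a toy simulation. This is not GPDB's actual cdbhash; the hash function and row values are made up for illustration. It shows that when t2 is redistributed with modulus 3 (the parent Gang size) while t1 was placed with modulus 2, some t2 rows land on a segment where their t1 match cannot exist.

```python
# Toy model of hash-distribution routing (NOT GPDB's actual cdbhash).
def hash_val(v):
    # Deterministic stand-in for the real hash function.
    return v * 2654435761 % 2**32

def route(value, modulus):
    # Target segment = hash of the distribution key modulo the modulus.
    return hash_val(value) % modulus

t1_rows = [1, 2, 3, 4, 5]
t2_rows = [1, 2, 3, 4, 5]

# t1 lives on segments {0, 1}: its rows were placed with modulus 2.
placement_t1 = {v: route(v, 2) for v in t1_rows}

# Buggy redistribute of t2 uses the parent Gang size (3) as the modulus.
buggy_t2 = {v: route(v, 3) for v in t2_rows}

# Any t2 row routed to a different segment than its t1 match loses the join.
lost = [v for v in t2_rows if buggy_t2[v] != placement_t1[v]]
print("rows losing their join match:", lost)
```

With the correct modulus (2), `route(v, 2)` would send each t2 row to exactly the segment holding its t1 match, which is what the patch restores.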
With this patch, the plan generated is:

gpadmin=# explain select * from t1 join t2 on t1.a = t2.b union all select a,a,b,b from t3;
                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------
 Gather Motion 3:1  (slice2; segments: 3)  (cost=2.26..155.16 rows=20 width=16)
   ->  Append  (cost=2.26..155.16 rows=7 width=16)
         ->  Hash Join  (cost=2.26..151.97 rows=7 width=16)
               Hash Cond: (t1.a = t2.b)
               ->  Seq Scan on t1  (cost=0.00..112.06 rows=5003 width=8)
               ->  Hash  (cost=2.18..2.18 rows=2 width=8)
                     ->  Redistribute Motion 2:3  (slice1; segments: 2)  (cost=0.00..2.18 rows=3 width=8)
                           Hash Key: t2.b
                           Hash Module: 2
                           ->  Seq Scan on t2  (cost=0.00..2.06 rows=3 width=8)
         ->  Seq Scan on t3  (cost=0.00..3.06 rows=2 width=16)
 Optimizer: Postgres query optimizer

Note: the interconnect for the redistribute motion is still 2:3, but the data transfer only happens 2:2.

Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
Reviewed-by: Pengzhou Tang <ptang@pivotal.io>
Reviewed-by: Jesse Zhang <jzhang@pivotal.io>
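The note about the 2:3 interconnect versus the 2:2 data transfer can be sketched as follows. Again, this is a toy model, not GPDB code; the constants and hash function are illustrative only. The Motion carries an explicit hash modulus (the "Hash Module: 2" in the plan) that is independent of the receiving Gang size.

```python
# Toy model of the fixed routing (NOT GPDB's actual cdbhash).
def hash_val(v):
    # Deterministic stand-in for the real hash function.
    return v * 2654435761 % 2**32

def route(value, modulus):
    return hash_val(value) % modulus

GANG_SIZE = 3    # receivers set up by the interconnect (the "2:3" motion)
HASH_MODULE = 2  # segments the joined table t1 is actually distributed on

t2_rows = [1, 2, 3, 4, 5]
# The Motion hashes with HASH_MODULE, not GANG_SIZE.
targets = {v: route(v, HASH_MODULE) for v in t2_rows}

# Every row lands on segment 0 or 1; the third receiver gets no data,
# so the effective transfer is 2:2 even though the interconnect is 2:3.
print("segments receiving data:", sorted(set(targets.values())))
```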
f69978d2