- 05 Jan, 2017 6 commits
-
-
This merges both the code and the history of GPOS, the foundation abstraction library for ORCA. This commit contains minimal changes to ORCA's build files. There are a few remnants of the assumption that GPOS was a separate library. Those will be removed in a subsequent commit.
-
-
The libgpos CMakeLists.txt should contain sufficient information post-merge now.
-
-
-
-
- 24 Dec, 2016 2 commits
-
-
Committed by Dhanashree Kashid
Included DbgPrint for CCostContext, CGroupExpression & COptimizationContext
-
Committed by Dhanashree Kashid
Bumped ORCA version to 1.697
Signed-off-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
-
- 20 Dec, 2016 2 commits
-
-
Committed by Dhanashree Kashid
This enables ORCA to add a RandomMotion or RedistributeMotion on top of the CPhysicalExternalScan. The extra motion redistributes rows across all segments and hence improves the performance of parallel loading from an external table. This is consistent with Planner behavior. If it is a Master Only table, no extra motion is added. Bump the ORCA version to 696.
Signed-off-by: Xin Zhang <xzhang@pivotal.io>
-
Committed by Dhanashree Kashid
Expect a RandomMotion on top of the external table scan if the target table is distributed randomly. Expect a RedistributeMotion on top of the external table scan if the target table is hash distributed.
Signed-off-by: Xin Zhang <xzhang@pivotal.io>
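The rule described in the two commit messages above can be summarized in a small sketch. It is illustrative Python rather than ORCA code: the `TargetDistribution` enum and the `motion_above_external_scan` function are hypothetical names, and the sketch assumes that "Master Only" refers to the target table.

```python
from enum import Enum


class TargetDistribution(Enum):
    """Hypothetical labels for how the target table is distributed."""
    MASTER_ONLY = "master_only"
    RANDOM = "random"
    HASHED = "hashed"


def motion_above_external_scan(target):
    """Return the motion expected on top of the external scan, or None."""
    if target is TargetDistribution.MASTER_ONLY:
        # Master-only table: no extra motion is added.
        return None
    if target is TargetDistribution.RANDOM:
        return "RandomMotion"
    # Hash-distributed target table.
    return "RedistributeMotion"
```

In this sketch, `motion_above_external_scan(TargetDistribution.HASHED)` yields `"RedistributeMotion"`, mirroring the second commit's expectation.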
-
- 15 Dec, 2016 2 commits
-
-
Committed by Dhanashree Kashid
Signed-off-by: Omer Arap <oarap@pivotal.io>
-
Committed by Dhanashree Kashid
Signed-off-by: Omer Arap <oarap@pivotal.io>
-
- 14 Dec, 2016 2 commits
-
-
Committed by Dhanashree Kashid
optimizer_large_table_broadcast guc
Signed-off-by: Omer Arap <oarap@pivotal.io>
-
Committed by Dhanashree Kashid
Added a GUC, `optimizer_large_table_broadcast`, to set the threshold on the maximum number of rows to broadcast.
Signed-off-by: Omer Arap <oarap@pivotal.io>
-
- 08 Dec, 2016 4 commits
-
-
-
Before no-op motions were introduced, a hash-redistribute motion was intended to really move tuples around. It was justifiable that such motions required 'ANY' distribution, and hence that they shared optimization contexts when everything else was identical.

A no-op motion, however, is intended to _not_ move tuples. That is, a no-op motion to hash-redistribute by column `b` on top of a relation that's hash-distributed on column `a` is not only undesirable, it's completely unintended.

To rule out such plans, we made hash-redistribute motions a bit more intelligent when their distribution specifications say `no-op`: they now require the relation under them to be distributed exactly as the motions are (as opposed to `ANY`).

A nice side effect of this change is that no-op motions that distribute on columns not covered by the output columns of the child group will no longer match any group expressions. This is a bit nuanced, but a minimal query to reproduce looks something like:

```
CREATE TABLE foo (a int, b int) DISTRIBUTED BY (a);
EXPLAIN SELECT b, a FROM foo UNION ALL SELECT b, a FROM foo INTERSECT ALL SELECT b, a FROM foo;
```

And a (wrong) plan before this change looked like the following:

```
Physical plan:
+--CPhysicalMotionGather(master)
   +--CPhysicalParallelUnionAll
      |--CPhysicalMotionHashDistribute HASHED NO-OP: "a" (0)
      |  +--CPhysicalTableScan "foo" ("foo")
      +--CPhysicalMotionHashDistribute HASHED NO-OP: "gp_segment_id" (17)
         +--CPhysicalLeftSemiHashJoin
            |--CPhysicalSequenceProject (HASHED: "b" (10) "a" (9)
            |  |--CPhysicalSort ( (97,1.0), "b" (10), NULLsLast ) ( (97,1.0), "a" (9), NULLsLast )
            |  |  +--CPhysicalTableScan "foo" ("foo")
            |  +--CScalarProjectList
            |     +--CScalarProjectElement "row_number" (27)
            |        +--CScalarWindowFunc (row_number , Agg: false , Distinct: false)
            |--CPhysicalSequenceProject (HASHED: "b" (19) "a" (18)
            |  |--CPhysicalSort ( (97,1.0), "b" (19), NULLsLast ) ( (97,1.0), "a" (18), NULLsLast )
            |  |  +--CPhysicalTableScan "foo" ("foo")
            |  +--CScalarProjectList
            |     +--CScalarProjectElement "row_number" (28)
            |        +--CScalarWindowFunc (row_number , Agg: false , Distinct: false)
            +--CScalarBoolOp (EboolopAnd)
               |--CScalarBoolOp (EboolopNot)
               |  +--CScalarIsDistinctFrom (=)
               |     |--CScalarIdent "b" (10)
               |     +--CScalarIdent "b" (19)
               |--CScalarBoolOp (EboolopNot)
               |  +--CScalarIsDistinctFrom (=)
               |     |--CScalarIdent "a" (9)
               |     +--CScalarIdent "a" (18)
               +--CScalarBoolOp (EboolopNot)
                  +--CScalarIsDistinctFrom (=)
                     |--CScalarIdent "row_number" (27)
                     +--CScalarIdent "row_number" (28)
```

Where the memo group of the no-op motion on `gp_segment_id (17)` was like:

```
Group 3 (#GExprs: 14):
  0: CLogicalIntersectAll Output: ("b" (10), "a" (9)), Input: [("b" (10), "a" (9)), ("b" (19), "a" (18))] [ 1 2 ]
  1: CLogicalLeftSemiJoin [ 8 11 24 ]
  2: CLogicalGbAggDeduplicate( Global ) Grp Cols: ["a" (9), "b" (10), "ctid" (11), "gp_segment_id" (17), "row_number" (27)], Minimal Grp Cols: [], Join Child Keys: ["ctid" (11), "gp_segment_id" (17)], Generates Duplicates :[ 0 ] [ 25 26 ]
  3: CLogicalGbAggDeduplicate( Global ) Grp Cols: ["a" (9), "b" (10), "ctid" (11), "gp_segment_id" (17), "row_number" (27)], Minimal Grp Cols: ["a" (9), "b" (10), "ctid" (11), "gp_segment_id" (17), "row_number" (27)], Join Child Keys: ["ctid" (11), "gp_segment_id" (17)], Generates Duplicates :[ 1 ] [ 27 26 ]
  4: CPhysicalStreamAggDeduplicate( Global ) Grp Cols: ["a" (9), "b" (10), "ctid" (11), "gp_segment_id" (17), "row_number" (27)], Key Cols:["ctid" (11), "gp_segment_id" (17)], Generates Duplicates :[ 1 ] (High) [ 27 26 ]
    Cost Ctxts:
      main ctxt (stage 0)1.1, child ctxts:[0], rows:1.000000 (group), cost: 862.001458
      main ctxt (stage 0)3.1, child ctxts:[0], rows:1.000000 (group), cost: 862.001458
  5: CPhysicalStreamAggDeduplicate( Global ) Grp Cols: ["a" (9), "b" (10), "ctid" (11), "gp_segment_id" (17), "row_number" (27)], Key Cols:["ctid" (11), "gp_segment_id" (17)], Generates Duplicates :[ 0 ] [ 25 26 ]
    Cost Ctxts:
      main ctxt (stage 0)1.1, child ctxts:[4], rows:1.000000 (group), cost: 862.001429
      main ctxt (stage 0)3.1, child ctxts:[4], rows:1.000000 (group), cost: 862.001429
  6: CPhysicalLeftSemiHashJoin (High) [ 8 11 24 ]
    Cost Ctxts:
      main ctxt (stage 0)1.0, child ctxts:[5, 6], rows:1.000000 (group), cost: 862.001394
      main ctxt (stage 0)1.1, child ctxts:[3, 5], rows:1.000000 (group), cost: 862.001294
      main ctxt (stage 0)1.2, child ctxts:[4, 4], rows:1.000000 (group), cost: 862.001375
      main ctxt (stage 0)1.3, child ctxts:[3, 3], rows:1.000000 (group), cost: 862.001294
      main ctxt (stage 0)1.5, child ctxts:[2, 2], rows:1.000000 (group), cost: 862.002153
      main ctxt (stage 0)1.6, child ctxts:[0, 0], rows:1.000000 (group), cost: 862.001652
      main ctxt (stage 0)3.0, child ctxts:[5, 6], rows:1.000000 (group), cost: 862.001394
      main ctxt (stage 0)3.1, child ctxts:[3, 5], rows:1.000000 (group), cost: 862.001294
      main ctxt (stage 0)3.2, child ctxts:[4, 4], rows:1.000000 (group), cost: 862.001375
      main ctxt (stage 0)3.3, child ctxts:[3, 3], rows:1.000000 (group), cost: 862.001294
      main ctxt (stage 0)3.5, child ctxts:[2, 2], rows:1.000000 (group), cost: 862.002153
  7: CPhysicalLeftSemiNLJoin [ 8 11 24 ]
    Cost Ctxts:
      main ctxt (stage 0)2.1, cost lower bound: 1290.000233 PRUNED
      main ctxt (stage 0)1.1, cost lower bound: 1290.000233 PRUNED
      main ctxt (stage 0)3.1, cost lower bound: 1290.000233 PRUNED
      main ctxt (stage 0)0.1, cost lower bound: 1290.000233 PRUNED
  8: CPhysicalMotionHashDistribute HASHED NO-OP: [ +--CScalarIdent "a" (9) , nulls colocated ] [ 3 ]
    Cost Ctxts:
      main ctxt (stage 0)0.0, child ctxts:[1], rows:1.000000 (group), cost: 862.001294
  9: CPhysicalMotionHashDistribute HASHED NO-OP: [ +--CScalarIdent "row_number" (27) origin: [Grp:20, GrpExpr:0] , nulls colocated ] [ 3 ]
    Cost Ctxts:
      main ctxt (stage 0)0.0, child ctxts:[1], rows:1.000000 (group), cost: 862.001294
  10: CPhysicalMotionHashDistribute HASHED NO-OP: [ +--CScalarIdent "b" (10) origin: [Grp:12, GrpExpr:0] , nulls colocated ] [ 3 ]
    Cost Ctxts:
      main ctxt (stage 0)0.0, child ctxts:[1], rows:1.000000 (group), cost: 862.001294
  11: CPhysicalMotionHashDistribute HASHED NO-OP: [ +--CScalarIdent "gp_segment_id" (17) , nulls colocated ] [ 3 ]
    Cost Ctxts:
      main ctxt (stage 0)0.0, child ctxts:[1], rows:1.000000 (group), cost: 862.001294
  12: CPhysicalMotionHashDistribute STRICT HASHED: [ +--CScalarIdent "b" (10) +--CScalarIdent "a" (9) , nulls colocated ] [ 3 ]
    Cost Ctxts:
      main ctxt (stage 0)2.0, child ctxts:[1], rows:1.000000 (group), cost: 862.001315
  13: CPhysicalMotionRandom [ 3 ]
    Cost Ctxts:
  Grp OptCtxts:
    2 (stage 0): (req cols: ["a" (9), "b" (10)], req CTEs: [], req order: [<empty> match: satisfy ], req dist: [STRICT HASHED: [ +--CScalarIdent "b" (10) +--CScalarIdent "a" (9) , nulls colocated ] match: exact], req rewind: [NON-REWINDABLE match: satisfy], req partition propagation: [Filters: [] match: satisfy ]) => Best Expr:12
    1 (stage 0): (req cols: ["a" (9), "b" (10)], req CTEs: [], req order: [<empty> match: satisfy ], req dist: [ANY EOperatorId: 122 match: satisfy], req rewind: [NON-REWINDABLE match: satisfy], req partition propagation: [Filters: [] match: satisfy ]) => Best Expr:6
    3 (stage 0): (req cols: ["a" (9), "b" (10)], req CTEs: [], req order: [<empty> match: satisfy ], req dist: [NON-SINGLETON (NON-REPLICATED) match: satisfy], req rewind: [NON-REWINDABLE match: satisfy], req partition propagation: [Filters: [] match: satisfy ]) => Best Expr:6
    0 (stage 0): (req cols: ["a" (9), "b" (10)], req CTEs: [], req order: [<empty> match: satisfy ], req dist: [HASHED NO-OP: [ +--CScalarIdent "b" (10) , nulls colocated ] match: exact], req rewind: [NON-REWINDABLE match: satisfy], req partition propagation: [Filters: [] match: satisfy ]) => Best Expr:11
```

After this change, we got our expected plan:

```
+--CPhysicalMotionGather(master)
   +--CPhysicalParallelUnionAll
      |--CPhysicalMotionHashDistribute HASHED NO-OP: [ +--CScalarIdent "a" (0) , nulls colocated ]
      |  +--CPhysicalTableScan "foo" ("foo")
      +--CPhysicalMotionHashDistribute HASHED NO-OP: [ +--CScalarIdent "a" (9) , nulls colocated ]
         +--CPhysicalLeftSemiHashJoin
            |--CPhysicalSequenceProject (HASHED: [ +--CScalarIdent "b" (10) +--CScalarIdent "a" (9) , nulls colocated ], [<empty>], [EMPTY FRAME])
            |  |--CPhysicalSort ( (97,1.0), "b" (10), NULLsLast ) ( (97,1.0), "a" (9), NULLsLast )
            |  |  +--CPhysicalTableScan "foo" ("foo")
            |  +--CScalarProjectList
            |     +--CScalarProjectElement "row_number" (27)
            |        +--CScalarWindowFunc (row_number , Agg: false , Distinct: false)
            |--CPhysicalSequenceProject (HASHED: [ +--CScalarIdent "b" (19) +--CScalarIdent "a" (18) , nulls colocated ], [<empty>], [EMPTY FRAME])
            |  |--CPhysicalSort ( (97,1.0), "b" (19), NULLsLast ) ( (97,1.0), "a" (18), NULLsLast )
            |  |  +--CPhysicalTableScan "foo" ("foo")
            |  +--CScalarProjectList
            |     +--CScalarProjectElement "row_number" (28)
            |        +--CScalarWindowFunc (row_number , Agg: false , Distinct: false)
            +--CScalarBoolOp (EboolopAnd)
               |--CScalarBoolOp (EboolopNot)
               |  +--CScalarIsDistinctFrom (=)
               |     |--CScalarIdent "b" (10)
               |     +--CScalarIdent "b" (19)
               |--CScalarBoolOp (EboolopNot)
               |  +--CScalarIsDistinctFrom (=)
               |     |--CScalarIdent "a" (9)
               |     +--CScalarIdent "a" (18)
               +--CScalarBoolOp (EboolopNot)
                  +--CScalarIsDistinctFrom (=)
                     |--CScalarIdent "row_number" (27)
                     +--CScalarIdent "row_number" (28)
```
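The requirement change described in this commit message can be illustrated with a toy model. This is a sketch in Python, not ORCA's C++ implementation: `HashedSpec`, `required_child_distribution`, and `child_satisfies` are hypothetical names, and distribution matching is reduced to comparing column tuples.

```python
from dataclasses import dataclass
from typing import Tuple

ANY = "ANY"  # stand-in for the 'ANY' distribution requirement


@dataclass(frozen=True)
class HashedSpec:
    """Toy hashed distribution spec: the hash columns plus a no-op flag."""
    columns: Tuple[str, ...]
    no_op: bool = False


def required_child_distribution(motion):
    """What a hash-redistribute motion asks of the relation beneath it."""
    if motion.no_op:
        # A no-op motion must not move tuples, so the child has to be
        # distributed exactly as the motion says.
        return HashedSpec(motion.columns)
    # A real motion moves tuples anyway, so any child distribution will do.
    return ANY


def child_satisfies(child, requirement):
    """True when the child's distribution meets the requirement."""
    if requirement == ANY:
        return True
    return child.columns == requirement.columns
```

With `foo` hash-distributed on `a`, a no-op motion on `b` no longer finds a matching child in this model, while a no-op motion on `a` and a regular motion on `b` both still do.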
-
[#134885333]
-
[#134885333]
-
- 30 Nov, 2016 3 commits
-
-
Committed by Jesse Zhang
Similar to greenplum-db/gpdb@2d6ab22. This is more portable, and Concourse allegedly has better caching support for `image_resource`. [ci skip]
-
Committed by Jesse Zhang
[ci skip]
-
Committed by Jesse Zhang
[ci skip]
-
- 29 Nov, 2016 3 commits
-
-
Committed by Xin Zhang
-
Committed by Jesse Zhang
Also, our build process should be agnostic to the distro. [ci skip]
-
Committed by Xin Zhang
When a subquery sits under a limit and the columns used by the predicates inside the subquery are also referenced outside it, `InferPredicates` under `PexprPreprocess` stops producing predicates under the limit. For example:

```
explain select 1 from (select * from foo where a = 1 and b = a limit 1) x;
                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Result  (cost=0.00..431.00 rows=1 width=4)
   ->  Result  (cost=0.00..431.00 rows=1 width=1)
         Filter: a = 1 AND b = 1
         ->  Limit  (cost=0.00..431.00 rows=1 width=8)
               ->  Gather Motion 1:1  (slice1; segments: 1)  (cost=0.00..431.00 rows=1 width=8)
                     ->  Table Scan on foo  (cost=0.00..431.00 rows=1 width=8)
                           Filter: a = 1 AND a = b
 Settings:  optimizer=on
 Optimizer status: PQO version 1.687
(9 rows)
```

In the example above, we expect `b=1` to also be produced on the `Table Scan on foo`. It is missing because `InferPredicates` does NOT generate duplicate predicates when parent operators have already generated them. The basic assumption is that all predicates should be generated as high as possible in the query plan, and that predicate pushdown will later move them to the proper location.

However, this assumption breaks when predicates would have to be pushed over a limit, because it is NOT semantically correct to push predicates from outside a limit to inside it. (It is totally fine to duplicate predicates outside the limit.)

The fix is to check the operator when deriving predicates from constraints. If we see a limit operator, we always generate all the predicates based on its constraint, since no predicates will be pushed down through the limit.
Signed-off-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
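The decision described in the fix can be sketched as a single function. This is a simplified Python illustration, not the actual `InferPredicates` code: the function name, its parameters, and the representation of predicates as strings are all assumptions made for the example.

```python
def predicates_to_generate(is_limit, from_constraint, already_generated_above):
    """Pick the predicates to generate at one operator.

    is_limit: whether the operator is a limit.
    from_constraint: predicates derivable from the operator's constraint.
    already_generated_above: predicates that parent operators produced.
    """
    if is_limit:
        # Nothing is pushed down through a limit, so relying on the
        # parents' copies would lose these predicates below the limit.
        return set(from_constraint)
    # Elsewhere, skip duplicates and let pushdown place the parents' copies.
    return set(from_constraint) - set(already_generated_above)
```

For the query above, calling it for the limit with `{"a = 1", "b = 1"}` in both sets returns both predicates, which is what allows `b = 1` to reach the scan.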
-
- 24 Nov, 2016 11 commits
-
-
Committed by Jesse Zhang
[ci skip]
-
Committed by Xin Zhang
Signed-off-by: HaiSheng Yuan <hyuan@pivotal.io>
-
Committed by Xin Zhang
This is done by changing the cost model to reduce the Motion cost when the distribution spec is CDistributionSpecHashedNoOp.

There are three distribution specs requested by CPhysicalParallelUnionAll:
- CDistributionSpecHashedNoOp: to produce the best-performing plan, aligned with the base table distribution.
- CDistributionSpecStrictHashed: to produce a parallel append plan by shuffling tuples using all distributable columns in the project list of UNION ALL.
- CDistributionSpecStrictRandom: to produce a parallel append plan by shuffling all tuples using random distribution.

Most updates to the minidump files reflect the costing change, and the plan id change that results from it. We also output DistributionSpec types under TF 101010 to clearly indicate the type of spec requested.
Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
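A rough sketch of the costing idea follows. The spec names come from the commit message; everything else (the linear per-row cost, the `no_op_factor` discount, and the function names) is a hypothetical simplification and does not reflect ORCA's actual cost formulas.

```python
# The three distribution specs requested by CPhysicalParallelUnionAll.
REQUESTS = (
    "CDistributionSpecHashedNoOp",
    "CDistributionSpecStrictHashed",
    "CDistributionSpecStrictRandom",
)


def motion_cost(rows, spec, per_row_cost=1.0, no_op_factor=0.1):
    """Toy motion cost: linear in rows, discounted for the no-op spec."""
    cost = rows * per_row_cost
    if spec == "CDistributionSpecHashedNoOp":
        # Hypothetical discount: a no-op motion moves no tuples.
        cost *= no_op_factor
    return cost


def cheapest_request(rows):
    """Return the request with the lowest toy motion cost."""
    return min(REQUESTS, key=lambda spec: motion_cost(rows, spec))
```

Under these assumptions the no-op request wins whenever it is available, which is the intent the commit message states: prefer the plan aligned with the base table distribution.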
-
Committed by Dhanashree Kashid
Union differs from UnionAll in that duplicates are removed from the result. Append is the same as UnionAll. This change just keeps the naming consistent to avoid confusion.
Signed-off-by: Xin Zhang <xzhang@pivotal.io>
-
Committed by Bhuvnesh Chaudhary
-
Committed by Jesse Zhang
This commit makes CTest back off when the load average is 4X the core count. This is similar in spirit to 03af732b, greenplum-db/gpdb@c83e696, and greenplum-db/gpos@bce4ed7. [ci skip]
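One way to get this kind of backoff is CTest's `--test-load` option, which avoids starting new tests while the CPU load is above the given level. Whether the commit uses this exact option is an assumption; the helper below is a hypothetical Python sketch that only builds the command line.

```python
import os


def ctest_command(cores=None, load_multiplier=4):
    """Build a ctest invocation that backs off at load_multiplier x cores."""
    cores = cores or os.cpu_count() or 1
    return [
        "ctest",
        "-j", str(cores),                             # run tests in parallel
        "--test-load", str(cores * load_multiplier),  # load threshold
    ]
```

The resulting list can be passed to `subprocess.run` from a build directory.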
-
Committed by Xin Zhang
Signed-off-by: HaiSheng Yuan <hyuan@pivotal.io>
-
Committed by Dhanashree Kashid
```
create table t (c int) distributed by (c);
xzhang=# explain select * from t, (select * from t union all select * from t) tt where t.c = tt.c;
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Gather Motion 3:1  (slice4; segments: 3)  (cost=0.00..862.00 rows=1 width=16)
   ->  Hash Join  (cost=0.00..862.00 rows=1 width=16)
         Hash Cond: public.t.c = public.t.c
***      ->  Redistribute Motion 3:3  (slice3; segments: 3)  (cost=0.00..431.00 rows=1 width=8)
               Hash Key: public.t.c
               ->  Append  (cost=0.00..431.00 rows=1 width=8)
                     ->  Redistribute Motion 3:3  (slice1; segments: 3)  (cost=0.00..431.00 rows=1 width=8)
                           Hash Key: public.t.c
                           ->  Table Scan on t  (cost=0.00..431.00 rows=1 width=8)
                     ->  Redistribute Motion 3:3  (slice2; segments: 3)  (cost=0.00..431.00 rows=1 width=8)
                           Hash Key: public.t.c
                           ->  Table Scan on t  (cost=0.00..431.00 rows=1 width=8)
         ->  Hash  (cost=431.00..431.00 rows=1 width=8)
               ->  Table Scan on t  (cost=0.00..431.00 rows=1 width=8)
 Settings:  optimizer=on
 Optimizer status: PQO version 1.687
(16 rows)
```

We have a redundant motion (marked by `***`) because parallel append can only derive random distribution. In the fix, we make parallel append follow the derivation logic of serial append.
Signed-off-by: HaiSheng Yuan <hyuan@pivotal.io>
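The commit message does not spell out the derivation logic of serial append. The Python sketch below therefore rests on a loudly stated assumption: that an append can derive a hashed distribution when every child is hashed on the same output positions, and falls back to random otherwise. The function name and data representation are hypothetical.

```python
def derive_append_distribution(children):
    """Derive an append's distribution from its children.

    children: one entry per child, either a tuple of output column
    positions the child is hashed on, or None if it is not hashed.
    ASSUMPTION: hashed is derivable only when all children agree.
    """
    if not children:
        return ("RANDOM", None)
    first = children[0]
    if first is not None and all(child == first for child in children):
        return ("HASHED", first)
    return ("RANDOM", None)
```

In the example above both children are hashed on `c` (position 0), so under this assumption the append derives a hashed distribution and the motion marked `***` becomes unnecessary.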
-
Committed by Xin Zhang
Signed-off-by: HaiSheng Yuan <hyuan@pivotal.io>
-
Committed by Xin Zhang
Signed-off-by: HaiSheng Yuan <hyuan@pivotal.io>
-
Committed by Xin Zhang
Signed-off-by: HaiSheng Yuan <hyuan@pivotal.io>
-
- 23 Nov, 2016 5 commits
-
-
Spooky things happening [ci skip]
-
Committed by Jesse Zhang
Good citizen [ci skip]
-
Committed by Jesse Zhang
[ci skip]
-
Committed by Jesse Zhang
[ci skip]
-
Committed by Jesse Zhang
[ci skip]
-