- 24 Nov, 2016 8 commits
-
-
Committed by Dhanashree Kashid
Union differs from UnionAll in that Union removes duplicates from the result. Append is the same as UnionAll, so keep the naming consistent to avoid confusion. Signed-off-by: Xin Zhang <xzhang@pivotal.io>
-
Committed by Bhuvnesh Chaudhary
-
Committed by Jesse Zhang
This commit makes CTest back off when the load average is 4X the core count. This is similar in spirit to 03af732b, greenplum-db/gpdb@c83e696, and greenplum-db/gpos@bce4ed7. [ci skip]
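The commit message does not say how the back-off is configured. One way to get this behaviour is CTest's `--test-load` option, which holds back new tests while the CPU load is above a threshold. The sketch below only builds such a command line; the helper name `ctest_command` and the choice to pass `-j` equal to the core count are assumptions for illustration, not taken from the commit.

```python
import os


def ctest_command(load_factor=4, cores=None):
    """Build a ctest invocation that backs off under high load.

    Hypothetical helper: the threshold is load_factor times the core count,
    mirroring the "4X the core count" rule described in the commit message.
    """
    cores = cores or os.cpu_count() or 1
    threshold = load_factor * cores
    # --test-load applies when tests run in parallel, so -j is passed as well.
    return ["ctest", "-j", str(cores), "--test-load", str(threshold)]
```

For example, on an 8-core machine this yields a threshold of 32.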
-
Committed by Xin Zhang
Signed-off-by: HaiSheng Yuan <hyuan@pivotal.io>
-
Committed by Dhanashree Kashid
```
create table t (c int) distributed by (c);
xzhang=# explain select * from t, (select * from t union all select * from t) tt where t.c = tt.c;
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Gather Motion 3:1 (slice4; segments: 3) (cost=0.00..862.00 rows=1 width=16)
   -> Hash Join (cost=0.00..862.00 rows=1 width=16)
        Hash Cond: public.t.c = public.t.c
***     -> Redistribute Motion 3:3 (slice3; segments: 3) (cost=0.00..431.00 rows=1 width=8)
              Hash Key: public.t.c
              -> Append (cost=0.00..431.00 rows=1 width=8)
                   -> Redistribute Motion 3:3 (slice1; segments: 3) (cost=0.00..431.00 rows=1 width=8)
                        Hash Key: public.t.c
                        -> Table Scan on t (cost=0.00..431.00 rows=1 width=8)
                   -> Redistribute Motion 3:3 (slice2; segments: 3) (cost=0.00..431.00 rows=1 width=8)
                        Hash Key: public.t.c
                        -> Table Scan on t (cost=0.00..431.00 rows=1 width=8)
        -> Hash (cost=431.00..431.00 rows=1 width=8)
             -> Table Scan on t (cost=0.00..431.00 rows=1 width=8)
 Settings: optimizer=on
 Optimizer status: PQO version 1.687
(16 rows)
```
The plan has a redundant motion (marked with `***`) because parallel append can only derive a random distribution. The fix makes parallel append follow the derivation logic of serial append. Signed-off-by: HaiSheng Yuan <hyuan@pivotal.io>
-
Committed by Xin Zhang
Signed-off-by: HaiSheng Yuan <hyuan@pivotal.io>
-
Committed by Xin Zhang
Signed-off-by: HaiSheng Yuan <hyuan@pivotal.io>
-
Committed by Xin Zhang
Signed-off-by: HaiSheng Yuan <hyuan@pivotal.io>
-
- 23 Nov, 2016 5 commits
-
-
Spooky things happening [ci skip]
-
Committed by Jesse Zhang
[ci skip]
-
Committed by Jesse Zhang
[ci skip]
-
Committed by Jesse Zhang
[ci skip]
-
Committed by Jesse Zhang
-
- 22 Nov, 2016 3 commits
-
-
Committed by Jesse Zhang
[ci skip]
-
Committed by Xin Zhang
Signed-off-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
-
Committed by Dhanashree Kashid
```
create table t (c int) distributed by (c);
xzhang=# explain select * from t, (select * from t union all select * from t) tt where t.c = tt.c;
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Gather Motion 3:1 (slice4; segments: 3) (cost=0.00..862.00 rows=1 width=16)
   -> Hash Join (cost=0.00..862.00 rows=1 width=16)
        Hash Cond: public.t.c = public.t.c
***     -> Redistribute Motion 3:3 (slice3; segments: 3) (cost=0.00..431.00 rows=1 width=8)
              Hash Key: public.t.c
              -> Append (cost=0.00..431.00 rows=1 width=8)
                   -> Redistribute Motion 3:3 (slice1; segments: 3) (cost=0.00..431.00 rows=1 width=8)
                        Hash Key: public.t.c
                        -> Table Scan on t (cost=0.00..431.00 rows=1 width=8)
                   -> Redistribute Motion 3:3 (slice2; segments: 3) (cost=0.00..431.00 rows=1 width=8)
                        Hash Key: public.t.c
                        -> Table Scan on t (cost=0.00..431.00 rows=1 width=8)
        -> Hash (cost=431.00..431.00 rows=1 width=8)
             -> Table Scan on t (cost=0.00..431.00 rows=1 width=8)
 Settings: optimizer=on
 Optimizer status: PQO version 1.687
(16 rows)
```
The plan has a redundant motion (marked with `***`) because parallel append can only derive a random distribution. The fix makes parallel append follow the derivation logic of serial append. Signed-off-by: Xin Zhang <xzhang@pivotal.io>
-
- 18 Nov, 2016 1 commit
-
-
Committed by Corbin Halliwill
Use new paths from GPDB repo
-
- 17 Nov, 2016 1 commit
-
-
Committed by Corbin Halliwill
The gpdb/concourse repo is getting cleaned up. This commit fixes paths for consumed files that are getting moved. [#134241435]
-
- 12 Nov, 2016 7 commits
-
-
Committed by Xin Zhang
Signed-off-by: Omer Arap <oarap@pivotal.io>
-
Committed by Xin Zhang
Signed-off-by: Omer Arap <oarap@pivotal.io>
-
Committed by Omer Arap
In general, it is beneficial to remove outer references from the GROUP BY list, because an outer reference is constant within each execution of the subquery. For example:
```
select a from t where c in (select count(s.j) from s group by s.i, t.b)
```
Here `t.b` can be removed safely, because `t.b` is constant for every execution of the IN subquery.

However, this is a problem when the outer reference is the only GROUP BY column and no aggregate functions are used. For example:
```
select a from t where c in (select distinct t.b from s)
```
This statement cannot be simplified further, because the query that results from removing the outer reference is invalid:
```
select a from t where c in (select distinct ??? from s)
```
Hence, we add a validation step in pre-processing to ensure the rewrite is correct. Signed-off-by: Xin Zhang <xzhang@pivotal.io>
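The rule described in this message can be sketched as a small function. This is a simplified model written for illustration, not GPORCA's pre-processing code: the function name `strip_outer_refs` and the representation of columns as plain strings are assumptions.

```python
def strip_outer_refs(group_by_cols, outer_refs, has_aggregates):
    """Drop outer references from a GROUP BY list when that is safe.

    Hypothetical model of the validation described above: if removing the
    outer references would leave no grouping columns and the subquery has
    no aggregate functions, the list is returned unchanged.
    """
    remaining = [col for col in group_by_cols if col not in outer_refs]
    if not remaining and not has_aggregates:
        # e.g. "select distinct t.b from s": nothing would be left to select.
        return list(group_by_cols)
    return remaining
```

With the two queries from the message, `strip_outer_refs(["s.i", "t.b"], {"t.b"}, True)` keeps only `s.i`, while `strip_outer_refs(["t.b"], {"t.b"}, False)` leaves the list untouched.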
-
Committed by Jesse Zhang
-
Committed by Venkatesh Raghavan
Outer filter should not be pushed down into a partition selector in a subquery with a LIMIT clause [#133827909]

The bug is in the derivation of the required partition propagation specification from the child. Before this fix, we pushed the partition constraints below the Limit, which returns wrong results.
-
Committed by Omer Arap
Signed-off-by: Xin Zhang <xzhang@pivotal.io>
-
Committed by Omer Arap
Parallel Union currently has 2 optimization requests. The second request is not satisfied when the children of the parallel union are randomly distributed. For the first request, if the output columns are not hashable, `CPhysicalParallelUnion` requires a random distribution. Since the children are randomly distributed, `CPhysicalTableScan` already satisfies the optimization request and no motion is created on top of `CPhysicalTableScan`.

To overcome this issue and enforce a `CPhysicalMotionRandom`, we introduce a new distribution spec called `CDistributionSpecRandomStrict`. This spec replaces the regular `CDistributionSpecRandom` in the first optimization request when no output column is redistributable. Below is the output after this commit:
```
explain select xmin from foo union all select xmin from bar;

Physical plan:
+--CPhysicalMotionGather(master) rows:1 width:4 rebinds:1 cost:431.000029 origin: [Grp:2, GrpExpr:3]
   +--CPhysicalParallelUnionAll rows:1 width:4 rebinds:1 cost:431.000014 origin: [Grp:2, GrpExpr:1]
      |--CPhysicalMotionRandom rows:1 width:38 rebinds:1 cost:431.000014 origin: [Grp:0, GrpExpr:2]
      |  +--CPhysicalTableScan "foo" ("foo") rows:1 width:38 rebinds:1 cost:431.000007 origin: [Grp:0, GrpExpr:1]
      +--CPhysicalMotionRandom rows:1 width:38 rebinds:1 cost:431.000014 origin: [Grp:1, GrpExpr:2]
         +--CPhysicalTableScan "bar" ("bar") rows:1 width:38 rebinds:1 cost:431.000007 origin: [Grp:1, GrpExpr:1]

                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Gather Motion 3:1 (slice3; segments: 3) (cost=0.00..431.00 rows=1 width=4)
   -> Append (cost=0.00..431.00 rows=1 width=4)
        -> Redistribute Motion 3:3 (slice1; segments: 3) (cost=0.00..431.00 rows=1 width=4)
             -> Table Scan on foo (cost=0.00..431.00 rows=1 width=4)
        -> Redistribute Motion 3:3 (slice2; segments: 3) (cost=0.00..431.00 rows=1 width=4)
             -> Table Scan on bar (cost=0.00..431.00 rows=1 width=4)
```
Signed-off-by: Xin Zhang <xzhang@pivotal.io>
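The effect of the stricter spec can be illustrated with a toy enforcement model. This is a sketch under stated assumptions, not ORCA's enforcement framework: the string constants, `satisfies`, and `plan_child` are hypothetical names, and the model assumes a table scan derives a plain random distribution while only a random motion delivers the strict variant.

```python
RANDOM = "CDistributionSpecRandom"
RANDOM_STRICT = "CDistributionSpecRandomStrict"


def satisfies(derived, required):
    """Return True if a derived distribution meets the required one."""
    if required == RANDOM:
        # A plain random request accepts any random distribution.
        return derived in (RANDOM, RANDOM_STRICT)
    if required == RANDOM_STRICT:
        # Assumption: the strict request is only met by a motion's output.
        return derived == RANDOM_STRICT
    raise ValueError(f"unknown distribution spec: {required}")


def plan_child(table, required):
    """Plan one child of the union; the top operator comes first."""
    plan = [f'CPhysicalTableScan "{table}"']
    derived = RANDOM  # assumption: the table is randomly distributed
    if not satisfies(derived, required):
        # The request is unmet, so a motion is enforced on top of the scan.
        plan.insert(0, "CPhysicalMotionRandom")
    return plan
```

In this model, `plan_child("foo", RANDOM)` returns the bare scan, whereas `plan_child("foo", RANDOM_STRICT)` places `CPhysicalMotionRandom` on top of it, matching the shape of the plan shown in the commit message.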
-
- 02 Nov, 2016 1 commit
-
-
Committed by Haisheng Yuan
gporca has a set of banned API calls that need to be allowed with the ALLOW_xxx macro in order for gpopt to compile. But it should be the responsibility of the library caller (GPDB/Orca) to take care of the function call. See the discussions on greenplum-db/gpdb#1136 and https://groups.google.com/a/greenplum.org/forum/#!topic/gpdb-dev/Mcw6JPav6h4
-
- 01 Nov, 2016 6 commits
-
-
Committed by Daniel Gustafsson
The trailing colon that precedes the timing info is added elsewhere, so remove it from the OPT string. Also remove the space between time and unit on Xforms to make it uniform with other metric printings, which makes parsing the output via scripts easier.
-
Orca failed to generate plans for the following query when parallel union is enabled: SELECT * FROM foo UNION SELECT * FROM bar;

Returning EpetRequired from the EpetDistribution of CPhysicalParallelUnionAll causes the enforcement framework to falsely introduce motions, which can make Orca fail to generate plans for some queries. When an unnecessary CPhysicalMotionHashDistribute is introduced on top of CPhysicalParallelUnionAll, the system does detect that it is unnecessary. However, the optimization flow is corrupted anyway, because COptimizationContexts are related to each other: one optimization context can be a child or a parent of another optimization context in another group.

To resolve the issue, share the same logic with CPhysicalSerialUnionAll by lifting the CPhysicalSerialUnionAll::EpetRequired code to the parent class CPhysicalUnionAll. Closes #116
-
Committed by Venkatesh Raghavan
NULL condition [#132928535]

Histograms have the following components:
* Histogram buckets
* Null frequency
* Distinct values that are not captured by the buckets in both the histograms

Consider the following scenario:
```
create table test_stat(a bigint, b integer, c varchar(1)) distributed by (a);
insert into test_stat select id, mod(id,100), 'J' from generate_series(1,1000000) id;
insert into test_stat select id, mod(id,100), null from generate_series(1,1000000) id;
select * from test_stat where c <> 'J' or c is null;
```
The above query has two predicates combined by an OR, both on the same column `c`. When computing the histogram of the OR, GPORCA did not account for the contribution of the null values and the remaining distinct values. This patch fixes the issue.
-
-
-
Committed by Xin Zhang
Revert "GPORCA computes wrong statistics when filter contains "<> or NOT IN + a IS NULL" condition [#132928535]"

This reverts commit 1dafa7fc. Signed-off-by: Omer Arap <oarap@pivotal.io>
-
- 29 Oct, 2016 7 commits
-
-
Committed by Venkatesh Raghavan
GPORCA computes wrong statistics when filter contains "<> or NOT IN + a IS NULL" condition [#132928535]

Histograms have the following components:
* Histogram buckets
* Null frequency
* Distinct values that are not captured by the buckets in both the histograms

Consider the following scenario:
```
create table test_stat(a bigint, b integer, c varchar(1)) distributed by (a);
insert into test_stat select id, mod(id,100), 'J' from generate_series(1,1000000) id;
insert into test_stat select id, mod(id,100), null from generate_series(1,1000000) id;
select * from test_stat where c <> 'J' or c is null;
```
The above query has two predicates combined by an OR, both on the same column `c`. When computing the histogram of the OR, GPORCA did not account for the contribution of the null values and the remaining distinct values. This patch fixes the issue.
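To see why the null frequency matters for this query, here is a simplified selectivity calculation. It is a sketch of the arithmetic only, not GPORCA's histogram code: the function name and parameters are assumptions, and the "remaining distinct values" component mentioned above is left out.

```python
def or_selectivity_same_column(non_null_match_fraction, null_fraction, includes_is_null):
    """Selectivity of "<comparison on c> OR c IS NULL" for one column.

    Hypothetical helper. A comparison such as c <> 'J' is never true for a
    NULL, so the two disjuncts select disjoint sets of rows and their
    contributions can simply be added.
    """
    selectivity = (1.0 - null_fraction) * non_null_match_fraction
    if includes_is_null:
        selectivity += null_fraction
    return selectivity
```

In the scenario above, half of the 2,000,000 rows have `c` NULL and every non-null value is `'J'`, so `or_selectivity_same_column(0.0, 0.5, True)` gives 0.5, that is 1,000,000 rows. Dropping the null contribution would estimate zero rows.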
-
Committed by Omer Arap
-
Committed by Omer Arap
Signed-off-by: Xin Zhang <xzhang@pivotal.io>
-
-
-
Previously, when the `optimizer_parallel_union` GUC was set, we generated only plans with `parallel union all` and omitted `serial union all`. In this commit, we generate plans for both `serial union all` and `parallel union all`. In some cases, the enforcement framework may not allow `parallel union all` to be part of a plan because of the newly introduced distribution specs and the motions it generates. It is better to keep the legacy union all implementation among the alternatives than to omit it completely. Signed-off-by: Xin Zhang <xzhang@pivotal.io>
-
We enable both `serial` and `parallel union all` in the search space, and we favor `parallel union all` if the `optimizer_parallel_union` GUC is enabled. `parallel union all` is costed at the cost of its most expensive child instead of the sum of its children's costs. This makes the framework choose `parallel union all` over `serial union all` when both operators are part of valid plan alternatives.
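The costing difference can be shown with a few lines of arithmetic. This is a simplified model for illustration, not ORCA's cost model: the function names are hypothetical, and the operators' own costs, any enforced motions, and the GUC check are ignored.

```python
def serial_union_all_cost(child_costs):
    """Serial union all: the children run one after another, so costs add up."""
    return sum(child_costs)


def parallel_union_all_cost(child_costs):
    """Parallel union all: costed at its most expensive child."""
    return max(child_costs)


def pick_union_all(child_costs):
    """Return the name of the cheaper alternative under this toy model."""
    alternatives = {
        "serial union all": serial_union_all_cost(child_costs),
        "parallel union all": parallel_union_all_cost(child_costs),
    }
    return min(alternatives, key=alternatives.get)
```

With two children costing 431.0 each, the serial alternative costs 862.0 and the parallel one 431.0, so the parallel operator wins whenever both alternatives are valid.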
-
- 28 Oct, 2016 1 commit
-
-