1. 21 Jul, 2017 (1 commit)
    • Cut down the runtime for CCostTest · 3a586479
      Haisheng Yuan authored
      Most of the time is spent running the 4 minidumps in the CalibratedCostModel
      test. The 4 minidumps are already covered by other test suites; moreover, the
      test just executes each minidump and doesn't even compare the generated plan.
      
      CalibratedCostModel is the default in GPDB right now, and most minidumps use
      the calibrated cost model, so they exercise CalibratedCostModel anyway.
      Remove EresUnittest_CalibratedCostModel, cutting the debug runtime of
      CCostTest from 814829 ms to 458 ms.
  2. 19 Jul, 2017 (2 commits)
  3. 15 Jul, 2017 (1 commit)
    • Remove part oid (#186) · 2c139497
      Bhuvnesh authored
      * Do not generate PartOid expression
      
      In GPDB, PartOidExpr is not used, yet ORCA still generates it. HAWQ, however,
      uses PartOid for sorting while inserting into Append Only Row / Parquet
      partitioned tables.
      
      This patch uses the storage type (Parquet) and the number of partitions of an
      Append Only row partitioned table to decide whether PartOid should be
      generated. GPDB does not support Parquet storage, and the GUC controlling the
      number of partitions above which a sort should be used is set to INT_MAX, a
      threshold that is never reached in practice; so in GPDB the PartOid
      expression is never generated, while HAWQ can still control its generation
      through its existing GUCs.
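      The decision described above can be sketched as a single predicate. This is a
      hypothetical illustration (the function name, parameters, and shape are not
      ORCA's actual API):

      ```cpp
      #include <cassert>
      #include <climits>

      // Hypothetical sketch: generate PartOid only for Parquet storage, or when
      // the partition count of an append-only row table exceeds the sort
      // threshold GUC. (Illustrative names, not ORCA's real interface.)
      bool FGeneratePartOid(bool fParquetStorage, int numPartitions, int sortThresholdGuc)
      {
          return fParquetStorage || numPartitions > sortThresholdGuc;
      }

      int main()
      {
          // GPDB: no Parquet storage and the GUC is INT_MAX, so never generated.
          assert(!FGeneratePartOid(false, 1000, INT_MAX));
          // HAWQ: Parquet storage, or many partitions, triggers PartOid generation.
          assert(FGeneratePartOid(true, 10, 100));
          assert(FGeneratePartOid(false, 200, 100));
          return 0;
      }
      ```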
      
      * Remove PartOid ProjElem from minidump files
      
      * Fixed CICGTest
      
      * Fix CDMLTest
      
      * Fix CDirectDispatchTest
      
      * Fix CPhysicalParallelUnionAllTest
      
      * Fix CCollapseProjectTest test
      
      * Fix parser for Partition Selector
      
      A Partition Selector node can have another Partition Selector node as its
      immediate child. In such cases, the current parser fails. This patch fixes
      the issue.
      
      * Fix PartTbl Test
      
      * PR Feedback Applied
      
      * Applied HSY feedback 1
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
      
      * Bump ORCA to 2.37
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
  4. 11 Jul, 2017 (3 commits)
    • Update ORCA version · 73a7bffd
      Venkatesh Raghavan authored
    • Convert Non-correlated EXISTS subquery to a LIMIT 1 AND a JOIN · e04ae39d
      Venkatesh Raghavan authored
      Enable GPORCA to generate better plans for non-correlated EXISTS subqueries in the WHERE clause.
      
      Consider the following EXISTS subquery: `(select * from bar)`. GPORCA generates an elaborate count-based implementation of this subquery. If bar is a fact table, the count is going to be expensive.
      
      ```
      vraghavan=# explain select * from foo where foo.a = foo.b and exists (select * from bar);
                                                          QUERY PLAN
      ------------------------------------------------------------------------------------------------------------------
       Gather Motion 3:1  (slice3; segments: 3)  (cost=0.00..1368262.79 rows=400324 width=8)
         ->  Nested Loop  (cost=0.00..1368250.86 rows=133442 width=8)
               Join Filter: true
               ->  Table Scan on foo  (cost=0.00..461.91 rows=133442 width=8)
                     Filter: a = b
               ->  Materialize  (cost=0.00..438.57 rows=1 width=1)
                     ->  Broadcast Motion 1:3  (slice2)  (cost=0.00..438.57 rows=3 width=1)
                           ->  Result  (cost=0.00..438.57 rows=1 width=1)
                                 Filter: (count((count()))) > 0::bigint
                                 ->  Aggregate  (cost=0.00..438.57 rows=1 width=8)
                                       ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..438.57 rows=1 width=8)
                                             ->  Aggregate  (cost=0.00..438.57 rows=1 width=8)
                                                   ->  Table Scan on bar  (cost=0.00..437.95 rows=332395 width=1)
       Optimizer status: PQO version 2.35.1
      (14 rows)
      ```
      The planner, on the other hand, uses LIMIT, as shown in the InitPlan:
      
      ```
      vraghavan=# explain select * from foo where foo.a = foo.b and exists (select * from bar);
                                                 QUERY PLAN
      ------------------------------------------------------------------------------------------------
       Gather Motion 3:1  (slice2; segments: 3)  (cost=0.03..13611.14 rows=1001 width=8)
         ->  Result  (cost=0.03..13611.14 rows=334 width=8)
               One-Time Filter: $0
               InitPlan  (slice3)
                 ->  Limit  (cost=0.00..0.03 rows=1 width=0)
                       ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..0.03 rows=1 width=0)
                             ->  Limit  (cost=0.00..0.01 rows=1 width=0)
                                   ->  Seq Scan on bar  (cost=0.00..11072.84 rows=332395 width=0)
               ->  Seq Scan on foo  (cost=0.00..13611.11 rows=334 width=8)
                     Filter: a = b
       Settings:  optimizer=off
       Optimizer status: legacy query optimizer
      (12 rows)
      ```
      
      While GPORCA does not support init-plans, we can nevertheless generate a better plan by using LIMIT instead of count. After this PR, GPORCA generates the following plan with a LIMIT clause.
      
      ```
      vraghavan=# explain select * from foo where foo.a = foo.b and exists (select * from bar);
                                                       QUERY PLAN
      ------------------------------------------------------------------------------------------------------------
       Gather Motion 3:1  (slice3; segments: 3)  (cost=0.00..1368262.73 rows=400324 width=8)
         ->  Nested Loop EXISTS Join  (cost=0.00..1368250.80 rows=133442 width=8)
               Join Filter: true
               ->  Table Scan on foo  (cost=0.00..461.91 rows=133442 width=8)
                     Filter: a = b
               ->  Materialize  (cost=0.00..438.57 rows=1 width=1)
                     ->  Broadcast Motion 1:3  (slice2)  (cost=0.00..438.57 rows=3 width=1)
                           ->  Limit  (cost=0.00..438.57 rows=1 width=1)
                                 ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..438.57 rows=1 width=1)
                                       ->  Limit  (cost=0.00..438.57 rows=1 width=1)
                                             ->  Table Scan on bar  (cost=0.00..437.95 rows=332395 width=1)
       Optimizer status: PQO version 2.35.1
      (12 rows)
      ```
    • Bump ORCA version to 2.35.2 · 4ad9ce70
      Bhuvnesh Chaudhary authored
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
  5. 06 Jul, 2017 (2 commits)
    • Fix the behaviour of FBetterThan for Random vs Hashed · bbbbd699
      Omer Arap authored
      FBetterThan in CCostContext contains tie-breaker logic: when costs are
      equal, a Hashed distribution should be favored over a Random one.
      
      The code, however, checked whether the distribution spec of the same context
      was equal to both Hashed and Random, which is always false. For correct
      behavior, it should check whether this context's spec is Hashed and the
      compared context's spec is Random.
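      The corrected tie-breaker can be sketched as follows. This is a minimal,
      hypothetical illustration (the enum and function names are not ORCA's actual
      API):

      ```cpp
      #include <cassert>

      // Hypothetical distribution kinds, standing in for ORCA's distribution specs.
      enum class EDistKind { Hashed, Random, Other };

      // Tie-breaker when costs are equal: prefer Hashed over Random.
      // The bug was effectively `mine == Hashed && mine == Random` (always false);
      // the fix compares this context's spec against the other context's spec.
      bool FBetterThanOnTie(EDistKind mine, EDistKind other)
      {
          return mine == EDistKind::Hashed && other == EDistKind::Random;
      }

      int main()
      {
          assert(FBetterThanOnTie(EDistKind::Hashed, EDistKind::Random));
          assert(!FBetterThanOnTie(EDistKind::Random, EDistKind::Hashed));
          assert(!FBetterThanOnTie(EDistKind::Hashed, EDistKind::Hashed));
          return 0;
      }
      ```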
    • Bump ORCA version to 2.35.0 · 900a586f
      Bhuvnesh Chaudhary authored
      There are changes to the regress tests in the GPDB repository, so bump the
      minor version.
  6. 01 Jul, 2017 (1 commit)
  7. 30 Jun, 2017 (1 commit)
    • Get rid of UlSafeLength() function. · 25c2b4dd
      Heikki Linnakangas authored
      It assumed that calling it on a NULL pointer is OK, which isn't acceptable
      to all C++ compilers and options. I'm getting a bunch of warnings like this
      because of it:
      
      /home/heikki/gpdb/optimizer-main/libgpos/include/gpos/common/CDynamicPtrArray.inl:382:3: warning: nonnull argument ‘this’ compared to NULL [-Wnonnull-compare]
         if (NULL == this)
         ^~
      
      There are a few other places that produce the same warning, but one step at
      a time. This one matters most because it's in an inline function, so it also
      produces warnings in any code that uses ORCA, like the translator code in
      GPDB's src/backend/gpopt/ directory, not just ORCA itself.
      
      Since the function is now gone, all references to it also need to be removed
      from the translator code outside ORCA.
      
      Bump up Orca version to 2.34.2
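      The underlying issue is that calling a member function through a null
      pointer is undefined behavior in C++, so compilers may assume `this` is
      non-null and warn about (or optimize away) the `NULL == this` check. A
      hedged sketch of the well-defined alternative pattern, with hypothetical
      types (ORCA instead removed the function and updated its callers):

      ```cpp
      #include <cassert>
      #include <cstddef>

      // Hypothetical stand-in for a dynamic array class.
      template <typename T>
      struct DynamicArray
      {
          std::size_t size;
          std::size_t UlLength() const { return size; }
      };

      // The null check is done outside the object, through a free function,
      // where comparing the pointer against nullptr is well-defined.
      template <typename T>
      std::size_t SafeLength(const DynamicArray<T> *arr)
      {
          return (arr == nullptr) ? 0 : arr->UlLength();
      }

      int main()
      {
          DynamicArray<int> a{3};
          assert(SafeLength(&a) == 3);
          assert(SafeLength<int>(nullptr) == 0);
          return 0;
      }
      ```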
  8. 28 Jun, 2017 (1 commit)
  9. 20 Jun, 2017 (2 commits)
    • Remove .inl files and merge implementation into .h · 80305df7
      Omer Arap authored
      ORCA's template classes mostly keep their implementation in `.inl` files,
      while some implementation also lives in `.h` files. This makes the code hard
      to navigate, since some IDEs do not recognize the `.inl` formatting. This
      commit therefore moves the implementation into the `.h` files wherever
      applicable.
      
      It does not port implementations from `.inl` files where a `.cpp`
      implementation file already exists and the `.h` file only has function
      declarations, such as `CUtils.h`.
    • Update Join Cardinality Estimation for Text/bpchar/varchar/char columns · 6567f566
      Venkatesh Raghavan authored
      Histogram intersection depends on the values of the bucket boundaries. For
      datatypes like text, varchar, etc., ORCA currently uses a hash function to
      mark bucket boundaries. This is marginally useful for equality with
      singleton buckets but nothing more, so the previous join stats computation
      based on histogram intersection was totally bogus. This change replaces it
      with an NDV (number of distinct values) based estimation.
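      The NDV-based estimate can be sketched with the classic equi-join
      cardinality formula. This is an illustrative sketch of the idea, not ORCA's
      implementation:

      ```cpp
      #include <algorithm>
      #include <cassert>

      // Classic NDV-based equi-join cardinality estimate:
      // |R join S| ~= |R| * |S| / max(NDV(R.col), NDV(S.col)).
      double JoinCardinalityNDV(double rows1, double ndv1, double rows2, double ndv2)
      {
          return rows1 * rows2 / std::max(ndv1, ndv2);
      }

      int main()
      {
          // 1000 rows with 100 distinct values joined to 500 rows with 50
          // distinct values: 1000 * 500 / 100 = 5000 estimated output rows.
          assert(JoinCardinalityNDV(1000, 100, 500, 50) == 5000);
          return 0;
      }
      ```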
  10. 18 May, 2017 (1 commit)
  11. 15 May, 2017 (1 commit)
  12. 09 May, 2017 (1 commit)
  13. 26 Apr, 2017 (2 commits)
  14. 22 Apr, 2017 (3 commits)
  15. 21 Apr, 2017 (1 commit)
  16. 13 Apr, 2017 (1 commit)
  17. 12 Apr, 2017 (2 commits)
  18. 11 Apr, 2017 (1 commit)
  19. 05 Apr, 2017 (1 commit)
  20. 04 Apr, 2017 (1 commit)
    • Remove dead code about SharedScan · c6838425
      Haisheng Yuan authored
      While trying to understand how ORCA generates plans for CTEs using shared
      input scans, I found that the share input scan is generated during the CTE
      producer & consumer DXL node to PlannedStmt translation stage, not in the
      Expr to DXL stage inside ORCA. It turns out CDXLPhysicalSharedScan is not
      used anywhere, so remove all the related dead code.
  21. 01 Apr, 2017 (1 commit)
    • Refactor Expr to DXL translator for list partition selector predicates · cdc45d92
      Haisheng Yuan authored
      Previously, when the final plan was translated into DXL, ORCA used range
      predicates to represent the values of a list partition (with range start and
      end values equal to a single list value), which is error-prone and redundant
      for the query executor.
      
      In this patch, we translate the predicates of a list partition selector in
      DXL as follows (range-based partitions remain unchanged):
      1. PK op Scalar -> Scalar reverse(op) Any(PartListValues)
         For example, the partition selector predicate `pk1 < 5` is translated to
         `ArrayComp(5 > Any(PartListValues))`, meaning the partition is selected
         as long as any of the list partition values is less than 5.
      2. PK is (not) NULL -> PartListNullTest
      3. Propagation Expression ->
         Const1 = Any(PartListValues) or Const2 = Any(PartListValues) ...
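      Rule 1 mirrors the comparison operator because the scalar and the partition
      key swap sides. A small sketch of that operator reversal (hypothetical,
      string-based; not ORCA's actual representation):

      ```cpp
      #include <cassert>
      #include <string>

      // Sketch of rule 1: `PK op Scalar` becomes `Scalar reverse(op) Any(values)`,
      // so the comparison operator is mirrored when the operands swap sides.
      std::string ReverseOp(const std::string &op)
      {
          if (op == "<")  return ">";
          if (op == ">")  return "<";
          if (op == "<=") return ">=";
          if (op == ">=") return "<=";
          return op; // "=" and "<>" are their own mirrors
      }

      int main()
      {
          // `pk1 < 5` -> `5 > Any(PartListValues)`
          assert(ReverseOp("<") == ">");
          assert(ReverseOp(">=") == "<=");
          assert(ReverseOp("=") == "=");
          return 0;
      }
      ```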
      
      [#140699737]
      Closes #149
  22. 31 Mar, 2017 (2 commits)
  23. 29 Mar, 2017 (1 commit)
  24. 21 Mar, 2017 (1 commit)
  25. 17 Mar, 2017 (1 commit)
    • Fix bug where equality partition filters were not generated for multilevel partition tables · e99b9fef
      Haisheng Yuan authored
      The equality partition filter of the partition selector works well for
      single-level partition tables. But if the table has multilevel partitions,
      e.g. 2 levels with pk1 and pk2 as the partition keys for levels 1 and 2, and
      the query has an equality predicate such as pk1 = 2, then the level-2
      equality filter is null and `FEqPartFiltersAllLevels` returns false, causing
      the equality predicate to be put into PartFilters instead of PartEqFilters.
      This patch fixes that bug.
      
      [#141826453]
  26. 16 Mar, 2017 (1 commit)
  27. 08 Mar, 2017 (2 commits)
  28. 07 Mar, 2017 (1 commit)
  29. 23 Feb, 2017 (1 commit)