1. 12 Sep 2017, 1 commit
  2. 09 Sep 2017, 6 commits
  3. 08 Sep 2017, 1 commit
  4. 07 Sep 2017, 1 commit
    • D
      Enable Index Scan when leaf partitions are queried directly (#219) · 4019b25a
      Dhanashree Kashid authored
      Currently ORCA does not support index scan on leaf partitions when leaf partitions are queried directly; it only supports index scan when we query the root table. This PR, along with the corresponding GPDB PR, adds support for using indexes when leaf partitions are queried
      directly.
      
      When a root table that has indexes (either homogeneous/complete or
      heterogeneous/partial) is queried, the Relcache Translator sends index
      information to ORCA. This enables ORCA to generate an alternative plan with
      Dynamic Index Scan on all partitions (in the case of a homogeneous index), or a plan
      with partial scans, i.e. Dynamic Table Scan on leaf partitions that don't have
      indexes plus Dynamic Index Scan on leaf partitions with indexes (in the case of a
      heterogeneous index).
      
      This is a two-step process in the Relcache Translator, as described below:
      Step 1 - Get the list of all index oids
      CTranslatorRelcacheToDXL::PdrgpmdidRelIndexes() performs this step, and it
      only retrieves indexes on root and regular tables; for leaf partitions it bails
      out.
      
      Now for the root, the list of index oids is simply the index oids of its leaf
      partitions. For instance:
      
      CREATE TABLE foo (a int, b int, c int, d int) DISTRIBUTED BY (a)
      PARTITION BY RANGE(b)
      (PARTITION p1 START (1) END (10) INCLUSIVE,
       PARTITION p2 START (11) END (20) INCLUSIVE);
      
      CREATE INDEX complete_c ON foo USING btree (c);
      CREATE INDEX partial_d ON foo_1_prt_p2 USING btree (d);
      
      The index list will look like: { complete_c_1_prt_p1, partial_d }
      
      For a complete index, the index oid of the first leaf partition is retrieved.
      If there are partial indexes, all the partial index oids are retrieved.
      
      Step 2 - Construct Index Metadata object
      CTranslatorRelcacheToDXL::Pmdindex() performs this step.
      
      For each index oid retrieved in Step 1 above, construct an Index Metadata
      object (CMDIndexGPDB) to be stored in the metadata cache, so that ORCA can get all
      the information about the index.
      Along with all other information about the index, CMDIndexGPDB also contains
      a flag fPartial which denotes if the given index is homogenous (ORCA
      will apply it to all partitions selected by partition selector) or heterogenous
      (the index will be applied to only appropriate partitions).
      The process is as follows:
      ```
              Foreach oid in index oid list :
                      Get index relation (rel)
                      If rel is a leaf partition :
                              Get the root rel of the leaf partition
                              Get all the indexes on the root (this will be same list as step #1)
                              Determine if the current index oid is homogeneous or heterogeneous
                              Construct CMDIndexGPDB appropriately (with fPartial, part constraint,
                              default levels info)
                      Else:
                              Construct a normal CMDIndexGPDB object.
      ```
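      The homogeneous-vs-heterogeneous (fPartial) classification described above can be sketched as follows. This is an illustrative Python sketch with hypothetical names, not ORCA source: an index is complete only if every leaf partition has it.

```python
def classify_indexes(leaf_indexes_by_partition):
    """leaf_indexes_by_partition: dict mapping a leaf partition name to the
    set of logical index names defined on it."""
    partitions = list(leaf_indexes_by_partition)
    all_indexes = set().union(*leaf_indexes_by_partition.values())
    complete, partial = set(), set()
    for idx in all_indexes:
        # Homogeneous/complete only if every leaf partition carries the index.
        if all(idx in leaf_indexes_by_partition[p] for p in partitions):
            complete.add(idx)
        else:
            partial.add(idx)  # fPartial would be true for these
    return complete, partial

# Using the foo example from this commit message:
leaves = {
    "foo_1_prt_p1": {"complete_c"},
    "foo_1_prt_p2": {"complete_c", "partial_d"},
}
complete, partial = classify_indexes(leaves)
```

      Here `complete_c` comes out complete (both leaves have it) and `partial_d` comes out partial, matching the example above.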
      Now for leaf partitions, there is no notion of homogeneous or heterogeneous
      indexes, since a leaf partition is like a regular table. Hence, in Pmdindex()
      we should not check whether the index is complete or not.
      
      Additionally, whether a given index is homogeneous or heterogeneous needs to be
      decided from the perspective of the relation we are querying (such as the root or a
      leaf).
      
      Hence the right place for the fPartial flag is the relation metadata object
      (CMDRelationGPDB), not the independent index metadata object (CMDIndexGPDB).
      This commit makes the following changes to support index scan on leaf partitions
      along with partial scans:
      
      Relcache Translator:
      
      In Step 1, retrieve the index information on the leaf partition and create a
      list of CMDIndexInfo objects, each containing the index oid and the fPartial flag.
      Step 1 is the place where we know what relation we are querying, which enables us
      to determine whether or not the index is homogeneous from the context of the
      relation.
      
      The relation metadata tag will look like the following after this change:
      
      Before:
      ```
              <dxl:Indexes>
                      <dxl:Index Mdid="0.17159874.1.0"/>
                      <dxl:Index Mdid="0.17159920.1.0"/>
              </dxl:Indexes>
      ```
      After:
      ```
              <dxl:IndexInfoList>
                      <dxl:IndexInfo Mdid="0.17159874.1.0" IsPartial="true"/>
                      <dxl:IndexInfo Mdid="0.17159920.1.0" IsPartial="false"/>
              </dxl:IndexInfoList>
      ```
      ORCA changes:
      
      A new class CMDIndexInfoGPDB has been created in ORCA, which contains the index mdid and the fPartial flag. For external tables, normal tables and leaf partitions, the fPartial flag will always be false.
      CMDRelationGPDB will contain an array of CMDIndexInfoGPDB instead of a simple index mdid array.
      Add a new parse handler to parse IndexInfoList and IndexInfo and create an array of CMDIndexInfoGPDB.
      Update the existing minidumps to remove the fPartial flag from the Index metadata tag and associate it with the IndexInfo tag under the Relation metadata.
      Add new test scenarios for querying the leaf partition with a homogeneous/heterogeneous index on the root table.
      4019b25a
  5. 02 Sep 2017, 4 commits
    • D
      Bump ORCA version to 2.41.0 · 042471ca
      Dhanashree Kashid authored
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
      042471ca
    • D
      Update the minidumps · cba22e80
      Dhanashree Kashid authored
      Now we send the part constraint expression only in the cases below:
      
                  IsPartTable     Index   DefaultParts   ShouldSendPartConstraint
                  NO              -       -              -
                  YES             YES     YES/NO         YES
                  YES             NO      NO             NO
                  YES             NO      YES            YES (but only default levels info)
      
      This commit updates the minidumps accordingly.
      1. If the Relation tag has indices then keep the part constraint tag
      2. If the Relation tag has no indices and no default partitions; remove
      the entire part constraint tag
      3. If the Relation tag has no indices but has default partitions at any
      level then keep the part constraint tag but remove the scalar expression
      4. Regenerated the following stale minidumps:
         * DynamicIndexScan-Homogenous.mdp
         * DynamicIndexScan-Heterogenous-Union.mdp
         * DynamicBitmapTableScan-Basic.mdp
      5. Added four more test cases to CPartTblTest demonstrating the table
      above.
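      The decision table above can be sketched as a small function. This is a hypothetical illustration of the rule, not the relcache translator code; the return values are made-up labels:

```python
def should_send_part_constraint(is_part_table, has_index, has_default_parts):
    """Mirror of the table: what part-constraint info is serialized."""
    if not is_part_table:
        return None                      # not applicable
    if has_index:
        return "expression"              # send the part constraint expression
    if has_default_parts:
        return "default-levels-only"     # keep the tag, drop the scalar expression
    return None                          # no indices, no defaults: send nothing
```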
      
      [Ref #149769559]
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
      Signed-off-by: Omer Arap <oarap@pivotal.io>
      cba22e80
    • J
      Don't serialize part constraint expr if no indices · deb78d49
      Jemish Patel authored
      Do not serialize and de-serialize the part constraint expression
      when there are no indices on the partitioned rel.
      
      The relcache translator in GPDB will send an empty part constraint
      expression when the rel has no indices.
      
      [Ref #149769559]
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
      deb78d49
    • O
      Remove dead code · a26b4eb0
      Omer Arap authored
      We never send null part constraints from the relcache translator,
      hence we do not need to handle that case.
      
      This code was probably added to support the older minidumps. There are
      a few very old minidump files which do not contain the part constraint
      tag in relation tag.
      
      Now with the fix on the relcache translator side in GPDB, the only case
      when we send NULL part constraints is when there are no indices and no
      default partitions; we still don't need a null check for the part constraint
      in this case because `fDummyConstraint` will always be true.
      
      [Ref #149769559]
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
      a26b4eb0
  6. 29 Aug 2017, 1 commit
    • O
      Convert IN subq to EXISTS subq with pred [##149683475] · 0037ed8f
      Omer Arap authored
      This commit adds a preprocessing step that changes the expression tree when
      there is an IN subquery with a project list that includes an outer
      reference but no columns from the project's relational
      child. This preprocessing helps ORCA decorrelate the subquery. Orca
      currently does not support directly decorrelating IN subqueries if there
      is an outer reference in the CLogicalProject. Converting an IN subquery
      to a predicate AND an EXISTS subquery helps Orca generate more alternatives
      with the decorrelated subquery option.
      
      Below is an example of the preprocessing applied in this commit.
      
      Before preprocessing:
      ```
         +--CScalarSubqueryAny(=)["?column?" (17)]
            |--CLogicalProject
            |  |--CLogicalGet "bar" ("bar"), Columns: ["c" (9)]}
            |  +--CScalarProjectList
            |     +--CScalarProjectElement "?column?" (17)
            |        +--CScalarOp (+)
            |           |--CScalarIdent "b" (1)
            |           +--CScalarConst (1)
            +--CScalarIdent "a" (0)
      ```
      After:
      ```
         +--CScalarBoolOp (EboolopAnd)
            |--CScalarOp (=)
            |  |--CScalarIdent "a" (0)
            |  +--CScalarOp (+)
            |     |--CScalarIdent "b" (1)
            |     +--CScalarConst (1)
            +--CScalarSubqueryExists
               +--CLogicalGet "bar" ("bar"), Columns: ["c" (9)] }
      ```
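      The precondition for this rewrite can be sketched as a simple column-set check. This is a hypothetical illustration (function and parameter names are made up, not ORCA code): the project element may use only outer columns, and none produced by the project's relational child.

```python
def qualifies_for_in_to_exists(project_expr_cols, child_output_cols, outer_cols):
    """True if the ANY/IN subquery's project expression references outer
    columns but no columns from the project's relational child."""
    uses_child = bool(project_expr_cols & child_output_cols)
    uses_outer = bool(project_expr_cols & outer_cols)
    return uses_outer and not uses_child

# In the example: "?column?" is b + 1, where b is an outer column of foo
# and bar only produces column c.
qualifies = qualifies_for_in_to_exists({"b"}, {"c"}, {"a", "b"})
```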
      
      This commit bumps Orca version to 2.40.3
      Signed-off-by: Jemish Patel <jpatel@pivotal.io>
      0037ed8f
  7. 28 Aug 2017, 1 commit
  8. 26 Aug 2017, 1 commit
  9. 24 Aug 2017, 1 commit
  10. 10 Aug 2017, 3 commits
    • E
      Bump ORCA version to 2.40.0 · ef457558
      Ekta Khanna authored
      Signed-off-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
      ef457558
    • E
      Refactoring cast functions into `CUtilsCasts` · 10e91f99
      Ekta Khanna authored
      This commit creates a new file `CCastUtils.cpp` which maintains all the
      cast functions.
      
      The following functions are moved from `CUtils.cpp` to `CCastUtils.cpp`:
      ```
      * BOOL FBinaryCoercibleCastedScId(CExpression *pexpr, CColRef *pcr)
      * BOOL FBinaryCoercibleCastedScId(CExpression *pexpr)
      * const CColRef *PcrExtractFromScIdOrCastScId(CExpression *pexpr)
      * CExpression *PexprCast( IMemoryPool *pmp, CMDAccessor *pmda, const
      CColRef *pcr, IMDId *pmdidDest)
      * BOOL FBinaryCoercibleCast(CExpression *pexpr)
      * CExpression *PexprWithoutBinaryCoercibleCasts(CExpression *pexpr)
      ```
      
      The following functions are moved from `CPredicateUtils.cpp` to
      `CCastUtils.cpp`:
      
      ```
      * DrgPexpr *PdrgpexprCastEquality(IMemoryPool *pmp, CExpression *pexpr)
      * CExpression *PexprAddCast(IMemoryPool *pmp, CExpression *pexprPred)
      * CExpression *PexprCast(IMemoryPool *pmp, CMDAccessor *pmda,
      CExpression *pexpr, IMDId *pmdidDest)
      ```
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
      Signed-off-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
      10e91f99
    • D
      Add new Array Coerce Cast Metadata object · 5bcd44f6
      Dhanashree Kashid authored
      Currently the executor crashes for the following query with ORCA on:
      ```
      CREATE TABLE FOO(a integer NOT NULL, b double precision[]);
      SELECT b FROM foo UNION ALL SELECT ARRAY[90, 90] as Cont_features;
      ```
      
      In this query, we are appending an integer array (ARRAY[90, 90]) to a double
      precision array (foo.b), and hence we need to apply a cast on ARRAY[90, 90] to
      generate ARRAY[90, 90]::double precision[].
      In GPDB 5 there is no direct function available that can cast an array of one
      type to an array of another type. So in the relcache-to-DXL translator we look
      at the array elements, get their type, and try to find a cast function for them.
      For this query, the source type is 23 (integer) and the destination type is 701
      (double precision), and we check whether a conversion function for 23 -> 701 exists.
      Since one is available, we send that function to ORCA as follows:
      ```
      <dxl:MDCast
      Mdid="3.1007.1.0;1022.1.0" Name="float8" BinaryCoercible="false"
      SourceTypeId="0.1007.1.0" DestinationTypeId="0.1022.1.0"
      CastFuncId="0.316.1.0"/>
      ```
      Here we misinform ORCA by claiming that the function with id 316 is available
      to convert type 1007 (integer array) to 1022 (double precision array).
      However, function id 316 is a simple int4-to-float8 conversion function and CANNOT
      convert an array of int4 to an array of double precision.
      ORCA generates a plan using this function, but the executor crashes while
      executing it because the function cannot handle arrays.
      
      This commit adds a new ArrayCoerceCast MetaData object which will be constructed
      when we need to convert an array of one type to another; instead of constructing
      Cast Metadata.
      
      `CMDArrayCoerceCastGPDB` extends `CMDCastGPDB` and includes the information
      necessary to generate `CScalarArrayCoerceExpr`.
      
      In the Relcache Translator in GPDB, we will construct an object of `CMDArrayCoerceCastGPDB`
      instead of `CMDCastGPDB`, depending on the coercion path determined by `FCastFunc`.
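      The selection between the two metadata kinds can be sketched as below. This is a hypothetical illustration (names and return labels are made up, not translator code): when both sides are array types, an element-level cast function such as int4 -> float8 (oid 316) must not be reported as a plain cast.

```python
def cast_metadata_kind(src_is_array, dest_is_array, element_cast_exists):
    """Which metadata object to build for a requested cast."""
    if src_is_array and dest_is_array:
        # e.g. 1007 (integer[]) -> 1022 (double precision[]): the element
        # cast must be applied per element, so build array-coerce metadata.
        return "array-coerce-cast" if element_cast_exists else None
    return "cast" if element_cast_exists else None
```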
      
      Added relevant test cases.
      
      Ref [#149524459]
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
      5bcd44f6
  11. 29 Jul 2017, 1 commit
  12. 21 Jul 2017, 1 commit
    • H
      Cut down the runtime for CCostTest · 3a586479
      Haisheng Yuan authored
      Most of the time is spent running the 4 minidumps in the CalibratedCostModel test.
      These 4 minidumps are already tested in other test suites. Moreover, the test just
      executes the minidump and doesn't even compare the generated plan.
      
      CalibratedCostModel is the default in GPDB right now, and most minidumps use the
      calibrated cost model, so they also exercise CalibratedCostModel at the same
      time. So remove EresUnittest_CalibratedCostModel. This cuts down the debug runtime
      of CCostTest from 814829 ms to 458 ms.
      3a586479
  13. 19 Jul 2017, 5 commits
    • D
      Bump ORCA version to 2.39 · 2d9af1ce
      Dhanashree Kashid authored
      Signed-off-by: Jemish Patel <jpatel@pivotal.io>
      2d9af1ce
    • D
      Fixing fallback in ORCA when we have a Correlated IN query with no · 989415a2
      Dhanashree Kashid and Jemish Patel authored
      projections from the inner side.
      
      For the query `explain select * from foo where foo.a in (select foo.b from bar);`,
      ORCA generates two different plans based on the size of `bar`. If `bar` is a small
      table, then ORCA picks the plan below, as it is cheaper to broadcast `bar`.
      
      ```
      Physical plan #1:
      +--CPhysicalMotionGather(master)   rows:1   width:76  rebinds:1   cost:1324032.133055   origin: [Grp:4, GrpExpr:19]
         +--CPhysicalCorrelatedInLeftSemiNLJoin("b" (1))   rows:1   width:76  rebinds:1   cost:1324032.133025   origin: [Grp:4, GrpExpr:18]
            |--CPhysicalFilter   rows:1   width:38  rebinds:1   cost:431.000092   origin: [Grp:9, GrpExpr:1]
            |  |--CPhysicalTableScan "foo" ("foo")   rows:1   width:38  rebinds:1   cost:431.000021   origin: [Grp:0, GrpExpr:1]
            |  +--CScalarCmp (=)   origin: [Grp:6, GrpExpr:0]
            |     |--CScalarIdent "a" (0)   origin: [Grp:2, GrpExpr:0]
            |     +--CScalarIdent "b" (1)   origin: [Grp:5, GrpExpr:0]
            |--CPhysicalSpool   rows:1   width:38  rebinds:1   cost:431.000026   origin: [Grp:1, GrpExpr:3]
            |  +--CPhysicalMotionBroadcast    rows:1   width:38  rebinds:1   cost:431.000025   origin: [Grp:1, GrpExpr:2]
            |     +--CPhysicalTableScan "bar" ("bar")   rows:1   width:38  rebinds:1   cost:431.000007   origin: [Grp:1, GrpExpr:1]
            +--CScalarConst (1)   origin: [Grp:8, GrpExpr:0]
      ```
      However, if `bar` is a large table, ORCA will decorrelate it into an inner join
      and produce a plan as below:
      
      ```
      Physical plan #2:
      +--CPhysicalMotionGather(master)   rows:1   width:76  rebinds:1   cost:1324095.808700   origin: [Grp:4, GrpExpr:19]
         +--CPhysicalInnerNLJoin   rows:1   width:76  rebinds:1   cost:1324095.808670   origin: [Grp:4, GrpExpr:14]
            |--CPhysicalFilter   rows:1   width:38  rebinds:1   cost:431.000092   origin: [Grp:9, GrpExpr:1]
            |  |--CPhysicalTableScan "foo" ("foo")   rows:1   width:38  rebinds:1   cost:431.000021   origin: [Grp:0, GrpExpr:1]
            |  +--CScalarCmp (=)   origin: [Grp:6, GrpExpr:0]
            |     |--CScalarIdent "a" (0)   origin: [Grp:2, GrpExpr:0]
            |     +--CScalarIdent "b" (1)   origin: [Grp:5, GrpExpr:0]
            |--CPhysicalSpool   rows:1   width:38  rebinds:1   cost:431.062210   origin: [Grp:32, GrpExpr:5]
            |  +--CPhysicalMotionBroadcast    rows:1   width:38  rebinds:1   cost:431.062209   origin: [Grp:32, GrpExpr:6]
            |     +--CPhysicalLimit <empty> global   rows:1   width:38  rebinds:1   cost:431.062155   origin: [Grp:32, GrpExpr:2]
            |        |--CPhysicalMotionGather(master)   rows:1   width:38  rebinds:1   cost:431.062154   origin: [Grp:45, GrpExpr:2]
            |        |  +--CPhysicalLimit <empty> local   rows:1   width:38  rebinds:1   cost:431.062150   origin: [Grp:45, GrpExpr:1]
            |        |     |--CPhysicalTableScan "bar" ("bar")   rows:8192   width:38  rebinds:1   cost:431.057071   origin: [Grp:1, GrpExpr:1]
            |        |     |--CScalarConst (0)   origin: [Grp:15, GrpExpr:0]
            |        |     +--CScalarConst (1)   origin: [Grp:31, GrpExpr:0]
            |        |--CScalarConst (0)   origin: [Grp:15, GrpExpr:0]
            |        +--CScalarConst (1)   origin: [Grp:31, GrpExpr:0]
            +--CScalarConst (1)   origin: [Grp:8, GrpExpr:0]
      ```
      
      The translator successfully translates plan #2; however, it throws an exception
      while translating plan #1 into a `subplan` and falls back to the planner.
      This PR fixes the translation for plan #1.
      
      We added a check for `COperator::EopPhysicalCorrelatedInLeftSemiNLJoin == eopid`
      in `CTranslatorExprToDXL::PdxlnCorrelatedNLJoin`. This function creates a scalar subplan
      when there is a correlated NL join with a true join filter. The existing check only handled
      the `CorrelatedInnerNLJoin` case, so we extended it to handle the `CorrelatedInLeftSemiNLJoin` case as well.
      
      This produces the correct subplan and there is no fallback.
      ```
       Gather Motion 3:1  (slice2; segments: 3)  (cost=0.00..1324032.10 rows=2 width=8)
         ->  Table Scan on foo  (cost=0.00..1324032.10 rows=1 width=8)
               Filter: a = a AND ((subplan))
               SubPlan 1
                 ->  Result  (cost=0.00..431.00 rows=1 width=1)
                       ->  Materialize  (cost=0.00..431.00 rows=1 width=1)
                             ->  Broadcast Motion 3:3  (slice1; segments: 3)  (cost=0.00..431.00 rows=1 width=1)
                                   ->  Table Scan on bar  (cost=0.00..431.00 rows=1 width=1)
       Settings:  optimizer=on
       Optimizer status: PQO version 2.36.0
      ```
      
      1. The plan produced above can be further optimized by inserting a limit
      over `bar`. We will file a separate story to handle this.
      2. We could not generate a repro query with NOT IN with a similar
      symptom (a CPhysicalCorrelatedNotInLeftAntiSemiNLJoin join with a
      Const true filter); hence no check has been added for the
      `EopPhysicalCorrelatedNotInLeftAntiSemiNLJoin` join type. ORCA always
      decorrelates this type of NOT IN query into a CPhysicalInnerNLJoin with
      a scalar comparison a <> b:
      `explain select * from foo where foo.a not in (select foo.b from bar)`
      
      Added minidump tests:
      
      1. `CorrelatedIN-LeftSemiJoin-True.mdp` This is the repro query as shown
      above.
      2. `CorrelatedIN-LeftSemiNotIn-True.mdp` This was previously causing a crash with
      ORCA in DEBUG build. Now it produces the correct plan.
      
      [#147893491]
      Signed-off-by: Jemish Patel <jpatel@pivotal.io>
      989415a2
    • B
      Bump ORCA version to 2.38 · e973f96e
      Bhunvesh Chaudhary authored
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
      e973f96e
    • B
      Implement DXL Representation for VALUESSCAN [#147773843] · 8b95abb0
      Bhunvesh Chaudhary authored
      Postgres, and thus the GPDB planner, supports VALUES via an operator called ValuesScan.
      ```
      explain SELECT * FROM (VALUES (1, 'one'), (2, 'two'), (3, 'three')) AS t (num,letter);
                                QUERY PLAN
      --------------------------------------------------------------
       Values Scan on "*VALUES*"  (cost=0.00..0.04 rows=1 width=36)
       Optimizer status: legacy query optimizer
      (2 rows)
      ```
      However, inside Orca we expand each row in the VALUES list into a
      Result node that projects constants.
      Thus the above query, with three rows of two columns each, is
      represented by GPORCA as an Append with three Result nodes.
      Each of the Result nodes is a CTG (Const Table Get) with project elements.
      ```
      explain SELECT * FROM (VALUES (1, 'one'), (2, 'two'), (3, 'three')) AS t (num,letter);
                         QUERY PLAN
      -------------------------------------------------
       Append  (cost=0.00..0.00 rows=1 width=12)
         ->  Result  (cost=0.00..0.00 rows=1 width=12)
         ->  Result  (cost=0.00..0.00 rows=1 width=12)
         ->  Result  (cost=0.00..0.00 rows=1 width=12)
       Settings:  optimizer=on
       Optimizer status: PQO version 2.32.0
      (6 rows)
      ```
      
      This commit introduces a new ValuesScan operator; instead of
      generating multiple Result nodes, ORCA will now generate a ValuesScan
      node. The resulting plan will look like:
      
      ```
                                                QUERY PLAN
      ----------------------------------------------------------------------------------------------
       Values Scan on "Values"  (cost=0.00..0.44 rows=37000 width=4)
       Optimizer status: PQO version 2.37.0
      (2 rows)
      ```
      
      This enhancement brings a significant performance improvement in the total
      runtime of queries involving a large number of constant values.
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
      8b95abb0
    • B
      Add OSPrint for Const Table Get · 2c4e0a9d
      Bhunvesh Chaudhary authored
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
      2c4e0a9d
  14. 15 Jul 2017, 1 commit
    • B
      Remove part oid (#186) · 2c139497
      Bhuvnesh authored
      * Do not generate PartOid expression
      
      In GPDB, PartOidExpr is not used; however, ORCA still generates it.
      HAWQ, on the other hand, uses PartOid for sorting while inserting into
      Append Only Row / Parquet partitioned tables.
      
      This patch uses the storage type (Parquet) and the number of partitions in an
      Append Only row-partitioned table to decide whether PartOid should be generated.
      In GPDB, Parquet storage is not supported and the GUC that controls the number
      of partitions above which a sort should be used is set to INT_MAX, which is
      practically never reached, so in GPDB the PartOid expression will never be
      generated; HAWQ, however, can control the generation of PartOid based on the
      values of its already existing GUCs.
      
      * Remove PartOid ProjElem from minidump files
      
      * Fixed CICGTest
      
      * Fix CDMLTest
      
      * Fix CDirectDispatchTest
      
      * Fix CPhysicalParallelUnionAllTest
      
      * Fix CCollapseProjectTest test
      
      * Fix parser for Partition Selector
      
      A Partition Selector node can have another Partition Selector node as
      its immediate child. In such cases, the current parser fails. This patch
      fixes the issue.
      
      * Fix PartTbl Test
      
      * PR Feedback Applied
      
      * Applied HSY feedback 1
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
      
      * Bump ORCA to 2.37
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
      2c139497
  15. 11 Jul 2017, 4 commits
    • V
      Update ORCA version · 73a7bffd
      Venkatesh Raghavan authored
      73a7bffd
    • V
      Convert Non-correlated EXISTS subquery to a LIMIT 1 AND a JOIN · e04ae39d
      Venkatesh Raghavan authored
      Enable GPORCA to generate better plans for non-correlated EXISTS subqueries in the WHERE clause.
      
      Consider the following EXISTS subquery: `(select * from bar)`. GPORCA generates an elaborate count-based implementation of this subquery. If `bar` is a fact table, the count is going to be expensive.
      
      ```
      vraghavan=# explain select * from foo where foo.a = foo.b and exists (select * from bar);
                                                          QUERY PLAN
      ------------------------------------------------------------------------------------------------------------------
       Gather Motion 3:1  (slice3; segments: 3)  (cost=0.00..1368262.79 rows=400324 width=8)
         ->  Nested Loop  (cost=0.00..1368250.86 rows=133442 width=8)
               Join Filter: true
               ->  Table Scan on foo  (cost=0.00..461.91 rows=133442 width=8)
                     Filter: a = b
               ->  Materialize  (cost=0.00..438.57 rows=1 width=1)
                     ->  Broadcast Motion 1:3  (slice2)  (cost=0.00..438.57 rows=3 width=1)
                           ->  Result  (cost=0.00..438.57 rows=1 width=1)
                                 Filter: (count((count()))) > 0::bigint
                                 ->  Aggregate  (cost=0.00..438.57 rows=1 width=8)
                                       ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..438.57 rows=1 width=8)
                                             ->  Aggregate  (cost=0.00..438.57 rows=1 width=8)
                                                   ->  Table Scan on bar  (cost=0.00..437.95 rows=332395 width=1)
       Optimizer status: PQO version 2.35.1
      (14 rows)
      ```
      The planner, on the other hand, uses a LIMIT, as shown in the InitPlan.
      
      ```
      vraghavan=# explain select * from foo where foo.a = foo.b and exists (select * from bar);
                                                 QUERY PLAN
      ------------------------------------------------------------------------------------------------
       Gather Motion 3:1  (slice2; segments: 3)  (cost=0.03..13611.14 rows=1001 width=8)
         ->  Result  (cost=0.03..13611.14 rows=334 width=8)
               One-Time Filter: $0
               InitPlan  (slice3)
                 ->  Limit  (cost=0.00..0.03 rows=1 width=0)
                       ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..0.03 rows=1 width=0)
                             ->  Limit  (cost=0.00..0.01 rows=1 width=0)
                                   ->  Seq Scan on bar  (cost=0.00..11072.84 rows=332395 width=0)
               ->  Seq Scan on foo  (cost=0.00..13611.11 rows=334 width=8)
                     Filter: a = b
       Settings:  optimizer=off
       Optimizer status: legacy query optimizer
      (12 rows)
      ```
      
      While GPORCA does not support init-plans, we can nevertheless generate a better plan by using a LIMIT instead of a count. After this PR, GPORCA will generate the following plan with a LIMIT clause.
      
      ```
      vraghavan=# explain select * from foo where foo.a = foo.b and exists (select * from bar);
                                                       QUERY PLAN
      ------------------------------------------------------------------------------------------------------------
       Gather Motion 3:1  (slice3; segments: 3)  (cost=0.00..1368262.73 rows=400324 width=8)
         ->  Nested Loop EXISTS Join  (cost=0.00..1368250.80 rows=133442 width=8)
               Join Filter: true
               ->  Table Scan on foo  (cost=0.00..461.91 rows=133442 width=8)
                     Filter: a = b
               ->  Materialize  (cost=0.00..438.57 rows=1 width=1)
                     ->  Broadcast Motion 1:3  (slice2)  (cost=0.00..438.57 rows=3 width=1)
                           ->  Limit  (cost=0.00..438.57 rows=1 width=1)
                                 ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..438.57 rows=1 width=1)
                                       ->  Limit  (cost=0.00..438.57 rows=1 width=1)
                                             ->  Table Scan on bar  (cost=0.00..437.95 rows=332395 width=1)
       Optimizer status: PQO version 2.35.1
      (12 rows)
      ```
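      The reason the LIMIT-based plan is cheaper can be sketched in a few lines. This is an illustrative Python analogy (not ORCA code): the count-based form must consume every row of `bar`, while the LIMIT-based form stops after the first row, and both agree on the EXISTS answer.

```python
def exists_via_count(rows):
    # Count-based form: scans everything, then compares with zero.
    return sum(1 for _ in rows) > 0

def exists_via_limit(rows):
    # LIMIT 1 form: stops as soon as one row is produced.
    return next(iter(rows), None) is not None

# Both agree, but the second touches at most one row of the input.
same_answer = exists_via_count(range(10)) == exists_via_limit(range(10))
```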
      e04ae39d
    • B
      Bump ORCA version to 2.35.2 · 4ad9ce70
      Bhunvesh Chaudhary authored
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
      4ad9ce70
    • B
      Check existing tag before publishing new artifacts · 9ff85c29
      Bhunvesh Chaudhary authored
      Each ORCA commit must bump the version; if the version is not
      bumped, new releases will not be pushed to the ORCA repository.
      This commit adds a check that validates the version of the
      current commit against the tag version existing in the repository.
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
      9ff85c29
  16. 06 Jul 2017, 3 commits
    • O
      Fix the behaviour of FBetterThan for Random vs Hashed · bbbbd699
      Omer Arap authored
      There is tie-breaker logic in the FBetterThan function in CCostContext:
      when the costs are equal, a Hashed distribution should be favored over a
      Random one.
      
      The code was checking whether the distribution spec in the same context
      is both Hashed and Random, which is always false. For correct behavior it
      should check that the spec being compared is Hashed while the other one is
      Random.
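      The bug and the fix can be sketched side by side. This is an illustrative Python sketch with made-up names (not the ORCA source):

```python
HASHED, RANDOM = "hashed", "random"

def better_than_buggy(this_dist, other_dist):
    # Buggy tie-breaker: asks whether one spec is both Hashed and Random,
    # which can never be true, so Hashed was never favored.
    return this_dist == HASHED and this_dist == RANDOM

def better_than_fixed(this_dist, other_dist):
    # Fixed tie-breaker: favor Hashed over Random when costs are equal.
    return this_dist == HASHED and other_dist == RANDOM
```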
      bbbbd699
    • B
      Bump ORCA version to 2.35.0 · 900a586f
      Bhuvnesh Chaudhary authored
      There are changes to the regress tests in the GPDB repository, so bumping up the
      minor version.
      900a586f
    • B
      Preprocess query to ensure Scalar Ident is on LHS · aa6754d1
      Bhuvnesh Chaudhary authored
      The CScalarCmp and CScalarIsDistinctFrom operators must have the CScalarIdent
      operator on the LHS and the CScalarConst operator on the RHS. If that is
      not the case, the predicate is assigned a default selectivity,
      which impacts the cardinality estimate.
      
      This patch fixes the issue by reordering the children of
      CScalarCmp and CScalarIsDistinctFrom when the CScalarIdent operator is on
      the RHS and the CScalarConst is on the LHS. The comparison operator is also
      changed to account for the reordering of the arguments, if supported. If the
      corresponding comparison operator does not exist or is not supported, the
      children are not reordered.
      Only cases of the form CONST = VAR are handled by this patch.
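      The reordering can be sketched as follows. This is a hypothetical Python illustration (names are made up, not ORCA code): commute the comparison operator when a commuted form exists, otherwise leave the predicate untouched:

```python
# Commuted counterpart of each supported comparison operator.
COMMUTED = {"=": "=", "<>": "<>", "<": ">", ">": "<", "<=": ">=", ">=": "<="}

def normalize_cmp(lhs, op, rhs):
    """lhs/rhs are ('const', value) or ('ident', name) pairs.
    Put the ident on the LHS when the operator can be commuted."""
    if lhs[0] == "const" and rhs[0] == "ident" and op in COMMUTED:
        return rhs, COMMUTED[op], lhs   # swap children, flip the operator
    return lhs, op, rhs                 # already normalized or not commutable
```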
      Signed-off-by: Omer Arap <oarap@pivotal.io>
      aa6754d1
  17. 01 Jul 2017, 1 commit
  18. 30 Jun 2017, 2 commits
    • J
      Enable Concourse caching of ccache · b46380ee
      Jesse Zhang authored
      Shiny new feature in Concourse 3.3.0
      (https://concourse.ci/running-tasks.html#caches)
      
      [ci skip]
      b46380ee
    • H
      Get rid of UlSafeLength() function. · 25c2b4dd
      Heikki Linnakangas authored
      It made the assumption that it's OK to call it on a NULL pointer, which
      isn't cool with all C++ compilers and options. I'm getting a bunch of
      warnings like this because of it:
      
      /home/heikki/gpdb/optimizer-main/libgpos/include/gpos/common/CDynamicPtrArray.inl:382:3: warning: nonnull argument ‘this’ compared to NULL [-Wnonnull-compare]
         if (NULL == this)
         ^~
      
      There are a few other places that produce the same error, but one step at
      a time. This is most important because it's in an inline function, so this
      produces warnings also in any code that uses ORCA, like the translator code
      in GPDB's src/backend/gpopt/ directory, not just ORCA itself.
      
      Since the function is now gone, all references to it also need to be removed
      from the translator code outside ORCA.
      
      Bump up Orca version to 2.34.2
      25c2b4dd
  19. 28 Jun 2017, 2 commits