- 06 October 2017, 2 commits
-
-
Committed by Shreedhar Hardikar
Add functionality to return a Constraint Interval for an expression of the form col IS DISTINCT FROM const. Also remove an unnecessary check in PexprPredicateCol() for the IS NOT NULL case, since it is now supported. That change left behind redundant IS NOT NULL filters, so take care of those as well.
Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
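A small self-contained sketch of the predicate semantics the new constraint interval has to capture (plain C++ for illustration, not the actual CConstraintInterval API): `col IS DISTINCT FROM const` is satisfied by every non-NULL value except the constant, and also by NULL, so the derived interval is the full domain minus the single point, with nulls included.

```cpp
#include <cassert>
#include <optional>

// Truth value of "x IS DISTINCT FROM c" for a nullable column value x and a
// non-NULL constant c: true for NULL, true for any value != c, false only for c.
bool IsDistinctFromConst(const std::optional<int> &x, int c)
{
    if (!x.has_value())
    {
        return true;  // NULL is distinct from any non-NULL constant
    }
    return *x != c;
}

int main()
{
    assert(IsDistinctFromConst(std::nullopt, 5));  // NULL satisfies the predicate
    assert(IsDistinctFromConst(4, 5));             // values other than 5 satisfy it
    assert(!IsDistinctFromConst(5, 5));            // only the constant itself fails
    return 0;
}
```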
-
Committed by Shreedhar Hardikar
Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
-
- 05 October 2017, 1 commit
-
-
Committed by Venkatesh Raghavan
There was a fairly classic bug / typo where the assertion would never fail because we put an enum member (non-zero) in a boolean context. Even though the method in question is generically named `TransformImplementBinaryOp`, it's actually only on the code path of transforming a physical nested loop (non-correlated, non-indexed) join. This commit adds back all types of eligible nested loop joins into the assertion.
Signed-off-by: Jesse Zhang <sbjesse@gmail.com>
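A schematic illustration of the bug pattern described above; the enum names here are hypothetical stand-ins, not the actual ORCA xform identifiers.

```cpp
#include <cassert>

// Hypothetical enum standing in for ORCA's xform ids; every non-zero member
// used directly in a boolean context makes an assertion vacuously true.
enum EXformId
{
    ExfSentinel = 0,
    ExfImplementInnerNLJoin,      // non-zero
    ExfImplementLeftOuterNLJoin   // non-zero
};

void CheckXform(EXformId exfid)
{
    // Buggy: the enum member itself is the condition, so this can never fail.
    assert(ExfImplementInnerNLJoin);

    // Intended: compare the incoming id against the eligible join xforms.
    assert(exfid == ExfImplementInnerNLJoin ||
           exfid == ExfImplementLeftOuterNLJoin);
}

int main()
{
    CheckXform(ExfImplementInnerNLJoin);
    return 0;
}
```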
-
- 29 September 2017, 1 commit
-
-
Committed by Bhuvnesh Chaudhary
-
- 27 September 2017, 2 commits
-
-
Committed by Omer Arap
-
Committed by Omer Arap
ORCA removes outer references from the grouping columns of a GbAgg. ORCA also trims an existential subquery whose inner expression is a GbAgg with no grouping columns, replacing it with a Boolean constant. However, in some cases applying the first preprocessing step before the second produces an unintended trimming of existential subqueries. This commit changes the order of these two preprocessing steps to avoid that complication. E.g.
```
SELECT * from foo where not exists (SELECT sum(bar.a) from bar where foo.a = bar.a GROUP BY foo.a);
```
In this example the grouping column is an outer reference, so it is removed by `PexprRemoveSuperfluousOuterRefs`. The next preprocessing step, `PexprTrimExistentialSubqueries`, then sees that the GbAgg has no grouping columns and collapses the `NOT EXISTS` to `false`. Changing the order of the two steps fixes the problem.
Bump version to 2.46.3
-
- 23 September 2017, 1 commit
-
-
Committed by Bhuvnesh
ORCA source will be pushed to a Bintray repository for each version bump. Developers can use the Conan dependency manager to build ORCA before building GPDB in a couple of minimal steps, without worrying about the version dependency. As we currently bump the ORCA version used by GPDB, we will ensure that conanfile.txt gets updated for each bump of ORCA. To build ORCA from GPDB:
```
Step 1: cd <path_to_gpdb>/depends
Step 2: env CONAN_CMAKE_GENERATOR=Ninja conan install -s build_type=Debug --build
```
where `build_type` can be `Debug` or `Release`.
Signed-off-by: Jemish Patel <jpatel@pivotal.io>
-
- 20 September 2017, 1 commit
-
-
Committed by Jesse Zhang
Now that we are all using it. [ci skip]
-
- 19 September 2017, 3 commits
-
-
Committed by Heikki Linnakangas
This potentially allows the compiler to make some optimizations, or at least give better warnings, e.g. about using variables uninitialized.
-
Committed by Bhuvnesh Chaudhary
GPOS raises exceptions with different severity levels, but they were all being logged to the GPDB logs at LOG severity. This is the initial commit which introduces the functionality: if an exception is created with a debug log level, its messages will be logged to GPDB at DEBUG1 level; the rest will be logged at LOG level.
Signed-off-by: Jemish Patel <jpatel@pivotal.io>
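A minimal sketch of the kind of mapping described above; the severity and log-level enums here are hypothetical stand-ins, not the actual GPOS or GPDB symbols (only DEBUG1 and LOG correspond to real GPDB log levels).

```cpp
#include <cassert>

// Hypothetical severity values standing in for GPOS exception severities.
enum class ExSeverity { Debug, Notice, Warning, Error };

// Hypothetical GPDB-side log levels, mirroring elog's DEBUG1 and LOG.
enum class GpdbLogLevel { Debug1, Log };

// Debug-severity exceptions are demoted to DEBUG1; everything else stays at LOG.
GpdbLogLevel LogLevelForSeverity(ExSeverity sev)
{
    return sev == ExSeverity::Debug ? GpdbLogLevel::Debug1 : GpdbLogLevel::Log;
}

int main()
{
    assert(LogLevelForSeverity(ExSeverity::Debug) == GpdbLogLevel::Debug1);
    assert(LogLevelForSeverity(ExSeverity::Error) == GpdbLogLevel::Log);
    return 0;
}
```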
-
Committed by Bhuvnesh
If the correlated subquery has aggregate window functions, we can pull up the quals and the aggregate function as the condition of the join between the outer and inner query.
Signed-off-by: Jemish Patel <jpatel@pivotal.io>
-
- 16 September 2017, 1 commit
-
-
Committed by Bhuvnesh Chaudhary
While attempting to decorrelate the subquery, we were incorrectly pulling up the join before computing the window function over the results of the join. In cases where we have a subquery with a window function and the subquery has outer references, we should not attempt to decorrelate it. Ex:
```
select C.j from C where C.i in (select rank() over (order by B.i) from B where B.i=C.i) order by C.j;
```
The above subquery has outer references and the result of the window function is projected from the subquery. There are further optimizations which can be done in the case of existential queries, but this PR fixes the plan.
Signed-off-by: Jemish Patel <jpatel@pivotal.io>
-
- 15 September 2017, 4 commits
-
-
Committed by Venkatesh Raghavan
Prior to this fix, the logic that calculated max cardinality for each logical operator assumed that a Full Outer Join with a false condition always returns an empty set. This was used by the preprocessing step `CExpressionPreprocessor::PexprPruneEmptySubtrees` to eliminate the FOJ subtrees (since it thought they had zero output cardinality), replacing them with a const table get with zero tuples:
```
vraghavan=# explain select * from foo a full outer join foo b on false;
                   QUERY PLAN
------------------------------------------------
 Result  (cost=0.00..0.00 rows=0 width=8)
   ->  Result  (cost=0.00..0.00 rows=0 width=8)
         One-Time Filter: false
 Optimizer status: PQO version 2.43.1
(4 rows)
```
This collapsing is incorrect: a full outer join with a false join condition still returns every row from both inputs, NULL-padded on the other side. In this CL, the max cardinality logic has been fixed to ensure that GPORCA generates a correct plan. After the fix:
```
vraghavan=# explain select * from foo a full outer join foo b on false;
                                                             QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Gather Motion 3:1  (slice2; segments: 3)  (cost=0.00..2207585.75 rows=35 width=8)
   ->  Result  (cost=0.00..2207585.75 rows=12 width=8)
         ->  Sequence  (cost=0.00..2207585.75 rows=12 width=8)
               ->  Shared Scan (share slice:id 2:0)  (cost=0.00..431.00 rows=2 width=1)
                     ->  Materialize  (cost=0.00..431.00 rows=2 width=1)
                           ->  Table Scan on foo  (cost=0.00..431.00 rows=2 width=34)
               ->  Sequence  (cost=0.00..2207154.75 rows=12 width=8)
                     ->  Shared Scan (share slice:id 2:1)  (cost=0.00..431.00 rows=2 width=1)
                           ->  Materialize  (cost=0.00..431.00 rows=2 width=1)
                                 ->  Table Scan on foo  (cost=0.00..431.00 rows=2 width=34)
                     ->  Append  (cost=0.00..2206723.75 rows=12 width=8)
                           ->  Nested Loop Left Join  (cost=0.00..882691.26 rows=10 width=68)
                                 Join Filter: false
                                 ->  Shared Scan (share slice:id 2:0)  (cost=0.00..431.00 rows=2 width=34)
                                 ->  Result  (cost=0.00..0.00 rows=0 width=34)
                                       One-Time Filter: false
                           ->  Result  (cost=0.00..1324032.49 rows=2 width=68)
                                 ->  Nested Loop Left Anti Semi Join  (cost=0.00..1324032.49 rows=2 width=34)
                                       Join Filter: false
                                       ->  Shared Scan (share slice:id 2:1)  (cost=0.00..431.00 rows=2 width=34)
                                       ->  Materialize  (cost=0.00..431.00 rows=5 width=1)
                                             ->  Broadcast Motion 3:3  (slice1; segments: 3)  (cost=0.00..431.00 rows=5 width=1)
                                                   ->  Result  (cost=0.00..431.00 rows=2 width=1)
                                                         ->  Shared Scan (share slice:id 1:0)  (cost=0.00..431.00 rows=2 width=1)
 Optimizer status: PQO version 2.43.1
(25 rows)
```
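A minimal standalone sketch of the corrected reasoning (plain C++, not the actual ORCA max-cardinality implementation): with a constant-false join condition no pair matches, so the full outer join emits each outer row and each inner row once, and its maximum cardinality is the sum of the children's cardinalities rather than zero.

```cpp
#include <cassert>
#include <cstdint>

// Maximum cardinality of a full outer join whose join condition is
// constant-false: no pair matches, so every outer row and every inner row is
// emitted exactly once, NULL-padded on the other side.
std::uint64_t FalseFojMaxCard(std::uint64_t outerMaxRows, std::uint64_t innerMaxRows)
{
    // The buggy logic effectively treated this as 0, which let the
    // preprocessor prune the whole FOJ subtree; the correct bound is the sum.
    return outerMaxRows + innerMaxRows;
}

int main()
{
    // Two non-empty children can never collapse to an empty result.
    assert(FalseFojMaxCard(3, 4) == 7);
    return 0;
}
```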
-
Committed by Omer Arap
-
Committed by Omer Arap
-
Committed by Omer Arap
GPORCA should not spend time extracting column statistics that are not needed for cardinality estimation. This commit eliminates the overhead of unnecessarily requesting and generating statistics for columns that are not used in cardinality estimation.
E.g.: `CREATE TABLE foo (a int, b int, c int);`
For table foo, the query below only needs stats for column `a`, which is the distribution column, and column `c`, which is the column used in the where clause.
`select * from foo where c=2;`
However, prior to this commit, the column statistics for column `b` were also calculated and passed in for cardinality estimation. The only information the optimizer needs for column `b` is its `width`; for this tiny piece of information, we were transferring every piece of stats information for that column. This commit and its counterpart commit in GPDB ensure that the column width information is passed and extracted in the `dxl:Relation` metadata. Preliminary results for short-running queries show up to a 65x performance improvement.
Signed-off-by: Jemish Patel <jpatel@pivotal.io>
-
- 12 September 2017, 4 commits
-
-
Committed by Jemish Patel
Index keys are relative to the relation, while the list of columns in `pdrgpcr` is relative to the table descriptor and does not include any dropped columns. Consider the case where we had 20 columns in a table. We create an index that covers col #20 as one of its keys. Then we drop columns 10 through 15. Now the index key still points to col #20, but the column ref list in `pdrgpcr` will only have 15 elements in it, causing ORCA to crash with an `Out of Bounds` exception when `CLogical::PosFromIndex()` gets called.
This commit fixes that issue by using the index key as the position to retrieve the column from the relation. We use the column's `attno` to get the column's current position `ulPos` relative to the table descriptor, and then use this `ulPos` to retrieve the `CColRef` of the index key column as shown below:
```
CColRef *pcr = (*pdrgpcr)[ulPos];
```
We also added the 2 test cases below to test for the above condition:
1. `DynamicIndexGetDroppedCols`
2. `LogicalIndexGetDroppedCols`
Both of the above test cases use a table with 30 columns, create a btree index on 6 columns, and then drop 7 columns so the table only has 23 columns.
This commit also bumps ORCA to version 2.43.1
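A simplified standalone illustration of the off-by-dropped-columns problem described above (the data layout and helper here are hypothetical, not ORCA's actual structures): the index key records the column's original attribute number, while the column-ref array only holds non-dropped columns, so the key must be remapped through the live columns' attnos before being used as an array position.

```cpp
#include <cassert>
#include <vector>

// A live (non-dropped) column in a table descriptor, identified by its
// original attribute number in the relation.
struct Column
{
    int attno;  // original position in the relation (1-based)
};

// Map an index key (an original attno) to its current position in the
// dropped-column-free column list, instead of using the attno directly.
int PosFromIndexKey(const std::vector<Column> &liveColumns, int indexKeyAttno)
{
    for (int pos = 0; pos < static_cast<int>(liveColumns.size()); ++pos)
    {
        if (liveColumns[pos].attno == indexKeyAttno)
        {
            return pos;
        }
    }
    return -1;  // the key refers to a dropped column; caller must handle this
}

int main()
{
    // Relation originally had attnos 1..6; attnos 3 and 4 were dropped.
    std::vector<Column> live = {{1}, {2}, {5}, {6}};

    // An index key on attno 6: using 6 directly would run past the 4-element
    // list, but the remapped position is 3.
    assert(PosFromIndexKey(live, 6) == 3);
    return 0;
}
```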
-
Committed by Bhuvnesh Chaudhary
-
Committed by Bhuvnesh Chaudhary
-
Committed by Bhuvnesh Chaudhary
With commit 387c485d, the winstar and winagg fields were added to the WindowRef node, so this commit adds handling for them in ORCA.
-
- 09 September 2017, 6 commits
-
-
Committed by Heikki Linnakangas
Turns out that COstreamFile was still used by GPDB's ORCA translator code, so it wasn't quite dead yet, after all. This reverts commit 770dd5db. Bump version to 2.42.3
-
Committed by Heikki Linnakangas
Bump version to 2.42.2
-
Committed by Heikki Linnakangas
-
Committed by Heikki Linnakangas
-
Committed by Heikki Linnakangas
The header file was referenced in a few places, but it was otherwise unused.
-
Committed by Omer Arap
Reformat minidump files with xmllint.
Update plans and costs after changing system column widths.
Bump version to 2.42.1
-
- 08 September 2017, 1 commit
-
-
Committed by Heikki Linnakangas
-
- 07 September 2017, 1 commit
-
-
Committed by Dhanashree Kashid
Currently ORCA does not support index scan on leaf partitions when leaf partitions are queried directly; it only supports index scan if we query the root table. This PR, along with the corresponding GPDB PR, adds support for using indexes when leaf partitions are queried directly.

When a root table that has indexes (either homogeneous/complete or heterogeneous/partial) is queried, the Relcache Translator sends index information to ORCA. This enables ORCA to generate an alternative plan with a Dynamic Index Scan on all partitions (in case of a homogeneous index), or a plan with partial scans, i.e. Dynamic Table Scan on leaf partitions that don't have indexes + Dynamic Index Scan on leaf partitions with indexes (in case of a heterogeneous index).

This is a two-step process in the Relcache Translator, as described below:

Step 1 - Get list of all index oids
`CTranslatorRelcacheToDXL::PdrgpmdidRelIndexes()` performs this step, and it only retrieves indexes on root and regular tables; for leaf partitions it bails out. Now for the root, the list of index oids is nothing but the index oids on its leaf partitions. For instance:
```
CREATE TABLE foo (a int, b int, c int, d int)
DISTRIBUTED by (a)
PARTITION BY RANGE(b)
  (PARTITION p1 START (1) END (10) INCLUSIVE,
   PARTITION p2 START (11) END (20) INCLUSIVE);
CREATE INDEX complete_c on foo USING btree (c);
CREATE INDEX partial_d on foo_1_prt_p2 using btree(d);
```
The index list will look like = { complete_c_1_prt_p1, partial_d }
For a complete index, the index oid of the first leaf partition is retrieved. If there are partial indexes, all the partial index oids are retrieved.

Step 2 - Construct Index Metadata object
`CTranslatorRelcacheToDXL::Pmdindex()` performs this step. For each index oid retrieved in Step #1 above, construct an Index Metadata object (CMDIndexGPDB) to be stored in the metadata cache, such that ORCA can get all the information about the index. Along with all other information about the index, CMDIndexGPDB also contains a flag fPartial which denotes whether the given index is homogeneous (ORCA will apply it to all partitions selected by the partition selector) or heterogeneous (the index will be applied to only the appropriate partitions). The process is as follows:
```
Foreach oid in index oid list:
    Get index relation (rel)
    If rel is a leaf partition:
        Get the root rel of the leaf partition
        Get all the indexes on the root (this will be the same list as step #1)
        Determine if the current index oid is homogeneous or heterogeneous
        Construct CMDIndexGPDB appropriately (with fPartial, part constraint, default levels info)
    Else:
        Construct a normal CMDIndexGPDB object.
```
Now for leaf partitions, there is no notion of homogeneous or heterogeneous indexes, since a leaf partition is like a regular table. Hence in Pmdindex() we should not check whether the index is complete or not. Additionally, whether a given index is homogeneous or heterogeneous needs to be decided from the perspective of the relation we are querying (such as the root or a leaf). Hence the right place for the fPartial flag is the relation metadata object (CMDRelationGPDB) and not the independent index metadata object (CMDIndexGPDB).

This commit makes the following changes to support index scan on leaf partitions along with partial scans:

Relcache Translator: In Step 1, retrieve the index information on the leaf partition and create a list of CMDIndexInfo objects which contain the index oid and the fPartial flag. Step 1 is the place where we know what relation we are querying, which enables us to determine whether or not the index is homogeneous from the context of the relation. The relation metadata tag will look like the following after this change:
Before:
```
<dxl:Indexes>
  <dxl:Index Mdid="0.17159874.1.0"/>
  <dxl:Index Mdid="0.17159920.1.0"/>
</dxl:Indexes>
```
After:
```
<dxl:IndexInfoList>
  <dxl:IndexInfo Mdid="0.17159874.1.0" IsPartial="true"/>
  <dxl:IndexInfo Mdid="0.17159920.1.0" IsPartial="false"/>
</dxl:IndexInfoList>
```
ORCA changes: A new class CMDIndexInfoGPDB has been created in ORCA which contains the index mdid and the fPartial flag. For external tables, normal tables and leaf partitions, the fPartial flag will always be false. CMDRelationGPDB will contain an array of CMDIndexInfoGPDB instead of a simple index mdid array. Add a new parse handler to parse IndexInfoList and IndexInfo to create an array of CMDIndexInfoGPDB. Update the existing minidumps to remove the fPartial flag from the Index metadata tag and associate it with the IndexInfo tag under the Relation metadata. Add new test scenarios for querying the leaf partition with a homogeneous/heterogeneous index on the root table.
-
- 02 September 2017, 4 commits
-
-
Committed by Dhanashree Kashid
Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
-
Committed by Dhanashree Kashid
Now we send the part constraint expression only in the cases below:
```
IsPartTable | Index | DefaultParts | ShouldSendPartConstraint
------------+-------+--------------+--------------------------
NO          | -     | -            | -
YES         | YES   | YES/NO       | YES
YES         | NO    | NO           | NO
YES         | NO    | YES          | YES (but only default levels info)
```
This commit updates the minidumps accordingly:
1. If the Relation tag has indices, then keep the part constraint tag
2. If the Relation tag has no indices and no default partitions, remove the entire part constraint tag
3. If the Relation tag has no indices but has default partitions at any level, then keep the part constraint tag but remove the scalar expression
4. Regenerated the following stale minidumps:
   * DynamicIndexScan-Homogenous.mdp
   * DynamicIndexScan-Heterogenous-Union.mdp
   * DynamicBitmapTableScan-Basic.mdp
5. Added four more test cases to CPartTblTest demonstrating the table above.
[Ref #149769559]
Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
Signed-off-by: Omer Arap <oarap@pivotal.io>
-
Committed by Jemish Patel
Do not serialize and de-serialize the part constraint expression when there are no indices on the partitioned rel. The relcache translator in GPDB will send an empty part constraint expression when the rel has no indices.
[Ref #149769559]
Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
-
Committed by Omer Arap
We never send null part constraints from the relcache translator, hence we do not need to handle them. This code was probably added to support older minidumps; there are a few very old minidump files which do not contain the part constraint tag in the relation tag. Now, with the fix on the relcache translator side in GPDB, the only case where we send NULL part constraints is when there are no indices and no default partitions; we still don't need a null check for the part constraint in this case because `fDummyConstraint` will always be true.
[Ref #149769559]
Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
-
- 29 August 2017, 1 commit
-
-
Committed by Omer Arap
This commit adds a preprocessing step that changes the expression tree when there is an IN subquery with a project list that includes an outer reference but no column from the project's relational child. This preprocessing helps ORCA decorrelate the subquery. ORCA currently does not support directly decorrelating IN subqueries if there is an outer reference in the CLogicalProject. Converting such an IN subquery to a predicate AND an EXISTS subquery helps ORCA generate more alternatives with the decorrelated subquery option. Below is an example of the preprocessing applied in this commit.
Before preprocessing:
```
+--CScalarSubqueryAny(=)["?column?" (17)]
   |--CLogicalProject
   |  |--CLogicalGet "bar" ("bar"), Columns: ["c" (9)]
   |  +--CScalarProjectList
   |     +--CScalarProjectElement "?column?" (17)
   |        +--CScalarOp (+)
   |           |--CScalarIdent "b" (1)
   |           +--CScalarConst (1)
   +--CScalarIdent "a" (0)
```
After:
```
+--CScalarBoolOp (EboolopAnd)
   |--CScalarOp (=)
   |  |--CScalarIdent "a" (0)
   |  +--CScalarOp (+)
   |     |--CScalarIdent "b" (1)
   |     +--CScalarConst (1)
   +--CScalarSubqueryExists
      +--CLogicalGet "bar" ("bar"), Columns: ["c" (9)]
```
This commit bumps ORCA version to 2.40.3
Signed-off-by: Jemish Patel <jpatel@pivotal.io>
-
- 28 August 2017, 1 commit
-
-
Committed by Heikki Linnakangas
Per discussion at https://github.com/greenplum-db/gpdb/pull/2379, we don't really need to use a special, patched version of Xerces-C. Remove the check. See also commit 0b3b421a in GPDB, where we made the same change for GPDB's autoconf check.
-
- 26 August 2017, 1 commit
-
-
Committed by Ekta Khanna
Prior to c09a0acd, the `CStackObject()` constructor did a validity check that the pointer is on the stack, using `FOnStack()`. Since that function has been removed, the following test in `CAutoPTest` is invalid: `CAutoPTest::EresUnittest_Allocation()`. This commit removes the test.
Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
-
- 24 August 2017, 1 commit
-
-
Committed by Heikki Linnakangas
If you compile with -fomit-frame-pointer, which is the default on recent versions of gcc, the stack unwinding code in FOnStack will not work. This is just a non-critical debugging aid, so let's just remove it altogether.
-
- 10 August 2017, 3 commits
-
-
Committed by Ekta Khanna
Signed-off-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
-
Committed by Ekta Khanna
This commit creates a new file `CCastUtils.cpp` which maintains all the cast functions. The following functions are moved from `CUtils.cpp` to `CCastUtils.cpp`:
```
* BOOL FBinaryCoercibleCastedScId(CExpression *pexpr, CColRef *pcr)
* BOOL FBinaryCoercibleCastedScId(CExpression *pexpr)
* const CColRef *PcrExtractFromScIdOrCastScId(CExpression *pexpr)
* CExpression *PexprCast(IMemoryPool *pmp, CMDAccessor *pmda, const CColRef *pcr, IMDId *pmdidDest)
* BOOL FBinaryCoercibleCast(CExpression *pexpr)
* CExpression *PexprWithoutBinaryCoercibleCasts(CExpression *pexpr)
```
The following functions are moved from `CPredicateUtils.cpp` to `CCastUtils.cpp`:
```
* DrgPexpr *PdrgpexprCastEquality(IMemoryPool *pmp, CExpression *pexpr)
* CExpression *PexprAddCast(IMemoryPool *pmp, CExpression *pexprPred)
* CExpression *PexprCast(IMemoryPool *pmp, CMDAccessor *pmda, CExpression *pexpr, IMDId *pmdidDest)
```
Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
Signed-off-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
-
Committed by Dhanashree Kashid
Currently the executor crashes for the following query with ORCA on:
```
CREATE TABLE FOO(a integer NOT NULL, b double precision[]);
SELECT b FROM foo UNION ALL SELECT ARRAY[90, 90] as Cont_features;
```
In this query, we are appending an integer array (ARRAY[90, 90]) to a double precision array (foo.b), and hence we need to apply a cast on ARRAY[90, 90] to generate ARRAY[90, 90]::double precision[]. In GPDB 5 there is no direct function available that can cast an array of any type to an array of any other type. So in the relcache-to-DXL translator we look into the array elements, get their type, and try to find a cast function for them. For this query, the source type is 23 (integer) and the destination type is 701 (double precision), and we try to find whether we have a conversion function for 23 -> 701. Since one is available, we send that function to ORCA as follows:
```
<dxl:MDCast Mdid="3.1007.1.0;1022.1.0" Name="float8" BinaryCoercible="false" SourceTypeId="0.1007.1.0" DestinationTypeId="0.1022.1.0" CastFuncId="0.316.1.0"/>
```
Here we are misinforming ORCA by specifying that the function with id 316 is available to convert type 1007 (integer array) to 1022 (double precision array). However, function id 316 is a simple int4-to-float8 conversion function and it CANNOT convert an array of int4 to an array of double precision. ORCA generates a plan using this function, but the executor crashes while executing it because this function cannot handle arrays.
This commit adds a new ArrayCoerceCast metadata object which will be constructed when we need to convert an array of one type to another, instead of constructing the plain Cast metadata. `CMDArrayCoerceCastGPDB` extends `CMDCastGPDB` in that it includes the information necessary to generate `CScalarArrayCoerceExpr`. In the Relcache Translator on the GPDB side, we will construct an object of `CMDArrayCoerceCastGPDB` instead of `CMDCastGPDB`, depending on the coercion path determined by `FCastFunc`.
Added relevant test cases.
Ref [#149524459]
Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
-
- 29 July 2017, 1 commit
-
-
Committed by Haisheng Yuan
Add a NULL check before calling IMDId's FValid().
-