1. 27 Feb 2020, 1 commit
  2. 19 Feb 2020, 1 commit
  3. 15 Feb 2020, 1 commit
    • Change bitmap index costing to choose bitmap NLJs more often · 8a5513ec
      Chris Hajas committed
      The experiments and assertions below are based on the cal_test.py
      calibration script. We used regression analysis, isolating a single
      variable at a time, to determine the coefficients.
      
      This commit makes substantial changes to the costing of bitmap indexes. Our
      goal was to choose bitmap index NL joins instead of hash joins, as the
      execution time of the bitmap NL joins was more than 10x lower than that of
      hash joins in many cases.
      
      Previously, we were multiplying the rebinds by the bitmap page cost,
      which made the cost much higher than that of a hash join in
      many cases.
      
      Now, we no longer multiply the page cost by the number of rebinds, and
      instead multiply the rebinds by a much smaller rebind cost.
      
      Additionally, we took this opportunity to simplify the cost model a bit
      by removing the separate code path for small vs large NDVs. We did not
      see the large NDV path being used in joins, and in non-join cases it had
      very minimal impact on the cost.
      
      This functionality is guarded by a traceflag, EopttraceCalibratedBitmapIndexCostModel.
      In GPDB, it will be enabled by setting
      `optimizer_cost_model=experimental`. The intent is to enable this by
      default in the near future.
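
      As a rough illustration of the shape of this change (simplified formulas with
      made-up coefficient names and values, not ORCA's calibrated cost model):

      ```
      // Simplified illustration of the bitmap rebind costing change.
      // Coefficient names and values are hypothetical, not ORCA's calibrated ones.
      #include <cstdio>

      double OldBitmapCost(double rebinds, double pages, double pageCost)
      {
          // Before: every rebind paid the full bitmap page cost again, which made
          // bitmap NL joins look far more expensive than hash joins.
          return rebinds * pages * pageCost;
      }

      double NewBitmapCost(double rebinds, double pages, double pageCost, double rebindCost)
      {
          // After: pages are charged once; each rebind only adds a much smaller rebind cost.
          return pages * pageCost + rebinds * rebindCost;
      }

      int main()
      {
          double rebinds = 10000, pages = 50, pageCost = 1.0, rebindCost = 0.01;
          std::printf("old: %.1f  new: %.1f\n",
                      OldBitmapCost(rebinds, pages, pageCost),
                      NewBitmapCost(rebinds, pages, pageCost, rebindCost));
          return 0;
      }
      ```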
      Co-authored-by: Chris Hajas <chajas@pivotal.io>
      Co-authored-by: Ashuka Xue <axue@pivotal.io>
  4. 14 Feb 2020, 1 commit
  5. 07 Feb 2020, 1 commit
    • DPv2 Transform now handles left outer joins (#564) · 7982bd27
      Hans Zeller committed
      The CExpandNAryJoinDPv2 transform now recognizes the CScalarNAryJoinPredList
      operator for NAry joins containing outer joins.
      
      We now transition gracefully from exhaustive to a MinCard algorithm when
      we exceed the 10-table join limit in the DPv2 transform.
      
      Preprocessor changes:
      
      - Because we sometimes call the preprocessor multiple times on a tree
        (e.g. with CTEs), we needed to make the calls idempotent
      - Handle the new NAry join correctly in CExpressionPreprocessor::PexprOuterJoinToInnerJoin()
      - Improve constraint and max card computation for NAry join with left outer joins in it
      
      Changes to the CJoinOrderDPv2 class:
      - Changed the base class, CJoinOrder, to add information about which vertex
        a given edge belongs to, if it is an ON predicate
      - Added a new CJoinOrder constructor to compute this new information
      
      - New data structures to describe join levels:
      
        We still have an array of levels. Each level is now an SLevelInfo struct.
        Each SLevelInfo object has an array of SGroupInfo structs.
        Each SGroupInfo struct has a bitset of atoms (a component) and
        an expression (SExpressionInfo). The SExpressionInfo now has pointers to
        child SGroupInfos, in addition to the expression itself, to help compute
        the cost.
      
      - The cost now includes the cardinality of the join itself. To avoid computing stats
        too many times, the stats are stored in each group, and we have a new map to
        look up already computed stats, given a bitset (BitSetToGroupInfoMap).
      
      - The cost function also penalizes cross-products, similar to what the NLJ
        penalty (EcpNLJFactor) does.
      
      - We no longer make a list of CExpressionArrays and then pick the cheapest
        of each ExpressionArray. Instead, we delete expressions that aren't the
        cheapest in their bitset immediately. To do that, we store all the
        good expressions in the CJoinOrderDPv2, not in temporary variables.
      
      - We limit the number of groups in a given level to the number of groups
        in a 10-way join. In other words, joins up to 10-way should be enumerated
        exhaustively, larger joins will be restricted and levels greater than 10
        have only one group, like for MinCard.
      
      - To implement the limits for groups and also for the top k expressions at
        the highest level, we added a k-heap template struct (sketched below).
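
      As a minimal sketch of the bounded top-k idea behind that template (standard-library
      based, not ORCA's actual k-heap implementation, which works on its own group and
      expression types):

      ```
      // Keep only the k cheapest entries seen so far, using a max-heap keyed on cost.
      #include <cstdio>
      #include <queue>
      #include <utility>
      #include <vector>

      template <typename T>
      class TopK
      {
          std::priority_queue<std::pair<double, T> > m_heap;  // most expensive entry on top
          size_t m_k;

      public:
          explicit TopK(size_t k) : m_k(k) {}

          void Insert(double cost, const T &elem)
          {
              m_heap.push(std::make_pair(cost, elem));
              if (m_heap.size() > m_k)
                  m_heap.pop();  // evict the most expensive entry
          }

          std::vector<T> Elements()
          {
              std::vector<T> result;
              while (!m_heap.empty())
              {
                  result.push_back(m_heap.top().second);
                  m_heap.pop();
              }
              return result;  // cheapest entries, most expensive first
          }
      };

      int main()
      {
          TopK<int> topk(3);
          for (int i = 0; i < 10; i++)
              topk.Insert(/*cost*/ 10.0 - i, /*elem*/ i);
          for (int e : topk.Elements())
              std::printf("%d ", e);  // prints the 3 cheapest elements: 7 8 9
          std::printf("\n");
          return 0;
      }
      ```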
      
      Changes to statistics estimation:
      
      - Use stats expression for children when deriving stats in higher groups
      
      - Use histograms to compute the NDVs for join columns, don't override the
        scale factor later with the global NDVs (that ignores the effect of some
        predicates)
      
      - We will also need a fix to the join damping factor, see
        https://github.com/greenplum-db/gporca/pull/561
      Co-authored-by: Hans Zeller <hzeller@pivotal.io>
      Co-authored-by: Sambitesh Dash <sdash@pivotal.io>
  6. 05 Feb 2020, 2 commits
    • Support MCV based cardinality estimation for all text related types · 157b0a48
      Shreedhar Hardikar committed
      This commit introduces support for MCV-based estimation for all text-related
      types, including extensions like citext. Prior to this commit, the optimizer
      would estimate the selectivity of predicates involving types like citext,
      name, etc. as 40%, leading to cardinality misestimation and thus suboptimal
      plans for certain queries.
      Co-authored-by: Shreedhar Hardikar <shardikar@pivotal.io>
      Co-authored-by: Abhijit Subramanya <asubramanya@pivotal.io>
    • Introduce a new scale factor algorithm for joins (#567) · ce453cf2
      Ashuka Xue committed
      This commit introduces a new algorithm for calculating the cumulative
      scale factor for joins when estimating cardinality. The new algorithm
      moderately decreases the impact of subsequent predicates by a sqrt
      factor, but only damps a predicate if there are multiple predicates on
      the same two tables. Otherwise, we assume that there is no correlation
      between the predicates and thus they don't need to be damped.
      
      For example, if given the following join predicates:
          t1.a = t2.a AND t1.b = t2.b AND t2.b = t3.a
      
      The algorithm would do the following:
          cumulative scale factor = (SF of t1.a = t2.a) * sqrt(SF of t1.b = t2.b) * (SF of t2.b = t3.a)
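
      A minimal sketch of that computation (hypothetical helper names; the real logic
      lives in ORCA's statistics code and keys predicates by the source tables' mdids):

      ```
      // Cumulative join scale factor: damp a predicate by sqrt only when an earlier
      // predicate already covered the same pair of tables. Illustrative only.
      #include <algorithm>
      #include <cmath>
      #include <cstdio>
      #include <map>
      #include <utility>
      #include <vector>

      struct JoinPred
      {
          int leftTable, rightTable;  // stand-ins for the tables' mdids
          double scaleFactor;         // the "SF" of this predicate in the example above
      };

      double CumulativeScaleFactor(const std::vector<JoinPred> &preds)
      {
          std::map<std::pair<int, int>, int> numSeen;  // predicates seen per table pair
          double cumulative = 1.0;
          for (const JoinPred &p : preds)
          {
              std::pair<int, int> key = std::minmax(p.leftTable, p.rightTable);
              int repeats = numSeen[key]++;
              // The first predicate on a table pair applies fully; later ones are damped.
              cumulative *= (0 == repeats) ? p.scaleFactor : std::sqrt(p.scaleFactor);
          }
          return cumulative;
      }

      int main()
      {
          // t1.a = t2.a AND t1.b = t2.b AND t2.b = t3.a with scale factors 10, 4, 5:
          // cumulative = 10 * sqrt(4) * 5 = 100
          std::vector<JoinPred> preds = {{1, 2, 10}, {1, 2, 4}, {2, 3, 5}};
          std::printf("%.1f\n", CumulativeScaleFactor(preds));
          return 0;
      }
      ```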
      
      In order for this to work, we need to keep track of the table from which
      the predicate came. Thus, we pass along the mdid of the table through
      the colref.
      
      The algorithm is not the default and can be turned on using
      `set optimizer_damping_factor_join = 0`.
      
      Also, fix a bug in the fix_mpds.py script.
      Co-authored-by: Ashuka Xue <axue@pivotal.io>
      Co-authored-by: Chris Hajas <chajas@pivotal.io>
  7. 31 Jan 2020, 1 commit
  8. 11 Jan 2020, 1 commit
  9. 07 Jan 2020, 1 commit
  10. 10 Dec 2019, 1 commit
    • Collapse inner and left outer joins into NAry joins · 589a18ef
      Sambitesh Dash committed
      In the preprocessor, we had so far only collapsed inner joins into NAry
      joins. This commit also adds LOJs. The change is guarded by a
      traceflag that is off by default, so this is dead code at this stage,
      until we deliver the needed optimizer changes.
      
      We also change the algorithm to be more efficient.
      
      - Renamed PexprCollapseInnerJoins to PexprCollapseJoins
      - Added a list of integers to LogicalNAryJoin. These integers
        are present only when the NAry join contains at least one
        LOJ. The indexes help in associating join predicates to
        children.
      - Moved the CollectJoinChildren method from CPredicateUtils
        to class CExpressionPreprocessor, because it now calls
        PexprCollapseJoins recursively.
      - Added a new scalar operator, CScalarNAryJoinPredList,
        that contains the inner join predicates as well as the ON
        predicates for LOJs (see the sketch after this list).
      - Fixed preprocessor issues that came up during unit testing.
      - Changed MaxCard computation for both existing and new
        NAry left outer joins to include the inner table MaxCard as
        long as it is not 0.
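
      A rough illustration of how such an index list can associate predicates with
      join children (the data layout below is illustrative, not ORCA's actual
      representation):

      ```
      // An NAry join over {t1, t2, t3 LOJ} keeps the inner-join predicates in slot 0
      // of a predicate list and each LOJ's ON predicate in a later slot; one integer
      // per child says which slot applies to that child. Illustrative only.
      #include <cstdio>
      #include <string>
      #include <vector>

      struct NAryJoinSketch
      {
          std::vector<std::string> children;       // join children
          std::vector<std::string> predList;       // [0] = inner preds, [k > 0] = ON preds
          std::vector<int> childPredIndexes;       // one entry per child
      };

      int main()
      {
          NAryJoinSketch j{{"t1", "t2", "t3"},
                           {"t1.a = t2.a", "t2.b = t3.b"},  // slot 1: ON pred of the LOJ child
                           {0, 0, 1}};                      // t1, t2 -> inner preds, t3 -> slot 1
          for (size_t i = 0; i < j.children.size(); i++)
          {
              std::printf("%s uses predicate slot %d: %s\n", j.children[i].c_str(),
                          j.childPredIndexes[i], j.predList[j.childPredIndexes[i]].c_str());
          }
          return 0;
      }
      ```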
      Co-authored-by: Sambitesh Dash <sdash@pivotal.io>
      Co-authored-by: Hans Zeller <hzeller@pivotal.io>
  11. 06 Dec 2019, 2 commits
    • Correctly compute distribution policy in phase 1 of gpexpand (#555) · b896d839
      Hans Zeller committed
      Consider a hash-distributed partitioned table foo. After the first stage
      of gpexpand, the root table will have its distribution policy as hashed, but
      the leaf partitions will have random distribution. In such a case,
      ORCA sets m_convert_hash_to_random to true in the table
      descriptor. When the flag is true, we should treat the distribution
      policy of the table as random during planning (see the sketch below).
      
      Note that this code will need to change when we plan to handle
      gpexpand phase 1 in GPDB 6 and later. At that time, we won't be able
      to treat these tables as random-partitioned anymore.
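
      In effect, the planning-side decision reduces to a check like the following
      (a minimal sketch using the flag named above, not the exact ORCA API):

      ```
      // If the table descriptor says the hash distribution should be converted to
      // random (leaves are still random after gpexpand phase 1), plan as if the
      // whole table were randomly distributed. Illustrative only.
      #include <cstdio>

      enum class EDistributionPolicy { Hashed, Random };

      struct TableDescriptorSketch
      {
          EDistributionPolicy m_policy;
          bool m_convert_hash_to_random;  // set by ORCA per the commit message
      };

      EDistributionPolicy EffectivePolicy(const TableDescriptorSketch &tabdesc)
      {
          if (tabdesc.m_convert_hash_to_random)
              return EDistributionPolicy::Random;
          return tabdesc.m_policy;
      }

      int main()
      {
          TableDescriptorSketch partiallyExpanded{EDistributionPolicy::Hashed, true};
          std::printf("%s\n",
                      EffectivePolicy(partiallyExpanded) == EDistributionPolicy::Random
                          ? "treat as randomly distributed"
                          : "treat as hash distributed");
          return 0;
      }
      ```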
      
      Co-authored-by: Sambitesh Dash sdash@pivotal.io
      Co-authored-by: Hans Zeller hzeller@pivotal.io
    • Optimize histogram copying to lazily copy buckets (#547) · 1eec040b
      Chris Hajas committed
      Previously, when constructing new histograms, we would copy all buckets
      from an existing histogram. This was expensive, both in optimization
      time and memory, as this is used in PstatsDerive().
      
      Now, we lazily copy the underlying bucket array only when modifying the
      buckets. We only modify the buckets on three occasions:
      
      1. CapNDVs(): We set the distinct values of the bucket in some
      cases.
      
      2. NormalizeHistogram(): We set the frequency of the bucket to
      another value.
      
      3. DistributeBucketProperties(): We set both the frequency and distinct
      values of the bucket.
      
      In the vast majority of cases, we're simply copying this array and
      reading from it; modifications are relatively rare, which makes this
      much more efficient.
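
      A minimal sketch of the lazy (copy-on-write) idea, using std::shared_ptr for
      illustration rather than ORCA's own ref-counted arrays, with made-up field and
      method names:

      ```
      // Histogram copies share one bucket array; the array is cloned only when a
      // copy is about to modify a bucket.
      #include <cstdio>
      #include <memory>
      #include <vector>

      struct Bucket
      {
          double frequency;
          double distinct;
      };

      class Histogram
      {
          std::shared_ptr<std::vector<Bucket> > m_buckets;

          void MakeBucketsOwned()
          {
              // Copy-on-write: clone the shared array only if another histogram still uses it.
              if (m_buckets.use_count() > 1)
                  m_buckets = std::make_shared<std::vector<Bucket> >(*m_buckets);
          }

      public:
          explicit Histogram(std::vector<Bucket> buckets)
              : m_buckets(std::make_shared<std::vector<Bucket> >(std::move(buckets)))
          {
          }

          // Copying a histogram just bumps a reference count; no buckets are copied.
          Histogram(const Histogram &) = default;

          void SetFrequency(size_t i, double freq)  // e.g. what NormalizeHistogram() needs
          {
              MakeBucketsOwned();
              (*m_buckets)[i].frequency = freq;
          }

          double Frequency(size_t i) const { return (*m_buckets)[i].frequency; }
      };

      int main()
      {
          Histogram h1({{0.5, 10.0}, {0.5, 20.0}});
          Histogram h2 = h1;         // cheap: shares the bucket array
          h2.SetFrequency(0, 0.25);  // triggers the lazy copy
          std::printf("%.2f %.2f\n", h1.Frequency(0), h2.Frequency(0));  // 0.50 0.25
          return 0;
      }
      ```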
      
      From profiling on my laptop, I saw TPC-DS Q72 memory utilization go from
      700MB to 270MB and optimization time go from 5.5s to 4s.
      
      The entire TPC-DS optimization time went from 67s to 59s (12%
      improvement).
      Authored-by: Chris Hajas <chajas@pivotal.io>
  12. 20 Nov 2019, 1 commit
    • Consider Bitmap scans for OR and IN predicates on btree indexes (#554) · 327592f8
      Hans Zeller committed
      When we have predicates that are not AND predicates and are too
      complicated for btree index scans, we now also consider bitmap index
      scans. This is implemented by passing a boolean argument indicating
      that we are processing a complex predicate, and by adding an additional
      parameter that allows btree indexes in addition to the primary index
      type, bitmap.
      
      The change also required updating some older MDP files with the return
      type of the index, since this is now required for BTree indexes as well
      as for bitmap indexes (that has been the case since commit 976cba9b).
      
      Note that most AND predicates are handled fine by the btree code path, so there
      is no need to consider them in the bitmap code path, where we could
      cost the bitmap scans cheaper than the btree index scans.
      
      There are three main cases of ANDed predicates and btree indexes:
      
      - Multiple predicates for the same index, this is handled by the
        btree code path
      - Predicates for different indexes, ORCA will pick only one index per
        alternative, so it will generate the cheapest plan that uses one index,
        while a bitmap scan would be able to use multiple indexes
      - Predicates for one index plus non-index (residual) predicates, this
        is handled by the btree code path
      
      In summary, we have an "opportunity to improve" for ANDed predicates on
      multiple btree indexes. We won't fall back for such predicates and we
      won't generate a bitmap scan at the moment (as planner does). This is
      not a regression.
  13. 14 Nov 2019, 2 commits
    • Prevent fallback when using array comparisons with btree indexes (#553) · 29025668
      Hans Zeller committed
      We only support array comparison predicates on bitmap index scans, not on
      btree index scans or index scans of other types. However, ORCA was attempting
      to generate such plans, leading to a fallback to planner.
      
      With the fix, we will only consider bitmap indexes for array comparison
      predicates. This should avoid the fallback.
      
      Here is a simple test case:
      
      ```
      drop table if exists foo;
      create table foo (a int, b int) distributed by (a);
      insert into foo values (1,1);
      create index foo_ix on foo using btree (b);
      
      set optimizer_print_plan to on;
      set client_min_messages to 'log';
      
      -- before: btree index scan with an array comparison, fallback
      -- after: file scan with array comparison, ORCA plan
      explain select * from foo where b in (1, 3);
      
      -- before: btree index scan with array comparison, fallback, planner error in 5X
      -- after: file scan with array comparison, ORCA plan, succeeds in 5X
      explain update foo set a = 1 where b in (1, 3);
      ```
    • Merge IDatumStatisticsMappable and IDatum classes · bcc826e8
      Chris Hajas committed
      The IDatum class had only one child class, IDatumStatisticsMappable.
      Previously, we performed quite a few expensive dynamic casts from IDatum
      to IDatumStatisticsMappable. Now that we've merged the two classes, the
      cast is unnecessary.
      
      The class hierarchy is as follows:
      
      IDatum
      |
      IDatumStatisticsMappable
      |
      IDatumInt4, IDatumBool, IDatumGeneric....
      |
      CDatumInt4, CDatumBool, CDatumGeneric...
      
      With these last 2 commits, the TPC-DS optimization time on my laptop goes from
      61s to 56s (~8% savings).
      Authored-by: Chris Hajas <chajas@pivotal.io>
  14. 13 Nov 2019, 1 commit
    • Fix "3rd party error log" message in the log with debug builds (#550) · c83a3b93
      Hans Zeller committed
      The new CDebugCounter class caused error messages when we called
      gpos_terminate in a debug build, because it didn't release its
      memory pool before the CMemoryPoolManager that managed it got
      destroyed.
      
      The log messages started with "3rd party error log:".
      
      The fix makes sure that we delete the static CDebugCounter object
      and its associated memory pool before deleting the CMemoryPoolManager
      in the gpos_terminate call.
      
      Also, the code caused simple queries like "select 1" to fall back in
      debug builds, due to an assert in the IDatumInt8::GetByteArrayValue()
      function.  This has been fixed as well.
      
      Note that most of the diffs in COptimizer::PrintQueryOrPlan() are due
      to indentation, the actual code change is just a dynamic cast and an
      added "if (NULL != datum)".
      
      Also fix a double deletion issue, visible in debug builds
      
      We had an issue in gpos_terminate, when we deleted the single CDebugCounter
      object. This object was allocated from a memory pool, and we destroyed that
      memory pool in the destructor - while the CDebugCounter object was still
      allocated in the memory pool. This led to a recursive deletion of the same
      object and an error.
      
      In some cases (e.g. gpstart on 5X, possibly others) this error caused the
      gpstart operation to fail.
      
      The fix moves the destruction of the memory pool out, to a point where the
      memory pool is empty.
  15. 29 Oct 2019, 1 commit
    • Bump down the value of Epsilon · db1e9f54
      Sambitesh Dash committed
      0.001 was too high a value for Epsilon. On tables with a billion or more
      rows and a highly selective predicate, we ended up severely
      underestimating the cardinality of the predicate.
      
      While calculating the MCV frequency, the relcache translator uses the
      comparison `CStatistics::Epsilon < mcv_freq`. If this comparison fails, the
      translator will translate the MCV frequency as 0. Before this fix, it was
      therefore expecting the MCV frequency to be > 0.001, which is too high a number
      for larger tables. For example, in a table with 10 billion or more rows, an MCV
      frequency of 0.0001 will be passed as 0. But such a selective predicate actually
      selects 1,000,000 rows, while ORCA will report it as 1 row. Such cardinality
      misestimation can lead to picking an NLJ over a hash join, or to not picking an
      Index Scan.
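
      A small standalone illustration of the threshold effect (values taken from the
      example above, not the actual relcache translator code):

      ```
      // With the old Epsilon of 0.001, an MCV frequency of 0.0001 fails the
      // "Epsilon < mcv_freq" check and is translated as 0, even though on a
      // 10-billion-row table it corresponds to about a million rows.
      #include <cstdio>

      int main()
      {
          const double oldEpsilon = 0.001;
          const double rows = 1e10;
          const double mcvFreq = 0.0001;

          double translatedFreq = (oldEpsilon < mcvFreq) ? mcvFreq : 0.0;
          std::printf("translated freq: %g, estimated rows: %.0f, actual rows: %.0f\n",
                      translatedFreq, rows * translatedFreq, rows * mcvFreq);
          return 0;
      }
      ```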
      
      Co-authored-by: Sambitesh Dash sdash@pivotal.io
      Co-authored-by: Chris Hajas chajas@pivotal.io
  16. 26 Oct 2019, 1 commit
  17. 22 Oct 2019, 3 commits
    • Disable event counter code for Linux (#545) · 178bb182
      Hans Zeller committed
      Temporarily disabling this for Linux, as it is probably causing
      the CentOS 6 and CentOS 7 builds to fail in the pipeline.
      
      We might want to fix those in a different way in the future, for example
      by making sure we use the same C++ library for both cmake and ctest.
      Today, the concourse/build_and_test.py script potentially allows different
      libraries to be used.
    • Better handling of null-rejecting predicates for ANDs and ORs (#542) · 6a004878
      Hans Zeller committed
      The methods that recognize null-rejecting predicates didn't handle
      the case where an operand could be false or null, but not true.
      
      We introduced a new Enum EberNotTrue to handle this case.
      
      We also renamed the EberUnknown enum to EberAny, to avoid confusion
      with the concept of "unknown" in the SQL standard, which means
      something very similar to "null".
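
      A simplified sketch of the three-valued reasoning involved (the enum values echo
      the names above; the combination rules are deliberately reduced and are not
      ORCA's implementation):

      ```
      // A predicate is null-rejecting if, with its inputs set to NULL, it cannot
      // evaluate to true. EberNotTrue captures "false or null, but not true".
      #include <cstdio>

      enum Eber { EberTrue, EberFalse, EberNull, EberNotTrue, EberAny };

      // Could this operand still evaluate to true?
      bool MaybeTrue(Eber e) { return EberTrue == e || EberAny == e; }

      Eber CombineAnd(Eber left, Eber right)
      {
          // An AND cannot be true if either operand cannot be true.
          if (!MaybeTrue(left) || !MaybeTrue(right))
              return EberNotTrue;
          return EberAny;
      }

      Eber CombineOr(Eber left, Eber right)
      {
          // An OR is "not true" only if neither operand can be true.
          if (!MaybeTrue(left) && !MaybeTrue(right))
              return EberNotTrue;
          return EberAny;
      }

      int main()
      {
          // "col = 1 AND <anything>": with col NULL, "col = 1" evaluates to null
          // (not true), so the AND is not true -> the conjunction rejects nulls.
          std::printf("AND null-rejecting: %d\n", EberNotTrue == CombineAnd(EberNull, EberAny));
          // "col = 1 OR <anything>": the other operand might still be true.
          std::printf("OR  null-rejecting: %d\n", EberNotTrue == CombineOr(EberNull, EberAny));
          return 0;
      }
      ```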
      Co-authored-by: Abhijit Subramanya <asubramanya@pivotal.io>
      Co-authored-by: Hans Zeller <hzeller@pivotal.io>
    • Debugging code to count events, quantities, CPU time (#540) · 3f8fd383
      Hans Zeller committed
      This code makes it easy to count and measure various things in ORCA,
      assuming you are using a debug build and are willing to modify the code.
      
      The commit adds a new class CDebugCounter, which is enabled only in debug
      builds (although that can be changed temporarily to measure retail builds,
      by defining GPOS_DEBUG_COUNTERS).
      
      You instrument the code by using the following macros to count events,
      add up quantities (integers and floating point values) and to count CPU
      time:
      
        GPOS_DEBUG_COUNTER_BUMP(counter_name)
        GPOS_DEBUG_COUNTER_ADD(counter_name, delta)
        GPOS_DEBUG_COUNTER_ADD_DOUBLE(counter_name, delta)
        GPOS_DEBUG_COUNTER_START_CPU_TIME(counter_name)
        GPOS_DEBUG_COUNTER_STOP_CPU_TIME(counter_name)
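
      A hypothetical usage sketch (the counter names and the surrounding function are
      invented, and passing counter names as string literals is an assumption about
      the macro arguments):

      ```
      // Hypothetical instrumentation of a hot code path with the macros listed above.
      // Assumes a debug build and that the CDebugCounter header from this commit is
      // included.
      void ProcessGroups(int numGroups)
      {
          GPOS_DEBUG_COUNTER_START_CPU_TIME("group_processing_cpu");
          for (int i = 0; i < numGroups; i++)
          {
              GPOS_DEBUG_COUNTER_BUMP("groups_processed");             // count events
              GPOS_DEBUG_COUNTER_ADD("exprs_considered", 42);          // sum up integers
              GPOS_DEBUG_COUNTER_ADD_DOUBLE("cost_accumulated", 1.5);  // sum up doubles
          }
          GPOS_DEBUG_COUNTER_STOP_CPU_TIME("group_processing_cpu");
      }
      ```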
      
      This commit adds code to ORCA to collect this information and write it
      to the event log at the end of each statement compilation (again, this
      is only enabled for debug builds unless changed).
      
      You can have different counters, each one with a name chosen by you.
      However, a given counter name can be used only with one of the measurement
      types (count, sum of integers, sum of doubles, CPU time).
      
      When you run multiple queries, ORCA will report the counter values separately
      for each query. Any totals would have to be done by you with an SQL query
      or a spreadsheet.
      
      For convenience, you can name queries in a file, like so:
      
         select 'query name: q1';
         explain select * from foo;
      
      In other words, a simple constant get with a string that starts with
      "query name: " will be recognized by ORCA to indicate the name of the
      next query. The counters will not be reported for this first select,
      only for the second query. Query names are useful when you have a test
      with very many queries in it, e.g. the TPC-DS queries.
      
      Finally, the commit contains a Python program to retrieve the recorded
      counters from the log. The program can either print a series of comma-
      separated values (CSV) to stdout, or it can insert the result into an
      SQL table named debug_counters in a database of your choice.
      
      Run this for more info:
      
      .../gporca/scripts/get_debug_event_counters.py --help
  18. 19 Oct 2019, 1 commit
    • Simplify CMemoryPool to reduce unnecessary headers and logic (#539) · d828eed2
      Chris Hajas committed
      Move headers and logic that are only needed for CMemoryPoolTracker out
      of CMemoryPool and into CMemoryPoolTracker, and collapse the headers
      into one there. This reduces memory utilization for external memory
      pools (e.g.: CMemoryPoolPalloc), as they don't need the additional
      headers.
      
      Specifically, the AllocHeader and SAllocHeader have been merged
      into CMemoryPoolTracker, and the recording methods have been moved into
      Tracker. UserSizeOfAlloc, DeleteImpl, and NewImpl are overridden in
      CMemoryPoolPalloc and CMemoryPoolTracker.
      
      Also, remove unused code/tests.
      Co-authored-by: Shreedhar Hardikar <shardikar@pivotal.io>
      Co-authored-by: Chris Hajas <chajas@pivotal.io>
  19. 12 Oct 2019, 2 commits
  20. 10 Oct 2019, 2 commits
    • Improve index selection for bitmap indexes (#537) · 976cba9b
      Ashuka Xue committed
      This commit contains the following components
      - Prevent btree indexes on AO tables from following the btree path
      - Allow col op col predicates as a supported index predicate
      - If the predicate is not supported, break it down and combine the
        supported conjuncts instead of keeping them separate
      - Use predicate selectivity to determine the best index
      
      Btree indexes on AO tables are not yet supported. Now that such indexes
      are sent over to ORCA as type btree instead of type bitmap, we need to
      prevent this path from being followed. Instead, btree indexes on AO
      tables can be handled by creating a Bitmap index with an underlying
      btree index.
      
      In the bitmap index path, alternatives that should not have been
      generated (such as picking an index when the first column of the index
      was not in the predicate) were being added to the memo because col op
      col predicates were being handled separately from col op const
      predicates.
      
      In reality, these two scenarios should be following the same code path
      when generating viable indexes.
      
      Previously, if the predicate was not all conjuncts, it would be
      immediately broken down into separate predicates. For example, say you
      had the predicate a = 4 AND b = 3 AND c = 3 OR (d = 2 AND e = 1); this
      would get broken down and ORCA would find indexes for a = 4, b = 3, and
      c = 3 as individual components. Instead, we want to combine these 3 conjuncts
      and find the best matching index.
      
      Additionally, use more than just the number of residuals as an indicator
      for a good index. Use the selectivity of the index predicate as an
      indicator. If we do not have the NDV of the column, the selectivity
      calculation falls back to sel(col = outer ref) = default selectivity of 0.4.
      
      Pick the best index based on 1) selectivity, 2) number of residual
      predicates, and 3) number of columns in the index.
      
      For example, suppose the predicate is a = 4 AND c = 3 and the two candidate
      indexes are idx_abc and idx_ac. If the selectivity is the same and the number
      of residuals is 0 for both, we want to select the index with the fewest
      columns, idx_ac.
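
      A minimal sketch of that tie-breaking order (hypothetical struct and comparator;
      the real logic lives in ORCA's index selection code):

      ```
      // Rank candidate indexes by lowest selectivity, then fewest residual
      // predicates, then fewest index columns. Illustrative only.
      #include <algorithm>
      #include <cstdio>
      #include <string>
      #include <tuple>
      #include <vector>

      struct IndexCandidate
      {
          std::string name;
          double selectivity;   // selectivity of the predicate fragment the index covers
          int numResiduals;     // predicates the index cannot apply
          int numIndexColumns;  // width of the index
      };

      const IndexCandidate &PickBestIndex(const std::vector<IndexCandidate> &candidates)
      {
          return *std::min_element(
              candidates.begin(), candidates.end(),
              [](const IndexCandidate &a, const IndexCandidate &b) {
                  return std::tie(a.selectivity, a.numResiduals, a.numIndexColumns) <
                         std::tie(b.selectivity, b.numResiduals, b.numIndexColumns);
              });
      }

      int main()
      {
          // Predicate a = 4 AND c = 3: same selectivity and no residuals for either
          // index, so the index with fewer columns (idx_ac) wins.
          std::vector<IndexCandidate> candidates = {{"idx_abc", 0.1, 0, 3},
                                                    {"idx_ac", 0.1, 0, 2}};
          std::printf("%s\n", PickBestIndex(candidates).name.c_str());
          return 0;
      }
      ```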
      Co-Authored-By: Ashuka Xue <axue@pivotal.io>
      Co-Authored-By: Hans Zeller <hzeller@pivotal.io>
    • Fallback when citext op non-citext join predicate is present · 3e45f064
      Shreedhar Hardikar committed
      When a query contains a join where one side is distributed on a citext
      column, and the other isn't, ORCA may generate a plan that redistributes
      the citext column. However, the redistribution is done using the hash
      function in citext type's default opclass which may be different than
      the function used to distribute the other side of the join. Thus, after
      redistribution the data from the citext-distributed table may end up on
      the wrong segment, ultimately returning incorrect results.
      
      To fix this, ORCA must fully support opclasses in the distribution spec,
      so that it can pick the correct hash function to redistribute. This
      commit contains a temporary fix that falls back to planner whenever it
      encounters a `citext op non-citext` or `non-citext op citext` join
      predicate, ignoring binary coercible casts.
  21. 02 Oct 2019, 1 commit
  22. 01 Oct 2019, 1 commit
  23. 20 Sep 2019, 1 commit
  24. 19 Sep 2019, 1 commit
    • Wrong preprocessed query for nullable constraints (#527) · 95445145
      Denis Smirnov committed
      The problem was first detected in Greenplum:
      
      ```
      create table ta(a int check(a = 1));
      insert into ta values(null);
      
      set optimizer=off;
      select * from ta where a is null;
       a
      ---
      
      (1 row)
      
      set optimizer=on;
      select * from ta where a is null;
       a
      ---
      (0 rows)
      
      set optimizer_print_query = on;
      set client_min_messages='log';
      select * from ta where a is null;
      Algebrized query:
      +--CLogicalSelect
         |--CLogicalGet "ta" ("ta"), Columns: ["a" (0), "ctid" (1), "xmin" (2), "cmin" (3), "xmax" (4), "cmax" (5), "tableoid" (6), "gp_segment_id" (7)] Key sets: {[1,7]}
         +--CScalarNullTest
            +--CScalarIdent "a" (0)
      
      Algebrized preprocessed query:
      +--CLogicalConstTableGet Columns: ["a" (0), "ctid" (1), "xmin" (2), "cmin" (3), "xmax" (4), "cmax" (5), "tableoid" (6), "gp_segment_id" (7)] Values: []
      ```
      
      So ORCA preprocesses the logical plan into a single
      CLogicalConstTableGet node, which means "there is no suitable data in this
      node, no need to retrieve data and construct a physical
      plan". But in fact this is wrong behavior for table ta - it can
      contain nulls, since the check a = 1 doesn't protect against nulls: select
      null = 1 returns null, not a true or false value.
      
      It was easy to guess that the problem was connected with constraint
      null checks and comparisons with constants. Debugging showed that the
      CLogicalConstTableGet node is formed in the CExpressionPreprocessor.cpp
      file, and that in the current example the max cardinality is zero.
      
      So here is a pipeline that caused the problem:
      CLogicalSelect.cpp -> CConstraintInterval.cpp
      
      As you can see, the problem is in the m_fIncludesNull parameter, which
      describes whether nulls are allowed in a constraint or not. At the
      moment it is always false. But in fact it should be false only if the
      corresponding column is NOT NULL - otherwise it should be true.
      
      Under the hood, m_fIncludesNull is set in the function
      PciIntervalFromColConstCmp, which transforms constraints comparing a
      column with a constant into an interval constraint. So the constraint a = 1
      always transforms to a in [1, 1], and at the moment m_fIncludesNull is
      always set to false for a new interval constraint.
      
      The solution is to pass a new parameter infer_nulls_as to several
      methods, to determine whether NULL values in a column qualify the
      row or reject it.
      
      This PR also contains new tests: checking a constant constraint on a
      nullable and on a non-nullable column.
      
      For example:
      
      ```
      create table ta(a int check(a = 1));
      insert into ta values(null);
      
      create table tb(b int not null check(b = 1));
      insert into tb values(1);
      ```
      
      A similar problem occurs with planner (set optimizer = off), see
      https://github.com/greenplum-db/gpdb/issues/8582.
      This commit fixes only the ORCA issue.
      Co-authored-by: Denis Smirnov <darthunix@gmail.com>
      Co-authored-by: Shreedhar Hardikar <shardikar@pivotal.io>
      Co-authored-by: Hans Zeller <hzeller@pivotal.io>
  25. 17 Sep 2019, 1 commit
  26. 13 Sep 2019, 2 commits
    • Fix stats calculation for predicates with UUID columns · 9b0e1391
      Abhijit Subramanya committed
      For predicates that involved columns of UUID type, ORCA would consider the
      predicate as unsupported and would estimate the selectivity to be 40%. This is
      because ORCA did not have statistics for UUID columns. This would cause a bad
      plan to be selected in some cases. This patch fixes the issue by adding
      support for the UUID type in the CMDIdGPDB class, making it a LINT value for
      comparison purposes.
    • Cost rebind for bitmap scans · 69c8e81c
      Ashuka Xue committed
      This commit includes the dInitScan cost for every rebind in a bitmap table
      scan.
      
      This is a partial revert of commit 1cae6057.

      It turns out that including the init scan cost for every rebind more accurately
      represents executor behavior. Not including the init scan cost per rebind led
      to picking a much slower Index Nested Loop Join over a Hash Join.

      In the example minidump included in the commit, the Index Nested Loop Join runs
      ~3 times slower than the Hash Join, yet it was getting picked. On larger
      tables, we sometimes ended up picking an Index Nested Loop Join which never
      finishes executing.
      Co-authored-by: Sambitesh Dash <sdash@pivotal.io>
  27. 30 Aug 2019, 1 commit
    • Consider predicate of type <cast(ident)> array cmp <const array> · 22a54eea
      Bhuvnesh Chaudhary committed
      While processing constraint intervals, also consider predicates of type
      <cast(ident)> array cmp <const array>; otherwise we lose the opportunity
      to generate implied quals.
      
      ```
      CREATE TABLE varchar_sc_array_cmp(a varchar);
      INSERT INTO varchar_sc_array_cmp VALUES ('a'), ('b'), ('c'), ('d');
      EXPLAIN SELECT * FROM varchar_sc_array_cmp t1, varchar_sc_array_cmp t2 where t1.a = t2.a and t1.a in ('b', 'c');
                                                      QUERY PLAN
      ----------------------------------------------------------------------------------------------------------
       Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..862.00 rows=4 width=4)
         ->  Hash Join  (cost=0.00..862.00 rows=2 width=4)
               Hash Cond: orca.varchar_sc_array_cmp.a::text = orca.varchar_sc_array_cmp.a::text
               ->  Table Scan on varchar_sc_array_cmp  (cost=0.00..431.00 rows=1 width=2)
                     Filter: (a::text = ANY ('{b,c}'::text[])) AND (a::text = 'b'::text OR a::text = 'c'::text)
               ->  Hash  (cost=431.00..431.00 rows=1 width=2)
                     ->  Table Scan on varchar_sc_array_cmp  (cost=0.00..431.00 rows=1 width=2)
                           Filter: a::text = 'b'::text OR a::text = 'c'::text
       Settings:  optimizer_cte_inlining_bound=1000; optimizer_metadata_caching=on
       Optimizer status: PQO version 3.63.1
      (10 rows)
      
      SELECT * FROM varchar_sc_array_cmp t1, varchar_sc_array_cmp t2 where t1.a = t2.a and t1.a in ('b', 'c');
       a | a
      ---+---
       c | c
       b | b
      (2 rows)
      
      SET optimizer_array_constraints=on;
      EXPLAIN SELECT * FROM varchar_sc_array_cmp t1, varchar_sc_array_cmp t2 where t1.a = t2.a and (t1.a in ('b', 'c') OR t1.a = 'a');
                                              QUERY PLAN
      ------------------------------------------------------------------------------------------
       Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..862.00 rows=4 width=4)
         ->  Hash Join  (cost=0.00..862.00 rows=2 width=4)
               Hash Cond: orca.varchar_sc_array_cmp.a::text = orca.varchar_sc_array_cmp.a::text
               ->  Table Scan on varchar_sc_array_cmp  (cost=0.00..431.00 rows=1 width=2)
                     Filter: a = ANY ('{a,b,c}'::character varying[])
               ->  Hash  (cost=431.00..431.00 rows=1 width=2)
                     ->  Table Scan on varchar_sc_array_cmp  (cost=0.00..431.00 rows=1 width=2)
                           Filter: a = ANY ('{a,b,c}'::character varying[])
       Settings:  optimizer_cte_inlining_bound=1000; optimizer_metadata_caching=on
       Optimizer status: PQO version 3.63.1
      (10 rows)
      
      SELECT * FROM varchar_sc_array_cmp t1, varchar_sc_array_cmp t2 where t1.a = t2.a and (t1.a in ('b', 'c') OR t1.a = 'a');
       a | a
      ---+---
       c | c
       b | b
       a | a
      (3 rows)
      ```
  28. 29 Aug 2019, 1 commit
    • Fix stats calculation for Dynamic scans with partition elimination · 36adba3b
      Abhijit Subramanya committed
      The `DeriveStatsForDynamicScan` function did not consider unsupported
      predicates while calculating the stats during dynamic partition elimination.
      For example, consider the following query:
      `explain select * from part inner join bar on b = d inner join jazz on c between CAST(TO_CHAR(jazz.f,'999') AS int4) AND CAST(TO_CHAR(jazz.f,'999') AS int4);`
      Here the unsupported predicate `between CAST(TO_CHAR(jazz.f,'999') AS int4) AND CAST(TO_CHAR(jazz.f,'999') AS int4);` would simply be ignored.
      Hence we ended up estimating the cardinality as that of the base table. But
      this can lead to overestimation and causes the plan to be costed higher. We now
      convert the unsupported predicates into a filter on top of the join to estimate
      the cardinality more accurately.
      Co-authored-by: Shreedhar Hardikar <shardikar@pivotal.io>
      Co-authored-by: Abhijit Subramanya <asubramanya@pivotal.io>
  29. 28 Aug 2019, 1 commit
  30. 22 Aug 2019, 2 commits
    • Reduce hash collisions when processing a large number of columns · 5639e71f
      Chris Hajas committed
      Our hashing function, which is dependent on the size of the hash table,
      was hashing many entries to the same bucket during constant expression
      evaluation.
      Instead, modify the hash table size to be an odd number to significantly
      reduce hash collisions. In the future, we should modify our hashing
      function to not be influenced by the hash table size, but that will be a
      larger effort.
      
      This reduces optimization time by 50%+ for some queries. In one case,
      this reduced optimization time from 25s to 9s on my laptop.
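
      A small standalone demonstration of the effect (not ORCA's hash table):

      ```
      // When hash values share a common factor with the table size, "hash % size"
      // concentrates entries in a few buckets; an odd, co-prime size spreads them out.
      #include <cstdio>
      #include <set>

      int DistinctBuckets(int numEntries, int stride, int tableSize)
      {
          std::set<int> buckets;
          for (int i = 0; i < numEntries; i++)
              buckets.insert((i * stride) % tableSize);  // stride mimics patterned hash values
          return (int) buckets.size();
      }

      int main()
      {
          // 1000 column references whose hash values happen to step by 8:
          std::printf("table size 128 -> %d distinct buckets\n", DistinctBuckets(1000, 8, 128));
          std::printf("table size 127 -> %d distinct buckets\n", DistinctBuckets(1000, 8, 127));
          return 0;
      }
      ```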
      Authored-by: Chris Hajas <chajas@pivotal.io>
    • Update the max in-flight jobs for the dev pipeline · 2fa4c3b5
      Ashuka Xue committed
      Currently the dev pipeline only allows 1 in-flight run for the ICG
      jobs. So, if there are multiple jobs in the queue, some of them can be lost.
      This should help prevent some of those situations by upping the max
      in-flight. Also, it will reduce the feedback loop time.
      Co-authored-by: Ashuka Xue <axue@pivotal.io>
      Co-authored-by: Chris Hajas <chajas@pivotal.io>
  31. 17 Aug 2019, 1 commit
    • Remove unused columns from subqueries (#521) · c78b4e04
      Ashuka Xue committed
      * Remove unused columns from subqueries
      
      This commit reduces the output columns for existential subqueries.
      For existential subqueries, we add a groupby operation on a key of the
      outer table, to simulate a semi-join. The grouping columns must be a key
      of the outer table, plus any columns that are required by the parent
      node.
      
      In an earlier commit, we pruned away unused columns in a query, but
      PcrsOutput() could still sometimes contain such unused columns. Our fix
      makes sure we only add used ColRefs to the grouping columns, to avoid
      problems when we generate and execute a query plan.
      Co-Authored-By: Hans Zeller <hzeller@pivotal.io>