Commits · b4e2e3e2280ca81d1a3a1b20d9659e229238ed18 · Greenplum / Gpdb

14 Mar, 2019 2 commits

Rename legacy planner to Postgres planner · b4e2e3e2

Authored by Daniel Gustafsson on Mar 14, 2019

As we merge with upstream and thereby keep refining the Postgres
planner, "legacy planner" is no longer a suitable name. This changes
all spelling variants (legacy planner, legacy optimizer,
legacy query optimizer, etc.) to say "Postgres" rather than "legacy".
Reviewed-by: Venkatesh Raghavan <vraghavan@pivotal.io>
Reviewed-by: David Yozie <dyozie@pivotal.io>
Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>

b4e2e3e2

Create fewer partitions, to speed up tests. · 80f8bede
Authored by Heikki Linnakangas on Mar 14, 2019

80f8bede

15 Aug, 2018 1 commit

Refactor allow_system_table_mods into a boolean GUC (#5407) · 4c24d744

Authored by David Kimura on Aug 15, 2018

The purpose of this refactor is to align the GUC more closely with Postgres. It
started as a suggestion in https://github.com/greenplum-db/gpdb/pull/4790.
There are still differences, particularly around when this GUC can be set. In
GPDB it can be set by anyone at any time (PGC_USERSET), whereas in Postgres
changing it requires a postmaster restart (PGC_POSTMASTER). This difference was
kept on purpose until we have more buy-in, as it is a bigger change for the end user.
Co-authored-by: Joao Pereira <jdealmeidapereira@pivotal.io>

4c24d744

09 Jul, 2018 1 commit

Use a penalty cost to implement enable_* planner GUCs, like in upstream. · 8fcd3fdd

Authored by Heikki Linnakangas on Feb 05, 2018

Instead of completely disabling the generation of Paths with disabled
plan types, add a high penalty to their cost estimates, like in the
upstream. This reduces our diff vs. upstream, making future merges more
straightforward.

Discussion: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/Az2cDcqf73g/_tY6Yv1kBgAJ
Co-authored-by: Daniel Gustafsson <dgustafsson@pivotal.io>
Reviewed-by: Venkatesh Raghavan <vraghavan@pivotal.io>
Reviewed-by: Richard Guo <riguo@pivotal.io>
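
The penalty approach described in this commit message can be illustrated with a minimal C sketch. It is not the GPDB or PostgreSQL source: `PathCost`, `cost_path_with_penalty` and `cheaper_path` are hypothetical names, and the penalty value mirrors the `disable_cost` constant that upstream PostgreSQL of that era defined as 1.0e10. The point is that a disabled plan type is still costed and kept as a candidate, so it is chosen only when nothing cheaper exists.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed penalty, modelled on upstream's disable_cost (1.0e10). */
#define DISABLE_COST 1.0e10

/* Hypothetical, simplified stand-in for the cost fields of a Path. */
typedef struct PathCost {
    double startup_cost;
    double total_cost;
} PathCost;

/*
 * Cost a candidate path. A disabled plan type is not rejected;
 * it only receives a large penalty on its startup cost.
 */
PathCost cost_path_with_penalty(double startup, double run, bool enabled)
{
    PathCost c;

    assert(startup >= 0.0 && run >= 0.0);
    c.startup_cost = startup;
    if (!enabled)
        c.startup_cost += DISABLE_COST;
    c.total_cost = c.startup_cost + run;
    return c;
}

/* Return 0 if path a is at least as cheap as path b, otherwise 1. */
int cheaper_path(PathCost a, PathCost b)
{
    return (a.total_cost <= b.total_cost) ? 0 : 1;
}
```

In this sketch a penalized path loses to any enabled alternative, but remains available when it is the only way to execute the query.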

8fcd3fdd

17 May, 2018 1 commit

Create dummy stats for type mismatch · 88b2ab36

Authored by Omer Arap on May 08, 2018

If the column statistics in `pg_statistic` have values whose type
differs from the column type, the metadata accessor should not translate
the stats and should create dummy stats instead.

This commit also reorders stats collection from the `pg_statistic` to
align with how analyze generates stats. MCV and Histogram translation is
moved to the end after NDV, nullfraction and column width extraction.
Signed-off-by: Melanie Plageman <mplageman@pivotal.io>

88b2ab36

01 May, 2018 1 commit

Fix NDVRemain and FreqRemain calculation · 4a5c58a5

Authored by Bhuvnesh Chaudhary on Apr 27, 2018

For text, varchar, char and bpchar, ORCA does not collect the
MCV and Histogram information, so the calculation of NDVRemain and
FreqRemain must be updated to account for it.

For such columns, NDVRemain is the stadistinct as available in the
pg_statistic, and FreqRemain is everything except the NULL frequency.

Earlier, NDVRemain and FreqRemain for such columns would yield 0,
resulting in poor cardinality estimation and suboptimal plans.
Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
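
The rule stated in this commit message can be written as a small C sketch. It is an illustration under stated assumptions, not ORCA's code: `RemainStats` and `remain_without_mcv_hist` are hypothetical names, and the handling of a negative `stadistinct` (a negated fraction of the row count, as documented for the `pg_statistic` catalog) is an assumption that the commit message does not spell out.

```c
#include <assert.h>

/* Hypothetical container for the two "remaining" statistics. */
typedef struct RemainStats {
    double ndv_remain;   /* distinct values not covered by MCV/histogram */
    double freq_remain;  /* fraction of rows not covered by MCV/histogram */
} RemainStats;

/*
 * For a column with no MCV or histogram information, nothing has been
 * accounted for yet, so every non-NULL row and every distinct value
 * counts as "remaining".
 */
RemainStats remain_without_mcv_hist(double stadistinct, double null_frac,
                                    double reltuples)
{
    RemainStats r;

    assert(null_frac >= 0.0 && null_frac <= 1.0);
    assert(reltuples >= 0.0);

    /* Assumption: a negative stadistinct is a negated fraction of the rows. */
    r.ndv_remain = (stadistinct >= 0.0) ? stadistinct
                                        : -stadistinct * reltuples;
    /* Everything except the NULL frequency. */
    r.freq_remain = 1.0 - null_frac;
    return r;
}
```

For example, with `stadistinct = 50` and a NULL fraction of 0.25, the sketch yields an NDVRemain of 50 and a FreqRemain of 0.75 instead of the zeros described above.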

4a5c58a5

19 Jan, 2018 1 commit

Fix get_attstatsslot()/free_attstatsslot() when statistics are broken. · ae06d7b0

Authored by Dhanashree Kashid on Jan 12, 2018

When pg_statistic contains a wrong statistics entry for an attribute, or when
the statistics on an attribute are broken (for example, the type of the elements
stored in stavalues<1/2/3> differs from the actual attribute type, or there are
holes in the attribute numbers due to adding/dropping columns), the following
two APIs fail because they rely on the attribute type sent by the caller:

- get_attstatsslot(): Extracts the contents (numbers/frequency array and
values array) of the requested statistic slot (MCV, HISTOGRAM, etc.). If the
attribute is pass-by-reference or of a toastable type (varlena types), it
returns a copy allocated with palloc().
- free_attstatsslot(): Frees any data palloc'd by get_attstatsslot().

This problem was fixed in upstream 8.3
(8c21b4e9) for get_attstatsslot(),
which now uses the actual element type of the array to deconstruct it
rather than the OID passed by the caller.
free_attstatsslot() still depends on the type OID sent by the caller.

However, the issue still exists for free_attstatsslot(), which crashes while
freeing the array. The crash happened because the type OID sent by the caller
was TEXT, a varlena type, so free_attstatsslot() attempted to free each datum;
however, due to the broken slot, the datums extracted from the values array were
of a fixed-length type such as int. The int value was treated as a memory
address, and freeing it crashed.

This commit brings in the following fix from upstream 10, which redesigns
get_attstatsslot()/free_attstatsslot() so that they are robust in scenarios like
these.

commit 9aab83fc
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Sat May 13 15:14:39 2017 -0400

    Redesign get_attstatsslot()/free_attstatsslot() for more safety and speed.

    The mess cleaned up in commit da075960 is clear evidence that it's a
    bug hazard to expect the caller of get_attstatsslot()/free_attstatsslot()
    to provide the correct type OID for the array elements in the slot.
    Moreover, we weren't even getting any performance benefit from that,
    since get_attstatsslot() was extracting the real type OID from the array
    anyway.  So we ought to get rid of that requirement; indeed, it would
    make more sense for get_attstatsslot() to pass back the type OID it found,
    in case the caller isn't sure what to expect, which is likely in binary-
    compatible-operator cases.

    Another problem with the current implementation is that if the stats array
    element type is pass-by-reference, we incur a palloc/memcpy/pfree cycle
    for each element.  That seemed acceptable when the code was written because
    we were targeting O(10) array sizes --- but these days, stats arrays are
    almost always bigger than that, sometimes much bigger.  We can save a
    significant number of cycles by doing one palloc/memcpy/pfree of the whole
    array.  Indeed, in the now-probably-common case where the array is toasted,
    that happens anyway so this method is basically free.  (Note: although the
    catcache code will inline any out-of-line toasted values, it doesn't
    decompress them.  At the other end of the size range, it doesn't expand
    short-header datums either.  In either case, DatumGetArrayTypeP would have
    to make a copy.  We do end up using an extra array copy step if the element
    type is pass-by-value and the array length is neither small enough for a
    short header nor large enough to have suffered compression.  But that
    seems like a very acceptable price for winning in pass-by-ref cases.)

    Hence, redesign to take these insights into account.  While at it,
    convert to an API in which we fill a struct rather than passing a bunch
    of pointers to individual output arguments.  That will make it less
    painful if we ever want further expansion of what get_attstatsslot can
    pass back.

    It's certainly arguable that this is new development and not something to
    push post-feature-freeze.  However, I view it as primarily bug-proofing
    and therefore something that's better to have sooner not later.  Since
    we aren't quite at beta phase yet, let's put it in.

    Discussion: https://postgr.es/m/16364.1494520862@sss.pgh.pa.us

Most of the changes are the same as in the upstream commit, with the following
additional changes:
- Relcache translator changes in ORCA.
- Added a test that simulates the crash due to broken stats.
- get_attstatsslot() contains an extra check for an empty slot array, which
existed in master but is not in upstream.
Signed-off-by: Abhijit Subramanya <asubramanya@pivotal.io>
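
The redesign described in this commit message can be modelled with a simplified, self-contained C sketch. It is not the PostgreSQL or GPDB implementation: `StoredArray`, `StatsSlot`, `get_slot` and `free_slot` are hypothetical stand-ins, `malloc`/`free` replace `palloc`/`pfree`, and `Datum` is reduced to an integer-sized value. It shows the two properties the message relies on: the slot reports the element type actually found in the stored array, and freeing releases one allocation without ever treating individual values as pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uintptr_t Datum;

/* Hypothetical stand-in for a stats array that records its own element type. */
typedef struct StoredArray {
    unsigned     elem_type;  /* type OID stored with the array, e.g. 23 for int4 */
    int          nelems;
    const Datum *elems;
} StoredArray;

/* The result is returned in one struct instead of many output pointers. */
typedef struct StatsSlot {
    unsigned valuetype;      /* element type actually found in the array */
    Datum   *values;         /* a single allocation holding all values */
    int      nvalues;
} StatsSlot;

/* Fill the slot from the stored array; no type OID is taken from the caller. */
bool get_slot(StatsSlot *slot, const StoredArray *arr)
{
    assert(slot != NULL);
    memset(slot, 0, sizeof(*slot));

    /* Treat a missing or empty array as "no statistics available". */
    if (arr == NULL || arr->nelems <= 0 || arr->elems == NULL)
        return false;

    slot->values = malloc((size_t) arr->nelems * sizeof(Datum));
    if (slot->values == NULL)
        return false;

    memcpy(slot->values, arr->elems, (size_t) arr->nelems * sizeof(Datum));
    slot->valuetype = arr->elem_type;
    slot->nvalues = arr->nelems;
    return true;
}

/* Free the whole copy at once; individual values are never dereferenced. */
void free_slot(StatsSlot *slot)
{
    assert(slot != NULL);
    free(slot->values);      /* free(NULL) is a no-op */
    slot->values = NULL;
    slot->nvalues = 0;
}
```

A caller that expected TEXT but receives a slot whose `valuetype` is an integer type can detect the mismatch and skip the statistics, rather than crashing while freeing them.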

ae06d7b0

18 Jan, 2018 1 commit

Fix whitespace in tests, mostly in expected output. · 06a2bb64

Authored by Heikki Linnakangas on Jan 18, 2018

Commit ce3153fa, about to be merged from PostgreSQL 9.0 soon, removes
the -w option from pg_regress's "diff" invocation. That commit will fix
all the PostgreSQL regression tests to pass without it, but we need to
also fix all the GPDB tests. That's what this commit does.

06a2bb64

05 Dec, 2017 1 commit

Clean up bfv_statistic test · 787d72ae

Authored by Venkatesh Raghavan on Dec 04, 2017

While porting the test from tinc, we added a schema for each test.
During refactoring we forgot to add the schema name and correct table
name in the test query.

787d72ae

17 Aug, 2017 1 commit

Increase the default value of default_statistics_target from 10 to 100, · ef804bdf

Authored by Tom Lane on Dec 13, 2008

and its maximum value from 1000 to 10000.  ALTER TABLE SET STATISTICS
similarly now allows a value up to 10000.  Per discussion.

Cherry-pick from 65e3ea76

ef804bdf

19 May, 2017 1 commit

Make ICG tests pass when GPDB is compiled with disable-orca · 7e774f28

Authored by Venkatesh Raghavan on May 18, 2017

In the updated tests, we used functions like disable_xform and
enable_xform to hint the optimizer to disallow/allow a particular
physical node. However, these functions are only available when GPDB
is built with GPORCA. The planner, on the other hand, accomplishes this
via a GUC.

To avoid using these functions in tests, I have introduced a couple
of GUCs that mimic the same planner behavior, but for GPORCA.
This required adding an API inside GPORCA.

7e774f28

11 Feb, 2017 2 commits

Remove drop schema - since the database is gonna be chucked at the end of ICW · 8c0b6c01

Authored by Venkatesh Raghavan on Feb 10, 2017

8c0b6c01

Guard against duplicate table names between parallel tests. · 45b2dbf6

Authored by Venkatesh Raghavan on Feb 10, 2017

In addition, delete unnecessary drop statements.

45b2dbf6

02 Aug, 2016 1 commit

Avoid database-wide VACUUM FULL in test case. · fc73eac7

Authored by Heikki Linnakangas on Aug 02, 2016

Vacuuming the single table ought to be enough.

fc73eac7

17 Feb, 2016 1 commit

Porting Statistics and Cardinality Estimation Tests to ICG · b3c6581a

Authored by Venkatesh Raghavan on Feb 16, 2016

b3c6581a