1. 29 Sep 2018, 2 commits
    • MPP-ify AlterTableSpaceOptions. (#5878) · 5b508229
      Committed by Paul Guo
      Upstream currently supports two options: seq_page_cost and random_page_cost. In theory we may not need to MPP-ify this utility statement, but it is safer to do so in case upstream or Greenplum adds another option that affects the QEs.
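
      For reference, a minimal sketch of the statement being MPP-ified, using the two upstream-supported options (the tablespace name is hypothetical):

      ```sql
      -- With this change, the option settings are dispatched to the QEs as well.
      ALTER TABLESPACE fastspace SET (random_page_cost = 2.0, seq_page_cost = 0.5);
      ALTER TABLESPACE fastspace RESET (random_page_cost);
      ```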
    • Use WITH syntax for options in create tablespace. (#5877) · e92a82d0
      Committed by Paul Guo
      PostgreSQL 9.4 started allowing the WITH syntax for options
      in CREATE TABLESPACE. Greenplum previously used the OPTIONS
      syntax to support per-segment locations. Let's unify them to
      use the WITH syntax only, following upstream.
      
      Note that the Greenplum-specific OPTIONS syntax exists on gpdb master only.
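
      A sketch of the unified syntax (the paths and the content0/content1 option names are assumptions used for illustration):

      ```sql
      -- Per-segment locations now use the upstream WITH syntax instead of OPTIONS:
      CREATE TABLESPACE fastspace LOCATION '/data/master_ts'
          WITH (content0 = '/data/seg0_ts', content1 = '/data/seg1_ts');
      ```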
  2. 28 Sep 2018, 1 commit
    • Allow tables to be distributed on a subset of segments · 4eb65a53
      Committed by ZhangJackey
      There was an assumption in gpdb that a table's data is always
      distributed on all segments. However, this is not always true: for
      example, when a cluster is expanded from M segments to N (N > M),
      all the tables are still on M segments. To work around the problem
      we used to have to alter all the hash-distributed tables to randomly
      distributed to get correct query results, at the cost of bad performance.
      
      Now we support table data to be distributed on a subset of segments.
      
      A new column `numsegments` is added to the catalog table
      `gp_distribution_policy` to record how many segments a table's data
      is distributed on. By doing so we can allow DML on the M-segment
      tables; joins between M-segment and N-segment tables are also supported.
      
      ```sql
      -- t1 and t2 are both distributed on (c1, c2),
      -- one on 1 segment, the other on 2 segments
      select localoid::regclass, attrnums, policytype, numsegments
          from gp_distribution_policy;
       localoid | attrnums | policytype | numsegments
      ----------+----------+------------+-------------
       t1       | {1,2}    | p          |           1
       t2       | {1,2}    | p          |           2
      (2 rows)
      
      -- t1 and t1 have exactly the same distribution policy,
      -- join locally
      explain select * from t1 a join t1 b using (c1, c2);
                         QUERY PLAN
      ------------------------------------------------
       Gather Motion 1:1  (slice1; segments: 1)
         ->  Hash Join
               Hash Cond: a.c1 = b.c1 AND a.c2 = b.c2
               ->  Seq Scan on t1 a
               ->  Hash
                     ->  Seq Scan on t1 b
       Optimizer: legacy query optimizer
      
      -- t1 and t2 are both distributed on (c1, c2),
      -- but as they have different numsegments,
      -- one has to be redistributed
      explain select * from t1 a join t2 b using (c1, c2);
                                QUERY PLAN
      ------------------------------------------------------------------
       Gather Motion 1:1  (slice2; segments: 1)
         ->  Hash Join
               Hash Cond: a.c1 = b.c1 AND a.c2 = b.c2
               ->  Seq Scan on t1 a
               ->  Hash
                     ->  Redistribute Motion 2:1  (slice1; segments: 2)
                           Hash Key: b.c1, b.c2
                           ->  Seq Scan on t2 b
       Optimizer: legacy query optimizer
      ```
  3. 27 Sep 2018, 1 commit
    • Dispatcher can create flexible size gang (#5701) · a3ddac06
      Committed by Tang Pengzhou
      * Change the type of db_descriptors to SegmentDatabaseDescriptor **

      A new gang definition may consist of cached segdbDescs and newly
      created segdbDescs, so there is no need to palloc every segdbDesc
      struct from scratch.
      
      * Remove an unnecessary gang-allocation unit test
      
      * Manage idle segment dbs using CdbComponentDatabases instead of available* lists.

      To support variable-size gangs, we now need to manage segment dbs at
      a lower granularity. Previously, idle QEs were managed by a bunch of
      lists like availablePrimaryWriterGang and availableReaderGangsN, which
      restricted the dispatcher to creating only N-size (N = number of
      segments) or 1-size gangs.
      
      CdbComponentDatabases is a snapshot of the segment components within
      the current cluster; it now maintains a freelist for each segment
      component. When creating a gang, the dispatcher assembles it from the
      segment components (taking a segment db from a freelist or creating a
      new one). When cleaning up a gang, the dispatcher returns the idle
      segment dbs to their segment components.
      
      CdbComponentDatabases provides a few functions to manipulate segment
      dbs (SegmentDatabaseDescriptor *):
      * cdbcomponent_getCdbComponents
      * cdbcomponent_destroyCdbComponents
      * cdbcomponent_allocateIdleSegdb
      * cdbcomponent_recycleIdleSegdb
      * cdbcomponent_cleanupIdleSegdbs
      
      CdbComponentDatabases is also FTS-version sensitive: whenever the FTS
      version changes, it destroys all idle segment dbs and allocates QEs
      on the newly promoted segments. This provides transparent mirror
      failover to users.
      
      Since segment dbs (SegmentDatabaseDescriptor *) are now managed by
      CdbComponentDatabases, we can simplify the memory context
      management by replacing GangContext & perGangContext with
      DispatcherContext & CdbComponentsContext.
      
      * Postpone error handling when creating gangs

      Now that we have AtAbort_DispatcherState, one advantage is that
      we can postpone gang error handling to this function and make the
      code cleaner.
      
      * Handle FTS version change correctly
      
      In some cases, when the FTS version changes, we can't update the
      current snapshot of segment components; more specifically, we can't
      destroy the current writer segment dbs and create new ones.

      These cases include:
      * the session has created temp tables.
      * the query needs two-phase commit and the gxid has already been
        dispatched to segments.
      
      * Replace <gangId, sliceId> map with <qeIdentifier, sliceId> map
      
      We used to dispatch a <gangId, sliceId> map along with the query so
      segment dbs could know which slice they should execute.

      Now a gangId is useless to a segment db, because a segment db can
      be reused by different gangs, so we need a new way to convey this
      information. To resolve this, CdbComponentDatabases assigns a
      unique identifier to each segment db and builds, for each slice, a
      bitmap set consisting of segment identifiers; a segment db can then
      walk the slice table and find the right slice to execute.
      
      * Allow the dispatcher to create variable-size gangs and refine AssignGangs()

      Previously, the dispatcher could only create N-size gangs for
      GANGTYPE_PRIMARY_WRITER or GANGTYPE_PRIMARY_READER. This restricted
      the dispatcher in many ways. One example is direct dispatch: it
      always created an N-size gang even when it dispatched the command
      to only one segment. Another example is that some operations may be
      able to use larger-than-N gangs; for a hash join, if both the inner
      and outer plans are redistributed, the hash join node could execute
      on a larger-than-N gang. This commit changes the API of createGang()
      so the caller can specify a list of segments (partial or even
      duplicate segments), and CdbComponentDatabases guarantees that each
      segment has only one writer in a session. This also resolves another
      pain point of AssignGangs(): the caller no longer needs to promote a
      GANGTYPE_PRIMARY_READER to GANGTYPE_PRIMARY_WRITER, or a
      GANGTYPE_SINGLETON_READER to GANGTYPE_PRIMARY_WRITER for replicated
      tables (see FinalizeSliceTree()).

      With this commit, AssignGangs() is much clearer.
  4. 22 Sep 2018, 1 commit
    • Change pretty-printing of expressions in EXPLAIN to match upstream. · 4c54c894
      Committed by Heikki Linnakangas
      We had changed this in GPDB, to print fewer parens. That's fine and
      dandy, but it hardly seems worth it to carry a diff vs upstream for
      this. Which format is better is a matter of taste. The extra parens
      make some expressions clearer, but OTOH, they are unnecessarily
      verbose for simple expressions. Let's follow the upstream on this.
      
      These changes were made to GPDB back in 2006, as part of backporting
      EXPLAIN-related patches from PostgreSQL 8.2. But I didn't see any
      explanation for this particular output change in that commit message.
      
      It's nice to match upstream, to make merging easier. However, this won't
      make much difference to that: almost all EXPLAIN plans in regression
      tests are different from upstream anyway, because GPDB needs Motion nodes
      for most queries. But every little helps.
  5. 21 Sep 2018, 2 commits
    • Remove check for NOT NULLable column from ORCA translation of INSERT values. · e89be84b
      Committed by Heikki Linnakangas
      When creating an ORCA plan for an "INSERT ... (<col list>) VALUES (<values>)"
      statement, the ORCA translator performed NULL checks for any columns not
      listed in the column list. Nothing wrong with that per se, but we needed
      to keep the error messages in sync, or we'd get regression test failures
      caused by different messages. To simplify that, remove the check from the
      ORCA translator, and rely on the execution-time check.
      
      We bumped into this while working on the 9.3 merge, because 9.3 added
      a DETAIL to the error message in the executor:
      
      postgres=# create table notnulls (a text NOT NULL, b text NOT NULL);
      CREATE TABLE
      postgres=# insert into notnulls (a) values ('x');
      ERROR:  null value in column "b" violates not-null constraint
      postgres=# insert into notnulls (a,b) values ('x', NULL);
      ERROR:  null value in column "b" violates not-null constraint  (seg2 127.0.0.1:40002 pid=26547)
      DETAIL:  Failing row contains (x, null).
      
      Doing this now will avoid that inconsistency in the merge.
      
      One little difference with this is that EXPLAIN on an insert like above
      now works, and you only get the error when you try to execute it. Before,
      with ORCA, even EXPLAIN would throw the error.
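
      A small illustration of the behavior change, reusing the table above (a sketch; exact output may differ):

      ```sql
      -- With ORCA, EXPLAIN no longer raises the not-null violation at plan time:
      EXPLAIN INSERT INTO notnulls (a) VALUES ('x');   -- now returns a plan
      INSERT INTO notnulls (a) VALUES ('x');           -- still fails at execution time
      ```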
    • Fix pg_stat_activity showing wrong session id after session reset (#5757) · ac54faad
      Committed by Teng Zhang
      * Fix pg_stat_activity showing wrong session id after session reset
      
      Currently, if a session is reset because of some error such as OOM,
      gp_session_id is bumped to a new value after CheckForResetSession is
      called, but sess_id in pg_stat_activity remains unchanged and shows
      the wrong number. This commit updates sess_id in pg_stat_activity
      once a session is reset.
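
      A quick way to observe the fix (a sketch; the pid, sess_id, and query column names are assumed from this era's catalog):

      ```sql
      -- After a session reset, sess_id should match the session's new gp_session_id.
      SELECT pid, sess_id, query FROM pg_stat_activity;
      ```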
      
      * Refactor test using gp_execute_on_server to trigger session reset
  6. 19 Sep 2018, 1 commit
    • Simplify 'partindex_test', by using OUT parameters and pg_get_expr(). · fcc7898b
      Committed by Heikki Linnakangas
      OUT parameters make calling the function much less awkward, as you don't
      need to specify the output columns in every invocation. Also, change the
      datatype of a few columns to pg_node_tree, so that you can use pg_get_expr()
      to pretty-print them.
      
      I'm doing this to hide the trivial differences in the internal string
      representation of expressions. This came up in the 9.3 merge, which
      added a new field to FuncExpr again, and that would cause the test to
      fail. Using pg_get_expr() to print the fields in a human-readable
      format, which is not sensitive to small changes like that, avoids
      that problem.
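
      For reference, the pg_get_expr() pattern in question (a generic sketch, not the test itself):

      ```sql
      -- pg_get_expr() renders a stored expression tree (pg_node_tree) in a
      -- human-readable form that is insensitive to node-level changes:
      SELECT adrelid::regclass, pg_get_expr(adbin, adrelid) AS default_expr
      FROM pg_attrdef;
      ```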
  7. 18 Sep 2018, 1 commit
  8. 12 Sep 2018, 2 commits
    • Fix intermittently failing dispatch test cases · 1803e1eb
      Committed by Pengzhou Tang
      In the dispatch test cases, we need a way to put a segment into
      in-recovery status to test the gang-recreation logic of the dispatcher.

      We used to trigger a panic fault on a segment and suspend quickdie()
      to simulate in-recovery status. To avoid the segment staying in recovery
      mode for a long time, we used a 'sleep' fault instead of 'suspend' in
      quickdie(), so the segment could accept new connections after 5 seconds.
      5 seconds works fine most of the time, but is still not stable enough,
      so we decided to use a more straightforward means of simulating
      in-recovery mode: report POSTMASTER_IN_RECOVERY_MSG directly in
      ProcessStartupPacket(). To avoid affecting other backends, we create a
      new database so the fault injectors only affect the dispatch test cases.
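
      For context, these tests drive faults through the gp_inject_fault extension; a minimal sketch (the fault point name and dbid are placeholders, not the ones this commit uses):

      ```sql
      CREATE EXTENSION IF NOT EXISTS gp_inject_fault;
      -- Inject a 'skip'-type fault on the segment with dbid 2, then reset it.
      SELECT gp_inject_fault('some_fault_point', 'skip', 2);
      SELECT gp_inject_fault('some_fault_point', 'reset', 2);
      ```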
    • Ensure that `SUBPARTITION TEMPLATE` only appears after `SUBPARTITION BY` · b2f85691
      Committed by Joao Pereira
      This commit changed the way the grammar parses the subpartition
      information, to ensure that a template cannot exist without a
      `SUBPARTITION BY`.

      Test coverage was added for the case where the `SUBPARTITION TEMPLATE`
      expression is written before a `SUBPARTITION BY`.

      The error displayed when the template appears after a partition was
      changed to point to the TEMPLATE.
      Co-authored-by: Ekta Khanna <ekhanna@pivotal.io>
      Co-authored-by: Jesse Zhang <sbjesse@gmail.com>
      Co-authored-by: Adam Berlin <aberlin@pivotal.io>
  9. 07 Sep 2018, 2 commits
  10. 06 Sep 2018, 3 commits
    • Handle aggr_args correctly in the parser · 06a2472b
      Committed by Taylor Vesely
      In GPDB, AGGREGATE functions can be either 'ordered' or 'hypothetical',
      and as a result the aggr_args token carries more information than in
      upstream. Except for CREATE [ORDERED] AGGREGATE, the parser extracts
      the function arguments from the aggr_args token using
      extractAggrArgTypes().

      The ALTER EXTENSION ADD/DROP AGGREGATE and SECURITY LIMIT syntax was
      added as part of the merge of PostgreSQL 9.0 through 9.2, so add a
      call to extract the function arguments.
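
      One of the affected statements, sketched with hypothetical object names:

      ```sql
      -- ALTER EXTENSION ADD/DROP AGGREGATE now extracts the argument types
      -- from aggr_args; myext and myagg are placeholders.
      ALTER EXTENSION myext ADD AGGREGATE myagg(int);
      ALTER EXTENSION myext DROP AGGREGATE myagg(int);
      ```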
      Co-authored-by: Joao Pereira <jdealmeidapereira@pivotal.io>
    • Integrate Gang management from portal to Dispatcher and simplify AssignGangs for init plans (#5555) · 78a4890a
      Committed by Tang Pengzhou
      * Simplify the AssignGangs() logic for init plans

      Previously, AssignGangs() assigned gangs for both the main plan and
      the init plans in one shot. Because init plans and the main plan are
      executed sequentially, the gangs can be reused between the main plan
      and the init plans; the function AccumSliceReq() was designed for
      this.

      This process can be simplified: since we already know that the root
      slice index id will be adjusted to the corresponding init plan id,
      each init plan only needs to assign its own slices.
      
      * Integrate Gang management from portal to Dispatcher

      Previously, gangs were managed by the portal; freeGangsForPortal()
      was used to clean up gang resources, and DTM-related commands, which
      need a gang to dispatch commands outside of a portal, used
      freeGangsForPortal() too. There might be multiple
      commands/plans/utilities executed within one portal, and all of them
      relied on a dispatcher routine like CdbDispatchCommand /
      CdbDispatchPlan / CdbDispatchUtility... to dispatch. Gangs were
      created by each dispatcher routine, but were not recycled or
      destroyed when a routine finished, except for the primary writer
      gang. One defect of this is that gang resources could not be reused
      between dispatcher routines. GPDB already had an optimization for
      init plans: if a plan contained init plans, AssignGangs() was called
      before the execution of any of them; it went through the whole slice
      tree and created the maximum gang that both the main plan and the
      init plans needed. This was doable because init plans and the main
      plan were executed sequentially, but it also made the AssignGangs()
      logic complex; meanwhile, reusing an unclean gang was not safe.

      Another confusing thing was that the gang and the dispatcher were
      managed separately, which caused context inconsistencies: when a
      dispatcher state was destroyed, its gang was not recycled; when a
      gang was destroyed by the portal, the dispatcher state was still in
      use and might refer to the context of a destroyed gang.

      As described above, this commit integrates gang management with the
      dispatcher: a dispatcher state is responsible for creating and
      tracking gangs as needed, and destroys them when the dispatcher
      state is destroyed.
      
      * Handle the case when the primary writer gang has gone

      When members of the primary writer gang are gone, the writer gang
      is destroyed immediately (primaryWriterGang is set to NULL) when a
      dispatcher routine (e.g. CdbDispatchCommand) finishes. So when
      dispatching two-phase-DTM/DTX-related commands, the QD doesn't know
      the writer gang has gone, and it may get unexpected errors like
      'savepoint not exist', 'subtransaction level not match', or 'temp
      file not exist'.

      Previously, primaryWriterGang was not reset when DTM/DTX commands
      started even if it was pointing to invalid segments, so those
      DTM/DTX commands were not actually sent to the segments, and a
      normal error was reported on the QD, looking like 'could not
      connect to segment: initialization of segworker'.

      So we need a way to inform the global transaction that its writer
      gang has been lost, so that when aborting the transaction, the QD can:
      1. disconnect all reader gangs; this is useful to skip
      dispatching "ABORT_NO_PREPARE".
      2. reset the session and drop temp files, because the temp files on
      the segments are gone.
      3. report an error when dispatching a "rollback savepoint" DTX,
      because the savepoint on the segments is gone.
      4. report an error when dispatching an "abort subtransaction" DTX,
      because the subtransaction is rolled back when the writer segment
      is down.
    • Fix command tag in "COPY (SELECT ...) TO <file>". · f5130c20
      Committed by Heikki Linnakangas
      It used to always say "COPY 0", instead of the number of rows copied. This
      source line was added in PostgreSQL 9.0 (commit 8ddc05fb), but it was
      missed in the merge. Add a test case to check the command tags of
      different variants of COPY, including this one.
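
      A sketch of the fixed behavior (the file path is a placeholder):

      ```sql
      -- The command tag now reports the number of rows actually copied:
      COPY (SELECT generate_series(1, 5)) TO '/tmp/copy_demo.txt';
      -- COPY 5   (previously this reported COPY 0)
      ```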
  11. 05 Sep 2018, 1 commit
    • Support FORCE QUOTE * in external tables · 1d15b9bc
      Committed by Daniel Gustafsson
      FORCE QUOTE * was added in PostgreSQL 9.0 as a shorthand for FORCE
      QUOTE <column_1 .. column_n> in the COPY command, where all columns
      of a relation are added for forced quoting. External tables use COPY
      and its options under the hood, so they too should support '*' to
      quote all columns. This resolves a FIXME added during the 9.0 merge.
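
      A sketch of the syntax this enables (the table, columns, and location are placeholders):

      ```sql
      -- Quote every column when writing CSV through a writable external table:
      CREATE WRITABLE EXTERNAL TABLE ext_sales_out (LIKE sales)
          LOCATION ('gpfdist://etlhost:8081/sales.csv')
          FORMAT 'CSV' (FORCE QUOTE *);
      ```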
  12. 31 Aug 2018, 2 commits
    • Replace GPDB versions of some numeric aggregates with upstream's. · 325e6fcd
      Committed by Heikki Linnakangas
      Among other things, this fixes the inaccuracy of integer avg() and sum()
      functions. (i.e. fixes https://github.com/greenplum-db/gpdb/issues/5525)
      
      The upstream versions are from PostgreSQL 9.6, using the 128-bit math
      from the following commit:
      
      commit 959277a4
      Author: Andres Freund <andres@anarazel.de>
      Date:   Fri Mar 20 10:26:17 2015 +0100
      
          Use 128-bit math to accelerate some aggregation functions.
      
          On platforms where we support 128bit integers, use them to implement
          faster transition functions for sum(int8), avg(int8),
          var_*(int2/int4),stdev_*(int2/int4). Where not supported continue to use
          numeric as a transition type.
      
          In some synthetic benchmarks this has been shown to provide significant
          speedups.
      
          Bumps catversion.
      
          Discussion: 544BB5F1.50709@proxel.se
          Author: Andreas Karlsson
          Reviewed-By: Peter Geoghegan, Petr Jelinek, Andres Freund,
              Oskari Saarenmaa, David Rowley
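
      A quick check of the corrected behavior (a sketch; pre-fix output varied):

      ```sql
      -- With the upstream transition functions, integer sum() and avg() are exact:
      SELECT sum(x), avg(x) FROM generate_series(1, 1000000) AS t(x);
      -- sum = 500000500000, avg = 500000.500000 (no floating-point drift)
      ```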
    • Enable 'triggers' test · db913af7
      Committed by xiong-gang
      Quite a few cases in 'triggers' don't apply to GPDB: non-SELECT
      statements in trigger functions, modifications on views, INSTEAD OF
      triggers, etc. But we can still use some of the test cases; for
      example, FOR STATEMENT triggers started acting differently once we
      changed INSERT/DELETE/UPDATE to ModifyTable by merging 8.5, and that
      should have been caught by the tests.
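
      A minimal statement-level trigger of the kind the re-enabled test exercises (all names are hypothetical):

      ```sql
      CREATE TABLE trig_demo (a int);

      CREATE FUNCTION trig_demo_fn() RETURNS trigger AS $$
      BEGIN
          RAISE NOTICE 'statement-level trigger fired';
          RETURN NULL;
      END;
      $$ LANGUAGE plpgsql;

      CREATE TRIGGER trig_demo_stmt
          AFTER INSERT ON trig_demo
          FOR EACH STATEMENT EXECUTE PROCEDURE trig_demo_fn();
      ```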
  13. 18 Aug 2018, 2 commits
  14. 15 Aug 2018, 2 commits
  15. 14 Aug 2018, 1 commit
    • Remove cdbdisp_finishCommand · 957629d1
      Committed by Pengzhou Tang
      Previously, cdbdisp_finishCommand did three things:
      1. cdbdisp_checkDispatchResult
      2. cdbdisp_getDispatchResult
      3. cdbdisp_destroyDispatcherState
      
      However, cdbdisp_finishCommand didn't make the code cleaner or more
      convenient to use; on the contrary, it made error handling more
      difficult and the code more complicated and inconsistent.

      This commit also resets estate->dispatcherState to NULL to avoid
      re-entry of cdbdisp_* functions.
  16. 13 Aug 2018, 2 commits
  17. 11 Aug 2018, 1 commit
    • Adding GiST support for GPORCA · ec3693e6
      Committed by Ashuka Xue
      Prior to this commit, there was no support for GiST indexes in GPORCA.
      For queries involving GiST indexes, ORCA selected Table Scan paths as
      the optimal plan. These plans could take 300+ times longer than the
      Planner's, which generated an index scan plan using the GiST index.
      
      Example:
      ```
      CREATE TABLE gist_tbl (a int, p polygon);
      CREATE TABLE gist_tbl2 (b int, p polygon);
      CREATE INDEX poly_index ON gist_tbl USING gist(p);
      
      INSERT INTO gist_tbl SELECT i, polygon(box(point(i, i+2),point(i+4,
      i+6))) FROM generate_series(1,50000)i;
      INSERT INTO gist_tbl2 SELECT i, polygon(box(point(i+1, i+3),point(i+5,
      i+7))) FROM generate_series(1,50000)i;
      
      ANALYZE;
      ```
      With the query `SELECT count(*) FROM gist_tbl, gist_tbl2 WHERE
      gist_tbl.p <@ gist_tbl2.p;`, we see a significant performance
      improvement with GiST support.
      
      Before:
      ```
      EXPLAIN SELECT count(*) FROM gist_tbl, gist_tbl2 WHERE gist_tbl.p <@ gist_tbl2.p;
                                                           QUERY PLAN
      ---------------------------------------------------------------------------------------------------------------------
       Aggregate  (cost=0.00..171401912.12 rows=1 width=8)
         ->  Gather Motion 3:1  (slice2; segments: 3)  (cost=0.00..171401912.12 rows=1 width=8)
               ->  Aggregate  (cost=0.00..171401912.12 rows=1 width=8)
                     ->  Nested Loop  (cost=0.00..171401912.12 rows=335499869 width=1)
                           Join Filter: gist_tbl.p <@ gist_tbl2.p
                           ->  Table Scan on gist_tbl2  (cost=0.00..432.25 rows=16776 width=101)
                           ->  Materialize  (cost=0.00..530.81 rows=49997 width=101)
                                 ->  Broadcast Motion 3:3  (slice1; segments: 3)  (cost=0.00..525.76 rows=49997 width=101)
                                       ->  Table Scan on gist_tbl  (cost=0.00..432.24 rows=16666 width=101)
       Optimizer status: PQO version 2.65.1
      (10 rows)
      
      Time: 170.172 ms
      SELECT count(*) FROM gist_tbl, gist_tbl2 WHERE gist_tbl.p <@ gist_tbl2.p;
       count
      -------
       49999
      (1 row)
      
      Time: 546028.227 ms
      ```
      
      After:
      ```
      EXPLAIN SELECT count(*) FROM gist_tbl, gist_tbl2 WHERE gist_tbl.p <@ gist_tbl2.p;
                                                        QUERY PLAN
      ---------------------------------------------------------------------------------------------------------------
       Aggregate  (cost=0.00..21749053.24 rows=1 width=8)
         ->  Gather Motion 3:1  (slice2; segments: 3)  (cost=0.00..21749053.24 rows=1 width=8)
               ->  Aggregate  (cost=0.00..21749053.24 rows=1 width=8)
                     ->  Nested Loop  (cost=0.00..21749053.24 rows=335499869 width=1)
                           Join Filter: true
                           ->  Broadcast Motion 3:3  (slice1; segments: 3)  (cost=0.00..526.39 rows=50328 width=101)
                                 ->  Table Scan on gist_tbl2  (cost=0.00..432.25 rows=16776 width=101)
                           ->  Bitmap Table Scan on gist_tbl  (cost=0.00..21746725.48 rows=6667 width=1)
                                 Recheck Cond: gist_tbl.p <@ gist_tbl2.p
                                 ->  Bitmap Index Scan on poly_index  (cost=0.00..0.00 rows=0 width=0)
                                       Index Cond: gist_tbl.p <@ gist_tbl2.p
       Optimizer status: PQO version 2.65.1
      (12 rows)
      
      Time: 617.489 ms
      
      SELECT count(*) FROM gist_tbl, gist_tbl2 WHERE gist_tbl.p <@ gist_tbl2.p;
       count
      -------
       49999
      (1 row)
      
      Time: 7779.198 ms
      ```
      
      GiST support was implemented by sending the GiST index information to
      GPORCA in the metadata, using a new index enum specifically for GiST.
      Signed-off-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
  18. 08 Aug 2018, 1 commit
    • Some fixes for externalgettup_custom() (#5391) · d65cd31e
      Committed by Xiaoran Wang
      1) Refactor the function externalgettup_custom.
         scan->raw_buf_done means there is no new data in formatter->fmt_databuf
         (a block of data) for the formatter to process. There may still be some
         data left in fmt_databuf, but it is not a complete tuple and the
         formatter cannot process it. pstate->fe_eof means no data is left in
         the external file. When scan->raw_buf_done and pstate->fe_eof are both
         true, there is no new data for the formatter to process. If there is
         no new data but there is still data in formatter->fmt_databuf, the
         external file is incomplete.
      2) Report a WARNING instead of an ERROR if the external file is
         incomplete. If an error is reported, the transaction rolls back; it is
         better to ignore the incomplete data at the end of the file.
  19. 07 Aug 2018, 1 commit
    • Add missing break in switch. (#5396) · 89c23f15
      Committed by Paul Guo
      These were detected by the compiler option -Wimplicit-fallthrough
      (gcc 7.x). Some of these issues were introduced by git merge during
      the postgres merge.

      We've seen similar cases (missing breaks) several times, so we probably
      want this compiler option on by default in the future, after we modify
      the code to remove all or most false alarms. Before that, we will need
      to run with this option locally during postgres merges.
  20. 03 Aug 2018, 2 commits
  21. 02 Aug 2018, 1 commit
    • Merge with PostgreSQL 9.2beta2. · 4750e1b6
      Committed by Richard Guo
      This is the final batch of commits from PostgreSQL 9.2 development,
      up to the point where the REL9_2_STABLE branch was created, and 9.3
      development started on the PostgreSQL master branch.
      
      Notable upstream changes:
      
      * Index-only scan was included in the batch of upstream commits. It
        allows queries to retrieve data only from indexes, avoiding heap
        access (see the sketch after this list).
      
      * Group commit was added to work effectively under heavy load. Previously,
        batching of commits became ineffective as the write workload increased,
        because of internal lock contention.
      
      * A new fast-path lock mechanism was added to reduce the overhead of
        taking and releasing certain types of locks which are taken and released
        very frequently but rarely conflict.
      
      * The new "parameterized path" mechanism was added. It allows inner index
        scans to use values from relations that are more than one join level up
        from the scan. This can greatly improve performance in situations where
        semantic restrictions (such as outer joins) limit the allowed join orderings.
      
      * SP-GiST (Space-Partitioned GiST) index access method was added to support
        unbalanced partitioned search structures. For suitable problems, SP-GiST can
        be faster than GiST in both index build time and search time.
      
      * Checkpoints now are performed by a dedicated background process. Formerly
        the background writer did both dirty-page writing and checkpointing. Separating
        this into two processes allows each goal to be accomplished more predictably.
      
      * Custom plan was supported for specific parameter values even when using
        prepared statements.
      
      * API for FDW was improved to provide multiple access "paths" for their tables,
        allowing more flexibility in join planning.
      
      * The security_barrier option was added for views to prevent
        optimizations that might allow view-protected data to be exposed to users.
      
      * Range data type was added to store a lower and upper bound belonging to its
        base data type.
      
      * CTAS (CREATE TABLE AS / SELECT INTO) is now treated as a utility
        statement. The SELECT query is planned during the execution of the
        utility. To conform to this change, GPDB executes the utility statement
        only on the QD and dispatches the plan of the SELECT query to the QEs.
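
      As referenced above, a minimal index-only scan sketch (the table and data are hypothetical; the plan shape depends on statistics and visibility):

      ```sql
      CREATE TABLE ios_demo (id int PRIMARY KEY, payload text);
      INSERT INTO ios_demo SELECT i, 'x' FROM generate_series(1, 100000) i;
      VACUUM ANALYZE ios_demo;
      -- May show "Index Only Scan using ios_demo_pkey", answering the query
      -- from the index without touching the heap:
      EXPLAIN SELECT id FROM ios_demo WHERE id < 100;
      ```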
      Co-authored-by: Adam Lee <ali@pivotal.io>
      Co-authored-by: Alexandra Wang <lewang@pivotal.io>
      Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
      Co-authored-by: Asim R P <apraveen@pivotal.io>
      Co-authored-by: Daniel Gustafsson <dgustafsson@pivotal.io>
      Co-authored-by: Gang Xiong <gxiong@pivotal.io>
      Co-authored-by: Haozhou Wang <hawang@pivotal.io>
      Co-authored-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
      Co-authored-by: Jesse Zhang <sbjesse@gmail.com>
      Co-authored-by: Jinbao Chen <jinchen@pivotal.io>
      Co-authored-by: Joao Pereira <jdealmeidapereira@pivotal.io>
      Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
      Co-authored-by: Paul Guo <paulguo@gmail.com>
      Co-authored-by: Richard Guo <guofenglinux@gmail.com>
      Co-authored-by: Shujie Zhang <shzhang@pivotal.io>
      Co-authored-by: Taylor Vesely <tvesely@pivotal.io>
      Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
  22. 01 Aug 2018, 2 commits
  23. 31 Jul 2018, 1 commit
  24. 24 Jul 2018, 1 commit
    • Fix race condition in autovacuum test · dfdfd144
      Committed by David Kimura
      If autovacuum was triggered before ShmemVariableCache->latestCompletedXid
      was updated by manually consuming xids, then autovacuum might not vacuum
      template0 with a proper transaction id to compare against. We made the
      test more reliable by suspending a new fault injector
      (auto_vac_worker_before_do_autovacuum) right before the autovacuum worker
      sets recentXid and starts doing the autovacuum. This allows us to
      guarantee that autovacuum compares against a proper xid.

      We also removed the loop in the test, because the
      vacuum_update_dat_frozen_xid fault injector ensures the pg_database
      table has been updated.
      Co-authored-by: Jimmy Yih <jyih@pivotal.io>
  25. 23 Jul 2018, 1 commit
    • Enable update on distribution column in legacy planner. · 6be0a32a
      Committed by Zhenghua Lyu
      Before, we could not update distribution columns in the legacy planner,
      because the OLD tuple and the NEW tuple may belong to different segments.
      We enable this by borrowing ORCA's logic, namely splitting each update
      operation into a delete and an insert. The delete operation is hashed by
      the OLD tuple's attributes, and the insert operation is hashed by the
      NEW tuple's attributes. This change includes the following items (a plan
      sketch follows the TODO list below):
      * We need to push missed OLD attributes down to the subplan tree so that
        those attributes can be passed up to the top Motion.
      * In addition, if the result relation has oids, we also need to put the
        oid in the targetlist.
      * If the result relation is partitioned, we need special treatment,
        because resultRelations contains the partition tables instead of the
        root table, unlike for a normal Insert.
      * Special treatment for update triggers, because triggers cannot be
        executed across segments.
      * Special treatment in nodeModifyTable, so that it can process
        Insert/Delete for the update's purpose.
      * Proper initialization of SplitUpdate.
      
      There are still TODOs:
      * We don't handle cost gracefully, because we add the SplitUpdate node
        after the plan is generated. A FIXME has already been added for this.
      * For deletion, we could optimize by sending only the distribution
        columns instead of all columns.
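
      As mentioned above, a sketch of the new plan shape (the table is hypothetical; exact plan output varies):

      ```sql
      CREATE TABLE upd_demo (a int, b int) DISTRIBUTED BY (a);
      -- Updating the distribution key is now allowed in the legacy planner;
      -- the plan is expected to contain a SplitUpdate feeding a Redistribute Motion.
      EXPLAIN UPDATE upd_demo SET a = a + 1;
      ```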
      
      
      Author: Xiaoran Wang <xiwang@pivotal.io>
      Author: Max Yang <myang@pivotal.io>
      Author: Shujie Zhang <shzhang@pivotal.io>
      Author: Zhenghua Lyu <zlv@pivotal.io>
  26. 21 Jul 2018, 1 commit
    • Remove tests for gp_setwith_alter_storage. · f11dc049
      Committed by Ashwin Agrawal
      This feature is not ready for prime time yet, so there is no point in
      testing it by enabling the GUC. Hence, we are removing the tests for
      now; whenever we expose this feature in the future, we will add and
      enable tests for it.

      Github issue #5300 tracks fully enabling the feature.
  27. 04 Jul 2018, 1 commit
  28. 27 Jun 2018, 1 commit