- 11 January 2019, 12 commits
-
-
Committed by Taylor Vesely

- set config flags to disable zstd on platforms that don't have the library baked in
- currently only centos6 and centos7 have zstd provided in the build images
- delete the suse10 config flags, because it is no longer a supported platform
- add EXTRA_CONFIG_FLAGS to deb_create_package.bash

Co-authored-by: Taylor Vesely <tvesely@pivotal.io>
Co-authored-by: Ben Christel <bchristel@pivotal.io>
Co-authored-by: Karen Huddleston <khuddleston@pivotal.io>
-
Committed by Taylor Vesely

With `shell=True` in the call to `subprocess.call`, the options to `configure` were ignored. We don't need the shell, and its use is generally discouraged.

Co-authored-by: Taylor Vesely <tvesely@pivotal.io>
Co-authored-by: Karen Huddleston <khuddleston@pivotal.io>
-
Committed by Jimmy Yih

The current tests would diff on the SELECT queries containing ORDER BY with LIMIT when run on a gpdemo cluster (seen on my MacOS Mojave laptop and on CentOS 6 and 7 VMs). Most likely the test answer file was created from a cluster that did not have the standard 3-primary-segment test configuration, so the SELECT output may differ.

Co-authored-by: Karen Huddleston <khuddleston@pivotal.io>
-
Committed by Ekta Khanna

The pm_launch_walreceiver variable is still needed to handle deadlocks during parallel segment start. Removing the logic will result in CAC_STARTUP being received and pg_ctl being stuck waiting for PQPING_OK. The pm_launch_walreceiver variable is used to send the Greenplum-specific CAC_MIRROR_READY so that pg_ctl can continue.

Related commit: https://github.com/greenplum-db/gpdb/commit/b824fe8f269934987e30a4c177605b250703d20f

Co-authored-by: Ekta Khanna <ekhanna@pivotal.io>
Co-authored-by: Jimmy Yih <jyih@pivotal.io>
-
Committed by Paul Guo

This feature will not be supported in the upcoming gpdb major release. Explicitly disable it for now.
-
Committed by Heikki Linnakangas

If a Material node is used to shield an underlying Motion node from rescans, we mustn't eagerly free it when it's squelched. Also, when such a Material node is squelched, don't squelch the underlying node, because we might need to read more tuples from it later.

Reviewed-by: Pengzhou Tang <ptang@pivotal.io>
-
Committed by Heikki Linnakangas

If a RecursiveUnion is rescanned after squelching, we still need the tuplestores. So don't free them, just reset them to empty. This showed up as crashes in the regression suite.

Reviewed-by: Pengzhou Tang <ptang@pivotal.io>
-
Committed by Heikki Linnakangas

A SubqueryScan keeps the child Plan node in a special field, not in the normal Plan.lefttree field. Therefore, it needs bespoke code to handle recursion. This fixes the starvation / deadlock that used to happen e.g. with this query in the regression test suite. It was silenced by commit b7bb5438, but we should still fix this, as it might still cause trouble with some other queries.

regression=# explain with recursive r(i) as (
    select 1
    union all
    select r.i + 1 from r, recursive_table_2 where i = recursive_table_2.id
), y(i) as (
    select 1
    union all
    select i + 1 from y, recursive_table_1
    where i = recursive_table_1.id and EXISTS (select * from r limit 10)
) select * from y limit 10;
                                                           QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..0.20 rows=10 width=4)
   ->  Recursive Union  (cost=0.00..24.96 rows=35 width=4)
         ->  Result  (cost=0.00..0.01 rows=1 width=0)
         ->  Result  (cost=2.15..2.43 rows=4 width=4)
               One-Time Filter: $2
               InitPlan 1 (returns $2)  (slice3)
                 ->  Limit  (cost=0.00..0.02 rows=1 width=0)
                       ->  Subquery Scan on r  (cost=0.00..0.68 rows=35 width=0)
                             ->  Recursive Union  (cost=0.00..24.76 rows=12 width=4)
                                   ->  Result  (cost=0.00..0.01 rows=1 width=0)
                                   ->  Hash Join  (cost=2.13..2.41 rows=4 width=4)
                                         Hash Cond: (r_1.i = recursive_table_2.id)
                                         ->  WorkTable Scan on r r_1  (cost=0.00..0.20 rows=4 width=4)
                                         ->  Hash  (cost=2.09..2.09 rows=1 width=4)
                                               ->  Gather Motion 3:1  (slice2; segments: 3)  (cost=0.00..2.09 rows=3 width=4)
                                                     ->  Seq Scan on recursive_table_2  (cost=0.00..2.03 rows=1 width=4)
               ->  Hash Join  (cost=2.13..2.40 rows=4 width=4)
                     Hash Cond: (y.i = recursive_table_1.id)
                     ->  WorkTable Scan on y  (cost=0.00..0.20 rows=4 width=4)
                     ->  Hash  (cost=2.09..2.09 rows=1 width=4)
                           ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..2.09 rows=3 width=4)
                                 ->  Seq Scan on recursive_table_1  (cost=0.00..2.03 rows=1 width=4)
 Optimizer: legacy query optimizer
(23 rows)

Reviewed-by: Pengzhou Tang <ptang@pivotal.io>
-
Committed by Heikki Linnakangas

Previously, these nodes didn't squelch their subnodes after returning the last row if the EXEC_FLAG_REWIND option was used. That's not right. It is not the responsibility of these nodes to shield the subnodes from squelches. If a subnode doesn't want to free resources eagerly, it needs to deal with that by itself.

This used to cause deadlocks in the regression tests, in query plans involving SubPlans with Limit nodes, when using the TCP interconnect. (The TCP interconnect has shorter send queues, making it more prone to starvation or deadlock if squelching is missing.) The regression failures were silenced by commit b7bb5438, but there might be some other queries that we just haven't hit yet where this could still be reproduced.

Reviewed-by: Pengzhou Tang <ptang@pivotal.io>
-
Committed by Karen Huddleston

Also updated the docker README to reference the new build image.

Co-authored-by: David Sharp <dsharp@pivotal.io>
Co-authored-by: Karen Huddleston <khuddleston@pivotal.io>
-
Committed by David Sharp

and centos6-build for the OSS build job. The image for this job was left unmodified when we previously introduced the centos6-build image for enterprise compile jobs. We changed the OSS job to also use the centos6-build image so we could remove the centos-gpdb-test-6 image.

We also renamed the image resources to centos<version>-<test,build>-gpdb6, to clarify what they are building/testing. Docker resource names should match the repository and tag of the docker image.

Co-authored-by: David Sharp <dsharp@pivotal.io>
Co-authored-by: Ben Christel <bchristel@pivotal.io>
Co-authored-by: Karen Huddleston <khuddleston@pivotal.io>
-
Committed by Ashwin Agrawal

gpstart tried to automatically rebalance the cluster if synced segment pairs were not in their preferred segment roles (primary or mirror). This worked, and was basically free, with file replication: as part of cluster start, the gpstart utility would see that a primary/mirror pair was up and in sync but had its segment roles reversed in the catalog, and it was simple to just send the correct filerep signals to switch the roles back to the preferred ones.

With WAL replication, this is not as trivial. The primary and mirror segments themselves are very much aware of their segment roles: if a segment finds a recovery.conf file in its data directory, it automatically starts as a mirror. So, until this is properly implemented (if it is decided that gpstart should still support it), remove the currently broken logic from gpstart.
-
- 10 January 2019, 15 commits
-
-
Committed by Daniel Gustafsson

The website redesign altered the URLs with no redirects, so existing links need to be updated to match the new structure.

Reviewed-by: Mel Kiyama
Reviewed-by: David Yozie
-
Committed by Daniel Gustafsson

This polishes the wording in the README a bit where it seemed either convoluted or strange to me. On top of wording, it updates the directory structure to reference gpcontrib, and removes the mention of TINC, with the rationale that anyone reading this file is a new contributor and there is little value in bringing up a framework which is on its deathbed. It also fixes the links to the website to actually work, since the site redesign broke the old links without redirects. The ORCA naming is discussed, with all mentions changed to GPORCA, since that's the name used in the documentation.

Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
Discussion: https://github.com/greenplum-db/gpdb/pull/6648
-
Committed by Heikki Linnakangas

Change the mapping of AO segfilenum+rownum to an ItemPointer, so that we avoid using ItemPointer.ip_posid values higher than 32768. Such offsets are impossible on heap tables, because you can't fit that many tuples on a page. In GiST, since PostgreSQL 9.1, we have taken advantage of that by using 0xfffe (65534) to mark special "invalid" GiST tuples. We can tolerate that, because those invalid tuples can only appear on internal pages, so they cannot be confused with AO TIDs, which only appear on leaf pages. But later versions of PostgreSQL will also use those high values for other similar magic values, so it seems better to keep clear of them, even if we could make it work.

To allow binary upgrades of indexes that already contain AO TIDs with high offsets, we still allow and handle those, too, in the code that fetches AO tuples. Also relax the sanity check in the GiST code, so as not to confuse those high values with invalid tuples.

Fixes https://github.com/greenplum-db/gpdb/issues/6227
Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
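A hedged sketch of the mapping idea only (the field widths and names are illustrative, not the actual GPDB encoding): keep ip_posid within 15 bits, so values like GiST's 0xfffe "invalid" marker can never be produced, and spill the remaining rownum bits into the block number.

#include <stdint.h>

/* Simplified stand-in for ItemPointerData. */
typedef struct
{
	uint16_t	bi_hi;
	uint16_t	bi_lo;
	uint16_t	ip_posid;
} ItemPointerData;

static void
ao_rownum_to_tid(uint32_t segfile, uint64_t rownum, ItemPointerData *tid)
{
	/* low 15 bits of rownum -> posid (1-based, so always <= 0x7fff < 0xfffe);
	 * higher bits -> block number, with the segfile in the top bits */
	uint64_t	block = (((uint64_t) segfile) << 25) | (rownum >> 15);

	tid->bi_hi = (uint16_t) (block >> 16);
	tid->bi_lo = (uint16_t) block;
	tid->ip_posid = (uint16_t) ((rownum & 0x7fff) + 1);
}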
-
Committed by Heikki Linnakangas

In the TCP interconnect, we used to bind outgoing TCP connections to the same source IP address that the libpq connection came from. That can lead to running out of ephemeral TCP ports.

I was seeing errors when running the regression tests with the TCP interconnect:

ERROR:  interconnect error setting up outgoing connection
DETAIL:  Could not bind to local addr 2.0.0.0: Address already in use

This was easily reproducible by running the parallel group of tests that includes the qp_misc_jiras test. Apparently that parallel group opens especially many connections.

When a socket is bound to a particular IP address with bind(), it is also allocated an ephemeral TCP port. On Linux, the range of ports available can be seen in /proc/sys/net/ipv4/ip_local_port_range. It defaults to 32768-60999, but even if you increase the range, it's always quite limited. bind() reserves the whole TCP port, even though multiple outgoing connections could share the same source port as long as their destination IP address or port is different, because bind() doesn't know whether you're going to use the port to listen for incoming connections or to establish an outgoing connection. Listening for incoming connections needs to reserve the port. Linux kernel 4.2 introduced a new socket option, IP_BIND_ADDRESS_NO_PORT, that we could use to give bind() a hint that we're using the socket for an outgoing connection, so there's no need to reserve the whole port.

But actually, I don't think we should be calling bind() on outgoing connections in the first place. I don't think the logic of using the incoming libpq connection's IP address as the source IP address of outgoing interconnect connections makes sense. The comment says that it is for fault tolerance, but I don't buy that argument. If a network adapter is not working, it should be disabled in the OS configuration so that it is not used. It is not the application's job to make routing decisions. Forcing the same source IP address seems outright wrong in some scenarios. Imagine that the QD has two network adapters: one for connecting to the outside world, and another for the internal network where the QEs are. In that scenario, the interconnect connections between the QD and the QEs should definitely *not* be established through the same network adapter as the user's libpq connection.

Better to just remove the code that binds to a particular source IP address, and let the OS do its job of routing TCP connections. (AFAICS, the UDP interconnect never tried to force a particular source IP address when sending.)

Reviewed-by: Paul Guo <pguo@pivotal.io>
Discussion: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/ITkZdACpcVQ/H_74phbMFgAJ
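For reference, a minimal sketch of the IP_BIND_ADDRESS_NO_PORT alternative mentioned above (assuming Linux >= 4.2; this is not the interconnect code, and not the route the commit ultimately takes):

#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Connect to dst while pinning the source IP in *src.  With
 * IP_BIND_ADDRESS_NO_PORT, bind() only pins the address; the ephemeral port
 * is chosen at connect() time, so connections to different destinations can
 * share source ports instead of each reserving one.
 */
static int
connect_from(const struct sockaddr_in *src, const struct sockaddr_in *dst)
{
	struct sockaddr_in srcaddr = *src;
	int			one = 1;
	int			fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;
#ifdef IP_BIND_ADDRESS_NO_PORT
	setsockopt(fd, IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, &one, sizeof(one));
#endif
	srcaddr.sin_port = 0;		/* let the kernel pick the port */
	if (bind(fd, (const struct sockaddr *) &srcaddr, sizeof(srcaddr)) < 0 ||
		connect(fd, (const struct sockaddr *) dst, sizeof(*dst)) < 0)
	{
		close(fd);
		return -1;
	}
	return fd;
}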
-
Committed by Heikki Linnakangas

The new class, CContextQueryToDXL, holds information that's global for the whole query. This makes subquery planning less awkward, as we don't need to pass global information up and down the query levels.

Reviewed-by: Ekta Khanna <ekhanna@pivotal.io>
Reviewed-by: Venkatesh Raghavan <vraghavan@pivotal.io>
Reviewed-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
-
Committed by Shaoqi Bai

Currently, EXCHANGE PARTITION allows a target partition and source table with differing relpersistence types. This should be checked for and banned when checking whether the relation is_exchangeable.

Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
Co-authored-by: Shaoqi Bai <sbai@pivotal.io>
-
Committed by Melanie Plageman

MergeAttributes was used in atpxPart_validate_spec to get the schema and constraints for making a new leaf partition as part of ADD or SPLIT PARTITION. It was likely used as a convenience, since it already existed, but it seems like the wrong function for the job.

Previously, atpxPart_validate_spec simply hard-coded false for the relation persistence, since the parameter was simply `isTemp`. Once the options for relation persistence were expanded to include unlogged, this parameter was changed to take a relpersistence. In MergeAttributes, in the part we actually hit when calling it from here (we pass in the schema as NIL and therefore hit only half of the MergeAttributes code), the `supers` parameter is actually that of the parent partition and includes its relpersistence. So, by passing in the relpersistence of the parent, the checks we do around relpersistence are redundant, because we are comparing the parent's relpersistence to its own. However, because this function is currently only called when we are making a new relation that would just use the relpersistence of the parent anyway (we don't allow a different persistence to be specified for the child), by passing it in hard-coded we would incorrectly assume that we are always creating a permanent relation.

Since MergeAttributes was overkill, we wrote a new helper function, SetSchemaAndConstraints, to get the schema and constraints of a relation. This function doesn't do many of the special validation checks that may be required by callers using it in the context of partition tables (so user beware); however, it is probably only useful in the context of partition tables anyway, because it assumes constraints will be cooked, which wouldn't be the case for all relations. We split it into two smaller inline functions for clarity. We also felt this would be a useful helper function in general, so we extern'd it.

This commit also sets the relpersistence that is used to make the leaf partition when adding a new partition or splitting an existing partition. makeRangeVar is a function from upstream which is basically a constructor; it sets relpersistence in the RangeVar to a hard-coded value of RELPERSISTENCE_PERMANENT. However, because we use the root partition to get the constraints and column information for the new leaf, after the default construction of the RangeVar we need to set the relpersistence to that of the parent. This commit specifically only sets it back for the case in which we are adding a partition with `ADD PARTITION` or through `SPLIT PARTITION`. Without this commit, a leaf partition of an unlogged table created through `ADD PARTITION` or `SPLIT PARTITION` would incorrectly have its relpersistence set to permanent.

Co-authored-by: Alexandra Wang <lewang@pivotal.io>
Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
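A hedged, self-contained sketch of the relpersistence fix described above (stub types only; the real makeRangeVar and RangeVar live in the server code):

#include <stdio.h>

#define RELPERSISTENCE_PERMANENT 'p'
#define RELPERSISTENCE_UNLOGGED  'u'

typedef struct RangeVar
{
	char		relpersistence;
} RangeVar;

/* Upstream's makeRangeVar() hard-codes RELPERSISTENCE_PERMANENT. */
static RangeVar
make_range_var(void)
{
	RangeVar	rv = {RELPERSISTENCE_PERMANENT};

	return rv;
}

int
main(void)
{
	char		parent_persistence = RELPERSISTENCE_UNLOGGED;	/* unlogged root */
	RangeVar	leaf = make_range_var();

	/* The fix: inherit the parent's persistence for the new leaf, instead of
	 * keeping the constructor's hard-coded permanent default. */
	leaf.relpersistence = parent_persistence;
	printf("leaf relpersistence: %c\n", leaf.relpersistence);
	return 0;
}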
-
Committed by Ashwin Agrawal
-
Committed by Ashwin Agrawal
-
Committed by Taylor Vesely

The WalSndCtl can have status information for non-mirror walsender connections, i.e. pg_basebackup connections. Ignore them.

Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
-
Committed by Asim R P

Now that we allow multiple WalSnd objects, FTS probes need to recognize the WalSnd object corresponding to the mirror. This is achieved by defining the Greenplum-specific application name "gp_replication". The mirrors use this application name as a connection parameter. Any other replication connections (backup and log streamer connections initiated by pg_basebackup) do not use this application name; in particular, the log streamer replication connection initiated by pg_basebackup should NOT use the Greenplum-specific application name.

Co-authored-by: David Kimura <dkimura@pivotal.io>
Co-authored-by: Adam Berlin <aberlin@pivotal.io>
Co-authored-by: Taylor Vesely <tvesely@pivotal.io>
-
Committed by Adam Berlin

This replication slot is used for WAL replication between primary and mirror segments, and also between master and standby. The replication slot is created when a mirror / standby segment is initialized using pg_basebackup. The replication slot is used by the primary to keep track of the WAL flush location reported by the mirror. When the mirror disconnects, it allows the primary to retain enough WAL so that the mirror can catch up after reconnecting in the future.

- defaults max_wal_senders to 10, to allow basebackup to spin up senders, matching upstream
- defaults max_replication_slots to 10 instead of 0
- changes gp_basebackup to create a replication slot when a slot name is provided during gpinitsystem
- changes gp_basebackup to use streaming replication during gpinitsystem
- creates and uses a replication slot during full recovery

Note: We intend to reason more deeply about the default guc settings in a later feature.

Co-authored-by: David Kimura <dkimura@pivotal.io>
Co-authored-by: Asim R P <apraveen@pivotal.io>
-
Committed by Peter Eisentraut

This option specifies a replication slot for WAL streaming (-X stream), so that there can be continuous replication slot use between WAL streaming during the base backup and the start of regular streaming replication.

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
-
Committed by ZhangJackey

Commit df19119c eliminated distributed transaction log creation and maintenance on the QD (only a 32K `pg_distributedlog/0000` file exists). gpexpand copies data files from the QD to new segments, so on the new segments the oldestXid is 3 (loaded from pg_control, which was copied from the QD). After the new segments join the cluster, they maintain the oldestXmin, so they loop to find the corresponding page in the distributed transaction log. If the local transaction ID (xid) is huge on a new segment, this leads to a hole between 0000 and TransactionIdToPage(xid), and an error is raised.

In this commit we truncate the distributed log on new segments with a cutoff of oldestXid. Then the hole is gone, and the oldestXmin is initialized to oldestLocalXmin.
-
Committed by gshaw-pivotal

- Include how to make Greenplum within docker accessible to SQL editors running on the local machine (outside of docker).
- Update the pip install command so that psutil and lockfile are accessible when the make cluster command is executed.
-
- 09 January 2019, 13 commits
-
-
Committed by Georgios Kokolatos

Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
Reviewed-by: Daniel Gustafsson <dgustafsson@pivotal.io>
-
Committed by Heikki Linnakangas

cdbpath_motion_for_join() was sometimes returning an incorrect locus for a join between SingleQE and Hashed loci. This happened when even the "last resort" strategy of moving the hashed side to the single QE failed. This can happen at least in the query that's added to the regression tests: it involves a nested loop join path, where one side is a SingleQE locus and the other side is a Hashed locus, and there are no join predicates that can be used to determine the resulting locus.

While we're at it, turn the Assertion that this tripped, and some related ones at the same place, into elog()s. No need to crash the whole server if the planner screws up, and it'd be good to perform these sanity checks in production, too.

The failure of the "last resort" codepath was left unhandled by commit 0522e960.

Fixes https://github.com/greenplum-db/gpdb/issues/6643.
Reviewed-by: Paul Guo <pguo@pivotal.io>
-
Committed by Yandong Yao
-
Committed by Richard Guo

The following identity holds true:

(A antijoin B on (Pab)) innerjoin C on (Pac) = (A innerjoin C on (Pac)) antijoin B on (Pab)

So we should not enforce join ordering for ANTI joins. Instead we need to collapse ANTI join nodes so that they participate fully in the join order search. For example:

select * from a join b on a.i = b.i
where not exists (select i from c where a.i = c.i);

For this query, the original join order is "(a innerjoin b) antijoin c". If we enforce ANTI join ordering, this will be the final join order. But another join order, "(a antijoin c) innerjoin b", is also legal. We should take this order into consideration and pick the cheaper one. LASJ is handled the same way as ANTI joins.

Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
Reviewed-by: Melanie Plageman <mplageman@pivotal.io>
-
Committed by Ashwin Agrawal

Co-authored-by: Ekta Khanna <ekhanna@pivotal.io>
-
Committed by Ashwin Agrawal

Co-authored-by: Ekta Khanna <ekhanna@pivotal.io>
-
Committed by Ashwin Agrawal

With this commit, a QE in maintenance mode will ignore the distributed log and just behave like a single-instance postgres. Without this, when a QE is started as a single instance, no distributed snapshot is used; because of that, the distributed oldest xmin points to the oldest datfrozen_xid in the system. As a result, vacuuming any table reports HEAP_TUPLE_RECENTLY_DEAD and avoids cleaning up dead rows.

Co-authored-by: Ekta Khanna <ekhanna@pivotal.io>
-
Committed by Pengzhou Tang

All lwlocks are stored in MainLWLockArray, which is an array of LWLockPadded structures:

typedef union LWLockPadded
{
	LWLock	lock;
	char	pad[LWLOCK_PADDED_SIZE];
} LWLockPadded;

The calculation in SyncHTPartLockId to fetch a lwlock is incorrect, because it offsets the array as an array of LWLock. In the current code base it works fine, because the size of LWLock happens to be 32; if the LWLock structure gets enlarged, the calculation will be wrong.
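A hedged sketch of the indexing bug (stub types below; SyncHTPartLockId's real code differs): element i of an LWLockPadded array must be addressed through the union, so the pointer strides by LWLOCK_PADDED_SIZE rather than sizeof(LWLock).

#include <stdint.h>

#define LWLOCK_PADDED_SIZE 64

typedef struct LWLock { int state; } LWLock;	/* stand-in */

typedef union LWLockPadded
{
	LWLock		lock;
	char		pad[LWLOCK_PADDED_SIZE];
} LWLockPadded;

/* Correct: index the padded array, then take the lock member. */
static LWLock *
part_lock_ok(LWLockPadded *array, int base, int partition)
{
	return &array[base + partition].lock;
}

/* Buggy: strides by sizeof(LWLock); only coincidentally right while
 * sizeof(LWLock) == LWLOCK_PADDED_SIZE. */
static LWLock *
part_lock_buggy(LWLockPadded *array, int base, int partition)
{
	return (LWLock *) array + base + partition;
}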
-
Committed by Pengzhou Tang
-
Committed by Pengzhou Tang

GPDB always sets the REWIND flag for subplans, including init plans. In 6195b967, we tightened the restriction so that if a node is not eager-free, we cannot squelch it early; this includes init plans. That exposed a few hidden bugs: if an init plan contains a motion node that needs to be squelched early, the whole query gets stuck in cdbdisp_checkDispatchResult(), because some QEs keep sending tuples.

To resolve this, we use DISPATCH_WAIT_FINISH mode for the dispatcher when waiting for the dispatch results of an init plan. An init plan with a motion is always executed on the QD and should always be a SELECT-like plan, and an init plan must have already fetched all the tuples it needs before the dispatcher waits for the QEs, so DISPATCH_WAIT_FINISH is the right mode for init plans.
-
Committed by Ekta Khanna

Co-authored-by: Jimmy Yih <jyih@pivotal.io>
-
Committed by Ekta Khanna

As part of commit dc78e56c, the distributed snapshot logic was modified to use latestCompletedDxid. This changed the xmax bound for visible transactions in the snapshot from inclusive to exclusive. Hence, update the check to return DISTRIBUTEDSNAPSHOT_COMMITTED_INPROGRESS even for a transaction id equal to the global xmax. Another way to fix this would be to use latestCompletedDxid without the +1 for xmax, but it is better to keep the logic similar to the local snapshot check and not have xmax in the inclusive range of visible transactions.

This was exposed in CI by the test isolation/results/heap-repeatable-read-vacuum-freeze failing intermittently, due to the isolation framework itself triggering a query on pg_locks to check for deadlocks. This commit explicitly adds a test to cover the scenario.

Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
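A hedged sketch of the boundary change (invented names; the real check lives in the distributed snapshot code): with xmax derived from latestCompletedDxid + 1, a dxid equal to xmax must be treated as in-progress.

#include <stdbool.h>
#include <stdint.h>

typedef uint64_t DistributedTransactionId;

static bool
dxid_is_in_progress(DistributedTransactionId dxid,
					DistributedTransactionId xmin,
					DistributedTransactionId xmax)
{
	if (dxid < xmin)
		return false;		/* completed before the snapshot was taken */
	if (dxid >= xmax)		/* was "dxid > xmax" before this fix */
		return true;		/* xmax is exclusive: not visible yet */
	/* ... otherwise consult the snapshot's in-progress array ... */
	return false;
}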
-
Committed by Ashwin Agrawal

With commit 8a11bfff, aggressive restart point creation is no longer performed in gpdb. Since CreateRestartPoint() is not coded to be called from the startup process, a GPDB-specific code exception was added in the past to make it work for the earlier aggressive restart point creations, calls to which could happen via the startup process. Now, a restart point is created on a checkpoint record only when gp_replica_check is running, and that should be done via the checkpointer process. Eliminate any case of calling CreateRestartPoint() from the startup process, and thereby remove the GPDB-added exception to CreateRestartPoint() and align with upstream code.
-