Commits · fa09dd80f3be01fdc0464977bc6fe54c3d43cb3e · Greenplum / Gpdb

24 Jan, 2019 7 commits

Remove stale replication slots on mirrors. · fa09dd80

Authored by David Kimura on Jan 17, 2019

Stale replication slots can exist on mirrors that once acted as
primaries. In that case restart_lsn holds a non-zero value left over from
the earlier replication slot setup. The stale slot continues to
retain xlog on the mirror, which is problematic and unnecessary.

This patch drops the internal replication slot on mirror startup.
Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
Co-authored-by: Alexandra Wang <lewang@pivotal.io>

fa09dd80

Prevent int128 from requiring more than MAXALIGN alignment. · d74dd56a

Authored by Jesse Zhang on Jan 22, 2019

We backported 128-bit integer support to speed up aggregates (commits
8122e143 and 959277a4) from upstream 9.6 into Greenplum (in
commits 9b164486 and 325e6fcd). However, we forgot to also port a
follow-up fix postgres/postgres@7518049980b, mostly because it's nuanced
and hard to reproduce.

There are two ways to tell the brokenness:

1. On a lucky day, tests would fail on my workstation, but not my laptop (or
   vice versa).

1. If you stare at the generated code for `int8_avg_combine` (and friends),
   you'll notice the compiler uses "aligned" instructions like `movaps` and
   `movdqa` (on AMD64).

Today's my lucky day.

Original commit message from postgres/postgres@7518049980b (by Tom Lane):

> Our initial work with int128 neglected alignment considerations, an
> oversight that came back to bite us in bug #14897 from Vincent Lachenal.
> It is unsurprising that int128 might have a 16-byte alignment requirement;
> what's slightly more surprising is that even notoriously lax Intel chips
> sometimes enforce that.

> Raising MAXALIGN seems out of the question: the costs in wasted disk and
> memory space would be significant, and there would also be an on-disk
> compatibility break.  Nor does it seem very practical to try to allow some
> data structures to have more-than-MAXALIGN alignment requirement, as we'd
> have to push knowledge of that throughout various code that copies data
> structures around.

> The only way out of the box is to make type int128 conform to the system's
> alignment assumptions.  Fortunately, gcc supports that via its
> __attribute__(aligned()) pragma; and since we don't currently support
> int128 on non-gcc-workalike compilers, we shouldn't be losing any platform
> support this way.

> Although we could have just done pg_attribute_aligned(MAXIMUM_ALIGNOF) and
> called it a day, I did a little bit of extra work to make the code more
> portable than that: it will also support int128 on compilers without
> __attribute__(aligned()), if the native alignment of their 128-bit-int
> type is no more than that of int64.

> Add a regression test case that exercises the one known instance of the
> problem, in parallel aggregation over a bigint column.

> This will need to be back-patched, along with the preparatory commit
> 91aec93e.  But let's see what the buildfarm makes of it first.

> Discussion: https://postgr.es/m/20171110185747.31519.28038@wrigleys.postgresql.org

(cherry picked from commit 75180499)

d74dd56a

Rearrange c.h to create a "compiler characteristics" section. · 60a08bc2

Authored by Jesse Zhang on Jan 22, 2019

This cherry-picks 91aec93e. We had to be extra careful to preserve
still-in-use macros UnusedArg and STATIC_IF_INLINE and friends.

> Generalize section 1 to handle stuff that is principally about the
> compiler (not libraries), such as attributes, and collect stuff there
> that had been dropped into various other parts of c.h.  Also, push
> all the gettext macros into section 8, so that section 0 is really
> just inclusions rather than inclusions and random other stuff.

> The primary goal here is to get pg_attribute_aligned() defined before
> section 3, so that we can use it with int128.  But this seems like good
> cleanup anyway.

> This patch just moves macro definitions around, and shouldn't result
> in any changes in generated code.  But I'll push it out separately
> to see if the buildfarm agrees.

> Discussion: https://postgr.es/m/20171110185747.31519.28038@wrigleys.postgresql.org

(cherry picked from commit 91aec93e)

60a08bc2

Update GDD to not assign global transaction ids · e24ddd70

Authored by David Kimura on Jan 18, 2019

Currently GDD sets DistributedTransactionContext to
DTX_CONTEXT_QD_DISTRIBUTED_CAPABLE and as a result allocates a distributed
transaction id. It creates an entry in ProcGlobal->allTmGxact with state
DTX_STATE_ACTIVE_NOT_DISTRIBUTED. The effect is that any query
taking a snapshot sees this transaction as in progress. Since the GDD
transaction is short-lived this is not an issue in general, but in CI it
causes flaky behavior in some of the vacuum tests. The flakiness
shows up as unvacuumed tables when the vacuum snapshot was taken while the
GDD transaction was running, forcing vacuum to lower its oldest
XMIN. Having GDD consume a distributed transaction id
(every 2 minutes by default) is also wasteful.

Currently GDD also sends a snapshot to the QEs, but this isn't required and
is wasteful as well.

With this change, GDD keeps DistributedTransactionContext as
DTX_CONTEXT_LOCAL_ONLY and avoids dispatching snapshots to QEs.
Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>

e24ddd70

Gpexpand use gp_add_segment to register primaries. · e2c699c8

Authored by Ashwin Agrawal on Jan 23, 2019

Currently, dbid is used in the tablespace path, so creating a
segment requires its dbid. Getting the dbid requires adding the segment
to the catalog first, but adding a segment to the catalog before creating
it causes issues. Hence, modify gpexpand so that the database does not
generate the dbid; instead, the dbid is generated upfront and passed in
when registering the segment in the catalog. This way the dbid used while
creating the segment is the same as the dbid in the catalog.
Reviewed-by: Jimmy Yih <jyih@pivotal.io>

e2c699c8

Add libzstd-devel to docker images (#6787) · bcdcb827
Authored by Lav Jain on Jan 23, 2019

bcdcb827

Explicitly pass 0 as number of dead tuples to pgstat when vacuuming AO tables. · 148d718d

Authored by Georgios Kokolatos on Jan 23, 2019

An argument can be made that hidden tuples in AO tables are similar to dead tuples
in regular tables. However, the use of this information with regard to pgstats
seems to be semantically distinct, and consequently it should not be exposed. For
example, after a VACUUM (FULL, ANALYZE) of an AO table, hidden tuples will remain
if AO compaction thresholds are not met.

It seems preferable to explicitly pass 0 instead of the already zeroed LVRelStats
member for clarity.
Reviewed-by: Daniel Gustafsson <dgustafsson@pivotal.io>
Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>

148d718d

23 Jan, 2019 15 commits

Fix bug: zombie record in gp_distribution_policy (#6768) · b949ac96

Authored by Jialun on Jan 23, 2019

When a table has been transformed to a view by creating ON SELECT
rule, the record in gp_distribution_policy should be deleted also,
for there is no such record for a view.
Also, the relstorage in pg_class should be changed to 'v'.

b949ac96

Add libzstd to CentOS dependencies README · 78038632

Authored by Dmitriy Dubson on Jan 22, 2019

Missing documentation on newly required `libzstd` dependency.
Reviewed-by: Jimmy Yih <jyih@pivotal.io>
Reviewed-by: Daniel Gustafsson <dgustafsson@pivotal.io>

78038632

gp_toolkit.gp_skew_* should support replicated table correctly · 6e862c90

Authored by Pengzhou Tang on Jan 11, 2019

The gp_toolkit.gp_skew_* views/functions are used to query how data
is skewed in a database. The idea is to run a query like
"select gp_segment_id, count(*) cnt from foo group by gp_segment_id"
and compare cnt across gp_segment_id values.

For a replicated table, the planner picks only one replica to count
tuples, so the old calculation produced the confusing result that a
replicated table is skewed, which is not expected:

gpadmin=# select * From gp_toolkit.gp_skew_idle_fractions;
 sifoid | sifnamespace | sifrelname |      siffraction
--------+--------------+------------+------------------------
  16385 | public       | rpt        | 0.66666666666666666667

What's more, gp_segment_id is ambiguous for a replicated table, so in
commit b120194a we disallowed user access to system columns, including
gp_segment_id; as a result, the gp_toolkit.gp_skew_* views now report an
error.

This commit corrects the results of the gp_toolkit.gp_skew_*
views/functions for replicated tables. The results are meaningless for
such tables, but this behavior is friendlier to users.
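
The confusing figure can be reproduced with simple arithmetic. The following is a minimal sketch, not the view's actual definition: it assumes the idle fraction is 1 - avg(cnt) / max(cnt) across segments and assumes a three-segment cluster, both chosen because they reproduce the 0.666... value shown above. The helper name idle_fraction is hypothetical.

```python
from fractions import Fraction


def idle_fraction(rows_per_segment):
    """Hypothetical helper: share of the cluster idle during a table scan.

    Assumes the metric is 1 - avg(cnt) / max(cnt) over all segments.
    """
    largest = max(rows_per_segment)
    if largest == 0:
        return Fraction(0)  # empty table: nothing is skewed
    average = Fraction(sum(rows_per_segment), len(rows_per_segment))
    return 1 - average / largest


# Replicated table on 3 segments (assumed), with only one replica counted.
one_replica_counted = idle_fraction([1000, 0, 0])

# Same table when every segment reports its full copy.
all_replicas_counted = idle_fraction([1000, 1000, 1000])

print(float(one_replica_counted), float(all_replicas_counted))
```

Under these assumptions, counting a single replica makes two of the three segments look idle, which is why a fully replicated table was reported as skewed.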

6e862c90

Remove the obsolete comment for RETURNING and put the test in a parallel... · 00daeffe
Authored by Paul Guo on Jan 21, 2019
```
Remove the obsolete comment for RETURNING and put the test in a parallel running group, following pg upstream.
```
00daeffe

Run gp_toolkit early to reduce testing time of it due to less logs. · 5bc0bcb2

Authored by Paul Guo on Jan 21, 2019

The gp_toolkit test exercises various log-related views such as gp_log_system(). If
we run the test earlier, fewer logs have been generated and thus the test runs faster.
In my test environment, the test time drops from ~22 seconds to 6.X seconds
with this patch. I also checked the whole test case; this change does not affect
test coverage.

5bc0bcb2

Declare cursor for update should handle replicated table too · 80f49a18

Authored by Pengzhou Tang on Jan 14, 2019

In the 9.0 merge, we added the rule below for FOR UPDATE:

SELECT FOR UPDATE locks the whole table; we do this in addRangeTableEntry.
The reason is that gpdb is an MPP database, so the result tuples may not be on
the same segment. For cursor statements, the reader gang cannot get an Xid to
lock the tuples, so we don't add a LockRows node for distributed tables.

This rule should also apply to replicated tables.

80f49a18

Synchronize mpp_execute option description and precedence rules in en… (#6734) · 7e0bd349

Authored by David Yozie on Jan 22, 2019

* Synchronize mpp_execute option description and precedence rules in end-user documentation

* describe the order of precedence in each command

* one any -> any one

* Feedback from Lisa

7e0bd349

Parent partition and children partition table must have same columns · d8a613b8

Authored by ZhangJackey on Jan 23, 2019

Previously, the parent partition's columns could be modified
with ALTER TABLE ONLY, so the columns of the parent partition
and its child partitions could differ.

To prohibit this, we check DROP COLUMN, ADD COLUMN, and
ALTER COLUMN TYPE statements and reject those that would modify the
columns of only the parent partition or only the child partitions.

There was a discussion on gpdb-dev@:
https://groups.google.com/a/greenplum.org/forum/#!msg/gpdb-dev/0SzL_gSbqKo/d-2RpwKrFwAJ

d8a613b8

Delete top-level Dockerfile · ae67ca0f

Authored by Bradford D. Boyle on Jan 15, 2019

It doesn't build because --disable-orca is not being passed to configure
and pivotaldata/gpdb-devel doesn't have xerces, on which Orca depends.

It seems this Dockerfile is not used. The Dockerfiles in
./src/tools/docker/*/Dockerfile are more recently maintained.
Co-authored-by: Bradford D. Boyle <bboyle@pivotal.io>
Co-authored-by: Ben Christel <bchristel@pivotal.io>

ae67ca0f

CI: Remove extra sles11 task input for RC job · ca26fb34

Authored by Kris Macoskey on Jan 22, 2019

For GPDB 6 Beta, only CentOS 6/7 need to be passing for the same commit
to be a valid release candidate.

This was originally done in commit fa63e7ab.

But that commit was missing an update to the task yaml for the
Release_Candidate job to accommodate removal of the sles11 input.
Authored-by: Kris Macoskey <kmacoskey@pivotal.io>

ca26fb34

Replace ModifyPostgresqlConfSetting with ModifyConfSetting. · a9cd61e0
Authored by Ashwin Agrawal on Jan 22, 2019

a9cd61e0

Validation for gp_dbid and gp_contentid between QD catalog and QE. · 78aed203

Authored by Ashwin Agrawal on Jan 15, 2019

Since gp_dbid and gp_contentid are stored in conf files on the QE, it's
helpful to validate them against the QD catalog table
gp_segment_configuration. This validation is performed using
FTS. The FTS message includes the gp_dbid and gp_contentid values from
the catalog. The QE validates the values while handling the FTS message
and PANICs if it finds an inconsistency.

This check is mostly targeted at development, to catch missed
handling of gp_dbid and gp_contentid values in config files, for example
in future features like pg_upgrade and gpexpand, which copy the master
directory and convert it to a segment.
Co-authored-by: Alexandra Wang <lewang@pivotal.io>
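
The check described above amounts to comparing two pairs of values. A minimal sketch of that comparison follows; it is an illustration only, not the server code. The function name, the dictionary layout of the message and conf values, and the exception (a stand-in for the PANIC) are all hypothetical.

```python
class SegmentIdentityMismatch(Exception):
    """Hypothetical stand-in for the PANIC raised by the QE."""


def validate_identity(fts_message, local_conf):
    """Compare catalog values carried by the FTS message with local conf values."""
    for name in ("gp_dbid", "gp_contentid"):
        expected = fts_message[name]  # value from gp_segment_configuration
        actual = local_conf[name]     # value read from the QE's conf files
        if expected != actual:
            raise SegmentIdentityMismatch(
                f"{name}: catalog says {expected}, conf file says {actual}"
            )
    return True


# Matching values pass silently.
validate_identity({"gp_dbid": 2, "gp_contentid": 0},
                  {"gp_dbid": 2, "gp_contentid": 0})
```

Failing loudly on the first mismatch fits the stated goal of catching missed conf-file handling during development.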

78aed203

Delete gpsetdbid.py and gp_dbid.py. · 549cd61c
Authored by Ashwin Agrawal on Jan 14, 2019
```
Co-authored-by: Alexandra Wang <lewang@pivotal.io>
```
549cd61c

Store gp_dbid and gp_contentid in conf files. · 4eaeb7bc

Authored by Ashwin Agrawal on Jan 16, 2019

Currently, gp_dbid and gp_contentid are passed as command-line
arguments when starting the QD and QEs. Since the values are stored in the
master's catalog table, the master must be started first to get the right
values. Hence, a hard-coded dbid=1 was always used for starting the master
in admin mode. This worked fine as long as dbid was not used for
anything on disk. But given that dbid is used in the tablespace path in
GPDB 6, starting the instance with the wrong dbid invites recovery-time
failures, data corruption, or data loss. dbid=1 will be wrong after
failover to the standby master, as it has dbid != 1. This commit
therefore eliminates the need to pass gp_dbid and gp_contentid on the
command line; instead, the values are stored in the instance's conf files
when the instance is created.

This also helps to avoid passing gp_dbid as argument to pg_rewind,
which needs to start target instance in single user mode to complete
recovery before performing rewind operation.

Plus, this eases development: pg_ctl start can be used directly, without
having to pass these values correctly.

 - gp_contentid is stored in postgresql.conf file.

 - gp_dbid is stored in internal.auto.conf.

 - Introduce internal.auto.conf file created during
   initdb. internal.auto.conf is included from postgresql.conf file.

 - A separate file is chosen for gp_dbid to ease handling during
   pg_rewind and pg_basebackup: the file can simply be excluded when
   copying from primary to mirror, instead of editing its contents
   after the copy. gp_contentid is the same for primary and mirror,
   hence keeping it in the postgresql.conf file makes sense. If
   gp_contentid were also stored in the new file internal.auto.conf,
   pg_basebackup would need to be passed the contentid as well in order
   to write it to this file.

 - pg_basebackup: write the gp_dbid after backup. Since gp_dbid is
   unique for primary and mirror, pg_basebackup excludes the
   internal.auto.conf file storing the gp_dbid from the copy.
   pg_basebackup explicitly (over)writes the file with the value passed
   as --target-gp-dbid. Because of this, --target-gp-dbid is now a
   mandatory argument to pg_basebackup.

 - gpexpand: update gp_dbid and gp_contentid post directory copy.

 - pg_upgrade: retain all configuration files for the
   segment. postgresql.auto.conf and internal.auto.conf are also
   internal configuration files which should be restored after the
   directory copy. A similar change is required in the gp_upgrade repo in
   restoreSegmentFiles() after copyMasterDirOverSegment().

 - Update tests to avoid passing gp_dbid and gp_contentid.
Co-authored-by: Alexandra Wang <lewang@pivotal.io>
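
The file layout described in the list above can be pictured with a toy reader. This is a simplified sketch, not PostgreSQL's configuration parser: it handles only `name = value` lines, `#` comments, and the `include 'file'` directive. The function names and the sample values (gp_contentid = 0, gp_dbid = 2) are made up for illustration.

```python
import os
import tempfile


def read_conf(path, settings=None):
    """Toy reader for 'name = value' lines and include directives."""
    settings = {} if settings is None else settings
    with open(path) as handle:
        for raw in handle:
            line = raw.split("#", 1)[0].strip()  # drop comments
            if not line:
                continue
            if line.startswith("include "):
                # Included files are resolved relative to the including file.
                target = line.split(None, 1)[1].strip("'\"")
                read_conf(os.path.join(os.path.dirname(path), target), settings)
                continue
            name, _, value = line.partition("=")
            settings[name.strip()] = value.strip()
    return settings


def demo():
    """Build a throwaway data directory and read both settings back."""
    with tempfile.TemporaryDirectory() as datadir:
        # gp_dbid lives in its own file so a copy can exclude it.
        with open(os.path.join(datadir, "internal.auto.conf"), "w") as f:
            f.write("gp_dbid = 2\n")
        # gp_contentid is shared by primary and mirror.
        with open(os.path.join(datadir, "postgresql.conf"), "w") as f:
            f.write("gp_contentid = 0\n")
            f.write("include 'internal.auto.conf'\n")
        return read_conf(os.path.join(datadir, "postgresql.conf"))


print(demo())
```

Keeping gp_dbid in a separate included file means a primary-to-mirror copy only has to skip one file and write a fresh one, rather than edit postgresql.conf in place.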

4eaeb7bc

gpinitsystem: add mirror to catalog first and then create them. · f6e85f1f

Authored by Ashwin Agrawal on Jan 16, 2019

Creating mirrors requires running pg_basebackup. To handle tablespaces
correctly, pg_basebackup needs the dbid as an argument, because dbid is
used in the tablespace path.

For the dbid in the master catalog to match the one the mirror uses for
tablespaces, the mirror must be added to the catalog first. Then get the
dbid and pass it to pg_basebackup when creating the mirror.
Co-authored-by: Alexandra Wang <lewang@pivotal.io>
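
The ordering above (register first, then back up with the assigned dbid) can be sketched as follows. This is not gpinitsystem's code: the in-memory catalog, the register_mirror helper, and the max-plus-one dbid assignment are hypothetical. Only the --target-gp-dbid option comes from the commit messages on this page; -h, -p, and -D are standard pg_basebackup options.

```python
def register_mirror(catalog, content_id, host, port):
    """Hypothetical stand-in for adding the mirror to the master catalog.

    Assumes the next dbid is max(existing dbid) + 1.
    """
    dbid = max(row["dbid"] for row in catalog) + 1
    catalog.append({"dbid": dbid, "content": content_id, "role": "m",
                    "host": host, "port": port})
    return dbid


def basebackup_command(primary_host, primary_port, mirror_datadir, dbid):
    """Build the pg_basebackup argument list using the catalog's dbid."""
    return ["pg_basebackup",
            "-h", primary_host,
            "-p", str(primary_port),
            "-D", mirror_datadir,
            "--target-gp-dbid", str(dbid)]


catalog = [
    {"dbid": 1, "content": -1, "role": "p", "host": "mdw", "port": 5432},
    {"dbid": 2, "content": 0, "role": "p", "host": "sdw1", "port": 6000},
]

# Step 1: add the mirror to the catalog and learn its dbid.
mirror_dbid = register_mirror(catalog, 0, "sdw2", 7000)

# Step 2: create the mirror with that same dbid.
command = basebackup_command("sdw1", 6000, "/data/mirror/gpseg0", mirror_dbid)
print(" ".join(command))
```

Because the dbid is read back from the catalog before the backup runs, the tablespace path on the mirror and the catalog entry cannot disagree.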

f6e85f1f

22 Jan, 2019 3 commits

Remove the case of external partition · 9cef1c91

Authored by Adam Lee on Jan 17, 2019

pg_upgrade doesn't like it, please revert this commit once the restriction is
removed.

```
Checking for external tables used in partitioning           fatal

| Your installation contains partitioned tables with external
| tables as partitions.  These partitions need to be removed
| from the partition hierarchy before the upgrade.  A list of
| external partitions to remove is in the file:
| 	external_partitions.txt

Failure, exiting
```

9cef1c91

pg_dump: dump the namespace while processing external partitions · bbb9f9dd

Authored by Adam Lee on Jan 16, 2019

We forgot to dump the namespace while processing external partitions. This
became a problem once upstream pg_dump decided not to dump the
search_path. This commit fixes it.

bbb9f9dd

Fix gppkg error when master and standby master are in the same node · 1f33759b

Authored by Haozhou Wang on Jan 22, 2019

If the master and standby master are on the same
node, the gppkg utility reports an error when uninstalling a gppkg.
This is because gppkg assumes the master and standby master
are on different nodes, which may not be true in a test environment.

This patch fixes the issue: when the master and standby master are on
the same node, we skip installing/uninstalling the gppkg on the standby
master node.
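
The skip condition can be written as a single predicate. A minimal sketch, assuming hosts are compared by plain hostname string; the function name is hypothetical, and real code may need to normalize hostnames or addresses before comparing.

```python
def standby_needs_own_run(master_host, standby_host):
    """Return True only when the standby sits on a different node than the master."""
    if standby_host is None:
        return False  # no standby configured
    return standby_host != master_host


# Same node (typical test environment): skip the standby.
print(standby_needs_own_run("mdw", "mdw"))

# Separate nodes: the standby gets its own install/uninstall.
print(standby_needs_own_run("mdw", "smdw"))
```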

1f33759b

21 Jan, 2019 2 commits

Remove GPDB_93_MERGE_FIXME (#6699) · d286b105

Authored by Shaoqi Bai on Jan 21, 2019

This code was added to handle the following case: FTS sends a promote message, and the mirror creates the PROMOTE file and is signalled to promote. While promotion is still in progress, FTS sends promote again, which creates the PROMOTE file again. The PROMOTE file then exists on the promoted mirror, which is now acting as primary.
If a base backup was taken from this primary to create a mirror, it included the PROMOTE file, and the new mirror was auto-promoted on creation, which is incorrect. Hence FTS was made to detect and delete the PROMOTE file if it exists, and pg_basebackup was made to exclude the PROMOTE file from the copy.

An upstream commit now always deletes the PROMOTE file on postmaster start. That covers the case where the PROMOTE file is created after mirror promotion and copied over by pg_basebackup: there is no risk of auto-promotion on mirror startup. So we can safely remove this code now.
Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
Reviewed-by: Paul Guo <pguo@pivotal.io>

d286b105

Use the right rel for largest_child_relation(). · 8712da1e

Authored by Richard Guo on Jan 21, 2019

Function largest_child_relation() is used to find the largest child
relation for an inherited/partitioned relation, recursively. Previously
we passed a wrong rel as its param.

This patch finds in root->simple_rel_array the right rel for
largest_child_relation(). Also it replaces several rt_fetch with a
search in root->simple_rte_array.

This patch fixes #6599.
Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>

8712da1e

19 Jan, 2019 8 commits

docs - reorg pxf content, add multi-server, objstore content (#6736) · f601572d

Authored by Lisa Owen on Jan 18, 2019

* docs - reorg pxf content, add multi-server, objstore content

* misc edits, SERVER not optional

* add server, remove creds from examples

* address comments from alexd

* most edits requested by david

* add Minio to table column name

* edits from review with pxf team (start)

* clear text credentials, reorg objstore cfg page

* remove steps with XXX placeholder

* add MapR to supported hadoop distro list

* more objstore config updates

* address objstore comments from alex

* one parquet data type mapping table, misc edits

* misc edits from david

* add mapr hadoop config step, misc edits

* fix formatting

* clarify copying libs for MapR

* fix pxf links on CREATE EXTERNAL TABLE page

* misc edits

* mapr paths may differ based on version in use

* misc edits, use full topic name

* update OSS book for pxf subnav restructure

f601572d

Fix race-condition of pg_xlog delete during pg_basebackup · ab2ec2b4

Authored by David Kimura on Jan 17, 2019

This commit addresses a race condition in which the pg_xlog directory
could go missing during xlog streaming. The race exists only with
--forceoverwrite and --xlog stream. In stream mode pg_basebackup forks one
process to populate the pg_xlog directory with new transaction files and
another process to receive and untar the base directory contents. Force
overwrite removes an existing pg_xlog directory before copying contents
from the tar file. It is problematic if the untar process deletes the xlog
directory while the stream process tries to write to it.

To avoid this situation in forceoverwrite mode, the deletion
of pg_xlog now happens before the stream and untar processes start. This
lets the untar process skip the deletion of pg_xlog.
Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
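
The fix is an ordering change: remove pg_xlog once, up front, and only then start the concurrent workers. The sketch below illustrates that ordering with threads and a temporary directory. It is not pg_basebackup's code (which forks processes); every function name and file name in it is hypothetical.

```python
import os
import shutil
import tempfile
import threading


def prepare_target(datadir, force_overwrite):
    """Remove a pre-existing pg_xlog once, before any worker starts."""
    xlog_dir = os.path.join(datadir, "pg_xlog")
    if force_overwrite and os.path.isdir(xlog_dir):
        shutil.rmtree(xlog_dir)
    os.makedirs(xlog_dir, exist_ok=True)
    return xlog_dir


def stream_worker(xlog_dir):
    """Stand-in for the process that writes streamed transaction files."""
    with open(os.path.join(xlog_dir, "000000010000000000000001"), "w") as f:
        f.write("wal")


def untar_worker(datadir):
    """Stand-in for the untar process; it no longer deletes pg_xlog."""
    with open(os.path.join(datadir, "PG_VERSION"), "w") as f:
        f.write("placeholder\n")


def run_backup(datadir, force_overwrite=True):
    xlog_dir = prepare_target(datadir, force_overwrite)  # deletion happens here only
    workers = [
        threading.Thread(target=stream_worker, args=(xlog_dir,)),
        threading.Thread(target=untar_worker, args=(datadir,)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return sorted(os.listdir(xlog_dir))


def demo():
    """Start from a data directory that already holds a stale pg_xlog."""
    with tempfile.TemporaryDirectory() as datadir:
        stale = os.path.join(datadir, "pg_xlog")
        os.makedirs(stale)
        open(os.path.join(stale, "stale_segment"), "w").close()
        return run_backup(datadir)


print(demo())
```

Since nothing removes pg_xlog after the workers start, the streamed file survives regardless of how the two workers interleave.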

ab2ec2b4

Refactor reloptions construction · 677a08ef

Authored by Daniel Gustafsson on Jan 18, 2019

Use the CStringGetTextDatum() construct when generating the reloptions
array in order to improve readability. This patch started out by trying
to remove duplication in calculating the string length but turned into
a refactoring of the Datum creation instead.
Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>

677a08ef

Use VERBOSE setting for HLL ANALYZE logging · e34f741f

Authored by Daniel Gustafsson on Jan 18, 2019

Running ANALYZE with the HLL computation produces a lot of LOG messages
that are geared more towards troubleshooting than general-purpose log
files. Fold these under ANALYZE VERBOSE to avoid cluttering up logfiles
on production systems unless explicitly asked for.
Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>

e34f741f

Add QuickLZ compression wrappers to gpcontrib (#6718) · fd73773b

Authored by Bradford Boyle on Jan 18, 2019

- Added with-quicklz configure flag
- Added quicklz gpcontrib directory with C wrapper functions and SQL installation file
- Added simple quicklz functional tests
- Added #undef HAVE_LIBQUICKLZ to pg_config.h.win32.
  This is to parallel the recent change in pg_config.h.in that adds
  quicklz. pg_config.h.win32 should be autogenerated, but isn't in
  practice.
Co-authored-by: Jimmy Yih <jyih@pivotal.io>
Co-authored-by: Ben Christel <bchristel@pivotal.io>
Co-authored-by: David Sharp <dsharp@pivotal.io>

fd73773b

CI: Only centos 6/7 release candidates · fa63e7ab

Authored by Venkatesh Raghavan on Jan 16, 2019

For GPDB 6 Beta, CentOS 6/7 need to be passing for the same commit
to be a valid release candidate.
Co-authored-by: Kris Macoskey <kmacoskey@pivotal.io>

fa63e7ab

Bump ORCA version to v3.23.0 · a4e95e1f
Authored by Sambitesh Dash on Jan 17, 2019

a4e95e1f

pg_isready: handle PQPING_MIRROR_READY and reassign exported constant · 46c83a66

Authored by Jacob Champion on Jan 07, 2019

The GPDB-specific constant PQPING_MIRROR_READY, which indicates that a
mirror is ready for replication, was not handled in pg_isready.

Additionally, the value we selected for PQPING_MIRROR_READY might at one
point in the future conflict with upstream libpq, which would be a pain
to untangle. Try to avoid that situation by increasing the value.
Co-authored-by: Shoaib Lari <slari@pivotal.io>

46c83a66

18 Jan, 2019 5 commits

Ensure that replication slots are still active after rebalance. · cee0a6f5
Authored by Adam Berlin on Jan 17, 2019

cee0a6f5

Remove flaky pg_basebackup scenario · 497d1112

Authored by David Kimura on Jan 16, 2019

There was a race condition in which the fault could be
unexpectedly triggered by a WAL sender independent of the
pg_basebackup being run. We could make it more deterministic by
incrementing the wait-for-triggered count, but the test as a whole didn't
seem to add much value.
Co-authored-by: Ekta Khanna <ekhanna@pivotal.io>

497d1112

pg_rewind: Update tests to create separate datadirs for each test (#6689) · ac3cf6e0

Authored by David Kimura on Jan 17, 2019

Prior to this commit, the test recreated the tmp_check_* directory for
each test run. This led to losing the datadir of a
failing test if it wasn't the last one. This commit creates a new
directory specific to each test and cleans up the artifacts of previous
passing tests.
Co-authored-by: David Kimura <dkimura@pivotal.io>
Co-authored-by: Ekta Khanna <ekhanna@pivotal.io>

ac3cf6e0

Bump ORCA version to 3.22.0 · 6cb95608
Authored by Abhijit Subramanya on Jan 15, 2019
```
Co-authored-by: Chris Hajas <chajas@pivotal.io>
```
6cb95608

Update the default value of optimizer_penalize_broadcast_threshold. · 873b657b

Authored by Abhijit Subramanya on Jan 15, 2019

This commit sets the default value of the guc optimizer_penalize_broadcast_threshold
to 100000. We have seen a lot of cases where a plan with broadcast was chosen
due to underestimation of cardinality. In such cases a Redistribute motion
would have been better. So this commit will penalize broadcast when the number
of rows is greater than 100000 so that Redistribute is favored more in this
case. We have tested the change on the perf pipeline and do not see any
regression.
Co-authored-by: Chris Hajas <chajas@pivotal.io>
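
The effect of the threshold can be illustrated with a toy cost comparison. This is not ORCA's cost model: the cost formulas, the penalty factor, and the function name are invented for illustration. Only the default threshold of 100000 comes from the commit message above.

```python
DEFAULT_THRESHOLD = 100000  # default from the commit message


def choose_motion(inner_rows, outer_rows, segments,
                  threshold=DEFAULT_THRESHOLD, penalty=1000.0):
    """Toy choice between broadcasting the inner side and redistributing both.

    Assumed costs: broadcast sends every inner row to every segment;
    redistribute moves each row of both sides once.
    """
    broadcast_cost = float(inner_rows * segments)
    redistribute_cost = float(inner_rows + outer_rows)
    if inner_rows > threshold:
        broadcast_cost *= penalty  # penalize broadcast above the threshold
    return "Broadcast" if broadcast_cost < redistribute_cost else "Redistribute"


# Small inner side: broadcast still wins.
print(choose_motion(50_000, 10_000_000, segments=8))

# Inner side above the threshold: the penalty tips the choice.
print(choose_motion(200_000, 10_000_000, segments=8))
```

In this toy model the penalty only matters once the row estimate crosses the threshold, so plans with a genuinely small broadcast side are unaffected.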

873b657b