1. 05 Feb 2019, 12 commits
  2. 03 Feb 2019, 1 commit
    • H
      Change formatting of AO checksums in error messages. · 2a4cc4f0
      Heikki Linnakangas committed
      The gpdiff rule in the 'ao_checksum_corruption' test assumed that the
      checksums were 8 characters wide. That was not always true, however,
      because the checksums were not padded with zeros. Padding them with zeros
      seems nicer, so change the error messages to do that.
      
      This should fix these buildfarm failures we've been seeing recently:
      
      -ERROR:  header checksum does not match, expected 0xXXXXXXXX and found 0xXXXXXXXX
      +ERROR:  header checksum does not match, expected 0x21B733 and found 0x44C333F8
      
      I'm not sure why we started seeing this now, and I didn't see those errors
      on my laptop. But it's pure chance whether a checksum happens to begin
      with a 0 or not, so it's not that surprising that some completely
      unrelated change changed the physical contents of the table. This commit
      should make the failure go away.
      Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
      2a4cc4f0
  3. 02 Feb 2019, 11 commits
  4. 01 Feb 2019, 16 commits
    • H
      Fix gpcheckcat test case, distkey cannot be NULL anymore. · 6cacc636
      Heikki Linnakangas committed
      A randomly distributed table is now represented by an empty int2vector,
      as sketched below.
      6cacc636
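      A minimal illustration of the new representation. This is a sketch only:
      the catalog column names (localoid, distkey) follow the rename described
      in a later commit in this list, and the table name is made up.

          CREATE TABLE rand_dist_t (a int) DISTRIBUTED RANDOMLY;

          -- distkey is now an empty int2vector rather than NULL:
          SELECT distkey FROM gp_distribution_policy
           WHERE localoid = 'rand_dist_t'::regclass;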
    • D
      Error out on multiple writers in CTE · bfcb7882
      Daniel Gustafsson committed
      While Greenplum can plan a CTE query with multiple writable expressions,
      it cannot execute it, because execution is limited to a single writer
      gang. Until we can support multiple writer gangs, error out with a
      graceful message rather than failing during execution with a more
      cryptic internal error. The kind of query affected is sketched below.
      
      Ideally this will be reverted in GPDB 7.X, but right now it is much too
      close to release to tackle this.
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
      bfcb7882
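      A hypothetical example of a query with multiple writable CTEs that now
      fails up front with a clear error (table names are made up for
      illustration):

          WITH ins_a AS (INSERT INTO t_a VALUES (1) RETURNING *),
               ins_b AS (INSERT INTO t_b VALUES (2) RETURNING *)
          SELECT * FROM ins_a, ins_b;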
    • D
      Fix leftover merge conflict in xmlmap test · cfad092b
      Daniel Gustafsson committed
      The 9.4.20 merge mistakenly left a merge conflict in the alternative
      output for the xmlmap test. Fix verified against a backend without
      XML support.
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
      cfad092b
    • H
      Use normal hash operator classes for data distribution. · 242783ae
      Heikki Linnakangas committed
      Replace the use of the built-in hashing support for built-in datatypes, in
      cdbhash.c, with the normal PostgreSQL hash functions. Now is a good time
      to do this, since we've already made the change to use jump consistent
      hashing in GPDB 6, so we'll need to deal with the upgrade problems
      associated with changing the hash functions, anyway.
      
      It is no longer enough to track which columns/expressions are used to
      distribute data. You also need to know the hash function used. For that,
      a new field is added to gp_distribution_policy, to record the hash
      operator class used for each distribution key column. In the planner,
      a new opfamily field is added to DistributionKey, to track that throughout
      the planning.
      
      Normally, if you do "CREATE TABLE ... DISTRIBUTED BY (column)", the
      default hash operator class for the datatype is used. But this patch
      extends the syntax so that you can specify the operator class explicitly,
      like "... DISTRIBUTED BY (column opclass)"; see the sketch after this
      message. This is similar to how an operator class can be specified for
      each column in CREATE INDEX.
      
      To support upgrade, the old hash functions have been converted to special
      (non-default) operator classes, named cdbhash_*_ops. For example, if you
      want to use the old hash function for an integer column, you could do
      "DISTRIBUTED BY (intcol cdbhash_int4_ops)". The old hard-coded whitelist
      of operators that have "compatible" cdbhash functions has been replaced
      by putting the compatible hash opclasses in the same operator family. For
      example, the legacy integer operator classes cdbhash_int2_ops,
      cdbhash_int4_ops and cdbhash_int8_ops are all part of the
      cdbhash_integer_ops operator family.
      
      This removes the pg_database.hashmethod field. The hash method is now
      tracked on a per-table and per-column basis, using the opclasses, so it's
      not needed anymore.
      
      To help with upgrade from GPDB 5, this introduces a new GUC called
      'gp_use_legacy_hashops'. If it's set, CREATE TABLE uses the legacy hash
      opclasses instead of the default hash opclasses when the opclass is not
      specified explicitly (also sketched after this message). pg_upgrade will
      set the new GUC, to force the use of legacy hashops, when restoring the
      schema dump. It will also set the GUC on all upgraded databases, as a
      per-database option, so any new tables created after upgrade will also
      use the legacy opclasses. It seems better to be consistent after upgrade,
      so that, for example, co-location between old and new tables works. The
      idea is that some time after the upgrade, the admin can reorganize all
      tables to use the default opclasses instead. At that point, the admin
      should also clear the GUC on the converted databases. (Or rather, an
      automated tool, which hasn't been written yet, should do that.)
      
      ORCA doesn't know about hash operator classes, or the possibility that we
      might need to use a different hash function for two columns with the same
      datatype. Therefore, it cannot produce correct plans for queries that mix
      different distribution hash opclasses for the same datatype in the same
      query. There are checks in the Query->DXL translation to detect that
      case and fall back to the planner. As long as you stick to the default
      opclasses in all tables, we let ORCA create the plan without any regard
      to them, and use the default opclasses when translating the DXL plan to a
      Plan tree. We also allow the case where all tables in the query use the
      "legacy" opclasses, so that ORCA works after pg_upgrade. But a mix of the
      two, or using any non-default opclasses, forces ORCA to fall back.
      
      One curiosity with this is the "int2vector" and "aclitem" datatypes. They
      have a hash opclass, but no b-tree operators. GPDB 4 used to allow them
      as DISTRIBUTED BY columns, but we forbade that in GPDB 5, in commit
      56e7c16b. Now they are allowed again, so you can specify an int2vector
      or aclitem column in DISTRIBUTED BY, but it's still pretty useless,
      because the planner still can't form EquivalenceClasses on them, so it
      will treat them as "strewn" distribution and won't co-locate joins.
      
      The abstime, reltime and tinterval datatypes don't have default hash
      opclasses. They are being removed completely in PostgreSQL v12, and users
      shouldn't be using them in the first place, so instead of adding hash
      opclasses for them now, we accept that they can't be used as distribution
      key columns anymore. Add a check to pg_upgrade, to refuse the upgrade if
      they are used as distribution keys in the old cluster. Do the same for
      the 'money' datatype as well, although that's not being removed upstream.
      
      The legacy hashing code for anyarray in GPDB 5 was actually broken. It
      could produce a different hash value for two arrays that are considered
      equal, according to the = operator, if there were differences in e.g.
      whether the null bitmap was stored or not. Add a check to pg_upgrade, to
      reject the upgrade if array types were used as distribution keys. The
      upstream hash opclass for anyarray works, though, so it is OK to use
      arrays as distribution keys in new tables. We just don't support binary
      upgrading them from GPDB 5. (See github issue
      https://github.com/greenplum-db/gpdb/issues/5467). The legacy hashing of
      'anyrange' had the same problem, but that was new in GPDB 6, so we don't
      need a pg_upgrade check for that.
      
      This also tightens the checks in ALTER TABLE ALTER COLUMN and CREATE
      UNIQUE INDEX, so that you can no longer create a situation where a
      non-hashable column becomes the distribution key. (Fixes github issue
      https://github.com/greenplum-db/gpdb/issues/6317)
      
      Discussion: https://groups.google.com/a/greenplum.org/forum/#!topic/gpdb-dev/4fZVeOpXllQ
      Co-authored-by: Mel Kiyama <mkiyama@pivotal.io>
      Co-authored-by: Abhijit Subramanya <asubramanya@pivotal.io>
      Co-authored-by: Pengzhou Tang <ptang@pivotal.io>
      Co-authored-by: Chris Hajas <chajas@pivotal.io>
      Reviewed-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
      Reviewed-by: Ning Yu <nyu@pivotal.io>
      Reviewed-by: Simon Gao <sgao@pivotal.io>
      Reviewed-by: Jesse Zhang <jzhang@pivotal.io>
      Reviewed-by: Zhenghua Lyu <zlv@pivotal.io>
      Reviewed-by: Melanie Plageman <mplageman@pivotal.io>
      Reviewed-by: Yandong Yao <yyao@pivotal.io>
      242783ae
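      A short sketch of the syntax and GUC described above. The opclass and GUC
      names come from this message; the table, column and database names are
      made up for illustration:

          -- Explicit hash operator class per distribution key column:
          CREATE TABLE t_legacy (intcol int4)
              DISTRIBUTED BY (intcol cdbhash_int4_ops);

          -- With the GUC set, CREATE TABLE picks the legacy opclasses when no
          -- opclass is given explicitly (this is what pg_upgrade relies on):
          SET gp_use_legacy_hashops = on;
          CREATE TABLE t_after_upgrade (intcol int4) DISTRIBUTED BY (intcol);

          -- Once all tables have been reorganized onto the default opclasses,
          -- the admin can clear the per-database setting left by pg_upgrade:
          ALTER DATABASE upgraded_db RESET gp_use_legacy_hashops;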
    • H
      Rename gp_distribution_policy.attrnums to distkey, and make it int2vector. · 69ec6926
      Heikki Linnakangas committed
      This is in preparation for adding operator classes as a new column
      (distclass) to gp_distribution_policy. This naming is consistent with
      pg_index.indkey/indclass. Change the datatype to int2vector, also for
      consistency with pg_index, and some other catalogs that store attribute
      numbers, and because int2vector is slightly more convenient to work with
      in the backend. Move the column to the end of the table, so that all the
      variable-length and nullable columns are at the end, which makes it
      possible to reference the other columns directly in Form_gp_policy.
      
      Add a backend function, pg_get_table_distributedby(), to deparse the
      DISTRIBUTED BY definition of a table into a string, similar to the
      pg_get_indexdef_columns(), pg_get_functiondef() etc. functions that we
      already have (usage sketched below). Use the new function in psql and
      pg_dump, when connected to a GPDB6 server.
      Co-authored-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
      Co-authored-by: Peifeng Qiu <pqiu@pivotal.io>
      Co-authored-by: Adam Lee <ali@pivotal.io>
      69ec6926
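      Hypothetical usage of the new deparsing function. The message does not
      spell out the signature; by analogy with the other pg_get_*def functions
      it presumably takes the table's OID, and the output shown is indicative
      only:

          SELECT pg_get_table_distributedby('my_table'::regclass);
          -- e.g. DISTRIBUTED BY (id)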
    • P
      Make GDD tests deterministic · a0b9fde8
      Pengzhou Tang committed
      The GDD test framework now acquires the desired lock by updating the nth
      tuple on a segment instead of a row with a specific value, so even if the
      hash algorithm changes, the tests are not affected (roughly as sketched
      below). This works fine except when a segment does not have enough tuples
      to provide the nth tuple. The fix is simple: enlarge the test tables from
      20 rows to 100 rows.
      
      Authored-by: Ning Yu <nyu@pivotal.io>
      a0b9fde8
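      A rough sketch of the idea only; the actual isolation-test helpers
      differ, and the table and column names here are made up. Lock the 2nd
      tuple on segment 0, whatever its value, so the choice of row no longer
      depends on the distribution hash:

          UPDATE gdd_t SET val = val
           WHERE gp_segment_id = 0
             AND ctid = (SELECT ctid FROM gdd_t WHERE gp_segment_id = 0
                         ORDER BY ctid OFFSET 1 LIMIT 1);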
    • Z
      Update serially when GDD is disabled · 29e7f102
      Zhang Shujie committed
      If the Global Deadlock Detector is enabled, the table lock may be
      downgraded to RowExclusiveLock, which can lead to two problems:
      
      1. When distribution keys are updated concurrently, the SplitUpdate
         node may generate extra tuples in the table.
      2. When updating concurrently, the EvalPlanQual function may be
         triggered, and it cannot execute correctly when the SubPlan has a
         Motion node.
      
      Now we add a GUC for GDD: if it is disabled, these UPDATE statements are
      executed serially; if it is enabled, we raise an error when updating
      concurrently. The two behaviors are sketched below.
      
      Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
      29e7f102
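      A schematic two-session illustration of the behavior described above.
      The table and column names are made up, and the exact error text is not
      part of this message:

          -- session 1:
          BEGIN;
          UPDATE gdd_demo SET distkey_col = distkey_col + 1 WHERE id = 1;

          -- session 2, before session 1 commits:
          UPDATE gdd_demo SET distkey_col = distkey_col + 1 WHERE id = 2;
          -- GDD disabled: waits for session 1, so the updates run serially
          -- GDD enabled:  raises an error rather than risking the SplitUpdate
          --               and EvalPlanQual problems listed above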
    • P
      Remove ext submodule folder · 1e43b584
      Peifeng Qiu committed
      We removed the submodule address but didn't remove the actual
      folder. Cloning the submodule will fail due to the missing URL. Remove
      the folder to avoid that.
      1e43b584
    • J
      Fix OOM after cluster reset when gp_vmem_protect_limit > 16GB (#6862) · 9e0e7c27
      Jialun committed
      The function VmemTracker_ShmemInit initializes chunkSizeInBits, the unit
      of chunk size, according to gp_vmem_protect_limit. The base value of
      chunkSizeInBits is 20 (1MB). If gp_vmem_protect_limit is larger than
      16GB, the value is increased to adapt to the large-memory environment.
      This value should not change after initialization, but if the function
      is called multiple times, chunkSizeInBits accumulates.
      
      Consider the scenario where the QD crashes: the postmaster reaps the QD
      process and resets shared memory, which causes VmemTracker_ShmemInit to
      be called again. So chunkSizeInBits increases after every crash when
      gp_vmem_protect_limit is larger than 16GB. Eventually the chunk size
      becomes so large that a new reservation amounts to zero chunks or a very
      small number of chunks. The memory limit mechanism then has no effect,
      and we run out of memory once new memory can no longer actually be
      allocated.
      
      So we set chunkSizeInBits to BITS_IN_MB in VmemTracker_ShmemInit every
      time, instead of merely asserting it.
      
      Why is there no new test case in this commit?
      - We just change an Assert to an assignment; there are no logic changes.
      - It is very difficult to add a crash case in the current isolation test
        framework, because the connection is lost when the process crashes.
      
      We verified the case manually in our dev environment by setting
      gp_vmem_protect_limit to 65535 and killing the QD process with kill -9.
      chunkSizeInBits then increased every time, and eventually we got the
      error message "ERROR:  Canceling query because of high VMEM usage."
      9e0e7c27
    • P
      Remove unused gpfdist dependency submodule and WIN32 Readme (#6861) · 7281a162
      Peifeng Qiu committed
      We no longer use the ext submodule for gpfdist dependencies. Remove
      it to avoid confusion. The WIN32 build process has changed to a native
      build. We will add a README for it when it's ready.
      7281a162
    • P
      Remove a FIXME related to recoveryTargetIsLatest. (#6863) · 406fa028
      Paul Guo committed
      The recoveryTargetIsLatest setting code had somehow gone missing and was
      later added back in commit 55808e18. Remove the FIXME comment.
      Reviewed-by: Jimmy Yih <jyih@pivotal.io>
      Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
      406fa028
    • A
      Set needToPromoteCatalog before updating ControlFile->state. · 434bd5b9
      Ashwin Agrawal committed
      Commit 6d80ce31 moved the control file state update earlier, which
      caused a CI failure in the gpactivatestandby test: the catalog update
      was skipped because needToPromoteCatalog remained false. Hence, set
      needToPromoteCatalog before updating ControlFile->state.
      434bd5b9
    • A
      Avoid FinishPreparedTransaction() calling readRecoveryCommandFile() · cb256d04
      Ashwin Agrawal committed
      It is not clear why FinishPreparedTransaction() was calling
      readRecoveryCommandFile(); it seems to serve no purpose. The call has
      existed for ages and I was unable to find the rationale for it,
      certainly not with the current code. Reading and parsing the file was an
      unnecessary performance hit on every commit.
      cb256d04
    • A
      Align transaction log manager (xlog.c and xlog.h) to upstream. · 6d80ce31
      Ashwin Agrawal committed
      A lot of differences compared to upstream had collected over the years,
      along with some confusing or redundant code, so it is better to make
      these files match upstream.
      6d80ce31
    • A
      concourse: Remove unused dev_generate_installer.yml · b55a0b71
      Amil Khanzada committed
      - We're not sure when this file was abandoned, but it doesn't seem to
        be used anywhere.
      - Also remove the task file and bash scripts that were only referenced
        by this pipeline.
      Co-authored-by: Bradford D. Boyle <bboyle@pivotal.io>
      Co-authored-by: Amil Khanzada <akhanzada@pivotal.io>
      b55a0b71
    • G
      Remove disabled code in set_plan_references_input_asserts() · e8238cc1
      Georgios Kokolatos committed
      This commit removes a GPDB_93_MERGE_FIXME introduced while merging
      46c508fb from upstream. The intention of the upstream commit is
      to keep planner params separated so that they don't get reused
      incorrectly. In doing so, it removed the need for a global list of
      PlannerParamItems.
      
      The assertion removed in this commit verified that each Param in the
      tree was included in a global list of PlannerParamItems, and that the
      datatype of each Param matched the one in the global list.
      
      At the point of the assertion, we simply don't have the necessary
      information to verify this properly. An argument could be made for
      re-introducing such a global list of PlannerParamItems. However, the
      assertion would not verify that a parameter is anchored in the
      right place, and it would introduce additional code to maintain.
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
      e8238cc1