Commits · b4e2e3e2280ca81d1a3a1b20d9659e229238ed18 · Greenplum / Gpdb

14 Mar, 2019 2 commits

Rename legacy planner to Postgres planner · b4e2e3e2

Committed by Daniel Gustafsson on Mar 14, 2019

As we merge with upstream and by that keep refining the Postgres
planner, legacy planner is no longer a suitable name. This changes
all variations of the spelling (legacy planner, legacy optimizer,
legacy query optimizer etc) to say "Postgres" rather than "legacy".

Reviewed-by: Venkatesh Raghavan <vraghavan@pivotal.io>
Reviewed-by: David Yozie <dyozie@pivotal.io>
Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>

Correcting limitation re: updating distribution keys (#7163) · 4f625f65
Committed by David Yozie on Mar 13, 2019

13 Mar, 2019 1 commit

- Indicate that GPORCA doesn't fully support per-column collation (#7153) · e4571807
  Committed by David Yozie on Mar 12, 2019

12 Mar, 2019 3 commits

- docs - GDD - add information about specific type of deadlocks. (#7014) · 609fa947
  Committed by Mel Kiyama on Mar 11, 2019
```
* docs - GDD - add information about specific type of deadlocks.

* docs - GDD - update deadlock information based on review comments.

* typo fix

* Small edit for clarity

* Removing duplicate info
```
- docs - register pgcrypto extension, use 9.4 pg xrefs (#7128) · 59822c2e
  Committed by Lisa Owen on Mar 11, 2019
- Changing version in path from /600 to /6-0 for consistency with other docs and... · 11529a7e
  Committed by David Yozie on Mar 11, 2019
```
Changing version in path from /600 to /6-0 for consistency with other docs and to recognize that only the minor version digit is significant for the doc build (#7141)
```

11 Mar, 2019 2 commits

Rename recursive CTE guc to remove _prototype · f6a1a60e

Committed by Daniel Gustafsson on Mar 11, 2019

The GUC which enables recursive CTEs is in the currently released
version called gp_recursive_cte_prototype, but in order to reflect
the current state of the code it's now renamed to gp_recursive_cte.
By default the GUC is still off, but that might change before we
ship the next release.

The previous GUC name is still supported, but marked as deprecated,
in order to make upgrades easier.

Reviewed-by: Ivan Novick <inovick@pivotal.io>
Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>

Retire the reshuffle method for table data expansion (#7091) · 1c262c6e

Committed by Ning Yu on Mar 11, 2019

This method was introduced to improve data redistribution performance
during gpexpand phase 2, but benchmark results show that the gain falls
short of expectations. For example, when expanding a table from 7 to 8
segments the reshuffle method is only 30% faster than the traditional
CTAS method, and when expanding from 4 to 8 segments it is even 10%
slower than CTAS. When the table has indexes, reshuffle performance can
be worse, and an extra VACUUM is needed to actually free the disk space.
According to our experiments, the bottleneck of the reshuffle method is
the tuple deletion operation, which is much slower than the insertion
operation used by CTAS.

The reshuffle method does have some benefits: it requires less extra
disk space and less network bandwidth (similar to the CTAS method with
the new JCH reduce method, but less than CTAS + MOD). It can also be
faster in some cases, but because we cannot automatically determine
when it is faster, it is hard to benefit from it in practice.
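
The bandwidth difference between the JCH and MOD reduce methods can be illustrated with a small simulation. The sketch below is a Python rendering of the published jump consistent hash algorithm (Lamping and Veach), not Greenplum's cdbhash code. The helper names `modulo_reduce` and `moved_fraction` and the random 64-bit keys are assumptions made for illustration. It counts how many keys change segment when a cluster grows from 7 to 8 segments.

```python
import random


def jump_consistent_hash(key: int, num_buckets: int) -> int:
    """Jump consistent hash: map a 64-bit key to a bucket in [0, num_buckets)."""
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step, as in the published algorithm.
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b


def modulo_reduce(key: int, num_buckets: int) -> int:
    """Plain modulo reduction (the "MOD" style of mapping a hash to a segment)."""
    return key % num_buckets


def moved_fraction(reduce, old_n: int, new_n: int, keys) -> float:
    """Fraction of keys whose segment changes when growing old_n -> new_n."""
    moved = sum(1 for k in keys if reduce(k, old_n) != reduce(k, new_n))
    return moved / len(keys)


# Hypothetical workload: 20,000 random 64-bit hash values (fixed seed).
_rng = random.Random(0)
KEYS = [_rng.getrandbits(64) for _ in range(20000)]

if __name__ == "__main__":
    print("MOD 7->8:", moved_fraction(modulo_reduce, 7, 8, KEYS))
    print("JCH 7->8:", moved_fraction(jump_consistent_hash, 7, 8, KEYS))
```

With modulo reduction most keys land on a different segment, whereas jump consistent hashing moves roughly one key in eight, and only onto the newly added segment.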

On the other hand, the reshuffle method is less tested and may have
bugs in corner cases, so it is not production ready yet.

We therefore decided to retire it entirely for now. We might add it
back in the future if we can get rid of the slow deletion or find a
reliable way to choose automatically between the reshuffle and CTAS
methods.

Discussion: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/8xknWag-SkI/5OsIhZWdDgAJ
Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>

10 Mar, 2019 1 commit

docs: Add spacing for doc xrefs · ce0cdc1d

Committed by Daniel Gustafsson on Mar 09, 2019

Make sure there is whitespace before xref tags when used in normal
text to avoid links squashed together with text.

Reviewed-by: Lisa Owen <lowen@pivotal.io>

09 Mar, 2019 2 commits

docs - CTE available with INSERT, UPDATE, DELETE (#7025) · 45d72f77

Committed by Mel Kiyama on Mar 08, 2019

* docs - CTE available with INSERT, UPDATE, DELETE

-updated GUC
-updated Admin Guide topic WITH Queries (Common Table Expressions)

updates to SELECT, INSERT, UPDATE DELETE will be part of postgres 9.2 merge.

* docs - CTE updates from review comments

* docs - CTE more updates from review comments

* docs - CTE - updates from review comments

* Experimental -> Beta wording

Docs: reword from 'experimental' to 'beta' (#7103) · 79d3bb7f

Committed by David Yozie on Mar 08, 2019

* reword from 'experimental' to 'beta'

* Experimental -> Beta in markdown source

* typo fix

* Removing SuSE 11 details in Beta notes

08 Mar, 2019 1 commit

- docs - remove SOCKS proxy support from s3 protocol (#7106) · 6be44cd6
  Committed by Lisa Owen on Mar 07, 2019

05 Mar, 2019 1 commit

- Warn not to run gpbackup/gprestore during expand. (#7062) · 1aeb61e9
  Committed by Chuck Litzell on Mar 04, 2019
```
* Warn not to run gpbackup/gprestore during expand.

* Updates from review
```

01 Mar, 2019 1 commit

- docs - add new host/seg resource group usage views (#7061) · 40240513
  Committed by Lisa Owen on Feb 28, 2019
```
* docs - add new host/seg resource group usage views

* clarifications requested by ning
```

21 Feb, 2019 1 commit

- docs - removed percentile functions, add ordered-set aggregate functions not supported. (#7024) · 00eda181
  Committed by Mel Kiyama on Feb 20, 2019

14 Feb, 2019 1 commit

- docs - exchange partition, add limitation exchanging with a partitioned table... · 757f1aac
  Committed by Mel Kiyama on Feb 13, 2019
```
docs - exchange partition, add limitation exchanging with a partitioned table is not supported. (#6961)
```

12 Feb, 2019 2 commits

docs - appendoptimized is an alias for appendonly storage option (#6925) · acb4090f
Committed by Lisa Owen on Feb 11, 2019
```
* docs - appendoptimized is an alias for appendonly storage option

* option names lower case, add legacy, misc edits
```

docs - update compression information (#6929) · a742ef05

Committed by Mel Kiyama on Feb 11, 2019

* docs - update compression information
-Add a topic that lists features that support configuring compression.
-update gp_workfile_compression GUC. Remove requirement to install zstd.

* docs - update compression information based on review comments.
-changed description to point to feature/utility for specific compression information.
-added general, pivotal, and OSS specific information about compression requirements.

* docs - compression information - review edit.

08 Feb, 2019 2 commits

Remove gpexpand checks on unique indexes · e2c8b178

Committed by Heikki Linnakangas on Feb 08, 2019

Unique indexes are now enforced throughout gpexpand, so there's no need to
warn or treat them specially. (gpexpand used to first make all tables
randomly distributed, and while in that state, unique indexes could not be
enforced. But it doesn't do that anymore.)

This also removes the gp_status_detail.rank column. There's no particular
reason to process tables with indexes before others anymore, and that was
the only criterion used for ranking.

Reviewed-by: Melanie Plageman <mplageman@pivotal.io>

docs - update template database information. (#6917) · 633960fe

Committed by Mel Kiyama on Feb 07, 2019

* docs - update template database information.

Updated ddl-database.xml topic - section About Template Databases
Also removed draft text from expand topics, gpexpand and pg_tablespace

* docs - update template database information based on review comments.
-Changed text - postgres is not a template database
-changed title to About Template and Default Databases

* Small line edit

06 Feb, 2019 4 commits

Re-implement compression of hash join/agg spill files. · d130fe19

Committed by Heikki Linnakangas on Feb 06, 2019

This re-implements compression that was lost in the WorkFile/BufFile
refactoring. This new implementation uses Zstandard rather than zlib.

Compression can be enabled with "gp_workfile_compression=on". It is
disabled by default.

Reviewed-by: Mel Kiyama <mkiyama@pivotal.io>
Reviewed-by: Yandong Yao <yyao@pivotal.io>

Replace WorkFile stuff with plain BufFiles. · f1ef3668

Committed by Heikki Linnakangas on Feb 06, 2019

Temporary files have been somewhat inconsistent across different
operations. Some operations used the Greenplum-specific "workfile" API,
while others used the upstream BufFile API directly.

The workfile API provides some extra features: workfiles are visible in
the gp_toolkit views, and you can limit their size with the
gp_workfile_limit_* GUCs. The temporary files that didn't go through the
workfile API were exempt, which is not cool.

To make things consistent, remove the workfile APIs. Use BufFiles
directly everywhere. Re-implement the user-facing view and tracking the
limits, on top of the BufFile API, so that those features are not lost.

The workfile API also supported compressing the temporary files using
zlib. That feature is lost with this commit, but will be re-introduced by
the next commit.

Another feature that this removes, is checksumming temporary files.
That doesn't seem very useful, so we can probably live without it. But
if it's still needed, then that should also be re-implemented on top
of the BufFile API later.

Discussion: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/8Xe9MGor0pM/SjqiOo83BAAJ
Reviewed-by: Mel Kiyama <mkiyama@pivotal.io>
Reviewed-by: Yandong Yao <yyao@pivotal.io>

Remove obsolete check in gpexpand for dropped columns. · b94767b5

Committed by Heikki Linnakangas on Feb 06, 2019

The way dropped columns are handled in ALTER TABLE was changed back in
2017, commit 62d66c06, but gpexpand didn't get the memo. Dropped columns
of custom datatypes no longer pose any problems.

Reviewed-by: Venkatesh Raghavan <vraghavan@pivotal.io>

docs - add GUC gp_enable_global_deadlock_detector (#6894) · a246ded0

Committed by Mel Kiyama on Feb 05, 2019

* docs - add GUC gp_enable_global_deadlock_detector

Add Note text to DELETE, LOCK, UPDATE about GDD.
Also, update lock descriptions in dml.xml to point to GDD section.

* docs - review updates for gp_enable_global_deadlock_detector GUC

05 Feb, 2019 3 commits

docs - remove/replace references to gphdfs (#6847) · e455f6a6

Committed by Lisa Owen on Feb 04, 2019

* docs - remove/replace references to gphdfs

* misc edits

* stronger statement (note) about pxf replacing gphdfs

* stronger statement

* edit to note requested by david

* haved -> have... doh

docs - remove filespace from docs (#6892) · 8138541f

Committed by Mel Kiyama on Feb 04, 2019

docs - misc pxf edits (#6846) · d1158f03

Committed by Lisa Owen on Feb 04, 2019

* docs - misc pxf edits

* add pxf to loading/unloading page and subnav

* edit requested by lav

01 Feb, 2019 1 commit

Use normal hash operator classes for data distribution. · 242783ae

Committed by Heikki Linnakangas on Feb 01, 2019

Replace the use of the built-in hashing support for built-in datatypes, in
cdbhash.c, with the normal PostgreSQL hash functions. Now is a good time
to do this, since we've already made the change to use jump consistent
hashing in GPDB 6, so we'll need to deal with the upgrade problems
associated with changing the hash functions, anyway.

It is no longer enough to track which columns/expressions are used to
distribute data. You also need to know the hash function used. For that,
a new field is added to gp_distribution_policy, to record the hash
operator class used for each distribution key column. In the planner,
a new opfamily field is added to DistributionKey, to track that throughout
the planning.

Normally, if you do "CREATE TABLE ... DISTRIBUTED BY (column)", the
default hash operator class for the datatype is used. But this patch
extends the syntax so that you can specify the operator class explicitly,
like "... DISTRIBUTED BY (column opclass)". This is similar to how an
operator class can be specified for each column in CREATE INDEX.

To support upgrade, the old hash functions have been converted to special
(non-default) operator classes, named cdbhash_*_ops. For example, if you
want to use the old hash function for an integer column, you could do
"DISTRIBUTED BY (intcol cdbhash_int4_ops)". The old hard-coded whitelist
of operators that have "compatible" cdbhash functions has been replaced
by putting the compatible hash opclasses in the same operator family. For
example, the legacy integer operator classes cdbhash_int2_ops,
cdbhash_int4_ops and cdbhash_int8_ops are all part of the
cdbhash_integer_ops operator family.

This removes the pg_database.hashmethod field. The hash method is now
tracked on a per-table and per-column basis, using the opclasses, so it's
not needed anymore.

To help with upgrade from GPDB 5, this introduces a new GUC called
'gp_use_legacy_hashops'. If it's set, CREATE TABLE uses the legacy hash
opclasses, instead of the default hash opclasses, if the opclass is not
specified explicitly. pg_upgrade will set the new GUC, to force the use of
legacy hashops, when restoring the schema dump. It will also set the GUC
on all upgraded databases, as a per-database option, so any new tables
created after upgrade will also use the legacy opclasses. It seems better
to be consistent after upgrade, so that, for example, collocation between
old and new tables works. The idea is that some time after the upgrade, the
admin can reorganize all tables to use the default opclasses instead. At
that point, the admin should also clear the GUC on the converted databases.
(Or rather, the automated tool that hasn't been written yet should do that.)

ORCA doesn't know about hash operator classes, or the possibility that we
might need to use a different hash function for two columns with the same
datatype. Therefore, it cannot produce correct plans for queries that mix
different distribution hash opclasses for the same datatype, in the same
query. There are checks in the Query->DXL translation to detect that
case and fall back to the planner. As long as you stick to the default
opclasses in all tables, we let ORCA create the plan without any regard
to them, and use the default opclasses when translating the DXL plan to a
Plan tree. We also allow the case that all tables in the query use the
"legacy" opclasses, so that ORCA works after pg_upgrade. But a mix of the
two, or using any non-default opclasses, forces ORCA to fall back.
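
The fallback rule described above can be summarized as a small decision function. This is only an illustrative Python model, not the actual check in the Query->DXL translation. The `OpclassKind` enum and `orca_can_plan` function are hypothetical names, and the model simplifies each table to a single opclass kind.

```python
from enum import Enum


class OpclassKind(Enum):
    """Hypothetical classification of a table's distribution hash opclasses."""
    DEFAULT = "default"  # the datatype's default hash opclass
    LEGACY = "legacy"    # a legacy cdbhash_*_ops opclass
    OTHER = "other"      # any other explicitly chosen opclass


def orca_can_plan(table_kinds) -> bool:
    """Return True if ORCA may plan the query, False if it must fall back.

    ORCA is allowed when every table uses default opclasses, or when every
    table uses legacy opclasses. Any mix, or any other opclass, forces a
    fallback to the planner.
    """
    kinds = set(table_kinds)
    return kinds <= {OpclassKind.DEFAULT} or kinds <= {OpclassKind.LEGACY}


if __name__ == "__main__":
    print(orca_can_plan([OpclassKind.DEFAULT, OpclassKind.DEFAULT]))
    print(orca_can_plan([OpclassKind.DEFAULT, OpclassKind.LEGACY]))
```

The all-legacy case is what keeps ORCA usable right after pg_upgrade, when gp_use_legacy_hashops makes every table use the cdbhash_*_ops opclasses.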

One curiosity with this is the "int2vector" and "aclitem" datatypes. They
have a hash opclass, but no b-tree operators. GPDB 4 used to allow them
as DISTRIBUTED BY columns, but we forbid that in GPDB 5, in commit
56e7c16b. Now they are allowed again, so you can specify an int2vector
or aclitem column in DISTRIBUTED BY, but it's still pretty useless,
because the planner still can't form EquivalenceClasses on it, and will
treat it as "strewn" distribution, and won't co-locate joins.

Abstime, reltime, tinterval datatypes don't have default hash opclasses.
They are being removed completely on PostgreSQL v12, and users shouldn't
be using them in the first place, so instead of adding hash opclasses for
them now, we accept that they can't be used as distribution key columns
anymore. Add a check to pg_upgrade, to refuse upgrade if they are used
as distribution keys in the old cluster. Do the same for 'money' datatype
as well, although that's not being removed in upstream.

The legacy hashing code for anyarray in GPDB 5 was actually broken. It
could produce a different hash value for two arrays that are considered
equal, according to the = operator, if there were differences in e.g.
whether the null bitmap was stored or not. Add a check to pg_upgrade, to
reject the upgrade if array types were used as distribution keys. The
upstream hash opclass for anyarray works, though, so it is OK to use
arrays as distribution keys in new tables. We just don't support binary
upgrading them from GPDB 5. (See github issue
https://github.com/greenplum-db/gpdb/issues/5467). The legacy hashing of
'anyrange' had the same problem, but that was new in GPDB 6, so we don't
need a pg_upgrade check for that.

This also tightens the checks ALTER TABLE ALTER COLUMN and CREATE UNIQUE
INDEX, so that you can no longer create a situation where a non-hashable
column becomes the distribution key. (Fixes github issue
https://github.com/greenplum-db/gpdb/issues/6317)

Discussion: https://groups.google.com/a/greenplum.org/forum/#!topic/gpdb-dev/4fZVeOpXllQ
Co-authored-by: Mel Kiyama <mkiyama@pivotal.io>
Co-authored-by: Abhijit Subramanya <asubramanya@pivotal.io>
Co-authored-by: Pengzhou Tang <ptang@pivotal.io>
Co-authored-by: Chris Hajas <chajas@pivotal.io>
Reviewed-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
Reviewed-by: Ning Yu <nyu@pivotal.io>
Reviewed-by: Simon Gao <sgao@pivotal.io>
Reviewed-by: Jesse Zhang <jzhang@pivotal.io>
Reviewed-by: Zhenghua Lyu <zlv@pivotal.io>
Reviewed-by: Melanie Plageman <mplageman@pivotal.io>
Reviewed-by: Yandong Yao <yyao@pivotal.io>

29 Jan, 2019 3 commits

docs - remove gptransfer from docs (#6821) · 89c5c3fc

Committed by Mel Kiyama on Jan 28, 2019

* docs - remove gptransfer from docs
--removed gptransfer topics, references to gptransfer, and images.
--also updated text in gpcopy-migrate as rough update for 6.0

* docs - remove gptransfer from docs - review updates

docs - REPEATABLE READ xact mode is supported. (#6717) · fe719bf5

Committed by Chuck Litzell on Jan 28, 2019

* docs - REPEATABLE READ xact mode is supported. SERIALIZABLE falls back to REPEATABLE READ.

* Note that GPDB doesn't implement PGSQL SSI transactions

* Review comments

docs - updates for online expand (#6719) · dd5bb58b

Committed by Mel Kiyama on Jan 28, 2019

* docs - updates for online expand

* docs - online expand - edits based on review comments.
updated catalog table information.
removed draft comments.

19 Jan, 2019 1 commit

docs - reorg pxf content, add multi-server, objstore content (#6736) · f601572d

Committed by Lisa Owen on Jan 18, 2019

* docs - reorg pxf content, add multi-server, objstore content

* misc edits, SERVER not optional

* add server, remove creds from examples

* address comments from alexd

* most edits requested by david

* add Minio to table column name

* edits from review with pxf team (start)

* clear text credentials, reorg objstore cfg page

* remove steps with XXX placeholder

* add MapR to supported hadoop distro list

* more objstore config updates

* address objstore comments from alex

* one parquet data type mapping table, misc edits

* misc edits from david

* add mapr hadoop config step, misc edits

* fix formatting

* clarify copying libs for MapR

* fix pxf links on CREATE EXTERNAL TABLE page

* misc edits

* mapr paths may differ based on version in use

* misc edits, use full topic name

* update OSS book for pxf subnav restructure

17 Jan, 2019 1 commit

Remove mentions about error table from docs · 2c576b08

Committed by Daniel Gustafsson on Jan 16, 2019

The INTO ERROR TABLE syntax has been deprecated since Greenplum 5
shipped.

Discussion: https://groups.google.com/a/greenplum.org/forum/#!topic/gpdb-dev/mzYcVk_G5Uw
Reviewed-by: David Yozie <dyozie@pivotal.io>

16 Jan, 2019 2 commits

- Docs - system columns suppressed in replicated tables (#6671) · 17068f86
  Committed by Chuck Litzell on Jan 15, 2019
```
* Docs - update docs to note that system columns are unavailable in
queries on replicated tables.

* Edits from reviewers
```
- Docs: removing broken xref · c5c6c6b1
  Committed by David Yozie on Jan 15, 2019

15 Jan, 2019 1 commit

docs - migrate using gpcopy- remove same number of source, dest. host… (#6660) · 08676bab

Committed by Mel Kiyama on Jan 14, 2019

* docs - migrate using gpcopy- remove same number of source, dest. hosts restriction.

* docs - migrate using gpcopy- updates based on review comments.

08 Jan, 2019 1 commit

docs - first HA updates that use WAL replication. (#6588) · 96803e5b

Committed by Mel Kiyama on Jan 07, 2019

* docs - first HA updates that use WAL replication.

--Removed references to filerep
--Updated segment instance states to use WAL rep states
--Other misc. updates.

* docs - HA updates for WAL replication - review comment updates

05 Jan, 2019 1 commit

Adds window examples to query topic in admin guide (#6546) · c6696f95

Committed by Chuck Litzell on Jan 04, 2019

* Adds window examples to query topic in admin guide

* Fix some clear errors in CREATE TYPE and CREATE FUNCTION SQL

* Changes from review comments

* Add missing period

04 Jan, 2019 1 commit

- Fix typo in administration guide · 4504c96f
  Committed by Daniel Gustafsson on Jan 04, 2019

03 Jan, 2019 1 commit

docs - discuss the global deadlock detector (#6538) · ea5a0a1c

Committed by Lisa Owen on Jan 02, 2019

* docs - discuss the global deadlock detector

* some of the edits requested by david

* move opening paragraph to release note

* reorg content, add a bit about local deadlock

* guc can be reloaded

* concurrent update AND DELETE
