Commits · aa148d2a3cba9866a2c3a06364a26dee32d1c0a2 · Greenplum / Gpdb

21 Sep, 2017 2 commits

Add support for CREATE FUNCTION EXECUTE ON [MASTER | ALL SEGMENTS] · aa148d2a

Authored by Heikki Linnakangas on Sep 20, 2017

We already had a hack for the EXECUTE ON ALL SEGMENTS case, by setting
prodataaccess='s'. This exposes the functionality to users via DDL, and adds
support for the EXECUTE ON MASTER case.

There was discussion on gpdb-dev about also supporting ON MASTER AND ALL
SEGMENTS, but that is not implemented yet. There is no handy "locus" in the
planner to represent that. There was also discussion about making a
gp_segment_id column implicitly available for functions, but that is also
not implemented yet.

The old behavior was that if a function was marked as
IMMUTABLE, it could be executed anywhere. Otherwise it was always executed
on the master. For backwards-compatibility, this keeps that behavior for
EXECUTE ON ANY (the default), so even if a function is marked as EXECUTE ON
ANY, it will always be executed on the master unless it's IMMUTABLE.

There is no support for these new options in ORCA. Using any ON MASTER or
ON ALL SEGMENTS functions in a query causes ORCA to fall back. This is the
same as with the prodataaccess='s' hack that this replaces, but now that it
is more user-visible, it would be nice to teach ORCA about it.

The new options are only supported for set-returning functions, because for
a regular function marked as EXECUTE ON ALL SEGMENTS, it's not clear how
the results should be combined. ON MASTER would probably be doable, but
there's no need for that right now, so punt.

Another restriction is that a function with ON ALL SEGMENTS or ON MASTER can
only be used in the FROM clause, or in the target list of a simple SELECT
with no FROM clause. So "SELECT func()" is accepted, but "SELECT func() FROM
foo" is not. "SELECT * FROM func(), foo" works, however. EXECUTE ON ANY
functions, which is the default, work the same as before.
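
The placement rules described above can be summarized in a few lines of C. This is only a sketch: the enum and function names are hypothetical and are not the identifiers used in the GPDB planner.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration; not the actual GPDB identifiers. */
typedef enum { EXEC_ON_ANY, EXEC_ON_MASTER, EXEC_ON_ALL_SEGMENTS } ExecLocation;
typedef enum { RUN_ANYWHERE, RUN_ON_MASTER, RUN_ON_ALL_SEGMENTS } RunLocus;

/*
 * Decide where a function runs, following the rules in the commit message:
 * explicit ON MASTER / ON ALL SEGMENTS are honored, while EXECUTE ON ANY
 * keeps the old behavior (anywhere only if IMMUTABLE, otherwise master).
 */
RunLocus
choose_run_locus(ExecLocation declared, bool is_immutable)
{
    switch (declared)
    {
        case EXEC_ON_MASTER:
            return RUN_ON_MASTER;
        case EXEC_ON_ALL_SEGMENTS:
            return RUN_ON_ALL_SEGMENTS;
        case EXEC_ON_ANY:
        default:
            /* Backwards-compatible default. */
            return is_immutable ? RUN_ANYWHERE : RUN_ON_MASTER;
    }
}
```

Note that a non-IMMUTABLE function declared EXECUTE ON ANY still ends up on the master, which is the backwards-compatibility point made above.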

aa148d2a

Fix multistage aggregation plan targetlists · 41640e69

Authored by Bhuvnesh Chaudhary on Sep 19, 2017

If an aggregation query uses aliases that are the same as the table's
actual column names, and those aliases are propagated up from subqueries
with grouping applied on the column alias, the aggregation plan may end
up with inconsistent targetlists, causing a crash.

	CREATE TABLE t1 (a int) DISTRIBUTED RANDOMLY;
	SELECT substr(a, 2) as a
	FROM
		(SELECT ('-'||a)::varchar as a
			FROM (SELECT a FROM t1) t2
		) t3
	GROUP BY a;

41640e69

20 Sep, 2017 10 commits

Dump more detailed info for memory usage in gp_resgroup_status · 2816fe67

Authored by Pengzhou Tang on Sep 18, 2017

In this commit, we add more detailed memory metrics to the 'memory_usage'
column of gp_resgroup_status, including current/available memory usage in
a group, current/available memory usage for a slot, and current/available
memory usage for the shared part.

2816fe67

resource group: refine ResGroupSlotAcquire · 4646bbc6

Authored by Gang Xiong on Sep 11, 2017

Previously, waiters waiting on a dropped resource group needed to be
reassigned to a new group. To achieve this, ResGroupSlotAcquire had
become complicated and hard to understand. This commit refines it.

Author: Gang Xiong <gxiong@pivotal.io>

4646bbc6

resgroup: Allow concurrency to be zero. · 77007ff6

Authored by Pengzhou Tang on Sep 05, 2017

Allow CREATE RESOURCE GROUP and ALTER RESOURCE GROUP to set concurrency
to 0, so that eventually no queries are running in the group and the
resource group can be dropped. On drop, all pending queries are moved to
the new resource group assigned to the role; if the role is also dropped,
the pending queries are all canceled. Setting the concurrency of the admin
group to zero is not allowed: superusers run under the admin group and
only a superuser can alter a resource group, so if the admin group's
concurrency were set to zero, there would be no chance to change it
again.
Signed-off-by: Ning Yu <nyu@pivotal.io>

77007ff6

Report error when 'COPY (SELECT ...) TO' with 'ON SEGMENT' · cbddcc86
Authored by Ming LI on Sep 20, 2017
```
Because we don't know the data location of the result of SELECT query,
ON SEGMENT is forbidden.
```
cbddcc86

Remove the restriction on sum of memory_spill_ratio and memory_shared_quota. · c5a5780a

Authored by Richard Guo on Sep 20, 2017

This commit makes two changes:
1. Remove the restriction that the sum of memory_spill_ratio and
memory_shared_quota must be no larger than 100.
2. Change the range of memory_spill_ratio to be [0, 100].
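
As a rough illustration of the two changes, the following C sketch contrasts the removed sum restriction with the new range check. The function names are hypothetical and do not correspond to the resource group code itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Removed rule: the two settings together had to stay within 100. */
bool
old_sum_restriction_ok(int memory_spill_ratio, int memory_shared_quota)
{
    return memory_spill_ratio + memory_shared_quota <= 100;
}

/* New rule: memory_spill_ratio is validated on its own, range [0, 100]. */
bool
spill_ratio_in_range(int memory_spill_ratio)
{
    return memory_spill_ratio >= 0 && memory_spill_ratio <= 100;
}
```

For instance, memory_spill_ratio = 80 with memory_shared_quota = 50 fails the old sum check but passes the new range check.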

c5a5780a

Fix warning of passing const to non-const parameter. · f4417c50

Authored by Hubert Zhang on Sep 19, 2017

The call FaultInjectorIdentifierStringToEnum(faultName) passes a const
string to a non-const parameter, which causes a build warning. On second
thought, since we already support injecting a fault by fault name without
a corresponding fault identifier, it's better to use the fault name
instead of the fault enum identifier in the ereport.

f4417c50

Change requirement to have zlib in PATH to have zlib installed. (#3315) · a40f04b8
Authored by Chuck Litzell on Sep 19, 2017

a40f04b8

Developer version of gpstart for WALRep · dc549c2f

Authored by Taylor Vesely on Sep 15, 2017

Adds a clusterstart command to gpsegwalrep.py to allow a user to start a
cluster with WALRep configured. This is a developer utility that assumes
all cluster replicas are present on localhost, and thus is not intended
for production use.

dc549c2f

docs - memory_spill_ratio guc and related content (#3278) · 4f0392fd
Authored by Lisa Owen on Sep 19, 2017
```
* docs - memory_spill_ratio guc and related content

* operator -> transaction
```
4f0392fd

Remove dependency on system curl; Fix bug with OSX (#3261) · f12d756c

Authored by Lav Jain on Sep 19, 2017

* Remove dependency on system curl; Fix bug with OSX

* Add ifdef for CURLOPT_RESOLVE

* Incorporate feedback

* brew curl not needed anymore

f12d756c

19 Sep, 2017 8 commits

Map GPOS severity level to GPDB Severity Levels · e25eba47

Authored by Bhuvnesh Chaudhary on Sep 13, 2017

GPOS raises exceptions with different severity levels, but they were
all being logged to the GPDB logs at the LOG severity level.
This prevented users from turning off logging for GPOS exceptions unless
the GPDB log setting was raised above the LOG severity level.

This is the initial commit which introduces the functionality. If an
exception is created without the GPDB severity level, it will default to
LOG severity level in GPDB.
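
A minimal sketch of such a mapping with a LOG default might look as follows. The enum names and the particular set of levels are hypothetical stand-ins, not the GPOS or GPDB definitions.

```c
#include <assert.h>

/* Hypothetical stand-ins for the optimizer-side and database-side levels. */
typedef enum { OPT_SEV_UNSET = 0, OPT_SEV_NOTICE, OPT_SEV_WARNING, OPT_SEV_ERROR } OptimizerSeverity;
typedef enum { DB_LOG, DB_NOTICE, DB_WARNING, DB_ERROR } DbLogLevel;

/* Map an optimizer severity to a database log level; unset or unknown
 * severities fall back to LOG, as described in the commit message. */
DbLogLevel
map_severity(OptimizerSeverity sev)
{
    switch (sev)
    {
        case OPT_SEV_NOTICE:
            return DB_NOTICE;
        case OPT_SEV_WARNING:
            return DB_WARNING;
        case OPT_SEV_ERROR:
            return DB_ERROR;
        case OPT_SEV_UNSET:
        default:
            return DB_LOG;
    }
}
```
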
Signed-off-by: Jemish Patel <jpatel@pivotal.io>

e25eba47

Bump ORCA version to v2.45.1 · 6fae6442
Authored by Bhuvnesh Chaudhary on Sep 18, 2017

6fae6442

Fix: address PR comment and adding MERGE_FIXME · f3c00e1b
Authored by Xin Zhang on Sep 14, 2017
```
Signed-off-by: Abhijit Subramanya <asubramanya@pivotal.io>
```
f3c00e1b

Fix: using macro for GP_REPLICATION_CONFIG_FILENAME. · 60db8cfd
Authored by Abhijit Subramanya on Sep 14, 2017
```
Signed-off-by: Xin Zhang <xzhang@pivotal.io>
```
60db8cfd

Create generic API to set any GUC values in GP_REPLICATION_CONFIG_FILENAME · f39047dd

Authored by Xin Zhang on Sep 13, 2017

New API: void set_gp_replication_config(const char *name, const char *value)

This function is inspired by the upstream ALTER SYSTEM command
AlterSystemSetConfigFile() from commit
7dfab04a.

Once we have merged the upstream changes, we can remove this function and
use AlterSystemSetConfigFile() directly.
Signed-off-by: Abhijit Subramanya <asubramanya@pivotal.io>

f39047dd

Add GP-specific replication config file gp_replication.conf · a90b4103

Authored by Xin Zhang on Sep 13, 2017

We use this file to store the value of the GUC `synchronous_standby_names`,
which controls the blocking behavior between primary and mirrors, as in
upstream. When this GUC is on, the primary blocks and waits for commits
to be propagated to the mirrors, regardless of mirror status. When this
GUC is off, the primary just archives and does not wait for mirrors.

gp_replication.conf is now read unconditionally by the GUC parsing logic
and needs to be set up by initdb. Refactor set_null_conf() to take a
filename so that we don't copy-paste more code.
Signed-off-by: Jacob Champion <pchampion@pivotal.io>

a90b4103

add_assignment: don't crash if assignment isn't found · 6c23d77f

Authored by Jacob Champion on Sep 13, 2017

add_assignment previously didn't handle the case where an existing
config assignment was not found, and segfaulted. Fix that by inserting
brand-new assignments at the end of the config list.
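
The "replace if found, otherwise append at the end" behavior can be sketched with a singly linked list. The ConfigItem type and set_assignment function below are hypothetical and only illustrate the idea; they are not the code touched by this commit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical config list node: one "name = value" assignment. */
typedef struct ConfigItem
{
    char              *name;
    char              *value;
    struct ConfigItem *next;
} ConfigItem;

static char *
copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char  *p = malloc(n);

    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* Update an existing assignment, or append a brand-new one at the tail.
 * Returns 0 on success, -1 on allocation failure. */
int
set_assignment(ConfigItem **head, const char *name, const char *value)
{
    ConfigItem **link = head;
    ConfigItem  *item;

    for (; *link != NULL; link = &(*link)->next)
    {
        if (strcmp((*link)->name, name) == 0)
        {
            char *v = copy_string(value);

            if (v == NULL)
                return -1;
            free((*link)->value);
            (*link)->value = v;
            return 0;
        }
    }

    /* Not found: 'link' now points at the terminating NULL pointer. */
    item = malloc(sizeof(*item));
    if (item == NULL)
        return -1;
    item->name = copy_string(name);
    item->value = copy_string(value);
    item->next = NULL;
    if (item->name == NULL || item->value == NULL)
    {
        free(item->name);
        free(item->value);
        free(item);
        return -1;
    }
    *link = item;
    return 0;
}
```

Walking the list through a pointer-to-pointer means the empty-list case and the not-found case share the same append path, so there is no NULL dereference when the name is missing.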

6c23d77f

Revert commit e70ca9ce. · 4df139b1
Authored by Ekta Khanna and Jemish Patel on Sep 18, 2017
```
The pipeline was failing as the ORCA tag for 2.45.0 was not pushed.
It is currently available.
```
4df139b1

18 Sep, 2017 3 commits

Add sanity checks for unrecognized window frame options. · c7c158dd

Authored by Heikki Linnakangas on Sep 18, 2017

These shouldn't happen, but Coverity warned about these. GCC would also
complain, but I've been compiling with -Wno-maybe-uninitialized lately,
because of noise.

Actually, this isn't quite enough; ORCA also needs to mark GPOS_RAISE
with the "noreturn" attribute, so that the compiler gets the hint.
Opened https://github.com/greenplum-db/gporca/pull/234 about that.
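
To illustrate why the "noreturn" hint matters, here is a small sketch using the GCC/Clang attribute syntax. The helper raise_unrecognized and the FrameBoundKind enum are hypothetical; they stand in for an error-raising macro and a frame option and are not GPDB or ORCA code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical error helper. The attribute tells the compiler that
 * control never comes back from this call. */
__attribute__((noreturn)) void
raise_unrecognized(const char *what, int value)
{
    fprintf(stderr, "unrecognized %s: %d\n", what, value);
    abort();
}

typedef enum { FRAME_UNBOUNDED, FRAME_CURRENT_ROW, FRAME_OFFSET } FrameBoundKind;

const char *
frame_bound_name(FrameBoundKind kind)
{
    const char *name;   /* deliberately left uninitialized */

    switch (kind)
    {
        case FRAME_UNBOUNDED:
            name = "UNBOUNDED";
            break;
        case FRAME_CURRENT_ROW:
            name = "CURRENT ROW";
            break;
        case FRAME_OFFSET:
            name = "OFFSET";
            break;
        default:
            /* Sanity check: should not happen, but fail loudly if it does. */
            raise_unrecognized("frame bound kind", (int) kind);
    }
    return name;
}
```

Without the attribute, the compiler has to assume the default branch can fall through to the return, which is what triggers a "maybe uninitialized" style warning.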

c7c158dd

Using fault name instead of enum as the key of fault hash table (#3249) · 4616d3ec

Authored by Huan Zhang on Sep 18, 2017

Using fault name instead of enum as the key of fault hash table

The GPDB fault injector uses the fault enum as the key of the fault hash
table. If someone wants to inject a fault into a gpdb extension (a
separate repo), she has to hard-code the extension-related fault enums
into the gpdb core code, which is not good practice.
So we simply use the fault name as the hash key, removing the need to
hard-code the fault enum. Note that the fault injector API doesn't change.
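
The idea of keying the table by name rather than by enum can be sketched with a tiny open-addressing hash table. Everything here (FaultEntry, fault_lookup, the table size) is hypothetical and much simpler than the real fault injector; it only shows that a new fault name needs no change to a central enum.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FAULT_TABLE_SIZE 64
#define FAULT_NAME_MAX   64

/* Hypothetical entry keyed by the fault's name. */
typedef struct FaultEntry
{
    char name[FAULT_NAME_MAX];
    int  in_use;
    int  hit_count;
} FaultEntry;

static FaultEntry fault_table[FAULT_TABLE_SIZE];

static unsigned
hash_name(const char *s)
{
    unsigned h = 5381;

    while (*s != '\0')
        h = h * 33 + (unsigned char) *s++;
    return h;
}

/* Find an entry by name using linear probing; optionally create it.
 * Returns NULL if absent (and create == 0), if the table is full,
 * or if the name is too long. Entries are never deleted in this sketch. */
FaultEntry *
fault_lookup(const char *name, int create)
{
    unsigned start;
    unsigned i;

    if (strlen(name) >= FAULT_NAME_MAX)
        return NULL;
    start = hash_name(name) % FAULT_TABLE_SIZE;
    for (i = 0; i < FAULT_TABLE_SIZE; i++)
    {
        FaultEntry *e = &fault_table[(start + i) % FAULT_TABLE_SIZE];

        if (!e->in_use)
        {
            if (!create)
                return NULL;
            strcpy(e->name, name);
            e->in_use = 1;
            e->hit_count = 0;
            return e;
        }
        if (strcmp(e->name, name) == 0)
            return e;
    }
    return NULL;
}
```

An extension could then register a fault under any name of its own choosing, such as the made-up "my_extension_fault", without touching core code.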

4616d3ec

Revert "Incorrect Decorrelation results in wrong plan" · e70ca9ce

Authored by Adam Lee on Sep 18, 2017

This reverts commit 378426fe.

> --2017-09-17 12:12:00--  https://github.com/greenplum-db/gporca/releases/download/v2.45.0/bin_orca_centos5_release.tar.gz
> Resolving github.com... 192.30.255.113, 192.30.255.112
> Connecting to github.com|192.30.255.113|:443... connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2017-09-17 12:12:00 ERROR 404: Not Found.

e70ca9ce

17 Sep, 2017 2 commits

Convert WindowFrame to frameOptions + start + end · ebf9763c

Authored by Heikki Linnakangas on Sep 17, 2017

In GPDB, we have so far used a WindowFrame struct to represent the start
and end window bounds in a ROWS/RANGE BETWEEN clause, while PostgreSQL
uses the combination of a frameOptions bitmask and start and end
expressions. Refactor to replace the WindowFrame with the upstream
representation.
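
The "bitmask plus start and end expressions" shape can be sketched as below. The FO_* flag names and values are made up for illustration and are not the PostgreSQL definitions; expressions are represented as plain strings to keep the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values only; these are NOT the PostgreSQL definitions. */
enum
{
    FO_ROWS              = 1 << 0,  /* ROWS, as opposed to RANGE */
    FO_RANGE             = 1 << 1,
    FO_BETWEEN           = 1 << 2,  /* explicit BETWEEN ... AND ... */
    FO_START_UNBOUNDED   = 1 << 3,
    FO_START_CURRENT_ROW = 1 << 4,
    FO_START_OFFSET      = 1 << 5,  /* start bound carries an expression */
    FO_END_UNBOUNDED     = 1 << 6,
    FO_END_CURRENT_ROW   = 1 << 7,
    FO_END_OFFSET        = 1 << 8   /* end bound carries an expression */
};

/* Hypothetical frame clause: option bits plus optional bound expressions. */
typedef struct FrameClause
{
    int         frameOptions;
    const char *startOffset;  /* expression text, NULL when unused */
    const char *endOffset;    /* expression text, NULL when unused */
} FrameClause;

/* An offset expression must be present exactly when its flag is set. */
bool
frame_clause_is_consistent(const FrameClause *f)
{
    bool want_start = (f->frameOptions & FO_START_OFFSET) != 0;
    bool want_end = (f->frameOptions & FO_END_OFFSET) != 0;

    return want_start == (f->startOffset != NULL) &&
           want_end == (f->endOffset != NULL);
}
```

Under these made-up flags, ROWS BETWEEN 2 PRECEDING AND CURRENT ROW would be FO_ROWS | FO_BETWEEN | FO_START_OFFSET | FO_END_CURRENT_ROW with a start expression of "2" and no end expression.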

ebf9763c

Hardcode the "frame maker" function for LEAD and LAG. · 686aab95

Authored by Heikki Linnakangas on Sep 17, 2017

This removes pg_window.winframemakerfunc column. It was only used for
LEAD/LAG, and only in the Postgres planner. Hardcode the same special
handling for LEAD/LAG in planwindow.c instead, based on winkind.

This is one step in refactoring the planner and executor further, to
replace the GPDB implementation of window functions with the upstream
one.

686aab95

16 Sep, 2017 12 commits

Add check for hash partitioned tables in pg_upgrade. · 22072ec5

Authored by Heikki Linnakangas on Sep 16, 2017

I was about to add this as part of the PostgreSQL 8.4 merge, as a check
when upgrading from 8.3 to 8.4, because the hash algorithm was changed
in 8.4. However, it turns out that pg_dump doesn't support hash partitioned
tables at all, so pg_upgrade won't work on a database that contains any
hash partitioned tables, even on a same-version upgrade. Hence, let's
add this check unconditionally on all server versions.

There are comments talking about the hash function change, because of that
development history. I think that's useful documentation, just in case
we ever start to support hash partitions in pg_dump, so I left it there.

22072ec5

Fix check for superuser_reserved_connections. · 06ea112c

Authored by Heikki Linnakangas on Sep 16, 2017

Upstream uses >= here. It was changed in GPDB to use > instead of >=, but
I don't see how that's more correct or better. I tracked that change in
the old pre-open-sourcing repository to this commit:

commit f3e98a1ef5fc5915662077b137c563371ea1c0a4
Date: Mon Apr 6 15:04:33 2009 -0800

   Fixed guc check for ReservedBackends.

   [git-p4: depot-paths = "//cdb2/main/": change = 33269]

So there was no explanation there either of what the alleged problem was.
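
The difference between the two comparisons only shows up at the boundary. The sketch below uses hypothetical function names and assumes the check rejects a configuration when the comparison holds between the reserved count and the connection limit.

```c
#include <assert.h>
#include <stdbool.h>

/* Upstream-style check: reject when reserved >= max, so at least one
 * connection slot is always left for non-superusers. */
bool
reserved_ok_upstream(int reserved_backends, int max_connections)
{
    return !(reserved_backends >= max_connections);
}

/* The variant with '>': it also accepts reserved == max, which would
 * leave no slot at all for ordinary users. */
bool
reserved_ok_with_gt(int reserved_backends, int max_connections)
{
    return !(reserved_backends > max_connections);
}
```

With reserved = 10 and max = 10, the first function rejects the setting while the second accepts it; for every other combination they agree.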

06ea112c

Fix CREATE TABLE AS VALUES ... DISTRIBUTED BY · 47936ab2

Authored by Heikki Linnakangas on Sep 16, 2017

Should call setQryDistributionPolicy() after applyColumnNames(), otherwise
the column names specified in the CREATE TABLE cannot be used in the
DISTRIBUTED BY clause. Add test case.

Fixes github issue #3285.

47936ab2

Remove function isMemoryIntensiveFunction · 5c9b81ef

Authored by Kavinder Dhaliwal on Aug 30, 2017

Historically this function was used to special case a few operators that
were not considered to be MemoryIntensive. However, now it always
returns true. This commit removes the function and also moves the case
for T_FunctionScan in IsMemoryIntensiveOperator into the group that
always returns true, as this is its current behavior.

5c9b81ef

Fix test broken by the ALTER TYPE SET DEFAULT ENCODING changes · 8a31b037
Authored by Abhijit Subramanya on Sep 15, 2017
```
Signed-off-by: Xin Zhang <xzhang@pivotal.io>
```
8a31b037

Create separate pipeline for NBU and DDBoost pipeline jobs · f8869b5c
Authored by Karen Huddleston on Sep 14, 2017

f8869b5c

Incorrect Decorrelation results in wrong plan · 378426fe

Authored by Bhuvnesh Chaudhary on Sep 14, 2017

While attempting to decorrelate the subquery, we were
incorrectly pulling up the join before calculating the window function
of the results of the join. In cases where we have subqueries with
window function and the subquery has outer references we should not
attempt decorrelating it.

There are further optimizations that can be done in the case of
existential queries, but this PR fixes the plan.
Signed-off-by: Jemish Patel <jpatel@pivotal.io>

378426fe

doc: document issue when updating both max_statement_mem and statement_mem (#3269) · 5303f02a
Authored by Mel Kiyama on Sep 15, 2017

5303f02a

docs: COPY command - add PROGRAM clause (#3297) · e86c8aae
Authored by Mel Kiyama on Sep 15, 2017
```
* docs: COPY command add PROGRAM clause

* docs: copy - edits from review comments.
```
e86c8aae

Fix yet another TINC test, broken by the SET DEFAULT ENCODING changes. · 7979ab57
Authored by Heikki Linnakangas on Sep 15, 2017
```
Again attempting a blind fix..
```
7979ab57

Fix another test broken by the ALTER TYPE SET DEFAULT ENCODING changes. · c0818e44
Authored by Heikki Linnakangas on Sep 15, 2017
```
I can't easily run these tests myself, so hoping I get this fixed blindly..
```
c0818e44

Removing to_json() function from docs builds until it's added to the system catalog (#3303) · 85fdaf1e
Authored by David Yozie on Sep 15, 2017

85fdaf1e

15 Sep, 2017 3 commits

Make it possible to build without libbz2, also on non-Windows. · d6749c3c

Authored by Heikki Linnakangas on Sep 15, 2017

The bzip2 library is only used by the gfile/fstream code, used for external
tables and gpfdist. The usage of bzip2 was in #ifndef WIN32 blocks, so it
was only built on non-Windows systems.

Instead of tying it to the platform, use a proper autoconf check and
HAVE_LIBBZ2 flags. This makes it possible to build gpfdist with bzip2
support on Windows, as well as building without bzip2 on non-Windows
systems. That makes it easier to test the otherwise Windows-only codepaths
on other platforms. --with-libbz2 is still the default, but you can now use
--without-libbz2 if you wish.

I'm sure that some regression tests will fail if you actually build the
server without libbz2, but I'm not going to address that right now. We have
similar problems with other features that are in principle optional, but
cause some regression tests to fail.

Also use "#ifdef HAVE_LIBZ" rather than "#ifndef WIN32" to enable/disable
zlib support in gpfdist. Building the server still fails if you use
--without-zlib, but at least you can build the client programs without
zlib, also on non-Windows systems.

Remove obsolete copy of bzlib.h from the repository while we're at it.
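
The switch from platform tests to feature tests can be illustrated with a short sketch. HAVE_LIBBZ2 is the flag named in the commit message and would normally come from the autoconf-generated config header; the functions bzip2_supported and can_read_file are hypothetical and are not part of gpfdist.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Feature test instead of "#ifndef WIN32": the header is only pulled in
 * when configure found the library. */
#ifdef HAVE_LIBBZ2
#include <bzlib.h>
#endif

bool
bzip2_supported(void)
{
#ifdef HAVE_LIBBZ2
    return true;
#else
    return false;
#endif
}

static bool
has_suffix(const char *s, const char *suffix)
{
    size_t n = strlen(s);
    size_t m = strlen(suffix);

    return n >= m && strcmp(s + n - m, suffix) == 0;
}

/* Hypothetical helper: plain files are always readable, ".bz2" files
 * only when the build has bzip2 support compiled in. */
bool
can_read_file(const char *path)
{
    if (has_suffix(path, ".bz2"))
        return bzip2_supported();
    return true;
}
```

Because the condition is a build-time feature flag, the same "no bzip2" code path can be exercised on any platform simply by configuring without the library.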

d6749c3c

Fix stanullfrac computation on column with all-wide values. · 90bcf3fd

Authored by Heikki Linnakangas on Sep 15, 2017

If the sample of a column consists entirely of "too wide" values, which
are left out of the sample when it's passed to the compute_stats function,
we pass an empty sample to it. The default compute_stats gets confused by
that, and computes the null fraction as 0 / 0 = NaN, so we end up storing
NaN as stanullfrac.

If all the values in the sample are wide values, then they're surely not
NULLs, so the right thing to do is to store stanullfrac = 0. That is a
bit non-linear with the normal compute_stats function, which effectively
treats too wide values as not existing at all, which artificially inflates
the null fraction. Another non-linear thing is that we store stawidth=1024
in this special case, but the normal computation again ignores the wide
values in computing stawidth. If we wanted to do something about that, we
should adjust the normal computation to take those wide values better into
account, but that's a different story, and now we at least won't store NaN
in stanullfrac any longer.
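
The 0 / 0 = NaN problem and the guard against it can be reproduced in isolation. The function names below are hypothetical and the arithmetic is a simplification of what a compute_stats function does; the sketch assumes IEEE 754 floating point.

```c
#include <assert.h>
#include <math.h>

/* Naive computation: with an empty sample this evaluates 0.0 / 0.0,
 * which is NaN under IEEE 754 arithmetic. */
double
naive_null_frac(int null_count, int sample_rows)
{
    return (double) null_count / (double) sample_rows;
}

/* Guarded computation: an empty sample (for example, because every value
 * was too wide and left out) is reported as a null fraction of 0. */
double
guarded_null_frac(int null_count, int sample_rows)
{
    if (sample_rows == 0)
        return 0.0;
    return (double) null_count / (double) sample_rows;
}
```
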

Fixes github issue #3259.

90bcf3fd

Fix test case after the change to ALTER TYPE SET DEFAULT ENCODING. · 30e50772

Authored by Heikki Linnakangas on Sep 15, 2017

Commit b4f125bd changed ALTER TYPE SET DEFAULT ENCODING to no longer
accept SQL type aliases. A consequence of that is that "char" no longer
means "character varying", but the actual "char" datatype. Change the tests
to use the PostgreSQL name for that datatype, "bpchar".

30e50772