1. 22 Sep 2017 (1 commit)
2. 21 Sep 2017 (14 commits)
    • Mask out differences in plperl.c line numbers in errors. · 8b153171
      Committed by Heikki Linnakangas
      Ideally, we would use proper error codes, or find some other way to
      prevent the useless "(plperl.c:2118)" from appearing in PL/perl
      errors. Later versions of PostgreSQL do that, so we'll get it
      eventually. In the meantime, mask the differences caused by code
      movement in that file, the same as we had already done for plperl's
      own tests.
    • Use autoconf for resolving PXF library dependency · 6f1ca717
      Committed by Daniel Gustafsson
      Leverage the core autoconf scaffolding for resolving the dependency
      on libcurl. Enabling PXF in autoconf now automatically adds libcurl
      as a dependency. Coupled with the recent commit which relaxes the
      curl version requirement on macOS, we can remove the library copying
      from the PXF makefile as well.
    • Fix bug in handling re-scan of a hash join. · f7101d98
      Committed by Heikki Linnakangas
      The WITH RECURSIVE test case in 'join_gp' would miss some rows, if
      the hash algorithm (src/backend/access/hash/hashfunc.c) was replaced
      with the one from PostgreSQL 8.4, or if statement_mem was lowered from
      1000 kB to 700 kB. This is what happened:
      
      1. A tuple belongs to batch 0, and is kept in memory while
         processing batch 0.

      2. The outer scan finishes, and we spill the inner batch 0 from
         memory to a file, with SpillFirstBatch, and start processing
         batch 1.

      3. While processing batch 1, the number of batches is increased,
         and the tuple that belonged to batch 0, and was already written
         to batch 0's file, is moved to a later batch.

      4. After the first scan is complete, the hash join is re-scanned.

      5. We reload batch file 0 into memory. While reloading, we
         encounter the tuple that no longer seems to belong to batch 0,
         and throw it away.

      6. We perform the rest of the re-scan. We have missed any matches
         to the tuple that was thrown away. It was not part of the later
         batch files, because in the first pass it was handled as part of
         batch 0. But in the re-scan it was not handled as part of batch
         0, because nbatch was now larger, so it no longer belonged there.
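
      (A worked example, simplifying the real bucket/batch computation to
      hashvalue mod nbatch: a tuple with hash value 6 maps to batch
      6 mod 2 = 0 while nbatch = 2, so in the first pass it is handled,
      and spilled, as part of batch 0. If nbatch has grown to 4 by the
      time batch 0's file is reloaded, the same tuple now maps to batch
      6 mod 4 = 2, so the reload no longer considers it part of batch 0.)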
      
      To fix this: when, while reloading a batch file, we see a tuple
      that actually belongs to a later batch, we write it to that later
      batch's file. To avoid adding it there multiple times when the hash
      join is re-scanned more than once, if any tuples are moved while
      reloading a batch file, we destroy the batch file and re-create it
      with just the remaining tuples.
      
      This is made a bit complicated by the fact that BFZ temp files
      don't support appending to a file that has already been rewound for
      reading. So what we actually do is always re-create the batch file,
      even if there have been no changes to it. I left comments about
      that. Ideally, we would either support re-appending to BFZ files,
      or stop using BFZ workfiles for this altogether (I'm not convinced
      they're any better than plain BufFiles). But that can be done later.
      
      Fixes github issue #3284
    • Don't double-count inner tuples reloaded from file. · 429ff8c4
      Committed by Heikki Linnakangas
      ExecHashTableInsert also increments the counter, so we don't need
      to do it here. This is harmless AFAICS; the counter isn't currently
      used for anything but instrumentation, but it confused me while
      debugging.
    • Fix CURRENT OF to work with PL/pgSQL cursors. · 91411ac4
      Committed by Heikki Linnakangas
      Before, it only worked for cursors declared with DECLARE CURSOR.
      You got a "there is no parameter $0" error if you tried. This moves
      the decision on whether a plan is "simply updatable" from the
      parser to the planner. Doing it in the parser was awkward, because
      we only want to do it for queries that are used in a cursor, and
      for SPI queries we don't yet know that at parse time.
      
      For some reason, the copy, out, and read functions of CurrentOfExpr
      were missing the cursor_param field. While we're at it, reorder the
      code to match upstream.
      
      This only makes the required changes to the Postgres planner. ORCA
      has never supported updatable cursors. In fact, it will fall back
      to the Postgres planner on any DECLARE CURSOR command, which is why
      the existing tests have passed even with optimizer=on.
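
      A minimal illustration of the case this fixes (the table and
      function names are hypothetical, not from the commit):

          CREATE TABLE accounts (id int, balance int) DISTRIBUTED BY (id);

          CREATE FUNCTION zero_first_account() RETURNS void AS $$
          DECLARE
              r record;
              c CURSOR FOR SELECT * FROM accounts;
          BEGIN
              OPEN c;
              FETCH c INTO r;
              -- Before this fix, the next statement failed with
              -- "there is no parameter $0":
              UPDATE accounts SET balance = 0 WHERE CURRENT OF c;
              CLOSE c;
          END;
          $$ LANGUAGE plpgsql;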
    • Remove now-unnecessary code from gp_read_error_log to dispatch the call. · 4035881e
      Committed by Heikki Linnakangas
      There was code in gp_read_error_log() to "manually" dispatch the
      call to all the segments, if it was executed in the dispatcher.
      This was previously necessary because, even though the function was
      marked with prodataaccess='s', the planner did not guarantee that
      it was executed on the segments when called in the target list,
      like "SELECT gp_read_error_log('tab')". Now that we have the
      EXECUTE ON ALL SEGMENTS syntax, and are more rigorous about
      enforcing it in the planner, this hack is no longer required.
    • Refactor resource group source code, part 2. · a2cf9bdf
      Committed by Ning Yu
      * resgroup: provide helper funcs for memory usage updates.
      
      We used to have complex, duplicated logic to update group and slot
      memory usage in different contexts; now we provide two helper
      functions to increase or decrease memory usage in the group and
      slot.

      The two badly named functions `attachToSlot()` and
      `detachFromSlot()` are now retired.
      
      * resgroup: provide helper function to unassign a dropped resgroup.
      
      * resgroup: move complex checks into helper functions.
      
      Many helper functions with descriptive names were added to improve
      the readability of a number of complex checks.

      Also added a pointer to the resource group slot in self.
      
      * resgroup: add helper functions for wait queue operations.
    • Fix aix7_ppc_64 making script · 15c04803
      Committed by Adam Lee
          $ make -j -s install
          ...
          --- subprocess32, Linux only
          /bin/sh: line 3: [: =: unary operator expected
          --- stream
          ...
          Greenplum Database installation complete.
      
      When `$(BLD_ARCH)` is empty, the check becomes `[ = 'aix7_ppc_64' ]`,
      which produces the "unary operator expected" error.
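
      Quoting the expansion is the standard way to keep such a test
      well-formed even when the variable is empty, e.g.
      `[ "$(BLD_ARCH)" = 'aix7_ppc_64' ]` (a general shell note; the
      commit's actual change may differ).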
    • Make gp_replication.conf for USE_SEGWALREP only. · b7ce6930
      Committed by Ashwin Agrawal
      The intent of this extra configuration file is to control the
      synchronization between primary and mirror for WALREP.

      The gp_replication.conf is not designed to work with filerep; for
      example, scripts like gp_expand will fail, since they modify the
      configuration files directly instead of going through initdb.

      Signed-off-by: Xin Zhang <xzhang@pivotal.io>
    • d60e2389
    • Take advantage of the new EXECUTE ON syntax in gp_toolkit. · 9a039e4f
      Committed by Heikki Linnakangas
      Also change a few regression tests to use the new syntax, instead of
      gp_toolkit's __gp_localid and __gp_masterid functions.
    • Add support for CREATE FUNCTION EXECUTE ON [MASTER | ALL SEGMENTS] · aa148d2a
      Committed by Heikki Linnakangas
      We already had a hack for the EXECUTE ON ALL SEGMENTS case, by setting
      prodataaccess='s'. This exposes the functionality to users via DDL, and adds
      support for the EXECUTE ON MASTER case.
      
      There was discussion on gpdb-dev about also supporting ON MASTER AND ALL
      SEGMENTS, but that is not implemented yet. There is no handy "locus" in the
      planner to represent that. There was also discussion about making a
      gp_segment_id column implicitly available for functions, but that is also
      not implemented yet.
      
      The old behavior was that if a function was marked as IMMUTABLE, it
      could be executed anywhere; otherwise it was always executed on the
      master. For backwards-compatibility, this keeps that behavior for
      EXECUTE ON ANY (the default): even if a function is marked as
      EXECUTE ON ANY, it will always be executed on the master unless
      it's IMMUTABLE.
      
      There is no support for these new options in ORCA. Using any ON
      MASTER or ON ALL SEGMENTS function in a query causes ORCA to fall
      back. This is the same as with the prodataaccess='s' hack that this
      replaces, but now that it is more user-visible, it would be nice to
      teach ORCA about it.
      
      The new options are only supported for set-returning functions, because for
      a regular function marked as EXECUTE ON ALL SEGMENTS, it's not clear how
      the results should be combined. ON MASTER would probably be doable, but
      there's no need for that right now, so punt.
      
      Another restriction is that a function with ON ALL SEGMENTS or ON
      MASTER can only be used in the FROM clause, or in the target list
      of a simple SELECT with no FROM clause. So "SELECT func()" is
      accepted, but "SELECT func() FROM foo" is not. "SELECT * FROM
      func(), foo" works, however. EXECUTE ON ANY functions (the default)
      work the same as before.
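
      A sketch of the new syntax and the placement rules described above
      (the function name and body are invented for illustration):

          CREATE FUNCTION seg_report() RETURNS SETOF text AS $$
              SELECT 'ok'::text
          $$ LANGUAGE sql EXECUTE ON ALL SEGMENTS;

          SELECT * FROM seg_report();    -- allowed: used in FROM
          SELECT seg_report();           -- allowed: simple SELECT, no FROM
          SELECT seg_report() FROM foo;  -- rejected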
    • Fix multistage aggregation plan targetlists · 41640e69
      Committed by Bhuvnesh Chaudhary
      If an aggregation query uses aliases that match the table's actual
      column names, the aliases are propagated up from subqueries, and
      grouping is applied on the column alias, the resulting targetlists
      for the aggregation plan can become inconsistent, causing a crash.
      For example:
      
      	CREATE TABLE t1 (a int) DISTRIBUTED RANDOMLY;
      	SELECT substr(a, 2) as a
      	FROM
      		(SELECT ('-'||a)::varchar as a
      			FROM (SELECT a FROM t1) t2
      		) t3
      	GROUP BY a;
3. 20 Sep 2017 (10 commits)
4. 19 Sep 2017 (8 commits)
5. 18 Sep 2017 (3 commits)
6. 17 Sep 2017 (2 commits)
    • Convert WindowFrame to frameOptions + start + end · ebf9763c
      Committed by Heikki Linnakangas
      In GPDB, we have so far used a WindowFrame struct to represent the
      start and end window bounds in a ROWS/RANGE BETWEEN clause, while
      PostgreSQL uses the combination of a frameOptions bitmask and start
      and end expressions. Refactor to replace WindowFrame with the
      upstream representation.
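
      For reference, this is the kind of clause whose bounds are now
      encoded as a frameOptions bitmask plus start/end offset expressions
      (example query invented for illustration, not from the commit):

          -- ROWS frame, start = value preceding (offset 1), end = current row
          SELECT x, sum(x) OVER (ORDER BY x
                                 ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
          FROM generate_series(1, 5) AS t(x);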
    • Hardcode the "frame maker" function for LEAD and LAG. · 686aab95
      Committed by Heikki Linnakangas
      This removes pg_window.winframemakerfunc column. It was only used for
      LEAD/LAG, and only in the Postgres planner. Hardcode the same special
      handling for LEAD/LAG in planwindow.c instead, based on winkind.
      
      This is one step in refactoring the planner and executor further, to
      replace the GPDB implementation of window functions with the upstream
      one.
7. 16 Sep 2017 (2 commits)
    • Add check for hash partitioned tables in pg_upgrade. · 22072ec5
      Committed by Heikki Linnakangas
      I was about to add this as part of the PostgreSQL 8.4 merge, as a
      check when upgrading from 8.3 to 8.4, because the hash algorithm
      was changed in 8.4. However, it turns out that pg_dump doesn't
      support hash partitioned tables at all, so pg_upgrade won't work on
      a database that contains any hash partitioned tables, even on a
      same-version upgrade. Hence, let's add this check unconditionally
      on all server versions.
      
      There are comments discussing the hash function change, because of
      that development history. I think that's useful documentation, just
      in case we ever start to support hash partitions in pg_dump, so I
      left it there.
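
      A sketch of the kind of catalog query such a check can run (this
      assumes GPDB's pg_partition catalog, where parkind = 'h' marks hash
      partitioning; the actual pg_upgrade query may differ):

          SELECT count(*)
          FROM pg_catalog.pg_partition
          WHERE parkind = 'h';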
    • Fix check for superuser_reserved_connections. · 06ea112c
      Committed by Heikki Linnakangas
      Upstream uses >= here. It was changed in GPDB to use > instead of
      >=, but I don't see how that's more correct or better. I tracked
      the change in the old pre-open-sourcing repository to this commit:
      
      commit f3e98a1ef5fc5915662077b137c563371ea1c0a4
      Date: Mon Apr 6 15:04:33 2009 -0800
      
         Fixed guc check for ReservedBackends.
      
         [git-p4: depot-paths = "//cdb2/main/": change = 33269]
      
      So there was no explanation there either of what the alleged
      problem was.