1. 28 Aug 2017, 12 commits
    • H
      Use ereport() rather than elog(), for "expected" ERRORs. · 522c7c09
      Committed by Heikki Linnakangas
      Use ereport(), with a proper error code, for errors that are "expected" to
      happen sometimes, like missing configuration files, or network failure.
      
      That's nicer in general, but the reason I bumped into this is that internal
      error messages include a source file name and line number in GPDB, and if
      those error messages are included in the expected output of regression
      tests, those tests will fail any time a change to the file moves the
      elog() around and the line number changes.
      522c7c09
    • D
      Avoid side effects in assertions · 288dde95
      Committed by Daniel Gustafsson
      An assertion with a side effect may alter the main codepath when
      the tree is built with --enable-cassert, which in turn may lead
      to subtle differences due to compiler optimizations and/or straight
      bugs in the side effect. Rewrite the assertions without side
      effects to leave the main codepath intact.
      288dde95
    • P
      Enlarge the range of memory_shared_quota in tests · 46eb2c73
      Committed by Pengzhou Tang
      46eb2c73
    • P
      Add test cases under utility mode for resource group · 910036b0
      Committed by Pengzhou Tang
      This commit verifies that connections in utility mode are not
      governed by resource groups.
      910036b0
    • P
      Add resource group tests for cursors, pl functions and prepare/execute statements · 9392d17b
      Committed by Pengzhou Tang
      This commit verifies that cursors, PREPARE/EXECUTE, and statements within
      PL functions do not trigger a self-deadlock under resource group
      concurrency control.
      9392d17b
    • P
      Use index scan on pg_resgroupcapability for resource group · 3fd8a618
      Committed by Pengzhou Tang
      We used to use a sequential scan on pg_resgroupcapability in functions
      that need to do a full scan of the table. The problem is that after
      pg_resgroupcapability has been updated/deleted millions of times, as our
      stress tests do, the table fills up with invalid blocks, and a sequential
      scan wastes a lot of time skipping over them. Using an index scan
      resolves this.
      3fd8a618
    • P
      Add parallel tests for resource group · ee1d30c9
      Committed by Pengzhou Tang
      This commit contains all kinds of parallel tests, including combinations
      of CREATE, DROP, ALTER and queries; it depends on the dblink component to
      run queries concurrently.
      Signed-off-by: Richard Guo <riguo@pivotal.io>
      ee1d30c9
    • A
      Optimize `COPY TO ON SEGMENT` result processing · 266355d3
      Committed by Adam Lee
      Don't send nonsense '\n' characters just for counting; let segments
      report how many rows were processed instead.
      Signed-off-by: Ming LI <mli@apache.org>
      266355d3
    • X
      Add GUC to control the distribution key checking for "COPY FROM ON SEGMENT" · 6566d48c
      Committed by Xiaoran Wang
      The GUC is `gp_enable_segment_copy_checking`, and its default value is
      true. Users can disable the distribution key check by setting it to false.
      Signed-off-by: Xiaoran Wang <xiwang@pivotal.io>
      6566d48c
    • X
      Check distribution key restriction for `COPY FROM ON SEGMENT` · 65321259
      Committed by Xiaoran Wang
      When the command `COPY FROM ON SEGMENT` is used, we copy data from a
      local file directly into the table on the segment. When copying
      data, we need to apply the distribution policy to each record to compute
      the target segment. If the target segment ID isn't equal to the
      current segment ID, we report an error to preserve the distribution
      key restriction.
      
      Because the segment has no metadata about the table's distribution policy
      and partition policy, we copy the distribution policy of the main table
      from the master to the segment in the query plan. When the parent table
      and a partitioned sub-table have different distribution policies, it is
      difficult to check the distribution key restriction across all sub-tables.
      In this case, we report an error.
      
      When a partitioned table's distribution policy is RANDOMLY and differs
      from the parent table's, users can disable this check with the GUC
      `gp_enable_segment_copy_checking`.
      
      Check the distribution key restriction as follows:
      
      1) Table isn't partitioned:
          Compute the row's target segment. If the row doesn't belong to
          this segment, report an error.
      
      2) Table is partitioned and the distribution policy of the partitioned
      table is the same as the main table's:
          Compute the row's target segment. If the row doesn't belong to
          this segment, report an error.
      
      3) Table is partitioned and the distribution policy of the partitioned
      table is different from the main table's:
          Checking is not supported; report an error.
      Signed-off-by: Xiaoran Wang <xiwang@pivotal.io>
      Signed-off-by: Ming LI <mli@apache.org>
      Signed-off-by: Adam Lee <ali@pivotal.io>
      65321259
    • H
      Add new GPDB_EXTRA_COL mechanism to process_col_defaults.pl · 215283a6
      Committed by Heikki Linnakangas
      This allows setting one of the GPDB-added attributes on a line, without
      modifying the original line. This reduces the diff of pg_proc.h from
      upstream.
      
      This has no effect on the resulting BKI file, except for whitespace. IOW,
      there are no catalog changes in this commit. I checked that by diffing the
      resulting BKI file, before and after this patch, with "diff -w".
      
      I still left the TODO comment in pg_proc.h in place, which pointed out
      that it'd be nice if we could automatically use prodataaccess = 'c' as the
      default for SQL-language functions, and 'n' for others. I actually wrote a
      more flexible prototype at first that could do that. In that prototype, you
      could provide an arbitrary perl expression that was evaluated on every row,
      and could compute a value based on other columns. But that was more
      complicated, and at the same time, not as flexible, because you could still
      not specify particular values for just one row. So I think this is better
      in the end.
      
      Also, I noticed that we haven't actually marked all SQL-language functions
      with prodataaccess = 'c'. Tsk tsk. It's too late for catalog changes, so
      not fixing that now. At some point, we should discuss whether we should
      do something different with prodataaccess, like change the code so that
      it's ignored for SQL language functions altogether. Or perhaps just remove
      the column, the only useful value for it is the magic 's' value, which
      can only be used in built-in functions because there's no DDL syntax for
      it. But that's a whole different story.
      215283a6
    • H
      Don't strip semicolon from re-constructed DATA lines in BKI sources. · e6569903
      Committed by Heikki Linnakangas
      It's harmless, as the genbki script ignores them, but seems a bit untidy.
      It also makes it harder to diff between the re-constructed DATA lines and
      the originals.
      e6569903
  2. 27 Aug 2017, 1 commit
    • D
      Ensure directory cleanup in workfile manager · 6aecdb98
      Committed by Daniel Gustafsson
      Calling elog(ERROR ..) will exit the current context, so the
      cleanup function would never run. Shift around to ensure the
      cleanup is called. This codepath was recently introduced in
      commit 00ce2c14 where a surrounding try/catch block was removed.
      
      Also change to ereport from elog since this is an error that
      a user could run into.
      6aecdb98
  3. 26 Aug 2017, 12 commits
  4. 25 Aug 2017, 13 commits
    • H
      Change a few error messages to match upstream again. · ceb602da
      Committed by Heikki Linnakangas
      I don't understand why these were modified in GPDB in the first place.
      I dug into the old git history, from before Greenplum was open sourced, and
      traced the change to a massive commit from 2011, which added support for
      (non-recursive) WITH clause. I think the change was just collateral damage
      in that patch; I don't see any relationship between WITH clause support and
      these error messages.
      
      These errors can be reproduced with queries like this:
      
          (select 'foobar' order by 1) order by 1;
          (select 'foobar' limit 1) limit 2;
          (select 'foobar' offset 1) offset 2;
      ceb602da
    • H
      Use ereport, rather than elog, for performance. · 01dff3ba
      Committed by Heikki Linnakangas
      ereport() has one subtle but important difference from elog(): it doesn't
      evaluate its arguments, if the log level says that the message doesn't
      need to be printed. This makes a small but measurable difference in
      performance, if the arguments contain more complicated expressions, like
      function calls.
      
      While performance testing a workload with very short queries, I saw some
      CPU time being used in DtxContextToString. Those calls were coming from the
      arguments to elog() statements, and the result was always thrown away,
      because the log level was not high enough to actually log anything. Turn
      those elog()s into ereport()s, for speed.
      
      The problematic case here was a few elogs containing DtxContextToString
      calls, in hot codepaths, but I changed a few surrounding ones too, for
      consistency.
      
      Simplify the mock test, to not bother mocking elog(), while we're at it.
      The real elog/ereport work just fine in the mock environment.
      01dff3ba
    • H
      Fix assertion failure in single-user mode. · a29aecf7
      Committed by Heikki Linnakangas
      In single-user mode, MyQueueId isn't set. But there was an assertion for
      that in ResourceQueueGetQueryMemoryLimit. To fix, don't apply memory limits
      in single-user mode.
      a29aecf7
    • P
      Cleanup specialized test cases for UDP type of interconnect · 61b82012
      Committed by Pengzhou Tang
      The UDP type of interconnect has been removed from the code base; we
      also need to clean up its specialized test cases.
      61b82012
    • A
      e61a0134
    • X
      Make pg_resetxlog consistent with gpinitsystem checksum setting · 0bc37e07
      Committed by Xin Zhang
      By default, gpinitsystem will turn on HEAP_CHECKSUM by calling initdb
      with --data-checksum.
      
      Originally, pg_resetxlog will set data_checksum_version to 0 if
      pg_control is not readable.
      
      In this fix, we make pg_resetxlog set the data_checksum_version to
      PG_DATA_CHECKSUM_VERSION instead, hence its default behavior will be
      consistent with gpinitsystem.
      Signed-off-by: Ashwin Agrawal <aagrawal@pivotal.io>
      0bc37e07
    • Z
      Fix bug: resgroup decrease concurrency_limit not work correctly · 79f3d357
      Committed by Zhenghua Lyu
      In the previous code, when a user decreases a resource group's
      concurrency_limit to a value less than the number of currently running
      jobs in that group, the calculation of the memory that needs to be
      returned to SYSPOOL is incorrect, which can cause an assertion failure.
      
      We add code to handle this situation. The logic
      is that when we decide to return some memory to SYSPOOL, we
      only return `Min(total_memory_should_return, max_mem_can_return)`.
      
      And since the alter-memory command has gradually-take-effect semantics,
      when a job is about to end and finds that its ending could provide a
      free slot for others (not blocked by concurrency_limit), it will not
      return all the memory it could; it reserves some to make sure that the
      new job would not be blocked because of memory_quota.
      Signed-off-by: Gang Xiong <gxiong@pivotal.io>
      79f3d357
    • M
      DOCS: update gpconfig. Updated, simplified quoting syntax for GUC values (#3016) · 0aed7aef
      Committed by Mel Kiyama
      * DOCS: update gpconfig. Updated, simplified quoting syntax for GUC values
      
      * docs: update gpconfig examples in docs to use updated quoting syntax.
      
      * docs: gpconfig -fix typos
      0aed7aef
    • L
      docs - misc edits (#3019) · e53d773f
      Committed by Lisa Owen
      * misc doc edits
      
      - pg_proc prorows column
      - pg_locks virtualxid, virtualtransaction columns
      - update a few queries to remove pg_locks transaction
      - add TRUNCATE to LOCK sql page ACCESS RESTRICTIVE command list
      
      * fix pg_locks intro wording
      e53d773f
    • X
      Update PR pipeline with CONFIGURE_FLAGS · 8409d622
      Committed by Xin Zhang
      Signed-off-by: Ashwin Agrawal <aagrawal@pivotal.io>
      8409d622
    • H
      Drop test view after test, to work around failure in binary swap test. · fef69349
      Committed by Heikki Linnakangas
      Commit a7de4d60 added a test view, which is causing trouble for the
      binary swap test. The binary swap test runs the regression tests using
      the new binaries, but then swaps out an old binary, and runs pg_dump
      against the regression database. That fails with this test view, because
      the old version still has the bug that commit a7de4d60 fixed, and
      cannot parse the view definition stored in the catalogs correctly.
      
      So, the code is fine, and we are still binary-compatible. To silence the
      false failure in the binary swap test, drop the test view after the test,
      so that it won't be present in the regression database, when the binaries
      are swapped.
      
      Analysis by Jesse Zhang. See discussion on the gpdb-dev mailing list.
      
      Discussion: https://groups.google.com/a/greenplum.org/forum/#!topic/gpdb-dev/UJ3U_yqs38A
      fef69349
    • S
      122186bc
    • A
      40391f6d
  5. 24 Aug 2017, 2 commits
    • H
      Revert the premature optimization in readfuncs.c. · a7de4d60
      Committed by Heikki Linnakangas
      We had replaced the upstream code in readfuncs.c that checks what kind of
      a Node we're reading, with a seemingly smarter binary search. However, that's
      a premature optimization. Firstly, the linear search is pretty darn fast,
      because after compiler optimizations, it will check for the string length
      first. Secondly, the binary search implementation required an extra
      palloc+pfree, which is expensive enough that it almost surely destroys any
      performance gain from using a binary search. Thirdly, this isn't a very
      performance-sensitive codepath anyway. This is used e.g. to read view
      definitions from the catalog, which doesn't happen very often. The
      serialization code used when dispatching a query from QD to QEs is a more
      hot codepath, but that path uses a different method, in readfast.c.
      
      So, revert the code to the way it is in upstream. This hopefully reduces
      merge conflicts in the future.
      
      Also, there was in fact a silly bug in the old implementation. It used
      the wrong identifier string for the RowCompare expression. Because of that, if
      you tried to use a row comparison in a view, you got an error. Fix that,
      and also add a regression test for it.
      a7de4d60
    • H
      Enable forgotten test. · e1cbe31b
      Committed by Heikki Linnakangas
      Commit 0be2e5b0 added a regression test to check that if a view contains
      an ORDER BY, that ORDER BY is obeyed on selecting from the view. However,
      it forgot to add the test to the schedule, so it was never run. Add it.
      
      There is actually a problem with the test as it is written: gpdiff masks
      out the differences in the row order, so this test won't catch the problem
      that it was originally written for. Nevertheless, it seems better to
      enable the test and run it than not to run it at all. But add a comment
      to point that out.
      e1cbe31b