1. 05 Sep 2017 (1 commit)
    • Simplify tuple serialization in Motion nodes. · 11e4aa66
      Committed by Ning Yu
      * Simplify tuple serialization in Motion nodes.
      
      There is a fast-path for tuples that contain no toasted attributes,
      which writes the raw tuple almost as is. However, the slow path is
      significantly more complicated, calling each attribute's binary
      send/receive functions (although there's a fast-path for a few
      built-in datatypes). I don't see any need for calling I/O functions
      here. We can just write the raw Datum on the wire. If that works
      for tuples with no toasted attributes, it should work for all tuples,
      if we just detoast any toasted attributes first.
      
      This makes the code a lot simpler, and also fixes a bug with data
      types that don't have binary send/receive routines. We used to
      call the regular (text) I/O functions in that case, but didn't handle
      the resulting cstring correctly.
      
      Diagnosis and test case by Foyzur Rahman.
      Signed-off-by: Haisheng Yuan <hyuan@pivotal.io>
      Signed-off-by: Ning Yu <nyu@pivotal.io>
      11e4aa66
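      The simplified approach can be illustrated with a toy Python sketch (not the actual C implementation; attribute values are modeled as byte strings, with a "toasted" value represented by an illustrative marker tuple):

```python
# Toy model of the simplified Motion serialization: detoast any toasted
# attribute first, then write the raw value bytes on the wire, instead of
# calling per-type binary send/receive functions. All names are illustrative.

def detoast(value):
    """Expand a 'toasted' value; here a toasted value is ('TOAST', payload)."""
    if isinstance(value, tuple) and value[0] == "TOAST":
        return value[1]
    return value

def serialize_tuple(attrs):
    """Serialize a tuple of attribute byte strings into one wire buffer."""
    out = bytearray()
    for value in attrs:
        raw = detoast(value)                # works for toasted and plain values alike
        out += len(raw).to_bytes(4, "big")  # length prefix
        out += raw
    return bytes(out)

def deserialize_tuple(buf):
    """Read the length-prefixed attribute values back out of the buffer."""
    attrs, i = [], 0
    while i < len(buf):
        n = int.from_bytes(buf[i:i+4], "big"); i += 4
        attrs.append(buf[i:i+n]); i += n
    return attrs
```

      The round trip is identical whether or not an attribute was toasted, which is the point of the commit: if writing raw bytes works for untoasted tuples, it works for all tuples once everything is detoasted.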
  2. 02 Sep 2017 (1 commit)
    • Don't pull up correlated subqueries with limit/offset clause · 34521645
      Committed by Bhuvnesh Chaudhary
      For queries with an IN clause on top of a correlated subquery
      containing a limit/offset clause, the planner tries to determine in
      convert_IN_to_join() whether the IN clause can be converted to a join,
      and creates an RTE for the join if possible, without considering the
      limit/offset clause in that decision. However, later in pull_up_subqueries(),
      a check enforced by is_simple_subquery() prevents the subquery containing
      limit/offset clauses from being pulled up. This inconsistency causes a plan
      to be generated with a param, but with no corresponding subplan.
      
      The patch fixes the issue by adding the relevant checks in
      convert_IN_to_join() to identify whether the subquery is
      correlated and contains a limit/offset clause; in such cases the sublink
      is not converted to a join, and a plan with a subplan is created.
      34521645
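      The added decision can be sketched as a toy Python predicate (illustrative only, not the planner's C code; field names are assumptions):

```python
# Toy model of the fix in convert_IN_to_join(): refuse the IN-to-join
# conversion when the subquery is correlated AND has a limit/offset clause,
# matching the is_simple_subquery() check applied later in
# pull_up_subqueries(). This keeps the two decisions consistent.

class Subquery:
    def __init__(self, correlated, has_limit_offset):
        self.correlated = correlated
        self.has_limit_offset = has_limit_offset

def can_convert_in_to_join(sub):
    if sub.correlated and sub.has_limit_offset:
        return False  # keep the sublink; plan it as a subplan instead
    return True
```

      With this rule, the case that used to produce a param without a subplan is simply never converted in the first place.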
  3. 01 Sep 2017 (3 commits)
    • Fix Copyright and file headers across the tree · ed7414ee
      Committed by Daniel Gustafsson
      This bumps the copyright years to the appropriate years after not
      having been updated for some time.  Also reformats existing code
      headers to match the upstream style to ensure consistency.
      ed7414ee
    • Set errcode on AO checksum errors · 60f9ac3d
      Committed by Daniel Gustafsson
      The missing errcode makes the ereport call include the line number
      of the invocation from the .c file, which not only isn't very useful
      but causes the tests to fail when adding/removing code in the file.
      60f9ac3d
    • Improve the error messages in a few COPY FROM SEGMENT cases. · 43b59d57
      Committed by Heikki Linnakangas
      * Use ereport() with a proper error code, rather than elog(), so that you
        don't get the source file name and line number in the message, and the
        serious-looking backtrace in the log.
      
      * Remove the hint that advised "SET gp_enable_segment_copy_checking=off",
        when a row failed the check that it's being loaded to the correct
        segment. Ignoring the mismatch seems like a very bad idea, because if
        your rows are in incorrect segments, all bets are off, and you'll likely
        get incorrect query results when you try to query the table.
      43b59d57
  4. 31 Aug 2017 (1 commit)
    • Refactor gp_guc_list_show to have a simpler calling convention. · ddff4877
      Committed by Heikki Linnakangas
      Rather than appending to a StringInfo, return a string. The caller can
      append that to a StringInfo if desired. And instead of passing a
      prefix as an argument, the caller can prepend that too.
      
      Both callers passed the same format string, so just embed that in the
      function itself.
      
      Don't append a trailing "; ". It's easier for the caller to append it,
      if it's preferred, than to remove it afterwards.
      
      Also add a regression test for the 'gp_enable_fallback_plan' GUC. There
      were none before. The error message you get with that GUC disabled uses the
      gp_guc_list_show function.
      ddff4877
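      The new calling convention can be sketched like this (toy Python; the real function is C and reads actual GUC values, so all names here are illustrative):

```python
# Toy model of the refactored gp_guc_list_show calling convention:
# the function returns a plain string, and the caller decides what to
# prepend (a prefix) or append (a trailing "; ", or a StringInfo-like buffer).

def guc_list_show(gucs):
    """Format (name, value) pairs; the format string is embedded here,
    since both callers passed the same one."""
    return "; ".join(f"{name}={value}" for name, value in gucs)

def caller(buf, gucs):
    """Caller side: prepend its own prefix and append the trailing '; '."""
    buf.append("settings: " + guc_list_show(gucs) + "; ")
    return buf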
  5. 30 Aug 2017 (1 commit)
    • Remove bogus/unnecessary gpdiff "mvd" directives. · 6f73417b
      Committed by Heikki Linnakangas
      Most, if not all, of the queries in the qp_olap_windowerr test contained
      gpdiff "mvd" directives, to tell gpdiff what the expected order of output
      rows is. However, all of the queries in that test fail on purpose, because
      of various errors. That means that the "mvd" directives didn't do anything,
      because there were no result sets in the output.
      
      However, commit de548159 added a few tests that return a result set
      to the end of the test script. That caused the preceding "mvd" directives
      to be applied, incorrectly, to those new result sets. That produced a lot
      of messages like "specified MVD column out of range: 3 vs 1" on the
      console. While harmless, they didn't cause the test to fail; still, let's be tidy.
      6f73417b
  6. 29 Aug 2017 (4 commits)
  7. 28 Aug 2017 (2 commits)
    • Add GUC to control the distribution key checking for "COPY FROM ON SEGMENT" · 6566d48c
      Committed by Xiaoran Wang
      The GUC is `gp_enable_segment_copy_checking`; its default value is true.
      Users can disable the distribution key check with this GUC.
      Signed-off-by: Xiaoran Wang <xiwang@pivotal.io>
      6566d48c
    • Check distribution key restriction for `COPY FROM ON SEGMENT` · 65321259
      Committed by Xiaoran Wang
      When using the command `COPY FROM ON SEGMENT`, we copy data from a
      local file directly to the table on the segment. When copying
      data, we need to apply the distribution policy to each record to compute
      its target segment. If the target segment ID isn't equal to the
      current segment ID, we report an error to keep the distribution
      key restriction.
      
      Because the segment has no metadata about the table's distribution policy and
      partition policy, we copy the distribution policy of the main table from
      the master to the segment in the query plan. When the parent table and a
      partitioned sub-table have different distribution policies, it is difficult
      to check the distribution key restriction in all sub-tables. In this
      case, we report an error.
      
      In case the partitioned table's distribution policy is
      RANDOMLY and different from the parent table's, the user can use the GUC
      `gp_enable_segment_copy_checking` to disable this check.
      
      The distribution key restriction is checked as follows:
      
      1) Table isn't partitioned:
          Compute the data's target segment. If the data doesn't belong to the
          segment, report an error.
      
      2) Table is partitioned and the distribution policy of the partitioned table
      is the same as the main table's:
          Compute the data's target segment. If the data doesn't belong to
          the segment, report an error.
      
      3) Table is partitioned and the distribution policy of the partitioned
      table is different from the main table's:
          Checking is not supported; report an error.
      Signed-off-by: Xiaoran Wang <xiwang@pivotal.io>
      Signed-off-by: Ming LI <mli@apache.org>
      Signed-off-by: Adam Lee <ali@pivotal.io>
      65321259
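      The three-case check above can be sketched as a toy Python function (illustrative only; the real code is C, and the modulo hash here is a stand-in for the actual cdbhash distribution hashing):

```python
# Toy model of the distribution-key check for COPY FROM ON SEGMENT.
# `key % num_segments` stands in for hashing the row's distribution key;
# the return value is a status string so the three cases are explicit.

def check_row(key, current_segment, num_segments,
              partitioned=False, same_policy_as_parent=True,
              checking_enabled=True):
    if not checking_enabled:
        return "ok"                 # gp_enable_segment_copy_checking = off
    if partitioned and not same_policy_as_parent:
        return "unsupported"        # case 3: policies differ, cannot check
    # Cases 1 and 2: compute the target segment and compare.
    if key % num_segments == current_segment:
        return "ok"
    return "wrong segment"
```

      A "wrong segment" or "unsupported" status corresponds to the error reports described above; disabling the GUC skips the check entirely.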
  8. 25 Aug 2017 (2 commits)
  9. 24 Aug 2017 (3 commits)
    • Revert the premature optimization in readfuncs.c. · a7de4d60
      Committed by Heikki Linnakangas
      We had replaced the upstream code in readfuncs.c that checks what kind of
      a Node we're reading, with a seemingly smarter binary search. However, that's
      a premature optimization. Firstly, the linear search is pretty darn fast,
      because after compiler optimizations, it will check for the string length
      first. Secondly, the binary search implementation required an extra
      palloc+pfree, which is expensive enough that it almost surely destroys any
      performance gain from using a binary search. Thirdly, this isn't a very
      performance-sensitive codepath anyway. This is used e.g. to read view
      definitions from the catalog, which doesn't happen very often. The
      serialization code used when dispatching a query from QD to QEs is a
      hotter codepath, but that path uses a different method, in readfast.c.
      
      So, revert the code to the way it is in upstream. This hopefully reduces
      merge conflicts in the future.
      
      Also, there was in fact a silly bug in the old implementation. It used the
      wrong identifier string for the RowCompare expression. Because of that, if
      you tried to use a row comparison in a view, you got an error. Fix that,
      and also add a regression test for it.
      a7de4d60
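      Why the linear search is cheap can be sketched in toy Python (the real code is C; the tag table here is a tiny illustrative sample, not the full node list):

```python
# Toy model of upstream readfuncs.c node-tag matching: a linear chain of
# comparisons that check the token length before the bytes, so most
# non-matching tags are rejected on length alone, with no palloc/pfree.

NODE_READERS = {
    "QUERY": lambda: {"node": "Query"},
    "ROWCOMPARE": lambda: {"node": "RowCompareExpr"},
    "CONST": lambda: {"node": "Const"},
}

def parse_node_string(token):
    # Linear scan; each miss is usually rejected by the length check.
    for tag, reader in NODE_READERS.items():
        if len(tag) == len(token) and tag == token:
            return reader()
    raise ValueError(f"badly formatted node string: {token}")
```

      The old binary-search code paid for an extra allocation per lookup and, per the commit, also mapped the RowCompare tag to the wrong identifier, which this linear table-driven form makes hard to get wrong.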
    • Enable forgotten test. · e1cbe31b
      Committed by Heikki Linnakangas
      Commit 0be2e5b0 added a regression test to check that if a view contains
      an ORDER BY, that ORDER BY is obeyed on selecting from the view. However,
      it forgot to add the test to the schedule, so it was never run. Add it.
      
      There is actually a problem with the test as it is written: gpdiff masks
      out the differences in the row order, so this test won't catch the problem
      that it was originally written for. Nevertheless, it seems better to enable
      the test and run it than not to run it at all. But add a comment to point
      that out.
      e1cbe31b
    • Fix bug where a GUC is set twice on SET command. · bba4fe16
      Committed by Heikki Linnakangas
      In order to test this, modify the gpdiff tool to not suppress duplicate
      WARNINGs, if the duplicates don't have a segment number attached to them.
      The bug was easy to see by setting work_mem, which emits a WARNING in
      GPDB, except that suppressing the duplicates hid the bug. Even though we
      have a few tests that set work_mem, add one that does that explicitly for
      the purpose of testing for this bug, with a comment explaining its purpose.
      
      While we're at it, move the GPDB-specific test queries out of 'guc' test,
      into a new 'guc_gp' test, to keep the upstream 'guc' test pristine.
      
      Fixes github issue #3008, reported by @guofengrichard.
      bba4fe16
  10. 23 Aug 2017 (1 commit)
  11. 22 Aug 2017 (2 commits)
    • Remove triple dash that screws up tests · afad6d0a
      Committed by Venkatesh Raghavan
      afad6d0a
    • Coerce ROWS PRECEDING/FOLLOWING expression already at parse time. · 710e36e5
      Committed by Heikki Linnakangas
      This is potentially a tiny bit faster, since the coercion can be performed
      just once at parse/plan time, rather than on every row.
      
      This fixes some of the bogus error checks and inconsistencies in handling
      the ROWS expressions. For example, before, if you passed a string constant
      as the ROWS expression, you got an error, but if you passed a more
      complicated expression that returned a string, the string was cast to an
      integer at runtime. And those casts evaded the plan-time checks for
      negative values.
      
      Also, move the checks for negative ROWS/RANGE value from the parser to the
      beginning of execution, even in the cases where the value is a constant, or
      a stable expression that only needs to be executed once. We were missing
      the checks in ORCA, so this fixes the behavior with ORCA for such queries.
      710e36e5
  12. 21 8月, 2017 2 次提交
    • Fix tests for NULL values in ROWS PRECEDING/FOLLOWING clause. · de548159
      Committed by Heikki Linnakangas
      There were two problems with the existing NULL check. Firstly, it would
      not catch expressions that returned NULL at runtime, only NULL constants.
      You got a non-friendly "Unexpected internal error" error instead. Secondly,
      it rejected expressions that contained NULL constants anywhere in the
      expression, even if the expression as whole returned a non-NULL.
      
      To fix both issues, add a NULL-check with a better error message to the
      executor, and remove the parse-time check altogether. I copied the error
      message texts from the upstream, so that you get the same error, even
      though the implementation is quite different.
      
      Add regression tests for those cases too, we apparently didn't have any.
      
      Fixes github issue #2999.
      de548159
    • Move ORCA invocation into standard_planner · d5dbbfd9
      Committed by Daniel Gustafsson
      The way ORCA was tied into the planner, running a planner_hook
      was not supported in the intended way. This commit moves ORCA
      into standard_planner() instead of planner() and leaves the hook
      for extensions to make use of, with or without ORCA. Since the
      intention with the optimizer GUC is to replace the planner in
      postgres, while keeping the planning process, this allows
      planner extensions to co-operate with that.
      
      In order to reduce the Greenplum footprint in upstream postgres
      source files for future merges, the ORCA functions are moved to
      their own file.
      
      Also adds a memaccounting class for planner hooks since they
      otherwise ran in the planner scope, as well as a test for using
      planner_hooks.
      d5dbbfd9
  13. 18 Aug 2017 (5 commits)
    • Add test case for \dx and \dx+ · bb379e33
      Committed by Heikki Linnakangas
      I repurposed the existing psql_gpdb_du test for this. It was very small;
      I think we can put tests for other \d commands in the same file. In
      passing, fix a typo in a comment there, and move the expected output from
      output/ to expected/, because it doesn't need any string replacements.
      bb379e33
    • \COPY doesn't support ON\tSEGMENT · 74225f3d
      Committed by Xiaoran Wang
      When there is a \t between 'ON' and 'SEGMENT', the client can now give
      the right error message.
      Signed-off-by: Xiaoran Wang <xiwang@pivotal.io>
      74225f3d
    • Create minimal segment WAL replication pipeline · 076a3848
      Committed by Jimmy Yih
      As we develop segment WAL replication, we need a green-to-green CI
      pipeline that we can start small and add relevant jobs to from
      gpdb_master as we add more functionality. This commit establishes that
      pipeline and omits ICW tests that are currently not supported by
      segment WAL replication.
      
      Authors: Abhijit Subramanya and Jimmy Yih
      076a3848
    • Fix check for all-Const target list, in single-row-insert dispatch. · 2b497f09
      Committed by Heikki Linnakangas
      If you have a simple insert, like "INSERT INTO foo VALUES ('bar')", we
      evaluate the target list (i.e. 'bar') in the master, and route the insert to
      the correct partition and segment, based on the constants. However, there
      was a mismatch between allConstantValuesClause(), and what its callers
      assumed. The callers assumed that if allConstantValuesClause() returns
      true, the target list contains only Const nodes. But in reality,
      allConstantValuesClause() also returned true, if there were non-volatile
      function expressions in the target list, that could be evaluated, and would
      then produce a constant result.
      
      Fix the mismatch, by making allConstantValuesClause() more strict,
      so that it only returns true if all the entries are true Consts.
      
      Fixes github issue #285, reported by @liruto.
      2b497f09
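      The stricter check can be sketched in toy Python (the node classes here are illustrative stand-ins for the planner's C node types):

```python
# Toy model of the stricter allConstantValuesClause(): only genuine Const
# nodes qualify. A non-volatile function expression no longer counts, even
# though it would evaluate to a constant, because the callers assume every
# entry is literally a Const.

class Const:
    def __init__(self, value):
        self.value = value

class FuncExpr:
    def __init__(self, volatile=False):
        self.volatile = volatile

def all_constant_values(targetlist):
    return all(isinstance(entry, Const) for entry in targetlist)
```

      Under the old behavior, a target list like `[Const(1), FuncExpr(volatile=False)]` would have passed the check while not actually being all-Const, which is the caller/callee mismatch the commit fixes.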
    • Fix assertion failure (or incorrect behavior) in a planner corner-case. · 87ffb8e8
      Committed by Heikki Linnakangas
      It almost seems like when we merged with PostgreSQL 8.2 (sic), we missed
      this one line from commit 986085a7. Before that, the rowMarks list was
      a list of integers, but now it's a list of RowMarkClauses.
      87ffb8e8
  14. 17 Aug 2017 (3 commits)
  15. 16 Aug 2017 (2 commits)
    • AO, CO and PT should have relfrozenxid as InvalidTransactionId. · c2fd24be
      Committed by Ashwin Agrawal
      AO and CO tables never store transaction IDs, and persistent tables always
      have tuples with FrozenXid only. Hence these tables should not have a valid
      relfrozenxid, and should not matter for age calculation either. Persistent
      tables are currently skipped internally from the `datfrozenxid` calculation but
      still contained a valid value for relfrozenxid in pg_class, whereas AO/CO tables
      carried a valid relfrozenxid and were used for the `datfrozenxid` calculation as well.
      
      This commit makes sure only tables involved in the `datfrozenxid` calculation
      actually carry a valid relfrozenxid in pg_class and others don't. So, externally,
      someone executing the age() function can easily ignore such tables by checking `not
      relfrozenxid = 0`.
      
      Fixes #2856.
      c2fd24be
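      The resulting rule can be sketched in toy Python (a simplification that ignores XID wraparound arithmetic; the row dicts are illustrative stand-ins for pg_class rows):

```python
# Toy model: AO/CO and persistent tables now carry relfrozenxid = 0
# (InvalidTransactionId), so datfrozenxid is the minimum over only the
# tables with a valid value, and age() consumers can filter with
# "not relfrozenxid = 0".

InvalidTransactionId = 0

def compute_datfrozenxid(pg_class_rows):
    valid = [row["relfrozenxid"] for row in pg_class_rows
             if row["relfrozenxid"] != InvalidTransactionId]
    return min(valid)
```

      Before the fix, an AO/CO table's relfrozenxid would have participated in that minimum and could drag `datfrozenxid` (and therefore database age) down incorrectly.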
    • Add pg_regress --exclude-tests option · c1417c9e
      Committed by Jimmy Yih
      When running pg_regress, sometimes you may want to exclude certain
      tests from your schedule file from running. Before, you would have to
      modify the schedule file to comment/ignore out the unwanted test. Now
      we can do it from command line with --exclude-tests option that is
      space and comma delimited.
      
      Example:
      ./pg_regress --exclude-tests="test1 test2,test3 ... testN"
      
      Authors: Abhijit Subramanya and Jimmy Yih
      c1417c9e
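      The space- and comma-delimited list might be handled roughly like this (illustrative Python; pg_regress itself is C, and these helper names are assumptions):

```python
import re

def parse_exclude_tests(spec):
    """Split a space- and/or comma-delimited test list into names."""
    return [name for name in re.split(r"[,\s]+", spec) if name]

def filter_schedule(schedule_tests, exclude_spec):
    """Drop excluded tests from a schedule's test list."""
    excluded = set(parse_exclude_tests(exclude_spec))
    return [t for t in schedule_tests if t not in excluded]
```

      This mirrors the usage example above: the tests named in `--exclude-tests` are simply dropped from the schedule instead of having to be commented out by hand.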
  16. 15 Aug 2017 (2 commits)
    • Speed up and clean up gpdiff. · 275fe5dd
      Committed by Heikki Linnakangas
      * Remove unnecessary "use" directives. Importing modules adds some overhead.
      
      * Use File::Temp instead of POSIX.tmpnam for creating temporary files.
        File::Temp is used in atmsort.pm anyway, so also using it in explain.pm
        doesn't add extra overhead, like the otherwise-unused POSIX module does.
        The File::Temp API actually fits our needs better, so this shortens the
        code. And it allows removing the "use POSIX" directives, which again
        reduces the overhead of starting up the program.
      
      * Use perl's built-in grep function, instead of launching the external
        grep program. Launching a new process is more expensive.
      
      Altogether, these changes shave maybe 5-10 ms off the startup time of
      gpdiff.pl, out of about 60 ms. That's not a huge difference, but every
      little bit helps, and this is nice cleanup in any case.
      275fe5dd
    • Remove unnecessary gdiff/diff check. · 3984e9f5
      Committed by Heikki Linnakangas
      In PostgreSQL, we always just use 'diff', even on Solaris. Should be good
      enough for GPDB too. (We don't officially even support Solaris anymore.)
      3984e9f5
  17. 14 Aug 2017 (2 commits)
    • resgroup: support memory limit & shared quota alteration. · b6d0dc37
      Committed by Ning Yu
      We now support altering a resource group's memory_limit and
      memory_shared_quota with the syntax below:
      
      	ALTER RESOURCE GROUP <group> SET MEMORY_SHARED_QUOTA <value>;
      	ALTER RESOURCE GROUP <group> SET MEMORY_LIMIT <value>;
      
      The new value takes effect immediately if the actual shared memory
      usage is lower than the new value; otherwise the effect is delayed.
      Signed-off-by: Haisheng Yuan <hyuan@pivotal.io>
      b6d0dc37
    • Make ICW pass when resgroup is enabled. · e1eed831
      Committed by Ning Yu
      * resgroup: increase max slots for isolation tests.
      * ICW: ignore resgroup related warnings.
      * ICW: try to load resgroup variant of answers when resgroup enabled.
      * ICW: provide resgroup variant of answers.
      * ICW: check whether resqueue is enabled in UDF.
      * ICR: substitute username in gpconfig output.
      * ICR: explicitly set max_connections.
      * isolation2: increase resgroup concurrency for max_concurrency tests.
      e1eed831
  18. 11 Aug 2017 (1 commit)
  19. 10 Aug 2017 (2 commits)
    • Replace low-level mock test with pg_regress regression test. · 748f8abe
      Committed by Heikki Linnakangas
      One less test program makes the tests go a tiny bit faster. This also
      tests more directly the problem with the GUCs that the low-level function
      was added for, rather than the low-level function itself.
      
      Arguably we don't need a test for any of this at all anymore, as this fix
      for the ALTER USER RESET ALL bug was backported from PostgreSQL a long time
      ago, but at this point, we have caught up with that PostgreSQL code and
      the GPDB code is identical to the upstream. But OTOH, it's easy and quick to
      test as a pg_regress test, so might as well.
      748f8abe
    • Organize all resgroup tests to the same directory · d837c305
      Committed by Pengzhou Tang
      Resource group tests were cluttering up the isolation2 directory. This
      commit moves them all to the same directory to be tidy and organized; it
      also handles the subdirectories in input/output for resgroup.
      d837c305