  1. 22 November 2018, 12 commits
    • Fix confusion with distribution keys of queries with FULL JOINs. · a25e2cd6
      Committed by Heikki Linnakangas
      There was some confusion on how NULLs are distributed, when CdbPathLocus
      is of Hashed or HashedOJ type. The comment in cdbpathlocus.h suggested
      that NULLs can be on any segment. But the rest of the code assumed that
      that's true only for HashedOJ, and that for Hashed, all NULLs are stored
      on a particular segment. There was a comment in cdbgroup.c that said "Or
      would HashedOJ ok, too?"; the answer to that is "No!". Given the comment
      in cdbpathlocus.h, I'm not surprised that the author was not very sure
      about that. Clarify the comments in cdbpathlocus.h and cdbgroup.c on that.
      
      There were a few cases where we got that actively wrong. The
      repartitionPlan() function is used to inject a Redistribute Motion into
      queries used for
      CREATE TABLE AS and INSERT, if the "current" locus didn't match the target
      table's policy. It did not check for HashedOJ. Because of that, if the
      query contained FULL JOINs, NULL values might end up on all segments. Code
      elsewhere, particularly in cdbgroup.c, assumes that all NULLs in a table
      are stored on a single segment, identified by the cdbhash value of a NULL
      datum. Fix that, by adding a check for HashedOJ in repartitionPlan(), and
      forcing a Redistribute Motion.
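      
      To illustrate, a hedged sketch of the failure mode (the table names are
      hypothetical, not from the commit):
      
          CREATE TABLE t_a (id int) DISTRIBUTED BY (id);
          CREATE TABLE t_b (id int) DISTRIBUTED BY (id);
          
          -- A FULL JOIN emits NULLs in both join columns for non-matching
          -- rows, so the join result's locus is HashedOJ: NULLs may sit on
          -- any segment.
          CREATE TABLE t_c AS
              SELECT t_a.id AS a_id, t_b.id AS b_id
              FROM t_a FULL JOIN t_b ON t_a.id = t_b.id
          DISTRIBUTED BY (a_id);
          -- Before the fix, this path did not force a Redistribute Motion
          -- for a HashedOJ locus, so rows with NULL a_id could stay on any
          -- segment, while other code assumed they all live on the segment
          -- a NULL datum hashes to.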
      
      CREATE TABLE AS had a similar problem, in the code to decide which
      distribution key to use, if the user didn't specify DISTRIBUTED BY
      explicitly. The default behaviour is to choose a distribution key that
      matches the distribution of the query, so that we can avoid adding an
      extra Redistribute Motion. After fixing repartitionPlan, there was no
      correctness problem, but if we chose the key based on a HashedOJ locus,
      there is no performance benefit because we'd need a Redistribute Motion
      anyway. So modify the code that chooses the CTAS distribution key to
      ignore HashedOJ.
      
      While we're at it, refactor the code to choose the CTAS distribution key,
      by moving it to a separate function. It had become ridiculously deeply
      indented.
      
      Fixes https://github.com/greenplum-db/gpdb/issues/6154, and adds tests.
      Reviewed-by: Melanie Plageman <mplageman@pivotal.io>
    • Add tests for deriving distribution keys from query in CREATE TABLE AS. · 9457fe71
      Committed by Heikki Linnakangas
      The case where some, but not all, of the query's distribution keys are
      present in the result set was not covered by any existing tests.
      
      Per Paul Guo's observation.
    • Cosmetic fixes in the code to determine distribution key for CTAS. · a5fa3110
      Committed by Heikki Linnakangas
      Fix indentation. In the code that generates a NOTICE, remove an if()
      test for a condition already checked earlier in the function, and use a
      StringInfo for building the string.
    • Remove unused variables, to silence compiler warnings. · 88a185a5
      Committed by Heikki Linnakangas
      These were left behind by commit 7f6066ea.
    • Fix obsolete comment in cdb_build_distribution_pathkeys(). · 8c9c0576
      Committed by Heikki Linnakangas
      It returns a simple list of PathKeys, not a list of lists. The code was
      already changed in the 8.3-era merge of equivalence classes, but we
      neglected to update the comment.
    • Remove dead code in ATExecSetDistributedBy · 50f2e3bb
      Committed by Zhenghua Lyu
      This commit is the first step in refactoring ATExecSetDistributedBy. Its
      main purpose is to remove some dead code in this function; in the
      process we found that some helper functions could also be simplified, so
      that simplification is included in this commit as well.
      
      According to MPP-7770, we should disable changing storage options for now.
      It is ugly to just throw an error on encountering the `appendonly` option
      without removing the code, so this commit removes all the related logic.
      
      Because the WITH clause can only contain reshuffle|reorganize, we only
      build new_rel_opts if the table itself is AO or AOCO; there is no need
      to deduce it from the WITH clause.
      
      We also remove the unnecessary checks at the start of the function,
      because these checks have already been done in `ATPrepCmd`.
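      
      For reference, a sketch of the forms this path still handles after the
      cleanup (hypothetical table and column names):
      
          -- Change the distribution key; data is redistributed as needed.
          ALTER TABLE t1 SET DISTRIBUTED BY (c1);
          -- REORGANIZE is one of the options still accepted in the WITH
          -- clause; it rewrites the table using the current policy.
          ALTER TABLE t1 SET WITH (REORGANIZE=true);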
      
      Co-authored-by: Shuejie Zhang <shzhang@pivotal.io>
    • After applying that commit 22e04dc12df9e0577ba93a75dbef160c8c1ed258, the... · 1136f2fb
      Committed by Shaoqi Bai
      After applying commit 22e04dc12df9e0577ba93a75dbef160c8c1ed258, the
      master will block when the standby master is down.
      
      There are a couple of things that need to be done to unblock the master:
      1. Run gpinitstandby -n to start the standby master back up.
      2. Run psql postgres -c "ALTER SYSTEM SET synchronous_standby_names = '';" and reload the master segment.
      
      Note that the ALTER SYSTEM SET has to be called again to set synchronous_standby_names back to '*' (and master config reloaded) to enable synchronous replication again.
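      
      A sketch of that re-enabling step (the GUC name is from this note; the
      reload via pg_reload_conf() is an assumption, any reload method works):
      
          ALTER SYSTEM SET synchronous_standby_names = '*';
          SELECT pg_reload_conf();  -- reload the master's configuration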
      Thoughts were to make it one step, combined into gpinitstandby -n,
      instead of documenting a multi-step process.
      
      What this commit does is exactly that: make it one step, combined into
      gpinitstandby -n.
      Co-authored-by: Ning Yu <nyu@pivotal.io>
    • Remove Master/Standby SyncRepWait Greenplum hack · 7f6066ea
      Committed by Jimmy Yih
      When the standby master is unavailable, the master will not block on
      commits even when synchronous replication is enabled. This is because we
      have a Greenplum hack which checks if the WAL stream with the standby
      master is valid. If the stream is invalid, the master will quickly skip
      SyncRepWait and continue with its commit.
      
      Remove this hack in order to make Master/Standby and Primary/Mirror WAL
      replication more similar.
      Co-authored-by: Shaoqi Bai <sbai@pivotal.io>
    • Use max numsegments of subpaths for Append node · 1b2f7bcd
      Committed by Ning Yu
      Suppose t1 has numsegments=1 and t2 has numsegments=2; then the query
      below gets an incorrect plan:
      
          explain (costs off) select * from t2 a join t2 b using(c2)
                    union all select * from t1 c join t1 d using(c2);
                                         QUERY PLAN
          ------------------------------------------------------------------------
           Gather Motion 1:1  (slice3; segments: 1)
             ->  Append
                   ->  Hash Join
                         Hash Cond: (a.c2 = b.c2)
                         ->  Redistribute Motion 2:2  (slice1; segments: 2)
                               Hash Key: a.c2
                               ->  Seq Scan on t2 a
                         ->  Hash
                               ->  Redistribute Motion 2:2  (slice2; segments: 2)
                                     Hash Key: b.c2
                                     ->  Seq Scan on t2 b
                   ->  Hash Join
                         Hash Cond: (c.c2 = d.c2)
                         ->  Seq Scan on t1 c
                         ->  Hash
                               ->  Seq Scan on t1 d
           Optimizer: legacy query optimizer
          (17 rows)
      
      slice2 has a 2:2 Redistribute Motion to slice3, but slice3 only has
      1 segment. This is because the Append's numsegments is taken from its
      last subpath.
      
      To fix the issue, we should use the max numsegments of the subpaths for
      the Append.
      
      The issue was already fixed in 39856768;
      we are only adding tests for it now.
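      
      For reference, the partially distributed tables in this example can be
      set up with the gp_debug_numsegments extension from commit 3119009a in
      this same batch (a sketch; assumes the extension is installed):
      
          CREATE EXTENSION gp_debug_numsegments;
          SELECT gp_debug_set_create_table_default_numsegments(1);
          CREATE TABLE t1 (c1 int, c2 int);  -- numsegments=1
          SELECT gp_debug_set_create_table_default_numsegments(2);
          CREATE TABLE t2 (c1 int, c2 int);  -- numsegments=2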
    • New extension to debug partially distributed tables · 3119009a
      Committed by Ning Yu
      Introduced a new debugging extension gp_debug_numsegments to get / set
      the default numsegments when creating tables.
      
      gp_debug_get_create_table_default_numsegments() gets the default
      numsegments.
      
      gp_debug_set_create_table_default_numsegments(text) sets the default
      numsegments in text format; valid values are:
      - 'FULL': all the segments;
      - 'RANDOM': pick a random set of segments each time;
      - 'MINIMAL': the minimal set of segments;
      
      gp_debug_set_create_table_default_numsegments(integer) sets the default
      numsegments directly; the valid range is [1, gp_num_contents_in_cluster].
      
      gp_debug_reset_create_table_default_numsegments(text) or
      gp_debug_reset_create_table_default_numsegments(integer) resets the
      default numsegments to the specified value, and records it so it can be
      reused later.
      
      gp_debug_reset_create_table_default_numsegments() resets the default
      numsegments to the value passed last time; if there has been no previous
      call to it, the value is 'FULL'.
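      
      A usage sketch based on the functions above (assumes the extension is
      installed via CREATE EXTENSION):
      
          CREATE EXTENSION gp_debug_numsegments;
          SELECT gp_debug_set_create_table_default_numsegments('MINIMAL');
          CREATE TABLE t1 (c1 int);  -- created on the minimal set of segments
          SELECT gp_debug_get_create_table_default_numsegments();
          SELECT gp_debug_reset_create_table_default_numsegments();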
      
      Refactored ICG test partial_table.sql to create partial tables with this
      extension.
    • Fix dynahash HASH_ENTER usage · e576c0b9
      Committed by Daniel Gustafsson
      rel_partitioning_is_uniform() and addMCVToHashTable() inserted with
      HASH_ENTER, and subsequently checked the return value for NULL in
      order to error out on "out of memory". HASH_ENTER, however, doesn't
      return if it couldn't insert and will error out itself, so remove the
      test, as that case cannot happen.
      
      groupHashNew() was using HASH_ENTER_NULL, which does return NULL in
      out-of-memory situations, but it failed to correctly handle the
      return value and dereferenced it without a check, risking a null
      pointer dereference under memory pressure. Fix by using HASH_ENTER
      instead, as the code clearly expects that behavior.
      Reviewed-by: Paul Guo <paulguo@gmail.com>
    • Another attempt at fixing Assertion on empty AO tables. · 77ac9bdf
      Committed by Heikki Linnakangas
      Commit cc2e211f attempted to silence the assertion in
      vac_update_relstats(), but the assertion it in turn added was hit
      heavily. The vacuum_appendonly_fill_stats() function, where I added the
      check for the zero-pages and non-zero-tuples combination, is also
      reached in QD mode, contrary to the comments and the assertion that I
      added. I'm not sure why we look at the totals in QD mode - AFAICS we
      just throw them away - but I'm reluctant to start restructuring this
      code right now. So move the code that zaps reltuples to 0 into
      vac_update_relstats().
      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
  2. 21 November 2018, 11 commits
    • Set statistics to zero when no sampled rows · fb537b53
      Committed by Daniel Gustafsson
      In the unlikely event that we reach this codepath with a samplerows
      value of zero, avoid performing a division by zero and instead set the
      null fraction to zero, as we clearly don't have any more information to
      go on. The HLL code calls the compute_stats function pointer with zero
      samplerows, and while that's using a different compute_stats function,
      it's an easy mistake to make when not all such functions can handle a
      samplerows of zero. This is defensive programming, prompted by a report
      that triggered an old bug like this without actually hitting this one,
      but there is little reason to take the risk of a crash. Suspenders go
      well with belts.
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
    • Fix assertion failure when vacuuming an AO table in utility mode. · cc2e211f
      Committed by Heikki Linnakangas
      There's an assertion in vac_update_relstats() that if num_tuples is
      non-zero, num_pages must also be non-zero. Makes sense: if the table
      takes up no space, there can't be any tuples in it. But we hit that case
      on vacuum of an AO table in the QD node in utility mode. The QD has the
      total tuple counts of each AO segment in the pg_aoseg table, across the
      whole cluster, but it doesn't have the physical sizes. So it reported
      N tuples, but 0 bytes.
      
      We started hitting that assertion after commit c0ce2eb9, which fixed the
      rounding in RelationGuessNumberOfBlocks() so that it now returns 0
      blocks for a 0-byte relation. It used to return 1, which masked the
      problem.
      
      Fix by reporting 0 tuples and 0 blocks in the QD. That's a change from the
      old behaviour, which was to report N tuples and 1 block, but it's
      consistent with heap tables.
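      
      A reproducer sketch for the conditions described above (hypothetical
      table name; assumes the VACUUM runs in a utility-mode connection to
      the QD):
      
          CREATE TABLE ao_t (a int) WITH (appendonly=true) DISTRIBUTED BY (a);
          INSERT INTO ao_t SELECT generate_series(1, 100);
          -- On the QD, pg_aoseg has cluster-wide tuple counts but no
          -- physical sizes, so this used to report N tuples and 0 pages.
          VACUUM ao_t;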
      Reviewed-by: Jacob Champion <pchampion@pivotal.io>
      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
    • docs - update docs with PG 8.4 merge, iteration 3 (#6269) · b7756c77
      Committed by Chuck Litzell
      * Update DITA docs with Postgres 8.4 merge, iteration 3 changes to SGML
      files. (PR #3780)
      
      * Fix example syntax
      
      * Add missing <p> codes and move into sectiondiv
    • docs - using dblink as a non-superuser (#6270) · 5e2dc9bd
      Committed by Mel Kiyama
      * docs - using dblink as a non-superuser
      
      - update installing dblink on 6.0 - uses CREATE EXTENSION
      - clarify some dblink_connect() information
      - add using dblink_connect_u() as a non-superuser
      
      * docs - using dblink as a non-superuser - review comment updates
      
      * docs - using dblink as a non-superuser - more review comment updates
    • docs - gpcopy - add option --dest-dbname (#6276) · 66737c75
      Committed by Mel Kiyama
      * docs - gpcopy - add option --dest-dbname
      
      Also, add that the list of database names must be enclosed in quotes.
      
      * docs - gpcopy --dest-dbname review comment updates.
      Also, changed the list of multiple db names to specify no spaces between names.
      
      * docs - gpcopy --dest-dbname - fix typo
    • pg_rewind: Avoid need for empty callback test function. · a1d62a46
      Committed by Ashwin Agrawal
      Melanie suggested using
      `declare -F <function_name> > /dev/null && <function_name>` in the
      pg_rewind tests.
      
      With -F, declare prints only the name of the function, and the exit code
      is non-zero if the function is not defined; and-ing that with a call to
      the function means the function only runs in run_test if it is defined.
      This helps avoid coding empty functions when a test doesn't need an
      implementation.
      Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
    • pg_rewind: handle AO and CO tables. · 97a76214
      Committed by Ashwin Agrawal
      Copy the tail end of an AO file from source to target (with some
      exceptions) if an xlog record for it is found on the target after
      divergence. COPY_TAIL is used because AO inserts are not of fixed size,
      so the best we can do is copy from the modification offset to the end
      of the file.
      
      The truncate record for AO is ignored, as for heap, for the same
      reasons.
      
      Tests are added for AO and CO tables to validate that COPY and COPY_TAIL
      are performed correctly.
      Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
      Co-authored-by: Ekta Khanna <ekhanna@pivotal.io>
    • Fix pg_rewind to copy BTREE_METAPAGE for newroot. · 0a08f319
      Committed by Ashwin Agrawal
      The XLOG_BTREE_NEWROOT xlog record type doesn't log the update to the
      metapage, so pg_rewind misses rewinding the metapage change. The redo
      routine handling XLOG_BTREE_NEWROOT implicitly updates the metapage,
      which is always hard-coded to block zero.
      Co-authored-by: David Kimura <dkimura@pivotal.io>
      Co-authored-by: Ekta Khanna <ekhanna@pivotal.io>
    • pg_rewind tests: promote and validate master after rewind. · adb10fb2
      Committed by Ashwin Agrawal
      Enhance the test framework to promote the old master after pg_rewind and
      actually run queries against it. Previously, queries were run only
      against the standby, which didn't undergo pg_rewind, so the tests
      basically missed validating whether pg_rewind did its job correctly.
      Co-authored-by: Ekta Khanna <ekhanna@pivotal.io>
  3. 20 November 2018, 10 commits
  4. 19 November 2018, 7 commits
    • Use block-level sampling for ANALYZE on heap tables. · baa76536
      Committed by Heikki Linnakangas
      The user-visible feature here is to speed up ANALYZE on heap tables, by
      using the block-level sampling method from PostgreSQL, instead of
      scanning the whole table. This refactors the ANALYZE code to make it
      possible to call the upstream function for block-level sampling again.
      
      A secondary feature is that this enables the ANALYZE callback for
      Foreign Data Wrappers, introduced in PostgreSQL 9.2. It was disabled in
      the 9.2 merge because the function signature had been changed in GPDB,
      making it cumbersome to call. This patch reverts the acquire-function's
      signature to the same as in upstream, so the FDW callback works again.
      
      This also moves the logic to decide which rels to analyze, based on the
      optimizer_analyze_root_partition and optimizer_analyze_midlevel_partition
      GUCs and the ROOTPARTITION option, wholly to get_rel_oids() in vacuum.c.
      It was mostly done there already, but there were some warts in analyze.c.
      None of the code in analyze.c should now treat partitions differently
      from normal inherited tables, which helps to minimize the diff vs
      upstream. (Except that the HLL code still heavily uses partition-related
      functions. That's a TODO.)
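      
      A sketch of the knobs involved (the GUC and option names are from this
      message; the table name is hypothetical):
      
          SET optimizer_analyze_root_partition = on;
          SET optimizer_analyze_midlevel_partition = off;
          ANALYZE ROOTPARTITION sales;  -- update only the root's stats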
    • Fix test case for changed data distribution with Jump Consistent hash. · da0f7a74
      Committed by Heikki Linnakangas
      Commit 4a174240 changed the hash method, which changed the distribution
      of data. It adjusted many test cases that were dependent on the data
      distribution, but missed this one. This one isn't failing at the moment,
      but it was clearly meant to look at the relfrozenxid of the segment that
      it inserts all the data into. Change the value of the distribution key
      so that all the data falls into segment 0 again.
    • Fix rounding in RelationGuessNumberOfBlocks(). · c0ce2eb9
      Committed by Heikki Linnakangas
      Doesn't matter much for AO tables, but if this is to be used for heap
      tables, where the relation size is always a multiple of BLCKSZ, it would
      be nice to not be off-by-one.
    • Simplify the data structure to track "too wide" datums slightly. · fe914bcc
      Committed by Heikki Linnakangas
      The per-column 'toowide_cnt' was only used to check whether there are
      any, so it's redundant with the actual flags. Replace the bool array
      with a Bitmapset. The Bitmapset is left NULL when no bits are set, so we
      can use a NULL check in place of checking toowide_cnt.
    • In a database-wide ANALYZE, analyze root partitions last. · ec35f3f9
      Committed by Heikki Linnakangas
      The HLL method of updating the root stats only works if the leaf
      partitions have already been analyzed. So by always analyzing the leaves
      first, we can make use of HLL in updating the root stats. Otherwise,
      it's pure luck whether it works out or not, depending on the order in
      which the leaves and parents are analyzed.
    • Set dbname and username when MyProcPort is not initialized when setup… (#6206) · b7598541
      Committed by Hubert Zhang
      For a background worker, MyProcPort - which is used to set the dbname
      and username when creating a gang - is not initialized. We can use
      MyDatabaseId and AuthenticatedUserId instead, which are initialized in
      InitPostgres() for background workers.