1. 22 Nov 2018 (3 commits)
    • New extension to debug partially distributed tables · 3119009a
      Committed by Ning Yu
      Introduced a new debugging extension gp_debug_numsegments to get / set
      the default numsegments when creating tables.
      
      gp_debug_get_create_table_default_numsegments() gets the default
      numsegments.
      
      gp_debug_set_create_table_default_numsegments(text) sets the default
      numsegments in text format; valid values are:
      - 'FULL': all the segments;
      - 'RANDOM': pick a random set of segments each time;
      - 'MINIMAL': the minimal set of segments;
      
      gp_debug_set_create_table_default_numsegments(integer) sets the default
      numsegments directly, valid range is [1, gp_num_contents_in_cluster].
      
      gp_debug_reset_create_table_default_numsegments(text) or
      gp_debug_reset_create_table_default_numsegments(integer) resets the
      default numsegments to the specified value; the value can be reused
      later.
      
      gp_debug_reset_create_table_default_numsegments() resets the default
      numsegments to the value passed last time; if there has been no previous
      call to it, the value is 'FULL'.
      
      Refactored ICG test partial_table.sql to create partial tables with this
      extension.
    • Fix dynahash HASH_ENTER usage · e576c0b9
      Committed by Daniel Gustafsson
      rel_partitioning_is_uniform() and addMCVToHashTable() inserted with
      HASH_ENTER, and subsequently checked the return value for NULL in
      order to error out on "out of memory". HASH_ENTER, however, doesn't
      return if it couldn't insert and will error out itself, so remove the
      check as that case cannot happen.
      
      groupHashNew() was using HASH_ENTER_NULL, which does return NULL in
      out-of-memory situations, but it failed to handle the return value
      correctly and dereferenced it without a check, risking a null pointer
      dereference under memory pressure. Fix by using HASH_ENTER instead, as
      the code clearly expects that behavior.
      Reviewed-by: Paul Guo <paulguo@gmail.com>
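      The difference between the two modes can be illustrated with a small
      stand-alone C sketch; `enter_null()` and `enter()` below are
      hypothetical stand-ins for dynahash's hash_search() under
      HASH_ENTER_NULL and HASH_ENTER, not the real API:

      ```c
      #include <assert.h>
      #include <stdio.h>
      #include <stdlib.h>

      /* Stand-in for HASH_ENTER_NULL behavior: returns NULL instead of
       * erroring out when the insert fails, so the caller MUST check. */
      static int *enter_null(int fail)
      {
          if (fail)
              return NULL;
          return calloc(1, sizeof(int));
      }

      /* Stand-in for HASH_ENTER behavior: never returns NULL; on failure
       * it reports the error and does not return to the caller (like
       * ereport(ERROR) in Postgres). */
      static int *enter(int fail)
      {
          if (fail)
          {
              fprintf(stderr, "out of memory\n");
              exit(1);
          }
          return calloc(1, sizeof(int));
      }

      int main(void)
      {
          /* HASH_ENTER_NULL style: check before dereferencing, which is
           * the check groupHashNew() was missing. */
          int *e = enter_null(0);
          if (e == NULL)
              return 1;
          *e = 42;
          assert(*e == 42);

          /* HASH_ENTER style: a NULL check after the call is dead code,
           * which is why the checks could simply be removed. */
          int *f = enter(0);
          *f = 7;
          assert(*f == 7);
          free(e);
          free(f);
          return 0;
      }
      ```

      The fix in the commit goes the other way for groupHashNew(): since the
      code never handled NULL, switching to the erroring mode matches its
      expectations.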
    • Another attempt at fixing Assertion on empty AO tables. · 77ac9bdf
      Committed by Heikki Linnakangas
      Commit cc2e211f attempted to silence the assertion in
      vac_update_relstats(), but the assertion it added in turn was hit
      heavily. The vacuum_appendonly_fill_stats() function, where I added the
      check for the zero-pages, non-zero-tuples combination, is also reached
      in QD mode, contrary to the comments and the assertion that I added.
      I'm not sure why we look at the totals in QD mode - AFAICS we just
      throw them away - but I'm reluctant to start restructuring this code
      right now. So move the code that zaps reltuples to 0 into
      vac_update_relstats().
      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
  2. 21 Nov 2018 (11 commits)
    • Set statistics to zero when no sampled rows · fb537b53
      Committed by Daniel Gustafsson
      In the unlikely event that we reach this codepath with a samplerows
      value of zero, avoid performing a division by zero and instead set the
      null fraction to zero, as we clearly don't have any more information
      to go on. The HLL code calls the compute_stats function pointer with
      zero samplerows, and while that uses a different compute_stats
      function, it's an easy mistake to make when not all functions can
      handle a division by zero. This is defensive programming, prompted by
      a report of an old bug like this that didn't actually hit this path,
      but there is little reason to take the risk of a crash. Suspenders go
      well with belts.
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
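      The guard can be sketched in plain C (the helper name and signature
      are illustrative, not the actual analyze code):

      ```c
      #include <assert.h>

      /* Compute the null fraction of a sample, defending against a
       * zero-sized sample instead of dividing by zero. */
      static double null_fraction(long null_cnt, long samplerows)
      {
          if (samplerows == 0)
              return 0.0;     /* no information to go on: report zero */
          return (double) null_cnt / samplerows;
      }

      int main(void)
      {
          assert(null_fraction(0, 0) == 0.0);   /* guarded path */
          assert(null_fraction(5, 10) == 0.5);
          return 0;
      }
      ```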
    • Fix assertion failure when vacuuming an AO table in utility mode. · cc2e211f
      Committed by Heikki Linnakangas
      There's an assertion in vac_update_relstats(), that if num_tuples is
      non-zero, num_pages must also be non-zero. Makes sense: if the table takes
      up no space, there can't be any tuples in it. But we hit that case on
      vacuum of an AO table in the QD node in utility mode. The QD has the total
      tuple counts in each AO segment in the pg_aoseg table, across the whole
      cluster, but it doesn't have the physical sizes. So it reported N tuples,
      but 0 bytes.
      
      We started hitting that assertion after commit c0ce2eb9, which fixed
      the rounding in RelationGuessNumberOfBlocks(), so that it now returns
      0 blocks for a 0-byte relation. It used to return 1, which masked the
      problem.
      
      Fix by reporting 0 tuples and 0 blocks in the QD. That's a change from the
      old behaviour, which was to report N tuples and 1 block, but it's
      consistent with heap tables.
      Reviewed-by: Jacob Champion <pchampion@pivotal.io>
      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
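      The fixed reporting rule can be sketched as follows (a hypothetical
      helper, not the actual vac_update_relstats() signature): if the QD has
      no physical size for the relation, report 0 tuples as well, keeping
      the "non-zero tuples implies non-zero pages" assertion satisfied.

      ```c
      #include <assert.h>

      struct relstats
      {
          long   num_pages;
          double num_tuples;
      };

      /* Sketch: zap the tuple count when the relation reports 0 pages,
       * as the QD does for an AO table whose physical size it lacks. */
      static struct relstats qd_report(long num_pages, double num_tuples)
      {
          struct relstats s = { num_pages, num_tuples };

          if (s.num_pages == 0)
              s.num_tuples = 0;   /* 0-byte relation: report 0 tuples */
          assert(s.num_tuples == 0 || s.num_pages > 0);
          return s;
      }

      int main(void)
      {
          /* QD utility-mode case: N tuples known, 0 bytes on disk. */
          assert(qd_report(0, 1000).num_tuples == 0);
          assert(qd_report(4, 1000).num_tuples == 1000);
          return 0;
      }
      ```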
    • docs - update docs with PG 8.4 merge, iteration 3 (#6269) · b7756c77
      Committed by Chuck Litzell
      * Update DITA docs with Postgres 8.4 merge, iteration 3 changes to SGML files.
      (PR #3780)
      
      * Fix example syntax
      
      * Add missing <p> tags and move them into sectiondiv
    • docs - using dblink as a non-superuser (#6270) · 5e2dc9bd
      Committed by Mel Kiyama
      * docs - using dblink as a non-superuser
      
      - update installing dblink on 6.0 - uses CREATE EXTENSION
      - clarify some dblink_connect() information
      - add using dblink_connect_u() as a non-superuser
      
      * docs - using dblink as a non-superuser - review comment updates
      
      * docs - using dblink as a non-superuser - more review comment updates
    • docs - gpcopy - add option --dest-dbname (#6276) · 66737c75
      Committed by Mel Kiyama
      * docs - gpcopy - add option --dest-dbname
      
      Also, add that a list of database names must be enclosed in quotes.
      
      * docs - gpcopy --dest-dbname review comment updates.
      Also, changed the list of multiple db names to specify no spaces between names.
      
      * docs - gpcopy --dest-dbname - fix typo
    • pg_rewind: Avoid need for empty callback test function. · a1d62a46
      Committed by Ashwin Agrawal
      Melanie suggested using
      `declare -F <function_name> > /dev/null && <function_name>` in the pg_rewind test.

      `declare -F` prints only the name of the function, and its exit code
      is non-zero if the function is not defined; by and-ing it with the
      function call, the function only runs in run_test if it is defined.
      This helps avoid coding empty functions when a test doesn't need an
      implementation.
      Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
    • pg_rewind: handle AO and CO tables. · 97a76214
      Committed by Ashwin Agrawal
      Copy the tail end of an AO file from source to target (with some
      exceptions) if an xlog record for it is found on the target after
      divergence. COPY_TAIL is used because AO inserts are not of fixed
      size, so the best we can do is copy to the end of the file, starting
      from the modification offset.

      The truncate record for AO is ignored, similar to heap, for the same
      reasons.

      A test is added for AO and CO tables to validate that COPY and
      COPY_TAIL are performed correctly.
      Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
      Co-authored-by: Ekta Khanna <ekhanna@pivotal.io>
    • Fix pg_rewind to copy BTREE_METAPAGE for newroot. · 0a08f319
      Committed by Ashwin Agrawal
      The XLOG_BTREE_NEWROOT xlog record type doesn't log the update to the
      metapage, so pg_rewind misses rewinding the change to the metapage.
      The redo routine handling XLOG_BTREE_NEWROOT implicitly updates the
      metapage, which is always hard-coded to block zero.
      Co-authored-by: David Kimura <dkimura@pivotal.io>
      Co-authored-by: Ekta Khanna <ekhanna@pivotal.io>
    • pg_rewind tests: promote and validate master after rewind. · adb10fb2
      Committed by Ashwin Agrawal
      Enhance the test framework to promote the old master after pg_rewind
      and actually run queries against it. Previously, queries were run only
      against the standby, which didn't undergo pg_rewind, so the tests
      basically missed validating whether pg_rewind did its job correctly.
      Co-authored-by: Ekta Khanna <ekhanna@pivotal.io>
  3. 20 Nov 2018 (10 commits)
  4. 19 Nov 2018 (12 commits)
    • Use block-level sampling for ANALYZE on heap tables. · baa76536
      Committed by Heikki Linnakangas
      The user-visible feature here is to speed up ANALYZE on heap tables,
      by using the block-level sampling method from PostgreSQL, instead of
      scanning the whole table. This refactors the ANALYZE code to make it
      possible to call the upstream block-level sampling function again.
      
      A secondary feature is that this enables the ANALYZE callback for
      Foreign Data Wrappers, introduced in PostgreSQL 9.2. It was disabled
      in the 9.2 merge because the function signature had been changed in
      GPDB, making it cumbersome to call. This patch reverts the acquire
      function's signature to match upstream, so the FDW callback now works
      again.
      
      This also moves the logic that decides which rels to analyze, based on
      the optimizer_analyze_root_partition and
      optimizer_analyze_midlevel_partition GUCs and the ROOTPARTITION
      option, wholly into get_rel_oids() in vacuum.c. It was mostly done
      there already, but there were some warts in analyze.c. None of the
      code in analyze.c should now treat partitions differently from normal
      inherited tables, which helps to minimize the diff vs upstream.
      (Except that the HLL code still heavily uses partition-related
      functions. That's a TODO.)
    • Fix test case for changed data distribution with Jump Consistent hash. · da0f7a74
      Committed by Heikki Linnakangas
      Commit 4a174240 changed the hash method, which changed the
      distribution of data. It adjusted many test cases that were dependent
      on the data distribution, but missed this one. This one isn't failing
      at the moment, but it was clearly meant to look at the relfrozenxid of
      the segment that all the data is inserted to. Change the value of the
      distribution key so that all the data falls into segment 0 again.
    • Fix rounding in RelationGuessNumberOfBlocks(). · c0ce2eb9
      Committed by Heikki Linnakangas
      Doesn't matter much for AO tables, but if this is to be used for heap
      tables, where the relation size is always a multiple of BLCKSZ, it
      would be nice to not be off by one.
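      The fix amounts to ceiling division; a minimal sketch follows (the
      helper name is illustrative rather than the actual function, and
      BLCKSZ is assumed to be the 32 kB default):

      ```c
      #include <assert.h>

      #define BLCKSZ 32768L   /* assumed default block size */

      /* Ceiling division: blocks needed for 'size' bytes. Returns 0 for
       * a 0-byte relation; per the commit message, the old rounding
       * returned 1 block even in that case, masking the empty-AO-table
       * assertion problem. */
      static long guess_blocks(long size)
      {
          return (size + BLCKSZ - 1) / BLCKSZ;
      }

      int main(void)
      {
          assert(guess_blocks(0) == 0);          /* empty relation */
          assert(guess_blocks(1) == 1);
          assert(guess_blocks(BLCKSZ) == 1);     /* exact multiple */
          assert(guess_blocks(BLCKSZ + 1) == 2);
          return 0;
      }
      ```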
    • Simplify the data structure to track "too wide" datums slightly. · fe914bcc
      Committed by Heikki Linnakangas
      The per-column 'toowide_cnt' was only used to check if there are any,
      so it's redundant with the actual flags. Replace the bool array with a
      Bitmapset. The Bitmapset is left NULL when no bits are set, so we can
      use a NULL check in place of checking toowide_cnt.
    • In a database-wide ANALYZE, analyze root partitions last. · ec35f3f9
      Committed by Heikki Linnakangas
      The HLL method of updating the root stats only works if the leaf
      partitions have already been analyzed. So by always analyzing the
      leaves first, we can make use of HLL when updating the root stats.
      Otherwise, it's pure luck whether it works out or not, depending on
      the order in which the leaves and parents are analyzed.
    • Set dbname and username when MyProcPort is not initialized when setup… (#6206) · b7598541
      Committed by Hubert Zhang
      For a background worker, MyProcPort, which is used to set the dbname
      and username when creating a gang, is not initialized. We can use
      MyDatabaseId and AuthenticatedUserId instead, which are initialized in
      InitPostgres() for background workers.
    • Correct behavior for reshuffling partition tables · 8bf413d6
      Committed by ZhangJackey
      The previous code generated UPDATE statements for the root and its
      child partitions when reshuffling a partition table. That not only
      involves redundant work but also leads to an error when reshuffling a
      two-level partition table (because the mid-level partitions have no
      data).

      The commit does the following work:

      * Only generate an UPDATE statement for a leaf partition or a
        non-partitioned table.
      * Refactor the reshuffle test cases: remove the Python UDF code and
        use `gp_execute_on_server` and `gp_dist_random` to test replicated
        tables.

      Co-authored-by: Shujie Zhang <shzhang@pivotal.io>
      Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
    • Corresponding test changes to the FDW restriction removal · a352fb9d
      Committed by Adam Lee
      Thanks to "Remove restriction to INSERT/UPDATE/DELETE foreign
      relations (#6202)", we get the right upstream error message back.
    • Remove restriction to INSERT/UPDATE/DELETE foreign relations (#6202) · eea71b3d
      Committed by Francisco Guerrero
      - While spiking on implementing PXF using FDW, we noticed that error
        messages for file_fdw were different from upstream error messages
        when updating a file_fdw table. GPDB introduced a check in
        `setTargetTable` for foreign relations that might not have been
        cleaned up during the merge. This PR removes the check in
        `setTargetTable` and fixes the expected output in file_fdw. Running
        make installcheck on file_fdw now succeeds, as we are matching the
        upstream error message "updates aren't supported".
      Authored-by: Francisco Guerrero <aguerrero@pivotal.io>
    • file_fdw changes on the mpp FDW · fc5a7d74
      Committed by Adam Lee
      Forbid `mpp_execute 'any'`, pass `on_segment` for 'all segments'.
    • Add support for executing foreign tables on master, any or all segments · 3c6c6ab2
      Committed by Adam Lee
      This commit adds support for the option `mpp_execute 'MASTER | ANY |
      ALL SEGMENTS'` on foreign tables.

      MASTER is the default: the FDW requests data from the master.

      With ANY, the FDW requests data from the master or any one segment,
      depending on which path costs less.

      With ALL SEGMENTS, the FDW requests data from all segments; wrappers
      need to have a policy matching the segments to the data.

      For instance, file_fdw probes the mpp_execute value, then loads
      different files based on the segment number. But something like
      gpfdist on the foreign side doesn't need this; it hands out a
      different slice of the data to each request, so all segments can
      request the same location.
  5. 18 Nov 2018 (1 commit)
  6. 17 Nov 2018 (3 commits)