  1. 17 July 2018 (7 commits)
    • tuptoaster: gracefully handle changes to TOAST_MAX_CHUNK_SIZE · 28acba6e
      Committed by Jacob Champion
      GPDB 4.3 uses a slightly smaller TOAST_MAX_CHUNK_SIZE, and we claim to
      be able to upgrade those toast tables, but there's currently no support
      in tuptoaster:
      
      	ERROR:  unexpected chunk size 8138 (expected 8140) in chunk 0 of 4
      	for toast value 16486 in pg_toast_16479 (tuptoaster.c:1840)
      
      Add some tests to flush this out. This adds the
      gp_test_toast_max_chunk_size_override GUC, which allows a superuser to
      manually reduce the chunk size used by toast_save_datum().
      
      We can handle changes to TOAST_MAX_CHUNK_SIZE by taking a look at the
      size of the first chunk. For the full toast_fetch_datum(), we can
      optimistically assume that the max chunk size is TOAST_MAX_CHUNK_SIZE,
      and then adjust if that turns out to be false. For the random access in
      toast_fetch_datum_slice(), though, I haven't tried a similar
      optimization -- if we don't know the chunk size to begin with, it seems
      like there are too many corner cases to keep track of when jumping into
      the middle of a toast relation. So toast_fetch_datum_slice() will always
      open the first chunk, which may somewhat impact performance for some
      queries.
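      A minimal, self-contained sketch of the adjustment idea, using illustrative
      names rather than the actual tuptoaster.c symbols: infer the writer's chunk
      size from chunk 0 instead of trusting the compiled-in constant, then
      recompute how many chunks to expect.

          /* Sketch only: if the value spans more than one chunk, chunk 0 must be
           * full-sized, so its length reveals the max chunk size in effect when
           * the value was toasted. */
          static int32
          inferred_chunk_size(int32 first_chunk_len, int32 total_size)
          {
              return (first_chunk_len < total_size) ? first_chunk_len : total_size;
          }

          static int32
          expected_chunk_count(int32 chunk_size, int32 total_size)
          {
              return (total_size + chunk_size - 1) / chunk_size;   /* round up */
          }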
      28acba6e
    • Remove seqserver process from walrep test. · f8943298
      Committed by Ashwin Agrawal
      f8943298
    • docs - minor edits to docs. · 0f340ded
      Committed by mkiyama
      0f340ded
    • Fix unused qdinfo variable warnings. · 8e61ae80
      Committed by Ashwin Agrawal
      8e61ae80
    • Get sequence code closer to upstream. · e19b3a2c
      Committed by Ashwin Agrawal
      This patch attempts to get the sequence code as close to upstream as
      possible. There is still a mix of code from various upstream versions due to
      past cherry-picks for bug fixes, so it is hard to state which exact upstream
      version it matches, but at least the GPDB-specific modifications are
      minimized and unnecessary code movement has been removed.
      e19b3a2c
    • Refactor of Sequence handling. · eae1ee3f
      Committed by David Kimura
      Handling sequences in an MPP setting is challenging. This patch refactors
      sequence handling mainly to eliminate the shortcomings and pitfalls of the
      previous implementation. First, a glance at the issues with the older
      implementation:
      
        - required relfilenode == oid for all sequence relations
        - "denial of service" due to dedicated but single process running on QD to
          serve sequence values for all the tables in all the databases to all the QEs
        - sequence tables direct opened as a result instead of flowing throw relcache
        - divergence from upstream implementation
      
      Many solutions were considered (see the mailing list discussion) before
      settling on the one here. The new implementation still uses a centralized
      place, the QD, to generate sequence values, but it now leverages the
      existing QD backend process connected to the QEs for the query to serve
      nextval requests. As a result, the need for relfilenode == oid is
      eliminated: given the oid, the QD process can look up the relfilenode from
      the catalog and also use the relcache. There are no more direct opens by a
      single process across databases.
      
      For communication between the QD and QEs for sequence nextval requests, an
      async notify message is used (notify messages are not used in GPDB for
      anything else currently). The QD process is otherwise idle while waiting
      for results from the QEs, so on seeing a nextval request it calls
      `nextval_internal()` and responds with the value.
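      A rough sketch of the QD-side idea, assuming hypothetical helpers
      (parse_nextval_request, send_nextval_reply) and simplified signatures; the
      actual dispatcher code differs:

          /* Sketch: while idle waiting on a QE connection, service any pending
           * nextval request that arrived as an async notification. */
          PGnotify *n = PQnotifies(qe_conn);
          if (n != NULL)
          {
              Oid   seqrelid = parse_nextval_request(n);   /* hypothetical decode */
              int64 value = nextval_internal(seqrelid);    /* generate on the QD */

              send_nextval_reply(qe_conn, value);          /* hypothetical reply path */
              PQfreemem(n);
          }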
      
      Since the need for a separate sequence server process went away, all of its
      code is removed.
      
      Discussion:
      https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/hni7lS9xH4c/o_M3ddAeBgAJ
      Co-authored-by: Asim R P <apraveen@pivotal.io>
      Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
      eae1ee3f
    • Clear all the notify messages in cdbconn_discardResults() · 334252c0
      Committed by Ashwin Agrawal
      Currently, notify messages are not used by QEs in GPDB except for sequence
      nextval messages. While cleaning a connection for reuse, it is best to
      remove all pending notify messages as well, to be safe rather than sorry if
      we later start using them for more things.
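      A minimal sketch of the draining idea using standard libpq calls
      (PQconsumeInput, PQnotifies, PQfreemem); the actual cdbconn_discardResults()
      code differs in its surroundings:

          /* Sketch: discard any pending async notifications before the
           * connection goes back into the pool. */
          PGnotify *notify;

          (void) PQconsumeInput(conn);          /* pull any buffered input */
          while ((notify = PQnotifies(conn)) != NULL)
              PQfreemem(notify);                /* we don't need them */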
      334252c0
  2. 14 July 2018 (4 commits)
  3. 13 July 2018 (14 commits)
    • Fix resource group bypass test case (#5278) · 4fb21650
      Committed by Jialun
      - Remove async session before CREATE FUNCTION
      - Change comment format from -- to /* */
      4fb21650
    • Fix resgroup bypass quota (#5262) · dc82ceea
      Committed by Jialun
      * Add resource group bypass memory limit.
      
      - A bypassed query allocates all of its memory from the group / global
        shared memory, and is also enforced by the group's memory limit;
      - A bypassed query also has a memory limit of 10 chunks per process;
      
      * Add test cases for the resgroup bypass memory limit.
      
      * Provide ORCA answer file.
      
      * Adjust memory limit on QD.
      dc82ceea
    • docs - gpbackup S3 plugin support for S3 compatible data stores. (#5233) · 13fa287c
      Committed by Mel Kiyama
      * docs - gpbackup S3 plugin - support for S3 compatible data stores.
      
      Link to HTML format on GPDB docs review site.
      http://docs-gpdb-review-staging.cfapps.io/review/admin_guide/managing/backup-s3-plugin.html
      
      * docs - gpbackup S3 plugin - review comment updates
      
      * docs - gpbackup S3 plugin - Add OSS/Pivotal support information.
      
      * docs - gpbackup S3 plugin - fix typos.
      
      * docs - gpbackup S3 plugin - updated information about S3 compatible data stores.
      13fa287c
    • docs - gpbackup API - update for scope argument. (#5267) · 5a5f03cd
      Committed by Mel Kiyama
      * docs - gpbackup API - update for scope argument.
      
      This will be ported to 5X_STABLE
      
      * docs - gpbackup API - correct API scope description based on review comments.
      
      * docs - gpbackup API - edit scope description based on review comments.
      
      --Updated API version
      5a5f03cd
    • Docs: Edit pxf filter pushdown docs (#5219) · 1235760a
      Committed by David Yozie
      * fix problematic xrefs
      
      * Consistency edits for list items
      
      * edit, relocate filter pushdown section
      
      * minor edits to guc description
      
      * remove note about non-support in Hive doc
      
      * Edits from Lisa's review
      
      * Adding note about experimental status of HBase connector pushdown
      
      * Adding note about experimental status of Hive connector pushdown
      
      * Revert "Adding note about experimental status of Hive connector pushdown"
      
      This reverts commit 43dfe51526e19983835f7cbd25d540d3c0dec4ba.
      
      * Revert "Adding note about experimental status of HBase connector pushdown"
      
      This reverts commit 3b143de058c7403c2bc141c11c61bf227c2abf3a.
      
      * restoring HBase, Hive pushdown support
      
      * slight wording change
      
      * adding xref
      1235760a
    • Revert "Use -j4 for make unitest-check." · 301f3512
      Committed by Ashwin Agrawal
      This reverts commit 5ca65bed. It seems we are not ready for parallel
      compilation of the unit tests yet; CI failed with odd-looking errors.
      301f3512
    • Use -j4 for make unitest-check. · 5ca65bed
      Committed by Ashwin Agrawal
      Let's run the unit tests faster; not sure why we didn't do this until now.
      5ca65bed
    • Fix warnings flagged by gcc. · ea59fd52
      Committed by Ashwin Agrawal
      ea59fd52
    • Fix various compiler warnings (flagged by clang). · 8e6f36ba
      Committed by Ashwin Agrawal
      These are mostly unused variables and incorrect pointer assignments.
      8e6f36ba
    • Increase PR Pipeline ICW timeout values · 12b8fe81
      Committed by Jimmy Yih
      The PR Pipeline ICW jobs run for quite a long time most likely due to
      some weird container placement strategy (maybe placing them all on the
      same worker instead of distributing the workload).  Although it would
      be nice to catch small or medium performance regressions in PR
      Pipeline, that is not doable when using containers as the run time
      variance can be very wide.  So let's bump these values up to just
      prevent PR Pipeline hangs or serious performance regressions.
      
      [ci skip]
      12b8fe81
    • Initialize sequence page under buffer content lock · d331cc78
      Committed by Asim R P
      Shared buffer access rules mandate that both a pin and a content lock in
      exclusive mode are needed to update a shared buffer.
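      A minimal sketch of the required pattern using the standard buffer manager
      calls, assuming rel and blkno are in scope (the real sequence-page
      initialization also involves a critical section and WAL logging):

          /* Sketch: hold both a pin and the exclusive content lock while writing. */
          Buffer buf = ReadBuffer(rel, blkno);        /* ReadBuffer pins the page */

          LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);     /* content lock for updates */
          PageInit(BufferGetPage(buf), BufferGetPageSize(buf), 0);
          MarkBufferDirty(buf);
          UnlockReleaseBuffer(buf);                   /* drop the lock, then the pin */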
      d331cc78
    • Fix race condition in DELETE RETURNING. · 3637c5d4
      Committed by Tom Lane
      When RETURNING is specified, ExecDelete would return a virtual-tuple slot
      that could contain pointers into an already-unpinned disk buffer.  Another
      process could change the buffer contents before we get around to using the
      data, resulting in garbage results or even a crash.  This seems of fairly
      low probability, which may explain why there are no known field reports of
      the problem, but it's definitely possible.  Fix by forcing the result slot
      to be "materialized" before we release pin on the disk buffer.
      
      Back-patch to 9.0; in earlier branches there is no bug because
      ExecProcessReturning sent the tuple to the destination immediately.  Also,
      this is already fixed in HEAD as part of the writable-foreign-tables patch
      (where the fix is necessary for DELETE RETURNING to work at all with
      postgres_fdw).
      3637c5d4
    • Shared buffer should not be accessed after it is unpinned · 942c67e3
      Committed by Asim R P
      We seemed to be doing exactly that in this case. This was caught by
      enabling the memory_protect_buffer_pool GUC.
      942c67e3
    • Update gpsd and minirepro to capture HLL counter · 4cdfd41a
      Committed by Ekta Khanna
      Prior to this commit, the minirepro and gpsd utilities captured the HLL
      counter from the `pg_statistic` table as `int[]` instead of `bytea[]`,
      which caused errors when trying to load it. This commit fixes that.
      4cdfd41a
  4. 12 July 2018 (11 commits)
  5. 11 July 2018 (4 commits)
    • Improve handling of rd_cdbpolicy. · 0bfc7251
      Committed by Ashwin Agrawal
      Pointers from a Relation object need to be handled with special care, as
      holding a refcount on the object does not mean the object is not modified.
      When a cache invalidation message is handled, the Relation object gets
      *rebuilt*. During the rebuild, the only guarantee maintained is that the
      Relation object's address will not change; the memory addresses inside the
      Relation object are freed, freshly allocated, and populated with the latest
      data from the catalog.
      
      For example, the code sequence below is dangerous:
      
          rel->rd_cdbpolicy = original_policy;
          GpPolicyReplace(RelationGetRelid(rel), original_policy);
      
      If a relcache invalidation message is served after assigning the value to
      rd_cdbpolicy, the rebuild will free the memory behind rd_cdbpolicy (which at
      that point is original_policy) and replace it with the current contents of
      gp_distribution_policy. So GpPolicyReplace(), called with original_policy,
      will access freed memory. In addition, rd_cdbpolicy will hold a stale value
      in the cache rather than the intended refreshed value. This issue was hit in
      CI a few times and reproduces with higher frequency with
      `-DRELCACHE_FORCE_RELEASE`.
      
      Hence, this patch fixes all uses of rd_cdbpolicy to read the pointer
      directly from the Relation object, and to update the catalog first before
      assigning the value to rd_cdbpolicy.
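      A minimal sketch of the safer ordering, assuming a GpPolicyCopy-style helper
      and CacheMemoryContext as the target context (exact helpers and signatures
      vary across GPDB versions):

          /* Sketch: persist to the catalog first, then point rd_cdbpolicy at a
           * copy owned by the relcache, so a concurrent rebuild cannot leave the
           * caller holding freed memory. */
          GpPolicyReplace(RelationGetRelid(rel), new_policy);
          rel->rd_cdbpolicy = GpPolicyCopy(CacheMemoryContext, new_policy);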
      0bfc7251
    • Fix duplicate distributed keys for CTAS · 7680b762
      Committed by Pengzhou Tang
      To keep it consistent with the CREATE TABLE syntax, CTAS should also
      disallow duplicate distribution keys; otherwise backup and restore will get
      messed up.
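      For illustration (table and column names are made up), a statement of this
      shape is now rejected for CTAS just as it already is for CREATE TABLE:

          -- Duplicate column in the distribution key: now an error for CTAS too.
          CREATE TABLE t_copy AS SELECT * FROM t DISTRIBUTED BY (id, id);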
      7680b762
    • Fix alter table CLI help documentation · 30617fe0
      Committed by Asim R P
      In ALTER TABLE SET DISTRIBUTED BY, the "WITH(reorganize=...)" option
      must be specified before the "DISTRIBUTED ..." clause.
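      For example (table and column names are illustrative), the documented
      ordering is:

          -- WITH (REORGANIZE=...) comes before the DISTRIBUTED BY clause.
          ALTER TABLE sales SET WITH (REORGANIZE=true) DISTRIBUTED BY (customer_id);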
      30617fe0