1. 29 Dec 2017, 2 commits
    • Make btdexppages parameter to function gp_bloat_diag() numeric. · 0f6f7e20
      Nadeem Ghani authored
      The gp_bloat_expected_pages.btdexppages column is numeric, but it was passed to
      the function gp_bloat_diag() as an integer in the definition of the view of the
      same name, gp_bloat_diag. This caused integer overflow errors when the number of
      expected pages exceeded the maximum integer value for columns with very large
      widths.
      
      This changes the function signature and call to use numeric for the
      btdexppages parameter.
      
      Adds a simple test to mimic the customer issue.
      
      The test output file src/test/regress/expected/gp_toolkit.out was modified to
      conform to the 5X_STABLE query output.
      
      Author: Nadeem Ghani <nghani@pivotal.io>
      Author: Shoaib Lari <slari@pivotal.io>
      (cherry picked from commit 152783e1)
    • docs: pl/container updates and edits (#4197) · 7fc3dc7c
      Mel Kiyama authored
  2. 28 Dec 2017, 5 commits
    • Able to cancel COPY PROGRAM ON SEGMENT if the program hangs · ecd44052
      Adam Lee authored
      There are two places where the QD keeps trying to get data, ignores SIGINT,
      and does not send a signal to the QEs. If the program on a segment has no
      input/output, the COPY command hangs.
      
      To fix it, this commit:
      
      1. lets the QD wait until connections are readable before PQgetResult(), and
         cancels queries if it gets an interrupt signal while waiting
      2. sets DF_CANCEL_ON_ERROR when dispatching in cdbcopy.c
      3. completes COPY error handling
      
      -- prepare
      create table test(t text);
      copy test from program 'yes|head -n 655360';
      
      -- could be canceled
      copy test from program 'sleep 100 && yes test';
      copy test from program 'sleep 100 && yes test<SEGID>' on segment;
      copy test from program 'yes test';
      copy test to '/dev/null';
      copy test to program 'sleep 100 && yes test';
      copy test to program 'sleep 100 && yes test<SEGID>' on segment;
      
      -- should fail
      copy test from program 'yes test<SEGID>' on segment;
      copy test to program 'sleep 0.1 && cat > /dev/nulls';
      copy test to program 'sleep 0.1<SEGID> && cat > /dev/nulls' on segment;
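      The wait-and-cancel behavior of point 1 can be sketched in miniature (a hypothetical Python sketch, not the actual libpq-based C code; `wait_readable` and its arguments are invented names):

```python
import os
import select

def wait_readable(fd, is_cancel_pending, poll_interval=0.1):
    # Instead of blocking indefinitely (the way a bare PQgetResult()
    # would), poll the descriptor and check for a pending cancel between
    # polls, so a hung COPY ... PROGRAM can still be cancelled.
    while True:
        if is_cancel_pending():
            return False  # caller would then dispatch a cancel to the QEs
        readable, _, _ = select.select([fd], [], [], poll_interval)
        if readable:
            return True

r, w = os.pipe()
os.write(w, b"some copy data\n")
assert wait_readable(r, lambda: False) is True   # data ready: proceed
assert wait_readable(r, lambda: True) is False   # interrupt: cancel
```

      In the real fix the check happens in the QD dispatch loop; the sketch only illustrates the poll-then-check shape.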
      
      (cherry picked from commit 25c70407dc038a2c56ccb37a3540c9af6a99e6e4)
    • Update ORCA version 2.53.2 · ffda7c86
      Bhuvnesh Chaudhary authored
      Test files are also updated in this commit, as we no longer generate the
      cross join alternative if an input join was present.
      
      A cross join contains CScalarConst(1) as the join condition. If the input
      expression is as below, with a cross join at the top level between the
      CLogicalInnerJoin and CLogicalGet "t3":
      
      +--CLogicalInnerJoin
         |--CLogicalInnerJoin
         |  |--CLogicalGet "t1"
         |  |--CLogicalGet "t2"
         |  +--CScalarCmp (=)
         |     |--CScalarIdent "a" (0)
         |     +--CScalarIdent "b" (9)
         |--CLogicalGet "t3"
         +--CScalarConst (1)
      For the above expression, the (lower) predicate generated for the cross join
      between t1 and t3 will be CScalarConst (1). Only in such cases, do not
      generate the alternative with the lower join as a cross join, for example:
      
      +--CLogicalInnerJoin
         |--CLogicalInnerJoin
         |  |--CLogicalGet "t1"
         |  |--CLogicalGet "t3"
         |  +--CScalarConst (1)
         |--CLogicalGet "t2"
         +--CScalarCmp (=)
            |--CScalarIdent "a" (0)
            +--CScalarIdent "b" (9)
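      The rule illustrated by the two trees can be modeled in miniature (a hypothetical Python sketch, not ORCA code; the tuple encoding and function names are invented):

```python
# A join is modeled as (left_child, right_child, predicate); a cross
# join carries the constant-true predicate CScalarConst(1).
CONST_TRUE = ("CScalarConst", 1)

def is_cross_join(join):
    _, _, predicate = join
    return predicate == CONST_TRUE

def keep_alternative(alternative):
    # Reject the reordered alternative when its lower join (the left
    # child) would become a cross join.
    lower_join, _, _ = alternative
    return not is_cross_join(lower_join)

# Input: (t1 join t2 on a = b) cross-join t3  -> keep
original = (("t1", "t2", ("CScalarCmp", "=", "a", "b")), "t3", CONST_TRUE)
# Alternative: (t1 cross-join t3) join t2 on a = b  -> do not generate
alternative = (("t1", "t3", CONST_TRUE), "t2", ("CScalarCmp", "=", "a", "b"))

assert keep_alternative(original)
assert not keep_alternative(alternative)
```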
      Signed-off-by: Shreedhar Hardikar <shardikar@pivotal.io>
    • Fix bug in ALTER TABLE ADD COLUMN for AOCS · 27a7957b
      Xin Zhang authored
      If the first insert into an AOCS table aborted, the first row number of the
      visible blocks in the block directory should be greater than 1. By default,
      we initialize the `DatumStreamWriter` with `blockFirstRowNumber=1` for newly
      added columns. Hence, the first row numbers are not consistent between the
      visible blocks. This caused inconsistency between the base table scan and
      the scan using indexes through the block directory.
      
      This wrong-result issue only happens with the first invisible blocks. The
      current code (`aocs_addcol_endblock()` called in `ATAocsWriteNewColumns()`)
      already handles other gaps after the first visible blocks.
      
      The fix updates `blockFirstRowNumber` with `expectedFRN`, which fixes the
      misalignment of the visible blocks.
      
      Author: Xin Zhang <xzhang@pivotal.io>
      Author: Ashwin Agrawal <aagrawal@pivotal.io>
      (cherry picked from commit 7e7ee7e2)
    • CI: deprecate CCP remote state path convention · e924b46e
      Jim Doty authored
      Going forward any pipeline with a terraform resource should specify a
      BUCKET_PATH of `clusters/`.
      
      As a static value, it can be hardcoded in place of being an interpolated
      value from a secrets file. The key `tf-bucket-path` in any secrets YAML
      file is now deprecated and should be removed.
      
      We are dropping the convention where we had teams place their clusters'
      tfstate files in a path. In the past this convention was necessary, but
      now that we are tagging the clusters with much richer metadata, this
      convention is no longer strictly necessary.
      
      Balanced against the need to keep the bucket organized was the possibility
      of two clusters attempting to register their AWS public key under the same
      name. This collision came as a result of a mismatch in scope: the terraform
      resource that provisions the clusters and selects names only looks in the
      given path for name collisions, but the names of AWS keys have to be unique
      for the account.
      
      By collapsing all of the clusters into an account wide bucket, we will
      rely on the terraform resource to check for name conflicts going
      forward.
      
      Addresses bug: https://www.pivotaltracker.com/story/show/153928527
      Signed-off-by: Kris Macoskey <kmacoskey@pivotal.io>
    • Fix recursive chown for pxf_automation (#4162) · 99124cad
      Lav Jain authored
  3. 27 Dec 2017, 3 commits
  4. 23 Dec 2017, 1 commit
    • partition_pruning: don't drop public tables other tests need · fd632e3c
      Jacob Champion authored
      The portals_updatable test makes use of the public.bar table, which the
      partition_pruning test occasionally dropped. To fix, don't fall back to
      the public schema in the partition_pruning search path. Put the
      temporary functions in the partition_pruning schema as well for good
      measure.
      
      Author: Asim R P <apraveen@pivotal.io>
      Author: Jacob Champion <pchampion@pivotal.io>
      (cherry picked from commit 5d57a445)
  5. 22 Dec 2017, 4 commits
  6. 21 Dec 2017, 6 commits
    • CI: Remove no longer necessary package installs · c5fb8a72
      Kris Macoskey authored
      Installation of packages on every execution of a test suffers from any
      upstream flakiness. Therefore, installation of generic packages is being
      moved to the underlying OS, in this case the AMI used for the CCP job.
      
      Rather than outright removing the package installation, a much better
      pattern is to replace installation with a validation of the assumptions
      made about the packages installed on the underlying OS the test will run
      within.
      
      The call `yum --cacheonly list installed [List of Packages]` does a number of things:
      
        1. For the given list of packages, the command returns 0 if all are
           installed, and 1 if any are not
      
        2. The `--cacheonly` flag prevents the call from issuing an upstream
           repository metadata refresh. This is not a requirement, but it is an
           easy optimization that avoids upstream flakiness even further.
      
      Note: `--cacheonly` assumes that the repository metadata cache has been
      refreshed at least once. If not, the flag will cause the command to fail.
      We assume this has been done at least once on the underlying OS in order
      to install the packages in the first place.
      Signed-off-by: Alexandra Wang <lewang@pivotal.io>
      Signed-off-by: Divya Bhargov <dbhargov@pivotal.io>
    • Tune bitmap scan cost model by updating to nonlinear cost per page · 2e301044
      Haisheng Yuan authored
      Some customers reported that the planner generates a plan using seqscan
      instead of bitmapscan, and the execution time of the seqscan is 5x slower
      than the bitmapscan. The statistics were up to date and quite accurate.
      
      Bitmap table scan uses a formula to interpolate between random_page_cost
      and seq_page_cost to determine the cost per page. But the default value of
      random_page_cost is 100x the value of seq_page_cost. With the original cost
      formula, random_page_cost predominates in the final cost result: even
      though the formula is declared to be non-linear, it is still close to
      linear, which can't reflect the real cost per page when a majority of the
      pages are fetched.
      
      Therefore, the cost formula is updated to a truly non-linear function that
      reflects both random_page_cost and seq_page_cost for different percentages
      of pages fetched.
      
      For example, with the default values random_page_cost = 100 and
      seq_page_cost = 1, if 80% of the pages are fetched, the cost per page under
      the old formula is 11.45, which is 10x more than seqscan, because the cost
      is dominated by random_page_cost. With the new formula, the cost per page
      is 1.63, which reflects the real cost better, in my opinion.
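      For reference, the old interpolation matches upstream PostgreSQL's bitmap heap scan costing, and the sketch below reproduces the 11.45 figure quoted here (the new GPDB formula itself is not reproduced; this is only the old behavior, with invented function names):

```python
import math

def old_cost_per_page(random_page_cost, seq_page_cost, fraction_fetched):
    # Upstream PostgreSQL interpolation for bitmap heap scans: the cost
    # slides from random_page_cost toward seq_page_cost as the fraction
    # of pages fetched grows. The sqrt makes it nominally non-linear,
    # but random_page_cost still dominates at high fractions.
    return random_page_cost - \
        (random_page_cost - seq_page_cost) * math.sqrt(fraction_fetched)

# Reproduces the example in the commit message: 80% of pages fetched.
assert round(old_cost_per_page(100, 1, 0.8), 2) == 11.45
```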
      
      [#151934601]
    • Pass Row Number and Rank Oid to Optimizer Config · 46d4f74a
      Bhuvnesh Chaudhary authored
    • docs - add note about MaxStartups to relevant utility cmds (#4186) · 7ae23a6a
      Lisa Owen authored
      * docs - add note about MaxStartups to relevant utility cmds
      
      * uses ...
    • Bump ORCA version to 2.52.0 · 006431b2
      Shreedhar Hardikar authored
    • Reimplement ORCA interrupts using a callback function · fdbe5bbb
      Shreedhar Hardikar authored
      As pointed out by Heikki, maintaining another variable to match one in
      the database system will be error-prone and cumbersome, especially while
      merging with upstream. This commit initializes ORCA with a pointer to a
      GPDB function that returns true when QueryCancelPending or
      ProcDiePending is set. This way we no longer have to micro-manage
      setting and re-setting some internal ORCA variable, or touch signal
      handlers.
      
      This commit also reverts commit 0dfd0ebc "Support optimization interrupts
      in ORCA" and reuses tests already pushed by 916f460f and 0dfd0ebc.
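      The callback mechanism described here can be sketched in miniature (a hypothetical Python model; the class and names are invented, not the ORCA C++ API):

```python
# The optimizer is initialized with a host-provided callback that
# reports whether a cancellation is pending, instead of mirroring the
# host's QueryCancelPending/ProcDiePending flags in its own state.
class Optimizer:
    def __init__(self, is_abort_requested):
        self._is_abort_requested = is_abort_requested

    def optimize(self, work_items):
        done = []
        for item in work_items:
            if self._is_abort_requested():  # checked at safe points
                raise InterruptedError("optimization cancelled")
            done.append(item)
        return done

flags = {"QueryCancelPending": False}
opt = Optimizer(lambda: flags["QueryCancelPending"])
assert opt.optimize([1, 2, 3]) == [1, 2, 3]
```

      Because the optimizer only ever calls the callback, there is no internal flag to set and reset, and no signal handler to touch.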
  7. 20 Dec 2017, 2 commits
  8. 19 Dec 2017, 7 commits
    • Remove output (not answer) files from memory accounting TINC · 77fab4e1
      Sambitesh Dash authored
      They were leftover from the Perforce repo, and they should never have
      been checked in. Now this normally wouldn't have been an issue, except
      for commit history cleanliness: we would just silently overwrite the
      output files with actual output. But what if in CI those "output files"
      have different permissions? Turns out we would silently leave them
      alone. Two steps down the road we have a diff failure ...
      
      This commit removes -- at long last -- those output files.
      
      This fix is forward-ported from an older, closed-source version of
      Greenplum, where we first spotted this oversight. Strangely, this is not
      causing any test failures on master or 5 ... But it should still be ported,
      if only for cleanliness' sake.
      Signed-off-by: Jesse Zhang <sbjesse@gmail.com>
      (cherry picked from commit 20d6b178)
    • Doc edits for gptransfer --schema-only change (#9) · fd7c0ef5
      David Yozie authored
      * Doc edits for gptransfer --schema-only change
      
      * Change header title; add xref
      
      * --d -> -d
      
      * remove extraneous comma
      
      * changing -d behavior to match -t; making sentences parallel
    • docs - costing diffs between gporca/planner and RQ limits (#4147) · a6ff2836
      Lisa Owen authored
      * docs - costing diffs between gporca/planner and RQ limits
      
      * mention fallback
      
      * RQs do not align/differentiate costs between planners
    • docs: optimizer_join_order_threshold GUC update max value (#4156) · 9094056f
      Mel Kiyama authored
      PR for 5X_STABLE
      Will be ported to MAIN
    • Make GetResGroupIdForName() take a const char* instead of char* · 123c58a5
      David Sharp authored
      Author: Amil Khanzada <akhanzada@pivotal.io>
      Author: David Sharp <dsharp@pivotal.io>
      (cherry picked from commit 35ae9aee)
    • Move AssignResGroupOnMaster() to the end of StartTransaction() (#3924) · 643be64a
      Amil Khanzada authored
      - As part of determining the resource group that a transaction should be
        assigned to, AssignResGroupOnMaster() calls GetResGroupIdForRole(), which
        queries a syscache on the catalog table pg_authid, which maps users to
        resource groups.
      - Prior to this commit, AssignResGroupOnMaster() was doing the queries on
        pg_authid near the top of StartTransaction(), before the per-transaction
        memory context was set up. This required GetResGroupIdForRole() to run
        ResourceOwnerCreate() to avoid segfaulting gpdb, and also led to many
        potential issues:
        * unknown behavior if a relcache invalidation event happens on pg_authid's
          syscache
        * possible stale pg_authid entries, as access done with SnapshotNow and
          out-of-date RecentGlobalXmin
        * memory leaks due to no memory context
        * an uphill battle, as newer versions of PostgreSQL remove SnapshotNow
          and assume catalog lookups only happen while transactions are open
      Signed-off-by: David Sharp <dsharp@pivotal.io>
      Signed-off-by: Amil Khanzada <akhanzada@pivotal.io>
      (cherry picked from commit 9ea766d9)
    • behave: make pid detection more robust · 25925f42
      Marbin Tan authored
      This is simply a setup/cleanup step for the behave tests, so be
      accommodating in trying to get it to work.
      
      Scope: affects gpcheckcat.feature and backups.feature; these tests
      already have some timing affordances; this just adds a bit more backstop
      
      Author: Marbin Tan <mtan@pivotal.io>
      Author: C.J. Jameson <cjameson@pivotal.io>
  9. 18 Dec 2017, 4 commits
  10. 16 Dec 2017, 6 commits