1. 03 Feb 2018, 1 commit
    • A
      Vacuum fix for ERROR updated tuple is already HEAP_MOVED_OFF. · aa5798a9
      Committed by Ashwin Agrawal
      `repair_frag()` should consult the distributed snapshot
      (`localXidSatisfiesAnyDistributedSnapshot()`) while following and moving
      chains of updated tuples. Vacuum consults the distributed snapshot to
      decide which tuples can be deleted and which cannot. For RECENTLY_DEAD
      tuples it used to decide based solely on a comparison with OldestXmin,
      which is not sufficient; the distributed snapshot must be checked there
      as well.
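      
      A minimal sketch of the added check, assuming simplified control flow
      (the helper name comes from this message; the surrounding function and
      variable names are illustrative, not GPDB's actual code):
      
        #include "postgres.h"
        #include "access/htup.h"
      
        /*
         * Hedged sketch: a tuple whose xmax precedes OldestXmin is only
         * truly DEAD if no distributed snapshot can still see it.
         */
        static bool
        tuple_truly_dead(HeapTupleHeader tup, TransactionId OldestXmin)
        {
            TransactionId xmax = HeapTupleHeaderGetXmax(tup);
      
            if (!TransactionIdPrecedes(xmax, OldestXmin))
                return false;   /* still visible to some local snapshot */
      
            /* The GPDB-specific part: consult the distributed snapshot. */
            return !localXidSatisfiesAnyDistributedSnapshot(xmax);
        }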
      
      Fixes #4298
      
      (cherry picked from commit 313ab24f)
      aa5798a9
  2. 02 Feb 2018, 9 commits
    • D
      Fix documentation query · 25c66e92
      Committed by Daniel Gustafsson
      Remove an extra semicolon which breaks the query when copy-pasted.
      Backported from master.
      
      Per report by Michael Mulcahy
      25c66e92
    • M
      docs: pl/container fixes: (#4468) · 1d7fee25
      Committed by Mel Kiyama
      - Move the note in the Examples section to emphasize that the container
        ID depends on configuration
      - Add creating a docker group to the Docker install instructions for
        CentOS 6
      1d7fee25
    • M
      docs: update shared mem. info in GUCS max_fsm_pages and max_connections (#4467) · 345cce9b
      Committed by Mel Kiyama
      * docs: update shared mem. info in GUCS max_fsm_pages and max_connections
      
      Add info from support about shared memory conflicts.
      
      PR for 5X_STABLE
      Will be ported to MAIN
      
      * docs: update shared mem. info in GUCS - review edits
      345cce9b
    • K
      CI: Remove references to tf-bucket-path from README (#4463) · 62257404
      Committed by kaknikhil
      Commit e924b46e removed all references
      to the parameter tf-bucket-path because it is now hardcoded to clusters.
      62257404
    • A
      Revert "Vacuum fix for ERROR updated tuple is already HEAP_MOVED_OFF." · f71748df
      Committed by Ashwin Agrawal
      This reverts commit 508ffd48.
      f71748df
    • A
      Vacuum fix for ERROR updated tuple is already HEAP_MOVED_OFF. · 508ffd48
      Committed by Ashwin Agrawal
      `repair_frag()` should consult the distributed snapshot
      (`localXidSatisfiesAnyDistributedSnapshot()`) while following and moving
      chains of updated tuples. Vacuum consults the distributed snapshot to
      decide which tuples can be deleted and which cannot. For RECENTLY_DEAD
      tuples it used to decide based solely on a comparison with OldestXmin,
      which is not sufficient; the distributed snapshot must be checked there
      as well.
      
      Fixes #4298
      
      (cherry picked from commit 313ab24f)
      508ffd48
    • D
      Fix get_attstatsslot()/free_attstatsslot() when statistics are broken. · 5bc15b17
      Committed by Dhanashree Kashid
      In scenarios where pg_statistic contains a wrong statistics entry for an
      attribute, or when the statistics on a particular attribute are broken
      (e.g. the type of the elements stored in stavalues<1/2/3> differs from
      the actual attribute type, or there are holes in the attribute numbers
      due to adding/dropping columns), the following two APIs fail because
      they relied on the attribute type sent by the caller:
      
      - get_attstatsslot(): Extracts the contents (numbers/frequency array and
      values array) of the requested statistics slot (MCV, HISTOGRAM, etc.).
      If the attribute is pass-by-reference or of a toastable type (varlena
      types), it returns a copy allocated with palloc().
      - free_attstatsslot(): Frees any data palloc'd by get_attstatsslot().
      
      This problem was fixed in upstream 8.3
      (8c21b4e9) for get_attstatsslot(),
      wherein the actual element type of the array is used for deconstructing
      it rather than the caller-passed OID.
      free_attstatsslot() still depends on the type OID sent by the caller.
      
      However, the issue still exists for free_attstatsslot(), which crashed
      while freeing the array. The crash happened because the caller-sent type
      OID was TEXT, a varlena type, so free_attstatsslot() attempted to free
      each datum; due to the broken slot, however, the datums extracted from
      the values array were of a fixed-length type such as int. The int value
      was treated as a memory address, and freeing it crashed.
      
      This commit brings in the following fix from upstream 10, which
      redesigns get_attstatsslot()/free_attstatsslot() such that they are
      robust to scenarios like these.
      
      commit 9aab83fc
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Sat May 13 15:14:39 2017 -0400
      
          Redesign get_attstatsslot()/free_attstatsslot() for more safety and speed.
      
          The mess cleaned up in commit da075960 is clear evidence that it's a
          bug hazard to expect the caller of get_attstatsslot()/free_attstatsslot()
          to provide the correct type OID for the array elements in the slot.
          Moreover, we weren't even getting any performance benefit from that,
          since get_attstatsslot() was extracting the real type OID from the array
          anyway.  So we ought to get rid of that requirement; indeed, it would
          make more sense for get_attstatsslot() to pass back the type OID it found,
          in case the caller isn't sure what to expect, which is likely in binary-
          compatible-operator cases.
      
          Another problem with the current implementation is that if the stats array
          element type is pass-by-reference, we incur a palloc/memcpy/pfree cycle
          for each element.  That seemed acceptable when the code was written because
          we were targeting O(10) array sizes --- but these days, stats arrays are
          almost always bigger than that, sometimes much bigger.  We can save a
          significant number of cycles by doing one palloc/memcpy/pfree of the whole
          array.  Indeed, in the now-probably-common case where the array is toasted,
          that happens anyway so this method is basically free.  (Note: although the
          catcache code will inline any out-of-line toasted values, it doesn't
          decompress them.  At the other end of the size range, it doesn't expand
          short-header datums either.  In either case, DatumGetArrayTypeP would have
          to make a copy.  We do end up using an extra array copy step if the element
          type is pass-by-value and the array length is neither small enough for a
          short header nor large enough to have suffered compression.  But that
          seems like a very acceptable price for winning in pass-by-ref cases.)
      
          Hence, redesign to take these insights into account.  While at it,
          convert to an API in which we fill a struct rather than passing a bunch
          of pointers to individual output arguments.  That will make it less
          painful if we ever want further expansion of what get_attstatsslot can
          pass back.
      
          It's certainly arguable that this is new development and not something to
          push post-feature-freeze.  However, I view it as primarily bug-proofing
          and therefore something that's better to have sooner not later.  Since
          we aren't quite at beta phase yet, let's put it in.
      
          Discussion: https://postgr.es/m/16364.1494520862@sss.pgh.pa.us
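      
      For illustration, a hedged sketch of how a caller uses the redesigned
      struct-based API from that upstream commit (PostgreSQL 10 style; the
      GPDB backport may differ in detail):
      
        #include "postgres.h"
        #include "catalog/pg_statistic.h"
        #include "utils/lsyscache.h"
      
        /*
         * Sketch: reading an MCV slot. The caller no longer passes the
         * element type OID; get_attstatsslot() discovers the real type
         * from the array and reports it back in sslot.valuetype.
         */
        static void
        inspect_mcv(HeapTuple statsTuple)
        {
            AttStatsSlot sslot;
      
            if (get_attstatsslot(&sslot, statsTuple,
                                 STATISTIC_KIND_MCV, InvalidOid,
                                 ATTSTATSSLOT_VALUES | ATTSTATSSLOT_NUMBERS))
            {
                /* sslot.values[0..nvalues-1]: the common values;
                 * sslot.numbers[0..nnumbers-1]: their frequencies. */
                elog(DEBUG1, "MCV slot: %d values, element type %u",
                     sslot.nvalues, sslot.valuetype);
      
                /* One call frees whatever get_attstatsslot() palloc'd. */
                free_attstatsslot(&sslot);
            }
        }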
      
      Most of the changes are the same as the upstream commit, with the
      following additions:
      - Relcache translator changes in ORCA.
      - Added a test that simulates the crash due to broken stats.
      - get_attstatsslot() contains an extra check for an empty slot array,
      which existed in master but is not in upstream.
      Signed-off-by: Abhijit Subramanya <asubramanya@pivotal.io>
      (cherry picked from commit ae06d7b0)
      5bc15b17
    • L
      docs - pxf default port is now 5888 (#4452) · 1591591a
      Committed by Lisa Owen
      1591591a
    • D
      docs - updating doc build to use 5.5 versioning · 8d902c9b
      Committed by dyozie
      8d902c9b
  3. 01 Feb 2018, 3 commits
    • A
      Fix COPY PROGRAM issues · 6f81d55b
      Committed by Adam Lee
      1. The pipes might not exist in close_program_pipes(); check for that.
         For instance, when the relation doesn't exist, the COPY workflow
         fails before executing the program, and "cstate->program_pipes->pid"
         dereferences NULL.
      
      2. The program might still be running, or hung, when COPY exits; kill
         it. This covers cases where the program hangs, doesn't respond to
         signals, and the user is trying to cancel. Since it's already the
         end of COPY, and the program was started by COPY, it should be safe
         to kill it to clean up. (A sketch of both fixes follows.)
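      
      A minimal sketch of both fixes, assuming simplified structure
      (`program_pipes` and `pid` come from this message; everything else is
      illustrative):
      
        #include <signal.h>
        #include "postgres.h"
      
        /* Hedged sketch of close_program_pipes() after the fix. */
        static void
        close_program_pipes(CopyState cstate)
        {
            /*
             * Fix 1: COPY may have failed (e.g. the relation doesn't
             * exist) before the program was ever launched, so the pipes
             * can legitimately be NULL here.
             */
            if (cstate->program_pipes == NULL)
                return;
      
            /*
             * Fix 2: the program may still be running, or hung and
             * ignoring signals, when COPY exits. COPY started it, so it
             * is safe to kill it to clean up.
             */
            if (kill(cstate->program_pipes->pid, 0) == 0) /* still alive? */
                kill(cstate->program_pipes->pid, SIGKILL);
      
            /* ... then close descriptors and reap the child ... */
        }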
      
      (cherry picked from commit d6bd4ac4)
      6f81d55b
    • L
      docs - clarify some resource group information (#4454) · 7ba45f85
      Committed by Lisa Owen
      * docs - clarify some resource group information
      
      * reword a bit to make intent clearer
      7ba45f85
    • L
      e51d287c
  4. 31 Jan 2018, 5 commits
    • T
      Update time zone data files to tzdata release 2018c. · 5b21048b
      Committed by Tom Lane
      DST law changes in Brazil, Sao Tome and Principe.  Historical corrections
      for Bolivia, Japan, and South Sudan.  The "US/Pacific-New" zone has been
      removed (it was only a link to America/Los_Angeles anyway).
      5b21048b
    • H
      Fix dispatching of queries with record-type parameters. · f24b9ab5
      Committed by Heikki Linnakangas
      This fixes the "ERROR:  record type has not been registered" error, when
      a record-type variable is used in a query inside a PL/pgSQL function.
      This is essentially the same problem we battled with in Motion nodes
      in GPDB 5, and added the whole tuple remapper to deal with it. Only this
      time, the problem is with record Datums being dispatched from QD to QE,
      as Params, rather than with record Datums being transferred across a
      Motion.
      
      To fix, send the transient record type cache along with the query
      parameters, if any of the parameters are of a transient record type.
      This is a bit inefficient, as the transient record type cache can be quite
      large. A more fine-grained approach would be to send only those record
      types that are actually used in the parameters, but more code would be
      required to figure that out. This will do for now.
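      
      A hedged sketch of that approach; only the idea of bundling the record
      type cache with RECORD-typed Params is from this commit, and the helper
      functions named below are hypothetical:
      
        #include "postgres.h"
        #include "catalog/pg_type.h"    /* RECORDOID */
        #include "nodes/params.h"
      
        /*
         * Sketch: before dispatching parameters from QD to QEs, detect
         * transient record types and attach the record-type cache.
         */
        static char *
        serialize_params_for_dispatch(ParamListInfo params)
        {
            bool        haveRecordParam = false;
            int         i;
      
            for (i = 0; i < params->numParams; i++)
            {
                /*
                 * A RECORD param's typmod is only meaningful together
                 * with the QD's transient record type cache.
                 */
                if (params->params[i].ptype == RECORDOID)
                {
                    haveRecordParam = true;
                    break;
                }
            }
      
            if (haveRecordParam)
                attach_transient_record_types();    /* hypothetical */
      
            return serialize_params(params);        /* hypothetical */
        }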
      
      Refactor the serialization and deserialization of the query parameters, to
      leverage the outfast/readfast functions.
      
      Backport to 5X_STABLE. This changes the wire format of query parameters, so
      this requires the QD and QE to be on the same minor version. But this does
      not change the on-disk format, or the numbering of existing Node tags.
      
      Fixes github issue #4444.
      f24b9ab5
    • N
      gpdeletesystem: require force if backups exist · ae07664c
      Committed by Nadeem Ghani
      The current behavior is to check for the existence of dump directories
      and, unless run with the -f flag, error out if they exist. This commit
      extends this check to backups.
      
      Author: Nadeem Ghani <nghani@pivotal.io>
      ae07664c
    • L
      docs - use <> instead of italics (#4450) · 35ef1b6d
      Committed by Lisa Owen
      35ef1b6d
    • B
      Bump ORCA to v2.54.2 · 0a829ef3
      Committed by Bhuvnesh Chaudhary
      0a829ef3
  5. 30 Jan 2018, 6 commits
    • W
      27d029b1
    • W
      Add hook to perfmon stat sender for cluster info collection · 40653f8e
      Committed by Wang Hao
      Leverage the perfmon stat sender process to call a hook function for
      cluster-level info collection.
      
      Without changing existing behavior of the stat sender process, this
      commit changes the interval in the stat sender's main loop from 1
      second to 100 ms, because the cluster info collector requires a finer
      granularity of sampling. With this change, the stats sender process
      should be started when gp_enable_query_metrics is on.
      
      Author: Wang Hao <haowang@pivotal.io>
      Author: Zhang Teng <tezhang@pivotal.io>
      40653f8e
    • W
      Add hook to handle query info · d57c698c
      Committed by Wang Hao
      The hook is called for
       - each query, on Submit/Start/Finish/Abort/Error
       - each plan node, on executor Init/Start/Finish
      (see the sketch below)
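      
      A hedged sketch of how a collector might attach to such a hook; the
      hook name and signature here are assumptions for illustration, not
      necessarily the exact GPDB API:
      
        #include "postgres.h"
      
        /* Assumed hook shape: one callback invoked with an event tag. */
        typedef enum
        {
            QUERY_SUBMIT, QUERY_START, QUERY_FINISH,
            QUERY_ABORT, QUERY_ERROR
        } QueryInfoEvent;
      
        typedef void (*query_info_hook_type) (QueryInfoEvent event,
                                              void *arg);
        extern query_info_hook_type query_info_collect_hook; /* assumed */
      
        static void
        my_collector(QueryInfoEvent event, void *arg)
        {
            /* record the event for the metrics collector */
        }
      
        void
        _PG_init(void)
        {
            query_info_collect_hook = my_collector;
        }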
      
      Author: Wang Hao <haowang@pivotal.io>
      Author: Zhang Teng <tezhang@pivotal.io>
      d57c698c
    • W
      Alloc Instrumentation in Shmem · 9a0954e4
      Committed by Wang Hao
      On postmaster start, additional space in shmem is allocated for
      Instrumentation slots and a header. The number of slots is controlled
      by a cluster-level GUC; the default is 5MB (approximately 30K slots),
      estimated from 250 concurrent queries * 120 nodes per query. If the
      slots are exhausted, instruments are allocated in local memory as a
      fallback.
      
      These slots are organized as a free list:
        - Header points to the first free slot.
        - Each free slot points to next free slot.
        - The last free slot's next pointer is NULL.
      
      ExecInitNode calls GpInstrAlloc to pick an empty slot from the free list:
        - The free slot pointed to by the header is picked.
        - The picked slot's next pointer is assigned to the header.
        - A spinlock on the header prevents concurrent writes.
        - When the GUC gp_enable_query_metrics is off, Instrumentation is
          allocated in local memory.
      
      Slots are recycled by a resource owner callback function. (A sketch of
      the slot pick follows.)
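      
      A minimal sketch of the free-list pop described above, assuming an
      illustrative header layout (the real GPDB structures differ):
      
        #include "postgres.h"
        #include "executor/instrument.h"
        #include "storage/spin.h"
      
        /* Illustrative header: points at the first free slot. */
        typedef struct InstrShmemHeader
        {
            slock_t          lock;  /* protects head */
            Instrumentation *head;  /* first free slot, NULL if exhausted */
        } InstrShmemHeader;
      
        /*
         * Sketch of a GpInstrAlloc-style slot pick: pop the head of the
         * free list under a spinlock; fall back to local memory when the
         * list is empty (or when the GUC is off, not shown).
         */
        static Instrumentation *
        pick_instr_slot(InstrShmemHeader *hdr)
        {
            Instrumentation *slot;
      
            SpinLockAcquire(&hdr->lock);
            slot = hdr->head;
            if (slot != NULL)
                hdr->head = *(Instrumentation **) slot; /* next free */
            SpinLockRelease(&hdr->lock);
      
            if (slot == NULL)
                slot = palloc0(sizeof(Instrumentation)); /* fallback */
      
            return slot;
        }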
      
      A benchmark with TPC-DS shows the performance impact of this commit is
      less than 0.1%. To reduce instrumentation overhead, the following
      optimizations are added:
        - Introduce instrument_option to skip CDB info collection
        - Change tuplecount in Instrumentation from double to uint64
        - Replace the instrument tuple entry/exit functions with macros
        - Add need_timer to Instrumentation, to allow eliminating timing
          overhead. This ports part of the upstream commit:
      ------------------------------------------------------------------------
      commit af7914c6
      Author: Robert Haas <rhaas@postgresql.org>
      Date:   Tue Feb 7 11:23:04 2012 -0500
      
      Add TIMING option to EXPLAIN, to allow eliminating of timing overhead.
      ------------------------------------------------------------------------
      
      Author: Wang Hao <haowang@pivotal.io>
      Author: Zhang Teng <tezhang@pivotal.io>
      9a0954e4
    • P
      Fix AIX loader packaging problems (#4397) · 5db91e21
      Committed by Peifeng Qiu
      - Add libpython
      - Add libevent
      - Add libgcc_s
      5db91e21
    • H
      Remove flaky test in QP-memory-accounting · 32c0a57e
      Committed by Haisheng Yuan
      The flakiness is caused by the ORCA version bump to 2.53.11, in which
      ORCA improved the equivalence classes for inner/outer join plans.
      Previously ORCA generated a bad plan, causing OOM, but as of 2.53.11 it
      generates a better plan that may not cause OOM. Due to the flakiness of
      this test, remove it.
      32c0a57e
  6. 27 Jan 2018, 7 commits
  7. 25 Jan 2018, 7 commits
    • D
      Backport timezone data format change · 5acfbfaa
      Committed by Daniel Gustafsson
      This backports the below commit, which moved from the raw timezone
      source files to the newly introduced compact format.
      
        commit 097b24cea68ac167a82bb617eb1844c8be4eaf24
        Author: Tom Lane <tgl@sss.pgh.pa.us>
        Date:   Sat Nov 25 15:30:11 2017 -0500
      
          Replace raw timezone source data with IANA's new compact format.
      
          Traditionally IANA has distributed their timezone data in pure source
          form, replete with extensive historical comments.  As of release 2017c,
          they've added a compact single-file format that omits comments and
          abbreviates command keywords.  This form is way shorter than the pure
          source, even before considering its allegedly better compressibility.
          Hence, let's distribute the data in that form rather than pure source.
      
          I'm pushing this now, rather than at the next timezone database update,
          so that it's easy to confirm that this data file produces compiled zic
          output that's identical to what we were getting before.
      
          Discussion: https://postgr.es/m/1915.1511210334@sss.pgh.pa.us
      
      Backported from master.
      5acfbfaa
    • D
      Update timezone data and management code to match PostgreSQL · b749790a
      Committed by Daniel Gustafsson
      The timezone data in Greenplum comes from the base version of
      PostgreSQL that the current version of Greenplum is based on.
      This causes issues since it means we are years behind on tz
      changes that have happened. This pulls in the timezone data
      and code from PostgreSQL 10.1 with as few changes to Greenplum
      as possible, to minimize merge conflicts. The goal is to gain
      data rather than features, and to let each Greenplum release
      stay current with the IANA tz database as it is imported into
      upstream PostgreSQL.
      
      This removes a Greenplum-specific test for the Yakutsk timezone,
      as it was made obsolete by upstream tz commit 1ac038c2c3f25f72.
      
      Backported from master.
      b749790a
    • M
      docs: unhide gpmovemirrors in utility guide (#4346) · a6898590
      Committed by Mel Kiyama
      PR for 5X_STABLE only
      a6898590
    • S
      Bump ORCA version to 2.54.0 · a518c467
      Committed by Shreedhar Hardikar
      a518c467
    • S
      Do not maintain mdidType separately in CDXLScalarIdent · 514b5bc0
      Committed by Shreedhar Hardikar
      This information can be easily derived from the CDXLColRef member of
      CDXLScalarIdent. This now mirrors what is done in the ORCA types
      CScalarIdent and CColRef.
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
      514b5bc0
    • L
      Use PXF docker image to run PXF smoke (#4416) · 9da70981
      Committed by Lav Jain
      9da70981
    • C
      3f4e44dc
  8. 24 Jan 2018, 2 commits