Commits · 14b9ea1c611c05f1694d285df30060d8da6aad08 · Greenplum / Gpdb

19 Jun, 2016 4 commits

Fix spurious whitespace in the mapred code · 14b9ea1c
Committed by Daniel Gustafsson on May 10, 2016
```
No functional changes are included in this diff, only trailing
whitespace cleanup.
```

Extend mapred reduce function test with builtin reducer · 61320c92

Committed by Daniel Gustafsson on May 10, 2016

The agebracket.yml mapred test case tests the reduction with
a simple reducer that sums the values. GPMapred has a builtin SUM
reducer, however, which yields the same result. Extend the test
with a copy of the agebracket.yml test that uses the builtin, ensuring
that the output is identical.


Fix crash if an EXPLAIN statement is passed to gp_dump_query_oids() · 754d0cc4
Committed by Heikki Linnakangas on Jun 18, 2016
```
I was alerted to this by a compiler warning, which this also fixes. Was
broken by the 8.3 merge.
```

Backport 8.4 changes to text search pg_*_is_visible() functions. · a800eb33

Committed by Heikki Linnakangas on Jun 18, 2016

We had backported similar changes to the other pg_*_is_visible()
functions earlier, so for consistency, do the same for the new functions
we got from the 8.3 merge.

The related upstream commit was:

commit 66bb74db
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Mon Dec 15 18:09:41 2008 +0000

Arrange for the pg_foo_is_visible and has_foo_privilege families of functions
to return NULL, instead of erroring out, if the target object is specified by
OID and we can't find that OID in the catalogs. Since these functions operate
internally on SnapshotNow rules, there is a race condition when using them
in user queries: the query's MVCC snapshot might "see" a catalog row that's
already committed dead, leading to a failure when the inquiry function is
applied. Returning NULL should generally provide more convenient behavior.
This issue has been complained of before, and in particular we are now seeing
it in the regression tests due to another recent patch.


18 Jun, 2016 4 commits

Remove inaccurate icg test that injects QueryFinishPending · b068baf1

Committed by Nikos Armenatzoglou on Jun 16, 2016

This test was introduced in commit c36ecd6f.
It does not actually test the changes of that commit, instead it was aiming to test some other scenarios.
It seems that the test does not have a deterministic behavior.
We remove it for now, and we will investigate if the scenarios that we were aiming to test are possible and write deterministic tests.


Adding inline to all static functions in gp_alloc.h to suppress compiler warnings. · 2ef9afd8
Committed by Foyzur Rahman on Jun 17, 2016

Fix uninitialized variable in print_rmgr_heap2() · b4fbc164

Committed by Daniel Gustafsson on Jun 17, 2016

The recent hacks to make xlogdump work with the 8.3 merge had a
thinko where the initialization of the struct ended up inside the
guards for version checks effectively breaking it for < 9.0.

Spotted (and reviewed) by Heikki Linnakangas
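
The class of bug described above can be illustrated with a minimal sketch. This is not the actual xlogdump code: `DemoRecord`, `demo_describe` and `DEMO_PG_VERSION` are hypothetical names made up for the example. The point is only that struct initialization belongs outside the version guard, so that builds for every version see an initialized struct.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical version macro; real code would test the server version. */
#define DEMO_PG_VERSION 80300

/* Hypothetical stand-in for a decoded WAL record. */
typedef struct DemoRecord
{
    int block;   /* filled in for every version */
    int offset;  /* only filled in for >= 9.0 in this sketch */
} DemoRecord;

static int
demo_describe(const int *raw)
{
    DemoRecord rec;

    assert(raw != NULL);

    /* Initialize unconditionally, outside any version guard. */
    memset(&rec, 0, sizeof(rec));
    rec.block = raw[0];

#if DEMO_PG_VERSION >= 90000
    /* Version-specific work only; no initialization lives in here. */
    rec.offset = raw[1];
#endif

    return rec.block + rec.offset;
}
```

If the `memset` were moved inside the `#if` block, a build with `DEMO_PG_VERSION` below 90000 would read `rec.offset` uninitialized, which is the kind of breakage the commit message describes.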


Remove unnecessary debug-logging of tuple counts. · c711300d

Committed by Heikki Linnakangas on Jun 17, 2016

It doesn't seem too useful to have a DEBUG1 of tuple counts. I doubt
anyone has taken advantage of that in years. Removing it allows us
to refactor the code inside GetPlannedStmtLogLevel() back into
GetCommandLogLevel(), like it is in the upstream.


17 Jun, 2016 6 commits
- Cosmetic fixes, to reduce diff with upstream. · d6f5fbcd
  Committed by Heikki Linnakangas on Jun 17, 2016
- Make distribute_qual_to_rels() static again, like it is in upstream. · 94f78395
  Committed by Heikki Linnakangas on Jun 17, 2016
```
There are no callers to it outside the file, so why not. Reduces the
diff vs. upstream.
```
- Move GUC variables back to where they are in upstream. · 9b31ab25
  Committed by Heikki Linnakangas on Jun 17, 2016
```
This seems like a completely unnecessary change. Reduce the diff vs.
upstream.
```
- Remove dead function. · c2158136
  Committed by Heikki Linnakangas on Jun 17, 2016
```
Not sure what this was used for or when, but it's dead now.
```
- Remove unused "AsetDirect" code. · 2a871257
  Committed by Heikki Linnakangas on Jun 17, 2016
- Support larger fixed format table definitions · ca367b42
  Committed by Craig Sylvester on Jun 17, 2016
```
Removed maxlen as a maximum size for the custom format supporting fixed
format descriptions. Allows for the creation of tables having up to
MaxHeapAttributeNumber of columns.
```
16 Jun, 2016 7 commits

Use empty target list in "dummy" initplans. · 2bf1b035

Committed by Heikki Linnakangas on Jun 16, 2016

We replace unused init plans with dummy Result plans at end of planning.
Use an empty target list for the dummy Result, instead of the target list of
the original Plan we're replacing. An empty target list is obviously better
from a performance point of view, but more importantly, if the original
target list contained an AggRef node, initializing the AggRef expression
at ExecutorStart() will fail, because there is a check that an AggRef
can only appear in an Agg node.

Per Nikhil Kak's report. gpcheckcat contains a query that tripped on this.


Update alternative expected output files for 'xml' test. · 557214b6
Committed by Heikki Linnakangas on Jun 16, 2016

Fix documentation typos in xlogdump · b540aaec
Committed by Daniel Gustafsson on Jun 15, 2016

Reconnect 8.3 specific code in xlogdump · b4713277

Committed by Daniel Gustafsson on Jun 15, 2016

Now that PostgreSQL 8.3 has been merged, we can enable the 8.3-specific
parts of xlogdump. There was a bug in the upstream code, however: it
included calls using struct members not available until 9.0 (coming in
commit 375369ac). Disconnect these in the typical if/endif way. Since
xlogdump is abandonware due to the inclusion of pg_xlogdump in 9.3,
there's not much use in submitting this upstream, since we effectively
maintain this code.


Remove obsolete heap scan readahead code. · 8b0f9b4d

Committed by Heikki Linnakangas on Jun 15, 2016

The main effect of the readahead code was not readahead as such, but to
avoid trashing the buffer cache when doing a sequential scan of a large
table. We just got similar functionality from PostgreSQL 8.3: the "ring
buffer" feature, in commit d526575f. So we don't need the GPDB-specific
code anymore.

This was slightly mismerged: we were reading the same page twice in a row:
once with the new ReadBufferWithStrategy() call and second time with the
old KillAndReadBuffer() call. That is now fixed too.


Revert relcache invalidation at EOX to work like before the merge. · 3b8a4fe2

Committed by Heikki Linnakangas on Jun 15, 2016

The other FIXME comment that the removed comment refers to was reverted
earlier, before we pushed the 8.3 merge. Should've uncommented this back
then, but I missed the need for that because I've been building with
--enable-cassert. This fixes the regression failure on gpdtm_plpgsql test
case, when built without assertions.


Remove typedefs in favor of forward declarations · a2ae10fc

Committed by Daniel Gustafsson on Jun 15, 2016

Commit 5dc7dda9 added a forward declaration for PartitionSelector and
PartitionSelectorState, the redefinition of which is only legal in
C11, which we are yet to use.

+1 from hlinnakangas@


15 Jun, 2016 10 commits

Fix documentation typos in cdbgroup · 67414f01
Committed by Daniel Gustafsson on Jun 15, 2016

Re-enable psql queries that depend on 8.3 catalogs. · a243b842

Committed by Heikki Linnakangas on Jun 15, 2016

These queries were backported from later PostgreSQL releases earlier,
but had to be disabled when we started the 8.3 merge, because we didn't
have all the catalog changes from 8.3 yet. Now we do, so re-enable the
queries.


Backport the "proper" fix for CVE-2013-0255. · 40d6ea40

Committed by Heikki Linnakangas on Jun 15, 2016

The PostgreSQL 8.3 merge included a band-aid fix for this, but since we
haven't yet made a GPDB 5.0 release, and are therefore free to still modify
the catalogs, let's apply the proper fix that went into PostgreSQL 9.3 at
the time.

commit 71627f3d
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Wed Feb 13 16:20:01 2013 -0500

    Fix CVE-2013-0255 properly.

    Revert commit ab0f7b60 (in HEAD only)
    in favor of the proper solution, which is to declare enum_recv() correctly
    in the system catalogs.  It should be declared to take type "internal"
    not "cstring".

    Also improve the type_sanity regression test, which should have caught
    this typo, so that it actually would.  Most of the relevant checks on
    the signature of type I/O functions should not have been restricted to
    basetypes/pseudotypes, as they should apply to any type's I/O functions.


Merge with PostgreSQL 8.3.23. · a453004e

Committed by Heikki Linnakangas on Jun 15, 2016

Everything in PostgreSQL 8.3 is now in Greenplum. PostgreSQL 8.3 has been
end-of-lifed in upstream, so there will be no new upstream minor releases
of this series anymore either.

This includes a lot of new features from PostgreSQL 8.3, and a ton of bug
fixes. See upstream release notes for details. One big user-visible change
included is the removal of implicit cast from other datatypes to text. When
this was done in PostgreSQL, it caused a lot of sloppily written
application queries to break. Which is a good thing in the long run, but it
caused a lot of pain on upgrade.

A few features work slightly differently in GPDB:

* Lazy XIDs feature in upstream reduces XID consumption, by not assigning
  XIDs to read-only transactions. That optimization has not been
  implemented for GPDB distributed transactions, however, so practically
  all queries in GPDB still consume XIDs, whether they're read-only or not.

* temp_tablespaces GUC was added in upstream, but it has been disabled in
  GPDB, in favor of GPDB-specific system to decide which filespace to use
  for temporary files.

* B-tree indexes can now be used to answer "col IS NULL" queries. That
  support has not been implemented for bitmap indexes.

* Support was added for "top N" sorting, speeding up queries with ORDER BY
  and LIMIT. That support was not implemented for tuplesort_mk.c, however,
  so you only get that benefit with enable_mk_sort=off.

* Multi-worker autovacuum support was merged, but autovacuum as a whole is
  still disabled in GPDB.

* Planner hook was added. Nothing special there, but we should consider
  refactoring the ORCA glue code to use it.

* Plan cache invalidation was added upstream, which made the
  gp_plpgsql_clear_cache_always option obsolete. We also backported the
  further invalidation improvements from 8.4. See below for more
  information.

In addition to those, there were some big refactorings that were not that
interesting from user point of view, but caused a lot of code churn:
InitPlans are now initialized separately at executor startup, the executor
range table was changed to be "flat", and code related to parse analysis of
DDL commands was moved around. The way UPDATE WHERE CURRENT OF was
implemented in GPDB was refactored to match the new upstream support for
the same (see below for more information).

Plan cache invalidation
-----------------------

One notable feature included in this merge is the support for plan cache
invalidation. GPDB contained a hack to invalidate all plans in master
between transactions, for the same purpose, but the proper plan cache
invalidation is a much better solution. However, because the GPDB hack also
handled dropped/recreated functions in some cases, if we just leave it out
and replace with the 8.3 plan cache invalidation, there will be a
regression in those cases. To fix, this merge commit includes a backport of
the rest of the plan cache invalidation support from PostgreSQL 8.4, which
handles functions and operators too. The upstream commit for that was:

commit ee33b95d
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Tue Sep 9 18:58:09 2008 +0000

    Improve the plan cache invalidation mechanism to make it invalidate plans
    when user-defined functions used in a plan are modified.  Also invalidate
    plans when schemas, operators, or operator classes are modified; but for these
    cases we just invalidate everything rather than tracking exact dependencies,
    since these types of objects seldom change in a production database.

    Tom Lane; loosely based on a patch by Martin Pihlak.

As a result of that, the gp_plpgsql_clear_cache_always GUC is removed, as
it's no longer needed. This also includes new test cases in the plpgsql
cache regression test, to demonstrate the improvements.

CURRENT OF code changes
-----------------------

This merge includes the upstream support for UPDATE WHERE CURRENT OF. GPDB
already had support for that; it has been refactored to use the upstream code
as much as possible (execCurrentOf()). Now that we have the upstream code
available, this also refactors the way current position of a cursor is
dispatched. Instead of modifying every CurrentOfExpr node in the plan tree
before dispatching it, store the current positions of each cursor mentioned
in a CurrentOfExpr in a separate CursorPosInfo node, and dispatch those
along with the query plan, in QueryDispatchDesc.

This was a joint effort between me (Heikki) and Daniel Gustafsson.


Change the way static Partition Selector nodes are printed in EXPLAIN · 5dc7dda9

Committed by Heikki Linnakangas on Jun 15, 2016

The printablePredicate of a static PartitionSelector node contains Var nodes
with varno=INNER. That's bogus, because the PartitionSelector node doesn't
actually have any child nodes, but works at execution time because the
printablePredicate is not only used by EXPLAIN. In most cases, it still
worked, because most Var nodes carry a varnoold field, which is used by
EXPLAIN for the lookup, but there was one case of "bogus varno" error even
memorized in the expected output of the regression suite. (PostgreSQL 8.3
changed the way EXPLAIN resolves the printable name so that varnoold no
longer saves the bacon, and you would get a lot more of those errors)

To fix, teach the EXPLAIN of a Sequence node to also reach into the
static PartitionSelector node, and print the printablePredicate as if that
qual was part of the Sequence node directly.

The user-visible effect of this is that the static Partition Selector
expression now appears in EXPLAIN output as a direct attribute of the
Sequence node, not as a separate child node. Also, if a static Partition
Selector doesn't have a "printablePredicate", i.e. it doesn't actually
do any selection, it's not printed at all.


Add support for CoerceViaIO nodes to ORCA translator library. · 84286b9c

Committed by Heikki Linnakangas on Jun 15, 2016

This is mostly copy-pasted from CoerceToDomain.

Note: This requires an up-to-date version of ORCA to compile, older
versions of the ORCA library itself don't know about CoerceViaIO nodes
either.


Revert unintentional change to horology.sql. · 72d6b148
Committed by Heikki Linnakangas on Jun 15, 2016
```
This change in commit af5e4bcf was surely not intentional, and made the
test fail.
```

Porting optimizer functional tests to ICG [#120031461] · af5e4bcf

Committed by Haisheng Yuan and Omer Arap on Jun 14, 2016

    Port `dml/triggers`
    Port `olap/groupingfunction`
    Port `functions/builtin`
    Port `dml/functional`
    Port `dml/functional/sql_partiton`
    Port `functions/functionProperty`

    Update optfunctional_schedule with ported tests

    Update the Makefile for optimizer functional tests


Allocate tuples retrieved by a SharedScan in a longer living memory context · c36ecd6f

Committed by Nikos Armenatzoglou on Jun 14, 2016

When SharedScan requests a tuple from the underlying Sort, and the Sort has spilled to disk, the mk_sort code (also the regular sort) makes a copy of the memtuple, but places it in the sortcontext.
When hitting a QueryFinishPending (e.g., master received enough tuples, so stop the execution), EagerFree is called for the entire tree. When reaching the sort node, the sortcontext is deleted, even though the SharedScan above it still has a pointer to a memtuple allocated in this context.

The solution is: pass a memory context for the mk_sort and regular sort to use (change the API and implementation of the following two functions: tuplesort_gettupleslot_pos_mk and tuplesort_gettupleslot_pos). Those functions should use the context provided by the caller to make a copy of the memtuple.
Signed-off-by: George Caragea <gcaragea@pivotal.io>
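
The lifetime problem and the fix can be sketched with a toy arena allocator. This is only an illustration under stated assumptions: `DemoContext` and the `demo_*` functions are hypothetical and are not the PostgreSQL MemoryContext API or the real tuplesort functions. The idea shown is that the copy of the tuple is made in a context supplied by the caller, so deleting the sort's own context does not invalidate it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy arena standing in for a memory context (hypothetical type). */
typedef struct DemoContext
{
    char   *buf;
    size_t  used;
    size_t  cap;
} DemoContext;

static DemoContext *
demo_context_create(size_t cap)
{
    DemoContext *cx = malloc(sizeof(*cx));

    assert(cx != NULL);
    cx->buf = malloc(cap);
    assert(cx->buf != NULL);
    cx->used = 0;
    cx->cap = cap;
    return cx;
}

/* Bump allocation for byte data; everything is freed with the context. */
static void *
demo_context_alloc(DemoContext *cx, size_t n)
{
    void *p;

    assert(cx->used + n <= cx->cap);
    p = cx->buf + cx->used;
    cx->used += n;
    return p;
}

static void
demo_context_delete(DemoContext *cx)
{
    free(cx->buf);
    free(cx);
}

/* Copy the tuple into the context chosen by the caller, not the sort's. */
static char *
demo_get_tuple_copy(const char *tuple, DemoContext *caller_cx)
{
    size_t  n = strlen(tuple) + 1;
    char   *copy = demo_context_alloc(caller_cx, n);

    memcpy(copy, tuple, n);
    return copy;
}
```

With this shape, a caller such as the scan node can pass a context that outlives the sort, and an eager free of the sort's context leaves the returned copy intact.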


Tiny bugfix for gpcrondump · abf26ba2

Committed by Jamie McAtamney on Jun 14, 2016

Removed extraneous argument to _prompt_continue that would cause
gpcrondump to fail if run without -a.


14 Jun, 2016 4 commits

Copy `regression.diffs` and `regression.out` after ICG failure [#120858343] · 4f48254a
Committed by Haisheng Yuan and Jesse Zhang on Jun 10, 2016

Modified Behave test to make repeated testing function correctly. · 0c31822e

Committed by Jamie McAtamney on Jun 13, 2016

In commit f79e98fa, which enabled the
backup and restoration of roles containing special characters, the new
Behave test did not remove the created test role at the end, so running
the test a second time would fail.  This commit updates that test so
that it does not leave the test role around between runs.


Fix spelling and file markers in resource scheduling/queue · a3592daa

Committed by Daniel Gustafsson on Jun 13, 2016

Fixed a few spelling errors, and minor whitespace issues, in the
Resource Scheduling README while reading it. Also updated the file
markers on the README, Makefile and code to match upstream CVS to
Git conversion. No semantic changes made to the text.


Enable assigning database owner containing special characters. · f79e98fa

Committed by Jamie McAtamney on Jun 01, 2016

Previously, the cdatabase file created by gpcrondump did not correctly
quote a role containing special characters in its CREATE DATABASE
statement.  Such roles are now handled correctly.

Authors: Jamie McAtamney and Chumki Roy


13 Jun, 2016 3 commits

Small improvements to the vagrant build for centos · fb911c72
Committed by Andreas Scherbaum on Jun 13, 2016
```
Fix provided by @akon-dey
Closes: #780
```

Dispatch exactly same text string for all slices. · 4b360942

Committed by Kenan Yao on Jun 06, 2016

Include a map from sliceIndex to gang_id in the dispatched string,
and remove the localSlice field, so each QE now gets its localSlice
from the map. This way, we avoid duplicating and modifying
the dispatch text string slice by slice, and each QE of a sliced
dispatch gets the same contents.

The extra space cost is sizeof(int) * SliceNumber bytes, and the extra
computing cost is iterating the SliceNumber-size array. Compared with
memcpy of text string for each slice in previous implementation, this
way is much cheaper, because SliceNumber is much smaller than the size
of dispatch text string. Also, since SliceNumber is so small, we just
use an array for the map instead of a hash table.
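
A minimal sketch of the lookup described above, assuming the map is a plain `int` array indexed by slice index. The function and constant names are hypothetical, not the dispatcher's real identifiers. Each QE scans the shared array for its own gang id instead of receiving a per-slice copy of the dispatch string.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NO_SLICE (-1)

/*
 * slice_to_gang[i] holds the gang id that executes slice i.
 * Returns the first slice index mapped to my_gang_id, or DEMO_NO_SLICE
 * if this gang does not appear in the map. The linear scan costs
 * O(num_slices), which is small next to copying the dispatch string.
 */
static int
demo_find_local_slice(const int *slice_to_gang, int num_slices, int my_gang_id)
{
    assert(slice_to_gang != NULL);

    for (int i = 0; i < num_slices; i++)
    {
        if (slice_to_gang[i] == my_gang_id)
            return i;
    }
    return DEMO_NO_SLICE;
}
```

The extra payload in this sketch is `sizeof(int) * num_slices` bytes, matching the space cost stated in the commit message.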

Also, clean up some dead code in dispatcher, including:
(1) Remove primary_gang_id field of Slice struct and DispatchCommandDtxProtocolParms
struct, since dispatch agent is deprecated now;
(2) Remove redundant logic in cdbdisp_dispatchX;
(3) Clean up buildGpDtxProtocolCommand;


Fix bugs when a SET command is executed within init plan · 2cda812c

Committed by Pengzhou Tang on May 24, 2016

In commit d2725929, GPDB marked all allocatedReaderGangs with the noReuse flag. When a plan contains
an init plan and a SET command is executed within it, GPDB will mark the pre-assigned gangs as noReuse and
destroy them, which makes the query crash.


11 Jun, 2016 2 commits

The GPDB Vmem is the lowest layer of memory allocator that supports higher... · 42ca3506

Committed by Foyzur Rahman on Jun 10, 2016

The GPDB Vmem is the lowest layer of memory allocator that supports higher allocators such as AllocSet. This layer (mostly defined in memprot.c) is in charge of actually calling malloc/realloc/free to allocate/reallocate/free memory. In this process this layer is also in charge of reserving "virtual" memory or Vmem, which is a GPDB specific shared memory counter to track per-segment combined allocations across all the GPDB processes under Vmem umbrella. The Vmem counter is managed by a separate module Vmem_Tracker, and the memprot functions (such as gp_malloc, gp_free2 and gp_realloc) call the APIs provided by VmemTracker.

Previously the memprot allocators (gp_alloc/gp_realloc/gp_free) were only allocating/freeing memory but were not adding any additional metadata in the header (and there was no header) to track the size of allocations. Therefore, there was no gp_free as freeing memory requires the size of the free to adjust Vmem counter inside VmemTracker. This was patched by explicitly passing size info in gp_free2.

In this PR we do the following:

* We add allocation size in Vmem header (along with checksums which are only available in debug build to detect header and footer boundary, and buffer overruns).

* We remove size information from the block header of AllocSet.

* We rename gp_free2 to gp_free as the second parameter (size information) is now obtained from the header and therefore no longer necessary

* We modify all the consumers of memprot.c APIs to use the new APIs

* We add unit tests to test the metadata and the correctness of the new Vmem allocators

This is the first step to integrate external modules and third party allocations with Vmem. A long running issue in GPDB is its inability to track allocations by external components including libraries such as ORCA. Therefore, the central Vmem counter is often way off from the underlying allocations, and this may run the system out of memory. By maintaining the size information in the Vmem header, we now have a self-contained allocator that can be exposed to external allocators such as GPOS allocators, without forcing them to manage size information separately.
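
The core idea, storing the allocation size in a header so that the free routine no longer needs a size argument, can be sketched as follows. This is a simplified illustration, not the memprot.c implementation: the `demo_*` names are hypothetical, a plain static counter stands in for the shared Vmem counter, and the debug-build checksums and footer are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the Vmem counter (the real one lives in shared memory). */
static size_t demo_vmem_reserved = 0;

/*
 * Header placed in front of every allocation. The union pads the header
 * so the user pointer after it keeps a conservative alignment.
 */
typedef union DemoHeader
{
    size_t      size;
    long double align_ld;
    long long   align_ll;
    void       *align_ptr;
} DemoHeader;

static void *
demo_malloc(size_t size)
{
    DemoHeader *hdr = malloc(sizeof(DemoHeader) + size);

    if (hdr == NULL)
        return NULL;
    hdr->size = size;            /* remember the size for demo_free */
    demo_vmem_reserved += size;  /* reserve in the counter */
    return hdr + 1;              /* user memory starts after the header */
}

/* No size parameter: it is recovered from the header. */
static void
demo_free(void *ptr)
{
    DemoHeader *hdr;

    if (ptr == NULL)
        return;
    hdr = (DemoHeader *) ptr - 1;
    assert(demo_vmem_reserved >= hdr->size);
    demo_vmem_reserved -= hdr->size;
    free(hdr);
}
```

Because the size travels with the allocation, an external allocator could call `demo_malloc` and `demo_free` without keeping its own size bookkeeping, which is the property the commit message highlights.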

This fixes #117269929.
Signed-off-by: Marc Spehlmann <marc.spehlmann@gmail.com>


[#120984085] Adds GUC for array expansion. · 53230187

Committed by Jesse Zhang and Marc Spehlmann on Jun 07, 2016

This GUC will be used to control the MEMO size as well as optimization
time for large IN list or large array comparison expressions.

Only arrays with fewer elements than the GUC value are
expanded and participate in constraint derivation.

The trade-off of using this GUC is the loss of potential benefits from
constraint derivation (e.g. conflict detection, partition elimination)
in exchange for shorter optimization time and lower memory utilization.
