Commits · 4b3609423ae8e1d269bf0b02238fd05af7ca495d · Greenplum / Gpdb

13 Jun, 2016 · 2 commits

Dispatch exactly same text string for all slices. · 4b360942

Authored by Kenan Yao on Jun 06, 2016

Include a map from sliceIndex to gang_id in the dispatched string
and remove the localSlice field, so each QE now gets its localSlice
from the map. This way, we avoid duplicating and modifying the
dispatch text string slice by slice, and every QE of a sliced
dispatch gets the same contents.

The extra space cost is sizeof(int) * SliceNumber bytes, and the extra
computing cost is iterating over the SliceNumber-size array. Compared
with the memcpy of the text string for each slice in the previous
implementation, this is much cheaper, because SliceNumber is much
smaller than the size of the dispatch text string. Also, since
SliceNumber is so small, we just use an array for the map instead of a
hash table.
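
The lookup described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the GPDB code: the struct `SliceGangMap`, its fields, and the function `find_local_slice` are hypothetical names, and the sketch assumes each gang id appears at most once in the map.

```c
#include <stddef.h>

/* Hypothetical stand-in for the dispatched map: entry i holds the id of
 * the gang that executes slice i. Not the actual GPDB structure. */
typedef struct SliceGangMap
{
    int         numSlices;
    const int  *sliceToGang;    /* numSlices entries */
} SliceGangMap;

/* Return the index of the slice run by my_gang_id, or -1 if there is none.
 * A linear scan is acceptable because the number of slices is small. */
static int
find_local_slice(const SliceGangMap *map, int my_gang_id)
{
    for (int i = 0; i < map->numSlices; i++)
    {
        if (map->sliceToGang[i] == my_gang_id)
            return i;
    }
    return -1;
}
```

Because every QE receives the same array and derives its own slice index from it, the dispatcher would not need to patch a per-slice field into separate copies of the string.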

Also, clean up some dead code in dispatcher, including:
(1) Remove primary_gang_id field of Slice struct and DispatchCommandDtxProtocolParms
struct, since dispatch agent is deprecated now;
(2) Remove redundant logic in cdbdisp_dispatchX;
(3) Clean up buildGpDtxProtocolCommand;

4b360942

Fix bugs when a SET command is executed within init plan · 2cda812c

Authored by Pengzhou Tang on May 24, 2016

In commit d2725929, GPDB marked all allocatedReaderGangs with the noReuse flag. When a plan contains
an init plan and a SET command is executed within it, GPDB marks the pre-assigned gangs as noReuse and
destroys them, which makes the query crash.

2cda812c

08 Jun, 2016 · 2 commits

Function to rebuild free tid list. · 00ac15f6

Authored by Asim R P on May 04, 2016

If found inconsistent, the free tid list will be rebuilt automatically during
recovery.  During normal operation, super user may invoke the function
gp_persistent_freelist_rebuild(OID) to rebuild the free list.

A basic test case is added to verify sanity of a free tid list rebuilt using
the function.

00ac15f6

Formatting changes. · c4f4e27b

Authored by Asim R P on Apr 27, 2016

Break long elog messages into multiple lines, remove trailing
whitespace and start elog messages with lower case.

c4f4e27b

07 Jun, 2016 · 2 commits

Don't decompress datums before sending them over the network. · b1aa03e4

Authored by Heikki Linnakangas on Jun 07, 2016

It seems better to send in compressed form, and decompress in the receiver,
to reduce network I/O. The amount of CPU work required for the
decompression is the same whether it's done in the sender or the receiver.
There was even a comment implying that, but for some reason, we didn't do
it that way.

This is in preparation for the PostgreSQL 8.3 merge: The HEAP_COMPRESSED
(and HEAP_HASEXTENDED which included it) flag was removed in PostgreSQL
8.3.

b1aa03e4

Fixing unit tests because of gcc proprietary extensions (#120725493). · a910c040

Authored by Foyzur Rahman on Jun 06, 2016

Signed-off-by: Marc Spehlmann <marc.spehlmann@gmail.com>

a910c040

03 Jun, 2016 · 4 commits

Move stripping of subqueries to the end of planning. · 1f91b092

Authored by Heikki Linnakangas on Jun 03, 2016

Seems more straightforward. Firstly, modifying a PlannedStmt at execution
is dubious, if the PlannedStmt is reused for executing the same query
again. Secondly, if the original Query objects are indeed not needed
after planning, we can save a little bit of memory, and avoid the overhead
of stripping the plan on every execution. This also makes any breakage
more obvious, if it turns out that the Query is actually still needed
at execution for some reason, as that issue would then show up on the
first execution already, and not only on reuse of a PlannedStmt.

Per Kenan Yao's observation that stripPlanBeforeDispatch() also scribbles
on the PlannedStmt.

1f91b092

Move memoryAccounting field from Plan to PlanState. · 4f4f10fc

Authored by Heikki Linnakangas on Apr 01, 2016

This is also transient information, only valid during execution, and should
not be cached with the plan.

4f4f10fc

Move transient information out of PlannedStmt, to a new struct. · 41478b89

Authored by Heikki Linnakangas on Apr 01, 2016

That includes the slice table, transientRecordTypes, and IntoClause's
oidInfo. These are transient information, created in ExecutorStart, not
something that should be cached along with the plan. transientRecordTypes
and oidInfo in particular were stored in PlannedStmt only so that they can
be conveniently dispatched to QEs along with the plan. That's not a problem
at the moment, but with the upcoming PostgreSQL 8.3 merge, we'll start
keeping the PlannedStmt struct around for many executions, so let's create
a new struct to hold that kind of information, which is transmitted from
QD to QEs along with the plan (that new struct is called QueryDispatchDesc).

41478b89

Make InitPlan Param mechanism cope with InitPlans that are never executed. · a5747dbc

Authored by Heikki Linnakangas on Jun 02, 2016

If there are two InitPlan nodes inside each other, and the outer InitPlan
never executes the inner InitPlan (because the value of the outer InitPlan
was determined without it), don't get confused when some of the params
are not set (= won't even have a valid type OID)

bfv_subquery regression tests exercised this. It failed with
--enable-cassert. Fixes github issue #500.

a5747dbc

02 Jun, 2016 · 1 commit

Remove checkpoint.h, and move the definitions in it to bgwriter.h. · a619808f

Authored by Heikki Linnakangas on Jun 01, 2016

Having a "checkpoint.h", corresponding to "checkpoint.c", makes perfect
sense, but those function definitions are in bgwriter.h in PostgreSQL, and
keeping the code as close to upstream as possible trumps the consistency of
keeping definitions for "foo.c" in header file called "foo.h". Keeping
things close to upstream makes merging easier.

a619808f

01 Jun, 2016 · 1 commit

Misc cosmetic cleanup. · 0da1de0e

Authored by Heikki Linnakangas on Jun 01, 2016

These changes were lifted from upcoming PostgreSQL 8.3 merge branch, but
are not related to the merge per se, so let's get them out of the way
before the big merge lands. Plus other cosmetic things in neighbouring
code that struck my eye.

0da1de0e

29 May, 2016 · 1 commit

Fix bug in freeGangsForPortal(). · e88fa555

Authored by Heikki Linnakangas on May 28, 2016

prev_item variable needs to be reset between the two loops. Otherwise,
if the first item in the latter list (allocatedReaderGangs1) needs to
be removed, things go wrong. I got an assertion failure with
installcheck-good from that:

FailedAssertion("!(prev != ((void *)0) ? ((prev)->next) == cell : list_head(list) == cell)", File: "list.c", Line: 616)

This got broken in the recent refactoring commit 46dfa750. The rest of the
changes in this commit, introducing the new next_item variable, wasn't
needed for correctness, but IMHO makes the code easier to understand.
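
The class of bug described here can be illustrated with a small, self-contained sketch. It does not use the PostgreSQL List API or the real freeGangsForPortal() code; `Node`, `push_front` and `remove_all` are hypothetical names for a plain singly linked list. The point it shows is that the "previous" pointer must start out as NULL for each list that is walked, because it is what decides whether the head or an interior link gets updated.

```c
#include <stdlib.h>

typedef struct Node
{
    int          value;
    struct Node *next;
} Node;

/* Prepend a value; aborts on allocation failure to keep the sketch short. */
static Node *
push_front(Node *head, int value)
{
    Node *n = malloc(sizeof(Node));

    if (n == NULL)
        abort();
    n->value = value;
    n->next = head;
    return n;
}

/* Unlink and free every node equal to target; returns the new head. */
static Node *
remove_all(Node *head, int target)
{
    Node *prev = NULL;      /* must be NULL at the start of every walk */
    Node *cur = head;

    while (cur != NULL)
    {
        Node *next = cur->next;     /* saved before cur may be freed */

        if (cur->value == target)
        {
            if (prev != NULL)
                prev->next = next;  /* unlink an interior node */
            else
                head = next;        /* unlink the first node */
            free(cur);
        }
        else
            prev = cur;
        cur = next;
    }
    return head;
}
```

If `prev` were shared between two consecutive walks without being reset, removing the first node of the second list would try to relink through a node belonging to the first list, which is the kind of inconsistency the assertion above catches.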

e88fa555

27 May, 2016 · 2 commits

Add unit tests for cdbdisp_dispatchPlan and cdbdisp_makeResult. · b71771ce

Authored by Kenan Yao on May 26, 2016

b71771ce

Code format and comment change in dispatcher; no code logic change involved. · 6606a5c6

Authored by Kenan Yao on May 26, 2016

6606a5c6

25 May, 2016 · 1 commit

Typo in gp_read_backup_file__() · cd69f0b3

Authored by Lubomir Petrov on May 23, 2016

cd69f0b3

21 May, 2016 · 1 commit

refactor gang management code · 46dfa750

Authored by Gang Xiong on May 20, 2016

1) add one new type of gang: singleton reader gang.
2) change interface of allocateGang.
3) handling exceptions during gang creation: segment down and segment reset.
4) cleanup some dead code.

46dfa750

19 May, 2016 · 1 commit

Split cdbdisp.c into several files, and put them into a new · 895b7d50

Authored by Pengzhou Tang on May 12, 2016

dispatcher/ directory

This commit has no logic change, it just contains movement of code across
files, to make dispatcher code clearer, and easier for unit testing.

Signed-off-by: Kenan Yao

895b7d50

18 May, 2016 · 1 commit

Revert changes related to backend shutdown. · af7b1b51

Authored by Heikki Linnakangas on May 18, 2016

There were a bunch of changes vs. upstream in the way the PGPROC free list
was managed, and the way backend exit was handled. They seemed largely
unnecessary, and somewhat buggy, so I reverted them. Avoiding unnecessary
differences makes merging with upstream easier too.

* The freelist was protected by atomic operations instead of a spinlock.
There was an ABA problem in the implementation, however. In Prepend(), if
another backend grabbed the PGPROC we were just about to grab for ourselves,
and returned it to the freelist before we iterate and notice, we might
set the head of the free list to a PGPROC that's actually already in use.
It's a tight window, and backend startup is quite heavy, so that's unlikely
to happen in practice. Still, it's a bug. Because backend start up is such
a heavy operation, this codepath is not so performance-critical that you
would gain anything from using atomic operations instead of a spinlock, so
just switch back to using a spinlock like in the upstream.

* When a backend exited, the responsibility to recycle the PGPROC entry
to the free list was moved to the postmaster, from the backend itself.
That's not broken per se, AFAICS, but it violates the general principle of
avoiding shared memory access in postmaster.

* There was a dead-man's switch, in the form of the postmasterResetRequired
flag in the PGPROC entry. If a backend died unexpectedly, and the flag
was set, postmaster would restart the whole server. If the flag was not
set, it would clean up only the PGPROC entry that was left behind and
let the system run normally. However, the flag was in fact always set,
except after ProcKill had already run, i.e. when the process had exited
normally. So I don't see the point of that, we might as well rely on the
exit status to signal normal/abnormal exit, like we do in the upstream. That
has worked fine for PostgreSQL.

* There was one more case where the dead-man's switch was activated, even
though the backend exited normally: In AuxiliaryProcKill(), if a filerep
subprocess died, and it didn't have a parent process anymore. That means
that the master filerep process had already died unexpectedly (filerep
subprocesses are children of the master filerep process, not direct children of postmaster).
That seems unnecessary, however: if the filerep process had died
unexpectedly, the postmaster should wake up to that, and would restart
the server. To play it safe, though, make the subprocess exit with non-zero
exit status in that case, so that the postmaster will wake up to that, if
it didn't notice the master filerep process dying for some reason.

* HaveNFreeProcs() was rewritten by maintaining the number of entries
in the free list in a variable, instead of walking the list to count them.
Presumably to make backend startup cheaper, when max_connections is high.
I kept that, but it's slightly simpler now that we use a spinlock to protect
the free list again: no need to use atomic ops for the variable anymore.

* The autovacFreeProcs list was not used. Autovacuum workers got their
PGPROC entry from the regular free list. Fix that, and also add
missing InitSharedLatch() call to the initialization of the autovacuum
workers list.
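
As a rough sketch of the approach described in the first and fifth bullets, a free list guarded by a spinlock can keep its entry count in a plain variable, since both are only touched while the lock is held. This is not the PostgreSQL or GPDB implementation: `ProcEntry`, `ProcFreeList` and the `freelist_*` / `have_n_free` functions are hypothetical names, and a C11 `atomic_flag` stands in for the real spinlock primitive.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PGPROC entry; only the link matters here. */
typedef struct ProcEntry
{
    struct ProcEntry *next;
} ProcEntry;

typedef struct ProcFreeList
{
    atomic_flag lock;       /* simple test-and-set spinlock */
    ProcEntry  *head;       /* protected by lock */
    int         nfree;      /* protected by lock, so a plain int suffices */
} ProcFreeList;

static void
freelist_init(ProcFreeList *fl)
{
    atomic_flag_clear(&fl->lock);
    fl->head = NULL;
    fl->nfree = 0;
}

static void
freelist_lock(ProcFreeList *fl)
{
    while (atomic_flag_test_and_set_explicit(&fl->lock, memory_order_acquire))
        ;                   /* spin until the holder releases the lock */
}

static void
freelist_unlock(ProcFreeList *fl)
{
    atomic_flag_clear_explicit(&fl->lock, memory_order_release);
}

static void
freelist_push(ProcFreeList *fl, ProcEntry *e)
{
    freelist_lock(fl);
    e->next = fl->head;
    fl->head = e;
    fl->nfree++;
    freelist_unlock(fl);
}

/* Pop one entry, or return NULL if the list is empty. Reading the head and
 * replacing it happen inside one critical section, so no other process can
 * recycle the entry in between (the ABA window of a lock-free version). */
static ProcEntry *
freelist_pop(ProcFreeList *fl)
{
    ProcEntry *e;

    freelist_lock(fl);
    e = fl->head;
    if (e != NULL)
    {
        fl->head = e->next;
        e->next = NULL;
        fl->nfree--;
    }
    freelist_unlock(fl);
    return e;
}

/* True if at least n entries are free; no list walk is needed. */
static bool
have_n_free(ProcFreeList *fl, int n)
{
    bool result;

    freelist_lock(fl);
    result = (fl->nfree >= n);
    freelist_unlock(fl);
    return result;
}
```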

af7b1b51

17 May, 2016 · 2 commits

Fixing gang leak during disconnectAndDestroyAllGangs · 17ab121e

Authored by Foyzur Rahman on May 13, 2016

Signed-off-by: Karthikeyan Jambu Rajaraman <karthi.jrk@gmail.com>

17ab121e

Validate the previous free TID in gp_persistent_relation_node. · e8c990fd

Authored by Jimmy Yih on Apr 22, 2016

Previously, previous free TID validation was done under the GUC
persistent_integrity_checks. This commit extracts the previous free TID
validation into another GUC validate_previous_free_tid and is enabled by
default. If the validation detects a corruption in the free TID list, we
will now switch to a new free TID list and leave the corrupted one detached
for cleanup during persistent table rebuild or during crash recovery.

Authors: Jimmy Yih and Abhijit Subramanya

e8c990fd

13 May, 2016 · 3 commits

Clean up the way the results array is allocated in cdbdisp_returnResults(). · 6a28c978

Authored by Heikki Linnakangas on May 13, 2016

I saw the "nresults < nslots" assertion fail, while hacking on something
else. It happened when a Distributed Prepare command failed, and there were
several error result sets from a segment. I'm not sure how normal it is to
receive multiple ERROR responses to a single query, but the protocol
certainly allows it, and I don't see any explanation for why the code used
to assume that there can be at most 2 result sets from each segment.

Remove that assumption, and make the code cope with more than two result
sets from a segment, by calculating the required size of the array
accurately.

In passing, remove the NULL-terminator from the array, and change the
callers that depended on it to use the returned size variable instead.
Makes the loops in the callers look less funky.
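
The allocation strategy described above can be sketched as a two-pass routine: count first, then allocate exactly that many slots and hand the count back to the caller. This is an illustration only, not cdbdisp_returnResults(); `SegmentResults` and `collect_results` are hypothetical names, and strings stand in for result sets.

```c
#include <stdlib.h>

/* Hypothetical per-segment container: any number of result sets. */
typedef struct SegmentResults
{
    int          nresults;
    const char **results;
} SegmentResults;

/* Gather all results into one array sized from the actual counts.
 * The array has no NULL terminator; its length is returned through
 * nresults_out. The caller frees the array (not the elements). */
static const char **
collect_results(const SegmentResults *segs, int nsegs, int *nresults_out)
{
    int          total = 0;
    int          n = 0;
    const char **out;

    /* First pass: count, instead of assuming a fixed number per segment. */
    for (int i = 0; i < nsegs; i++)
        total += segs[i].nresults;

    out = malloc(sizeof(*out) * (size_t) (total > 0 ? total : 1));
    if (out == NULL)
    {
        *nresults_out = 0;
        return NULL;
    }

    /* Second pass: copy the pointers. */
    for (int i = 0; i < nsegs; i++)
        for (int j = 0; j < segs[i].nresults; j++)
            out[n++] = segs[i].results[j];

    *nresults_out = n;
    return out;
}
```

A caller then loops with `for (int i = 0; i < n; i++)` over the returned array rather than scanning for a terminator.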

6a28c978

Fix memory leak in gp_read_error_log(). · c04d827d

Authored by Heikki Linnakangas on May 13, 2016

The code incorrectly called free() on last+1 element of the array. The
array returned by cdbdisp_dispatchRMCommand() always has a NULL element as
terminator, and free(NULL) is a no-op, which is why this didn't outright
crash. But clearly the intention here was to free() the array itself,
otherwise it's leaked.
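
A minimal sketch of the mistake and its fix, under stated assumptions: the helpers below (`tracked_alloc`, `tracked_free`, `make_results`, `release_results`) are hypothetical, and a counter of live blocks replaces the real result objects so the leak is observable. After the loop, the index points at the NULL terminator, so freeing that element does nothing; the array pointer itself is what has to be released.

```c
#include <stdlib.h>

static int live_blocks = 0;     /* outstanding allocations, for illustration */

static void *
tracked_alloc(size_t size)
{
    void *p = malloc(size);

    if (p == NULL)
        abort();                /* keep the sketch short */
    live_blocks++;
    return p;
}

static void
tracked_free(void *p)
{
    if (p == NULL)
        return;                 /* like free(NULL): a no-op */
    live_blocks--;
    free(p);
}

/* Build an array of n elements followed by a NULL terminator. */
static void **
make_results(int n)
{
    void **arr = tracked_alloc(sizeof(void *) * (size_t) (n + 1));

    for (int i = 0; i < n; i++)
        arr[i] = tracked_alloc(16);
    arr[n] = NULL;
    return arr;
}

static void
release_results(void **arr)
{
    int i;

    if (arr == NULL)
        return;
    for (i = 0; arr[i] != NULL; i++)
        tracked_free(arr[i]);
    /* arr[i] is the NULL terminator here; tracked_free(arr[i]) would be a
     * no-op and would leave arr itself allocated. */
    tracked_free(arr);
}
```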

c04d827d

Fix typo in get_parts() comment documentation · da66ec51

Authored by Daniel Gustafsson on May 11, 2016

da66ec51

10 May, 2016 · 3 commits

Fix spelling in error log message · df0e67da

Authored by Daniel Gustafsson on May 09, 2016

Spotted while fixing consumers of this function in 09ed186.

df0e67da

Remove WATCH_VISIBILITY_IN_ACTION debugging aid. · 28b12671

Authored by Heikki Linnakangas on May 10, 2016

It might be useful in debugging, but it got pretty badly in the way while
merging with PostgreSQL 8.3. We could fix it, of course, but on balance,
I don't think it's worth the effort. It's going to be a maintenance burden
going forward too, as the WATCH_* calls are scattered all over the
visibility checking code. If we need debugging code like that, we should
find a less invasive way to implement it, or submit the mechanism to
upstream so that we wouldn't need to maintain it as a diff.

28b12671

Separate GPDB-specific shared snapshot code to its own .c and .h files. · 7745cde1

Authored by Heikki Linnakangas on May 10, 2016

Not urgent to do right now, but makes merging and diffing easier.

7745cde1

09 May, 2016 · 2 commits

Clean up EXPLAIN-related code. · 3ea0db50

Authored by Heikki Linnakangas on May 09, 2016

There's no real need for a separate MemoryContext for EXPLAIN stuff. Might
as well leave all that in the per-statement context.

3ea0db50

Remove EXPLAIN remnants of workfile caching. · 0eb57971

Authored by Heikki Linnakangas on May 09, 2016

Workfile caching was an experimental feature that was never taken into
production use. It has since been removed, but these instrumentation
remnants of it were still lying around unused.

0eb57971

06 May, 2016 · 2 commits

Fix typos and comment style in cdbcat.c · 19dccfde

Authored by Daniel Gustafsson on May 06, 2016

Fixes a few observed spelling errors and also removes excessive
whitespace in comments.

19dccfde

refactor dispatcher code · 7363a6d9

Authored by Gang Xiong on Apr 23, 2016

refactor cdbdisp_dispatchToGang interface.
refactor memory management in dispatch.

7363a6d9

05 May, 2016 · 1 commit

Remove warning of 'incompatible type' when compiling (#688) · f3983337

Authored by Kuien Liu on May 05, 2016

Files changed:
    modified:   src/backend/cdb/motion/ic_common.c
    modified:   src/backend/executor/spi.c
    modified:   src/backend/nodes/outfuncs.c
    modified:   src/backend/optimizer/path/costsize.c
    modified:   src/backend/storage/file/compress_zlib.c

Note: The warning in function _outScanInfo() of outfuncs.c is temporarily
    fixed and would be treated as dead code to be removed soon.

Thanks to Heikki Linnakangas' comments.

f3983337

04 May, 2016 · 2 commits

Mark functions that are not used elsewhere as static, in cdbhash.c. · 4c6cbcc8

Authored by Heikki Linnakangas on May 04, 2016

4c6cbcc8

Add ddboost storage unit option into gpcrondump, gpdbrestore and gpmfr · f044948c

Authored by Pengcheng Tang on Apr 25, 2016

When a user dumps a database to a Data Domain Boost server, the storage
unit and backup directory must already be created and specified.
Previously, the storage unit was hard-coded to "GPDB" and the user
had no option to use another one.

This commit adds the --ddboost-storage-unit option, which allows
the user to specify the storage unit dynamically for dump and restore.

This commit also allows the user to have the storage unit information
saved statically in the configuration file on the cluster host.

This commit also adds the storage unit option to gpmfr for replicating
and recovering dump copies, in which case it uses an identical storage
unit and backup directory on the primary and secondary DDBoost servers.

The --ddboost-storage-unit option takes priority over a statically
configured storage unit.

Authors:
Pengcheng Tang, Marbin Tan, Nikhil Kak
Lawrence Hamel, Stephen Wu, Chris Hajas, Chumki Roy

f044948c

03 May, 2016 · 1 commit

Fix using external table in a subplan. · 8ae5a93f

Authored by Heikki Linnakangas on May 03, 2016

ParallelizeCorrelatedSubPlanMutator() turns each Scan on a base relation
into a "Result - Material - Broadcast - Scan" pattern, but it missed
ExternalScans. External tables are supposed to be treated as distributed,
i.e. each segment holds a different part of the external table, so they
need to be treated like regular tables.

8ae5a93f

29 Apr, 2016 · 1 commit

Remove warning of 'unused variables' when compiling (#675) · ed3e998d

Authored by Kuien Liu on Apr 29, 2016

Files changed:
    modified:   src/backend/access/external/url.c
    modified:   src/backend/cdb/cdbhash.c
    modified:   src/backend/cdb/cdbmutate.c
    modified:   src/pl/plpgsql/src/pl_funcs.c

Skipped:
    catcoretable.c:104:26:CatCoreType_int4_array
	because it is an enumerated constant to ensure integrity

ed3e998d

25 Apr, 2016 · 1 commit

Simplify the parsed-representation of ALTER TABLE ADD PARTITION. · 08db9061

Authored by Heikki Linnakangas on Apr 25, 2016

atpxPartAddList() needs a CreateStmt that represents the parent table,
but instead of creating it already in the parser, and adding more details
to it in analyze.c, it's simpler to create it later, in atpxPartAddList(),
where it's actually needed.

08db9061

14 Apr, 2016 · 2 commits

Support complex number type · 9dd747ae

Authored by Atri Sharma on Mar 11, 2016

9dd747ae

Fix memory overflow when number of distributed by columns exceed the limitation. · 3db8abf3

Authored by Pengzhou Tang on Apr 05, 2016

The maximum number of DISTRIBUTED BY columns is 1600; gpdb should error out when
that limit is exceeded. gpdb should also allocate enough memory to hold those
columns, otherwise it will overflow the buffer.
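
The two requirements in this message (reject an over-long column list, and size the allocation from the actual column count) can be sketched as below. This is a hedged illustration rather than the GPDB fix: the macro `MAX_DIST_COLUMNS` and the function `alloc_dist_columns` are hypothetical names, and only the value 1600 comes from the commit message.

```c
#include <stdlib.h>

/* Limit quoted in the commit message; the macro name is hypothetical. */
#define MAX_DIST_COLUMNS 1600

/* Return a zeroed array with room for ncols column numbers, or NULL if
 * ncols is out of range (the caller would report the error) or if the
 * allocation fails. */
static int *
alloc_dist_columns(int ncols)
{
    if (ncols < 0 || ncols > MAX_DIST_COLUMNS)
        return NULL;
    return calloc((size_t) (ncols > 0 ? ncols : 1), sizeof(int));
}
```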

3db8abf3

07 Apr, 2016 · 1 commit

Using zlib with palloc() for spillfiles and dispatch · 7d2d2388

Authored by George Caragea on Apr 06, 2016

This commit makes zlib use palloc/pfree, by rewriting compress_zlib.c to utilize the API in gfile.c, and by making the dispatcher use palloc and pfree through a set of zlib wrapper functions. This enables our protection mechanisms (gp_vmem_protect_limit, runaway_detector_activation_percent) to keep the system stable by cleanly canceling a query which would otherwise cause us to run out of memory.

Signed-off-by: Nikos Armenatzoglou <nikos.armenatzoglou@gmail.com>

7d2d2388