Commits · 0caa0d04db24d2c571fa7daa410bc6a5b319a2a2 · Greenplum / Gpdb

27 Jun, 2012 1 commit

Make DROP FUNCTION hint more informative. · 0caa0d04

Committed by Robert Haas on Jun 26, 2012

If you decide you want to take the hint, this gives you something you
can paste right back to the server.

Dean Rasheed

0caa0d04

26 Jun, 2012 9 commits

Reduce use of heavyweight locking inside hash AM. · 76837c15

Committed by Robert Haas on Jun 26, 2012

Avoid using LockPage(rel, 0, lockmode) to protect against changes to
the bucket mapping. Instead, an exclusive buffer content lock is now
viewed as sufficient permission to modify the metapage, and a shared
buffer content lock is used when such modifications need to be
prevented. This more relaxed locking regimen makes it possible that,
when we're busy getting a heavyweight lock on the bucket we intend
to search or insert into, a bucket split might occur underneath us.
To compensate for that possibility, we use a loop-and-retry system:
release the metapage content lock, acquire the heavyweight lock on the
target bucket, and then reacquire the metapage content lock and check
that the bucket mapping has not changed. Normally it hasn't, and
we're done. But if by chance it has, we simply unlock the metapage,
release the heavyweight lock we acquired previously, lock the new
bucket, and loop around again. Even in the worst case we cannot loop
very many times here, since we don't split the same bucket again until
we've split all the other buckets, and 2^N gets big pretty fast.
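
The loop-and-retry scheme described above can be sketched roughly as follows. This is a single-threaded toy model, not the hash AM code: the `Metapage` class, the modulo bucket mapping, the `held_locks` set, and the `on_wait` hook are all hypothetical stand-ins chosen for illustration.

```python
class Metapage:
    """Toy stand-in for the hash index metapage (hypothetical structure)."""

    def __init__(self, nbuckets):
        self.nbuckets = nbuckets

    def bucket_for(self, hashvalue):
        # Simplified mapping; the real bucket mapping is more involved.
        return hashvalue % self.nbuckets


def lock_target_bucket(meta, hashvalue, held_locks, on_wait=None):
    """Lock the bucket for hashvalue, retrying if the mapping changes.

    held_locks is a set standing in for heavyweight bucket locks.
    on_wait simulates concurrent activity (such as a bucket split)
    that happens while we wait for the heavyweight lock.
    """
    # Read the mapping while holding the metapage content lock.
    bucket = meta.bucket_for(hashvalue)
    attempts = 0
    while True:
        attempts += 1
        # The metapage content lock is released here, so the mapping
        # may change before we get the heavyweight lock.
        if on_wait is not None:
            on_wait(meta, attempts)
        held_locks.add(bucket)          # acquire heavyweight bucket lock
        # Reacquire the metapage content lock and recheck the mapping.
        current = meta.bucket_for(hashvalue)
        if current == bucket:
            return bucket, attempts     # mapping unchanged: done
        held_locks.discard(bucket)      # stale lock: release and retry
        bucket = current


def split_once(meta, attempt):
    # Simulate a split that doubles the bucket count during the first wait.
    if attempt == 1:
        meta.nbuckets *= 2


meta = Metapage(4)
locks = set()
bucket, attempts = lock_target_bucket(meta, 6, locks, on_wait=split_once)
```

In this run the first attempt locks a bucket that is stale by the time the mapping is rechecked, so the sketch releases it and succeeds on the second pass.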

This results in greatly improved concurrency, because we're
effectively replacing two lwlock acquire-and-release cycles in
exclusive mode (on one of the lock manager locks) with a single
acquire-and-release cycle in shared mode (on the metapage buffer
content lock). Testing shows that it's still not quite as good as
btree; for that, we'd probably have to find some way of getting rid
of the heavyweight bucket locks as well, which does not appear
straightforward.

Patch by me, review by Jeff Janes.

76837c15

Fix pg_upgrade, broken by the xlogid/segno -> 64-bit int refactoring. · 038f3a05

Committed by Heikki Linnakangas on Jun 26, 2012

The xlogid + segno representation of a particular WAL segment doesn't make
much sense in pg_resetxlog anymore, now that we don't use that anywhere
else. Use the WAL filename instead, since that's a convenient way to name a
particular WAL segment.

I did this partially for pg_resetxlog in the original xlogid/segno -> uint64
patch, but I neglected pg_upgrade and the docs. This should now be more
complete.
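
As an illustration of why the WAL file name is a convenient handle for a segment, here is a minimal sketch of the mapping between a 64-bit segment number and a file name. It assumes the default 16 MB segment size and the usual 24-hex-digit name layout (timeline, log, segment); the Python function names are hypothetical.

```python
XLOG_SEG_SIZE = 16 * 1024 * 1024                     # assumed default segment size
SEGMENTS_PER_XLOGID = 0x100000000 // XLOG_SEG_SIZE   # 256 segments per 4 GB


def wal_file_name(timeline, segno):
    """Build the 24-character WAL file name for a 64-bit segment number."""
    return "%08X%08X%08X" % (
        timeline,
        segno // SEGMENTS_PER_XLOGID,
        segno % SEGMENTS_PER_XLOGID,
    )


def parse_wal_file_name(name):
    """Return (timeline, segno) for a WAL file name."""
    timeline = int(name[0:8], 16)
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    return timeline, log * SEGMENTS_PER_XLOGID + seg


example_name = wal_file_name(1, 0x1FF)
```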

038f3a05

Make pg_dump emit more accurate dependency information. · 8a504a36

Committed by Tom Lane on Jun 25, 2012

While pg_dump has included dependency information in archive-format output
ever since 7.3, it never made any large effort to ensure that that
information was actually useful. In particular, in common situations where
dependency chains include objects that aren't separately emitted in the
dump, the dependencies shown for objects that were emitted would reference
the dump IDs of these un-dumped objects, leaving no clue about which other
objects the visible objects indirectly depend on. So far, parallel
pg_restore has managed to avoid tripping over this misfeature, but only
by dint of some crude hacks like not trusting dependency information in
the pre-data section of the archive.

It seems prudent to do something about this before it rises up to bite us,
so instead of emitting the "raw" dependencies of each dumped object,
recursively search for its actual dependencies among the subset of objects
that are being dumped.
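
The idea of looking through un-dumped objects can be sketched as a graph walk. This is not pg_dump's code: the function name, the dictionary-based dependency graph, and the object names are hypothetical, and the walk is written iteratively rather than recursively.

```python
def effective_dependencies(obj, raw_deps, dumped):
    """Return the dumped objects that obj depends on, directly or via
    chains of objects that are not themselves dumped.

    raw_deps maps an object to the objects it directly depends on.
    dumped is the set of objects that actually appear in the dump.
    """
    result = set()
    seen = {obj}
    stack = list(raw_deps.get(obj, ()))
    while stack:
        dep = stack.pop()
        if dep in seen:
            continue
        seen.add(dep)
        if dep in dumped:
            result.add(dep)                       # visible: record it, stop here
        else:
            stack.extend(raw_deps.get(dep, ()))   # hidden: look through it
    return result


raw_deps = {
    "view_v": ["rule_r"],      # rule_r is not emitted separately
    "rule_r": ["table_t"],
    "table_t": [],
}
dumped = {"view_v", "table_t"}
deps_of_view = effective_dependencies("view_v", raw_deps, dumped)
```

Here the raw dependency of `view_v` points at an object that never appears in the dump, while the computed set names the dumped table it indirectly relies on.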

Back-patch to 9.2, since that code hasn't yet diverged materially from
HEAD. At some point we might need to back-patch further, but right now
there are no known cases where this is actively necessary. (The one known
case, bug #6699, is fixed in a different way by my previous patch.) Since
this patch depends on 9.2 changes that made TOC entries be marked before
output commences as to whether they'll be dumped, back-patching further
would require additional surgery; and as of now there's no evidence that
it's worth the risk.

8a504a36

Improve pg_dump's dependency-sorting logic to enforce section dump order. · a1ef01fe

Committed by Tom Lane on Jun 25, 2012

As of 9.2, with the --section option, it is very important that the concept
of "pre data", "data", and "post data" sections of the output be honored
strictly; else a dump divided into separate sectional files might be
unrestorable. However, the dependency-sorting logic knew nothing of
sections and would happily select output orderings that didn't fit that
structure. Doing so was mostly harmless before 9.2, but now we need to be
sure it doesn't do that. To fix, create dummy objects representing the
section boundaries and add dependencies between them and all the normal
objects. (This might sound expensive but it seems to only add a percent or
two to pg_dump's runtime.)
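
A rough sketch of the dummy-boundary technique, using Python's standard `graphlib` (Python 3.9+) in place of pg_dump's own sorting code. The boundary node names, the `objects` and `deps` shapes, and the function name are assumptions made for illustration.

```python
from graphlib import TopologicalSorter, CycleError

SECTIONS = ["pre-data", "data", "post-data"]


def sort_with_section_boundaries(objects, deps):
    """Order objects so that dependencies and section order both hold.

    objects maps a name to its section; deps maps a name to the set of
    names it depends on.
    """
    graph = {name: set(deps.get(name, ())) for name in objects}
    # Two dummy nodes mark the ends of the pre-data and data sections.
    boundaries = ["<end pre-data>", "<end data>"]
    graph[boundaries[0]] = set()
    graph[boundaries[1]] = {boundaries[0]}
    for name, section in objects.items():
        idx = SECTIONS.index(section)
        if idx > 0:
            graph[name].add(boundaries[idx - 1])   # object follows the boundary before it
        if idx < 2:
            graph[boundaries[idx]].add(name)       # boundary follows the object
    order = TopologicalSorter(graph).static_order()
    return [name for name in order if name in objects]


objects = {
    "table t": "pre-data",
    "view v": "pre-data",
    "t rows": "data",
    "index i": "post-data",
}
deps = {"view v": {"table t"}, "t rows": {"table t"}, "index i": {"table t"}}
order = sort_with_section_boundaries(objects, deps)
```

In this sketch, a pre-data object that depends on a post-data object produces a `CycleError`. That is the kind of loop the commit describes pg_dump resolving by splitting a view into separate table and rule declarations.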

This also fixes a problem introduced in 9.1 by the feature that allows
incomplete GROUP BY lists when a primary key is given in GROUP BY.
That means that views can depend on primary key constraints. Previously,
pg_dump would deal with that by simply emitting the primary key constraint
before the view definition (and hence before the data section of the
output). That's bad enough for simple serial restores, where creating an
index before the data is loaded works, but is undesirable for speed
reasons. But it could lead to outright failure of parallel restores, as
seen in bug #6699 from Joe Van Dyk. That happened because pg_restore would
switch into parallel mode as soon as it reached the constraint, and then
very possibly would try to emit the view definition before the primary key
was committed (as a consequence of another bug that causes the view not to
be correctly marked as depending on the constraint). Adding the section
boundary constraints forces the dependency-sorting code to break the view
into separate table and rule declarations, allowing the rule, and hence the
primary key constraint it depends on, to revert to their intended location
in the post-data section. This also somewhat accidentally works around the
bogus-dependency-marking problem, because the rule will be correctly shown
as depending on the constraint, so parallel pg_restore will now do the
right thing. (We will fix the bogus-dependency problem for real in a
separate patch, but that patch is not easily back-portable to 9.1, so the
fact that this patch is enough to dodge the only known symptom is
fortunate.)

Back-patch to 9.1, except for the hunk that adds verification that the
finished archive TOC list is in correct section order; the place where
it was convenient to add that doesn't exist in 9.1.

a1ef01fe

Tighten up includes in sinvaladt.h, twophase.h, proc.h · 77ed0c69

Committed by Alvaro Herrera on Jun 25, 2012

Remove proc.h from sinvaladt.h and twophase.h; also replace xlog.h in
proc.h with xlogdefs.h.

77ed0c69

Unify calling conventions for postgres/postmaster sub-main functions · eeece9e6

Committed by Peter Eisentraut on Jun 25, 2012

There was a wild mix of calling conventions: Some were declared to
return void and didn't return, some returned an int exit code, some
claimed to return an exit code, which the callers checked, but
actually never returned, and so on.

Now all of these functions are declared to return void and decorated
with attribute noreturn and don't return.  That's easiest, and most
code already worked that way.

eeece9e6

Fix typo in DEBUG message, introduced by recent WAL refactoring. · c7d47abd

Committed by Robert Haas on Jun 25, 2012

Fujii Masao

c7d47abd

Unbreak pg_resetxlog -l. · a6427f1f

Committed by Robert Haas on Jun 25, 2012

Fujii Masao

a6427f1f

Remove sanity test in XRecOffIsValid. · 2dfa87bc

Committed by Robert Haas on Jun 25, 2012

Commit 061e7efb changed the rules
for splitting xlog records across pages, but neglected to update this
test.  It's possible that there's some better action here than just
removing the test completely, but this at least appears to get some
of the things that are currently broken (like initdb on MacOS X)
working again.

2dfa87bc

25 Jun, 2012 7 commits

Fix warning for 64-bit literal on 32-bit build. · 5c7f954d

Committed by Kevin Grittner on Jun 25, 2012

5c7f954d

Replace int2/int4 in C code with int16/int32 · b8b2e3b2

Committed by Peter Eisentraut on Jun 25, 2012

The latter was already the dominant use, and it's preferable because
in C the convention is that intXX means XX bits. Therefore, allowing
mixed use of int2, int4, int8, int16, int32 is obviously confusing.

Remove the typedefs for int2 and int4 for now. They don't seem to be
widely used outside of the PostgreSQL source tree, and the few uses
can probably be cleaned up by the time this ships.

b8b2e3b2

I missed some references to xlogid/xrecoff in Win32-only code. Fix. · 7eb8c785

Committed by Heikki Linnakangas on Jun 24, 2012

7eb8c785

Use UINT64CONST for 64-bit integer constants. · 0687a260

Committed by Heikki Linnakangas on Jun 24, 2012

Peter Eisentraut advised me that UINT64CONST is the proper way to do that,
not LL suffix.

0687a260

Oops. Remove stray paren. · a218e23a

Committed by Heikki Linnakangas on Jun 24, 2012

I didn't notice this on my laptop as I don't HAVE_FSYNC_WRITETHROUGH.

a218e23a

Use LL suffix for 64-bit constants. · 96ff85e2

Committed by Heikki Linnakangas on Jun 24, 2012

Per warning from buildfarm member 'locust'. At least I think this is what's
making it upset.

96ff85e2

Replace XLogRecPtr struct with a 64-bit integer. · 0ab9d1c4

Committed by Heikki Linnakangas on Jun 24, 2012

This simplifies code that needs to do arithmetic on XLogRecPtrs.

To avoid changing on-disk format of data pages, the LSN on data pages is
still stored in the old format. That should keep pg_upgrade happy. However,
we have XLogRecPtrs embedded in the control file, and in the structs that
are sent over the replication protocol, so this change breaks compatibility
of pg_basebackup and server. I didn't do anything about this in this patch;
per discussion on -hackers, the right thing to do would be to change the
replication protocol to be architecture-independent, so that you could use
a newer version of pg_receivexlog, for example, against an older server
version.
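
To show why a plain 64-bit integer simplifies arithmetic, here is a small sketch that packs the old two-field form into one integer. It assumes the high 32 bits hold xlogid and the low 32 bits hold xrecoff; the Python function names are hypothetical.

```python
def make_lsn(xlogid, xrecoff):
    """Pack the old (xlogid, xrecoff) pair into one 64-bit integer."""
    return (xlogid << 32) | xrecoff


def split_lsn(lsn):
    """Recover the (xlogid, xrecoff) pair from a 64-bit position."""
    return lsn >> 32, lsn & 0xFFFFFFFF


def format_lsn(lsn):
    """Render a position in the familiar hex 'hi/lo' text form."""
    return "%X/%X" % split_lsn(lsn)


start = make_lsn(1, 0xFFFFFFF0)
end = make_lsn(2, 0)
distance = end - start   # a single subtraction, no carry handling needed
```

With the two-field struct, the same distance calculation has to handle the carry between xlogid and xrecoff by hand.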

0ab9d1c4

24 Jun, 2012 3 commits

Allow WAL record header to be split across pages. · 061e7efb

Committed by Heikki Linnakangas on Jun 24, 2012

This saves a few bytes of WAL space, but the real motivation is to make it
predictable how much WAL space a record requires, as it no longer depends
on whether we need to waste the last few bytes at end of WAL page because
the header doesn't fit.

The total length field of WAL record, xl_tot_len, is moved to the beginning
of the WAL record header, so that it is still always found on the first page
where a WAL record begins.

Bump WAL version number again as this is an incompatible change.

061e7efb

Move WAL continuation record information to WAL page header. · 20ba5ca6

Committed by Heikki Linnakangas on Jun 24, 2012

The continuation record only contained one field, xl_rem_len, so it makes
things simpler to just include it in the WAL page header. This wastes four
bytes on pages that don't begin with a continuation from previous page, plus
four bytes on every page, because of padding.

The motivation of this is to make it easier to calculate how much space a
WAL record needs. Before this patch, it depended on how many page boundaries
the record crosses. The motivation of that, in turn, is to separate the
allocation of space in the WAL from the copying of the record data to the
allocated space. Keeping the calculation of space required simple helps to
keep the critical section of allocating the space from WAL short. But that's
not included in this patch yet.

Bump WAL version number again, as this is an incompatible change.

20ba5ca6

Don't waste the last segment of each 4GB logical log file. · dfda6eba

Committed by Heikki Linnakangas on Jun 24, 2012

The comments claimed that wasting the last segment made it easier to do
calculations with XLogRecPtrs, because you don't have problems representing
last-byte-position-plus-1 that way. In my experience, however, it only made
things more complicated, because there were two ways to represent the
boundary at the beginning of a logical log file: xlogid = n+1 and xrecoff = 0,
or as xlogid = n and xrecoff = 4GB - XLOG_SEG_SIZE. Some functions were
picky about which representation was used.

Also, use a 64-bit segment number instead of the log/seg combination, to
point to a certain WAL segment. We assume that all platforms have a working
64-bit integer type nowadays.

This is an incompatible change in WAL format, so bumping WAL version number.

dfda6eba

22 Jun, 2012 5 commits

Make pgbench -i emit only one-tenth as many status messages. · 47c7365e

Committed by Robert Haas on Jun 22, 2012

These days, even a wimpy system can insert 10000 tuples in the blink of
an eye, so there's no real need for this much verbosity.

Per complaint from Tatsuo Ishii.

47c7365e

Document that && can be used to search arrays. · 6ef5baf8

Committed by Robert Haas on Jun 22, 2012

Also, add some cross-links to the indexing documentation, so it's easier
to notice that && and other array operators have index support.

Ryan Kelly, edited by me.

6ef5baf8

Make placeholders in SQL command help more consistent and precise · 6753ced3

Committed by Peter Eisentraut on Jun 22, 2012

To avoid divergent names on related pages, avoid ambiguities, and
reduce translation work a little.

6753ced3

Fix memory leak in ARRAY(SELECT ...) subqueries. · d14241c2

Committed by Tom Lane on Jun 21, 2012

Repeated execution of an uncorrelated ARRAY_SUBLINK sub-select (which
I think can only happen if the sub-select is embedded in a larger,
correlated subquery) would leak memory for the duration of the query,
due to not reclaiming the array generated in the previous execution.
Per bug #6698 from Armando Miraglia. Diagnosis and fix idea by Heikki,
patch itself by me.

This has been like this all along, so back-patch to all supported versions.

d14241c2

Repair comment mangled by a pgindent run long ago · 68d0e3cb

Committed by Alvaro Herrera on May 28, 2012

68d0e3cb

21 Jun, 2012 4 commits

Add a small cache of locks owned by a resource owner in ResourceOwner. · eeb6f37d

Committed by Heikki Linnakangas on Jun 21, 2012

This speeds up reassigning locks to the parent owner, when the transaction
holds a lot of locks, but only a few of them belong to the current resource
owner. This particularly helps pg_dump when dumping a large number of
objects.

The cache can hold up to 15 locks in each resource owner. After that, the
cache is marked as overflowed, and we fall back to the old method of
scanning the whole local lock table. The tradeoff here is that the cache has
to be scanned whenever a lock is released, so if the cache is too large,
lock release becomes more expensive. 15 seems enough to cover pg_dump, and
doesn't have much impact on lock release.
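
The bounded cache with an overflow flag might look roughly like this. It is an illustrative model only: the class name, the method names, and the dictionary used as the local lock table are hypothetical, and only the limit of 15 comes from the commit message.

```python
MAX_CACHED_LOCKS = 15   # limit stated in the commit message


class ResourceOwnerSketch:
    """Toy resource owner with a small, bounded cache of its locks."""

    def __init__(self):
        self.locks = []          # meaningful only while not overflowed
        self.overflowed = False

    def remember_lock(self, lock):
        if self.overflowed:
            return               # cache no longer tracks individual locks
        if len(self.locks) < MAX_CACHED_LOCKS:
            self.locks.append(lock)
        else:
            self.overflowed = True

    def forget_lock(self, lock):
        if self.overflowed:
            return
        # Linear scan; cheap because the cache never exceeds 15 entries.
        self.locks.remove(lock)

    def locks_to_reassign(self, local_lock_table):
        """Return the locks to hand to the parent owner.

        local_lock_table maps every lock held by the backend to its owner.
        """
        if self.overflowed:
            # Fall back to scanning the whole local lock table.
            return [lock for lock, owner in local_lock_table.items()
                    if owner is self]
        return list(self.locks)
```

The scan in `forget_lock` is the cost the commit message weighs: a larger cache would make every lock release slower.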

Jeff Janes, reviewed by Amit Kapila and Heikki Linnakangas.

eeb6f37d

Remove incomplete/incorrect support for zero-column foreign keys. · dfd9c116

Committed by Tom Lane on Jun 20, 2012

The original coding in ri_triggers.c had partial support for the concept of
zero-column foreign key constraints. But this is not defined in the SQL
standard, nor was it ever allowed by any other part of Postgres, nor was it
very fully implemented even here (eg there was no support for preventing
PK-table deletions that would violate the constraint). Doesn't seem very
useful to carry 100-plus lines of code for a corner case that no one is
interested in making work. Instead, just add a check that the column list
read from pg_constraint is non-empty.

dfd9c116

Increase MAX_SYSCACHE_CALLBACKS from 20 to 32. · 0ce4459a

Committed by Tom Lane on Jun 20, 2012

By my count there are 18 callers of CacheRegisterSyscacheCallback in the
core code in HEAD, so we are potentially leaving as few as 2 slots for any
add-on code to use (though possibly not all these callers would actually
activate in any particular session). That doesn't seem like a lot of
headroom, so let's pump it up a little.

0ce4459a

Cache the results of ri_FetchConstraintInfo in a backend-local cache. · 45ba424f

Committed by Tom Lane on Jun 20, 2012

Extracting data from pg_constraint turned out to take as much as 10% of the
runtime in a bulk-update case where the foreign key column wasn't changing,
because we did it over again for each tuple. Fix that by maintaining a
backend-local cache of the results. This is really a pretty small patch,
but converting the trigger functions to work with pointers rather than
local struct variables requires a lot of mechanical changes.
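
The effect of a backend-local cache can be illustrated with a small memoization sketch. The class, the `fetch` callback standing in for the pg_constraint lookup, and the example OID are hypothetical, and cache invalidation is deliberately left out.

```python
class ConstraintInfoCache:
    """Per-process cache of constraint info, keyed by constraint OID."""

    def __init__(self, fetch):
        self._fetch = fetch      # hypothetical expensive catalog lookup
        self._cache = {}
        self.misses = 0

    def get(self, constraint_oid):
        info = self._cache.get(constraint_oid)
        if info is None:
            self.misses += 1
            info = self._fetch(constraint_oid)
            self._cache[constraint_oid] = info
        return info


lookups = []


def fake_fetch(oid):
    # Stand-in for reading the constraint definition from the catalog.
    lookups.append(oid)
    return {"oid": oid, "columns": ("a", "b")}


cache = ConstraintInfoCache(fake_fetch)
for _ in range(1000):            # one trigger call per updated tuple
    info = cache.get(16384)
```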

45ba424f

20 Jun, 2012 5 commits

Improve tests for whether we can skip queueing RI enforcement triggers. · cfa0f425

Committed by Tom Lane on Jun 19, 2012

During an update of a PK row, we can skip firing the RI trigger if any old
key value is NULL, because then the row could not have had any matching
rows in the FK table. Conversely, during an update of an FK row, the
outcome is determined if any new key value is NULL. In either case it
becomes unnecessary to compare individual key values.
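
A simplified sketch of the two skip tests, with `None` standing in for SQL NULL. It assumes MATCH SIMPLE semantics, reduces the key comparison to a plain tuple inequality, and uses hypothetical function names; it leaves out the other conditions the real trigger code checks.

```python
def has_null(key):
    """True if any column of the key tuple is NULL (None here)."""
    return any(value is None for value in key)


def pk_update_needs_check(old_key, new_key):
    """Should an update of a PK row queue the RI trigger?"""
    if has_null(old_key):
        return False     # no FK row could have matched the old key
    return old_key != new_key


def fk_update_needs_check(old_key, new_key):
    """Should an update of an FK row queue the RI trigger?

    Assumes MATCH SIMPLE, where a NULL in the new key means the
    constraint is satisfied without looking at the PK table.
    """
    if has_null(new_key):
        return False     # outcome already determined
    return old_key != new_key
```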

This patch was inspired by discussion of Vik Reykja's patch to use IS NOT
DISTINCT semantics for the key comparisons. In the event there is no need
for that and so this patch looks nothing like his, but he should still get
credit for having re-opened consideration of the trigger skip logic.

cfa0f425

Add pgbench option to add foreign key constraints to the standard scenario. · afe1c51c

Committed by Tom Lane on Jun 19, 2012

The option --foreign-keys, used at initialization time, will create foreign
key constraints for the columns that represent references to other tables'
primary keys.  This can help in benchmarking FK performance.

Jeff Janes

afe1c51c

pg_dump: Fix verbosity level in LO progress messages · 11b335ac

Committed by Alvaro Herrera on Jun 18, 2012

In passing, reword another instance of the same message that was
gratuitously different.

Author: Josh Kupershmidt
after a bug report by Bosco Rama

11b335ac

Remove confusing half sentence from legal notice · c521665b

Committed by Peter Eisentraut on Jun 20, 2012

pointed out by Stefan Kaltenbrunner

c521665b

Share RI trigger code between NO ACTION and RESTRICT cases. · fe3db740

Committed by Tom Lane on Jun 19, 2012

These triggers are identical except for whether ri_Check_Pk_Match is to be
called, so factor out the common code to save a couple hundred lines.

Also, eliminate null-column checks in ri_Check_Pk_Match, since they
duplicate those in the calling functions and require unnecessary
complication in its API statement.

Simplify the way code is shared between RI_FKey_check_ins and
RI_FKey_check_upd, too.

fe3db740

19 Jun, 2012 6 commits

Improve comments about why SET DEFAULT triggers must recheck for matches. · 48756be9

Committed by Tom Lane on Jun 18, 2012

I was confused about this, so try to make it clearer for the next person.

(This seems like a fairly inefficient way of dealing with a corner case,
but I don't have a better idea offhand.  Maybe if there were a way to turn
off the RI_FKey_keyequal_upd_fk event filter temporarily?)

48756be9

Allow ON UPDATE/DELETE SET DEFAULT plans to be cached. · e8c9fd5f

Committed by Tom Lane on Jun 18, 2012

Once upon a time, somebody was worried that cached RI plans wouldn't get
remade with new default values after ALTER TABLE ... SET DEFAULT, so they
didn't allow caching of plans for ON UPDATE/DELETE SET DEFAULT actions.
That time is long gone, though (and even at the time I doubt this was the
greatest hazard posed by ALTER TABLE...). So allow these triggers to cache
their plans just like the others.

The cache_plan argument to ri_PlanCheck is now vestigial, since there
are no callers that don't pass "true"; but I left it alone in case there
is any future need for it.

e8c9fd5f

Remove derived fields from RI_QueryKey, and do a bit of other cleanup. · 03a5ba24

Committed by Tom Lane on Jun 18, 2012

We really only need the foreign key constraint's OID and the query type
code to uniquely identify each plan we are caching for FK checks. The
other stuff that was in the struct had no business being used as part of
a hash key, and was all just being copied from struct RI_ConstraintInfo
anyway. Get rid of the unnecessary fields, and readjust various function
APIs to make them use RI_ConstraintInfo not RI_QueryKey as info source.

I'd be surprised if this makes any measurable performance difference,
but it certainly feels cleaner.

03a5ba24

pg_dump: Add missing newlines at end of messages · e1e97e93

Committed by Peter Eisentraut on Jun 18, 2012

e1e97e93

Update SQL spec references in ri_triggers code to match SQL:2008. · f9429746

Committed by Tom Lane on Jun 18, 2012

Now that what we're implementing isn't SQL92, we probably shouldn't cite
chapter and verse in that spec anymore.  Also fix some comments that
talked about MATCH FULL but in fact were in code that's also used for
MATCH SIMPLE.

No code changes in this commit, just comments.

f9429746

Change ON UPDATE SET NULL/SET DEFAULT referential actions to meet SQL spec. · c75be2ad

Committed by Tom Lane on Jun 18, 2012

Previously, when executing an ON UPDATE SET NULL or SET DEFAULT action for
a multicolumn MATCH SIMPLE foreign key constraint, we would set only those
referencing columns corresponding to referenced columns that were changed.
This is what the SQL92 standard said to do --- but more recent versions
of the standard say that all referencing columns should be set to null or
their default values, no matter exactly which referenced columns changed.
At least for SET DEFAULT, that is clearly saner behavior. It's somewhat
debatable whether it's an improvement for SET NULL, but it appears that
other RDBMS systems read the spec this way. So let's do it like that.
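
The difference between the two readings of the spec can be shown with a toy model of a two-column MATCH SIMPLE foreign key. Rows are plain dictionaries, `None` stands for NULL, and the function and column names are hypothetical.

```python
def set_null_old(fk_row, fk_cols, pk_cols, old_pk, new_pk):
    """Previous (SQL92) behavior: null only the referencing columns
    whose referenced column actually changed."""
    updated = dict(fk_row)
    for fk_col, pk_col in zip(fk_cols, pk_cols):
        if old_pk[pk_col] != new_pk[pk_col]:
            updated[fk_col] = None
    return updated


def set_null_new(fk_row, fk_cols):
    """Behavior after this change: null every referencing column."""
    updated = dict(fk_row)
    for fk_col in fk_cols:
        updated[fk_col] = None
    return updated


fk_row = {"a": 1, "b": 2, "note": "x"}
old_pk = {"x": 1, "y": 2}
new_pk = {"x": 1, "y": 3}        # only the second key column changes

before = set_null_old(fk_row, ["a", "b"], ["x", "y"], old_pk, new_pk)
after = set_null_new(fk_row, ["a", "b"])
```

Under the old rule the referencing row keeps a non-null value in `a` alongside a NULL in `b`. Under the new rule both referencing columns are nulled.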

This is a release-notable behavioral change, although considering that
our documentation already implied it was done this way, the lack of
complaints suggests few people use such cases.

c75be2ad