Commits · 5c54f63fd66973c32f6e96333d3bee1ba3669563 · Greenplum / Gpdb

24 Jan, 2013 8 commits

Fix rare missing cancellations in Hot Standby. · 5c54f63f

Authored by Simon Riggs on Jan 24, 2013

The machinery around XLOG_HEAP2_CLEANUP_INFO failed
to correctly pass through the necessary information
on latestRemovedXid, avoiding cancellations in some
infrequent concurrent update/cleanup scenarios.

Backpatchable fix to 9.0

Detailed bug report and fix by Noah Misch,
backpatchable version by me.

5c54f63f

pg_upgrade: report failed cluster name · bd6aca8a

Authored by Bruce Momjian on Jan 24, 2013

When pg_upgrade can't find required pg_controldata information, report
_which_ cluster is failing, with this message:

	The %s cluster lacks some required control information:

bd6aca8a

Also fix rotation of csvlog on Windows. · 168d3157
Authored by Heikki Linnakangas on Jan 24, 2013
```
Backpatch to 9.2, like the previous fix.
```
168d3157
Docs shouldn't say HOT Standby. · f64315c6
Authored by Simon Riggs on Jan 24, 2013
```
Not an acronym.

Jeff Janes
```
f64315c6

Fix failure to rotate postmaster log file for size reasons on Windows. · 8556869f

Authored by Tom Lane on Jan 23, 2013

When we eliminated "unnecessary" wakeups of the syslogger process, we
broke size-based logfile rotation on Windows, because on that platform
data transfer is done in a separate thread.  While non-Windows platforms
would recheck the output file size after every log message, Windows only
did so when the control thread woke up for some other reason, which might
be quite infrequent.  Per bug #7814 from Tsunezumi.  Back-patch to 9.2
where the problem was introduced.

Jeff Janes

8556869f

isolationtester: add a few fflush(stderr) calls · ca5db759
Authored by Alvaro Herrera on Jan 23, 2013
```
The lack of them is causing failures in some BF members.

Per Andrew Dunstan.
```
ca5db759
Clarify that connection parameters aren't totally meaningless for PQping. · 40ed59b2
Authored by Robert Haas on Jan 23, 2013
```
Per discussion with Phil Sorber.
```
40ed59b2

pg_isready · ac2e9673

Authored by Robert Haas on Jan 23, 2013

New command-line utility to test whether a server is ready to
accept connections.

Phil Sorber, reviewed by Michael Paquier and Peter Eisentraut

ac2e9673

23 Jan, 2013 10 commits

Improve concurrency of foreign key locking · 0ac5ad51

Authored by Alvaro Herrera on Jan 23, 2013

This patch introduces two additional lock modes for tuples: "SELECT FOR
KEY SHARE" and "SELECT FOR NO KEY UPDATE".  These don't block each
other, in contrast with already existing "SELECT FOR SHARE" and "SELECT
FOR UPDATE".  UPDATE commands that do not modify the values stored in
the columns that are part of the key of the tuple now grab a SELECT FOR
NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently
with tuple locks of the FOR KEY SHARE variety.

Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this
means the concurrency improvement applies to them, which is the whole
point of this patch.

The added tuple lock semantics require some rejiggering of the multixact
module, so that the locking level that each transaction is holding can
be stored alongside its Xid.  Also, multixacts now need to persist
across server restarts and crashes, because they can now represent not
only tuple locks, but also tuple updates.  This means we need more
careful tracking of lifetime of pg_multixact SLRU files; since they now
persist longer, we require more infrastructure to figure out when they
can be removed.  pg_upgrade also needs to be careful to copy
pg_multixact files over from the old server to the new, or at least part
of multixact.c state, depending on the versions of the old and new
servers.

Tuple time qualification rules (HeapTupleSatisfies routines) need to be
careful not to consider tuples with the "is multi" infomask bit set as
being only locked; they might need to look up MultiXact values (i.e.
possibly do pg_multixact I/O) to find out the Xid that updated a tuple,
whereas they previously were assured to only use information readily
available from the tuple header.  This is considered acceptable, because
the extra I/O would involve cases that would previously cause some
commands to block waiting for concurrent transactions to finish.

Another important change is the fact that locking tuples that have
previously been updated causes the future versions to be marked as
locked, too; this is essential for correctness of foreign key checks.
This also causes additional WAL-logging (there was previously a single
WAL record for a locked tuple; now there are as many records as there
are updated copies of the tuple).

With all this in place, contention related to tuples being checked by
foreign key rules should be much reduced.

As a bonus, the old behavior that a subtransaction grabbing a stronger
tuple lock than the parent (sub)transaction held on a given tuple and
later aborting caused the weaker lock to be lost, has been fixed.

Many new spec files were added for isolation tester framework, to ensure
overall behavior is sane.  There's probably room for several more tests.

There were several reviewers of this patch; in particular, Noah Misch
and Andres Freund spent considerable time in it.  Original idea for the
patch came from Simon Riggs, after a problem report by Joel Jacobson.
Most code is from me, with contributions from Marti Raudsepp, Alexander
Shulgin, Noah Misch and Andres Freund.

This patch was discussed in several pgsql-hackers threads; the most
important start at the following message-ids:
	AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com
	1290721684-sup-3951@alvh.no-ip.org
	1294953201-sup-2099@alvh.no-ip.org
	1320343602-sup-2290@alvh.no-ip.org
	1339690386-sup-8927@alvh.no-ip.org
	4FE5FF020200002500048A3D@gw.wicourts.gov
	4FEAB90A0200002500048B7D@gw.wicourts.gov

0ac5ad51
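
The interaction between the four tuple lock modes can be written down as a small conflict table. This is only an illustrative sketch: the commit message itself states just that FOR KEY SHARE and FOR NO KEY UPDATE do not block each other, and the remaining entries follow the row-level lock conflicts described in the PostgreSQL documentation. The names `MODES`, `CONFLICTS`, and `blocks` are hypothetical and do not appear in the PostgreSQL source.

```python
# Illustrative model of row-level lock conflicts; not PostgreSQL source code.
# Modes are listed from weakest to strongest.
MODES = ["FOR KEY SHARE", "FOR SHARE", "FOR NO KEY UPDATE", "FOR UPDATE"]

# CONFLICTS[held] is the set of requested modes that must wait
# while another transaction holds `held` on the same tuple.
CONFLICTS = {
    "FOR KEY SHARE": {"FOR UPDATE"},
    "FOR SHARE": {"FOR NO KEY UPDATE", "FOR UPDATE"},
    "FOR NO KEY UPDATE": {"FOR SHARE", "FOR NO KEY UPDATE", "FOR UPDATE"},
    "FOR UPDATE": set(MODES),
}


def blocks(held, requested):
    """Return True if a request for `requested` must wait behind `held`."""
    return requested in CONFLICTS[held]


if __name__ == "__main__":
    # A foreign key check (FOR KEY SHARE) and a non-key UPDATE
    # (FOR NO KEY UPDATE) can proceed concurrently.
    print(blocks("FOR KEY SHARE", "FOR NO KEY UPDATE"))
    print(blocks("FOR SHARE", "FOR NO KEY UPDATE"))
```

Under this model, the case the patch targets (a foreign key trigger locking the referenced row while another session updates its non-key columns) is the one pair involving an update that no longer conflicts.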

Further documentation tweaks for event triggers. · f925c79b
Authored by Robert Haas on Jan 23, 2013
```
Per discussion between Dimitri Fontaine, myself, and others.
```
f925c79b
Update comments and output for event_trigger regression test. · 601e2935
Authored by Robert Haas on Jan 23, 2013

601e2935
Implement pg_unreachable() on MSVC. · 52906f17
Authored by Heikki Linnakangas on Jan 23, 2013

52906f17
Gitignore vcxproj files. · eaf76484
Authored by Andrew Dunstan on Jan 23, 2013
```
Per request from Craig Ringer.
```
eaf76484

Fix more issues with cascading replication and timeline switches. · 990fe3c4

Authored by Heikki Linnakangas on Jan 23, 2013

When a standby server follows the master using WAL archive, and it chooses
a new timeline (recovery_target_timeline='latest'), it only fetches the
timeline history file for the chosen target timeline, not any other history
files that might be missing from pg_xlog. For example, if the current
timeline is 2, and we choose 4 as the new recovery target timeline, the
history file for timeline 3 is not fetched, even if it's part of this
server's history. That's enough for the standby itself - the history file
for timeline 4 includes timeline 3 as well - but if a cascading standby
server wants to recover to timeline 3, it needs the history file. To fix,
when a new recovery target timeline is chosen, try to copy any missing
history files from the archive to pg_xlog between the old and new target
timeline.

A second similar issue was with the WAL files. When a standby recovers from
archive, and it reaches a segment that contains a switch to a new timeline,
recovery fetches only the WAL file labelled with the new timeline's ID. The
file from the new timeline contains a copy of the WAL from the old timeline
up to the point where the switch happened, and recovery recovers it from the
new file. But in streaming replication, walsender only tries to read it
from the old timeline's file. To fix, change walsender to read it from the
new file, so that it behaves the same as recovery in that sense, and doesn't
try to open the possibly nonexistent file with the old timeline's ID.

990fe3c4
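
The first fix can be sketched as a loop over the timelines between the old and new targets. This is a minimal model, not the server's implementation: it assumes the archive and pg_xlog are plain directories (a real standby fetches archived files through its configured restore command), and the function names are hypothetical. The only detail taken from PostgreSQL itself is that history files are named with the timeline ID as eight hexadecimal digits followed by `.history`.

```python
import os
import shutil


def history_file_name(tli):
    # Timeline history files are named after the timeline ID in 8 hex digits.
    return "%08X.history" % tli


def fetch_missing_history_files(archive_dir, pg_xlog_dir, old_tli, new_tli):
    """Copy history files for timelines (old_tli, new_tli] that pg_xlog lacks.

    Returns the names of the files that were copied. A file that is absent
    from the archive is skipped, since the copy is only attempted.
    """
    copied = []
    for tli in range(old_tli + 1, new_tli + 1):
        name = history_file_name(tli)
        dst = os.path.join(pg_xlog_dir, name)
        if os.path.exists(dst):
            continue  # already present locally, nothing to do
        src = os.path.join(archive_dir, name)
        if os.path.exists(src):
            shutil.copyfile(src, dst)
            copied.append(name)
    return copied
```

With the example from the message (current timeline 2, new target 4), the loop visits timelines 3 and 4, so a cascading standby that wants timeline 3 can find its history file in pg_xlog.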

pg_upgrade: remove --single-transaction usage · 861ad67b

Authored by Bruce Momjian on Jan 22, 2013

With AtEOXact applied, --single-transaction makes pg_restore slower, and
has the potential to require lock table configuration, so remove the
argument.

Per suggestion from Tom.

861ad67b

doc: Fix declared number of columns in table · 21c87a0d
Authored by Peter Eisentraut on Jan 22, 2013
```
This was broken in 841a5150.
```
21c87a0d
Fix a few small bugs in yesterday's event trigger patch. · ddef9a00
Authored by Robert Haas on Jan 22, 2013
```
Dimitri Fontaine
```
ddef9a00
Fix CREATE EVENT TRIGGER syntax synopsis in documentation. · 4c977319
Authored by Robert Haas on Jan 22, 2013
```
Dimitri Fontaine, per a report from Thom Brown
```
4c977319

22 Jan, 2013 3 commits

Typo fixes. · 9917a491
Authored by Robert Haas on Jan 21, 2013
```
Noted by Thom Brown.
```
9917a491

Add infrastructure for storing a VARIADIC ANY function's VARIADIC flag. · 75b39e79

Authored by Tom Lane on Jan 21, 2013

Originally we didn't bother to mark FuncExprs with any indication whether
VARIADIC had been given in the source text, because there didn't seem to be
any need for it at runtime. However, because we cannot fold a VARIADIC ANY
function's arguments into an array (since they're not necessarily all the
same type), we do actually need that information at runtime if VARIADIC ANY
functions are to respond unsurprisingly to use of the VARIADIC keyword.
Add the missing field, and also fix ruleutils.c so that VARIADIC ANY
function calls are dumped properly.

Extracted from a larger patch that also fixes concat() and format() (the
only two extant VARIADIC ANY functions) to behave properly when VARIADIC is
specified. This portion seems appropriate to review and commit separately.

Pavel Stehule

75b39e79

Add ddl_command_end support for event triggers. · 841a5150
Authored by Robert Haas on Jan 21, 2013
```
Dimitri Fontaine, with slight changes by me
```
841a5150

21 Jan, 2013 5 commits

Refactor ALTER some-obj RENAME implementation · 765cbfdc

Authored by Alvaro Herrera on Jan 21, 2013

Remove duplicate implementations of catalog munging and miscellaneous
privilege checks.  Instead rely on already existing data in
objectaddress.c to do the work.

Author: KaiGai Kohei, changes by me
Reviewed by: Robert Haas, Álvaro Herrera, Dimitri Fontaine

765cbfdc

Fix one-byte buffer overrun in PQprintTuples(). · 8f0d8f48

Authored by Tom Lane on Jan 20, 2013

This bug goes back to the original Postgres95 sources.  Its significance
to modern PG versions is marginal, since we have not used PQprintTuples()
internally in a very long time, and it doesn't seem to have ever been
documented either.  Still, it *is* exposed to client apps, so somebody
out there might possibly be using it.

Xi Wang

8f0d8f48

Fix error-checking typo in check_TSCurrentConfig(). · 535e69a4
Authored by Tom Lane on Jan 20, 2013
```
The code failed to detect an out-of-memory failure.

Xi Wang
```
535e69a4

doc: Fix syntax of a URL · 693eb9df

Authored by Peter Eisentraut on Jan 20, 2013

Leading white space before the "http:" is apparently treated as a
relative link at least by some browsers.

693eb9df

Fix an O(N^2) performance issue for sessions modifying many relations. · d5b31cc3

Authored by Tom Lane on Jan 20, 2013

AtEOXact_RelationCache() scanned the entire relation cache at the end of
any transaction that created a new relation or assigned a new relfilenode.
Thus, clients such as pg_restore had an O(N^2) performance problem that
would start to be noticeable after creating 10000 or so tables. Since
typically only a small number of relcache entries need any cleanup, we
can fix this by keeping a small list of their OIDs and doing hash_searches
for them. We fall back to the full-table scan if the list overflows.

Ideally, the maximum list length would be set at the point where N
hash_searches would cost just less than the full-table scan. Some quick
experimentation says that point might be around 50-100; I (tgl)
conservatively set MAX_EOXACT_LIST = 32. For the case that we're worried
about here, which is short single-statement transactions, it's unlikely
there would ever be more than about a dozen list entries anyway; so it's
probably not worth being too tense about the value.

We could avoid the hash_searches by instead keeping the target relcache
entries linked into a list, but that would be noticeably more complicated
and bug-prone because of the need to maintain such a list in the face of
relcache entry drops. Since a relcache entry can only need such cleanup
after a somewhat-heavyweight filesystem operation, trying to save a
hash_search per cleanup doesn't seem very useful anyway --- it's the scan
over all the not-needing-cleanup entries that we wish to avoid here.

Jeff Janes, reviewed and tweaked a bit by Tom Lane

d5b31cc3
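
The technique described above, a small bounded list of entries needing cleanup with a fallback to a full scan on overflow, can be modelled in a few lines. This is a sketch under stated assumptions rather than the relcache code: a Python dict stands in for the relcache hash table, the class and method names are hypothetical, and only the constant `MAX_EOXACT_LIST = 32` comes from the commit message.

```python
MAX_EOXACT_LIST = 32  # value named in the commit message


class RelCacheModel:
    """Toy stand-in for a relation cache keyed by OID."""

    def __init__(self):
        self.entries = {}        # oid -> True if the entry needs cleanup
        self.eoxact_list = []    # OIDs known to need end-of-transaction work
        self.overflowed = False  # set once the list cannot hold more OIDs

    def add(self, oid):
        self.entries[oid] = False

    def mark_for_cleanup(self, oid):
        self.entries[oid] = True
        if len(self.eoxact_list) < MAX_EOXACT_LIST:
            self.eoxact_list.append(oid)
        else:
            self.overflowed = True

    def at_end_of_xact(self):
        """Clean up flagged entries; return how many entries were visited."""
        if self.overflowed:
            candidates = list(self.entries)  # fall back to the full scan
        else:
            candidates = self.eoxact_list    # one lookup per listed OID
        visited = 0
        for oid in candidates:
            visited += 1
            if self.entries.get(oid):        # entry may have been dropped
                self.entries[oid] = False
        self.eoxact_list = []
        self.overflowed = False
        return visited
```

In this model a transaction that touches three relations out of 10000 visits three entries at commit instead of 10000, and the full scan is only paid when more than `MAX_EOXACT_LIST` OIDs are recorded.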

20 Jan, 2013 4 commits

Clarify that streaming replication can be both async and sync · 0a2da528
Authored by Magnus Hagander on Jan 20, 2013
```
Josh Kupershmidt
```
0a2da528

Use SET TRANSACTION READ ONLY in pg_dump, if server supports it. · 26d905a1

Authored by Tom Lane on Jan 19, 2013

This currently does little except serve as documentation. (The one case
where it has a performance benefit, SERIALIZABLE mode in 9.1 and up, was
already using READ ONLY mode.) However, it's possible that it might have
performance benefits in future, and in any case it seems like good
practice since it would catch any accidentally non-read-only operations.

Pavan Deolasee

26d905a1

Modernize string literal syntax in tutorial example. · 4b94cfb5

Authored by Tom Lane on Jan 19, 2013

Un-double the backslashes in the LIKE patterns, since
standard_conforming_strings is now the default.  Just to be sure, include
a command to set standard_conforming_strings to ON in the example.

Back-patch to 9.1, where standard_conforming_strings became the default.

Josh Kupershmidt, reviewed by Jeff Janes

4b94cfb5

Make pgxs build executables with the right suffix. · 9f10f7dc

Authored by Andrew Dunstan on Jan 19, 2013

Complaint and patch from Zoltán Böszörményi.

When cross-compiling, the native make doesn't know
about the Windows .exe suffix, so it only builds with
it when explicitly told to do so.

The native make will not see the link between the target
name and the built executable, and might thus do unnecessary
work, but that's a bigger problem than this one, if in fact
we consider it a problem at all.

Back-patch to all live branches.

9f10f7dc

19 Jan, 2013 4 commits

libpq doc: Clarify what commands return PGRES_TUPLES_OK · fb197290

Authored by Peter Eisentraut on Jan 18, 2013

The old text claimed that INSERT and UPDATE always return
PGRES_COMMAND_OK, but INSERT/UPDATE with RETURNING return
PGRES_TUPLES_OK.

Josh Kupershmidt

fb197290

Protect against SnapshotNow race conditions in pg_tablespace scans. · c2a14bc7

Authored by Tom Lane on Jan 18, 2013

Use of SnapshotNow is known to expose us to race conditions if the tuple(s)
being sought could be updated by concurrently-committing transactions.
CREATE DATABASE and DROP DATABASE are particularly exposed because they do
heavyweight filesystem operations during their scans of pg_tablespace,
so that the scans run for a very long time compared to most.  Furthermore,
the potential consequences of a missed or twice-visited row are nastier
than average:

* createdb() could fail with a bogus "file already exists" error, or
  silently fail to copy one or more tablespace's worth of files into the
  new database.

* remove_dbtablespaces() could miss one or more tablespaces, thus failing
  to free filesystem space for the dropped database.

* check_db_file_conflict() could likewise miss a tablespace, leading to an
  OID conflict that could result in data loss either immediately or in
  future operations.  (This seems of very low probability, though, since a
  duplicate database OID would be unlikely to start with.)

Hence, it seems worth fixing these three places to use MVCC snapshots, even
though this will someday be superseded by a generic solution to SnapshotNow
race conditions.

Back-patch to all active branches.

Stephen Frost and Tom Lane

c2a14bc7

Rename new latex longtable function name, for consistency · 530bbfac
Authored by Bruce Momjian on Jan 18, 2013

530bbfac

Unbreak lock conflict detection for Hot Standby. · d8c38966

Authored by Robert Haas on Jan 18, 2013

This got broken in the original fast-path locking patch, because
I failed to account for the fact that Hot Standby startup process
might take a strong relation lock on a relation in a database to
which it is not bound, and confused MyDatabaseId with the database
ID of the relation being locked.

Report and diagnosis by Andres Freund.  Final form of patch by me.

d8c38966

18 Jan, 2013 6 commits

Improve pg_upgrade error report · 600250d0

Authored by Bruce Momjian on Jan 18, 2013

If the cluster alignments don't match, output this suggestion:

	Likely one cluster is a 32-bit install, the other 64-bit

600250d0

Fix off-by-one bug in xlog reading logic · 8c17144c
Authored by Alvaro Herrera on Jan 18, 2013
```
Bug reported by Michael Paquier

Author: Andres Freund
```
8c17144c

psql latex fixes · 74a82baf

Authored by Bruce Momjian on Jan 18, 2013

Remove extra line at bottom of table for new 'latex' mode border=3.
Also update 'latex'-longtable 'tableattr' docs to say
'whitespace-separated' instead of 'space'.

74a82baf

Now that START_REPLICATION returns the next timeline's ID after reaching end · 6f7cddc7

Authored by Heikki Linnakangas on Jan 18, 2013

of timeline, take advantage of that in walreceiver.

Startup process is still in control of choosing the target timeline, by
scanning the timeline history files present in pg_xlog, but walreceiver now
uses the next timeline's ID to fetch its history file immediately after it
has finished streaming the old timeline. Before, the standby would first try
to restart streaming on the old timeline, which fetches the missing timeline
history file as a side-effect, and only then restart from the new timeline.
This patch eliminates the extra iteration, which speeds up the timeline
switch and reduces the noise in the log caused by the extra restart on the
old timeline.

6f7cddc7

Use the right timeline when beginning to stream from master. · 2ff65553

Authored by Heikki Linnakangas on Jan 18, 2013

The xlogreader refactoring broke the logic to decide which timeline to start
streaming from. XLogPageRead() uses the timeline history to check which
timeline the requested WAL position falls into. However, after the
refactoring, XLogPageRead() is always first called with the first page in
the segment, to verify the segment header, and only then with the actual WAL
position we're interested in. That first read of the segment's header made
XLogPageRead() always start streaming from the old timeline containing
the segment header, not the timeline containing the actual record, if there
was a timeline switch within the segment.

I thought I fixed this yesterday, but that fix was too narrow and only fixed
this for the corner-case that the timeline switch happened in the first page
of the segment. To fix this more robustly, pass explicitly the position of
the record we're actually interested in to XLogPageRead, and use that to
decide which timeline to read from, rather than deduce it from the page and
offset.

Per report from Fujii Masao.

2ff65553
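
The difference between choosing the timeline from the segment header and choosing it from the record position can be shown with a toy lookup. This is a hedged sketch, not the XLogPageRead() code: the history is modelled as a list of `(timeline, end_position)` pairs with `None` marking the still-open timeline, a position exactly at a switch point is assumed to belong to the newer timeline, and all names and the sample positions are hypothetical. The 16 MB segment size is PostgreSQL's default.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # default WAL segment size


def tli_of_point(history, lsn):
    """Return the timeline that contains WAL position `lsn`.

    `history` is ordered oldest first; each entry is (tli, end), where
    `end` is the position at which that timeline was left, or None for
    the newest timeline.
    """
    for tli, end in history:
        if end is None or lsn < end:
            return tli
    raise ValueError("history has no open-ended timeline")


def segment_start(lsn):
    # Position of the first page of the segment containing `lsn`.
    return lsn - (lsn % WAL_SEGMENT_SIZE)


if __name__ == "__main__":
    # Timeline 1 ended at 0x3000100, partway through a segment.
    history = [(1, 0x3000100), (2, None)]
    record_lsn = 0x3000200
    # Deciding from the segment header picks the old timeline ...
    print(tli_of_point(history, segment_start(record_lsn)))
    # ... while deciding from the record position picks the new one.
    print(tli_of_point(history, record_lsn))
```

Passing the record position explicitly, as the commit does, corresponds to the second call: the header read can no longer pull the decision back to the old timeline.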

When xlogreader asks the callback function to read a page, make sure we · 88228e6f

Authored by Heikki Linnakangas on Jan 17, 2013

get a large enough part of the page to include the beginning of the next
record we're interested in. The XLogPageRead callback uses the requested
length to decide which timeline to stream WAL from, and if the first call
is short, and the page contains a timeline switch, we'll repeatedly try
to stream that page from the old timeline, and never get across the
timeline switch.

88228e6f