1. 01 Dec 2012, 1 commit
    • Add missing buffer lock acquisition in GetTupleForTrigger(). · da63fec7
      Committed by Tom Lane
      If we had not been holding buffer pin continuously since the tuple was
      initially fetched by the UPDATE or DELETE query, it would be possible for
      VACUUM or a page-prune operation to move the tuple while we're trying to
      copy it.  This would result in a garbage "old" tuple value being passed to
      an AFTER ROW UPDATE or AFTER ROW DELETE trigger.  The preconditions for
      this are somewhat improbable, and the timing constraints are very tight;
      so it's not so surprising that this hasn't been reported from the field,
      even though the bug has been there a long time.
      
      Problem found by Andres Freund.  Back-patch to all active branches.
  2. 30 Nov 2012, 7 commits
    • Clean environment for pg_upgrade test. · abece8af
      Committed by Andrew Dunstan
      This removes existing PG settings from the environment for
      pg_upgrade tests, just like pg_regress does.
    • Add libpq function PQconninfo() · 65c3bf19
      Committed by Magnus Hagander
      This allows a caller to get back the exact conninfo array that was
      used to create a connection, including parameters read from the
      environment.
      
      In doing this, restructure how options are copied from the conninfo
      to the actual connection.
      
      Zoltan Boszormenyi and Magnus Hagander
    • Produce a more useful error message for over-length Unix socket paths. · 4af446e7
      Committed by Tom Lane
      The length of a socket path name is constrained by the size of struct
      sockaddr_un, and there's not a lot we can do about it since that is a
      kernel API.  However, it would be a good thing if we produced an
      intelligible error message when the user specifies a socket path that's too
      long --- and getaddrinfo's standard API is too impoverished to do this in
      the natural way.  So insert explicit tests at the places where we construct
      a socket path name.  Now you'll get an error that makes sense and even
      tells you what the limit is, rather than something generic like
      "Non-recoverable failure in name resolution".
      
      Per trouble report from Jeremy Drake and a fix idea from Andrew Dunstan.
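As a rough illustration of the kind of explicit test this commit describes (the helper name and shape are hypothetical, not the actual PostgreSQL code), a path can be checked against the size of sockaddr_un's sun_path field before it is ever handed to the socket layer:

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: does this Unix socket path fit in sun_path,
 * counting the terminating NUL?  sizeof(un.sun_path) is typically
 * around 104-108 bytes, a limit imposed by the kernel API. */
int socket_path_fits(const char *path)
{
    struct sockaddr_un un;

    return strlen(path) < sizeof(un.sun_path);
}
```

With a check like this at the point where the path is constructed, the error can report the actual limit instead of a generic resolver failure.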
    • Correctly init fast path fields on PGPROC · d3fe5993
      Committed by Simon Riggs
    • Cleanup VirtualXact at end of Hot Standby. · f1e57a4e
      Committed by Simon Riggs
    • Basic binary heap implementation. · 7a2fe9bd
      Committed by Robert Haas
      There are probably other places where this can be used, but for now,
      this just makes MergeAppend use it, so that this code will have test
      coverage.  There is other work in the queue that will use this, as
      well.
      
      Abhijit Menon-Sen, reviewed by Andres Freund, Robert Haas, Álvaro
      Herrera, Tom Lane, and others.
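For readers unfamiliar with the data structure, a minimal array-backed min-heap can be sketched in a few lines of C; this is only an illustration of the concept, not the API of the committed binaryheap module:

```c
#include <stddef.h>

/* Restore the min-heap property at index i, assuming both subtrees
 * below i are already valid heaps ("sift down"). */
void heap_sift_down(int *a, size_t n, size_t i)
{
    for (;;)
    {
        size_t l = 2 * i + 1, r = l + 1, smallest = i;

        if (l < n && a[l] < a[smallest]) smallest = l;
        if (r < n && a[r] < a[smallest]) smallest = r;
        if (smallest == i)
            break;
        int tmp = a[i]; a[i] = a[smallest]; a[smallest] = tmp;
        i = smallest;
    }
}

/* Turn an arbitrary array into a heap in O(n), bottom-up. */
void heap_build(int *a, size_t n)
{
    for (size_t i = n / 2; i-- > 0;)
        heap_sift_down(a, n, i);
}
```

MergeAppend benefits from exactly this shape of structure: the smallest of k sorted inputs is always at the root, and replacing it costs O(log k).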
    • When processing nested structure pointer variables ecpg always expected an array datatype, which of course is wrong. · 086cf145
      Committed by Michael Meskes
      
      Applied patch by Muhammad Usama <m.usama@gmail.com> to fix this.
  3. 29 Nov 2012, 3 commits
    • Suppress parallel build in interfaces/ecpg/preproc/. · 1fc698cf
      Committed by Tom Lane
      This is to see if it will stop intermittent build failures on buildfarm
      member okapi.  We know that gmake 3.82 has some problems with sometimes
      not honoring dependencies in parallel builds, and it seems likely that
      this is more of the same.  Since the vast bulk of the work in the preproc
      directory is associated with creating preproc.c and then preproc.o,
      parallelism buys us hardly anything here anyway.
      
      Also, make both this .NOTPARALLEL and the one previously added in
      interfaces/ecpg/Makefile be conditional on "ifeq ($(MAKE_VERSION),3.82)".
      The known bug in gmake is fixed upstream and should not be present in
      3.83 and up, and there's no reason to think it affects older releases.
    • Fix assorted bugs in CREATE/DROP INDEX CONCURRENTLY. · 3c840464
      Committed by Tom Lane
      Commit 8cb53654, which introduced DROP
      INDEX CONCURRENTLY, managed to break CREATE INDEX CONCURRENTLY via a poor
      choice of catalog state representation.  The pg_index state for an index
      that's reached the final pre-drop stage was the same as the state for an
      index just created by CREATE INDEX CONCURRENTLY.  This meant that the
      (necessary) change to make RelationGetIndexList ignore about-to-die indexes
      also made it ignore freshly-created indexes; which is catastrophic because
      the latter do need to be considered in HOT-safety decisions.  Failure to
      do so leads to incorrect index entries and subsequently wrong results from
      queries depending on the concurrently-created index.
      
      To fix, add an additional boolean column "indislive" to pg_index, so that
      the freshly-created and about-to-die states can be distinguished.  (This
      change obviously is only possible in HEAD.  This patch will need to be
      back-patched, but in 9.2 we'll use a kluge consisting of overloading the
      formerly-impossible state of indisvalid = true and indisready = false.)
      
      In addition, change CREATE/DROP INDEX CONCURRENTLY so that the pg_index
      flag changes they make without exclusive lock on the index are made via
      heap_inplace_update() rather than a normal transactional update.  The
      latter is not very safe because moving the pg_index tuple could result in
      concurrent SnapshotNow scans finding it twice or not at all, thus possibly
      resulting in index corruption.  This is a pre-existing bug in CREATE INDEX
      CONCURRENTLY, which was copied into the DROP code.
      
      In addition, fix various places in the code that ought to check to make
      sure that the indexes they are manipulating are valid and/or ready as
      appropriate.  These represent bugs that have existed since 8.2, since
      a failed CREATE INDEX CONCURRENTLY could leave a corrupt or invalid
      index behind, and we ought not try to do anything that might fail with
      such an index.
      
      Also fix RelationReloadIndexInfo to ensure it copies all the pg_index
      columns that are allowed to change after initial creation.  Previously we
      could have been left with stale values of some fields in an index relcache
      entry.  It's not clear whether this actually had any user-visible
      consequences, but it's at least a bug waiting to happen.
      
      In addition, do some code and docs review for DROP INDEX CONCURRENTLY;
      some cosmetic code cleanup but mostly addition and revision of comments.
      
      This will need to be back-patched, but in a noticeably different form,
      so I'm committing it to HEAD before working on the back-patch.
      
      Problem reported by Amit Kapila, diagnosis by Pavan Deolassee,
      fix by Tom Lane and Andres Freund.
    • Split out rmgr rm_desc functions into their own files · 1577b46b
      Committed by Alvaro Herrera
      This is necessary (but not sufficient) to have them compilable outside
      of a backend environment.
  4. 28 Nov 2012, 1 commit
  5. 27 Nov 2012, 4 commits
    • Add explicit casts in ilist.h's inline functions. · e78d288c
      Committed by Tom Lane
      Needed to silence C++ errors, per report from Peter Eisentraut.
      
      Andres Freund
    • Add OpenTransientFile, with automatic cleanup at end-of-xact. · 1f67078e
      Committed by Heikki Linnakangas
      Files opened with BasicOpenFile or PathNameOpenFile are not automatically
      cleaned up on error. That puts unnecessary burden on callers that only want
      to keep the file open for a short time. There is AllocateFile, but that
      returns a buffered FILE * stream, which in many cases is not the nicest API
      to work with. So add a function called OpenTransientFile, which returns an
      unbuffered fd that's cleaned up like the FILE* returned by AllocateFile().
      
      This plugs a few rare fd leaks in error cases:
      
      1. copy_file() - fixed by using OpenTransientFile instead of BasicOpenFile
      2. XLogFileInit() - fixed by adding close() calls to the error cases. Can't
         use OpenTransientFile here because the fd is supposed to persist over
         transaction boundaries.
      3. lo_import/lo_export - fixed by using OpenTransientFile instead of
         PathNameOpenFile.
      
      In addition to plugging those leaks, this replaces many BasicOpenFile() calls
      with OpenTransientFile() that were not leaking, because the code meticulously
      closed the file on error. That wasn't strictly necessary, but IMHO it's good
      for robustness.
      
      The same leaks exist in older versions, but given the rarity of the issues,
      I'm not backpatching this. Not yet, anyway - it might be good to backpatch
      later, after this mechanism has had some more testing in master branch.
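The mechanism can be sketched as follows (the names and the fixed-size table are illustrative simplifications, not the backend's implementation): a wrapper remembers each fd it opens, and an end-of-transaction hook closes whatever is still open, so error paths cannot leak descriptors:

```c
#include <fcntl.h>
#include <unistd.h>

/* Illustrative sketch only: track fds opened via the wrapper and close
 * any that remain open when the "transaction" ends. */
#define MAX_TRACKED 64

static int tracked_fds[MAX_TRACKED];
static int n_tracked = 0;

int open_transient(const char *path, int flags)
{
    int fd = open(path, flags, 0600);

    if (fd >= 0 && n_tracked < MAX_TRACKED)
        tracked_fds[n_tracked++] = fd;
    return fd;
}

/* Normal close: also forget the fd so cleanup won't double-close it. */
void close_transient(int fd)
{
    close(fd);
    for (int i = 0; i < n_tracked; i++)
        if (tracked_fds[i] == fd)
        {
            tracked_fds[i] = tracked_fds[--n_tracked];
            break;
        }
}

/* Called at end of transaction, including on error. */
void at_xact_end_cleanup(void)
{
    while (n_tracked > 0)
        close(tracked_fds[--n_tracked]);
}
```

This is why XLogFileInit() could not use the mechanism: its fd must survive transaction boundaries, so the cleanup hook would close it too early.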
    • Revert patch for taking fewer snapshots. · 53299429
      Committed by Tom Lane
      This reverts commit d573e239, "Take fewer
      snapshots".  While that seemed like a good idea at the time, it caused
      execution to use a snapshot that had been acquired before locking any of
      the tables mentioned in the query.  This created user-visible anomalies
      that were not present in any prior release of Postgres, as reported by
      Tomas Vondra.  While this whole area could do with a redesign (since there
      are related cases that have anomalies anyway), it doesn't seem likely that
      any future patch would be reasonably back-patchable; and we don't want 9.2
      to exhibit a behavior that's subtly unlike either past or future releases.
      Hence, revert to prior code while we rethink the problem.
    • Fix SELECT DISTINCT with index-optimized MIN/MAX on inheritance trees. · d3237e04
      Committed by Tom Lane
      In a query such as "SELECT DISTINCT min(x) FROM tab", the DISTINCT is
      pretty useless (there being only one output row), but nonetheless it
      shouldn't fail.  But it could fail if "tab" is an inheritance parent,
      because planagg.c's code for fixing up equivalence classes after making the
      index-optimized MIN/MAX transformation wasn't prepared to find child-table
      versions of the aggregate expression.  The least ugly fix seems to be
      to add an option to mutate_eclass_expressions() to skip child-table
      equivalence class members, which aren't used anymore at this stage of
      planning so it's not really necessary to fix them.  Since child members
      are ignored in many cases already, it seems plausible for
      mutate_eclass_expressions() to have an option to ignore them too.
      
      Per bug #7703 from Maxim Boguk.
      
      Back-patch to 9.1.  Although the same code exists before that, it cannot
      encounter child-table aggregates AFAICS, because the index optimization
      transformation cannot succeed on inheritance trees before 9.1 (for lack
      of MergeAppend).
  6. 25 Nov 2012, 2 commits
  7. 24 Nov 2012, 1 commit
  8. 23 Nov 2012, 2 commits
  9. 22 Nov 2012, 2 commits
    • Avoid bogus "out-of-sequence timeline ID" errors in standby-mode. · 24c19e6b
      Committed by Heikki Linnakangas
      When the startup process opens a WAL segment after replaying part of it, it
      validates the first page on the WAL segment, even though the page it's
      really interested in is later in the file. As part of the validation, it checks
      that the TLI on the page header is >= the TLI it saw on the last page it
      read. If the segment contains a timeline switch, and we have already
      replayed it, and then re-open the WAL segment (because streaming
      replication got disconnected and reconnected, for example), the TLI check
      will fail when the first page is validated. Fix that by relaxing the TLI
      check when re-opening a WAL segment.
      
      Backpatch to 9.0. Earlier versions had the same code, but before standby
      mode was introduced in 9.0, recovery never tried to re-read a segment after
      partially replaying it.
      
      Reported by Amit Kapila, while testing a new feature.
    • Don't launch new child processes after we've been told to shut down. · 27b2c6a1
      Committed by Tom Lane
      Once we've received a shutdown signal (SIGINT or SIGTERM), we should not
      launch any more child processes, even if we get signals requesting such.
      The normal code path for spawning backends has always understood that,
      but the postmaster's infrastructure for hot standby and autovacuum didn't
      get the memo.  As reported by Hari Babu in bug #7643, this could lead to
      failure to shut down at all in some cases, such as when SIGINT is received
      just before the startup process sends PMSIGNAL_RECOVERY_STARTED: we'd
      launch a bgwriter and checkpointer, and then those processes would have no
      idea that they ought to quit.  Similarly, launching a new autovacuum worker
      would result in waiting till it finished before shutting down.
      
      Also, switch the order of the code blocks in reaper() that detect startup
      process crash versus shutdown termination.  Once we've sent it a signal,
      we should not consider that exit(1) is surprising.  This is just a cosmetic
      fix since shutdown occurs correctly anyway, but better not to log a phony
      complaint about startup process crash.
      
      Back-patch to 9.0.  Some parts of this might be applicable before that,
      but given the lack of prior complaints I'm not going to worry too much
      about older branches.
  10. 21 Nov 2012, 1 commit
    • Speed up operations on numeric, mostly by avoiding palloc() overhead. · 5cb0e335
      Committed by Heikki Linnakangas
      In many functions, a NumericVar was initialized from an input Numeric, to be
      passed as input to a calculation function. When the NumericVar is not
      modified, the digits array of the NumericVar can point directly to the digits
      array in the original Numeric, and we can avoid a palloc() and memcpy(). Add
      init_var_from_num() function to initialize a var like that.
      
      Remove dscale argument from get_str_from_var(), as all the callers just
      passed the dscale of the variable. That means that the rounding it used to
      do was not actually necessary, and get_str_from_var() no longer scribbles on
      its input. That makes it safer in general, and allows us to use the new
      init_var_from_num() function in e.g. numeric_out().
      
      Also modified numericvar_to_int8() to not scribble on its input either. It
      creates a temporary copy to avoid that. To compensate, the callers no longer
      need to create a temporary copy, so the net # of pallocs is the same, but this
      is nicer.
      
      In passing, use a constant for the number 10 in get_str_from_var_sci(),
      when calculating 10^exponent. Saves a palloc() and some cycles to convert
      integer 10 to numeric.
      
      Original patch by Kyotaro HORIGUCHI, with further changes by me. Reviewed
      by Pavel Stehule.
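The copy-avoidance idea is simple pointer aliasing; a toy version with simplified stand-in types (not the real NumericVar/Numeric definitions) might look like:

```c
/* Toy stand-ins for the real types: a read-only "var" may alias the
 * digit array of its source instead of allocating and copying it. */
typedef struct
{
    int          ndigits;
    const short *digits;    /* points into the source when read-only */
} NumVar;

/* Sketch of the init_var_from_num() idea: no allocation, no memcpy. */
void init_var_from_num(NumVar *var, const short *digits, int ndigits)
{
    var->ndigits = ndigits;
    var->digits = digits;   /* alias the source's digit array */
}
```

The safety condition is the one the commit message states: this only works as long as nothing downstream modifies the var, which is also why get_str_from_var() had to stop scribbling on its input.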
  11. 20 Nov 2012, 2 commits
    • In pg_upgrade, report errno string if file existence check returns an error and errno != ENOENT. · b55743a5
      Committed by Bruce Momjian
    • Improve handling of INT_MIN / -1 and related cases. · 1f7cb5c3
      Committed by Tom Lane
      Some platforms throw an exception for this division, rather than returning
      a necessarily-overflowed result.  Since we were testing for overflow after
      the fact, an exception isn't nice.  We can avoid the problem by treating
      division by -1 as negation.
      
      Add some regression tests so that we'll find out if any compilers try to
      optimize away the overflow check conditions.
      
      This ought to be back-patched, but I'm going to see what the buildfarm
      reports about the regression tests first.
      
      Per discussion with Xi Wang, though this is different from the patch he
      submitted.
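The fix described above can be sketched in a few lines of C (a standalone illustration, not the actual int4div code): division by -1 becomes negation, and the one value whose negation is not representable, INT_MIN, is reported as overflow instead of ever reaching the hardware divide:

```c
#include <stdint.h>

/* Sketch of the idea: never execute INT32_MIN / -1 on the hardware
 * (x86 raises SIGFPE for it).  Division by -1 is treated as negation,
 * and the overflow case is flagged instead of computed. */
int32_t int4_div_checked(int32_t a, int32_t b, int *overflow)
{
    *overflow = 0;
    if (b == -1)
    {
        if (a == INT32_MIN)
        {
            *overflow = 1;      /* -INT32_MIN is not representable */
            return 0;
        }
        return -a;
    }
    return a / b;               /* b == 0 is left out of this sketch */
}
```

Testing after the fact, as the old code did, is exactly what the regression tests mentioned above guard: a compiler must not optimize the check away on the assumption that overflow cannot happen.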
  12. 19 Nov 2012, 4 commits
    • Fix archive_cleanup_command. · 644a0a63
      Committed by Heikki Linnakangas
      When I moved ExecuteRecoveryCommand() from xlog.c to xlogarchive.c, I didn't
      realize that it's called from the checkpoint process, not the startup
      process. I tried to use the InRedo variable to decide whether or not to attempt
      cleaning up the archive (must not do so before we have read the initial
      checkpoint record), but that variable is only valid within the startup
      process.
      
      Instead, let ExecuteRecoveryCommand() always clean up the archive, and add
      an explicit argument to RestoreArchivedFile() to say whether that's allowed
      or not. The caller knows better.
      
      Reported by Erik Rijkers, diagnosis by Fujii Masao. Only 9.3devel is
      affected.
    • Limit values of archive_timeout, post_auth_delay, auth_delay.milliseconds. · b6e3798f
      Committed by Tom Lane
      The previous definitions of these GUC variables allowed them to range
      up to INT_MAX, but in point of fact the underlying code would suffer
      overflows or other errors with large values.  Reduce the maximum values
      to something that won't misbehave.  There's no apparent value in working
      harder than this, since very large delays aren't sensible for any of
      these.  (Note: the risk with archive_timeout is that if we're late
      checking the state, the timestamp difference it's being compared to
      might overflow.  So we need some amount of slop; the choice of INT_MAX/2
      is arbitrary.)
      
      Per followup investigation of bug #7670.  Although this isn't a very
      significant fix, might as well back-patch.
    • Fix syslogger to not fail when log_rotation_age exceeds 2^31 milliseconds. · d038966d
      Committed by Tom Lane
      We need to avoid calling WaitLatch with timeouts exceeding INT_MAX.
      Fortunately a simple clamp will do the trick, since no harm is done if
      the wait times out before it's really time to rotate the log file.
      Per bug #7670 (probably bug #7545 is the same thing, too).
      
      In passing, fix bogus definition of log_rotation_age's maximum value in
      guc.c --- it was numerically right, but only because MINS_PER_HOUR and
      SECS_PER_MINUTE have the same value.
      
      Back-patch to 9.2.  Before that, syslogger wasn't using WaitLatch.
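The clamp itself is a one-liner; a hedged sketch follows (the function name here is made up, and the real call site lives in the syslogger code):

```c
#include <limits.h>

/* Illustrative clamp: WaitLatch takes its timeout in milliseconds as a
 * long, but values above INT_MAX misbehave on some platforms.  Capping
 * is harmless because a too-early timeout just means we re-check the
 * rotation condition and wait again. */
long clamp_latch_timeout(long timeout_ms)
{
    return (timeout_ms > (long) INT_MAX) ? (long) INT_MAX : timeout_ms;
}
```

This pairs with the assertion commit below: the caller is responsible for never handing WaitLatch more than INT_MAX milliseconds.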
    • Assert that WaitLatch's timeout is not more than INT_MAX milliseconds. · 14ddff44
      Committed by Tom Lane
      The behavior with larger values is unspecified by the Single Unix Spec.
      It appears that BSD-derived kernels report EINVAL, although Linux does not.
      If waiting for longer intervals is desired, the calling code has to do
      something to limit the delay; we can't portably fix it here since "long"
      may not be any wider than "int" in the first place.
      
      Part of response to bug #7670, though this change doesn't fix that
      (in fact, it converts the problem from an ERROR into an Assert failure).
      No back-patch since it's just an assertion addition.
  13. 18 Nov 2012, 1 commit
  14. 16 Nov 2012, 2 commits
    • 67c03c6f
    • Improve check_partial_indexes() to consider join clauses in proof attempts. · 1746ba92
      Committed by Tom Lane
      Traditionally check_partial_indexes() has only looked at restriction
      clauses while trying to prove partial indexes usable in queries.  However,
      join clauses can also be used in some cases; mainly, that a strict operator
      on "x" proves an "x IS NOT NULL" index predicate, even if the operator is
      in a join clause rather than a restriction clause.  Adding this code fixes
      a regression in 9.2, because previously we would take join clauses into
      account when considering whether a partial index could be used in a
      nestloop inner indexscan path.  9.2 doesn't handle nestloop inner
      indexscans in the same way, and this consideration was overlooked in the
      rewrite.  Moving the work to check_partial_indexes() is a better solution
      anyway, since the proof applies whether or not we actually use the index
      in that particular way, and we don't have to do it over again for each
      possible outer relation.  Per report from Dave Cramer.
  15. 15 Nov 2012, 4 commits
  16. 14 Nov 2012, 3 commits
    • dec10ba4
    • 3bdfd9cb
    • Fix memory leaks in record_out() and record_send(). · 273986bf
      Committed by Tom Lane
      record_out() leaks memory: it fails to free the strings returned by the
      per-column output functions, and also is careless about detoasted values.
      This results in a query-lifespan memory leakage when returning composite
      values to the client, because printtup() runs the output functions in the
      query-lifespan memory context.  Fix it to handle these issues the same way
      printtup() does.  Also fix a similar leakage in record_send().
      
      (At some point we might want to try to run output functions in
      shorter-lived memory contexts, so that we don't need a zero-leakage policy
      for them.  But that would be a significantly more invasive patch, which
      doesn't seem like material for back-patching.)
      
      In passing, use appendStringInfoCharMacro instead of appendStringInfoChar
      in the innermost data-copying loop of record_out, to try to shave a few
      cycles from this function's runtime.
      
      Per trouble report from Carlos Henrique Reimer.  Back-patch to all
      supported versions.