Commits · fafa374f2d1e04ab265d56cdadb634124364646f · Greenplum / Gpdb

08 Feb, 2010 1 commit

Remove old-style VACUUM FULL (which was known for a little while as · 0a469c87

Committed by Tom Lane on Feb 08, 2010

VACUUM FULL INPLACE), along with a boatload of subsidiary code and complexity.
Per discussion, the use case for this method of vacuuming is no longer large
enough to justify maintaining it; not to mention that we don't wish to invest
the work that would be needed to make it play nicely with Hot Standby.

Aside from the code directly related to old-style VACUUM FULL, this commit
removes support for certain WAL record types that could only be generated
within VACUUM FULL, redirect-pointer removal in heap_page_prune, and
nontransactional generation of cache invalidation sinval messages (the last
being the sticking point for Hot Standby).

We still have to retain all code that copes with finding HEAP_MOVED_OFF and
HEAP_MOVED_IN flag bits on existing tuples. This can't be removed as long
as we want to support in-place update from pre-9.0 databases.

24 Jan, 2010 1 commit

In HS, Startup process sets SIGALRM when waiting for buffer pin. If · 959ac58c

Committed by Simon Riggs on Jan 23, 2010

woken by alarm we send SIGUSR1 to all backends requesting that they
check to see if they are blocking Startup process. If so, they throw
ERROR/FATAL as for other conflict resolutions. Deadlock stop gap
removed. max_standby_delay = -1 option removed to prevent deadlock.

16 Jan, 2010 1 commit

Teach standby conflict resolution to use SIGUSR1 · a8ce974c

Committed by Simon Riggs on Jan 16, 2010

Conflict reason is passed through directly to the backend, so we can
take decisions about the effect of the conflict based upon the local
state. No specific changes, as yet, though this prepares for later work.
CancelVirtualTransaction() sends signals while holding ProcArrayLock.
Introduce errdetail_abort() to give message detail explaining that the
abort was caused by conflict processing. Remove CONFLICT_MODE states
in favour of using PROCSIG_RECOVERY_CONFLICT states directly, for clarity.

15 Jan, 2010 1 commit

Introduce Streaming Replication. · 40f908bd

Committed by Heikki Linnakangas on Jan 15, 2010

This includes two new kinds of postmaster processes, walsenders and
walreceiver. Walreceiver is responsible for connecting to the primary server
and streaming WAL to disk, while walsender runs in the primary server and
streams WAL from disk to the client.

Documentation still needs work, but the basics are there. We will probably
pull the replication section to a new chapter later on, as well as the
sections describing file-based replication. But let's do that as a separate
patch, so that it's easier to see what has been added/changed. This patch
also adds a new section to the chapter about FE/BE protocol, documenting the
protocol used by walsender/walreceiver.

Bump catalog version because of two new functions,
pg_last_xlog_receive_location() and pg_last_xlog_replay_location(), for
monitoring the progress of replication.

Fujii Masao, with additional hacking by me

03 Jan, 2010 1 commit

Update copyright for the year 2010. · 02398008

Committed by Bruce Momjian on Jan 02, 2010

19 Dec, 2009 1 commit

Allow read only connections during recovery, known as Hot Standby. · efc16ea5

Committed by Simon Riggs on Dec 19, 2009

Enabled by recovery_connections = on (default) and forcing archive recovery using a recovery.conf. Recovery processing now emulates the original transactions as they are replayed, providing full locking and MVCC behaviour for read only queries. Recovery must enter consistent state before connections are allowed, so there is a delay, typically short, before connections succeed. Replay of recovering transactions can conflict and in some cases deadlock with queries during recovery; these result in query cancellation after max_standby_delay seconds have expired. Infrastructure changes have minor effects on normal running, though introduce four new types of WAL record.

New test mode "make standbycheck" allows regression tests of static command behaviour on a standby server while in recovery. Typical and extreme dynamic behaviours have been checked via code inspection and manual testing. Few port specific behaviours have been utilised, though primary testing has been on Linux only so far.

This commit is the basic patch. Additional changes will follow in this release to enhance some aspects of behaviour, notably improved handling of conflicts, deadlock detection and query cancellation. Changes to VACUUM FULL are also required.

Simon Riggs, with significant and lengthy review by Heikki Linnakangas, including streamlined redesign of snapshot creation and two-phase commit.

Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback from many other community members.

01 Sep, 2009 1 commit

Change the autovacuum launcher to read pg_database directly, rather than · 00e6a16d

Committed by Tom Lane on Aug 31, 2009

via the "flat files" facility. This requires making it enough like a backend
to be able to run transactions; it's no longer an "auxiliary process" but
more like the autovacuum worker processes. Also, its signal handling has
to be brought into line with backends/workers. In particular, since it
now has to handle procsignal.c processing, the special autovac-launcher-only
signal conditions are moved to SIGUSR2.

Alvaro, with some cleanup from Tom

13 Aug, 2009 1 commit

Allow backends to start up without use of the flat-file copy of pg_database. · 04011cc9

Committed by Tom Lane on Aug 12, 2009

To make this work in the base case, pg_database now has a nailed-in-cache
relation descriptor that is initialized using hardwired knowledge in
relcache.c. This means pg_database is added to the set of relations that
need to have a Schema_pg_xxx macro maintained in pg_attribute.h. When this
path is taken, we'll have to do a seqscan of pg_database to find the row
we need.

In the normal case, we are able to do an indexscan to find the database's row
by name. This is made possible by storing a global relcache init file that
describes only the shared catalogs and their indexes (and therefore is usable
by all backends in any database). A new backend loads this cache file,
finds its database OID after an indexscan on pg_database, and then loads
the local relcache init file for that database.

This change should effectively eliminate number of databases as a factor
in backend startup time, even with large numbers of databases. However,
the real reason for doing it is as a first step towards getting rid of
the flat files altogether. There are still several other sub-projects
to be tackled before that can happen.

11 Jun, 2009 1 commit

8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef list · d7471402

Committed by Bruce Momjian on Jun 11, 2009

provided by Andrew.

06 May, 2009 1 commit

Install a "dead man switch" to allow the postmaster to detect cases where · 969d7cd4

Committed by Tom Lane on May 05, 2009

a backend has done exit(0) or exit(1) without having disengaged itself
from shared memory.  We are at risk for this whenever third-party code is
loaded into a backend, since such code might not know it's supposed to go
through proc_exit() instead.  Also, it is reported that under Windows
there are ways to externally kill a process that cause the status code
returned to the postmaster to be indistinguishable from a voluntary exit
(thank you, Microsoft).  If this does happen then the system is probably
hosed --- for instance, the dead session might still be holding locks.
So the best recovery method is to treat this like a backend crash.

The dead man switch is armed for a particular child process when it
acquires a regular PGPROC, and disarmed when the PGPROC is released;
these should be the first and last touches of shared memory resources
in a backend, or close enough anyway.  This choice means there is no
coverage for auxiliary processes, but I doubt we need that, since they
shouldn't be executing any user-provided code anyway.

This patch also improves the management of the EXEC_BACKEND
ShmemBackendArray array a bit, by reducing search costs.

Although this problem is of long standing, the lack of field complaints
seems to mean it's not critical enough to risk back-patching; at least
not till we get some more testing of this mechanism.
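
The arm/disarm logic described in this message can be modelled in a few lines. The following is a minimal Python sketch, not PostgreSQL code: `Supervisor`, `ChildSlot`, `arm`, `disarm`, and `reap` are hypothetical names, and the rule that any exit status other than 0 or 1 already counts as a crash is an assumption made for illustration.

```python
# Toy model of the "dead man switch" idea; every name here is hypothetical.
from dataclasses import dataclass


@dataclass
class ChildSlot:
    pid: int
    switch_armed: bool = False  # True while the child holds shared-memory state


class Supervisor:
    """Stands in for the postmaster's bookkeeping of its child processes."""

    def __init__(self):
        self.slots = {}

    def register(self, pid):
        self.slots[pid] = ChildSlot(pid)

    def arm(self, pid):
        # Called when the child acquires its PGPROC-like slot.
        self.slots[pid].switch_armed = True

    def disarm(self, pid):
        # Called when the child releases the slot during an orderly exit.
        self.slots[pid].switch_armed = False

    def reap(self, pid, exit_status):
        """Classify a finished child as 'clean' or 'crash'."""
        slot = self.slots.pop(pid)
        if exit_status not in (0, 1):
            # Assumption: any other status is treated as a crash regardless.
            return "crash"
        # The status looks voluntary, so the switch decides: if it is still
        # armed, the child never detached from shared memory.
        return "crash" if slot.switch_armed else "clean"
```

In this model, a child that exits with status 0 while its switch is still armed is classified as a crash, which mirrors the recovery choice the message argues for.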

02 Jan, 2009 1 commit

Update copyright for 2009. · 511db38a

Committed by Bruce Momjian on Jan 01, 2009

09 Dec, 2008 2 commits

Revert SIGUSR1 multiplexing patch, per Tom's objection. · dea81a6c

Committed by Heikki Linnakangas on Dec 09, 2008

Provide support for multiplexing SIGUSR1 signal. The upcoming synchronous · 7b05b3fa

Committed by Heikki Linnakangas on Dec 09, 2008

replication patch needs a signal, but we've already used SIGUSR1 and
SIGUSR2 in normal backends. This patch allows reusing SIGUSR1 for that,
and for other purposes too if the need arises.

03 Nov, 2008 1 commit

Remove the last vestiges of the MAKE_PTR/MAKE_OFFSET mechanism. We haven't · d7112cfa

Committed by Tom Lane on Nov 02, 2008

allowed different processes to have different addresses for the shmem segment
in quite a long time, but there were still a few places left that used the
old coding convention.  Clean them up to reduce confusion and improve the
compiler's ability to detect pointer type mismatches.

Kris Jurka

10 Jun, 2008 1 commit

Further tweak for comment in CheckDeadLock(), per Tom. · 83742460

Committed by Neil Conway on Jun 09, 2008

09 Jun, 2008 1 commit

Fix typo in comment. · da80a4b9

Committed by Neil Conway on Jun 09, 2008

27 Jan, 2008 1 commit

Change StatementCancelHandler() to check the DoingCommandRead flag to decide · 6322e844

Committed by Tom Lane on Jan 26, 2008

whether to execute an immediate interrupt, rather than testing whether
LockWaitCancel() cancelled a lock wait. The old way misclassified the case
where we were blocked in ProcWaitForSignal(), and arguably would misclassify
any other future additions of new ImmediateInterruptOK states too. This
allows reverting the old kluge that gave LockWaitCancel() a return value,
since no callers care anymore. Improve comments in the various
implementations of PGSemaphoreLock() to explain that on some platforms, the
assumption that semop() exits after a signal is wrong, and so we must ensure
that the signal handler itself throws elog if we want cancel or die interrupts
to be effective. Per testing related to bug #3883, though this patch doesn't
solve those problems fully.

Perhaps this change should be back-patched, but since pre-8.3 branches aren't
really relying on autovacuum to respond to SIGINT, it doesn't seem critical
for them.

02 Jan, 2008 1 commit

Update copyrights in source tree to 2008. · 9098ab9e

Committed by Bruce Momjian on Jan 01, 2008

16 Nov, 2007 1 commit

pgindent run for 8.3. · fdf5a5ef

Committed by Bruce Momjian on Nov 15, 2007

27 Oct, 2007 1 commit

Allow an autovacuum worker to be interrupted automatically when it is found · acac68b2

Committed by Alvaro Herrera on Oct 26, 2007

to be blocking another process (except when it's working to prevent Xid
wraparound problems).

25 Oct, 2007 1 commit

Rearrange vacuum-related bits in PGPROC as a bitmask, to better support · 745c1b2c

Committed by Alvaro Herrera on Oct 24, 2007

having several of them.  Add two more flags: whether the process is
executing an ANALYZE, and whether a vacuum is for Xid wraparound (which
is obviously only set by autovacuum).

Sneakily move the worker's recently-acquired PostAuthDelay to a more useful
place.

09 Sep, 2007 1 commit

Replace the former method of determining snapshot xmax --- to wit, calling · 6bd4f401

Committed by Tom Lane on Sep 08, 2007

ReadNewTransactionId from GetSnapshotData --- with a "latestCompletedXid"
variable that is updated during transaction commit or abort. Since
latestCompletedXid is written only in places that had to lock ProcArrayLock
exclusively anyway, and is read only in places that had to lock ProcArrayLock
shared anyway, it adds no new locking requirements to the system despite being
cluster-wide. Moreover, removing ReadNewTransactionId from snapshot
acquisition eliminates the need to take both XidGenLock and ProcArrayLock at
the same time. Since XidGenLock is sometimes held across I/O this can be a
significant win. Some preliminary benchmarking suggested that this patch has
no effect on average throughput but can significantly improve the worst-case
transaction times seen in pgbench. Concept by Florian Pflug, implementation
by Tom Lane.
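
As a rough illustration of the latestCompletedXid idea, the sketch below keeps the variable next to the set of running transactions and derives snapshot xmax from it. It is a toy model in Python under stated assumptions: `ProcArray`, `begin`, `end_transaction`, and `snapshot` are hypothetical names, a single mutex stands in for the shared/exclusive modes of ProcArrayLock, xmax is assumed to be one past latestCompletedXid, and XID wraparound is ignored.

```python
# Toy model only; names and locking are simplified assumptions.
import threading


class ProcArray:
    def __init__(self, latest_completed_xid):
        # One mutex stands in for ProcArrayLock (no shared/exclusive modes).
        self._lock = threading.Lock()
        self.latest_completed_xid = latest_completed_xid
        self.running = set()

    def begin(self, xid):
        with self._lock:
            self.running.add(xid)

    def end_transaction(self, xid):
        """Commit or abort: the only place latest_completed_xid is written."""
        with self._lock:
            self.running.discard(xid)
            if xid > self.latest_completed_xid:
                self.latest_completed_xid = xid

    def snapshot(self):
        """Return (xmax, in-progress xids) without consulting an XID generator."""
        with self._lock:
            xmax = self.latest_completed_xid + 1
            in_progress = sorted(x for x in self.running if x < xmax)
            return xmax, in_progress
```

Because `snapshot` reads only state guarded by the one lock, the model never needs a second lock for the XID generator, which is the point the message makes about XidGenLock.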

06 Sep, 2007 1 commit

Implement lazy XID allocation: transactions that do not modify any database · 295e6398

Committed by Tom Lane on Sep 05, 2007

rows will normally never obtain an XID at all. We already did things this way
for subtransactions, but this patch extends the concept to top-level
transactions. In applications where there are lots of short read-only
transactions, this should improve performance noticeably; not so much from
removal of the actual XID-assignments, as from reduction of overhead that's
driven by the rate of XID consumption. We add a concept of a "virtual
transaction ID" so that active transactions can be uniquely identified even
if they don't have a regular XID. This is a much lighter-weight concept:
uniqueness of VXIDs is only guaranteed over the short term, and no on-disk
record is made about them.

Florian Pflug, with some editorialization by Tom.
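
The behaviour described above can be sketched as follows. This is an illustrative Python model rather than the actual implementation: `XidCounter`, `Backend`, `Transaction`, `read_rows`, and `modify_rows` are hypothetical names, and representing a virtual transaction ID as a (backend id, per-backend counter) pair is an assumption made for the example.

```python
# Toy model of lazy XID allocation; every name here is hypothetical.
class XidCounter:
    """Stands in for the cluster-wide XID generator."""

    def __init__(self, next_xid=100):
        self.next_xid = next_xid

    def allocate(self):
        xid = self.next_xid
        self.next_xid += 1
        return xid


class Transaction:
    def __init__(self, vxid, counter):
        self.vxid = vxid  # always present, cheap, nothing recorded on disk
        self.xid = None   # regular XID, assigned only on first modification
        self._counter = counter

    def read_rows(self):
        # Reading never forces XID assignment.
        return self.xid

    def modify_rows(self):
        # The first write obtains a regular XID; later writes reuse it.
        if self.xid is None:
            self.xid = self._counter.allocate()
        return self.xid


class Backend:
    def __init__(self, backend_id, counter):
        self.backend_id = backend_id
        self._counter = counter
        self._next_local_id = 1

    def begin(self):
        vxid = (self.backend_id, self._next_local_id)
        self._next_local_id += 1
        return Transaction(vxid, self._counter)
```

A read-only transaction in this model finishes with `xid` still `None` and leaves the shared counter untouched, so XID consumption tracks only the transactions that write.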

28 Aug, 2007 1 commit

Improve behavior of log_lock_waits patch. Ensure that something gets logged · 24d4517b

Committed by Tom Lane on Aug 28, 2007

even if the "deadlock detected" ERROR message is suppressed by an exception
catcher. Be clearer about the event sequence when a soft deadlock is fixed:
the fixing process might or might not still have to wait, so log that
separately. Fix race condition when someone releases us from the lock partway
through printing all this junk --- we'd not get confused about our state, but
the log message sequence could have been misleading, ie, a "still waiting"
message with no subsequent "acquired" message. Greg Stark and Tom Lane.

17 Jul, 2007 1 commit

Add comments spelling out why it's a good idea to release multiple · 82b36846

Committed by Tom Lane on Jul 16, 2007

partition locks in reverse order.

20 Jun, 2007 2 commits

Only log 'process acquired lock' if we actually did get the lock. This · 9cce91db

Committed by Tom Lane on Jun 19, 2007

test seems inessential right now since the only control path for not
getting the lock is via CHECK_FOR_INTERRUPTS which won't return control
to ProcSleep, but it would be important if we ever allow the deadlock
code to kill someone else's transaction instead of our own.

Code review for log_lock_waits patch. Don't try to issue log messages from · 6e072287

Committed by Tom Lane on Jun 19, 2007

within a signal handler (this might be safe given the relatively narrow code
range in which the interrupt is enabled, but it seems awfully risky); do issue
more informative log messages that tell what is being waited for and the exact
length of the wait; minor other code cleanup. Greg Stark and Tom Lane

17 Apr, 2007 1 commit

Add a multi-worker capability to autovacuum. This allows multiple worker · e2a186b0

Committed by Alvaro Herrera on Apr 16, 2007

processes to be running simultaneously.  Also, now autovacuum processes do not
count towards the max_connections limit; they are counted separately from
regular processes, and are limited by the new GUC variable
autovacuum_max_workers.

The launcher now has intelligence to launch workers on each database every
autovacuum_naptime seconds, limited only by the maximum number of worker slots
available.

Also, the global worker I/O utilization is limited by the vacuum cost-based
delay feature.  Workers are "balanced" so that the total I/O consumption does
not exceed the established limit.  This part of the patch was contributed by
ITAGAKI Takahiro.

Per discussion.

04 Apr, 2007 1 commit

Remove the CheckpointStartLock in favor of having backends show whether they · 9c9b6194

Committed by Tom Lane on Apr 03, 2007

are in their commit critical sections via flags in the ProcArray. Checkpoint
can watch the ProcArray to determine when it's safe to proceed. This is
a considerably better solution to the original problem of race conditions
between checkpoint and transaction commit: it speeds up commit, since there's
one less lock to fool with, and it prevents the problem of checkpoint being
delayed indefinitely when there's a constant flow of commits. Heikki, with
some kibitzing from Tom.

07 Mar, 2007 1 commit

Clean up the bootstrap code a little, and rename "dummy procs" in the code · 626eb021

Committed by Alvaro Herrera on Mar 07, 2007

comments and variables to "auxiliary proc", per Heikki's request.

04 Mar, 2007 1 commit

Add GUC log_lock_waits to log long wait times. · e52c4a6e

Committed by Bruce Momjian on Mar 03, 2007

Simon Riggs

16 Feb, 2007 1 commit

Restructure autovacuum in two processes: a dummy process, which runs · 18206509

Committed by Alvaro Herrera on Feb 15, 2007

continuously, and requests vacuum runs of "autovacuum workers" to postmaster.
The workers do the actual vacuum work. This allows for future improvements,
like allowing multiple autovacuum jobs running in parallel.

For now, the code keeps the original behavior of having a single autovac
process at any time by sleeping until the previous worker has finished.

16 Jan, 2007 1 commit

Arrange for autovacuum to be killed when another operation wants to be alone · eb63cc3d

Committed by Alvaro Herrera on Jan 16, 2007

accessing it, like DROP DATABASE.  This allows the regression tests to pass
with autovacuum enabled, which opens the gates for finally enabling autovacuum
by default.

06 Jan, 2007 1 commit

Update CVS HEAD for 2007 copyright. Back branches are typically not · 29dccf5f

Committed by Bruce Momjian on Jan 05, 2007

back-stamped for this.

22 Nov, 2006 1 commit

On systems that have setsid(2) (which should be just about everything except · 3ad0728c

Committed by Tom Lane on Nov 21, 2006

Windows), arrange for each postmaster child process to be its own process
group leader, and deliver signals SIGINT, SIGTERM, SIGQUIT to the whole
process group not only the direct child process. This provides saner behavior
for archive and recovery scripts; in particular, it's possible to shut down a
warm-standby recovery server using "pg_ctl stop -m immediate", since delivery
of SIGQUIT to the startup subprocess will result in killing the waiting
recovery_command. Also, this makes Query Cancel and statement_timeout apply
to scripts being run from backends via system(). (There is no support in the
core backend for that, but it's widely done using untrusted PLs.) Per gripe
from Stephen Harris and subsequent discussion.

04 Oct, 2006 1 commit

pgindent run for 8.2. · f99a569a

Committed by Bruce Momjian on Oct 04, 2006

30 Jul, 2006 1 commit

Modify snapshot definition so that lazy vacuums are ignored by other · 92c2ecc1

Committed by Alvaro Herrera on Jul 30, 2006

vacuums.  This allows an OLTP-like system with big tables to continue
regular vacuuming on small-but-frequently-updated tables while the
big tables are being vacuumed.

Original patch from Hannu Krosing, rewritten by Tom Lane and updated
by me.

24 Jul, 2006 1 commit

Convert the lock manager to use the new dynahash.c support for partitioned · a794fb06

Committed by Tom Lane on Jul 23, 2006

hash tables, instead of the previous kluge involving multiple hash tables.
This partially undoes my patch of last December.

14 Jul, 2006 2 commits

Remove 576 references of include files that were not needed. · e0522505

Committed by Bruce Momjian on Jul 14, 2006

Allow include files to compile on their own. · a22d76d9

Committed by Bruce Momjian on Jul 13, 2006

Strip unused includes out of include files, and add needed
includes to C files.

The next step is to remove unused include files in C files.