Commits · 3b5548a3d524e3b37d49f79f707d2119ecdfa303 · Greenplum / Gpdb

14 May, 2012 1 commit

Update comments that became out-of-date with the PGXACT struct. · 9e4637bf

Committed by Heikki Linnakangas on May 14, 2012

When the "hot" members of PGPROC were split off to separate PGXACT structs,
many PGPROC fields referred to in comments were moved to PGXACT, but the
comments were neglected in the commit. Mostly this is just a search/replace
of PGPROC with PGXACT, but the way the dummy PGPROC entries are created for
prepared transactions changed more, making some of the comments totally
bogus.

Noah Misch


24 April, 2012 1 commit

Lots of doc corrections. · 5d4b60f2

Committed by Robert Haas on April 23, 2012

Josh Kupershmidt

07 February, 2012 1 commit

Add locking around WAL-replay modification of shared-memory variables. · c6d76d7c

Committed by Tom Lane on February 06, 2012

Originally, most of this code assumed that no Postgres backends could be
running concurrently with it, and so no locking could be needed. That
assumption fails in Hot Standby. While it's still true that Hot Standby
backends should never change values like nextXid, they can examine them,
and consistency is important in some cases such as when computing a
snapshot. Therefore, prudence requires that WAL replay code obtain the
relevant locks when modifying such variables, even though it can examine
them without taking a lock. We were following that coding rule in some
places but not all. This commit applies the coding rule uniformly to all
updates of ShmemVariableCache and MultiXactState fields; a search of the
replay routines did not find any other cases that seemed to be at risk.

In addition, this commit fixes a longstanding thinko in replay of NEXTOID
and checkpoint records: we tried to advance nextOid only if it was behind
the value in the WAL record, but the comparison would draw the wrong
conclusion if OID wraparound had occurred since the previous value.
Better to just unconditionally assign the new value, since OID assignment
shouldn't be happening during replay anyway.

The additional locking seems to be more in the nature of future-proofing
than fixing any live bug, so I am not going to back-patch it. The NEXTOID
fix will be back-patched separately.
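
The NEXTOID problem described above can be illustrated with a small sketch. It assumes only that OIDs are 32-bit unsigned counters that wrap around; the names `Oid32`, `advance_if_behind`, and `assign_unconditionally` are hypothetical and are not taken from the PostgreSQL source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit OID counter that wraps around. */
typedef uint32_t Oid32;

/*
 * Old-style rule (illustrative only): advance the counter only if it
 * looks "behind" the value from the WAL record.  A plain unsigned
 * comparison gives the wrong answer once the counter has wrapped.
 */
static Oid32
advance_if_behind(Oid32 current, Oid32 from_wal)
{
    if (current < from_wal)
        return from_wal;
    return current;             /* stale value is kept after wraparound */
}

/* New-style rule: always take the value recorded in WAL. */
static Oid32
assign_unconditionally(Oid32 current, Oid32 from_wal)
{
    (void) current;             /* the previous value is irrelevant */
    return from_wal;
}
```

If the in-memory counter is close to 2^32 and the WAL record carries a small post-wraparound value, `advance_if_behind` keeps the stale large value, while `assign_unconditionally` adopts the recorded one.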


30 January, 2012 1 commit

Assorted comment fixes, mostly just typos, but some obsolete statements. · ad10853b

Committed by Tom Lane on January 29, 2012

YAMAMOTO Takashi

24 January, 2012 1 commit

Resolve timing issue with logging locks for Hot Standby. · c172b7b0

Committed by Simon Riggs on January 23, 2012

We log AccessExclusiveLocks for replay onto standby nodes,
but because of timing issues on ProcArray it is possible to
log a lock that is still held by a just committed transaction
that is very soon to be removed. To avoid any timing issue we
avoid applying locks made by transactions with InvalidXid.

Simon Riggs, bug report Tom Lane, diagnosis Pavan Deolasee


02 January, 2012 1 commit

Update copyright notices for year 2012. · e126958c

Committed by Bruce Momjian on January 01, 2012

17 December, 2011 1 commit

Various micro-optimizations for GetSnapshotData(). · 0d76b60d

Committed by Robert Haas on December 16, 2011

Heikki Linnakangas had the idea of rearranging GetSnapshotData to
avoid checking for sub-XIDs when no top-level XID is present. This
patch does that plus a bit of further, related rearrangement.
Benchmarking shows a significant improvement on unlogged tables at
higher concurrency levels, and mostly indifferent results on permanent
tables (which are presumably bottlenecked elsewhere). Most of the
benefit seems to come from using the new NormalTransactionIdPrecedes()
macro rather than the function call TransactionIdPrecedes().
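
A rough sketch of why a macro can be cheaper than the general function: the function has to handle the reserved "special" XIDs before comparing, while the macro form leaves that guarantee to the caller and performs only a modulo-2^32 comparison. The names `xid_precedes` and `NORMAL_XID_PRECEDES` are hypothetical stand-ins, modeled loosely on the identifiers mentioned in the commit message; this is not the PostgreSQL source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* XIDs below this value are reserved "special" XIDs in PostgreSQL. */
#define FirstNormalTransactionId ((TransactionId) 3)
#define TransactionIdIsNormal(xid) ((xid) >= FirstNormalTransactionId)

/* General function (sketch): branches on special XIDs before comparing. */
static bool
xid_precedes(TransactionId id1, TransactionId id2)
{
    int32_t diff;

    if (!TransactionIdIsNormal(id1) || !TransactionIdIsNormal(id2))
        return id1 < id2;

    diff = (int32_t) (id1 - id2);
    return diff < 0;
}

/*
 * Macro form (sketch): the caller guarantees both XIDs are normal, so
 * only the wraparound-aware comparison remains and no call is made.
 */
#define NORMAL_XID_PRECEDES(id1, id2) \
    ((int32_t) ((id1) - (id2)) < 0)
```

For two normal XIDs both forms agree, including across wraparound, where an XID just below 2^32 counts as preceding a small XID assigned after the counter wrapped.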


25 November, 2011 1 commit

Move "hot" members of PGPROC into a separate PGXACT array. · ed0b409d

Committed by Robert Haas on November 25, 2011

This speeds up snapshot-taking and reduces ProcArrayLock contention.
Also, the PGPROC (and PGXACT) structures used by two-phase commit are
now allocated as part of the main array, rather than in a separate
array, and we keep ProcArray sorted in pointer order. These changes
are intended to minimize the number of cache lines that must be pulled
in to take a snapshot, and testing shows a substantial increase in
performance on both read and write workloads at high concurrencies.

Pavan Deolasee, Heikki Linnakangas, Robert Haas
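
The hot/cold split described above can be sketched in general terms as follows. The struct layouts and names (`ColdProc`, `HotXact`, `collect_running_xids`) are hypothetical and far simpler than the real PGPROC and PGXACT; the sketch only shows how a scan over a small, dense array touches fewer cache lines than a scan over large per-process structs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* "Cold" per-process data: large, rarely needed when taking a snapshot. */
typedef struct ColdProc
{
    int     pid;
    char    other_state[256];   /* placeholder for rarely-read fields */
} ColdProc;

/* "Hot" per-process data: small, so several entries share a cache line. */
typedef struct HotXact
{
    TransactionId xid;          /* 0 means no transaction in progress */
    TransactionId xmin;
} HotXact;

/*
 * Collect running XIDs by scanning only the dense hot array; the cold
 * array (indexed the same way) is never touched here.
 * Returns the number of XIDs written to out[].
 */
static size_t
collect_running_xids(const HotXact *hot, size_t nprocs, TransactionId *out)
{
    size_t  count = 0;

    for (size_t i = 0; i < nprocs; i++)
    {
        if (hot[i].xid != 0)
            out[count++] = hot[i].xid;
    }
    return count;
}
```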


02 November, 2011 2 commits

Derive oldestActiveXid at correct time for Hot Standby. · 86e33648

Committed by Simon Riggs on November 02, 2011

There was a timing window between when oldestActiveXid was derived
and when it should have been derived that only shows itself under
heavy load. Move code around to ensure correct timing of derivation.
No change to StartupSUBTRANS() code, which is where this failed.

Bug report by Chris Redekop


Start Hot Standby faster when initial snapshot is incomplete. · 10b7c686

Committed by Simon Riggs on November 02, 2011

If the initial snapshot had overflowed then we can start whenever
the latest snapshot is empty, not overflowed or as we did already,
start when the xmin on primary was higher than xmax of our starting
snapshot, which proves we have full snapshot data.

Bug report by Chris Redekop


23 October, 2011 1 commit

Support synchronization of snapshots through an export/import procedure. · bb446b68

Committed by Tom Lane on October 22, 2011

A transaction can export a snapshot with pg_export_snapshot(), and then
others can import it with SET TRANSACTION SNAPSHOT.  The data does not
leave the server so there are no security issues.  A snapshot can only
be imported while the exporting transaction is still running, and there
are some other restrictions.

I'm not totally convinced that we've covered all the bases for SSI (true
serializable) mode, but it works fine for lesser isolation modes.

Joachim Wieland, reviewed by Marko Tiikkaja, and rather heavily modified
by Tom Lane


21 October, 2011 1 commit

Simplify and improve ProcessStandbyHSFeedbackMessage logic. · b4a0223d

Committed by Tom Lane on October 20, 2011

There's no need to clamp the standby's xmin to be greater than
GetOldestXmin's result; if there were any such need this logic would be
hopelessly inadequate anyway, because it fails to account for
within-database versus cluster-wide values of GetOldestXmin. So get rid of
that, and just rely on sanity-checking that the xmin is not wrapped around
relative to the nextXid counter. Also, don't reset the walsender's xmin if
the current feedback xmin is indeed out of range; that just creates more
problems than we already had. Lastly, don't bother to take the
ProcArrayLock; there's no need to do that to set xmin.

Also improve the comments about this in GetOldestXmin itself.


04 September, 2011 1 commit

Clean up the #include mess a little. · 1609797c

Committed by Tom Lane on September 04, 2011

walsender.h should depend on xlog.h, not vice versa. (Actually, the
inclusion was circular until a couple hours ago, which was even sillier;
but Bruce broke it in the expedient rather than logically correct
direction.) Because of that poor decision, plus blind application of
pgrminclude, we had a situation where half the system was depending on
xlog.h to include such unrelated stuff as array.h and guc.h. Clean up
the header inclusion, and manually revert a lot of what pgrminclude had
done so things build again.

This episode reinforces my feeling that pgrminclude should not be run
without adult supervision. Inclusion changes in header files in particular
need to be reviewed with great care. More generally, it'd be good if we
had a clearer notion of module layering to dictate which headers can sanely
include which others ... but that's a big task for another day.


01 September, 2011 1 commit

Remove unnecessary #include references, per pgrminclude script. · 6416a82a

Committed by Bruce Momjian on September 01, 2011

10 April, 2011 1 commit

pgindent run before PG 9.1 beta 1. · bf50caf1

Committed by Bruce Momjian on April 10, 2011

09 March, 2011 1 commit

Don't throw a warning if vacuum sees PD_ALL_VISIBLE flag set on a page that · 93d88823

Committed by Heikki Linnakangas on March 08, 2011

contains newly-inserted tuples that according to our OldestXmin are not
yet visible to everyone. The value returned by GetOldestXmin() is conservative,
and it can move backwards on repeated calls, so if we see that contradiction
between the PD_ALL_VISIBLE flag and status of tuples on the page, we have to
assume it's because an earlier vacuum calculated a higher OldestXmin value,
and all the tuples really are visible to everyone.

We have received several reports of this bug, with the "PD_ALL_VISIBLE flag
was incorrectly set in relation ..." warning appearing in logs. We were
finally able to hunt it down with David Gould's help to run extra diagnostics
in an environment where this happened frequently.

Also reword the warning, per Robert Haas' suggestion, to not imply that the
PD_ALL_VISIBLE flag is necessarily at fault, as it might also be a symptom
of corruption on a tuple header.

Backpatch to 8.4, where the PD_ALL_VISIBLE flag was introduced.


17 February, 2011 1 commit

Hot Standby feedback for avoidance of cleanup conflicts on standby. · bca8b7f1

Committed by Simon Riggs on February 16, 2011

Standby optionally sends back information about oldestXmin of queries
which is then checked and applied to the WALSender's proc->xmin.
GetOldestXmin() is modified slightly to agree with GetSnapshotData(),
so that all backends on primary include WALSender within their snapshots.
Note this does nothing to change the snapshot xmin on either master or
standby. Feedback piggybacks on the standby reply message.
vacuum_defer_cleanup_age is no longer used on standby, though parameter
still exists on primary, since some use cases still exist.

Simon Riggs, review comments from Fujii Masao, Heikki Linnakangas, Robert Haas


18 January, 2011 1 commit

Fix thinko in comment. Spotted by Jim Nasby. · b1dc45c1

Committed by Heikki Linnakangas on January 18, 2011

02 January, 2011 1 commit

Stamp copyrights for year 2011. · 5d950e3b

Committed by Bruce Momjian on January 01, 2011

09 December, 2010 1 commit

Optimize commit_siblings in two ways to improve group commit. · e620ee35

Committed by Simon Riggs on December 08, 2010

First, avoid scanning the whole ProcArray once we know there
are at least commit_siblings active; second, skip the check
altogether if commit_siblings = 0.

Greg Smith
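
The two optimizations can be sketched as an early-exit count. The `BackendSlot` struct and the function name `at_least_n_active` are hypothetical simplifications, not the actual ProcArray code; the real check applies more conditions when deciding whether a backend counts as active.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one backend slot. */
typedef struct BackendSlot
{
    bool    active;             /* backend currently in a transaction */
} BackendSlot;

/* Returns true if at least `min` slots are active. */
static bool
at_least_n_active(const BackendSlot *slots, size_t nslots, int min)
{
    int     count = 0;

    /* Second optimization: skip the scan entirely for a zero threshold. */
    if (min == 0)
        return true;

    for (size_t i = 0; i < nslots; i++)
    {
        if (!slots[i].active)
            continue;

        /* First optimization: stop as soon as the threshold is reached. */
        if (++count >= min)
            return true;
    }
    return false;
}
```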


07 December, 2010 1 commit

Fix bugs in the hot standby known-assigned-xids tracking logic. If there's · 5a031a55

Committed by Heikki Linnakangas on December 07, 2010

an old transaction running in the master, and a lot of transactions have
started and finished since, and a WAL-record is written in the gap between
creating the running-xacts snapshot and WAL-logging it, recovery will fail
with "too many KnownAssignedXids" error. This bug was reported by
Joachim Wieland on Nov 19th.

In the same scenario, when fewer transactions have started so that all the
xids fit in KnownAssignedXids despite the first bug, a more serious bug
arises. We incorrectly initialize the clog code with the oldest still running
transaction, and when we see the WAL record belonging to a transaction with
an XID larger than one that committed already before the checkpoint we're
recovering from, we zero the clog page containing the already committed
transaction, leading to data loss.

In hindsight, trying to track xids in the known-assigned-xids array before
seeing the running-xacts record was too complicated. To fix that, hold
XidGenLock while the running-xacts snapshot is taken and WAL-logged. That
ensures that no transaction can begin or end in that gap, so that in recovery
we know that the snapshot contains all transactions running at that point in
WAL.


21 September, 2010 1 commit

Remove cvs keywords from all files. · 9f2e2113

Committed by Magnus Hagander on September 20, 2010

31 August, 2010 1 commit

Cosmetic fixes for KnownAssignedXidsGetOldestXmin, per Fujii Masao. · 174a5133

Committed by Tom Lane on August 30, 2010

30 August, 2010 1 commit

Teach GetOldestXmin() about KnownAssignedXids during recovery. · e24d1dc0

Committed by Simon Riggs on August 30, 2010

Very minor issue, though this is required for a later patch.
Reported by Heikki Linnakangas.

13 August, 2010 1 commit

Correct sundry errors in Hot Standby-related comments. · 30c22eb8

Committed by Robert Haas on August 12, 2010

Fujii Masao

07 July, 2010 1 commit

pgindent run for 9.0, second run · 239d769e

Committed by Bruce Momjian on July 06, 2010

04 July, 2010 1 commit

Make vacuum_defer_cleanup_age be PGC_SIGHUP level, since it's not sensible · aceedd88

Committed by Tom Lane on July 03, 2010

to have different values in different processes of the primary server.
Also put it into the "Streaming Replication" GUC category; it doesn't belong
in "Standby Servers" because you use it on the master not the standby.
In passing also correct guc.c's idea of wal_keep_segments' category.


14 May, 2010 1 commit

Add many new Asserts in code and fix simple bug that slipped through · fd34374b

Committed by Simon Riggs on May 14, 2010

without them, related to previous commit. Report by Bruce Momjian.

13 May, 2010 1 commit

Cleanup initialization of Hot Standby. Clarify working with reanalysis · 8431e296

Committed by Simon Riggs on May 13, 2010

of requirements and documentation on LogStandbySnapshot(). Fixes
two minor bugs reported by Tom Lane that would lead to an incorrect
snapshot after transaction wraparound. Also fix two other problems
discovered that would give incorrect snapshots in certain cases.
ProcArrayApplyRecoveryInfo() substantially rewritten. Some minor
refactoring of xact_redo_apply() and ExpireTreeKnownAssignedTransactionIds().


30 April, 2010 1 commit

Rename the parameter recovery_connections to hot_standby, to reduce possible · f0488bd5

Committed by Tom Lane on April 29, 2010

confusion with streaming-replication settings. Also, change its default
value to "off", because of concern about executing new and poorly-tested
code during ordinary non-replicating operation. Per discussion.

In passing do some minor editing of related documentation.


28 April, 2010 1 commit

Replace the KnownAssignedXids hash table with a sorted-array data structure, · 2871b461

Committed by Tom Lane on April 28, 2010

and be more tense about the locking requirements for it, to improve performance
in Hot Standby mode.  In passing fix a few bugs and improve a number of
comments in the existing HS code.

Simon Riggs, with some editorialization by Tom
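
As a rough illustration of the sorted-array idea, membership in an array kept in XID order can be tested with a binary search instead of a hash probe. This is only a sketch: the names `xid_before` and `sorted_xids_contain` are hypothetical, and the real KnownAssignedXids structure carries extra bookkeeping and locking rules that are not modeled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Wraparound-aware "a precedes b" for normal XIDs (modulo-2^32 compare). */
static bool
xid_before(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/* Binary search in an array that is kept sorted in XID order. */
static bool
sorted_xids_contain(const TransactionId *xids, size_t n, TransactionId target)
{
    size_t  lo = 0;
    size_t  hi = n;             /* search the half-open range [lo, hi) */

    while (lo < hi)
    {
        size_t  mid = lo + (hi - lo) / 2;

        if (xids[mid] == target)
            return true;
        if (xid_before(xids[mid], target))
            lo = mid + 1;
        else
            hi = mid;
    }
    return false;
}
```

Because XIDs are normally observed in increasing order, new entries in such an array can usually be appended at the end, which keeps it sorted without a separate sort step.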


22 April, 2010 2 commits

Optimise btree delete processing when no active backends. · a2555571

Committed by Simon Riggs on April 22, 2010

Clarify comments, downgrade a message to DEBUG and remove some
debug counters. Direct from ideas by Heikki Linnakangas.


Relax locking during GetCurrentVirtualXIDs(). Earlier improvements · 0192abc4

Committed by Simon Riggs on April 21, 2010

to handling of btree delete records mean that all snapshot
conflicts on standby now have a valid, useful latestRemovedXid.
Our earlier approach using LW_EXCLUSIVE was useful when we didn't
always have a valid value, though is no longer useful or necessary.
Asserts added to code path to prove and ensure this is the case.
This will reduce contention and improve performance of larger Hot
Standby servers.


20 April, 2010 1 commit

Check RecoveryInProgress() while holding ProcArrayLock during snapshots. · 7bc76d51

Committed by Simon Riggs on April 19, 2010

This prevents a rare, yet possible race condition at the exact moment
of transition from recovery to normal running.

19 April, 2010 1 commit

Tune GetSnapshotData() during Hot Standby by avoiding loop · 21d6a6a1

Committed by Simon Riggs on April 18, 2010

through normal backends. Makes code clearer also, since we
avoid various Assert()s. Performance of snapshots taken
during recovery no longer depends upon number of read-only
backends.


06 April, 2010 1 commit

Change some debug ereports to elogs, as requested by translation team. · 19c7a59b

Committed by Simon Riggs on April 06, 2010

11 March, 2010 1 commit

Fix bug in KnownAssignedXidsMany(). I saw this when looking at the · e0f9e2b6

Committed by Heikki Linnakangas on March 11, 2010

assertion failure reported by Erik Rijkers, but this alone doesn't explain
the failure.

26 February, 2010 1 commit

pgindent run for 9.0 · 65e806cb

Committed by Bruce Momjian on February 26, 2010

24 January, 2010 1 commit

In HS, Startup process sets SIGALRM when waiting for buffer pin. If · 959ac58c

Committed by Simon Riggs on January 23, 2010

woken by alarm we send SIGUSR1 to all backends requesting that they
check to see if they are blocking Startup process. If so, they throw
ERROR/FATAL as for other conflict resolutions. Deadlock stop gap
removed. max_standby_delay = -1 option removed to prevent deadlock.


21 January, 2010 1 commit

Better internal documentation of locking for Hot Standby conflict resolution. · 58565d78

Committed by Simon Riggs on January 21, 2010

Discuss the reasons for the lock type we hold on ProcArrayLock while deriving
the conflict list. Cover the idea of false positive conflicts and seemingly
strange effects on snapshot derivation.
