1. 14 May, 2012 1 commit
    • Update comments that became out-of-date with the PGXACT struct. · 9e4637bf
      Committed by Heikki Linnakangas
      When the "hot" members of PGPROC were split off to separate PGXACT structs,
      many PGPROC fields referred to in comments were moved to PGXACT, but the
      comments were neglected in the commit. Mostly this is just a search/replace
      of PGPROC with PGXACT, but the way the dummy PGPROC entries are created for
      prepared transactions changed more, making some of the comments totally
      bogus.
      
      Noah Misch
  2. 24 Apr, 2012 1 commit
  3. 07 Feb, 2012 1 commit
    • Add locking around WAL-replay modification of shared-memory variables. · c6d76d7c
      Committed by Tom Lane
      Originally, most of this code assumed that no Postgres backends could be
      running concurrently with it, and so no locking could be needed.  That
      assumption fails in Hot Standby.  While it's still true that Hot Standby
      backends should never change values like nextXid, they can examine them,
      and consistency is important in some cases such as when computing a
      snapshot.  Therefore, prudence requires that WAL replay code obtain the
      relevant locks when modifying such variables, even though it can examine
      them without taking a lock.  We were following that coding rule in some
      places but not all.  This commit applies the coding rule uniformly to all
      updates of ShmemVariableCache and MultiXactState fields; a search of the
      replay routines did not find any other cases that seemed to be at risk.
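The coding rule in the paragraph above can be illustrated with a minimal sketch. It assumes a plain pthread mutex in place of PostgreSQL's lightweight locks, and the struct and function names are hypothetical, not taken from the PostgreSQL source. The replay process is the only writer, so it takes the lock to modify the value but may read it without one; other backends always read under the lock.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for a shared-memory struct such as ShmemVariableCache. */
typedef struct
{
    pthread_mutex_t lock;       /* stand-in for the relevant lightweight lock */
    uint32_t        next_xid;   /* value that only the replay process modifies */
} SharedCountersSketch;

/* Replay side: always take the lock when modifying the shared value. */
static void
replay_set_next_xid(SharedCountersSketch *s, uint32_t value)
{
    pthread_mutex_lock(&s->lock);
    s->next_xid = value;
    pthread_mutex_unlock(&s->lock);
}

/* Replay side: as the sole writer, it may examine the value without a lock. */
static uint32_t
replay_peek_next_xid(const SharedCountersSketch *s)
{
    return s->next_xid;
}

/* Backend side: readers take the lock so they see a consistent value. */
static uint32_t
backend_read_next_xid(SharedCountersSketch *s)
{
    uint32_t value;

    pthread_mutex_lock(&s->lock);
    value = s->next_xid;
    pthread_mutex_unlock(&s->lock);
    return value;
}
```

The unlocked read is safe only because no other process writes the field; a second writer would invalidate that assumption.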
      
      In addition, this commit fixes a longstanding thinko in replay of NEXTOID
      and checkpoint records: we tried to advance nextOid only if it was behind
      the value in the WAL record, but the comparison would draw the wrong
      conclusion if OID wraparound had occurred since the previous value.
      Better to just unconditionally assign the new value, since OID assignment
      shouldn't be happening during replay anyway.
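The wraparound problem described above can be shown with a small sketch. It assumes only that OIDs are 32-bit unsigned counters; the function names are hypothetical and this is not the replay code itself.

```c
#include <stdint.h>

typedef uint32_t Oid;   /* OIDs are 32-bit unsigned values that wrap around */

/* Old rule: advance the counter only if it appears to be behind the WAL value. */
static Oid
advance_if_behind(Oid current, Oid from_wal)
{
    return (current < from_wal) ? from_wal : current;
}

/* Rule described in the commit: assign the WAL value unconditionally. */
static Oid
assign_unconditionally(Oid current, Oid from_wal)
{
    (void) current;             /* the previous value is deliberately ignored */
    return from_wal;
}
```

If the counter was near the top of its range and the WAL record carries a small post-wraparound value, `advance_if_behind` keeps the stale large value, while `assign_unconditionally` adopts the value from the record.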
      
      The additional locking seems to be more in the nature of future-proofing
      than fixing any live bug, so I am not going to back-patch it.  The NEXTOID
      fix will be back-patched separately.
  4. 30 Jan, 2012 1 commit
  5. 24 Jan, 2012 1 commit
    • Resolve timing issue with logging locks for Hot Standby. · c172b7b0
      Committed by Simon Riggs
      We log AccessExclusiveLocks for replay onto standby nodes,
      but because of timing issues on ProcArray it is possible to
      log a lock that is still held by a just committed transaction
      that is very soon to be removed. To avoid any timing issue we
      avoid applying locks made by transactions with InvalidXid.
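A minimal sketch of the filtering idea, assuming a simplified lock record with hypothetical field and function names: entries whose transaction ID is invalid (0, the value of PostgreSQL's InvalidTransactionId) are dropped before the list is used.

```c
#include <stddef.h>
#include <stdint.h>

#define INVALID_XID ((uint32_t) 0)  /* PostgreSQL's InvalidTransactionId is 0 */

/* Hypothetical, simplified record of one AccessExclusiveLock. */
typedef struct
{
    uint32_t xid;       /* transaction holding the lock, or INVALID_XID */
    uint32_t rel_oid;   /* relation the lock is held on */
} LockRecordSketch;

/*
 * Compact the array in place, keeping only locks that belong to a
 * transaction with a valid XID. Returns the number of records kept.
 */
static size_t
drop_locks_without_xid(LockRecordSketch *locks, size_t n)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (locks[i].xid != INVALID_XID)
            locks[kept++] = locks[i];
    }
    return kept;
}
```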
      
      Simon Riggs, bug report Tom Lane, diagnosis Pavan Deolasee
  6. 02 Jan, 2012 1 commit
  7. 17 Dec, 2011 1 commit
    • Various micro-optimizations for GetSnapshotData(). · 0d76b60d
      Committed by Robert Haas
      Heikki Linnakangas had the idea of rearranging GetSnapshotData to
      avoid checking for sub-XIDs when no top-level XID is present.  This
      patch does that plus a bit of further, related rearrangement.
      Benchmarking shows a significant improvement on unlogged tables at
      higher concurrency levels, and a mostly indifferent result on permanent
      tables (which are presumably bottlenecked elsewhere).  Most of the
      benefit seems to come from using the new NormalTransactionIdPrecedes()
      macro rather than the function call TransactionIdPrecedes().
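The difference between the two comparisons can be sketched as follows. This is an illustration with hypothetical names, not the PostgreSQL definitions: the general function must first handle the special XIDs below 3, whereas the macro assumes both inputs are normal XIDs and reduces to a single signed modulo-2^32 subtraction that can be inlined at the call site.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t XidSketch;

/* XIDs below 3 are reserved (invalid, bootstrap, frozen); 3 is the first normal XID. */
#define FIRST_NORMAL_XID ((XidSketch) 3)

/* General comparison: special XIDs compare as plain integers, normal ones circularly. */
static bool
xid_precedes(XidSketch a, XidSketch b)
{
    if (a < FIRST_NORMAL_XID || b < FIRST_NORMAL_XID)
        return a < b;
    return (int32_t) (a - b) < 0;
}

/* Fast path: the caller guarantees both XIDs are normal, so no branch on special values. */
#define NORMAL_XID_PRECEDES(a, b) \
    (assert((XidSketch) (a) >= FIRST_NORMAL_XID && \
            (XidSketch) (b) >= FIRST_NORMAL_XID), \
     (int32_t) ((XidSketch) (a) - (XidSketch) (b)) < 0)
```

Because the subtraction is done modulo 2^32, an XID just below the wraparound point is treated as preceding a small XID assigned after it.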
  8. 25 Nov, 2011 1 commit
    • Move "hot" members of PGPROC into a separate PGXACT array. · ed0b409d
      Committed by Robert Haas
      This speeds up snapshot-taking and reduces ProcArrayLock contention.
      Also, the PGPROC (and PGXACT) structures used by two-phase commit are
      now allocated as part of the main array, rather than in a separate
      array, and we keep ProcArray sorted in pointer order.  These changes
      are intended to minimize the number of cache lines that must be pulled
      in to take a snapshot, and testing shows a substantial increase in
      performance on both read and write workloads at high concurrencies.
      
      Pavan Deolasee, Heikki Linnakangas, Robert Haas
  9. 02 Nov, 2011 2 commits
    • Derive oldestActiveXid at correct time for Hot Standby. · 86e33648
      Committed by Simon Riggs
      There was a timing window between when oldestActiveXid was derived
      and when it should have been derived that only shows itself under
      heavy load. Move code around to ensure correct timing of derivation.
      No change to StartupSUBTRANS() code, which is where this failed.
      
      Bug report by Chris Redekop
    • Start Hot Standby faster when initial snapshot is incomplete. · 10b7c686
      Committed by Simon Riggs
      If the initial snapshot had overflowed, then we can start whenever
      the latest snapshot is empty or not overflowed or, as we did already,
      when the xmin on the primary is higher than the xmax of our starting
      snapshot, which proves we have full snapshot data.
      
      Bug report by Chris Redekop
  10. 23 Oct, 2011 1 commit
    • Support synchronization of snapshots through an export/import procedure. · bb446b68
      Committed by Tom Lane
      A transaction can export a snapshot with pg_export_snapshot(), and then
      others can import it with SET TRANSACTION SNAPSHOT.  The data does not
      leave the server so there are no security issues.  A snapshot can only
      be imported while the exporting transaction is still running, and there
      are some other restrictions.
      
      I'm not totally convinced that we've covered all the bases for SSI (true
      serializable) mode, but it works fine for lesser isolation modes.
      
      Joachim Wieland, reviewed by Marko Tiikkaja, and rather heavily modified
      by Tom Lane
  11. 21 Oct, 2011 1 commit
    • Simplify and improve ProcessStandbyHSFeedbackMessage logic. · b4a0223d
      Committed by Tom Lane
      There's no need to clamp the standby's xmin to be greater than
      GetOldestXmin's result; if there were any such need this logic would be
      hopelessly inadequate anyway, because it fails to account for
      within-database versus cluster-wide values of GetOldestXmin.  So get rid of
      that, and just rely on sanity-checking that the xmin is not wrapped around
      relative to the nextXid counter.  Also, don't reset the walsender's xmin if
      the current feedback xmin is indeed out of range; that just creates more
      problems than we already had.  Lastly, don't bother to take the
      ProcArrayLock; there's no need to do that to set xmin.
      
      Also improve the comments about this in GetOldestXmin itself.
  12. 04 Sep, 2011 1 commit
    • Clean up the #include mess a little. · 1609797c
      Committed by Tom Lane
      walsender.h should depend on xlog.h, not vice versa.  (Actually, the
      inclusion was circular until a couple hours ago, which was even sillier;
      but Bruce broke it in the expedient rather than logically correct
      direction.)  Because of that poor decision, plus blind application of
      pgrminclude, we had a situation where half the system was depending on
      xlog.h to include such unrelated stuff as array.h and guc.h.  Clean up
      the header inclusion, and manually revert a lot of what pgrminclude had
      done so things build again.
      
      This episode reinforces my feeling that pgrminclude should not be run
      without adult supervision.  Inclusion changes in header files in particular
      need to be reviewed with great care.  More generally, it'd be good if we
      had a clearer notion of module layering to dictate which headers can sanely
      include which others ... but that's a big task for another day.
  13. 01 Sep, 2011 1 commit
  14. 10 Apr, 2011 1 commit
  15. 09 Mar, 2011 1 commit
    • Don't throw a warning if vacuum sees PD_ALL_VISIBLE flag set on a page that · 93d88823
      Committed by Heikki Linnakangas
      contains newly-inserted tuples that according to our OldestXmin are not
      yet visible to everyone. The value returned by GetOldestXmin() is conservative,
      and it can move backwards on repeated calls, so if we see that contradiction
      between the PD_ALL_VISIBLE flag and status of tuples on the page, we have to
      assume it's because an earlier vacuum calculated a higher OldestXmin value,
      and all the tuples really are visible to everyone.
      
      We have received several reports of this bug, with the "PD_ALL_VISIBLE flag
      was incorrectly set in relation ..." warning appearing in logs. We were
      finally able to hunt it down with David Gould's help to run extra diagnostics
      in an environment where this happened frequently.
      
      Also reword the warning, per Robert Haas' suggestion, to not imply that the
      PD_ALL_VISIBLE flag is necessarily at fault, as it might also be a symptom
      of corruption on a tuple header.
      
      Backpatch to 8.4, where the PD_ALL_VISIBLE flag was introduced.
  16. 17 Feb, 2011 1 commit
    • Hot Standby feedback for avoidance of cleanup conflicts on standby. · bca8b7f1
      Committed by Simon Riggs
      Standby optionally sends back information about oldestXmin of queries
      which is then checked and applied to the WALSender's proc->xmin.
      GetOldestXmin() is modified slightly to agree with GetSnapshotData(),
      so that all backends on primary include WALSender within their snapshots.
      Note this does nothing to change the snapshot xmin on either master or
      standby. Feedback piggybacks on the standby reply message.
      vacuum_defer_cleanup_age is no longer used on the standby, though the
      parameter still exists on the primary, since some use cases remain.
      
      Simon Riggs, review comments from Fujii Masao, Heikki Linnakangas, Robert Haas
  17. 18 Jan, 2011 1 commit
  18. 02 Jan, 2011 1 commit
  19. 09 Dec, 2010 1 commit
  20. 07 Dec, 2010 1 commit
    • Fix bugs in the hot standby known-assigned-xids tracking logic. If there's · 5a031a55
      Committed by Heikki Linnakangas
      an old transaction running in the master, and a lot of transactions have
      started and finished since, and a WAL-record is written in the gap between
      creating the running-xacts snapshot and WAL-logging it, recovery will fail
      with a "too many KnownAssignedXids" error. This bug was reported by
      Joachim Wieland on Nov 19th.
      
      In the same scenario, when fewer transactions have started so that all the
      xids fit in KnownAssignedXids despite the first bug, a more serious bug
      arises. We incorrectly initialize the clog code with the oldest still running
      transaction, and when we see the WAL record belonging to a transaction with
      an XID larger than one that committed already before the checkpoint we're
      recovering from, we zero the clog page containing the already committed
      transaction, leading to data loss.
      
      In hindsight, trying to track xids in the known-assigned-xids array before
      seeing the running-xacts record was too complicated. To fix that, hold
      XidGenLock while the running-xacts snapshot is taken and WAL-logged. That
      ensures that no transaction can begin or end in that gap, so that in recovery
      we know that the snapshot contains all transactions running at that point in
      WAL.
  21. 21 Sep, 2010 1 commit
  22. 31 Aug, 2010 1 commit
  23. 30 Aug, 2010 1 commit
  24. 13 Aug, 2010 1 commit
  25. 07 Jul, 2010 1 commit
  26. 04 Jul, 2010 1 commit
    • Make vacuum_defer_cleanup_age be PGC_SIGHUP level, since it's not sensible · aceedd88
      Committed by Tom Lane
      to have different values in different processes of the primary server.
      Also put it into the "Streaming Replication" GUC category; it doesn't belong
      in "Standby Servers" because you use it on the master not the standby.
      In passing also correct guc.c's idea of wal_keep_segments' category.
  27. 14 May, 2010 1 commit
  28. 13 May, 2010 1 commit
    • Cleanup initialization of Hot Standby. Clarify working with reanalysis · 8431e296
      Committed by Simon Riggs
      of requirements and documentation on LogStandbySnapshot(). Fixes
      two minor bugs reported by Tom Lane that would lead to an incorrect
      snapshot after transaction wraparound. Also fix two other problems
      discovered that would give incorrect snapshots in certain cases.
      ProcArrayApplyRecoveryInfo() substantially rewritten. Some minor
      refactoring of xact_redo_apply() and ExpireTreeKnownAssignedTransactionIds().
  29. 30 Apr, 2010 1 commit
  30. 28 Apr, 2010 1 commit
  31. 22 Apr, 2010 2 commits
    • Optimise btree delete processing when no active backends. · a2555571
      Committed by Simon Riggs
      Clarify comments, downgrade a message to DEBUG and remove some
      debug counters. Direct from ideas by Heikki Linnakangas.
    • Relax locking during GetCurrentVirtualXIDs(). Earlier improvements · 0192abc4
      Committed by Simon Riggs
      to handling of btree delete records mean that all snapshot
      conflicts on standby now have a valid, useful latestRemovedXid.
      Our earlier approach using LW_EXCLUSIVE was useful when we didn't
      always have a valid value, though is no longer useful or necessary.
      Asserts added to code path to prove and ensure this is the case.
      This will reduce contention and improve performance of larger Hot
      Standby servers.
  32. 20 Apr, 2010 1 commit
  33. 19 Apr, 2010 1 commit
  34. 06 Apr, 2010 1 commit
  35. 11 Mar, 2010 1 commit
  36. 26 Feb, 2010 1 commit
  37. 24 Jan, 2010 1 commit
    • In HS, Startup process sets SIGALRM when waiting for buffer pin. If · 959ac58c
      Committed by Simon Riggs
      woken by alarm we send SIGUSR1 to all backends requesting that they
      check to see if they are blocking Startup process. If so, they throw
      ERROR/FATAL as for other conflict resolutions. Deadlock stop gap
      removed. max_standby_delay = -1 option removed to prevent deadlock.
  38. 21 Jan, 2010 1 commit