1. 29 Aug, 2012 1 commit
    • Split resowner.h · 45326c5a
      Alvaro Herrera committed
      This lets files that are mere users of ResourceOwner not automatically
      include the headers for stuff that is managed by the resowner mechanism.
  2. 03 Jul, 2012 1 commit
  3. 02 Jul, 2012 1 commit
    • Fix race condition in enum value comparisons. · 9ad45c18
      Tom Lane committed
      When (re)loading the typcache comparison cache for an enum type's values,
      use an up-to-date MVCC snapshot, not the transaction's existing snapshot.
      This avoids problems if we encounter an enum OID that was created since our
      transaction started.  Per report from Andres Freund and diagnosis by Robert
      Haas.
      
      To ensure this is safe even if enum comparison manages to get invoked
      before we've set a transaction snapshot, tweak GetLatestSnapshot to
      redirect to GetTransactionSnapshot instead of throwing error when
      FirstSnapshotSet is false.  The existing uses of GetLatestSnapshot (in
      ri_triggers.c) don't care since they couldn't be invoked except in a
      transaction that's already done some work --- but it seems just conceivable
      that this might not be true of enums, especially if we ever choose to use
      enums in system catalogs.
      
      Note that the comparable coding in enum_endpoint and enum_range_internal
      remains GetTransactionSnapshot; this is perhaps debatable, but if we
      changed it those functions would have to be marked volatile, which doesn't
      seem attractive.
      
      Back-patch to 9.1 where ALTER TYPE ADD VALUE was added.
  4. 11 Jun, 2012 1 commit
  5. 14 May, 2012 1 commit
    • Update comments that became out-of-date with the PGXACT struct. · 9e4637bf
      Heikki Linnakangas committed
      When the "hot" members of PGPROC were split off to separate PGXACT structs,
      many PGPROC fields referred to in comments were moved to PGXACT, but the
      comments were neglected in the commit. Mostly this is just a search/replace
      of PGPROC with PGXACT, but the way the dummy PGPROC entries are created for
      prepared transactions changed more, making some of the comments totally
      bogus.
      
      Noah Misch
  6. 02 Jan, 2012 1 commit
  7. 25 Nov, 2011 1 commit
    • Move "hot" members of PGPROC into a separate PGXACT array. · ed0b409d
      Robert Haas committed
      This speeds up snapshot-taking and reduces ProcArrayLock contention.
      Also, the PGPROC (and PGXACT) structures used by two-phase commit are
      now allocated as part of the main array, rather than in a separate
      array, and we keep ProcArray sorted in pointer order.  These changes
      are intended to minimize the number of cache lines that must be pulled
      in to take a snapshot, and testing shows a substantial increase in
      performance on both read and write workloads at high concurrencies.
      
      Pavan Deolasee, Heikki Linnakangas, Robert Haas
  8. 23 Oct, 2011 1 commit
    • Support synchronization of snapshots through an export/import procedure. · bb446b68
      Tom Lane committed
      A transaction can export a snapshot with pg_export_snapshot(), and then
      others can import it with SET TRANSACTION SNAPSHOT.  The data does not
      leave the server so there are no security issues.  A snapshot can only
      be imported while the exporting transaction is still running, and there
      are some other restrictions.
      
      I'm not totally convinced that we've covered all the bases for SSI (true
      serializable) mode, but it works fine for lesser isolation modes.
      
      Joachim Wieland, reviewed by Marko Tiikkaja, and rather heavily modified
      by Tom Lane
  9. 27 Sep, 2011 1 commit
    • Allow snapshot references to still work during transaction abort. · 57eb0090
      Tom Lane committed
      In REPEATABLE READ (nee SERIALIZABLE) mode, an attempt to do
      GetTransactionSnapshot() between AbortTransaction and CleanupTransaction
      failed, because GetTransactionSnapshot would recompute the transaction
      snapshot (which is already wrong, given the isolation mode) and then
      re-register it in the TopTransactionResourceOwner, leading to an Assert
      because the TopTransactionResourceOwner should be empty of resources after
      AbortTransaction.  This is the root cause of bug #6218 from Yamamoto
      Takashi.  While changing plancache.c to avoid requesting a snapshot when
      handling a ROLLBACK masks the problem, I think this is really a snapmgr.c
      bug: it's lower-level than the resource manager mechanism and should not be
      shutting itself down before we unwind resource manager resources.  However,
      just postponing the release of the transaction snapshot until cleanup time
      didn't work because of the circular dependency with
      TopTransactionResourceOwner.  Fix by managing the internal reference to
      that snapshot manually instead of depending on TopTransactionResourceOwner.
      This saves a few cycles as well as making the module layering more
      straightforward.  predicate.c's dependencies on TopTransactionResourceOwner
      go away too.
      
      I think this is a longstanding bug, but there's no evidence that it's more
      than a latent bug, so it doesn't seem worth any risk of back-patching.
  10. 04 Sep, 2011 1 commit
    • Clean up the #include mess a little. · 1609797c
      Tom Lane committed
      walsender.h should depend on xlog.h, not vice versa.  (Actually, the
      inclusion was circular until a couple hours ago, which was even sillier;
      but Bruce broke it in the expedient rather than logically correct
      direction.)  Because of that poor decision, plus blind application of
      pgrminclude, we had a situation where half the system was depending on
      xlog.h to include such unrelated stuff as array.h and guc.h.  Clean up
      the header inclusion, and manually revert a lot of what pgrminclude had
      done so things build again.
      
      This episode reinforces my feeling that pgrminclude should not be run
      without adult supervision.  Inclusion changes in header files in particular
      need to be reviewed with great care.  More generally, it'd be good if we
      had a clearer notion of module layering to dictate which headers can sanely
      include which others ... but that's a big task for another day.
  11. 01 Sep, 2011 1 commit
  12. 10 Apr, 2011 1 commit
  13. 01 Mar, 2011 1 commit
    • Rearrange snapshot handling to make rule expansion more consistent. · c0b00760
      Tom Lane committed
      With this patch, portals, SQL functions, and SPI all agree that there
      should be only a CommandCounterIncrement between the queries that are
      generated from a single SQL command by rule expansion.  Fetching a whole
      new snapshot now happens only between original queries.  This is equivalent
      to the existing behavior of EXPLAIN ANALYZE, and it was judged to be the
      best choice since it eliminates one source of concurrency hazards for
      rules.  The patch should also make things marginally faster by reducing the
      number of snapshot push/pop operations.
      
      The patch removes pg_parse_and_rewrite(), which is no longer used anywhere.
      There was considerable discussion about more aggressive refactoring of the
      query-processing functions exported by postgres.c, but for the moment
      nothing more has been done there.
      
      I also took the opportunity to refactor snapmgr.c's API slightly: the
      former PushUpdatedSnapshot() has been split into two functions.
      
      Marko Tiikkaja, reviewed by Steve Singer and Tom Lane
  14. 08 Feb, 2011 1 commit
    • Implement genuine serializable isolation level. · dafaa3ef
      Heikki Linnakangas committed
      Until now, our Serializable mode has in fact been what's called Snapshot
      Isolation, which allows some anomalies that could not occur in any
      serialized ordering of the transactions. This patch fixes that using a
      method called Serializable Snapshot Isolation, based on research papers by
      Michael J. Cahill (see README-SSI for full references). In Serializable
      Snapshot Isolation, transactions run like they do in Snapshot Isolation,
      but a predicate lock manager observes the reads and writes performed and
      aborts transactions if it detects that an anomaly might occur. This method
      produces some false positives, i.e. it sometimes aborts transactions even
      though there is no anomaly.
      
      To track reads we implement predicate locking, see storage/lmgr/predicate.c.
      Whenever a tuple is read, a predicate lock is acquired on the tuple. Shared
      memory is finite, so when a transaction takes many tuple-level locks on a
      page, the locks are promoted to a single page-level lock, and further to a
      single relation level lock if necessary. To lock key values with no matching
      tuple, a sequential scan always takes a relation-level lock, and an index
      scan acquires a page-level lock that covers the search key, whether or not
      there are any matching keys at the moment.
      
      A predicate lock doesn't conflict with any regular locks or with other
      predicate locks in the normal sense. They're only used by the predicate lock
      manager to detect the danger of anomalies. Only serializable transactions
      participate in predicate locking, so there should be no extra overhead for
      other transactions.
      
      Predicate locks can't be released at commit, but must be remembered until
      all the transactions that overlapped with the committing transaction have
      completed. That means that we need to remember an unbounded number of
      predicate locks, so we apply a
      lossy but conservative method of tracking locks for committed transactions.
      If we run short of shared memory, we overflow to a new "pg_serial" SLRU
      pool.
      
      We don't currently allow Serializable transactions in Hot Standby mode.
      That would be hard, because even read-only transactions can cause anomalies
      that wouldn't otherwise occur.
      
      Serializable isolation mode now means the new fully serializable level.
      Repeatable Read gives you the old Snapshot Isolation level that we have
      always had.
      
      Kevin Grittner and Dan Ports, reviewed by Jeff Davis, Heikki Linnakangas and
      Anssi Kääriäinen
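The lock promotion described in the commit message above (tuple-level locks collapsing into a page-level lock, and page-level locks into a relation-level lock) can be illustrated with a toy model. This is only a sketch: the class name, the method names and both thresholds are hypothetical and are not taken from storage/lmgr/predicate.c, which keeps its state in shared memory and uses its own limits.

```python
from collections import defaultdict

# Hypothetical thresholds for this sketch only; not PostgreSQL's real limits.
MAX_TUPLE_LOCKS_PER_PAGE = 3
MAX_PAGE_LOCKS_PER_RELATION = 2


class PredicateLocks:
    """Toy per-transaction predicate lock table with granularity promotion."""

    def __init__(self):
        self.relations = set()          # relation-level locks
        self.pages = defaultdict(set)   # relation -> locked page numbers
        self.tuples = defaultdict(set)  # (relation, page) -> locked tuple offsets

    def lock_tuple(self, rel, page, offset):
        # A coarser lock already covers this tuple, so nothing to record.
        if rel in self.relations or page in self.pages[rel]:
            return
        self.tuples[(rel, page)].add(offset)
        if len(self.tuples[(rel, page)]) > MAX_TUPLE_LOCKS_PER_PAGE:
            self.lock_page(rel, page)

    def lock_page(self, rel, page):
        if rel in self.relations:
            return
        self.tuples.pop((rel, page), None)  # tuple locks are now redundant
        self.pages[rel].add(page)
        if len(self.pages[rel]) > MAX_PAGE_LOCKS_PER_RELATION:
            self.lock_relation(rel)

    def lock_relation(self, rel):
        # Drop every finer-grained lock on this relation and keep one coarse lock.
        self.pages.pop(rel, None)
        for key in [k for k in self.tuples if k[0] == rel]:
            del self.tuples[key]
        self.relations.add(rel)

    def covers(self, rel, page, offset):
        """True if a read of this tuple is covered by some lock held here."""
        return (rel in self.relations
                or page in self.pages.get(rel, ())
                or offset in self.tuples.get((rel, page), ()))
```

Promotion trades precision for bounded memory: after promotion the lock covers tuples that were never read, which is one way the false positives mentioned in the commit message can arise.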
  15. 02 Jan, 2011 1 commit
  16. 21 Sep, 2010 1 commit
  17. 12 Sep, 2010 1 commit
    • SERIALIZABLE transactions are actually implemented beneath the covers with · 5eb15c99
      Joe Conway committed
      transaction snapshots, i.e. a snapshot registered at the beginning of
      a transaction. Change variable naming and comments to reflect this reality
      in preparation for a future, truly serializable mode, e.g.
      Serializable Snapshot Isolation (SSI).
      
      For the moment transaction snapshots are still used to implement
      SERIALIZABLE, but hopefully not for too much longer. Patch by Kevin
      Grittner and Dan Ports with review and some minor wording changes by me.
  18. 26 Feb, 2010 1 commit
  19. 03 Jan, 2010 1 commit
  20. 19 Dec, 2009 1 commit
    • Allow read only connections during recovery, known as Hot Standby. · efc16ea5
      Simon Riggs committed
      Enabled by recovery_connections = on (default) and forcing archive recovery using a recovery.conf. Recovery processing now emulates the original transactions as they are replayed, providing full locking and MVCC behaviour for read only queries. Recovery must enter consistent state before connections are allowed, so there is a delay, typically short, before connections succeed. Replay of recovering transactions can conflict and in some cases deadlock with queries during recovery; these result in query cancellation after max_standby_delay seconds have expired. Infrastructure changes have minor effects on normal running, though introduce four new types of WAL record.
      
      New test mode "make standbycheck" allows regression tests of static command behaviour on a standby server while in recovery. Typical and extreme dynamic behaviours have been checked via code inspection and manual testing. Few port specific behaviours have been utilised, though primary testing has been on Linux only so far.
      
      This commit is the basic patch. Additional changes will follow in this release to enhance some aspects of behaviour, notably improved handling of conflicts, deadlock detection and query cancellation. Changes to VACUUM FULL are also required.
      
      Simon Riggs, with significant and lengthy review by Heikki Linnakangas, including streamlined redesign of snapshot creation and two-phase commit.
      
      Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback from many other community members.
  21. 08 Oct, 2009 1 commit
    • Fix snapshot management, take two. · 07cefdfb
      Alvaro Herrera committed
      Partially revert the previous patch I installed and replace it with a more
      general fix: any time a snapshot is pushed as Active, we need to ensure that it
      will not be modified in the future.  This means that if the same snapshot is
      used as CurrentSnapshot, it needs to be copied separately.  This affects
      serializable transactions only, because CurrentSnapshot has already been copied
      by RegisterSnapshot and so PushActiveSnapshot does not think it needs another
      copy.  However, CommandCounterIncrement would modify CurrentSnapshot, whereas
      ActiveSnapshots must not have their command counters incremented.
      
      I say "partially" because the regression test I added for the previous bug
      has been kept.
      
      (This restores 8.3 behavior, because before snapmgr.c existed, any snapshot set
      as Active was copied.)
      
      Per bug report from Stuart Bishop in
      6bc73d4c0910042358k3d1adff3qa36f8df75198ecea@mail.gmail.com
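The aliasing problem described in the commit message above can be reproduced in a few lines. The following is a toy model, not snapmgr.c: the class, its method names, the `copy_on_push` switch and the xmin/xmax values are all invented for illustration. It only shows why an active snapshot that shares storage with the transaction's current snapshot sees its command counter change underneath it.

```python
import copy
from dataclasses import dataclass


@dataclass
class Snapshot:
    xmin: int
    xmax: int
    curcid: int = 0  # command counter: which of our own commands are visible


class SnapshotManager:
    """Toy model; copy_on_push=False mimics the pre-fix behaviour."""

    def __init__(self, copy_on_push=True):
        self.current = None      # stands in for CurrentSnapshot
        self.active_stack = []   # stands in for the active snapshot stack
        self.copy_on_push = copy_on_push

    def get_transaction_snapshot(self):
        if self.current is None:
            # Arbitrary illustrative values.
            self.current = Snapshot(xmin=100, xmax=105)
        return self.current

    def push_active(self, snap):
        # The fix: never let the active stack alias the mutable current snapshot.
        if self.copy_on_push and snap is self.current:
            snap = copy.copy(snap)
        self.active_stack.append(snap)

    def command_counter_increment(self):
        # Only the current snapshot may have its command counter advanced.
        if self.current is not None:
            self.current.curcid += 1
```

With `copy_on_push=False`, pushing the transaction snapshot and then calling `command_counter_increment()` changes `curcid` on the active snapshot as well; with the default setting the active snapshot keeps the value it had when it was pushed.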
  22. 03 Oct, 2009 1 commit
    • Ensure that a cursor has an immutable snapshot throughout its lifespan. · caa4cfa3
      Alvaro Herrera committed
      The old coding was using a regular snapshot, referenced elsewhere, that was
      subject to having its command counter updated.  Fix by creating a private copy
      of the snapshot exclusively for the cursor.
      
      Backpatch to 8.4, which is when the bug was introduced during the snapshot
      management rewrite.
  23. 11 Jun, 2009 1 commit
  24. 02 Jan, 2009 1 commit
  25. 04 Dec, 2008 1 commit
    • Fix a couple of snapshot management bugs in the new ResourceOwner world: · 7b640b03
      Alvaro Herrera committed
      non-writable large objects need to have their snapshots registered on the
      transaction resowner, not the current portal's, because the snapshot must
      persist until the large object is closed (and the portal may not last that
      long).  Also, ensure that the serializable snapshot is recorded by the
      transaction resource owner too, even when a subtransaction has changed the
      current resource owner before the serializable snapshot is taken.
      
      Per bug reports from Pavan Deolasee.
  26. 26 Nov, 2008 1 commit
  27. 28 Oct, 2008 1 commit
  28. 11 Sep, 2008 1 commit
    • Initialize the minimum frozen Xid in vac_update_datfrozenxid using · d53a5668
      Alvaro Herrera committed
      GetOldestXmin() instead of RecentGlobalXmin; this is safer because we do not
      depend on the latter being correctly set elsewhere, and while it is more
      expensive, this code path is not performance-critical.  This is a real
      risk for autovacuum, because it can execute whole cycles without doing
      a single vacuum, which would mean that RecentGlobalXmin would stay at its
      initialization value, FirstNormalTransactionId, causing a bogus value to be
      inserted in pg_database.  This bug could explain some recent reports of
      failure to truncate pg_clog.
      
      At the same time, change the initialization of RecentGlobalXmin to
      InvalidTransactionId, and ensure that it's set to something else whenever
      it's going to be used.  Using it as FirstNormalTransactionId in HOT page
      pruning could result in data loss.  InitPostgres takes care of setting it
      to a valid value, but the extra checks are there to prevent "special"
      backends from behaving in unusual ways.
      
      Per Tom Lane's detailed problem dissection in 29544.1221061979@sss.pgh.pa.us
  29. 11 Jul, 2008 2 commits
  30. 13 May, 2008 1 commit
    • Improve snapshot manager by keeping explicit track of snapshots. · 5da9da71
      Alvaro Herrera committed
      There are two ways to track a snapshot: there's the "registered" list, which
      is used for arbitrary long-lived snapshots; and there's the "active stack",
      which is used for the snapshot that is considered "active" at any time.
      This also allows users of snapshots to stop worrying about snapshot memory
      allocation and freeing, and about using PG_TRY blocks around ActiveSnapshot
      assignment.  This is all done automatically now.
      
      As a consequence, this allows us to reset MyProc->xmin when there are no
      more snapshots registered in the current backend, reducing the impact that
      long-running transactions have on VACUUM.
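The two tracking structures named in the commit message above, and the xmin reset they make possible, can be sketched as a toy model. All names here are hypothetical; the `xmin` attribute merely stands in for MyProc->xmin, and the real code tracks reference counts and resource owners that this sketch leaves out.

```python
class Snapshot:
    def __init__(self, xmin):
        self.xmin = xmin


class SnapshotTracker:
    """Toy model of a 'registered' list plus an 'active' stack."""

    def __init__(self):
        self.registered = []    # arbitrary long-lived snapshots
        self.active_stack = []  # top of the stack is the active snapshot
        self.xmin = None        # stands in for MyProc->xmin; None means unset

    def take_snapshot(self, xmin):
        # The first snapshot taken pins the backend's advertised xmin.
        if self.xmin is None:
            self.xmin = xmin
        return Snapshot(xmin)

    def register(self, snap):
        self.registered.append(snap)
        return snap

    def unregister(self, snap):
        self.registered.remove(snap)
        self._maybe_reset_xmin()

    def push_active(self, snap):
        self.active_stack.append(snap)

    def pop_active(self):
        self.active_stack.pop()
        self._maybe_reset_xmin()

    def active(self):
        return self.active_stack[-1] if self.active_stack else None

    def _maybe_reset_xmin(self):
        # Once nothing references any snapshot, stop holding back VACUUM.
        if not self.registered and not self.active_stack:
            self.xmin = None
```

Because every snapshot reference passes through one of the two structures, the tracker can tell exactly when the last one is gone, which is what allows the advertised xmin to be cleared before the transaction ends.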
  31. 27 Mar, 2008 2 commits