1. 17 Sep, 2013 1 commit
  2. 12 Feb, 2013 1 commit
  3. 23 Jan, 2013 1 commit
    • Improve concurrency of foreign key locking · 0ac5ad51
      Committed by Alvaro Herrera
      This patch introduces two additional lock modes for tuples: "SELECT FOR
      KEY SHARE" and "SELECT FOR NO KEY UPDATE".  These don't block each
      other, in contrast with already existing "SELECT FOR SHARE" and "SELECT
      FOR UPDATE".  UPDATE commands that do not modify the values stored in
      the columns that are part of the key of the tuple now grab a SELECT FOR
      NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently
      with tuple locks of the FOR KEY SHARE variety.
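
The interaction between the four tuple lock modes can be sketched as a conflict table. This is a minimal Python model, not PostgreSQL's implementation: `CONFLICTS` and `conflicts()` are hypothetical names, and the entries beyond what the message states (the two new modes do not block each other) are taken from the row-level lock conflict table in the PostgreSQL documentation.

```python
# Illustrative model of tuple lock conflicts; all names here are hypothetical.
KEY_SHARE = "FOR KEY SHARE"
SHARE = "FOR SHARE"
NO_KEY_UPDATE = "FOR NO KEY UPDATE"
UPDATE = "FOR UPDATE"

# For each held mode, the set of requested modes that would have to wait.
CONFLICTS = {
    KEY_SHARE: {UPDATE},
    SHARE: {NO_KEY_UPDATE, UPDATE},
    NO_KEY_UPDATE: {SHARE, NO_KEY_UPDATE, UPDATE},
    UPDATE: {KEY_SHARE, SHARE, NO_KEY_UPDATE, UPDATE},
}


def conflicts(held, requested):
    """Return True if a request for `requested` must wait for `held`."""
    return requested in CONFLICTS[held]
```

Under this model, a foreign key check holding FOR KEY SHARE does not block an UPDATE that leaves the key columns alone (FOR NO KEY UPDATE), which is the concurrency gain the message describes.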
      
      Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this
      means the concurrency improvement applies to them, which is the whole
      point of this patch.
      
      The added tuple lock semantics require some rejiggering of the multixact
      module, so that the locking level that each transaction is holding can
      be stored alongside its Xid.  Also, multixacts now need to persist
      across server restarts and crashes, because they can now represent not
      only tuple locks, but also tuple updates.  This means we need more
      careful tracking of lifetime of pg_multixact SLRU files; since they now
      persist longer, we require more infrastructure to figure out when they
      can be removed.  pg_upgrade also needs to be careful to copy
      pg_multixact files over from the old server to the new, or at least part
      of multixact.c state, depending on the versions of the old and new
      servers.
      
      Tuple time qualification rules (HeapTupleSatisfies routines) need to be
      careful not to consider tuples with the "is multi" infomask bit set as
      being only locked; they might need to look up MultiXact values (i.e.
      possibly do pg_multixact I/O) to find out the Xid that updated a tuple,
      whereas they previously were assured to only use information readily
      available from the tuple header.  This is considered acceptable, because
      the extra I/O would involve cases that would previously cause some
      commands to block waiting for concurrent transactions to finish.
      
      Another important change is the fact that locking tuples that have
      previously been updated causes the future versions to be marked as
      locked, too; this is essential for correctness of foreign key checks.
      This also causes additional WAL-logging: there was previously a single
      WAL record for a locked tuple; now there is one record for each updated
      copy of the tuple that exists.
      
      With all this in place, contention related to tuples being checked by
      foreign key rules should be much reduced.
      
      As a bonus, the old behavior that a subtransaction grabbing a stronger
      tuple lock than the parent (sub)transaction held on a given tuple and
      later aborting caused the weaker lock to be lost, has been fixed.
      
      Many new spec files were added for isolation tester framework, to ensure
      overall behavior is sane.  There's probably room for several more tests.
      
      There were several reviewers of this patch; in particular, Noah Misch
      and Andres Freund spent considerable time on it.  Original idea for the
      patch came from Simon Riggs, after a problem report by Joel Jacobson.
      Most code is from me, with contributions from Marti Raudsepp, Alexander
      Shulgin, Noah Misch and Andres Freund.
      
      This patch was discussed in several pgsql-hackers threads; the most
      important start at the following message-ids:
      	AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com
      	1290721684-sup-3951@alvh.no-ip.org
      	1294953201-sup-2099@alvh.no-ip.org
      	1320343602-sup-2290@alvh.no-ip.org
      	1339690386-sup-8927@alvh.no-ip.org
      	4FE5FF020200002500048A3D@gw.wicourts.gov
      	4FEAB90A0200002500048B7D@gw.wicourts.gov
  4. 02 Jan, 2013 1 commit
  5. 12 Dec, 2012 1 commit
    • Fix performance problems with autovacuum truncation in busy workloads. · b19e4250
      Committed by Kevin Grittner
      In situations where there are over 8MB of empty pages at the end of
      a table, the truncation work for trailing empty pages takes longer
      than deadlock_timeout, and there is frequent access to the table by
      processes other than autovacuum, there was a problem with the
      autovacuum worker process being canceled by the deadlock checking
      code. The truncation work done by autovacuum up to that point was
      lost, and the attempt tried again by a later autovacuum worker. The
      attempts could continue indefinitely without making progress,
      consuming resources and blocking other processes for up to
      deadlock_timeout each time.
      
      This patch has the autovacuum worker checking whether it is
      blocking any other thread at 20ms intervals. If such a condition
      develops, the autovacuum worker will persist the work it has done
      so far, release its lock on the table, and sleep in 50ms intervals
      for up to 5 seconds, hoping to be able to re-acquire the lock and
      try again. If it is unable to get the lock in that time, it moves
      on and a worker will try to continue later from the point this one
      left off.
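
The back-off described above can be sketched as a small retry loop. This is a simplified Python model under stated assumptions, not the patch's code: `reacquire_lock`, `try_lock`, and `sleep_ms` are hypothetical names, the sleep is injected so the loop can be exercised without real delays, and the sleep-then-try ordering is a guess. Only the 20ms, 50ms, and 5 second figures come from the message.

```python
CHECK_INTERVAL_MS = 20    # how often the worker checks whether it blocks others
RETRY_INTERVAL_MS = 50    # sleep between attempts to re-acquire the table lock
RETRY_TIMEOUT_MS = 5000   # total time to keep retrying before moving on


def reacquire_lock(try_lock, sleep_ms):
    """Retry try_lock() every RETRY_INTERVAL_MS for up to RETRY_TIMEOUT_MS.

    Returns True if the lock was re-acquired, False if the worker should
    give up and leave the remaining truncation to a later worker.
    """
    waited_ms = 0
    while waited_ms < RETRY_TIMEOUT_MS:
        sleep_ms(RETRY_INTERVAL_MS)
        waited_ms += RETRY_INTERVAL_MS
        if try_lock():
            return True
    return False
```

Because the worker persists its progress before releasing the lock, a `False` result loses no work in this model; a later worker resumes from the same point.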
      
      While this patch doesn't change the rules about when and what to
      truncate, it does cause the truncation to occur sooner, with less
      blocking, and with the consumption of fewer resources when there is
      contention for the table's lock.
      
      The only user-visible change other than improved performance is
      that the table size during truncation may change incrementally
      instead of just once.
      
      This problem exists in all supported versions but is infrequently
      reported, although some reports of performance problems when
      autovacuum runs might be caused by this. Initial commit is just the
      master branch, but this should probably be backpatched once the
      build farm and general developer usage confirm that there are no
      surprising effects.
      
      Jan Wieck
  6. 30 Nov, 2012 1 commit
  7. 21 Jun, 2012 1 commit
    • Add a small cache of locks owned by a resource owner in ResourceOwner. · eeb6f37d
      Committed by Heikki Linnakangas
      This speeds up reassigning locks to the parent owner, when the transaction
      holds a lot of locks, but only a few of them belong to the current resource
      owner. This particularly helps pg_dump when dumping a large number of
      objects.
      
      The cache can hold up to 15 locks in each resource owner. After that, the
      cache is marked as overflowed, and we fall back to the old method of
      scanning the whole local lock table. The tradeoff here is that the cache has
      to be scanned whenever a lock is released, so if the cache is too large,
      lock release becomes more expensive. 15 seems enough to cover pg_dump, and
      doesn't have much impact on lock release.
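
The cache-with-overflow scheme can be illustrated with a short Python model. It is a sketch, not the ResourceOwner code: `ResourceOwnerLockCache` and its methods are hypothetical names, and the model assumes each lock is remembered at most once per owner. Only the limit of 15 and the fallback to a full scan come from the message.

```python
MAX_CACHED_LOCKS = 15  # cache size stated in the commit message


class ResourceOwnerLockCache:
    """Small per-owner lock cache that gives up once it overflows."""

    def __init__(self):
        self.locks = []
        self.overflowed = False

    def remember(self, lock):
        if self.overflowed:
            return                      # cache is no longer authoritative
        if len(self.locks) < MAX_CACHED_LOCKS:
            self.locks.append(lock)
        else:
            self.overflowed = True      # a 16th lock: stop tracking
            self.locks.clear()

    def forget(self, lock):
        # Every release scans the cache, which is why it is kept small.
        if not self.overflowed and lock in self.locks:
            self.locks.remove(lock)

    def locks_to_reassign(self, scan_whole_lock_table):
        """Return this owner's locks, using the full scan only after overflow."""
        if self.overflowed:
            return scan_whole_lock_table()
        return list(self.locks)
```

The `scan_whole_lock_table` callback stands in for the old method of walking the entire local lock table.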
      
      Jeff Janes, reviewed by Amit Kapila and Heikki Linnakangas.
  8. 11 Jun, 2012 1 commit
  9. 05 May, 2012 1 commit
    • Overdue code review for transaction-level advisory locks patch. · 71b9549d
      Committed by Tom Lane
      Commit 62c7bd31 had assorted problems, most
      visibly that it broke PREPARE TRANSACTION in the presence of session-level
      advisory locks (which should be ignored by PREPARE), as per a recent
      complaint from Stephen Rees.  More abstractly, the patch made the
      LockMethodData.transactional flag not merely useless but outright
      dangerous, because in point of fact that flag no longer tells you anything
      at all about whether a lock is held transactionally.  This fix therefore
      removes that flag altogether.  We now rely entirely on the convention
      already in use in lock.c that transactional lock holds must be owned by
      some ResourceOwner, while session holds are never so owned.  Setting the
      locallock struct's owner link to NULL thus denotes a session hold, and
      there is no redundant marker for that.
      
      PREPARE TRANSACTION now works again when there are session-level advisory
      locks, and it is also able to transfer transactional advisory locks to the
      prepared transaction, but for implementation reasons it throws an error if
      we hold both types of lock on a single lockable object.  Perhaps it will be
      worth improving that someday.
      
      Assorted other minor cleanup and documentation editing, as well.
      
      Back-patch to 9.1, except that in the 9.1 branch I did not remove the
      LockMethodData.transactional flag for fear of causing an ABI break for
      any external code that might be examining those structs.
  10. 18 Apr, 2012 2 commits
  11. 02 Jan, 2012 1 commit
  12. 11 Nov, 2011 1 commit
  13. 14 Oct, 2011 1 commit
  14. 05 Aug, 2011 1 commit
    • Create VXID locks "lazily" in the main lock table. · 84e37126
      Committed by Robert Haas
      Instead of entering them on transaction startup, we materialize them
      only when someone wants to wait, which will occur only during CREATE
      INDEX CONCURRENTLY.  In Hot Standby mode, the startup process must also
      be able to probe for conflicting VXID locks, but the lock need never be
      fully materialized, because the startup process does not use the normal
      lock wait mechanism.  Since most VXID locks never need to touch the
      lock manager partition locks, this can significantly reduce blocking
      contention on read-heavy workloads.
      
      Patch by me.  Review by Jeff Davis.
  15. 18 Jul, 2011 1 commit
    • Create a "fast path" for acquiring weak relation locks. · 3cba8999
      Committed by Robert Haas
      When an AccessShareLock, RowShareLock, or RowExclusiveLock is requested
      on an unshared database relation, and we can verify that no conflicting
      locks can possibly be present, record the lock in a per-backend queue,
      stored within the PGPROC, rather than in the primary lock table.  This
      eliminates a great deal of contention on the lock manager LWLocks.
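
The routing decision described above can be sketched as follows. This Python model is illustrative only: `Backend`, `acquire`, and the `conflict_possible` flag are hypothetical, and how PostgreSQL actually verifies that no conflicting lock can be present is not described in this message, so the model takes that answer as an input. The per-backend list here is also unbounded, which is a simplification.

```python
# The three weak lock modes named in the commit message.
WEAK_MODES = {"AccessShareLock", "RowShareLock", "RowExclusiveLock"}


class Backend:
    def __init__(self):
        # Stands in for the per-backend queue stored within the PGPROC.
        self.fast_path_locks = []


def acquire(backend, main_lock_table, relation, mode,
            relation_is_shared, conflict_possible):
    """Record the lock and report which structure it was recorded in."""
    if (mode in WEAK_MODES
            and not relation_is_shared
            and not conflict_possible):
        backend.fast_path_locks.append((relation, mode))
        return "fast path"
    main_lock_table.append((relation, mode))
    return "main lock table"
```

Locks taken on the fast path never touch the shared table in this model, which mirrors why the real change reduces contention on the lock manager LWLocks.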
      
      This patch also refactors the interface between GetLockStatusData() and
      pg_lock_status() to be a bit more abstract, so that we don't rely so
      heavily on the lock manager's internal representation details.  The new
      fast path lock structures don't have a LOCK or PROCLOCK structure to
      return, so we mustn't depend on that for purposes of listing outstanding
      locks.
      
      Review by Jeff Davis.
  16. 18 Feb, 2011 1 commit
    • Add transaction-level advisory locks. · 62c7bd31
      Committed by Itagaki Takahiro
      They share the same locking namespace with the existing session-level
      advisory locks, but they are automatically released at the end of the
      current transaction and cannot be released explicitly via unlock
      functions.
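
The semantics described above can be modeled in a few lines of Python. This is a toy sketch with hypothetical names (`AdvisoryLocks` and its methods), not the server's implementation: it handles exclusive locks only and ignores the fact that real advisory locks can be acquired more than once by the same session.

```python
class AdvisoryLocks:
    """One session's advisory lock holds in a namespace shared by all sessions."""

    def __init__(self, namespace, session_id):
        self.namespace = namespace    # key -> owning session; shared by both levels
        self.session_id = session_id
        self.session_holds = set()
        self.xact_holds = set()

    def _try_acquire(self, key):
        owner = self.namespace.setdefault(key, self.session_id)
        return owner == self.session_id

    def try_session_lock(self, key):
        ok = self._try_acquire(key)
        if ok:
            self.session_holds.add(key)
        return ok

    def try_xact_lock(self, key):
        ok = self._try_acquire(key)
        if ok:
            self.xact_holds.add(key)
        return ok

    def unlock(self, key):
        """Explicit unlock releases only a session-level hold."""
        if key not in self.session_holds:
            return False
        self.session_holds.discard(key)
        self._release_if_unheld(key)
        return True

    def end_transaction(self):
        """Transaction-level holds are dropped automatically here."""
        ended, self.xact_holds = self.xact_holds, set()
        for key in ended:
            self._release_if_unheld(key)

    def _release_if_unheld(self, key):
        if key not in self.session_holds and key not in self.xact_holds:
            del self.namespace[key]
```

In this model, a key locked at transaction level by one session blocks a session-level request for the same key from another session, and stays locked until `end_transaction()` regardless of `unlock()` calls.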
      
      Marko Tiikkaja, reviewed by me.
  17. 02 Jan, 2011 1 commit
  18. 21 Sep, 2010 1 commit
  19. 26 Feb, 2010 1 commit
  20. 03 Jan, 2010 1 commit
  21. 19 Dec, 2009 1 commit
    • Allow read only connections during recovery, known as Hot Standby. · efc16ea5
      Committed by Simon Riggs
      Enabled by recovery_connections = on (default) and forcing archive recovery using a recovery.conf. Recovery processing now emulates the original transactions as they are replayed, providing full locking and MVCC behaviour for read only queries. Recovery must enter consistent state before connections are allowed, so there is a delay, typically short, before connections succeed. Replay of recovering transactions can conflict and in some cases deadlock with queries during recovery; these result in query cancellation after max_standby_delay seconds have expired. Infrastructure changes have minor effects on normal running, though introduce four new types of WAL record.
      
      New test mode "make standbycheck" allows regression tests of static command behaviour on a standby server while in recovery. Typical and extreme dynamic behaviours have been checked via code inspection and manual testing. Few port specific behaviours have been utilised, though primary testing has been on Linux only so far.
      
      This commit is the basic patch. Additional changes will follow in this release to enhance some aspects of behaviour, notably improved handling of conflicts, deadlock detection and query cancellation. Changes to VACUUM FULL are also required.
      
      Simon Riggs, with significant and lengthy review by Heikki Linnakangas, including streamlined redesign of snapshot creation and two-phase commit.
      
      Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback from many other community members.
  22. 05 Apr, 2009 1 commit
    • A session that does not have any live snapshots does not have to be waited for · c973051a
      Committed by Tom Lane
      when we are waiting for old snapshots to go away during a concurrent index
      build.  In particular, this rule lets us avoid waiting for
      idle-in-transaction sessions.
      
      This logic could be improved further if we had some way to wake up when
      the session we are currently waiting for goes idle-in-transaction.  However
      that would be a significantly more complex/invasive patch, so it'll have to
      wait for some other day.
      
      Simon Riggs, with some improvements by Tom.
  23. 02 Jan, 2009 1 commit
  24. 16 Sep, 2008 1 commit
    • Widen the nLocks counts in local lock tables from int to int64. This · 30df79a7
      Committed by Tom Lane
      forestalls potential overflow when the same table (or other object, but
      usually tables) is accessed by very many successive queries within a single
      transaction.  Per report from Michael Milligan.
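
The overflow risk can be illustrated by modeling fixed-width counters in Python. This is a sketch with hypothetical helper names; it assumes a 32-bit `int` and models the two's-complement wraparound commonly observed in practice, although signed overflow is formally undefined behavior in C.

```python
INT32_MAX = 2**31 - 1


def wrap_signed(value, bits):
    """Reduce value to a signed two's-complement integer of the given width."""
    half = 1 << (bits - 1)
    return (value + half) % (1 << bits) - half


def bump_count(n_locks, bits):
    """Increment a lock count stored in a signed integer of `bits` bits."""
    return wrap_signed(n_locks + 1, bits)
```

With a 32-bit counter, `bump_count(INT32_MAX, 32)` wraps to a negative value, while the same increment in 64 bits simply continues counting.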
      
      Back-patch to 8.0, which is as far back as the patch conveniently applies.
      There have been no reports of overflow in pre-8.3 releases, but clearly the
      risk existed all along.  (Michael's report suggests that 8.3 may consume lock
      counts faster than prior releases, but with no test case to look at it's hard
      to be sure about that.  Widening the counts seems a good future-proofing
      measure in any event.)
  25. 12 May, 2008 1 commit
    • Restructure some header files a bit, in particular heapam.h, by removing some · f8c4d7db
      Committed by Alvaro Herrera
      unnecessary #include lines in it.  Also, move some tuple routine prototypes and
      macros to htup.h, which allows removal of heapam.h inclusion from some .c
      files.
      
      For this to work, a new header file access/sysattr.h needed to be created,
      initially containing attribute numbers of system columns, for pg_dump usage.
      
      While at it, make contrib ltree, intarray and hstore header files more
      consistent with our header style.
  26. 09 Jan, 2008 1 commit
  27. 02 Jan, 2008 1 commit
  28. 16 Nov, 2007 2 commits
  29. 27 Oct, 2007 1 commit
  30. 06 Sep, 2007 1 commit
    • Implement lazy XID allocation: transactions that do not modify any database · 295e6398
      Committed by Tom Lane
      rows will normally never obtain an XID at all.  We already did things this way
      for subtransactions, but this patch extends the concept to top-level
      transactions.  In applications where there are lots of short read-only
      transactions, this should improve performance noticeably; not so much from
      removal of the actual XID-assignments, as from reduction of overhead that's
      driven by the rate of XID consumption.  We add a concept of a "virtual
      transaction ID" so that active transactions can be uniquely identified even
      if they don't have a regular XID.  This is a much lighter-weight concept:
      uniqueness of VXIDs is only guaranteed over the short term, and no on-disk
      record is made about them.
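
Lazy XID allocation can be sketched with a small Python model. All names here (`TransactionManager`, `Transaction`, `write`) are hypothetical, and representing a VXID as a (backend id, local counter) pair is an assumption made for illustration rather than something stated in this message.

```python
import itertools


class TransactionManager:
    def __init__(self):
        self._xid_counter = itertools.count(100)   # arbitrary starting XID
        self._local_counter = itertools.count(1)
        self.xids_allocated = 0

    def begin(self, backend_id):
        # Every transaction gets a cheap VXID immediately; nothing goes to disk.
        vxid = (backend_id, next(self._local_counter))
        return Transaction(self, vxid)

    def allocate_xid(self):
        self.xids_allocated += 1
        return next(self._xid_counter)


class Transaction:
    def __init__(self, manager, vxid):
        self.manager = manager
        self.vxid = vxid
        self.xid = None          # assigned only when the transaction first writes

    def write(self):
        """Modify a row: obtain a regular XID on first use, then reuse it."""
        if self.xid is None:
            self.xid = self.manager.allocate_xid()
        return self.xid
```

A transaction that only reads never calls `write()`, so it is identified by its VXID alone and consumes no XID.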
      
      Florian Pflug, with some editorialization by Tom.
  31. 20 Jun, 2007 1 commit
    • Code review for log_lock_waits patch. Don't try to issue log messages from · 6e072287
      Committed by Tom Lane
      within a signal handler (this might be safe given the relatively narrow code
      range in which the interrupt is enabled, but it seems awfully risky); do issue
      more informative log messages that tell what is being waited for and the exact
      length of the wait; minor other code cleanup.  Greg Stark and Tom Lane
  32. 31 May, 2007 1 commit
  33. 04 Mar, 2007 1 commit
  34. 06 Jan, 2007 1 commit
  35. 23 Nov, 2006 1 commit
  36. 04 Oct, 2006 1 commit
  37. 23 Sep, 2006 1 commit
  38. 19 Sep, 2006 1 commit
    • Add built-in userlock manipulation functions to replace the former · 9b4cda0d
      Committed by Tom Lane
      contrib functionality.  Along the way, remove the USER_LOCKS configuration
      symbol, since it no longer makes any sense to try to compile that out.
      No user documentation yet ... mmoncure has promised to write some.
      Thanks to Abhijit Menon-Sen for creating a first draft to work from.