1. 03 Dec, 2008 1 commit
    • 608195a3 · Heikki Linnakangas
      Introduce visibility map. The visibility map is a bitmap with one bit per
      heap page, where a set bit indicates that all tuples on the page are
      visible to all transactions, and the page therefore doesn't need
      vacuuming. It is stored in a new relation fork.
      
      Lazy vacuum uses the visibility map to skip pages that don't need
      vacuuming. Vacuum is also responsible for setting the bits in the map.
      In the future, this can hopefully be used to implement index-only-scans,
      but we can't currently guarantee that the visibility map is always 100%
      up-to-date.
      
      In addition to the visibility map, there's a new PD_ALL_VISIBLE flag on
      each heap page, also indicating that all tuples on the page are visible to
      all transactions. It's important that this flag is kept up-to-date. It
      is also used to skip visibility tests in sequential scans, which gives a
      small performance gain on seqscans.
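The bitmap described in this commit can be illustrated with a minimal sketch. It is not PostgreSQL code: the `VisibilityMap` class, its method names, and `pages_to_vacuum` are hypothetical, and the sketch ignores the relation fork, locking, and WAL-logging.

```python
class VisibilityMap:
    """Toy visibility map: one bit per heap page (hypothetical, not PostgreSQL code)."""

    def __init__(self, n_pages):
        self.n_pages = n_pages
        # One bit per page, rounded up to whole bytes.
        self.bits = bytearray((n_pages + 7) // 8)

    def set_all_visible(self, page):
        # In the commit's design, vacuum is the one that sets bits.
        self.bits[page // 8] |= 1 << (page % 8)

    def clear(self, page):
        # Assumed here: any modification of the page clears its bit.
        self.bits[page // 8] &= ~(1 << (page % 8)) & 0xFF

    def is_all_visible(self, page):
        return bool(self.bits[page // 8] & (1 << (page % 8)))


def pages_to_vacuum(vm):
    """Return the pages a lazy vacuum would still have to visit."""
    return [p for p in range(vm.n_pages) if not vm.is_all_visible(p)]
```

With ten pages and bits set for pages 3 and 9, `pages_to_vacuum` returns the other eight pages, which is the skipping behavior the message describes.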
  2. 19 Nov, 2008 1 commit
    • 33960006 · Heikki Linnakangas
      Rethink the way FSM truncation works. Instead of WAL-logging FSM
      truncations in FSM code, call FreeSpaceMapTruncateRel from smgr_redo. To
      make that cleaner from a modularity point of view, move the WAL-logging one
      level up to RelationTruncate, and move RelationTruncate and all the
      related WAL-logging to new src/backend/catalog/storage.c file. Introduce
      new RelationCreateStorage and RelationDropStorage functions that are used
      instead of calling smgrcreate/smgrscheduleunlink directly. Move the
      pending rel deletion stuff from smgrcreate/smgrscheduleunlink to the new
      functions. This leaves smgr.c as a thin wrapper around md.c; all the
      transactional stuff is now in storage.c.
      
      This will make it easier to add new forks with similar truncation logic,
      like the visibility map.
  3. 10 Nov, 2008 1 commit
  4. 31 Oct, 2008 1 commit
    • 19c8dc83 · Heikki Linnakangas
      Unite ReadBufferWithFork, ReadBufferWithStrategy, and ZeroOrReadBuffer
      functions into one ReadBufferExtended function that takes the strategy
      and mode as arguments. There are three modes: RBM_NORMAL, which is the default
      used by plain ReadBuffer(); RBM_ZERO, which replaces ZeroOrReadBuffer; and
      a new mode, RBM_ZERO_ON_ERROR, which allows callers to read corrupt pages
      without throwing an error. The FSM needs the new mode to recover from
      corrupt pages, which could happen if we crash after extending an FSM file
      and the new page is "torn".

      Add fork number to some error messages in bufmgr.c that still lacked it.
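The three modes can be sketched as follows. This is only an illustration of the behavior the message describes: the dictionary-backed `storage`, the `page_is_valid` check, and `CorruptPageError` are assumptions, the strategy argument is omitted, and RBM_ZERO is simplified to "never read, always return zeroes".

```python
from enum import Enum

PAGE_SIZE = 8192  # assumed page size for this sketch


class ReadBufferMode(Enum):
    RBM_NORMAL = 1         # read the page; a bad page is an error
    RBM_ZERO = 2           # do not read; hand back a zero-filled page
    RBM_ZERO_ON_ERROR = 3  # read the page; a bad page becomes zeroes


class CorruptPageError(Exception):
    """Raised by this sketch when a page fails the toy validity check."""


def page_is_valid(page):
    # Hypothetical check: right length and not flagged by a 0xFF marker byte.
    return len(page) == PAGE_SIZE and page[0] != 0xFF


def read_buffer_extended(storage, block, mode=ReadBufferMode.RBM_NORMAL):
    """Return the contents of `block` from `storage` according to `mode`."""
    zero_page = bytes(PAGE_SIZE)
    if mode is ReadBufferMode.RBM_ZERO:
        return zero_page
    page = storage[block]
    if page_is_valid(page):
        return page
    if mode is ReadBufferMode.RBM_ZERO_ON_ERROR:
        # A torn or corrupt page is replaced rather than reported.
        return zero_page
    raise CorruptPageError(f"invalid page in block {block}")
```

A caller such as the FSM would pass `ReadBufferMode.RBM_ZERO_ON_ERROR` so that a torn page comes back as an empty page instead of aborting the operation.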
  5. 30 Sep, 2008 1 commit
    • 15c121b3 · Heikki Linnakangas
      Rewrite the FSM. Instead of relying on a fixed-size shared memory segment, the
      free space information is stored in a dedicated FSM relation fork, with each
      relation (except for hash indexes; they don't use FSM).
      
      This eliminates the max_fsm_relations and max_fsm_pages GUC options; remove any
      trace of them from the backend, initdb, and documentation.
      
      Rewrite contrib/pg_freespacemap to match the new FSM implementation. Also
      introduce a new variant of the get_raw_page(regclass, int4, int4) function in
      contrib/pageinspect that lets you return pages from any relation fork, and
      a new fsm_page_contents() function to inspect the new FSM pages.
  6. 12 May, 2008 1 commit
    • f8c4d7db · Alvaro Herrera
      Restructure some header files a bit, in particular heapam.h, by removing some
      unnecessary #include lines in it.  Also, move some tuple routine prototypes and
      macros to htup.h, which allows removal of heapam.h inclusion from some .c
      files.
      
      For this to work, a new header file access/sysattr.h needed to be created,
      initially containing attribute numbers of system columns, for pg_dump usage.
      
      While at it, make contrib ltree, intarray and hstore header files more
      consistent with our header style.
  7. 27 Mar, 2008 1 commit
  8. 25 Mar, 2008 1 commit
  9. 10 Mar, 2008 1 commit
  10. 02 Jan, 2008 1 commit
  11. 16 Nov, 2007 1 commit
  12. 27 Sep, 2007 1 commit
  13. 24 Sep, 2007 2 commits
    • 58536626 · Alvaro Herrera
      Reduce the size of memory allocations by lazy vacuum when processing a small
      table, by allocating just enough for a hardcoded number of dead tuples per
      page.  The current estimate is 200 dead tuples per page.
      
      Per reports from Jeff Amiel, Erik Jones and Marko Kreen, and subsequent
      discussion.
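The sizing rule in this commit can be sketched as a small calculation. The 200-tuples-per-page figure comes from the message; the 6-byte tuple identifier size, the function name, and the floor of one entry are assumptions made for illustration.

```python
DEAD_TUPLES_PER_PAGE = 200  # estimate quoted in the commit message
TID_SIZE = 6                # assumed bytes needed to remember one dead tuple


def max_dead_tuples(rel_pages, mem_budget_bytes):
    """Size the dead-tuple array for a table of `rel_pages` pages.

    The array never exceeds the memory budget, and for a small table it is
    capped at what the table could plausibly contain.
    """
    by_memory = mem_budget_bytes // TID_SIZE
    by_table = rel_pages * DEAD_TUPLES_PER_PAGE
    # Assumed floor so that an empty table still gets a usable array.
    return max(1, min(by_memory, by_table))
```

For a 10-page table and a 16 MB budget this yields 2000 entries (about 12 kB under the assumed entry size) instead of an array sized to the whole budget.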
    • 48f7e643 · Tom Lane
      Simplify and rename some GUC variables, per various recent discussions:
      * stats_start_collector goes away; we always start the collector process,
      unless prevented by a problem with setting up the stats UDP socket.
      
      * stats_reset_on_server_start goes away; it seems useless in view of the
      availability of pg_stat_reset().
      
      * stats_block_level and stats_row_level are merged into a single variable
      "track_counts", which controls all reports sent to the collector process.
      
      * stats_command_string is renamed to track_activities.
      
      * log_autovacuum is renamed to log_autovacuum_min_duration to better reflect
      its meaning.
      
      The log_autovacuum change is not a compatibility issue since it didn't exist
      before 8.3 anyway.  The other changes need to be release-noted.
  14. 21 Sep, 2007 2 commits
    • eb5f4d6c · Tom Lane
      Revert ill-fated patch to release exclusive lock early after vacuum
      truncates a table.  Introduces race condition, as shown by buildfarm
      failures.
    • 282d2a03 · Tom Lane
      HOT updates. When we update a tuple without changing any of its indexed
      columns, and the new version can be stored on the same heap page, we no longer
      generate extra index entries for the new version.  Instead, index searches
      follow the HOT-chain links to ensure they find the correct tuple version.
      
      In addition, this patch introduces the ability to "prune" dead tuples on a
      per-page basis, without having to do a complete VACUUM pass to recover space.
      VACUUM is still needed to clean up dead index entries, however.
      
      Pavan Deolasee, with help from a bunch of other people.
  15. 16 Sep, 2007 1 commit
    • 43b0c918 · Tom Lane
      Fix aboriginal mistake in lazy VACUUM's code for truncating away
      no-longer-needed pages at the end of a table.  We thought we could throw away
      pages containing HEAPTUPLE_DEAD tuples; but this is not so, because such
      tuples very likely have index entries pointing at them, and we wouldn't have
      removed the index entries.  The problem only emerges in a somewhat unlikely
      race condition: the dead tuples have to have been inserted by a transaction
      that later aborted, and this has to have happened between VACUUM's initial
      scan of the page and then rechecking it for empty in count_nondeletable_pages.
      But that timespan will include an index-cleaning pass, so it's not all that
      hard to hit.  This seems to explain a couple of previously unsolved bug
      reports.
  16. 13 Sep, 2007 1 commit
    • 68893035 · Tom Lane
      Redefine the lp_flags field of item pointers as having four states, rather
      than two independent bits (one of which was never used in heap pages anyway,
      or at least hadn't been in a very long time).  This gives us flexibility to
      add the HOT notions of redirected and dead item pointers without requiring
      anything so klugy as magic values of lp_off and lp_len.  The state values
      are chosen so that for the states currently in use (pre-HOT) there is no
      change in the physical representation.
  17. 12 Sep, 2007 1 commit
  18. 11 Sep, 2007 2 commits
  19. 06 Sep, 2007 1 commit
    • 295e6398 · Tom Lane
      Implement lazy XID allocation: transactions that do not modify any database
      rows will normally never obtain an XID at all.  We already did things this way
      for subtransactions, but this patch extends the concept to top-level
      transactions.  In applications where there are lots of short read-only
      transactions, this should improve performance noticeably; not so much from
      removal of the actual XID-assignments, as from reduction of overhead that's
      driven by the rate of XID consumption.  We add a concept of a "virtual
      transaction ID" so that active transactions can be uniquely identified even
      if they don't have a regular XID.  This is a much lighter-weight concept:
      uniqueness of VXIDs is only guaranteed over the short term, and no on-disk
      record is made about them.
      
      Florian Pflug, with some editorialization by Tom.
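The idea of lazy XID allocation can be sketched in a few lines. All names here (`XidAllocator`, `Transaction`, `read_rows`, `modify_rows`) are hypothetical, the virtual transaction ID is modeled as a simple (backend, local counter) pair, and the starting XID value is arbitrary.

```python
class XidAllocator:
    """Hands out regular XIDs and counts how many were consumed."""

    def __init__(self, first_xid=100):  # arbitrary starting value for the sketch
        self.next_xid = first_xid
        self.consumed = 0

    def assign(self):
        xid = self.next_xid
        self.next_xid += 1
        self.consumed += 1
        return xid


class Transaction:
    """Every transaction gets a cheap virtual ID; a real XID only on first write."""

    def __init__(self, backend_id, local_id, allocator):
        self.vxid = (backend_id, local_id)  # identifies the transaction while it runs
        self.xid = None                     # not assigned until a row is modified
        self._allocator = allocator

    def read_rows(self):
        # Reading never triggers XID assignment.
        return self.xid

    def modify_rows(self):
        # The first modification obtains an XID; later ones reuse it.
        if self.xid is None:
            self.xid = self._allocator.assign()
        return self.xid
```

In this sketch a workload made up only of `read_rows` calls leaves `consumed` at zero, which mirrors the commit's point that the gain comes from a lower rate of XID consumption.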
  20. 31 May, 2007 1 commit
    • d526575f · Tom Lane
      Make large sequential scans and VACUUMs work in a limited-size "ring" of
      buffers, rather than blowing out the whole shared-buffer arena.  Aside from
      avoiding cache spoliation, this fixes the problem that VACUUM formerly tended
      to cause a WAL flush for every page it modified, because we had it hacked to
      use only a single buffer.  Those flushes will now occur only once per
      ring-ful.  The exact ring size, and the threshold for seqscans to switch into
      the ring usage pattern, remain under debate; but the infrastructure seems
      done.  The key bit of infrastructure is a new optional BufferAccessStrategy
      object that can be passed to ReadBuffer operations; this replaces the former
      StrategyHintVacuum API.
      
      This patch also changes the buffer usage-count methodology a bit: we now
      advance usage_count when first pinning a buffer, rather than when last
      unpinning it.  To preserve the behavior that a buffer's lifetime starts to
      decrease when it's released, the clock sweep code is modified to not decrement
      usage_count of pinned buffers.
      
      Work not done in this commit: teach GiST and GIN indexes to use the vacuum
      BufferAccessStrategy for vacuum-driven fetches.
      
      Original patch by Simon, reworked by Heikki and again by Tom.
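The buffer "ring" can be pictured with a small sketch. This is a toy model under stated assumptions, not the BufferAccessStrategy implementation: `BufferRing` and `get_buffer` are hypothetical names, and the sketch leaves out pinning, usage counts, and WAL flushes.

```python
class BufferRing:
    """Fixed-size ring of buffers reused in order by one large scan."""

    def __init__(self, size):
        self.slots = [None] * size  # page currently held in each ring slot
        self.next = 0               # slot that will be reused next

    def get_buffer(self, page):
        """Load `page` into the next ring slot and return the page it replaced."""
        evicted = self.slots[self.next]
        self.slots[self.next] = page
        self.next = (self.next + 1) % len(self.slots)
        return evicted
```

Scanning ten pages through a four-slot ring only ever occupies four buffers; every page after the fourth replaces one that the same scan loaded earlier, so the rest of the shared-buffer arena is left alone.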
  21. 17 May, 2007 1 commit
    • 3b0347b3 · Alvaro Herrera
      Move the tuple freezing point in CLUSTER to a point further back in the past,
      to avoid losing useful Xid information in not-so-old tuples.  This makes
      CLUSTER behave the same as VACUUM as far as tuple-freezing behavior goes
      (though CLUSTER does not yet advance the table's relfrozenxid).
      
      While at it, move the actual freezing operation in rewriteheap.c to a more
      appropriate place, and document it thoroughly.  This part of the patch from
      Tom Lane.
  22. 30 Apr, 2007 1 commit
    • 957d08c8 · Tom Lane
      Implement rate-limiting logic on how often backends will attempt to send
      messages to the stats collector.  This avoids the problem that enabling
      stats_row_level for autovacuum has a significant overhead for short
      read-only transactions, as noted by Arjen van der Meijden.  We can avoid
      an extra gettimeofday call by piggybacking on the one done for WAL-logging
      xact commit or abort (although that doesn't help read-only transactions,
      since they don't WAL-log anything).
      
      In my proposal for this, I noted that we could change the WAL log entries
      for commit/abort to record full TimestampTz precision, instead of only
      time_t as at present.  That's not done in this patch, but will be committed
      separately.
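The rate-limiting idea can be sketched as follows. The `StatsReporter` class, its fields, and the interval value used in the example are assumptions. The caller passes in a timestamp it already has, loosely mirroring the message's point about reusing an existing clock reading instead of making an extra gettimeofday call.

```python
class StatsReporter:
    """Accumulates counts and sends them at most once per `min_interval` seconds."""

    def __init__(self, min_interval):
        self.min_interval = min_interval
        self.last_sent = None  # timestamp of the last message, if any
        self.pending = 0       # events counted but not yet sent
        self.sent = []         # sizes of the messages actually sent

    def report(self, now, n_events=1):
        """Record events at time `now`; return True if a message was sent."""
        self.pending += n_events
        if self.last_sent is not None and now - self.last_sent < self.min_interval:
            return False  # too soon: keep accumulating
        self.sent.append(self.pending)
        self.pending = 0
        self.last_sent = now
        return True
```

With an interval of 0.5 seconds, reports at times 0.0, 0.1, 0.2, and 0.6 produce two messages: the first carries one event and the second carries the three events accumulated since.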
  23. 20 Apr, 2007 1 commit
  24. 19 Apr, 2007 1 commit
  25. 22 Feb, 2007 2 commits
  26. 04 Feb, 2007 1 commit
  27. 06 Jan, 2007 1 commit
  28. 06 Nov, 2006 1 commit
    • 48188e16 · Tom Lane
      Fix recently-understood problems with handling of XID freezing, particularly
      in PITR scenarios.  We now WAL-log the replacement of old XIDs with
      FrozenTransactionId, so that such replacement is guaranteed to propagate to
      PITR slave databases.  Also, rather than relying on hint-bit updates to be
      preserved, pg_clog is not truncated until all instances of an XID are known to
      have been replaced by FrozenTransactionId.  Add new GUC variables and
      pg_autovacuum columns to allow management of the freezing policy, so that
      users can trade off the size of pg_clog against the amount of freezing work
      done.  Revise the already-existing code that forces autovacuum of tables
      approaching the wraparound point to make it more bulletproof; also, revise the
      autovacuum logic so that anti-wraparound vacuuming is done per-table rather
      than per-database.  initdb forced because of changes in pg_class, pg_database,
      and pg_autovacuum catalogs.  Heikki Linnakangas, Simon Riggs, and Tom Lane.
  29. 04 Oct, 2006 1 commit
  30. 22 Sep, 2006 1 commit
    • 9e936693 · Tom Lane
      Fix free space map to correctly track the total amount of FSM space needed
      even when a single relation requires more than max_fsm_pages pages.  Also,
      make VACUUM emit a warning in this case, since it likely means that VACUUM
      FULL or other drastic corrective measure is needed.  Per reports from Jeff
      Frost and others of unexpected changes in the claimed max_fsm_pages need.
  31. 14 Sep, 2006 1 commit
  32. 05 Sep, 2006 1 commit
  33. 01 Aug, 2006 1 commit
    • 09d3670d · Tom Lane
      Change the relation_open protocol so that we obtain lock on a relation
      (table or index) before trying to open its relcache entry.  This fixes
      race conditions in which someone else commits a change to the relation's
      catalog entries while we are in process of doing relcache load.  Problems
      of that ilk have been reported sporadically for years, but it was not
      really practical to fix until recently --- for instance, the recent
      addition of WAL-log support for in-place updates helped.
      
      Along the way, remove pg_am.amconcurrent: all AMs are now expected to support
      concurrent update.
  34. 14 Jul, 2006 2 commits
  35. 11 Jul, 2006 1 commit
    • d4cef0aa · Alvaro Herrera
      Improve vacuum code to track minimum Xids per table instead of per database.
      To this end, add a couple of columns to pg_class, relminxid and relvacuumxid,
      based on which we calculate the pg_database columns after each vacuum.
      
      We now force all databases to be vacuumed, even template ones.  A backend
      noticing too old a database (meaning pg_database.datminxid is in danger of
      falling behind Xid wraparound) will signal the postmaster, which in turn will
      start an autovacuum iteration to process the offending database.  In principle
      this is only there to cope with frozen (non-connectable) databases without
      forcing users to set them to connectable, but it could force a regular user
      database to go through a database-wide vacuum at any time.  Maybe we should
      warn users about this somehow.  Of course the real solution will be to use
      autovacuum all the time ;-)
      
      There are some additional improvements we could have in this area: for example
      the vacuum code could be smarter about not updating pg_database for each table
      when called by autovacuum, and do it only once the whole autovacuum iteration
      is done.
      
      I updated the system catalogs documentation, but I didn't modify the
      maintenance section.  Also having some regression tests for this would be nice
      but it's not really a very straightforward thing to do.
      
      Catalog version bumped due to system catalog changes.