1. 02 Jan, 2011 2 commits
    • R
      Basic foreign table support. · 0d692a0d
      Committed by Robert Haas
      Foreign tables are a core component of SQL/MED.  This commit does
      not provide a working SQL/MED infrastructure, because foreign tables
      cannot yet be queried.  Support for foreign table scans will need to
      be added in a future patch.  However, this patch creates the necessary
      system catalog structure, syntax support, and support for ancillary
      operations such as COMMENT and SECURITY LABEL.
      
      Shigeru Hanada, heavily revised by Robert Haas
      0d692a0d
    • B
      Stamp copyrights for year 2011. · 5d950e3b
      Committed by Bruce Momjian
      5d950e3b
  2. 14 Dec, 2010 1 commit
    • R
      Generalize concept of temporary relations to "relation persistence". · 5f7b58fa
      Committed by Robert Haas
      This commit replaces pg_class.relistemp with pg_class.relpersistence;
      and also modifies the RangeVar node type to carry relpersistence rather
      than istemp.  It also removes rd_istemp from RelationData and
      instead performs the correct computation based on relpersistence.
      
      For clarity, we add three new macros: RelationNeedsWAL(),
      RelationUsesLocalBuffers(), and RelationUsesTempNamespace(), so that we
      can clarify the purpose of each check that previously depended on
      rd_istemp.
      
      This is intended as infrastructure for the upcoming unlogged tables
      patch, as well as for future possible work on global temporary tables.
      5f7b58fa
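The three macros named in this commit can be pictured with a minimal sketch. It assumes a simplified stand-in struct (`ToyRelation`) instead of the real `RelationData`, and the `Toy`-prefixed macro names are hypothetical; only the idea of testing `relpersistence` instead of a boolean `istemp` comes from the commit message.

```c
#include <stdbool.h>

/* Persistence codes, modelled on pg_class.relpersistence (simplified). */
#define RELPERSISTENCE_PERMANENT 'p'
#define RELPERSISTENCE_TEMP      't'

/* Hypothetical stand-in for RelationData; the real struct is far larger. */
typedef struct ToyRelation
{
    char relpersistence;    /* replaces the old boolean istemp flag */
} ToyRelation;

/* Each macro names the question being asked instead of testing "is temp". */
#define ToyRelationNeedsWAL(rel) \
    ((rel)->relpersistence == RELPERSISTENCE_PERMANENT)
#define ToyRelationUsesLocalBuffers(rel) \
    ((rel)->relpersistence == RELPERSISTENCE_TEMP)
#define ToyRelationUsesTempNamespace(rel) \
    ((rel)->relpersistence == RELPERSISTENCE_TEMP)
```

With only two persistence levels the last two macros look redundant, but keeping them separate means a later persistence level (such as the unlogged tables mentioned above) can change one answer without touching the others.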
  3. 09 Dec, 2010 2 commits
  4. 21 Sep, 2010 1 commit
  5. 12 Sep, 2010 1 commit
    • J
      SERIALIZABLE transactions are actually implemented beneath the covers with · 5eb15c99
      Committed by Joe Conway
      transaction snapshots, i.e. a snapshot registered at the beginning of
      a transaction. Change variable naming and comments to reflect this reality
      in preparation for a future, truly serializable mode, e.g.
      Serializable Snapshot Isolation (SSI).
      
      For the moment transaction snapshots are still used to implement
      SERIALIZABLE, but hopefully not for too much longer. Patch by Kevin
      Grittner and Dan Ports with review and some minor wording changes by me.
      5eb15c99
  6. 30 Jul, 2010 1 commit
    • R
      Fix possible page corruption by ALTER TABLE .. SET TABLESPACE. · 1a078629
      Committed by Robert Haas
      If a zeroed page is present in the heap, ALTER TABLE .. SET TABLESPACE will
      set the LSN and TLI while copying it, which is wrong, and heap_xlog_newpage()
      will do the same thing during replay, so the corruption propagates to any
      standby.  Note, however, that the bug can't be demonstrated unless archiving
      is enabled, since otherwise we skip WAL logging altogether, and the LSN/TLI
      are not set.
      
      Back-patch to 8.0; prior releases do not have tablespaces.
      
      Analysis and patch by Jeff Davis.  Adjustments for back-branches and minor
      wordsmithing by me.
      1a078629
  7. 07 Jul, 2010 1 commit
  8. 03 May, 2010 2 commits
    • T
      609a63fd
    • T
      Fix replay of XLOG_HEAP_NEWPAGE WAL records to pay attention to the forknum · e55e6ecf
      Committed by Tom Lane
      field of the WAL record.  The previous coding always wrote to the main fork,
      resulting in data corruption if the page was meant to go into a non-default
      fork.
      
      At present, the only operation that can produce such WAL records is
      ALTER TABLE/INDEX SET TABLESPACE when executed with archive_mode = on.
      Data corruption would be observed on standby slaves, and could occur on the
      master as well if a database crash and recovery occurred after committing
      the ALTER and before the next checkpoint.  Per report from Gordon Shannon.
      
      Back-patch to 8.4; the problem doesn't exist in earlier branches because
      we didn't have a concept of multiple relation forks then.
      e55e6ecf
  9. 22 Apr, 2010 1 commit
    • S
      Further reductions in Hot Standby conflict processing. These · 781ec6b7
      Committed by Simon Riggs
      come from the realisation that HEAP2_CLEAN records don't
      always remove user visible data, so conflict processing for
      them can be skipped. Confirm validity using Assert checks,
      clarify circumstances under which we log heap_cleanup_info
      records. Tuning arises from bug fixing of earlier safety
      check failures.
      781ec6b7
  10. 26 Feb, 2010 1 commit
  11. 15 Feb, 2010 1 commit
    • R
      Wrap calls to SearchSysCache and related functions using macros. · e26c539e
      Committed by Robert Haas
      The purpose of this change is to eliminate the need for every caller
      of SearchSysCache, SearchSysCacheCopy, SearchSysCacheExists,
      GetSysCacheOid, and SearchSysCacheList to know the maximum number
      of allowable keys for a syscache entry (currently 4).  This will
      make it far easier to increase the maximum number of keys in a
      future release should we choose to do so, and it makes the code
      shorter, too.
      
      Design and review by Tom Lane.
      e26c539e
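The wrapping idea described in this commit can be sketched as follows. The lookup function `toy_search_cache` and the `ToySearchCacheN` macros are hypothetical stand-ins and not the PostgreSQL API; the toy body only counts non-zero keys so the effect of the wrappers can be observed.

```c
#include <stddef.h>

typedef unsigned long ToyDatum;     /* stand-in for PostgreSQL's Datum */

/*
 * Hypothetical lookup that always takes the maximum number of keys (4).
 * Toy body: report how many non-zero keys were supplied.
 */
static int
toy_search_cache(int cacheId, ToyDatum k1, ToyDatum k2, ToyDatum k3, ToyDatum k4)
{
    (void) cacheId;                 /* unused in this sketch */
    return (k1 != 0) + (k2 != 0) + (k3 != 0) + (k4 != 0);
}

/* Callers state only the keys they have; the padding lives in one place. */
#define ToySearchCache1(id, k1)         toy_search_cache(id, k1, 0, 0, 0)
#define ToySearchCache2(id, k1, k2)     toy_search_cache(id, k1, k2, 0, 0)
#define ToySearchCache3(id, k1, k2, k3) toy_search_cache(id, k1, k2, k3, 0)
```

If the maximum key count ever grew, only the macro definitions would need another padding argument; call sites written against the macros would stay unchanged.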
  12. 08 Feb, 2010 1 commit
    • T
      Remove old-style VACUUM FULL (which was known for a little while as · 0a469c87
      Committed by Tom Lane
      VACUUM FULL INPLACE), along with a boatload of subsidiary code and complexity.
      Per discussion, the use case for this method of vacuuming is no longer large
      enough to justify maintaining it; not to mention that we don't wish to invest
      the work that would be needed to make it play nicely with Hot Standby.
      
      Aside from the code directly related to old-style VACUUM FULL, this commit
      removes support for certain WAL record types that could only be generated
      within VACUUM FULL, redirect-pointer removal in heap_page_prune, and
      nontransactional generation of cache invalidation sinval messages (the last
      being the sticking point for Hot Standby).
      
      We still have to retain all code that copes with finding HEAP_MOVED_OFF and
      HEAP_MOVED_IN flag bits on existing tuples.  This can't be removed as long
      as we want to support in-place update from pre-9.0 databases.
      0a469c87
  13. 03 Feb, 2010 1 commit
  14. 30 Jan, 2010 1 commit
    • S
      Filter recovery conflicts based upon dboid from relfilenode of WAL · 76be0c81
      Committed by Simon Riggs
      records for heap and btree. Minor change, mostly API changes to
      pass through the required values. This is a simple change, though
      it also provides the refactoring required for further enhancements
      to conflict processing using the relOid. Changes only have effect
      during Hot Standby.
      76be0c81
  15. 21 Jan, 2010 1 commit
  16. 14 Jan, 2010 1 commit
    • S
      First part of refactoring of code for ResolveRecoveryConflict. Purposes · e99767bc
      Committed by Simon Riggs
      of this are to centralise the conflict code to allow further change,
      as well as to allow passing the full reason for the conflict
      through to the conflicting backends. Backend state alters how we
      can handle different types of conflict so this is now required.
      As originally suggested by Heikki, no longer optional.
      e99767bc
  17. 10 Jan, 2010 1 commit
    • R
      Remove partial, broken support for NULL pointers when fetching attributes. · 84b6d5f3
      Committed by Robert Haas
      Previously, fastgetattr() and heap_getattr() tested their fourth argument
      against a null pointer, but any attempt to use them with a literal-NULL
      fourth argument evaluated to *(void *)0, resulting in a compiler error.
      Remove these NULL tests to avoid leading future readers of this code to
      believe that this has a chance of working.  Also clean up related legacy
      code in nocachegetattr(), heap_getsysattr(), and nocache_index_getattr().
      
      The new coding standard is that any code which calls a getattr-type
      function or macro which takes an isnull argument MUST pass a valid
      boolean pointer.  Per discussion with Bruce Momjian, Tom Lane, Alvaro
      Herrera.
      84b6d5f3
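The coding standard stated in this commit (an isnull argument must always be a valid boolean pointer) can be illustrated with a small sketch. `ToyTuple` and `toy_getattr` are hypothetical and much simpler than the real `heap_getattr()`; the assertion stands in for the contract and is not taken from the PostgreSQL source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, greatly simplified tuple: parallel value and null arrays. */
typedef struct ToyTuple
{
    int         nattrs;
    const int  *values;
    const bool *nulls;
} ToyTuple;

/*
 * Fetch attribute attnum (1-based).  The function writes through isnull
 * unconditionally, so passing NULL is a caller bug, not a supported mode.
 */
static int
toy_getattr(const ToyTuple *tup, int attnum, bool *isnull)
{
    assert(isnull != NULL);
    assert(attnum >= 1 && attnum <= tup->nattrs);

    *isnull = tup->nulls[attnum - 1];
    return *isnull ? 0 : tup->values[attnum - 1];
}
```

A caller that does not care about nullness still declares a local `bool` and passes its address.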
  18. 03 Jan, 2010 1 commit
  19. 19 Dec, 2009 1 commit
    • S
      Allow read only connections during recovery, known as Hot Standby. · efc16ea5
      Committed by Simon Riggs
      Enabled by recovery_connections = on (default) and forcing archive recovery using a recovery.conf. Recovery processing now emulates the original transactions as they are replayed, providing full locking and MVCC behaviour for read only queries. Recovery must enter consistent state before connections are allowed, so there is a delay, typically short, before connections succeed. Replay of recovering transactions can conflict and in some cases deadlock with queries during recovery; these result in query cancellation after max_standby_delay seconds have expired. Infrastructure changes have minor effects on normal running, though introduce four new types of WAL record.
      
      New test mode "make standbycheck" allows regression tests of static command behaviour on a standby server while in recovery. Typical and extreme dynamic behaviours have been checked via code inspection and manual testing. Few port specific behaviours have been utilised, though primary testing has been on Linux only so far.
      
      This commit is the basic patch. Additional changes will follow in this release to enhance some aspects of behaviour, notably improved handling of conflicts, deadlock detection and query cancellation. Changes to VACUUM FULL are also required.
      
      Simon Riggs, with significant and lengthy review by Heikki Linnakangas, including streamlined redesign of snapshot creation and two-phase commit.
      
      Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback from many other community members.
      efc16ea5
  20. 24 Aug, 2009 1 commit
    • T
      Fix a violation of WAL coding rules in the recent patch to include an · 7fc7a7c4
      Committed by Tom Lane
      "all tuples visible" flag in heap page headers.  The flag update *must*
      be applied before calling XLogInsert, but heap_update and the tuple
      moving routines in VACUUM FULL were ignoring this rule.  A crash and
      replay could therefore leave the flag incorrectly set, causing rows
      to appear visible in seqscans when they should not be.  This might explain
      recent reports of data corruption from Jeff Ross and others.
      
      In passing, do a bit of editorialization on comments in visibilitymap.c.
      7fc7a7c4
  21. 11 Jun, 2009 2 commits
  22. 13 May, 2009 1 commit
    • T
      Fix LOCK TABLE to eliminate the race condition that could make it give weird · f23bdda3
      Committed by Tom Lane
      errors when tables are concurrently dropped.  To do this we must take lock
      on each relation before we check its privileges.  The old code was trying
      to do that the other way around, which is a bit pointless when there are lots
      of other commands that lock relations before checking privileges.  I did keep
      it checking each relation's privilege before locking the next relation, which
      is a detail that ALTER TABLE isn't too picky about.
      f23bdda3
  23. 21 Jan, 2009 1 commit
  24. 02 Jan, 2009 1 commit
  25. 17 Dec, 2008 1 commit
    • T
      Make heap_update() set newtup->t_tableOid correctly, for consistency with · fc3297d8
      Committed by Tom Lane
      the other major heapam.c functions.  The only known consequence of this
      omission is that UPDATE RETURNING failed to return the correct value for
      "tableoid", as per report from KaiGai Kohei.
      
      Back-patch to 8.2.  Arguably it's wrong all the way back; but without
      evidence of visible breakage before RETURNING was added, I'll desist from
      patching the older branches.
      fc3297d8
  26. 03 Dec, 2008 1 commit
    • H
      Introduce visibility map. The visibility map is a bitmap with one bit per · 608195a3
      Committed by Heikki Linnakangas
      heap page, where a set bit indicates that all tuples on the page are
      visible to all transactions, and the page therefore doesn't need
      vacuuming. It is stored in a new relation fork.
      
      Lazy vacuum uses the visibility map to skip pages that don't need
      vacuuming. Vacuum is also responsible for setting the bits in the map.
      In the future, this can hopefully be used to implement index-only-scans,
      but we can't currently guarantee that the visibility map is always 100%
      up-to-date.
      
      In addition to the visibility map, there's a new PD_ALL_VISIBLE flag on
      each heap page, also indicating that all tuples on the page are visible to
      all transactions. It's important that this flag is kept up-to-date. It
      is also used to skip visibility tests in sequential scans, which gives a
      small performance gain on seqscans.
      608195a3
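The "one bit per heap page" structure described in this commit can be sketched as a plain in-memory bitmap. This assumes a fixed-size array and hypothetical `toy_vm_*` helpers; the real visibility map lives in its own relation fork and is accessed through the buffer manager, none of which is modelled here.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_HEAP_PAGES 1024     /* arbitrary size for the sketch */

/* One bit per heap page; a set bit means "all tuples visible to everyone". */
typedef struct ToyVisibilityMap
{
    uint8_t bits[TOY_MAX_HEAP_PAGES / 8];
} ToyVisibilityMap;

/* Vacuum would set the bit after finding every tuple on the page visible. */
static void
toy_vm_set(ToyVisibilityMap *vm, uint32_t heap_page)
{
    vm->bits[heap_page / 8] |= (uint8_t) (1u << (heap_page % 8));
}

/* Any modification of the page must clear the bit again. */
static void
toy_vm_clear(ToyVisibilityMap *vm, uint32_t heap_page)
{
    vm->bits[heap_page / 8] &= (uint8_t) ~(1u << (heap_page % 8));
}

/* Lazy vacuum can skip a page whose bit is set. */
static bool
toy_vm_test(const ToyVisibilityMap *vm, uint32_t heap_page)
{
    return ((vm->bits[heap_page / 8] >> (heap_page % 8)) & 1u) != 0;
}
```

At one bit per page, the map for a 1024-page heap fits in 128 bytes, which is why consulting it is much cheaper than reading the heap pages themselves.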
  27. 19 Nov, 2008 1 commit
    • H
      Rethink the way FSM truncation works. Instead of WAL-logging FSM · 33960006
      Committed by Heikki Linnakangas
      truncations in FSM code, call FreeSpaceMapTruncateRel from smgr_redo. To
      make that cleaner from a modularity point of view, move the WAL-logging one
      level up to RelationTruncate, and move RelationTruncate and all the
      related WAL-logging to new src/backend/catalog/storage.c file. Introduce
      new RelationCreateStorage and RelationDropStorage functions that are used
      instead of calling smgrcreate/smgrscheduleunlink directly. Move the
      pending rel deletion stuff from smgrcreate/smgrscheduleunlink to the new
      functions. This leaves smgr.c as a thin wrapper around md.c; all the
      transactional stuff is now in storage.c.
      
      This will make it easier to add new forks with similar truncation logic,
      like the visibility map.
      33960006
  28. 07 Nov, 2008 1 commit
  29. 01 Nov, 2008 1 commit
    • H
      Update FSM on WAL replay. This is a bit limited; the FSM is only updated · e9816533
      Committed by Heikki Linnakangas
      on non-full-page-image WAL records, and quite arbitrarily, only if there's
      less than 20% free space on the page after the insert/update (not on HOT
      updates, though). The 20% cutoff should avoid most of the overhead, when
      replaying a bulk insertion, for example, while ensuring that pages that
      are full are marked as full in the FSM.
      
      This is mostly to avoid the nasty worst case scenario, where you replay
      from a PITR archive, and the FSM information in the base backup is really
      out of date. If there were a lot of pages that the outdated FSM claims
      have free space, but that don't actually have any, the first unlucky inserter
      after the recovery would traverse through all those pages, just to find
      out that they're full. We didn't have this problem with the old FSM
      implementation, because we simply threw the FSM information away on a
      non-clean shutdown.
      e9816533
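The replay-time rule described in this commit (skip full-page images and HOT updates, and update the FSM only when less than 20% of the page is free) can be written as a small predicate. The function name and the 8192-byte block size are assumptions for this sketch; the actual replay code is not reproduced here.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_BLCKSZ 8192     /* assumed block size for the sketch */

/*
 * Decide whether replay of an insert/update should record the page's
 * free space in the FSM.  Hypothetical helper, not a PostgreSQL function.
 */
static bool
toy_should_update_fsm_on_replay(bool full_page_image, bool hot_update,
                                size_t free_space)
{
    if (full_page_image || hot_update)
        return false;

    /* Only nearly-full pages are worth the overhead: under 20% free. */
    return free_space < TOY_BLCKSZ / 5;
}
```

With an 8192-byte block the cutoff is 1638 bytes, so a bulk load that leaves pages mostly empty during replay never touches the FSM, while pages that fill up are marked as full.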
  30. 31 Oct, 2008 1 commit
    • H
      Unite ReadBufferWithFork, ReadBufferWithStrategy, and ZeroOrReadBuffer · 19c8dc83
      Committed by Heikki Linnakangas
      functions into one ReadBufferExtended function, that takes the strategy
      and mode as arguments. There are three modes: RBM_NORMAL, which is the default
      used by plain ReadBuffer(); RBM_ZERO, which replaces ZeroOrReadBuffer; and
      a new mode, RBM_ZERO_ON_ERROR, which allows callers to read corrupt pages
      without throwing an error. The FSM needs the new mode to recover from
      corrupt pages, which could happen if we crash after extending an FSM file,
      and the new page is "torn".
      
      Add fork number to some error messages in bufmgr.c, that still lacked it.
      19c8dc83
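The three read modes named in this commit can be sketched with a toy reader. Everything here is an assumption made for illustration: the `Toy`-prefixed names, the tiny page size, and the `stored_is_valid` flag that stands in for whatever page verification the real buffer manager performs.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE 16    /* tiny page, for the sketch only */

typedef enum ToyReadMode
{
    TOY_RBM_NORMAL,         /* read the page; fail if it is corrupt */
    TOY_RBM_ZERO,           /* do not read; hand back a zeroed page */
    TOY_RBM_ZERO_ON_ERROR   /* read; substitute zeros if it is corrupt */
} ToyReadMode;

/* Returns true on success; false models "an error would be thrown". */
static bool
toy_read_page(const unsigned char *stored, bool stored_is_valid,
              ToyReadMode mode, unsigned char *out)
{
    if (mode == TOY_RBM_ZERO)
    {
        memset(out, 0, TOY_PAGE_SIZE);
        return true;
    }

    if (!stored_is_valid)
    {
        if (mode == TOY_RBM_NORMAL)
            return false;
        memset(out, 0, TOY_PAGE_SIZE);  /* TOY_RBM_ZERO_ON_ERROR */
        return true;
    }

    memcpy(out, stored, TOY_PAGE_SIZE);
    return true;
}
```

The last mode fits the FSM case from the commit message: a torn FSM page carries no information that cannot be rebuilt, so treating it as empty is safer than failing the read.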
  31. 28 Oct, 2008 1 commit
  32. 08 Oct, 2008 1 commit
  33. 30 Sep, 2008 1 commit
    • H
      Rewrite the FSM. Instead of relying on a fixed-size shared memory segment, the · 15c121b3
      Committed by Heikki Linnakangas
      free space information is stored in a dedicated FSM relation fork, with each
      relation (except for hash indexes; they don't use FSM).
      
      This eliminates the max_fsm_relations and max_fsm_pages GUC options; remove any
      trace of them from the backend, initdb, and documentation.
      
      Rewrite contrib/pg_freespacemap to match the new FSM implementation. Also
      introduce a new variant of the get_raw_page(regclass, int4, int4) function in
      contrib/pageinspect that lets you return pages from any relation fork, and
      a new fsm_page_contents() function to inspect the new FSM pages.
      15c121b3
  34. 11 Sep, 2008 1 commit
    • A
      Initialize the minimum frozen Xid in vac_update_datfrozenxid using · d53a5668
      Committed by Alvaro Herrera
      GetOldestXmin() instead of RecentGlobalXmin; this is safer because we do not
      depend on the latter being correctly set elsewhere, and while it is more
      expensive, this code path is not performance-critical.  This is a real
      risk for autovacuum, because it can execute whole cycles without doing
      a single vacuum, which would mean that RecentGlobalXmin would stay at its
      initialization value, FirstNormalTransactionId, causing a bogus value to be
      inserted in pg_database.  This bug could explain some recent reports of
      failure to truncate pg_clog.
      
      At the same time, change the initialization of RecentGlobalXmin to
      InvalidTransactionId, and ensure that it's set to something else whenever
      it's going to be used.  Using it as FirstNormalTransactionId in HOT page
      pruning could result in data loss.  InitPostgres takes care of setting it
      to a valid value, but the extra checks are there to prevent "special"
      backends from behaving in unusual ways.
      
      Per Tom Lane's detailed problem dissection in 29544.1221061979@sss.pgh.pa.us
      d53a5668
  35. 11 Aug, 2008 1 commit
    • H
      Introduce the concept of relation forks. An smgr relation can now consist · 3f0e808c
      Committed by Heikki Linnakangas
      of multiple forks, and each fork can be created and grown separately.
      
      The bulk of this patch is about changing the smgr API to include an extra
      ForkNumber argument in every smgr function. Also, smgrscheduleunlink and
      smgrdounlink no longer implicitly call smgrclose, because other forks might
      still exist after unlinking one. The callers of those functions have been
      modified to call smgrclose instead.
      
      This patch in itself doesn't have any user-visible effect, but provides the
      infrastructure needed for upcoming patches. The additional forks envisioned
      are a rewritten FSM implementation that doesn't rely on a fixed-size shared
      memory block, and a visibility map to allow skipping portions of a table in
      VACUUM that have no dead tuples.
      3f0e808c
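The fork concept introduced by this commit can be pictured with a toy model in which every storage operation takes a fork number. The types and functions below are hypothetical and keep only per-fork block counts in memory; the real smgr API operates on files.

```c
#include <stdbool.h>

typedef enum ToyForkNumber
{
    TOY_MAIN_FORK = 0,
    TOY_FSM_FORK,
    TOY_NUM_FORKS
} ToyForkNumber;

/* Per-relation storage state, tracked separately for each fork. */
typedef struct ToyRelationStorage
{
    bool     exists[TOY_NUM_FORKS];
    unsigned nblocks[TOY_NUM_FORKS];
} ToyRelationStorage;

/* Each fork is created on its own ... */
static void
toy_fork_create(ToyRelationStorage *rel, ToyForkNumber fork)
{
    rel->exists[fork] = true;
    rel->nblocks[fork] = 0;
}

/* ... grows on its own (returns the new block's number) ... */
static unsigned
toy_fork_extend(ToyRelationStorage *rel, ToyForkNumber fork)
{
    return rel->nblocks[fork]++;
}

/* ... and is unlinked on its own, leaving the other forks in place. */
static void
toy_fork_unlink(ToyRelationStorage *rel, ToyForkNumber fork)
{
    rel->exists[fork] = false;
    rel->nblocks[fork] = 0;
}
```

Because unlinking one fork leaves the others intact, closing the relation as a whole has to be a separate, explicit step, which matches the smgrclose change described above.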
  36. 14 Jul, 2008 1 commit
    • T
      Clean up the use of some page-header-access macros: principally, use · 9d035f42
      Committed by Tom Lane
      SizeOfPageHeaderData instead of sizeof(PageHeaderData) in places where that
      makes the code clearer, and avoid casting between Page and PageHeader where
      possible.  Zdenek Kotala, with some additional cleanup by Heikki Linnakangas.
      
      I did not apply the parts of the proposed patch that would have resulted in
      slightly changing the on-disk format of hash indexes; it seems to me that's
      not a win as long as there's any chance of having in-place upgrade for 8.4.
      9d035f42
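Why a header-size macro can differ from `sizeof` is easiest to see on a struct that ends in a variable-length array. The sketch below uses a hypothetical, much smaller header (`ToyPageHeaderData`) and assumes the macro is defined with `offsetof` on the trailing array; the real page header has more fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down page header for illustration only. */
typedef struct ToyPageHeaderData
{
    uint16_t pd_lower;      /* offset to start of free space */
    uint16_t pd_upper;      /* offset to end of free space */
    uint32_t pd_linp[1];    /* start of the variable-length line pointer array */
} ToyPageHeaderData;

/*
 * Size of the fixed part only.  sizeof(ToyPageHeaderData) would also count
 * the first array element, which belongs to the page contents.
 */
#define ToySizeOfPageHeaderData (offsetof(ToyPageHeaderData, pd_linp))
```

On common platforms this gives a fixed header size of 4 bytes against a `sizeof` of 8, so code that computes where line pointers begin reads more clearly, and more correctly, with the macro.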