1. 02 Jan, 2011 2 commits
    • R
      Basic foreign table support. · 0d692a0d
      Robert Haas committed
      Foreign tables are a core component of SQL/MED.  This commit does
      not provide a working SQL/MED infrastructure, because foreign tables
      cannot yet be queried.  Support for foreign table scans will need to
      be added in a future patch.  However, this patch creates the necessary
      system catalog structure, syntax support, and support for ancillary
      operations such as COMMENT and SECURITY LABEL.
      
      Shigeru Hanada, heavily revised by Robert Haas
      0d692a0d
    • B
      Stamp copyrights for year 2011. · 5d950e3b
      Bruce Momjian committed
      5d950e3b
  2. 14 Dec, 2010 1 commit
    • R
      Generalize concept of temporary relations to "relation persistence". · 5f7b58fa
      Robert Haas committed
      This commit replaces pg_class.relistemp with pg_class.relpersistence,
      and also modifies the RangeVar node type to carry relpersistence rather
      than istemp.  It also removes rd_istemp from RelationData and
      instead performs the correct computation based on relpersistence.
      
      For clarity, we add three new macros: RelationNeedsWAL(),
      RelationUsesLocalBuffers(), and RelationUsesTempNamespace(), so that we
      can clarify the purpose of each check that previously depended on
      rd_istemp.
      
      This is intended as infrastructure for the upcoming unlogged tables
      patch, as well as for future possible work on global temporary tables.
      5f7b58fa
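
The idea behind the three macros can be sketched as follows. This is a simplified illustration, not the PostgreSQL source: `RelationSketch` and the two persistence codes are stand-ins assumed for the example, and the real macros operate on the actual relation descriptor.

```c
/*
 * Hypothetical persistence codes for this sketch, modelled on a
 * single-character catalog column.
 */
#define RELPERSISTENCE_PERMANENT 'p'
#define RELPERSISTENCE_TEMP      't'

/* Simplified stand-in for a relation descriptor (not the real RelationData). */
typedef struct RelationSketch
{
    char relpersistence;
} RelationSketch;

/*
 * Each macro names the question being asked, instead of every call site
 * testing one overloaded "is temp" flag.
 */
#define RelationNeedsWAL(rel) \
    ((rel)->relpersistence == RELPERSISTENCE_PERMANENT)
#define RelationUsesLocalBuffers(rel) \
    ((rel)->relpersistence == RELPERSISTENCE_TEMP)
#define RelationUsesTempNamespace(rel) \
    ((rel)->relpersistence == RELPERSISTENCE_TEMP)
```

With only two persistence values the macros look redundant, but a later persistence class (for example, one that skips WAL without using local buffers) would only need the macro bodies adjusted, not every caller.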
  3. 09 Dec, 2010 2 commits
  4. 09 Nov, 2010 1 commit
    • H
      In rewriteheap.c (used by VACUUM FULL and CLUSTER), calculate the tuple · 000efc3d
      Heikki Linnakangas committed
      length stored in the line pointer the same way it's calculated in the normal
      heap_insert() codepath. As noted by Jeff Davis, the length stored by
      raw_heap_insert() included padding but the one stored by the normal codepath
      did not. While the mismatch seems to be harmless, inconsistency isn't good,
      and the normal codepath has received a lot more testing over the years.
      
      Backpatch to 8.3 where the heap rewrite code was introduced.
      000efc3d
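
The mismatch described above can be illustrated with a small sketch. The function names and the fixed 8-byte alignment are assumptions made for the example; the real alignment is platform-dependent and the real code paths are heap_insert() and raw_heap_insert().

```c
#include <stddef.h>

/* Assumed alignment boundary for this sketch (must be a power of two). */
#define SKETCH_ALIGNOF 8

/* Round len up to the next multiple of SKETCH_ALIGNOF. */
static size_t
sketch_maxalign(size_t len)
{
    return (len + (SKETCH_ALIGNOF - 1)) & ~((size_t) (SKETCH_ALIGNOF - 1));
}

/* Length recorded in the line pointer: the tuple's own length, unpadded. */
static size_t
sketch_item_length(size_t tuple_len)
{
    return tuple_len;
}

/* Space reserved on the page: the tuple length rounded up for alignment. */
static size_t
sketch_space_needed(size_t tuple_len)
{
    return sketch_maxalign(tuple_len);
}
```

Under these assumptions a 29-byte tuple reserves 32 bytes on the page but records a length of 29; the inconsistency arose when one code path recorded the padded value instead.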
  5. 08 Oct, 2010 1 commit
    • T
      Improve logging in VACUUM FULL VERBOSE and CLUSTER VERBOSE. · 9cc8c84e
      Tom Lane committed
      This patch resurrects some of the information that could be logged by the
      old, now-dead implementation of VACUUM FULL, in particular counts of live
      and dead tuples and the time taken for the table rebuild proper.  There's
      still no logging about the ensuing index rebuilds, though.
      
      Itagaki Takahiro
      9cc8c84e
  6. 21 Sep, 2010 1 commit
  7. 20 Sep, 2010 1 commit
  8. 12 Sep, 2010 1 commit
    • J
      SERIALIZABLE transactions are actually implemented beneath the covers with · 5eb15c99
      Joe Conway committed
      transaction snapshots, i.e. a snapshot registered at the beginning of
      a transaction. Change variable naming and comments to reflect this reality
      in preparation for a future, truly serializable mode, e.g.
      Serializable Snapshot Isolation (SSI).
      
      For the moment transaction snapshots are still used to implement
      SERIALIZABLE, but hopefully not for too much longer. Patch by Kevin
      Grittner and Dan Ports with review and some minor wording changes by me.
      5eb15c99
  9. 19 Aug, 2010 1 commit
    • R
      Tidy up a few calls to smgrextend(). · d37781fa
      Robert Haas committed
      In the new API introduced by my patch to include the backend ID in
      temprel filenames, the last argument to smgrextend() became skipFsync
      rather than isTemp, but these calls didn't get the memo.  It's not
      really a problem to pass rel->rd_istemp rather than just plain false,
      because smgrextend() now automatically skips the fsync for temprels
      anyway, but this seems cleaner and saves some minute number of cycles.
      d37781fa
  10. 14 Aug, 2010 1 commit
    • R
      Include the backend ID in the relpath of temporary relations. · debcec7d
      Robert Haas committed
      This allows us to reliably remove all leftover temporary relation
      files on cluster startup without reference to system catalogs or WAL;
      therefore, we no longer include temporary relations in XLOG_XACT_COMMIT
      and XLOG_XACT_ABORT WAL records.
      
      Since these changes require including a backend ID in each
      SharedInvalSmgrMsg, the size of the SharedInvalidationMessage.id
      field has been reduced from two bytes to one, and the maximum number
      of connections has been reduced from INT_MAX / 4 to 2^23-1.  It would
      be possible to remove these restrictions by increasing the size of
      SharedInvalidationMessage by 4 bytes, but right now that doesn't seem
      like a good trade-off.
      
      Review by Jaime Casanova and Tom Lane.
      debcec7d
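
The 2^23-1 limit follows from storing the backend ID in three bytes next to a one-byte type field. The sketch below shows one way such packing could work; the struct layout and helper names are assumptions for illustration, not the actual SharedInvalSmgrMsg definition, and the helpers only handle non-negative IDs.

```c
#include <stdint.h>

/* Largest backend ID that fits in a signed 24-bit field: 2^23 - 1. */
#define SKETCH_MAX_BACKEND_ID 0x7FFFFF

/* Hypothetical message header: 1-byte type, then 3 bytes of backend ID. */
typedef struct SmgrMsgSketch
{
    int8_t   id;          /* message type, shrunk to one byte */
    int8_t   backend_hi;  /* high 8 bits of the backend ID */
    uint16_t backend_lo;  /* low 16 bits of the backend ID */
} SmgrMsgSketch;

/* Split a backend ID in the range 0..SKETCH_MAX_BACKEND_ID into two fields. */
static void
sketch_set_backend(SmgrMsgSketch *msg, int32_t backend)
{
    msg->backend_hi = (int8_t) (backend >> 16);
    msg->backend_lo = (uint16_t) (backend & 0xFFFF);
}

/* Reassemble the backend ID from its two fields. */
static int32_t
sketch_get_backend(const SmgrMsgSketch *msg)
{
    return (int32_t) msg->backend_hi * 65536 + (int32_t) msg->backend_lo;
}
```

Keeping the header at four bytes is the trade-off the commit message mentions: a full 32-bit backend ID would need a larger message.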
  11. 30 Jul, 2010 1 commit
    • R
      Fix possible page corruption by ALTER TABLE .. SET TABLESPACE. · 1a078629
      Robert Haas committed
      If a zeroed page is present in the heap, ALTER TABLE .. SET TABLESPACE will
      set the LSN and TLI while copying it, which is wrong, and heap_xlog_newpage()
      will do the same thing during replay, so the corruption propagates to any
      standby.  Note, however, that the bug can't be demonstrated unless archiving
      is enabled, since without archiving we skip WAL logging altogether, and the
      LSN/TLI are not set.
      
      Back-patch to 8.0; prior releases do not have tablespaces.
      
      Analysis and patch by Jeff Davis.  Adjustments for back-branches and minor
      wordsmithing by me.
      1a078629
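
The shape of the fix can be sketched as a guard that leaves never-initialised pages alone when stamping copied pages. The header layout and function names below are hypothetical; the only detail borrowed from PostgreSQL is the convention that a new page is recognised by a zero "upper" offset.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 8192

/* Hypothetical page header for this sketch (not the real PageHeaderData). */
typedef struct PageHeaderSketch
{
    uint64_t lsn;    /* WAL position of the last change */
    uint16_t tli;    /* timeline ID */
    uint16_t upper;  /* zero on a page that was never initialised */
} PageHeaderSketch;

/* A page counts as new when its "upper" offset is still zero. */
static bool
sketch_page_is_new(const unsigned char *page)
{
    PageHeaderSketch hdr;

    memcpy(&hdr, page, sizeof(hdr));
    return hdr.upper == 0;
}

/* Stamp LSN/TLI on a copied page, but never on a zeroed one. */
static void
sketch_stamp_page(unsigned char *page, uint64_t lsn, uint16_t tli)
{
    PageHeaderSketch hdr;

    if (sketch_page_is_new(page))
        return;  /* writing a header here would corrupt the zeroed page */

    memcpy(&hdr, page, sizeof(hdr));
    hdr.lsn = lsn;
    hdr.tli = tli;
    memcpy(page, &hdr, sizeof(hdr));
}
```

The same guard would have to apply on the replay side as well, since the commit message notes that heap_xlog_newpage() made the same mistake.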
  12. 07 Jul, 2010 1 commit
  13. 03 May, 2010 2 commits
    • T
      609a63fd
    • T
      Fix replay of XLOG_HEAP_NEWPAGE WAL records to pay attention to the forknum · e55e6ecf
      Tom Lane committed
      field of the WAL record.  The previous coding always wrote to the main fork,
      resulting in data corruption if the page was meant to go into a non-default
      fork.
      
      At present, the only operation that can produce such WAL records is
      ALTER TABLE/INDEX SET TABLESPACE when executed with archive_mode = on.
      Data corruption would be observed on standby slaves, and could occur on the
      master as well if a database crash and recovery occurred after committing
      the ALTER and before the next checkpoint.  Per report from Gordon Shannon.
      
      Back-patch to 8.4; the problem doesn't exist in earlier branches because
      we didn't have a concept of multiple relation forks then.
      e55e6ecf
  14. 29 Apr, 2010 1 commit
    • H
      Introduce wal_level GUC to explicitly control if information needed for · 9b8a7332
      Heikki Linnakangas committed
      archival or hot standby should be WAL-logged, instead of deducing that from
      other options like archive_mode. This replaces recovery_connections GUC in
      the primary, where it now has no effect, but it's still used in the standby
      to enable/disable hot standby.
      
      Remove the WAL-logging of "unlogged operations", like creating an index
      without WAL-logging and fsyncing it at the end. Instead, we keep a copy of
      the wal_level setting and the settings that affect how much shared memory a
      hot standby server needs to track master transactions (max_connections,
      max_prepared_xacts, max_locks_per_xact) in pg_control. Whenever the settings
      change, at server restart, write a WAL record noting the new settings and
      update pg_control. This allows us to notice the change in those settings in
      the standby at the right moment; they used to be included in checkpoint
      records, but that meant that a changed value was not reflected in the
      standby until the first checkpoint after the change.
      
      Bump PG_CONTROL_VERSION and XLOG_PAGE_MAGIC. Whack XLOG_PAGE_MAGIC back to
      the sequence it used to follow, before hot standby and subsequent patches
      changed it to 0x9003.
      9b8a7332
  15. 24 Apr, 2010 1 commit
  16. 22 Apr, 2010 2 commits
    • S
      Further reductions in Hot Standby conflict processing. These · 781ec6b7
      Simon Riggs committed
      come from the realisation that HEAP2_CLEAN records don't
      always remove user visible data, so conflict processing for
      them can be skipped. Confirm validity using Assert checks,
      clarify circumstances under which we log heap_cleanup_info
      records. Tuning arises from bug fixing of earlier safety
      check failures.
      781ec6b7
    • S
      Fix oversight in collecting values for cleanup_info records. · bc2b85d9
      Simon Riggs committed
      vacuum_log_cleanup_info() now generates log records with a valid
      latestRemovedXid set in all cases. Also be careful not to zero the
      value when we do a round of vacuuming part-way through lazy_scan_heap().
      Incidentally, this reduces frequency of conflicts in Hot Standby.
      bc2b85d9
  17. 26 Feb, 2010 1 commit
  18. 15 Feb, 2010 1 commit
    • R
      Wrap calls to SearchSysCache and related functions using macros. · e26c539e
      Robert Haas committed
      The purpose of this change is to eliminate the need for every caller
      of SearchSysCache, SearchSysCacheCopy, SearchSysCacheExists,
      GetSysCacheOid, and SearchSysCacheList to know the maximum number
      of allowable keys for a syscache entry (currently 4).  This will
      make it far easier to increase the maximum number of keys in a
      future release should we choose to do so, and it makes the code
      shorter, too.
      
      Design and review by Tom Lane.
      e26c539e
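
The wrapping technique can be sketched with fixed-arity macros that pad the unused key slots. Everything named `sketch_*` below is hypothetical and stands in for the real lookup function; the point is only that callers state the keys they have and the macro supplies the rest.

```c
/* Hypothetical record of what a lookup was asked for. */
typedef struct KeySetSketch
{
    int  cache_id;
    long keys[4];
} KeySetSketch;

/* Stand-in for the underlying lookup, which always takes the maximum key count. */
static KeySetSketch
sketch_search(int cache_id, long k1, long k2, long k3, long k4)
{
    KeySetSketch ks = { cache_id, { k1, k2, k3, k4 } };

    return ks;
}

/* Callers pick the macro matching their key count; unused slots become zero. */
#define sketch_search1(id, k1)             sketch_search(id, k1, 0, 0, 0)
#define sketch_search2(id, k1, k2)         sketch_search(id, k1, k2, 0, 0)
#define sketch_search3(id, k1, k2, k3)     sketch_search(id, k1, k2, k3, 0)
#define sketch_search4(id, k1, k2, k3, k4) sketch_search(id, k1, k2, k3, k4)
```

If the maximum key count were raised later, only the underlying function and the macro bodies would change; call sites written against `sketch_search2` and friends would stay as they are.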
  19. 10 Feb, 2010 1 commit
    • T
      Fix up rickety handling of relation-truncation interlocks. · cbe9d6be
      Tom Lane committed
      Move rd_targblock, rd_fsm_nblocks, and rd_vm_nblocks from relcache to the smgr
      relation entries, so that they will get reset to InvalidBlockNumber whenever
      an smgr-level flush happens.  Because we now send smgr invalidation messages
      immediately (not at end of transaction) when a relation truncation occurs,
      this ensures that other backends will reset their values before they next
      access the relation.  We no longer need the unreliable assumption that a
      VACUUM that's doing a truncation will hold its AccessExclusive lock until
      commit --- in fact, we can intentionally release that lock as soon as we've
      completed the truncation.  This patch therefore reverts (most of) Alvaro's
      patch of 2009-11-10, as well as my marginal hacking on it yesterday.  We can
      also get rid of assorted no-longer-needed relcache flushes, which are far more
      expensive than an smgr flush because they kill a lot more state.
      
      In passing this patch fixes smgr_redo's failure to perform visibility-map
      truncation, and cleans up some rather dubious assumptions in freespace.c and
      visibilitymap.c about when rd_fsm_nblocks and rd_vm_nblocks can be out of
      date.
      cbe9d6be
  20. 08 Feb, 2010 1 commit
    • T
      Remove old-style VACUUM FULL (which was known for a little while as · 0a469c87
      Tom Lane committed
      VACUUM FULL INPLACE), along with a boatload of subsidiary code and complexity.
      Per discussion, the use case for this method of vacuuming is no longer large
      enough to justify maintaining it; not to mention that we don't wish to invest
      the work that would be needed to make it play nicely with Hot Standby.
      
      Aside from the code directly related to old-style VACUUM FULL, this commit
      removes support for certain WAL record types that could only be generated
      within VACUUM FULL, redirect-pointer removal in heap_page_prune, and
      nontransactional generation of cache invalidation sinval messages (the last
      being the sticking point for Hot Standby).
      
      We still have to retain all code that copes with finding HEAP_MOVED_OFF and
      HEAP_MOVED_IN flag bits on existing tuples.  This can't be removed as long
      as we want to support in-place update from pre-9.0 databases.
      0a469c87
  21. 04 Feb, 2010 1 commit
    • T
      Restructure CLUSTER/newstyle VACUUM FULL/ALTER TABLE support so that swapping · 9727c583
      Tom Lane committed
      of old and new toast tables can be done either at the logical level (by
      swapping the heaps' reltoastrelid links) or at the physical level (by swapping
      the relfilenodes of the toast tables and their indexes).  This is necessary
      infrastructure for upcoming changes to support CLUSTER/VAC FULL on shared
      system catalogs, where we cannot change reltoastrelid.  The physical swap
      saves a few catalog updates too.
      
      We unfortunately have to keep the logical-level swap logic because in some
      cases we will be adding or deleting a toast table, so there's no possibility
      of a physical swap.  However, that only happens as a consequence of schema
      changes in the table, which we do not need to support for system catalogs,
      so such cases aren't an obstacle for that.
      
      In passing, refactor the cluster support functions a little bit to eliminate
      unnecessarily-duplicated code; and fix the problem that while CLUSTER had
      been taught to rename the final toast table at need, ALTER TABLE had not.
      9727c583
  22. 03 Feb, 2010 1 commit
  23. 30 Jan, 2010 1 commit
    • S
      Filter recovery conflicts based upon dboid from relfilenode of WAL · 76be0c81
      Simon Riggs committed
      records for heap and btree. Minor change, mostly API changes to
      pass through the required values. This is a simple change, though it
      also provides the refactoring required for further enhancements
      to conflict processing using the relOid. Changes only have effect
      during Hot Standby.
      76be0c81
  24. 21 Jan, 2010 1 commit
  25. 14 Jan, 2010 1 commit
    • S
      First part of refactoring of code for ResolveRecoveryConflict. Purposes · e99767bc
      Simon Riggs committed
      of this are to centralise the conflict code to allow further change,
      as well as to allow passing through the full reason for the conflict
      through to the conflicting backends. Backend state alters how we
      can handle different types of conflict so this is now required.
      As originally suggested by Heikki, no longer optional.
      e99767bc
  26. 10 Jan, 2010 1 commit
    • R
      Remove partial, broken support for NULL pointers when fetching attributes. · 84b6d5f3
      Robert Haas committed
      Previously, fastgetattr() and heap_getattr() tested their fourth argument
      against a null pointer, but any attempt to use them with a literal-NULL
      fourth argument evaluated to *(void *)0, resulting in a compiler error.
      Remove these NULL tests to avoid leading future readers of this code to
      believe that this has a chance of working.  Also clean up related legacy
      code in nocachegetattr(), heap_getsysattr(), and nocache_index_getattr().
      
      The new coding standard is that any code which calls a getattr-type
      function or macro which takes an isnull argument MUST pass a valid
      boolean pointer.  Per discussion with Bruce Momjian, Tom Lane, Alvaro
      Herrera.
      84b6d5f3
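
The coding standard stated above can be sketched as an accessor that requires a valid isnull pointer rather than tolerating NULL. `TupleSketch` and `sketch_getattr` are hypothetical names used only for this illustration; they are not the real fastgetattr() or heap_getattr().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tuple: parallel arrays of values and null flags. */
typedef struct TupleSketch
{
    int         natts;
    const int  *values;
    const bool *nulls;
} TupleSketch;

/*
 * Fetch attribute attnum (1-based). The caller MUST pass a valid isnull
 * pointer; there is deliberately no "NULL means I don't care" branch.
 */
static int
sketch_getattr(const TupleSketch *tup, int attnum, bool *isnull)
{
    assert(isnull != NULL);
    assert(attnum >= 1 && attnum <= tup->natts);

    *isnull = tup->nulls[attnum - 1];
    return *isnull ? 0 : tup->values[attnum - 1];
}
```

A caller that does not care about nullness still declares a local `bool` and passes its address, which keeps the accessor free of a half-working optional path.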
  27. 03 Jan, 2010 1 commit
  28. 19 Dec, 2009 1 commit
    • S
      Allow read only connections during recovery, known as Hot Standby. · efc16ea5
      Simon Riggs committed
      Enabled by recovery_connections = on (default) and forcing archive recovery using a recovery.conf. Recovery processing now emulates the original transactions as they are replayed, providing full locking and MVCC behaviour for read only queries. Recovery must enter consistent state before connections are allowed, so there is a delay, typically short, before connections succeed. Replay of recovering transactions can conflict and in some cases deadlock with queries during recovery; these result in query cancellation after max_standby_delay seconds have expired. Infrastructure changes have minor effects on normal running, though introduce four new types of WAL record.
      
      New test mode "make standbycheck" allows regression tests of static command behaviour on a standby server while in recovery. Typical and extreme dynamic behaviours have been checked via code inspection and manual testing. Few port specific behaviours have been utilised, though primary testing has been on Linux only so far.
      
      This commit is the basic patch. Additional changes will follow in this release to enhance some aspects of behaviour, notably improved handling of conflicts, deadlock detection and query cancellation. Changes to VACUUM FULL are also required.
      
      Simon Riggs, with significant and lengthy review by Heikki Linnakangas, including streamlined redesign of snapshot creation and two-phase commit.
      
      Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback from many other community members.
      efc16ea5
  29. 24 Aug, 2009 1 commit
    • T
      Fix a violation of WAL coding rules in the recent patch to include an · 7fc7a7c4
      Tom Lane committed
      "all tuples visible" flag in heap page headers.  The flag update *must*
      be applied before calling XLogInsert, but heap_update and the tuple
      moving routines in VACUUM FULL were ignoring this rule.  A crash and
      replay could therefore leave the flag incorrectly set, causing rows
      to appear visible in seqscans when they should not be.  This might explain
      recent reports of data corruption from Jeff Ross and others.
      
      In passing, do a bit of editorialization on comments in visibilitymap.c.
      7fc7a7c4
  30. 30 Jul, 2009 1 commit
    • T
      Support deferrable uniqueness constraints. · 25d9bf2e
      Tom Lane committed
      The current implementation fires an AFTER ROW trigger for each tuple that
      looks like it might be non-unique according to the index contents at the
      time of insertion.  This works well as long as there aren't many conflicts,
      but won't scale to massive unique-key reassignments.  Improving that case
      is a TODO item.
      
      Dean Rasheed
      25d9bf2e
  31. 22 Jul, 2009 1 commit
  32. 18 Jun, 2009 1 commit
  33. 11 Jun, 2009 2 commits
  34. 13 May, 2009 1 commit
    • T
      Fix LOCK TABLE to eliminate the race condition that could make it give weird · f23bdda3
      Tom Lane committed
      errors when tables are concurrently dropped.  To do this we must take lock
      on each relation before we check its privileges.  The old code was trying
      to do that the other way around, which is a bit pointless when there are lots
      of other commands that lock relations before checking privileges.  I did keep
      it checking each relation's privilege before locking the next relation, which
      is a detail that ALTER TABLE isn't too picky about.
      f23bdda3
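
The ordering described above (lock a relation, check its privilege, and only then move on to the next relation) can be sketched as a loop. This is a toy model under stated assumptions: `RelSketch` and `sketch_lock_tables` are hypothetical, and real lock acquisition, error reporting, and lock release at transaction end are not modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical relation state for this sketch. */
typedef struct RelSketch
{
    bool exists;         /* false if concurrently dropped */
    bool has_privilege;  /* caller may lock this relation */
    bool locked;         /* set once the lock is taken */
} RelSketch;

/*
 * Lock each relation in order. Returns the number of relations processed
 * successfully, or -1 at the first relation that is missing or that the
 * caller lacks privilege on.
 */
static int
sketch_lock_tables(RelSketch *rels, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (!rels[i].exists)
            return -1;               /* relation vanished before we locked it */

        rels[i].locked = true;       /* 1. take the lock first */

        if (!rels[i].has_privilege)  /* 2. check privilege under the lock */
            return -1;               /* 3. stop before locking the next one */
    }
    return (int) n;
}
```

Because the privilege check happens while the lock is held, the relation cannot be dropped between the check and the lock, which is the race the commit message describes.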
  35. 21 Jan, 2009 1 commit