1. 29 June 2012 (2 commits)
  2. 28 June 2012 (5 commits)
    • Dramatically reduce System V shared memory consumption. · b0fc0df9
      Committed by Robert Haas
      Except when compiling with EXEC_BACKEND, we'll now allocate only a tiny
      amount of System V shared memory (as an interlock to protect the data
      directory) and allocate the rest as anonymous shared memory via mmap.
      This will hopefully spare most users the hassle of adjusting operating
      system parameters before being able to start PostgreSQL with a
      reasonable value for shared_buffers.
      
      There are a bunch of documentation updates needed here, and we might
      need to adjust some of the HINT messages related to shared memory as
      well.  But it's not 100% clear how portable this is, so before we
      write the documentation, let's give it a spin on the buildfarm and
      see what turns red.
    • Add missing space in event_source GUC description. · c5b3451a
      Committed by Robert Haas
      This has apparently been wrong since event_source was added.
      
      Alexander Lakhin
    • Make UtilityContainsQuery recurse until it finds a non-utility Query. · bde689f8
      Committed by Tom Lane
      The callers of UtilityContainsQuery want it to return a non-utility Query
      if it returns anything at all.  However, since we made CREATE TABLE
      AS/SELECT INTO into a utility command instead of a variant of SELECT,
      a command like "EXPLAIN SELECT INTO" results in two nested utility
      statements.  So what we need UtilityContainsQuery to do is drill down
      to the bottom non-utility Query.
      
      I had thought of this possibility in setrefs.c, and fixed it there by
      looping around the UtilityContainsQuery call; but overlooked that the call
      sites in plancache.c have a similar issue.  In those cases it's
      notationally inconvenient to provide an external loop, so let's redefine
      UtilityContainsQuery as recursing down to a non-utility Query instead.
      
      Noted by Rushabh Lathia.  This is a somewhat cleaned-up version of his
      proposed patch.
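
The "drill down" behaviour described in this commit can be sketched as follows. The `SketchQuery` type and the function name are hypothetical simplifications, not PostgreSQL's real node structures: each utility statement is assumed to wrap at most one inner query.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a parsed query node. */
typedef struct SketchQuery
{
    bool is_utility;                /* true for utility statements */
    struct SketchQuery *contained;  /* query wrapped by a utility statement, or NULL */
} SketchQuery;

/*
 * Return the innermost non-utility query nested under a utility statement,
 * or NULL if the chain of utility statements ends without one.
 */
SketchQuery *sketch_utility_contains_query(SketchQuery *utility)
{
    SketchQuery *inner;

    assert(utility != NULL && utility->is_utility);
    inner = utility->contained;
    if (inner == NULL)
        return NULL;
    if (inner->is_utility)
        return sketch_utility_contains_query(inner);   /* keep drilling down */
    return inner;
}
```

With this shape, a caller handling something like "EXPLAIN SELECT INTO" gets the plain SELECT back from a single call and needs no external loop.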
    • Fix install program detection · f7867154
      Committed by Peter Eisentraut
      configure handles INSTALL as a substitution variable specially, and
      apparently it gets confused when it's set to empty.  Use INSTALL_
      instead as a workaround to avoid the issue.
    • Fix two more neglected comments, still referring to log/seg. · a8f97b39
      Committed by Heikki Linnakangas
      Fujii Masao
  3. 27 June 2012 (7 commits)
  4. 26 June 2012 (9 commits)
    • Reduce use of heavyweight locking inside hash AM. · 76837c15
      Committed by Robert Haas
      Avoid using LockPage(rel, 0, lockmode) to protect against changes to
      the bucket mapping.  Instead, an exclusive buffer content lock is now
      viewed as sufficient permission to modify the metapage, and a shared
      buffer content lock is used when such modifications need to be
      prevented.  This more relaxed locking regimen makes it possible that,
      when we're busy getting a heavyweight lock on the bucket we intend
      to search or insert into, a bucket split might occur underneath us.
      To compensate for that possibility, we use a loop-and-retry system:
      release the metapage content lock, acquire the heavyweight lock on the
      target bucket, and then reacquire the metapage content lock and check
      that the bucket mapping has not changed.   Normally it hasn't, and
      we're done.  But if by chance it has, we simply unlock the metapage,
      release the heavyweight lock we acquired previously, lock the new
      bucket, and loop around again.  Even in the worst case we cannot loop
      very many times here, since we don't split the same bucket again until
      we've split all the other buckets, and 2^N gets big pretty fast.
      
      This results in greatly improved concurrency, because we're
      effectively replacing two lwlock acquire-and-release cycles in
      exclusive mode (on one of the lock manager locks) with a single
      acquire-and-release cycle in shared mode (on the metapage buffer
      content lock).  Testing shows that it's still not quite as good as
      btree; for that, we'd probably have to find some way of getting rid
      of the heavyweight bucket locks as well, which does not appear
      straightforward.
      
      Patch by me, review by Jeff Janes.
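
The loop-and-retry scheme in this commit can be sketched in a single-threaded simulation. Everything here is an assumption made for illustration: the `SketchMetaPage` type, the function names, and the `unlocked_hook` callback (which stands in for a concurrent bucket split happening while no metapage lock is held). Real lock calls are replaced by comments, so this shows only the control flow, not actual locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the hash index metapage. */
typedef struct
{
    uint32_t maxbucket;     /* highest bucket number currently in use */
    uint32_t highmask;      /* mask for the doubled table size */
    uint32_t lowmask;       /* mask for the current table size */
} SketchMetaPage;

/* Called while no metapage lock is held; simulates concurrent activity. */
typedef void (*unlocked_hook) (SketchMetaPage *meta);

uint32_t sketch_hash_to_bucket(const SketchMetaPage *meta, uint32_t hash)
{
    uint32_t bucket = hash & meta->highmask;

    if (bucket > meta->maxbucket)
        bucket &= meta->lowmask;    /* that bucket does not exist yet */
    return bucket;
}

/* Demo hook: split bucket 0 into buckets 0 and 2, but only once. */
void sketch_split_once(SketchMetaPage *meta)
{
    if (meta->maxbucket < 2)
        meta->maxbucket = 2;
}

/* Returns the bucket finally locked; *attempts counts bucket locks taken. */
uint32_t sketch_lock_target_bucket(SketchMetaPage *meta, uint32_t hash,
                                   unlocked_hook hook, int *attempts)
{
    bool have_lock = false;
    uint32_t locked = 0;

    assert(attempts != NULL);
    *attempts = 0;
    for (;;)
    {
        /* (metapage content lock is held in shared mode here) */
        uint32_t bucket = sketch_hash_to_bucket(meta, hash);

        if (have_lock && bucket == locked)
            return locked;          /* mapping unchanged: done */

        /* (release metapage lock; release stale bucket lock if have_lock) */
        if (hook != NULL)
            hook(meta);             /* a split may happen underneath us */

        /* (acquire heavyweight lock on `bucket`, then re-lock the metapage) */
        locked = bucket;
        have_lock = true;
        (*attempts)++;
    }
}
```

For example, with two buckets (`maxbucket = 1`, `highmask = 3`, `lowmask = 1`) and hash value 6, the first pass targets bucket 0; if a split creates bucket 2 in the meantime, the recheck notices the changed mapping and the second pass locks bucket 2 instead.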
    • Fix pg_upgrade, broken by the xlogid/segno -> 64-bit int refactoring. · 038f3a05
      Committed by Heikki Linnakangas
      The xlogid + segno representation of a particular WAL segment doesn't make
      much sense in pg_resetxlog anymore, now that we don't use that anywhere
      else. Use the WAL filename instead, since that's a convenient way to name a
      particular WAL segment.
      
      I did this partially for pg_resetxlog in the original xlogid/segno -> uint64
      patch, but I neglected pg_upgrade and the docs. This should now be more
      complete.
    • Make pg_dump emit more accurate dependency information. · 8a504a36
      Committed by Tom Lane
      While pg_dump has included dependency information in archive-format output
      ever since 7.3, it never made any large effort to ensure that that
      information was actually useful.  In particular, in common situations where
      dependency chains include objects that aren't separately emitted in the
      dump, the dependencies shown for objects that were emitted would reference
      the dump IDs of these un-dumped objects, leaving no clue about which other
      objects the visible objects indirectly depend on.  So far, parallel
      pg_restore has managed to avoid tripping over this misfeature, but only
      by dint of some crude hacks like not trusting dependency information in
      the pre-data section of the archive.
      
      It seems prudent to do something about this before it rises up to bite us,
      so instead of emitting the "raw" dependencies of each dumped object,
      recursively search for its actual dependencies among the subset of objects
      that are being dumped.
      
      Back-patch to 9.2, since that code hasn't yet diverged materially from
      HEAD.  At some point we might need to back-patch further, but right now
      there are no known cases where this is actively necessary.  (The one known
      case, bug #6699, is fixed in a different way by my previous patch.)  Since
      this patch depends on 9.2 changes that made TOC entries be marked before
      output commences as to whether they'll be dumped, back-patching further
      would require additional surgery; and as of now there's no evidence that
      it's worth the risk.
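
The recursive search for "actual dependencies among the subset of objects that are being dumped" can be sketched on a tiny graph. The `SketchDepGraph` type, the fixed object count, and the function names are hypothetical; the sketch assumes the search stops at the first dumped object on each path and looks through un-dumped ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_NOBJ 5           /* hypothetical tiny catalog */

typedef struct
{
    bool deps[SKETCH_NOBJ][SKETCH_NOBJ];    /* deps[a][b]: a directly depends on b */
    bool dumped[SKETCH_NOBJ];               /* is the object emitted in the dump? */
} SketchDepGraph;

static void sketch_collect(const SketchDepGraph *g, int obj,
                           bool *visited, bool *out)
{
    for (int d = 0; d < SKETCH_NOBJ; d++)
    {
        if (!g->deps[obj][d] || visited[d])
            continue;
        visited[d] = true;
        if (g->dumped[d])
            out[d] = true;                      /* visible dependency: record it */
        else
            sketch_collect(g, d, visited, out); /* look through un-dumped object */
    }
}

/* Fill out[] with the dumped objects that `obj` depends on, directly or not. */
void sketch_effective_deps(const SketchDepGraph *g, int obj,
                           bool out[SKETCH_NOBJ])
{
    bool visited[SKETCH_NOBJ];

    assert(obj >= 0 && obj < SKETCH_NOBJ);
    memset(visited, 0, sizeof(visited));
    memset(out, 0, SKETCH_NOBJ * sizeof(bool));
    visited[obj] = true;
    sketch_collect(g, obj, visited, out);
}
```

If object 0 depends on un-dumped object 1, which in turn depends on dumped object 2, the result for object 0 names object 2 rather than the dump ID of an object that never appears in the archive.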
    • Improve pg_dump's dependency-sorting logic to enforce section dump order. · a1ef01fe
      Committed by Tom Lane
      As of 9.2, with the --section option, it is very important that the concept
      of "pre data", "data", and "post data" sections of the output be honored
      strictly; else a dump divided into separate sectional files might be
      unrestorable.  However, the dependency-sorting logic knew nothing of
      sections and would happily select output orderings that didn't fit that
      structure.  Doing so was mostly harmless before 9.2, but now we need to be
      sure it doesn't do that.  To fix, create dummy objects representing the
      section boundaries and add dependencies between them and all the normal
      objects.  (This might sound expensive but it seems to only add a percent or
      two to pg_dump's runtime.)
      
      This also fixes a problem introduced in 9.1 by the feature that allows
      incomplete GROUP BY lists when a primary key is given in GROUP BY.
      That means that views can depend on primary key constraints.  Previously,
      pg_dump would deal with that by simply emitting the primary key constraint
      before the view definition (and hence before the data section of the
      output).  That's bad enough for simple serial restores, where creating an
      index before the data is loaded works, but is undesirable for speed
      reasons.  But it could lead to outright failure of parallel restores, as
      seen in bug #6699 from Joe Van Dyk.  That happened because pg_restore would
      switch into parallel mode as soon as it reached the constraint, and then
      very possibly would try to emit the view definition before the primary key
      was committed (as a consequence of another bug that causes the view not to
      be correctly marked as depending on the constraint).  Adding the section
      boundary constraints forces the dependency-sorting code to break the view
      into separate table and rule declarations, allowing the rule, and hence the
      primary key constraint it depends on, to revert to their intended location
      in the post-data section.  This also somewhat accidentally works around the
      bogus-dependency-marking problem, because the rule will be correctly shown
      as depending on the constraint, so parallel pg_restore will now do the
      right thing.  (We will fix the bogus-dependency problem for real in a
      separate patch, but that patch is not easily back-portable to 9.1, so the
      fact that this patch is enough to dodge the only known symptom is
      fortunate.)
      
      Back-patch to 9.1, except for the hunk that adds verification that the
      finished archive TOC list is in correct section order; the place where
      it was convenient to add that doesn't exist in 9.1.
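
The dummy section-boundary objects described in this commit can be sketched with a small topological sort. The `SketchDump` type, the two boundary IDs, and the simple "pick the lowest ready ID" ordering are assumptions for illustration, not pg_dump's actual data structures or sort. Unlike pg_dump, which repairs dependency loops (for example by splitting a view into a table and a rule), this sketch merely reports a loop.

```c
#include <assert.h>
#include <stdbool.h>

enum { SKETCH_PRE, SKETCH_DATA, SKETCH_POST };

#define SKETCH_MAXN 16

typedef struct
{
    int n;                                  /* real objects use IDs 0..n-1 */
    int section[SKETCH_MAXN];
    bool dep[SKETCH_MAXN][SKETCH_MAXN];     /* dep[a][b]: a must come after b */
} SketchDump;

/*
 * Add two dummy boundary objects (IDs n and n+1), tie every real object to
 * them, and emit a topological order.  Returns the number of IDs written to
 * order[], or -1 if the dependencies form a loop.
 */
int sketch_sort_with_boundaries(SketchDump *d, int order[SKETCH_MAXN])
{
    int pre_bound = d->n, post_bound = d->n + 1, total = d->n + 2;
    bool done[SKETCH_MAXN] = {false};
    int emitted = 0;

    assert(total <= SKETCH_MAXN);
    d->dep[post_bound][pre_bound] = true;
    for (int i = 0; i < d->n; i++)
    {
        if (d->section[i] == SKETCH_PRE)
            d->dep[pre_bound][i] = true;    /* boundary follows pre-data */
        else if (d->section[i] == SKETCH_DATA)
        {
            d->dep[i][pre_bound] = true;    /* data follows first boundary */
            d->dep[post_bound][i] = true;   /* second boundary follows data */
        }
        else
            d->dep[i][post_bound] = true;   /* post-data follows second boundary */
    }

    while (emitted < total)
    {
        int pick = -1;

        for (int i = 0; i < total && pick < 0; i++)
        {
            bool ready = !done[i];

            for (int j = 0; ready && j < total; j++)
                if (d->dep[i][j] && !done[j])
                    ready = false;
            if (ready)
                pick = i;
        }
        if (pick < 0)
            return -1;                      /* dependency loop */
        done[pick] = true;
        order[emitted++] = pick;
    }
    return emitted;
}
```

In this sketch a post-data index with no explicit dependencies can no longer be emitted first, because it now depends on the second boundary, which in turn waits for all data objects.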
    • Tighten up includes in sinvaladt.h, twophase.h, proc.h · 77ed0c69
      Committed by Alvaro Herrera
      Remove proc.h from sinvaladt.h and twophase.h; also replace xlog.h in
      proc.h with xlogdefs.h.
    • Unify calling conventions for postgres/postmaster sub-main functions · eeece9e6
      Committed by Peter Eisentraut
      There was a wild mix of calling conventions: Some were declared to
      return void and didn't return, some returned an int exit code, some
      claimed to return an exit code, which the callers checked, but
      actually never returned, and so on.
      
      Now all of these functions are declared to return void and decorated
      with attribute noreturn and don't return.  That's easiest, and most
      code already worked that way.
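
The unified convention (declared `void`, marked noreturn, never returning) might look like the following sketch. The function names are hypothetical, and the attribute is written in GCC/Clang syntax, which is an assumption about the compiler. The child process is used only to make the "does not return" behaviour observable.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical sub-main: returns void on paper, and never actually returns. */
void sketch_sub_main(int exit_code) __attribute__((noreturn));

void sketch_sub_main(int exit_code)
{
    /* ... a real sub-main would run its main loop here ... */
    _exit(exit_code);           /* leave the process instead of returning */
}

/* Fork a child that enters the sub-main; return its exit status or -1. */
int sketch_run_sub_main_in_child(int exit_code)
{
    int status;
    pid_t pid;

    assert(exit_code >= 0 && exit_code <= 255);
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        sketch_sub_main(exit_code);     /* nothing after this runs in the child */
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Because the callee is marked noreturn, callers need no dead code after the call to handle a return value that can never arrive.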
    • Fix typo in DEBUG message, introduced by recent WAL refactoring. · c7d47abd
      Committed by Robert Haas
      Fujii Masao
    • Unbreak pg_resetxlog -l. · a6427f1f
      Committed by Robert Haas
      Fujii Masao
    • Remove sanity test in XRecOffIsValid. · 2dfa87bc
      Committed by Robert Haas
      Commit 061e7efb changed the rules
      for splitting xlog records across pages, but neglected to update this
      test.  It's possible that there's some better action here than just
      removing the test completely, but this at least appears to get some
      of the things that are currently broken (like initdb on MacOS X)
      working again.
  5. 25 June 2012 (7 commits)
  6. 24 June 2012 (3 commits)
    • Allow WAL record header to be split across pages. · 061e7efb
      Committed by Heikki Linnakangas
      This saves a few bytes of WAL space, but the real motivation is to make it
      predictable how much WAL space a record requires, as it no longer depends
      on whether we need to waste the last few bytes at end of WAL page because
      the header doesn't fit.
      
      The total length field of WAL record, xl_tot_len, is moved to the beginning
      of the WAL record header, so that it is still always found on the first page
      where a WAL record begins.
      
      Bump WAL version number again as this is an incompatible change.
    • Move WAL continuation record information to WAL page header. · 20ba5ca6
      Committed by Heikki Linnakangas
      The continuation record only contained one field, xl_rem_len, so it makes
      things simpler to just include it in the WAL page header. This wastes four
      bytes on pages that don't begin with a continuation from previous page, plus
      four bytes on every page, because of padding.
      
      The motivation of this is to make it easier to calculate how much space a
      WAL record needs. Before this patch, it depended on how many page boundaries
      the record crosses. The motivation of that, in turn, is to separate the
      allocation of space in the WAL from the copying of the record data to the
      allocated space. Keeping the calculation of space required simple helps to
      keep the critical section of allocating the space from WAL short. But that's
      not included in this patch yet.
      
      Bump WAL version number again, as this is an incompatible change.
    • Don't waste the last segment of each 4GB logical log file. · dfda6eba
      Committed by Heikki Linnakangas
      The comments claimed that wasting the last segment made it easier to do
      calculations with XLogRecPtrs, because you don't have problems representing
      last-byte-position-plus-1 that way. In my experience, however, it only made
      things more complicated, because there were two ways to represent the
      boundary at the beginning of a logical log file: xlogid = n+1 and xrecoff = 0,
      or as xlogid = n and xrecoff = 4GB - XLOG_SEG_SIZE. Some functions were
      picky about which representation was used.
      
      Also, use a 64-bit segment number instead of the log/seg combination, to
      point to a certain WAL segment. We assume that all platforms have a working
      64-bit integer type nowadays.
      
      This is an incompatible change in WAL format, so bumping WAL version number.
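
How a single 64-bit segment number relates to the familiar 24-hex-digit WAL file name can be sketched as below. The macro and function names are hypothetical, and the sketch assumes the default 16 MB segment size and that no segment is wasted, so each 4 GB logical file holds exactly 256 segments.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed sizes: 16 MB segments, 4 GB per logical log file. */
#define SKETCH_SEG_SIZE         (UINT64_C(16) * 1024 * 1024)
#define SKETCH_SEGS_PER_XLOGID  (UINT64_C(0x100000000) / SKETCH_SEG_SIZE)

/* Segment number containing a given absolute byte position in the WAL. */
uint64_t sketch_byte_to_segno(uint64_t byte_pos)
{
    return byte_pos / SKETCH_SEG_SIZE;
}

/* Format timeline + segment number as three 8-digit hex fields. */
void sketch_wal_file_name(char *buf, size_t buflen,
                          uint32_t tli, uint64_t segno)
{
    assert(buflen >= 25);       /* 24 hex digits plus terminating NUL */
    snprintf(buf, buflen, "%08" PRIX32 "%08" PRIX32 "%08" PRIX32,
             tli,
             (uint32_t) (segno / SKETCH_SEGS_PER_XLOGID),
             (uint32_t) (segno % SKETCH_SEGS_PER_XLOGID));
}
```

Under these assumptions the boundary at byte position 4 GB has exactly one representation, segment number 256, which formats on timeline 1 as `000000010000000100000000`.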
  7. 22 June 2012 (5 commits)
  8. 21 June 2012 (2 commits)
    • Add a small cache of locks owned by a resource owner in ResourceOwner. · eeb6f37d
      Committed by Heikki Linnakangas
      This speeds up reassigning locks to the parent owner, when the transaction
      holds a lot of locks, but only a few of them belong to the current resource
      owner. This particularly helps pg_dump when dumping a large number of
      objects.
      
      The cache can hold up to 15 locks in each resource owner. After that, the
      cache is marked as overflowed, and we fall back to the old method of
      scanning the whole local lock table. The tradeoff here is that the cache has
      to be scanned whenever a lock is released, so if the cache is too large,
      lock release becomes more expensive. 15 seems enough to cover pg_dump, and
      doesn't have much impact on lock release.
      
      Jeff Janes, reviewed by Amit Kapila and Heikki Linnakangas.
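
The bounded cache with an overflow marker described in this commit can be sketched as follows. The `SketchLockCache` type and function names are hypothetical. The sketch assumes that a lock count above the limit doubles as the "overflowed" flag, after which callers must fall back to scanning the full lock table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_CACHED_LOCKS 15

typedef struct
{
    int nlocks;         /* locks owned; above the limit means "overflowed" */
    const void *locks[SKETCH_MAX_CACHED_LOCKS];
} SketchLockCache;

bool sketch_cache_overflowed(const SketchLockCache *c)
{
    assert(c != NULL);
    return c->nlocks > SKETCH_MAX_CACHED_LOCKS;
}

void sketch_cache_remember(SketchLockCache *c, const void *lock)
{
    if (sketch_cache_overflowed(c))
        return;                         /* once overflowed, stay overflowed */
    if (c->nlocks < SKETCH_MAX_CACHED_LOCKS)
        c->locks[c->nlocks] = lock;
    c->nlocks++;                        /* the 16th lock tips it into overflow */
}

/* Returns true if the lock was found in the cache and removed. */
bool sketch_cache_forget(SketchLockCache *c, const void *lock)
{
    if (sketch_cache_overflowed(c))
        return false;                   /* caller must scan the full lock table */
    for (int i = c->nlocks - 1; i >= 0; i--)
    {
        if (c->locks[i] == lock)
        {
            c->locks[i] = c->locks[c->nlocks - 1];  /* fill hole with last entry */
            c->nlocks--;
            return true;
        }
    }
    return false;
}
```

The scan in `sketch_cache_forget` runs on every release, which is the cost that keeps the limit small in the tradeoff the commit message describes.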
    • Remove incomplete/incorrect support for zero-column foreign keys. · dfd9c116
      Committed by Tom Lane
      The original coding in ri_triggers.c had partial support for the concept of
      zero-column foreign key constraints.  But this is not defined in the SQL
      standard, nor was it ever allowed by any other part of Postgres, nor was it
      very fully implemented even here (eg there was no support for preventing
      PK-table deletions that would violate the constraint).  Doesn't seem very
      useful to carry 100-plus lines of code for a corner case that no one is
      interested in making work.  Instead, just add a check that the column list
      read from pg_constraint is non-empty.