1. 19 Jan, 2013 2 commits
  2. 18 Jan, 2013 12 commits
    • B
      Improve pg_upgrade error report · 600250d0
      Bruce Momjian committed
      If the cluster alignments don't match, output this suggestion:
      
      	Likely one cluster is a 32-bit install, the other 64-bit
    • A
      Fix off-by-one bug in xlog reading logic · 8c17144c
      Alvaro Herrera committed
      Bug reported by Michael Paquier
      
      Author: Andres Freund
    • B
      psql latex fixes · 74a82baf
      Bruce Momjian committed
      Remove extra line at bottom of table for new 'latex' mode border=3.
      Also update 'latex'-longtable 'tableattr' docs to say
      'whitespace-separated' instead of 'space'.
    • H
      Now that START_REPLICATION returns the next timeline's ID after reaching end · 6f7cddc7
      Heikki Linnakangas committed
      of timeline, take advantage of that in walreceiver.
      
      Startup process is still in control of choosing the target timeline, by
      scanning the timeline history files present in pg_xlog, but walreceiver now
      uses the next timeline's ID to fetch its history file immediately after it
      has finished streaming the old timeline. Before, the standby would first try
      to restart streaming on the old timeline, which fetches the missing timeline
      history file as a side-effect, and only then restart from the new timeline.
      This patch eliminates the extra iteration, which speeds up the timeline
      switch and reduces the noise in the log caused by the extra restart on the
      old timeline.
    • H
      Use the right timeline when beginning to stream from master. · 2ff65553
      Heikki Linnakangas committed
      The xlogreader refactoring broke the logic to decide which timeline to start
      streaming from. XLogPageRead() uses the timeline history to check which
      timeline the requested WAL position falls into. However, after the
      refactoring, XLogPageRead() is always first called with the first page in
      the segment, to verify the segment header, and only then with the actual WAL
      position we're interested in. That first read of the segment's header made
      XLogPageRead() always start streaming from the old timeline containing
      the segment header, not the timeline containing the actual record, if there
      was a timeline switch within the segment.
      
      I thought I fixed this yesterday, but that fix was too narrow and only fixed
      this for the corner-case that the timeline switch happened in the first page
      of the segment. To fix this more robustly, explicitly pass the position of
      the record we're actually interested in to XLogPageRead, and use that to
      decide which timeline to read from, rather than deduce it from the page and
      offset.
      
      Per report from Fujii Masao.
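
The fix described in this commit can be illustrated with a minimal sketch, assuming a simplified in-memory timeline history. `TimelineSpan` and `timeline_for_position` are hypothetical names used only for illustration; they are not the server's actual structures or functions. The point is that the lookup is keyed on the position of the record of interest, not on the start of the segment or page that happens to be read first.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified timeline history entry (not the server's type). */
typedef struct
{
	uint32_t	tli;	/* timeline ID */
	uint64_t	begin;	/* first WAL position on this timeline, inclusive */
	uint64_t	end;	/* switch point, exclusive; 0 means "still current" */
} TimelineSpan;

/*
 * Return the timeline that contains WAL position "pos", or 0 if no entry
 * in the history covers it.
 */
uint32_t
timeline_for_position(const TimelineSpan *hist, size_t n, uint64_t pos)
{
	for (size_t i = 0; i < n; i++)
	{
		if (pos >= hist[i].begin && (hist[i].end == 0 || pos < hist[i].end))
			return hist[i].tli;
	}
	return 0;
}
```

With a switch from timeline 1 to timeline 2 in the middle of a segment, looking up the segment start yields timeline 1, while looking up the position of a record after the switch yields timeline 2. That difference is why the record position has to be passed explicitly.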
    • H
      When xlogreader asks the callback function to read a page, make sure we · 88228e6f
      Heikki Linnakangas committed
      get a large enough part of the page to include the beginning of the next
      record we're interested in. The XLogPageRead callback uses the requested
      length to decide which timeline to stream WAL from, and if the first call
      is short, and the page contains a timeline switch, we'll repeatedly try
      to stream that page from the old timeline, and never get across the
      timeline switch.
    • H
      I added a result set to START_REPLICATION command, but neglected walreceiver. · 3684a534
      Heikki Linnakangas committed
      The patch to allow pg_receivexlog to switch timeline added a result set
      after copy has ended in START_REPLICATION command, to return the next
      timeline's ID to the client. But walreceiver didn't get the memo, and threw
      an error on the unexpected result set. Fix.
    • A
      Accelerate end-of-transaction dropping of relations · 279628a0
      Alvaro Herrera committed
      When relations are dropped, at end of transaction we need to remove the
      files and clean the buffer pool of buffers containing pages of those
      relations.  Previously we would scan the buffer pool once per relation
      to clean up buffers.  When there are many relations to drop, the
      repeated scans make this process slow; so we now instead pass a list of
      relations to drop and scan the pool once, checking each buffer against
      the passed list.  When the number of relations is larger than a
      threshold (which as of this patch is being set to 20 relations) we sort
      the array before starting, and bsearch the array; when it's smaller, we
      simply scan the array linearly each time, because that's faster.  The
      exact optimal threshold value depends on many factors, but the
      difference is not likely to be significant enough to justify making it
      user-settable.
      
      This has been measured to be a significant win (a 15x win when dropping
      100,000 relations; an extreme case, but reportedly a real one).
      
      Author: Tomas Vondra, some tweaks by me
      Reviewed by: Robert Haas, Shigeru Hanada, Andres Freund, Álvaro Herrera
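
The single-pass strategy described in this commit can be sketched as follows. This is a minimal illustration, assuming relations are identified by a plain integer; `RelId`, `rel_is_dropped` and `count_buffers_to_drop` are hypothetical names, not the server's actual code. Only the threshold of 20 is taken from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a relation identifier. */
typedef unsigned int RelId;

/* Threshold named in the commit message. */
#define DROP_BSEARCH_THRESHOLD 20

int
relid_cmp(const void *a, const void *b)
{
	RelId		ra = *(const RelId *) a;
	RelId		rb = *(const RelId *) b;

	return (ra > rb) - (ra < rb);
}

/* Is "rel" in the drop list? Large lists must already be sorted. */
bool
rel_is_dropped(RelId rel, const RelId *rels, size_t nrels)
{
	if (nrels > DROP_BSEARCH_THRESHOLD)
		return bsearch(&rel, rels, nrels, sizeof(RelId), relid_cmp) != NULL;

	/* Small list: a linear scan is cheaper than sorting plus bsearch. */
	for (size_t i = 0; i < nrels; i++)
	{
		if (rels[i] == rel)
			return true;
	}
	return false;
}

/*
 * Scan the buffer pool once and count buffers that belong to any dropped
 * relation. buffer_owner[i] is the relation owning buffer i. Note that a
 * large drop list is sorted in place before the scan.
 */
size_t
count_buffers_to_drop(const RelId *buffer_owner, size_t nbuffers,
					  RelId *rels, size_t nrels)
{
	size_t		hits = 0;

	if (nrels > DROP_BSEARCH_THRESHOLD)
		qsort(rels, nrels, sizeof(RelId), relid_cmp);

	for (size_t i = 0; i < nbuffers; i++)
	{
		if (rel_is_dropped(buffer_owner[i], rels, nrels))
			hits++;
	}
	return hits;
}
```

The pool is walked exactly once regardless of how many relations are dropped; only the per-buffer membership check changes with the size of the list.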
    • H
      Make pg_receivexlog and pg_basebackup -X stream work across timeline switches. · 0b632913
      Heikki Linnakangas committed
      This mirrors the changes done earlier to the server in standby mode. When
      receivelog reaches the end of a timeline, as reported by the server, it
      fetches the timeline history file of the next timeline, and restarts
      streaming from the new timeline by issuing a new START_REPLICATION command.
      
      When pg_receivexlog crosses a timeline, it leaves the .partial suffix on the
      last segment on the old timeline. This helps you to tell apart a partial
      segment left in the directory because of a timeline switch, and a completed
      segment. If you just follow a single server, it won't make a difference, but
      it can be significant in more complicated scenarios where new WAL is still
      generated on the old timeline.
      
      This includes two small changes to the streaming replication protocol:
      First, when you reach the end of timeline while streaming, the server now
      sends the TLI of the next timeline in the server's history to the client.
      pg_receivexlog uses that as the next timeline, so that it doesn't need to
      parse the timeline history file like a standby server does. Second, when
      BASE_BACKUP command sends the begin and end WAL positions, it now also sends
      the timeline IDs corresponding to the positions.
    • T
      Improve memory space management in tuplesort and tuplestore. · 8ae35e91
      Tom Lane committed
      The code originally just doubled the size of the tuple-pointer array so
      long as that would fit in allowedMem.  This could result in failing to use
      as much as half of allowedMem, if (as is typical) the last doubling attempt
      didn't quite fit.  Worse, we might double the array size but be unable to
      use most of the added slots, because there was no room left within the
      allowedMem limit for tuples the slots should point to.  To fix, double only
      so long as we've used less than half of allowedMem in total.  Then do one
      more array enlargement, but scale it based on total memory consumption so
      far.  This will work nicely as long as the average tuple size is reasonably
      stable, and in any case should be better than the old method.
      
      This change will result in large sort operations consuming a larger
      fraction of work_mem than they typically did in the past.  The release
      notes should mention that users may want to revisit their work_mem
      settings, if they'd tuned those settings based on the old behavior of
      sorting.
      
      Jeff Janes, reviewed by Peter Geoghegan and Robert Haas
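
The growth policy described in this commit can be sketched roughly as below. This is a simplified illustration under stated assumptions: `SortMemState` and `grow_pointer_array` are hypothetical names, memory accounting is reduced to two byte counters, and the real code has additional limits that are not modelled here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified bookkeeping for one sort operation. */
typedef struct
{
	size_t		allowed_mem;	/* total memory budget, in bytes */
	size_t		used_mem;		/* bytes used so far (array plus tuples) */
	size_t		array_slots;	/* current capacity of the pointer array */
	bool		can_grow;		/* false once the final enlargement is done */
} SortMemState;

/*
 * Decide the new capacity of the tuple-pointer array and store it in "st".
 * While less than half of the budget is used, double the array. After
 * that, do one last enlargement scaled by allowed_mem / used_mem, then
 * stop growing.
 */
size_t
grow_pointer_array(SortMemState *st)
{
	size_t		new_slots;

	if (!st->can_grow)
		return st->array_slots;

	if (st->used_mem == 0 || st->used_mem < st->allowed_mem / 2)
		new_slots = st->array_slots * 2;
	else
	{
		double		ratio = (double) st->allowed_mem / (double) st->used_mem;

		new_slots = (size_t) ((double) st->array_slots * ratio);
		st->can_grow = false;
	}

	/* Never shrink the array in this sketch. */
	if (new_slots > st->array_slots)
		st->array_slots = new_slots;
	return st->array_slots;
}
```

For example, with a budget of 1000 bytes, 100 slots and 400 bytes used, the array doubles to 200 slots. Once 800 bytes are used, the last enlargement scales by 1000/800 to 250 slots, on the assumption that the average tuple size stays roughly stable.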
    • H
      Fix a couple of error-handling bugs in the xlogreader patch. · 1296d5c5
      Heikki Linnakangas committed
      XLogReadRecord should reset its state on every error, to make sure it
      re-reads the page on next call. It was inconsistent in that some errors did
      that, but some did not.
      
      In ReadRecord(), don't give up on an error if we're in standby mode. The
      loop was set up to retry, but the checks within the loop broke out of the
      loop on any error.
      
      Andres Freund, with some tweaking by me.
    • B
      Add a latex-longtable output format to psql · b14f81bc
      Bruce Momjian committed
      latex longtable is more powerful than the 'tabular' output format
      'latex' uses.  Also add border=3 support to 'latex'.
  3. 17 Jan, 2013 8 commits
    • M
      Silence compiler warnings · 8ef69616
      Magnus Hagander committed
    • H
      Make GiST indexes on-disk compatible with 9.2 again. · 9ee4d06f
      Heikki Linnakangas committed
      The patch that turned XLogRecPtr into a uint64 inadvertently changed the
      on-disk format of GiST indexes, because the NSN field in the GiST page
      opaque is an XLogRecPtr. That breaks pg_upgrade. Revert the format of that
      field back to the two-field struct that XLogRecPtr was before. This is the
      same thing we did to LSNs in the page header to avoid changing on-disk format.
      
      Bump catversion, as this invalidates any existing GiST indexes built on
      9.3devel.
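
The idea of keeping a two-field on-disk representation while using a 64-bit value in memory can be sketched as follows. This is an illustration only: `OnDiskRecPtr`, `recptr_to_disk` and `recptr_from_disk` are hypothetical names, and the field names `xlogid` and `xrecoff` are assumed here for the high and low halves.

```c
#include <stdint.h>

/*
 * Hypothetical on-disk form: two 32-bit fields, as the pointer was laid
 * out before it became a single 64-bit integer.
 */
typedef struct
{
	uint32_t	xlogid;		/* high 32 bits */
	uint32_t	xrecoff;	/* low 32 bits */
} OnDiskRecPtr;

/* Split an in-memory 64-bit pointer into the on-disk pair. */
OnDiskRecPtr
recptr_to_disk(uint64_t ptr)
{
	OnDiskRecPtr d;

	d.xlogid = (uint32_t) (ptr >> 32);
	d.xrecoff = (uint32_t) (ptr & 0xFFFFFFFFu);
	return d;
}

/* Rebuild the in-memory 64-bit pointer from the on-disk pair. */
uint64_t
recptr_from_disk(OnDiskRecPtr d)
{
	return ((uint64_t) d.xlogid << 32) | (uint64_t) d.xrecoff;
}
```

Storing a raw 64-bit integer instead of the pair is not equivalent on disk: on a little-endian machine the low half would come first, and the alignment requirement of the field can differ, which is the kind of change that breaks pg_upgrade.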
    • M
      Base the default SSL ciphers on DEFAULT instead of ALL · bba486f3
      Magnus Hagander committed
      It's better to start from what the OpenSSL people consider a good
      default and then remove insecure things (low encryption, exportable
      encryption and md5 at this point) from that, instead of starting
      from everything that exists and remove from that. We trust the
      OpenSSL people to make good choices about what the default is.
    • M
      Make size-output fixed length in pg_basebackup verbose mode · 4eebf130
      Magnus Hagander committed
      This way the line doesn't shift right as the amount of data processed
      increases.
    • M
      Truncate filenames in the leading end in pg_basebackup verbose output · d7e9ca7f
      Magnus Hagander committed
      When truncating at the end, like before, the output would often end up
      just showing the path instead of the filename.
      
      Also increase the length of the filename by 5, which still keeps us at
      less than 80 characters in most outputs.
    • M
      Support multiple -t/--table arguments for more commands · f3af5344
      Magnus Hagander committed
      On top of the previous support in pg_dump, add support to specify
      multiple tables (by using the -t option multiple times) to
      pg_restore, clusterdb, reindexdb and vacuumdb.
      
      Josh Kupershmidt, reviewed by Karl O. Pinc
    • P
      Get rid of pg_dump's README · 36bdfa52
      Peter Eisentraut committed
      It was largely full of outdated and incorrect information.  Move the few
      notes which were still relevant into header comments of pg_backup_tar.c
      and pg_dumpall.c.
      
      Josh Kupershmidt
    • A
      Split out XLog reading as an independent facility · 7fcbf6a4
      Alvaro Herrera committed
      This new facility can not only be used by xlog.c to carry out crash
      recovery, but also by external programs.  By supplying a function to
      read XLog pages from somewhere, all the WAL reading can be used for
      completely different purposes.
      
      For the standard backend use, the behavior should be pretty much the
      same as previously.  As for non-backend programs, a hypothetical
      pg_xlogdump program is now closer to reality, but some more backend
      support is still necessary.
      
      This patch was originally submitted by Andres Freund in a different
      form, but Heikki Linnakangas opted for and authored another design of
      the concept.  Andres has advanced the patch since Heikki's initial
      version.  Review and some (mostly cosmetic) changes by me.
  4. 16 Jan, 2013 3 commits
    • H
      Make \? help message more clear when not connected. · 8606dd81
      Heikki Linnakangas committed
      On second thought, "none" could mislead you into thinking that you're
      connected to a database with that name. Duplicate the whole string, so that
      it can be more easily translated. In back-branches, though, just use an empty
      string in place of the database name, to avoid adding a translatable string.
    • H
      Don't pass NULL to fprintf, if not currently connected to a database. · b04ce529
      Heikki Linnakangas committed
      Backpatch all the way to 8.3. Fixes bug #7811, per report and diagnosis by
      Meng Qingzhong.
    • A
      Rework order of checks in ALTER / SET SCHEMA · 7ac5760f
      Alvaro Herrera committed
      When attempting to move an object into the schema in which it already
      was, for most object classes we were correctly complaining about
      exactly that ("object is already in schema"); but for some other object
      classes, such as functions, we were instead complaining of a name
      collision ("object already exists in schema").  The latter is wrong and
      misleading, per complaint from Robert Haas in
      CA+TgmoZ0+gNf7RDKRc3u5rHXffP=QjqPZKGxb4BsPz65k7qnHQ@mail.gmail.com
      
      To fix, refactor the way these checks are done.  As a bonus, the
      resulting code is smaller and can also share some code with Rename
      cases.
      
      While at it, remove use of getObjectDescriptionOids() in error messages.
      These are normally disallowed because of translatability considerations,
      but this one had slipped through since 9.1.  (Not sure that this is
      worth backpatching, though, as it would create some untranslated
      messages in back branches.)
      
      This is loosely based on a patch by KaiGai Kohei, heavily reworked by
      me.
  5. 15 Jan, 2013 6 commits
  6. 14 Jan, 2013 5 commits
    • A
      Remove spurious space · 692079e5
      Alvaro Herrera committed
      Andres Freund
    • T
      Prevent very-low-probability PANIC during PREPARE TRANSACTION. · 2065dd28
      Tom Lane committed
      The code in PostPrepare_Locks supposed that it could reassign locks to
      the prepared transaction's dummy PGPROC by deleting the PROCLOCK table
      entries and immediately creating new ones.  This was safe when that code
      was written, but since we invented partitioning of the shared lock table,
      it's not safe --- another process could steal away the PROCLOCK entry in
      the short interval when it's on the freelist.  Then, if we were otherwise
      out of shared memory, PostPrepare_Locks would have to PANIC, since it's
      too late to back out of the PREPARE at that point.
      
      Fix by inventing a dynahash.c function to atomically update a hashtable
      entry's key.  (This might possibly have other uses in future.)
      
      This is an ancient bug that in principle we ought to back-patch, but the
      odds of someone hitting it in the field seem really tiny, because (a) the
      risk window is small, and (b) nobody runs servers with maxed-out lock
      tables for long, because they'll be getting non-PANIC out-of-memory errors
      anyway.  So fixing it in HEAD seems sufficient, at least until the new
      code has gotten some testing.
    • P
      Make spelling more uniform · 9d2cd99a
      Peter Eisentraut committed
    • T
      Update comments for elog_start(). · 24dd0502
      Tom Lane committed
      Forgot I was going to do this as part of the previous patch ...
    • T
      Improve handling of ereport(ERROR) and elog(ERROR). · b853eb97
      Tom Lane committed
      In commit 71450d7f, we added code to inform
      suitably-intelligent compilers that ereport() doesn't return if the elevel
      is ERROR or higher.  This patch extends that to elog(), and also fixes a
      double-evaluation hazard that the previous commit created in ereport(),
      as well as reducing the emitted code size.
      
      The elog() improvement requires the compiler to support __VA_ARGS__, which
      should be available in just about anything nowadays since it's required by
      C99.  But our minimum language baseline is still C89, so add a configure
      test for that.
      
      The previous commit assumed that ereport's elevel could be evaluated twice,
      which isn't terribly safe --- there are already counterexamples in xlog.c.
      On compilers that have __builtin_constant_p, we can use that to protect the
      second test, since there's no possible optimization gain if the compiler
      doesn't know the value of elevel.  Otherwise, use a local variable inside
      the macros to prevent double evaluation.  The local-variable solution is
      inferior because (a) it leads to useless code being emitted when elevel
      isn't constant, and (b) it increases the optimization level needed for the
      compiler to recognize that subsequent code is unreachable.  But it seems
      better than not teaching non-gcc compilers about unreachability at all.
      
      Lastly, if the compiler has __builtin_unreachable(), we can use that
      instead of abort(), resulting in a noticeable code savings since no
      function call is actually emitted.  However, it seems wise to do this only
      in non-assert builds.  In an assert build, continue to use abort(), so that
      the behavior will be predictable and debuggable if the "impossible"
      happens.
      
      These changes involve making the ereport and elog macros emit do-while
      statement blocks not just expressions, which forces small changes in
      a few call sites.
      
      Andres Freund, Tom Lane, Heikki Linnakangas
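
The macro technique described in this commit can be sketched as below. This is a simplified illustration, not the actual ereport() or elog() definitions: `REPORT`, `report`, `UNREACHABLE` and the `ERROR` value of 20 are hypothetical stand-ins, `NDEBUG` is used here as an assumed marker for a non-assert build, and the stand-in `report` exits the process where the real code would transfer control elsewhere.

```c
#include <stdio.h>
#include <stdlib.h>

#define ERROR 20				/* hypothetical severity threshold */

int			report_calls = 0;	/* counts calls, for demonstration only */

/* Stand-in reporting function: never returns for ERROR and above. */
void
report(int elevel, const char *msg)
{
	report_calls++;
	fprintf(stderr, "level %d: %s\n", elevel, msg);
	if (elevel >= ERROR)
		exit(EXIT_FAILURE);
}

/* In assert-enabled builds keep abort() so the "impossible" is debuggable. */
#if defined(__GNUC__) && defined(NDEBUG)
#define UNREACHABLE() __builtin_unreachable()
#else
#define UNREACHABLE() abort()
#endif

#if defined(__GNUC__)
/*
 * __builtin_constant_p does not evaluate its argument, and the second
 * test only runs when elevel is a compile-time constant, so elevel is
 * evaluated exactly once at run time.
 */
#define REPORT(elevel, msg) \
	do { \
		report(elevel, msg); \
		if (__builtin_constant_p(elevel) && (elevel) >= ERROR) \
			UNREACHABLE(); \
	} while (0)
#else
/* Fallback: copy elevel into a local variable to avoid double evaluation. */
#define REPORT(elevel, msg) \
	do { \
		const int	elevel_ = (elevel); \
		report(elevel_, msg); \
		if (elevel_ >= ERROR) \
			UNREACHABLE(); \
	} while (0)
#endif
```

Because the macro expands to a do-while statement and not an expression, it can no longer be used where an expression is required, which matches the call-site adjustments the commit message mentions.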
  7. 12 Jan, 2013 2 commits
    • A
      Extend and improve use of EXTRA_REGRESS_OPTS. · 4ae5ee6c
      Andrew Dunstan committed
      This is now used by ecpg tests, and not clobbered by pg_upgrade
      tests. This change won't affect anything that doesn't set this
      environment variable, but will enable the buildfarm to control
      exactly what port regression test installs will be running on,
      and thus to detect possible rogue postmasters more easily.
      
      Backpatch to release 9.2 where EXTRA_REGRESS_OPTS was first used.
    • T
      Redesign the planner's handling of index-descent cost estimation. · 31f38f28
      Tom Lane committed
      Historically we've used a couple of very ad-hoc fudge factors to try to
      get the right results when indexes of different sizes would satisfy a
      query with the same number of index leaf tuples being visited.  In
      commit 21a39de5 I tweaked one of these
      fudge factors, with results that proved disastrous for larger indexes.
      Commit bf01e34b fudged it some more,
      but still with not a lot of principle behind it.
      
      What seems like a better way to address these issues is to explicitly model
      index-descent costs, since that's what's really at stake when considering
      different indexes with similar leaf-page-level costs.  We tried that once
      long ago, and found that charging random_page_cost per page descended
      through was way too much, because upper btree levels tend to stay in cache
      in real-world workloads.  However, there are still CPU costs to think about,
      and the previous fudge factors can be seen as a crude attempt to account
      for those costs.  So this patch replaces those fudge factors with explicit
      charges for the number of tuple comparisons needed to descend the index
      tree, plus a small charge per page touched in the descent.  The cost
      multipliers are chosen so that the resulting charges are in the vicinity of
      the historical (pre-9.2) fudge factors for indexes of up to about a million
      tuples, while not ballooning unreasonably beyond that, as the old fudge
      factor did (even more so in 9.2).
      
      To make this work accurately for btree indexes, add some code that allows
      extraction of the known root-page height from a btree.  There's no
      equivalent number readily available for other index types, but we can use
      the log of the number of index pages as an approximate substitute.
      
      This seems like too much of a behavioral change to risk back-patching,
      but it should improve matters going forward.  In 9.2 I'll just revert
      the fudge-factor change.
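
The shape of the explicit descent charge described in this commit can be sketched as follows. This is a rough illustration under stated assumptions: the function names are hypothetical, the number of comparisons is modelled as ceil(log2(N)) for N index tuples, and the per-page multiplier of 50 and the count of `tree_height + 1` pages are assumed values chosen for the example, not figures given in the commit message.

```c
/* ceil(log2(n)) for n >= 1, using integer arithmetic only. */
int
descent_comparisons(unsigned long long ntuples)
{
	int			count = 0;
	unsigned long long v;

	if (ntuples <= 1)
		return 0;
	for (v = ntuples - 1; v > 0; v >>= 1)
		count++;
	return count;
}

/* Assumed multiplier for the small per-page charge (illustrative only). */
#define PAGE_TOUCH_MULTIPLIER 50.0

/*
 * CPU cost of descending a btree: one operator charge per tuple
 * comparison, plus a small charge for each page touched on the way down.
 * No I/O cost is charged, since upper index levels tend to stay cached.
 */
double
btree_descent_cost(unsigned long long ntuples, int tree_height,
				   double cpu_operator_cost)
{
	double		cost;

	cost = descent_comparisons(ntuples) * cpu_operator_cost;
	cost += (tree_height + 1) * PAGE_TOUCH_MULTIPLIER * cpu_operator_cost;
	return cost;
}
```

With one million tuples, a tree height of 2 and an operator cost of 0.0025, this sketch charges 20 comparisons plus three page touches, about 0.425 in total. The charge grows only logarithmically with index size, which is the property the commit message is after.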
  8. 11 Jan, 2013 1 commit
    • T
      Last-gasp attempt to save libperl.so configure probe. · e1b735ae
      Tom Lane committed
      I notice that plperl's makefile adds the -I for $perl_archlibexp/CORE
      at the end of CPPFLAGS not the beginning.  It seems somewhat unlikely
      that the include search order has anything to do with why buildfarm
      member okapi is failing, but I'm about out of other ideas.
  9. 10 Jan, 2013 1 commit
    • T
      Test linking libperl.so using only Perl's required libraries. · 9d5a160c
      Tom Lane committed
      It appears that perl_embed_ldflags should already mention all the libraries
      that are required by libperl.so itself.  So let's try the test link with
      just those and not the other LIBS we've found up to now.  This should
      more nearly reproduce what will happen when plperl is linked, and perhaps
      will fix buildfarm member okapi's problem.