1. 07 Feb, 2012 2 commits
    • T
      Add locking around WAL-replay modification of shared-memory variables. · c6d76d7c
      Committed by Tom Lane
      Originally, most of this code assumed that no Postgres backends could be
      running concurrently with it, and so no locking could be needed.  That
      assumption fails in Hot Standby.  While it's still true that Hot Standby
      backends should never change values like nextXid, they can examine them,
      and consistency is important in some cases such as when computing a
      snapshot.  Therefore, prudence requires that WAL replay code obtain the
      relevant locks when modifying such variables, even though it can examine
      them without taking a lock.  We were following that coding rule in some
      places but not all.  This commit applies the coding rule uniformly to all
      updates of ShmemVariableCache and MultiXactState fields; a search of the
      replay routines did not find any other cases that seemed to be at risk.
      
      In addition, this commit fixes a longstanding thinko in replay of NEXTOID
      and checkpoint records: we tried to advance nextOid only if it was behind
      the value in the WAL record, but the comparison would draw the wrong
      conclusion if OID wraparound had occurred since the previous value.
      Better to just unconditionally assign the new value, since OID assignment
      shouldn't be happening during replay anyway.
      
      The additional locking seems to be more in the nature of future-proofing
      than fixing any live bug, so I am not going to back-patch it.  The NEXTOID
      fix will be back-patched separately.
      c6d76d7c
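The NEXTOID problem described above can be illustrated with a minimal sketch. The function names are hypothetical and this is a model of the two policies, not the PostgreSQL replay code; it only assumes that OIDs are 32-bit unsigned counters that wrap around.

```c
#include <stdint.h>

typedef uint32_t Oid;   /* assumption: OIDs modeled as 32-bit unsigned values */

/*
 * Hypothetical model of the old policy: advance the counter only if it
 * appears to be behind the value carried by the WAL record.
 */
Oid
replay_nextoid_conditional(Oid current, Oid from_wal)
{
    if (current < from_wal)
        return from_wal;
    return current;     /* wrong choice if the counter wrapped around */
}

/*
 * Hypothetical model of the new policy: always adopt the value from the
 * WAL record, since no OIDs are assigned locally during replay.
 */
Oid
replay_nextoid_unconditional(Oid current, Oid from_wal)
{
    (void) current;     /* the previous value is deliberately ignored */
    return from_wal;
}
```

With a stale counter close to the 32-bit maximum and a post-wraparound record value such as 20000, the conditional version keeps the stale value, while the unconditional version adopts the record's value.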
    • R
      Remove dead declaration. · 96abd817
      Committed by Robert Haas
      96abd817
  2. 06 Feb, 2012 2 commits
    • A
      fe-misc.c depends on pg_config_paths.h · 0c88086d
      Committed by Alvaro Herrera
      Declare this in Makefile to avoid failures in parallel compiles.
      
      Author: Lionel Elie Mamane
      0c88086d
    • T
      Fix transient clobbering of shared buffers during WAL replay. · 17118825
      Committed by Tom Lane
      RestoreBkpBlocks was in the habit of zeroing and refilling the target
      buffer, which was perfectly safe when the code was written, but is unsafe
      during Hot Standby operation.  The reason is that we have coding rules
      that allow backends to continue accessing a tuple in a heap relation while
      holding only a pin on its buffer.  Such a backend could see transiently
      zeroed data, if WAL replay had occasion to change other data on the page.
      This has been shown to be the cause of bug #6425 from Duncan Rance (who
      deserves kudos for developing a sufficiently-reproducible test case) as
      well as Bridget Frey's re-report of bug #6200.  It most likely explains the
      original report as well, though we don't yet have confirmation of that.
      
      To fix, change the code so that only bytes that are supposed to change will
      change, even transiently.  This actually saves cycles in RestoreBkpBlocks,
      since it's not writing the same bytes twice.
      
      Also fix seq_redo, which has the same disease, though it has to work a bit
      harder to meet the requirement.
      
      So far as I can tell, no other WAL replay routines have this type of bug.
      In particular, the index-related replay routines, which would certainly be
      broken if they had to meet the same standard, are not at risk because we
      do not have coding rules that allow access to an index page when not
      holding a buffer lock on it.
      
      Back-patch to 9.0 where Hot Standby was added.
      17118825
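The idea of changing only the bytes that are supposed to change can be sketched as follows. This is a simplified model, not the actual RestoreBkpBlocks code: the function name and parameters are hypothetical, and it assumes the saved page image omits a "hole" of bytes that must read as zero in the restored page.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch: restore a page from a saved image that omits a hole
 * of hole_length bytes starting at hole_offset.  Instead of zeroing the
 * whole page and then refilling it, each byte is written exactly once with
 * its final value, so a concurrent reader never sees data that was zeroed
 * only transiently.
 */
void
restore_page_image(char *page, size_t page_size,
                   const char *image, size_t hole_offset, size_t hole_length)
{
    size_t tail_start = hole_offset + hole_length;

    /* bytes before the hole come straight from the image */
    memcpy(page, image, hole_offset);
    /* only the hole itself is zero-filled */
    memset(page + hole_offset, 0, hole_length);
    /* bytes after the hole follow the head in the image */
    memcpy(page + tail_start, image + hole_offset, page_size - tail_start);
}
```

Besides avoiding the transient zeroing, this shape also avoids writing the non-hole bytes twice, which matches the cycle savings mentioned in the commit message.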
  3. 05 Feb, 2012 4 commits
  4. 04 Feb, 2012 3 commits
  5. 03 Feb, 2012 2 commits
    • P
      ecpg: Improve test building · 69e9768e
      Committed by Peter Eisentraut
      Further improve on commit c75e1436.
      Instead of building both .o files and binaries in the same make rule,
      just rely on the normal .c -> .o rule.  This will ensure that
      dependency tracking is used when enabled.  To do this, disable the
      implicit direct .c -> binary rule globally, which will also prevent
      the original problem (*.dSYM junk) from reappearing elsewhere.
      69e9768e
    • R
      Allow spgist's text_ops to handle pattern-matching operators. · 0ed7445d
      Committed by Robert Haas
      This was presumably intended to work this way all along, but a few key
      bits of indxpath.c didn't get the memo.
      
      Robert Haas and Tom Lane
      0ed7445d
  6. 02 Feb, 2012 6 commits
    • R
      Avoid re-checking for visibility map extension too frequently. · b4e07417
      Committed by Robert Haas
      When testing bits (but not when setting or clearing them), we now
      won't check whether the map has been extended.  This significantly
      improves performance in the case where the visibility map doesn't
      exist yet, by avoiding an extra system call per tuple.  To make
      sure backends notice eventually, send an smgr inval on VM extension.
      
      Dean Rasheed, with minor modifications by me.
      b4e07417
    • P
      initdb: Add options --auth-local and --auth-host · 8a02339e
      Committed by Peter Eisentraut
      reviewed by Robert Haas and Pavel Stehule
      8a02339e
    • P
      psql: Case preserving completion of SQL key words · 69f4f1c3
      Committed by Peter Eisentraut
      Instead of always completing SQL key words in upper case, look at the
      word being completed and match the case.
      
      reviewed by Fujii Masao
      69f4f1c3
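A minimal sketch of case-preserving keyword completion follows. The helper name is hypothetical, and the rule used here (follow the case of the first typed character) is an assumption made for illustration; the commit message does not spell out the exact rule psql applies.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: write `keyword` into `out`, in lower case if the
 * text typed so far starts with a lower-case letter, otherwise in upper
 * case.  The result is always NUL-terminated when out_size > 0.
 */
void
match_keyword_case(const char *typed, const char *keyword,
                   char *out, size_t out_size)
{
    int     lower = islower((unsigned char) typed[0]);
    size_t  i;

    if (out_size == 0)
        return;

    for (i = 0; keyword[i] != '\0' && i + 1 < out_size; i++)
    {
        unsigned char c = (unsigned char) keyword[i];

        out[i] = (char) (lower ? tolower(c) : toupper(c));
    }
    out[i] = '\0';
}
```

Under this assumed rule, typing `sel` would complete to `select`, while typing `SEL` would complete to `SELECT`.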
    • T
      Add some regression test cases for denormalized float8 input. · 500cf66d
      Committed by Tom Lane
      This was submitted with the previous patch, but I'm committing it
      separately to ease backing it out if these results prove too unportable.
      
      Marti Raudsepp, after a proposal by Jeroen Vermeulen
      500cf66d
    • T
      Try to be more consistent about accepting denormalized float8 numbers. · c318aeed
      Committed by Tom Lane
      On some platforms, strtod() reports ERANGE for a denormalized value (ie,
      one that can be represented as distinct from zero, but is too small to have
      full precision).  On others, it doesn't.  It seems better to try to accept
      these values consistently, so add a test to see if the result value
      indicates a true out-of-range condition.  This should be okay per Single
      Unix Spec.  On machines where the underlying math isn't IEEE standard, the
      behavior for such small numbers may not be very consistent, but then it
      wouldn't be anyway.
      
      Marti Raudsepp, after a proposal by Jeroen Vermeulen
      c318aeed
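The test described above can be sketched roughly as follows. The function name and its boolean error convention are hypothetical; the point is only to show how an ERANGE report from strtod() can be narrowed down to a true out-of-range result, so that denormalized values are accepted whether or not the platform flags them.

```c
#include <errno.h>
#include <math.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical sketch: parse a double, treating ERANGE as an error only
 * when the returned value shows a real overflow (+/- HUGE_VAL) or an
 * underflow all the way to zero.  A denormalized result is accepted even
 * if strtod() set ERANGE for it.
 */
bool
parse_float8(const char *str, double *result)
{
    char   *end;
    double  val;

    errno = 0;
    val = strtod(str, &end);

    if (end == str)
        return false;           /* no digits were consumed */

    if (errno == ERANGE &&
        (val == 0.0 || val >= HUGE_VAL || val <= -HUGE_VAL))
        return false;           /* true out-of-range condition */

    *result = val;
    return true;
}
```

On an IEEE platform an input such as `1e-310` yields a small nonzero denormal and is accepted, while `1e400` overflows and is rejected.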
    • A
      Implement dry-run mode for pg_archivecleanup · b2e431a4
      Committed by Alvaro Herrera
      In dry-run mode, just the name of the file to be removed is printed to
      stdout; this is so the user can easily plug it into another program
      through a pipe.  If debug mode is also specified, a more verbose message
      is printed to stderr.
      
      Author: Gabriele Bartolini
      Reviewer: Josh Kupershmidt
      b2e431a4
  7. 01 Feb, 2012 7 commits
  8. 31 Jan, 2012 6 commits
  9. 30 Jan, 2012 8 commits
    • H
      Make group commit more effective. · 9b38d46d
      Committed by Heikki Linnakangas
      When a backend needs to flush the WAL, and someone else is already flushing
      the WAL, wait until it releases the WALInsertLock and check if we still need
      to do the flush or if the other backend already did the work for us, before
      acquiring WALInsertLock. This helps group commit, because when the WAL flush
      finishes, all the backends that were waiting for it can be woken up in one
      go, and they can all concurrently observe that they're done, rather than
      waking them up one by one in a cascading fashion.
      
      This is based on a new LWLock function, LWLockWaitUntilFree(), which has
      peculiar semantics. If the lock is immediately free, it grabs the lock and
      returns true. If it's not free, it waits until it is released, but then
      returns false without grabbing the lock. This is used in XLogFlush(), so
      that when the lock is acquired, the backend flushes the WAL, but if it's
      not, the backend first checks the current flush location before retrying.
      
      Original patch and benchmarking by Peter Geoghegan and Simon Riggs, although
      this patch as committed ended up being very different from that.
      9b38d46d
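The semantics described for LWLockWaitUntilFree() can be modeled with a toy lock built on POSIX threads. This is an illustrative sketch only: ToyLock and the toy_lock_* functions are hypothetical names, and the real LWLock implementation is not based on pthread mutexes and condition variables.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical toy lock, used only to model the described semantics. */
typedef struct
{
    pthread_mutex_t mutex;      /* protects 'held' */
    pthread_cond_t  released;   /* signaled whenever the lock is released */
    bool            held;
} ToyLock;

void
toy_lock_init(ToyLock *lock)
{
    pthread_mutex_init(&lock->mutex, NULL);
    pthread_cond_init(&lock->released, NULL);
    lock->held = false;
}

/*
 * If the lock is free, take it and return true.  Otherwise wait until it
 * is released, then return false WITHOUT taking it, so the caller can
 * re-check whether the work it wanted to do has already been done.
 */
bool
toy_lock_wait_until_free(ToyLock *lock)
{
    bool acquired;

    pthread_mutex_lock(&lock->mutex);
    if (!lock->held)
    {
        lock->held = true;
        acquired = true;
    }
    else
    {
        while (lock->held)
            pthread_cond_wait(&lock->released, &lock->mutex);
        acquired = false;
    }
    pthread_mutex_unlock(&lock->mutex);
    return acquired;
}

void
toy_lock_release(ToyLock *lock)
{
    pthread_mutex_lock(&lock->mutex);
    lock->held = false;
    /* wake every waiter at once, rather than one at a time */
    pthread_cond_broadcast(&lock->released);
    pthread_mutex_unlock(&lock->mutex);
}
```

In a flush routine shaped like the one described, a caller would loop: when the call returns true it performs the flush and releases the lock; when it returns false it first checks the current flush location and retries only if its own data is still unflushed.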
    • S
    • S
      73f617f1
    • H
      Accept a non-existent value in "ALTER USER/DATABASE SET ..." command. · a5782570
      Committed by Heikki Linnakangas
      When default_text_search_config, default_tablespace, or temp_tablespaces
      setting is set per-user or per-database, with an "ALTER USER/DATABASE SET
      ..." statement, don't throw an error if the text search configuration or
      tablespace does not exist. In case of text search configuration, even if
      it doesn't exist in the current database, it might exist in another
      database, where the setting is intended to have its effect. This behavior
      is now the same as search_path's.
      
      Tablespaces are cluster-wide, so the same argument doesn't hold for
      tablespaces, but there's a problem with pg_dumpall: it dumps "ALTER USER
      SET ..." statements before the "CREATE TABLESPACE" statements. Arguably
      that's pg_dumpall's fault - it should dump the statements in such an order
      that the tablespace is created first and then the "ALTER USER SET
      default_tablespace ..." statements after that - but it seems better to be
      consistent with search_path and default_text_search_config anyway. Besides,
      you could still create a dump that throws an error, by creating the
      tablespace, running "ALTER USER SET default_tablespace", then dropping the
      tablespace and running pg_dumpall on that.
      
      Backpatch to all supported versions.
      a5782570
    • T
      Assorted comment fixes, mostly just typos, but some obsolete statements. · ad10853b
      Committed by Tom Lane
      YAMAMOTO Takashi
      ad10853b
    • T
      Fix typo in comment. · dd243b3e
      Committed by Tom Lane
      Peter Geoghegan
      dd243b3e
    • T
      Tweak index costing for problems with partial indexes. · 21a39de5
      Committed by Tom Lane
      btcostestimate() makes an estimate of the number of index tuples that will
      be visited based on knowledge of which index clauses can actually bound the
      scan within nbtree.  However, it forgot to account for partial indexes in
      this calculation, with the result that the cost of the index scan could be
      significantly overestimated for a partial index.  Fix that by merging the
      predicate with the abbreviated indexclause list, in the same way as we do
      with the full list to estimate how many heap tuples will be visited.
      
      Also, slightly increase the "fudge factor" that's meant to give preference
      to smaller indexes over larger ones.  While this is applied to all indexes,
      it's most important for partial indexes since it can be the only factor
      that makes a partial index look cheaper than a similar full index.
      Experimentation shows that the existing value is so small as to easily get
      swamped by noise such as page-boundary-roundoff behavior.  I'm tempted to
      kick it up more than this, but will refrain for now.
      
      Per report from Ruben Blanco.  These are long-standing issues, but given
      the lack of prior complaints I'm not going to risk changing planner
      behavior in back branches by back-patching.
      21a39de5
    • T
      Fix pushing of index-expression qualifications through UNION ALL. · b28ffd0f
      Committed by Tom Lane
      In commit 57664ed2, I made the planner
      wrap non-simple-variable outputs of appendrel children (IOW, child SELECTs
      of UNION ALL subqueries) inside PlaceHolderVars, in order to solve some
      issues with EquivalenceClass processing.  However, this means that any
      upper-level WHERE clauses mentioning such outputs will now contain
      PlaceHolderVars after they're pushed down into the appendrel child,
      and that prevents indxpath.c from recognizing that they could be matched
      to index expressions.  To fix, add explicit stripping of PlaceHolderVars
      from index operands, same as we have long done for RelabelType nodes.
      Add a regression test covering both this and the plain-UNION case (which
      is a totally different code path, but should also be able to do it).
      
      Per bug #6416 from Matteo Beccati.  Back-patch to 9.1, same as the
      previous change.
      b28ffd0f