1. 31 December 2009, 1 commit
    • Revise pgstat's tracking of tuple changes to improve the reliability of · 48c192c1
      Committed by Tom Lane
      decisions about when to auto-analyze.
      
      The previous code depended on n_live_tuples + n_dead_tuples - last_anl_tuples,
      where all three of these numbers could be bad estimates from ANALYZE itself.
      Even worse, in the presence of a steady flow of HOT updates and matching
      HOT-tuple reclamations, auto-analyze might never trigger at all, even if all
      three numbers are exactly right, because n_dead_tuples could hold steady.
      
      To fix, replace last_anl_tuples with an accurately tracked count of the total
      number of committed tuple inserts + updates + deletes since the last ANALYZE
      on the table.  This can still be compared to the same threshold as before, but
      it's much more trustworthy than the old computation.  Tracking this requires
      one more intra-transaction counter per modified table within backends, but no
      additional memory space in the stats collector.  There probably isn't any
      measurable speed difference; if anything it might be a bit faster than before,
      since I was able to eliminate some per-tuple arithmetic operations in favor of
      adding sums once per (sub)transaction.
      
      Also, simplify the logic around pgstat vacuum and analyze reporting messages
      by not trying to fold VACUUM ANALYZE into a single pgstat message.
      
      The original thought behind this patch was to allow scheduling of analyzes
      on parent tables by artificially inflating their changes_since_analyze count.
      I've left that for a separate patch since this change seems to stand on its
      own merit.
      48c192c1
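
      A minimal sketch of the comparison described above, assuming a hypothetical
      table named foo; this approximates the documented autovacuum analyze threshold
      from the GUC settings and pg_class.reltuples rather than showing the stats
      collector's internal code:

      -- approximate analyze threshold =
      --   autovacuum_analyze_threshold + autovacuum_analyze_scale_factor * reltuples
      SELECT s.relname,
             s.n_live_tup,
             s.n_dead_tup,
             current_setting('autovacuum_analyze_threshold')::integer
               + current_setting('autovacuum_analyze_scale_factor')::float8 * c.reltuples
               AS approx_analyze_threshold
      FROM pg_stat_user_tables s
      JOIN pg_class c ON c.oid = s.relid
      WHERE s.relname = 'foo';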
  2. 30 December 2009, 3 commits
    • Add an index on pg_inherits.inhparent, and use it to avoid seqscans in · 540e69a0
      Committed by Tom Lane
      find_inheritance_children().  This is a complete no-op in databases without
      any inheritance.  In databases where there are just a few entries in
      pg_inherits, it could conceivably be a small loss.  However, in databases with
      many inheritance parents, it can be a big win.
      540e69a0
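
      The lookup that find_inheritance_children() performs is roughly equivalent to
      the SQL below (parent_tbl is a hypothetical name); the new index lets it be
      answered without a sequential scan of pg_inherits:

      -- direct children of a given parent, now served by the index on inhparent
      SELECT inhrelid::regclass AS child
      FROM pg_inherits
      WHERE inhparent = 'parent_tbl'::regclass;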
    • Add the ability to store inheritance-tree statistics in pg_statistic, · 649b5ec7
      Committed by Tom Lane
      and teach ANALYZE to compute such stats for tables that have subclasses.
      Per my proposal of yesterday.
      
      autovacuum still needs to be taught about running ANALYZE on parent tables
      when their subclasses change, but the feature is useful even without that.
      649b5ec7
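
      A sketch of how the new whole-tree statistics can be examined, assuming a
      hypothetical parent table parent_tbl that has children; the inheritance-tree
      entries are stored separately from the parent-only ones and are marked by the
      new inherited flag:

      ANALYZE parent_tbl;

      SELECT tablename, attname, inherited, n_distinct
      FROM pg_stats
      WHERE tablename = 'parent_tbl';
      -- rows with inherited = true describe the whole inheritance tree;
      -- rows with inherited = false describe only the parent's own data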
    • Previous fix for temporary file management broke returning a set from · 84d723b6
      Committed by Heikki Linnakangas
      PL/pgSQL function within an exception handler. Make sure we use the right
      resource owner when we create the tuplestore to hold returned tuples.
      
      Simplify tuplestore API so that the caller doesn't need to be in the right
      memory context when calling tuplestore_put* functions. tuplestore.c
      automatically switches to the memory context used when the tuplestore was
      created. Tuplesort was already modified like this earlier. This patch also
      removes the now useless MemoryContextSwitch calls from callers.
      
      Report by Aleksei on pgsql-bugs on Dec 22 2009. Backpatch to 8.1, like
      the previous patch that broke this.
      84d723b6
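
      A hypothetical example of the pattern that was broken: a set-returning PL/pgSQL
      function whose result tuplestore is first created inside an exception block,
      i.e. inside a subtransaction with its own resource owner:

      CREATE FUNCTION demo_srf() RETURNS SETOF integer LANGUAGE plpgsql AS $$
      BEGIN
          BEGIN
              -- the tuplestore holding the result set is created here,
              -- inside the exception handler's subtransaction
              RETURN QUERY SELECT g FROM generate_series(1, 3) AS g;
          EXCEPTION WHEN division_by_zero THEN
              RETURN;
          END;
      END;
      $$;

      SELECT * FROM demo_srf();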
  3. 29 December 2009, 2 commits
  4. 27 December 2009, 1 commit
  5. 25 December 2009, 1 commit
    • Binary upgrade: · c44327af
      Committed by Bruce Momjian
      Modify pg_dump --binary-upgrade and add backend support routines to
      support the preservation of pg_type oids when doing a binary upgrade.
      This allows user-defined composite types and arrays to be binary
      upgraded.
      c44327af
  6. 24 December 2009, 1 commit
    • Remove code that attempted to rename index columns to keep them in sync with · c176e122
      Committed by Tom Lane
      their underlying table columns.  That code was not bright enough to cope with
      collision situations (ie, new name conflicts with some other column of the
      index).  Since there is no functional reason to do this at all, trying to
      upgrade the logic to be bulletproof doesn't seem worth the trouble.
      
      This change means that both the index name and the column names of an index
      are set when it's created, and won't be automatically changed when the
      underlying table columns are renamed.  Neatnik DBAs are still free to rename
      them manually, of course.
      c176e122
  7. 23 December 2009, 3 commits
    • Always pass catalog id to the options validator function specified in · 4e766f2d
      Committed by Heikki Linnakangas
      CREATE FOREIGN DATA WRAPPER. Arguably it wasn't a bug because the
      documentation said that it's passed the catalog ID or zero, but surely
      we should provide it when it's known. And there isn't currently any
      scenario where it's not known, and I can't imagine having one in the
      future either, so better remove the "or zero" escape hatch and always
      pass a valid catalog ID. Backpatch to 8.4.
      
      Martin Pihlak
      4e766f2d
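
      For reference, a hypothetical validator declaration; an options validator takes
      the option array plus the OID of the catalog the options belong to, and after
      this change that second argument is always a valid catalog OID:

      -- names and library path are placeholders
      CREATE FUNCTION my_fdw_validator(text[], oid) RETURNS void
          AS '$libdir/my_fdw', 'my_fdw_validator' LANGUAGE C STRICT;

      CREATE FOREIGN DATA WRAPPER my_fdw VALIDATOR my_fdw_validator;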
    • Adjust naming of indexes and their columns per recent discussion. · cfc5008a
      Committed by Tom Lane
      Index expression columns are now named after the FigureColname result for
      their expressions, rather than always being "pg_expression_N".  Digits are
      appended to this name if needed to make the column name unique within the
      index.  (That happens for regular columns too, thus fixing the old problem
      that CREATE INDEX fooi ON foo (f1, f1) fails.  Before exclusion indexes
      there was no real reason to do such a thing, but now maybe there is.)
      
      Default names for indexes and associated constraints now include the column
      names of all their columns, not only the first one as in previous practice.
      (Of course, this will be truncated as needed to fit in NAMEDATALEN.  Also,
      pkey indexes retain the historical behavior of not naming specific columns
      at all.)
      
      An example of the results:
      
      regression=# create table foo (f1 int, f2 text,
      regression(# exclude (f1 with =, lower(f2) with =));
      NOTICE:  CREATE TABLE / EXCLUDE will create implicit index "foo_f1_lower_exclusion" for table "foo"
      CREATE TABLE
      regression=# \d foo_f1_lower_exclusion
      Index "public.foo_f1_lower_exclusion"
       Column |  Type   | Definition
      --------+---------+------------
       f1     | integer | f1
       lower  | text    | lower(f2)
      btree, for table "public.foo"
      cfc5008a
    • Disallow comments on columns of relation types other than tables, views, · b7d67954
      Committed by Tom Lane
      and composite types, which are the only relkinds for which pg_dump support
      exists for dumping column comments.  There is no obvious usefulness for
      comments on columns of sequences or toast tables; and while comments on
      index columns might have some value, it's not worth the risk of compatibility
      problems due to possible changes in the algorithm for assigning names to
      index columns.  Per discussion.
      
      In consequence, remove now-dead code for copying such comments in CREATE TABLE
      LIKE.
      b7d67954
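
      In concrete terms (object names are hypothetical), column comments stay valid
      for tables, views, and composite types but are now rejected for other relkinds
      such as indexes:

      COMMENT ON COLUMN my_table.f1 IS 'still allowed';
      COMMENT ON COLUMN my_view.f1 IS 'still allowed';
      COMMENT ON COLUMN my_index.f1 IS 'no longer allowed';   -- now raises an error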
  8. 21 December 2009, 1 commit
  9. 19 December 2009, 2 commits
    • Allow read only connections during recovery, known as Hot Standby. · efc16ea5
      Committed by Simon Riggs
      Enabled by recovery_connections = on (default) and forcing archive recovery using a recovery.conf. Recovery processing now emulates the original transactions as they are replayed, providing full locking and MVCC behaviour for read only queries. Recovery must enter consistent state before connections are allowed, so there is a delay, typically short, before connections succeed. Replay of recovering transactions can conflict and in some cases deadlock with queries during recovery; these result in query cancellation after max_standby_delay seconds have expired. Infrastructure changes have minor effects on normal running, though introduce four new types of WAL record.
      
      New test mode "make standbycheck" allows regression tests of static command behaviour on a standby server while in recovery. Typical and extreme dynamic behaviours have been checked via code inspection and manual testing. Few port specific behaviours have been utilised, though primary testing has been on Linux only so far.
      
      This commit is the basic patch. Additional changes will follow in this release to enhance some aspects of behaviour, notably improved handling of conflicts, deadlock detection and query cancellation. Changes to VACUUM FULL are also required.
      
      Simon Riggs, with significant and lengthy review by Heikki Linnakangas, including streamlined redesign of snapshot creation and two-phase commit.
      
      Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback from many other community members.
      efc16ea5
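
      A minimal standby configuration sketch using the settings named above (the
      archive path and delay value are illustrative only):

      # postgresql.conf on the standby
      recovery_connections = on     # default; permits read-only connections in recovery
      max_standby_delay = 30        # seconds before conflicting queries are cancelled

      # recovery.conf, forcing archive recovery
      restore_command = 'cp /path/to/archive/%f %p'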
    • binary migration: pg_migrator · 78a09145
      Committed by Bruce Momjian
      Add comments about places where system oids have to be preserved for
      binary migration.
      78a09145
  10. 17 December 2009, 1 commit
    • Several fixes for EXPLAIN (FORMAT YAML), plus one for EXPLAIN (FORMAT JSON). · ff499613
      Committed by Robert Haas
      ExplainSeparatePlans() was busted for both JSON and YAML output - the present
      code is a holdover from the original version of my machine-readable explain
      patch, which didn't have the grouping_stack machinery.  Also, fix an odd
      distribution of labor between ExplainBeginGroup() and ExplainYAMLLineStarting()
      when marking lists with "- ", with each providing one character.  This broke
      the output format for multi-query statements.  Also, fix ExplainDummyGroup()
      for the YAML output format.
      
      Along the way, make the YAML format use escape_yaml() in situations where the
      JSON format uses escape_json().  Right now, it doesn't matter because all the
      values are known not to need escaping, but it seems safer this way.  Finally,
      I added some comments to better explain what the YAML output format is doing.
      
      Greg Sabino Mullane reported the issues with multi-query statements.
      Analysis and remaining cleanups by me.
      ff499613
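
      For context, the affected output formats are selected through the EXPLAIN
      options syntax, for example (table name hypothetical):

      EXPLAIN (FORMAT YAML) SELECT * FROM foo;
      EXPLAIN (FORMAT JSON, ANALYZE) SELECT * FROM foo;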
  11. 15 December 2009, 1 commit
  12. 12 December 2009, 1 commit
    • Export ExplainBeginOutput() and ExplainEndOutput() for auto_explain. · 02490d46
      Committed by Robert Haas
      Without these functions, anyone outside of explain.c can't actually use
      ExplainPrintPlan, because the ExplainState won't be initialized properly.
      The user-visible result of this was a crash when using auto_explain with
      the JSON output format.
      
      Report by Euler Taveira de Oliveira.  Analysis by Tom Lane.  Patch by me.
      02490d46
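
      The crashing case corresponds roughly to the following session-level
      auto_explain setup (values are illustrative):

      LOAD 'auto_explain';
      SET auto_explain.log_min_duration = 0;     -- log every statement's plan
      SET auto_explain.log_format = 'json';      -- machine-readable format
      SELECT count(*) FROM pg_class;             -- plan is logged in JSON, no crash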
  13. 11 December 2009, 2 commits
  14. 10 December 2009, 1 commit
    • Prevent indirect security attacks via changing session-local state within · 62aba765
      Committed by Tom Lane
      an allegedly immutable index function.  It was previously recognized that
      we had to prevent such a function from executing SET/RESET ROLE/SESSION
      AUTHORIZATION, or it could trivially obtain the privileges of the session
      user.  However, since there is in general no privilege checking for changes
      of session-local state, it is also possible for such a function to change
      settings in a way that might subvert later operations in the same session.
      Examples include changing search_path to cause an unexpected function to
      be called, or replacing an existing prepared statement with another one
      that will execute a function of the attacker's choosing.
      
      The present patch secures VACUUM, ANALYZE, and CREATE INDEX/REINDEX against
      these threats, which are the same places previously deemed to need protection
      against the SET ROLE issue.  GUC changes are still allowed, since there are
      many useful cases for that, but we prevent security problems by forcing a
      rollback of any GUC change after completing the operation.  Other cases are
      handled by throwing an error if any change is attempted; these include temp
      table creation, closing a cursor, and creating or deleting a prepared
      statement.  (In 7.4, the infrastructure to roll back GUC changes doesn't
      exist, so we settle for rejecting changes of "search_path" in these contexts.)
      
      Original report and patch by Gurjeet Singh, additional analysis by
      Tom Lane.
      
      Security: CVE-2009-4136
      62aba765
  15. 07 December 2009, 1 commit
  16. 21 November 2009, 1 commit
    • Add a WHEN clause to CREATE TRIGGER, allowing a boolean expression to be · 7fc0f062
      Committed by Tom Lane
      checked to determine whether the trigger should be fired.
      
      For BEFORE triggers this is mostly a matter of spec compliance; but for AFTER
      triggers it can provide a noticeable performance improvement, since queuing of
      a deferred trigger event and re-fetching of the row(s) at end of statement can
      be short-circuited if the trigger does not need to be fired.
      
      Takahiro Itagaki, reviewed by KaiGai Kohei.
      7fc0f062
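
      A sketch of the new syntax with hypothetical table and function names; the
      AFTER trigger event is only queued when the WHEN condition evaluates to true,
      which is where the performance benefit comes from:

      CREATE TRIGGER log_balance_change
          AFTER UPDATE ON accounts
          FOR EACH ROW
          WHEN (OLD.balance IS DISTINCT FROM NEW.balance)
          EXECUTE PROCEDURE log_balance_change_fn();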
  17. 19 November 2009, 1 commit
  18. 17 November 2009, 1 commit
  19. 12 November 2009, 1 commit
    • Make initdb behave sanely when the selected locale has codeset "US-ASCII". · 8f8a5df6
      Committed by Tom Lane
      Per discussion, this should result in defaulting to SQL_ASCII encoding.
      The original coding could not support that because it conflated selection
      of SQL_ASCII encoding with not being able to determine the encoding.
      Adjust pg_get_encoding_from_locale()'s API to distinguish these cases,
      and fix callers appropriately.  Only initdb actually changes behavior,
      since the other callers were perfectly content to consider these cases
      equivalent.
      
      Per bug #5178 from Boh Yap.  Not going to bother back-patching, since
      no one has complained before and there's an easy workaround (namely,
      specify the encoding you want).
      8f8a5df6
  20. 11 November 2009, 2 commits
    • Revert the temporary patch to work around Snow Leopard readdir() bug. · 21e3edd6
      Committed by Tom Lane
      Apple has fixed that bug in 10.6.2, and we should encourage users to
      update to that version rather than trusting this cosmetic patch.
      As was recently noted by Stephen Tyler, this patch was only masking
      the problem in the context of DROP TABLESPACE, but the failure could
      occur in other places such as pg_xlog cleanup.
      21e3edd6
    • Fix longstanding problems in VACUUM caused by untimely interruptions · e7ec0222
      Committed by Alvaro Herrera
      In VACUUM FULL, an interrupt after the initial transaction has been recorded
      as committed can cause postmaster to restart with the following error message:
      PANIC: cannot abort transaction NNNN, it was already committed
      This problem has been reported many times.
      
      In lazy VACUUM, an interrupt after the table has been truncated by
      lazy_truncate_heap causes other backends' relcache to still point to the
      removed pages; this can cause future INSERT and UPDATE queries to error out
      with the following error message:
      could not read block XX of relation 1663/NNN/MMMM: read only 0 of 8192 bytes
      The window to this race condition is extremely narrow, but it has been seen in
      the wild involving a cancelled autovacuum process.
      
      The solution for both problems is to inhibit interrupts in both operations
      until after the respective transactions have been committed.  It's not a
      complete solution, because the transaction could theoretically be aborted by
      some other error, but at least fixes the most common causes of both problems.
      e7ec0222
  21. 07 November 2009, 1 commit
  22. 06 November 2009, 1 commit
  23. 05 November 2009, 1 commit
    • Add support for invoking parser callback hooks via SPI and in cached plans. · 9bedd128
      Committed by Tom Lane
      As proof of concept, modify plpgsql to use the hooks.  plpgsql is still
      inserting $n symbols textually, but the "back end" of the parsing process now
      goes through the ParamRef hook instead of using a fixed parameter-type array,
      and then execution only fetches actually-referenced parameters, using a hook
      added to ParamListInfo.
      
      Although there's a lot left to be done in plpgsql, this already cures the
      "if (TG_OP = 'INSERT' and NEW.foo ...)"  problem, as illustrated by the
      changed regression test.
      9bedd128
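
      The cured pattern, in a hypothetical trigger function: because only the
      variables actually referenced at execution time are fetched, the
      short-circuited NEW.foo reference no longer fails when the trigger fires for a
      DELETE, where NEW is not assigned:

      CREATE FUNCTION audit_fn() RETURNS trigger LANGUAGE plpgsql AS $$
      BEGIN
          -- for DELETE, TG_OP <> 'INSERT', so NEW.foo is never fetched
          IF (TG_OP = 'INSERT' AND NEW.foo IS NOT NULL) THEN
              RAISE NOTICE 'inserted foo = %', NEW.foo;
          END IF;
          RETURN NULL;   -- AFTER row trigger; return value is ignored
      END;
      $$;

      CREATE TRIGGER audit_trg AFTER INSERT OR UPDATE OR DELETE ON some_table
          FOR EACH ROW EXECUTE PROCEDURE audit_fn();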
  24. 04 November 2009, 1 commit
  25. 28 October 2009, 1 commit
    • Fix AfterTriggerSaveEvent to use a test and elog, not just Assert, to check · 44956c52
      Committed by Tom Lane
      that it's called within an AfterTriggerBeginQuery/AfterTriggerEndQuery pair.
      The RI cascade triggers suppress that overhead on the assumption that they
      are always run non-deferred, so it's possible to violate the condition if
      someone mistakenly changes pg_trigger to mark such a trigger deferred.
      We don't really care about supporting that, but throwing an error instead
      of crashing seems desirable.  Per report from Marcelo Costa.
      44956c52
  26. 26 October 2009, 1 commit
    • Re-implement EvalPlanQual processing to improve its performance and eliminate · 9f2ee8f2
      Committed by Tom Lane
      a lot of strange behaviors that occurred in join cases.  We now identify the
      "current" row for every joined relation in UPDATE, DELETE, and SELECT FOR
      UPDATE/SHARE queries.  If an EvalPlanQual recheck is necessary, we jam the
      appropriate row into each scan node in the rechecking plan, forcing it to emit
      only that one row.  The former behavior could rescan the whole of each joined
      relation for each recheck, which was terrible for performance, and what's much
      worse could result in duplicated output tuples.
      
      Also, the original implementation of EvalPlanQual could not re-use the recheck
      execution tree --- it had to go through a full executor init and shutdown for
      every row to be tested.  To avoid this overhead, I've associated a special
      runtime Param with each LockRows or ModifyTable plan node, and arranged to
      make every scan node below such a node depend on that Param.  Thus, by
      signaling a change in that Param, the EPQ machinery can just rescan the
      already-built test plan.
      
      This patch also adds a prohibition on set-returning functions in the
      targetlist of SELECT FOR UPDATE/SHARE.  This is needed to avoid the
      duplicate-output-tuple problem.  It seems fairly reasonable since the
      other restrictions on SELECT FOR UPDATE are meant to ensure that there
      is a unique correspondence between source tuples and result tuples,
      which an output SRF destroys as much as anything else does.
      9f2ee8f2
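
      The new restriction from the last paragraph, illustrated with a hypothetical
      table foo:

      -- allowed: plain columns in the target list
      SELECT f1, f2 FROM foo WHERE f1 < 10 FOR UPDATE;

      -- now rejected: a set-returning function in the target list of FOR UPDATE
      SELECT generate_series(1, 2) AS n, f1 FROM foo FOR UPDATE;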
  27. 15 October 2009, 1 commit
    • Support SQL-compliant triggers on columns, ie fire only if certain columns · b2734a0d
      Committed by Tom Lane
      are named in the UPDATE's SET list.
      
      Note: the schema of pg_trigger has not actually changed; we've just started
      to use a column that was there all along.  catversion bumped anyway so that
      this commit is included in the history of potentially interesting changes
      to system catalog contents.
      
      Itagaki Takahiro
      b2734a0d
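
      The new syntax in a sketch with hypothetical names; the trigger fires only when
      a listed column appears in the UPDATE's SET list:

      CREATE TRIGGER check_price_change
          BEFORE UPDATE OF price ON products
          FOR EACH ROW
          EXECUTE PROCEDURE check_price_change_fn();

      UPDATE products SET price = price * 1.1;   -- fires the trigger
      UPDATE products SET name = upper(name);    -- does not fire it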
  28. 13 October 2009, 3 commits
    • Code review for LIKE INCLUDING patch --- clean up some cosmetic and not · 8d54c248
      Committed by Tom Lane
      so cosmetic stuff.
      8d54c248
    • Move the handling of SELECT FOR UPDATE locking and rechecking out of · 0adaf4cb
      Committed by Tom Lane
      execMain.c and into a new plan node type LockRows.  Like the recent change
      to put table updating into a ModifyTable plan node, this increases planning
      flexibility by allowing the operations to occur below the top level of the
      plan tree.  It's necessary in any case to restore the previous behavior of
      having FOR UPDATE locking occur before ModifyTable does.
      
      This partially refactors EvalPlanQual to allow multiple rows-under-test
      to be inserted into the EPQ machinery before starting an EPQ test query.
      That isn't sufficient to fix EPQ's general bogosity in the face of plans
      that return multiple rows per test row, though.  Since this patch is
      mostly about getting some plan node infrastructure in place and not about
      fixing ten-year-old bugs, I will leave EPQ improvements for another day.
      
      Another behavioral change that we could now think about is doing FOR UPDATE
      before LIMIT, but that too seems like it should be treated as a followon
      patch.
      0adaf4cb
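
      With this change the locking step shows up as its own node in EXPLAIN output,
      for example (hypothetical table, costs omitted):

      EXPLAIN (COSTS OFF) SELECT * FROM foo WHERE f1 = 1 FOR UPDATE;
      -- the plan now shows a LockRows node above the scan that produces the rows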
  29. 10 October 2009, 1 commit
    • Split the processing of INSERT/UPDATE/DELETE operations out of execMain.c. · 8a5849b7
      Committed by Tom Lane
      They are now handled by a new plan node type called ModifyTable, which is
      placed at the top of the plan tree.  In itself this change doesn't do much,
      except perhaps make the handling of RETURNING lists and inherited UPDATEs a
      tad less klugy.  But it is necessary preparation for the intended extension of
      allowing RETURNING queries inside WITH.
      
      Marko Tiikkaja
      8a5849b7
  30. 08 October 2009, 1 commit