1. 22 Dec, 2011 1 commit
    • R
      Improve behavior of concurrent CLUSTER. · cbe24a6d
      Robert Haas committed
      In the previous coding, a user could queue up for an AccessExclusiveLock
      on a table they did not have permission to cluster, thus potentially
      interfering with access by authorized users who got stuck waiting behind
      the AccessExclusiveLock.  This approach avoids that.  cluster() has the
      same permissions-checking requirements as REINDEX TABLE, so this commit
      moves the now-shared callback to tablecmds.c and renames it, per
      discussion with Noah Misch.
      cbe24a6d
  2. 21 Dec, 2011 1 commit
    • R
      Take fewer snapshots. · d573e239
      Robert Haas committed
      When a PORTAL_ONE_SELECT query is executed, we can opportunistically
      reuse the parse/plan snapshot for the execution phase.  This cuts down the
      number of snapshots per simple query from 2 to 1 for the simple
      protocol, and 3 to 2 for the extended protocol.  Since we are only
      reusing a snapshot taken early in the processing of the same protocol
      message, the change shouldn't be user-visible, except that the remote
      possibility of the planning and execution snapshots being different is
      eliminated.
      
      Note that this change does not make it safe to assume that the parse/plan
      snapshot will certainly be reused; that will currently only happen if
      PortalStart() decides to use the PORTAL_ONE_SELECT strategy.  It might
      be worth trying to provide some stronger guarantees here in the future,
      but for now we don't.
      
      Patch by me; review by Dimitri Fontaine.
      d573e239
  3. 20 Dec, 2011 2 commits
    • P
      Add support for privileges on types · 72920557
      Peter Eisentraut committed
      This adds support for the more or less SQL-conforming USAGE privilege
      on types and domains.  The intent is to be able to restrict which users
      can create dependencies on types, which restricts the way in which
      owners can alter types.
      
      reviewed by Yeb Havinga
      72920557
    • A
      Allow CHECK constraints to be declared ONLY · 61d81bd2
      Alvaro Herrera committed
      This makes them enforceable only on the parent table, not on children
      tables.  This is useful in various situations, per discussion involving
      people bitten by the restrictive behavior introduced in 8.4.
      
      Message-Id:
      8762mp93iw.fsf@comcast.net
      CAFaPBrSMMpubkGf4zcRL_YL-AERUbYF_-ZNNYfb3CVwwEqc9TQ@mail.gmail.com
      
      Authors: Nikhil Sontakke, Alex Hunsaker
      Reviewed by Robert Haas and myself
      61d81bd2
  4. 16 Dec, 2011 2 commits
    • R
      Improve behavior of concurrent ALTER <relation> .. SET SCHEMA. · 1da5c119
      Robert Haas committed
      If the referent of a name changes while we're waiting for the lock,
      we must recheck permissions.  We also now check the relkind before
      locking, since it's easy to do that along the way.
      
      Patch by me; review by Noah Misch.
      1da5c119
    • R
      Improve behavior of concurrent rename statements. · 74a1d4fe
      Robert Haas committed
      Previously, renaming a table, sequence, view, index, foreign table,
      column, or trigger checked permissions before locking the object, which
      meant that if permissions were revoked during the lock wait, we would
      still allow the operation.  Similarly, if the original object is dropped
      and a new one with the same name is created, the operation will be allowed
      if we had permissions on the old object; the permissions on the new
      object don't matter.  All this is now fixed.
      
      Along the way, attempting to rename a trigger on a foreign table now gives
      the same error message as trying to create one there in the first place
      (i.e. that it's not a table or view) rather than simply stating that no
      trigger by that name exists.
      
      Patch by me; review by Noah Misch.
      74a1d4fe
  5. 10 Dec, 2011 1 commit
  6. 07 Dec, 2011 3 commits
    • M
      Remove spclocation field from pg_tablespace · 16d8e594
      Magnus Hagander committed
      Instead, add a function pg_tablespace_location(oid) used to return
      the same information, and do this by reading the symbolic link.
      
      Doing it this way makes it possible to relocate a tablespace when the
      database is down by simply changing the symbolic link.
      16d8e594
    • T
      Create a "sort support" interface API for faster sorting. · c6e3ac11
      Tom Lane committed
      This patch creates an API whereby a btree index opclass can optionally
      provide non-SQL-callable support functions for sorting.  In the initial
      patch, we only use this to provide a directly-callable comparator function,
      which can be invoked with a bit less overhead than the traditional
      SQL-callable comparator.  While that should be of value in itself, the real
      reason for doing this is to provide a datatype-extensible framework for
      more aggressive optimizations, as in Peter Geoghegan's recent work.
      
      Robert Haas and Tom Lane
      c6e3ac11
    • R
      Typo fixes for commit 2ad36c4e. · d2a66218
      Robert Haas committed
      Noted during post-commit review by Noah Misch.
      d2a66218
  7. 30 Nov, 2011 1 commit
    • R
      Improve table locking behavior in the face of concurrent DDL. · 2ad36c4e
      Robert Haas committed
      In the previous coding, callers were faced with an awkward choice:
      look up the name, do permissions checks, and then lock the table; or
      look up the name, lock the table, and then do permissions checks.
      The first choice was wrong because the results of the name lookup
      and permissions checks might be out-of-date by the time the table
      lock was acquired, while the second allowed a user with no privileges
      to interfere with access to a table by users who do have privileges
      (e.g. if a malicious backend queues up for an AccessExclusiveLock on
      a table on which AccessShareLock is already held, further attempts
      to access the table will be blocked until the AccessExclusiveLock
      is obtained and the malicious backend's transaction rolls back).
      
      To fix, allow callers of RangeVarGetRelid() to pass a callback which
      gets executed after performing the name lookup but before acquiring
      the relation lock.  If the name lookup is retried (because
      invalidation messages are received), the callback will be re-executed
      as well, so we get the best of both worlds.  RangeVarGetRelid() is
      renamed to RangeVarGetRelidExtended(); callers not wishing to supply
      a callback can continue to invoke it as RangeVarGetRelid(), which is
      now a macro.  Since the only caller that uses nowait = true now
      passes a callback anyway, the RangeVarGetRelid() macro defaults nowait
      as well.  The callback can also be used for supplemental locking - for
      example, REINDEX INDEX needs to acquire the table lock before the index
      lock to reduce deadlock possibilities.
      
      There's a lot more work to be done here to fix all the cases where this
      can be a problem, but this commit provides the general infrastructure
      and fixes the following specific cases: REINDEX INDEX, REINDEX TABLE,
      LOCK TABLE, and DROP TABLE/INDEX/SEQUENCE/VIEW/FOREIGN TABLE.
      
      Per discussion with Noah Misch and Alvaro Herrera.
      2ad36c4e
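
The lookup, callback, lock, and retry sequence described in this commit can be illustrated with a toy model. This is a minimal Python sketch, not PostgreSQL code: `ToyCatalog`, `lookup_and_lock`, and the invalidation counter are hypothetical stand-ins for the catalog lookup and invalidation messages, and the sketch omits releasing a lock taken on a stale OID.

```python
from typing import Callable, Dict


class ToyCatalog:
    """Hypothetical stand-in for the catalogs: maps relation names to OIDs."""

    def __init__(self) -> None:
        self.names: Dict[str, int] = {}
        self.inval_counter = 0  # bumped whenever a name mapping changes

    def set_name(self, name: str, oid: int) -> None:
        self.names[name] = oid
        self.inval_counter += 1


def lookup_and_lock(
    catalog: ToyCatalog,
    name: str,
    callback: Callable[[int], None],
    lock: Callable[[int], None],
) -> int:
    """Look up a name, run the callback before locking, retry on invalidation."""
    while True:
        seen = catalog.inval_counter
        oid = catalog.names.get(name)
        if oid is None:
            raise LookupError(f'relation "{name}" does not exist')
        # Permission checks run here, before we queue up for any lock.
        callback(oid)
        # Acquiring the lock may block; concurrent DDL can happen meanwhile.
        lock(oid)
        if catalog.inval_counter == seen:
            return oid
        # The name may now refer to a different relation: look it up again
        # and re-run the callback against whatever it points to now.
```

In this model, a callback that raises `PermissionError` stops the caller before `lock` is ever invoked, which is the property the commit is after: an unprivileged user never queues up for the lock.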
  8. 29 Nov, 2011 1 commit
    • T
      Disallow deletion of CurrentExtensionObject while running extension script. · 871dd024
      Tom Lane committed
      While the deletion in itself wouldn't break things, any further creation
      of objects in the script would result in dangling pg_depend entries being
      added by recordDependencyOnCurrentExtension().  An example from Phil
      Sorber convinced me that this is just barely likely enough to be worth
      expending a couple lines of code to defend against.  The resulting error
      message might be confusing, but it's better than leaving corrupted catalog
      contents for the user to deal with.
      871dd024
  9. 26 Nov, 2011 1 commit
    • A
      Improve logging of autovacuum I/O activity · 9d3b5024
      Alvaro Herrera committed
      This adds some I/O stats to the logging of autovacuum (when the
      operation takes long enough that log_autovacuum_min_duration causes it
      to be logged), so that it is easier to tune.  Notably, it adds buffer
      I/O counts (hits, misses, dirtied) and read and write rate.
      
      Authors: Greg Smith and Noah Misch
      9d3b5024
  10. 25 Nov, 2011 1 commit
    • R
      Move "hot" members of PGPROC into a separate PGXACT array. · ed0b409d
      Robert Haas committed
      This speeds up snapshot-taking and reduces ProcArrayLock contention.
      Also, the PGPROC (and PGXACT) structures used by two-phase commit are
      now allocated as part of the main array, rather than in a separate
      array, and we keep ProcArray sorted in pointer order.  These changes
      are intended to minimize the number of cache lines that must be pulled
      in to take a snapshot, and testing shows a substantial increase in
      performance on both read and write workloads at high concurrencies.
      
      Pavan Deolasee, Heikki Linnakangas, Robert Haas
      ed0b409d
  11. 24 Nov, 2011 1 commit
    • T
      Creator of a range type must have permission to call support functions. · a912a278
      Tom Lane committed
      Since range types can be created by non-superusers, we need to consider
      their permissions.  Ideally we'd check this when the type is used, not
      when it's created, but that seems like much more trouble than it's worth.
      The existing restriction that the support functions be immutable already
      prevents most cases where an unauthorized call to a function might be
      thought a security issue, and the fact that the user has no access to
      the results of the system's calls to subtype_diff closes off the other
      plausible reason for concern.  So this check is basically pro-forma,
      but let's make it anyway.
      a912a278
  12. 23 Nov, 2011 2 commits
    • T
      Remove user-selectable ANALYZE option for range types. · 74c1723f
      Tom Lane committed
      It's not clear that a per-datatype typanalyze function would be any more
      useful than a generic typanalyze for ranges.  What *is* clear is that
      letting unprivileged users select typanalyze functions is a crash risk or
      worse.  So remove the option from CREATE TYPE AS RANGE, and instead put in
      a generic typanalyze function for ranges.  The generic function does
      nothing as yet, but hopefully we'll improve that before 9.2 release.
      74c1723f
    • T
      Remove zero- and one-argument range constructor functions. · df735844
      Tom Lane committed
      Per discussion, the zero-argument forms aren't really worth the catalog
      space (just write 'empty' instead).  The one-argument forms have some use,
      but they also have a serious problem with looking too much like functional
      cast notation; to the point where in many real use-cases, the parser would
      misinterpret what was wanted.
      
      Committing this as a separate patch, with the thought that we might want
      to revert part or all of it if we can think of some way around the cast
      ambiguity.
      df735844
  13. 22 Nov, 2011 1 commit
    • T
      More code review for rangetypes patch. · a4ffcc8e
      Tom Lane committed
      Fix up some infelicitous coding in DefineRange, and add some missing error
      checks.  Rearrange operator strategy number assignments for GiST anyrange
      opclass so that they don't make such a mess of opr_sanity's table of
      operator names associated with different strategy numbers.  Assign
      hopefully-temporary selectivity estimators to range operators that didn't
      have one --- poor as the estimates are, they're still a lot better than the
      default 0.5 estimate, and they'll shut up the opr_sanity test that wants to
      see selectivity estimators on all built-in operators.
      a4ffcc8e
  14. 21 Nov, 2011 1 commit
  15. 18 Nov, 2011 2 commits
    • R
      Further consolidation of DROP statement handling. · fc6d1006
      Robert Haas committed
      This gets rid of an impressive amount of duplicative code, with only
      minimal behavior changes.  DROP FOREIGN DATA WRAPPER now requires object
      ownership rather than superuser privileges, matching the documentation
      we already have.  We also eliminate the historical warning about dropping
      a built-in function as unuseful.  All operations are now performed in the
      same order for all object types handled by dropcmds.c.
      
      KaiGai Kohei, with minor revisions by me
      fc6d1006
    • R
      Remove ancient downcasing code from procedural language operations. · 67dc4eed
      Robert Haas committed
      A very long time ago, language names were specified as literals rather
      than identifiers, so this code was added to do case-folding.  But that
      style has been deprecated for many years so this isn't needed any more.
      Language names will still be downcased when specified as unquoted
      identifiers, but quoted identifiers or the old style using string
      literals will be left as-is.
      67dc4eed
  16. 15 Nov, 2011 3 commits
    • T
      Fix alignment and toasting bugs in range types. · ad50934e
      Tom Lane committed
      A range type whose element type has 'd' alignment must have 'd' alignment
      itself, else there is no guarantee that the element value can be used
      in-place.  (Because range_deserialize uses att_align_pointer which forcibly
      aligns the given pointer, violations of this rule did not lead to SIGBUS
      but rather to garbage data being extracted, as in one of the added
      regression test cases.)
      
      Also, you can't put a toast pointer inside a range datum, since the
      referenced value could disappear with the range datum still present.
      For consistency with the handling of arrays and records, I also forced
      decompression of in-line-compressed bound values.  It would work to store
      them as-is, but our policy is to avoid situations that might result in
      double compression.
      
      Add assorted regression tests for this, and bump catversion because of
      fixes to built-in pg_type entries.
      
      Also some marginal cleanup of inconsistent/unnecessary error checks.
      ad50934e
    • B
      Rerun pgindent with updated typedef list. · 1a2586c1
      Bruce Momjian committed
      1a2586c1
    • B
      cdaa45fd
  17. 10 Nov, 2011 1 commit
  18. 09 Nov, 2011 1 commit
    • H
      In COPY, insert tuples to the heap in batches. · d326d9e8
      Heikki Linnakangas committed
      This greatly reduces the WAL volume, especially when the table is narrow.
      The overhead of locking the heap page is also reduced. Reduced WAL traffic
      also makes it scale a lot better, if you run multiple COPY processes at
      the same time.
      d326d9e8
  19. 08 Nov, 2011 2 commits
    • R
      Rewrite comment for slightly greater accuracy. · 0e1c4b7d
      Robert Haas committed
      Per an observation from Thom Brown that the old version contained a typo.
      0e1c4b7d
    • R
      Make VACUUM avoid waiting for a cleanup lock, where possible. · bbb6e559
      Robert Haas committed
      In a regular VACUUM, it's OK to skip pages for which a cleanup lock
      isn't immediately available; the next VACUUM will deal with them.  If
      we're scanning the entire relation to advance relfrozenxid, we might
      need to wait, but only if there are tuples on the page that actually
      require freezing.  These changes should greatly reduce the incidence
      of vacuum processes getting "stuck".
      
      Simon Riggs and Robert Haas
      bbb6e559
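
The per-page decision this commit describes can be written as a small truth table. This is a minimal Python sketch under stated assumptions: the function name, its three boolean parameters, and the string results are hypothetical and do not correspond to identifiers in the VACUUM source.

```python
def cleanup_lock_action(lock_available: bool, scan_all: bool, needs_freeze: bool) -> str:
    """Decide what to do with one heap page, following the commit message.

    lock_available: a cleanup lock can be taken immediately
    scan_all:       the whole relation is being scanned to advance relfrozenxid
    needs_freeze:   the page holds tuples that actually require freezing
    """
    if lock_available:
        return "process"
    if not scan_all:
        # Regular VACUUM: skip the page; the next VACUUM will deal with it.
        return "skip"
    # Full scan: wait only if skipping would leave unfrozen tuples behind.
    return "wait" if needs_freeze else "skip"
```

Only one of the four cases in which the lock is unavailable leads to waiting, which is why the commit expects far fewer "stuck" vacuum processes.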
  20. 03 Nov, 2011 1 commit
  21. 02 Nov, 2011 1 commit
  22. 27 Oct, 2011 2 commits
    • T
      Change FK trigger naming convention to fix self-referential FKs. · 1e3b21dd
      Tom Lane committed
      Use names like "RI_ConstraintTrigger_a_NNNN" for FK action triggers and
      "RI_ConstraintTrigger_c_NNNN" for FK check triggers.  This ensures the
      action trigger fires first in self-referential cases where the very same
      row update fires both an action and a check trigger.  This change provides
      a non-probabilistic solution for bug #6268, at the risk that it could break
      client code that is making assumptions about the exact names assigned to
      auto-generated FK triggers.  Hence, change this in HEAD only.  No need for
      forced initdb since old triggers continue to work fine.
      1e3b21dd
    • T
      Change FK trigger creation order to better support self-referential FKs. · 58958726
      Tom Lane committed
      When a foreign-key constraint references another column of the same table,
      row updates will queue both the PK's ON UPDATE action and the FK's CHECK
      action in the same event.  The ON UPDATE action must execute first, else
      the CHECK will check a non-final state of the row and possibly throw an
      inappropriate error, as seen in bug #6268 from Roman Lytovchenko.
      
      Now, the firing order of multiple triggers for the same event is determined
      by the sort order of their pg_trigger.tgnames, and the auto-generated names
      we use for FK triggers are "RI_ConstraintTrigger_NNNN" where NNNN is the
      trigger OID.  So most of the time the firing order is the same as creation
      order, and so rearranging the creation order fixes it.
      
      This patch will fail to fix the problem if the OID counter wraps around or
      adds a decimal digit (eg, from 99999 to 100000) while we are creating the
      triggers for an FK constraint.  Given the small odds of that, and the low
      usage of self-referential FKs, we'll live with that solution in the back
      branches.  A better fix is to change the auto-generated names for FK
      triggers, but it seems unwise to do that in stable branches because there
      may be client code that depends on the naming convention.  We'll fix it
      that way in HEAD in a separate patch.
      
      Back-patch to all supported branches, since this bug has existed for a long
      time.
      58958726
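
The digit-count problem mentioned in this commit, and the way the "_a_"/"_c_" names from commit 1e3b21dd avoid it, can be seen by sorting the names as plain strings. This is a minimal Python sketch; it assumes that code-point ordering of these ASCII names matches the name ordering used for trigger firing, and the OIDs 99999 and 100000 are taken from the example in the commit message.

```python
from typing import List


def firing_order(trigger_names: List[str]) -> List[str]:
    """Return names in the order same-event triggers would fire (sorted by name)."""
    return sorted(trigger_names)


# Old scheme: the action trigger is created first and gets the lower OID,
# but the OID counter gains a digit before the check trigger is created.
old_action = "RI_ConstraintTrigger_99999"
old_check = "RI_ConstraintTrigger_100000"
print(firing_order([old_action, old_check]))

# New scheme: "_a_" sorts before "_c_" regardless of the OID that follows.
new_action = "RI_ConstraintTrigger_a_99999"
new_check = "RI_ConstraintTrigger_c_100000"
print(firing_order([new_action, new_check]))
```

With the old names the check trigger sorts first because "1" compares lower than "9", so creation order alone cannot guarantee that the action fires first.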
  23. 22 Oct, 2011 1 commit
    • T
      More cleanup after failed reduced-lock-levels-for-DDL feature. · 5ac59807
      Tom Lane committed
      Turns out that use of ShareUpdateExclusiveLock or ShareRowExclusiveLock
      to protect DDL changes had gotten copied into several places that were
      not touched by either of Simon's original patches for the feature, and
      thus neither he nor I thought to revert them.  (Indeed, it appears that
      two of these uses were committed *after* the reversion, which just goes
      to show that git merging is no panacea.)  Change these places to use
      AccessExclusiveLock again.  If we ever manage to resurrect that feature,
      we're going to have to think a bit harder about how to keep lock level
      usage in sync for DDL operations that aren't within the AlterTable
      infrastructure.
      
      Two of these bugs are only in HEAD, but one is in the 9.1 branch too.
      Alvaro found one of them, I found the other two.
      5ac59807
  24. 21 Oct, 2011 1 commit
    • R
      Fix DROP OPERATOR FAMILY IF EXISTS. · 98026192
      Robert Haas committed
      Essentially, the "IF EXISTS" portion was being ignored, and an error
      thrown anyway if the opfamily did not exist.
      
      I broke this in commit fd1843ff; so
      backpatch to 9.1.X.
      
      Report and diagnosis by KaiGai Kohei.
      98026192
  25. 20 Oct, 2011 2 commits
  26. 19 Oct, 2011 1 commit
    • T
      Suppress -Wunused-result warnings about write() and fwrite(). · aa90e148
      Tom Lane committed
      This is merely an exercise in satisfying pedants, not a bug fix, because
      in every case we were checking for failure later with ferror(), or else
      there was nothing useful to be done about a failure anyway.  Document
      the latter cases.
      aa90e148
  27. 15 Oct, 2011 1 commit
    • T
      Measure the number of all-visible pages for use in index-only scan costing. · e6858e66
      Tom Lane committed
      Add a column pg_class.relallvisible to remember the number of pages that
      were all-visible according to the visibility map as of the last VACUUM
      (or ANALYZE, or some other operations that update pg_class.relpages).
      Use relallvisible/relpages, instead of an arbitrary constant, to estimate
      how many heap page fetches can be avoided during an index-only scan.
      
      This is pretty primitive and will no doubt see refinements once we've
      acquired more field experience with the index-only scan mechanism, but
      it's way better than using a constant.
      
      Note: I had to adjust an underspecified query in the window.sql regression
      test, because it was changing answers when the plan changed to use an
      index-only scan.  Some of the adjacent tests perhaps should be adjusted
      as well, but I didn't do that here.
      e6858e66
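
The relallvisible/relpages ratio described above amounts to a simple fraction. This is a minimal Python sketch of that arithmetic only: the function name and the `pages_needed` parameter are hypothetical, and the clamping and zero-page handling are assumptions of the sketch rather than a description of the planner's actual cost code.

```python
def heap_fetches_avoided(pages_needed: float, relpages: int, relallvisible: int) -> float:
    """Estimate heap page fetches an index-only scan can skip.

    pages_needed:  heap pages a plain index scan would have visited
    relpages:      pg_class.relpages for the table
    relallvisible: pg_class.relallvisible for the table
    """
    if relpages <= 0:
        # No page count recorded: assume nothing can be skipped.
        return 0.0
    # Clamp in case the two counters were updated at different times.
    fraction_visible = min(1.0, relallvisible / relpages)
    return pages_needed * fraction_visible
```

For a table with relpages = 1000 and relallvisible = 750, a scan that would otherwise touch 100 heap pages is estimated to avoid 75 of them.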
  28. 13 Oct, 2011 1 commit
    • T
      Throw a useful error message if an extension script file is fed to psql. · 458857cc
      Tom Lane committed
      We have seen one too many reports of people trying to use 9.1 extension
      files in the old-fashioned way of sourcing them in psql.  Not only does
      that usually not work (due to failure to substitute for MODULE_PATHNAME
      and/or @extschema@), but if it did work they'd get a collection of loose
      objects not an extension.  To prevent this, insert an \echo ... \quit
      line that prints a suitable error message into each extension script file,
      and teach commands/extension.c to ignore lines starting with \echo.
      That should not only prevent any adverse consequences of loading a script
      file the wrong way, but make it crystal clear to users that they need to
      do it differently now.
      
      Tom Lane, following an idea of Andrew Dunstan's.  Back-patch into 9.1
      ... there is not going to be much value in this if we wait till 9.2.
      458857cc
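
The two halves of this trick (psql stops at the \echo ... \quit line, while the extension loader skips it) can be illustrated with a toy filter. This is a minimal Python sketch, not the logic in commands/extension.c: `strip_echo_lines`, the extension name `demo`, and the function `demo_one` are hypothetical.

```python
def strip_echo_lines(script: str) -> str:
    """Drop every line that starts with \\echo, keeping all other lines."""
    kept = [line for line in script.splitlines() if not line.startswith("\\echo")]
    return "\n".join(kept)


# A made-up extension script. Sourced in psql, the first line prints the
# message and quits; a loader that skips \echo lines sees only the SQL.
SAMPLE_SCRIPT = "\n".join([
    r'\echo Use "CREATE EXTENSION demo" to load this file. \quit',
    "CREATE FUNCTION demo_one() RETURNS integer LANGUAGE sql AS 'SELECT 1';",
])

print(strip_echo_lines(SAMPLE_SCRIPT))
```

Because the guard is an ordinary psql meta-command, it needs no support from older psql versions to take effect.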
  29. 12 Oct, 2011 1 commit
    • T
      Rearrange the implementation of index-only scans. · a0185461
      Tom Lane committed
      This commit changes index-only scans so that data is read directly from the
      index tuple without first generating a faux heap tuple.  The only immediate
      benefit is that indexes on system columns (such as OID) can be used in
      index-only scans, but this is necessary infrastructure if we are ever to
      support index-only scans on expression indexes.  The executor is now ready
      for that, though the planner still needs substantial work to recognize
      the possibility.
      
      To do this, Vars in index-only plan nodes have to refer to index columns
      not heap columns.  I introduced a new special varno, INDEX_VAR, to mark
      such Vars to avoid confusion.  (In passing, this commit renames the two
      existing special varnos to OUTER_VAR and INNER_VAR.)  This allows
      ruleutils.c to handle them with logic similar to what we use for subplan
      reference Vars.
      
      Since index-only scans are now fundamentally different from regular
      indexscans so far as their expression subtrees are concerned, I also chose
      to change them to have their own plan node type (and hence, their own
      executor source file).
      a0185461