Commits · cbe24a6dd8fb224b9585f25b882d5ffdb55a0ba5 · Greenplum / Gpdb

22 Dec, 2011 1 commit

Improve behavior of concurrent CLUSTER. · cbe24a6d

Committed by Robert Haas on Dec 21, 2011

In the previous coding, a user could queue up for an AccessExclusiveLock
on a table they did not have permission to cluster, thus potentially
interfering with access by authorized users who got stuck waiting behind
the AccessExclusiveLock. This approach avoids that. cluster() has the
same permissions-checking requirements as REINDEX TABLE, so this commit
moves the now-shared callback to tablecmds.c and renames it, per
discussion with Noah Misch.

cbe24a6d

21 Dec, 2011 1 commit

Take fewer snapshots. · d573e239

Committed by Robert Haas on Dec 21, 2011

When a PORTAL_ONE_SELECT query is executed, we can opportunistically
reuse the parse/plan snapshot for the execution phase.  This cuts down the
number of snapshots per simple query from 2 to 1 for the simple
protocol, and 3 to 2 for the extended protocol.  Since we are only
reusing a snapshot taken early in the processing of the same protocol
message, the change shouldn't be user-visible, except that the remote
possibility of the planning and execution snapshots being different is
eliminated.

Note that this change does not make it safe to assume that the parse/plan
snapshot will certainly be reused; that will currently only happen if
PortalStart() decides to use the PORTAL_ONE_SELECT strategy.  It might
be worth trying to provide some stronger guarantees here in the future,
but for now we don't.

Patch by me; review by Dimitri Fontaine.

d573e239

20 Dec, 2011 2 commits

Add support for privileges on types · 72920557

Committed by Peter Eisentraut on Dec 20, 2011

This adds support for the more or less SQL-conforming USAGE privilege
on types and domains.  The intent is to be able to restrict which users
can create dependencies on types, which restricts the way in which
owners can alter types.

Reviewed by Yeb Havinga

72920557

Allow CHECK constraints to be declared ONLY · 61d81bd2

Committed by Alvaro Herrera on Dec 05, 2011

This makes them enforceable only on the parent table, not on child
tables.  This is useful in various situations, per discussion involving
people bitten by the restrictive behavior introduced in 8.4.

Message-Id:
8762mp93iw.fsf@comcast.net
CAFaPBrSMMpubkGf4zcRL_YL-AERUbYF_-ZNNYfb3CVwwEqc9TQ@mail.gmail.com

Authors: Nikhil Sontakke, Alex Hunsaker
Reviewed by Robert Haas and myself

61d81bd2

16 Dec, 2011 2 commits

Improve behavior of concurrent ALTER <relation> .. SET SCHEMA. · 1da5c119

Committed by Robert Haas on Dec 15, 2011

If the referent of a name changes while we're waiting for the lock,
we must recheck permissions.  We also now check the relkind before
locking, since it's easy to do that along the way.

Patch by me; review by Noah Misch.

1da5c119

Improve behavior of concurrent rename statements. · 74a1d4fe

Committed by Robert Haas on Dec 15, 2011

Previously, renaming a table, sequence, view, index, foreign table,
column, or trigger checked permissions before locking the object, which
meant that if permissions were revoked during the lock wait, we would
still allow the operation.  Similarly, if the original object was dropped
and a new one with the same name was created, the operation would be allowed
if we had permissions on the old object; the permissions on the new
object didn't matter.  All this is now fixed.

Along the way, attempting to rename a trigger on a foreign table now gives
the same error message as trying to create one there in the first place
(i.e. that it's not a table or view) rather than simply stating that no
trigger by that name exists.

Patch by me; review by Noah Misch.

74a1d4fe

10 Dec, 2011 1 commit

Add ALTER FOREIGN DATA WRAPPER / RENAME and ALTER SERVER / RENAME · 5bcf8ede

Committed by Peter Eisentraut on Dec 09, 2011

5bcf8ede

07 Dec, 2011 3 commits

Remove spclocation field from pg_tablespace · 16d8e594

Committed by Magnus Hagander on Dec 07, 2011

Instead, add a function pg_tablespace_location(oid) used to return
the same information, and do this by reading the symbolic link.

Doing it this way makes it possible to relocate a tablespace when the
database is down by simply changing the symbolic link.
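
The symlink-based lookup can be illustrated with a small Python sketch. It is not PostgreSQL's implementation: it assumes the conventional layout in which each tablespace appears as a symbolic link named after its OID under `pg_tblspc/` in the data directory. The helper name `tablespace_location`, the OID 16384, and the temporary directories are all hypothetical.

```python
import os
import tempfile


def tablespace_location(data_dir: str, spc_oid: int) -> str:
    """Return the symlink target for a tablespace OID (illustrative helper)."""
    link_path = os.path.join(data_dir, "pg_tblspc", str(spc_oid))
    return os.readlink(link_path)


# Build a throwaway "data directory" containing one tablespace symlink.
data_dir = tempfile.mkdtemp()
old_target = tempfile.mkdtemp()
new_target = tempfile.mkdtemp()
os.mkdir(os.path.join(data_dir, "pg_tblspc"))
link = os.path.join(data_dir, "pg_tblspc", "16384")
os.symlink(old_target, link)
location_before = tablespace_location(data_dir, 16384)

# "Relocating" the tablespace while the server is down: repoint the link.
os.remove(link)
os.symlink(new_target, link)
location_after = tablespace_location(data_dir, 16384)
```

Because the location is read from the link each time, no catalog column has to be updated when the link is changed.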

16d8e594

Create a "sort support" interface API for faster sorting. · c6e3ac11

Committed by Tom Lane on Dec 07, 2011

This patch creates an API whereby a btree index opclass can optionally
provide non-SQL-callable support functions for sorting. In the initial
patch, we only use this to provide a directly-callable comparator function,
which can be invoked with a bit less overhead than the traditional
SQL-callable comparator. While that should be of value in itself, the real
reason for doing this is to provide a datatype-extensible framework for
more aggressive optimizations, as in Peter Geoghegan's recent work.

Robert Haas and Tom Lane

c6e3ac11

Typo fixes for commit 2ad36c4e. · d2a66218

Committed by Robert Haas on Dec 06, 2011

Noted during post-commit review by Noah Misch.

d2a66218

30 Nov, 2011 1 commit

Improve table locking behavior in the face of concurrent DDL. · 2ad36c4e

Committed by Robert Haas on Nov 30, 2011

In the previous coding, callers were faced with an awkward choice:
look up the name, do permissions checks, and then lock the table; or
look up the name, lock the table, and then do permissions checks.
The first choice was wrong because the results of the name lookup
and permissions checks might be out-of-date by the time the table
lock was acquired, while the second allowed a user with no privileges
to interfere with access to a table by users who do have privileges
(e.g. if a malicious backend queues up for an AccessExclusiveLock on
a table on which AccessShareLock is already held, further attempts
to access the table will be blocked until the AccessExclusiveLock
is obtained and the malicious backend's transaction rolls back).

To fix, allow callers of RangeVarGetRelid() to pass a callback which
gets executed after performing the name lookup but before acquiring
the relation lock. If the name lookup is retried (because
invalidation messages are received), the callback will be re-executed
as well, so we get the best of both worlds. RangeVarGetRelid() is
renamed to RangeVarGetRelidExtended(); callers not wishing to supply
a callback can continue to invoke it as RangeVarGetRelid(), which is
now a macro. Since the only caller that uses nowait = true now
passes a callback anyway, the RangeVarGetRelid() macro defaults nowait
as well. The callback can also be used for supplemental locking - for
example, REINDEX INDEX needs to acquire the table lock before the index
lock to reduce deadlock possibilities.
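
The lookup, callback, lock, and retry sequence described above can be modelled with a toy Python sketch. This is an assumption-laden illustration, not the C code in the commit: `ToyCatalog`, `lookup_and_lock`, `owner_check`, and the `during_lock_wait` hook are hypothetical names, and a simple counter stands in for the arrival of invalidation messages.

```python
class ToyCatalog:
    """Stand-in for the system catalogs plus the lock manager."""

    def __init__(self):
        self.names = {}         # relation name -> OID
        self.owners = {}        # OID -> owning role
        self.inval_counter = 0  # bumped whenever a name mapping changes
        self.locked = []        # OIDs currently locked by our session

    def define(self, name, oid, owner):
        self.names[name] = oid
        self.owners[oid] = owner
        self.inval_counter += 1


def lookup_and_lock(catalog, name, callback, during_lock_wait=None):
    """Look up a name, run the callback, then lock; retry on invalidation."""
    while True:
        seen = catalog.inval_counter
        oid = catalog.names[name]
        callback(catalog, oid)         # may raise before any lock is requested
        if during_lock_wait is not None:
            during_lock_wait(catalog)  # simulate concurrent DDL while we wait
            during_lock_wait = None
        catalog.locked.append(oid)     # lock acquired
        if catalog.inval_counter == seen:
            return oid                 # the lookup is still valid
        catalog.locked.remove(oid)     # stale lookup: release and retry


def owner_check(role, seen_oids):
    """Build a callback that records each OID it checks and enforces ownership."""
    def callback(catalog, oid):
        seen_oids.append(oid)
        if catalog.owners[oid] != role:
            raise PermissionError(f"{role} does not own relation {oid}")
    return callback


catalog = ToyCatalog()
catalog.define("t", 1, "alice")

# An unprivileged role is rejected before it can queue for the lock.
try:
    lookup_and_lock(catalog, "t", owner_check("mallory", []))
    denied = False
except PermissionError:
    denied = True
locks_after_denial = list(catalog.locked)

# The name is re-pointed at a new relation during the lock wait, so the
# lookup and the permission check both run again against the new OID.
checked = []
result = lookup_and_lock(
    catalog, "t", owner_check("alice", checked),
    during_lock_wait=lambda c: c.define("t", 2, "alice"),
)
```

In this model the rejected caller never appears in `locked`, and the second caller's callback runs once per lookup, which is the "best of both worlds" property the message describes.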

There's a lot more work to be done here to fix all the cases where this
can be a problem, but this commit provides the general infrastructure
and fixes the following specific cases: REINDEX INDEX, REINDEX TABLE,
LOCK TABLE, and DROP TABLE/INDEX/SEQUENCE/VIEW/FOREIGN TABLE.

Per discussion with Noah Misch and Alvaro Herrera.

2ad36c4e

29 Nov, 2011 1 commit

Disallow deletion of CurrentExtensionObject while running extension script. · 871dd024

Committed by Tom Lane on Nov 28, 2011

While the deletion in itself wouldn't break things, any further creation
of objects in the script would result in dangling pg_depend entries being
added by recordDependencyOnCurrentExtension(). An example from Phil
Sorber convinced me that this is just barely likely enough to be worth
expending a couple lines of code to defend against. The resulting error
message might be confusing, but it's better than leaving corrupted catalog
contents for the user to deal with.

871dd024

26 Nov, 2011 1 commit

Improve logging of autovacuum I/O activity · 9d3b5024

Committed by Alvaro Herrera on Nov 25, 2011

This adds some I/O stats to the logging of autovacuum (when the
operation takes long enough that log_autovacuum_min_duration causes it
to be logged), so that it is easier to tune.  Notably, it adds buffer
I/O counts (hits, misses, dirtied) and read and write rate.

Authors: Greg Smith and Noah Misch

9d3b5024

25 Nov, 2011 1 commit

Move "hot" members of PGPROC into a separate PGXACT array. · ed0b409d

Committed by Robert Haas on Nov 25, 2011

This speeds up snapshot-taking and reduces ProcArrayLock contention.
Also, the PGPROC (and PGXACT) structures used by two-phase commit are
now allocated as part of the main array, rather than in a separate
array, and we keep ProcArray sorted in pointer order. These changes
are intended to minimize the number of cache lines that must be pulled
in to take a snapshot, and testing shows a substantial increase in
performance on both read and write workloads at high concurrencies.

Pavan Deolasee, Heikki Linnakangas, Robert Haas

ed0b409d

24 Nov, 2011 1 commit

Creator of a range type must have permission to call support functions. · a912a278

Committed by Tom Lane on Nov 23, 2011

Since range types can be created by non-superusers, we need to consider
their permissions. Ideally we'd check this when the type is used, not
when it's created, but that seems like much more trouble than it's worth.
The existing restriction that the support functions be immutable already
prevents most cases where an unauthorized call to a function might be
thought a security issue, and the fact that the user has no access to
the results of the system's calls to subtype_diff closes off the other
plausible reason for concern. So this check is basically pro-forma,
but let's make it anyway.

a912a278

23 Nov, 2011 2 commits

Remove user-selectable ANALYZE option for range types. · 74c1723f

Committed by Tom Lane on Nov 23, 2011

It's not clear that a per-datatype typanalyze function would be any more
useful than a generic typanalyze for ranges. What *is* clear is that
letting unprivileged users select typanalyze functions is a crash risk or
worse. So remove the option from CREATE TYPE AS RANGE, and instead put in
a generic typanalyze function for ranges. The generic function does
nothing as yet, but hopefully we'll improve that before 9.2 release.

74c1723f

Remove zero- and one-argument range constructor functions. · df735844

Committed by Tom Lane on Nov 22, 2011

Per discussion, the zero-argument forms aren't really worth the catalog
space (just write 'empty' instead).  The one-argument forms have some use,
but they also have a serious problem with looking too much like functional
cast notation; to the point where in many real use-cases, the parser would
misinterpret what was wanted.

Committing this as a separate patch, with the thought that we might want
to revert part or all of it if we can think of some way around the cast
ambiguity.

df735844

22 Nov, 2011 1 commit

More code review for rangetypes patch. · a4ffcc8e

Committed by Tom Lane on Nov 21, 2011

Fix up some infelicitous coding in DefineRange, and add some missing error
checks. Rearrange operator strategy number assignments for GiST anyrange
opclass so that they don't make such a mess of opr_sanity's table of
operator names associated with different strategy numbers. Assign
hopefully-temporary selectivity estimators to range operators that didn't
have one --- poor as the estimates are, they're still a lot better than the
default 0.5 estimate, and they'll shut up the opr_sanity test that wants to
see selectivity estimators on all built-in operators.

a4ffcc8e

21 Nov, 2011 1 commit

Further code review for range types patch. · b985d487

Committed by Tom Lane on Nov 20, 2011

Fix some bugs in coercion logic and pg_dump; more comment cleanup;
minor cosmetic improvements.

b985d487

18 Nov, 2011 2 commits

Further consolidation of DROP statement handling. · fc6d1006

Committed by Robert Haas on Nov 17, 2011

This gets rid of an impressive amount of duplicative code, with only
minimal behavior changes. DROP FOREIGN DATA WRAPPER now requires object
ownership rather than superuser privileges, matching the documentation
we already have. We also eliminate the historical warning about dropping
a built-in function as unuseful. All operations are now performed in the
same order for all object types handled by dropcmds.c.

KaiGai Kohei, with minor revisions by me

fc6d1006

Remove ancient downcasing code from procedural language operations. · 67dc4eed

Committed by Robert Haas on Nov 17, 2011

A very long time ago, language names were specified as literals rather
than identifiers, so this code was added to do case-folding. But that
style has been deprecated for many years so this isn't needed any more.
Language names will still be downcased when specified as unquoted
identifiers, but quoted identifiers or the old style using string
literals will be left as-is.

67dc4eed

15 Nov, 2011 3 commits

Fix alignment and toasting bugs in range types. · ad50934e

Committed by Tom Lane on Nov 14, 2011

A range type whose element type has 'd' alignment must have 'd' alignment
itself, else there is no guarantee that the element value can be used
in-place. (Because range_deserialize uses att_align_pointer which forcibly
aligns the given pointer, violations of this rule did not lead to SIGBUS
but rather to garbage data being extracted, as in one of the added
regression test cases.)

Also, you can't put a toast pointer inside a range datum, since the
referenced value could disappear with the range datum still present.
For consistency with the handling of arrays and records, I also forced
decompression of in-line-compressed bound values. It would work to store
them as-is, but our policy is to avoid situations that might result in
double compression.

Add assorted regression tests for this, and bump catversion because of
fixes to built-in pg_type entries.

Also some marginal cleanup of inconsistent/unnecessary error checks.

ad50934e

Rerun pgindent with updated typedef list. · 1a2586c1

Committed by Bruce Momjian on Nov 14, 2011

1a2586c1

Run pgindent on range type files, per request from Tom. · cdaa45fd

Committed by Bruce Momjian on Nov 14, 2011

cdaa45fd

10 Nov, 2011 1 commit

Fix compiler warning. · 452d1d19

Committed by Robert Haas on Nov 09, 2011

452d1d19

09 Nov, 2011 1 commit

In COPY, insert tuples to the heap in batches. · d326d9e8

Committed by Heikki Linnakangas on Nov 09, 2011

This greatly reduces the WAL volume, especially when the table is narrow.
The overhead of locking the heap page is also reduced. Reduced WAL traffic
also makes it scale a lot better, if you run multiple COPY processes at
the same time.

d326d9e8

08 Nov, 2011 2 commits

Rewrite comment for slightly greater accuracy. · 0e1c4b7d

Committed by Robert Haas on Nov 08, 2011

Per an observation from Thom Brown that the old version contained a typo.

0e1c4b7d

Make VACUUM avoid waiting for a cleanup lock, where possible. · bbb6e559

Committed by Robert Haas on Nov 07, 2011

In a regular VACUUM, it's OK to skip pages for which a cleanup lock
isn't immediately available; the next VACUUM will deal with them.  If
we're scanning the entire relation to advance relfrozenxid, we might
need to wait, but only if there are tuples on the page that actually
require freezing.  These changes should greatly reduce the incidence
of vacuum processes getting "stuck".
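
The skip-or-wait decision described above can be sketched as a toy Python model. It only restates the rule in the message; the function name `vacuum_pages` and the per-page flags `lock_free` and `needs_freeze` are hypothetical, and the real logic lives in PostgreSQL's C vacuum code.

```python
def vacuum_pages(pages, scan_all):
    """Classify pages as cleaned or skipped, noting where a wait was needed.

    pages: list of dicts with boolean keys 'lock_free' (cleanup lock is
    immediately available) and 'needs_freeze' (page has tuples to freeze).
    scan_all: True when the whole relation is scanned to advance relfrozenxid.
    """
    cleaned, skipped, waited = [], [], []
    for page_no, page in enumerate(pages):
        if page["lock_free"]:
            cleaned.append(page_no)
        elif scan_all and page["needs_freeze"]:
            waited.append(page_no)   # must wait for the cleanup lock
            cleaned.append(page_no)
        else:
            skipped.append(page_no)  # left for the next VACUUM
    return cleaned, skipped, waited


pages = [
    {"lock_free": True, "needs_freeze": False},
    {"lock_free": False, "needs_freeze": False},
    {"lock_free": False, "needs_freeze": True},
]
regular = vacuum_pages(pages, scan_all=False)
full_scan = vacuum_pages(pages, scan_all=True)
```

In the regular pass neither busy page causes a wait; in the full scan only the busy page that actually needs freezing does.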

Simon Riggs and Robert Haas

bbb6e559

03 Nov, 2011 1 commit

Support range data types. · 4429f6a9

Committed by Heikki Linnakangas on Nov 03, 2011

Selectivity estimation functions are missing for some range type operators,
which is a TODO.

Jeff Davis

4429f6a9

02 Nov, 2011 1 commit

Comment changes to show bgwriter no longer performs checkpoints. · f3ebaad4

Committed by Simon Riggs on Nov 01, 2011

f3ebaad4

27 Oct, 2011 2 commits

Change FK trigger naming convention to fix self-referential FKs. · 1e3b21dd

Committed by Tom Lane on Oct 26, 2011

Use names like "RI_ConstraintTrigger_a_NNNN" for FK action triggers and
"RI_ConstraintTrigger_c_NNNN" for FK check triggers. This ensures the
action trigger fires first in self-referential cases where the very same
row update fires both an action and a check trigger. This change provides
a non-probabilistic solution for bug #6268, at the risk that it could break
client code that is making assumptions about the exact names assigned to
auto-generated FK triggers. Hence, change this in HEAD only. No need for
forced initdb since old triggers continue to work fine.
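
A short Python sketch of why the `_a_` / `_c_` infix makes the ordering deterministic. It assumes that plain string comparison is an adequate stand-in for how trigger names are sorted; the helper names and OID values are hypothetical.

```python
def action_trigger_name(oid: int) -> str:
    return f"RI_ConstraintTrigger_a_{oid}"


def check_trigger_name(oid: int) -> str:
    return f"RI_ConstraintTrigger_c_{oid}"


# Even when the check trigger has the smaller OID, or the OID gains a digit,
# "_a_" sorts before "_c_", so the action trigger comes first.
firing_order = sorted([check_trigger_name(99999), action_trigger_name(100000)])
```

The names differ at the `a`/`c` character before any digit of the OID is compared, so the OID no longer influences which of the two fires first.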

1e3b21dd

Change FK trigger creation order to better support self-referential FKs. · 58958726

Committed by Tom Lane on Oct 26, 2011

When a foreign-key constraint references another column of the same table,
row updates will queue both the PK's ON UPDATE action and the FK's CHECK
action in the same event. The ON UPDATE action must execute first, else
the CHECK will check a non-final state of the row and possibly throw an
inappropriate error, as seen in bug #6268 from Roman Lytovchenko.

Now, the firing order of multiple triggers for the same event is determined
by the sort order of their pg_trigger.tgnames, and the auto-generated names
we use for FK triggers are "RI_ConstraintTrigger_NNNN" where NNNN is the
trigger OID. So most of the time the firing order is the same as creation
order, and so rearranging the creation order fixes it.

This patch will fail to fix the problem if the OID counter wraps around or
adds a decimal digit (eg, from 99999 to 100000) while we are creating the
triggers for an FK constraint. Given the small odds of that, and the low
usage of self-referential FKs, we'll live with that solution in the back
branches. A better fix is to change the auto-generated names for FK
triggers, but it seems unwise to do that in stable branches because there
may be client code that depends on the naming convention. We'll fix it
that way in HEAD in a separate patch.
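
The digit-rollover failure mode mentioned above can be reproduced with a few lines of Python. This is a sketch that assumes ordinary string comparison approximates the sort on trigger names; the helper name `old_style_name` and the OID values are chosen for illustration only.

```python
def old_style_name(oid: int) -> str:
    return f"RI_ConstraintTrigger_{oid}"


# Usual case: the action trigger is created first and gets the lower OID,
# so sorting by name matches creation order.
action, check = old_style_name(16401), old_style_name(16402)
usual_order = sorted([check, action])

# Rollover case: the OID gains a digit between the two creations. As strings,
# "100000" sorts before "99999", so the check trigger would fire first.
action, check = old_style_name(99999), old_style_name(100000)
rollover_order = sorted([check, action])
```

The second case shows why the fix based on creation order is only probabilistic.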

Back-patch to all supported branches, since this bug has existed for a long
time.

58958726

22 Oct, 2011 1 commit

More cleanup after failed reduced-lock-levels-for-DDL feature. · 5ac59807

Committed by Tom Lane on Oct 21, 2011

Turns out that use of ShareUpdateExclusiveLock or ShareRowExclusiveLock
to protect DDL changes had gotten copied into several places that were
not touched by either of Simon's original patches for the feature, and
thus neither he nor I thought to revert them. (Indeed, it appears that
two of these uses were committed *after* the reversion, which just goes
to show that git merging is no panacea.) Change these places to use
AccessExclusiveLock again. If we ever manage to resurrect that feature,
we're going to have to think a bit harder about how to keep lock level
usage in sync for DDL operations that aren't within the AlterTable
infrastructure.

Two of these bugs are only in HEAD, but one is in the 9.1 branch too.
Alvaro found one of them, I found the other two.

5ac59807

21 Oct, 2011 1 commit

Fix DROP OPERATOR FAMILY IF EXISTS. · 98026192

Committed by Robert Haas on Oct 21, 2011

Essentially, the "IF EXISTS" portion was being ignored, and an error
thrown anyway if the opfamily did not exist.

I broke this in commit fd1843ff, so backpatch to 9.1.X.

Report and diagnosis by KaiGai Kohei.

98026192

20 Oct, 2011 2 commits

Add "skipping" to the NOTICE produced by DROP OPERATOR CLASS IF EXISTS. · 1d751018

Committed by Robert Haas on Oct 19, 2011

This makes this message consistent with all the other similar notices
produced by other DROP IF EXISTS commands.

Noted by KaiGai Kohei

1d751018

Consolidate DROP handling for some object types. · 82a4a777

Committed by Robert Haas on Oct 19, 2011

This gets rid of a significant amount of duplicative code.

KaiGai Kohei, reviewed in earlier versions by Dimitri Fontaine, with
further review and cleanup by me.

82a4a777

19 Oct, 2011 1 commit

Suppress -Wunused-result warnings about write() and fwrite(). · aa90e148

Committed by Tom Lane on Oct 18, 2011

This is merely an exercise in satisfying pedants, not a bug fix, because
in every case we were checking for failure later with ferror(), or else
there was nothing useful to be done about a failure anyway.  Document
the latter cases.

aa90e148

15 Oct, 2011 1 commit

Measure the number of all-visible pages for use in index-only scan costing. · e6858e66

Committed by Tom Lane on Oct 14, 2011

Add a column pg_class.relallvisible to remember the number of pages that
were all-visible according to the visibility map as of the last VACUUM
(or ANALYZE, or some other operations that update pg_class.relpages).
Use relallvisible/relpages, instead of an arbitrary constant, to estimate
how many heap page fetches can be avoided during an index-only scan.
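
As a rough illustration of the estimate, the following Python sketch scales an expected number of heap page fetches by the fraction of pages that are not all-visible. It is a simplification under stated assumptions, not the planner's actual costing code: the function name `estimated_heap_fetches` and the sample numbers are hypothetical.

```python
import math


def estimated_heap_fetches(pages_fetched: int, relallvisible: int, relpages: int) -> int:
    """Scale heap page fetches by the share of pages that are not all-visible."""
    # With no page count recorded, assume nothing is known to be all-visible.
    allvisfrac = relallvisible / relpages if relpages > 0 else 0.0
    return math.ceil(pages_fetched * (1.0 - allvisfrac))


mostly_visible = estimated_heap_fetches(1000, relallvisible=750, relpages=1000)
never_vacuumed = estimated_heap_fetches(1000, relallvisible=0, relpages=1000)
```

A table whose visibility map marks three quarters of its pages all-visible is expected to need a quarter of the heap fetches, while a table with no all-visible pages gets no discount.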

This is pretty primitive and will no doubt see refinements once we've
acquired more field experience with the index-only scan mechanism, but
it's way better than using a constant.

Note: I had to adjust an underspecified query in the window.sql regression
test, because it was changing answers when the plan changed to use an
index-only scan.  Some of the adjacent tests perhaps should be adjusted
as well, but I didn't do that here.

e6858e66

13 Oct, 2011 1 commit

Throw a useful error message if an extension script file is fed to psql. · 458857cc

Committed by Tom Lane on Oct 12, 2011

We have seen one too many reports of people trying to use 9.1 extension
files in the old-fashioned way of sourcing them in psql.  Not only does
that usually not work (due to failure to substitute for MODULE_PATHNAME
and/or @extschema@), but if it did work they'd get a collection of loose
objects not an extension.  To prevent this, insert an \echo ... \quit
line that prints a suitable error message into each extension script file,
and teach commands/extension.c to ignore lines starting with \echo.
That should not only prevent any adverse consequences of loading a script
file the wrong way, but make it crystal clear to users that they need to
do it differently now.
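
The two-sided behavior can be sketched in Python: psql would print the message and quit at the first line, while the server-side loader drops that line and runs the rest. This is only a model of the idea. The real filtering is C code in commands/extension.c, and the extension name `demo`, the function `demo_one`, the exact message wording, and the helper `strip_echo_lines` are hypothetical.

```python
def strip_echo_lines(script: str) -> str:
    """Drop every line that starts with the psql echo meta-command."""
    kept = [line for line in script.splitlines() if not line.startswith("\\echo")]
    return "\n".join(kept)


# First line guards against sourcing the file in psql; the rest is the script.
SCRIPT = r"""\echo Use "CREATE EXTENSION demo" to load this file. \quit
CREATE FUNCTION demo_one() RETURNS integer LANGUAGE sql AS 'SELECT 1';
"""

server_side = strip_echo_lines(SCRIPT)
```

Only the `CREATE FUNCTION` statement survives the filter, so the guard line never reaches the SQL parser when the script is loaded as an extension.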

Tom Lane, following an idea of Andrew Dunstan's.  Back-patch into 9.1
... there is not going to be much value in this if we wait till 9.2.

458857cc

12 Oct, 2011 1 commit

Rearrange the implementation of index-only scans. · a0185461

Committed by Tom Lane on Oct 11, 2011

This commit changes index-only scans so that data is read directly from the
index tuple without first generating a faux heap tuple. The only immediate
benefit is that indexes on system columns (such as OID) can be used in
index-only scans, but this is necessary infrastructure if we are ever to
support index-only scans on expression indexes. The executor is now ready
for that, though the planner still needs substantial work to recognize
the possibility.

To do this, Vars in index-only plan nodes have to refer to index columns
not heap columns. I introduced a new special varno, INDEX_VAR, to mark
such Vars to avoid confusion. (In passing, this commit renames the two
existing special varnos to OUTER_VAR and INNER_VAR.) This allows
ruleutils.c to handle them with logic similar to what we use for subplan
reference Vars.

Since index-only scans are now fundamentally different from regular
indexscans so far as their expression subtrees are concerned, I also chose
to change them to have their own plan node type (and hence, their own
executor source file).

a0185461