1. 31 December 2009, 1 commit
    • Revise pgstat's tracking of tuple changes to improve the reliability of · 48c192c1
      Committed by Tom Lane
      decisions about when to auto-analyze.
      
      The previous code depended on n_live_tuples + n_dead_tuples - last_anl_tuples,
      where all three of these numbers could be bad estimates from ANALYZE itself.
      Even worse, in the presence of a steady flow of HOT updates and matching
      HOT-tuple reclamations, auto-analyze might never trigger at all, even if all
      three numbers are exactly right, because n_dead_tuples could hold steady.
      
      To fix, replace last_anl_tuples with an accurately tracked count of the total
      number of committed tuple inserts + updates + deletes since the last ANALYZE
      on the table.  This can still be compared to the same threshold as before, but
      it's much more trustworthy than the old computation.  Tracking this requires
      one more intra-transaction counter per modified table within backends, but no
      additional memory space in the stats collector.  There probably isn't any
      measurable speed difference; if anything it might be a bit faster than before,
      since I was able to eliminate some per-tuple arithmetic operations in favor of
      adding sums once per (sub)transaction.
      
      Also, simplify the logic around pgstat vacuum and analyze reporting messages
      by not trying to fold VACUUM ANALYZE into a single pgstat message.
      
      The original thought behind this patch was to allow scheduling of analyzes
      on parent tables by artificially inflating their changes_since_analyze count.
      I've left that for a separate patch since this change seems to stand on its
      own merit.
      48c192c1
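
      A minimal sketch of the comparison described above, assuming a hypothetical
      table named foo; this approximates the documented autovacuum analyze threshold
      from the GUC settings and pg_class.reltuples rather than showing the stats
      collector's internal code:

      -- approximate analyze threshold =
      --   autovacuum_analyze_threshold + autovacuum_analyze_scale_factor * reltuples
      SELECT s.relname,
             s.n_live_tup,
             s.n_dead_tup,
             current_setting('autovacuum_analyze_threshold')::integer
               + current_setting('autovacuum_analyze_scale_factor')::float8 * c.reltuples
               AS approx_analyze_threshold
      FROM pg_stat_user_tables s
      JOIN pg_class c ON c.oid = s.relid
      WHERE s.relname = 'foo';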
  2. 30 December 2009, 3 commits
    • Add an index on pg_inherits.inhparent, and use it to avoid seqscans in · 540e69a0
      Committed by Tom Lane
      find_inheritance_children().  This is a complete no-op in databases without
      any inheritance.  In databases where there are just a few entries in
      pg_inherits, it could conceivably be a small loss.  However, in databases with
      many inheritance parents, it can be a big win.
      540e69a0
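
      The lookup that find_inheritance_children() performs is roughly equivalent to
      the SQL below (parent_tbl is a hypothetical name); the new index lets it be
      answered without a sequential scan of pg_inherits:

      -- direct children of a given parent, now served by the index on inhparent
      SELECT inhrelid::regclass AS child
      FROM pg_inherits
      WHERE inhparent = 'parent_tbl'::regclass;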
    • Add the ability to store inheritance-tree statistics in pg_statistic, · 649b5ec7
      Committed by Tom Lane
      and teach ANALYZE to compute such stats for tables that have subclasses.
      Per my proposal of yesterday.
      
      autovacuum still needs to be taught about running ANALYZE on parent tables
      when their subclasses change, but the feature is useful even without that.
      649b5ec7
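
      A sketch of how the new whole-tree statistics can be examined, assuming a
      hypothetical parent table parent_tbl that has children; the inheritance-tree
      entries are stored separately from the parent-only ones and are marked by the
      new inherited flag:

      ANALYZE parent_tbl;

      SELECT tablename, attname, inherited, n_distinct
      FROM pg_stats
      WHERE tablename = 'parent_tbl';
      -- rows with inherited = true describe the whole inheritance tree;
      -- rows with inherited = false describe only the parent's own data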
    • Previous fix for temporary file management broke returning a set from · 84d723b6
      Committed by Heikki Linnakangas
      PL/pgSQL function within an exception handler. Make sure we use the right
      resource owner when we create the tuplestore to hold returned tuples.
      
      Simplify tuplestore API so that the caller doesn't need to be in the right
      memory context when calling tuplestore_put* functions. tuplestore.c
      automatically switches to the memory context used when the tuplestore was
      created. Tuplesort was already modified like this earlier. This patch also
      removes the now useless MemoryContextSwitch calls from callers.
      
      Report by Aleksei on pgsql-bugs on Dec 22 2009. Backpatch to 8.1, like
      the previous patch that broke this.
      84d723b6
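
      A hypothetical example of the pattern that was broken: a set-returning PL/pgSQL
      function whose result tuplestore is first created inside an exception block,
      i.e. inside a subtransaction with its own resource owner:

      CREATE FUNCTION demo_srf() RETURNS SETOF integer LANGUAGE plpgsql AS $$
      BEGIN
          BEGIN
              -- the tuplestore holding the result set is created here,
              -- inside the exception handler's subtransaction
              RETURN QUERY SELECT g FROM generate_series(1, 3) AS g;
          EXCEPTION WHEN division_by_zero THEN
              RETURN;
          END;
      END;
      $$;

      SELECT * FROM demo_srf();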
  3. 29 December 2009, 2 commits
  4. 27 December 2009, 1 commit
  5. 25 December 2009, 1 commit
    • Binary upgrade: · c44327af
      Committed by Bruce Momjian
      Modify pg_dump --binary-upgrade and add backend support routines to
      support the preservation of pg_type oids when doing a binary upgrade.
      This allows user-defined composite types and arrays to be binary
      upgraded.
      c44327af
  6. 24 December 2009, 1 commit
    • Remove code that attempted to rename index columns to keep them in sync with · c176e122
      Committed by Tom Lane
      their underlying table columns.  That code was not bright enough to cope with
      collision situations (ie, new name conflicts with some other column of the
      index).  Since there is no functional reason to do this at all, trying to
      upgrade the logic to be bulletproof doesn't seem worth the trouble.
      
      This change means that both the index name and the column names of an index
      are set when it's created, and won't be automatically changed when the
      underlying table columns are renamed.  Neatnik DBAs are still free to rename
      them manually, of course.
      c176e122
  7. 23 December 2009, 3 commits
    • Always pass catalog id to the options validator function specified in · 4e766f2d
      Committed by Heikki Linnakangas
      CREATE FOREIGN DATA WRAPPER. Arguably it wasn't a bug because the
      documentation said that it's passed the catalog ID or zero, but surely
      we should provide it when it's known. And there isn't currently any
      scenario where it's not known, and I can't imagine having one in the
      future either, so better remove the "or zero" escape hatch and always
      pass a valid catalog ID. Backpatch to 8.4.
      
      Martin Pihlak
      4e766f2d
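
      For reference, a hypothetical validator declaration; an options validator takes
      the option array plus the OID of the catalog the options belong to, and after
      this change that second argument is always a valid catalog OID:

      -- names and library path are placeholders
      CREATE FUNCTION my_fdw_validator(text[], oid) RETURNS void
          AS '$libdir/my_fdw', 'my_fdw_validator' LANGUAGE C STRICT;

      CREATE FOREIGN DATA WRAPPER my_fdw VALIDATOR my_fdw_validator;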
    • Adjust naming of indexes and their columns per recent discussion. · cfc5008a
      Committed by Tom Lane
      Index expression columns are now named after the FigureColname result for
      their expressions, rather than always being "pg_expression_N".  Digits are
      appended to this name if needed to make the column name unique within the
      index.  (That happens for regular columns too, thus fixing the old problem
      that CREATE INDEX fooi ON foo (f1, f1) fails.  Before exclusion indexes
      there was no real reason to do such a thing, but now maybe there is.)
      
      Default names for indexes and associated constraints now include the column
      names of all their columns, not only the first one as in previous practice.
      (Of course, this will be truncated as needed to fit in NAMEDATALEN.  Also,
      pkey indexes retain the historical behavior of not naming specific columns
      at all.)
      
      An example of the results:
      
      regression=# create table foo (f1 int, f2 text,
      regression(# exclude (f1 with =, lower(f2) with =));
      NOTICE:  CREATE TABLE / EXCLUDE will create implicit index "foo_f1_lower_exclusion" for table "foo"
      CREATE TABLE
      regression=# \d foo_f1_lower_exclusion
      Index "public.foo_f1_lower_exclusion"
       Column |  Type   | Definition
      --------+---------+------------
       f1     | integer | f1
       lower  | text    | lower(f2)
      btree, for table "public.foo"
      cfc5008a
    • Disallow comments on columns of relation types other than tables, views, · b7d67954
      Committed by Tom Lane
      and composite types, which are the only relkinds for which pg_dump support
      exists for dumping column comments.  There is no obvious usefulness for
      comments on columns of sequences or toast tables; and while comments on
      index columns might have some value, it's not worth the risk of compatibility
      problems due to possible changes in the algorithm for assigning names to
      index columns.  Per discussion.
      
      In consequence, remove now-dead code for copying such comments in CREATE TABLE
      LIKE.
      b7d67954
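
      In concrete terms (object names are hypothetical), column comments stay valid
      for tables, views, and composite types but are now rejected for other relkinds
      such as indexes:

      COMMENT ON COLUMN my_table.f1 IS 'still allowed';
      COMMENT ON COLUMN my_view.f1 IS 'still allowed';
      COMMENT ON COLUMN my_index.f1 IS 'no longer allowed';   -- now raises an error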
  8. 21 December 2009, 1 commit
  9. 19 December 2009, 2 commits
    • Allow read only connections during recovery, known as Hot Standby. · efc16ea5
      Committed by Simon Riggs
      Enabled by recovery_connections = on (default) and forcing archive recovery using a recovery.conf. Recovery processing now emulates the original transactions as they are replayed, providing full locking and MVCC behaviour for read only queries. Recovery must enter consistent state before connections are allowed, so there is a delay, typically short, before connections succeed. Replay of recovering transactions can conflict and in some cases deadlock with queries during recovery; these result in query cancellation after max_standby_delay seconds have expired. Infrastructure changes have minor effects on normal running, though introduce four new types of WAL record.
      
      New test mode "make standbycheck" allows regression tests of static command behaviour on a standby server while in recovery. Typical and extreme dynamic behaviours have been checked via code inspection and manual testing. Few port specific behaviours have been utilised, though primary testing has been on Linux only so far.
      
      This commit is the basic patch. Additional changes will follow in this release to enhance some aspects of behaviour, notably improved handling of conflicts, deadlock detection and query cancellation. Changes to VACUUM FULL are also required.
      
      Simon Riggs, with significant and lengthy review by Heikki Linnakangas, including streamlined redesign of snapshot creation and two-phase commit.
      
      Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback from many other community members.
      efc16ea5
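
      A minimal standby configuration sketch using the settings named above (the
      archive path and delay value are illustrative only):

      # postgresql.conf on the standby
      recovery_connections = on     # default; permits read-only connections in recovery
      max_standby_delay = 30        # seconds before conflicting queries are cancelled

      # recovery.conf, forcing archive recovery
      restore_command = 'cp /path/to/archive/%f %p'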
    • binary migration: pg_migrator · 78a09145
      Committed by Bruce Momjian
      Add comments about places where system oids have to be preserved for
      binary migration.
      78a09145
  10. 17 December 2009, 1 commit
    • Several fixes for EXPLAIN (FORMAT YAML), plus one for EXPLAIN (FORMAT JSON). · ff499613
      Committed by Robert Haas
      ExplainSeparatePlans() was busted for both JSON and YAML output - the present
      code is a holdover from the original version of my machine-readable explain
      patch, which didn't have the grouping_stack machinery.  Also, fix an odd
      distribution of labor between ExplainBeginGroup() and ExplainYAMLLineStarting()
      when marking lists with "- ", with each providing one character.  This broke
      the output format for multi-query statements.  Also, fix ExplainDummyGroup()
      for the YAML output format.
      
      Along the way, make the YAML format use escape_yaml() in situations where the
      JSON format uses escape_json().  Right now, it doesn't matter because all the
      values are known not to need escaping, but it seems safer this way.  Finally,
      I added some comments to better explain what the YAML output format is doing.
      
      Greg Sabino Mullane reported the issues with multi-query statements.
      Analysis and remaining cleanups by me.
      ff499613
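
      For context, the affected output formats are selected through the EXPLAIN
      options syntax, for example (table name hypothetical):

      EXPLAIN (FORMAT YAML) SELECT * FROM foo;
      EXPLAIN (FORMAT JSON, ANALYZE) SELECT * FROM foo;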
  11. 15 December 2009, 1 commit
  12. 12 December 2009, 1 commit
    • Export ExplainBeginOutput() and ExplainEndOutput() for auto_explain. · 02490d46
      Committed by Robert Haas
      Without these functions, anyone outside of explain.c can't actually use
      ExplainPrintPlan, because the ExplainState won't be initialized properly.
      The user-visible result of this was a crash when using auto_explain with
      the JSON output format.
      
      Report by Euler Taveira de Oliveira.  Analysis by Tom Lane.  Patch by me.
      02490d46
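
      The crashing case corresponds roughly to the following session-level
      auto_explain setup (values are illustrative):

      LOAD 'auto_explain';
      SET auto_explain.log_min_duration = 0;     -- log every statement's plan
      SET auto_explain.log_format = 'json';      -- machine-readable format
      SELECT count(*) FROM pg_class;             -- plan is logged in JSON, no crash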
  13. 11 December 2009, 2 commits
  14. 10 December 2009, 1 commit
    • Prevent indirect security attacks via changing session-local state within · 62aba765
      Committed by Tom Lane
      an allegedly immutable index function.  It was previously recognized that
      we had to prevent such a function from executing SET/RESET ROLE/SESSION
      AUTHORIZATION, or it could trivially obtain the privileges of the session
      user.  However, since there is in general no privilege checking for changes
      of session-local state, it is also possible for such a function to change
      settings in a way that might subvert later operations in the same session.
      Examples include changing search_path to cause an unexpected function to
      be called, or replacing an existing prepared statement with another one
      that will execute a function of the attacker's choosing.
      
      The present patch secures VACUUM, ANALYZE, and CREATE INDEX/REINDEX against
      these threats, which are the same places previously deemed to need protection
      against the SET ROLE issue.  GUC changes are still allowed, since there are
      many useful cases for that, but we prevent security problems by forcing a
      rollback of any GUC change after completing the operation.  Other cases are
      handled by throwing an error if any change is attempted; these include temp
      table creation, closing a cursor, and creating or deleting a prepared
      statement.  (In 7.4, the infrastructure to roll back GUC changes doesn't
      exist, so we settle for rejecting changes of "search_path" in these contexts.)
      
      Original report and patch by Gurjeet Singh, additional analysis by
      Tom Lane.
      
      Security: CVE-2009-4136
      62aba765
  15. 07 December 2009, 1 commit
  16. 21 November 2009, 1 commit
    • Add a WHEN clause to CREATE TRIGGER, allowing a boolean expression to be · 7fc0f062
      Committed by Tom Lane
      checked to determine whether the trigger should be fired.
      
      For BEFORE triggers this is mostly a matter of spec compliance; but for AFTER
      triggers it can provide a noticeable performance improvement, since queuing of
      a deferred trigger event and re-fetching of the row(s) at end of statement can
      be short-circuited if the trigger does not need to be fired.
      
      Takahiro Itagaki, reviewed by KaiGai Kohei.
      7fc0f062
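
      A sketch of the new syntax with hypothetical table and function names; the
      AFTER trigger event is only queued when the WHEN condition evaluates to true,
      which is where the performance benefit comes from:

      CREATE TRIGGER log_balance_change
          AFTER UPDATE ON accounts
          FOR EACH ROW
          WHEN (OLD.balance IS DISTINCT FROM NEW.balance)
          EXECUTE PROCEDURE log_balance_change_fn();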
  17. 19 November 2009, 1 commit
  18. 17 November 2009, 1 commit
  19. 12 November 2009, 1 commit
    • Make initdb behave sanely when the selected locale has codeset "US-ASCII". · 8f8a5df6
      Committed by Tom Lane
      Per discussion, this should result in defaulting to SQL_ASCII encoding.
      The original coding could not support that because it conflated selection
      of SQL_ASCII encoding with not being able to determine the encoding.
      Adjust pg_get_encoding_from_locale()'s API to distinguish these cases,
      and fix callers appropriately.  Only initdb actually changes behavior,
      since the other callers were perfectly content to consider these cases
      equivalent.
      
      Per bug #5178 from Boh Yap.  Not going to bother back-patching, since
      no one has complained before and there's an easy workaround (namely,
      specify the encoding you want).
      8f8a5df6
  20. 11 November 2009, 2 commits
    • Revert the temporary patch to work around Snow Leopard readdir() bug. · 21e3edd6
      Committed by Tom Lane
      Apple has fixed that bug in 10.6.2, and we should encourage users to
      update to that version rather than trusting this cosmetic patch.
      As was recently noted by Stephen Tyler, this patch was only masking
      the problem in the context of DROP TABLESPACE, but the failure could
      occur in other places such as pg_xlog cleanup.
      21e3edd6
    • Fix longstanding problems in VACUUM caused by untimely interruptions · e7ec0222
      Committed by Alvaro Herrera
      In VACUUM FULL, an interrupt after the initial transaction has been recorded
      as committed can cause postmaster to restart with the following error message:
      PANIC: cannot abort transaction NNNN, it was already committed
      This problem has been reported many times.
      
      In lazy VACUUM, an interrupt after the table has been truncated by
      lazy_truncate_heap causes other backends' relcache to still point to the
      removed pages; this can cause future INSERT and UPDATE queries to error out
      with the following error message:
      could not read block XX of relation 1663/NNN/MMMM: read only 0 of 8192 bytes
      The window to this race condition is extremely narrow, but it has been seen in
      the wild involving a cancelled autovacuum process.
      
      The solution for both problems is to inhibit interrupts in both operations
      until after the respective transactions have been committed.  It's not a
      complete solution, because the transaction could theoretically be aborted by
      some other error, but at least fixes the most common causes of both problems.
      e7ec0222
  21. 07 November 2009, 1 commit
  22. 06 November 2009, 1 commit
  23. 05 November 2009, 1 commit
    • Add support for invoking parser callback hooks via SPI and in cached plans. · 9bedd128
      Committed by Tom Lane
      As proof of concept, modify plpgsql to use the hooks.  plpgsql is still
      inserting $n symbols textually, but the "back end" of the parsing process now
      goes through the ParamRef hook instead of using a fixed parameter-type array,
      and then execution only fetches actually-referenced parameters, using a hook
      added to ParamListInfo.
      
      Although there's a lot left to be done in plpgsql, this already cures the
      "if (TG_OP = 'INSERT' and NEW.foo ...)"  problem, as illustrated by the
      changed regression test.
      9bedd128
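
      The cured pattern, in a hypothetical trigger function: because only the
      variables actually referenced at execution time are fetched, the
      short-circuited NEW.foo reference no longer fails when the trigger fires for a
      DELETE, where NEW is not assigned:

      CREATE FUNCTION audit_fn() RETURNS trigger LANGUAGE plpgsql AS $$
      BEGIN
          -- for DELETE, TG_OP <> 'INSERT', so NEW.foo is never fetched
          IF (TG_OP = 'INSERT' AND NEW.foo IS NOT NULL) THEN
              RAISE NOTICE 'inserted foo = %', NEW.foo;
          END IF;
          RETURN NULL;   -- AFTER row trigger; return value is ignored
      END;
      $$;

      CREATE TRIGGER audit_trg AFTER INSERT OR UPDATE OR DELETE ON some_table
          FOR EACH ROW EXECUTE PROCEDURE audit_fn();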
  24. 04 November 2009, 1 commit
  25. 28 October 2009, 1 commit
    • Fix AfterTriggerSaveEvent to use a test and elog, not just Assert, to check · 44956c52
      Committed by Tom Lane
      that it's called within an AfterTriggerBeginQuery/AfterTriggerEndQuery pair.
      The RI cascade triggers suppress that overhead on the assumption that they
      are always run non-deferred, so it's possible to violate the condition if
      someone mistakenly changes pg_trigger to mark such a trigger deferred.
      We don't really care about supporting that, but throwing an error instead
      of crashing seems desirable.  Per report from Marcelo Costa.
      44956c52
  26. 26 October 2009, 1 commit
    • Re-implement EvalPlanQual processing to improve its performance and eliminate · 9f2ee8f2
      Committed by Tom Lane
      a lot of strange behaviors that occurred in join cases.  We now identify the
      "current" row for every joined relation in UPDATE, DELETE, and SELECT FOR
      UPDATE/SHARE queries.  If an EvalPlanQual recheck is necessary, we jam the
      appropriate row into each scan node in the rechecking plan, forcing it to emit
      only that one row.  The former behavior could rescan the whole of each joined
      relation for each recheck, which was terrible for performance, and what's much
      worse could result in duplicated output tuples.
      
      Also, the original implementation of EvalPlanQual could not re-use the recheck
      execution tree --- it had to go through a full executor init and shutdown for
      every row to be tested.  To avoid this overhead, I've associated a special
      runtime Param with each LockRows or ModifyTable plan node, and arranged to
      make every scan node below such a node depend on that Param.  Thus, by
      signaling a change in that Param, the EPQ machinery can just rescan the
      already-built test plan.
      
      This patch also adds a prohibition on set-returning functions in the
      targetlist of SELECT FOR UPDATE/SHARE.  This is needed to avoid the
      duplicate-output-tuple problem.  It seems fairly reasonable since the
      other restrictions on SELECT FOR UPDATE are meant to ensure that there
      is a unique correspondence between source tuples and result tuples,
      which an output SRF destroys as much as anything else does.
      9f2ee8f2
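
      The new restriction from the last paragraph, illustrated with a hypothetical
      table foo:

      -- allowed: plain columns in the target list
      SELECT f1, f2 FROM foo WHERE f1 < 10 FOR UPDATE;

      -- now rejected: a set-returning function in the target list of FOR UPDATE
      SELECT generate_series(1, 2) AS n, f1 FROM foo FOR UPDATE;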
  27. 15 October 2009, 1 commit
    • Support SQL-compliant triggers on columns, ie fire only if certain columns · b2734a0d
      Committed by Tom Lane
      are named in the UPDATE's SET list.
      
      Note: the schema of pg_trigger has not actually changed; we've just started
      to use a column that was there all along.  catversion bumped anyway so that
      this commit is included in the history of potentially interesting changes
      to system catalog contents.
      
      Itagaki Takahiro
      b2734a0d
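
      The new syntax in a sketch with hypothetical names; the trigger fires only when
      a listed column appears in the UPDATE's SET list:

      CREATE TRIGGER check_price_change
          BEFORE UPDATE OF price ON products
          FOR EACH ROW
          EXECUTE PROCEDURE check_price_change_fn();

      UPDATE products SET price = price * 1.1;   -- fires the trigger
      UPDATE products SET name = upper(name);    -- does not fire it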
  28. 13 October 2009, 3 commits
    • Code review for LIKE INCLUDING patch --- clean up some cosmetic and not · 8d54c248
      Committed by Tom Lane
      so cosmetic stuff.
      8d54c248
    • Move the handling of SELECT FOR UPDATE locking and rechecking out of · 0adaf4cb
      Committed by Tom Lane
      execMain.c and into a new plan node type LockRows.  Like the recent change
      to put table updating into a ModifyTable plan node, this increases planning
      flexibility by allowing the operations to occur below the top level of the
      plan tree.  It's necessary in any case to restore the previous behavior of
      having FOR UPDATE locking occur before ModifyTable does.
      
      This partially refactors EvalPlanQual to allow multiple rows-under-test
      to be inserted into the EPQ machinery before starting an EPQ test query.
      That isn't sufficient to fix EPQ's general bogosity in the face of plans
      that return multiple rows per test row, though.  Since this patch is
      mostly about getting some plan node infrastructure in place and not about
      fixing ten-year-old bugs, I will leave EPQ improvements for another day.
      
      Another behavioral change that we could now think about is doing FOR UPDATE
      before LIMIT, but that too seems like it should be treated as a followon
      patch.
      0adaf4cb
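
      With this change the locking step shows up as its own node in EXPLAIN output,
      for example (hypothetical table, costs omitted):

      EXPLAIN (COSTS OFF) SELECT * FROM foo WHERE f1 = 1 FOR UPDATE;
      -- the plan now shows a LockRows node above the scan that produces the rows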
  29. 10 October 2009, 1 commit
    • Split the processing of INSERT/UPDATE/DELETE operations out of execMain.c. · 8a5849b7
      Committed by Tom Lane
      They are now handled by a new plan node type called ModifyTable, which is
      placed at the top of the plan tree.  In itself this change doesn't do much,
      except perhaps make the handling of RETURNING lists and inherited UPDATEs a
      tad less klugy.  But it is necessary preparation for the intended extension of
      allowing RETURNING queries inside WITH.
      
      Marko Tiikkaja
      8a5849b7
  30. 08 October 2009, 1 commit