Commits · 0d692a0dc9f0e532c67c577187fe5d7d323cb95b · Greenplum / Gpdb

02 Jan, 2011 2 commits

Robert Haas authored Jan 01, 2011

Foreign tables are a core component of SQL/MED. This commit does
not provide a working SQL/MED infrastructure, because foreign tables
cannot yet be queried. Support for foreign table scans will need to
be added in a future patch. However, this patch creates the necessary
system catalog structure, syntax support, and support for ancillary
operations such as COMMENT and SECURITY LABEL.

Shigeru Hanada, heavily revised by Robert Haas

0d692a0d

Stamp copyrights for year 2011. · 5d950e3b
Bruce Momjian authored Jan 01, 2011

5d950e3b

14 Dec, 2010 1 commit

Generalize concept of temporary relations to "relation persistence". · 5f7b58fa

Robert Haas authored Dec 13, 2010

This commit replaces pg_class.relistemp with pg_class.relpersistence;
and also modifies the RangeVar node type to carry relpersistence rather
than istemp. It also removes rd_istemp from RelationData and
instead performs the correct computation based on relpersistence.

For clarity, we add three new macros: RelationNeedsWAL(),
RelationUsesLocalBuffers(), and RelationUsesTempNamespace(), so that we
can clarify the purpose of each check that previously depended on
rd_istemp.

This is intended as infrastructure for the upcoming unlogged tables
patch, as well as for future possible work on global temporary tables.
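
As a rough illustration of the idea in this commit message, the sketch below replaces a single "is temp" boolean with a persistence code and wraps each check in a predicate named after its purpose. The struct, the `Sketch`-prefixed names, and the letter codes are assumptions made for this example, not the actual PostgreSQL definitions.

```c
#include <stdbool.h>

/* Persistence codes; the letter values are an assumption for this sketch. */
#define SKETCH_RELPERSISTENCE_PERMANENT 'p'
#define SKETCH_RELPERSISTENCE_TEMP      't'

/* Hypothetical, heavily simplified stand-in for RelationData. */
typedef struct SketchRelation
{
    char relpersistence;    /* takes the place of the old boolean istemp flag */
} SketchRelation;

/* Each predicate states why the caller cares, instead of asking "is it temp?". */
bool
SketchRelationNeedsWAL(const SketchRelation *rel)
{
    return rel->relpersistence == SKETCH_RELPERSISTENCE_PERMANENT;
}

bool
SketchRelationUsesLocalBuffers(const SketchRelation *rel)
{
    return rel->relpersistence == SKETCH_RELPERSISTENCE_TEMP;
}

bool
SketchRelationUsesTempNamespace(const SketchRelation *rel)
{
    return rel->relpersistence == SKETCH_RELPERSISTENCE_TEMP;
}
```

With separate predicates, a later persistence level could answer each question differently without touching the call sites.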

5f7b58fa

09 Dec, 2010 2 commits

Self review of previous patch. Fix assumption that xmax >= xmin. · 9975c683
Simon Riggs authored Dec 09, 2010

9975c683

Reduce spurious Hot Standby conflicts from never-visible records. · b9075a6d

Simon Riggs authored Dec 09, 2010

Hot Standby conflicts only with tuples that were visible at
some point. So ignore tuples from aborted transactions and
tuples updated/deleted during the inserting transaction when
generating the conflict transaction ids.

Following detailed analysis and test case by Noah Misch.
The original report covered btree delete records; Heikki Linnakangas
correctly observed that this applies to other cases also.
The fix covers all sources of cleanup records via common code.
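
The filtering rule described above can be sketched as follows. This is a simplified model under stated assumptions: the tuple struct and function name are hypothetical, commit status is passed in as a flag rather than looked up, and transaction ids are compared as plain integers, ignoring wraparound.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t SketchXid;

/* Hypothetical, simplified view of a tuple that cleanup is about to remove. */
typedef struct SketchDeadTuple
{
    SketchXid xmin;            /* inserting transaction */
    SketchXid xmax;            /* deleting/updating transaction, 0 if none */
    bool      xmin_committed;  /* did the inserting transaction commit? */
} SketchDeadTuple;

/*
 * Advance *latest only for tuples that were visible to someone at some point.
 * Plain integer comparison is used here; real xid comparison must handle
 * wraparound, which this sketch deliberately ignores.
 */
void
sketch_advance_latest_removed(const SketchDeadTuple *tup, SketchXid *latest)
{
    /* Inserted by an aborted transaction: never visible, no conflict. */
    if (!tup->xmin_committed)
        return;

    /* Inserted and removed by the same transaction: never visible to others. */
    if (tup->xmax == tup->xmin)
        return;

    if (tup->xmax != 0 && tup->xmax > *latest)
        *latest = tup->xmax;
}
```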

b9075a6d

21 Sep, 2010 1 commit

Remove cvs keywords from all files. · 9f2e2113
Magnus Hagander authored Sep 20, 2010

9f2e2113

12 Sep, 2010 1 commit

SERIALIZABLE transactions are actually implemented beneath the covers with · 5eb15c99

Joe Conway authored Sep 11, 2010

transaction snapshots, i.e. a snapshot registered at the beginning of
a transaction. Change variable naming and comments to reflect this reality
in preparation for a future, truly serializable mode, e.g.
Serializable Snapshot Isolation (SSI).

For the moment transaction snapshots are still used to implement
SERIALIZABLE, but hopefully not for too much longer. Patch by Kevin
Grittner and Dan Ports with review and some minor wording changes by me.

5eb15c99

30 Jul, 2010 1 commit

Fix possible page corruption by ALTER TABLE .. SET TABLESPACE. · 1a078629

Robert Haas authored Jul 29, 2010

If a zeroed page is present in the heap, ALTER TABLE .. SET TABLESPACE will
set the LSN and TLI while copying it, which is wrong, and heap_xlog_newpage()
will do the same thing during replay, so the corruption propagates to any
standby.  Note, however, that the bug can't be demonstrated unless archiving
is enabled, since otherwise we skip WAL logging altogether, and the LSN/TLI
are not set.

Back-patch to 8.0; prior releases do not have tablespaces.

Analysis and patch by Jeff Davis.  Adjustments for back-branches and minor
wordsmithing by me.
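
One plausible guard against the problem described above is to leave never-initialized pages untouched when stamping the LSN. The sketch below is illustrative only: the header layout, the "upper == 0 means new" test, and all `sketch_` names are assumptions for this example, and the commit message itself does not spell out the actual fix.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 8192

/* Hypothetical, simplified page header; real pages carry more fields. */
typedef struct SketchPageHeader
{
    uint64_t lsn;    /* WAL position of the last change to the page */
    uint16_t upper;  /* assumed to be zero on a never-initialized page */
} SketchPageHeader;

bool
sketch_page_is_new(const char *page)
{
    SketchPageHeader hdr;

    memcpy(&hdr, page, sizeof(hdr));
    return hdr.upper == 0;
}

/* Stamp the LSN only on initialized pages, so a zeroed page stays all zeros. */
void
sketch_stamp_lsn(char *page, uint64_t lsn)
{
    SketchPageHeader hdr;

    if (sketch_page_is_new(page))
        return;

    memcpy(&hdr, page, sizeof(hdr));
    hdr.lsn = lsn;
    memcpy(page, &hdr, sizeof(hdr));
}
```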

1a078629

07 Jul, 2010 1 commit

pgindent run for 9.0, second run · 239d769e
Bruce Momjian authored Jul 06, 2010

239d769e

03 May, 2010 2 commits

Improve printing of XLOG_HEAP_NEWPAGE records to include the forknum. · 609a63fd
Tom Lane authored May 02, 2010

609a63fd

Fix replay of XLOG_HEAP_NEWPAGE WAL records to pay attention to the forknum · e55e6ecf

Tom Lane authored May 02, 2010

field of the WAL record. The previous coding always wrote to the main fork,
resulting in data corruption if the page was meant to go into a non-default
fork.

At present, the only operation that can produce such WAL records is
ALTER TABLE/INDEX SET TABLESPACE when executed with archive_mode = on.
Data corruption would be observed on standby slaves, and could occur on the
master as well if a database crash and recovery occurred after committing
the ALTER and before the next checkpoint. Per report from Gordon Shannon.

Back-patch to 8.4; the problem doesn't exist in earlier branches because
we didn't have a concept of multiple relation forks then.

e55e6ecf

22 Apr, 2010 1 commit

Further reductions in Hot Standby conflict processing. These · 781ec6b7

Simon Riggs authored Apr 22, 2010

come from the realisation that HEAP2_CLEAN records don't
always remove user visible data, so conflict processing for
them can be skipped. Confirm validity using Assert checks,
clarify circumstances under which we log heap_cleanup_info
records. Tuning arises from bug fixing of earlier safety
check failures.

781ec6b7

26 Feb, 2010 1 commit

pgindent run for 9.0 · 65e806cb
Bruce Momjian authored Feb 26, 2010

65e806cb

15 Feb, 2010 1 commit

Wrap calls to SearchSysCache and related functions using macros. · e26c539e

Robert Haas authored Feb 14, 2010

The purpose of this change is to eliminate the need for every caller
of SearchSysCache, SearchSysCacheCopy, SearchSysCacheExists,
GetSysCacheOid, and SearchSysCacheList to know the maximum number
of allowable keys for a syscache entry (currently 4).  This will
make it far easier to increase the maximum number of keys in a
future release should we choose to do so, and it makes the code
shorter, too.
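
The wrapping technique described above can be sketched with a toy lookup function. The function body, the `Sketch` names, and the return value are assumptions for illustration; only the idea of fixed-arity macros padding the unused keys with zeros is taken from the commit message.

```c
#include <stdint.h>

typedef uintptr_t SketchDatum;

/*
 * Hypothetical stand-in for a lookup that always takes the maximum number of
 * keys (4). For demonstration it just reports how many keys are nonzero.
 */
int
sketch_search_syscache(int cache_id, SketchDatum k1, SketchDatum k2,
                       SketchDatum k3, SketchDatum k4)
{
    (void) cache_id;
    return (k1 != 0) + (k2 != 0) + (k3 != 0) + (k4 != 0);
}

/* Callers name only the keys they use; the macros pad the rest with zeros. */
#define SketchSearchSysCache1(id, k1) \
    sketch_search_syscache(id, k1, 0, 0, 0)
#define SketchSearchSysCache2(id, k1, k2) \
    sketch_search_syscache(id, k1, k2, 0, 0)
#define SketchSearchSysCache3(id, k1, k2, k3) \
    sketch_search_syscache(id, k1, k2, k3, 0)
#define SketchSearchSysCache4(id, k1, k2, k3, k4) \
    sketch_search_syscache(id, k1, k2, k3, k4)
```

If the maximum key count ever grew, only the underlying function and the macros would need to change, not every caller.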

Design and review by Tom Lane.

e26c539e

08 Feb, 2010 1 commit

Remove old-style VACUUM FULL (which was known for a little while as · 0a469c87

Tom Lane authored Feb 08, 2010

VACUUM FULL INPLACE), along with a boatload of subsidiary code and complexity.
Per discussion, the use case for this method of vacuuming is no longer large
enough to justify maintaining it; not to mention that we don't wish to invest
the work that would be needed to make it play nicely with Hot Standby.

Aside from the code directly related to old-style VACUUM FULL, this commit
removes support for certain WAL record types that could only be generated
within VACUUM FULL, redirect-pointer removal in heap_page_prune, and
nontransactional generation of cache invalidation sinval messages (the last
being the sticking point for Hot Standby).

We still have to retain all code that copes with finding HEAP_MOVED_OFF and
HEAP_MOVED_IN flag bits on existing tuples. This can't be removed as long
as we want to support in-place update from pre-9.0 databases.

0a469c87

03 Feb, 2010 1 commit

Move the responsibility of writing a "unlogged WAL operation" record from · 9de778b2

Heikki Linnakangas authored Feb 03, 2010

heap_sync() to the callers, because heap_sync() is sometimes called even
if the operation itself is WAL-logged. This eliminates the bogus unlogged
records from CLUSTER that Simon Riggs reported, patch by Fujii Masao.

9de778b2

30 Jan, 2010 1 commit

Filter recovery conflicts based upon dboid from relfilenode of WAL · 76be0c81

Simon Riggs authored Jan 29, 2010

records for heap and btree. Minor change, mostly API changes to
pass through the required values. This is a simple change though
also provides the refactoring required for further enhancements
to conflict processing using the relOid. Changes only have effect
during Hot Standby.

76be0c81

21 Jan, 2010 1 commit

Write a WAL record whenever we perform an operation without WAL-logging · 09b115f7

Heikki Linnakangas authored Jan 20, 2010

that would've been WAL-logged if archiving was enabled. If we encounter
such records in archive recovery anyway, we know that some data is
missing from the log. A WARNING is emitted in that case.

Original patch by Fujii Masao, with changes by me.

09b115f7

14 Jan, 2010 1 commit

First part of refactoring of code for ResolveRecoveryConflict. Purposes · e99767bc

Simon Riggs authored Jan 14, 2010

of this are to centralise the conflict code to allow further change,
as well as to allow passing through the full reason for the conflict
through to the conflicting backends. Backend state alters how we
can handle different types of conflict so this is now required.
As originally suggested by Heikki, no longer optional.

e99767bc

10 Jan, 2010 1 commit

Remove partial, broken support for NULL pointers when fetching attributes. · 84b6d5f3

Robert Haas authored Jan 10, 2010

Previously, fastgetattr() and heap_getattr() tested their fourth argument
against a null pointer, but any attempt to use them with a literal-NULL
fourth argument evaluated to *(void *)0, resulting in a compiler error.
Remove these NULL tests to avoid leading future readers of this code to
believe that this has a chance of working. Also clean up related legacy
code in nocachegetattr(), heap_getsysattr(), and nocache_index_getattr().

The new coding standard is that any code which calls a getattr-type
function or macro which takes an isnull argument MUST pass a valid
boolean pointer. Per discussion with Bruce Momjian, Tom Lane, Alvaro
Herrera.
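
The calling convention stated above can be illustrated with a toy accessor. The row struct and function are hypothetical and much simpler than the real getattr-type functions; the point is only that the accessor writes through `isnull` unconditionally, so callers must always supply a valid pointer.

```c
#include <stdbool.h>

/* Hypothetical, simplified row with four integer attributes. */
typedef struct SketchRow
{
    int  values[4];
    bool nulls[4];
} SketchRow;

/*
 * attnum is 1-based. isnull MUST point to a valid bool: there is
 * deliberately no NULL test, mirroring the rule in the commit message.
 */
int
sketch_getattr(const SketchRow *row, int attnum, bool *isnull)
{
    *isnull = row->nulls[attnum - 1];
    return *isnull ? 0 : row->values[attnum - 1];
}
```

A caller declares a local `bool isnull;` and passes `&isnull`, even when it does not intend to inspect the result.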

84b6d5f3

03 Jan, 2010 1 commit

Update copyright for the year 2010. · 02398008
Bruce Momjian authored Jan 02, 2010

02398008

19 Dec, 2009 1 commit

Allow read only connections during recovery, known as Hot Standby. · efc16ea5

Simon Riggs authored Dec 19, 2009

Enabled by recovery_connections = on (default) and forcing archive recovery using a recovery.conf. Recovery processing now emulates the original transactions as they are replayed, providing full locking and MVCC behaviour for read only queries. Recovery must enter consistent state before connections are allowed, so there is a delay, typically short, before connections succeed. Replay of recovering transactions can conflict and in some cases deadlock with queries during recovery; these result in query cancellation after max_standby_delay seconds have expired. Infrastructure changes have minor effects on normal running, though introduce four new types of WAL record.

New test mode "make standbycheck" allows regression tests of static command behaviour on a standby server while in recovery. Typical and extreme dynamic behaviours have been checked via code inspection and manual testing. Few port specific behaviours have been utilised, though primary testing has been on Linux only so far.

This commit is the basic patch. Additional changes will follow in this release to enhance some aspects of behaviour, notably improved handling of conflicts, deadlock detection and query cancellation. Changes to VACUUM FULL are also required.

Simon Riggs, with significant and lengthy review by Heikki Linnakangas, including streamlined redesign of snapshot creation and two-phase commit.

Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback from many other community members.

efc16ea5

24 Aug, 2009 1 commit

Fix a violation of WAL coding rules in the recent patch to include an · 7fc7a7c4

Tom Lane authored Aug 24, 2009

"all tuples visible" flag in heap page headers. The flag update *must*
be applied before calling XLogInsert, but heap_update and the tuple
moving routines in VACUUM FULL were ignoring this rule. A crash and
replay could therefore leave the flag incorrectly set, causing rows
to appear visible in seqscans when they should not be. This might explain
recent reports of data corruption from Jeff Ross and others.

In passing, do a bit of editorialization on comments in visibilitymap.c.

7fc7a7c4

11 Jun, 2009 2 commits

8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef list · d7471402
Bruce Momjian authored Jun 11, 2009

provided by Andrew.

d7471402

Keep rs_startblock the same during heap_rescan, so that a rescan of a SeqScan · 61dd4185

Tom Lane authored Jun 10, 2009

node starts from the same place as the first scan did.  This avoids surprising
behavior of scrollable and WITH HOLD cursors, as seen in Mark Kirkwood's bug
report of yesterday.

It's not entirely clear whether a rescan should be forced to drop out of the
syncscan mode, but for the moment I left the code behaving the same on that
point.  Any change there would only be a performance and not a correctness
issue, anyway.

Back-patch to 8.3, since the unstable behavior was created by the syncscan
patch.

61dd4185

13 May, 2009 1 commit

Fix LOCK TABLE to eliminate the race condition that could make it give weird · f23bdda3

Tom Lane authored May 12, 2009

errors when tables are concurrently dropped. To do this we must take lock
on each relation before we check its privileges. The old code was trying
to do that the other way around, which is a bit pointless when there are lots
of other commands that lock relations before checking privileges. I did keep
it checking each relation's privilege before locking the next relation, which
is a detail that ALTER TABLE isn't too picky about.

f23bdda3

21 Jan, 2009 1 commit

Add a new option to RestoreBkpBlocks() to indicate if a cleanup lock should · b2a667b9

Heikki Linnakangas authored Jan 20, 2009

be used instead of the normal exclusive lock, and make WAL redo functions
responsible for calling RestoreBkpBlocks(). They know better what kind of a
lock they need.

At the moment, this just moves things around with no functional change, but
makes the hot standby patch that's under review cleaner.

b2a667b9

02 Jan, 2009 1 commit

Update copyright for 2009. · 511db38a
Bruce Momjian authored Jan 01, 2009

511db38a

17 Dec, 2008 1 commit

Make heap_update() set newtup->t_tableOid correctly, for consistency with · fc3297d8

Tom Lane authored Dec 16, 2008

the other major heapam.c functions.  The only known consequence of this
omission is that UPDATE RETURNING failed to return the correct value for
"tableoid", as per report from KaiGai Kohei.

Back-patch to 8.2.  Arguably it's wrong all the way back; but without
evidence of visible breakage before RETURNING was added, I'll desist from
patching the older branches.

fc3297d8

03 Dec, 2008 1 commit

Introduce visibility map. The visibility map is a bitmap with one bit per · 608195a3

Heikki Linnakangas authored Dec 03, 2008

heap page, where a set bit indicates that all tuples on the page are
visible to all transactions, and the page therefore doesn't need
vacuuming. It is stored in a new relation fork.

Lazy vacuum uses the visibility map to skip pages that don't need
vacuuming. Vacuum is also responsible for setting the bits in the map.
In the future, this can hopefully be used to implement index-only-scans,
but we can't currently guarantee that the visibility map is always 100%
up-to-date.

In addition to the visibility map, there's a new PD_ALL_VISIBLE flag on
each heap page, also indicating that all tuples on the page are visible to
all transactions. It's important that this flag is kept up-to-date. It
is also used to skip visibility tests in sequential scans, which gives a
small performance gain on seqscans.
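
A minimal in-memory sketch of the "one bit per heap page" idea follows. It assumes a fixed-size array and hypothetical `sketch_vm_*` helpers; the real map lives in its own relation fork and is accessed through buffers, which this example does not model.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_PAGES 1024

/* One bit per heap page; a set bit means "all tuples visible to everyone". */
typedef struct SketchVisibilityMap
{
    uint8_t bits[SKETCH_MAX_PAGES / 8];
} SketchVisibilityMap;

void
sketch_vm_set(SketchVisibilityMap *vm, uint32_t page)
{
    vm->bits[page / 8] |= (uint8_t) (1u << (page % 8));
}

/* Must be called when a page is modified, so the bit never overstates. */
void
sketch_vm_clear(SketchVisibilityMap *vm, uint32_t page)
{
    vm->bits[page / 8] &= (uint8_t) ~(1u << (page % 8));
}

bool
sketch_vm_test(const SketchVisibilityMap *vm, uint32_t page)
{
    return ((vm->bits[page / 8] >> (page % 8)) & 1u) != 0;
}

/* Count the pages a lazy vacuum would still have to visit. */
uint32_t
sketch_pages_to_vacuum(const SketchVisibilityMap *vm, uint32_t npages)
{
    uint32_t count = 0;

    for (uint32_t page = 0; page < npages; page++)
        if (!sketch_vm_test(vm, page))
            count++;
    return count;
}
```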

608195a3

19 Nov, 2008 1 commit

Rethink the way FSM truncation works. Instead of WAL-logging FSM · 33960006

Heikki Linnakangas authored Nov 19, 2008

truncations in FSM code, call FreeSpaceMapTruncateRel from smgr_redo. To
make that cleaner from modularity point of view, move the WAL-logging one
level up to RelationTruncate, and move RelationTruncate and all the
related WAL-logging to new src/backend/catalog/storage.c file. Introduce
new RelationCreateStorage and RelationDropStorage functions that are used
instead of calling smgrcreate/smgrscheduleunlink directly. Move the
pending rel deletion stuff from smgrcreate/smgrscheduleunlink to the new
functions. This leaves smgr.c as a thin wrapper around md.c; all the
transactional stuff is now in storage.c.

This will make it easier to add new forks with similar truncation logic,
like the visibility map.

33960006

07 Nov, 2008 1 commit

Improve bulk-insert performance by keeping the current target buffer pinned · 85e2cedf

Tom Lane authored Nov 06, 2008

(but not locked, as that would risk deadlocks).  Also, make it work in a small
ring of buffers to avoid having bulk inserts trash the whole buffer arena.

Robert Haas, after an idea of Simon Riggs'.

85e2cedf

01 Nov, 2008 1 commit

Update FSM on WAL replay. This is a bit limited; the FSM is only updated · e9816533

Heikki Linnakangas authored Oct 31, 2008

on non-full-page-image WAL records, and quite arbitrarily, only if there's
less than 20% free space on the page after the insert/update (not on HOT
updates, though). The 20% cutoff should avoid most of the overhead, when
replaying a bulk insertion, for example, while ensuring that pages that
are full are marked as full in the FSM.

This is mostly to avoid the nasty worst case scenario, where you replay
from a PITR archive, and the FSM information in the base backup is really
out of date. If there were a lot of pages that the outdated FSM claims to
have free space, but don't actually have any, the first unlucky inserter
after the recovery would traverse through all those pages, just to find
out that they're full. We didn't have this problem with the old FSM
implementation, because we simply threw the FSM information away on a
non-clean shutdown.
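
The replay-time rule described above reduces to a small predicate. The sketch assumes an 8192-byte block size and uses hypothetical names; it models only the three conditions the commit message lists (no full-page image, not a HOT update, less than 20% free).

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_BLCKSZ 8192   /* assumed block size for this sketch */

/* Decide whether replay of an insert/update should refresh the FSM entry. */
bool
sketch_should_update_fsm(size_t free_bytes, bool is_hot_update,
                         bool has_full_page_image)
{
    /* Only non-full-page-image records are considered. */
    if (has_full_page_image)
        return false;

    /* HOT updates are skipped. */
    if (is_hot_update)
        return false;

    /* Update only when less than 20% of the page remains free. */
    return free_bytes < SKETCH_BLCKSZ / 5;
}
```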

e9816533

31 Oct, 2008 1 commit

Unite ReadBufferWithFork, ReadBufferWithStrategy, and ZeroOrReadBuffer · 19c8dc83

Heikki Linnakangas authored Oct 31, 2008

functions into one ReadBufferExtended function, that takes the strategy
and mode as arguments. There are three modes: RBM_NORMAL, which is the default
used by plain ReadBuffer(); RBM_ZERO, which replaces ZeroOrReadBuffer; and
a new mode, RBM_ZERO_ON_ERROR, which allows callers to read corrupt pages
without throwing an error. The FSM needs the new mode to recover from
corrupt pages, which could happen if we crash after extending an FSM file,
and the new page is "torn".

Add fork number to some error messages in bufmgr.c, that still lacked it.
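
The three read modes can be modeled with a toy reader. Everything here is a sketch under assumptions: the enum and function names are hypothetical, the page is a 16-byte array, validity is passed in as a flag, and a `false` return stands in for the error the real code would raise.

```c
#include <stdbool.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 16

typedef enum SketchReadBufferMode
{
    SKETCH_RBM_NORMAL,         /* read the page, fail if it is corrupt */
    SKETCH_RBM_ZERO,           /* do not read; hand back a zeroed page */
    SKETCH_RBM_ZERO_ON_ERROR   /* read, but substitute zeros if corrupt */
} SketchReadBufferMode;

/* Returns true on success, false where real code would report an error. */
bool
sketch_read_page(const char *disk_page, bool disk_page_is_valid,
                 SketchReadBufferMode mode, char *out)
{
    if (mode == SKETCH_RBM_ZERO)
    {
        memset(out, 0, SKETCH_PAGE_SIZE);
        return true;
    }

    if (!disk_page_is_valid)
    {
        if (mode == SKETCH_RBM_ZERO_ON_ERROR)
        {
            memset(out, 0, SKETCH_PAGE_SIZE);
            return true;
        }
        return false;
    }

    memcpy(out, disk_page, SKETCH_PAGE_SIZE);
    return true;
}
```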

19c8dc83

28 Oct, 2008 1 commit

No need for extra code to log freezing zero tuples. Callers already check that · c9d1efda
Alvaro Herrera authored Oct 27, 2008

they are freezing a nonzero amount anyway.

c9d1efda

08 Oct, 2008 1 commit

Modify the parser's error reporting to include a specific hint for the case · 34372863

Tom Lane authored Oct 08, 2008

of referencing a WITH item that's not yet in scope according to the SQL
spec's semantics.  This seems to be an easy error to make, and the bare
"relation doesn't exist" message doesn't lead one's mind in the correct
direction to fix it.

34372863

30 Sep, 2008 1 commit

Rewrite the FSM. Instead of relying on a fixed-size shared memory segment, the · 15c121b3

Heikki Linnakangas authored Sep 30, 2008

free space information is stored in a dedicated FSM relation fork, with each
relation (except for hash indexes; they don't use FSM).

This eliminates the max_fsm_relations and max_fsm_pages GUC options; remove any
trace of them from the backend, initdb, and documentation.

Rewrite contrib/pg_freespacemap to match the new FSM implementation. Also
introduce a new variant of the get_raw_page(regclass, int4, int4) function in
contrib/pageinspect that lets you return pages from any relation fork, and
a new fsm_page_contents() function to inspect the new FSM pages.

15c121b3

11 Sep, 2008 1 commit

Initialize the minimum frozen Xid in vac_update_datfrozenxid using · d53a5668

Alvaro Herrera authored Sep 11, 2008

GetOldestXmin() instead of RecentGlobalXmin; this is safer because we do not
depend on the latter being correctly set elsewhere, and while it is more
expensive, this code path is not performance-critical. This is a real
risk for autovacuum, because it can execute whole cycles without doing
a single vacuum, which would mean that RecentGlobalXmin would stay at its
initialization value, FirstNormalTransactionId, causing a bogus value to be
inserted in pg_database. This bug could explain some recent reports of
failure to truncate pg_clog.

At the same time, change the initialization of RecentGlobalXmin to
InvalidTransactionId, and ensure that it's set to something else whenever
it's going to be used. Using it as FirstNormalTransactionId in HOT page
pruning could result in data loss. InitPostgres takes care of setting it
to a valid value, but the extra checks are there to prevent "special"
backends from behaving in unusual ways.

Per Tom Lane's detailed problem dissection in 29544.1221061979@sss.pgh.pa.us

d53a5668

11 Aug, 2008 1 commit

Introduce the concept of relation forks. An smgr relation can now consist · 3f0e808c

Heikki Linnakangas authored Aug 11, 2008

of multiple forks, and each fork can be created and grown separately.

The bulk of this patch is about changing the smgr API to include an extra
ForkNumber argument in every smgr function. Also, smgrscheduleunlink and
smgrdounlink no longer implicitly call smgrclose, because other forks might
still exist after unlinking one. The callers of those functions have been
modified to call smgrclose instead.

This patch in itself doesn't have any user-visible effect, but provides the
infrastructure needed for upcoming patches. The additional forks envisioned
are a rewritten FSM implementation that doesn't rely on a fixed-size shared
memory block, and a visibility map to allow skipping portions of a table in
VACUUM that have no dead tuples.
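
To make the fork idea concrete, the sketch below passes an explicit fork number to a storage-level helper that picks a per-fork file name. The enum, the helper, and the `_fsm`/`_vm` suffix scheme are assumptions for illustration and are not taken from this commit message.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical fork numbers; the main fork holds the relation's data. */
typedef enum SketchForkNumber
{
    SKETCH_MAIN_FORK = 0,
    SKETCH_FSM_FORK,
    SKETCH_VM_FORK
} SketchForkNumber;

/*
 * Build a file name for one fork of a relation. Every storage call takes
 * the fork as an extra argument, so forks can be created and grown
 * independently. The suffix scheme is an assumption of this sketch.
 * Returns the length of the generated name, as snprintf does.
 */
int
sketch_fork_path(char *buf, size_t buflen, unsigned relfilenode,
                 SketchForkNumber fork)
{
    static const char *const suffix[] = { "", "_fsm", "_vm" };

    return snprintf(buf, buflen, "%u%s", relfilenode, suffix[fork]);
}
```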

3f0e808c

14 Jul, 2008 1 commit

Clean up the use of some page-header-access macros: principally, use · 9d035f42

Tom Lane authored Jul 13, 2008

SizeOfPageHeaderData instead of sizeof(PageHeaderData) in places where that
makes the code clearer, and avoid casting between Page and PageHeader where
possible. Zdenek Kotala, with some additional cleanup by Heikki Linnakangas.

I did not apply the parts of the proposed patch that would have resulted in
slightly changing the on-disk format of hash indexes; it seems to me that's
not a win as long as there's any chance of having in-place upgrade for 8.4.

9d035f42