    Implement genuine serializable isolation level. · dafaa3ef
    Heikki Linnakangas committed
    Until now, our Serializable mode has in fact been what's called Snapshot
    Isolation, which allows some anomalies that could not occur in any
    serialized ordering of the transactions. This patch fixes that using a
    method called Serializable Snapshot Isolation, based on research papers by
    Michael J. Cahill (see README-SSI for full references). In Serializable
    Snapshot Isolation, transactions run like they do in Snapshot Isolation,
    but a predicate lock manager observes the reads and writes performed and
    aborts transactions if it detects that an anomaly might occur. This method
    produces some false positives, i.e. it sometimes aborts transactions even
    though there is no anomaly.
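
    As a rough illustration of the detection idea described above, the toy
    model below flags read-write conflicts between concurrent transactions and
    reports any transaction that has both an incoming and an outgoing conflict.
    This is a simplified sketch, not the code in predicate.c: the names Txn,
    mark_rw_conflicts and pivots are hypothetical, every transaction passed in
    is assumed to be concurrent with all the others, and the rule is only a
    coarse stand-in for the checks described in README-SSI.

```python
from dataclasses import dataclass, field


@dataclass
class Txn:
    """Toy transaction: records which keys it read and wrote."""
    name: str
    reads: set = field(default_factory=set)
    writes: set = field(default_factory=set)
    in_conflict: bool = False   # a concurrent txn read a key that we wrote
    out_conflict: bool = False  # we read a key that a concurrent txn wrote


def mark_rw_conflicts(txns):
    """Flag read-write conflicts between every pair of (assumed concurrent) txns."""
    for reader in txns:
        for writer in txns:
            if reader is not writer and reader.reads & writer.writes:
                reader.out_conflict = True
                writer.in_conflict = True


def pivots(txns):
    """Names of txns that have both an incoming and an outgoing conflict."""
    return [t.name for t in txns if t.in_conflict and t.out_conflict]


# Write skew: each transaction reads the row that the other one updates.
t1 = Txn("T1", reads={"a"}, writes={"b"})
t2 = Txn("T2", reads={"b"}, writes={"a"})
mark_rw_conflicts([t1, t2])
write_skew_pivots = pivots([t1, t2])   # ["T1", "T2"]

# False positive: T5 -> T6 -> T7 is a chain, not a cycle, so running the
# three in the order T5, T6, T7 is a valid serial schedule, yet T6 is flagged.
t5 = Txn("T5", reads={"p"})
t6 = Txn("T6", reads={"q"}, writes={"p"})
t7 = Txn("T7", writes={"q"})
mark_rw_conflicts([t5, t6, t7])
chain_pivots = pivots([t5, t6, t7])    # ["T6"]
```

    In the first case either transaction could be aborted to break the
    anomaly. The second case shows why such a rule is conservative: it reacts
    to the shape of the conflicts without proving that a cycle exists.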
    
    To track reads, we implement predicate locking; see storage/lmgr/predicate.c.
    Whenever a tuple is read, a predicate lock is acquired on the tuple. Shared
    memory is finite, so when a transaction takes many tuple-level locks on a
    page, the locks are promoted to a single page-level lock, and further to a
    single relation-level lock if necessary. To lock key values with no matching
    tuple, a sequential scan always takes a relation-level lock, and an index
    scan acquires a page-level lock that covers the search key, whether or not
    there are any matching keys at the moment.
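
    The promotion from tuple-level to page-level to relation-level locks can
    be sketched as follows. This is a toy model under stated assumptions, not
    the data structures in predicate.c: the PredicateLocks class and both
    thresholds are hypothetical and were picked only to keep the example small.

```python
from collections import defaultdict

# Hypothetical thresholds, chosen only for illustration.
MAX_TUPLE_LOCKS_PER_PAGE = 3
MAX_PAGE_LOCKS_PER_RELATION = 2


class PredicateLocks:
    """Toy per-transaction lock set with tuple -> page -> relation promotion."""

    def __init__(self):
        self.relations = set()          # relation-level locks
        self.pages = defaultdict(set)   # relation -> locked page numbers
        self.tuples = defaultdict(set)  # (relation, page) -> locked offsets

    def lock_tuple(self, rel, page, offset):
        # A coarser lock already covers this tuple: nothing to record.
        if rel in self.relations or page in self.pages[rel]:
            return
        self.tuples[(rel, page)].add(offset)
        if len(self.tuples[(rel, page)]) > MAX_TUPLE_LOCKS_PER_PAGE:
            self.lock_page(rel, page)

    def lock_page(self, rel, page):
        if rel in self.relations:
            return
        # The page lock replaces all tuple locks on that page.
        self.tuples.pop((rel, page), None)
        self.pages[rel].add(page)
        if len(self.pages[rel]) > MAX_PAGE_LOCKS_PER_RELATION:
            self.lock_relation(rel)

    def lock_relation(self, rel):
        # The relation lock replaces every finer lock on the relation.
        self.pages.pop(rel, None)
        for key in [k for k in self.tuples if k[0] == rel]:
            del self.tuples[key]
        self.relations.add(rel)

    def covers(self, rel, page, offset):
        """True if some held lock covers the given tuple."""
        return (rel in self.relations
                or page in self.pages.get(rel, ())
                or offset in self.tuples.get((rel, page), ()))


locks = PredicateLocks()
for offset in range(1, 5):       # four tuple locks on page 0 exceed the limit
    locks.lock_tuple("t", 0, offset)
page_locked = locks.covers("t", 0, 99)   # True: page 0 is now locked as a whole
```

    Promotion trades precision for memory: after it, the lock also covers
    tuples the transaction never read, which can only add false positives,
    never hide a real conflict.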
    
    A predicate lock doesn't conflict with any regular locks or with other
    predicate locks in the normal sense. Predicate locks are only used by the
    predicate lock manager to detect the danger of anomalies. Only serializable
    transactions participate in predicate locking, so there should be no extra
    overhead for other transactions.
    
    A transaction's predicate locks can't be released at commit, but must be
    remembered until all the transactions that overlapped with it have
    completed. That means that we need to remember an unbounded number of
    predicate locks, so we apply a
    lossy but conservative method of tracking locks for committed transactions.
    If we run short of shared memory, we overflow to a new "pg_serial" SLRU
    pool.
    
    We don't currently allow Serializable transactions in Hot Standby mode.
    That would be hard, because even read-only transactions can cause anomalies
    that wouldn't otherwise occur.
    
    Serializable isolation mode now means the new fully serializable level.
    Repeatable Read gives you the old Snapshot Isolation level that we have
    always had.
    
    Kevin Grittner and Dan Ports, reviewed by Jeff Davis, Heikki Linnakangas and
    Anssi Kääriäinen
heapam.c 144.9 KB