The Oracle Read-Consistency Model
David Clement
September 2004
Oracle versions 7-8
When it is time to run the monthly balance report, either the whole
system can be taken off-line and the checking accounts made
unavailable, or the monthly balance report can contain inconsistent
figures reflecting checking account activity during the processing of
the report, or -- if neither of these is acceptable -- the report has
to be protected in some way from the checking account activity. In
the RDBMS world, this protection involves the notion of a
transaction.
The Object Management Group, an industry consortium including Apple,
Oracle, and Sun, has published an influential definition of a
transaction as a database system action that passes the ACID test,
where ACID stands for Atomic, Consistent, Isolated, and Durable. A
transaction is atomic in that either all of it happens or none of it
happens. It is consistent in that it acts on the same data throughout
its lifespan, that is, if a customer deposits or withdraws money at
any time after the monthly balance report has started running, that
customer's action is not reflected in the totals for that report. A
transaction is isolated in that concurrent transactions do not affect
it, and it does not affect them. It is durable in that after a disk
failure or instance crash, it can be recovered.
Each multi-user RDBMS has its own design for meeting some or all of
these requirements. In the case of Oracle, a major part of the design
for ACID compliance is the read-consistency model and the default
isolation level of read-committed. This is a guarantee that any
single SQL statement sees consistent data all the way through. It
differs from some other RDBMS models in that the lowest level of
isolation is already a "no dirty read" level, that is, no SQL
statement ever sees uncommitted data changes. The database causes
statements to fail rather than read uncommitted data.
The isolation level can be raised from read-committed to
serializable, in which case not only single statements but
arbitrarily long, user-defined sequences of statements are treated as
single transactions and see only data consistent with the database
state when the transaction began executing. However, since this
requires a great deal of overhead, it is not the default level.
The infrastructure of the read-consistency model is a set of data
structures called rollback segments. The rollback segments keep
snapshots of data before changes were made to the data, so that a SQL
statement can refer to these snapshots as necessary. This makes it
possible to roll a transaction back, and it enables read-consistency
at the same time.
As the first statement of a transaction begins to execute, a unique
number, the System Change Number or SCN, is allocated. When data to
satisfy a query is found in a data block in the database buffer
cache, the SCN for the transaction is checked against the block
version number (BVN) of the data block. The BVN consists of an SCN
and a sequence number that represents how many changes have been
applied to the block at that SCN. If the block has not been changed
or has only been changed by the current transaction, then the current
transaction can use the data in the block. This is called a "current
read." However, if some other transaction has changed the data in the
block, then the BVN will reveal that.
The reason for this is that, as an Oracle server process executes any
DML statement (DML is "data-manipulation language," primarily
"insert," "update," "delete," or "merge") it requests a new snapshot
slot in the rollback segments for each data block that it is about to
modify. Once it has saved a snapshot of the data before any change,
the server process acquires a latch on the block, updates the BVN,
modifies the block, and then releases the latch. So all changes to
data blocks are captured in the BVNs.
When the comparison of the SCN to the block version number shows
another transaction has altered this block since the current
transaction began, the transaction has to find and read the
appropriate snapshot in the rollback segments to get consistent data.
This operation is called a "consistent read."
Every commit statement ends a transaction, so that the
next SQL statement submitted in the same session is always a new
transaction with a new SCN. The relationship of the SCNs and BVNs to
the snapshots in the rollback segments is the heart of the
read-consistency model.
What happens if the BVN differs from the transaction SCN, but the
necessary snapshot cannot be found in the rollback segment? In that
case, Oracle raises the ORA-01555 error.
The rollback segments make up a circular buffer, where each data
block contains a pointer to the next one, and the last one points to
the first. As Oracle requires more snapshot space in the rollback
segments, the pointer to the area currently being written into keeps
increasing. This pointer is called the "head." The pointer to the
beginning of the oldest active transaction record is called the
"tail." If the head is about to touch the tail, Oracle allocates
another block in memory and adjusts the block pointers to include
this new block in the circular buffer. In order to keep from taking
over the whole SGA for rollback segments, Oracle advances the tail
whenever a transaction record is marked committed, so that there is
that much less danger of the head catching up to the tail.
In this way, while a long-running transaction is executing, the tail
may pull away from a snapshot that the long-running transactin
requires for read consistency, and the head may then overwrite the
required snapshot. When the transaction cannot find the data that it
needs, ORA-01555 is raised.
Since DML statements increase the BVN and push the rollback segment
head further, redundant and irrelevant DML should be avoided with
care. Updating one column in a row with a value that it already has
may not seem harmful, but if the block would not have been touched
otherwise, this will waste rollback segment space as well as
processing cycles and will increase the risk of certain errors.
Some other mitigations of ORA-01555, such as rollback segment sizing
and dedication, are in the hands of the database administrator or the
architect designing the database infrastructure. In any case, these
tricks cannot be more than mitigations; the problem is with
read consistency, as described above.
|