|
|
When is a Full Table Scan a Good Thing?
David Clement
September 2004
Oracle versions 7-8
No row-source operation is good or bad in itself. Each is the best
choice in some contexts. A full-table scan (FTS) is faster than index
access in the following situations. At the same time, some of these
situations can be avoided.
If reading right through the table would be less effort than
retrieving rows by probing an index, then FTS is actually the better
choice. This is usually the case when the table is small, that is, 2
percent or less of the size of the database block buffer. The size is in
v$sga if you need to check it. In such a case, Oracle can probably read
the table in one multiblock scan.
The advantage or disadvantage of FTS is less evident if the table is
of small to medium size. Many DBAs use the rule of thumb that if a
SQL statement affects 20 percent of the table data or more, it would
take as long or longer to pick up row IDs from index as it would
to scan the table. This can be misleading inasmuch as the real
question is not one of row access but of block access. A hypothetical
table with 200 short rows stored on average in each of 5 4K blocks might
take exactly as long to return one row as it would to return 100, in
the case where all 100 needed happened to be in one block (not an
unrealistic possibility if you are querying a historical table by
timestamp). Thus, data distribution can have a crucial effect.
Another widely cited rule of thumb is that in the unusual case where
the table data is perfectly even distributed (for example, as the
result of an insensitive load-balancing algorithm) to read 5 percent
of the table data is enough to make FTS just as fast as index access
to the same rows.
In other cases, FTS is enforced by circumstances that could be
avoided. If any operation is performed on a column in the
where clause, the optimizer will not choose index access
(in the absence of function-based indexes). This includes functions,
arithmetical operations, and concatenations. In the sample code
snippet below, each condition in the where clause
prevents index access.
where upper (last_name) = 'SMITH'
or salary * 1.25 > 60000
or last_name || null = 'SMITH'
An operation on a constant or a bind variable does not have the same
effect, so moving the operation to the other side of the equal sign
can restore index access.
In the case of a composite index, if the leading edge of the index is
omitted from the where clause, the optimizer cannot use
it. So for an index on columns X, Y, Z, if the where
clause refers to Y and Z without referring to X, the optimizer will
choose FTS over index access. For such indexes, it may also be wiser
to put the column names in order, from leftmost toward the right,
as follows.
where X = this and Y = that and Z = theother
Apparently, if the column names are not in the same order as the
index definition, the optimizer has a reduced chance of noticing the
index. This effect is undocumented but has been mentioned by enough
senior DBAs and developers in forums that it may be worth taking
account of.
|
|