Index Bloating
David Clement
September 2004
Oracle versions 7-8
Oracle's default form of index is the b*tree index segment. This is a
sorted list of row identifiers that can be accessed by a balanced tree.
This tip won't go into binary sort algorithms and index balancing, but
as a quick reminder, the basic idea is that, if you're looking up the
row with the indexed last name 'SMITH', you check to see if 'SMITH' is
in the first half (A-M) or the last (N-Z), then in the N-Z block to see
if 'SMITH' is in the first half of that (N-S) or the last (T-Z), then
in the N-S block to see if it's in N-O-P or Q-R-S, and then you can
rapidly match it under S for an index unique access. Two notable
advantages of this kind of indexing are that it works with any sequence
of values, as long as they're sorted, and that it is more efficient the
longer the sequence.
The blocks in an Oracle index segment form a doubly-linked list as well
as a tree-sorted list. This is to optimize index range scanning. If a
query needs to retrieve rows for June through July from a large table,
it can look up June and then scan blocks by means of the list pointers
until it reaches August. This works because the indexed values are
sorted.
A logical implication of this indexing scheme is that if an indexed row
is deleted, the free space left in the index segment cannot be
reclaimed immediately. The indexed rowids have to be in sorted order,
so there is no way that the next indexed item can be written into the
first free space. Oracle also needs to allow for rollbacks; it cannot
be sure, until a commit, that the free space will not be
needed again by the same item that was just deleted. Oracle therefore
responds to the deletion of an indexed row by merely leaving the space
for that item empty in the index segment. In fact it is flagged as
empty, not actually cleared out. If a new row is inserted that should
use that space, then the space will be recycled, but not otherwise.
This means that the indexes of volatile tables are in danger of
bloating, as I like to call it. Index bloating occurs when more and
more space in the index segment
is unused because indexed rows are deleted but never replaced. In
extreme cases the majority of the blocks in the index segment are
empty. I have seen a case in which one index wasted 800 megabytes of
disk. This imposes a high penalty on an index range scan; naturally,
index unique access is less affected.
This situation can be fairly easily rectified by rebuilding indexes.
Oracle allows index rebuilding while users are connected and working.
This is the syntax:
alter index indexname rebuild;
Rebuilding an index is transparent to the end users, but it involves
sorting all the data and reallocating the storage in the index segment,
so it is not the least expensive process.
An index of a static or slowly changing table is in no danger of
bloating. So it is not a good idea to write a script that rebuilds all
the indexes in a tablespace or a database. Most of the CPU cycles
consumed by such a script are wasted.
Instead, indexes should be periodically checked for usage and only
those that use far more space than they need should be rebuilt. The
Oracle data dictionary includes a temporary table called
index_stats that holds the most recent result of an
analyze index indexname validate structure command. This
table includes three columns, btree_space,
used_space, and pct_used, that will tell you
how efficiently the index uses space.
Some senior DBAs advocate rebuilding any index that uses less than 75
percent of its allocated space. This seems a reasonable rule of thumb.
|