|
|
Explaining the Execution Plan
David Clement
September 2004
Oracle versions 7-8
For any SQL statement, the choice of the execution plan is the most
important single factor in the speed of the processing. Since the
optimizer chooses among available execution plans, rather than among
all logically possible execution plans, and since the optimizer
considers many variables in assigning costs to different execution
plans, it is not easy to guess what plan the optimizer will choose.
Fortunately Oracle provides tools to display the optimizer's choice.
The easiest of these to use is autotrace. Using SQL*Plus, enter
set autotrace traceonly explain and then enter the SQL
statement. Because of the traceonly option, the SQL
statement is not executed, only evaluated. If you want the SQL
statement to be executed at the same time, then use set
autotrace on explain instead. Because of the
explain option, the execution plan choice is displayed.
Autotrace can also display statistics, as another tip in this series
explains.
Autotrace is a front-end to the underlying explain plan
functionality. This is a programming support utility that invokes
the optimizer on a SQL statement and saves a detailed report of the
execution plan in an Oracle table for later reporting. For most
platforms, Oracle supplies a SQL script called utlxplan.sql to build
the table, and another one called utlxpls.sql to report execution
plans from the table, for those who would rather access the utility
that way. It is also possible to build the plan table by hand and to
execute the explain plan command by entering it at the
SQL*Plus prompt, if you have a good understanding of the SQL
connect by clause.
However the functionality is accessed, the output provides a list of
row-source operations, their options, the objects they work on, their
costs, and their cardinality. For example:
Execution Plan
-----------------------------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=338)
1 0 NESTED LOOPS (Cost=4 Card=1 Bytes=338)
2 1 NESTED LOOPS (Cost=3 Card=1 Byte=236)
3 2 TABLE ACCESS (FULL) OF 'CRDT_RQT_ACCTS' (Cost=2 Card=1 Bytes=31)
4 2 TABLE ACCESS (BY ROWID) OF 'CRDT_RQT'
5 4 INDEX (UNIQUE) OF 'CRDT_RQT_PKEY' (UNIQUE)
6 1 TABLE ACCESS (BY ROWID) OF 'CRDT_RTNGS' (Cost=5 Card=1 Bytes=102)
7 6 INDEX (UNIQUE SCAN) OF 'CRDT_RTNGS_PKEY' (UNIQUE)
Each line describes a row-source operation, that is, a processing
step that will return rows to the step that calls it, eventually to
the user process.
The numbers to the left and their indentation provide an execution
tree. For example, one line reads
3 2 TABLE ACCESS (FULL) OF 'CRDT_RQT_ACCTS' (Cost=2 Card=1 Bytes=31)
In this line, 3 is the operation identifier, and 2 is the operation
that called this one and will process its output. In this line, the
operation is a TABLE ACCESS and its option is
(FULL); together these mean that the operation is a
full table scan. The object is 'CRDT_RQT_ACCTS' and in
this database its cost is low, at a rank of 2, because its
cardinality (the number of rows that the optimizer predicts will be
returned) is low, at 1. The low cardinality and resulting low cost
for a full table scan are partly why the optimizer chose that
operation.
Not all the operations reported by explain plan are
fully documented, but the following can be explained.
-
Four of them,
delete statement, insert
statement, select statement, and update
statement, give the statement type and generally appear as
the first line of output.
-
and-equal is an index operation that returns row ID
values common to sets of row IDs; it appears as a step in some
sort operations.
-
bitmap is a conversion between row ID values and bit
values.
-
connect by is a hierarchical ordering of rows.
-
concatenation is a union of all rows from two or
more sets into a single set.
-
count is a count of rows.
-
filter is an operation that removes rows from a set.
-
first row returns the first row only.
-
for update is a row-locking operation.
-
index is the operation that retrieves row ID values
from an index.
-
intersection returns the rows common to two sets of
rows, including duplicates.
-
merge join performs a table join, after a sort.
-
minus removes rows from one set of records when they
appear in another set.
-
nested loops iterates through a result set and
returns matching rows from another result set.
-
remote is the operation to retrieve data from a
remote database.
-
sequence returns a value from a sequence generator.
-
sort is a sort operation.
-
table access is any operation that returns rows from
a table.
-
union returns two sets of rows, eliminating
duplicates.
-
view is the operation to execute a query behind a
view.
Under certain circumstances, each of these is the best possible
row-source operation. However, some of them are worth watching out
for. A sort is always CPU-intensive and sometimes
disk-intensive. A table access, particularly table
access (full), may be much slower than the equivalent index
access. Remote implies network traffic that may be
time-consuming. A great deal of work might be hidden behind
view.
The cost and cardinality of each operation are two values to look at
with care. Even if the row-source operation is a table access
(full), if it returns 5,000 rows for a cost of 2, it is
efficient. The other thing to watch out for is the execution tree;
if a view feeds rows to a remote that feeds
rows to a sort, plan for a long lunch.
Third-party tools often supply easy keyboard shortcuts for displaying
the execution plan. For instance, in TOAD, put the cursor on the SQL
statement and press Control-E.
|
|