in and exists
David Clement
September 2004
Oracle versions 7-8
The SQL operators in and exists are often
confused, but they are processed quite differently in Oracle. The
purpose of exists is to determine if an element occurs
in a set; the purpose of in is to examine a set related
to the containing query. A good rule of thumb is that
exists is almost always the operator to use simply to
check if a row exists in a table.
A example from Oracle's "Ask Tom" column gives a logically equivalent
process for each of two queries that return the same results, one of
which uses in while the other uses exists.
First,
select * from t1 where x in (select y from t2);
is equivalent to
select * from t1, (select distinct y from t2) t2
where t1.x = t2.y;
In this case, the inoperator returns a processed set
that can be inner-joined to the containing query. This is fast if t2
is smaller than t1, if the sort (unique) operation on t2
implied by the distinct keyword is not time-consuming,
and if t1.x has an index.
Second,
select * from t1 where exists
(select null from t2 where y = t1.x);
is equivalent to
for x in (select * from t1) loop
if (select null from t2 where y = x.x) then
print x;
else
continue;
end if;
end loop;
This is pseudocode because there is no other PL/SQL construct that
does what exists does. In this case, for each row in the
containing query, a test is made for matching rows. There is no
sort (unique) operation. This is fast if t1 is smaller
than t2 and if t2.y has an index, and this statement has far less
overhead.
An interesting point is that select null in the above is
the same as select 1 or select 'A' or any
other constant, although there have been flame wars about this in
Oracle forums.
The in operator is overloaded. There is a logically
distinct list iteration operator of the same name that processes a
set of literals or host variables. An example would be
where x in (4, 9, 72, 3)
The execution plan for this statement is guaranteed to be the same as
for
where ( (x = 4) or (x = 9) or (x = 72) or (x = 3) )
The in operator on a list is easier to type than the
nested or conditions, and is otherwise the same, so you
should use it in that context.
|