This document explains how the optimizer changes expressions.
The way an expression is written really does matter. When written well expressions not only clearly state what is intended but they can also improve the performance of the SQL engine significantly. This document provides several tips designed to help you improve your SQL expressions.
Intended audience:
This document is intended to be used by Symbian OS licensees and third party application developers.
In an earlier section we described how the optimizer would only make use of an index if one of the columns being indexed occurred on one side of certain equality and comparison operators (“= ”, “IN ”, “< ”, “> ”, “<= ”, “>= ”, and sometimes “IS NULL ”).
While this is technically true, prior to the stage of the analysis where the optimizer is looking for these kinds of expressions, it may have first modified the
The next few paragraphs will describe some of these rewriting rules.
The query optimizer always rewrites the
What the query optimizer ultimately sees is this:
In this revised form, the optimizer might be able to use an index on the “x” column to speed the operation of the query.
Another rewriting rule is that a disjunction of two or more equality tests against the same column is changed into a single IN operator. So if you write:
The WHERE clause will be rewritten into the following form:
The original format was not a candidate for use of indexes. But after the disjunction is converted into a single IN operator the usual index processing logic applies and the query can be made much faster.
In order for this rewriting rule to be applied, however, all terms of the disjunction must be equality comparisons against the same column. It will not work to have a disjunction involving two or more columns or involving expressions. So, for instance, the following statements will not be optimized:
The query parser and compiler in SQLite are designed to be small, fast, and lean. A consequence of this design is that SQLite does not do much in the way of constant folding or common subexpression elimination. SQLite evaluates SQL expressions mostly as written.
One way to work around the lack of constant folding in SQLite is to enclose constant subexpressions within a subquery. SQLite does optimize constant subqueries – it evaluates them once, remembers the result, and then reuses that result repeatedly.
An example will help clarify how this works. Suppose you have a table that contains a timestamp recorded as the fractional julian day number:
A query against this table to find all entries after November 8, 2006 might look like the following:
This query works fine. The problem is that the “julianday('2006-11-08')” function gets called repeatedly, once for each row tested, even though the function returns the same value each time. It is much more efficient to call the function one time only and reuse the result over and over. You can accomplish this by moving the function call inside a subquery as follows:
There are, of course, some cases where multiple evaluations of a function in the WHERE clause is desirable. For example, suppose you want to return roughly one out of every eight records, chosen at random. A suitable query would be:
In this case, moving the function evaluation into a subquery would not work as desired:
In this last example, the result would be either all of the records in the table (probability 12.5%) or none of them (probability 87.5%). The difference here, of course, is that the
The SQL compiler does not have any way of knowing this so it always assumes the worst: that every function works like
The way an expression is written really +does matter. When written well expressions not only clearly state what is +intended but they can also improve the performance of the SQL engine significantly. +This document provides several tips designed to help you improve your SQL +expressions.
Intended +audience:
This document is intended to be used by Symbian platform +licensees and third party application developers.
In an earlier section we described how +the optimizer would only make use of an index if one of the columns being +indexed occurred on one side of certain equality and comparison operators +(“= ”, “IN ”, “< ”, “> ”, “<= ”, +“>= ”, and sometimes “IS NULL ”).
While this is technically
+true, prior to the stage of the analysis where the optimizer is looking for
+these kinds of expressions, it may have first modified the
The next few paragraphs will +describe some of these rewriting rules.
The query optimizer always
+rewrites the
What the query optimizer ultimately sees is this:
In this revised form, the optimizer might be able to use an +index on the “x” column to speed the operation of the query.
Another +rewriting rule is that a disjunction of two or more equality tests against +the same column is changed into a single IN operator. So if you write:
The WHERE clause will be rewritten into the following form:
The original format was not a candidate for use of indexes. +But after the disjunction is converted into a single IN operator the usual +index processing logic applies and the query can be made much faster.
In +order for this rewriting rule to be applied, however, all terms of the disjunction +must be equality comparisons against the same column. It will not work to +have a disjunction involving two or more columns or involving expressions. +So, for instance, the following statements will not be optimized:
The query parser and compiler +in SQLite are designed to be small, fast, and lean. A consequence of this +design is that SQLite does not do much in the way of constant folding or common +subexpression elimination. SQLite evaluates SQL expressions mostly as written.
One +way to work around the lack of constant folding in SQLite is to enclose constant +subexpressions within a subquery. SQLite does optimize constant subqueries +– it evaluates them once, remembers the result, and then reuses that result +repeatedly.
An example will help clarify how this works. Suppose you +have a table that contains a timestamp recorded as the fractional julian day +number:
A query against this table to find all entries after November +8, 2006 might look like the following:
This query works fine. The problem is that the “julianday('2006-11-08')” +function gets called repeatedly, once for each row tested, even though the +function returns the same value each time. It is much more efficient to call +the function one time only and reuse the result over and over. You can accomplish +this by moving the function call inside a subquery as follows:
There are, of course, some cases where multiple evaluations +of a function in the WHERE clause is desirable. For example, suppose you want +to return roughly one out of every eight records, chosen at random. A suitable +query would be:
In this case, moving the function evaluation into a subquery +would not work as desired:
In this last example, the result would be either all of the
+records in the table (probability 12.5%) or none of them (probability 87.5%).
+The difference here, of course, is that the
The
+SQL compiler does not have any way of knowing this so it always assumes the
+worst: that every function works like