Symbian3/SDK/Source/GUID-C2FAEBB2-4A1A-5BB0-9670-4801525CBC6A.dita
changeset 13 48780e181b38
parent 7 51a74ef9ed63
--- a/Symbian3/SDK/Source/GUID-C2FAEBB2-4A1A-5BB0-9670-4801525CBC6A.dita	Fri Jul 16 17:23:46 2010 +0100
+++ b/Symbian3/SDK/Source/GUID-C2FAEBB2-4A1A-5BB0-9670-4801525CBC6A.dita	Tue Jul 20 12:00:49 2010 +0100
@@ -1,461 +1,477 @@
-<?xml version="1.0" encoding="utf-8"?>
-<!-- Copyright (c) 2007-2010 Nokia Corporation and/or its subsidiary(-ies) All rights reserved. -->
-<!-- This component and the accompanying materials are made available under the terms of the License 
-"Eclipse Public License v1.0" which accompanies this distribution, 
-and is available at the URL "http://www.eclipse.org/legal/epl-v10.html". -->
-<!-- Initial Contributors:
-    Nokia Corporation - initial contribution.
-Contributors: 
--->
-<!DOCTYPE concept
-  PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
-<concept id="GUID-C2FAEBB2-4A1A-5BB0-9670-4801525CBC6A" xml:lang="en"><title>SQL Index
-Tips</title><shortdesc>This document includes several tips for using SQL indexes.</shortdesc><prolog><metadata><keywords/></metadata></prolog><conbody>
-<section id="GUID-3895F9D0-DE9C-4375-B541-AC99CABB7B8A"><title>Introduction</title> <p>You can use indexes to speed up access.
-You create indexes automatically using PRIMARY KEY and UNIQUE. </p> <p><b>Intended
-audience:</b> </p> <p>This document is intended to be used by Symbian platform
-licensees and third party application developers. </p> </section>
-<section id="GUID-765F0DF1-ACB0-57DB-B9A8-3697E4637065"><title>Use an Index
-to Speed up Access</title> <p>Suppose you have a table like this: </p> <codeblock id="GUID-F70B25AB-A151-52CE-A413-1C62A2464D6A" xml:space="preserve">
-CREATE TABLE demo5(
-    id INTEGER,
-    content BLOB
-);
-</codeblock> <p>Further suppose that this table contains thousands or millions
-of rows and you want to access a single row with a particular ID: </p> <codeblock id="GUID-B02FA452-4093-5383-BAFA-AE035919D720" xml:space="preserve">
-SELECT content FROM demo5 WHERE id=?
-</codeblock> <p>The only want that SQLite can perform this query, and be certain
-to get every row with the chosen ID, is to examine every single row, check
-the ID of that row, and return the content if the ID matches. Examining every
-single row this way is called a <i>full table scan</i>. </p> <p>Reading and
-checking every row of a large table can be very slow, so you want to avoid
-full table scans. The usual way to do this is to create an index on the column
-you are searching against. In the example above, an appropriate index would
-be this: </p> <codeblock id="GUID-82E337F1-2CA2-51B0-A7BC-071A83779A18" xml:space="preserve">
-CREATE INDEX demo5_idx1 ON demo5(id);
-</codeblock> <p>With an index on the ID column, SQLite is able to use a binary
-search to locate entries that contain a particular value of ID. So if the
-table contains a million rows, the query can be satisfied with about 20 accesses
-rather than 1000000 accesses. This is a huge performance improvement. </p> <p>One
-of the features of the SQL language is that you do not have to figure out
-what indexes you may need in advance of coding your application. It is perfectly
-acceptable, even preferable, to write the code for your application using
-a database without any indexes. Then once the application is running and you
-can make speed measurements, add whatever indexes are needed in order to make
-it run faster. </p> <p>When you add indexes, the query optimizer within the
-SQL compiler is able to find new more efficient bytecode procedures for carrying
-out the operations that your SQL statements specify. In other words, by adding
-indexes late in the development cycle you have the power to completely reorganize
-your data access patterns without changing a single line of code. </p> </section>
-<section id="GUID-BB1F17C5-1174-5DF4-AA61-611173237F3F"><title>Create Indexes
-Automatically Using PRIMARY KEY and UNIQUE</title> <p>Any column of a table
-that is declared to be the PRIMARY KEY or that is declared UNIQUE will be
-indexed automatically. There is no need to create a separate index on that
-column using the CREATE INDEX statement. So, for example, this table declaration: </p> <codeblock id="GUID-E4BE6077-F639-5CE7-964A-276B0D58A129" xml:space="preserve">
-CREATE TABLE demo39a(
-    id INTEGER,
-    content BLOB
-);
-
-CREATE INDEX demo39_idx1 ON demo39a(id);
-</codeblock> <p>Is roughly equivalent to the following: </p> <codeblock id="GUID-DB3167E0-FA95-50CA-92C7-102B5C2C13E3" xml:space="preserve">
-CREATE TABLE demo39b(
-    id INTEGER UNIQUE,
-    content BLOB
-);
-</codeblock> <p>The two examples above are “roughly” equivalent, but not exactly
-equivalent. Both tables have an index on the ID column. In the first case,
-the index is created explicitly. In the second case, the index is implied
-by the UNIQUE keyword in the type declaration of the ID column. Both table
-designs use exactly the same amount of disk space, and both will run queries
-such as </p> <codeblock id="GUID-7ACAE270-6D20-557B-B7D1-C90EDD757E43" xml:space="preserve">
-SELECT content FROM demo39 WHERE id=?
-</codeblock> <p>using exactly the same bytecode. The only difference is that
-table demo39a lets you insert multiple rows with the same ID whereas table
-demo39b will raise an exception if you try to insert a new row with the same
-ID as an existing row. </p> <p>If you use the UNIQUE keyword in the CREATE
-INDEX statement of demo39a, like this: </p> <codeblock id="GUID-0EE5E186-CC4A-5CC3-AEAE-F1482F1F8F9A" xml:space="preserve">
-CREATE UNIQUE INDEX demo39_idx1 ON demo39a(id);
-</codeblock> <p>Then both table designs really would be exactly the same in
-every way. In fact, whenever SQLite sees the UNIQUE keyword on a column type
-declaration, all it does is create an automatic unique index on that column. </p> <p>The
-PRIMARY KEY modifier on a column type declaration works like UNIQUE; it causes
-a unique index to be created automatically. The main difference is that you
-are only allowed to have a single PRIMARY KEY. This restriction of only allowing
-a single PRIMARY KEY is part of the official SQL language definition. </p> <p>The
-idea is that a PRIMARY KEY is used to order the rows on disk. Some SQL database
-engines actually implement PRIMARY KEYs this way. But with SQLite, a PRIMARY
-KEY is like any other UNIQUE column, with only one exception: INTEGER PRIMARY
-KEY is a special case which is handled differently, as described in the next
-section. </p> </section>
-<section id="GUID-BF7A0301-8490-58ED-BB37-FAC403A84230"><title>Use Multi-Column
-Indexes</title> <p>SQLite is able to make use of multi-column indexes. The
-rule is that if an index is over columns <i>X</i>  <i> 0 </i>, <i>X</i>  <i> 1 </i>, <i>X</i>  <i> 2 </i>,
-..., <i>X</i>  <i> n </i> of some table, then the index can be used if the
-WHERE clause contains equality constraints for some prefix of those columns <i>X</i>  <i>0 </i>, <i>X</i>  <i>1 </i>, <i>X</i>  <i>2 </i>,
-..., <i>X</i>  <i>i </i> where <i>i</i> is less than <i>n</i>. </p> <p>As
-an example, suppose you have a table and index declared as follows: </p> <codeblock id="GUID-C18C97F7-23CA-5636-9F00-130A8FB3DEF5" xml:space="preserve">
-CREATE TABLE demo314(a,b,c,d,e,f,g);
-CREATE INDEX demo314_idx ON demo314(a,b,c,d,e,f);
-</codeblock> <p>Then the index might be used to help with a query that contained
-a WHERE clause like this: </p> <codeblock id="GUID-8A0944F4-1ACF-5267-B49F-EB83EFBB5670" xml:space="preserve">
-... WHERE a=1 AND b='Smith' AND c=1
-</codeblock> <p>All three terms of the WHERE clause would be used together
-with the index in order to narrow the search. But the index could not be used
-if there WHERE clause said: </p> <codeblock id="GUID-B5F1C17F-0F5E-5FC2-A9A4-DF19D699A076" xml:space="preserve">
-... WHERE b='Smith' AND c=1
-</codeblock> <p>The second WHERE clause does not contain equality terms for
-a prefix of the columns in the index because it omits a term for the “a” column. </p> <p>In
-a case like this: </p> <codeblock id="GUID-EF2CFE7D-0456-5414-847D-BADCC057CFD8" xml:space="preserve">
-... WHERE a=1 AND c=1
-</codeblock> <p>Only the “a=1” term in the WHERE clause could be used to help
-narrow the search. The “c=1” term is not part of the prefix of terms in the
-index which have equality constraints because there is no equality constraint
-on the “b” column. </p> <p>SQLite only allows a single index to be used per
-table within a simple SQL statement. For UPDATE and DELETE statements, this
-means that only a single index can ever be used, since those statements can
-only operate on a single table at a time. </p> <p>In a simple SELECT statement
-multiple indexes can be used if the SELECT statement is a join – one index
-per table in the join. In a compound SELECT statement (two or more SELECT
-statements connected by UNION or INTERSECT or EXCEPT) each SELECT statement
-is treated separately and can have its own indexes. Likewise, SELECT statements
-that appear in subexpressions are treated separately. </p> <p>Some other SQL
-database engines (for example PostgreSQL) allow multiple indexes to be used
-for each table in a SELECT. For example, if you had a table and index in PostgreSQL
-like this: </p> <codeblock id="GUID-F5DE8F24-7471-5992-9896-295CE173D855" xml:space="preserve">
-CREATE TABLE pg1(a INT, b INT, c INT, d INT);
-CREATE INDEX pg1_ix1 ON pg1(a);
-CREATE INDEX pg1_ix2 ON pg1(b);
-CREATE INDEX pg1_ix3 ON pg1(c);
-</codeblock> <p>And if you were to run a query like the following: </p> <codeblock id="GUID-A35663B7-4E7D-5CC0-BF5E-CF3A4CFED63F" xml:space="preserve">
-SELECT d FROM pg1 WHERE a=5 AND b=11 AND c=99;
-</codeblock> <p>Then PostgreSQL might attempt to optimize the query by using
-all three indexes, one for each term of the WHERE clause. </p> <p>SQLite does
-not work this way. SQLite is compelled to select a single index to use in
-the query. It might select any of the three indexes shown, depending on which
-one the optimizer things will give the best speedup. But in every case it
-will only select a single index and only a single term of the WHERE clause
-will be used. </p> <p>SQLite prefers to use a multi-column index such as this: </p> <codeblock id="GUID-40FB7075-1239-5089-BBC5-0D994F4A0C39" xml:space="preserve">
-CREATE INDEX pg1_ix_all ON pg1(a,b,c);
-</codeblock> <p>If the pg1_ix_all index is available for use when the SELECT
-statement above is prepared, SQLite will likely choose it over any of the
-single-column indexes because the multi-column index is able to make use of
-all 3 terms of the WHERE clause. </p> <p>You can trick SQLite into using multiple
-indexes on the same table by rewriting the query. Instead of the SELECT statement
-shown above, if you rewrite it as this: </p> <codeblock id="GUID-D7DE75D4-BB01-50DF-A9DC-956A83DED5D0" xml:space="preserve">
-SELECT d FROM pg1 WHERE RowID IN (
-    SELECT RowID FROM pg1 WHERE a=5
-    INTERSECT
-    SELECT RowID FROM pg1 WHERE b=11
-    INTERSECT
-    SELECT RowID FROM pg1 WHERE c=99
-)
-</codeblock> <p>Then each of the individual SELECT statements will using a
-different single-column index and their results will be combined by the outer
-SELECT statement to give the correct result. The other SQL database engines
-like PostgreSQL that are able to make use of multiple indexes per table do
-so by treating the simpler SELECT statement shown first as if they where the
-more complicated SELECT statement shown here. </p> </section>
-<section id="GUID-E90057A8-70B6-590C-B8AE-616DA25BB543"><title>Use Inequality
-Constraints on the Last Index Term</title> <p>Terms in the WHERE clause of
-a query or UPDATE or DELETE statement are mostly likely to trigger the use
-of an index if they are an equality constraint – in other words if the term
-consists of the name of an indexed column, an equal sign (“=”), and an expression. </p> <p>So,
-for example, if you have a table and index that look like this: </p> <codeblock id="GUID-84AADB9D-5853-57C2-B489-87DC7FB7AADE" xml:space="preserve">
-CREATE TABLE demo315(a,b,c,d);
-CREATE INDEX demo315_idx1 ON demo315(a,b,c);
-</codeblock> <p>And a query like this: </p> <codeblock id="GUID-A2B7DA9F-DB82-5D06-80E2-7AF714E403D5" xml:space="preserve">
-SELECT d FROM demo315 WHERE a=512;
-</codeblock> <p>The single “a=512” term of the WHERE clause qualifies as an
-equality constraint and is likely to provoke the use of the demo315_idx1 index. </p> <p>SQLite
-supports two other kinds of equality constraints. One is the IN operator: </p> <codeblock id="GUID-EA5D7637-A6B8-5BC0-A72E-D576B0F945A3" xml:space="preserve">
-SELECT d FROM demo315 WHERE a IN (512,1024);
-SELECT d FROM demo315 WHERE a IN (SELECT x FROM someothertable);
-</codeblock> <p>There other is the IS NULL constraint: </p> <codeblock id="GUID-B2C1C84B-C33D-55C5-8484-24B28EFC8E37" xml:space="preserve">
-SELECT d FROM demo315 WHERE a IS NULL;
-</codeblock> <p>SQLite allows at most one term of an index to be constrained
-by an inequality such as less than “&lt;”, greater than “&gt;”, less than or
-equal to “&lt;=”, or greater than or equal to “&gt;=”. </p> <p>The column that
-the inequality constrains will be the right-most term of the index that is
-used. So, for example, in this query: </p> <codeblock id="GUID-563231B5-EC3A-57C2-BC6F-1A8129ADE308" xml:space="preserve">
-SELECT d FROM demo315 WHERE a=5 AND b&gt;11 AND c=1;
-</codeblock> <p>Only the first two terms of the WHERE clause will be used
-with the demo315_idx1 index. The third term, the “c=1” constraint, cannot
-be used because the “c” column occurs to the right of the “b” column in the
-index and the “b” column is constrained by an inequality. </p> <p>SQLite allows
-up to two inequalities on the same column as long as the two inequalities
-provide an upper and lower bound on the column. For example, in this query: </p> <codeblock id="GUID-4EB94886-EDFF-58F2-8692-011A67AC5A60" xml:space="preserve">
-SELECT d FROM demo315 WHERE a=5 AND b&gt;11 AND b&lt;23;
-</codeblock> <p>All three terms of the WHERE clause will be used because the
-two inequalities on the “b” column provide an upper and lower bound on the
-value of “b”. </p> <p>SQLite will only use the four inequalities mentioned
-above to help constrain a search: “&lt;”, “&gt;”, “&lt;=”, and “&gt;=”. Other inequality
-operators such as not equal to (“!=” or “&lt;&gt;”) and NOT NULL are not helpful
-to the query optimizer and will never be used to control an index and help
-make the query run faster. </p> </section>
-<section id="GUID-CAD0C181-37E7-578A-A7E1-7843447C247F"><title>Use Indexes
-To Help ORDER BY Clauses Evaluate Faster</title> <p>The default method for
-evaluating an ORDER BY clause in a SELECT statement is to first evaluate the
-SELECT statement and store the results in a temporary tables, then sort the
-temporary table according to the ORDER BY clause and scan the sorted temporary
-table to generate the final output. </p> <p>This method always works, but
-it requires three passes over the data (one pass to generate the result set,
-a second pass to sort the result set, and a third pass to output the results)
-and it requires a temporary storage space sufficiently large to contain the
-entire results set. </p> <p>Where possible, SQLite will avoid storing and
-sorting the result set by using an index that causes the results to emerge
-from the query in sorted order in the first place. </p> <p>The way to get
-SQLite to use an index for sorting is to provide an index that covers the
-same columns specified in the ORDER BY clause. For example, if the table and
-index are like this: </p> <codeblock id="GUID-F0103033-C5C8-5177-8AD7-70BCC45C33C9" xml:space="preserve">
-CREATE TABLE demo316(a,b,c,data);
-CREATE INDEX idx316 ON demo316(a,b,c);
-</codeblock> <p>And you do a query like this: </p> <codeblock id="GUID-D67BB6FF-E213-5B86-A2C1-E1992DA96A62" xml:space="preserve">
-SELECT data FROM demo316 ORDER BY a,b,c;
-</codeblock> <p>SQLite will use the idx316 index to implement the ORDER BY
-clause, obviating the need for temporary storage space and a separate sorting
-pass. </p> <p>An index can be used to satisfy the search constraints of a
-WHERE clause and to impose the ORDER BY ordering of outputs all at once. The
-trick is for the ORDER BY clause terms to occur immediately after the WHERE
-clause terms in the index. For example, one can write: </p> <codeblock id="GUID-02063968-34B5-5766-9D02-86D696D39C1E" xml:space="preserve">
-SELECT data FROM demo316 WHERE a=5 ORDER BY b,c;
-</codeblock> <p>The “a” column is used in the WHERE clause and the immediately
-following terms of the index, “b” and “c” are used in the ORDER BY clause.
-So in this case the idx316 index would be used both to speed up the search
-and to satisfy the ORDER BY clause. </p> <p>This query also uses the idx316
-index because, once again, the ORDER BY clause term “c” immediate follows
-the WHERE clause terms “a” and “b” in the index: </p> <codeblock id="GUID-6760EC7E-E86A-5EBD-BDDD-32A68BE78A9E" xml:space="preserve">
-SELECT data FROM demo316 WHERE a=5 AND b=17 ORDER BY c;
-</codeblock> <p>But now consider this: </p> <codeblock id="GUID-9363996C-8C30-5E04-B05F-392C8262F1F6" xml:space="preserve">
-SELECT data FROM demo316 WHERE a=5 ORDER BY c;
-</codeblock> <p>Here there is a gap between the ORDER BY term “c” and the
-WHERE clause term “a”. So the idx316 index cannot be used to satisfy both
-the WHERE clause and the ORDER BY clause. The index will be used on the WHERE
-clause and a separate sorting pass will occur to put the results in the correct
-order. </p> </section>
-<section id="GUID-109AF0DA-A054-504A-A432-76BD145B2AC4"><title>Add Result
-Columns To The End Of Indexes</title> <p>Queries will sometimes run faster
-if their result columns appear in the right-most entries of an index. Consider
-the following example: </p> <codeblock id="GUID-63292052-B523-5671-B3EE-E10A66C7275F" xml:space="preserve">
-CREATE TABLE demo317(a,b,c,data);
-CREATE INDEX idx317 ON demo316(a,b,c);
-</codeblock> <p>A query where all result column terms appears in the index,
-such as </p> <codeblock id="GUID-41F740E7-EAFC-583B-BFE6-E63DBEA354D7" xml:space="preserve">
-SELECT c FROM demo317 WHERE a=5 ORDER BY b;
-</codeblock> <p>will typically run about twice as fast or faster than a query
-that uses columns that are not in the index, e.g. </p> <codeblock id="GUID-098752F4-304A-5A84-834E-240D97D97C2D" xml:space="preserve">
-SELECT data FROM demo317 WHERE a=5 ORDER BY b;
-</codeblock> <p>The reason for this is that when all information is contained
-within the index entry only a single search has to be made for each row of
-output. But when some of the information is in the index and other parts are
-in the table, first there must be a search for the appropriate index entry
-then a separate search is made for the appropriate table row based on the
-RowID found in the index entry. Twice as much searching has to be done for
-each row of output generated. </p> <p>The extra query speed does not come
-for free, however. Adding additional columns to an index makes the database
-file larger. So when developing an application, the programmer will need to
-make a space versus time trade-off to determine whether the extra columns
-should be added to the index or not. </p> <p>Note that if any column of the
-result must be obtained from the original table, then the table row will have
-to be searched for anyhow. There will be no speed advantage, so you might
-as well omit the extra columns from the end of the index and save on storage
-space. The speed-up described in this section can only be realized when every
-column in a table is obtainable from the index. </p> <p>Taking into account
-the results of the previous few sections, the best set of columns to put in
-an index can be described as follows: </p> <ul>
-<li id="GUID-EBF4DEFB-2F5F-5D78-92FA-06FEAB0C3650"><p>The first columns in
-the index should be columns that have equality constraints in the WHERE clause
-of the query. </p> </li>
-<li id="GUID-E5CB725C-6304-5946-9E18-E69B5F1A6A88"><p>The second group of
-columns should match the columns specified in the ORDER BY clause. </p> </li>
-<li id="GUID-FBC00251-C3AD-5AC0-9102-EF66EA37DE4E"><p>Add additional columns
-to the end of the index that are used in the result set of the query. </p> </li>
-</ul> </section>
-<section id="GUID-D7B5B389-E031-5512-8186-235B22F0D9C1"><title>Resolve Indexing
-Ambiguities Using the Unary “+” Operator</title> <p>The SQLite query optimizer
-usually does a good job of choosing the best index to use for a particular
-query, especially if ANALYZE has been run to provide it with index performance
-statistics. But occasions do arise where it is useful to give the optimizer
-hints. </p> <p>One of the easiest ways to control the operation of the optimizer
-is to disqualify terms in the WHERE clause or ORDER BY clause as candidates
-for optimization by using the unary “+” operator. </p> <p>In SQLite, a unary
-“+” operator is a no-op. It makes no change to its operand, even if the operand
-is something other than a number. So you can always prefix a “+” to an expression
-in without changing the meaning of the expression. As the optimizer will only
-use terms in WHERE, HAVING, or ON clauses that have an index column name on
-one side of a comparison operator, you can prevent such a term from being
-used by the optimizer by prefixing the column name with a “+”. </p> <p>For
-example, suppose you have a database with a schema like this: </p> <codeblock id="GUID-E7747EFD-FE58-5EA4-88B3-097C0A303F52" xml:space="preserve">
-CREATE TABLE demo321(a,b,c,data);
-CREATE INDEX idx321a ON demo321(a);
-CREATE INDEX idx321b ON demo321(b);
-</codeblock> <p>If you issue a query such as this: </p> <codeblock id="GUID-87BD59FC-33A8-598B-B91F-607B26F7349D" xml:space="preserve">
-SELECT data FROM demo321 WHERE a=5 AND b=11;
-</codeblock> <p>The query optimizer might use the “a=5” term with idx321a
-or it might use the “b=11” term with the idx321b index. But if you want to
-force the use of the idx321a index you can accomplish that by disqualifying
-the second term of the WHERE clause as a candidate for optimization using
-a unary “+” like this: </p> <codeblock id="GUID-E6EAB459-726A-5FE4-8065-6C46AC2C5B5C" xml:space="preserve">
-SELECT data FROM demo321 WHERE a=5 AND +b=11;
-</codeblock> <p>The “+” in front of the “b=11” turns the left-hand side of
-the equals comparison operator into an expression instead of an indexed column
-name. The optimizer will then not recognize that the second term can be used
-with an index and so the optimizer is compelled to use the first “a=5” term. </p> <p>The
-unary “+” operator can also be used to disable ORDER BY clause optimizations.
-Consider this query: </p> <codeblock id="GUID-0488D466-77B7-50E0-AB85-FF033A2D75DC" xml:space="preserve">
-SELECT data FROM demo321 WHERE a=5 ORDER BY b;
-</codeblock> <p>The optimizer has the choice of using the “a=5” term of the
-WHERE clause with idx321a to restrict the search. Or it might choose to use
-do a full table scan with idx321b to satisfy the ORDER BY clause and thus
-avoid a separate sorting pass. You can force one choice or the other using
-a unary “+”. </p> <p>To force the use of idx321a on the WHERE clause, add
-the unary “+” in from of the “b” in the ORDER BY clause: </p> <codeblock id="GUID-E55A085F-D91F-58E0-B964-317BB3A9D7ED" xml:space="preserve">
-SELECT data FROM demo321 WHERE a=5 ORDER BY +b;
-</codeblock> <p>To go the other way and force the idx321b index to be used
-to satisfy the ORDER BY clause, disqualify the WHERE term by prefixing with
-a unary “+”: </p> <codeblock id="GUID-D97EF52A-1F74-57EB-AC11-7911B4E088B3" xml:space="preserve">
-SELECT data FROM demo321 WHERE +a=5 ORDER BY b;
-</codeblock> <p>The reader is cautioned not to overuse the unary “+” operator.
-The SQLite query optimizer usually picks the best index without any outside
-help. Premature use of unary “+” can confuse the optimizer and cause less
-than optimal performance. But in some cases it is useful to be able override
-the decisions of the optimizer, and the unary “+” operator is an excellent
-way to do this when it becomes necessary. </p> </section>
-<section id="GUID-7BEBC49C-0528-5D58-9626-2A92F3D0D9E8"><title>Avoid Indexing
-Large BLOBs and CLOBs</title> <p>SQLite stores indexes as b-trees. Each b-tree
-node uses one page of the database file. In order to maintain an acceptable
-fan-out, the b-tree module within SQLite requires that at least 4 entries
-must fit on each page of a b-tree. There is also some overhead associated
-with each b-tree page. So at the most there is about 250 bytes of space available
-on the main b-tree page for each index entry. </p> <p>If an index entry exceeds
-this allotment of approximately 250 bytes excess bytes are spilled to overflow
-pages. There is no arbitrary limit on the number of overflow pages or on the
-length of a b-tree entry, but for maximum efficiency it is best to avoid overflow
-pages, especially in indexes. This means that you should strive to keep the
-number of bytes in each index entry below 250. </p> <p>If you keep the size
-of indexes significantly smaller than 250 bytes, then the b-tree fan-out is
-increased and the binary search algorithm used to search for entries in an
-index has fewer pages to examine and therefore runs faster. So the fewer bytes
-used in each index entry the better, at least from a performance perspective. </p> <p>For
-these reasons, it is recommended that you avoid indexing large BLOBs and CLOBs.
-SQLite will continue to work when large BLOBs and CLOBs are indexed, but there
-will be a performance impact. </p> <p>On the other hand, if you need to lookup
-entries using a large BLOB or CLOB as the key, then by all means use an index.
-An index on a large BLOB or CLOB is not as fast as an index using more compact
-data types such as integers, but it is still many order of magnitude faster
-than doing a full table scan. So to be more precise, the advice of this section
-is that you should design your applications so that you do not need to lookup
-entries using a large BLOB or CLOB as the key. Try to arrange to have compact
-keys consisting of short strings or integers. </p> <p>Note that many other
-SQL database engines disallow the indexing of BLOBs and CLOBs in the first
-place. You simple cannot do it. SQLite is more flexible that most in that
-it does allow BLOBs and CLOBs to be indexed and it will use those indexes
-when appropriate. But for maximum performance, it is best to use smaller search
-keys. </p> </section>
-<section id="GUID-DD40F29F-DF93-536E-9B52-F9B9FF45155D"><title>Avoid Excess
-Indexes</title> <p>Some developers approach SQL-based application development
-with the attitude that indexes never hurt and that the more indexes you have,
-the faster your application will run. This is definitely not the case. There
-is a costs associated with each new index you create: </p> <ul>
-<li id="GUID-FD257BF7-F938-54B5-AC03-9536712D6281"><p>Each new index takes
-up additional space in the database file. The more indexes you have, the larger
-your database files will become for the same amount of data. </p> </li>
-<li id="GUID-E1B74FB6-246A-5148-AF06-04E1B4B949F1"><p>Every INSERT and UPDATE
-statement modifies both the original table and all indexes on that table.
-So the performance of INSERT and UPDATE decreases linearly with the number
-of indexes. </p> </li>
-<li id="GUID-56AAE2D1-71D6-5A23-8190-B0C80B204DED"><p>Compiling new SQL statements
-using <codeph>Prepare()</codeph> takes longer when there are more indexes
-for the optimizer to choose between. </p> </li>
-<li id="GUID-24B7F7D8-FAA9-5C78-B3C7-B886FA774C0B"><p>Surplus indexes give
-the optimizer more opportunities to make a bad choice. </p> </li>
-</ul> <p>Your policy on indexes should be to avoid them wherever you can.
-Indexes are powerful medicine and can work wonders to improve the performance
-of a program. But just as too many drugs can be worse than none at all, so
-also can too many indexes cause more harm than good. </p> <p>When building
-a new application, a good approach is to omit all explicitly declared indexes
-in the beginning and only add indexes as needed to address specific performance
-problems. </p> <p>Take care to avoid redundant indexes. For example, consider
-this schema: </p> <codeblock id="GUID-89F20101-1628-5783-82B0-2ABE84078C7D" xml:space="preserve">
-CREATE TABLE demo323a(a,b,c);
-CREATE INDEX idx323a1 ON demo323(a);
-CREATE INDEX idx323a2 ON demo323(a,b);
-</codeblock> <p>The idx323a1 index is redundant and can be eliminated. Anything
-that the idx323a1 index can do the idx323a2 index can do better. </p> <p>Other
-redundancies are not quite as apparent as the above. Recall that any column
-or columns that are declared UNIQUE or PRIMARY KEY (except for the special
-case of INTEGER PRIMARY KEY) are automatically indexed. So in the following
-schema: </p> <codeblock id="GUID-2FE7B726-4027-518C-9217-B4BD1ECDA991" xml:space="preserve">
-CREATE TABLE demo323b(x TEXT PRIMARY KEY, y INTEGER UNIQUE);
-CREATE INDEX idx323b1 ON demo323b(x);
-CREATE INDEX idx323b2 ON demo323b(y);
-</codeblock> <p>Both indexes are redundant and can be eliminated with no loss
-in query performance. Occasionally one sees a novice SQL programmer use both
-UNIQUE and PRIMARY KEY on the same column: </p> <codeblock id="GUID-CDE12649-BDB4-58D4-8981-02628BDF5C79" xml:space="preserve">
-CREATE TABLE demo323c(p TEXT UNIQUE PRIMARY KEY, q);
-</codeblock> <p>This has the effect of creating two indexes on the “p” column
-– one for the UNIQUE keywords and another for the PRIMARY KEY keyword. Both
-indexes are identical so clearly one can be omitted. A PRIMARY KEY is guaranteed
-to always be unique so the UNIQUE keyword can be removed from the demo323c
-table definition with no ambiguity or loss of functionality. </p> <p>It is
-not a fatal error to create too many indexes or redundant indexes. SQLite
-will continue to generate the correct answers but it may take longer to produce
-those answers and the resulting database files might be a little larger. So
-for best results, keep the number of indexes to a minimum. </p> </section>
-<section id="GUID-9337E315-BB5A-56D0-8319-6C398D26151F"><title>Avoid Tables
-and Indexes with an Excessive Number of Columns</title> <p>SQLite places no
-arbitrary limits on the number of columns in a table or index. There are known
-commercial applications using SQLite that construct tables with tens of thousands
-of columns each. And these applications actually work. </p> <p>However the
-database engine is optimized for the common case of tables with no more than
-a few dozen columns. For best performance you should try to stay in the optimized
-region. Furthermore, we note that relational databases with a large number
-of columns are usually not well normalized. So even apart from performance
-considerations, if you find your design has tables with more than a dozen
-or so columns, you really need to rethink how you are building your application. </p> <p>There
-are a number of places in <codeph>Prepare()</codeph> that run in time O(N<sup>2</sup>)
-where N is the number of columns in the table. The constant of proportionality
-is small in these cases so you should not have any problems for N of less
-than one hundred but for N on the order of a thousand, the time to run <codeph>Prepare()</codeph> can
-start to become noticeable. </p> <p>When the bytecode is running and it needs
-to access the i-th column of a table, the values of the previous i-1 columns
-must be accessed first. So if you have a large number of columns, accessing
-the last column can be an expensive operation. This fact also argues for putting
-smaller and more frequently accessed columns early in the table. </p> <p>There
-are certain optimizations that will only work if the table has 30 or fewer
-columns. The optimization that extracts all necessary information from an
-index and never refers to the underlying table works this way. So in some
-cases, keeping the number of columns in a table at or below 30 can result
-in a 2-fold speed improvement. </p> <p>Indexes will only be used if they contain
-30 or fewer columns. You can put as many columns in an index as you want,
-but if the number is greater than 30, the index will never improve performance
-and will never do anything but take up space in your database file. </p> </section>
-</conbody><related-links>
-<link href="GUID-22844C28-AB5B-5A6F-8863-7269464684B4.dita"><linktext>SQL Overview</linktext>
-</link>
-<link href="GUID-78773BCA-ADF6-53E6-AC80-5CB2AE1F8BCC.dita"><linktext>SQL Server
-Guide</linktext></link>
-<link href="GUID-E51836E1-D33E-506C-B75B-19B8E3CC313A.dita"><linktext>SQLite</linktext>
-</link>
-<link href="GUID-1F12E3F5-45B2-55EC-B021-00338277C608.dita"><linktext>SQL DB Overview</linktext>
-</link>
-<link href="GUID-43CA02E7-0101-5824-B91B-E15EE20C829A.dita"><linktext>Avoid Transient
-Tables</linktext></link>
-<link href="GUID-49A3419F-D20A-5C5D-B2FF-51724EF37704.dita"><linktext>Prevent Datafile
-Corruption</linktext></link>
-<link><linktext/></link>
-<link href="GUID-B994E6F7-228A-5433-B87F-91857C5D93D6.dita"><linktext>SQL Insertion
-Tips</linktext></link>
-<link href="GUID-4FC23DB7-4758-5DA4-81FF-0DAB169E2757.dita"><linktext>SQL Schema
-Tips</linktext></link>
-<link href="GUID-2A2920E0-5D40-5358-BC0C-8572CEFE078C.dita"><linktext>SQL Expressions</linktext>
-</link>
-<link href="GUID-126FCCCC-0E7D-59AE-959A-2F94A7319C4B.dita"><linktext>SQL Statement
-Tips</linktext></link>
-<link href="GUID-ACCCB148-DAF9-59EC-B585-8EF632B9BF04.dita"><linktext>SQL Joins</linktext>
-</link>
-<link href="GUID-B7E978C1-45CA-554C-8028-D901B97BA2E0.dita"><linktext> ANALYZE
-Command</linktext></link>
-<link href="GUID-AF5A75D7-0687-546C-87B2-0B7DF7D33217.dita"><linktext> SQL WHERE
-CLause Tips</linktext></link>
+<?xml version="1.0" encoding="utf-8"?>
+<!-- Copyright (c) 2007-2010 Nokia Corporation and/or its subsidiary(-ies) All rights reserved. -->
+<!-- This component and the accompanying materials are made available under the terms of the License 
+"Eclipse Public License v1.0" which accompanies this distribution, 
+and is available at the URL "http://www.eclipse.org/legal/epl-v10.html". -->
+<!-- Initial Contributors:
+    Nokia Corporation - initial contribution.
+Contributors: 
+-->
+<!DOCTYPE concept
+  PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="GUID-C2FAEBB2-4A1A-5BB0-9670-4801525CBC6A" xml:lang="en"><title>SQL Index Tips</title><shortdesc>This document includes several tips for using SQL indexes.</shortdesc><prolog><metadata><keywords/></metadata></prolog><conbody>
+<section id="GUID-3895F9D0-DE9C-4375-B541-AC99CABB7B8A"><title>Introduction</title> <p>You can use indexes to speed up access. You create indexes automatically
+using PRIMARY KEY and UNIQUE. </p> <p><b>Intended audience:</b> </p> <p>This document is intended to be
+used by Symbian platform licensees and third party application developers. </p> </section>
+<section id="GUID-765F0DF1-ACB0-57DB-B9A8-3697E4637065"><title>Use
+an Index to Speed up Access</title> <p>Suppose you have a table like
+this: </p> <codeblock id="GUID-F70B25AB-A151-52CE-A413-1C62A2464D6A" xml:space="preserve">
+CREATE TABLE demo5(
+    id INTEGER,
+    content BLOB
+);
+</codeblock> <p>Further suppose that this table contains thousands
+or millions of rows and you want to access a single row with a particular
+ID: </p> <codeblock id="GUID-B02FA452-4093-5383-BAFA-AE035919D720" xml:space="preserve">
+SELECT content FROM demo5 WHERE id=?
+</codeblock> <p>The only want that SQLite can perform this query,
+and be certain to get every row with the chosen ID, is to examine
+every single row, check the ID of that row, and return the content
+if the ID matches. Examining every single row this way is called a <i>full table scan</i>. </p> <p>Reading and checking every row of a
+large table can be very slow, so you want to avoid full table scans.
+The usual way to do this is to create an index on the column you are
+searching against. In the example above, an appropriate index would
+be this: </p> <codeblock id="GUID-82E337F1-2CA2-51B0-A7BC-071A83779A18" xml:space="preserve">
+CREATE INDEX demo5_idx1 ON demo5(id);
+</codeblock> <p>With an index on the ID column, SQLite is able to
+use a binary search to locate entries that contain a particular value
+of ID. So if the table contains a million rows, the query can be satisfied
+with about 20 accesses rather than 1000000 accesses. This is a huge
+performance improvement. </p> <p>One of the features of the SQL language
+is that you do not have to figure out what indexes you may need in
+advance of coding your application. It is perfectly acceptable, even
+preferable, to write the code for your application using a database
+without any indexes. Then once the application is running and you
+can make speed measurements, add whatever indexes are needed in order
+to make it run faster. </p> <p>When you add indexes, the query optimizer
+within the SQL compiler is able to find new more efficient bytecode
+procedures for carrying out the operations that your SQL statements
+specify. In other words, by adding indexes late in the development
+cycle you have the power to completely reorganize your data access
+patterns without changing a single line of code. </p> </section>
+<section id="GUID-BB1F17C5-1174-5DF4-AA61-611173237F3F"><title>Create
+Indexes Automatically Using PRIMARY KEY and UNIQUE</title> <p>Any
+column of a table that is declared to be the PRIMARY KEY or that is
+declared UNIQUE will be indexed automatically. There is no need to
+create a separate index on that column using the CREATE INDEX statement.
+So, for example, this table declaration: </p> <codeblock id="GUID-E4BE6077-F639-5CE7-964A-276B0D58A129" xml:space="preserve">
+CREATE TABLE demo39a(
+    id INTEGER,
+    content BLOB
+);
+
+CREATE INDEX demo39_idx1 ON demo39a(id);
+</codeblock> <p>Is roughly equivalent to the following: </p> <codeblock id="GUID-DB3167E0-FA95-50CA-92C7-102B5C2C13E3" xml:space="preserve">
+CREATE TABLE demo39b(
+    id INTEGER UNIQUE,
+    content BLOB
+);
+</codeblock> <p>The two examples above are “roughly” equivalent, but
+not exactly equivalent. Both tables have an index on the ID column.
+In the first case, the index is created explicitly. In the second
+case, the index is implied by the UNIQUE keyword in the type declaration
+of the ID column. Both table designs use exactly the same amount of
+disk space, and both will run queries such as </p> <codeblock id="GUID-7ACAE270-6D20-557B-B7D1-C90EDD757E43" xml:space="preserve">
+SELECT content FROM demo39 WHERE id=?
+</codeblock> <p>using exactly the same bytecode. The only difference
+is that table demo39a lets you insert multiple rows with the same
+ID whereas table demo39b will raise an exception if you try to insert
+a new row with the same ID as an existing row. </p> <p>If you use
+the UNIQUE keyword in the CREATE INDEX statement of demo39a, like
+this: </p> <codeblock id="GUID-0EE5E186-CC4A-5CC3-AEAE-F1482F1F8F9A" xml:space="preserve">
+CREATE UNIQUE INDEX demo39_idx1 ON demo39a(id);
+</codeblock> <p>Then both table designs really would be exactly the
+same in every way. In fact, whenever SQLite sees the UNIQUE keyword
+on a column type declaration, all it does is create an automatic unique
+index on that column. </p> <p>The PRIMARY KEY modifier on a column
+type declaration works like UNIQUE; it causes a unique index to be
+created automatically. The main difference is that you are only allowed
+to have a single PRIMARY KEY. This restriction of only allowing a
+single PRIMARY KEY is part of the official SQL language definition. </p> <p>The idea is that a PRIMARY KEY is used to order the rows on disk.
+Some SQL database engines actually implement PRIMARY KEYs this way.
+But with SQLite, a PRIMARY KEY is like any other UNIQUE column, with
+only one exception: INTEGER PRIMARY KEY is a special case which is
+handled differently, as described in the next section. </p> </section>
+<section id="GUID-BF7A0301-8490-58ED-BB37-FAC403A84230"><title>Use
+Multi-Column Indexes</title> <p>SQLite is able to make use of multi-column
+indexes. The rule is that if an index is over columns <i>X</i>  <i> 0 </i>, <i>X</i>  <i> 1 </i>, <i>X</i>  <i> 2 </i>, ..., <i>X</i>  <i> n </i> of some table, then the index can be used if the WHERE
+clause contains equality constraints for some prefix of those columns <i>X</i>  <i>0 </i>, <i>X</i>  <i>1 </i>, <i>X</i>  <i>2 </i>, ..., <i>X</i>  <i>i </i> where <i>i</i> is less than <i>n</i>. </p> <p>As
+an example, suppose you have a table and index declared as follows: </p> <codeblock id="GUID-C18C97F7-23CA-5636-9F00-130A8FB3DEF5" xml:space="preserve">
+CREATE TABLE demo314(a,b,c,d,e,f,g);
+CREATE INDEX demo314_idx ON demo314(a,b,c,d,e,f);
+</codeblock> <p>Then the index might be used to help with a query
+that contained a WHERE clause like this: </p> <codeblock id="GUID-8A0944F4-1ACF-5267-B49F-EB83EFBB5670" xml:space="preserve">
+... WHERE a=1 AND b='Smith' AND c=1
+</codeblock> <p>All three terms of the WHERE clause would be used
+together with the index in order to narrow the search. But the index
+could not be used if there WHERE clause said: </p> <codeblock id="GUID-B5F1C17F-0F5E-5FC2-A9A4-DF19D699A076" xml:space="preserve">
+... WHERE b='Smith' AND c=1
+</codeblock> <p>The second WHERE clause does not contain equality
+terms for a prefix of the columns in the index because it omits a
+term for the “a” column. </p> <p>In a case like this: </p> <codeblock id="GUID-EF2CFE7D-0456-5414-847D-BADCC057CFD8" xml:space="preserve">
+... WHERE a=1 AND c=1
+</codeblock> <p>Only the “a=1” term in the WHERE clause could be used
+to help narrow the search. The “c=1” term is not part of the prefix
+of terms in the index which have equality constraints because there
+is no equality constraint on the “b” column. </p> <p>SQLite only allows
+a single index to be used per table within a simple SQL statement.
+For UPDATE and DELETE statements, this means that only a single index
+can ever be used, since those statements can only operate on a single
+table at a time. </p> <p>In a simple SELECT statement multiple indexes
+can be used if the SELECT statement is a join – one index per table
+in the join. In a compound SELECT statement (two or more SELECT statements
+connected by UNION or INTERSECT or EXCEPT) each SELECT statement is
+treated separately and can have its own indexes. Likewise, SELECT
+statements that appear in subexpressions are treated separately. </p> <p>Some other SQL database engines (for example PostgreSQL) allow
+multiple indexes to be used for each table in a SELECT. For example,
+if you had a table and index in PostgreSQL like this: </p> <codeblock id="GUID-F5DE8F24-7471-5992-9896-295CE173D855" xml:space="preserve">
+CREATE TABLE pg1(a INT, b INT, c INT, d INT);
+CREATE INDEX pg1_ix1 ON pg1(a);
+CREATE INDEX pg1_ix2 ON pg1(b);
+CREATE INDEX pg1_ix3 ON pg1(c);
+</codeblock> <p>And if you were to run a query like the following: </p> <codeblock id="GUID-A35663B7-4E7D-5CC0-BF5E-CF3A4CFED63F" xml:space="preserve">
+SELECT d FROM pg1 WHERE a=5 AND b=11 AND c=99;
+</codeblock> <p>Then PostgreSQL might attempt to optimize the query
+by using all three indexes, one for each term of the WHERE clause. </p> <p>SQLite does not work this way. SQLite is compelled to select
+a single index to use in the query. It might select any of the three
+indexes shown, depending on which one the optimizer things will give
+the best speedup. But in every case it will only select a single index
+and only a single term of the WHERE clause will be used. </p> <p>SQLite
+prefers to use a multi-column index such as this: </p> <codeblock id="GUID-40FB7075-1239-5089-BBC5-0D994F4A0C39" xml:space="preserve">
+CREATE INDEX pg1_ix_all ON pg1(a,b,c);
+</codeblock> <p>If the pg1_ix_all index is available for use when
+the SELECT statement above is prepared, SQLite will likely choose
+it over any of the single-column indexes because the multi-column
+index is able to make use of all 3 terms of the WHERE clause. </p> <p>You can trick SQLite into using multiple indexes on the same
+table by rewriting the query. Instead of the SELECT statement shown
+above, if you rewrite it as this: </p> <codeblock id="GUID-D7DE75D4-BB01-50DF-A9DC-956A83DED5D0" xml:space="preserve">
+SELECT d FROM pg1 WHERE RowID IN (
+    SELECT RowID FROM pg1 WHERE a=5
+    INTERSECT
+    SELECT RowID FROM pg1 WHERE b=11
+    INTERSECT
+    SELECT RowID FROM pg1 WHERE c=99
+)
+</codeblock> <p>Then each of the individual SELECT statements will
+using a different single-column index and their results will be combined
+by the outer SELECT statement to give the correct result. The other
+SQL database engines like PostgreSQL that are able to make use of
+multiple indexes per table do so by treating the simpler SELECT statement
+shown first as if they where the more complicated SELECT statement
+shown here. </p> </section>
+<section id="GUID-E90057A8-70B6-590C-B8AE-616DA25BB543"><title>Use
+Inequality Constraints on the Last Index Term</title> <p>Terms in
+the WHERE clause of a query or UPDATE or DELETE statement are mostly
+likely to trigger the use of an index if they are an equality constraint
+– in other words if the term consists of the name of an indexed column,
+an equal sign (“=”), and an expression. </p> <p>So, for example, if
+you have a table and index that look like this: </p> <codeblock id="GUID-84AADB9D-5853-57C2-B489-87DC7FB7AADE" xml:space="preserve">
+CREATE TABLE demo315(a,b,c,d);
+CREATE INDEX demo315_idx1 ON demo315(a,b,c);
+</codeblock> <p>And a query like this: </p> <codeblock id="GUID-A2B7DA9F-DB82-5D06-80E2-7AF714E403D5" xml:space="preserve">
+SELECT d FROM demo315 WHERE a=512;
+</codeblock> <p>The single “a=512” term of the WHERE clause qualifies
+as an equality constraint and is likely to provoke the use of the
+demo315_idx1 index. </p> <p>SQLite supports two other kinds of equality
+constraints. One is the IN operator: </p> <codeblock id="GUID-EA5D7637-A6B8-5BC0-A72E-D576B0F945A3" xml:space="preserve">
+SELECT d FROM demo315 WHERE a IN (512,1024);
+SELECT d FROM demo315 WHERE a IN (SELECT x FROM someothertable);
+</codeblock> <p>There other is the IS NULL constraint: </p> <codeblock id="GUID-B2C1C84B-C33D-55C5-8484-24B28EFC8E37" xml:space="preserve">
+SELECT d FROM demo315 WHERE a IS NULL;
+</codeblock> <p>SQLite allows at most one term of an index to be constrained
+by an inequality such as less than “&lt;”, greater than “&gt;”, less
+than or equal to “&lt;=”, or greater than or equal to “&gt;=”. </p> <p>The column that the inequality constrains will be the right-most
+term of the index that is used. So, for example, in this query: </p> <codeblock id="GUID-563231B5-EC3A-57C2-BC6F-1A8129ADE308" xml:space="preserve">
+SELECT d FROM demo315 WHERE a=5 AND b&gt;11 AND c=1;
+</codeblock> <p>Only the first two terms of the WHERE clause will
+be used with the demo315_idx1 index. The third term, the “c=1” constraint,
+cannot be used because the “c” column occurs to the right of the “b”
+column in the index and the “b” column is constrained by an inequality. </p> <p>SQLite allows up to two inequalities on the same column as long
+as the two inequalities provide an upper and lower bound on the column.
+For example, in this query: </p> <codeblock id="GUID-4EB94886-EDFF-58F2-8692-011A67AC5A60" xml:space="preserve">
+SELECT d FROM demo315 WHERE a=5 AND b&gt;11 AND b&lt;23;
+</codeblock> <p>All three terms of the WHERE clause will be used because
+the two inequalities on the “b” column provide an upper and lower
+bound on the value of “b”. </p> <p>SQLite will only use the four inequalities
+mentioned above to help constrain a search: “&lt;”, “&gt;”, “&lt;=”,
+and “&gt;=”. Other inequality operators such as not equal to (“!=” or
+“&lt;&gt;”) and NOT NULL are not helpful to the query optimizer and will
+never be used to control an index and help make the query run faster. </p> </section>
+<section id="GUID-CAD0C181-37E7-578A-A7E1-7843447C247F"><title>Use
+Indexes To Help ORDER BY Clauses Evaluate Faster</title> <p>The default
+method for evaluating an ORDER BY clause in a SELECT statement is
+to first evaluate the SELECT statement and store the results in a
+temporary tables, then sort the temporary table according to the ORDER
+BY clause and scan the sorted temporary table to generate the final
+output. </p> <p>This method always works, but it requires three passes
+over the data (one pass to generate the result set, a second pass
+to sort the result set, and a third pass to output the results) and
+it requires a temporary storage space sufficiently large to contain
+the entire results set. </p> <p>Where possible, SQLite will avoid
+storing and sorting the result set by using an index that causes the
+results to emerge from the query in sorted order in the first place. </p> <p>The way to get SQLite to use an index for sorting is to provide
+an index that covers the same columns specified in the ORDER BY clause.
+For example, if the table and index are like this: </p> <codeblock id="GUID-F0103033-C5C8-5177-8AD7-70BCC45C33C9" xml:space="preserve">
+CREATE TABLE demo316(a,b,c,data);
+CREATE INDEX idx316 ON demo316(a,b,c);
+</codeblock> <p>And you do a query like this: </p> <codeblock id="GUID-D67BB6FF-E213-5B86-A2C1-E1992DA96A62" xml:space="preserve">
+SELECT data FROM demo316 ORDER BY a,b,c;
+</codeblock> <p>SQLite will use the idx316 index to implement the
+ORDER BY clause, obviating the need for temporary storage space and
+a separate sorting pass. </p> <p>An index can be used to satisfy the
+search constraints of a WHERE clause and to impose the ORDER BY ordering
+of outputs all at once. The trick is for the ORDER BY clause terms
+to occur immediately after the WHERE clause terms in the index. For
+example, one can write: </p> <codeblock id="GUID-02063968-34B5-5766-9D02-86D696D39C1E" xml:space="preserve">
+SELECT data FROM demo316 WHERE a=5 ORDER BY b,c;
+</codeblock> <p>The “a” column is used in the WHERE clause and the
+immediately following terms of the index, “b” and “c” are used in
+the ORDER BY clause. So in this case the idx316 index would be used
+both to speed up the search and to satisfy the ORDER BY clause. </p> <p>This query also uses the idx316 index because, once again, the
+ORDER BY clause term “c” immediate follows the WHERE clause terms
+“a” and “b” in the index: </p> <codeblock id="GUID-6760EC7E-E86A-5EBD-BDDD-32A68BE78A9E" xml:space="preserve">
+SELECT data FROM demo316 WHERE a=5 AND b=17 ORDER BY c;
+</codeblock> <p>But now consider this: </p> <codeblock id="GUID-9363996C-8C30-5E04-B05F-392C8262F1F6" xml:space="preserve">
+SELECT data FROM demo316 WHERE a=5 ORDER BY c;
+</codeblock> <p>Here there is a gap between the ORDER BY term “c”
+and the WHERE clause term “a”. So the idx316 index cannot be used
+to satisfy both the WHERE clause and the ORDER BY clause. The index
+will be used on the WHERE clause and a separate sorting pass will
+occur to put the results in the correct order. </p> </section>
+<section id="GUID-109AF0DA-A054-504A-A432-76BD145B2AC4"><title>Add
+Result Columns To The End Of Indexes</title> <p>Queries will sometimes
+run faster if their result columns appear in the right-most entries
+of an index. Consider the following example: </p> <codeblock id="GUID-63292052-B523-5671-B3EE-E10A66C7275F" xml:space="preserve">
+CREATE TABLE demo317(a,b,c,data);
+CREATE INDEX idx317 ON demo316(a,b,c);
+</codeblock> <p>A query where all result column terms appears in the
+index, such as </p> <codeblock id="GUID-41F740E7-EAFC-583B-BFE6-E63DBEA354D7" xml:space="preserve">
+SELECT c FROM demo317 WHERE a=5 ORDER BY b;
+</codeblock> <p>will typically run about twice as fast or faster than
+a query that uses columns that are not in the index, e.g. </p> <codeblock id="GUID-098752F4-304A-5A84-834E-240D97D97C2D" xml:space="preserve">
+SELECT data FROM demo317 WHERE a=5 ORDER BY b;
+</codeblock> <p>The reason for this is that when all information is
+contained within the index entry only a single search has to be made
+for each row of output. But when some of the information is in the
+index and other parts are in the table, first there must be a search
+for the appropriate index entry then a separate search is made for
+the appropriate table row based on the RowID found in the index entry.
+Twice as much searching has to be done for each row of output generated. </p> <p>The extra query speed does not come for free, however. Adding
+additional columns to an index makes the database file larger. So
+when developing an application, the programmer will need to make a
+space versus time trade-off to determine whether the extra columns
+should be added to the index or not. </p> <p>Note that if any column
+of the result must be obtained from the original table, then the table
+row will have to be searched for anyhow. There will be no speed advantage,
+so you might as well omit the extra columns from the end of the index
+and save on storage space. The speed-up described in this section
+can only be realized when every column in a table is obtainable from
+the index. </p> <p>Taking into account the results of the previous
+few sections, the best set of columns to put in an index can be described
+as follows: </p> <ul>
+<li id="GUID-EBF4DEFB-2F5F-5D78-92FA-06FEAB0C3650"><p>The first columns
+in the index should be columns that have equality constraints in the
+WHERE clause of the query. </p> </li>
+<li id="GUID-E5CB725C-6304-5946-9E18-E69B5F1A6A88"><p>The second group
+of columns should match the columns specified in the ORDER BY clause. </p> </li>
+<li id="GUID-FBC00251-C3AD-5AC0-9102-EF66EA37DE4E"><p>Add additional
+columns to the end of the index that are used in the result set of
+the query. </p> </li>
+</ul> </section>
+<section id="GUID-D7B5B389-E031-5512-8186-235B22F0D9C1"><title>Resolve
+Indexing Ambiguities Using the Unary “+” Operator</title> <p>The SQLite
+query optimizer usually does a good job of choosing the best index
+to use for a particular query, especially if ANALYZE has been run
+to provide it with index performance statistics. But occasions do
+arise where it is useful to give the optimizer hints. </p> <p>One
+of the easiest ways to control the operation of the optimizer is to
+disqualify terms in the WHERE clause or ORDER BY clause as candidates
+for optimization by using the unary “+” operator. </p> <p>In SQLite,
+a unary “+” operator is a no-op. It makes no change to its operand,
+even if the operand is something other than a number. So you can always
+prefix a “+” to an expression in without changing the meaning of the
+expression. As the optimizer will only use terms in WHERE, HAVING,
+or ON clauses that have an index column name on one side of a comparison
+operator, you can prevent such a term from being used by the optimizer
+by prefixing the column name with a “+”. </p> <p>For example, suppose
+you have a database with a schema like this: </p> <codeblock id="GUID-E7747EFD-FE58-5EA4-88B3-097C0A303F52" xml:space="preserve">
+CREATE TABLE demo321(a,b,c,data);
+CREATE INDEX idx321a ON demo321(a);
+CREATE INDEX idx321b ON demo321(b);
+</codeblock> <p>If you issue a query such as this: </p> <codeblock id="GUID-87BD59FC-33A8-598B-B91F-607B26F7349D" xml:space="preserve">
+SELECT data FROM demo321 WHERE a=5 AND b=11;
+</codeblock> <p>The query optimizer might use the “a=5” term with
+idx321a or it might use the “b=11” term with the idx321b index. But
+if you want to force the use of the idx321a index you can accomplish
+that by disqualifying the second term of the WHERE clause as a candidate
+for optimization using a unary “+” like this: </p> <codeblock id="GUID-E6EAB459-726A-5FE4-8065-6C46AC2C5B5C" xml:space="preserve">
+SELECT data FROM demo321 WHERE a=5 AND +b=11;
+</codeblock> <p>The “+” in front of the “b=11” turns the left-hand
+side of the equals comparison operator into an expression instead
+of an indexed column name. The optimizer will then not recognize that
+the second term can be used with an index and so the optimizer is
+compelled to use the first “a=5” term. </p> <p>The unary “+” operator
+can also be used to disable ORDER BY clause optimizations. Consider
+this query: </p> <codeblock id="GUID-0488D466-77B7-50E0-AB85-FF033A2D75DC" xml:space="preserve">
+SELECT data FROM demo321 WHERE a=5 ORDER BY b;
+</codeblock> <p>The optimizer has the choice of using the “a=5” term
+of the WHERE clause with idx321a to restrict the search. Or it might
+choose to use do a full table scan with idx321b to satisfy the ORDER
+BY clause and thus avoid a separate sorting pass. You can force one
+choice or the other using a unary “+”. </p> <p>To force the use of
+idx321a on the WHERE clause, add the unary “+” in from of the “b”
+in the ORDER BY clause: </p> <codeblock id="GUID-E55A085F-D91F-58E0-B964-317BB3A9D7ED" xml:space="preserve">
+SELECT data FROM demo321 WHERE a=5 ORDER BY +b;
+</codeblock> <p>To go the other way and force the idx321b index to
+be used to satisfy the ORDER BY clause, disqualify the WHERE term
+by prefixing with a unary “+”: </p> <codeblock id="GUID-D97EF52A-1F74-57EB-AC11-7911B4E088B3" xml:space="preserve">
+SELECT data FROM demo321 WHERE +a=5 ORDER BY b;
+</codeblock> <p>The reader is cautioned not to overuse the unary “+”
+operator. The SQLite query optimizer usually picks the best index
+without any outside help. Premature use of unary “+” can confuse the
+optimizer and cause less than optimal performance. But in some cases
+it is useful to be able override the decisions of the optimizer, and
+the unary “+” operator is an excellent way to do this when it becomes
+necessary. </p> </section>
+<section id="GUID-7BEBC49C-0528-5D58-9626-2A92F3D0D9E8"><title>Avoid
+Indexing Large BLOBs and CLOBs</title> <p>SQLite stores indexes as
+b-trees. Each b-tree node uses one page of the database file. In order
+to maintain an acceptable fan-out, the b-tree module within SQLite
+requires that at least 4 entries must fit on each page of a b-tree.
+There is also some overhead associated with each b-tree page. So at
+the most there is about 250 bytes of space available on the main b-tree
+page for each index entry. </p> <p>If an index entry exceeds this
+allotment of approximately 250 bytes excess bytes are spilled to overflow
+pages. There is no arbitrary limit on the number of overflow pages
+or on the length of a b-tree entry, but for maximum efficiency it
+is best to avoid overflow pages, especially in indexes. This means
+that you should strive to keep the number of bytes in each index entry
+below 250. </p> <p>If you keep the size of indexes significantly smaller
+than 250 bytes, then the b-tree fan-out is increased and the binary
+search algorithm used to search for entries in an index has fewer
+pages to examine and therefore runs faster. So the fewer bytes used
+in each index entry the better, at least from a performance perspective. </p> <p>For these reasons, it is recommended that you avoid indexing
+large BLOBs and CLOBs. SQLite will continue to work when large BLOBs
+and CLOBs are indexed, but there will be a performance impact. </p> <p>On the other hand, if you need to lookup entries using a large
+BLOB or CLOB as the key, then by all means use an index. An index
+on a large BLOB or CLOB is not as fast as an index using more compact
+data types such as integers, but it is still many order of magnitude
+faster than doing a full table scan. So to be more precise, the advice
+of this section is that you should design your applications so that
+you do not need to lookup entries using a large BLOB or CLOB as the
+key. Try to arrange to have compact keys consisting of short strings
+or integers. </p> <p>Note that many other SQL database engines disallow
+the indexing of BLOBs and CLOBs in the first place. You simple cannot
+do it. SQLite is more flexible that most in that it does allow BLOBs
+and CLOBs to be indexed and it will use those indexes when appropriate.
+But for maximum performance, it is best to use smaller search keys. </p> </section>
+<section id="GUID-DD40F29F-DF93-536E-9B52-F9B9FF45155D"><title>Avoid
+Excess Indexes</title> <p>Some developers approach SQL-based application
+development with the attitude that indexes never hurt and that the
+more indexes you have, the faster your application will run. This
+is definitely not the case. There is a costs associated with each
+new index you create: </p> <ul>
+<li id="GUID-FD257BF7-F938-54B5-AC03-9536712D6281"><p>Each new index
+takes up additional space in the database file. The more indexes you
+have, the larger your database files will become for the same amount
+of data. </p> </li>
+<li id="GUID-E1B74FB6-246A-5148-AF06-04E1B4B949F1"><p>Every INSERT
+and UPDATE statement modifies both the original table and all indexes
+on that table. So the performance of INSERT and UPDATE decreases linearly
+with the number of indexes. </p> </li>
+<li id="GUID-56AAE2D1-71D6-5A23-8190-B0C80B204DED"><p>Compiling new
+SQL statements using <codeph>Prepare()</codeph> takes longer when
+there are more indexes for the optimizer to choose between. </p> </li>
+<li id="GUID-24B7F7D8-FAA9-5C78-B3C7-B886FA774C0B"><p>Surplus indexes
+give the optimizer more opportunities to make a bad choice. </p> </li>
+</ul> <p>Your policy on indexes should be to avoid them wherever you
+can. Indexes are powerful medicine and can work wonders to improve
+the performance of a program. But just as too many drugs can be worse
+than none at all, so also can too many indexes cause more harm than
+good. </p> <p>When building a new application, a good approach is
+to omit all explicitly declared indexes in the beginning and only
+add indexes as needed to address specific performance problems. </p> <p>Take care to avoid redundant indexes. For example, consider this
+schema: </p> <codeblock id="GUID-89F20101-1628-5783-82B0-2ABE84078C7D" xml:space="preserve">
+CREATE TABLE demo323a(a,b,c);
+CREATE INDEX idx323a1 ON demo323(a);
+CREATE INDEX idx323a2 ON demo323(a,b);
+</codeblock> <p>The idx323a1 index is redundant and can be eliminated.
+Anything that the idx323a1 index can do the idx323a2 index can do
+better. </p> <p>Other redundancies are not quite as apparent as the
+above. Recall that any column or columns that are declared UNIQUE
+or PRIMARY KEY (except for the special case of INTEGER PRIMARY KEY)
+are automatically indexed. So in the following schema: </p> <codeblock id="GUID-2FE7B726-4027-518C-9217-B4BD1ECDA991" xml:space="preserve">
+CREATE TABLE demo323b(x TEXT PRIMARY KEY, y INTEGER UNIQUE);
+CREATE INDEX idx323b1 ON demo323b(x);
+CREATE INDEX idx323b2 ON demo323b(y);
+</codeblock> <p>Both indexes are redundant and can be eliminated with
+no loss in query performance. Occasionally one sees a novice SQL programmer
+use both UNIQUE and PRIMARY KEY on the same column: </p> <codeblock id="GUID-CDE12649-BDB4-58D4-8981-02628BDF5C79" xml:space="preserve">
+CREATE TABLE demo323c(p TEXT UNIQUE PRIMARY KEY, q);
+</codeblock> <p>This has the effect of creating two indexes on the
+“p” column – one for the UNIQUE keywords and another for the PRIMARY
+KEY keyword. Both indexes are identical so clearly one can be omitted.
+A PRIMARY KEY is guaranteed to always be unique so the UNIQUE keyword
+can be removed from the demo323c table definition with no ambiguity
+or loss of functionality. </p> <p>It is not a fatal error to create
+too many indexes or redundant indexes. SQLite will continue to generate
+the correct answers but it may take longer to produce those answers
+and the resulting database files might be a little larger. So for
+best results, keep the number of indexes to a minimum. </p> </section>
+<section id="GUID-9337E315-BB5A-56D0-8319-6C398D26151F"><title>Avoid
+Tables and Indexes with an Excessive Number of Columns</title> <p>SQLite places no arbitrary limits on the number of columns in a table
+or index. There are known commercial applications using SQLite that
+construct tables with tens of thousands of columns each. And these
+applications actually work. </p> <p>However the database engine is
+optimized for the common case of tables with no more than a few dozen
+columns. For best performance you should try to stay in the optimized
+region. Furthermore, we note that relational databases with a large
+number of columns are usually not well normalized. So even apart from
+performance considerations, if you find your design has tables with
+more than a dozen or so columns, you really need to rethink how you
+are building your application. </p> <p>There are a number of places
+in <codeph>Prepare()</codeph> that run in time O(N<sup>2</sup>) where
+N is the number of columns in the table. The constant of proportionality
+is small in these cases so you should not have any problems for N
+of less than one hundred but for N on the order of a thousand, the
+time to run <codeph>Prepare()</codeph> can start to become noticeable. </p> <p>When the bytecode is running and it needs to access the i-th
+column of a table, the values of the previous i-1 columns must be
+accessed first. So if you have a large number of columns, accessing
+the last column can be an expensive operation. This fact also argues
+for putting smaller and more frequently accessed columns early in
+the table. </p> <p>There are certain optimizations that will only
+work if the table has 30 or fewer columns. The optimization that extracts
+all necessary information from an index and never refers to the underlying
+table works this way. So in some cases, keeping the number of columns
+in a table at or below 30 can result in a 2-fold speed improvement. </p> <p>Indexes will only be used if they contain 30 or fewer columns.
+You can put as many columns in an index as you want, but if the number
+is greater than 30, the index will never improve performance and will
+never do anything but take up space in your database file. </p> </section>
+</conbody><related-links>
+<link href="GUID-22844C28-AB5B-5A6F-8863-7269464684B4.dita"><linktext>SQL
+Overview</linktext></link>
+<link href="GUID-78773BCA-ADF6-53E6-AC80-5CB2AE1F8BCC.dita"><linktext>SQL
+Server Guide</linktext></link>
+<link href="GUID-E51836E1-D33E-506C-B75B-19B8E3CC313A.dita"><linktext>SQLite</linktext>
+</link>
+<link href="GUID-1F12E3F5-45B2-55EC-B021-00338277C608.dita"><linktext>SQL
+DB Overview</linktext></link>
+<link href="GUID-43CA02E7-0101-5824-B91B-E15EE20C829A.dita"><linktext>Avoid
+Transient Tables</linktext></link>
+<link href="GUID-49A3419F-D20A-5C5D-B2FF-51724EF37704.dita"><linktext>Prevent
+Datafile Corruption</linktext></link>
+<link href="GUID-B994E6F7-228A-5433-B87F-91857C5D93D6.dita">
+<linktext>SQL Insertion Tips</linktext></link>
+<link href="GUID-4FC23DB7-4758-5DA4-81FF-0DAB169E2757.dita"><linktext>SQL
+Schema Tips</linktext></link>
+<link href="GUID-2A2920E0-5D40-5358-BC0C-8572CEFE078C.dita"><linktext>SQL
+Expressions</linktext></link>
+<link href="GUID-126FCCCC-0E7D-59AE-959A-2F94A7319C4B.dita"><linktext>SQL
+Statement Tips</linktext></link>
+<link href="GUID-ACCCB148-DAF9-59EC-B585-8EF632B9BF04.dita"><linktext>SQL
+Joins</linktext></link>
+<link href="GUID-B7E978C1-45CA-554C-8028-D901B97BA2E0.dita"><linktext> ANALYZE
+Command</linktext></link>
+<link href="GUID-AF5A75D7-0687-546C-87B2-0B7DF7D33217.dita"><linktext> SQL
+WHERE CLause Tips</linktext></link>
 </related-links></concept>
\ No newline at end of file