Symbian3/SDK/Source/GUID-4FC23DB7-4758-5DA4-81FF-0DAB169E2757.dita
changeset 7 51a74ef9ed63
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/Symbian3/SDK/Source/GUID-4FC23DB7-4758-5DA4-81FF-0DAB169E2757.dita	Wed Mar 31 11:11:55 2010 +0100
@@ -0,0 +1,322 @@
+<?xml version="1.0" encoding="utf-8"?>
+<!-- Copyright (c) 2007-2010 Nokia Corporation and/or its subsidiary(-ies) All rights reserved. -->
+<!-- This component and the accompanying materials are made available under the terms of the License 
+"Eclipse Public License v1.0" which accompanies this distribution, 
+and is available at the URL "http://www.eclipse.org/legal/epl-v10.html". -->
+<!-- Initial Contributors:
+    Nokia Corporation - initial contribution.
+Contributors: 
+-->
+<!DOCTYPE concept
+  PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="GUID-4FC23DB7-4758-5DA4-81FF-0DAB169E2757" xml:lang="en"><title>SQL Schema
+Tips</title><shortdesc>This guide gives tips for using the database schema. </shortdesc><prolog><metadata><keywords/></metadata></prolog><conbody>
+<section id="GUID-37B02CC0-B591-542D-8B02-B6E2F3FF2097"><title>Introduction</title> <p>A
+schema is a way of organizing a number of related database objects such as
+tables. The schema contains at least one object but can contain as many as
+the total number of tables in the database. This section provides some handy
+tips for working with schemas in Symbian SQL. </p> <p>The following tips or
+techniques are discussed: </p> <ul>
+<li id="GUID-F50E880D-BE59-5915-BE5A-E752EA2D3868"><p> <xref href="GUID-4FC23DB7-4758-5DA4-81FF-0DAB169E2757.dita#GUID-4FC23DB7-4758-5DA4-81FF-0DAB169E2757/GUID-296FB987-9E61-5925-9246-0468534D16A9">Handling SQL schema errors</xref> </p> </li>
+<li id="GUID-A5C25843-9D8E-5F2F-96FD-61BA05B4FEC6"><p> <xref href="GUID-4FC23DB7-4758-5DA4-81FF-0DAB169E2757.dita#GUID-4FC23DB7-4758-5DA4-81FF-0DAB169E2757/GUID-0D7324A8-8DAF-57B5-BCA1-C8BDD855773C">Designing tables for more efficient access</xref> </p> </li>
+<li id="GUID-A300E8B2-A0F2-5EC1-AAE4-09909FA69E98"><p> <xref href="GUID-4FC23DB7-4758-5DA4-81FF-0DAB169E2757.dita#GUID-4FC23DB7-4758-5DA4-81FF-0DAB169E2757/GUID-DAF6BE02-7D38-5667-86B2-D12EB34BB6EC">Storing Large BLOBs Separately from the Database</xref> </p> </li>
+<li id="GUID-C66BE626-F551-58E5-98C9-33090DF96272"><p> <xref href="GUID-4FC23DB7-4758-5DA4-81FF-0DAB169E2757.dita#GUID-4FC23DB7-4758-5DA4-81FF-0DAB169E2757/GUID-8F18A8B8-706F-5F53-88BC-249F9EB48110">Using INTEGER PRIMARY KEY</xref> </p> </li>
+<li id="GUID-86442FBC-699C-5101-95CB-1BA5229DE63A"><p> <xref href="GUID-4FC23DB7-4758-5DA4-81FF-0DAB169E2757.dita#GUID-4FC23DB7-4758-5DA4-81FF-0DAB169E2757/GUID-BD8FE233-2AA5-5DF7-9A84-EEC5836669E8">Efficient Primary Key and Row ID values</xref> </p> </li>
+<li id="GUID-8FEA56BD-19A7-59A2-8B26-01A3594D6FC7"><p> <xref href="GUID-4FC23DB7-4758-5DA4-81FF-0DAB169E2757.dita#GUID-4FC23DB7-4758-5DA4-81FF-0DAB169E2757/GUID-40CDE965-B261-5873-9F8B-7812EEC14AD7">Keep Schemas Small and Constant</xref> </p> </li>
+</ul> </section>
+<section id="GUID-296FB987-9E61-5925-9246-0468534D16A9"><title>Handling SQL
+schema errors</title> <p>In order to translate a SQL statement into bytecode, <codeph>Prepare()</codeph> needs
+to know the database schema. For any given SQL statement there are typically
+many different ways of implementing the statement in bytecode. The compiler
+(or a particular part of the compiler that is commonly referred to as the <i>optimizer</i>)
+tries to pick the most efficient implementation based on what is known about
+the content of the various tables and what indexes are available. The bytecode
+depends so much on the database schema that the slightest change to the schema
+can cause totally different bytecode to be generated for the same SQL statement. </p> <p>When <codeph>Prepare()</codeph> runs,
+it uses the database schema as of the last time the database was accessed
+when preparing the bytecode. So if the database schema has been changed by
+another program, or if the schema is changed after <codeph>Prepare()</codeph> completes
+but before <codeph>Exec()</codeph> or <codeph>Next()</codeph> is run, the
+schema of the database will not match the schema used to prepare the bytecode. </p> <p>To
+guard against any problems, the first few instructions of bytecode in a prepared
+statement read the current schema cookie for the database and check to see
+that it is the same as it was when <codeph>Prepare()</codeph> was run. If
+the schema cookie has changed, then <codeph>Exec()</codeph> and <codeph>Next()</codeph> terminate
+with the error <xref href="GUID-1BCF6DC7-8419-3399-9890-627048A74F52.dita"><apiname>KSqlErrSchema</apiname></xref>. </p> <p>When this occurs,
+a new prepared statement should be created by calling <codeph>Prepare()</codeph> again
+with the same SQL and the original prepared statement should be discarded. </p> <p>It
+is important to recognize that a <xref href="GUID-1BCF6DC7-8419-3399-9890-627048A74F52.dita"><apiname>KSqlErrSchema</apiname></xref> error can
+occur on any call to <codeph>Exec()</codeph> or <codeph>Next()</codeph>. A
+common programming error with is to omit the necessary error handling or to
+do it incorrectly. The error is, unfortunately, often overlooked during testing
+because the usual cause of an <xref href="GUID-1BCF6DC7-8419-3399-9890-627048A74F52.dita"><apiname>KSqlErrSchema</apiname></xref> error is an
+external program making a schema change at just the wrong moment, and external
+programs are not normally running during testing. </p> </section>
+<section id="GUID-0D7324A8-8DAF-57B5-BCA1-C8BDD855773C"><title> Designing
+tables for more efficient access</title> <p>Put smaller and frequently accessed
+columns at the beginning of tables. SQLite stores a row of a table by gathering
+the data for all columns in that row and packing the data together in column
+order into as few bytes as possible. <i>Column order</i> means that the data
+for the first declared column of the table – the column that appears first
+in the CREATE TABLE statement – is packed into the bundle first. The data
+for the second column is packed into the bundle second. And so forth. </p> <p>SQLite
+goes to some trouble to use as few bytes as possible when storing data. A
+text string that is 5 bytes long, for example, is stored using just 5 bytes.
+The '\00' terminator on the end is omitted. No padding bytes or overhead are
+added even if the column specifies a larger string such as VARCHAR(1000). </p> <p>Small
+integers (between -127 and +127) are stored as a single byte. As the magnitude
+of the integer increases, additional bytes of storage are added as necessary
+up to 8 bytes. Floating point numbers are normally stored as 8-byte IEEE floats,
+but if the floating point number can be represented as a smaller integer without
+loss of information, it is converted and stored that way to save space. BLOBs
+are also stored using the minimum number of bytes necessary. </p> <p>These
+efforts to save space both help to keep SQLite database files small and also
+improve performance since the smaller each record is the more information
+will fit in the same amount of space and the less disk I/O needs to occur. </p> <p>The
+downside of using compressed storage is that data in each column is an unpredictable
+size. So within each row, we do not know in advance where one column ends
+and the next column begins. To extract the data from the N-th column of a
+row, SQLite has to decode the data from the N-1 prior columns first. Thus,
+for example, if you have a query that reads just the fourth column of a table,
+the bytecode that implements the query has to read and discard the data from
+the first, second, and third columns in order to find where the data for the
+fourth column begins in order to read it out. </p> <p>For this reason, it
+is best to put smaller and more frequently accessed columns of a table early
+in the CREATE TABLE statement and put large CLOBs and BLOBs and infrequently
+accessed information toward the end. </p> </section>
+<section id="GUID-DAF6BE02-7D38-5667-86B2-D12EB34BB6EC"><title>Storing Large
+BLOBs Separately from the Database</title> <p>SQLite does not place arbitrary
+constraints on the size of BLOBs and CLOBs that it will store. The only real
+restriction is that the entire BLOB or CLOB must fit in memory all at once.
+But just because you can do this does not mean you should. </p> <p>Although
+SQLite is able to handle really big BLOBs, it is not optimized for that case.
+If performance and storage efficiency is a concern, you will be better served
+to store large BLOBs and CLOBs in separate disk files then store the name
+of the file in the database in place of the actual content. </p> <p>When SQLite
+encounters a large table row – e.g. because it contains one or more large
+BLOBs – it tries to store as much of the row as it can on a single page of
+the database. The tail of the row that will not fit on a single page is spilled
+into overflow pages. If more than one overflow page is required, then the
+overflow pages form a linked list. </p> <p>To store a 1MiB BLOB in a database
+with a 1KiB page size will require the BLOB to be broken into a list of over
+a thousand links, each of which gets stored separately. Whenever you want
+to read the BLOB out of the database, SQLite has to walk this thousand-member
+linked list in order to reassemble the BLOB again. This works and is reliable,
+but it is not especially efficient. </p> <p>Another problem with large BLOBs
+and CLOBs is that they can be neither read nor written incrementally. The
+entire BLOB or CLOB must be read or written all at once. So, for example,
+to append 10 bytes to the end of a 10MiB BLOB, one has to read the entire
+BLOB, append the 10 bytes, then write the new BLOB back out again. With a
+disk file, on the other hand, you can append 10 bytes quickly and efficiently
+without having to access the other 10 million bytes of the original file. </p> <p>If
+you absolutely need to store large BLOBs and CLOBs in your database (perhaps
+so that changes to these data elements will be atomic and durable) then at
+least consider storing the BLOBs and CLOBs in separate tables that contain
+only an INTEGER PRIMARY KEY and the content of the BLOB or CLOB. So, instead
+of writing your schema this way: </p> <codeblock id="GUID-A501A0C1-5AB2-5EBA-9DB3-2C786C1DFFFF" xml:space="preserve">
+CREATE TABLE image(
+    imageId INTEGER PRIMARY KEY,
+    imageTitle TEXT,
+    imageWidth INTEGER,
+    imageHeight INTEGER,
+    imageRefCnt INTEGER,
+    imageBlob BLOB
+);
+</codeblock> <p>Factor out the large BLOB into a separate table so that your
+schema looks more like this: </p> <codeblock id="GUID-F3CEBEDB-9D4A-5930-8AFE-3FE33DECE1CF" xml:space="preserve">
+CREATE TABLE image(
+    imageId INTEGER PRIMARY KEY,
+    imageTitle TEXT,
+    imageWidth INTEGER,
+    imageHeight INTEGER,
+    imageRefCnt INTEGER
+);
+
+CREATE TABLE imagedata(
+    imageId INTEGER PRIMARY KEY,
+    imageBlob BLOB
+);
+</codeblock> <p>Keeping all of the small columns in a separate table increases
+locality of reference and allows queries that do not use the large BLOB to
+run much faster. You can still write single queries that return both the smaller
+fields and the BLOB by using a simple (and very efficient) join on the INTEGER
+PRIMARY KEY. </p> </section>
+<section id="GUID-8F18A8B8-706F-5F53-88BC-249F9EB48110"><title>Using INTEGER
+PRIMARY KEY</title> <p>INTEGER PRIMARY KEY is a fast special case. Every row
+of every SQLite table has a signed 64-bit integer RowID. This RowID is the
+key for the b-tree that holds the table content. The RowID must be unique
+over all other rows in the same table. When you go to find, insert, or remove
+a row from a table in SQLite, the row is first located by search for its RowID.
+Searching by RowID is very fast. Everything in SQLite tables centres around
+RowIDs. </p> <p>In the CREATE TABLE statement that defines a table, if you
+declare a column to be of type INTEGER PRIMARY KEY, then that column becomes
+an alias for the RowID. It is generally a good idea to create such a column
+whenever it is practical. </p> <p>Looking up information by RowID or INTEGER
+PRIMARY KEY is usually about twice as fast as any other search method in SQLite.
+So for example, if we have an indexed table like this: </p> <codeblock id="GUID-A168F121-43D7-52A1-B9BD-E92160473FE1" xml:space="preserve">
+CREATE TABLE demo1(
+    id1 INTEGER PRIMARY KEY,
+    id2 INTEGER UNIQUE,
+    content BLOB
+);
+</codeblock> <p>Then queries against the INTEGER PRIMARY KEY, e.g. of the
+form: </p> <codeblock id="GUID-FE186FEB-8134-56BE-A23D-219C20A89FC3" xml:space="preserve">
+SELECT content FROM demo1 WHERE id1=?;
+</codeblock> <p>will be about twice as fast as queries like this: </p> <codeblock id="GUID-BD9DD075-48DB-51FA-823D-CB278E72E275" xml:space="preserve">
+SELECT content FROM demo1 WHERE id2=?;
+</codeblock> <p>Note that the following two queries are identical in SQLite
+– not just equivalent but identical. They generate exactly the same bytecode: </p> <codeblock id="GUID-B28BE56D-F947-5B75-9602-3A1024A8CEC1" xml:space="preserve">
+SELECT content FROM demo1 WHERE id1=?;
+SELECT content FROM demo1 WHERE RowID=?;
+</codeblock> <p>But for stylistic and portability reasons, the first form
+of the query is preferred. </p> <p>Also observe that in order for the INTEGER
+PRIMARY KEY to truly be an alias for the RowID, the declared type must be
+exactly “INTEGER PRIMARY KEY”. </p> <p>Variations such as “INT PRIMARY KEY”
+or “UNSIGNED INTEGER PRIMARY KEY” or “SHORTINT PRIMARY KEY” become independent
+columns instead of aliases for the RowID. </p> <p>So be careful to always
+spell INTEGER PRIMARY KEY correctly in your CREATE TABLE statements. </p> </section>
+<section id="GUID-BD8FE233-2AA5-5DF7-9A84-EEC5836669E8"><title> Efficient
+Primary Key and Row ID values</title> <p>Use small positive integers for INTEGER
+PRIMARY KEY or RowID. As explained in the section <xref href="GUID-4FC23DB7-4758-5DA4-81FF-0DAB169E2757.dita#GUID-4FC23DB7-4758-5DA4-81FF-0DAB169E2757/GUID-8F18A8B8-706F-5F53-88BC-249F9EB48110">Using INTEGER PRIMARY KEY</xref>, every table row has a key which is a signed
+64-bit integer: the RowID or INTEGER PRIMARY KEY. This same RowID also occurs
+in every index entry and is used to refer the index entry back to the original
+table row. </p> <p>When writing the RowID to disk, SQLite encodes the 64-bit
+integer value using a Huffman code over a fixed probability distribution.
+The resulting encoding requires between 1 and 9 bytes of storage space, depending
+on the magnitude of the integer. Small, non-negative integers require less
+space than larger or negative integers. Table 3.1 shows the storage requirements: </p> <table id="GUID-A50643C5-5D28-56C1-B1D9-6BFF1E9551B5">
+<tgroup cols="2"><colspec colname="col0"/><colspec colname="col1"/>
+<thead>
+<row>
+<entry>RowID Magnitude</entry>
+<entry>Bytes Of Storage</entry>
+</row>
+</thead>
+<tbody>
+<row>
+<entry><p>0 to 127 </p> </entry>
+<entry><p>1</p> </entry>
+</row>
+<row>
+<entry><p>128 to 16383 </p> </entry>
+<entry><p>2</p> </entry>
+</row>
+<row>
+<entry><p>16384 to 2097151 </p> </entry>
+<entry><p>3</p> </entry>
+</row>
+<row>
+<entry><p>2097152 to 268435455 </p> </entry>
+<entry><p>4</p> </entry>
+</row>
+<row>
+<entry><p>268435456 to 34359738367 </p> </entry>
+<entry><p>5</p> </entry>
+</row>
+<row>
+<entry><p>34359738368 to 4398046511103 </p> </entry>
+<entry><p>6</p> </entry>
+</row>
+<row>
+<entry><p>4398046511104 to 562949953421311 </p> </entry>
+<entry><p>7</p> </entry>
+</row>
+<row>
+<entry><p>562949953421312 to 72057594037927935 </p> </entry>
+<entry><p>8</p> </entry>
+</row>
+<row>
+<entry><p>Less than 0 or greater than 72057594037927935 </p> </entry>
+<entry><p>9</p> </entry>
+</row>
+</tbody>
+</tgroup>
+</table> <p> <b>Table 3.1:</b> Storage requirements for RowIDs of different
+magnitudes </p> <p>This chart makes it clear that the best way to keep reduce
+the number of bytes of disk space devoted to storing RowIDs is to keep RowIDs
+small non-negative integers. The database will continue to work fine with
+large or negative RowIDs, but the database files will take up more space. </p> <p>SQLite
+normally works by assigning RowIDs automatically. An automatically generated
+RowID will normally be the smallest positive integer that is not already used
+as a RowID in the same table. The first RowID assigned will be 1, the second
+2, and so forth. So the automatic RowID assignment algorithm begins by using
+1-byte RowIDs, then moves to 2-byte RowIDs as the number of rows increased,
+then 3-byte RowIDs, and so forth. In other words, the automatic RowID assignment
+algorithm does a good job of selecting RowIDs that minimize storage requirements. </p> <p>If
+a table contains an INTEGER PRIMARY KEY column, that column becomes an alias
+for the RowID. If you then specify a particular RowID when doing an INSERT,
+the RowID you specify overrides the default choice made by SQLite itself.
+For example, if your table is this: </p> <codeblock id="GUID-54E6F47E-5FBC-5CB2-8A3A-B81B758C112A" xml:space="preserve">
+CREATE TABLE demo311(
+    id INTEGER PRIMARY KEY,
+    content VARCHAR(100)
+);
+</codeblock> <p>Then if you insert like this: </p> <codeblock id="GUID-3D820D52-B53F-5F74-A727-81CEB3677295" xml:space="preserve">
+INSERT INTO demo311(id,content) VALUES(-17,'Hello');
+</codeblock> <p>The value -17 is used as the RowID which requires 9 bytes
+of space in the disk file because it is negative. If instead of specifying
+the id value you had just left it blank: </p> <codeblock id="GUID-8CBF96D3-9FB6-54C1-861C-CF85FF3F76A8" xml:space="preserve">
+INSERT INTO demo311(content) VALUES('Hello');
+</codeblock> <p>Or if you had specified a NULL as the INTEGER PRIMARY KEY
+value: </p> <codeblock id="GUID-0DEF58F0-DC25-586A-B803-4B581FF8653C" xml:space="preserve">
+INSERT INTO demo311(id,content) VALUES(NULL,'Hello');
+</codeblock> <p>Then in either case, SQLite would have automatically selected
+a RowID value that was the smallest positive integer not already in use, thus
+minimizing the amount of disk space required. </p> <p>Therefore unless you
+have specific requirements to the contrary, it generally works best to let
+SQLite pick its own RowID and INTEGER PRIMARY KEY values. </p> </section>
+<section id="GUID-40CDE965-B261-5873-9F8B-7812EEC14AD7"><title>Keep Schemas
+Small and Constant</title> <p>SQLite stores the original text of all CREATE
+statements in a system table of the database. When a database session is opened
+or when the database schema is changed, SQLite has to read and parse all of
+these CREATE statements in order to reconstruct its internal symbol tables. </p> <p>The
+parser and symbol table builder inside SQLite are very fast, but they still
+take time which is proportional to the size of the schema. To minimize the
+computational overhead associated with parsing the schema: </p> <ul>
+<li id="GUID-36BDF092-298D-564D-9744-9C83836A2FB2"><p>Keep the number and
+size of CREATE statements in the schema to a minimum. </p> </li>
+<li id="GUID-DF3A6363-7C24-5727-AD27-58367EFFE1A4"><p>Avoid unnecessary calls
+to <xref href="GUID-4688F6B7-E1B0-37CF-BAA2-C6BD103D4FDF.dita#GUID-4688F6B7-E1B0-37CF-BAA2-C6BD103D4FDF/GUID-195907BE-273B-3DD1-95C9-D48870030914"><apiname>RSqlDatabase::Open()</apiname></xref>. Reuse the same database session
+where possible. </p> </li>
+<li id="GUID-D1A62767-9CF7-5FBE-8F51-98EB154559C1"><p>Avoid changing the database
+schema while other threads or processes might be using the database. </p> </li>
+</ul> <p>Whenever the schema is reparsed, all statements that were prepared
+using the old schema are invalidated. A call <codeph>Next()</codeph> or <codeph>Exec()</codeph> on
+a <xref href="GUID-0176BF07-DF94-3259-8F90-DE030E35CE9A.dita"><apiname>RSqlStatement</apiname></xref> object that was prepared prior to the schema
+reparse will return <xref href="GUID-1BCF6DC7-8419-3399-9890-627048A74F52.dita"><apiname>KSqlErrSchema</apiname></xref> to let you know that
+the cause of the error was a schema change. Thus any prepared statements that
+were cached will have to be discarded and re-prepared following a schema change.
+Refilling the prepared statement cache can be as computationally expensive
+as parsing the schema in the first place. </p> <p>Prepared statements can
+be invalidated for reasons other than schema changes including executing <codeph>Attach()</codeph> or <codeph>Detach()</codeph>.
+Avoid this if you have a large cache of prepared statements. The best strategy
+is to attach all associated databases as soon as a new database session is
+opened and before any statements are prepared. </p> </section>
+</conbody><related-links>
+<link href="GUID-22844C28-AB5B-5A6F-8863-7269464684B4.dita"><linktext>SQL Overview</linktext>
+</link>
+<link href="GUID-78773BCA-ADF6-53E6-AC80-5CB2AE1F8BCC.dita"><linktext>SQL Server
+Guide</linktext></link>
+<link href="GUID-E51836E1-D33E-506C-B75B-19B8E3CC313A.dita"><linktext>SQLite</linktext>
+</link>
+<link href="GUID-1F12E3F5-45B2-55EC-B021-00338277C608.dita"><linktext>SQL DB Overview</linktext>
+</link>
+<link href="GUID-43CA02E7-0101-5824-B91B-E15EE20C829A.dita"><linktext>Avoid Transient
+Tables</linktext></link>
+<link href="GUID-49A3419F-D20A-5C5D-B2FF-51724EF37704.dita"><linktext>Prevent Datafile
+Corruption</linktext></link>
+<link href="GUID-C2FAEBB2-4A1A-5BB0-9670-4801525CBC6A.dita"><linktext>SQL Index
+Tips</linktext></link>
+<link href="GUID-B994E6F7-228A-5433-B87F-91857C5D93D6.dita"><linktext>SQL Insertion
+Tips</linktext></link>
+<link href="GUID-2A2920E0-5D40-5358-BC0C-8572CEFE078C.dita"><linktext>SQL Expressions</linktext>
+</link>
+<link href="GUID-126FCCCC-0E7D-59AE-959A-2F94A7319C4B.dita"><linktext>SQL Statement
+Tips</linktext></link>
+<link href="GUID-ACCCB148-DAF9-59EC-B585-8EF632B9BF04.dita"><linktext>SQL Joins</linktext>
+</link>
+<link href="GUID-B7E978C1-45CA-554C-8028-D901B97BA2E0.dita"><linktext> ANALYZE
+Command</linktext></link>
+<link href="GUID-AF5A75D7-0687-546C-87B2-0B7DF7D33217.dita"><linktext> SQL WHERE
+CLause Tips</linktext></link>
+</related-links></concept>
\ No newline at end of file