Symbian3/SDK/Source/GUID-B994E6F7-228A-5433-B87F-91857C5D93D6.dita
changeset 7 51a74ef9ed63
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/Symbian3/SDK/Source/GUID-B994E6F7-228A-5433-B87F-91857C5D93D6.dita	Wed Mar 31 11:11:55 2010 +0100
@@ -0,0 +1,243 @@
+<?xml version="1.0" encoding="utf-8"?>
+<!-- Copyright (c) 2007-2010 Nokia Corporation and/or its subsidiary(-ies) All rights reserved. -->
+<!-- This component and the accompanying materials are made available under the terms of the License 
+"Eclipse Public License v1.0" which accompanies this distribution, 
+and is available at the URL "http://www.eclipse.org/legal/epl-v10.html". -->
+<!-- Initial Contributors:
+    Nokia Corporation - initial contribution.
+Contributors: 
+-->
+<!DOCTYPE concept
+  PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="GUID-B994E6F7-228A-5433-B87F-91857C5D93D6" xml:lang="en"><title>SQL Insertion
+Tips</title><shortdesc>This guide provides some tips for using COMMIT statements.</shortdesc><prolog><metadata><keywords/></metadata></prolog><conbody>
+<section id="GUID-A3DC5F3F-92C6-4847-A9C5-A227F777D7D3"><title>Introduction</title> <p>INSERT, UPDATE and DELETE operations
+are all very fast. However, COMMIT statements are very slow. You need to consider
+several techniques to ensure you make the best use of COMMIT statements. </p> <p><b>Intended audience:</b> </p> <p>This document is intended to be used by
+Symbian platfomr licensees and third party application developers. </p> </section>
+<section id="GUID-C12416F0-87D5-59AA-A08F-A1741A3FD3ED"><title>INSERT and
+UPDATE are Fast but COMMIT is Slow</title> <p>A programmer migrating to this
+database engine might write a test program to see how many INSERT statements
+per second it can do. They create an empty database with a single empty table.
+Then they write a loop that runs a few thousand times and does a single INSERT
+statement on each iteration. Upon timing this program they find that it appears
+to only be doing a couple of dozen INSERTs per second. </p> <p>“Everybody
+I talked to says SQLite is suppose to be really fast”, the new programmer
+will typically complain, “But I'm only getting 20 or 30 INSERTs per second!” </p> <p>In
+reality, SQLite can achieve around 50000 or more INSERTs per second on a modern
+workstation, although less on a typical embedded platform. But the characteristics
+of the underlying storage medium and the fact that the database engine guarantees
+atomic updates to the database mean that it can only achieve a few dozen COMMIT
+operations per second. </p> <p>Unless you take specific action to tell SQLite
+to do otherwise, it will automatically insert a COMMIT operation after every
+insert. So the programmers described above are really measuring the number
+of transactions per second, not the number of INSERTs. This is a very important
+distinction. </p> <p>Why is COMMIT so much slower than INSERT? SQLite guarantees
+that changes to a database are ACID – Atomic, Consistent, Isolated, and Durable.
+The Atomic and Durable parts are what take the time. </p> <p>In order to be
+Atomic, the database engine has to go through an elaborate protocol with the
+underlying file system, which ultimately means that every modified page of
+the database file must be written twice. </p> <p>In order to be durable, the
+COMMIT operation must not return until all content has been safely written
+to nonvolatile media. At least two consecutive non-concurrent writes to flash
+memory must occur in order to COMMIT. </p> <p>An atomic and durable COMMIT
+is a very powerful feature that can help you to build a system that is resilient,
+even in the face of unplanned system crashes or power failures. But the price
+of this resilience is that COMMIT is a relatively slow operation. Hence if
+performance is a priority you should strive to minimize the number of COMMITs. </p> <p>If
+you need to do more than one INSERT or UPDATE or DELETE operation, you are
+advised to put them all inside a single explicit transaction by running the
+BEGIN statement prior to the first changes and executing COMMIT once all changes
+have finished. In this way, all your changes occur within a single transaction
+and only a single time-consuming COMMIT operation must occur. </p> <p>If you
+omit the explicit BEGIN...COMMIT, then SQLite automatically inserts an implicit
+BEGIN...COMMIT around each of your INSERT, UPDATE, and DELETE statements,
+which means you end of doing many COMMITs which will always be much slower
+than doing just one. </p> </section>
+<section id="GUID-0E57010E-E3AB-573E-B550-72DE36725D1B"><title>Batch INSERT,
+UPDATE, and DELETE Operations Using TEMP Tables</title> <p>As described above,
+when you have many changes to make to a database, you are advised to make
+all those changes within a single explicit transaction by preceding the first
+change with a BEGIN statement and concluding the changes with a COMMIT statement. </p> <p>The
+problem with BEGIN...COMMIT is that BEGIN acquires an exclusive lock on the
+database file which is not released until the COMMIT completes. That means
+that only a single connection to the database can be in the middle of a BEGIN...COMMIT
+at one time. If another thread or process tries to start a BEGIN...COMMIT
+while the first is busy, the second has to wait. To avoid holding up other
+threads and processes, therefore, every BEGIN should be followed by a COMMIT
+as quickly as possible. </p> <p>But sometimes you run into a situation where
+you have to make periodic INSERTs or UPDATEs to a database based on timed
+or external events. For example, you may want to do an INSERT into an event
+log table once every 250 milliseconds or so. You could do a separate INSERT
+for each event, but that would mean doing a separate COMMIT four times per
+second, which is perhaps more overhead than you desire. On the other hand,
+if you did a BEGIN and accumulated several seconds worth of INSERTs you could
+avoid doing a COMMIT except for every 10<sup>th</sup> second or so. The trouble
+there is that other threads and processes are unable to write to the database
+while the event log is holding its transaction open. </p> <p>The usual method
+for avoiding this dilemma is to store all of the INSERTs in a separate TEMP
+table, then periodically flush the content of the TEMP table into the main
+database with a single operation. </p> <p>A TEMP table works just like a regular
+database table except that a TEMP table is only visible to the database connection
+that creates it, and the TEMP table is automatically dropped when the database
+connection is closed. You create a TEMP table by inserting the “TEMP” or “TEMPORARY”
+keyword in between “CREATE” and “TABLE”, like this: </p> <codeblock id="GUID-90068863-645B-551F-8A85-A12ED647F1AA" xml:space="preserve">
+CREATE TEMP TABLE event_accumulator(
+        eventId INTEGER,
+        eventArg TEXT
+    );
+</codeblock> <p>Because TEMP tables are ephemeral (meaning that they do not
+persist after the database connection closes) SQLite does not need to worry
+about making writes to a TEMP table atomic or durable. Hence a COMMIT to a
+TEMP table is very quick. </p> <p>A process can do multiple INSERTs into a
+TEMP table without having to enclose those INSERTs within an explicit BEGIN...COMMIT
+for efficiency. Writes to a TEMP table are always efficient regardless of
+whether or not they are enclosed in an explicit transaction. </p> <p>So as
+events arrive, they can be written into the TEMP table using isolated INSERT
+statements. But because the TEMP table is ephemeral, one must take care to
+periodically flush the contents of the TEMP table into the main database where
+they will persist. So every 10 seconds or so (depending on the application
+requirements) you can run code like this: </p> <codeblock id="GUID-F51D81FE-5F85-5C22-96AC-4CC69AF00BC4" xml:space="preserve">
+BEGIN;
+INSERT INTO event_log SELECT * FROM event_accumulator;
+DELETE FROM event_accumulator;
+COMMIT;
+</codeblock> <p>These statements transfer the content of the ephemeral event_accumulator
+table over to the persistent event_log table as a single atomic operation.
+Since this transfer occurs relatively infrequently, minimal database overhead
+is incurred. </p> </section>
+<section id="GUID-83C876AB-7C3F-5BFC-8F02-6503A0B3D8D6"><title>Use Bound Parameters</title> <p>Suppose
+you have a descriptor, nameDes, and you want to insert that value into the
+namelist table of a database. One way to proceed is to construct an appropriate
+INSERT statement that contains the desired string value as a SQL string literal,
+then run that INSERT statement. Pseudo-code for this approach follows: </p> <codeblock id="GUID-B96D98B4-ED28-566D-A0AA-073E4BEC6954" xml:space="preserve">
+_LIT(KSql, “INSERT INTO namelist VALUES('%S')”);
+sqlBuf.Format(KSql, nameDes);
+sqlDatabase.Execute(sql);
+</codeblock> <p>The INSERT statement is constructed by the call to <codeph>Format()</codeph> on
+the second line of the example above. The first argument is a template for
+the SQL statement. The value of the nameDes descriptor is inserted where the
+%S occurs in the template. Notice that the %S is surrounded by single quotes
+so that the string will be properly contained in SQL standard quotes. </p> <p>This
+approach works as long as the value in nameDes does not contain any single-quote
+characters. If nameDes does contain one or more single-quotes, then the string
+literal in the INSERT statement will not be well-formed and a syntax error
+might occur. Or worse, if a hostile user is able to control the content of
+nameDes, they might be able to put text in nameDes that looked something like
+this: </p> <codeblock id="GUID-12627B2F-D98A-5170-9E6F-6C3EFA33EA87" xml:space="preserve">
+hi'); DELETE FROM critical_table; SELECT 'hi
+</codeblock> <p>This would result in the sqlBuf variable holding </p> <codeblock id="GUID-FD6A2B42-8260-535B-80D3-B99C726B02FC" xml:space="preserve">
+INSERT INTO namelist VALUES('hi'); DELETE FROM critical_table; SELECT 'hi'
+</codeblock> <p>Your adversary has managed to convert your single INSERT statement
+into three separate SQL statements, one of which does things that you probably
+do not want to happen. This is called an “SQL Injection Attack”. You want
+to be very, very careful to avoid SQL injection attacks as they can seriously
+compromise the security of your application. </p> <p>SQLite allows you to
+specify parameters in SQL statements and then substitute values for those
+parameters prior to running the SQL. Parameters can take several forms, including: </p> <codeblock id="GUID-8255AFE5-CFA2-5DAD-A168-2A7294021AC6" xml:space="preserve">
+?
+?NNN
+:AAA
+@AAA
+$AAA
+</codeblock> <p>In the above, NNN means any sequence of digits and AAA means
+any sequence of alphanumeric characters and underscores. In this example we
+will stick with the first and simplest form – the question mark. The operation
+above would be rewritten as shown below. (Error checking is omitted from this
+example for brevity.) </p> <codeblock id="GUID-292857E5-9219-5AD5-B05A-9C9A47055741" xml:space="preserve">
+_LIT(KSql, “INSERT INTO namelist VALUES(?)”);
+RSqlStatement s;
+s.PrepareL(db, KSql);
+s.BindText(1, nameDes);
+s.Exec();
+s.Close();
+</codeblock> <p> <codeph> PrepareL()</codeph> compiles the SQL statement held
+in the literal KSql. This statement contains a single parameter. The value
+for this parameter is initially NULL. </p> <p>The <codeph>BindText()</codeph> sets
+the value of this parameter to the content of the <codeph>nameDes</codeph> descriptor
+and then <codeph>Exec()</codeph> executes the SQL statement with the bound
+parameter value. </p> <p>There are variations of <codeph>BindXxx()</codeph> to
+bind other kinds of values such as integers, floating point numbers, and binary
+large objects (BLOBs). The key point to observe is that none of these values
+need to be quoted or escaped in any way. And there is no possibility of being
+vulnerable to an SQL injection attack. </p> <p>Besides reducing your vulnerability
+to SQL injection attacks, the use of bound parameters also happens to be more
+efficient that constructing SQL statements from scratch, especially when inserting
+large strings or BLOBs. </p> </section>
+<section id="GUID-3EC5A35D-67E0-5C85-9DA3-CD6AA40BB6A7"><title>Cache and Reuse
+Prepared Statements</title> <p>Using <xref href="GUID-0176BF07-DF94-3259-8F90-DE030E35CE9A.dita"><apiname>RSqlStatement</apiname></xref> is a
+two-step process. Firstly the statement must be compiled using <codeph>Prepare()</codeph>.
+Then the resulting prepared statement is run using either <codeph>Exec()</codeph> or <codeph>Next()</codeph>. </p> <p>The
+relative amount of time spent doing each of these steps depends on the nature
+of the SQL statement. SELECT statements that return a large result set or
+UPDATE or DELETE statements that touch many rows of a table will normally
+spend most of their time in the Virtual Machine module and relatively little
+time being compiled. But simple INSERT statements on the other hand, can take
+twice as long to compile as they take to run in the virtual machine. </p> <p>A
+simple way to reduce the CPU load of an application that uses SQLite is to
+cache the prepared statements and reuse them. Of course, one rarely needs
+to run the exact same SQL statement more than once. But if a statement contains
+one or more bound parameters, you can bind new values to the parameters prior
+to each run and thus accomplish something different with each invocation. </p> <p>This
+technique is especially effective when doing multiple INSERTs into the same
+table. Instead of preparing a separate insert for each row, create a single
+generic insert statement like this: </p> <codeblock id="GUID-C66EEE5F-27C3-59A8-9DD4-2E32A49367AF" xml:space="preserve">
+INSERT INTO important_table VALUES(?,?,?,?,?)
+</codeblock> <p>Then for each row to be inserted, use one or more of the <codeph>BindXxx()</codeph> interfaces
+to bind values to the parameters in the insert statement, and call <codeph>Exec()</codeph> to
+do the insert, then call <codeph>Reset()</codeph> to rewind the program counter
+of the internal bytecode in preparation for the next run. </p> <p>For INSERT
+statements, reusing a single prepared statement in this way will typically
+make your code run two or three times faster. </p> <p>You can manually manage
+a cache of prepared statements, keeping around only those prepared statements
+that you know will be needed again and closing prepared statements using <codeph>Close()</codeph> when
+you are done with them or when they are about to fall out of scope. But depending
+on the application, it can be more convenient to create a wrapper class around
+the SQL interface that manages the cache automatically. </p> <p>A wrapper
+class can keep around the 5 or 10 most recently used prepared statements and
+reuse those statements if the same SQL is requested. Handling the prepared
+statement cache automatically in a wrapper has the advantage that it frees
+you to focus more mental energy on writing a great application and less effort
+on operating the database interface. It also makes the programming task less
+error prone since with an automatic class, there is no chance of accidentally
+omitting a call to <codeph>Close()</codeph> and leaking prepared statements. </p> <p>The
+downside is that a cache wrapper will not have the foresight of a human programmer
+and will often cache prepared statements that are no longer needed, thus using
+excess memory, or sometimes discard prepared statements just before they are
+needed again. </p> <p>This is a classic ease-of-programming versus performance
+trade-off. For applications that are intended for a high-power workstation,
+it can be best to go with a wrapper class that handles the cache automatically.
+But when designing an application for a resource constrained devices where
+performance is critical and engineering design talent is plentiful, it may
+be better to manage the cache manually. </p> <p>Regardless of whether or not
+the prepared statement cache is managed manually or automatically using a
+wrapper class, reusing prepared statements is always a good thing, and can
+in some cases double or triple the performance of the application. </p> </section>
+</conbody><related-links>
+<link href="GUID-22844C28-AB5B-5A6F-8863-7269464684B4.dita"><linktext>SQL Overview</linktext>
+</link>
+<link href="GUID-78773BCA-ADF6-53E6-AC80-5CB2AE1F8BCC.dita"><linktext>SQL Server
+Guide</linktext></link>
+<link href="GUID-E51836E1-D33E-506C-B75B-19B8E3CC313A.dita"><linktext>SQLite</linktext>
+</link>
+<link href="GUID-1F12E3F5-45B2-55EC-B021-00338277C608.dita"><linktext>SQL DB Overview</linktext>
+</link>
+<link href="GUID-43CA02E7-0101-5824-B91B-E15EE20C829A.dita"><linktext>Avoid Transient
+Tables</linktext></link>
+<link href="GUID-49A3419F-D20A-5C5D-B2FF-51724EF37704.dita"><linktext>Prevent Datafile
+Corruption</linktext></link>
+<link href="GUID-C2FAEBB2-4A1A-5BB0-9670-4801525CBC6A.dita"><linktext>SQL Index
+Tips</linktext></link>
+<link href="GUID-4FC23DB7-4758-5DA4-81FF-0DAB169E2757.dita"><linktext>SQL Schema
+Tips</linktext></link>
+<link href="GUID-2A2920E0-5D40-5358-BC0C-8572CEFE078C.dita"><linktext>SQL Expressions</linktext>
+</link>
+<link href="GUID-126FCCCC-0E7D-59AE-959A-2F94A7319C4B.dita"><linktext>SQL Statement
+Tips</linktext></link>
+<link href="GUID-ACCCB148-DAF9-59EC-B585-8EF632B9BF04.dita"><linktext>SQL Joins</linktext>
+</link>
+<link href="GUID-B7E978C1-45CA-554C-8028-D901B97BA2E0.dita"><linktext> ANALYZE
+Command</linktext></link>
+<link href="GUID-AF5A75D7-0687-546C-87B2-0B7DF7D33217.dita"><linktext> SQL WHERE
+CLause Tips</linktext></link>
+</related-links></concept>
\ No newline at end of file