<?xml version="1.0" encoding="utf-8"?>
<!-- Copyright (c) 2007-2010 Nokia Corporation and/or its subsidiary(-ies) All rights reserved. -->
<!-- This component and the accompanying materials are made available under the terms of the License
"Eclipse Public License v1.0" which accompanies this distribution,
and is available at the URL "http://www.eclipse.org/legal/epl-v10.html". -->
<!-- Initial Contributors:
Nokia Corporation - initial contribution.
Contributors:
-->
<!DOCTYPE concept
PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept id="GUID-785B2F0B-E7E6-5DAE-98F1-6C32BED25964" xml:lang="en"><title>Database
storage overhead</title><shortdesc>Database storage adds little overhead to the data it holds.
This topic gives details of the storage overhead for the file store, rows, and indexes.</shortdesc><prolog><metadata><keywords/></metadata></prolog><conbody>
<p>Space utilization is efficient for the following reasons:</p>
<ul>
<li id="GUID-8AE05C00-0C7E-5D9C-AC2B-DC1D5432DF0D"><p>The underlying permanent
file store adds minimal overhead to maintain its structure: the store imposes
a fixed 46 bytes plus two bytes per additional 16K, and each stream requires
seven bytes.</p> </li>
<li id="GUID-5BDB30B0-9140-55AA-A7BE-4C4B30355ADE"><p>The database has minimal
space requirements to store its schema and data structure.</p> </li>
<li id="GUID-3FCF5F99-767F-5526-B0AF-40F7F248CF26"><p>Including the stream
overhead, row storage overhead can be less than two bytes per row in ideal
clustering conditions. Even with lower clustering, the overhead is usually
below 1% of the data volume.</p> </li>
</ul>
<p>The storage required for a row can be determined as follows:</p>
<ul>
<li id="GUID-41E7E40C-6E34-5C7A-BB82-A0AF92B5D455"><p>Each non-null value
in a fixed-width column requires the storage for that column type; each non-null
value in a variable-width column requires its raw storage plus one byte.</p> </li>
<li id="GUID-845C1A45-8C1F-5EA0-A3A6-787359332505"><p>Long columns are stored
embedded in the row data when they are small enough; otherwise they are
stored in a separate stream. This benefits both space and speed. When embedded,
they require just one bit more than the short columns; when separated, they
require eight bytes in the row data plus any stream overhead.</p> </li>
<li id="GUID-CD64205E-894B-50D3-9AAA-ACA375D820EA"><p>Each nullable column
requires one extra bit.</p> </li>
<li id="GUID-4FF233BB-3863-5869-9E6E-E56AC3B29924"><p>Bits are packed into
bytes in the row storage.</p> </li>
</ul>
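<p>As an illustration only, consider a hypothetical row with one non-nullable
four-byte integer column and one nullable variable-width text column whose value
occupies ten bytes of raw storage. The figures below apply the rules above and
assume that the bits for a row are rounded up to a whole number of bytes, which
this topic does not state explicitly.</p>
<p><i>4 + (10 + 1) = 15</i> bytes for the two values, plus one bit for the nullable
column, which packs into one further byte under that assumption. This gives
approximately <i>16</i> bytes for the row, before any stream overhead.</p>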
<section id="GUID-14305B1D-41AF-452E-9E44-131569F00FBC"><title>Indexes</title>
<p>Indexes are implemented using STORE B+trees. Note in particular that indexes
use fixed-length keys, so indexes on longer text fields can consume significant
space.</p>
<p>If the key for the index is <i>k</i> bytes, the number of rows to index
is <i>n</i>, the index page size is <i>P</i>, and the B-tree packing density
is <i>r</i>, then the effective number of keys per page, <i>a</i>, is</p>
<p><i>a = [(P-8)/(k+4)] * r</i> </p>
<p>where <i>[x]</i> is the largest integer less than or equal to <i>x</i>.
The number of pages required, <i>N</i>, is</p>
<p><i>N = n * (1/a + 1/(a*a))</i> </p>
<p>Each page requires <i>P+7</i> bytes in the store, so the total indexing
overhead, <i>S</i>, is</p>
<p><i>S = N * (P+7)</i> </p>
<p>For DBMS, <i>P=512</i> and <i>r=0.86</i>.</p> </section>
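<example><title>Estimating index overhead</title>
<p>The index formulas can be sketched as a short calculation. The following
Python function is an illustrative sketch and not part of DBMS: the function
and parameter names are hypothetical, the defaults are the DBMS values of <i>P</i> and <i>r</i> given
in the Indexes section, and the page count is left fractional because this
topic does not say whether <i>N</i> is rounded up to whole pages.</p>
<codeblock>import math


def index_overhead_bytes(key_bytes, rows, page_size=512, packing=0.86):
    """Estimate the store space, in bytes, used by one index."""
    # a = [(P-8)/(k+4)] * r, where [x] is the largest integer not above x
    keys_per_page = math.floor((page_size - 8) / (key_bytes + 4)) * packing
    # N = n * (1/a + 1/(a*a))
    pages = rows * (1 / keys_per_page + 1 / (keys_per_page * keys_per_page))
    # S = N * (P+7)
    return pages * (page_size + 7)


# Hypothetical case: 10,000 rows indexed on a 16-byte key
estimate = index_overhead_bytes(key_bytes=16, rows=10000)
</codeblock>
<p>For the hypothetical case shown, the estimate is roughly 253,000 bytes,
or about 25 bytes per row for a 16-byte key.</p> </example>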
</conbody></concept>