<?xml version="1.0" encoding="utf-8"?>
<!-- Copyright (c) 2007-2010 Nokia Corporation and/or its subsidiary(-ies) All rights reserved. -->
<!-- This component and the accompanying materials are made available under the terms of the License
"Eclipse Public License v1.0" which accompanies this distribution,
and is available at the URL "http://www.eclipse.org/legal/epl-v10.html". -->
<!-- Initial Contributors:
    Nokia Corporation - initial contribution.
Contributors:
-->
<!DOCTYPE concept
  PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept id="GUID-785B2F0B-E7E6-5DAE-98F1-6C32BED25964" xml:lang="en"><title>Database
storage overhead</title><shortdesc>Database storage is space-efficient. This document gives details
of the storage overhead for the file store, rows, and indexes.</shortdesc><prolog><metadata><keywords/></metadata></prolog><conbody>
<p>Space utilization is efficient:</p>
<ul>
<li id="GUID-8AE05C00-0C7E-5D9C-AC2B-DC1D5432DF0D"><p>The underlying permanent
file store adds minimal overhead to maintain its structure: the store imposes
a fixed 46 bytes plus two bytes per additional 16K, and each stream requires seven
bytes.</p> </li>
<li id="GUID-5BDB30B0-9140-55AA-A7BE-4C4B30355ADE"><p>The database has minimal
space requirements to store its schema and data structure.</p> </li>
<li id="GUID-3FCF5F99-767F-5526-B0AF-40F7F248CF26"><p>Including the stream
overhead, row storage overhead can be less than two bytes per row in ideal
clustering conditions. Even with lower clustering, the overhead is usually
below 1% of the data volume.</p> </li>
</ul>
<p>The storage required for a row can be determined as follows:</p>
<ul>
<li id="GUID-41E7E40C-6E34-5C7A-BB82-A0AF92B5D455"><p>Each non-null value
requires the storage for the value itself in fixed-width columns, and the raw storage plus one
byte in variable-width columns.</p> </li>
<li id="GUID-845C1A45-8C1F-5EA0-A3A6-787359332505"><p>Long columns are stored
embedded in the row data when they are small enough, and otherwise in a
separate stream: this makes more efficient use of both space
and speed. When embedded, they require just one bit more than the short columns;
when separated, they require eight bytes in the row data plus any stream overhead.</p> </li>
<li id="GUID-CD64205E-894B-50D3-9AAA-ACA375D820EA"><p>Each nullable column
requires one extra bit.</p> </li>
<li id="GUID-4FF233BB-3863-5869-9E6E-E56AC3B29924"><p>Bits are packed into
bytes in the row storage.</p> </li>
</ul>
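<example><title>Worked example: row storage</title>
<p>As a rough illustration of the rules above, consider a hypothetical row holding three
non-null values. The column types and sizes are assumptions chosen for the arithmetic, not
taken from a real schema:</p>
<ul>
<li><p>a non-nullable fixed-width column of 4 bytes: 4 bytes;</p> </li>
<li><p>a nullable fixed-width column of 4 bytes: 4 bytes plus 1 bit;</p> </li>
<li><p>a nullable variable-width column holding 10 bytes of raw data: 10 + 1 = 11 bytes
plus 1 bit.</p> </li>
</ul>
<p>The two nullable bits pack into a single byte, so counting only the per-column costs
listed above, the row needs about 4 + 4 + 11 + 1 = 20 bytes.</p>
</example>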
<section id="GUID-14305B1D-41AF-452E-9E44-131569F00FBC"><title>Indexes</title>
<p>Indexes are implemented using STORE B+trees. Note in particular that indexes use
fixed-length keys, so indexes on longer text fields can consume significant space.</p>
<p>If the key for the index is <i>k</i> bytes, the number of rows to index is <i>n</i>,
the index page size is <i>P</i>, and the B-tree packing density is <i>r</i>, define <i>a</i> as:</p>
<p><i>a = [(P-8)/(k+4)] * r</i> </p>
<p>where <i>[x]</i> is the largest integer &lt;= <i>x</i>. The number of pages required, <i>N</i>,
is then:</p>
<p><i>N = n * (1/a + 1/(a*a))</i> </p>
<p>Each page requires <i>P+7</i> bytes in the store, so the total indexing overhead, <i>S</i>, is:</p>
<p><i>S = N * (P+7)</i> </p>
<p>For DBMS, <i>P=512</i> and <i>r=0.86</i>.</p> </section>
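<example><title>Worked example: index overhead</title>
<p>To illustrate the formulas above, assume a hypothetical index with a 16-byte key
(<i>k=16</i>) over 10,000 rows (<i>n=10000</i>), using the DBMS values <i>P=512</i> and
<i>r=0.86</i>. The key size and row count are illustrative assumptions only, and rounding
<i>N</i> up to a whole number of pages is also an assumption.</p>
<p><i>a = [(512-8)/(16+4)] * 0.86 = [25.2] * 0.86 = 25 * 0.86 = 21.5</i> </p>
<p><i>N = 10000 * (1/21.5 + 1/(21.5*21.5)) ≈ 10000 * 0.04867 ≈ 487</i> </p>
<p><i>S ≈ 487 * (512+7) = 487 * 519 = 252753</i> </p>
<p>Under these assumptions the index occupies roughly 250 KB, compared with 160,000 bytes
for the raw key data alone.</p>
</example>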
</conbody></concept>