Address String Tokenizer Overview


The Address String Tokenizer provides methods for finding phone numbers, email addresses, URLs, and URIs in a given text. It gives applications an interface to the found items, so that an application can, for example, display them in its own GUI.

Architectural Relationships

All functionality is implemented in the CTulAddressStringTokenizer class, which is declared in the tuladdressstringtokenizer.h file. Binaries that use it link against the etul.lib library (Text Utilities, part of the Egul component).

Figure 1. Subsystem dependencies



To use the Address String Tokenizer, create an instance of CTulAddressStringTokenizer with the factory method CTulAddressStringTokenizer::NewL().

For example:

CTulAddressStringTokenizer* addressStringTokenizer = CTulAddressStringTokenizer::NewL(text, searchCase);

The method takes two parameters: one of type TDesC& and one of type TTokenizerSearchCase, which is defined in tuladdressstringtokenizer.h.

The first parameter is the text to search.

The second parameter specifies what to look for. It is an enum value that names the type of item to find: phone number, email address, fixed-start URL, or generic URI.
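
The search cases can be combined as a bit mask, which works because each case occupies its own bit. The following is a minimal sketch of that idea in standard C++ rather than Symbian C++. The enum SearchCase, its numeric values, and the helpers Combine and Includes are hypothetical stand-ins chosen for illustration; they are not the declarations in tuladdressstringtokenizer.h.

```cpp
#include <cassert>

// Hypothetical stand-in for TTokenizerSearchCase. The names and the numeric
// values are illustrative assumptions, not the values in the real header.
enum SearchCase : unsigned {
    kPhoneNumber = 1u << 0,
    kMailAddress = 1u << 1,
    kUrl         = 1u << 2,
    kScheme      = 1u << 3
};

// Combine two cases into one mask. OR-ing two enumerators yields an integer,
// so the result has to be cast back to the enum type.
inline SearchCase Combine(SearchCase a, SearchCase b) {
    return static_cast<SearchCase>(static_cast<unsigned>(a) |
                                   static_cast<unsigned>(b));
}

// True when the mask requests the single case `wanted`.
inline bool Includes(SearchCase mask, SearchCase wanted) {
    assert(static_cast<unsigned>(wanted) != 0u);
    return (static_cast<unsigned>(mask) & static_cast<unsigned>(wanted)) != 0u;
}
```

With these stand-ins, Combine(kPhoneNumber, kUrl) produces a mask for which Includes reports the phone number and URL cases but not the mail address case. The explicit cast back to the enum type is the same reason the sample code further down casts its OR-ed value to TTokenizerSearchCase.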

The text is parsed during construction, and the found items can then be fetched with the ItemArray() method, which returns a constant array containing all the found items.
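
Each found item describes a range in the searched text rather than holding a copy of it. The following standard C++ sketch models how such ranges map back to substrings. The struct FoundItem and the function ExtractAll are hypothetical; FoundItem only mirrors the iStartPos, iLength, and iItemType fields that the sample code reads from SFoundItem.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical mirror of SFoundItem: a range in the searched text plus a type tag.
struct FoundItem {
    int startPos;
    int length;
    int itemType;
};

// Return the text that each item refers to, in array order.
std::vector<std::string> ExtractAll(const std::string& text,
                                    const std::vector<FoundItem>& items) {
    std::vector<std::string> out;
    for (const FoundItem& item : items) {
        // An item must lie entirely inside the searched text.
        assert(item.startPos >= 0 && item.length >= 0);
        assert(static_cast<std::size_t>(item.startPos) +
               static_cast<std::size_t>(item.length) <= text.size());
        out.push_back(text.substr(static_cast<std::size_t>(item.startPos),
                                  static_cast<std::size_t>(item.length)));
    }
    return out;
}
```

Because the items are only offsets, they stay meaningful only while the original text is unchanged, which is why the sample code below calls Mid() on the same descriptor that was passed to NewL().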

The interface also offers helper functions that work on the item array, so the application does not need to handle the array directly.
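
One way to picture such helpers is as a cursor over the item array: a current position that can be read, advanced, and reset. The following standard C++ sketch is a simplified model under that assumption. The class ItemCursor, the struct FoundRange, and all member names are hypothetical, and the behavior at the end of the array is an assumption rather than documented behavior of CTulAddressStringTokenizer.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical range type, standing in for a found item.
struct FoundRange {
    int startPos;
    int length;
};

// Hypothetical cursor over a fixed array of found items.
class ItemCursor {
public:
    explicit ItemCursor(std::vector<FoundRange> items)
        : items_(std::move(items)), pos_(0) {}

    std::size_t Count() const { return items_.size(); }
    std::size_t Position() const { return pos_; }
    void Reset() { pos_ = 0; }

    // Copy the current item into `out`; returns false when there are no items.
    bool Current(FoundRange& out) const {
        if (items_.empty()) return false;
        out = items_[pos_];
        return true;
    }

    // Move to the next item and copy it into `out`.
    // Returns false and stays in place when already at the last item.
    bool Next(FoundRange& out) {
        if (pos_ + 1 >= items_.size()) return false;
        ++pos_;
        out = items_[pos_];
        return true;
    }

private:
    std::vector<FoundRange> items_;
    std::size_t pos_;
};
```

A caller would read the first item with Current() and then call Next() until it returns false, which visits every item once without touching the underlying array.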

For more information on individual methods, see the API reference for CTulAddressStringTokenizer.


The following sample code shows a few simple use cases that search a text string for items:

// Some text
TBufC<256> strSomeText(_L("Mail to or call 040 1234567. "
                          "You can also tune in to audio feed at rtsp://"));

// SFoundItem instance
CTulAddressStringTokenizer::SFoundItem item;

// Create an instance of CTulAddressStringTokenizer and search for URLs.
CTulAddressStringTokenizer* singleSearch = CTulAddressStringTokenizer::NewL
                                           (strSomeText, CTulAddressStringTokenizer::EFindItemSearchScheme);

// Get count of found items
TInt count(singleSearch->ItemCount());

// Copy the currently selected item (the rtsp:// address) into item,
// then point the result variable at the matching part of the text
singleSearch->Item(item);
TPtrC16 result(strSomeText.Mid(item.iStartPos, item.iLength));

// Deallocate memory
delete singleSearch;

// Look for every item type at once (the search cases combine as a bit mask)
CTulAddressStringTokenizer* multiSearch = CTulAddressStringTokenizer::NewL
                                          (strSomeText, (CTulAddressStringTokenizer::TTokenizerSearchCase)
                                          (CTulAddressStringTokenizer::EFindItemSearchPhoneNumberBin |
                                          CTulAddressStringTokenizer::EFindItemSearchURLBin |
                                          CTulAddressStringTokenizer::EFindItemSearchMailAddressBin |
                                          CTulAddressStringTokenizer::EFindItemSearchScheme));

// Debug print all items and their type
count = multiSearch->ItemCount();

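for(TInt i=0; i<count; i++)
    {
    // Read the current item, print it, then advance to the next one
    multiSearch->Item(item);
    result.Set(strSomeText.Mid(item.iStartPos, item.iLength));
    RDebug::Print(_L("Found type %d item:"), item.iItemType);
    RDebug::Print(_L("%S"), &result);
    multiSearch->NextItem(item);
    }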

// Deallocate memory
delete multiSearch;

Sequence Diagram

Figure 2. Sequence of events for CTulAddressStringTokenizer