How to use Lexical Analysis

Explains how to create and use TLex objects with code fragments.

Introduction

A TLex object may be constructed as an empty object, from another TLex, or from an existing string as below. The code fragments are taken from an example that implements a simple Reverse Polish Notation calculation engine.

TInt RPNCalc(const TDesC& aCommand, TReal& aReturnValue)
    {
    ...
    TLex input (aCommand) ;
    ...

Code can then proceed to move through the TLex data to:

  • mark positions for rolling back to later

  • delimit the start of lexical tokens

  • delete parts of the string held by the TLex object.

...
input.Mark() ;                     // Remember where we are.
input.SkipCharacters() ;           // Move to end of character token.
...
_LIT(KTextMemset,"MEMSET");
if ( input.TokenLength() != 0 )    // if valid potential token
    {
    TPtrC token = input.MarkedToken() ;    // then extract token
    if ( token.CompareF(KTextMemset) == 0 )  // and test.
        {
        ...
        }
    ...
    }
 ...

Analysis can also be done character by character, using functions that peek at, return, or skip over a specified number of characters in the TLex data. For example:

...
// ensure we are looking at a digit or decimal point
if (!(input.Peek().IsDigit() || (input.Peek() == '.') ) )
    {
    return KErrNotFound ;
    }
...    
// deal with sign
if (input.Peek() == '+')
    {
    input.Inc();
    }

Additionally, numeric conversion functions permit a variety of numeric formats to be extracted from the TLex data, with provision for conversion using the most common number systems (radixes).

...
if (input.Val(extractUint) == KErrNone)
    {
    stack.Push(TReal(extractUint));
    }
else if (input.Val(extractReal) == KErrNone)
    {
    stack.Push(extractReal);
    }
...

where stack is an instance of a class implementing a stack.

Constructing TLex objects

This converts a real number into a string, which is then used to construct a TLex.

TBuf<0x100> convertRealToString;
// want a TLex from a value
if (convertRealToString.Num(value,format) < KErrNone )
    {
    ...
    }
else
    {
    convertRealToString.ZeroTerminate();
    TLex string(convertRealToString) ;
    }

This takes a descriptor as a function parameter and constructs a TLex from it.

TInt RPNCalc(const TDesC& aCommand, TReal& aReturnValue)
    {
    TLex input (aCommand) ;
    ...
    }

Peeking the next character

This shows a code flow decision made according to the next character to be read from the TLex:

if (!(input.Peek()).IsDigit()) // found non-digit after decimal point

Moving past a character that has been peeked

This shows the use of the Inc() function to move past a character that has been peeked:

if (input.Peek() == '-') 
    {
    input.Inc() ;        // move past minus sign & flag
    negative = ETrue ;
    }

Restoring a previously “got” character

This shows the use of UnGet() to restore the previously "got" character.

...
if (input.Offset() > 0)    // if not at start of line
    {
    input.UnGet() ;        // restore 'got' character
    }

If there is no previous character, that is, the next character position is already at the start of the string, the function raises a USER 59 panic for the TLex8 variant and a USER 64 panic for the TLex16 variant.
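
The guard shown above can be tried out off-device. The following is a minimal sketch in standard C++, not Symbian code: Cursor and its members are hypothetical names that imitate only the get/unget behaviour described here, and UnGet() reports failure through its return value where the real class would panic.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in that models get/unget on a string.
// It is NOT the Symbian TLex class.
class Cursor {
public:
    explicit Cursor(const std::string& text) : text_(text), next_(0) {}

    // Offset of the next character position from the start of the string.
    std::size_t Offset() const { return next_; }

    // Returns the next character and advances, or '\0' at end of string.
    char Get() { return next_ < text_.size() ? text_[next_++] : '\0'; }

    // Steps back one character. Returns false (and does nothing) when
    // already at the start, which is the case the Offset() guard avoids.
    bool UnGet() {
        if (next_ == 0) return false;
        --next_;
        return true;
    }

private:
    std::string text_;
    std::size_t next_;
};
```

In this sketch, calling UnGet() on a freshly constructed Cursor returns false and leaves Offset() at 0; after one Get(), UnGet() succeeds and Offset() returns to 0.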

Resetting the next character position to the supplied mark

This shows how part of a TLex can be parsed again by rewinding to a mark:

if (!(input.Peek()).IsDigit())
    {
    // found non-digit after decimal point. Error, so rewind 
    input.UnGetToMark(startMark);
    }

Skipping any non-white space to get the next token

This parses a TLex for the next token:

input.Mark() ;                      // remember where we are
input.SkipCharacters() ;            // move to end of character token
if ( input.TokenLength() != 0 )     // if valid potential token
...

Getting length of a token

TokenLength() returns the difference between the position of the next character and the extraction mark. This can be used to check whether the token length is valid: a length of zero means that no characters were found, which implies an invalid token.

if ( input.TokenLength() != 0 )  // if valid token length
...

Extracting a token

This extracts a marked token.

TPtrC token = input.MarkedToken() ;  // extract token 
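
Taken together, the mark, skip, length-check, and extract steps form a simple tokenizer loop. The following standard C++ sketch models that sequence so it can be experimented with off-device. MiniLex and Tokenize are hypothetical names; MiniLex borrows the member names used in this document but is not the Symbian class and models only the behaviour described here.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in modelling mark / skip / extract on a string.
// It is NOT the Symbian TLex class.
class MiniLex {
public:
    explicit MiniLex(const std::string& text) : text_(text), next_(0), mark_(0) {}

    void Mark() { mark_ = next_; }                 // remember where we are
    void SkipSpace() {                             // move past white space
        while (next_ < text_.size() && IsSpace(text_[next_])) ++next_;
    }
    void SkipCharacters() {                        // move past non-white space
        while (next_ < text_.size() && !IsSpace(text_[next_])) ++next_;
    }
    std::size_t TokenLength() const { return next_ - mark_; }
    std::string MarkedToken() const { return text_.substr(mark_, next_ - mark_); }
    bool Eos() const { return next_ == text_.size(); }

private:
    static bool IsSpace(char c) {
        return std::isspace(static_cast<unsigned char>(c)) != 0;
    }
    std::string text_;
    std::size_t next_;   // next character position
    std::size_t mark_;   // extraction mark
};

// Splits a command line into white-space separated tokens.
std::vector<std::string> Tokenize(const std::string& command) {
    std::vector<std::string> tokens;
    MiniLex input(command);
    for (;;) {
        input.SkipSpace();
        if (input.Eos()) break;
        input.Mark();                              // start of token
        input.SkipCharacters();                    // end of token
        if (input.TokenLength() != 0)              // valid potential token
            tokens.push_back(input.MarkedToken());
    }
    return tokens;
}
```

In this sketch, Tokenize("3 4 +") yields the three tokens "3", "4" and "+", and an empty command yields no tokens.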

Getting the offset of next character position

This shows how to return the offset of the next character position from the start of the string.

if (input.Offset() > 0)    // if not at start of line
    {
    input.UnGet() ;        // restore 'got' character
    ...
    }

Extracting an unknown number type

This example extracts a number whose type is not known in advance. It tries an unsigned integer first and, if that fails, a real:

if (input.Val(extractUint) == KErrNone)
    {
    stack.Push(TReal(extractUint)) ;
    }
else 
    {
    if (input.Val(extractReal) == KErrNone)
        {
        stack.Push(extractReal) ;
        }
    }

The same extraction, trying an integer first and then, if this fails, a real, can be written more compactly without braces:

if (input.Val(extractUint) == KErrNone)
    stack.Push(TReal(extractUint)) ;
else if (input.Val(extractReal) == KErrNone)
    stack.Push(extractReal) ;
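
The integer-first, real-second strategy can be sketched in standard C++ as below. This is an illustration only: Extracted and ExtractNumber are hypothetical names, and the conversions use std::strtoul and std::strtod rather than any Symbian function. One design choice is specific to this sketch: std::strtoul would happily accept the leading 3 of 3.14, so the integer attempt is rejected when the digits run straight into a decimal point or exponent.

```cpp
#include <cctype>
#include <cerrno>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical result type for this sketch; not part of any Symbian API.
struct Extracted {
    bool ok;        // false if neither conversion succeeded
    bool isReal;    // true if the value was parsed as a real
    double value;   // the parsed value, widened to double
};

// Tries an unsigned integer first and, if that fails, a real.
// pos models the next character position and only moves on success.
Extracted ExtractNumber(const std::string& text, std::size_t& pos) {
    const Extracted failed = {false, false, 0.0};
    if (pos > text.size()) return failed;

    const char* begin = text.c_str() + pos;
    char* end = 0;

    // Integer attempt: must start with a digit and must not run into
    // '.', 'e' or 'E' (assumption of this sketch, see the lead-in).
    if (std::isdigit(static_cast<unsigned char>(*begin)) != 0) {
        errno = 0;
        unsigned long u = std::strtoul(begin, &end, 10);
        if (errno == 0 && end != begin &&
            *end != '.' && *end != 'e' && *end != 'E') {
            pos += static_cast<std::size_t>(end - begin);
            const Extracted r = {true, false, static_cast<double>(u)};
            return r;
        }
    }

    // Real attempt.
    errno = 0;
    double d = std::strtod(begin, &end);
    if (errno == 0 && end != begin) {
        pos += static_cast<std::size_t>(end - begin);
        const Extracted r = {true, true, d};
        return r;
    }
    return failed;   // pos is left unchanged
}
```

With this sketch, "42 +" is extracted as an integer and leaves pos at 2, "3.5" is extracted as a real, and "abc" fails with pos unchanged, so the caller can go on to treat the text as an operator or command name.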