How to use Lexical Analysis

Explains how to create and use TLex objects with code fragments.

Introduction

A TLex object may be constructed as an empty object, from another TLex, or from an existing string as below. The code fragments are taken from an example that implements a simple Reverse Polish Notation calculation engine.

       
        
       
TInt RPNCalc(const TDesC& aCommand, TReal& aReturnValue)
    {
    ...
    TLex input (aCommand) ;
    ...
      

Code can then proceed to move through the TLex data to:

  • mark positions for rolling back to later

  • delimit the start of lexical tokens

  • delete parts of the string held by the TLex object.

       
        
       
...
input.Mark() ;                     // Remember where we are.
input.SkipCharacters() ;           // Move to end of character token.
...
_LIT(KTextMemset,"MEMSET");
if ( input.TokenLength() != 0 )    // if valid potential token
    {
    TPtrC token = input.MarkedToken() ;    // then extract token
    if ( token.CompareF(KTextMemset) == 0 )  // and test.
        {
        ...
        }
    ...
    }
...
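
For readers without a Symbian toolchain, the mark, skip and extract pattern above can be tried out in standard C++. The following is a minimal sketch only: MiniLex and NextToken are hypothetical names, and the class borrows the TLex member names without reproducing the real class or its descriptor types.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for a lexer object. This is NOT the Symbian TLex
// class; only the member names are borrowed from the fragment above.
class MiniLex {
public:
    explicit MiniLex(std::string aText)
        : iText(std::move(aText)), iNext(0), iMark(0) {}

    // Remember the current position as the start of a token.
    void Mark() { iMark = iNext; }
    // Move past any white space.
    void SkipSpace() { while (!AtEnd() && IsSpace()) ++iNext; }
    // Move past any non-white-space characters.
    void SkipCharacters() { while (!AtEnd() && !IsSpace()) ++iNext; }
    // Number of characters between the mark and the next position.
    std::size_t TokenLength() const {
        assert(iMark <= iNext);
        return iNext - iMark;
    }
    // Copy of the characters between the mark and the next position.
    std::string MarkedToken() const { return iText.substr(iMark, TokenLength()); }

private:
    bool AtEnd() const { return iNext >= iText.size(); }
    bool IsSpace() const {
        return std::isspace(static_cast<unsigned char>(iText[iNext])) != 0;
    }

    std::string iText;
    std::size_t iNext;
    std::size_t iMark;
};

// Returns the next white-space-delimited token, or an empty string
// when no characters remain.
std::string NextToken(MiniLex& aLex) {
    aLex.SkipSpace();
    aLex.Mark();
    aLex.SkipCharacters();
    return aLex.TokenLength() != 0 ? aLex.MarkedToken() : std::string();
}
```

Calling NextToken() repeatedly on a MiniLex built from "  MEMSET 42" yields "MEMSET", then "42", then an empty string.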
      

Analysis can also be done character by character, using functions that peek at, extract, or skip over characters in the TLex data. For example:

       
        
       
...
// ensure we are looking at a digit or decimal point
if (!(input.Peek().IsDigit() || (input.Peek() == '.') ) )
    {
    return KErrNotFound ;
    }
...    
// deal with sign
if (input.Peek() == '+')
    {
    input.Inc();
    }
      

Additionally, numeric conversion functions permit a variety of numeric formats to be extracted from the TLex data, with provision for conversion using the most common number systems (radixes).

       
        
       
...
if (input.Val(extractUint) == KErrNone)
    {
    stack.Push(TReal(extractUint));
    }
else if (input.Val(extractReal) == KErrNone)
    {
    stack.Push(extractReal);
    }
...
      

where stack is an instance of a class implementing a stack.

Constructing TLex objects

This converts a real number into a string, which is then used to construct a TLex .

       
        
       
TBuf<0x100> convertRealToString;
// want a TLex from a value
if (convertRealToString.Num(value,format) < KErrNone )
    {
    ...
    }
else
    {
    convertRealToString.ZeroTerminate();
    TLex string(convertRealToString) ;
    }
      

This takes a descriptor as a function parameter and uses it to construct a TLex .

       
        
       
TInt RPNCalc(const TDesC& aCommand, TReal& aReturnValue)
    {
    TLex input (aCommand) ;
    ...
    }
      

Peeking the next character

This shows a code flow decision made according to the next character to be read from the TLex :

       
        
       
if (!(input.Peek()).IsDigit()) // found non-digit after decimal point
      

Moving past a character that has been peeked

This shows the use of the Inc() function to move past a character that has been peeked:

       
        
       
if (input.Peek() == '-')
    {
    input.Inc() ;        // move past minus sign & flag
    negative = ETrue ;
    }
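
The peek-then-advance idiom can be exercised outside Symbian with a small standard C++ model. This is a sketch under stated assumptions: Cursor and ReadSignedInt are hypothetical names, Peek() returning '\0' at the end of the text is a choice made for this sketch, and overflow of the accumulated value is not checked.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical cursor; Peek() and Inc() mirror the TLex member names only.
class Cursor {
public:
    explicit Cursor(std::string aText) : iText(std::move(aText)), iNext(0) {}

    // Returns the next character without consuming it, or '\0' at the end.
    char Peek() const { return iNext < iText.size() ? iText[iNext] : '\0'; }
    // Consumes one character; the caller must not already be at the end.
    void Inc() {
        assert(iNext < iText.size());
        ++iNext;
    }

private:
    std::string iText;
    std::size_t iNext;
};

static bool IsDigit(char aChar) {
    return std::isdigit(static_cast<unsigned char>(aChar)) != 0;
}

// Reads an optionally signed run of decimal digits.
// Returns false if no digit follows the optional sign.
bool ReadSignedInt(Cursor& aInput, long& aValue) {
    bool negative = false;
    if (aInput.Peek() == '-' || aInput.Peek() == '+') {
        negative = (aInput.Peek() == '-');
        aInput.Inc();               // move past the sign that was peeked
    }
    if (!IsDigit(aInput.Peek())) {
        return false;
    }
    long value = 0;
    while (IsDigit(aInput.Peek())) {
        value = value * 10 + (aInput.Peek() - '0');
        aInput.Inc();
    }
    aValue = negative ? -value : value;
    return true;
}
```

Because the sign is only consumed after it has been peeked, any other leading character is left in place for the caller to examine.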
      

Restoring a previously “got” character

This shows the use of UnGet() to restore the previously "got" character.

       
        
       
...
if (input.Offset() > 0)    // if not at start of line
    {
    input.UnGet() ;        // restore 'got' character
    }
      

If there is no previous character, that is, if the next character position is already at the start of the string, UnGet() raises a USER 59 panic for the TLex8 variant and a USER 64 panic for the TLex16 variant.
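
The offset check that guards UnGet() can be wrapped in a helper so that callers cannot step back past the start. The following standard C++ sketch models that idea only: Cursor and TryUnGet are hypothetical names, and an assert stands in for the panic.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical cursor modelled on the Get(), UnGet() and Offset() trio.
class Cursor {
public:
    explicit Cursor(std::string aText) : iText(std::move(aText)), iNext(0) {}

    // Offset of the next character from the start of the text.
    std::size_t Offset() const { return iNext; }
    // Returns the next character and advances, or '\0' at the end.
    char Get() { return iNext < iText.size() ? iText[iNext++] : '\0'; }
    // Steps back one character; stepping back from offset 0 is a
    // programming error (the assert stands in for the panic).
    void UnGet() {
        assert(iNext > 0);
        --iNext;
    }

private:
    std::string iText;
    std::size_t iNext;
};

// Steps back only when there is a character to restore.
// Returns true if the position moved.
bool TryUnGet(Cursor& aInput) {
    if (aInput.Offset() == 0) {
        return false;
    }
    aInput.UnGet();
    return true;
}
```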

Resetting the next character position to a supplied mark

This shows how to rewind to a mark so that part of a TLex can be parsed again:

       
        
       
if (!(input.Peek()).IsDigit())
    {
    // found non-digit after decimal point. Error, so rewind 
    input.UnGetToMark(startMark);
    }
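
Rewinding to a mark after a failed parse can be modelled in standard C++ as follows. This is a sketch, not the Symbian API: Cursor and ReadDecimal are hypothetical names, and a plain offset plays the role that a TLexMark plays for TLex.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical cursor; a std::size_t offset stands in for a mark object.
class Cursor {
public:
    explicit Cursor(std::string aText) : iText(std::move(aText)), iNext(0) {}

    char Peek() const { return iNext < iText.size() ? iText[iNext] : '\0'; }
    void Inc() {
        assert(iNext < iText.size());
        ++iNext;
    }
    // Returns a mark for the current position.
    std::size_t Mark() const { return iNext; }
    // Moves the next character position back to a previously taken mark.
    void UnGetToMark(std::size_t aMark) {
        assert(aMark <= iNext);
        iNext = aMark;
    }

private:
    std::string iText;
    std::size_t iNext;
};

static bool IsDigit(char aChar) {
    return std::isdigit(static_cast<unsigned char>(aChar)) != 0;
}

// Reads digits with an optional fractional part ("12" or "12.5").
// If the decimal point is not followed by a digit, the cursor is rewound
// to where it started so the text can be parsed again another way.
bool ReadDecimal(Cursor& aInput, std::string& aText) {
    const std::size_t startMark = aInput.Mark();
    std::string text;
    while (IsDigit(aInput.Peek())) {
        text += aInput.Peek();
        aInput.Inc();
    }
    if (aInput.Peek() == '.') {
        text += '.';
        aInput.Inc();
        if (!IsDigit(aInput.Peek())) {
            aInput.UnGetToMark(startMark);   // error, so rewind
            return false;
        }
        while (IsDigit(aInput.Peek())) {
            text += aInput.Peek();
            aInput.Inc();
        }
    }
    if (text.empty()) {
        return false;                        // nothing was consumed
    }
    aText = text;
    return true;
}
```

After a failed call on "12.x", the cursor again points at the '1', so a different parser can be tried on the same text.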
      

Skipping any non-white space to get the next token

This parses a TLex for the next token:

       
        
       
input.Mark() ;                      // remember where we are
input.SkipCharacters() ;            // move to end of character token
if ( input.TokenLength() != 0 )     // if valid potential token
...
      

Getting length of a token

This shows how TokenLength() is used to return the number of characters between the extraction mark and the next character position. Checking this length tells the caller whether there is a token to extract: a length of zero means that no characters were skipped, so there is no valid token.

       
        
       
if ( input.TokenLength() != 0 )  // if valid token length
...
      

Extracting a token

This extracts a marked token.

       
        
       
TPtrC token = input.MarkedToken() ;  // extract token
      

Getting the offset of next character position

This shows how to return the offset of the next character position from the start of the string.

       
        
       
if (input.Offset() > 0)    // if not at start of line
    {
    input.UnGet() ;        // restore 'got' character
    ...
    }
      

Extracting an unknown number type

This example extracts a number whose type is not known in advance. It tries an unsigned integer first and then, if that fails, tries a real:

       
        
       
if (input.Val(extractUint) == KErrNone)
    {
    stack.Push(TReal(extractUint)) ;
    }
else 
    {
    if (input.Val(extractReal) == KErrNone)
        {
        stack.Push(extractReal) ;
        }
    }
      

The same extraction, trying an integer first and then a real if that fails, can be written more compactly:

       
        
       
if (input.Val(extractUint) == KErrNone)
    stack.Push(TReal(extractUint)) ;
else if (input.Val(extractReal) == KErrNone)
    stack.Push(extractReal) ;
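
The integer-first, real-second strategy can also be sketched in standard C++ for a token that has already been extracted. TokenToReal is a hypothetical name. Requiring each conversion to consume the whole token is a choice made for this sketch, so that "3.5" is not accepted as the integer 3; it is not a statement about how Val() behaves.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Converts one token to a double: tries an unsigned integer first and
// then, if that fails, a real. Returns false if neither conversion
// consumes the whole token.
bool TokenToReal(const std::string& aToken, double& aValue) {
    if (aToken.empty()) {
        return false;
    }
    const char* begin = aToken.c_str();
    const char* wholeToken = begin + aToken.size();
    char* end = nullptr;

    // Attempt 1: unsigned decimal integer. std::strtoul accepts a minus
    // sign and wraps the value, so tokens containing '-' skip this attempt.
    if (aToken.find('-') == std::string::npos) {
        errno = 0;
        const unsigned long asUint = std::strtoul(begin, &end, 10);
        if (errno == 0 && end == wholeToken) {
            aValue = static_cast<double>(asUint);
            return true;
        }
    }

    // Attempt 2: real number.
    errno = 0;
    const double asReal = std::strtod(begin, &end);
    if (errno == 0 && end == wholeToken) {
        aValue = asReal;
        return true;
    }
    return false;
}
```

A caller would push aValue onto its stack when the function returns true and treat the token as a command name otherwise.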