CnvUtilities Class Reference

class CnvUtilities

Provides static character conversion utilities for complex encodings. Its functions may be called from a plug-in DLL's implementation of ConvertFromUnicode() and ConvertToUnicode().

These utility functions are provided for use when converting to and from complex character set encodings, including modal encodings. Modal encodings are those in which the interpretation of a given byte of data depends on the current mode; the mode is changed by escape sequences which occur in the byte stream. A non-modal complex encoding is one in which characters are encoded using a variable number of bytes; the number of bytes used to encode a character depends on the value of its initial byte.
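As an illustration of the non-modal case, the following standalone C++ sketch derives a character's length from its initial byte using the Shift-JIS lead-byte ranges. It is not part of CnvUtilities; the names ShiftJisCharLengthAt and CountWholeCharacters are hypothetical, and std::vector stands in for a descriptor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Length in bytes of the Shift-JIS character starting at bytes[index].
// Lead bytes 0x81-0x9F and 0xE0-0xFC introduce a two-byte character;
// every other value is treated here as a single-byte character.
inline std::size_t ShiftJisCharLengthAt(const std::vector<std::uint8_t>& bytes,
                                        std::size_t index) {
    assert(index < bytes.size());  // caller must point at an existing byte
    const std::uint8_t lead = bytes[index];
    const bool twoByteLead = (lead >= 0x81 && lead <= 0x9F) ||
                             (lead >= 0xE0 && lead <= 0xFC);
    return twoByteLead ? 2 : 1;
}

// Counts whole characters. A final lead byte without its trail byte is
// left uncounted because that character is truncated.
inline std::size_t CountWholeCharacters(const std::vector<std::uint8_t>& bytes) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < bytes.size();) {
        const std::size_t length = ShiftJisCharLengthAt(bytes, i);
        if (i + length > bytes.size()) break;  // truncated final character
        ++count;
        i += length;
    }
    return count;
}
```

No escape sequences are involved here: the initial byte alone decides how many bytes follow, which is what distinguishes this kind of encoding from a modal one.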

Public Member Functions
IMPORT_C void ConvertFromIntermediateBufferInPlace ( TInt , TDes8 &, TInt &, const TDesC8 &, TInt )
IMPORT_C TInt ConvertFromUnicode ( CCnvCharacterSetConverter::TEndianness , const TDesC8 &, TDes8 &, const TDesC16 &, CCnvCharacterSetConverter::TArrayOfAscendingIndices &, const TArray < SCharacterSet > &)
IMPORT_C TInt ConvertFromUnicode ( CCnvCharacterSetConverter::TEndianness , const TDesC8 &, TDes8 &, const TDesC16 &, CCnvCharacterSetConverter::TArrayOfAscendingIndices &, const TArray < SCharacterSet > &, TUint &, TUint )
IMPORT_C TInt ConvertToUnicodeFromHeterogeneousForeign ( CCnvCharacterSetConverter::TEndianness , TDes16 &, const TDesC8 &, TInt &, TInt &, const TArray < SMethod > &)
IMPORT_C TInt ConvertToUnicodeFromHeterogeneousForeign ( CCnvCharacterSetConverter::TEndianness , TDes16 &, const TDesC8 &, TInt &, TInt &, const TArray < SMethod > &, TUint &, TUint )
IMPORT_C TInt ConvertToUnicodeFromModalForeign ( CCnvCharacterSetConverter::TEndianness , TDes16 &, const TDesC8 &, TInt &, TInt &, TInt &, const TArray < SState > &)
IMPORT_C TInt ConvertToUnicodeFromModalForeign ( CCnvCharacterSetConverter::TEndianness , TDes16 &, const TDesC8 &, TInt &, TInt &, TInt &, const TArray < SState > &, TUint &, TUint )
Private Member Functions
void CheckArrayOfCharacterSets (const TArray < SCharacterSet > &)
void CheckArrayOfMethods (const TArray < SMethod > &)
void CheckArrayOfStates (const TArray < SState > &)
TBool IsStartOf (const TDesC8 &, const TDesC8 &)
TInt LengthOfUnicodeCharacter (const TDesC16 &, TInt )
TBool MatchesEscapeSequence ( TInt &, TPtrC8 &, TPtrC8 &, const TDesC8 &)
TBool NextHomogeneousForeignRun (const SCnvConversionData *&, TInt &, TPtrC8 &, TPtrC8 &, const TArray < SState > &, TUint &)
TInt ReduceToNearestMultipleOf ( TInt , TInt )
Public Member Type Definitions
typedef void(* FConvertFromIntermediateBufferInPlace
typedef void(* FConvertToIntermediateBufferInPlace
typedef TInt (* FNumberOfBytesAbleToConvert

Member Functions Documentation

CheckArrayOfCharacterSets(const TArray< SCharacterSet > &)

void CheckArrayOfCharacterSets ( const TArray < SCharacterSet > & aArrayOfCharacterSets ) [private, static]

Parameters

const TArray < SCharacterSet > & aArrayOfCharacterSets

CheckArrayOfMethods(const TArray< SMethod > &)

void CheckArrayOfMethods ( const TArray < SMethod > & aArrayOfMethods ) [private, static]

Parameters

const TArray < SMethod > & aArrayOfMethods

CheckArrayOfStates(const TArray< SState > &)

void CheckArrayOfStates ( const TArray < SState > & aArrayOfStates ) [private, static]

Parameters

const TArray < SState > & aArrayOfStates

ConvertFromIntermediateBufferInPlace(TInt, TDes8 &, TInt &, const TDesC8 &, TInt)

IMPORT_C void ConvertFromIntermediateBufferInPlace ( TInt aStartPositionInDescriptor,
TDes8 & aDescriptor,
TInt & aNumberOfCharactersThatDroppedOut,
const TDesC8 & aEscapeSequence,
TInt aNumberOfBytesPerCharacter
) [static]

Inserts an escape sequence into the descriptor.

This function is provided to help in the implementation of ConvertFromUnicode() for modal character set encodings. Each SCharacterSet object in the array passed to ConvertFromUnicode() must have its iConvertFromIntermediateBufferInPlace member assigned. To do this for a modal character set encoding, implement a function whose signature matches that of FConvertFromIntermediateBufferInPlace and which calls this function, passing all arguments unchanged, and specifying the character set's escape sequence and the number of bytes per character.

Parameters

TInt aStartPositionInDescriptor The byte position in aDescriptor at which the escape sequence is inserted. If the character set uses more than one byte per character, this position must be the start of a character, otherwise a panic occurs.
TDes8 & aDescriptor The descriptor into which the escape sequence is inserted.
TInt & aNumberOfCharactersThatDroppedOut On return, the number of characters that had to drop out to make room for the escape sequence. When the descriptor's maximum length is too small to hold the escape sequence as well as the existing text, characters drop out from the end of the descriptor.
const TDesC8 & aEscapeSequence The escape sequence for the character set.
TInt aNumberOfBytesPerCharacter The number of bytes per character.
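The behaviour described above can be modelled in standard C++ as follows. This is a minimal sketch, not the library's implementation: BoundedBuffer is a hypothetical stand-in for TDes8, the three-argument wrapper shape is inferred from the description ("passing all arguments unchanged"), and the choice to skip the escape sequence when it cannot fit at all is an assumption of the sketch. The escape sequence ESC $ B is the one ISO-2022-JP uses to select JIS X 0208.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a TDes8: bytes plus a fixed maximum length.
struct BoundedBuffer {
    std::string data;
    std::size_t maxLength;
};

// Inserts `escape` at byte offset `start`. If the result would exceed
// maxLength, whole characters are first removed from the end of the buffer.
inline void InsertEscapeSequence(std::size_t start, BoundedBuffer& buffer,
                                 int& charactersDropped,
                                 const std::string& escape,
                                 std::size_t bytesPerCharacter) {
    assert(bytesPerCharacter > 0);
    assert(start <= buffer.data.size());
    // The text after `start` must consist of whole characters.
    assert((buffer.data.size() - start) % bytesPerCharacter == 0);

    charactersDropped = 0;
    const std::size_t needed = buffer.data.size() + escape.size();
    if (needed > buffer.maxLength) {
        const std::size_t overflow = needed - buffer.maxLength;
        const std::size_t wanted =
            (overflow + bytesPerCharacter - 1) / bytesPerCharacter;
        const std::size_t available =
            (buffer.data.size() - start) / bytesPerCharacter;
        const std::size_t dropped = std::min(wanted, available);
        buffer.data.resize(buffer.data.size() - dropped * bytesPerCharacter);
        charactersDropped = static_cast<int>(dropped);
    }
    // Sketch-only choice: skip the escape sequence if it still does not fit.
    if (buffer.data.size() + escape.size() <= buffer.maxLength) {
        buffer.data.insert(start, escape);
    }
}

// Assumed shape of the per-character-set callback: same arguments,
// minus the escape sequence and the bytes-per-character count.
using InsertEscapeFn = void (*)(std::size_t, BoundedBuffer&, int&);

// Wrapper for a two-byte mode: forwards its arguments unchanged and
// supplies the mode's escape sequence and character width.
inline void InsertTwoByteModeEscape(std::size_t start, BoundedBuffer& buffer,
                                    int& charactersDropped) {
    InsertEscapeSequence(start, buffer, charactersDropped, "\x1b$B", 2);
}
```

With a six-byte buffer whose maximum length is eight, inserting the three-byte escape sequence forces one two-byte character off the end, so the dropped count reported to the caller is one.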

ConvertFromUnicode(CCnvCharacterSetConverter::TEndianness, const TDesC8 &, TDes8 &, const TDesC16 &, CCnvCharacterSetConverter::TArrayOfAscendingIndices &, const TArray< SCharacterSet > &)

IMPORT_C TInt ConvertFromUnicode ( CCnvCharacterSetConverter::TEndianness aDefaultEndiannessOfForeignCharacters,
const TDesC8 & aReplacementForUnconvertibleUnicodeCharacters,
TDes8 & aForeign,
const TDesC16 & aUnicode,
CCnvCharacterSetConverter::TArrayOfAscendingIndices & aIndicesOfUnconvertibleCharacters,
const TArray < SCharacterSet > & aArrayOfCharacterSets
) [static]

Converts Unicode text into a complex foreign character set encoding. This is an encoding which cannot be converted simply by calling CCnvCharacterSetConverter::DoConvertFromUnicode() . It may be modal (e.g. JIS) or non-modal (e.g. Shift-JIS).

The Unicode text specified in aUnicode is converted using the array of conversion data objects (aArrayOfCharacterSets) provided by the plug-in for the complex character set encoding, and the converted text is returned in aForeign. Any existing contents in aForeign are overwritten.

Unlike CCnvCharacterSetConverter::DoConvertFromUnicode() , multiple character sets can be specified. aUnicode is converted using the first character conversion data object in the array. When a character is found which cannot be converted using that object, each character set in the array is tried in turn.

If the character can be converted using another object in the array, that object is used to convert it and all subsequent characters until another unconvertible character is found. If the character cannot be converted using any object in the array, its index is appended to aIndicesOfUnconvertibleCharacters and the character is replaced by aReplacementForUnconvertibleUnicodeCharacters.

Parameters

CCnvCharacterSetConverter::TEndianness aDefaultEndiannessOfForeignCharacters The default endian-ness to use when writing the characters in the foreign character set. If an endian-ness for foreign characters is specified in the current conversion data object, then that is used instead and the value of aDefaultEndiannessOfForeignCharacters is ignored.
const TDesC8 & aReplacementForUnconvertibleUnicodeCharacters The single character (one or more byte values) which is used to replace unconvertible characters.
TDes8 & aForeign On return, contains the converted text in the non-Unicode character set.
const TDesC16 & aUnicode The source Unicode text to be converted.
CCnvCharacterSetConverter::TArrayOfAscendingIndices & aIndicesOfUnconvertibleCharacters On return, holds an ascending array of the indices of each Unicode character in the source text which could not be converted (because none of the target character sets have an equivalent character).
const TArray < SCharacterSet > & aArrayOfCharacterSets Array of character conversion data objects, representing the character sets which comprise a complex character set encoding. These are used in sequence to convert the Unicode text. There must be at least one character set in this array and no character set may have any NULL member data, or a panic occurs.
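The fallback strategy described for this function can be sketched in standard C++. The sketch below is a simplified model under stated assumptions: each character set is reduced to a hypothetical lookup table (CharacterSet), escape sequences and endianness are ignored, and ConvertWithFallback is an illustrative name rather than a library function.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one character set's conversion data.
using CharacterSet = std::map<char16_t, std::string>;

// Converts `unicode` with the current set, switching sets only when the
// current one cannot convert a character.
inline std::string ConvertWithFallback(
        const std::u16string& unicode,
        const std::vector<CharacterSet>& sets,
        const std::string& replacement,
        std::vector<std::size_t>& indicesOfUnconvertible) {
    assert(!sets.empty());  // at least one character set is required
    std::string foreign;
    std::size_t current = 0;  // start with the first set in the array
    for (std::size_t i = 0; i < unicode.size(); ++i) {
        auto hit = sets[current].find(unicode[i]);
        if (hit == sets[current].end()) {
            // Try every set in turn; the first match becomes current.
            bool found = false;
            for (std::size_t s = 0; s < sets.size(); ++s) {
                hit = sets[s].find(unicode[i]);
                if (hit != sets[s].end()) {
                    current = s;
                    found = true;
                    break;
                }
            }
            if (!found) {
                indicesOfUnconvertible.push_back(i);  // ascending by construction
                foreign += replacement;
                continue;
            }
        }
        foreign += hit->second;
    }
    return foreign;
}
```

Because indices are appended in input order, the list of unconvertible characters is ascending, matching the description of aIndicesOfUnconvertibleCharacters.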

ConvertFromUnicode(CCnvCharacterSetConverter::TEndianness, const TDesC8 &, TDes8 &, const TDesC16 &, CCnvCharacterSetConverter::TArrayOfAscendingIndices &, const TArray< SCharacterSet > &, TUint &, TUint)

IMPORT_C TInt ConvertFromUnicode ( CCnvCharacterSetConverter::TEndianness aDefaultEndiannessOfForeignCharacters,
const TDesC8 & aReplacementForUnconvertibleUnicodeCharacters,
TDes8 & aForeign,
const TDesC16 & aUnicode,
CCnvCharacterSetConverter::TArrayOfAscendingIndices & aIndicesOfUnconvertibleCharacters,
const TArray < SCharacterSet > & aArrayOfCharacterSets,
TUint & aOutputConversionFlags,
TUint aInputConversionFlags
) [static]

Converts Unicode text into a complex foreign character set encoding. This is an encoding which cannot be converted simply by a call to CCnvCharacterSetConverter::DoConvertFromUnicode() . It may be modal (e.g. JIS) or non-modal (e.g. Shift-JIS).

The Unicode text specified in aUnicode is converted using the array of conversion data objects (aArrayOfCharacterSets) provided by the plug-in for the complex character set encoding and the converted text is returned in aForeign. The function can either append to aForeign or overwrite its contents (if any).

Unlike CCnvCharacterSetConverter::DoConvertFromUnicode() , multiple character sets can be specified. aUnicode is converted using the first character conversion data object in the array. When a character is found which cannot be converted using that object, each character set in the array is tried in turn.

If the character can be converted using another object in the array, that object is used to convert it and all subsequent characters until another unconvertible character is found. If the character cannot be converted using any object in the array, its index is appended to aIndicesOfUnconvertibleCharacters and the character is replaced by aReplacementForUnconvertibleUnicodeCharacters.

Parameters

CCnvCharacterSetConverter::TEndianness aDefaultEndiannessOfForeignCharacters The default endian-ness to use when writing the characters in the foreign character set. If an endian-ness for foreign characters is specified in the current conversion data object, then that is used instead and the value of aDefaultEndiannessOfForeignCharacters is ignored.
const TDesC8 & aReplacementForUnconvertibleUnicodeCharacters The single character (one or more byte values) which is used to replace unconvertible characters.
TDes8 & aForeign On return, contains the converted text in the non-Unicode character set. This may already contain some text. If it does, and if aInputConversionFlags specifies EInputConversionFlagAppend, then the converted text is appended to this descriptor.
const TDesC16 & aUnicode The source Unicode text to be converted.
CCnvCharacterSetConverter::TArrayOfAscendingIndices & aIndicesOfUnconvertibleCharacters On return, holds an ascending array of the indices of each Unicode character in the source text which could not be converted (because none of the target character sets have an equivalent character).
const TArray < SCharacterSet > & aArrayOfCharacterSets Array of character set data objects. These are used in sequence to convert the Unicode text. There must be at least one character set in this array and no character set may have any NULL member data, or a panic occurs.
TUint & aOutputConversionFlags If the input descriptor ended in a truncated sequence, e.g. the first half only of a Unicode surrogate pair, this returns with the EOutputConversionFlagInputIsTruncated flag set.
TUint aInputConversionFlags Specify CCnvCharacterSetConverter::EInputConversionFlagAppend to append the text to aForeign. Specify CCnvCharacterSetConverter::EInputConversionFlagAllowTruncatedInputNotEvenPartlyConsumable to prevent the function from returning the error-code EErrorIllFormedInput when the input descriptor consists of nothing but a truncated sequence. The CCnvCharacterSetConverter::EInputConversionFlagStopAtFirstUnconvertibleCharacter flag must not be set, otherwise a panic occurs.
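To make the truncated-input case concrete, the following standalone C++ sketch detects input that ends in the first half of a surrogate pair. It is only a model: kOutputFlagInputIsTruncated is a hypothetical constant standing in for EOutputConversionFlagInputIsTruncated (its value here is an assumption), and ConsumableLength and CombineSurrogates are illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for EOutputConversionFlagInputIsTruncated.
constexpr unsigned kOutputFlagInputIsTruncated = 0x1;

inline bool IsHighSurrogate(char16_t unit) {
    return unit >= 0xD800 && unit <= 0xDBFF;
}

inline bool IsLowSurrogate(char16_t unit) {
    return unit >= 0xDC00 && unit <= 0xDFFF;
}

// Returns how many leading code units form complete characters. A final
// unpaired high surrogate is excluded and reported through outputFlags.
inline std::size_t ConsumableLength(const std::u16string& unicode,
                                    unsigned& outputFlags) {
    outputFlags = 0;
    if (!unicode.empty() && IsHighSurrogate(unicode.back())) {
        outputFlags |= kOutputFlagInputIsTruncated;
        return unicode.size() - 1;
    }
    return unicode.size();
}

// Combines a surrogate pair into one code point. Both halves are needed,
// which is why a lone high surrogate cannot be converted on its own.
inline char32_t CombineSurrogates(char16_t high, char16_t low) {
    assert(IsHighSurrogate(high) && IsLowSurrogate(low));
    return static_cast<char32_t>(0x10000 + ((high - 0xD800) << 10) +
                                 (low - 0xDC00));
}
```

A caller that sees the truncated flag can keep the unconsumed code unit and prepend it to the next chunk of input before converting again.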

ConvertToUnicodeFromHeterogeneousForeign(CCnvCharacterSetConverter::TEndianness, TDes16 &, const TDesC8 &, TInt &, TInt &, const TArray< SMethod > &)

IMPORT_C TInt ConvertToUnicodeFromHeterogeneousForeign ( CCnvCharacterSetConverter::TEndianness aDefaultEndiannessOfForeignCharacters,
TDes16 & aUnicode,
const TDesC8 & aForeign,
TInt & aNumberOfUnconvertibleCharacters,
TInt & aIndexOfFirstByteOfFirstUnconvertibleCharacter,
const TArray < SMethod > & aArrayOfMethods
) [static]

Converts text from a non-modal complex character set encoding (e.g. Shift-JIS or EUC-JP) into Unicode.

The non-Unicode text specified in aForeign is converted using the array of character set conversion methods (aArrayOfMethods) provided by the plug-in, and the converted text is returned in aUnicode. Overwrites the contents, if any, of aUnicode.

Parameters

CCnvCharacterSetConverter::TEndianness aDefaultEndiannessOfForeignCharacters The default endian-ness of the foreign characters. If an endian-ness for foreign characters is specified in the conversion data, then that is used instead and the value of aDefaultEndiannessOfForeignCharacters is ignored.
TDes16 & aUnicode On return, contains the text converted into Unicode.
const TDesC8 & aForeign The non-Unicode source text to be converted.
TInt & aNumberOfUnconvertibleCharacters On return, contains the number of characters in aForeign which were not converted. Characters which cannot be converted are output as Unicode replacement characters (0xfffd).
TInt & aIndexOfFirstByteOfFirstUnconvertibleCharacter On return, the index of the first byte of the first unconvertible character. For instance, if the first character in the input descriptor (aForeign) could not be converted, this parameter is set to the index of the first byte of that character, i.e. zero. A negative value is returned if all the characters were converted.
const TArray < SMethod > & aArrayOfMethods Array of conversion methods. There must be one or more methods in this array and none of the methods in the array can have any NULL member data or a panic occurs.
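The reporting of unconvertible characters can be illustrated with a small standalone C++ model. Everything in it is an assumption made for the sketch: the lookup table, the toy length rule in LengthFromLeadByte, and the names DecodeResult and DecodeWithReplacement are hypothetical and do not reflect how the library's conversion methods are structured.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

struct DecodeResult {
    std::u16string unicode;
    int unconvertibleCount = 0;
    int indexOfFirstUnconvertibleByte = -1;  // negative: everything converted
};

// Toy length rule for the sketch: bytes with the top bit set start a
// two-byte character, all others are single-byte.
inline std::size_t LengthFromLeadByte(unsigned char lead) {
    return lead >= 0x80 ? 2 : 1;
}

// Decodes `foreign` character by character. Characters missing from
// `table` become U+FFFD and are counted; the byte index of the first
// such character is recorded.
inline DecodeResult DecodeWithReplacement(
        const std::string& foreign,
        const std::map<std::string, char16_t>& table,
        std::size_t (*charLength)(unsigned char)) {
    assert(charLength != nullptr);
    DecodeResult result;
    std::size_t i = 0;
    while (i < foreign.size()) {
        const std::size_t length =
            charLength(static_cast<unsigned char>(foreign[i]));
        if (length == 0 || i + length > foreign.size()) break;  // truncated tail
        const auto hit = table.find(foreign.substr(i, length));
        if (hit != table.end()) {
            result.unicode.push_back(hit->second);
        } else {
            result.unicode.push_back(u'\uFFFD');
            if (result.unconvertibleCount == 0) {
                result.indexOfFirstUnconvertibleByte = static_cast<int>(i);
            }
            ++result.unconvertibleCount;
        }
        i += length;
    }
    return result;
}
```

The index is a byte offset into the foreign text, not a character count, so it stays meaningful when earlier characters occupy more than one byte.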

ConvertToUnicodeFromHeterogeneousForeign(CCnvCharacterSetConverter::TEndianness, TDes16 &, const TDesC8 &, TInt &, TInt &, const TArray< SMethod > &, TUint &, TUint)

IMPORT_C TInt ConvertToUnicodeFromHeterogeneousForeign ( CCnvCharacterSetConverter::TEndianness aDefaultEndiannessOfForeignCharacters,
TDes16 & aUnicode,
const TDesC8 & aForeign,
TInt & aNumberOfUnconvertibleCharacters,
TInt & aIndexOfFirstByteOfFirstUnconvertibleCharacter,
const TArray < SMethod > & aArrayOfMethods,
TUint & aOutputConversionFlags,
TUint aInputConversionFlags
) [static]

Parameters

CCnvCharacterSetConverter::TEndianness aDefaultEndiannessOfForeignCharacters The default endian-ness for the foreign characters. If an endian-ness for foreign characters is specified in the conversion data, then that is used instead and the value of aDefaultEndiannessOfForeignCharacters is ignored.
TDes16 & aUnicode On return, contains the text converted into Unicode.
const TDesC8 & aForeign The non-Unicode source text to be converted.
TInt & aNumberOfUnconvertibleCharacters On return, contains the number of characters in aForeign which were not converted. Characters which cannot be converted are output as Unicode replacement characters (0xfffd).
TInt & aIndexOfFirstByteOfFirstUnconvertibleCharacter On return, the index of the first byte of the first unconvertible character. For instance, if the first character in the input descriptor (aForeign) could not be converted, this parameter is set to the index of the first byte of that character, i.e. zero. A negative value is returned if all the characters were converted.
const TArray < SMethod > & aArrayOfMethods Array of conversion methods. There must be one or more methods in this array and none of the methods in the array can have any NULL member data or a panic occurs.
TUint & aOutputConversionFlags If the input descriptor ended in a truncated sequence, e.g. a part of a multi-byte character, aOutputConversionFlags returns with the EOutputConversionFlagInputIsTruncated flag set.
TUint aInputConversionFlags Specify CCnvCharacterSetConverter::EInputConversionFlagAppend to append the text to aUnicode. Specify EInputConversionFlagAllowTruncatedInputNotEvenPartlyConsumable to prevent the function from returning the error-code EErrorIllFormedInput when the input descriptor consists of nothing but a truncated sequence. The CCnvCharacterSetConverter::EInputConversionFlagStopAtFirstUnconvertibleCharacter flag must not be set, otherwise a panic occurs.

ConvertToUnicodeFromModalForeign(CCnvCharacterSetConverter::TEndianness, TDes16 &, const TDesC8 &, TInt &, TInt &, TInt &, const TArray< SState > &)

IMPORT_C TInt ConvertToUnicodeFromModalForeign ( CCnvCharacterSetConverter::TEndianness aDefaultEndiannessOfForeignCharacters,
TDes16 & aUnicode,
const TDesC8 & aForeign,
TInt & aState,
TInt & aNumberOfUnconvertibleCharacters,
TInt & aIndexOfFirstByteOfFirstUnconvertibleCharacter,
const TArray < SState > & aArrayOfStates
) [static]

Converts text from a modal foreign character set encoding into Unicode.

The non-Unicode text specified in aForeign is converted using the array of character set conversion objects (aArrayOfStates) provided by the plug-in, and the converted text is returned in aUnicode. The function can either append to aUnicode or overwrite its contents (if any), depending on the input conversion flags specified. The first element in aArrayOfStates is taken to be the default mode (i.e. the mode to assume by default if there is no preceding escape sequence).

Parameters

CCnvCharacterSetConverter::TEndianness aDefaultEndiannessOfForeignCharacters The default endian-ness of the foreign characters. If an endian-ness for foreign characters is specified in the conversion data, then that is used instead and the value of aDefaultEndiannessOfForeignCharacters is ignored.
TDes16 & aUnicode On return, contains the text converted into Unicode.
const TDesC8 & aForeign The non-Unicode source text to be converted.
TInt & aState Used to store a modal character set encoding's current mode across multiple calls to ConvertToUnicode() on the same input descriptor. This argument should be passed the same object as passed to the plug-in's ConvertToUnicode() exported function.
TInt & aNumberOfUnconvertibleCharacters On return, contains the number of characters in aForeign which were not converted. Characters which cannot be converted are output as Unicode replacement characters (0xfffd).
TInt & aIndexOfFirstByteOfFirstUnconvertibleCharacter On return, the index of the first byte of the first unconvertible character. For instance, if the first character in the input descriptor (aForeign) could not be converted, this parameter is set to the index of the first byte of that character, i.e. zero. A negative value is returned if all the characters were converted.
const TArray < SState > & aArrayOfStates Array of character set conversion data objects, and their escape sequences ("modes"). There must be one or more modes in this array, none of the modes can have any NULL member data, and each mode's escape sequence must begin with KControlCharacterEscape (0x1b) or a panic occurs.
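How escape sequences and the stored mode interact can be sketched in standard C++. This is a simplified model, not the library's algorithm: Mode is a hypothetical stand-in for SState that keeps only the escape sequence, and SplitIntoRuns merely divides the input into runs that share a mode, leaving the per-run conversion out. Stopping at an unrecognised escape sequence is a choice made for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for SState: only the escape sequence is kept.
struct Mode {
    std::string escapeSequence;
};

// Splits `foreign` into (mode index, bytes) runs. `state` holds the mode
// in force on entry and is updated so the next call can continue from it.
inline std::vector<std::pair<int, std::string>> SplitIntoRuns(
        const std::string& foreign, const std::vector<Mode>& modes, int& state) {
    assert(!modes.empty());
    for (const Mode& mode : modes) {
        // Every escape sequence must begin with the escape character.
        assert(!mode.escapeSequence.empty() && mode.escapeSequence[0] == '\x1b');
    }
    assert(state >= 0 && static_cast<std::size_t>(state) < modes.size());

    std::vector<std::pair<int, std::string>> runs;
    std::string current;
    std::size_t i = 0;
    while (i < foreign.size()) {
        if (foreign[i] != '\x1b') {
            current.push_back(foreign[i++]);
            continue;
        }
        int matched = -1;
        for (std::size_t m = 0; m < modes.size(); ++m) {
            const std::string& escape = modes[m].escapeSequence;
            if (foreign.compare(i, escape.size(), escape) == 0) {
                matched = static_cast<int>(m);
                break;
            }
        }
        if (matched < 0) break;  // unknown or truncated escape sequence
        if (!current.empty()) {
            runs.emplace_back(state, current);
            current.clear();
        }
        state = matched;
        i += modes[static_cast<std::size_t>(matched)].escapeSequence.size();
    }
    if (!current.empty()) runs.emplace_back(state, current);
    return runs;
}
```

Passing 0 as the initial state corresponds to starting in the default mode, the first element of the array. Passing the same state variable to a later call lets a second chunk of input be interpreted in the mode the first chunk ended in.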

ConvertToUnicodeFromModalForeign(CCnvCharacterSetConverter::TEndianness, TDes16 &, const TDesC8 &, TInt &, TInt &, TInt &, const TArray< SState > &, TUint &, TUint)

IMPORT_C TInt ConvertToUnicodeFromModalForeign ( CCnvCharacterSetConverter::TEndianness aDefaultEndiannessOfForeignCharacters,
TDes16 & aUnicode,
const TDesC8 & aForeign,
TInt & aState,
TInt & aNumberOfUnconvertibleCharacters,
TInt & aIndexOfFirstByteOfFirstUnconvertibleCharacter,
const TArray < SState > & aArrayOfStates,
TUint & aOutputConversionFlags,
TUint aInputConversionFlags
) [static]

Parameters

CCnvCharacterSetConverter::TEndianness aDefaultEndiannessOfForeignCharacters The default endian-ness for the foreign characters. If an endian-ness for foreign characters is specified in the conversion data, then that is used instead and the value of aDefaultEndiannessOfForeignCharacters is ignored.
TDes16 & aUnicode On return, contains the text converted into Unicode.
const TDesC8 & aForeign The non-Unicode source text to be converted.
TInt & aState Used to store a modal character set encoding's current mode across multiple calls to ConvertToUnicode() on the same input descriptor. This argument should be passed the same object as passed to the plug-in's ConvertToUnicode() exported function.
TInt & aNumberOfUnconvertibleCharacters On return, contains the number of characters in aForeign which were not converted. Characters which cannot be converted are output as Unicode replacement characters (0xfffd).
TInt & aIndexOfFirstByteOfFirstUnconvertibleCharacter On return, the index of the first byte of the first unconvertible character. For instance, if the first character in the input descriptor (aForeign) could not be converted, this parameter is set to the index of the first byte of that character, i.e. zero. A negative value is returned if all the characters were converted.
const TArray < SState > & aArrayOfStates Array of character set conversion data objects, and their escape sequences. There must be one or more modes in this array, none of the modes can have any NULL member data, and each mode's escape sequence must begin with KControlCharacterEscape (0x1b) or a panic occurs.
TUint & aOutputConversionFlags If the input descriptor ended in a truncated sequence, e.g. a part of a multi-byte character, aOutputConversionFlags returns with the EOutputConversionFlagInputIsTruncated flag set.
TUint aInputConversionFlags Specify CCnvCharacterSetConverter::EInputConversionFlagAppend to append the text to aUnicode. Specify EInputConversionFlagAllowTruncatedInputNotEvenPartlyConsumable to prevent the function from returning the error-code EErrorIllFormedInput when the input descriptor consists of nothing but a truncated sequence. The CCnvCharacterSetConverter::EInputConversionFlagStopAtFirstUnconvertibleCharacter flag must not be set, otherwise a panic occurs.

IsStartOf(const TDesC8 &, const TDesC8 &)

TBool IsStartOf ( const TDesC8 & aStart,
const TDesC8 & aPotentiallyLongerDescriptor
) [private, static]

Parameters

const TDesC8 & aStart
const TDesC8 & aPotentiallyLongerDescriptor

LengthOfUnicodeCharacter(const TDesC16 &, TInt)

TInt LengthOfUnicodeCharacter ( const TDesC16 & aUnicode,
TInt aIndex
) [private, static]

Parameters

const TDesC16 & aUnicode
TInt aIndex

MatchesEscapeSequence(TInt &, TPtrC8 &, TPtrC8 &, const TDesC8 &)

TBool MatchesEscapeSequence ( TInt & aNumberOfForeignBytesConsumed,
TPtrC8 & aHomogeneousRun,
TPtrC8 & aRemainderOfForeign,
const TDesC8 & aEscapeSequence
) [private, static]

Parameters

TInt & aNumberOfForeignBytesConsumed
TPtrC8 & aHomogeneousRun
TPtrC8 & aRemainderOfForeign
const TDesC8 & aEscapeSequence

NextHomogeneousForeignRun(const SCnvConversionData *&, TInt &, TPtrC8 &, TPtrC8 &, const TArray< SState > &, TUint &)

TBool NextHomogeneousForeignRun ( const SCnvConversionData *& aConversionData,
TInt & aNumberOfForeignBytesConsumed,
TPtrC8 & aHomogeneousRun,
TPtrC8 & aRemainderOfForeign,
const TArray < SState > & aArrayOfStates,
TUint & aOutputConversionFlags
) [private, static]

Parameters

const SCnvConversionData *& aConversionData
TInt & aNumberOfForeignBytesConsumed
TPtrC8 & aHomogeneousRun
TPtrC8 & aRemainderOfForeign
const TArray < SState > & aArrayOfStates
TUint & aOutputConversionFlags

ReduceToNearestMultipleOf(TInt, TInt)

TInt ReduceToNearestMultipleOf ( TInt aNumber1,
TInt aNumber2
) [private, static, inline]

Parameters

TInt aNumber1
TInt aNumber2

Member Type Definitions Documentation

Typedef FConvertFromIntermediateBufferInPlace

typedef void(* FConvertFromIntermediateBufferInPlace

A pointer to a function which "mangles" text when converting from Unicode into a complex modal or non-modal foreign character set encoding.

It might insert a shifting character, escape sequence, or other special characters. If the target character set encoding is modal, the implementation of this function may call the CnvUtilities::ConvertFromIntermediateBufferInPlace() utility function, which is provided because many modal character sets require an identical implementation of this function.

" convutils.lib "

Typedef FConvertToIntermediateBufferInPlace

typedef void(* FConvertToIntermediateBufferInPlace

A pointer to a function which prepares the text for conversion into Unicode.

For instance it might remove any shifting or other special characters. Called when converting from a non-modal complex character set encoding into Unicode.

" convutils.lib "

Typedef FNumberOfBytesAbleToConvert

typedef TInt (* FNumberOfBytesAbleToConvert

A pointer to a function which calculates the number of consecutive bytes in the remainder of the foreign descriptor which can be converted using the current character set's conversion data.

Called when converting from a non-modal complex character set encoding into Unicode. It may return a negative CCnvCharacterSetConverter::TError value to indicate an error in the encoding.
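A function of this kind might look like the following standalone C++ sketch, which counts the bytes at the start of the remaining input that belong to two-byte Shift-JIS characters. The pointer-and-length signature, the name NumberOfBytesAbleToConvertToTwoByteSet and the constant kErrorIllFormedInput are all assumptions made for illustration; the real typedef's parameter list is not shown in this reference.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a negative CCnvCharacterSetConverter::TError value.
constexpr int kErrorIllFormedInput = -1;

// Shift-JIS lead bytes of two-byte characters.
inline bool IsLeadByte(unsigned char byte) {
    return (byte >= 0x81 && byte <= 0x9F) || (byte >= 0xE0 && byte <= 0xFC);
}

// Shift-JIS trail bytes of two-byte characters.
inline bool IsTrailByte(unsigned char byte) {
    return (byte >= 0x40 && byte <= 0x7E) || (byte >= 0x80 && byte <= 0xFC);
}

// Returns the number of consecutive leading bytes that form complete
// two-byte characters, or a negative error code if a lead byte is
// followed by a byte that cannot be a trail byte.
inline int NumberOfBytesAbleToConvertToTwoByteSet(const unsigned char* data,
                                                  std::size_t length) {
    assert(data != nullptr || length == 0);
    std::size_t i = 0;
    while (i < length && IsLeadByte(data[i])) {
        if (i + 1 >= length) break;  // truncated pair: leave it unconsumed
        if (!IsTrailByte(data[i + 1])) return kErrorIllFormedInput;
        i += 2;
    }
    return static_cast<int>(i);
}
```

The count stops at the first byte that does not start a two-byte character, so that the remaining bytes can be offered to a different conversion method.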

" convutils.lib "