Character Encoding and Conversion Framework Overview

The Charconv Framework component provides APIs and built-in converters to convert text between Unicode and other character encodings.

The component also provides APIs that list the encodings available on the device and select a specific encoding to convert to or from.

Architecture

The Charconv Framework is part of the Character Conversion (Charconv) Framework collection.

APIs

The Charconv Framework includes the following APIs:

API Description

CCnvCharacterSetConverter

Converts text between Unicode and another character set. Conversion can be performed on fragments of input, including handling the case when the input text is truncated midway through a multi-byte character. This allows clients to do conversion in small steps, which may be preferred for large amounts of text or when text is arriving in fragments from an external source.

CnvUtfConverter

Converts text between Unicode (UTF-16) and the two Unicode transformation formats UTF-7 and UTF-8. These are ASCII-compatible encodings of Unicode which use sequences of multiple bytes to encode non-ASCII characters. Conversion can be performed in incremental steps.

CCnvCharacterSetNames

Gets the names of all available converters.

CnvUtilities

Provides static character conversion utilities for complex encodings.

CCharacterSetConverterPluginInterface

Defines the methods in the Charconv Character Set Conversion plug-in interface.
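The fragment-by-fragment conversion described for CCnvCharacterSetConverter and CnvUtfConverter can be illustrated with a minimal standalone sketch in standard C++. It does not use the Charconv API: DecodeUtf8Fragment and ConvertInFragments are hypothetical helpers written only for this illustration. The sketch assumes the same general contract as described above, namely that a conversion call reports how many trailing input bytes were left unconverted because the fragment ended midway through a multi-byte character, so that the caller can resubmit them with the next fragment. It does not reject overlong or surrogate encodings.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not a Charconv function): decodes every complete UTF-8
// sequence in `input`, appends the result to `out` as UTF-16, and returns the
// number of trailing bytes left unconverted because the fragment ended midway
// through a multi-byte character. Invalid bytes become U+FFFD.
std::size_t DecodeUtf8Fragment(const std::string& input, std::u16string& out) {
    std::size_t i = 0;
    while (i < input.size()) {
        const unsigned char lead = static_cast<unsigned char>(input[i]);
        std::size_t length = 0;
        std::uint32_t codePoint = 0;
        if (lead < 0x80)                { length = 1; codePoint = lead; }
        else if ((lead & 0xE0) == 0xC0) { length = 2; codePoint = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { length = 3; codePoint = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { length = 4; codePoint = lead & 0x07; }
        else { out.push_back(0xFFFD); ++i; continue; }  // invalid lead byte

        if (i + length > input.size()) break;  // truncated: keep for next call

        bool valid = true;
        for (std::size_t k = 1; k < length; ++k) {
            const unsigned char c = static_cast<unsigned char>(input[i + k]);
            if ((c & 0xC0) != 0x80) { valid = false; break; }
            codePoint = (codePoint << 6) | (c & 0x3F);
        }
        if (!valid || codePoint > 0x10FFFF) {
            out.push_back(0xFFFD);
            ++i;
            continue;
        }
        if (codePoint >= 0x10000) {
            // Characters outside the BMP need a UTF-16 surrogate pair.
            codePoint -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (codePoint >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (codePoint & 0x3FF)));
        } else {
            out.push_back(static_cast<char16_t>(codePoint));
        }
        i += length;
    }
    return input.size() - i;
}

// Hypothetical driver: feeds `text` to the decoder in chunks of `chunkSize`
// bytes, carrying unconverted trailing bytes over to the next chunk.
std::u16string ConvertInFragments(const std::string& text, std::size_t chunkSize) {
    if (chunkSize == 0) chunkSize = 1;  // avoid a loop that never advances
    std::u16string out;
    std::string pending;
    for (std::size_t pos = 0; pos < text.size(); pos += chunkSize) {
        pending += text.substr(pos, chunkSize);
        const std::size_t unconverted = DecodeUtf8Fragment(pending, out);
        pending.erase(0, pending.size() - unconverted);
    }
    return out;
}
```

Calling ConvertInFragments with a chunk size of 2 on the six-byte UTF-8 text for "a", U+00E9 and U+20AC splits both multi-byte characters across chunk boundaries, yet the output is the same three UTF-16 code units that a single-pass conversion would produce.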

Built-in converters

The built-in converters are used by most languages. Each converter is identified by a Unique Identifier (UID). The UIDs are defined in the charconv.h header file.

Converter Name           Target script

UTF-7                    Universal (Unicode)
UTF-8                    Universal (Unicode)
Little Endian Unicode    Universal (Unicode)
Big Endian Unicode       Universal (Unicode)
IMAP UTF-7               Universal (Unicode) for IMAP
Java UTF-8               Universal (Unicode) for Java
ASCII                    Western European / US
Code Page 1252           Western European / US
ISO8859-1                Western European / US
SMS7Bit                  Western European / US
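The difference between the Little Endian Unicode and Big Endian Unicode converters is the order in which the two bytes of each 16-bit code unit are written. The following standalone C++ sketch shows that byte ordering only. SerializeUtf16 and ByteOrder are hypothetical names introduced for this illustration and are not part of the Charconv API.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical names for illustration; not defined by Charconv.
enum class ByteOrder { LittleEndian, BigEndian };

// Writes each UTF-16 code unit as two bytes in the requested order.
std::vector<std::uint8_t> SerializeUtf16(const std::u16string& text,
                                         ByteOrder order) {
    std::vector<std::uint8_t> bytes;
    bytes.reserve(text.size() * 2);
    for (const char16_t unit : text) {
        const std::uint8_t low = static_cast<std::uint8_t>(unit & 0xFF);
        const std::uint8_t high = static_cast<std::uint8_t>(unit >> 8);
        if (order == ByteOrder::LittleEndian) {
            bytes.push_back(low);   // least significant byte first
            bytes.push_back(high);
        } else {
            bytes.push_back(high);  // most significant byte first
            bytes.push_back(low);
        }
    }
    return bytes;
}
```

For the code unit U+20AC, the little-endian form is the byte pair AC 20 and the big-endian form is 20 AC.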

Typical uses

The Charconv Framework can be used to:

  • Convert between a foreign encoding and Unicode.

  • Convert between a Unicode transformation format and Unicode.

  • Write additional plug-in converters for specific encodings or languages.

  • Get a list of converter names.

  • Automatically detect the character encoding of the text to be converted.

  • Get the UID of a character converter from its Internet-standard encoding name.
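The last use in the list above, mapping an Internet-standard encoding name to a converter identifier, can be sketched as a case-insensitive table lookup. This standalone C++ example is an illustration only: ConverterIdFromStandardName is a hypothetical function, and the identifier values are placeholders, not the real UIDs defined in charconv.h.

```cpp
#include <algorithm>
#include <cctype>
#include <cstdint>
#include <map>
#include <string>

// Placeholder identifiers for illustration; the real UIDs are in charconv.h.
constexpr std::uint32_t kHypotheticalUtf7Id = 1;
constexpr std::uint32_t kHypotheticalUtf8Id = 2;
constexpr std::uint32_t kHypotheticalAsciiId = 3;
constexpr std::uint32_t kHypotheticalIso88591Id = 4;

// Hypothetical lookup: returns the identifier registered for `name`,
// ignoring case, or 0 when no converter is registered under that name.
std::uint32_t ConverterIdFromStandardName(std::string name) {
    static const std::map<std::string, std::uint32_t> kIdsByName = {
        {"utf-7", kHypotheticalUtf7Id},
        {"utf-8", kHypotheticalUtf8Id},
        {"us-ascii", kHypotheticalAsciiId},
        {"iso-8859-1", kHypotheticalIso88591Id},
    };
    // Standard encoding names are compared without regard to case.
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) {
                       return static_cast<char>(std::tolower(c));
                   });
    const auto found = kIdsByName.find(name);
    return found == kIdsByName.end() ? 0 : found->second;
}
```

A client would typically pass the name taken from a protocol header, such as the charset parameter of a MIME Content-Type field, and treat a result of 0 as "no converter available".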