textandloc_pub/character_conversion_api/inc/UTF.H
author Pat Downey <patd@symbian.org>
Fri, 04 Jun 2010 10:51:39 +0100
changeset 33 641ba6dff8d4
parent 32 8b9155204a54
permissions -rw-r--r--
Re-merge fix for bug 1543.
/*
* Copyright (c) 1997-2003 Nokia Corporation and/or its subsidiary(-ies). 
* All rights reserved.
* This component and the accompanying materials are made available
* under the terms of the License "Eclipse Public License v1.0"
* which accompanies this distribution, and is available
* at the URL "http://www.eclipse.org/legal/epl-v10.html".
*
* Initial Contributors:
* Nokia Corporation - initial contribution.
*
* Contributors:
*
* Description: 
*
*/

#if !defined(__UTF_H__)
#define __UTF_H__

#if !defined(__E32STD_H__)
#include <e32std.h>
#endif

class CnvUtfConverter
/** 
Converts text between Unicode (UCS-2) and the two Unicode transformation 
formats UTF-7 and UTF-8. There are no functions to convert directly between 
UTF-7 and UTF-8.

Objects of this class do not need to be created because all the member functions 
are static. The non-leaving conversion functions take the input text in their 
second argument and write the converted text to their first argument. Sixteen-bit 
descriptors hold text encoded in UCS-2 (i.e. normal 16-bit Unicode), and eight-bit 
descriptors hold text encoded in either of the transformation formats.

The non-leaving conversion functions return either a TError value or the number 
of input elements that were not converted because the output descriptor was not 
long enough to hold all of the converted text. This allows users of this class 
to perform partial conversions on an input descriptor, handling the case when 
the input descriptor is truncated midway through a multi-byte character. The 
caller does not have to guess how big to make the output descriptor for a given 
input descriptor; they can simply do the conversion in a loop using a small 
output descriptor. The ability to handle truncated descriptors is particularly 
useful if the caller is receiving information in chunks from an external source. 
@publishedAll
@released
*/
	{
public:
	/** Conversion error flags. At this stage there is only one error flag; 
	others may be added in the future. */
	enum TError
		{
 		/** The input descriptor contains a single corrupt character. This 
 		might occur when the input descriptor only contains some of the bytes 
 		of a single multi-byte character. */
		EErrorIllFormedInput=KErrCorrupt
		};
	 
	/** Initial value for the state argument in a set of related calls to
	ConvertToUnicodeFromUtf7(). */
	enum {KStateDefault=0}; 
public:
	// the conversion functions return either one of the TError values above, or the number of unconverted elements left at the end of the input descriptor
	IMPORT_C static TInt ConvertFromUnicodeToUtf7(TDes8& aUtf7, const TDesC16& aUnicode, TBool aEncodeOptionalDirectCharactersInBase64);
	static TInt ConvertFromUnicodeToUtf7(TDes8& aUtf7, const TDesC16& aUnicode, TBool aIsImapUtf7, TBool aEncodeOptionalDirectCharactersInBase64);
	IMPORT_C static TInt ConvertFromUnicodeToUtf8(TDes8& aUtf8, const TDesC16& aUnicode);
	static TInt ConvertFromUnicodeToUtf8(TDes8& aUtf8, const TDesC16& aUnicode, TBool aGenerateJavaConformantUtf8);
	IMPORT_C static TInt ConvertToUnicodeFromUtf7(TDes16& aUnicode, const TDesC8& aUtf7, TInt& aState);
	static TInt ConvertToUnicodeFromUtf7(TDes16& aUnicode, const TDesC8& aUtf7, TBool aIsImapUtf7, TInt& aState);
	IMPORT_C static TInt ConvertToUnicodeFromUtf8(TDes16& aUnicode, const TDesC8& aUtf8);
	static TInt ConvertToUnicodeFromUtf8(TDes16& aUnicode, const TDesC8& aUtf8, TBool aGenerateJavaConformantUtf8);
	static TInt ConvertToUnicodeFromUtf8(TDes16& aUnicode, const TDesC8& aUtf8, TBool aGenerateJavaConformantUtf8,
			TInt& aNumberOfUnconvertibleCharacters, TInt& aIndexOfFirstByteOfFirstUnconvertibleCharacter);

	IMPORT_C static HBufC8* ConvertFromUnicodeToUtf7L(const TDesC16& aUnicode, TBool aEncodeOptionalDirectCharactersInBase64);
	IMPORT_C static HBufC8* ConvertFromUnicodeToUtf8L(const TDesC16& aUnicode);
	IMPORT_C static HBufC16* ConvertToUnicodeFromUtf7L(const TDesC8& aUtf7); 
	IMPORT_C static HBufC16* ConvertToUnicodeFromUtf8L(const TDesC8& aUtf8);
	};
#endif
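
The chunked-conversion loop described in the class comment can be sketched in portable C++. This is a minimal sketch, not the Symbian implementation: `ConvertToUtf16`, `DecodeChunks`, and `kIllFormedInput` are hypothetical stand-ins that only mimic the documented return convention (the number of input bytes left unconverted, or a negative error when the input starts with an incomplete or ill-formed sequence). The decoder is assumed to handle only 1- to 3-byte UTF-8 sequences and does not reject overlong forms.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical error code for this sketch; the real class reports
// EErrorIllFormedInput (KErrCorrupt) instead.
constexpr int kIllFormedInput = -1;

// Length of the UTF-8 sequence introduced by `lead`, or 0 if `lead` cannot
// start a sequence this sketch supports (1 to 3 bytes only).
static int SequenceLength(unsigned char lead) {
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    return 0;
}

// Stand-in converter: decodes UTF-8 from `in` into `out`, writing at most
// `capacity` UTF-16 code units. Returns the number of input bytes left
// unconverted, or kIllFormedInput if `in` starts with a truncated or
// ill-formed sequence.
int ConvertToUtf16(std::u16string& out, std::size_t capacity, std::string_view in) {
    out.clear();
    std::size_t pos = 0;
    while (pos < in.size() && out.size() < capacity) {
        const unsigned char lead = static_cast<unsigned char>(in[pos]);
        const int len = SequenceLength(lead);
        bool ok = len != 0 && pos + len <= in.size();
        unsigned cp = (len == 1) ? lead : (len == 2) ? (lead & 0x1Fu) : (lead & 0x0Fu);
        for (int i = 1; ok && i < len; ++i) {
            const unsigned char b = static_cast<unsigned char>(in[pos + i]);
            ok = (b & 0xC0) == 0x80;          // must be a continuation byte
            cp = (cp << 6) | (b & 0x3Fu);
        }
        if (!ok) {
            if (pos == 0) return kIllFormedInput;
            break;                            // leave the bad tail unconverted
        }
        out.push_back(static_cast<char16_t>(cp));
        pos += len;
    }
    return static_cast<int>(in.size() - pos);
}

// Feeds chunks through the converter using a small output buffer, carrying
// any unconverted tail bytes over to the next chunk. `complete` is set to
// false if bytes remain that never formed a valid sequence.
std::u16string DecodeChunks(const std::vector<std::string>& chunks,
                            std::size_t capacity, bool& complete) {
    std::u16string result, buffer;
    std::string pending;
    for (const std::string& chunk : chunks) {
        pending += chunk;
        while (!pending.empty()) {
            const int left = ConvertToUtf16(buffer, capacity, pending);
            // Stop on an error or on no progress; wait for more input.
            if (left < 0 || static_cast<std::size_t>(left) == pending.size()) break;
            result += buffer;
            pending.erase(0, pending.size() - static_cast<std::size_t>(left));
        }
    }
    complete = pending.empty();
    return result;
}
```

The caller never sizes the output for the whole input: each pass converts what fits, and the returned count says how many trailing bytes to keep for the next pass. A chunk boundary that splits a multi-byte character simply leaves those bytes pending until the rest arrives.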