This section explains the concept of reserved, unsafe and unreserved characters.
URIs use some characters for special purposes in defining their syntax; these are called reserved characters. Examples are ; / ? : & = . When a reserved character is not used in its special role inside a URI, it needs to be encoded.
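As an illustration of encoding a reserved character that appears as plain data, here is a minimal sketch in standard C++. It is not the Symbian EscapeUtils API: the function name PercentEncode and the idea of passing the reserved set as a string are assumptions made for this example. Each matching byte is replaced by '%' followed by its two-digit hexadecimal value.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of EscapeUtils): percent-encodes every byte of
// `data` that appears in `reserved`. The '%' character itself is also encoded
// so that the result can be decoded unambiguously.
std::string PercentEncode(const std::string& data, const std::string& reserved) {
    static const char kHex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : data) {
        const bool mustEncode =
            c == '%' || reserved.find(static_cast<char>(c)) != std::string::npos;
        if (mustEncode) {
            out += '%';
            out += kHex[c >> 4];    // high nibble
            out += kHex[c & 0x0F];  // low nibble
        } else {
            out += static_cast<char>(c);
        }
    }
    return out;
}

// Usage sketch: '&' and '=' are data here, not query delimiters.
void PercentEncodeExample() {
    const std::string reserved = ";/?:&=";
    assert(PercentEncode("a&b=c", reserved) == "a%26b%3Dc");
}
```

The reserved set is a parameter because the characters that are reserved differ between URI components.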
The following lists the reserved characters for different URI components as defined in TEscapeMode:
Some characters present the possibility of being misunderstood within URIs for various reasons. These are called unsafe characters and must always be encoded. For example, the '#' character is used in URIs to indicate where a fragment identifier (a bookmark or anchor in HTML) begins.
Data characters that are allowed in a URI but do not have a reserved purpose are called "unreserved" characters. These include upper and lower case letters, decimal digits, and a limited set of punctuation marks and symbols. ASCII control characters, which are not printable, are not unreserved and must be encoded. These are the ISO-8859-1 (ISO-Latin) character range 00-1F hex (0-31 decimal) and the character 7F hex (127 decimal).
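The character classes described above can be sketched as two predicates in standard C++. This is an illustration, not the EscapeUtils implementation. The function names are hypothetical, and the punctuation set is an assumption taken from the "mark" characters of RFC 2396, which this document does not itself list. Control characters are treated here as characters that always need encoding.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: true for the non-printable ASCII control characters,
// that is 00-1F hex and 7F hex.
bool IsControl(unsigned char c) {
    return c <= 0x1F || c == 0x7F;
}

// Hypothetical helper: true for letters, decimal digits and the punctuation
// marks assumed unreserved in this sketch (the RFC 2396 "mark" set).
bool IsUnreserved(unsigned char c) {
    const bool alnum = (c >= 'A' && c <= 'Z') ||
                       (c >= 'a' && c <= 'z') ||
                       (c >= '0' && c <= '9');
    // strchr would match the terminating NUL, so exclude it explicitly.
    return alnum || (c != '\0' && std::strchr("-_.!~*'()", c) != nullptr);
}

// Usage sketch: '#' is neither unreserved nor a control character, so it is
// left to the reserved and unsafe rules.
void ClassifyExample() {
    assert(IsUnreserved('a') && IsUnreserved('~'));
    assert(!IsUnreserved('#'));
    assert(IsControl(0x7F) && !IsControl(' '));
}
```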
EscapeUtils escape-encodes and decodes unsafe data in a URI. It also supports converting Unicode data (16-bit descriptor) into UTF-8 data (8-bit descriptor) and vice versa.
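To show what the Unicode to UTF-8 conversion involves before escape encoding, here is a minimal sketch in standard C++. It uses std::u16string and std::string in place of Symbian 16-bit and 8-bit descriptors, and both function names are hypothetical; it is not the EscapeUtils API. Replacing an unpaired surrogate with U+FFFD is a choice made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: converts UTF-16 text to UTF-8 bytes.
std::string Utf16ToUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;  // unpaired high surrogate
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;      // unpaired low surrogate
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Hypothetical helper: percent-encodes every non-ASCII byte of a UTF-8 string.
std::string EscapeNonAscii(const std::string& utf8) {
    static const char kHex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : utf8) {
        if (c >= 0x80) {
            out += '%';
            out += kHex[c >> 4];
            out += kHex[c & 0x0F];
        } else {
            out += static_cast<char>(c);
        }
    }
    return out;
}

// Usage sketch: U+00E9 becomes the two UTF-8 bytes C3 A9, each then escaped.
void ConvertExample() {
    assert(EscapeNonAscii(Utf16ToUtf8(u"caf\u00E9")) == "caf%C3%A9");
}
```

Converting to UTF-8 first matters because escape encoding works on 8-bit values, while a 16-bit code unit cannot be written as a single %XX triplet.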
EscapeUtils provides the following functionality.
Copyright ©2010 Nokia Corporation and/or its subsidiary(-ies). All rights reserved. Unless otherwise stated, these materials are provided under the terms of the Eclipse Public License v1.0.