Reserved and Unreserved Characters

This section explains the concept of reserved, unsafe and unreserved characters.

Reserved characters

URIs use some characters for special purposes in defining their syntax; these are called reserved characters. Examples include - ; / ? : & =. When these characters are not used in their special role inside a URI, they need to be encoded.
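To make the encoding step concrete, here is a minimal sketch in standard C++ rather than the Symbian API. The name PercentEncode and the caller-supplied reserved set are assumptions made for illustration; EscapeUtils selects its set through TEscapeMode instead. The sketch also encodes '%' itself so that the output can be decoded unambiguously, which is a choice of this example.

```cpp
#include <string>

// Hypothetical helper, not part of EscapeUtils: replaces every character of
// `input` that appears in `reserved` (and '%' itself) with "%XX", where XX is
// the character's value in uppercase hexadecimal.
std::string PercentEncode(const std::string& input, const std::string& reserved) {
    static const char kHex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : input) {
        const bool mustEncode =
            c == '%' || reserved.find(static_cast<char>(c)) != std::string::npos;
        if (mustEncode) {
            out += '%';
            out += kHex[c >> 4];    // high nibble
            out += kHex[c & 0x0F];  // low nibble
        } else {
            out += static_cast<char>(c);
        }
    }
    return out;
}
```

For example, PercentEncode("a&b=c", "&=") returns "a%26b%3Dc": the '&' and '=' are data here, not query separators, so both are encoded.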

The following table lists the reserved characters for each URI component, as defined in TEscapeMode:

Escape Mode         URI component    Reserved Characters

EscapeNormal        None             No reserved characters
EscapeQuery         query            - ;/?:&=+$,[]
EscapePath          path             - /;=?[]
EscapeAuth          authority        - /;:@?[]
EscapeUrlEncoded    URL              ;/?:&=+$[]!\'()~

Unsafe characters

Some characters can be misinterpreted within URIs for various reasons. These are called unsafe characters and must always be encoded. For example, the '#' character is used in URIs to indicate where a fragment identifier (a bookmark or anchor in HTML) begins.
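The '#' case can be illustrated with a short sketch in standard C++. The helper SplitFragment and the example URIs are hypothetical and are not part of EscapeUtils; the sketch only shows how a consumer that splits at the first '#' would misread unencoded data.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: splits a URI reference at the first '#', returning the
// part before it and the fragment after it (empty if there is no '#').
std::pair<std::string, std::string> SplitFragment(const std::string& uri) {
    const std::string::size_type pos = uri.find('#');
    if (pos == std::string::npos) {
        return {uri, std::string()};
    }
    return {uri.substr(0, pos), uri.substr(pos + 1)};
}
```

With the input "http://example.com/find?q=c#sharp", the query is cut short to "q=c" and "sharp" is treated as a fragment. Encoding the character as "%23" gives "http://example.com/find?q=c%23sharp", which is returned whole with an empty fragment.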

Unreserved characters

Data characters that are allowed in a URI but do not have a reserved purpose are called "unreserved" characters. These include upper- and lowercase letters, decimal digits, and a limited set of punctuation marks and symbols. Non-printable ASCII control characters, that is, the ISO-8859-1 (ISO-Latin) ranges 00-1F hex (0-31 decimal) and 7F hex (127 decimal), are not unreserved and must be encoded.

EscapeUtils escape-encodes and decodes unsafe data in URIs. It also supports converting Unicode data (16-bit descriptor) to UTF-8 data (8-bit descriptor) and vice versa.
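The relationship between the two steps, conversion to UTF-8 followed by escape encoding, can be sketched in standard C++. This is an illustration only: it uses std::u16string and std::string in place of Symbian descriptors, the names Utf16ToUtf8 and EscapeNonAscii are hypothetical, and replacing an unpaired surrogate with U+FFFD is an assumption of the sketch (a real implementation may report an error instead).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Converts UTF-16 code units to UTF-8 bytes.
// Assumption: an unpaired surrogate is replaced with U+FFFD.
std::string Utf16ToUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Combine a high/low surrogate pair into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // unpaired surrogate
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Percent-encodes every non-ASCII byte (and '%') of a UTF-8 string.
std::string EscapeNonAscii(const std::string& utf8) {
    static const char kHex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char b : utf8) {
        if (b >= 0x80 || b == '%') {
            out += '%';
            out += kHex[b >> 4];
            out += kHex[b & 0x0F];
        } else {
            out += static_cast<char>(b);
        }
    }
    return out;
}
```

For the 16-bit string "café" (with é as U+00E9), Utf16ToUtf8 yields the bytes 63 61 66 C3 A9, and EscapeNonAscii turns that into "caf%C3%A9".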

EscapeUtils provides the following functionality.