14 <p>This topic describes the types of SMS encoding. </p> |
14 <p>This topic describes the types of SMS encoding. </p> |
15 <section id="GUID-F7D1E6C8-9605-57FA-9788-AF7FC72BD94C"><title>7-bit GSM encoding</title> <p>7-bit |
15 <section id="GUID-F7D1E6C8-9605-57FA-9788-AF7FC72BD94C"><title>7-bit GSM encoding</title> <p>7-bit |
16 GSM encoding supports the GSM 7-bit default alphabet and, through an escape |
16 GSM encoding supports the GSM 7-bit default alphabet and, through an escape |
17 mechanism, the GSM 7-bit default alphabet extension table. </p> <p>Figure 1 </p> <fig id="GUID-CDEE59FC-F035-5B75-8838-96E94A6714E8"> |
17 mechanism, the GSM 7-bit default alphabet extension table. </p> <p>Figure 1 </p> <fig id="GUID-CDEE59FC-F035-5B75-8838-96E94A6714E8"> |
18 <title> Escape mechanism </title> |
18 <title> Escape mechanism </title> |
19 <image href="GUID-08A6B93F-92CD-5182-B142-D353E78016F3_d0e406761_href.png" placement="inline"/> |
19 <image href="GUID-08A6B93F-92CD-5182-B142-D353E78016F3_d0e406599_href.png" placement="inline"/> |
20 </fig> <p>The GSM 7-bit default alphabet consists of 128 characters. Each |
20 </fig> <p>The GSM 7-bit default alphabet consists of 128 characters. Each |
21 character is represented by 7 bits. Ten extra characters are defined in the |
21 character is represented by 7 bits. Ten extra characters are defined in the |
22 GSM 7-bit default alphabet extension table. Each of these characters is encoded as |
22 GSM 7-bit default alphabet extension table. Each of these characters is encoded as |
23 the escape character (0x1B) followed by a second 7-bit code. For example, 0x1B 0x65 maps |
23 the escape character (0x1B) followed by a second 7-bit code. For example, 0x1B 0x65 maps |
24 to the Euro sign € (U+20AC). If an escape character byte is followed by a |
24 to the Euro sign € (U+20AC). If an escape character byte is followed by a |
41 reader can understand the word. The process of converting Á to A is called |
41 reader can understand the word. The process of converting Á to A is called |
42 a lossy conversion. </p> <p> <b>Note</b>: The 7-bit code of A (0x41) is always |
42 a lossy conversion. </p> <p> <b>Note</b>: The 7-bit code of A (0x41) is always |
43 decoded back to the Unicode letter A, never to Á, so the accent is lost. </p> <p>Figure |
43 decoded back to the Unicode letter A, never to Á, so the accent is lost. </p> <p>Figure |
44 2 </p> <fig id="GUID-ACFF9511-D5E0-5558-8008-4CD48EE0B7A1"> |
44 2 </p> <fig id="GUID-ACFF9511-D5E0-5558-8008-4CD48EE0B7A1"> |
45 <title> Lossy conversion </title> |
45 <title> Lossy conversion </title> |
46 <image href="GUID-8862E271-ABA4-5A25-8990-C0B3931E370D_d0e406801_href.png" placement="inline"/> |
46 <image href="GUID-8862E271-ABA4-5A25-8990-C0B3931E370D_d0e406639_href.png" placement="inline"/> |
47 </fig> </section> |
47 </fig> </section> |
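<p>The escape mechanism and the lossy conversion described above can be sketched in Python as follows. This is a minimal illustration, not the platform's implementation: the function names <code>encode_gsm7</code> and <code>decode_gsm7</code> are hypothetical, the <code>BASIC</code> table is only the subset of the default alphabet whose codes coincide with ASCII, and stripping accents through Unicode decomposition is an assumed fallback strategy. Codes are kept unpacked, one 7-bit value per list item. </p>

```python
import string
import unicodedata

ESCAPE = 0x1B  # escape character of the GSM 7-bit default alphabet

# The ten characters of the default alphabet extension table.
EXTENSION = {
    "\f": 0x0A, "^": 0x14, "{": 0x28, "}": 0x29, "\\": 0x2F,
    "[": 0x3C, "~": 0x3D, "]": 0x3E, "|": 0x40, "\u20ac": 0x65,
}

# Assumption: only a subset of the default alphabet (letters, digits,
# space and '?'), whose GSM codes are the same as their ASCII codes.
BASIC = {ch: ord(ch) for ch in string.ascii_letters + string.digits + " ?"}


def encode_gsm7(text):
    """Return the unpacked 7-bit codes for text."""
    codes = []
    for ch in text:
        if ch in BASIC:
            codes.append(BASIC[ch])
        elif ch in EXTENSION:
            # Escaped characters need two codes: the escape plus the code.
            codes.extend([ESCAPE, EXTENSION[ch]])
        else:
            # Lossy conversion (assumed strategy): keep the base letter
            # of the decomposed character, or '?' if that is unsupported.
            base = unicodedata.normalize("NFD", ch)[0]
            codes.append(BASIC.get(base, BASIC["?"]))
    return codes


def decode_gsm7(codes):
    """Decode unpacked 7-bit codes; unknown codes become '?' in this sketch."""
    basic_rev = {code: ch for ch, code in BASIC.items()}
    ext_rev = {code: ch for ch, code in EXTENSION.items()}
    out = []
    it = iter(codes)
    for code in it:
        if code == ESCAPE:
            out.append(ext_rev.get(next(it, None), "?"))
        else:
            out.append(basic_rev.get(code, "?"))
    return "".join(out)
```

<p>In this sketch, <code>encode_gsm7("€")</code> yields the two codes 0x1B and 0x65, while <code>encode_gsm7("Á")</code> yields the single code 0x41, which decodes to A rather than Á. </p>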
48 <section id="GUID-D2F0E6BE-932E-545D-A0C8-39017E3D67B4"><title>16-bit Unicode |
48 <section id="GUID-D2F0E6BE-932E-545D-A0C8-39017E3D67B4"><title>16-bit Unicode |
49 encoding</title> <p>Unicode is an international standard character set. It |
49 encoding</title> <p>Unicode is an international standard character set. It |
50 covers the characters of most written languages. In SMS, each Unicode character is usually |
50 covers the characters of most written languages. In SMS, each Unicode character is usually |
51 encoded in two 8-bit bytes (UCS-2), so it takes up more space than in 7-bit encoding. </p> </section> |
51 encoded in two 8-bit bytes (UCS-2), so it takes up more space than in 7-bit encoding. </p> </section> |
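<p>The difference in size between the two encodings can be estimated with a short Python sketch. The helper names <code>gsm7_packed_octets</code> and <code>ucs2_octets</code> are hypothetical, and the sketch assumes that 7-bit codes are packed back to back into octets and that every character lies in the Basic Multilingual Plane, so that it occupies exactly two bytes. </p>

```python
def gsm7_packed_octets(septets):
    """Octets needed to hold the given number of packed 7-bit codes."""
    # Round up to a whole octet; an escaped character counts as two codes.
    return (septets * 7 + 7) // 8


def ucs2_octets(text):
    """Octets needed for text at two bytes per character.

    Assumption: no character outside the Basic Multilingual Plane,
    because those would take four bytes in UTF-16.
    """
    return len(text.encode("utf-16-be"))
```

<p>Under these assumptions, 160 packed 7-bit codes and 70 two-byte characters both occupy 140 octets, which shows why a 16-bit message holds fewer characters in the same space. </p>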
72 <li id="GUID-830569B1-8ACD-5924-AF7F-15705FEF76B0"><p>The language-specific |
72 <li id="GUID-830569B1-8ACD-5924-AF7F-15705FEF76B0"><p>The language-specific |
73 basic table escapes to the language-specific extension table. It is referred to |
73 basic table escapes to the language-specific extension table. It is referred to |
74 as locking-single. </p> </li> |
74 as locking-single. </p> </li> |
75 </ul> <p>Figure 3 </p> <fig id="GUID-541CED9A-2450-5C9D-AADF-93EE59E4D77E"> |
75 </ul> <p>Figure 3 </p> <fig id="GUID-541CED9A-2450-5C9D-AADF-93EE59E4D77E"> |
76 <title> National language encoding </title> |
76 <title> National language encoding </title> |
77 <image href="GUID-44347376-702D-5648-8938-EB55AFA329EC_d0e406863_href.png" placement="inline"/> |
77 <image href="GUID-44347376-702D-5648-8938-EB55AFA329EC_d0e406701_href.png" placement="inline"/> |
78 </fig><p>The single shift mechanism is useful when a message contains only |
78 </fig><p>The single shift mechanism is useful when a message contains only |
79 a few characters outside the default GSM table. It is, however, inefficient |
79 a few characters outside the default GSM table. It is, however, inefficient |
80 when a message contains many such characters, because each escaped |
80 when a message contains many such characters, because each escaped |
81 character occupies 2 bytes. GSM-single supports more characters than locking-GSM |
81 character occupies 2 bytes. GSM-single supports more characters than locking-GSM |
82 ext, but the additional characters are in the single shift table, so each takes 2 bytes. Locking-single |
82 ext, but the additional characters are in the single shift table, so each takes 2 bytes. Locking-single |