symbian-qemu-0.9.1-12/python-2.6.1/Modules/cjkcodecs/README
changeset 1 2fb8b9db1c86
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/symbian-qemu-0.9.1-12/python-2.6.1/Modules/cjkcodecs/README	Fri Jul 31 15:01:17 2009 +0100
@@ -0,0 +1,79 @@
+To generate or modify mapping headers
+-------------------------------------
+Mapping headers are imported from CJKCodecs in pre-generated form.
+If you need to tweak them or add something, see the tools/
+subdirectory of the CJKCodecs distribution.
+
+
+
+Notes on implementation characteristics of each codec
+-----------------------------------------------------
+
+1) Big5 codec
+
+  The big5 codec maps the following characters the way cp950 does,
+  rather than following the Unicode.org mapping, which maps them to
+  0xFFFD.
+
+    BIG5        Unicode     Description
+
+    0xA15A      0x2574      SPACING UNDERSCORE
+    0xA1C3      0xFFE3      SPACING HEAVY OVERSCORE
+    0xA1C5      0x02CD      SPACING HEAVY UNDERSCORE
+    0xA1FE      0xFF0F      LT DIAG UP RIGHT TO LOW LEFT
+    0xA240      0xFF3C      LT DIAG UP LEFT TO LOW RIGHT
+    0xA2CC      0x5341      HANGZHOU NUMERAL TEN
+    0xA2CE      0x5345      HANGZHOU NUMERAL THIRTY
+
+  Because Unicode 0x5341, 0x5345, 0xFF0F and 0xFF3C are already mapped
+  to other big5 codes, round-trip compatibility is not guaranteed for
+  them.
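
The round-trip caveat can be checked from Python. This is a minimal sketch, assuming a CPython build whose bundled big5 codec still follows the table above; the variable names are illustrative only.

```python
# Sketch: BIG5 0xA2CC decodes to U+5341, but U+5341 already has
# another big5 code, so encoding it back does not return 0xA2CC.
raw = b"\xa2\xcc"            # HANGZHOU NUMERAL TEN in the table above

text = raw.decode("big5")    # cp950-style mapping instead of U+FFFD
back = text.encode("big5")   # encoder picks the other big5 code

print(f"decoded:    U+{ord(text):04X}")
print(f"re-encoded: 0x{back.hex().upper()}")
print("round trip preserved:", back == raw)
```

Code that must preserve the original bytes should keep them alongside the decoded text rather than rely on re-encoding.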
+
+
+2) cp932 codec
+
+  To conform to the actual Windows mapping, the cp932 codec maps the
+  following code points in addition to the official cp932 mapping.
+
+    CP932     Unicode     Description
+
+    0x80      0x80        UNDEFINED
+    0xA0      0xF8F0      UNDEFINED
+    0xFD      0xF8F1      UNDEFINED
+    0xFE      0xF8F2      UNDEFINED
+    0xFF      0xF8F3      UNDEFINED
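
A short sketch of the extra single-byte mappings, assuming the interpreter's cp932 codec matches the table above. `EXTRA` and `results` are names introduced here for illustration.

```python
# Sketch: decode each extra single-byte code and record the code point.
# EXTRA restates the table above: cp932 byte -> Unicode code point.
EXTRA = {0x80: 0x0080, 0xA0: 0xF8F0, 0xFD: 0xF8F1, 0xFE: 0xF8F2, 0xFF: 0xF8F3}

results = {}
for byte in EXTRA:
    ch = bytes([byte]).decode("cp932")
    results[byte] = ord(ch)
    print(f"0x{byte:02X} -> U+{ord(ch):04X}")
```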
+
+
+3) euc-jisx0213 codec
+
+  The euc-jisx0213 codec maps JIS X 0213 Plane 1 code 0x2140 to
+  Unicode U+FF3C instead of U+005C as in unicode.org's mapping.
+  Because euc-jisx0213 already has REVERSE SOLIDUS at 0x5c and A1C0
+  is displayed as a full-width character, mapping to U+FF3C makes
+  more sense.
+
+  The euc-jisx0213 codec can also decode JIS X 0212 codes in code
+  set 3. Because JIS X 0212 and JIS X 0213 Plane 2 do not overlap
+  each other, this does not break standard conformance (and JIS X
+  0213 Plane 2 is designed to be used this way). When encoding, the
+  codec tries to encode kanji characters in this order:
+
+    JIS X 0213 Plane 1 -> JIS X 0213 Plane 2 -> JIS X 0212
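
The 0x2140 mapping can be observed by decoding its EUC form, A1C0. This is a minimal sketch, assuming the bundled euc_jisx0213 codec behaves as these notes describe; it does not exercise the encoding order.

```python
# Sketch: JIS X 0213 Plane 1 code 0x2140 is the byte pair A1 C0 in EUC.
wide = b"\xa1\xc0".decode("euc_jisx0213")   # full-width form
narrow = b"\x5c".decode("euc_jisx0213")     # plain REVERSE SOLIDUS

print(f"A1C0 -> U+{ord(wide):04X}")
print(f"5C   -> U+{ord(narrow):04X}")
```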
+
+
+4) euc-jp codec
+
+  For compatibility, the euc-jp codec deviates from the standard
+  mapping on these points:
+   - U+FF3C FULLWIDTH REVERSE SOLIDUS is mapped to EUC-JP A1C0 (and vice versa)
+   - U+00A5 YEN SIGN is mapped to EUC-JP 0x5c. (one way)
+   - U+203E OVERLINE is mapped to EUC-JP 0x7e. (one way)
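
The difference between the two-way and one-way mappings can be sketched as follows, assuming the interpreter's euc_jp codec follows the list above. The variable names are illustrative.

```python
# Sketch: U+FF3C maps both ways; YEN SIGN and OVERLINE only encode.
fullwidth = "\uff3c".encode("euc_jp")       # expected A1 C0
restored = fullwidth.decode("euc_jp")       # maps back to U+FF3C

yen = "\u00a5".encode("euc_jp")             # lands on 0x5c
overline = "\u203e".encode("euc_jp")        # lands on 0x7e

# Decoding does not bring the original characters back (one way).
yen_back = yen.decode("euc_jp")
overline_back = overline.decode("euc_jp")

print(fullwidth.hex(), f"U+{ord(restored):04X}")
print(yen.hex(), f"U+{ord(yen_back):04X}")
print(overline.hex(), f"U+{ord(overline_back):04X}")
```

Text containing U+00A5 or U+203E therefore changes when it passes through an encode and decode cycle with this codec.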
+
+
+5) shift-jis codec
+
+  For compatibility, the shift-jis codec maps the 0x20-0x7e area
+  directly to U+20-U+7E instead of using JIS X 0201. The differences
+  are:
+   - U+005C REVERSE SOLIDUS is mapped to SHIFT-JIS 0x5c.
+   - U+007E TILDE is mapped to SHIFT-JIS 0x7e.
+   - U+FF3C FULLWIDTH REVERSE SOLIDUS is mapped to SHIFT-JIS 815f.
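
A brief sketch of the direct mapping, assuming the interpreter's shift_jis codec matches the points above. `printable` and the other names are introduced here for illustration.

```python
# Sketch: the 0x20-0x7e area decodes to the same code points as ASCII.
printable = bytes(range(0x20, 0x7F))
decoded = printable.decode("shift_jis")

backslash = "\\".encode("shift_jis")        # stays at 0x5c
tilde = "~".encode("shift_jis")             # stays at 0x7e
fullwidth = "\uff3c".encode("shift_jis")    # two-byte code 81 5f

print("same as ASCII:", decoded == printable.decode("ascii"))
print(backslash.hex(), tilde.hex(), fullwidth.hex())
```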
+