| |
| | GB2312 - The real meaning from Timesharetalk wikipedia |
 | | Compared to UTF-8, GB2312 (whether native or encoded in EUC-CN) is also more storage efficient for Chinese text: every Chinese character takes two bytes, while UTF-8 needs at least three bytes for the same character. |
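 | | To map a code point to bytes (as in EUC-CN), split the four-digit decimal code point into its first two digits (the thousands and hundreds) and its last two digits (the tens and ones). Add hexadecimal A0 to the first pair to form the high byte, and add hexadecimal A0 to the second pair to form the low byte. |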
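 | | So, for example, if you have the GB2312 code point 4566 ("外", meaning "foreign"), the high byte comes from 45 and the low byte comes from 66. In hexadecimal, 45 is 2D and 66 is 42, so the high byte is 2D + A0 = CD and the low byte is 42 + A0 = E2, giving the byte sequence CD E2. |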
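
The mapping described above can be sketched in a few lines of Python. This is only an illustration, not part of the quoted article: the function name `gb2312_to_euc_cn` is hypothetical, and the sketch assumes the code point is given as a four-digit decimal number whose two halves each lie in the range 1 to 94. Python's built-in `gb2312` codec is used to check the result.

```python
def gb2312_to_euc_cn(code_point: int) -> bytes:
    """Convert a four-digit decimal GB2312 code point to its two EUC-CN bytes.

    Hypothetical helper for illustration; assumes both halves are in 1..94.
    """
    # First two digits form the high half, last two digits the low half.
    high, low = divmod(code_point, 100)
    if not (1 <= high <= 94 and 1 <= low <= 94):
        raise ValueError(f"not a valid GB2312 code point: {code_point}")
    # Add hexadecimal A0 to each half to get the high and low bytes.
    return bytes([high + 0xA0, low + 0xA0])


encoded = gb2312_to_euc_cn(4566)
print(encoded.hex())                    # cde2
character = encoded.decode("gb2312")    # decode with Python's built-in codec
print(len(encoded), len(character.encode("utf-8")))  # 2 3
```

The last line also illustrates the storage comparison: the same character occupies two bytes in GB2312 and three bytes in UTF-8.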
| www.timesharetalk.co.uk /wiki.asp?k=GB2312 (551 words) |
|