Factbites
 Where results make sense

Topic: Chinese character encoding


Related Topics
CJK

  
  RFC 1922 (rfc1922) - Chinese Character Encoding for Internet Messages
To use this character scheme with MIME, CN-GB is used as the value for the charset parameter: Content-Type: text/plain; charset=cn-gb; charset-edition=1980 Note: The "charset-edition" is a new MIME parameter described in section 4.1 of the "Specification" part of this document.
To use this character scheme with MIME, CN-Big5 is used as the value for the charset parameter: Content-Type: text/plain; charset=cn-big5; charset-edition=1984 Note: The "charset-edition" is a new MIME parameter described in section 4.1 of the "Specification" part of this document.
Preferably the implementation should display the characters with glyphs appropriate to the typographic tradition that is implied in the encoding of the received text.
www.faqs.org /rfcs/rfc1922.html   (3800 words)
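The Content-Type parameters quoted in this entry can be read back with an ordinary MIME header parser. The following is a minimal sketch using Python's standard email package. The table CODEC_FOR is an assumption made for illustration: Python registers no codec under the names cn-gb or cn-big5, so the sketch maps them to the gb2312 and big5 codecs.

```python
from email.message import Message

# Hypothetical mapping from the RFC 1922 charset names to Python codec names.
# This table is an assumption of the sketch, not part of the RFC.
CODEC_FOR = {"cn-gb": "gb2312", "cn-big5": "big5"}


def parse_content_type(value: str) -> tuple[str, str, str]:
    """Return (media type, charset, charset-edition) from a header value."""
    msg = Message()
    msg["Content-Type"] = value
    charset = msg.get_content_charset()          # lower-cased charset name
    edition = msg.get_param("charset-edition")   # generic parameter lookup
    return msg.get_content_type(), charset, edition


ctype, charset, edition = parse_content_type(
    "text/plain; charset=cn-gb; charset-edition=1980"
)

# Two Chinese characters as 8-bit GB 2312 bytes, decoded via the mapped codec.
body = b"\xd6\xd0\xce\xc4"
text = body.decode(CODEC_FOR[charset])
```

Here charset-edition is treated as an opaque string; the sketch does not validate it against the editions the RFC lists.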

  
 Method and system for inputting simplified form and/or original complex form of Chinese character - Patent 5197810
The Chinese character display device of the present invention is a common display device, the screen of which is divided into a common editing area and a Chinese character inputting prompting area.
For each Chinese character itself is a radical and the number of disassembled radicals after being added by the last stroke thereof is still less than four, it is further added by itself, and the codes of radicals in the radical sequence thus formed are taken as the code of this Chinese character.
Generally, the characters in each duplicatively coded Chinese character group are sorted according to their frequencies of usage, and the character with the highest frequency is defined on the default position.
www.freepatentsonline.com /5197810.html   (7487 words)

  
 NationMaster - Encyclopedia: GB18030
A character encoding is a code that pairs a set of characters (such as an alphabet or syllabary) with a set of something else, such as numbers or electrical pulses.
GB18030 also maintains compatibility with GBK, which was the pre-existing standard character encoding used in the PRC, with the aim of simplifying the upgrade of data and software to use GB18030.
This character set is of historical significance since it is the first widely used character set including characters whose Universal Character Set character numbers (or code points) exceed the value 65,535.
www.nationmaster.com /encyclopedia/GB18030   (1587 words)
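The two properties mentioned in this entry (compatibility with GBK and coverage of code points above 65,535) can be observed with Python's built-in gb18030 and gbk codecs. This is a small sketch; the helper name byte_len and the sample characters are chosen for illustration only.

```python
def byte_len(ch: str, codec: str = "gb18030") -> int:
    """Number of bytes the codec uses for a single character."""
    return len(ch.encode(codec))


ascii_len = byte_len("A")        # ASCII stays one byte
bmp_len = byte_len("中")          # a common Han character takes two bytes

# U+20000 lies above 65,535 (0xFFFF), outside the Basic Multilingual Plane.
supplementary = chr(0x20000)
supp_len = byte_len(supplementary)

# For a character that GBK already covers, both codecs give the same bytes.
same_as_gbk = "中".encode("gb18030") == "中".encode("gbk")

# The four-byte form decodes back to the original code point.
round_trip = supplementary.encode("gb18030").decode("gb18030") == supplementary
```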

  
 Chinese character encoding - Definition, explanation
Chinese character encoding is needed for the display of Chinese characters in computers, used in the Chinese, Japanese, Korean, and Vietnamese languages (collectively CJK).
The opposite conversion often results in data loss when converting to early forms of the GB character set (namely GB 2312-80): because the mapping between traditional and simplified glyphs is one-to-many, some characters will inevitably be the wrong choice in some usages.
The issue of which encoding to use can also have political implications, as GB is the official standard of the People's Republic of China and Big5 is a de facto standard of Taiwan.
www.calsky.com /lexikon/en/txt/c/ch/chinese_character_encoding.php   (464 words)

  
 [No title]
Chinese character encoding entry on a keyboard is an entry method widely used around the world; common mini-keyboards are mainly used, and special mini-keyboards for Chinese character entry are used in some cases as well.
Encoding entry is a method in which a code, assigned to each Chinese character according to certain rules, is entered in place of the complete Chinese character.
For the character "zhong", the five letters z-h-o-n-g must be input when complete phonetic encoding is used; when the latter is used, the consonant zh and the vowel ong, each assigned to a particular key, are input separately.
www.cicc.or.jp /english/hyoujyunka/af02/2-13.html   (1643 words)

  
 Chinese Character Topic Center - Chinese
Periodic table (Chinese): Below is a periodic table using Chinese characters as symbols for chemical elements.
Chinese character encoding: Chinese character encoding is needed for the display of Chinese characters in computers, used in the Chinese...
Chinese input methods for computers: Because the Chinese language uses a logographic script, one in which one "...
www.famouschinese.com /public/Chinese_Character.html   (617 words)

  
 Glossary
A character that is not identical to its canonical decomposition.
A character that is equivalent to a sequence of one or more other characters, according to the decomposition mappings found in the Unicode Character Database, and those described in Section 3.12, Conjoining Jamo Behavior.
The diaeresis is not distinguished from the umlaut in the Unicode character encoding.
www.unicode.org /glossary   (8703 words)

  
 Chinese Character Codes
It is the encoding standard used to represent Simplified Chinese characters.
Thus, in a document that contain Chinese characters and regular ASCII characters, the ASCII characters are still represented with a single byte.
Since the characters in different plane may have the same coding, escape sequence is necessary to switch between character planes.
www.khngai.com /chinese/charmap   (579 words)
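The mixed single-byte and double-byte layout described in this entry can be illustrated by splitting a byte string on the high bit. This is a sketch that assumes the 8-bit (EUC-CN) form of GB 2312 as implemented by Python's gb2312 codec; the function name split_euc_cn is hypothetical.

```python
def split_euc_cn(data: bytes) -> list[bytes]:
    """Split EUC-CN bytes into one-byte ASCII and two-byte Chinese units."""
    units = []
    i = 0
    while i < len(data):
        # A byte with the high bit set starts a two-byte character;
        # anything else is a single ASCII byte.
        width = 2 if data[i] & 0x80 else 1
        units.append(data[i:i + width])
        i += width
    return units


sample = "ab中文".encode("gb2312")
units = split_euc_cn(sample)
decoded = [u.decode("gb2312") for u in units]
```

The sketch does not handle the escape sequences for switching planes that the entry also mentions.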

  
 郭如玉: Thesis Abstract
The study adopts the experimental paradigm being used in Sun's study (1991) to investigate the process of Chinese character encoding by native and non-native readers.
The result shows that native readers encode Chinese characters faster and more accurately than the non-native readers; however, the speed and accuracy of encoding do not increase with the second language proficiency.
Moreover, frequency does not have an effect on the speed of Chinese character encoding; but it has an effect on the accuracy of encoding for the highly proficient non-native readers.
www.ntnu.edu.tw /tcsl/IntroTCSL/S_research/1997/guolu.htm   (205 words)

  
 RFC 1922 - Chinese Character Encoding for Internet Messages. HF. Zhu, DY. Hu, ZG. Wang, TC. Kao, WCH. Chang, M. Crispin.
The shift SO (one byte with hexadecimal value 0E) declares that subsequent bytes are interpreted in the character set defined by SOdesignation.
CN-GB-12345 and CN-GB-ISOIR165 support ASCII in a similar manner to CN-GB; the MSB of Chinese characters is set to 1 to distinguish from ASCII.
... includes all the characters of GB 2312-80, GB 12345-90, GB 8565-89, CNS 11643's plane 1 and 2, and part of some other standards) and therefore can be used to transport Chinese characters in the Internet community.
rfc.sunsite.dk /rfc/rfc1922.html   (4194 words)
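The SO shift and the MSB convention quoted in this entry can be combined in a toy decoder. This is a deliberately partial sketch, since Python ships no ISO-2022-CN codec. It assumes that only the GB 2312 designation (ESC $ ) A) appears and that SI (hexadecimal 0F) shifts back to ASCII. It ignores the other designations, the single shifts, and the end-of-line reset. The function name decode_iso2022cn_gb is hypothetical.

```python
SO, SI = 0x0E, 0x0F
GB2312_TO_SO = b"\x1b$)A"   # ESC $ ) A: designate GB 2312 for use after SO


def decode_iso2022cn_gb(data: bytes) -> str:
    """Decode 7-bit text that uses only the GB 2312 SO designation."""
    out = []
    i = 0
    designated = False
    shifted = False
    while i < len(data):
        if data.startswith(GB2312_TO_SO, i):
            designated = True
            i += len(GB2312_TO_SO)
        elif data[i] == SO:
            if not designated:
                raise ValueError("SO seen before any SO designation")
            shifted = True
            i += 1
        elif data[i] == SI:
            shifted = False
            i += 1
        elif shifted:
            # Set the MSB on both bytes to get the 8-bit form, then decode.
            pair = bytes(b | 0x80 for b in data[i:i + 2])
            out.append(pair.decode("gb2312"))
            i += 2
        else:
            out.append(chr(data[i]))
            i += 1
    return "".join(out)


message = b"Hi \x1b$)A\x0e\x56\x50\x4e\x44\x0f!"
decoded_message = decode_iso2022cn_gb(message)
```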

  
 character encoding | English | Dictionary & Translation by Babylon
(Or "character encoding scheme") A mapping of binary values to code positions and back; generally a 1:1 (bijective) mapping.
Unicode and many CJK coded character sets use many more than 255 positions, requiring more complex mappings: sometimes the characters are mapped onto pairs of bytes (see DBCS).
To avoid this problem, character encodings such as UTF-8 were devised.
www.babylon.com /definition/character_encoding   (161 words)

  
 HTML Document Representation
As some character encodings cannot directly represent all characters an author may want to include in a document, HTML offers other mechanisms, called character references, for referring to any character.
The document character set, however, does not suffice to allow user agents to correctly interpret HTML documents as they are typically exchanged -- encoded as a sequence of bytes in a file or during a network transmission.
A user agent may not be able to render all characters in a document meaningfully, for instance, because the user agent lacks a suitable font, a character has a value that may not be expressed in the user agent's internal character encoding, etc.
www.w3.org /TR/REC-html40/charset.html   (2143 words)
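The character-reference mechanism described in this entry can be tried out with the Python standard library. In this sketch, text is written out as ASCII with numeric character references and then read back. The helper name to_char_refs is hypothetical.

```python
import html


def to_char_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


refs = to_char_refs("GB: 中文")

# html.unescape resolves decimal and hexadecimal references alike.
restored = html.unescape(refs)
hex_form = html.unescape("&#x4E2D;&#x6587;")
```

Because the references are pure ASCII, the result survives any byte encoding that is ASCII-compatible.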

  
 Chinatown Online - your guide to all things Chinese
It is a one, two or four byte encoding and has defined about 6763 Chinese characters (excluding all symbols).
Big 5 whose name refers to the five companies that collaborated in its development, was established in 1984 and is the character encoding standard most commonly used for traditional Chinese characters.
There is however no mandated connection between the encoding system and the font used to display the characters, though font and encoding are always tied together for practical reasons.
www.chinatown.com.au /eng/article.asp?masterid=160&articleid=788   (318 words)

  
 Mirago : Computers: Software: Globalization: Character Encoding
Characters and Encodings - A tutorial on character code issues in digital processing and transfer of text data, on the Internet or otherwise.
ECMA: Character Code Structure and Extension Techniques - Specifies the structure of ECMA-35, for 8-bit codes and 7-bit codes which provide for the coding of character sets, with a detailed PDF document.
IANA: Character Sets - The official names for character sets that may be used in the Internet and referred to in Internet documentation - held at the Internet Assigned Number Authority.
www.mirago.co.uk /scripts/dir.aspx?cat=Top%2fComputers%2fSoftware%2fGlobalization%2fCharacter_Encoding   (593 words)

  
 RFC 1922 - Chinese Character Encoding for Internet Messages
The shift SO (one byte with hexadecimal value 0E) declares that subsequent bytes are interpreted in the character set defined by SOdesignation.
For instance, ISO-2022-CN and ISO-2022-CN-EXT can be used to support the popular Big5 codeset, because the first two planes of CNS-11643 contain the same Chinese characters as Big5's "common part" except two duplicate characters.
CN-GB-12345 and CN-GB-ISOIR165 support ASCII in a similar manner to CN-GB; the MSB of Chinese characters is set to 1 to distinguish from ASCII.
www.packetizer.com /rfc/rfc1922   (3928 words)

  
 Big5 - Chinese Character - Chinese
In most vendor extensions, extended characters are placed in the various zones reserved for user-defined characters, each of which are normally regarded as associated with the preceding zone.
For example, additional "graphical characters" (e.g., punctuation marks) would be expected to be placed in the 0xa3c0–0xa3fe range, and additional ideograms would be placed in either the 0xc6a1–0xc8fe or the 0xf9d6–0xfefe range.
Characters encoded in Big5 do not always represent things that can be readily used in plain text files; an example is "citation mark" (0xa1ca, ﹋), which is, when used, required to be typeset under the title of literary works.
www.famouschinese.com /virtual/Big5   (951 words)
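The ranges listed in this entry lend themselves to a simple lookup. This sketch classifies a two-byte Big5 code by the zones quoted above. The table EXTENSION_ZONES and both function names are hypothetical. The check is a plain numeric comparison, so it does not verify that the second byte is a valid Big5 trail byte.

```python
from typing import Optional

# Zones quoted in the entry above: (first code, last code, expected content).
EXTENSION_ZONES = [
    (0xA3C0, 0xA3FE, "additional graphical characters"),
    (0xC6A1, 0xC8FE, "additional ideograms"),
    (0xF9D6, 0xFEFE, "additional ideograms"),
]


def big5_code(raw: bytes) -> int:
    """Combine a lead byte and a trail byte into one 16-bit code."""
    if len(raw) != 2:
        raise ValueError("expected exactly two bytes")
    return int.from_bytes(raw, "big")


def extension_zone(code: int) -> Optional[str]:
    """Return the zone label for a code, or None if it is outside all zones."""
    for low, high, label in EXTENSION_ZONES:
        if low <= code <= high:
            return label
    return None


punct_zone = extension_zone(big5_code(b"\xa3\xc5"))
ideogram_zone = extension_zone(0xF9D6)
citation_mark_zone = extension_zone(0xA1CA)   # standard character, no zone
```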

  
 Chinese Character Encodings Information
A character set is different from a font that supports that character set.
Viewing a Chinese document on a program that thinks it is in English will also produce an unintelligible document with lots of accented letters and symbols.
The characters in Unicode are a superset of the characters in GB and Big5 so it is easy to convert directly from GB or Big5 into Unicode.
www.chinesecomputing.com /encodings   (351 words)
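Both effects described in this entry can be reproduced in a few lines: the accented-letter garbage that appears when Chinese bytes are read as a Western encoding, and the conversion from GB or Big5 through Unicode. The sketch uses Python's built-in gb2312, big5 and latin-1 codecs. The function name convert and the sample text are illustrative.

```python
def convert(data: bytes, source: str, target: str = "utf-8") -> bytes:
    """Re-encode bytes by decoding to Unicode text first."""
    return data.decode(source).encode(target)


gb_bytes = "中文".encode("gb2312")
big5_bytes = "中文".encode("big5")

# Reading GB bytes as Latin-1 gives accented letters instead of Chinese.
mojibake = gb_bytes.decode("latin-1")

# Both legacy encodings arrive at the same Unicode text.
utf8_from_gb = convert(gb_bytes, "gb2312")
utf8_from_big5 = convert(big5_bytes, "big5")
```

The two source byte strings differ, yet both conversions produce identical UTF-8 output.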

  
 HTML Unleashed. Internationalizing HTML: Character Encoding Standards - webreference.com
As explained in Chapter 3, "SGML and the HTML DTD," a character encoding (often called character set or, more precisely, coded character set) is defined---first, by the numerical range of codes; second, by the repertoire of characters; and third, by a mapping between these two sets.
As a rule, character set standards are reluctant to exactly define the functions of control characters, as these functions may vary considerably depending on the nature of text processing software.
All of these encodings are backwards compatible with ISO 646; that is, the first 128 characters in each ISO 8859 code table are identical to 7-bit ASCII, while the national characters are always located in the upper 128 code positions.
www.webreference.com /dlab/books/html/39-1.html   (2429 words)

  
 .NET Character Encoding Conversion Class
All major character encodings are supported, including utf-8, ucs-2 (unicode) iso-8859-*, windows-*, shift-jis, iso-2022-jp/kr/cn, euc-jp/kr/cn, gb2312, big5, tis-620, and more.
Character encoding of source data, such as "iso-8859-1", "utf-8", or "euc-kr".
Character encoding of destination, such as "iso-8859-1", "utf-8", or "euc-kr".
www.chilkatsoft.com /dotNetDoc/ClassCharsetConvert.htm   (627 words)

  
 CJK - Wikinfo
These languages all share the fact that their writing systems are based partly on Han (Chinese) characters -- Hanzi in Chinese, Kanji in Japanese, and Hanja in Korean --, which require between 4000 characters for a basic vocabulary to 40,000 characters for reasonably complete coverage.
This number of characters cannot fit in the 256-character code space of 8-bit encodings, and therefore requires at least a 16-bit fixed width character encoding or multi-byte variable-length encodings.
There is much controversy among experts of Chinese characters about the desirability and technical merit of the "Han unification" process used to map multiple Chinese and Japanese characters sets into a single set of unified glyphs.
www.wikinfo.org /wiki.php?title=CJK   (853 words)

  
 telecode(5)
In plane 1, standard characters occupy positions 0001 to 8045; the remaining 791 positions are for user-defined characters.
In plane 2, standard characters occupy positions 0001 to 8489; the remaining 346 positions are for user-defined characters.
Plane 1 Character Encoding To differentiate plane 1 code from plane 2 code, the most significant bit (MSB) is set on in both bytes of a plane 1 character code.
www.uwm.edu /cgi-bin/Dept/IMT/wwwman?topic=telecode(5)&msection=3   (365 words)

  
 PHP-Nuke
This is the character set suitable in the USA and western Europe.
The character set determines what characters are allowed in names and how things are sorted by the ORDER BY and GROUP BY clauses of the SELECT statement.
Your character encoding should be set to the one that is appropriate for your language; see the Table with Character Sets and Corresponding 4.1 Character Set/Collation Pairs.
www.phpnuke.org /modules.php?name=PHP-Nuke_HOWTO&page=make-encyclopedia-international.html   (628 words)

  
 Big5 - Wikinfo
Big5's Chinese name 大五碼 (pinyin: Dawu Ma), means "Big Five Encoding." But it is unknown which language is the origin of the translation in this case.
According to some accounts, the Big5 encoding was popularized by its adoption in several commercial software packages, especially the ET Chinese system, which ran on MS-DOS.
However, Cantonese uses many archaic Chinese characters that were not available in the normal Big5 character set.
www.wikinfo.org /wiki.php?title=Big5   (979 words)

  

Copyright © 2005-2007 www.factbites.com Usage implies agreement with terms.