Factbites
 Where results make sense

Topic: JIS X 0208


In the News (Mon 16 Nov 09)

  
  JIS X 0208 - Wikipedia
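The arrangement of kana in JIS X 0208 differs from the arrangement of katakana in JIS X 0201. In JIS X 0201, the small letters (small-form kana) and the large letters (plain, unvoiced kana) are each arranged separately in gojūon order (ヲァィゥェォャュョッーアイウエオ……ラリルレロワン). In JIS X 0208, by contrast, small letters, large letters, letters with dakuten, and letters with handakuten are all arranged together in gojūon order; where letters rank equally in gojūon order, they are placed in the order small, large, with dakuten, with handakuten (ぁあぃいぅうぇえぉお……っつづ……はばぱひびぴふぶぷへべぺほぼぽ……ゎわゐゑをん). This arrangement was adopted to make simple dictionary-order sorting of kana strings easy (Yasuoka et al. 2006).
JIS X 0213 (extended kanji) was designed "with the aim of providing a character set sufficient to encode modern Japanese, which JIS X 0208 originally intended to encode," and specifies a character set that extends the kanji set of JIS X 0208. The drafters of JIS X 0213 cite support for the printed standard glyph forms and for the new jinmeiyō kanji (kanji for personal names), among other advantages, and recommend migrating from JIS X 0208 to JIS X 0213.
JIS X 0212 (supplementary kanji) specifies characters and their codes as a supplement to JIS X 0208, for information interchange that requires characters not included in JIS X 0208. JIS X 0212 includes 「〆」, which JIS X 0208 places at row 1, cell 26 as a non-kanji, as a kanji at row 16, cell 17. It also includes characters corresponding to the pre-change glyphs at 28 of the row-cell positions whose glyphs were changed by the second edition of JIS X 0208. Apart from these, it includes no characters in common with JIS X 0208.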
ja.wikipedia.org /wiki/JIS_X_0208#.E9.81.A9.E7.94.A8.E7.AF.84.E5.9B.B2.E3.81.8A.E3.82.88.E3.81.B3.E9.81.A9.E5.90.88.E6.80.A7   (441 words)

  
 [No title]
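JIS X 0201 (1976) to Unicode: character code table (file encoded in Shift-JIS)
JIS X 0208 (1990) to Unicode: kanji code table (file encoded in Shift-JIS; the UTF-8 version is available here)
JIS X 0212 (1990) to Unicode: supplementary kanji code table (file encoded in UTF-8)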
charset.7jp.net   (35 words)

  
  (URW)++ Japanese Fonts
JIS X 0208 was again revised in January 1997 and published as JIS X 0208-1997.
To supplement the characters for people's names and the older Chinese characters not included in JIS X 0208, a supplementary set of 6067 characters (5801 Chinese characters and 266 symbols) was established in October 1990; the set is known as JIS X 0212-1990.
After JIS X 0208-1997 was published, work on coding the third and fourth planes of Chinese characters was under way, and JIS X 0213-2000 was accordingly established on January 20, 2000.
www.urwpp.de /english/fonttechnologie/japfonts_cont.html   (385 words)

  
 JIS X 0208 - Wikipedia, the free encyclopedia
JIS X 0208 is incorporated into many Japanese encodings, such as Shift JIS, EUC-JP and ISO 2022-JP.
This is essentially the same as the order of characters by radical and stroke count used in a kanji dictionary.
During this process, some mistakes were made in the transmission of certain characters (e.g., a crease in a paper being interpreted as a stroke, or a scribbled character being incorrectly read), resulting in about 20 characters that have no known instances of use.
en.wikipedia.org /wiki/JIS_X_0208   (285 words)

  
 Character Set List
The arrangement of the JIS X 0208 kanji does not seem to be connected with that of the Joyo kanji.
The relative obscurity of JIS X 0212, the tendency to just use Unicode 3.1 instead of JIS X 0213, and the deficiencies of JIS X 0208 mean that at the moment, when a system is said to use "JIS", a proprietary extension of JIS X 0208 is often what is meant.
This prevented the unification of those characters that had separate variants in JIS X 0208, and even in JIS X 0213 there are code points (61, in fact) assigned to what most commentators say are characters already in Unicode, but which now have to be given their own code points for round-trip compatibility.
www.jbrowse.com /text/unij.html   (9002 words)

  
 Traditional CJK charsets
The JIS X 0208 characters are organized in 94 rows ("ku") of 94 cells ("ten") so that they can be mapped over the 94 printable G0 characters of ASCII as specified by ISO 2022.
JIS X 0208 is thus structurally limited to 94×94 = 8'836 characters and, with that low number, is no feasible replacement for Unicode with its 40'000+ characters.
GB 2312 is the Chinese equivalent to JIS X 0208, holding 6'763 hanzi; KS C 5601 is the Korean DBCS, holding 4'888 hanja and 2'350 Hangul syllables; and CNS 11643 Plane 1 is the Taiwanese (Traditional Chinese) standard (although the industry standard "Big5", which is not formally specified, is used more often in Taiwan).
czyborra.com /charsets/cjk.html   (470 words)
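
The row/cell layout described in the snippet above can be illustrated with a short Python sketch. It assumes the usual convention that row ("ku") and cell ("ten") numbers 1 to 94 map onto the printable 7-bit range 0x21 to 0x7E by adding 0x20; the function names are hypothetical and chosen only for this illustration.

```python
# Sketch: convert between a kuten (row, cell) pair and the two 7-bit
# JIS bytes. Function names are illustrative, not from any library.

ROWS = 94
CELLS = 94
CAPACITY = ROWS * CELLS  # structural upper bound on the character count


def kuten_to_jis(ku, ten):
    """Return the two 7-bit JIS bytes for a row/cell pair (both 1..94)."""
    if not (1 <= ku <= ROWS and 1 <= ten <= CELLS):
        raise ValueError("ku and ten must both be in 1..94")
    return ku + 0x20, ten + 0x20


def jis_to_kuten(b1, b2):
    """Inverse of kuten_to_jis: recover (ku, ten) from two JIS bytes."""
    if not (0x21 <= b1 <= 0x7E and 0x21 <= b2 <= 0x7E):
        raise ValueError("JIS bytes must both be in 0x21..0x7E")
    return b1 - 0x20, b2 - 0x20
```

For instance, `kuten_to_jis(16, 1)` gives the byte pair for row 16, cell 1, the position where the Level 1 kanji begin.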

  
 XML Japanese Profile
NOTE: [JIS X 0208]:1978 is not a source standard of [Unicode 3.0], which [XML] employs as the coded character set, and therefore, the relationship between these [JIS X 0208] and [Unicode 3.0] is not clear.
X-sjis-jisx0221-1995 is a conversion table derived from the conversion table in [JIS X 0221] Appendix 3 for the shift encoding which is specified in [JIS X 0208] Appendix 1.
X-eucjp-jisx0221-1995 is derived from the one in Appendix 3 of [JIS X 0221].
www.w3.org /TR/2000/NOTE-japanese-xml-20000414   (3923 words)

  
 [Ping] Japanese text encoding
JIS Roman runs from 0 to $7f and is identical to ASCII except for a few minor differences (notably, the backslash at 92 is instead a yen symbol, and the tilde at 126 is replaced by an overbar).
The JIS values get all rearranged in order to reserve the range $a0 to $df for a set of 64 half-width katakana; to accomplish this, the characters are squashed into half as many columns (values for the first byte) but twice as many rows (values for the second byte).
The figure shows the encoding ranges for JIS: the first byte will land either from $81 to $9f or from $e0 to $ef, and the second byte will land either from $40 to $7e or from $80 to $fc.
lfw.org /text/jp.html   (978 words)
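
The "squashing" of JIS rows into Shift-JIS byte ranges described above can be sketched in Python. This is the commonly cited conversion arithmetic, written here as an illustration rather than taken from the quoted page; `jis_to_sjis` is a hypothetical name. Two JIS rows share one Shift-JIS first byte, and the second byte tells them apart.

```python
# Sketch: map two 7-bit JIS X 0208 bytes (each 0x21..0x7E) to Shift-JIS.

def jis_to_sjis(j1, j2):
    if not (0x21 <= j1 <= 0x7E and 0x21 <= j2 <= 0x7E):
        raise ValueError("JIS bytes must both be in 0x21..0x7E")
    if j1 % 2 == 1:
        # Odd row: second byte lands in 0x40..0x9E, skipping 0x7F.
        s2 = j2 + 0x1F
        if s2 >= 0x7F:
            s2 += 1
    else:
        # Even row: second byte lands in 0x9F..0xFC.
        s2 = j2 + 0x7E
    # Two rows per first byte; jump over 0xA0..0xDF, which is reserved
    # for the half-width katakana.
    s1 = (j1 + 1) // 2 + 0x70
    if s1 >= 0xA0:
        s1 += 0x40
    return s1, s2
```

The resulting first byte falls in 0x81 to 0x9F or 0xE0 to 0xEF, and the second byte in 0x40 to 0x7E or 0x80 to 0xFC, matching the ranges given in the snippet.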

  
 JIS X 0208@Everything2.com
JIS X 0208 is the standard Japanese character set used on most computers.
The set was originally defined in 1983 as a replacement for JIS C 6226; it has been revised twice since then, in 1990 and 1997.
Each character in the JIS X 0208 set has a hexadecimal "kuten" number assigned to it.
everything2.com /index.pl?node_id=1393117   (258 words)

  
 XML Japanese Profile (Second Edition)
The use of [JIS X 0208]:1978 (the first version of JIS X 0208) is an error; results are undefined.
Likewise, Halfwidth Katakana of [JIS X 0201] are deprecated.
X-sjis-jisx0221-1995 is a conversion table derived from the conversion table in [JIS X 0221-1] Appendix 2 (including Appendix 2.2) for the shift encoding which is specified in [JIS X 0208] Appendix 1.
www.w3.org /TR/japanese-xml   (4166 words)

  
 JIS X 0208 - Sljfaq
This is the encoding commonly known as JIS encoding.
It is a 7 bit encoding based on kuten and escape sequences.
The hexadecimal code of the kanji is known as its JIS code.
www.sljfaq.org /w/JIS_X_0208   (55 words)

  
 [No title]
Characters in JIS X 0208 are encoded by setting the high bit of the position codes, and characters in JIS X 0212 are encoded by doing the same but also prefixing the character with the byte 0x8F.
In the non-modal EUC encoding, for example, the byte 0x41 always refers to the letter `A'; whereas in JIS, it could either be the letter `A', or one of the two position codes in a JIS X 0208 character, or one of the two position codes in a JIS X 0212 character.
Although JIS X 0208 includes the whole Roman alphabet, as a 2-byte code it is not suited to programming; thus the inclusion of ASCII in the standard Japanese encodings.
www.brics.dk /~btools/completa/lib/xemacs-21.5-b10/info/lispref.info-46   (5501 words)
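
The EUC rule quoted above (set the high bit of both position codes, and prefix JIS X 0212 characters with 0x8F) can be sketched in Python. The function name and the `charset` parameter are assumptions made for this illustration only.

```python
# Sketch: build EUC-JP bytes from two 7-bit position codes (0x21..0x7E).

def euc_jp_bytes(j1, j2, charset="jisx0208"):
    if not (0x21 <= j1 <= 0x7E and 0x21 <= j2 <= 0x7E):
        raise ValueError("position codes must both be in 0x21..0x7E")
    body = bytes([j1 | 0x80, j2 | 0x80])  # set the high bit of each byte
    if charset == "jisx0208":
        return body
    if charset == "jisx0212":
        return b"\x8f" + body             # 0x8F marks a JIS X 0212 character
    raise ValueError("unknown charset: " + charset)
```

Because every byte of a two-byte character has its high bit set, a byte such as 0x41 can only ever be the ASCII letter `A`, which is the non-modal property the snippet describes.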

  
 Conversion tables differ between venders
JIS X 0213:2000 is a superset of JIS X 0208:1997.
When JIS X 0208 is used as a component coded character set of EUC-JP encoding, U+005C is not appropriate because it should be used for ISO 646 IRV 0x5C.
When JIS X 0208 is used as a component coded character set of Shift_JIS encoding, U+00A5 is not appropriate because it should be used for JIS X 0201 Roman.
people.debian.org /~kubota/unicode-symbols-map2.html   (2334 words)

  
 Japanese Locale Concept Dictionary
JIS X0201 - a single byte codeset consisting of 7bit characters corresponding to ISO 646, 7bit characters for katakana, and 8bit characters for both Roman and katakana characters.
JIS X 0212-1990 - Late in 1990 a supplemental Japanese character standard called JIS X 0212-1990 was published by JIS which specified an additional 5,801 kanji, 21 symbols/diacritical marks, and 245 Latin-based characters with diacritical marks.
In the case of New-JIS, the simplified form is in the JIS Level 1 column, and the unsimplified form is in the JIS Level 2 column.
cns-alumni.bu.edu /~djohnson/i18n/japanese.html   (2367 words)

  
 Introduction to i18n - Characters in Each Country
Though JIS X 0201 is included in SHIFT-JIS encoding (explained later) and widely used for Windows/Macintosh, usage of this is not encouraged in UNIX.
JIS X 0212 is not widely used, probably because it cannot be included in SHIFT-JIS, the standard encoding for Japanese version of Windows and Macintosh.
JIS X 0208 (aka JIS C 6226) is the main standard for Japanese characters.
www.debian.org /doc/manuals/intro-i18n/ch-languages.en.html   (5041 words)

  
 [No title]
Where the JIS X 0208 glyph is the second kanji for a particular descriptor code, it has a "-2" appended to the code.
In some 500 cases, the number is terminated with an `X`, to indicate that the kanji in Morohashi has a close, but not identical, glyph to the form in the JIS X 0208 standard.
Note 1: The JIS X 0208-1990 standard does not formally specify the precise glyphs used for kanji, however the glyphs it uses in the published version have become de facto standards for many font compilations.
hwr.nici.kun.nl /unipen/kanji/kanjidic.doc.html   (5088 words)

  
 [No title]
Likewise, JIS C 6220 was renamed JIS X 0201.
If there are JIS X 0208 characters on a line, there must be a switch to ASCII or to the "Roman" set of JIS X 0201 before the end of the line (i.e., before the CRLF).
The implementor is reminded that JIS X 0208 characters take up two bytes and should not be split in the middle to break lines for displaying, etc. The JIS X 0208 standard was revised in 1990, to add two characters at the end of the table.
www.ietf.org /rfc/rfc1468.txt   (1204 words)
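
The line-ending rule quoted above can be illustrated with a minimal Python sketch of an ISO-2022-JP style encoder. It assumes only two of the escape sequences RFC 1468 defines (ESC ( B for ASCII and ESC $ B for JIS X 0208-1983) and borrows Python's `euc_jp` codec to look up position codes; the function name is hypothetical, and this is not a complete implementation of the RFC.

```python
# Sketch: encode text as 7-bit bytes, switching character sets with
# escape sequences and returning to ASCII before any ASCII byte
# (including CR and LF) and at the end of the text.

ESC_ASCII = b"\x1b(B"      # ESC ( B : switch to ASCII
ESC_JISX0208 = b"\x1b$B"   # ESC $ B : switch to JIS X 0208-1983


def encode_iso2022jp_line(text):
    out = bytearray()
    in_kanji = False
    for ch in text:
        if ord(ch) < 0x80:
            if in_kanji:
                out += ESC_ASCII
                in_kanji = False
            out.append(ord(ch))
            continue
        euc = ch.encode("euc_jp")
        # Only plain two-byte JIS X 0208 characters are handled here.
        if len(euc) != 2 or euc[0] == 0x8E:
            raise ValueError("not a JIS X 0208 character: %r" % ch)
        if not in_kanji:
            out += ESC_JISX0208
            in_kanji = True
        out += bytes(b & 0x7F for b in euc)  # strip the EUC high bits
    if in_kanji:
        out += ESC_ASCII
    return bytes(out)
```

Since CR and LF are ASCII bytes, a line such as `"日本語\r\n"` is switched back to ASCII before the line break, which is the behaviour the RFC text asks for.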

  
 RFC 2237
It defines the use of JIS X 0208 as the double-byte character set in ISO-2022-JP text.
Since "ISO-2022-JP-1" is designed to add the capability of writing out JIS X 0212, if the message does not contain any JIS X 0212 characters.
JIS X 0201-Roman is not identical to ASCII; two characters differ.
www.apps.ietf.org /rfc/rfc2237.html   (920 words)

  
 SP - Character sets
This refers to a character set which combines JIS X 0201, JIS X 0208 and JIS X 0212 by adding 0x8080 to the codes of characters in JIS X 0208 and 0x8000 to the codes of characters in JIS X 0212.
A bit combination between 0 and 127 or between 161 and 223 is encoded as a single byte with the same value as the bit combination.
A bit combination with the 0x8000 and 0x80 bits set is encoded by the sequence of bytes with which the SJIS encoding encodes the character whose number in JIS X 0208 added to 0x8080 is equal to the bit combination.
jclark.com /sp/charset.htm   (1176 words)

  
 ELECTRONIC DICTIONARY RESEARCH AND DEVELOPMENT GROUP - MONASH UNIVERSITY
KANJIDIC2 - File of Information about the Kanji in JIS X 0208, JIS X 0212 and JIS X 0213 in XML format.
RADKFILE/KRADFILE - files relating to the decomposition of the 6,355 kanji in JIS X 0208 into their visible components.
Copyright over the documents covered by this statement is held by James William BREEN and The Electronic Dictionary Research and Development Group at Monash University.
www.csse.monash.edu.au /~jwb/edrdg/licence.html   (1047 words)

  
 [No title]
More precisely, the QJisCodec class subclasses QTextCodec to provide support for JIS X 0201 Latin, JIS X 0201 Kana, JIS X 0208 and JIS X 0212.
JIS X 0221 is the JIS version of Unicode, but a few chars (0x5c, 0x7e, 0x2140, 0x216f, 0x2131) are different from Unicode 1.1.
"open-19970715-ms" ("open-ms" for convenience) or "cp932" for JIS 0x2140 is mapped to U+FF3C.
www.gl.umbc.edu /env/beta/sun4x_58/qt/qjiscodec_3qt.html   (572 words)

  
 [No title]
# upper, lower, alpha, digit, xdigit # JIS X 0201 printable characters # JIS X 0208 printable characters # JIS X 0212 printable characters # Printable characters in udc or vdc classes may be added.
# # jparen class: # The kana bracket characters in JIS X 0201 and the parentheses in JIS X 0208.
# Katakana characters, Katakana symbols in JIS X 0201, or udc/vdc # in undefined area of JIS X 0201 may be added.
std.dkuug.dk /i18n/locales/ja_JP   (1477 words)

  
 QR Code Information   (Site not responding. Last check: 2007-09-15)
These are values shifted from those of JIS X 0208.
A dark module is a binary one and a light module is a binary zero.
Model 2: 21 x 21 modules to 177 x 177 modules (Versions 1 to 40, increasing in steps of 4 modules per side).
www.tharo.com /webhelp/qr_code_information.htm   (336 words)

  
 RFC 1468
Note that JIS X 0208 was called JIS C 6226 until the name was changed
The implementor is reminded that JIS X 0208 characters take up two bytes and should not be split in the middle to break lines for displaying, etc.
The JIS X 0208 standard was revised in 1990, to add two characters at the end of the table.
www.apps.ietf.org /rfc/rfc1468.html   (1074 words)

  
 Set Character Width proposal
The downside of this is that non-ideographic double-byte characters, such as the Greek and Cyrillic alphabets present in JIS X 0208, will be displayed as double-width characters, which even Japanese users regard as typographically inappropriate, though customary.
Both a single- and a double-width font are loaded, and the X client software can then decide which glyph it picks for each letter.
The C library offers a wcwidth() function to determine the width of a character, but leaves the actual value defined in a locale specification file, which could be fairly easily changed by the user.
www.cl.cam.ac.uk /~mgk25/ucs/scw-proposal.html   (1922 words)
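
The width question raised above can be illustrated in Python. The snippet refers to the C library's wcwidth(); this sketch instead uses the East Asian Width property from Python's standard `unicodedata` module as a stand-in, and the `ambiguous_wide` switch is an assumption that models the locale-dependent choice for Greek and Cyrillic letters. It ignores combining characters, which a real terminal would treat as zero width.

```python
# Sketch: estimate terminal column width from the East Asian Width
# property. Not equivalent to wcwidth(); combining marks are not handled.
import unicodedata


def display_width(ch, ambiguous_wide=False):
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):            # Wide or Fullwidth: two columns
        return 2
    if eaw == "A" and ambiguous_wide:
        return 2                     # Ambiguous: wide in a CJK context
    return 1


def string_width(text, ambiguous_wide=False):
    return sum(display_width(ch, ambiguous_wide) for ch in text)
```

With `ambiguous_wide=True`, Greek and Cyrillic letters count as two columns each, mirroring the double-width display the proposal describes; with the default they count as one.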

  
 [No title]
Fortunately JIS (Japanese Industry Standard) defines JIS X 0212 as "code of the supplementary Japanese graphic character set for information interchange".
Most Japanese characters which are used in regular electronic mail in most cases can be accommodated in JIS X 0201, JIS X 0208 and JIS X 0212.
In "ISO-2022-JP-1" text, the initial character code of the message is in ASCII.
www.isi.edu /in-notes/rfc2237.txt   (1079 words)

  
 pkgsrc.se | The NetBSD package collection
8x8 dots X11 bitmap font for JIS X 0208
10x8 dots X11 bitmap font for JIS X 0208
Meta-package including X11 BDF fonts for JIS X0208 standard and more
pkgsrc.se /fonts   (446 words)

  
 charsets(7) - Waikato Linux Users Group
Console support for KOI8-R is available under Linux through user-mode utilities that modify keyboard bindings and the EGA graphics table, and employ the
KS C 5601 is an older name for KS X 1001.
Unless otherwise noted, all pages on this site are licensed under the WlugWikiLicense.
www.wlug.org.nz /charsets(7)   (2224 words)

  
 Qt 4.1: Shift-JIS Text Codec   (Site not responding. Last check: 2007-09-15)
The Shift-JIS codec provides conversion to and from Shift-JIS, an encoding of JIS X 0201 Latin, JIS X 0201 Kana and JIS X 0208.
The ISO 2022-JP (JIS) Text Codec documentation describes how to use this variable.
Most of the code here was written by Serika Kurusugawa, a.k.a.
doc.trolltech.com /4.1/codec-sjis.html   (195 words)

Copyright © 2005-2007 www.factbites.com Usage implies agreement with terms.