| |
| | Joel on Software - Stripping diacritical marks in Java? |
 | | The best discussion of folding, in all its complexity (folding accents, case, canonical duplicates, Greek letterforms, Hiragana and Katakana, etc.) is at http://www.unicode.org/reports/tr30/. |
 | | The easiest way is probably to take the canonical decomposition of each character and then keep the one code point in it (the first, presumably) that is in ASCII. |
 | | Of course, U+0131 (dotless i) looks vaguely like i, but it doesn't have a decomposition (there's no "white-out" glyph). |
| discuss.joelonsoftware.com /?joel.3.91624.4 (689 words) |
|
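For what it's worth, the decompose-and-keep-ASCII idea can be sketched with `java.text.Normalizer` (Java 6 and later). This is only a rough illustration, not a full TR30-style folding: the class name `AsciiFold` and method `fold` are made up for the example, the input is assumed to hold precomposed characters, and passing through characters with no ASCII code point in their decomposition (such as U+0131) is my own assumption, since the post does not say what to do with them.

```java
import java.text.Normalizer;

// Hypothetical helper, named for this example only.
final class AsciiFold {

    // For each code point: canonically decompose it (NFD) and keep the
    // first ASCII code unit of the result. Assumption: a code point whose
    // decomposition contains no ASCII at all is copied through unchanged.
    static String fold(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            i += Character.charCount(cp);

            String original = new String(Character.toChars(cp));
            String decomposed = Normalizer.normalize(original, Normalizer.Form.NFD);

            boolean found = false;
            for (int j = 0; j < decomposed.length(); j++) {
                char c = decomposed.charAt(j);
                if (c < 0x80) {          // ASCII range
                    out.append(c);
                    found = true;
                    break;               // keep only the first ASCII match
                }
            }
            if (!found) {
                out.append(original);    // e.g. U+0131 has no decomposition
            }
        }
        return out.toString();
    }
}
```

With this sketch, `AsciiFold.fold("café")` returns `"cafe"`, while a string containing only U+0131 comes back unchanged, which is exactly the dotless-i gap described above. Anything like that would need its own hand-written mapping.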