| |
| | Stemming and N-gram matching for term conflation in Turkish texts |
 | | One of the main problems involved in the use of free text for indexing and retrieval is the variation in word forms that is likely to be encountered (Lennon et al., 1981). The most common types of variation are spelling errors, alternative spellings, multi-word concepts, transliteration, affixes and abbreviations. |
 | | One way to alleviate this problem is to use a conflation algorithm, a computational procedure that is designed to bring together words that are semantically related, and to reduce them to a single form for retrieval purposes. |
 | | Conflation algorithms can be broadly divided into two main classes: stemming algorithms, which are language dependent and which are designed to handle morphological variants, and string-similarity algorithms, which are (usually) language independent and which are designed to handle all types of variant. |
| www.informationr.net /ir/2-2/paper13.html (2140 words) |
|
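
The two classes of conflation algorithm described above can be illustrated with a minimal sketch. It is not the method used in the paper: the suffix list in `strip_suffix` is a hypothetical toy for English, standing in for a real language-dependent stemmer, and `dice_similarity` uses the Dice coefficient over character bigrams as one common choice of string-similarity measure. All function names are assumptions made for illustration.

```python
def ngrams(word, n=2):
    """Return the set of character n-grams in `word` (no padding)."""
    word = word.lower()
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def dice_similarity(a, b, n=2):
    """Language-independent similarity: Dice coefficient over shared n-grams."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga and not gb:
        # Both words are shorter than n characters; fall back to equality.
        return 1.0 if a.lower() == b.lower() else 0.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def strip_suffix(word, suffixes=("ing", "ed", "s")):
    """Language-dependent toy stemmer: remove the first matching suffix.

    The suffix list is a hypothetical English example; a real stemmer
    needs rules written for the morphology of the target language.
    """
    for suffix in suffixes:
        # Keep at least three characters of stem to avoid over-stripping.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


if __name__ == "__main__":
    # Stemming conflates morphological variants to one form.
    print(strip_suffix("indexing"), strip_suffix("indexed"))
    # N-gram matching also scores spelling variants as similar.
    print(dice_similarity("colour", "color"))
```

The stemmer only handles the affixes it has rules for, whereas the n-gram measure gives a graded score for any pair of strings, so a retrieval system built on it would typically conflate words whose similarity exceeds a chosen threshold.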