| | NewsForge | Training Annoyance Filter to combat spam (Site not responding. Last check: 2007-10-30) |
 | | Annoyance Filter uses a main dictionary, portable among different architectures and fully interoperable, and a lightweight version for fast operations. |
 | | Annoyance Filter uses only one dictionary for storing statistics about both junk and legitimate e-mail; every token (a word or group of words) in the dictionary gets a score denoting the probability of it being junk mail, and these probabilities are used when classifying new e-mail messages. |
 | | In Annoyance Filter, a token can span any number of words, from a single word to a group of, say, 10 words; in practice, the more words per token, the more performance degrades, unfortunately without necessarily yielding better results. |
| www.newsforge.com /software/03/10/24/2046238.shtml?tid=74 (2055 words) |
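The excerpt describes a single dictionary that maps each token (a word or group of words) to a junk probability. The following is a minimal Python sketch of that idea, not Annoyance Filter's actual code: the function names `tokens` and `train`, the `max_words` parameter, and the scoring formula (relative junk frequency divided by the sum of relative junk and legitimate frequencies) are all assumptions made for illustration.

```python
from collections import Counter

def tokens(text, max_words=2):
    """Split text into tokens of 1..max_words consecutive words (hypothetical helper)."""
    words = text.lower().split()
    out = []
    for n in range(1, max_words + 1):
        for i in range(len(words) - n + 1):
            out.append(" ".join(words[i:i + n]))
    return out

def train(junk_msgs, legit_msgs, max_words=2):
    """Build one dictionary: token -> estimated probability of appearing in junk.

    Assumed scoring rule (not taken from the source): compare how often a
    token occurs per junk message with how often it occurs per legitimate one.
    """
    junk = Counter(t for m in junk_msgs for t in tokens(m, max_words))
    legit = Counter(t for m in legit_msgs for t in tokens(m, max_words))
    dictionary = {}
    for tok in junk.keys() | legit.keys():
        j = junk[tok] / max(len(junk_msgs), 1)    # occurrences per junk message
        l = legit[tok] / max(len(legit_msgs), 1)  # occurrences per legitimate message
        dictionary[tok] = j / (j + l)
    return dictionary

if __name__ == "__main__":
    d = train(["buy pills now"], ["meeting now"], max_words=1)
    for tok in sorted(d):
        print(tok, d[tok])
```

Raising `max_words` makes the number of tokens per message grow roughly in proportion, which is one plausible reason the excerpt reports slower operation with longer tokens.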