Factbites
 Where results make sense
About us   |   Why use us?   |   Reviews   |   PR   |   Contact us  

Topic: Baysian filter


  
  Bayesian spam filtering - Wikipedia, the free encyclopedia
Bayesian spam filtering is the process of using Bayesian statistical methods to classify documents into categories.
Bayesian filtering was proposed by Sahami et al.
Server-side email filters, such as SpamAssassin and ASSP, make use of Bayesian spam filtering techniques, and the functionality is sometimes embedded within mail server software itself.
en.wikipedia.org /wiki/Bayesian_filtering   (860 words)

  
 Bayesian filtering - TheBestLinks.com - Baysian filter, Bayesian probability, Email, Viagra, ...   (Site not responding. Last check: 2007-11-03)
Bayesian filtering is the process of using Bayesian statistical methods to classify text documents into one of several categories.
Bayesian filtering gained currency when it was described in the paper "A Plan for Spam"[1] (http://www.paulgraham.com/spam.html) by Paul Graham, and has become popular as a mechanism to distinguish spam emails from desirable emails.
To 'train' the filter, the user must manually indicate into which category a particular document belongs, and the filter will then assign a probability to each word in the email.
www.thebestlinks.com /Baysian_filter.html   (433 words)
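
The training step described in this entry (the user labels documents, the filter assigns a probability to each word) can be sketched roughly as follows. This is a minimal illustration, not the method of any particular filter: the tokenizer, the add-one smoothing, the equal weighting of the two categories, and all function names are assumptions made for the example.

```python
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace; a deliberately crude tokenizer."""
    return text.lower().split()

def train(labeled_docs):
    """Count, per category, how many documents contain each word."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = Counter()
    for text, label in labeled_docs:
        totals[label] += 1
        # set() so a word repeated within one document is counted once
        counts[label].update(set(tokenize(text)))
    return counts, totals

def word_spam_probability(word, counts, totals):
    """Estimate P(spam | word), weighting both categories equally.

    Add-one smoothing (an assumption of this sketch) keeps unseen
    words at a neutral 0.5 instead of 0 or 1.
    """
    spam_rate = (counts["spam"][word] + 1) / (totals["spam"] + 2)
    ham_rate = (counts["ham"][word] + 1) / (totals["ham"] + 2)
    return spam_rate / (spam_rate + ham_rate)

# Hypothetical hand-labeled training set
docs = [
    ("cheap viagra offer", "spam"),
    ("viagra discount offer", "spam"),
    ("meeting agenda attached", "ham"),
    ("lunch meeting tomorrow", "ham"),
]
counts, totals = train(docs)
```

With this toy data, a word seen only in spam scores well above 0.5, a word seen only in good mail scores well below it, and a word the filter has never seen stays at 0.5.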

  
 Bayesian probability - Wikipedia, the free encyclopedia
A Bayesian spam filter uses a reference set of e-mails to define what is originally believed to be spam.
After the reference has been defined, the filter then uses the characteristics in the reference to define new messages as either spam or legitimate e-mail.
New e-mail messages act as new information, and if mistakes in the definitions of spam and legitimate e-mail are identified by the user, this new information updates the information in the original reference set of e-mails with the hope that future definitions are more accurate.
en.wikipedia.org /wiki/Baysian   (1826 words)

  
 Better Bayesian Filtering
Spam filtering is a subset of text classification, which is a well established field, but the first papers about Bayesian spam filtering per se seem to have been two given at the same conference in 1998, one by Pantel and Lin [2], and another by a group from Microsoft Research [3].
The reason the filters caught them was that both companies in January switched to commercial email senders instead of sending the mails from their own servers, and both the headers and the bodies became much spammier.
If you were doing Bayesian filtering in a situation where the ratio of spam to nonspam was consistently very high or (especially) very low, you could probably improve filter performance by incorporating prior probabilities.
www.paulgraham.com /better.html   (4059 words)
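
The remark about incorporating prior probabilities can be illustrated with a small sketch. It assumes per-word values P(spam | word) that were estimated under an even spam/nonspam split, treats the words as independent (the naive Bayes assumption), and combines them with a prior in log-odds space. The function name and the default prior are hypothetical; this is not code from the article.

```python
import math

def spam_posterior(word_probs, prior_spam=0.5):
    """Combine per-word spam probabilities with a prior.

    Each value in word_probs must lie strictly between 0 and 1 and is
    assumed to have been estimated under an even prior, so its log-odds
    acts as evidence that is added to the log-odds of the prior.
    """
    log_odds = math.log(prior_spam / (1.0 - prior_spam))
    for p in word_probs:
        log_odds += math.log(p / (1.0 - p))
    # Convert log-odds back to a probability
    return 1.0 / (1.0 + math.exp(-log_odds))

# Same evidence, different assumed spam ratios
even = spam_posterior([0.9, 0.8])                       # prior 0.5
mostly_ham = spam_posterior([0.9, 0.8], prior_spam=0.01)
```

With the even prior the two words push the message to about 0.97; when spam is assumed to be only 1% of traffic, the same words leave it at about 0.27, which is the kind of shift the snippet has in mind for very skewed mail streams.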

  
 CrystalTech Forums - Another spam marking question   (Site not responding. Last check: 2007-11-03)
I put him on the whitelist, which seemed to work, but I have many clients who use a different SMTP server from their email domain (lots of ISPs make you use their SMTP), and I'm wondering if any other valid messages are being marked as spam.
I had a client-side Bayesian filter for a while (POPFile) which had 99%+ accuracy, but it was trained on *MY* email, and not a server's worth.
I now have Bayesian set to the same weight as the other tests, and just have to keep an eye out for false positives (which is easier now with the Low/Med/Hi buckets).
www.crystaltech.com /forum/topic.asp?TOPIC_ID=9245   (372 words)

  
 Introduction to Bayesian Filtering
At this point, the filter is very minimally trained; using it to filter spam would be like letting a 3-year-old drive on a major highway.
Now we’re going to let the filter try to decide if a message is spam or not, based on what we’ve told it about what spam looks like and how it differs from non-spam.
While other filtering methods may be as effective for now, Bayesian filtering includes an element of changeability that makes it very difficult for spammers to circumvent.
www.process.com /precisemail/bayesian_filtering.htm   (2982 words)

  
 Spam filtering techniques
Most filtering software allows you to save rejected messages in temporary folders pending review -- but if you need to review a folder full of spam, the usefulness of the software is thereby reduced.
A whitelist filter connects to an MTA and passes mail only from explicitly approved senders on to the inbox.
The whitelist filter's response contains some kind of unique code that identifies the original message, such as a hash or sequential ID. This challenge message contains instructions for the sender to reply in order to be added to the whitelist (the response message must contain the code generated by the whitelist filter).
www-106.ibm.com /developerworks/linux/library/l-spamf.html   (3721 words)
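
The challenge step in this entry (a unique code, such as a hash, that identifies the held message and must come back in the sender's reply) might look something like the sketch below. The use of an HMAC, the secret key, the 16-character truncation, and the function names are all assumptions for illustration; the article only says the code could be a hash or a sequential ID.

```python
import hashlib
import hmac

# Hypothetical secret known only to the filter host
SECRET_KEY = b"replace-with-a-private-key"

def challenge_code(sender, message_id):
    """Derive a code tied to one held message from one sender."""
    payload = f"{sender.lower()}|{message_id}".encode("utf-8")
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()[:16]

def handle_reply(sender, message_id, code, whitelist):
    """Whitelist the sender if the reply carries the expected code."""
    expected = challenge_code(sender, message_id)
    # compare_digest avoids leaking how many characters matched
    if hmac.compare_digest(expected, code):
        whitelist.add(sender.lower())
        return True
    return False

whitelist = set()
code = challenge_code("Alice@example.com", "<1@example.com>")
accepted = handle_reply("alice@example.com", "<1@example.com>", code, whitelist)
```

Keying the code to both the sender and the message means a code copied from one challenge cannot be reused to whitelist a different address.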

  
 Spam Filter   (Site not responding. Last check: 2007-11-03)
There are many spam filtering programs, however I have not found a single one that worked satisfactorily.
The disadvantage of this approach is you have to configure your email program to use the proxy filter for receiving and the ordinary mail server for sending.
Filter that extracts everyone you have sent mail to recently as a friend.
mindprod.com /projects/spam.html   (659 words)
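
The "everyone you have sent mail to is a friend" idea can be sketched with Python's standard email package. This assumes the sent messages are available as raw RFC 822 text and ignores the "recently" part (no date filtering); the function name is hypothetical.

```python
from email import message_from_string
from email.utils import getaddresses

def friends_from_sent(raw_messages):
    """Collect every To/Cc address found in a list of sent messages."""
    friends = set()
    for raw in raw_messages:
        msg = message_from_string(raw)
        headers = msg.get_all("To", []) + msg.get_all("Cc", [])
        for _name, addr in getaddresses(headers):
            if addr:
                friends.add(addr.lower())
    return friends

# Hypothetical sent-mail sample
sent = [
    "To: Alice <alice@example.com>, bob@example.com\n"
    "Cc: Carol <Carol@Example.com>\n"
    "Subject: hi\n"
    "\n"
    "body\n"
]
friends = friends_from_sent(sent)
```

The resulting set could then serve as an automatically maintained whitelist.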

  
 absolutly amazing | MetaFilter   (Site not responding. Last check: 2007-11-03)
As cool as these filtering techniques are, they do nothing to reduce the amount of spam in the world.
First Bayesian spam filter was ifile, for a mail system that no-one really uses any more...
It can only run after all of Eudora's other filters have run; in my case, that means that it won't filter mail to my business address, which makes it sort of useless.
www.metafilter.com /mefi/26669   (3665 words)

  
 Winter 2005 Newsletter
This filter is a database of features of ham (good mail) and spam, found by analyzing the user’s own mail.
In addition to the “autolearn” feature of the Bayesian filter, it can also be “trained” a lot faster by using the “sa-learn” command on the Unix mail server.
A recent issue has arisen with respect to the Bayesian filter where it appears that the filter is classifying some obvious spam as “good” mail.
www.arts.uwaterloo.ca /ACO/newsletters/w05/spam.html   (1097 words)

  
 The Old Joel on Software Forum - Spam filtering on servers
We get some users complaining about the fact that some legit mail gets caught in the filter, but the reality is that the owners of the company got sick of dealing with the spam themselves and forced us to put in server-side filtering.
It turned out that the Bayesian filter that his company uses on their incoming email rejected my manually written message, with its non-generic subject line, as "probable SPAM".
Good and careful filtering at the server level, with practically NO chance for false positives, even at the expense of letting some spam through, is good.
discuss.fogcreek.com /joelonsoftware?cmd=show&ixPost=118891   (2735 words)

  
 [UFO Chicago] Re: spam filters
I don't need a personal blacklist, that is taken care of by my Bayesian filter.
The blacklisted sites and those my Bayesian filter said were spam would get retrieved "n" times; the ones not whitelisted would get retrieved once; the whitelisted sites would not be retrieved.
Or optionally the white-listed (pink-listed?) web pages could be retrieved and put in the e-mail to save me the task, or because they might change before I got to the e-mail, like newspaper stories that are free only for the first 24 hours.
ufo.chicago.il.us /pipermail/ufo/2004-July/002065.html   (532 words)

  
 Assp-mod News - DeveloperNet
There is still a huge problem with the mail filter that somehow allows mail to come through with an invalid TO address.
This causes a slow degradation of the Bayesian database.
This version is ready to use to filter spam, but not ready to manage the whitelist and various other necessary features.
developer.novell.com /wiki/index.php/Assp-mod_News   (557 words)

  
 LWN: Open spam filtering rules considered harmful?
The attack on the software's filtering process highlights the dangers of open-source projects, but it also reinforces the ability of projects with active development teams to quickly respond to such security holes.
These are the easiest mails for spamassassin's bayesian filter to catch: legit mail never has misspelled spam terms, so anything containing lots of misspelled or m1sspe11ed spam words is easily flagged as spam.
For now, bayesian spam filtering seems to be the way to go; maybe something better will come along in the future, but it does seem to be an improvement over rule-based approaches.
lwn.net /Articles/53606   (3170 words)

  
 Spam Filtering by Net2Atlanta - Effective spam filter for Exchange, Groupwise or any mail-server
ISP level filtering means that the SpamDragon™ has more samples to work with, making it far more effective at blocking spam than a single user or company level spam filter.
The text filter actually consists of three in-line filters: a Bayesian spam filter, a heuristic spam filter, and a keyword filter.
Filtering at the ISP level saves your bandwidth and disk space and stops spam before it enters your pipeline.
www.net2atlanta.com /spamfilter.htm   (471 words)

  
 (Internode) Spam marking - Internode
The Spammer-filter battle is an arms race, and at times, the enemy improves their weapons and starts slipping stuff past the filters - the people who maintain the filters then improve those, and the effectiveness of filtering rises again.
I'd suggest trying to re-tune the filtering level you're using and/or just being patient for a week or two to let the folk who maintain our filtering system rules (which are commercial and outsourced) catch up with it.
How often do the heuristics get updated on the spam filter, and is there maybe a way to allow subscribers to help train the Bayesian filter (which I am guessing it may have) to be more effective by forwarding spam to a honeypot training address?
whirlpool.net.au /forum-replies-archive.cfm/545019.html   (383 words)

  
 Re: Hotmail, aol, lycos,MSN   (Site not responding. Last check: 2007-11-03)
Otherwise it would be no contradiction with what you said, namely, and I quote you: " the distinct tokens comprising a message do *not* contribute independent estimates of the spammishness of a message".
Once again, he talks about a naive Bayesian filter, which starts with the assumption that variables are independent, even though they are known not to be independent.
Whether these improvements will lead to a true Bayesian filter remains to be seen.
www.talkaboutspam.com /group/alt.spam/messages/90601.html   (398 words)

  
 Content-based Spam filtering - losing battle? - GameDev.Net Discussion Forums
Bayesian filtering is really going to die; most spam I see includes random text and therefore can't be filtered this way unless you're okay with many false positives.
Bayesian filtering is linear, and to really judge things by content and filter emails containing random words (with the same probabilities as in real text) you must use something non-linear.
Bayesian filtering can be used where it is appropriate: when different items are independent.
www.gamedev.net /community/forums/topic.asp?topic_id=362342&whichpage=1   (1953 words)

  
 SpamCop Discussion > Thank you!
I've been using spam filtering options (some that query blocklists) for a couple of years now.
Most recently, I was using SpamPal in conjunction with a Bayesian filter.
The caution is that the filtered items sitting in the Held Folder may be building up to a level that may cause some future issues.
forum.spamcop.net /forums/lofiversion/index.php/t1795.html   (569 words)

  
 Spam Filtering Techniques: -- Comparing a Half-Dozen Approaches to Eliminating Unwanted Email --
Most filtering software allows you to save rejected messages in temporary folders pending review--but if you need to review a folder of spam, the usefulness of the software is thereby reduced.
Each recipient of a spam, however, absent prior filtering, must press her own "delete" button to get rid of the message.
When a message is received by an MTA, a distributed blacklist filter is called to determine whether the message is a known spam.
gnosis.cx /publish/programming/filtering-spam.html   (3622 words)

  
 Sex, Drugs & Unix
In August of 2002, Paul Graham published his now-famous "A Plan for Spam", which introduced Bayesian filtering to the anti-spam crowd.
These days it's assumed that your mail program and potentially your ISP are running some kind of Bayesian filter.
With time, the spammers have found a way around the Bayesian filters, and we are again dealing with a crushing load of spam in our mailboxes.
www.smallworks.com /archives/00000301.htm   (256 words)

  
 Forum at Vanquish Labs :: View topic - Vanquish adds a Smart Personal Filter
As the filter learns about your preferences and becomes highly accurate, it helps us to avoid challenging addresses that are fraudulently used by spammers.
We are so confident that our new smart personal filter is accurate and personal, that we are considering a default setting of no-challenges for new accounts.
The smart filter is so precise that some users are turning off the challenge/confirmation feature.
www.vanquish.com /forum/viewtopic.php?p=467   (850 words)

  
 CBSI Spam Control Center   (Site not responding. Last check: 2007-11-03)
However this isn't perfect, because computers are stupid and spammers are constantly trying to find new ways of fooling SA into thinking their spam is not spam.
The most powerful tool SA employs is called a Bayesian filter.
This Bayesian filter's power comes from the fact that it works by "learning" what spam looks like and what spam doesn't look like.
spam.c-b-s-i.net /how.php   (294 words)

  
 Why do some spammers embed garbage text?   (Site not responding. Last check: 2007-11-03)
It is obvious the spammers don't really understand how a Bayesian filter works.
In addition to the "hashbusting" (making multiple copies of spam non-identical to try to thwart identical-text scanners) and feeble attempts to defeat Bayesian filters, I bet many spammers encode info in there somehow, so they know who larted them if they receive a copy of a lart with the To: line munged or removed.
The random garbage is treated by bayesian filters as 'words' and as such is added to the database of spam words.
www.webservertalk.com /message298939.html   (735 words)

  
 SPAMTAG | @lab
SpamAssassin is a mail filter that analyzes messages and rates their content in an attempt to determine if the message is likely to be a spam.
Control over this type of filtering is completely in the hands of the administrators of the network segment over which the traffic flows.
Control over this type of filtering is in the hands of the administrator of the destination mail domain as defined by the MX records themselves.
lab.ac.uab.edu /node/view/5   (13767 words)

  
 a thaumaturgical compendium » Blog Archive » Where is my OLIVER?
I was recently rereading Licklider’s “The Computer as a Communication Device” (1968; available as a pdf) and was struck by this image of a personal network agent.
It seems as though people do a lot of talking about this, but I want to know where mine is. Is the Bayesian spam filter the best we are going to do on this?
I’m currently using the same Bayesian filter on my academic blog to delete comment spam.
alex.halavais.net /?p=893   (422 words)

  
 Features   (Site not responding. Last check: 2007-11-03)
While different areas of pattern recognition obviously have different features, once the features are recognized, they are classified by a much smaller set of algorithms.
In speech recognition, features for recognizing phonemes can include noise ratios, length of sounds, relative power, filter matches and many others.
In spam detection algorithms, features may include whether certain email headers are present or absent, whether they are well formed, what language the email appears to be, the grammatical correctness of the text, markovian frequency analysis and many others.
www.freedownloadsoft.com /info/features.html   (215 words)

  
 Spam filter, spam filter review, spam filter comparison   (Site not responding. Last check: 2007-11-03)
These two filter engines together give a close to 100% spam filter capability, and the methods of.
A filter for Eudora which uses a preset set of rules for filtering spam.
The first prerequisite for spam filtering is to introduce the filter into the mail processing pipeline.
www.anyskate.com /spam-filter.html   (324 words)

  
 Spam Filter ISP Forums: Does Baysian Filter Properly learn Whitelisted Entries   (Site not responding. Last check: 2007-11-03)
Today the Bayesian filter hit the 5000 good email mark again (30,000+ spam), and now every email that comes in that is not on the whitelist is flagged as 100% spam.
The more accurate your existing "regular" filter, the better trained the statistical filter becomes.
This process trains the filter to recognize the mistake it has made, and to "weigh" more those emails as good mail in the future.
www.logsat.com /spamfilter/forums/forum_posts.asp?TID=3553&PN=1   (382 words)

Copyright © 2005-2007 www.factbites.com Usage implies agreement with terms.