AI Spam Filter

Overview

emailAI Pro has a built-in spam filtering system that analyzes both emails that were successfully delivered and emails that were sent to spam. From this information, emailAI Pro can analyze incoming emails and send them to spam if they match the patterns of previously received spam. Because emailAI Pro builds these filters from the mail that the particular organization actually receives, the filter is tailored to that organization and is therefore very accurate.

Enabling the AI Spam Filter

To enable the AI Spam Filter, select the option "Use AI Spam Filter" in the General tab of the system configuration options.

The AI filter threshold sets the level at which emailAI Pro decides that an email is spam. By default this is set to 85%.
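As a rough illustration of how a threshold like this can be applied, the sketch below compares an email's combined spam probability with the configured percentage. The function name is hypothetical, and treating a score exactly at the threshold as spam is an assumption; this is not emailAI Pro's actual code.

```python
def is_spam(spam_probability: float, threshold_percent: float = 85.0) -> bool:
    """Return True when the spam probability (0.0 to 1.0) reaches the threshold.

    Assumption: a probability exactly at the threshold counts as spam.
    """
    return spam_probability >= threshold_percent / 100.0


print(is_spam(0.90))  # above the default 85% threshold
print(is_spam(0.50))  # below the default 85% threshold
```

Raising the threshold makes the filter more conservative, so fewer legitimate emails are sent to spam at the cost of letting more spam through.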

Tweaking the Filter

Under the Spam Queue tab of the system configuration, you can tweak the AI Spam Filter by changing the following values.

AI Filter Options

Good Token Weight

The weight given to a word when it is contained in a delivered email. Words contained in spam emails are given a weight of 1. Setting this value to 2 therefore weights words in good emails 2 to 1 against words in spam emails.
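To show how such a weight can influence a word's score, here is a minimal sketch modeled on the per-word formula from Paul Graham's approach. The function and parameter names are hypothetical, and the exact formula emailAI Pro uses is an assumption.

```python
def word_spam_probability(good_count: int, spam_count: int,
                          good_emails: int, spam_emails: int,
                          good_token_weight: float) -> float:
    """Estimate how strongly a word indicates spam (0.0 to 1.0).

    Assumption: Graham-style formula; occurrences in good emails are
    multiplied by good_token_weight, occurrences in spam count once.
    """
    if good_emails <= 0 or spam_emails <= 0:
        raise ValueError("both sets of emails must be non-empty")
    good_freq = min(1.0, good_token_weight * good_count / good_emails)
    spam_freq = min(1.0, spam_count / spam_emails)
    if good_freq + spam_freq == 0:
        raise ValueError("word was never seen in either set of emails")
    return spam_freq / (good_freq + spam_freq)


# A word seen equally often in good and spam emails:
print(word_spam_probability(10, 10, 100, 100, good_token_weight=1))
print(word_spam_probability(10, 10, 100, 100, good_token_weight=2))
```

With a weight of 1 the word is neutral, while a weight of 2 pulls its score toward "good", which biases the filter against false positives.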

Min Token Count

The minimum number of words that an email must contain before it can be classified as good or bad. For example, if an email contained just the word "viagra" and the minimum count was 2, the email would not be considered spam. The default value for this option is 0.

Min Count for Inclusion

The minimum number of times a word must appear across all emails to be included in the overall filter. For example, if 100 emails are used to build the filters and the word "viagra" showed up only 2 times, the word would be ignored by the filters. The default value for this is 5.
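The inclusion rule can be pictured with a short sketch. The names below are hypothetical, and counting a word that appears exactly the minimum number of times as included is an assumption.

```python
from collections import Counter


def words_for_filter(good_counts: dict, spam_counts: dict,
                     min_count_for_inclusion: int = 5) -> set:
    """Return the words that appear often enough to be used by the filter.

    Assumption: occurrences in good and spam emails are simply added
    together, and a total equal to the minimum is enough for inclusion.
    """
    totals = Counter(good_counts) + Counter(spam_counts)
    return {word for word, total in totals.items()
            if total >= min_count_for_inclusion}


good = {"meeting": 4, "invoice": 7}
spam = {"meeting": 1, "viagra": 2}
print(sorted(words_for_filter(good, spam)))  # "viagra" appears only 2 times
```

Ignoring rare words keeps the filter from drawing conclusions from too little evidence.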

Min Score

The minimum score a word can have. The default for this is 0.011.

Max Score

The maximum score a word can have. The default for this is 0.99.
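One way to picture Min Score and Max Score together is as a clamp on each word's calculated score. The sketch below is only an illustration with a hypothetical function name; how emailAI Pro applies these limits internally is an assumption.

```python
def clamp_score(score: float, min_score: float = 0.011,
                max_score: float = 0.99) -> float:
    """Keep a word's score inside the configured minimum and maximum."""
    return max(min_score, min(max_score, score))


print(clamp_score(0.0))   # raised to the minimum
print(clamp_score(1.0))   # lowered to the maximum
print(clamp_score(0.4))   # already inside the range, unchanged
```

Keeping scores away from exactly 0 and 1 means that no single word can decide the outcome on its own.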

Likely Spam Score

The score given to a word when it is considered likely to be spam. The default for this is 0.9998.

Certain Spam Score

The score given to a word when it is known to be spam, as determined by the Certain Spam Count. The default for this is 0.9999.

Certain Spam Count

The number of times a word has to appear in scanned spam emails to be treated as a typical spam word. For example, if 100 emails are scanned when building the filter and the word "viagra" showed up in spam emails 11 times, it would be classed as a certain spam word. The default for this is 10.
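The sketch below shows one plausible way the Likely Spam Score, Certain Spam Score and Certain Spam Count could fit together. It assumes, without confirmation from emailAI Pro itself, that these special scores apply to words seen only in spam emails and that a word must appear more than the configured count to be "certain". The function name is hypothetical.

```python
def spam_only_word_score(spam_count: int,
                         certain_spam_count: int = 10,
                         likely_spam_score: float = 0.9998,
                         certain_spam_score: float = 0.9999) -> float:
    """Score a word that has only ever been seen in spam emails.

    Assumptions: the word never appeared in a good email, and "certain"
    means strictly more occurrences than certain_spam_count.
    """
    if spam_count > certain_spam_count:
        return certain_spam_score
    return likely_spam_score


print(spam_only_word_score(11))  # seen 11 times in spam: certain spam word
print(spam_only_word_score(3))   # seen 3 times in spam: likely spam word
```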

Interesting Word Count

The number of words in an email used to gauge whether it is spam or not. The default for this is 15. Because only the most interesting words are used, long emails that are padded to trick filters do not fool the system. For example, if an email contains the word "viagra" followed by a paragraph about Humpty Dumpty intended to trick the filter, the filter only looks at the 15 most interesting words in the email, and "viagra" would be one of them.
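The selection of interesting words can be sketched as follows. This assumes, following Paul Graham's approach, that "interesting" means a score far from the neutral value 0.5 and that the selected scores are combined with the naive Bayes formula. The function names are hypothetical and emailAI Pro's own calculation may differ.

```python
import math


def most_interesting(word_scores: dict, count: int = 15) -> list:
    """Return the words whose scores are furthest from the neutral 0.5."""
    return sorted(word_scores,
                  key=lambda word: abs(word_scores[word] - 0.5),
                  reverse=True)[:count]


def combined_probability(scores: list) -> float:
    """Combine per-word scores into one spam probability.

    Assumes every score is strictly between 0 and 1, so the
    denominator can never be zero.
    """
    spam = math.prod(scores)
    good = math.prod(1.0 - s for s in scores)
    return spam / (spam + good)


scores = {"viagra": 0.99, "humpty": 0.5, "dumpty": 0.52, "wall": 0.45}
chosen = most_interesting(scores, count=2)
print(chosen)
print(combined_probability([scores[word] for word in chosen]))
```

In this example the neutral filler words are dropped, so the strongly scored word dominates the result.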

More Information

For more information about how this filter works, see Paul Graham's article.