Re: spam filtering


Subject: Re: spam filtering
From: David J. Weller-Fahy (lists@weller-fahy.com)
Date: Sat May 10 2003 - 15:56:57 AKDT


* bryan@ak.net <bryan@ak.net> [2003-05-09 15:01]:
> I started filtering yesterday, and I'm getting false negatives too.
> I only trained for a few days, but I thought that would be enough
> with as much spam as I get. Still, the filtering is a big improvement,
> and when I get non-filtered spam, I write them to a 'spam' file, and
> run 'bogofilter -Ns <spam' to reclassify it.

As a matter of fact, I believe that I'm getting more false negatives
since upgrading from 0.7.5 to 0.12.2. Probably means I'll soon be
starting over from empty databases and registering all known spam/ham as
such. I think I'll do that when I return (leaving for a week on
Tuesday). I use a similar setup, but have the correction macros in
mutt so that I don't need to leave the program to reclassify emails.
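
A minimal sketch of the correction step, in Python, assuming bogofilter
is on the PATH and that each message is piped to it on stdin, as in the
quoted 'bogofilter -Ns <spam'. The names correction_command and
reclassify are hypothetical, made up for this illustration; the mutt
macros themselves are not shown.

```python
import subprocess

# Flag pairs: -N unregisters a message as non-spam and -s registers it as
# spam (the quoted 'bogofilter -Ns'); -S and -n are the mirror image.
CORRECTION_FLAGS = {
    "spam": "-Ns",  # false negative: was filed as ham, is really spam
    "ham": "-Sn",   # false positive: was filed as spam, is really ham
}


def correction_command(actual):
    """Return the bogofilter argv that re-registers a message as `actual`."""
    return ["bogofilter", CORRECTION_FLAGS[actual]]


def reclassify(message_bytes, actual):
    """Pipe one message to bogofilter on stdin to correct its registration."""
    subprocess.run(correction_command(actual), input=message_bytes, check=True)
```

A mail client macro would do the same thing by piping the current
message to the command that correction_command returns.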

> > Have you tried anything with the 3 states instead of two yet?
>
> No, I haven't. It would be easy enough to tell bogofilter to do that,
> but what should be done with the results? Would I have to have a
> third, "unsure" mailbox? Two is enough for me. I'll continue to
> reclassify the mistakes, and things should shake out sooner or later.

I believe you would need an unsure mailbox; however, just one more
mailbox wouldn't matter to me: I've already got ~30. One for each
mailing list, and others for friends, family, online-services, etc.
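
A rough sketch of how the three states could map onto mailboxes, in
Python. It assumes bogofilter reports its verdict through the exit
status (0 for spam, 1 for non-spam, 2 for unsure), which should be
checked against the man page of the installed version. The mailbox
names and the functions mailbox_for and classify are hypothetical.

```python
import subprocess

# Assumed exit statuses in three-state mode: 0 = spam, 1 = ham, 2 = unsure.
MAILBOX_FOR_STATUS = {0: "spam", 1: "inbox", 2: "unsure"}


def mailbox_for(status):
    """Map a bogofilter exit status to a (hypothetical) mailbox name."""
    # Any other status (for example an error) falls back to the inbox,
    # so a filter failure never hides mail.
    return MAILBOX_FOR_STATUS.get(status, "inbox")


def classify(message_bytes):
    """Run bogofilter on one message and return the mailbox to file it in."""
    result = subprocess.run(["bogofilter"], input=message_bytes)
    return mailbox_for(result.returncode)
```

With this layout, only the messages that land in "unsure" or in the
wrong mailbox need to be fed back for training.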

I archive all my mail, so it helps to be able to find it. Once I start
re-registering spam/ham after my return, I'll probably also implement
three-state classification. After that, I'll just train on misfiled and
unsure messages.

Either way, I'll let you know how it works out.

Regards,

-- 
David J. Weller-Fahy        | 'These are the questions that kept me out
largely at innocent dot com |  of the really *good* schools.'
www dot weller-fahy dot com |                  - One of The Group



