Joel Uckelman on Wed, 3 Jan 2007 04:19:26 -0700 (MST)

Re: [hosers-talk] greylisting

Thus spake Joshua:
> "Jon Stewart" sez:
> >> Joel Uckelman sez:
> >> >I don't understand why bogofilter works near-optimally for me, but not for
> >> >anyone else. Have you been training it on misclassified mail?
> >> 
> >> i have never retrained it. i thought you told me once i wouldn't have
> >> to? (i do, however, still file away as spam pieces of mail that it
> >> misses when filtering automatically.)
> >
> >
> >I've never gone through the hassle of setting it up. I get enough now that 
> >I probably should. Training it seems like a pain; I have always thought 
> >it's a weird system to filter after incorporation, too. One would think 
> >that there should be a high enough confidence for some spam that the shit 
> >just should go straight to /dev/null, with temporary black-listing of the 
> >sender address and relaying server.

There's no level of confidence at which you can safely bitbucket mail,
unfortunately. About once a month I'll have a piece of real mail get
misclassified, and it will often have scored a spamminess of 1.

> maybe you could set a criterion; it rates as it files. but i have
> certainly had it get things wrong.

The reason that I want to set up greylisting is not because I'm getting
too much spam in my inbox---I'm not. Rather, it's because I want less spam
in my spambox to sort through when I check for misclassified mail.

> the training took a while at the beginning to accumulate enough spam
> but otherwise it was just a one-shot minor annoyance.

As for spam: It's highly likely that we all get the same spam, more or less.
Jon, if you need a spam corpus, I can give you one. Using mine will probably
be only slightly worse than collecting your own. But you do need to collect
several thousand good messages of your own, since my good mail is going to
be statistically different from yours.
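
To illustrate why the good-mail corpus matters more than the spam corpus,
here is a minimal sketch. It uses a simplified Graham-style per-token
estimate, which is an assumption made for illustration and not bogofilter's
actual computation. The messages and the names shared_spam, my_ham, and
your_ham are made-up toy data.

```python
from collections import Counter

def token_counts(messages):
    """Count, for each token, how many messages contain it."""
    counts = Counter()
    for msg in messages:
        counts.update(set(msg.lower().split()))
    return counts

def spamminess(token, spam, ham):
    """Simplified per-token estimate (not bogofilter's real formula):
    the share of the token's relative frequency that comes from spam."""
    s = token_counts(spam)[token] / len(spam)
    h = token_counts(ham)[token] / len(ham)
    if s + h == 0:
        return 0.5  # unseen token: no evidence either way
    return s / (s + h)

# Hypothetical toy corpora: one shared spam set, two different ham sets.
shared_spam = ["cheap mortgage offer", "mortgage rates cheap pills"]
my_ham = ["seminar on logic tomorrow", "logic homework due"]
your_ham = ["mortgage paperwork for the house", "closing date for mortgage"]

# Same spam corpus, different ham corpora, different verdict on one token.
print(spamminess("mortgage", shared_spam, my_ham))
print(spamminess("mortgage", shared_spam, your_ham))
```

With the same spam on both sides, the token "mortgage" looks like pure spam
against the first ham set and is uninformative against the second, which is
the sense in which someone else's good mail is a poor substitute for your own.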

