Joel Uckelman on Fri, 5 Jan 2007 00:50:59 +0100 (CET)



Re: [hosers-talk] greylisting


Thus spake Joshua:
> 
> Joel Uckelman sez:
> >Thus spake Joshua:
> >> 
> >> Joel Uckelman sez:
> >> >> is there some kink that means i should only be running mh2bogo
> >> >> periodically, on masses of email, rather than every time i refile
> >> >> something?
> >> >> 
> >> >
> >> >No, training on mistakes was what I meant.
> >> >
> >> >What do you get when you do this?
> >> >
> >> >   bogoutil -w ~/.bogofilter .MSG_COUNT
> >> 
> >> i reckon that might be a bad sign. compare:
> >> 
> >> ~% bogoutil -w ~/.bogofilter .MSG_COUNT
> >>                                  spam   good
> >> .MSG_COUNT                       4815     32
> >
> >Oh. Well, there's our explanation, then. It has barely any good mail
> >with which it can make statistical comparisons.
> >
> >If you feed the last 4800 or so real messages you've received to
> >'mh2bogo -n', you should see a remarkable improvement in performance.
> 
> you told me what i started with would be enough! argh spam filters.

I was sure I'd said that you need to start with corpora of a few
thousand messages each. If you don't have that many good messages
right now, keep adding them as you go. You want to get the sizes of
the two corpora within about 50% of each other, at worst.
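
If you'd rather check the balance than eyeball it, here is a rough
sketch in Python. It assumes the two-column "spam good" layout shown
in the bogoutil output quoted above, and it reads "within about 50%"
as "the smaller corpus is at least half the size of the larger one",
which is only one interpretation. The names parse_msg_count and
is_balanced are made up for illustration; they are not part of
bogofilter.

```python
def parse_msg_count(output: str) -> tuple[int, int]:
    """Return (spam, good) from the .MSG_COUNT line of the quoted output.

    Assumes the layout shown in this thread: token name, then the
    spam count, then the good count, separated by whitespace.
    """
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0] == ".MSG_COUNT":
            return int(fields[1]), int(fields[2])
    raise ValueError("no .MSG_COUNT line found")


def is_balanced(spam: int, good: int, min_ratio: float = 0.5) -> bool:
    """True if the smaller corpus is at least min_ratio of the larger.

    min_ratio=0.5 is an assumed reading of "within about 50%".
    """
    larger = max(spam, good)
    if larger == 0:
        return False  # nothing trained yet
    return min(spam, good) / larger >= min_ratio


# The counts from the output quoted earlier in this thread.
sample = """\
                                 spam   good
.MSG_COUNT                       4815     32
"""

spam, good = parse_msg_count(sample)
print(f"spam={spam} good={good} balanced={is_balanced(spam, good)}")
```

With the counts above this reports the corpora as unbalanced, which
matches what we're seeing.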
 
-- 
J.
_______________________________________________
hosers-talk mailing list
hosers-talk@xxxxxxxxxxx
http://lists.ellipsis.cx/mailman/listinfo/hosers-talk