Re: [Ifile-discuss] Negative words
From: Jason Rennie
Subject: Re: [Ifile-discuss] Negative words
Date: Fri, 31 Oct 2003 08:53:29 -0500
address@hidden said:
> - Is there a simple way to trick ifile into having "negative" word
> values (i.e. this "word" should never match folder X ?)
i.e. if this word appears, then the score for folder X should be
-infinity?
I think this is possible without changing any code. What you'd do is give
that word a count of -1 for that class in the database (you'd need to edit
your database). ifile smooths counts: it adds one to every count, so if
the database has -1, ifile's smoothed count will be zero. Then, if ifile
sees that word, it will calculate that word's probability for folder X to
be zero. ifile multiplies all of the probabilities together for each
class, so folder X will get a zero probability. The numbers output by
ifile -q are log probabilities. log(0)=-infinity, so if the -1 word
appears, folder X will never be chosen!
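To make the arithmetic concrete, here is a minimal sketch of add-one smoothing on a toy model. It is not ifile's actual code: the function names, the example counts, and the denominator (total count plus vocabulary size) are all assumptions made for illustration. Only the "add one to every count" step comes from the description above.

```python
import math

def smoothed_log_prob(count, total_count, vocab_size):
    """Log of the add-one smoothed probability of a word in one folder."""
    smoothed = count + 1  # add-one smoothing: a stored -1 becomes 0
    if smoothed <= 0:
        # log(0) is -infinity; written out because math.log(0) raises ValueError
        return float("-inf")
    # Assumed normalizer: total word count plus vocabulary size
    return math.log(smoothed / (total_count + vocab_size))

def folder_score(words, counts, total_count, vocab_size):
    """Sum of per-word log probabilities, i.e. the log of their product."""
    return sum(
        smoothed_log_prob(counts.get(w, 0), total_count, vocab_size)
        for w in words
    )

# Hypothetical counts for folder X; "viagra" was edited to -1 in the database.
folder_x = {"meeting": 4, "viagra": -1}

print(folder_score(["meeting"], folder_x, 10, 100))            # finite score
print(folder_score(["meeting", "viagra"], folder_x, 10, 100))  # -inf
```

One zero-probability word is enough: adding -infinity to any finite sum of logs leaves -infinity, which is the log-space version of multiplying by zero.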
There's just one caveat: don't create too many of these -1 count words.
If you get an e-mail where, for each class/folder, a -1 count word
appears, all of the classes will get a -infinity score! Though, I
guess that's kind of what you'd want to happen anyway, right? :)
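If you post-process the scores yourself, that all-excluded case is easy to detect. The sketch below assumes you already have one log score per folder in a dict; the `pick_folder` helper and the folder names are hypothetical, and this says nothing about what ifile itself does when every score is -infinity.

```python
def pick_folder(scores):
    """Return the best-scoring folder, or None if every folder is excluded."""
    best = max(scores, key=scores.get)
    if scores[best] == float("-inf"):
        return None  # every folder hit one of its -1 count words
    return best

# Hypothetical scores: each folder saw one of its own -1 count words.
all_excluded = {
    "work": float("-inf"),
    "personal": float("-inf"),
    "spam": float("-inf"),
}
print(pick_folder(all_excluded))  # None

# Only one folder excluded: the remaining folder wins as usual.
print(pick_folder({"work": -3.2, "spam": float("-inf")}))  # work
```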
Jason