Re: [help-GIFT] Re: Clarification on inverted file
From: David Squire
Subject: Re: [help-GIFT] Re: Clarification on inverted file
Date: Wed, 22 Aug 2001 16:32:38 +1000
(forwarded for Wolfgang)
On Monday 20 August 2001 12:10, David Squire wrote:
> Wolfgang Mueller wrote:
> > MARS is strongly inspired by text retrieval,
> > but modifies the retrieval scheme, basing the weighting not on the
> > document frequency but on the standard deviation of the term frequency.
>
> I haven't got the article in front of me, but if I recall correctly they
> didn't use standard deviations of term frequencies, but rather std. devs.
> of continuous-valued features. This would mean that features that took on a
> wide range of values in the query would get a low weight.
OK. I was not precise enough:
The continuous feature values are treated as "pseudo tf", and MARS then looks
at the standard deviations of these pseudo tfs.
> This is clearly related to the term frequency idea, since if the features
> were quantized a la Viper, then features with low std. dev. would tend to
> get high term frequencies for the quantiles around the mean.
MARS uses the log standard deviation as the equivalent of the log inverse
*document* frequency. By design, this goes in the same direction as Viper's
tf.idf weighting, but it does not capture multimodal distributions and, more
importantly, multimodal feedback.
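To make the difference concrete, here is a rough sketch in Python (using NumPy). It is not the actual MARS or Viper code: the function names, the `-log(std)` form of the weight, the 10-bin quantization and the toy feature values are all assumptions chosen for illustration. It compares a weight based on the standard deviation with a simple quantized "term frequency" histogram for a tightly clustered, a bimodal, and a widely spread feature.

```python
import numpy as np

def inverse_std_weight(values, eps=1e-6):
    """Hypothetical std-dev-based weight: low spread gives a high weight.

    This is an illustrative stand-in, not the formula from the MARS paper.
    """
    return -np.log(np.std(values) + eps)

def quantized_tf(values, bins=10, value_range=(0.0, 1.0)):
    """Count how many values fall into each quantization bin (a pseudo tf)."""
    counts, _ = np.histogram(values, bins=bins, range=value_range)
    return counts

# Toy feature values in [0, 1] (deterministic, for illustration only).
spread = 0.02 * np.linspace(-1.0, 1.0, 100)
tight = 0.55 + 0.02 * np.linspace(-1.0, 1.0, 200)       # one narrow cluster
bimodal = np.concatenate([0.25 + spread, 0.75 + spread])  # two narrow clusters
broad = np.linspace(0.0, 1.0, 200)                      # spread over the range

for name, values in [("tight", tight), ("bimodal", bimodal), ("broad", broad)]:
    print(name, round(float(inverse_std_weight(values)), 2),
          quantized_tf(values).tolist())
```

In this toy setup the tight feature gets a high weight and a single dominant bin. The bimodal feature gets almost the same low weight as the broad one, although its quantized histogram still shows two sharply peaked bins, which is the information a single standard deviation cannot represent.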
Cheers,
Wolfgang
--
Wolfgang Müller,
assistant-doctorant == PhD student (2001), teaching assistant
Personal page: http://cui.unige.ch/~vision/members/WolfgangMueller.html
Maintainer, GNU Image Finding Tool (http://www.gnu.org/software/gift)