Re: [Gnumed-devel] phrase usage scoring


From: Hilmar Berger
Subject: Re: [Gnumed-devel] phrase usage scoring
Date: Thu, 18 Sep 2003 16:51:49 +0200 (MEST)

> > Incremented the counter. Discuss the pitfalls however.
> Well, the obvious pitfalls are that
> 
> a) a simple integer field will overflow eventually
> b) terms that are used often will have astronomically high scores
You could use an upper threshold. Once the first term reaches the threshold,
divide all scores by 10 (or some other factor). This might slightly change the
order, but only where the differences among the top ranks are quite small. For
the lowest ranks, integer division will reduce the score to 0 (you will have to
set it back to 1 afterwards). That way the lowest ranks are somewhat
disadvantaged, but who cares about the 50th (or even 500th) rank?
I almost tend to believe that a float would be better in that case (you lose
less information when scaling down).
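
A minimal sketch of the threshold idea, to make the trade-off concrete. The
function name `bump`, the dict-based storage, and the default threshold and
divisor are assumptions for illustration only, not existing GNUmed code:

```python
def bump(scores, term, threshold=1000, divisor=10, as_float=False):
    """Increment a term's usage score; rescale all scores at the threshold.

    Hypothetical helper: `scores` maps term -> usage score.
    """
    scores[term] = scores.get(term, 0) + 1
    if scores[term] >= threshold:
        for key in scores:
            if as_float:
                # Floats keep the relative differences between terms.
                scores[key] = scores[key] / divisor
            else:
                # Integer division pushes small scores to 0; clamp to 1.
                scores[key] = max(1, scores[key] // divisor)
    return scores


int_scores = bump({"a": 999, "b": 995, "c": 5}, "a")
float_scores = bump({"a": 999, "b": 995, "c": 5}, "a", as_float=True)
```

With integers, two close scores such as 995 and 991 both collapse to 99, which
is where the order can change. With floats they stay distinct.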

> Simple percented score increases will not work as they make
> all terms asymptotically reach the same weighting unless some
??? I thought that a 10% increase per use ends up at startvalue * 1.1^x after
x uses, which grows without bound as x gets large. That is an exponential
function. No asymptotic value there.
An exponential function might be better at separating the top ranks. On the
other hand, it will reach a threshold faster.
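
To put a rough number on "faster", here is a small sketch that counts how many
uses it takes to hit a threshold under each scheme. The function names, the
start value of 1, and the threshold of 1000 are assumptions chosen only for
illustration:

```python
def uses_until_threshold_exponential(start, threshold, growth=1.1):
    """Count uses until start * growth**x reaches the threshold."""
    score, uses = start, 0
    while score < threshold:
        score *= growth  # 10% increase per use by default
        uses += 1
    return uses


def uses_until_threshold_additive(start, threshold, step=1):
    """Count uses until start + step*x reaches the threshold."""
    score, uses = start, 0
    while score < threshold:
        score += step  # plain counter increment
        uses += 1
    return uses


exp_uses = uses_until_threshold_exponential(1.0, 1000.0)
add_uses = uses_until_threshold_additive(1, 1000)
```

Under these assumptions the 10% scheme reaches 1000 after 73 uses, while the
plain counter needs 999, so rescaling would kick in far more often.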

Hilmar

