Send Aspell-user mailing list submissions to
address@hidden
To subscribe or unsubscribe via the World Wide Web, visit
http://lists.gnu.org/mailman/listinfo/aspell-user
or, via email, send a message with subject or body 'help' to
address@hidden
You can reach the person managing the list at
address@hidden
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Aspell-user digest..."
Today's Topics:
1. Re: suggestions with confidence levels (Ethan Bradford)
2. Regarding Edit Distance function (Jon P. Grewer)
3. Re: Regarding Edit Distance function (Kevin Atkinson)
4. Re: suggestions with confidence levels (Kevin Atkinson)
5. Re: suggestions with confidence levels (Amit Khemka)
6. Re: suggestions with confidence levels (Kevin Atkinson)
7. Re: suggestions with confidence levels (Amit Khemka)
8. Re: suggestions with confidence levels (Amit Khemka)
9. aspell 0.60.5 / dansk dictionary: returning lines starting
with question mark characte (Daniel Sippel)
----------------------------------------------------------------------
Message: 1
Date: Wed, 25 Apr 2007 09:10:11 -0700
From: "Ethan Bradford" <address@hidden>
Subject: Re: [Aspell-user] suggestions with confidence levels
To: "Amit Khemka" <address@hidden>
Cc: address@hidden
Message-ID:
<address@hidden>
Content-Type: text/plain; charset="iso-8859-1"
On 4/25/07, Amit Khemka <address@hidden> wrote:
> On 4/24/07, Ethan Bradford <address@hidden> wrote:
> > It doesn't do this out of the box, but if you implement patch
> > 1489981, it will.
> >
>
> Thanks :-), let me get a look !
>
> cheers,
Hi, finally I got the patch working with python bindings !
Very good!
I have another query: Given a word and its possible suggestions how
can I decide that anything with less than score 50, is a good
replacement candidate ! Or say if the score of the first suggestion is
200 rather than picking that as a possible alternative i rather ignore
it.
May be knowing the way these scores are calculated would help, or else
i guess i will have to use some heuristics.
Empirically, just from looking at several lists of spelling suggestions, I
came up with a cut off of 130.
The biggest advantage I got from the score was in knowing when the top
suggestions had the same score, and thus could be reorganized based on other
criteria (I have a rough estimate of frequency from another source). When
the scores are different, I almost always use their ranking. The fudge
factor I put in to allow frequency to override this is to raise the ratio of
the scores to the 20th power and compare that to the ratios of the
frequencies.
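A minimal Python sketch of one way to read the heuristic above. It makes several assumptions not stated in the thread: a lower score means a closer match, the patched bindings yield (word, score) pairs, and the frequency table is a plain dict. The names rerank, CUTOFF, POWER and MIN_FREQ are hypothetical. Saying that a worse-scored word B beats A when freq_B / freq_A exceeds (score_B / score_A) ** 20 is equivalent to sorting by score ** 20 / frequency, which is what the sketch does.

```python
from typing import Dict, List, Tuple

CUTOFF = 130      # empirical cut-off quoted above; tune for your own data
POWER = 20        # exponent of the "fudge factor" described above
MIN_FREQ = 1e-9   # hypothetical floor so unseen words do not divide by zero


def rerank(scored: List[Tuple[str, int]], freq: Dict[str, float]) -> List[str]:
    """Re-rank (word, score) pairs; a lower score is assumed to be better."""
    # Discard suggestions whose score is above the cut-off.
    kept = [(word, score) for word, score in scored if score <= CUTOFF]

    def key(item: Tuple[str, int]) -> float:
        word, score = item
        # Equal scores are ordered purely by frequency; unequal scores need
        # a frequency ratio larger than (score ratio) ** POWER to swap places.
        return (score ** POWER) / max(freq.get(word, 0.0), MIN_FREQ)

    return [word for word, _ in sorted(kept, key=key)]
```

With equal scores the more frequent word comes first; with unequal scores the frequency gap has to be large before it overrides the score order.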
cheers,
amit.
----
_______________________________________________
Aspell-user mailing list
address@hidden
http://lists.gnu.org/mailman/listinfo/aspell-user
------------------------------
Message: 2
Date: Wed, 25 Apr 2007 13:27:17 -0400
From: "Jon P. Grewer" <address@hidden>
Subject: [Aspell-user] Regarding Edit Distance function
To: address@hidden
Message-ID: <address@hidden>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
I read in the documentation that the edit distance function has been
modified to take into account swapped letters. Does it only consider
adjacent swaps or all swaps?
Regretfully I am unable to understand the C code.
Kind regards,
--Jon Grewer
------------------------------
Message: 3
Date: Wed, 25 Apr 2007 12:17:34 -0600 (MDT)
From: Kevin Atkinson <address@hidden>
Subject: Re: [Aspell-user] Regarding Edit Distance function
To: "Jon P. Grewer" <address@hidden>
Cc: address@hidden
Message-ID: <address@hidden>
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
On Wed, 25 Apr 2007, Jon P. Grewer wrote:
I read in the documentation that the edit distance function has been
modified to take into account swapped letters. Does it only consider
adjacent swaps or all swaps?
Adjacent
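
To illustrate what "adjacent swaps only" means in practice, here is a generic Python sketch of an edit distance that counts a swap of two neighbouring letters as one edit (the "optimal string alignment" variant). It is not Aspell's C implementation, and it uses unit costs, which is an assumption made for this example; the function name osa_distance is hypothetical.

```python
def osa_distance(a: str, b: str) -> int:
    """Edit distance where swapping two adjacent letters costs one edit."""
    # d[i][j] holds the distance between a[:i] and b[:j].
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Adjacent transposition: "ab" -> "ba" counts as a single edit.
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[len(a)][len(b)]
```

Under this definition "teh" is one edit away from "the", while "abc" and "cba" (a non-adjacent swap) are two edits apart.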
------------------------------
Message: 4
Date: Wed, 25 Apr 2007 12:38:50 -0600 (MDT)
From: Kevin Atkinson <address@hidden>
Subject: Re: [Aspell-user] suggestions with confidence levels
To: Amit Khemka <address@hidden>
Cc: address@hidden
Message-ID: <address@hidden>
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
On Tue, 24 Apr 2007, Amit Khemka wrote:
I am using aspell 0.60.3 ( with python bindings) for generating
possible suggestions of a wrongly spelled word. It returns a sorted
list,
By sorted list do you mean, alphabetically?
------------------------------
Message: 5
Date: Thu, 26 Apr 2007 11:30:32 +0530
From: "Amit Khemka" <address@hidden>
Subject: Re: [Aspell-user] suggestions with confidence levels
To: address@hidden
Message-ID:
<address@hidden>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
On 4/26/07, Kevin Atkinson <address@hidden> wrote:
On Tue, 24 Apr 2007, Amit Khemka wrote:
> I am using aspell 0.60.3 ( with python bindings) for generating
> possible suggestions of a wrongly spelled word. It returns a sorted
> list,
By sorted list do you mean, alphabetically?
Nopes, sorted by some kind of 'score' or closeness to input word.
--
----
Amit Khemka
Home Page: www.cse.iitd.ernet.in/~csd00377
Endless the world's turn, endless the sun's Spinning, Endless the quest;
I turn again, back to my own beginning, And here, find rest.
------------------------------
Message: 6
Date: Thu, 26 Apr 2007 00:03:50 -0600 (MDT)
From: Kevin Atkinson <address@hidden>
Subject: Re: [Aspell-user] suggestions with confidence levels
To: Amit Khemka <address@hidden>
Cc: address@hidden
Message-ID: <address@hidden>
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
On Thu, 26 Apr 2007, Amit Khemka wrote:
On 4/26/07, Kevin Atkinson <address@hidden> wrote:
On Tue, 24 Apr 2007, Amit Khemka wrote:
> I am using aspell 0.60.3 ( with python bindings) for generating
> possible suggestions of a wrongly spelled word. It returns a sorted
> list,
By sorted list do you mean, alphabetically?
Nopes, sorted by some kind of 'score' or closeness to input word.
That's what it should do, just making sure that the python wrapper wasn't
sorting them alphabetically.
------------------------------
Message: 7
Date: Thu, 26 Apr 2007 11:36:25 +0530
From: "Amit Khemka" <address@hidden>
Subject: Re: [Aspell-user] suggestions with confidence levels
To: address@hidden
Message-ID:
<address@hidden>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
On 4/26/07, Kevin Atkinson <address@hidden> wrote:
On Thu, 26 Apr 2007, Amit Khemka wrote:
<snip>
>> > I am using aspell 0.60.3 ( with python bindings) for generating
>> > possible suggestions of a wrongly spelled word. It returns a sorted
>> > list,
>>
>> By sorted list do you mean, alphabetically?
>
> Nopes, sorted by some kind of 'score' or closeness to input word.
That's what it should do, just making sure that the python wrapper wasn't
sorting them alphabetically.
Aspell is working perfectly; it's only that I wanted some additional
'score' information for post-processing.
--
----
Amit Khemka -- onyomo.com
Home Page: www.cse.iitd.ernet.in/~csd00377
Endless the world's turn, endless the sun's Spinning, Endless the quest;
I turn again, back to my own beginning, And here, find rest.
------------------------------
Message: 8
Date: Thu, 26 Apr 2007 11:41:33 +0530
From: "Amit Khemka" <address@hidden>
Subject: Re: [Aspell-user] suggestions with confidence levels
To: address@hidden
Message-ID: <address@hidden>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
On 4/25/07, Ethan Bradford <address@hidden> wrote:
<snip>
> I have another query: Given a word and its possible suggestions how
> can I decide that anything with less than score 50, is a good
> replacement candidate ! Or say if the score of the first suggestion is
> 200 rather than picking that as a possible alternative i rather ignore
> it.
>
> May be knowing the way these scores are calculated would help, or else
> i guess i will have to use some heuristics.
Empirically, just from looking at several lists of spelling suggestions, I
came up with a cut off of 130.
The biggest advantage I got from the score was in knowing when the top
suggestions had the same score, and thus could be reorganized based on other
criteria (I have a rough estimate of frequency from another source). When
the scores are different, I almost always use their ranking. The fudge
factor I put in to allow frequency to override this is to raise the ratio of
the scores to the 20th power and compare that to the ratios of the
frequencies.
Sounds interesting; I need it for a similar purpose. I think I will
need to find my own parameters. Thanks for your input (and the patch)!
cheers,
----
Amit Khemka
Home Page: www.cse.iitd.ernet.in/~csd00377
Endless the world's turn, endless the sun's Spinning, Endless the quest;
I turn again, back to my own beginning, And here, find rest.
------------------------------
Message: 9
Date: Thu, 26 Apr 2007 09:07:11 +0200
From: Daniel Sippel <address@hidden>
Subject: [Aspell-user] aspell 0.60.5 / dansk dictionary: returning
lines starting with question mark characte
To: address@hidden
Message-ID: <address@hidden>
Content-Type: text/plain; charset=iso-8859-15
Hello!
I have a problem concerning aspell 0.60.5 in combination with the Danish
dictionary aspell-da-0.50.1-0.tar.bz2
Aspell returns two lines beginning with a question mark ("?") character in
its result. The program that tries to interpret the result fails with an
exception. I read the aspell developer documentation and learned
that the result lines should start with *, & or #.
"OK: *
Suggestions: & original count offset: miss, miss, ...
None: # original offset"
Why does aspell return some lines starting with the question mark
character as result? This only happens when using the Danish dictionary;
German and English spell checking work correctly.
Please see shell output for details.
Thanks for helping,
Daniel
aspell -a --lang=da_DK -H < /tmp/aspell_data_XOYGeP 2>&1
@(#) International Ispell Version 3.1.20 (but really Aspell 0.60.5)
*
? LASTNAME 0 17: LAST
& EMP 8 33: EM, EP, LEMP, BMP, EMS, EMU, ESP, YMP
& BIRTHDAY 2 37: BIRTHA, BIRTHAS
& EMP 8 53: EM, EP, LEMP, BMP, EMS, EMU, ESP, YMP
? FIRSTNAME 0 57: FIRS
& Ich 3 74: Dich, Mich, Ih
& bin 35 78: bind, Bina, Bine, Bing, bien, bon, vin, bi, in, bid, Ben,
Bia, Bic, Bie, Bit, Fin, Lin, Tin, ban, ben, bie, bil, bio, bip, bis, bit,
bøn, din, fin, gin, hin, lin, min, pin, sin
& ein 29 82: Eina, Eino, Elin, Erin, Evin, en, in, én, Hein, tein, Eia,
Eik, Fin, Lin, Tin, din, egn, eis, evn, fin, gin, hin, lin, min, pin, sin,
tin, vin, yin
& falsches 11 86: falses, galoches, falskes, falsedes, falsenes, frosches,
kaleches, Falsters, fastes, Falster, fastres
& WÃ 24 95: EA, A, W, Å, EWA, OWA, CA, DA, DÅ, FA, FÅ, GÅ, HA, JA, LA, LÅ,
MÅ, NÅ, PÅ, RÅ, SÅ, TÅ, WC, W'S
& rt. 27 98: Rit, Rut, et, r's, rat, ret, ry, råt, ét, R, T, BRT, art,
urt, ært, IT, NT, Ro, at, it, re, ri, ro, ru, rå, æt, r'et
//Input file:
%
^A
!
^<P><BR><BR>LUCS LASTNAME<BR><BR>EMP BIRTHDAY<BR><BR>EMP
FIRSTNAME<BR><BR>Ich bin ein falsches Wört.</P>
//everything is fine
aspell -a --lang=de_DE -H < /tmp/aspell_data_XOYGeP 2>&1
@(#) International Ispell Version 3.1.20 (but really Aspell 0.60.5)
*
& LUCS 24 12: LUCHS, LUGS, LUD, LOCHS, FLUCHS, GLÜCKS, ULKS, LUGST, LÜGST,
BUCHS, FUCHS, LACHS, LACKS, LECHS, LECKS, RUCKS, WUCHS, LUG, LOS, LOKS,
FLUGS, BUGS, LOCH, LÖSS
& LASTNAME 15 17: LAST NAME, LAST-NAME, HOSTNAME, FESTNAHME, LÄSTERE,
LASTKAHN, LÄSTERTE, LÄSTERNDE, PFADNAME, LATERNE, MAßNAHME, SYSTEMNAME,
LÄSTERNS, LÄSTERND, LÜSTERNE
& EMP 6 33: EM, EMMA, ENG, MB, NEPP, HEMD
& BIRTHDAY 30 37: BRITE, BIETET, BITTET, BIETE, BITTE, BORTE, BARTES,
BIERTISCH, HIRTE, BIRKE, BIRNE, BÄRTE, WIRTE, BRIETET, BEIRRTET, BIERE,
BEIRRTE, BILDES, BIRNEN, BOHRTE, IRRTET, VIERTE, BRUTEI, PARTEI, IRRTE,
BIEREN, ZIERDE, GIERTE, WIRRTE, ZIERTE
& EMP 6 53: EM, EMMA, ENG, MB, NEPP, HEMD
& FIRSTNAME 9 57: FIRST NAME, FIRST-NAME, FESTNAHME, FORSTMANN,
FORSTMANNES, FORSTMÄNNER, WORTSTÄMME, FIXSTERNE, SYSTEMNAME
& WÃ 14 95: WAL, WAR, WAS, FA, TWA, SA, WC, WM, CA, DA, JA, MA, WG, WO
& rt 12 98: RTC, RTL, Rat, rot, rät, et, Art, Ort, AT, St, kt, lt
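
A hedged Python sketch of a pipe-mode line parser that does not fail on "?" lines. It assumes, based only on the output shown above, that a "?" line has the same field layout as an "&" line ("? original count offset: guess, ..."), and that each result arrives as one logical line rather than wrapped as in this mail. The names Result and parse_line are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Result(NamedTuple):
    kind: str               # "&", "?" or "#"
    word: str               # the original word that was checked
    offset: int             # position of the word in the input line
    suggestions: List[str]  # empty for "#" lines


def parse_line(line: str) -> Optional[Result]:
    """Parse one result line; return None for lines that carry no word."""
    line = line.rstrip("\n")
    # Skip blank lines, the version banner and the "word is correct" marker.
    if not line or line.startswith("@(#)") or line == "*":
        return None
    kind = line[0]
    if kind in ("&", "?"):
        # Layout assumed for both: "<kind> word count offset: s1, s2, ..."
        head, _, tail = line.partition(": ")
        _, word, _count, offset = head.split(" ")
        suggestions = [s.strip() for s in tail.split(",") if s.strip()]
        return Result(kind, word, int(offset), suggestions)
    if kind == "#":
        # Layout: "# word offset" (no suggestions available)
        _, word, offset = line.split(" ")
        return Result(kind, word, int(offset), [])
    raise ValueError("unrecognised result line: %r" % line)
```

For example, parse_line("? LASTNAME 0 17: LAST") yields a Result of kind "?" with the single entry "LAST", so a caller can decide separately whether to treat such guesses like regular suggestions.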
------------------------------
End of Aspell-user Digest, Vol 53, Issue 9
******************************************