[Aspell-user] Re: Aspell-user Digest, Vol 57, Issue 1
From: eleonora46
Subject: [Aspell-user] Re: Aspell-user Digest, Vol 57, Issue 1
Date: Tue, 14 Aug 2007 10:05:32 +0200
Hello,
I see the following in your sample URL:
scope — geniral usage
So for me aspell works (almost) perfectly: it treats "scope", "—" and
"usage" as correct, and flags "geniral" as an error.
In my opinion, your algorithm should also count "—" as a word; that
would fix the problem.
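A minimal sketch of that idea in Python, assuming that splitting on whitespace is close enough to what aspell does for text like your sample (I have not checked aspell's exact tokenization rules, so punctuation attached to a word or hyphenated words may still be counted differently). The function name word_spans is only for illustration:

```python
import re

def word_spans(text):
    """Return (word, start, end) for every run of non-whitespace characters.

    A lone dash surrounded by spaces is counted as a word here, which
    matches the four result lines aspell printed for the sample text.
    """
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]

# The sample line, with the long dash written as an escape (U+2014).
line = "scope \u2014 geniral usage"
spans = word_spans(line)

# aspell flagged the third word, so take index 2 from our own word list.
word, start, end = spans[2]
```

With the dash counted, the third entry lines up with the word aspell complained about, and start/end are character indexes into the original line.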
-eleonora
> Hello,
>
> I am playing around with aspell as a server side spell checker for a
> flash application. It works beautifully (and fast as hell too!), but I
> did notice one little oddity that I haven't been able to find an
> explanation for in the docs.
>
> The problem happens when there is a special character in the text. I
> am not sure which special characters cause my word counting
> algorithm to fail, but here is an example of one that caused
> breakage (one of those long dashes, in some text copied from a
> wiki).
>
> http://labs.splashlabs.com/spellcheck/1186978249
>
> When I pipe the above file through aspell (en_US), I get back the result:
>
> aspell -a < 1186978249
> @(#) International Ispell Version 3.1.20 (but really Aspell 0.50.5)
> *
> *
> & geniral 5 10: general, genital, genial, generally, generals
> *
>
> So it appears to count the lone character as a word. In my own
> program, I have to count words to find the start and end char indexes
> of the incorrect word. Since my algorithm does not count it as a word,
> my word count becomes off.
>
> Are there any options I can pass to prevent it from being counted? Or
> is there a way to figure out what all is counted as a word so I can
> match my own regex to it?
>
> Thanks for any advice!
> ...aaron
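Another option that avoids counting words at all: in the ispell-compatible output, a "&" line has the form "& word count offset: suggestions" and a "#" line has the form "# word offset", so the position of the bad word is already reported. Below is a rough sketch of reading it in Python. The helper names are made up for illustration, and the byte-to-character conversion rests on an assumption: the offset 10 in the output above is consistent with a byte offset into UTF-8 text (5 bytes for "scope", 1 for the space, 3 for the dash, 1 for the space), but I have not verified the actual encoding of the file.

```python
def parse_aspell_line(line):
    """Parse one result line from `aspell -a`.

    Returns a dict for a misspelled word ("&" or "#" lines) and None for
    anything else (for example "*" for a correct word, or the banner).
    """
    line = line.rstrip("\n")
    if line.startswith("& "):
        head, _, tail = line.partition(": ")
        _, word, _count, offset = head.split(" ")
        return {"word": word, "offset": int(offset),
                "suggestions": tail.split(", ")}
    if line.startswith("# "):
        _, word, offset = line.split(" ")
        return {"word": word, "offset": int(offset), "suggestions": []}
    return None

def byte_to_char_offset(text, byte_offset, encoding="utf-8"):
    """Convert a byte offset to a character index (assumes byte offsets)."""
    prefix = text.encode(encoding)[:byte_offset]
    return len(prefix.decode(encoding, errors="ignore"))

sample = "scope \u2014 geniral usage"
result = parse_aspell_line(
    "& geniral 5 10: general, genital, genial, generally, generals")
start = byte_to_char_offset(sample, result["offset"])
end = start + len(result["word"])
```

If the offsets turn out to be character offsets in your setup, the conversion step can simply be dropped and result["offset"] used directly.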