Re: [Aspell-user] Aspell quality measurement (was: Arabic)
From: Lars Aronsson
Subject: Re: [Aspell-user] Aspell quality measurement (was: Arabic)
Date: Sat, 15 Apr 2006 17:15:18 +0200 (CEST)

Mohammed Sameer wrote:
> On Mon, Apr 10, 2006 at 01:12:39PM +0200, Lars Aronsson wrote:
> > I know aspell works for Swedish, but I'm not convinced that it
> > is any better than ispell for Swedish. I don't have any test
> > case to determine the quality of its function.
>
> I still don't know anything about the sounds-alike things. I still
> have a long way to go :-)

Has anybody (for any language) developed a test suite or quality
measurement for the sounds-alike functionality? How can we know
if aspell is any better (or how much better) than ispell?
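
One possible shape for such a measurement, as a rough sketch only: take
a list of (misspelling, intended word) pairs and count how often the
intended word shows up among the first few suggestions. Everything named
here is hypothetical: suggestion_hit_rate, toy_suggest and the three
sample pairs are made up for illustration, and toy_suggest is a stand-in
for whatever wrapper one would write around aspell or ispell.

```python
def suggestion_hit_rate(suggest, test_pairs, top_n=5):
    """Fraction of misspellings whose intended word appears
    among the first top_n suggestions returned by suggest()."""
    if not test_pairs:
        raise ValueError("test_pairs must not be empty")
    hits = sum(
        1 for wrong, right in test_pairs
        if right in suggest(wrong)[:top_n]
    )
    return hits / len(test_pairs)


# Toy stand-in for a real spell checker (hypothetical data, not
# actual aspell or ispell output).
TOY_SUGGESTIONS = {
    "recieve": ["receive", "relieve"],
    "seperate": ["desperate", "separate"],
    "definately": ["defiantly"],
}


def toy_suggest(word):
    return TOY_SUGGESTIONS.get(word, [])


pairs = [
    ("recieve", "receive"),
    ("seperate", "separate"),
    ("definately", "definitely"),
]

print("top-1 hit rate:", suggestion_hit_rate(toy_suggest, pairs, top_n=1))
print("top-5 hit rate:", suggestion_hit_rate(toy_suggest, pairs, top_n=5))
```

Running the same pair list through two checkers would give directly
comparable numbers, provided the pairs come from real error data rather
than invented examples like these.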

Do we have any statistics (for different languages) on what the
common spelling and typing errors (typos) are?

I have plenty of statistics on common OCR errors ("scannos") for
the Scandinavian languages. In "Project Runeberg" (runeberg.org)
I maintain raw OCR text files under RCS version control while
volunteers proofread them online. "Distributed Proofreaders"
(pgdp.net) does the same for many languages. Running wdiff (a
word-level diff) between the raw OCR text and the final proofread
text produces a list of the corrections made (and thus of the errors).
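
The wdiff step can be approximated in a few lines, shown here as a
minimal sketch rather than the actual procedure used at Project
Runeberg. It uses Python's standard difflib instead of the wdiff
program, and the function name word_corrections and the sample
sentence are assumptions made up for illustration.

```python
import difflib
from collections import Counter


def word_corrections(raw_text, proofread_text):
    """Count (wrong, right) word pairs that differ between the raw
    text and the proofread text, using a word-level diff."""
    raw_words = raw_text.split()
    final_words = proofread_text.split()
    matcher = difflib.SequenceMatcher(
        None, raw_words, final_words, autojunk=False
    )
    corrections = Counter()
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "replace":
            continue  # pure insertions and deletions are ignored here
        old, new = raw_words[i1:i2], final_words[j1:j2]
        if len(old) == len(new):
            # Same number of words on both sides: pair them one by one.
            for wrong, right in zip(old, new):
                corrections[(wrong, right)] += 1
        else:
            # Words were split or merged: keep the whole span together.
            corrections[(" ".join(old), " ".join(new))] += 1
    return corrections


# Invented sample text with typical OCR confusions (b/h, c/e).
raw = "Tbe quick brown fox jumps ovcr tbe lazy dog"
final = "The quick brown fox jumps over the lazy dog"
for (wrong, right), count in word_corrections(raw, final).most_common():
    print(f"{count}\t{wrong} -> {right}")
```

Summing these counters over many proofread pages would give a frequency
table of scannos. The same approach should work for any before/after
pair of texts, as long as most changes are corrections rather than
rewrites.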

Could Wikipedia's version history be used for spelling error
statistics? Has anybody tried this? Can OpenOffice and other
word processing software be made to report which corrections are
made?

--
Lars Aronsson (address@hidden)
Aronsson Datateknik - http://aronsson.se