[Aspell-user] forbidden words
From: ge
Subject: [Aspell-user] forbidden words
Date: Fri, 30 Jun 2006 21:56:59 +0200
Kevin,
I created an aspell dictionary for Hungarian with 870 thousand words.
When building it (using make), however, I get lots of messages like:
Warning: Removing invalid affix 'w' from word Bonaparté.
The flag w means that this word, wherever it appears, is marked as bad.
This might look like nonsense to you: "Why does the word appear in the list
at all if it is wrong?"
However, it does make sense. An example:
There is a Hungarian word: öl (kill)
öl is then combined with all the verb prefixes, like
megöl
elöl
kiöl
átöl
etc.
Generating all the prefixed forms makes sense, since there are at least 20
prefixes, and listing every verb with each prefix it actually takes would be a
lot of work for little gain. In most cases there is only a small chance that
this creates wrong words. Unfortunately, sometimes it does:
There is no such Hungarian word as elöl, but there is a word elől (ahead),
which must be written with ő and not ö.
Therefore we simply add the entry elöl/w, which marks the erroneously
spelled word elöl as wrong.
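To illustrate the idea, here is a minimal Python sketch. It is not aspell code: the names PREFIXES, expand and accepted_words are hypothetical, and the prefix list is shortened to the four prefixes mentioned above. It generates every prefixed form of a stem and then removes the forms that are explicitly forbidden.

```python
# Hypothetical, shortened prefix list; the real dictionary has 20 or more.
PREFIXES = ["meg", "el", "ki", "át"]


def expand(stem, prefixes=PREFIXES):
    """Return the stem itself plus every prefixed form of it."""
    return {stem} | {prefix + stem for prefix in prefixes}


def accepted_words(stems, forbidden):
    """Generate all forms for the stems, then drop the forbidden ones."""
    generated = set()
    for stem in stems:
        generated |= expand(stem)
    return generated - set(forbidden)


# "elöl" is generated from "öl" but removed again by the forbidden entry.
words = accepted_words(["öl"], forbidden=["elöl"])
print(sorted(words))
```

One forbidden entry is enough here; without it, the prefix "el" would have to be split off into a separate flag that is not applied to öl.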
I therefore consider /w a useful flag for languages that have to generate
lots of word forms.
FORBIDDENWORD w
in the affix file signals that the /w flag in this affix file marks a forbidden
word.
The use of /w is even a little broader. If other flags appear together with
/w, all word forms created by those other flags are also considered invalid,
bad words.
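The combined behaviour described above could be sketched as follows in Python. This is only an illustration of the semantics I have in mind, not how aspell or any other checker implements it. The suffix table SUFFIXES, the flags S and D, and the placeholder words foo and bar are all made up for the example.

```python
FORBIDDEN_FLAG = "w"  # as declared by "FORBIDDENWORD w" in the affix file

# Hypothetical affix table: flag -> suffixes that the flag appends.
SUFFIXES = {"S": ["s"], "D": ["ed", "ing"]}


def parse_entry(entry):
    """Split a 'word/flags' entry; flags is '' when there is no slash."""
    word, _, flags = entry.partition("/")
    return word, flags


def forms(word, flags):
    """Return the word plus every form produced by its affix flags."""
    result = {word}
    for flag in flags:
        for suffix in SUFFIXES.get(flag, []):
            result.add(word + suffix)
    return result


def build(entries):
    """Return (good, bad) word sets; forbidden forms override good ones."""
    good, bad = set(), set()
    for entry in entries:
        word, flags = parse_entry(entry)
        target = bad if FORBIDDEN_FLAG in flags else good
        target.update(forms(word, flags))
    return good - bad, bad


# "fooing/w" forbids one generated form; "bar/Sw" forbids bar and bars.
good, bad = build(["foo/SD", "fooing/w", "bar/Sw"])
```

In this sketch the forbidden set always wins over the generated set, so a single /w entry can cancel one unwanted form without splitting the affix flag that produced it.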
There are many other examples of how useful the /w flag is in a word list.
In a lot of cases /w saves additional flags, and flags are a valuable,
limited resource (max 255) in aspell.
How does aspell behave here, and what do you think about the above
mechanism?
Thanks, Eleonora