Re: [aspell-devel] remove from word lists
From: Jose Da Silva
Subject: Re: [aspell-devel] remove from word lists
Date: Sat, 19 Feb 2005 12:58:19 -0800
User-agent: KMail/1.6.1
On Wednesday 16 February 2005 02:48 pm, Kevin Atkinson wrote:
> On Wed, 16 Feb 2005, Anton Leuski wrote:
> > I guess it's not possible to remove words from the personal and session
> > word lists, right? When I try the remove method on a personal word
> > list from a Speller instance it comes back with
> >
> > The method "remove" is unimplemented in "WritableDict"
> >
> > Or am I missing something? Can you give me any advice on how to implement
> > the remove method? Or (even better :-)) when is it going to be implemented
> > in the main code base?
>
> The problem is that when Aspell "saves" a personal word list it doesn't
> really "save" it. Instead it merges the in-memory word list with the one
> saved to disk. That is, before saving it rereads the on-disk word list and
> then adds any new words found to the in-memory word list. I do this to
> avoid the problem of multiple Aspell processes, running at the same
> time, clobbering each other's changes. This means that deleting a word
> from an in-memory word list will have no effect if the word is also in the
> on-disk word list. A truly correct solution to this problem will be
> rather complicated. I am willing to accept a simpler, yet not 100%
> correct, solution but I have not got around to implementing it.
After reading through this, what's happening within Aspell makes more sense.
Looking at Aspell itself, though, it does appear like one huge ball of tangled
yarn, so it is fairly difficult to find a place to start without unravelling
a bunch of other items.
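
To make the merge-on-save behaviour Kevin describes concrete, here is a
minimal sketch in Python. It is only a toy model, not Aspell's actual code:
the function name merge_on_save and the sample words are made up for
illustration, and the word lists are plain sets.

```python
# Hypothetical model of merge-on-save; not Aspell's actual code.

def merge_on_save(in_memory, on_disk):
    """Return the word set written back to disk.

    The on-disk words are reread and merged with the in-memory words,
    so a save never drops a word that is already on disk.
    """
    return set(on_disk) | set(in_memory)


# The on-disk personal list as this process loaded it.
on_disk = {"aspell", "kmail", "teh"}
in_memory = set(on_disk)

# Another process adds a word and saves while we are still running.
on_disk = merge_on_save({"aspell", "kmail", "teh", "ispell"}, on_disk)

# We "remove" a misspelling from our in-memory list and add a new word.
in_memory.discard("teh")
in_memory.add("leuski")

# Our save keeps the other process's word, but "teh" comes straight back.
saved = merge_on_save(in_memory, on_disk)
print(sorted(saved))
```

The merge protects the other process's addition, which is the point of the
design, but it cannot tell "this word was deleted here" from "this word was
never loaded here", so the deleted word reappears.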
Word hashing seems to have advantages, such as fewer words to search through
and probably a smaller memory footprint, but it would seem worthwhile to
have word "ownership" thrown into the hash so you know which word came from
where. For example, a family of users all using the speller at the same time
but with different requirements (e.g. German, English, French, etc.), or
perhaps a library type of setting with multi-head terminals where one user
decides to use some foreign language while the other X users do not. Or
perhaps one user running multiple applications, say a word processor, while
at the same time spell-checking some stuff in another language. Or, as Anton
Leuski asks, how can you delete a word?
Just throwing some thoughts around.... so if there are better suggestions,
you're welcome to reply.
Adding word ownership to each word would probably help track which word came
from where, so you could track languages or be able to delete words, but
adding flags for each word would probably increase the memory footprint by a
substantial amount versus having just 1 flag to mark 1 file.
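
As a rough illustration of the per-word ownership idea, here is a small
Python sketch. The class OwnedWordTable and the owner tags ("de",
"personal:anton") are hypothetical names, not anything that exists in
Aspell. Each word carries its own set of owner tags, which is exactly the
per-word overhead mentioned above.

```python
from collections import defaultdict


class OwnedWordTable:
    """Toy word table where every word remembers which lists own it."""

    def __init__(self):
        # word -> set of owner tags (this per-word set is the memory cost)
        self._owners = defaultdict(set)

    def add(self, word, owner):
        self._owners[word].add(owner)

    def remove(self, word, owner):
        """Drop one owner's claim; the word goes away with its last owner."""
        owners = self._owners.get(word)
        if owners is None:
            return
        owners.discard(owner)
        if not owners:
            del self._owners[word]

    def contains(self, word):
        return word in self._owners

    def owners_of(self, word):
        return frozenset(self._owners.get(word, ()))


table = OwnedWordTable()
table.add("Haus", "de")
table.add("Haus", "personal:anton")
table.add("Leuski", "personal:anton")

# Deleting from the personal list only removes that owner's claim.
table.remove("Haus", "personal:anton")
table.remove("Leuski", "personal:anton")
```

After the two removals "Haus" is still accepted because the German list
still owns it, while "Leuski" is gone because the personal list was its
only owner.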
Suppose there is a "primary" dictionary kept somewhat intact... at least you
know who owns that. Then you hash a 2nd personal dictionary against the
primary but keep it in its own thread (so you know "word ownership" there
too). This way you keep the memory footprint down and reduce the wordcount
too. Multi-languages could probably follow something along this
idea.... at least you won't have to reset aspell as "harshly" if you are
switching languages or running multi-languages all at the same time.
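
A sketch of that layering, again in Python and again with made-up names
(LayeredSpeller, switch_primary), assuming the primary dictionary is
read-only and the personal list only stores words the primary does not
already have. Ownership is then known per layer, with no per-word flags.

```python
class LayeredSpeller:
    """Toy model: a read-only primary dictionary plus a personal overlay."""

    def __init__(self, primary_words):
        self.primary = frozenset(primary_words)  # owned by the main dictionary
        self.personal = set()                    # owned by the user

    def add_personal(self, word):
        """Store the word only if the primary does not already have it."""
        if word in self.primary:
            return False
        self.personal.add(word)
        return True

    def remove_personal(self, word):
        """Only words owned by the personal layer can be deleted."""
        if word in self.personal:
            self.personal.remove(word)
            return True
        return False

    def check(self, word):
        return word in self.primary or word in self.personal

    def switch_primary(self, primary_words):
        """Swap languages without touching the personal layer."""
        self.primary = frozenset(primary_words)


speller = LayeredSpeller({"house", "word"})
added_dup = speller.add_personal("house")      # already in primary, not stored
added_new = speller.add_personal("Leuski")     # stored in the personal layer
removed_primary = speller.remove_personal("house")  # not ours to delete
speller.switch_primary({"Haus", "Wort"})
```

Because the personal layer never duplicates primary words, the word count
stays down, and switching the primary leaves the personal words in place.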
...basically, the ideas here are thinking in terms of introducing
multi-threading.