aspell-user
[aspell] Re: Aspell & international support


From: Kevin Atkinson
Subject: [aspell] Re: Aspell & international support
Date: Sat, 27 Feb 1999 14:24:39 -0500

"Jean Christophe ANDRÉ" wrote:
> 
> Hi!
> 
> In the page at http://metalab.unc.edu/kevina/aspell/international/
> you wrote : "The reason for this is that almost all languages which
> do not fit within an 8-bit character set can not be spell checked in
> the traditional fashion. When I expand Aspell to support spell
> checking these languages I will also expand Aspell to work with wide
> characters. However for right now this brings an extra level of
> complexity that I don't want to deal with. If any of my assumptions
> are wrong please let me know."
> 
> I would like to comment on this: you could support Unicode right now
> without a lot of code rewriting (assuming your code is clean, but I bet
> it is, isn't it? ;)). Just use int (short is enough, in fact)
> instead of char to store characters. Your spelling functions should
> not have to change because of this modification. On the other hand,
> you will be ready to accept Unicode input by transcoding it from
> its original charset to Unicode.
> 
> If you want more information about Unicode, feel free to ask me.
> Best regards, Progfou.

I do know about Unicode.  Yes, I can fairly easily convert my characters to
32-bit ints, as my code is clean.  However, how should I then store the
word lists in memory?  As strings of ints?  That uses 4 times as much
memory as chars would, and for languages that fit within an 8-bit
character set that is, in my view, a gross waste of memory.
So the solution is to work with the characters as 32-bit ints and then
convert them to a shorter representation when storing them in memory.
That conversion can lead to inefficiency.  I could also use short ints;
however, that may not be good enough to hold all of future versions of
Unicode, and it has the same problems.

These were the issues I was talking about.
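
To make the memory trade-off concrete, here is a minimal sketch in Python (not Aspell code). The word list and the function names are hypothetical and exist only for illustration. It assumes the language fits in ISO-8859-1, and it compares storing each character as a 32-bit unit with storing it as one byte and widening to integer code points only when a word is processed.

```python
# Hypothetical word list; every name here is illustrative, not part of Aspell.
WORDS = ["café", "naïve", "über", "spell"]


def store_wide(words):
    """Store each character as a 32-bit unit (UTF-32, no byte-order mark)."""
    return [w.encode("utf-32-le") for w in words]


def store_narrow(words):
    """Store each character as one byte, assuming an 8-bit charset (ISO-8859-1)."""
    return [w.encode("latin-1") for w in words]


def to_code_points(stored):
    """Widen a compactly stored word to integer code points for processing."""
    return [ord(ch) for ch in stored.decode("latin-1")]


wide_bytes = sum(len(b) for b in store_wide(WORDS))
narrow_bytes = sum(len(b) for b in store_narrow(WORDS))

print("32-bit storage:", wide_bytes, "bytes")
print(" 8-bit storage:", narrow_bytes, "bytes")
print("code points for the first word:", to_code_points(store_narrow(WORDS)[0]))
```

The compact form is a quarter of the size, but every lookup then pays for the widening step in to_code_points, which is the inefficiency mentioned above.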


> --
> mailto:address@hidden               address@hidden
> http://www.etu.info.unicaen.fr/~jcandre/          Linux & Network specialist
> 
>  "UNIX _IS_ user friendly.  It's just selective about who its friends are."

-- 
Kevin Atkinson
address@hidden
http://metalab.unc.edu/kevina/

---
Note: This message was originally posted to address@hidden,
      not address@hidden
