emacs-devel

Re: Ispell and unibyte characters


From: Eli Zaretskii
Subject: Re: Ispell and unibyte characters
Date: Fri, 13 Apr 2012 18:53:57 +0300

> Date: Fri, 13 Apr 2012 17:25:25 +0200
> From: Agustin Martin <address@hidden>
> 
> > I don't understand what you are trying to accomplish by encoding
> > OTHERCHARS in UTF-8.  What exactly is the problem with them being
> > encoded in some 8-bit encoding?  Please explain.
> 
> Imagine a fake entry in the general list, either in ispell.el or provided
> through `ispell-base-dicts-override-alist' (no accented chars for simplicity)
> 
> ("catala8"
>      "[A-Za-z]" "[^A-Za-z]" "['\267-]" nil ("-B" "-d" "catalan") nil
>      iso-8859-1)
> 
> Unless emacs knows the encoding for \267 (middledot "·") it cannot decode it
> properly. I prefer to not use UTF-8 here, because I want the entry to also be
> useful for ispell (and also be XEmacs compatible). The best approach here
> seems to be to decode the otherchars regexp according to the provided
> coding-system.
> 
> I have noticed that there seems to be no need to encode the resulting string
> in UTF-8, Emacs will know what to do with the decoded string.
> 
> I tested something like
> 
>  (dolist (adict ispell-dictionary-alist)
>           (add-to-list 'tmp-dicts-alist
>                        (list
>                         (nth 0 adict)  ; dict name
>                         "[[:alpha:]]"  ; casechars
>                         "[^[:alpha:]]" ; not-casechars
>                         (if ispell-encoding8-command
>                             ;; Decode 8bit otherchars if needed
>                             (decode-coding-string (nth 3 adict) (nth 7 adict))
>                           (nth 3 adict)) ; otherchars
>                         (nth 4 adict)  ; many-otherchars-p
>                         (nth 5 adict)  ; ispell-args
>                         (nth 6 adict)  ; extended-character-mode
>                         (if ispell-encoding8-command
>                             'utf-8
>                           (nth 7 adict)))))
> 
> and seems to work well.
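The decoding step in the quoted snippet can be tried in isolation. What
follows is only a sketch: `my-decode-otherchars' and `my-catala8-entry'
are hypothetical names, not part of ispell.el, and the entry is the fake
"catala8" spec quoted above.

```elisp
;; Hypothetical helper for illustration; not part of ispell.el.
(defun my-decode-otherchars (entry)
  "Return ENTRY's OTHERCHARS decoded with ENTRY's coding system.
ENTRY is laid out like an `ispell-dictionary-alist' element:
element 3 is OTHERCHARS and element 7 is the coding system."
  (let ((otherchars (nth 3 entry))
        (coding (nth 7 entry)))
    ;; Only unibyte strings need decoding; a multibyte string is
    ;; assumed to be decoded already.
    (if (and coding (not (multibyte-string-p otherchars)))
        (decode-coding-string otherchars coding)
      otherchars)))

;; The fake entry from the quoted message.  The octal escape \267
;; makes OTHERCHARS a unibyte string holding the raw byte 0xB7.
(defvar my-catala8-entry
  '("catala8" "[A-Za-z]" "[^A-Za-z]" "['\267-]" nil
    ("-B" "-d" "catalan") nil iso-8859-1)
  "Hypothetical dictionary entry used only for this sketch.")
```

Evaluating (my-decode-otherchars my-catala8-entry) should give a
multibyte string whose third character is U+00B7 (middle dot), which
Emacs can then use in a regexp without any further encoding step.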

So you are taking the Catalan dictionary spec written for Ispell and
converting it to a spec that could be used to support more characters by
using UTF-8, is that right?  If so, I find this a bit kludgey.  How
about having a completely separate spec instead?  More generally, why
not separate ispell-dictionary-alist into 2 alists, one to be used
with Ispell, the other to be used with aspell and hunspell?  I think
this would be cleaner, don't you agree?
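
To make the suggestion concrete, a rough sketch of such a split might
look like this.  The variable and function names are invented for
illustration and do not exist in ispell.el; the UTF-8 entry simply
reuses the arguments from the fake entry quoted above.

```elisp
;; Hypothetical alist for Ispell: 8-bit specs in the dictionary's
;; own encoding.
(defvar my-ispell-only-dicts
  '(("catala8" "[A-Za-z]" "[^A-Za-z]" "['\267-]" nil
     ("-B" "-d" "catalan") nil iso-8859-1))
  "Sketch of dictionary specs meant for Ispell only.")

;; Hypothetical alist for aspell and hunspell: character classes
;; plus UTF-8, so no per-dictionary decoding is needed.
(defvar my-utf8-speller-dicts
  '(("catala8" "[[:alpha:]]" "[^[:alpha:]]" "['\u00b7-]" nil
     ("-B" "-d" "catalan") nil utf-8))
  "Sketch of dictionary specs meant for aspell and hunspell.")

(defun my-dicts-for-speller (speller)
  "Return the dictionary alist to use for SPELLER.
SPELLER is one of the symbols `ispell', `aspell' or `hunspell'."
  (if (eq speller 'ispell)
      my-ispell-only-dicts
    my-utf8-speller-dicts))
```

With something along these lines, the Ispell specs stay exactly as
written, and nothing has to be rewritten on the fly when the speller
turns out to be aspell or hunspell.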



