emacs-devel

Bad ispell.el <-> aspell-0.60 interactions in utf8


From: Agustin Martin
Subject: Bad ispell.el <-> aspell-0.60 interactions in utf8
Date: Wed, 6 Apr 2005 17:25:58 +0200
User-agent: Mutt/1.5.8i

(please cc me replies, I am not subscribed to emacs-devel)

Hi,

Just to let you know about a problem that has been reported to Debian and
that seems to be caused by an undesired interaction between ispell.el and
aspell-0.60 when the environment (really LC_CTYPE) is utf8:

http://bugs.debian.org/299725

In summary, when run in a UTF-8 environment, aspell 0.60 expects utf-8
text and returns utf-8 text, so if latin1 text is piped to it, some
problems appear. This sounds crazy when done from the command line, but
it seems to happen when ispell.el pipes text to aspell as latin1 (because
the corresponding ispell-dictionary-alist entry says the dict is latin1)
while aspell is run in a utf8 environment. E.g., piping the word rôle
(as is, in latin1 encoding) to aspell (as aspell -a -d british-w_accents)
in a latin1 environment gives

@(#) International Ispell Version 3.1.20 (but really Aspell 0.60.3-20050121)
& rôle 35 0: role, Roley, rile, Rolfe, roles, tole, roe, Ole, olé, roll,
rule, prole, Rolf, roué, Cole, Dole, Pole, Rome, Rose, Rowe, Roze, bole,
dole, hole, mole, pole, robe, rode, rope, rose, rote, rove, sole, vole,
role's

but doing the same in a utf8 environment (with the word still in latin1)
returns

@(#) International Ispell Version 3.1.20 (but really Aspell 0.60.3-20050121)
*
& le 73 2: Le, Lea, Lee, Leo, Lew, Lie, lea, lee, lei, lie, El, L, l, LED,
Lek, Lem, Len, Les, Lev, Lr, led, leg, let, lye, E, e, LA, LL, La, Li, Lu,
Ly, la, ll, lo, Ole, ale, olé, LC, LP, Ln, Lt, lb, lg, ls, Be, Ce, DE, De,
Fe, GE, Ge, He, IE, ME, Me, NE, Ne, OE, PE, Re, SE, Se, Te, Xe, be, he, me,
re, we, ye, Le's, L's

This last case seems to be what emacs does through ispell.el, resulting in
an 'Ispell and its process have different charsets' error on ispell-word.
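
To illustrate the mismatch, here is a small Emacs Lisp sketch (not part of
ispell.el; the helper name my-word-bytes is made up for this example) that
lists the bytes of rôle under each coding system. In latin1 the ô is the
single byte 244 (0xF4), which is not a complete utf-8 sequence when it is
followed by "l". That would be consistent with aspell accepting "r" and then
checking "le" at offset 2 in the output above, although I have not verified
how aspell handles the stray byte internally.

```elisp
;; Illustration only: MY-WORD-BYTES is a hypothetical helper, not an
;; ispell.el function.
(defun my-word-bytes (word coding-system)
  "Return the byte values of WORD when encoded in CODING-SYSTEM."
  ;; `encode-coding-string' returns a unibyte string; `append' with nil
  ;; turns it into a list of byte values.
  (append (encode-coding-string word coding-system) nil))

;; "r\u00f4le" is the word rôle, written with an escape so that the
;; example does not depend on the encoding of this message.
(my-word-bytes "r\u00f4le" 'iso-8859-1) ; => (114 244 108 101)
(my-word-bytes "r\u00f4le" 'utf-8)      ; => (114 195 180 108 101)
```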

The fix I am considering is to modify ispell.el so that

--encoding=ispell_dict_encoding

is added to the aspell call (and only to the aspell call), assuming C.J.
Madsen's patch for aspell-learn-from-user-misspellings is applied and
ispell-really-aspell is available (as it is in emacs CVS ispell.el):

diff -urNad dictionaries-common/support/emacsen/ispell.el /tmp/dpep.3PGMwL/dictionaries-common/support/emacsen/ispell.el
--- dictionaries-common/support/emacsen/ispell.el       Sun Apr  3 23:27:46 2005
+++ /tmp/dpep.3PGMwL/dictionaries-common/support/emacsen/ispell.el      Sun Apr  3 23:29:55 2005
@@ -2250,8 +2250,16 @@
              (append args
                      (list "-p"
                            (expand-file-name ispell-personal-dictionary)))))
+    ;; ----- Debian changes
+    (if ispell-really-aspell
+       (setq args 
+             (append args 
+                     (list
+                      (concat "--encoding=" 
+                              (symbol-name (ispell-get-coding-system)))))))
+    ;; ----- End of Debian changes
     (setq args (append args ispell-extra-args))
-
+    
     (if ispell-async-processp
        (let ((process-connection-type ispell-use-ptys-p))
          (apply 'start-process

so we make sure that both ispell.el and aspell use the same encoding.
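
As a rough sketch of the effect on the argument list, the following
standalone snippet mimics what the patched code appends. The function name
my-aspell-encoding-args and its two parameters are hypothetical stand-ins
for ispell-really-aspell and the value of (ispell-get-coding-system); the
"-a" "-d" "british-w_accents" arguments are just the ones from the command
line example above, not necessarily what ispell.el passes.

```elisp
;; Hypothetical helper for illustration; not part of ispell.el.
(defun my-aspell-encoding-args (really-aspell coding-system)
  "Return a one-element list with an --encoding argument, or nil.
REALLY-ASPELL stands in for `ispell-really-aspell' and CODING-SYSTEM
for the symbol returned by `ispell-get-coding-system'."
  (when really-aspell
    (list (concat "--encoding=" (symbol-name coding-system)))))

;; With aspell and a latin1 dictionary the extra argument is appended:
(append '("-a" "-d" "british-w_accents")
        (my-aspell-encoding-args t 'iso-8859-1))
;; => ("-a" "-d" "british-w_accents" "--encoding=iso-8859-1")

;; With plain ispell nothing is added:
(append '("-a" "-d" "british-w_accents")
        (my-aspell-encoding-args nil 'iso-8859-1))
;; => ("-a" "-d" "british-w_accents")
```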

Note that this will work neither with aspell-0.33 (no --encoding option
available) nor with aspell-0.50 (which accepts only the iso8859-1 spelling,
while aspell-0.60 also accepts iso-8859-1). I think all dicts in
ispell.el are supported by the current aspell-0.60 syntax. This problem
does not seem to appear for aspell-0.{33,50} (no utf8 support available).

For a more general ispell.el some additional checking on the aspell version
might be desirable.
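
One possible shape for such a check, as a sketch only: pull the aspell
version out of the banner line shown in the examples above and add the
option only for 0.60 or later. Both function names below are invented for
this example, and the regexp assumes the banner keeps the "(but really
Aspell X.Y.Z...)" form seen here.

```elisp
;; Hypothetical helpers for illustration; not part of ispell.el.
(defun my-aspell-banner-version (banner)
  "Return the Aspell version string found in BANNER, or nil.
BANNER is the first line printed by the spell checker in -a mode."
  (when (string-match
         "(but really Aspell \\([0-9]+\\(?:\\.[0-9]+\\)*\\)" banner)
    (match-string 1 banner)))

(defun my-aspell-encoding-usable-p (banner)
  "Return t if BANNER reports Aspell 0.60 or later, nil otherwise."
  (let ((version (my-aspell-banner-version banner)))
    (and version (version<= "0.60" version) t)))

(my-aspell-encoding-usable-p
 "@(#) International Ispell Version 3.1.20 (but really Aspell 0.60.3-20050121)")
;; => t
```

A real ispell (no "but really Aspell" in the banner) or an older aspell
makes the predicate return nil, so the --encoding argument would simply be
left out in those cases.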

Cheers,

-- 
Agustin


