bug#4240: 23.1.50; C-u doesn't work with Swedish characters
From: Eli Zaretskii
Subject: bug#4240: 23.1.50; C-u doesn't work with Swedish characters
Date: Wed, 26 Aug 2009 20:08:40 +0300

Ping!
> Date: Sun, 23 Aug 2009 23:40:00 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: deniz.a.m.dogan@gmail.com
>
> > From: Juri Linkov <juri@jurta.org>
> > Date: Sun, 23 Aug 2009 21:54:04 +0300
> > Cc: 4240@emacsbugs.donarmstrong.com
> >
> > > I hit "C-u ä" expecting it to come out as "ääää". Instead it comes out
> > > as "ä\344\344ä". I try "C-u C-u ä" and it comes out as "ä" followed by
> > > fourteen "\344" and then a trailing "ä". This happens no matter which
> > > kind of repetition I'm doing, be it using C-u or using e.g. M-3. It's
> > > always the leading and the trailing character that come out right, all
> > > of the other ones are "broken".
> >
> > Please see bug#4037:
> > http://emacsbugs.donarmstrong.com/cgi-bin/bugreport.cgi?bug=4037
> >
> > I received no confirmation that my proposed fix is correct.
>
> I think those two lines are not necessary anymore and should be
> removed (together with the comments that explain why they were
> needed). I think they belong to the old pre-Unicode days, when raw
> eight-bit characters needed such special treatment.
>
> Handa-san, can you please comment on that?
>
> > Maybe the right fix is to reverse negation?
>
> Why, do you see that the code without these two lines doesn't DTRT
> when the characters are inserted into a unibyte buffer? If it works
> in both cases, that is evidence that I'm right and this code is not
> needed anymore.
>
> > It seems logical to check if a buffer is unibyte before converting
> > from unibyte to multibyte, but I don't understand what this code was
> > supposed to do.
>
> It was supposed to produce a multibyte character from a unibyte one,
> by using a special locale-dependent table that mapped, e.g., 8859-1
> encoded Latin-1 characters in the range [128..255] to the
> corresponding multibyte codepoints of Latin-1 characters in the
> internal representation of characters Emacs 22 used. See the Emacs 22
> definition of unibyte_char_to_multibyte in src/charset.c.
>
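To make the old behavior concrete, here is a rough sketch of a
locale-dependent unibyte-to-multibyte lookup. It is not the Emacs 22
code: the function name is hypothetical, Python's codecs stand in for
the translation table, and Unicode codepoints stand in for the Emacs 22
internal character codes.

```python
def unibyte_to_multibyte(byte: int, charset: str = "latin-1") -> str:
    """Hypothetical stand-in for a locale-dependent table lookup.

    ASCII bytes map to themselves; bytes in [128..255] are looked up
    in the table for the current locale's charset (here, a Python codec).
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("not a single byte: %r" % byte)
    if byte < 0x80:
        return chr(byte)
    return bytes([byte]).decode(charset)


# The same byte (octal 344) means different characters in different locales.
latin1_char = unibyte_to_multibyte(0o344)                 # Latin-1 locale
cyrillic_char = unibyte_to_multibyte(0o344, "iso8859_5")  # Cyrillic locale
print(latin1_char, cyrillic_char)
```

The point is only that the result depended on the locale: the same byte
gave a Latin-1 letter in one environment and a Cyrillic letter in
another.
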
> Nowadays we don't need that, since we have a special range of
> multibyte codepoints for representing unibyte characters in multibyte
> buffers and strings, and insert-char and the primitives it calls
> already DTRT with them. So there should be no need to do anything
> special outside insert-char.
>
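For anyone following along, here is a small sketch of the "special
range" idea. As far as I recall, Emacs 23 represents a raw byte in
[128..255] as the character code 0x3FFF00 + byte (compare BYTE8_TO_CHAR
in src/character.h). The Python names below are mine, and the last few
lines are only one plausible reading of the reported symptom, not a
trace of the actual code path.

```python
RAW_BYTE_OFFSET = 0x3FFF00  # assumed offset of the raw eight-bit range


def byte8_to_char(byte: int) -> int:
    """Map a raw byte in [128..255] to its multibyte character code."""
    if not 0x80 <= byte <= 0xFF:
        raise ValueError("not an eight-bit byte: %r" % byte)
    return byte + RAW_BYTE_OFFSET


def char_byte8_p(c: int) -> bool:
    """True if character code C lies in the raw eight-bit range."""
    return 0x3FFF80 <= c <= 0x3FFFFF


def display(c: int) -> str:
    """Show raw bytes as octal escapes, everything else as the character."""
    if char_byte8_p(c):
        return "\\%03o" % (c - RAW_BYTE_OFFSET)
    return chr(c)


a_umlaut = 0xE4                 # the real character, U+00E4
raw = byte8_to_char(0o344)      # the raw byte with the same low 8 bits

# "C-u ä" as reported: correct first and last, raw bytes in between.
reported = "".join(display(c) for c in [a_umlaut, raw, raw, a_umlaut])
print(reported)
```

A raw byte and the Latin-1 character with the same low eight bits are
different characters in this scheme, which would explain why the
"broken" copies show up as \344 rather than as the letter.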
bug#4240: marked as done (23.1.50; C-u doesn't work with Swedish characters), Emacs bug Tracking System, 2009/08/28