Re: [Unicode-2] `read' always returns multibyte symbol

emacs-devel

From:	Katsumi Yamaoka
Subject:	Re: [Unicode-2] `read' always returns multibyte symbol
Date:	Fri, 16 Nov 2007 08:31:49 +0900
User-agent:	Gnus/5.110007 (No Gnus v0.7) Emacs/23.0.60 (gnu/linux)

>>>>> Kenichi Handa <address@hidden> wrote:

> In article <address@hidden>,
>       Katsumi Yamaoka <address@hidden> writes:

>> What I observed was different.

> That is exactly what string-as-multibyte does. \206\343 and
> \202\271 are valid multibyte forms in the current Emacs,
> thus are treated as multibyte characters.

Now I understand why such readable characters suddenly appeared.

[...]

> Please try this:

> (string-make-unibyte
>  (string-as-multibyte "\343\203\206\343\202\271\343\203\210"))

> You'll get the above result, ... yes, very weird.

Oh, that surprised me a little.  But I often run into this kind
of thing while playing with unibyte and multibyte strings, and it
always confuses me.

> On the other hand,

> (string-as-unibyte
>  (string-as-multibyte "\343\203\206\343\202\271\343\203\210"))
>  =>  "\343\203\206\343\202\271\343\203\210"

>>> I long ago proposed a facility that turns on the
>>> multibyteness of a buffer while converting 8-bit bytes to
>>> multibyte characters as what string-to-multibyte does, but
>>> not accepted.

>> But the modern Emacsen does do so, doesn't it?

> No.

Oops.  I had misunderstood the reason why Emacs 22 and 23 don't
break 8-bit data while it is being fed into a multibyte buffer
from a network process whose process coding system is binary.
So, for the present, maybe the best way is still to use a unibyte
buffer for unibyte data and a multibyte buffer for multibyte
data, and to use a string, not a buffer, to encode and decode
data if the multibyteness of the data will change, like:

(insert (prog1
            (decode-coding-string (buffer-string) 'coding)
          (erase-buffer)
          (set-buffer-multibyte t)))
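
The same pattern can be sketched as a small function.  This is only
an illustration: the name `my-decode-unibyte-buffer' and the choice
of `utf-8' as the coding system are assumptions, not something taken
from the message above.  The point it shows is that the decoding
happens on a string, and the buffer's multibyteness is switched only
while the buffer is empty.

```elisp
;; Hypothetical helper (the name is an assumption for illustration).
(defun my-decode-unibyte-buffer (coding-system)
  "Decode the current buffer's raw bytes with CODING-SYSTEM.
The buffer is emptied, switched to multibyte while empty, and then
refilled with the decoded text."
  ;; Decode as a string first, so no buffer contents ever change
  ;; representation in place.
  (let ((decoded (decode-coding-string (buffer-string) coding-system)))
    (erase-buffer)
    (set-buffer-multibyte t)
    (insert decoded)))

;; Usage: start from a unibyte buffer holding UTF-8 encoded bytes.
(with-temp-buffer
  (set-buffer-multibyte nil)
  (insert "\343\203\206\343\202\271\343\203\210")
  (my-decode-unibyte-buffer 'utf-8)   ; `utf-8' is an assumed example
  (buffer-string))
```

Because the buffer is empty when `set-buffer-multibyte' is called,
there are no bytes left in it to be reinterpreted as multibyte forms.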
