From: Stefan Monnier
Subject: Re: setenv -> locale-coding-system cannot handle ASCII?!
Date: Tue, 25 Feb 2003 21:52:18 -0500

> >> (if (multibyte-string-p variable)
> >>     (setq variable (encode-coding-string variable
> >>                                          locale-coding-system)))
> >>
> >> multibyte-string-p is mandatory because encode-coding-string
> >> will change the byte-sequence of `variable' even if it is
> >> unibyte.
> >> Ex. (encode-coding-string "\201\300" 'iso-latin-1) => "\300"
>
> > I find this behavior annoying because it makes the emacs-mule
> > encoding appear in a situation where it is not mentioned.
> > I wish that
>
> > (encode-coding-string "\201\300" 'iso-latin-1)
> > and
> > (encode-coding-string (string-to-multibyte "\201\300") 'iso-latin-1)
>
> > returned the same value.
>
> Why? As I wrote before, what the bytes of a unibyte string
> mean depends on the context.

I consider this context-dependent meaning of unibyte strings
to be a problem. I understand why text in a unibyte buffer
has such an ambiguous meaning and agree that it's difficult
to avoid, but it's not a reason to carry over this difficulty
to strings where it is not needed.

> In the former case, as it is given to encode-coding-string,
> it is a multibyte form by which emacs represents
> character(s), not a sequence of characters representing raw
> bytes.

The problem is that the multibyteness of a string is not
always easy to guess or control. For example, what is the
multibyteness of

    (concat "\201" (format "%s" "hello"))

and

    (concat "\201" (format "%s" 1))
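
One way to check, as a minimal sketch: `multibyte-string-p' reports
the answer directly. The results are deliberately not shown here,
since they may depend on the Emacs version, which is part of the
difficulty.

```elisp
;; Evaluate each form, e.g. with C-x C-e or M-:, to see whether the
;; concatenated string ended up unibyte (nil) or multibyte (t).
(multibyte-string-p (concat "\201" (format "%s" "hello")))
(multibyte-string-p (concat "\201" (format "%s" 1)))
```
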
> In the latter case, as it is given to string-to-multibyte,
> it should be regarded as a sequence of characters representing
> raw bytes, thus the result of (string-to-multibyte
> "\201\300") is still a sequence of raw-bytes. Encoding
> raw-bytes should yield the same raw-bytes.

Indeed, that's what both `setenv' and I would want.

> And, this behaviour of encode-coding-string on a unibyte
> string is a natural consequence of encode-coding-region in a
> unibyte buffer.

As mentioned above, I understand why it works that way in buffers,
but I don't think it has to work the same way for strings.

        Stefan