
Re: Fwd: Re: Inadequate documentation of silly characters on screen.


From: Stefan Monnier
Subject: Re: Fwd: Re: Inadequate documentation of silly characters on screen.
Date: Thu, 19 Nov 2009 18:10:54 -0500
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1.50 (gnu/linux)

> The abstraction is broken.  It is broken because it isn't abstract - its
> users have to think about the way characters are represented.  In an
> effective abstraction, a user could just write "ñ" or ?ñ and rely on the
> underlying mechanisms to work.

> Instead of the abstraction "string", we have two grossly inferior
> abstractions, "unibyte string" and "multibyte string".

No: the abstraction "multibyte string" is exactly what you call "a string";
the two are identical.  The only problem is that there's one tiny but
significant unsupported spot: when you write a string constant you may
think it's a multibyte string, but Emacs may disagree.

The abstraction "unibyte string" is what you might call "a byte array".
It doesn't have much to do with your idea of a string.
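
A quick way to see which kind of string you actually have is
`multibyte-string-p'.  The snippet below is only a sketch of that check;
the literal "\n" is the example from this thread, and the comments describe
what I would expect, not a guarantee for every Emacs version:

```elisp
;; An ASCII-only string constant is read as a unibyte string,
;; i.e. what is called "a byte array" above.
(multibyte-string-p "\n")                        ; expected: nil

;; `string-to-multibyte' returns a multibyte string with the same
;; contents, i.e. "a string" in the everyday sense.
(multibyte-string-p (string-to-multibyte "\n"))  ; expected: t
```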

> Please suggest to me the correct elisp to "replace the zeroth character
> of an existing string with Spanish n-twiddle".

For a unibyte string, it's impossible since "Spanish n-twiddle" is not
a byte.  For multibyte strings, `aset' will work dandy (tho
inefficiently of course because we're talking about a string, not an
array).
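
As a minimal sketch of the multibyte case: the variable name `nl' is taken
from the example quoted below, and the contents "\nabc" are made up purely
for illustration.  The point is only that `aset' is applied to a string
that is known to be multibyte:

```elisp
;; `string-to-multibyte' returns a fresh multibyte string when given a
;; unibyte one, so the string constant itself is not modified.
(let ((nl (string-to-multibyte "\nabc")))
  ;; Replace the zeroth character.  The new character needs more bytes
  ;; than the old one, which is why this can be slow on long strings.
  (aset nl 0 ?ñ)
  nl)
```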

> If this is impossible to write, or it's grossly larger than the buggy
> "(aset nl 0 ?ñ)", that's a demonstration of the breakage.

Except the breakage is elsewhere: you expect `nl' to be a multibyte
string (i.e. "a string" in your mind), whereas Emacs tricked you earlier
and `nl' is really a byte array.

> Why is it necessary to distinguish between 'A' and 65?

Usually it's not, because in almost all coding systems the character
A is represented by the byte 65.

>> No, I don't agree.  If you want to get a human-readable text string,
>> don't use aset; use string operations instead.
> There aren't any.

Of course there are: substring+concat.
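
A sketch of what that could look like; the helper name
`my-replace-first-char' is hypothetical, not an existing Emacs function:

```elisp
(defun my-replace-first-char (str ch)
  "Return a new string like STR but with its first character replaced by CH.
STR itself is left untouched."
  ;; `string' builds a one-character string from CH; `substring' takes
  ;; everything after the first character; `concat' joins the two.
  (concat (string ch) (substring str 1)))

(my-replace-first-char "\nabc" ?ñ)
```

Since this builds a new string rather than modifying the old one, it does
not matter whether the argument was unibyte or multibyte to begin with.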

> I don't imagine anybody here would hold that the current state of strings
> is ideal.  I'm still trying to piece together what the essence of the
> problem is.

The essence is that "\n" is not what you think of as a string: it's
a byte array instead.  And until now Emacs managed to do enough magic to
trick you into thinking that it's just like a string.


        Stefan



