emacs-devel
Re: Unibyte characters, strings, and buffers


From: Andreas Schwab
Subject: Re: Unibyte characters, strings, and buffers
Date: Fri, 28 Mar 2014 12:22:18 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

David Kastrup <address@hidden> writes:

> "Stephen J. Turnbull" <address@hidden> writes:
>
>> I agree that having a way to represent "undecodable bytes" in a string
>> or buffer is extremely convenient.  XEmacs's lack of this capability
>> is surely a deficiency (Hi, David K!)
>
> Doing this in an utf-8 based internal coding is somewhat doable by
> employing non-utf-8 sequences.  Either using code points above the
> Unicode code range (2^20 + something, requiring 4 bytes), or by using
> non-minimal encodings (since the minimal ones are two bytes, requiring 3
> bytes).  Either way, the size increases significantly.

Emacs uses the character codes 0x3FFF80 through 0x3FFFFF for raw 8-bit
bytes (0x80 through 0xFF); internally each one is represented by 2 bytes.
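
To make the 2-byte claim concrete, here is a minimal sketch in Python of
how such a mapping can work. It assumes a raw byte b maps to the character
code b + 0x3FFF00 and is stored as a non-minimal two-byte sequence with a
lead byte of 0xC0 or 0xC1 (which valid UTF-8 never uses); I believe this
matches Emacs's character.h, but treat the details as an assumption. The
function names are hypothetical and not part of Emacs.

```python
# Sketch only: models the raw-byte representation described above.
# Assumption: raw byte b (0x80..0xFF) <-> character code b + 0x3FFF00.
RAW_BYTE_OFFSET = 0x3FFF00


def raw_byte_to_char(b: int) -> int:
    """Map a raw byte 0x80..0xFF to its character code."""
    if not 0x80 <= b <= 0xFF:
        raise ValueError("only bytes 0x80..0xFF are raw 8-bit bytes")
    return b + RAW_BYTE_OFFSET


def raw_byte_to_internal(b: int) -> bytes:
    """Encode a raw byte as a two-byte sequence.

    The lead byte is 0xC0 or 0xC1, carrying bit 6 of the raw byte;
    the trail byte carries the low six bits, like a UTF-8 continuation.
    """
    if not 0x80 <= b <= 0xFF:
        raise ValueError("only bytes 0x80..0xFF are raw 8-bit bytes")
    return bytes([0xC0 | ((b >> 6) & 0x01), 0x80 | (b & 0x3F)])


def internal_to_raw_byte(seq: bytes) -> int:
    """Decode a two-byte sequence back to the raw byte."""
    if len(seq) != 2:
        raise ValueError("expected exactly two bytes")
    lead, trail = seq
    if lead not in (0xC0, 0xC1) or not 0x80 <= trail <= 0xBF:
        raise ValueError("not a raw-byte sequence")
    # Bit 7 is always set for a raw 8-bit byte, so it need not be stored.
    return 0x80 | ((lead & 0x01) << 6) | (trail & 0x3F)
```

Under these assumptions all 128 raw bytes fit in two bytes each, because
the lead bytes 0xC0 and 0xC1 only ever begin non-minimal encodings and so
cannot collide with any valid character.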

Andreas.

-- 
Andreas Schwab, address@hidden
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


