emacs-devel

Re: Unibyte characters, strings and buffers


From: Eli Zaretskii
Subject: Re: Unibyte characters, strings and buffers
Date: Fri, 28 Mar 2014 11:18:02 +0300

(I retitled the subject, because the unibyte issue is sufficiently
different from what I originally raised.)

> Date: Fri, 28 Mar 2014 00:46:01 -0700
> From: Paul Eggert <address@hidden>
> CC: address@hidden
> 
> Eli Zaretskii wrote:
> > How to compare bytes, then?
> 
> It depends on what kind of comparison one wants.  Simplest is to use 
> '='.  To ignore case and treat bytes 128-255 as Latin-1 characters, use 
> 'downcase' first.  To ignore case and treat bytes 128-255 as 
> uninterpreted bit patterns, use 'unibyte-char-to-multibyte' before 
> downcasing.  Etc.
> 
> > we don't have a way of distinguishing between characters and
> > bytes, unless we look on something besides the arguments themselves.
> 
> Yes, that's right.

Which is why your suggestions above will not necessarily DTRT.
Arbitrary interpretation of bytes 128-255 as Latin-1 is not guaranteed
to be correct, and therefore 'downcase' will sometimes produce
unexpected results, unless we can make sure, somehow, that raw bytes
will never be exposed to Lisp as having these values.  Unless you show
a practical way towards the latter goal, what you suggest will just
replace one set of subtly buggy behaviors with another (in which case
I vote for what we already have, because that one is at least well
known and has passed some test of time).
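
To make the concern concrete, here is a minimal Emacs Lisp sketch of the
three comparisons suggested in the quoted text, applied to the single
byte value #xC0.  The variable name `my-byte' is hypothetical and used
only for illustration; the functions are the ones named above.

```elisp
;; Illustrative sketch only; `my-byte' is a hypothetical name.
(let ((my-byte #xC0))
  (list
   ;; Plain numeric comparison: the byte is not interpreted at all.
   (= my-byte #xC0)                                  ; t
   ;; `downcase' treats 192 as the character U+00C0 (Latin-1 capital
   ;; A with grave) and returns 224, i.e. U+00E0.
   (downcase my-byte)                                ; 224
   ;; Converting first yields the internal raw-byte character
   ;; #x3FFFC0, which has no case mapping, so `downcase' returns it
   ;; unchanged.
   (downcase (unibyte-char-to-multibyte my-byte))))  ; 4194240
```

The integer 192 alone does not say which of these readings is the right
one.  If the byte came from KOI8-R text, for example, #xC0 is the
lowercase Cyrillic letter yu and #xE0 is its uppercase form, so the
Latin-1 'downcase' would silently turn a lowercase letter into an
uppercase one.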


