emacs-devel

Re: Unibyte characters, strings and buffers


From: Paul Eggert
Subject: Re: Unibyte characters, strings and buffers
Date: Fri, 28 Mar 2014 12:21:04 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0


>> Code that blithely passes bytes in the range 128-255 to char-equal is
>> *already* buggy.
>
> There's nothing wrong with those bytes, certainly not when they stand
> for Latin-1 characters.

Sure, and if they stand for Latin-1 characters the proposed change will do the right thing.

> How is it a win, when it actually _adds_ bugs? E.g., under your proposal, (char-equal 192 224) will yield non-nil when case-fold-search is non-nil.

That's not a bug, since À and à are the same character, ignoring case.
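
For concreteness, here is a minimal sketch of the call in question, evaluated once in a multibyte buffer and once in a unibyte one. It uses only standard functions (with-temp-buffer, set-buffer-multibyte, char-equal); the helper name my-char-equal-in is hypothetical and exists only for this illustration. As discussed in this thread, the two results may differ today, whereas under the proposed change both calls would compare 192 and 224 as À and à.

```elisp
;; 192 is À (U+00C0) and 224 is à (U+00E0) when read as Latin-1.
(defun my-char-equal-in (multibyte c1 c2)
  "Compare C1 and C2 with `char-equal' in a temporary buffer.
The buffer is multibyte if MULTIBYTE is non-nil, unibyte otherwise.
Hypothetical helper, for illustration only."
  (with-temp-buffer
    (set-buffer-multibyte multibyte)
    ;; Bind inside the buffer, since case-fold-search is per-buffer.
    (let ((case-fold-search t))
      (char-equal c1 c2))))

(list (my-char-equal-in t 192 224)     ; multibyte buffer
      (my-char-equal-in nil 192 224))  ; unibyte buffer
```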

As I understand it, the scenario you're worried about is that someone is visiting a unibyte buffer and is doing a case-folded search involving non-ASCII bytes and doesn't want these bytes to match their Latin-1 case-folded counterparts. This scenario is not common enough to worry about. Changing the behavior for this rare case is a cost, I suppose, but it's outweighed by the benefit of simplifying char-equal and fixing its semantics to be a bit saner.

>> Plus, the change is simpler and easier to explain than what we have now,
>> and that is a long-term win.
>
> I don't see how it is simpler or easier to explain.  It replaces one
> lopsided interpretation of 128-255 values with another.

It's simpler because it decouples the rules for char-equal from the question of whether the current buffer is multibyte. Separation of concerns is a win.

> I suggested a solution: ignore case-fold-search in unibyte buffers.

Sorry, I didn't see that suggestion. It would be better than what we have now for char-equal, but it would have undesirable side effects elsewhere. When I type find-file-literally to visit a buffer in raw-text form, it's more convenient if I can type C-s h t m l (or whatever) and find "HTML". I'd rather not lose that capability.
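
A rough sketch of the capability I mean, assuming a temporary unibyte buffer as a stand-in for one visited with find-file-literally (the buffer contents are made up for illustration). With case-fold-search non-nil, the lowercase search string should still find the uppercase text, so search-forward returns a non-nil position. If case-fold-search were ignored in unibyte buffers, this search would fail instead.

```elisp
(with-temp-buffer
  ;; Make the buffer unibyte, as a raw-text buffer would be.
  (set-buffer-multibyte nil)
  (insert "<HTML>")
  (goto-char (point-min))
  ;; Bind inside the buffer, since case-fold-search is per-buffer.
  (let ((case-fold-search t))
    ;; Third argument t: return nil instead of signaling on failure.
    (search-forward "html" nil t)))
```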


