emacs-devel

RE: char equivalence classes in search - why not symmetric?


From: Drew Adams
Subject: RE: char equivalence classes in search - why not symmetric?
Date: Tue, 1 Sep 2015 12:09:54 -0700 (PDT)

> >> Because having both input characters mean the same thing
> >> uselessly deprives the user of expressive power.
> >
> > Examples/arguments/reasons, please.  IOW, prove it.
> 
> I'm sorry: I thought it was obvious.  For case folding, there are three
> sets of characters that might be considered a match: [a], [A], and [aA].
>  The default Emacs behavior is to make "a" mean [aA] and "A" mean [A].
> For the (relatively rare) case in which [a] is desired, one can turn
> case-fold-search off (e.g., with M-c).  Then you gain [a] and lose [aA]
> as a choice (you can't have all three from just two characters!).

You are just echoing what the implementation does, not giving
any supporting reasons for it.

"You can't have all three from just two characters" sounds
important - except that it doesn't mean anything.

It is quite possible for the behavior to be any of these:

 a matches a only
 a matches a and A
 A matches A only
 A matches a and A

The current implementation does not provide for the last
possibility.  In that respect, it can be argued that it "deprives
the user of expressive power".
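
To make the four possibilities concrete, here is a minimal sketch in
Python (not Emacs Lisp, and not how Isearch is implemented).  The
function names and the `fold` flag are hypothetical, chosen only for
illustration.  The second function models the rule quoted from the
manual, under which an upper-case letter in the search string turns
folding off.

```python
def matches(pattern_char, text_char, fold):
    """Return True if pattern_char matches text_char.

    fold=False: the character matches only itself.
    fold=True:  the character matches either case, whichever case
                was typed in the pattern.
    """
    if fold:
        return pattern_char.lower() == text_char.lower()
    return pattern_char == text_char


def emacs_default_matches(pattern_char, text_char):
    """Model of the documented default: an upper-case pattern
    character makes the match case-sensitive (assumed simplification
    to a single character)."""
    fold = not pattern_char.isupper()
    return matches(pattern_char, text_char, fold)


# The four behaviors listed above:
print(matches("a", "A", fold=False))  # a matches a only
print(matches("a", "A", fold=True))   # a matches a and A
print(matches("A", "a", fold=False))  # A matches A only
print(matches("A", "a", fold=True))   # A matches a and A
```

With an independent `fold` choice, all four rows are reachable.  With
`emacs_default_matches`, the fourth one (A matching a) is not.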

But I won't bother making that argument for case folding.
I am not arguing for a change now in the longstanding
case-fold behavior.  I am arguing that we get this right
for char folding.

> With your suggestion (which addresses only case-fold-search, of course),
> we would have only [aA] available whether you typed "a" or "A".  That is
> the less expressive power: the semantically distinct options available
> have been reduced.

That's your suggestion perhaps.  It's certainly not mine.

I suggest letting the user match a to a, a to [aA], A to A, and
A to [aA].  That is more expressive power, not less.  With it,
the "semantically distinct options available" have been increased.

> Of course, with more than one character there are yet other
> possibilities: for two characters there are 9, of which "ab" gives you
> [aA][bB] and each of the other three permutations give one
> (case-sensitive) match each.  4/9 isn't great, but it's better than 1/9!

See above.  You are reducing possibilities, not expanding them.

> > IMO, more users have been tripped up than helped by the rule
> > that "An upper-case letter anywhere in the incremental search
> > string makes the search case-sensitive." (emacs) Search Case.
> 
> How did that upper-case letter get there?  Commands like C-w are careful
> not to add uppercase letters if there aren't already some.  So the user
> must have typed it explicitly, and so they were paying attention to case
> and have no need for a case-insensitive search.  The only harm is if
> they are inconsistent in their typing -- during something as brief as
> isearch.

A char in a search string can "get there" because a user typed it,
and that can be because for that user it is easy to type.  Or it can
get there from a previous search (same Isearch invocation or not).
Or it can "get there" by yanking copied text.

Try typing or pasting "réduction" into Google, and see if it ignores
hits such as "reduction".  Good luck with that.  Silly Google,
missing the "obvious".

It should be obvious that it can be useful to match the pattern
"réduction" against "reduction", just as it can be useful to
match the pattern "reduction" against "réduction" (and "réduction"
against "réduction" and "reduction" against "reduction").
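
A rough sketch of that symmetric matching, again in Python and only
as an illustration: it approximates char folding by decomposing each
string (Unicode NFD) and dropping combining marks.  That is an
assumption made for the example, not a description of how Emacs
computes its equivalence classes, and the function names are
hypothetical.

```python
import unicodedata


def strip_marks(s):
    """Decompose s (NFD) and drop combining marks, so that
    'é' becomes 'e'."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def fold_match(pattern, text):
    """Symmetric folded match: both sides are folded, so it does not
    matter which side carries the accent."""
    return strip_marks(pattern) == strip_marks(text)


def exact_match(pattern, text):
    """Unfolded match: each character matches only itself."""
    return pattern == text


# Folding on: the accented and unaccented spellings match either way.
print(fold_match("réduction", "reduction"))
print(fold_match("reduction", "réduction"))
# Folding off: each spelling matches only itself.
print(exact_match("réduction", "reduction"))
```

Because both the pattern and the text are folded, the relation is
symmetric by construction, and turning folding off still leaves the
exact matches available.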

To remove this possibility, thus reducing user expressiveness,
you really should come up with a reason.


