bug-grep

Re: have you ever mistyped [[:lower:]] as [:lower:] ?


From: Paolo Bonzini
Subject: Re: have you ever mistyped [[:lower:]] as [:lower:] ?
Date: Wed, 01 Sep 2010 10:29:52 +0200

On 09/01/2010 10:11 AM, Jim Meyering wrote:
No, that's not the same at all.  grep must not do that.
It is one thing to fail for an obviously erroneous construct.
It would be worse to silently transform it into the "intended"
one, since then GNU grep would silently work as intended, but that
same erroneous command would produce different results with non-GNU grep.

Here, we're making it clear that this is a serious error.

It's not, as it is syntactically correct.
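To make the "syntactically correct" point concrete: under the POSIX grammar, [:lower:] on its own is an ordinary bracket expression listing the characters ':', 'l', 'o', 'w', 'e' and 'r'. The sketch below is a deliberately tiny illustration, not code from grep or glibc. The helper name bracket_members is hypothetical, the class table assumes ASCII (roughly the C locale), and ranges, negation, and collating elements are ignored.

```python
import string

# ASCII-only stand-ins for a few POSIX character classes (assumption: C locale).
CLASSES = {
    "lower": set(string.ascii_lowercase),
    "upper": set(string.ascii_uppercase),
    "digit": set(string.digits),
}


def bracket_members(expr):
    """Return the set of characters matched by a simple bracket expression.

    Hypothetical helper: supports literal characters and [:name:] classes only.
    """
    if len(expr) < 3 or expr[0] != "[" or expr[-1] != "]":
        raise ValueError("not a bracket expression")
    body = expr[1:-1]
    members = set()
    i = 0
    while i < len(body):
        if body.startswith("[:", i):
            # A character class is only recognised INSIDE the outer brackets.
            end = body.find(":]", i + 2)
            if end == -1:
                raise ValueError("unterminated character class")
            name = body[i + 2:end]
            if name not in CLASSES:
                raise ValueError("unknown class: " + name)
            members |= CLASSES[name]
            i = end + 2
        else:
            # Anything else, including ':', is just a literal member.
            members.add(body[i])
            i += 1
    return members


print(sorted(bracket_members("[:lower:]")))    # [':', 'e', 'l', 'o', 'r', 'w']
print(len(bracket_members("[[:lower:]]")))     # 26
```

Both inputs are accepted; they simply mean different things, which is why the mistyped form cannot be rejected on grammatical grounds alone.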

Second, if this were done, it should operate in the same way in sed,
expr, awk, and all other GNU programs that deal with regexes (and
possibly in glibc too).

It would be nice to make other GNU programs provide the same new
feature.  However, if they don't (or don't right away), it's not
a big deal.

I know that sed won't as long as I'm maintainer.

If you want to add --warn=error (which is a "superset" of
--warn=always behavior), that's fine and I actually like the idea.
But I think making it the default non-POSIXLY_CORRECT behavior is
wrong.  Honestly, if this happened I would regret having introduced
the feature in the first place.

I hope you don't regret it.
Sometimes you just have to admit that
POSIX-is-clear-and-POSIX-can-be-improved.

POSIX can be improved in many ways, and GNU is a testament to this. In fact, I should have participated more in POSIX and made sure some of my sed extensions went into POSIX.2-2008.

But this is not a case in which POSIX can be improved. POSIX provides a nice grammar for regex, and our warning is a hack on top of that grammar. POSIX provides the clean thing as a standard should, and we build on top of it a useful hack.
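One way to picture a warning that sits "on top of" the grammar is a lint pass that inspects the pattern without changing what it means. The following is a minimal sketch under stated assumptions. The names bracket_expressions and suspicious_classes are hypothetical, the scanner is simplified, and restricting the check to the twelve POSIX class names is a design choice made here. It is not a description of how GNU grep implements its check.

```python
import re

# The twelve character class names defined by POSIX.
_CLASS_NAMES = ("alnum|alpha|blank|cntrl|digit|graph|"
                "lower|print|punct|space|upper|xdigit")
_MISTYPED = re.compile(r":(?:%s):" % _CLASS_NAMES)


def bracket_expressions(pattern):
    """Yield the body of each top-level bracket expression in `pattern`.

    Simplified scanner: honours a leading '^' and a leading literal ']',
    skips over nested [:class:], [=equiv=] and [.coll.] elements, and
    skips backslash escapes outside brackets.
    """
    i, n = 0, len(pattern)
    while i < n:
        c = pattern[i]
        if c == "\\":
            i += 2
            continue
        if c != "[":
            i += 1
            continue
        j = i + 1
        if pattern.startswith("^", j):
            j += 1
        if pattern.startswith("]", j):
            j += 1  # a ']' in first position is a literal
        while j < n and pattern[j] != "]":
            if pattern[j] == "[" and pattern[j + 1:j + 2] in (":", "=", "."):
                close = pattern.find(pattern[j + 1] + "]", j + 2)
                j = n if close == -1 else close + 2
            else:
                j += 1
        if j >= n:
            return  # unterminated; leave the diagnosis to the real parser
        yield pattern[i + 1:j]
        i = j + 1


def suspicious_classes(pattern):
    """Return bracket expressions that look like a mistyped [[:class:]]."""
    return ["[" + body + "]"
            for body in bracket_expressions(pattern)
            if _MISTYPED.fullmatch(body)]


print(suspicious_classes("x[:space:]*y[[:digit:]]"))   # ['[:space:]']
```

The pattern is only inspected, never rewritten, so the regex grammar itself stays exactly as the standard defines it.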

What could have a place in a standard is a mechanism for grep to warn about doubtful regular expressions. Mandating "what is" a doubtful regular expression (which is a prerequisite if you want grep to exit with status 2) does not have its place in a standard. The C standard does not say what algorithms to use in order to find uninitialized variables or dead stores.

We should not let standards get in the way of improving our software.

That we are making this the default behavior
is a tribute to the usefulness of your new feature.

It's not. It's a huge mistake, because making it an error means changing the regex grammar (and making it unnecessarily complicated and contrived).

The following change-set implements what Paul and I have been advocating.
I'll push it later today or tomorrow.

I still think this is wrong, and doubly wrong because I cannot disable it on my system without breaking it with POSIXLY_CORRECT. Please, please implement --warn=error instead. Only then can we discuss:

- making it the default

- making --warn=auto choose between "error" (instead of "yes") and "no"

- leaving things as is, releasing 2.6.4 or 2.7 now, and revisiting the topic again later
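The difference between the options above comes down to what happens once a doubtful bracket expression has been found. A rough sketch, assuming three modes named "no", "always" and "error" as discussed in this thread; the function apply_warn_mode, its return shape, and the message wording are all hypothetical, and "auto" is left out because its selection rule is not spelled out here.

```python
def apply_warn_mode(mode, findings):
    """Decide what to do with a list of suspicious bracket expressions.

    Returns (messages, fatal). Hypothetical policy:
      "no"     -> say nothing, carry on
      "always" -> emit a diagnostic, carry on
      "error"  -> emit the same diagnostic and fail (e.g. exit status 2)
    """
    if mode not in ("no", "always", "error"):
        raise ValueError("unknown warn mode: " + mode)
    if mode == "no" or not findings:
        return [], False
    messages = ["%s: did you mean [%s]?" % (f, f) for f in findings]
    return messages, mode == "error"


print(apply_warn_mode("always", ["[:lower:]"]))
print(apply_warn_mode("error", ["[:lower:]"]))
```

In this sketch "error" produces everything "always" does plus a failure, which is the sense in which it is a superset.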

Paolo

PS: maybe you can seek another opinion and ask glibc to implement this in regex. Then watch Ulrich close it as nonsense in 2 nanoseconds.


