Re: find + sh + grep
From: Eric Blake
Subject: Re: find + sh + grep
Date: Tue, 25 Oct 2011 10:25:36 -0600
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.23) Gecko/20110928 Fedora/3.1.15-1.fc14 Lightning/1.0b3pre Mnenhy/0.8.4 Thunderbird/3.1.15
[re-adding the list, so that others may learn from this conversation]
On 10/25/2011 10:08 AM, Kirk Korver wrote:
> Eric,
> I wanted to thank you again for all of your help. I have one additional
> question. I will understand if you are too busy to respond. It is not
> blocking me, it is just for my personal education. You are more
> knowledgeable than anyone else I know.
/me blushes
Not quite true - part of becoming an "expert" is realizing that there is
almost always someone out there better than you :)
> Here is what I type, and the result.
> grep '$Header:.*$' NoEnd.h
> $Header:
So my initial analysis wasn't quite right. According to POSIX,
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html
A <dollar-sign> ( '$' ) shall be an anchor when used as the last
character of an entire BRE. The implementation may treat a <dollar-sign>
as an anchor when used as the last character of a subexpression.
Therefore, I was wrong in stating that you have to use \$; a
POSIX-compliant regex engine should correctly recognize that '$Header'
uses $ where it cannot be an anchor, and thus treat it as a literal
character without needing the \$. [Now, how many regex implementations
actually implement this part of POSIX, and how many get it wrong?] In
fact, a strict reading of POSIX makes it sound like \$ is undefined if $
would not otherwise be an anchor!
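A quick way to check how a given grep handles this, as a sketch only: the sample file below is made up (the real NoEnd.h is not shown in this thread), and the result assumes a grep that follows the POSIX wording quoted above, as GNU grep does.

```shell
# Hypothetical sample data standing in for NoEnd.h.
sample=$(mktemp)
printf '%s\n' '$Header: /src/NoEnd.h $' 'int Header: 1;' 'no match here' > "$sample"

# Single quotes keep the shell from touching either $.
# The leading $ is not the last character of the BRE, so it is a literal '$';
# only the trailing $ acts as the end-of-line anchor.
matches=$(grep '$Header:.*$' "$sample")
printf '%s\n' "$matches"

rm -f "$sample"
```

Only the first line should come back, since the other two have no literal '$' in front of Header.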
> So now on to my (mis)understandings
> 1) If I had typed grep '\$Header:.*\$' NoEnd.h I would not have a
> match. The single quote tells the shell to not change the contents, and the
> \$ is the dollar sign.
If $ can be an anchor, then \$ is the proper way to match a literal '$'.
At least GNU sed and grep treat \$ as a literal '$' everywhere,
whether or not the $ could otherwise act as an anchor.
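To see the difference between \$ and a bare trailing $, here is a small sketch using a made-up input line. The counts in the comments assume GNU grep; other implementations may differ where POSIX leaves \$ undefined.

```shell
# Hypothetical input line, for illustration only.
line='$Header: foo $'

# \$ where $ could not be an anchor anyway: a literal dollar sign.
lead=$(printf '%s\n' "$line" | grep -c '\$Header' || true)    # 1

# \$ as the last character of the pattern: still a literal dollar sign.
trail=$(printf '%s\n' "$line" | grep -c 'foo \$' || true)     # 1

# Bare $ as the last character: an anchor, so "foo " must end the line.
anchor=$(printf '%s\n' "$line" | grep -c 'foo $' || true)     # 0

printf 'lead=%s trail=%s anchor=%s\n' "$lead" "$trail" "$anchor"
```

The `|| true` is only there because grep exits non-zero when nothing matches.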
> 2) In what I did type, the dollar sign is the 'end of line' character
> 3) I typed, find a line where there is an end of line, followed by Header:
> This is not my intent
Back to that pesky POSIX wording - if $ is not at the end of the regex
or at the end of a subexpression, then it cannot be an anchor; it
therefore does not match the end of line and instead matches a literal '$'.
> 4) I also tried grep "$Header:.*$" NoEnd.h and got 3 matches. I
> then realized that the $Header was being interpreted as the environment
> variable Header which is currently not set, so this becomes grep
> ':.*' which matches all lines with a colon in them. I am not sure what the
> $ all by itself means. Short lesson there, understand better the difference
> between the single quote, and the double quote, and then type what I mean. :)
Yes, in shell expansion, "$Header" is much different from
'$Header'/"\$Header", which is in turn different from '\$Header'.
Also, in shell, "$" produces $, rather than a variable expansion, since
there was no variable name to expand, but it's risky enough that you
should generally escape that particular $.
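The four spellings can be compared side by side. This is a sketch: the value assigned to Header is hypothetical and is set only so that the expansion is visible.

```shell
Header=VALUE    # hypothetical value, for illustration only

dq=$(printf '%s' "$Header:")         # VALUE:     variable expanded
sq=$(printf '%s' '$Header:')         # $Header:   no expansion in single quotes
esc=$(printf '%s' "\$Header:")       # $Header:   backslash protects $ in double quotes
sq_esc=$(printf '%s' '\$Header:')    # \$Header:  backslash kept as-is in single quotes
lone=$(printf '%s' "$")              # $          nothing to expand after the $

printf '%s\n' "$dq" "$sq" "$esc" "$sq_esc" "$lone"
```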
> Now my question, I do not understand why there is a match, in what I
> initially typed. I am missing something about the regular expression. I
> believe that
> $Header:.*$
> means
> [end of line]Header:[any character][any number of times][end of line], which
> should not yield a match.
It actually means literal $, literal Header:, any number of characters,
and end of line.
> Can you shed some light?
So my initial analysis wasn't quite right - but I still stand by the
conclusion that you had a bad regex. It was the combination of double
shell expansion, inside "", that ate enough levels of \ that you ended
up expanding $Header instead of searching for a literal $, so you were
using a different regex than you had planned (:.*$ instead of $Header:.*$).
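What grep actually receives in the double-quoted case can be printed directly. This is a sketch: Header is set to the empty string here, which expands the same way as an unset Header does (absent `set -u`), and the three input lines are made up.

```shell
Header=''    # empty; an unset Header also expands to nothing

# Double quotes let the shell expand $Header before grep ever runs.
pattern="$Header:.*$"
printf 'pattern seen by grep: %s\n' "$pattern"    # :.*$

# With that pattern, every line containing a colon matches.
count=$(printf '%s\n' '$Header: x $' 'a: b' 'no colon' | grep -c "$Header:.*$" || true)
printf 'matching lines: %s\n' "$count"            # 2
```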
At any rate, thanks for forcing me to re-read POSIX and add better
information to this thread.
> P.S. with an 801 number, do you live in SLC?
Yep.
--
Eric Blake address@hidden +1-801-349-2682
Libvirt virtualization library http://libvirt.org