bug-gnu-utils

RE: sed problem with ^ and \


From: Jonathan_Cook
Subject: RE: sed problem with ^ and \
Date: Wed, 16 Mar 2005 09:27:56 +0100

Hi Bob,

Thanks for the quick reply.

I should probably have said that I was using sed -f and so my 
search/replace filter was actually in a sedfile, hence no quotes. Sorry 
about that.

What you didn't answer, or at least if you did I didn't understand the 
answer, is actually where I was having my biggest problem. It seems that 
my Linux version of sed ...

***
uteuds04:wtu_dev > sed --version
GNU sed version 3.02

Copyright (C) 1998 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE,
to the extent permitted by law.
***

... doesn't recognise ^ to mean "start of line" or $ to mean "end of 
line". I found a similar issue later with the pattern below (supposed to 
trim spaces and nulls from the end of lines) not working because it 
didn't understand the $. 

        s/ .*$//

Looking at the man page for regexp I found the following ...

***
       -lineanchor    Changes  the  behavior  of `^' and `$' (the
                      ``anchors'') so they  match  the  beginning
                      and  end  of  a line respectively.  This is
                      the same as specifying  the  (?w)  embedded
                      option (see METASYNTAX, below).
***

So my question is, can I get sed to interpret ^ to mean start of line 
and $ to mean end of line and if so then how?
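
A minimal sketch of the anchor behaviour in question, for reference. In the POSIX basic regular expressions that sed uses, `^` at the start of a pattern and `$` at the end are anchors by default, so no option should be needed; the `-lineanchor` text quoted above appears to come from the Tcl `regexp` man page rather than from sed. The sample input and the variable names below are hypothetical.

```shell
# Hypothetical sample input: two lines that end in trailing spaces.
sample=$(printf 'alpha   \nbeta gamma  \n_')
sample=${sample%_}   # keep the final newline that $(...) would strip

# ' *$' matches a run of spaces only at the end of the line,
# so this removes trailing spaces and nothing else.
trimmed=$(printf '%s' "$sample" | sed 's/ *$//')
printf '%s\n' "$trimmed"

# For comparison, the pattern from the message, ' .*$', matches from the
# FIRST space to the end of the line, so it also removes "gamma".
cut_at_first_space=$(printf '%s' "$sample" | sed 's/ .*$//')
printf '%s\n' "$cut_at_first_space"
```

If the second form is the one in the sedfile, the `$` is being honoured; the `.*` in front of it is simply matching more than trailing whitespace.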

Regarding your gripe about my email disclaimer, all I can do is 
apologise, stand corrected and thank you for the etiquette lesson. I 
don't like them either but legal say we have to use them. In order to 
suppress it I have to first remember and then jump through a couple of 
hoops hoping that no one is looking. I am new to the Linux community 
having always previously worked on HP-UX systems and I promise to do 
better in future.

Regards,

~Jonathan

-----Original Message-----
From: address@hidden [mailto:address@hidden]
Sent: 15 March 2005 17:13
To: Jonathan Cook /gnva
Cc: address@hidden
Subject: Re: sed problem with ^ and \


address@hidden wrote:
> My sed filter is supposed to delete everything in the file that is not
> between <SQL> and </SQL>
> 
>       /^\<SQL>/,/\<\/SQL\>/!d

I think you meant '/^\<SQL\>/,/\<\/SQL\>/!d', right?  (You did not
backslash the first > in the line.)  But I assume you were quoting the
string in some other way because a plain '>' on the line would have
generated a shell error.  But \< and \> are undefined expressions.  So
that is your problem.

  Regex syntax clashes (problems with backslashes)
     `sed' uses the POSIX basic regular expression syntax.  According to
     the standard, the meaning of some escape sequences is undefined in
     this syntax;  notable in the case of `sed' are `\|', `\+', `\?',
     `\`', `\'', `\<', `\>', `\b', `\B', `\w', and `\W'.
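
A short sketch of how this clash can look in practice, assuming GNU sed: there `\<` and `\>` are given a meaning as start-of-word and end-of-word operators, so `^\<SQL>` asks for a word to start at the very beginning of the line. A line beginning with a literal `<` cannot satisfy that, which can make it look as if `^` were being ignored. The variable names are hypothetical.

```shell
line='<SQL>'

# Literal angle brackets: the pattern matches the line.
plain=$(printf '%s\n' "$line" | sed -n '/^<SQL>/p')

# Backslashed opening bracket: with GNU sed, \< means "start of word",
# which cannot hold at position 0 because the first character is '<'.
escaped=$(printf '%s\n' "$line" | sed -n '/^\<SQL>/p')

printf 'plain=[%s] escaped=[%s]\n' "$plain" "$escaped"
```

On an implementation that treats the undefined escapes as literal characters, as described for HP-UX in this thread, the second command would match as well.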

> This worked fine on HP-UX but on Linux the ^ doesn't seem to be
> recognised to mean "start of line" and it also doesn't seem to like the
> \ escape before the < or >.

On HP-UX the libc RE engine defaults to treating undefined escape
sequences as the character itself.  Of course, use of undefined
sequences is not portable.  Especially for sed, which provides
'egrep'-style expressions in addition to the older-style regular
expressions, there is a collision on syntax.

Try this syntax instead.

  /^<SQL>/,/<\/SQL>/!d

Here is a test case.

  cat >/tmp/testcase <<EOF
  one
  <SQL>
  two
  </SQL>
  three
  <SQL>
  four
  </SQL>
  five
  EOF

  sed '/^<SQL>/,/<\/SQL>/!d' /tmp/testcase
  <SQL>
  two
  </SQL>
  <SQL>
  four
  </SQL>

Does that help?
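
The same filter can also be kept in a script file and run with `sed -f`, where no shell quoting is involved. The following is only a sketch under that assumption; the temporary directory and the file names `testcase` and `keep-sql.sed` are hypothetical.

```shell
workdir=$(mktemp -d)

# A shortened version of the test input above.
cat >"$workdir/testcase" <<'EOF'
one
<SQL>
two
</SQL>
three
EOF

# The sed script file holds the command exactly as sed should see it:
# no surrounding quotes, literal < and >, and only the / escaped.
cat >"$workdir/keep-sql.sed" <<'EOF'
/^<SQL>/,/<\/SQL>/!d
EOF

result=$(sed -f "$workdir/keep-sql.sed" "$workdir/testcase")
printf '%s\n' "$result"

rm -rf "$workdir"
```

The quoted here-document delimiter ('EOF') keeps the shell from interpreting the backslash while the script file is written.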

> We just moved from HP-UX 11 to Linux and I have a behaviour difference
> with a regular expression pattern with sed.
[...]
> This worked fine on HP-UX but on Linux the ^ doesn't seem to be
> recognised to mean "start of line" and it also doesn't seem to like the
> \ escape before the < or >.

Okay, now it is time for the griping to begin!  You are asking for
help about a GNU Project program on a GNU mailing list.  So why are
you talking about the Linux kernel here?  Your question has absolutely
nothing at all to do with Linux.

  http://www.gnu.org/gnu/linux-and-gnu.html

In this case it would have been better to show the version of the sed
program.  The first line of 'sed --version' would be most appropriate.

  sed --version
  GNU sed version 4.1.4

> Confidentiality Note: This message is intended only for the named 
> recipient and may contain confidential, proprietary or legally 
> privileged information. Unauthorized individuals or entities are not 
> permitted access to this information. Any dissemination, distribution,
> or copying of this information is strictly prohibited. If you have
> received this message in error, please advise the sender by reply 
> e-mail, and delete this message and any attachments. Thank you.

And then you have committed a second breach of etiquette.  You have
included an email disclaimer in your message and posted it to a public
mailing list.  Many people on the Internet will refuse to even
acknowledge your messages if they include such notices since basically
you have told them by the notice that they can't.  In any case, they
are annoying.  Don't do it!

  http://www.goldmark.org/jeff/stupid-disclaimers/

If nothing else please post your message from a different account that
does not include such annoying things.

Not to be completely negative, your choice of subject for this message
was quite good and descriptive.  I give you full marks for it.  Good
job there.

Bob

Email Disclaimer: You are not allowed to read this message.




