bug#19414: bug in sed


From: Buchs, Kevin J.
Subject: bug#19414: bug in sed
Date: Mon, 29 Dec 2014 07:51:18 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0

Bob,

Thanks for the reply. Attached is a tar archive with the test cases and the expected output. The bottom line, I suppose, is that I did not expect the asterisk cases to anchor to the beginning of the line by matching the trivial pattern of zero spaces. I realize that the asterisk means zero or more matches of the preceding regexp, but this seems to be a non-greedy (anti-greedy) extreme. If I try to think like a regexp evaluation machine, I can see the logic for the way it works. Maybe because it was in an "s" command, and not a line address, it seemed like it should not behave as if it were anchored at the start of the line.
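
The leftmost-match behaviour discussed here can be made visible with a small sketch. It is only an illustration, not part of the attached test cases: the helper names mark_first_match and strip_first_space_run are hypothetical, and the marker X simply shows where the first match of " *" lands. The second helper uses the portable idiom of one literal space followed by " *", which requires at least one space and so cannot match the empty string at the start of the line.

```shell
#!/bin/sh
# Hypothetical helper: replace the first match of " *" with a visible
# marker, so the position of the (possibly empty) match can be seen.
mark_first_match() {
    sed -e 's/ */X/'
}

# Hypothetical helper: one literal space followed by " *" requires at
# least one space, so the first real run of spaces is removed instead.
strip_first_space_run() {
    sed -e 's/  *//'
}

# " *" matches at the first position where it can match at all, and
# zero spaces is a valid match, so the marker lands at column 0.
printf ' a\nb c\nd  e\n' | mark_first_match
# prints:
#   Xa
#   Xb c
#   Xd  e

printf ' a\nb c\nd  e\n' | strip_first_space_run
# prints:
#   a
#   bc
#   de
```

The match is still greedy: on the line with a leading space, the marker replaces that space. What decides the outcome on the other lines is that the leftmost possible match wins, even when it is empty.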

--
Kevin Buchs   Research Computer Services   Phone: 507-538-5459
Mayo Clinic   200 1st. St SW   Rochester, MN 55905   http://mayoclinic.org

On 12/19/2014 09:06 PM, Bob Proulx wrote:
Buchs, Kevin J. wrote:
I am using sed, version 4.2.1 on a few different systems. What I have
discovered is that inside a substitute command, a space alone is magically
anchored at the start of the line in an anti-greedy match.
First, thank you for your bug report.  Efforts to find and fix bugs
are appreciated.  However...

Could you provide a test case *along with the output* that you are
expecting to see?  Otherwise any of us that look are just going to see
what we expect and go, yep, looks okay to me.

As an example, consider this input stream:

  a                     <-- leading space
b c
d  e
f    g
Sure.

Along with these one-liner invocations:

sed -e 's/ *//'
sed -e 's/[ ]*//'
Sure.

sed -e 's/[[:space::]]*//'
Syntax error.  You have two ':' characters where you almost certainly
wanted only one.

sed -e 's/ ?//'
sed -e 's/ +//'
Those two look like you want extended regular expressions, because of
the ? and +, but the syntax you are using selects basic regular
expressions, where ? and + are ordinary literal characters.  So those
probably won't do what you want, though sed will process them correctly.
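
A short sketch of the difference, assuming a sed that accepts the -E option for extended regular expressions (GNU sed also accepts -r). The function names are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Hypothetical helpers contrasting basic (BRE) and extended (ERE) syntax.
bre_question() { sed -e 's/ ?//'; }      # BRE: a space followed by a literal ?
bre_plus()     { sed -e 's/ +//'; }      # BRE: a space followed by a literal +
ere_question() { sed -E -e 's/ ?//'; }   # ERE: zero or one space
ere_plus()     { sed -E -e 's/ +//'; }   # ERE: one or more spaces

printf 'b c\n'  | bre_question   # prints "b c" (no " ?" sequence in the input)
printf 'b ?c\n' | bre_question   # prints "bc"  (the literal " ?" is removed)
printf 'b c\n'  | bre_plus       # prints "b c" (no " +" sequence in the input)
printf 'b c\n'  | ere_question   # prints "b c" (empty match at the start of the line)
printf 'b c\n'  | ere_plus       # prints "bc"  (the first run of spaces is removed)
```

In GNU sed the same operators are also available in basic syntax as \? and \+, which are GNU extensions.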

What seems to be happening is that the wildcard match gets anchored to the
beginning of the line with a zero character hit.

This case seems to apply not only to spaces but other characters.
Please provide the output you are expecting.  This is what I get and
expect:

  a
b c
d  e
f    g
   sed -e 's/ *//'
   a
   b c
   d  e
   f    g

   sed -e 's/[ ]*//'
   a
   b c
   d  e
   f    g

   sed -e 's/[[:space:]]*//'
   a
   b c
   d  e
   f    g

   sed -e 's/ \?//'
   a
   b c
   d  e
   f    g

   sed -e 's/ \+//'
   a
   bc
   de
   fg

Those are all as expected.  Are you expecting anything different?

Bob

Attachment: sed-tests-Buchs.tar.gz
Description: GNU Zip compressed data

