bug#30829: bug: empty regex exits with error when following 2-address li


From: Assaf Gordon
Subject: bug#30829: bug: empty regex exits with error when following 2-address like LINENO, /RE/
Date: Thu, 15 Mar 2018 16:34:07 -0600
User-agent: Mutt/1.5.24 (2015-08-30)

Hello,

On Thu, Mar 15, 2018 at 09:20:37PM +0100, Don Crissti wrote:
>    "the empty regular expression ‘//’ repeats the last regular expression
>    match"
> 
>    however this does not work when the empty regex follows a 2-address of
>    the form LINE_NUMBER,/REGEX/
>    e.g.
> 
>    # printf %s\\n {1..5} | sed '2,/5/{//!d}'
> 
>    fails with
> 
>    "sed: -e expression #1, char 0: no previous regular expression"

Thanks for reporting this bug and providing an easy way to reproduce.

Before deciding whether it's a bug, it's worth comparing with other sed
implementations. (I'm using a slightly different sed program because
putting multiple commands on the same line is a GNU extension.)

FreeBSD/OpenBSD/NetBSD:

  $ printf "%s\n" 1 2 3 4 5 | sed -n -e '2,/5/p' -e '//p' 
  sed: first RE may not be empty

BusyBox and ToyBox (output seems incorrect):

  $ printf "%s\n" 1 2 3 4 5 | sed -n -e '2,/5/p' -e '//p'
  1
  2
  2
  3
  3
  4
  4
  5
  5

Heirloom (http://heirloom.sf.net/):

  $ seq 5 | sed-heirloom -n -e '2,/5/p' -e '//p'                            
  2
  3
  4
  5
  5

And surprisingly, GNU sed version 3.02:

  $ seq 5 | sed-gnu-3.02 -n -e '2,/5/p' -e '//p'
  2
  3
  4
  5
  5


GNU sed 4.0 and later:

  $ seq 5 | sed -n -e '2,/5/p' -e '//p'                             
  sed: -e expression #2, char 0: no previous regular expression

=====

Now to why it happens:

POSIX says (http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html):

  "If an RE is empty (that is, no pattern is specified) sed shall behave
  as if the last RE used in the last command applied (either as an
  address or as part of a substitute command) was specified."

And the interpretation (of both GNU sed 4.0 and later and the *BSD seds)
is that the "last RE used in the last command *applied*" means the last
RE *executed* at runtime - not the last regex that precedes the empty
regex in the program text.

And so in this command:

   sed -n -e '2,/5/p' -e '//p'

On the first input line, the address 2 is checked (it obviously doesn't
match on line 1). The regex '/5/' is *not* executed (because the range
has not started yet). Then sed tries '//p' - but no RE has been executed
so far - hence the error.
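
For contrast, here is a minimal sketch (my own illustration, not taken
from the report) of a program where the empty regex is always defined.
The first command uses a plain regex address, so '/5/' is executed on
every input line before sed reaches '//p'. The variable name 'out' is
only there to capture the result.

```shell
# The regex address /5/ is evaluated on every input line, so by the time
# sed reaches '//p' there is always a "last regex" to reuse.
out=$(printf '%s\n' 1 2 3 4 5 | sed -n -e '/5/p' -e '//p')
printf '%s\n' "$out"
# Line 5 is printed twice: once by '/5/p' and once by '//p'.
```

The only difference from the failing program is that the regex is
reached on line 1, which is what the "last RE executed" reading
requires.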

The reason for this behaviour is that the regex an empty regex refers
to can change at runtime, depending on the input.

Consider the following (contrived) example:

 $ printf "%s\n" a ab ab ab \
      | sed '1s/a/X/
             tq
             1s/b/Y/
             :q
             s//*/'
 X
 *b
 *b
 *b

 $ printf "%s\n" b ab ab ab \
      | sed '1s/a/X/
             tq
             1s/b/Y/
             :q
             s//*/'
 Y
 a*
 a*
 a*




The flow is:
1. If line 1 contains 'a' - replace 'a' with 'X' and skip the next check
   ('tq' means "jump to label :q if the last substitution matched").
2. If line 1 contains 'b' - replace 'b' with 'Y'.
3. For every line, replace the first match of the last-used regex
   with '*'.

And so you see that the last regex changes dynamically during
runtime, based on whether the first line contained 'a' or 'b'.

In the first case, the 'a' on each of the three remaining lines is
replaced with '*'.
In the second case, the 'b' on each of the three remaining lines is
replaced with '*'.
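
If the goal of the original program was "keep everything outside the
range, and inside the range keep only the line that ends it", one
possible workaround is to spell the regex out instead of using the empty
form. This is a sketch under that assumption about the intent; the
variable name 'out' is just for capturing the result.

```shell
# Repeat /5/ explicitly inside the block instead of relying on '//'.
# Each command gets its own -e so the script does not depend on GNU's
# same-line command syntax.
out=$(printf '%s\n' 1 2 3 4 5 | sed -e '2,/5/{' -e '/5/!d' -e '}')
printf '%s\n' "$out"
# Prints 1 and 5: line 1 is outside the range, lines 2-4 are deleted,
# and line 5 matches /5/ so it is kept.
```

Because the regex is written out, the result no longer depends on which
regex happened to be executed last.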


I therefore think this is not a bug (and I'm marking it as 'done').
However, discussion can continue by replying to this thread,
and if there are different opinions we can always re-open it.

regards,
 - assaf




