[debbugs-tracker] bug#30829: closed (bug: empty regex exits with error w


From: GNU bug Tracking System
Subject: [debbugs-tracker] bug#30829: closed (bug: empty regex exits with error when following 2-address like LINENO, /RE/)
Date: Thu, 15 Mar 2018 22:35:02 +0000

Your message dated Thu, 15 Mar 2018 16:34:07 -0600
with message-id <address@hidden>
and subject line Re: bug#30829: bug: empty regex exits with error when 
following 2-address like LINENO, /RE/
has caused the debbugs.gnu.org bug report #30829,
regarding bug: empty regex exits with error when following 2-address like 
LINENO,/RE/
to be marked as done.

(If you believe you have received this mail in error, please contact
address@hidden)


-- 
30829: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=30829
GNU Bug Tracking System
Contact address@hidden with problems
--- Begin Message ---
Subject: bug: empty regex exits with error when following 2-address like LINENO,/RE/
Date: Thu, 15 Mar 2018 21:20:37 +0100

The manual states that
 
"the empty regular expression ‘//’ repeats the last regular expression match"
 
however this does not work when the empty regex follows a 2-address of the form LINE_NUMBER,/REGEX/
e.g.
 
# printf %s\\n {1..5} | sed '2,/5/{//!d}'
 
fails with
 
"sed: -e expression #1, char 0: no previous regular expression"
 
instead of printing
 
1
5
 
If it matters, a 2-address like /REGEX/,LINE_NUMBER works as expected e.g.:
 
# printf %s\\n {1..5} | sed '/2/,5{//!d}'
 
correctly prints
 
1
2
 
This is with gnu sed 4.4 on archlinux, vanilla.

--- End Message ---
--- Begin Message ---
Subject: Re: bug#30829: bug: empty regex exits with error when following 2-address like LINENO, /RE/
Date: Thu, 15 Mar 2018 16:34:07 -0600
User-agent: Mutt/1.5.24 (2015-08-30)

Hello,

On Thu, Mar 15, 2018 at 09:20:37PM +0100, Don Crissti wrote:
>    "the empty regular expression ‘//’ repeats the last regular expression
>    match"
> 
>    however this does not work when the empty regex follows a 2-address of
>    the form LINE_NUMBER,/REGEX/
>    e.g.
> 
>    # printf %s\\n {1..5} | sed '2,/5/{//!d}'
> 
>    fails with
> 
>    "sed: -e expression #1, char 0: no previous regular expression"

Thanks for reporting this bug and providing an easy way to reproduce.

Before deciding whether it's a bug, it's worth comparing with other sed
implementations. (I'm using a slightly different sed program because
putting multiple commands on the same line is a GNU extension.)

FreeBSD/OpenBSD/NetBSD:

  $ printf "%s\n" 1 2 3 4 5 | sed -n -e '2,/5/p' -e '//p' 
  sed: first RE may not be empty

BusyBox and ToyBox (output seems incorrect):

  $ printf "%s\n" 1 2 3 4 5 | sed -n -e '2,/5/p' -e '//p'
  1
  2
  2
  3
  3
  4
  4
  5
  5

Heirloom (http://heirloom.sf.net/):

  $ seq 5 | sed-heirloom -n -e '2,/5/p' -e '//p'                            
  2
  3
  4
  5
  5

And surprisingly, GNU sed version 3.02:

  $ seq 5 | sed-gnu-3.02 -n -e '2,/5/p' -e '//p'
  2
  3
  4
  5
  5


GNU sed 4.0 and later:

  $ seq 5 | sed -n -e '2,/5/p' -e '//p'                             
  sed: -e expression #2, char 0: no previous regular expression

=====

Now to why it happens:

POSIX says (http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html):

  "If an RE is empty (that is, no pattern is specified) sed shall behave as
  if the last RE used in the last command applied (either as an address or
  as part of a substitute command) was specified."

And the interpretation (of both GNU sed >= 4.0 and *BSD's sed) is
that the "last RE used in the last command *applied*" means the last RE
*executed*, not the last regex that precedes the empty regex in the program.

And so in this command:

   sed -n -e '2,/5/p' -e '//p'

On the first line, the address 2 is checked (it obviously doesn't match on line 1).
The regex '/5/' is *not* executed (because 2 didn't match).
Then sed tries '//p', but no RE has been executed yet, hence the error.
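
If the intent is simply "reuse the regex from the range address", one
possible workaround is to spell the regex out instead of relying on '//'.
The following is only a sketch under that assumption; the shell variable
names are mine and purely illustrative:

```shell
# Assumption: the empty regex // was only meant as shorthand for /5/,
# so /5/ is repeated explicitly in the second command.

# Variant of the program above; there is no empty regex left to resolve.
explicit_print=$(printf '%s\n' 1 2 3 4 5 | sed -n -e '2,/5/p' -e '/5/p')
printf '%s\n' "$explicit_print"

# The reporter's original program with /5/ instead of //, split into
# separate -e fragments so it does not depend on GNU one-line syntax.
explicit_delete=$(printf '%s\n' 1 2 3 4 5 | sed -e '2,/5/{' -e '/5/!d' -e '}')
printf '%s\n' "$explicit_delete"
```

The first command prints 2 3 4 5 5 (one per line), the same output as
Heirloom sed and GNU sed 3.02 above; the second prints 1 and 5, which is
what the original report expected.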

The reason for this interpretation is that the empty (last) regex can
change during runtime, based on the input.

Consider the following (contrived) example:

 $ printf "%s\n" a ab ab ab \
      | sed '1s/a/X/
             tq
             1s/b/Y/
             :q
             s//*/'
 X
 *b
 *b
 *b

 $ printf "%s\n" b ab ab ab \
      | sed '1s/a/X/
             tq
             1s/b/Y/
             :q
             s//*/'
 Y
 a*
 a*
 a*




The flow is:
1. If line 1 contains 'a' - replace 'a' with 'X' and skip the next check
   ('tq' means "jump to label :q if the last substitution matched").
2. If line 1 contains 'b' - replace 'b' with 'Y'.
3. For every line, replace the last regex with '*'.

And so you see that the last regex changes dynamically during
runtime, based on whether the first line contained 'a' or 'b'.

In the first case, the three 'a's are replaced with '*'.
In the second case, the three 'b's are replaced with '*'.


I therefore think this is not a bug (and I'm marking it as 'done').
However discussion can continue by replying to this thread,
and if there are different opinions we can always re-open it.

regards,
 - assaf


--- End Message ---
