[debbugs-tracker] bug#33793: closed (sed bug with regular expressions)


From: GNU bug Tracking System
Subject: [debbugs-tracker] bug#33793: closed (sed bug with regular expressions)
Date: Tue, 18 Dec 2018 18:24:03 +0000

Your message dated Tue, 18 Dec 2018 12:23:16 -0600
with message-id <address@hidden>
and subject line Re: bug#33793: sed bug with regular expressions
has caused the debbugs.gnu.org bug report #33793,
regarding sed bug with regular expressions
to be marked as done.

(If you believe you have received this mail in error, please contact
address@hidden)


-- 
33793: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=33793
GNU Bug Tracking System
Contact address@hidden with problems
--- Begin Message ---
Subject: sed bug with regular expressions
Date: Tue, 18 Dec 2018 15:50:49 +0300

Hi. I've found a bug in sed. Here is how to reproduce it:
1) Run bash
2) Execute the command
echo weather -5.0 | sed 's/[^0-9\-\.]//g'
3) You will get "5.0". The expected output is "-5.0".

But if you execute
echo weather -5.0 | sed 's/[^0-9\.\-]//g'
you'll get the correct output "-5.0".

I am using GNU sed version 4.5 on Manjaro Linux.

--- End Message ---
--- Begin Message ---
Subject: Re: bug#33793: sed bug with regular expressions
Date: Tue, 18 Dec 2018 12:23:16 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.3.1

tag 33793 notabug
thanks

On 12/18/18 6:50 AM, Uladzimir Panasiuk wrote:
Hi. I've found a bug in sed. Here is how to reproduce it:
1) Run bash
2) Execute the command
echo weather -5.0 | sed 's/[^0-9\-\.]//g'

You used two range expressions in this regex (0-9 and \-\), but the result is the same as if you had used this regex with only one range expression:

's/[^0-9\.]//g'

Either way, you requested all characters except for the 10 digits, a literal backslash, or a literal dot. Remember, the range expression \-\ inside [] runs from backslash to backslash, so it selects just the single character backslash. Since '-' is not among the characters listed in the negated [] expression, it matches, and sed correctly strips it.
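
A quick way to check this equivalence is sketched below. It assumes GNU sed, and the sample input (which adds a literal backslash to the reporter's string) is made up for illustration:

```shell
# Hypothetical sample input containing a '-', a '.', and a literal backslash.
input='weather -5.0 a\b'

# Original expression: ranges 0-9 and \-\, plus a literal dot.
out1=$(printf '%s\n' "$input" | sed 's/[^0-9\-\.]//g')

# Shorter equivalent: one range, a literal backslash, a literal dot.
out2=$(printf '%s\n' "$input" | sed 's/[^0-9\.]//g')

# Print both results so they can be compared by eye.
printf '%s\n' "$out1" "$out2"
```

Both commands should print the same line, with the backslash kept and the '-' removed.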

3) You will get "5.0". Expected output is "-5.0"

You might be remembering the behavior of perl regex, where \ inside [] is an escape character. But that's not how POSIX regex behaves - inside [], \ is literal, and there are no escape characters.
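
As a small illustration of backslash being literal inside [], here is a sketch that assumes GNU sed; the input string is invented for the example:

```shell
# Inside [], backslash and '*' are both ordinary characters,
# so [\*] matches either a backslash or a star.
out=$(printf '%s\n' 'a\b*c' | sed 's/[\*]/X/g')

# Both the backslash and the star should have been replaced by X.
printf '%s\n' "$out"
```

If \ were an escape character here, only the star would be replaced; instead both characters are.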


But if you execute
echo weather -5.0 | sed 's/[^0-9\.\-]//g'

Here, your regex only has one range expression (0-9), but lists \ twice. The repetition is harmless, but it means that your expression is the same as this shorter one:

's/[^0-9\.-]//g'

It is not obvious from your input whether you intended literal backslashes to be kept in the output or not, but if not, you probably meant to write:

's/[^0-9.-]//g'

with no backslash, and with the - last (as that is one of the few places that you can write - to be matched as itself rather than treated as a range operator between neighboring characters).
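
To see the effect of where the - is placed, here is a sketch using the input from the report and assuming GNU sed. The second variant, with - placed first right after the ^, is an addition not taken from the report; the variable names are arbitrary:

```shell
# '-' last in the bracket expression: matched as itself.
last=$(echo weather -5.0 | sed 's/[^0-9.-]//g')

# '-' first (immediately after the negating ^): also matched as itself.
first=$(echo weather -5.0 | sed 's/[^-0-9.]//g')

# Both results should keep the minus sign.
printf '%s\n' "$last" "$first"
```

In both cases the minus sign survives, since - is now one of the characters listed in the negated expression.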

I'm closing this as not a bug, but feel free to reply with further questions or comments.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


--- End Message ---
