bug#34316: sed misbehavior on BRE's
From: Lange, Markus
Subject: bug#34316: sed misbehavior on BRE's
Date: Mon, 4 Feb 2019 13:42:52 +0000
Hi,
I'm currently migrating processes from an old SuSE 9 Linux to a new
CentOS 7 Linux and observed some unexpected behavior changes in sed.
First, some information about the systems:
old:~ # cat /etc/SuSE-release
SuSE Linux 9.0 (i586)
VERSION = 9.0
old:~ # uname -a
Linux biblix 2.4.21-303-smp4G #1 SMP Tue Dec 6 12:33:10 UTC 2005 i686
i686 i386 GNU/Linux
old:~ # sed --version
GNU sed version 4.0.6
...
new:~ # cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
new:~ # uname -a
Linux userWS0.dnb.de 3.10.0-862.11.6.el7.x86_64 #1 SMP Tue Aug 14
21:49:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
new:~ # sed --version
sed (GNU sed) 4.2.2
...
Now let's see how the behavior has changed, which I think is a bug:
old:~ # sed -n 's/^.*004K...\([0-9xX]\{13\}\).*006V...\(.\{1,32\}\).*\(.020F.*\)021A.*$/\2 \1\3/p' Fehlerpica.dat
138742c156c1445f8bdc3a7845548c00 9783507435339020F a19.04.03
18290030a02544e6a451538b0e44f9e2 9783507435377020F a19.04.03
4c7ff6d790b34470852434f5ee41200b 9783034312189020F a12.12.11
The new system, however, does not output anything with this expression.
Removing the end-of-line anchor ($) from the expression works around
the problem to some extent:
old:~ # sed -n 's/^.*004K...\([0-9xX]\{13\}\).*006V...\(.\{1,32\}\).*\(.020F.*\)021A.*/\2 \1\3/p' Fehlerpica.dat
138742c156c1445f8bdc3a7845548c00 9783507435339020F a19.04.03
18290030a02544e6a451538b0e44f9e2 9783507435377020F a19.04.03
4c7ff6d790b34470852434f5ee41200b 9783034312189020F a12.12.11
new:~ # sed -n 's/^.*004K...\([0-9xX]\{13\}\).*006V...\(.\{1,32\}\).*\(.020F.*\)021A.*/\2 \1\3/p' Fehlerpica.dat
138742c156c1445f8bdc3a7845548c00 9783507435339020F a19.04.03�208@ a30-01-19bc
18290030a02544e6a451538b0e44f9e2 9783507435377020F a19.04.03�208@ a30-01-19bc
4c7ff6d790b34470852434f5ee41200b 9783034312189020F a12.12.11�208@ a30-01-19bc
To me this seems to be the first unexpected behavior. The second,
which I think is closely related, is that the first match group gets
text from the end of the line attached. Maybe the first match group
consumes the line end?
So I started breaking the expression down, using only the first match
group:
old:~ # sed -n 's/^.*004K...\([0-9xX]\{13\}\).*$/\1/p' Fehlerpica.dat
9783507435339
9783507435377
9783034312189
The new system still doesn't output anything. Leaving the end-of-line
anchor out of the expression does produce output on the new system:
old:~ # sed -n 's/^.*004K...\([0-9xX]\{13\}\).*/\1/p' Fehlerpica.dat
9783507435339
9783507435377
9783034312189
new:~ # sed -n 's/^.*004K...\([0-9xX]\{13\}\).*/\1/p' Fehlerpica.dat
9783507435339�208@ a30-01-19bc
9783507435377�208@ a30-01-19bc
9783034312189�208@ a30-01-19bc
However, the output differs and is wrong on the new system. The end of
the line is still appended to the match group.
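To make the expected result concrete, here is a minimal sketch using a
made-up, ASCII-only record. The value of `line` is a hypothetical
stand-in and is not taken from Fehlerpica.dat. On such input, both
variants of the expression (with and without `$`) should print only the
13 characters of the match group:

```shell
# Hypothetical ASCII-only record; NOT taken from Fehlerpica.dat.
line='003@ $0123456789 004K $09783507435339 006V $0138742c156c1445f8bdc3a7845548c00 tail'

# Same expression as above, once with and once without the trailing $ anchor.
with_anchor=$(printf '%s\n' "$line" | sed -n 's/^.*004K...\([0-9xX]\{13\}\).*$/\1/p')
without_anchor=$(printf '%s\n' "$line" | sed -n 's/^.*004K...\([0-9xX]\{13\}\).*/\1/p')

# Both variables should hold the same 13-character string.
printf '%s\n%s\n' "$with_anchor" "$without_anchor"
```

If this prints the same value twice on both systems, the difference is
tied to the actual bytes in Fehlerpica.dat rather than to the
expression itself.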
If I try using only the second match group, the string is appended
there:
old:~ # sed -n 's/^.*006V...\(.\{1,32\}\).*/\1/p' Fehlerpica.dat
138742c156c1445f8bdc3a7845548c00
18290030a02544e6a451538b0e44f9e2
4c7ff6d790b34470852434f5ee41200b
new:~ # sed -n 's/^.*006V...\(.\{1,32\}\).*/\1/p' Fehlerpica.dat
138742c156c1445f8bdc3a7845548c00�208@ a30-01-19bc
18290030a02544e6a451538b0e44f9e2�208@ a30-01-19bc
4c7ff6d790b34470852434f5ee41200b�208@ a30-01-19bc
So it seems like the first match group consumes far too much text in a
non-linear way, breaking the match of the line end.
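One check that may help narrow this down, offered as an assumption
rather than a diagnosis: the � characters in the output suggest bytes
that are not valid text in the new system's locale. If the two systems
run sed under different locales, forcing the C locale (where every byte
counts as one character) should show whether the locale is involved.
The sample record below is made up; the byte written as `\377` merely
stands in for whatever byte is really in the file.

```shell
# Print the locale-related variables sed runs under (compare on both systems).
printf 'LANG=%s LC_CTYPE=%s LC_ALL=%s\n' "${LANG-}" "${LC_CTYPE-}" "${LC_ALL-}"

# Hypothetical record with one non-ASCII byte (octal 377) after the digits;
# this is NOT the real content of Fehlerpica.dat.
c_locale=$(printf 'x004K...9783507435339\377208@ tail\n' |
  LC_ALL=C sed -n 's/^.*004K...\([0-9xX]\{13\}\).*$/\1/p')

# In the C locale, the anchored expression should match the whole line.
printf '%s\n' "$c_locale"
```

The corresponding test on the real data would be to rerun the original
commands on the new system with `LC_ALL=C` set in front of `sed`.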
I've attached the Fehlerpica.dat for you and hope you can reproduce the
misbehavior.
If I can provide further information please let me know.
Thank you and best regards,
Markus Lange
--
***Lesen. Hören. Wissen. Deutsche Nationalbibliothek***
Deutsche Nationalbibliothek
Fachbereich IT, Informationsinfrastruktur
Adickesallee 1
60322 Frankfurt am Main
Tel: +49 69 1525 -1786
mailto:address@hidden
http://www.dnb.de
Attachment: Fehlerpica.dat