[bug-gawk] Length and contents of RT may be wrong/garbled when RS==""
From: Jeroen Schot
Subject: [bug-gawk] Length and contents of RT may be wrong/garbled when RS==""
Date: Fri, 26 Aug 2011 11:41:24 +0200
User-agent: Mutt/1.5.20 (2009-06-14)
Hello,
Below is a bug report from a Debian user. I have checked his findings
and the same behaviour exists in gawk 3.1.8 and 4.0.0. I have not
verified his patch (also attached). The original bug report can be
found at http://bugs.debian.org/619738
Regards,
--
Jeroen Schot
----- Forwarded message from Rogier -----
From: Rogier
To: Debian Bug Tracking System <address@hidden>
Subject: gawk: Length and contents of RT may be wrong/garbled when RS==""
Date: Sat, 26 Mar 2011 17:50:44 +0100
Package: gawk
Version: 1:3.1.7.dfsg-5
Severity: normal
Tags: patch
The contents of RT may be garbled and the length may be wrong when RS=="".
There are two cases:
- Case 1: The last record is 'terminated' with '\n' instead of '\n\n'
In this case, the length of RT is reported as 0 instead of 1
Example (the 1st and 3rd results are OK; the 2nd should be 1):
$ awk 'BEGIN {printf "0"; exit}' | awk 'BEGIN {RS=""}; {print length(RT)}'
0
$ awk 'BEGIN {printf "0\n"; exit}' | awk 'BEGIN {RS=""}; {print length(RT)}'
0
$ awk 'BEGIN {printf "0\n\n"; exit}' | awk 'BEGIN {RS=""}; {print length(RT)}'
2
- Case 2: RT is longer than the shortest RT seen so far
In this case, the additional characters in RT are garbage.
In a non-C locale, the length is also reported incorrectly.
$ awk 'BEGIN {printf "0\n\n\n1\n\n\n\n\n"; exit}' | LC_ALL=C awk 'BEGIN {RS=""}; {print length(RT),gensub("\n","\\\\n","g",RT)}' | cat -v
3 \n\n\n
5 \n\n\n^@^@
$ awk 'BEGIN {printf "0\n\n\n1\n\n\n\n\n"; exit}' | LC_ALL=en_US.UTF-8 awk 'BEGIN {RS=""}; {print length(RT),gensub("\n","\\\\n","g",RT)}' | cat -v
3 \n\n\n
3 \n\n\n^@^@
In both cases, the output should be:
3 \n\n\n
5 \n\n\n\n\n
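
For reference, the expected values in both cases can be reproduced with a
small model of paragraph-mode splitting. This is only a sketch in Python of
the behaviour the report asks for, not gawk's implementation; the function
name paragraph_records is made up for illustration, and the rule that a
single trailing newline at end of input yields an RT of length 1 is taken
from Case 1 above.

```python
import re

# Separator in paragraph mode (RS == ""): a run of two or more newlines,
# or (assumption from Case 1 of the report) any newlines ending the input.
_SEP = re.compile(r"\n{2,}|\n+\Z")


def paragraph_records(data):
    """Return a list of (record, RT) pairs for the given input string."""
    data = data.lstrip("\n")  # leading newlines are ignored in paragraph mode
    records = []
    pos = 0
    while pos < len(data):
        m = _SEP.search(data, pos)
        if m is None:
            # Input ended without any newline: RT is empty.
            records.append((data[pos:], ""))
            break
        records.append((data[pos:m.start()], m.group()))
        pos = m.end()
    return records


if __name__ == "__main__":
    for _, rt in paragraph_records("0\n\n\n1\n\n\n\n\n"):
        # Same shape as the awk one-liner: length(RT), then RT with \n escaped.
        print(len(rt), rt.replace("\n", "\\n"))
    # prints:
    # 3 \n\n\n
    # 5 \n\n\n\n\n
```

With the three inputs from Case 1 ("0", "0\n" and "0\n\n"), the model gives
RT lengths of 0, 1 and 2 respectively.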
I have attached a patch that fixes these problems, and I have added some
test cases as well. The patched source passes all tests and compiles into
a .deb without errors.
After applying the patch, execute permission must be set on the test scripts:
$ chmod +x test/rtlen*.sh
I hereby put the patch, to which I have all rights, in the public domain,
so that there can (hopefully) be no legal objection to incorporating it.
Regards.
Rogier.
----- End forwarded message -----
gawk-3.1.7.dfsg.RT-patch
Description: Text document