Re: [bug-gawk] Possible regression in gawk 4.x regarding I/O errors
From: Andrew J. Schorr
Subject: Re: [bug-gawk] Possible regression in gawk 4.x regarding I/O errors
Date: Wed, 23 Jul 2014 12:18:04 -0400
User-agent: Mutt/1.5.23 (2014-03-12)
Hi,
On Mon, Jul 21, 2014 at 04:31:43PM -0400, Assaf Gordon wrote:
> This is reproducible with gawk-4.0.1, gawk-4.0.2, gawk-4.1.1, and latest git.
> gawk-3.1.8 properly detects the errors in both files.
> Tested on Ubuntu 14.04 amd64 and Debian 7 i686.
I was unfortunately able to reproduce this problem using the git master code.
Strace shows this for the bottles file:
read(3, " bottles of beer on the wall\n187"..., 1024) = 1024
write(1, "2000 bottles of beer on the wall"..., 4096) = 4096
read(3, "bottles of beer on the wall\n1844"..., 1024) = 1024
read(3, "ottles of beer on the wall\n1813 "..., 1024) = 1024
read(3, "ttles of beer on the wall\n1782 b"..., 1024) = 1024
read(3, "tles of beer on the wall\n1751 bo"..., 1024) = 1024
write(1, " bottles of beer on the wall\n187"..., 4096) = 4096
read(3, "les of beer on the wall\n1720 bot"..., 1024) = 1024
read(3, "es of beer on the wall\n1689 bott"..., 1024) = 1024
read(3, "s of beer on the wall\n1658 bottl"..., 1024) = 1024
read(3, 0xfb6b5c, 1024) = -1 EIO (Input/output error)
write(1, "tles of beer on the wall\n1751 bo"..., 4096) = 4096
close(3) = 0
write(1, "\n", 1) = 1
exit_group(0) = ?
+++ exited with 0 +++
On the numbers file, gawk catches the error:
read(3, "796\n797\n798\n799\n800\n801\n802\n803\n"..., 1024) = 1024
read(3, "1\n1042\n1043\n1044\n1045\n1046\n1047\n"..., 1024) = 1024
write(1, "1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14"..., 4096) = 4096
read(3, "46\n1247\n1248\n1249\n1250\n1251\n1252"..., 1024) = 1024
read(3, "451\n1452\n1453\n1454\n1455\n1456\n145"..., 1024) = 1024
read(3, "1656\n1657\n1658\n1659\n1660\n1661\n16"..., 1024) = 1024
read(3, "\n1861\n1862\n1863\n1864\n1865\n1866\n1"..., 1024) = 1024
write(1, "1\n1042\n1043\n1044\n1045\n1046\n1047\n"..., 4096) = 4096
read(3, "5\n2066\n2067\n2068\n2069\n2070\n2071\n"..., 1024) = 1024
read(3, "70\n2271\n2272\n2273\n2274\n2275\n2276"..., 1024) = 1024
read(3, "475\n2476\n2477\n2478\n2479\n2480\n248"..., 1024) = 1024
read(3, 0xa61b50, 1024) = -1 EIO (Input/output error)
...
write(1, "\n1861\n1862\n1863\n1864\n1865\n1866\n1"..., 4096) = 4096
write(2, "awk: ", 5awk: ) = 5
write(2, "cmd. line:", 10cmd. line:) = 10
write(2, "1: ", 31: ) = 3
write(2, "(", 1() = 1
write(2, "FILENAME=/tmp/baddisk/numbers.tx"...,
34FILENAME=/tmp/baddisk/numbers.txt ) = 34
write(2, "FNR=2679) ", 10FNR=2679) ) = 10
write(2, "fatal: ", 7fatal: ) = 7
write(2, "error reading input file `/tmp/b"..., 71error reading input file
`/tmp/baddisk/numbers.txt': Input/output error) = 71
write(2, "\n", 1
) = 1
exit_group(2) = ?
+++ exited with 2 +++
For the problematic bottles file, inrec is returning 1 with errcode set to 0,
whereas errcode is 5 (EIO) for the numbers file. So the read error is being
lost before it gets back to the caller; there must be a bug somewhere
in io.c:get_a_record.
I imagine this issue may be related to the tricky record boundary logic:
bash-4.2$ cat /tmp/baddisk/numbers.txt | tail -2 | od -c
cat: /tmp/baddisk/numbers.txt: Input/output error
0000000   2   6   7   8  \n   2   6   7   9  \n
0000012
bash-4.2$ cat /tmp/baddisk/bottles.txt | tail -2 | od -c
cat: /tmp/baddisk/bottles.txt: Input/output error
0000000   1   6   2   9       b   o   t   t   l   e   s       o   f
0000020   b   e   e   r       o   n       t   h   e       w   a   l   l
0000040  \n   1   6   2   8       b   o   t   t   l   e   s
0000055
For the numbers file, the I/O error occurs right after a
linefeed, whereas the error is in the middle of a line in the bottles
file.
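To illustrate the kind of failure I suspect, here is a rough Python sketch. It is not gawk's code: FlakySource, read_records and the propagate_error flag are all made up for this example. It only models the guess that a read error arriving while a partial record is buffered gets treated like end of file, so the partial record is returned and the error code is dropped.

```python
import errno


class FlakySource:
    """Hypothetical stand-in for the bad disk: hands out chunks, then fails with EIO."""

    def __init__(self, data: bytes, chunk: int):
        self.data, self.chunk, self.pos = data, chunk, 0

    def read(self) -> bytes:
        if self.pos >= len(self.data):
            # The readable part is exhausted; every further read fails.
            raise OSError(errno.EIO, "Input/output error")
        piece = self.data[self.pos:self.pos + self.chunk]
        self.pos += len(piece)
        return piece


def read_records(src, propagate_error: bool):
    """Split input on newlines and return (records, errcode).

    With propagate_error=False this mimics the suspected bug: an error that
    hits while a partial record is buffered flushes that record and leaves
    errcode at 0. An error on an empty buffer is always reported.
    """
    records, buf, errcode = [], b"", 0
    while True:
        try:
            piece = src.read()
        except OSError as exc:
            if buf:
                records.append(buf)  # partial record is emitted as if at EOF
                if propagate_error:
                    errcode = exc.errno
            else:
                errcode = exc.errno
            break
        buf += piece
        # Everything before the last newline is complete; the tail stays buffered.
        *complete, buf = buf.split(b"\n")
        records.extend(complete)
    return records, errcode


if __name__ == "__main__":
    numbers = b"2678\n2679\n"                  # error lands right after a linefeed
    bottles = b"1629 bottles\n1628 bottles"    # error lands in the middle of a line
    for name, data in (("numbers", numbers), ("bottles", bottles)):
        for fixed in (False, True):
            recs, err = read_records(FlakySource(data, 8), propagate_error=fixed)
            print(name, "fixed" if fixed else "buggy", recs[-1], err)
```

In this model the numbers-style input reports EIO either way, while the bottles-style input only reports it when the error is carried along with the flushed partial record, which matches what the two strace runs show.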
Regards,
Andy