From: Jim Meyering
Subject: bug#6557: du sometimes miscounts directories, and files whose link count equals 1
Date: Sat, 03 Jul 2010 10:18:18 +0200
Paul Eggert wrote:
> (I found this bug by code inspection while doing the du performance
> improvement reported in:
> http://lists.gnu.org/archive/html/bug-coreutils/2010-07/msg00014.html
> )
>
> Unless -l is given, du is not supposed to count the same file more
> than once. It optimizes this test by not bothering to put a file into
> the hash table if its link count is 1, or if it is a directory. But
> this optimization is not correct if -L is given (because the same
> link-count-1 file, or directory, can be seen via symbolic links) or if
> two or more arguments are given (because the same such file can be
> seen under multiple arguments). The optimization should be suppressed
> if -L is given, or if multiple arguments are given.
>
> Here is a patch, with a couple of test cases for it. This patch
> assumes the du performance fix, but I can prepare an independent
> patch if you like.
Thanks!
Actually, that patch applies just fine, as-is.
However, it induces this new "make check" test failure:
FAIL: du/files0-from (exit: 1)
==============================
du (GNU coreutils) 8.5.75-569b2
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Torbjorn Granlund, David MacKenzie, Paul Eggert,
and Jim Meyering.
f-extra-arg...
missing...
minus-in-stdin...
empty...
empty-nonreg...
nul-1...
nul-2...
1...
1a...
2...
files0-from: test 2: stdout mismatch, comparing 2.O (actual) and 2.1
(expected)
*** 2.O Sat Jul 3 09:28:08 2010
--- 2.1 Sat Jul 3 09:28:08 2010
***************
*** 1 ****
--- 1,2 ----
0 g
+ 0 g
2a...
files0-from: test 2a: stdout mismatch, comparing 2a.O (actual) and 2a.1
(expected)
*** 2a.O Sat Jul 3 09:28:08 2010
--- 2a.1 Sat Jul 3 09:28:08 2010
***************
*** 1 ****
--- 1,2 ----
0 g
+ 0 g
zero-len...
That's because with the unpatched "du", a command like this, with
a duplicate argument, prints two lines, while the patched version
prints only one:
$ seq 100 > g; du g g
4 g
4 g
$ seq 100 > g; ./du g g
4 g
Note that the vendor versions of "du" from at least Solaris 10,
OpenBSD, NetBSD and FreeBSD print both lines.
I prefer the new semantics, especially when using --total:
$ seq 100 > g; du --total g g
4 g
4 g
8 total
$ seq 100 > g; ./du --total g g
4 g
4 total
You can get some of the old semantics by using -l:
$ seq 100 > g; ./du -l --total g g
4 g
4 g
8 total
What do you think of breaking with that tradition? POSIX does appear
to say that for each "FILE" argument du must print a line, but it also
mentions how with linked files, the space must be counted only once.
You can definitely consider listing the same file twice as being
analogous to a file being hard-linked.
An alternative might be to do this,
$ seq 100 > g; du --total g g
4 g
0 g
4 total
but this is too prone to misinterpretation both by people and by code
that parses du output. So I'm inclined to go with your approach.
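To make the two behaviours concrete, here is a rough Python sketch of the
de-duplication rule under discussion. It is not du's actual code: the
function name du_like and its count_links parameter are made up for
illustration, it uses st_size in bytes instead of disk blocks, and it
handles only plain file arguments (no directory traversal). It assumes
that a file is identified by its (device, inode) pair and that every
argument is recorded, regardless of link count.

```python
import os


def du_like(paths, count_links=False):
    """Return (lines, total) for the given paths.

    Hypothetical helper, not part of du. With count_links=False, a file
    already seen (same device and inode) is skipped entirely, which is
    the patched behaviour. With count_links=True, every argument is
    reported and counted, which approximates -l.
    """
    seen = set()   # (st_dev, st_ino) pairs already counted
    lines = []     # (size, path) pairs, one per reported argument
    total = 0
    for path in paths:
        st = os.lstat(path)
        key = (st.st_dev, st.st_ino)
        if not count_links:
            if key in seen:
                continue  # same file again: no line, no size added
            seen.add(key)
        lines.append((st.st_size, path))
        total += st.st_size
    return lines, total
```

Calling du_like(["g", "g"]) reports g once and counts it once, while
du_like(["g", "g"], count_links=True) reports and counts it twice, which
mirrors the "./du --total g g" and "./du -l --total g g" runs above.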
-------------------------------------
This is the additional patch we'd need to make the failing
test accept your new output. You're welcome to merge
it into yours.
diff --git a/tests/du/files0-from b/tests/du/files0-from
index 620246d..860fc6a 100755
--- a/tests/du/files0-from
+++ b/tests/du/files0-from
@@ -70,15 +70,15 @@ my @Tests =
{IN=>{f=>"g\0"}}, {AUX=>{g=>''}},
{OUT=>"0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ],
- # two file names, no final NUL
+ # two identical file names, no final NUL
['2', '--files0-from=-', '<',
{IN=>{f=>"g\0g"}}, {AUX=>{g=>''}},
- {OUT=>"0\tg\n0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ],
+ {OUT=>"0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ],
- # two file names, with final NUL
+ # two identical file names, with final NUL
['2a', '--files0-from=-', '<',
{IN=>{f=>"g\0g\0"}}, {AUX=>{g=>''}},
- {OUT=>"0\tg\n0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ],
+ {OUT=>"0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ],
# Ensure that $prog processes FILEs following a zero-length name.
['zero-len', '--files0-from=-', '<',