coreutils

Re: du: POSIX mandating a single space instead of tab?


From: Stephane Chazelas
Subject: Re: du: POSIX mandating a single space instead of tab?
Date: Tue, 28 Apr 2015 17:50:58 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

2015-04-28 16:51:06 +0100, Pádraig Brady:
[...]
> > POSIX is already clear that anyone parsing for literal tabs is broken
> > when trying to parse du output.  The only safe way to parse du output is
> > to break on all whitespace (the way awk already does).  I'm 70-30 in
> > favor of changing to spaces.
> 
> What about file names with leading whitespace,
> which now couldn't be split if we didn't use a single tab.
[...]

The point is that it cannot be parsed portably, because all
POSIX guarantees is that there will be at least one blank (and I
suppose the definition of blank is locale-dependent) between the
number and the file name.

Also note that tab and newline (and other blank characters in
your locale) are as valid as the space character in a file name.
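
To illustrate, here is a minimal sketch that works on canned,
du-style output rather than a real du run. The sizes and the two
file names (one containing a tab, one containing a newline) are
made up for the example. It shows what a naive "second
tab-separated field" parse returns:

```shell
# Simulated du output for two files: "a<TAB>b.txt" and "c<NEWLINE>d.txt".
# Sizes and names are hypothetical, chosen only for illustration.
sample=$(printf '4\ta\tb.txt\n8\tc\nd.txt\n')

# Naive parse: take the second tab-separated field of each line.
naive=$(printf '%s\n' "$sample" | cut -f2)
printf '%s\n' "$naive"
```

The result is three lines ("a", "c" and "d.txt"), none of which
is the name of either file.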

If you have to parse the output of du reliably, you need to do
things like:

LC_ALL=C du -k .//*.txt ///var

Then look for those .// or /// markers in the output to see where
each file path begins. A path ends on the line before the next
one that contains //.

Something like:

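LC_ALL=C du -k .//*.txt ///var | LC_ALL=C awk '
  # Report the record accumulated so far, if there is one.
  function process() {
    if (seen) {
      print "disk usage for \"" file "\": " n
    }
  }
  {
    offset = index($0, "//")
    if (offset) {
      # A line containing "//" starts a new record.
      process()
      n = $1
      file = substr($0, offset + 2)
      seen = 1
    } else {
      # No "//": continuation of a file name containing a newline.
      file = file "\n" $0
    }
  }
  END {process()}'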
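
The approach can be tried without touching the file system by
feeding the awk program canned input. The sketch below is a
variant of the parser that tracks a `seen` flag (an assumption of
this sketch, so that a listing with a single entry is also
reported) and uses a shorter output format. The sizes and file
names are made up, and the second name contains a newline:

```shell
# Hypothetical "du -k" output: sizes and names are made up.
# The second entry has a newline in its name ("b<NEWLINE>c.txt").
out=$(printf '4\t.//a.txt\n8\t.//b\nc.txt\n12\t///var\n' | LC_ALL=C awk '
  # Print the record collected so far, if any.
  function process() {
    if (seen) print n ": [" file "]"
  }
  {
    offset = index($0, "//")
    if (offset) {
      # "//" marks the start of a new record.
      process()
      n = $1
      file = substr($0, offset + 2)
      seen = 1
    } else {
      # Continuation line of a name containing a newline.
      file = file "\n" $0
    }
  }
  END { process() }')
printf '%s\n' "$out"
```

Note that the two leading slashes are stripped, so ".//a.txt" is
reported as "a.txt" and "///var" as "/var". The name with the
embedded newline comes out as a single record spanning two lines.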

> 
> I don't think the gain is enough to break compat,
> given the greater alignment control etc. possible
> with expand(1) or numfmt(1) etc.
> I just checked an old wrapper script for du that I use,
> and see that it would be broken for example:
> http://www.pixelbeat.org/scripts/dutop
[...]

I'd tend to agree it would not be worth changing.

-- 
Stephane



