Re: "du -b --files0-from=-" running out of memory
From: Barry Kelly
Subject: Re: "du -b --files0-from=-" running out of memory
Date: Sun, 23 Nov 2008 22:49:47 +0000
Eric Blake wrote:
> [adding the upstream coreutils list]
>
> According to Barry Kelly on 11/23/2008 6:24 AM:
> > I have a problem with du running out of memory.
> >
> > I'm feeding it a list of null-separated file names via standard input,
> > to a command-line that looks like:
> >
> > du -b --files0-from=-
> >
> > The problem is that when du is run in this way, it leaks memory like a
> > sieve. I feed it about 4.7 million paths but eventually it falls over as
> > it hits the 32-bit address space limit.
>
> That's because du must keep track of which files it has visited, so that
> it can determine whether to recount or ignore hard links that visit a file
That's why I said this:
> > Now, I can understand why a du -c might want to exclude excess hard
> > links to files, but that at most requires a hash table for device &
> > inode pairs - it's hard to see why 4.7 million entries would cause OOM
And 4.7 million inode-and-device pairs, assuming 64-bit inodes and
16-bit device data (major & minor), even including alignment padding
(so 16 bytes per entry), add up to only about 75MB of data. That
shouldn't exhaust a 2GB address space.
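The estimate above is easy to check; a minimal sketch (Python, purely for illustration — the 16-byte entry size and 4.7 million count are the figures from this message, not measured values):

```python
# Back-of-envelope check of the memory estimate in the message above.
ENTRY_BYTES = 16         # 64-bit inode + 16-bit device, padded to alignment
N_FILES = 4_700_000      # number of paths fed to du via --files0-from=-

total_bytes = N_FILES * ENTRY_BYTES
print(f"{total_bytes / 1e6:.1f} MB")  # about 75 MB of raw entry data
```

Even with the load-factor and pointer overhead of a real hash table, that is an order of magnitude short of a 2GB address space.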
> already seen. The upstream ls source code was recently changed to store
> this information only for command line arguments, rather than every file
> visited; I wonder if a similar change for du would make sense.
A "visited" hashtable would still be required for calculating '-c'
though.
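The bookkeeping being discussed — count each hard-linked inode at most once when producing a total — can be sketched like this (a toy illustration, not du's actual C implementation; the function name and interface are made up for the example):

```python
import os

def total_size(paths):
    """Sum file sizes, counting each hard-linked inode only once.

    A toy sketch of the 'visited' hashtable discussed above: the
    (st_dev, st_ino) pair uniquely identifies a file, so a second
    hard link to an already-seen inode adds nothing to the total.
    """
    seen = set()   # (st_dev, st_ino) pairs already counted
    total = 0
    for path in paths:
        st = os.lstat(path)      # lstat: don't follow symlinks, like du
        key = (st.st_dev, st.st_ino)
        if key not in seen:
            seen.add(key)
            total += st.st_size
    return total
```

With two names linked to the same file, the size is counted once — which is exactly why some per-inode record is unavoidable for '-c', and why its per-entry footprint matters at 4.7 million paths.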
-- Barry
--
http://barrkel.blogspot.com/