Re: stat() order performance issues
From: Jim Meyering
Subject: Re: stat() order performance issues
Date: Fri, 26 Jan 2007 18:18:22 +0100
Phillip Susi <address@hidden> wrote:
> Jim Meyering wrote:
>> Which ls option(s) are you using?
>
> I used ls -Ui to list the inode number and do not sort. I expected this
> to simply return the contents from getdents, but I see stat64 calls on
> each file, I believe in the order they are returned by getdents in,
> which causes a massive seek storm.
>
>> Which file system? As you probably know, it really matters.
>
> In my case, reiserfs, but this should apply equally as well to ext2/3.
That's good, but libc version matters too.
And the kernel version. Here, I have linux-2.6.18 and
Debian/unstable's libc-2.3.6.
>> If it's just "ls -U", then ls may not have to perform a single "stat" call.
>> If it's "ls -l", then the stat per file is inevitable.
>> But if it's "ls --inode" or "ls --file-type", with the right file system,
>> ls gets all it needs via readdir, and can skip all stat calls. But with
>> some other file system types, it still has to stat every file.
>
> It seems that ls -U does not stat, but ls -Ui does. It seems it
> shouldn't because the name and inode number are returned by readdir
> aren't they?
Yes.
Make sure you're using the latest version of coreutils.
If necessary, use a debugger to see whether readdir provides
valid inode information on your system. It should.
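If a debugger is inconvenient, a short script gives a quick way to see what the directory entries themselves carry. This is only a sketch, written in Python rather than taken from coreutils: the function name list_inodes is made up for illustration, and it assumes a Unix system, where os.scandir() reads entries via readdir and DirEntry.inode() returns the d_ino field without a separate stat call.

```python
import os


def list_inodes(directory):
    """Return (inode, name) pairs taken from the directory entries alone."""
    # Assumption: on Unix, DirEntry.inode() reports the d_ino value that
    # readdir returned, so no per-file stat() is needed here.
    with os.scandir(directory) as entries:
        return [(entry.inode(), entry.name) for entry in entries]


if __name__ == "__main__":
    # Entries come back in directory order, much like "ls -U".
    for inode, name in list_inodes("."):
        print(inode, name)
```

Running it under strace should show getdents calls but no stat per file. If the inode numbers printed are zero or obviously bogus, the file system is probably not filling in d_ino.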
>> For example, when I run "ls --file-type" on three maildirs containing
>> over 160K entries, it's nearly instantaneous. There are only 3 stat calls:
>> $ strace -c ls -1 a b c|wc -l
>
> Are a, b and c files or directories? If they are files, then of course
They're directories (of course), containing a total of 160K+ entries.
> it would only stat 3 times, because you have only asked ls to look up 3
> files. Try just ls -Ui without the a b c parameters.
>
>>> du in a Maildir with many thousands of small files takes ages to
>>> complete. I have investigated and believe this is due to the order in
>> Yep. du has to perform the stat calls.
>> "ages"? Give us numbers. Is NFS involved? A slow disk?
>> I've just run "du -s" on a directory containing almost 70,000 entries,
>> and it didn't take *too* long with a cold cache: 21 seconds.
>
> Modest disk, no NFS, 114k entries, and it takes 10-15 minutes with cold
> cache. When I sorted the directory listing by inode number and ran stat
> on each in that order with cold caches, it only took something like 1
> minute.
10-15 minutes is very bad.
Something needs an upgrade.
I presume you used xargs -- you wouldn't run stat 114K times...
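For anyone who wants to repeat the sorted-by-inode experiment without spawning a process per file, here is a rough sketch. It is not how du works, and the name stat_in_inode_order is hypothetical. It assumes that readdir supplies usable inode numbers and that inode order roughly follows on-disk layout, which depends on the file system. It sums st_size (apparent size), not disk blocks.

```python
import os


def stat_in_inode_order(directory):
    """lstat() each entry in ascending inode order.

    Returns (names_in_stat_order, total_apparent_size_in_bytes).
    """
    # Collect (inode, name) from the directory entries first; on Unix this
    # should not require a stat() per entry.
    with os.scandir(directory) as entries:
        by_inode = sorted((entry.inode(), entry.name) for entry in entries)

    names = []
    total = 0
    for _inode, name in by_inode:
        # The per-file lstat() is the call whose ordering matters for seeks.
        st = os.lstat(os.path.join(directory, name))
        names.append(name)
        total += st.st_size
    return names, total


if __name__ == "__main__":
    order, size = stat_in_inode_order(".")
    print(len(order), "entries,", size, "bytes")
```

Timing this against a plain directory-order loop, each time with a cold cache, would give numbers that are directly comparable.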