Re: [Gnu-arch-users] Slow inventories on large source trees
From: Tom Lord
Subject: Re: [Gnu-arch-users] Slow inventories on large source trees
Date: Wed, 21 Apr 2004 11:42:39 -0700 (PDT)
> From: Aaron Bentley <address@hidden>
> > 1) measure the speed of GNU `find' on this tree with
> > find . '!' -uid 0
> > (I'm assuming that root does not own any files in this tree. The
> > find expression is to force `find' to stat files.)
> Is that necessary? With names tagging, it shouldn't need to stat
> anything, should it?
Yes, it's necessary. `find' can sometimes get by with just a `chdir'
that might fail but `inventory' can not.
> [other message]
> It's calling filename_matches 89750 times -- about 21 times per file in
> the Wine tree, so I suspect that can be reduced, hopefully to single
> digits.
You elsewhere mentioned that that's a `changes' profile, not
`inventory' -- so you're counting _2_ inventories. The actual
average is about 10.5 times per file.
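The arithmetic behind that correction, as a quick sketch. The two input numbers come from the quoted profile; the file count is not stated anywhere in this thread and is only back-derived from the rounded "about 21" figure, so treat it as an estimate.

```python
# Figures quoted from the `changes' profile above.
total_calls = 89750            # filename_matches calls in the whole profile
calls_per_file_profiled = 21   # "about 21 times per file" (rounded)
inventories_per_changes = 2    # `changes' runs two inventories

# Calls attributable to a single inventory pass.
calls_per_inventory = total_calls / inventories_per_changes

# Average calls per file for one inventory.
calls_per_file = calls_per_file_profiled / inventories_per_changes

# Rough file count implied by the rounded per-file figure (an estimate only).
approx_files = total_calls / calls_per_file_profiled

print(calls_per_inventory, calls_per_file, round(approx_files))
```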
You elsewhere mentioned something about ~230 .arch-inventory files, so
I'm assuming nearly every directory has one.
Currently, the tests performed by a traversal during `changes', for a
directory containing a .arch-inv file, are:
is it a control file?
(_NOT_ is it user-defined exclude?)
is it .arch-inv junk?
is it .arch-inv backup?
is it .arch-inv precious?
is it .arch-inv unrec?
is it .arch-inv junk?
is it .arch-inv source?
is it junk?
is it backup?
is it precious?
is it unrec?
is it source?
That's 12 per file except it's truncated for non-source files and
fewer for dirs lacking .arch-inventory. So, 10.5 makes perfect sense.
The last five calls can be replaced with a single call.
The six .arch-inv calls can be replaced with a single call.
That will mean that this example goes from average case 10.5 to
worst-case 3 calls to filename_matches.
Abentley, are you interested in working on this? Do you know about the
`cut' operator in Rx? (I.e., you don't want to combine those
filename_matches calls by adding parentheses to regexps. You want to
arrange for the final state label of the dfa to tell you which pattern
matched.)
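To illustrate the general idea only (this is not Rx and does not use its `cut' operator), here is a minimal Python sketch of classifying a name in a single pass, where the final state says which pattern matched. Everything in it is an assumption made for illustration: the three categories use made-up single-character patterns rather than the real tla regexps, and the small hand-written automata stand in for what a regexp compiler would build. The automata are stepped in lockstep, which amounts to running one product DFA whose state is the tuple of component states.

```python
# Each category is a tiny hand-written DFA: (start_state, step, accepting).
# These patterns are hypothetical stand-ins, not the real tla defaults.

def make_prefix_dfa(first_char):
    """DFA accepting names whose first character is first_char."""
    def step(state, ch):
        if state == "start":
            return "yes" if ch == first_char else "no"
        return state  # decided on the first character, never changes
    return ("start", step, {"yes"})

def make_suffix_dfa(last_char):
    """DFA accepting names whose last character is last_char."""
    def step(state, ch):
        # Only the most recent character matters.
        return "yes" if ch == last_char else "no"
    return ("no", step, {"yes"})

# Order gives priority when more than one component accepts.
CATEGORIES = [
    ("junk", make_prefix_dfa(",")),
    ("backup", make_suffix_dfa("~")),
    ("precious", make_prefix_dfa("+")),
]

def classify(name):
    """Scan name once; return the label of the first accepting component."""
    # The product state is the tuple of every component's current state.
    states = tuple(start for _, (start, _, _) in CATEGORIES)
    for ch in name:
        states = tuple(
            step(state, ch)
            for state, (_, (_, step, _)) in zip(states, CATEGORIES)
        )
    # The final product state determines the label; no second scan is needed.
    for state, (label, (_, _, accepting)) in zip(states, CATEGORIES):
        if state in accepting:
            return label
    return None  # no category matched

for name in [",tmp", "foo.c~", "+notes", "main.c"]:
    print(name, classify(name))
```

The point of the sketch is the interface: one scan of the name yields a label, instead of one scan per category. A real implementation would precompile the combined automaton rather than stepping the components separately on every character.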
-t