bug-bash
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: using mapfile is extreamly slow compared to oldfashinod ways to read


From: Chet Ramey
Subject: Re: using mapfile is extreamly slow compared to oldfashinod ways to read files
Date: Sat, 28 Mar 2009 11:58:29 -0400
User-agent: Thunderbird 2.0.0.21 (Macintosh/20090302)

Lennart Schultz wrote:
> It seems that mapfile is OK for small numbers but for bigger numbers it
> starts to compsume time.

Not exactly.  Your own timing tests show that mapfile itself is blindingly
fast.  The time is consumed sequentially traversing the (very large) array.

Bash indexed arrays are implemented as doubly-linked lists, so accessing
a single element is (if I remember my combinatorics correctly, which is
unlikely) O(N) instead of O(1), and accessing the entire array
sequentially is O(N**N).

For instance, when I factor out the command substitution, with an array
with 100000 elements, I get the following times:

create the file:
real    0m47.317s
user    0m7.197s
sys     0m10.722s

read the file sequentially using a while loop:
real    0m9.609s
user    0m5.650s
sys     0m3.644s

mapfile:
real    0m0.062s
user    0m0.049s
sys     0m0.009s

accessing $MAPFILE sequentially:
real    1m36.880s
user    1m24.963s
sys     0m7.161s


I'm sure there are efficiency improvements possible in the bash indexed
array implementation, but sequentially accessing a data structure
optimized for space and sparse arrays is never going to be as fast as
a read-process loop, and that difference becomes more and more apparent
the larger the array.

Chet
-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer

Chet Ramey, ITS, CWRU    chet@case.edu    http://cnswww.cns.cwru.edu/~chet/




reply via email to

[Prev in Thread] Current Thread [Next in Thread]