coreutils

Re: [PATCH] Speedup wc -l


From: Pádraig Brady
Subject: Re: [PATCH] Speedup wc -l
Date: Sun, 15 Mar 2015 19:12:53 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0

On 15/03/15 08:33, Kristoffer Brånemyr wrote:
> Hi,
> 
> I did some tests and found out you can actually beat memchr with a simple 
> loop. Tests were done on an Intel Xeon E3-1231v3 (4*3.4GHz), on a 4GB file 
> that was already cached in memory. Benchmarking was done simply with the 
> 'time' command. I don't know how this code would run on other architectures, 
> but I guess you could put it in an #ifdef?
> 
> Coreutils 2.83 version, compiled with -O3:
> 507755520 /home/ztion/words
> 
> real    0m3.126s
> user    0m2.699s
> sys     0m0.429s
> 
> 
> Improved version compiled with -O2:
> 507755520 /home/ztion/words
> 
> real    0m2.857s
> user    0m2.461s
> sys     0m0.396s
> 
> Improved version compiled with -O3:
> 507755520 /home/ztion/words
> 
> real    0m1.518s
> user    0m1.157s
> sys     0m0.361s
> 
> I studied the generated assembly and with -O3 gcc generates some fancy SSE 
> code, getting some nice speedups. memchr is also SSE optimized as far as I 
> know, so it's interesting that this is so much faster, twice as fast actually.
> 
> In case you don't like turning -O3 on for some reason (the default in 
> coreutils is -O2 I think), the best version I could put together for -O2 was 
> this:
> 
> Improved version 2, compiled with -O2:
> 507755520 /home/ztion/words
> 
> real    0m2.206s
> user    0m1.827s
> sys     0m0.379s

Interesting. Thanks for the results.
I use 'gcc -march=native -g -O3' locally, and with that can't see a difference 
in performance.
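
For readers following the thread, the two approaches being compared can be
sketched roughly as below. This is only an illustration and not the code from
the patch: the function names count_lines_memchr and count_lines_loop are
hypothetical, and the real wc reads the file in chunks rather than taking a
single buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count newlines by letting memchr find each one. */
static size_t count_lines_memchr(const char *buf, size_t len)
{
    size_t lines = 0;
    const char *p = buf;
    const char *end = buf + len;

    /* Resume the search just past each newline until none remain. */
    while ((p = memchr(p, '\n', (size_t)(end - p))) != NULL) {
        ++p;
        ++lines;
    }
    return lines;
}

/* Hypothetical helper: count newlines with a plain byte loop.
   The branch-free body is the kind of loop gcc may auto-vectorize at -O3. */
static size_t count_lines_loop(const char *buf, size_t len)
{
    size_t lines = 0;

    for (size_t i = 0; i < len; i++)
        lines += (buf[i] == '\n');
    return lines;
}
```

Which of the two is faster will depend on the average line length, the
compiler flags, and the memchr implementation in the installed glibc, so
timings from one machine should not be assumed to carry over to another.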

What version of glibc and gcc are you using?
gcc-4.9.2-1.fc21.x86_64 and glibc-2.20-7.fc21.x86_64 here.

thanks,
Pádraig.