why is grep so slow?
From: Wacek Kusnierczyk
Subject: why is grep so slow?
Date: Fri, 24 Apr 2009 10:50:19 +0200
User-agent: Thunderbird 2.0.0.21 (X11/20090318)
i have a >1GB text file (say, input), and want to count lines matching
some pattern (say, '^>>'). using grep, i got the following timings:
time (grep -c '^>>' input)
# ~6m20s
time (grep '^>>' input | wc -l)
# ~5m20s
sed is much faster:
time (sed -n '/^>>/p' input | wc -l)
# ~0m5s
what's the difference between grep and sed that makes grep so much
slower here?
interestingly,
time (grep -cP '^>>' input)
# ~0m0.2s
it could be that grep buffers the lines before it outputs them, and that
this slows it down on large files, but then -P would not change that,
would it? or does -P change not only the regex matching, but also how
the output is produced?
in all the examples above, the actual output (the line count) was correct.
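
for anyone who wants to reproduce the comparison, here is a minimal sketch that checks that the four commands agree on the count. the generated sample file, its size, and the variable names are assumptions for illustration only (the original input is >1GB and is not included here), and the -P variant is skipped if the local grep was built without PCRE support.

```shell
#!/bin/sh
# Hypothetical stand-in for the real >1GB input: 30000 lines,
# every third of which starts with '>>' (so 10000 matching lines).
input=$(mktemp)
awk 'BEGIN {
    for (i = 1; i <= 30000; i++)
        print (i % 3 == 0 ? ">> header " i : "data line " i)
}' > "$input"

# The same commands as above; tr strips the padding some wc versions add.
count_grep_c=$(grep -c '^>>' "$input")
count_grep_wc=$(grep '^>>' "$input" | wc -l | tr -d ' ')
count_sed_wc=$(sed -n '/^>>/p' "$input" | wc -l | tr -d ' ')

# -P needs a grep built with PCRE support; record that if it is missing.
count_grep_P=$(grep -cP '^>>' "$input" 2>/dev/null) || count_grep_P="unsupported"

printf 'grep -c:       %s\n' "$count_grep_c"
printf 'grep | wc -l:  %s\n' "$count_grep_wc"
printf 'sed  | wc -l:  %s\n' "$count_sed_wc"
printf 'grep -cP:      %s\n' "$count_grep_P"

rm -f "$input"
```

on a file this small all variants finish almost instantly, so this only confirms that the counts match; to see the timing difference, point the same commands at the real file and wrap each one in time ( ... ) as above.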
vQ