From: Paolo Bonzini
Subject: Re: Run-time dynamic linking in grep
Date: Mon, 28 Feb 2011 09:49:31 +0100
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101209 Fedora/3.1.7-0.35.b3pre.fc14 Lightning/1.0b3pre Mnenhy/0.8.3 Thunderbird/3.1.7

On 01/21/2011 08:27 PM, Reuben Thomas wrote:
> This is an interesting suggestion, not just because of performance reasons, but because I was trying to interface at the library level, while using a decompressor program directly would avoid having to do API impedance matching.

Right, and it would also be strictly more powerful. For example, it could allow grepping in binary files such as .odt or .doc.
I'd base the choice of a filter strictly on the extension. Not using magic numbers avoids the problem of buffering stdin.
There are of course various possibilities on how to implement it. For example, you could have a file like ~/.grep.filters or /etc/filters.grep:

    .gz   gzip -dc
    .bz2  bzip2 -dc
    .pdf  pdftotext

possibly with an option --filters/--no-filters. Reading the configuration files should be skipped when grepping stdin to avoid useless stats.
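
As a rough illustration of the extension-based lookup (in Python for brevity, not grep's actual C code), the sketch below parses a filter file and picks a command by file extension alone, so no input has to be read or buffered to make the choice. The function names are hypothetical, and two things are assumptions the proposal leaves open: that the file holds one "extension command" entry per line, and that the file name is appended as the last argument of the command.

```python
import os
import shlex
import subprocess


def parse_filters(text):
    """Parse filter-file text into {extension: argv list}.

    Assumes one entry per line: an extension, whitespace, then the command.
    Blank lines and malformed entries are skipped.
    """
    filters = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) != 2:
            continue
        extension, command = parts
        filters[extension] = shlex.split(command)
    return filters


def filter_for(path, filters):
    """Return the filter argv for path, chosen only by its last extension."""
    extension = os.path.splitext(path)[1]
    return filters.get(extension)


def read_through_filter(path, filters):
    """Return the bytes to search: filtered output, or the raw file."""
    argv = filter_for(path, filters)
    if argv is None:
        with open(path, "rb") as handle:
            return handle.read()
    # Assumption: the filter accepts the file name as its final argument
    # and writes the converted data to standard output.
    result = subprocess.run(argv + [path], stdout=subprocess.PIPE, check=True)
    return result.stdout
```

Appending the file name suits "gzip -dc" and "bzip2 -dc", which then write to standard output. A converter such as pdftotext would need an explicit output argument to do the same, so a real design would have to settle how the file name is substituted into the command.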
Given the use case of "grep -r", another possibility could be to add --filters=recurse and make this the default. "-r" would turn on --filters, while no "-r" would leave it off. I don't think it's worth the complication though.
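
The "recurse" variant amounts to a small decision rule, sketched here in Python. The option values "on", "off" and "recurse" and the function name are assumptions made for illustration, not existing grep options.

```python
def filters_enabled(filters_option, recursive):
    """Decide whether filters apply.

    filters_option: "on" (--filters), "off" (--no-filters), or
    "recurse" (the proposed default, which follows -r).
    recursive: True when -r was given.
    """
    if filters_option == "recurse":
        return recursive
    if filters_option == "on":
        return True
    if filters_option == "off":
        return False
    raise ValueError("unknown filters option: %r" % (filters_option,))
```

With "recurse" as the default, a plain "grep pattern file.gz" would search the raw bytes, while "grep -r pattern dir" would run each matching file through its filter.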
Paolo