bug-grep


From: Martin Carroll
Subject: Re: SEEK_HOLE defined but useless on linux-3.4+/ext4 [Re: small ascii files can be sparse
Date: Wed, 01 Aug 2012 10:04:37 -0400
User-agent: Mozilla/5.0 (X11; Linux i686; rv:13.0) Gecko/20120717 Thunderbird/13.0

Hi guys,

Ok, I cannot resist putting my $0.02 in.

First, as you all know, neither grep nor any other unix tool was ever
intended to be atomic, or to work "correctly" even if the user (or other
programs) changes the file system out from under the tool.  The unix
philosophy in such cases is "you get what you get."

Second: Why just grep?  If grep gets to have code that defends against
having the file system changed out from under it, then so should all the
other unix tools.  But I suspect we all agree that no one intends to do
that, so why do it just for grep?

As far as I can tell, there is only one good reason for adding special code
to grep, and it has nothing to do with having the file system changed under
it.  The only thing that I want grep to defend against is *writing binary
data -- in particular, binary control sequences -- to the terminal.*

Now, it so happens that the only time that that undesirable behavior can
occur is when grepping a file containing binary data.

But here is another important point that many people seem to overlook.
When I grep a file containing binary data, *I still want grep to show the
matches*, because (1) lots of files with binary data also contain ascii
data, and (2) some ascii files accidentally have some binary data in
them.  More specifically, let foo be a file containing binary data.  If I
type:

  % grep "good men" foo

then I do *not* want it to say:

  Binary file foo matches

I *do* want it to say:

  [binary data]Now is the time for all good men[binary data]

In other words, grep should do exactly what it normally does, except that
grep should by default "asciify" the binary data in the output.  (There
should also be an option to disable asciification and to print the binary
data as is.)
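
To make the idea concrete, here is a rough Python sketch of asciification.
It is not grep's code, and the names asciify, grep_asciified and PLACEHOLDER
are made up for illustration.  It assumes that "binary" means any byte that
is neither printable ascii nor a tab, and it matches a fixed string rather
than a regular expression, just to keep it short.

```python
import re

# Hypothetical marker, borrowed from the example output above.
PLACEHOLDER = b"[binary data]"

# Assumption: a "binary" byte is anything outside printable ASCII
# (0x20-0x7E) other than tab.  A real tool might consult the locale.
_BINARY_RUN = re.compile(rb"[^\x20-\x7e\t]+")


def asciify(line: bytes) -> bytes:
    """Replace each run of binary bytes in line with one placeholder."""
    return _BINARY_RUN.sub(PLACEHOLDER, line)


def grep_asciified(pattern: bytes, data: bytes):
    """Yield each line of data that contains pattern, asciified.

    The match is done on the raw bytes; only the output is altered.
    """
    for line in data.split(b"\n"):
        if pattern in line:
            yield asciify(line)
```

The point of the sketch is that matching and printing stay separate: every
matching line is still reported, and only the bytes that could upset the
terminal are replaced on the way out.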

Now, Paul's patch ca6e5ec860cf5a2af060e2a5c8c53d9004ddc131 (switch
to binary mode when encountering binary data anywhere) makes grep come
a little closer to my desired behavior, but it's not all the way there.
Grep should *not* have two modes, ascii and binary, and switch to binary
mode when binary data is encountered.  Instead, grep should have two modes,
one in which it asciifies binary data in the output, and one in which
asciification is disabled, and the choice of mode should be based solely on
command-line arguments, not on the contents of the file.
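
As a rough illustration of picking the mode from the command line only,
here is a small Python sketch.  The --raw flag, the grep-sketch program name
and the functions build_parser and format_match are all hypothetical; none
of them is an existing grep option or interface.  "Binary" is again assumed
to mean any byte that is neither printable ascii nor a tab.

```python
import argparse
import re

# Assumption: a "binary" byte is anything outside printable ASCII
# (0x20-0x7E) other than tab.
_BINARY_RUN = re.compile(rb"[^\x20-\x7e\t]+")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for a hypothetical grep-like tool."""
    parser = argparse.ArgumentParser(prog="grep-sketch")
    # Hypothetical flag: asciification is on unless --raw is given.
    parser.add_argument("--raw", action="store_true",
                        help="print binary data as is")
    parser.add_argument("pattern")
    return parser


def format_match(line: bytes, raw: bool) -> bytes:
    """Format a matching line for output.

    The mode depends only on the raw argument, never on what the
    line happens to contain.
    """
    if raw:
        return line
    return _BINARY_RUN.sub(b"[binary data]", line)
```

Note that nothing in format_match inspects the file to decide which mode to
use; the same input always produces the same kind of output for a given set
of arguments.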

mar