diffutils-devel

[Diffutils-devel] Interest in a -B / --binary option to cmp


From: Richard Bass
Subject: [Diffutils-devel] Interest in a -B / --binary option to cmp
Date: Fri, 31 May 2019 10:23:36 -0700
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.5.0

   Hello!  I am new to the diffutils development list, and I am curious
   whether there would be support for a new option to the cmp command.
   Specifically, I am proposing a --binary / -B option which would
   indicate that the user is only interested in a byte-for-byte
   comparison, so cmp should NOT perform line analysis.

   The problem, as I see it, is that if you want to compare a huge volume
   of data to determine whether there has been any corruption, you are
   only interested in whether the files are the same or different, not
   in the line number where they differ.  You could just use cmp -s to do
   this, but then you don't get the answer on stdout.  With cmp -s, you
   have to check the exit status, and so wrap each of the cmp commands
   with an extra test.
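
   For reference, the kind of exit-status wrapper that cmp -s requires
   might look like the minimal sketch below.  The function name
   cmp_report and the message wording are hypothetical, chosen only for
   illustration; the sketch relies on cmp's documented exit statuses
   (0 for identical, 1 for different, 2 for trouble).

```shell
#!/bin/sh
# cmp_report is a hypothetical helper name, not part of diffutils.
# It runs cmp -s (no output) and turns the exit status into one line of text.
cmp_report() {
    cmp -s -- "$1" "$2"
    status=$?
    case $status in
        0) printf '%s %s are identical\n' "$1" "$2" ;;
        1) printf '%s %s differ\n' "$1" "$2" ;;
        *) printf 'cmp: trouble comparing %s and %s\n' "$1" "$2" >&2 ;;
    esac
    # Pass cmp's own exit status back to the caller.
    return "$status"
}
```

   Calling cmp_report f1 f2 then prints a one-line answer on stdout, at
   the cost of carrying this extra function around in every script.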

   I added a -B option to a local copy of cmp built from the latest
   diffutils source, and called the result rcmp.  For two identical
   files of 3.2 GB each, after making sure that both files were cached,
   I got the following:

     $ time rcmp f1 f2
     real    0m4.686s
     user    0m1.560s
     sys     0m3.026s

     $ time rcmp -B f1 f2
     real    0m3.810s
     user    0m0.701s
     sys     0m3.057s

   These numbers were consistent.  In other words, the line number
   analysis added about 23% to the elapsed (real) time.  For a large
   volume of data, this is considerable.  Note that when files do differ,
   the output merely omits the line number:

     $ rcmp f3 f4
     f3 f4 differ: byte 16, line 1
     $ rcmp -B f3 f4
     f3 f4 differ: byte 16
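
   The 23% figure can be rechecked from the two real times reported
   earlier.  The sketch below is only an arithmetic check; the helper
   name overhead_pct is hypothetical, and awk is used just for the
   floating-point division.

```shell
#!/bin/sh
# overhead_pct is a hypothetical helper, not part of diffutils.
# Prints how much longer the first time is than the second, as a percentage.
overhead_pct() {
    awk -v slow="$1" -v fast="$2" \
        'BEGIN { printf "%.1f%%\n", (slow - fast) / fast * 100 }'
}

# Real times, in seconds, from the rcmp and rcmp -B runs.
overhead_pct 4.686 3.810
```

   The result is measured relative to the -B run, that is, as the extra
   elapsed time that line counting adds on top of the plain byte
   comparison.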

   Of course, I could be satisfied with running my own version, but I
   figured that others might be interested in such an option.

   Thanks,
   Richard <rwb>

