Re: [Bug-tar] [patch v3] Bug / question in tar


From: Nathan Stratton Treadway
Subject: Re: [Bug-tar] [patch v3] Bug / question in tar
Date: Sat, 29 Mar 2014 18:51:08 -0400
User-agent: Mutt/1.5.20 (2009-06-14)

On Sat, Mar 29, 2014 at 13:28:23 -0700, Tim Kientzle wrote:
> I'm curious.  If someone types the following command:
> 
>    tar cf /dev/null some files
> 
> What do you think they expect to happen?
> 
> I have heard of people using "tar cf /dev/null /mnt/cdrom" to test
> whether the files on a CD-ROM were readable.  If tar "optimizes" by
> not reading the files, that could lead a person relying on this
> behavior to erroneously believe the CD-ROM had no errors.
> 
> If you really believe that sending output to /dev/null should not do
> anything, make it a fatal error so people won't rely on it.

Well, actually the AMANDA backup system is an example of a use-case
which directly relies on the current behavior of doing _something_ but
specifically _not_ reading the input files.

In particular, at the beginning of a dump-run, Amanda goes through all
the partitions that it's going to back up and calculates the size of
output that each possible level of incremental backup will generate by
running tar with the --totals option and output directed to /dev/null. 

In this situation tar walks the directory tree and figures out which
files it needs to back up at that incremental level, then calculates the
total size of those files -- but doesn't bother to actually read the
file data off the disk, thus saving a lot of time:

  # ls -lh bigfile.iso 
  -rwxr--r-- 1 root root 4.6G 2014-03-29 17:42 bigfile.iso

  # time tar --totals -cf /dev/null bigfile.iso 
  Total bytes written: 4879196160 (4.6GiB, 249GiB/s)

  real    0m0.020s
  user    0m0.020s
  sys     0m0.000s

  # time tar --totals -cf /dev/zero bigfile.iso 
  Total bytes written: 4879196160 (4.6GiB, 91MiB/s)

  real    0m51.195s
  user    0m0.150s
  sys     0m4.620s

(In actual practice it may take a minute or two to walk the directory
tree on a partition and calculate total sizes at each incremental level,
but if tar had to actually read all the input files during the
size-calculation phase then the backup run would easily take two or
three times as long as it does now.)
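
For illustration, the size-estimate step could be scripted roughly as
below.  This is only a sketch, not Amanda's actual code: the function
name estimate_tar_size is made up for the example, and it assumes a GNU
tar that prints its "Total bytes written:" line on stderr (the C locale
is forced so the wording of that line is predictable).

```shell
#!/bin/sh
# Hypothetical helper (not part of Amanda or tar): print the number of
# bytes tar would write for the given path, without reading file data.
estimate_tar_size() {
    # With the archive set to /dev/null, tar walks the tree and sums the
    # sizes but skips reading file contents.  --totals reports on stderr,
    # so merge stderr into the pipe and keep only the byte count.
    LC_ALL=C tar --totals -cf /dev/null "$1" 2>&1 |
        sed -n 's/^Total bytes written: \([0-9][0-9]*\).*/\1/p'
}
```

Called as "estimate_tar_size bigfile.iso", this would be expected to
print the same byte count that --totals reported in the first run above.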

Tar's current behavior in this regard is certainly not obvious... but
Amanda has relied on this behavior for a long time, and given that a
particular Amanda installation needs to work with a wide range of GNU
tar versions on the various client systems, it would cause a lot of
trouble if gtar's behavior suddenly changed at this point in its history.

On the other hand, there's certainly something to be said for adding an
option that would allow explicit control over this behavior, so that one
could use /dev/null in cases where one didn't want to keep the output
but specifically did want to read through the input files...  (as long
as the default remained "don't read data" when the output was going to
/dev/null).
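
In the meantime, one workaround that should force tar to read the input
is to send the archive through a pipe, so that tar sees a pipe rather
than /dev/null as its output.  A minimal sketch, assuming GNU tar; the
function name read_through is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: make tar read every input file while discarding
# the archive.  The archive goes to stdout, which is a pipe here, so the
# /dev/null shortcut does not apply.
read_through() {
    # wc -c consumes the stream and prints how many bytes passed through;
    # tr strips the padding some wc implementations add.
    tar -cf - "$1" | wc -c | tr -d ' '
}
```

Any read errors would still show up in tar's own messages on stderr;
the sketch does not try to collect tar's exit status.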

                                                Nathan

----------------------------------------------------------------------------
Nathan Stratton Treadway  -  address@hidden  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239
