Re: Extensions to Tar
From: Paul Eggert
Subject: Re: Extensions to Tar
Date: Sat, 30 Mar 2002 22:25:08 -0800 (PST)
> From: Chris Wilson <address@hidden>
> Date: Sun, 31 Mar 2002 02:02:31 +0100 (BST)
>
> I've implemented the "reserved" -y option for compressing individual files
> in the archive using zlib (not gzip). It seems to be working okay, with a
> few bugs which I am tracking down and fixing.
Yes, it is difficult to get right, isn't it?
> Would you be interested in adding this code to tar? I am happy to
> share it under the GPL.
For an extensive change like that we'd need legal papers. Have you done
this before with the FSF? If not I can send you the forms.
> The main disadvantage is that tar currently requires zlib to build
> with this extension, although it could be protected with #ifdefs.
It would have to be buildable without zlib. Not every platform has
zlib. 'configure' should have a --with-zlib=PATH option (see how
OpenSSH does it, for example).
> I'm also considering adding the ability to filter files through GPG, for
> encrypted backups.
That might also be nice, though I think it's lower priority.
> One problem I encountered with filtering was that in order to write the
> file header, tar needs to know _in advance_ the compressed/filtered size
> of the data. This means either compressing the entire file in memory
> before writing it, or compressing twice (once to get the size and write
> the header, then again to write the compressed data). Can anyone see any
> ways around this problem?
I can't.
> (currently, I compress in memory, which limits
> the maximum file size which can be compressed to the available virtual
> memory).
You need to be able to fall back to plan B (compressing it twice);
otherwise the code won't be robust enough in practice.
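
As a rough illustration of plan B, here is a minimal Python sketch (not
tar's actual code). It streams the file through zlib once just to count the
output bytes, rewinds, writes the header, and then compresses again to emit
the data. The names compressed_size, write_member and the write_header
callback are hypothetical, and the sketch assumes the input is seekable and
is not modified between the two passes.

```python
import io
import zlib

CHUNK = 64 * 1024  # read size; an arbitrary choice for this sketch


def compressed_size(src):
    """Pass 1: stream src through zlib and count the output bytes only."""
    comp = zlib.compressobj()
    total = 0
    while True:
        block = src.read(CHUNK)
        if not block:
            break
        total += len(comp.compress(block))
    total += len(comp.flush())
    return total


def write_member(src, dst, write_header):
    """Write one compressed member using two passes over a seekable src.

    write_header(dst, size) is a hypothetical callback standing in for
    whatever code writes the real archive header.
    """
    start = src.tell()
    size = compressed_size(src)
    src.seek(start)  # rewind for pass 2
    write_header(dst, size)

    # Pass 2: same zlib settings and chunking, so the output is identical
    # as long as the input has not changed in the meantime.
    comp = zlib.compressobj()
    written = 0
    while True:
        block = src.read(CHUNK)
        if not block:
            break
        out = comp.compress(block)
        dst.write(out)
        written += len(out)
    tail = comp.flush()
    dst.write(tail)
    written += len(tail)

    if written != size:
        raise RuntimeError("input changed between the two passes")
    return size


# Usage with in-memory streams standing in for a file and an archive:
data = b"example payload " * 10000
archive = io.BytesIO()
write_member(io.BytesIO(data), archive, lambda dst, size: None)
```

Memory use stays bounded by the chunk size regardless of file size, at the
cost of reading and compressing the file twice. The final size check is
there because a file that changes between passes would otherwise produce a
header that disagrees with the data.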
> I have implemented support for these data types using custom values of the
> typeflag header field: 'Y' for zlib-filtered data and 'E' for
> GPG-encrypted data. I hope this is the "right" way to do it.
I would like future extensions to be compatible with POSIX 1003.1-2001
pax Interchange Format. Please see:
http://www.UNIX-systems.org/version3/online.html
Registration is required, the terms are here:
http://www.opengroup.org/onlinepubs/terms.htm
Look at the 'pax' command for a description of the format.
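
For orientation, the pax format carries extra metadata in extended header
records of the form "LENGTH KEYWORD=VALUE" followed by a newline, where
LENGTH is the decimal size of the whole record, counting its own digits. A
minimal Python sketch of building one record follows. The pax_record name
and the EXAMPLE.filter keyword are hypothetical; my understanding is that
vendor extensions use an uppercase vendor prefix, but check the standard
before relying on that.

```python
def pax_record(keyword, value):
    """Build one pax extended header record: b"LEN KEYWORD=VALUE\\n".

    LEN is the decimal byte length of the whole record, including the
    digits of LEN itself, the separating space, and the trailing newline.
    """
    body = b" " + keyword.encode("utf-8") + b"=" + value.encode("utf-8") + b"\n"
    length = len(body)
    # The length field counts its own digits, so iterate to a fixed point.
    while True:
        candidate = len(str(length)) + len(body)
        if candidate == length:
            break
        length = candidate
    return str(length).encode("ascii") + body


# A standard keyword, then a hypothetical vendor-style keyword that a
# zlib filter extension might use instead of a custom typeflag.
path_record = pax_record("path", "foo")
filter_record = pax_record("EXAMPLE.filter", "zlib")
```

The fixed-point loop matters near digit boundaries: a record whose body is
9 bytes long ends up 11 bytes in total, not 10, because the length field
itself grows to two digits.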
> I would like tar to be able to save, restore and modify archives on a
> remote system without using rmt, since it seems to be a big security hole.
> I have untrusted clients who I want to be able to back up to a central
> server without giving them rsh or ssh shell access or anything similar.
With ssh you can give them only the ability to run a certain command,
presumably rmt or a wrapper for rmt. Isn't that enough to give you
what you want?
> I would also like to be able to 'update' archives by writing a new file,
> rather than modifying the existing one, to create incremental backups
> without risking damage to the original archive. Would anyone like me to
> implement this?
Not me. :-) You should be able to do that by telling tar to read
the archive from stdin and write it to stdout, updating as you go.
But as far as I know this part of the tar code has never been that
reliable.
> Finally, for differential/incremental backups, I was thinking about using
> md5sums to detect changed files,
Why? You don't trust the time stamps?
> Should I stick to using the mtime?
Should be the ctime. But I'm lost as to why you'd want to use MD5
here. And if you're paranoid enough to want a cryptographic checksum,
you should use a better one than MD5; SHA1, say.
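
To make the two approaches concrete, here is a small Python sketch (an
illustration only, not how tar does it). changed_since compares a file's
ctime against the time of the previous backup, and sha1_of computes a SHA-1
digest for the checksum-based alternative. Both function names are
hypothetical, and the ctime test assumes a Unix-like system where st_ctime
is the inode change time.

```python
import hashlib
import os


def changed_since(path, last_backup_time):
    """Return True if the inode changed after last_backup_time.

    last_backup_time is seconds since the epoch. On Unix, ctime is updated
    by content changes and by metadata changes such as chmod or rename, and
    cannot be set back with utime(), unlike mtime.
    """
    return os.lstat(path).st_ctime > last_backup_time


def sha1_of(path, chunk_size=64 * 1024):
    """Return the hex SHA-1 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

The ctime check needs only a stat call per file, whereas the checksum
approach has to read every file in full on every backup run.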
Anyway, thanks for your thoughts. I hope I haven't scared you away
from contributing....