[rdiff-backup-users] Proposal: Storing excess file information
From: Ben Escoto
Subject: [rdiff-backup-users] Proposal: Storing excess file information
Date: Fri, 29 Nov 2002 16:01:20 -0800
Hi all, let me run by you a scheme for storing file (meta-)data which
won't fit natively on the destination file system. Suggestions
welcome.
Problem: Some file information cannot be stored on the destination
file system because of limitations of the file system, configuration
differences between it and the source system, security issues, or
other reasons.
For instance, as pointed out by Ilya Konstantinov, it is a big
limitation that unless rdiff-backup runs as root on the destination
system, ownership information is lost (because rdiff-backup lacks the
permissions to change file ownership). With root access, ownership
information is preserved, but it can still be set incorrectly or
confusingly on the remote system if user and group IDs don't match.
This will be a bigger problem if/when rdiff-backup supports ACLs.
Finally, sockets, symlinks, fifos, and device files cannot be backed
up to many file system types.
Proposed solution: Every session, rdiff-backup can write extra file
information to a data file in the rdiff-backup-data directory. It
would be a text file, looking like this:
File bin/view
Type sym
SymData view
File bin/zcat
NumberLinks 4
Inode 2834484
Device 771
Uname root
Gname root
File dev/ttyS1
Type dev
DevInfo c 4 65
Uname root
Gname root
...
Each "line" would probably be terminated with a null, in case the
filenames included newlines. Most files would not have an entry at
all, just the ones with data that couldn't fit on the destination
system.
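To make the format concrete, here is a minimal sketch of how such a file
could be read. It assumes that every "line" is null-terminated and that
the key is separated from the value by the first space; both are guesses
based on the example above, not a settled format. The name parse_records
is hypothetical and not part of rdiff-backup.

```python
def parse_records(data: bytes) -> dict:
    """Parse null-terminated 'Key value' lines into {filename: {key: value}}.

    Assumed layout: a 'File <name>' line starts a record, and the lines
    that follow describe that file until the next 'File' line.
    """
    records = {}
    current = None
    for line in data.split(b"\0"):
        if not line:
            continue  # skip the empty piece after the final terminator
        key, _, value = line.partition(b" ")
        if key == b"File":
            current = records.setdefault(value, {})
        elif current is not None:
            current[key.decode("ascii")] = value
    return records


# Sample data mirroring the example entries, joined with nulls.
sample = b"\0".join([
    b"File bin/view", b"Type sym", b"SymData view",
    b"File dev/ttyS1", b"Type dev", b"DevInfo c 4 65",
    b"Uname root", b"Gname root",
]) + b"\0"

records = parse_records(sample)
```

File names and values are kept as bytes, so a name containing a newline
survives unchanged; only the null byte acts as a separator.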
BTW, I thought about doing this in XML, but after spending a few
hours trying to learn XML (and even going to a bookstore and skimming
the Python & XML book) I concluded that either Python has bad XML
support, or, more likely, the dominant XML interfaces like SAX are
very bad for this kind of thing. But if you know XML and disagree,
let me know.
Anyway, the file would be gzipped and stored with reversed diffs,
so it wouldn't take up much space. If you have lots of hard links
(like I do) the new system will probably save you space, as currently
the hard link data is stored in its entirety for each session.
Although it is a text file, it shouldn't be that slow, since it is
always processed in order. One bad point is that restoring a single
file could be slow, since the whole data file might have to be
decompressed and scanned.
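The single-file case could look roughly like the sketch below: the
gzipped data is streamed and scanned in order, and the scan stops as
soon as the wanted record has been read. It assumes the same
null-terminated "Key value" layout as the example above; iter_lines and
find_entry are hypothetical names, and the reversed diffs are left out.

```python
import gzip
import io


def iter_lines(stream, chunk_size=65536):
    """Yield null-terminated lines from a binary stream, chunk by chunk."""
    buf = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        # Everything before the last null is complete; keep the remainder.
        *complete, buf = buf.split(b"\0")
        yield from complete
    if buf:
        yield buf


def find_entry(stream, filename: bytes):
    """Return the {key: value} entry for filename, or None if it is absent."""
    entry = None
    for line in iter_lines(stream):
        if not line:
            continue
        key, _, value = line.partition(b" ")
        if key == b"File":
            if entry is not None:
                return entry  # next record reached, so stop scanning early
            if value == filename:
                entry = {}
        elif entry is not None:
            entry[key.decode("ascii")] = value
    return entry


# Build a small gzipped data file in memory for illustration.
raw = b"\0".join([
    b"File bin/view", b"Type sym", b"SymData view",
    b"File bin/zcat", b"NumberLinks 4", b"Inode 2834484", b"Device 771",
    b"File dev/ttyS1", b"Type dev", b"DevInfo c 4 65",
]) + b"\0"
compressed = gzip.compress(raw)

with gzip.open(io.BytesIO(compressed), "rb") as f:
    zcat_entry = find_entry(f, b"bin/zcat")
```

In the worst case (the file is last, or has no entry at all) the whole
stream still has to be decompressed, which is the slowness described
above.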
And another side note, there was some discussion of this on the
rsync list, under the "virtual file system" rubric. I asked recently,
but didn't think there was enough enthusiasm to try to use the same
file format, or anything like that.
Last point: At first it seemed that this could help with backing up to
a case-insensitive file system, but now I don't see how, since file
name collisions could still happen, no matter what extra information
you had. So it seems this can't replace the current quoting system.
So, any suggestions? (Or offers to implement immediately? :))
--
Ben Escoto