[rdiff-backup-users] Proposal: Storing excess file information

From:	Ben Escoto
Subject:	[rdiff-backup-users] Proposal: Storing excess file information
Date:	Fri, 29 Nov 2002 16:01:20 -0800

Hi all, let me run by you a scheme for storing file (meta-)data which
won't fit natively on the destination file system.  Suggestions
welcome.


Problem: Some file information cannot be stored on the destination
file system because of limitations of the file system, configuration
differences between it and the source system, security issues, or
other reasons.

    For instance, as pointed out by Ilya Konstantinov, it is a big
limitation that unless rdiff-backup runs as root on the destination
system, ownership information is lost (because rdiff-backup lacks the
permissions to change file ownership).  With root access, ownership
information is preserved, but it can still be set incorrectly or
confusingly on the remote system if user and group IDs don't match.
This will be a bigger problem if/when rdiff-backup supports ACLs.
Finally, sockets, symlinks, fifos, and device files cannot be backed
up to many file system types.


Proposed solution:  Every session, rdiff-backup can write extra file
information to a data file in the rdiff-backup-data directory.  It
would be a text file, looking like this:

File bin/view
    Type sym
    SymData view
File bin/zcat
    NumberLinks 4
    Inode 2834484
    Device 771
    Uname root
    Gname root
File dev/ttyS1
    Type dev
    DevInfo c 4 65
    Uname root
    Gname root
...

Each "line" would probably be terminated with a null rather than a
newline, in case the filenames included newlines.  Most files would
not have an entry at all, just the ones with data that couldn't fit
on the destination system.
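
As a rough illustration of the format above, here is a minimal Python
sketch of a reader.  It assumes details the proposal does not settle:
that every "line" ends with a null byte, that a record starts with
"File <name>", and that each indented line is a key followed by a
single space and a value.  The function name parse_records is made up
for this example.

```python
def parse_records(data: bytes):
    """Yield (filename, fields) pairs from null-terminated "lines".

    Assumed layout (not fixed by the proposal): a record starts with
    b"File <name>" and is followed by indented b"Key value" lines.
    """
    name, fields = None, {}
    for raw in data.split(b"\0"):
        if not raw:
            continue  # empty chunk after the final terminator
        if raw.startswith(b"File "):
            if name is not None:
                yield name, fields
            # Keep the filename as raw bytes; it may contain newlines.
            name, fields = raw[len(b"File "):], {}
        else:
            if name is None:
                raise ValueError("field line before any File line")
            # Strip only the leading indentation, so that trailing
            # spaces in a value (e.g. a symlink target) survive.
            key, _, value = raw.lstrip(b" ").partition(b" ")
            fields[key.decode("ascii")] = value
    if name is not None:
        yield name, fields


sample = (b"File bin/view\0    Type sym\0    SymData view\0"
          b"File dev/ttyS1\0    Type dev\0    DevInfo c 4 65\0")
for filename, info in parse_records(sample):
    print(filename, info)
```

Filenames and values are kept as bytes here because a filename is not
guaranteed to be valid text in any particular encoding.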

    BTW, I thought about doing this in XML, but after spending a few
hours trying to learn XML (and even going to a bookstore and skimming
the Python & XML book) I concluded that either Python has bad XML
support, or, more likely, the dominant XML interfaces like SAX are
very bad for this kind of thing.  But if you know XML and disagree,
let me know.

    Anyway, the file would be gzipped and stored with reverse diffs,
so it wouldn't take up much space.  If you have lots of hard links
(like I do) the new system will probably save you space, as currently
the hard link data is stored in its entirety for each session.
Although it is a text file, processing it shouldn't be that slow,
since it is always read in order.  One bad point is that restoring a
single file could be slow, since the whole data file might have to be
decompressed and scanned.
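
To make the single-file cost concrete, here is a hedged Python sketch
of such a scan.  It assumes a gzipped file of null-terminated lines in
which a record starts with "File <name>" and continues with indented
"Key value" lines; the function name find_record and the chunk size
are hypothetical.  The scan is linear, but it reads the file in chunks
and stops as soon as the wanted record is complete.

```python
import gzip


def find_record(path, target: bytes, chunk_size=65536):
    """Return the fields recorded for `target`, or None if absent.

    Streams the gzipped file in chunks rather than loading it whole,
    and returns as soon as the record after the target begins.
    """
    fields, in_target, buf = {}, False, b""
    with gzip.open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            buf += chunk
            # Everything before the last null is a complete line; the
            # remainder stays in buf until more data arrives.
            *lines, buf = buf.split(b"\0")
            if not chunk and buf:
                lines.append(buf)  # unterminated final line
            for line in lines:
                if line.startswith(b"File "):
                    if in_target:
                        return fields  # next record reached: done
                    in_target = line[len(b"File "):] == target
                elif in_target and line:
                    key, _, value = line.lstrip(b" ").partition(b" ")
                    fields[key.decode("ascii")] = value
            if not chunk:  # end of file
                return fields if in_target else None
```

In the worst case (the target is the last record, or is missing) the
whole file is still decompressed, which is the slowness described
above.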

    And another side note, there was some discussion of this on the
rsync list, under the "virtual file system" rubric.  I asked recently,
but didn't think there was enough enthusiasm to try to use the same
file format, or anything like that.

    Last point:  At first it seemed that this could help with backing
up to a case-insensitive file system, but now I don't see how, since
file name collisions could still happen, no matter what extra
information you had.  So it seems this can't replace the current
quoting system.
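
A small Python sketch of the collision problem, under the simplifying
assumption that the destination folds case the way bytes-free
str.lower() does (real file systems differ in their folding rules);
the helper name case_collisions is made up for this example.

```python
from collections import defaultdict


def case_collisions(names):
    """Group names that map to the same name once case is ignored."""
    groups = defaultdict(list)
    for name in names:
        groups[name.lower()].append(name)
    return [group for group in groups.values() if len(group) > 1]


print(case_collisions(["README", "readme", "src/main.c"]))
```

Both colliding names need distinct names on disk before any extra
metadata can be attached to them, which is why a side file alone
does not help.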

    So, any suggestions?  (Or offers to implement immediately? :))


-- 
Ben Escoto
