rdiff-backup-users

Re: [rdiff-backup-users] Tar replacement - format proposal


From: John Goerzen
Subject: Re: [rdiff-backup-users] Tar replacement - format proposal
Date: Sat, 27 Sep 2003 22:23:02 -0500
User-agent: Mutt/1.5.4i

On Fri, Sep 26, 2003 at 06:39:43PM -0700, Ben Escoto wrote:
> Firstly, duplicity and rdiff-backup are written in python, which
> already takes up a ton of space.  I don't even see the problem with
> this.  The last rescue disk I used was Knoppix, which even has

That's not a rescue disk; that's a live operating system on a CD.  A rescue
disk is typically limited to a set of 1.44MB floppies and generally cannot
contain Python or anything approaching that size.

Even if your implementation of duplicity is in Python -- and there's nothing
wrong with that -- the format itself should be readily codable in C,
without requiring any but the barest of standard system libraries.
Otherwise, it will fail as a backup format because it will not be possible
to recover from a catastrophic failure with it.

Keep in mind, too, that Knoppix is by no means the end of the story as far
as live OS CDs go; it runs only on i386 and is likely to be useful only to
people with Linux systems whose hardware it directly supports.  If you're
running, say, Linux on Alpha or Solaris, Knoppix is going to be about as
useful to you as an AOL CD.

I use rdiff-backup myself, and one of the nice things about it is that its
storage format (for the most recent backup anyway) literally is just a bunch
of files on a disk, which is great.

> floppy instead of a 600+MB one.  Besides, is XML really that heavy?
> At least simple XML can be generated and parsed very easily.

Simple XML can be generated very easily.  Parsing is not quite so simple,
because even then a parser has to handle all sorts of quoting situations:
named entities, character references, CDATA sections, and both kinds of
attribute quotes.
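
To make the quoting point concrete, here is a minimal Python sketch. The
<file> element and its name attribute are hypothetical, invented only for
illustration; they are not taken from any proposed format. Each record
spells the same file name a different way, and a conforming parser has to
accept all of them:

```python
import xml.etree.ElementTree as ET

# Hypothetical <file> records: five spellings of the same name, "a&b<c".
variants = [
    '<file name="a&amp;b&lt;c"/>',                  # named entities, double quotes
    "<file name='a&amp;b&lt;c'/>",                  # same, single-quoted attribute
    '<file name="a&#38;b&#60;c"/>',                 # decimal character references
    '<file name="a&#x26;b&#x3C;c"/>',               # hexadecimal character references
    '<file><name><![CDATA[a&b<c]]></name></file>',  # CDATA section in a child element
]

def file_name(record):
    """Return the file name stored in one hypothetical <file> record."""
    elem = ET.fromstring(record)
    if "name" in elem.attrib:
        return elem.attrib["name"]
    return elem.findtext("name")

names = [file_name(v) for v in variants]
print(names)
```

A generator only ever has to emit one of these spellings; a hand-written
extractor in C would need to cope with every one of them.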

> we do with ACLs?  A file can have two ACLs, and each ACL can have a
> number of ACL entries.  Each ACL entry can have a type, a user/group
> id, and a permission set.  Once the data becomes hierarchical, it may
> be easier just to use XML than inventing various sub-encodings for
> each bit of data.

That is indeed an advantage of XML; however, I don't think it outweighs the
problems.

> Assuming all the metadata is placed together in the index, XML will
> typically compress very well.  For instance, on my machine gzip
> compresses the metadata of my rdiff-backup directory from 77MB to 6MB
> (about a factor of 13).

I'm not so much concerned about the size of the metadata (it is a factor,
but not the most important one).  I'm more concerned about the most vital
use of a backup format:

  What can I do if I suffer a catastrophic loss and must bootstrap my system
  from scratch in the least amount of time possible?

With tar backups on tape, for instance, I would boot a Debian rescue disk,
and basically do:

  tar -xvSpf /dev/nst0

I personally use Amanda, so there's a little trick in there involving
skipping the Amanda header, but that just requires a simple dd.
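
The header-skipping step can be sketched as follows. This is written in
Python purely for illustration (on a rescue disk dd and tar do the same
job), and it assumes the Amanda header occupies a single 32 KiB block
ahead of the tar data; check the block size your own setup uses. The
function name and the in-memory "tape" are hypothetical.

```python
import io
import tarfile

# Assumption: the header is one 32 KiB block in front of the archive.
AMANDA_HEADER_SIZE = 32 * 1024

def open_after_header(stream, header_size=AMANDA_HEADER_SIZE):
    """Discard the leading header block, then read the rest as a tar stream."""
    stream.read(header_size)
    # "r|" reads sequentially, so it also works on non-seekable input.
    return tarfile.open(fileobj=stream, mode="r|")

# Build a stand-in for the tape: a dummy header followed by a small archive.
payload = io.BytesIO()
with tarfile.open(fileobj=payload, mode="w") as tar:
    data = b"hello\n"
    info = tarfile.TarInfo(name="etc/motd")
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))

image = io.BytesIO(b"\0" * AMANDA_HEADER_SIZE + payload.getvalue())

with open_after_header(image) as tar:
    restored = [member.name for member in tar]
print(restored)
```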

For rdiff-backup, it involves just slapping my backup drive in the target
box and doing a cp or a tar.

With the proposed format, it will mean bootstrapping enough of the system to
get a working Python and XML library, then copying things over... then I'd
have to make sure that no remnants of the old system remained.  That's a lot
more error-prone and cumbersome.

In fact, even if it were just easy to make a nice extractor in C, that would
be enough to make me happy.  (Again, I like Python, but it's just not always
practical on a rescue disk.)

-- John



