Re: [Duplicity-talk] multi-level diffs; deletions
From: Ben Escoto
Subject: Re: [Duplicity-talk] multi-level diffs; deletions
Date: Wed, 05 Feb 2003 00:06:35 -0800
>>>>> "DC" == Dan Christensen <address@hidden>
>>>>> wrote the following on Tue, 04 Feb 2003 18:03:15 -0500
DC> Actually, I think that with rdiff based incrementals, one is
DC> more likely to want multilevel backups. Let me explain.
DC> Suppose you do a full backup A and then a backup B relative to
DC> A. Now it's time for backup C, and you are trying to decide
DC> whether to do it relative to A or B. The advantage of doing it
DC> relative to A is that you only have two tar files to process if
DC> you have to restore. The disadvantage is that backup C will be
DC> larger.
DC> If you use an rdiff based scheme instead of the usual method of
DC> backing up entire files that have changed, then the advantage of
DC> doing the backup relative to A is even bigger (since restores
DC> are more complicated, especially in the case of a complete disk
DC> failure) and the disadvantage is even less (since rdiff based
DC> backups are smaller).
DC> Put another way, since rdiff based backups produce smaller
DC> incrementals, I'm probably going to space my full backups
DC> further apart than with a traditional approach. So I'm going to
DC> want multilevel incrementals even more, since without them I'm
DC> going to have to process a long sequence of .difftar files in
DC> order to restore my system.
Yes, this makes some sense, but I was thinking of a case where the
diff A->C is perhaps as big as the diff A->B plus the diff B->C.
Then making the A->C diff may not be a good idea: it would take up
more space, and it wouldn't even be quicker to restore, because you'd
have to download the same amount of data either way.
But I was assuming downloading the data is the bottleneck. If the
bottleneck is applying the diffs, then you are right: one A->C diff is
much easier to apply than an A->B diff and a B->C diff, even if the
total size is the same.
At any rate, I agree that it would be a good feature to have.
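
To make the trade-off above concrete, here is a rough sketch in
Python. Everything in it is an assumption for illustration: the sizes,
the link speed, and the idea that applying a diff costs a fixed amount
of time per diff. It is not how duplicity measures anything.

```python
def restore_cost(diff_sizes_mb, download_mb_per_s, apply_s_per_diff):
    """Estimate the seconds needed to fetch and apply a chain of diffs.

    Assumes a fixed per-diff apply cost, which is a simplification.
    """
    download = sum(diff_sizes_mb) / download_mb_per_s
    apply = len(diff_sizes_mb) * apply_s_per_diff
    return download + apply


# The case from the message: diff A->C is as big as A->B plus B->C.
chain = [40, 60]   # A->B, then B->C (hypothetical sizes in MB)
direct = [100]     # A->C

print(restore_cost(chain, download_mb_per_s=1.0, apply_s_per_diff=5.0))
print(restore_cost(direct, download_mb_per_s=1.0, apply_s_per_diff=5.0))
```

With these made-up numbers the download time is identical for both
routes, so any difference comes only from the number of diffs applied.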
BE> Also, I can't think of anything in the architecture which would
BE> preclude this feature---duplicity would just have to keep
BE> signatures for backup sets other than the most recent.
DC> I just tried duplicity out, and it seems to do this already. So
DC> maybe this feature wouldn't be hard to add?
I think the intended behavior is for only the most recent signature to
be kept. I don't think the feature would be hard to add, but I think
it is a rather confusing one, and we would want some way to help the
user specify what s/he wanted. Some built-in algorithms would probably
be nice too, and for restores we would want duplicity to pick the
shortest path to a full backup.
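
As a sketch of what "pick the shortest path to a full backup" might
look like, here is a breadth-first search in Python. The `parents`
mapping and the function name are hypothetical, and "shortest" is
taken to mean the fewest diffs to apply; duplicity does not store or
expose its backup sets this way.

```python
from collections import deque


def shortest_restore_chain(parents, full_backups, target):
    """Return the shortest chain from a full backup to target, or None.

    parents maps each incremental backup to the backups it has a diff
    against (a hypothetical structure, for illustration only).
    """
    queue = deque([[target]])
    seen = {target}
    while queue:
        path = queue.popleft()
        if path[-1] in full_backups:
            return path[::-1]  # full backup first, target last
        for base in parents.get(path[-1], []):
            if base not in seen:
                seen.add(base)
                queue.append(path + [base])
    return None  # no chain reaches a full backup


# A is full; B is relative to A, C to B, and D to both C and A.
parents = {"B": ["A"], "C": ["B"], "D": ["C", "A"]}
print(shortest_restore_chain(parents, {"A"}, "D"))  # ['A', 'D']
print(shortest_restore_chain(parents, {"A"}, "C"))  # ['A', 'B', 'C']
```

If download size mattered more than the number of diffs, the same
search could weight each edge by diff size instead.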
--
Ben Escoto