
Re: [Duplicity-talk] are periodic full backups necessary?


From: Peter Schuller
Subject: Re: [Duplicity-talk] are periodic full backups necessary?
Date: Sun, 20 Jan 2008 07:57:35 +0100
User-agent: KMail/1.9.7

> I'm trying to get a good understanding of the tradeoffs between
> rdiff-backup and duplicity.  One of the nice things about rdiff-backup
> is that the "most recent and backward diffs" means you never need to
> do a "full backup".

Agreed.

> What are the consequences of never doing a full backup with
> duplicity's "original and forward diffs" format?  Will the time
> required for an incremental backup increase in proportion to how many
> incrementals there have been since the last full backup?  Or is the
> backup time independent of how "far back" the most recent full backup
> was?

I believe the fundamental problem is that in order to generate a patch 
containing the differences between version n - 1 and version n, you need 
access to the data as it appears at both n - 1 and n.

In duplicity's case, that would mean duplicity would have to perform a full 
restore (last full backup plus all subsequent increments) before it could 
generate another forward diff against the new version of the file.
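To make the cost concrete, here is a toy sketch (plain Python dicts standing 
in for file trees; none of this is duplicity's actual code or archive format) 
of why a forward-diff archive must replay every increment since the last full 
backup before it can produce the next diff:

```python
# Hypothetical sketch: snapshots are dicts, a "diff" records changed keys.

def diff(old, new):
    """Forward diff: the changes needed to turn `old` into `new`."""
    d = {k: v for k, v in new.items() if old.get(k) != v}
    d.update({k: None for k in old if k not in new})  # None marks a deletion
    return d

def patch(base, d):
    """Apply a forward diff to a base snapshot."""
    out = dict(base)
    for k, v in d.items():
        if v is None:
            out.pop(k, None)
        else:
            out[k] = v
    return out

snapshots = [
    {"a": 1, "b": 2},          # full backup
    {"a": 1, "b": 3},          # state after change 1
    {"a": 1, "b": 3, "c": 4},  # state after change 2
]
full = snapshots[0]
increments = [diff(snapshots[i], snapshots[i + 1]) for i in range(2)]

# Before the next forward diff can be computed, the latest state must be
# reconstructed by replaying *every* increment since the full backup --
# the restore cost described above grows with the length of the chain.
state = full
for inc in increments:
    state = patch(state, inc)
assert state == snapshots[-1]
```

With a reverse-diff scheme the newest state is stored directly, so no such 
replay is needed to diff against it.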

That is assuming the current archive format of duplicity. If duplicity were to 
try to implement an rdiff-backup style system with an up-to-date copy + 
reverse diff, the situation would be different but still problematic:

rdiff-backup can do what it does efficiently because it is running on both 
ends of the pipe. It does not need to transfer the entire file (neither n - 1 
nor n) in either direction, thanks to its use of the rsync algorithm.
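For illustration, a heavily simplified sketch of the signature/delta split at 
the heart of the rsync algorithm (real rsync/librsync pairs a cheap rolling 
weak checksum with a strong checksum; this toy version naively rehashes a 
window at every byte offset, which is correct but inefficient):

```python
# Toy fixed-block rsync sketch -- not rdiff-backup's real implementation.
import hashlib

BLOCK = 4

def signature(data):
    """Receiver side: checksum each block of the old copy it already has."""
    return {hashlib.md5(data[i:i + BLOCK]).hexdigest(): i
            for i in range(0, len(data), BLOCK)}

def delta(sig, new):
    """Sender side: emit block references where possible, literals otherwise."""
    out, i = [], 0
    while i < len(new):
        h = hashlib.md5(new[i:i + BLOCK]).hexdigest()
        if h in sig:
            out.append(("ref", sig[h]))        # a block the receiver has
            i += BLOCK
        else:
            out.append(("lit", new[i:i + 1]))  # one literal byte
            i += 1
    return out

def apply_delta(old, d):
    """Receiver side: rebuild the new file from its old copy plus the delta."""
    out = b""
    for kind, val in d:
        out += old[val:val + BLOCK] if kind == "ref" else val
    return out

old = b"abcdefgh"
new = b"abcdXYZefgh"
d = delta(signature(old), new)
sent = sum(len(v) for k, v in d if k == "lit")
assert apply_delta(old, d) == new
assert sent == 3  # only the inserted "XYZ" travels as literal data
```

The receiver summarizes the copy it already has, the sender matches that 
summary against the new data, and only literal bytes for unmatched regions 
cross the wire - which is exactly why this needs cooperating code on both 
ends of the pipe.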

In addition, even if one were to accept that duplicity needed to run on the 
remote end, there are security implications: for the diffing to work at all, 
the process running on the remote end must have access to the file being 
diffed. And if you do not encrypt, and you are running a remote process, then 
what was the point of using duplicity instead of rdiff-backup to begin with?

Perhaps a compromise is possible whereby each individual file being backed up 
is separately stored and encrypted, such that the rsync algorithm can be 
applied to the encrypted data even with an untrusted remote system. However, 
this also has security implications, since an observer can then determine 
more about the number of files, their sizes, the distribution of changes over 
time, and so on than is possible with opaque volume uploads.

Perhaps a single large "virtual" volume could be generated for a complete 
backup, which would then be used to apply the rsync algorithm against the 
previous volume. This assumes the rsync algorithm only requires one pass 
(does it?) and that it will all work well even in the face of large 
displacements of data within this huge file (probably not).
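For what it is worth, the sender side of the rsync algorithm does scan the 
new data in a single pass, and the rolling checksum is designed precisely so 
that block matches survive data being displaced within the file. A toy check 
of the displacement point (simplified fixed-block matching, not real 
librsync, so take it as a sketch rather than proof):

```python
# Toy check: old data shifted 8 KiB deeper into the new file is still
# matched block-by-block; only the new prefix would travel as literals.
import hashlib

BLOCK = 1024
old = bytes(range(256)) * 64   # 16 KiB of old data
new = b"\xff" * 8192 + old     # same data, displaced 8 KiB forward

sig = {hashlib.md5(old[i:i + BLOCK]).digest(): i
       for i in range(0, len(old), BLOCK)}

i, literal = 0, 0
while i < len(new):            # a single pass over the new data
    if hashlib.md5(new[i:i + BLOCK]).digest() in sig:
        i += BLOCK             # matched a block the receiver already has
    else:
        literal += 1           # would be sent as a literal byte
        i += 1
assert literal == 8192         # only the new 8 KiB prefix is unmatched
```

Whether this remains efficient at the scale of one huge concatenated volume 
is a separate question, of course.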

Anyone have better ideas?

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <address@hidden>'
Key retrieval: Send an E-Mail to address@hidden
E-Mail: address@hidden Web: http://www.scode.org


