rdiff-backup-users

Re: [rdiff-backup-users] Re: Speed issue in first run


From: Maarten Bezemer
Subject: Re: [rdiff-backup-users] Re: Speed issue in first run
Date: Mon, 21 Jul 2008 21:29:19 +0200 (CEST)

Hi,

On Mon, 21 Jul 2008, Jussi Hirvi wrote:

> Thanks for the tip. I believe that's true, but still I have the impression
> that an empty rdiff-backup-data in target dir makes a difference - if it
> doesn't exist, the first run seems to take ve-e-ery long (though source and
> target should be identical). However, there are unknown factors, so I cannot
> be absolutely sure. And testing (in normal operation) is slow, as my backup
> batches are large. 

I didn't notice any difference between those two versions of an initial
backup run. Both are very slow ;-)

> As a general remark, rdiff-backup seems to be much slower than rsync. I'd
> like to know the reason for that.

By default, rsync skips files whose size and modification time match the
source tree (skips as in not processing them at all, nothing to do with
/bin/touch). Rdiff-backup rebuilds every file using librsync, which
consumes a lot of time and bandwidth. Comparing an md5sum/sha1sum or
similar on both sides might save the bandwidth of syncing a file that did
not change (much like rsync's --checksum option).
Note that the rebuilding only happens for the initial run. Normal
incremental runs are much faster, but not as fast as rsync. This is partly
because it's a Python program, partly because it uses an external
librsync library, probably with inferior pipelining techniques, and partly
because metadata and the change history of all files need to be
recorded. Maybe there are other factors slowing things down, but I think
these are the most important ones.
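
To make the difference between the two skip strategies concrete, here is
a minimal Python sketch. It is not code from rsync or rdiff-backup; the
function names (quick_check_same, file_digest, needs_transfer) are made up
for illustration, and it assumes both trees are reachable as local paths.
A real tool would compute the digest on each side and send only the
digest over the wire.

```python
import hashlib
import os


def quick_check_same(src, dst):
    """Size + mtime comparison, in the spirit of rsync's default quick check.

    Hypothetical helper: modification times are compared at whole-second
    resolution, which is an assumption of this sketch.
    """
    try:
        a, b = os.stat(src), os.stat(dst)
    except FileNotFoundError:
        return False
    return a.st_size == b.st_size and int(a.st_mtime) == int(b.st_mtime)


def file_digest(path, algorithm="sha1", chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def needs_transfer(src, dst, use_checksum=False):
    """Decide whether src would have to be synced to dst.

    use_checksum=False: cheap metadata check only (no file contents read).
    use_checksum=True:  read both files fully and compare digests.
    """
    if not os.path.exists(dst):
        return True
    if use_checksum:
        return file_digest(src) != file_digest(dst)
    return not quick_check_same(src, dst)
```

The metadata check costs one stat() per file, while the checksum variant
reads every byte on both sides but transfers nothing for unchanged files.
That trades disk I/O and CPU for bandwidth, which is the saving suggested
above.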

HTH,
 Maarten




