Re: [rdiff-backup-users] About backups and increments

From:	Maarten Bezemer
Subject:	Re: [rdiff-backup-users] About backups and increments
Date:	Mon, 22 Aug 2011 21:26:10 +0200 (CEST)


On Mon, 22 Aug 2011, Robert Nichols wrote:

About space requirements: I assume the space required for the backup is:

- the space of the source files themselves
- the space of all the increments
- extra space required to compute the increment?
  * Is this space stored on the source or destination drive?
  * This should be the size of the file currently computed + its
    increments, right? So should I assume that to back up the *second*
    increment of some space X (where X can possibly be just one huge
    file) I need at least X * 2 space for the backup, just for temporary
    files?
  * This brings me back to my first question: what happens when the
    destination is full?
I'm not aware of any extra space needed for computing the increment, but the
increment itself, of course, does need to be stored on the destination drive.
If the destination drive runs out of space, the rdiff-backup session will
fail.

If it is detected that a file has changed (based on file attributes), a
new file in the destination directory is created using a "temp name", and
it is synced to its new contents, using the old version to speed up the
rsync process. After that, an increment is created, and only then will the
old version be removed.

This process is followed sequentially for all files, so the total space
needed would be the space for the increments that are created during this
session, plus the size of the largest file in the repository.

Of course, you usually don't know in advance how large the increments will
be...
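
As a rough illustration of the estimate above (increments created in this
session, plus the largest file in the repository), here is a small Python
sketch. The function name and the example sizes are made up for
illustration; it only does the arithmetic and does not inspect a real
repository.

```python
def session_space_needed(increment_sizes, largest_file_size):
    """Rough estimate of extra destination space for one session.

    increment_sizes   -- sizes in bytes of the reverse diffs created this
                         session (unknown in advance; guessed here)
    largest_file_size -- size in bytes of the largest file in the repository,
                         since the old and new copy briefly coexist
    """
    return sum(increment_sizes) + largest_file_size


MiB = 1024 ** 2
GiB = 1024 ** 3

# Hypothetical session: three changed files, largest file is 4 GiB.
needed = session_space_needed([10 * MiB, 2 * MiB, 50 * MiB], 4 * GiB)
print(needed / MiB)  # -> 4158.0
```

Any growth of the mirror itself (new or enlarged files) comes on top of
this figure.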


I don't really understand what you mean by 'the second increment'.

Worst case would be that you'd need the current size of the source, plus
the total size of your last backup including all increments (if everything
in the tree is replaced by something else), plus a small metadata
overhead. If you repeat for a second increment and again all data has been
replaced by other data, you would again need the current source size plus
the total size of the backup tree.

If, however, the data you back up changes only slightly or is mostly
'append-only' data like log files, each time the space used by increments
would be quite limited.


It all depends on your data set...

About backup speed. rdiff-backup doesn't seem to support both
backing up *and* pruning the increments at the same time (yes, I've
read the man page). Though this sounds like a very sensible thing to
do: knowing that you will prune several old increments, you can avoid
calculating the reverse diffs. Has this been considered?
There's not much point in combining those two, totally independent actions.
Computing the reverse diffs for session N vs. session N-1 is totally
independent of the existence (or lack thereof) of earlier sessions in the
archive.


Adding to that:

One will always have to calculate a reverse diff to go from the newly
synced (N) version to the previous (N-1) version. If someone wants to
avoid calculating reverse diffs for a file, that is the same as having no
history at all. Better use rsync then, instead of rdiff-backup...

If you don't calculate a reverse-diff for a file, you won't be able to
regress a backup run that failed half-way through... leaving you with a
useless backup.


But!

Maybe I now know what I didn't understand in your line of questioning.
With rdiff-backup, increments are for individual files, and only when
these individual files have been changed. So, there are no reverse diffs
if a file has not been changed. For a data set of 1000 files with only 10
files changing since the previous run, the increments dir would only
contain 10 reverse diff files for this run.

Likewise, if a file hasn't been changed for 3 months and it is changed
today, but I only want to keep 1 month of history, I can NOT simply ditch
the 3-months old version. Maybe it wasn't changed for all these months,
but it is still yesterday's version and has to be kept in history for the
coming month minus 1 day...
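
To make the "3-months old version" point concrete, here is a small Python
sketch of the reasoning. It is a simplified model, not rdiff-backup's own
logic: it assumes each version of a file is valid from the day it changed
until the day of the next change, and asks which versions are needed to
restore any day inside the retention window. The function name and dates
are hypothetical.

```python
from datetime import date, timedelta


def versions_still_needed(change_dates, today, keep_days):
    """Return the change dates whose file versions must be kept.

    A version is needed if it was still the file's content at some point
    on or after the cutoff (today - keep_days), no matter how long ago it
    was written.
    """
    cutoff = today - timedelta(days=keep_days)
    changes = sorted(change_dates)
    needed = []
    for i, start in enumerate(changes):
        is_current = i == len(changes) - 1
        # The version ends when the next change happens; keep it if that
        # end falls inside the retention window.
        if is_current or changes[i + 1] > cutoff:
            needed.append(start)
    return needed


today = date(2011, 8, 22)
# File changed in January, again in May, and again today.
changes = [date(2011, 1, 10), date(2011, 5, 22), today]
kept = versions_still_needed(changes, today, keep_days=30)
print(kept)
```

With one month of history, the May version is kept even though it is three
months old, because it was the file's content until today. Only the
January version falls outside the window.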

--keep-increments N (where N is the number of most recent increments to
keep, regardless of time).

[snip]

Let's say I want always to keep at all times at least 2 increments (or
2 months, if that matters), I have no way to do that directly (I could
list the increments and calculate the time myself, but that's ugly).

So.. let's assume you make weekly backups. (Hoping it will be more often,
but just as an example.)

You want to keep history of 2 months. That's about 8 or 9 weeks.

But sometimes you make an extra backup halfway through a week, and
sometimes you go on a vacation and don't run any backup.

So, in these cases, you might want to keep history for 2 months, but also
at least 5 increments, even if that means it will be more than 2 months?

Would it really be useful to.. eh.. keep increments from 4 months ago if
you forgot to run backups for the last 2 months? This sounds just like
"oh, I didn't make backups over the last two months, but I do happen to
have some historic versions from 3 months ago containing your PhD thesis
you've been working on... for the last 3 months....."

Let's just say that I don't think having such an option would be a really
nice thing to have ;-)

And creating a small script would indeed be far easier ;-)
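
A minimal sketch of what such a script could compute, in Python. It
assumes you have already collected the increment times yourself (for
example by parsing the increment listing); the function name, parameters
and dates are hypothetical. It returns a cutoff such that everything
strictly older may be removed while keeping both the requested history
and a minimum number of increments.

```python
from datetime import datetime, timedelta


def removal_cutoff(increment_times, now, keep_days, min_increments):
    """Return the time before which increments may be removed.

    Keeps at least `keep_days` of history AND at least `min_increments`
    of the most recent increments, whichever reaches further back.
    """
    age_cutoff = now - timedelta(days=keep_days)
    newest_first = sorted(increment_times, reverse=True)
    if not newest_first or min_increments <= 0:
        return age_cutoff
    # With fewer increments than requested, the oldest one is the floor.
    floor_index = min(min_increments, len(newest_first)) - 1
    count_cutoff = newest_first[floor_index]
    return min(age_cutoff, count_cutoff)


now = datetime(2011, 8, 22)
# Weekly backups in April, then nothing for almost four months.
times = [datetime(2011, 4, d) for d in (1, 8, 15, 22, 29)]
cutoff = removal_cutoff(times, now, keep_days=60, min_increments=3)
removable = [t for t in times if t < cutoff]
print(cutoff, removable)
```

In the example the two-month rule alone would drop every increment, but
the minimum of three pushes the cutoff back to 15 April, so only the two
oldest increments are candidates for removal.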

Side note: I never automate the removal of old increments. Always do that
by hand, first without --force to check the increment dates it announces
will be removed, then with --force if it looks OK. The only thing
that's automated wrt increment removal is a cron job reminding me of the
task. I could even modify it to remind me daily if increment removal is
due and wasn't done yet, but for now, I keep these reminders in my inbox
until the removal is done.
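
For reference, the two-step check described above could look like the
following Python sketch. It only builds and prints the command lines and
does not run them. The repository path and the "2M" (two months) time
spec are example values, not taken from anyone's actual setup.

```python
import shlex


def removal_command(repo, older_than, force=False):
    """Build an rdiff-backup command line for removing old increments.

    Without --force, rdiff-backup refuses to delete more than one
    increment and reports which ones would go; that is the "check" step.
    """
    cmd = ["rdiff-backup"]
    if force:
        cmd.append("--force")
    cmd += ["--remove-older-than", older_than, repo]
    return cmd


repo = "/backups/home"  # hypothetical repository path

# Step 1: look at the dates it announces.
print(shlex.join(removal_command(repo, "2M")))
# Step 2: only if that list looks OK.
print(shlex.join(removal_command(repo, "2M", force=True)))
```

Keeping step 2 a deliberate manual action is the whole point, so the
sketch stops short of executing anything.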



--
Maarten
