
Re: [rdiff-backup-users] Converting from rsync and many other thoughts


From: Steven Willoughby
Subject: Re: [rdiff-backup-users] Converting from rsync and many other thoughts
Date: Thu, 10 Jul 2008 16:33:58 -0600
User-agent: Thunderbird 2.0.0.14 (X11/20080505)

Brad Templeton wrote:
> I noticed several postings on this list about switching to rdiff-backup
> from a plain copy (made with rsync) using --force.
>
> However, my tests show that when you do this, you get an rdiff-backup
> repository that is a copy of the current state of the directory, but
> the old directory is effectively erased.
>
> I am hoping for a way to convert an existing rsync backup (I have many
> hundreds of gigabytes of backups in this form) to an rdiff-backup
> repository, preserving the differences between the old backup and the
> current state, the way rdiff-backup does when operating normally.
>
> To do this, all one should need is a tool to create the rdiff-backup-data
> directory inside the rsync copy.  Is there a good way to do this?
>
> I tried one way.  I created a snapshot of the rsync backup with cp -al,
> which does it with hard links.  Then I ran rdiff-backup from one to the
> other.  That _seemed_ to work, creating a new copy with a -data directory.
>
> But then when I ran rdiff-backup from the live directory onto this new
> backup, it treated every file as different, leaving behind a .diff.gz
> file for each that is small and looks random to me.  As far as I know
> there should be no differences.

The reason rdiff-backup thinks every file changed is that the metadata has changed: there is a property in the rdiff-backup-data/mirror_metadata.* file called "NumHardLinks", which will be set to 2 because of the cp -al snapshot, even though the first run of rdiff-backup un-hard-linked the files.

Try doing it again, this time using --no-hard-links on your _first_ run of rdiff-backup. That option doesn't create the "NumHardLinks" property for me.
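Putting the whole recipe together, it might look something like this (all paths here are examples):

    # /backups/rsync is the existing rsync mirror; /backups/rdiff will
    # become the rdiff-backup repository
    cp -al /backups/rsync /backups/rdiff

    # seed the repository; --no-hard-links keeps the NumHardLinks
    # property out of the metadata
    rdiff-backup --no-hard-links /backups/rsync /backups/rdiff

    # subsequent runs go from the live directory as usual
    rdiff-backup /home/me /backups/rdiff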


> Anyway, on to other suggestions, since I like the general flavour of rdb
> as a backup tool:
>
> a) I like to do offsites.  Problem is, if I do a big change, like loading
> in 10 GB of new photos after a trip, it's going to take a long time to
> sync to the offsite at 700 kbps upstream.  As such, two features might
> be useful:
>
>     1) Tell it to terminate, gracefully, after so many hours.  I want to
>        run these at night, and have them end when I want the bandwidth
>        back.  Then resume where it left off.

This currently isn't possible, AFAIK. You might be able to pause the rdiff-backup process during the day with kill -STOP and resume it the next night with kill -CONT, if you turn on SSH's KeepAlive option so the connection survives the pause.
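As an untested sketch, from cron that could look like this (assuming a single rdiff-backup process on the machine and a pkill that supports -f):

    # crontab: suspend the transfer at 7am, resume at 11pm
    0 7  * * * pkill -STOP -f rdiff-backup
    0 23 * * * pkill -CONT -f rdiff-backup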



>     2) Tell it to back up certain files first.  This could be both files
>        I list for it and, in general, files below a certain size.  Then
>        do the other (larger) files.  Ideally it would even sort files by
>        size and back up in that order.  Presumably I could do this with a
>        complex set of "find" scripts to build the file lists, but the
>        program could do it better.

The way I do this is with multiple backups: one for photos, another for documents, etc.
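For example (the destinations here are made up), running the small, important trees first and the big media tree last:

    rdiff-backup /home/me/mail      offsite::/backups/mail
    rdiff-backup /home/me/documents offsite::/backups/documents
    rdiff-backup /home/me/photos    offsite::/backups/photos

Depending on your version, the --max-file-size selection option may also get you part of the way to "small files first".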



> Why?  The goal is to get a full backup, but there is not enough time in
> the night to back up many gigabytes.  So start by getting the important
> smaller files, and then get the larger files as time is available.


> b) Like rsync, be able to write the update stream to a file, which then
> goes onto a physical disk that is taken to the offsite.  This is another
> way of handling the case where the difference is very big, too large to
> send over the internet.  So again, you want the smaller, more important
> files to go over the internet, but when you happen to be ready for a
> physical trip, you write the difference to a removable drive, take it to
> the backup server, and apply it.  Now all the big files are updated, and
> you are fully up to date.

You can do this using the cp -al trick you discovered earlier. Write the big files to the disk and take them offsite. Then create a copy of the repository with cp -al, remove the rdiff-backup-data directory from the copy, move your big files into the proper place in the copy, and then run rdiff-backup --no-hard-links $copy $dest.
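In commands, roughly (the paths are placeholders, with $dest being the offsite repository):

    cp -al $dest $copy                      # hard-link copy of the repository
    rm -rf $copy/rdiff-backup-data          # the copy must look like a plain tree
    mv /mnt/usb/photos/* $copy/photos/      # drop the transported files in place
    rdiff-backup --no-hard-links $copy $dest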



> c) Encrypted remote store.  For those who want to do an offsite to a
> friend's house, it would be cool to have the remote store be encrypted.
> This does mean that any "diff" is going to be binary.

Duplicity seems to be the better tool to accomplish this.
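A minimal invocation, assuming GnuPG is set up and using duplicity's scp backend (the key id and paths are examples):

    duplicity --encrypt-key ABCD1234 /home/me scp://friend@offsite//backups/me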

Steven



