[rdiff-backup-users] Converting from rsync and many other thoughts


From: Brad Templeton
Subject: [rdiff-backup-users] Converting from rsync and many other thoughts
Date: Wed, 9 Jul 2008 20:58:30 -0700
User-agent: Mutt/1.5.9i

I noticed several postings on this list about switching to rdiff-backup
from a plain copy (made by rsync) using --force.

However, my tests show that when you do this, you get an rdiff-backup
repository that is a copy of the current state of the directory, but
the old directory is effectively erased.

I am hoping for a way to convert an existing rsync mirror (I have many
hundreds of gigabytes of backups of this form) to an rdiff-backup
repository, preserving the differences between the old backup and the
current state, the way rdiff-backup does when operating normally.

To do this, all one should need is a tool to create the rdiff-backup-data
directory inside the rsync mirror.  Is there a good way to do this?

I tried one way.  I created a snapshot of the rsync backup with cp -al,
which copies it using hard links.  Then I did an rdiff-backup from one to
the other.  That _seemed_ to work, creating a new copy with a -data
directory.
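
Roughly, the steps were along these lines (the paths here are invented
just for illustration, not my real layout):

    # make a cheap hard-link copy of the existing rsync mirror
    cp -al /backup/mirror/system /backup/mirror/system-rdb

    # back up the original into the hard-link copy, which adds the
    # rdiff-backup-data directory to it (I believe this needs --force,
    # since the destination already exists and isn't a repository yet)
    rdiff-backup --force /backup/mirror/system /backup/mirror/system-rdb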

But then when I did rdiff-backup from the live directory onto this new
backup, it treated every file as different, leaving behind a .diff.gz
file which is small and looks random to me.  As far as I know there
should be no differences.

(The syntax of rdiff-backup was a bit confusing.  I used to run
    rsync /etc /home /backup/mirror/system
which I presume is done with
    rdiff-backup --include /etc --include /home --exclude / / /backup/mirror/system
)

Anyway, on to other suggestions, since I like the general flavour of rdb as
a backup tool:


a) I like to do offsites.  Problem is, if I do a big change, like loading
in 10 GB of new photos after a trip, it's going to take a long time to
sync to the offsite at 700 kbps upstream.  As such, two features might be
useful:

    1) Tell it to terminate gracefully after so many hours.  I want to
        run these at night, and have them end when I want the bandwidth
        back, then resume where they left off.

    2) Tell it to back up certain files first.  This could be files I
        list explicitly, but in general it would be files below a certain
        size; then do the other (larger) files.  Ideally it would even
        sort files by size and back up in that order.  Presumably I could
        do this now with a set of scripts around "find" commands to build
        the file lists, but the program could do it better.

Why?  The goal is to get a full backup, but there is not enough time in
the night to back up many gigabytes.  So start by getting the important
smaller files, and then get the larger files as time is available.  A
crude sketch of the kind of wrapper I mean follows.
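
Something along these lines is what I'm imagining; the paths, size
cutoff, and hours are invented, killing rdiff-backup with timeout is not
graceful (which is exactly why a real option would be nicer), and an
interrupted session has to be regressed with --check-destination-dir
before the next run:

    #!/bin/sh
    DEST=friendbox::/backup/mirror/system

    # if last night's run was cut short, regress the repository first
    rdiff-backup --check-destination-dir "$DEST"

    # pass 1: files under 1 MiB (size in bytes to avoid find's rounding)
    find /etc /home -type f -size -1048576c > /tmp/smallfiles.txt
    rdiff-backup --include-filelist /tmp/smallfiles.txt \
        --exclude '**' / "$DEST"

    # pass 2: everything else in /etc and /home, killed after 6 hours
    # so I get the bandwidth back in the morning
    timeout 6h rdiff-backup --include /etc --include /home \
        --exclude '**' / "$DEST"

This also makes two separate increments per night, which a built-in
option could avoid.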


b) Like rsync, be able to write the update stream to a file, which then
goes onto a physical disk that is taken to the offsite.  This is another
way of handling the case where the difference is very big, too large to
send over the internet.  So again, you want the smaller, more important
files to go over the internet, but when you happen to be ready for a
physical trip, you write the difference to a removable drive, take it to
the backup server, and apply it.  Now all the big files are updated, and
you are fully up to date.
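
Until something like that exists, the closest workaround I can see is to
build or refresh the repository on a removable drive and copy the whole
thing across; a sketch, with invented hostnames and paths:

    # back up onto a repository that lives on the removable drive
    rdiff-backup --include /etc --include /home --exclude '**' \
        / /media/usbdrive/system

    # carry the drive to the offsite box, then copy the repository
    # (the mirror plus its rdiff-backup-data directory) into place
    rsync -aH --delete /media/usbdrive/system/ /backup/mirror/system/

    # later runs over the network then only have to send what changed
    # since the drive was written
    rdiff-backup --include /etc --include /home --exclude '**' \
        / friendbox::/backup/mirror/system

That only covers seeding, though; what I describe above would let you
ship an incremental batch of changes on disk at any time.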


c) Encrypted remote store.  For those who want to do an offsite to a
friend's house, it would be cool to have the remote store be encrypted.
This does mean that any "diff" is going to be binary.



