Re: [rdiff-backup-users] Vailidation of RDiff archive...


From: Marcel (Felix) Giannelia
Subject: Re: [rdiff-backup-users] Vailidation of RDiff archive...
Date: Fri, 20 Mar 2009 19:04:16 -0700
User-agent: Thunderbird 2.0.0.16 (X11/20080726)


> So, if I were to want to verify that the repository and all reverse
> slices *could* be applied without error, is there such a mechanism
> available?
>
> Note, I'm not asking that it actually apply those slices and check
> the resulting files with anything - but is there a way to be sure
> that the archives, transaction and rdiff slices all LOOK ok?

rdiff-backup does keep SHA-1 sums of all the files it backs up; the sums are stored in rdiff-backup-data/mirror_metadata.[the most recent timestamp].snapshot.gz. There's also rdiff-backup --check-destination-dir, which checks that the last backup finished successfully and, if necessary, fixes the destination dir by rolling back the incomplete backup attempt. That only looks at the most recent backup or two, though; as far as I know it does not look any deeper into past backups.

It's fairly easy to see whether those "look" OK, though. Here's how it's supposed to look:
http://rdiff-backup.nongnu.org/format.html
(Short summary: if you see a whole bunch of files with timestamps like 2009-03-13T22:08:22-07:00 in their names in rdiff-backup-data/, it's probably OK.)
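
As a rough sketch of that cursory check, the following counts the files under rdiff-backup-data whose names contain a timestamp of that form. The function name count_timestamped is made up for illustration, and a non-zero count only says the directory looks plausible, not that the data in it is intact.

```shell
# Hypothetical helper: count files whose names carry an rdiff-backup-style
# timestamp such as 2009-03-13T22:08:22-07:00.
count_timestamped() {
    find "$1" -type f \
        -name '*.[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T[0-9][0-9]:[0-9][0-9]:[0-9][0-9]*' |
        wc -l | tr -d ' '
}

# Usage (the path is a placeholder):
#   count_timestamped /path/to/backup/rdiff-backup-data
```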

For more than a cursory look, read on...
---
> Where most of the thrust of these questions is going, is I'd like to
> know the available ways to verify an archive is still sound, and
> sane.

The best way to do this is to try restoring all or part of the archive to a date before the very first increment present. That should force rdiff-backup to use all of the increments it has, and if any of them are corrupted/unusable/missing, it should complain. You can restore to a temp directory, and it will probably go faster if you restore to a separate physical drive.
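
A minimal sketch of that restore test, assuming the classic rdiff-backup command line (--restore-as-of and --list-increments). The restore_test function and the RDIFF_BACKUP variable are names invented for this example. Watch the output as well as the exit status, since I'm not certain that every kind of damage is reflected in the exit status.

```shell
# Command to run; the classic (1.x-era) command-line syntax is assumed.
RDIFF_BACKUP=${RDIFF_BACKUP:-rdiff-backup}

# Hypothetical helper: restore REPO as of WHEN into a scratch directory,
# return rdiff-backup's exit status, then delete the scratch copy.
restore_test() {
    repo=$1
    when=$2
    scratch=$(mktemp -d) || return 2
    # The restore target must not exist yet, so use a fresh subdirectory.
    "$RDIFF_BACKUP" --restore-as-of "$when" "$repo" "$scratch/out"
    status=$?
    rm -rf "$scratch"
    return "$status"
}

# Usage (placeholders): pick a time earlier than the oldest entry shown by
#   rdiff-backup --list-increments /path/to/backup
# and then run
#   restore_test /path/to/backup 2009-01-01 && echo "restore OK"
```

mktemp -d puts the scratch copy in the default temp location; point TMPDIR at a different drive if you want the restore to land on separate hardware.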

If you want something that runs faster than that or uses less space, you can try gzip --test on all of the .gz files within rdiff-backup-data (since they're all gzipped, gzip should catch most corruptions that way without needing to unpack them to disk). To check the whole rdiff-backup-data dir, try:

find [path to rdiff-backup-data directory] -name "*.gz" -print0 | xargs -0 gzip -t

(Note that the quotes around "*.gz" are important; without them the shell may expand the pattern against the files in the current directory before find ever sees it.)
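
If you'd rather get just a list of the files that fail, something along these lines might do. It is only a sketch of the same gzip -t idea; list_bad_gz is a made-up name, and it prints nothing when every .gz file passes.

```shell
# Hypothetical helper: print the path of every .gz file under DIR that
# fails "gzip -t"; prints nothing if all of them pass.
list_bad_gz() {
    find "$1" -type f -name '*.gz' -exec sh -c '
        for f do
            gzip -t "$f" 2>/dev/null || printf "%s\n" "$f"
        done' sh {} +
}

# Usage (the path is a placeholder):
#   list_bad_gz /path/to/backup/rdiff-backup-data
```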

If you're looking for a way to test the .diff files that's faster than actually applying them, you'd probably have to dig into the specs for rdiff itself (not rdiff-backup) -- all I know about that is that the first four bytes of a valid rdiff delta file are supposed to be 0x72730236 (the ASCII characters "rs", then the byte 0x02, then "6").
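
A small sketch of checking just those magic bytes, assuming that increment deltas ending in .gz have to be decompressed first. delta_magic and is_rdiff_delta are names invented for this example, and a matching header says nothing about the rest of the file.

```shell
# Hypothetical helper: print the first four bytes of FILE as hex,
# decompressing on the fly if the name ends in .gz.
delta_magic() {
    case $1 in
        *.gz) gzip -dc -- "$1" 2>/dev/null ;;
        *)    cat -- "$1" ;;
    esac | head -c 4 | od -An -tx1 | tr -d ' \n'
}

# Succeeds only if FILE starts with the rdiff delta magic number.
is_rdiff_delta() {
    [ "$(delta_magic "$1")" = "72730236" ]
}

# Usage (the file name is a placeholder):
#   is_rdiff_delta some_file.diff.gz || echo "header does not look right"
```
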
> Also, to verify that a current backup matches exactly that of the
> source.

Since everything but the rdiff-backup-data directory is just a mirror, you could just compare it directly. If you want a really intensive comparison, use diff -r (you might want to add --brief as well, or it'll get very verbose) -- diff does a byte-for-byte comparison, with no checksumming. You'll want to leave rdiff-backup-data out of that; with GNU diff, --exclude=rdiff-backup-data should do it.

For a faster comparison, you could do something like

find [your base directory] -type f -print0 | xargs -0 md5sum >> big_list_of_md5sums

for both directories, then sort the two big lists of md5sums and compare them. (Run find from inside each directory with . as the base, so the paths in the two lists line up, and keep the list files outside the trees being scanned.) Some files will of course differ (the ones that changed since the backup), but most will match.
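
The checksum comparison could be sketched roughly like this. The names checksum_list and compare_trees are made up, md5sum is assumed to be available, and I've used find's -exec ... + instead of xargs so that an empty tree doesn't leave md5sum waiting on standard input.

```shell
# Hypothetical helper: print "checksum  ./relative/path" for every file
# under DIR, skipping the top-level rdiff-backup-data, sorted by path.
checksum_list() {
    ( cd "$1" &&
      find . -path ./rdiff-backup-data -prune -o -type f -exec md5sum {} + |
      sort -k 2 )
}

# Print the differences between two trees; returns 0 if the lists match.
compare_trees() {
    src_list=$(mktemp) && dst_list=$(mktemp) || return 2
    checksum_list "$1" > "$src_list"
    checksum_list "$2" > "$dst_list"
    diff "$src_list" "$dst_list"
    status=$?
    rm -f "$src_list" "$dst_list"
    return "$status"
}

# Usage (paths are placeholders):
#   compare_trees /path/to/source /path/to/backup
```

Lines starting with < are files that are missing or different on the source side, and lines starting with > are the same for the backup side.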

Those suggestions both assume that you meant "verify with something other than rdiff-backup itself" -- because running rdiff-backup *is* basically a verification of the mirror -- anything that differs gets stored as increments and updated. Anything that didn't get updated matches exactly.

~Felix.



