[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [rdiff-backup-users] Interesting write-up of 'compare-by-hash'.
From: |
Ben Escoto |
Subject: |
Re: [rdiff-backup-users] Interesting write-up of 'compare-by-hash'. |
Date: |
Tue, 27 Jan 2004 18:33:57 -0800 |
>>>>> Greg Freemyer <address@hidden>
>>>>> wrote the following on Fri, 02 Jan 2004 14:39:18 -0500
> The writeup at:
>
> http://www.usenix.org/events/hotos03/tech/full_papers/henson/henson_html/hash.html
>
> in effect says that 'compare-by-hash' is not the perfect choice for
> a rdiff-backup type of program. Instead, it says some kind of state
> information should be added to the algorithym to ensure accuracy.
>
> I don't know that I am concerned, but I am curious if rdiff-backup
> takes any precautions against hash-collisions.
No, rdiff-backup takes no special precautions. As far as I can tell,
the paper makes the point that even cryptographically strong hashes
may be breakable in the future. This is a pretty good objection,
especially since librsync uses md5 instead of the stronger SHA1.
But in rdiff-backup the most an attacker could do would be to produce
a file that had the same signature, causing rdiff-backup to skip over
parts of the file. But an attacker can already do much better than
that: just use utime(), chmod, and such to make it look like your file
hasn't changed. Plus this method doesn't take years of research.
So anyway, rdiff-backup isn't intended to defeat an attacker trying to
make it look like his files haven't changed.
I did find this quote from the article interesting:
> BitKeeper keeps state about each file under source control and knows
> what changes have been made since the last time each tree was
> synchronized. When synchronizing, it sends only the differences
> since the last synchronization occurred, in compressed form. In
> comparison to rsync, BitKeeper provides similar and sometimes better
> bandwidth usage when simply synchronizing two trees without
> resorting to compare-by-hash.
Does anyone know how BitKeeper can send only the differences without
using a compare-by-hash algorithm? I don't see how this is possible
unless BitKeeper has abandoned the usual edit w/arbitrary editor and
then checkin scheme.
--
Ben Escoto