Re: [rdiff-backup-users] Interesting write-up of 'compare-by-hash'.
From: Ben Escoto
Subject: Re: [rdiff-backup-users] Interesting write-up of 'compare-by-hash'.
Date: Fri, 30 Jan 2004 14:07:45 -0800
>>>>> Greg Freemyer <address@hidden>
>>>>> wrote the following on Wed, 28 Jan 2004 14:08:05 -0500
> Since the hashing process is lossy (i.e. non-reversible), it is
> possible that two totally different data sets could generate the same
> hash, and in turn invalidate the backup checksum check.
>
> I don't know what the odds are of this happening with rdiff-backup.
>
> I assume that they are exceedingly small, but not zero.
For rdiff (and thus rdiff-backup) the odds actually depend on the number
of blocks in the file, because matching is done with per-block checksums
rather than a single global SHA-1 or MD5 hash. So a 2 GB file that has
all 2 GB changed is more likely to cause a hash collision than a changed
1 KB file.
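To make that scaling concrete, here is a rough back-of-the-envelope
sketch in Python (my own simplified model, not librsync's exact math);
the block size and strong-hash width below are illustrative assumptions,
not necessarily librsync's defaults:

    import math

    def collision_bound(file_size_bytes, block_size=2048, strong_hash_bits=64):
        """Crude upper bound on a block-checksum collision for one file.

        Model: treat every per-block strong checksum as a uniform random
        strong_hash_bits-bit value, and assume roughly n_blocks positions
        in the new file are each compared against n_blocks signatures of
        the old file.
        """
        n_blocks = math.ceil(file_size_bytes / block_size)
        return min(1.0, n_blocks ** 2 * 2.0 ** -strong_hash_bits)

    # A fully changed 2 GB file has far more chances to collide than a 1 KB one.
    print(collision_bound(2 * 1024 ** 3))  # ~6e-8 under these assumptions
    print(collision_bound(1024))           # ~5e-20

The only point of the sketch is that, in this simplified model, the
exposure grows roughly with the square of the block count, so a large,
heavily changed file is the worst case.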
I remember asking Donovan Baarda about this on the librsync list a while
ago, so anyone curious about the details can look that thread up. The
upshot, IIRC, is that (for "random data") the odds of a collision are
around 2^-50 even for fairly large files. This isn't as good as a 128-bit
global hash, but it is quite reasonable for practical use.
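For a sense of scale (my arithmetic, not a figure from the thread beyond
the 2^-50 itself), that is a bit under one chance in a quadrillion:

    >>> 2 ** -50
    8.881784197001252e-16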
--
Ben Escoto