Re: [rdiff-backup-users] Proposal to fix long filenames
From: Chris Wilson
Subject: Re: [rdiff-backup-users] Proposal to fix long filenames
Date: Sat, 12 Nov 2005 12:25:09 +0000 (GMT)
Hi Ben,
Time to fix rdiff-backup's oldest bug? In this message I'll describe
the problem and one way to fix it.
Great idea! Just a few small points:
There are three different ways I can see this happening:
[...]
3) The source filesystem supports longer filenames than the
destination. In this case the mirror file may be too long to
write even without any quoting. I've never heard of this actually
happening.
There is a fourth case: where the destination path is deeper into the
destination filesystem than the source path is. For example, I back up many
machines' root directories (/) into /mnt/backup/<machine-name>/rdiff on my
backup servers. In this case, both the original filename and the
increments may be too long to back up.
The increments in the rdiff-backup-data directory also have
"rdiff-backup-data/" prepended to the name. In this case, the increment
names may be too long.
The mirror_metadata file could have two additional optional fields,
called "MirrorFilename" and "IncrementFilename". If MirrorFilename is
set, rdiff-backup reads the mirror file from the
rdiff-backup-data/long_filename_data/<mirror filename> file, instead
of from the normal location in the mirror directory.
The MirrorFilename seems like a good idea in principle, but it means that
the mirror files are not located in their usual place in the mirror
filesystem. I don't think that's a good thing, as it makes it
significantly harder to examine or restore the latest version "by hand",
and compute the disk space used by it.
Similarly, if IncrementFilename is set, increment data will not be
read from rdiff-backup-data/increments/<whatever>.<suffix> but from
rdiff-backup-data/long_filename_data/<increment filename>.<suffix>
Similarly, it means that some increments are not where we expect them to
be.
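To make the lookup rule concrete, here is a rough sketch of how the two
optional fields could redirect reads. Everything in it is an assumption for
illustration: the metadata record is modelled as a plain dict, the function
names are hypothetical, and the suffix is whatever rdiff-backup would append
to an increment. It is not rdiff-backup's actual code.

```python
import posixpath

# Directory names taken from the proposal quoted above.
DATA_DIR = "rdiff-backup-data"
LONG_DIR = posixpath.join(DATA_DIR, "long_filename_data")
INCREMENT_DIR = posixpath.join(DATA_DIR, "increments")


def mirror_location(rel_path, record):
    """Return where the mirror copy lives, relative to the repository root.

    `record` is a hypothetical dict holding one file's metadata fields.
    """
    alternate = record.get("MirrorFilename")
    if alternate is not None:
        # Field is set: read from the long_filename_data directory instead.
        return posixpath.join(LONG_DIR, alternate)
    return rel_path


def increment_location(rel_path, record, suffix):
    """Return where an increment lives, relative to the repository root."""
    alternate = record.get("IncrementFilename")
    if alternate is not None:
        base = posixpath.join(LONG_DIR, alternate)
    else:
        base = posixpath.join(INCREMENT_DIR, rel_path)
    return base + "." + suffix
```

With neither field set, both functions return the usual locations, so
existing repositories would be unaffected.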
The alternate filenames would have boring but plentiful names like
1, 2, etc.
I'd like to propose a compromise:
rdiff-backup figures out the longest possible filename and deepest
possible path for itself when examining the filesystem capabilities.
If, during backup, any path or file to be written to the destination
exceeds those lengths, it is truncated near the maximum length and a
number is appended. The relevant IncrementFilename or MirrorFilename
directive is written to the metadata at the same time. So for example:
/a/really/long/path/on/a/short/path/file/system
might become
/a/really/long/path/on/a/short/path/file/s~1
and if the filesystem's limits are so short that directories must be
renamed as well, then keep at least the first character of each one:
/a/really/long/p~1/on/a/s~1/p~1/f~1/s~1
and if that's not enough, then just replace the directory names with
numbers:
/1/1/1/1/1/1/1/1/1/1
and if that's not enough, I don't know what else you can do! :-) Shoot the
admin, perhaps.
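The three fallback stages above could be sketched roughly as follows. This is
only an illustration under stated assumptions: the function name shorten_path
is made up, only the total path length is checked (not the per-component
limit), the appended number is always 1, and nothing is written to the
metadata. Directories are shortened from right to left until the path fits,
which is how the second example above comes out.

```python
def shorten_path(path, max_path_len):
    """Shorten `path` so that it is at most `max_path_len` characters long.

    Hypothetical helper: no collision handling, the number is always 1.
    """
    if len(path) <= max_path_len:
        return path
    parts = path.split("/")  # an absolute path yields a leading "" part

    # Stage 1: truncate only the final component and append "~1".
    excess = len(path) - max_path_len
    last = parts[-1]
    keep = len(last) - excess - len("~1")
    if keep >= 1:
        return "/".join(parts[:-1] + [last[:keep] + "~1"])

    # Stage 2: working right to left, cut components down to their first
    # character plus "~1" until the whole path fits.
    short = list(parts)
    for i in range(len(short) - 1, -1, -1):
        if len(short[i]) > 3:  # shorter names would not get any shorter
            short[i] = short[i][0] + "~1"
        candidate = "/".join(short)
        if len(candidate) <= max_path_len:
            return candidate

    # Stage 3: replace every component with a number.
    numbered = "/".join("1" if part else "" for part in parts)
    if len(numbered) <= max_path_len:
        return numbered
    raise ValueError("path cannot be shortened to %d characters" % max_path_len)
```

A real implementation would also have to pick a number that does not collide
with an existing name in the same directory. On Unix, limits of this kind can
be queried with os.pathconf() using the "PC_NAME_MAX" and "PC_PATH_MAX" names.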
Originally I thought that the fix for long filenames might somehow be
integrated into a scheme to detect and compress renamed files. But
now I doubt any renaming scheme is forthcoming.
That's a pity, since I think it would now be really easy: just make a
hash table of the SHA-1 checksums in the mirror, and compare the checksum
of each newly added file to this table, to see if it's a duplicate or a
moved file. This avoids the need to transfer the file again.
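A minimal sketch of that idea, assuming for simplicity that file contents are
already available as bytes in a dict keyed by path. The function names are
hypothetical; a real implementation would read checksums from wherever the
mirror stores them rather than hashing everything in memory.

```python
import hashlib


def sha1_of(data):
    """Return the hex SHA-1 digest of a bytes object."""
    return hashlib.sha1(data).hexdigest()


def build_index(mirror):
    """Map each SHA-1 checksum to one path in the mirror.

    `mirror` is a stand-in dict of path -> file contents (bytes).
    """
    index = {}
    for path, data in mirror.items():
        # Keep the first path seen for each checksum.
        index.setdefault(sha1_of(data), path)
    return index


def find_existing_copy(index, new_data):
    """Return the mirror path with identical contents, or None."""
    return index.get(sha1_of(new_data))
```

If find_existing_copy() returns a path, the newly added file is a duplicate
or a moved copy of that mirror file, so its contents need not be sent again.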
Cheers, Chris.
--
_ ___ __ _
/ __/ / ,__(_)_ | Chris Wilson <0000 at qwirx.com> - Cambs UK |
/ (_/ ,\/ _/ /_ \ | Security/C/C++/Java/Perl/SQL/HTML Developer |
\ _/_/_/_//_/___/ | We are GNU-free your mind-and your software |