
Re: [Chicken-users] Ugarit: A backup/archival system


From: Alaric Snell-Pym
Subject: Re: [Chicken-users] Ugarit: A backup/archival system
Date: Wed, 28 Jan 2009 10:24:52 +0000


On 28 Jan 2009, at 8:21 am, Sven Hartrumpf wrote:

- Tiger hash: What happens in case of collisions (i.e. different
 data blocks having the same hash)?


Well, the theory is that that's very unlikely, on the scale of
"universe is likely to implode first", but just in case, I'm adding an
option to check for collisions every time an existing block is reused.
Such an operation would kill performance on the future S3 backend, but
wouldn't be too bad on the local-filesystem backends.
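A minimal sketch of what such a paranoid reuse check could look like
(hypothetical Python, not Ugarit's actual Chicken Scheme code; SHA-256
stands in for Tiger, and all names here are made up):

    import hashlib

    class BlockStore:
        """Toy content-addressed store keyed by block hash."""

        def __init__(self, check_collisions=False):
            self.blocks = {}          # hash -> stored block bytes
            self.check_collisions = check_collisions

        def put(self, block: bytes) -> str:
            key = hashlib.sha256(block).hexdigest()  # stand-in for Tiger
            if key in self.blocks:
                # Optional paranoia: re-read the stored block and compare
                # byte-for-byte. Cheap on a local-filesystem backend, but
                # on something like S3 it would mean downloading every
                # reused block, which is why it hurts there.
                if self.check_collisions and self.blocks[key] != block:
                    raise RuntimeError("hash collision detected: " + key)
                return key            # block already stored; reuse it
            self.blocks[key] = block
            return key

The cost asymmetry is the whole story: the check is a local read and
compare per reused block, so its price tracks the backend's read latency.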

- gdbm as a storage backend: if you want a light-weight but more
 efficient/recent variant of dbm, please consider tokyocabinet too.

Yeah, I need to look into that!
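For the shape of such a backend, here's a rough sketch (hypothetical,
using Python's standard dbm module rather than gdbm or Tokyo Cabinet
bindings) of blocks stored in a single key-value file, keyed by hash:

    import dbm
    import hashlib

    class DbmBackend:
        """Toy dbm-style block store; gdbm and Tokyo Cabinet expose
        essentially this open/fetch/store-by-key interface."""

        def __init__(self, path: str):
            self.db = dbm.open(path, "c")   # create if missing

        def put(self, block: bytes) -> bytes:
            key = hashlib.sha256(block).hexdigest().encode("ascii")
            if key not in self.db:
                self.db[key] = block
            return key

        def get(self, key: bytes) -> bytes:
            return self.db[key]

        def close(self):
            self.db.close()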

- What I always wanted to add to other backup systems:
 an option to rely on mtime of directories: if the mtime of a
 directory has not changed, skip the whole directory for testing.
 In some scenarios (and with careful users/scripts that ensure
 update of directories' mtime!), this will give you a performance
 boost.

Ah, but if you change a file down in a directory tree then the
directory mtime won't change (the mtime on my / is Jan 9th, when I
installed the VM) - but if the user is happy to touch directories all
the way up the tree from a modified file, then that'd work, yeah. The
file mtime cache will boost performance a lot for subsequent
snapshots, but it still requires a full directory scan to find changed
mtimes; being able to do directory mtime caching would rock even more.
I was also wondering about figuring out filesystem-modification
notifications on OSes that support them, so we can just sit in the
background building up a list of modified files, which would be great.
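A minimal sketch of that directory-mtime shortcut (hypothetical Python,
not Ugarit's implementation; the cache shape is invented):

    import os

    def changed_paths(root: str, mtime_cache: dict):
        """Yield files under root that may have changed since last run.

        mtime_cache maps path -> last-seen st_mtime. A directory whose
        mtime is unchanged is skipped entirely - which, as noted above,
        is only safe if modifying a file also touches every ancestor
        directory, since POSIX does not bump parent directory mtimes
        for edits to existing files deeper in the tree.
        """
        st = os.stat(root)
        if mtime_cache.get(root) == st.st_mtime:
            return                     # whole subtree assumed unchanged
        mtime_cache[root] = st.st_mtime
        for entry in os.scandir(root):
            if entry.is_dir(follow_symlinks=False):
                yield from changed_paths(entry.path, mtime_cache)
            elif entry.is_file(follow_symlinks=False):
                m = entry.stat(follow_symlinks=False).st_mtime
                if mtime_cache.get(entry.path) != m:
                    mtime_cache[entry.path] = m
                    yield entry.path

A background watcher would instead subscribe to something like Linux's
inotify or OS X's FSEvents and accumulate the same list incrementally,
with no scan at snapshot time.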

- I will test ugarit with 500 GB and 20 million files ...
 or is this too early?

Please do :-)

Avoid lzma for tests, I think - it's very CPU-intensive to compress! I
think deflate or even no compression is more the trick for local
filesystem stuff. LZMA will come into its own for remote backups, as
the reduced upload time will justify the CPU use.
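To see the tradeoff concretely, a rough benchmark sketch (hypothetical,
standard-library Python, not part of Ugarit; actual numbers depend
entirely on your data and hardware):

    import lzma
    import time
    import zlib

    # Any reasonably compressible sample will do; repeated text is a
    # toy stand-in for real backup data.
    data = b"the quick brown fox jumps over the lazy dog\n" * 100_000

    for name, compress in [("deflate", zlib.compress),
                           ("lzma", lzma.compress)]:
        start = time.perf_counter()
        out = compress(data)
        elapsed = time.perf_counter() - start
        print(f"{name}: {len(out)} bytes in {elapsed:.3f}s")

LZMA typically produces a noticeably smaller output at a large multiple
of deflate's CPU time, which is exactly the trade described above.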

ABS

--
Alaric Snell-Pym
Work: http://www.snell-systems.co.uk/
Play: http://www.snell-pym.org.uk/alaric/
Blog: http://www.snell-pym.org.uk/?author=4





