Re: [rdiff-backup-users] Tar replacement - format proposal
From: John Goerzen
Subject: Re: [rdiff-backup-users] Tar replacement - format proposal
Date: Fri, 26 Sep 2003 09:14:01 -0500
User-agent: Mutt/1.5.4i

On Fri, Sep 26, 2003 at 02:50:00PM +0100, Kevin Spicer wrote:
> > However, even for tape, the central directory at the end of the file could
> > be great. Most tape drives can wind to a specific block far faster than
> > they can read through the entirety of a file. Even given the time lost for
> > reading the central directory and the seeks necessary to do that, it would,
> > in many cases, turn out far faster.
>
> That's a good point, how would the drive know where to find the index
> though? I'm guessing that you can skip to an EOF mark then seek back x
> blocks from there, but how to know how many blocks the index uses...

Well, Ben's proposal uses the same mechanism as PKZip -- the very last n
bytes in the file (where n is defined by the spec and never changes) contain
a pointer to the offset where the central directory starts. So, your
algorithm for tape would be:

1. Skip to the EOF mark.
2. Wind back one block and read that block.
   (If you know in advance how many blocks the file takes, you could wind
   directly to this block.)
3. Look at the last n bytes and calculate the block in which the central
   directory starts. Wind to that block.
4. Read the central directory sequentially. Determine the block in which
   each requested file starts, and sort the files in ascending offset order.

For each file:

   1. Wind to the block in which it starts, if you are not already there.
   2. Read the file sequentially.

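
The steps above can be sketched in Python against an in-memory "tape". This is only an illustration: the 8-byte big-endian trailer, the JSON-encoded directory, the 512-byte block size, and the function names are all assumptions made for the example, not anything taken from Ben's proposal or from PKZip.

```python
import io
import json
import struct

BLOCK_SIZE = 512                 # hypothetical tape block size
TRAILER = struct.Struct(">Q")    # hypothetical: fixed 8-byte pointer to directory start


def build_archive(files):
    """Build a toy archive: file data, then a JSON directory, then the trailer."""
    buf = io.BytesIO()
    directory = []
    for name, data in files.items():
        directory.append({"name": name, "offset": buf.tell(), "length": len(data)})
        buf.write(data)
    dir_offset = buf.tell()
    buf.write(json.dumps(directory).encode("utf-8"))
    buf.write(TRAILER.pack(dir_offset))
    return buf.getvalue()


def plan_reads(archive, wanted):
    """Return (dir_block, [(start_block, entry), ...]) in ascending offset order."""
    f = io.BytesIO(archive)
    # Steps 1-2: go to the end and read the fixed-size trailer.
    f.seek(-TRAILER.size, io.SEEK_END)
    trailer_pos = f.tell()
    (dir_offset,) = TRAILER.unpack(f.read(TRAILER.size))
    # Step 3: work out which block holds the start of the directory.
    dir_block = dir_offset // BLOCK_SIZE
    # Step 4: read the directory sequentially, keep the requested files,
    # and sort them so the tape only ever winds forward.
    f.seek(dir_offset)
    directory = json.loads(f.read(trailer_pos - dir_offset))
    entries = sorted(
        (e for e in directory if e["name"] in wanted),
        key=lambda e: e["offset"],
    )
    return dir_block, [(e["offset"] // BLOCK_SIZE, e) for e in entries]


def extract(archive, wanted):
    """For each planned file: seek to its start, then read it sequentially."""
    _, plan = plan_reads(archive, wanted)
    f = io.BytesIO(archive)
    out = {}
    for _, entry in plan:
        f.seek(entry["offset"])
        out[entry["name"]] = f.read(entry["length"])
    return out
```

On a real drive the `seek` calls would be wind operations to a block number, which is why the plan is expressed in blocks as well as byte offsets.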
Now, steps 1-4 are cumbersome, as it often takes tape drives 5-30 seconds to
switch from winding to reading. However, even if it takes, say, 2 minutes
to read the central directory, plus 4 minutes to wind to it and another 4
minutes to wind to the start of the file, that's only 10 minutes -- versus 2
or 3 hours to read through the entire archive.
(These are real-world numbers from my own tape drive)
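
As a quick check of that arithmetic, here is the same comparison in Python, using only the figures quoted above (the variable names are just for illustration):

```python
# All times in minutes, taken from the figures in the message.
read_directory = 2
wind_to_directory = 4
wind_to_file = 4

indexed_total = read_directory + wind_to_directory + wind_to_file
full_scan_range = (2 * 60, 3 * 60)   # "2 or 3 hours" to read the whole archive

# How many times faster the indexed approach is at each end of the range.
speedup_range = tuple(t / indexed_total for t in full_scan_range)
print(indexed_total, speedup_range)
```

So with these numbers the indexed restore comes out roughly 12 to 18 times faster than a full sequential read.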
> for the existence of file from an index on disk, rather than having to
> load tapes until you find what you are looking for.
That's an excellent idea as well.
-- John