Re: [Gzz] Raw pools?
From: Tuomas Lukka
Subject: Re: [Gzz] Raw pools?
Date: Mon, 11 Nov 2002 11:52:57 +0200
User-agent: Mutt/1.4i
On Sun, Nov 10, 2002 at 08:00:26PM +0100, Benja Fallenstein wrote:
> Tuomas Lukka wrote:
>
> >On Sun, Nov 10, 2002 at 02:39:58PM +0100, Benja Fallenstein wrote:
> >
> >
> >>Tuomas Lukka wrote:
> >>
> >>
> >>
> >>>An idea, related to the "canonical blocks":
> >>>Maybe we should label some pools "raw", i.e. no header, just a block of
> >>>binary data. That way, we could be compatible with other content-based
> >>>systems
> >>>for externally obtained data.
> >>>
> >>That means we cannot move blocks from there to non-raw pools. Also, we'd
> >>have to guess the type.
> >
> >Yes. But I think that's acceptable for the intended purpose:
> >allowing us to use material that we can't distribute.
> >
>
> As one of the important principles of Storm, I see what I call the
> "persistency commitment" (which I should peg ;-) ):
We should probably PEG the basic ideology of Storm as a whole:
- persistent blocks: operations
    - get a block's bits EXACTLY, or a "sorry, can't get them" answer
    - store a block, get an ID
- pointers
- ...
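
To make the two block operations concrete, here is a minimal sketch in Python. It is not the Storm implementation: the class name BlockStore, the in-memory dict, and the use of a bare hex SHA-1 as the id are all assumptions made for illustration.

```python
import hashlib


class BlockStore:
    """Toy in-memory store (hypothetical): immutable blocks keyed by SHA-1."""

    def __init__(self):
        self._blocks = {}

    def store(self, data: bytes) -> str:
        """Store a block and return its id (assumed: hex SHA-1 of the bytes)."""
        block_id = hashlib.sha1(data).hexdigest()
        self._blocks[block_id] = bytes(data)
        return block_id

    def get(self, block_id: str) -> bytes:
        """Return the block's bits exactly, or raise KeyError
        (the "sorry, can't get them" answer)."""
        data = self._blocks[block_id]  # KeyError if the block is unknown
        # Re-check the hash so a corrupted block is never returned silently.
        if hashlib.sha1(data).hexdigest() != block_id:
            raise KeyError(block_id)
        return data
```

The point of the sketch is that get() has only two outcomes: the exact bytes that were stored, or a failure. There is no "approximately the right block".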
> Since all future implementations will have to support all 'features' we
> put in now, we have to be extremely careful, because all garbage we put
> in now has to be carried along indefinitely. This is relevant in two ways:
> - If we support "raw pools" now, all future versions of Storm will need
> to support raw pools; otherwise, the blocks could not be referenced any
> more. This would violate the commitment.
Ahh, I think I was not clear, once again.
I didn't mean that raw pools should be equal to normal Storm pools.
The raw blocks would *NOT* have Storm ids and would not be addressed in
that way.
The point is that we need a level of indirection from the Storm block
for a PDF file (a header, which should be redistributable) to the actual
file (which is not redistributable).
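
A rough sketch of that indirection, in Python. The header layout, the field name Body-SHA1, and the functions make_header_block and resolve are hypothetical; nothing here is an agreed format. The header block carries only metadata plus the plain SHA-1 of the file, and the file itself lives in a separate raw namespace.

```python
import hashlib


def make_header_block(body: bytes, content_type: str) -> bytes:
    """Build a redistributable header that points at the raw file by its
    plain SHA-1 (hypothetical layout: header lines, then a blank line)."""
    body_hash = hashlib.sha1(body).hexdigest()
    return (
        f"Content-Type: {content_type}\r\n"
        f"Body-SHA1: {body_hash}\r\n"
        "\r\n"
    ).encode("ascii")


def resolve(header_block: bytes, raw_pool: dict) -> bytes:
    """Follow the header to the raw file held in a separate namespace.

    raw_pool maps plain hex SHA-1 -> file bytes, as a generic
    content-addressed system might.
    """
    lines = header_block.decode("ascii").splitlines()
    fields = dict(line.split(": ", 1) for line in lines if line)
    wanted = fields["Body-SHA1"]
    body = raw_pool[wanted]  # KeyError if we do not have the file
    if hashlib.sha1(body).hexdigest() != wanted:
        raise ValueError("raw body does not match the header's hash")
    return body
```

The header never contains the file's bytes, so it can be passed around freely; whoever has obtained the file by other means can drop it into the raw pool and the header will resolve.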
> - Storm would cease to be a mapping from ids to a block of binary data
> *with metadata, at least a Content-Type*. This change would not be
> revertible, since that would violate the commitment.
I'm not proposing that!
>
> [Disclaimer: Of course, it's not unlikely at all that Storm won't
> survive even ten years. The point is to try and publish the results, so
> that somebody designing a protocol that does last that long can learn
> from it.]
It's much easier to create a new format than to stop using an old one.
If Storm ever catches on, there'll be running implementations 20 years from now
;)
> >E.g. for xupdf, this would be vital for other people to be able
> >to use the demo.
> >
>
> We have the canonical blocks (with just the Content-Type header); since
> you have to call a program to put something inside Storm anyway (unless
> you're going to calculate the SHA-1 hash yourself), I don't see the
> difference it would make at this point in time.
It *does* make a big difference: the SHA-1 is not the same one that someone
who just obtains the file would calculate. That's a big issue because
most SHA-1-based content retrieval systems will *NOT* have the
same Content-Type header.
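
The mismatch is easy to demonstrate in Python. The exact header layout used below (one Content-Type line, a blank line, then the body) is an assumption for illustration, as are the function names; any prepended header at all has the same effect.

```python
import hashlib


def raw_sha1(body: bytes) -> str:
    """What someone who just has the file would calculate."""
    return hashlib.sha1(body).hexdigest()


def canonical_block_sha1(body: bytes, content_type: str) -> str:
    """Hash of header + body (assumed layout: one header line, blank line)."""
    header = f"Content-Type: {content_type}\r\n\r\n".encode("ascii")
    return hashlib.sha1(header + body).hexdigest()


body = b"abc"
print(raw_sha1(body))  # a9993e364706816aba3e25717850c26c9cd0d89d
print(canonical_block_sha1(body, "text/plain"))  # a different digest
```

A system that looks files up by the first value can never be queried with the second, which is why the raw file needs to be addressable by its own plain hash.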
> >How vehemently opposed are you?
> >
>
> As the above shows, very. ;-) I don't think the single purpose here is
> worth the change.
Actually, you're not, as your proposal shows. You're assuming that I want
the raw blocks to be "real" Storm blocks, i.e. first-class citizens.
This is not so. For my purposes, having them in a completely different
namespace is fine.
> *snip my proposal*
>
> >>(Note that the body chunks will *not* be Storm blocks, even though
> >>they're referred to by an SHA-1 hash. It is therefore illegal to refer to
> >>a body chunk through a Storm URI.)
> >>
> >>
> >
> >So these would, essentially, be blocks in raw pools?
> >
>
> No, since they would not be blocks. :-) But they would have the property
> you're searching for: they could be queried through a content
> distribution network that uses a plain SHA-1 hash for the identification
> of files.
Exactly.
> Let me put it like this: It has broad enough applicability that I'm
> thinking this *might* just be good enough to warrant the burden on
> future implementations.
Great, let's PEG it.
Tuomas