[Gzz] Re: the Storm article

gzz-dev

From:	Alatalo Toni
Subject:	[Gzz] Re: the Storm article
Date:	Fri, 7 Mar 2003 08:16:44 +0200 (EET)

On Thu, 6 Mar 2003, Eric Armstrong wrote:

i'm quoting extensively to inform the other authors and the actual designers
via the list -- i must go teach some students ;) in 10 mins, but a few
reactions here first:

> First, I liked the article. A lot. I'm looking
> forward to using Storm, and so is Eugene (only
> took a 30 second presentation to get him
> interested).

glad to hear, and thank you very much for the detailed comments!

> I even started a review article "Taking the World
> by Storm", for publication in some broad-interest
> journal. (Not sure which one, though.)

i've also started working on a review, from a security point of view
related to mobile and ubiquitous computing environments (wireless
connections and small devices)

> Next, specifics.

can't go into most of this right now, but probably later today; or perhaps
Benja, Tuomas or Hermanni will get to it before me (hence the full quote)

> Missing Ingredients
> -------------------
> These are things that need to be addressed in the
> article, however briefly, but are not currently
> mentioned:
>   * How are collisions handled?
>     (Surely some small blocks must produce the
>      same cryptographic hash as other small blocks,
>      sometimes.)

afaik it should not happen, but it is theoretically possible. so i guess
it's a good question :)
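
to make the collision question concrete, here is a minimal sketch. it
assumes block ids are SHA-1 digests of the block content (the article only
says "cryptographic hash"), and the helper names are hypothetical, not the
Storm API. it shows that the id depends on the content alone, and uses the
standard birthday-bound approximation to estimate how unlikely an accidental
collision would be.

```python
import hashlib

HASH_BITS = 160  # assumption: a 160-bit digest such as SHA-1


def block_id(content: bytes) -> str:
    """Hypothetical helper: derive a block id from the content alone."""
    return hashlib.sha1(content).hexdigest()


def collision_probability(num_blocks: int, bits: int = HASH_BITS) -> float:
    """Birthday-bound approximation: p ~ n * (n - 1) / 2^(bits + 1)."""
    return num_blocks * (num_blocks - 1) / 2.0 ** (bits + 1)


if __name__ == "__main__":
    # Identical content always yields the identical id.
    print(block_id(b"hello") == block_id(b"hello"))
    # Even with a trillion blocks the estimate stays vanishingly small.
    print(collision_probability(10**12))
```

under these assumptions an accidental collision is possible in theory but
far too improbable to plan for; deliberate collisions are a separate question
that depends on the strength of the chosen hash function.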

>   * How are docs hashed? I didn't see a discussion of that.
>
>   * What is the project storage impact?
>     (Maybe only "publish" material goes into the system,
>      or maybe storage is cheap and growing cheaper so
>      we don't really care, but it needs to be mentioned.)
>
>   * What language is it written in?
>     (Or do I care? If it really is like a "file system",
>      maybe I really don't?)

mostly Java. otherwise ex-gzz (now Fenfire) has also been written in
Python (Jython, for tests, demos and clients at least) and C++ (OpenGL
graphics API), but all the Storm code I've seen is Java.

>   * If there really is a "file system" gui, that's still
>     going to be different from a shell, because I won't
>     be able to launch any existing editors, will I? They'll
>     need to write new files, not rewrite old ones -- and
>     they'll need to understand blocks and transclusions.

yes. the file system implementation has so far been used only to save data
from ex-gzz, via the Storm API.

>   * Short description of "Structured overlay networks".
>     What they do, what they accomplish. (paragraph or two)

they are a type of peer-to-peer network; "overlay" refers to how e.g.
gnutella and freenet are layered over the Internet.
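
what makes an overlay "structured" is that every key is mapped
deterministically to the node responsible for it. below is a toy sketch of
that idea using a hash ring. the class and function names are hypothetical
and are not taken from Storm or from any particular DHT. unlike a real DHT,
this toy keeps a global list of all nodes instead of routing a lookup through
a few neighbours.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Map a node name or a key onto a 160-bit identifier ring."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


class ToyRing:
    """Toy key-to-node mapping; assumes every node knows all other nodes."""

    def __init__(self, node_names):
        self._ring = sorted((ring_position(n), n) for n in node_names)
        self._positions = [pos for pos, _ in self._ring]

    def responsible_node(self, key: str) -> str:
        """Return the first node at or after the key's position, wrapping around."""
        idx = bisect.bisect_left(self._positions, ring_position(key))
        return self._ring[idx % len(self._ring)][1]


if __name__ == "__main__":
    ring = ToyRing(["node-a", "node-b", "node-c"])
    # Every peer that runs the same computation finds the same node.
    print(ring.responsible_node("some-block-id"))
```

because the mapping is deterministic, any peer can work out where a given
block id should live without flooding the network with queries.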

>   * Short description of gzz and its relationship to Storm

all of this must be updated to reflect the current Fenfire status

> Sequenced Comments
> ------------------
> Thoughts and questions that occurred to me as I read.

ok, i must go now - but thanks again, and we'll certainly keep in touch (it
has also been nice to see the development on ba-unrev, and Benja's
participation there, too)

~Toni

>
> Abstract
>  * Very cool. location-independent globally unique identifiers,
>    append-and-delete only storage, and peer-to-peer networking.
>    very, very cool.
>
> Intro
>  * Wow. 8 references to systems that implement structured overlay
>    networks. I had no idea there were so many.
>
>  * 51 references in all, mostly in journals, to some *great*
>    work solving problems of data sharing, granular addressing,
>    linking, and versioning.
>
>  * The two major issues addressed are mentioned here: dangling
>    (unresolved) links and keeping track of alternative versions.
>    These deserve to be mentioned in the abstract.
>
> Related Work
>  * It's not totally clear what the relationship of the related
>    work is to the current project. Do the systems described
>    represent old work you've moved beyond, old work that
>    provided useful lessons (what lessons?), a foundation for
>    the current work (what parts?), predecessors or clients of
>    the current work?
>
>  * Mention gzz here, and its relationship to Storm (i.e. gzz
>    refactored to create Storm as an independent module.)
>
> Peer-to-Peer Systems
>  * Mentions a proposal for a common API usable by DHT systems,
>    but it's not clear if you plan to build on that, or if it
>    is a rival, or a predecessor.
>
>  * Mentions "Xanalogical storage", but assumes we know what it
>    is. (Needs a short description. Ok to do a forward reference
>    to where it is discussed later in the article.)
>
>  * Hmmm. Probabilistic access seems reasonable for "contact"
>    scenarios (bunch of people together at a meeting), but not
>    for "publishing" scenarios (publish document on the web).
>    May be worth drawing the distinction here.
>
> Overview of Xanalogical Storage
>  * This threw me. A minute ago we were talking about blocks,
>    now we're talking about characters. Needs a transition to
>    make the relationship apparent. (Later, you talk about
>    spans. Those may be precursors to blocks or they really are
>    blocks. I'm not sure which. Need to anticipate that thought
>    somehow, and tell how we're building up to it, if that's
>    what's going on.)
>
>  * Yeah. There's the paragraphs on spans. That threw me, too.
>    Suddenly I had gone from blocks to characters and now to
>    spans, and I was pretty confused about how they related.
>
>  * "Our current implementation" has me wondering what we're
>    talking about. At this point, I thought this was more "Related
>    work", like "peer to peer systems". But now it seems it's
>    all one system? Or was this a previous system, before you
>    started working on Storm? (Need to make the relationships
>    apparent.)
>
> Storm Block Storage
>  * Now we're back to blocks. Why did that last section exist,
>    anyway? (make the relationship apparent)
>
>  * "caching becomes trivial, because it is never necessary to
>    check for new versions of blocks". Hmm. This sounds like
>    versioning isn't supported, which seems like a weakness.
>
>  * Interesting. There is a need for "anonymous caching". That
>    allows replication, while resolving the privacy concern.
>
>  * A block is hashed. Ok. And a doc contains pointers to blocks.
>    Ok. But is a doc a block? How is it hashed? How do links
>    contribute to the hash?
>
>  * Gzz is first mentioned here. It needs to be described earlier
>    in the Xanalogical addressing section.
>
>  * "Storm was first developed for the Gzz application, a platform
>    explicitly developed to overcome the limitations of traditional
>    file-based applications" -- a *very* intriguing statement.
>    When Gzz is introduced, this statement needs to be expanded to
>    provide a short list of those limitations, and what Gzz did to
>    solve them. (It has to be very short, of course -- no mean feat.)
>
> Implementation
>  * "we have not yet put a p2p-based implementation into use"
>    This paragraph is very nicely stated. You've done so much
>    already, no one can blame you if this part is missing! But it
>    was very good of you to point it out. You do that same kind
>    of thing elsewhere, as well. Very nice.
>
>  * "UI conventions for listing, moving, and deleting blocks"
>    I don't know. That sounds wrong to me. Blocks should be
>    under the covers, and I should be dealing with docs. Ok,
>    so I have an outline-browser (for example, ADM's) or a
>    similar editor. Internally, blocks are moved around when I
>    edit. But my access is always thru a "Doc" -- otherwise I'll
>    be looking at blocks that are divorced from any context whatever.
>
> Application-Specific Reverse-Indexing
>  * This lost me pretty quickly. I wasn't sure what the purpose
>    of this section was. I needed a use case or two to keep me
>    oriented. Later, it becomes clear that this is
>    a part of the versioning solution. Mention that fact here.
>    If possible, also give one or more examples of the other
>    indexing systems you created, to show what this section is
>    for.
>
>  * "locally, is guaranteed that all blocks are indexed by all
>    applications known by the pool".
>    --This paragraph should come before the previous one, which
>      discusses the networked implementation, where not all
>      applications may have stuff indexed (at which point I said,
>      huh?)
>    --More importantly, I really needed an example of an application
>      or two so I could follow this. What does it mean if a
>      networked app doesn't have an index? I just wasn't getting
>      it. (It sure sounded like that wouldn't be good, but I
>      don't know for sure.)
>
>  * keyword searching
>    --it seemed to me that a keyword index would return every
>      *version* of a block that contained the word, which would
>      be a real weakness.
>    --(maybe versioning needs to be described first, so you can
>       discuss the indexing process in context, and mention the
>       resolutions for such issues?)
>
> Versioning
>  * Aha! I read the paper over several days, and so much water
>    went under the dam that I had forgotten this was mentioned
>    at the beginning of the paper.
>
>  * "if, on the other hand, two people collaborate..."
>    VERY nice. Multiple "current version"s are allowed to exist.
>    That's the only possible way to handle the situation.
>
>  * Note 6:
>    It wasn't clear to me how it knows which pointer blocks are
>    obsolete.
>
>  * Beautiful statement of points for further research
>    (authenticating pointer blocks, UI for choosing alternative
>     versions, suitability for web-like publishing). But the
>     system looks strong enough to make me *want* to do such
>     experimentation.
>
> Diffs
>  * It wasn't clear if the most recent version was "intact" and
>    previous versions were stored as diffs. I would hope so,
>    in general. At least, if there was only one option, that's
>    the one I'd want. Or can you do it either way?
>
>  * "We always check that we can reconstruct the original version"
>    Very nice.
>
> Discussion
>  * Yes. This is the point of the article. Dangling links and
>    version handling. Definitely belongs in the abstract.
>
>  * Impact of immutable blocks on media use needs a mention
>    here. (Maybe just hand-waving, but some mention of the
>    fact that it's going to cost disk space, in return for
>    improved ability to do xyz, is needed.)
>
> Conclusions
>  * Wild. A Zope based on Storm. Or an OHP.
>    --what's an OHP, anyway? (needs a one-line definition)
>    --come to think of it, I recognize Zope, but not everyone
>      will. That needs a one-line explanation, as well.
>
>  * "structured overlay networks such as DHTs"
>    --I need another paper describing these things, so I can
>      find what the heck they are and how they work!
>
> References
>  * Excellent. Thanks.
>
> Bottom Line
> -----------
> An excellent read, and a most promising technology.
> Thanks for sending it to me.
>