[Gzz-commits] manuscripts/storm article.rst


From: Hermanni Hyytiälä
Subject: [Gzz-commits] manuscripts/storm article.rst
Date: Fri, 14 Feb 2003 10:46:38 -0500

CVSROOT:        /cvsroot/gzz
Module name:    manuscripts
Changes by:     Hermanni Hyytiälä <address@hidden>      03/02/14 10:46:38

Modified files:
        storm          : article.rst 

Log message:
        Refs from Benja's thesis to article.rst. Refs still missing, please add
        refs and check that they are working ;).

CVSWeb URLs:
http://savannah.gnu.org/cgi-bin/viewcvs/gzz/manuscripts/storm/article.rst.diff?tr1=1.145&tr2=1.146&r1=text&r2=text

Patches:
Index: manuscripts/storm/article.rst
diff -u manuscripts/storm/article.rst:1.145 manuscripts/storm/article.rst:1.146
--- manuscripts/storm/article.rst:1.145 Fri Feb 14 10:13:07 2003
+++ manuscripts/storm/article.rst       Fri Feb 14 10:46:38 2003
@@ -31,14 +31,13 @@
 either have to include location information (as in regular URLs, which break 
 when documents are moved), or can only be resolved locally (as in
 link services that can only find links stored on a select set
-of link servers [ref Microcosm, DLS, ...]). Berners-Lee [ref NameMyth - that
-was '96, what does he say now?] argues that 
-unique random identifiers are not globally feasible for this reason.
+of link servers [hill94extending-andalso-carr95dls]_). Berners-Lee [name-myth]_ 
+argues that unique random identifiers are not globally feasible for this reason.
 
 However, recent developments in peer-to-peer systems have
 rendered this assumption obsolete. Structured overlay networks
-[ref chord, can, tapestry, pastry, kademlia, symphony, viceroy,
-skip graph, swan, kelips] allow location-independent identifiers
+[stoica01chord-andalso-ratnasamy01can-andalso-zhao01tapestry-andalso-rowston01pastry-andalso-maymounkov02kademlia-andalso-malkhi02viceroy-andalso-AspnesS2003-andalso-bonsma02swan]_ 
+allow location-independent identifiers
 to be resolved on a global scale. 
 Thus, it is now possible to perform a global lookup to find all information
 related to a given identifier on any participating peer in the network.
@@ -67,7 +66,7 @@
 
 .. [#] It might be more appropriate to speak about *resources*
    and *references* instead of *documents* and *links*, but
-   in the spirit of [ref kappe95scalable], we stick with
+   in the spirit of [kappe95scalable]_, we stick with
    the simpler terms for explanation purposes.
 
 *Dangling links* are an issue when documents are moved
@@ -75,18 +74,18 @@
 but there is a local copy (e.g. on a laptop or dialup system);
 or when the publisher removes a document permanently,
 but there are still copies (e.g. in a public archive such as
-[ref web.archive.org]). Dangling links are also an issue
+ [waybackmachine]_). Dangling links are also an issue
 when a document and a link to it are received independently,
 for example as attachments to independent emails,
 or when a link is sent by mail and the document is available
 from the local intranet. When two people meet e.g. on the train,
 they should be able to form an ad-hoc network and follow links
-to documents stored on either one's computer [ref Thompson et al].
+to documents stored on either one's computer [thompson01coincidence]_.
 Furthermore, when a document is split to parts, links to 
 the elements in the parts that are then in new documents should not break.
 
 Advanced hypermedia systems such as Microcosm and Hyper-G
-address dangling links through a notification system [ref]:
+address dangling links through a notification system [hill94extending-andalso-kappe95scalable]_:
 When a document is moved, a message is sent to servers storing links to it.
 Hyper-G uses an efficient protocol for delivering such notifications
 on the public Internet. 
@@ -108,10 +107,10 @@
 modifying them on each; when two people collaborate on a document, 
 sending each other versions of the document by email; 
 when someone downloads a document, modifies it, and publishes
-the modified version (e.g., a manual licensed under the Gnu FDL [ref]),
+the modified version (e.g., a manual licensed under the Gnu FDL [gnu-fdl]_),
 or when a group of people collaborate on a set of documents,
-synchronizing irregularly with a central server (as in CVS [ref]),
-a network of servers (as in Lotus Notes [ref]) or directly with each other 
+synchronizing irregularly with a central server (as in CVS [cvs]_),
+a network of servers (as in Lotus Notes) or directly with each other 
 (as in Groove[?] [ref]). In each of these cases, a user should be able
 to work on the version at hand and then either merge it with others 
 or fork to a different branch.
@@ -120,8 +119,8 @@
 dealing with versioning and dangling links. Storm is a library
 for storing and retrieving data as *blocks*, immutable
 byte sequences identified by cryptographic content hashes
-[ref ht'02 paper]. Additionally, Storm provides services
-for versioned data and Xanalogical storage [ref].
+[lukka02guids]_. Additionally, Storm provides services
+for versioned data and Xanalogical storage [ted-xanalogical-structure-needed]_.
 We address the mobility of documents by block storage
 and versioning, while we use Xanalogical storage
 to address the movement of content between documents (copy&paste).
@@ -133,7 +132,7 @@
 provide an input to the ongoing discussion about peer-to-peer
 hypermedia systems [ref ht01, ht02].
 Currently, Storm is partially implemented as a part of the Gzz 
-project [ref], which uses Storm exclusively for all disk storage.
+project [gzz]_, which uses Storm exclusively for all disk storage.
 Storm's peer-to-peer functionality is in a very early stage and not 
 usable yet.
 
@@ -187,14 +186,14 @@
 -------------------
 
 The dangling link problem has received a lot of attention
-in hypermedia research [refs]. As examples, we examine the ways
-in which HTTP, Microcosm [ref] and Hyper-G [ref] 
+in hypermedia research (e.g. [davis98referential]_). As examples, we examine the ways
+in which HTTP, Microcosm [fountain90microcosm]_ and Hyper-G [andrews95hyperg]_ 
 deal with the problem.
 
 .. XXX and URNs [ref]
 
 In HTTP, servers are able to notify a client that a document
-has been moved, and redirect it accordingly [ref spec?]. However,
+has been moved, and redirect it accordingly [rfc2068]_. However,
 this is not required, and there are no facilities for
 updating a link automatically when its target is moved.
 Consequently, broken links are a common experience for Web users.
@@ -203,7 +202,7 @@
 through *filters*, which react to arbitrary messages
 (such as 'find links to this anchor') generated by
 a client application. Filters are processes on the local system
-or on a remote host [ref distributed microcosm]. When
+or on a remote host [hill94extending]_. When
 a document is moved or deleted, a message is sent
 to the filters. Linkbases implemented as filters can
 update their links accordingly. A client selects a set
@@ -220,7 +219,7 @@
 
 In Hyper-G, documents are bound to servers, and a link
 between documents on different servers is stored by both servers
-[kappe95scalable]. This ensures that all links from and to a document
+[kappe95scalable]_. This ensures that all links from and to a document
 can always be found, but requires the cooperation 
 of both parties. Hyper-G employs a scalable protocol
 for notifying servers when a document has been moved or removed.
@@ -246,10 +245,10 @@
 2.2. Alternative versions
 -------------------------
 
-Version control systems like CVS or RCS [ref] usually assume
+Version control systems like CVS or RCS [tichy85rcs]_ usually assume
 a central server hosting a repository. The WebDAV/DeltaV protocols,
 designed for interoperability between version control systems, inherit
-this assumption [ref]. On the other hand, Arch [ref] places all repositories
+this assumption [rfc2518-andalso-rfc3253]_. On the other hand, Arch [arch]_ places all repositories
 into a global namespace and allows independent developers 
 to branch and merge overlapping repositories without any central control
 [is there a specific ref for this?].
@@ -283,9 +282,8 @@
 During the last few years, there has been a lot of research
 related to peer-to-peer resource discovery, both in academy 
 and in the industry [ref: iris: http://iris.lcs.mit.edu/, p2p working group].
-There are two main approaches: broadcasting [gnutella1, kazaa, limewire,
-shareaza, freenet, locutus], and distributed hashtables (DHTs) [chord, can, tapestry, pastry,
-kademlia, symphony, viceroy, kelips]. Broadcasting systems
+There are two main approaches: broadcasting [gnutellaurl-andalso-ripeanu02mappinggnutella-andalso-kazaaurl]_, 
+and distributed hashtables (DHTs) [stoica01chord-andalso-ratnasamy01can-andalso-zhao01tapestry-andalso-rowston01pastry-andalso-maymounkov02kademlia-andalso-malkhi02viceroy]_. Broadcasting systems
 forward queries to all systems reachable in a given number of hops
 (time-to-live). DHTs store (key,value) pairs which can be found given
 the key; a DHT assigns each peer a subset of all possible keys, and
@@ -320,12 +318,12 @@
 query closer to its destination in key space, until they reach
 the peer responsible for the query.
 
-.. http://sahara.cs.berkeley.edu/jan2003-retreat/ravenben_api_talk.pdf
-   Full paper will appear in IPTPS 2003 -Hermanni
+A common API that can be supported by current and future DHTs
+is being proposed in [zhao03api]_.
 
 Recently, a few DHT-like systems have been developed which employ
 a key space similarly to a DHT, but in which queries are routed
-to (key,value) pairs [SWAN, skip graph]: A peer 
+to (key,value) pairs [bonsma02swan-andalso-AspnesS2003]_: A peer 
 occupies several positions in the key space, one for each 
 (key,value) pair. In such a system, the indirection of placing
 close keys in the custody of a 'hashtable bucket' peer is removed
@@ -366,7 +364,7 @@
 approaches to the size of values. Consider a file-sharing application:
 If the keys are keywords from the titles of shared files, are the values
 the files-- or the addresses of peers from which the files may be
-downloaded? Iyer et al [ref Squirrel] call the former approach
+downloaded? Iyer et al [iyer02squirrel]_ call the former approach
 a *home-store* and the latter a *directory* scheme (they call the peer
 responsible for a hashtable item its 'home node,' thus 'home-store').
 
@@ -386,17 +384,17 @@
 property which is used in Storm blocks.
 
 Recently there has been some interest in peer-to-peer hypermedia.
-Thompson and de Roure [ref ht01] examine the discovery
+Thompson and de Roure [thompson01coincidence]_ examine the discovery
 of documents and links available at and relating to
 a user's physical location. An example would be
 a linkbase constructed from links made available by different
-participants of a meeting [thompson00weaving]. 
-Bouvin [ref 02] focuses on the scalability and ease of publishing
+participants of a meeting [thompson00weaving]_. 
+Bouvin [bouvin02open]_ focuses on the scalability and ease of publishing
 in peer-to-peer systems, examining ways in which p2p can serve
 as a basis for Open Hypermedia. Our own work has been 
-in implementing Xanalogical storage [ref 02].
+in implementing Xanalogical storage [lukka02guids]_.
 
-At the Hypertext'02 panel on peer-to-peer hypertext [ref],
+At the Hypertext'02 panel on peer-to-peer hypertext [p2p-hypertext-panel]_,
 there was a lively discussion on whether the probabilistic access
 to documents offered by peers joining and leaving the network
 would be tolerable for hypermedia publishing. For many documents,
@@ -414,8 +412,8 @@
 3. Overview of Xanalogical storage
 ==================================
 
-In the xanalogical storage model [ref], 
-pioneered by the unfinished Project Xanadu [ref],
+In the xanalogical storage model [ted-xu-tech]_, 
+pioneered by the unfinished Project Xanadu [ted-xu-tech]_,
 links are not between documents, but individual characters.
 When a character is first typed in, it acquires a permanent id
 ("the character 'D' typed by Janne Kujala on 10/8/97 8:37:18"),
@@ -458,7 +456,7 @@
 being ``text/plain``). To designate a span of characters
 from that session, we use the block's id, the offset of the first
 character, and the number of characters in the span.
-This technique was first introduced in [ref ht02 paper].
+This technique was first introduced in [lukka02guids]_.
 
 In Xanadu, characters are stored to append-only *scrolls*
 when they are typed [ref]. Because of this, in Storm, we call the 
@@ -491,7 +489,7 @@
 will be relatively small (limited by the amount of text
 the user enters between two saves of a document), we hope
 that this will not be a major scalability problem. Otherwise,
-systems that allow range queries, such as skip graphs [ref], 
+systems that allow range queries, such as skip graphs [AspnesS2003]_, 
 skipnet [ref], may prove useful.
 
 One question raised by xanalogical storage is which links to show
@@ -499,7 +497,7 @@
 We hope to address this problem by collaborative filtering
 of links [explain, ref (grouplens.org pubs?)]. There has been research on
 collaborative filtering in peer-to-peer systems
-without compromising participants' privacy [ref John Canny].
+without compromising participants' privacy [canny02collaborative-andalso-canny02factor]_.
 For some purposes simple rules based on e.g. belonging to a group may be
 applicable as well: e.g. when working on a project with a project group, it
 may be beneficial for the members of the group to see other members'
@@ -529,7 +527,7 @@
 
 In Storm, all data is stored
 as *blocks*, byte sequences identified by a SHA-1 
-cryptographic content hash [ref SHA-1]. 
+cryptographic content hash [fips-sha-1]_. 
 Being purely a function of a block's content, block ids
 are completely independent of network location.
 Blocks have a similar granularity
@@ -571,7 +569,7 @@
 the flash crowd problem could be alleviated: The more users
 request a block, the more locations there are to download it from.
 This resembles e.g. the Squirrel
-web cache [ref] [more refs? -Hermanni]; however, downloads can be
+web cache [iyer02squirrel]_; however, downloads can be
 from *any* peer since the source does not need to be trusted.
 On the other hand, there are privacy 
 concerns with exposing one's browser cache to the outside world.
@@ -636,7 +634,7 @@
 Even after failure of all of the publisher's mirrors,
 a document may still be available from peers that have
 downloaded it. An archive of published blocks, in the spirit
-of the Web archive [ref], would only be yet another backup;
+of the Web archive [waybackmachine]_, would only be yet another backup;
 normal links to a block would work as long as the archive
 holds a copy. It would also be hard to purposefully remove
 a published document from the network; whether this is
@@ -678,7 +676,7 @@
 4.1. Implementation
 -------------------
 
-Storm blocks are MIME messages [ref MIME], i.e., objects with
+Storm blocks are MIME messages [borenstein92mime]_, i.e., objects with
 a header and body as used in Internet mail or HTTP.
 This allows them to carry any metadata that can be carried
 in a MIME header, most importantly a content type.
@@ -699,22 +697,22 @@
 
 Many existing peer-to-peer systems could be used to
 find blocks on the network.
-For example, Freenet [ref], recent Gnutella-based clients 
-(e.g. Shareaza [ref]), and Overnet/eDonkey2000 [ref] 
+For example, Freenet [freenet-ieee]_, recent Gnutella-based clients 
+(e.g. Shareaza [shareazaurl]_), and Overnet/eDonkey2000 [ref] 
 also use SHA-1-based identifiers [e.g. ref: magnet uri]. 
 Implementations on top of a DHT could use both the
-directory and the home store approach as defined by [ref Squirrel].
+directory and the home store approach as defined by [iyer02squirrel]_.
 
 Unfortunately, we have not put a p2p-based implementation
 into use yet and can therefore only report on our design.
 Currently, we are working on a prototype implementation
-based on UDP, the GISP distributed hashtable [ref],
+based on UDP, the GISP distributed hashtable [kato02gisp]_,
 and the directory approach (using the DHT to find a peer
 with a copy of the block, then using HTTP to download the block).
 Many practical problems have to be overcome before this
 implementation will be usable (for example seeding the
 table of known peers, and issues with UDP and network
-address translation [ref]).
+address translation [rfc3253]_).
 
 Sometimes it is useful to think about *zones* blocks are in,
 related to distribution policy: for example, a *public*
@@ -916,14 +914,14 @@
 one user or group of users to be able to produce new
 official versions of a given document (an exception may
 be wikis, which are collaboratively edited by anyone
-interested [ref]). It is not yet clear how to do this.
+interested [leuf01wiki]_). It is not yet clear how to do this.
 Signing pointer blocks digitally may be sensible, but
 digital signatures require a public key infrastructure
 and a trusted timestamping mechanism [#]_, which
 is hardly feasible for a system intended to be used
 for off-line as well as on-line work.
 For long-term publishing, one-time signatures have been
-found useful [ref]. For the time being, the pointer mechanism
+found useful [anderson98erl]_. For the time being, the pointer mechanism
 works only in trusted Storm zones (Section 3), e.g.
 in a workgroup collaborating on a set of documents.
 
@@ -984,7 +982,7 @@
 leading up to the current one would be broken if any
 previous version were deleted. 
 
-Additionally, many versioning systems (e.g. CVS [ref])
+Additionally, many versioning systems (e.g. CVS [cvs]_)
 store the current version as well as the differences,
 enabling them to retrieve the current version quickly and compute
 recent versions by applying the differences 'backwards,'
@@ -1194,8 +1192,8 @@
 No work on integrating Storm with current programs (in the spirit of Open
 Hypermedia) has been done so far. It is not clear how far this is possible
 without changing applications substantially, if advantage of our
-implementation of Xanalogical storage is to be taken.  (Vitali [ref] notes
-that Xanalogical storage necessiates strong discipline in version tracking,
+implementation of Xanalogical storage is to be taken.  (Vitali [vitali99versioning]_
+notes that Xanalogical storage necessiates strong discipline in version tracking,
 which current systems lack.) 
 
 [worth to mention ? -Hermanni]
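
The patched manuscript text describes Storm blocks as immutable byte
sequences identified by a SHA-1 content hash, and spans as a block id plus
an offset and a length. A minimal Python sketch of that idea follows. It is
not Storm's actual API: the names `block_id`, `Span` and `BlockStore` are
hypothetical, the store is an in-memory dict, and offsets count bytes rather
than characters for simplicity.

```python
import hashlib
from dataclasses import dataclass


def block_id(data: bytes) -> str:
    """Return the SHA-1 hex digest of the block's bytes (hypothetical id format)."""
    return hashlib.sha1(data).hexdigest()


@dataclass(frozen=True)
class Span:
    """A run of bytes inside a block: block id, start offset, length."""
    block: str
    offset: int
    length: int


class BlockStore:
    """Toy in-memory store; a stand-in for any local or remote source of blocks."""

    def __init__(self):
        self._blocks = {}

    def add(self, data: bytes) -> str:
        # The id is purely a function of the content, so it is
        # independent of where the block happens to be stored.
        bid = block_id(data)
        self._blocks[bid] = data
        return bid

    def get(self, bid: str) -> bytes:
        data = self._blocks[bid]
        # Re-hashing on retrieval means the source need not be trusted.
        if block_id(data) != bid:
            raise ValueError("block content does not match its id")
        return data

    def resolve(self, span: Span) -> bytes:
        # Slice the referenced block to recover the span's content.
        return self.get(span.block)[span.offset:span.offset + span.length]
```

Because the id is recomputed from the bytes on every `get`, a block fetched
from any peer can be checked against the id that a link refers to.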



