[Gzz-commits] manuscripts/pointers article.rst
From: Benja Fallenstein
Subject: [Gzz-commits] manuscripts/pointers article.rst
Date: Sun, 02 Nov 2003 13:46:50 -0500
CVSROOT: /cvsroot/gzz
Module name: manuscripts
Branch:
Changes by: Benja Fallenstein <address@hidden> 03/11/02 13:46:50
Modified files:
pointers : article.rst
Log message:
rmcrud; leave out diffs for now
CVSWeb URLs:
http://savannah.gnu.org/cgi-bin/viewcvs/gzz/manuscripts/pointers/article.rst.diff?tr1=1.75&tr2=1.76&r1=text&r2=text
Patches:
Index: manuscripts/pointers/article.rst
diff -u manuscripts/pointers/article.rst:1.75 manuscripts/pointers/article.rst:1.76
--- manuscripts/pointers/article.rst:1.75 Sun Nov 2 08:00:23 2003
+++ manuscripts/pointers/article.rst Sun Nov 2 13:46:50 2003
@@ -2,40 +2,6 @@
What's missing: Why isn't the Web running over P2P?
===================================================
-.. META:
-
- (tentative title)
-
- Title suggestions and comments:
-
- should somehow say "Versioning on top of a simple basic model"...
- Totally **distributed** versioning
- Works even in ad hoc environments
- ...
-
- Interoperability between P2P systems
-
- Important issue: the main message. Is it:
-
- This is a nice way to do versioning
-
- This is a distributed block-based versioning system
-
- This is a distributed block-based system for getting the WWW static pages
- feature set from P2P
-
- This is a way to avoid web rot (the Cassini-Huygens thing) by allowing
- storage of old versions (this is VERY different from the location-independent
- web name... more like a side effect)
-
- This is a way to do an ad hoc networking web
-
-
- The MAIN goal of this work remains unclear and needs
- discussion. I think our getting this paper in depends on exactly that:
- a CLEAR main goal, a good presentation that will generate discussion.
-
- REAL:
Abstract
========
@@ -66,8 +32,8 @@
This way, versions of a document
stay available as long as anybody keeps a copy.
-We also present a simple model for storing only the
-differences between versions.
+.. We also present a simple model for storing only the
+ differences between versions.
.. raw:: latex
@@ -169,13 +135,6 @@
and stops publishing a page, it disappears, even if
someone else would have kept a copy.
-.. <<<In file-sharing systems, versioning
- would be useful for media files like the e-books
- distributed by Project Gutenberg [XXXref], which
- are occasionally updated to fix typographic errors.>>>
-
-.. Standing on the shoulders of giants: Example of Web links rotting away
-
This is an important concern.
In 1997, NASA launched the Cassini-Huygens spacecraft
on a mission to Saturn. Before the launch, the mission
@@ -198,104 +157,20 @@
alleviates these concerns, but it introduces
a single point of failure for the *entire* Web!)
-.. Links shouldn't break when documents move or
- publishers lose interest
-
-.. <<<There are two reasons for broken links: Either the original
- publisher has moved the target document to a new address,
- or they have stopped publishing it, usually because keeping
- up a Web page requires some amount of maintenance and they
- have lost interest.>>>
-
.. <<<We don't propose that every byte of information ever published
on the Web has to be kept around forever. However,
we do believe that as long as someone does keep a copy,
data should remain accessible, like in a file-sharing system,
and links should continue to work.>>>
-.. Location-independent, semantic-free, *self-verifying*
- identifiers (ref SFR paper [balakrishnan03semanticfree]_,
- [walfish03dns]_); example: hash-based (ref
- ``hash`` URN namespace Internet-Draft; ref Freenet & others)
-
- History of location-dependence: `TBL ref`_ (like in HT'03 paper)
-
-.. <<<This can be accomplished by replacing URIs that include
- a server name by URIs that are
-
- - location-independent: do not refer to a particular server
- that the file is to be downloaded from;
- - semantic-free: don't include human-readable information;
- if an identifier is semantic-free, there is no incentive
- to change it when a site is re-designed;
- - self-verifying: after downloading an alleged copy of a document,
- there is a cryptographical algorithm to test whether
- this is *really* a copy of this document.>>>
-
-.. <<<In 1996, Tim Berners-Lee [name-myth]_ argued that
- using location-independent, semantic-free identifiers
- is not viable on a global scale:
- "[I]f you put information in a name, it decreases its longevity;
- if you don't you can't dereference it to a resource.">>>
-
-.. <<<However, as observed in [fallenstein03storm]_,
- with the advent of efficient peer-to-peer lookup mechanisms such as
- distributed hashtables (DHTs), this observation
- is no longer true. A DHT is quite able to resolve a
- hash-based identifier on a global scale,
- as evidenced by applications like the Cooperative
- File System (CFS, [dabek01widearea]_) and
- the Overnet file sharing client [overneturl]_.>>>
-
-.. SFR (semantic-free referencing) not all that close,
- though semantic-free idea shared (SFR takes along
- many problems of the Web)
-
-.. <<<(Using DHTs
- to resolve location-independent identifiers on the Web
- has been proposed by Balakrishnan et.al.
- [balakrishnan03semanticfree-andalso-walfish03dns]_.
- However, in their work, the location-independent identifier
- merely points to a Web server administered by the publisher
- of a Web page; if the original publisher discontinues
- maintenance of the page, it would still drop off the Web.)>>>
-
- XXX move to related work?
-
-.. Proposal: A location-independent Web <<<(closest thing is Freenet (ref))>>>
-
-.. <<<The project that is currently closest to this goal is Freenet,
- a XXX>>>
-
-.. Benefits of hash-based addressing:
- - Pages easily movable between servers
- - Data accessible as long as anyone keeps a copy
- - Load balancing (download from everyone who has a copy)
- - Can use one addressing scheme with different protocols,
- searching different networks for the content behind a hash
- - Verifiable
- - Same namespace for local and for non-local data
-
.. <<<Other projects exploit some of the advantages of hash-based
(storage systems: CFS, PAST; web caching: Squirrel),
but don't address the Web.>>>
-.. <<<The infrastructure behind CFS, PAST and Squirrel: Peer-to-Peer>>>
-
-.. <<<Quite recently, several Peer-to-Peer architectures have been
- proposed that use hash-based, loc.ind. ids>>>
-
.. Possibility of desktop integration in ways that the location-dependent
Web cannot archieve, through the novel combination of
network transparency and location independence (ref ourselves).
-.. However, there's a problem with this: versioning -----
- Basic problem: Hash-based addressing allows no updates
-
-.. Contributions; structure of this paper ----
-
-.. Main contrib: Pointer records for implementing updating
-
The main contribution of our paper are *pointer records*,
a versioning mechanism which is similar to Oceanstore's heartbeats,
but makes the clients download and store the pointer records
@@ -335,100 +210,21 @@
in Gnutella or a DHT-based system, using an anonymized system
like Achord [XXXref] if it contains controversial content.
-.. Other contribs:
- - The idea of a location-independent Web including
- location-independent version management
- - Diffs
-
An additional contribution of this paper is XXX diffs
-.. Structure of this paper
-
The remainder of this paper is structured as follows.
In Section 2, we introduce pointer records.
In Section 3, we propose a simple, hash-based data model
that can be used by P2P Web servers and clients.
-In Section 4, we introduce a scheme for storing only
-the differences between versions, built on our basic
-data model. In Section 5 we discuss other possible applications
-of pointer records, and Section 6 gives an overview of our
-implementation. Section 7 concludes.
-
-
-
-.. Related work
- ============
-
- In this section, we briefly summarize how existing peer-to-peer
- systems deal with versioning. We have tried to classify peer-to-peer
- system into four different categories based on their versioning
- model below. We conclude that existing peer-to-peer do not provide all the
- benefits of hash-based addressing scheme.
-
-
- No versioning model
- -------------------
-
- PAST [rowstron01storage]_ is a persistent storage system
- that uses pastry [rowston01pastry]_
- for locating data in a Peer-to-Peer environment.
- Nodes and data items are distributed uniformly based on the hash
- identifier in a PAST network. Free Haven [dingledine00free]_
- provides a distributed anonymous persistent data storage. It uses
- both cryptography and routing techniques to provide anonymity for
- the participating peers and splits a file to a number of shares
- which are distributed through a network.
-
- .. Regular filesharing apps:
- .. Gnutella
- .. Fasttrack stack (Kazaa/Morpheus)
- .. Shareaza (magnet uri's)
- .. Overnet/eDonkey2000/eMule/MlDonkey (have location-independent identifiers, MD4 hashes)
- .. BitTorrent
-
- Centralized versioning model
- ----------------------------
-
- .. SFS is a network filesystem that does not support of searching
- docs. IMHO it's not relevant w.r.t. the article (as Mnet is
- not either)
-
- Like Free Haven, Publius [pub00]_ focuses on anonymity of the
- participating nodes. Compared to Free Haven, however, Publius
- has a support for destructive updates. An issue with Publius
- is that it requires a globally maintained list of participating
- nodes currently available in a system.
-
- OceanStore [kubiatowicz00oceanstore]_ is a global storage system
- based on Tapestry [zhao01tapestry]_ routing algorithm. It
- supports non-destructive, linearly versioned updates through a
- centralized Byzantine agreement protocol.
-
-
- Network-level destructive versioning
- ------------------------------------
-
- CFS [dabek01widearea]_ is based on Chord [stoica01chord]_ and
- stores data blocks, fragments of files, and spreads blocks
- uniformly through the network based on identifier of a block.
- CFS has a support for versioning, but in a way that allows
- only the publisher of a file system to do a destructive
- versioning for a data item. The versioning takes place at
- network-level since the hash of the data block determines
- which host maintains a data block in the overlay.
+In Section 4 we discuss other possible applications
+of pointer records, and Section 5 gives an overview of our
+implementation. Section 6 concludes.
+
+.. In Section 4, we introduce a scheme for storing only
+ the differences between versions, built on our basic
+ data model.
- Miscellaneous versioning models
- -------------------------------
-
- Freenet [freenet-ieee]_ uses a probabilistic routing scheme to
- preserve the anonymity of the participating nodes in a network.
- It uses the "edition" versioning model: one can link
- version x to a link representing version y even if version y
- does not exist in a system yet.
-
- .. ??? XXX I don't understand the above AT ALL
-
Pointer records
===============
@@ -570,15 +366,15 @@
algorithm)
-Diffs
-=====
-
-- when storing all past versions, space may be a problem
-- (not for small text files-- practical experience: 30 MB
- in half a year of use-- but for things where each version
- takes 500 KB)
-- old idea: storing only differences
-- XXX
+.. Diffs
+ =====
+
+ - when storing all past versions, space may be a problem
+ - (not for small text files-- practical experience: 30 MB
+ in half a year of use-- but for things where each version
+ takes 500 KB)
+ - old idea: storing only differences
+ - XXX
Applications