gzz-dev
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Gzz] Hemppah's PEG round 2 comments


From: Tuomas Lukka
Subject: Re: [Gzz] Hemppah's PEG round 2 comments
Date: Thu, 5 Jun 2003 13:15:26 +0300
User-agent: Mutt/1.5.4i

On Thu, Jun 05, 2003 at 11:20:19AM +0300, Hermanni Hyytiälä wrote:
> On Wed, 2003-06-04 at 18:11, Tuomas Lukka wrote:
> 
> 
> > ISSUE: what is known or easily deducible about the effect of a single
> > hostile node? Following variants:
> > 
> >     - Node just wants to make the network work slower
> > 
> >     - Node wants to make searches for some particular informations difficult
> > 
> >     - Node wants to make life difficult for some particular nodes,
> >       slowing down their queries
> > 
> >     - Node wants to make search for particular information by particular 
> > nodes
> >       difficult
> > 
> >     - Node wants to spam a particular identifier or xu block
> > 
> > What about a constant fraction of nodes?
> > 
> > These questions are of utmost importance to us.
> 
> 
> Directly related to DHTs, there are two publications about this topic:
> 
> "Security Considerations for Peer-to-Peer Distributed Hash Tables"
> by Emil Sit and Robert Morris
> 
> "Security for structured peer-to-peer overlay networks"
> by M. Castro, P. Druschel, A. Ganesh, A. Rowstron and and D. Wallach
> 
> Both papers discusses scenarions you mentioned above, e.g., Sit's
> paper describes:
...

Ok, since none of the defences are implemented in GISP, 
the simulation should be postponed to see if we can make exploit programs
and get GISP proofed against them. If not, we need another platform.

So that changes the goals; we don't want to do the abstract simulations --
we want to do concrete simulation of gisp clients with a fraction of
hostile exploiters and see how easy it is to stop things. 

Therefore, I suggest postponing this PEG and starting a new one:
"Attacking GISP", which contains your plans on how you will make an exploit..

> > ISSUE: Do we want to keep GISP or move to Kademlia or some other?
> > 
> > > ISSUE:
> > >         
> > >     What experiments / simulations / methods are used in the literature?
> > >     
> > >     RESOLVED:
> > >     
> > >     - mainly theoretical/best case experiments/simulations
> > >     - small amount of uncontrolled (real life), great amount of
> > >       controlled (simulation environment)
> > >     - simulations have been rather static and are often favored the 
> > >       proposed algorithm    
> > >     - in "real life" experiments, there have been very fast connections 
> > >       between peers
> > >     - some form of formal analyses ("proofs") are used, 
> > >       e.g., with Chord and Kademlia
> > >     - distribution of number of peers used in simulations: from 2^7 to 
> > > 2^20 
> > 
> > A summary of what the actual findings are would be good, i.e. which
> > things work and how. 
> 
> Hm, how detailed information do you want ? Could you give an example of
> "which thing" ? :)
> 
> In general, based on the published papers, it's quite hard say for sure
> what things *really* work and what works "in the article".

Well, that's what your summary should contain: the results and
your guesses as to their generality.

> > > Plan
> > > ====
> > > 
> > > First of all, we will create a PEG document (this document) which 
> > > discusses general and theoretical aspects of the research problems. Then, 
> > > if necessary, we program (rather short) test cases which will test 
> > > the GISP/Storm P2P properties, as discussed in this document. Finally, we 
> > > will 
> > > collect and analyse test cases' information and publish any interesting 
> > > discoveries. 
> > > 
> > > The GISP protocol
> > > =================
> > > 
> > > According to Daishi Kato, the author of GISP, GISP ''...intends leverage 
> > > those (DHT) algoritms and make a practical protocol.''. GISP's distance 
> > > function is based on the XOR-metric, which is similar to Kademlia's 
> > > XOR-metric 
> > > (unsigned integer of XOR). The GISP protocol specification paper 
> > > describes the 
> > > XOR-metric but the original GISP publication describes only numerical 
> > > metric (which is little confusing since Chord uses numerical distance
> > > metric and it's evident that a numerical metric is different from a 
> > > XOR-metric).
> > > 
> > > GISP extends Chord's routing table to have more peer information as a 
> > > cache.
> > > Thus, log^2 messages are required to join/leave operations (according to 
> > > the
> > > original Chord publication). Additionally, Chord's routing table is 
> > > asymmetric 
> > > (requires stabilization protocol) and lookup process is unidirectional 
> > > (in a virtual overlay ring). Original Kademlia publication states that 
> > > Chord's routing table is rigid compared to Kademlia's routing table.
> > > 
> > > Since GISP lacks of Kademlia's binary-tree-like abstraction of the 
> > > routing 
> > > table, it is evident that the benefits of Kademlia's lookup properties 
> > > (over 
> > > other DHTs) are not available in the GISP protocol, e.g., no concurrent 
> > > and
> > > asynchronous lookups (no free of choice) and when a peer leaves the system
> > > no messages are required.
> > 
> > Uhh, sentence unclear: in GISP, are messages required or not?
> 
> Since original GISP publication talks about Chord-like routing table, we
> can assume that GISP doesn't use Kademlia's routing table (abstraction).
> Thus, GISP requires O(log^2 n) messages when a peer joins/leaves the
> system.

Which is not acceptable for us - network connections will get severed &c.

So GISP doesn't even know how to deal with machines vanishing from the network?

> > If they are, then we should move away from it immediately!
> 
> Hm, do have any options ? I still recommend Kademlia ;).

Yes, you mentioned the python client -- might be worth looking at.



> > > - How much better/worse pure Kademlia implementation (e.g. Kashmir) is 
> > > over the GISP 
> > >   protocol in face performance and fault-tolerance ?
> > 
> > in face -> w.r.t.
> 
> 
> I don't understand this :).

"in face performance" is not proper English. 
w.r.t (with respect to) is the correct term.

> > 
> > > - How well GISP is able to perform in adverse conditions, e.g., a
> > >   network partition occurs ?
> > >    
> > > - How well GISP is able to perform against different kind of
> > >   security attacks and what are the impacts ?
> > >    
> > > 
> > > Simulations
> > > ===========
> > > 
> > > If needed, using a simulation process we try solve research problems. 
> > > Also, 
> > > with a simulation process we are able test the GISP protocol without 
> > > having 
> > > to deploy real life experiments.
> > 
> > Not protocol - the core algorithms.
> 
> 
> Well, I see both the protocol and algorithm important. The algortihm (a
> routing table) gives a general abstraction and properties for the 
> system and the protocol defines how well the system is able to benefit
> from the abstraction (e.g., this is the case with Kademlia).

The actual implementation is not important - it can be changed.
The algorithms and the susceptibility to attacks is the big thing.

> > > - A hostile entity is able to reroute a data lookup to a incorrect 
> > >   destination peer during a data lookup process (e.g., e.g., 1000 Storm 
> > >   blocks are insterted into a 1000 Storm-server system in which a a 
> > > fraction
> > >   of peers are hostile. Perform data lookups 1000 lookups randomly so 
> > > that 
> > >   in every lookup process, one forwarding request is rerouted incorrectly 
> > > towards
> > >   randomly chosen destionation peer; calculate average block request 
> > > failure, 
> > >   average lookup length, number of timed-out lookups and the distribution 
> > > of 
> > >   lookup messages processed per peer).  
> > 
> > Also, hostile node can direct towards non-peers, or empty network spots
> > where it'll take a long time for the query to time out.
> 
> In the papers, mentioned above, there are some proposals for this kind
> of scenario. 
> 
> OTOH, Kademlia has a nice feature which makes Kademlia
> (rather) resistant to this attack: the "free choice" feature can be used
> to select peers with low latency or few network hops. Additionally, it
> can be used to select on arbitrary other criteria, such as avoiding
> peers that are unreachable (due to an incompletely connected underlying
> network), avoiding peers that are untrusted etc.

So we should start looking into the python client?

Remember also that if we use indices for transclusions, we have a fresh
problem as it becomes possible to spam...

        Tuomas




reply via email to

[Prev in Thread] Current Thread [Next in Thread]