Re: [Gzz] PEG: Storm P2P simulations using the GISP protocol


From: Tuomas Lukka
Subject: Re: [Gzz] PEG: Storm P2P simulations using the GISP protocol
Date: Tue, 3 Jun 2003 12:52:38 +0300
User-agent: Mutt/1.5.4i

On Tue, Jun 03, 2003 at 12:14:11PM +0300, Hermanni Hyytiälä wrote:
> =================================================================================
> PEG storm_gisp_simulation--hemppah: Storm P2P simulations using the GISP
> protocol
> =================================================================================
> 
> :Authors:  Hermanni Hyytiälä
> :Date-Created: 2003-06-02
> :Last-Modified: $Date: 2003/06/03 09:12:19 $
> :Revision: $Revision: 1.3 $
> :Status:   Incomplete
> 
> .. :Stakeholders:
> .. :Scope:    Major|Minor|Trivial|Cosmetic
> .. :Type:     META|Policy|Architecture|Interface|Implementation
> 
> .. Affect-PEGs:
> 
> 
> Performing GISP P2P simulations with Storm we can increase our
> understanding about GISP's scalability properties and on the other hand,
> possible issues related to scalability. Also, we want to know how GISP
> outperforms against different threats such as network partition or
> security attacks.

What do you mean "with Storm"?

Language: 
understanding about -> understanding
scalability properties and... -- the part after and is redundant.
outperforms -> performs

> This PEG discusses how these simulations should be planned, what are
> the research problems, hypothesis and issues.

Wrong order, still.

Research problems, hypotheses, the theoretical knowledge we have
about the hypotheses, and possible simulations to validate 
hypotheses. 

Issues (PEG issues) are implicit.

> Plan
> ====
> 
> First of all, we will create a PEG document (this document) which 
> discusses general aspects of the simulation process. Then, we plan 
> to program (rather short) test cases which will test the GISP/Storm 
> P2P properties, as discussed in this document. 

Nope. Still wrong order. Again: don't think about programming and simulations,
think about what questions you want answers for.

> Finally, we will 
> collect and analyse test cases' information and use this information
> in the future in our manuscripts.

... publish any interesting discoveries.

> We plan to perform all simulations on a single computer using the
> local loopback network interface to communicate with each other. 

Whether you need to even use the loopback depends SEVERELY on the 
nature of the simulation -- I'd think for the first simulations 
you will NOT need *ANY* network code.
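For example, something as dumb as the following already lets you measure
lookup hop counts for thousands of "peers" inside one process, with no
sockets at all. (A rough sketch only: the ring/finger structure below is
Chord-like, *not* GISP's actual routing, and all class/variable names are
invented for illustration.)

    // Purely in-process: no sockets, no threads, no Storm servers.
    // Builds a Chord-like ring (a stand-in, NOT GISP's routing) and
    // measures the mean lookup hop count against log2(N).
    import java.util.*;

    public class NoNetworkSim {
        static final long MOD = 1L << 32;          // 32-bit ID space

        // Clockwise distance from 'from' to 'to' around the ring.
        static long cw(long from, long to) {
            return (to - from + MOD) % MOD;
        }

        public static void main(String[] args) {
            int n = 1000, lookups = 10000;
            Random rnd = new Random(1);

            // n distinct peer IDs, kept sorted around the ring.
            TreeSet<Long> idSet = new TreeSet<Long>();
            while (idSet.size() < n) idSet.add(rnd.nextLong() & (MOD - 1));
            long[] ids = new long[n];
            int i = 0;
            for (long id : idSet) ids[i++] = id;

            // finger[p][k] = first peer clockwise from ids[p] + 2^k.
            long[][] finger = new long[n][32];
            Map<Long, Integer> index = new HashMap<Long, Integer>();
            for (int p = 0; p < n; p++) {
                index.put(ids[p], p);
                for (int k = 0; k < 32; k++) {
                    Long f = idSet.ceiling((ids[p] + (1L << k)) % MOD);
                    finger[p][k] = (f != null) ? f : idSet.first();
                }
            }

            // Route random lookups greedily and count hops.
            long totalHops = 0;
            for (int q = 0; q < lookups; q++) {
                long target = ids[rnd.nextInt(n)];
                long cur = ids[rnd.nextInt(n)];
                int hops = 0;
                while (cur != target) {
                    long best = cur;
                    for (long f : finger[index.get(cur)])
                        if (cw(f, target) < cw(best, target)) best = f;
                    cur = best;
                    hops++;
                }
                totalHops += hops;
            }
            System.out.println("N=" + n
                + "  mean hops=" + (double) totalHops / lookups
                + "  log2(N)=" + (Math.log(n) / Math.log(2)));
        }
    }

Swapping the toy routing table for GISP's real rule would then be the only
GISP-specific work, and it would still run entirely in-process.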

> The purpose is that Storm-servers would act as if they were on
> different machines. The simulation is run under a standard Linux/Java
> environment.

Umm, no. If we want to understand the theoretical properties first,
we don't want to use Storm servers as there are too many complexities there
that will affect the results and make them unreadable.

> Research problems
> =================
> 
> By using simulation as a research method, we try to test different
> kinds of properties of the GISP protocol without having to deploy real
> life experiments. There are a number of research problems which we try
> to solve (or understand better) using the simulation process:

Umm, why do you keep harping on about simulations?
Why not just say: 

        For determining whether Storm with unmodified GISP
        is practical, we want the answers to the following questions.

If we can answer the questions without any simulations, it would be much better!

> - How well GISP can scale if there are a lot of concurrent peer joins and
>   leaves in the system? What about lookup efficiency when the network
>   grows?

Isn't the "what about" part a separate question?

> - How well GISP is able to perform in adverse conditions, e.g., when a
>   network partition occurs?
> 
> - How well GISP is able to perform against different kinds of
>   security attacks and what are the impacts?

Ok, these sound good. 

ISSUE: (means you need to add this to the issues section)

        Why GISP? Why are we using it versus some other systems?
        What properties does it share with others, to such a degree
        that its performance might be deduced from theirs?

ISSUE:
        
        What experiments / simulations / methods are used in the literature?

> For now, we assume that simulation network is rather optimal, e.g.,
> there are no network latencies in the simulation network. In the
> future, however, we plan to perform simulations in a non-optimal
> network.

Don't talk about simulation even here! This is not about simulation,
this is about answering the above questions!

        At first we'll assume  ideal conditions w.r.t. ...

BTW: The correct word is ideal, not optimal: ideal is a theoretical,
unachievable simplification, but optimal is just best of the currently
existing ones.

> Hypothesis
> ==========
> 
> - GISP can scale rather well when peers join and leave the system at a 
>   constant/static rate for a given time period and cost of
>   joining/leaving is logarithmic (e.g. Start with 1000 blocks and 1000
>   Storm-servers, 10 peer(s) joins/leaves every 5 seconds).

What do you base this hypothesis on? 

What scales w.r.t. what?

Constant rate or constant fraction, if you scale up the number of peers?

Quite inexact still. I wouldn't know how to test this and say
whether it is true or false.
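For example, one testable form of this hypothesis (with made-up numbers,
just to show the shape of a claim that can actually be falsified):

        With N peers and a churn rate that is a fixed fraction of N
        (say 1% of peers joining/leaving per time unit), the mean
        number of messages caused by one join stays below

                a + b * log2(N)

        for some constants a, b over N = 10^3 ... 10^5.

Something like that you can actually plot and reject.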

> - GISP can scale well and is adaptable if the cost of join/leave is 
>   logarithmic when peers join and leave the system constantly
>   and the variable rate for joining/leaving changes greatly (e.g., Start
>   with 1000 blocks and 1000 Storm-servers. 1-10 peer(s) joins/leaves
>   every 1-10 second(s), at a given time suddenly 100-900 peers
>   joins/leaves randomly).

Same question again.

> - GISP's data lookup is efficient if the number of of lookup length
>   grows with a logarithmic growth inspite that the number of
>   Storm-servers increases linearly (e.g. 10-10000 Storm-servers, 10000
>   Storm blocks, with 10-10000 Storm-servers perform 10000 lookups
>   randomly)

"Number of of lookup length"???

> - A GISP peer is not able to handle all requests properly when a great
>   amount of query requests is directed towards a single peer/few peers
>   (a peer is responsible for a given key). Thus, there can be
>   query/routing hotspots in the system and load balancing properties may
>   not be scalable/tolerant against a hostile attack (e.g., 1000
>   Storm-server system, each server hosting 1-10 Storm block(s), 1-900
>   peers (randomly chosen) query a single key every 1-10 second(s);
>   calculate average block request failure, average lookup length, number
>   of timed-out lookups and the distribution of lookup messages processed
>   per peer).
>      
> - GISP is rather fault-tolerant if 80% of lookups are successful when
>   20% of peers die (this is Chord's simulation result) (e.g., 1000 Storm
>   blocks are inserted into a 1000 Storm-server system. After insertions,
>   1-99% of servers die randomly or in a controlled way. Before GISP
>   starts rebuilding routing tables, perform 1000 Storm block fetches;
>   calculate average block request failure, average lookup length and
>   number of timed-out lookups).
>  
> - A hostile entity is able to reroute a data lookup to an incorrect
>   destination peer during a data lookup process (e.g., 1000 Storm blocks
>   are inserted into a 1000 Storm-server system in which a fraction of
>   peers are hostile. Perform 1000 data lookups randomly so that in every
>   lookup process, one forwarding request is rerouted incorrectly towards
>   a randomly chosen destination peer; calculate average block request
>   failure, average lookup length, number of timed-out lookups and the
>   distribution of lookup messages processed per peer).

Same problems apply to all these even more - needs to be far
more exact as to what you're claiming.

Why are you talking about 1000?

> Issues
> ======

I'd prefer to have Issues right after abstract.


> - How many virtual peers and Storm blocks are we able to simulate on a
>   single machine (e.g., with 256Mb of memory)?

This depends on the level of abstraction - I've been trying to tell
you no real network connections are needed. If you make an abstract
enough simulation, you should be able to do millions.
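Back-of-envelope (pure guess at the constants: say ~200 bytes per
simulated peer for an ID plus a small routing table):

        256 MB / ~200 bytes per peer  ~=  1.3 million peers

so the 256Mb figure only becomes a problem if every simulated peer drags
real Storm/GISP objects along with it.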

> - In the future, do we want to perform simulations in a LAN cluster (or
>   similar)?
>   
>     RESOLVED: Yes, if we want to simulate huge virtual networks and/or
>     memory requirements are too massive for a single desktop.
>    
> - For now, do we need "real" Storm blocks during simulations or not ?
>  
>     RESOLVED: No, since we want to make our simulator environment 
>     as lightweight as possible (See issue #1). In the future, however,
>     it is possible that we use "real" Storm blocks instead of "number"
>     blocks.

*WHY*?

> Changes
> =======
> 
> We will program simulation test cases into the Storm CVS module.
> Currently, no changes are required to the Storm implementation codebase.

	Tuomas



