
Re: [Help-gnunet] INDIRECTION_TABLE_SIZE and download speed


From: Igor Wronsky
Subject: Re: [Help-gnunet] INDIRECTION_TABLE_SIZE and download speed
Date: Sat, 7 Sep 2002 15:55:42 +0300 (EEST)

On Fri, 6 Sep 2002, James Blackwell wrote:

> The problem is that GNUNet is by design causing data to get stuck on the
> originating hosts by allowing that new data being published to be
> concurrently requested, causing all of the files of value to get stuck in
> the tiny little straw of a pipeline most people have. 

I see your point. I just don't see a neat technical solution.
A straightforward way would be for the node to keep track of 
which blocks are indexed locally and which blocks belong to 
which file. Once some block is accessed, prevent the downloading
of blocks belonging to other locally indexed files for a certain 
period of time. The extension to multiple-files-simultaneously 
is straightforward. 
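
Just to make the idea concrete, here is a rough C sketch of that
bookkeeping (every name, the fixed-size table and the 5-minute
lock period are inventions for this sketch, nothing from the
actual gnunetd code):

  #include <string.h>
  #include <time.h>

  #define MAX_LOCAL_FILES 64
  #define LOCK_SECONDS    300  /* arbitrary lock period for this sketch */

  /* One entry per locally indexed file (names invented). */
  typedef struct {
    unsigned char fileId[20];   /* identifier of the indexed file */
    time_t        lockedUntil;  /* 0 if the file is not locked */
  } LocalFileLock;

  static LocalFileLock locks[MAX_LOCAL_FILES];
  static int numFiles;

  /* Called when a block of 'fileId' is served: lock every OTHER
     locally indexed file for LOCK_SECONDS. */
  void lockOtherFiles(const unsigned char *fileId) {
    time_t now = time(NULL);
    int i;
    for (i = 0; i < numFiles; i++)
      if (memcmp(locks[i].fileId, fileId, 20) != 0)
        locks[i].lockedUntil = now + LOCK_SECONDS;
  }

  /* Should a request for a block of 'fileId' be answered now? */
  int mayServe(const unsigned char *fileId) {
    time_t now = time(NULL);
    int i;
    for (i = 0; i < numFiles; i++)
      if (memcmp(locks[i].fileId, fileId, 20) == 0)
        return locks[i].lockedUntil <= now;
    return 1;  /* not locally indexed; no lock applies */
  }

A request handler would call mayServe() before answering and
lockOtherFiles() after serving a block; allowing m files to be
unlocked simultaneously is just a matter of more bookkeeping.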

Even that will cause plenty of problems. 1) A node knowing
what belongs together ruins deniability. 2) Suppose node A
offers 20 files, of which 19 are locked because we are
sending out one. Suppose search matches for all 20 have
been sent out previously, but nobody has downloaded any
of the actual files. Now it will seem to anyone attempting to
download a "wrong" (locked) file that it's not available.
The user will most likely get frustrated and not try for that
file again, thinking the host gone. 3) How do we handle
request priorities in this case? What will a low-priority
user think when he's put on hold without any explanation for
the stopping of the download? Or will we allow started low-
priority downloads to block high-priority requests?

Some of that could be aided by providing a reply
message like FOUND_BUT_BUSY. Currently the only 
answer is either the actual content or nothing, I think.
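
Just as illustration, such a reply could be a small message
along these lines (the type code and all field names are my
inventions, not part of any real GNUnet protocol):

  #include <stdint.h>

  /* Hypothetical reply sent instead of silence when the content
     exists locally but is temporarily locked.  The type value,
     field names and the retry hint are all invented here. */
  #define P2P_PROTO_FOUND_BUT_BUSY 0x42  /* made-up message type */

  typedef struct {
    uint16_t size;        /* total message size, network byte order */
    uint16_t type;        /* P2P_PROTO_FOUND_BUT_BUSY */
    uint8_t  query[20];   /* hash of the requested block */
    uint32_t retryAfter;  /* hint: seconds until the lock should expire */
  } FoundButBusyMessage;

The retryAfter hint would let the downloader back off quietly
instead of concluding that the content is gone.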

> For the most part this is not true for p2p networks, though there is one
> vital exception: Getting the original data off of the originating node.
> That's because there is only *one* place that has this data set. 

On Freenet that doesn't hold. Inserted content is distributed
to TTL hosts on insert. I think gnunet should push content out
as well, but I haven't been able to convince CG of how that'd
fit the economy model - because I don't know a good answer
myself. Content pushing would partly solve the issue though, 
because the load would be instantly distributed, and we 
are inserting just one file at a time.

Content pushing is also mandatory for another reason. That
is because the hashes of the content blocks and the identity of 
the inserting node *have no correlation whatsoever*. This means
that if there are 50000 nodes, and the content is on inserting
node A, in the worst case all 50000 nodes have to be traversed
for the content to be found at all. If somebody wants to do a
little exercise in probability, he can count the expected number
of hops required for a block to be found in a network of size n,
provided the block is on exactly one of those hosts. In addition,
imagine that a file consists of k blocks, and the routing doesn't
remember the route of the previously requested block.
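
Worked out, assuming the search probes distinct nodes in a
uniformly random order, the exercise goes like this: the one
host holding the block is equally likely to sit at any position
in the probe order, so

  \[ E[\mathrm{probes}] = \sum_{i=1}^{n} i \cdot \frac{1}{n}
     = \frac{n+1}{2} \]

and for a file of k blocks with no route memory the expected
total is k(n+1)/2. With n = 50000 that is about 25000 probes
per block.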

Blocks need to travel to nodes having identities close to content
hashes for anything to be successful in a large net. The first 
migration is the hardest, because that's when the blocks 
are hardest to find.
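
What "close" means is just some metric between node identities
and content hashes; a XOR-style comparison like the one below
would be one possible choice (an assumption of this sketch, not
something taken from the gnunetd sources):

  #include <stddef.h>

  /* Return <0, 0 or >0 depending on whether identity 'a' is
     closer to 'target' than identity 'b' under a XOR metric.
     The metric itself is an assumption of this sketch. */
  int closerTo(const unsigned char *target,
               const unsigned char *a,
               const unsigned char *b) {
    size_t i;
    for (i = 0; i < 20; i++) {
      unsigned char da = (unsigned char)(a[i] ^ target[i]);
      unsigned char db = (unsigned char)(b[i] ^ target[i]);
      if (da != db)
        return (da < db) ? -1 : 1;
    }
    return 0;
  }

On migration a node would then prefer the neighbour whose
identity is closest to the block's hash, so that later queries
can home in on the content.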

> I can't help but ask. Are you really intending to tell the 50,000 people
> that will be using gnunet two years from now that they can publish at 
> will if nobody is interested in the published information, 

I'm intending to tell them to get a PhD in distributed systems,
game theory and economics, and after that come and tell me how
to do it. Or do it themselves. That's the open source spirit. ;)

> on socrates, but most would certainly consider that junk.  No. I don't feel 
> qualified to determine what is valuable and what is crap. I'm sorry for
> wandering onto a tangential issue, but I do have to ask. Do you feel that you
> could make the distinction between that which has value and that which is
> "junk"?

Yes. If people like britney spears nekkid, they can insert
britney spears nekkid and my node can host blocks of 
britney spears nekkid (not knowing what the blocks are), 
if other hosts also answer my queries for me. However, 
I don't think it my business to go looking for and inserting 
britney spears nekkid if I don't care about that myself. 

Let's suppose I like apples. There's huge demand for
oranges. If I insert oranges, how will that help me? 
The orange-liking people will think my node a good
guy, but will that bring me more apples, when they dislike
them and don't have any? The apple-liking people will 
think me worthless, because I'm offering so many oranges 
that hardly any apples get out of my node. Besides, 
if there aren't many apples, the apple people might 
not insert anything either, or go elsewhere.

Of course the economics count blocks, not apples or oranges,
but on a higher level, catering for oranges when wanting
apples is probably just harmful, as the orange market
chokes the apple market.

So I'd still say that people should insert only stuff they
like (e.g. apples) and that way support and enhance the market
for that type of content.

> to replace it. I couldn't possibly hope to implement this with my current
> level of education.

And I'd need a much more convincing, detailed explanation and
analysis to implement it with my current level of brains. Luckily
some other developers might be brighter. ;)

> Sorry if I have wasted your time.

Of course not. I'm pessimistic mainly because I don't see
a trivial way to go about it. And if it's not trivial,
it will mean thinking, and working, and why should I be
the one to do it? I'd rather just watch tv and eat 
ice cream and have people pay my rent for me and ... ;)
Perhaps the best rule of thumb in open source devel
is "if you really want it, do it yourself". Pointing
out the problem is *a very good* start, I admit that,
but for the actual implementation it might not suffice.

For this particular problem, I don't know if CG wants
to add NO_GO_NOW replies to the protocol, but it'd seem 
to require that; otherwise the users won't know
what gives and will think the net even worse than it is now
(when they can at least get that 200cps trickle).


I.




