Re: [Help-gnunet] INDIRECTION_TABLE_SIZE and download speed


From: James Blackwell
Subject: Re: [Help-gnunet] INDIRECTION_TABLE_SIZE and download speed
Date: Sun, 8 Sep 2002 15:06:00 -0400
User-agent: Mutt/1.4i

Sorry for the very long delay. Very interesting weekend for me here. :)

On Sat, Sep 07, 2002 at 03:55:42PM +0300, Igor Wronsky wrote:
> 
> > The problem is that GNUNet is by design causing data to get stuck on the
> > originating hosts by allowing that new data being published to be
> > concurrently requested, causing all of the files of value to get stuck in
> > the tiny little straw of a pipeline most people have. 
> 
> I see your point. I just don't see a neat technical solution.
> A straightforward way would be for the node to keep track of 
> which blocks are indexed locally and which blocks belong to 
> which file. Once some block is accessed, prevent the downloading
> of blocks belonging to other locally indexed files for a certain 
> period of time. The extension to multiple-files-simultaneously 
> is straightforward. 

That's what I have in mind. It's messy, but it works.
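
To make the scheme concrete, here is a minimal Python sketch of the
"one locally indexed file at a time" rule described above. It is not
GNUnet code: the class name, the block-to-file mapping and the 60 second
hold period are all assumptions made up for illustration.

```python
import time


class IndexedFileGate:
    """Serve blocks of only one locally indexed file at a time (hypothetical)."""

    def __init__(self, block_to_file, hold_seconds=60.0, clock=time.monotonic):
        self.block_to_file = block_to_file  # assumed mapping: block hash -> file id
        self.hold_seconds = hold_seconds    # assumed hold period, not a GNUnet setting
        self.clock = clock
        self.active_file = None             # file currently being served
        self.last_access = None             # time of the last served block of that file

    def may_serve(self, block_hash):
        file_id = self.block_to_file.get(block_hash)
        if file_id is None:
            return True  # not locally indexed content: the gate does not apply
        now = self.clock()
        expired = (self.last_access is None
                   or now - self.last_access >= self.hold_seconds)
        if expired or file_id == self.active_file:
            # Either nothing is being served any more, or this block belongs
            # to the file we are already serving: serve it and refresh the hold.
            self.active_file = file_id
            self.last_access = now
            return True
        return False  # another local file is "locked in" for now
```

Note that a refused request does not refresh the hold, so a locked file
becomes available again once requests for the active file stop for
`hold_seconds`.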

> Even that will cause plenty of problems. 1) node knowing
> what belongs together ruins deniability 2) suppose node A

Maybe I really misunderstand what's going on. I figured that local
deniability has already been lost because of database.list? After all,
odds are we still have the original files sent on that node anyways and
pragmatically speaking, once physical control of a machine is lost, the
privacy is lost as well. If the user wishes to avoid the loss of control
issue, he could store gnunet data right alongside his original data on his
cfs filesystem.

> offers 20 files out of which 19 are locked, because we
> are sending out one. Suppose search matches for all 20 
> have been sent out previously, but nobody has downloaded any 
> of the actual files. Now it will seem to anyone attempting to 
> download a "wrong" (locked) file that it's not available. 

Which, when you think about it, is true, since even under the current setup
it's for all practical purposes still not available. :)

> User will most likely get frustrated and not try for that 
> file again, thinking the host gone. 3) how to handle

It would still be an improvement over nobody getting anything, correct? 

> request priorities in this case? What will low-priority
> user think when he's put on hold without explanation for
> the stopping of the dl? Or will we allow started low
> priority downloads to block high priority requests?

He'll think "Awww screw Metallica today. I'll go get me some Megadeth
instead and grab some Lars Ulrich tomorrow."
 
> Some of that could be aided by providing a reply
> messages like FOUND_BUT_BUSY. Currently the only 
> answer is either the actual content or nothing, I think.

I think FOUND_BUT_BUSY would be a bad idea because our originating host
would be the only node that would give that reply for that particular
file. That might not be solid proof to others that we're going to be the
originating node, but it would point a heavy finger in that direction.

Let's look at a more serious example: someone publishing information about
human rights abuses in China. Over a short period of time, China could find
dissidents releasing negative news about China by requesting it, and start
watching closely the nodes of people who seem to have the only copy of
blocks that match "china news". People in China have disappeared for
twenty years on more circumstantial evidence than this.

 
> > For the most part this is not true for p2p networks, though there is one
> > vital exception: Getting the original data off of the originating node.
> > That's because there is only *one* place that has this data set. 
> 
> On freenet that doesn't hold. Inserted content is distributed
> to TTL hosts on insert. I think gnunet should push content out
> as well, but I haven't been able to convince CG of how that'd
> fit the economy model - because I don't know a good answer
> myself. Content pushing would solve the issue partly though, 
> because the load would be instantly distributed, and we 
> are inserting just one file at a time.

That's right. I forgot about that. I think pushing content is the right
answer for freenet and the wrong one for gnunet. Freenet's be-all and
end-all purpose is true anonymity, while my understanding of gnunet is
that performance and anonymity are held at equal levels of priority.

The Freenet model requires all data to be pushed out without any regard to
its perceived value to others. In this case, our Project Gutenberg loving
dialup user can indeed publish the repository, but the cost is enormous.

I really think that gnunet is going to perform well precisely because data
doesn't leave nodes until somebody wants it. I think the only trick
involved is trickling out original content at a rate the originator can
handle.
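
One common way to "trickle" data at a rate the originator can handle is a
token bucket. The Python sketch below is only an illustration of that
general technique, not anything taken from GNUnet; the class name and the
rate and burst parameters are assumptions.

```python
import time


class TrickleBucket:
    """Token bucket: about `rate` bytes/second on average, bursts up to `burst`."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = float(rate)      # assumed upload budget in bytes per second
        self.burst = float(burst)    # assumed maximum burst size in bytes
        self.clock = clock
        self.tokens = float(burst)   # start with a full bucket
        self.stamp = clock()

    def try_send(self, nbytes):
        """Return True and spend tokens if `nbytes` may be sent right now."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.stamp) * self.rate)
        self.stamp = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False  # over budget: the caller should drop or delay the reply
```

A node would call `try_send(len(block))` before answering a request for
original content and simply not answer (or queue the reply) when it
returns False.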

> Also in another way content pushing is mandatory. That
> is because the hashes of the content blocks and the identity of 
> the inserting node *have no correlation whatsoever*. This means
> that if there are 50000 nodes, and the content is on inserting

Just playing devil's advocate here....

Not true for unique data against even moderately powerful adversaries.
Consider the RIAA. They could easily toss 100k at a mainframe that
connects to all 50k nodes at once and requests their artists' names. Any
time they see an answer to a search that comes from only one person,
they have a good idea who started offering it in the first place, though
it probably takes several related items over time to get a very good idea
who your insurgents are.
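
The correlation step of that attack can be sketched in a few lines of
Python. This is a toy model under strong assumptions: the observer is
assumed to see exactly which node answered each query (ignoring any
indirection in the routing), and the function name, input format and
threshold are made up for illustration.

```python
from collections import Counter


def likely_originators(responses, min_hits=3):
    """Return nodes that were the sole responder for at least `min_hits` queries.

    `responses` maps each query string to the set of node ids that answered it
    (an assumed input format for this sketch).
    """
    sole = Counter()
    for nodes in responses.values():
        if len(nodes) == 1:
            # Exactly one node answered: count it as a possible originator.
            sole[next(iter(nodes))] += 1
    return sorted(node for node, hits in sole.items() if hits >= min_hits)
```

The `min_hits` threshold reflects the point above that a single unique
answer is weak evidence, while several related ones over time are not.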

> > I can't help but ask. Are you really intending to tell the 50,000 people
> > that will be using gnunet two years from now that they can publish at 
> > will if nobody is interested in the published information, 
> 
> I'm intending to tell them to get PhD in distributed systems,
> game theory and economics, and after that come to tell me how
> to do it. Or do it themselves. That's the open source spirit. ;)

I'm working on it. Two years of college down, about six to ten more to
go. I think I'm probably the exception though. You'll be lucky if you
manage to get most people to read the installation instructions... I don't
see you convincing many people to sign up at your local university.

Good, bad or otherwise, as soon as gnet starts picking up in popularity
you're going to get *swamped* by slashdot loving armchair quarterbacks,
and I'm worried that when the mob of 'em starts showing up you're going to
be overwhelmed by it.

There might be other solutions that I can't think of, but it seems to me
that the only three options you'll have are to either A) make gnunet
immune to basic stupidity, B) go the Martin IDE maintainer route, or C)
go the Linus route and send most correspondence to /dev/null.

> > could make the distinction between that which has value and that which is
> > "junk"?
> 
> Yes. If people like britney spears nekkid, they can insert
> britney spears nekkid and my node can host blocks of 
> britney spears nekkid (not knowing what the blocks are), 
> if other hosts also answer my queries for me. However, 
> I don't think it my business to go looking for and inserting 
> britney spears nekkid if I don't care about that myself. 
> 
> Let's suppose I like apples. There's huge demand for
> oranges. If I insert oranges, how will that help me? 

Doesn't your trust level go up because of serving oranges, thus allowing
you a higher likelihood of harvesting apples even when rare, because of
"favors" done from one node to another, much like bartering?

> The orange-liking people will think my node a good
> guy, but will that bring me more apples, as they dislike
> them and don't have any? The apple liking people will 

For the same reason I may have some oranges, others may have some apples.
For the moment, let's try to set aside the morality and legality issues
involving mp3s. I'm willing to bet that when you look at larger mp3
collections you will find mp3s that the repository owner has no
interest in owning but keeps for bartering power to gain more things he
does like.
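
The bartering idea can be sketched as a per-peer credit ledger. The
Python below is a simplified guess at how "favors" between nodes might
be counted, not a description of gnunet's actual economy; the class and
method names are hypothetical.

```python
class TrustLedger:
    """Toy per-peer credit ledger (hypothetical, for illustration only)."""

    def __init__(self):
        self.trust = {}  # peer id -> credit this node extends to that peer

    def record_served(self, peer, blocks=1):
        """`peer` answered one of our queries: raise its credit with us."""
        self.trust[peer] = self.trust.get(peer, 0) + blocks

    def effective_priority(self, peer, requested):
        """Honor a requested priority only up to the peer's earned credit.

        The granted amount is charged against the peer's credit, so favors
        are spent as they are used.
        """
        granted = min(requested, self.trust.get(peer, 0))
        self.trust[peer] = self.trust.get(peer, 0) - granted
        return granted
```

In this model the ledger counts blocks, not apples or oranges: credit
earned by serving popular content can be spent on requests for rare
content, but only at the nodes where it was earned.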

> think me worthless, because I'm offering so many oranges 
> that hardly any apples get out from my node. Besides, 
> if there aren't many apples the apple people might 
> not insert anything either or go elsewhere.
> 
> Of course the economics count blocks, not apples or oranges,
> but on a higher level the catering for oranges when wanting
> apples is probably just harmful, as the orange market
> chokes the apple market.

I see your point. I think my example was a poor one though. I was thinking
more of "less interesting but related" information, such as...  Well, for
this argument to work, we have to set aside legality and morality for
the moment and pretend that the mp3 world were legal.

"Bob" has an interest in They Might Be Giants, Pink Floyd, Depeche Mode
and a few others. However, as time has passed, Bob has also accumulated
quite a bit of cruft, which for the sake of argument we'll call Milli
Vanilli and Vanilla Ice. When Bob "donates" his collection to gnunet, he's
much more likely to serve all of his music rather than just the stuff he
personally likes. It's not really so much a case of apples vs. oranges as
it is a case of "getting the strawberries out of the fruit cocktail".

> > to replace it. I couldn't possibly hope to implement this with my current
> > level of education.
> 
> And I'd need much more convincing, detailed explanation and
> analysis to implement it with my current level of brains. Luckily
> some other developers might be brighter. ;)

Grin. :) I had originally intended for Christian to see my thoughts and
comment on them. However, I have gotten you to at least look at it from a
logical standpoint. If I'm on to something, then you are there to explain
eloquently to CG what I was barely able to stammer out in crude terms. If
I'm way off in left field, then CG doesn't have to step down from his dais
in order to explain to a mere mortal where I stand, since you would
have done so.


> > Sorry if I have wasted your time.
> 
> Of course not. I'm pessimistic mainly because I don't see
> a trivial way to go about it. And if it's not trivial,
> it will mean thinking, and working, and why should I be
> the one to do it? I'd rather just watch tv and eat 
> ice cream and have people pay my rent for me and ... ;)

Grin. I'm not trying to convince you to do anything. I'm just trying to
get the thought into your head so that it can sit there in the back of
your mind until a magical solution pops up in your head.

> Perhaps the best rule of thumb in open source devel
> is "if you really want it, do it yourself". Pointing
> out the problem is *a very good* start, I admit that,
> but for the actual implementation it might not suffice.

But I did try to provide a logical solution to a logical problem. 

Let me ask you this.... if someone were to find a logical flaw in ssh2
that was akin to the flaw in ssh1, but could not create an appropriate
patch to the problem, would it be proper for that person to keep his mouth
shut rather than tell the developer?  Ok. that's not a parallel example,
so forget that one...

What I'm trying to do is explain what I think is a potentially serious
problem that might render the work you're doing useless. I do this well
aware that you don't owe me a patch, or for that matter anything at all.
Even if you were to agree that I may be on to something, that doesn't mean
anything changes. If *I* want it fixed, I have two choices: either get
meself an edumacation and do it, or pay someone else that already got
h(im|er)self an edumacation to do it. But would it be wise for
me to devote resources towards something before I managed to convince you
there was a problem in the first place?

> For this particular problem, I don't know if CG wants
> to add NO_GO_NOW -replies to the protocol, but it'd seem 
> to require that, or otherwise the users won't know
> what gives and think the net even worse than now (when
> they can get that 200cps trickle).

I'm not convinced that, if the code didn't change, users would be
guaranteed to get even that much for long, though that would be for IW or
CG to decide.

-- 
GnuPG fingerprint AAE4 8C76 58DA 5902 761D  247A 8A55 DA73 0635 7400
James Blackwell  --  Director http://www.linuxguru.net



