
Re: [Help-gnunet] Re: Inserting large amounts of content?


From: Christian Grothoff
Subject: Re: [Help-gnunet] Re: Inserting large amounts of content?
Date: Mon, 21 Apr 2003 19:24:09 -0500
User-agent: KMail/1.5.1

On Monday 21 April 2003 17:34, Per Kreuger wrote:
> Christian Grothoff wrote:
> > The cause for this is known. Currently, GNUnet cannot make any exact
> > prediction of how big the database will be, so we put in some values
> > from experiments conducted by Eric -- and made them even a bit more
> > conservative. Yes, that should be fixed, but you can certainly work
> > around it.
>
> How?

Mostly by making the space-estimation code differentiate between the space
consumption of indexed data (112 bytes per block) and inserted data (1136
bytes per block). No gigantic problem, but as I said, the datastore code has
been completely reworked and the dust must still settle.
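
For the curious, the fix is essentially a per-block accounting change. A
minimal sketch in C (the names here are made up for illustration, this is
not the actual datastore code):

    #include <stdlib.h>

    /* On-disk cost per block: indexed content stores only a small
     * reference, inserted content stores the data itself. */
    #define INDEXED_BLOCK_SIZE   112   /* bytes per indexed block */
    #define INSERTED_BLOCK_SIZE 1136   /* bytes per inserted block */

    /* Estimate total space consumption instead of charging the
     * conservative worst case for every block. */
    static size_t
    estimateSpace (size_t indexedBlocks, size_t insertedBlocks)
    {
      return indexedBlocks * INDEXED_BLOCK_SIZE
           + insertedBlocks * INSERTED_BLOCK_SIZE;
    }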

> >>What is probably worse is a very distinct degradation of insertion speed.
> >>I enclose a PostScript file with the insertion rate listed and graphed
> >>over the several hours it took to insert the 946 files.
> >
> > Some degradation is to be expected with any database. You got a larger
> > degradation due to the uneven distribution of content to buckets (see
> > below).
>
> All my files had the same MIME type, and I used the keywords that
> libextractor extracts for that type in the insertion.
>
> Insertion rate with an empty database was on the order of 600K/sec. This
> quickly goes down to about 100K/sec and then slowly deteriorates to less
> than 50K/sec after filling about half the allocated storage.

If you really care about insertion speed, you may want to try mysql instead
of gdbm for the database. OTOH, I don't think 50K/sec is that bad -- databases
are faster at retrieving data than at inserting it, and if your peer can
sustain sending over 50K/sec of data into the network (plus control traffic
and traffic routed from other peers), you're talking about more bandwidth
than most people have => the database is not the bottleneck for most people.
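
(If memory serves, switching backends is just a config change; the section
and option names below are from memory and may differ in your gnunet.conf:

    [AFS]
    DATABASETYPE = "mysql"    # default is "gdbm"

You'll also need a MySQL server running that gnunetd can reach, of course.)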

Ok, insertion speed still matters a bit if you want to 'just' index your 20 GB 
MP3 collection, but Linux users should know how to put such a process into 
the background -- and we'll worry about the Windows users next year :-)
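
(Something along these lines, though the exact gnunet-insert options depend
on your version:

    $ nohup gnunet-insert /path/to/file.mp3 > insert.log 2>&1 &

nohup keeps the insertion running after you log out, and the log file lets
you check on the rate later.)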

> > Can you be more specific about 'crashes gnunetd' (see FAQ on bug
> > reporting...).
>
> I'll try the latest CVS and see if the same type of behaviour is present
> before running gnunet-check. If so I'll report it as a bug.
>
> With the latest CVS the insertion-rate behaviour is initially the same:
> going from 558K/sec to 100K/sec in about 30 minutes and 500M of content
> indexed.
>
>
> It may be the case that this type of testing is premature. You mentioned
> that you are working heavily on the storage mechanism. Is this type of
> testing at all useful for you, or would you rather that I wait until the
> next release?

It's very useful, since it's pretty much what I've been doing the whole week:
testing the current CVS to find the important problems. CVS may become the
next release very soon, so the more people are willing to test CVS and help
us fix the bugs before that, the better :-)

Christian



