Re: [Help-gnunet] finding files & database management


From: Krista Bennett
Subject: Re: [Help-gnunet] finding files & database management
Date: Fri, 12 Mar 2004 13:55:03 -0500
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7a) Gecko/20040219

Hopefully I can answer most of this; I spend a lot of my time out of the loop (and so sometimes the clever folk change parts of the scheme on me), but I think I can give you some sense of what the answer is and why with my usual annoying verbosity.

Benjamin Kay wrote:
With very little content currently on GNUnet, finding files isn't easy.

This is true, and has long been known to be an issue; however, without many users, we don't have much content. Now that GNUnet is increasingly stable, and with the Windows port under way, that may change in the future.

To complicate things, keyword matching in a search seems to be explicit and case sensitive.

This is also true. We could certainly add an option to the search utility to have it look for a keyword in various case configurations, but eliminating the case sensitivity in the encoding scheme itself would be a problem: since we look for keywords and hash-key-indexed content in the same way, this explicitness is simply part of how things work.

Doesn't mean we couldn't add something to the insert utility to automatically add stuff using various cases though!
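To make the case-sensitivity point concrete, here is a minimal sketch of what such an insert-side helper might look like. It is not GNUnet code: `keyword_key` and `case_variants` are hypothetical names, and SHA-256 merely stands in for whatever hash the encoding scheme actually applies to a keyword. The only thing it is meant to show is that each case variant hashes to an unrelated key, so each variant would have to be inserted separately.

```python
import hashlib


def keyword_key(keyword):
    """Stand-in for the keyword-to-lookup-key step (SHA-256 is an assumption)."""
    return hashlib.sha256(keyword.encode("utf-8")).hexdigest()


def case_variants(keyword):
    """Return the distinct original/lower/upper/title forms, in a stable order."""
    variants = []
    for candidate in (keyword, keyword.lower(), keyword.upper(), keyword.title()):
        if candidate not in variants:
            variants.append(candidate)
    return variants


# Each variant produces a different key, which is why an exact-match
# lookup on the hash cannot be made case-insensitive after the fact.
variants = case_variants("GNUnet")
keys = {variant: keyword_key(variant) for variant in variants}

for variant, key in keys.items():
    print(variant, key[:16])
```

The cost of this approach is one extra keyword block per variant, so a real tool would probably want to cap how many variants it generates.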

To make files I insert/index easier to find, I try to include as many relevant keywords as possible - but inevitably, I still think of a few additional keywords after I've inserted/indexed the file. The same goes for file descriptions. I know I can reinsert the file with the new keywords and descriptions, but that is costly in terms of processing time and requires meticulous record keeping on my part (I need to keep track of under what description and keywords the original file was inserted).

Well... I suppose it's possible to automate some sort of external record of what you've inserted under various keywords and have it point to the top block of the file so that you could continuously reindex that block with different keywords. That might be something useful to have.

That's really not so hard, methinks. As an aside, the problem with doing that is that the person using such a method then has a concrete record of the content they've inserted. From a "plausible deniability" standpoint, you open yourself up to trouble, as there's not only a concrete record of what you've inserted into the network, but a pointer to the file itself. If I insert something I don't want attributed to me (for example, my dissertation drafts :), it's probably not smart for me to intentionally retain a record. It doesn't hurt the network, just me, but it's something to think about.

This isn't a problem for the network in any sense, and I suppose it's no different than you keeping track of such stuff on your own.
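A minimal sketch of the kind of external record discussed above, purely for illustration. The record layout, the `record_insertion` helper, and the placeholder top-block reference are all assumptions; nothing here corresponds to an existing GNUnet tool or file format. Note that the record itself is exactly the deniability risk described above.

```python
import json


def record_insertion(records, top_block_ref, keyword, description):
    """Remember that `keyword` was used for the file behind `top_block_ref`.

    `records` maps a top-block reference to its keywords and description.
    """
    entry = records.setdefault(
        top_block_ref, {"keywords": [], "description": description}
    )
    if keyword not in entry["keywords"]:
        entry["keywords"].append(keyword)
    entry["description"] = description
    return records


records = {}
# "top-block-ref-placeholder" is a made-up value, not a real block reference.
record_insertion(records, "top-block-ref-placeholder", "dissertation", "Draft copy")
record_insertion(records, "top-block-ref-placeholder", "Dissertation", "Draft copy")

# Serialised form that could be kept on disk between sessions.
serialized = json.dumps(records, indent=2, sort_keys=True)
print(serialized)
```

With a record like this, adding a keyword later only needs the stored top-block reference rather than the whole file.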

So the short question and answer is: could something be incorporated so that you could add additional descriptions and keywords to an existing top block without a complete reinsertion/reindexing? Unless there have been some radical changes in the encoding scheme since the last time I looked at it, sure, I think it's possible (as long as the previously indexed/inserted top content block is still around or can be constructed). Christian, Igor, Nils, and company will correct me if I'm wrong, I'm sure.

Is there a way to modify the description and/or keywords of an inserted or indexed file without reinsertion?

As I said above, unless I'm forgetting something vital, it could be made possible to add to the keyword list given a reference to the top block.

Now, to actually "modify" the description, that's a bit more of a problem, and that has something to do with the censorship-resistant nature of the network. If I insert the same file 100 times under the same keyword, given that the filename of the highest block in the content tree is a function of the keyword, it should overwrite my local copy of that top block.

(Is that right Christian, or did you and Igor do something tricky and new I'm forgetting about?)

So in that case, you can "modify" the description by reinserting the top block with the same keyword as before but a different description as long as the only copy of that block is on your machine.

On the other hand, if that keyword block has migrated for any reason, you can't do a darned thing about the already existing keyword blocks that are out on the network. Nor should you be able to; if I insert my dissertation under the keywords "Kristas_dissertation" and the description "Draft copy of dissertation on stuff and things - do not take internally without consulting a physician", I don't particularly want anyone else to go through the network and change the description to "Important government dossier on weapons of mass destruction - use in government press briefings" for every single keyword block out there. So once it's out in the network, it stays there until more important content comes along and it fades away.

So, again, the short answer is that it could be done by just "reinserting" the top block alone with a new keyword or description, but if that top block has migrated, you're stuck with the two versions in the network.
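The overwrite-versus-migration behaviour can be sketched with two plain dictionaries. This is a toy model under stated assumptions, not the actual storage layer: `block_name` uses SHA-256 as a stand-in for "the filename is a function of the keyword", and the `remote` dictionary simply pretends that a copy of the keyword block has already migrated to another peer.

```python
import hashlib


def block_name(keyword):
    """Stand-in for deriving the keyword block's name from the keyword."""
    return hashlib.sha256(keyword.encode("utf-8")).hexdigest()


def insert_top_block(store, keyword, description, top_block_ref):
    """Store a keyword block; the same keyword always lands on the same name."""
    name = block_name(keyword)
    store[name] = {"description": description, "top_block_ref": top_block_ref}
    return name


local = {}
name = insert_top_block(local, "Kristas_dissertation", "Draft copy", "ref-placeholder")

# Pretend the block migrated: another peer now holds its own copy.
remote = {name: dict(local[name])}

# Reinserting under the same keyword overwrites only the local copy.
insert_top_block(local, "Kristas_dissertation", "Final copy", "ref-placeholder")

print(local[name]["description"])   # local description after reinsertion
print(remote[name]["description"])  # migrated copy is untouched
```

In this model the local store still has a single entry after the second insert, while the migrated copy keeps the old description, which is the "two versions in the network" situation.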

How about a way to view the descriptions and keywords of indexed files?

Unless you keep this externally (i.e. you keep a record of every keyword you've indexed) somehow, no; again, this is intentional. Part of what makes the AFS portion of GNUnet work is that you retain plausible deniability; this means that if someone goes to your machine and says "hey, we're going to confiscate your machine, search it, and destroy it because you have nude pictures of Dick Cheney on it", you can honestly say you had no way of knowing they were there short of brute force searching for nude pictures of Dick Cheney. (??!??!!)

Furthermore, all of the blocks you've got stored look the same to GNUnet, keyword-indexed or not, so unless you have the keywords somewhere, you can't do a reverse-lookup.
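As a rough illustration of why there is no reverse lookup: if blocks are stored only under names derived from a one-way hash of the keyword, the best anyone can do is guess keywords and test them. The sketch below assumes SHA-256 as the stand-in hash and uses hypothetical helper names; it is not how GNUnet lays out its database.

```python
import hashlib


def block_name(keyword):
    """Stand-in for the one-way keyword-to-name mapping (an assumption)."""
    return hashlib.sha256(keyword.encode("utf-8")).hexdigest()


def guess_matches(stored_names, guesses):
    """Return the guessed keywords whose derived name is present in the store."""
    return [guess for guess in guesses if block_name(guess) in stored_names]


# All the node sees is a set of opaque names.
stored_names = {block_name("Kristas_dissertation")}

# Only an exact guess hits; near misses (including case differences) do not.
hits = guess_matches(
    stored_names,
    ["dissertation", "kristas_dissertation", "Kristas_dissertation"],
)
print(hits)
```

Listing the keywords of everything on a node would therefore amount to brute-force guessing, unless the keywords were recorded somewhere else at insertion time.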

Along the same lines, is there a way to reindex a downloaded file without, well... reindexing it? I'm guessing that on nodes with content migration enabled, downloaded content gets inserted into the migration database. Perhaps there is a way to make it permanent on that node (index it) without wasting all that time manually reindexing it? And is it possible to reindex such files under their original descriptions and keywords?

Hrm... I'll let Christian handle that one. I'm probably just not parsing your question the way you intend it :)

I've probably confused you more than I've helped, but the case sensitivity thing and having some sort of external indexing utility sound like plausible (and fairly easy-to-implement) features to me. If I'm feeling ambitious this afternoon, maybe I'll even do something about it.

- Krista

--
***********************************************************************
Krista Bennett                             web.ics.purdue.edu/~bennetkl
Graduate Student in Linguistics              address@hidden
Purdue University
**                                                                   **
If you think education is expensive, try ignorance. - Benjamin Franklin
**                                                                   **





