gzz-dev

[Gzz] Address meanings, not contents! (Re: Storm blocks and metadata)


From: Reto Bachmann-Gmuer
Subject: [Gzz] Address meanings, not contents! (Re: Storm blocks and metadata)
Date: Thu, 27 Mar 2003 17:10:15 +0100

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Benja

It is necessary for the interpretation of the data we get; and it's usually easy to agree on (people won't too often assign different mime types to the same bytes). One thing about content hashes is, when two people put the same file into a hash-based system, they will use the same identifier for it. With MIME types, that's still pretty much true; with more elaborate metadata, it isn't.
I certainly wouldn't argue to put even more metadata in the URI.
Using the same identifier is important for queries like, "Which documents include this image?" If the three documents that use the image use three different kinds of IDs for it (because they refer to three different kinds of metadata), you're out of luck.
In the common-sense meaning of the question "Which documents include this image?", "this image" is not defined by the sequence of bytes that make up a specific JPEG version of it, but rather by a specific visual representation of a thing. Giving a URI to the image itself (in this encoding-independent, common-sense meaning) and referencing that URI rather than the URI of the byte sequence wherever possible allows answering queries that are closer to our real-world understanding of things. (What is concrete for us is fairly abstract for the computer: computers deal with abstractions over the raw data to produce what non-mathematicians can work with, and this abstraction process has to be pushed further to get to the semantic web.) By the way, the MIME type isn't that unambiguous either: for example, a text using only a restricted set of characters may be encoded to the same sequence of bytes under different encodings.
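The encoding remark above can be checked directly. The following is a minimal sketch, assuming SHA-1 as the content hash (the hash function and the sample strings are illustrative assumptions, not taken from the thread): an ASCII-only text produces identical bytes, and therefore an identical content hash, under three different character encodings, so the bytes alone do not say which encoding was meant.

```python
import hashlib

ENCODINGS = ["ascii", "utf-8", "iso-8859-1"]


def digests_by_encoding(text, encodings):
    """Return {encoding name: SHA-1 hex digest of the encoded text}."""
    return {name: hashlib.sha1(text.encode(name)).hexdigest() for name in encodings}


# Text restricted to ASCII characters: every encoding yields the same bytes.
ascii_text = "Address meanings, not contents"
ascii_digests = digests_by_encoding(ascii_text, ENCODINGS)
same_for_ascii = len(set(ascii_digests.values())) == 1

# Text with a non-ASCII character: UTF-8 and ISO-8859-1 produce different bytes.
umlaut_text = "Gm\u00fcr"
umlaut_digests = digests_by_encoding(umlaut_text, ["utf-8", "iso-8859-1"])
same_for_umlaut = len(set(umlaut_digests.values())) == 1

print("ASCII-only text, same hash under all encodings:", same_for_ascii)
print("Non-ASCII text, same hash under both encodings:", same_for_umlaut)
```

So a hash-plus-MIME-type identifier such as "text/plain; charset=..." could legitimately differ between two people who stored exactly the same bytes.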

(...)
Higher level applications should not use block-uris anyway but deal with an abstraction representing the content (like http urls should).
You mean as in, with content negotiation applied? You use a single URI which maps to different representations of the same resource?
You said it: the *same* resource. (But each representation is also a resource itself.)


An example to be more explicit:
<urn:urn-5:G7Fj> <DC:title> "Ulisses"
<urn:urn-5:G7Fj> <DC:description> "bla bli"

This, for example, I would not include here. :-) Firstly, it is something I would want to be versioned independently: if I change the description of an image, that should not create a new version of the image.
Surely not! Where I used a literal in the examples, one could use a URI representing the meaning of "bla bli". An attribute value of this URI would then be a URI for the English expression of that meaning; an attribute of that URI would be a URI representing this expression as spoken by John; and an attribute of that URI would be a Storm block of bytes with the MP3 encoding of it. I think you need a generic versioning system for RDF statements rather than for the data: a later statement must have a means to put an earlier statement out of the graph, while the older one should still be accessible in the style of the reification "I used to believe (s p v)".
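One way to picture versioning statements rather than data is an append-only log in which a later entry may name an earlier entry it supersedes. This is only a sketch under assumptions: the class `StatementLog`, its method names, and the index-based `retracts` field are hypothetical and are not part of Storm, Gzz, or any RDF library.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, value)


@dataclass(frozen=True)
class Entry:
    triple: Triple
    retracts: Optional[int] = None  # index of the earlier entry this one supersedes


class StatementLog:
    """Append-only log: nothing is deleted, later entries may retract earlier ones."""

    def __init__(self) -> None:
        self.entries: List[Entry] = []

    def state(self, triple: Triple, retracts: Optional[int] = None) -> int:
        if retracts is not None and not 0 <= retracts < len(self.entries):
            raise ValueError("can only retract an entry that already exists")
        self.entries.append(Entry(triple, retracts))
        return len(self.entries) - 1

    def _retracted(self) -> set:
        return {e.retracts for e in self.entries if e.retracts is not None}

    def current(self) -> List[Triple]:
        """Statements still in the graph."""
        gone = self._retracted()
        return [e.triple for i, e in enumerate(self.entries) if i not in gone]

    def used_to_believe(self) -> List[Triple]:
        """Superseded statements, still accessible as history."""
        return [self.entries[i].triple for i in sorted(self._retracted())]


log = StatementLog()
log.state(("urn:urn-5:G7Fj", "DC:title", "Ulisses"))
old = log.state(("urn:urn-5:G7Fj", "DC:description", "bla bli"))
log.state(("urn:urn-5:G7Fj", "DC:description", "bla blu"), retracts=old)
```

Changing the description adds a statement and retracts one; the URI of the image and the block holding its bytes are untouched, which matches the requirement that editing a description must not create a new version of the image.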

Secondly, I don't see a reason why the URI of the image would need to refer to this.
Me neither ;-). There must be a misunderstanding here.
Thirdly, I don't think that when a file is put into the system-- and thus given its identifier-- is necessarily the time to create this kind of metadata. It would seem to hold up the task at hand. Rather, I'd like to be able to add it later on, and maybe someone else can do that even better than me-- like a librarian who has scientific background in giving metadata about stuff.
Of course. Mechanisms of the application should probably add some metadata that gives the user a chance to find the data later, but there should always be the possibility to enter a new version of the metadata.

(...)

In this example, applications should reference "urn:urn-5:G7Fj" (which does not have a MIME type) rather than "urn:content-hash: Dj&/fjkZRT68" (which has a MIME type in a specific context) wherever possible; in many cases a higher abstraction, "urn:urn-5:lG5d", can be used.

Um, using a urn-5 doesn't work since it's just a random number-- if we use just a random number, we cannot check whether the data we may retrieve from a p2p network is really what the person making the reference wanted us to see. We would need to use "urn:foo:ref:[blah]", which would be the above RDF data, from which we could then get the specific representation.
The urn-5 URIs are intended to reference a certain concept/idea/meaning/topic; people are free to associate attributes with existing URIs. They may be subject to change, like terms in natural language are. If somebody wants to use a term in a specific sense, she has to make this explicit, maybe using digital signatures, but more often I think a key-free trust system (http://www.w3.org/2002/03/key-free-trust.html) is not only enough, but better adapted to "fuzzy" trust levels in a P2P network.
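The difference between the two kinds of identifier in this exchange can be sketched as follows. Assumptions: SHA-1 in hex as the hash, the "urn:content-hash:" prefix borrowed from the example earlier in this message, and a random string that merely stands in for a urn-5 (it is not the real urn-5 syntax). The function names are hypothetical.

```python
import hashlib
import secrets

PREFIX = "urn:content-hash:"


def content_hash_urn(data: bytes) -> str:
    """Derive the identifier from the bytes themselves."""
    return PREFIX + hashlib.sha1(data).hexdigest()


def matches(urn: str, data: bytes) -> bool:
    """Check retrieved bytes against a content-hash reference."""
    if not urn.startswith(PREFIX):
        # A random identifier carries no information about the bytes,
        # so there is nothing to verify against; trust must come from elsewhere.
        raise ValueError("not a content-hash URN")
    return urn == content_hash_urn(data)


original = b"bytes of one specific JPEG"
reference = content_hash_urn(original)

genuine_ok = matches(reference, original)
spoof_ok = matches(reference, b"bytes a malicious peer sent instead")

# Stand-in for a random identifier (illustrative only).
random_id = "urn:random-example:" + secrets.token_hex(6)
```

With the content hash, a spoofed block is detected by the receiver alone. With the random identifier, `matches` cannot decide anything, which is where signatures or a trust system as described above would have to step in.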

While you can make only limited use of HTTP to serve a block,

Why?
The only HTTP header you can send back is the length and, if you put it in the URI, the content type; most HTTP features are unused.

you could serve the URIs of both abstractions (urn:urn-5:G7Fj and urn:urn-5:lG5d) directly using HTTP 1.1 features.
(Again, you'd have to use hashes, or you could be arbitrarily spoofed.)
(Again. No good networking without trust mechanisms ;-)

(...)
And how do you split the metadata into blocks?

Well, depends very much on the application. How do you split metadata into files? :-)
Not at all ;-). Splitting into files is itself a rudimentary representation of metadata; if you use RDF, the filesystem is a legacy application.

Um, but if you put metadata on an http server, you split it too?
My approach would be to split the data just in time. To make it accessible over HTTP, the server could answer a standard request with all the statements in which a specific URI occurs, or only those in which it is the subject. An extended request could specify the level of expansion requested.
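A just-in-time split of this kind might look like the sketch below. Everything in it is an assumption made for illustration: the function name `statements_about`, the reading of "level of expansion" as the number of hops to follow from the requested URI, and the `ex:` predicates in the sample graph.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def statements_about(graph: List[Triple], uri: str,
                     subject_only: bool = False, depth: int = 1) -> List[Triple]:
    """Select the statements to return for one request.

    depth=1 returns statements mentioning `uri`; each further level also
    returns statements about the resources reached in the previous level.
    """
    selected: List[Triple] = []
    seen: Set[Triple] = set()
    visited: Set[str] = set()
    frontier: Set[str] = {uri}
    for _ in range(depth):
        reached: Set[str] = set()
        for triple in graph:
            s, _p, o = triple
            hit = s in frontier or (not subject_only and o in frontier)
            if hit and triple not in seen:
                seen.add(triple)
                selected.append(triple)
                reached.update((s, o))
        visited |= frontier
        frontier = reached - visited
    return selected


GRAPH: List[Triple] = [
    ("urn:urn-5:G7Fj", "DC:title", "Ulisses"),
    ("urn:urn-5:G7Fj", "ex:representation", "urn:content-hash:abc"),
    ("urn:content-hash:abc", "ex:mimeType", "image/jpeg"),
    ("urn:urn-5:doc1", "ex:includes", "urn:urn-5:G7Fj"),
]

as_subject = statements_about(GRAPH, "urn:urn-5:G7Fj", subject_only=True)
anywhere = statements_about(GRAPH, "urn:urn-5:G7Fj")
expanded = statements_about(GRAPH, "urn:urn-5:G7Fj", subject_only=True, depth=2)
```

Here `anywhere` also picks up the document that includes the image, which is the "Which documents include this image?" query from earlier in the thread, answered against the abstract URI rather than a byte-level one.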

(...)

Cheers,
Reto
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (Darwin)

iD8DBQE+gyJtD1pReGFYfq4RAgiFAKCEEvE6v/NwTl1ebjge5YPx9UAtqACgqXvF
RpcbVqiDuvMrGt9ReDMGZLI=
=TRAL
-----END PGP SIGNATURE-----




