[Fenfire-dev] PEG swamp_easier--benja: An easier API for Swamp


From: Benja Fallenstein
Subject: [Fenfire-dev] PEG swamp_easier--benja: An easier API for Swamp
Date: Mon, 22 Sep 2003 05:09:44 +0300

==========================================================================
An easier API for Swamp
==========================================================================

:Authors:  Benja Fallenstein
:Created:  2003-09-22
:Status:   Current
:Scope:    Minor
:Type:     Interface
:Affect-PEGs: swamp_rdf_api--tjl


Tuomas always makes the point that Swamp must be fast,
because it is called in the inner loops of Fenfire.

But Swamp must also be easy to use, because it is
the API that everyone hacking Fenfire will have to learn
in order to do anything, so it is vital that it doesn't
have a steep learning curve.

(Besides, easy-to-read and easy-to-use APIs are of course
the right thing to have anyway.)


.. Issues
   ======

A flavor of the API
===================

First of all, we need a good way to iterate
through a set of triples. I propose the following
interface::

    for(Triples t = graph.get(_, RDF.type, _); t.loop();) {
        System.out.println(t.sub+" is instance of "+t.ob);
    }

I.e., have our own iterator-like thing, which iterates
through a set of *triples*-- rather than nodes-- but doesn't
need to create objects for every one of these triples.

For good measure, here's how the above code would look
in the current API::

    for(Iterator i=graph.findN_X1A(RDF.type); i.hasNext();) {
        Object sub = i.next();
        for(Iterator j=graph.findN_11X(sub, RDF.type); j.hasNext();) {
            Object ob = j.next();
            System.out.println(sub+" is instance of "+ob);
        }
    }

However, to be fair, my code isn't how it would look
when efficiency is at a premium. (Then again, when I print
to the console inside the loop, efficiency isn't at a
premium anyway... but whatever...) The *fast* version
would look like this::

    for(Triples t = graph.get_A1A(RDF.type); t.loop();) {
        System.out.println(t.sub+" is instance of "+t.ob);
    }

Not quite as straightforward, but still better than
what we have now.

In Jython, the loop would look like this::

    t = graph.get(_, RDF.type, _)

    while t.loop():
        print "<%s> is instance of <%s>" % (t.sub, t.ob)

A bit different than in Java, but still recognizable.


Changes
=======

We'll make it a convention that classes using the API
have this at the top::

    static final Object _ = null;

You don't have to have this, but it makes things easier to read.


``ConstGraph``
--------------

``ConstGraph`` shall have the following API
for getting triples::

    /** Get an iterator through all triples in the graph
     *  matching a certain pattern.
     *  If <code>subject</code>, <code>predicate</code> and/or
     *  <code>object</code> are given, the triples must match these.
     *  If any of the parameters is <code>null</code>,
     *  any node will match it.
     */
    Triples get(Object subject, Object predicate, Object object);

    // Versions that don't allow wildcards (null) as arguments
    Triples get_A11(Object predicate, Object object);
    Triples get_1A1(Object subject, Object object);
    ...

    /** Get the subject of the triple matching a certain pattern.
     *  If <code>subject</code>, <code>predicate</code> and/or
     *  <code>object</code> are given, the triple must match these.
     *  If any of the parameters is <code>null</code>,
     *  any node will match it.
     *  @return The subject of the triple, if there is one,
     *           or <code>null</code> if there is no such triple.
     */
    Object getSubject(Object subject, Object predicate, Object object);

    Object getSubject_A1A(Object predicate);
    ...

Note: The reason for having ``subject`` as a parameter
for ``getSubject()`` is that it's easier to read. It will
almost always be "``_``" (i.e., ``null``). It shall work
consistently, though: If a subject is given, and there is
such a triple in the graph, return that subject; otherwise,
return ``null``.

The plural form returns a set::

    /** Get the subjects of all triples matching a certain pattern.
     *  If <code>subject</code>, <code>predicate</code> and/or
     *  <code>object</code> are given, the triple must match these.
     *  If any of the parameters is <code>null</code>,
     *  any node will match it.
     *  <p>
     *  The set is immutable; it is <em>not</em> backed
     *  by the graph (i.e., changing the graph does not
     *  change the set.)
     */
    Set getSubjects(Object subject, Object predicate, Object object);

(Backing is harder to program and I don't see the pay-off,
since the ``getXXXs`` functions won't be used that often.)
The remaining query methods follow the same scheme::

    Set getSubjects_AA1(Object object);
    ...

    // getObject(), getObjects() similarly
    // getPredicate(), getPredicates() similarly

``getPredicate()`` is essentially useless, but we'll have it
for symmetry. ``getPredicates()`` is useful, mostly for
getting *all* predicates used in a graph.

Note that we don't have ``X`` in the function variants
any more, just ``1`` and ``A``, with ``A`` being equivalent
to passing ``null`` in that position to the generic method.

(E.g., ``getSubjects_AAA()`` is equivalent to
``getSubjects(_, _, _)``, returning the set of all subjects
in the graph.)
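
To make the matching rules concrete, here is a minimal sketch of
``getSubject()``, ``getSubjects()`` and one fixed-pattern variant over
a plain list of triples. The class name ``SketchConstGraph``, the
``triples`` field and the ``matches()`` helpers are hypothetical and say
nothing about how Swamp actually stores triples. The wildcard is called
``ANY`` rather than ``_`` because Java 9 and later reserve ``_`` as a
keyword.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical list-backed graph; not Swamp's real storage. */
class SketchConstGraph {
    /** Wildcard; stands in for the proposal's "_" constant. */
    static final Object ANY = null;

    // Each triple is stored as {subject, predicate, object}.
    final List<Object[]> triples = new ArrayList<>();

    /** A null pattern position matches any node. */
    static boolean matches(Object pattern, Object node) {
        return pattern == null || pattern.equals(node);
    }

    static boolean matchesTriple(Object[] t, Object s, Object p, Object o) {
        return matches(s, t[0]) && matches(p, t[1]) && matches(o, t[2]);
    }

    /** Subject of the first matching triple, or null if there is none. */
    Object getSubject(Object subject, Object predicate, Object object) {
        for (Object[] t : triples)
            if (matchesTriple(t, subject, predicate, object)) return t[0];
        return null;
    }

    /** Immutable snapshot of the subjects of all matching triples. */
    Set<Object> getSubjects(Object subject, Object predicate, Object object) {
        Set<Object> result = new LinkedHashSet<>();
        for (Object[] t : triples)
            if (matchesTriple(t, subject, predicate, object)) result.add(t[0]);
        // A fresh copy, wrapped read-only: later graph changes don't show up.
        return Collections.unmodifiableSet(result);
    }

    /** Fixed-pattern variant: the "A" positions behave like null. */
    Set<Object> getSubjects_AA1(Object object) {
        return getSubjects(ANY, ANY, object);
    }
}
```

The ``_AA1`` variant simply delegates to the generic method here, which
shows the intended equivalence only. A real implementation would
presumably use the variants to skip the ``null`` checks.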


``Triples``
-----------

The iterator-like object, ``Triples``, shall have
the following API::

    Object sub, pred, ob;

(These are ``null`` when the object hasn't been
initialized, i.e., ``next()`` hasn't been called yet.)
The methods are::

    /** Advance to the next triple. */
    void next();

    /** Whether there are any more triples to iterate through. */
    boolean hasNext();

    /** Indicate that this <code>Triples</code> object won't be
     *  used any more.
     *  This shall only be called by the code that has requested
     *  this object from <code>ConstGraph</code> (through
     *  <code>.get()</code>). Its purpose is to tell the
     *  <code>ConstGraph</code> that this object can be re-used for the
     *  next <code>get()</code>; <code>ConstGraph</code> can then
     *  cache <code>Triples</code> objects, making life easier
     *  for the garbage collector.
     *  <p>
     *  Calling this method is not obligatory. (If you don't,
     *  this object will be garbage-collected normally.)
     */
    void free();

    boolean loop() {
        if(hasNext()) {
            next();
            return true;
        } else {
            free();
            return false;
        }
    }

The purpose of ``loop()`` is to enable the common loop
pattern, ::

    for(Triples t = graph.get(...); t.loop();) {
        // ...
    }

which would otherwise have to be written as::

    Triples t;
    for(t = graph.get(...); t.hasNext(); t.next()) {
        // ...
    }
    t.free();

This isn't just harder to read, it also scopes ``t``
wrongly. With the ``loop()`` pattern, the scope of ``t``
is the body of the loop, which is exactly the code
executed before ``free()`` is called.
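
As a rough illustration of how ``loop()`` and ``free()`` could work
together with a cache on the graph side, here is a small sketch. The
names ``SketchTriples``, ``SketchCursorSource``, ``reset()``, ``getAll()``
and the ``pool`` list are assumptions made for this example only; pattern
matching is left out to keep the focus on cursor reuse.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical cursor over an array of {sub, pred, ob} rows. */
class SketchTriples {
    Object sub, pred, ob;                    // null until next() is first called

    private Object[][] rows;                 // current result set
    private int pos;                         // index of the next row to expose
    private final List<SketchTriples> pool;  // where free() returns this cursor

    SketchTriples(List<SketchTriples> pool) { this.pool = pool; }

    /** Re-target this cursor at a new result set. */
    void reset(Object[][] rows) {
        this.rows = rows;
        this.pos = 0;
        sub = pred = ob = null;
    }

    boolean hasNext() { return rows != null && pos < rows.length; }

    /** Advance to the next triple; no per-triple object is created. */
    void next() {
        Object[] r = rows[pos++];
        sub = r[0]; pred = r[1]; ob = r[2];
    }

    /** Hand this cursor back so a later request can reuse it. */
    void free() {
        if (rows == null) return;            // already freed
        rows = null;
        pool.add(this);
    }

    boolean loop() {
        if (hasNext()) { next(); return true; }
        free();
        return false;
    }
}

/** Hypothetical graph side: reuses freed cursors when it can. */
class SketchCursorSource {
    final List<SketchTriples> pool = new ArrayList<>();
    private final Object[][] all;

    SketchCursorSource(Object[][] all) { this.all = all; }

    /** Cursor over every stored triple. */
    SketchTriples getAll() {
        SketchTriples t = pool.isEmpty()
            ? new SketchTriples(pool)
            : pool.remove(pool.size() - 1);
        t.reset(all);
        return t;
    }
}
```

With this, ``for(SketchTriples t = src.getAll(); t.loop();) { ... }``
returns the cursor to the pool as soon as the loop runs off the end, and
the next ``getAll()`` call gets the same object back instead of
allocating a new one.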


``Graph``
---------

For changing graphs, the following API shall be used::

    /** Add a triple to this graph. */
    void add(Object subject, Object predicate, Object object);

    /** Remove all triples matching a certain pattern from this graph.
     *  If <code>subject</code>, <code>predicate</code> and/or
     *  <code>object</code> are given, the triple must match these.
     *  If any of the parameters is <code>null</code>,
     *  any node will match it.
     */
    void remove(Object subject, Object predicate, Object object);

    void remove_A1A(Object predicate);
    void remove_1AA(Object subject);
    ...

    /** Replace all triples with the given predicate and object
     *  with the given triple.
     */
    void setSubject(Object subject, Object predicate, Object object);

    /** Replace all triples with the given subject and predicate
     *  with the given triple.
     */
    void setObject(Object subject, Object predicate, Object object);

We don't have ``setPredicate()`` because it is essentially useless
and potentially harmful-- someone using it almost certainly
intended to do something else.

This is never a problem because the ``setXXX()`` methods
are only a convenience. You can always do::

    graph.remove(subject, _, object);
    graph.add(subject, predicate, object);

if you *do* happen to have some esoteric use for it.
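
The "remove, then add" reading of the ``setXXX()`` methods can be
sketched over a simple list of triples as follows. ``SketchGraph``, its
``triples`` field and the ``contains()`` helper are hypothetical, and
treating ``add()`` as a no-op for an already present triple is an
assumption of this sketch. The wildcard is named ``ANY`` because Java 9
and later no longer accept ``_`` as an identifier.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical list-backed mutable graph; not Swamp's real storage. */
class SketchGraph {
    /** Wildcard; stands in for the proposal's "_" constant. */
    static final Object ANY = null;

    // Each triple is stored as {subject, predicate, object}.
    final List<Object[]> triples = new ArrayList<>();

    private static boolean matches(Object pattern, Object node) {
        return pattern == null || pattern.equals(node);
    }

    /** True if at least one triple matches the pattern. */
    boolean contains(Object s, Object p, Object o) {
        for (Object[] t : triples)
            if (matches(s, t[0]) && matches(p, t[1]) && matches(o, t[2]))
                return true;
        return false;
    }

    /** Add a triple; assumed to be a no-op if it is already present. */
    void add(Object subject, Object predicate, Object object) {
        if (!contains(subject, predicate, object))
            triples.add(new Object[] {subject, predicate, object});
    }

    /** Remove all triples matching the pattern (null matches any node). */
    void remove(Object subject, Object predicate, Object object) {
        triples.removeIf(t -> matches(subject, t[0])
                           && matches(predicate, t[1])
                           && matches(object, t[2]));
    }

    /** Replace all triples with this predicate and object. */
    void setSubject(Object subject, Object predicate, Object object) {
        remove(ANY, predicate, object);
        add(subject, predicate, object);
    }

    /** Replace all triples with this subject and predicate. */
    void setObject(Object subject, Object predicate, Object object) {
        remove(subject, predicate, ANY);
        add(subject, predicate, object);
    }
}
```

For example, ``setObject(doc, title, "New")`` on a graph holding
``(doc, title, "Old")`` leaves exactly one ``title`` triple for ``doc``.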


Conclusion
==========

I believe this API will be substantially simpler to use
than the one we have at the moment, and not lose
anything w.r.t. speed. In fact, it may speed things up
in the future, because we can cache the ``Triples`` objects.

\- Benja




