myexperiment-discuss

Re: [Myexperiment-discuss] i need some workflows :-)


From: Antoon Goderis
Subject: Re: [Myexperiment-discuss] i need some workflows :-)
Date: Fri, 6 Apr 2007 07:36:27 +0100 (BST)

Hello

> Antoon - is there something we can start using immediately?  The goal at
> the moment is just to get some experience rather than putting lots of
> effort into making sure it's the right decision (that comes later!)

> > > In a way an Inchi is a canonical digest of a molecular structure.  You can
> > > imagine one or more types of digest of workflows such that google finds
> > > "similar" workflows, where "similar" will be a function of the digest.
> > > In the myExperiment context one would want some form of digest that
> > > works across multiple workflow systems.  I'm just thinking out loud - I
> > > expect people have thought about this already.  It came up because
> > > Don is mapping workflow descriptions to OAI.

The thing is: where do you want to measure similarity, and for what
purpose? What you include in the digest should reflect that purpose. I
don't believe in a single measure of similarity any more.

You could look for similar templates, similar end designs or
similar provenance graphs. The purpose could be to find inspiration, to
re-run someone else's workflow, to re-use it with slight parameter
adaptation, or to repurpose it by altering its structure. In general, I've
been focussing on similarity between end designs, to help people
repurpose their existing workflows based on those of others.

Depending on the purposes, having the workflow's overall signature
information can actually be enough.

I don't think Google is cut out to support all the purposes mentioned -
you're effectively looking at algorithm comparisons. I tried feeding the
Google API a serialization of a workflow, by turning the workflow into a
string of service names and seeing what came up in relation to the wf
repository - not much, as it turns out.

Over the summer I visited the Kepler crowd with a view to specifying a
common representation for end designs and provenance graphs. A common
signature representation is easy enough. It gets thorny once you need
to interpret from the diagram what the workflow schedule/control flow is
really like (e.g. with a repurposing scenario in mind). In the case of
Kepler, the ability to shove in different execution semantics in any
given workflow means you can get different service orderings during
execution. Having said that, most of their workflows are (currently) data
flow based pipelines and the diagrams can be read mostly from left to
right.

If you need a general cross-system representation to support workflow
discovery (and not do things like provenance logs reconciliation/merging)
I believe representing a workflow as a bag of services is a good middle
ground - in fact it is the solution Chris Wroe used 5 years ago to
represent workflows in the myGrid ontology. So I'd agree with Alex that adopting
simple measures will already yield useful applications. Comparing which
services are (not!) shared by workflows is a powerful discriminator.
At what level these services are then described, compared and their
similarities aggregated is another matter (URL, text, tags, ontology
concepts).
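
To make the bag-of-services idea concrete, here is a minimal sketch in
Python. It is an illustration only, not the myGrid ontology
representation or any existing tool: the function names and the service
names in the usage note are hypothetical, and services are compared by
plain lower-cased name, which is just one of the description levels
listed above.

```python
from collections import Counter


def bag_of_services(service_names):
    """Build a bag (multiset) of services from a list of service names.

    Assumption: a service is identified by its name alone; names are
    trimmed and lower-cased so trivial differences do not count.
    """
    return Counter(name.strip().lower() for name in service_names)


def compare_workflows(bag_a, bag_b):
    """Report which services two workflows share, which they do not,
    and a Jaccard-style overlap score between 0.0 and 1.0."""
    shared = bag_a & bag_b      # multiset intersection
    combined = bag_a | bag_b    # multiset union
    total = sum(combined.values())
    score = sum(shared.values()) / total if total else 0.0
    return {
        "shared": sorted(shared.elements()),
        "only_in_a": sorted((bag_a - bag_b).elements()),
        "only_in_b": sorted((bag_b - bag_a).elements()),
        "jaccard": score,
    }
```

For example, comparing a hypothetical workflow with services
fetch_sequence, run_blast and render_report against one with
fetch_sequence, run_blast, align_sequences and render_tree reports two
shared services out of five distinct ones, plus the services unique to
each side. The unshared lists are often as informative as the score.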

In terms of tool support, I think the closest we have to a reasonable
similarity tool for workflows is Woogle. Woogle regards Web services as a
bag of operations. I adapted it to play nice with Scufl workflows, the
mapping being Web service --> Taverna workflow and Web service operation
--> Taverna workflow Processor. Drawbacks are that (1) it takes time to
install and (2) it doesn't handle misspellings. There is also a workflow
ranking tool I built based on a graph matcher, which works reasonably
well on a single-author corpus but breaks down in the face of multiple
authors. I would be happy to help you with questions on either.

> > >> Do you think we should have URIs that is calculated from the workflow
> > >> like the INChI is from the molecular structure?  This would be idea
> > >> but will need some interesting fundamental work to show how
> > >> representations of workflows can be shown to be the same workflow etc
> > >> Jeremy

Maybe not URLs but perhaps timestamped certificates, issued when people
submit/publish workflow packages.

Antoon




