From: Tom Oinn
Subject: [Myexperiment-discuss] Re: [Taverna-hackers] Wrapping a Runner with a web service
Date: Tue, 10 Jun 2008 14:27:39 +0100
User-agent: Thunderbird 2.0.0.14 (Windows/20080421)

Hi Scott, all,

On the face of it a workflow is an entity that consumes some data, performs some processing and returns some results, having potentially also had a side effect on its execution environment. At this level it seems perfectly plausible that we can turn a workflow into a service.

The problems start to emerge when we match the full capabilities of the workflow system against the limitations of 'vanilla' web services.

1) Workflows can run for a long time. Potentially they can run for months; if you were to expose a workflow of that kind as a service you'd have to move beyond the synchronous invocation pattern that a conventional SOAP service supports.

- There are existing ways to do this, of course; we're not the only ones with long-running services. But they immediately move you out of the 'really simple service' realm. Whether you do this with e.g. WSRF or with a custom service interface (so that your service interface is to the workflow *engine*, not to the workflow itself) is another choice to make.

2) Workflows can consume and emit very large data. Web services have no consistent mechanism to handle this: there are approaches based on attachments, and there are ad hoc mechanisms such as returning URLs to results and accepting URLs as inputs.

Both these issues may or may not be a problem in any given case, and there are certainly simple workflows (short running, small data) which could be wrapped up in a naive synchronous service interface. Exactly what values of 'short' and 'small' we use here is a gray area, but it's not going to give you much room to work in.

Another option is to create a hybrid system, where you force all data to be passed by reference (HTTP URLs are pretty good for this). Taverna can internally handle this, and we can even force the workflow to return results in that form so your service logic wouldn't have to do very much. This works around the data size issue (as URLs are 'small') but not the run time problems.
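To make the pass-by-reference idea concrete, here is a minimal sketch in Python. It is not Taverna code: `ReferenceStore`, `run_by_reference` and the `http://example.org/data/` base URL are all hypothetical names, and the store is an in-memory stand-in for whatever would really serve the data over HTTP. The point is only that the service boundary sees small URLs going in and small URLs coming out.

```python
import uuid


class ReferenceStore:
    """Hypothetical in-memory stand-in for a store that serves data at HTTP URLs."""

    def __init__(self, base_url="http://example.org/data/"):
        self.base_url = base_url
        self._items = {}

    def register(self, data):
        """Store a value and hand back a URL-shaped reference to it."""
        url = self.base_url + uuid.uuid4().hex
        self._items[url] = data
        return url

    def resolve(self, url):
        """Dereference a URL previously returned by register()."""
        return self._items[url]


def run_by_reference(store, workflow, input_urls):
    """Run `workflow` with URL inputs and return URL outputs.

    `workflow` is assumed to be any callable taking a dict of named
    values and returning a dict of named values.
    """
    inputs = {name: store.resolve(url) for name, url in input_urls.items()}
    outputs = workflow(inputs)
    return {name: store.register(value) for name, value in outputs.items()}
```

A caller would register its (possibly large) input once, pass only the URL in the service call, and fetch the result from the returned URL when it actually needs the bytes.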

The run length issues could be mitigated by adopting the typical 'submit, poll, get results, destroy' pattern. Of course, you'd have to have some way of identifying the workflow instance you wanted to interact with as part of the call (or in WS or HTTP headers).
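The 'submit, poll, get results, destroy' pattern can be sketched roughly as follows. This is an illustration only, under several assumptions: `WorkflowJobs`, `poll_until_done` and the status strings are hypothetical names, the workflow is any Python callable, and a background thread stands in for the workflow engine. The job id returned by `submit` is the instance identifier that each later call has to carry.

```python
import threading
import time
import uuid


class WorkflowJobs:
    """Hypothetical job manager: submit, poll, get results, destroy."""

    def __init__(self):
        self._jobs = {}
        self._lock = threading.Lock()

    def submit(self, workflow, inputs):
        """Start the workflow in the background and return its job id."""
        job_id = uuid.uuid4().hex
        job = {"status": "RUNNING", "result": None, "error": None}

        def _run():
            try:
                result = workflow(inputs)
                with self._lock:
                    job["result"] = result
                    job["status"] = "COMPLETE"
            except Exception as exc:  # record the failure rather than lose it
                with self._lock:
                    job["error"] = str(exc)
                    job["status"] = "FAILED"

        with self._lock:
            self._jobs[job_id] = job
        threading.Thread(target=_run, daemon=True).start()
        return job_id

    def poll(self, job_id):
        """Return the current status; raises KeyError for an unknown id."""
        with self._lock:
            return self._jobs[job_id]["status"]

    def get_results(self, job_id):
        """Return the result of a completed job."""
        with self._lock:
            job = self._jobs[job_id]
            if job["status"] != "COMPLETE":
                raise RuntimeError("job is " + job["status"])
            return job["result"]

    def destroy(self, job_id):
        """Forget the job and release whatever it was holding."""
        with self._lock:
            del self._jobs[job_id]


def poll_until_done(jobs, job_id, interval=0.01, timeout=5.0):
    """Client-side polling loop; returns the final status."""
    deadline = time.monotonic() + timeout
    while jobs.poll(job_id) == "RUNNING":
        if time.monotonic() > deadline:
            raise TimeoutError("job did not finish in time")
        time.sleep(interval)
    return jobs.poll(job_id)
```

In a real service each of the four methods would be a separate operation, and for a workflow that runs for months the polling interval and the lifetime of the job record become design decisions in their own right.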

There is, however, a final problem with 'really' making these things available as services. In T2 the 'workflow engine' is really a set of components such as security agents, reference management systems and the like. To run a workflow you need to assemble and mutually configure these components. In some cases there is an obvious trivial assembly which could be done implicitly, but in general you're looking at configuring this federation prior to workflow launch, and that would immediately make the service Taverna-specific.

There's nothing to say, of course, that these approaches are mutually exclusive. We could envisage a hierarchy of functionality where, as you move up the hierarchy, you get increased control (configuration, monitoring, support for long-running workflows) at the cost of increased interface complexity. At the top of this hierarchy would be the full-blown peer management service framework; at the bottom would be vanilla synchronous web services (with no exposure of the workflow system at all). Any given application would live somewhere between these two extremes.

Tom



