
Re: [Myexperiment-discuss] Re: googling an exp-object


From: Steve Pettifer
Subject: Re: [Myexperiment-discuss] Re: googling an exp-object
Date: Fri, 1 Jun 2007 08:21:58 +0100

To state the obvious, there is no point in being able to google for (a link to a representation of) an experiment object unless you can do something with the (representation of the) experiment object - cue a Taverna plugin for IE :-)

We've been thinking along similar lines in terms of finding and using biological objects from databases, e.g. sequences, structures, networks and whatnot. These are far from concrete ideas at this stage, but they're relevant to the myExperiment problem too.

In UTOPIA at the moment, objects are found using the FindOMatic interface, which does simple keyword searches on a number of common databases and returns results in a unified form that we can convert for viewing with the visualisation tools. This is ok as far as it goes, but it's only half of the 'finding stuff' problem -- a one-to-many mapping from keywords / simple queries to multiple resources. The other common way people find things is by opportunistic browsing of their favourite site, where they may come across a resource they like (a workflow, service, protein sequence, etc.). That's more of a many-to-one problem. The question, both for UTOPIA and myExperiment, is how to get that resource neatly into some kind of application so that it can be used for something useful rather than just viewed on a web page.
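
For what it's worth, the one-to-many half of that is roughly the following shape (a hypothetical TypeScript sketch, nothing to do with the real FindOMatic code; the result fields and per-source search functions are just placeholders):

    // The shape of the one-to-many search: fan a keyword query out to
    // several databases and normalise whatever comes back into one form.
    interface UnifiedResult {
      source: string; // which database the hit came from
      id: string;     // source-specific identifier
      title: string;  // human-readable label
      url: string;    // link back to the original record
    }

    // Each source gets wrapped in its own search function that knows how
    // to query that database and map its results into the unified form.
    type SourceSearcher = (keywords: string) => Promise<UnifiedResult[]>;

    async function unifiedSearch(
      keywords: string,
      sources: SourceSearcher[]
    ): Promise<UnifiedResult[]> {
      // Query every source in parallel; a slow or broken database just
      // drops out of the results rather than sinking the whole search.
      const settled = await Promise.allSettled(sources.map(s => s(keywords)));
      return settled
        .filter((r): r is PromiseFulfilledResult<UnifiedResult[]> =>
          r.status === "fulfilled")
        .flatMap(r => r.value);
    }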

There seem to be a variety of options:

1) Use a mime/helper application setup. This requires some co-operation from the server side in terms of generating content that can be recognised by the client end as being targeted at a particular helper application. In the case of myExperiment this may be ok (e.g. a bit of Scufl, or some other thing that we have control over?) -- for UTOPIA, which has to deal with much more heterogeneous resources that aren't under our control, this isn't so good. Advantage: easy to implement, and fits neatly with browsers' helper apps (especially if they are Java).
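
Something like this is what I mean on the server side (a rough TypeScript/Node sketch only; the .scufl extension and the media type are placeholders rather than anything official):

    import * as http from "http";
    import * as fs from "fs";
    import * as path from "path";

    // Serve workflow files under a distinctive media type so that the
    // browser hands them to whatever helper application (e.g. Taverna)
    // the user has registered for that type.
    http.createServer((req, res) => {
      if (req.url && req.url.endsWith(".scufl")) {
        res.writeHead(200, {
          // Placeholder media type -- whatever myExperiment and the
          // helper app agree on would go here.
          "Content-Type": "application/x-scufl+xml",
          "Content-Disposition": 'attachment; filename="workflow.scufl"'
        });
        fs.createReadStream(path.join(".", req.url)).pipe(res);
      } else {
        res.writeHead(404);
        res.end("not found");
      }
    }).listen(8080);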

2) Use a plugin (as suggested by Alan above). Advantage: the plugin can watch content as it comes past and scrape out important info. Scraping is not such a problem if the content is under our control, a bit more messy if it's not (though it would only have to recognise the possibility of meaningful content, which could then be retrieved behind the scenes via web services if they exist). The disadvantage, if we want to be inclusive, is that we'd have to maintain plugins for at least Microsoft Internet Explorer and variations of Firefox, and probably for other popular browsers too such as Safari, Konqueror, Opera etc. The core of these could be common, but the mechanisms for interfacing with each browser would be a bit messy. cf. Greasemonkey and Zotero for examples of this done in Firefox for document management.
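
A Greasemonkey-style version of this might look roughly like the following (TypeScript sketch; the .scufl test and the localhost helper endpoint are made up for illustration):

    // Runs in the page (Greasemonkey-style): look for links that appear to
    // point at workflows and attach a control that hands them to a locally
    // running helper, rather than just viewing them on the page.
    function annotateWorkflowLinks(doc: Document): void {
      const anchors = Array.from(doc.querySelectorAll<HTMLAnchorElement>("a[href]"));
      for (const a of anchors) {
        if (!a.href.endsWith(".scufl")) continue; // crude recognition rule
        const btn = doc.createElement("button");
        btn.textContent = "Open in Taverna";
        btn.addEventListener("click", () => {
          // Hand the resource off to a local helper/service; the port and
          // path here are invented for the sake of the example.
          window.open("http://localhost:9090/open?workflow=" +
                      encodeURIComponent(a.href));
        });
        a.insertAdjacentElement("afterend", btn);
      }
    }

    annotateWorkflowLinks(document);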

3) Use a proxy. In this setup, you point your browser, whatever it may be on whatever OS, at a lightweight local specialised proxy. The proxy can scan the http stream as it comes through, looking for likely content. If it spots something, it can augment the html with controls that would cause the data to be retrieved (either by scraping, or via web services). This has the advantage of being browser and OS independent, but the disadvantage that users have to configure their browser to point at the proxy, which raises issues of trust and the like (e.g. a stronger perception that some software is 'watching' what you're browsing for).
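
A bare-bones sketch of such a proxy (again TypeScript/Node, with HTTPS, streaming and error handling all glossed over; the .scufl recognition rule is just a stand-in for proper content detection):

    import * as http from "http";

    // A local forward proxy: the browser is configured to send its plain
    // http traffic here, and we fetch each page on its behalf.
    http.createServer((clientReq, clientRes) => {
      const upstream = http.request(
        clientReq.url!, { method: clientReq.method },
        upstreamRes => {
          const type = upstreamRes.headers["content-type"] || "";
          if (!type.includes("text/html")) {
            // Not HTML: pass it straight through untouched.
            clientRes.writeHead(upstreamRes.statusCode || 200, upstreamRes.headers);
            upstreamRes.pipe(clientRes);
            return;
          }
          // Buffer the HTML so it can be augmented before being passed on.
          let body = "";
          upstreamRes.setEncoding("utf8");
          upstreamRes.on("data", chunk => (body += chunk));
          upstreamRes.on("end", () => {
            // Tag anything that looks like a workflow link so injected
            // controls (or a later script) can pick it up.
            const augmented = body.replace(
              /(href="[^"]+\.scufl")/g,
              '$1 data-workflow="true"');
            clientRes.writeHead(200, { "content-type": "text/html" });
            clientRes.end(augmented);
          });
        });
      clientReq.pipe(upstream);
    }).listen(8888); // browser's http proxy setting would point at localhost:8888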

4) Use a javascript link, a la del.icio.us. In this setup, a simple bookmark can essentially post a page to a server, which then scrapes it for content. Very lightweight, and more or less browser independent. However, it has the disadvantage that it's a push on the user's part rather than automatic scanning (i.e. if you don't post the page, the system can't see its content).
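
The bookmarklet body would be something on the order of the following (the /scrape endpoint is hypothetical, and this passes the page's URL as a query string rather than posting the whole page):

    // The body of a del.icio.us-style bookmarklet; in practice this would
    // be minified into a single javascript: URL and kept as a bookmark.
    (function () {
      // Hypothetical scrape endpoint on the myExperiment side.
      const target = "http://www.myexperiment.org/scrape";
      // Send the current page's address (and title) to the server, which
      // then fetches and scrapes the page behind the scenes.
      window.open(
        target +
        "?uri=" + encodeURIComponent(window.location.href) +
        "&title=" + encodeURIComponent(document.title));
    })();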

The core technology of all of these would be similar; only the way they integrate with the browser would differ. There are issues of maintainability here, and also some more subtle ones about perceived trust, configuration cost and such. I suspect the answer is that all the options are viable, and different approaches will suit different people.

Any thoughts?

Steve


