Re: [Myexperiment-discuss] proposal: a "test and controls" section for e

From:	Paul Fisher
Subject:	Re: [Myexperiment-discuss] proposal: a "test and controls" section for experiments in myExperiment
Date:	Sat, 18 Oct 2008 15:17:09 +0100
User-agent:	Thunderbird 2.0.0.17 (Windows/20080914)

I see your point, but in most cases this will be handled directly by the service provider. An example is supplying a gene identifier when a protein identifier is needed. The service would recognise that you have put the wrong ID in, as it simply won't return any results, or it will return an error stating that the input was incorrect. This would be a service-side means of error checking. On the other hand, you COULD add error checks for the entire workflow, but take for example:


http://www.myexperiment.org/workflows/72

you can quickly see that the size of the workflow will be incredibly large if each input is to be checked before it is passed to the next service. Given that people choose to re-use workflows not only on "if it works" but also on the size of the workflow, and many other things, this may result in workflows no longer being used because they are too big to understand.

Perhaps this discussion should move to the Taverna-Users list instead, as a feature for Taverna or a workflow best-practice thought?!

I do know that a workflow decay monitor has been in production, and there may be plans for its integration into other projects, such as BioCatalogue. Not that I want to speculate here.


regards,
Paul.


Giovanni Marco Dall'Olio wrote:

On Sat, Oct 18, 2008 at 3:34 PM, Paul Fisher <address@hidden> wrote:


    How then do you propose to test the workflows, other than to
    download them and see if they work?
    Do you mean examples of their use, along with experimental results
    and a publication, to prove they work?

I am a sympathizer of the philosophy that says you should write unit tests before writing the code:

- http://www.extremeprogramming.org/rules/testfirst.html

That means that for every script I write, I first create test sets, and that I don't consider my programs as working until they pass all the tests correctly.

There are basically two kinds of tests you can write for bioinformatics: those that verify that your programs don't contain errors, and those that you run each time to prove that you are using the program correctly.

So, in myExperiment I would add a section that explains all the tests that have been run to ensure that the workflow is written correctly. That would be the first thing I'd check when deciding whether to re-use a workflow or not. In this section, you should add a list of all the tests, their descriptions (like the one I put in the first mail of this thread), and their results, along with the necessary input data. Other people will be able to re-run the tests on their computers and tell if they succeed.

For example: I publish a workflow on myExperiment that makes use of NCBI BLAST. Then the NCBI XML interface changes, but I don't notice that, so I don't update my workflow. The next time somebody else wants to re-use my workflow, he should be able to re-run the same tests with the same input files to see if he obtains exactly the same results, and know that there is something wrong if a different result is returned.
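To make the idea of re-running published tests concrete, here is a rough sketch in Python. Everything in it is an assumption made for illustration: the workflow is replaced by a toy function (reverse_complement), and the names rerun_tests, digest and published_tests are invented. It is not how myExperiment or Taverna store or run anything. Each published test keeps its input and a checksum of the output recorded when the workflow was known to work.

```python
import hashlib


def digest(text):
    """Checksum of an output, so large expected results need not be stored in full."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def rerun_tests(workflow, test_cases):
    """Run every stored test case and return the names of those whose output changed."""
    failures = []
    for case in test_cases:
        output = workflow(case["input"])
        if digest(output) != case["expected_sha256"]:
            failures.append(case["name"])
    return failures


def reverse_complement(seq):
    """Toy stand-in for a real workflow (hypothetical)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Recorded by the author at publication time, with the known-good output.
published_tests = [
    {"name": "revcomp of AACG", "input": "AACG", "expected_sha256": digest("CGTT")},
]

# An empty list means every published test still gives the recorded result.
print(rerun_tests(reverse_complement, published_tests))
```

If an underlying service changed its behaviour, the same call would return the names of the tests that no longer match, which is the signal that "there is something wrong".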

Second, I would add a section for the tests that should be used to demonstrate that you are using the workflow correctly.

Let's say you publish a workflow which has an input called 'fasta file'.

This would mean that the workflow needs a FASTA file as input; but maybe I could misunderstand your description and put the literal string 'fasta file' as input. Your workflow should contain a processor that checks that the input file is ok, and an output that should say 'input fasta file is not ok!' if it is not.

That would be the run-time test.
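A minimal sketch of such a checking step, written as a plain Python function rather than as a real Taverna processor. The name check_fasta and the "ok" message are made up for illustration; only the error message comes from the text above. The check is deliberately loose (a header line starting with '>', followed by at least one line of letters, '-' or '*'), so it is an assumption about what "ok" means, not a full FASTA validator.

```python
ERROR = "input fasta file is not ok!"


def check_fasta(text):
    """Return (ok, message) for a string that is supposed to hold FASTA records."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        return False, ERROR
    records = 0
    residues_in_record = 0
    for ln in lines:
        if ln.startswith(">"):
            # A new header: the previous record must have had some sequence.
            if records > 0 and residues_in_record == 0:
                return False, ERROR
            records += 1
            residues_in_record = 0
        elif all(ch.isalpha() or ch in "-*" for ch in ln):
            residues_in_record += len(ln)
        else:
            return False, ERROR
    if residues_in_record == 0:
        return False, ERROR
    return True, "input fasta file is ok"


print(check_fasta("fasta file"))       # the literal string from the example
print(check_fasta(">seq1\nACGTACGT"))  # a well-formed single record
```

The first call is rejected because the text does not start with a header line; the second is accepted.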

Tests are not always example files, but usually one writes at least one test with such inputs. I am not a programming guru myself, but I hope I have been able to explain what I want to say.

Do you want someone to upload example inputs for (as stated in my
last email).

Either way, you would only truly know if they worked if you
tested it for yourself, as with any other program!

Paul.

Giovanni Marco Dall'Olio wrote:

On Sat, Oct 18, 2008 at 3:00 PM, Paul Fisher <address@hidden> wrote:

Hi,

I understood what you were trying to say in your email, but I'm
not sure it came across properly. I think you may have confused a
few people with cross-discipline vocabulary :)

I hope I have this right in a summing up statement:

/You need examples of what the workflow should work on, more
precisely: inputs and outputs/

No.
I am talking about software testing
(http://en.wikipedia.org/wiki/Software_testing).
If I want to re-use a workflow from myExperiment, I want
something that proves to me that it works correctly.
It is good scientific practice. Just as you wouldn't use a
laboratory instrument that is not calibrated, you wouldn't use a
program that is not tested.

I don't know how to explain it better than I did in my
last mail. Maybe you can read this article:
- http://www.americanscientist.org/issues/pub/wheres-the-real-bottleneck-in-scientific-computing/1


           I think this has already been mentioned before - more precisely
           work on attachments (which I'm very keen to see personally).
           Correct me if I'm wrong though, people.

           regards,
           Paul.



           Giovanni Marco Dall'Olio wrote:

               Hi,
               I think you should add a section for describing 'Test and
               Controls' in the 'Detailed view' of every workflow in
               myExperiment.

               What do I mean?
               Protocols and pipelines are always tested in experimental
               biology.
               For example, let's say you want to design a new protocol for
               extracting DNA from blood samples.
               You will have to spend much of the time devising controls
               that will allow you to demonstrate that your protocol is good.
               You will need to demonstrate that PCR amplification doesn't
               amplify contaminations, you'll have to calibrate all the
               instruments, and include control and comparison samples.

               The same goes for any bioinformatics workflow. A pipeline for
               a scientific experiment should follow all the good laboratory
               practices; it doesn't matter if the instruments used are
               physical machines or bioinformatics tools.

               For example, I am going to write a script to calculate a
               statistic on a large amount of data.
               Up to now I have thought of three tests:
               - the workflow should fail if wrong input files are given
               - the workflow should give me the right result when I run it
                 on test data for which I already know the value of the
                 statistic.
               - If I create two random sets of sequences, one with more
                 variability than the other, the workflow should give me a
                 higher output value for the first set than for the second one.
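The three tests listed above could look roughly like this in Python. This is only a sketch under stated assumptions: the statistic (mean_pairwise_distance, the mean per-site difference between all pairs of sequences) and the helper random_set are invented stand-ins, since the mail does not say which statistic the script computes.

```python
import random
from itertools import combinations


def mean_pairwise_distance(seqs):
    """Hypothetical statistic: mean per-site Hamming distance over all pairs."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    if any(set(s) - set("ACGT") for s in seqs):
        raise ValueError("sequences must contain only A, C, G, T")
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / (len(pairs) * length)


def random_set(rng, n, length, mutation_rate):
    """Make n copies of one random sequence, mutating each site with the given probability."""
    base = [rng.choice("ACGT") for _ in range(length)]
    seqs = []
    for _ in range(n):
        seq = [rng.choice("ACGT".replace(b, "")) if rng.random() < mutation_rate else b
               for b in base]
        seqs.append("".join(seq))
    return seqs


def test_fails_on_wrong_input():
    # Test 1: wrong inputs must be rejected, not silently processed.
    for bad in (["fasta file", "fasta file"], ["ACGT"], ["ACGT", "AC"]):
        try:
            mean_pairwise_distance(bad)
        except ValueError:
            continue
        raise AssertionError(f"accepted bad input: {bad!r}")


def test_known_value():
    # Test 2: AAAA and AAAT differ at 1 of 4 sites, so the value must be 0.25.
    assert mean_pairwise_distance(["AAAA", "AAAT"]) == 0.25


def test_more_variable_set_scores_higher():
    # Test 3: a heavily mutated set must score higher than a lightly mutated one.
    rng = random.Random(0)  # fixed seed so the test is repeatable
    high = random_set(rng, 10, 200, mutation_rate=0.5)
    low = random_set(rng, 10, 200, mutation_rate=0.01)
    assert mean_pairwise_distance(high) > mean_pairwise_distance(low)


for test in (test_fails_on_wrong_input, test_known_value,
             test_more_variable_set_scores_higher):
    test()
```

The functions are named test_* so that a runner such as pytest would pick them up, but the loop at the end runs them without any extra tooling.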

               You should add a section where people can describe the kinds
               of tests with which their workflows have been calibrated.
               Eventually you could have two sections, one with the tests
               that have already been executed, and one for the tests that
               should be run each time the workflow is used.
               I think such a section would be very useful in myExperiment.
               Moreover, these tests could also act as examples, so workflows
               will be easier for other users to understand.

               I believe testing workflows is a very good practice, one that
               unfortunately not many bioinformaticians are used to :(.
               You should distinguish the workflows that provide test
               descriptions from the others, so people will be able to
               suggest how to design tests to those who are not used to
               doing that.

-------------------------------------------------------------


               My Blog on Bioinformatics (italian): http://bioinfoblog.it

------------------------------------------------------------------------


               _______________________________________________
               Myexperiment-discuss mailing list
               address@hidden
               http://lists.nongnu.org/mailman/listinfo/myexperiment-discuss

