On Sat, Oct 18, 2008 at 3:00 PM, Paul Fisher <address@hidden> wrote:
Hi,
I understood what you were trying to say in your email, but I'm
not sure it came across properly. I think you may have confused a
few people with cross-discipline vocabulary :)
I hope I have this right in a summing up statement:
/You need examples of what the workflow should work on, more
precisely: inputs and outputs/
No.
I am talking about software testing
(http://en.wikipedia.org/wiki/Software_testing).
If I want to re-use a workflow from myExperiment, I want something
that proves to me that it works correctly.
It is good scientific practice. Just as you wouldn't use a laboratory
instrument that is not calibrated, you wouldn't use a program that is
not tested.
I don't know how to explain it better than I did in my last mail.
Maybe you can read this article:
- http://www.americanscientist.org/issues/pub/wheres-the-real-bottleneck-in-scientific-computing/1
I think this has already been mentioned before - more precisely
work on attachments (which I'm very keen to see personally).
Correct me if I'm wrong though, people.
regards,
Paul.
Giovanni Marco Dall'Olio wrote:
Hi,
I think you should add a section describing 'Test and
Controls' to the 'Detailed view' of every workflow in
myExperiment.
What do I mean?
Protocols and pipelines are always tested in experimental
biology.
For example, let's say you want to design a new protocol for
extracting DNA from blood samples.
You will have to spend much of your time devising controls
that allow you to demonstrate that your protocol is good.
You will need to demonstrate that the PCR amplification doesn't
amplify contaminants, calibrate all the instruments, and
include control and comparison samples.
The same goes for any bioinformatics workflow. A pipeline for
a scientific experiment should follow all the good laboratory
practices; it doesn't matter whether the instruments used are
physical machines or bioinformatics tools.
For example, I am going to write a script to calculate a
statistic on a large amount of data.
Up to now I have thought of three tests:
- the workflow should fail if wrong input files are given
- the workflow should give me the right result when I run it
on test data for which I already know the value of the statistic.
- If I create two random sets of sequences, one with more
variability than the other, the workflow should give me a
higher output value for the first set than for the second one.
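As a rough illustration, the three tests above could be sketched in
Python as follows. The statistic used here (mean number of pairwise
differences between equal-length DNA sequences) and every function
name are assumptions made up for the example; they do not come from
any real workflow on myExperiment.

```python
import itertools
import random

ALPHABET = "ACGT"


def mean_pairwise_differences(sequences):
    """Hypothetical stand-in for the workflow: average number of
    differing positions over all pairs of equal-length DNA sequences."""
    sequences = list(sequences)
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    for seq in sequences:
        if length == 0 or len(seq) != length:
            raise ValueError("sequences must be non-empty and of equal length")
        if set(seq) - set(ALPHABET):
            raise ValueError("unexpected character in sequence: %r" % seq)
    pairs = list(itertools.combinations(sequences, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


def random_sequences(rng, n, length, mutation_rate):
    """Hypothetical helper: n copies of one random sequence, with each
    site redrawn at random with probability mutation_rate."""
    base = [rng.choice(ALPHABET) for _ in range(length)]
    result = []
    for _ in range(n):
        seq = [rng.choice(ALPHABET) if rng.random() < mutation_rate else c
               for c in base]
        result.append("".join(seq))
    return result


def test_rejects_wrong_input():
    # Test 1: the workflow should fail if wrong input is given.
    for bad in ([], ["ACGT"], ["ACGT", "AC"], ["ACGT", "ACGX"]):
        try:
            mean_pairwise_differences(bad)
        except ValueError:
            continue
        raise AssertionError("accepted bad input: %r" % (bad,))


def test_known_value():
    # Test 2: data for which the statistic is already known.
    # Worked out by hand: the pairwise distances are 1, 2 and 1.
    result = mean_pairwise_differences(["AAAA", "AAAT", "AATT"])
    assert abs(result - 4 / 3) < 1e-9
    assert mean_pairwise_differences(["AAAA", "AAAA"]) == 0.0


def test_more_variable_set_scores_higher():
    # Test 3: the more variable random set must give the higher value.
    rng = random.Random(42)  # fixed seed so the test is repeatable
    low = random_sequences(rng, 20, 200, mutation_rate=0.01)
    high = random_sequences(rng, 20, 200, mutation_rate=0.5)
    assert mean_pairwise_differences(high) > mean_pairwise_differences(low)
```

The functions are named test_* so that a test runner such as pytest
would pick them up, but they can equally be called by hand each time
the workflow is used.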
You should add a section where people can describe the kinds
of tests their workflows have been calibrated with. Possibly
you could have two sections, one for the tests that have
already been executed, and one for the tests that should be
run each time someone uses the workflow.
I think such a section would be very useful in myExperiment.
Moreover, these tests could also act as examples, so workflows
will be easier for other users to understand.
I believe testing workflows is a very good practice, one that
unfortunately not many bioinformaticians are used to following :(.
You should distinguish the workflows that provide test
descriptions from the others, so people will be able to suggest
how to design tests to those who are not used to doing that.
--
-----------------------------------------------------------
My Blog on Bioinformatics (italian): http://bioinfoblog.it
------------------------------------------------------------------------
_______________________________________________
Myexperiment-discuss mailing list
address@hidden
http://lists.nongnu.org/mailman/listinfo/myexperiment-discuss