From: Mike Gerwitz
Subject: Re: [libreplanet-discuss] Question about free (as in freedom) data from free software
Date: Wed, 28 Oct 2015 23:20:42 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (gnu/linux)

On Wed, Oct 28, 2015 at 22:32:43 +0000, Pen-Yuan Hsing wrote:
> As I mentioned a couple months ago, I met some fellow scientists at a
> conference, and asked them to continue development their wildlife
> image analyses program as Free Software (as in Freedom). I was glad
> that they were receptive to the idea (even though they keep going back
> to calling it "open source".... oh well), so I think this is a good
> first step!

I'm glad to hear that you're making progress.

> (1) If the data is released to the wider public, there might be other
> scientists who would "steal" the data, publish the work/analyses on
> it, preempt our efforts, and we won't end up being able to publish any
> papers. Since peer-reviews scientific journal papers are the
> "currency" with which academic performance is judged, we shouldn't
> release our data because others might "steal" them. He said we should
> at the very least embargo and restrict access to the data for e.g.
> five years before releasing them.

This is unrelated to software freedom, so these opinions are strictly my
own, but I base them heavily on what I've read over the years from the
scientific community.

I would suggest that they release the data applicable to the
paper with the paper itself---this is necessary for proper peer-review
and reproducibility.  If they have data left over after publishing their
results, they can release that after the fact.

> (2) Since a lot of the data are photos of wild animals, what if some
> of the animals are endangered or sensitive to human encroachment?
> Since GPS metadata is associated with our images, what is a poacher
> sees our data and use it to hunt down the endangered animal? He said
> maybe we shouldn't release the data at all/ever, doing so would be
> "irresponsible" and possibly cause great harm.

Reproducibility continues to be a constantly discussed topic in
scientific journals and communities---because it isn't happening.  This
was re-ignited recently when a study headed by Brian Nosek at the Center
for Open Science found that only 39 of the 100 replication attempts of
98 studies from three psychology journals were successful.[0]

This doesn't mean that those studies were wrong.  But in order to
reproduce results---in this case, from analyzing data---those data need
to be made available for independent analysis.  In turn, the _software_
also needs to be made available, which is what I brought up previously
and what you've been working on.

Negative or uninteresting results are also important and valuable.  That
is why, even if the data aren't used in any papers, I suggest that they
should still be released at some point.

With regard to the concerns of encroachment: I can't answer that
question, because I'm not educated on this particular issue, and would
not want to suggest something irresponsible.  But my generic answer is
to default to releasing the data unless the researchers can determine
that the potential risk is greater than the potential benefit with a
high level of certainty.  It's difficult to say what the benefit might
be; here we can draw an analogy to free software: the original author
couldn't possibly anticipate how everyone might use their software, but
free software is tweaked for personal uses all the time, in novel
ways.  Consider also the Unix philosophy: the standard tools were
developed to do one thing and do it well, and to be used in a
pipeline.  This allows users to combine the tools in any number of
useful ways that their creators could have never imagined.

> So, what do people think about points (1) and (2)? If the expertise is
> not on this mailing list, is there another place/forum where I can
> discuss the issue of Free Data (as in Freedom), "open data", and "open
> access"??? Thank you!

I'd be curious about that answer as well.  While I do a lot of reading, I
rarely participate in discussion, unless it's contacting authors
directly.

[0]: http://www.nature.com/news/over-half-of-psychology-studies-fail-reproducibility-test-1.18248

-- 
Mike Gerwitz
Free Software Hacker | GNU Maintainer
http://mikegerwitz.com
FSF Member #5804 | GPG Key ID: 0x8EE30EAB
