pspp-dev

contributions from statisticians


From: Jason Stover
Subject: contributions from statisticians
Date: Sat, 1 Jul 2006 18:20:45 -0400
User-agent: Mutt/1.5.10i

I've been thinking about ways to get more statistical code into
pspp. I think the way to do it is to have volunteers submit their code
as modules to us for review, much like journal submissions are
reviewed.

The people who have so far volunteered to write mathematical
code never followed through, because hooking their code into pspp is too
daunting. I spend most of my own time hooking my mathematical routines
into PSPP, rather than writing those routines. I see no
way to code this problem away: writing the mathematical
modules already requires a lot of study and work. Writing the mathematical
routines AND making them run with pspp is too much to ask of potential
contributors, no matter how easy we make it.

On our side, development of the statistical routines is very slow,
mostly because of the time required to make those routines run inside
pspp, rather than because of the time required to write the math
itself. Any process that can get us more mathematical code would help
a lot.

There are plenty of students and statisticians who can write C or
Fortran routines to fit statistical models, and past posts to this
list show that plenty of them want to contribute. If we ask them to
submit their code, and we then review it for correctness, send the
authors the copyright assignment forms, and put their code into pspp,
the addition of mathematical code will accelerate a lot.

I'm willing to review mathematical routines and hook them into pspp's
procedures. Maybe more people will be willing to review code in the
future. Perhaps other authors who contribute could act as
reviewers, as the authors of peer-reviewed papers often do.

I don't know what the requirements for authors should be, but I have a
few in mind:

        * Ask authors to follow the GNU coding standards. (This may
        require some leniency.)
        
        * Ask authors to code in C (what about Fortran?)
        
        * Ask authors not to rewrite basic prerequisite code that
        already exists in GSL. (E.g. code that computes probability
        density functions, or general-purpose conjugate gradient
        optimizers, or vector and matrix routines)

        * Ask authors to use comments liberally so we can see what
        they meant to do.

        * Tell authors that we will have to change their code to make
        it work with pspp, and to fix any bugs that show up.

        * Ask for a bibliography in a comment so we can track down the
        sources to resolve any tricky problems.

        * Ask for an 'abstract' in a comment describing what the subroutine
        does, when it is appropriate, and (briefly) how it works.
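
To make the list concrete, here is a rough sketch of what a submission
following these requirements might look like. Everything in it is my
own illustration: the function name kendall_tau_a, its signature, and
the choice of statistic are hypothetical, not an agreed interface. This
particular routine happens to need no prerequisite code, so it does not
call GSL; a submission that needed density functions or matrix routines
would use GSL for those.

```c
/* kendall_tau_a: hypothetical example of a submitted module.

   Abstract: Computes Kendall's tau-a, a rank correlation between two
   samples X and Y of equal length N.  Appropriate for ordinal or
   continuous data when a monotone, not necessarily linear,
   association is of interest.  Tied pairs contribute nothing to the
   score (tau-a makes no correction for ties).  Works by comparing
   all N (N - 1) / 2 pairs, so the cost is O(N^2).

   Bibliography:
   Kendall, M. G. (1938).  A new measure of rank correlation.
   Biometrika 30, 81-93.  */

#include <stddef.h>

/* Return the sign of D: -1, 0 or 1.  */
static int
sign (double d)
{
  return (d > 0) - (d < 0);
}

/* Return Kendall's tau-a for X[0..N-1] and Y[0..N-1].
   Return 0 if N < 2, since no pairs exist.  */
double
kendall_tau_a (const double *x, const double *y, size_t n)
{
  size_t i, j;
  long score = 0;   /* Concordant pairs minus discordant pairs.  */

  if (n < 2)
    return 0.0;

  for (i = 0; i < n - 1; i++)
    for (j = i + 1; j < n; j++)
      score += sign (x[i] - x[j]) * sign (y[i] - y[j]);

  return (double) score / (0.5 * (double) n * (double) (n - 1));
}
```

Note that the routine takes plain arrays and knows nothing about pspp's
data structures. That is the point: the author writes and documents the
math, and the reviewer does the hooking.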

The more requirements we have, the fewer contributions we will get, so
maybe the list above should be shortened. But those requirements seem
appropriate to me, and authors will find them much easier to follow
than getting their subroutines running with a CVS checkout.

I would be happy to write a file called 'CONTRIBUTING' and put that in
CVS. Maybe we could even put something on savannah where statistical
programmers will find it.

Thoughts?

-Jason



