pspp-dev

GLM, covariance matrices and interactions.


From: John Darrington
Subject: GLM, covariance matrices and interactions.
Date: Tue, 24 Aug 2010 15:44:36 +0000
User-agent: Mutt/1.5.18 (2008-05-17)

I've just pushed a largish series of changes which reimplements the ONEWAY 
command (or most of it).

Whereas before it used a bunch of ad hoc hash tables, it now uses a combination
of covariance.c, categoricals.c and Jason's reg_sweep operator.  Although it
appears to have bloated somewhat, this is largely due to the remnants of the old
implementation which are necessary for the Levene test - which I'll get
around to rewriting soon (hopefully).
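
For illustration, here is a minimal sketch of how a sweep operator can produce
the sums of squares for a one-way anova from a cross-product matrix.  It is not
the code in covariance.c, categoricals.c or reg_sweep; the names sweep and
oneway_ss, the dummy coding and the sample data are all assumptions made for
this example.

```python
def sweep(a, k):
    """Return a copy of the square matrix `a` swept on pivot index k."""
    n = len(a)
    d = a[k][k]
    out = [row[:] for row in a]
    # Scale the pivot row.
    for j in range(n):
        out[k][j] = a[k][j] / d
    # Eliminate the pivot column from every other row.
    for i in range(n):
        if i == k:
            continue
        b = a[i][k]
        for j in range(n):
            out[i][j] = a[i][j] - b * out[k][j]
        out[i][k] = -b / d
    out[k][k] = 1.0 / d
    return out


def oneway_ss(values, groups):
    """Return (between-groups SS, within-groups SS) for one factor."""
    levels = sorted(set(groups))
    # Each row: intercept, one dummy per level except the first, response.
    rows = [
        [1.0] + [1.0 if g == lev else 0.0 for lev in levels[1:]] + [float(y)]
        for y, g in zip(values, groups)
    ]
    m = len(rows[0])
    # Cross-product matrix [X y]' [X y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    # Sweeping the intercept leaves the corrected total SS in the last cell.
    a = sweep(a, 0)
    ss_total = a[-1][-1]
    # Sweeping the dummy columns leaves the residual (within-groups) SS.
    for k in range(1, m - 1):
        a = sweep(a, k)
    ss_within = a[-1][-1]
    return ss_total - ss_within, ss_within


if __name__ == "__main__":
    x = [1, 2, 3, 2, 3, 4, 5, 6, 7]
    g = [1, 1, 1, 2, 2, 2, 3, 3, 3]
    print(oneway_ss(x, g))
```

In this sketch, a factorial design without interactions would only add more
dummy columns (one set per factor) before the response column; the sweep step
itself would not change.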

What this means is we can be reasonably confident that the same technique can
be used to implement a factorial anova which is a major use of the GLM command.
That is to say, we can currently do:
 ONEWAY X BY G.
and we should be able to easily implement:
 GLM X BY G1 G2 G3 G4.

I also think that multivariate analysis would be (relatively)
straightforward, i.e.:
 GLM X Y Z BY G1 G2 G3 G4.
Maybe Jason can correct me there?

Further, I think the design of categoricals.c is such that it can be extended
to support interactions without great difficulty.  However, doing so has the
potential to greatly complicate the interface to the categoricals struct (which
is one reason why I decided to put greater separation between categoricals and
covariance).

Some conclusions I have come to over the last week:

* We should abandon the constraint that CORRELATIONS and anova should use a
  common implementation of the covariance matrix.  This is largely because
  CORRELATIONS does pairwise treatment of missing values, which greatly
  complicates the implementation.  On the other hand, CORRELATIONS doesn't use
  categorical variables.  I can't think of any scenario where anova would
  sensibly want a pairwise treatment of missing values in its covariance
  matrix, and combining pairwise missing values and categorical variables
  seems like an insurmountable task.

* We really need to take things one step at a time, rather than biting off
  a fully featured GLM.  So my suggestion is that we ignore interactions for the
  time being, and start off with a factorial anova capability - once it's 
  thoroughly tested we can think about interactions.
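
To make the missing-value point in the first item concrete, here is a small
sketch contrasting pairwise and listwise treatment when building a covariance
matrix.  It is a simplified illustration, not how CORRELATIONS or covariance.c
is implemented; the function names and the data are made up, and None stands
in for a missing value.

```python
def cov(xs, ys):
    """Sample covariance of two equal-length lists without missing values."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


def covariance_matrix(columns, pairwise):
    """Covariance matrix of `columns` (lists of numbers, None = missing).

    pairwise=True:  each cell uses every case where both of its two
                    variables are present, so cells can have different n.
    pairwise=False: every cell uses only the cases with no missing value
                    in any variable (listwise deletion).
    """
    p = len(columns)
    ncases = len(columns[0])
    complete = [r for r in range(ncases)
                if all(col[r] is not None for col in columns)]
    result = [[None] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            if pairwise:
                keep = [r for r in range(ncases)
                        if columns[i][r] is not None
                        and columns[j][r] is not None]
            else:
                keep = complete
            result[i][j] = cov([columns[i][r] for r in keep],
                               [columns[j][r] for r in keep])
    return result


if __name__ == "__main__":
    x = [1, 2, 3, 4, None]
    y = [2, 4, None, 8, 10]
    z = [1, None, 3, 4, 5]
    print(covariance_matrix([x, y, z], pairwise=True))
    print(covariance_matrix([x, y, z], pairwise=False))
```

With pairwise treatment every cell has its own case count, so the code has to
track n per pair of variables; with listwise treatment a single set of
complete cases serves the whole matrix.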

Does any of this make any sense?  And does anyone have any good test data
for factorial anova without interactions?


J'



-- 
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://pgp.mit.edu or any PGP keyserver for public key.

