pspp-dev
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

type 3 patches


From: Jason Stover
Subject: type 3 patches
Date: Wed, 12 Jan 2011 21:57:20 -0500
User-agent: Mutt/1.5.18 (2008-05-17)

Two patches for type 3 sums of squares are below. I'm not sure
which is best.

First, notice computation of type 3 sums of squares requires 
computation of many least-squares surfaces (one for each explanatory
variable).

One patch changes glm.c only to allow the computation. The other adds a
function to linreg.c. The one for glm.c alone avoids teaching glm.c
about linreg.c, but doesn't provide a function that other procedures
can call.  The patch to linreg.c allows procedures to just call
linreg_type3_ss, but requires those procedures to use linreg.c.

linreg.c seems kind of clunky now (I had such high hopes for it), so
if it seems best to have a function in src/math that can do the job
without appealing to linreg.c, then we would need to put that function
in covariance.c or make a new file. But putting it in covariance.c
would mean teaching covariance.c about sweep.c.

So I don't know. But I wonder what other procedures will need to know
about type 3 sums of squares. If they don't compute a linear model, I
don't think their computation of type 3 sums of squares will be
identical to that for the linear model anyway -- that is, they'll have
their own computations, meaning their own functions.

(BTW: I have not tested these beyond compilation.)

Patch below:

diff --git a/src/language/stats/glm.c b/src/language/stats/glm.c
index 862f49c..b074c56 100644
--- a/src/language/stats/glm.c
+++ b/src/language/stats/glm.c
@@ -191,8 +191,50 @@ cmd_glm (struct lexer *lexer, struct dataset *ds)
 }
 
 static  void dump_matrix (const gsl_matrix *m);
+static void get_ssq (gsl_matrix * cov, gsl_vector *ssq);
 
 static void
+get_ssq (gsl_matrix * cov, gsl_vector * ssq)
+{
+  size_t i;
+  size_t j;
+  size_t dropped;
+  size_t dep_column;
+  gsl_matrix * small_cov = NULL;
+
+  small_cov = gsl_matrix_alloc (cov->size1 - 1, cov->size2 - 1);
+  for (dropped = 1; dropped < cov->size1; dropped++)
+    {
+      for (i = 0; i < dropped; i++)
+       {
+         for (j = 0; j < dropped; j++)
+           {
+             gsl_matrix_set (small_cov, i, j, gsl_matrix_get (cov, i, j));
+           }
+         for (j = dropped + 1; j < small_cov->size2; j++)
+           {
+             gsl_matrix_set (small_cov, i, j - 1, gsl_matrix_get (cov, i, j));
+           }
+       }
+      for (i = dropped + 1; i < small_cov->size1; i++)
+       {
+         for (j = 0; j < dropped; j++)
+           {
+             gsl_matrix_set (small_cov, i - 1, j, gsl_matrix_get (cov, i, j));
+           }
+         for (j = dropped + 1; j < small_cov->size2; j++)
+           {
+             gsl_matrix_set (small_cov, i - 1, j - 1, gsl_matrix_get (cov, i, 
j));
+           }
+       }
+      reg_sweep (small_cov, 0);
+      gsl_vector_set (result, dropped, 
+                     gsl_matrix_get (small_cov, 0, 0)
+                     - gsl_matrix_get (small_cov, 0, 0));
+    }
+  
+}
+static void
 run_glm (const struct glm_spec *cmd, struct casereader *input, const struct 
dataset *ds)
 {
   int v;
@@ -255,12 +297,18 @@ run_glm (const struct glm_spec *cmd, struct casereader 
*input, const struct data
 
   {
     gsl_matrix *cm = covariance_calculate_unnormalized (cov);
+    gsl_vector *ssq;
+
+    ssq = gsl_vector_alloc (cov->size1 - 1);
+
 
     dump_matrix (cm);
 
     ws.total_ssq = gsl_matrix_get (cm, 0, 0);
 
     reg_sweep (cm, 0);
+    
+    get_ssq (cov, ssq);
 
     dump_matrix (cm);
 
diff --git a/src/math/linreg.c b/src/math/linreg.c
index 63d1a0f..83e9fde 100644
--- a/src/math/linreg.c
+++ b/src/math/linreg.c
@@ -386,3 +386,50 @@ linreg_fit (const gsl_matrix *cov, linreg *l)
     }
 }
 
+/*
+  Call this function after calling linreg_fit.
+ */
+void
+linreg_type3_ss (const gsl_matrix * cov, linreg * l, gsl_vector * result)
+{
+  size_t i;
+  size_t j;
+  size_t dropped;
+  size_t dep_column;
+  gsl_matrix * small_cov = NULL;
+
+  small_cov = gsl_matrix_alloc (cov->size1 - 1, cov->size2 - 1);
+  for (dropped = 0; dropped < cov->size1; dropped++)
+    {
+      if (dropped != l->dependent_column)
+       {
+         for (i = 0; i < dropped; i++)
+           {
+             for (j = 0; j < dropped; j++)
+               {
+                 gsl_matrix_set (small_cov, i, j, gsl_matrix_get (cov, i, j));
+               }
+             for (j = dropped + 1; j < small_cov->size2; j++)
+               {
+                 gsl_matrix_set (small_cov, i, j - 1, gsl_matrix_get (cov, i, 
j));
+               }
+           }
+         for (i = dropped + 1; i < small_cov->size1; i++)
+           {
+             for (j = 0; j < dropped; j++)
+               {
+                 gsl_matrix_set (small_cov, i - 1, j, gsl_matrix_get (cov, i, 
j));
+               }
+             for (j = dropped + 1; j < small_cov->size2; j++)
+               {
+                 gsl_matrix_set (small_cov, i - 1, j - 1, gsl_matrix_get (cov, 
i, j));
+               }
+           }
+       }
+      dep_column = (l->dependent_column < dropped) ? l->dependent_column : 
(l->dependent_column - 1);
+      reg_sweep (small_cov, dep_column);
+      gsl_vector_set (result, dropped, 
+                     gsl_matrix_get (small_cov, small_cov->size1 - 1, 
small_cov->size2 - 1) 
+                     - linreg_sse (l));
+    }
+}



reply via email to

[Prev in Thread] Current Thread [Next in Thread]