PSPP-BUG: [bug #48040] GLM produces wrong output
From: Alan Mead
Subject: PSPP-BUG: [bug #48040] GLM produces wrong output
Date: Fri, 27 May 2016 14:57:28 +0000 (UTC)
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:46.0) Gecko/20100101 Firefox/46.0
URL:
<http://savannah.gnu.org/bugs/?48040>
Summary: GLM produces wrong output
Project: PSPP
Submitted by: amead
Submitted on: Fri 27 May 2016 02:57:26 PM GMT
Category: Numerical Errors
Severity: 5 - Average
Status: None
Assigned to: None
Open/Closed: Open
Discussion Lock: Any
Release: None
Effort: 0.00
_______________________________________________________
Details:
The attached data ("personality.sav") are three personality scores on a
50-item measure. They are moderately intercorrelated (0.35 - 0.51). There are
a large number of missing values.
Using the attached data, I ran:
GLM agree_score BY caution_score extra_score.
Processing took 6 minutes (compared to under 1 second in SPSS) and produced
the attached "glm_output.txt", which doesn't match SPSS and is nonsensical in
many places. For example, the tests for the intercept are simply missing, the
F values are all negative (which isn't possible), the p-values are all NaN,
and both the error degrees of freedom and the error mean square are negative.
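For reference, these quantities are non-negative by construction. The following is a minimal one-way sketch in Python, not PSPP's implementation; the function name `one_way_anova` and the toy data are made up for illustration. It assumes at least two levels, more cases than levels, and a non-zero error mean square.

```python
def one_way_anova(groups):
    """Compute a one-way ANOVA table from {level: [outcomes]}.

    Hypothetical helper for illustration only; this is not PSPP code.
    """
    all_values = [y for ys in groups.values() for y in ys]
    n = len(all_values)
    grand_mean = sum(all_values) / n

    # Sums of squares are sums of squared deviations, so they cannot be negative.
    ss_between = sum(
        len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2 for ys in groups.values()
    )
    ss_within = sum(
        (y - sum(ys) / len(ys)) ** 2 for ys in groups.values() for y in ys
    )

    # Degrees of freedom: levels - 1 for the effect, cases - levels for error.
    df_between = len(groups) - 1
    df_within = n - len(groups)

    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "ms_within": ms_within,
        "F": ms_between / ms_within,  # ratio of two non-negative mean squares
    }


table = one_way_anova({1: [1.0, 2.0, 3.0], 2: [4.0, 5.0, 6.0]})
print(table)
```

With positive error df, a negative F or a negative error mean square cannot come out of these formulas, so such values in the output point to a bookkeeping problem (for example in the df) rather than to anything in the data.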
I assume that one problem with these data (not the only one, see below) is
that the independent variables have many levels (about 50; they aren't really
factors). This SHOULD NOT be a problem for GLM, but when I do a median split
on the independent variables:
recode caution_score (lo thru 35=1) (36 thru hi=2) (else=copy) into x1.
recode extra_score (lo thru 31=1) (32 thru hi=2) (else=copy) into x2.
execute.
GLM agree_score BY x1 x2.
This GLM runs very quickly but still produces incorrect output. For example,
the df for x1 and x2 should each be 1 (they have 2 levels, 2-1=1). The only
way PSPP could be calculating df=2 is if it detects 3 levels, which probably
means that it's treating missing as a level, which is obviously incorrect. The
two independent variables should still have significant effects on the
dependent variable, but the SS is calculated as 0.00. Also, the p-value for
one effect is NaN, which shouldn't happen.
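The df arithmetic above can be sketched in Python. This only illustrates the hypothesis in this report (missing counted as a level); it has not been checked against PSPP's source, and the helper names `median_split` and `factor_df` as well as the sample scores are invented for the example.

```python
import math


def is_missing(value):
    """Treat None and NaN as missing (an assumption made for this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def median_split(value, cutoff):
    """Mimic RECODE (lo thru cutoff=1) (cutoff+1 thru hi=2) (else=copy)."""
    if is_missing(value):
        return value  # else=copy: missing stays missing
    return 1 if value <= cutoff else 2


def factor_df(values):
    """Degrees of freedom for a factor: distinct non-missing levels minus 1."""
    levels = {v for v in values if not is_missing(v)}
    return len(levels) - 1


caution_score = [30, 35, 36, 40, None]  # made-up scores, one missing
x1 = [median_split(v, 35) for v in caution_score]

correct_df = factor_df(x1)    # missing excluded: 2 levels, df = 1
naive_df = len(set(x1)) - 1   # missing counted as a level: 3 levels, df = 2
print(x1, correct_df, naive_df)
```

The naive count reproduces the df=2 seen in the PSPP output, which is consistent with (but does not prove) the missing-as-a-level explanation.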
If the internal mechanics of PSPP's GLM cannot handle this many levels, then
the routine should count the levels and refuse to run when there are too many
(or, better yet, use an algorithm that doesn't fail, because the "general
linear model" shouldn't choke when you feed it continuous variables... that's
what "general" in GLM means...).
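One way such a guard could look is sketched below in Python. It assumes a full-factorial two-factor model (main effects plus interaction), for which the error df equals the number of complete cases minus the number of non-empty cells. Whether PSPP's GLM computes it this way is not established in this report, and both function names are hypothetical.

```python
def error_df_saturated(rows):
    """Error df for a two-factor model with interaction.

    rows: iterable of (factor_a, factor_b, outcome), with None for missing.
    Cases with any missing value are dropped (listwise deletion).
    """
    complete = [r for r in rows if all(v is not None for v in r)]
    non_empty_cells = {(a, b) for a, b, _ in complete}
    return len(complete) - len(non_empty_cells)


def require_estimable(rows):
    """Refuse to fit when no degrees of freedom are left for the error term."""
    df = error_df_saturated(rows)
    if df <= 0:
        raise ValueError(
            f"error df is {df}: too many factor levels for the number of cases"
        )
    return df


# Two 2-level factors, six complete cases in four cells, one incomplete case.
ok_rows = [
    (1, 1, 10.0), (1, 1, 12.0), (1, 2, 11.0),
    (2, 1, 9.0), (2, 2, 8.0), (2, 2, 7.0),
    (None, 1, 5.0),
]
print(require_estimable(ok_rows))

# Every case sits in its own cell, as happens with near-continuous "factors".
sparse_rows = [(1, 1, 10.0), (2, 2, 11.0), (3, 3, 12.0)]
try:
    require_estimable(sparse_rows)
    refused = False
except ValueError:
    refused = True
```

With about 50 levels per variable, most cases are likely to fall into their own cell, so the error df would be at or near zero, which may be related to the negative values reported above.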
Clearly there are several other serious bugs in the routine: (1) df are
calculated incorrectly. (2) Under some circumstances, error SS is wrong. (3)
When F=0 or F=1, significance values should be printed as 1.000 and 0.000, not
as "NaN".
I ran this using an old version of PSPP on Windows 7 but I have since
installed GNU pspp 0.10.1-g1082b8 and verified that I have identical results.
_______________________________________________________
File Attachments:
-------------------------------------------------------
Date: Fri 27 May 2016 02:57:26 PM GMT Name: glm_output2.odt Size: 21kB By: amead
The .ODT file contains both PSPP and SPSS output (the only way to paste the
SPSS output was as formatted text) and the SAV file was created by SPSS 24.
<http://savannah.gnu.org/bugs/download.php?file_id=37288>
-------------------------------------------------------
Date: Fri 27 May 2016 02:57:26 PM GMT Name: personality.sav Size: 5kB By: amead
<http://savannah.gnu.org/bugs/download.php?file_id=37289>
_______________________________________________________
Reply to this item at:
<http://savannah.gnu.org/bugs/?48040>