[Octave-bug-tracker] [bug #42671] [PATCH] corr() does not have p-values
From: Philipp Kutin
Subject: [Octave-bug-tracker] [bug #42671] [PATCH] corr() does not have p-values output, returns 1.0 with one observation.
Date: Thu, 03 Jul 2014 12:36:46 +0000
User-agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:30.0) Gecko/20100101 Firefox/30.0
URL:
<http://savannah.gnu.org/bugs/?42671>
Summary: [PATCH] corr() does not have p-values output,
returns 1.0 with one observation.
Project: GNU Octave
Submitted by: pkutin
Submitted on: Thu 03 Jul 2014 12:36:46 PM GMT
Category: Octave Function
Severity: 3 - Normal
Priority: 5 - Normal
Item Group: Matlab Compatibility
Status: None
Assigned to: None
Originator Name:
Originator Email:
Open/Closed: Open
Discussion Lock: Any
Release: dev
Operating System: Any
_______________________________________________________
Details:
The current corr.m lags behind MATLAB's in several ways. First, it has no
second output argument ('PVAL') for p-values. As a consequence, there is also
no option to select which alternative hypothesis to consider ('both',
'left' or 'right').
The attached patch adds the PVAL output for the two-sided case. As described
in the MATLAB docs, it uses a transformation from r to a value that is
t-distributed (assuming the input variables are uncorrelated bivariate
Gaussian).
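For reference, the standard textbook form of this transformation is given below. The symbol n (number of observations) and the subscripted p names are notation assumed here; this message does not show how the attached patch writes the computation.

```latex
t = r \sqrt{\frac{n-2}{1-r^{2}}}, \qquad
t \sim t_{n-2} \ \text{under } H_0, \qquad
p_{\text{both}} = 2 \Pr\left(T_{n-2} \ge |t|\right), \quad
p_{\text{right}} = \Pr\left(T_{n-2} \ge t\right), \quad
p_{\text{left}} = \Pr\left(T_{n-2} \le t\right)
```

With n = 2 there are zero degrees of freedom, so no p-value can be computed from this statistic.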
Additionally, when correlating data sets with one observation, the patch
returns NaN instead of 1 -- the Pearson correlation coefficient is not defined
in this case because the variance of neither variable is defined.
*Patch message:*
corr.m: obtain p-values from r-to-t transformation; return NaN for 1
observation.
* when correlating data sets with one observation, return NaN instead of 1.
* use a transformation into a t-distributed variable (assuming the input
variables are uncorrelated bivariate Gaussian) to obtain both-sided
p-values
*Future directions:*
For corr() to accept key/value pairs like 'KIND', it would be nice to have a
shared, factored-out helper that extracts these from the varargin passed to a
function. Searching for the key/value pattern in the Octave code, it seems
that this is currently done by hand each time.
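A rough sketch of what such a shared helper could look like, written in Python purely for illustration (it is not Octave code and not part of the attached patch). The function name `parse_options`, the option names and defaults, and the case-insensitive matching are all assumptions made for this example.

```python
def parse_options(args, defaults):
    """Extract key/value options from a flat argument list.

    args     -- flat sequence such as ["tail", "right"]
    defaults -- dict mapping each known option name to its default value
    Hypothetical helper; key matching is case-insensitive by assumption.
    """
    if len(args) % 2 != 0:
        raise ValueError("options must be given as key/value pairs")

    opts = dict(defaults)
    # Map lower-cased names back to the canonical spelling in `defaults`.
    canonical = {name.lower(): name for name in defaults}

    for key, value in zip(args[0::2], args[1::2]):
        if not isinstance(key, str) or key.lower() not in canonical:
            raise ValueError("unknown option: %r" % (key,))
        opts[canonical[key.lower()]] = value
    return opts


# Hypothetical defaults for a corr()-like function.
CORR_DEFAULTS = {"kind": "pearson", "tail": "both"}
```

A caller would then write `parse_options(["TAIL", "right"], CORR_DEFAULTS)` and receive a dict with `kind` left at its default and `tail` set to `"right"`.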
The remaining measures of association -- spearman() and kendall() -- already
exist, so dispatching to those could then be done from corr(), too. Estimating
p-values for them is a different story.
Tests on MATLAB R2013a:
>> corr(1,2)
ans =
NaN
>> [c,p]=corr([1 2]',[2 3]')
c =
1.0000
p =
NaN
>> [c,p]=corr([1 2 3]',[2 3 4]')
c =
1.0000
p =
9.4864e-09
>> [c,p]=corr([1 2 3 4]',[2 3 4 7]', 'tail','right')
c =
0.9562
p =
0.0219
>> [c,p]=corr([1 2 3 4]',[2 3 4 7]', 'tail','left')
c =
0.9562
p =
0.9781
>> [c,p]=corr([1 2 3 4]',[2 3 4 7]', 'tail','both')
c =
0.9562
p =
0.0438
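
The four-observation results above can be cross-checked with a small stand-alone sketch of the r-to-t approach. It is written in Python with only the standard library; it is not the attached patch and not MATLAB's implementation. The names `pearson_r`, `t_cdf` and `corr_with_pval` are made up for this example, and the t CDF is approximated by Simpson's rule, which is adequate for small data sets like these but not a general-purpose routine.

```python
import math


def pearson_r(x, y):
    """Pearson correlation; NaN when it is undefined (n < 2 or zero variance)."""
    n = len(x)
    if n != len(y):
        raise ValueError("x and y must have the same length")
    if n < 2:
        return math.nan
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return math.nan
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    return max(-1.0, min(1.0, sxy / math.sqrt(sxx * syy)))


def t_cdf(t, df, steps=2000):
    """Student's t CDF via Simpson's rule on [0, |t|] (steps must be even)."""
    if math.isinf(t):
        return 1.0 if t > 0 else 0.0
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(u):
        return c * (1 + u * u / df) ** (-(df + 1) / 2)

    a = abs(t)
    h = a / steps
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    # The density is symmetric, so CDF(t) = 0.5 +/- area.
    return 0.5 + math.copysign(area, t)


def corr_with_pval(x, y, tail="both"):
    """Return (r, p) using t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 d.o.f."""
    n = len(x)
    r = pearson_r(x, y)
    if math.isnan(r) or n < 3:
        return r, math.nan  # no degrees of freedom left for a p-value
    if abs(r) == 1.0:
        t = math.copysign(math.inf, r)
    else:
        t = r * math.sqrt((n - 2) / (1 - r * r))
    cdf = t_cdf(t, n - 2)
    if tail == "left":
        p = cdf
    elif tail == "right":
        p = 1 - cdf
    elif tail == "both":
        p = 2 * min(cdf, 1 - cdf)
    else:
        raise ValueError("tail must be 'both', 'left' or 'right'")
    return r, p
```

For `[1 2 3 4]` against `[2 3 4 7]` this gives r of about 0.9562 and p-values that agree with the session above to four decimals for all three tails. For perfectly correlated input the sketch returns p = 0 exactly, whereas the R2013a session shows a tiny nonzero value.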
_______________________________________________________
File Attachments:
-------------------------------------------------------
Date: Thu 03 Jul 2014 12:36:46 PM GMT Name: corr-pval-1.patch Size: 2kB
By: pkutin
<http://savannah.gnu.org/bugs/download.php?file_id=31670>
_______________________________________________________
Reply to this item at:
<http://savannah.gnu.org/bugs/?42671>
_______________________________________________
Message sent via/by Savannah
http://savannah.gnu.org/