Re: The nanflag parameter

octave-maintainers

From:	Rik
Subject:	Re: The nanflag parameter
Date:	Sun, 15 Jan 2017 16:54:17 -0800

On 01/14/2017 09:00 AM, address@hidden wrote:
> Subject: The nanflag parameter
> From: Ștefan-Gabriel Mirea <address@hidden>
> Date: 01/13/2017 06:31 PM
> To: address@hidden
>
> Hello,
>
> My name is Stefan Mirea. I am a fourth year student of Automatic
> Control and Computer Science at the "Politehnica" University of
> Bucharest. As Octave has been one of my favourite tools during the
> faculty years, I wish to contribute to its development.
>
> I chose to start with bug #50007. Unfortunately, I don't think there
> is any elegant fix that does not imply changing the Octave API. Since
> my solution would involve modifying multiple function prototypes, I
> decided to describe it below and ask for your opinion before working
> on a patch.

As long as changes are applied to the development branch and don't disrupt existing users, changing the API is okay.

> According to the MATLAB documentation[1], the functions which accept a
> nanflag argument are:
> * min
> * max
> * cummin
> * cummax
> * sum
> * cumsum
> * mean
> * median
> * var
> * std
> * cov
> * medfilt1

I don't see 'medfilt1' in the list of core Matlab functions. I think you can skip it for now.

> together with some recently introduced descriptive statistics
> functions, which are not currently implemented in Octave: movsum,
> movmean, movmedian, movmax, movmin, movstd, movvar.
>
> For the one-array-input syntaxes of min and max, I would:
>
> * modify the signature of the do_minmax_red_op() function in max.cc,
> in order to receive the boolean nan-flag from do_minmax_body().
> do_minmax_red_op() is never called from elsewhere. Unhappily, the two
> template specializations, for charNDArray and boolNDArray
> respectively, would need to be updated as well, even though these
> types don't support NaN values.
>
> * add a nan-flag parameter to the min() and max() methods of all the
> classes with which do_minmax_red_op() is instantiated (except
> boolNDArray): SparseMatrix, NDArray, SparseComplexMatrix,
> ComplexNDArray, FloatNDArray, FloatComplexNDArray, charNDArray and
> intNDArray<*> (yet charNDArray methods could be left unchanged as well
> because of the specialization). Although these min/max methods are
> part of the Octave API, using a default argument would ensure that no
> external code would be affected. Inside the Octave core, they are only
> called from max.cc. Also, I believe that uniformity towards the min()
> and max() methods of other classes would not be a problem, as the
> classes above are already the only ones whose min/max methods appear
> in this form (receiving the dimension along which the reductions will
> be performed; on the other hand, ColumnVector::min() has no parameters
> for example).
>
> * add a nan-flag parameter to mx_inline_min() and mx_inline_max().
> These functions are used only in the min()/max() methods of the Array
> (not Sparse) classes above. The overloads defined with OP_MINMAX_FCNN
> would just pass the flag unchanged when calling the OP_MINMAX_FCN or
> OP_MINMAX_FCN2 version.
>
> * update do_mx_minmax_op() to receive the nan-flag from the caller
> min()/max() method and send it to mx_minmax_op() (which now accepts
> it). Again, do_mx_minmax_op() is used only in the min()/max() methods
> of the Array classes above.
>
> * update SparseMatrix::min()/max() as well as the versions of
> mx_inline_min() and mx_inline_max() defined with OP_MINMAX_FCN and
> OP_MINMAX_FCN2 to take the received NaN policy into account.
>
> I would be thankful if you could give me some feedback on the approach
> above.

I agree that this seems like a workable approach. However, I wonder if
it is actually necessary? The question that occurs to me is whether the
nanflag parameter is something that users programming in C++ would
want. If it is, then the change needs to be in liboctave and the
interpreter can piggyback on those modifications. However, if this is
really something that only users of the Octave interpreter need then it
would be easier to just make this change in the m-files themselves or in
libinterp.
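
To make the liboctave option concrete, here is a minimal sketch of the default-argument idea from the proposal. It is only an illustration: vec_max and its omit_nan parameter are hypothetical names, and it works on a std::vector rather than on Octave's real array classes or mx_inline_max().

```cpp
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical stand-in for a liboctave reduction; NOT the real mx_inline_max.
// The default argument keeps existing one-argument callers source-compatible.
double vec_max (const std::vector<double>& v, bool omit_nan = true)
{
  double result = std::numeric_limits<double>::quiet_NaN ();
  for (double x : v)
    {
      if (std::isnan (x))
        {
          if (! omit_nan)
            return x;   // "includenan": any NaN makes the result NaN
          continue;     // "omitnan": skip NaN entries
        }
      if (std::isnan (result) || x > result)
        result = x;
    }
  return result;        // NaN if v is empty or holds only NaN values
}
```

Callers that pass only the data keep today's NaN-skipping behavior; only code that explicitly passes false sees the new policy.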

For example, the existing max/min functions behave as if "omitnan" were specified, and that is the most common way to use the functions. If a programmer wants the 'includenan' behavior they are really saying that they want a test for NaN, and there is already the function isnan() for that. Pseudo-code for a new max.m function would be:

tmp = __max__ (x, [], dim);     # Call old max function (skips NaN)
includenan = strcmpi (nanflag, "includenan");
if (includenan)
  idx = any (isnan (x), dim);   # Slices along dim that contain a NaN
  tmp(idx) = NaN;               # Force those results to NaN
endif

Just for fun I tried,

x = rand (1000, 1000);
x(1000,1000) = NaN;
tmp = max (x);
tic; idx = any (isnan (x)); tmp(idx) = NaN; toc
Elapsed time is 0.00314093 seconds.

3 milliseconds is a pretty small price to pay, and it would only be done when using the non-default behavior.

What I coded as an m-file could also be done in C++ if you want to stay with the existing implementation of max/min in libinterp.
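
A rough C++ sketch of the same post-processing idea follows. It is not Octave source: col_max_omitnan, col_max, and the include_nan parameter are hypothetical names, and the matrix is assumed to be a plain column-major std::vector.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Column-wise max that skips NaN, standing in for the existing behavior.
std::vector<double>
col_max_omitnan (const std::vector<double>& a, std::size_t rows,
                 std::size_t cols)
{
  std::vector<double> out (cols, std::numeric_limits<double>::quiet_NaN ());
  for (std::size_t j = 0; j < cols; j++)
    for (std::size_t i = 0; i < rows; i++)
      {
        double x = a[j * rows + i];   // column-major storage (assumed)
        if (! std::isnan (x) && (std::isnan (out[j]) || x > out[j]))
          out[j] = x;
      }
  return out;
}

// Wrapper that applies "includenan" as a separate pass over the input,
// leaving the underlying reduction untouched.
std::vector<double>
col_max (const std::vector<double>& a, std::size_t rows, std::size_t cols,
         bool include_nan = false)
{
  std::vector<double> out = col_max_omitnan (a, rows, cols);
  if (include_nan)
    for (std::size_t j = 0; j < cols; j++)
      for (std::size_t i = 0; i < rows; i++)
        if (std::isnan (a[j * rows + i]))
          {
            out[j] = std::numeric_limits<double>::quiet_NaN ();
            break;   // one NaN is enough for this column
          }
  return out;
}
```

The extra pass runs only when the non-default policy is requested, which mirrors the cost argument made with the timing above.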

So, my fundamental question for Octave-Maintainers is whether we need a nanflag in liboctave, or only in the Octave interpreter?

--Rik
