octave-maintainers

Re: Patch to residue.m


From: Ben Abbott
Subject: Re: Patch to residue.m
Date: Mon, 31 Dec 2007 17:26:11 +0800


On Dec 31, 2007, at 12:04 PM, Daniel J Sebald wrote:

Ben Abbott wrote:
Please consider this patch to residue.m which improves its accuracy when pole multiplicity exists.
To capture the improvement, the test scripts were also modified to be more stringent.

Ben,

This doesn't look like too much of a change. Still, a few questions: Why the change? What motivated this? Just improving your code? Or did you find an example that was failing? (If so, give us the example.)

Observations:

1) Extra space in "group of pole  multiplicity,"

2)  p_group = cumsum (e == 1);
  for ng = 1:max(p_group)

Since (e == 1) is non-negative, its cumulative sum never decreases, so rather than max(p_group) it could be p_group(end). That is more efficient, but of course the polynomial order is quite small, so it doesn't really matter.

I'd not come across the "end" functionality before. Very cool! ... I was unhappy with p_group(numel(p_group)).
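
For what it's worth, here is a tiny sketch of the equivalence (the multiplicity vector e below is made up, just for illustration):

e = [1; 1; 2; 1; 2; 3];         ## hypothetical multiplicity vector from mpoles
p_group = cumsum (e == 1);      ## group index for each pole
p_group(end) == max (p_group)   ## true, since the cumulative sum never decreases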


3)  Could you explain the reasoning behind the various tolerance parameters in the code?  For example, I see this code for determining whether a pole is zero or not:

small = max (abs (p));
small = max ([small, 1] ) * 1e-12 * (1 + numel (p))^2;
p(abs (p) < small) = 0;

I don't understand what the (1 + numel (p))^2 does. The more poles there are, the larger the tolerance becomes? Also, might you want to define 1e-12 with respect to the machine precision, say

small = max ([small, 1]) * eps*1e4 * (1 + numel (p))^2;

Good point.

(On my machine,

octave:14> eps
ans =  2.2204e-16

so that eps*1e4 is about 1e-12.)
Why is it important to cast poles to 0?  Is there something later in the code that depends on that?

I don't find the part concerning zero, pure-real, and pure-imaginary poles well thought out. However, since it was there before, and others have expressed a desire for such checks, I've left it alone. In any event, the tolerance is small enough that I don't think it makes any numerical difference.
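
For context, the kind of rounding I mean looks roughly like this (just a sketch with made-up values and a placeholder tolerance, not the code as it actually appears in residue.m):

small = 1e-12;                          ## placeholder tolerance, for illustration only
p = [1e-14; 2 + 1e-13i; 1e-13 + 3i];    ## made-up poles
p(abs (p) < small) = 0;                 ## nearly zero -> exactly zero
idx = abs (imag (p)) < small;
p(idx) = real (p(idx));                 ## nearly real -> pure real
idx = abs (real (p)) < small;
p(idx) = 1i * imag (p(idx));            ## nearly imaginary -> pure imaginary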

Now, here is the part I'm not so comfortable with.  There is the code above that says a pole is zero if it is within ~1e-12 of zero.  But shortly thereafter the code does this:

[e, indx] = mpoles (p, toler, 1);

where poles within "toler" (set to 0.001 in the code) are considered equal. Declaring a pole to be zero requires a much stricter tolerance than declaring two poles equal. It's just slightly strange. Doesn't that seem like a contradiction somehow?

The problem is that as multiplicity increases, the locations of the poles become numerically difficult to determine. When the multiplicity is unity, a relative tolerance of 1e-12 is possible. However, as the multiplicity increases, numerical errors increase as well. The values of "toler" and "small" were determined by empirical means (experience and testing). I tried to make them as small as was comfortable.
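
To make the two tolerances concrete, here is a small sketch (the pole values are made up) of "small" and "toler" doing their separate jobs:

p = [3 + 2e-4; 3 - 2e-4; 5; 1e-13];   ## made-up poles: a close pair, a simple pole, a tiny one
small = max ([max(abs (p)), 1]) * 1e-12 * (1 + numel (p))^2;
p(abs (p) < small) = 0;               ## "small": the 1e-13 pole is snapped to zero
[e, indx] = mpoles (p, 0.001, 1)      ## "toler": the pair near 3 is grouped (multiplicity 2)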

However, I must admit your questions have made me re-examine this subject, and it does appear that the checks for zero-valued, real-valued, and imaginary-valued poles should be done *after* the multiplicity has been determined ... if they are to be done at all.

I'm not sure I'm comfortable with averaging the poles either:

  p(m) = mean (p(m));

From a practical standpoint it may be a good idea, but from a numerical standpoint it seems a little unorthodox.  I'd say let the numerical effects remain and let the end user decide what to do.  Why is this needed?

I'll give a very simple example.

octave:20> p = [3;3;3];
octave:21> a = poly(p)
a =

    1   -9   27  -27

octave:22> roots(a)
ans =

   3.0000 + 0.0000i
   3.0000 + 0.0000i
   3.0000 - 0.0000i

octave:23> roots(a)-p
ans =

  3.3231e-05 + 0.0000e+00i
  -1.6615e-05 + 2.8778e-05i
  -1.6615e-05 - 2.8778e-05i

octave:24> mean(roots(a))-p
ans =

   1.7764e-15
   1.7764e-15
   1.7764e-15

Thus the error from "roots" is about 1e-5 for each root. However, the error for the average is about 10 orders of magnitude less!

To be honest, I began from your position and was sure that taking the mean would not ensure an improvement in the result. Doug was certain otherwise. So I decided to justify my skepticism by designing an example where I stacked the deck in my favor.

To my surprise, I failed. All tests confirmed Doug's conclusion. If you can devise an example that succeeds where I failed, I'd love to see it!
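
For anyone who wants to try to break it, here is a quick sketch of the kind of experiment I mean (the pole location and the range of multiplicities are arbitrary):

p0 = 2.5;                                ## true pole location (arbitrary)
for m = 2:6                              ## pole multiplicity
  r = roots (poly (p0 * ones (m, 1)));   ## recover the m-fold root numerically
  err_raw  = max (abs (r - p0));         ## worst individual root error
  err_mean = abs (mean (r) - p0);        ## error of the averaged root
  printf ("m = %d: raw %8.2e, averaged %8.2e\n", m, err_raw, err_mean);
endfor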

 I suppose you are dealing with finding higher order residues,

Yes. 

but could you compute the mean value for use in the rest of the routine, yet leave the values non-averaged when they are returned from the routine?  What is the user expecting?

No, at least I don't think so. The residues and poles are uniquely paired. The returned residues should be paired with the returned poles. Regarding what the user expects, a better question would be what the documentation reflects, and whether it is consistent with Matlab. Regarding the returned variables, they are consistent with the documentation and with Matlab.

I guess my main concerns are 1) is there a standard way of dealing with "nearly zero" in Octave?  2) Should "toler" be replaced by "small"?

Regarding "near zero" ... I don't know how to properly determine what is "nearly zero", but would appreciate anyone telling us/me how.

Regarding, "toler" and "small" these two have two different purposes. "toler" is used to determine the pole multiplicity and "small" is used to round off to zero, pure real, or pure imaginary values. If "toler" is replace by "small" there will be no multiplicity of poles. So I don't think that is practical. 

What we really need is a numerically more accurate method for determining the roots. This entire discussion is a consequence of inaccurate roots. So if we are to address the ills of residue.m, a better roots.m will be needed, in my opinion anyway.

Ben

p.s. I'll incorporate the changes and post an update to the patch.





