RE: Interrupted System Call EINTR on Solaris
From: Paul D. Smith
Subject: RE: Interrupted System Call EINTR on Solaris
Date: Tue, 4 Apr 2006 12:42:54 -0400
%% "CARTER-HITCHIN, David, GBM" <address@hidden> writes:
cdg> Hi Paul,
>> http://make.paulandlesley.org/jobserver.html
>>
>> Look at the section "SA-non-RESTARTer?" I've had conversations with
>> knowledgeable people who claim that the POSIX guarantee for SA_RESTART
>> is not as ironclad as one would assume, and that technically
>> Solaris is not violating the letter of the spec, so...
cdg> This is bad news.
cdg> Would it be possible for you (or your knowledgeable friends) to
cdg> knock up a small test case illustrating this problem?
The thing is, it's pretty hard to reproduce in a reliable way. You
basically have to get the timing exactly right so that the signal comes
in right when the system call is running. Not easy to do.
However, you really don't need to reproduce this: the situation is well
known and understood by Sun, and there's no question about what the
behavior is. Sun maintains that it works as designed and expected and
that the behavior is allowed by the relevant standards, and as far as I
know, they aren't interested in changing it.
You can find more info in the threads below, including posts by Casper
Dik, who works for Sun and knows a ton about these issues.
Just ignore Rev. Don Kool when he starts spouting his usual inane drivel:
http://groups.google.com/group/comp.unix.solaris/browse_thread/thread/d6e3339bd36504c8/a162a5cd7ff45340?lnk=st&q=SA_RESTART+solaris+make&rnum=2&hl=en#a162a5cd7ff45340
http://groups.google.com/group/comp.unix.solaris/browse_thread/thread/698f23c99f7532e0/a20dfa1b940b5e63?lnk=st&q=SA_RESTART+solaris+make&rnum=3&hl=en#a20dfa1b940b5e63
I think there's even a case mentioned by Paul Eggert that he filed with
Sun that you can reference (although he says it was closed).
cdg> Having said all that, if there are other systems that do not
cdg> implement SA_RESTART properly, then I guess it is safer sticking
cdg> with 'defensive coding' techniques. Nevertheless it would be
cdg> still worth getting Sun to fix their O/S, as this might be
cdg> causing problems for other apps.
There are others, indeed.
I've implemented a good bit of "defensive coding", especially in the
obvious areas. However, it's simply not possible to defensively code
around every possible system call that might fail: many are hidden
behind normal C runtime functions (printf(), for example).
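
As a minimal sketch of what "defensive coding" usually means here (this is not the code from GNU make; read_retry and write_all are hypothetical helper names), a direct system call can be wrapped so that it is simply reissued when it fails with EINTR:

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: call read() again whenever it is interrupted
   by a signal before transferring any data. */
static ssize_t read_retry(int fd, void *buf, size_t len)
{
    ssize_t n;

    do {
        n = read(fd, buf, len);
    } while (n < 0 && errno == EINTR);
    return n;
}

/* Hypothetical helper: write the whole buffer, retrying on EINTR and
   continuing after short writes. Returns len on success, -1 on error. */
static ssize_t write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t left = len;

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;      /* interrupted: just try again */
            return -1;         /* real error */
        }
        p += n;
        left -= (size_t)n;
    }
    return (ssize_t)len;
}
```

This only helps for calls the program makes directly. A read() or write() issued inside the C library on behalf of printf() or similar cannot be wrapped this way, which is the limitation described above.
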
I would be VERY interested in hearing from people using GNU make 3.81 in
massively parallel situations about how often they still see these sorts
of failures. Note that the parallelism has to be limited: using "-j" with
no limit won't use the jobserver at all, so it won't show this problem.
--
-------------------------------------------------------------------------------
Paul D. Smith <address@hidden> Find some GNU make tips at:
http://www.gnu.org http://make.paulandlesley.org
"Please remain calm...I may be mad, but I am a professional." --Mad Scientist