Re: [certi-dev] CERTI deadlock


From: Michael Raab
Subject: Re: [certi-dev] CERTI deadlock
Date: Tue, 19 Jun 2012 15:21:07 +0200

Hi Eric,

I'm willing to investigate that issue further on Windows 7. But before I can evaluate any debug and trace messages, I need some redirection of output.
I tried some things today, but unfortunately failed. :-(  
Hope you can provide something in the next days.

Thanks,
Michael



Dipl.-Inform. Michael Raab

Fraunhofer-Institut für Fabrikbetrieb und -automatisierung IFF
Virtuell Interaktives Training
Sandtorstr. 22, 39106 Magdeburg, Germany                
Phone +49 (0) 391/ 40 90 122
Fax +49 (0) 391/ 40 90 115
address@hidden
http://www.iff.fraunhofer.de or http://www.vdtc.de



From:        Eric Noulard <address@hidden>
To:        CERTI development discussions <address@hidden>
Date:        19.06.2012 13:42
Subject:        Re: [certi-dev] CERTI deadlock
Sent by:        address@hidden




2012/6/19 Michael Raab <address@hidden>:
> Hi Eric,
>
> I tried running the simulation using tick() and so far no deadlock
> has occurred. So there seems to be something wrong with tick(min,max)...

I was afraid of that. That, plus the fact that we have never faced such a
problem on Linux/Unix, makes me think there is something wrong with the
timeout handling code on Windows.

> I tried to debug into the blocked federate but I got not much information.
> Here's the stack:
>
>          ntdll.dll!773bf8b1()
>          [Frames below may be incorrect and/or missing, no symbols loaded for ntdll.dll]
>          ntdll.dll!773bf8b1()
>          mswsock.dll!72d317cd()
>          mswsock.dll!72d36d30()
>          ntdll.dll!773ce38c()
>          msvcp80.dll!std::basic_string<char,std::char_traits<char>,std::allocator<char> >::_Grow(unsigned int _Newsize=269108997, bool _Trim=false)  Zeile 2056        C++
>          CERTI.dll!certi::Message::Message()  Zeile 40        C++
>          ws2_32.dll!75236a28()
>          msvcr80.dll!free(void * pBlock=0x00000000)  Zeile 110        C
>>        rtia.exe!certi::rtia::Communications::readMessage(int & n=2, certi::NetworkMessage * * msg_reseau=0x00dcfe6c, certi::Message * * msg=0x00dcfe74, timeval * timeout=0x00dcfe60)  Zeile 256 + 0x19 Bytes        C++
>          rtia.exe!certi::rtia::RTIA::execute()  Zeile 152        C++
>          rtia.exe!main(int argc=3, char * * argv=0x00e08ac8)  Zeile 118        C++
>          rtia.exe!__tmainCRTStartup()  Zeile 597 + 0x17 Bytes        C
>          kernel32.dll!7644339a()
>          ntdll.dll!773d9ef2()
>          ntdll.dll!773d9ec5()
>
> It seems to me that the federate is waiting for incoming messages, but I
> guess you know it better...
>
> Hope that helps,

As you can see the call to certi::rtia::Communications::readMessage
is done **with** timeout, because the "timeval* timeout" arg is not NULL.

Moreover "Zeile 256" (line 256 I think) is the win32 select call:

255: #ifdef _WIN32
256:         if (select(max_fd, &fdset, NULL, NULL, timeout) < 0) {
257:             if (WSAGetLastError() == WSAEINTR)
258: #else
259:         if (select(max_fd+1, &fdset, NULL, NULL, timeout) < 0) {
260:             if (errno == EINTR)
261: #endif


So this call should definitely terminate....
At least that's what I understand from:
http://msdn.microsoft.com/en-us/library/windows/desktop/ms740141%28v=vs.85%29.aspx

Maybe you can try adding some traces in
"certi::rtia::Communications::readMessage"
in order to see whether the timeout actually occurs.
Maybe the timeout value is outrageously big?
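
Such a trace could look roughly like the sketch below. It assumes a POSIX
build; on Windows the include would be winsock2.h and the error check would
use WSAGetLastError(), as in the excerpt above. The names SelectTrace,
tracedSelect and demoTimesOut are hypothetical and not part of CERTI. The
trace goes to stderr so that it can be redirected to a file.

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

// Hypothetical result record, not part of CERTI.
struct SelectTrace {
    int result;            // return value of select(): 0 means the timeout expired
    double elapsedSeconds; // wall-clock time spent inside select()
};

// Hypothetical helper: waits until fd is readable and logs what happened.
inline SelectTrace tracedSelect(int fd, timeval* timeout) {
    assert(fd >= 0 && fd < FD_SETSIZE); // FD_SET is only valid in this range
    fd_set fdset;
    FD_ZERO(&fdset);
    FD_SET(fd, &fdset);

    // Log the timeout that is really passed down, to spot absurd values.
    if (timeout != nullptr) {
        std::fprintf(stderr, "select: timeout = %ld s + %ld us\n",
                     static_cast<long>(timeout->tv_sec),
                     static_cast<long>(timeout->tv_usec));
    } else {
        std::fprintf(stderr, "select: no timeout, may block forever\n");
    }

    const auto start = std::chrono::steady_clock::now();
    const int rc = select(fd + 1, &fdset, nullptr, nullptr, timeout);
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;

    const char* what = (rc == 0) ? "timeout" : (rc < 0) ? "error" : "data ready";
    std::fprintf(stderr, "select returned %d (%s) after %.3f s\n",
                 rc, what, elapsed.count());
    return SelectTrace{rc, elapsed.count()};
}

// Demo: nothing is ever written to the pipe, so select() should time out.
inline bool demoTimesOut() {
    int fds[2];
    if (pipe(fds) != 0) return false;
    timeval tv;
    tv.tv_sec = 0;
    tv.tv_usec = 50000; // 50 ms
    const SelectTrace trace = tracedSelect(fds[0], &tv);
    close(fds[0]);
    close(fds[1]);
    return trace.result == 0;
}
```

If the "select returned" line never shows up in the blocked federate, the
call itself is stuck; if it shows up with a huge timeout, the value handed
to select is the suspect.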

The translation from TickTime (which is a double) to timeval is
done in RTIA::execute:
122:         struct timeval timev;
123:         timev.tv_sec = int(tm->_tick_timeout);
124:         timev.tv_usec = int((tm->_tick_timeout-timev.tv_sec)*1000000.0);

seems OK as well?
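
For a value such as 1.5 this gives tv_sec = 1 and tv_usec = 500000. The
int() casts would only misbehave for negative, NaN or very large inputs.
A defensive variant could look like the sketch below; the helper name
toTimeval and the one-hour upper bound are assumptions made for
illustration only, not what CERTI does.

```cpp
#include <cassert>
#include <cmath>
#ifdef _WIN32
#include <winsock2.h>
#else
#include <sys/time.h>
#endif

// Hypothetical upper bound, chosen for illustration only.
const double kMaxTickTimeoutSeconds = 3600.0;

// Hypothetical helper: converts a timeout in seconds (a double, like
// TickTime) into a timeval, clamping values the plain casts cannot handle.
inline timeval toTimeval(double seconds) {
    // NaN and negative timeouts are treated as "do not wait".
    if (std::isnan(seconds) || seconds < 0.0) {
        seconds = 0.0;
    }
    // Huge values would overflow the integer cast, so clamp them.
    if (seconds > kMaxTickTimeoutSeconds) {
        seconds = kMaxTickTimeoutSeconds;
    }

    const double whole = std::floor(seconds);
    timeval tv;
    tv.tv_sec = static_cast<long>(whole);
    tv.tv_usec = static_cast<long>((seconds - whole) * 1000000.0);

    // The microsecond part must stay within [0, 1000000).
    assert(tv.tv_usec >= 0 && tv.tv_usec < 1000000);
    return tv;
}
```

Tracing the result of such a conversion next to the original double would
show directly whether an out-of-range timeout ever reaches select.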

Maybe you can try varying the timeout duration you use in order to
see if it changes anything.

I'm really sorry to say this, but I'm not in a position to
investigate this on Windows,
so the best I can do for now is to try to guess what happens in your
experiment.

Steffen and Massimiliano also faced a "deadlock":
https://savannah.nongnu.org/bugs/?34922

Maybe they can help with some testing on their side?


--
Erk
Le gouvernement représentatif n'est pas la démocratie --
http://www.le-message.org

--
CERTI-Devel mailing list
address@hidden
https://lists.nongnu.org/mailman/listinfo/certi-devel

