qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-devel] linux-user deadlock race condition with IPC


From: François Guimond
Subject: [Qemu-devel] linux-user deadlock race condition with IPC
Date: Wed, 09 Jun 2004 10:20:47 -0400

Hello all,

I've been working as a side project of mine to get the NWN Dedicated Server (server software for BioWare's Neverwinter Nights game) running on Linux/PPC. So far qemu seems the best candidate for this, however I have been having issues where the software would just hang in suspend for no apparent reason. I figured I'd try to debug it, but looking at qemu's code convinced me otherwise. I might be a good programmer, but I figure it would be much easier for the code designers to locate and fix than for me to learn qemu's architecture before I get to do anything to it. Anyway, I did do some work on this though, and managed to locate when it happens and what the exact symptoms are. Here goes.

The issue arises when a process communicates with another. As it turns out, the writing process issues a "write" syscall to the other, followed by a sigprocmask, and then a sigsuspend which waits for the other process to signal it when it's done. Easy enough, and this all seems to be atomic when done natively, but as it occurs with qemu-i386, sometimes the writing process gets the signal that it's done BEFORE it enters sigsuspend, so of course this means that when it does enter sigsuspend, nothing will wake it up.



  Here are some snippets from the strace ran on the native i386.

1. First you have the creation of the child process, and the creation of the pipe used to communicate to it.

pipe([3, 4]) = 0
clone(child_stack=0xdc23048, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND) = 31729

2. Later on, the parent does its write routine on the said pipe to the child.

rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
write(4, "address@hidden"\326"..., 148) = 148
rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
rt_sigsuspend([]

3. The child receives it, does some operations, and then issues a kill syscall to the parent to notify it.

read(3, "address@hidden"\326"..., 148) = 148 mmap2(NULL, 2097152, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40729000
mprotect(0x40729000, 4096, PROT_NONE) = 0
clone(child_stack=0x40928bc8, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|SIGRT_1) = 31731
kill(31728, SIGRTMIN) = 0

4. Parent's sigsuspend gets interrupted, and program flow resumes.

<unfinished ...> --- SIGRTMIN (Unknown signal 32) @ 0 (0) ---
<... rt_sigsuspend resumed> ) = -1 EINTR (Interrupted system call)
sigreturn() = ? (mask now [RTMIN])



Now the same thing, but running on the PowerPC using qemu-i386. Note that the same application with the same libraries (I copied over glibc from the native to the emulated) are used. First what happens when the program succeeds:

1. Pipe and child creation, identical to native.

pipe([3, 4]) = 0
clone(child_stack=0x60f0dc70, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND) = 27405

2. Write part... I notice that the sigprocmask call is different, and that the previous mask is also different.. but all seems in order

rt_sigprocmask(SIG_BLOCK, NULL, [FPE], 8) = 0
write(4, "`\200\r2\0\0\0\0\240a\n0\250\353*\10\360\200\316\r\200"..., 148) = 148
rt_sigprocmask(SIG_BLOCK, NULL, [FPE], 8) = 0
rt_sigsuspend([FPE]

3. Child reads it, does a little tango, and signals back...  identical

read(3, "`\200\r2\0\0\0\0\240a\n0\250\353*\10\360\200\316\r\200"..., 148) = 148 mmap(NULL, 2097152, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3279f000
mprotect(0x3279f000, 4096, PROT_NONE) = 0
clone(child_stack=0x60f20600, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|SIGRT_1) = 27417
kill(27404, SIGRTMIN) = 0

4. Parent gets the message, and program flow resumes (note however that the return here is "Broken pipe", while in the native case it is "Interrupted system call"

--- SIGRTMIN (Unknown signal 32) @ 0 (0) ---
) = -1 EPIPE (Broken pipe)



As you can see, there are some slight differences, but the end result was the same, both cases the program resumed alright. However, that is just the optimistic result, since in most trial runs I did, the above did NOT occur, and the below happened instead..

1. Pipe and process, looks good.

pipe([3, 4]) = 0
clone(child_stack=0x60f0dc70, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND) = 27429

2. Uh-oh... enters the danger zone... here notice it's missing the sigprocmask and sigsuspend right after the write (didn't have time to get there)..

rt_sigprocmask(SIG_BLOCK, NULL, [FPE], 8) = 0
write(4, "`\200\r2\0\0\0\0\240a\n0\250\353*\0100\201\316\r\200\0"..., 148) = 148

3. Well behaved child did the same thing as usual.

read(3, "`\200\r2\0\0\0\0\240a\n0\250\353*\0100\201\316\r\200\0"..., 148) = 148 mmap(NULL, 2097152, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3279f000
mprotect(0x3279f000, 4096, PROT_NONE) = 0
clone(child_stack=0x60f20600, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|SIGRT_1) = 27431
kill(27428, SIGRTMIN) = 0

4. But dady wasn't ready to receive it... and the ball went right over his head

--- SIGRTMIN (Unknown signal 32) @ 0 (0) ---
rt_sigprocmask(SIG_BLOCK, NULL, [RTMIN], 8) = 0
rt_sigsuspend([]



Since in all the years the NWN Dedicated Server has been running on native Linux boxes no one has ever had this bug occur, and it happens perhaps 4 times out of 5 on my test box, I'm fairly convinced the issue is with qemu-i386. Perhaps the "minor" changes in signal handling are provoking this rather disasterous side effect? You qemu developpers are most likely the best people to narrow this down further and find the culprit.

-Francois

_________________________________________________________________
MSN Premium includes powerful parental controls and get 2 months FREE* http://join.msn.com/?pgmarket=en-ca&page=byoa/prem&xAPID=1994&DI=1034&SU=http://hotmail.com/enca&HL=Market_MSNIS_Taglines





reply via email to

[Prev in Thread] Current Thread [Next in Thread]