[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Qemu-devel] linux-user deadlock race condition with IPC
From: |
François Guimond |
Subject: |
[Qemu-devel] linux-user deadlock race condition with IPC |
Date: |
Wed, 09 Jun 2004 10:20:47 -0400 |
Hello all,
I've been working as a side project of mine to get the NWN Dedicated
Server (server software for BioWare's Neverwinter Nights game) running on
Linux/PPC. So far qemu seems the best candidate for this, however I have
been having issues where the software would just hang in suspend for no
apparent reason. I figured I'd try to debug it, but looking at qemu's code
convinced me otherwise. I might be a good programmer, but I figure it would
be much easier for the code designers to locate and fix than for me to learn
qemu's architecture before I get to do anything to it. Anyway, I did do
some work on this though, and managed to locate when it happens and what the
exact symptoms are. Here goes.
The issue arises when a process communicates with another. As it turns
out, the writing process issues a "write" syscall to the other, followed by
a sigprocmask, and then a sigsuspend which waits for the other process to
signal it when it's done. Easy enough, and this all seems to be atomic when
done natively, but as it occurs with qemu-i386, sometimes the writing
process gets the signal that it's done BEFORE it enters sigsuspend, so of
course this means that when it does enter sigsuspend, nothing will wake it
up.
Here are some snippets from the strace ran on the native i386.
1. First you have the creation of the child process, and the creation of the
pipe used to communicate to it.
pipe([3, 4]) = 0
clone(child_stack=0xdc23048,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND) = 31729
2. Later on, the parent does its write routine on the said pipe to the
child.
rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
write(4, "address@hidden"\326"..., 148)
= 148
rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
rt_sigsuspend([]
3. The child receives it, does some operations, and then issues a kill
syscall to the parent to notify it.
read(3, "address@hidden"\326"..., 148) =
148
mmap2(NULL, 2097152, PROT_READ|PROT_WRITE|PROT_EXEC,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40729000
mprotect(0x40729000, 4096, PROT_NONE) = 0
clone(child_stack=0x40928bc8,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|SIGRT_1) = 31731
kill(31728, SIGRTMIN) = 0
4. Parent's sigsuspend gets interrupted, and program flow resumes.
<unfinished ...> --- SIGRTMIN (Unknown signal 32) @ 0 (0) ---
<... rt_sigsuspend resumed> ) = -1 EINTR (Interrupted system call)
sigreturn() = ? (mask now [RTMIN])
Now the same thing, but running on the PowerPC using qemu-i386. Note
that the same application with the same libraries (I copied over glibc from
the native to the emulated) are used. First what happens when the program
succeeds:
1. Pipe and child creation, identical to native.
pipe([3, 4]) = 0
clone(child_stack=0x60f0dc70,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND) = 27405
2. Write part... I notice that the sigprocmask call is different, and that
the previous mask is also different.. but all seems in order
rt_sigprocmask(SIG_BLOCK, NULL, [FPE], 8) = 0
write(4, "`\200\r2\0\0\0\0\240a\n0\250\353*\10\360\200\316\r\200"..., 148) =
148
rt_sigprocmask(SIG_BLOCK, NULL, [FPE], 8) = 0
rt_sigsuspend([FPE]
3. Child reads it, does a little tango, and signals back... identical
read(3, "`\200\r2\0\0\0\0\240a\n0\250\353*\10\360\200\316\r\200"..., 148) =
148
mmap(NULL, 2097152, PROT_READ|PROT_WRITE|PROT_EXEC,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3279f000
mprotect(0x3279f000, 4096, PROT_NONE) = 0
clone(child_stack=0x60f20600,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|SIGRT_1) = 27417
kill(27404, SIGRTMIN) = 0
4. Parent gets the message, and program flow resumes (note however that the
return here is "Broken pipe", while in the native case it is "Interrupted
system call"
--- SIGRTMIN (Unknown signal 32) @ 0 (0) ---
) = -1 EPIPE (Broken pipe)
As you can see, there are some slight differences, but the end result was
the same, both cases the program resumed alright. However, that is just the
optimistic result, since in most trial runs I did, the above did NOT occur,
and the below happened instead..
1. Pipe and process, looks good.
pipe([3, 4]) = 0
clone(child_stack=0x60f0dc70,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND) = 27429
2. Uh-oh... enters the danger zone... here notice it's missing the
sigprocmask and sigsuspend right after the write (didn't have time to get
there)..
rt_sigprocmask(SIG_BLOCK, NULL, [FPE], 8) = 0
write(4, "`\200\r2\0\0\0\0\240a\n0\250\353*\0100\201\316\r\200\0"..., 148) =
148
3. Well behaved child did the same thing as usual.
read(3, "`\200\r2\0\0\0\0\240a\n0\250\353*\0100\201\316\r\200\0"..., 148) =
148
mmap(NULL, 2097152, PROT_READ|PROT_WRITE|PROT_EXEC,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3279f000
mprotect(0x3279f000, 4096, PROT_NONE) = 0
clone(child_stack=0x60f20600,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|SIGRT_1) = 27431
kill(27428, SIGRTMIN) = 0
4. But dady wasn't ready to receive it... and the ball went right over his
head
--- SIGRTMIN (Unknown signal 32) @ 0 (0) ---
rt_sigprocmask(SIG_BLOCK, NULL, [RTMIN], 8) = 0
rt_sigsuspend([]
Since in all the years the NWN Dedicated Server has been running on
native Linux boxes no one has ever had this bug occur, and it happens
perhaps 4 times out of 5 on my test box, I'm fairly convinced the issue is
with qemu-i386. Perhaps the "minor" changes in signal handling are
provoking this rather disasterous side effect? You qemu developpers are
most likely the best people to narrow this down further and find the
culprit.
-Francois
_________________________________________________________________
MSN Premium includes powerful parental controls and get 2 months FREE*
http://join.msn.com/?pgmarket=en-ca&page=byoa/prem&xAPID=1994&DI=1034&SU=http://hotmail.com/enca&HL=Market_MSNIS_Taglines
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- [Qemu-devel] linux-user deadlock race condition with IPC,
François Guimond <=