From: Fabien Chouteau
Subject: Re: [Qemu-devel] [PATCH] thread-win32: fix GetThreadContext() permanently fails
Date: Tue, 23 Jun 2015 11:49:30 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0

On 06/23/2015 08:02 AM, Stefan Weil wrote:
> Am 22.06.2015 um 23:54 schrieb Zavadovsky Yan:
>> Calling SuspendThread() is not enough to suspend a Win32 thread.
>> We need to call GetThreadContext() after SuspendThread()
>> to make sure that the OS has really suspended the target thread.
>> But GetThreadContext() needs the THREAD_GET_CONTEXT
>> access right on the thread object.
>>
>> This patch adds THREAD_GET_CONTEXT to the OpenThread() arguments
>> and changes 'while (GetThreadContext() == SUCCESS)' to
>> 'while (GetThreadContext() == FAILED)'.
>> So this 'while' loop will stop only after successfully grabbing
>> the thread context (i.e. when the thread is really suspended),
>> not after the first failed GetThreadContext() call.
>>
>> Signed-off-by: Zavadovsky Yan <address@hidden>
>> ---
>>   cpus.c                   | 2 +-
>>   util/qemu-thread-win32.c | 4 ++--
>>   2 files changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/cpus.c b/cpus.c
>> index b85fb5f..83d5eb5 100644
>> --- a/cpus.c
>> +++ b/cpus.c
>> @@ -1097,7 +1097,7 @@ static void qemu_cpu_kick_thread(CPUState *cpu)
>>            * suspended until we can get the context.
>>            */
>>           tcgContext.ContextFlags = CONTEXT_CONTROL;
>> -        while (GetThreadContext(cpu->hThread, &tcgContext) != 0) {
>> +        while (GetThreadContext(cpu->hThread, &tcgContext) == 0) {
>>               continue;

This looks like a reasonable change; right now I don't understand why I
did it the other way...

>>           }
>>   diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
>> index 406b52f..823eca1 100644
>> --- a/util/qemu-thread-win32.c
>> +++ b/util/qemu-thread-win32.c
>> @@ -406,8 +406,8 @@ HANDLE qemu_thread_get_handle(QemuThread *thread)
>>         EnterCriticalSection(&data->cs);
>>       if (!data->exited) {
>> -        handle = OpenThread(SYNCHRONIZE | THREAD_SUSPEND_RESUME, FALSE,
>> -                            thread->tid);
>> +        handle = OpenThread(SYNCHRONIZE | THREAD_SUSPEND_RESUME | THREAD_GET_CONTEXT,
>> +                            FALSE, thread->tid);
>>       } else {
>>           handle = NULL;
>>       }
> 
> 
> I added the contributors of the original code to the cc list.
> 
> The modifications look reasonable - if GetThreadContext is needed at all.
> We should add a URL to reliable documentation which supports that
> claim.
>

The reason we need this call is that on a multi-processor host, the TCG
thread and the IO-loop thread may not run on the same CPU.

In that situation, SuspendThread() can return even before the thread
(running on another CPU) is effectively suspended.

Unfortunately this is not really documented by Microsoft, and we found
that information somewhere on the Internet (if you want I can search for
the source again, but there's nothing official) after countless hours of
debugging a very nasty race condition caused by this undocumented
behavior.
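To make the discussion concrete, here is a minimal sketch of the Win32
idiom in question (not the exact QEMU code; function name is mine, error
handling omitted). The handle is assumed to have been opened with the
THREAD_SUSPEND_RESUME and THREAD_GET_CONTEXT access rights, which is
what the second hunk of the patch ensures:

```c
#include <windows.h>

/* Suspend a thread and wait until it has actually stopped.
 * On an SMP host, SuspendThread() may return while the target thread,
 * running on another CPU, is still executing. GetThreadContext() only
 * succeeds once the kernel has really stopped the thread, so we loop
 * until it succeeds (it returns 0 on failure).
 */
static void suspend_thread_blocking(HANDLE hThread)
{
    CONTEXT ctx;

    SuspendThread(hThread);             /* asynchronous on SMP hosts */
    ctx.ContextFlags = CONTEXT_CONTROL;
    while (GetThreadContext(hThread, &ctx) == 0) {
        /* Not suspended yet (or the handle lacks THREAD_GET_CONTEXT). */
        continue;
    }
}
```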

Maybe this is not explicit enough and the comments need to be updated.


> Is it a good idea to run a busy waiting loop? Or would a Sleep(0) in
> the loop be better (it allows other threads to run, maybe it helps
> them to suspend, too).
>

Maybe we can, but the "while" will only loop when the threads are running
on different CPUs, so the other thread is already running and calling
Sleep() will not help, I think.
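For completeness, the Sleep(0) variant being discussed would look
something like this (a hypothetical alternative, not part of the patch;
function name is mine):

```c
#include <windows.h>

/* Hypothetical yielding variant of the wait loop (Win32 only).
 * Sleep(0) gives up the rest of the current time slice to another
 * ready thread; as noted above, this is unlikely to help here,
 * because the loop only spins when the target thread is already
 * running on a different CPU.
 */
static void wait_for_suspend_yielding(HANDLE hThread)
{
    CONTEXT ctx;

    ctx.ContextFlags = CONTEXT_CONTROL;
    while (GetThreadContext(hThread, &ctx) == 0) {
        Sleep(0);   /* yield instead of busy-waiting */
    }
}
```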

I hope this is clear; as I said, we spent a huge amount of time debugging
this about a year and a half ago. The bug would happen once every
several thousand tests. The QEMU thread code is very "sensitive" on
Windows, so we should be careful.

Yan, if you didn't already, I recommend you test this modification
extensively. By extensively, I mean running QEMU several thousand times
on an SMP host (with many CPUs, like 8 or 16, if possible).

Regards,


