qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 0/6] Drop the irredeemably racy cpu_unlink_tb()


From: Michael Tokarev
Subject: Re: [Qemu-devel] [PATCH 0/6] Drop the irredeemably racy cpu_unlink_tb()
Date: Thu, 09 May 2013 12:09:39 +0400
User-agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:17.0) Gecko/17.0 Icedove/17.0

[Rehashing a relatively old thread.
 It is available, in particular, at 
http://thread.gmane.org/gmane.comp.emulators.qemu/196629]

22.02.2013 22:09, Peter Maydell wrote:
> This patch series gets rid of cpu_unlink_tb(), which is irredeemably
> racy, since it modifies the TB graph with no locking from other
> threads, signal handlers, etc etc. (The signal handler case is
> why you can't just fix this with more locks.) Instead we take the
> much simpler approach of setting a flag for the CPU when we want
> it to stop executing TBs, and generate code to check the flag at
> the start of every TB. The raciness is easiest to provoke with
> multithreaded linux-user guests but it is I think also a risk
> in system emulation mode.
>
> This fixes the crashes seen in LP:668799; however there are another
> class of crashes described in LP:1098729 which stem from the fact
> that in linux-user with a multithreaded guest all threads will
> use and modify the same global TCG date structures (including the
> generated code buffer) without any kind of locking. This means that
> multithreaded guest binaries are still in the "unsupported" category.

Now when Debian Wheezy has been released with qemu 1.2, users started
to file bugreports about multithreaded apps crashing.  So, while I
understand these are "unsupported" as per above, I think it is better
to fix as much as we can, and allow some more usage cases, than to
describe that it "does not work".  (And yes, I also understand it's
better to fix that before a release, but oh well).

So I tried to backport the series to 1.2 branch, to see how it goes.
But there were many changes since 1.2, so it ended up in quite some
changes, -- not exactly huge and complicated, but I need some help
or additional pair (or two) of eyes to ensure the sanity of the
resulting code.

The backported patchset is smaller than the original.

> Andreas Färber (1):
>   cpu: Introduce ENV_OFFSET macros

This change isn't needed in 1.2, because all CPU state is in single
place there, it hasn't been split between env and cpu states there.

> Peter Maydell (5):
>   tcg: Document tcg_qemu_tb_exec() and provide constants for low bit uses

Just a minor context difference in the header, no questions.

>   cpu-exec: wrap tcg_qemu_tb_exec() in a fn to restore the PC

This needed a "back merge" of env+cpu states back to cpu.
Maybe we should somehow revisit the whole concept of the
two, because it's sorta fun: at some point all functions
has been converted to accept `cpu' instead of `env', but
in many places the first thing a function does is to
get `env' pointer out of `cpu'.  The backport of this
change reverts this piggy-backing and uses `env' everywhere
consistently again.

>   Handle CPU interrupts by inline checking of a flag

The main patch in the series.

The same simplification from `cpu' back to `env' as above.
I had to add `tcg_exit_req' field into CPU_COMMON macro in
cpu-defs.h, instead of adding it to CPUState struct in
include/qom/cpu.h.

Plus the corresponding changes in gen-icount.h -- that's
where ENV_OFFSET was used, but due to the same env to cpu
split, not needed anymore.  My main question is actually
about this place, I don't exactly understand how the
code generation works, so am not sure if I backported
it correctly.

Plus minor code shuffling - for one, bits in translate-all.c
was in exec.c before, so I had to modify it there.

>   translate-all.c: Remove cpu_unlink_tb()

That's easy, but again the bits being removed are in
exec.c in 1.2, not in translate-all.c (so the resulting
backported patch is now misnamed).

Technically this patch isn't needed for a backport,
since the function(s) are really unused, but gcc
generates a warning about unused static function
and the compilation fails due to -Werror.

>   gen-icount.h: Rename gen_icount_start/end to gen_tb_start/end

And I didn't try to backport this one, since this is
just mechanical s/icount/tb/g without any code changes.

Now, the resulting thing compiles (ha!), but I'm not
really sure how to test it.  I ran a few random apps
using qemu-i386 and qemu-arm, it appears to work.

If all goes well, I think this series needs to be
included in whatever -stable qemu series are in use.

I'll post the 4 resulting patches in reply to this
message.

Thank you!

/mjt



reply via email to

[Prev in Thread] Current Thread [Next in Thread]