qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] Signal handling bugs (and proposed fixes)


From: Peter Maydell
Subject: Re: [Qemu-devel] Signal handling bugs (and proposed fixes)
Date: Thu, 3 Apr 2014 16:53:52 +0100

On 2 April 2014 23:04, Andrei E. Warkentin <address@hidden> wrote:
> Hiya,
>
> I found a couple of corner cases where signal handling fails in QEMU
> linux-user support. "Signal handling" here being just a symptom -
> actual problems are in TB / page management.
>
> Here are a couple of simple tests
> (https://github.com/andreiw/andreiw-wip/tree/master/qemu/tests). The
> test:
> 1) Creates a page with two instructions - an illegal instruction and a 'ret'.
> 2) Sets up two handlers - SIGILL and SIGSEGV. On SIGILL, the page with
> the instructions is unmapped (in sigtest.c, and set to PROT_NONE in
> sigtest_mprotect.c), and the instruction pointer is set to point to
> the next instruction - the ret. On SIGSEGV, the page is mapped back in
> and the execution is supposed to continue successfully.
>
> The first problem is a bug in page_check_range that cuts the loop
> after the first page that needs to be  unprotected (whereas all pages
> need to be checked in the range).
>
> https://github.com/andreiw/andreiw-wip/blob/master/qemu/0001-qemu-fix-page_check_range.patch
>
> With this patch, the above tests now hang instead of segfaulting QEMU.

This patch looks good to me.

Reviewed-by: Peter Maydell <address@hidden>

> The second problem is a bit more involved - the code generator assumes
> that the translated code is always present. Jumping to NULL to land in
> a SIGSEGV handler, or expecting to "fault in" code like the tests
> above do, results in a deadlock condition. On a signal, QEMU tries
> delivering the signal, longjmps back to cpu_loop, but the
> tcg_ctx.tb_ctx.tb_lock is still held. So tb_gen_code must allow
> backing out gracefully if the memory is unreadable.
>
> I took the simpler approach - making sure the page didn't look
> unmapped or mprotected to PROT_NONE. Checking explicit access bits is
> a bad idea because different arches have different implicit access
> right (i.e. X implies R, or W implies R, or R implies X, etc). Another
> alternative is to have a safe memory probe function - i.e. take a
> SIGSEGV and patch the return to return EFAULT (like a kernel ;-)), but
> this would need an implementation for every arch QEMU runs on.
>
> Verified both on AArch64 and i386 linux-user targets. Both tests
> successfully exit.
>
> https://github.com/andreiw/andreiw-wip/blob/master/qemu/0002-qemu-handle-tb_gen_code-getting-called-for-unmapped-.patch
> https://github.com/andreiw/andreiw-wip/blob/master/qemu/0003-x86-implement-EXCP_TB_EFAULT.patch

I'm not so convinced about these two. I think the correct
approach is to make sure we unlock the tb_lock in the
code path where we longjumped out of the frontend
decoder. I'll try to cook up a patch for this.

thanks
-- PMM



reply via email to

[Prev in Thread] Current Thread [Next in Thread]