From: Paolo Bonzini
Subject: Re: [Qemu-devel] javac crash in user-mode emulation: races on page_unprotect()
Date: Mon, 27 Nov 2017 15:38:47 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0

On 24/11/2017 18:18, Peter Maydell wrote:
> * threads A & B both try to do a write to a page with code in it at
> the same time (ie which we've made non-writeable, so SEGV)
> * they race into the signal handler with this faulting address
> * thread A happens to get to page_unprotect() first and takes the
> mmap lock, so thread B sits waiting for it to be done
> * A then finds the page, marks it PAGE_WRITE and mprotect()s it writable
> * A can then continue OK (returns from signal handler to retry the
> memory access)
> * ...but when B gets the mmap lock it finds that the page is already
> PAGE_WRITE, and so it exits page_unprotect() via the "not due to
> protected translation" code path, and wrongly delivers the signal
> to the guest rather than just retrying the access
>
> I'm not sure how best to fix this. We could make page_unprotect()
> say "if PAGE_WRITE is set, assume this call raced with another one
> and say 'this was caused by protected translation' without doing
> anything".
Yes, I think this is the only solution since SIGSEGV is raised
asynchronously. Even using a trylock would only narrow the race window
but not fix it.
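
To make the proposal concrete, here is a minimal toy model of it, not the actual QEMU code. The names model_mmap_lock, model_page_flags and model_page_unprotect() are hypothetical, the flag values are illustrative, and the PAGE_WRITE_ORG check ("the guest originally mapped this page writable") is an assumption about how the real function distinguishes protected translations. The only point is the first branch: a caller that finds PAGE_WRITE already set assumes it lost the race and asks for a retry.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>

/* Illustrative flag values; only PAGE_WRITE is named in this thread. */
#define PAGE_WRITE      0x02  /* page is currently writable */
#define PAGE_WRITE_ORG  0x10  /* assumption: guest originally mapped it writable */

/* Toy stand-ins for the mmap lock and for one page's flags. */
static pthread_mutex_t model_mmap_lock = PTHREAD_MUTEX_INITIALIZER;
static int model_page_flags = PAGE_WRITE_ORG;  /* write-protected: holds code */

/*
 * Returns 1 if the faulting access should simply be retried,
 * 0 if the fault is genuine and must be delivered to the guest.
 */
static int model_page_unprotect(void)
{
    int handled = 0;

    pthread_mutex_lock(&model_mmap_lock);
    if (model_page_flags & PAGE_WRITE) {
        /* Already writable: assume another thread unprotected the page
         * while we were waiting for the lock, so just retry. */
        handled = 1;
    } else if (model_page_flags & PAGE_WRITE_ORG) {
        /* We got here first: make the page writable again. Real code
         * would also mprotect() it and invalidate the translations. */
        model_page_flags |= PAGE_WRITE;
        handled = 1;
    }
    pthread_mutex_unlock(&model_mmap_lock);

    return handled;
}
```

In this model both thread A (first call) and thread B (second call) get 1 back, while a page that was never writable for the guest still yields 0.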
> But I have a feeling that will mean we could end up looping
> endlessly if we get a SEGV for a write to a writeable page (not
> sure when this could happen, but maybe alignment issues?).
Those would have to be detected via si_code (for the specific case of
invalid address alignment, that would be a SIGBUS with
si_code==BUS_ADRALN, not a SIGSEGV).

In general, I think that only SIGSEGV/SEGV_ACCERR needs to go down the
page_unprotect path.
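
As a rough sketch of that filter (the helper name fault_may_be_write_protect() is hypothetical, and where exactly such a check would sit in the signal handler is left open), the decision could look something like this, using only the standard si_code constants from <signal.h>:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdbool.h>

/*
 * Hypothetical helper: decide from the signal number and si_code whether
 * a fault could have been caused by a page we write-protected ourselves.
 * Only a permission fault (SIGSEGV with SEGV_ACCERR) qualifies; unmapped
 * addresses (SEGV_MAPERR) and alignment faults (SIGBUS/BUS_ADRALN) do not.
 */
static bool fault_may_be_write_protect(int signo, int si_code)
{
    return signo == SIGSEGV && si_code == SEGV_ACCERR;
}
```

A handler installed with SA_SIGINFO would call this with info->si_signo and info->si_code, and only try page_unprotect() when it returns true. Anything else would be passed to the guest directly, so it cannot loop on the "already writable, retry" path.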
Paolo