[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Qemu-devel] Re: tcg/{ppc, s390, sparc}: branch target and code retransl
From: |
Aurelien Jarno |
Subject: |
[Qemu-devel] Re: tcg/{ppc, s390, sparc}: branch target and code retranslation |
Date: |
Mon, 10 Jan 2011 15:23:17 +0100 |
User-agent: |
Mutt/1.5.18 (2008-05-17) |
On Mon, Jan 10, 2011 at 03:20:40PM +0100, Alexander Graf wrote:
>
> On 10.01.2011, at 15:15, Aurelien Jarno wrote:
>
> > On Mon, Jan 10, 2011 at 03:07:52PM +0100, Alexander Graf wrote:
> >>
> >> On 10.01.2011, at 15:00, Aurelien Jarno wrote:
> >>
> >>> On Mon, Jan 10, 2011 at 01:13:25PM +0100, Alexander Graf wrote:
> >>>>
> >>>> On 06.01.2011, at 23:12, Aurelien Jarno wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> I have just sent a tcg/arm patch concerning code retranslation. You
> >>>>> might want to look at the description (copied below), as from a first
> >>>>> glance ppc, s390 and sparc TCG targets might be affected. If you see
> >>>>> guest kernel panics, some segmentation fault of qemu or in the guest,
> >>>>> strange behaviors, that happen randomly and that looks difficult to
> >>>>> debug it might be the issue.
> >>>>>
> >>>>> Aurelien
> >>>>>
> >>>>>
> >>>>> | QEMU uses code retranslation to restore the CPU state when an
> >>>>> exception
> >>>>> | happens. For it to work the retranslation must not modify the
> >>>>> generated
> >>>>> | code. This is what is currently implemented in ARM TCG.
> >>>>> |
> >>>>> | However on CPU that don't have icache/dcache/memory synchronised like
> >>>>> | ARM, this requirement is stronger and code retranslation must not
> >>>>> modify
> >>>>> | the generated code "atomically", as the cache line might be flushed
> >>>>> | at any moment (interrupt, exception, task switching), even if not
> >>>>> | triggered by QEMU. The probability for this to happen is very low, and
> >>>>> | depends on cache size and associativiy, machine load, interrupts, so
> >>>>> the
> >>>>> | symptoms are might happen randomly.
> >>>>> |
> >>>>> | This requirement is currently not followed in tcg/arm, for the
> >>>>> | load/store code, which basically has the following structure:
> >>>>> | 1) tlb access code is written
> >>>>> | 2) conditional fast path code is written
> >>>>> | 3) branch is written with a temporary target
> >>>>> | 4) slow path code is written
> >>>>> | 5) branch target is updated
> >>>>> | The cache lines corresponding to the retranslated code is not flushed
> >>>>> | after code retranslation as the generated code is supposed to be the
> >>>>> | same. However if the cache line corresponding to the branch
> >>>>> instruction
> >>>>> | is flushed between step 3 and 5, and is not flushed again before the
> >>>>> | code is executed again, the branch target is wrong. In the guest, the
> >>>>> | symptoms are MMU page fault at a random addresses, which leads to
> >>>>> | kernel page fault or segmentation faults.
> >>>>
> >>>> I don't see the problem. If you have separate icache from dcache, the
> >>>> code in question doesn't get executed during the rewrite, so all should
> >>>> be fine. If you have both combined, the data write should automatically
> >>>> modify the cache line, so all is great too.
> >>>
> >>> As far as I understand, this is what happens to the branch target
> >>> instruction, or at least one possibility:
> >>>
> >>> operation | icache | dcache | mem/L2 |
> >>> ------------------------------------------+--------+--------+--------+
> >>> tlb access code is written | absent | absent | ok |
> >>> conditional fast path code is written | absent | absent | ok |
> >>> branch is written with a temporary target | absent | wrong | ok |
> >>> cache line is flushed to memory | absent | absent | wrong |
> >>> slow path code is written | absent | absent | wrong |
> >>> branch target is updated | absent | ok | wrong |
> >>> TB is re-executed | wrong | ok | wrong |
> >>>
> >>> Note that the issue is real, the patch really fixes the issue on ARM and
> >>> MIPS. It's not impossible that I don't fully understand it given I can't
> >>> analyse the cache values at a given time. However, when QEMU crashes
> >>> because of that, we have seen that the executed code doesn't match what
> >>> GDB says, and changing the temporary branch value also changes the way
> >>> QEMU or the guest crash.
> >>>
> >>> The problem doesn't happens very often though (for sure less than 1
> >>> every few millions). The other way to fix it is to issue a cache flush
> >>> after a code retranslation, however, it is something very costly on some
> >>> architectures.
> >>
> >> I'd guess it only happens when code is overwritten. Only then icache can
> >> still be stale in that code region. Can't we just invalidate the icache on
> >> every tb flush? That should also fix the issue I'd guess.
> >
> > That's a solution that works (tested). However making sure that the
> > retranslation doesn't change the code looks better.
>
> Yeah, I agree. Temporary retranslation really shouldn't modify existing code.
> We really do need an icache flush on tb flushes still though as that's a
> separate issue? Or are these already in?
>
We already have an icache flush after the first translation. We don't
have any when the TB is flushed, but I don't think it is something
necessary. The new TB that will replace it will issue an icache flush
after it has been translated.
--
Aurelien Jarno GPG: 1024D/F1BCDB73
address@hidden http://www.aurel32.net
- [Qemu-devel] tcg/{ppc, s390, sparc}: branch target and code retranslation, Aurelien Jarno, 2011/01/06
- Message not available
- [Qemu-devel] Re: tcg/{ppc, s390, sparc}: branch target and code retranslation, Aurelien Jarno, 2011/01/10
- [Qemu-devel] Re: tcg/{ppc, s390, sparc}: branch target and code retranslation, Alexander Graf, 2011/01/10
- [Qemu-devel] Re: tcg/{ppc, s390, sparc}: branch target and code retranslation, Aurelien Jarno, 2011/01/10
- [Qemu-devel] Re: tcg/{ppc, s390, sparc}: branch target and code retranslation, Alexander Graf, 2011/01/10
- [Qemu-devel] Re: tcg/{ppc, s390, sparc}: branch target and code retranslation,
Aurelien Jarno <=
- [Qemu-devel] Re: tcg/{ppc, s390, sparc}: branch target and code retranslation, Alexander Graf, 2011/01/10
- [Qemu-devel] Re: tcg/{ppc, s390, sparc}: branch target and code retranslation, Aurelien Jarno, 2011/01/10
- Re: [Qemu-devel] Re: tcg/{ppc, s390, sparc}: branch target and code retranslation, Edgar E. Iglesias, 2011/01/10
- Re: [Qemu-devel] Re: tcg/{ppc, s390, sparc}: branch target and code retranslation, Alexander Graf, 2011/01/10