qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2 0/3] Fix exceptions handling for MIPS and i38


From: Aurelien Jarno
Subject: Re: [Qemu-devel] [PATCH v2 0/3] Fix exceptions handling for MIPS and i386
Date: Wed, 17 Jun 2015 15:24:46 +0200
User-agent: Mutt/1.5.23 (2014-03-12)

On 2015-06-17 15:41, Pavel Dovgalyuk wrote:
> This set of patches fixes exception handling for MIPS and i386 targets.
> These targets contain instructions that break correct execution in 
> icount/TCG modes (MIPS) and in regular TCG mode (i386).

Just to be clear, this is not something specific to MIPS and i386.
Every target which call cpu_loop_exit without doing a retranslation is
affected by the icount bug.

> Incorrect execution for i386 is causes by exceptions raised by MMU functions.
> MMU helper functions are called from generated code and other helper
> functions. In both cases they try to get function's return address for
> restoring virtual CPU state.
> 
> When MMU helper is called from some other helper function
> (like helper_maskmov_xmm) through cpu_st* function, the return address
> will point to that helper. That is why CPU state cannot be restored in
> the case of MMU fault.
> 
> This bug can occur when maskmov instruction is located in the middle of the
> translation block.
> 
> Execution sequence for this example:
> 
> TB start:
> PC1: instr1
>      instr2
> PC2: maskmov <page fault>
>      <page fault processing>
> PC1: instr1
>      instr2
>      maskmov
> 
> At the start of TB execution guest PC points to instr1. When page fault occurs
> QEMU tries to restore guest PC (which should be equal to PC2). It reads host 
> PC
> from the call stack and checks whether it points to TB or not. Bug in ldst
> helpers implementation provides incorrect host PC, which is not located within
> the TB. That's why QEMU cannot recover guest PC and it remains the same (PC1).
> After page fault processing QEMU restarts TB and executes instr1 and instr2
> for the second time, because guest PC was not recovered.

Also just to be clear, the way you propose is just one way to fix the
issue. The other way (which is used for all similar instructions) is to
call just before the helper:
    gen_update_cc_op(s);
    gen_jmp_im(pc_start - s->cs_base);
                
> Bugs in MIPS helper functions do not break the execution in regular TCG mode,
> because PC value is updated before calling the functions that can raise
> an exception. But icount value cannot be updated this way. Therefore
> exceptions make execution in icount mode non-determinisic.
> In icount mode every translation block looks as follows:
> 
> if icount < n then exit
> icount -= n
> instr1
> instr2
> ...
> instrn
> exit
> 
> When one of these instructions initiates an exception, icount should be 
> restored and adjusted number of instructions should be subtracted from icount
> instead of initial n.
> 
> tlb_fill function passes retaddr to raise_exception, which allows restoring
> current instructions in TB and correct icount calculation.
> 
> When exception triggered with other function (e.g. by embedding call to 
> exception raising helper into TB), then PC is not passed as retaddr and
> correct icount is not recovered. In such cases icount will be decreased 
> by the value equal to the size of TB.
> 
> This behavior leads to incorrect values of virtual clock and 
> non-deterministic execution of the code.
> 
> These patches passes pointer to the translation block code to the exception
> handler. It allows correct restoring of PC and icount values.

While I think it's the correct way for load/stores or FPU exceptions, I
am not convinced we should do that in all cases. Retranslation has a
cost, and when the exception is likely to occur, it's better to save the
CPU state instead of going for a retranslation. Actually we had to
rollback such a change on SH4, as it has some performances issues. See
commit 1012740098d0307ce5d957ebbe9a7f020da7f574.

One way to fix that would be to reduce the cost of retranslation, for
example by using a kind of exception table that we generate at the same
time of the TB.

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
address@hidden                 http://www.aurel32.net



reply via email to

[Prev in Thread] Current Thread [Next in Thread]