qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] TCG x86-64 'bt' insn


From: Paolo Bonzini
Subject: Re: [Qemu-devel] TCG x86-64 'bt' insn
Date: Sat, 19 Apr 2014 18:04:14 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0

Il 09/04/2014 13:55, Clemens Kolbitsch ha scritto:
Hi,

quick follow-up. *As always* you find a problem right after asking for
help :). The updated patch does not cause BSOD on Windows guests, but
neither does it fix the actual problem (of the program seg-faulting)

I would really appreciate any feedback on the proposed patch below - the
difference to the previous patch is that we clear out undefined flags
and only keep the Z-flag (and update the C-flag)

The new code should apply to btc/btr/bts too. Basically the "switch" statement keeps just the shr, and everything else moves below.

Or even simpler, you could exploit the undefinedness of non-CF, non-ZF flags and keep using CC_OP_SARB. Because CC_OP_SARB uses "cc_dst == 0" to compute ZF, if ZF=1 you want to set it to 0, and if ZF=0 you want to set it to non-zero. So just compute !ZF into CC_DST, and place the tested bit (CF) into CC_SRC. Basically:

gen_setcc1(s, (JCC_Z << 1) | 1, cpu_cc_dst);
switch(op) {
...
}
tcg_gen_andi_tl(cpu_cc_src, cpu_tmp4, CC_C);
set_cc_op(s, CC_OP_SARB + ot);

Last another general question: Does TCG make any assumptions that
undefined flags are set to 0? I see that most flag-computations set
undefined flags to 0 - is this just a convention or really a requirement?

Just a convention.

Paolo

Thanks guys!
-Clemens




On Wed, Apr 9, 2014 at 10:33 AM, Clemens Kolbitsch
<address@hidden <mailto:address@hidden>> wrote:

    Hi guys,

    I have to revive a rather old thread [1,2]. A quick summary of the
    issue:

    TCG emulates the BT (bit-test) instruction using a SAR to re-compute
    eflags. While SAR affects all flags, BT only affects the C-flag and
    leaves the Z-flag unaffected/unchanged [3].

    According to the x86 manual, the current behavior is correct
    (although not perfect performance-wise), but NOT correct according
    to newer Intel CPU instruction sets (such as x86-64). What has
    caused some confusion in the past it seems is that AMD's manual
    mentions that all flags other than the C-flag are "undefined" (just
    like in Intel's old x86 manual).

    A patch has been suggested (although not applied) before [2] and I
    was trying to update and try the patch to the QEMU 2.0 RC2, but it
    seems not to be working (causes a BSOD on Windows guests).

    BTW: I have a program that seg-faults on Windows clients (32 as well
    as 64 bit guest OSes) in vanilla QEMU 2.0 RC2 (just like in previous
    versions), so this is not just of theoretical importance :)

    Can somebody say if there is something that I'm doing obviously wrong?

    *NON-FUNCTIONAL PATCH!*

    --- ../orig/qemu-2.0.0-rc2/target-i386/translate.c      2014-04-08
    12:38:58.000000000 -0700
    +++ target-i386/translate.c     2014-04-09 10:08:43.252084947 -0700
    @@ -6710,8 +6710,14 @@
             tcg_gen_andi_tl(cpu_T[1], cpu_T[1], (1 << (3 + ot)) - 1);
             switch(op) {
             case 0:
    -            tcg_gen_shr_tl(cpu_cc_src, cpu_T[0], cpu_T[1]);
    -            tcg_gen_movi_tl(cpu_cc_dst, 0);
    +            /* whatever the last CC-op is - recompute now so we can
    OR-in
    +             * updated results */
    +            gen_update_cc_op(s);
    +            gen_compute_eflags(s);
    +            tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]);
    +            tcg_gen_andi_tl(cpu_tmp4, cpu_tmp4, CC_C);
    +            tcg_gen_or_tl(cpu_cc_src, cpu_cc_src, cpu_tmp4);
    +            set_cc_op(s, CC_OP_EFLAGS);
                 break;
             case 1:
                 tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]);
    @@ -6734,8 +6740,8 @@
                 tcg_gen_xor_tl(cpu_T[0], cpu_T[0], cpu_tmp0);
                 break;
             }
    -        set_cc_op(s, CC_OP_SARB + ot);
             if (op != 0) {
    +            set_cc_op(s, CC_OP_SARB + ot);
                 if (mod != 3) {
                     gen_op_st_v(s, ot, cpu_T[0], cpu_A0);
                 } else {

    Thanks, I'd really appreciate any input you can provide.
    -Clemens


    [1] http://lists.gnu.org/archive/html/qemu-devel/2010-11/msg00442.html
    [2] http://lists.gnu.org/archive/html/qemu-devel/2010-11/msg00764.html
    [3] page
    148 
http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf






reply via email to

[Prev in Thread] Current Thread [Next in Thread]