qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] x86 tcg problem


From: Blue Swirl
Subject: Re: [Qemu-devel] x86 tcg problem
Date: Tue, 29 Jul 2008 20:18:25 +0300

On 7/29/08, Vince Weaver <address@hidden> wrote:
> Hello
>
>  I've spent a day now trying to figure out why bzip2 compress/decompress
> doesn't work when using sparc32plus-linux-user on x86.
>
>  I've tracked the problem to the Zero flag being improperly set (attached is
> a small exe/src that reproduces the problem.. it reports "Greater"
>  on real hardware, "Less Than" on qemu current).
>
>  The issue seems to be a misordering of an x86 sub instruction.  I tried to
> track this down in the tcg code but I quickly got lost.
>
>  The code does this for a compare (on sparc the compare turns into a
> subtract with result as the [ignores] zero reg):
>
>   mov_i32 cc_src_0,g4_0                          ;
>   mov_i32 cc_src_1,g4_1                          ; load g4  (0xaae60)
>   mov_i32 cc_src2_0,g3_0                         ;
>   mov_i32 cc_src2_1,g3_1                         ; load g3  (0)
>   sub2_i32
> cc_dst_0,cc_dst_1,cc_src2_0,cc_src2_1,cc_src_0,cc_src_1
>                                                 ; result = 0xaafe0-0
>   movi_i32 psr,$0x0                              ; clear psr
>   mov_i32 tmp42,cc_dst_0                         ; get cc_dst_0
>   movi_i32 tmp43,$0x0                            ;
>   movi_i32 tmp44,$0x0                            ;
>   movi_i32 tmp45,$0x0                            ; zero extends
>   brcond2_i32 tmp42,tmp43,tmp44,tmp45,$0x1,$0x0  ; if not
> zero, skip
>   movi_i32 tmp19,$0x400000                       ; else set zero flag
>
>
>
>  which converts into x86:
>   0xb80da04d:  sub    %ecx,%eax          ; %ecx = g4-g3
>   0xb80da04f:  sbb    %ebx,%edx
>   0xb80da051:  mov    %eax,0x6c(%ebp)    ; saving g3, not the result (ecx)!
>   0xb80da054:  mov    %edx,0x70(%ebp)    ;
>   0xb80da057:  xor    %edx,%edx
>   0xb80da059:  xor    %ecx,%ecx          ; clearing our result for use as
> psr
>                                         ; result is lost!
>                                         ; the later test for zero is done
>                                         ; against g3 instead, which
>                                         ; sets the zero flag when it
>   ...                                    ; shouldn't
>   0xb80da06f:  test   %eax,%eax
>   0xb80da071:  jne    0xb80da091         ; skip if not zero
>   ..
>   0xb80da07f:  mov    0x8c(%ebp),%eax    ; load psr
>   0xb80da085:  or     $0x400000,%eax     ; set zero flag
>
>
>  So unless there's some weird AT&T/intel ordering thing that is confusing me
> (please let me know if I am missing something), TCG is getting confused
> about which argument of the subtract is the result.  I'm not sure how to fix
> this though...

Thank you for the analysis! IIRC sub %ecx, %eax is in C:
eax -= ecx;

Still, I can reproduce this, and also amd64 is not correct:
 ---- 0x1008c
 mov_i64 cc_src,g4
 mov_i64 cc_src2,g3
 sub_i64 cc_dst,cc_src,cc_src2
 movi_i32 psr,$0x0
 movi_i64 tmp22,$0xffffffff
 and_i64 tmp21,cc_dst,tmp22
 movi_i64 tmp22,$0x0
 brcond_i64 tmp21,tmp22,$0x1,$0x0

0x601c287b:  mov    0x20(%r14),%rcx
0x601c287f:  mov    %rdx,%r8
0x601c2882:  mov    %rcx,%r9
0x601c2885:  sub    %r8,%r9
0x601c2888:  mov    %r9,%rax
0x601c288b:  and    $0xffffffff,%eax
0x601c2891:  mov    %rsi,0x10a58(%r14)
0x601c2898:  mov    %rdi,0x10a60(%r14)
0x601c289f:  mov    %rcx,0x60(%r14)
0x601c28a3:  mov    %r8,0x68(%r14)
0x601c28a7:  mov    %r9,0x70(%r14)
0x601c28ab:  xor    %edi,%edi
0x601c28ad:  mov    %edi,0x90(%r14)
0x601c28b4:  mov    %rdx,0x18(%r14)
0x601c28b8:  test   %rax,%rax
0x601c28bb:  jne    0x601c28d5

Though gen_op_sub_cc C flag generation part looks suspicious.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]