[Qemu-devel] Expensive emulation of CPU condition flags
From: Shuang Zhai
Subject: [Qemu-devel] Expensive emulation of CPU condition flags
Date: Thu, 30 Jun 2016 18:13:56 +0000

Hi everyone.
When running an ARMv7 guest on an x86 host, we observed that a guest
instruction that sets the condition flags is often translated into 10 or more
host instructions. The reason seems to be the way the frontend emulates the
condition flags. For instance:

Target ARM instruction:

    cmp r9, 0x21 ;

IR instruction:

    movi_i32 tmp5,$0x21
    sub_i32 NF,r9,tmp5
    mov_i32 ZF,NF
    setcond_i32 CF,r9,tmp5,geu
    xor_i32 VF,NF,r9
    xor_i32 tmp7,r9,tmp5
    and_i32 VF,VF,tmp7

Host x86 instruction:

    sub $0x21,%ebx
    mov %ebx,0x208(%r14)
    mov %ebx,%r12d
    mov %r12d,0x20c(%r14)
    cmp $0x21,%ebp
    setae %r13b
    movzbl %r13b,%r13d
    mov %r13d,0x200(%r14)
    xor %ebp,%ebx
    xor $0x21,%ebp
    and %ebp,%ebx
    mov %ebx,0x204(%r14)

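
For readers who want to follow the IR listing above, here is a rough C model of what those seven ops compute. It is only a sketch: the struct `ArmFlags` and the function `flags_after_cmp` are names made up for this illustration and are not QEMU code. It assumes, as the IR suggests, that N and V are read from bit 31 of NF and VF, that Z is set when ZF equals zero, and that CF holds 0 or 1.

```c
#include <stdint.h>

/* Hypothetical container for the four flag variables named in the IR. */
typedef struct {
    uint32_t NF; /* subtraction result; N is bit 31 (assumed encoding) */
    uint32_t ZF; /* subtraction result; Z is set when this is 0 (assumed) */
    uint32_t CF; /* 0 or 1: no borrow occurred */
    uint32_t VF; /* V is bit 31 (assumed encoding) */
} ArmFlags;

/* Model of "cmp rn, imm", with one statement per IR op in the listing. */
static ArmFlags flags_after_cmp(uint32_t rn, uint32_t imm)
{
    ArmFlags f;
    f.NF = rn - imm;                 /* sub_i32 NF,r9,tmp5 */
    f.ZF = f.NF;                     /* mov_i32 ZF,NF */
    f.CF = (rn >= imm) ? 1u : 0u;    /* setcond_i32 CF,r9,tmp5,geu */
    f.VF = (f.NF ^ rn) & (rn ^ imm); /* xor_i32, xor_i32, and_i32 */
    return f;
}
```

Each of the four results is written back to its own slot in the CPU state, which is why the host code above contains four stores in addition to the arithmetic.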
Imagine a tight loop in which a cmp instruction computes the termination
condition: this overhead can be quite expensive, and lazy evaluation does not
seem to help here.
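
To make clear what "lazy evaluation" means in this context, here is a generic C sketch of the idea: record the operands of the last compare and derive a condition only when a later instruction consumes it. This is an illustration under assumptions, not a description of QEMU's ARM frontend; `LazyCmp`, `lazy_cmp`, `lazy_ne`, `lazy_hs` and `lazy_lt` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical lazy state: just the operands of the last cmp. */
typedef struct {
    uint32_t src1;
    uint32_t src2;
} LazyCmp;

/* "cmp rn, op2": remember the operands, compute no flags yet. */
static void lazy_cmp(LazyCmp *s, uint32_t rn, uint32_t op2)
{
    s->src1 = rn;
    s->src2 = op2;
}

/* NE (Z clear): the operands differ. */
static int lazy_ne(const LazyCmp *s)
{
    return s->src1 != s->src2;
}

/* HS (C set): unsigned rn >= op2, i.e. no borrow. */
static int lazy_hs(const LazyCmp *s)
{
    return s->src1 >= s->src2;
}

/* LT (N != V): signed rn < op2. */
static int lazy_lt(const LazyCmp *s)
{
    return (int32_t)s->src1 < (int32_t)s->src2;
}
```

Even in this form, every compare still has to store its operands, so the saving comes only from flags that are never read.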

We wonder whether any optimization exists, e.g., directly mapping the
frontend's flags onto those of the backend. Any suggestions are appreciated.

Shuang