From: Alex Bennée
Subject: Re: [Qemu-devel] [PATCH v3 00/22] re-factor softfloat and add fp16 functions
Date: Thu, 25 Jan 2018 12:59:56 +0000
User-agent: mu4e 1.0-alpha3; emacs 26.0.91

Howard Spoelstra <address@hidden> writes:

> Hi,
>
> I built qemu-system-ppc for OSX and Windows from
> https://github.com/stsquad/qemu/tree/softfloat-refactor-and-fp16-v3
> and noticed a considerable drop in floating point performance on both
> hosts.
> Running Mac OS 9.2 in OSX, using MacBench 3.0, the score for the
> floating point performance dropped from ~60 to ~42.
>
> Recent tcg optimisations had improved processor and floating point
> performance considerably, but that gain seems to be more than lost for
> the floating point performance.
>
> Any idea what is causing this?

Well, we expected a little degradation, but it's a bit more than I
expected.

Re-factor

12:32:18 address@hidden:~/m/t/aarch64] master(+0/-0) ± ~/lsrc/qemu/qemu.git/aarch64-linux-user/qemu-aarch64 ./vector-benchmark -b float64-add
float64-add         : test took 1743387 msec
                      67108864 ops, ~25978 nsec/kop
12:32:26 address@hidden:~/m/t/aarch64] master(+0/-0) ± ~/lsrc/qemu/qemu.git/aarch64-linux-user/qemu-aarch64 ./vector-benchmark -b float64-add
float64-add         : test took 1735526 msec
                      67108864 ops, ~25861 nsec/kop
12:32:32 address@hidden:~/m/t/aarch64] master(+0/-0) ± ~/lsrc/qemu/qemu.git/aarch64-linux-user/qemu-aarch64 ./vector-benchmark -b float64-add
float64-add         : test took 1742030 msec
                      67108864 ops, ~25958 nsec/kop

Original

12:32:35 address@hidden:~/m/t/aarch64] master(+0/-0) ± ~/lsrc/qemu/qemu-builddirs/arm-targets.build/aarch64-linux-user/qemu-aarch64 ./vector-benchmark -b float64-add
float64-add         : test took 1255007 msec
                      67108864 ops, ~18701 nsec/kop
12:32:44 address@hidden:~/m/t/aarch64] master(+0/-0) ± ~/lsrc/qemu/qemu-builddirs/arm-targets.build/aarch64-linux-user/qemu-aarch64 ./vector-benchmark -b float64-add
float64-add         : test took 1243866 msec
                      67108864 ops, ~18535 nsec/kop
12:32:46 address@hidden:~/m/t/aarch64] master(+0/-0) ± ~/lsrc/qemu/qemu-builddirs/arm-targets.build/aarch64-linux-user/qemu-aarch64 ./vector-benchmark -b float64-add
float64-add         : test took 1278100 msec
                      67108864 ops, ~19045 nsec/kop
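
To put a number on the gap, here is a small sketch that averages the
nsec/kop figures from the six runs above and reports the relative
slowdown. The figures are copied from the output; the helper name
slowdown() is only illustrative.

```python
from statistics import mean

# nsec/kop figures copied from the three runs of each build above
refactor = [25978, 25861, 25958]
original = [18701, 18535, 19045]

def slowdown(new, old):
    """Ratio of mean per-kop cost, new build over old build."""
    return mean(new) / mean(old)

ratio = slowdown(refactor, original)
print(f"re-factor: {mean(refactor):.0f} nsec/kop")
print(f"original:  {mean(original):.0f} nsec/kop")
print(f"slowdown:  {(ratio - 1) * 100:.1f}%")
```

On these six runs that works out to the re-factored build being about
38% slower per float64-add.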

The main difference is that all the code now has a common path and works
on a canonicalized representation of the floating point number rather
than direct bit-fiddling of the register representation. This is
obviously going to have some cost, but it looks like we should spend a
bit more time seeing if we can claw some of that back.
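
As a rough illustration of what working on a canonicalized
representation means, here is a minimal sketch that unpacks an IEEE 754
float64 into class, sign, unbiased exponent and significand, and packs
it back. The names Parts, unpack_float64 and to_float are made up for
this example and are not the structures or functions used in the patch
series.

```python
import math
import struct
from typing import NamedTuple

class Parts(NamedTuple):
    """Hypothetical canonical form; field names are illustrative only."""
    cls: str   # "zero", "normal", "inf" or "nan"
    sign: int  # 0 or 1
    exp: int   # unbiased exponent
    frac: int  # significand with the implicit bit made explicit (bit 52)

def unpack_float64(x: float) -> Parts:
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    sign = bits >> 63
    exp = (bits >> 52) & 0x7FF
    frac = bits & ((1 << 52) - 1)
    if exp == 0x7FF:
        return Parts("inf" if frac == 0 else "nan", sign, 0, frac)
    if exp == 0:
        if frac == 0:
            return Parts("zero", sign, 0, 0)
        # Subnormal: shift left until the top bit sits at bit 52,
        # adjusting the exponent so the value is unchanged.
        shift = 53 - frac.bit_length()
        return Parts("normal", sign, -1022 - shift, frac << shift)
    return Parts("normal", sign, exp - 1023, frac | (1 << 52))

def to_float(p: Parts) -> float:
    """Rebuild the value of a "zero" or "normal" Parts."""
    if p.cls == "zero":
        return -0.0 if p.sign else 0.0
    if p.cls != "normal":
        raise ValueError("only zero and normal values are handled here")
    mag = math.ldexp(p.frac, p.exp - 52)
    return -mag if p.sign else mag

print(unpack_float64(-2.5))
```

Once every input format is unpacked into one form like this, a single
add or multiply routine can serve all widths. The unpack and repack
steps on every operation are where extra cost can come from.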

That said, any floating point on QEMU is going to be slow by the very
nature of the implementation. Because we don't translate guest floating
point instructions directly into host ones, we will be several orders of
magnitude slower for floating point instructions compared to normal
integer and bitwise operations. If we want to improve FPU performance we
need to look at how we can safely use host instructions to get the exact
same results as we currently do with softfloat.

I'm still keen on the re-factor as it does achieve the other goals of
improving the readability of the code and making it much easier to
reason about. It also reduces the amount of what was effectively copy
and paste with slightly different constants in the old code.

>
> Best regards,
> Howard


--
Alex Bennée


