From: Dennis Luehring
Subject: Re: [Qemu-devel] Debian 7.8.0 SPARC64 on qemu - anything i can do to speedup the emulation?
Date: Fri, 21 Aug 2015 06:32:31 +0200
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0

On 20.08.2015 at 19:19, Richard Henderson wrote:
> This isn't surprising, because at the moment tcg optimizations are almost
> completely ineffective for sparc.  The way the register windows are implemented
> means that there are very few proper tcg temporaries to optimize.
>
> I've just updated an old branch that attempts to cure this.  It creates proper
> tcg temporaries for the windowed registers, and uses a bit of recursion to find
> the place at which they should be stored.
>
>     git://github.com/rth7680/qemu.git  tcg-indirect
>
> With a few quick unscientific tests, it appears to help.  It would be nice to
> put that branch side-by-side with your tests above.

The tcg-indirect branch does not seem to improve things (the STREAM results degrade even more).

"without-optimization" means qemu.org-git with USE_TCG_OPTIMIZATIONS undefined.

git clone git://github.com/rth7680/qemu.git
cd qemu
git checkout tcg-indirect

g++ src/pugixml.cpp -g -Wall -Wextra -Werror -pedantic -std=c++0x -c -MMD -MP

pugixml.cpp compile time (min:sec)

tcg-indirect: ~2:46.5
qemu.org-git: ~2:51.2 (worst result)
without-optimization: ~2:14.1 (best result)

gcc prime.c -o prime.out -lm

prime.out runtime

tcg-indirect: ~9.3 sec (best result)
qemu.org-git: ~11 sec (worst result)
without-optimization: ~9.9 sec
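
To put the two sets of timings above on a common scale, they can be expressed as a percent change relative to qemu.org-git. This is only a rough sketch: the numbers are copied from the results above, while the helper names (to_seconds, relative_change, report) are made up for illustration, and single runs like these say nothing about run-to-run noise.

```python
# Timings copied from the results above; helper names are illustrative only.

def to_seconds(stamp: str) -> float:
    """Convert 'M:SS.s' or a plain seconds string into seconds."""
    minutes, _, seconds = stamp.rpartition(":")
    return (int(minutes) if minutes else 0) * 60 + float(seconds)


def relative_change(baseline: float, value: float) -> float:
    """Percent change of value relative to baseline (negative means faster)."""
    return (value - baseline) / baseline * 100.0


compile_times = {
    "tcg-indirect": "2:46.5",
    "qemu.org-git": "2:51.2",
    "without-optimization": "2:14.1",
}
prime_times = {
    "tcg-indirect": "9.3",
    "qemu.org-git": "11",
    "without-optimization": "9.9",
}


def report(times: dict, baseline_key: str = "qemu.org-git") -> dict:
    """Percent change of every build relative to the baseline build."""
    base = to_seconds(times[baseline_key])
    return {
        build: round(relative_change(base, to_seconds(stamp)), 1)
        for build, stamp in times.items()
    }


for label, times in (("pugixml compile", compile_times), ("prime.out", prime_times)):
    for build, change in report(times).items():
        print(f"{label:16s} {build:22s} {change:+6.1f} %")
```

Using qemu.org-git as the baseline is just a choice made for this sketch; any of the three builds would work as the reference.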

stream results (STREAM version $Revision: 5.10 $)

tcg-indirect: (worst result)

Your clock granularity/precision appears to be 41 microseconds.
Each test below will take on the order of 632527 microseconds.
   (= 15427 clock ticks)
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:             320.8     0.511297     0.498785     0.590214
Scale:            187.0     0.858693     0.855465     0.863527
Add:              218.2     1.104654     1.099698     1.110341
Triad:            169.5     1.433273     1.416321     1.502248
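
As a sanity check on how to read these tables: STREAM derives "Best Rate" from "Min time" and the number of bytes each kernel moves. The sketch below recomputes the rates for the table above. It assumes the array size was left at 10,000,000 elements of 8-byte doubles (the array size is not stated in this post), and the names ARRAYS_TOUCHED and best_rate_mb_s are made up for illustration.

```python
# Assumption: 10,000,000-element arrays; the post does not state the array size.
ARRAY_ELEMENTS = 10_000_000
BYTES_PER_ELEMENT = 8  # double precision

# Arrays read plus written per kernel: Copy and Scale touch 2, Add and Triad touch 3.
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}


def best_rate_mb_s(kernel: str, min_time_s: float,
                   elements: int = ARRAY_ELEMENTS) -> float:
    """Best rate in MB/s (1 MB = 10**6 bytes) from the minimum kernel time."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * BYTES_PER_ELEMENT * elements
    return 1.0e-6 * bytes_moved / min_time_s


# "Min time" and "Best Rate" columns of the tcg-indirect table above.
min_times = {"Copy": 0.498785, "Scale": 0.855465, "Add": 1.099698, "Triad": 1.416321}
reported = {"Copy": 320.8, "Scale": 187.0, "Add": 218.2, "Triad": 169.5}

for kernel, t in min_times.items():
    computed = best_rate_mb_s(kernel, t)
    print(f"{kernel:6s} computed {computed:6.1f} MB/s, reported {reported[kernel]:6.1f} MB/s")
```

If the assumed array size is right, the computed values should agree with the reported column to within rounding, which means the rate column carries no information beyond the minimum times.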

qemu.org-git: (best result)

Your clock granularity/precision appears to be 42 microseconds.
Each test below will take on the order of 330428 microseconds.
    (= 7867 clock ticks)
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:             771.5     0.214717     0.207377     0.244214
Scale:            288.1     0.573320     0.555401     0.660161
Add:              423.5     0.633523     0.566661     1.092067
Triad:            242.9     1.053032     0.987970     1.499563

without-optimization:

Your clock granularity/precision appears to be 41 microseconds.
Each test below will take on the order of 745254 microseconds.
   (= 18176 clock ticks)
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:             316.6     0.524065     0.505313     0.580103
Scale:            200.5     0.813356     0.798024     0.840986
Add:              243.9     1.010247     0.984025     1.119149
Triad:            182.9     1.345601     1.312236     1.427459
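
To compare the three STREAM runs at a glance, the best rates can be divided by the qemu.org-git rates. A minimal sketch, with the rates copied from the tables above and the helper name rate_ratios made up for illustration:

```python
# Best Rate (MB/s) columns copied from the three STREAM tables above.
stream_rates = {
    "tcg-indirect": {"Copy": 320.8, "Scale": 187.0, "Add": 218.2, "Triad": 169.5},
    "qemu.org-git": {"Copy": 771.5, "Scale": 288.1, "Add": 423.5, "Triad": 242.9},
    "without-optimization": {"Copy": 316.6, "Scale": 200.5, "Add": 243.9, "Triad": 182.9},
}


def rate_ratios(rates: dict, baseline: str = "qemu.org-git") -> dict:
    """Ratio of each build's best rate to the baseline build, per kernel."""
    base = rates[baseline]
    return {
        build: {kernel: round(rate / base[kernel], 2) for kernel, rate in kernels.items()}
        for build, kernels in rates.items()
    }


for build, kernels in rate_ratios(stream_rates).items():
    row = "  ".join(f"{kernel}={ratio:.2f}" for kernel, ratio in kernels.items())
    print(f"{build:22s} {row}")
```

A ratio below 1.00 means the build moved less data per second than qemu.org-git for that kernel.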


