From: Dennis Luehring
Subject: Re: [Qemu-devel] Debian 7.8.0 SPARC64 on qemu - anything i can do to speedup the emulation?
Date: Fri, 21 Aug 2015 06:32:31 +0200
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0

On 20.08.2015 at 19:19, Richard Henderson wrote:
> This isn't surprising, because at the moment tcg optimizations are
> almost completely ineffective for sparc. The way the register windows
> are implemented means that there are very few proper tcg temporaries
> to optimize.
>
> I've just updated an old branch that attempts to cure this. It creates
> proper tcg temporaries for the windowed registers, and uses a bit of
> recursion to find the place at which they should be stored.
>
>   git://github.com/rth7680/qemu.git tcg-indirect
>
> With a few quick unscientific tests, it appears to help. It would be
> nice to put that branch side-by-side with your tests above.

tcg-indirect does not seem to improve things (the stream test degrades even more).

"without-optimization" means qemu.org-git with USE_TCG_OPTIMIZATIONS undefined.

  git clone git://github.com/rth7680/qemu.git
  cd qemu
  git checkout tcg-indirect

  g++ src/pugixml.cpp -g -Wall -Wextra -Werror -pedantic -std=c++0x -c -MMD -MP
  tcg-indirect:         ~2:46.5
  qemu.org-git:         ~2:51.2 (worst result)
  without-optimization: ~2:14.1 (best result)

  gcc prime.c -o prime.out -lm

prime.out runtime

  tcg-indirect:         ~9.3 sec (best result)
  qemu.org-git:         ~11 sec (worst result)
  without-optimization: ~9.9 sec

stream results (STREAM version $Revision: 5.10 $)

tcg-indirect: (worst result)

  Your clock granularity/precision appears to be 41 microseconds.
  Each test below will take on the order of 632527 microseconds.
     (= 15427 clock ticks)

  Function    Best Rate MB/s  Avg time     Min time     Max time
  Copy:            320.8      0.511297     0.498785     0.590214
  Scale:           187.0      0.858693     0.855465     0.863527
  Add:             218.2      1.104654     1.099698     1.110341
  Triad:           169.5      1.433273     1.416321     1.502248

qemu.org-git: (best result)

  Your clock granularity/precision appears to be 42 microseconds.
  Each test below will take on the order of 330428 microseconds.
     (= 7867 clock ticks)

  Function    Best Rate MB/s  Avg time     Min time     Max time
  Copy:            771.5      0.214717     0.207377     0.244214
  Scale:           288.1      0.573320     0.555401     0.660161
  Add:             423.5      0.633523     0.566661     1.092067
  Triad:           242.9      1.053032     0.987970     1.499563

without-optimization:

  Your clock granularity/precision appears to be 41 microseconds.
  Each test below will take on the order of 745254 microseconds.
     (= 18176 clock ticks)

  Function    Best Rate MB/s  Avg time     Min time     Max time
  Copy:            316.6      0.524065     0.505313     0.580103
  Scale:           200.5      0.813356     0.798024     0.840986
  Add:             243.9      1.010247     0.984025     1.119149
  Triad:           182.9      1.345601     1.312236     1.427459
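
To make the three builds easier to compare, the figures above can be normalized against qemu.org-git. The following is only a rough sketch in Python: the helper names (to_seconds, relative_to_baseline) are made up for illustration, and the values are copied from the compile timings and the STREAM "Best Rate" columns above. For timings a ratio below 1.0 means faster than qemu.org-git; for STREAM rates a ratio below 1.0 means less bandwidth than qemu.org-git.

```python
# Illustrative helpers only; names are not from QEMU or STREAM.

def to_seconds(mmss):
    """Convert a 'M:SS.s' wall-clock string such as '2:46.5' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + float(seconds)

def relative_to_baseline(values, baseline="qemu.org-git"):
    """Divide every entry by the baseline entry."""
    base = values[baseline]
    return {name: value / base for name, value in values.items()}

# pugixml compile times, copied from the results above
compile_seconds = {
    "tcg-indirect": to_seconds("2:46.5"),
    "qemu.org-git": to_seconds("2:51.2"),
    "without-optimization": to_seconds("2:14.1"),
}

# STREAM best rates in MB/s, copied from the results above
stream_rates = {
    "Copy":  {"tcg-indirect": 320.8, "qemu.org-git": 771.5, "without-optimization": 316.6},
    "Scale": {"tcg-indirect": 187.0, "qemu.org-git": 288.1, "without-optimization": 200.5},
    "Add":   {"tcg-indirect": 218.2, "qemu.org-git": 423.5, "without-optimization": 243.9},
    "Triad": {"tcg-indirect": 169.5, "qemu.org-git": 242.9, "without-optimization": 182.9},
}

if __name__ == "__main__":
    print("compile time relative to qemu.org-git (lower is faster):")
    for name, ratio in relative_to_baseline(compile_seconds).items():
        print(f"  {name:22s} {ratio:.2f}")

    print("STREAM rate relative to qemu.org-git (higher is better):")
    for function, rates in stream_rates.items():
        ratios = relative_to_baseline(rates)
        line = "  ".join(f"{name}={ratio:.2f}" for name, ratio in ratios.items())
        print(f"  {function:6s} {line}")
```

These are single runs, so the ratios are only as reliable as the timings themselves.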