From: Dennis Luehring
Subject: Re: [Qemu-devel] Debian 7.8.0 SPARC64 on qemu - anything i can do to speedup the emulation?
Date: Wed, 19 Aug 2015 06:28:58 +0200
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0

On 18.08.2015 at 21:06, Karel Gardas wrote:
> Thanks a lot for doing this. It looks like g++ is memory-bound in this
> case, isn't it? What does the stream[1] benchmark report on the host and
> in the guest as a 32/64-bit sparc binary? Let's see if the ratio is
> similar to the timings you get...
>
> [1]: https://www.cs.virginia.edu/stream/

-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------

==>host Ubuntu 15.04 x64

-------------------------------------------------------------
Array size = 10000000 (elements), Offset = 0 (elements)
Memory per array = 76.3 MiB (= 0.1 GiB).
Total memory required = 228.9 MiB (= 0.2 GiB).
Each kernel will be executed 10 times.
 The *best* time for each kernel (excluding the first iteration)
 will be used to compute the reported bandwidth.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 14147 microseconds.
   (= 14147 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:            8877.1     0.018049     0.018024     0.018074
Scale:           8842.7     0.018206     0.018094     0.018749
Add:            10312.9     0.023367     0.023272     0.023901
Triad:          10114.3     0.023758     0.023729     0.023871
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
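
For readers who want to cross-check the "Best Rate MB/s" column, the sketch below recomputes it from the "Min time" column. It assumes the usual STREAM accounting (Copy and Scale touch two arrays per element, Add and Triad touch three, and 1 MB = 10^6 bytes); the function and variable names are made up for this illustration. Because the printed min times are rounded to six decimals, the results should match the table only to within rounding.

```python
# Number of arrays each kernel reads or writes per element.
# Assumption: standard STREAM accounting (Copy/Scale = 2, Add/Triad = 3).
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}


def best_rate_mb_s(kernel, min_time_s, n_elements=10_000_000, bytes_per_element=8):
    """Bandwidth in MB/s (1 MB = 1e6 bytes) from the best (minimum) kernel time."""
    total_bytes = ARRAYS_TOUCHED[kernel] * bytes_per_element * n_elements
    return total_bytes / min_time_s / 1.0e6


# "Min time" column of the host run above.
host_min_times = {"Copy": 0.018024, "Scale": 0.018094, "Add": 0.023272, "Triad": 0.023729}

for kernel, min_time in host_min_times.items():
    print(f"{kernel:6s} {best_rate_mb_s(kernel, min_time):10.1f} MB/s")
```

The same function can be fed the guest min times to confirm that the guest tables are internally consistent as well.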

The guests below run under qemu 2.4.50 (x64 build).

==>netbsd-guest NetBSD 6.1.5 SPARC64 (pure 64bit) running from ramdisk

-------------------------------------------------------------
Array size = 10000000 (elements), Offset = 0 (elements)
Memory per array = 76.3 MiB (= 0.1 GiB).
Total memory required = 228.9 MiB (= 0.2 GiB).
Each kernel will be executed 10 times.
 The *best* time for each kernel (excluding the first iteration)
 will be used to compute the reported bandwidth.
-------------------------------------------------------------
Your clock granularity/precision appears to be 42 microseconds.
Each test below will take on the order of 330428 microseconds.
   (= 7867 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:             771.5     0.214717     0.207377     0.244214
Scale:            288.1     0.573320     0.555401     0.660161
Add:              423.5     0.633523     0.566661     1.092067
Triad:            242.9     1.053032     0.987970     1.499563
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
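
The "clock ticks" figure in the output is simply the estimated test time divided by the measured timer granularity, truncated to an integer. A minimal sketch of that arithmetic, with a hypothetical helper name, using the host and NetBSD values printed above:

```python
def clock_ticks(test_time_us, granularity_us):
    """Number of whole timer ticks that fit into one test (integer division)."""
    return test_time_us // granularity_us


# (estimated test time in microseconds, timer granularity in microseconds)
runs = {
    "host Ubuntu 15.04 x64": (14147, 1),
    "NetBSD 6.1.5 SPARC64 guest": (330428, 42),
}

for name, (test_time_us, granularity_us) in runs.items():
    print(f"{name}: {clock_ticks(test_time_us, granularity_us)} clock ticks")
```

Both runs are far above the 20 ticks per test that STREAM asks for, so the coarser guest timer should not by itself invalidate the guest figures.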

==>debian-guest 7.8.0 SPARC64 (mixed 32/64bit) running from ramdisk

!!32bit version!!

-------------------------------------------------------------
Array size = 10000000 (elements), Offset = 0 (elements)
Memory per array = 76.3 MiB (= 0.1 GiB).
Total memory required = 228.9 MiB (= 0.2 GiB).
Each kernel will be executed 10 times.
 The *best* time for each kernel (excluding the first iteration)
 will be used to compute the reported bandwidth.
-------------------------------------------------------------
Your clock granularity/precision appears to be 41 microseconds.
Each test below will take on the order of 394519 microseconds.
   (= 9622 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:             629.4     0.280860     0.254224     0.401105
Scale:            231.7     0.733338     0.690452     0.868741
Add:              346.9     0.747893     0.691890     0.889102
Triad:            201.4     1.239293     1.191786     1.394918
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------

!!64bit version!!

-------------------------------------------------------------
Array size = 10000000 (elements), Offset = 0 (elements)
Memory per array = 76.3 MiB (= 0.1 GiB).
Total memory required = 228.9 MiB (= 0.2 GiB).
Each kernel will be executed 10 times.
 The *best* time for each kernel (excluding the first iteration)
 will be used to compute the reported bandwidth.
-------------------------------------------------------------
Your clock granularity/precision appears to be 40 microseconds.
Each test below will take on the order of 395364 microseconds.
   (= 9884 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:             651.3     0.251320     0.245668     0.274346
Scale:            240.3     0.694808     0.665834     0.770982
Add:              353.0     0.690291     0.679792     0.715228
Triad:            201.5     1.207881     1.191054     1.256001
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
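
To put a number on the host/guest ratio asked about in the quoted question, the sketch below divides the host "Best Rate" by each guest's "Best Rate" per kernel. The rates are copied from the four tables above; the dictionary layout and the function name are only illustrative.

```python
# "Best Rate MB/s" columns copied from the STREAM tables in this mail.
HOST = {"Copy": 8877.1, "Scale": 8842.7, "Add": 10312.9, "Triad": 10114.3}

GUESTS = {
    "NetBSD 6.1.5 SPARC64": {"Copy": 771.5, "Scale": 288.1, "Add": 423.5, "Triad": 242.9},
    "Debian 7.8.0 32bit": {"Copy": 629.4, "Scale": 231.7, "Add": 346.9, "Triad": 201.4},
    "Debian 7.8.0 64bit": {"Copy": 651.3, "Scale": 240.3, "Add": 353.0, "Triad": 201.5},
}


def slowdown(host_rates, guest_rates):
    """Host bandwidth divided by guest bandwidth, per kernel (higher = slower guest)."""
    return {kernel: host_rates[kernel] / guest_rates[kernel] for kernel in host_rates}


for guest_name, guest_rates in GUESTS.items():
    ratios = slowdown(HOST, guest_rates)
    line = "  ".join(f"{kernel}: {ratio:5.1f}x" for kernel, ratio in ratios.items())
    print(f"{guest_name:22s} {line}")
```

On these numbers Copy comes out roughly 11x to 14x slower in the guests, while Triad is roughly 42x to 50x slower, so the kernels that do floating-point arithmetic lose considerably more than the plain copy.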







