Re: [Qemu-devel] profiling qemu


From: Artyom Tarasenko
Subject: Re: [Qemu-devel] profiling qemu
Date: Tue, 14 Feb 2012 16:01:24 +0100

2012/2/14 Lluís Vilanova <address@hidden>:
> Artyom Tarasenko writes:
> [...]
>> QEMU 1.0.50 monitor - type 'help' for more information
>> (qemu) profile
>> unknown command: 'profile'
>> (qemu) info profile
>> async time  38505498320 (38.505)
>> qemu time   35947093161 (35.947)
>
>> Is there a way to find out more?
>
> Command "info jit" also has some information added when compiled with
> profiling support.
>
> Search for CONFIG_PROFILER to see which code is activated during profiling.
>
>
>> Next I tried gprof:
>
>> build-prof $  gprof sparc64-softmmu/qemu-system-sparc64 gmon.out
>> Flat profile:
>
>> Each sample counts as 0.01 seconds.
>>   %   cumulative   self              self     total
>>  time   seconds   seconds    calls  Ts/call  Ts/call  name
>> 100.00      5.06     5.06                             main
>
>> Hmm. Not very informative. Is there a way to find out more details?
>
> Did you run QEMU for a reasonable amount of time? gprof uses sampling to
> capture its execution time statistics, so a short run of QEMU will not be
> able to capture any meaningful information.

I did run it up to the OpenBIOS prompt. But I think it's my setup that
makes gprof useless on the machine where I tested git master:
the "host" is itself a virtual machine running under VirtualBox, and
it has problems with the system timer. I will re-check on a bare-metal
host.

> [...]
>> Here it looks like "compute_all_sub" and "compute_all_sub_xcc" are
>> good candidates for optimizing: together they take the same amount of
>> time as cpu_sparc_exec. I guess both operations would be trivial in
>> the x86_64 assembler. What would be the best strategy to make TCG take
>> advantage of running on an x86_64 host?
>
> A quick look into the code reveals that these two are called from a TCG helper
> (helper_compute_psr), so I see two approaches here applicable to the most
> frequently used "sub-operations" in helper_compute_psr:
>
> * Define new simpler helpers for those sub-operations that can be declared
>  with TCG_CALL_CONST and generate the new psr/xcc values in temporary
>  registers. You must make sure any other code will still be able to use the
>  new psr/xcc values.

I don't see how to make get_C_sub_xcc even simpler: all it does is the
src1 < src2 check.
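
To illustrate why the carry of a subtraction reduces to a single unsigned
comparison, here is a minimal standalone sketch. It is not the QEMU source:
the function names are made up for this example, and it uses 32-bit operands
for brevity, whereas the xcc variant works on 64-bit values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the helper discussed above: the carry
 * (borrow) flag of src1 - src2 is just an unsigned comparison. */
static uint32_t carry_sub_compare(uint32_t src1, uint32_t src2)
{
    return src1 < src2;
}

/* Reference version: do the subtraction in 64 bits and read the
 * borrow out of bit 32 of the wrapped result. */
static uint32_t carry_sub_wide(uint32_t src1, uint32_t src2)
{
    uint64_t diff = (uint64_t)src1 - (uint64_t)src2;
    return (uint32_t)(diff >> 32) & 1u;
}

/* Compare both versions on a few edge values. */
static void carry_sub_selfcheck(void)
{
    static const uint32_t v[] = { 0u, 1u, 0x7fffffffu, 0x80000000u, 0xffffffffu };
    for (unsigned i = 0; i < 5; i++) {
        for (unsigned j = 0; j < 5; j++) {
            assert(carry_sub_compare(v[i], v[j]) == carry_sub_wide(v[i], v[j]));
        }
    }
}
```

Both versions agree on every pair tried here, which is consistent with the
point that there is little left to simplify in the comparison itself; any cost
would have to come from the call around it.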

> * Reimplement these sub-operations in pure TCG code.

Are there already examples where we compute flags in pure TCG code?
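
As a rough idea of what "pure TCG" flag computation would have to express,
the sketch below computes the four 32-bit condition codes of a subtraction
using only subtract, xor, and, shift, and compare-and-set steps, with no
branches. It is plain C, not TCG code, and the function name icc_after_sub
is invented for this example; the assumption is that each line would map onto
one or two generic ops of that kind. The PSR_* constants follow the icc bit
positions of the SPARC V8 PSR (N=23, Z=22, V=21, C=20).

```c
#include <assert.h>
#include <stdint.h>

/* icc bit positions in the SPARC V8 PSR, defined locally for this sketch. */
#define PSR_NEG   (1u << 23)
#define PSR_ZERO  (1u << 22)
#define PSR_OVF   (1u << 21)
#define PSR_CARRY (1u << 20)

/* Hypothetical illustration: branch-free N/Z/V/C for dst = src1 - src2. */
static uint32_t icc_after_sub(uint32_t src1, uint32_t src2)
{
    uint32_t dst = src1 - src2;
    uint32_t n = dst >> 31;                                /* sign bit of result */
    uint32_t z = (dst == 0);                               /* compare-and-set */
    uint32_t v = ((src1 ^ src2) & (src1 ^ dst)) >> 31;     /* signed overflow */
    uint32_t c = (src1 < src2);                            /* unsigned borrow */
    return (n << 23) | (z << 22) | (v << 21) | (c << 20);
}

/* A few hand-checked cases. */
static void icc_after_sub_selfcheck(void)
{
    assert(icc_after_sub(5u, 5u) == PSR_ZERO);
    assert(icc_after_sub(1u, 2u) == (PSR_NEG | PSR_CARRY));
    assert(icc_after_sub(0x80000000u, 1u) == PSR_OVF);
}
```

Whether emitting this inline actually beats a helper call would still need to
be measured, since it trades one call for several ops on every flag-setting
instruction.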

> But first, make sure you run a proper benchmark to establish where the
> hotspots are in the sparc code for QEMU. The problem here is to establish
> what a proper benchmark is :)
>

:)

Artyom

-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/search/label/qemu


