Re: [Qemu-devel] sh : performance problem


From: Lionel Landwerlin
Subject: Re: [Qemu-devel] sh : performance problem
Date: Tue, 03 Mar 2009 19:57:57 +0100

On Wednesday, 04 March 2009 at 00:46 +0900, Shin-ichiro KAWASAKI wrote:
> Lionel Landwerlin wrote:
> > Shin-ichiro,
> > 
> > Sorry, but I cannot apply your patch cleanly on the latest qemu-svn.
> > 
> > Instead, I would like to try another approach. The patch you proposed to
> > find (or not) a valid TLB entry has a complexity of O(log2(n)) (or
> > something like that, if I remember correctly). Here instead is a patch
> > with a complexity of O(1).
> 
> Good work.  I evaluated your patch in my environment, measuring the
> compile time of an empty main() with gcc.
> 
>   sh4 : 5.8 [seconds]     O(n) utlb search.
>   sh4 : 4.6 [seconds]     O(log2(n)) utlb search.
>   sh4 : 4.1 [seconds]     O(1) utlb search by Lionel
>   arm : 0.8 [seconds]     (-M versatilepb + Debian ARM)
> 
> Your patch has a nice score!

Great :) But we're still far from arm :(

> 
> Now I've done the work to increase the number of utlb entries from 64 to 256,
> and found that the score drops to around 2.4 seconds.
> I'm trying to increase it to 4096.  Your O(1) search will become more important
> as the number of entries increases.
> 

What do you mean by "more important"?

Is the ARM emulation increasing the number of TLB entries?
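
To make the O(1) point concrete, the idea can be sketched roughly as below. This is only an illustration, not the patch itself: the names vpn_to_utlb_4k, vpn_cache_reset, vpn_cache_set, vpn_cache_find and the UTLB_NONE marker are made up for the example, only the 4k page size is shown, and ASID and permission checks are left out. The lookup is a single array read, so its cost does not depend on the number of utlb entries.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define UTLB_ENTRIES 64      /* UTLB size mentioned in this thread */
#define UTLB_NONE    0xff    /* hypothetical "no entry" marker */

/* Hypothetical cache: one byte per 4k virtual page number (2^20 pages). */
static uint8_t vpn_to_utlb_4k[1u << 20];

/* Mark every virtual page as having no UTLB entry. */
static void vpn_cache_reset(void)
{
    memset(vpn_to_utlb_4k, UTLB_NONE, sizeof(vpn_to_utlb_4k));
}

/* Record that UTLB entry `entry` maps the 4k page containing vaddr. */
static void vpn_cache_set(uint32_t vaddr, uint8_t entry)
{
    assert(entry < UTLB_ENTRIES);
    vpn_to_utlb_4k[vaddr >> 12] = entry;
}

/* Return the UTLB index for vaddr, or -1 if none: one array read. */
static int vpn_cache_find(uint32_t vaddr)
{
    uint8_t entry = vpn_to_utlb_4k[vaddr >> 12];
    return entry == UTLB_NONE ? -1 : (int)entry;
}
```

Note that a uint8_t slot with a reserved marker value can only index up to 255 entries, so a sketch like this would need a wider slot type for 256 or 4096 entries.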

> > +#if !defined(CONFIG_USER_ONLY)
> > +    /* vpn to utlb entry caches (too much space for user emulation) */
> > +    uint8_t utlbs_1k[4194304]; /* 2^22 => 4 MB */
> > +    uint8_t utlbs_4k[1048576]; /* 2^20 => 1 MB */
> > +    uint8_t utlbs_64k[65536]; /* 2^16 => 64 KB */
> > +    uint8_t utlbs_1m[4096]; /* 2^12 => 4 KB */
> > +#endif
> >  } CPUSH4State;
> 
> Isn't this too extravagant?
> How about allocating them on demand?
> I guess sh-linux uses only utlbs_4k[], in general.
> If so, the 4 MB utlbs_1k[] is a waste.
> 

sh-linux can also use huge pages of 64k and 1M.

I think it is important to keep the emulation as close as possible to
the real CPU's capabilities.
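
That said, allocating on demand would not have to mean dropping any page size. A rough sketch of what that could look like, assuming one table per page size that is only allocated the first time that size is used. All names here (vpn_table, vpn_table_get, vpn_table_set, vpn_table_find, UTLB_NONE) are hypothetical and not taken from the patch, and ASID handling is ignored:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define UTLB_NONE 0xff  /* hypothetical "no entry" marker */

enum { PG_1K, PG_4K, PG_64K, PG_1M, PG_SIZES };

/* log2 of each page size: 1k, 4k, 64k, 1M */
static const unsigned pg_shift[PG_SIZES] = { 10, 12, 16, 20 };

/* One lazily allocated table per page size; NULL until first use. */
static uint8_t *vpn_table[PG_SIZES];

/* Return the table for a page size, allocating it on first use.
 * Returns NULL if the allocation fails. */
static uint8_t *vpn_table_get(int pg)
{
    assert(pg >= 0 && pg < PG_SIZES);
    if (vpn_table[pg] == NULL) {
        /* One byte per virtual page of a 32-bit address space. */
        size_t n = (size_t)1 << (32 - pg_shift[pg]);
        vpn_table[pg] = malloc(n);
        if (vpn_table[pg] != NULL) {
            memset(vpn_table[pg], UTLB_NONE, n);
        }
    }
    return vpn_table[pg];
}

/* Record a mapping; returns 0 on success, -1 if allocation failed. */
static int vpn_table_set(int pg, uint32_t vaddr, uint8_t entry)
{
    uint8_t *t = vpn_table_get(pg);
    if (t == NULL) {
        return -1;
    }
    t[vaddr >> pg_shift[pg]] = entry;
    return 0;
}

/* Look up a mapping without allocating; -1 means no entry. */
static int vpn_table_find(int pg, uint32_t vaddr)
{
    uint8_t *t = vpn_table[pg];
    if (t == NULL) {
        return -1;
    }
    uint8_t e = t[vaddr >> pg_shift[pg]];
    return e == UTLB_NONE ? -1 : (int)e;
}
```

With something like this, a guest that only ever maps 4k pages would pay for the 1 MB table alone, while 64k and 1M huge pages would still work when the guest uses them.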


Regards,

-- 
Lionel Landwerlin <address@hidden>




