Re: [Qemu-devel] sh : performance problem
From: Paul Brook
Subject: Re: [Qemu-devel] sh : performance problem
Date: Wed, 4 Mar 2009 02:59:18 +0000
User-agent: KMail/1.9.9
> > > Great :) But we're still far from arm :(
> By the way, does someone know why there is some kind of "tlb management
> code" in exec.c ??
>
> Does the SH4 architecture have special features that can't be handled in
> a generic code ? Or are we just rewriting some code that is already
> there ... ?
I think you're missing the most important difference; SH uses a software
managed TLB, whereas ARM uses a hardware managed TLB.
The main consequence of this is that we don't have to model the actual ARM TLB
at all, it is never directly visible. We effectively implement an infinitely
large TLB.
For SH the TLB is programmed directly, so we end up having to maintain two
TLBs: The qemu TLB and the architectural SH TLB. For correct operation pages
must be removed from the qemu TLB when they are evicted/replaced in the SH
TLB. The SH TLB is quite small, and flushing qemu TLB entries is quite
expensive, so this results in fairly poor performance.
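
To make the two-TLB bookkeeping concrete, here is a minimal sketch in C. It is
not the qemu code: the names (Mmu, guest_tlb_write, emu_flush_page), the sizes
and the direct-mapped layout are all assumptions made up for illustration. The
point it shows is that every guest write to an occupied architectural slot
forces a flush of the old mapping from the emulator-side TLB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GUEST_TLB_SLOTS 4     /* hypothetical, deliberately tiny */
#define EMU_TLB_SIZE    256   /* hypothetical direct-mapped emulator cache */
#define PAGE_BITS       12

typedef struct { bool valid; uint32_t vpn, ppn; } TlbEntry;

typedef struct {
    TlbEntry guest[GUEST_TLB_SLOTS]; /* architectural TLB, guest-visible */
    TlbEntry emu[EMU_TLB_SIZE];      /* emulator's own fast-path TLB */
    unsigned flushes;                /* per-page flushes we had to pay for */
} Mmu;

/* Drop one page from the emulator TLB and count the flush. */
static void emu_flush_page(Mmu *m, uint32_t vpn)
{
    TlbEntry *e = &m->emu[vpn % EMU_TLB_SIZE];
    if (e->valid && e->vpn == vpn) {
        e->valid = false;
    }
    m->flushes++;
}

/* The guest programs a specific slot, so the old mapping must go away. */
static void guest_tlb_write(Mmu *m, unsigned slot, uint32_t vpn, uint32_t ppn)
{
    assert(slot < GUEST_TLB_SLOTS);
    TlbEntry *g = &m->guest[slot];
    if (g->valid) {
        emu_flush_page(m, g->vpn);
    }
    *g = (TlbEntry){ true, vpn, ppn };
}

/* Look in the emulator TLB first, refill from the guest TLB on a miss.
 * Returns false when the guest TLB has no mapping (a guest TLB miss). */
static bool translate(Mmu *m, uint32_t vaddr, uint32_t *paddr)
{
    uint32_t vpn = vaddr >> PAGE_BITS;
    TlbEntry *e = &m->emu[vpn % EMU_TLB_SIZE];
    if (!(e->valid && e->vpn == vpn)) {
        bool found = false;
        for (unsigned i = 0; i < GUEST_TLB_SLOTS; i++) {
            if (m->guest[i].valid && m->guest[i].vpn == vpn) {
                *e = m->guest[i];
                found = true;
                break;
            }
        }
        if (!found) {
            return false;
        }
    }
    *paddr = (e->ppn << PAGE_BITS) | (vaddr & ((1u << PAGE_BITS) - 1));
    return true;
}
```

With a small architectural TLB the guest recycles slots constantly, so
emu_flush_page ends up on a hot path; that is the cost described above.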
MIPS has a similar problem. However, in that case the most common TLB
operations do not directly expose the TLB state. In particular, when setting a
new TLB entry it is unspecified which TLB entry is replaced. At that point
the OS can't know which entry was evicted, so we can lie, and not evict pages
until the guest does something that allows it to determine the exact TLB
state. In practice this is sufficient to make mips-linux work reasonably well.
I'm not sure if the same is possible for SH. It probably depends on whether URC
is visible to/used by the guest.
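
The "lie until the guest can tell" trick described for MIPS can be sketched
roughly as follows. Again this is an illustration, not the qemu implementation:
LazyMmu, tlb_write_random, tlb_read_slot and the round-robin victim choice are
hypothetical names and simplifications. A real implementation would also have
to sync on explicit invalidations, ASID changes and any other operation that
exposes TLB state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define GUEST_SLOTS 4    /* hypothetical architectural TLB size */
#define EMU_SIZE    64   /* hypothetical emulator TLB size */

typedef struct { bool valid; uint32_t vpn, ppn; } Entry;

typedef struct {
    Entry guest[GUEST_SLOTS];  /* architectural TLB */
    Entry emu[EMU_SIZE];       /* emulator TLB, may hold "evicted" pages */
    unsigned next_victim;      /* stand-in for an unspecified victim choice */
    bool emu_has_stale;        /* emulator TLB holds mappings the guest lost */
    unsigned full_flushes;
} LazyMmu;

/* Cache a translation on the emulator side (normally done on first access). */
static void emu_fill(LazyMmu *m, uint32_t vpn, uint32_t ppn)
{
    m->emu[vpn % EMU_SIZE] = (Entry){ true, vpn, ppn };
}

static bool emu_hit(const LazyMmu *m, uint32_t vpn)
{
    const Entry *e = &m->emu[vpn % EMU_SIZE];
    return e->valid && e->vpn == vpn;
}

/* Write to an unspecified slot: the guest cannot know what was evicted,
 * so the old page is left in the emulator TLB and only marked as stale. */
static void tlb_write_random(LazyMmu *m, uint32_t vpn, uint32_t ppn)
{
    Entry *victim = &m->guest[m->next_victim];
    m->next_victim = (m->next_victim + 1) % GUEST_SLOTS;
    if (victim->valid) {
        m->emu_has_stale = true;
    }
    *victim = (Entry){ true, vpn, ppn };
}

/* Called before any operation that reveals the exact TLB contents. */
static void sync_before_observation(LazyMmu *m)
{
    if (!m->emu_has_stale) {
        return;
    }
    memset(m->emu, 0, sizeof m->emu);  /* pay for the deferred evictions now */
    m->emu_has_stale = false;
    m->full_flushes++;
}

/* Reading a slot exposes real state, so stop lying first. */
static Entry tlb_read_slot(LazyMmu *m, unsigned slot)
{
    assert(slot < GUEST_SLOTS);
    sync_before_observation(m);
    return m->guest[slot];
}
```

The design choice is to trade many cheap-looking per-page flushes for one
occasional full flush, which only pays off if the guest rarely inspects its
TLB.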
Large pages add even more complications. The qemu TLB can only handle a single
page size. In practice this means that when large pages are used, invalidating
a single page entry requires the whole qemu TLB to be flushed. I'm pretty sure
x86 gets this wrong and works mainly by chance (nothing actually uses large
pages enough to notice it's broken). ARM takes the hit of a full TLB flush
(linux breaks if you only flush a 1k region of a 4k entry), but single page
flushes are rare so in practice this doesn't hurt too much.
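
A rough sketch of why a single-page-size TLB ends up doing full flushes once
large pages are involved. Everything here (EmuTlb, emu_tlb_fill,
emu_tlb_flush_page, the 1k granule, the large_page_seen flag) is hypothetical
and only meant to illustrate the problem; it is not how qemu tracks this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define EMU_PAGE_BITS 10    /* the only page size the emulator TLB knows: 1k */
#define EMU_SIZE      256   /* hypothetical direct-mapped size */

typedef struct { bool valid; uint32_t vpn, ppn; } Entry;

typedef struct {
    Entry e[EMU_SIZE];
    bool large_page_seen;   /* some entry is a slice of a bigger guest page */
    unsigned full_flushes;
} EmuTlb;

/* Cache one 1k slice of a guest mapping whose real size may be larger. */
static void emu_tlb_fill(EmuTlb *t, uint32_t vaddr, uint32_t paddr,
                         uint32_t guest_page_size)
{
    uint32_t vpn = vaddr >> EMU_PAGE_BITS;
    assert(guest_page_size >= (1u << EMU_PAGE_BITS));
    if (guest_page_size > (1u << EMU_PAGE_BITS)) {
        t->large_page_seen = true;
    }
    t->e[vpn % EMU_SIZE] = (Entry){ true, vpn, paddr >> EMU_PAGE_BITS };
}

static bool emu_tlb_hit(const EmuTlb *t, uint32_t vaddr)
{
    uint32_t vpn = vaddr >> EMU_PAGE_BITS;
    const Entry *e = &t->e[vpn % EMU_SIZE];
    return e->valid && e->vpn == vpn;
}

/* Guest invalidates "the page containing vaddr". The TLB does not record
 * which 1k slices belong to the same large guest page, so if any large page
 * was cached the only safe answer is to flush everything. */
static void emu_tlb_flush_page(EmuTlb *t, uint32_t vaddr)
{
    if (t->large_page_seen) {
        memset(t->e, 0, sizeof t->e);
        t->large_page_seen = false;
        t->full_flushes++;
        return;
    }
    uint32_t vpn = vaddr >> EMU_PAGE_BITS;
    Entry *e = &t->e[vpn % EMU_SIZE];
    if (e->valid && e->vpn == vpn) {
        e->valid = false;
    }
}
```

Flushing only the slice at vaddr would leave the sibling slices of the same
4k guest page live, which is the 1k-of-4k breakage mentioned above.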
Paul