
Re: [Qemu-devel] [PATCH][RFC] Split non-TCG bits out of exec.c


From: Avi Kivity
Subject: Re: [Qemu-devel] [PATCH][RFC] Split non-TCG bits out of exec.c
Date: Sun, 16 Nov 2008 15:07:30 +0200
User-agent: Thunderbird 2.0.0.16 (X11/20080723)

Jamie Lokier wrote:
> Avi Kivity wrote:
>> Jamie Lokier wrote:
>>> But does the fact KVM doesn't use TCG prevent KVM from running some
>>> x86 modes correctly?  E.g. I gather 16-bit code is run by KVM using
>>> VM86 mode, which is not exactly correct.  It would be nice to have KVM
>>> acceleration but also complete and correct emulation, by switching to
>>> TCG for those modes.
>> There is work in progress to make 16-bit emulation fully accurate.
>
> Ooh!  I want my Windows 95 to run in KVM :-)
> I'm curious, how is this planned to work?
>
> I'm having trouble thinking of how to do it without software emulation
> at some stage.


By emulating all instructions that can't be virtualized.

>> Since TCG is not smp-safe, this is very problematic for smp guests. You would have to stop virtualization on all vcpus and start tcg on all of them. Performance would plummet.
>
> On the other hand, when running on a KVM-capable architecture
> combination, it is definitely possible to make TCG smp-safe because
> every guest atomic instruction has a corresponding host one.  It's
> practically a 1:1 instruction mapping on x86, which doesn't have many
> atomic instructions.  (Maybe harder on other archs).


Maybe. It's simpler to fix kvm so that it doesn't need to fall back to TCG. I don't want kvm to be tied to qemu; when userspace tells kvm to run a vcpu, it means run the vcpu, not "run the vcpu unless there are some instructions you can't run for some undocumented reason".
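
The contract described above can be illustrated with a toy run loop. This is only a sketch under stated assumptions: `run_vcpu`, `ExitReason`, the instruction names and the `NOT_VIRTUALIZABLE` set are all hypothetical and are not the kvm API. The point it shows is that instructions the hardware cannot run are emulated inside the call, and the caller only ever sees documented exit reasons.

```python
from enum import Enum, auto


class ExitReason(Enum):
    # Documented reasons for returning to the caller (hypothetical subset).
    MMIO = auto()
    HLT = auto()


# Hypothetical instruction kinds the toy "hardware" cannot run natively.
NOT_VIRTUALIZABLE = {"real_mode_op"}
# Instruction kinds that legitimately hand control back to the caller.
EXITS = {"mmio": ExitReason.MMIO, "hlt": ExitReason.HLT}


def run_vcpu(program, pc=0):
    """Run from pc until a documented exit; return (reason, next_pc, stats)."""
    stats = {"native": 0, "emulated": 0}
    while pc < len(program):
        kind = program[pc]
        pc += 1
        if kind in EXITS:
            return EXITS[kind], pc, stats
        if kind in NOT_VIRTUALIZABLE:
            # Emulated internally; the caller never sees a
            # "cannot run this instruction" exit.
            stats["emulated"] += 1
        else:
            stats["native"] += 1
    # Toy convention: running off the end of the program counts as a halt.
    return ExitReason.HLT, pc, stats


def demo_run_contract():
    program = ["alu", "real_mode_op", "alu", "mmio", "real_mode_op", "hlt"]
    first = run_vcpu(program)
    second = run_vcpu(program, pc=first[1])
    return first, second
```

In this model the caller resumes from the returned `next_pc` after handling each exit, and the emulated instructions show up only in the statistics, never as an exit reason.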

>> There are ways of mitigating the high mmio cost with kvm. For framebuffers, one can allow kvm direct access. For other mmio, there's the 'coalesced mmio' support which allows mmio to be batched when this does not affect emulation accuracy and latency.
>
> Don't you still have to trap for each MMIO in order to collect the
> batch, except for REP instructions?  It's the traps which are expensive.
>
> Fortunately modern hardware tends to use DMA for data intensive
> things, and MMIO just to trigger DMA, and initialisation.

In practice things work fine. 16-color modes are slow, but only very old software was designed to work with them, so it expected the hardware to be slow.
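
A toy model may help with the trap-versus-batch question in the quoted exchange. It is a sketch only: `ToyVcpu`, its fields, the ring capacity and the addresses are hypothetical and do not reflect kvm's actual data structures. The assumption being modelled is that every MMIO write still traps out of guest mode, but writes to a registered zone are queued in a ring and handed to the userspace device model in one batch, so the saving is in userspace round trips rather than in traps.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ToyVcpu:
    """Toy model of coalesced MMIO; all names are hypothetical."""
    ring_capacity: int = 4
    coalesced_zones: list = field(default_factory=list)  # (start, size) pairs
    ring: deque = field(default_factory=deque)
    traps: int = 0            # every MMIO write still leaves guest mode
    userspace_exits: int = 0  # expensive round trips to the device model
    device_log: list = field(default_factory=list)  # writes the device model saw

    def register_zone(self, start, size):
        self.coalesced_zones.append((start, size))

    def _is_coalesced(self, addr):
        return any(s <= addr < s + n for s, n in self.coalesced_zones)

    def _exit_to_userspace(self, write=None):
        self.userspace_exits += 1
        # Queued writes are replayed first so the device sees program order.
        while self.ring:
            self.device_log.append(self.ring.popleft())
        if write is not None:
            self.device_log.append(write)

    def mmio_write(self, addr, value):
        self.traps += 1
        if self._is_coalesced(addr):
            if len(self.ring) == self.ring_capacity:
                self._exit_to_userspace()  # ring full: flush the batch
            self.ring.append((addr, value))
        else:
            self._exit_to_userspace((addr, value))


def demo_coalescing():
    vcpu = ToyVcpu(ring_capacity=4)
    vcpu.register_zone(0xA0000, 0x20000)   # illustrative zone only
    for i in range(8):
        vcpu.mmio_write(0xA0000 + i, i)    # batched writes
    vcpu.mmio_write(0xFEB00000, 1)         # unregistered address: exits at once
    return vcpu
```

In this model nine writes cause nine traps but only two exits to userspace, and the device model still observes all nine writes in their original order.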

--
error compiling committee.c: too many arguments to function




