qemu-devel



From: Avi Kivity
Subject: Re: [Qemu-devel] Re: [PATCH 26/35] kvm: Eliminate KVMState arguments
Date: Tue, 11 Jan 2011 16:52:37 +0200
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101209 Fedora/3.1.7-0.35.b3pre.fc14 Lightning/1.0b3pre Thunderbird/3.1.7

On 01/11/2011 04:28 PM, Anthony Liguori wrote:
On 01/11/2011 08:18 AM, Avi Kivity wrote:
On 01/11/2011 04:00 PM, Anthony Liguori wrote:
On 01/11/2011 03:01 AM, Avi Kivity wrote:
On 01/10/2011 10:23 PM, Anthony Liguori wrote:
I don't see how ioapic, pit, or pic have a system scope.
They are not bound to any CPU like the APIC which you may have in mind.

And none of the above interact with KVM.

They're implemented by kvm. What deeper interaction do you have in mind?

The emulated ioapic/pit/pic do not interact with KVM at all.

How can they "not interact" with kvm if they're implemented by kvm?

I really don't follow here.

"emulated ioapic/pit/pic" == versions implemented in QEMU. That's what I'm trying to say. When not using the KVM versions of the devices, there are no interactions with KVM.

Okay. Isn't that the same for the CPU? Yet we use the same CPUState and are live-migration compatible (as long as cpuids match).



The KVM versions should be completely separate devices.


Why?

Because the KVM versions are replacements.

Only the implementation. The guest doesn't see the replacement. They have exactly the same state.


I don't see why. Those are just two different implementations for the same guest visible device.

Right, they should appear the same to the guest but the fact that they're two different implementations should be reflected in the device tree.

Why?

To move beyond single-word questions, what is the purpose of the device tree? In my mind, it reflects the virtual hardware. What's important is that we have a PIC, virtio network adapter, and IDE disk. Not that they're backed by kvm, vhost-net, and qcow2.

Let me give a very concrete example to illustrate my point.

One thing I have on my TODO is to implement catch-up support for the emulated devices. I want to implement three modes of catch-up support: drop, fast, and gradual. Gradual is the best policy IMHO but fast is necessary on older kernels without highres timers. Drop is necessary to maintain compatibility with what we have today.
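The three policies could be sketched roughly as below. This is illustrative C only, not QEMU's actual PIT code; the function name, enum, and the "at most twice the normal rate" choice for gradual are all hypothetical:

```c
/* Hypothetical sketch of the three catch-up policies described above.
 * None of these names exist in QEMU; the gradual rate cap is an
 * assumption for illustration. */
#include <assert.h>

typedef enum { CATCHUP_DROP, CATCHUP_FAST, CATCHUP_GRADUAL } CatchupPolicy;

/* Given how many timer ticks fell behind, decide how many interrupts
 * to inject in the next period. */
static int ticks_to_inject(CatchupPolicy policy, int backlog)
{
    switch (policy) {
    case CATCHUP_DROP:
        return backlog > 0 ? 1 : 0;   /* lost ticks are simply dropped */
    case CATCHUP_FAST:
        return backlog;               /* replay the whole backlog at once;
                                         needs host highres timers to fire
                                         back-to-back */
    case CATCHUP_GRADUAL:
        /* slew: drain the backlog at a bounded elevated rate */
        return backlog > 2 ? 2 : backlog;
    }
    return 0;
}
```

Drop never recovers the lost ticks, fast dumps them immediately, gradual slews the guest clock back into sync over several periods.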

The kernel PIT only implements one mode and even if the other two were added, even the newest version of QEMU needs to deal with the fact that there are old kernels out there with PITs that only do fast.

So how does this get exposed to management tools? Do you check for drift-mode=fast and transparently enable the KVM pit? Do you fail if anything but drift-mode=fast is specified?

We need to have the following mechanisms:

1) the ability to select an in-kernel PIT vs. a userspace PIT

2) an independent mechanism to configure the userspace PIT

3) an independent mechanism to configure the in-kernel PIT.

The best way to do this is to make the in-kernel PIT a separate device. Then we get all of this for free.
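The "for free" part could be sketched roughly as follows. This is not qdev's actual API; the struct, table, and property names are hypothetical, purely to illustrate how separate device types give per-device option validation by construction:

```c
/* Illustrative-only sketch (not QEMU's qdev API): the userspace and
 * in-kernel PIT registered as two separate device types, each declaring
 * only the properties it actually supports. */
#include <assert.h>
#include <string.h>

typedef struct DeviceType {
    const char *name;
    const char **valid_props;   /* NULL-terminated */
} DeviceType;

static const char *pit_props[]     = { "catchup", NULL };  /* userspace PIT */
static const char *kvm_pit_props[] = { NULL };             /* kernel PIT: no knobs */

static const DeviceType device_types[] = {
    { "pit",     pit_props },
    { "kvm-pit", kvm_pit_props },
};

/* Option validation falls out of the registration: a property is valid
 * iff the selected device type declares it. */
static int prop_valid(const char *dev, const char *prop)
{
    for (size_t i = 0; i < sizeof(device_types) / sizeof(device_types[0]); i++) {
        if (strcmp(device_types[i].name, dev) == 0) {
            for (const char **p = device_types[i].valid_props; *p; p++) {
                if (strcmp(*p, prop) == 0) {
                    return 1;
                }
            }
            return 0;
        }
    }
    return 0;   /* unknown device type */
}
```

With one device type and a model= property, the same check would instead need a hand-written table of which properties are legal under which model.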

And it buys us live migration and ABI issues for the same price.

Really, can't we do

    class i8254 {
        ...
        virtual void set_catchup_policy(std::string policy) = 0;
        ...
    }

to deal with the differences?
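For reference, the same shape in QEMU's own language would presumably be an ops table rather than a virtual method. Everything below (PITOps, the policy strings, which modes each implementation accepts) is hypothetical illustration, not actual QEMU code:

```c
/* C rendition of the abstract-class sketch above (illustrative only):
 * one i8254 device, two implementations behind an ops table. */
#include <assert.h>
#include <string.h>

typedef struct PITState PITState;

typedef struct PITOps {
    /* returns 0 on success, -1 if the policy is unsupported */
    int (*set_catchup_policy)(PITState *s, const char *policy);
} PITOps;

struct PITState {
    const PITOps *ops;   /* filled in with the qemu or kvm implementation */
};

static int pit_set_catchup_policy(PITState *s, const char *policy)
{
    return s->ops->set_catchup_policy(s, policy);
}

static int qemu_pit_set_catchup(PITState *s, const char *policy)
{
    (void)s;
    /* assume the userspace PIT grows all three modes */
    return (!strcmp(policy, "drop") || !strcmp(policy, "fast") ||
            !strcmp(policy, "gradual")) ? 0 : -1;
}

static int kvm_pit_set_catchup(PITState *s, const char *policy)
{
    (void)s;
    /* the in-kernel PIT only does one mode today */
    return strcmp(policy, "fast") == 0 ? 0 : -1;
}

static const PITOps qemu_pit_ops = { .set_catchup_policy = qemu_pit_set_catchup };
static const PITOps kvm_pit_ops  = { .set_catchup_policy = kvm_pit_set_catchup };

static int try_policy(const PITOps *ops, const char *policy)
{
    PITState s = { .ops = ops };
    return pit_set_catchup_policy(&s, policy);
}
```

The guest-facing device is one type either way; only the ops pointer differs between the two backends.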



2) a user can explicitly create either the emulated version of the device or the in-kernel version of the device (no need for -no-kvm-irqchip)

-device ioapic,model=kernel vs. -device kvm-ioapic?

Is it really important to do that? 110% of the time we want the kernel irqchips. The remaining -10% are only used for testing.

If model=kernel makes the supported options different, then you end up introducing another layer of option validation. By using the latter form, you get to leverage the option validation of qdev, plus it makes it much clearer to users what options are supported in which model, because the documentation is explicit about it.

Option validation = internals. ABI = ABI. We can deal with the former in any number of ways, but exposing it to the ABI is forever.



3) a user can pass parameters directly to the in-kernel version of the device that are different from the userspace version (like selecting different interrupt catch-up methods)

-device pit,model=qemu,catchup=slew

error: catchup=slew not supported in this model

I'm not overly concerned about the implementation part. Though I think it's better to have a single implementation with kvm acting as an accelerator, having it the other way is no big deal. What I am worried about is exposing it as a monitor and migration ABI. IMO the only important thing is the spec that the device implements, not what piece of code implements it.

Just as we do with the PIT, there's nothing wrong with making the devices' migration state compatible.

Then the two devices have the same migration section id? That's my biggest worry. Not really worried about PIT and PIC (no one uses the user PIT now), but more about future devices moving into the kernel, if we have to do that.
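The section-id concern could be sketched roughly like this. The structs below are hypothetical stand-ins, not QEMU's actual vmstate types; the point is only that migration compatibility hinges on the two implementations sharing one section id and state layout:

```c
/* Illustrative sketch (not QEMU's vmstate API): if both PIT
 * implementations save under the same section id and layout, a
 * migration stream from one side loads on the other; if the in-kernel
 * device registers its own id, it doesn't. */
#include <assert.h>
#include <string.h>
#include <stdint.h>

typedef struct PITCommonState {
    uint8_t  mode;
    uint16_t count;          /* guest-visible state only; identical for both */
} PITCommonState;

typedef struct VMSection {
    const char *id;          /* e.g. "i8254" */
    PITCommonState state;
} VMSection;

/* Loading succeeds iff the section ids match; the receiving side may
 * back the device with either implementation. */
static int vmsection_load(const VMSection *wire, const char *expected_id,
                          PITCommonState *out)
{
    if (strcmp(wire->id, expected_id) != 0) {
        return -1;           /* unknown section: migration fails */
    }
    *out = wire->state;
    return 0;
}
```

Under two separate device types, nothing forces the ids or layouts to stay in sync, which is exactly the forward-migration worry for future in-kernel devices.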

I'm not entirely sure what your concerns about the monitor are but there's simply no way to hide the fact that a device is implemented in KVM at the monitor level.

Why is that? a PIT is a PIT. Why does the monitor care where the state is managed?

But really, is this something that management tools want? I doubt it. I think they want to have ultimate control over what gets created with us providing a recommended set of defaults.

They also want a forward migration path. Splitting into two separate devices (at the ABI level, ignoring the source level for now) denies them that.

--
error compiling committee.c: too many arguments to function



