Re: [Qemu-devel] Semantics of "-cpu host" (was Re: [PATCH 2/2] Expose ts

qemu-devel
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] Semantics of "-cpu host" (was Re: [PATCH 2/2] Expose ts

From:	Gleb Natapov
Subject:	Re: [Qemu-devel] Semantics of "-cpu host" (was Re: [PATCH 2/2] Expose tsc deadline timer cpuid to guest)
Date:	Thu, 10 May 2012 15:53:42 +0300
On Wed, May 09, 2012 at 04:38:02PM -0300, Eduardo Habkost wrote:
> On Wed, May 09, 2012 at 12:38:37PM +0300, Gleb Natapov wrote:
> > On Wed, May 09, 2012 at 11:05:58AM +0200, Alexander Graf wrote:
> > > 
> > > On 09.05.2012, at 10:51, Gleb Natapov wrote:
> > > 
> > > > On Wed, May 09, 2012 at 10:42:26AM +0200, Alexander Graf wrote:
> > > >> 
> > > >> 
> > > >> On 09.05.2012, at 10:14, Gleb Natapov <address@hidden> wrote:
> > > >> 
> > > >>> On Wed, May 09, 2012 at 12:07:04AM +0200, Alexander Graf wrote:
> > > >>>> 
> > > >>>> On 08.05.2012, at 22:14, Eduardo Habkost wrote:
> > > >>>> 
> > > >>>>> On Tue, May 08, 2012 at 02:58:11AM +0200, Alexander Graf wrote:
> > > >>>>>> On 07.05.2012, at 20:21, Eduardo Habkost wrote:
> > > >>>>>> 
> > > >>>>>>> 
> > > >>>>>>> Andre? Are you able to help to answer the question below?
> > > >>>>>>> 
> > > >>>>>>> I would like to clarify what's the expected behavior of "-cpu 
> > > >>>>>>> host" to
> > > >>>>>>> be able to continue working on it. I believe the code will need 
> > > >>>>>>> to be
> > > >>>>>>> fixed on either case, but first we need to figure out what are the
> > > >>>>>>> expectations/requirements, to know _which_ changes will be needed.
> > > >>>>>>> 
> > > >>>>>>> 
> > > >>>>>>> On Tue, Apr 24, 2012 at 02:19:25PM -0300, Eduardo Habkost wrote:
> > > >>>>>>>> (CCing Andre Przywara, in case he can help to clarify what's the
> > > >>>>>>>> expected meaning of "-cpu host")
> > > >>>>>>>> 
> > > >>>>>>> [...]
> > > >>>>>>>> I am not sure I understand what you are proposing. Let me 
> > > >>>>>>>> explain the
> > > >>>>>>>> use case I am thinking about:
> > > >>>>>>>> 
> > > >>>>>>>> - Feature FOO is of type (A) (e.g. just a new instruction set 
> > > >>>>>>>> that
> > > >>>>>>>> doesn't require additional userspace support)
> > > >>>>>>>> - User has a Qemu vesion that doesn't know anything about 
> > > >>>>>>>> feature FOO
> > > >>>>>>>> - User gets a new CPU that supports feature FOO
> > > >>>>>>>> - User gets a new kernel that supports feature FOO (i.e. has FOO 
> > > >>>>>>>> in
> > > >>>>>>>> GET_SUPPORTED_CPUID)
> > > >>>>>>>> - User does _not_ upgrade Qemu.
> > > >>>>>>>> - User expects to get feature FOO enabled if using "-cpu host", 
> > > >>>>>>>> without
> > > >>>>>>>> upgrading Qemu.
> > > >>>>>>>> 
> > > >>>>>>>> The problem here is: to support the above use-case, userspace 
> > > >>>>>>>> need a
> > > >>>>>>>> probing mechanism that can differentiate _new_ (previously 
> > > >>>>>>>> unknown)
> > > >>>>>>>> features that are in group (A) (safe to blindly enable) from 
> > > >>>>>>>> features
> > > >>>>>>>> that are in group (B) (that can't be enabled without an userspace
> > > >>>>>>>> upgrade).
> > > >>>>>>>> 
> > > >>>>>>>> In short, it becomes a problem if we consider the following case:
> > > >>>>>>>> 
> > > >>>>>>>> - Feature BAR is of type (B) (it can't be enabled without extra
> > > >>>>>>>> userspace support)
> > > >>>>>>>> - User has a Qemu version that doesn't know anything about 
> > > >>>>>>>> feature BAR
> > > >>>>>>>> - User gets a new CPU that supports feature BAR
> > > >>>>>>>> - User gets a new kernel that supports feature BAR (i.e. has BAR 
> > > >>>>>>>> in
> > > >>>>>>>> GET_SUPPORTED_CPUID)
> > > >>>>>>>> - User does _not_ upgrade Qemu.
> > > >>>>>>>> - User simply shouldn't get feature BAR enabled, even if using 
> > > >>>>>>>> "-cpu
> > > >>>>>>>> host", otherwise Qemu would break.
> > > >>>>>>>> 
> > > >>>>>>>> If userspace always limited itself to features it knows about, 
> > > >>>>>>>> it would
> > > >>>>>>>> be really easy to implement the feature without any new probing
> > > >>>>>>>> mechanism from the kernel. But that's not how I think users 
> > > >>>>>>>> expect "-cpu
> > > >>>>>>>> host" to work. Maybe I am wrong, I don't know. I am CCing Andre, 
> > > >>>>>>>> who
> > > >>>>>>>> introduced the "-cpu host" feature, in case he can explain 
> > > >>>>>>>> what's the
> > > >>>>>>>> expected semantics on the cases above.
> > > >>>>>> 
> > > >>>>>> Can you think of any feature that'd go into category B?
> > > >>>>> 
> > > >>>>> - TSC-deadline: can't be enabled unless userspace takes care to 
> > > >>>>> enable
> > > >>>>> the in-kernel irqchip.
> > > >>>> 
> > > >>>> The kernel can check if in-kernel irqchip has it enabled and 
> > > >>>> otherwise mask it out, no?
> > > >>>> 
> > > >>> How kernel should know that userspace does not emulate it?
> > > >> 
> > > >> You have to enable the in-kernel apic to use it, at which point the 
> > > >> kernel knows it's in use, right?
> > > >> 
> > > >>> 
> > > >>>>> - x2apic: ditto.
> > > >>>> 
> > > >>>> Same here. For user space irqchip the kernel side doesn't care. If 
> > > >>>> in-kernel APIC is enabled, check for its capabilities.
> > > >>>> 
> > > >>> Same here.
> > > >>> 
> > > >>> Well, technically both of those features can't be implemented in
> > > >>> userspace right now since MSRs are terminated in the kernel, but I
> > > >> 
> > > >> Doesn't sound like the greatest design - unless you deprecate the 
> > > >> non-in-kernel apic case.
> > > >> 
> > > > You mean terminating MSRs in kernel does not sound like the greatest
> > > > design? I do not disagree. That is why IMO kernel can't filter out
> > > > TSC-deadline and x2apic like you suggest.
> > > 
> > > I still don't see why it can't.
> > > 
> > > Imagine we would filter TSC-deadline and x2apic by default in the kernel 
> > > - they are not known to exist yet.
> > > Now, we implement TSC-deadline in the kernel. We still filter
> > > TSC-deadline out in GET_SUPORTED_CPUID in the kernel. But we provide
> > > an interface to user space that says "call me to enable TSC-deadline
> > > CPUID, but only if you're using the in-kernel apic"
> 
> We have that interface already, it is called KVM_SET_CPUID.  :-)
> 
> > > New user space calls that ioctl when it's using the in-kernel apic, it 
> > > doesn't when it's using the user space apic.
> > > Old user space doesn't call that ioctl.
> > First of all we already have TSC-deadline in GET_SUPORTED_CPUID without
> > any additional ioctls. And second I do not see why we need additional
> > iostls here.
> 
> We don't have TSC-deadline set today (and that's what started this
> thread), but we have x2apic. So what you say is true for x2apic, anyway.
> 
Hmm, strange. But s/TSC-deadline/x2apic/ and the point stands :)

> > Hmm, so may be I misunderstood you. You propose to mask TSC-deadline and
> > x2apic out from GET_SUPORTED_CPUID if irq chip is not in kernel, not
> > from KVM_SET_CPUID? For those two features it may make sense indeed.
> 
> It makes sense to me.
> 
> It looks like my assumptions were wrong. They were:
> 
> - GET_SUPPORTED_CPUID simply can't know if the in-kernel irqchip is
>   going to be enabled or not.
I think this is currently true. Before we changing that we have to make
sure that no existing userspace calls GET_SUPPORTED_CPUID before
creating irqchip. Not sure we can check all existing userspaces.

> - GET_SUPPORTED_CPUID output has to be a function of the kernel code
>   capabilitie and host CPU, and not depend on any input from userspace.
> 
> Are those assumptions incorrect? If we break them, we may try what
> Alexander is proposing. It would be much more flexible than the options
> I was considering.
> 
> I didn't know ENABLE_CAP existed. Even if GET_SUPPORTED_CPUID can't
What is ENABLE_CAP?

> check for the in-kernel irqchip setup for some reason, ENABLE_CAP could
> be used by userpace to tell the kernel "I will enable the in-kernel
> irqchip, so feel free to return features that depend on it on
> GET_SUPPORTED_CPUID".
> 
> In other words, we would return only the type-A features on
> GET_SUPPORTED_CPUID (i.e. safe to be blindly enabled by -cpu host as
> long as migration is not required), but if we use ENABLE_CAP we can make
> group A safely grow, as long as userspace first tells the kernel what it
> supports.
> 
> Anybody is against doing that? Otherwise I plan to work on this.
> Probably I will start by making GET_SUPPORTED_CPUID not return x2apic
> unless userspace tells the kernel (using ENABLE_CAP) that it will enable
> the in-kernel irqchip. Then we can do the same with TSC-deadline.
> 
> Note that all this work is to allow the kernel to let userspace blindly
> enable features it _doesn't know yet_. If we limit ourselves to features
> userspace already knows about, we could simply remove x2apic and
> TSC-deadline from GET_SUPPORTED_CPUID completely, not use ENABLE_CAP for
> that, and let userspace set x2apic or TSC-deadline on KVM_SET_CPUID only
> if it knows it is safe (either because it checked for the corresponding
> KVM_CAP_* capability is present and it will enable the in-kernel
> irqchip, or because it will emulated it in userspace).
> 
There is not KVM_CAP_* for x2apic as far as I see.

> In case anybody is against the proposal above: note that the current
> documented GET_SUPPORTED_CPUID semantics (unconditionally returning bits
> that depend on specific userspace behavior/capabilities) is simply
> unusable by "-cpu host". If the above proposal gets rejected, my Plan B
> is to update the GET_SUPPORTED_CPUID documentation to note that it
> returns only type-A features (features that userspace can safely enable
> even if it doesn't know what it does), remove x2apic from
> GET_SUPPORTED_CPUID forever, and use KVM_CAP_* for discovery of all
> type-B features (features that depend on specific userspace
> capabilities/behavior).
> 
This will break existing userspaces, no?

> 
> > Not
> > sure there won't be others that are not dependent on irq chip presence.
> > You propose to add additional ioctls to enable them if they appear?
> 
> I am sure there will be new features in the future that don't depend on
> any userspace support, so they would be enabled on GET_SUPPORTED_CPUID
> unconditionally.
> 
Those not the kind of features I am worry about though. I worry about
the features that can be emulated either by kernel or by userspace and
userspace may choose where it wants the feature to be emulated.

> But if we have new features that depend on specific userspace
> capabilities/behavior (i.e. enabling the irqchip, or something else), we
> could also add them as long as we check if that capability/behavior was
> enabled using ENABLE_CAP.
> 
> 
> > > 
> > > So at the end all bits in GET_SUPPORTED_CPUID are consistent with what 
> > > user space is capable of.
> > > 
> > GET_SUPPORTED_CPUID should not be necessary consistent with what user
> > space is capable of. Userspace may emulate features that are not in
> > GET_SUPPORTED_CPUID.
> 
> True. We don't need to make the interface too complex just to make
> GET_SUPPORTED_CPUID match exactly what userspace is going to enable. If
> userspace wants to enable a feature because it can emulate it by its
> own, it can just enable it using SET_CPUID.
> 
> -- 
> Eduardo

--
                        Gleb.
[Prev in Thread]
Current Thread
[Next in Thread]
Re: [Qemu-devel] Semantics of "-cpu host" (was Re: [PATCH 2/2] Expose tsc deadline timer cpuid to guest), (continued)
- Re: [Qemu-devel] Semantics of "-cpu host" (was Re: [PATCH 2/2] Expose tsc deadline timer cpuid to guest), Andre Przywara, 2012/05/09
Prev by Date: Re: [Qemu-devel] Idea: fuse-kvm filesystem
Next by Date: Re: [Qemu-devel] Idea: fuse-kvm filesystem
Previous by thread: Re: [Qemu-devel] Semantics of "-cpu host" (was Re: [PATCH 2/2] Expose tsc deadline timer cpuid to guest)
Next by thread: Re: [Qemu-devel] Semantics of "-cpu host" (was Re: [PATCH 2/2] Expose tsc deadline timer cpuid to guest)
Index(es):
- Date
- Thread