qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 2/2] Expose tsc deadline timer cpuid to guest


From: Eduardo Habkost
Subject: Re: [Qemu-devel] [PATCH 2/2] Expose tsc deadline timer cpuid to guest
Date: Mon, 23 Apr 2012 11:48:18 -0300
User-agent: Mutt/1.5.21 (2010-09-15)

On Sat, Apr 21, 2012 at 09:23:50AM +0200, Jan Kiszka wrote:
> On 2012-04-20 17:36, Eduardo Habkost wrote:
> > On Fri, Apr 20, 2012 at 05:19:17PM +0200, Jan Kiszka wrote:
> >> On 2012-04-20 17:00, Eduardo Habkost wrote:
> >>> On Fri, Apr 20, 2012 at 12:12:38PM +0200, Jan Kiszka wrote:
> >>>> On 2012-04-19 22:03, Eduardo Habkost wrote:
> >>>>> Jan/Avi: ping?
> >>>>>
> >>>>> I would like to get this ABI detail clarified so it can be implemented
> >>>>> the right way on Qemu and KVM.
> >>>>>
> >>>>> My proposal is to simply add tsc-deadline to the data returned by
> >>>>> GET_SUPPORTED_CPUID, making KVM_CAP_TSC_DEADLINE_TIMER unnecessary.
> >>>>>
> >>>>
> >>>> IIRC, there were problems with this model to exclude that the feature
> >>>> gets reported this way without ensuring that in-kernel irqchip is
> >>>> actually activated. Please browse the discussions, it should be
> >>>> documented there.
> >>>
> >>> The flag was never added to GET_SUPPORTED_CPUID on any of the git
> >>> repositories I have checked, and I don't see that being done anywhere on
> >>> my KVM mailing list archives, either. So I don't see how we could have
> >>> had issues with GET_SUPPORTED_CPUID, if it was never present on the
> >>> code.
> >>>
> >>> What was present on the code before the fix, was coded that
> >>> unconditinally enabled the flag even if Qemu never asked for it, and
> >>> that really was wrong.
> >>>
> >>> GET_SUPPORTED_CPUID doesn't enable any flag: it just tells userspace
> >>> that the hardware and KVM supports the feature (and, surprise, this is
> >>> exactly what KVM_CAP_TSC_DEADLINE_TIMER means too, isn't it?). It's up
> >>> to userspace to enable the CPUID bits according to user requirements and
> >>> userspace capabilities (in other words: only when userspace knows it is
> >>> safe because the in-kernel irqchip is enabled).
> >>>
> >>> If the above is not what GET_SUPPORTED_CPUID means, I would like to get
> >>> that clarified, so I can submit an updated to KVM API documentation.
> >>
> >> Does old userspace filter out flags from GET_SUPPORTED_CPUID that it
> >> does not understand?
> > 
> > It's even more strict than that: it only enables what was explicitly
> > enabled on the CPU model asked by the user.
> > 
> > But:
> > 
> > The only exception is "-cpu host", that tries to enable everything, even
> > flags Qemu never heard of, and that is something that may require some
> > changes on the API design (or at least documentation clarifications).
> > 
> > Today "-cpu host" can't differentiate (A) "a feature that KVM supports
> > and emulate by itself and can be enabled without any support from
> > userspace" and (B) "a feature that KVM supports but need support from
> > userspace to be enabled". I am sure we will be able to find multiple
> > examples of (B) inside the flags returned by GET_SUPPORTED_CPUID today.
> 
> The problem remains that exposing TSC_DEADLINE via SUPPORTED_CPUID is
> wrong in case userspace doesn't select the in-kernel APIC. The kernel
> does _nothing_ about emulating the flag if userspace provides the APIC,
> so it must not expose this feature. Even if this had no practical impact
> (but it has: old qemu[-kvm] + userspace APIC + -cpu host), it would
> still be semantically broken.

How is that different from any other feature that requires userspace to
"do the right thing" to make the feature work? GET_SUPPORTED_CPUID just
tells userspace that the flag is supported, but userspace has to be sure
it will really work, before enabling it.

In other words, I always assumed that features from the (B) group were
always included on GET_SUPPORTED_CPUID, too. Yes, that risks breaking
-cpu host, so we will probably need a better interface to let "-cpu
host" know what can be enabled or not, anyway.


> 
> > 
> > We could decide to never add any new flag to GET_SUPPORTED_CPUID if it
> > requires additional userspace support to work, from now on, and create
> > new KVM_CAP_* flags for them. But, we really want to do that for almost
> > every new CPUID feature bit in the future?
> 
> Most CPU features do not depend on our in-kernel irqchips. But if there
> is something to simplify in retrieving information about such
> "conditional features", let's do it.

But many CPU features on GET_SUPPORTED_CPUID depend on other userspace
capabilities, need userspace to do "the right thing" to make it work,
and won't work if userspace doesn't do what's required. Why configuring
the in-kernel irqchip is special?


> 
> > 
> > We also have the problem of defining what "requires support from
> > userspace to work" means: I am not sure this has the same meaning for
> > everybody. Many new features require userspace support only for
> > migration, and nothing else, but some users don't need migration to
> > work. Do those features qualify as (A), or as (B)?
> 
> "Works for most user" also means "breaks for some". And that is a bug,
> isn't it?

It is, surely, but -cpu host is the only case where CPU features are
enabled blindly, and migration is not expected to be available when
using -cpu host, so in that case "breaks only migration" would still
mean "works for all" (becasue except for -cpu host, Qemu won't enable
any feature if it still doesn't have the proper migration support
working).

So, it's still not clear to me: "breaks only migration" qualifies as (A)
or (B)? In other words, a "breaks only migration" feature should be
added to GET_SUPPORTED_CPUID, or not? There are many features already
inside GET_SUPPORTED_CPUID that require specific migration support to
work (e.g. many of them require additional registers to be
saved/loaded). I would bet that _most_ of them require some additional
stated to be saved/loaded.



Trying to summarize the points above:

Groups (A) and (B) are:

A) a feature that KVM supports and emulate by itself and can be enabled
   by userspace blindly, without requiring any additional userspace
   code to work.
B) a feature that KVM supports but need support from userspace to work.

We have to differentiate those two groups somehow, otherwise "-cpu host"
will always risk being unstable (in case we can't identify group (B) and
end up enabling a feature that will break) or useless (if group (A) is
considered always empty).

(If you think this two-group model is not sufficient, please let me know.)

Note that I am discussing two things above:

- Whether GET_SUPPORTED_CPUID should expose only features from group
  (A), or group (B) too.
  - One problem here is that today GET_SUPPORTED_CPUIDS have many
    examples of (B) features inside it. Should we stop doing that?
    - TSC-deadline is the first case where we are _not_ doing that. It
      is the first CPU feature in KVM that can be enabled by userspace
      (as long as userspace does the proper setup), but it's not
      included on GET_SUPPORTED_CPUIDs.
  - Even the current documentation implies that group (B) is included:

    "This ioctl returns x86 cpuid features which are supported by both
    the hardware and kvm.  Userspace can use the information returned by
    this ioctl to construct cpuid information (for KVM_SET_CPUID2) that
    is consistent with hardware, kernel, and userspace capabilities, and
                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^
    with user requirements (for example, the user may wish to constrain
    cpuid to emulate older hardware, or for feature consistency across a
    cluster)."

    In the specific case of TSC-deadline, I consider "Qemu knowing that
    TSC-deadline can be enabled only if in-kernel irqchip is enabled" as
    an "userpace capability".

- How to precisely define the groups (A) and (B)?
  - "requires additional code only if migration is required" qualifies
    as (B) or (A)?
  - "requires in-kernel irqchip to be enabled to work" qualifies as (B),
    doesn't it?

-- 
Eduardo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]