


From: David Gibson
Subject: Re: [Qemu-devel] [PATCH v3 0/4] target-ppc: Add FWNMI support in qemu for powerKVM guests
Date: Thu, 13 Nov 2014 14:57:52 +1100
User-agent: Mutt/1.5.23 (2014-03-12)

On Tue, Nov 11, 2014 at 12:45:05PM +0530, Aravinda Prasad wrote:
> 
> 
> On Tuesday 11 November 2014 08:54 AM, David Gibson wrote:
> > On Wed, Nov 05, 2014 at 12:42:03PM +0530, Aravinda Prasad wrote:
> >> This series of patches adds support for fwnmi in powerKVM guests.
> >>
> >> Currently, upon a machine check exception, if the address in
> >> error belongs to the guest, then KVM invokes the guest's NMI
> >> interrupt vector 0x200.
> >>
> >> This patch series adds functionality where the guest's 0x200
> >> interrupt vector is patched such that QEMU gets control. QEMU
> >> then builds error log and reports the error to OS registered
> >> machine check handlers through RTAS space.
> >>
> >> Apart from this, the patch series also takes care of synchronization
> >> when multiple processors encounter machine check at or about the
> >> same time.
> >>
> >> The patch set was tested by simulating a machine check error in
> >> the guest.
> >>
> >> Changes in v3:
> >>     - Incorporated review comments
> >>     - Byte codes in patch 4/4 are now moved to
> >>       pc-bios/spapr-rtas/spapr-rtas.S as instructions.
> >>     - Defined the RTAS blob in-memory layout.
> >>     - FIX: save and restore cr register in the trampoline
> >>
> >> Changes in v2:
> >>     - Re-based to github.com/agraf/qemu.git  branch: ppc-next
> >>     - Merged patches 4 and 5.
> >>     - Incorporated other review comments
> > 
> > So, this may not still be possible depending on whether the KVM side
> > of this is already merged, but it occurs to me that there's a simpler
> > way.
> 
> The KVM part is already merged. Commit ID: 74845bc

Ok, that makes life harder, though I guess without the qemu code
merged, no one would be using it yet, so it's not impossible to change
it still.

> > Rather than mucking about with having to update the hypervisor on the
> > RTAS location, then have qemu copy the code out of RTAS, patch it and
> > copy it back into the vector, you could instead do this:
> 
> Though this is possible, I have a couple of comments below.
> 
> > 
> >   1. Make KVM, instead of immediately delivering a 0x200 for a guest
> > machine check, cause a special exit to qemu.
> > 
> >   2. Have the register-nmi RTAS call store the guest side MC handler
> > address in the spapr structure, but perform no actual guest code
> > patching.
> > 
> >   3. Allocate the error log buffer independently from the RTAS blob,
> > so qemu always knows where it is.
> 
> As per PAPR, the error log buffer should be part of the RTAS blob, and
> the guest kernel explicitly checks that the error log is inside the
> RTAS blob. This requires qemu to know the RTAS location as updated by
> the OS, which is handled in patch 2/4.

Ugh, ok.  That's a pretty stupid interface requirement, even by PAPR
standards, but I guess we're stuck with it.
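
For illustration only, the kind of containment check being described
could look roughly like the sketch below. The function name and
parameters are hypothetical, and this is not the actual guest kernel
code, just an overflow-safe range test under the assumption that the
blob is described by a base address and a size.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if the buffer [log_addr, log_addr +
 * log_len) lies entirely inside the RTAS blob [rtas_base, rtas_base +
 * rtas_size). Written with subtractions so that it cannot overflow.
 */
static bool errlog_inside_rtas(uint64_t rtas_base, uint64_t rtas_size,
                               uint64_t log_addr, uint64_t log_len)
{
    if (log_addr < rtas_base) {
        return false;               /* starts below the blob */
    }

    uint64_t offset = log_addr - rtas_base;

    if (offset > rtas_size) {
        return false;               /* starts beyond the end of the blob */
    }

    /* The remaining room in the blob must cover the whole log. */
    return log_len <= rtas_size - offset;
}
```

A check of this shape is why qemu has to track wherever the OS has
moved RTAS to: a log placed at a fixed address outside the blob would
fail it.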

> >   4. When qemu gets the MC exit condition, instead of going via a
> > patched 0x200 vector, just directly set the guest register state and
> > jump straight into the guest side MC handler.
> 
> PAPR mentions:
> 
> "R1–7.3.14–8: Once the OS has registered for NMI notification, the
> platform firmware must intercept all System Reset Interrupts on all of
> the OS’s processors."
> 
> So do we need to go via 0x200?

I don't see why.  The hypervisor is already intercepting system resets
and machine checks because it's a hypervisor, and from the PAPR
guest's point of view, all it cares about is that you enter its
registered handler with the expected information available.

I don't see that the guest cares whether you bounce via a vector in
guest space or directly enter the guest supplied handler using
hypervisor magic.  Patching the guest's vector actually seems a pretty
awful hack that would only be necessary to work around limitations in
the virtualization capabilities which I don't think we have as of POWER8.
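
To make the suggestion concrete, here is a rough sketch of steps 2 and
4 in C. Every type and function name in it is made up for illustration
and none of it is existing qemu or KVM code. It also assumes, loosely,
that the guest handler expects the error log address in r3; the exact
register and MSR state PAPR requires on entry would need checking
against the spec.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architected vectors: system reset is 0x100, machine check is 0x200. */
enum {
    VECTOR_SYSTEM_RESET  = 0x100,
    VECTOR_MACHINE_CHECK = 0x200,
};

/* Hypothetical, heavily simplified stand-in for guest CPU state. */
typedef struct {
    uint64_t nip;       /* next instruction pointer */
    uint64_t gpr[32];   /* general purpose registers */
} GuestCpu;

/* Hypothetical stand-in for the per-machine (spapr) state. */
typedef struct {
    bool     mc_registered;   /* has the guest registered a handler? */
    uint64_t mc_handler;      /* guest address of the MC handler */
    uint64_t errlog_addr;     /* guest address of the error log */
} MachineModel;

/* Step 2: the register call only records the address, no patching. */
static void register_mc_handler(MachineModel *m, uint64_t handler)
{
    m->mc_handler = handler;
    m->mc_registered = true;
}

/*
 * Step 4: on the machine check exit, enter the registered handler
 * directly. Without a registration, fall back to the 0x200 vector.
 * Returns true if the handler was entered directly.
 */
static bool deliver_machine_check(const MachineModel *m, GuestCpu *cpu)
{
    if (!m->mc_registered) {
        cpu->nip = VECTOR_MACHINE_CHECK;
        return false;
    }

    cpu->gpr[3] = m->errlog_addr;   /* assumption: log address in r3 */
    cpu->nip = m->mc_handler;       /* no bounce through guest vectors */
    return true;
}
```

The point is just that nothing in guest memory at 0x200 has to be
touched; the only per-guest state is the handler address and the log
location.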

Btw, isn't a "System Reset Interrupt" vector 0x100, not vector 0x200?

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
