qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [Qemu-ppc] [PATCH v3 4/5] target/ppc: Handle NMI guest


From: David Gibson
Subject: Re: [Qemu-devel] [Qemu-ppc] [PATCH v3 4/5] target/ppc: Handle NMI guest exit
Date: Sun, 10 Sep 2017 13:22:02 +1000
User-agent: Mutt/1.8.3 (2017-05-23)

On Fri, Sep 08, 2017 at 01:39:37PM +0530, Aravinda Prasad wrote:
> 
> 
> On Wednesday 23 August 2017 02:09 PM, David Gibson wrote:
> > On Mon, Aug 21, 2017 at 06:00:52PM +0530, Aravinda Prasad wrote:
> >>
> >>
> >> On Thursday 17 August 2017 07:27 AM, David Gibson wrote:
> >>> On Wed, Aug 16, 2017 at 02:42:39PM +0530, Aravinda Prasad wrote:
> >>>> Memory error such as bit flips that cannot be corrected
> >>>> by hardware are passed on to the kernel for handling.
> >>>> If the memory address in error belongs to guest then
> >>>> guest kernel is responsible for taking suitable action.
> >>>> Patch [1] enhances KVM to exit guest with exit reason
> >>>> set to KVM_EXIT_NMI in such cases.
> >>>>
> >>>> This patch handles KVM_EXIT_NMI exit. If the guest OS
> >>>> has registered the machine check handling routine by
> >>>> calling "ibm,nmi-register", then the handler builds
> >>>> the error log and invokes the registered handler else
> >>>> invokes the handler at 0x200.
> >>>>
> >>>> [1] https://www.spinics.net/lists/kvm-ppc/msg12637.html
> >>>>  (e20bbd3d and related commits)
> >>>>
> >>>> Signed-off-by: Aravinda Prasad <address@hidden>
> >>>> ---
> >>>>  hw/ppc/spapr.c       |    4 ++
> >>>>  target/ppc/kvm.c     |   86 
> >>>> ++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>  target/ppc/kvm_ppc.h |   81 
> >>>> +++++++++++++++++++++++++++++++++++++++++++++++
> >>>>  3 files changed, 171 insertions(+)
> >>>>
> >>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >>>> index 0bb2c4a..6cc3f69 100644
> >>>> --- a/hw/ppc/spapr.c
> >>>> +++ b/hw/ppc/spapr.c
> >>>> @@ -2346,6 +2346,10 @@ static void ppc_spapr_init(MachineState *machine)
> >>>>          error_report("Could not get size of LPAR rtas '%s'", filename);
> >>>>          exit(1);
> >>>>      }
> >>>> +
> >>>> +    /* Resize blob to accommodate error log. */
> >>>> +    spapr->rtas_size = RTAS_ERRLOG_OFFSET + sizeof(struct RtasMCELog);
> >>>> +
> >>>>      spapr->rtas_blob = g_malloc(spapr->rtas_size);
> >>>>      if (load_image_size(filename, spapr->rtas_blob, spapr->rtas_size) < 
> >>>> 0) {
> >>>>          error_report("Could not load LPAR rtas '%s'", filename);
> >>>> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> >>>> index 8571379..73f64ed 100644
> >>>> --- a/target/ppc/kvm.c
> >>>> +++ b/target/ppc/kvm.c
> >>>> @@ -1782,6 +1782,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct 
> >>>> kvm_run *run)
> >>>>          ret = 0;
> >>>>          break;
> >>>>  
> >>>> +    case KVM_EXIT_NMI:
> >>>> +        DPRINTF("handle NMI exception\n");
> >>>> +        ret = kvm_handle_nmi(cpu);
> >>>> +        break;
> >>>> +
> >>>>      default:
> >>>>          fprintf(stderr, "KVM: unknown exit reason %d\n", 
> >>>> run->exit_reason);
> >>>>          ret = -1;
> >>>> @@ -2704,6 +2709,87 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
> >>>>      return data & 0xffff;
> >>>>  }
> >>>>  
> >>>> +int kvm_handle_nmi(PowerPCCPU *cpu)
> >>>
> >>> So you only handle NMIs with KVM.  Wouldn't it make sense to also
> >>> handle them for TCG (where they can be triggered with the "nmi"
> >>> command on the monitor).
> >>
> >> For TCG it is already handled in spapr_nmi() which ultimately branches
> >> to guest's 0x200 vector. Even with KVM when KVM_CAP_PPC_FWNMI is not
> >> enabled we branch to guest's 0x200 vector. Should TCG behave as if
> >> KVM_CAP_PPC_FWNMI is not enabled?
> > 
> > No, we should implement the FWNMI feature for TCG, which means it
> > needs to respect the address given by nmi-register.
> 
> I have couple of questions regarding supporting NMIs triggered through
> "nmi" command from the monitor.
> 
> 1. The FWNNMI builds an error log upon machine check that includes the
> effective address that triggered the machine check and places the error
> log in RTAS blob. Hence, we need to build error log for the NMI
> triggered via the monitor "nmi" command. What should be the effective
> address and the other values of the error log in this case? Should "nmi"
> command be enhanced to take these arguments (as of now it does not take
> any arguments)?

Ah!  That's something I hadn't realised - sounds like FWNMI is only
designed to handle synchronous machine checks, whereas the monitor
command (obviously) generates asynchronous checks.

In that case it's probably reasonable that the monitor command doesn't
go through the FWNMI path.  However, it would be nice to make it a bit
more obvious, by updating some of the comments and commit messages in
the FWNMI code to emphasise that it's only for synchronous checks.

> 2. On Power architecture "nmi" monitor command triggers NMI on all the
> CPUs, while the NMI due to hardware error is triggered only on the CPU
> that accessed the bad memory region. Triggering NMI on all the CPUs
> could be a problem as it cause kernel panic if the CPU was in kernel
> space at that time or kills the process running on that CPU if it was in
> userspace.
> 

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]