Re: [Qemu-devel] [PATCH 0/4] AER-KVM: Error containment of PCI pass-thru


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH 0/4] AER-KVM: Error containment of PCI pass-thru devices assigned to KVM guests
Date: Tue, 20 Nov 2012 14:41:04 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On Tue, Nov 20, 2012 at 06:31:48AM +0000, Pandarathil, Vijaymohan R wrote:
> Add support for error containment when a PCI pass-thru device assigned to a KVM
> guest encounters an error. This is for PCIe devices/drivers that support AER
> functionality. When the OS is notified of an error in a device either
> through the firmware first approach or through an interrupt handled by the AER
> root port driver, concerned subsystems are notified by invoking callbacks
> registered by these subsystems. The device is also marked as tainted till the
> corresponding driver recovery routines are successful. 
> 
> KVM module registers for a notification of such errors. In the KVM callback
> routine, a global counter is incremented to keep track of the error
> notification. Before each CPU enters guest mode to execute guest code,
> appropriate checks are done to see if the impacted device belongs to the guest
> or not. If the device belongs to the guest, qemu hypervisor for the guest is
> informed and the guest is immediately brought down, thus preventing or
> minimizing chances of any bad data being written out by the guest driver
> after the device has encountered an error.
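
For readers following the thread, the mechanism described in the quoted cover letter can be sketched roughly as below. This is a user-space illustration only, not code from the patch series: every name here (aer_error_count, aer_notify, guest_entry_must_stop, and the two structs) is hypothetical, and the real KVM hooks, locking, and device lookup are not shown in this message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this only models the described flow. */

/* Global counter, bumped once per AER error notification. */
static atomic_ulong aer_error_count;

struct assigned_dev {
    unsigned int bdf;  /* PCI bus/device/function of the assigned device */
    bool tainted;      /* set on error, cleared once driver recovery succeeds */
};

struct guest {
    struct assigned_dev *devs;  /* devices passed through to this guest */
    size_t ndevs;
    unsigned long seen_errors;  /* counter value at the last clean check */
};

/* Error callback: mark the device tainted and record the notification. */
static void aer_notify(struct assigned_dev *dev)
{
    assert(dev != NULL);
    dev->tainted = true;
    atomic_fetch_add(&aer_error_count, 1);
}

/*
 * Check made before a CPU enters guest mode.
 * Returns true if one of this guest's devices has hit an error,
 * in which case the caller would stop the guest instead of entering it.
 */
static bool guest_entry_must_stop(struct guest *g)
{
    unsigned long now = atomic_load(&aer_error_count);

    /* Fast path: no new error notifications since the last clean check. */
    if (now == g->seen_errors)
        return false;

    for (size_t i = 0; i < g->ndevs; i++) {
        if (g->devs[i].tainted)
            return true;  /* leave seen_errors stale so later checks also fail */
    }

    /* The errors belonged to other guests or the host; remember that. */
    g->seen_errors = now;
    return false;
}
```

The global counter keeps the common case cheap: a guest only walks its device list when some error has been reported since its last check. Whether stopping the guest is the right reaction at that point is the policy question raised in the reply that follows.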

I'm surprised that the hypervisor would shut down the guest when PCIe
AER kicks in for a pass-through device.  Shouldn't we pass the AER event
into the guest and deal with it there?

The equivalent to this policy on physical hardware would be that the CPU
is reset or the machine is powered down on AER.  That doesn't sound
right.

Stefan


