
Re: [Qemu-devel] [PATCH v5 7/7] pc: add PC_I440FX_COMPAT to disable aerc


From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] [PATCH v5 7/7] pc: add PC_I440FX_COMPAT to disable aercap for vifo device
Date: Wed, 18 Mar 2015 17:44:33 +0100

On Wed, Mar 18, 2015 at 09:45:29AM -0600, Alex Williamson wrote:
> On Wed, 2015-03-18 at 16:02 +0100, Michael S. Tsirkin wrote:
> > On Wed, Mar 18, 2015 at 08:50:54AM -0600, Alex Williamson wrote:
> > > On Wed, 2015-03-18 at 15:36 +0100, Michael S. Tsirkin wrote:
> > > > On Wed, Mar 18, 2015 at 08:15:01AM -0600, Alex Williamson wrote:
> > > > > On Wed, 2015-03-18 at 15:05 +0100, Michael S. Tsirkin wrote:
> > > > > > On Wed, Mar 18, 2015 at 08:02:26AM -0600, Alex Williamson wrote:
> > > > > > > On Wed, 2015-03-18 at 14:23 +0100, Michael S. Tsirkin wrote:
> > > > > > > > typo in subject: vfio, not vifo.
> > > > > > > > 
> > > > > > > > On Thu, Mar 12, 2015 at 06:23:59PM +0800, Chen Fan wrote:
> > > > > > > > > for piix4 chipset, we don't need to expose aer, so introduce
> > > > > > > > > PC_I440FX_COMPAT for all piix4 machines to disable aercap,
> > > > > > > > > and add HW_COMPAT_2_2 to disable aercap for all lower
> > > > > > > > > than 2.3.
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Chen Fan <address@hidden>
> > > > > > > > 
> > > > > > > > Well vfio is never migrated ATM.
> > > > > > > > So why is compat code needed at all?
> > > > > > > 
> > > > > > > It's not for migration, it's to maintain current behavior on existing
> > > > > > > platforms.  If someone gets an uncorrected AER error on q35 machine type
> > > > > > > today, the VM stops.  With this change, AER would be exposed to the
> > > > > > > guest and the guest could handle it.  The compat change therefore
> > > > > > > maintains the stop VM behavior on existing q35 machine types.
> > > > > > 
> > > > > > If stop VM behaviour is useful, expose it to users.
> > > > > > If not, then don't.
> > > > > > I don't see why does it have to be tied to machine types.
> > > > > 
> > > > > Because q35-2.2 machine type will currently do a stop VM on uncorrected
> > > > > AER error.  If we don't tie that to a machine option then q35-2.2 would
> > > > > suddenly start exposing the error to the guest.  That's a fairly
> > > > > significant change in behavior for a static machine type.
> > > > 
> > > > I don't think you can classify it as a behaviour change. VM stop is not
> > > > guest visible behaviour.
> > > 
> > > In one case, an uncorrected AER occurs and the VM is stopped by QEMU.
> > > In the other case, the guest is notified and may attempt corrective
> > > action... or maybe the guest doesn't understand AER and the user is
> > > depending on the previous behavior.  That is absolutely a behavior
> > > change.
> > > 
> > > > Are you worrying about guests misbehaving when they see these errors?
> > > > Then you want this as user-controlled, supported option.
> > > 
> > > Whether the option is user visible is tangential to whether the behavior
> > > of existing machine types should be maintained.  Existing machine types
> > > can impose a different default than current machine types.
> > >
> > > > In other words: we only tie things to machine types when we
> > > > have to. This code gets almost no testing, and is a lot of
> > > > work to test. This one sounds like "just in case" is not a good
> > > > motivation.
> > > 
> > > It seems like an obvious use case for using machine types to maintain
> > > compatibility with previous behavior, which is exactly why we have
> > > machine types.  If we're not going to use it, why do we have it?
> > 
> > We have machine types because of the following issues:
> > - some silent changes confuse guests. For example guest installed with
> >   one machine type might not boot if you try to use it after
> >   changing something, or - in case of windows - throw up warnings.
> > - some changes break migration
> > 
> > Looks like none of these cases.
> > If AER is unsafe, turn it off by default for everyone.
> 
> This is silly, we have the tools, let's use them.

It's a very expensive tool, maintenance-wise. We often don't
have the choice but I'm not going to use this tool by choice
unless we know why we are doing this.

> If a user is running
> a VM that gets a VM stop on AER error one day and they upgrade QEMU and
> restart it, they should get the same behavior, whether a migration is
> involved or not.

You keep saying this, but why should it? Answer that question, the rest
will follow.

> Maybe the default should be disabled, this patch
> series hasn't yet even convinced me that there's a worthwhile general
> case where the guest can recover, but using the existing machine
> compatibility infrastructure should be at our disposal if we do think
> the default going forward should be different than the behavior today.

I'm sorry, I don't think it's the right tool for the job.

-- 
MST


