qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2] pci: relax pci_msi_get_message()


From: Peter Xu
Subject: Re: [Qemu-devel] [PATCH v2] pci: relax pci_msi_get_message()
Date: Fri, 25 Nov 2016 15:05:58 +0800
User-agent: Mutt/1.5.24 (2015-08-30)

On Fri, Nov 25, 2016 at 08:47:26AM +0200, Michael S. Tsirkin wrote:
> On Fri, Nov 25, 2016 at 01:28:36PM +0800, Peter Xu wrote:
> > On Fri, Nov 25, 2016 at 06:11:01AM +0200, Michael S. Tsirkin wrote:
> > > On Tue, Nov 22, 2016 at 04:08:50PM +0800, Peter Xu wrote:
> > > > We are very strict in the past getting MSIs from commit
> > > > d1f6af6a1 ("kvm-irqchip: simplify kvm_irqchip_add_msi_route"), assuming
> > > > that MSI should be configured before hand when fetching. When we have
> > > > unrecognized configurations, we panic the system. However looks like
> > > > this is too strict to be working on some platform, and issues occured.
> > > > Firstly it's found on a ppc case and fixed by David in:
> > > > 
> > > >   6d17a01 vfio/pci: Fix regression in MSI routing configuration
> > > > 
> > > > However we encountered another case now with windows virtio driver and
> > > > reported (and possibly more):
> > > > 
> > > >   http://bugs.debian.org/844361
> > > > 
> > > > To make every driver/hardware happy, let's loosen the rule and go back
> > > > to the original behavior - instead of panic the system, when we try to
> > > > fetch MSI without configured MSI/MSI-X system, we just provide an empty
> > > > message to make drivers happy.
> > > > 
> > > > Reported-by: Maciej KotliƄski <address@hidden>
> > > > Signed-off-by: Peter Xu <address@hidden>
> > > > ---
> > > >  hw/pci/pci.c | 8 +++++---
> > > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > > index 24fae16..0cd2bb0 100644
> > > > --- a/hw/pci/pci.c
> > > > +++ b/hw/pci/pci.c
> > > > @@ -2606,9 +2606,11 @@ MSIMessage pci_get_msi_message(PCIDevice *dev, 
> > > > int vector)
> > > >      } else if (msi_enabled(dev)) {
> > > >          msg = msi_get_message(dev, vector);
> > > >      } else {
> > > > -        /* Should never happen */
> > > > -        error_report("%s: unknown interrupt type", __func__);
> > > > -        abort();
> > > > +        /*
> > > > +         * Device is not configured with MSI/MSI-X yet, let's provide
> > > > +         * an empty message to make all device drivers happy.
> > > > +         */
> > > > +        memset(&msg, 0, sizeof(msg));
> > > >      }
> > > >      return msg;
> > > >  }
> > > 
> > > Looks like a hack to me. Even if callers happen to treat
> > > 0 message as a nop, there's no guarantee that's true for
> > > all platforms. Pls change pci_get_msi_message to
> > > return an error code, and fix callers to check that.
> > 
> > Actually I'd say I still cannot understand why some guests will
> > encounter this issue - the driver just tries to fetch and enable
> > MSI/MSI-X messages without fully activate MSI/MSI-X in general (that's
> > why msi_enabled() and msix_enabled() both returned false).
> > 
> > But indeed some drivers won't work properly after commit d1f6af6a.
> > David fixed some case for ppc and vfio (6d17a01), but problem still
> > exist, like Debian's report (http://bugs.debian.org/844361).
> > 
> > To avoid going into every details of guest drivers, I was thinking
> > maybe we can just "recover" the behavior to the past: old QEMU (before
> > d1f6af6a) allows to fetch an meaningless message (e.g., when
> > msi_get_message() is called, it's possible that MSI is still not
> > setup, but IMHO that message is meaningless). So here we emulate that
> > case, and return a meaningless message. From the Debian bug report, we
> > can see that this does solve the issue (yeah I know this is not even
> > an excuse to agree on this patch, but again I just think this is
> > currently the best way to solve the problem...).
> > 
> > If we change this into return error, then kvm_irqchip_add_msi_route()
> > might fail, and that's a different behavior again, I don't know
> > whether it'll bring more troubles (only if I can test every guest
> > driver with that code, but that's merely impossible). So IMHO the best
> > way to solve the problem is use the current patch and try to keep the
> > old behavior.
> > 
> > I would really appreciate if you (or anyone) has better suggestion.
> > 
> > Thanks,
> > 
> > -- peterx
> 
> You need to figure out the call stack. What does driver do?
> I'm fine with working around it for now but
> 1. maybe we should fix the driver anyway
> 2. we might want to emit an error message
> 
> -- 
> MST

Limin posted another comment here:

  http://bugs.debian.org/844361

We can see whether that bug is finally caused by pci-assign too, and
whether that pci-assign fix can solve the problem.

If that works, it'll be nice, then we won't need this patch at least
for now.

Thanks,

-- peterx



reply via email to

[Prev in Thread] Current Thread [Next in Thread]