[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] 2.0 regression: loadvm assertion with ehci + tablet
From: |
Cole Robinson |
Subject: |
Re: [Qemu-devel] 2.0 regression: loadvm assertion with ehci + tablet |
Date: |
Mon, 31 Mar 2014 14:14:15 -0400 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.3.0 |
On 03/30/2014 06:25 PM, Andreas Färber wrote:
> Hi,
>
> Am 30.03.2014 22:27, schrieb Cole Robinson:
>> With git master, loadvm hits an assert failure if using ehci and usb tablet.
>> Steps to reproduce:
>>
>> $ qemu-img create -f qcow2 foo.qcow2 10G
>> $ ./x86_64-softmmu/qemu-system-x86_64 \
>> -enable-kvm -m 4096 \
>> -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 \
>> -device
>> ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5
>> \
>> -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 \
>> -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 \
>> -device usb-tablet,id=input0 \
>> -hda foo.qcow2 \
>> -cdrom Fedora-20-x86_64-Live-Desktop.iso \
>> -boot d -monitor stdio
>>
>> <wait until guest boot to 'welcome to fedora' window>
>> (qemu) savevm foo
>> (qemu) loadvm foo
>> qemu-system-x86_64: hw/pci/pci.c:250: pcibus_reset: Assertion
>> `bus->irq_count[i] == 0' failed.
>>
>> The relevant backtrace bits for the assertion:
>>
>> #4 0x00007f8f7241971e in pcibus_reset (qbus=0x7f8f74082fd0)
>> at hw/pci/pci.c:250
>> #5 0x00007f8f723bd36d in qbus_reset_one (bus=0x7f8f74082fd0,
>> opaque=<optimized out>) at hw/core/qdev.c:249
>> #6 0x00007f8f723bec88 in qdev_walk_children (dev=0x7f8f73efb320,
>> pre_devfn=0x0, pre_busfn=0x0, post_devfn=0x7f8f723bf4f0 <qdev_reset_one>,
>> post_busfn=0x7f8f723bd320 <qbus_reset_one>, opaque=0x0)
>> at hw/core/qdev.c:403
>> #7 0x00007f8f723bedb8 in qbus_walk_children (bus=0x7f8f740706e0,
>> pre_devfn=0x0, pre_busfn=0x0, post_devfn=0x7f8f723bf4f0 <qdev_reset_one>,
>> post_busfn=0x7f8f723bd320 <qbus_reset_one>, opaque=0x0)
>> at hw/core/qdev.c:369
>> #8 0x00007f8f724f5c5d in qemu_devices_reset () at vl.c:1867
>> #9 qemu_system_reset (address@hidden) at vl.c:1880
>> #10 0x00007f8f7256dba2 in load_vmstate (address@hidden "foo")
>> at /home/crobinso/src/qemu/savevm.c:1098
>>
>> The 'cause' is this:
>>
>> #0 ehci_detach (port=0x555556436968) at hw/usb/hcd-ehci.c:810
>> #1 0x0000555555727b5e in usb_detach (address@hidden)
>> at hw/usb/core.c:49
>> #2 0x0000555555736bf3 in ehci_reset (opaque=0x5555564364d8)
>> at hw/usb/hcd-ehci.c:941
>> #3 0x00005555557e1fcd in qemu_devices_reset () at vl.c:1867
>> #4 qemu_system_reset (address@hidden) at vl.c:1880
>> #5 0x0000555555859f12 in load_vmstate (address@hidden "foo")
>> at /home/crobinso/src/qemu/savevm.c:1098
>>
>> ehci_reset calls usb_detach which sets pcibus->irq_count[3] = 1. pcibus_reset
>> runs and hits the assertion. But I don't understand this stuff enough to
>> determine what's actually wrong here :)
>>
>> I bisected the issue to:
>>
>> commit 31b030d4abc5bea89c2b33b39d3b302836f6b6ee
>> Author: Andreas Färber <address@hidden>
>> Date: Wed Sep 4 01:29:02 2013 +0200
>>
>> cputlb: Change tlb_flush_page() argument to CPUState
>>
>> Signed-off-by: Andreas Färber <address@hidden>
>>
>> ...and then I double checked it since that sounds unrelated. Same result.
>
> You are running into an unrelated migration bug:
> http://git.qemu.org/?p=qemu.git;a=commit;h=c01a71c1a56fa27f43449ff59e5d03b2483658a2
>
> Sorry about that. You'll need to patch -p1 the above commit on top of
> each git-bisect commit to find the actual breakage if the above commit
> is already bad (can't test right now).
>
Indeed, that seemed to be messing up my search, thanks.
So the real culprit is:
commit 9bdbbfc3a04c28dc43af5afffb32066623cb0022
Author: Paolo Bonzini <address@hidden>
Date: Fri Dec 6 17:54:25 2013 +0100
pci: clean up resetting of IRQs
pci_device_reset will deassert the INTX pins, and this will make the
irq_count array all-zeroes. Check that this is the case, and remove
the existing loop which might even unsync irq_count and irq_state.
Which is what adds the assert.
Looking at pci_device_reset, there is an issue:
dev->irq_state = 0;
pci_update_irq_status(dev);
pci_device_deassert_intx(dev);
irq_state is cleared before pci_device_deassert_intx. But tries to clear all
irqs via pci_irq_handler, but that function will exit without taking any
action if the requested irq level matches what we already track in irq_state.
Since irq_state is 0, pci_device_deassert_intx is basically a no-op. Any
interrupts with level=1 will not be cleared, which is the case with the usb
tablet after usb_detach.
This fixes things for me, but I have no idea if it's the proper fix:
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 8f722dd..1912dfb 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -189,9 +189,9 @@ static void pci_do_device_reset(PCIDevice *dev)
{
int r;
+ pci_device_deassert_intx(dev);
dev->irq_state = 0;
pci_update_irq_status(dev);
- pci_device_deassert_intx(dev);
/* Clear all writable bits */
pci_word_test_and_clear_mask(dev->config + PCI_COMMAND,
pci_get_word(dev->wmask + PCI_COMMAND) |
- Cole