


From: Ladi Prosek
Subject: Re: [Qemu-devel] hitting intermittent issue with live migration from qemu-kvm-ev 2.3.0 to qemu-kvm-ev 2.6.0
Date: Tue, 4 Apr 2017 17:07:52 +0200

On Tue, Apr 4, 2017 at 4:28 PM, Chris Friesen
<address@hidden> wrote:
> On 04/04/2017 07:56 AM, Ladi Prosek wrote:
>>
>> On Mon, Apr 3, 2017 at 9:11 PM, Stefan Hajnoczi <address@hidden>
>> wrote:
>>>
>>> On Fri, Mar 31, 2017 at 02:12:36PM -0600, Chris Friesen wrote:
>
>
>>>> Initially we have a bunch of guests running on compute-2 (which is
>>>> running qemu-kvm-ev 2.3.0).  We then started live-migrating them one
>>>> at a time to compute-0 (which is running qemu-kvm-ev 2.6.0).  Three of
>>>> them migrated successfully.  The fourth (which was essentially
>>>> identical in configuration to the first three) failed, as per the
>>>> following logs in /var/log/libvirt/qemu/instance-0000000e.log:
>>>>
>>>>
>>>> 2017-03-29T06:38:37.886940Z qemu-kvm: VQ 2 size 0x80 < last_avail_idx 0x47b - used_idx 0x47c
>>>> 2017-03-29T06:38:37.886974Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:07.0/virtio-balloon'
>>>> 2017-03-29T06:38:37.888684Z qemu-kvm: load of migration failed: Operation not permitted
>>>> 2017-03-29 06:38:37.896+0000: shutting down
>>>>
>>>>
>>>> Does anyone know of an existing bug report covering this issue?  (I
>>>> took a look and didn't see anything obviously related.)
>>>
>>>
>>> This is the virtio-balloon device.  If you remove the device the live
>>> migration should work reliably.
>>>
>>> Alternatively, you can temporarily rmmod virtio_balloon inside the guest
>>> for live migration.  After migration you can modprobe virtio_balloon
>>> again.
>>>
>>> last_avail_idx 0x47b with used_idx 0x47c is an invalid device state.
>>> I've diffed qemu-kvm-ev 2.6.0-27.1 hw/virtio/virtio-balloon.c against
>>> qemu.git/master and do not see an obvious bug.  I also compared
>>> qemu-kvm-ev 2.3.0-31 with qemu-kvm-ev 2.6.0-27.1.
>>
>>
>> The device likely got into the invalid state as part of a previous
>> migration to an unfixed QEMU. I second Stefan's suggestion to
>> temporarily remove the device or unload the driver.
>
>
> I'll give that a try (been busy with a separate issue).
>
> If I have a guest already running, can I unilaterally hot-remove the device
> from the host side or does the guest need to be involved as well?  (I'm just
> trying to figure out how to deal with existing guests.)

Hot-remove should be fine.
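
If it helps, here is a rough sketch of what a host-side hot-remove request could look like at the QMP level. It only builds the JSON messages; it does not connect to a monitor socket. The device id "balloon0" is an assumption (it is the alias libvirt normally gives the memballoon device), and the helper name is made up for illustration, so check the id your guest actually uses before sending anything.

```python
import json


def build_balloon_unplug_commands(device_id="balloon0"):
    """Return the QMP messages for requesting removal of one device.

    device_id is an assumed value; "balloon0" is only the usual
    libvirt alias and may differ on a given guest.
    """
    return [
        # QMP requires capability negotiation before any other command.
        json.dumps({"execute": "qmp_capabilities"}),
        # device_del asks QEMU to unplug the device with this id.
        json.dumps({"execute": "device_del", "arguments": {"id": device_id}}),
    ]


if __name__ == "__main__":
    for message in build_balloon_unplug_commands():
        print(message)
```

QEMU reports completion of the unplug asynchronously with a DEVICE_DELETED event, so a real client would wait for that event before starting the migration.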

> Thanks,
> Chris
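
As an aside on why the first log line above indicates an invalid state: virtio ring indices are free-running 16-bit counters, so the number of in-flight buffers is last_avail_idx - used_idx taken modulo 2^16. The sketch below reproduces that arithmetic with the values from the log. It is an illustration under that assumption, not QEMU's actual code, and the function names are hypothetical.

```python
def vq_inuse(last_avail_idx, used_idx):
    """In-flight buffer count, wrapped to 16 bits like a uint16_t."""
    return (last_avail_idx - used_idx) & 0xFFFF


def vq_state_is_valid(queue_size, last_avail_idx, used_idx):
    """A queue cannot have more buffers in flight than it has entries."""
    return vq_inuse(last_avail_idx, used_idx) <= queue_size


if __name__ == "__main__":
    # Values from the log: VQ 2 size 0x80, last_avail_idx 0x47b, used_idx 0x47c
    print(hex(vq_inuse(0x47B, 0x47C)))             # 0xffff
    print(vq_state_is_valid(0x80, 0x47B, 0x47C))   # False
```

With used_idx one ahead of last_avail_idx, the difference wraps to 0xffff, which is far larger than the queue size of 0x80, so the load is rejected.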


