

From: Halil Pasic
Subject: [Qemu-devel] commit virtio: recalculate vq->inuse after migration might cause last_avail_idx vs. used_idx failure
Date: Wed, 14 Dec 2016 20:12:17 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.5.1

We have a migration problem which, in my opinion, is caused by a
deficiency in how vq->inuse is recalculated after migration (commit
bccdef6b "virtio: recalculate vq->inuse after migration" is to
blame).


We got a bug report with this log from a live migration target:

2016-12-13T18:59:03.647309Z qemu-system-s390x: VQ 1 size 0x100 < last_avail_idx 0x2f76 - used_idx 0x762f
2016-12-13T18:59:03.647385Z qemu-system-s390x: error while loading state for instance 0x0 of device '/fe.0.0001/virtio-net'
2016-12-13T18:59:03.647540Z qemu-system-s390x: load of migration failed: Operation not permitted
2016-12-13 18:59:03.796+0000: shutting down, reason=failed

They use QEMU version 2.7, but looking at the current git master,
I think this has not been fixed in the meantime.

So here goes the argument. The recalculation is done like this:

+            vdev->vq[i].inuse = vdev->vq[i].last_avail_idx -
+                                vdev->vq[i].used_idx;

This does not seem correct when last_avail_idx has already
wrapped around but used_idx has not yet. We see from the log that
last_avail_idx (0x2f76) is less than used_idx (0x762f), thus
inuse (of type int) ends up being negative.
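
To illustrate the arithmetic, here is a minimal standalone sketch (not
QEMU code). It assumes that last_avail_idx and used_idx are uint16_t, as
the 16-bit values in the log suggest, and that int is wider than 16 bits.
The helper names are hypothetical. The second helper shows one possible
way to take the difference modulo 2^16; it is not necessarily what an
actual fix would look like.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the quoted recalculation: both uint16_t
 * operands are promoted to int before the subtraction, so the result is
 * negative whenever last_avail_idx < used_idx. */
static inline int inuse_as_recalculated(uint16_t last_avail_idx,
                                        uint16_t used_idx)
{
    return last_avail_idx - used_idx;
}

/* Hypothetical alternative: truncate the difference back to 16 bits, so
 * the subtraction is effectively done modulo 2^16 and tolerates the case
 * where only last_avail_idx has wrapped around. */
static inline uint16_t inuse_mod_2_16(uint16_t last_avail_idx,
                                      uint16_t used_idx)
{
    return (uint16_t)(last_avail_idx - used_idx);
}
```

With the values from the log, inuse_as_recalculated(0x2f76, 0x762f) yields
-18105. For illustrative values (not taken from the log) where only
last_avail_idx has wrapped, say last_avail_idx 0x0005 and used_idx 0xfffb,
the first helper yields -65526 while the second yields 10.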

+            if (vdev->vq[i].inuse > vdev->vq[i].vring.num) {

Because vdev->vq[i].vring.num is unsigned int, inuse gets converted to
unsigned int per the usual arithmetic conversions ("Otherwise, if the
operand that has unsigned integer type has rank greater or equal to the
rank of the type of the other operand, then the operand with signed
integer type is converted to the type of the operand with unsigned
integer type." C99).

A negative inuse thus becomes a very large unsigned value, the condition
evaluates to true, and we get the error from the log cited above.
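
The effect of the conversion on the comparison above can be reproduced in
isolation. This is only a sketch with a hypothetical helper name, not QEMU
code; the parameter types are chosen to match the types described above
(int for inuse, unsigned int for vring.num).

```c
#include <stdbool.h>

/* Hypothetical helper with the same operand types as the quoted check.
 * Because num is unsigned int, inuse is implicitly converted to unsigned
 * int before the comparison, so any negative inuse compares as a very
 * large value. Compilers may warn about this with -Wsign-compare. */
static inline bool exceeds_ring_size(int inuse, unsigned int num)
{
    return inuse > num;
}
```

exceeds_ring_size(-18105, 0x100) is true, because -18105 is converted to
an unsigned value close to UINT_MAX (0xffffb947 if unsigned int is 32 bits
wide), whereas exceeds_ring_size(10, 0x100) is false.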

+                error_report("VQ %d size 0x%x < last_avail_idx 0x%x - "
+                             "used_idx 0x%x",
+                             i, vdev->vq[i].vring.num,
+                             vdev->vq[i].last_avail_idx,
+                             vdev->vq[i].used_idx);
+                return -1;
+            }

Do we want to try to fix this for 2.8? I already have a small patch prepared.

Regards,
Halil



