From: Denis V. Lunev
Subject: Re: [Qemu-devel] [PATCH 2.5 v5 0/11] dataplane snapshot fixes
Date: Fri, 6 Nov 2015 19:19:33 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0

On 11/06/2015 07:05 PM, Eric Blake wrote:
> On 11/06/2015 08:54 AM, Stefan Hajnoczi wrote:
>> On Wed, Nov 04, 2015 at 08:19:31PM +0300, Denis V. Lunev wrote:
>>> with test
>>>     while /bin/true ; do
>>>         virsh snapshot-create rhel7
>>>         sleep 10
>>>         virsh snapshot-delete rhel7 --current
>>>     done
>>> with enabled iothreads on a running VM leads to a lot of troubles: hangs,
>>> asserts, errors.
>
> That is a case of using libvirt to trigger internal snapshots...
>
>> The HMP monitor is legacy and also not used by modern libvirt.
>
> ...and libvirt is forced to use HMP for internal snapshots, since we
> _still_ haven't exposed internal snapshots as a QMP command.
>
>> I think the affected use cases are restricted to savevm+dataplane and
>> HMP+dataplane.
>
> The fact that the commit message calls out a libvirt method of
> triggering the bug does mean that it is user-visible, and so it would
> qualify as a bug fix even during hard freeze. But I also understand
> that taking a large complex series late in the game is not without risk;
> and it is not like this is a regression (rather, something that has
> never worked bulletproof), right?

Yes, this was not working in the past, so this is not a regression.
The problem is that it seems NOBODY uses iothreads in production, or
even in complex real-life tests. There is another recently merged
example of this (100% reproducible, happens on both migration and
snapshot). We hit it on the suspend operation.

commit 10a06fd65f667a972848ebbbcac11bdba931b544
Author: Pavel Butsykin <address@hidden>
Date:   Mon Oct 26 14:42:57 2015 +0300

    virtio: sync the dataplane vring state to the virtqueue before virtio_save

I initially started this as a set of small changes in the savevm code
and was asked to move the code from savevm.c to the block layer. This
has been done, and yes, the series became complex after that; it was
obvious that it would be complex once the task was to move a bunch of
code from one place to another.

Anyway, from my point of view the series is not that complex. It is
just large, does simple things that are close to copy/paste, and there
is a month to catch bugs here.

Can we still consider this for merge?

Den