Am 18.02.2017 um 11:54 hat Denis V. Lunev geschrieben:
On 02/17/2017 03:54 PM, Vladimir Sementsov-Ogievskiy wrote:
17.02.2017 17:24, Kevin Wolf wrote:
Am 17.02.2017 um 14:48 hat Denis V. Lunev geschrieben:
On 02/17/2017 04:34 PM, Kevin Wolf wrote:
Am 17.02.2017 um 14:22 hat Denis V. Lunev geschrieben:
But for sure this is bad from the downtime point of view.
On migrate you will have to write to the image and re-read
it again on the target. This would be very slow. This will
not help for the migration with non-shared disk too.
That is why we have specifically worked in a migration,
which for a good does not influence downtime at all now.
With a write we are issuing several write requests + sync.
Our measurements shows that bdrv_drain could take around
a second on an averagely loaded conventional system, which
seems unacceptable addition to me.
I'm not arguing against optimising migration, I fully agree with
you. I
just think that we should start with a correct if slow base version
and
then add optimisation to that, instead of starting with a broken base
version and adding to that.
Look, whether you do the expensive I/O on open/close and make that a
slow operation or whether you do it on invalidate_cache/inactivate
doesn't really make a difference in term of slowness because in
general
both operations are called exactly once. But it does make a difference
in terms of correctness.
Once you do the optimisation, of course, you'll skip writing those
bitmaps that you transfer using a different channel, no matter whether
you skip it in bdrv_close() or in bdrv_inactivate().
Kevin
I do not understand this point as in order to optimize this
we will have to create specific code path or option from
the migration code and keep this as an ugly kludge forever.
The point that I don't understand is why it makes any difference for the
follow-up migration series whether the writeout is in bdrv_close() or
bdrv_inactivate(). I don't really see the difference between the two
from a migration POV; both need to be skipped if we transfer the bitmap
using a different channel.
Maybe I would see the reason if I could find the time to look at the
migration patches first, but unfortunately I don't have this time at the
moment.
My point is just that generally we want to have a correctly working qemu
after every single patch, and even more importantly after every series.
As the migration series is separate from this, I don't think it's a good
excuse for doing worse than we could easily do here.
Kevin
With open/close all is already ok - bitmaps will not be saved because
of BDRV_O_INACTIVE and will not be loaded because of IN_USE.
If the contents of so-called persistent bitmaps is lost across
migration and stale after bdrv_invalidate_cache, that's not "all ok" in
my book. It's buggy.
in any case load/store happens outside of VM downtime.
The target is running at the moment of close on source,
the source is running at the moment of open on target.
Which makes the operation in open/close meaningless: open reads data
which may still by changed, and close cannot write to the image any more
because ownership is already given up.
Anyway, all of this isn't really productive. Please, can't you just
answer that simple question that I asked you above: What problems would
you get if you used invalidate_cache/inactivate instead of open/close
for triggering these actions?
If you can bring up some "this would break X", "it would be very hard to
implement Y correctly then" or "in scenario Z, this would have unwanted
effects", then we can figure out what to do. But not putting things in
the proper place just because you don't feel like it isn't a very strong
argument.
Kevin