Re: [Qemu-devel] HEAD is failing virt-test on migration tests


From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] HEAD is failing virt-test on migration tests
Date: Fri, 13 Feb 2015 11:23:21 +0000
User-agent: Mutt/1.5.23 (2014-03-12)

* Alexander Graf (address@hidden) wrote:
> 
> 
> On 13.02.15 10:04, Dr. David Alan Gilbert wrote:
> > * Alexander Graf (address@hidden) wrote:
> >>
> >>
> >> On 13.02.15 01:29, Lucas Meneghel Rodrigues wrote:
> >>> Copying Alex.
> >>>
> >>> OK, after bisecting, this is what I've got:
> >>>
> >>> 8118f0950fc77cce7873002a5021172dd6e040b5 is the first bad commit
> >>> commit 8118f0950fc77cce7873002a5021172dd6e040b5
> >>> Author: Alexander Graf <address@hidden>
> >>> Date:   Thu Jan 22 15:01:39 2015 +0100
> >>>
> >>>     migration: Append JSON description of migration stream
> >>>     
> >>>     One of the annoyances of the current migration format is the fact that
> >>>     it's not self-describing. In fact, it's not properly describing at all.
> >>>     Some code randomly scattered throughout QEMU elaborates roughly how to
> >>>     read and write a stream of bytes.
> >>>     
> >>>     We discussed an idea during KVM Forum 2013 to add a JSON description of
> >>>     the migration protocol itself to the migration stream. This patch
> >>>     adds a section after the VM_END migration end marker that contains
> >>>     description data on what the device sections of the stream are composed of.
> >>>     
> >>>     This approach is backwards compatible with any QEMU version reading the
> >>>     stream, because QEMU just stops reading after the VM_END marker and ignores
> >>>     any data following it.
> >>>     
> >>>     With an additional external program this allows us to decipher the
> >>>     contents of any migration stream and hopefully make migration bugs easier
> >>>     to track down.
> >>>     
> >>>     Signed-off-by: Alexander Graf <address@hidden>
> >>>     Signed-off-by: Amit Shah <address@hidden>
> >>>     Signed-off-by: Juan Quintela <address@hidden>
> >>>
> >>> :040000 040000 e9a8888ac242a61fbd05bbb0daa3e8877970e738 61df81f831bc86b29f65883523ea95abb36f1ec5 M	hw
> >>> :040000 040000 fe0659bed17d86c43657c26622d64fd44a1af037 7092a6b6515a3d0077f68ff2d80dbd74597a244f M	include
> >>> :040000 040000 d90d6f1fe839abf21a45eaba5829d5a6a22abeb1 c2b1dcda197d96657458d699c185e39ae45f3c6c M	migration
> >>> :100644 100644 98895fee81edfbc659fc42d467e930d06b1afa7d 80407662ad3ed860d33a9d35f5c44b1d19c4612b M	savevm.c
> >>> :040000 040000 cf218bc2b841cd51ebe3972635be2cfbb1de9dfa 7aaf3d10ef7f73413b228e854fe6f04317151e46 M	tests
> >>>
> >>> So there you go. I'm going to sleep, if you need any extra help let me know.
> >>
> >> So the major difference with this patch applied is that the sender could
> >> send more data than the receiver wants to read. I can't see the actual
> >> migrate command you used down there.
> >>
> >> I haven't seen this actually being a problem so far, as the receiver
> >> just close()s its file descriptor once it hits VM_EOF. This should only
> >> break senders if they expect they can send more. That said, I think I
> >> only tested offline migration (via exec:), so maybe QEMU is behaving
> >> badly and actually wants to send all data and just fails the migration
> >> without?
> > 
> > Hmm, for such an odd change to the migration stream it's a surprise you
> > didn't test it live.
> 
> Well, let's say I don't remember explicitly testing it live - I probably
> did at one point.
> 
> I just verified that migrating with tcp:... works fine in master.

Yes, that's fair.

My suspicion (for which I have no proof) is that it might depend on the
amount of buffering in the connection; if there's enough buffer to hold
your JSON description it'll work, because you'll have sent the JSON
before the destination has spotted the terminator; if you've
not got much buffering (e.g. on a local fd) then the source might
get stuck trying to write the JSON, or get an error because the
destination has closed the fd.
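
To make the suspicion above concrete, here is a minimal sketch in Python.
It is not QEMU code and does not use the real stream format: a plain pipe
stands in for the migration connection, and DEVICE_DATA, END_MARKER and
simulate() are hypothetical names made up for the illustration. The writer
is non-blocking so that the script reports a failure rather than hanging;
it only models the "destination has closed the fd" case, not a destination
that keeps the fd open and simply stops reading.

```python
import os

# Hypothetical stand-ins, not QEMU's real stream format.
DEVICE_DATA = b"D" * 16   # pretend device state
END_MARKER = b"\x00"      # pretend end-of-stream marker


def simulate(trailer_size):
    """Send device data, an end marker and a trailer of trailer_size bytes
    over a pipe whose reader closes as soon as it has seen the marker."""
    payload = DEVICE_DATA + END_MARKER + b"J" * trailer_size
    rfd, wfd = os.pipe()
    os.set_blocking(wfd, False)  # never hang; report the outcome instead
    try:
        # Source: push as much as the pipe buffer will take right now.
        sent = os.write(wfd, payload)

        # Destination: read up to and including the marker, then close.
        wanted = len(DEVICE_DATA) + len(END_MARKER)
        seen = os.read(rfd, wanted)
        assert seen.endswith(END_MARKER)
        os.close(rfd)
        rfd = -1

        # Source: try to send whatever did not fit in the buffer.
        if sent < len(payload):
            try:
                os.write(wfd, payload[sent:])
            except BrokenPipeError:
                return "broken pipe"
        return "completed"
    finally:
        if rfd != -1:
            os.close(rfd)
        os.close(wfd)


if __name__ == "__main__":
    print("small trailer:", simulate(64))
    print("large trailer:", simulate(4 * 1024 * 1024))
```

With a trailer that fits in the pipe buffer the source has already written
everything before the destination closes, so it never notices. With a
trailer larger than the buffer (4 MiB here, assuming the usual default pipe
size of well under that) the leftover write should fail once the reader is
gone.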

Dave

> 
> 
> Alex
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK


