qemu-devel

From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [RFC PATCH 0/4] Fix subsection ambiguity in the migration format
Date: Tue, 26 Jul 2011 13:51:13 +0100

On Tue, Jul 26, 2011 at 10:48 AM, Stefan Hajnoczi
<address@hidden> wrote:
> On Mon, Jul 25, 2011 at 06:23:17PM -0500, Anthony Liguori wrote:
>> On 07/25/2011 04:10 PM, Paolo Bonzini wrote:
>> >On Thu, Jun 30, 2011 at 17:46, Paolo Bonzini<address@hidden>  wrote:
>> >>With the current migration format, VMS_STRUCTs with subsections
>> >>are ambiguous.  The protocol cannot tell whether a 0x5 byte after
>> >>the VMS_STRUCT is a subsection or part of the parent data stream.
>> >>In the past QEMU assumed it was always a part of a subsection; after
>> >>commit eb60260 (savevm: fix corruption in vmstate_subsection_load(),
>> >>2011-02-03) the choice depends on whether the VMS_STRUCT has subsections
>> >>defined.
>> >>
>> >>Unfortunately, this means that if a destination has no subsections
>> >>defined for the struct, it will happily read subsection data into
>> >>its own fields.  And if you are "lucky" enough to stumble on a
>> >>zero byte at the right time, it will be interpreted as QEMU_VM_EOF
>> >>and migration will be interrupted with half-loaded state.
>> >>
>> >>There is no way out of this except defining an incompatible
>> >>migration protocol.  Not-so-long-term we should really try to define
>> >>one that is not a joke, but the bug is serious so we need a solution
>> >>for 0.15.  A sentinel at the end of embedded structs does remove the
>> >>ambiguity.
>> >>
>> >>Of course, this can be restricted to new machine models, and this
>> >>is what the patch series does.  (And note that only patch 3 is specific
>> >>to the short-term solution, everything else is entirely generic).
>> >>
>> >>Untested beyond compilation.
>> >
>> >I have now tested this series (exactly as sent) both by examining
>> >manually the differences between the two formats on the same guest
>> >state, and by a mix of saves/restores (new on new, 0.14 on new
>> >pc-0.14, new pc-0.14 on 0.14; also the same combinations on RHEL).  It
>> >always does what is expected.
>> >
>> >Michael Tsirkin objected that the format should be passed as a
>> >parameter in the migrate command.  I kind of agree, however since this
>> >is a real bug you would need to bump the default for new machine
>> >types, and this default would still go in the QEMUMachine struct like
>> >I am doing.  So I consider the two settings to be orthogonal.  Also,
>> >the alternative requires changes to the whole management stack and if
>> >the default is not changed it imposes a broken format unless you
>> >update the management tools.  Clearly much less bang for the buck.
>> >
>> >I think this is ready to go into 0.15.
>>
>> I'll take a look for 0.15.
>>
>> >The bug happens when migrating
>> >to 0.14 a pc-0.14 machine created with QEMU 0.15 and which has a
>> >floppy.  The media changed subsection is almost always included, and
>> >this causes problems when migrating to 0.14 which didn't have any
>> >subsection for the floppy device.  While QEMU support for migration to
>> >old version admittedly depends on luck, this isn't true of certain
>> >downstreams :) which would like to have an unambiguous migration
>> >format.
>>
>> So this got me thinking about where we're at with migration and
>> where we need to go.
>>
>> I actually think there might be a reasonable path forward if we
>> attack the problem differently than we have so far.
>>
>> == Today ==
>>
>> Today we only support generating the latest serialization of
>> devices. To increase the probability of the latest version working
>> on older versions of QEMU, we strategically omit fields that we know
>> can safely be omitted with older versions (subsections).  More than
>> likely, migrating new to old won't work.
>>
>> Migrating old to new is more likely to work.  We version each
>> section in order to be able to identify when we're dealing with old.
>>
>> But all of this logic lives in one of two forms.  Either as a
>> savevm/loadvm callback that takes a QEMUFile and writes byte
>> serialization to the stream in an open way (usually big endian) or
>> encoded declaratively in a VMState section.
>>
>> == What we need ==
>>
>> We need to decompose migration into three different problems: 1)
>> serializing device state 2) transforming the device model in order
>> to satisfy forwards and backwards compatibility 3) encoding the
>> serialized device model on the wire.
>>
>> We also need a way to future proof ourselves.
>>
>> == What we can do ==
>>
>> 1) Add migration capabilities to future proof ourselves.  I think
>> the simplest way this would work is to have a
>> 'query-migration-capabilities' command that returned a bitmask of
>> supported migration features.  I think we also introduce a
>> 'set-migration-capabilities' command that can mask some of the
>> supported features.
>>
>> A management tool would query migration features on the source and
>> destination, take the intersection of the two masks, and set that
>> mask on both the source and destination.
>>
>> Lack of support for these commands indicates a mask of zero which is
>> the protocol we offer today.
>
> When the management tool drives negotiation it is possible to do nice
> error reporting (each capability bit has a meaning and detailed
> incompatibility errors can be generated).
>
> However, doing so imposes extra work on management tools - they need to
> understand and drive negotiation.  If QEMU adds a new capability we
> might even need to update management tools!
>
> As a management tool author I would prefer the source and destination to
> work it out amongst themselves so that I just issue the 'migrate'
> command.  Negotiation can be done without the management tool's
> involvement: fail migration if the initial negotiation phase fails.

An advantage I didn't think of was that management tools handling
negotiation makes negotiation out-of-band and the migration protocol
doesn't need to be changed.
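
To make the out-of-band flow concrete, here is a rough Python sketch of what
a management tool might do under Anthony's proposal.  Everything in it is an
assumption for illustration: the capability bit names and values are made up,
and FakeQemu merely stands in for a QEMU instance that answers the proposed
'query-migration-capabilities' and 'set-migration-capabilities' commands,
which do not exist today.

```python
# Hypothetical capability bits; QEMU defines no such bits today.
CAP_STRUCT_SENTINEL = 1 << 0   # assumed: sentinel after embedded structs
CAP_NEW_WIRE_FORMAT = 1 << 1   # assumed: some future wire encoding


class FakeQemu:
    """Stand-in for a QEMU instance driven by a management tool."""

    def __init__(self, supported=None):
        # supported=None models an old QEMU without the capability commands.
        self.supported = supported
        self.active = 0

    def query_migration_capabilities(self):
        if self.supported is None:
            raise NotImplementedError("command not found")
        return self.supported

    def set_migration_capabilities(self, mask):
        if self.supported is None:
            raise NotImplementedError("command not found")
        if mask & ~self.supported:
            raise ValueError("mask contains unsupported capabilities")
        self.active = mask


def query_caps(qemu):
    """Return the capability mask; a missing command means mask zero."""
    try:
        return qemu.query_migration_capabilities()
    except NotImplementedError:
        return 0


def apply_caps(qemu, mask):
    """Set the mask; an old QEMU is acceptable only when the mask is zero."""
    try:
        qemu.set_migration_capabilities(mask)
    except NotImplementedError:
        if mask != 0:
            raise


def negotiate(source, dest):
    """Intersect both masks and set the result on both sides."""
    common = query_caps(source) & query_caps(dest)
    apply_caps(source, common)
    apply_caps(dest, common)
    return common
```

The migration stream itself is never touched here: all the negotiation
happens over the monitor before 'migrate' is issued, and an old QEMU on
either side simply forces the mask to zero, i.e. today's protocol.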

It seems like the migration protocol needs an overhaul sooner or later
anyway, so perhaps it's not worth making the negotiation external.

Stefan


