
Re: [Qemu-devel] [PATCH v3 0/4] Curling: KVM Fault Tolerance


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH v3 0/4] Curling: KVM Fault Tolerance
Date: Thu, 17 Oct 2013 13:50:59 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

On Tue, Oct 15, 2013 at 03:26:19PM +0800, Jules Wang wrote:
> v2 -> v3:
> * add documentation of new option in qapi-schema.
> 
> * long option name: ft -> fault-tolerant
> 
> v1 -> v2:
> * cmdline: migrate curling:tcp:<address>:<port> 
>        ->  migrate -f tcp:<address>:<port>
> 
> * sender: use QEMU_VM_FILE_MAGIC_FT as the header of the migration
>           to indicate this is a ft migration.
> 
> * receiver: look for the signature: 
>             QEMU_VM_EOF_MAGIC + QEMU_VM_FILE_MAGIC_FT(64bit total)
>             which indicates the end of one migration.
> --
> Jules Wang (4):
>   Curling: add doc
>   Curling: cmdline interface.
>   Curling: the sender
>   Curling: the receiver

It would be helpful to clarify the status of Curling in the cover letter
email so reviewers know what to expect.

This series does not address I/O or failover.  I guess you are aware of
the missing topics that I mentioned; here are my thoughts on them:

I/O needs to be held back until the destination host has acknowledged
receiving the last full migration state.  The outside world cannot
witness state changes in the guest until the migration state has been
successfully transferred to the destination host.  Otherwise the guest
may appear to act incorrectly when resuming execution from the last
snapshot.
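
To make the ordering concrete, here is a minimal sketch of that
output-commit rule.  It is not code from this series: the class
OutputCommitBuffer, its methods, and the "epoch" counter are all
hypothetical names, and it assumes the sender learns about each
acknowledgement from the destination.

```python
from collections import deque


class OutputCommitBuffer:
    """Hold guest-visible output until the covering checkpoint is acked.

    Hypothetical illustration only; none of these names exist in QEMU.
    """

    def __init__(self, emit):
        self._emit = emit          # callable that really performs the I/O
        self._pending = deque()    # (epoch, payload) in submission order
        self.epoch = 0             # current checkpoint interval

    def queue(self, payload):
        # Guest output is tagged with the interval in which it was produced.
        self._pending.append((self.epoch, payload))

    def checkpoint_sent(self):
        # State for the current interval has been captured and sent.
        # Return its id and start tagging new output with the next one.
        sent = self.epoch
        self.epoch += 1
        return sent

    def checkpoint_acked(self, epoch):
        # The destination holds the state up to and including `epoch`,
        # so output produced in those intervals may now become visible.
        released = 0
        while self._pending and self._pending[0][0] <= epoch:
            _, payload = self._pending.popleft()
            self._emit(payload)
            released += 1
        return released
```

Output queued after checkpoint_sent() stays buffered until the next
checkpoint is acknowledged, which is the property the paragraph above
asks for.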

The time period used by the FT sender thread determines how much latency
is added to I/O requests.
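
As a rough back-of-the-envelope model (my assumption, not something
measured on these patches): if a checkpoint starts every period_ms,
takes transfer_ms to send, and the acknowledgement takes about rtt_ms
to come back, then the extra delay seen by a held-back I/O request can
be estimated as below.  The function name and parameters are
hypothetical.

```python
def added_latency_bounds_ms(period_ms, transfer_ms, rtt_ms):
    """Estimate (best, worst) extra delay for an I/O request held back
    until the next checkpoint is acknowledged.

    Simplified model: ignores queueing and assumes checkpoints do not
    overlap.
    """
    # Best case: the request is issued just before a checkpoint starts.
    best = transfer_ms + rtt_ms
    # Worst case: the request is issued just after a checkpoint started
    # capturing state, so it waits a full period first.
    worst = period_ms + transfer_ms + rtt_ms
    return best, worst
```

For example, added_latency_bounds_ms(100, 20, 1) gives (21, 121) under
this model, so the period dominates the worst case.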

Failover functionality is missing from these patches.  We cannot simply
start executing on the destination host when the migration connection
ends.  If the guest disk image is located on shared storage, then a
network error that terminates the migration connection can cause
split-brain: will both hosts begin accessing the shared disk?
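
One common way to avoid this is to fence the source before the
destination takes over.  The sketch below only illustrates that
decision; decide_failover and the fence_source callback are
hypothetical, and how fencing is actually done (power fencing, a
storage lease, etc.) is left open.

```python
def decide_failover(connection_lost, fence_source):
    """Return what the destination should do.

    fence_source is a caller-supplied callable that returns True only
    once the source host is known to be unable to touch the shared disk.
    """
    if not connection_lost:
        return "standby"
    try:
        fenced = bool(fence_source())
    except Exception:
        # An error while fencing tells us nothing about the source,
        # so treat it the same as a failed fence.
        fenced = False
    # A lost connection alone does not prove the source is dead.
    return "takeover" if fenced else "hold"
```

With this rule a plain network partition leaves the destination in
"hold" rather than letting both hosts write to the same image.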

What is your plan to address these issues?

Stefan


