Re: [Qemu-devel] [PATCH v3 0/4] Curling: KVM Fault Tolerance

From:	Jules
Subject:	Re: [Qemu-devel] [PATCH v3 0/4] Curling: KVM Fault Tolerance
Date:	Wed, 23 Oct 2013 08:08:55 +0800

> On Tue, Oct 15, 2013 at 03:26:19PM +0800, Jules Wang wrote:
> > v2 -> v3:
> > * add documentation of new option in qapi-schema.
> > 
> > * long option name: ft -> fault-tolerant
> > 
> > v1 -> v2:
> > * cmdline: migrate curling:tcp:<address>:<port> 
> >        ->  migrate -f tcp:<address>:<port>
> > 
> > * sender: use QEMU_VM_FILE_MAGIC_FT as the header of the migration
> >           to indicate this is a ft migration.
> > 
> > * receiver: look for the signature: 
> >             QEMU_VM_EOF_MAGIC + QEMU_VM_FILE_MAGIC_FT(64bit total)
> >             which indicates the end of one migration.
> > --
> > Jules Wang (4):
> >   Curling: add doc
> >   Curling: cmdline interface.
> >   Curling: the sender
> >   Curling: the receiver
> 

First of all, thanks for your superb and spot-on comments.

> It would be helpful to clarify the status of Curling in the cover letter
> email so reviewers know what to expect.

OK, but I'm not sure how to clarify the status. Could you please
give me an example?
> 
> This series does not address I/O or failover.  I guess you are aware of
> the missing topics that I mentioned, here are my thoughts on them:
> 
> I/O needs to be held back until the destination host has acknowledged
> receiving the last full migration state.  The outside world cannot
> witness state changes in the guest until the migration state has been
> successfully transferred to the destination host.  Otherwise the guest
> may appear to act incorrectly when resuming execution from the last
> snapshot.
> 
> The time period used by the FT sender thread determines how much latency
> is added to I/O requests.

Yes, that latency is inevitable.

I guess you mean the following situation: if a message 'hello' is sent
to a chat room server just a few seconds before the failover happens,
the message may be delivered to the other participants twice, or be
lost.

Am I right?
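
To check my understanding of the hold-back rule, here is a minimal sketch in Python (not code from this series). The class name `OutputBuffer` and its methods are hypothetical; it only assumes that outgoing I/O is tagged with the checkpoint interval in which it was produced and is released once the destination acknowledges that checkpoint.

```python
from collections import deque


class OutputBuffer:
    """Hold guest output until the checkpoint covering it is acknowledged.

    Hypothetical illustration only; none of these names exist in QEMU.
    """

    def __init__(self):
        self.epoch = 0          # id of the current checkpoint interval
        self.pending = deque()  # (epoch, packet) pairs not yet released
        self.released = []      # packets the outside world has seen

    def guest_send(self, packet):
        # Output is queued, never sent directly.
        self.pending.append((self.epoch, packet))

    def checkpoint(self):
        # Close the current interval and return its id; the state
        # transfer for that interval would start here.
        done = self.epoch
        self.epoch += 1
        return done

    def ack(self, epoch):
        # The destination confirmed the checkpoint for `epoch`:
        # release every packet produced up to and including it.
        while self.pending and self.pending[0][0] <= epoch:
            self.released.append(self.pending.popleft()[1])
```

With this scheme 'hello' would only leave the host after the state that produced it is safe on the destination, so the added latency is bounded by the checkpoint period plus the transfer time.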

> 
> Failover functionality is missing from these patches.  We cannot simply
> start executing on the destination host when the migration connection
> ends.  If the guest disk image is located on shared storage then
> split-brain occurs when a network error terminates the migration
> connection - will both hosts begin accessing the shared disk?

Yes.
> 

I have a simple way to handle that. In short: a third point, the
gateway.

Both the sender and the receiver check their connectivity to the gateway
every X seconds. Let A and B denote whether the sender and the receiver,
respectively, can reach the gateway.

When the connection between the sender and the receiver is down,
A && B is false.

If A is false, the VM instance at the sender is stopped.
If B is false, the VM instance at the receiver is not started.

a. A false, B false: 0 VMs run
b. A false, B true:  1 VM runs
c. A true,  B false: 1 VM runs
d. A true,  B true:  1 VM runs (normal case)
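
The four cases can be written down as a small decision function. This is only a sketch in Python, not part of the patches. The name `decide` is hypothetical, and it relies on the assumption stated above that A and B are never both true while the sender-receiver connection is down, so the receiver treats "B is true and the connection is lost" as "A is false".

```python
def decide(a: bool, b: bool) -> tuple[bool, bool]:
    """Return (sender_runs, receiver_runs) for gateway reachability A, B.

    Hypothetical helper for illustration; not a QEMU or libvirt API.
    """
    sender_runs = a                # A false: stop the VM at the sender
    receiver_runs = b and not a    # B false: never start at the receiver
    return sender_runs, receiver_runs


if __name__ == "__main__":
    # Print the number of running VMs for cases a-d.
    for name, a, b in [("a", False, False), ("b", False, True),
                       ("c", True, False), ("d", True, True)]:
        print(name, sum(decide(a, b)))
```

In a real deployment the receiver cannot observe A directly, which is where the difficulty with state transitions comes from.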

It becomes complicated once we consider the transitions between
these four states.
  
I suggest adding this feature to libvirt instead of qemu.


> What is your plan to address these issues?
> 
> Stefan
>
