qemu-devel

[Qemu-devel] Re: [PATCH 3/5] QMP: Introduce MIGRATION events


From: Anthony Liguori
Subject: [Qemu-devel] Re: [PATCH 3/5] QMP: Introduce MIGRATION events
Date: Tue, 25 May 2010 11:10:23 -0500
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.5) Gecko/20091209 Fedora/3.0-4.fc12 Lightning/1.0pre Thunderbird/3.0

On 05/25/2010 11:04 AM, Juan Quintela wrote:
Anthony Liguori <address@hidden> wrote:
On 05/25/2010 10:35 AM, Juan Quintela wrote:
The problem here is that libvirt starts the target with -S, and waits to do the
"cont" as soon as possible.  As of now, the only way to do it is to poll
info migrate on the source faster.

Why does it do that??

That sounds like a terrible idea.
Because migration is not reliable, and they don't have a way to issue
cont on only one of the sides :(

I don't know what you mean by reliable.

When the migration completes on the destination, it will start automatically.

The source will not start unless explicitly invoked. If you successfully cancel a migration on the source, it's guaranteed that it won't start on the destination. So the sequence looks like:

src) // decide we want to give up migration
src) migrate_cancel
src) // check migration status
src) cont // if migration cancelled
src) // if migration succeeded, check destination for completion
dst) // if not responsive and not completed in appropriate amount of time, kill guest
src) cont // if killed destination
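
The sequence above can be sketched from the management application's side. This is only an illustration under stated assumptions: `src` and `dst` are hypothetical monitor wrappers, and their methods (`migrate_cancel`, `migration_status`, `cont`, `wait_until_running`, `kill`) and the status strings are invented names for this sketch, not an existing libvirt or QEMU API.

```python
def give_up_migration(src, dst, timeout_s):
    """Abandon a migration and return which side ends up running the guest.

    src and dst are hypothetical monitor wrappers (assumed for this sketch).
    """
    # src) migrate_cancel, then check migration status
    src.migrate_cancel()
    status = src.migration_status()

    if status == "cancelled":
        # The cancel won the race: the destination will not start,
        # so it is safe to resume the guest on the source.
        src.cont()
        return "source"

    if status == "completed":
        # Migration finished before the cancel took effect; the
        # destination starts automatically once it has completed.
        if dst.wait_until_running(timeout_s):
            return "destination"
        # dst) not responsive and not completed in time: kill it,
        # then resume on the source.
        dst.kill()
        src.cont()
        return "source"

    raise ValueError("unexpected migration status: %r" % (status,))
```

The point of the ordering is that `cont` is only ever issued on the source after the destination is known to be either never started or killed, so the guest cannot run on both sides.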

I don't see what the problem is.

Either we make the migration protocol reliable, or the management application has to
decide whether migration succeeded or not.

Reliability has nothing to do with the protocol and everything to do with the presence of the third node.

These new events help them a lot.  But they issue the cont really fast
(before migration ends).  I don't remember why they did that.

If libvirt is launching the destination with -S, it's doing the wrong thing and we ought to make sure the proper fix gets implemented.

danp?

There should be some information about why it failed, no? Preferably
in a QError format.

At this point, we have basically -1 :(

I can add a field with an error number, but we are very bad at the
moment about moving errnos up the stack.

We need a better solution for reporting errors via notifications.
Suggestions?

Notice that what we need now is a way to know if migration ended with
success or in any other way, as soon as possible.

Markus/Luiz?

I think this makes more sense as a MIGRATION_CONNECTED event.  It
probably should carry peer information too.

What kind of peer information?

We have tcp/fd/exec/unix migrations.  Calling it CONNECTED vs STARTED, I
don't care.  But adding information?  Notice that the management
application knows what it did. I can put the:

   "exec: gzip -d < /tmp/foo"

string, but there is not much more that I can put here.

Basically, do we have any useful information in info migrate that we
can include?
(qemu) info migrate
Migration status: active
transferred ram: 874808 kbytes
remaining ram: 227912 kbytes
total ram: 1065344 kbytes
(qemu)

I can't see anything interesting to put here :(
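
For reference, the fields in the `info migrate` output quoted above are easy to extract if a management application wants them alongside an event. The following is a minimal sketch that assumes the exact text layout shown above; the function name and the dictionary keys are made up for illustration.

```python
import re

# Matches lines such as "transferred ram: 874808 kbytes".
_RAM_LINE = re.compile(r"^(transferred|remaining|total) ram: (\d+) kbytes$")

def parse_info_migrate(text):
    """Parse human-monitor 'info migrate' output into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Migration status:"):
            info["status"] = line.split(":", 1)[1].strip()
            continue
        match = _RAM_LINE.match(line)
        if match:
            # Key names are hypothetical, chosen for this sketch.
            info[match.group(1) + "_ram_kbytes"] = int(match.group(2))
    return info
```

Prompt lines such as `(qemu)` are simply skipped, so the monitor output can be passed in unmodified.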

Ugh.

About the CONNECTED/STARTED distinction, I fully agree with danp.  We
just want a STARTED event for migration; CONNECTION should be generated
(or not) for all sockets/char devices.  It doesn't make sense for fd/exec,
for instance.

That makes sense to me.
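
To make the event discussion concrete, here is a rough sketch of how a management application could learn the outcome from asynchronous QMP events instead of polling `info migrate`. QMP events arrive as JSON objects with an "event" name; the specific event names used here (MIGRATION_ENDED, MIGRATION_FAILED, MIGRATION_CANCELED) are assumptions for illustration and are not taken from this thread.

```python
import json

# Hypothetical terminal event names mapped to an outcome (assumed for this sketch).
TERMINAL_EVENTS = {
    "MIGRATION_ENDED": "success",
    "MIGRATION_FAILED": "failed",
    "MIGRATION_CANCELED": "cancelled",
}

def migration_outcome(lines):
    """Return the outcome named by the first terminal migration event.

    lines is any iterable of JSON-encoded QMP messages, one per line.
    Returns None if the stream ends without a terminal event.
    """
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        message = json.loads(raw)
        # Command replies have no "event" key, so .get() yields None for them.
        outcome = TERMINAL_EVENTS.get(message.get("event"))
        if outcome is not None:
            return outcome
    return None
```

A STARTED event and command replies are ignored by this loop; only an event that ends the migration produces a result.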

Regards,

Anthony Liguori

Later, Juan.



