Re: [Qemu-devel] Live migration hangs after migration to remote host


From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] Live migration hangs after migration to remote host
Date: Wed, 29 Jul 2015 09:11:21 +0100
User-agent: Mutt/1.5.23 (2014-03-12)

* Eduardo Otubo (address@hidden) wrote:
> On Tue, Jul 28, 2015 at 04:19:46PM +0100, Dr. David Alan Gilbert wrote:
> > * Eduardo Otubo (address@hidden) wrote:
> > > Hello all,
> > > 
> > > I'm facing a weird behavior on my tests: I am able to live migrate
> > > between two virtual machines on my localhost, but not to another
> > > machine, both using tcp.
> > > 
> > > * I am using the same arguments on the command line;
> > > * Both virtual machines use the same qcow2 file visible through NFS;
> > > * Both machines are in the same subnet;
> > > * Migration is being done from intel to intel;
> > > * Same version of Qemu (github master - f8787f8723);
> > > 
> > > Using all above I am able to live migrate on the same host: between two
> > > vms on local host or between two vms in the remote host; but when
> > > migrating from local to remote, the guest hangs. I still can access its
> > > console via ctrl+alt+2, though, and everything seems to be normal. If I
> > > issue a reboot via console on the remote, the guest gets back to
> > > normal.
> > > 
> > > Am I missing something here?
> > 
> > Just checking, but are you saying that as far as qemu is concerned, the
> > migration is happy, it's just the guest that's hung?
> 
> That's exactly the case. The console (via ctrl+alt+2) is active and
> responding to all commands normally, but the screen (ctrl+alt+1) is
> frozen and I can't interact with it at all.

Are you driving this via libvirt or using qemu monitor directly?
If the latter, can you please get an 'info migrate' from the source
and an 'info status' from the destination at the end of migrate.
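
If typing into the HMP monitor is awkward, the same information can be
collected over QMP. The following is only a rough sketch, assuming each qemu
was started with a QMP socket (for example -qmp unix:PATH,server,nowait); the
socket paths and the helper names qmp_execute and qmp_query are hypothetical.
'query-migrate' and 'query-status' are the QMP counterparts of 'info migrate'
and 'info status'.

```python
import json
import socket


def qmp_execute(chan, command):
    """Send one QMP command and return its 'return' payload.

    Asynchronous events that arrive before the reply are skipped.
    """
    chan.write(json.dumps({"execute": command}) + "\n")
    chan.flush()
    while True:
        line = chan.readline()
        if not line:
            raise ConnectionError("QMP connection closed")
        reply = json.loads(line)
        if "return" in reply:
            return reply["return"]
        if "error" in reply:
            raise RuntimeError(reply["error"].get("desc", "QMP error"))
        # Anything else is an event; keep waiting for the reply.


def qmp_query(sock, commands):
    """Do the QMP handshake on a connected socket, then run each command."""
    chan = sock.makefile("rw", encoding="utf-8", newline="\n")
    json.loads(chan.readline())            # greeting banner sent by qemu
    qmp_execute(chan, "qmp_capabilities")  # leave capability negotiation mode
    return {name: qmp_execute(chan, name) for name in commands}


def connect_unix(path):
    """Connect to a QMP unix socket (the path is whatever was given to -qmp)."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(path)
    return sock


# Example use, with hypothetical socket paths:
#   src = qmp_query(connect_unix("/tmp/qmp-src.sock"), ["query-migrate"])
#   dst = qmp_query(connect_unix("/tmp/qmp-dst.sock"), ["query-status"])
#   print(json.dumps(src, indent=2)); print(json.dumps(dst, indent=2))
```

Run it once against the source and once against the destination after the
migration has finished, and paste both outputs.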

> > Are the host clocks on the two hosts very close (there are lots of
> > weird corner cases with mismatched clocks) - same time zone?
> 
> Yep. Both machines are in the same room and have the clock sync'ed.

OK, good.

> > 
> > Are you using cache=none (given that it's NFS shared)
> 
> I wasn't. But I tried again with cache=none and I got exactly the same
> thing.

OK, and this pair of machines, have you tried both directions - i.e.
going a->b and b->a - do both directions fail?
Is the NFS server one of the two machines?  If it is, and you're using libvirt,
make sure that the directory the disks are on is an NFS mount on both
machines; e.g. don't migrate directly from the NFS export.
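
A quick way to check that on each host is to look up which filesystem the
image's directory lives on. This is just a sketch, assuming Linux and the
usual /proc/mounts layout; filesystem_type and is_nfs are made-up helper
names and the image path is a placeholder.

```python
import os


def filesystem_type(path, mounts_text):
    """Return the filesystem type of the mount that contains 'path'.

    'mounts_text' is the content of /proc/mounts. The longest matching
    mount point wins; for equal mount points the later entry wins, since
    it shadows the earlier one. Mount points containing escaped
    characters (e.g. \\040 for a space) are not decoded here.
    """
    path = os.path.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


def is_nfs(path, mounts_text):
    """True if 'path' sits on an nfs or nfs4 mount."""
    fs_type = filesystem_type(path, mounts_text)
    return fs_type is not None and fs_type.startswith("nfs")


# Example use on a real host (placeholder image path):
#   with open("/proc/mounts") as f:
#       print(is_nfs(os.path.realpath("/mnt/images/guest.qcow2"), f.read()))
```

If this reports False on the machine that is also the NFS server, qemu there
is opening the exported directory directly rather than going through an NFS
mount.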

> Also, I tried with the stable-2.2 branch and got the same behavior. I really
> think it's very unlikely for such an important feature to be broken
> upstream, or on a stable- branch. Most probably I have something wrong
> in my environment.

Yes, the challenge is to find what; and if it's something common
we should try and find a way of spotting it.

> Anyway, I'll keep testing different stable- branches until I find
> something that works for me. I'll keep the mailing list posted.

Could you share the qemu command line so we can see if we can
spot anything?

Dave

> 
> Thanks for the light!
> 
> -- 
> Eduardo Otubo
> ProfitBricks GmbH


--
Dr. David Alan Gilbert / address@hidden / Manchester, UK


