
Re: [Qemu-devel] [PATCH v5 00/12] Dirty bitmaps migration


From: Vladimir Sementsov-Ogievskiy
Subject: Re: [Qemu-devel] [PATCH v5 00/12] Dirty bitmaps migration
Date: Tue, 26 Jan 2016 11:45:57 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.8.0

On 03.06.2015 01:17, John Snow wrote:

On 05/28/2015 04:56 PM, Denis V. Lunev wrote:
On 28/05/15 23:09, John Snow wrote:
On 05/26/2015 10:51 AM, Denis V. Lunev wrote:
On 26/05/15 17:48, Denis V. Lunev wrote:
On 21/05/15 19:44, John Snow wrote:
On 05/21/2015 09:57 AM, Denis V. Lunev wrote:
On 21/05/15 16:51, Vladimir Sementsov-Ogievskiy wrote:
Hi all.

Hmm. There is an interesting suggestion from Denis Lunev (in CC) about
how to drop meta bitmaps and make things easier.

method:

- start migration
- disk and memory are migrated, but not dirty bitmaps
- stop the VM
- create all necessary bitmaps in the destination VM (empty, but with
      the same names, granularities and enabled flag)
- start the destination VM
- the empty bitmaps are tracking now
- start migrating dirty bitmaps and merge them into the corresponding
      bitmaps on the destination
- while bitmaps are migrating, they should be in some kind of
      'inconsistent' state, so we can't start a backup or another
      migration while bitmaps are migrating, but the VM is already
      _running_ on the destination

what do you think about it?

the description is a bit incorrect

- start migration process, perform memory and disk migration
      as usual. VM is still executed at source
- start VM on target. VM on source should be on pause as usual,
      do not finish migration process. Running VM on target "writes"
      normally setting dirty bits as usual
- copy active dirty bitmaps from source to target. This is safe
      as VM on source is not running
- "OR" copied bitmaps with ones running on target
- finish migration process (stop source VM).

Downtime will not be increased due to dirty bitmaps with this
approach, migration process is very simple - plain data copy.
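
A minimal sketch of the "OR" step described above, assuming each dirty bitmap is modelled as a Python integer used as a bit set, with one bit per granularity-sized chunk. The names `DirtyBitmap`, `mark_dirty` and `merge_from` are hypothetical and are not QEMU APIs; the real block layer works differently in detail.

```python
from dataclasses import dataclass


@dataclass
class DirtyBitmap:
    """Toy dirty bitmap: bit i is set when chunk i has been written."""
    name: str
    granularity: int  # bytes covered by one bit
    bits: int = 0

    def mark_dirty(self, offset: int, length: int) -> None:
        """Set the bits for every chunk touched by [offset, offset + length)."""
        if length <= 0:
            return
        first = offset // self.granularity
        last = (offset + length - 1) // self.granularity
        for chunk in range(first, last + 1):
            self.bits |= 1 << chunk

    def merge_from(self, other: "DirtyBitmap") -> None:
        """OR another bitmap in; refuse a mismatched name or granularity."""
        if (self.name, self.granularity) != (other.name, other.granularity):
            raise ValueError("bitmaps do not match")
        self.bits |= other.bits


# Source bitmap: frozen, because the source VM is paused.
source = DirtyBitmap("bitmap0", 65536)
source.mark_dirty(0, 4096)           # touches chunk 0
source.mark_dirty(3 * 65536, 100)    # touches chunk 3

# Target bitmap: starts empty and records writes of the running VM.
target = DirtyBitmap("bitmap0", 65536)
target.mark_dirty(65536, 65536 + 1)  # touches chunks 1 and 2

# Copy and "OR": the target now covers writes from both sides.
target.merge_from(source)
```

Because OR only ever sets bits, the merge can happen at any point after the target VM starts without losing writes recorded on either side.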

Regards,
       Den

I was actually just discussing the live migration approach a little bit
ago with Stefan, trying to decide on the "right" packet format (the only
two patches I haven't ACKed yet are ones in which we need to choose a
send size) and we decided that 1KiB chunk sends would be appropriate for
live migration.
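
To illustrate what fixed-size chunk sends mean, here is a rough sketch that splits a serialized bitmap into 1 KiB pieces tagged with their byte offset and reassembles them on the other side. The actual packet format of the series is not shown in this thread; `split_into_chunks` and `reassemble` are hypothetical helper names and the sample data is made up.

```python
CHUNK_SIZE = 1024  # 1 KiB, the send size mentioned above


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Yield (offset, chunk) pairs that cover data in order."""
    for offset in range(0, len(data), chunk_size):
        yield offset, data[offset:offset + chunk_size]


def reassemble(chunks, total_len: int) -> bytes:
    """Rebuild the original buffer from (offset, chunk) pairs."""
    buf = bytearray(total_len)
    for offset, chunk in chunks:
        buf[offset:offset + len(chunk)] = chunk
    return bytes(buf)


# 2560 bytes of sample data standing in for a serialized bitmap.
bitmap_bytes = bytes(range(256)) * 10
chunks = list(split_into_chunks(bitmap_bytes))
restored = reassemble(chunks, len(bitmap_bytes))
```

Carrying the offset with each chunk means the receiver does not depend on arrival order, which is one reason a small fixed send size is convenient while the bitmap may still be changing.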

I think I'm okay with that method, but obviously the approach outlined
here would also work very well and would avoid meta bitmaps, chunk
sizes, migration tuning, convergence questions, etc.

You'd need to add a new status to the bitmap on the target (maybe
"INCOMPLETE" or "MIGRATING") that prevents it from being used for a
backup operation without preventing it from recording new writes.
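
A small sketch of such a status, assuming a toy bitmap that stores dirty chunk indices in a set. `BitmapStatus`, `TargetBitmap` and the method names are hypothetical and only illustrate the rule "keep recording writes, refuse backups until the merge is complete".

```python
from enum import Enum


class BitmapStatus(Enum):
    ACTIVE = "active"
    MIGRATING = "migrating"  # incoming data has not been fully merged yet


class TargetBitmap:
    """Toy bitmap on the migration target."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.status = BitmapStatus.MIGRATING
        self.dirty_chunks = set()

    def record_write(self, chunk: int) -> None:
        # Recording writes is allowed in every state.
        self.dirty_chunks.add(chunk)

    def merge_incoming(self, chunks) -> None:
        # Union of set bits, i.e. the OR of both bitmaps.
        self.dirty_chunks |= set(chunks)

    def finish_migration(self) -> None:
        self.status = BitmapStatus.ACTIVE

    def start_backup(self):
        # A backup must not use a bitmap that is still incomplete.
        if self.status is not BitmapStatus.ACTIVE:
            raise RuntimeError(f"bitmap {self.name!r} is still migrating")
        return sorted(self.dirty_chunks)


bitmap = TargetBitmap("bitmap0")
bitmap.record_write(5)          # guest write while migration is ongoing
try:
    bitmap.start_backup()
    backup_blocked = False
except RuntimeError:
    backup_blocked = True

bitmap.merge_incoming([1, 2])   # data copied from the source
bitmap.finish_migration()
chunks_to_back_up = bitmap.start_backup()
```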

My only concern is how easy it will be to work this into the migration
workflow.

It would require some sort of "post-migration" tertiary phase, I suppose,
for devices/data that can be transferred after the VM starts -- and I
suspect we'll be the only use of that phase for now.

David, what are your thoughts, here? Would you prefer Vladimir and I
push forward on the live migration approach, or add a new post-hoc
phase? This approach might be simpler on the block layer, but I would be
rather upset if he scrapped his entire series for the second time for
another approach that also didn't get accepted.

--js

hmmm.... It looks like we should proceed with this to fit 2.4 dates.
There is not much interest at the moment. I think that we could
implement this later in 2.5 etc...

Regards,
      Den

oops. I have written something strange. Anyway, I think that for
now we should proceed with this patchset to fit the QEMU 2.4 dates.
The implementation with an additional stage (my proposal) could be
added later, e.g. in 2.5, as I do not see much interest from the
migration gurus.

In this case the review will take a ... lot of time.

Regards,
      Den

That sounds good to me. I think this solution is workable for 2.4, and
we can begin working on a post-migration phase for the future to help
simplify our cases a lot.

I have been out sick much of this week, so apologies for my lack of
fervor in getting this series upstream recently.

--js

no prob :)

Had a chat with Stefan about this approach and apparently that's what
the postcopy migration patches on-list are all about.

Stefan brought up the point of post-hoc reliability: It's possible to
transfer control to the new VM and then lose your link, making migration
completion impossible. Adding a post-copy phase to our existing live
migration is a non-starter, because it unfairly introduces this
unreliability into the existing system.

However, we can make this idea work for migrations started via the
post-copy mechanism, because the entire migration already carries that
known risk of completion failure.

It seems like the likely outcome though is that migrations will be able
to be completed with either mechanism in the future: either up-front
migration or post-copy migration. In that light, it seems we won't be
able to fully rid ourselves of the meta_bitmap idea, making the
post-copy idea here not too useful in culling our complexity, since
we'll have to support the current standard live migration anyway.

So I have reviewed the current set of patches under the assumption that
it seems like the right way to go for 2.4 and beyond.

Thank you!
--js

As far as I know, post-copy migration has now been merged. Has anything changed regarding its reliability? Do we still need the meta-bitmap approach for bitmap migration?

--
Best regards,
Vladimir
* now, @virtuozzo.com instead of @parallels.com. Sorry for this inconvenience.



