Re: [Qemu-devel] [PATCH] migration: flush the bdrv before stopping VM
From: Li, Liang Z
Subject: Re: [Qemu-devel] [PATCH] migration: flush the bdrv before stopping VM
Date: Wed, 18 Mar 2015 03:19:25 +0000

> This needs further review/changes on the block layer.
>
> First, an explanation of why I think this doesn't fix the full problem.
> With this patch, we fix the case where we have a dirty block layer but
> basically nothing dirtying the memory on the guest: we move the 20 seconds
> of block layer flush out of max_downtime and into the period before we
> decide that the amount of dirty memory is small enough to be transferred
> during max_downtime. But it is still going to take 20 seconds to flush the
> block layer, and during those 20 seconds, the amount of memory that can be
> dirtied is HUGE.
It's true.
> I think our options are:
>
> - Tell the block layer at the beginning of migration:
>   Hey, we are migrating, could you please start flushing data now, and
>   don't let the caches grow too much, please, pretty please.
>   (I leave the API to the block layer)
> - Add at that point a new function:
>   bdrvr_flush_all_start()
>   That starts the flushing of pages, and we "hope" that by the time
>   we have migrated all memory, the flushes have also finished (so our last
>   call to block_flush_all() has less work to do)
> - Add another function:
>   int bdrv_flush_all_timeout(int timeout)
>   that returns once the timeout passes at the latest, telling us whether it
>   has flushed all pages or the timeout has passed. So we can go back to the
>   iterative stage if it has taken too long.
>
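To make the third option concrete, here is a minimal sketch of a flush that gives up after a deadline. It is a simulation only: FakeDrive, flush_all_timeout() and the simulated millisecond clock are hypothetical names invented for illustration and are not part of the QEMU block layer.

```c
#include <stdint.h>

/* Hypothetical stand-in for a block device with buffered writes. */
typedef struct {
    int64_t dirty_bytes;   /* data still waiting to be flushed */
    int64_t bytes_per_ms;  /* simulated flush throughput (0 = stalled) */
} FakeDrive;

/*
 * Flush all drives, spending at most timeout_ms of simulated time.
 * Returns 0 if everything was flushed, -1 if the timeout expired first.
 * On -1 the caller could go back to the iterative stage and retry later.
 */
int flush_all_timeout(FakeDrive *drives, int n, int64_t timeout_ms)
{
    int64_t elapsed_ms = 0;

    for (int i = 0; i < n; i++) {
        FakeDrive *d = &drives[i];

        while (d->dirty_bytes > 0) {
            if (elapsed_ms >= timeout_ms) {
                return -1;  /* took too long, partial progress is kept */
            }
            /* Write out one millisecond's worth of data. */
            int64_t chunk = d->bytes_per_ms < d->dirty_bytes
                          ? d->bytes_per_ms : d->dirty_bytes;
            d->dirty_bytes -= chunk;
            elapsed_ms++;
        }
    }
    return 0;
}
```

The point of the sketch is that a timed-out flush still keeps whatever progress it made, so a later call has less work to do.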
> Notice that *normally* bdrv_flush_all() is very fast; the problem is that
> sometimes it gets really, really slow (NFS decides to go slow, TCP drops a
> packet, whatever).
>
> Right now, we don't have an interface to detect those cases and go back to
> the iterative stage.
How about going back to the iterative stage when we detect, after the flush,
that pending_size is not smaller than max_size, like this:

+    /* Flush here to shorten the VM downtime: bdrv_flush_all() is
+     * a time-consuming operation when the guest has been writing
+     * to files. */
+    bdrv_flush_all();
+    pending_size = qemu_savevm_state_pending(s->file, max_size);
+    if (pending_size && pending_size >= max_size) {
+        qemu_mutex_unlock_iothread();
+        continue;
+    }
     ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
     if (ret >= 0) {
         qemu_file_set_rate_limit(s->file, INT64_MAX);

This change is quite simple.
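
To illustrate the control flow, here is a small self-contained simulation of "flush first, re-check the pending size, go back to iterating if it grew too much". It is only a sketch under simplifying assumptions: SimState, sim_flush(), sim_iterate() and sim_migrate() are hypothetical names, the guest dirties memory at a constant rate, and a flush that follows a completed flush is assumed to take 1 ms.

```c
#include <stdint.h>

typedef struct {
    int64_t pending;       /* dirty guest memory not yet sent, in bytes */
    int64_t dirty_per_ms;  /* assumed constant guest dirtying rate */
    int64_t flush_ms;      /* how long the next block flush will take */
} SimState;

/* Stand-in for bdrv_flush_all(): the guest keeps running meanwhile,
 * so memory is dirtied for the whole duration of the flush. */
static void sim_flush(SimState *s)
{
    s->pending += s->dirty_per_ms * s->flush_ms;
    s->flush_ms = 1;  /* assumption: little is left to flush next time */
}

/* One iterative pass: send what is pending while the guest dirties more. */
static void sim_iterate(SimState *s, int64_t bytes_per_ms)
{
    int64_t send_ms = s->pending / bytes_per_ms + 1;
    s->pending = s->dirty_per_ms * send_ms;
}

/* Returns how many times the loop went back to iterating after a flush,
 * or -1 if it did not converge within max_rounds. */
int sim_migrate(SimState *s, int64_t bytes_per_ms, int64_t max_size,
                int max_rounds)
{
    int retries = 0;

    for (int round = 0; round < max_rounds; round++) {
        if (s->pending >= max_size) {
            sim_iterate(s, bytes_per_ms);
            continue;
        }
        sim_flush(s);  /* flush before stopping the VM */
        if (s->pending && s->pending >= max_size) {
            retries++;  /* too much was dirtied during the flush */
            continue;
        }
        return retries;  /* the VM would be stopped here */
    }
    return -1;
}
```

With a 20 second first flush, the simulated loop goes back to the iterative stage once and then completes, because the second flush is short.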
> So, I agree with the diagnosis that there is a problem there, but I think that
> the solution is more complex than this. You helped one workload while making
> a different one worse. I am not sure which of the two compromises is
> better :-(
>
> Does this make sense?
>
> Later, Juan.
>