
Re: [Qemu-devel] [PATCH] migration: optimize the downtime


From: Jay Zhou
Subject: Re: [Qemu-devel] [PATCH] migration: optimize the downtime
Date: Tue, 25 Jul 2017 15:29:40 +0800
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0


On 2017/7/25 0:33, Paolo Bonzini wrote:
On 24/07/2017 17:35, Dr. David Alan Gilbert wrote:
* Jay Zhou (address@hidden) wrote:
Hi Dave,

On 2017/7/21 17:49, Dr. David Alan Gilbert wrote:
* Jay Zhou (address@hidden) wrote:
qemu_savevm_state_cleanup() takes about 300ms in my RAM migration tests
with an 8U24G VM (20G actually occupied). The main cost comes from the
KVM_SET_USER_MEMORY_REGION ioctl when mem.memory_size = 0 in
kvm_set_user_memory_region(). In kmod, the main cost is
kvm_zap_obsolete_pages(), which traverses the active_mmu_pages list to
zap the unsync SPTEs.
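
For reference, a minimal standalone sketch (not QEMU's code; vm_fd and
slot_id are assumed inputs, with vm_fd a KVM VM file descriptor) of the
ioctl in question: passing memory_size = 0 asks KVM to delete the
memslot, and tearing the slot down is what triggers the SPTE zapping in
the kernel.

    #include <linux/kvm.h>
    #include <string.h>
    #include <sys/ioctl.h>

    /* Delete a memslot: KVM treats memory_size == 0 as a deletion
     * request, which forces the kernel to zap the SPTEs backing it. */
    int delete_memslot(int vm_fd, unsigned int slot_id)
    {
        struct kvm_userspace_memory_region mem;

        memset(&mem, 0, sizeof(mem));
        mem.slot = slot_id;
        mem.memory_size = 0;   /* zero size == delete this slot */

        return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &mem);
    }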

Hi Jay,
    Is this actually increasing the real downtime when the guest isn't
running, or is it just the reported time? I see that the s->downtime
value is calculated right after where we currently call
qemu_savevm_state_cleanup.

It actually increased the real downtime; I used the "ping" command to
test. The reason is that the source-side libvirt sends a QMP command to
QEMU to query the migration status, which needs the BQL. Since
qemu_savevm_state_cleanup() runs with the BQL held, QEMU cannot handle
the QMP command until qemu_savevm_state_cleanup() has finished. As a
result, the source-side libvirt delays about 300ms before notifying the
destination-side libvirt to send the "cont" command to start the VM.

I think the value of s->downtime is not accurate enough; maybe we could
move the calculation of end_time to after qemu_savevm_state_cleanup()
has finished.
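
A minimal sketch of that reordering inside migration_thread()
(paraphrased from migration/migration.c; the enable_colo check is kept
as in the current code):

    qemu_mutex_lock_iothread();
    /*
     * Resources allocated by migration are reused by the COLO process,
     * so don't release them in that case.
     */
    if (!enable_colo) {
        qemu_savevm_state_cleanup();
    }
    /* Only now take the timestamp, so s->downtime also accounts for
     * the cleanup work done while the BQL is still held. */
    end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);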

I'm copying in Paolo, Radim and Andrea: is there any way we can make the
teardown of KVM's dirty tracking not take so long? 300ms is a silly long
time for such a small VM.

Xiao Guangrong is working on something vaguely related (but different
and simpler because it's entirely contained within KVM), which is to
make log_sync faster.

The Intel folks working on Clear Containers would also like
MemoryListeners to have better complexity, but that's again separate
from the "zapping" of SPTEs.

Can you tell me which version of libvirt you're using?
I thought the newer ones were supposed to use events so they didn't
have to poll QEMU.

If we move qemu_savevm_state_cleanup(), is it still safe? Are there
things we're supposed to do at that point which go wrong if we defer
them?

I wonder about something like: take a mutex in
memory_global_dirty_log_start, release it in
memory_global_dirty_log_stop, and then make ram_save_cleanup start
a new thread that does the call to memory_global_dirty_log_stop.

I don't like having such a long-lived mutex (it seems like a recipe for
deadlocks with the BQL), plus memory_region_transaction_commit (the
expensive part of memory_global_dirty_log_stop) needs to be under the
BQL itself because it calls MemoryListeners.

Maybe memory_global_dirty_log_stop can delay itself to the next vm_start
if it's called while runstate_running() returns false (which should
always be the case)?

That is logical, but then we would need to put some migration-related
code into cpus.c (in my patch, I put some code into qmp.c). Is there an
elegant way to reduce the coupling?

Thanks,
Jay

It could even be entirely enclosed within memory.c if you do it with a
VMChangeStateHandler.
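
A minimal sketch of what that could look like in memory.c (assuming the
VMChangeStateHandler API of this era; memory_global_dirty_log_do_stop is
an illustrative name for the code that currently does the expensive
transaction commit):

    static VMChangeStateEntry *vmstate_change;

    /* Deferred path: once the guest is actually running again, perform
     * the expensive dirty-log stop and unregister ourselves. */
    static void memory_vm_change_state_handler(void *opaque, int running,
                                               RunState state)
    {
        if (running) {
            memory_global_dirty_log_do_stop();
            qemu_del_vm_change_state_handler(vmstate_change);
            vmstate_change = NULL;
        }
    }

    void memory_global_dirty_log_stop(void)
    {
        if (!runstate_is_running()) {
            /* Guest is stopped (e.g. right after migration completes):
             * postpone memory_region_transaction_commit() to the next
             * vm_start instead of doing it inside the downtime window. */
            if (!vmstate_change) {
                vmstate_change = qemu_add_vm_change_state_handler(
                    memory_vm_change_state_handler, NULL);
            }
            return;
        }
        memory_global_dirty_log_do_stop();
    }

That would keep the deferral entirely inside memory.c, so neither cpus.c
nor qmp.c needs to know about migration cleanup.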

Thanks,

Paolo

Dave

Thanks,
Jay

However, we would need to be a bit careful of anything that needs
cleaning up before the source restarts on failure; I'm not sure of
the semantics of all the current things wired into save_cleanup.

Dave


Signed-off-by: Jay Zhou <address@hidden>
---
   migration/migration.c | 16 +++++++++-------
   qmp.c                 | 10 ++++++++++
   2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index a0db40d..72832be 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1877,6 +1877,15 @@ static void *migration_thread(void *opaque)
           if (qemu_file_get_error(s->to_dst_file)) {
               migrate_set_state(&s->state, current_active_state,
                                 MIGRATION_STATUS_FAILED);
+            /*
+             * Resources allocated by migration will be reused in the
+             * COLO process, so don't release them.
+             */
+            if (!enable_colo) {
+                qemu_mutex_lock_iothread();
+                qemu_savevm_state_cleanup();
+                qemu_mutex_unlock_iothread();
+            }
               trace_migration_thread_file_err();
               break;
           }
@@ -1916,13 +1925,6 @@ static void *migration_thread(void *opaque)
       end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);

       qemu_mutex_lock_iothread();
-    /*
-     * The resource has been allocated by migration will be reused in COLO
-     * process, so don't release them.
-     */
-    if (!enable_colo) {
-        qemu_savevm_state_cleanup();
-    }
       if (s->state == MIGRATION_STATUS_COMPLETED) {
           uint64_t transferred_bytes = qemu_ftell(s->to_dst_file);
           s->total_time = end_time - s->total_time;
diff --git a/qmp.c b/qmp.c
index b86201e..0e68eaa 100644
--- a/qmp.c
+++ b/qmp.c
@@ -37,6 +37,8 @@
   #include "qom/object_interfaces.h"
   #include "hw/mem/pc-dimm.h"
   #include "hw/acpi/acpi_dev_interface.h"
+#include "migration/migration.h"
+#include "migration/savevm.h"

   NameInfo *qmp_query_name(Error **errp)
   {
@@ -200,6 +202,14 @@ void qmp_cont(Error **errp)
       if (runstate_check(RUN_STATE_INMIGRATE)) {
           autostart = 1;
       } else {
+        /*
+         * Delay the cleanup to reduce migration downtime. Resources
+         * allocated by migration will be reused in the COLO process,
+         * so don't release them.
+         */
+        if (runstate_check(RUN_STATE_POSTMIGRATE) && !migrate_colo_enabled()) {
+            qemu_savevm_state_cleanup();
+        }
           vm_start();
       }
   }
--
1.8.3.1


--
Dr. David Alan Gilbert / address@hidden / Manchester, UK

