From: zhanghailiang
Subject: [Qemu-devel] [PATCH COLO-Frame v18 21/34] COLO failover: Shutdown related socket fd when do failover
Date: Wed, 3 Aug 2016 20:25:59 +0800
If the network connection between the primary host and the secondary host
breaks while the COLO/COLO incoming thread is blocked in read()/write() on
the socket fd, the error is not detected until the connection times out,
which can take a long time.
To avoid that, shut down all the related socket file descriptors in the
failover BH to wake up the blocked operation. The corresponding file
descriptors must only be closed after the failover BH has shut them down;
otherwise the BH may shut down a descriptor that has already been re-used.
Signed-off-by: zhanghailiang <address@hidden>
Signed-off-by: Li Zhijian <address@hidden>
Reviewed-by: Dr. David Alan Gilbert <address@hidden>
Cc: Dr. David Alan Gilbert <address@hidden>
---
v17:
- Rename colo_sem to colo_exit_sem.
v13:
- Add Reviewed-by tag
- Use semaphore to notify colo/colo incoming loop that
failover work is finished.
v12:
- Shutdown both QEMUFile's fd though they may use the
same fd. (Dave's suggestion)
v11:
- Only shutdown fd for once
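
Illustration (not part of the patch): a minimal, standalone sketch of the
mechanism the commit message relies on, namely that shutdown() on a socket
wakes a reader blocked in read() on it. Everything here is an assumption made
for the example: the function name reader_woken_by_shutdown() is hypothetical,
an AF_UNIX socketpair stands in for the migration socket, and a forked child
stands in for the blocked COLO thread so that the sketch needs no threading
library. It does not use any QEMU API.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 0 if the blocked reader was woken by shutdown() and saw
 * end-of-file, 1 if read() returned anything else, -1 on setup failure.
 */
int reader_woken_by_shutdown(void)
{
    struct timespec delay = { 0, 100 * 1000 * 1000 }; /* 100 ms */
    int sv[2];
    int status;
    int rc;
    pid_t pid;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {
        /* Reader: nothing is ever written, so read() blocks. */
        char byte;
        ssize_t n = read(sv[0], &byte, 1);
        _exit(n == 0 ? 0 : 1);
    }

    /* Give the reader time to block; the result does not depend on it. */
    nanosleep(&delay, NULL);

    /* Stand-in for the failover BH: shut the socket down, do not close it. */
    rc = shutdown(sv[0], SHUT_RDWR);
    assert(rc == 0);
    (void)rc;

    if (waitpid(pid, &status, 0) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* Only now, after the reader has finished, release the descriptors. */
    close(sv[0]);
    close(sv[1]);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The descriptors are closed only after the reader has returned, which mirrors
the ordering the patch enforces with its semaphores.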
---
include/migration/migration.h | 3 +++
migration/colo.c | 43 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 46 insertions(+)
diff --git a/include/migration/migration.h b/include/migration/migration.h
index 8ae5248..15e01f3 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -113,6 +113,7 @@ struct MigrationIncomingState {
QemuThread colo_incoming_thread;
/* The coroutine we should enter (back) after failover */
Coroutine *migration_incoming_co;
+ QemuSemaphore colo_incoming_sem;
/* See savevm.c */
LoadStateEntry_Head loadvm_handlers;
@@ -183,6 +184,8 @@ struct MigrationState
QSIMPLEQ_HEAD(src_page_requests, MigrationSrcPageRequest)
src_page_requests;
/* The RAMBlock used in the last src_page_request */
RAMBlock *last_req_rb;
+ /* The semaphore is used to notify COLO thread that failover is finished */
+ QemuSemaphore colo_exit_sem;
/* The last error that occurred */
Error *error;
diff --git a/migration/colo.c b/migration/colo.c
index f8fce0d..af1cf8e 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -60,6 +60,18 @@ static void secondary_vm_do_failover(void)
/* recover runstate to normal migration finish state */
autostart = true;
}
+ /*
+ * Make sure the COLO incoming thread is not blocked in recv() or send().
+ * If mis->from_src_file and mis->to_src_file use the same fd, the
+ * second shutdown() will return -1; we ignore this value because
+ * it is harmless.
+ */
+ if (mis->from_src_file) {
+ qemu_file_shutdown(mis->from_src_file);
+ }
+ if (mis->to_src_file) {
+ qemu_file_shutdown(mis->to_src_file);
+ }
old_state = failover_set_state(FAILOVER_STATUS_HANDLING,
FAILOVER_STATUS_COMPLETED);
@@ -68,6 +80,8 @@ static void secondary_vm_do_failover(void)
"secondary VM", old_state);
return;
}
+ /* Notify COLO incoming thread that failover work is finished */
+ qemu_sem_post(&mis->colo_incoming_sem);
/* For Secondary VM, jump to incoming co */
if (mis->migration_incoming_co) {
qemu_coroutine_enter(mis->migration_incoming_co);
@@ -82,6 +96,18 @@ static void primary_vm_do_failover(void)
migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
MIGRATION_STATUS_COMPLETED);
+ /*
+ * Wake up the COLO thread, which may be blocked in recv() or send().
+ * s->rp_state.from_dst_file and s->to_dst_file may use the same fd,
+ * but we still shut the fd down twice; it is harmless.
+ */
+ if (s->to_dst_file) {
+ qemu_file_shutdown(s->to_dst_file);
+ }
+ if (s->rp_state.from_dst_file) {
+ qemu_file_shutdown(s->rp_state.from_dst_file);
+ }
+
old_state = failover_set_state(FAILOVER_STATUS_HANDLING,
FAILOVER_STATUS_COMPLETED);
if (old_state != FAILOVER_STATUS_HANDLING) {
@@ -89,6 +115,8 @@ static void primary_vm_do_failover(void)
old_state);
return;
}
+ /* Notify COLO thread that failover work is finished */
+ qemu_sem_post(&s->colo_exit_sem);
}
void colo_do_failover(MigrationState *s)
@@ -374,6 +402,14 @@ out:
COLO_EXIT_REASON_REQUEST, NULL);
}
+ /* Wait for the failover BH to finish; this should not take long */
+ qemu_sem_wait(&s->colo_exit_sem);
+ qemu_sem_destroy(&s->colo_exit_sem);
+ /*
+ * Must be called after the failover BH has completed;
+ * otherwise the failover BH may shut down the wrong fd, one
+ * re-used by another thread after we release it here.
+ */
if (s->rp_state.from_dst_file) {
qemu_fclose(s->rp_state.from_dst_file);
}
@@ -382,6 +418,7 @@ out:
void migrate_start_colo_process(MigrationState *s)
{
qemu_mutex_unlock_iothread();
+ qemu_sem_init(&s->colo_exit_sem, 0);
migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_COLO);
colo_process_checkpoint(s);
@@ -421,6 +458,8 @@ void *colo_process_incoming_thread(void *opaque)
Error *local_err = NULL;
int ret;
+ qemu_sem_init(&mis->colo_incoming_sem, 0);
+
migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_COLO);
@@ -550,6 +589,10 @@ out:
*/
colo_release_ram_cache();
+ /* Wait for the failover BH to finish; this should not take long */
+ qemu_sem_wait(&mis->colo_incoming_sem);
+ qemu_sem_destroy(&mis->colo_incoming_sem);
+ /* Must be called after the failover BH has completed */
if (mis->to_src_file) {
qemu_fclose(mis->to_src_file);
}
--
1.8.3.1
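
Illustration (not part of the patch): the reason the patch waits on a
semaphore before calling qemu_fclose() is descriptor-number reuse. The sketch
below is a single-threaded approximation under stated assumptions: the name
stale_shutdown_hits_reused_fd() is hypothetical, two AF_UNIX socketpairs stand
in for the migration socket and for an unrelated socket opened by another
thread, and the three steps that would normally race across threads are simply
run in the harmful order. No QEMU API is used.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if a shutdown() issued through a stale descriptor number hit an
 * unrelated socket that had re-used the number, 0 if the number was not
 * re-used, -1 on setup failure.
 */
int stale_shutdown_hits_reused_fd(void)
{
    int a[2], b[2];
    int stale, reused, hit;
    char byte;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, a) < 0) {
        return -1;
    }
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, b) < 0) {
        close(a[0]);
        close(a[1]);
        return -1;
    }

    /* The "COLO thread" closes its socket before failover has run. */
    stale = a[0];
    close(a[0]);

    /* "Another thread" opens a descriptor; the lowest free number is used. */
    reused = dup(b[0]);
    if (reused != stale) {
        if (reused >= 0) {
            close(reused);
        }
        close(a[1]);
        close(b[0]);
        close(b[1]);
        return 0;
    }

    /* The late "failover BH" still holds the old number and uses it. */
    shutdown(stale, SHUT_RDWR);

    /* Socket B was shut down by mistake: its peer now reads end-of-file. */
    hit = (read(b[1], &byte, 1) == 0);

    close(reused);
    close(a[1]);
    close(b[0]);
    close(b[1]);
    return hit;
}
```

Waiting for the failover BH before closing, as the patch does with
colo_exit_sem and colo_incoming_sem, removes the window in which the
number can be handed out again.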