[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Qemu-devel] [PATCH COLO-Frame v12 26/38] COLO failover: Shutdown relate
From: |
zhanghailiang |
Subject: |
[Qemu-devel] [PATCH COLO-Frame v12 26/38] COLO failover: Shutdown related socket fd when do failover |
Date: |
Tue, 15 Dec 2015 16:22:47 +0800 |
If the net connection between COLO's two sides is broken while colo/colo
incoming
thread is blocked in 'read'/'write' socket fd. It will not detect this error
until
connect timeout. It will be a long time.
Here we shutdown all the related socket file descriptors to wake up the blocking
operation in failover BH. Besides, we should close the corresponding file
descriptors
after failvoer BH shutdown them, or there will be an error.
Signed-off-by: zhanghailiang <address@hidden>
Signed-off-by: Li Zhijian <address@hidden>
---
v12:
- Shutdown both QEMUFile's fd though they may use the same fd. (Dave's
suggestion)
v11:
- Only shutdown fd for once
Signed-off-by: zhanghailiang <address@hidden>
---
migration/colo.c | 42 ++++++++++++++++++++++++++++++++++++++++--
1 file changed, 40 insertions(+), 2 deletions(-)
diff --git a/migration/colo.c b/migration/colo.c
index d06c14f..58531e7 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -60,6 +60,18 @@ static void secondary_vm_do_failover(void)
/* recover runstate to normal migration finish state */
autostart = true;
}
+ /*
+ * Make sure colo incoming thread not block in recv or send,
+ * If mis->from_src_file and mis->to_src_file use the same fd,
+ * The second shutdown() will return -1, we ignore this value,
+ * it is harmless.
+ */
+ if (mis->from_src_file) {
+ qemu_file_shutdown(mis->from_src_file);
+ }
+ if (mis->to_src_file) {
+ qemu_file_shutdown(mis->to_src_file);
+ }
old_state = failover_set_state(FAILOVER_STATUS_HANDLING,
FAILOVER_STATUS_COMPLETED);
@@ -82,6 +94,18 @@ static void primary_vm_do_failover(void)
migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
MIGRATION_STATUS_COMPLETED);
+ /*
+ * Make sure colo thread no block in recv or send,
+ * The s->rp_state.from_dst_file and s->to_dst_file may use the
+ * same fd, but we still shutdown the fd for twice, it is harmless.
+ */
+ if (s->to_dst_file) {
+ qemu_file_shutdown(s->to_dst_file);
+ }
+ if (s->rp_state.from_dst_file) {
+ qemu_file_shutdown(s->rp_state.from_dst_file);
+ }
+
old_state = failover_set_state(FAILOVER_STATUS_HANDLING,
FAILOVER_STATUS_COMPLETED);
if (old_state != FAILOVER_STATUS_HANDLING) {
@@ -348,7 +372,7 @@ static void colo_process_checkpoint(MigrationState *s)
}
out:
- if (ret < 0) {
+ if (ret < 0 || (!ret && !failover_request_is_active())) {
error_report("%s: %s", __func__, strerror(-ret));
qapi_event_send_colo_exit(COLO_MODE_PRIMARY, COLO_EXIT_REASON_ERROR,
true, strerror(-ret), NULL);
@@ -360,6 +384,15 @@ out:
qsb_free(buffer);
buffer = NULL;
+ /* Hope this not to be too long to loop here */
+ while (failover_get_state() != FAILOVER_STATUS_COMPLETED) {
+ ;
+ }
+ /*
+ * Must be called after failover BH is completed,
+ * Or the failover BH may shutdown the wrong fd, that
+ * re-used by other thread after we release here.
+ */
if (s->rp_state.from_dst_file) {
qemu_fclose(s->rp_state.from_dst_file);
}
@@ -519,7 +552,7 @@ void *colo_process_incoming_thread(void *opaque)
}
out:
- if (ret < 0) {
+ if (ret < 0 || (!ret && !failover_request_is_active())) {
error_report("colo incoming thread will exit, detect error: %s",
strerror(-ret));
qapi_event_send_colo_exit(COLO_MODE_SECONDARY, COLO_EXIT_REASON_ERROR,
@@ -539,6 +572,11 @@ out:
*/
colo_release_ram_cache();
+ /* Hope this not to be too long to loop here */
+ while (failover_get_state() != FAILOVER_STATUS_COMPLETED) {
+ ;
+ }
+ /* Must be called after failover BH is completed */
if (mis->to_src_file) {
qemu_fclose(mis->to_src_file);
}
--
1.8.3.1
- Re: [Qemu-devel] [PATCH COLO-Frame v12 21/38] COLO failover: Introduce a new command to trigger a failover, (continued)
- [Qemu-devel] [PATCH COLO-Frame v12 22/38] COLO failover: Introduce state to record failover process, zhanghailiang, 2015/12/15
- [Qemu-devel] [PATCH COLO-Frame v12 23/38] COLO: Implement failover work for Primary VM, zhanghailiang, 2015/12/15
- [Qemu-devel] [PATCH COLO-Frame v12 27/38] COLO failover: Don't do failover during loading VM's state, zhanghailiang, 2015/12/15
- [Qemu-devel] [PATCH COLO-Frame v12 18/38] COLO: Flush PVM's cached RAM into SVM's memory, zhanghailiang, 2015/12/15
- [Qemu-devel] [PATCH COLO-Frame v12 26/38] COLO failover: Shutdown related socket fd when do failover,
zhanghailiang <=
- [Qemu-devel] [PATCH COLO-Frame v12 24/38] COLO: Implement failover work for Secondary VM, zhanghailiang, 2015/12/15
- [Qemu-devel] [PATCH COLO-Frame v12 19/38] COLO: Add checkpoint-delay parameter for migrate-set-parameters, zhanghailiang, 2015/12/15
- [Qemu-devel] [PATCH COLO-Frame v12 28/38] COLO: Process shutdown command for VM in COLO state, zhanghailiang, 2015/12/15