Re: [Qemu-devel] [PATCH COLO-Frame v15 00/38] COarse-grain LOck-stepping


From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [PATCH COLO-Frame v15 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT)
Date: Thu, 25 Feb 2016 19:52:33 +0000
User-agent: Mutt/1.5.24 (2015-08-30)

* zhanghailiang (address@hidden) wrote:
> From: root <address@hidden>
> 
> This is the 15th version of COLO (still only supports periodic checkpoints).
> 
> This is only the COLO frame part; you can get the whole code from github:
> https://github.com/coloft/qemu/commits/colo-v2.6-periodic-mode
> 
> There are few changes in this series except for the network-related part.

I was looking at how long the guest is paused during COLO and
was surprised to find that one of the larger chunks is the time taken to
reset the guest before loading each checkpoint.  I've traced it part way;
the biggest contributors for my test VM seem to be:

  3.8ms  pcibus_reset: VGA
  1.8ms  pcibus_reset: virtio-net-pci
  1.5ms  pcibus_reset: virtio-blk-pci
  1.5ms  qemu_devices_reset: piix4_reset
  1.1ms  pcibus_reset: piix3-ide
  1.1ms  pcibus_reset: virtio-rng-pci
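
To put those numbers in context, here is a rough sketch that just adds up
the entries listed above and groups them by reset path.  It only covers the
top contributors quoted here, not the whole reset, and the names in it
(TRACE, parse, totals_by_path) are made up for illustration; nothing in it
is QEMU code.

```python
from collections import defaultdict
from decimal import Decimal

# Figures copied from the list above (milliseconds, one test VM).
TRACE = """\
3.8ms  pcibus_reset: VGA
1.8ms  pcibus_reset: virtio-net-pci
1.5ms  pcibus_reset: virtio-blk-pci
1.5ms  qemu_devices_reset: piix4_reset
1.1ms  pcibus_reset: piix3-ide
1.1ms  pcibus_reset: virtio-rng-pci
"""


def parse(trace):
    """Return a list of (milliseconds, reset_path, device) tuples."""
    rows = []
    for line in trace.splitlines():
        if not line.strip():
            continue
        cost, rest = line.split(None, 1)
        path, device = rest.split(":", 1)
        # Strip the trailing "ms" unit; Decimal keeps the sums exact.
        rows.append((Decimal(cost[:-len("ms")]), path.strip(), device.strip()))
    return rows


def totals_by_path(rows):
    """Sum the listed costs per reset path."""
    totals = defaultdict(Decimal)
    for cost, path, _device in rows:
        totals[path] += cost
    return dict(totals)


rows = parse(TRACE)
total = sum(cost for cost, _path, _device in rows)
per_path = totals_by_path(rows)

print("listed contributors: %sms" % total)
for path, cost in sorted(per_path.items()):
    print("  %-20s %sms" % (path, cost))
```

So the six listed entries alone account for a little under 11ms of pause
per checkpoint on this VM, almost all of it under pcibus_reset.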

I've not looked deeper yet, but some of these are very silly;
I'm running with -nographic, so why it's taking 3.8ms to reset VGA is
going to be interesting.
Also, my only block device is the virtio-blk, so while I understand that
the standard PC machine has the IDE controller, I don't see why it takes
over a ms to reset an unused device.

I guess reset is normally off anyone's radar since it's outside
the time anyone cares about, but perhaps the people trying
to make qemu start really quickly would be interested.

Dave

> 
> Patch status:
> Unreviewed: patch 21,27,28,29,33,38
> Updated: patch 31,34,35,37
> 
> TODO:
> 1. Checkpoint based on proxy in qemu
> 2. The capability of continuous FT
> 3. Optimize the VM's downtime during checkpoint
> 
> v15:
>  - Continue the shutdown process if an error is encountered while sending
>    the shutdown message to SVM. (patch 24)
>  - Rename qemu_need_skip_netfilter to qemu_netfilter_can_skip and Remove
>    some useless comment. (patch 31, Jason)
>  - Call object_new_with_props() directly to add filter in
>    colo_add_buffer_filter. (patch 34, Jason)
>  - Re-implement colo_set_filter_status() based on COLOBufferFilters
>    list. (patch 35)
>  - Re-implement colo_flush_filter_packets() based on COLOBufferFilters
>    list. (patch 37) 
> v14:
>  - Re-implement the network processing based on netfilter (Jason Wang)
>  - Rename 'COLOCommand' to 'COLOMessage'. (Markus's suggestion)
>  - Split two new patches (patch 27/28) from patch 29
>  - Fix some other comments from Dave and Markus.
> 
> v13:
>  - Refactor colo_*_cmd helper functions to use 'Error **errp' parameter
>   instead of return value to indicate success or failure. (patch 10)
>  - Remove the optional error message for COLO_EXIT event. (patch 25)
>  - Use semaphore to notify colo/colo incoming loop that failover work is
>    finished. (patch 26)
>  - Move COLO shutdown related codes to colo.c file. (patch 28)
>  - Fix memory leak bug for colo incoming loop. (new patch 31)
>  - Re-use some existing helper functions to implement the process of
>    saving/loading ram and device. (patch 32)
>  - Fix some other comments from Dave and Markus.
> 
> zhanghailiang (38):
>   configure: Add parameter for configure to enable/disable COLO support
>   migration: Introduce capability 'x-colo' to migration
>   COLO: migrate colo related info to secondary node
>   migration: Integrate COLO checkpoint process into migration
>   migration: Integrate COLO checkpoint process into loadvm
>   COLO/migration: Create a new communication path from destination to
>     source
>   COLO: Implement colo checkpoint protocol
>   COLO: Add a new RunState RUN_STATE_COLO
>   QEMUSizedBuffer: Introduce two help functions for qsb
>   COLO: Save PVM state to secondary side when do checkpoint
>   COLO: Load PVM's dirty pages into SVM's RAM cache temporarily
>   ram/COLO: Record the dirty pages that SVM received
>   COLO: Load VMState into qsb before restore it
>   COLO: Flush PVM's cached RAM into SVM's memory
>   COLO: Add checkpoint-delay parameter for migrate-set-parameters
>   COLO: synchronize PVM's state to SVM periodically
>   COLO failover: Introduce a new command to trigger a failover
>   COLO failover: Introduce state to record failover process
>   COLO: Implement failover work for Primary VM
>   COLO: Implement failover work for Secondary VM
>   qmp event: Add COLO_EXIT event to notify users while exited from COLO
>   COLO failover: Shutdown related socket fd when do failover
>   COLO failover: Don't do failover during loading VM's state
>   COLO: Process shutdown command for VM in COLO state
>   COLO: Update the global runstate after going into colo state
>   savevm: Introduce two helper functions for save/find loadvm_handlers
>     entry
>   migration/savevm: Add new helpers to process the different stages of
>     loadvm
>   migration/savevm: Export two helper functions for savevm process
>   COLO: Separate the process of saving/loading ram and device state
>   COLO: Split qemu_savevm_state_begin out of checkpoint process
>   net/filter: Add a 'status' property for filter object
>   filter-buffer: Accept zero interval
>   net: Add notifier/callback for netdev init
>   COLO/filter: add each netdev a buffer filter
>   COLO: manage the status of buffer filters for PVM
>   filter-buffer: make filter_buffer_flush() public
>   COLO: flush buffered packets in checkpoint process or exit COLO
>   COLO: Add block replication into colo process
> 
>  configure                     |  11 +
>  docs/qmp-events.txt           |  16 +
>  hmp-commands.hx               |  15 +
>  hmp.c                         |  15 +
>  hmp.h                         |   1 +
>  include/exec/ram_addr.h       |   1 +
>  include/migration/colo.h      |  42 ++
>  include/migration/failover.h  |  33 ++
>  include/migration/migration.h |  16 +
>  include/migration/qemu-file.h |   3 +-
>  include/net/filter.h          |   5 +
>  include/net/net.h             |   4 +
>  include/sysemu/sysemu.h       |   9 +
>  migration/Makefile.objs       |   2 +
>  migration/colo-comm.c         |  76 ++++
>  migration/colo-failover.c     |  83 ++++
>  migration/colo.c              | 866 ++++++++++++++++++++++++++++++++++++++++++
>  migration/migration.c         | 109 +++++-
>  migration/qemu-file-buf.c     |  61 +++
>  migration/ram.c               | 175 ++++++++-
>  migration/savevm.c            | 114 ++++--
>  net/filter-buffer.c           |  14 +-
>  net/filter.c                  |  40 ++
>  net/net.c                     |  33 ++
>  qapi-schema.json              | 104 ++++-
>  qapi/event.json               |  15 +
>  qemu-options.hx               |   4 +-
>  qmp-commands.hx               |  23 +-
>  stubs/Makefile.objs           |   1 +
>  stubs/migration-colo.c        |  54 +++
>  trace-events                  |   8 +
>  vl.c                          |  31 +-
>  32 files changed, 1908 insertions(+), 76 deletions(-)
>  create mode 100644 include/migration/colo.h
>  create mode 100644 include/migration/failover.h
>  create mode 100644 migration/colo-comm.c
>  create mode 100644 migration/colo-failover.c
>  create mode 100644 migration/colo.c
>  create mode 100644 stubs/migration-colo.c
> 
> -- 
> 1.8.3.1
> 
> 
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK


