From: zhanghailiang
Subject: [Qemu-devel] [PATCH COLO-Frame v11 00/39] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT)
Date: Tue, 24 Nov 2015 17:25:10 +0800
This is the 11th version of COLO.
As usual, this version of COLO only supports periodic checkpoints,
just like MicroCheckpointing and Remus do.
The 'periodic' mode is based on netfilter, which has been merged.
It uses the 'filter-buffer' to buffer and release packets.
We add a default buffer filter named 'nop' to each netdev.
By default, the 'nop' buffer filter does not buffer any packets,
so it has no side effect on the netdev.
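The behaviour described above can be illustrated with a toy model. This is only a sketch in Python, not the C code in net/filter-buffer.c; the class name, method names and the 'delivered' list are hypothetical and exist only for illustration. It shows a filter that passes packets straight through while disabled (the 'nop' default), holds them while enabled, and flushes them in order on release.

```python
from collections import deque


class BufferFilter:
    """Toy model of a per-netdev buffer filter (hypothetical, not QEMU code)."""

    def __init__(self, name="nop"):
        self.name = name
        self.enabled = False      # default 'nop' state: pass-through
        self.queue = deque()      # packets held while buffering is enabled
        self.delivered = []       # stands in for the netdev's peer

    def receive(self, packet):
        # Hold the packet only when buffering has been switched on.
        if self.enabled:
            self.queue.append(packet)
        else:
            self.delivered.append(packet)

    def set_enabled(self, enabled):
        self.enabled = enabled
        if not enabled:
            # Do not strand packets when buffering is switched off.
            self.release()

    def release(self):
        # Flush held packets in arrival order.
        while self.queue:
            self.delivered.append(self.queue.popleft())


f = BufferFilter()
f.receive("p1")            # default: delivered immediately
f.set_enabled(True)
f.receive("p2")
f.receive("p3")
held = list(f.queue)       # packets waiting for a release
f.release()
print(held, f.delivered)
```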
As usual, this series contains only the COLO frame part; you can get the whole code from GitHub:
https://github.com/coloft/qemu/commits/colo-v2.2-periodic-mode
Test procedure:
1. Start QEMU
Primary side:
#x86_64-softmmu/qemu-system-x86_64 -enable-kvm -boot c -m 2048 -smp 8 \
  -qmp stdio -vnc :7 -name primary -cpu qemu64,+kvmclock \
  -device piix3-usb-uhci -device usb-tablet -netdev tap,id=hn0 \
  -device virtio-net-pci,id=net-pci0,netdev=hn0 \
  -drive if=virtio,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,children.0.file.filename=/mnt/sdd/pure_IMG/linux/redhat/rhel_6.5_64_2U_ide,children.0.driver=raw
Secondary side:
#x86_64-softmmu/qemu-system-x86_64 -enable-kvm -boot c -m 2048 -smp 8 \
  -qmp stdio -vnc :7 -name secondary -cpu qemu64,+kvmclock \
  -device piix3-usb-uhci -device usb-tablet -netdev tap,id=hn0 \
  -device virtio-net-pci,id=net-pci0,netdev=hn0 \
  -drive if=none,id=colo-disk0,file.filename=/dev/null,driver=raw \
  -drive if=virtio,id=active-disk0,throttling.bps-total=70000000,driver=replication,mode=secondary,file.driver=qcow2,file.file.filename=/mnt/ramfs/active_disk.img,file.backing.allow-write-backing-file=on,file.backing.driver=qcow2,file.backing.file.filename=/mnt/ramfs/hidden_disk.img,file.backing.backing.allow-write-backing-file=on,file.backing.backing.file.filename=/mnt/sdd/pure_IMG/linux/redhat/rhel_6.5_64_2U_ide,file.backing.backing.driver=raw,file.backing.backing.node-name=node0 \
  -incoming tcp:0:8888
2. On the Secondary VM's QEMU monitor, issue the following commands
(Note: HMP commands are unstable, so we recommend using QMP commands here)
{'execute':'qmp_capabilities'}
{'execute': 'blockdev-remove-medium', 'arguments': {'device': 'colo-disk0'} }
{'execute': 'blockdev-insert-medium', 'arguments': {'device': 'colo-disk0',
'node-name': 'node0'} }
{'execute': 'nbd-server-start', 'arguments': {'addr': {'type': 'inet', 'data':
{'host': '192.168.2.88', 'port': '8889'} } } }
{'execute': 'nbd-server-add', 'arguments': {'device': 'colo-disk0', 'writable':
true } }
{'execute': 'trace-event-set-state', 'arguments': {'name': 'colo*', 'enable':
true} }
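For scripted test runs, the secondary-side sequence above can be generated rather than typed. The following is a minimal sketch, assuming the device and node names from the command line in step 1; the function name and its parameters are hypothetical helpers, not part of QEMU. It only builds the command objects and prints them as JSON lines, which could then be fed to the '-qmp stdio' monitor.

```python
import json


def secondary_qmp_commands(nbd_host, nbd_port,
                           device="colo-disk0", node_name="node0"):
    """Build the secondary-side QMP commands from step 2, in order.

    Hypothetical helper: the command names and arguments are copied from
    the test procedure; only the host/port/device values are parameters.
    """
    return [
        {"execute": "qmp_capabilities"},
        {"execute": "blockdev-remove-medium",
         "arguments": {"device": device}},
        {"execute": "blockdev-insert-medium",
         "arguments": {"device": device, "node-name": node_name}},
        {"execute": "nbd-server-start",
         "arguments": {"addr": {"type": "inet",
                                "data": {"host": nbd_host,
                                         # the port is sent as a string
                                         "port": str(nbd_port)}}}},
        {"execute": "nbd-server-add",
         "arguments": {"device": device, "writable": True}},
        {"execute": "trace-event-set-state",
         "arguments": {"name": "colo*", "enable": True}},
    ]


cmds = secondary_qmp_commands("192.168.2.88", 8889)
for cmd in cmds:
    print(json.dumps(cmd))
```

The order matters: the medium is swapped to 'node0' before the NBD server exports 'colo-disk0'.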
3. On the Primary VM's QEMU monitor, issue the following commands:
{'execute':'qmp_capabilities'}
{'execute': 'human-monitor-command', 'arguments': {'command-line': 'drive_add buddy driver=replication,mode=primary,file.driver=nbd,file.host=9.61.1.7,file.port=8889,file.export=colo-disk0,node-name=node0,if=none'}}
{'execute':'x-blockdev-change', 'arguments':{'parent': 'colo-disk0', 'node':
'node0' } }
{'execute': 'migrate-set-capabilities', 'arguments': {'capabilities': [
{'capability': 'x-colo', 'state': true } ] } }
{'execute': 'migrate', 'arguments': {'uri': 'tcp:192.168.2.88:8888' } }
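The primary-side sequence can be generated in the same way. This is again only a sketch: the function and parameter names are hypothetical, and the commands themselves are copied from the steps above. Building the 'drive_add' string in code keeps it on one line, which matters because a line break inside the quoted command line would make the JSON invalid. Note that 'file.host' and 'file.port' must point at an address where the secondary's NBD server is reachable.

```python
import json


def primary_qmp_commands(nbd_host, nbd_port, migrate_uri,
                         export="colo-disk0", parent="colo-disk0",
                         node_name="node0"):
    """Build the primary-side QMP commands from step 3, in order.

    Hypothetical helper: only the addresses and names are parameters.
    """
    drive_add = (
        "drive_add buddy driver=replication,mode=primary,"
        f"file.driver=nbd,file.host={nbd_host},file.port={nbd_port},"
        f"file.export={export},node-name={node_name},if=none"
    )
    return [
        {"execute": "qmp_capabilities"},
        {"execute": "human-monitor-command",
         "arguments": {"command-line": drive_add}},
        {"execute": "x-blockdev-change",
         "arguments": {"parent": parent, "node": node_name}},
        {"execute": "migrate-set-capabilities",
         "arguments": {"capabilities": [
             {"capability": "x-colo", "state": True}]}},
        # Starting the migration is the last step; with x-colo enabled
        # it enters the COLO checkpoint process.
        {"execute": "migrate", "arguments": {"uri": migrate_uri}},
    ]


primary_cmds = primary_qmp_commands("9.61.1.7", 8889,
                                    "tcp:192.168.2.88:8888")
for cmd in primary_cmds:
    print(json.dumps(cmd))
```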
4. After the above steps, whenever you make changes to the PVM, the SVM
will be synced.
To change the checkpoint period, issue the command
'{ "execute": "migrate-set-parameters" , "arguments":{
"checkpoint-delay": 2000 } }'.
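To get a feel for what the checkpoint period costs, here is a deliberately simplified model. It assumes that 'checkpoint-delay' is in milliseconds, that checkpoints fire at exact multiples of the delay, that a checkpoint takes no time, and that, as in Remus-style periodic checkpointing, outbound packets are held until the next checkpoint. None of these assumptions is stated in this cover letter, and the function names are hypothetical.

```python
import json


def checkpoint_delay_command(delay_ms):
    """Serialize the migrate-set-parameters command shown in step 4."""
    if not isinstance(delay_ms, int) or delay_ms <= 0:
        raise ValueError("checkpoint delay must be a positive integer")
    return json.dumps({"execute": "migrate-set-parameters",
                       "arguments": {"checkpoint-delay": delay_ms}})


def release_time_ms(send_time_ms, delay_ms):
    """Time of the first checkpoint strictly after the packet was sent.

    Simplified model: checkpoints at delay_ms, 2*delay_ms, ... and
    zero checkpoint duration.
    """
    if delay_ms <= 0:
        raise ValueError("checkpoint delay must be positive")
    return (send_time_ms // delay_ms + 1) * delay_ms


def added_latency_ms(send_time_ms, delay_ms):
    """Extra delay a buffered packet sees under the model above."""
    return release_time_ms(send_time_ms, delay_ms) - send_time_ms


print(checkpoint_delay_command(2000))
for t in (0, 500, 1999, 2000):
    print(t, release_time_ms(t, 2000), added_latency_ms(t, 2000))
```

Under this model a packet waits at most one full period, so a shorter delay lowers network latency at the price of more frequent state synchronization.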
5. Failover test
You can kill the Primary VM and, at the same time, run 'x_colo_lost_heartbeat'
in the Secondary VM's monitor; the SVM will then fail over and the client will
not notice the change.
The QMP command is '{ "execute": "x-colo-lost-heartbeat" }'.
COLO is a totally new feature which is still at an early stage;
your comments and feedback are warmly welcome.
TODO:
1. Implement a packet compare module (proxy) in QEMU
2. Checkpoint based on the proxy in QEMU
3. The capability of continuous FT
v11:
- Re-implement buffer/release packets based on filter-buffer according
to Jason Wang's suggestion. (patch 34, patch 36 ~ patch 38)
- Rebase on master to re-use some code introduced by post-copy.
- Address several comments from Eric and Dave; the change log can
be found in each patch.
v10:
- Rename 'colo_lost_heartbeat' command to experimental 'x_colo_lost_heartbeat'
- Rename migration capability 'colo' to 'x-colo' (Eric's suggestion)
- Simplify the process of primary side by dropping colo thread and reusing
migration thread. (Dave's suggestion)
- Add several netfilter related APIs to support buffer/release packets
for COLO (patch 32 ~ patch 36)
zhanghailiang (39):
configure: Add parameter for configure to enable/disable COLO support
migration: Introduce capability 'x-colo' to migration
COLO: migrate colo related info to secondary node
migration: Export migrate_set_state()
migration: Add state records for migration incoming
migration: Integrate COLO checkpoint process into migration
migration: Integrate COLO checkpoint process into loadvm
migration: Rename the 'file' member of MigrationState
COLO/migration: Create a new communication path from destination to
source
COLO: Implement colo checkpoint protocol
COLO: Add a new RunState RUN_STATE_COLO
QEMUSizedBuffer: Introduce two help functions for qsb
COLO: Save PVM state to secondary side when do checkpoint
ram: Split host_from_stream_offset() into two helper functions
COLO: Load PVM's dirty pages into SVM's RAM cache temporarily
ram/COLO: Record the dirty pages that SVM received
COLO: Load VMState into qsb before restore it
COLO: Flush PVM's cached RAM into SVM's memory
COLO: Add checkpoint-delay parameter for migrate-set-parameters
COLO: synchronize PVM's state to SVM periodically
COLO failover: Introduce a new command to trigger a failover
COLO failover: Introduce state to record failover process
COLO: Implement failover work for Primary VM
COLO: Implement failover work for Secondary VM
COLO: implement default failover treatment
qmp event: Add event notification for COLO error
COLO failover: Shutdown related socket fd when do failover
COLO failover: Don't do failover during loading VM's state
COLO: Process shutdown command for VM in COLO state
COLO: Update the global runstate after going into colo state
savevm: Split load vm state function qemu_loadvm_state
COLO: Separate the process of saving/loading ram and device state
COLO: Split qemu_savevm_state_begin out of checkpoint process
net/filter-buffer: Add default filter-buffer for each netdev
filter-buffer: Accept zero interval
filter-buffer: Introduce a helper function to enable/disable default
filter
filter-buffer: Introduce a helper function to release packets
colo: Use default buffer-filter to buffer and release packets
COLO: Add block replication into colo process
configure | 11 +
docs/qmp-events.txt | 17 +
hmp-commands.hx | 15 +
hmp.c | 15 +
hmp.h | 1 +
include/exec/ram_addr.h | 1 +
include/migration/colo.h | 38 +++
include/migration/failover.h | 33 ++
include/migration/migration.h | 18 +-
include/migration/qemu-file.h | 3 +-
include/net/filter.h | 5 +
include/net/net.h | 4 +
include/sysemu/sysemu.h | 9 +
migration/Makefile.objs | 2 +
migration/colo-comm.c | 71 ++++
migration/colo-failover.c | 83 +++++
migration/colo.c | 773 ++++++++++++++++++++++++++++++++++++++++++
migration/exec.c | 4 +-
migration/fd.c | 4 +-
migration/migration.c | 215 ++++++++----
migration/postcopy-ram.c | 6 +-
migration/qemu-file-buf.c | 61 ++++
migration/ram.c | 195 ++++++++++-
migration/savevm.c | 295 ++++++++++++----
migration/tcp.c | 4 +-
migration/unix.c | 4 +-
net/filter-buffer.c | 123 ++++++-
net/net.c | 37 ++
qapi-schema.json | 110 +++++-
qapi/event.json | 17 +
qmp-commands.hx | 22 +-
stubs/Makefile.objs | 1 +
stubs/migration-colo.c | 45 +++
trace-events | 9 +
vl.c | 37 +-
35 files changed, 2105 insertions(+), 183 deletions(-)
create mode 100644 include/migration/colo.h
create mode 100644 include/migration/failover.h
create mode 100644 migration/colo-comm.c
create mode 100644 migration/colo-failover.c
create mode 100644 migration/colo.c
create mode 100644 stubs/migration-colo.c
--
1.8.3.1