Re: [Qemu-devel] [RFC 00/13] Live memory snapshot based on userfaultfd


From: Hailiang Zhang
Subject: Re: [Qemu-devel] [RFC 00/13] Live memory snapshot based on userfaultfd
Date: Thu, 14 Jul 2016 18:24:10 +0800
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1

On 2016/7/14 2:02, Dr. David Alan Gilbert wrote:
* zhanghailiang (address@hidden) wrote:
For now, we still do not support live memory snapshots. We discussed
a scheme based on userfaultfd a long time ago.
You can find the discussion at the following link:
https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg01779.html

The scheme is based on userfaultfd's write-protect capability.
The userfaultfd write protection feature is available here:
http://www.spinics.net/lists/linux-mm/msg97422.html

I've (finally!) had a brief look through this, I like the idea.
I've not bothered with minor cleanup like comments on them;
I'm sure those will happen later; some larger scale things to think
about are:
   a) I wonder if it's really best to put that much code into the postcopy
      function; it might be but I can see other userfault uses as well.

Yes, it would be better to extract the common code into public functions.

   b) I worry a bit about the size of the copies you create during setup
      and I don't really understand why you can't start sending those pages

Because we save the device state and RAM in the same snapshot thread. If saving
the device state blocks on a write to a protected page, we can remove the
write protection in the 'postcopy/fault' thread, but can't send the page immediately.


      immediately - but then I worry about the relative order of when page
      data should be sent compared to the state of the devices' view of RAM.
   c) Have you considered also using userfault for loading the snapshot - I
     know there was someone on #qemu a while ago who was talking about using
     it as a way to quickly reload from a migration image.


I hadn't noticed that discussion before; maybe I missed it.
Could you please send me the link?

But I have considered the scenario of quickly restoring a snapshot.
The difficulty here is how to quickly find the position of a
specific page. That is, when the VM accesses a page, we need to
find its position in the snapshot file and read it into memory.
For compatibility, we hope we can still reuse all the migration
capabilities.

My rough idea for this scenario is:
1. Use an array to record the starting position of each of the VM's pages.
Use the page offset as the index into the array, just like the migration bitmaps.
2. Save the contents of the array into another file in a special format.
3. Also record the position of the device state data in the snapshot file.
(Or we can put the device state data at the head of the snapshot file.)
4. When restoring the snapshot, reload the array first, and then read
the device state.
5. Set all pages to MISS status.
6. Resume the VM.
7. From then on, the process is like postcopy incoming.
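
To make the array idea in steps 1, 2 and 4 concrete, here is a minimal Python sketch. It is not QEMU code: the 4096-byte page size, the index file layout (a page count followed by 64-bit offsets) and the function names are all assumptions made for illustration, and the real migration stream format is not modelled.

```python
import io
import struct

PAGE_SIZE = 4096  # assumed page size for this sketch


def write_snapshot(pages, ram_file, index_file):
    """Write pages in whatever order they arrive and record their positions.

    `pages` is a non-empty iterable of (page_number, data) pairs.
    """
    pages = list(pages)
    num_pages = max(number for number, _ in pages) + 1
    index = [-1] * num_pages  # -1 marks a page that is not in the file
    for page_number, data in pages:
        assert len(data) == PAGE_SIZE
        index[page_number] = ram_file.tell()  # step 1: record start position
        ram_file.write(data)
    # Step 2: save the array to a separate file (hypothetical layout).
    index_file.write(struct.pack("<Q", num_pages))
    index_file.write(struct.pack(f"<{num_pages}q", *index))


def load_index(index_file):
    """Step 4: reload the array before the VM is resumed."""
    (num_pages,) = struct.unpack("<Q", index_file.read(8))
    return list(struct.unpack(f"<{num_pages}q", index_file.read(8 * num_pages)))


def read_page_on_fault(ram_file, index, page_number):
    """Look up a faulting page's position and read its content."""
    position = index[page_number]
    if position < 0:
        raise KeyError(f"page {page_number} is not in the snapshot")
    ram_file.seek(position)
    return ram_file.read(PAGE_SIZE)
```

With `io.BytesIO()` objects standing in for the two files, pages can be written in any order and still be fetched by page number in a single lookup, which is the property the restore path needs when a missing-page fault arrives.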

I'm not sure if this scenario is practicable or not. We need further
discussion. :)

Hailiang

Dave


The process of this live memory snapshot scheme is as follows:
1. Pause the VM.
2. Enable write-protect fault notification by using userfaultfd to
    mark the VM's memory as write-protected (read-only).
3. Save the VM's static state (here, the device state) to the snapshot file.
4. Resume the VM; the VM continues to run.
5. The snapshot thread begins to save the VM's live state (here, RAM) into
    the snapshot file.
6. During this time, every write to the VM's memory is blocked by the
   kernel, which wakes up the fault-handling thread in QEMU to process
   the write-protect fault. The fault-handling thread passes the
   page's address to the snapshot thread.
7. The snapshot thread gets this address, saves the page into the snapshot
    file, and then removes the write protection using the userfaultfd API;
    after that, the blocked write resumes.
8. Repeat steps 5-7 until all of the VM's memory is saved to the snapshot file.
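
The ordering guarantee in steps 5-7 can be illustrated with a small single-threaded Python model. This is only a sketch under stated assumptions: it uses no real userfaultfd calls and no threads, and the class and method names are hypothetical. It shows only that saving a page before its first write is allowed to proceed yields a snapshot of memory as it was at the pause.

```python
class SnapshotSimulation:
    """Toy model of the write-protect snapshot scheme (no real userfaultfd)."""

    def __init__(self, memory):
        self.memory = memory      # list of page contents, one entry per page
        self.protected = set()    # page numbers that are still write-protected
        self.snapshot = {}        # page number -> content saved to the snapshot

    def start(self):
        # Steps 1-4: with the VM paused, write-protect every page, then resume.
        self.protected = set(range(len(self.memory)))

    def _save_and_unprotect(self, page):
        # Step 7: save the page first, then remove its write protection.
        self.snapshot[page] = self.memory[page]
        self.protected.discard(page)

    def guest_write(self, page, data):
        # Step 6: a write to a protected page faults and waits until the
        # old content has been saved; only then does the write go ahead.
        if page in self.protected:
            self._save_and_unprotect(page)
        self.memory[page] = data

    def background_save_one(self):
        # Step 5: the snapshot thread saves the next still-protected page.
        if not self.protected:
            return False
        self._save_and_unprotect(min(self.protected))
        return True
```

Interleaving `guest_write()` calls with `background_save_one()` in any order leaves `snapshot` equal to the memory contents at the time `start()` was called, while `memory` itself reflects the later guest writes.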

Compared with the 'migrate VM's state to file' feature, the main
difference is that live memory snapshot captures the VM's state with
little delay. It captures the VM's state at the moment the user issues
the snapshot command, just like taking a photo of the VM's state.

For now, we only support the TCG accelerator, since userfaultfd does not
support tracking write faults for KVM.

Usage:
1. Take a snapshot
# x86_64-softmmu/qemu-system-x86_64 -machine pc-i440fx-2.5,accel=tcg,usb=off \
    -drive file=/mnt/windows/win7_install.qcow2.bak,if=none,id=drive-ide0-0-1,format=qcow2,cache=none \
    -device ide-hd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 -vnc :7 -m 8192 -smp 1 \
    -netdev tap,id=bn0 -device virtio-net-pci,id=net-pci0,netdev=bn0 \
    --monitor stdio
Issue snapshot command:
(qemu) migrate -d file:/home/Snapshot
2. Revert to the snapshot
# x86_64-softmmu/qemu-system-x86_64 -machine pc-i440fx-2.5,accel=tcg,usb=off \
    -drive file=/mnt/windows/win7_install.qcow2.bak,if=none,id=drive-ide0-0-1,format=qcow2,cache=none \
    -device ide-hd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 -vnc :7 -m 8192 -smp 1 \
    -netdev tap,id=bn0 -device virtio-net-pci,id=net-pci0,netdev=bn0 \
    --monitor stdio -incoming file:/home/Snapshot

NOTE:
The userfaultfd write protection feature does not support THP for now.
Before taking a snapshot, please disable THP with:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
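
As a convenience, the current THP mode can be checked before taking a snapshot. The Python sketch below relies on the sysfs file listing all modes with the active one in square brackets (for example `always madvise [never]`). The helper names are hypothetical and this is not part of the patch series.

```python
THP_ENABLED_PATH = "/sys/kernel/mm/transparent_hugepage/enabled"


def parse_thp_setting(text):
    """Return the active mode from text such as 'always madvise [never]'."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    raise ValueError(f"no active mode found in {text!r}")


def thp_is_disabled(path=THP_ENABLED_PATH):
    """Read the sysfs file and report whether THP is set to 'never'."""
    with open(path) as f:
        return parse_thp_setting(f.read()) == "never"
```

Calling `thp_is_disabled()` on the host before issuing the snapshot command gives an early warning if the `echo never` step was skipped.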

TODO:
- Reduce the impact on the VM while taking a snapshot

zhanghailiang (13):
   postcopy/migration: Split fault related state into struct
     UserfaultState
   migration: Allow the migrate command to work on file: urls
   migration: Allow -incoming to work on file: urls
   migration: Create a snapshot thread to realize saving memory snapshot
   migration: implement initialization work for snapshot
   QEMUSizedBuffer: Introduce two help functions for qsb
   savevm: Split qemu_savevm_state_complete_precopy() into two helper
     functions
   snapshot: Save VM's device state into snapshot file
   migration/postcopy-ram: fix some helper functions to support
     userfaultfd write-protect
   snapshot: Enable the write-protect notification capability for VM's
     RAM
   snapshot/migration: Save VM's RAM into snapshot file
   migration/ram: Fix some helper functions' parameter to use
     PageSearchStatus
   snapshot: Remove page's write-protect and copy the content during
     setup stage

  include/migration/migration.h     |  41 +++++--
  include/migration/postcopy-ram.h  |   9 +-
  include/migration/qemu-file.h     |   3 +-
  include/qemu/typedefs.h           |   1 +
  include/sysemu/sysemu.h           |   3 +
  linux-headers/linux/userfaultfd.h |  21 +++-
  migration/fd.c                    |  51 ++++++++-
  migration/migration.c             | 101 ++++++++++++++++-
  migration/postcopy-ram.c          | 229 ++++++++++++++++++++++++++++----------
  migration/qemu-file-buf.c         |  61 ++++++++++
  migration/ram.c                   | 104 ++++++++++++-----
  migration/savevm.c                |  90 ++++++++++++---
  trace-events                      |   1 +
  13 files changed, 587 insertions(+), 128 deletions(-)

--
1.8.3.1


--
Dr. David Alan Gilbert / address@hidden / Manchester, UK
