Re: [Qemu-devel] [PATCH v4 00/47] Postcopy implementation


From: zhanghailiang
Subject: Re: [Qemu-devel] [PATCH v4 00/47] Postcopy implementation
Date: Mon, 24 Nov 2014 16:10:03 +0800
User-agent: Mozilla/5.0 (Windows NT 6.1; rv:31.0) Gecko/20100101 Thunderbird/31.1.1

On 2014/11/21 18:14, Dr. David Alan Gilbert wrote:
* zhanghailiang (address@hidden) wrote:
Hi David,

When I migrate a VM in postcopy mode with the '-realtime mlock=on' option
configured, it fails and reports "postcopy_ram_hosttest: remap_anon_pages not
available: File exists" on the destination.

Is this a bug in the userfaultfd API?

Thanks.

cc: Andrea

Steps to reproduce:
Source:
qemu-postcopy/qemu # x86_64-softmmu/qemu-system-x86_64 -msg timestamp=on \
-machine pc-i440fx-2.2,accel=kvm -m 1024 -realtime mlock=on -smp 4 \
-hda /mnt/sdb/pure_IMG/redhat/redhat-6.4-httpd.img -vnc :11 -monitor stdio

Destination:
qemu-postcopy/qemu # x86_64-softmmu/qemu-system-x86_64 -msg timestamp=on \
-machine pc-i440fx-2.2,accel=kvm -m 1024 -realtime mlock=on -smp 4 \
-hda /mnt/sdb/pure_IMG/redhat/redhat-6.4-httpd.img -vnc :12 -monitor stdio \
-incoming unix:/mnt/migrate.sock
(1) migrate_set_capability x-postcopy-ram on
(2) migrate -d unix:/mnt/migrate.sock

On the destination, it fails and reports:
address@hidden qemu_loadvm_state_main QEMU_VM_COMMAND ret: 0
address@hidden qemu_loadvm_state loop: section_type=6
address@hidden loadvm_postcopy_ram_handle_advise
postcopy_ram_hosttest: remap_anon_pages not available: File exists
address@hidden qemu_loadvm_state_main QEMU_VM_COMMAND ret: -1

Yes, I think I need to chat to Andrea about how that's supposed to work with 
mlock.
I've added it to my list and we'll figure it out; I suspect on the destination
I need to avoid doing the mlockall until after postcopy completes.
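
The ordering suggested above (hold back the memory lock on the destination until postcopy has completed) can be pictured with a small sketch. This is an illustration only: the `Destination` class, its fields and its method names are hypothetical and are not taken from the QEMU source, and `_lock` merely stands in for a real mlockall() call.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Destination:
    """Hypothetical model of the destination side; not a QEMU structure."""
    mlock_requested: bool        # '-realtime mlock=on' was given
    postcopy_possible: bool      # an incoming migration may switch to postcopy
    events: List[str] = field(default_factory=list)
    locked: bool = False

    def _lock(self) -> None:
        # Stand-in for the real mlockall() call.
        self.locked = True
        self.events.append("mlockall")

    def startup(self) -> None:
        # Lock right away only if no postcopy phase can follow.
        if self.mlock_requested and not self.postcopy_possible:
            self._lock()

    def postcopy_complete(self) -> None:
        self.events.append("postcopy-complete")
        # Every page has arrived, so the deferred lock is taken now.
        if self.mlock_requested and not self.locked:
            self._lock()


dest = Destination(mlock_requested=True, postcopy_possible=True)
dest.startup()
dest.postcopy_complete()
print(dest.events)
```

With these assumptions the lock is only taken after the "postcopy-complete" event, whereas a destination that cannot enter postcopy still locks at startup.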

And one more thing I want to know: ;)
Why must we start precopy before starting postcopy?
Can we do postcopy from the beginning of the migration?

You can send migrate_start_postcopy immediately after you send the migrate
command, which is very close to no-precopy; the original API had a timeout
and if you set it to 0 then it would do exactly no-precopy, but the current API
was preferred by reviewers, and is simpler.
With testing, the best performance is from doing one full pass of precopy and
then starting postcopy; that way all of the kernel and other static stuff
has already moved to the destination, and there are far fewer page requests.
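
A toy model may help show why one precopy pass reduces the number of page requests. It is a sketch under stated assumptions, not a measurement: the page counts and the split into "static" and "hot" pages are invented for illustration, the helper `page_requests` is hypothetical, and background page transfers during postcopy are ignored.

```python
def page_requests(guest_accesses, pages_already_received):
    """Count the pages the destination has to request from the source.

    A page is requested the first time the guest touches it while it
    is still missing on the destination.
    """
    received = set(pages_already_received)
    requests = 0
    for page in guest_accesses:
        if page not in received:
            requests += 1
            received.add(page)  # the source answers; the page is now present
    return requests


TOTAL_PAGES = 1000
# Assumed workload: 300 mostly static pages (kernel, libraries) plus
# 50 hot pages that the guest keeps rewriting.
static_pages = range(0, 300)
hot_pages = range(300, 350)
accesses = list(static_pages) + list(hot_pages)

# No precopy: nothing has arrived when the guest starts on the destination.
no_precopy = page_requests(accesses, [])

# One full precopy pass: every page was sent once, but the hot pages were
# dirtied again during the pass and still have to be fetched.
after_one_pass = page_requests(accesses, set(range(TOTAL_PAGES)) - set(hot_pages))

print(no_precopy, after_one_pass)
```

In this model the static pages account for most of the requests in the no-precopy case, which matches the reasoning given above.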


Got it, :) Thanks.

Thanks for the report,

Dave


Thanks,
zhanghailiang

On 2014/10/4 1:47, Dr. David Alan Gilbert (git) wrote:
From: "Dr. David Alan Gilbert" <address@hidden>

Hi,
   This is the 4th cut of my version of postcopy; it is designed for use with
the Linux kernel additions just posted by Andrea Arcangeli here:

http://marc.info/?l=linux-kernel&m=141235633015100&w=2

(Note: This is a new version compared to my previous postcopy patchset; you'll
need to update the kernel to the new version.)

Other than the new kernel ABI (which is only a small change to the userspace
side), the major changes are:

   a) Code for host page size != target page size
   b) Support for migration over fd
      From Cristian Klein; this is for libvirt support which Cristian recently
      posted to the libvirt list.
   c) It's now build bisectable and builds on 32bit
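
For item (a), the arithmetic behind "host page size != target page size" can be sketched as follows. The sketch assumes the host page is a whole multiple of the target page (for example a 64 KiB host page and a 4 KiB target page) and that a host page is handled as one unit; both are assumptions for illustration, and `target_pages_in_host_page` is a hypothetical helper, not a function from the patch set.

```python
def target_pages_in_host_page(addr, host_page_size, target_page_size):
    """Return the start offsets of all target pages that share the host
    page containing addr."""
    if host_page_size % target_page_size != 0:
        raise ValueError("host page size must be a multiple of the target page size")
    # Round addr down to the start of its host page.
    host_start = addr - (addr % host_page_size)
    return list(range(host_start, host_start + host_page_size, target_page_size))


# Assumed sizes: 64 KiB host pages, 4 KiB target pages.
pages = target_pages_in_host_page(0x12345, 64 * 1024, 4 * 1024)
print(len(pages), hex(pages[0]), hex(pages[-1]))
```

When both sizes are equal the function returns a single page, which corresponds to the heavily tested hps==tps case.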

Testing-wise, I've now done many thousands of postcopy migrations without
failure (both of idle and busy guests), so it seems pretty solid.

Must-TODO's:
   1) A partially repeatable migration_cancel failure
   2) virt_test's migrate.with_reboot test is failing
   3) The ACPI fix in 2.1 that allowed migrating RAMBlocks to be larger than
     the source feels like it needs looking at for postcopy.
   4) Paolo's comments with respect to the wakeup_request/is_running code
      in the migration thread
   5) xbzrle needs disabling once in postcopy

Later-TODO's:
   1) Control the rate of background page transfers during postcopy to
      reduce their impact on the latency of postcopy requests.
   2) Work with RDMA
   3) Could the destination RP (return path) be made blocking? (As per
      discussion with Paolo; I'm still worried that that changes too many
      assumptions.)



V4:
   Initial support for host page size != target page size
     - tested heavily on hps==tps
     - only partially tested on hps!=tps systems
     - This involved quite a bit of rework around the discard code
   Updated to new kernel userfault ABI
     - It won't work with the previous version
   Fix mis-optimisation of postcopy request for wrong RAMBlock
      request for block A offset n
      un-needed fault for block B/m (already received - no req sent)
      request for block B/l  - wrongly sent as request for A/l
   Fix thinko in discard bitmap processing (missed last word of bitmap)
      Symptom: remap failures near the top of RAM if postcopy started late
   Fix bug that caused kernel page acknowledgments to be misaligned
      May have meant the guest was paused for longer than required
   Fix potential for crashing cleaning up failed RP
   Fixes in docs (from Yang)
   Handle migration by fd as sockets if they are sockets
   Build tested on 32bit
   Fully build bisectable (x86-64)
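
The "missed last word of bitmap" fix in the list above is a classic off-by-one. The sketch below shows one common way such a bug arises (rounding the word count down instead of up); it is not claimed to be the actual cause in the patch set. The 64-bit word size and the helper `set_bits` are assumptions for illustration.

```python
BITS_PER_WORD = 64


def set_bits(words, nbits):
    """Yield the index of every set bit in a bitmap of nbits bits stored
    as a list of 64-bit words."""
    # Round up so that a final, partially used word is still visited.
    # Using nbits // BITS_PER_WORD here would skip that last word.
    nwords = (nbits + BITS_PER_WORD - 1) // BITS_PER_WORD
    for w in range(nwords):
        word = words[w]
        for b in range(BITS_PER_WORD):
            bit = w * BITS_PER_WORD + b
            if bit >= nbits:
                break  # ignore padding bits beyond the end of the bitmap
            if word & (1 << b):
                yield bit


# 130 bits need three words; bit 129 lives in the last, partial word.
bitmap = [0b1, 0, 0b10]
print(list(set_bits(bitmap, 130)))
```

A loop that rounds the word count down would only report bit 0 here, which would leave the pages at the top of the range unprocessed, in line with the symptom described (failures near the top of RAM).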


Dave

Cristian Klein (1):
   Handle bi-directional communication for fd migration

Dr. David Alan Gilbert (46):
   QEMUSizedBuffer based QEMUFile
   Tests: QEMUSizedBuffer/QEMUBuffer
   Start documenting how postcopy works.
   qemu_ram_foreach_block: pass up error value, and down the ramblock
     name
   improve DPRINTF macros, add to savevm
   Add qemu_get_counted_string to read a string prefixed by a count byte
   Create MigrationIncomingState
   socket shutdown
   Provide runtime Target page information
   Return path: Open a return path on QEMUFile for sockets
   Return path: socket_writev_buffer: Block even on non-blocking fd's
   Migration commands
   Return path: Control commands
   Return path: Send responses from destination to source
   Return path: Source handling of return path
   qemu_loadvm errors and debug
   ram_debug_dump_bitmap: Dump a migration bitmap as text
   Rework loadvm path for subloops
   Add migration-capability boolean for postcopy-ram.
   Add wrappers and handlers for sending/receiving the postcopy-ram
     migration messages.
   QEMU_VM_CMD_PACKAGED: Send a packaged chunk of migration stream
   migrate_init: Call from savevm
   Allow savevm handlers to state whether they could go into postcopy
   postcopy: OS support test
   migrate_start_postcopy: Command to trigger transition to postcopy
   MIG_STATE_POSTCOPY_ACTIVE: Add new migration state
   qemu_savevm_state_complete: Postcopy changes
   Postcopy page-map-incoming (PMI) structure
   Postcopy: Maintain sentmap and calculate discard
   postcopy: Incoming initialisation
   postcopy: ram_enable_notify to switch on userfault
   Postcopy: Postcopy startup in migration thread
   Postcopy: Create a fault handler thread before marking the ram as
     userfault
   Page request:  Add MIG_RPCOMM_REQPAGES reverse command
   Page request: Process incoming page request
   Page request: Consume pages off the post-copy queue
   Add assertion to check migration_dirty_pages
   postcopy_ram.c: place_page and helpers
   Postcopy: Use helpers to map pages during migration
   qemu_ram_block_from_host
   Don't sync dirty bitmaps in postcopy
   Host page!=target page: Cleanup bitmaps
   Postcopy; Handle userfault requests
   Start up a postcopy/listener thread ready for incoming page data
   postcopy: Wire up loadvm_postcopy_ram_handle_{run,end} commands
   End of migration for postcopy

  Makefile.objs                    |    2 +-
  arch_init.c                      |  739 +++++++++++++++++++++++++--
  docs/migration.txt               |  189 +++++++
  exec.c                           |   76 ++-
  hmp-commands.hx                  |   15 +
  hmp.c                            |    7 +
  hmp.h                            |    1 +
  include/exec/cpu-common.h        |    8 +-
  include/migration/migration.h    |  130 +++++
  include/migration/postcopy-ram.h |  106 ++++
  include/migration/qemu-file.h    |   47 ++
  include/migration/vmstate.h      |    2 +-
  include/qemu/sockets.h           |    1 +
  include/qemu/typedefs.h          |    9 +-
  include/sysemu/sysemu.h          |   43 +-
  migration-fd.c                   |   24 +-
  migration-rdma.c                 |    4 +-
  migration.c                      |  693 +++++++++++++++++++++++++-
  postcopy-ram.c                   | 1016 ++++++++++++++++++++++++++++++++++++++
  qapi-schema.json                 |   14 +-
  qemu-file.c                      |  598 +++++++++++++++++++++-
  qmp-commands.hx                  |   19 +
  savevm.c                         |  881 +++++++++++++++++++++++++++++++--
  tests/Makefile                   |    2 +-
  tests/test-vmstate.c             |   74 +--
  util/qemu-sockets.c              |   28 ++
  26 files changed, 4550 insertions(+), 178 deletions(-)
  create mode 100644 include/migration/postcopy-ram.h
  create mode 100644 postcopy-ram.c



--
Dr. David Alan Gilbert / address@hidden / Manchester, UK
