Re: [Qemu-devel] [PATCH v4 00/47] Postcopy implementation


From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [PATCH v4 00/47] Postcopy implementation
Date: Fri, 21 Nov 2014 10:14:00 +0000
User-agent: Mutt/1.5.23 (2014-03-12)

* zhanghailiang (address@hidden) wrote:
> Hi David,
> 
> When I migrated a VM using postcopy, with the VM configured with the
> '-realtime mlock=on' option, it failed and reported
> "postcopy_ram_hosttest: remap_anon_pages not available: File exists"
> on the destination.
> 
> Is it a bug in the userfaultfd API?

Thanks.

> cc: Andrea
> 
> Steps to reproduce:
> Source:
> qemu-postcopy/qemu # x86_64-softmmu/qemu-system-x86_64 -msg timestamp=on \
> -machine pc-i440fx-2.2,accel=kvm -m 1024 -realtime mlock=on -smp 4 \
> -hda /mnt/sdb/pure_IMG/redhat/redhat-6.4-httpd.img -vnc :11 -monitor stdio
> 
> Destination:
> qemu-postcopy/qemu # x86_64-softmmu/qemu-system-x86_64 -msg timestamp=on \
> -machine pc-i440fx-2.2,accel=kvm -m 1024 -realtime mlock=on -smp 4 \
> -hda /mnt/sdb/pure_IMG/redhat/redhat-6.4-httpd.img -vnc :12 -monitor stdio \
> -incoming unix:/mnt/migrate.sock
> (1) migrate_set_capability x-postcopy-ram on
> (2) migrate -d unix:/mnt/migrate.sock
> 
> On the destination, it fails and reports:
> address@hidden qemu_loadvm_state_main QEMU_VM_COMMAND ret: 0
> address@hidden qemu_loadvm_state loop: section_type=6
> address@hidden loadvm_postcopy_ram_handle_advise
> postcopy_ram_hosttest: remap_anon_pages not available: File exists
> address@hidden qemu_loadvm_state_main QEMU_VM_COMMAND ret: -1

Yes, I think I need to chat to Andrea about how that's supposed to work with
mlock. I've added it to my list and we'll figure it out; I suspect that on the
destination I need to avoid doing the mlockall until after postcopy completes.
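
As a rough illustration of that ordering (not QEMU code), here is a minimal
Python sketch. The names `DeferredMlock`, `enter_postcopy`, `request` and
`postcopy_done` are hypothetical, and the injected `lock_fn` only stands in for
a call to mlockall(); the real fix may well look different.

```python
from typing import Callable


class DeferredMlock:
    """Hold back a memory-lock request until postcopy has finished.

    Hypothetical sketch: lock_fn stands in for mlockall(); nothing here
    is an existing QEMU interface.
    """

    def __init__(self, lock_fn: Callable[[], None]) -> None:
        self._lock_fn = lock_fn
        self._requested = False        # user asked for mlock=on
        self._postcopy_active = False  # destination is in postcopy
        self.locked = False

    def enter_postcopy(self) -> None:
        # Called on the destination before the postcopy host test runs.
        self._postcopy_active = True

    def request(self) -> None:
        # Record the request; lock now only if postcopy is not in progress.
        self._requested = True
        self._maybe_lock()

    def postcopy_done(self) -> None:
        # Postcopy has completed, so a pending lock can now be applied.
        self._postcopy_active = False
        self._maybe_lock()

    def _maybe_lock(self) -> None:
        if self._requested and not self._postcopy_active and not self.locked:
            self._lock_fn()
            self.locked = True
```

With this shape, a destination that enters postcopy first records the mlock
request but only applies it once `postcopy_done()` is called; a normal
(non-postcopy) start locks immediately, as before.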

> And one more thing I want to know: ;)
> Why must we start precopy before starting postcopy?
> Can we do postcopy from the beginning of migration?

You can send migrate_start_postcopy immediately after you send the migrate
command, which is very close to no-precopy. The original API had a timeout,
and setting it to 0 gave exactly no-precopy, but reviewers preferred the
current API, and it is simpler.
In testing, the best performance comes from doing one full pass of precopy
and then starting postcopy; that way the kernel and other static data have
already moved to the destination, and there are far fewer page requests.
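
To make the two orderings concrete, here is a small Python sketch that issues
the monitor commands from this thread in sequence. The three command strings
are the ones used above; `send` and `wait_for_precopy_pass` are hypothetical
callables supplied by the caller, and how to detect the end of the first
precopy pass is deliberately left out.

```python
from typing import Callable, Optional


def run_postcopy_migration(
    uri: str,
    send: Callable[[str], None],
    wait_for_precopy_pass: Optional[Callable[[], None]] = None,
) -> None:
    """Issue the source-side monitor commands for a postcopy migration.

    send: hypothetical helper that delivers one HMP command line.
    wait_for_precopy_pass: hypothetical hook; if given, it is called
    between starting the migration and switching to postcopy.
    """
    # Enable the capability before the migration starts.
    send("migrate_set_capability x-postcopy-ram on")
    # Start the migration in the background.
    send(f"migrate -d {uri}")
    # Optionally let precopy run (for example, one full pass) first.
    if wait_for_precopy_pass is not None:
        wait_for_precopy_pass()
    # Switch to postcopy; with no wait this is close to no-precopy.
    send("migrate_start_postcopy")
```

Calling it without `wait_for_precopy_pass` gives the near-no-precopy behaviour;
passing a hook that blocks until one pass has completed gives the ordering that
performed best in testing.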

Thanks for the report,

Dave

> 
> Thanks,
> zhanghailiang
> 
> On 2014/10/4 1:47, Dr. David Alan Gilbert (git) wrote:
> >From: "Dr. David Alan Gilbert" <address@hidden>
> >
> >Hi,
> >   This is the 4th cut of my version of postcopy; it is designed for use with
> >the Linux kernel additions just posted by Andrea Arcangeli here:
> >
> >http://marc.info/?l=linux-kernel&m=141235633015100&w=2
> >
> >(Note: This is a new version compared to my previous postcopy patchset;
> >you'll need to update the kernel to the new version.)
> >
> >Other than the new kernel ABI (which is only a small change to the userspace
> >side), the major changes are:
> >
> >   a) Code for host page size != target page size
> >   b) Support for migration over fd
> >      From Cristian Klein; this is for libvirt support which Cristian
> >      recently posted to the libvirt list.
> >   c) It's now build bisectable and builds on 32bit
> >
> >Testing-wise, I've now done many thousands of postcopy migrations without
> >failure (both of idle and busy guests), so it seems pretty solid.
> >
> >Must-TODO's:
> >   1) A partially repeatable migration_cancel failure
> >   2) virt_test's migrate.with_reboot test is failing
> >   3) The ACPI fix in 2.1 that allowed migrating RAMBlocks to be larger than
> >     the source feels like it needs looking at for postcopy.
> >   4) Paolo's comments with respect to the wakeup_request/is_running code
> >      in the migration thread
> >   5) xbzrle needs disabling once in postcopy
> >
> >Later-TODO's:
> >   1) Control the rate of background page transfers during postcopy to
> >      reduce their impact on the latency of postcopy requests.
> >   2) Work with RDMA
> >   3) Could destination RP be made blocking (as per discussion with Paolo;
> >      I'm still worried that that changes too many assumptions)
> >
> >
> >
> >V4:
> >   Initial support for host page size != target page size
> >     - tested heavily on hps==tps
> >     - only partially tested on hps!=tps systems
> >     - This involved quite a bit of rework around the discard code
> >   Updated to new kernel userfault ABI
> >     - It won't work with the previous version
> >   Fix mis-optimisation of postcopy request for wrong RAMBlock
> >      request for block A offset n
> >      un-needed fault for block B/m (already received - no req sent)
> >      request for block B/l  - wrongly sent as request for A/l
> >   Fix thinko in discard bitmap processing (missed last word of bitmap)
> >      Symptom: remap failures near the top of RAM if postcopy started late
> >   Fix bug that caused kernel page acknowledgments to be misaligned
> >      May have meant the guest was paused for longer than required
> >   Fix potential for crashing cleaning up failed RP
> >   Fixes in docs (from Yang)
> >   Handle migration by fd as sockets if they are sockets
> >   Build tested on 32bit
> >   Fully build bisectable (x86-64)
> >
> >
> >Dave
> >
> >Cristian Klein (1):
> >   Handle bi-directional communication for fd migration
> >
> >Dr. David Alan Gilbert (46):
> >   QEMUSizedBuffer based QEMUFile
> >   Tests: QEMUSizedBuffer/QEMUBuffer
> >   Start documenting how postcopy works.
> >   qemu_ram_foreach_block: pass up error value, and down the ramblock
> >     name
> >   improve DPRINTF macros, add to savevm
> >   Add qemu_get_counted_string to read a string prefixed by a count byte
> >   Create MigrationIncomingState
> >   socket shutdown
> >   Provide runtime Target page information
> >   Return path: Open a return path on QEMUFile for sockets
> >   Return path: socket_writev_buffer: Block even on non-blocking fd's
> >   Migration commands
> >   Return path: Control commands
> >   Return path: Send responses from destination to source
> >   Return path: Source handling of return path
> >   qemu_loadvm errors and debug
> >   ram_debug_dump_bitmap: Dump a migration bitmap as text
> >   Rework loadvm path for subloops
> >   Add migration-capability boolean for postcopy-ram.
> >   Add wrappers and handlers for sending/receiving the postcopy-ram
> >     migration messages.
> >   QEMU_VM_CMD_PACKAGED: Send a packaged chunk of migration stream
> >   migrate_init: Call from savevm
> >   Allow savevm handlers to state whether they could go into postcopy
> >   postcopy: OS support test
> >   migrate_start_postcopy: Command to trigger transition to postcopy
> >   MIG_STATE_POSTCOPY_ACTIVE: Add new migration state
> >   qemu_savevm_state_complete: Postcopy changes
> >   Postcopy page-map-incoming (PMI) structure
> >   Postcopy: Maintain sentmap and calculate discard
> >   postcopy: Incoming initialisation
> >   postcopy: ram_enable_notify to switch on userfault
> >   Postcopy: Postcopy startup in migration thread
> >   Postcopy: Create a fault handler thread before marking the ram as
> >     userfault
> >   Page request:  Add MIG_RPCOMM_REQPAGES reverse command
> >   Page request: Process incoming page request
> >   Page request: Consume pages off the post-copy queue
> >   Add assertion to check migration_dirty_pages
> >   postcopy_ram.c: place_page and helpers
> >   Postcopy: Use helpers to map pages during migration
> >   qemu_ram_block_from_host
> >   Don't sync dirty bitmaps in postcopy
> >   Host page!=target page: Cleanup bitmaps
> >   Postcopy; Handle userfault requests
> >   Start up a postcopy/listener thread ready for incoming page data
> >   postcopy: Wire up loadvm_postcopy_ram_handle_{run,end} commands
> >   End of migration for postcopy
> >
> >  Makefile.objs                    |    2 +-
> >  arch_init.c                      |  739 +++++++++++++++++++++++++--
> >  docs/migration.txt               |  189 +++++++
> >  exec.c                           |   76 ++-
> >  hmp-commands.hx                  |   15 +
> >  hmp.c                            |    7 +
> >  hmp.h                            |    1 +
> >  include/exec/cpu-common.h        |    8 +-
> >  include/migration/migration.h    |  130 +++++
> >  include/migration/postcopy-ram.h |  106 ++++
> >  include/migration/qemu-file.h    |   47 ++
> >  include/migration/vmstate.h      |    2 +-
> >  include/qemu/sockets.h           |    1 +
> >  include/qemu/typedefs.h          |    9 +-
> >  include/sysemu/sysemu.h          |   43 +-
> >  migration-fd.c                   |   24 +-
> >  migration-rdma.c                 |    4 +-
> >  migration.c                      |  693 +++++++++++++++++++++++++-
> >  postcopy-ram.c                   | 1016 ++++++++++++++++++++++++++++++++++++++
> >  qapi-schema.json                 |   14 +-
> >  qemu-file.c                      |  598 +++++++++++++++++++++-
> >  qmp-commands.hx                  |   19 +
> >  savevm.c                         |  881 +++++++++++++++++++++++++++++++--
> >  tests/Makefile                   |    2 +-
> >  tests/test-vmstate.c             |   74 +--
> >  util/qemu-sockets.c              |   28 ++
> >  26 files changed, 4550 insertions(+), 178 deletions(-)
> >  create mode 100644 include/migration/postcopy-ram.h
> >  create mode 100644 postcopy-ram.c
> >
> 
> 
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK


