qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v1 13/13] migrate/ram: Tolerate partially changed mappings in


From: Peter Xu
Subject: Re: [PATCH v1 13/13] migrate/ram: Tolerate partially changed mappings in postcopy code
Date: Thu, 20 Feb 2020 16:07:12 -0500

On Thu, Feb 20, 2020 at 12:24:42PM +0100, David Hildenbrand wrote:
> On 19.02.20 17:17, David Hildenbrand wrote:
> > When we partially change mappings (e.g., mmap over parts of an existing
> > mmap) where we have a userfaultfd handler registered, the handler will
> > implicitly be unregistered from the parts that changed. This is e.g., the
> > case when doing a qemu_ram_remap(), but is also a preparation for RAM
> > blocks with resizable allocations and we're shrinking RAM blocks.
> > 
> > When the mapping is changed and the handler is removed, any waiters are
> > woken up. Trying to place pages will fail. We can simply ignore erors
> > due to that when placing pages - as the mapping changed on the migration
> > destination, also the content is stale. E.g., after shrinking a RAM
> > block, nobody should be using that memory. After doing a
> > qemu_ram_remap(), the old memory is expected to have vanished.
> > 
> > Let's tolerate such errors (but still warn for now) when placing pages.
> > Also, add a comment why unregistering will continue to work even though
> > the mapping might have changed.
> > 
> > Cc: "Dr. David Alan Gilbert" <address@hidden>
> > Cc: Juan Quintela <address@hidden>
> > Cc: Peter Xu <address@hidden>
> > Cc: Andrea Arcangeli <address@hidden>
> > Signed-off-by: David Hildenbrand <address@hidden>
> > ---
> >  migration/postcopy-ram.c | 43 ++++++++++++++++++++++++++++++++++------
> >  1 file changed, 37 insertions(+), 6 deletions(-)
> > 
> > diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> > index c68caf4e42..df9d27c004 100644
> > --- a/migration/postcopy-ram.c
> > +++ b/migration/postcopy-ram.c
> > @@ -506,6 +506,13 @@ static int cleanup_range(RAMBlock *rb, void *opaque)
> >      range_struct.start = (uintptr_t)host_addr;
> >      range_struct.len = length;
> >  
> > +    /*
> > +     * In case the mapping was partially changed since we enabled userfault
> > +     * (esp. when whrinking RAM blocks and we have resizable allocations, 
> > or
> > +     * via qemu_ram_remap()), the userfaultfd handler was already removed 
> > for
> > +     * the mappings that changed. Unregistering will, however, still work 
> > and
> > +     * ignore mappings without a registered handler.
> > +     */
> >      if (ioctl(mis->userfault_fd, UFFDIO_UNREGISTER, &range_struct)) {
> >          error_report("%s: userfault unregister %s", __func__, 
> > strerror(errno));
> >  
> > @@ -1239,10 +1246,28 @@ int postcopy_place_page(MigrationIncomingState 
> > *mis, void *host, void *from,
> >       */
> >      if (qemu_ufd_copy_ioctl(mis->userfault_fd, host, from, pagesize, rb)) {
> >          int e = errno;
> > -        error_report("%s: %s copy host: %p from: %p (size: %zd)",
> > -                     __func__, strerror(e), host, from, pagesize);
> >  
> > -        return -e;
> > +        /*
> > +         * When the mapping gets partially changed before we try to place 
> > a page
> > +         * (esp. when whrinking RAM blocks and we have resizable 
> > allocations, or
> > +         * via qemu_ram_remap()), the userfaultfd handler will be removed 
> > and
> > +         * placing pages will fail. In that case, any waiter was already 
> > woken
> > +         * up when the mapping was changed. We can safely ignore this, as
> > +         * mappings that change once we're running on the destination imply
> > +         * that memory of these mappings vanishes. Let's still print a 
> > warning
> > +         * for now.
> > +         *
> 
> After talking to Andrea, on mapping changes, no waiter will be woken up
> automatically. We have to do an UFFDIO_WAKE, which even works when there
> is no longer a handler registered for that reason. Interesting stuff :)

Yes actually it makes sense. :)

Though I do think it should hardly happen, otherwise even if it's
waked up it'll still try to access that GPA and KVM will be confused
on that and exit because no memslot was setup for that.  Then I think
it's a fatal VM error.  In other words, I feel like the resizing
should be blocked somehow by that stall vcpu too (e.g., even if we
want to reboot a Linux guest, it'll sync between vcpus, and same to
bootstraping).

Btw, I feel like we cannot always depend on the fact that userfaultfd
will dissapear itself if the VMA is unmapped, because even it's true
it'll only be true for shrinking of memories.  How about extending
memory in the future?  So IIUC if we want to really fix this, we
probably need to take care of uffd register and unregister of changed
memory regions, which AFAIUI can be done inside your newly introduced
resize hook...

We probably need to take care of other things that might be related to
ramblock resizing too in the same notifier.  One I can think of is to
realloc the ramblock.receivedmap otherwise we could have some bit
cleared forever for shrinking memories (which logically when migration
finishes that bitmap should be all set).

Thanks,

-- 
Peter Xu




reply via email to

[Prev in Thread] Current Thread [Next in Thread]