[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2 13/13] migration/ram: Tolerate partially changed mappings
From: |
David Hildenbrand |
Subject: |
Re: [PATCH v2 13/13] migration/ram: Tolerate partially changed mappings in postcopy code |
Date: |
Tue, 25 Feb 2020 08:44:56 +0100 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.5.0 |
On 24.02.20 23:49, Peter Xu wrote:
> On Fri, Feb 21, 2020 at 05:42:04PM +0100, David Hildenbrand wrote:
>> When we partially change mappings (esp., mmap over parts of an existing
>> mmap like qemu_ram_remap() does) where we have a userfaultfd handler
>> registered, the handler will implicitly be unregistered from the parts that
>> changed.
>>
>> Trying to place pages onto mappings where there is no longer a handler
>> registered will fail. Let's make sure that any waiter is woken up - we
>> have to do that manually.
>>
>> Let's also document how UFFDIO_UNREGISTER will handle this scenario.
>>
>> This is mainly a preparation for RAM blocks with resizable allcoations,
>> where the mapping of the invalid RAM range will change. The source will
>> keep sending pages that are outside of the new (shrunk) RAM size. We have
>> to treat these pages like they would have been migrated, but can
>> essentially simply drop the content (ignore the placement error).
>>
>> Keep printing a warning on EINVAL, to avoid hiding other (programming)
>> issues. ENOENT is unique.
>>
>> Cc: "Dr. David Alan Gilbert" <address@hidden>
>> Cc: Juan Quintela <address@hidden>
>> Cc: Peter Xu <address@hidden>
>> Cc: Andrea Arcangeli <address@hidden>
>> Signed-off-by: David Hildenbrand <address@hidden>
>> ---
>> migration/postcopy-ram.c | 37 +++++++++++++++++++++++++++++++++++++
>> 1 file changed, 37 insertions(+)
>>
>> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
>> index c68caf4e42..f023830b9a 100644
>> --- a/migration/postcopy-ram.c
>> +++ b/migration/postcopy-ram.c
>> @@ -506,6 +506,12 @@ static int cleanup_range(RAMBlock *rb, void *opaque)
>> range_struct.start = (uintptr_t)host_addr;
>> range_struct.len = length;
>>
>> + /*
>> + * In case the mapping was partially changed since we enabled userfault
>> + * (e.g., via qemu_ram_remap()), the userfaultfd handler was already
>> removed
>> + * for the mappings that changed. Unregistering will, however, still
>> work
>> + * and ignore mappings without a registered handler.
>> + */
>
> Ideally we should still only unregister what we have registered.
> After all we do have this information because we know what we
> registered, we know what has unmapped (in your new resize() hook, when
> postcopy_state==RUNNING).
Not in the case of qemu_ram_remap(). And whatever you propose will
require synchronization (see my other mail) and more complicated
handling than this. uffd allows you to handle races with mmap changes in
a very elegant way (e.g., -ENOENT, or unregisterignoring changed mappings).
>
> An extreme example is when we register with pages in range [A, B),
> then shrink it to [A, C), then we mapped something else within [C, B)
> (note, with virtio-mem logically B can be very big and C can be very
> small, it means [B, C) can cover quite some address space). Then if:
>
> - [C, B) memory type is not compatible with uffd, or
That will never happen in the near future. Without resizable allocations:
- All memory is either anonymous or from a single fd
In addition, right now, only anonymous memory can be used for resizable
RAM. However, with resizable allocations we could have:
- All used_length memory is either anonymous or from a single fd
- All remaining memory is either anonymous or from a single fd
Everything else does not make any sense IMHO and I don't think this is
relevant long term. You cannot arbitrarily map things into the
used_length part of a RAMBlock. That would contradict to its page_size
and its fd. E.g., you would break qemu_ram_remap().
>
> - [C, B) could be registered with uffd again due to some other
> reason (so far QEMU should not have such a reason)
Any code that wants to make use of uffd properly has to synchronize
against postcopy code either way IMHO. It just doesn't work otherwise.
E.g., once I would use it to protect unplugged memory in virtio-mem
(something I am looking into right now and teaching QEMU not to touch
all RAMBlock memory is complicated :) ), virtio-mem would unregister any
uffd handler when notified that postcopy will start, and re-register
after postcopy finished.
[...]
>>
>> +static int qemu_ufd_wake_ioctl(int userfault_fd, void *host_addr,
>> + uint64_t pagesize)
>> +{
>> + struct uffdio_range range = {
>> + .start = (uint64_t)(uintptr_t)host_addr,
>> + .len = pagesize,
>> + };
>> +
>> + return ioctl(userfault_fd, UFFDIO_WAKE, &range);
>> +}
>> +
>> static int qemu_ufd_copy_ioctl(int userfault_fd, void *host_addr,
>> void *from_addr, uint64_t pagesize, RAMBlock
>> *rb)
>> {
>> @@ -1198,6 +1215,26 @@ static int qemu_ufd_copy_ioctl(int userfault_fd, void
>> *host_addr,
>> zero_struct.mode = 0;
>> ret = ioctl(userfault_fd, UFFDIO_ZEROPAGE, &zero_struct);
>> }
>> +
>> + /*
>> + * When the mapping gets partially changed (e.g., qemu_ram_remap())
>> before
>> + * we try to place a page, the userfaultfd handler will be removed for
>> the
>> + * changed mappings and placing pages will fail. We can safely ignore
>> this,
>> + * because mappings that changed on the destination don't need data
>> from the
>> + * source (e.g., qemu_ram_remap()). Wake up any waiter waiting for that
>> page
>> + * (unlikely but possible). Waking up waiters is always possible, even
>> + * without a registered userfaultfd handler.
>> + *
>> + * Old kernels report EINVAL, new kernels report ENOENT in case there is
>> + * no longer a userfaultfd handler for a mapping.
>> + */
>> + if (ret && (errno == ENOENT || errno == EINVAL)) {
>> + if (errno == EINVAL) {
>> + warn_report("%s: Failed to place page %p. Waking up any
>> waiters.",
>> + __func__, host_addr);
>> + }
>> + ret = qemu_ufd_wake_ioctl(userfault_fd, host_addr, pagesize);
>
> ... if with above information (takes notes on where we registered
> uffd), I think we don't need to capture error, but we can simply skip
> those outliers.
I think we could skip them in general. Nobody should be touching that
memory. But debugging e.g., a SEGFAULT is easier than debugging some
sleeping thread IMHO.
--
Thanks,
David / dhildenb
- [PATCH v2 07/13] migration/ram: Discard RAM when growing RAM blocks after ram_postcopy_incoming_init(), (continued)
- [PATCH v2 07/13] migration/ram: Discard RAM when growing RAM blocks after ram_postcopy_incoming_init(), David Hildenbrand, 2020/02/21
- [PATCH v2 09/13] migration/ram: Consolidate variable reset after placement in ram_load_postcopy(), David Hildenbrand, 2020/02/21
- [PATCH v2 10/13] migration/ram: Handle RAM block resizes during postcopy, David Hildenbrand, 2020/02/21
- [PATCH v2 11/13] migration/multifd: Print used_length of memory block, David Hildenbrand, 2020/02/21
- [PATCH v2 12/13] migration/ram: Use offset_in_ramblock() in range checks, David Hildenbrand, 2020/02/21
- [PATCH v2 13/13] migration/ram: Tolerate partially changed mappings in postcopy code, David Hildenbrand, 2020/02/21
Re: [PATCH v2 00/13] migrate/ram: Fix resizing RAM blocks while migrating, Peter Xu, 2020/02/21
- Re: [PATCH v2 00/13] migrate/ram: Fix resizing RAM blocks while migrating, David Hildenbrand, 2020/02/24
- Re: [PATCH v2 00/13] migrate/ram: Fix resizing RAM blocks while migrating, Peter Xu, 2020/02/24
- Re: [PATCH v2 00/13] migrate/ram: Fix resizing RAM blocks while migrating, David Hildenbrand, 2020/02/24
- Re: [PATCH v2 00/13] migrate/ram: Fix resizing RAM blocks while migrating, David Hildenbrand, 2020/02/24
- Re: [PATCH v2 00/13] migrate/ram: Fix resizing RAM blocks while migrating, Peter Xu, 2020/02/24
- Re: [PATCH v2 00/13] migrate/ram: Fix resizing RAM blocks while migrating, David Hildenbrand, 2020/02/24
- Re: [PATCH v2 00/13] migrate/ram: Fix resizing RAM blocks while migrating, Peter Xu, 2020/02/24