qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v8 3/3] migration: add bitmap for received page


From: Alexey Perevalov
Subject: Re: [Qemu-devel] [PATCH v8 3/3] migration: add bitmap for received page
Date: Fri, 28 Jul 2017 18:29:20 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1

On 07/28/2017 10:06 AM, Alexey Perevalov wrote:
On 07/28/2017 09:57 AM, Peter Xu wrote:
On Fri, Jul 28, 2017 at 09:43:28AM +0300, Alexey Perevalov wrote:
On 07/28/2017 07:27 AM, Peter Xu wrote:
On Thu, Jul 27, 2017 at 10:27:41AM +0300, Alexey Perevalov wrote:
On 07/27/2017 05:35 AM, Peter Xu wrote:
On Wed, Jul 26, 2017 at 06:24:11PM +0300, Alexey Perevalov wrote:
On 07/26/2017 11:43 AM, Peter Xu wrote:
On Wed, Jul 26, 2017 at 11:07:17AM +0300, Alexey Perevalov wrote:
On 07/26/2017 04:49 AM, Peter Xu wrote:
On Thu, Jul 20, 2017 at 09:52:34AM +0300, Alexey Perevalov wrote:
This patch adds ability to track down already received
pages, it's necessary for calculation vCPU block time in
postcopy migration feature, maybe for restore after
postcopy migration failure.
Also it's necessary to solve shared memory issue in
postcopy livemigration. Information about received pages
will be transferred to the software virtual bridge
(e.g. OVS-VSWITCHD), to avoid fallocate (unmap) for
already received pages. fallocate syscall is required for
remmaped shared memory, due to remmaping itself blocks
ioctl(UFFDIO_COPY, ioctl in this case will end with EEXIT
error (struct page is exists after remmap).

Bitmap is placed into RAMBlock as another postcopy/precopy
related bitmaps.

Reviewed-by: Peter Xu <address@hidden>
Signed-off-by: Alexey Perevalov <address@hidden>
---
[...]

static int qemu_ufd_copy_ioctl(int userfault_fd, void *host_addr,
-        void *from_addr, uint64_t pagesize)
+ void *from_addr, uint64_t pagesize, RAMBlock *rb)
  {
+    int ret;
      if (from_addr) {
          struct uffdio_copy copy_struct;
          copy_struct.dst = (uint64_t)(uintptr_t)host_addr;
          copy_struct.src = (uint64_t)(uintptr_t)from_addr;
          copy_struct.len = pagesize;
          copy_struct.mode = 0;
-        return ioctl(userfault_fd, UFFDIO_COPY, &copy_struct);
+        ret = ioctl(userfault_fd, UFFDIO_COPY, &copy_struct);
      } else {
          struct uffdio_zeropage zero_struct;
zero_struct.range.start = (uint64_t)(uintptr_t)host_addr;
          zero_struct.range.len = pagesize;
          zero_struct.mode = 0;
- return ioctl(userfault_fd, UFFDIO_ZEROPAGE, &zero_struct); + ret = ioctl(userfault_fd, UFFDIO_ZEROPAGE, &zero_struct);
+    }
+    if (!ret) {
+        ramblock_recv_bitmap_set(host_addr, rb);
Wait...

Now we are using 4k-page/bit bitmap, do we need to take care of the huge pages here? Looks like we are only setting the first bit of it
if it is a huge page?
First version was per ramblock page size, IOW bitmap was smaller in
case of hugepages.
Yes, but this is not the first version any more. :)

This patch is using:

   bitmap_new(rb->max_length >> TARGET_PAGE_BITS);

to allocate bitmap, so it is using small pages always for bitmap,
right? (I should not really say "4k" pages, here I think the size is
host page size, which is the thing returned from getpagesize()).

You mentioned that TARGET_PAGE_SIZE is reasonable for precopy case, in "Re: [Qemu-devel] [PATCH v1 2/2] migration: add bitmap for copied page" I though TARGET_PAGE_SIZE as transmition unit, is using in precopy even
hugepage case.
But it's not so logically, page being marked as dirty, should be sent as a
whole page.
Sorry if I misunderstood, but I didn't see anything wrong - we are
sending pages in small pages, but when postcopy is there, we do
UFFDIO_COPY in huge page, so everything is fine?
I think yes, we chose TARGET_PAGE_SIZE because of wider
use case ranges.
So... are you going to post another version? IIUC we just need to use
a bitmap_set() to replace the ramblock_recv_bitmap_set(), while set
the size with "pagesize / TARGET_PAGE_SIZE"?
From my point of view TARGET_PAGE_SIZE/TARGET_PAGE_BITS it's a platform
specific

and it used in ram_load to copy to buffer so it's more preferred for bitmap size and I'm not going to replace ramblock_recv_bitmap_set helper - it calculates offset.

(I think I was wrong when saying getpagesize() above: the small page
should be target page size, while the huge page should be the host's)
I think we should forget about huge page case in "received bitmap"
concept, maybe in "uffd_copied bitmap" it was reasonable ;)
Again, I am not sure I got the whole idea of the reply...

However, I do think when we UFFDIO_COPY a huge page, then we should do
bitmap_set() on the received bitmap for the whole range that the huge
page covers.
for what purpose?
We chose to use small-paged bitmap since in precopy we need to have
such a granularity (in precopy, we can copy a small page even that
small page is on a host huge page).

Since we decided to use the small-paged bitmap, we need to make sure
it follows how it was defined: one bit defines whether the
corresponding small page is received. IMHO not following that is hacky
and error-prone.

IMHO, the bitmap is defined as "one bit per small page", and the small
page size is TARGET_PAGE_SIZE. We cannot just assume that "as long as
the first bit of the huge page is set, all the small pages in the huge
page are set".
At the moment of copying all small pages of the huge page,
should be received. Yes it's assumption, but I couldn't predict
side effect, maybe it will be necessary in postcopy failure handling,
while copying pages back, but I'm not sure right now.
To know that, need to start implementing it, or at least to deep
investigation.
Yes, postcopy failure handling is exactly one case where it can be
used. Of course with all the ramblock information we can re-construct
the real bitmap when the source received the bitmaps from destination.
However, why not we make it correct at the very beginning (especially
when it is quite easy to do so)?

(Actually, I asked since I am working on the RFC series of postcopy
  failure recovery. I will post RFCs soon)

Thanks,

Ok, I'll resend patchset today, all bits of the appropriate huge

page will set.


I saw you already included in you patch set

 migration: fix incorrect postcopy recved_bitmap


do you think, is it worth to include your patch,

of course with preserving authorship, into this patch set?


--
Best regards,
Alexey Perevalov



reply via email to

[Prev in Thread] Current Thread [Next in Thread]