Re: [Qemu-devel] [PATCH v8 3/3] migration: add bitmap for received page


From: Peter Xu
Subject: Re: [Qemu-devel] [PATCH v8 3/3] migration: add bitmap for received page
Date: Mon, 31 Jul 2017 09:33:27 +0800
User-agent: Mutt/1.5.24 (2015-08-30)

On Fri, Jul 28, 2017 at 06:29:20PM +0300, Alexey Perevalov wrote:
> On 07/28/2017 10:06 AM, Alexey Perevalov wrote:
> >On 07/28/2017 09:57 AM, Peter Xu wrote:
> >>On Fri, Jul 28, 2017 at 09:43:28AM +0300, Alexey Perevalov wrote:
> >>>On 07/28/2017 07:27 AM, Peter Xu wrote:
> >>>>On Thu, Jul 27, 2017 at 10:27:41AM +0300, Alexey Perevalov wrote:
> >>>>>On 07/27/2017 05:35 AM, Peter Xu wrote:
> >>>>>>On Wed, Jul 26, 2017 at 06:24:11PM +0300, Alexey Perevalov wrote:
> >>>>>>>On 07/26/2017 11:43 AM, Peter Xu wrote:
> >>>>>>>>On Wed, Jul 26, 2017 at 11:07:17AM +0300, Alexey Perevalov wrote:
> >>>>>>>>>On 07/26/2017 04:49 AM, Peter Xu wrote:
> >>>>>>>>>>On Thu, Jul 20, 2017 at 09:52:34AM +0300, Alexey
> >>>>>>>>>>Perevalov wrote:
> >>>>>>>>>>>This patch adds the ability to track down already received
> >>>>>>>>>>>pages; it's necessary for calculating vCPU block time in the
> >>>>>>>>>>>postcopy migration feature, and maybe for restore after
> >>>>>>>>>>>postcopy migration failure.
> >>>>>>>>>>>It's also necessary for solving the shared memory issue in
> >>>>>>>>>>>postcopy live migration. Information about received pages
> >>>>>>>>>>>will be transferred to the software virtual bridge
> >>>>>>>>>>>(e.g. OVS-VSWITCHD), to avoid fallocate (unmap) for
> >>>>>>>>>>>already received pages. The fallocate syscall is required for
> >>>>>>>>>>>remapped shared memory, because remapping itself blocks
> >>>>>>>>>>>ioctl(UFFDIO_COPY); the ioctl in this case will fail with an
> >>>>>>>>>>>EEXIST error (the struct page still exists after the remap).
> >>>>>>>>>>>
> >>>>>>>>>>>The bitmap is placed into RAMBlock like the other
> >>>>>>>>>>>postcopy/precopy related bitmaps.
> >>>>>>>>>>>
> >>>>>>>>>>>Reviewed-by: Peter Xu <address@hidden>
> >>>>>>>>>>>Signed-off-by: Alexey Perevalov <address@hidden>
> >>>>>>>>>>>---
> >>>>>>>>>>[...]
> >>>>>>>>>>
> >>>>>>>>>>>  static int qemu_ufd_copy_ioctl(int userfault_fd, void *host_addr,
> >>>>>>>>>>>-        void *from_addr, uint64_t pagesize)
> >>>>>>>>>>>+                               void *from_addr, uint64_t pagesize, RAMBlock *rb)
> >>>>>>>>>>>  {
> >>>>>>>>>>>+    int ret;
> >>>>>>>>>>>      if (from_addr) {
> >>>>>>>>>>>          struct uffdio_copy copy_struct;
> >>>>>>>>>>>          copy_struct.dst = (uint64_t)(uintptr_t)host_addr;
> >>>>>>>>>>>          copy_struct.src = (uint64_t)(uintptr_t)from_addr;
> >>>>>>>>>>>          copy_struct.len = pagesize;
> >>>>>>>>>>>          copy_struct.mode = 0;
> >>>>>>>>>>>-        return ioctl(userfault_fd, UFFDIO_COPY, &copy_struct);
> >>>>>>>>>>>+        ret = ioctl(userfault_fd, UFFDIO_COPY, &copy_struct);
> >>>>>>>>>>>      } else {
> >>>>>>>>>>>          struct uffdio_zeropage zero_struct;
> >>>>>>>>>>>          zero_struct.range.start = (uint64_t)(uintptr_t)host_addr;
> >>>>>>>>>>>          zero_struct.range.len = pagesize;
> >>>>>>>>>>>          zero_struct.mode = 0;
> >>>>>>>>>>>-        return ioctl(userfault_fd, UFFDIO_ZEROPAGE, &zero_struct);
> >>>>>>>>>>>+        ret = ioctl(userfault_fd, UFFDIO_ZEROPAGE, &zero_struct);
> >>>>>>>>>>>+    }
> >>>>>>>>>>>+    if (!ret) {
> >>>>>>>>>>>+        ramblock_recv_bitmap_set(host_addr, rb);
> >>>>>>>>>>Wait...
> >>>>>>>>>>
> >>>>>>>>>>Now that we are using a 4k-page/bit bitmap, do we need to take
> >>>>>>>>>>care of huge pages here?  Looks like we are only setting the
> >>>>>>>>>>first bit of it if it is a huge page?
> >>>>>>>>>The first version was per ramblock page size, IOW the bitmap was
> >>>>>>>>>smaller in the case of hugepages.
> >>>>>>>>Yes, but this is not the first version any more. :)
> >>>>>>>>
> >>>>>>>>This patch is using:
> >>>>>>>>
> >>>>>>>>   bitmap_new(rb->max_length >> TARGET_PAGE_BITS);
> >>>>>>>>
> >>>>>>>>to allocate the bitmap, so it always uses small pages for the bitmap,
> >>>>>>>>right? (I should not really say "4k" pages; here I think the size is
> >>>>>>>>the host page size, which is what getpagesize() returns.)
> >>>>>>>>
> >>>>>>>>>You mentioned that TARGET_PAGE_SIZE is reasonable for the precopy
> >>>>>>>>>case, in "Re: [Qemu-devel] [PATCH v1 2/2] migration: add bitmap
> >>>>>>>>>for copied page".
> >>>>>>>>>I thought TARGET_PAGE_SIZE, as the transmission unit, is used in
> >>>>>>>>>precopy even in the hugepage case.
> >>>>>>>>>But that is not logical: a page marked as dirty should be sent as
> >>>>>>>>>a whole page.
> >>>>>>>>Sorry if I misunderstood, but I didn't see anything wrong - we are
> >>>>>>>>sending pages as small pages, but when postcopy is there, we do
> >>>>>>>>UFFDIO_COPY on a whole huge page, so everything is fine?
> >>>>>>>I think yes, we chose TARGET_PAGE_SIZE because it covers a wider
> >>>>>>>range of use cases.
> >>>>>>So... are you going to post another version? IIUC we just need to
> >>>>>>use a bitmap_set() to replace the ramblock_recv_bitmap_set(), while
> >>>>>>setting the size with "pagesize / TARGET_PAGE_SIZE"?
> >>>>> From my point of view, TARGET_PAGE_SIZE/TARGET_PAGE_BITS is
> >>>>>platform specific, and it is used in ram_load to copy into the
> >>>>>buffer, so it is preferable as the bitmap granularity.
> >>>>>And I'm not going to replace the ramblock_recv_bitmap_set helper -
> >>>>>it calculates the offset.
> >>>>>
> >>>>>>(I think I was wrong when saying getpagesize() above: the small page
> >>>>>>size should be the target page size, while the huge page size should
> >>>>>>be the host's.)
> >>>>>I think we should forget about the huge page case in the "received
> >>>>>bitmap" concept; maybe in the "uffd_copied bitmap" it was reasonable ;)
> >>>>Again, I am not sure I got the whole idea of the reply...
> >>>>
> >>>>However, I do think that when we UFFDIO_COPY a huge page, we should do
> >>>>bitmap_set() on the received bitmap for the whole range that the huge
> >>>>page covers.
> >>>for what purpose?
> >>We chose to use a small-paged bitmap since in precopy we need to have
> >>such a granularity (in precopy, we can copy a small page even if that
> >>small page is on a host huge page).
> >>
> >>Since we decided to use the small-paged bitmap, we need to make sure
> >>it follows how it was defined: one bit defines whether the
> >>corresponding small page is received. IMHO not following that is hacky
> >>and error-prone.
> >>
> >>>>IMHO, the bitmap is defined as "one bit per small page", and the small
> >>>>page size is TARGET_PAGE_SIZE. We cannot just assume that "as long as
> >>>>the first bit of the huge page is set, all the small pages in the huge
> >>>>page are set".
> >>>At the moment of copying, all small pages of the huge page
> >>>should be received. Yes, it's an assumption, but I couldn't predict
> >>>the side effects; maybe it will be necessary in postcopy failure
> >>>handling, while copying pages back, but I'm not sure right now.
> >>>To know that, we need to start implementing it, or at least
> >>>investigate it deeply.
> >>Yes, postcopy failure handling is exactly one case where it can be
> >>used. Of course, with all the ramblock information we can reconstruct
> >>the real bitmap once the source receives the bitmaps from the
> >>destination. However, why not make it correct from the very beginning
> >>(especially when it is quite easy to do so)?
> >>
> >>(Actually, I asked since I am working on the RFC series of postcopy
> >>failure recovery. I will post RFCs soon)
> >>
> >>Thanks,
> >>
> >Ok, I'll resend the patchset today; all bits of the appropriate huge
> >page will be set.
> >
> I saw you already included in your patch set
> 
>  migration: fix incorrect postcopy recved_bitmap
> 
> do you think it is worth including your patch,
> of course with the authorship preserved, into this patch set?

I think we'd better squash that patch into yours (considering that the
current patch hasn't been merged yet), since I see that patch not as an
enhancement but as a correction of this one. Or do you have a better way
to write it? I didn't really think too much about it, just made sure it
can work well with the recovery RFC series.
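
To be concrete, here is a rough sketch of the kind of change I have in
mind (the helper and field names below are only illustrative, not
necessarily what the final patch should use):

    /* Mark every TARGET_PAGE_SIZE page covered by a successfully
     * copied range as received, not only the first one.
     * (recv_bitmap_set_range and rb->receivedmap are illustrative
     * names, not necessarily the final ones.) */
    static void recv_bitmap_set_range(RAMBlock *rb, void *host_addr,
                                      uint64_t pagesize)
    {
        size_t start = ((uintptr_t)host_addr - (uintptr_t)rb->host)
                       >> TARGET_PAGE_BITS;
        size_t nr = pagesize >> TARGET_PAGE_BITS;

        bitmap_set(rb->receivedmap, start, nr);
    }

Then qemu_ufd_copy_ioctl() would call it with the full huge page size
after a successful UFFDIO_COPY/UFFDIO_ZEROPAGE, so that every small
page covered by the huge page gets marked in the received bitmap.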

Please don't worry about the authorship, just squash it if you like - I
am totally fine with you seeing that patch as "a comment", but in patch
format. :-)

--
Peter Xu


