qemu-devel

RE: RFC: Split EPT huge pages in advance of dirty logging


From: Zhoujian (jay)
Subject: RE: RFC: Split EPT huge pages in advance of dirty logging
Date: Mon, 24 Feb 2020 01:07:47 +0000


> -----Original Message-----
> From: Peter Feiner [mailto:address@hidden]
> Sent: Saturday, February 22, 2020 8:19 AM
> To: Junaid Shahid <address@hidden>
> Cc: Ben Gardon <address@hidden>; Zhoujian (jay)
> <address@hidden>; Peter Xu <address@hidden>;
> address@hidden; address@hidden; address@hidden;
> address@hidden; address@hidden; Liujinsong (Paul)
> <address@hidden>; linfeng (M) <address@hidden>; wangxin (U)
> <address@hidden>; Huangweidong (C)
> <address@hidden>
> Subject: Re: RFC: Split EPT huge pages in advance of dirty logging
> 
> On Fri, Feb 21, 2020 at 2:08 PM Junaid Shahid <address@hidden> wrote:
> >
> > On 2/20/20 9:34 AM, Ben Gardon wrote:
> > >
> > > FWIW, we currently do this eager splitting at Google for live
> > > migration. When the log-dirty-memory flag is set on a memslot we
> > > eagerly split all pages in the slot down to 4k granularity.
> > > As Jay said, this does not cause crippling lock contention because
> > > the vCPU page faults generated by write protection / splitting can
> > > be resolved in the fast page fault path without acquiring the MMU lock.
> > > I believe +Junaid Shahid tried to upstream this approach at some
> > > point in the past, but the patch set didn't make it in. (This was
> > > before my time, so I'm hoping he has a link.) I haven't done the
> > > analysis to know if eager splitting is more or less efficient with
> > > parallel slow-path page faults, but it's definitely faster under the
> > > MMU lock.
> > >
> >
> > I am not sure if we ever posted those patches upstream. Peter Feiner
> > would know for sure. One notable difference in what we do compared to
> > the approach outlined by Jay is that we don't rely on tdp_page_fault()
> > to do the splitting. So we don't have to create a dummy VCPU and the
> > specialized split function is also much faster.

I'm curious about how you implemented this, especially since you mentioned
that it is much faster without a dummy VCPU.
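
For readers following the thread, the eager-splitting idea Ben describes
earlier can be illustrated with a toy model. This is only a sketch under
stated assumptions: it is not KVM code and not the unposted Google patches,
and every name in it (Mapping, Memslot, enable_dirty_logging, write_fault)
is hypothetical. It models a memslot as a list of mappings, splits each 2M
mapping into 512 write-protected 4k mappings when dirty logging is turned
on, and then resolves a write fault by only flipping a writable bit, which
is the kind of work a lock-free fast path could do.

```python
# Toy model only: not KVM code. All names here are hypothetical.
from dataclasses import dataclass, field

PAGE_4K = 4 * 1024
PAGE_2M = 2 * 1024 * 1024
PTES_PER_TABLE = PAGE_2M // PAGE_4K  # 512 4k pages per 2M huge page


@dataclass
class Mapping:
    gfn: int               # guest frame number of the first 4k page covered
    size: int              # PAGE_4K or PAGE_2M
    writable: bool = True


@dataclass
class Memslot:
    mappings: list = field(default_factory=list)
    log_dirty: bool = False


def split_huge_mapping(m):
    """Replace one 2M mapping with 512 write-protected 4k mappings."""
    return [Mapping(gfn=m.gfn + i, size=PAGE_4K, writable=False)
            for i in range(PTES_PER_TABLE)]


def enable_dirty_logging(slot):
    """Eagerly split every huge mapping when dirty logging is turned on."""
    out = []
    for m in slot.mappings:
        if m.size == PAGE_2M:
            out.extend(split_huge_mapping(m))
        else:
            # Already 4k: just write-protect it so writes can be tracked.
            out.append(Mapping(m.gfn, m.size, writable=False))
    slot.mappings = out
    slot.log_dirty = True


def write_fault(slot, gfn, dirty):
    """Model of a fast path: flip the writable bit and record the gfn."""
    for m in slot.mappings:
        if m.gfn == gfn and m.size == PAGE_4K:
            m.writable = True
            dirty.add(gfn)
            return True
    # No 4k mapping found (e.g. a huge page is still present): in this
    # model that stands in for falling back to a slow path.
    return False
```

Because the split happens up front, no huge mapping is left in the slot
when the guest starts writing, so in this model write_fault() never has to
take the fallback branch for a mapped gfn.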

> We've been carrying these patches since 2015. I've never posted them.
> Getting them in shape for upstream consumption will take some work. I can
> look into this next week.

It would be great if you could post them upstream.

Regards,
Jay Zhou

> 
> Peter
