From: Alexander Graf
Subject: Re: [Qemu-devel] [PATCH 36/39] memory: move bitmap synchronization to its own function
Date: Wed, 18 Dec 2013 09:54:42 +0100

On 06.11.2013, at 17:22, Juan Quintela <address@hidden> wrote:

> Paolo Bonzini <address@hidden> wrote:
>> On 06/11/2013 14:04, Juan Quintela wrote:
>>> We want to have all the functions that handle directly the dirty
>>> bitmap near.  We will change it later.
>> 
>> Please move it to exec.c instead.
>> 
>>> Signed-off-by: Juan Quintela <address@hidden>
>>> ---
>>> include/exec/memory-physical.h | 31 +++++++++++++++++++++++++++++++
>>> kvm-all.c                      | 27 ++-------------------------
>>> 2 files changed, 33 insertions(+), 25 deletions(-)
>>> 
>>> diff --git a/include/exec/memory-physical.h b/include/exec/memory-physical.h
>>> index 610c55f..72faf06 100644
>>> --- a/include/exec/memory-physical.h
>>> +++ b/include/exec/memory-physical.h
>>> @@ -72,6 +72,37 @@ static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
>>>     xen_modified_memory(start, length);
>>> }
>>> 
>>> +static inline void cpu_physical_memory_set_dirty_lebitmap(unsigned long *bitmap,
>> 
>> Why "le"?
> 
> little endian; the name is already too long (cpu_physical_memory_ is too
> big as a prefix).
> 
>> 
>> Paolo
>> 
>>> +                                                          ram_addr_t start,
>>> +                                                          ram_addr_t pages)
>>> +{
>>> +    unsigned int i, j;
>>> +    unsigned long page_number, c;
>>> +    hwaddr addr;
>>> +    ram_addr_t ram_addr;
>>> +    unsigned int len = (pages + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
>>> +    unsigned long hpratio = getpagesize() / TARGET_PAGE_SIZE;
>>> +
>>> +    /*
>>> +     * bitmap-traveling is faster than memory-traveling (for addr...)
>>> +     * especially when most of the memory is not dirty.
>>> +     */
>>> +    for (i = 0; i < len; i++) {
>>> +        if (bitmap[i] != 0) {
>>> +            c = leul_to_cpu(bitmap[i]);
> 
> this is the important part.  KVM maintains the bitmap in little-endian
> mode, and PPC, I assume, needs this.

Yes, there's a good chance you're running 32bit user space on a 64bit kernel on 
PPC. With big endian bitmaps that completely blows my mind, so I settled for 
little endian bitmaps back in the day (which make a lot more sense anyway) :).
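
(A side note on the layout, with a minimal standalone sketch -- hypothetical
helpers, not QEMU or kernel code: in a little-endian bitmap, bit n always
lives in byte n/8, so a 32-bit reader and a 64-bit writer agree on every bit
position. That is also why the patch converts each word with leul_to_cpu()
before scanning it with ffsl().)

#include <stdint.h>
#include <stdio.h>

/* Test/set bit n in a little-endian bitmap: the byte holding a given bit
 * is the same no matter what word size the accessor uses. */
static int le_bitmap_test(const uint8_t *bitmap, unsigned long n)
{
    return (bitmap[n / 8] >> (n % 8)) & 1;
}

static void le_bitmap_set(uint8_t *bitmap, unsigned long n)
{
    bitmap[n / 8] |= (uint8_t)(1u << (n % 8));
}

int main(void)
{
    uint8_t bitmap[16] = { 0 };

    le_bitmap_set(bitmap, 70);   /* byte 8, bit 6 -- same place for everyone */
    printf("%d %d\n", le_bitmap_test(bitmap, 70), le_bitmap_test(bitmap, 71));
    return 0;
}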

> 
> 
>>> +            do {
>>> +                j = ffsl(c) - 1;
>>> +                c &= ~(1ul << j);
>>> +                page_number = (i * HOST_LONG_BITS + j) * hpratio;
>>> +                addr = page_number * TARGET_PAGE_SIZE;
>>> +                ram_addr = start + addr;
>>> +                cpu_physical_memory_set_dirty_range(ram_addr,
>>> +                                                    TARGET_PAGE_SIZE * hpratio);
> 
> I checked, and it appears that ppc really uses the hpratio thing.  I
> hope that the bitmap optimization also works for ppc, but its
> architecture is weird^Winteresting O:-)

Well, it's a ratio between host and target page size.

TARGET_PAGE_SIZE defines the smallest chunk that the MMU can possibly split 
things down to. The host page size, however, is the smallest chunk that the 
Linux page allocator can give us.
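
For example, with an assumed 64k-page PPC kernel and QEMU's 4k
TARGET_PAGE_SIZE, hpratio = getpagesize() / TARGET_PAGE_SIZE = 65536 / 4096 =
16, so every dirty bit the kernel hands us covers 16 target pages.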

That's a very important distinction. All server PPC hardware can use 4k page 
sizes, but you can configure the kernel to use 64k pages on 64bit. For older 
hardware, that means we're using the full 16 HTAB group entries for a single 
page (16 * 4k = 64k), which gives measurable speedups. For newer hardware we 
can even have real 64k pages.

So while QEMU always needs to keep the guest's view of the page table world at 
4k to stay compatible with what the guest is trying to do, the host kernel may 
itself be configured to use 64k pages.

This is not really PPC specific. AArch64 has a similar thing with native 
support for 64k page size. There are a few x86 cores with 64k page size too, 
but I don't think any of them run Linux :). Either way, x86 as we have it today 
is really a special case with TARGET_PAGE_SIZE == host_page_size.
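
To make the hpratio fan-out concrete, here is a minimal standalone sketch
(assumed page sizes and a stand-in mark_dirty(); not the QEMU implementation)
of what the patch does for each set bit in the kernel's dirty bitmap:

#include <stdint.h>
#include <stdio.h>

#define TARGET_PAGE_SIZE 4096ULL    /* guest-visible page size */
#define HOST_PAGE_SIZE   65536ULL   /* assumed 64k kernel pages */

/* Stand-in for QEMU's cpu_physical_memory_set_dirty_range(). */
static void mark_dirty(uint64_t ram_addr, uint64_t len)
{
    printf("dirty: 0x%llx + 0x%llx\n",
           (unsigned long long)ram_addr, (unsigned long long)len);
}

int main(void)
{
    uint64_t hpratio = HOST_PAGE_SIZE / TARGET_PAGE_SIZE;  /* 16 */
    uint64_t host_page = 3;   /* bit 3 set in the kernel's dirty bitmap */
    uint64_t start = 0;       /* base ram_addr of the region */

    /* Host page n covers target pages [n * hpratio, (n + 1) * hpratio),
     * so one kernel bit becomes one ranged dirty marking spanning
     * hpratio target pages. */
    mark_dirty(start + host_page * hpratio * TARGET_PAGE_SIZE,
               TARGET_PAGE_SIZE * hpratio);
    return 0;
}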


Alex



