Re: [Qemu-devel] [RfC PATCH 0/4] make display updates thread safe.


From: Mark Cave-Ayland
Subject: Re: [Qemu-devel] [RfC PATCH 0/4] make display updates thread safe.
Date: Mon, 3 Apr 2017 16:06:46 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Icedove/45.6.0

On 03/04/17 13:42, Gerd Hoffmann wrote:

> On Mo, 2017-04-03 at 13:24 +0100, Mark Cave-Ayland wrote:
>> On 03/04/17 13:03, Gerd Hoffmann wrote:
>>
>>>   Hi,
>>>
>>>> I checked the branch; is bitmap_copy_and_clear_atomic correct when you
>>>> have partial updates?  Maybe you need to handle partial updates of the
>>>> first and last word?
>>>
>>> Should not be a problem.  We might clear some more bits, but these are
>>> outside the visible area so they shouldn't cause visible corruption (and if
>>> the visible area changes the display code needs to do a full refresh
>>> anyway).
>>
>> Right. Also is there a reason that
>> cpu_physical_memory_snapshot_and_clear_dirty() uses BITS_PER_LEVEL when
>> calculating the alignment used for the first/last addresses?
> 
> The copy code doesn't operate on bits but on bitmap longs, so I can just
> copy the whole long instead of fiddling with bits.  So I round down to
> a bitmap long boundary.  On 64-bit hosts those are units of 64 pages.
> 
> Actually, querying the copied bitmap operates at page granularity, of
> course.
> 
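(Just to check my own arithmetic: on a 64-bit host BITS_PER_LEVEL is 6, so
with 4 KiB target pages -- an assumption on my part -- align works out to
1UL << (12 + 6), i.e. 256 KiB, which is exactly the 64 pages covered by one
bitmap long.)
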
>> Otherwise you end up expanding the range of your last address
>> considerably beyond the next page alignment.
> 
> Yes, it's larger, but as it expands to the start of the long word I
> expect the penalty to be almost zero (same cache line), and being able to
> just copy the whole long instead of dealing with individual bits should
> be a win.
> 
>> And also, in the main page
>> loop, why is BITS_PER_LEVEL used? I see that several of the internal
>> bitmap routines appear to use BITS_PER_LONG for compressing into the
>> bitmap, which might be more appropriate?
> 
> Shift by BITS_PER_LEVEL is the same as division by BITS_PER_LONG.
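(Concretely, with BITS_PER_LONG == 64 that shift is 6, so e.g. page index 200
lands in long 200 >> 6 == 200 / 64 == 3 -- the same word either way, if I
have that right.)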

Right, hopefully I understand this now. The following diff appears to fix
CG3/TCX for me:

diff --git a/exec.c b/exec.c
index 000af18..66bdcf4 100644
--- a/exec.c
+++ b/exec.c
@@ -1071,7 +1071,7 @@ DirtyBitmapSnapshot *cpu_physical_memory_snapshot_and_clear_dirty
     (ram_addr_t start, ram_addr_t length, unsigned client)
 {
     DirtyMemoryBlocks *blocks;
-    unsigned long align = 1 << (TARGET_PAGE_BITS + BITS_PER_LEVEL);
+    unsigned long align = 1UL << (TARGET_PAGE_BITS + BITS_PER_LEVEL);
     ram_addr_t first = QEMU_ALIGN_DOWN(start, align);
     ram_addr_t last  = QEMU_ALIGN_UP(start + length, align);
     DirtyBitmapSnapshot *snap;
@@ -1097,7 +1097,6 @@ DirtyBitmapSnapshot *cpu_physical_memory_snapshot_and_clear_dirty

         assert(QEMU_IS_ALIGNED(offset, (1 << BITS_PER_LEVEL)));
         assert(QEMU_IS_ALIGNED(num,    (1 << BITS_PER_LEVEL)));
-        offset >>= BITS_PER_LEVEL;

         bitmap_copy_and_clear_atomic(snap->dirty + dest,
                                      blocks->blocks[idx] + (offset >> BITS_PER_LEVEL),


There were two issues here: without the UL suffix on align I was getting
incorrect first/last addresses, since the high bits of align weren't being
cleared; and offset appeared to be shifted twice.
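
For what it's worth, a tiny standalone program (not QEMU code, and using a
made-up shift count of 31 rather than the one in exec.c) that shows why the
1UL spelling is the safer one when the result lands in an unsigned long:

#include <stdio.h>

int main(void)
{
    /* "1 << 31" is computed as int: it overflows (strictly undefined
     * behaviour), in practice yields INT_MIN, and then sign-extends when
     * converted to unsigned long, setting all of the high bits. */
    unsigned long a = 1 << 31;

    /* "1UL << 31" is done in unsigned long from the start. */
    unsigned long b = 1UL << 31;

    printf("a = 0x%lx\nb = 0x%lx\n", a, b);
    return 0;
}

On a 64-bit host this prints a = 0xffffffff80000000 but b = 0x80000000.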

Does this look reasonable to you?


ATB,

Mark.



