Re: [Qemu-devel] [RfC PATCH 0/4] make display updates thread safe.
From: Mark Cave-Ayland
Subject: Re: [Qemu-devel] [RfC PATCH 0/4] make display updates thread safe.
Date: Mon, 3 Apr 2017 16:06:46 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Icedove/45.6.0
On 03/04/17 13:42, Gerd Hoffmann wrote:
> On Mo, 2017-04-03 at 13:24 +0100, Mark Cave-Ayland wrote:
>> On 03/04/17 13:03, Gerd Hoffmann wrote:
>>
>>> Hi,
>>>
>>>> I checked the branch, is bitmap_copy_and_clear_atomic correct when you
>>>> have partial updates? Maybe you need to handle partial updates of the
>>>> first and last word?
>>>
>>> Should not be a problem. We might clear some more bits, but these are
>>> outside the visible area, so they should not cause visible corruption
>>> (and if the visible area changes the display code needs to do a full
>>> refresh anyway).
>>
>> Right. Also is there a reason that
>> cpu_physical_memory_snapshot_and_clear_dirty() uses BITS_PER_LEVEL when
>> calculating the alignment used for the first/last addresses?
>
> The copy code doesn't operate on bits but on bitmap longs, so I can just
> copy whole longs instead of fiddling with individual bits. So I round
> down to the start of a bitmap long; on 64-bit hosts those are units of
> 64 pages.
>
> Actually querying the copied bitmap operates on page granularity of
> course.
>
>> Otherwise you end up expanding the range of your last address
>> considerably beyond the next page alignment.
>
> Yes, it's larger, but as it expands to the start of the long word I
> expect the penalty is almost zero (same cache line) and being able to
> just copy the whole long instead of dealing with individual bits should
> be a win.
>
>> And also in the main page
>> loop why is BITS_PER_LEVEL used? I see that several of the internal
>> bitmap routines appear to use BITS_PER_LONG for compressing into the
>> bitmap which might be more appropriate?
>
> shift by BITS_PER_LEVEL is the same as division by BITS_PER_LONG
Right, hopefully I understand this now. The following diff appears to fix
CG3/TCX for me:
diff --git a/exec.c b/exec.c
index 000af18..66bdcf4 100644
--- a/exec.c
+++ b/exec.c
@@ -1071,7 +1071,7 @@ DirtyBitmapSnapshot *cpu_physical_memory_snapshot_and_clear_dirty
     (ram_addr_t start, ram_addr_t length, unsigned client)
 {
     DirtyMemoryBlocks *blocks;
-    unsigned long align = 1 << (TARGET_PAGE_BITS + BITS_PER_LEVEL);
+    unsigned long align = 1UL << (TARGET_PAGE_BITS + BITS_PER_LEVEL);
     ram_addr_t first = QEMU_ALIGN_DOWN(start, align);
     ram_addr_t last = QEMU_ALIGN_UP(start + length, align);
     DirtyBitmapSnapshot *snap;
@@ -1097,7 +1097,6 @@ DirtyBitmapSnapshot *cpu_physical_memory_snapshot_and_clear_dirty
         assert(QEMU_IS_ALIGNED(offset, (1 << BITS_PER_LEVEL)));
         assert(QEMU_IS_ALIGNED(num, (1 << BITS_PER_LEVEL)));
-        offset >>= BITS_PER_LEVEL;
         bitmap_copy_and_clear_atomic(snap->dirty + dest,
                                      blocks->blocks[idx] + (offset >> BITS_PER_LEVEL),
There were two issues here: without the UL suffix on align I was getting
incorrect first/last addresses, since the high bits of align weren't
being cleared; and offset was being shifted right twice.
Does this look reasonable to you?
ATB,
Mark.