From: alvise rigo
Subject: Re: [Qemu-devel] [RFC v4 1/9] exec.c: Add new exclusive bitmap to ram_list
Date: Wed, 12 Aug 2015 16:04:40 +0200

On Wed, Aug 12, 2015 at 2:36 PM, Paolo Bonzini <address@hidden> wrote:
>
>
> On 12/08/2015 09:31, alvise rigo wrote:
>> I think that tlb_flush_entry is not enough, since in theory another
>> vCPU could have a different TLB address referring to the same phys
>> address.
>
> You're right, this is a TLB so it's virtually-indexed. :( I'm not sure
> what happens on ARM, since it has a virtually indexed (VIVT or VIPT)
> cache, but indeed it would be a problem when implementing e.g. CMPXCHG
> using the TCG ll/sc ops.
>
> I'm a bit worried about adding such a big bitmap. It's only used on
> TCG, but it is also allocated on KVM and on KVM you can have hundreds
> of VCPUs. Wasting 200 bits per guest memory page (i.e. ~0.6% of guest
> memory) is obviously not a great idea. :(
I agree, it's a waste of memory.
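To put numbers on it, here is a minimal sketch of the arithmetic. It assumes 4 KiB guest pages and one bit per vCPU per page for the bitmap; the one-byte-per-page figure is included only for comparison. The names are illustrative and not taken from the patches.

```python
PAGE_SIZE = 4096  # bytes; assumed 4 KiB guest page


def overhead_fraction(bits_per_page, page_size=PAGE_SIZE):
    """Fraction of guest memory consumed by per-page tracking metadata."""
    return bits_per_page / (page_size * 8)


# One bit per vCPU per page, with 200 vCPUs.
bitmap_200_vcpus = overhead_fraction(200)
# A single byte per page, independent of the vCPU count.
byte_per_page = overhead_fraction(8)

print(f"bitmap, 200 vCPUs: {bitmap_200_vcpus:.2%}")
print(f"one byte per page: {byte_per_page:.3%}")
```

The bitmap cost grows linearly with the number of vCPUs, while a per-page byte stays constant.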
>
> Perhaps we can use a bytemap instead:
>
> - 0..253 = TLB_EXCL must be set in all VCPUs except CPU n. A VCPU that
> loads the TLB for this vaddr does not have to set it.
>
> - 254 = TLB_EXCL must be set in all VCPUs. A VCPU that
> loads the TLB for this vaddr has to set it.
>
> - 255 = TLB_EXCL not set in at least two VCPUs
>
> Transitions:
>
> - ll transitions: anything -> 254
>
> - sc transitions: 254 -> current CPU_ID
>
> - TLB_EXCL store transitions: 254 -> current CPU_ID
>
> - tlb_set_page transitions: CPU_ID other than current -> 255
>
> The initial value is 255 on SMP guests, 0 on UP guests.
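The encoding above can be read back with a small helper. This is only a sketch: `ALL_EXCL`, `MULTI_NON_EXCL`, `describe` and `initial_value` are hypothetical names chosen for illustration, not identifiers from QEMU.

```python
ALL_EXCL = 254        # TLB_EXCL must be set in all vCPUs
MULTI_NON_EXCL = 255  # TLB_EXCL not set in at least two vCPUs


def describe(value):
    """Return the meaning of one bytemap entry under the proposed encoding."""
    if not 0 <= value <= 255:
        raise ValueError("bytemap entries are single bytes")
    if value == MULTI_NON_EXCL:
        return "TLB_EXCL not set in at least two vCPUs"
    if value == ALL_EXCL:
        return "TLB_EXCL set in all vCPUs"
    # 0..253: exactly one vCPU is exempt from TLB_EXCL.
    return f"TLB_EXCL set in all vCPUs except CPU {value}"


def initial_value(smp):
    """Initial entry: 255 on SMP guests, 0 on UP guests."""
    return MULTI_NON_EXCL if smp else 0


for v in (0, 7, ALL_EXCL, MULTI_NON_EXCL):
    print(v, "->", describe(v))
```

Since only 0..253 can name a CPU, this encoding can single out at most 254 distinct vCPUs.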
>
> The algorithms are very similar to yours, just using this approximate
> representation.
>
> ll algorithm:
>    llsc_value = bytemap[vaddr]
>    if llsc_value == CPU_ID
>       do nothing
>    elseif llsc_value < 254
>       flush TLB of CPU llsc_value
>    elseif llsc_value == 255
>       flush all TLBs
>    set TLB_EXCL
>    bytemap[vaddr] = 254
>    load
>
> tlb_set_page algorithm:
>    llsc_value = bytemap[vaddr]
>    if llsc_value == CPU_ID or llsc_value == 255
>       do nothing
>    else if llsc_value == 254
>       set TLB_EXCL
>    else
>       # two CPUs without TLB_EXCL
>       bytemap[vaddr] = 255
>
> TLB_EXCL slow path algorithm:
>    if bytemap[vaddr] == 254
>       bytemap[vaddr] = CPU_ID
>    else
>       # two CPUs without TLB_EXCL
>       bytemap[vaddr] = 255
>    clear TLB_EXCL in this CPU
>    store
>
> sc algorithm:
>    if bytemap[vaddr] == CPU_ID or bytemap[vaddr] == 254
>       bytemap[vaddr] = CPU_ID
>       clear TLB_EXCL in this CPU
>       store
>       succeed
>    else
>       fail
>
> clear algorithm:
>    if bytemap[vaddr] == 254
>       bytemap[vaddr] = CPU_ID
Isn't this also required for the clear algorithm?
   if bytemap[vaddr] < 254
      /* this can happen as an effect of the TLB_EXCL slow path */
      bytemap[vaddr] = 255
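To check how the transitions interact, the algorithms quoted above can be modeled as a small state machine. This is a rough sketch, not QEMU code: it tracks only the bytemap values, records TLB flushes as events instead of performing them, and indexes the bytemap by a page number. `ExclBytemap` and all method names are hypothetical, and the extra clear branch is optional so both variants can be compared.

```python
ALL_EXCL = 254        # TLB_EXCL must be set in all vCPUs
MULTI_NON_EXCL = 255  # TLB_EXCL not set in at least two vCPUs


class ExclBytemap:
    def __init__(self, num_cpus, num_pages):
        init = MULTI_NON_EXCL if num_cpus > 1 else 0  # 255 on SMP, 0 on UP
        self.bytemap = [init] * num_pages
        self.flushes = []  # recorded flush requests, not real TLB flushes

    def ll(self, cpu, page):
        v = self.bytemap[page]
        if v == cpu:
            pass                            # do nothing
        elif v < ALL_EXCL:
            self.flushes.append(("cpu", v))  # flush TLB of CPU v
        elif v == MULTI_NON_EXCL:
            self.flushes.append(("all", None))  # flush all TLBs
        self.bytemap[page] = ALL_EXCL       # anything -> 254

    def tlb_set_page(self, cpu, page):
        """Return True if the new TLB entry must carry TLB_EXCL."""
        v = self.bytemap[page]
        if v == cpu or v == MULTI_NON_EXCL:
            return False
        if v == ALL_EXCL:
            return True
        self.bytemap[page] = MULTI_NON_EXCL  # two CPUs without TLB_EXCL
        return False

    def excl_slow_path_store(self, cpu, page):
        if self.bytemap[page] == ALL_EXCL:
            self.bytemap[page] = cpu
        else:
            self.bytemap[page] = MULTI_NON_EXCL

    def sc(self, cpu, page):
        """Return True if the store-conditional succeeds."""
        if self.bytemap[page] in (cpu, ALL_EXCL):
            self.bytemap[page] = cpu
            return True
        return False

    def clear(self, cpu, page, with_extra_branch=False):
        v = self.bytemap[page]
        if v == ALL_EXCL:
            self.bytemap[page] = cpu
        elif with_extra_branch and v < ALL_EXCL:
            self.bytemap[page] = MULTI_NON_EXCL  # branch proposed in the reply


m = ExclBytemap(num_cpus=2, num_pages=1)
m.ll(0, 0)                    # CPU 0 starts an LL/SC sequence
m.excl_slow_path_store(1, 0)  # CPU 1 stores to the same page
print("sc by CPU 0 succeeds:", m.sc(0, 0))
```

In this model a store from another vCPU between the LL and the SC moves the entry away from 254, so the SC fails as expected.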
The whole idea makes sense, I will consider it for the next iteration
of the patches.
Thanks,
alvise
>
> The UP case is optimized because bytemap[vaddr] will always be 0 or 254.
>
> The algorithm requires the LL to be cleared e.g. on exceptions.
> Paolo
>
>> alvise
>>
>> On Tue, Aug 11, 2015 at 6:32 PM, Paolo Bonzini <address@hidden> wrote:
>>>
>>>
>>> On 11/08/2015 18:11, alvise rigo wrote:
>>>>>> Why flush the entire cache (I understand you mean TLB)?
>>>> Sorry, I meant the TLB.
>>>> If for each removal of an exclusive entry we also set the bit to 1, we
>>>> force the following LL to make a tlb_flush() on every vCPU.
>>>
>>> What if you only flush one entry with tlb_flush_entry (on every vCPU)?
>>>
>>> Paolo