Re: [Qemu-devel] [RFC v5 2/6] softmmu: Add new TLB_EXCL flag
From: alvise rigo
Subject: Re: [Qemu-devel] [RFC v5 2/6] softmmu: Add new TLB_EXCL flag
Date: Wed, 30 Sep 2015 11:24:33 +0200
On Wed, Sep 30, 2015 at 5:34 AM, Richard Henderson <address@hidden> wrote:
> On 09/24/2015 06:32 PM, Alvise Rigo wrote:
>>
>> +    if (unlikely(!(te->addr_write & TLB_MMIO) && (te->addr_write & TLB_EXCL))) {
>> +        /* We are removing an exclusive entry, set the page to dirty. This
>> +         * is not necessary if the vCPU has performed both SC and LL. */
>> +        hwaddr hw_addr = (env->iotlb[mmu_idx][index].addr & TARGET_PAGE_MASK) +
>> +                         (te->addr_write & TARGET_PAGE_MASK);
>> +        cpu_physical_memory_set_excl_dirty(hw_addr, cpu->cpu_index);
>> +    }
>
>
> Um... this seems dangerous.
>
> (1) I don't see why EXCL support should differ whether MMIO is set or not.
> Either we support exclusive accesses on memory-mapped io like we do on ram
> (in which case this is wrong) or we don't (in which case this is
> unnecessary).
I was not sure whether we had to support MMIO memory as well.
In theory there shouldn't be any issue in including memory-mapped
I/O too; I will consider this for the next version.
>
> (2) Doesn't this prevent a target from accessing a page during a ll/sc
> sequence that aliases within our trivial hash? Such a case on arm might be
>
> mov r3, #0x100000
> ldrex r0, [r2]
> ldr r1, [r2, r3]
> add r0, r0, r1
> strex r0, [r2]
>
I'm not sure I got it. When the CPU issues the ldrex, the page will be
set as "clean" (meaning that all the CPUs will then follow the
slow-path for that page) and the exclusive range - [r2, r2+4] in this
case - is stored in the CPU state.
The forced slow-path is used to check whether a normal store hits the
exclusive range of any CPU; normal loads are not affected.
I don't see any problem in the code above, what am I missing?
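To make the semantics concrete, the overlap check I'm describing can be
sketched as a toy model like this (names and types are invented for
illustration; the real code uses excl_protected_range and ranges_overlap()):

```c
#include <assert.h>
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical stand-in for EXCLUSIVE_RESET_ADDR. */
#define SKETCH_RESET_ADDR UINT64_MAX

/* Half-open range [begin, end) protected by an in-flight LL. */
struct excl_range {
    uint64_t begin;
    uint64_t end;
};

/* Does a store of 'size' bytes at 'addr' hit the protected range?
 * Mirrors ranges_overlap(begin, end - begin, addr, size). */
static bool store_hits_excl_range(const struct excl_range *r,
                                  uint64_t addr, uint64_t size)
{
    if (r->begin == SKETCH_RESET_ADDR) {
        return false; /* no LL in flight on this vCPU */
    }
    return addr < r->end && r->begin < addr + size;
}
```

A normal load never calls this, which is why the ldr in your example is
unaffected; only stores on the forced slow path consult the ranges.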
> AFAIK, Alpha is the only target we have which specifies that any normal
> memory access during a ll+sc sequence may fail the sc.
I will dig into it because I remember that the Alpha architecture
behaves like ARM in the handling of LDxL/STxC instructions.
>
> (3) I'm finding the "clean/dirty" words less helpful than they could be,
> especially since "clean" implies "some cpu has an excl lock on the page",
> which is reverse of what seems natural but understandable given the
> implementation. Perhaps we could rename these helpers?
>
>> @@ -376,6 +392,28 @@ tb_page_addr_t get_page_addr_code(CPUArchState *env1, target_ulong addr)
>>      return qemu_ram_addr_from_host_nofail(p);
>>  }
>>
>> +/* Atomic insn translation TLB support. */
>> +#define EXCLUSIVE_RESET_ADDR ULLONG_MAX
>> +/* For every vCPU compare the exclusive address and reset it in case of a
>> + * match. Since only one vCPU is running at once, no lock has to be held to
>> + * guard this operation. */
>> +static inline void lookup_and_reset_cpus_ll_addr(hwaddr addr, hwaddr size)
>> +{
>> + CPUState *cpu;
>> + CPUArchState *acpu;
>> +
>> + CPU_FOREACH(cpu) {
>> + acpu = (CPUArchState *)cpu->env_ptr;
>> +
>> +        if (acpu->excl_protected_range.begin != EXCLUSIVE_RESET_ADDR &&
>> +            ranges_overlap(acpu->excl_protected_range.begin,
>> +            acpu->excl_protected_range.end - acpu->excl_protected_range.begin,
>> +            addr, size)) {
>
>
> Watch the indentation here... it ought to line up with the previous argument
> on the line above, just after the (. This may require you split the
> subtract across the line too but that's ok.
OK, I will fix it.
>
>
>
>> void dump_exec_info(FILE *f, fprintf_function cpu_fprintf);
>> void dump_opcount_info(FILE *f, fprintf_function cpu_fprintf);
>> diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
>> index 98b9cff..a67f295 100644
>> --- a/include/exec/cpu-defs.h
>> +++ b/include/exec/cpu-defs.h
>> @@ -27,6 +27,7 @@
>> #include <inttypes.h>
>> #include "qemu/osdep.h"
>> #include "qemu/queue.h"
>> +#include "qemu/range.h"
>> #include "tcg-target.h"
>> #ifndef CONFIG_USER_ONLY
>> #include "exec/hwaddr.h"
>> @@ -150,5 +151,16 @@ typedef struct CPUIOTLBEntry {
>>  #define CPU_COMMON                                                          \
>>      /* soft mmu support */                                                  \
>>      CPU_COMMON_TLB                                                          \
>>                                                                              \
>> +    /* Used by the atomic insn translation backend. */                      \
>> +    int ll_sc_context;                                                      \
>> +    /* vCPU current exclusive addresses range.                              \
>> +     * The address is set to EXCLUSIVE_RESET_ADDR if the vCPU is not        \
>> +     * in the middle of a LL/SC. */                                         \
>> +    struct Range excl_protected_range;                                      \
>> +    /* Used to carry the SC result but also to flag a normal (legacy)       \
>> +     * store access made by a stcond (see softmmu_template.h). */           \
>> +    int excl_succeeded;                                                     \
>
>
>
> This seems to be required by softmmu_template.h? In which case this must be
> in CPU_COMMON_TLB.
>
> Please expand on this "legacy store access" comment here. I don't
> understand.
ACK.
>
>
>> +
>>
>> #endif
>> diff --git a/softmmu_template.h b/softmmu_template.h
>> index d42d89d..e4431e8 100644
>> --- a/softmmu_template.h
>> +++ b/softmmu_template.h
>> @@ -409,19 +409,53 @@ void helper_le_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val,
>>          tlb_addr = env->tlb_table[mmu_idx][index].addr_write;
>>      }
>>
>> -    /* Handle an IO access. */
>> +    /* Handle an IO access or exclusive access. */
>>      if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) {
>> -        CPUIOTLBEntry *iotlbentry;
>> -        if ((addr & (DATA_SIZE - 1)) != 0) {
>> -            goto do_unaligned_access;
>> +        CPUIOTLBEntry *iotlbentry = &env->iotlb[mmu_idx][index];
>> +
>> +        if ((tlb_addr & ~TARGET_PAGE_MASK) == TLB_EXCL) {
>> +            /* The slow-path has been forced since we are writing to
>> +             * exclusive-protected memory. */
>> +            hwaddr hw_addr = (iotlbentry->addr & TARGET_PAGE_MASK) + addr;
>> +
>> +            /* The function lookup_and_reset_cpus_ll_addr could have reset the
>> +             * exclusive address. Fail the SC in this case.
>> +             * N.B.: Here excl_succeeded == 0 means that helper_le_st_name has
>> +             * not been called by a softmmu_llsc_template.h. */
>> +            if (env->excl_succeeded) {
>> +                if (env->excl_protected_range.begin != hw_addr) {
>> +                    /* The vCPU is SC-ing to an unprotected address. */
>> +                    env->excl_protected_range.begin = EXCLUSIVE_RESET_ADDR;
>> +                    env->excl_succeeded = 0;
>> +
>> +                    return;
>
>
> Huh? How does a normal store ever fail. Surely you aren't using the normal
> store path to support true store_conditional?
As written in the comment, we are not doing a normal store here, but
the store of an SC operation.
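Maybe this makes it clearer; the behaviour of the hunk above can be
modelled roughly like this (a sketch with invented names, not the actual
QEMU code):

```c
#include <assert.h>
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical stand-in for EXCLUSIVE_RESET_ADDR. */
#define SKETCH_RESET_ADDR UINT64_MAX

struct vcpu_sketch {
    uint64_t excl_begin;  /* start of the LL-protected range */
    int excl_succeeded;   /* set by the SC helper before calling the store */
};

/* A store arriving on the forced slow path. If it belongs to an SC
 * (excl_succeeded != 0) and the address no longer matches the protected
 * range, the SC fails and the store is suppressed.
 * Returns true if the store should actually be performed. */
static bool slow_path_store(struct vcpu_sketch *cpu, uint64_t hw_addr)
{
    if (cpu->excl_succeeded) {
        if (cpu->excl_begin != hw_addr) {
            /* SC-ing to an unprotected address: fail the SC. */
            cpu->excl_begin = SKETCH_RESET_ADDR;
            cpu->excl_succeeded = 0;
            return false;
        }
    }
    return true; /* plain store, or SC that hit its protected address */
}
```

So a plain store (excl_succeeded == 0) can never "fail" here; only the
store issued on behalf of an SC can.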
>
>> +            }
>> +
>> +            cpu_physical_memory_set_excl_dirty(hw_addr, ENV_GET_CPU(env)->cpu_index);
>> +        }
>> +
>> +        haddr = addr + env->tlb_table[mmu_idx][index].addend;
>> +    #if DATA_SIZE == 1
>> +        glue(glue(st, SUFFIX), _p)((uint8_t *)haddr, val);
>> +    #else
>> +        glue(glue(st, SUFFIX), _le_p)((uint8_t *)haddr, val);
>> +    #endif
>
>
> You're assuming that TLB_EXCL can never have any other TLB_ bits set. We
Yes, I was assuming this...
> definitely have to support TLB_EXCL+TLB_NOTDIRTY, and probably +TLB_MMIO too
> (iirc that's how watchpoints are implemeneted).
...but indeed I was wrong, we need to support these combinations as well.
I will address these points in the next version.
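For the record, the difference between the equality test and testing the
bit can be shown with made-up flag values (illustrative only, not the
real TLB_* definitions from cpu-all.h):

```c
#include <assert.h>
#include <stdint.h>
#include <stdbool.h>

/* Invented bit positions for illustration; the real TLB_NOTDIRTY,
 * TLB_MMIO and TLB_EXCL values live in QEMU headers and differ. */
#define SK_TLB_NOTDIRTY (1u << 0)
#define SK_TLB_MMIO     (1u << 1)
#define SK_TLB_EXCL     (1u << 2)
#define SK_PAGE_MASK    (~0xfffu)

/* The equality test in the patch: true only when EXCL is the sole
 * flag set, so EXCL+NOTDIRTY or EXCL+MMIO silently takes the IO path. */
static bool is_excl_eq(uint32_t tlb_addr)
{
    return (tlb_addr & ~SK_PAGE_MASK) == SK_TLB_EXCL;
}

/* Testing the bit directly also matches EXCL combined with other flags. */
static bool is_excl_bit(uint32_t tlb_addr)
{
    return (tlb_addr & SK_TLB_EXCL) != 0;
}
```

With the bit test, the EXCL handling can then check NOTDIRTY/MMIO
separately instead of losing them.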
Thank you,
alvise
>
>
> r~