qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v4 12/21] target-arm: A64: Implement DC ZVA


From: Richard Henderson
Subject: Re: [Qemu-devel] [PATCH v4 12/21] target-arm: A64: Implement DC ZVA
Date: Fri, 07 Mar 2014 06:51:04 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.3.0

On 03/06/2014 11:32 AM, Peter Maydell wrote:
> +/**
> + * tlb_vaddr_to_host:
> + * @env: CPUArchState
> + * @addr: guest virtual address to look up
> + * @mmu_idx: MMU index to use for lookup
> + *
> + * Look up the specified guest virtual index in the TCG softmmu TLB.
> + * If the TLB contains a host virtual address suitable for direct RAM
> + * access, then return it. Otherwise (TLB miss, TLB entry is for an
> + * I/O access, etc) return NULL.
> + *
> + * This is the equivalent of the initial fast-path code used by
> + * TCG backends for guest load and store accesses.
> + */
> +static inline void *tlb_vaddr_to_host(CPUArchState *env, target_ulong addr,
> +                                      int mmu_idx)
> +{
> +    int index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
> +    target_ulong tlb_addr = env->tlb_table[mmu_idx][index].addr_write;

Somewhere I think the function name or at least the block comment should
indicate that this lookup is for writing, since we hard-code addr_write here.

> +void HELPER(dc_zva)(CPUARMState *env, uint64_t vaddr_in)
> +{
> +    /* Implement DC ZVA, which zeroes a fixed-length block of memory.
> +     * Note that we do not implement the (architecturally mandated)
> +     * alignment fault for attempts to use this on Device memory
> +     * (which matches the usual QEMU behaviour of not implementing either
> +     * alignment faults or any memory attribute handling).
> +     */
> +
> +    ARMCPU *cpu = arm_env_get_cpu(env);
> +    uint64_t blocklen = 4 << cpu->dcz_blocksize;
> +    uint64_t vaddr = vaddr_in & ~(blocklen - 1);
> +
> +#ifndef CONFIG_USER_ONLY
> +    {
> +        /* Slightly awkwardly, QEMU's TARGET_PAGE_SIZE may be less than
> +         * the block size so we might have to do more than one TLB lookup.
> +         * We know that in fact for any v8 CPU the page size is at least 4K
> +         * and the block size must be 2K or less, but TARGET_PAGE_SIZE is 
> only
> +         * 1K as an artefact of legacy v5 subpage support being present in 
> the
> +         * same QEMU executable.
> +         */
> +        int maxidx = DIV_ROUND_UP(blocklen, TARGET_PAGE_SIZE);
> +        void *hostaddr[maxidx];

What's the maximum blocksize?  Did you really need dynamic allocation here?

> +        int try, i;
> +
> +        for (try = 0; try < 2; try++) {
> +
> +            for (i = 0; i < maxidx; i++) {
> +                hostaddr[i] = tlb_vaddr_to_host(env,
> +                                                vaddr + TARGET_PAGE_SIZE * i,
> +                                                cpu_mmu_index(env));
> +                if (!hostaddr[i]) {
> +                    break;
> +                }
> +            }
> +            if (i == maxidx) {
> +                /* If it's all in the TLB it's fair game for just writing to;
> +                 * we know we don't need to update dirty status, etc.
> +                 */
> +                for (i = 0; i < maxidx - 1; i++) {
> +                    memset(hostaddr[i], 0, TARGET_PAGE_SIZE);
> +                }
> +                memset(hostaddr[i], 0, blocklen - (i * TARGET_PAGE_SIZE));
> +                return;
> +            }
> +            /* OK, try a store and see if we can populate the tlb. This
> +             * might cause an exception if the memory isn't writable,
> +             * in which case we will longjmp out of here. We must for
> +             * this purpose use the actual register value passed to us
> +             * so that we get the fault address right.
> +             */
> +            cpu_stb_data(env, vaddr_in, 0);
> +            /* Now we can populate the other TLB entries, if any */
> +            for (i = 0; i < maxidx; i++) {
> +                uint64_t va = vaddr + TARGET_PAGE_SIZE * i;
> +                if (va != (vaddr_in & TARGET_PAGE_MASK)) {
> +                    cpu_stb_data(env, va, 0);
> +                }
> +            }

cpu_stb_data doesn't take into account user vs kernel mode accesses.  Maybe
better off using helper_ret_stb_mmu, and passing along GETRA().

As a bonus, you'll have accurate exceptions should the access throw, so you
don't need to force the save of PC before calling the helper.  Which... I don't
see you doing, so perhaps there's a bug here at the moment.



r~



reply via email to

[Prev in Thread] Current Thread [Next in Thread]