qemu-devel

Re: [Qemu-devel] [PATCH -V5] target-ppc: Fix page table lookup with kvm enabled


From: Alexander Graf
Subject: Re: [Qemu-devel] [PATCH -V5] target-ppc: Fix page table lookup with kvm enabled
Date: Sun, 27 Oct 2013 11:29:40 -0700

On 11.10.2013, at 09:58, Aneesh Kumar K.V <address@hidden> wrote:

> Alexander Graf <address@hidden> writes:
> 
>> On 11.10.2013, at 13:13, Aneesh Kumar K.V <address@hidden> wrote:
>> 
>>> From: "Aneesh Kumar K.V" <address@hidden>
>>> 
>>> With kvm enabled, we store the hash page table information in the
>>> hypervisor. Use ioctl to read the htab contents. Without this we get
>>> the below error when trying to read the guest address
>>> 
>>> (gdb) x/10 do_fork
>>> 0xc000000000098660 <do_fork>:   Cannot access memory at address 0xc000000000098660
>>> (gdb)
>>> 
>>> Signed-off-by: Aneesh Kumar K.V <address@hidden>
>>> ---
>>> Changes from V4:
>>> * Rewrite to avoid two code paths for doing hash lookups
> 
> ....
> 
>>> +
>>> +    i = 0;
>>> +    hpte = pte_index * HASH_PTE_SIZE_64;
>>>    if (likely((flags & H_EXACT) == 0)) {
>>>        pte_index &= ~7ULL;
>>> -        hpte = pte_index * HASH_PTE_SIZE_64;
>>> -        for (i = 0; ; ++i) {
>>> +        token = ppc_hash64_start_access(ppc_env_get_cpu(env), pte_index);
>>> +        do {
>> 
>> Why convert this into a while loop?
> 
> I am moving i = 0 outside the loop, hence I found while () better than
> for (;; ++i).

Outside of what loop? You're only moving it outside of the if().
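
For context, here is a minimal sketch of what that slot scan amounts to after the change (intent only, not the literal hunk above; it reuses the helpers the patch introduces): with i = 0 hoisted ahead of the if (), the loop walks the 8 HPTEs of the PTEG looking for one whose valid bit is clear.

    pte_index &= ~7ULL;             /* round down to the start of the PTEG */
    token = ppc_hash64_start_access(ppc_env_get_cpu(env), pte_index);
    for (i = 0; i < HPTES_PER_GROUP; i++) {
        /* stop at the first slot whose valid bit is clear */
        if ((ppc_hash64_load_hpte0(token, i) & HPTE64_V_VALID) == 0) {
            break;
        }
    }
    ppc_hash64_stop_access(token);
    if (i == HPTES_PER_GROUP) {
        return H_PTEG_FULL;         /* every slot in the group is in use */
    }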

> 
>> 
>>>            if (i == 8) {
>>> +                ppc_hash64_stop_access(token);
>>>                return H_PTEG_FULL;
>>>            }
>>> -            if ((ppc_hash64_load_hpte0(env, hpte) & HPTE64_V_VALID) == 0) {
>>> +            if ((ppc_hash64_load_hpte0(token, i) & HPTE64_V_VALID) == 0) {
>>>                break;
>>>            }
>>> -            hpte += HASH_PTE_SIZE_64;
>>> -        }
>>> +        } while (i++);
>>> +        ppc_hash64_stop_access(token);
> 
> ....
> 
> 
>>> +
>>> +int kvm_ppc_hash64_start_access(PowerPCCPU *cpu, unsigned long pte_index,
>>> +                                struct ppc_hash64_hpte_token *token)
>>> +{
>>> +    int htab_fd;
>>> +    int hpte_group_size;
>>> +    struct kvm_get_htab_fd ghf;
>>> +    struct kvm_get_htab_buf {
>>> +        struct kvm_get_htab_header header;
>>> +        /*
>>> +         * We required one extra byte for read
>>> +         */
>>> +        unsigned long hpte[(HPTES_PER_GROUP * 2) + 1];
>>> +    } hpte_buf;;
>> 
>> Double semicolon?
> 
> Will fix
> 
>> 
>>> +
>>> +    ghf.flags = 0;
>>> +    ghf.start_index = pte_index;
>>> +    htab_fd = kvm_vm_ioctl(kvm_state, KVM_PPC_GET_HTAB_FD, &ghf);
>>> +    if (htab_fd < 0) {
>>> +        goto error_out;
>>> +    }
>>> +    memset(&hpte_buf, 0, sizeof(hpte_buf));
> 
> ....
> 
>>> diff --git a/target-ppc/mmu-hash64.c b/target-ppc/mmu-hash64.c
>>> index 67fc1b5..aeb4593 100644
>>> --- a/target-ppc/mmu-hash64.c
>>> +++ b/target-ppc/mmu-hash64.c
>>> @@ -302,29 +302,73 @@ static int ppc_hash64_amr_prot(CPUPPCState *env, ppc_hash_pte64_t pte)
>>>    return prot;
>>> }
>>> 
>>> -static hwaddr ppc_hash64_pteg_search(CPUPPCState *env, hwaddr pteg_off,
>>> +struct ppc_hash64_hpte_token *ppc_hash64_start_access(PowerPCCPU *cpu,
>>> +                                                      unsigned long pte_index)
>> 
>> How about you also pass in the number of PTEs you want to access?
>> Let's call it "pte_num" for now. Then if you only care about one PTE
>> you can indicate so, otherwise it's clear that you want to access 8
>> PTEs beginning from the one you're pointing at.
> 
> So if we want to pass pte_num, then it can be any number: 1, 8, 10. That
> would make the code complex, because the buffer passed to read() would
> then need to be of variable size. I would also need another allocation
> for the return buffer. I can do tricks like making the token point to the
> actual buffer, skipping the header, but ppc_hash64_stop_access() would
> then have to know about the kvm htab read header, which I found not nice.
> We can possibly update the function name to indicate that it will always
> read a HPTE group starting from pte_index. Something like
> ppc64_start_hpteg_access()?

Just abort() if pte_num is not 1 or 8.
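
A minimal sketch of that combined interface (pte_num is the suggested extra parameter, not part of the posted patch; the actual fetch is elided):

    void *ppc_hash64_start_access(PowerPCCPU *cpu, unsigned long pte_index,
                                  int pte_num)
    {
        void *token = NULL;

        /* Only a single PTE or a whole PTEG is a meaningful request. */
        if (pte_num != 1 && pte_num != HPTES_PER_GROUP) {
            abort();
        }
        /* ... fill token with pte_num HPTEs starting at pte_index ... */
        return token;
    }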

> 
>> 
>>> +{
>>> +    hwaddr pte_offset;
>>> +    struct ppc_hash64_hpte_token *token;
>> 
>> void *token = NULL;
>> 
>> if (kvmppc_uses_htab_fd(cpu)) {
>>    /* HTAB is controlled by KVM. Fetch the PTEG into a new buffer. */
>> 
>>    int hpte_group_size = sizeof(unsigned long) * 2 * pte_num;
>>    token = g_malloc(hpte_group_size);
>>    if (kvm_ppc_hash64_read_pteg(cpu, pte_index, token)) {
> 
> That is the tricky part: the read buffer needs to have a header at the
> beginning. Maybe I can make kvm_ppc_hash64_stop_access(void *token) do
> the pointer math to get back to the head of the token and free it. Will
> try that.
> 
>>        free(token);
>>        return NULL;
>>    }
>> } else {
>>    /* HTAB is controlled by QEMU. Just point to the internally accessible PTEG. */
>>    hwaddr pte_offset;
>> 
>>    pte_offset = pte_index * HASH_PTE_SIZE_64;
>>    if (cpu->env.external_htab) {
>>        token = cpu->env.external_htab + pte_offset;
>>    } else {
>>        token = (uint8_t *) cpu->env.htab_base + pte_offset;
>>    }
>> }
>> 
>> return token;
>> 
>> This way it's more obvious which path is the "normal code flow". We also
>> clearly choose what to do depending only on whether the HTAB is in-kernel
>> or not. As a big plus, we don't need a struct that we have to dynamically
>> allocate even in the TCG case (malloc is slow).
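
On the header question raised a few lines up, a minimal sketch of the pointer-math idea (illustrative only, not necessarily how a respin would do it; it assumes start_access returns a pointer just past the kvm_get_htab_header at the front of the g_malloc'ed read buffer):

    void kvm_ppc_hash64_stop_access(void *token)
    {
        /* step back over the htab read header to the start of the allocation */
        g_free((uint8_t *)token - sizeof(struct kvm_get_htab_header));
    }

That keeps the kvm htab read header private to the kvm code: callers only ever see and free the token.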
>> 
>> You also want to cache kvmppc_uses_htab_fd(cpu) so it doesn't have to
>> call an ioctl every time we try to read a PTE. I guess the best spot
>> for it would be to put it into env.
> 
> I didn't get that. We already cache that value as
> 
> bool kvmppc_has_cap_htab_fd(void)
> {
>    return cap_htab_fd;
> }
> 
> Looking at the kernel capability check I do find an issue: with both
> HV and PR loaded as modules, that would always return true. Now that is
> not what we want here. Ideally we should have all these as VM ioctls and
> not device ioctls. I had raised that as a FIXME in the HV PR patchset.

As you've realized, we only cache the ability to get an htab fd, not whether we
are actually using one. That needs to be a separate variable.
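
A minimal sketch of that separate variable (the field name and the place it gets set are assumptions for illustration): record per VM whether the HTAB is actually read through a kernel fd, so kvmppc_uses_htab_fd() becomes a plain field read rather than an ioctl or a capability check.

    /*
     * env.htab_fd_in_use is illustrative: set it once, at the point where
     * the VM first starts reading the HTAB via KVM_PPC_GET_HTAB_FD, then
     * test it on every PTE access.
     */
    bool kvmppc_uses_htab_fd(PowerPCCPU *cpu)
    {
        return cpu->env.htab_fd_in_use;
    }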


Alex



