[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH] cpu: implementing victim TLB for QEMU system em
From: |
Richard Henderson |
Subject: |
Re: [Qemu-devel] [PATCH] cpu: implementing victim TLB for QEMU system emulated TLB |
Date: |
Wed, 22 Jan 2014 13:55:03 -0800 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 |
On 01/22/2014 06:48 AM, Xin Tong wrote:
> +#define TLB_XOR_SWAP(X, Y) do {*X = *X ^ *Y; *Y = *X ^ *Y; *X = *X ^
> *Y;}while(0);
First, your patch is line wrapped. You really really really need to follow the
directions Peter gave you.
Second, using xor to swap values is a cute assembler trick, but it has no place
in high-level programming. Look at the generated assembly and you'll find way
more memory accesses than necessary.
> +void swap_tlb(CPUTLBEntry *te, CPUTLBEntry *se, hwaddr *iote, hwaddr *iose)
This function could probably stand to be inline, so that we produce better code
for softmmu_template.h.
> + for (k = 0;k < CPU_VTLB_SIZE; k++) {
Watch your spacing. Did the patch pass checkpatch.pl?
> for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
> unsigned int i;
> -
> for (i = 0; i < CPU_TLB_SIZE; i++) {
Don't randomly change whitespace.
> + /* do not discard the translation in te, evict it into a random
> victim tlb */
> + unsigned vidx = rand() % CPU_VTLB_SIZE;
Don't use rand. That's a huge heavy-weight function. Treating the victim
table as a circular buffer would surely be quicker. Using a LRU algorithm
might do better, but could also be overkill.
> do_unaligned_access(env, addr, READ_ACCESS_TYPE, mmu_idx,
> retaddr);
> }
> #endif
> - tlb_fill(env, addr, READ_ACCESS_TYPE, mmu_idx, retaddr);
> + /* we are about to do a page table walk. our last hope is the
> victim tlb.
> + * try to refill from the victim tlb before walking the page table.
> */
> + int vidx, vhit = false;
We're supposed to be c89 compliant. No declarations in the middle of the
block. Also, you can avoid the vhit variable entirely with
> + for(vidx = 0;vidx < CPU_VTLB_SIZE; ++vidx) {
for (vidx = CPU_VTLB_SIZE - 1; vidx >= 0; --vidx) {
...
}
if (vidx < 0) {
tlb_fill(...);
}
r~