qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] instruction optimization thoughts


From: Piotr Krysik
Subject: Re: [Qemu-devel] instruction optimization thoughts
Date: Tue, 24 Aug 2004 08:36:37 -0700 (PDT)

--- Elefterios Stamatogiannakis
<address@hidden> wrote:

>   I don't think the database solution would work. 
> First of all the database would have to be very 
> big in order to be effective. That means that in 
> order to lookup the streams of instructions in 
> the database you would thrash the cache (lots 
> and lots of memory reads to various places).

Hi!

The database lookup can be very fast -- use good 
hash function and memory mapped file. For best 
performance used code fragments should be located 
in sequence (single page fault and disk read would 
bring a few useful code fragments to RAM). For 
this reason the database should be optimized for 
specific programs executed by particular user.

But once you have a nice optimizer for building 
the database why not integrate it in Qemu to 
optimize most frequently executed blocks (HotSpot)? 
Once we find the optimization to be CPU-expensive, 
we could add small persistent database.


[...]
>   teris.
> 
> ps All these code optimizing ideas pale in
> effectiveness with what a mmu optimization work 
> would produce. There is a reason why there is a 
> qemu-fast and a qemu-soft. If somehow these too
> could be consolidated then the performance gain 
> would be considerable....
> I think.

The MMU optimization improves performance a lot, 
but most of qemu-fast speed comes from code-copy 
optimization (to compare, run benchmarks with 
qemu-fast -no-code-copy). The MMU optimization 
is important for code-copy as it allows running 
blocks that read/write memory.

For MMU optimization to work it is necessary to 
have as big as possible continuous region of virtual 
address space that is dedicated to the guest. 
For this reason qemu-fast uses special memory 
layout and it causes some problems (it requires 
static compilation, hacking of libc). Qemu-fast 
will never be as portable as softmmu is. Except, 
maybe, when running 32-bit guest on 64-bit host.

I did some experiments to check if optimization 
of softmmu, by using techniques of MMU optimization 
is feasible. For this I tried to remove memory 
layout constraint of qemu-fast by using a memory 
"mapping table".

In this experiments I redirect memory access of 
guest code via a mapping table to an area of 
virtual address space, where guest pages are 
mapped (using mmap). This area can be much smaller 
than guest address space, so I minimize problems 
of qemu-fast and improve its portability. In 
future this approach could help MMU optimization 
to enter qemu-softmmu.

As memory access is much simpler that softmmu, 
the benchmarks give better results (but still 
it's 20% slower than qemu-fast with -no-code-copy). 
When running real OS, Linux seems faster then under 
softmmu but Windows 98 -- much slower. The problem 
with Windows is that it "likes" to modify pages 
where code is executing, and this causes lots of 
page faults.

I'm attaching a patch, so other developers can see 
what I'm doing. Before using the patch, please make 
sure you can build working qemu-fast with your 
setup. To see it working run qemu-fast with 
-no-code-copy option.

Fabrice, does it make sense to modify code-copy 
to be compatible with this patch (I know it's a lot 
of work)?


Regards,

Piotrek



        
                
__________________________________
Do you Yahoo!?
New and Improved Yahoo! Mail - 100MB free storage!
http://promotions.yahoo.com/new_mail 
diff -ru qemu-snapshot-2004-08-04_23/cpu-all.h 
qemu-snapshot-2004-08-04_23-fast-map/cpu-all.h
--- qemu-snapshot-2004-08-04_23/cpu-all.h       2004-07-05 23:25:09.000000000 
+0200
+++ qemu-snapshot-2004-08-04_23-fast-map/cpu-all.h      2004-08-19 
00:28:30.000000000 +0200
@@ -24,6 +24,17 @@
 #define WORDS_ALIGNED
 #endif
 
+/* keep in sync with exec-all.h
+ */
+#ifndef offsetof
+#define offsetof(type, field) ((size_t) &((type *)0)->field)
+#endif
+
+/* XXX: assume sizeof(long) >= sizeof(void*)
+ */
+#define map_target2host(env, ptr) \
+    (env->map[((unsigned long) (ptr)) >> MAP_BLOCK_BITS] + ((unsigned long) 
(ptr)))
+
 /* some important defines: 
  * 
  * WORDS_ALIGNED : if defined, the host cpu can only make word aligned
@@ -181,6 +192,67 @@
     *(uint8_t *)ptr = v;
 }
 
+static inline int ldub_map(void *ptr)
+{
+#if defined(__i386__)
+    int val;
+    asm volatile (
+        "mov    %3, %%eax\n"
+        "shr    %2, %%eax\n"
+        "mov    %1(%%ebp,%%eax,4), %%eax\n"
+        "movzbl (%3,%%eax,1), %0\n"
+        : "=r" (val)
+        : "m" (*(uint8_t *)offsetof(CPUX86State, map[0])),
+          "I" (MAP_BLOCK_BITS),
+          "r" (ptr)
+        : "%eax");
+    return (val);
+#else
+#error unsupported target CPU
+#endif
+}
+
+static inline int ldsb_map(void *ptr)
+{
+#if defined(__i386__)
+    int val;
+    asm volatile (
+        "mov    %3, %%eax\n"
+        "shr    %2, %%eax\n"
+        "mov    %1(%%ebp,%%eax,4), %%eax\n"
+        "movsbl (%3,%%eax,1), %0\n"
+        : "=r" (val)
+        : "m" (*(uint8_t *)offsetof(CPUX86State, map[0])),
+          "I" (MAP_BLOCK_BITS),
+          "r" (ptr)
+        : "%eax");
+    return (val);
+#else
+#error unsupported target CPU
+#endif
+}
+
+static inline void stb_map(void *ptr, int v)
+{
+#if defined(__i386__)
+    asm volatile (
+        "mov    %2, %%eax\n"
+        "shr    %1, %%eax\n"
+        "mov    %0(%%ebp,%%eax,4), %%eax\n"
+        "movb   %b3, (%2,%%eax)\n"
+        :
+        : "m" (*(uint8_t *)offsetof(CPUX86State, map[0])),
+          "I" (MAP_BLOCK_BITS),
+          "r" (ptr),
+          "r" (v)
+        : "%eax");
+#else
+#error unsupported target CPU
+#endif
+}
+
+
+
 /* NOTE: on arm, putting 2 in /proc/sys/debug/alignment so that the
    kernel handles unaligned load/stores may give better results, but
    it is a system wide setting : bad */
@@ -467,6 +539,105 @@
     *(uint64_t *)ptr = v;
 }
 
+static inline int lduw_map(void *ptr)
+{
+#if defined(__i386__)
+    int val;
+    asm volatile (
+        "mov    %3, %%eax\n"
+        "shr    %2, %%eax\n"
+        "mov    %1(%%ebp,%%eax,4), %%eax\n"
+        "movzwl (%3,%%eax,1), %0\n"
+        : "=r" (val)
+        : "m" (*(uint8_t *)offsetof(CPUX86State, map[0])),
+          "I" (MAP_BLOCK_BITS),
+          "r" (ptr)
+        : "%eax");
+    return (val);
+#else
+#error unsupported target CPU
+#endif
+}
+
+static inline int ldsw_map(void *ptr)
+{
+#if defined(__i386__)
+    int val;
+    asm volatile (
+        "mov    %3, %%eax\n"
+        "shr    %2, %%eax\n"
+        "mov    %1(%%ebp,%%eax,4), %%eax\n"
+        "movswl (%3,%%eax,1), %0\n"
+        : "=r" (val)
+        : "m" (*(uint8_t *)offsetof(CPUX86State, map[0])),
+          "I" (MAP_BLOCK_BITS),
+          "r" (ptr)
+        : "%eax");
+    return (val);
+#else
+#error unsupported target CPU
+#endif
+}
+
+static inline int ldl_map(void *ptr)
+{
+#if defined(__i386__)
+    int val;
+    asm volatile (
+        "mov    %3, %%eax\n"
+        "shr    %2, %%eax\n"
+        "mov    %1(%%ebp,%%eax,4), %%eax\n"
+        "movl   (%3,%%eax,1), %0\n"
+        : "=r" (val)
+        : "m" (*(uint8_t *)offsetof(CPUX86State, map[0])),
+          "I" (MAP_BLOCK_BITS),
+          "r" (ptr)
+        : "%eax");
+    return (val);
+#else
+#error unsupported target CPU
+#endif
+}
+
+static inline void stw_map(void *ptr, int v)
+{
+#if defined(__i386__)
+    asm volatile (
+        "mov    %2, %%eax\n"
+        "shr    %1, %%eax\n"
+        "mov    %0(%%ebp,%%eax,4), %%eax\n"
+        "movw   %w3, (%2,%%eax)\n"
+        :
+        : "m" (*(uint8_t *)offsetof(CPUX86State, map[0])),
+          "I" (MAP_BLOCK_BITS),
+          "r" (ptr),
+          "r" (v)
+        : "%eax");
+    /* XXX PK: clobber memory? */
+#else
+#error unsupported target CPU
+#endif
+}
+
+static inline void stl_map(void *ptr, int v)
+{
+#if defined(__i386__)
+    asm volatile (
+        "mov    %2, %%eax\n"
+        "shr    %1, %%eax\n"
+        "mov    %0(%%ebp,%%eax,4), %%eax\n"
+        "movl   %3, (%2,%%eax)\n"
+        :
+        : "m" (*(uint8_t *)offsetof(CPUX86State, map[0])),
+          "I" (MAP_BLOCK_BITS),
+          "r" (ptr),
+          "r" (v)
+        : "%eax");
+#else
+#error unsupported target CPU
+#endif
+}
+
 /* float access */
 
 static inline float ldfl_raw(void *ptr)
diff -ru qemu-snapshot-2004-08-04_23/cpu-exec.c 
qemu-snapshot-2004-08-04_23-fast-map/cpu-exec.c
--- qemu-snapshot-2004-08-04_23/cpu-exec.c      2004-07-14 19:20:55.000000000 
+0200
+++ qemu-snapshot-2004-08-04_23-fast-map/cpu-exec.c     2004-08-19 
00:05:30.000000000 +0200
@@ -814,10 +814,13 @@
 {
     struct ucontext *uc = puc;
     unsigned long pc;
+    unsigned long addr;
     int trapno;
+    int res;
 
 #ifndef REG_EIP
 /* for glibc 2.1 */
+#define REG_EAX    EAX
 #define REG_EIP    EIP
 #define REG_ERR    ERR
 #define REG_TRAPNO TRAPNO
@@ -831,10 +834,20 @@
         return 1;
     } else
 #endif
-        return handle_cpu_signal(pc, (unsigned long)info->si_addr, 
-                                 trapno == 0xe ? 
-                                 (uc->uc_mcontext.gregs[REG_ERR] >> 1) & 1 : 0,
-                                 &uc->uc_sigmask, puc);
+           {
+        /* EAX == env->map[addr >> MAP_BLOCK_BITS]
+         * see *_map functions in cpu-all.h
+         */
+        /* XXX: check opcode at pc to detect possible inconsistency?
+         */
+        addr = (unsigned long)info->si_addr - uc->uc_mcontext.gregs[REG_EAX];
+        res = handle_cpu_signal(pc, addr, 
+                                trapno == 0xe ? 
+                                (uc->uc_mcontext.gregs[REG_ERR] >> 1) & 1 : 0,
+                                &uc->uc_sigmask, puc);
+        uc->uc_mcontext.gregs[REG_EAX] = (uint32_t)cpu_single_env->map[addr >> 
MAP_BLOCK_BITS];
+        return (res);
+    }
 }
 
 #elif defined(__x86_64__)
diff -ru qemu-snapshot-2004-08-04_23/exec.c 
qemu-snapshot-2004-08-04_23-fast-map/exec.c
--- qemu-snapshot-2004-08-04_23/exec.c  2004-07-05 23:25:10.000000000 +0200
+++ qemu-snapshot-2004-08-04_23-fast-map/exec.c 2004-08-19 00:22:03.000000000 
+0200
@@ -45,7 +45,7 @@
 
 #define SMC_BITMAP_USE_THRESHOLD 10
 
-#define MMAP_AREA_START        0x00000000
+#define MMAP_AREA_START        (MAP_PAGE_SIZE + MAP_BLOCK_SIZE + MAP_PAGE_SIZE)
 #define MMAP_AREA_END          0xa8000000
 
 TranslationBlock tbs[CODE_GEN_MAX_BLOCKS];
@@ -125,6 +125,73 @@
 FILE *logfile;
 int loglevel;
 
+/* XXX: NOT TESTED
+ */
+static void *map_mmap(CPUState *env, target_ulong begin, target_ulong length, 
int prot, int flags,
+                      int fd, off_t offset)
+{
+    char *addr;
+    void *res;
+
+    addr = map_target2host(env, begin);
+    if ((addr < (char *)MMAP_AREA_START) || ((char *)MMAP_AREA_END <= addr))
+        abort();
+    res = mmap((void *)addr, length, prot, flags, fd, offset);
+    if (res == MAP_FAILED)
+        return (res);
+    if (!(begin && (MAP_BLOCK_SIZE - 1)) && ((char *)MMAP_AREA_START <= 
map_target2host(env, begin - MAP_PAGE_SIZE))) {
+        addr = map_target2host(env, begin - MAP_PAGE_SIZE) + MAP_PAGE_SIZE;
+        if ((addr < (char *)MMAP_AREA_START) || ((char *)MMAP_AREA_END <= 
addr))
+            abort();
+        res = mmap((void *)addr, length, prot, flags, fd, offset);
+    }
+    return (res);
+}
+
+/* XXX: NOT TESTED
+ */
+static int map_munmap(CPUState *env, target_ulong begin, target_ulong length)
+{
+    char *addr;
+    int res;
+
+    addr = map_target2host(env, begin);
+    if ((addr < (char *)MMAP_AREA_START) || ((char *)MMAP_AREA_END <= addr))
+        abort();
+    res = munmap((void *)addr, length);
+    if (res == - 1)
+        return (res);
+    if (!(begin && (MAP_BLOCK_SIZE - 1)) && ((char *)MMAP_AREA_START <= 
map_target2host(env, begin - MAP_PAGE_SIZE))) {
+        addr = map_target2host(env, begin - MAP_PAGE_SIZE) + MAP_PAGE_SIZE;
+        if ((addr < (char *)MMAP_AREA_START) || ((char *)MMAP_AREA_END <= 
addr))
+            abort();
+        res = munmap((void *)addr, length);
+    }
+    return (res);
+}
+
+/* XXX: NOT TESTED
+ */
+static int map_mprotect(CPUState *env, target_ulong begin, target_ulong 
length, int prot)
+{
+    char *addr;
+    int res;
+
+    addr = map_target2host(env, begin);
+    if ((addr < (char *)MMAP_AREA_START) || ((char *)MMAP_AREA_END <= addr))
+        abort();
+    res = mprotect((void *)addr, length, prot);
+    if (res == - 1)
+        return (res);
+    if (!(begin && (MAP_BLOCK_SIZE - 1)) && ((char *)MMAP_AREA_START <= 
map_target2host(env, begin - MAP_PAGE_SIZE))) {
+        addr = map_target2host(env, begin - MAP_PAGE_SIZE) + MAP_PAGE_SIZE;
+        if ((addr < (char *)MMAP_AREA_START) || ((char *)MMAP_AREA_END <= 
addr))
+            abort();
+        res = mprotect((void *)addr, length, prot);
+    }
+    return (res);
+}
+
 static void page_init(void)
 {
     /* NOTE: we can always suppose that qemu_host_page_size >=
@@ -836,8 +903,8 @@
         prot = 0;
         for(addr = host_start; addr < host_end; addr += TARGET_PAGE_SIZE)
             prot |= page_get_flags(addr);
-        mprotect((void *)host_start, qemu_host_page_size, 
-                 (prot & PAGE_BITS) & ~PAGE_WRITE);
+        map_mprotect(host_start, qemu_host_page_size, 
+                     (prot & PAGE_BITS) & ~PAGE_WRITE);
 #ifdef DEBUG_TB_INVALIDATE
         printf("protecting code page: 0x%08lx\n", 
                host_start);
@@ -1313,8 +1380,9 @@
     }
 
 #if !defined(CONFIG_SOFTMMU)
-    if (addr < MMAP_AREA_END)
-        munmap((void *)addr, TARGET_PAGE_SIZE);
+    if (((char *)MMAP_AREA_START <= map_target2host(env, addr))
+            && (map_target2host(env, addr) < (char *)MMAP_AREA_END))
+        map_munmap(env, addr, TARGET_PAGE_SIZE);
 #endif
 }
 
@@ -1341,8 +1409,9 @@
 #if !defined(CONFIG_SOFTMMU)
     /* NOTE: as we generated the code for this page, it is already at
        least readable */
-    if (addr < MMAP_AREA_END)
-        mprotect((void *)addr, TARGET_PAGE_SIZE, PROT_READ);
+    if (((char *)MMAP_AREA_START <= map_target2host(env, addr))
+            && (map_target2host(env, addr) < (char *)MMAP_AREA_END))
+        map_mprotect(env, addr, TARGET_PAGE_SIZE, PROT_READ);
 #endif
 }
 
@@ -1418,9 +1487,10 @@
                     if (p->valid_tag == virt_valid_tag &&
                         p->phys_addr >= start && p->phys_addr < end &&
                         (p->prot & PROT_WRITE)) {
-                        if (addr < MMAP_AREA_END) {
-                            mprotect((void *)addr, TARGET_PAGE_SIZE, 
-                                     p->prot & ~PROT_WRITE);
+                        if (((char *)MMAP_AREA_END <= map_target2host(env, 
addr))
+                                && (map_target2host(env, addr) < (char 
*)MMAP_AREA_END)) {
+                            map_mprotect(env, addr, TARGET_PAGE_SIZE, 
+                                         p->prot & ~PROT_WRITE);
                         }
                     }
                     addr += TARGET_PAGE_SIZE;
@@ -1556,34 +1626,58 @@
         } else {
             void *map_addr;
 
-            if (vaddr >= MMAP_AREA_END) {
-                ret = 2;
-            } else {
-                if (prot & PROT_WRITE) {
-                    if ((pd & ~TARGET_PAGE_MASK) == IO_MEM_ROM || 
+            if (prot & PROT_WRITE) {
+                if ((pd & ~TARGET_PAGE_MASK) == IO_MEM_ROM || 
 #if defined(TARGET_HAS_SMC) || 1
-                        first_tb ||
+                    first_tb ||
 #endif
-                        ((pd & ~TARGET_PAGE_MASK) == IO_MEM_RAM && 
-                         !cpu_physical_memory_is_dirty(pd))) {
-                        /* ROM: we do as if code was inside */
-                        /* if code is present, we only map as read only and 
save the
-                           original mapping */
-                        VirtPageDesc *vp;
-                        
-                        vp = virt_page_find_alloc(vaddr >> TARGET_PAGE_BITS);
-                        vp->phys_addr = pd;
-                        vp->prot = prot;
-                        vp->valid_tag = virt_valid_tag;
-                        prot &= ~PAGE_WRITE;
-                    }
+                    ((pd & ~TARGET_PAGE_MASK) == IO_MEM_RAM && 
+                     !cpu_physical_memory_is_dirty(pd))) {
+                    /* ROM: we do as if code was inside */
+                    /* if code is present, we only map as read only and save 
the
+                       original mapping */
+                    VirtPageDesc *vp;
+                    
+                    vp = virt_page_find_alloc(vaddr >> TARGET_PAGE_BITS);
+                    vp->phys_addr = pd;
+                    vp->prot = prot;
+                    vp->valid_tag = virt_valid_tag;
+                    prot &= ~PAGE_WRITE;
                 }
-                map_addr = mmap((void *)vaddr, TARGET_PAGE_SIZE, prot, 
-                                MAP_SHARED | MAP_FIXED, phys_ram_fd, (pd & 
TARGET_PAGE_MASK));
-                if (map_addr == MAP_FAILED) {
-                    cpu_abort(env, "mmap failed when mapped physical address 
0x%08x to virtual address 0x%08x\n",
-                              paddr, vaddr);
+            }
+            /* if null block [MAP_PAGE_SIZE ... MAP_PAGE_SIZE + MAP_BLOCK_SIZE 
+ MAP_PAGE_SIZE), alloc new block
+             */
+            /* XXX: handle unaligned access on block bounduary (need to 
allocate block for address vaddr - MAP_PAGE_SIZE)
+             */
+            if ((map_target2host(env, vaddr) < (char *)MMAP_AREA_START)
+                    || ((char *)MMAP_AREA_END <= map_target2host(env, vaddr))) 
{
+                static uint32_t block_next = MMAP_AREA_START;
+                uint32_t block;
+                int i;
+
+                block = block_next;
+                block_next = block + MAP_BLOCK_SIZE + MAP_PAGE_SIZE;
+                if (block_next > MMAP_AREA_END) {
+                    block = MMAP_AREA_START;
+                    block_next = block + MAP_BLOCK_SIZE + MAP_PAGE_SIZE;
                 }
+                /* invalidate pointers to chosen block
+                 */
+                /* XXX: NOT TESTED
+                 */
+                for (i = 0; i < (1L << (MAP_ADDR_BITS - MAP_BLOCK_BITS)); ++ i)
+                    if (env->map[i] == (char *)(block - i * (MAP_BLOCK_SIZE + 
MAP_PAGE_SIZE))) {
+                        env->map[i] = (char *)(MAP_PAGE_SIZE - i * 
(MAP_BLOCK_SIZE + MAP_PAGE_SIZE));
+                        munmap((void *)block, MAP_BLOCK_SIZE + MAP_PAGE_SIZE);
+                    }
+                i = vaddr >> MAP_BLOCK_BITS;
+                env->map[i] = (char *)(block - (i << MAP_BLOCK_BITS));
+            }
+            map_addr = map_mmap(env, vaddr, TARGET_PAGE_SIZE, prot, 
+                                MAP_SHARED | MAP_FIXED, phys_ram_fd, (pd & 
TARGET_PAGE_MASK));
+            if (map_addr == MAP_FAILED) {
+                cpu_abort(env, "mmap failed when mapped physical address 
0x%08x to virtual address 0x%08x\n",
+                          paddr, vaddr);
             }
         }
     }
@@ -1604,7 +1698,8 @@
     addr &= TARGET_PAGE_MASK;
 
     /* if it is not mapped, no need to worry here */
-    if (addr >= MMAP_AREA_END)
+    if ((map_target2host(cpu_single_env, addr) < (char *)MMAP_AREA_START)
+          || (map_target2host(cpu_single_env, addr) >= (char *)MMAP_AREA_END))
         return 0;
     vp = virt_page_find(addr >> TARGET_PAGE_BITS);
     if (!vp)
@@ -1619,7 +1714,7 @@
     printf("page_unprotect: addr=0x%08x phys_addr=0x%08x prot=%x\n", 
            addr, vp->phys_addr, vp->prot);
 #endif
-    if (mprotect((void *)addr, TARGET_PAGE_SIZE, vp->prot) < 0)
+    if (map_mprotect(cpu_single_env, addr, TARGET_PAGE_SIZE, vp->prot) < 0)
         cpu_abort(cpu_single_env, "error mprotect addr=0x%lx prot=%d\n",
                   (unsigned long)addr, vp->prot);
     /* set the dirty bit */
@@ -1754,8 +1849,8 @@
     if (prot & PAGE_WRITE_ORG) {
         pindex = (address - host_start) >> TARGET_PAGE_BITS;
         if (!(p1[pindex].flags & PAGE_WRITE)) {
-            mprotect((void *)host_start, qemu_host_page_size, 
-                     (prot & PAGE_BITS) | PAGE_WRITE);
+            map_mprotect(host_start, qemu_host_page_size, 
+                         (prot & PAGE_BITS) | PAGE_WRITE);
             p1[pindex].flags |= PAGE_WRITE;
             /* and since the content will be modified, we must invalidate
                the corresponding translated code. */
diff -ru qemu-snapshot-2004-08-04_23/target-i386/cpu.h 
qemu-snapshot-2004-08-04_23-fast-map/target-i386/cpu.h
--- qemu-snapshot-2004-08-04_23/target-i386/cpu.h       2004-07-12 
22:33:47.000000000 +0200
+++ qemu-snapshot-2004-08-04_23-fast-map/target-i386/cpu.h      2004-08-19 
00:28:38.000000000 +0200
@@ -20,6 +20,12 @@
 #ifndef CPU_I386_H
 #define CPU_I386_H
 
+#define MAP_PAGE_BITS 12
+#define MAP_BLOCK_BITS 24
+#define MAP_ADDR_BITS 32
+#define MAP_PAGE_SIZE (1L << MAP_PAGE_BITS)
+#define MAP_BLOCK_SIZE (1L << MAP_BLOCK_BITS)
+
 #define TARGET_LONG_BITS 32
 
 /* target supports implicit self modifying code */
@@ -291,6 +297,9 @@
     int32_t df; /* D flag : 1 if D = 0, -1 if D = 1 */
     uint32_t hflags; /* hidden flags, see HF_xxx constants */
 
+    /* offset <= 127 to enable assembly optimization */
+    void *map[1L << (MAP_ADDR_BITS - MAP_BLOCK_BITS)];
+
     /* FPU state */
     unsigned int fpstt; /* top of stack index */
     unsigned int fpus;
diff -ru qemu-snapshot-2004-08-04_23/target-i386/op.c 
qemu-snapshot-2004-08-04_23-fast-map/target-i386/op.c
--- qemu-snapshot-2004-08-04_23/target-i386/op.c        2004-08-03 
23:37:41.000000000 +0200
+++ qemu-snapshot-2004-08-04_23-fast-map/target-i386/op.c       2004-08-19 
00:04:06.000000000 +0200
@@ -390,7 +390,7 @@
 
 /* memory access */
 
-#define MEMSUFFIX _raw
+#define MEMSUFFIX _map
 #include "ops_mem.h"
 
 #if !defined(CONFIG_USER_ONLY)
diff -ru qemu-snapshot-2004-08-04_23/target-i386/ops_template_mem.h 
qemu-snapshot-2004-08-04_23-fast-map/target-i386/ops_template_mem.h
--- qemu-snapshot-2004-08-04_23/target-i386/ops_template_mem.h  2004-01-18 
22:44:40.000000000 +0100
+++ qemu-snapshot-2004-08-04_23-fast-map/target-i386/ops_template_mem.h 
2004-08-19 00:04:06.000000000 +0200
@@ -23,11 +23,11 @@
 #if MEM_WRITE == 0
 
 #if DATA_BITS == 8
-#define MEM_SUFFIX b_raw
+#define MEM_SUFFIX b_map
 #elif DATA_BITS == 16
-#define MEM_SUFFIX w_raw
+#define MEM_SUFFIX w_map
 #elif DATA_BITS == 32
-#define MEM_SUFFIX l_raw
+#define MEM_SUFFIX l_map
 #endif
 
 #elif MEM_WRITE == 1
diff -ru qemu-snapshot-2004-08-04_23/target-i386/translate.c 
qemu-snapshot-2004-08-04_23-fast-map/target-i386/translate.c
--- qemu-snapshot-2004-08-04_23/target-i386/translate.c 2004-06-13 
15:26:14.000000000 +0200
+++ qemu-snapshot-2004-08-04_23-fast-map/target-i386/translate.c        
2004-08-19 00:04:06.000000000 +0200
@@ -394,7 +394,7 @@
 };
 
 static GenOpFunc *gen_op_arithc_mem_T0_T1_cc[9][2] = {
-    DEF_ARITHC(_raw)
+    DEF_ARITHC(_map)
 #ifndef CONFIG_USER_ONLY
     DEF_ARITHC(_kernel)
     DEF_ARITHC(_user)
@@ -423,7 +423,7 @@
 };
 
 static GenOpFunc *gen_op_cmpxchg_mem_T0_T1_EAX_cc[9] = {
-    DEF_CMPXCHG(_raw)
+    DEF_CMPXCHG(_map)
 #ifndef CONFIG_USER_ONLY
     DEF_CMPXCHG(_kernel)
     DEF_CMPXCHG(_user)
@@ -467,7 +467,7 @@
 };
 
 static GenOpFunc *gen_op_shift_mem_T0_T1_cc[9][8] = {
-    DEF_SHIFT(_raw)
+    DEF_SHIFT(_map)
 #ifndef CONFIG_USER_ONLY
     DEF_SHIFT(_kernel)
     DEF_SHIFT(_user)
@@ -498,7 +498,7 @@
 };
 
 static GenOpFunc1 *gen_op_shiftd_mem_T0_T1_im_cc[9][2] = {
-    DEF_SHIFTD(_raw, im)
+    DEF_SHIFTD(_map, im)
 #ifndef CONFIG_USER_ONLY
     DEF_SHIFTD(_kernel, im)
     DEF_SHIFTD(_user, im)
@@ -506,7 +506,7 @@
 };
 
 static GenOpFunc *gen_op_shiftd_mem_T0_T1_ECX_cc[9][2] = {
-    DEF_SHIFTD(_raw, ECX)
+    DEF_SHIFTD(_map, ECX)
 #ifndef CONFIG_USER_ONLY
     DEF_SHIFTD(_kernel, ECX)
     DEF_SHIFTD(_user, ECX)
@@ -540,8 +540,8 @@
 };
 
 static GenOpFunc *gen_op_lds_T0_A0[3 * 3] = {
-    gen_op_ldsb_raw_T0_A0,
-    gen_op_ldsw_raw_T0_A0,
+    gen_op_ldsb_map_T0_A0,
+    gen_op_ldsw_map_T0_A0,
     NULL,
 #ifndef CONFIG_USER_ONLY
     gen_op_ldsb_kernel_T0_A0,
@@ -555,8 +555,8 @@
 };
 
 static GenOpFunc *gen_op_ldu_T0_A0[3 * 3] = {
-    gen_op_ldub_raw_T0_A0,
-    gen_op_lduw_raw_T0_A0,
+    gen_op_ldub_map_T0_A0,
+    gen_op_lduw_map_T0_A0,
     NULL,
 
 #ifndef CONFIG_USER_ONLY
@@ -572,9 +572,9 @@
 
 /* sign does not matter, except for lidt/lgdt call (TODO: fix it) */
 static GenOpFunc *gen_op_ld_T0_A0[3 * 3] = {
-    gen_op_ldub_raw_T0_A0,
-    gen_op_lduw_raw_T0_A0,
-    gen_op_ldl_raw_T0_A0,
+    gen_op_ldub_map_T0_A0,
+    gen_op_lduw_map_T0_A0,
+    gen_op_ldl_map_T0_A0,
 
 #ifndef CONFIG_USER_ONLY
     gen_op_ldub_kernel_T0_A0,
@@ -588,9 +588,9 @@
 };
 
 static GenOpFunc *gen_op_ld_T1_A0[3 * 3] = {
-    gen_op_ldub_raw_T1_A0,
-    gen_op_lduw_raw_T1_A0,
-    gen_op_ldl_raw_T1_A0,
+    gen_op_ldub_map_T1_A0,
+    gen_op_lduw_map_T1_A0,
+    gen_op_ldl_map_T1_A0,
 
 #ifndef CONFIG_USER_ONLY
     gen_op_ldub_kernel_T1_A0,
@@ -604,9 +604,9 @@
 };
 
 static GenOpFunc *gen_op_st_T0_A0[3 * 3] = {
-    gen_op_stb_raw_T0_A0,
-    gen_op_stw_raw_T0_A0,
-    gen_op_stl_raw_T0_A0,
+    gen_op_stb_map_T0_A0,
+    gen_op_stw_map_T0_A0,
+    gen_op_stl_map_T0_A0,
 
 #ifndef CONFIG_USER_ONLY
     gen_op_stb_kernel_T0_A0,
@@ -621,8 +621,8 @@
 
 static GenOpFunc *gen_op_st_T1_A0[3 * 3] = {
     NULL,
-    gen_op_stw_raw_T1_A0,
-    gen_op_stl_raw_T1_A0,
+    gen_op_stw_map_T1_A0,
+    gen_op_stl_map_T1_A0,
 
 #ifndef CONFIG_USER_ONLY
     NULL,
@@ -4321,7 +4321,7 @@
 
 
     DEF_READF( )
-    DEF_READF(_raw)
+    DEF_READF(_map)
 #ifndef CONFIG_USER_ONLY
     DEF_READF(_kernel)
     DEF_READF(_user)
@@ -4440,7 +4440,7 @@
 
 
     DEF_WRITEF( )
-    DEF_WRITEF(_raw)
+    DEF_WRITEF(_map)
 #ifndef CONFIG_USER_ONLY
     DEF_WRITEF(_kernel)
     DEF_WRITEF(_user)
@@ -4479,7 +4479,7 @@
     [INDEX_op_rorl ## SUFFIX ## _T0_T1_cc] = INDEX_op_rorl ## SUFFIX ## _T0_T1,
 
     DEF_SIMPLER( )
-    DEF_SIMPLER(_raw)
+    DEF_SIMPLER(_map)
 #ifndef CONFIG_USER_ONLY
     DEF_SIMPLER(_kernel)
     DEF_SIMPLER(_user)
diff -ru qemu-snapshot-2004-08-04_23/vl.c 
qemu-snapshot-2004-08-04_23-fast-map/vl.c
--- qemu-snapshot-2004-08-04_23/vl.c    2004-08-04 00:09:30.000000000 +0200
+++ qemu-snapshot-2004-08-04_23-fast-map/vl.c   2004-08-19 00:35:26.000000000 
+0200
@@ -3035,6 +3035,8 @@
 
     /* init CPU state */
     env = cpu_init();
+    for (i = 0; i < (1L << (MAP_ADDR_BITS - MAP_BLOCK_BITS)); ++ i)
+        env->map[i] = (char *)(MAP_PAGE_SIZE - i * MAP_BLOCK_SIZE);
     global_env = env;
     cpu_single_env = env;
 

reply via email to

[Prev in Thread] Current Thread [Next in Thread]