From: Sebastian Macke
Subject: [Qemu-devel] [PATCH 00/13] target-openrisc: More optimizations and corrections
Date: Tue, 29 Oct 2013 20:04:42 +0100

Hi,

This is the second part of the patches to make the openrisc target faster 
and more reliable.

The first four patches increase the speed to a level comparable to the
i386 emulation by implementing block chaining and further small optimizations.

Two patches change the handling of TLB flushing and speed up, in
particular, the TLB refill procedure for the softmmu emulation.
This very slightly breaks the specification, but increases
the compatibility with the specification of QEMU instead.

One patch introduces a new CPU which neglects
the carry and overflow flags. I hope this one gets accepted. 
It increases the speed and does not harm anything, but violates the 
specification. But since you have to activate it explicitly I don't 
see a problem.
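
To illustrate what is saved when the flags are not tracked, here is a
minimal sketch of a 32-bit addition with and without carry/overflow
bookkeeping. It is written in Python for readability and is not the code
of the patch or QEMU's TCG output; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000


def add32_with_flags(a, b):
    """32-bit add that also derives the carry and overflow flags."""
    a &= MASK32
    b &= MASK32
    full = a + b
    result = full & MASK32
    # Carry: the unsigned sum did not fit into 32 bits.
    carry = full > MASK32
    # Overflow: both operands have the same sign, the result has another.
    overflow = bool(~(a ^ b) & (a ^ result) & SIGN32)
    return result, carry, overflow


def add32_no_flags(a, b):
    """32-bit add that drops the flag bookkeeping entirely."""
    return (a + b) & MASK32


print(add32_with_flags(0x7FFFFFFF, 1))
print(hex(add32_no_flags(0x7FFFFFFF, 1)))
```

The flag-free variant needs a single add and mask, while the full variant
needs several extra operations for every arithmetic instruction.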

The other patches correct small stuff or increase the readability.
One patch solves a problem I introduced in gdbstub.c in my previous
patchset.

Further optimizations would require instruction fusion and the reordering
of instructions (delay slot). I have not seen any of the other
targets doing this. But it would be interesting :)

The nbench results are quite remarkable and show a speed increase of
roughly a factor of ten for the user space emulation (between about 5x
and 15x depending on the test).

                                    old           +chaining+rw  +no flags     +sync         +jmp_pc as flag
                      i386 user     or1k user     or1k user     or1k user     or1k user     or1k user
                      gcc 4.8.1     gcc 4.8.1     gcc 4.8.1     gcc 4.8.1     gcc 4.8.1     gcc 4.8.1
TEST                : New Index   : New Index   : New Index   : New Index   : New Index   : New Index   :
                    : AMD K6/233* : AMD K6/233* : AMD K6/233* : AMD K6/233* : AMD K6/233* : AMD K6/233* :
--------------------:-------------:-------------:-------------:-------------:-------------:-------------:
NUMERIC SORT        :       3.16  :       0.38  :       2.52  :       3.63  :       3.90  :       4.25  :
STRING SORT         :       3.52  :       0.43  :       1.95  :       2.71  :       2.81  :       2.96  :
BITFIELD            :       8.67  :       0.91  :       3.91  :       6.22  :       7.23  :       7.44  :
FP EMULATION        :      14.12  :       1.25  :       7.27  :       9.82  :      10.20  :      11.09  :
FOURIER             :       1.46  :       0.01  :       0.06  :       0.08  :       0.08  :       0.09  :
ASSIGNMENT          :      15.34  :       0.67  :       5.05  :       7.88  :       8.50  :       9.89  :
IDEA                :      10.43  :       0.82  :       4.34  :       5.87  :       6.33  :       6.57  :
HUFFMAN             :       7.58  :       0.65  :       3.48  :       5.14  :       5.54  :       5.70  :
NEURAL NET          :       0.37  :       0.02  :       0.08  :       0.10  :       0.11  :       0.11  :
LU DECOMPOSITION    :       0.71  :       0.04  :       0.15  :       0.21  :       0.22  :       0.23  :
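
The speedups implied by the table can be recomputed with a short script.
This is only a convenience sketch: the numbers are copied from the table,
and the helper names are made up for illustration. Note that the FOURIER,
NEURAL NET and LU DECOMPOSITION indices have very few significant digits,
so their ratios are coarse.

```python
# nbench index values copied from the table (higher is better).
# Column order: i386 user, then or1k user for
# old, +chaining+rw, +no flags, +sync, +jmp_pc as flag.
RESULTS = {
    "NUMERIC SORT":     (3.16, 0.38, 2.52, 3.63, 3.90, 4.25),
    "STRING SORT":      (3.52, 0.43, 1.95, 2.71, 2.81, 2.96),
    "BITFIELD":         (8.67, 0.91, 3.91, 6.22, 7.23, 7.44),
    "FP EMULATION":     (14.12, 1.25, 7.27, 9.82, 10.20, 11.09),
    "FOURIER":          (1.46, 0.01, 0.06, 0.08, 0.08, 0.09),
    "ASSIGNMENT":       (15.34, 0.67, 5.05, 7.88, 8.50, 9.89),
    "IDEA":             (10.43, 0.82, 4.34, 5.87, 6.33, 6.57),
    "HUFFMAN":          (7.58, 0.65, 3.48, 5.14, 5.54, 5.70),
    "NEURAL NET":       (0.37, 0.02, 0.08, 0.10, 0.11, 0.11),
    "LU DECOMPOSITION": (0.71, 0.04, 0.15, 0.21, 0.22, 0.23),
}


def speedup(row):
    """Last or1k column divided by the old or1k column."""
    return row[-1] / row[1]


def relative_to_i386(row):
    """Last or1k column as a fraction of the i386 column."""
    return row[-1] / row[0]


for test, row in RESULTS.items():
    print(f"{test:<18} speedup x{speedup(row):5.1f}"
          f"   vs i386 {relative_to_i386(row):6.1%}")
```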


Keep in mind that the toolchain is missing floating point support
and that the emulator has to convert between little and big endian.

The patches were tested with the following programs:
- tcg testsuite 
- booting Linux with busybox (+singlestep)
- running nbench in softmmu and user mode (+singlestep)
- configure && make && make install and testing of two applications (cmatrix 
and Frotz)
- running a few binutils tools like objdump
- qemu user chroot using proot-x86 and running busybox, nano editor and gcc
- gcc C-torture test with static and shared library.

The test.div of the tcg testsuite fails, but this is expected if you want to 
divide signed 0x80000000 by -1.
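
The arithmetic behind this corner case can be sketched as follows. The
snippet only shows why the quotient cannot be represented as a signed
32-bit value; it makes no claim about how the test or QEMU handles it,
and the helper names are hypothetical.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def to_int32(value):
    """Wrap an integer into signed 32-bit two's complement."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value & 0x80000000 else value


def fits_int32(value):
    """True if value is representable as a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX


dividend = to_int32(0x80000000)   # most negative 32-bit value
quotient = dividend // -1         # exact, Python integers are unbounded

print(dividend, quotient, fits_int32(quotient))
# Truncating the exact quotient to 32 bits yields the dividend again.
print(to_int32(quotient) == dividend)
```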

At the moment there is only one real bug left which I cannot find.
The compiled application "make" does not work in qemu-user mode.
It fails in a few places in the uClibc library with a segmentation fault,
e.g. when setting the locale. But this failure already existed before my
patches and may not be related to the target.

Best Regards
Sebastian



Sebastian Macke (13):
  target-openrisc: Implement translation block chaining
  target-openrisc: Separate Delayed slot handling from main loop
  target-openrisc: Separate of load/store instructions
  target-openrisc: sync flags only when necessary
  target-openrisc: Remove TLB flush on exception
  target-openrisc: Remove TLB flush from l.rfe instruction
  target-openrisc: Correct l.cmov conditional check
  target-openrisc: Test for Overflow exception statically
  target-openrisc: Add CPU which neglects Carry and Overflow Flag
  target-openrisc: Correct target number for 64 bit llseek
  target-openrisc: use jmp_pc as flag variable for branches
  target-openrisc: Add correct gdb information for the pc value
  target-openrisc: Add In-circuit emulator support

 linux-user/openrisc/syscall_nr.h   |   2 +-
 target-openrisc/cpu.c              |  15 +-
 target-openrisc/cpu.h              |  10 +-
 target-openrisc/gdbstub.c          |  14 +-
 target-openrisc/interrupt.c        |   4 -
 target-openrisc/interrupt_helper.c |   8 -
 target-openrisc/sys_helper.c       |   4 +
 target-openrisc/translate.c        | 317 +++++++++++++++++++++++++------------
 8 files changed, 250 insertions(+), 124 deletions(-)

-- 
1.8.4.1



