
[Qemu-devel] Proposed patch: huge RX speedup for hw/e1000.c


From: Luigi Rizzo
Subject: [Qemu-devel] Proposed patch: huge RX speedup for hw/e1000.c
Date: Wed, 30 May 2012 22:23:11 +0200
User-agent: Mutt/1.4.2.3i

Hi,
while testing qemu with netmap (see [Note 1] for details) on e1000
emulation, I noticed that my sender program using a custom backend
[Note 2] could reach 1 Mpps (million packets per second), but on the
receive side I was limited to 50 Kpps (with CPU usage always below 5%).

The problem was fixed by the following one-line addition to
hw/e1000.c :: e1000_mmio_write(), which wakes up the qemu main loop
so that it checks whether some receive buffers have become available.

        --- hw/e1000.c.orig  2012-02-17 20:45:39.000000000 +0100
        +++ hw/e1000.c  2012-05-30 20:01:52.000000000 +0200
        @@ -919,6 +926,7 @@
                 DBGOUT(UNKNOWN, "MMIO unknown write addr=0x%08x,val=0x%08"PRIx64"\n",
                        index<<2, val);
             }
        +    qemu_notify_event();
         }

         static uint64_t

With this fix, the receive throughput reaches 1 Mpps, matching the
transmit speed. The system now becomes CPU-bound, but this opens the
way to more optimizations in the emulator.
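
To illustrate why a single notification call can matter this much, here
is a minimal standalone sketch of the wakeup pattern. It is not qemu
code: it assumes an event loop that sleeps in poll() with a timeout,
and it uses a self-pipe as the wakeup channel. The names wake_init(),
notify_event() and loop_wait_once() are hypothetical; notify_event()
only plays the role that qemu_notify_event() plays in the patch.

```c
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical wakeup channel: [0] is polled by the loop, [1] is written
 * by whoever wants the loop to run again (here: the register-write path). */
static int wake_fd[2] = { -1, -1 };

/* Create the pipe and make both ends non-blocking. Returns 0 on success. */
static int wake_init(void)
{
    if (pipe(wake_fd) != 0)
        return -1;
    for (int i = 0; i < 2; i++) {
        int fl = fcntl(wake_fd[i], F_GETFL);
        if (fl < 0 || fcntl(wake_fd[i], F_SETFL, fl | O_NONBLOCK) < 0)
            return -1;
    }
    return 0;
}

/* Stand-in for the notification: make the next poll() return immediately
 * instead of sleeping until its timeout expires. */
static void notify_event(void)
{
    char c = 0;
    ssize_t n = write(wake_fd[1], &c, 1);
    (void)n; /* a full pipe already guarantees a pending wakeup */
}

/* One loop iteration: sleep for at most timeout_ms.
 * Returns 1 if woken by notify_event(), 0 if the timeout expired. */
static int loop_wait_once(int timeout_ms)
{
    struct pollfd pfd = { .fd = wake_fd[0], .events = POLLIN };
    int n = poll(&pfd, 1, timeout_ms);

    if (n > 0 && (pfd.revents & POLLIN)) {
        char buf[64];
        while (read(wake_fd[0], buf, sizeof(buf)) > 0)
            ; /* drain all pending wakeups */
        return 1;
    }
    return 0;
}
```

Without a call to notify_event(), loop_wait_once() returns only when
the timeout expires, so work that became possible in the meantime
(here: receive buffers returned by the guest) waits for a full timeout
period. With the call, the loop runs again right away.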

The same problem seems to exist in other emulated network devices,
e.g. hw/rtl8139.c. The only one that seems to get it right is
virtio-net.c.

I think it would be good if this change could make it into
the tree.

[Note 1] Netmap ( http://info.iet.unipi.it/~luigi/netmap )
    is an efficient mechanism for packet I/O that bypasses
    the network stack and provides protected access to the
    network adapter from userspace.
    It works especially well on top of qemu because the
    kernel needs only to trap a single register access
    for each batch of packets.

[Note 2] The custom backend is a virtual local ethernet
    called VALE, implemented as a kernel module on the host,
    that extends netmap to implement communication
    between virtual machines.
    VALE is extremely efficient, currently delivering about
    10 Mpps with 60-byte frames, and 5 Mpps with 1500-byte frames.
    The 1 Mpps rates I mentioned are obtained between qemu instances
    running in userspace on FreeBSD (no kernel acceleration whatsoever)
    and using VALE as a communication mechanism.

    
        cheers
        luigi
-----------------------------------------+-------------------------------
  Prof. Luigi RIZZO, address@hidden  . Dip. di Ing. dell'Informazione
  http://www.iet.unipi.it/~luigi/        . Universita` di Pisa
-----------------------------------------+-------------------------------


