qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH V14 0/5] VMXNET3 paravirtual NIC device implemen


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH V14 0/5] VMXNET3 paravirtual NIC device implementation
Date: Tue, 12 Mar 2013 10:38:42 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On Sat, Mar 09, 2013 at 11:21:01AM +0200, Dmitry Fleytman wrote:
> This set of patches implements VMWare VMXNET3 paravirtual NIC device.
> The device supports of all the device features including offload capabilties,
> VLANs and etc.
> The device is tested on different OSes:
>     Fedora 15
>     Ubuntu 10.4
>     Centos 6.2
>     Windows 2008R2
>     Windows 2008 64bit
>     Windows 2008 32bit
>     Windows 2003 64bit
>     Windows 2003 32bit
> 
> Changes in V14:
>    QOM-related fixes
> 
> Changes in V13:
>    Licensing of vmxnet3.h changed to match original
>    VMWare's header file license
> 
> Changes in V12:
>    Reported-by: Andreas Farber <address@hidden>
>    QOM-related fixes 
> 
> Changes in V11:
>    Rebased to master HEAD to fix multiqueue compilation issues
> 
> Changes in V10:
>    Reported-by: Stefan Hajnoczi <address@hidden>
>    Issues reported by Stefan Hajnoczi fixed.
> 
> Changes in V9:
>    Reported-by: Stefan Hajnoczi <address@hidden>
>    Issues reported by Stefan Hajnoczi fixed.
> 
> Changes in V8:
>    Reported-by: Stefan Hajnoczi <address@hidden>
>    Issues reported by Stefan Hajnoczi reviewed and mostly fixed:
> 
> > +        }
> > +        curr_src_off += src[i].iov_len;
> > +    }
> > +    return j;
> > +}
> 
> The existing iov_copy() function provides equivalent functionality.  I
> don't think iov_rebuild() is needed.
> 
> [DF] Done. Thanks, missed it.
> 
> 
> > +            size -= len;
> > +        }
> > +        iovec_off += iov[i].iov_len;
> > +    }
> > +    return res;
> > +}
> 
> Rename this net_checksum_add_iov() and place it in net/checksum.c,
> then the new dependency on net from block can be dropped.
> 
> [DF] Done. 
> 
> > +vmw_shmem_read(hwaddr addr, void *buf, int len)
> >  {
> >      VMW_SHPRN("SHMEM r: %" PRIx64 ", len: %d to %p", addr, len, buf);
> >      cpu_physical_memory_read(addr, buf, len);
> >  }
> 
> All changes to this file should be squashed with the previous patch.
> 
> [DF] Done
> 
> > +#ifdef VMXNET_DEBUG_SHMEM_ACCESS
> > +#define VMW_SHPRN(fmt, ...)                                                
> >   
> >  \
> > +    do {                                                                   
> >   
> >  \
> > +        printf("[%s][SH][%s]: " fmt "\n", VMXNET_DEVICE_NAME, __func__,    
> >   
> >  \
> > +            ## __VA_ARGS__);                                               
> >   
> >  \
> > +    } while (0)
> > +#else
> > +#define VMW_SHPRN(fmt, ...) do {} while (0)
> > +#endif
> 
> Please use QEMU tracing.  It eliminates all this boilerplate and
> conditional compilation.  Tracing can be enabled/disabled at runtime
> and works with SystemTap/DTrace.  See docs/tracing.txt.
> 
> [DF] We'd like to stick with compile time logic in this case becase of 2 
> reasons:
> [DF] 1. These printouts are intended for reverse engineering/development only 
> and there is
> [DF]    no need to enable them at run time
> [DF] 2. There is a big number of printouts, all driver-device communication 
> is traced,
> [DF]    they hit performance even on strongest x86 in case of run-time logic
> 
> 
> > +struct eth_header {
> > +    uint8_t  h_dest[ETH_ALEN];   /* destination eth addr */
> > +    uint8_t  h_source[ETH_ALEN]; /* source ether addr    */
> > +    uint16_t h_proto;            /* packet type ID field */
> > +};
> 
> Looks like it's copy-pasted stuff from /usr/include/linux/if_*.h,
> /usr/include/netinet/*.h, and friends.  If the system-wide headers are
> included names will collide for some of the macros at least.
> 
> Did you check if the slirp/ definitions can be reused?
> 
> [DF] Yes, you are right. This is copy-pasted from different places.
> [DF] Slips definishing do not fully cover our needs.
> 
> 
> I'd rather we import network header definitions once in a generic
> place into the source tree.  That way vmxnet and other components
> don't need to redefine these structs.
> 
> [DF] Exaclty! Our intention is to create generic header with network 
> definitions and make everyone use it.
> [DF] We can move our header to some shared place if you want, however I'd do 
> it in parallel with cleanup
> [DF] of similar definitions in existing code and this is a big change that os 
> out of scope of these patches.
> 
> > + 
> > *===========================================================================*/
> 
> Is this huge comment box a sign that the code should be split into a
> foo_tx.c and an foo_rx.c file?
> 
> [DF] As for me this file is not that big to be splitted (<800 lines), however 
> I'll do this if you insist :)
> 
> > +size_t vmxnet_tx_pkt_send(VmxnetTxPktH pkt, NetClientState *vc)
> 
> 'vc' is an old name that was used for VLANClientState.  The struct has
> since been renamed to NetClientState and the rest of QEMU uses 'nc'
> instead of 'vc'.
> 
> [DF] Fixed. Thanks.
> 
> > +/* tx module context handle */
> > +typedef void *VmxnetTxPktH;
> 
> Forward-declaring the struct is nicer:
> 
> typedef struct VmxnetTxPkt VmxnetTxPkt;
> 
> The definition of VmxnetTxPkt is still hidden from the caller but you
> avoid the void* and casting.  In vmxnet_pkt.c define using:
> 
> struct VmxnetTxPkt {
>     ...
> };
> 
> [DF] Agree, fixed. Thanks.
> 
> 
> > diff --git a/hw/vmxnet_utils.h b/hw/vmxnet_utils.h
> > index 7fd9a01..fac3b7b 100644
> > --- a/hw/vmxnet_utils.h
> > +++ b/hw/vmxnet_utils.h
> 
> Please squash these fixes into the previous patch.
> 
> 
> [DF] Oops, my bad. Fixed.
> 
> 
> > +        uint8_t msix_used;
> > +        /* Whether MSI support was installed successfully */
> > +        uint8_t msi_used;
> 
> These two fields should be bool.
> 
> > +        /* Whether automatic interrupts masking enabled */
> > +        uint8_t auto_int_masking;
> 
> bool
> 
> [DF] Done.
> 
> 
> > +static inline void vmxnet3_flush_shmem_changes(void)
> > +{
> > +    /*
> > +     * Flush shared memory changes
> > +     * Needed before sending interrupt to guest to ensure
> > +     * it gets consistent memory state
> > +     */
> > +    smp_wmb();
> > +}
> 
> It's useful to document why a memory barrier is being used in each
> instance.  Therefore hiding smp_wmb() inside a wrapper function isn't
> great.
> 
> [DF] Fixed.
> 
> Also, it's suspicious that smb_wmb() is used but no other barriers are
> used.  What about a read memory barrier when accessing shared memory
> written by the guest?
> 
> [DF] VMWARE interface build a in safe way - you always read out data buffer 
> (packet descriptor) with memcopy,
> [DF] and then check its last byte to see whether it was fully filled by 
> driver.
> [DF] In this case device doesn't need read barriers. Drivers need write 
> barriers of course to secure
> [DF] proper order of writes.
> 
> Changes in V7:
> 
>    Reported-by: Michael S. Tsirkin <address@hidden>
>    Issues reported by Michael S. Tsirkin reviewed and mostly fixed:
> 
> File vmware_utils.h:
> 
> ...
> 
> > +static inline void
> > +vmw_shmem_st64(target_phys_addr_t addr, uint64_t value)
> > +{
> > +    VMW_SHPRN("SHMEM store64: %" PRIx64 " (value %" PRIx64 ")", addr, 
> > value);
> > +    stq_le_phys(addr, value);
> > +}
> > +
> 
> Pls remove these wrappers.  These are just memory stores. Our codebase
> is too large as it is without every driver wrapping all standard calls.
> 
> [DF] Idea behind this macros is to have printout in each of them
> [DF] Printouts are really needed when reverse-engineering windows guest 
> drivers
> [DF] Since there is no windows drivers code this is the only way to understand
> [DF] sequences they use
> 
> > +/* MACROS for simplification of operations on array-style registers */
> 
> UPPERCASE ABUSE
> 
> [DF] Fixed
> 
> > +#define IS_MULTIREG_ADDR(addr, base, cnt, regsize)                 \
> > +    (((addr + 1) > (base)) && ((addr) < (base) + (cnt) * (regsize)))
> 
> 
> Same as range_covers_byte(base, cnt * regsize, addr)?
> 
> [DF] Fixed, thanks
> 
> > +
> > +#define MULTIREG_IDX_BY_ADDR(addr, base, regsize)                  \
> > +    (((addr) - (base)) / (regsize))
> > +
> 
> Above two macros is all that's left. No objection but it does not say
> what they do - want to add minimal documentation?
> And please prefix with VMWARE_ or something.
> 
> [DF] Prefix added, macros documented
> 
> File vmxnet_utils.h (and related with the same problems):
> 
> ...
> > + * See the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#ifndef _VMXNET_UTILS_H_
> > +#define _VMXNET_UTILS_H_
> 
> Please do not start identifiers with _ followed by an uppercase
> lattters.
> 
> [DF] Fixed
> 
> ...
> > +
> > +struct eth_header {
> > +    uint8_t  h_dest[ETH_ALEN];   /* destination eth addr */
> > +    uint8_t  h_source[ETH_ALEN]; /* source ether addr    */
> > +    uint16_t h_proto;            /* packet type ID field */
> > +};
> > +
> 
> And fix struct definitions to
> 1. follow qemu coding style
> [DF] It's not clear what's wrong with the coding style. Please, elaborate.
> 2. start with vmxnet
> [DF] Since this is generic network definitions with no device specifics
> [DF] I'm not sure it makes sense to add vmxnet prefix.
> [DF] I'd say it makes sense to put them into some generic header files under 
> "net" directory instead
> [DF] and clean up other devices that have their local definitions of the same 
> structures.
> [DF] Please, advise.
> 
> 
> I also don't really understand why are these
> functions split out - vmxnet is the only user, no?
> 
> [DF] Currently vmxnet3 is the only user, hovewer we have vmxnet2 
> implementation as well (not submitted yet)
> [DF] and it uses the same header
> 
> File vmxnet_pkt.h:
> 
> ...
> > +
> > +/* tx module context handle */
> > +typedef void *VmxnetTxPktH;
> 
> This gets you zero type safety and makes debugging impossible.
> Just use pointers like everyone does.
> 
> [DF] Handle type is added to hide VmxnetTxPkt structure into .c header
> [DF] for better encapsulation.
> [DF] Should we drop it anyway?
> 
> Files vmxnet3.*:
> 
> ...
> > +
> > +    if (zero_region) {
> > +        vmw_shmem_set(pa, 0, size*cell_size);
> 
> spaces around *
> 
> [DF] Fixed
> 
> ...
> > +#define vmxnet3_ring_dump(macro, ring_name, ridx, r)                       
> >   
> > \
> > +    macro("%s#%d: base %" PRIx64 " size %lu cell_size %lu gen %d next 
> > %lu",  
> > \
> > +          (ring_name), (ridx),                                             
> >   
> > \
> > +          (r)->pa, (r)->size, (r)->cell_size, (r)->gen, (r)->next)
> 
> make macros upper case
> 
> [DF] Fixed
> 
> ...
> > +static inline void vmxnet3_flush_shmem_changes(void)
> > +{
> > +    /*
> > +     * Flush shared memory changes
> > +     * Needed before transferring control to guest
> 
> what does 'transferring control to guest' mean?
> [DF] Changed to 
> [DF]    /*
> [DF]     * Flush shared memory changes
> [DF]     * Needed before sending interrupt to guest to ensure
> [DF]     * it gets consistent memory state
> [DF]     */
> 
> ...
> > +     */
> > +    smp_wmb();
> > +}
> 
> Don't use wrappers like this. They just hide bugs. For example
> it's not helpful before an interrupt in the function below.
> 
> [DF] I guess you are talking about vmxnet3_complete_packet()
> [DF] Strictly speaking barrier is a must because we change shared memory in 
> [DF] vmxnet3_complete_packet()
> 
> [DF] And the wrapper is a good thing because its name explains its effect
> [DF] in a formal way as opposed to comments
> 
> ...
> > +    switch (status) {
> > +    case VMXNET3_PKT_STATUS_OK: {
> 
> don't put {} around cases: they align incorrectly
> if it's too big move to a function.
> 
> [DF] Fixed
> 
> ...
> > +static bool
> > +vmxnet3_send_packet(VMXNET3State *s, uint32_t qidx)
> > +{
> > +    size_t bytes_sent = 0;
> > +    bool res = true;
> 
> why = true? don't initialize just because.
> 
> [DF] Fixed
> 
> ...
> > +/*
> > + * VMWARE headers we got from Linux kernel do not fully comply QEMU coding
> > + * standards in sense of types and defines used.
> > + * Since we didn't want to change VMWARE code, following set of typedefs
> > + * and defines needed to compile these headers with QEMU introduced.
> > + */
> 
> No need for this now.
> You can export headers and put them under linux-headers.
> 
> [DF] Not sure it is possible because the header as-is is not stand-alone and 
> won't compile
> [DF] without changes. We extracted definitions we use from their header and 
> dropped unused
> [DF] and kernel-specific stuff.
> [DF} Please, advise.
> 
> ...
> 
> > +        if (VMXNET3_OM_TSO == s->offload_mode) {
> 
> Don't do Yoda style like this
> 
> [DF] "Yoda" style removed everywhere
> 
> 
> Changes in V6:
>    Fixed most of problems pointed out by Michael S. Tsirkin
>    The only issue still open is creation of shared place
>    with generic network structures and functions. Currently
>    all generic network code introduced by VMXNET3 resides in
>    vmxnet_utils.c/h files. It could be moved to some shared location however
>    we believe it is a matter of separate refactoring as there are a lot of 
> copy-pasted
>    definitions in almost every device and code cleanup efforts requred in 
> order
>    to create truly shared codebase.
> 
>      Reported-by: Michael S. Tsirkin <address@hidden>
> 
>    Implemented suggestions by Anthony Liguori 
> 
>      Reported-by: Anthony Liguori <address@hidden>
> 
>    Fixed incorrect checksum caclulation for some packets in SW offloads mode
> 
>      Reported-by: Gerhard Wiesinger <address@hidden>
> 
> Changes in V5:
>    MSI-X save/load implemented in the device instead of pci bus as
>    suggested by Michael S. Tsirkin
> 
>      Reported-by: Michael S. Tsirkin <address@hidden>
> 
>    Patches regrouped as suggested by Paolo Bonzini
> 
>      Reported-by: Paolo Bonzini <address@hidden>
> 
> Changes in V4:
>    Fixed a few problems uncovered by NETIO test suit
>    Assertion on failure to initialize MSI/MSI-X replaced with warning 
>    message and fallback to Legacy/MSI respectively   
> 
>      Reported-by: Gerhard Wiesinger <address@hidden>
> 
>    Various coding style adjustments and patch split-up as suggested by 
> Anthony 
> Liguori
>      
>      Reported-by: Anthony Liguori <address@hidden>
> 
>    Live migration support added
> 
> Changes in V3:
>    Fixed crash when net device that is used as network fronted has no
>    virtio HDR support.
>    Task offloads emulation for cases when net device that is used as 
>    network fronted has no virtio HDR support.
> 
>      Reported-by: Gerhard Wiesinger  <address@hidden>
> 
> Changes in V2:
>    License text changed accoring to community suggestions
>    Standard license header from GPLv2+ - licensed QEMU files used
> 
> Dmitry Fleytman (5):
>   Checksum-related utility functions
>   net: iovec checksum calculator
>   Common definitions for VMWARE devices
>   Packet abstraction for VMWARE network devices
>   VMXNET3 device implementation
> 
>  default-configs/pci.mak |    1 +
>  hw/Makefile.objs        |    2 +
>  hw/pci/pci.h            |    1 +
>  hw/vmware_utils.h       |  143 +++
>  hw/vmxnet3.c            | 2461 
> +++++++++++++++++++++++++++++++++++++++++++++++
>  hw/vmxnet3.h            |  760 +++++++++++++++
>  hw/vmxnet_debug.h       |  115 +++
>  hw/vmxnet_rx_pkt.c      |  187 ++++
>  hw/vmxnet_rx_pkt.h      |  174 ++++
>  hw/vmxnet_tx_pkt.c      |  567 +++++++++++
>  hw/vmxnet_tx_pkt.h      |  148 +++
>  include/net/checksum.h  |   26 +-
>  include/net/eth.h       |  347 +++++++
>  net/Makefile.objs       |    1 +
>  net/checksum.c          |   42 +-
>  net/eth.c               |  217 +++++
>  16 files changed, 5185 insertions(+), 7 deletions(-)
>  create mode 100644 hw/vmware_utils.h
>  create mode 100644 hw/vmxnet3.c
>  create mode 100644 hw/vmxnet3.h
>  create mode 100644 hw/vmxnet_debug.h
>  create mode 100644 hw/vmxnet_rx_pkt.c
>  create mode 100644 hw/vmxnet_rx_pkt.h
>  create mode 100644 hw/vmxnet_tx_pkt.c
>  create mode 100644 hw/vmxnet_tx_pkt.h
>  create mode 100644 include/net/eth.h
>  create mode 100644 net/eth.c
> 
> -- 
> 1.8.1.2
> 

Thanks, applied to my net tree:
https://github.com/stefanha/qemu/commits/net

Stefan



reply via email to

[Prev in Thread] Current Thread [Next in Thread]