From: Michael S. Tsirkin
Subject: Re: [PATCH v7] virtio-net: Fix network stall at the host side waiting for kick
Date: Tue, 2 Jul 2024 07:27:43 -0400

On Tue, Jul 02, 2024 at 07:45:31AM +0800, Yang Dongshan wrote:
> > what does "changed" mean here? changed compared to what?
> For a split queue, it means the shadow_avail_idx that was synced from the
> avail ring idx by vring_avail_idx(vq) last time no longer equals the
> current value of the avail ring idx:
> 
> vq->shadow_avail_idx != vring_avail_idx(vq);
> 
> For a packed queue the logic is similar: if the descriptor at
> vq->shadow_avail_idx becomes available, it means the guest has added a
> buf at that slot.
> 
> vring_packed_desc_read(vq->vdev, &desc, &caches->desc,
>                        vq->shadow_avail_idx, true);
> if (is_desc_avail(desc.flags, vq->shadow_avail_wrap_counter))
>     return true;
> 
> 

This answer does not make sense from an API point of view.

My suggestion:

        change virtqueue_get_avail_bytes to return the shadow
        in an opaque unsigned value.

        add virtqueue_poll that takes this opaque value and tells us whether
        any new buffers became available in the queue since that value
        was returned.

        accordingly, virtio_queue_set_notification_and_check
        will accept this opaque value and check avail buffers
        against it.
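
To illustrate, here is a rough sketch of what that interface and a caller in
virtio_net_has_buffers() could look like. All signatures and the simplified
size check below are assumptions for illustration only, not actual QEMU code:

    /* Sketch only: names, signatures and semantics are assumptions. */

    /* Would additionally return the shadow avail index it observed,
     * as an opaque unsigned value. */
    unsigned virtqueue_get_avail_bytes(VirtQueue *vq,
                                       unsigned int *in_bytes,
                                       unsigned int *out_bytes,
                                       unsigned max_in_bytes,
                                       unsigned max_out_bytes);

    /* Would report whether any new buffers became available in the
     * queue since the opaque value was returned. */
    bool virtqueue_poll(VirtQueue *vq, unsigned opaque);

    /* Would enable/disable notification and, when enabling, check for
     * newly available buffers against the caller-supplied opaque value
     * (e.g. by building on virtqueue_poll()). */
    bool virtio_queue_set_notification_and_check(VirtQueue *vq, int enable,
                                                 unsigned opaque);

    /* Hypothetical caller: remember the opaque value from the size check,
     * then re-check against it after enabling notification. */
    static int virtio_net_has_buffers(VirtIONetQueue *q, int bufsize)
    {
        unsigned int in_bytes, out_bytes;
        unsigned opaque;

        for (;;) {
            opaque = virtqueue_get_avail_bytes(q->rx_vq, &in_bytes, &out_bytes,
                                               bufsize, 0);
            if (in_bytes >= (unsigned int)bufsize) {
                break;          /* enough room, go on to receive */
            }
            /* Not enough room: arm notification, then check whether the
             * guest added buffers after the opaque snapshot was taken. */
            if (!virtio_queue_set_notification_and_check(q->rx_vq, 1, opaque)) {
                return 0;       /* nothing new, wait for the guest's kick */
            }
            virtio_queue_set_notification(q->rx_vq, 0);
            /* New buffers appeared meanwhile: loop and re-check the size. */
        }

        virtio_queue_set_notification(q->rx_vq, 0);
        return 1;
    }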



> On Tue, Jul 2, 2024 at 2:46 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> 
>     On Tue, Jul 02, 2024 at 01:18:15AM +0800, Yang Dongshan wrote:
>     > > Please document what this does.
>     > okay, i will.
>     >
>     > > So this will return false if ring has any available buffers?
>     > > Equivalent to:
>     > > 
>     > > bool virtio_queue_set_notification_and_check(VirtQueue *vq, int enable)
>     > > {
>     > >        virtio_queue_packed_set_notification(vq, enable);
>     > >         return virtio_queue_empty(vq);
>     > > }
>     >
>     > No, only when the shadow_avail_idx is changed shall the function return true,
> 
> 
>     what does "changed" mean here? changed compared to what?
> 
>     > compared with the value seen by the host last time, else return false even if
>     > there are some buffers available in the queue, as the total size of the
>     > available buffers in the queue can't satisfy the request.
>     >
>     > It may be better to pass only one arg to the function like this:
>     > bool virtio_queue_set_notification_and_check(VirtQueue *vq)
>     > {
>     >        virtio_queue_packed_set_notification(vq, true);
>     >        
>     >        return shadow_avail_idx_changed() ? true : false;
>     > }
>     >
>     > Thanks Michael a lot!
>     >
>     > On Mon, Jul 1, 2024 at 11:05 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>     >
>     >     On Mon, Jul 01, 2024 at 10:00:17PM +0800, Wencheng Yang wrote:
>     >     > From: thomas <east.moutain.yang@gmail.com>
>     >     >
>     >     > Patch 06b12970174 ("virtio-net: fix network stall under load")
>     >     > added a double-check to test whether the available buffer size
>     >     > can satisfy the request or not, in case the guest has added
>     >     > some buffers to the avail ring simultaneously after the first
>     >     > check. If the available buffer size happens to become sufficient
>     >     > after the double-check, the host can send the packet to the
>     >     > guest. But if the buffer size still can't satisfy the request,
>     >     > even though the guest has added some buffers, virtio-net would
>     >     > stall at the host side forever.
>     >     >
>     >     > When the available buffers are not sufficient, the patch checks
>     >     > whether the guest has added some buffers after the last check of
>     >     > the avail idx; if so, it rechecks the available buffers in a loop.
>     >     >
>     >     > The patch also reverts commit 06b12970174.
>     >     >
>     >     > The case below can reproduce the stall.
>     >     >
>     >     >                                        Guest 0
>     >     >                                      +--------+
>     >     >                                      | iperf  |
>     >     >                     ---------------> | server |
>     >     >          Host       |                +--------+
>     >     >        +--------+   |                    ...
>     >     >        | iperf  |----
>     >     >        | client |----                  Guest n
>     >     >        +--------+   |                +--------+
>     >     >                     |                | iperf  |
>     >     >                     ---------------> | server |
>     >     >                                      +--------+
>     >     >
>     >     > Boot many guests from qemu with virtio network:
>     >     >  qemu ... -netdev tap,id=net_x \
>     >     >     -device virtio-net-pci-non-transitional,\
>     >     >     iommu_platform=on,mac=xx:xx:xx:xx:xx:xx,netdev=net_x
>     >     >
>     >     > Each guest acts as iperf server with commands below:
>     >     >  iperf3 -s -D -i 10 -p 8001
>     >     >  iperf3 -s -D -i 10 -p 8002
>     >     >
>     >     > The host as iperf client:
>     >     >  iperf3 -c guest_IP -p 8001 -i 30 -w 256k -P 20 -t 40000
>     >     >  iperf3 -c guest_IP -p 8002 -i 30 -w 256k -P 20 -t 40000
>     >     >
>     >     > After some time, the host loses connection to the guest,
>     >     > the guest can send packet to the host, but can't receive
>     >     > packet from host.
>     >     >
>     >     > It's more likely to happen if SWIOTLB is enabled in the guest:
>     >     > allocating and freeing the bounce buffer takes some CPU ticks,
>     >     > and copying from/to the bounce buffer takes even more, compared
>     >     > with a guest that has no bounce buffer.
>     >     > Once the rate at which the host produces packets approximates
>     >     > the rate at which the guest receives them, the guest would
>     >     > loop in NAPI.
>     >     >
>     >     >          receive packets    ---
>     >     >                |             |
>     >     >                v             |
>     >     >            free buf      virtnet_poll
>     >     >                |             |
>     >     >                v             |
>     >     >      add buf to avail ring  ---
>     >     >                |
>     >     >                |  need kick the host?
>     >     >                |  NAPI continues
>     >     >                v
>     >     >          receive packets    ---
>     >     >                |             |
>     >     >                v             |
>     >     >            free buf      virtnet_poll
>     >     >                |             |
>     >     >                v             |
>     >     >      add buf to avail ring  ---
>     >     >                |
>     >     >                v
>     >     >               ...           ...
>     >     >
>     >     > On the other hand, the host fetches free bufs from the avail
>     >     > ring. If the bufs in the avail ring are not enough, the
>     >     > host notifies the guest of the event by writing the avail
>     >     > idx read from the avail ring to the event idx of the used ring,
>     >     > then the host goes to sleep, waiting for the kick signal
>     >     > from the guest.
>     >     >
>     >     > Once the guest finds the host is waiting for the kick signal
>     >     > (in virtqueue_kick_prepare_split()), it kicks the host.
>     >     >
>     >     > The host may stall forever at the sequences below:
>     >     >
>     >     >          Host                        Guest
>     >     >      ------------                 -----------
>     >     >  fetch buf, send packet           receive packet ---
>     >     >          ...                          ...         |
>     >     >  fetch buf, send packet             add buf       |
>     >     >          ...                        add buf   virtnet_poll
>     >     >     buf not enough      avail idx-> add buf       |
>     >     >     read avail idx                  add buf       |
>     >     >                                     add buf      ---
>     >     >                                   receive packet ---
>     >     >     write event idx                   ...         |
>     >     >     wait for kick                   add buf   virtnet_poll
>     >     >                                       ...         |
>     >     >                                                  ---
>     >     >                                  no more packet, exit NAPI
>     >     >
>     >     > In the first loop of NAPI above, indicated in the range of
>     >     > virtnet_poll above, the host is sending packets while the
>     >     > guest is receiving packets and adding buffers.
>     >     >  step 1: The bufs are not enough, for example, a big packet
>     >     >          needs 5 bufs, but the available buf count is 3.
>     >     >          The host reads the current avail idx.
>     >     >  step 2: The guest adds some bufs, then checks whether the
>     >     >          host is waiting for a kick signal; it is not at this
>     >     >          time. The used ring is not empty, so the guest
>     >     >          continues with the second loop of NAPI.
>     >     >  step 3: The host writes the avail idx read from the avail
>     >     >          ring to the used ring as the event idx via
>     >     >          virtio_queue_set_notification(q->rx_vq, 1).
>     >     >  step 4: At the end of the second loop of NAPI, the guest
>     >     >          rechecks whether a kick is needed; as the event idx
>     >     >          in the used ring written by the host is beyond the
>     >     >          range of the kick condition, the guest will not
>     >     >          send a kick signal to the host.
>     >     >
>     >     > Fixes: 06b12970174 ("virtio-net: fix network stall under load")
>     >     > Signed-off-by: Wencheng Yang <east.moutain.yang@gmail.com>
>     >     > ---
>     >     >
>     >     > Changelog:
>     >     > v7:
>     >     > - Add function virtio_queue_set_notification_and_check()
>     >     > - Restore the function sequence introduced in v6
>     >     >
>     >     > v6:
>     >     > - Take the packed queue into consideration
>     >     > - Adjust function sequence to fix compilation issue
>     >     >
>     >     > v5:
>     >     > - Modify return type of virtio_queue_set_notification() to
>     >     >   bool to indicate whether the guest has added some buffers
>     >     >   after last check of avail idx
>     >     > - Loop in virtio_net_has_buffers() if the available buffers
>     >     >   are not sufficient and the guest has added some buffers.
>     >     > - Revert patch "06b12970174"
>     >     > - Update the subject
>     >     >
>     >     > v4:
>     >     > - Correct spelling mistake in the subject
>     >     > - Describe the issue that virtio-net is blocked at host side
>     >     >
>     >     > v3:
>     >     > - Add virtio-net tag in the subject
>     >     > - Refine commit log
>     >     >
>     >     > v2:
>     >     > - Add SOB tag at the end of the commit message
>     >     > - Place Fixes tag at the end of the commit message
>     >     >
>     >     > v1:
>     >     > - Initial patch
>     >     >
>     >     >  hw/net/virtio-net.c        | 19 ++++++++-----------
>     >     >  hw/virtio/virtio.c         | 39 ++++++++++++++++++++++++++++++++++++++
>     >     >  include/hw/virtio/virtio.h |  1 +
>     >     >  3 files changed, 48 insertions(+), 11 deletions(-)
>     >     >
>     >     > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
>     >     > index 9c7e85caea..357c651d9b 100644
>     >     > --- a/hw/net/virtio-net.c
>     >     > +++ b/hw/net/virtio-net.c
>     >     > @@ -1642,18 +1642,15 @@ static bool virtio_net_can_receive(NetClientState *nc)
>     >     >  static int virtio_net_has_buffers(VirtIONetQueue *q, int bufsize)
>     >     >  {
>     >     >      VirtIONet *n = q->n;
>     >     > -    if (virtio_queue_empty(q->rx_vq) ||
>     >     > -        (n->mergeable_rx_bufs &&
>     >     > -         !virtqueue_avail_bytes(q->rx_vq, bufsize, 0))) {
>     >     > -        virtio_queue_set_notification(q->rx_vq, 1);
>     >     > -
>     >     > -        /* To avoid a race condition where the guest has made some buffers
>     >     > -         * available after the above check but before notification was
>     >     > -         * enabled, check for available buffers again.
>     >     > -         */
>     >     > -        if (virtio_queue_empty(q->rx_vq) ||
>     >     > +
>     >     > +    while (virtio_queue_empty(q->rx_vq) ||
>     >     >              (n->mergeable_rx_bufs &&
>     >     > -             !virtqueue_avail_bytes(q->rx_vq, bufsize, 0))) {
>     >     > +            !virtqueue_avail_bytes(q->rx_vq, bufsize, 0))) {
>     >     > +        /* guest may have made some buf, try again */
>     >     > +        if (virtio_queue_set_notification_and_check(q->rx_vq, 1)) {
>     >     > +            virtio_queue_set_notification(q->rx_vq, 0);
>     >     > +            continue;
>     >     > +        } else {
>     >     >              return 0;
>     >     >          }
>     >     >      }
>     >     > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>     >     > index 893a072c9d..b165764017 100644
>     >     > --- a/hw/virtio/virtio.c
>     >     > +++ b/hw/virtio/virtio.c
>     >     > @@ -668,6 +668,45 @@ static inline bool is_desc_avail(uint16_t flags, bool wrap_counter)
>     >     >      return (avail != used) && (avail == wrap_counter);
>     >     >  }
>     >     > 
>     >     > +bool virtio_queue_set_notification_and_check(VirtQueue *vq, int enable)
>     >     > +{
>     >     > +    uint16_t shadow_idx;
>     >     > +    VRingPackedDesc desc;
>     >     > +    VRingMemoryRegionCaches *caches;
>     >     > +
>     >     > +    vq->notification = enable;
>     >     > +
>     >     > +    if (!vq->vring.desc) {
>     >     > +        return false;
>     >     > +    }
>     >     > +
>     >     > +    if (virtio_vdev_has_feature(vq->vdev, VIRTIO_F_RING_PACKED)) {
>     >     > +        virtio_queue_packed_set_notification(vq, enable);
>     >     > +
>     >     > +        if (enable) {
>     >     > +            caches = vring_get_region_caches(vq);
>     >     > +            if (!caches) {
>     >     > +                return false;
>     >     > +            }
>     >     > +
>     >     > +            vring_packed_desc_read(vq->vdev, &desc, &caches->desc,
>     >     > +                                   vq->shadow_avail_idx, true);
>     >     > +            if (is_desc_avail(desc.flags, vq->shadow_avail_wrap_counter)) {
>     >     > +                return true;
>     >     > +            }
>     >     > +        }
>     >     > +    } else {
>     >     > +        shadow_idx = vq->shadow_avail_idx;
>     >     > +        virtio_queue_split_set_notification(vq, enable);
>     >     > +
>     >     > +        if (enable) {
>     >     > +            return shadow_idx != vring_avail_idx(vq);
>     >     > +        }
>     >     > +    }
>     >     > +
>     >     > +    return false;
>     >     > +}
>     >
>     >     Please document what this does.
>     >     So this will return false if the ring has any available buffers?
>     >     Equivalent to:
>     >
>     >     bool virtio_queue_set_notification_and_check(VirtQueue *vq, int enable)
>     >     {
>     >             virtio_queue_packed_set_notification(vq, enable);
>     >             return virtio_queue_empty(vq);
>     >     }
>     >
>     >
>     >     or am I missing something?
>     >
>     >     > +
>     >     >  /* Fetch avail_idx from VQ memory only when we really need to know if
>     >     >   * guest has added some buffers.
>     >     >   * Called within rcu_read_lock().  */
>     >     > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
>     >     > index 7d5ffdc145..ed85023b87 100644
>     >     > --- a/include/hw/virtio/virtio.h
>     >     > +++ b/include/hw/virtio/virtio.h
>     >     > @@ -302,6 +302,7 @@ void virtio_notify_config(VirtIODevice *vdev);
>     >     > 
>     >     >  bool virtio_queue_get_notification(VirtQueue *vq);
>     >     >  void virtio_queue_set_notification(VirtQueue *vq, int enable);
>     >     > +bool virtio_queue_set_notification_and_check(VirtQueue *vq, int enable);
>     >     > 
>     >     >  int virtio_queue_ready(VirtQueue *vq);
>     >     > 
>     >     > --
>     >     > 2.39.0
>     >
>     >
> 
> 



