qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH] virtio-net: announce self by guest


From: Jason Wang
Subject: Re: [Qemu-devel] [PATCH] virtio-net: announce self by guest
Date: Tue, 20 May 2014 14:06:18 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0

On 05/19/2014 06:06 PM, Michael S. Tsirkin wrote:
> On Mon, May 19, 2014 at 05:34:38PM +0800, Jason Wang wrote:
>> > On 05/18/2014 05:04 PM, Michael S. Tsirkin wrote:
>>> > > On Fri, May 16, 2014 at 01:02:51PM +0800, Jason Wang wrote:
>>>> > >> On 05/15/2014 05:45 PM, Michael S. Tsirkin wrote:
>>>>> > >>> On Thu, May 15, 2014 at 05:22:28PM +0800, Jason Wang wrote:
>>>>>> > >>>> On 05/15/2014 04:28 PM, Michael S. Tsirkin wrote:
>>>>>>> > >>>>> Thanks, looks good.
>>>>>>> > >>>>> Some minor comments below,
>>>>>>> > >>>>>
>>>>>>> > >>>>> On Thu, May 15, 2014 at 03:16:47PM +0800, Jason Wang wrote:
>>>>>>>> > >>>>>> It's hard to track all mac addresses and their configurations 
>>>>>>>> > >>>>>> (e.g
>>>>>>>> > >>>>>> vlan or ipv6) in qemu. Without those informations, it's 
>>>>>>>> > >>>>>> impossible to
>>>>>>> > >>>>> s/those informations/this information/
>>>>>>> > >>>>>
>>>>>>>> > >>>>>> build proper garp packet after migration. The only possible 
>>>>>>>> > >>>>>> solution
>>>>>>>> > >>>>>> to this is let guest (who knew all configurations) to do this.
>>>>>>> > >>>>> s/knew/knows/
>>>>>>> > >>>>>
>>>>>>>> > >>>>>> So, this patch introduces a new readonly config status bit of 
>>>>>>>> > >>>>>> virtio-net,
>>>>>>>> > >>>>>> VIRTIO_NET_S_ANNOUNCE which is used to notify guest to 
>>>>>>>> > >>>>>> announce
>>>>>>>> > >>>>>> presence of its link through config update interrupt.When 
>>>>>>>> > >>>>>> guest has
>>>>>>>> > >>>>>> done the announcement, it should ack the notification through
>>>>>>>> > >>>>>> VIRTIO_NET_CTRL_ANNOUNCE_ACK cmd. This feature is negotiated 
>>>>>>>> > >>>>>> by a new
>>>>>>>> > >>>>>> feature bit VIRTIO_NET_F_ANNOUNCE (which has already been 
>>>>>>>> > >>>>>> supported by
>>>>>>>> > >>>>>> Linux guest).
>>>>>>>> > >>>>>>
>>>>>>>> > >>>>>> During load, a counter of announcing rounds were set so that 
>>>>>>>> > >>>>>> the after
>>>>>>> > >>>>> s/were/is/
>>>>>>> > >>>>> s/the after/after/
>>>>>> > >>>> Will correct those typos.
>>>>>>>> > >>>>>> the vm is running it can trigger rounds of config interrupts 
>>>>>>>> > >>>>>> to notify
>>>>>>>> > >>>>>> the guest to build and send the correct garps.
>>>>>>>> > >>>>>>
>>>>>>>> > >>>>>> Tested with ping to guest with vlan during migration. Without 
>>>>>>>> > >>>>>> the
>>>>>>>> > >>>>>> patch, lots of the packets were lost after migration. With 
>>>>>>>> > >>>>>> the patch,
>>>>>>>> > >>>>>> could not notice packet loss after migration.
>>>>>>> > >>>>> below changelog should go after ---, until the ack list.
>>>>>>> > >>>>>
>>>>>> > >>>> Ok.
>>>>>>>> > >>>>>> Reference:
>>>>>>>> > >>>>>> RFC v2: 
>>>>>>>> > >>>>>> https://lists.gnu.org/archive/html/qemu-devel/2014-04/msg01750.html
>>>>>>>> > >>>>>> RFC v1: 
>>>>>>>> > >>>>>> https://lists.gnu.org/archive/html/qemu-devel/2014-03/msg02648.html
>>>>>>>> > >>>>>> V7:     
>>>>>>>> > >>>>>> https://lists.gnu.org/archive/html/qemu-devel/2013-03/msg01127.html
>>>>>>>> > >>>>>>
>>>>>>>> > >>>>>> Changes from RFC v2:
>>>>>>>> > >>>>>> - use QEMU_CLOCK_VIRTUAL instead of QEMU_CLOCK_REALTIME
>>>>>>>> > >>>>>> - compat self announce for 2.0 machine type
>>>>>>>> > >>>>>>
>>>>>>>> > >>>>>> Changes from RFC v1:
>>>>>>>> > >>>>>> - clean VIRTIO_NET_S_ANNOUNCE bit during reset
>>>>>>>> > >>>>>> - free announce timer during clean
>>>>>>>> > >>>>>> - make announce work for non-vhost case
>>>>>>>> > >>>>>>
>>>>>>>> > >>>>>> Changes from V7:
>>>>>>>> > >>>>>> - Instead of introducing a global method for each kind of 
>>>>>>>> > >>>>>> nic, this
>>>>>>>> > >>>>>>   version limits the changes to virtio-net itself.
>>>>>>>> > >>>>>>
>>>>>>>> > >>>>>> Cc: Liuyongan <address@hidden>
>>>>>>>> > >>>>>> Cc: Amos Kong <address@hidden>
>>>>>>>> > >>>>>> Signed-off-by: Jason Wang <address@hidden>
>>>>>>>> > >>>>>> ---
>>>>>>>> > >>>>>>  hw/net/virtio-net.c            |   48 
>>>>>>>> > >>>>>> ++++++++++++++++++++++++++++++++++++++++
>>>>>>>> > >>>>>>  include/hw/i386/pc.h           |    5 ++++
>>>>>>>> > >>>>>>  include/hw/virtio/virtio-net.h |   16 +++++++++++++
>>>>>>>> > >>>>>>  3 files changed, 69 insertions(+), 0 deletions(-)
>>>>>>>> > >>>>>>
>>>>>>>> > >>>>>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
>>>>>>>> > >>>>>> index 940a7cf..98d59e9 100644
>>>>>>>> > >>>>>> --- a/hw/net/virtio-net.c
>>>>>>>> > >>>>>> +++ b/hw/net/virtio-net.c
>>>>>>>> > >>>>>> @@ -25,6 +25,7 @@
>>>>>>>> > >>>>>>  #include "monitor/monitor.h"
>>>>>>>> > >>>>>>  
>>>>>>>> > >>>>>>  #define VIRTIO_NET_VM_VERSION    11
>>>>>>>> > >>>>>> +#define VIRTIO_NET_ANNOUNCE_ROUNDS    3
>>>>>>>> > >>>>>>  
>>>>>>>> > >>>>>>  #define MAC_TABLE_ENTRIES    64
>>>>>>>> > >>>>>>  #define MAX_VLAN    (1 << 12)   /* Per 802.1Q definition */
>>>>>>> > >>>>> I would make it  5 to be consistent with SELF_ANNOUNCE_ROUNDS
>>>>>>> > >>>>> in savevm.c
>>>>>>> > >>>>>
>>>>>>> > >>>>> in fact, let's move SELF_ANNOUNCE_ROUNDS to 
>>>>>>> > >>>>> include/migration/vmstate.h
>>>>>>> > >>>>> and reuse it.
>>>>>> > >>>> Ok.
>>>>>>>> > >>>>>> @@ -99,6 +100,25 @@ static bool virtio_net_started(VirtIONet 
>>>>>>>> > >>>>>> *n, uint8_t status)
>>>>>>>> > >>>>>>          (n->status & VIRTIO_NET_S_LINK_UP) && 
>>>>>>>> > >>>>>> vdev->vm_running;
>>>>>>>> > >>>>>>  }
>>>>>>>> > >>>>>>  
>>>>>>>> > >>>>>> +static void virtio_net_announce_timer(void *opaque)
>>>>>>>> > >>>>>> +{
>>>>>>>> > >>>>>> +    VirtIONet *n = opaque;
>>>>>>>> > >>>>>> +    VirtIODevice *vdev = VIRTIO_DEVICE(n);
>>>>>>>> > >>>>>> +
>>>>>>>> > >>>>>> +    if (n->announce &&
>>>>>>> > >>>>> I would make it > 0 here, just in case it becomes negative as a 
>>>>>>> > >>>>> result
>>>>>>> > >>>>> of some bug.
>>>>>> > >>>> Sure.
>>>>>>>> > >>>>>> +        virtio_net_started(n, vdev->status) &&
>>>>>>>> > >>>>>> +        vdev->guest_features & (0x1 << 
>>>>>>>> > >>>>>> VIRTIO_NET_F_GUEST_ANNOUNCE) &&
>>>>>>>> > >>>>>> +        vdev->guest_features & (0x1 << 
>>>>>>>> > >>>>>> VIRTIO_NET_F_CTRL_VQ)) {
>>>>>>>> > >>>>>> +
>>>>>>>> > >>>>>> +        n->announce--;
>>>>>>>> > >>>>>> +        n->status |= VIRTIO_NET_S_ANNOUNCE;
>>>>>>>> > >>>>>> +        virtio_notify_config(vdev);
>>>>>>>> > >>>>>> +    } else {
>>>>>>>> > >>>>>> +        timer_del(n->announce_timer);
>>>>>>> > >>>>> why do this here?
>>>>>>> > >>>>>
>>>>>>>> > >>>>>> +        n->announce = 0;
>>>>>>> > >>>>> why is this here?
>>>>>>> > >>>>>
>>>>>> > >>>> If guest shuts down the device, there's no need to do the 
>>>>>> > >>>> announcing.
>>>>> > >>> It's still weird.
>>>>> > >>> Either announce is 0 and then timer was not running
>>>>> > >>> or it's > 0 and then this won't trigger.
>>>> > >> Right, the logic here is for QEMU_CLOCK_REALTIME. But there's another
>>>> > >> question, we use QEMU_CLOCK_VIRTUAL while qemu is using
>>>> > >> QEMU_CLOCK_REALTIME for its announcing. This looks fine but not sure
>>>> > >> whether this is safe.
>>> > > Meaning QEMU_CLOCK_REALTIME that qemu uses?
>>> > > Not sure either but it doesn't modify guest state so it seems safe.
>>> > >
>>> > >
>>>> > >> And if the link was down, it's better for us to
>>>> > >> stop the announcing?
>>> > > I think that it doesn't matter: guest won't announce when
>>> > > link is down, anyway.
>>> > > Not worth it to write extra logic here in the host.
>> > 
>> > Ok.
>>>>>>>> > >>>>>> +    }
>>>>>>>> > >>>>>> +}
>>>>>>>> > >>>>>> +
>>>>>>>> > >>>>>>  static void virtio_net_vhost_status(VirtIONet *n, uint8_t 
>>>>>>>> > >>>>>> status)
>>>>>>>> > >>>>>>  {
>>>>>>>> > >>>>>>      VirtIODevice *vdev = VIRTIO_DEVICE(n);
>>>>>>>> > >>>>>> @@ -147,6 +167,8 @@ static void virtio_net_set_status(struct 
>>>>>>>> > >>>>>> VirtIODevice *vdev, uint8_t status)
>>>>>>>> > >>>>>>  
>>>>>>>> > >>>>>>      virtio_net_vhost_status(n, status);
>>>>>>>> > >>>>>>  
>>>>>>>> > >>>>>> +    virtio_net_announce_timer(n);
>>>>>>>> > >>>>>> +
>>>>>>> > >>>>> why do this here? why not right after we set announce counter?
>>>>>> > >>>> The reasons are:
>>>>>> > >>>>
>>>>>> > >>>> - The counters were set in load, but the device is not running so 
>>>>>> > >>>> we
>>>>>> > >>>> could not inject the interrupt at that time.
>>>>> > >>> I see. This makes sense - but this isn't intuitive.
>>>>> > >>> Why don't we simply start timer with current time?
>>>>> > >>> Need to make sure it runs fine if time passes, but
>>>>> > >>> I think it's fine.
>>>> > >> Not sure I get the point, I didn't see any differences except for an
>>>> > >> extra timer fire.
>>> > > The only reason you call virtio_net_announce_timer from  set_status
>>> > > is because it gets run on vm start/stop.
>>> > > It's true but not intuitive.
>>> > > Just run timer always from timer, it's clearer this way :)
>>> > >
>> > 
>> > Sure.
>>>>>> > >>>> - We can stop the progress when guest is shutting down the device.
>>>>> > >>> On shut down guest will reset device stopping timer - this seems 
>>>>> > >>> enough.
>>>> > >> Yes, I see.
>>>>>>>> > >>>>>>      for (i = 0; i < n->max_queues; i++) {
>>>>>>>> > >>>>>>          q = &n->vqs[i];
>>>>>>>> > >>>>>>  
>>>>>>>> > >>>>>> @@ -322,6 +344,9 @@ static void virtio_net_reset(VirtIODevice 
>>>>>>>> > >>>>>> *vdev)
>>>>>>>> > >>>>>>      n->nobcast = 0;
>>>>>>>> > >>>>>>      /* multiqueue is disabled by default */
>>>>>>>> > >>>>>>      n->curr_queues = 1;
>>>>>>>> > >>>>>> +    timer_del(n->announce_timer);
>>>>>>>> > >>>>>> +    n->announce = 0;
>>>>>>>> > >>>>>> +    n->status &= ~VIRTIO_NET_S_ANNOUNCE;
>>>>>>>> > >>>>>>  
>>>>>>>> > >>>>>>      /* Flush any MAC and VLAN filter table state */
>>>>>>>> > >>>>>>      n->mac_table.in_use = 0;
>>>>>>>> > >>>>>> @@ -731,6 +756,22 @@ static int 
>>>>>>>> > >>>>>> virtio_net_handle_vlan_table(VirtIONet *n, uint8_t cmd,
>>>>>>>> > >>>>>>      return VIRTIO_NET_OK;
>>>>>>>> > >>>>>>  }
>>>>>>>> > >>>>>>  
>>>>>>>> > >>>>>> +static int virtio_net_handle_announce(VirtIONet *n, uint8_t 
>>>>>>>> > >>>>>> cmd,
>>>>>>>> > >>>>>> +                                      struct iovec *iov, 
>>>>>>>> > >>>>>> unsigned int iov_cnt)
>>>>>>>> > >>>>>> +{
>>>>>>>> > >>>>>> +    if (cmd == VIRTIO_NET_CTRL_ANNOUNCE_ACK) {
>>>>>>>> > >>>>>> +        n->status &= ~VIRTIO_NET_S_ANNOUNCE;
>>>>>>>> > >>>>>> +        if (n->announce) {
>>>>>>> > >>>>> I would make it > 0 here, just in case it becomes negative as a 
>>>>>>> > >>>>> result
>>>>>>> > >>>>> of some bug.
>>>>>> > >>>> Ok.
>>>>>>>> > >>>>>> +            timer_mod(n->announce_timer,
>>>>>>>> > >>>>>> +                      qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) 
>>>>>>>> > >>>>>> + 50 +
>>>>>>>> > >>>>>> +                      (VIRTIO_NET_ANNOUNCE_ROUNDS - 
>>>>>>>> > >>>>>> n->announce - 1) * 100);
>>>>>>> > >>>>> savevm.c has this code, and a comment:
>>>>>>> > >>>>>         /* delay 50ms, 150ms, 250ms, ... */
>>>>>>> > >>>>>         50 + (SELF_ANNOUNCE_ROUNDS - count - 1) * 100
>>>>>>> > >>>>>
>>>>>>> > >>>>> how about an API in include/migration/vmstate.h
>>>>>>> > >>>>>
>>>>>>> > >>>>> static inline
>>>>>>> > >>>>> int64_t self_announce_delay(int round)
>>>>>>> > >>>>> {
>>>>>>> > >>>>>         assert(round < SELF_ANNOUNCE_ROUNDS && round > 0);
>>>>>>> > >>>>>         /* delay 50ms, 150ms, 250ms, ... */
>>>>>>> > >>>>>         return 50 + (SELF_ANNOUNCE_ROUNDS - round - 1) * 100;
>>>>>>> > >>>>> }
>>>>>>> > >>>>>
>>>>>>> > >>>>> or something to this end.
>>>>>>> > >>>>>
>>>>>> > >>>> Good idea, will do this.
>>>>>>>> > >>>>>> +        }
>>>>>>>> > >>>>>> +        return VIRTIO_NET_OK;
>>>>>>>> > >>>>>> +    } else {
>>>>>>>> > >>>>>> +        return VIRTIO_NET_ERR;
>>>>>>>> > >>>>>> +    }
>>>>>>>> > >>>>>> +}
>>>>>>>> > >>>>>> +
>>>>>>>> > >>>>>>  static int virtio_net_handle_mq(VirtIONet *n, uint8_t cmd,
>>>>>>>> > >>>>>>                                  struct iovec *iov, unsigned 
>>>>>>>> > >>>>>> int iov_cnt)
>>>>>>>> > >>>>>>  {
>>>>>>>> > >>>>>> @@ -794,6 +835,8 @@ static void 
>>>>>>>> > >>>>>> virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
>>>>>>>> > >>>>>>              status = virtio_net_handle_mac(n, ctrl.cmd, iov, 
>>>>>>>> > >>>>>> iov_cnt);
>>>>>>>> > >>>>>>          } else if (ctrl.class == VIRTIO_NET_CTRL_VLAN) {
>>>>>>>> > >>>>>>              status = virtio_net_handle_vlan_table(n, 
>>>>>>>> > >>>>>> ctrl.cmd, iov, iov_cnt);
>>>>>>>> > >>>>>> +        } else if (ctrl.class == VIRTIO_NET_CTRL_ANNOUNCE) {
>>>>>>>> > >>>>>> +            status = virtio_net_handle_announce(n, ctrl.cmd, 
>>>>>>>> > >>>>>> iov, iov_cnt);
>>>>>>>> > >>>>>>          } else if (ctrl.class == VIRTIO_NET_CTRL_MQ) {
>>>>>>>> > >>>>>>              status = virtio_net_handle_mq(n, ctrl.cmd, iov, 
>>>>>>>> > >>>>>> iov_cnt);
>>>>>>>> > >>>>>>          } else if (ctrl.class == 
>>>>>>>> > >>>>>> VIRTIO_NET_CTRL_GUEST_OFFLOADS) {
>>>>>>>> > >>>>>> @@ -1451,6 +1494,7 @@ static int virtio_net_load(QEMUFile *f, 
>>>>>>>> > >>>>>> void *opaque, int version_id)
>>>>>>>> > >>>>>>          qemu_get_subqueue(n->nic, i)->link_down = link_down;
>>>>>>>> > >>>>>>      }
>>>>>>>> > >>>>>>  
>>>>>>>> > >>>>>> +    n->announce = VIRTIO_NET_ANNOUNCE_ROUNDS;
>>>>>>> > >>>>> Well if virtio_net_handle_announce runs now it will start timer
>>>>>>> > >>>>> in the past, right?
>>>>>>> > >>>>> This seems wrong.
>>>>>> > >>>> Not sure I get the case. When in virtio_net_load() the vm is not 
>>>>>> > >>>> even
>>>>>> > >>>> running so looks like virtio_net_handle_announce() could not run 
>>>>>> > >>>> in the
>>>>>> > >>>> same time.
>>>>> > >>> I see, this works because you decrement it when VM starts running.
>>>>> > >>> I think it's not a good idea to rely on this though,
>>>>> > >>> better do everything from timer, and handle
>>>>> > >>> the case of command arriving too early.
>>>>> > >>>
>>>> > >> Right, if QEMU_CLOCK_VIRTUAL is fine, we can do everything in a timer.
>>>> > >>
>>>> > >> For the case of command arriving too early. Is there anything else we
>>>> > >> need to do? Since we only start the next timer when we get ack 
>>>> > >> command.
>>>> > >>
>>>> > >> Thanks
>>> > > I think we need to make sure we don't set the timer in the past or
>>> > > very far in the future.
>> > 
>> > n->announce is between 0 and VIRTIO_NET_ANNOUNCE_ROUNDS - 1, when doing
>> > timer_mod(), so it looks ok.
> All I am saying, make sure it's within the range even if
> handle announce is called before timer_mod.
> Can not happen with your current patch, but must make sure
> it's still correct after you make changes :)
> Also, maybe add an assert there.
>

I get your meaning. Will post V2.

Thanks



reply via email to

[Prev in Thread] Current Thread [Next in Thread]