qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2 1/2] net: introduce MAC_TABLE_CHANGED event


From: Amos Kong
Subject: Re: [Qemu-devel] [PATCH v2 1/2] net: introduce MAC_TABLE_CHANGED event
Date: Fri, 31 May 2013 11:02:54 +0800
User-agent: Mutt/1.5.21 (2010-09-15)

On Fri, May 31, 2013 at 08:35:28AM +0800, Amos Kong wrote:
> On Thu, May 30, 2013 at 04:54:41PM +0300, Michael S. Tsirkin wrote:
> > On Tue, May 28, 2013 at 08:25:56AM -0400, Luiz Capitulino wrote:
> > > On Tue, 28 May 2013 06:43:04 +0800
> > > Amos Kong <address@hidden> wrote:
> > > 
> > > > On Mon, May 27, 2013 at 09:24:28AM -0400, Luiz Capitulino wrote:
> > > > > On Mon, 27 May 2013 09:10:11 -0400
> > > > > Luiz Capitulino <address@hidden> wrote:
> > > > > 
> > > > > > > We use the QMP event to notify management about the mac changing.
> > > > > > > 
> > > > > > > In this thread, we _wrongly_ considered to use qmp approach to 
> > > > > > > delay
> > > > > > > the event for avoiding the flooding.
> > > > > > > 
> > > > > > >   eg: monitor_protocol_event_throttle(NIC_RX_FILTER_CHANGED, 
> > > > > > > 1000);
> > > > > > > 
> > > > > > > Now we have a solution (using a flag to turn on/off the notify) 
> > > > > > > to avoid the
> > > > > > > flooding, only emit the event if we have no un-read event.
> > > > > > > 
> > > > > > > If we want to (flag is on) emit the event, we wish the event be 
> > > > > > > sent ASAP
> > > > > > > (so event_throttle isn't needed).
> > > > > > 
> > > > > > Unfortunately this doesn't answer my question. I did understand why 
> > > > > > you're
> > > > > > not using the event throttle API (which is because you don't want 
> > > > > > to slow down
> > > > > > the guest, not the QMP client).
> > > > > > 
> > > > > > My point is whether coupling the event with the query command is 
> > > > > > really
> > > > > > justified or even if it really fixes the problem. Two points:
> > > > > > 
> > > > > >  1. Coupling them is bad design, and will probably strike back, as 
> > > > > > we plan
> > > > > >     for a better API for events where events can be disabled
> > > > > 
> > > > > I meant we may in the future, for example, introduce the ability to 
> > > > > disable
> > > > > commands (and events). One could argue that the event w/o a query 
> > > > > command
> > > > > is not that useful, as events can be lost. But loosing an event is 
> > > > > one thing,
> > > > > not having it because it got disabled by a side effect is another.
> > > > 
> > > > event_throttle() couples the event in QMP framework, but we use flags
> > > > to disabled the event from real source (emit points/senders). 
> > > > 
> > > > If we set the evstate->rate to -1, we can ignore the events in
> > > > monitor_protocol_event_queue(), but we could not control the event
> > > > emitting of each emit point (each nic).
> > > >  
> > > > > But anyway, my main point in this thread is to make sure we at least
> > > > > justify having this coupling. Aren't we optimizing prematurely? Aren't
> > > > > we optimizing for a corner case? That's what I want to see answered.
> > > > 
> > > > If it's a corner case, we don't need a general API to disable event.
> > > 
> > > If it's a corner case, it's really worth to fix it?
> > > 
> > > I think that what we need a real world test-case to show us we're
> > > doing the right thing.
> > > 
> > > > We can disable this event by a flag, and introduce a new API
> > > > if we have same request from other events.
> > > > 
> > > > > >  2. Can you actually show the problem does exist so that we ensure 
> > > > > > this is
> > > > > >     not premature optimization? Might be a good idea to have this 
> > > > > > in the
> > > > > >     commit log
> > > > > > 
> > > > > > > > (which is to couple the event with the query command) is
> > > > > > > > appropriate. We're in user-space already, many things could slow
> > > > > > > > the guest down apart from the event generation.
> > > > > > > > 
> > > > > > > > Two questions:
> > > > > > > > 
> > > > > > > >  1. Do we know how slow (or how many packets are actually 
> > > > > > > > dropped)
> > > > > > > >     if the mac is changed too often *and* the event is always 
> > > > > > > > sent?
> > > > > > > 
> > > > > > > We always disable interface first, then change the macaddr.
> > > > > > > But we just have patch to allow guest to change macaddr of
> > > > > > > virtio-net when the interface is running.
> > > > > > > 
> > > > > > > | commit 2dcd0cce551983afe2f900125457f10bb5d980ae
> > > > > > > | Author: Jiri Pirko <address@hidden>
> > > > > > > | Date:   Tue Dec 11 15:33:56 2012 -0500
> > > > > > > | 
> > > > > > > |     [virt] virtio_net: allow to change mac when iface is running
> > > > > > > 
> > > > > > > >  2. Does this solution consider what happens if the QMP client 
> > > > > > > > does
> > > > > > > >     respond timely to the event by issuing the query-rx-filter
> > > > > > > >     command?
> > > > > > > 
> > > > > > > We assume that the QMP client (management) cares about the mac 
> > > > > > > changing
> > > > > > > event, and will query the latest rx-filter state and sync to 
> > > > > > > macvtap
> > > > > > > device.
> > > > > > > 
> > > > > > > 1) If QMP client respond timely to the event: that's what we 
> > > > > > > expected :)
> > > > > > 
> > > > > > Won't this slow down the guest? If not, why?
> > > > 
> > > > If guest changes fx-filter configs frequently & management always query 
> > > > the
> > > > event very timely, this will slow down the guest.
> > > > 
> > > > We should detect & process the abnormal behavior from management.
> > > 
> > > That's not abnormal. Management is doing what it should do.
> > > 
> > > Maybe using the event throttle API can solve the mngt side of the problem,
> > > but I still think we need a reproducible test-case to ensure we're doing
> > > the right thing.
> > 
> > I agree we should make sure this code is tested.
> > It's pretty easy: run ifconfig in a loop in guest.
> > 
> > Amos, did you try this? Probably should otherwise
> > we don't really know whether the logic works.
> 
> With v4 patch (without using event throttle)
> 
> 1. continually query rx-filter from monitor
> # while true; do echo "info rx-filter" | nc -U /tmp/m; done
> 
> 2. change mac in guest repeatedly
> # while true; do
>  ifconfig eth1 down; ifconfig eth1 hw ether 12:00:00:00:00:00
>  ifconfig eth1 down; ifconfig eth1 hw ether 14:00:00:00:00:00
>  done
> 
> One time (down if, change mac, up) takes about 3,500,000 ns in guest
> some event will be ignored by qemu. I will try to only query when it
> gets NIC_RX_FILTER_CHANGE event from QMP monitor, query ASAP.
> 
> ==> Resource usage:
> 
>   PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
> 16387 root      20   0 2375m 326m 6684 R 104.2  4.2   8:32.16 qemu-system-x86
> 
> loop script takes about 10% guest cpu (1 core), guest is not slow
> 
> ----
> If we don't use the flag (same effect as that management taks 0 ns to
> response & complete the query after event comming)


>   PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND         
>                                                                               
>                      
>  4317 root      20   0 2377m 285m 6664 R 103.4  3.7   2:06.82 qemu-system-x86
> 
> guest is very slow (no response for the keyboard input), output:
> clocksource tsc unstable.

^^  The reproduce rate : about 10%

I did the following tests, it seems repeatedly executing vq command
with large data will cause guest hung.

Could not slow guest by repeatedly changing rx-filter (reference test1).



NOTE: Test1,2,3,4 didn't use control flag, emit all the events to monitor.

==> test1 (applied v4)
I tried to test by repeatedly change promisc mode (on/off).
The difference is that there is no data passed when it
executes vq command. Changing main mac will pass 1 macaddr (6 bytes).

Result: not reproduce, events emitted to QMP monitor, everything is fine.

==> test2 (applied v4)
So I tried to test by repeatedly change mac of vlan interface.
it will cause guest update mac-table every time, more data will
be passed. 

Result: reproduced (guest becomes slow, stops emit event)

==> test3 (latest qemu)
execute test2 with latest qemu

Result: guest hung

==> test4 (applied v4, + event_throttle change)
set event_throttle for rxfilter event in monitor.c:monitor_protocol_event_init()
  monitor_protocol_event_throttle(QEVENT_NIC_RX_FILTER_CHANGED, 1000);
and execute test2.

Result: guest script stuck, guest gets slow, no event emitted

| Guest) # strace ifconfig
| ...
| ioctrl(4, SIOCGIFCONF, {
| (stuck...)

---
                        Amos



reply via email to

[Prev in Thread] Current Thread [Next in Thread]