[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v5] net: add support of mac-programming over mac
From: |
Luiz Capitulino |
Subject: |
Re: [Qemu-devel] [PATCH v5] net: add support of mac-programming over macvtap in QEMU side |
Date: |
Fri, 7 Jun 2013 09:46:12 -0400 |
On Wed, 5 Jun 2013 18:42:13 +0800
Amos Kong <address@hidden> wrote:
> Currently macvtap based macvlan device is working in promiscuous
> mode, we want to implement mac-programming over macvtap through
> Libvirt for better performance.
>
> Design:
> QEMU notifies Libvirt when rx-filter config is changed in guest,
> then Libvirt query the rx-filter information by a monitor command,
> and sync the change to macvtap device. Related rx-filter config
> of the nic contains main mac, rx-mode items and vlan table.
>
> This patch adds a QMP event to notify management of rx-filter change,
> and adds a monitor command for management to query rx-filter
> information.
>
> For reducing length of output, we just return the entries of vlan
> filter table that have active vlan.
>
> Event_throttle API can avoid the events to flood QMP client, but it
> could cause an unexpected delay. So a flag for each nic is used to
> avoid events flooding, if management doesn't query rx-filter after
> it receives one event, new events won't be emitted to QMP monitor.
>
> There maybe exist an uncontrollable delay if we let Libvirt do the
> real change, guests normally expect rx-filter updates immediately.
> But it's another separate issue, we can investigate it when the
> work in Libvirt side is done.
I think I completely misunderstood your testing results.
I had understood that: 1. changing the mac often & quick enough to
be a problem was a corner case and 2. you actually can overflow mngt
I hope I really got it wrong, otherwise you'll be using that
flag as a replacement for the event throttle API, which would be
a big mistake.
Can you please add your test results & analysis to this commit message?
Two small comments below.
> Signed-off-by: Amos Kong <address@hidden>
> ---
> v2: add argument to filter mac-table info of single nic (Stefan)
> update the document, add event notification
> v3: rename to rx-filter, add main mac, avoid events flooding (MST)
> fix error process (Stefan), fix qmp interface (Eric)
> v4: process qerror in hmp, cleanup (Luiz)
> set flag for each device, add device path in event, add
> helper for g_strdup_printf (MST)
> fix qmp document (Eric)
> v5: add path in doc, define notify flag to unsigned (Eric)
> add vlan table (Jason), drop monitor cmd
> ---
> QMP/qmp-events.txt | 20 +++++++++
> hw/net/virtio-net.c | 112
> ++++++++++++++++++++++++++++++++++++++++++++++
> include/monitor/monitor.h | 1 +
> include/net/net.h | 3 ++
> monitor.c | 1 +
> net/net.c | 47 +++++++++++++++++++
> qapi-schema.json | 89 ++++++++++++++++++++++++++++++++++++
> qmp-commands.hx | 66 +++++++++++++++++++++++++++
> 8 files changed, 339 insertions(+)
>
> diff --git a/QMP/qmp-events.txt b/QMP/qmp-events.txt
> index 92fe5fb..885230e 100644
> --- a/QMP/qmp-events.txt
> +++ b/QMP/qmp-events.txt
> @@ -172,6 +172,26 @@ Data:
> },
> "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
>
> +NIC_RX_FILTER_CHANGED
> +-----------------
> +
> +Emitted when rx-filter configuration of nic is changed by the guest.
> +Each nic has a flag to control event emit, the flag is set to false
> +when it emits one event of the nic, the flag is set to true when
> +management queries the rx-filter of the nic. This is used to avoid
> +events flooding.
Having this flag is an implementation detail. I think you should only
say that the event is emitted once until the query command is executed.
> +
> +Data:
> +
> +- "name": net client name (json-string)
> +- "path": device path (json-string)
> +
> +{ "event": "NIC_RX_FILTER_CHANGED",
> + "data": { "name": "vnet0",
> + "path": "/machine/peripheral/vnet0/virtio-backend" },
> + "timestamp": { "seconds": 1368697518, "microseconds": 326866 } }
> +}
> +
> RESET
> -----
>
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 1ea9556..ae1eab6 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -21,6 +21,8 @@
> #include "hw/virtio/virtio-net.h"
> #include "net/vhost_net.h"
> #include "hw/virtio/virtio-bus.h"
> +#include "qapi/qmp/qjson.h"
> +#include "monitor/monitor.h"
>
> #define VIRTIO_NET_VM_VERSION 11
>
> @@ -192,6 +194,104 @@ static void virtio_net_set_link_status(NetClientState
> *nc)
> virtio_net_set_status(vdev, vdev->status);
> }
>
> +static void rxfilter_notify(NetClientState *nc)
> +{
> + QObject *event_data;
> + VirtIONet *n = qemu_get_nic_opaque(nc);
> +
> + if (nc->rxfilter_notify_enabled) {
> + event_data = qobject_from_jsonf("{ 'name': %s, 'path': %s }",
> + n->netclient_name,
> + object_get_canonical_path(OBJECT(n->qdev)));
> + monitor_protocol_event(QEVENT_NIC_RX_FILTER_CHANGED, event_data);
> + qobject_decref(event_data);
> + /* disable event notification to avoid events flooding */
> + nc->rxfilter_notify_enabled = 0;
> + }
> +}
> +
> +static char *mac_strdup_printf(uint8_t *mac)
> +{
mac can be const. Is there more code in QEMU that could use this
function? If there's, then it's better to move this function to a
more generic .c file.
> + return g_strdup_printf("%.2x:%.2x:%.2x:%.2x:%.2x:%.2x", mac[0],
> + mac[1], mac[2], mac[3], mac[4], mac[5]);
> +}
> +
> +static RxFilterInfo *virtio_net_query_rxfilter(NetClientState *nc)
> +{
> + VirtIONet *n = qemu_get_nic_opaque(nc);
> + RxFilterInfo *info;
> + strList *str_list = NULL;
> + strList *entry;
> + VlanTableEntryList *vlan_list = NULL;
> + VlanTableEntryList *vlan_entry;
> + VlanTableEntry *table_entry;
> + int i;
> +
> + info = g_malloc0(sizeof(*info));
> + info->name = g_strdup(nc->name);
> + info->promiscuous = n->promisc;
> +
> + if (n->nouni) {
> + info->unicast = RX_STATE_NONE;
> + } else if (n->alluni) {
> + info->unicast = RX_STATE_ALL;
> + } else {
> + info->unicast = RX_STATE_NORMAL;
> + }
> +
> + if (n->nomulti) {
> + info->multicast = RX_STATE_NONE;
> + } else if (n->allmulti) {
> + info->multicast = RX_STATE_ALL;
> + } else {
> + info->multicast = RX_STATE_NORMAL;
> + }
> +
> + info->broadcast_allowed = n->nobcast;
> + info->multicast_overflow = n->mac_table.multi_overflow;
> + info->unicast_overflow = n->mac_table.uni_overflow;
> +
> + info->main_mac = mac_strdup_printf(n->mac);
> +
> + for (i = 0; i < n->mac_table.first_multi; i++) {
> + entry = g_malloc0(sizeof(*entry));
> + entry->value = mac_strdup_printf(n->mac_table.macs + i * ETH_ALEN);
> + entry->next = str_list;
> + str_list = entry;
> + }
> + info->unicast_table = str_list;
> +
> + str_list = NULL;
> + for (i = n->mac_table.first_multi; i < n->mac_table.in_use; i++) {
> + entry = g_malloc0(sizeof(*entry));
> + entry->value = mac_strdup_printf(n->mac_table.macs + i * ETH_ALEN);
> + entry->next = str_list;
> + str_list = entry;
> + }
> + info->multicast_table = str_list;
> +
> + i = 0;
> + while (i < MAX_VLAN >> 5) {
> + /* ignore the entries that have no active vlan */
> + if (n->vlans[i] == 0) {
> + i++;
> + continue;
> + }
> + vlan_entry = g_malloc0(sizeof(*vlan_entry));
> + vlan_entry->value = g_malloc0(sizeof(*table_entry));
> + vlan_entry->value->index = i;
> + vlan_entry->value->value = n->vlans[i++];
> + vlan_entry->next = vlan_list;
> + vlan_list = vlan_entry;
> + }
> + info->vlan_table = vlan_list;
> +
> + /* enable event notification after query */
> + nc->rxfilter_notify_enabled = 1;
> +
> + return info;
> +}
> +
> static void virtio_net_reset(VirtIODevice *vdev)
> {
> VirtIONet *n = VIRTIO_NET(vdev);
> @@ -420,6 +520,7 @@ static int virtio_net_handle_rx_mode(VirtIONet *n,
> uint8_t cmd,
> {
> uint8_t on;
> size_t s;
> + NetClientState *nc = qemu_get_queue(n->nic);
>
> s = iov_to_buf(iov, iov_cnt, 0, &on, sizeof(on));
> if (s != sizeof(on)) {
> @@ -442,6 +543,8 @@ static int virtio_net_handle_rx_mode(VirtIONet *n,
> uint8_t cmd,
> return VIRTIO_NET_ERR;
> }
>
> + rxfilter_notify(nc);
> +
> return VIRTIO_NET_OK;
> }
>
> @@ -487,6 +590,7 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t
> cmd,
> {
> struct virtio_net_ctrl_mac mac_data;
> size_t s;
> + NetClientState *nc = qemu_get_queue(n->nic);
>
> if (cmd == VIRTIO_NET_CTRL_MAC_ADDR_SET) {
> if (iov_size(iov, iov_cnt) != sizeof(n->mac)) {
> @@ -495,6 +599,8 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t
> cmd,
> s = iov_to_buf(iov, iov_cnt, 0, &n->mac, sizeof(n->mac));
> assert(s == sizeof(n->mac));
> qemu_format_nic_info_str(qemu_get_queue(n->nic), n->mac);
> + rxfilter_notify(nc);
> +
> return VIRTIO_NET_OK;
> }
>
> @@ -559,6 +665,8 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t
> cmd,
> n->mac_table.multi_overflow = 1;
> }
>
> + rxfilter_notify(nc);
> +
> return VIRTIO_NET_OK;
> }
>
> @@ -567,6 +675,7 @@ static int virtio_net_handle_vlan_table(VirtIONet *n,
> uint8_t cmd,
> {
> uint16_t vid;
> size_t s;
> + NetClientState *nc = qemu_get_queue(n->nic);
>
> s = iov_to_buf(iov, iov_cnt, 0, &vid, sizeof(vid));
> vid = lduw_p(&vid);
> @@ -584,6 +693,8 @@ static int virtio_net_handle_vlan_table(VirtIONet *n,
> uint8_t cmd,
> else
> return VIRTIO_NET_ERR;
>
> + rxfilter_notify(nc);
> +
> return VIRTIO_NET_OK;
> }
>
> @@ -1312,6 +1423,7 @@ static NetClientInfo net_virtio_info = {
> .receive = virtio_net_receive,
> .cleanup = virtio_net_cleanup,
> .link_status_changed = virtio_net_set_link_status,
> + .query_rx_filter = virtio_net_query_rxfilter,
> };
>
> static bool virtio_net_guest_notifier_pending(VirtIODevice *vdev, int idx)
> diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h
> index 1a6cfcf..1942cc4 100644
> --- a/include/monitor/monitor.h
> +++ b/include/monitor/monitor.h
> @@ -41,6 +41,7 @@ typedef enum MonitorEvent {
> QEVENT_BLOCK_JOB_READY,
> QEVENT_DEVICE_DELETED,
> QEVENT_DEVICE_TRAY_MOVED,
> + QEVENT_NIC_RX_FILTER_CHANGED,
> QEVENT_SUSPEND,
> QEVENT_SUSPEND_DISK,
> QEVENT_WAKEUP,
> diff --git a/include/net/net.h b/include/net/net.h
> index 43d85a1..30e4b04 100644
> --- a/include/net/net.h
> +++ b/include/net/net.h
> @@ -49,6 +49,7 @@ typedef ssize_t (NetReceiveIOV)(NetClientState *, const
> struct iovec *, int);
> typedef void (NetCleanup) (NetClientState *);
> typedef void (LinkStatusChanged)(NetClientState *);
> typedef void (NetClientDestructor)(NetClientState *);
> +typedef RxFilterInfo *(QueryRxFilter)(NetClientState *);
>
> typedef struct NetClientInfo {
> NetClientOptionsKind type;
> @@ -59,6 +60,7 @@ typedef struct NetClientInfo {
> NetCanReceive *can_receive;
> NetCleanup *cleanup;
> LinkStatusChanged *link_status_changed;
> + QueryRxFilter *query_rx_filter;
> NetPoll *poll;
> } NetClientInfo;
>
> @@ -74,6 +76,7 @@ struct NetClientState {
> unsigned receive_disabled : 1;
> NetClientDestructor *destructor;
> unsigned int queue_index;
> + unsigned rxfilter_notify_enabled:1;
> };
>
> typedef struct NICState {
> diff --git a/monitor.c b/monitor.c
> index 6ce2a4e..5e64fe8 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -490,6 +490,7 @@ static const char *monitor_event_names[] = {
> [QEVENT_BLOCK_JOB_READY] = "BLOCK_JOB_READY",
> [QEVENT_DEVICE_DELETED] = "DEVICE_DELETED",
> [QEVENT_DEVICE_TRAY_MOVED] = "DEVICE_TRAY_MOVED",
> + [QEVENT_NIC_RX_FILTER_CHANGED] = "NIC_RX_FILTER_CHANGED",
> [QEVENT_SUSPEND] = "SUSPEND",
> [QEVENT_SUSPEND_DISK] = "SUSPEND_DISK",
> [QEVENT_WAKEUP] = "WAKEUP",
> diff --git a/net/net.c b/net/net.c
> index 43a74e4..33abffe 100644
> --- a/net/net.c
> +++ b/net/net.c
> @@ -961,6 +961,53 @@ void print_net_client(Monitor *mon, NetClientState *nc)
> nc->info_str);
> }
>
> +RxFilterInfoList *qmp_query_rx_filter(bool has_name, const char *name,
> + Error **errp)
> +{
> + NetClientState *nc;
> + RxFilterInfoList *filter_list = NULL, *last_entry = NULL;
> +
> + QTAILQ_FOREACH(nc, &net_clients, next) {
> + RxFilterInfoList *entry;
> + RxFilterInfo *info;
> +
> + /* only query rx-filter information of nic */
> + if (nc->info->type != NET_CLIENT_OPTIONS_KIND_NIC) {
> + continue;
> + }
> + if (has_name && strcmp(nc->name, name) != 0) {
> + continue;
> + }
> +
> + if (nc->info->query_rx_filter) {
> + info = nc->info->query_rx_filter(nc);
> + entry = g_malloc0(sizeof(*entry));
> + entry->value = info;
> +
> + if (!filter_list) {
> + filter_list = entry;
> + } else {
> + last_entry->next = entry;
> + }
> + last_entry = entry;
> + } else if (has_name) {
> + error_setg(errp, "net client(%s) doesn't support"
> + " rx-filter querying", name);
> + break;
> + }
> + }
> +
> + if (filter_list == NULL && !error_is_set(errp)) {
> + if (has_name) {
> + error_setg(errp, "invalid net client name: %s", name);
> + } else {
> + error_setg(errp, "no net client supports rx-filter querying");
> + }
> + }
> +
> + return filter_list;
> +}
> +
> void do_info_network(Monitor *mon, const QDict *qdict)
> {
> NetClientState *nc, *peer;
> diff --git a/qapi-schema.json b/qapi-schema.json
> index ef1f657..fb6f12c 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -3618,3 +3618,92 @@
> '*cpuid-input-ecx': 'int',
> 'cpuid-register': 'X86CPURegister32',
> 'features': 'int' } }
> +
> +##
> +# @RxState:
> +#
> +# Packets receiving state
> +#
> +# @normal: filter assigned packets according to the mac-table
> +#
> +# @none: don't receive any assigned packet
> +#
> +# @all: receive all assigned packets
> +#
> +##
> +{ 'enum': 'RxState', 'data': [ 'normal', 'none', 'all' ] }
> +
> +##
> +# @VlanTableEntry
> +#
> +# Entry detail of vlan filter table
> +#
> +# @index: index of vlan filter table
> +#
> +# @value: 32 bit of indicator for active vlans
> +#
> +# Since 1.6
> +##
> +
> +{ 'type': 'VlanTableEntry', 'data': { 'index': 'int', 'value': 'uint32' } }
> +
> +##
> +# @RxFilterInfo:
> +#
> +# Rx-filter information for a nic.
> +#
> +# @name: net client name
> +#
> +# @promiscuous: whether promiscuous mode is enabled
> +#
> +# @multicast: multicast receive state
> +#
> +# @unicast: unicast receive state
> +#
> +# @broadcast-allowed: whether to receive broadcast
> +#
> +# @multicast-overflow: multicast table is overflowed or not
> +#
> +# @unicast-overflow: unicast table is overflowed or not
> +#
> +# @main-mac: the main macaddr string
> +#
> +# @vlan-table: a list of @VlanTableEntry
> +#
> +# @unicast-table: a list of unicast macaddr string
> +#
> +# @multicast-table: a list of multicast macaddr string
> +#
> +# Since 1.6
> +##
> +
> +{ 'type': 'RxFilterInfo',
> + 'data': {
> + 'name': 'str',
> + 'promiscuous': 'bool',
> + 'multicast': 'RxState',
> + 'unicast': 'RxState',
> + 'broadcast-allowed': 'bool',
> + 'multicast-overflow': 'bool',
> + 'unicast-overflow': 'bool',
> + 'main-mac': 'str',
> + 'vlan-table': ['VlanTableEntry'],
> + 'unicast-table': ['str'],
> + 'multicast-table': ['str'] }}
> +
> +##
> +# @query-rx-filter:
> +#
> +# Return rx-filter information for all nics (or for the given nic).
> +#
> +# @name: #optional net client name
> +#
> +# Returns: list of @RxFilterInfo for all nics (or for the given nic).
> +# Returns an error if the given @name doesn't exist, or given
> +# nic doesn't support rx-filter querying, or no net client
> +# supports rx-filter querying
> +#
> +# Since: 1.6
> +##
> +{ 'command': 'query-rx-filter', 'data': { '*name': 'str' },
> + 'returns': ['RxFilterInfo'] }
> diff --git a/qmp-commands.hx b/qmp-commands.hx
> index ffd130e..e76e9a2 100644
> --- a/qmp-commands.hx
> +++ b/qmp-commands.hx
> @@ -2932,3 +2932,69 @@ Example:
> <- { "return": {} }
>
> EQMP
> + {
> + .name = "query-rx-filter",
> + .args_type = "name:s?",
> + .mhandler.cmd_new = qmp_marshal_input_query_rx_filter,
> + },
> +
> +SQMP
> +query-rx-filter
> +---------------
> +
> +Show rx-filter information.
> +
> +Returns a json-array of rx-filter information for all nics (or for the
> +given nic), returning an error if the given nic doesn't exist, or
> +given nic doesn't support rx-filter querying, or no net client
> +supports rx-filter querying.
> +
> +The query will clear the event notification flag of each nic, then qemu
> +will start to emit event to QMP monitor.
> +
> +Each array entry contains the following:
> +
> +- "name": net client name (json-string)
> +- "promiscuous": promiscuous mode is enabled (json-bool)
> +- "multicast": multicast receive state (one of 'normal', 'none', 'all')
> +- "unicast": unicast receive state (one of 'normal', 'none', 'all')
> +- "broadcast-allowed": allow to receive broadcast (json-bool)
> +- "multicast-overflow": multicast table is overflowed (json-bool)
> +- "unicast-overflow": unicast table is overflowed (json-bool)
> +- "main-mac": main macaddr string (json-string)
> +- "vlan-table": a json-array of vlan-table entry
> + - "index": index of vlan filter table (json-int)
> + - "value": 32 bit of indicator for active vlans
> +- "unicast-table": a json-array of unicast macaddr string
> +- "multicast-table": a json-array of multicast macaddr string
> +
> +Example:
> +
> +-> { "execute": "query-rx-filter", "arguments": { "name": "vnet0" } }
> +<- { "return": [
> + {
> + "promiscuous": true,
> + "name": "vnet0",
> + "main-mac": "52:54:00:12:34:56",
> + "unicast": "normal",
> + "vlan-table": [
> + { "index": 0,
> + "entry": 1
> + }
> + ],
> + "unicast-table": [
> + ],
> + "multicast": "normal",
> + "multicast-overflow": false,
> + "unicast-overflow": false,
> + "multicast-table": [
> + "01:00:5e:00:00:01",
> + "33:33:00:00:00:01",
> + "33:33:ff:12:34:56"
> + ],
> + "broadcast-allowed": false
> + }
> + ]
> + }
> +
> +EQMP