... and upon closer inspection, the problem described below (the frontend blocks the backend, then tries to drain the wrong queue, causing a stall)
occurs because the hub in the middle breaks the flow of events.
In the configuration below ( -net nic -net tap,ifname=tap0,... ) we have
e1000.0 <--> hub0port0 [hub] hub0port1 <--> tap.0
hub0port1 reports itself as non-writable when all other ports
(just one in this case) are full, and the packet is queued
on hub0port1. However, when the e1000 frontend tries to drain
the queue, it directly accesses the queue attached to hub0port0,
which is empty.
So it appears that the only fix is the following:
when a node is attached to a hub, instead of draining the
queue on the node one should drain all queues attached to the hub.
A new function qemu_flush_hub() would be handy, something like:

    QLIST_FOREACH(port, &hub->ports, next) {
        if (port != source_port) {
            qemu_flush_queued_packets(&port->nc);
        }
    }
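
To make the idea concrete, here is a standalone toy model of the same
loop. It does not use the real QEMU types: ToyPort, ToyHub,
toy_flush_port() and toy_flush_hub() are names made up for this sketch,
and a queue is reduced to a packet counter. It only illustrates why
draining the source port's own queue does nothing, while draining the
other ports of the hub releases the stalled packet.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PORTS 4

/* Hypothetical stand-in for a hub port and its packet queue. */
typedef struct ToyPort {
    int queued;      /* packets waiting on this port's queue */
    int delivered;   /* packets handed to the peer so far */
} ToyPort;

typedef struct ToyHub {
    ToyPort ports[MAX_PORTS];
    size_t num_ports;
} ToyHub;

/* Drain one port's queue; plays the role of qemu_flush_queued_packets(). */
static void toy_flush_port(ToyPort *port)
{
    port->delivered += port->queued;
    port->queued = 0;
}

/*
 * Drain every port of the hub except the one the request came from.
 * Returns the number of packets that were flushed.
 */
static int toy_flush_hub(ToyHub *hub, const ToyPort *source_port)
{
    int flushed = 0;

    assert(hub->num_ports <= MAX_PORTS);
    for (size_t i = 0; i < hub->num_ports; i++) {
        ToyPort *port = &hub->ports[i];

        if (port != source_port) {
            flushed += port->queued;
            toy_flush_port(port);
        }
    }
    return flushed;
}
```

With one packet parked on the second port, calling
toy_flush_port(&hub.ports[0]) leaves it where it is, whereas
toy_flush_hub(&hub, &hub.ports[0]) delivers it.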
The other option (queueing on the output ports of the hub)
would require a bit more attention to make sure that
the callback is only executed once (and also to avoid excessive
data replication). Not impossible, but it requires reference
counting the packet.
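
For reference, a rough sketch of what reference counting could look
like. Again this is a toy, not QEMU code: ToyPacket, ToySentCb,
toy_packet_new() and toy_packet_unref() are hypothetical names. One
packet is shared by all output queues without copying the payload, and
the completion callback runs only when the last queue is done with it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical completion callback, to be invoked once per packet. */
typedef void (ToySentCb)(void *opaque);

typedef struct ToyPacket {
    int refcount;        /* output queues still holding the packet */
    ToySentCb *sent_cb;  /* completion callback, may be NULL */
    void *opaque;        /* argument passed to sent_cb */
} ToyPacket;

/* Create a packet shared by num_queues output queues. */
static ToyPacket *toy_packet_new(int num_queues, ToySentCb *cb, void *opaque)
{
    ToyPacket *pkt;

    assert(num_queues > 0);
    pkt = malloc(sizeof(*pkt));
    if (pkt == NULL) {
        return NULL;
    }
    pkt->refcount = num_queues;
    pkt->sent_cb = cb;
    pkt->opaque = opaque;
    return pkt;
}

/*
 * Called when one output queue has delivered the packet.
 * Returns 1 if this was the last reference (callback run, packet
 * freed), 0 if other queues still hold it.
 */
static int toy_packet_unref(ToyPacket *pkt)
{
    assert(pkt->refcount > 0);
    if (--pkt->refcount > 0) {
        return 0;
    }
    if (pkt->sent_cb) {
        pkt->sent_cb(pkt->opaque);
    }
    free(pkt);
    return 1;
}

/* Example callback: counts how many times it has been invoked. */
static void toy_count_cb(void *opaque)
{
    (*(int *)opaque)++;
}
```

Each output queue would call toy_packet_unref() after delivering its
copy of the pointer, so the sender's callback fires once no matter how
many ports the hub has.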
What do you think? Which way do you prefer?
cheers
luigi