Re: [Qemu-devel] Insane virtio-serial semantics


From: Anthony Liguori
Subject: Re: [Qemu-devel] Insane virtio-serial semantics
Date: Wed, 07 Dec 2011 13:53:00 -0600
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.21) Gecko/20110831 Lightning/1.0b2 Thunderbird/3.1.13

On 12/07/2011 01:44 PM, Michael Roth wrote:
On 12/07/2011 07:49 AM, Anthony Liguori wrote:
On 12/07/2011 02:21 AM, Markus Armbruster wrote:
Anthony Liguori<address@hidden> writes:

On 12/06/2011 04:30 PM, Lluís Vilanova wrote:
Anthony Liguori writes:

I really worry about us introducing so many of these one-off
paravirtual devices.
I would much prefer that you look at doing this as an extension to
the ivshmem
device as it already has this sort of scope. You should be able to
do this by
just extending the size of bar 1 and using a well known guest id.

I did in fact look at ivshmem some time ago, and it's true that both
use the
same mechanisms; but each device has a completely different purpose.
To me it
just seems that extending the control BAR in ivshmem to call the
user-provided
backdoor callbacks is just conflating two completely separate
devices into a
single one. Besides, I think that the qemu-side of the backdoor is
simple enough
to avoid being a maintenance burden.

They have the same purpose (which are both vague TBH). The only
reason I'm sympathetic to this device is that virtio-serial has such
insane semantics.

Could you summarize what's wrong? Is it fixable?

I don't think so as it's part of the userspace ABI now.

Mike, please help me make sure I get this all right. A normal
file/socket has the following guest semantics:

1) When a disconnect occurs, you will receive a return of '0' or -EPIPE
depending on the platform. The fd is now unusable and you must
close/reopen.

2) You can set up SIGIO/SIGPIPE to fire off whenever a file descriptor
becomes readable/writable.

virtio serial has the following semantics:

1) When a disconnect occurs, if you read() you will receive an -EPIPE.

2) However, if a reconnect occurs before you issue your read(), the read
will complete with no indication that a disconnect occurred.

3) This makes it impossible to determine whether a disconnect has
occurred which makes it very hard to reset your protocol stream. To deal
with this, virtio-serial can issue a SIGIO signal upon disconnect.

4) Signals are asynchronous, so a reconnect may have occurred by the
time you get the SIGIO signal. It's unclear that you can do anything
useful with this.
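
To make points 1) and 2) concrete, here is a toy model of the guest-visible behaviour as described in this thread. It is only a sketch: `ToyPort` and its methods are hypothetical names invented for illustration, not virtio-serial or kernel code, and the model assumes the error is reported only if the host is disconnected at the moment of the read.

```python
class ToyPort:
    """Toy model of the semantics described above (hypothetical, not real virtio-serial code)."""

    def __init__(self):
        self.host_connected = True
        self.buffer = bytearray()

    def host_disconnect(self):
        self.host_connected = False

    def host_connect(self):
        self.host_connected = True

    def host_write(self, data):
        if not self.host_connected:
            raise RuntimeError("host side is not connected")
        self.buffer += data

    def guest_read(self):
        # Assumption: the error is reported only if the host is
        # disconnected at the moment the guest calls read().
        if not self.host_connected:
            raise BrokenPipeError("host side disconnected")
        data = bytes(self.buffer)
        self.buffer.clear()
        return data


port = ToyPort()
port.host_disconnect()           # old host session goes away
port.host_connect()              # a new session arrives before the guest reads
port.host_write(b"new session")
data = port.guest_read()         # succeeds; the guest never saw the disconnect
```

In this model the guest has no way to tell that `data` belongs to a different host session than the one it was talking to before.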

That about sums it up. There was a thread about this a while back where there
was some tentative agreement on a way to fix this by introducing QEMU flags that
invoke similar semantics to unix sockets:

http://thread.gmane.org/gmane.comp.emulators.qemu/94721/focus=95496

But at this point we'd need to re-visit, since there's a fair number of
virtio-serial users now. It'd probably need to be something you could switch on
from the guest via an fcntl() or something.


So besides overloading the meaning of SIGIO, there's really no way to
figure out in the guest when a reconnect has occurred. To deal with this
in qemu-ga, we actually only allow 7-bit data transfers and use the 8th
bit as an in-band message to tell the guest that a reset has occurred.
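
The in-band reset idea can be sketched roughly as follows. This is not the qemu-ga implementation: the `Receiver` class, the `encode` helper, and the newline framing are assumptions made for the example. The only parts taken from the thread are that payload bytes stay within 7 bits and that a byte with the 8th bit set (0xFF here) signals a reset.

```python
RESET = 0xFF  # sentinel byte; payload bytes are restricted to 0x00-0x7F


def encode(msg: str) -> bytes:
    """Encode one message as 7-bit ASCII, newline-terminated (framing is an assumption)."""
    return msg.encode("ascii") + b"\n"  # raises UnicodeEncodeError on 8-bit data


class Receiver:
    """Hypothetical receiver that discards partial input when it sees RESET."""

    def __init__(self):
        self.partial = bytearray()
        self.messages = []
        self.resets = 0

    def feed(self, data: bytes):
        for b in data:
            if b == RESET:
                # Reset: throw away whatever was half-received.
                self.partial.clear()
                self.resets += 1
            elif b & 0x80:
                raise ValueError("payload must be 7-bit")
            elif b == 0x0A:
                self.messages.append(bytes(self.partial))
                self.partial.clear()
            else:
                self.partial.append(b)


rx = Receiver()
# A truncated message from the old session, then a reset, then a fresh message.
rx.feed(b'{"half' + bytes([RESET]) + encode('{"ok": 1}'))
```

After the call, `rx.messages` holds only the complete message sent after the reset, and the truncated fragment from the old session has been dropped.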

Yup, it's not perfect though, due to a delayed/spurious response from an agent
that sent data before it read/handled the reset sequence. We don't get that
problem with unix sockets since they'd get an -EPIPE and would be blocked from
sending to a newly opened session.

We try to account for this on the host by following up a reset sequence with
the guest-sync RPC, which contains a unique ID that the guest echoes back to us.
That way we can throw away stale data on the host until we get the intended
response. In our case, it's not quite perfect since if the agent sent a "{"
before getting reset, subsequent transmission of the guest-sync response can be
lost. We'd need to precede responses to guest-sync with a 0xFF as well, so that
the host flushes its rcv buffer/parser state...
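
A rough host-side sketch of the scheme just described, under several assumptions: responses arrive as newline-separated JSON that has already been split into lines, the guest-sync reply has the shape `{"return": <id>}`, and the helper names `make_sync_request` and `drop_until_synced` are made up for this example. It also applies the suggested fix of having the guest put a 0xFF in front of its guest-sync response, so a stray "{" from before the reset does not swallow the reply.

```python
import json


def make_sync_request(sync_id: int) -> bytes:
    """Reset sentinel followed by a guest-sync request carrying a unique ID."""
    request = {"execute": "guest-sync", "arguments": {"id": sync_id}}
    return b"\xff" + json.dumps(request).encode("ascii") + b"\n"


def drop_until_synced(lines, sync_id):
    """Discard stale responses until the matching guest-sync reply is seen.

    Returns the responses that follow the matching reply, or None if the
    reply never arrives.
    """
    it = iter(lines)
    for line in it:
        # A 0xFF from the guest means "flush": keep only what follows the last one.
        line = line.rsplit(b"\xff", 1)[-1]
        try:
            reply = json.loads(line)
        except ValueError:
            continue  # stale or garbled data, throw it away
        if isinstance(reply, dict) and reply.get("return") == sync_id:
            return list(it)
    return None


stale_then_fresh = [
    b'{"return": {}}',        # response belonging to the previous session
    b'{\xff{"return": 42}',   # stray "{" from before the reset, then the flushed sync reply
    b'{"return": "pong"}',    # first response of the new session
]
fresh = drop_until_synced(stale_then_fresh, 42)
```

Without the 0xFF in the second line, the stray "{" would make that line unparseable and the sync reply would be lost, which is the failure case described above.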

And, somewhat off-topic, but none of this addresses the case where an agent hangs on
an RPC. This would require some additional handling by the agent side where we
might have to tie some additional action to the 0xFF sequence.

Previously this scenario was handled by a hard-coded timeout mechanism in the
agent, with a separate thread handling the RPCs, but we've since dropped the
thread due to potential for memory leaks (with plans to re-introduce using a
child process).

client-induced resets would be much nicer though, and a reserved byte is the
best solution we've been able to come up with given the current virtio-serial
semantics.

Yeah, we really need a "sane reset semantics" flag for virtio-serial that provides a guest and host initiated channel close mechanism.

I think you need to do this by using a single ring and using a simple session id with an explicit open/close message. That way there is never ambiguity.
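
As a sketch of what that could look like (purely hypothetical, no such virtio-serial protocol exists today): every message on the single ring carries a session id, sessions begin and end with explicit open/close messages, and data tagged with anything other than the current session id is dropped. The message kinds and the `Endpoint` class are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

OPEN, DATA, CLOSE = "open", "data", "close"  # hypothetical message kinds


@dataclass
class Endpoint:
    """Hypothetical receiver for session-tagged messages on a single ring."""

    session: Optional[int] = None
    received: list = field(default_factory=list)
    dropped: int = 0

    def handle(self, kind: str, session_id: int, payload: bytes = b""):
        if kind == OPEN:
            self.session = session_id
        elif kind == CLOSE:
            if session_id == self.session:
                self.session = None
        elif kind == DATA:
            if session_id == self.session:
                self.received.append(payload)
            else:
                # Data from a closed or superseded session: never ambiguous.
                self.dropped += 1
        else:
            raise ValueError(f"unknown message kind: {kind!r}")


ep = Endpoint()
ep.handle(OPEN, 1)
ep.handle(DATA, 1, b"a")
ep.handle(CLOSE, 1)
ep.handle(OPEN, 2)
ep.handle(DATA, 1, b"stale")  # late data from session 1 is dropped
ep.handle(DATA, 2, b"b")
```

Because open, data, and close all travel in order on the same ring, the receiver always knows which session a given byte belongs to, which is the property the current semantics lack.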

And yes, I can't help but think of Dave Miller's comments long ago that any PV transport is eventually going to reinvent TCP, poorly.

Regards,

Anthony Liguori