qemu-devel

Re: [Qemu-devel] Insane virtio-serial semantics


From: Markus Armbruster
Subject: Re: [Qemu-devel] Insane virtio-serial semantics
Date: Thu, 08 Dec 2011 11:11:56 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (gnu/linux)

Anthony Liguori <address@hidden> writes:

> On 12/07/2011 01:44 PM, Michael Roth wrote:
>> On 12/07/2011 07:49 AM, Anthony Liguori wrote:
>>> On 12/07/2011 02:21 AM, Markus Armbruster wrote:
>>>> Anthony Liguori<address@hidden> writes:
[...]
>>>>> They have the same purpose (which are both vague TBH). The only
>>>>> reason I'm sympathetic to this device is that virtio-serial has such
>>>>> insane semantics.
>>>>
>>>> Could you summarize what's wrong? Is it fixable?
>>>
>>> I don't think so as it's part of the userspace ABI now.
>>>
>>> Mike, please help me make sure I get this all right. A normal
>>> file/socket has the following guest semantics:
>>>
>>> 1) When a disconnect occurs, you will receive a return of '0' or -EPIPE
>>> depending on the platform. The fd is now unusable and you must
>>> close/reopen.
>>>
>>> 2) You can set up SIGIO/SIGPIPE to fire off whenever a file descriptor
>>> becomes readable/writable.
>>>
>>> virtio serial has the following semantics:
>>>
>>> 1) When a disconnect occurs, if you read() you will receive an -EPIPE.
>>>
>>> 2) However, if a reconnect occurs before you issue your read(), the read
>>> will complete with no indication that a disconnect occurred.
>>>
>>> 3) This makes it impossible to determine whether a disconnect has
>>> occurred which makes it very hard to reset your protocol stream. To deal
>>> with this, virtio-serial can issue a SIGIO signal upon disconnect.
>>>
>>> 4) Signals are asynchronous, so a reconnect may have occurred by the
>>> time you get the SIGIO signal. It's unclear that you can do anything
>>> useful with this.
>>
>> That about sums it up. There was a thread about this a while back where
>> there was some tentative agreement on a way to fix this by introducing QEMU
>> flags that invoke similar semantics to unix sockets:
>>
>> http://thread.gmane.org/gmane.comp.emulators.qemu/94721/focus=95496
>>
>> But at this point we'd need to re-visit, since there's a fair number of
>> virtio-serial users now. It'd probably need to be something you could
>> switch on from the guest via an fcntl() or something.
>>
>>>
>>> So besides overloading the meaning of SIGIO, there's really no way to
>>> figure out in the guest when a reconnect has occurred. To deal with this
>>> in qemu-ga, we actually only allow 7-bit data transfers and use the 8th
>>> bit as an in-band message to tell the guest that a reset has occurred.
>>
>> Yup, it's not perfect though, due to a delayed/spurious response from an
>> agent that sent data before it read/handled the reset sequence. We don't
>> get that problem with unix sockets since they'd get an -EPIPE and would be
>> blocked from sending to a newly opened session.
>>
>> We try to account for this on the host by following up a reset sequence
>> with the guest-sync RPC, which contains a unique ID that the guest echoes
>> back to us. That way we can throw away stale data on the host until we get
>> the intended response. In our case, it's not quite perfect since if the
>> agent sent a "{" before getting reset, subsequent transmission of the
>> guest-sync response can be lost. We'd need to precede responses to
>> guest-sync with a 0xFF as well, so that the host flushes its rcv
>> buffer/parser state...
>>
>> And, somewhat off-topic, but none of this addresses the case where an
>> agent hangs on an RPC. This would require some additional handling on the
>> agent side, where we might have to tie some additional action to the 0xFF
>> sequence.
>>
>> Previously this scenario was handled by a hard-coded timeout mechanism in the
>> agent, with a separate thread handling the RPCs, but we've since dropped the
>> thread due to potential for memory leaks (with plans to re-introduce using a
>> child process).
>>
>> Client-induced resets would be much nicer though, and a reserved byte is the
>> best solution we've been able to come up with given the current virtio-serial
>> semantics.
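
For illustration, the host-side handling described above can be sketched
roughly as follows. This is only a sketch under stated assumptions, not the
qemu-ga code: the class name HostChannel, the newline-delimited JSON framing
and the {"return": <id>} shape of the sync reply are all assumed here. It
shows stale guest output being discarded until the sync ID is echoed back,
and a 0xFF byte flushing a half-parsed message such as a stray "{".

```python
import json

RESET = 0xFF  # reserved byte; payload is assumed to be 7-bit clean


class HostChannel:
    """Hypothetical host-side receive path for the in-band reset scheme."""

    def __init__(self, sync_id):
        self.sync_id = sync_id   # unique ID sent with the sync request
        self.synced = False      # True once the guest has echoed sync_id
        self.buf = bytearray()   # bytes of the message being assembled

    def feed(self, data):
        """Consume raw bytes; return messages accepted after the sync."""
        accepted = []
        for byte in data:
            if byte == RESET:
                self.buf.clear()          # drop partial parser state
                continue
            if byte != ord("\n"):
                self.buf.append(byte)
                continue
            line = bytes(self.buf)
            self.buf.clear()
            try:
                msg = json.loads(line)
            except ValueError:
                continue                  # garbage from an old session
            if not self.synced:
                # Everything before the matching echo is stale.
                self.synced = (isinstance(msg, dict)
                               and msg.get("return") == self.sync_id)
                continue
            accepted.append(msg)
        return accepted
```

Without the 0xFF in front of the sync reply, a leftover "{" would be glued
onto it and the reply would fail to parse, which is the lost-response case
mentioned above.
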
>
> Yeah, we really need a "sane reset semantics" flag for virtio-serial
> that provides a guest and host initiated channel close mechanism.
>
> I think you need to do this by using a single ring and using a simple
> session id with an explicit open/close message.  That way there is
> never ambiguity.

So it is fixable.
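
To make the idea concrete, here is a minimal sketch of session-id framing
with explicit open/close messages on a single stream. Everything in it is
hypothetical: the header layout, the MsgType values and the Receiver class
are made up for illustration and are not an existing virtio-serial
interface. The point is only that data tagged with an old session id can be
recognized and dropped, so a reconnect is never ambiguous.

```python
import struct
from enum import IntEnum


class MsgType(IntEnum):
    OPEN = 1
    DATA = 2
    CLOSE = 3


# Assumed header: message type (1 byte), session id (4), payload length (2).
HEADER = struct.Struct("<BIH")


def pack(msg_type, session, payload=b""):
    """Frame one message for the given session."""
    return HEADER.pack(msg_type, session, len(payload)) + payload


class Receiver:
    """Tracks the current session and drops traffic from any other one."""

    def __init__(self):
        self.session = None
        self.buf = bytearray()

    def feed(self, data):
        """Consume raw bytes; return the list of events decoded so far."""
        self.buf += data
        events = []
        while len(self.buf) >= HEADER.size:
            msg_type, session, length = HEADER.unpack_from(self.buf)
            end = HEADER.size + length
            if len(self.buf) < end:
                break                      # wait for the rest of the frame
            payload = bytes(self.buf[HEADER.size:end])
            del self.buf[:end]
            if msg_type == MsgType.OPEN:
                self.session = session
                events.append(("open", session))
            elif session != self.session:
                continue                   # stale session: ignore
            elif msg_type == MsgType.DATA:
                events.append(("data", session, payload))
            elif msg_type == MsgType.CLOSE:
                self.session = None
                events.append(("close", session))
        return events
```

A peer that is still writing into session 1 after session 2 has been opened
simply has its frames discarded, which is the behaviour an -EPIPE gives you
with a real socket.
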

> And yes, I can't help but think of Dave Miller's comments long ago that
> any PV transport is eventually going to reinvent TCP, poorly.

No doubt then, no doubt now.  But if I remember correctly, we didn't
create virtio-serial because we thought we could do better than TCP/IP.
We thought we needed a zero-config communication channel that doesn't
interfere in any way with the guest's networking.  Since the network
folks were unwilling to give us one ("use TCP already"), we looked for
another bare metal thing to imitate, and chose serial lines.

Now, comparing serial lines to TCP/IP makes no sense.  Different layers.

Layering a real network protocol on top of serial line is possible; SLIP
exists.  But as long as we insist on "don't interfere in any way with
the guest's networking", we can't use TCP, and thus are doomed to
reinvent it, poorly.


