qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH] virtio-serial-bus: replay guest_open on migrati


From: Anthony Liguori
Subject: Re: [Qemu-devel] [PATCH] virtio-serial-bus: replay guest_open on migration
Date: Wed, 27 Jul 2011 10:01:57 -0500
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.17) Gecko/20110516 Lightning/1.0b2 Thunderbird/3.1.10

On 07/27/2011 09:49 AM, Alon Levy wrote:
On Wed, Jul 27, 2011 at 09:16:33AM -0500, Anthony Liguori wrote:
On 07/27/2011 02:07 AM, Alon Levy wrote:
On Wed, Jul 27, 2011 at 07:45:25AM +0200, Markus Armbruster wrote:
Alon Levy<address@hidden>   writes:

Signed-off-by: Alon Levy<address@hidden>
---
  hw/virtio-serial-bus.c |    8 +++++++-
  1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index c5eb931..7a652ff 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -618,14 +618,20 @@ static int virtio_serial_load(QEMUFile *f, void *opaque, 
int version_id)
      for (i = 0; i<   nr_active_ports; i++) {
          uint32_t id;
          bool host_connected;
+        VirtIOSerialPortInfo *info;

          id = qemu_get_be32(f);
          port = find_port_by_id(s, id);
          if (!port) {
              return -EINVAL;
          }
-
          port->guest_connected = qemu_get_byte(f);
+        info = DO_UPCAST(VirtIOSerialPortInfo, qdev, port->dev.info);
+        if (port->guest_connected&&   info->guest_open) {
+            /* replay guest open */
+            info->guest_open(port);
+
+        }
          host_connected = qemu_get_byte(f);
          if (host_connected != port->host_connected) {
              /*

The patch makes enough sense to me, but the commit message is
insufficient.  Why do you have to replay?  And what's being fixed?

When migrating a host with with a spice agent running the mouse becomes
non operational after the migration. This is rhbz #718463, currently on
spice-server but it seems this is a qemu-kvm issue. The problem is that
after migration spice doesn't know the guest agent is open.

The problem is that guest_open is a hack.

You want connection semantics.  You need the following information
from the backend and the client:

1) backend is associated with a transport.  The transport may
disconnect at any point in time.  The backend needs to have explicit
state transitions associated with the transport disconnecting and
connecting.

2) the client may disconnect and reconnect at any point in time.  A
device model reset is a disconnect followed by a reconnect.

This gives you the following matrix of states:

A: backend-connected, client-connected
B: backend-disconnected, client-disconnected
C: backend-connected, client-disconnected
D: backend-disconnected, client-connected

The state transition diagram looks like this:

B: for some devices, immediately goto C.  other devices, on accept()
goto D.  if in B and client connects, goto D

C: if transport disconnects, goto B. if client connects, goto A

D: if transport connects, goto A.  if client disconects, goto B

A: if transport disconnects, goto B, if client disconnects, goto C

The problem is that guest_open() is a poor approximation of
'client-connected' and it's not used universally.  We need to
introduce proper state tracking to the character device layer and we
need to have a proper connection function that is used by all char
device clients.

Semantically, write should only be allowed in states A and D.  Read
should only be allowed in states A and C.

C and D should have very well defined semantics about what happens
to the data that is written  Arguably, read/write should not be
allowed in states C/D.

Device reset should always trigger a client reconnect.  Migration
resets devices so migration would Just Work if we modelled the state
transitions appropriately.
Are you saying currently on migration the guest (client above) always
receives an event from virtio?

No. Everything is all over the place today. I'm saying this is how it should work:

All character devices need to start in the same state today. As it stands, everything but spicevmc starts in D. spicevmc starts in B. guest_open()/guest_close() assumes the devices starts in D.

guest_open/guest_close is a fundamentally broken interface because it only works for spicevmc.

We need to generalize guest_open/guest_close and make all backends start in D. That means we need to add guest_open() calls to the initfns of any device that uses a CharDriverState.

And the natural place to add a guest_close() is to the destroy functions of any device.

reset() is semantically a close followed by an open, so for all devices that use CDSs, we should add a guest_close() followed by a guest_open().

Now, for virtio-serial, since it supports multiple ports, in it's reset() path, it needs to do close() followed by open. Since vmstate_load() is basically an extension of reset(), you'll have to open() any new ports that appear in that path.

The guest_open callback happens when
a guest operation happens, not when the device gets reset, unless the later
triggers the former, but I don't understand how that would happen since a
reset can happen while the guest isn't ready to handle anything (guest is
booting).
I do see a virtio_pci_reset does a virtio_reset which sends the
VIRTIO_NO_VECTOR interrupt, but I don't understand what happens after that.

Besides, I understand the need to fix the connection semantics of chardevs,
but the situation is broken right now and even if someone were to write this
I don't believe you would just take it to 0.15.0, would you?

No, guest_open() is fundamentally broken right now. It's not that hard to fix, but it's too much for 0.150 I suspect.


Also, the conversation is still ongoing but Armbru mentioned some 
''relevant-cases''
in http://lists.nongnu.org/archive/html/qemu-devel/2011-07/msg03221.html
backing the fix-the-hack approach (at least for 0.15.0).

The problem is this isn't the only place where this problem with occur. libvirt just implemented reboot via shutdown and reset which means that if you don't do the right thing in reset(), you're going to have the same problem there.

Fixing it right isn't hard, let's focus on doing it the right way.

Regards,

Anthony Liguori


Regards,

Anthony Liguori






reply via email to

[Prev in Thread] Current Thread [Next in Thread]