qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 0/3] vhost-scsi: Support live migration


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH 0/3] vhost-scsi: Support live migration
Date: Mon, 15 Apr 2019 10:37:25 +0100
User-agent: Mutt/1.11.3 (2019-02-01)

On Thu, Apr 11, 2019 at 01:45:06AM +0300, Liran Alon wrote:
> 
> 
> > On 9 Apr 2019, at 17:29, Stefan Hajnoczi <address@hidden> wrote:
> > 
> > On Thu, Mar 21, 2019 at 09:55:42AM +0200, Nir Weiner wrote:
> >> Originally migration was not possible with vhost-scsi because
> >> as part of migration, the source host target SCSI device state
> >> needs to be saved and loaded into the destination host target SCSI
> >> device. This cannot be done by QEMU.
> >> 
> >> As this can be handled either by external orchestrator or by having
> >> shared-storage (i.e. iSCSI), there is no reason to limit the
> >> orchestrator from being able to explictly specify it wish to enable
> >> migration even when VM have a vhost-scsi device.
> >> 
> >> Liran Alon (1):
> >>  vhost-scsi: Allow user to enable migration
> >> 
> >> Nir Weiner (2):
> >>  vhost-scsi: The vhost device should be stopped when the VM is not
> >>    running
> >>  vhost-scsi: Add vmstate descriptor
> >> 
> >> hw/scsi/vhost-scsi.c                  | 57 ++++++++++++++++++++++-----
> >> include/hw/virtio/vhost-scsi-common.h |  1 +
> >> 2 files changed, 48 insertions(+), 10 deletions(-)
> > 
> > What happens when migration is attempted while there is in-flight I/O in
> > the vring?
> > 
> > Stefan
> 
> What do you define as “in-flight I/O”? I think there is a need to separate 
> the discussion here to multiple cases:
> 
> 1) The I/O request was written to vring but guest have not yet written to 
> doorbell:
> This is the simplest case. No state is lost as the vring is migrated from 
> source host to dest host as part of guest memory migration.
> Also, the vring properties (E.g. Guest base address) is transferred from 
> source host to dest host as part of vhost-scsi VMState.
> (See patch 2/3 of series which adds VMSTATE_VIRTIO_DEVICE to vhost-scsi 
> VMState which will cause virtio_save() to save vring on migration stream).
> 
> 2) The I/O request was written to vring and the guest have written to 
> doorbell:
> The I/O request was submitted and processed by the vhost-scsi backend. 
> Therefore, the I/O request was sent to the remote TGT server.
> If the the TGT server has performed the write and returned ACK but the ACK 
> was not received by guest, then guest is expected to initiate the write again.
> (Similar to what happens for a momentary TGT server downtime / network 
> downtime). So this isn’t suppose to be an issue.

This scenario doesn't fit into the virtio-scsi model since each
virtio-scsi request requires a response.  Otherwise there is a vring
descriptor leak.

If the migration scheme can leak vring descriptors then it is incorrect.
Especially repeated migrations could cause the vring to become totally
filled with leaked descriptors.

I think the case you've described cannot happen based on the
drivers/vhost/scsi.c code.  It seems that clearing the endpoint flushes
and waits for inflight requests to complete.  The
VHOST_SCSI_CLEAR_ENDPOINT ioctl blocks until the remote target has
returned an ACK for all in-flight requests.

This patch series needs a clear explanation of how the running
vhost_scsi.ko instance is quiesced such that:

1. Requests in-flight to the remote target are completed before
   migration handover.  (Fancier approaches to live migration are
   possible but this patch series doesn't seem to implement them, so we
   need at least this basic guarantee for correctness.)
2. No further I/O requests are submitted to the remote target after the
   vm stops running.
3. No further guest memory access takes place after the vm stops running.

These properties are critical for correct vhost live migration and I'm
not confident yet based on this patch series.  Please review the code
(including kernel vhost code) and include an explanation in code
comments (if possible) or commit descriptions.

Thanks,
Stefan

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]