Re: [Qemu-devel] [PATCH v4 00/13] Add migration support for VFIO device
From: Yan Zhao
Subject: Re: [Qemu-devel] [PATCH v4 00/13] Add migration support for VFIO device
Date: Fri, 21 Jun 2019 06:45:09 -0400
User-agent: Mutt/1.9.4 (2018-02-28)

On Fri, Jun 21, 2019 at 05:22:37PM +0800, Kirti Wankhede wrote:
>
>
> On 6/21/2019 2:16 PM, Yan Zhao wrote:
> > On Fri, Jun 21, 2019 at 04:02:50PM +0800, Kirti Wankhede wrote:
> >>
> >>
> >> On 6/21/2019 6:54 AM, Yan Zhao wrote:
> >>> On Fri, Jun 21, 2019 at 08:25:18AM +0800, Yan Zhao wrote:
> >>>> On Thu, Jun 20, 2019 at 10:37:28PM +0800, Kirti Wankhede wrote:
> >>>>> Add migration support for VFIO device
> >>>>>
> >>>>> This patch set includes the following patches:
> >>>>> - Define KABI for VFIO device for migration support.
> >>>>> - Add save and restore functions for PCI configuration space.
> >>>>> - Generic migration functionality for VFIO device.
> >>>>>   * This patch set adds functionality only for PCI devices, but can be
> >>>>>     extended to other VFIO devices.
> >>>>>   * Added all the basic functions required for pre-copy, stop-and-copy
> >>>>>     and resume phases of migration.
> >>>>>   * Added a state change notifier; from that notifier function, the VFIO
> >>>>>     device's state change is conveyed to the VFIO device driver.
> >>>>>   * During the save setup phase and the resume/load setup phase, the
> >>>>>     migration region is queried and is used to read/write VFIO device data.
> >>>>>   * .save_live_pending and .save_live_iterate are implemented to use
> >>>>>     QEMU's functionality of iteration during the pre-copy phase.
> >>>>>   * In .save_live_complete_precopy, that is in the stop-and-copy phase,
> >>>>>     data is read from the VFIO device driver iteratively until the
> >>>>>     pending bytes returned by the driver reach zero.
> >>>>>   * Added a function to get the dirty pages bitmap for the pages which
> >>>>>     are used by the driver.
> >>>>> - Add vfio_listerner_log_sync to mark dirty pages.
> >>>>> - Make VFIO PCI device migration capable. If the migration region is not
> >>>>>   provided by the driver, migration is blocked.
> >>>>>
> >>>>> Below is the flow of state change for live migration, where the states
> >>>>> in brackets represent VM state, migration state and VFIO device state as:
> >>>>> (VM state, MIGRATION_STATUS, VFIO_DEVICE_STATE)
> >>>>>
> >>>>> Live migration save path:
> >>>>> QEMU normal running state
> >>>>> (RUNNING, _NONE, _RUNNING)
> >>>>> |
> >>>>> migrate_init spawns migration_thread.
> >>>>> (RUNNING, _SETUP, _RUNNING|_SAVING)
> >>>>> Migration thread then calls each device's .save_setup()
> >>>>> |
> >>>>> (RUNNING, _ACTIVE, _RUNNING|_SAVING)
> >>>>> If device is active, get pending bytes by .save_live_pending()
> >>>>> if pending bytes >= threshold_size, call save_live_iterate()
> >>>>> Data of VFIO device for pre-copy phase is copied.
> >>>>> Iterate till pending bytes converge and are less than threshold
> >>>>> |
> >>>>> On migration completion, vCPUs stop and .save_live_complete_precopy
> >>>>> is called for each active device. The VFIO device is then
> >>>>> transitioned to the _SAVING state.
> >>>>> (FINISH_MIGRATE, _DEVICE, _SAVING)
> >>>>> For VFIO device, iterate in .save_live_complete_precopy until
> >>>>> pending data is 0.
> >>>>> (FINISH_MIGRATE, _DEVICE, _STOPPED)
> >>>>
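The save-path flow quoted above can be written down as a small table of (VM state, MIGRATION_STATUS, VFIO_DEVICE_STATE) triples. This is only a sketch for reading the flow, not code from the patch set: the enum `dev_state_bits`, its bit values, the struct `save_step`, the helper `device_runs_only_with_vm()` and the choice to model _STOPPED as "no bits set" are all assumptions made for illustration; the real values are defined by the KABI patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bit values; the actual ones come from the KABI patch. */
enum dev_state_bits {
    DEV_RUNNING = 1 << 0,
    DEV_SAVING  = 1 << 1,
};

/* One step of the quoted flow. */
struct save_step {
    const char *vm_state;
    const char *migration_status;
    unsigned dev_state;
};

/* The five triples listed in the cover letter, in order. */
static const struct save_step save_path[] = {
    { "RUNNING",        "_NONE",   DEV_RUNNING },
    { "RUNNING",        "_SETUP",  DEV_RUNNING | DEV_SAVING },
    { "RUNNING",        "_ACTIVE", DEV_RUNNING | DEV_SAVING },
    { "FINISH_MIGRATE", "_DEVICE", DEV_SAVING },
    { "FINISH_MIGRATE", "_DEVICE", 0 }, /* _STOPPED, modelled as no bits (assumption) */
};

/* True if the device's running bit is set exactly while the VM is RUNNING. */
static bool device_runs_only_with_vm(const struct save_step *steps, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        bool vm_running = strcmp(steps[i].vm_state, "RUNNING") == 0;
        bool dev_running = (steps[i].dev_state & DEV_RUNNING) != 0;
        if (vm_running != dev_running) {
            return false;
        }
    }
    return true;
}
```

Under these assumptions, `device_runs_only_with_vm(save_path, 5)` holds: the device leaves _RUNNING at the same step where the VM enters FINISH_MIGRATE, and _SAVING alone marks the stop-and-copy step.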
> >>>> I suggest we also register a VMStateDescription, whose .pre_save
> >>>> handler would get called after .save_live_complete_precopy in the
> >>>> pre-copy only case, and before .save_live_iterate in the post-copy
> >>>> enabled case.
> >>>> In the .pre_save handler, we can save all device state which must be
> >>>> copied after device stop in source vm and before device start in target
> >>>> vm.
> >>>>
> >>> hi
> >>> to better describe this idea:
> >>>
> >>> in pre-copy only case, the flow is
> >>>
> >>> start migration --> .save_live_iterate (several rounds) --> stop source vm
> >>> --> .save_live_complete_precopy --> .pre_save --> start target vm
> >>> --> migration complete
> >>>
> >>>
> >>> in post-copy enabled case, the flow is
> >>>
> >>> start migration --> .save_live_iterate (several rounds) --> start post copy
> >>> --> stop source vm --> .pre_save --> start target vm
> >>> --> .save_live_iterate (several rounds) --> migration complete
> >>>
> >>> Therefore, we should put saving of device state in .pre_save interface
> >>> rather than in .save_live_complete_precopy.
> >>> The device state includes pci config data, page tables, register state,
> >>> etc.
> >>>
> >>> The .save_live_iterate and .save_live_complete_precopy should only deal
> >>> with saving dirty memory.
> >>>
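The two orderings above can be compared mechanically. The sketch below is not QEMU code: the arrays `precopy_flow` and `postcopy_flow` and the helpers `step_index()` and `pre_save_in_stop_window()` are hypothetical names, and the step labels are copied from the flows as described in this thread. It only checks the property the argument relies on, namely that .pre_save falls between stopping the source vm and starting the target vm in both cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Step labels transcribed from the two flows described in the thread. */
static const char *const precopy_flow[] = {
    "start migration",
    ".save_live_iterate (several rounds)",
    "stop source vm",
    ".save_live_complete_precopy",
    ".pre_save",
    "start target vm",
    "migration complete",
};

static const char *const postcopy_flow[] = {
    "start migration",
    ".save_live_iterate (several rounds)",
    "start post copy",
    "stop source vm",
    ".pre_save",
    "start target vm",
    ".save_live_iterate (several rounds)",
    "migration complete",
};

/* Index of the first occurrence of step in flow, or -1 if absent. */
static int step_index(const char *const *flow, size_t n, const char *step)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(flow[i], step) == 0) {
            return (int)i;
        }
    }
    return -1;
}

/* True if .pre_save runs after the source vm stops and before the target starts. */
static bool pre_save_in_stop_window(const char *const *flow, size_t n)
{
    int stop = step_index(flow, n, "stop source vm");
    int save = step_index(flow, n, ".pre_save");
    int start = step_index(flow, n, "start target vm");
    return stop >= 0 && save > stop && start > save;
}
```

With these tables, `pre_save_in_stop_window()` is true for both flows, while `.save_live_complete_precopy` does not appear in the post-copy flow at all.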
> >>
> >> Vendor driver can decide when to save device state depending on the VFIO
> >> device state set by user. Vendor driver doesn't have to depend on which
> >> callback function QEMU or user application calls. In pre-copy case,
> >> save_live_complete_precopy sets VFIO device state to
> >> VFIO_DEVICE_STATE_SAVING which means vCPUs are stopped and vendor driver
> >> should save all device state.
> >>
> > When post-copy stops the vCPUs and the vfio device, the vendor driver only
> > needs to provide device state. But how does the vendor driver know that, if
> > no extra interface or extra device state is provided?
> >
>
> .save_live_complete_postcopy interface for post-copy will get called,
> right?
>
Yes, but that is too late: it is only called on postcopy completion.
> Thanks,
> Kirti
>
> >>>
> >>> I know the current implementation does not support post-copy, but at
> >>> least it should not require a huge change when we decide to enable it in
> >>> the future.
> >>>
> >>
> >> .has_postcopy and .save_live_complete_postcopy need to be implemented to
> >> support post-copy. I think .save_live_complete_postcopy should be
> >> similar to vfio_save_complete_precopy.
> >>
> >> Thanks,
> >> Kirti
> >>
> >>> Thanks
> >>> Yan
> >>>