qemu-devel

Re: [Qemu-devel] [RFC] libvirt vGPU QEMU integration


From: Daniel P. Berrange
Subject: Re: [Qemu-devel] [RFC] libvirt vGPU QEMU integration
Date: Wed, 24 Aug 2016 18:29:18 -0400
User-agent: Mutt/1.6.2 (2016-07-01)

On Thu, Aug 18, 2016 at 09:41:59AM -0700, Neo Jia wrote:
> Hi libvirt experts,
> 
> I am starting this email thread to discuss the potential solution / proposal
> of integrating vGPU support into libvirt for QEMU.
> 
> Some quick background, NVIDIA is implementing a VFIO based mediated device
> framework to allow people to virtualize their devices without SR-IOV, for
> example NVIDIA vGPU, and Intel KVMGT. Within this framework, we are reusing
> the VFIO API to process the memory / interrupt as what QEMU does today with
> passthru device.
> 
> The difference here is that we are introducing a set of new sysfs file for
> virtual device discovery and life cycle management due to its virtual nature.
> 
> Here is the summary of the sysfs, when they will be created and how they
> should be used:
> 
> 1. Discover mediated device
> 
> As part of physical device initialization process, vendor driver will register
> their physical devices, which will be used to create virtual device (mediated
> device, aka mdev) to the mediated framework.
> 
> Then, the sysfs file "mdev_supported_types" will be available under the
> physical device sysfs, and it will indicate the supported mdev and
> configuration for this particular physical device, and the content may change
> dynamically based on the system's current configurations, so libvirt needs to
> query this file every time before create a mdev.
> 
> Note: different vendors might have their own specific configuration sysfs as
> well, if they don't have pre-defined types.
> 
> For example, we have a NVIDIA Tesla M60 on 86:00.0 here registered, and here
> is NVIDIA specific configuration on an idle system.
> 
> For example, to query the "mdev_supported_types" on this Tesla M60:
> 
> cat /sys/bus/pci/devices/0000:86:00.0/mdev_supported_types
> # vgpu_type_id, vgpu_type, max_instance, num_heads, frl_config, framebuffer, max_resolution
> 11      ,"GRID M60-0B",      16,       2,      45,     512M,    2560x1600
> 12      ,"GRID M60-0Q",      16,       2,      60,     512M,    2560x1600
> 13      ,"GRID M60-1B",       8,       2,      45,    1024M,    2560x1600
> 14      ,"GRID M60-1Q",       8,       2,      60,    1024M,    2560x1600
> 15      ,"GRID M60-2B",       4,       2,      45,    2048M,    2560x1600
> 16      ,"GRID M60-2Q",       4,       4,      60,    2048M,    2560x1600
> 17      ,"GRID M60-4Q",       2,       4,      60,    4096M,    3840x2160
> 18      ,"GRID M60-8Q",       1,       4,      60,    8192M,    3840x2160

I'm unclear on the requirements about data format for this file.
Looking at the docs:

  http://www.spinics.net/lists/kvm/msg136476.html

the format is completely unspecified.
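
To make the concern concrete, here is a minimal sketch of what a consumer
would have to do to read this file today. It assumes the comma-separated
layout of the NVIDIA sample quoted above, which is taken from that one
example only and is not guaranteed by the docs. list_mdev_types is a
hypothetical helper name, not part of any existing tool:

```shell
#!/bin/sh
# Sketch only: assumes the comma-separated layout seen in the NVIDIA sample.
# Nothing in the kernel docs promises this layout for other vendors.

# list_mdev_types (hypothetical helper): read mdev_supported_types content
# on stdin and print "type_id|type_name|max_instance" for each data row.
list_mdev_types() {
    awk -F',' '
        /^[ \t]*#/ { next }    # skip the "#" header comment
        NF < 3     { next }    # skip blank lines and anything without 3 fields
        {
            for (i = 1; i <= 3; i++)
                gsub(/^[ \t]+|[ \t]+$/, "", $i)   # trim column padding
            gsub(/"/, "", $2)                     # drop quotes around the name
            print $1 "|" $2 "|" $3
        }'
}

# Possible use against the device from the example (not run here):
#   list_mdev_types < /sys/bus/pci/devices/0000:86:00.0/mdev_supported_types
```

Every column after the first three is ignored here, and a vendor using a
different layout would need a different parser, which is exactly the kind of
per-vendor code the format leaves us writing.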

> 
> 2. Create/destroy mediated device
> 
> Two sysfs files are available under the physical device sysfs path:
> mdev_create and mdev_destroy
> 
> The syntax of creating a mdev is:
> 
>     echo "$mdev_UUID:vendor_specific_argument_list" > /sys/bus/pci/devices/.../mdev_create

I'm not really a fan of the idea of having to provide arbitrary vendor
specific arguments to the mdev_create call, as I don't really want to
have to create vendor specific code for each vendor's vGPU hardware in
libvirt.

What is the relationship between the mdev_supported_types data and
the vendor_specific_argument_list requirements?


> The syntax of destroying a mdev is:
> 
>     echo "$mdev_UUID:vendor_specific_argument_list" > /sys/bus/pci/devices/.../mdev_destroy
> 
> The $mdev_UUID is a unique identifier for this mdev device to be created, and
> it is unique per system.
> 
> For NVIDIA vGPU, we require a vGPU type identifier (shown as vgpu_type_id in
> above Tesla M60 output), and a VM UUID to be passed as
> "vendor_specific_argument_list".
> 
> If there is no vendor specific arguments required, either "$mdev_UUID" or
> "$mdev_UUID:" will be acceptable as input syntax for the above two commands.

This raises the question of how an application discovers what
vendor specific arguments are required or not, and what they
might mean.
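
For illustration, the generic part of the proposed syntax is easy to handle.
The sketch below only composes the string described above ("$mdev_UUID",
optionally followed by ":" and the argument list). format_mdev_request is a
hypothetical helper name, and the argument list is passed through opaquely
because nothing in the proposal says how a caller would discover its keys:

```shell
#!/bin/sh
# Sketch only: builds the string written to mdev_create / mdev_destroy,
# following the syntax quoted above. The helper name is hypothetical.

# format_mdev_request UUID [VENDOR_ARGS]
# Prints "UUID" when no vendor arguments are given, else "UUID:VENDOR_ARGS".
format_mdev_request() {
    fmr_uuid=$1
    fmr_args=${2:-}
    if [ -n "$fmr_args" ]; then
        printf '%s:%s\n' "$fmr_uuid" "$fmr_args"
    else
        printf '%s\n' "$fmr_uuid"
    fi
}

# Possible use (not run here; the sysfs path is elided as in the proposal):
#   format_mdev_request "$mdev_UUID" "$vendor_args" \
#       > /sys/bus/pci/devices/.../mdev_create
```

The helper is vendor neutral, but whatever fills in the second argument is
not: it has to know which keys a given vendor expects.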

> To create a M60-4Q device, libvirt needs to do:
> 
>     echo "$mdev_UUID:vgpu_type_id=20,vm_uuid=$VM_UUID" > /sys/bus/pci/devices/0000\:86\:00.0/mdev_create

Overall it doesn't seem like the proposed kernel interfaces provide
enough vendor abstraction to be able to use this functionality without
having to create vendor specific code in libvirt, which is something
I want to avoid us doing.



Ignoring the details though, in terms of libvirt integration, I think I'd
see us primarily doing work in the node device APIs / XML. Specifically
for physical devices, we'd have to report whether they support the
mediated device feature and some way to enumerate the valid device
types that can be created. The node device creation API would have to
support creation/deletion of the devices (mapping to mdev_create/destroy).


When configuring a guest VM, we'd use the <hostdev> XML to point to one
or more mediated devices that have been created via the node device APIs
previously. When starting the guest, we'd set those mediated devices
online.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


