qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC PATCH 0/4] hw/display/virtio-gpu: Introducing asynchronous gues


From: Kim, Dongwon
Subject: Re: [RFC PATCH 0/4] hw/display/virtio-gpu: Introducing asynchronous guest display render
Date: Fri, 5 Jul 2024 11:24:55 -0700
User-agent: Mozilla Thunderbird

On 7/2/2024 11:26 PM, Marc-André Lureau wrote:
Hi

On Tue, Jul 2, 2024 at 10:11 PM Kim, Dongwon <dongwon.kim@intel.com <mailto:dongwon.kim@intel.com>> wrote:

    Hi Marc-André,

    On 7/2/2024 3:29 AM, Marc-André Lureau wrote:
     > Hi
     >
     > On Fri, Jun 21, 2024 at 3:20 AM <dongwon.kim@intel.com
    <mailto:dongwon.kim@intel.com>
     > <mailto:dongwon.kim@intel.com <mailto:dongwon.kim@intel.com>>> wrote:
     >
     >     From: Dongwon Kim <dongwon.kim@intel.com
    <mailto:dongwon.kim@intel.com> <mailto:dongwon.kim@intel.com
    <mailto:dongwon.kim@intel.com>>>
     >
     >     Introducing new virtio-gpu param, 'render_sync' when guest
    scanout blob
     >     is used (blob=true). The new param is used to specify when to
    start
     >     rendering a guest scanout frame.
     >
     >     By default (and so far) rendering of the guest frame is
    started in
     >     the draw event to make sure guest display update is
    sychronized with
     >     host's vsync. But this method inevitably brings some extra
    wait because
     >     most of time, the draw event is not happening right after the
    guest
     >     scanout frame is flushed.
     >
     >
     >     This delay often makes the guest stuck at certain frame for
    too long and
     >     causes general performance degradation of graphic workloads
    on the
     >     guest's
     >     side especially when the display update rate is high. This
    unwanted perf
     >     drop can be reduced if the guest scanout frame is rendered as
    soon
     >     as it is
     >     flushed through 'VIRTIO_GPU_CMD_RESOURCE_FLUSH' msg. The gl
    display
     >     pipeline can be unblocked a lot earlier in this case so that
    guest can
     >     move to the next display frame right away.
     >
     >     However, this "asynchrounous" render mode may cause some waste of
     >     resources
     >     as the guest could produce more frames than what are actually
    displayed
     >     on the host display. This is similar as running rendering
    apps with
     >     no vblank
     >     or vsync option. This is why this feature should stay as
    optional.
     >
     >
     > Indeed, I don't see much point in doing so, it's pure waste. If
    you want
     > to benchmark something perhaps. But then, why not simply run a
    benchmark
     > offscreen?

    Benchmark score is not the primary reason. The problem we want to avoid
    is the laggy display update due to too many asynchronous plane updates
    happening in the guest in certain situations like when moving SW mouse
    cursor rigorously on the guest. This is because the fence (as well as
    response for the resource-flush cmd) is signaled in the draw event.


Presumably, you won't get more frames displayed (perhaps even less due to extra load), so why is it laggy? Is it because the guest is doing too much buffering?

Yes, I believe so. Virtio-gpu driver can't issue another resource flush as the previous frame is still on hold by the host. But there are many plane-update requests due to frame buffer updates because of the movement of SW mouse cursor in between. BTW, we currently use dirtyFB update on the guest running Xorg. Maybe using pageflip would make things better but we haven't tried it yet.

Because the mouse event queue isn't being drained
between scanouts?




     >
     >
     >     The param 'render_sync' is set to 'true' by default and this
    is in line
     >     with traditional way while setting it to 'false' is basically
    enabling
     >     this asynchronouse mode.
     >
     >
     > As it stands now, the option should actually be on the display
    backend
     > (gtk/gtk-egl), it's not implemented for other backends.

    We can help to deploy this concept to other backends if needed.


Why not make it a gtk display option instead?

That is possible but I made it virtio-gpu option as this is specific to virtio-gpu backend. We can consider moving it to gtk layer if it makes more sense.



     >
     > I am not convinced this is generally useful to be an extra option
    though.

    I initially thought this should be the default because the virtio-gpu
    guest should not be hold by the host for too long in any cases. And
    this
    new approach is comparable to the default mode with blob=false where
    the
    guest is released as soon as the resource flush is done. Only
    difference
    is the resource synchronization using the fence.


virgl should be blocking rendering too

Could you detail your testing setup ?

We use ubuntu linux as a guest OS but it's modified for our GFX SRIOV
setup. I am not sure if this is the details you are looking for but here is it: we virtualize our GPU using SRIOV. So individual linux guest have their dedicated portion of GPU device that is detected and worked as "render-only" device. And since it is render-only, we pair it with virtio-gpu device/driver that provides display feature. Scanouts on the guest are allocated by virtio-gpu then imported by Mesa driver then used as a main framebuffer. On every plane update done by compositor will get virtio-gpu driver to issue resource-flush (followed by set scanout) to host. Host (QEMU) gets this scanout data as blob (scatter-gather list) then creates its own dmabuf using udmabuf driver. At every resource flush, we block the render pipeline as well as the guest until the rendering if finished for scanout synchronization as you know.

We do not use virgl (instead, the normal HW driver for SRIOV-Virtual device is used).

So this idea of having render_sync=false is based on

1. glBlitFrameBuffer can be done anytime after resource-flush is received.
2. the guest can be unblocked whenever the blob is used (cloned by glBlitFrameBuffer).


thanks


     >
     >     Dongwon Kim (4):
     >        hw/display/virtio-gpu: Introducing render_sync param
     >        ui/egl-helpers: Consolidates create-sync and create-fence
     >        ui/gtk-egl: Start rendering of guest blob scanout if
    render_sync is
     >          off
     >        ui/gtk-gl-draw: Start rendering of guest blob scanout if
    render_sync
     >          is off
     >
     >       include/hw/virtio/virtio-gpu.h  |  3 ++
     >       include/ui/dmabuf.h             |  4 +-
     >       include/ui/egl-helpers.h        |  3 +-
     >       hw/display/vhost-user-gpu.c     |  3 +-
     >       hw/display/virtio-gpu-udmabuf.c |  3 +-
     >       hw/display/virtio-gpu.c         |  2 +
     >       hw/vfio/display.c               |  3 +-
     >       ui/dbus-listener.c              |  2 +-
     >       ui/dmabuf.c                     | 11 +++-
     >       ui/egl-helpers.c                | 27 ++++------
     >       ui/gtk-egl.c                    | 93
    ++++++++++++++++++---------------
     >       ui/gtk-gl-area.c                | 90
    +++++++++++++++++++------------
     >       12 files changed, 146 insertions(+), 98 deletions(-)
     >
     >     --
     >     2.34.1
     >
     >
     >
     >
     > --
     > Marc-André Lureau



--
Marc-André Lureau




reply via email to

[Prev in Thread] Current Thread [Next in Thread]