qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 23/38] ivshmem: Receive shared memory synchronou


From: Marc-André Lureau
Subject: Re: [Qemu-devel] [PATCH 23/38] ivshmem: Receive shared memory synchronously in realize()
Date: Wed, 2 Mar 2016 19:11:56 +0100

Hi

On Mon, Feb 29, 2016 at 7:40 PM, Markus Armbruster <address@hidden> wrote:
> When configured for interrupts (property "chardev" given), we receive
> the shared memory from an ivshmem server.  We do so asynchronously
> after realize() completes, by setting up callbacks with
> qemu_chr_add_handlers().
>
> Keeping server I/O out of realize() that way avoids delays due to a
> slow server.  This is probably relevant only for hot plug.
>
> However, this funny "no shared memory, yet" state of the device also
> causes a raft of issues that are hard or impossible to work around:
>
> * The guest is exposed to this state: when we enter and leave it its
>   shared memory contents is apruptly replaced, and device register
>   IVPosition changes.
>
>   This is a known issue.  We document that guests should not access
>   the shared memory after device initialization until the IVPosition
>   register becomes non-negative.
>
>   For cold plug, the funny state is unlikely to be visible in
>   practice, because we normally receive the shared memory long before
>   the guest gets around to mess with the device.
>
>   For hot plug, the timing is tighter, but the relative slowness of
>   PCI device configuration has a good chance to hide the funny state.
>
>   In either case, guests complying with the documented procedure are
>   safe.
>
> * Migration becomes racy.
>
>   If migration completes before the shared memory setup completes on
>   the source, shared memory contents is silently lost.  Fortunately,
>   migration is rather unlikely to win this race.
>
>   If the shared memory's ramblock arrives at the destination before
>   shared memory setup completes, migration fails.
>
>   There is no known way for a management application to wait for
>   shared memory setup to complete.
>
>   All you can do is retry failed migration.  You can improve your
>   chances by leaving more time between running the destination QEMU
>   and the migrate command.
>
>   To mitigate silent memory loss, you need to ensure the server
>   initializes shared memory exactly the same on source and
>   destination.
>
>   These issues are entirely undocumented so far.
>
> I'd expect the server to be almost always fast enough to hide these
> issues.  But then rare catastrophic races are in a way the worst kind.
>
> This is way more trouble than I'm willing to take from any device.
> Kill the funny state by receiving shared memory synchronously in
> realize().  If your hot plug hangs, go kill your ivshmem server.
>
> For easier review, this commit only makes the receive synchronous, it
> doesn't add the necessary error propagation.  Without that, the funny
> state persists.  The next commit will do that, and kill it off for
> real.
>
> Signed-off-by: Markus Armbruster <address@hidden>
> ---
>  hw/misc/ivshmem.c    | 70 
> +++++++++++++++++++++++++++++++++++++---------------
>  tests/ivshmem-test.c | 26 ++++++-------------
>  2 files changed, 57 insertions(+), 39 deletions(-)
>
> diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
> index c366087..352937f 100644
> --- a/hw/misc/ivshmem.c
> +++ b/hw/misc/ivshmem.c
> @@ -676,27 +676,47 @@ static void ivshmem_read(void *opaque, const uint8_t 
> *buf, int size)
>      process_msg(s, incoming_posn, incoming_fd);
>  }
>
> -static void ivshmem_check_version(void *opaque, const uint8_t * buf, int 
> size)
> +static int64_t ivshmem_recv_msg(IVShmemState *s, int *pfd)
>  {
> -    IVShmemState *s = opaque;
> -    int tmp;
> -    int64_t version;
> +    int64_t msg;
> +    int n, ret;
>
> -    if (!fifo_update_and_get_i64(s, buf, size, &version)) {
> -        return;
> -    }
> +    n = 0;
> +    do {
> +        ret = qemu_chr_fe_read_all(s->server_chr, (uint8_t *)&msg + n,
> +                                 sizeof(msg) - n);
> +        if (ret < 0 && ret != -EINTR) {
> +            /* TODO error handling */
> +            return INT64_MIN;
> +        }
> +        n += ret;
> +    } while (n < sizeof(msg));
>
> -    tmp = qemu_chr_fe_get_msgfd(s->server_chr);
> -    if (tmp != -1 || version != IVSHMEM_PROTOCOL_VERSION) {
> +    *pfd = qemu_chr_fe_get_msgfd(s->server_chr);
> +    return msg;
> +}
> +
> +static void ivshmem_recv_setup(IVShmemState *s)
> +{
> +    int64_t msg;
> +    int fd;
> +
> +    msg = ivshmem_recv_msg(s, &fd);
> +    if (fd != -1 || msg != IVSHMEM_PROTOCOL_VERSION) {
>          fprintf(stderr, "incompatible version, you are connecting to a 
> ivshmem-"
>                  "server using a different protocol please check your 
> setup\n");
> -        qemu_chr_add_handlers(s->server_chr, NULL, NULL, NULL, s);
>          return;
>      }
>
> -    IVSHMEM_DPRINTF("version check ok, switch to real chardev handler\n");
> -    qemu_chr_add_handlers(s->server_chr, ivshmem_can_receive, ivshmem_read,
> -                          NULL, s);
> +    /*
> +     * Receive more messages until we got shared memory.
> +     */
> +    do {
> +        msg = ivshmem_recv_msg(s, &fd);
> +        process_msg(s, msg, fd);
> +    } while (msg != -1);
> +
> +    assert(memory_region_is_mapped(&s->ivshmem));

It is easy to trigger at runtime, I suggest to report an error instead.

looks good otherwise

>  }
>
>  /* Select the MSI-X vectors used by device.
> @@ -903,19 +923,29 @@ static void pci_ivshmem_realize(PCIDevice *dev, Error 
> **errp)
>          IVSHMEM_DPRINTF("using shared memory server (socket = %s)\n",
>                          s->server_chr->filename);
>
> -        if (ivshmem_setup_interrupts(s) < 0) {
> -            error_setg(errp, "failed to initialize interrupts");
> -            return;
> -        }
> -
>          /* we allocate enough space for 16 peers and grow as needed */
>          resize_peers(s, 16);
>          s->vm_id = -1;
>
>          pci_register_bar(dev, 2, attr, &s->bar);
>
> -        qemu_chr_add_handlers(s->server_chr, ivshmem_can_receive,
> -                              ivshmem_check_version, NULL, s);
> +        /*
> +         * Receive setup messages from server synchronously.
> +         * Older versions did it asynchronously, but that creates a
> +         * number of entertaining race conditions.
> +         * TODO Propagate errors!  Without that, we still have races
> +         * on errors.
> +         */
> +        ivshmem_recv_setup(s);
> +        if (memory_region_is_mapped(&s->ivshmem)) {
> +            qemu_chr_add_handlers(s->server_chr, ivshmem_can_receive,
> +                                  ivshmem_read, NULL, s);
> +        }
> +
> +        if (ivshmem_setup_interrupts(s) < 0) {
> +            error_setg(errp, "failed to initialize interrupts");
> +            return;
> +        }
>      } else {
>          /* just map the file immediately, we're not using a server */
>          int fd;
> diff --git a/tests/ivshmem-test.c b/tests/ivshmem-test.c
> index c1dd7bb..68d6840 100644
> --- a/tests/ivshmem-test.c
> +++ b/tests/ivshmem-test.c
> @@ -309,35 +309,23 @@ static void test_ivshmem_server(bool msi)
>      ret = ivshmem_server_start(&server);
>      g_assert_cmpint(ret, ==, 0);
>
> -    setup_vm_with_server(&state1, nvectors, msi);
> -    s1 = &state1;
> -    setup_vm_with_server(&state2, nvectors, msi);
> -    s2 = &state2;
> -
> -    /* check state before server sends stuff */
> -    g_assert_cmpuint(in_reg(s1, IVPOSITION), ==, 0xffffffff);
> -    g_assert_cmpuint(in_reg(s2, IVPOSITION), ==, 0xffffffff);
> -    g_assert_cmpuint(qtest_readb(s1->qtest, (uintptr_t)s1->mem_base), ==, 
> 0x00);
> -
>      thread.server = &server;
>      ret = pipe(thread.pipe);
>      g_assert_cmpint(ret, ==, 0);
>      thread.thread = g_thread_new("ivshmem-server", server_thread, &thread);
>      g_assert(thread.thread != NULL);
>
> -    /* waiting for devices to become operational */
> -    while (g_get_monotonic_time() < end_time) {
> -        g_usleep(1000);
> -        if ((int)in_reg(s1, IVPOSITION) >= 0 &&
> -            (int)in_reg(s2, IVPOSITION) >= 0) {
> -            break;
> -        }
> -    }
> +    setup_vm_with_server(&state1, nvectors, msi);
> +    s1 = &state1;
> +    setup_vm_with_server(&state2, nvectors, msi);
> +    s2 = &state2;
>
>      /* check got different VM ids */
>      vm1 = in_reg(s1, IVPOSITION);
>      vm2 = in_reg(s2, IVPOSITION);
> -    g_assert_cmpuint(vm1, !=, vm2);
> +    g_assert_cmpint(vm1, >=, 0);
> +    g_assert_cmpint(vm2, >=, 0);
> +    g_assert_cmpint(vm1, !=, vm2);
>
>      /* check number of MSI-X vectors */
>      global_qtest = s1->qtest;
> --
> 2.4.3
>
>



-- 
Marc-André Lureau



reply via email to

[Prev in Thread] Current Thread [Next in Thread]