Re: [Qemu-devel] [PATCH 5/8] virtio-blk: fix "disabled data plane" mode


From: tu bo
Subject: Re: [Qemu-devel] [PATCH 5/8] virtio-blk: fix "disabled data plane" mode
Date: Mon, 14 Mar 2016 17:18:54 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.4.0

Using the latest qemu from master, I got a new qemu crash, as shown below:

(gdb) bt
#0  0x000003ffabb3b650 in raise () from /lib64/libc.so.6
#1  0x000003ffabb3ced8 in abort () from /lib64/libc.so.6
#2  0x0000000010384c30 in qemu_coroutine_enter (co=0x10a2ed40, opaque=0x0) at util/qemu-coroutine.c:112
#3  0x00000000102fd5c2 in bdrv_co_io_em_complete (opaque=0x3ff22beb518, ret=0) at block/io.c:2311
#4  0x00000000102f1428 in qemu_laio_process_completion (s=0x10a25e30, laiocb=0x3ffa400a2a0) at block/linux-aio.c:92
#5  0x00000000102f15e8 in qemu_laio_completion_bh (opaque=0x10a25e30) at block/linux-aio.c:139
#6  0x0000000010281d70 in aio_bh_call (bh=0x109e3580) at async.c:65
#7  0x0000000010281eb8 in aio_bh_poll (ctx=0x109efe10) at async.c:93
#8  0x000000001029538e in aio_dispatch (ctx=0x109efe10) at aio-posix.c:306
#9  0x0000000010295da6 in aio_poll (ctx=0x109efe10, blocking=false) at aio-posix.c:475
#10 0x000000001014662e in iothread_run (opaque=0x109ef8d0) at iothread.c:46
#11 0x000003ffabd084c6 in start_thread () from /lib64/libpthread.so.0
#12 0x000003ffabc02ec2 in thread_start () from /lib64/libc.so.6
(gdb) frame 2
#2 0x0000000010384c30 in qemu_coroutine_enter (co=0x10a2ed40, opaque=0x0) at util/qemu-coroutine.c:112
112             abort();
(gdb) list
107     
108         trace_qemu_coroutine_enter(self, co, opaque);
109     
110         if (co->caller) {
111             fprintf(stderr, "Co-routine re-entered recursively\n");
112             abort();
113         }
114     
115         co->caller = self;
116         co->entry_arg = opaque;
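
For readers following the backtrace, here is a minimal sketch of the kind of guard that fires at line 110. It is not QEMU code: ToyCoroutine, toy_coroutine_enter() and toy_coroutine_yield() are hypothetical names, and the sketch returns false where QEMU prints the message and calls abort().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's Coroutine; only the field the check uses. */
typedef struct ToyCoroutine {
    struct ToyCoroutine *caller;   /* non-NULL while the coroutine is entered */
} ToyCoroutine;

/*
 * Enter `co` from `self`. Returns false if `co` is already entered,
 * which is the condition that makes QEMU abort in the trace above.
 */
bool toy_coroutine_enter(ToyCoroutine *self, ToyCoroutine *co)
{
    if (co->caller) {
        return false;              /* re-entered before it yielded */
    }
    co->caller = self;
    return true;
}

/* Models the coroutine yielding back to whoever entered it. */
void toy_coroutine_yield(ToyCoroutine *co)
{
    assert(co->caller != NULL);    /* yielding only makes sense when entered */
    co->caller = NULL;
}
```

Entering the same coroutine twice without a yield in between trips the guard, which is consistent with a completion callback running while the coroutine is still marked as entered.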


The log file under "/var/log/libvirt/qemu/" contains the following messages:
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/bin/qemu-kvm -name rt_vm2 -S -machine s390-ccw-virtio-2.6,accel=kvm,usb=off -m 2048 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -object iothread,id=iothread1 -uuid 80cfa525-b35b-4341-aa20-a581bb528fbf -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-rt_vm2/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -drive file=/dev/mapper/36005076305ffc1ae0000000000008036,format=raw,if=none,id=drive-virtio-disk0,cache=none,aio=native -device virtio-blk-ccw,iothread=iothread1,scsi=off,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=27,id=hostnet0,vhost=on,vhostfd=29 -device virtio-net-ccw,netdev=hostnet0,id=net0,mac=02:e7:24:dc:dc:11,devno=fe.0.0000 -netdev tap,fd=30,id=hostnet1,vhost=on,vhostfd=31 -device virtio-net-ccw,netdev=hostnet1,id=net1,mac=52:54:00:e3:0a:44,devno=fe.0.0002 -chardev pty,id=charconsole0 -device sclpconsole,chardev=charconsole0,id=console0 -device virtio-balloon-ccw,id=balloon0,devno=fe.3.ffba -msg timestamp=on
char device redirected to /dev/pts/6 (label charconsole0)

Co-routine re-entered recursively
2016-03-14 09:05:37.075+0000: shutting down


On 03/11/2016 06:28 PM, Paolo Bonzini wrote:


On 10/03/2016 10:40, Christian Borntraeger wrote:
On 03/10/2016 10:03 AM, Christian Borntraeger wrote:
On 03/10/2016 02:51 AM, Fam Zheng wrote:
[...]
The aio_poll() inside "blk_set_aio_context(s->conf->conf.blk, s->ctx)" looks
suspicious:

        main thread                                          iothread
----------------------------------------------------------------------------
     virtio_blk_handle_output()
      virtio_blk_data_plane_start()
       vblk->dataplane_started = true;
       blk_set_aio_context()
        bdrv_set_aio_context()
         bdrv_drain()
          aio_poll()
           <snip...>
            virtio_blk_handle_output()
             /* s->dataplane_started is true */
!!!   ->    virtio_blk_handle_request()
          event_notifier_set(ioeventfd)
                                                     aio_poll()
                                                      virtio_blk_handle_request()

Christian, could you try the following patch? The aio_poll above is replaced
with a "limited aio_poll" that doesn't dispatch ioeventfd.

(Note: perhaps moving "vblk->dataplane_started = true;" after
blk_set_aio_context() also *works around* this.)

---

diff --git a/block.c b/block.c
index ba24b8e..e37e8f7 100644
--- a/block.c
+++ b/block.c
@@ -4093,7 +4093,9 @@ void bdrv_attach_aio_context(BlockDriverState *bs,

  void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context)
  {
-    bdrv_drain(bs); /* ensure there are no in-flight requests */
+    /* ensure there are no in-flight requests */
+    bdrv_drained_begin(bs);
+    bdrv_drained_end(bs);

      bdrv_detach_aio_context(bs);


That seems to do the trick.

Or not. Crashed again :-(

I would put bdrv_drained_end just before aio_context_release.

But secondarily, I'm thinking of making the logic simpler to understand
in two ways:

1) adding a mutex around virtio_blk_data_plane_start/stop.

2) moving

     event_notifier_set(virtio_queue_get_host_notifier(s->vq));
     virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, true, true);

to a bottom half (created with aio_bh_new in s->ctx).  The bottom half
takes the mutex, checks again "if (vblk->dataplane_started)" and if it's
true starts the processing.
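
A rough sketch of points 1) and 2), only to illustrate the locking and the re-check. It assumes a pthread mutex; ToyDataPlane and the toy_* functions are hypothetical, and the bh_scheduled flag stands in for a bottom half created with aio_bh_new in s->ctx. It is not the actual patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model of the device state; not QEMU's VirtIOBlock. */
typedef struct ToyDataPlane {
    pthread_mutex_t lock;      /* serializes start, stop and the bottom half */
    bool dataplane_started;
    bool bh_scheduled;         /* stands in for a scheduled bottom half */
    int processing_started;    /* how many times the BH began processing */
} ToyDataPlane;

void toy_init(ToyDataPlane *s)
{
    int rc = pthread_mutex_init(&s->lock, NULL);
    assert(rc == 0);
    (void)rc;
    s->dataplane_started = false;
    s->bh_scheduled = false;
    s->processing_started = 0;
}

/* Main thread: mark the data plane started and defer the notifier setup. */
void toy_start(ToyDataPlane *s)
{
    pthread_mutex_lock(&s->lock);
    if (!s->dataplane_started) {
        s->dataplane_started = true;
        s->bh_scheduled = true;
    }
    pthread_mutex_unlock(&s->lock);
}

/* Main thread: stop may race with a bottom half that has not run yet. */
void toy_stop(ToyDataPlane *s)
{
    pthread_mutex_lock(&s->lock);
    s->dataplane_started = false;
    pthread_mutex_unlock(&s->lock);
}

/* Would run in the iothread: take the mutex and check the flag again. */
void toy_start_bh(ToyDataPlane *s)
{
    pthread_mutex_lock(&s->lock);
    if (s->bh_scheduled) {
        s->bh_scheduled = false;
        if (s->dataplane_started) {
            /* the event_notifier_set() and handler setup would go here */
            s->processing_started++;
        }
    }
    pthread_mutex_unlock(&s->lock);
}
```

With this shape, a stop that lands between start and the bottom half leaves the bottom half with nothing to do, because the flag is re-read under the same mutex.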

Thanks,

Paolo




