Re: [Qemu-devel] Freeze / spin in virtio blk...flatview do translate
From: Frank Yang
Subject: Re: [Qemu-devel] Freeze / spin in virtio blk...flatview do translate
Date: Thu, 20 Sep 2018 07:16:52 -0700
I have added more logging code, and it seems that the hang happens in
virtio_blk_handle_vq, on Mac hosts with 4096 MB RAM configured:
#define VIRTIO_BLK_UNUSUAL_ITER_COUNT 1024

bool virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq)
{
    VirtIOBlockReq *req;
    MultiReqBuffer mrb = {};
    bool progress = false;
    uint32_t all_iters = 0;
    uint32_t progress_iters = 0;

    aio_context_acquire(blk_get_aio_context(s->blk));
    blk_io_plug(s->blk);

    do {
        ++all_iters;
        virtio_queue_set_notification(vq, 0);

        while ((req = virtio_blk_get_request(s, vq))) {
            progress = true;
            ++progress_iters;
            if (virtio_blk_handle_request(req, &mrb)) {
                virtqueue_detach_element(req->vq, &req->elem, 0);
                virtio_blk_free_request(req);
                break;
            }
            qemu_spin_warning(
                progress_iters,
                VIRTIO_BLK_UNUSUAL_ITER_COUNT,
                "Warning: virtio_blk_handle_vq spun %u times with progress.\n",
                progress_iters);
        }

        /* <-- the warning below is the one that printed */
        qemu_spin_warning(
            all_iters,
            VIRTIO_BLK_UNUSUAL_ITER_COUNT,
            "Warning: virtio_blk_handle_vq spun %u times total.\n",
            all_iters);
        virtio_queue_set_notification(vq, 1);
    } while (!virtio_queue_empty(vq));
    /* <-- makes me think virtio queue is corrupted */

    if (mrb.num_reqs) {
        virtio_blk_submit_multireq(s->blk, &mrb);
    }

    blk_io_unplug(s->blk);
    aio_context_release(blk_get_aio_context(s->blk));
    return progress;
}
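
For reference, the logging helper used above is not shown in this message. A
minimal sketch of what such a helper could look like follows. The name
spin_warning_sketch, its signature, and the "warn on every multiple of the
threshold" behaviour are all assumptions made for illustration; this is not
upstream QEMU code and not necessarily what qemu_spin_warning does.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (assumption, not part of upstream QEMU).
 * Prints a warning to stderr each time `count` reaches a non-zero
 * multiple of `threshold`, so a loop that never terminates logs
 * periodically instead of on every iteration.
 * Returns true if a warning was printed, false otherwise.
 */
static bool spin_warning_sketch(uint32_t count, uint32_t threshold,
                                const char *fmt, ...)
{
    va_list ap;

    /* Stay silent for a zero threshold and for non-multiples. */
    if (threshold == 0 || count == 0 || count % threshold != 0) {
        return false;
    }

    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    return true;
}
```

With a helper of this shape, the outer-loop call would log once per 1024
iterations of the do/while loop, which is enough to tell a spinning loop apart
from a merely busy one without flooding the log.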
On Tue, Sep 18, 2018 at 11:57 AM Frank Yang <address@hidden> wrote:
> We also only get those reports from users with 4G RAM configured, so it
> could also have to do with overflow.
>
> On Tue, Sep 18, 2018 at 11:57 AM Frank Yang <address@hidden> wrote:
>
>> That seems to be the case, since our 15 second detector is reset if the
>> main loop runs its timers again, so no main loop iterations happened since
>> that aio_dispatch_handlers call (we use a looper abstraction for it).
>>
>> On Tue, Sep 18, 2018 at 8:56 AM Paolo Bonzini <address@hidden> wrote:
>>
>>> On 15/09/2018 20:41, Frank Yang via Qemu-devel wrote:
>>> > We have not reproduced this hang so far, this is from user crash reports
>>> > that triggered our hang detector (where 15+ seconds pass without main loop
>>> > / VCPU threads being able to go back and ping their loopers in main loop /
>>> > vcpu threads).
>>> >
>>> > 0x00000001024e9fcb(qemu-system-x86_64 -exec.c:511)flatview_translate
>>> > 0x00000001024f2390(qemu-system-x86_64 -memory.h:1865)address_space_lduw_internal_cached
>>> > 0x000000010246ff11(qemu-system-x86_64 -virtio-access.h:166)virtio_queue_set_notification
>>> > 0x00000001024fa2c9(qemu-system-x86_64+ 0x000a72c9)virtio_blk_handle_vq
>>> > 0x00000001024746ee(qemu-system-x86_64 -virtio.c:1521)virtio_queue_host_notifier_aio_read
>>> > 0x0000000103a5ed8a(qemu-system-x86_64 -aio-posix.c:406)aio_dispatch_handlers
>>> > 0x0000000103a5ecc8(qemu-system-x86_64 -aio-posix.c:437)aio_dispatch
>>> > 0x0000000103a5c158(qemu-system-x86_64 -async.c:261)aio_ctx_dispatch
>>> > 0x0000000103a92103(qemu-system-x86_64 -gmain.c:3072)g_main_context_dispatch
>>> > 0x0000000103a5e4ad(qemu-system-x86_64 -main-loop.c:224)main_loop_wait
>>> > 0x0000000102468ab8(qemu-system-x86_64 -vl.c:2172)main_impl
>>> > 0x0000000102461a3a(qemu-system-x86_64 -vl.c:3332)run_qemu_main
>>> > 0x000000010246eef3(qemu-system-x86_64 -main.cpp:577)enter_qemu_main_loop(int, char**)
>>> > 0x00000001062b63a9(libQt5Core.5.dylib -qthread_unix.cpp:344)QThreadPrivate::start(void*)
>>> > 0x00007fff65118660
>>> > 0x00007fff6511850c
>>> > 0x00007fff65117bf8
>>> > 0x00000001062b623f(libQt5Core.5.dylib+ 0x0002623f)
>>>
>>> To be clear, is aio_dispatch_handlers running for 15+ seconds?
>>>
>>> None of the patches you point out are related however.
>>>
>>> Paolo
>>>
>>
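
On the overflow theory in the quoted thread: 4096 MB is exactly 2^32 bytes, the
first RAM size that no longer fits in a 32-bit byte count. The sketch below
only illustrates that arithmetic. The function names are made up for the
example, and it does not show where (or whether) such a truncation actually
happens in QEMU.

```c
#include <stdint.h>

/* Bytes per MiB, as a 64-bit constant. */
#define MIB_BYTES (1024ULL * 1024ULL)

/* Converts a RAM size in MiB to bytes using 64-bit arithmetic. */
static uint64_t ram_bytes_64(uint32_t ram_mib)
{
    return (uint64_t)ram_mib * MIB_BYTES;
}

/*
 * Same conversion, but truncated to 32 bits, as a hypothetical buggy
 * code path might do. For 4096 MiB the result wraps around to 0.
 */
static uint32_t ram_bytes_32(uint32_t ram_mib)
{
    return (uint32_t)ram_bytes_64(ram_mib);
}
```

Here ram_bytes_64(4096) is 0x100000000, while ram_bytes_32(4096) is 0; a
4095 MiB guest would still give a non-zero 32-bit value, which would be
consistent with reports coming only from 4G configurations.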