From: Cornelia Huck
Subject: Re: [Qemu-devel] segfault use VRingMemoryRegionCaches for avail and used ring vs num-queues
Date: Mon, 27 Feb 2017 16:06:09 +0100
On Mon, 27 Feb 2017 15:09:30 +0100
Christian Borntraeger <address@hidden> wrote:
> Paolo,
>
> commit 97cd965c070152bc626c7507df9fb356bbe1cd81
> "virtio: use VRingMemoryRegionCaches for avail and used rings"
> does cause a segfault on my s390 system when I use num-queues.
>
> gdb --args qemu-system-s390x -nographic -enable-kvm -m 1G -drive
> file=/var/lib/libvirt/qemu/image.zhyp137,if=none,id=d1 -device
> virtio-blk-ccw,drive=d1,iothread=io1,num-queues=2 -object iothread,id=io1
(...)
> (gdb) bt
> #0 0x0000000001024a26 in address_space_translate_cached (cache=0x38, addr=2,
> xlat=0x3ffe587bff8, plen=0x3ffe587bff0, is_write=false) at
> /home/cborntra/REPOS/qemu/exec.c:3187
> #1 0x0000000001025596 in address_space_lduw_internal_cached (cache=0x38,
> addr=2, attrs=..., result=0x0, endian=DEVICE_BIG_ENDIAN) at
> /home/cborntra/REPOS/qemu/memory_ldst.inc.c:264
> #2 0x0000000001025846 in address_space_lduw_be_cached (cache=0x38, addr=2,
> attrs=..., result=0x0) at /home/cborntra/REPOS/qemu/memory_ldst.inc.c:322
> #3 0x000000000102597e in lduw_be_phys_cached (cache=0x38, addr=2) at
> /home/cborntra/REPOS/qemu/memory_ldst.inc.c:340
> #4 0x0000000001114856 in virtio_lduw_phys_cached (vdev=0x1c57cd0,
> cache=0x38, pa=2) at
> /home/cborntra/REPOS/qemu/include/hw/virtio/virtio-access.h:164
> #5 0x000000000111523c in vring_avail_idx (vq=0x3fffde1e090) at
> /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:201
> #6 0x0000000001115bba in virtio_queue_empty (vq=0x3fffde1e090) at
> /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:332
> #7 0x000000000111c312 in virtio_queue_host_notifier_aio_poll
> (opaque=0x3fffde1e0f8) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:2294
> #8 0x000000000147a036 in run_poll_handlers_once (ctx=0x1bb8bb0) at
> /home/cborntra/REPOS/qemu/util/aio-posix.c:490
> #9 0x000000000147a2fe in try_poll_mode (ctx=0x1bb8bb0, blocking=true) at
> /home/cborntra/REPOS/qemu/util/aio-posix.c:566
> #10 0x000000000147a3ca in aio_poll (ctx=0x1bb8bb0, blocking=true) at
> /home/cborntra/REPOS/qemu/util/aio-posix.c:595
> #11 0x00000000011a0176 in iothread_run (opaque=0x1bb86c0) at
> /home/cborntra/REPOS/qemu/iothread.c:59
> #12 0x000003ffe9087bc4 in start_thread () at /lib64/libpthread.so.0
> #13 0x000003ffe8f8a9f2 in thread_start () at /lib64/libc.so.6
>
> It seems to make a difference whether it's the boot disk or not. Maybe the
> reset of the devices that the bootloader performs before handing over
> control to Linux creates some trouble here.
I can reproduce this (the root cause seems to be that the bootloader
only sets up the first queue but the dataplane code wants to handle
both queues); this particular problem is fixed by
https://patchwork.ozlabs.org/patch/731445/ but then I hit a similar
problem later:
0x0000000010019b46 in address_space_translate_cached (cache=0x60, addr=0,
xlat=0x3fffcb7e420, plen=0x3fffcb7e418, is_write=false)
at /root/git/qemu/exec.c:3187
3187 assert(addr < cache->len && *plen <= cache->len - addr);
(...)
(gdb) bt
#0 0x0000000010019b46 in address_space_translate_cached (cache=0x60, addr=0,
xlat=0x3fffcb7e420, plen=0x3fffcb7e418, is_write=false)
at /root/git/qemu/exec.c:3187
#1 0x000000001001a5fe in address_space_lduw_internal_cached (cache=0x60,
addr=0, attrs=..., result=0x0, endian=DEVICE_BIG_ENDIAN)
at /root/git/qemu/memory_ldst.inc.c:264
#2 0x000000001001a88e in address_space_lduw_be_cached (cache=0x60, addr=0,
attrs=..., result=0x0) at /root/git/qemu/memory_ldst.inc.c:322
#3 0x000000001001a9c6 in lduw_be_phys_cached (cache=0x60, addr=0)
at /root/git/qemu/memory_ldst.inc.c:340
#4 0x00000000100fa876 in virtio_lduw_phys_cached (vdev=0x10bc2ce0, cache=0x60,
pa=0) at /root/git/qemu/include/hw/virtio/virtio-access.h:164
#5 0x00000000100fb536 in vring_used_flags_set_bit (vq=0x3fffdebc090, mask=1)
at /root/git/qemu/hw/virtio/virtio.c:255
#6 0x00000000100fb7fa in virtio_queue_set_notification (vq=0x3fffdebc090,
enable=0) at /root/git/qemu/hw/virtio/virtio.c:297
#7 0x0000000010101d22 in virtio_queue_host_notifier_aio_poll_begin (
n=0x3fffdebc0f8) at /root/git/qemu/hw/virtio/virtio.c:2285
#8 0x00000000103f4164 in poll_set_started (ctx=0x10ae8230, started=true)
at /root/git/qemu/util/aio-posix.c:338
#9 0x00000000103f4d5a in try_poll_mode (ctx=0x10ae8230, blocking=true)
at /root/git/qemu/util/aio-posix.c:553
#10 0x00000000103f4e56 in aio_poll (ctx=0x10ae8230, blocking=true)
at /root/git/qemu/util/aio-posix.c:595
#11 0x000000001017ea36 in iothread_run (opaque=0x10ae7d40)
at /root/git/qemu/iothread.c:59
#12 0x000003fffd6084c6 in start_thread () from /lib64/libpthread.so.0
#13 0x000003fffd502ec2 in thread_start () from /lib64/libc.so.6
I think we may be missing guards for not-yet-set-up queues in other
places; maybe we can centralize this instead of playing whack-a-mole?
Follow-ups:
Re: [Qemu-devel] segfault use VRingMemoryRegionCaches for avail and used ring vs num-queues, Paolo Bonzini, 2017/02/27