Re: [Qemu-devel] Crashing in tcp_close


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] Crashing in tcp_close
Date: Mon, 7 Nov 2016 13:57:49 +0000
User-agent: Mutt/1.7.1 (2016-10-04)

On Mon, Nov 07, 2016 at 11:09:10AM +0000, Brian Candler wrote:
> On 07/11/2016 10:42, Stefan Hajnoczi wrote:
> > Let's try to isolate the cause of this crash:
> > 
> > Are you able to switch -netdev user to -netdev tap so we can rule out
> > the slirp user network stack as the source of memory corruption?
> Let me try to set that up. Using packer.io, I will have to start a VM by
> hand and then use the 'null' builder to ssh to the existing VM (whereas
> normally packer fires up the qemu process by itself).
> 
> > Alternatively could you re-run with virtio-blk instead of virtio-scsi to
> > see if that eliminates crashes?
> This is what I got after changing to virtio:
> 
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> Core was generated by `/usr/local/bin/qemu-system-x86_64 -netdev user,id=user.0,hostfwd=tcp::2521-:22'.
> Program terminated with signal SIGABRT, Aborted.
> #0  0x00007fa76d645428 in __GI_raise (address@hidden) at ../sysdeps/unix/sysv/linux/raise.c:54
> 54    ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
> [Current thread is 1 (Thread 0x7fa76f065a80 (LWP 18155))]
> (gdb) bt
> #0  0x00007fa76d645428 in __GI_raise (address@hidden) at ../sysdeps/unix/sysv/linux/raise.c:54
> #1  0x00007fa76d64702a in __GI_abort () at abort.c:89
> #2  0x00007fa76d63dbd7 in __assert_fail_base (fmt=<optimised out>, address@hidden "mr != NULL",
>     address@hidden "/home/nsrc/qemu-2.7.0/exec.c", address@hidden,
>     address@hidden <__PRETTY_FUNCTION__.42881> "address_space_unmap") at assert.c:92
> #3  0x00007fa76d63dc82 in __GI___assert_fail (address@hidden "mr != NULL",
>     address@hidden "/home/nsrc/qemu-2.7.0/exec.c", address@hidden,
>     address@hidden <__PRETTY_FUNCTION__.42881> "address_space_unmap") at assert.c:101
> #4  0x00005629ce6c0ffe in address_space_unmap (as=<optimised out>, buffer=<optimised out>,
>     len=<optimised out>, is_write=1, access_len=4096) at /home/nsrc/qemu-2.7.0/exec.c:2967
> #5  0x00005629ce743beb in virtqueue_unmap_sg (address@hidden, address@hidden,
>     vq=0x5629d13186b0) at /home/nsrc/qemu-2.7.0/hw/virtio/virtio.c:254
> #6  0x00005629ce744422 in virtqueue_fill (address@hidden, address@hidden, len=61441,
>     address@hidden) at /home/nsrc/qemu-2.7.0/hw/virtio/virtio.c:282
> #7  0x00005629ce7445db in virtqueue_push (vq=0x5629d13186b0, address@hidden,
>     len=<optimised out>) at /home/nsrc/qemu-2.7.0/hw/virtio/virtio.c:308
> #8  0x00005629ce71894d in virtio_blk_req_complete (address@hidden, address@hidden '\000')
>     at /home/nsrc/qemu-2.7.0/hw/block/virtio-blk.c:58
> #9  0x00005629ce718b59 in virtio_blk_rw_complete (opaque=<optimised out>, ret=0)
>     at /home/nsrc/qemu-2.7.0/hw/block/virtio-blk.c:121
> #10 0x00005629ce98025a in blk_aio_complete (acb=0x5629d298f370)
>     at /home/nsrc/qemu-2.7.0/block/block-backend.c:923
> #11 0x00005629ce9efaea in coroutine_trampoline (i0=<optimised out>, i1=<optimised out>)
>     at /home/nsrc/qemu-2.7.0/util/coroutine-ucontext.c:78
> #12 0x00007fa76d65a5d0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #13 0x00007ffee3d75a20 in ?? ()
> #14 0x2d2d2d2d2d2d2d2d in ?? ()
> #15 0x00000000000000f0 in ?? ()
> #16 0x0000000000000000 in ?? ()
> 
> Aside: I see "virtqueue_unmap_sg" in the backtrace. Is this correct even for
> a non-SCSI virtio?

Great, now we know virtio-scsi is not causing this crash.

virtqueue_unmap_sg() is used by all virtio devices.  "sg" means
scatter-gather list.  It's unmapping the buffers that the guest passed
to the host.
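
For reference, here is a rough paraphrase of what that function does (written from memory, so not a verbatim copy of the 2.7.0 source): it simply walks the element's in_sg/out_sg arrays and unmaps each entry via cpu_physical_memory_unmap(), which is where the address_space_unmap() assertion fires.

/* Paraphrased sketch of virtqueue_unmap_sg() from hw/virtio/virtio.c
 * in qemu-2.7.0 -- details may differ slightly from the real code. */
static void virtqueue_unmap_sg(VirtQueue *vq, const VirtQueueElement *elem,
                               unsigned int len)
{
    unsigned int offset = 0;
    int i;

    /* Device-writable buffers: only the bytes actually written (up to
     * 'len') are marked dirty on unmap. */
    for (i = 0; i < elem->in_num; i++) {
        size_t size = MIN(len - offset, elem->in_sg[i].iov_len);

        cpu_physical_memory_unmap(elem->in_sg[i].iov_base,
                                  elem->in_sg[i].iov_len,
                                  1, size);          /* is_write = 1 */
        offset += size;
    }

    /* Device-readable buffers: nothing was written back by the host. */
    for (i = 0; i < elem->out_num; i++) {
        cpu_physical_memory_unmap(elem->out_sg[i].iov_base,
                                  elem->out_sg[i].iov_len,
                                  0, elem->out_sg[i].iov_len);
    }
}

Your backtrace shows is_write=1 in frame #4, so it is the in_sg (device-write) loop that trips the assertion.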

> The command line was something like this (captured by running packer another
> time, so the ports and filenames are not exactly the same):
> 
> /usr/local/bin/qemu-system-x86_64 -m 4G -vnc [::]:59 -machine type=pc,accel=kvm -netdev user,id=user.0,hostfwd=tcp::2879-:22 -boot c -smp 8,sockets=1,cores=4,threads=2 -name vtp-nmm-201611071057.qcow2 -device virtio-net,netdev=user.0 -drive file=output-qemu-vtp-nmm/vtp-nmm-201611071057.qcow2,if=virtio,cache=writeback,discard=ignore,format=qcow2
> 
> 
> > The core dumps are likely to contain more clues.  If you are comfortable
> > with gdb and debugging C code you could dump the memory surrounding where
> > the junk value (mr) was loaded from.  Perhaps there is a hint about who
> > zeroed the memory.  In the first core dump you could start with:
> > 
> >   (gdb) up 6  # go to the dma_blk_unmap() stack frame
> >   (gdb) p *(DMAAIOCB *)0x560909ceca90
> >   (gdb) p *((DMAAIOCB *)0x560909ceca90).sg
> 
> (gdb) up 6
> #6  dma_blk_unmap (address@hidden) at /home/nsrc/qemu-2.7.0/dma-helpers.c:102
> 102            dma_memory_unmap(dbs->sg->as, dbs->iov.iov[i].iov_base,
> (gdb) p *(DMAAIOCB *)0x560909ceca90
> $1 = {common = {aiocb_info = 0x560907c15690 <dma_aiocb_info>, bs = 0x0,
>     cb = 0x56090767e250 <scsi_dma_complete>, opaque = 0x560909c2b8e0, refcnt = 1},
>   ctx = 0x5609087d82a0, acb = 0x0, sg = 0x560909af7430, offset = 4302675968,
>   dir = DMA_DIRECTION_FROM_DEVICE, sg_cur_index = 126, sg_cur_byte = 0, iov = {
>     iov = 0x560909c6e960, niov = 126, nalloc = 126, size = 1048576}, bh = 0x0,
>   io_func = 0x56090767d110 <scsi_dma_readv>, io_func_opaque = 0x560909c2b8e0}
> (gdb) p *((DMAAIOCB *)0x560909ceca90).sg
> $2 = {sg = 0x560909fab1e0, nsg = 126, nalloc = 143, size = 1048576, dev = 0x5609087e5630,
>   as = 0x560907e20480 <address_space_memory>}
> (gdb)
> 
> I'm comfortable with C, but don't really know what I'm looking for, nor what
> the data structures represent :-)
> 
> (gdb) p dbs->iov.niov
> $3 = 126
> (gdb) p i
> $4 = 125
> 
> ...so it appears it was in the last iteration of the loop.
> 
> (gdb) print dbs->sg->as
> $5 = (AddressSpace *) 0x560907e20480 <address_space_memory>
> (gdb) print dbs->iov.iov[i].iov_base
> $6 = (void *) 0x7f354099e000
> (gdb) print dbs->iov.iov[i].iov_len
> $7 = 8192
> (gdb) print dbs->dir
> $8 = DMA_DIRECTION_FROM_DEVICE
> 
> Unfortunately, much has been inlined:
> 
> (gdb) frame 4
> #4  0x000056090749dffe in address_space_unmap (as=<optimised out>, buffer=<optimised out>,
>     len=<optimised out>, is_write=1, access_len=8192) at /home/nsrc/qemu-2.7.0/exec.c:2967
> 2967            assert(mr != NULL);
> (gdb) print mr
> $9 = (MemoryRegion *) 0x0
> (gdb) print buffer
> $10 = <optimised out>

buffer should be 0x7f354099e000.  memory_region_from_host() returned
NULL because it was unable to find the MemoryRegion for this host
address.
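
For context, the check that fails at exec.c:2967 looks roughly like this (paraphrased from memory of the 2.7.0 tree, not a verbatim copy):

/* Rough sketch of address_space_unmap() in qemu-2.7.0 exec.c. */
void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len,
                         int is_write, hwaddr access_len)
{
    if (buffer != bounce.buffer) {
        MemoryRegion *mr;
        ram_addr_t addr1;

        /* Reverse-map the host pointer back to the RAMBlock/MemoryRegion
         * it was mapped from.  If no RAMBlock covers 'buffer' any more,
         * this returns NULL and the assert below is what you are hitting. */
        mr = memory_region_from_host(buffer, &addr1);
        assert(mr != NULL);
        if (is_write) {
            invalidate_and_set_dirty(mr, addr1, access_len);
        }
        memory_region_unref(mr);
        return;
    }
    /* ... bounce-buffer path elided ... */
}

So the interesting question is how a host address that was valid when the request was mapped ends up with no RAMBlock behind it by the time the request completes.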

Are you hotplugging any devices or adding/removing memory from the
guest?
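
If you want to dig further in the core dump, you could check whether that host pointer still falls inside any RAMBlock. Something along these lines should work (field names recalled from the 2.7.0 ram_list/RAMBlock definitions, so adjust if gdb complains):

  (gdb) set $rb = ram_list.blocks.lh_first
  (gdb) while $rb
   >  printf "%s host=%p used_length=0x%lx\n", $rb->idstr, $rb->host, $rb->used_length
   >  set $rb = $rb->next.le_next
   >end

Then compare 0x7f354099e000 against those ranges; if no block covers it, something unmapped or resized that RAM while the request was in flight.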

Stefan


