Re: [Qemu-block] strange crash in tracked_request_begin


From: Stefan Hajnoczi
Subject: Re: [Qemu-block] strange crash in tracked_request_begin
Date: Mon, 7 Mar 2016 17:01:39 +0000
User-agent: Mutt/1.5.24 (2015-08-30)

On Mon, Mar 07, 2016 at 01:29:08PM +0100, Christian Borntraeger wrote:
> Folks,
> 
> I had a crash of a qemu guest in tracked_request_begin.
> The testcase was a guest with ramdisk/kernel that reboots in a 
> loop. (about 10 times per second) with a single null-co disk 
> attached. No idea how to reproduce this, seems to be a lucky hit.
> 
> (gdb) bt
> #0  0x00000000101db5ba in tracked_request_begin (address@hidden, 
> address@hidden, address@hidden, address@hidden, address@hidden)
>     at /home/cborntra/REPOS/qemu/block/io.c:390
> #1  0x00000000101de91e in bdrv_co_do_preadv (bs=0x42a39190, offset=0, 
> bytes=4096, qiov=0x3ff7400cbd8, flags=<optimized out>, 
> address@hidden(unknown: 0))
>     at /home/cborntra/REPOS/qemu/block/io.c:1001
> #2  0x00000000101dfc3e in bdrv_co_do_readv (flags=(unknown: 0), 
> qiov=<optimized out>, nb_sectors=<optimized out>, sector_num=<optimized out>, 
> bs=<optimized out>)
>     at /home/cborntra/REPOS/qemu/block/io.c:1024
> #3  bdrv_co_do_rw (opaque=0x3ff7400e370) at 
> /home/cborntra/REPOS/qemu/block/io.c:2173
> #4  0x000000001022d8f6 in coroutine_trampoline (i0=<optimized out>, 
> i1=-1946150928) at /home/cborntra/REPOS/qemu/util/coroutine-ucontext.c:79
> #5  0x000003ff95ed150a in __makecontext_ret () from /lib64/libc.so.6
> 
> looking at the code we are at
> 
> QLIST_INSERT_HEAD(&bs->tracked_requests, req, list);
> which translates to
> 
> if (((req)->list.le_next = (&bs->tracked_requests)->lh_first) != NULL) 
>     (&bs->tracked_requests)->lh_first->list.le_prev = &(req)->list.le_next;
> (&bs->tracked_requests)->lh_first = (req);                       
> (req)->list.le_prev = &(&bs->tracked_requests)->lh_first;
> 
> gdb says, that (&bs->tracked_requests)->lh_first) is zero in the corefile
> (gdb) print /x bs->tracked_requests
> $6 = {lh_first = 0x0}
> 
> Now looking at the code I am asking myself if this can happen in parallel
> to another code that touches tracked_requests, because gcc seems to read
> &bs->tracked_requests)->lh_first twice (first to check the value, then
> to use it as pointer)

tracked_requests is protected by AioContext.  Perhaps something is doing
I/O without acquiring AioContext?

Luckily, there is only one place where items are added to and removed from
tracked_requests, which should make debugging somewhat easier.
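
To illustrate the window described in the quoted analysis, here is a minimal,
self-contained C sketch of the head insertion and the matching removal. The
names tracked_req, req_list, insert_head and remove_req are hypothetical
stand-ins, not QEMU's definitions. The sketch reads lh_first once into a local
so that the NULL check and the dereference use the same value. That does not
make the list safe for concurrent use (a second writer can still corrupt it,
and without atomics the compiler is free to reload the field anyway), so the
caller is still assumed to hold whatever lock protects the list, which in QEMU
would be the AioContext.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a tracked request's list linkage. */
struct tracked_req {
    struct tracked_req *le_next;   /* next element, or NULL at the tail */
    struct tracked_req **le_prev;  /* address of the pointer that points at us */
};

/* Hypothetical stand-in for the list head (bs->tracked_requests). */
struct req_list {
    struct tracked_req *lh_first;
};

/*
 * Same steps as the QLIST_INSERT_HEAD expansion quoted in this thread,
 * except that lh_first is read once into a local. Assumption: the caller
 * holds the lock that protects the list.
 */
static void insert_head(struct req_list *head, struct tracked_req *req)
{
    struct tracked_req *first = head->lh_first;

    assert(req != NULL);
    req->le_next = first;
    if (first != NULL) {
        first->le_prev = &req->le_next;
    }
    head->lh_first = req;
    req->le_prev = &head->lh_first;
}

/* Unlink a request; a concurrent call here is what could zero lh_first. */
static void remove_req(struct tracked_req *req)
{
    if (req->le_next != NULL) {
        req->le_next->le_prev = req->le_prev;
    }
    *req->le_prev = req->le_next;
}
```

If remove_req() ran on another thread between the two loads in the original
expansion, the first load would see a non-NULL lh_first and the second would
see NULL, which would match the faulting store in the disassembly.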

> 
> 388       qemu_co_queue_init(&req->wait_queue);
>    0x00000000101db594 <+76>:  la      %r2,72(%r13)
>    0x00000000101db598 <+80>:  brasl   %r14,0x1022cdc0 <qemu_co_queue_init>
> 
> 389   
> 390       QLIST_INSERT_HEAD(&bs->tracked_requests, req, list);
>    0x00000000101db59e <+86>:  lg      %r1,12744(%r12)         # r1 = (&bs->tracked_requests)->lh_first
>    0x00000000101db5a4 <+92>:  stg     %r1,48(%r13)            # (req)->list.le_next = r1
>    0x00000000101db5aa <+98>:  cgij    %r1,0,8,0x101db5c0 ---+ # if r1==0 goto
>    0x00000000101db5b0 <+104>: lg      %r1,12744(%r12)       | # r1 = (&bs->tracked_requests)->lh_first (again!!)
>    0x00000000101db5b6 <+110>: la      %r2,48(%r13)          | 
> => 0x00000000101db5ba <+114>: stg     %r2,56(%r1)           | # r1==0 bang
>    0x00000000101db5c0 <+120>: stg     %r13,12744(%r12)<-----+
>    0x00000000101db5c6 <+126>: lay     %r12,12744(%r12)
>    0x00000000101db5cc <+132>: stg     %r12,56(%r13)
> 
> 
> Christian
> 
