Re: [Qemu-devel] [PATCH 5/5] s390x/ccs: add ccw-tester emulated device
From: Dong Jia Shi
Subject: Re: [Qemu-devel] [PATCH 5/5] s390x/ccs: add ccw-tester emulated device
Date: Wed, 27 Sep 2017 15:11:06 +0800
User-agent: Mutt/1.5.21 (2010-09-15)

* Dong Jia Shi <address@hidden> [2017-09-26 15:48:56 +0800]:
[...]
> > >
> > > Tried to test with the following method:
> > > 1. Start g1 (first level guest on a kvm host) with a virtio blk device
> > > defined:
> > > -drive file=/dev/disk/by-path/ccw-0.0.3f3e,if=none,id=drive-virtio-disk1,format=raw \
> > > -device virtio-blk-ccw,devno=fe.0.2222,scsi=off,drive=drive-virtio-disk1,id=virtio-disk1 \
> > > 2. Log in to g1, and bind the subchannel of ccw device 0.0.2222 to the
> > > vfio-ccw driver.
> > > 3. Create a mdev on the above subchannel.
> > > 4. Passthrough the mdev to g2, and try to start g2.
> > >
> > > The 4th step failed with the following message and hung:
> > > qemu-system-s390x: vfio-ccw: wirte I/O region: errno=4
> > > (BTW, 4 is EINTR.)
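
The errno value quoted above can be checked against the system headers. This is only a small sketch, assuming a Linux build environment; `errno_name_sketch()` is a hypothetical helper and not part of Qemu. It covers EINTR and EINVAL, the two error codes that come up in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper: map the errno values discussed in this thread
 * to their symbolic names. Anything else is reported as "other".
 */
static const char *errno_name_sketch(int err)
{
    switch (err) {
    case EINTR:
        return "EINTR";
    case EINVAL:
        return "EINVAL";
    default:
        return "other";
    }
}
```

Calling `errno_name_sketch(errno)` next to the failing pwrite() would make the log line self-explanatory without looking up the number.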
> > >
> > > My rough guess is that this is caused by the following:
> > > On the kvm host, the virtio callback injects the I/O interrupt
> > > synchronously. As a result, g1's I/O interrupt handler gets the
> > > interrupt and signals the Qemu instance on g1 with the I/O result
> > > even before pwrite() returns.
> > >
> > > But, using gdb on the kvm host, I do see several ssch instructions
> > > executed successfully. I will dig into the root cause, and see if there
> > > is some way to fix the issue.
> >
> > Hm... would that be the ccws used for setting up a virtio device, and
> > the problems start once adapter interrupts become active?
> After some debugging, I got the following ccw sequence when starting g2:
> 1. CCW_CMD_SENSE_ID 0xe4 [OK]
> 2. CCW_CMD_NOOP 0x03 [OK]
> 3. CCW_CMD_SET_VIRTIO_REV 0x83 [OK]
> 4. CCW_CMD_VDEV_RESET 0x33 [FAILED]
>
> So this is still in the phase of setting up the device.
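
A trace like the one above is easier to read when command codes are printed by name. The following is a minimal sketch covering only the four codes listed in this mail; `struct ccw_cmd_name` and `ccw_cmd_name()` are hypothetical names, not existing kernel or Qemu symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup entry: one virtio-ccw command code and its name. */
struct ccw_cmd_name {
    unsigned char code;
    const char *name;
};

/* Only the command codes observed in the trace above are listed. */
static const struct ccw_cmd_name ccw_cmd_names[] = {
    { 0xe4, "CCW_CMD_SENSE_ID" },
    { 0x03, "CCW_CMD_NOOP" },
    { 0x83, "CCW_CMD_SET_VIRTIO_REV" },
    { 0x33, "CCW_CMD_VDEV_RESET" },
};

/* Return the symbolic name for a command code, or "unknown". */
static const char *ccw_cmd_name(unsigned char code)
{
    size_t n = sizeof(ccw_cmd_names) / sizeof(ccw_cmd_names[0]);

    for (size_t i = 0; i < n; i++) {
        if (ccw_cmd_names[i].code == code)
            return ccw_cmd_names[i].name;
    }
    return "unknown";
}
```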
>
> > Does it work if you modify the nested guest to use the old
> > per-subchannel indicators mechanism?
> It turns out the root cause of the pwrite failure is a bug
> in the vfio-ccw driver:
> drivers/s390/cio/vfio_ccw_cp.c: ccwchain_fetch_direct()
> calls pfn_array_alloc_pin() with a zero @len parameter,
> which results in a -EINVAL return.
>
> The current code assumes that a valid direct ccw always has a non-zero
> count. However, this does not hold, at least for the
> CCW_CMD_VDEV_RESET (0x33) command:
> (gdb) p/x ccw
> $5 = {cmd_code = 0x33, flags = 0x4, count = 0x0, cda = 0x0}
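
The failure mode described above can be reproduced in a small user-space model. This is a sketch under stated assumptions and not the vfio-ccw code: `struct ccw1_sketch`, `pages_needed()`, `pin_range_sketch()` and `fetch_direct_sketch()` are hypothetical names, the 4 KiB page size is assumed, and the guard is only one possible way to treat a zero-count ccw as having no data to pin.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/* Simplified ccw with the same fields gdb printed above. */
struct ccw1_sketch {
    uint8_t  cmd_code;
    uint8_t  flags;
    uint16_t count;
    uint32_t cda;
};

/* Number of pages touched by [addr, addr + len); 0 when len is 0. */
static unsigned long pages_needed(uint64_t addr, unsigned int len)
{
    if (len == 0)
        return 0;
    return ((addr & (SKETCH_PAGE_SIZE - 1)) + len + SKETCH_PAGE_SIZE - 1)
           >> SKETCH_PAGE_SHIFT;
}

/* Stand-in for a pin helper that rejects a zero length with -EINVAL. */
static long pin_range_sketch(uint64_t addr, unsigned int len)
{
    if (len == 0)
        return -EINVAL;
    return (long)pages_needed(addr, len);
}

/*
 * Model of translating a direct ccw. Returns the number of pages that
 * would be pinned, or a negative errno. With guard_zero_count set, a
 * ccw without data is accepted and pins nothing.
 */
static long fetch_direct_sketch(const struct ccw1_sketch *ccw,
                                int guard_zero_count)
{
    if (guard_zero_count && ccw->count == 0)
        return 0;
    return pin_range_sketch(ccw->cda, ccw->count);
}
```

Feeding the ccw from the gdb output (cmd_code 0x33, count 0) into `fetch_direct_sketch()` yields -EINVAL without the guard and 0 with it, while ccws that carry data are unaffected.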
>
> With a temp fix for this problem, more ccws (e.g. 0x11, 0x12, 0x31, 0x72
> ...) could be translated and executed successfully. But finally the qemu
> process on g1 got a segmentation fault:
> User process fault: interruption code 0238 ilc:3 in libpthread-2.24.so[3ff84f80000+1b000]
> Failing address: 000ce330b0b00000 TEID: 000ce330b0b00800
> Fault in primary space mode while using user ASCE.
> AS:000000003b6cc1c7 R3:0000000000000024
> Segmentation fault
>
> dmesg on g1:
> [ 18.160413] User process fault: interruption code 0238 ilc:3 in libpthread-2.24.so[3ff84f80000+1b000]
> [ 18.160462] Failing address: 000ce330b0b00000 TEID: 000ce330b0b00800
> [ 18.160463] Fault in primary space mode while using user ASCE.
> [ 18.160470] AS:000000003b6cc1c7 R3:0000000000000024
> [ 18.160476] CPU: 1 PID: 2095 Comm: qemu-system-s39 Not tainted 4.13.0-01250-g6baa298-dirty #58
> [ 18.160477] Hardware name: IBM 2964 NC9 704 (KVM/Linux)
> [ 18.160479] task: 0000000038ac8000 task.stack: 0000000038e4c000
> [ 18.160480] User PSW : 0705200180000000 000003ff84f93b8a
> [ 18.160483] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:1 AS:0 CC:2 PM:0 RI:0 EA:3
> [ 18.160486] User GPRS: 0000000000000001 000003ff00000003 0000000104be86b0 0000000104be86c6
> [ 18.160487]            0000000000000000 0000000100000001 00000001049efb22 000003ffc5dfe13f
> [ 18.160489]            000003ff643fee60 0000000000000000 000003ffc5dfe258 000003ff643fe8c8
> [ 18.160490]            000003ff855a5000 00000001049cc320 000003ff643fe888 000003ff643fe7e8
> [ 18.160503] User Code: 000003ff84f93b7a: c0e5ffffe7cb brasl %r14,3ff84f90b10
>                          000003ff84f93b80: a7f4ffc4 brc 15,3ff84f93b08
>                         #000003ff84f93b84: e5600000ff0c tbegin 0,65292
>                         >000003ff84f93b8a: b2220050 ipm %r5
>                          000003ff84f93b8e: 8850001c srl %r5,28
>                          000003ff84f93b92: a774001c brc 7,3ff84f93bca
>                          000003ff84f93b96: e30020000012 lt %r0,0(%r2)
>                          000003ff84f93b9c: a784ffb6 brc 8,3ff84f93b08
> [ 18.160520] Last Breaking-Event-Address:
> [ 18.160524] [<00000001046404e6>] 0x1046404e6
>
> I think the above fault is not caused by vfio-ccw directly. So now I
> need to install gdb on g1 and continue debugging. Ideas on this are
> welcome. ;)
Using gdb with Qemu on g1, I got the following information:
Thread 3 "qemu-system-s39" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x3ffdcdff910 (LWP 2095)]
__lll_lock_elision (futex=0x1007686b0 <qemu_global_mutex>,
adapt_count=0x1007686c6 <qemu_global_mutex+22>, private=0)
at ../sysdeps/unix/sysv/linux/s390/elision-lock.c:66
66 ../sysdeps/unix/sysv/linux/s390/elision-lock.c: No such file or directory.
(gdb) bt
#0 __lll_lock_elision (futex=0x1007686b0 <qemu_global_mutex>,
adapt_count=0x1007686c6 <qemu_global_mutex+22>, private=0)
at ../sysdeps/unix/sysv/linux/s390/elision-lock.c:66
#1 0x000003fffd98a1f4 in __GI___pthread_mutex_lock (mutex=<optimized out>)
at ../nptl/pthread_mutex_lock.c:92
#2 0x0000000100515326 in qemu_mutex_lock (
mutex=0x1007686b0 <qemu_global_mutex>) at util/qemu-thread-posix.c:65
#3 0x00000001000f2dec in qemu_mutex_lock_iothread () at /root/qemu/cpus.c:1581
#4 0x000000010022827e in kvm_arch_handle_exit (cs=0x100c30ce0,
run=0x3fffce80000) at /root/qemu/target/s390x/kvm.c:2193
#5 0x0000000100131c40 in kvm_cpu_exec (cpu=0x100c30ce0)
at /root/qemu/accel/kvm/kvm-all.c:2094
#6 0x00000001000f1d2a in qemu_kvm_cpu_thread_fn (arg=0x100c30ce0)
at /root/qemu/cpus.c:1128
#7 0x000003fffd9879d4 in start_thread (arg=0x3ffdcdff910)
at pthread_create.c:335
#8 0x000003fffd8736ae in thread_start ()
at ../sysdeps/unix/sysv/linux/s390/s390-64/clone.S:71
PC not saved
I googled lock elision for a while, but I still have no idea what causes
this problem. Any suggestions?
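
The faulting code sits right after a tbegin instruction in the lock elision path, so one thing that could be checked on g1 is whether transactional execution is advertised to user space at all. The following is only a sketch for that check and not a diagnosis: `SKETCH_HWCAP_S390_TE` is a locally defined constant that assumes the s390 transactional-execution hwcap bit has the value 1024, and both function names are hypothetical. `getauxval()` and `AT_HWCAP` are the usual glibc interface for reading the hardware capability bits.

```c
#include <assert.h>
#include <sys/auxv.h>

/*
 * Assumption: on s390 the transactional-execution hwcap bit is 1024.
 * A local name is used so that the sketch also builds on other
 * architectures, where the bit has no such meaning.
 */
#define SKETCH_HWCAP_S390_TE 1024UL

/* Return 1 if the given hwcap word has the assumed TE bit set. */
static int hwcap_has_te(unsigned long hwcap)
{
    return (hwcap & SKETCH_HWCAP_S390_TE) != 0;
}

/* Query the hwcap word of the running process via the aux vector. */
static int running_cpu_advertises_te(void)
{
    return hwcap_has_te(getauxval(AT_HWCAP));
}
```

The result of `running_cpu_advertises_te()` is only meaningful when the program is built and run on s390x, for example inside g1.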
--
Dong Jia Shi