[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
From: |
Zhenyu Ye |
Subject: |
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions |
Date: |
Tue, 11 Aug 2020 21:54:08 +0800 |
User-agent: |
Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.3.0 |
Hi Kevin,
On 2020/8/10 23:38, Kevin Wolf wrote:
> Am 10.08.2020 um 16:52 hat Zhenyu Ye geschrieben:
>> Before doing qmp actions, we need to lock the qemu_global_mutex,
>> so the qmp actions should not take too long time.
>>
>> Unfortunately, some qmp actions need to acquire aio context and
>> this may take a long time. The vm will soft lockup if this time
>> is too long.
>
> Do you have a specific situation in mind where getting the lock of an
> AioContext can take a long time? I know that the main thread can
> block for considerable time, but QMP commands run in the main thread, so
> this patch doesn't change anything for this case. It would be effective
> if an iothread blocks, but shouldn't everything running in an iothread
> be asynchronous and therefore keep the AioContext lock only for a short
> time?
>
Theoretically, everything running in an iothread is asynchronous. However,
some 'asynchronous' actions are not non-blocking entirely, such as
io_submit(). This will block while the iodepth is too big and I/O pressure
is too high. If we do some qmp actions, such as 'info block', at this time,
may cause vm soft lockup. This series can make these qmp actions safer.
I constructed the scene as follow:
1. create a vm with 4 disks, using iothread.
2. add press to the CPU on the host. In my scene, the CPU usage exceeds 95%.
3. add press to the 4 disks in the vm at the same time. I used the fio and
some parameters are:
fio -rw=randrw -bs=1M -size=1G -iodepth=512 -ioengine=libaio -numjobs=4
4. do block query actions, for example, by virsh:
virsh qemu-monitor-command [vm name] --hmp info block
Then the vm will soft lockup, the calltrace is:
[ 192.311393] watchdog: BUG: soft lockup - CPU#1 stuck for 42s!
[kworker/1:1:33]
[ 192.314241] Kernel panic - not syncing: softlockup: hung tasks
[ 192.316370] CPU: 1 PID: 33 Comm: kworker/1:1 Kdump: loaded Tainted: G
OEL 4.19.36+ #16
[ 192.318765] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[ 192.325638] Workqueue: events drm_fb_helper_dirty_work
[ 192.roorr327238] Call trace:
[ 192.331528] dump_backtrace+0x0/0x198
[ 192.332695] show_stack+0x24/0x30
[ 192.334219] dump_stack+0xa4/0xcc
[ 192.335578] panic+0x12c/0x314
[ 192.336565] watchdog_timoot_fn+0x3e4/0x3e8
[ 192.339984] __hrtimer_run_queues+0x114/0x358
[ 192.341576] hrtimer_interrupt+0x104/0x2d8
[ 192.343247] arch_timer_handler_virt+0x38/0x58
[ 192.345074] handle_percpu_devid_irq+0x90/0x248
[ 192.347238] generic_handle_irq+0x34/0x50
[ 192.349536] __handle_domain_irq+0x68/0xc0
[ 192.351193] gic_handle_irq+0x6c/0x150
[ 192.352639] el1_irq+0xb8/0x140
[ 192.353855] vp_notify+0x28/0x38 [virtio_pci]
[ 192.355646] virtqueue_kick+0x3c/0x78 [virtio_ring]
[ 192.357539] virtio_gpu_queue_ctrl_buffer_locked+0x180/0x248 [virtio_gpu]
[ 192.359869] virtio_gpu_queue_ctrl_buffer+0x50/0x78 [virtio_gpu]
[ 192.361456] virtio_gpu_cmd_resource_flush+0x8c/0xb0 [virtio_gpu]
[ 192.363422] virtio_gpu_surface_dirty+0x60/0x110 [virtio_gpu]
[ 192.365215] virtio_gpu_framebuffer_surface_dirty+0x34/0x48 [virtio_gpu]
[ 192.367452] drm_fb_helper_dirty_work+0x178/0x1c0
[ 192.368912] process_one_work+0x1b4/0x3f8
[ 192.370192] worker_thread+0x54/0x470
[ 192.371370] kthread+0x134/0x138
[ 192.379241] ret_from_fork+0x10/0x18
[ 192.382688] kernel fault(0x5) notification starting on CPU 1
[ 192.385059] kernel fault(0x5) notification finished on CPU 1
[ 192.387311] SMP: stopping secondary CPUs
[ 192.391024] Kernel Offset: disabled
[ 192.392111] CPU features: 0x0,a1806008
[ 192.393306] Memory Limit: none
[ 192.396701] Starting crashdump kernel...
[ 192.397821] Bye!
This problem can be avoided after this series applied.
Thanks,
Zhenyu