Re: [Qemu-devel] [RFC 0/6] monitor: allow per-monitor thread
From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [RFC 0/6] monitor: allow per-monitor thread
Date: Mon, 21 Aug 2017 11:17:28 +0100
User-agent: Mutt/1.8.3 (2017-05-23)

* Peter Xu (address@hidden) wrote:
> On Mon, Aug 21, 2017 at 04:58:51PM +0800, Fam Zheng wrote:
> > On Mon, 08/21 15:44, Peter Xu wrote:
> > > This is follow-up work for migration postcopy recovery. This series
> > > was tested together with the following series to make sure it solves
> > > the monitor hang problem that we encountered with postcopy recovery:
> > >
> > > [RFC 00/29] Migration: postcopy failure recovery
> > > [RFC 0/6] migration: re-use migrate_incoming for postcopy recovery
> > >
> > > The root problem is that all monitor commands are now handled in the
> > > main loop thread, no matter how many monitors we specify. So if the
> > > main loop thread hangs for some reason, all monitors are stuck. The
> > > reverse holds as well: if any one monitor hangs, it hangs the main
> > > loop and with it the rest of the monitors (if there are any).
> > >
> > > That affects postcopy recovery, since the recovery requires user input
> > > on the destination side. If the monitors hang, the destination VM dies
> > > and loses hope of even a final recovery.
> > >
> > > So we sometimes need to make sure that at least one of the monitors
> > > stays alive.
> > >
> > > The whole idea of this series is that instead of handling monitor
> > > commands all in the main loop thread, we handle them separately in
> > > per-monitor threads. Then, even if the main loop thread hangs at any
> > > point for any reason, a per-monitor thread can still survive. Further,
> > > we add a hint in QMP/HMP to show whether a command can be executed
> > > without the BQL; if so, we avoid taking the BQL when running that
> > > command. This greatly reduces BQL contention. For now the only user
> > > of that new parameter (currently I call it "without-bql") is the
> > > "migrate-incoming" command, which is the only command that can rescue
> > > a paused postcopy migration.
> > >
> > > However, even with this series, per-monitor threads can still hang.
> > > One example is that we can still run "info cpus" in a per-monitor
> > > thread during a paused postcopy (in that state, page faults are never
> > > handled, and "info cpus" will never return since it tries to sync
> > > every vcpu). So to make sure the monitor does not hang, the
> > > per-monitor thread alone is not enough; the user also has to be
> > > careful about how to use it.
> >
> > I think this is like saying we expect the user to understand the
> > internals of QEMU, unless the "rules" are clearly documented. Taking
> > this into account, does it make sense to make the per-monitor thread
> > only allow BQL-free commands?
>
> I don't think users need to know the internals - they just need to be
> careful when using them. Take the example of "info cpus": during
> paused postcopy it will hang, but IMHO that does not mean it's
> illegal for the user to send that command. It's "by-design" that it'll be
> stuck if one of the vcpus is stuck somewhere; it's just not the
> correct way to use it when the monitor is prepared for postcopy
> recovery.
>
> And IMHO we should not treat threaded monitors specially - a threaded
> monitor should be exactly the same monitor service as one served by the
> main loop thread. It just has its own thread to handle the requests, so
> it is less dependent on the main loop thread, and that's all.

From previous discussions we've had, one suggestion was to have some
type of 'safe' command; once it has been issued in a monitor thread, that
thread would only allow other lock-free commands to be issued, which
prevents anyone from accidentally issuing unsafe commands.
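
To make the 'safe' idea concrete, here is a rough sketch. It is a toy model in Python rather than QEMU code: the Monitor and Command classes, enter_safe_mode(), UnsafeCommandError and the module-level bql lock are all hypothetical names invented for illustration, and only the "without-bql" flag and the two command names come from the thread above.

```python
import threading
from dataclasses import dataclass
from typing import Callable, Dict

# Stand-in for QEMU's big lock (BQL); purely a model for this sketch.
bql = threading.Lock()


class UnsafeCommandError(Exception):
    """Raised when a lock-taking command reaches a monitor in safe mode."""


@dataclass(frozen=True)
class Command:
    handler: Callable[[], str]
    # Hypothetical per-command flag, modelled on the RFC's "without-bql" hint.
    without_bql: bool = False


class Monitor:
    def __init__(self, commands: Dict[str, Command]) -> None:
        self.commands = commands
        self.safe_mode = False

    def enter_safe_mode(self) -> None:
        # Models the effect of issuing the proposed 'safe' command.
        self.safe_mode = True

    def dispatch(self, name: str) -> str:
        cmd = self.commands[name]
        if cmd.without_bql:
            # Lock-free commands never touch the BQL, so they keep
            # working even if the lock holder is stuck.
            return cmd.handler()
        if self.safe_mode:
            # Refuse up front instead of blocking on the lock.
            raise UnsafeCommandError(f"{name!r} needs the BQL; refused in safe mode")
        with bql:
            return cmd.handler()


def make_demo_monitor() -> Monitor:
    # Handlers are dummies that only return a string.
    return Monitor({
        "migrate-incoming": Command(lambda: "recovery started", without_bql=True),
        "info cpus": Command(lambda: "cpu list"),
    })
```

With this model, a monitor accepts both commands until enter_safe_mode() is called; after that "migrate-incoming" still runs while "info cpus" is rejected immediately rather than hanging the thread.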
Dave
> Thanks,
>
> --
> Peter Xu
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK