qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC v2 2/8] monitor: allow monitor to create thread to


From: Markus Armbruster
Subject: Re: [Qemu-devel] [RFC v2 2/8] monitor: allow monitor to create thread to poll
Date: Mon, 28 Aug 2017 13:08:11 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)

Marc-André Lureau <address@hidden> writes:

> On Fri, Aug 25, 2017 at 5:33 PM Dr. David Alan Gilbert <address@hidden>
> wrote:
>
>> * Marc-André Lureau (address@hidden) wrote:
>> > Hi
>> >
>> > On Wed, Aug 23, 2017 at 8:52 AM Peter Xu <address@hidden> wrote:
>> >
>> > > Firstly, introduce Monitor.use_thread, and set it for monitors that are
>> > > using non-mux typed backend chardev.  We only do this for monitors, so
>> > > mux-typed chardevs are not suitable (when it connects to, e.g., serials
>> > > and the monitor together).
>> > >
>> > > When use_thread is set, we create standalone thread to poll the monitor
>> > > events, isolated from the main loop thread.  Here we still need to take
>> > > the BQL before dispatching the tasks since some of the monitor commands
>> > > are not allowed to execute without the protection of BQL.  Then this
>> > > gives us the chance to avoid taking the BQL for some monitor commands in
>> > > the future.
>> > >
>> > > * Why this change?
>> > >
>> > > We need these per-monitor threads to make sure we can have at least one
>> > > monitor that will never stuck (that can receive further monitor
>> > > commands).
>> > >
>> > > * So when will monitors stuck?  And, how do they stuck?
>> > >
>> > > After we have postcopy and remote page faults, it's simple to achieve a
>> > > stuck in the monitor (which is also a stuck in main loop thread):
>> > >
>> > > (1) Monitor deadlock on BQL
>> > >
>> > > As we may know, when postcopy is running on destination VM, the vcpu
>> > > threads can stuck merely any time as long as it tries to access an
>> > > uncopied guest page.  Meanwhile, when the stuck happens, it is possible
>> > > that the vcpu thread is holding the BQL.  If the page fault is not
>> > > handled quickly, you'll find that monitors stop working, which is trying
>> > > to take the BQL.
>> > >
>> > > If the page fault cannot be handled correctly (one case is a paused
>> > > postcopy, when network is temporarily down), monitors will hang
>> > > forever.  Without current patch, that means the main loop hanged. We'll
>> > > never find a way to talk to VM again.
>> > >
>> >
>> > Could the BQL be pushed down to the monitor commands level instead? That
>> > way we wouldn't need a seperate thread to solve the hang on commands that
>> > do not need BQL.
>>
>> If the main thread is stuck though I don't see how that helps you; you
>> have to be able to run these commands on another thread.
>>
>
> Why would the main thread be stuck? In (1) If the vcpu thread takes the BQL
> and the command doesn't need it, it would work.  In (2),  info cpus
> shouldn't keep the BQL (my qapi-async series would probably help here)

This has been discussed several times[*], but of course not with
everybody, so I'll summarize once more: asynchronous commands are not a
actually *required* for anything.  They are *one* way to package the
"kick off task, receive an asynchronous message when it's done" pattern.
Another way is synchronous command for the kick off, event for the
"done".

For better or worse, synchronous command + event is what we have today.
Whether adding another way to package the the same thing improves the
QMP interface is doubtful.


[*] Try this one:
Message-ID: <address@hidden>
https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg05483.html



reply via email to

[Prev in Thread] Current Thread [Next in Thread]