qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC v2 2/8] monitor: allow monitor to create thread to


From: Marc-André Lureau
Subject: Re: [Qemu-devel] [RFC v2 2/8] monitor: allow monitor to create thread to poll
Date: Mon, 28 Aug 2017 12:28:45 +0000

Hi

On Mon, Aug 28, 2017 at 1:08 PM Markus Armbruster <address@hidden> wrote:

> Marc-André Lureau <address@hidden> writes:
>
> > On Fri, Aug 25, 2017 at 5:33 PM Dr. David Alan Gilbert <
> address@hidden>
> > wrote:
> >
> >> * Marc-André Lureau (address@hidden) wrote:
> >> > Hi
> >> >
> >> > On Wed, Aug 23, 2017 at 8:52 AM Peter Xu <address@hidden> wrote:
> >> >
> >> > > Firstly, introduce Monitor.use_thread, and set it for monitors that
> are
> >> > > using non-mux typed backend chardev.  We only do this for monitors,
> so
> >> > > mux-typed chardevs are not suitable (when it connects to, e.g.,
> serials
> >> > > and the monitor together).
> >> > >
> >> > > When use_thread is set, we create standalone thread to poll the
> monitor
> >> > > events, isolated from the main loop thread.  Here we still need to
> take
> >> > > the BQL before dispatching the tasks since some of the monitor
> commands
> >> > > are not allowed to execute without the protection of BQL.  Then this
> >> > > gives us the chance to avoid taking the BQL for some monitor
> commands in
> >> > > the future.
> >> > >
> >> > > * Why this change?
> >> > >
> >> > > We need these per-monitor threads to make sure we can have at least
> one
> >> > > monitor that will never stuck (that can receive further monitor
> >> > > commands).
> >> > >
> >> > > * So when will monitors stuck?  And, how do they stuck?
> >> > >
> >> > > After we have postcopy and remote page faults, it's simple to
> achieve a
> >> > > stuck in the monitor (which is also a stuck in main loop thread):
> >> > >
> >> > > (1) Monitor deadlock on BQL
> >> > >
> >> > > As we may know, when postcopy is running on destination VM, the vcpu
> >> > > threads can stuck merely any time as long as it tries to access an
> >> > > uncopied guest page.  Meanwhile, when the stuck happens, it is
> possible
> >> > > that the vcpu thread is holding the BQL.  If the page fault is not
> >> > > handled quickly, you'll find that monitors stop working, which is
> trying
> >> > > to take the BQL.
> >> > >
> >> > > If the page fault cannot be handled correctly (one case is a paused
> >> > > postcopy, when network is temporarily down), monitors will hang
> >> > > forever.  Without current patch, that means the main loop hanged.
> We'll
> >> > > never find a way to talk to VM again.
> >> > >
> >> >
> >> > Could the BQL be pushed down to the monitor commands level instead?
> That
> >> > way we wouldn't need a seperate thread to solve the hang on commands
> that
> >> > do not need BQL.
> >>
> >> If the main thread is stuck though I don't see how that helps you; you
> >> have to be able to run these commands on another thread.
> >>
> >
> > Why would the main thread be stuck? In (1) If the vcpu thread takes the
> BQL
> > and the command doesn't need it, it would work.  In (2),  info cpus
> > shouldn't keep the BQL (my qapi-async series would probably help here)
>
> This has been discussed several times[*], but of course not with
> everybody, so I'll summarize once more: asynchronous commands are not a
> actually *required* for anything.  They are *one* way to package the
> "kick off task, receive an asynchronous message when it's done" pattern.
> Another way is synchronous command for the kick off, event for the
> "done".
>

But you would have to break or duplicate the QMP APIs. My proposal doesn't
need that, a command can reenter the main loop, and keep current QMP API.


> For better or worse, synchronous command + event is what we have today.
> Whether adding another way to package the the same thing improves the
> QMP interface is doubtful.
>

I would argue my series is mostly about internal refactoring for the
benefit mentionned above. The fact that you can do (optionnaly) concurrent
QMP commands is a nice bonus. Furthermore, it simplifies the API compared
to CMD / dummy reply + EVENT. And it gives a meaning to the exisiting
command "id"..


>
> [*] Try this one:
> Message-ID: <address@hidden>
> https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg05483.html
>
-- 
Marc-André Lureau


reply via email to

[Prev in Thread] Current Thread [Next in Thread]