qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC 0/2]: qemu-ga: drop automatic reaper


From: Michael Roth
Subject: Re: [Qemu-devel] [RFC 0/2]: qemu-ga: drop automatic reaper
Date: Mon, 23 Apr 2012 09:14:18 -0500
User-agent: Mutt/1.5.21 (2010-09-15)

On Mon, Apr 23, 2012 at 02:20:53PM +0200, Michal Privoznik wrote:
> On 20.04.2012 15:36, Luiz Capitulino wrote:
> > On Fri, 20 Apr 2012 14:07:16 +0200
> > Michal Privoznik <address@hidden> wrote:
> > 
> >>>> But, I think if we tell users we'll *only* send response on error,
> >>>> we should do our part to *not* send the responses, rather than relying
> >>>> on them having implemented the reset mechanism to throw them away after
> >>>> guest wake-up. What we could do is allow a command to set a flag, after
> >>>> it reaps it's child (in the case of suspend this would be after
> >>>> wake-up, for shutdown it'd basically be a no-op, but worth adding
> >>>> for readability sake), to have qemu-ga not send a response. We'd
> >>>> implement it similarly to how we did ga_set_response_delimited().
> >>>
> >>> Fine with me. Stating that "do not wait for an OK response, because none
> >>> will be sent" sounds clearer than "an OK response may, or may not be
> >>> emitted. Or it may be emitted when the VM resumes".
> >>
> >>
> >> Just to make this clear: this "report-only-error" behavior concerns only
> >> guest-suspend-* and guest-shutdown commands, right? Because otherwise,
> >> if we enable such behavior for all commands (e.g. fsfreeze) I think we
> >> are entering the world of pain.
> > 
> > Exactly, this would only concern the suspend and shutdown commands.
> > 
> >> From user POV there is a huge difference between those 2 sets of
> >> commands (suspend/shutdown on one side, the others on the other side):
> >> - the first emits an event on qemu monitor, so libvirt can catch that
> >> and confirm suspend/shutdown has succeeded
> > 
> > Oops, this is a different subject but there's a problem here. Events are 
> > just
> > hints, they shouldn't be your definitive source of information.
> > 
> > For shutdown and suspend-disk, I think that the best indication that
> > the command has succeeded is that the VM will successfully exit. We could
> > also have a special exit status code for suspend-to-disk, because the
> > command could run in parallel with the user powering off the VM and libvirt
> > wouldn't know that (and would think the VM is suspended, while it's really
> > powered-off).
> > 
> > For suspend-ram and suspend-hybrid, we're missing a 'suspended' RunState.
> > The event serve as a good hint and you can use it, but if it's lost for some
> > reason (eg. libvirt crashes before it's received) then libvirt can learn the
> > VM state by issuing query-status.
> > 
> > Now, going back to the original subject. I have to admit that I'm not sure
> > what's the best way to go here.
> > 
> > I'll try to recapitulate (for myself and for those that may be confused) 
> > I'll be
> > verbose a bit.
> > 
> > We have two qemu-ga commands that are special: guest-shutdown and the 
> > guest-suspend
> > family. They are special because they shut down the VM or suspend its 
> > execution
> > (meaning that the world of qemu-ga is gone or gets completely frozen).
> > 
> > Today, shutdown is an asynchronous operation: qemu-ga gets the shutdown 
> > process
> > started and returns to accept new commands. For qemu-ga clients, this 
> > implies:
> > 
> >  1. errors in the shutdown operation are not reported back
> >  2. qemu-ga doesn't block
> > 
> > For qemu-ga this implies having a way to automatically reap terminated 
> > children
> > processes.
> > 
> > The guest-suspend commands do the same when executing the suspend operation,
> > but before they do that they need to query the VM for suspend support and
> > this is done by executing pm-is-supported (if available). This fact 
> > shouldn't
> > be visible to qemu-ga clients, but it has two internal implications:
> > 
> >  1. The operation is half synchronous and half asynchronous
> >  2. In order to bypass the automatic process reaper in qemu-ga when 
> > executing
> >     pm-is-supported, we have to play tricks that makes the suspend code
> >     more complex than it should be
> > 
> > We have two options:
> > 
> >  1. Keep the current behavior (explained above, shutdown is async, suspend
> >     is half sync half async). For libvirt this means nothing changes, for
> >     qemu-ga this means more complex code
> > 
> >  2. Change everything to be synchronous (this series). This essentially 
> > means:
> > 
> >      A. errors are going to reported back
> >      B. qemu-ga will block
> >      C. we avoid all the dirty tricks, and qemu-ga code becomes simpler
> >      D. In theory, this should be a compatible change due to the end of 
> > world
> >         nature of the commands involved
> > 
> > A third possible option would be to have asynchronous support. But I'm not
> > sure whether this would fit well and how complex this would be (specially
> > because of fork()).
> > 
> 
> Okay, thanks for recap.
> One thing that I am sure will not play nicely is old libvirt with new
> qemu-ga. Here's the flow:
> 
> 1. User issues virDomainPMSuspend*()
> 2. Libvirt chews this and calls guest-suspend-* holding up return from
> API until an qemu-ga answer is read
> 3. Guest gets suspended
> ... Eventually ...
> 4. Guest gets resumed and qemu-ga returns "{'return':{}}"
> 5. Libvirt reads response and returns from API
> 
> So, I think if we are going to change these commands, we need to spawn a
> new ones so user can distinguish if he's talking to fixed or broken
> qemu-ga. Or, we can introduce query-* commands family like we already
> have for the monitor, so user can guess it from qemu-ga version;
> 
> Having said how libvirt use qemu-suspend-*; what should be the new
> libvirt behavior? I mean - how long should libvirt wait until it can
> return from API? IOW - after what period of time/what event user can be
> sure no error will be returned? Because as Luiz said above - STOP event
> is just hint, not confirmation. I guess API should wait until qemu
> process dies. This may, however, take ages (esp. in case of
> guest-shutdown).

So, currently, libvirt waits indefinitely (till guest wake-up) for
the guest-suspend-* response? I think that's broken, since
guest-suspend-* is documented such that a response is not garaunteed,
so a time-out mechanism should be employed here. How would they issue
the system_wakeup command to wake up the guest (not sure if this is
in libvirt, but I assume it will be at some point)? Does the command
wait in the background without holding up subsequent
commands/virsh sessions?

If so, or rather either way, we should be relying instead on a
timeout, and if the time-out is hit you should execute the qemu-ga
reset loop (guest-sync/guest-sync-delimited, also with
a timeout, documented here:
http://wiki.qemu.org/Features/QAPI/GuestAgent#QEMU_Guest_Agent_Protocol)
until a connection is re-established (which won't occur until the
guest is woken up). As an optimization you could disable the reset
loop until a wakeup event is recieved, but I'm not sure how reliable
that is, and I don't think we have a reliable indicator currently of
whether or not the guest is in s3/s4, if it's possible I don't think
it's implemented yet.

So I think that's how it should be done currently.

With the changes proposed (respond only if there's an error for
guest-suspend-*/guest-shutdown) you'd do the same thing basically,
except the cases where you'd normally "expect" a response would
always be time outs. That behavior would actually be okay, since it
still aligns with the commands documentation.

There are a couple commands where you can avoid the timeout scenario
on success though:

For guest-shutdown(powerdown) you'd wait/check for process exit
(either after the 10 second timeout on waiting for the error
response, or, preferably, concurrently if that's doable, since it
won't impose any minimum wait time on the user). If, after some
amount of time, you get neither an error nor a process exit, you
should report a timeout/error to the user.

For guest-suspend-disk, you'd do the same, since according to Gleb's
comments we cannot reliably distinguish between shutdown/hibernate
based on any machine state.

For commands that don't result in process exit:
guest-shutdown(reboot|halt)/guest-suspend-ram/guest-suspend-hybrid,
I don't think you have any choice but to implement them the
same way I suggested they should be implemented now: wait some
reasonable amount of time for error responses, 10 seconds or so, and
then go into the reset loop. This means there's still no way to
garauntee a successful suspend state, since the events aren't
considered reliable, and polling runstate via query-status will raise
with guest-wakeup. Maybe with some work we can build some assurances
around these events/states that we can pass to the user, but for now
these commands are strictly "requests" to the guest that may or may
not succeed, success will always result in a timeout (though a
timeout does not garauntee success). I don't think *these* particular
timeouts should be reported to the user, just the errors if any are
recieved before the timeout.

That's my perspective from the qemu-ga side anyway. I'm not too
familiar with what the constraints are within libvirt.

> 
> Michal
> 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]