[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH RESEND 1/1] qmp.c: system_wakeup: adding RUN_STA
Re: [Qemu-devel] [PATCH RESEND 1/1] qmp.c: system_wakeup: adding RUN_STATE_SUSPENDED check before proceeding
Wed, 03 Jan 2018 14:41:31 +0100
Gnus/5.13 (Gnus v5.13) Emacs/25.3 (gnu/linux)
Daniel Henrique Barboza <address@hidden> writes:
> On 01/02/2018 06:03 PM, Eric Blake wrote:
>> On 01/02/2018 11:06 AM, Daniel Henrique Barboza wrote:
>>> The qmp/hmp command 'system_wakeup' is simply a direct call to
>>> 'qemu_system_wakeup_request' from vl.c. This function verifies if
>>> runstate is SUSPENDED and if the wake up reason is valid before
>>> proceeding. However, no error or warning is thrown if any of those
>>> pre-requirements isn't met.
>>> This leads to situations such as the one described in
>>> https://github.com/open-power-host-os/qemu/issues/31, where one
>>> can induce the OS to be suspended by using pm-suspend (via
>>> dompmsuspend, for example) but for some reason the machine failed to
>>> go to the SUSPENDED runstate, staying at runstate RUNNING. The user
>>> then tries to wake up the guest using system_wakeup (or dompmwakeup),
>>> no error is thrown but nothing happened either because the wake up
>>> wasn't fired at all. In the end, the user is left with a guest that
>>> is dormant and believing that system_wakeup isn't working.
>>> This patch changes qmp_system_wakeup to make the runstate verification
>>> before proceeding to call qemu_system_wakeup_request, firing up
>>> an error message if the user tries to wake up a machine that
>>> isn't in SUSPENDED state. The change isn't made inside
>>> qemu_system_wakeup_request because it is used in migration,
>>> ACPI and others where this usage might be valid. This patch
>>> by no means fixes the situation described above, but it can direct
>>> the user/management closer to the real problem.
>> Converting something that silently did nothing successfully into now
>> returning an error is somewhat backwards-incompatible. Is this
>> something that needs to go through a deprecation cycle, to give people a
>> chance to fix QMP scripts that were relying on the previous no-op
>> behavior to now work with the new semantics?
I doubt it.
We're fixing a silent failure to be "loud". How exactly could that
Applications that don't bother to check the result remain exactly as
broken as before.
Applications that do check become aware of the fact that the command did
not work as advertized. I guess this risks breaking them differently.
qmp-spec.txt section 5. gives us license to add errors: "Clients must
not assume any particular [...] errors generated by a command, that is,
new errors can be added to any existing command in newer versions of the
>> behavior to now work with the new semantics? Or should we add an
>> optional bool parameter to the command, where omitting the parameter
>> gives the old semantics, but new enough clients can pass the bool to
>> choose which behavior (no-op or error) they prefer?
> Good point. Back when I first coded this patch I made sure that no QEMU test
> were broken, but I didn't consider that existing user scripts that uses
> system_wakeup might now yield a different result.
> The extra optional parameter seems a good approach to take here IMO. The
> current system_wakeup implementation works fine if the user is careful
> enough to not call it if the guest isn't on SUSPENDED state, so I am unsure
> if deprecating this behavior in favor of this new one is a bit too extreme.
I'd say we call the silent failure to wake up the guest a bug, and your
patch a bug fix.
I figure we could find precedence for turning broken successful behavior
into a clean error if we looked for it.