qemu-trivial
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH V2 0/3] improve -overcommit cpu-pm=on|off


From: Zhao Liu
Subject: Re: [PATCH V2 0/3] improve -overcommit cpu-pm=on|off
Date: Thu, 30 May 2024 21:54:47 +0800

Hi Zide,

On Wed, May 29, 2024 at 10:31:21AM -0700, Chen, Zide wrote:
> Date: Wed, 29 May 2024 10:31:21 -0700
> From: "Chen, Zide" <zide.chen@intel.com>
> Subject: Re: [PATCH V2 0/3] improve -overcommit cpu-pm=on|off
> 
> 
> 
> On 5/29/2024 5:46 AM, Igor Mammedov wrote:
> > On Tue, 28 May 2024 11:16:59 -0700
> > "Chen, Zide" <zide.chen@intel.com> wrote:
> > 
> >> On 5/28/2024 2:23 AM, Igor Mammedov wrote:
> >>> On Fri, 24 May 2024 13:00:14 -0700
> >>> Zide Chen <zide.chen@intel.com> wrote:
> >>>   
> >>>> Currently, if running "-overcommit cpu-pm=on" on hosts that don't
> >>>> have MWAIT support, the MWAIT/MONITOR feature is advertised to the
> >>>> guest and executing MWAIT/MONITOR on the guest triggers #UD.  
> >>>
> >>> this is missing proper description how do you trigger issue
> >>> with reproducer and detailed description why guest sees MWAIT
> >>> when it's not supported by host.  
> >>
> >> If "overcommit cpu-pm=on" and "-cpu host" are present, as shown in the
> > it's bette to provide full QEMU CLI and host/guest kernels used and what
> > hardware was used if it's relevant so others can reproduce problem.
> 
> I ever reproduced this on an older Intel Icelake machine, a
> Sapphire Rapids and a Sierra Forest, but I believe this is a x86 generic
> issue, not specific to particular models.
> 
> For the CLI, I think the only command line options that matter are
>  -overcommit cpu-pm=on: to set enable_cpu_pm
>  -cpu host: so that cpu->max_features is set
> 
> For QEMU version, as long as it's after this commit: 662175b91ff2
> ("i386: reorder call to cpu_exec_realizefn")
> 
> The guest fails to boot:
> 
> [ 24.825568] smpboot: x86: Booting SMP configuration:
> [ 24.826377] .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 #12
> #13 #14 #15 #17
> [ 24.985799] .... node #1, CPUs: #128 #129 #130 #131 #132 #133 #134 #135
> #136 #137 #138 #139 #140 #141 #142 #143 #145
> [ 25.136955] invalid opcode: 0000 1 PREEMPT SMP NOPTI
> [ 25.137790] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.8.0 #2
> [ 25.137790] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/04
> [ 25.137790] RIP: 0010:mwait_idle+0x35/0x80
> [ 25.137790] Code: 6f f0 80 48 02 20 48 8b 10 83 e2 08 75 3e 65 48 8b 15
> 47 d6 56 6f 48 0f ba e2 27 72 41 31 d2 48 89 d8
> [ 25.137790] RSP: 0000:ffffffff91403e70 EFLAGS: 00010046
> [ 25.137790] RAX: ffffffff9140a980 RBX: ffffffff9140a980 RCX:
> 0000000000000000
> [ 25.137790] RDX: 0000000000000000 RSI: ffff97f1ade21b20 RDI:
> 0000000000000004
> [ 25.137790] RBP: 0000000000000000 R08: 00000005da4709cb R09:
> 0000000000000001
> [ 25.137790] R10: 0000000000005da4 R11: 0000000000000009 R12:
> 0000000000000000
> [ 25.137790] R13: ffff98573ff90fc0 R14: ffffffff9140a038 R15:
> 0000000000093ff0
> [ 25.137790] FS: 0000000000000000(0000) GS:ffff97f1ade00000(0000)
> knlGS:0000000000000000
> [ 25.137790] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 25.137790] CR2: ffff97d8aa801000 CR3: 00000049e9430001 CR4:
> 0000000000770ef0
> [ 25.137790] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [ 25.137790] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7:
> 0000000000000400
> [ 25.137790] PKRU: 55555554
> [ 25.137790] Call Trace:
> [ 25.137790] <TASK>
> [ 25.137790] ? die+0x37/0x90
> [ 25.137790] ? do_trap+0xe3/0x110
> [ 25.137790] ? mwait_idle+0x35/0x80
> [ 25.137790] ? do_error_trap+0x6a/0x90
> [ 25.137790] ? mwait_idle+0x35/0x80
> [ 25.137790] ? exc_invalid_op+0x52/0x70
> [ 25.137790] ? mwait_idle+0x35/0x80
> [ 25.137790] ? asm_exc_invalid_op+0x1a/0x20
> [ 25.137790] ? mwait_idle+0x35/0x80
> [ 25.137790] default_idle_call+0x30/0x100
> [ 25.137790] cpuidle_idle_call+0x12c/0x170
> [ 25.137790] ? tsc_verify_tsc_adjust+0x73/0xd0
> [ 25.137790] do_idle+0x7f/0xd0
> [ 25.137790] cpu_startup_entry+0x29/0x30
> [ 25.137790] rest_init+0xcc/0xd0
> [ 25.137790] start_kernel+0x396/0x5d0
> [ 25.137790] x86_64_start_reservations+0x18/0x30
> [ 25.137790] x86_64_start_kernel+0xe7/0xf0
> [ 25.137790] common_startup_64+0x13e/0x148
> [ 25.137790] </TASK>
> [ 25.137790] Modules linked in:
> [ 25.137790] --[ end trace 0000000000000000 ]--
> [ 25.137790] invalid opcode: 0000 2 PREEMPT SMP NOPTI
> [ 25.137790] RIP: 0010:mwait_idle+0x35/0x80
> [ 25.137790] Code: 6f f0 80 48 02 20 48 8b 10 83 e2 08 75 3e 65 48 8b 15
> 47 d6 56 6f 48 0f ba e2 27 72 41 31 d2 48 89 d8
> 
> > 
> >> following, CPUID_EXT_MONITOR is set after x86_cpu_filter_features(), so
> >> that it doesn't have a chance to check MWAIT against host features and
> >> will be advertised to the guest regardless of whether it's supported by
> >> the host or not.
> >>
> >> x86_cpu_realizefn()
> >>   x86_cpu_filter_features()
> >>   cpu_exec_realizefn()
> >>     kvm_cpu_realizefn
> >>       host_cpu_realizefn
> >>         host_cpu_enable_cpu_pm
> >>           env->features[FEAT_1_ECX] |= CPUID_EXT_MONITOR;
> >>
> >>
> >> If it's not supported by the host, executing MONITOR or MWAIT
> >> instructions from the guest triggers #UD, no matter MWAIT_EXITING
> >> control is set or not.
> > 
> > If I recall right, kvm was able to emulate mwait/monitor.
> > So question is why it leads to exception instead?
> 
> KVM can come to play only iff it can trigger MWAIT/MONITOR VM exits. I
> didn't find explicit proof from Intel SDM that #UD exceptions take
> precedence over MWAIT/MONITOR VM exits, but this is my speculation. For
> example, in ancient machines which don't support MWAIT yet, the only way
> it can do is #UD, not MWAIT VM exit?

For the Host which doesn't support MWAIT, it shouldn't have the VMX
control bit for mwait exit either, right?

Could you pls check this on your machine? If VMX doesn't support this
exit event, then triggering an exception will make sense.

-Zhao




reply via email to

[Prev in Thread] Current Thread [Next in Thread]