qemu-devel

Windows 10 and 11 VMs fails to boot with SapphireRapids CPU definition


From: Manish Mishra
Subject: Windows 10 and 11 VMs fails to boot with SapphireRapids CPU definition
Date: Thu, 18 Jul 2024 11:44:28 +0000

Hi Everyone,

We are facing issues booting Windows VMs with the SapphireRapids CPU definition. This happens when multiple cores per vcpu are set and the VM uses UEFI with Secure Boot and Credential Guard enabled. So far we have observed the issue on Windows 10 and 11.

 

We did some triaging. The SapphireRapids CPU definition raises cpuid_level to 0x20, so the guest sees the V2 extended topology leaf (0x1f). QEMU returns all zeros for this leaf when !x86_has_extended_topo(). As expected (and as described in https://cdrdv2-public.intel.com/775917/intel-64-architecture-processor-topology-enumeration.pdf), a guest that sees this should fall back to leaf 0xb. Windows 10 and Windows 11 do not appear to handle this case and panic on boot.
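To make the expected fallback concrete, here is a minimal sketch in C of how a guest could choose between the two leaves. It is only an illustration of our reading of the enumeration document, not Windows or QEMU code. The names pick_topology_leaf, cpuid_fn, cpuid_zeroed_1f and cpuid_mirrored_1f are hypothetical, the CPUID instruction is replaced by a callback returning canned values, and the maximum basic leaf of 0x20 on the bare-metal model is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register values returned by one CPUID (leaf, subleaf) query. */
struct cpuid_regs {
    uint32_t eax, ebx, ecx, edx;
};

/* Hypothetical stand-in for the CPUID instruction, so the logic can be
 * exercised against canned values instead of real hardware. */
typedef struct cpuid_regs (*cpuid_fn)(uint32_t leaf, uint32_t subleaf);

/* Prefer leaf 0x1f when it exists and reports a non-zero EBX at
 * subleaf 0; otherwise fall back to leaf 0xb.
 * Returns 0 if neither leaf is usable. */
static uint32_t pick_topology_leaf(cpuid_fn cpuid)
{
    assert(cpuid != NULL);
    uint32_t max_leaf = cpuid(0, 0).eax;

    if (max_leaf >= 0x1f && cpuid(0x1f, 0).ebx != 0)
        return 0x1f;
    if (max_leaf >= 0xb && cpuid(0xb, 0).ebx != 0)
        return 0xb;
    return 0;
}

/* Canned leaf 0xb values taken from the bare-metal output in this mail
 * (EAX = bit width, EBX = logical processors, ECX = type << 8 | level,
 * EDX = x2APIC ID). Subleaves >= 2 are simplified to all zeros. */
static struct cpuid_regs leaf_b(uint32_t subleaf)
{
    struct cpuid_regs r = {0, 0, 0, 0};

    if (subleaf == 0)
        r = (struct cpuid_regs){ 1, 2, (1u << 8) | 0, 37 };
    else if (subleaf == 1)
        r = (struct cpuid_regs){ 7, 40, (2u << 8) | 1, 37 };
    return r;
}

/* Model of the guest view described above: max leaf 0x20, leaf 0x1f
 * reads as all zeros. */
static struct cpuid_regs cpuid_zeroed_1f(uint32_t leaf, uint32_t subleaf)
{
    struct cpuid_regs r = {0, 0, 0, 0};

    if (leaf == 0)
        r.eax = 0x20;
    else if (leaf == 0xb)
        r = leaf_b(subleaf);
    return r;
}

/* Model of the bare-metal host: leaf 0x1f mirrors leaf 0xb.
 * The max leaf value 0x20 is an assumption here. */
static struct cpuid_regs cpuid_mirrored_1f(uint32_t leaf, uint32_t subleaf)
{
    struct cpuid_regs r = {0, 0, 0, 0};

    if (leaf == 0)
        r.eax = 0x20;
    else if (leaf == 0xb || leaf == 0x1f)
        r = leaf_b(subleaf);
    return r;
}
```

With these models, pick_topology_leaf(cpuid_zeroed_1f) selects 0xb and pick_topology_leaf(cpuid_mirrored_1f) selects 0x1f. A guest that skips the EBX check and trusts leaf 0x1f whenever the maximum leaf allows it would read an all-zero topology in the first case.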

 

We checked one SapphireRapids node without a multi-die topology; its CPUID output is shown below. The 0x1f output is the same as the 0xb output.

 

# cpuid -l 0xb -s 0 -1
CPU:
   x2APIC features / processor topology (0xb):
      extended APIC ID                      = 37
      --- level 0 ---
      level number                          = 0x0 (0)
      level type                            = thread (1)
      bit width of level & previous levels  = 0x1 (1)
      number of logical processors at level = 0x2 (2)

# cpuid -l 0xb -s 1 -1
CPU:
      --- level 1 ---
      level number                          = 0x1 (1)
      level type                            = core (2)
      bit width of level & previous levels  = 0x7 (7)
      number of logical processors at level = 0x28 (40)

# cpuid -l 0xb -s 2 -1
CPU:
      --- level 2 ---
      level number                          = 0x2 (2)
      level type                            = invalid (0)
      bit width of level & previous levels  = 0x0 (0)
      number of logical processors at level = 0x0 (0)

# cpuid -l 0x1f -s 0 -1
CPU:
   V2 extended topology (0x1f):
      x2APIC ID of logical processor = 0x25 (37)
      --- level 0 ---
      level number                          = 0x0 (0)
      level type                            = thread (1)
      bit width of level & previous levels  = 0x1 (1)
      number of logical processors at level = 0x2 (2)

# cpuid -l 0x1f -s 1 -1
CPU:
      --- level 1 ---
      level number                          = 0x1 (1)
      level type                            = core (2)
      bit width of level & previous levels  = 0x7 (7)
      number of logical processors at level = 0x28 (40)

# cpuid -l 0x1f -s 2 -1
CPU:
      --- level 2 ---
      level number                          = 0x2 (2)
      level type                            = invalid (0)
      bit width of level & previous levels  = 0x0 (0)
      number of logical processors at level = 0x0 (0)
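
For reference, the bit widths in the output above are what a guest uses to split the x2APIC ID into thread, core and package IDs. The following is a small sketch in C of that arithmetic; decode_x2apic_id and struct topo_ids are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* IDs derived from an x2APIC ID for a two-level (thread, core) layout. */
struct topo_ids {
    uint32_t smt_id;   /* thread within the core */
    uint32_t core_id;  /* core within the package */
    uint32_t pkg_id;   /* package */
};

/* smt_shift and core_shift are the "bit width of level & previous
 * levels" values reported at level 0 and level 1 respectively. */
static struct topo_ids decode_x2apic_id(uint32_t x2apic_id,
                                        uint32_t smt_shift,
                                        uint32_t core_shift)
{
    assert(smt_shift <= core_shift && core_shift < 32);

    struct topo_ids ids;
    ids.smt_id = x2apic_id & ((1u << smt_shift) - 1);
    ids.core_id = (x2apic_id >> smt_shift) &
                  ((1u << (core_shift - smt_shift)) - 1);
    ids.pkg_id = x2apic_id >> core_shift;
    return ids;
}
```

With the values shown above (x2APIC ID 37, widths 1 and 7), decode_x2apic_id(37, 1, 7) gives thread 1, core 18, package 0. If both widths are read as zero from an all-zero leaf 0x1f, no such split is possible, which is why the fallback to leaf 0xb matters.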

 

We tried a workaround that makes the 0x1f output the same as the 0xb output when !x86_has_extended_topo(), instead of returning all zeros. This seems to work fine. Our understanding is that the current QEMU behaviour is not incorrect, but does the workaround still make sense? It matches what we see on bare metal, so it should not be unreasonable. If so, we will be happy to send a patch.
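
As a rough illustration of the idea, the sketch below models the two behaviours in standalone C. This is not the actual QEMU code or the proposed patch; leaf_1f_current, leaf_1f_workaround and their parameters are hypothetical names, and has_extended_topo simply stands for the result of x86_has_extended_topo().

```c
#include <stdbool.h>
#include <stdint.h>

/* Register values returned for one CPUID subleaf. */
struct cpuid_regs {
    uint32_t eax, ebx, ecx, edx;
};

/* Current behaviour as described in this mail: leaf 0x1f reads as all
 * zeros unless the VM has an extended topology. native_1f is whatever
 * the extended-topology encoding would otherwise produce. */
static struct cpuid_regs leaf_1f_current(bool has_extended_topo,
                                         struct cpuid_regs native_1f)
{
    struct cpuid_regs zero = {0, 0, 0, 0};

    return has_extended_topo ? native_1f : zero;
}

/* Workaround: without an extended topology, leaf 0x1f mirrors the
 * leaf 0xb values for the same subleaf instead of returning zeros. */
static struct cpuid_regs leaf_1f_workaround(bool has_extended_topo,
                                            struct cpuid_regs native_1f,
                                            struct cpuid_regs leaf_b)
{
    return has_extended_topo ? native_1f : leaf_b;
}
```

Feeding the subleaf 0 values from the bare-metal output (EAX=1, EBX=2, ECX=0x100, EDX=37) as leaf_b, the workaround returns EBX=2 for leaf 0x1f where the current behaviour returns EBX=0.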

 

Thanks

Manish Mishra

 

 

