qemu-ppc
Re: [Qemu-ppc] KVM internal error with kvm-pr on ppc64le


From: Loic Devulder
Subject: Re: [Qemu-ppc] KVM internal error with kvm-pr on ppc64le
Date: Thu, 31 May 2018 15:02:06 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0

Hi,

After some testing I found a way to work around this by using two graphics
cards in the VM.

Boot with this:
# /usr/bin/qemu-system-ppc64 -serial mon:stdio -m 1024 -cpu host \
    -device virtio-scsi-pci,id=scsi0 \
    -device scsi-cd,drive=cd0,bus=scsi0.0 \
    -drive media=cdrom,if=none,id=cd0,format=raw,file=/opt/openSUSE-Leap-42.3-NET-ppc64le-Build0130-Media.iso \
    -smp 1,threads=1 -enable-kvm \
    -device nec-usb-xhci -device usb-tablet -device usb-kbd \
    -device virtio-vga -device virtio-gpu-pci -nodefaults

Then do this in the SLOF shell (be careful: the order matters, but I don't
know why...):
0 > dev /pci/vga
0 > s" device_type" delete-property
0 > dev /pci/display
0 > s" display" device-type
0 > boot

Now the OS boots using 'virtio-gpu-pci' instead of 'virtio-vga', and it no
longer crashes.

I also tried the same thing with '-vga std' instead of '-device virtio-vga'.
In that case I need to first add the 'device_type' property for
'virtio-gpu-pci' and only *afterwards* remove it for the VGA card (again, I
don't know why). But even with this I still get a 'KVM internal error', a new
one with an even weirder suberror: "Suberror: -2130704892".
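
To compare the suberror values more easily, they can be printed as unsigned
32-bit hex. This is only a small sketch with plain shell arithmetic: the
function name suberror_hex is made up for illustration, and it says nothing
about what the values mean, it only shows the bit pattern.

```shell
# Print a (possibly negative) decimal suberror as unsigned 32-bit hex.
# suberror_hex is a hypothetical helper name, not part of QEMU or KVM.
suberror_hex() {
    # Mask to the low 32 bits so negative values show their bit pattern.
    printf '0x%08x\n' $(( $1 & 0xffffffff ))
}

suberror_hex -2130704892   # value from the '-vga std' attempt
suberror_hex 61568         # value from the virtio-vga run on the VGA console
suberror_hex 88            # value from the virtio-vga run on hvterm
```

The three values come out as 0x81000604, 0x0000f080 and 0x00000058.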

Here is the trace:
Trying to load:  from: /address@hidden/address@hidden/address@hidden ...   Successfully loaded
SUSE Linux
#1 SMP Fri Jun 3KVM internal error. Suberror: -2130704892
extra data[0]: 1e1e1e1e
extra data[1]: 100000004
extra data[2]: 2fff0000
extra data[3]: 3000000002001000
extra data[4]: b000000002001033
extra data[5]: 42000000
extra data[6]: 8000000000002933
extra data[7]: c000000000fd0874
extra data[8]: 2fff0000
extra data[9]: c000000000030ed0
extra data[10]: 0
extra data[11]: 0
[...]
extra data[8190]: 0
extra data[8191]: 0
NIP c0000000000264b0   LR d0000000020b7298 CTR 00000000000c0000 XER 0000000020000000 CPU#0
MSR b000000000009033 HID0 0000000000000000  HF 8000000000000000 iidx 3 didx 3
TB 00000000 00000000 DECR 00000000
GPR00 d0000000020b6ee4 c00000003a0fef80 d0000000020c9230 d000080080400000
GPR04 0000000000000000 0000000000300000 0800200bc00003a1 c00000003e8af1b0
GPR08 00000000000c0000 d000080080400000 0000000000000000 d0000000020bd818
GPR12 c000000000026440 c00000000fa00000 0000000000000020 d00000000211de66
GPR16 d00000000211e760 0000000000000002 c00000003a0ffdfc d000000002150000
GPR20 c00000003a5fb800 c00000003a0e7888 c00000003a9df2b8 0000000000000000
GPR24 c00000003a9df280 c00000003a0ff1f0 c00000003a0e79b8 0000000000000000
GPR28 c00000003a0e78a8 c00000003a0ff1f0 c00000003a0e7888 0000000000000000
CR 24228824  [ E  G  E  E  L  L  E  G  ]             RES ffffffffffffffff
FPR00 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR04 0000000000000000 0000000000000000 414e564544006b63 45440034706f6f6c
FPR08 554e514553006b73 4954494e495f4345 0000000000000000 0000000000000000
FPR12 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR20 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR24 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR28 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPSCR 0000000000000000
 SRR0 c0000000000264b0  SRR1 b000000000009033    PVR 00000000004b0201 VRSAVE 0000000000000000
SPRG0 0000000000000000 SPRG1 c00000000fa00000  SPRG2 c00000000fa00000  SPRG3 0000000000000000
SPRG4 0000000000000000 SPRG5 0000000000000000  SPRG6 0000000000000000  SPRG7 0000000000000000
HSRR0 0000000000000000 HSRR1 0000000000000000
 CFAR 0000000000000000
 LPCR 0000000002000000
 SDR1 0000000000000000   DAR d000080080400000  DSISR 0000000042000000
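
The DSISR value in this dump can be checked bit by bit. The sketch below
assumes the two masks that the Linux kernel headers call DSISR_NOHPTE
(0x40000000, no translation found) and DSISR_ISSTORE (0x02000000, access was
a store); has_bit is a made-up helper name. I am not sure how much the dumped
DSISR can be trusted under kvm-pr, so take it as a hint only.

```shell
# Check single bits of the DSISR value printed in the register dump.
# has_bit is a hypothetical helper; the masks are assumed to match the
# kernel's DSISR_NOHPTE and DSISR_ISSTORE definitions.
dsisr=0x42000000

has_bit() {
    # Succeeds when any bit of mask $2 is set in value $1.
    [ $(( $1 & $2 )) -ne 0 ]
}

has_bit "$dsisr" 0x40000000 && echo "no translation found"
has_bit "$dsisr" 0x02000000 && echo "access was a store"
```

Both bits are set in 0x42000000, so the dump looks like a store to an address
that had no translation at that point.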

I don't know whether this is useful, but anyone else who hits this issue now
has a workaround :)
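
For anyone reading the kvm_exit lines in my first mail quoted below, the
last_inst words can be split into fields with shell arithmetic. A minimal
sketch, assuming the usual Power ISA field positions (primary opcode in the
top 6 bits, then two 5-bit register fields); ppc_fields is a made-up name and
this is not a real disassembler.

```shell
# Split a 32-bit PowerPC instruction word into a few fixed fields.
# ppc_fields is a hypothetical helper name, not part of any tool.
ppc_fields() {
    inst=$(( $1 ))
    # opcode: top 6 bits; rs/ra: the next two 5-bit fields;
    # xo10: 10-bit extended opcode position used by XL-form instructions;
    # xo2: low 2 bits, the extended opcode position of DS-form instructions.
    printf 'opcode=%d rs=%d ra=%d xo10=%d xo2=%d\n' \
        $(( (inst >> 26) & 0x3f )) \
        $(( (inst >> 21) & 0x1f )) \
        $(( (inst >> 16) & 0x1f )) \
        $(( (inst >> 1) & 0x3ff )) \
        $(( inst & 0x3 ))
}

ppc_fields 0x4c000024   # last_inst at the PROGRAM and DATA_SEGMENT exits
ppc_fields 0xf8c90000   # last_inst at the DATA_STORAGE exit
```

If I read the Power ISA opcode tables correctly, opcode 19 with extended
opcode 18 is rfid, and opcode 62 with the low two bits at 0 is std, so
0xf8c90000 would be "std r6,0(r9)". GPR09 in that dump holds
d0000800803ce800, which is the same value as DAR.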

On 05/30/2018 02:59 PM, Loic Devulder wrote:
> Hi guys!
> 
> I tried to boot a Linux CD (openSUSE Leap 42.3) to install my ppc64le VM
> with kvm-pr (yes, kvm-pr!) on a Power8 host, and I ran into some issues...
> 
> My host system is an openSUSE Tumbleweed 20180525 with qemu 2.11 and kernel 
> 4.16.11.
> 
> I tried different versions of qemu on this host without any success, but it
> works with openSUSE Leap 42.3 as the host (with both qemu 2.11 and 2.12).
> The kernel is 4.4 on Leap 42.3, and if I update it to a more recent version
> I get the same issue! So it seems to be a kvm problem, not a qemu one.
> 
> I also tried without KVM and it works (but it is of course very slow!).
> 
> I tried to understand why, and it seems to be related to virtio-vga. Here
> are the different attempts I made with openSUSE Tumbleweed 20180525 as the
> host.
> 
> - With only virtio-vga and boot/messages on VGA console:
> /usr/bin/qemu-system-ppc64 -m 1024 -cpu host -device virtio-scsi-pci,id=scsi0 
> -device
> scsi-cd,drive=cd0,bus=scsi0.0 -drive
> media=cdrom,if=none,id=cd0,format=raw,file=/opt/openSUSE-Leap-42.3-NET-ppc64le-Build0130-Media.iso
> -smp 1,threads=1 -enable-kvm -device nec-usb-xhci -device usb-tablet -device 
> usb-kbd -serial
> stdio:mon -device virtio-vga -nodefaults
> 
> Trying to load:  from: /address@hidden/address@hidden/address@hidden ...   
> Successfully loaded
> SUSE Linux
> #1 SMP Fri Jun 3KVM internal error. Suberror: 61568
> NIP c0000000005359f4   LR c0000000005356cc CTR 0000000000000032 XER 
> 0000000000000000 CPU#0
> MSR b000000000009033 HID0 0000000000000000  HF 8000000000000000 iidx 3 didx 3
> TB 00000000 00000000 DECR 00000000
> GPR00 0000000000006400 c00000003e49f440 c000000000f3ba00 0000000000000190
> GPR04 d0000800803ce800 0000000000000000 0000000000000000 0000000000000000
> GPR08 0000000000000040 d0000800803ce800 0000000000000032 ffffffffffffffff
> GPR12 c000000000535950 c00000000fa00000 c00000000000bf38 0000000000000000
> GPR16 0000000000000064 0000000000000025 0000000000000064 c0000000012d5a98
> GPR20 0000000000000000 0000000000000320 0000000000000000 0000000000000000
> GPR24 0000000000000006 0000000000000008 0000000000000000 0000000000006400
> GPR28 c000000000535950 c00000003e719800 0000000000000000 d0000800803ce800
> CR 44022084  [ G  G  -  E  E  -  L  G  ]             RES ffffffffffffffff
> FPR00 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> FPR04 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> FPR08 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> FPR12 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> FPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> FPR20 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> FPR24 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> FPR28 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> FPSCR 0000000000000000
>  SRR0 c0000000005359f4  SRR1 b000000000009033    PVR 00000000004b0201 VRSAVE 
> 0000000000000000
> SPRG0 0000000000000000 SPRG1 c00000000fa00000  SPRG2 c00000000fa00000  SPRG3 
> 0000000000000000
> SPRG4 0000000000000000 SPRG5 0000000000000000  SPRG6 0000000000000000  SPRG7 
> 0000000000000000
> HSRR0 0000000000000000 HSRR1 0000000000000000
>  CFAR 0000000000000000
>  LPCR 0000000002000000
>  SDR1 0000000000000000   DAR d0000800803ce800  DSISR 0000000042000000
> 
> I get the messages from "Loading kernel..." to "Booting Linux via
> __start() ..." on the VGA console, and then the crash on the serial
> console. "Suberror: 61568" is weird...
> 
> Same if I boot with "-vga std" or "-vga virtio" instead of "-device 
> virtio-vga" (I know that "-vga
> virtio" and "-device virtio-vga" are the same).
> 
> With the help of Alexander Graf (thanks!) I was able to get some debug
> information:
> qemu-system-ppc-9944  [006] .... 83419.417302: kvm_book3s_reenter: reentry 
> r=1 | pc=0xc000000000089930
>  qemu-system-ppc-9944  [006] .... 83419.417304: kvm_exit: exit=PROGRAM | 
> pc=0xc000000000009958 |
> msr=0xb000000000001031 | dar=0x7fff67356200 | srr1=0x800000000004d033 | 
> last_inst=0x4c000024
>  qemu-system-ppc-9944  [006] .... 83419.417305: kvm_ppc_instr: inst 
> 1275068452 pc 0xc0000000005359f4
> emulate 0
>  qemu-system-ppc-9944  [006] .... 83419.417305: kvm_book3s_reenter: reentry 
> r=1 | pc=0xc0000000005359f4
>  qemu-system-ppc-9944  [006] .... 83419.417307: kvm_exit: exit=DATA_SEGMENT | 
> pc=0xc0000000005359f4
> | msr=0xb000000000009033 | dar=0xd0000800803ce800 | srr1=0x800000000000d033 | 
> last_inst=0x4c000024
>  qemu-system-ppc-9944  [006] .... 83419.417307: kvm_book3s_slb_found: 
> 18000033deaf8008 -> 61c0f87e5e
>  qemu-system-ppc-9944  [006] .... 83419.417308: kvm_book3s_slbmte: 
> 61c0f87e5e990, d000080088000003
>  qemu-system-ppc-9944  [006] .... 83419.417308: kvm_book3s_reenter: reentry 
> r=0 | pc=0xc0000000005359f4
>  qemu-system-ppc-9944  [006] .... 83419.417310: kvm_exit: exit=DATA_STORAGE | 
> pc=0xc0000000005359f4
> | msr=0xb000000000009033 | dar=0xd0000800803ce800 | srr1=0x800000000000d033 | 
> last_inst=0xf8c90000
>  qemu-system-ppc-9944  [006] .... 83419.417315: kvm_book3s_slb_found: 
> 18000033deaf8008 -> 61c0f87e5e
>  qemu-system-ppc-9944  [006] .... 83419.417317: kvm_book3s_reenter: reentry 
> r=2 | pc=0xc0000000005359f4
>  qemu-system-ppc-9944  [006] .... 83419.417319: kvm_userspace_exit: reason 
> KVM_EXIT_INTERNAL_ERROR (17)
> qemu-system-ppc-9941  [004] .... 83419.447328: kvm_book3s_mmu_flush: Flush 
> 5473 pPTEs: 200080000000
> - 200081000000
>            gdbus-9946  [004] .... 83465.282674: kvm_book3s_mmu_flush: Flush 
> 5473 PTEs: 0 - 0
>            gdbus-9946  [004] .... 83465.282684: kvm_book3s_mmu_invalidate: 
> Flush: hva=61c1b7cf37f7f0
> pfn=7677e ea=c00000002f7f0000 vp=ea1b217002f7f0 ra=2f7f0000 [6]
> 
> - With only virtio-vga and boot/messages on hvterm console (I removed the USB 
> part to force the
> console output):
> /usr/bin/qemu-system-ppc64 -serial mon:stdio -m 1024 -cpu host -device 
> virtio-scsi-pci,id=scsi0
> -device scsi-cd,drive=cd0,bus=scsi0.0 -drive
> media=cdrom,if=none,id=cd0,format=raw,file=/opt/openSUSE-Leap-42.3-NET-ppc64le-Build0130-Media.iso
> -smp 1,threads=1 -enable-kvm -device virtio-vga -nodefaults
> 
> Loading kernel ...
> Loading initial ramdisk ...
> OF stdout device is: /vdevice/address@hidden
> Preparing to boot Linux version 4.4.74-1-default (address@hidden) (gcc 
> version 4.8.5 (SUSE Linux) )
> #1 SMP Fri Jun 30 19:11:19 UTC 2017 (f0a86d9)
> Detected machine type: 0000000000000101
> Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
> Calling ibm,client-architecture-support... done
> command line: BOOT_IMAGE=/boot/ppc64le/linux
> memory layout at init:
>   memory_limit : 0000000000000000 (16 MB aligned)
>   alloc_bottom : 0000000006380000
>   alloc_top    : 0000000030000000
>   alloc_top_hi : 0000000040000000
>   rmo_top      : 0000000030000000
>   ram_top      : 0000000040000000
> found display   : /address@hidden/address@hidden, opening... done
> instantiating rtas at 0x000000002fff0000... done
> prom_hold_cpus: skipped
> copying OF device tree...
> Building dt strings...
> Building dt structure...
> Device tree strings 0x0000000006390000 -> 0x0000000006390abe
> Device tree struct  0x00000000063a0000 -> 0x00000000063b0000
> Quiescing Open Firmware ...
> Booting Linux via __start() ...
> Loading kernel ...
> Loading initial ramdisk ...
> OF stdout device is: /vdevice/address@hidden
> Preparing to boot Linux version 4.4.74-1-default (address@hidden) (gcc 
> version 4.8.5 (SUSE Linux) )
> #1 SMP Fri Jun 30 19:11:19 UTC 2017 (f0a86d9)
> Detected machine type: 0000000000000101
> Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
> Calling ibm,client-architecture-support... done
> command line: BOOT_IMAGE=/boot/ppc64le/linux
> [...]
> io scheduler cfq registered (default)
> pci_hotplug: PCI Hot Plug PCI Core version: 0.5
> pciehp: PCI Express Hot Plug Controller Driver version: 0.4
> Using unsupported 800x600 vga at 200080800000, depth=32, pitch=3200
> KVM internal error. Suberror: 88
> NIP c0000000005359f4   LR c0000000005356cc CTR 0000000000000032 XER 
> 0000000000000000 CPU#0
> MSR b000000000009033 HID0 0000000000000000  HF 8000000000000000 iidx 3 didx 3
> TB 00000000 00000000 DECR 00000000
> GPR00 0000000000006400 c00000003e49f440 c000000000f3ba00 0000000000000190
> GPR04 d0000800801ce800 0000000000000000 0000000000000000 0000000000000000
> GPR08 0000000000000040 d0000800801ce800 0000000000000032 ffffffffffffffff
> GPR12 c000000000535950 c00000000fa00000 c00000000000bf38 0000000000000000
> GPR16 0000000000000064 0000000000000025 0000000000000064 c0000000012d5a98
> GPR20 0000000000000000 0000000000000320 0000000000000000 0000000000000000
> GPR24 0000000000000006 0000000000000008 0000000000000000 0000000000006400
> GPR28 c000000000535950 c00000003e719800 0000000000000000 d0000800801ce800
> CR 44022084  [ G  G  -  E  E  -  L  G  ]             RES ffffffffffffffff
> FPR00 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> FPR04 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> FPR08 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> FPR12 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> FPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> FPR20 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> FPR24 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> FPR28 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> FPSCR 0000000000000000
>  SRR0 c0000000005359f4  SRR1 b000000000009033    PVR 00000000004b0201 VRSAVE 
> 0000000000000000
> SPRG0 0000000000000000 SPRG1 c00000000fa00000  SPRG2 c00000000fa00000  SPRG3 
> 0000000000000000
> SPRG4 0000000000000000 SPRG5 0000000000000000  SPRG6 0000000000000000  SPRG7 
> 0000000000000000
> HSRR0 0000000000000000 HSRR1 0000000000000000
>  CFAR 0000000000000000
>  LPCR 0000000002000000
>  SDR1 0000000000000000   DAR d0000800801ce800  DSISR 0000000042000000
> 
> It's a little bit better, I think "Suberror: 88" is more comprehensible (at 
> least I hope!).
> 
> Same if I boot with "-vga std" or "-vga virtio" instead of "-device 
> virtio-vga".
> 
> Some debug information:
> qemu-system-ppc-10016 [004] .... 83794.482587: kvm_book3s_reenter: reentry 
> r=1 | pc=0xc000000000089930
>  qemu-system-ppc-10016 [004] .... 83794.482588: kvm_exit: exit=PROGRAM | 
> pc=0xc000000000009958 |
> msr=0xb000000000001031 | dar=0x7fff3e357200 | srr1=0x800000000004d033 | 
> last_inst=0x4c000024
>  qemu-system-ppc-10016 [004] .... 83794.482589: kvm_ppc_instr: inst 
> 1275068452 pc 0xc0000000005359f4
> emulate 0
>  qemu-system-ppc-10016 [004] .... 83794.482590: kvm_book3s_reenter: reentry 
> r=1 | pc=0xc0000000005359f4
>  qemu-system-ppc-10016 [004] .... 83794.482591: kvm_exit: exit=DATA_SEGMENT | 
> pc=0xc0000000005359f4
> | msr=0xb000000000009033 | dar=0xd0000800801ce800 | srr1=0x800000000000d033 | 
> last_inst=0x4c000024
>  qemu-system-ppc-10016 [004] .... 83794.482592: kvm_book3s_slb_found: 
> 18000033deaf8008 -> 61c1b7cf37
>  qemu-system-ppc-10016 [004] .... 83794.482592: kvm_book3s_slbmte: 
> 61c1b7cf37990, d000080088000003
>  qemu-system-ppc-10016 [004] .... 83794.482592: kvm_book3s_reenter: reentry 
> r=0 | pc=0xc0000000005359f4
>  qemu-system-ppc-10016 [004] .... 83794.482594: kvm_exit: exit=DATA_STORAGE | 
> pc=0xc0000000005359f4
> | msr=0xb000000000009033 | dar=0xd0000800801ce800 | srr1=0x800000000000d033 | 
> last_inst=0xf8c90000
>  qemu-system-ppc-10016 [004] .... 83794.482599: kvm_book3s_slb_found: 
> 18000033deaf8008 -> 61c1b7cf37
>  qemu-system-ppc-10016 [004] .... 83794.482602: kvm_book3s_reenter: reentry 
> r=2 | pc=0xc0000000005359f4
>  qemu-system-ppc-10016 [004] .... 83794.482602: kvm_userspace_exit: reason 
> KVM_EXIT_INTERNAL_ERROR (17)
>  qemu-system-ppc-10013 [005] .... 83794.495264: kvm_book3s_mmu_flush: Flush 
> 5468 pPTEs: 200080000000
> - 200081000000
>  qemu-system-ppc-10013 [001] .... 83800.464923: kvm_book3s_mmu_flush: Flush 
> 5468 PTEs: 0 - 0
>  qemu-system-ppc-10013 [001] .... 83800.464933: kvm_book3s_mmu_invalidate: 
> Flush: hva=61c0392d85f7f0
> pfn=75b7e ea=c00000002f7f0000 vp=ea1b217002f7f0 ra=2f7f0000 [6]
>  qemu-system-ppc-10013 [001] .... 83800.464938: kvm_book3s_mmu_invalidate: 
> Flush: hva=61c0392d85f7e0
> pfn=75b7d ea=c00000002f7e0000 vp=ea1b217002f7e0 ra=2f7e0000 [6]
> 
> - With only virtio-gpu and boot/messages on hvterm/VGA console:
> /usr/bin/qemu-system-ppc64 -serial mon:stdio -m 1024 -cpu host -device 
> virtio-scsi-pci,id=scsi0
> -device scsi-cd,drive=cd0,bus=scsi0.0 -drive
> media=cdrom,if=none,id=cd0,format=raw,file=/opt/openSUSE-Leap-42.3-NET-ppc64le-Build0130-Media.iso
> -smp 1,threads=1 -enable-kvm -device virtio-gpu -nodefaults
> 
> No problem, the kernel boots with the default console on hvterm and some
> messages on the VGA console. The kernel also boots if I set the console to
> VGA (console=tty on Linux).
> 
> - With only virtio-gpu-pci and boot/messages on hvterm:
> /usr/bin/qemu-system-ppc64 -serial mon:stdio -m 1024 -cpu host -device 
> virtio-scsi-pci,id=scsi0
> -device scsi-cd,drive=cd0,bus=scsi0.0 -drive
> media=cdrom,if=none,id=cd0,format=raw,file=/opt/openSUSE-Leap-42.3-NET-ppc64le-Build0130-Media.iso
> -smp 1,threads=1 -enable-kvm -device virtio-gpu-pci -nodefaults
> 
> Same as above. It seems that virtio-gpu and virtio-gpu-pci are the same.
> 
> QT_X11_NO_MITSHM=1 /usr/bin/qemu-system-ppc64 -serial mon:stdio -m 1024 -cpu 
> host -device
> virtio-scsi-pci,id=scsi0 -device scsi-cd,drive=cd0,bus=scsi0.0 -drive
> media=cdrom,if=none,id=cd0,format=raw,file=/opt/openSUSE-Leap-42.3-NET-ppc64le-Build0130-Media.iso
> -smp 1,threads=1 -enable-kvm -device nec-usb-xhci -device usb-tablet -device 
> usb-kbd -device
> virtio-vga -device virtio-gpu-pci -nodefaults
> 
> So, has anybody already seen this? Did I do something wrong? How can we
> debug this? (And sorry for this long email, I tried to include all the
> needed information!)
> 

-- 
Loic Devulder - Senior QA Engineer
Container & Storage Solutions Quality Assurance team (qa-css)
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nuernberg, Germany
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB, 21284 (AG 
Nuernberg)


