Re: [Qemu-devel] [RFCv2 00/12] Clean up compatibility mode handling


From: Greg Kurz
Subject: Re: [Qemu-devel] [RFCv2 00/12] Clean up compatibility mode handling
Date: Thu, 1 Dec 2016 14:16:15 +0100

On Wed, 16 Nov 2016 09:17:43 +1100
David Gibson <address@hidden> wrote:

> This series is a significant rework to how we handle CPU compatibility
> modes on ppc.
> 

David,

Please find below the results of the migration tests.

>  * Information about compatibility modes was previously open coded and
>    scattered across a number of functions in both target-ppc and spapr
>    code.  It's now brought together into a common table of
>    compatibility modes.
> 
>  * There was significant conceptual confusion about what a
>    compatibility mode means, and how it interacts with the machine
>    type.  This cleans that up, clarifying that a compatibility mode
>    (as an externally set option) only makes sense on machine types
>    that don't permit the guest hypervisor privilege (i.e. 'pseries')
> 
>  * It was previously the user's (or management layer's) responsibility
>    to determine compatibility of CPUs on either end for migration.
>    This uses the compatibility modes to check that properly during an
>    incoming migration.
> 
>  * Some ill-considered sanity checks broke migration from 2.6 to 2.7,
>    due to some new instruction classes being added.  This should avoid
>    a repeat of that problem for 2.8 (we may be able to backport a
>    minimal subset to 2.7-stable to fix the existing problem).
> 
> Patches 1-3 are preliminary cleanups which could stand on their own.
> Patches 4-12 are the compatibility mode cleanup proper.
> 
> So far, this has been minimally tested.  There are quite a few
> migration cases to check.  For example:
> 
> Basic:
> 
> 1) Boot guest with -cpu host
>       Should go into POWER8 compat mode after CAS
>       Previously would have been raw mode
> 
> 2) Boot guest with -machine pseries,max-cpu-compat=power7 -cpu host
>       Should go into POWER7 compat mode
> 
> 3) Boot guest with -cpu host,compat=power7
>       Should act as (2), but print a warning
> 
> 4) Boot guest via libvirt with power7 compat mode specified in XML
>       Should act as (3), (2) once we fix libvirt
> 
> 5) Hack guest to only advertise power7 compatibility, boot with -cpu host
>       Should go into POWER7 compat mode after CAS
> 
> 6) Hack guest to only advertise real PVRs
>       Should remain in POWER8 raw mode after CAS
> 
> 7) Hack guest to only advertise real PVRs
>    Boot with -machine pseries,max-cpu-compat=power8
>       Should fail at CAS time
> 
> 8) Hack guest to only advertise power7 compatibility, boot with -cpu host
>    Reboot to normal guest
>       Should go to power7 compat mode after CAS of boot 1
>       Should revert to raw mode on reboot
>       Should go to power8 compat mode after CAS of boot 2
> 
> Migration:
> 

The QEMU command line used to test migration is as follows:

ppc64-softmmu/qemu-system-ppc64 \
        -snapshot \
        -nodefaults \
        -no-shutdown \
        -nographic \
        -device virtio-blk-pci,drive=drive0 \
        -drive file=/home/greg/images/fedora24-ppc64.qcow2,id=drive0,if=none \
        -global virtio-blk-pci.disable-legacy=off \
        -global virtio-blk-pci.disable-modern=on \
        -device virtio-net,netdev=netdev0,mac=C0:FF:EE:00:00:66,id=net0 \
        -netdev tap,id=netdev0,vhost=off,helper=/usr/libexec/qemu-bridge-helper \
        -global virtio-net-pci.disable-legacy=off \
        -global virtio-net-pci.disable-modern=on \
        -m 4G \
        -serial mon:stdio \
        -trace spapr_cas_pvr

Note that the virtio devices are explicitly configured to run in legacy
mode because I couldn't pass tests 15 and 16 otherwise: I hit various
issues, including QEMU getting killed by the OOM killer! I'll look into
these issues separately.

> 9) Boot guest with qemu-2.6 -machine pseries-2.6 -cpu host
>    Migrate to qemu-2.8 -machine pseries-2.6 -cpu host
>       Should work, end up running in power8 raw mode
> 

== QEMU-2.6 ==

spapr_cas_pvr current=0, cpu_match=1, new=0, compat flags=6

== guest (source) ==

cpu             : POWER8 (raw), altivec supported

== guest (target) ==

cpu             : POWER8 (raw), altivec supported

> 10) Boot guest with qemu-2.7 -machine pseries-2.7 -cpu host
>     Migrate to qemu-2.8 -machine pseries-2.7 -cpu host
>       Should work, end up running in power8 raw mode
> 

== QEMU-2.7 ==

spapr_cas_pvr current=0, cpu_match=1, new=0, compat flags=2000000000000006

== guest (source) ==

cpu             : POWER8 (raw), altivec supported

== guest (target) ==

cpu             : POWER8 (raw), altivec supported

> 11) Boot guest with qemu-2.7 -machine pseries-2.7 -cpu host,compat=power7
>     Migrate to qemu-2.8 -machine pseries-2.7 -cpu host,compat=power7
>       Should work, be running in POWER7 compat after, but give warning like
>       (3)
> 

== QEMU-2.7 ==

spapr_cas_pvr current=f000003, cpu_match=1, new=f000003, compat flags=2000000000000006

== guest (source) ==

cpu             : POWER7 (architected), altivec supported

== QEMU-2.8 ==

CPU 'compat' property is deprecated and has no effect; use max-cpu-compat machine property instead

Migration completes but the guest gets a program interrupt:

(qemu) info registers 
NIP 0000000000000700   LR c0000000008309ac CTR c000000000830b40 XER 0000000020000000 CPU#0
MSR 8000000000001000 HID0 0000000000000000  HF 8000000000000000 iidx 3 didx 3
TB 00000000 00000000 DECR 00000000
GPR00 0000000000000000 0000000000000000 0000000000000000 000000007fef0000
GPR04 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR08 0000000000000000 0000000020000000 6000000060000000 6000000060006180
GPR12 c000000000081000 0000000000000000 0000000000000000 0000000000000000
GPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR28 0000000000000000 0000000000000000 0000000000000000 0000000000000000
CR 20000000  [ E  -  -  -  -  -  -  -  ]             RES ffffffffffffffff
FPR00 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR04 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR08 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR12 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR20 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR24 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR28 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPSCR 0000000000000000
 SRR0 6000000060006180  SRR1 c000000000081000    PVR 00000000004d0200 VRSAVE 0000000000000000
SPRG0 0000000000000000 SPRG1 0000000000000000  SPRG2 0000000000000000  SPRG3 0000000000000000
SPRG4 0000000000000000 SPRG5 0000000000000000  SPRG6 0000000000000000  SPRG7 0000000000000000
HSRR0 0000000000000000 HSRR1 0000000000000000
 CFAR 0000000000000000
 SDR1 0000000000000007   DAR 0000000000000000  DSISR 0000000000000000

Same happens with a pseries-2.6 machine.

Would you have any suggestions on how to debug this? The values in SRR0 and
SRR1 look weird compared to what is described in the ISA...
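
In case it helps when comparing dumps across runs, here is a rough Python
sketch that pulls a few registers out of the monitor's 'info registers'
output and checks whether NIP sits at the program interrupt vector (0x700).
The function names are made up for illustration and the regular expression
only assumes the "NAME value" layout seen in the dump above; it is not
part of QEMU.

```python
import re

# Offset of the program interrupt vector, matching the NIP seen above.
PROGRAM_INTERRUPT_VECTOR = 0x700

# Hypothetical helper: match "NAME <hex>" pairs for a few registers.
# \b keeps SRR0/SRR1 from matching inside HSRR0/HSRR1, and \s+ lets a
# value sit on the next line when the mail client wrapped the dump.
_REG = re.compile(r"\b(NIP|LR|CTR|MSR|SRR0|SRR1|PVR)\s+([0-9a-fA-F]{8,16})\b")


def parse_registers(dump):
    """Return {register name: integer value} for the registers above."""
    return {name: int(value, 16) for name, value in _REG.findall(dump)}


def stopped_at_program_interrupt(regs):
    """True if NIP equals the program interrupt vector offset."""
    return regs.get("NIP") == PROGRAM_INTERRUPT_VECTOR


if __name__ == "__main__":
    sample = (
        "NIP 0000000000000700   LR c0000000008309ac CTR c000000000830b40\n"
        " SRR0 6000000060006180  SRR1 c000000000081000\n"
        "HSRR0 0000000000000000 HSRR1 0000000000000000\n"
    )
    regs = parse_registers(sample)
    print(hex(regs["SRR0"]), hex(regs["SRR1"]),
          stopped_at_program_interrupt(regs))
```

Feeding it the dump from a pseries-2.6 run as well would make it easy to
confirm that SRR0/SRR1 hold the same values in both cases.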

> 12) Boot guest with qemu-2.7 -machine pseries-2.7 -cpu host,compat=power7
>     Migrate to qemu-2.8 -machine pseries-2.7,max-cpu-compat=power7 -cpu host
>       Should work, be running in POWER7 compat after, no warning
> 

Same as 11), minus the CPU 'compat' warning, for both pseries-2.6 and
pseries-2.7.

It seems to be related to the compat mode itself as I also hit the error when
running with qemu-2.8 -machine pseries-2.8,max-cpu-compat=power8 on a POWER8
host.

> 13) Boot to SLOF with qemu-2.6 -machine pseries-2.6 -cpu host
>     Migrate to qemu-2.8 -machine pseries-2.6 -cpu host
>       ?
> 

Migration succeeds, and typing 'boot' at the SLOF prompt then boots the
system:

== QEMU-2.8 ==

spapr_cas_pvr current=0, explicit_match=1, new=f000004

== guest (target) ==

cpu             : POWER8 (architected), altivec supported
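
To line up the trace output with what the guest reports, a small Python
sketch along these lines can split a spapr_cas_pvr trace line into fields.
It assumes all values are printed in hex, and the mode table only contains
the three pairings actually observed in this report (trace "new=" value
versus the guest's cpu line); the helper names are hypothetical.

```python
# Pairings observed in this report only; not a complete list of modes.
OBSERVED_MODES = {
    0x0: "raw (no compat mode)",
    0x0F000003: "POWER7 (architected)",
    0x0F000004: "POWER8 (architected)",
}


def parse_cas_trace(line):
    """Split a 'spapr_cas_pvr key=value, ...' line into integer fields."""
    # Collapse whitespace so a line wrapped by the mail client still parses.
    name, _, rest = " ".join(line.split()).partition(" ")
    if name != "spapr_cas_pvr":
        raise ValueError("not a spapr_cas_pvr trace line")
    fields = {}
    for item in rest.split(","):
        key, _, value = item.strip().partition("=")
        fields[key] = int(value, 16)  # assumption: values are hex
    return fields


def describe_mode(pvr):
    """Name the mode for a PVR value, if it was seen in this report."""
    return OBSERVED_MODES.get(pvr, "unknown ({:#x})".format(pvr))


if __name__ == "__main__":
    trace = "spapr_cas_pvr current=0, explicit_match=1, new=f000004"
    fields = parse_cas_trace(trace)
    print(describe_mode(fields["current"]), "->", describe_mode(fields["new"]))
```

Both the 2.6/2.7 format (cpu_match, compat flags) and the 2.8 format
(explicit_match) go through the same parser since it does not hard-code
the field names.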

> 14) Boot to SLOF with qemu-2.7 -machine pseries-2.7 -cpu host
>     Migrate to qemu-2.8 -machine pseries-2.7 -cpu host
>       ?
> 

Same as 13)

> 15) Boot to SLOF with qemu-2.7 -machine pseries-2.7 -cpu host,compat=power7
>     Migrate to qemu-2.8 -machine pseries-2.7 -cpu host,compat=power7
>       ?
> 

== QEMU-2.8 ==

CPU 'compat' property is deprecated and has no effect; use max-cpu-compat machine property instead

Migration succeeds, but this time SLOF then boots the system automatically:

spapr_cas_pvr current=f000003, explicit_match=1, new=f000003

== guest (target) ==

cpu             : POWER7 (architected), altivec supported

The same happens with pseries-2.6.

> 16) Boot to SLOF with qemu-2.7 -machine pseries-2.7 -cpu host,compat=power7
>     Migrate to qemu-2.8 -machine pseries-2.7,max-cpu-compat=power7 -cpu host
>       ?
> 

Same as 15), minus the CPU 'compat' warning, for both pseries-2.6 and
pseries-2.7.

> 17) Boot guest with qemu-2.6 -machine pseries-2.6 -cpu host
>     Migrate to qemu-2.7.z -machine pseries-2.6 -cpu host
>       Should work
> 

It doesn't. Migration fails on destination:

error while loading state for instance 0x0 of device 'cpu'
load of migration failed: Invalid argument

> 18) Hack guest to only advertise power7 compatibility, boot with -cpu host
>     Boot with qemu-2.8, migrate to qemu-2.8
>       Should be in power7 compat mode after CAS on source, and still
>       in power7 compat mode on destination
> 

Same failure as 11)

Cheers.

--
Greg

> Changes since RFCv1:
>   * Change CAS logic to prefer compatibility modes over raw mode
>   * Simplified by giving up on half-hearted attempts to maintain
>     backwards migration
>   * Folded migration stream changes into a single patch
>   * Removed some preliminary patches which are already merged
> 
> David Gibson (12):
>   pseries: Always use core objects for CPU construction
>   pseries: Make cpu_update during CAS unconditional
>   ppc: Clean up and QOMify hypercall emulation
>   ppc: Rename cpu_version to compat_pvr
>   ppc: Rewrite ppc_set_compat()
>   ppc: Rewrite ppc_get_compat_smt_threads()
>   ppc: Validate compatibility modes when setting
>   pseries: Rewrite CAS PVR compatibility logic
>   ppc: Add ppc_set_compat_all()
>   pseries: Move CPU compatibility property to machine
>   pseries: Reset CPU compatibility mode
>   ppc: Rework CPU compatibility testing across migration
> 
>  hw/ppc/spapr.c              | 158 ++++++++++++++++------------
>  hw/ppc/spapr_cpu_core.c     |  85 ++++++++++-----
>  hw/ppc/spapr_hcall.c        | 140 +++++++------------------
>  hw/ppc/trace-events         |   2 +-
>  include/hw/ppc/spapr.h      |  12 ++-
>  target-ppc/Makefile.objs    |   2 +-
>  target-ppc/compat.c         | 249 ++++++++++++++++++++++++++++++++++++++++++++
>  target-ppc/cpu.h            |  49 +++++++--
>  target-ppc/excp_helper.c    |  11 +-
>  target-ppc/kvm.c            |   4 +-
>  target-ppc/kvm_ppc.h        |   4 +-
>  target-ppc/machine.c        |  87 ++++++++++++++--
>  target-ppc/translate_init.c | 157 +++++++---------------------
>  13 files changed, 607 insertions(+), 353 deletions(-)
>  create mode 100644 target-ppc/compat.c
> 
