qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v12 3/4] i386: Enable TOPOEXT feature on AMD EPY


From: Eduardo Habkost
Subject: Re: [Qemu-devel] [PATCH v12 3/4] i386: Enable TOPOEXT feature on AMD EPYC CPU
Date: Fri, 8 Jun 2018 16:23:00 -0300
User-agent: Mutt/1.9.2 (2017-12-15)

On Fri, Jun 08, 2018 at 06:40:16PM +0000, Moger, Babu wrote:
> Hi Eduardo, 
> Sorry for the late response. Got pulled into something else.
> 
> > -----Original Message-----
> > From: Eduardo Habkost [mailto:address@hidden
> > Sent: Wednesday, June 6, 2018 5:40 PM
> > To: Moger, Babu <address@hidden>
> > Cc: address@hidden; address@hidden; address@hidden;
> > address@hidden; address@hidden; address@hidden;
> > address@hidden; address@hidden; address@hidden
> > Subject: Re: [PATCH v12 3/4] i386: Enable TOPOEXT feature on AMD EPYC
> > CPU
> > 
> > On Wed, Jun 06, 2018 at 10:36:45AM -0400, Babu Moger wrote:
> > > Enable TOPOEXT feature on EPYC CPU. This is required to support
> > > hyperthreading on VM guests. Also extend xlevel to 0x8000001E.
> > >
> > > Disable TOPOEXT feature for legacy machines.
> > >
> > > Signed-off-by: Babu Moger <address@hidden>
> > 
> > Now, I just noticed we have a problem here:
> > 
> > "-machine pc -cpu EPYC -smp 64" works today
> > 
> > This patch makes it stop working, but it shouldn't.
> 
> No. It works fine. I have tested it.

This doesn't sound right.  The code in this series will error out
of TOPOEXT is enabled and you have more than 64 VCPUs.

But I just noticed we have a bug introduced by:

commit f548222c24342ca74689de7794f9006b43f86a54
Author: Xiao Guangrong <address@hidden>
Date:   Thu May 3 16:06:11 2018 +0800

    migration: introduce decompress-error-check

    QEMU 3.0 enables strict check for compression & decompression to
    make the migration more robust, that depends on the source to fix
    the internal design which triggers the unexpected error conditions

    To make it work for migrating old version QEMU to 2.13 QEMU, we
    introduce this parameter to disable the error check on the
    destination which is the default behavior of the machine type
    which is older than 2.13, alternately, the strict check can be
    enabled explicitly as followings:
          -M pc-q35-2.11 -global migration.decompress-error-check=true

    Signed-off-by: Xiao Guangrong <address@hidden>
    Reviewed-by: Juan Quintela <address@hidden>
    Signed-off-by: Juan Quintela <address@hidden>

This commits added PC_COMPAT_2_12 to the 3.0 machine-types.
Because of this bug, TOPOEXT is being unconditionally disabled on
all machine-types, unless I apply the fix below:


diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 3d81136065..b4c5b03274 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -430,7 +430,6 @@ static void pc_i440fx_3_0_machine_options(MachineClass *m)
     pc_i440fx_machine_options(m);
     m->alias = "pc";
     m->is_default = 1;
-    SET_MACHINE_COMPAT(m, PC_COMPAT_2_12);
 }
 
 DEFINE_I440FX_MACHINE(v3_0, "pc-i440fx-3.0", NULL,
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index b60cbb9266..83d6d75efa 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -312,7 +312,6 @@ static void pc_q35_3_0_machine_options(MachineClass *m)
 {
     pc_q35_machine_options(m);
     m->alias = "q35";
-    SET_MACHINE_COMPAT(m, PC_COMPAT_2_12);
 }
 
 DEFINE_Q35_MACHINE(v3_0, "pc-q35-3.0", NULL,

> 
> > 
> > On the other hand, I believe you expect:
> > * "-machine pc -cpu EPYC -smp 8" to automatically enable topoext.
> Yes. Only on new machines-types
> > * "-machine pc -cpu Opteron_G1 -smp 8" to not enable topoext.
> Yes.
> > * What about "-machine -cpu Opteron_G1 -smp 8,threads=2"?
> No. This should not enable topoext.  Topoext is not supported by Opteron_G1.
> This should warn about hyperthreading and continue.

OK, makes sense to me.

> > 
> > 
> > We also have other requirements, I will try to enumerate all of
> > them below:
> > 
> > 0) "-topoext" explicitly configured (any machine-type):
> > * Must never enable topoext.
> Yes.
> > 
> > 1) "+topoext" explicitly configured (any machine-type):
> > * Must validate topology and refuse to start if unsupported.
> 
> Yes.
> 
> > 
> > 2) Older machine-types:
> > * Must never enable topoext automatically, even if using "EPYC"
> >   or "threads=2"
> > 
> Yes.
> 
> > 3) "EPYC" CPU model (on new machine-types):
> > * Should enable topoext automatically, but only if topology is
> >   supported.
> > * Must not error out if topology is not supported.
> In new machine types we will enable topoext for "EPYC" CPU model.
> Right now(old machine type) we can disable for all the CPU models. 
> So, we don't need two bits(topoext and auto-topoext)

Right, so you agree that in this case we must _not_ error out if
topology is unsupported, correct?  Otherwise we will break this
existing use case:
  "-machine pc -cpu EPYC -smp 64".

> 
> I thought we should error out if topology cannot be supported. But we can 
> warn(disable topoext) and continue that is another option.
> 
> > * Should this enable topoext automatically even if threads=1?
> 
> Yes. We should enable even with threads=1.
> 
> > 
> > 4) Other AMD CPU models with "threads=2" (on new machine-types):
> > * We might want to make this enable topoext automatically, too.
> >   What do you think?
> 
>   No. We should not enable topoext here. We should depend on CPU model table 
> here.
> 
> > 
> > Is the above description accurate?  Do you agree with these
> > requirements?
> 
> With these requirements in mind, I will send that patches. We can start our 
> discussion.
> We don't need one more bits. That is my opinion. 

Thanks for confirming the requirements above.

But it doesn't seem to be possible represent these requirements
with just one bit.  Otherwise you can't differentiate explicit
"+topoext" (1 above) from topoext being implicitly enabled by
"-cpu EPYC" (3 above).

Another problem is query-cpu-model-expansion QMP command: this
patch makes "topoext" appear on the output of
"query-cpu-model-expansion model=EPYC", meaning that management
software will assume everybody using the "EPYC" CPU model will
require +topoext.  A separate "auto-topoext" property would avoid
this issue.

(Yeah, this is tricky.  I want to eventually encode these subtle
rules in automated test cases, so these issues could be detected
by software instead of code inspection.)

-- 
Eduardo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]