qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 4/5] x86: Allow physical address bits to be set


From: Eduardo Habkost
Subject: Re: [Qemu-devel] [PATCH 4/5] x86: Allow physical address bits to be set
Date: Fri, 17 Jun 2016 10:18:15 -0300
User-agent: Mutt/1.6.1 (2016-04-27)

On Fri, Jun 17, 2016 at 09:15:06AM +0100, Dr. David Alan Gilbert wrote:
> * Eduardo Habkost (address@hidden) wrote:
> > On Thu, Jun 16, 2016 at 06:12:12PM +0100, Dr. David Alan Gilbert (git) 
> > wrote:
> > > From: "Dr. David Alan Gilbert" <address@hidden>
> > > 
> > > Currently QEMU sets the x86 number of physical address bits to the
> > > magic number 40.  This is only correct on some small AMD systems;
> > > Intel systems tend to have 36, 39, 46 bits, and large AMD systems
> > > tend to have 48.
> > > 
> > > Having the value different from your actual hardware is detectable
> > > by the guest and in principal can cause problems;
> > 
> > What kind of problems?
> > 
> > Is it a problem to have something smaller from the actual
> > hardware, or just if it's higher?
> 
> I'm a bit vague on the failure cases; but my understanding of the two
> cases are;
> 
> Larger is a problem if the guest tries to map something to a high
> address that's not addressable.
> 
> Smaller is potentially a problem if the guest plays tricks with
> what it thinks are spare bits in page tables but which are actually
> interpreted.   I believe KVM plays a trick like this.

If both smaller and larger are a problem, we have a much bigger
problem than we thought. We need to confirm this.

So, what happens if the guest play tricks in bits 40-45 when QEMU
sets the limit to 40 but we are running in a 46-bit host? Is it
really a problem? I assumed it would be safe.

> 
> > > The current limit of 40 stops TB VMs being created by those lucky
> > > enough to have that much.
> > > 
> > > This patch lets you set the physical bits by a cpu property but
> > > defaults to the same existing magic 40.
> > > 
> > 
> > The existing 40-bit default looks like a problem for 36-bit
> > systems. Do you know what kind of systems have 36 bits only? Only
> > old ones, or recent ones too? If only old ones, how old?
> 
> My Sandy Bridge (~2.5 year old) laptop is 36 bits; I've seen
> some other single-socket 39bit machines (Ivy bridge and I think newer).
> 
> > Can't we have a new default that is as small as possible for the
> > VM RAM+devices configuration?
> 
> That would be good to have, however:
>    1) I didn't want to change the default upstream behaviour at first,
>       this time through I just wanted a way to set it.

Agreed. We can change the default after we introduce the
mechanisms to handle the existing defaults properly.

>    2) While we have maxmem settings to tell us the top of VM RAM, do
>       we have anything that tells us the top of IO space? What happens
>       when we hotplug a PCI card?

(CCing Marcel and Michael, as we were discussing this recently.)

That's a good question. When calculating how many bits the
machine requires, machine code could choose to reserve a
reasonable amount of space for hotplug by default.

Whatever we choose as the default, in some corner cases (e.g.
almost-32GB VMs running in a 39-bit host) we will still need to
let the user choose between having extra space for hotplug and
being able to safely migrate to 36-bit hosts.

This can be solved by setting x86_64-cpu.phys-bits, but isn't it
too low level? We could have a higher level mechanism in the PC
machine class (e.g. a min-extra-phys-addr-space-for-hotplug
property).


>    3) Is it better to stick to sizes that correspond to real hardware
>       if you can?  For example I don't know of any machines with 37 bits
>       - in practice I think it's best to stick with sizes that correspond
>       to some real hardware.

Yeah, "as small as possible" could be actually "the smallest
possible value from a set of known-to-exist values". e.g. if we
find out that we need 37 bits, it's probably better to simply use
39 bits.

Choosing from a smaller set of values also makes corner cases
(like the example above) less likely to happen.

> 
> > > I've removed the ancient warning about the 42 bit limit in exec.c;
> > > I can't find that limit in there and no one else seems to know where
> > > it is.
> > > 
> > > Signed-off-by: Dr. David Alan Gilbert <address@hidden>
> > > ---
> > >  target-i386/cpu.c | 8 +++++---
> > >  target-i386/cpu.h | 3 +++
> > >  2 files changed, 8 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> > > index 4111240..c3bbf8e 100644
> > > --- a/target-i386/cpu.c
> > > +++ b/target-i386/cpu.c
> > > @@ -2606,9 +2606,10 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t 
> > > index, uint32_t count,
> > >          /* virtual & phys address size in low 2 bytes. */
> > >  /* XXX: This value must match the one used in the MMU code. */
> > 
> > Do you know where's the MMU code mentioned here?
> 
> No!
> 
> > >          if (env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) {
> > > -            /* 64 bit processor */
> > > -/* XXX: The physical address space is limited to 42 bits in exec.c. */
> > > -            *eax = 0x00003028; /* 48 bits virtual, 40 bits physical */
> > > +            /* 64 bit processor, 48 bits virtual, configurable
> > > +             * physical bits.
> > > +             */
> > > +            *eax = 0x00003000 + cpu->phys_bits;
> > 
> > We should reject the configuration if phys-bits is set to
> > something larger than the host's phys_bits, shouldn't we?
> > 
> > Maybe we can't do that on old machine-types that already have the
> > 40-bit default, but if we have a new reasonable default based on
> > VM size, we can be more strict.
> 
> See notes above; but my worry with rejecting stuff is I don't
> know how many thousands of machines are already relying on it
> in farms with a mix of single and multi socket hosts.

That's true. Maybe we should just print a warning instead.

-- 
Eduardo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]