qemu-devel

Re: [Qemu-devel] [PATCH v2 0/6] x86: Physical address limit patches


From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] [PATCH v2 0/6] x86: Physical address limit patches
Date: Tue, 5 Jul 2016 13:41:31 +0300

On Tue, Jul 05, 2016 at 11:13:26AM +0100, Dr. David Alan Gilbert wrote:
> * Michael S. Tsirkin (address@hidden) wrote:
> > On Tue, Jul 05, 2016 at 10:33:25AM +0100, Dr. David Alan Gilbert wrote:
> > > * Michael S. Tsirkin (address@hidden) wrote:
> > > > On Mon, Jul 04, 2016 at 08:16:03PM +0100, Dr. David Alan Gilbert (git) wrote:
> > > > > From: "Dr. David Alan Gilbert" <address@hidden>
> > > > > 
> > > > > QEMU sets the guest's physical address bits to 40; this is wrong
> > > > > on most hardware, and can be detected by the guest.
> > > > > It also stops you using really huge multi-TB VMs.
> > > > > 
> > > > > Red Hat has had a patch, that Andrea wrote, downstream for a couple
> > > > > of years that reads the host's value and uses that in the guest.
> > > > > That's correct as far as the guest sees it, and lets you create huge VMs.
> > > > > 
> > > > > The downside is that if you've got a mix of hosts, say an i7 and a Xeon,
> > > > > life gets complicated in migration; prior to 2.6 it all apparently
> > > > > worked (although a guest that looked might spot the change).
> > > > > In 2.6 Paolo started checking MSR writes and they failed when the
> > > > > incoming MTRR mask didn't fit.
> > > > > 
> > > > > This series:
> > > > >    a) Fixes up mtrr masks so that if you're migrating between hosts
> > > > >       of different physical address size it tries to do something sensible.
> > > > > 
> > > > >    b) Lets you specify the guest physical address size via a CPU property, i.e.
> > > > >         -cpu SandyBridge,phys-bits=36
> > > > > 
> > > > >       The default on old machine types is to use the existing 40 bits value.
> > > > > 
> > > > >    c) Lets you tell qemu to use the same setting as the host, i.e.
> > > > >         -cpu SandyBridge,phys-bits=0
> > > > >  
> > > > >       This is the default on new machine types.
> > > > > 
> > > > > Note that mixed size hosts are still not necessarily safe; a guest
> > > > > started on a host with a large physical address size might start using
> > > > > those bits and get upset when it's moved to a small host.
> > > > > However that was already potentially broken in existing qemu that
> > > > > used a magic value of 40.
> > > > > 
> > > > > There's potential to add some extra guards against people
> > > > > doing silly stuff; e.g. stop people running VMs using 1TB of
> > > > > address space on a tiny host.
> > > > > 
> > > > > Dave
> > > > 
> > > > This is all in target-i386 so if the maintainers want it this way, they
> > > > can merge this, and I do not have strong objections, but I wanted to
> > > > document an alternative that is IMHO somewhat nicer. Feel free to
> > > > ignore.  See below.
> > > > 
> > > > How can guest use more memory than what host supports?
> > > > I think there are two ways:
> > > > 
> > > > 1. more memory than host supports is supplied
> > > >    This is a configuration error. We can simply detect this
> > > >    and fail init, or print a warning, no need for new flags.
> > > 
> > > Yes we should do that; however there's a case that's potentially
> > > currently working for people but actually kind of illegal.
> > > That case is specifying a small amount of actual memory
> > > but a large maxmem - i.e.:
> > > 
> > >      -m 2G,slots=16,maxmem=2T
> > > 
> > > On a host with a 39bit physaddress limit do you error
> > > on that or not?  I think oVirt is currently doing something
> > > similar to that, but I'm trying to get confirmation.
> > 
> > That would only be a problem since pci is allocated above
> > maxmem so 64 bit pci addresses aren't accessible.
> > With my proposal we can actually force firmware to avoid
> > using 64 bit memory for that config.
> > Will work better than today.
> > 
> > 
> > > > 2. pci addresses out of host range assigned by guest
> > > >    Again normally at least seabios will not do this,
> > > >    maybe OVMF will?
> > > >    we certainly can add an interface telling firmware
> > > >    what the limit is.
> > > > 
> > > > Thus an alternative is:
> > > > - add interface to tell QEMU how much 64 bit memory can pci use.
> > > > - teach firmware to limit itself to that
> > > > - set guest bits to 48 unconditionally
> > > > 
> > > > 
> > > > the disadvantage of this approach is that firmware needs to be changed
> > > 
> > > I guess it also needs the CRS to tell the guest OS not
> > > to remap PCI stuff into that space?
> > 
> > CRS is a list of legal addresses, not list of illegal ones.
> > So just don't include what's illegal there.
> > 
> > >  I thought also from the previous
> > > discussions that the guest would get a different exception if it
> > > actually tried to use any of the bits below 48 it didn't have.
> > 
> > Basically if you try to map pci at an address outside CRS
> > you can get any kind of crash since there could be on-board
> > hardware handling these addresses.
> > So I do not think we care about that.
> 
> The issue about guest bits is not purely about PCI addresses though;
> I thought it was also to do with visible behaviour/exceptions in
> page tables.

Only if you make guest phy bits < host phy bits.

If guest phy bits >= host phy bits, then there's never
a configuration that from guest POV should trigger
an exception but does not.

This is another advantage of my proposal.
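
To spell that argument out, here is a minimal sketch in Python. It assumes
a deliberately simplified model in which an address faults exactly when it
sets a bit at or above the configured physical address width; the helper
names and the 52-bit upper bound are illustrative assumptions, not QEMU code.

```python
def reserved_bits_set(addr: int, phys_bits: int) -> bool:
    """True if addr uses any bit at or above phys_bits."""
    return (addr >> phys_bits) != 0


def missed_faults(guest_bits: int, host_bits: int, max_bits: int = 52) -> list:
    """Single-bit addresses the guest expects to fault on but the host accepts.

    Simplified model: a fault is raised iff a bit >= phys_bits is set.
    """
    return [
        1 << bit
        for bit in range(max_bits)
        if reserved_bits_set(1 << bit, guest_bits)
        and not reserved_bits_set(1 << bit, host_bits)
    ]


if __name__ == "__main__":
    # Guest width >= host width: nothing the guest expects to fault is missed.
    print(missed_faults(guest_bits=48, host_bits=39))
    # Guest width < host width: bits 36..38 should fault for the guest but do not.
    print([hex(a) for a in missed_faults(guest_bits=36, host_bits=39)])
```

In this model the list is empty whenever guest_bits >= host_bits, which is
the property claimed above.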


> > > > the advantage is that we get seamless migration between different
> > > > hosts as long as they both can support the configuration,
> > > > without any management effort.
> > > 
> > > The reality (Linux guest) is that this already works as long as you don't
> > > map anything into the high address space, and the firmware won't do
> > > that unless it's pushed to by an excessive maxmem or huge
> > > 64bit PCI bars.
> > > 
> > > Dave
> > 
> > Right. So the disadvantage isn't big at all, and I think advantages
> > outweigh it.
> 
> Except that no one will ever get around to writing the firmware changes
> for both sets of firmware; so we never move forward?
> 
> Dave

seabios is already ok, so only ovmf needs to be patched.

I agree that needing to change firmware is a disadvantage.
You guys decide, but I thought I'd put that on the table.
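
For the configuration-error case discussed earlier in the thread
(-m 2G,slots=16,maxmem=2T on a host with a 39-bit limit), the check is
simple arithmetic. The sketch below is a rough Python illustration only:
parse_size and fits_in_phys_bits are hypothetical helpers, not QEMU's
option parser, and the model ignores everything placed above maxmem such
as the 64 bit pci window.

```python
# Binary size suffixes as shift amounts (hypothetical helper table).
UNITS = {"K": 10, "M": 20, "G": 30, "T": 40}


def parse_size(text: str) -> int:
    """Parse a size such as '2G' or '2T' into bytes; a bare number is bytes."""
    suffix = text[-1].upper()
    if suffix in UNITS:
        return int(text[:-1]) << UNITS[suffix]
    return int(text)


def fits_in_phys_bits(maxmem: str, phys_bits: int) -> bool:
    """True if maxmem bytes are addressable with phys_bits address bits."""
    return parse_size(maxmem) <= (1 << phys_bits)


if __name__ == "__main__":
    for bits in (39, 40, 46):
        print(bits, fits_in_phys_bits("2T", bits))
```

A 39-bit host can address 512G, so maxmem=2T does not fit even though the
2G of initial memory does.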


> > 
> > > > 
> > > > > 
> > > > > v2
> > > > >   Default on new machine types is to read from the host
> > > > >   Use the MAKE_64BIT_MASK macro
> > > > >   Validate phys_bits in the realise method
> > > > >   Move reading the host physical bits to the realise method
> > > > >   Set phys-bits even for 32bit guests
> > > > >   Add warning when your phys-bits doesn't match your host in the
> > > > >     non-default case
> > > > > 
> > > > > Dr. David Alan Gilbert (6):
> > > > >   x86: Allow physical address bits to be set
> > > > >   x86: Mask mtrr mask based on CPU physical address limits
> > > > >   x86: fill high bits of mtrr mask
> > > > >   x86: Set physical address bits based on host
> > > > >   x86: fix up 32 bit phys_bits case
> > > > >   x86: Add sanity checks on phys_bits
> > > > > 
> > > > >  include/hw/i386/pc.h | 10 ++++++++
> > > > >  target-i386/cpu.c    | 71 ++++++++++++++++++++++++++++++++++++++++++++++------
> > > > >  target-i386/cpu.h    |  6 +++++
> > > > >  target-i386/kvm.c    | 36 +++++++++++++++++++++++---
> > > > >  4 files changed, 112 insertions(+), 11 deletions(-)
> > > > > 
> > > > > -- 
> > > > > 2.7.4
> > > --
> > > Dr. David Alan Gilbert / address@hidden / Manchester, UK
> --
> Dr. David Alan Gilbert / address@hidden / Manchester, UK


