Re: [Qemu-devel] [PATCH v2 0/6] x86: Physical address limit patches
From: Michael S. Tsirkin
Date: Tue, 5 Jul 2016 13:41:31 +0300

On Tue, Jul 05, 2016 at 11:13:26AM +0100, Dr. David Alan Gilbert wrote:
> * Michael S. Tsirkin (address@hidden) wrote:
> > On Tue, Jul 05, 2016 at 10:33:25AM +0100, Dr. David Alan Gilbert wrote:
> > > * Michael S. Tsirkin (address@hidden) wrote:
> > > > On Mon, Jul 04, 2016 at 08:16:03PM +0100, Dr. David Alan Gilbert (git) wrote:
> > > > > From: "Dr. David Alan Gilbert" <address@hidden>
> > > > >
> > > > > QEMU sets the guest's physical address bits to 40; this is wrong
> > > > > on most hardware, and can be detected by the guest.
> > > > > It also stops you using really huge multi-TB VMs.
> > > > >
> > > > > Red Hat has had a patch, written by Andrea, downstream for a couple
> > > > > of years that reads the host's value and uses that in the guest.
> > > > > That's correct as far as the guest sees it, and lets you create huge VMs.
> > > > >
> > > > > The downside is that if you've got a mix of hosts, say an i7 and a Xeon,
> > > > > life gets complicated in migration; prior to 2.6 it all apparently
> > > > > worked (although a guest that looked might spot the change).
> > > > > In 2.6 Paolo started checking MSR writes and they failed when the
> > > > > incoming MTRR mask didn't fit.
> > > > >
> > > > > This series:
> > > > > a) Fixes up mtrr masks so that if you're migrating between hosts
> > > > > of different physical address size it tries to do something
> > > > > sensible.
> > > > >
> > > > > b) Lets you specify the guest physical address size via a CPU
> > > > > property, i.e.
> > > > > -cpu SandyBridge,phys-bits=36
> > > > >
> > > > > The default on old machine types is to use the existing 40 bits
> > > > > value.
> > > > >
> > > > > c) Lets you tell qemu to use the same setting as the host, i.e.
> > > > > -cpu SandyBridge,phys-bits=0
> > > > >
> > > > > This is the default on new machine types.
> > > > >
> > > > > Note that mixed size hosts are still not necessarily safe; a guest
> > > > > started on a host with a large physical address size might start using
> > > > > those bits and get upset when it's moved to a small host.
> > > > > However that was already potentially broken in existing qemu that
> > > > > used a magic value of 40.
> > > > >
> > > > > There's potential to add some extra guards against people
> > > > > doing silly stuff; e.g. stop people running VMs using 1TB of
> > > > > address space on a tiny host.
> > > > >
> > > > > Dave
> > > >
> > > > This is all in target-i386 so if the maintainers want it this way, they
> > > > can merge this, and I do not have strong objections, but I wanted to
> > > > document an alternative that is IMHO somewhat nicer. Feel free to
> > > > ignore. See below.
> > > >
> > > > How can guest use more memory than what host supports?
> > > > I think there are two ways:
> > > >
> > > > 1. more memory than host supports is supplied
> > > > This is a configuration error. We can simply detect this
> > > > and fail init, or print a warning, no need for new flags.
> > >
> > > Yes we should do that; however there's a case that's potentially
> > > currently working for people but actually kind of illegal.
> > > That case is specifying a small amount of actual memory
> > > but a large maxmem - i.e.:
> > >
> > > -m 2G,slots=16,maxmem=2T
> > >
> > > On a host with a 39bit physaddress limit do you error
> > > on that or not? I think oVirt is currently doing something
> > > similar to that, but I'm trying to get confirmation.
> >
> > That would only be a problem since pci is allocated above
> > maxmem so 64 bit pci addresses aren't accessible.
> > With my proposal we can actually force firmware to avoid
> > using 64 bit memory for that config.
> > Will work better than today.
> >
> >
> > > > 2. pci addresses out of host range assigned by guest
> > > > Again normally at least seabios will not do this,
> > > > maybe OVMF will?
> > > > we certainly can add an interface telling firmware
> > > > what the limit is.
> > > >
> > > > Thus an alternative is:
> > > > - add interface to tell QEMU how much 64 bit memory can pci use.
> > > > - teach firmware to limit itself to that
> > > > - set guest bits to 48 unconditionally
> > > >
> > > >
> > > > the disadvantage of this approach is that firmware needs to be changed
> > >
> > > I guess it also needs the CRS to tell the guest OS not
> > > to remap PCI stuff into that space?
> >
> > CRS is a list of legal addresses, not list of illegal ones.
> > So just don't include what's illegal there.
> >
> > > I thought also from the previous
> > > discussions that the guest would get a different exception if it
> > > actually tried to use any of the bits below 48 it didn't have.
> >
> > Basically if you try to map pci at an address outside CRS
> > you can get any kind of crash since there could be on-board
> > hardware handling these addresses.
> > So I do not think we care about that.
>
> The issue about guest bits is not purely about PCI addresses though;
> I thought it was also to do with visible behaviour/exceptions in
> page tables.
Only if you make guest phys bits < host phys bits.
If guest phys bits >= host phys bits, then there's never
a configuration that, from the guest's POV, should trigger
an exception but does not.
This is another advantage of my proposal.
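
To make that argument concrete, here is a rough sketch in C. It is not
QEMU code; the helper names are made up for illustration, and it models
only the "is any address bit set at or above the limit" test, nothing
else about page table walks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if addr has a bit set at or above phys_bits.
 * phys_bits is assumed to be below 64 so the shift is well defined. */
static inline bool out_of_range(uint64_t addr, unsigned phys_bits)
{
    return (addr >> phys_bits) != 0;
}

/* "Missed" fault: going by the phys bits it was told, the guest expects
 * an exception, but the host hardware accepts the address. */
static inline bool missed_fault(uint64_t addr, unsigned guest_bits,
                                unsigned host_bits)
{
    return out_of_range(addr, guest_bits) && !out_of_range(addr, host_bits);
}

/* The opposite case: the hardware faults although the guest believes
 * the address is legal. Possible only when guest_bits > host_bits. */
static inline bool unexpected_fault(uint64_t addr, unsigned guest_bits,
                                    unsigned host_bits)
{
    return !out_of_range(addr, guest_bits) && out_of_range(addr, host_bits);
}
```

With guest bits 36 on a 40 bit host, an address with bit 38 set is a
missed fault. With guest bits 48 on the same host no address can be a
missed fault, since anything out of range for 48 bits is also out of
range for 40. What remains is the unexpected_fault() case, which is the
range the firmware would have to be told to stay out of.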
> > > > the advantage is that we get seamless migration between different
> > > > hosts as long as they both can support the configuration,
> > > > without any management effort.
> > >
> > > The reality (Linux guest) is that this already works as long as you don't
> > > map anything into the high address space, and the firmware won't do
> > > that unless it's pushed to by an excessive maxmem or huge
> > > 64bit PCI bars.
> > >
> > > Dave
> >
> > Right. So the disadvantage isn't big at all, and I think the advantages
> > outweigh it.
>
> Except that no one will ever get around to writing the firmware changes
> for both sets of firmware; so we never move forward?
>
> Dave
SeaBIOS is already OK, so only OVMF needs to be patched.
I agree that the need to change firmware is a disadvantage.
You guys decide, but I thought I'd put that on the table.
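
For case 1 (more memory configured than the host can address), the
check itself is simple arithmetic. A minimal sketch in C follows; the
function name is hypothetical and this is not the actual QEMU check. A
real check would also have to count the PCI hole and any 64 bit PCI
window placed above maxmem, so treat this as a lower bound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: does an address space that ends at 'top' bytes
 * (exclusive) fit within phys_bits address bits?
 * phys_bits is assumed to be in the range 1..63. */
static inline bool fits_phys_bits(uint64_t top, unsigned phys_bits)
{
    return top <= (1ULL << phys_bits);
}
```

Taking the -m 2G,slots=16,maxmem=2T example quoted above: 2T is 2^41
bytes, while a 39 bit host can address 2^39 bytes (512G), so
fits_phys_bits(1ULL << 41, 39) is false even though the 2G of memory
actually present fits easily.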
> >
> > > >
> > > > >
> > > > > v2
> > > > > Default on new machine types is to read from the host
> > > > > Use the MAKE_64BIT_MASK macro
> > > > > Validate phys_bits in the realise method
> > > > > Move reading the host physical bits to the realise method
> > > > > Set phys-bits even for 32bit guests
> > > > > Add warning when your phys-bits doesn't match your host in the
> > > > > non-default case
> > > > >
> > > > > Dr. David Alan Gilbert (6):
> > > > > x86: Allow physical address bits to be set
> > > > > x86: Mask mtrr mask based on CPU physical address limits
> > > > > x86: fill high bits of mtrr mask
> > > > > x86: Set physical address bits based on host
> > > > > x86: fix up 32 bit phys_bits case
> > > > > x86: Add sanity checks on phys_bits
> > > > >
> > > > > include/hw/i386/pc.h | 10 ++++++++
> > > > > target-i386/cpu.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++------
> > > > > target-i386/cpu.h | 6 +++++
> > > > > target-i386/kvm.c | 36 +++++++++++++++++++++++---
> > > > > 4 files changed, 112 insertions(+), 11 deletions(-)
> > > > >
> > > > > --
> > > > > 2.7.4
> > > --
> > > Dr. David Alan Gilbert / address@hidden / Manchester, UK
> --
> Dr. David Alan Gilbert / address@hidden / Manchester, UK