From: Andrea Arcangeli
Subject: Re: [Qemu-devel] Default for phys-addr-bits? (was Re: [PATCH 4/5] x86: Allow physical address bits to be set)
Date: Wed, 22 Jun 2016 16:24:14 +0200
User-agent: Mutt/1.6.1 (2016-04-27)

Hello,

On Wed, Jun 22, 2016 at 02:41:22PM +0200, Paolo Bonzini wrote:
> From a semantics point of view, using a smaller phys-addr-bits than the
> host is the worst, because you tell the guest that some bits are
> must-be-zero, when they're not.  Using a larger phys-addr-bits cannot

Ok, so EPT/KVM should always use the host phys bits, never the guest
ones, for EPT violations. KVM runs in the host, so that's not a
concern; EPT itself is irrelevant here.

The only place it's relevant is the guest pagetables with EPT, and I
don't think any sane OS can break there. It would also be inefficient
to burn a cacheline checking the phys bits at runtime before setting
up pagetables.
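(For reference, a minimal sketch of what such a runtime check would
look like, via CPUID leaf 0x80000008 EAX[7:0]; the helper name is
made up, this is not code from any OS:)

#include <stdint.h>
#include <cpuid.h>    /* GCC/clang __get_cpuid() */

/* Hypothetical helper: query MAXPHYADDR at runtime via CPUID leaf
 * 0x80000008, EAX bits 7:0.  The point above is that no sane OS
 * bothers doing this before every pagetable setup. */
static unsigned int runtime_phys_bits(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (__get_cpuid(0x80000008, &eax, &ebx, &ecx, &edx)) {
        return eax & 0xff;    /* physical address bits */
    }
    return 36;                /* legacy fallback if the leaf is missing */
}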

Even if the MTRR code doesn't set the "valid" phys bits to 1 (because
the guest phys bits are reduced), it should still be safe.
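(To illustrate the dependency, a sketch of how a variable-range MTRR
mask is derived from the phys bits; this is generic SDM behavior, not
the actual fix referred to above:)

#include <stdint.h>

#define MTRR_PHYSMASK_V (1ULL << 11)  /* "valid" bit of IA32_MTRR_PHYSMASKn */

/* The mask field of IA32_MTRR_PHYSMASKn spans bits MAXPHYADDR-1:12,
 * so a reduced guest MAXPHYADDR leaves the upper mask bits zero.
 * That only affects addresses the guest can never generate anyway,
 * hence "still safe".  'size' must be a power of two, phys_bits <= 52. */
static uint64_t mtrr_physmask(uint64_t size, unsigned int phys_bits)
{
    uint64_t phys_mask = ((1ULL << phys_bits) - 1) & ~0xfffULL;

    return (~(size - 1) & phys_mask) | MTRR_PHYSMASK_V;
}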

> cause malfunctioning, only crashes (and as Gerd said, if you cross your
> fingers and hope the guest doesn't put anything so high in memory,
> chances are you'll succeed), and this makes it "safer".  I'm not sure
> which one is more likely to happen.

But the crash with guest phys bits > host phys bits is material:
Linux will definitely crash in that condition.

Linux cannot possibly crash in the opposite case, host phys bits >
guest phys bits, because it never depends on a fault being raised
when the must-be-zero bits of the guest pagetables are set. Linux
won't ever try to set those bits, and I'd be shocked if any other OS
does.
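(A sketch of which bits those are, assuming standard 4-level paging;
the function is illustrative only:)

#include <stdint.h>

/* In an x86-64 PTE the physical address field spans bits 51:12;
 * within it, bits 51:MAXPHYADDR are reserved, must-be-zero.  Linux
 * only ever stores real page frame numbers there, so when host phys
 * bits > guest phys bits the extra bits simply stay zero. */
static uint64_t pte_reserved_addr_mask(unsigned int guest_phys_bits)
{
    uint64_t addr_field = ((1ULL << 52) - 1) & ~0xfffULL; /* bits 51:12 */
    uint64_t usable     = ((1ULL << guest_phys_bits) - 1) & ~0xfffULL;

    return addr_field & ~usable;    /* bits 51:guest_phys_bits */
}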

So while this is not a perfect emulation of the hardware, the risk
with known OSes should be zero.

> So there's no correct answer, and that's why I think the lesser evil is
> to go with the time-tested alternative and use host phys-addr-bits as
> the default, even if it causes weird behavior on migration.  If a fixed
> phys-addr-bits is specified on the destination, it should match the
> value that was used on the source though.

I agree we should start with the host phys bits, like we use in
production (plus the MTRR fix).

It is a net improvement compared to upstream because it restricts the
risk to live migration only and is otherwise always perfectly
safe. Upstream is never safe on any host with phys bits != 40,
especially if the host phys bits are < 40.

The main benefit of my solution is that it avoids having to compute
the highest possible RAM/PCI BAR guest physical address that could
ever be mapped in order to generate a "soft" guest phys bits value.

Later we could still consider introducing a "soft" guest phys bits
with the sole objective of preventing the risk of migration breakages.

QEMU shouldn't let the guest migrate if the destination host phys
bits turn out to be < the "soft" guest phys bits.
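(Something like this hypothetical destination-side check; the names
are made up, there is no such QEMU interface today:)

#include <stdbool.h>
#include <stdio.h>

/* Refuse the incoming migration when the destination host cannot
 * address everything the "soft" guest phys bits allow. */
static bool migration_phys_bits_compatible(unsigned int dest_host_phys_bits,
                                           unsigned int soft_guest_phys_bits)
{
    if (dest_host_phys_bits < soft_guest_phys_bits) {
        fprintf(stderr,
                "migration refused: host phys bits %u < guest soft phys bits %u\n",
                dest_host_phys_bits, soft_guest_phys_bits);
        return false;
    }
    return true;
}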

Then a command line quirk, -cpu=force_host_phys_bits, would set the
"soft" guest phys bits to the host value and prevent live migration
to any destination with host phys bits != the "soft" guest phys
bits. It should be used only for such a hypothetical OS that depends
on must-be-zero bit violations in the guest pagetables.

Whether this is a good idea as a second step boils down to how
difficult it is to calculate the highest possible guest physical
address at boot time. If that's impossible with PCI hotplug (memory
hotplug shouldn't be an issue), then the "soft" guest phys bits also
become mostly worthless (unless we require -cpu=force_host_phys_bits
for PCI hotplug to work).
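(Assuming the maximum guest physical address *were* computable at
boot, deriving the "soft" value would be trivial; the 36-bit floor
below is my assumption, not something from this thread:)

#include <stdint.h>

/* max_gpa: highest guest physical address ever mapped (end of RAM or
 * topmost PCI BAR).  Knowing it up front is exactly the hard part
 * with PCI hotplug. */
static unsigned int soft_guest_phys_bits(uint64_t max_gpa)
{
    unsigned int bits = 32;

    while (bits < 52 && (max_gpa >> bits) != 0) {
        bits++;                      /* need one more bit to address max_gpa */
    }
    return bits < 36 ? 36 : bits;    /* assumed minimum floor */
}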

Again, starting with host -> guest phys bits sounds fine to me. At
least everything will work perfectly in all cases except live
migration, and you should know what you're doing with live migration
anyway if you have very diverse host phys bits across the cloud nodes
together with very large guests, or guests running a weird OS that
depends on must-be-zero bit violations in guest pagetables.

Thanks,
Andrea


