From: Eduardo Habkost
Subject: [Qemu-devel] RAM backend and guest ABI (was Re: [PATCH v2] pc: memhp: enforce minimal 128Mb alignment for pc-dimm)
Date: Thu, 29 Oct 2015 16:16:57 -0200
User-agent: Mutt/1.5.23 (2014-03-12)

(CCing Michal and libvir-list, so libvirt team is aware of this
restriction)

On Thu, Oct 29, 2015 at 02:36:37PM +0100, Igor Mammedov wrote:
> On Tue, 27 Oct 2015 14:36:35 -0200
> Eduardo Habkost <address@hidden> wrote:
> 
> > On Tue, Oct 27, 2015 at 10:14:56AM +0100, Igor Mammedov wrote:
> > > On Tue, 27 Oct 2015 10:53:08 +0200
> > > "Michael S. Tsirkin" <address@hidden> wrote:
> > > 
> > > > On Tue, Oct 27, 2015 at 09:48:37AM +0100, Igor Mammedov wrote:
> > > > > On Tue, 27 Oct 2015 10:31:21 +0200
> > > > > "Michael S. Tsirkin" <address@hidden> wrote:
> > > > > 
> > > > > > On Mon, Oct 26, 2015 at 02:24:32PM +0100, Igor Mammedov wrote:
> > > > > > > Yep, it's a workaround, but it works around QEMU's broken
> > > > > > > virtio implementation in a simple way, without the need for
> > > > > > > guest-side changes.
> > > > > > > 
> > > > > > > Without a foreseeable virtio fix, memory hotplug is
> > > > > > > unusable; and even if there were a virtio fix, it wouldn't
> > > > > > > help old guests, since you've said that a virtio fix would
> > > > > > > require changes on both the QEMU and guest sides.
> > > > > > 
> > > > > > What makes it not foreseeable?
> > > > > > Apparently only the fact that we have a work-around in place,
> > > > > > so no one works on it.  I can code it up pretty quickly, but
> > > > > > I'm flat out of time for testing as I'm going on vacation
> > > > > > soon, and hard freeze is pretty close.
> > > > > I can lend a hand with the testing part.
> > > > > 
> > > > > > 
> > > > > > GPA space is kind of cheap, but wasting it in chunks of 512M
> > > > > > seems way too aggressive.
> > > > > The hotplug region is sized with a 1Gb alignment reserve per
> > > > > DIMM, so we aren't actually wasting anything here.
> > > > >
> > > > 
> > > > If I allocate two 1G DIMMs, what will be the gap size? 512M? 1G?
> > > > It's too much either way.
> > > The minimum would be 512M, and if the backend uses 1Gb hugepages,
> > > the gap will be the backend's natural alignment (i.e. 1Gb).
> > 
> > Is backend configuration even allowed to affect the machine ABI? We need
> > to be able to change backend configuration when migrating the VM to
> > another host.
> For now, one has to use the same type of backend on both sides,
> i.e. if the source uses a 1Gb-hugepage backend then the target also
> needs to use one.
> 

The page size of the backend doesn't even depend on QEMU arguments, but on
the kernel command-line or hugetlbfs mount options. So it's possible to
have exactly the same QEMU command-line on source and destination (with
an explicit versioned machine-type), and get a VM that can't be
migrated? That means we are breaking our guarantees about migration and
guest ABI.
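
To make that concrete, here's a minimal sketch (not QEMU's actual code;
the helper name is mine) of why the page size is invisible on the
command line: for a file-backed region the effective page size is a
property of the filesystem the file lives on, and hugetlbfs reports its
huge page size in statfs() f_bsize. So identical memory-backend-file
arguments on source and destination can still resolve to different page
sizes if the mounts differ:

    #include <stdio.h>
    #include <sys/vfs.h>

    /* Page size backing 'path': 2M or 1G for a hugetlbfs mount
     * (f_bsize is the huge page size there), 4K for a regular fs. */
    static long backend_page_size(const char *path)
    {
        struct statfs fs;

        if (statfs(path, &fs) != 0) {
            return -1;
        }
        return fs.f_bsize;
    }

    int main(int argc, char **argv)
    {
        const char *path = argc > 1 ? argv[1] : "/dev/hugepages";

        printf("%s: %ld\n", path, backend_page_size(path));
        return 0;
    }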


> We could change this for the next machine type to always force
> max alignment (1Gb); then it would be possible to change
> between backends with different alignments.

I'm not sure what the best solution is here. If always using 1GB is too
aggressive, we could require management to ask for an explicit alignment
as a -machine option if they know they will need a specific backend page
size.
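
For reference, my (possibly wrong) reading of the current rule from the
discussion above, as a sketch rather than the actual pc-dimm allocation
code (the constant and helper names are made up):

    #include <stdint.h>

    #define MIN_DIMM_GAP  (512ULL << 20)   /* 512M minimum gap */

    /* Pick the next DIMM's GPA: round up past the previous DIMM to the
     * larger of the minimum gap and the backend's natural alignment
     * (4K anonymous, 2M/1G hugetlbfs), so a 1Gb-hugepage backend ends
     * up forcing 1Gb steps in the hotplug region. */
    static uint64_t next_dimm_addr(uint64_t prev_end, uint64_t backend_align)
    {
        uint64_t align = backend_align > MIN_DIMM_GAP ? backend_align
                                                      : MIN_DIMM_GAP;

        return (prev_end + align - 1) & ~(align - 1);
    }

An explicit -machine alignment option would effectively let management
pin 'backend_align' ahead of time, instead of deriving it from whatever
backend happens to be plugged in.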

BTW, are you talking about the behavior introduced by
aa8580cddf011e8cedcf87f7a0fdea7549fc4704 ("pc: memhp: force gaps between
DIMM's GPA") only, or the backend page size was already affecting GPA
allocation before that commit?

-- 
Eduardo


