Re: [Qemu-devel] [PATCH 2/2] pc: memhp: force gaps between DIMM's GPA


From: Bharata B Rao
Subject: Re: [Qemu-devel] [PATCH 2/2] pc: memhp: force gaps between DIMM's GPA
Date: Mon, 5 Oct 2015 14:14:46 +0530
User-agent: Mutt/1.5.23 (2014-03-12)

On Mon, Sep 28, 2015 at 11:13:42AM +0200, Igor Mammedov wrote:
> On Mon, 28 Sep 2015 10:09:26 +0530
> Bharata B Rao <address@hidden> wrote:
> 
> > On Sun, Sep 27, 2015 at 04:04:06PM +0200, Igor Mammedov wrote:
> > > On Sun, 27 Sep 2015 16:11:02 +0300
> > > "Michael S. Tsirkin" <address@hidden> wrote:
> > > 
> > > > On Sun, Sep 27, 2015 at 03:06:24PM +0200, Igor Mammedov wrote:
> > > > > On Sun, 27 Sep 2015 13:48:21 +0300
> > > > > "Michael S. Tsirkin" <address@hidden> wrote:
> > > > > 
> > > > > > On Fri, Sep 25, 2015 at 03:53:12PM +0200, Igor Mammedov wrote:
> > > > > > > Mapping DIMMs non-contiguously allows working around the
> > > > > > > virtio bug reported earlier:
> > > > > > > http://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg00522.html
> > > > > > > In this case the guest kernel doesn't allocate buffers
> > > > > > > that cross a DIMM boundary, keeping each buffer
> > > > > > > local to a DIMM.
> > > > > > > 
> > > > > > > Suggested-by: Michael S. Tsirkin <address@hidden>
> > > > > > > Signed-off-by: Igor Mammedov <address@hidden>
> > > > > > > ---
> > > > > > > The benefit of this workaround is that no guest-side
> > > > > > > changes are required.
> > > > > > 
> > > > > > That's a hard requirement, I agree.
> > > > > > 
> > > > > > 
> > > > > > > ---
> > > > > > >  hw/i386/pc.c         | 4 +++-
> > > > > > >  hw/i386/pc_piix.c    | 3 +++
> > > > > > >  hw/i386/pc_q35.c     | 3 +++
> > > > > > >  include/hw/i386/pc.h | 2 ++
> > > > > > >  4 files changed, 11 insertions(+), 1 deletion(-)
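
(Aside: a minimal sketch of the idea behind the patch, assuming a hypothetical
helper and an illustrative gap size; this is not the actual code in the diff
above.)

/*
 * Sketch: when picking the guest physical address for a newly plugged
 * DIMM, leave an artificial gap after the previously mapped one, so a
 * guest-physically-contiguous buffer can never span two DIMMs that are
 * discontiguous in host virtual memory.
 */
#include <stdint.h>
#include <stdbool.h>

#define DIMM_GAP_SIZE (1ULL << 20)   /* illustrative 1 MiB gap, not the patch's value */

static uint64_t next_dimm_addr(uint64_t prev_dimm_end, uint64_t align,
                               bool gaps_enabled)
{
    uint64_t base = prev_dimm_end;

    if (gaps_enabled) {
        base += DIMM_GAP_SIZE;               /* force a hole between DIMMs' GPAs */
    }
    /* round up to the DIMM's required alignment (e.g. backend page size) */
    return (base + align - 1) & ~(align - 1);
}
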
> > > > > > 
> > > > > > Aren't other architectures besides PC ever affected?
> > > > > > Do they all allocate all of memory contiguously in HVA space?
> > > > > I'm not sure about other targets; I've CCed interested parties.
> > > > > 
> > > > > > 
> > > > > > Also - does the issue only affect hotplugged memory?
> > > > > Potentially it affects -numa memdev=foo, but however I
> > > > > tried I wasn't able to reproduce it.
> > > > > We could do it as a
> > > > > separate workaround later if it affects someone
> > > > > and virtio is not fixed to handle split buffers by that time.
> > > > > 
> > > > 
> > > > You can't reproduce a crash, or you can't reproduce getting
> > > > a contiguous GPA with a fragmented HVA?
> > > > If you can see fragmentation, that's enough to assume a guest crash
> > > > can be triggered, even if it doesn't happen with Linux.
> > > I'll check it.
> > > 
> > > > 
> > > > >  
> > > > > > Can't the patch be local to pc-dimm (except maybe the
> > > > > > backwards compatibility thing)?
> > > > > I think the decision about using gaps and their size
> > > > > should be made by the board and not by generic pc-dimm.
> > > > > 
> > > > 
> > > > Well virtio is generic and can be used by all boards.
> > > Still, pc-dimm.addr allocation is not part of the pc-dimm
> > > device; it's just helper functions that happen to live in
> > > the same source file.
> > > 
> > > But more importantly, every target might have its own
> > > notion of how it partitions the hotplug address space, so making
> > > the same gap global might break them.
> > > 
> > > It's safer to enable gaps per target; I think the ppc guys
> > > will make their own patch on top of this, taking
> > > into account their target-specific and compat stuff.
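
(A rough sketch of what "per target" could look like, with entirely made-up
names that are not real QEMU API: the board owns the policy, and a generic
helper only applies it.)

/*
 * Hypothetical illustration: each machine type decides whether gaps are
 * enforced and how big they are (compat machines keep gaps off), while
 * the generic plug path just consumes whatever the board hands it.
 */
#include <stdint.h>
#include <stdbool.h>

typedef struct MemHotplugGapPolicy {
    bool     enforce_gap;   /* old machine types leave this false for compatibility */
    uint64_t gap_size;      /* chosen by the board to fit its address-space layout */
} MemHotplugGapPolicy;

/* generic helper: applies the policy, never decides it */
static uint64_t dimm_base_with_gap(const MemHotplugGapPolicy *p,
                                   uint64_t prev_dimm_end)
{
    return p->enforce_gap ? prev_dimm_end + p->gap_size : prev_dimm_end;
}
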
> > 
> > I have never seen this issue that you mention at
> > 
> > http://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg00522.html
> > 
> > in PowerPC. I have not been able to reproduce the QEMU crash with the
> > commandline suggested there.
> > 
> > (# ./ppc64-softmmu/qemu-system-ppc64 --enable-kvm --nographic
> > -machine pseries -m 8G,slots=32,maxmem=32G -device
> > virtio-blk-pci,drive=rootdisk -drive
> > file=/home/bharata/F20-snap1,if=none,cache=none,id=rootdisk,format=qcow2
> > -monitor telnet:localhost:1235,server,nowait -vga none
> > -bios /home/bharata/slof/slof.bin -smp 16,maxcpus=32 -netdev
> > tap,id=foo,ifname=tap0,script=/home/bharata/qemu-ifup -device
> > virtio-net-pci,id=n1,netdev=foo `for i in $(seq 0 15); do echo -n
> > "-object memory-backend-ram,id=m$i,size=256M -device
> > pc-dimm,id=dimm$i,memdev=m$i "; done` -snapshot)
> > 
> > PowerPC sPAPR memory hotplug enforces a memory alignment of 256MB
> > for both boot-time and hotplugged memory.
> > 
> > So I am not sure if anything other than the default gap=0, which you have
> > used in this patchset for PowerPC, is necessary.
> The bigger the initial memory and DIMM sizes, the less likely the bug is
> to trigger. You don't see it mostly due to luck, but that doesn't
> rule out the possibility of it happening in production.
> So please consider turning on gaps for the ppc machine.
> 
> Looking at how hotplug_mem_size is sized in hw/ppc/spapr.c, it doesn't
> look like just turning on gaps would work, since it doesn't leave space
> for alignment adjustment.
> 
> Try to plug dimm devices in the following order:
>   -m 8G,slots=2,maxmem=1256M \
> 
>   -object memory-backend-ram,id=m1,size=256M -device pc-dimm,memdev=m1 \
> 
>   -object memory-backend-file,id=hugepage1g,size=1G,file=/path/to1Gb/hugepagefs \
>   -device pc-dimm,memdev=hugepage1g
> 
> It should fail when adding the second dimm, since the alignment for a 1Gb huge page
> would be 1Gb but the hotplug_mem container size is only 1256M total.
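
(Back-of-the-envelope version of that failure, with illustrative numbers
rather than the exact command line above, assuming the hotplug container is
sized with no slack for alignment: rounding the second DIMM's address up to
1Gb pushes its end past the container's end.)

/* Illustrative arithmetic only; the base address and sizes are made up. */
#include <stdint.h>
#include <stdio.h>

#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((uint64_t)(a) - 1))

int main(void)
{
    uint64_t base = 0x100000000ULL;              /* assumed start of hotplug region */
    uint64_t size = 1280ULL << 20;               /* container sized with no alignment slack */
    uint64_t dimm1_end = base + (256ULL << 20);  /* first 256M DIMM fits at the base */

    /* second DIMM is backed by 1G huge pages, so it must be 1G-aligned */
    uint64_t dimm2 = ALIGN_UP(dimm1_end, 1ULL << 30);
    uint64_t dimm2_end = dimm2 + (1ULL << 30);

    /* alignment skipped 768M of the container, so the 1G DIMM no longer fits */
    printf("second dimm fits: %s\n", dimm2_end <= base + size ? "yes" : "no");
    return 0;
}
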

PowerKVM supports only a 16MB huge page size. Given that we enforce 256MB
alignment on RAM size, individual node memory sizes, maxmem and DIMM size,
I don't think we will encounter a problem similar to the above.

However, to take care of the virtio bug, I have just posted a patchset to
force gaps between DIMMs.

Regards,
Bharata.



