Re: [Qemu-devel] [PATCH] mmap-alloc: use same backend for all mappings
From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] [PATCH] mmap-alloc: use same backend for all mappings
Date: Tue, 1 Dec 2015 16:19:27 +0200

On Tue, Dec 01, 2015 at 02:31:19PM +0100, Greg Kurz wrote:
> On Tue, 1 Dec 2015 12:57:47 +0200
> "Michael S. Tsirkin" <address@hidden> wrote:
>
> > On Tue, Dec 01, 2015 at 04:23:11PM +0530, Aneesh Kumar K.V wrote:
> > > "Michael S. Tsirkin" <address@hidden> writes:
> > >
> > > > On Mon, Nov 30, 2015 at 02:46:31PM +0100, Greg Kurz wrote:
> > > >> On Mon, 30 Nov 2015 15:06:33 +0200
> > > >> "Michael S. Tsirkin" <address@hidden> wrote:
> > > >>
> > >
> > >
> > > ....
> > > >>
> > > >> On ppc64, the address space is divided into 256MB-sized segments where
> > > >> all pages have the same size. This is a hw limitation IIUC. I don't
> > > >> know if it can be fixed and I'll let Ben comment on it.
> > > >
> > > > > But it's anonymous memory with PROT_NONE. There should be no pages
> > > > > there: just a chunk of virtual memory reserved.
> > > >
> > >
> > > > ppc64 uses the page size (called the base page size) to find the hash
> > > > slot in which we find the virtual address to real address translation.
> > > > All the pages in a segment should have the same base page size. Hugetlb
> > > > pages have a base page size of 16M whereas a regular Linux page has 64K.
> > > > mmap will fail to map a hugetlb mapping in a segment that already has
> > > > regular pages mapped.
> > >
> > > -aneesh
> >
> >
> > I see this in kernel:
> >
> > >     } else if (flags & MAP_HUGETLB) {
> > >         struct user_struct *user = NULL;
> > >         struct hstate *hs;
> > >
> > >         hs = hstate_sizelog((flags >> MAP_HUGE_SHIFT) &
> > >                             SHM_HUGE_MASK);
> > >         if (!hs)
> > >             return -EINVAL;
> > >
> > >         len = ALIGN(len, huge_page_size(hs));
> > >         /*
> > >          * VM_NORESERVE is used because the reservations will be
> > >          * taken when vm_ops->mmap() is called
> > >          * A dummy user value is used because we are not locking
> > >          * memory so no accounting is necessary
> > >          */
> > >         file = hugetlb_file_setup(HUGETLB_ANON_FILE, len,
> > >                                   VM_NORESERVE,
> > >                                   &user, HUGETLB_ANONHUGE_INODE,
> > >                                   (flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK);
> > >         if (IS_ERR(file))
> > >             return PTR_ERR(file);
> > >     }
> >
> > So maybe it's a question of passing in MAP_HUGETLB and the
> > correct size mask.
> >
>
> I guess you are talking about the PROT_NONE mapping here ^^.

Yes.

> How do we know that the fd points to hugepages?

Dunno ... I guess we can just try this if the regular
mmap fails?

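To make the "try this if the regular mmap fails" idea concrete, here is a
rough sketch, not tested on ppc64 and not what the patch does. All helper
names (hugetlb_flags, reserve_range, map_with_fallback) are made up for
illustration. It assumes len is a multiple of the huge page size, that the
caller knows that size, and that encoding log2(size) above MAP_HUGE_SHIFT
is the "correct size mask" meant above.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for MAP_ANONYMOUS, MAP_HUGETLB, MAP_NORESERVE */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_HUGE_SHIFT
#define MAP_HUGE_SHIFT 26       /* assumption: value from the Linux uapi headers */
#endif

/* Hypothetical helper: mmap() flags requesting huge pages of one given size. */
static int hugetlb_flags(size_t hpagesize)
{
    unsigned int order = 0;

    /* The size is passed as log2(size) in the bits above MAP_HUGE_SHIFT. */
    assert(hpagesize && (hpagesize & (hpagesize - 1)) == 0);
    while (((size_t)1 << order) < hpagesize) {
        order++;
    }
    return MAP_HUGETLB | (int)(order << MAP_HUGE_SHIFT);
}

/*
 * Hypothetical helper: reserve len bytes of address space with PROT_NONE.
 * A non-zero hpagesize makes the reservation itself a hugetlb mapping.
 */
static void *reserve_range(size_t len, size_t hpagesize)
{
    int flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE;

    if (hpagesize) {
        flags |= hugetlb_flags(hpagesize);
    }
    return mmap(NULL, len, PROT_NONE, flags, -1, 0);
}

/*
 * Hypothetical helper: map fd over a regular anonymous reservation first;
 * if that fails, redo the reservation with MAP_HUGETLB and try again.
 */
static void *map_with_fallback(int fd, size_t len, size_t hpagesize)
{
    void *area = reserve_range(len, 0);
    void *p;

    if (area == MAP_FAILED) {
        return MAP_FAILED;
    }
    p = mmap(area, len, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED, fd, 0);
    if (p != MAP_FAILED) {
        return p;
    }
    munmap(area, len);
    if (!hpagesize) {
        return MAP_FAILED;      /* nothing else to try */
    }

    area = reserve_range(len, hpagesize);
    if (area == MAP_FAILED) {
        return MAP_FAILED;
    }
    p = mmap(area, len, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED, fd, 0);
    if (p == MAP_FAILED) {
        munmap(area, len);
    }
    return p;
}
```

The obvious downside is that the caller still has to come up with hpagesize
somehow, which is the same guessing problem in another form.
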
> And what's the difference between passing MAP_HUGETLB and passing a
> hugetlbfs backed fd + MAP_NORESERVE?

Does MAP_NORESERVE have the desired effect?
I need to look at the kernel code; the man page merely
mentions swap space use.

> I think the latter is easier
> because we don't need to guess if backend is hugetlbfs.

If this helps, that's fine by me.
It's probably a good idea to set this anyway.

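For comparison, a minimal sketch of the fd-backed variant as I understand
the proposal: the PROT_NONE reservation uses the same fd as the final
mapping, plus MAP_NORESERVE, so nothing has to guess whether the backend is
hugetlbfs. The function names are hypothetical, and whether MAP_NORESERVE
really avoids taking hugepage reservations here still needs checking
against the kernel code.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for MAP_ANONYMOUS, MAP_NORESERVE */
#endif
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: reserve len bytes with PROT_NONE using the same
 * backend as the final mapping (fd < 0 means anonymous memory).
 */
static void *reserve_same_backend(int fd, size_t len)
{
    int flags = MAP_PRIVATE | MAP_NORESERVE;

    if (fd < 0) {
        flags |= MAP_ANONYMOUS;
    }
    return mmap(NULL, len, PROT_NONE, flags, fd, 0);
}

/*
 * Hypothetical helper: reserve the range, then place the real mapping on
 * top of it with MAP_FIXED. Both mappings share one backend, so on ppc64
 * they would also share one base page size.
 */
static void *map_same_backend(int fd, size_t len, int shared)
{
    int flags = MAP_FIXED | (shared ? MAP_SHARED : MAP_PRIVATE);
    void *area = reserve_same_backend(fd, len);
    void *p;

    if (area == MAP_FAILED) {
        return MAP_FAILED;
    }
    if (fd < 0) {
        flags |= MAP_ANONYMOUS;
    }
    p = mmap(area, len, PROT_READ | PROT_WRITE, flags, fd, 0);
    if (p == MAP_FAILED) {
        munmap(area, len);      /* drop the reservation on failure */
    }
    return p;
}
```

For a hugetlbfs fd, len would have to be a multiple of the huge page size;
the sketch leaves that to the caller.
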
--
MST