qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH] mlock: fix bug when mlockall called before mbin


From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] [PATCH] mlock: fix bug when mlockall called before mbind
Date: Thu, 14 Aug 2014 11:56:51 +0200

On Thu, Aug 14, 2014 at 05:09:00PM +0800, zhanghailiang wrote:
> On 2014/8/14 15:15, Hu Tao wrote:
> >On Thu, Aug 14, 2014 at 02:31:41PM +0800, zhanghailiang wrote:
> >>On 2014/8/13 19:50, Michael S. Tsirkin wrote:
> >>>On Wed, Aug 13, 2014 at 07:21:57PM +0800, zhanghailiang wrote:
> >>>>If we configure qemu with realtime-mlock-on and memory-node-bind at the 
> >>>>same time,
> >>>>Qemu will fail to start, and mbind() fails with message "Input/output 
> >>>>error".
> >>>>
> >>>>> From man page:
> >>>>int mbind(void *addr, unsigned long len, int mode,
> >>>>                  unsigned long *nodemask, unsigned long maxnode,
> >>>>                  unsigned flags);
> >>>>The *MPOL_BIND* mode specifies a strict policy that restricts memory 
> >>>>allocation
> >>>>to the nodes specified in nodemask.
> >>>>If *MPOL_MF_STRICT* is passed in flags and policy is not MPOL_DEFAULT(In 
> >>>>qemu
> >>>>here is MPOL_BIND), then the call will fail with the error EIO if the 
> >>>>existing
> >>>>pages in  the memory range don't follow the policy.
> >>>>
> >>>>The memory locked ahead by mlockall can not guarantee to follow the 
> >>>>policy above,
> >>>>And if that happens, it will result in an EIO error.
> >>>>
> >>>>So we should call mlock after mbind, here we adjust the place where 
> >>>>called mlock,
> >>>>Move it to function pc_memory_init.
> >>>>
> >>>>Signed-off-by: xiexiangyou<address@hidden>
> >>>>Signed-off-by: zhanghailiang<address@hidden>
> >>>
> >>>OK but won't this still fail in case of memory hotplug?
> >>>We set MCL_FUTURE so the same will apply?
> >
> >It has already been set.
> >
> >>>Maybe it's enough to set MPOL_MF_MOVE?
> >>>Does the following work for you?
> >>>
> >>
> >>Hi Michael,
> >>
> >>I have tested memory hotplug, use virsh command like
> >>'virsh setmem redhat-6.4 6388608 --config --live', it is OK, and
> >>it will not call mbind when do such memory hotplug.
> >>but i don't know if there is command like 'memory-node hotplug' ?
> >
> >Using qemu monitor command:
> >object_add memory-backend-ram,id=ram1,size=128M,host-nodes=0,policy=bind
> >
> 
> Hi Hu Tao,
> 
> I have tested it use the above command, and yes, it failed.
> And if i remove the *MCL_FUTURE* flag of mlockall(), it will work fine.
> 
> *MCL_FUTURE*  Lock  all  pages  which  will become mapped into the
> address space of the process in the future.(From man page)
> 
> So i think we could remove this flag, and call mlockall(MCL_CURRENT)
> every time when we do the above 'memory object add'.

No that's wrong. We want allocations of qemu memory locked as well -
not just guest memory.

> >>
> >>MPOL_MF_MOVE can work, it is more simple, but it is not perfect. It
> >>consumes more time to *move the memory*(i guess will reconstruct
> >>pages and copy memory)which has been locked by mlockall. The result
> >>is VM will start slower than the above scenario.
> >
> >I think your patch makes sense but it doesn't work for hotplugged
> >memory. We need MPOL_MF_MOVE, too.
> >
> 
> Thank you very much, before send another patch, I will look into the
> qemu process of hotplug memory, and try to solve this fault.:)

Does setting MPOL_MF_MOVE solve all problems?

> >>
> >>BTW, i think the follow process is clearer and more logical:
> >>Allocate memory--->Set memory policy--->Lock memory.
> >>
> >>So what's your opinion? Thanks very much.
> >>
> >>
> >>>-->
> >>>
> >>>hostmem: set MPOL_MF_MOVE
> >>>
> >>>When memory is allocated on a wrong node, MPOL_MF_STRICT
> >>>doesn't move it - it just fails the allocation.
> >>>A simple way to reproduce the failure is with mlock=on
> >>>realtime feature.
> >>>
> >>>The code comment actually says: "ensure policy won't be ignored"
> >>>so setting MPOL_MF_MOVE seems like a better way to do this.
> >>>
> >>>Signed-off-by: Michael S. Tsirkin<address@hidden>
> >>>
> >>>---
> >>>
> >>>diff --git a/backends/hostmem.c b/backends/hostmem.c
> >>>index ca10c51..a9905c0 100644
> >>>--- a/backends/hostmem.c
> >>>+++ b/backends/hostmem.c
> >>>@@ -304,7 +304,7 @@ host_memory_backend_memory_complete(UserCreatable *uc, 
> >>>Error **errp)
> >>>          /* ensure policy won't be ignored in case memory is preallocated
> >>>           * before mbind(). note: MPOL_MF_STRICT is ignored on hugepages 
> >>> so
> >>>           * this doesn't catch hugepage case. */
> >>>-        unsigned flags = MPOL_MF_STRICT;
> >>>+        unsigned flags = MPOL_MF_STRICT | MPOL_MF_MOVE;
> >>>
> >>>          /* check for invalid host-nodes and policies and give more 
> >>> verbose
> >>>           * error messages than mbind(). */
> >>>
> >>
> >
> >
> >.
> >
> 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]