Re: [Qemu-devel] [Xen-devel] frequently ballooning results in qemu exit
From: Stefano Stabellini
Subject: Re: [Qemu-devel] [Xen-devel] frequently ballooning results in qemu exit
Date: Wed, 3 Apr 2013 11:36:19 +0100
User-agent: Alpine 2.02 (DEB 1266 2009-07-14)
On Wed, 3 Apr 2013, Hanweidong wrote:
> > -----Original Message-----
> > From: Stefano Stabellini [mailto:address@hidden]
> > Sent: April 2, 2013 21:28
> > To: Hanweidong
> > Cc: Stefano Stabellini; Tim (Xen.org); George Dunlap; Andrew Cooper;
> > Yanqiangjun; address@hidden; address@hidden; Gonglei
> > (Arei); Anthony Perard; Wangzhenguo
> > Subject: RE: [Qemu-devel] [Xen-devel] frequently ballooning results in
> > qemu exit
> >
> > On Tue, 2 Apr 2013, Hanweidong wrote:
> > > > -----Original Message-----
> > > > From: Stefano Stabellini [mailto:address@hidden]
> > > > Sent: April 1, 2013 22:39
> > > > To: Hanweidong
> > > > Cc: Stefano Stabellini; Tim (Xen.org); George Dunlap; Andrew Cooper;
> > > > Yanqiangjun; address@hidden; address@hidden; Gonglei
> > > > (Arei); Anthony Perard; Wangzhenguo
> > > > Subject: RE: [Qemu-devel] [Xen-devel] frequently ballooning results in
> > > > qemu exit
> > > >
> > > > On Sat, 30 Mar 2013, Hanweidong wrote:
> > > > > > -----Original Message-----
> > > > > > From: Stefano Stabellini [mailto:address@hidden]
> > > > > > Sent: March 29, 2013 20:37
> > > > > > To: Hanweidong
> > > > > > Cc: Tim (Xen.org); George Dunlap; Andrew Cooper; Yanqiangjun;
> > > > > > qemu-address@hidden; address@hidden; Gonglei (Arei);
> > > > > > Anthony Perard; Wangzhenguo
> > > > > > Subject: Re: [Qemu-devel] [Xen-devel] frequently ballooning results
> > > > > > in qemu exit
> > > > > >
> > > > > > On Mon, 25 Mar 2013, Hanweidong wrote:
> > > > > > > We fixed this issue with the patch below, which computes the
> > > > > > > correct size for test_bits(). qemu_get_ram_ptr() and
> > > > > > > qemu_safe_ram_ptr() will call xen_map_cache() with a size of 0
> > > > > > > if the requested address is in RAM. xen_map_cache() then passes
> > > > > > > the size 0 on to test_bits() to check whether the corresponding
> > > > > > > pfn was mapped in the cache. But test_bits() always returns 1
> > > > > > > when size is 0, without testing any bits. Actually, for this
> > > > > > > case, test_bits() should check one bit. So this patch introduces
> > > > > > > a __test_bit_size which is greater than 0 and a multiple of
> > > > > > > XC_PAGE_SIZE, so that test_bits() can work correctly with
> > > > > > > __test_bit_size >> XC_PAGE_SHIFT as its size.
> > > > > > >
> > > > > > > Signed-off-by: Zhenguo Wang <address@hidden>
> > > > > > > Signed-off-by: Weidong Han <address@hidden>
> > > > > >
> > > > > > Thanks for the patch and for debugging this difficult problem.
> > > > > > The reality is that size is never actually 0: when
> > > > > > qemu_get_ram_ptr calls xen_map_cache with size 0, it actually
> > > > > > means "map until the end of the page". As a consequence,
> > > > > > test_bits should always test at least 1 bit, like you wrote.
> > > > >
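[The failure mode under discussion can be modeled in a few lines of standalone C. This is only a sketch of the semantics, not QEMU's actual test_bits() implementation, and test_bits_model is a name made up here:]

```c
#include <assert.h>
#include <limits.h>

/* Toy model of a test_bits()-style check: returns 1 iff all `nr` bits
 * starting at `start` are set in `map`. With nr == 0 the loop body
 * never runs, so the function vacuously returns 1 -- which is why a
 * zero-size request always looked like a mapcache hit. */
static int test_bits_model(unsigned start, unsigned nr,
                           const unsigned long *map)
{
    const unsigned bits = sizeof(unsigned long) * CHAR_BIT;

    for (unsigned i = start; i < start + nr; i++) {
        if (!((map[i / bits] >> (i % bits)) & 1UL)) {
            return 0;
        }
    }
    return 1;
}
```

[Even with an all-zero bitmap, test_bits_model(0, 0, map) reports success, matching the "always return 1 when size is 0" behavior described above.]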
> > > > > Yes, for the case where size is 0, we can simply set
> > > > > __test_bit_size to 1. But for size > 0, I think setting
> > > > > __test_bit_size to size >> XC_PAGE_SHIFT is incorrect. If size is
> > > > > not a multiple of XC_PAGE_SIZE, then the remaining (size %
> > > > > XC_PAGE_SIZE) part also needs to test 1 bit. For example, if size
> > > > > is XC_PAGE_SIZE + 1, then it needs to test 2 bits, but size >>
> > > > > XC_PAGE_SHIFT is only 1.
> > > > >
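[The rounding rule being discussed can be written as a standalone helper. calc_test_bit_size is a name invented for this sketch; the patch itself does the computation inline in xen_map_cache():]

```c
#include <stdint.h>

#define XC_PAGE_SHIFT 12
#define XC_PAGE_SIZE  (1u << XC_PAGE_SHIFT)

typedef uint64_t hwaddr;

/* Round the request up so the bitmap check always covers every page the
 * mapping touches: account for the sub-page offset of the start
 * address, then round up to a multiple of XC_PAGE_SIZE. A size of 0
 * means "map to the end of the page", so test exactly one bit. */
static hwaddr calc_test_bit_size(hwaddr phys_addr, hwaddr size)
{
    hwaddr test_bit_size;

    if (size) {
        test_bit_size = size + (phys_addr & (XC_PAGE_SIZE - 1));
        if (test_bit_size % XC_PAGE_SIZE) {
            test_bit_size += XC_PAGE_SIZE - (test_bit_size % XC_PAGE_SIZE);
        }
    } else {
        test_bit_size = XC_PAGE_SIZE;
    }
    return test_bit_size;
}
```

[With size = XC_PAGE_SIZE + 1 this yields two pages, matching the "needs to test 2 bits" example above.]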
> > > >
> > > > I was assuming that the size is always page-aligned.
> > > > Looking through the code, though, I think it's better not to rely
> > > > on this assumption.
> > > >
> > > > Looking back at your original patch:
> > > >
> > > >
> > > >
> > > > > We fixed this issue with the patch below, which computes the
> > > > > correct size for test_bits(). qemu_get_ram_ptr() and
> > > > > qemu_safe_ram_ptr() will call xen_map_cache() with a size of 0
> > > > > if the requested address is in RAM. xen_map_cache() then passes
> > > > > the size 0 on to test_bits() to check whether the corresponding
> > > > > pfn was mapped in the cache. But test_bits() always returns 1
> > > > > when size is 0, without testing any bits. Actually, for this
> > > > > case, test_bits() should check one bit. So this patch introduces
> > > > > a __test_bit_size which is greater than 0 and a multiple of
> > > > > XC_PAGE_SIZE, so that test_bits() can work correctly with
> > > > > __test_bit_size >> XC_PAGE_SHIFT as its size.
> > > > >
> > > > > Signed-off-by: Zhenguo Wang <address@hidden>
> > > > > Signed-off-by: Weidong Han <address@hidden>
> > > > >
> > > > > diff --git a/xen-mapcache.c b/xen-mapcache.c
> > > > > index 31c06dc..bd4a97f 100644
> > > > > --- a/xen-mapcache.c
> > > > > +++ b/xen-mapcache.c
> > > > > @@ -202,6 +202,7 @@ uint8_t *xen_map_cache(hwaddr phys_addr, hwaddr size,
> > > > >      hwaddr address_index;
> > > > >      hwaddr address_offset;
> > > > >      hwaddr __size = size;
> > > > > +    hwaddr __test_bit_size = size;
> > > > >      bool translated = false;
> > > > >
> > > > > tryagain:
> > > > > @@ -210,7 +211,23 @@ tryagain:
> > > > >
> > > > > trace_xen_map_cache(phys_addr);
> > > > >
> > > > > -    if (address_index == mapcache->last_address_index && !lock && !__size) {
> > > > > +    entry = &mapcache->entry[address_index % mapcache->nr_buckets];
> > > >
> > > > there is no need to move this line up here, see below
> > > >
> > > >
> > > > > +    /* __test_bit_size is always a multiple of XC_PAGE_SIZE */
> > > > > +    if (size) {
> > > > > +        __test_bit_size = size + (phys_addr & (XC_PAGE_SIZE - 1));
> > > > > +
> > > > > +        if (__test_bit_size % XC_PAGE_SIZE) {
> > > > > +            __test_bit_size += XC_PAGE_SIZE - (__test_bit_size % XC_PAGE_SIZE);
> > > > > +        }
> > > > > +    } else {
> > > > > +        __test_bit_size = XC_PAGE_SIZE;
> > > > > +    }
> > > >
> > > > this is OK
> > > >
> > > >
> > > > > +    if (address_index == mapcache->last_address_index && !lock && !__size &&
> > > > > +        test_bits(address_offset >> XC_PAGE_SHIFT,
> > > > > +                  __test_bit_size >> XC_PAGE_SHIFT,
> > > > > +                  entry->valid_mapping)) {
> > > > >          trace_xen_map_cache_return(mapcache->last_address_vaddr + address_offset);
> > > > >          return mapcache->last_address_vaddr + address_offset;
> > > > >      }
> > > >
> > > > Unless I am missing something, this change is unnecessary: if the
> > > > mapping is not valid, then mapcache->last_address_index is set to -1.
> > >
> > > mapcache->last_address_index means the corresponding bucket (1MB) was
> > > mapped, but we noticed that some pages of the bucket may not be
> > > mapped. So we need to check whether the page is mapped even when
> > > address_index is equal to last_address_index.
> >
> > That is a good point, but the current fix doesn't fully address that
> > problem: the first entry found in the cache might not be the one
> > corresponding to last_address_index.
> >
> > I think that the right fix here would be to replace last_address_index
> > and last_address_vaddr with a last_entry pointer.
> >
> > I have sent a small patch series that includes your patch, can you
> > please let me know if it does solve your problem and if you think that
> > is correct?
> >
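[The suggested refactor, replacing the (last_address_index, last_address_vaddr) pair with a last_entry pointer, can be sketched with simplified structures. The field set here is reduced to what the fast path needs and is not the real QEMU MapCache layout:]

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define XC_PAGE_SHIFT 12
#define XC_PAGE_SIZE  (1u << XC_PAGE_SHIFT)

typedef uint64_t hwaddr;

/* Simplified stand-ins for the real mapcache structures. */
typedef struct MapCacheEntry {
    hwaddr paddr_index;            /* which 1MB bucket this entry maps */
    uint8_t *vaddr_base;           /* start of the bucket's mapping */
    unsigned long *valid_mapping;  /* one bit per page in the bucket */
} MapCacheEntry;

typedef struct MapCache {
    MapCacheEntry *last_entry;     /* replaces last_address_index/vaddr */
} MapCache;

static int entry_bit(const unsigned long *map, unsigned bit)
{
    const unsigned bits = sizeof(unsigned long) * CHAR_BIT;
    return (map[bit / bits] >> (bit % bits)) & 1UL;
}

/* Fast path: hit only if the cached entry matches the bucket AND the
 * requested page is actually set in that entry's validity bitmap, so a
 * partially mapped bucket can no longer produce a false hit. */
static uint8_t *fast_path(MapCache *mc, hwaddr index, hwaddr offset)
{
    MapCacheEntry *e = mc->last_entry;

    if (e && e->paddr_index == index &&
        entry_bit(e->valid_mapping, offset >> XC_PAGE_SHIFT)) {
        return e->vaddr_base + offset;
    }
    return NULL;  /* fall through to the slow bucket lookup */
}
```

[Keeping a pointer to the whole entry means the fast path consults the validity bitmap of exactly the entry it is about to return from, which is the gap in the index/vaddr pair approach.]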
>
> The patches look good to me. We verified that the patches solved our
> problem.
Great, thanks!
I'll submit a pull request.