Subject: Re: [Qemu-devel] vm performance degradation after kvm live migration or save-restore with EPT enabled
From: Gleb Natapov
Date: Mon, 5 Aug 2013 11:43:09 +0300
On Mon, Aug 05, 2013 at 08:35:09AM +0000, Zhanghaoyu (A) wrote:
> >> >> >> hi all,
> >> >> >>
> >> >> >> I met a problem similar to the ones in these threads while
> >> >> >> performing live migration or save-restore tests on the kvm
> >> >> >> platform (qemu:1.4.0, host:suse11sp2, guest:suse11sp2), running a
> >> >> >> tele-communication software suite in the guest:
> >> >> >> https://lists.gnu.org/archive/html/qemu-devel/2013-05/msg00098.html
> >> >> >> http://comments.gmane.org/gmane.comp.emulators.kvm.devel/102506
> >> >> >> http://thread.gmane.org/gmane.comp.emulators.kvm.devel/100592
> >> >> >> https://bugzilla.kernel.org/show_bug.cgi?id=58771
> >> >> >>
> >> >> >> After live migration or virsh restore [savefile], one process's
> >> >> >> CPU utilization went up by about 30%, resulting in throughput
> >> >> >> degradation of that process.
> >> >> >>
> >> >> >> With EPT disabled, this problem is gone.
> >> >> >>
> >> >> >> I suspect that the kvm hypervisor is implicated in this problem.
> >> >> >> Based on that suspicion, I want to find the two adjacent versions
> >> >> >> of kvm-kmod between which this problem starts to trigger (e.g.
> >> >> >> 2.6.39, 3.0-rc1), then either analyze the differences between
> >> >> >> these two versions, or apply the patches between them by
> >> >> >> bisection, and finally find the key patches.
> >> >> >>
> >> >> >> Any better ideas?
> >> >> >>
> >> >> >> Thanks,
> >> >> >> Zhang Haoyu
> >> >> >
> >> >> >I've attempted to duplicate this on a number of machines that are as
> >> >> >similar to yours as I am able to get my hands on, and so far have not
> >> >> >been able to see any performance degradation. And from what I've read
> >> >> >in the above links, huge pages do not seem to be part of the problem.
> >> >> >
> >> >> >So, if you are in a position to bisect the kernel changes, that would
> >> >> >probably be the best avenue to pursue in my opinion.
> >> >> >
> >> >> >Bruce
> >> >>
> >> >> I found the first bad commit
> >> >> ([612819c3c6e67bac8fceaa7cc402f13b1b63f7e4] KVM: propagate fault
> >> >> r/w information to gup(), allow read-only memory) which triggers
> >> >> this problem by git-bisecting the kvm kernel tree (cloned from
> >> >> https://git.kernel.org/pub/scm/virt/kvm/kvm.git).
> >> >>
> >> >> And,
> >> >> git log 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4 -n 1 -p > 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.log
> >> >> git diff 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4~1..612819c3c6e67bac8fceaa7cc402f13b1b63f7e4 > 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.diff
> >> >>
> >> >> Then I diffed 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.log against
> >> >> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.diff and concluded that
> >> >> all of the differences between
> >> >> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4~1 and
> >> >> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4
> >> >> come from that single commit, so this commit is the culprit which
> >> >> directly or indirectly causes the degradation.
> >> >>
> >> >> Does the map_writable flag passed to the mmu_set_spte() function
> >> >> affect the PTE's PAT flag, or increase the VM exits induced by the
> >> >> guest trying to write read-only memory?
> >> >>
> >> >> Thanks,
> >> >> Zhang Haoyu
> >> >>
> >> >
> >> >There should be no read-only memory maps backing guest RAM.
> >> >
> >> >Can you confirm map_writable = false is being passed to __direct_map?
> >> >(this should not happen, for guest RAM).
> >> >And if it is false, please capture the associated GFN.
> >> >
> >> I added the check and printk below at the start of __direct_map() at
> >> the first bad commit version:
> >> --- kvm-612819c3c6e67bac8fceaa7cc402f13b1b63f7e4/arch/x86/kvm/mmu.c	2013-07-26 18:44:05.000000000 +0800
> >> +++ kvm-612819/arch/x86/kvm/mmu.c	2013-07-31 00:05:48.000000000 +0800
> >> @@ -2223,6 +2223,9 @@ static int __direct_map(struct kvm_vcpu
> >>  	int pt_write = 0;
> >>  	gfn_t pseudo_gfn;
> >>
> >> +	if (!map_writable)
> >> +		printk(KERN_ERR "%s: %s: gfn = %llu\n", __FILE__, __func__, gfn);
> >> +
> >>  	for_each_shadow_entry(vcpu, (u64)gfn << PAGE_SHIFT, iterator) {
> >>  		if (iterator.level == level) {
> >>  			unsigned pte_access = ACC_ALL;
> >>
> >> I virsh-saved the VM and then virsh-restored it; so many GFNs were
> >> printed that you could absolutely describe it as flooding.
> >>
> >The flooding you see happens during the migrate-to-file stage because of
> >dirty page tracking. If you clear dmesg after virsh-save you should not
> >see any flooding after virsh-restore. I just checked with the latest
> >tree, and I do not.
>
> I verified this again.
> I virsh-saved the VM; during the saving stage I ran 'dmesg', and no GFN
> was printed. Perhaps the switch from the running state to the paused state
> takes so little time that no guest write happens during the switching
> period.
> After the save operation completed, I ran 'dmesg -c' to clear the buffer
> all the same, then virsh-restored the VM; so many GFNs were printed by
> running 'dmesg', and I also ran 'tail -f /var/log/messages' during the
> restore stage: just as many GFNs flooded in dynamically there too.
> I'm sure that the flooding happens during the virsh-restore stage, not
> the migration stage.
>
Interesting, is this with an upstream kernel? For me the situation is
exactly the opposite. What is your command line?
--
Gleb.