Re: [Qemu-devel] [PATCH v3] target-i386: present virtual L3 cache info f
From: Gonglei (Arei)
Subject: Re: [Qemu-devel] [PATCH v3] target-i386: present virtual L3 cache info for vcpus
Date: Fri, 2 Sep 2016 02:41:55 +0000

> -----Original Message-----
> From: longpeng
> Sent: Friday, September 02, 2016 10:23 AM
> To: address@hidden; address@hidden; address@hidden;
> address@hidden
> Cc: Zhaoshenglong; Gonglei (Arei); Huangpeng (Peter); Herongguang (Stephen);
> address@hidden; Longpeng(Mike)
> Subject: [PATCH v3] target-i386: present virtual L3 cache info for vcpus
>
> From: "Longpeng(Mike)" <address@hidden>
>
There is a typo in the email address; please resend v3.
> Some software algorithms depend on the hardware's cache information. For
> example, in the x86 Linux kernel, when cpu1 wants to wake up a task on cpu2,
> cpu1 triggers a resched IPI and tells cpu2 to do the wakeup if the two do not
> share a last-level cache (LLC). Conversely, cpu1 accesses cpu2's runqueue
> directly if they share an LLC.
> The relevant Linux kernel code is shown below:
>
> static void ttwu_queue(struct task_struct *p, int cpu)
> {
>     struct rq *rq = cpu_rq(cpu);
>     ......
>     if (... && !cpus_share_cache(smp_processor_id(), cpu)) {
>         ......
>         ttwu_queue_remote(p, cpu); /* will trigger RES IPI */
>         return;
>     }
>     ......
>     ttwu_do_activate(rq, p, 0); /* access target's rq directly */
>     ......
> }
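
To make the decision above concrete, here is a minimal standalone sketch of the same branch. It is not the kernel's implementation: the `llc_id` array, the `wake_path` enum and both helper names are hypothetical, and the sketch simply assumes that each CPU carries an identifier for the last-level cache it belongs to.

```c
#include <stdbool.h>

/* Hypothetical model, not kernel code: llc_id[cpu] names the last-level
 * cache a CPU belongs to. Two CPUs "share cache" when the ids match. */
enum wake_path { WAKE_DIRECT, WAKE_REMOTE_IPI };

static bool share_llc(const int *llc_id, int a, int b)
{
    return llc_id[a] == llc_id[b];
}

/* Mirrors the branch in the quoted ttwu_queue(): queue the wakeup
 * remotely (resched IPI) only when waker and target do not share an LLC. */
static enum wake_path choose_wake_path(const int *llc_id, int waker, int target)
{
    return share_llc(llc_id, waker, target) ? WAKE_DIRECT : WAKE_REMOTE_IPI;
}
```

With an assumed per-core layout such as `{0, 0, 1, 1}` (no shared L3, so the last-level cache is private to each core), `choose_wake_path(ids, 0, 2)` yields `WAKE_REMOTE_IPI`. With `{0, 0, 0, 0}` (one L3 shared by the whole socket), the same call yields `WAKE_DIRECT`.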
>
> On real hardware, the CPUs on the same socket share an L3 cache, so one CPU
> does not trigger a resched IPI when waking up a task on another. But QEMU does
> not present virtual L3 cache info to the VM, so the Linux guest triggers lots
> of RES IPIs under some workloads even if the virtual CPUs belong to the same
> virtual socket.
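
For readers unfamiliar with how a guest learns about cache levels, the sketch below decodes one cache descriptor. It assumes the guest reads the Intel-style CPUID leaf 4 ("deterministic cache parameters"); the commit message above does not say which leaves the patch touches. The struct and function names are hypothetical, and the register values in the usage note are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical container for one decoded CPUID leaf 4 subleaf. */
struct cache_leaf {
    unsigned type;            /* 0 = no more caches, 1 = data, 2 = instruction, 3 = unified */
    unsigned level;           /* 1, 2, 3, ... */
    unsigned max_sharing_ids; /* upper bound on logical CPUs sharing this cache */
    uint64_t size_bytes;
};

/* Decode the EAX/EBX/ECX values returned for one subleaf of CPUID leaf 4. */
static struct cache_leaf decode_cpuid4(uint32_t eax, uint32_t ebx, uint32_t ecx)
{
    struct cache_leaf c;
    uint64_t line_size  = (ebx & 0xfff) + 1;          /* EBX[11:0]  + 1 */
    uint64_t partitions = ((ebx >> 12) & 0x3ff) + 1;  /* EBX[21:12] + 1 */
    uint64_t ways       = ((ebx >> 22) & 0x3ff) + 1;  /* EBX[31:22] + 1 */
    uint64_t sets       = (uint64_t)ecx + 1;          /* ECX + 1 */

    c.type            = eax & 0x1f;                   /* EAX[4:0] */
    c.level           = (eax >> 5) & 0x7;             /* EAX[7:5] */
    c.max_sharing_ids = ((eax >> 14) & 0xfff) + 1;    /* EAX[25:14] + 1 */
    c.size_bytes      = line_size * partitions * ways * sets;
    return c;
}
```

As an illustration, `decode_cpuid4(0x0003C063, 0x03C0003F, 0x00003FFF)` describes a unified level-3 cache with 64-byte lines, 16 ways and 16384 sets (16 MiB), shareable by up to 16 logical CPUs. A guest that finds no level-3 entry has no cache level that spans the whole socket.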
>
> For KVM, this degrades performance, because the IPIs sent by the guest cause
> lots of vmexits.
>
> The workload is a SAP HANA test suite. We ran one round of it (about 40
> minutes) and observed the number of RES IPIs triggered in the (Suse11sp3)
> guest during that period:
>
>           No-L3       With-L3 (this patch applied)
> cpu0:     363890      44582
> cpu1:     373405      43109
> cpu2:     340783      43797
> cpu3:     333854      43409
> cpu4:     327170      40038
> cpu5:     325491      39922
> cpu6:     319129      42391
> cpu7:     306480      41035
> cpu8:     161139      32188
> cpu9:     164649      31024
> cpu10:    149823      30398
> cpu11:    149823      32455
> cpu12:    164830      35143
> cpu13:    172269      35805
> cpu14:    179979      33898
> cpu15:    194505      32754
> avg:      268963.6    40129.8
>
> The VM's topology is "1*socket 8*cores 2*threads".
> After presenting virtual L3 cache info to the VM, the number of RES IPIs in
> the guest is reduced by 85%.
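
The 85% figure can be checked against the "avg" row of the table. This is a small sketch of that arithmetic; the function name is hypothetical and the two inputs are simply the averages quoted above.

```c
/* Relative reduction, in percent, from a "before" count to an "after" count. */
static double reduction_percent(double before, double after)
{
    return (before - after) / before * 100.0;
}
```

Calling `reduction_percent(268963.6, 40129.8)` gives roughly 85.1, which matches the stated reduction.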
>
> What's more, for KVM, IPIs sent by vcpus cause vmexits, which are expensive.
> We also tested the overall system performance when the vcpus actually run on
> separate physical sockets. With the L3 cache, performance improves by
> 7.2%~33.1% (avg: 15.7%).
>
> Signed-off-by: Longpeng(Mike) <address@hidden>
>
The same address typo appears here as well.
Regards,
-Gonglei