Re: [Qemu-devel] [PATCH v2] rtc: placing RTC memory region outside BQL
From: Gonglei (Arei)
Subject: Re: [Qemu-devel] [PATCH v2] rtc: placing RTC memory region outside BQL
Date: Wed, 7 Feb 2018 11:14:47 +0000
> -----Original Message-----
> From: Peter Maydell [mailto:address@hidden]
> Sent: Tuesday, February 06, 2018 10:36 PM
> To: Gonglei (Arei)
> Cc: QEMU Developers; Paolo Bonzini; Huangweidong (C)
> Subject: Re: [PATCH v2] rtc: placing RTC memory region outside BQL
>
> On 6 February 2018 at 14:07, Gonglei <address@hidden> wrote:
> > As Windows guests use the RTC as their clock source device
> > and access it frequently, let's move the RTC memory
> > region outside the BQL to decrease overhead for Windows guests.
> > Meanwhile, add a new lock to prevent different vCPUs
> > from accessing the RTC concurrently.
> >
> > $ cat strace_c.sh
> > strace -tt -p $1 -c -o result_$1.log &
> > sleep $2
> > pid=$(pidof strace)
> > kill $pid
> > cat result_$1.log
> >
> > Before applying this change:
> > $ ./strace_c.sh 10528 30
> > % time     seconds  usecs/call     calls    errors syscall
> > ------ ----------- ----------- --------- --------- ----------------
> >  93.87    0.119070          30      4000           ppoll
> >   3.27    0.004148           2      2038           ioctl
> >   2.66    0.003370           2      2014           futex
> >   0.09    0.000113           1       106           read
> >   0.09    0.000109           1       104           io_getevents
> >   0.02    0.000029           1        30           poll
> >   0.00    0.000000           0         1           write
> > ------ ----------- ----------- --------- --------- ----------------
> > 100.00    0.126839                  8293           total
> >
> > After applying the change:
> > $ ./strace_c.sh 23829 30
> > % time     seconds  usecs/call     calls    errors syscall
> > ------ ----------- ----------- --------- --------- ----------------
> >  92.86    0.067441          16      4094           ppoll
> >   4.85    0.003522           2      2136           ioctl
> >   1.17    0.000850           4       189           futex
> >   0.54    0.000395           2       202           read
> >   0.52    0.000379           2       202           io_getevents
> >   0.05    0.000037           1        30           poll
> > ------ ----------- ----------- --------- --------- ----------------
> > 100.00    0.072624                  6853           total
> >
> > The number of futex calls decreases by ~90.6% on an idle Windows 7 guest.
>
> These are the same figures as from v1 -- it would be interesting
> to check whether the additional locking that v2 adds has affected
> the results.
>
Oh, yes. With v2 the futex count doesn't decline nearly as much as it did
with v1, because v2 now takes the BQL before raising the outbound IRQ line.
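
To make the locking pattern concrete, here is a minimal standalone sketch. It is not the patch's code and uses no QEMU APIs: plain pthread mutexes stand in for the BQL and for the new RTC lock, and every name (FakeRTC, big_lock, fake_rtc_*) is made up for illustration. Register accesses take only the device lock, while the IRQ path still has to take the big lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Stand-in for the BQL. Hypothetical, NOT QEMU's real lock or API. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
/* Per-thread flag: does the current thread hold big_lock? */
static _Thread_local bool big_lock_held;
/* Number of big_lock acquisitions, to make the cost visible. */
static unsigned long big_lock_acquisitions;

typedef struct {
    pthread_mutex_t lock;  /* device-local lock guarding the fields below */
    unsigned char reg_c;   /* illustrative status register */
    int irq_level;         /* illustrative outbound IRQ line state */
} FakeRTC;

static void big_lock_acquire(void)
{
    pthread_mutex_lock(&big_lock);
    big_lock_held = true;
    big_lock_acquisitions++;
}

static void big_lock_release(void)
{
    assert(big_lock_held);
    big_lock_held = false;
    pthread_mutex_unlock(&big_lock);
}

/* The IRQ path still needs the big lock: take it unless already held. */
static void fake_rtc_raise_irq(FakeRTC *s)
{
    bool taken = false;

    if (!big_lock_held) {
        big_lock_acquire();
        taken = true;
    }
    s->irq_level = 1;
    if (taken) {
        big_lock_release();
    }
}

/* Register read: only the device lock is needed. */
static unsigned char fake_rtc_read_status(FakeRTC *s)
{
    pthread_mutex_lock(&s->lock);
    unsigned char val = s->reg_c;
    pthread_mutex_unlock(&s->lock);
    return val;
}

/* Timer event: update state under the device lock, then raise the IRQ. */
static void fake_rtc_periodic_event(FakeRTC *s)
{
    pthread_mutex_lock(&s->lock);
    s->reg_c |= 0x80;
    fake_rtc_raise_irq(s);
    pthread_mutex_unlock(&s->lock);
}
```

In this sketch fake_rtc_read_status() never touches big_lock, but each fake_rtc_periodic_event() acquires it once, so a guest with a frequently firing RTC timer would still contend on the big lock.
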
Before applying v2:
# ./strace_c.sh 8776 30
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 78.01    0.164188          26      6436           ppoll
  8.39    0.017650           5      3700        39 futex
  7.68    0.016157           6      2758           ioctl
  5.48    0.011530           3      4586      1113 read
  0.30    0.000640          20        32           io_submit
  0.15    0.000317           4        89           write
------ ----------- ----------- --------- --------- ----------------
100.00    0.210482                 17601      1152 total
After applying v2:
# ./strace_c.sh 15968 30
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 78.28    0.171117          27      6272           ppoll
  8.50    0.018571           5      3663        21 futex
  7.76    0.016973           6      2732           ioctl
  4.85    0.010597           3      4115       853 read
  0.31    0.000672          11        63           io_submit
  0.30    0.000659           4       180           write
------ ----------- ----------- --------- --------- ----------------
100.00    0.218589                 17025       874 total
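
For comparison, the relative change in futex calls can be worked out from the counts quoted in this thread. This is only a small arithmetic sketch with a hypothetical helper name, and the v1 and v2 runs were taken on different QEMU processes, so the two percentages are indicative rather than strictly comparable.

```c
#include <assert.h>
#include <stdio.h>

/* Relative decrease, in percent, going from `before` to `after`. */
static double percent_decrease(unsigned long before, unsigned long after)
{
    assert(before > 0);
    return 100.0 * ((double)before - (double)after) / (double)before;
}

/* Print the futex deltas for the strace summaries quoted in this thread. */
static void print_futex_deltas(void)
{
    /* v1 figures: 2014 futex calls before, 189 after. */
    printf("v1: %.1f%%\n", percent_decrease(2014, 189));   /* v1: 90.6% */
    /* v2 figures: 3700 futex calls before, 3663 after. */
    printf("v2: %.1f%%\n", percent_decrease(3700, 3663));  /* v2: 1.0% */
}
```

So the futex count drops by about 1% with v2, against about 90.6% with v1.
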
> Does the patch improve performance in a more interesting use
> case than "the guest is just idle" ?
>
I think so; after all, the scope of the locking is reduced.
Besides this, could we optimize the RTC timer so that it runs in a
separate thread and avoids holding the BQL?
> > +static void rtc_rasie_irq(RTCState *s)
>
> Typo: should be "raise".
>
Good catch. :)
Thanks,
-Gonglei