[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] Make pmtimer tsc calibration not take 51 seconds to fail.
From: |
Daniel Kiper |
Subject: |
Re: [PATCH] Make pmtimer tsc calibration not take 51 seconds to fail. |
Date: |
Mon, 29 Jan 2018 17:42:26 +0100 |
User-agent: |
Mutt/1.5.21 (2010-09-15) |
On Fri, Jan 19, 2018 at 04:10:01PM -0500, Peter Jones wrote:
> On Thu, Jan 18, 2018 at 11:52:53PM +0100, Daniel Kiper wrote:
> > On Tue, Jan 16, 2018 at 01:16:17PM -0500, Peter Jones wrote:
> > > On my laptop running at 2.4GHz, if I run a VM where tsc calibration
> > > using pmtimer will fail presuming a broken pmtimer, it takes ~51 seconds
> > > to do so (as measured with the stopwatch on my phone), with a tsc delta
> > > of 0x1cd1c85300, or around 125 billion cycles.
> > >
> > > If instead of trying to wait for 5-200ms to show up on the pmtimer, we try
> > > to wait for 5-200us, it decides it's broken in ~0x7998f9e TSCs, aka ~2
> > > million cycles, or more or less instantly.
> > >
> > > Additionally, this reading the pmtimer was returning 0xffffffff anyway,
> > > and that's obviously an invalid return. I've added a check for that and
> > > 0 so we don't bother waiting for the test if what we're seeing is dead
> > > pins with no response at all.
> > >
> > > Signed-off-by: Peter Jones <address@hidden>
> > > ---
> > > grub-core/kern/i386/tsc_pmtimer.c | 43
> > > ++++++++++++++++++++++++++++++---------
> > > 1 file changed, 33 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/grub-core/kern/i386/tsc_pmtimer.c
> > > b/grub-core/kern/i386/tsc_pmtimer.c
> > > index c9c36169978..609402b8376 100644
> > > --- a/grub-core/kern/i386/tsc_pmtimer.c
> > > +++ b/grub-core/kern/i386/tsc_pmtimer.c
> > > @@ -38,30 +38,53 @@ grub_pmtimer_wait_count_tsc (grub_port_t pmtimer,
> > > grub_uint64_t start_tsc;
> > > grub_uint64_t end_tsc;
> > > int num_iter = 0;
> > > + int bad_reads = 0;
> > >
> > > - start = grub_inl (pmtimer) & 0xffffff;
> > > + start = grub_inl (pmtimer) & 0x3fff;
> >
> > I am not sure why you are changing this to 0x3fff...
>
> Upon re-reading, I am not either. (As you see I wrote this patch 2+
> months ago while working on something else, so the memory has already
> faded.)
Yeah, I know how it works...
> Obviously I need to make a version 2 of this patch to fix some issues.
Great!
> > > last = start;
> > > end = start + num_pm_ticks;
> > > start_tsc = grub_get_tsc ();
> > > while (1)
> > > {
> > > - cur = grub_inl (pmtimer) & 0xffffff;
> >
> > What about 24-bit timers? I would leave this here...
>
> I was blisfully unaware of 24-bit timers. I'll put that back and add a
> comment for future readers of the code, and check for 0xffffff below.
Thanks!
> > > + cur = grub_inl (pmtimer);
> > > +
> > > + /* If we get 10 reads in a row that are obviously dead pins,
> > > there's no
> > > + reason to do this thousands of times.
> > > + */
> > > + if (cur == 0xffffffff || cur == 0)
> >
> > ...and here I would check for 0xffffff and 0.
>
> Sure.
>
> >
> > > + {
> > > + bad_reads++;
> > > + grub_dprintf ("pmtimer", "cur: 0x%08x bad_reads: %d\n", cur,
> > > bad_reads);
> > > +
> > > + if (bad_reads == 10)
> > > + return 0;
> > > + }
> > > + else if (bad_reads)
> > > + bad_reads = 0;
> >
> > Do we really need to reset this?
>
> I mean, it's technically correct, but I don't think you're going to see
> 0 and 0xffffff as valid values often enough to actually care.
>
> > > + cur &= 0x3fff;
> > > +
> > > if (cur < last)
> > > - cur |= 0x1000000;
> > > + cur |= 0x4000;
> > > num_iter++;
>
> You didn't spot this, but it's wrong for exactly the same reason the
> initialization of start is. I'm pretty sure when I did this part, I was
> letting the comment below tell me what was going on, and completely
> missed that the comment doesn't match the conditional at all.
>
> > > if (cur >= end)
> > > {
> > > end_tsc = grub_get_tsc ();
> > > + grub_dprintf ("pmtimer", "tsc delta is 0x%016lx\n",
> > > + end_tsc - start_tsc);
> > > return end_tsc - start_tsc;
> > > }
> > > - /* Check for broken PM timer.
> > > - 50000000 TSCs is between 5 ms (10GHz) and 200 ms (250 MHz)
> > > - if after this time we still don't have 1 ms on pmtimer, then
> > > - pmtimer is broken.
> > > + /* Check for broken PM timer. 5000 TSCs is between 5us (10GHz) and
> > ^^^ 500ns?
> >
> > > + 200us (250 MHz). If after this time we still don't have 1us on
> > ^^^^^ 20us?
>
> You know, I stared at this for a surprising amount of time when writing
> it trying to figure out why these numbers seemed weird. While I have
> the diff in front of me rather than the code as it is now, I see that
> it's because I was starting with the comment that was previously there,
> which had 50M instead of 5M for the TSC count. So I wound up dividing
> these by 10k instead of 1k. Woops. Will fix in v2.
>
> >
> > > + pmtimer, then pmtimer is broken.
> > > */
> > > - if ((num_iter & 0xffffff) == 0 && grub_get_tsc () - start_tsc >
> > > 5000000) {
> > > - return 0;
> > > - }
> > > + end_tsc = grub_get_tsc();
> > > + if ((num_iter & 0x3fff) == 0 && end_tsc - start_tsc > 5000)
> >
> > Why 0x3fff here which means, AIUI, 4000 iterations?
>
> Er... 16384, surely? But basically I just wanted to scale down the
Yep, or I simply forgot 0x before 4000...
> limit by approximately the same order as I scaled down the TSC delta,
> and I wanted to leave the code basically working the same way (not for
> any good reason), so I still wanted a number with only low-order bits
> set. TSC is scaled down by 1000, so I scaled the PM tick iteration down
> by 1024. I could make it twice as wide of a window instead, if you're
> worried about that particular race, but I don't think it's going to
> matter.
>
> If our loop takes 1 TSC tick per iteration, which is quite impluasibly
> fast, and we're on a 10GHz machine, then we should normally see pmtimer
> change by 3580 in ~2800 iterations. If it takes 16k iterations (again,
I think that number of iterations is wrong here. Let's see...
f_pm = n_pm / t_pm, n_pm = 3580, f_pm = 3.58MHz => t_pm = 3580 / 3.58MHz
t_pm = 0,001s = 1ms => num_iter = 1ms * 10GHz = 10.000.000
Then below calculations are wrong too...
> with perfectly executing code that takes no time), the pmtimer is
> running at 625KHz instead of 3.58MHz. If we're on a 250MHz machine,
> that number is ~70 iterations. If we get to 16000 iterations without
> seeing pmtimer change by 3580 then pmtimer is either broken or we're on
> a 60GHz+ machine.
>
> That said the logic here did not match the comment and it still doesn't;
> it should be comparing cur and start, not the number of iterations. On
> the next version of the patch I'm going to do something like:
>
> if ((++num_iter > (grub_uint32_t)num_pm_ticks << 3UL) ||
Taking into account above I am not sure why you multiply by 8 here...
> (cur == start && end_tsc - start_tsc > 5000))
>
> So it'll have a fairly widely ranged total limit but also actually check if
> pmtimer results have changed at all.
>
> Tentative patch below, but I haven't had a chance to test it yet:
In general I am not against the idea itself but I think that some
calculations should be fixed here and there.
> From 97e59bbe000961db5f823c4df074835d1165bd9d Mon Sep 17 00:00:00 2001
> From: Peter Jones <address@hidden>
> Date: Tue, 7 Nov 2017 17:12:17 -0500
> Subject: [PATCH] Make pmtimer tsc calibration not take 51 seconds to fail.
>
> On my laptop running at 2.4GHz, if I run a VM where tsc calibration
> using pmtimer will fail presuming a broken pmtimer, it takes ~51 seconds
> to do so (as measured with the stopwatch on my phone), with a tsc delta
> of 0x1cd1c85300, or around 125 billion cycles.
>
> If instead of trying to wait for 5-200ms to show up on the pmtimer, we try
> to wait for 500ns-20us, it decides it's broken in ~0x7998f9e TSCs, aka ~2
> million cycles, or more or less instantly.
>
> Additionally, this reading the pmtimer was returning 0xffffffff anyway,
> and that's obviously an invalid return. I've added a check for that and
> 0 so we don't bother waiting for the test if what we're seeing is dead
> pins with no response at all.
> ---
> grub-core/kern/i386/tsc_pmtimer.c | 56
> ++++++++++++++++++++++++++++++++-------
> 1 file changed, 46 insertions(+), 10 deletions(-)
>
> diff --git a/grub-core/kern/i386/tsc_pmtimer.c
> b/grub-core/kern/i386/tsc_pmtimer.c
> index c9c36169978..b09c00316df 100644
> --- a/grub-core/kern/i386/tsc_pmtimer.c
> +++ b/grub-core/kern/i386/tsc_pmtimer.c
> @@ -37,8 +37,14 @@ grub_pmtimer_wait_count_tsc (grub_port_t pmtimer,
> grub_uint32_t cur, end;
> grub_uint64_t start_tsc;
> grub_uint64_t end_tsc;
> - int num_iter = 0;
> + unsigned int num_iter = 0;
> + int bad_reads = 0;
>
> + /* Some timers are 24-bit and some are 32-bit, but it doesn't make much
Please leave empty line after "/*".
> + * difference to us. Caring which one we have isn't really worth it since
> + * the low-order digits will give us enough data to calibrate TSC. So just
> + * mask the top-order byte off.
> + */
> start = grub_inl (pmtimer) & 0xffffff;
> last = start;
> end = start + num_pm_ticks;
> @@ -46,22 +52,52 @@ grub_pmtimer_wait_count_tsc (grub_port_t pmtimer,
> while (1)
> {
> cur = grub_inl (pmtimer) & 0xffffff;
> +
> + /* If we get 10 reads in a row that are obviously dead pins, there's no
> + reason to do this thousands of times.
Please be consistent with the comments format. I prefer one given above.
> + */
> + if (cur == 0xffffff || cur == 0)
> + {
> + bad_reads++;
> + grub_dprintf ("pmtimer", "pmtimer value: 0x%08x bad_reads: %d\n",
> + cur, bad_reads);
> +
> + if (bad_reads == 10)
> + return 0;
> + }
> + else if (bad_reads)
> + bad_reads = 0;
> +
> + end_tsc = grub_get_tsc();
> +
> if (cur < last)
> cur |= 0x1000000;
> - num_iter++;
> if (cur >= end)
> {
> - end_tsc = grub_get_tsc ();
> + grub_dprintf ("pmtimer", "tsc delta is 0x%016llx\n",
> + end_tsc - start_tsc);
> return end_tsc - start_tsc;
> }
> - /* Check for broken PM timer.
> - 50000000 TSCs is between 5 ms (10GHz) and 200 ms (250 MHz)
> - if after this time we still don't have 1 ms on pmtimer, then
> - pmtimer is broken.
> +
> + /* Check for broken PM timer. 5000 TSCs is between 500ns (10GHz) and
> + 20us (250 MHz). If after this time we still don't have 1us on
> + pmtimer, then pmtimer is broken.
> +
> + Likewise, if our code is perfectly efficient and introduces no delays
> + whatsoever, on a 10GHz system we should see a TSC delta of 3580 in
> + ~2800 iterations. On a 250MHz machine that should be ~70 iterations.
Incorrect calculations?
> + So there's a hard limit here at 8x our desired pm timer delta, because
> + if we get that far, we're either on a 60GHz+ machine or we're never
> + hitting our threshold.
And please use same comment format as in first one.
Thanks,
Daniel