Re: OpenBLAS and performance
From: Dave Love
Subject: Re: OpenBLAS and performance
Date: Fri, 22 Dec 2017 14:35:22 +0000
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)

For what it's worth, I get 37000 Mflops from the dgemm.goto benchmark
using the current Guix openblas and OPENBLAS_NUM_THREADS=1 at a size of
7000 on a laptop with "i5-6200U CPU @ 2.30GHz" (avx2). That looks about
right, and it should more-or-less plateau at that size. For comparison,
I get 44000 on a cluster node "E5-2690 v3 @ 2.60GHz" with its serial
build of 0.2.19. (I mis-remembered the sandybridge figures, which
should be low 20s, not high 20s.)
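
As a rough sanity check on the figures above, here is a minimal sketch that compares a measured serial dgemm rate with a nominal single-core peak. The flops-per-cycle values (8 for AVX without FMA, 16 for AVX2 with two FMA units) and the conventional 2n^3 flop count for dgemm are assumptions that are not stated in this message, and the function names are purely illustrative. The clock used is the nominal one from the CPU model string, so a ratio at or slightly above 1.0 probably just means the core was running above its nominal clock.

```python
# Assumed double-precision flops per cycle per core (not taken from the message):
#   avx  : one 4-wide add plus one 4-wide multiply per cycle (Sandy Bridge)
#   avx2 : two 4-wide FMAs per cycle, 2 flops each (Haswell, Skylake)
FLOPS_PER_CYCLE = {"avx": 8, "avx2": 16}


def peak_mflops(ghz, isa, cores=1):
    """Nominal peak in Mflops for the given clock (GHz), ISA and core count."""
    return ghz * 1000.0 * FLOPS_PER_CYCLE[isa] * cores


def dgemm_mflops(n, seconds):
    """Mflops for an n x n x n dgemm, using the conventional 2*n^3 flop count."""
    return 2.0 * n ** 3 / seconds / 1e6


def fraction_of_peak(measured_mflops, ghz, isa, cores=1):
    """Measured rate as a fraction of the nominal peak."""
    return measured_mflops / peak_mflops(ghz, isa, cores)


if __name__ == "__main__":
    # Measured figures and nominal clocks quoted in the message.
    cases = [
        ("i5-6200U @ 2.30GHz", 2.3, "avx2", 37000.0),
        ("E5-2690 v3 @ 2.60GHz", 2.6, "avx2", 44000.0),
    ]
    for name, ghz, isa, measured in cases:
        print(name,
              "nominal peak (Mflops):", round(peak_mflops(ghz, isa)),
              "measured/peak:", round(fraction_of_peak(measured, ghz, isa), 2))
```

A result far below the nominal peak at a size like 7000 would be the "something much different" case worth investigating.
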

If you see something much different, perhaps the performance counters
will give a clue, e.g. with Guix's scorep/cube, oprofile, or perf.

I've sent a patch for the correct cache size on haswell, but I don't
think it makes much difference in this case.