
Re: [Qemu-arm] [RFC PATCH v2 2/2] utils: Add prefetch for Thunderx platf


From: Richard Henderson
Subject: Re: [Qemu-arm] [RFC PATCH v2 2/2] utils: Add prefetch for Thunderx platform
Date: Wed, 17 Aug 2016 08:34:21 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0

On 08/16/2016 04:45 PM, Vijay Kilari wrote:
On Tue, Aug 16, 2016 at 11:32 PM, Richard Henderson <address@hidden> wrote:
On 08/16/2016 05:02 AM, address@hidden wrote:

+static inline void prefetch_vector_loop(const VECTYPE *p, int index)
+{
+#if defined(__aarch64__)
+    if (is_thunderx_pass2_cpu()) {
+        /* Prefetch 4 cache lines ahead from index */
+        VEC_PREFETCH(p, index + (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR * 4));
+    }
+#endif
+}


Oh come now.  This is even worse than before.  A function call protecting a
mere prefetch within the main body of an inner loop?

Did you not understand what I was asking for?

No. Could you please detail the problem?

The ThunderX check, *if it even needs to exist at all*, must happen outside the loop, preferably only once, at startup time.
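
To make the shape concrete, here is a minimal sketch of hoisting the check, not the actual patch. The names cpu_wants_prefetch(), prefetch_init(), scan_plain(), scan_prefetch() and find_first_nonzero() are hypothetical, PREFETCH_AHEAD is an arbitrary illustrative distance, and __builtin_prefetch assumes GCC or Clang. The CPU test runs once, and the inner loops contain no test at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative prefetch distance in elements; an assumption, not a tuned value. */
#define PREFETCH_AHEAD 32

/* Hypothetical stand-in for the patch's is_thunderx_pass2_cpu().
 * A real version would identify the CPU; this one always says no. */
static bool cpu_wants_prefetch(void)
{
    return false;
}

static bool want_prefetch;        /* cached answer */
static bool want_prefetch_known;  /* set once prefetch_init() has run */

/* Call once at startup, before any scan. */
static void prefetch_init(void)
{
    want_prefetch = cpu_wants_prefetch();
    want_prefetch_known = true;
}

/* Returns the index of the first nonzero element, or n if there is none. */
static size_t scan_plain(const uint64_t *p, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (p[i] != 0) {
            return i;
        }
    }
    return n;
}

/* Same result as scan_plain(), with a prefetch and no CPU check in the loop.
 * A real loop would be unrolled and prefetch once per cache line. */
static size_t scan_prefetch(const uint64_t *p, size_t n)
{
    size_t i = 0;

    /* Main part: the prefetched address always lies inside the buffer. */
    for (; i + PREFETCH_AHEAD < n; i++) {
        __builtin_prefetch(&p[i + PREFETCH_AHEAD]);
        if (p[i] != 0) {
            return i;
        }
    }
    /* Tail: nothing left to prefetch. */
    for (; i < n; i++) {
        if (p[i] != 0) {
            return i;
        }
    }
    return n;
}

/* The decision is made once per call, outside both inner loops. */
static size_t find_first_nonzero(const uint64_t *p, size_t n)
{
    assert(want_prefetch_known);
    return want_prefetch ? scan_prefetch(p, n) : scan_plain(p, n);
}
```

A caller would run prefetch_init() during startup and then use find_first_nonzero() everywhere. A function pointer chosen at init time would work equally well.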

I strongly suspect that you do not need any check at all: even for CPUs which automatically detect the streaming loop, adding a prefetch will not hurt.

You should repeat your same benchmark, with and without the prefetch, on (1) an A57 or suchlike, and (2) an x86 of some variety.
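
A rough sketch of such a comparison, assuming GCC or Clang for __builtin_prefetch and the inline asm barrier. It is not the benchmark used in this thread: bench_plain(), bench_prefetch(), time_scan() and BENCH_AHEAD are hypothetical names, and clock() is only a coarse timer. It makes no claim about which variant wins on any CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Illustrative prefetch distance in elements; an assumption to be tuned. */
#define BENCH_AHEAD 32

typedef size_t (*scan_fn)(const uint64_t *p, size_t n);

/* Baseline: index of the first nonzero element, or n if there is none. */
static size_t bench_plain(const uint64_t *p, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (p[i] != 0) {
            return i;
        }
    }
    return n;
}

/* Same scan with an unconditional prefetch and no CPU check. */
static size_t bench_prefetch(const uint64_t *p, size_t n)
{
    size_t i = 0;

    for (; i + BENCH_AHEAD < n; i++) {
        __builtin_prefetch(&p[i + BENCH_AHEAD]);
        if (p[i] != 0) {
            return i;
        }
    }
    for (; i < n; i++) {
        if (p[i] != 0) {
            return i;
        }
    }
    return n;
}

/* Runs fn over the buffer reps times and returns the CPU seconds used.
 * The result of the last scan is stored in *result. */
static double time_scan(scan_fn fn, const uint64_t *p, size_t n,
                        int reps, size_t *result)
{
    size_t last = 0;
    clock_t start, end;

    assert(reps > 0);
    start = clock();
    for (int r = 0; r < reps; r++) {
        last = fn(p, n);
        /* Compiler barrier so the scan is not hoisted out of the loop. */
        __asm__ volatile("" ::: "memory");
    }
    end = clock();

    *result = last;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Call time_scan() with each function on a buffer much larger than the last-level cache and compare the two times on each machine. Both variants must return the same index for the comparison to mean anything.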


r~


