Am 22.03.2013 20:37, schrieb Eric Blake:
On 03/22/2013 06:46 AM, Peter Lieven wrote:
this adds buffer_find_nonzero_offset() which is a SSE2/Altivec
optimized function that searches for non-zero content in a
buffer.
due to the optimizations used in the function there are restrictions
on buffer address and search length. the function
can_use_buffer_find_nonzero_content() can be used to check if
the function can be used safely.
Signed-off-by: Peter Lieven <address@hidden>
---
include/qemu-common.h | 13 +++++++++++++
util/cutils.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 58 insertions(+)
+#define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 8
+static inline bool
+can_use_buffer_find_nonzero_offset(const void *buf, size_t len)
+{
+ if (len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
+ * sizeof(VECTYPE)) == 0
+ && ((uintptr_t) buf) % sizeof(VECTYPE) == 0) {
I know that emacs tends to indent the second line to the column after
the ( that it is associated with, as in:
+ if (len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
+ * sizeof(VECTYPE)) == 0
+ && ((uintptr_t) buf) % sizeof(VECTYPE) == 0) {
But since checkpatch.pl didn't complain, and since I'm not sure if there
is a codified qemu indentation style, and since I _am_ sure that not
everyone uses emacs [hi, vi guys], it's not worth respinning. A
maintainer can touch it up if desired.
Actually, I was totally unsure how to indent this. Maybe just give
a hint what you would like to see. As I will replace patch 4 with
an earlier version that is not vector optimized, but uses loop unrolling,
I will have to do a v5 and so I can fix this.
+
+size_t buffer_find_nonzero_offset(const void *buf, size_t len)
+{
+ VECTYPE *p = (VECTYPE *)buf;
+ VECTYPE zero = ZERO_SPLAT;
+ size_t i;
+
+ assert(len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
+ * sizeof(VECTYPE)) == 0);
+ assert(((uintptr_t) buf) % sizeof(VECTYPE) == 0);
I would have written this:
assert(can_use_buffer_find_nonzero_offset(buf, len));
Good point. Will be changed in v5.
But that's cosmetic, and compiles to the same code, so it's not worth a
respin.
You've addressed my concerns on v3.
Reviewed-by: Eric Blake <address@hidden>
Peter