Re: Adding slice-by-4 and slice-by-8 to CRC32
From: Jeffrey Walton
Subject: Re: Adding slice-by-4 and slice-by-8 to CRC32
Date: Mon, 14 Oct 2024 18:02:48 -0400
On Mon, Oct 14, 2024 at 5:11 PM Sam Russell <sam.h.russell@gmail.com> wrote:
>
> One issue I've noticed is that the crc functions take a char* and while this
> is ok in other implementations, gnulib appears to enforce strict alignment
> (presumably for portability purposes).
>
> Because of this, we might need to introduce a new function that takes an
> unsigned long long* (and then _m128* when we do the SSE4.1 option) because
> casting a char* doesn't work.
>
> Do you know of any workarounds here? memcpy() to an unsigned long long*
> solves the alignment problem but adds a performance overhead. If the worst
> case is better than the existing algorithm then it might be worth it
> though... It would make sense to also offer a second function to allow
> callers to dump in a block of guaranteed-aligned memory.
GCC can usually elide the memcpy if it can determine that the wider
load would be aligned, so there's usually no performance hit in
practice. You need to check the generated code to be sure.

But if you skip the memcpy and cast the pointer instead, you risk the
code being removed or miscompiled, because the cast violates the
type-punning (strict aliasing) rules, a.k.a. undefined behavior.
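
For illustration, here is a minimal sketch of the memcpy approach. The
names load_u64 and xor_words are hypothetical and are not part of
gnulib's crc module; the XOR loop is only a stand-in for a real
slice-by-8 inner loop. Whether the memcpy collapses to a single load
depends on the compiler, optimization level, and target, so the
generated code still needs checking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 8 bytes from a buffer of any alignment.
   memcpy is well defined here; a cast to uint64_t * would not be.  */
static inline uint64_t
load_u64 (const char *p)
{
  uint64_t v;
  memcpy (&v, p, sizeof v);
  return v;
}

/* Hypothetical stand-in for a slice-by-8 loop: XOR together every
   complete 8-byte word of BUF.  Trailing bytes are ignored.  */
static uint64_t
xor_words (const char *buf, size_t len)
{
  uint64_t acc = 0;

  assert (buf != NULL || len == 0);
  for (; len >= sizeof acc; buf += sizeof acc, len -= sizeof acc)
    acc ^= load_u64 (buf);
  return acc;
}
```

Compiling this with something like "gcc -O2 -S" and reading the
assembly for xor_words shows whether the compiler emitted one wide
load per iteration or fell back to byte-wise copies.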
Jeff