qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2] utils: Add pow2ceil()


From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH v2] utils: Add pow2ceil()
Date: Tue, 24 Feb 2015 10:39:58 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

Eric Blake <address@hidden> writes:

> On 02/23/2015 10:40 AM, Markus Armbruster wrote:
>
>>>> int64_t pow2floor(int64_t value)
>>>> {
>>>>     assert(value > 0);
>>>>     return 0x8000000000000000u >> clz64(value);
>>>> }
>>>
>>> Needs to be 0x8000000000000000ull for 32-bit machines to compile correctly.
>> 
>> Why?
>
> Because 0x8000000000000000u is only required to be a 'long', and on
> 32-bit machines, your constant would overflow long.  See, for example,
> commit 5cb6be2ca.  You NEED the 'll' suffix to ensure that the compiler
> doesn't reject the constant as an overflow.

Not true.

If you're not interested in whether it is true or not, only what suffix
we should use, skip to "Commit 5cb6be2ca".

If you trust me as a language lawyer, you can skip over my experiment to
"Why is this so".

---<lit.c>---
#include <stdio.h>

#define HAS_LIT_TYPE(lit, type) \
    __builtin_types_compatible_p(type, typeof(lit))

#define LIT_TYPE_NAME1(lit, type) \
    (HAS_LIT_TYPE(lit, type) ? #type : NULL)

#define LIT_INT_TYPE_NAME(lit)                  \
    (LIT_TYPE_NAME1(lit, int)                   \
     ?: LIT_TYPE_NAME1(lit, unsigned)           \
     ?: LIT_TYPE_NAME1(lit, long)               \
     ?: LIT_TYPE_NAME1(lit, unsigned long)      \
     ?: LIT_TYPE_NAME1(lit, long long)          \
     ?: LIT_TYPE_NAME1(lit, unsigned long long) \
     ?: "unknown")

#define PR_LIT_INT_TYPE(lit) \
    printf("%-24s %s\n", #lit, LIT_INT_TYPE_NAME(lit));

int
main(void)
{
    PR_LIT_INT_TYPE(0x8000000000000000);
    PR_LIT_INT_TYPE(0x8000000000000000u);
    PR_LIT_INT_TYPE(0x8000000000000000ull);
    return 0;
}
---<End of lit.c>---

$ gcc -o lit64 -g -O -Wall lit.c
$ gcc -m32 -o lit32 -g -O -Wall lit.c
$ ./lit32
0x8000000000000000       unsigned long long
0x8000000000000000u      unsigned long long
0x8000000000000000ull    unsigned long long
$ ./lit64
0x8000000000000000       unsigned long
0x8000000000000000u      unsigned long
0x8000000000000000ull    unsigned long long

Why is this so?  The type of an integer constant depends on its value,
its suffix, and whether it's decimal or not.  ISO/IEC 9899:1999
ยง6.4.4.1:

       [#5]  The  type  of  an integer constant is the first of the
       corresponding list in which its value can be represented.

                    |                       | Octal or Hexadecimal
       Suffix       |   Decimal Constant    |       Constant
       -------------+-----------------------+-----------------------
       none         |int                    |int
                    |long int               |unsigned int
                    |long long int          |long int
                    |                       |unsigned long int
                    |                       |long long int
                    |                       |unsigned long long int
       -------------+-----------------------+-----------------------
       u or U       |unsigned int           |unsigned int
                    |unsigned long int      |unsigned long int
                    |unsigned long long int |unsigned long long int
       -------------+-----------------------+-----------------------
       l or L       |long int               |long int
                    |long long int          |unsigned long int
                    |                       |long long int
                    |                       |unsigned long long int
       -------------+-----------------------+-----------------------
       Both u or U  |unsigned long int      |unsigned long int
       and l or L   |unsigned long long int |unsigned long long int
       -------------+-----------------------+-----------------------
       ll or LL     |long long int          |long long int
                    |                       |unsigned long long int
       -------------+-----------------------+-----------------------
       Both u or U  |unsigned long long int |unsigned long long int
       and ll or LL |                       |

A hexadecimal constant without a suffix can have a signed type, which
can surprise the unwary programmer.  But if the suffix contains u, the
constant has an unsigned type.

0x8000000000000000u needs 64 bits to be representable.  Its type is the
narrowest of unsigned, unsigned long, unsigned long long that can
represent it.

Real-world examples:

    data model | int  long  long long | type of 0x8000000000000000u
    LLP64      |  32    32         64 | unsigned long long
    LP64       |  32    64         64 | unsigned long
    IILP64     |  64    64         64 | unsigned

Commit 5cb6be2ca fixes up hexadecimal literals without a suffix.  For
what it's worth, I can't reproduce the warning with gcc -m32.  Whether
and where u-suffixed literals trigger the warning I can't say.

If they do, we'll want whatever suffix it takes.

If they don't, you could argue for suffix ull to avoid ambiguity.
Matter of taste.  I don't particularly care, but the reason is style,
not "won't compile".

>>> Why is the parameter int64_t?  Wouldn't it be more useful to have:
>>>
>>> uint64_t pow2floor(uint64_t value)
>> 
>> Crossed my mind, too.  However, the existing callers pass *signed*
>> arguments.
>
> I guess it means auditing callers either way.
>
>>>> int64_t pow2ceil(int64_t value)
>>>> {
>>>
>>> Again, why allow signed inputs?
>>>
>>>>     assert(value <= 0x4000000000000000)
>>>>     if (value <= 1)
>>>>    return 1;
>>>
>>> In particular, this slams all negative values to a result of 1, which
>>> doesn't necessarily make sense.
>> 
>> It implements a straightforward contract: return the smallest power of
>> two greater or equal to the argument.  The function's domain is the set
>> of int64_t arguments where this value can be represented in int64_t,
>> i.e. [-2^63..2^62].
>> 
>> Feel free to suggest a more sensible contract.
>
> But why should I claim that the nearest power of 2 greater than -3 is
> positive 1, when I could argue that it should instead be -2 (nearest
> positive or negative power of 2 rounding towards +inf) or -4 (nearest
> positive or negative power of 2 rounding away from 0)?  Since there are
> multiple potential contracts once negative values are involved, I find
> it easier to just make the contract require a positive input to begin with.

-2 is not a power of two!  Ask any mathematician :)

"Round down to the nearest power of 2" is a pefectly obvious function
contract.  The function's domain is the representable numbers >= 1.

Same for "round up", domain is representable numbers <= largest
representable power of two.

My implementations assert the actual argument is in the domain.

If you want to change the two functions' contracts to return the nearest
positive or negative power of two rounding towards whatever, I don't
mind as long as you explicitly write it into their contract, because it
sure isn't obvious from the function names.

If you want to change the functions' parameter and return type to
uint64_t, I don't mind, either.  However, you'll have to handle domain
errors in both functions all the same.

An audit of the callers may be advisable regardless of what we do.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]