lmi
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lmi] Using unnsigned (was: More Clang fixes)


From: Vadim Zeitlin
Subject: Re: [lmi] Using unnsigned (was: More Clang fixes)
Date: Mon, 28 Mar 2016 17:07:19 +0200

On Mon, 28 Mar 2016 14:41:32 +0000 Greg Chicares <address@hidden> wrote:

GC> On 2016-03-27 21:24, Vadim Zeitlin wrote:
GC> > 
GC> >> 2. Using unsigned does not allow for the negative value checking without
GC> >> first converting back to signed.
GC> > 
GC> >  This is neither here nor there because unsigned is only used for values
GC> > which are never negative, so you never need to check for this and thus 
it's
GC> > irrelevant that you can't. Equivalently, if you need to check whether a
GC> > value is negative, do use signed for it, of course.
GC> 
GC> In this case (citing the code before yesterday's changes):
GC>   unsigned int const sel = GetSelection();
GC>   if(!(0 <= sel && sel < axis_.GetCardinality())) // ... diagnosed error
GC> I think this is a real concern.

 I agree, but for me this is due to the wrong "sel" variable type. If it
were correctly declared as "int" or -- better -- "auto", this wouldn't have
happened.

GC> It's not immediately obvious whether
GC> GetSelection() or GetCardinality() can return only nonnegative values.

 Again, I agree, but the problem is that GetSelection() is a bad API. So
the choice we have is to make all the other, newer APIs bad as well for
consistency with it or just live with this one being bad, but make all the
others do the right thing. I think the latter choice is more fruitful.

GC> GetSelection() and GetCardinality() cannot meaningfully return a negative
GC> selection or a negative cardinality.

 Yes. Which is why GetSelection() is wrong.

GC> We need signed int not to represent such impossible
GC> things, but only to allow special values indicating exceptional conditions:

 I strongly feel that we do _not_ need signed int for this but another way
of indicating such exceptional conditions (there are several, using
optional<int> is the most elegant IMO, but not the only one).

GC> >> 3. Using unsigned only increases the size of an integer by 1 bit.
GC> > 
GC> >  Completely irrelevant, nobody uses unsigned because of this.
GC> 
GC> Allocation functions use unsigned arguments exactly because of this.
GC> Otherwise, an array of size 2^16-1 couldn't be allocated on a 16-bit
GC> machine. That's why size_t is unsigned; and therefore functions like
GC> size() are, as well.

 Sorry, I should have been more precise: "nobody uses unsigned" in design
of APIs not directly related to memory allocation because of this.

GC> But for APIs that return, say, the current selection in a wxChoice,
GC> this is no reason to prefer unsigned: even on a 16-bit machine, we
GC> won't offer 2^16 selections in a wxChoice, or even half that number.

 Exactly, there is no reason to prefer unsigned _due_to_this_reason_ and
nobody has ever used it to justify using unsigned.

GC> We could have identified the same issue with gcc. We didn't, only because
GC> we had its applicable warning turned off. We had turned it off because it
GC> had been observed to issue too many errors when compiling some earlier
GC> version of wx with some earlier version of gcc. When you questioned that
GC> yesterday, I enabled gcc's warning and found that it identified the same
GC> issues as Clang.

 Yes, sure, gcc has been giving this warnings since quite some time too.
Even MSVC does it, this is not at all clang-specific and I've updated the
thread subject to indicate it.

GC> Aside from those mechanical considerations, comparing GetCardinality()
GC> and GetSelection() is innately difficult because of their differing
GC> return types. Tools that identify mixed comparisons can show us where we
GC> need to expend extra thought, but cannot save us from that expenditure.
GC> Use signed integers in the interface as a general rule avoids these
GC> problems and lets us write simpler code, faster.

 This would result in reducing everything to the lowest common denominator
of C-like APIs.


GC> >  So, on balance, I only see one tiny argument not to use unsigned. OTOH
GC> > there are two big arguments in favour of using it, of very different
GC> > varieties:
GC> > 
GC> > 1. Theoretic one: it provides more semantic information which is always
GC> >    a good thing. It also reduces cognitive dissonance which happens when
GC> >    something that obviously can't be negative is represented by a signed
GC> >    type.
GC> 
GC> Yet consider GetCardinality() and GetSelection() again from this POV.

 Again, we just keep returning to the fact that GetSelection() chose the
simplest solution and not the best one. Quite independently of everything
else, I'm completely sure that doing it like this was a huge mistake
because half of the uses of GetSelection() (in this and many other wx
classes) don't check for wxNOT_FOUND correctly and blithely use it as an
index, resulting in wx asserts at best and crashes (when it's used as an
index into an array) at worst. We should have had written it as "bool
GetSelection(unsigned* sel) const" instead to strongly hint that its return
value must be checked. Of course, if we had C++17 back in the nineties,
using std::optional<> would have been even better.

GC> > 2. Practical one: unsigned type (size_t) is used everywhere in the 
standard
GC> >    library and if the rest of the code uses signed types, it requires
GC> >    converting, implicitly (dangerous; results in warnings) or explicitly
GC> >    (annoying; can still be dangerous) back and forth between them.
GC> 
GC> size_t is unsigned for the valid reason given above.

 Whatever the reason is, writing static_cast<>s everywhere is really not
appealing.

GC> >  FWIW in my ideal UI library API this function wouldn't return neither
GC> > unsigned nor signed int but rather optional<unsigned> value from here.
GC> 
GC> Another digression: APL would use an "empty vector" here, with much the same
GC> meaning as your ideal. That's built into the language: in that respect it's
GC> more like nullptr than a default-constructed std::vector<> with zero 
elements.

 APL was just too much in advance for its time (only half joking). Since
then, using Maybe (Haskell), Option (OCaml, Rust), ... or optional (C++17)
has become the widely acknowledged best way to deal with this problem.

GC> In the C family, it's customary to use "impossible" negative values to
GC> represent errors:

 I would say "in C" instead of "C family".

GC> Two options:
GC> 
GC> - Return unsigned when we can, and signed when we know that we'll want to
GC> represent errors in "special" values--even though the function is 
semantically
GC> nonnegative. In that case, we have to look up the return values to 
distinguish
GC> these two possibilities.
GC> 
GC> - Lakos's option: always return signed for simplicity and compatibility.

 My option is the third one, as described above, i.e. return unsigned
whenever possible and do something else (but *not* return signed int) when
it isn't. Its big advantage is not that it avoids signed/unsigned
comparison warnings, which are not that important in the great scheme of
things, but that it forces the caller to check for the invalid value.
Lakos's option does less than nothing here: at least, when a function
returns signed and you start comparing its result with unsigned, you will
get warnings which will hopefully help you to do the right thing. But if
unsigned is not used, even this thin safety net is not there any more and
it's all too easy to go and compare GetSelection() with GetCardinality()
and get a wrong answer when the former is -1.

 I hope my arguments can convince you. They definitely -- even if
unsurprisingly -- seem convincing to me...

VZ

reply via email to

[Prev in Thread] Current Thread [Next in Thread]