Re: HEAD: inclusion order wrong for input.c
From: Gary V. Vaughan
Subject: Re: HEAD: inclusion order wrong for input.c
Date: Tue, 3 Apr 2007 10:44:25 +0100
Hi Eric,
On 3 Apr 2007, at 03:53, Eric Blake wrote:
> According to Gary V. Vaughan on 4/2/2007 4:37 PM:
>>> Cast the subscript to unsigned char before using it as index.
>>> Otherwise, on a system where char is signed, and its high bit is set,
>>> and you haven't adjusted the array range to allow for negative values,
>>> fun will ensue.
>>
>> If the table value for META-^A is held at element 128 of the array
>> (since the table was built assuming char* is unsigned by default), and
>> we compile on a host with signed chars, does the signed char value of
>> META-^A still become 128 when cast to unsigned char? Or does 2's
>> complement come into play and scramble the order of the negative
>> signed char values when casting them before doing a table lookup?
>
> As long as the table is handled consistently (in other words, as long as
> ALL uses of characters as indices occur as unsigned char or within
> to_uchar), then META-^A (usually encoded as -128 in signed char) will
> always appear at the same index, regardless of whether that index is 128
> (as it will be on 2's complement machine; the bulk of what exists today),
> or 255 (which is what (unsigned char) -128 might become on a 1's
> complement machine, mostly theoretical). You only run into the bug that
> you were describing if you also reference the array based on a given
> integer encoding of characters.
My point exactly. Here's a violation of that consistency in syntax.c:
    m4_syntax_table *
    m4_syntax_create (void)
    {
      m4_syntax_table *syntax = xzalloc (sizeof *syntax);
      int ch;

      /* Set up default table.  This table never changes during operation.  */
      for (ch = 256; --ch >= 0;)
        switch (ch)
          {
          case '(':
            syntax->orig[ch] = M4_SYNTAX_OPEN;
            break;
In this case, we let a possibly signed literal char self promote
to an int, but assume that those values with the high bit set will map
correctly when manually fed through to_uchar when we do lookups in that
table.
In practice, we don't have any case statements for high-bit-set chars
inside the switch, so it hasn't caught us out. Even so, with portable
defensive coding style, it seems better to use the same method of
dereferencing indices when building the table as when looking up entries
in it... I've probably made this same bad assumption in a few other
places where I wrote code to do table lookups for char values :-(
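
A minimal sketch of the consistent style described above, where every store and every lookup goes through the same conversion. The to_uchar helper here is a hypothetical stand-in for the one mentioned earlier in the thread, and the table and category codes are made up for illustration; none of this is taken from syntax.c. It also assumes 8-bit chars.

```c
#include <limits.h>

/* Hypothetical stand-in for the to_uchar helper named in the thread:
   convert a (possibly signed) char to its unsigned char value.  */
static unsigned char
to_uchar (char ch)
{
  return (unsigned char) ch;
}

/* Made-up category codes, for illustration only.  */
enum { SYNTAX_OTHER = 0, SYNTAX_OPEN = 1, SYNTAX_HIGH = 2 };

/* One slot per unsigned char value; zero-initialized to SYNTAX_OTHER.  */
static unsigned char table[UCHAR_MAX + 1];

/* Build the table using the same index conversion as the lookups.  */
static void
table_init (void)
{
  table[to_uchar ('(')] = SYNTAX_OPEN;
  /* '\200' has the high bit set; it is negative where char is signed.  */
  table[to_uchar ('\200')] = SYNTAX_HIGH;
}

/* Look up a character, again through to_uchar.  */
static int
table_lookup (char ch)
{
  return table[to_uchar (ch)];
}
```

Because both the store and the lookup use to_uchar, the entry for a high-bit-set character is found whether the host's plain char is signed or unsigned. Indexing with the raw char value in one place and with to_uchar in another is the mismatch to avoid.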
Cheers,
Gary
--
())_. Email me: address@hidden
( '/ Read my blog: http://blog.azazil.net
/ )= ...and my book: http://sources.redhat.com/autobook
`(_~)_ Join my AGLOCO Network: http://www.agloco.com/r/BBBS7912