[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[gnuastro-commits] master 213a1976: Library (type.h): signed ints, same
From: |
Mohammad Akhlaghi |
Subject: |
[gnuastro-commits] master 213a1976: Library (type.h): signed ints, same width preferred in reading strings |
Date: |
Wed, 16 Feb 2022 19:54:52 -0500 (EST) |
branch: master
commit 213a1976ccc27cda0e8bc648ae525800e7c1a937
Author: Mohammad Akhlaghi <mohammad@akhlaghi.org>
Commit: Mohammad Akhlaghi <mohammad@akhlaghi.org>
Library (type.h): signed ints, same width preferred in reading strings
Until now, when reading a number from a string if it was positive, we would
only assign it to an un-signed integer type: the number of bits in the
integer was determined based on its largest possible value. However, in C's
internal automatic type conversion (when dealing with binary operators like
'+' or '>'), when an un-signed and signed integer of same width is given to
a binary operator, they are both converted to the unsigned type.
This would produce unexpected results! Because a signed integer can have a
value like '-1'. But when this value is converted to an unsigned integer of
the same width, it becomes the maximum value! This issue can be
demonstrated with the commands below:
astarithmetic 200 200 2 makenew 5 mknoise-sigma int32 \
--output=int-noise.fits
astarithmetic int-noise.fits 100000 gt --output=thresh.fits
The standard deviation of 'int-noise.fits' is 100, so its maximum will be
around 500 (5-sigma). Therefore, none of the pixels should be larger than
100000 and 'thresh.fits' should only have a value of 0. However, until this
commit 'thresh.fits' would have many '1'-valued pixels.
With this commit, this problem has been addressed in three levels:
- In 'gal_type_string_to_number' (where we read a number from the
command-line string that the user has given), positive values aren't
only given to unsigned types. If a positive value is smaller than the
largest possible signed value, it will be given a signed type.
- In 'arithmetic_binary', a new sanity check has been added: when the
inputs have the same width and are integers, but with different signs, a
warning will be printed to fully elaborate the problem to the user and
propose solutions.
This bug was reported by Zohreh Ghaffari and Pedram Ashofteh Ardakani.
This fixes bug #62069.
---
NEWS | 4 +++
lib/arithmetic.c | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
lib/type.c | 48 ++++++++++++++++++++++++++++--------
3 files changed, 117 insertions(+), 10 deletions(-)
diff --git a/NEWS b/NEWS
index 5835f885..491a1514 100644
--- a/NEWS
+++ b/NEWS
@@ -199,6 +199,10 @@ See the end of the file for license conditions.
Infante-Sainz.
bug #62054: Crash in Table's '--catrowfile' when string column is present;
reported by Manuel Sánchez-Benavente.
+ bug #62069: Wrong Arithmetic result in binary operators when both input
+ operands are integers of same width, but different sign (for
+ example 'int32' and 'uint32'); reported by Zohreh Ghaffari
+ and Pedram Ashofteh Ardakani.
diff --git a/lib/arithmetic.c b/lib/arithmetic.c
index a810bc0f..f3806ae9 100644
--- a/lib/arithmetic.c
+++ b/lib/arithmetic.c
@@ -1694,6 +1694,76 @@ arithmetic_binary_out_type(int operator, gal_data_t *l,
gal_data_t *r)
+/* Binary arithmetic's type checks: According to C's automatic type
+ conversion in binary operators, the unsigned types have higher
+ precedence for the same width. As a result, something like the following
+ will prove correct (the value after 'check:' will be '1').
+
+ #include <stdio.h>
+ #include <stdlib.h>
+ #include <stdint.h>
+
+ int
+ main(void)
+ {
+ int32_t a=-50;
+ uint32_t b=10000;
+
+ uint8_t o=a>b;
+ printf("check: %u\n", o);
+ return 0;
+ }
+
+ To avoid this situation, it is therefore necessary to print a message
+ and let the user know that strange situations like above may occur. Just
+ note that this won't happen if 'a' and 'b' have different widths: such
+ that this will work fine: 'int8_t a=-1; uint16_t b=50000'. */
+static void
+arithmetic_binary_int_sanity_check(gal_data_t *l, gal_data_t *r,
+ int operator)
+{
+ /* Variables to simplify the checks. */
+ int l_is_signed=0, r_is_signed=0;
+
+ /* Checks are only necessary for same-width types. */
+ if( gal_type_sizeof(l->type)==gal_type_sizeof(r->type) )
+ {
+ /* No checks needed if atleast one of the inputs is a float. */
+ if( l->type==GAL_TYPE_FLOAT32 || l->type==GAL_TYPE_FLOAT64
+ || r->type==GAL_TYPE_FLOAT32 || r->type==GAL_TYPE_FLOAT64 )
+ return;
+ else
+ {
+ if( l->type==GAL_TYPE_INT8 || l->type==GAL_TYPE_INT16
+ || l->type==GAL_TYPE_INT32 || l->type==GAL_TYPE_INT64 )
+ l_is_signed=1;
+ if( r->type==GAL_TYPE_INT8 || r->type==GAL_TYPE_INT16
+ || r->type==GAL_TYPE_INT32 || r->type==GAL_TYPE_INT64 )
+ r_is_signed=1;
+ if( l_is_signed!=r_is_signed )
+ error(EXIT_SUCCESS, 0, "the two integer operands given "
+ "to '%s' have the same width, but a different sign: "
+ "the first popped operand has type '%s' and the "
+ "second has type '%s'. This may create unexpected "
+ "results if the signed input contains negative "
+ "values. To address this problem there are two "
+ "options: 1) if you know that the signed input can "
+ "only have positive values, use Arithmetic's type "
+ "conversion operators to convert it to an un-signed "
+ "type of the same width (e.g., 'uint8', 'uint16', "
+ "'uint32' or 'uint64'). 2) Convert the unsigned input "
+ "to a signed one of the next largest width with the "
+ "type conversion operators (e.g., 'int16', 'int32' "
+ "or 'int64')", gal_arithmetic_operator_string(operator),
+ gal_type_name(r->type, 1), gal_type_name(l->type, 1));
+ }
+ }
+}
+
+
+
+
+
static gal_data_t *
arithmetic_binary(int operator, int flags, gal_data_t *l, gal_data_t *r)
{
@@ -1713,6 +1783,11 @@ arithmetic_binary(int operator, int flags, gal_data_t
*l, gal_data_t *r)
"have the same dimension/size", __func__,
gal_arithmetic_operator_string(operator));
+ /* Print a warning if the inputs are both integers, but have different
+ signs (the user needs to know that the output may not be what they
+ expect!).*/
+ arithmetic_binary_int_sanity_check(l, r, operator);
+
/* Set the output type. For the comparison operators, the output type is
either 0 or 1, so we will set the output type to 'unsigned char' for
diff --git a/lib/type.c b/lib/type.c
index 89e8e501..50d5a6b0 100644
--- a/lib/type.c
+++ b/lib/type.c
@@ -560,6 +560,7 @@ gal_type_from_string(void **out, char *string, uint8_t type)
void *
gal_type_string_to_number(char *string, uint8_t *type)
{
+ long int l;
void *ptr, *out;
int fnz=-1, lnz=0; /* 'F'irst (or 'L'ast) 'N'on-'Z'ero. */
uint8_t forcedfloat=0;
@@ -584,15 +585,28 @@ gal_type_string_to_number(char *string, uint8_t *type)
/* See if the number is actually an integer: */
if( forcedfloat==0 && ceil(d) == d )
{
+ /* We know the number is an integer, so we should re-read it again,
+ but this time, as an integer, because: 1) floating point numbers
+ can only preserve a certain number of decimals precisely after a
+ certain number of decimals, they loose precision. 2) Integer
+ comparisons (that are done below) are faster, but this is
+ secondary because the parsing itself takes more time! */
+ l=strtol(string, &tailptr, 0);
+ if(*tailptr!='\0')
+ error(EXIT_FAILURE, 0, "%s: a bug! Please contact us at %s "
+ "to fix this problem. The string '%s' couldn't be "
+ "parsed with 'strtol', but was parsed by 'strtod'",
+ __func__, PACKAGE_BUGREPORT, string);
+
/* If the number is negative, put it in the signed types (based on
its value). If its zero or positive, then put it in the unsigned
types. */
- if( d < 0 )
+ if( l < 0 )
{
- if (d>INT8_MIN) { i8=d; ptr=&i8; *type=GAL_TYPE_INT8; }
- else if(d>INT16_MIN) { i16=d; ptr=&i16; *type=GAL_TYPE_INT16; }
- else if(d>INT32_MIN) { i32=d; ptr=&i32; *type=GAL_TYPE_INT32; }
- else { i64=d; ptr=&i64; *type=GAL_TYPE_INT64; }
+ if (l>INT8_MIN) { i8=l; ptr=&i8; *type=GAL_TYPE_INT8; }
+ else if(l>INT16_MIN) { i16=l; ptr=&i16; *type=GAL_TYPE_INT16; }
+ else if(l>INT32_MIN) { i32=l; ptr=&i32; *type=GAL_TYPE_INT32; }
+ else { i64=l; ptr=&i64; *type=GAL_TYPE_INT64; }
}
else
{
@@ -602,11 +616,25 @@ gal_type_string_to_number(char *string, uint8_t *type)
confusing situations (for example when the user gives 255), if
the value is equal to the given maximum of the given type,
we'll assign it to a larger type. In other words, we won't be
- using the '<=MAX', but '<MAX'. */
- if (d<UINT8_MAX) { u8=d; ptr=&u8; *type=GAL_TYPE_UINT8; }
- else if(d<UINT16_MAX) { u16=d; ptr=&u16; *type=GAL_TYPE_UINT16; }
- else if(d<UINT32_MAX) { u32=d; ptr=&u32; *type=GAL_TYPE_UINT32; }
- else { u64=d; ptr=&u64; *type=GAL_TYPE_UINT64; }
+ using the '<=MAX', but '<MAX'.
+
+ Even though they are positive, we should give priority to the
+ signed types if the number fits in the range of signed type
+ for that width: this is the way that C's internal automatic
+ type conversion works (which is used by Arithmetic's binary
+ operators for example). */
+ if (l<UINT8_MAX)
+ { if(l>INT8_MAX) { u8=l; ptr=&u8; *type=GAL_TYPE_UINT8; }
+ else { i8=l; ptr=&i8; *type=GAL_TYPE_INT8; } }
+ else if(l<UINT16_MAX)
+ { if(l>INT16_MAX) { u16=l; ptr=&u16; *type=GAL_TYPE_UINT16; }
+ else { i16=l; ptr=&i16; *type=GAL_TYPE_INT16; } }
+ else if(l<UINT32_MAX)
+ { if(l>INT32_MAX) { u32=l; ptr=&u32; *type=GAL_TYPE_UINT32; }
+ else { i32=l; ptr=&i32; *type=GAL_TYPE_INT32; } }
+ else
+ { if(l>INT64_MAX) { u64=l; ptr=&u64; *type=GAL_TYPE_UINT64; }
+ else { i64=l; ptr=&i64; *type=GAL_TYPE_INT64; } }
}
}
else
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- [gnuastro-commits] master 213a1976: Library (type.h): signed ints, same width preferred in reading strings,
Mohammad Akhlaghi <=