Re: factor [Was: coreutils v5.2.1

bug-coreutils

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: factor [Was: coreutils v5.2.1 - stat.c]

From:	ThMO
Subject:	Re: factor [Was: coreutils v5.2.1 - stat.c]
Date:	Sat, 01 Oct 2005 14:00:04 +0200

Good morning Eric, Paul and others listening,

> > > +     ['Ma', '-4294966201', {OUT => '-1 12197 352133'}],
> >
> > Now that I see these test cases I also see a reason why this is not a
> > good idea.  Currently, 'factor' outputs the unique factorization of
> > the input number into primes.  But there is no unique factorization
> > for negative numbers.  For example, OUT could have been '-12197
> > 352133' or '12197 -352133'.

exactly that's the reason why I prefer printing out -1 as the 1st factor,
even though it's not a real prime-factor, as it's only purpose is to
negate the final product, but *all* remaining factors *are* in fact
prime-factors - otherwise each combination of positive and negative
factors would need to be printed out, which isn't useful anymore - IHMO.

> If we decide to permit negative numbers, it seems like '-1' as the
> first factor would be the most useful, even though -1 is not a prime;
> then all the remaining factors ARE positive primes.  Then, for consistency,
> -1 would print just -1, 0 would be an error, and 1 would print nothing.
> [...]
> However, it would also be nice if factor continued on with its
> remaining arguments, rather than exiting immediately on error:

My applied patch file does exactly this now.
Additionally I've added some comments, which were flagged as FXIME before.

According to the error of factorizing 0 - my patch produces the following
output on stdout for the command: ./gfactor `seq -- -2 2`
(or ./gfactor -- `seq -- -2 2` using > v5.2.1):

-2: -1 2
-1: -1
0: cannot be factorized
1:
2: 2

But I do *not* return a failure status, as the then printed
        Try `gfactor --help' for more information
looks really ugly in this case, but YMMV.

Maybe the output ``0: cannot be factorized'' could be printed out to
stderr instead - what do you think?
E.g.:  gfactor: 0: cannot be factorized

> [...]
> Also something to think about - if factor is changed to accept
> negative numbers, then perhaps it should accept arguments
> that look like options:
> 
> Current:
> $ factor -2
> stderr> factor: invalid option -- 2
> stderr> Try `factor --help' for more information.
> 
> Proposed (if we keep negatives illegal):
> $ factor -2
> stderr> factor: `-2' is not a valid positive integer
> stderr> Try `factor --help' for more information.
> 
> Proposed (if we decide to allow negatives):
> $ factor -2
> stdout> -2: -1 2

That's right, but this is true only for coreutils > v5.2.1, as in v5.2.1
only the long-options `--help' and `--version' are handled, which is IMHO
more naturally than forcing the user to type a `--' end-of-options in order
to specify negative values on the command line.

BTW, it would be a nice thing, if factor would be generally installed as
gfactor, since at least on BSD systems there is already a factor command
present...

THX for listening.

CU Tom.
(Thomas M.Ott)
Germany

--- coreutils-5.2.1/src/factor.c.orig   2004-01-21 23:27:02.000000000 +0100
+++ coreutils-5.2.1/src/factor.c        2005-10-01 13:30:47.000000000 +0200
@@ -91,16 +91,105 @@
   exit (status);
 }
 
-/* FIXME: comment */
+/* using a union simplifies cast orgies in order to omit warnings (ThMO) */
+
+union factor {
+    intmax_t   sfac;
+    uintmax_t  ufac;
+  };
+
+#if    HAVE_LONG_LONG
+  #define  ONE 1LL
+#else
+  #define  ONE 1L
+#endif
+
+/* sfactor(): factorize signed long (long) ints -- (ThMO) */
 
 static int
-factor (uintmax_t n0, int max_n_factors, uintmax_t *factors)
+sfactor( const intmax_t sn0, const int max_n_factors, union factor *factors)
+{
+  intmax_t            sn;
+  uintmax_t           n, d, q;
+  int                 n_factors;
+  const unsigned int  *w;
+
+  n_factors= 0;
+  if ( ( sn= sn0) <= 1)
+    {
+      if ( sn >= 0)
+        return( n_factors);
+      assert( 2 < max_n_factors);
+      factors[ n_factors++].sfac= -1;
+      switch ( sn)
+        {
+          case ONE << ( sizeof( sn)* CHAR_BIT- 1):
+            /* handle special case of smallest long (long) int */
+            factors[ n_factors++].ufac= 2;
+            sn= (intmax_t) ( (uintmax_t) sn >> 1);
+            /* sn= ONE << ( sizeof( sn)* CHAR_BIT- 2); */
+            break;
+
+          default:
+            sn= -sn;
+            break;
+        }
+    }
+
+  /* from now on the value is guaranteed to be positive, so force further
+   * calculations with unsigned quantities, since the evaluation of the
+   * remainder from a division will always have the correct sign, unlike
+   * the signed modulus operation - furthermore unsigned division is mostly
+   * somewhat faster than signed division -- (ThMO)
+   */
+  for ( n= (uintmax_t) sn;  ( n & 1) == 0;  n >>= 1)
+    {
+      /* factorize by 2 the easy way... */
+      assert( n_factors < max_n_factors);
+      factors[ n_factors++].ufac= 2;
+    }
+
+  if ( n >= ( d= 3))
+    {
+      /* this is basically the same algorithm as below - only we'll use
+       * the modulo operator (%), as the remainder will be delivered for
+       * free during division
+       */
+      w= &wheel_tab[ 1];
+      do
+        {
+          while ( q= n/ d,  n % d == 0)
+            {
+              assert( n_factors < max_n_factors);
+              factors[ n_factors++].ufac= d;
+              n= q;
+            }
+          d += *w++;
+          if ( w == WHEEL_END)
+            w -= WHEEL_END- WHEEL_START;
+        }
+      while ( d <= q);
+      if ( n > 1)
+        {
+          assert( n_factors < max_n_factors);
+          factors[ n_factors++].ufac= n;
+        }
+    }
+  return( n_factors);
+}
+ 
+/* factor(): factorize unsigned long (long) ints larger than
+ *           signed long (long) ints only
+ */
+
+static int
+factor (uintmax_t n0, int max_n_factors, union factor *factors)
 {
   register uintmax_t n = n0, d, q;
   int n_factors = 0;
   unsigned int const *w = wheel_tab;
 
-  if (n < 1)
+  if (n <= 1)
     return n_factors;
 
   /* The exit condition in the following loop is correct because
@@ -118,7 +207,7 @@
       while (n == q * d)
        {
          assert (n_factors < max_n_factors);
-         factors[n_factors++] = d;
+         factors[n_factors++].ufac = d;
          n = q;
          q = n / d;
        }
@@ -128,27 +217,59 @@
     }
   while (d <= q);
 
-  if (n != 1 || n0 == 1)
+  if (n != 1)  /* || n0 == 1   <-- can't happen -- (ThMO) */
     {
       assert (n_factors < max_n_factors);
-      factors[n_factors++] = n;
+      factors[n_factors++].ufac = n;
     }
 
   return n_factors;
 }
 
-/* FIXME: comment */
+/* print_factors(): try to factorize given number or state an
+ *                  appropriate error message
+ */
+
+#define        ary_size( t)    ( sizeof( (t))/ sizeof( (t)[ 0]) )
 
 static int
 print_factors (const char *s)
 {
-  uintmax_t factors[MAX_N_FACTORS];
+  union factor  factors[ MAX_N_FACTORS+ 1];
+  intmax_t      sn;
   uintmax_t n;
   int n_factors;
   int i;
-  char buf[INT_BUFSIZE_BOUND (uintmax_t)];
+  char buf[ 1+ INT_BUFSIZE_BOUND (uintmax_t)];
   strtol_error err;
 
+  if ( ( err= xstrtoimax( s, NULL, 10, &sn, "")) == LONGINT_OK)
+    switch ( sn)
+      {
+        case 0:
+          printf( "0: %s\n", _( "cannot be factorized"));
+          return( 0);
+        default:
+          /* factorize signed long (long) ints */
+          n_factors= sfactor( sn, ary_size( factors), factors);
+          printf( "%s:", imaxtostr( sn, buf));
+          for ( i= 0;  i < n_factors;  i++)
+            printf( " %s", imaxtostr( factors[ i].sfac, buf));
+          printf( "\n");
+          return( 0);
+      }
+  else if ( err != LONGINT_OVERFLOW)
+    {
+      error( 0, 0, _( "%s is not a valid integer"), s);
+      return( 1);
+    }
+  else if ( sn < 0)    /* --> underflow */
+    {
+      error( 0, 0, _( "%s is too small"), s);
+      return( 1);
+    }
+
+  /* overflowing a signed long (long) int -> try it unsigned then */
   if ((err = xstrtoumax (s, NULL, 10, &n, "")) != LONGINT_OK)
     {
       if (err == LONGINT_OVERFLOW)
@@ -160,7 +281,7 @@
   n_factors = factor (n, MAX_N_FACTORS, factors);
   printf ("%s:", umaxtostr (n, buf));
   for (i = 0; i < n_factors; i++)
-    printf (" %s", umaxtostr (factors[i], buf));
+    printf (" %s", umaxtostr (factors[i].ufac, buf));
   putchar ('\n');
   return 0;
 }

[Prev in Thread]

Current Thread

[Next in Thread]

Re: factor [Was: coreutils v5.2.1 - stat.c], ThMO <=
- Re: factor [Was: coreutils v5.2.1 - stat.c], Alfred M\. Szmidt, 2005/10/01
  - Re: factor [Was: coreutils v5.2.1 - stat.c], ThMO, 2005/10/01
    - Re: factor [Was: coreutils v5.2.1 - stat.c], Alfred M\. Szmidt, 2005/10/01
    - Re: factor [Was: coreutils v5.2.1 - stat.c], Eric Blake, 2005/10/01
- Re: factor [Was: coreutils v5.2.1 - stat.c], ThMO, 2005/10/02
  - Re: factor [Was: coreutils v5.2.1 - stat.c], Alfred M\. Szmidt, 2005/10/02
    - Re: factor [Was: coreutils v5.2.1 - stat.c], ThMO, 2005/10/03
    - Re: factor [Was: coreutils v5.2.1 - stat.c], Alfred M\. Szmidt, 2005/10/03
    - Re: factor [Was: coreutils v5.2.1 - stat.c], Alfred M\. Szmidt, 2005/10/03
  - Re: factor, Jim Meyering, 2005/10/03

Prev by Date: Re: coreutils-5.90 build feedback
Next by Date: Re: factor [Was: coreutils v5.2.1 - stat.c]
Previous by thread: Java Installation
Next by thread: Re: factor [Was: coreutils v5.2.1 - stat.c]
Index(es):
- Date
- Thread