swarm-support

Re: crash on Alpha box


From: glen e. p. ropella
Subject: Re: crash on Alpha box
Date: Mon, 10 Feb 1997 07:33:37 -0700

Hi Bohdan,

> Program received signal SIGSEGV, Segmentation fault.
> objc_msg_lookup (receiver=0x40099310, op=0x140019e90) at sendmsg.c:115
> sendmsg.c:115: No such file or directory.
> (gdb) where 10
> #0  objc_msg_lookup (receiver=0x40099310, op=0x140019e90) at sendmsg.c:115
> #1  0x12006d0dc in _c_Map_c__createBegin_ (self=0x14009f110, 
> _cmd=0x1400164d0, 
>     aZone=0x40099310) at Map.m:55
> #2  0x120065198 in _i_ProbeLibrary__createEnd (self=0x1400a2550, 
>     _cmd=0x14001e100) at ProbeLibrary.m:16
> #3  0x1201092b4 in __objc_init_install_dtable (receiver=0x11ffff760, 
>     op=0x14001e100) at sendmsg.c:208
> #4  0x12007a4d8 in _c_CreateDrop_s__create_ (self=0x140016460, 
>     _cmd=0x140014a00, aZone=0x140099310) at Create.m:28
> #5  0x1201092b4 in __objc_init_install_dtable (receiver=0x11ffff8d0, 
>     op=0x140014a00) at sendmsg.c:208
> #6  0x12005d714 in initProbing () at probing.m:22
> #7  0x12002c03c in initSwarm (argc=1, argv=0x11ffffb68) at simtools.m:32
> #8  0x12001ece0 in main (argc=1, argv=0x11ffffb68) at main.m:19
> #9  0x12001d9b0 in __start ()

I'm going to guess at this one.  And if I'm right, it won't be
good news.  The fact that the receiver is at 0x40099310 and
everything else is in the vicinity of 0x1400lmnop, tells me that
the receiver is mucked up, like you pointed out.  (In fact, 
0x40099310 is probably not in the data section of the process.)
It looks, to me, like the "aZone" variable was mucked up,
going from 0x140099310 to 0x040099310, somewhere between
the createBegin: aZone call in Create.m (which took an argument
of "globalZone") and the [aZone allocIVars: self] in Map.m.

I suspect it was in the [self getZone] in ProbeLibrary.m at

  classMap = [[Map createBegin: [self getZone]] createEnd] ;


Now, the "getZone" method is defined on all objects inheriting
from DefObject; but, it's really just a macro defined in 
defalloc.h, which reads:

#define getZone( anObject ) \
({ unsigned _zbits_ = (anObject)->zbits; \
  ( _zbits_ & BitSuballocList ? \
   (id)((Object_s *)( _zbits_ & ~0x7 ))->zbits : \
   (id)( _zbits_ & ~0x7 ) ); })

"zbits" is a variable we use internally for keeping track of which
zone the object is in.  Now, the zone is basically defined as
the contents of (zbits & ~0x7), which is 

 xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|x000

on a 32 bit machine.  The three reserved bits are used for
keeping track of the allocation properties of the object:

#define BitMappedAlloc     0x4  // set by suballoc list or explicit macro
#define BitSuballocList    0x2  // set whenever object contains suballoc list
#define BitComponentAlloc  0x1  // set if object is not in the zone population
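
To make the bit layout concrete, here is a minimal sketch in plain C of
packing a zone address and the three flag bits into one word and pulling
them apart again.  The flag values are the ones from defalloc.h above;
pack_zbits, zone_part and flag_part are hypothetical names made up for
illustration, and uintptr_t (from the later C99 <stdint.h>) is an assumed
stand-in for whatever integer type zbits really has.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values copied from the defalloc.h excerpt. */
#define BitMappedAlloc     0x4
#define BitSuballocList    0x2
#define BitComponentAlloc  0x1

/* Hypothetical helper: combine an 8-byte-aligned address with flag bits. */
static uintptr_t pack_zbits(uintptr_t zone_addr, unsigned flags)
{
    assert((zone_addr & 0x7) == 0);   /* low three bits must be free */
    assert((flags & ~0x7u) == 0);     /* only the three flag bits allowed */
    return zone_addr | flags;
}

/* Hypothetical helper: recover the address by clearing the low three bits. */
static uintptr_t zone_part(uintptr_t zbits)
{
    return zbits & ~(uintptr_t)0x7;   /* mask is as wide as the word itself */
}

/* Hypothetical helper: recover the three flag bits. */
static unsigned flag_part(uintptr_t zbits)
{
    return (unsigned)(zbits & 0x7);
}
```

The point of writing the mask as ~(uintptr_t)0x7 is that it is built at the
full width of the word, so no address bits can be lost when it is applied.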

Now, the only things that might allow a number like 0x140099310 to 
get corrupted to 0x040099310 (which you'll notice is corrupted in
the first digit beyond what's normally available on a 32 bit machine)
are the sign bit or trash in the higher significant digits.  And I 
don't think the sign bit would be anywhere near 0x0000000x00000000.

So, I suspect that one of the constants (0x2 or ~0x7) might be screwing
up the location of the zone by picking up trash in the higher part
of the word.

*OR* since we're casting the result of (_zbits_ & ~0x7) to a pointer,
which could mean promoting an unsigned (presuming the bit-wise
and of an unsigned and an integer constant gets promoted to an 
unsigned int) to a pointer, we could be picking up trash in the 
higher part of the word if unsigned ints are 32 bits on the alpha.
(Pointers are 64 bits.)
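
For what it's worth, this second guess can be checked with plain
arithmetic.  The sketch below assumes a 32-bit unsigned int and 64-bit
pointers, as on the Alpha, and uses fixed-width types from <stdint.h> so
it behaves the same on any machine.  zone_via_32bit and zone_via_64bit are
hypothetical stand-ins for the macro body, not Swarm code.  Passing the
value through a 32-bit variable loses the leading 1 and leaves exactly the
receiver address gdb reported.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the macro with zbits held in a 32-bit unsigned. */
static uint64_t zone_via_32bit(uint64_t zbits)
{
    uint32_t narrow = (uint32_t)zbits;   /* bits 32..63 are discarded here */
    return (uint64_t)(narrow & ~0x7);    /* mask is applied at 32-bit width */
}

/* Hypothetical stand-in for the macro with a 64-bit variable and mask. */
static uint64_t zone_via_64bit(uint64_t zbits)
{
    return zbits & ~UINT64_C(0x7);       /* all 64 bits survive the mask */
}

/* Usage sketch: the two aZone values seen in the backtrace. */
static void check_backtrace_values(void)
{
    assert(zone_via_32bit(UINT64_C(0x140099310)) == UINT64_C(0x40099310));
    assert(zone_via_64bit(UINT64_C(0x140099310)) == UINT64_C(0x140099310));
}
```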

My advice would be to modify defalloc.h to read:

   #define getZone( anObject ) \
   ({ unsigned long _zbits_ = (anObject)->zbits; \
     ( _zbits_ & BitSuballocList ? \
      (id)((Object_s *)( _zbits_ & ~0x7L ))->zbits : \
      (id)( _zbits_ & ~0x7L ) ); })

[...]

   #define BitSuballocList    0x2L  // set whenever object contains suballoc list

(i.e. make _zbits_ an unsigned long and add the "L" to the
end of the three constants.)

This assumes that unsigned long is 64 bits.  If it's not, then 
you may need unsigned long long or some crap like that.
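
One way to check that assumption before rebuilding is a small test like
the sketch below.  It is only an illustration: ulong_holds_pointer,
roundtrip_ok and check_zbits_type are hypothetical names, and uintptr_t
comes from the later C99 <stdint.h>, which I am assuming your compiler
provides.  Where it is available, uintptr_t is an integer type meant to
hold a pointer, so it avoids guessing between long and long long.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if unsigned long is at least as wide as a data pointer here. */
static int ulong_holds_pointer(void)
{
    return sizeof(unsigned long) >= sizeof(void *);
}

/* Returns 1 if a pointer survives a round trip through uintptr_t. */
static int roundtrip_ok(void *p)
{
    uintptr_t bits = (uintptr_t)p;   /* integer wide enough for a pointer */
    return (void *)bits == p;
}

/* Usage sketch: round-trip the address of a real object. */
static void check_zbits_type(void)
{
    static int sample;
    assert(roundtrip_ok(&sample));
}
```

If ulong_holds_pointer() returns 0 on your box, unsigned long is too
narrow for the fix above and a wider type is needed.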

glen
p.s. Sorry for such a long winded response.



