Dmalloc: How To for Swarm on linux


From: Paul E. Johnson
Subject: Dmalloc: How To for Swarm on linux
Date: Mon, 16 Nov 1998 16:16:41 -0600

Dear Everybody:

I've had a memory leak and used the dmalloc library to find it.  Dmalloc
is a library that you build for your system and then link into your
swarmapps. When your program runs, dmalloc logs a list of all the memory
your program allocated and never freed. After a lot of fussing, I found
the result I needed. If you ever get a memory leak, maybe you will
remember this note and try dmalloc.  I did not get very satisfactory
results from the ElectricFence library that is included in the Redhat
distribution, and I would recommend you try dmalloc if your sim's memory
usage seems to grow indefinitely.

I have to write these things down or else I forget the details within
days, so here is the poop! 

Today there is a new version of the dmalloc library, and it looks good to me.
 
First, Gray Watson's announcement:

>Version 4.1.0 has been released.

        http://www.letters.com/dmalloc/
        ftp://ftp.letters.com/src/dmalloc/

>It contains the new FREED_POINTER_DELAY feature which was recommended on
>the list, better handling of the allow-free-null token (was allow-zero).
>I've also added the new -g (--gdb) flag to the dmalloc utility and added the
>cool gdb script to the contrib/ directory.

The dmalloc software comes with pretty nice installation directions;
read them. You fiddle with the settings in dmalloc's settings.dist,
compile it, install it, change an argument in your .bashrc, and edit
settings in a configuration file you put in your home directory. To link
the dmalloc library into your program, you use the command
        make LIBS=-ldmalloc
Then you use the dmalloc command with parameters to create an
environment and run your program. If all goes well, you have a big
logfile at the end showing your memory goofs.


Here are the wrinkles that wasted days for me. If you use dmalloc,
consider this.

1. No matter what I did, I got segfaults with every swarm program
compiled under dmalloc until I liberalized one of the settings.dist
arguments before configuring dmalloc.  The troublemaker is the
prohibition on zero-sized memory allocations.  Swarm makes them, and you
have to allow them by setting it like this:

#define ALLOW_ALLOC_ZERO_SIZE 1

(Don't forget to run make distclean if this is not your first time
compiling dmalloc.)
I found that no runtime setting could cure this problem; dmalloc had to
be recompiled.

2. The swarm include file misc.h has to be changed. Look around line 30
and make it read like so:

#ifndef __DMALLOC_H__
void *xmalloc (size_t size);
void *xmalloc_atomic (size_t size);
void *xcalloc (size_t nmemb, size_t size);
void *xrealloc (void *buf, size_t size);
void xfree (void *buf);
#endif

#define XFREE(buf) xfree((void *)(buf))

char *dropdir (char *path);

#ifndef __DMALLOC_H__
#ifndef HAVE_STRDUP
char *strdup (const char *string);
#endif
#endif

When dmalloc.h has been included (so __DMALLOC_H__ is defined), these
guards hide Swarm's own prototypes and let the special stuff in the
dmalloc library do its work. If you don't use dmalloc, swarm still needs
those declarations and can find them.

3. In the dmalloc dist there is a sample rc file. Save it in your home
directory as .dmallocrc and look in there for the settings under Low,
High, etc. Through version 4.0.3 of dmalloc, there was a "token" called
allow-zero, but now it is called allow-free-null.  I had to add that
option to the standard configurations to prevent swarmapps from crashing
(yes, even heatbugs).

4. After you include <dmalloc.h> in your own swarm code files, you
compile with make LIBS=-ldmalloc and then you use a command like
"dmalloc -l logfile -i 100 low" to tell it where your log file gets
written and which level of scrutiny to apply. (The docs describe a
change in .bashrc required for this.)  After the run, you look in your
logfile and you find thousands of lines. At the end of this note, I
pasted an excerpt from a recent run of dmalloc.

For Swarm purposes, it is very important not to be upset by the large
number of unfreed memory segments. The ones that say from
'ra=0x80afe7f' are from an unknown source; all dmalloc has for them is a
return address. If you had #include <dmalloc.h> in the source file that
made the allocation, you would see the file name there, but there are no
file names in this excerpt. Dmalloc supplies a nice perl script that
works with gdb to track these down. Its instructions say it is for
occasions when you can't include dmalloc.h where you really need it, or
are too lazy to. If you use the perl script (from the dmalloc/contrib
directory), you don't find out anything interesting. It says this:

 -----------------------------------------------
Address = '0x80afe7f' line 18 num 1
0x80afe7f <objc_malloc+15>:     0xc483c389
No line number information available for address 0x80afe7f <objc_malloc+15>
---------------------------------------------

I spent a couple of days recompiling the swarm code itself, thinking I'd
track these little bastards to their source and win a merit badge.  But
even after that, I did not find out where they were. This unfreed memory
is getting created somewhere deep in the guts of gcc and the objc
runtime. I considered recompiling gcc and the objc runtime library to
debug them, but decided not to because I'm scared to death of work.

Anyway, later I learned that Swarm generates all kinds of little widget
objects that get allocated and never freed. They don't make a program
explode in size, and I don't need to worry about them.  Indeed, that is
correct. If you run "top" in one xterm and run a swarm program in
another, the program does not explode in memory usage, even though it is
going to generate a logfile that has thousands of those stupid little
unfreed memories.
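
The "top" check above can also be scripted. What follows is only a rough
sketch, not part of dmalloc or Swarm: it assumes a Linux system where
/proc/<pid>/status contains a VmRSS line in kB, and the function names
(vmrss_kb, sample_rss, keeps_growing) are made up for this illustration.

```python
import re
import time

def vmrss_kb(status_text):
    """Extract the resident set size in kB from /proc/<pid>/status text, or None."""
    m = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(m.group(1)) if m else None

def sample_rss(pid, count=10, interval=1.0):
    """Read VmRSS for a running process `count` times, `interval` seconds apart."""
    samples = []
    for _ in range(count):
        with open(f"/proc/{pid}/status") as f:
            rss = vmrss_kb(f.read())
        if rss is not None:  # skip readings that have no VmRSS line
            samples.append(rss)
        time.sleep(interval)
    return samples

def keeps_growing(samples_kb, tolerance_kb=0):
    """True if every sample exceeds the previous one by more than tolerance_kb."""
    return len(samples_kb) > 1 and all(
        b - a > tolerance_kb for a, b in zip(samples_kb, samples_kb[1:])
    )
```

Calling keeps_growing(sample_rss(pid)) with the pid of a running sim gives a
crude yes/no answer. A program that merely carries a lot of small unfreed
objects should level off, while a real leak keeps climbing.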

With the dmalloc.h included in my source, I was able to peruse the
logfile from dmalloc and see that there was indeed a memory leak in my
code and the dmalloc output pointed right at it.  Of course, the leaks
from my files were scattered among thousands of unknown source unfreed
memories, but that is not a problem.  Once I fixed my leak, my program
ran at a stable level of memory usage.
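
Picking your own leaks out of thousands of unknown-source lines is easier
with a small filter. This is a minimal sketch, assuming the "not freed"
line format shown in the logfile excerpt at the end of this note. The
function names are invented for the example, and the 'MyAgent.m:57' line
is hypothetical: it only illustrates that a source built with dmalloc.h
reports a file name where the others show 'ra=...'.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   63197: not freed: '0x80d9008|s1' (12 bytes) from 'ra=0x80afe7f'
NOT_FREED = re.compile(r"not freed: '([^']*)' \((\d+) bytes\) from '([^']*)'")

def summarize_not_freed(lines):
    """Return {source: [pointer_count, total_bytes]} for 'not freed' lines."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        m = NOT_FREED.search(line)
        if m is None:
            continue  # header, statistics, or summary line
        size, source = int(m.group(2)), m.group(3)
        totals[source][0] += 1
        totals[source][1] += size
    return dict(totals)

def known_sources(totals):
    """Keep only entries that are not bare return addresses ('ra=...')."""
    return {src: t for src, t in totals.items() if not src.startswith("ra=")}

if __name__ == "__main__":
    sample = [
        "63197: not freed: '0x80d9008|s1' (12 bytes) from 'ra=0x80afe7f'",
        "63197: not freed: '0x80d9028|s9' (8 bytes) from 'ra=0x80afe7f'",
        "63197: not freed: '0x80d9068|s7' (8 bytes) from 'ra=0x402cfdf9'",
        # Hypothetical line: the file name and line number are invented.
        "63197: not freed: '0x80e1008|s1' (48 bytes) from 'MyAgent.m:57'",
        "63197:   known memory not freed: 6 pointers, 289 bytes",
    ]
    found = known_sources(summarize_not_freed(sample))
    for source, (count, size) in sorted(found.items()):
        print(f"{source}: {count} pointers, {size} bytes")
```

On a real run you would pass the open logfile instead of the sample list,
for example summarize_not_freed(open("logfile")).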

logfile excerpt:
63197: Dmalloc version '4.0.3'.  UN-LICENSED copy.
63197: dmalloc_logfile 'logfile': flags = 0x24e405c3, addr = 0
63197: threads enabled, lock-init = 2
63197: starting time = 911152255
63197: free bucket count/bits:  109/4 27/5 25/6 56/7 3/8 7/9 3/10
63197: basic-block 4096 bytes, alignment 8 bytes, heap grows up
63197: heap: 0x80d7000 to 0x8293000, size 1818624 bytes (444 blocks), checked 210
63197: alloc calls: malloc 36402, calloc 3501, realloc 12, free 23282
63197: alloc calls: recalloc 0, valloc 0
63197:  total memory allocated: 2533727 bytes (39915 pnts)
63197:  max in use at one time: 1080219 bytes (16686 pnts)
63197: max alloced with 1 call: 33706 bytes
63197: max alloc rounding loss: 479269 bytes (30%)
63197: max memory space wasted: 0 bytes (0%)
63197: final user memory space: basic 58, divided 326, 1094307 bytes
63197:    final admin overhead: basic 2, divided 58, 245760 bytes (13%)
63197:    final external space: 0 bytes (0 blocks)
63197: not freed: '0x80d9008|s1' (12 bytes) from 'ra=0x80afe7f'
63197: not freed: '0x80d9028|s9' (8 bytes) from 'ra=0x80afe7f'
63197: not freed: '0x80d9048|s3' (8 bytes) from 'ra=0x80afe7f'
63197: not freed: '0x80d9068|s7' (8 bytes) from 'ra=0x402cfdf9'
63197: not freed: '0x80d9088|s3' (20 bytes) from 'ra=0x4012c6dc'
63197: not freed: '0x80d90a8|s1' (8 bytes) from 'ra=0x80afe7f'
63197: not freed: '0x80d90c8|s3' (8 bytes) from 'ra=0x80afe7f'
63197: not freed: '0x80d90e8|s1' (8 bytes) from 'ra=0x80afe7f'
63197: not freed: '0x80d9108|s3' (8 bytes) from 'ra=0x80afe7f'
63197: not freed: '0x80d9128|s1' (8 bytes) from 'ra=0x80afe7f'
63197: not freed: '0x80d9148|s3' (8 bytes) from 'ra=0x80afe7f'

-----------------thousands of lines omitted --------------------
63197: not freed: '0x8291008|s1' (340 bytes) from 'ra=0x402c71ea'
63197: not freed: '0x8292008|s1' (700 bytes) from 'ra=0x402c724c'
63197:   known memory not freed: 6 pointers, 289 bytes
63197: unknown memory not freed: 16615 pointers, 875766 bytes
63197: ending time = 911152273, elapsed since start = 0:0:18
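
The summary numbers in the excerpt above can be cross-checked with a little
arithmetic. The sketch below assumes that each free call releases one
pointer obtained from malloc or calloc. That reading is an inference from
these particular figures, not something taken from the dmalloc docs, and the
helper names are made up for the example.

```python
def live_pointers(malloc_calls, calloc_calls, free_calls):
    """Pointers still allocated if each free releases one malloc/calloc result."""
    return malloc_calls + calloc_calls - free_calls

def unknown_share(known, unknown):
    """Fraction of unfreed pointers with no file name attached."""
    return unknown / (known + unknown)

if __name__ == "__main__":
    # Figures copied from the logfile excerpt.
    live = live_pointers(36402, 3501, 23282)
    reported = 6 + 16615  # known + unknown "not freed" pointers
    print("live pointers by call counts:", live)
    print("not freed as reported:       ", reported)
    print("counts agree:", live == reported)
    print(f"unattributed share: {unknown_share(6, 16615):.2%}")
```

In this run the two counts match, and nearly all of the unfreed pointers are
the unknown-source kind described earlier; only the 6 known pointers (289
bytes) point back at code that included dmalloc.h.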

Paul Johnson
University of Kansas

PS. If you run into problems on a Redhat system when you try to turn
saved pixmaps into movies (the method for this is described on the
swarm-support list by Marcus Daniels and Sven Thommesen), let me know and
I'll send you the binary care package.  Many of the binaries distributed
with RedHat 5.1 and the SUSE fli utilities simply don't work. Example:
pngtopnm.  But I have working copies of them and can email them to you.
NO, I did not make new RPMS; you'll just have to take the tarball of
compiled programs.

                  ==================================
   Swarm-Support is for discussion of the technical details of the day
   to day usage of Swarm.  For list administration needs (esp.
   [un]subscribing), please send a message to <address@hidden>
   with "help" in the body of the message.


