help-make

GNU Make on Linux Feeding All Commands Through ksh


From: Steve Waltner
Subject: GNU Make on Linux Feeding All Commands Through ksh
Date: Wed, 8 Oct 2008 11:31:37 -0500

I'm completing the migration of our build process for a rather large software project from Solaris SPARC to Linux x86 and have run into an issue. The build uses GNU Make 3.81 on both a Solaris 9 box and a RedHat AS 4.7 x86_64 system. The major symptom I've noticed is that the Linux system doesn't really honor the "-j 4" option we typically build with: it quickly degrades into a single-threaded build. Items of note include:

- The builds are being passed to a cluster of systems running Sun Grid Engine, so the "-j 4" option isn't passed at the command line. The first build command looks for a $NSLOTS environment variable and changes the MAKEFLAGS as appropriate.

- I am running both builds with a copy of GNU Make that I compiled from the same source code. I am not using the copy of make that was included with the RedHat or Solaris systems.

- The makefiles include settings like "override SHELL = /bin/ksh" to force all shell interpretations to go through the ksh.
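
For readers unfamiliar with the $NSLOTS arrangement, the idea is roughly the following. This is only a sketch: the helper name jobs_flag and the wrapper line are hypothetical, and our actual build command sets MAKEFLAGS in its own way.

```shell
# Hypothetical helper: turn an SGE slot count into a make job flag.
# Prints "-j N" when N is a positive integer, and nothing otherwise.
jobs_flag() {
    case "$1" in
        ''|*[!0-9]*) return 0 ;;   # unset or non-numeric: leave the build serial
    esac
    [ "$1" -gt 0 ] && printf '%s\n' "-j $1"
    return 0
}

# Example wrapper usage (an assumption, commented out so the sketch has
# no side effects):
# MAKEFLAGS="$(jobs_flag "$NSLOTS")" gmake
```

With NSLOTS=4 the helper prints "-j 4"; with NSLOTS unset or invalid it prints nothing, so the build stays serial.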

It appears as though the Linux system feeds every command through a ksh process, while the same build on the Solaris system calls the command (whether it is a ccpentium, ccarm, or make command) directly. I determined this by looking at the process hierarchy with the pstree command. The examples below were both taken from a build with NSLOTS=4 (i.e. -j 4). You can see the Solaris build running three ccpentium processes at the time the snapshot was taken, while the Linux build has spawned only a single ccpentium command.

Solaris:
====================
ictgrid004:~> sgetree
-+- 00278 sgeadmin 9:36 /soft/gridware-wic/sge/6.0u6/bin/sol-sparc64/sge_execd
 |-+- 18341 sgeadmin sge_shepherd-467543 -bg
 | \-+- 18342 root /soft/gridware-wic/sge/6.0u6/utilbin/sol-sparc64/rshd -l
 |   \-+- 18343 swaltner /soft/gridware-wic/sge/6.0u6/utilbin/sol-sparc64/qrsh_
 |     \-+- 18347 swaltner tcsh -c hostname ; gmake
 |       \-+- 18353 swaltner /soft/gnu/make/3.81/bin/gmake
 |         \-+- 26740 swaltner /soft/gnu/make/3.81/bin/gmake Platform/.make App
 |           |-+- 26757 swaltner /soft/gnu/make/3.81/bin/gmake -C Platform MKLe
 |           | \-+- 26819 swaltner /soft/gnu/make/3.81/bin/gmake Boot/.make Sys
 |           |   \-+- 26906 swaltner /soft/gnu/make/3.81/bin/gmake -C System MK
 |           |     \-+- 27024 swaltner /soft/gnu/make/3.81/bin/gmake BSP/.make
 |           |       \-+- 10502 swaltner /soft/gnu/make/3.81/bin/gmake -C DQ MK
 |           |         \-+- 10560 swaltner /soft/gnu/make/3.81/bin/gmake DQ MKL
 |           |           \-+- 11323 swaltner ccpentium -c -o dq.o -fmessage-len
 |           |             \--- 11331 swaltner /soft/windriver/gpp/3.4/gnu/3.4.
 |           \-+- 26788 swaltner /soft/gnu/make/3.81/bin/gmake -C Application M
 |             \-+- 26853 swaltner /soft/gnu/make/3.81/bin/gmake RAID/.make Deb
 |               |-+- 06868 swaltner /soft/gnu/make/3.81/bin/gmake -C Debug MKL
 |               | \-+- 06928 swaltner /soft/gnu/make/3.81/bin/gmake ccvm_dbg/.
 |               |   \-+- 11524 swaltner /soft/gnu/make/3.81/bin/gmake -C safe_
 |               |     \-+- 11585 swaltner /soft/gnu/make/3.81/bin/gmake safe_d
 |               |       \-+- 11639 swaltner ccpentium -c -o safeSymbolDebug.o
 |               |         \--- 11642 swaltner /soft/windriver/gpp/3.4/gnu/3.4.
 |               |-+- 26909 swaltner /soft/gnu/make/3.81/bin/gmake -C RAID MKLe
 |               | \-+- 27055 swaltner /soft/gnu/make/3.81/bin/gmake cache/.mak
 |               |   \-+- 08612 swaltner /soft/gnu/make/3.81/bin/gmake -C hid M
 |               |     \-+- 08728 swaltner /soft/gnu/make/3.81/bin/gmake hid MK
 |               |       \-+- 11452 swaltner ccpentium -c -o hidLUDispatch.o -f
 |               |         \--- 11457 swaltner /soft/windriver/gpp/3.4/gnu/3.4.
 |               \--- 11635 swaltner /soft/gnu/make/3.81/bin/gmake -C MAPI MKLe
====================

Linux:
====================
ictgrid005:~/ccm_wa/symbios/RAIDCore-swaltner_1636/dev_09q4_fc_7091-68.10.00.03> ~/pstree-2.32/pstree 3543
-+= 03543 root /soft/gridware-wic/sge/6.0u6/bin/lx24-amd64/sge_execd
 \-+= 21589 root sge_shepherd-467474 -bg
   \-+= 21590 root /soft/gridware-wic/sge/6.0u6/utilbin/lx24-amd64/rshd -l
     \-+= 21591 swaltner /soft/gridware-wic/sge/6.0u6/utilbin/lx24-amd64/qrsh_starter /var/spool/sgeexecd/ictgrid005/active_jobs/467474.
       \-+= 21603 swaltner tcsh -c hostname ; gmake
         \-+- 21612 swaltner gmake
           \-+- 04707 swaltner /bin/ksh -c gmake Platform/.make Application/.make MKLevel=$(( 0 + 1 )) MKopts='';
             \-+- 04708 swaltner gmake Platform/.make Application/.make MKLevel=1 MKopts=
               \-+- 04787 swaltner /bin/ksh -c gmake -C Application MKLevel=$(( 1 + 1 ))
                 \-+- 04788 swaltner gmake -C Application MKLevel=2
                   \-+- 04868 swaltner /bin/ksh -c gmake RAID/.make Debug/.make MAPI/.make TAPI/.make Spy/.make Stpsim/.make FBDT/.make
                     \-+- 04870 swaltner gmake RAID/.make Debug/.make MAPI/.make TAPI/.make Spy/.make Stpsim/.make FBDT/.make IT/.make D
                       \-+- 04947 swaltner /bin/ksh -c gmake -C RAID MKLevel=$(( 3 + 1 ))
                         \-+- 04948 swaltner gmake -C RAID MKLevel=4
                           \-+- 05074 swaltner /bin/ksh -c gmake cache/.make iop/.make htd/.make hid/.make icn/.make rtr/.make rpa/.make
                             \-+- 05075 swaltner gmake cache/.make iop/.make htd/.make hid/.make icn/.make rtr/.make rpa/.make Fibre/.ma
                               \-+- 11193 swaltner /bin/ksh -c gmake -C vdm MKLevel=$(( 5 + 1 ))
                                 \-+- 11194 swaltner gmake -C vdm MKLevel=6
                                   \-+- 18797 swaltner /bin/ksh -c gmake vdm MKLevel=$(( 6 + 1 )) MKopts='';
                                     \-+- 18798 swaltner gmake vdm MKLevel=7 MKopts=
                                       \-+- 22893 swaltner /bin/ksh -c HOME="" LM_LICENSE_FILE="" ccpentium -c -o vdmRVState.o -fmessage
                                         \-+- 22894 swaltner ccpentium -c -o vdmRVState.o -fmessage-length=0 -O2 -nostdlib -fno-builtin
                                           |--- 22896 swaltner /soft/windriver/gpp/3.4/gnu/3.4.4-vxworks-6.4/x86-linux2/bin/../libexec/g
                                           \--- 22895 root (get_feature)
ictgrid005:~/ccm_wa/symbios/RAIDCore-swaltner_1636/dev_09q4_fc_7091-68.10.00.03>
====================
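
To compare snapshots like the two above without counting by eye, something along these lines could be used. It is a rough sketch: the function name count_procs and the file name snapshot.txt are hypothetical, and the pattern assumes the "pid user command" layout shown in the trees above.

```shell
# Hypothetical helper: count processes whose command is the given program
# in a process-tree snapshot read from standard input.
# It matches "<pid> <user> <program> ", so a wrapper line such as
# "/bin/ksh -c ... ccpentium ..." is not counted as a compile.
count_procs() {
    grep -cE "[0-9]+ [A-Za-z0-9_]+ $1 " || true   # grep -c exits 1 on zero matches
}

# Example usage (snapshot.txt is an assumed file holding saved tree output):
# count_procs ccpentium < snapshot.txt
```

The count is a quick way to see how many compiles are actually running at the moment the snapshot was taken.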

I believe this behavior is causing the make process to consume tokens for the parallel builds when it shouldn't. The ksh process that launches the gmake command in the subdirectory is consuming the token. Once you get deep enough in the source directory, all the tokens are in use by these idle ksh processes, causing the build to fall back to a single thread. This is confirmed by starting a build with "-j 8", "-j 16", or higher: given more tokens, the make process is able to keep the CPU busy on this quad-CPU Linux server. That works fine when there is a single developer on the build system, but it won't work well for the way we launch builds on these systems through SGE. Once this issue is resolved, we can deploy the x86 hardware, which will give us the same build speeds in a box that is 20% of the physical size and costs about 10% of the price of the SPARC systems we have been using.

Thanks for any guidance you can provide. I've been fooling with this for several days without any luck.

Steve



