Re: Build 2.1.64 on OS 10.3. error

help-octave


From:	Per Persson
Subject:	Re: Build 2.1.64 on OS 10.3. error
Date:	Tue, 21 Dec 2004 19:22:38 +0100


On Dec 21, 2004, at 18:55, Samir Sharshar wrote:

Hello,

It's me ....

With ./configure --enable-dl --enable-shared --disabled-static

I've got

ld: misc/machar.o has local relocation entries in non-writable section(__TEXT,__text)

/usr/bin/libtool: internal link edit command failed
make[3]: *** [libcruft.dylib] Error 1
make[2]: *** [libraries] Error 2
make[1]: *** [libcruft] Error 2
make: *** [all] Error 2

Fortran compiler g77
FLIBS='-lg2c'
FFLAGS='-O5 -funroll-loops'
CFLAGS='-fast -mdynamic-no-pic'
CXXFLAGS='-fast -mdynamic-no-pic'

First of all, let me quote the docs for -fast (final paragraph of the -fast section in <http://www.opensource.apple.com/darwinsource/10.3.6/gcc-1495/AppleReleaseNotes.html>):

-----
Users of -fast should be aware of the following caveats:

-----

This, unfortunately, translates to "using -fast may wreak havoc, analyze the code and apply the appropriate flags one by one".

Secondly, -mdynamic-no-pic is meant for executables, not libraries, which need to have relocatable code. Check "man gcc". You need to make sure that -fPIC is passed alongside -mdynamic-no-pic if you want to build relocatable code with -mdynamic-no-pic applied globally.
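
As a rough illustration of that point, a pre-flight check on a flag string could look like the sketch below. The helper name lacks_fpic is hypothetical, and the check only looks at the flag text: it treats -fast as implying -mdynamic-no-pic (as the summary further down states) and does not ask the compiler anything.

```shell
# Hypothetical helper: succeeds (exit 0) when the given flags request
# non-relocatable code (-mdynamic-no-pic, or -fast, which implies it)
# and -fPIC is not present to override it.
lacks_fpic() {
    case " $1 " in
        *" -fPIC "*) return 1 ;;
    esac
    case " $1 " in
        *" -mdynamic-no-pic "*|*" -fast "*) return 0 ;;
    esac
    return 1
}

# The CFLAGS from the failing build in this thread.
if lacks_fpic "-fast -mdynamic-no-pic"; then
    echo "warning: add -fPIC before building shared libraries"
fi
```

Running the same check on CXXFLAGS before ./configure would catch the problem for the C++ libraries as well.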

Finally, my advice would be to start by dropping -fast -mdynamic-no-pic and just adding the -mcpu option specifying _your particular_ CPU. If that turns out well, analyze the code and incrementally add flags that you have reason to believe will improve performance, building between each increment until you have obtained a satisfying speedup.
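
The incremental approach might be sketched roughly as follows. The baseline "-O2 -mcpu=G4" and the choice of candidate flags are assumptions for illustration only (the candidates are picked from the -fast summary below); substitute your own CPU and the flags you actually have reason to try. The loop prints each configure/make round instead of running it, so the commands can be reviewed first.

```shell
# Assumed baseline: -O2 and -mcpu=G4 are placeholders, not values from the thread.
BASE="-O2 -mcpu=G4"
# Candidate flags to add one at a time (illustrative picks from the -fast summary).
CANDIDATES="-funroll-loops -falign-loops=16 -fstrict-aliasing"

FLAGS="$BASE"
for flag in $CANDIDATES; do
    FLAGS="$FLAGS $flag"
    # One line per round: configure, build, then test and time the result by hand.
    echo "CFLAGS='$FLAGS' CXXFLAGS='$FLAGS' ./configure --enable-dl --enable-shared && make"
done
```

If a round breaks the build or the test suite, the flag added in that round is the first suspect.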

If you are really serious, use Shark (from Apple) to analyze the code for things like pipeline stalls etc.

Sorry if I'm sounding negative, but it is my experience that applying something as aggressive[1] as the -fast option will not work well on something as complex as octave. As I understand it, -fast was added to give good SPEC marks, and has surely seen little testing with code other than the SPEC code.


HTH,
Per

PS. For interested parties I'm pasting a summary of what -fast implies below:

================================

-fast changes the overall optimization strategy of GCC 3.3 in order to produce the fastest possible running code for G4 and G5 architectures. Optimizations under -fast are roughly grouped under the following categories.

1.

-fast sets the optimization level to -O3, the highest level of optimization supported by GCC 3.3. If any other optimization level (-O0, -O1, -O2 or -Os) is specified, it is ignored by the compiler.

2.

Alignment. Assume alignments for loops, functions, branches and structure data fields that provide fastest performance on the PowerPC. -fast sets the following alignment-specific options:

   -falign-loops-max-skip=15
   -falign-jumps-max-skip=15
   -falign-loops=16
   -falign-jumps=16
   -falign-functions=16
   -malign-natural

3.

-fast enables the -ffast-math option, which allows certain unsafe math operations for performance gains.

4.

Strict aliasing rules. -fast allows the compiler to assume the strictest aliasing rules applicable to the language being compiled. For C and C++, this activates optimizations based on the type of expressions: an object of one type is assumed never to reside at the same address as an object of a different type, unless the types are almost the same. Furthermore, struct field references are assumed not to alias each other as long as their direct and indirect enclosing structure types are distinct. -fast enables the following aliasing options:

   -fstrict-aliasing
   -frelax-aliasing
   -fgcse-mem-alias

Warning: the behavior of correct programs will not be affected by strict aliasing, but programs that make use of nonportable type conversions may behave in unexpected ways.

5.

-fast enables various performance-related code transformations. These include loop unrolling, transposing nested loops to improve locality of array element access, conversion of certain initialization loops to memset calls, and inline expansion of calls to library functions such as floor. -fast enables the following code transformation options:

   -funroll-loops
   -floop-transpose
   -floop-to-memset
   -finline-floor  (G5 only)


Some of these transformations increase code size.

6.

G5 specific instruction generation. With -fast (unless -mcpu=G4 is specified), GCC 3.3 generates instructions which are specific to G5 and result in performance gain for G5. The following options are assumed for G5 under -fast:


   -mcpu=G5
   -mpowerpc64
   -mpowerpc-gpopt

7.

Scheduling changes. -fast option allows inter-block scheduling, and scheduling specific to the G5 architecture. One such scheduling change is load after a store that partially loads what was stored. The following scheduling-related options are enabled by -fast:

   -mtune=G5  (unless -mtune=G4 is specified)
   -fsched-interblock
   -fload-after-store
   --param max-gcse-passes=3
   -fno-gcse-sm
   -fgcse-loop-depth

8.

-fast enables intermodule inlining when all source files are placed on the same command line. The following options are set by -fast and affect such inlining:

   -funit-at-a-time
   -fcallgraph-inlining
   -fdisable-typechecking-for-spec

9.

-fast sets -mdynamic-no-pic by default. This allows for generation of non-relocatable code and is not suitable for shared libraries. This option may be overridden by -fPIC.


Users of -fast should be aware of the following caveats:

• Because -fast enables highly aggressive optimizations, some of which may have an effect on code size or on program behavior, thorough testing is especially important before deploying applications compiled with -fast.
• For maximum run-time performance you should experiment with a variety of optimization options; no one set of flags is best for all applications.
• In future releases of GCC, -fast may enable a different set of optimization options. The intention behind this option is that -fast will enable optimizations that result in the fastest code for most applications.




-------------------------------------------------------------
Octave is freely available under the terms of the GNU GPL.

Octave's home on the web:  http://www.octave.org
How to fund new projects:  http://www.octave.org/funding.html
Subscription information:  http://www.octave.org/archive.html
-------------------------------------------------------------
