Re: Build 2.1.64 on OS 10.3. error
From: Per Persson
Subject: Re: Build 2.1.64 on OS 10.3. error
Date: Tue, 21 Dec 2004 19:22:38 +0100
On Dec 21, 2004, at 18:55, Samir Sharshar wrote:
> Hello,
> It's me ....
> With ./configure --enable-dl --enable-shared --disabled-static
> I've got
> ld: misc/machar.o has local relocation entries in non-writable section (__TEXT,__text)
> /usr/bin/libtool: internal link edit command failed
> make[3]: *** [libcruft.dylib] Error 1
> make[2]: *** [libraries] Error 2
> make[1]: *** [libcruft] Error 2
> make: *** [all] Error 2
> Fortran compiler g77
> FLIBS='-lg2c'
> FFLAGS='-O5 -funroll-loops'
> CFLAGS='-fast -mdynamic-no-pic'
> CXXFLAGS='-fast -mdynamic-no-pic'
First of all, let me quote the docs for -fast (final paragraph of the
-fast section in
<http://www.opensource.apple.com/darwinsource/10.3.6/gcc-1495/AppleReleaseNotes.html>):
-----
Users of -fast should be aware of the following caveats:
• Because -fast enables highly aggressive optimizations, some of
which may have an effect on code size or on program behavior, thorough
testing is especially important before deploying applications compiled
with -fast.
• For maximum run-time performance you should experiment with a
variety of optimization options; no one set of flags is best for all
applications.
-----
This, unfortunately, translates to "using -fast may wreak havoc,
analyze the code and apply the appropriate flags one by one".
Secondly, -mdynamic-no-pic is meant for executables, not libraries,
which need to have relocatable code. Check "man gcc". You need to make
sure that -fPIC is passed alongside -mdynamic-no-pic if you want to
build relocatable code with -mdynamic-no-pic applied globally.
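To illustrate the -fPIC point, here is a minimal sketch in shell. The
helper add_fpic is hypothetical (it is not part of Octave's build
system); it just appends -fPIC when the flags contain -fast or
-mdynamic-no-pic and -fPIC is not already there. The commented
configure line assumes that configure honors CFLAGS and CXXFLAGS given
on its command line, as autoconf-generated scripts normally do.

```shell
# Sketch only: add_fpic is a hypothetical helper, not an Octave or GCC tool.
# It appends -fPIC when the flags ask for non-relocatable code.
add_fpic() {
  flags=$1
  case " $flags " in
    *" -fPIC "*) ;;                                  # already relocatable
    *" -mdynamic-no-pic "*|*" -fast "*) flags="$flags -fPIC" ;;
  esac
  printf '%s\n' "$flags"
}

CFLAGS=$(add_fpic '-fast -mdynamic-no-pic')
CXXFLAGS=$(add_fpic '-fast -mdynamic-no-pic')
echo "CFLAGS=$CFLAGS"

# Then, inside the Octave source tree (not run here):
#   ./configure --enable-dl --enable-shared CFLAGS="$CFLAGS" CXXFLAGS="$CXXFLAGS"
```

Flags that request neither -fast nor -mdynamic-no-pic pass through
unchanged, so the helper is safe to apply to any flag set.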
Finally, my advice would be to start by dropping -fast -mdynamic-no-pic
and just add the -mcpu option specifying _your particular_ cpu. If
that turns out well, analyze the code and incrementally add flags that
you have reason to believe will improve performance, building between
each increment until you have obtained a satisfying speedup.
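The incremental approach could look roughly like the sketch below.
Everything here is an assumption for illustration: build_and_time is a
hypothetical placeholder that you would replace with your real build
and benchmark, the base flags use -mcpu=G5 as an example value, and the
candidate list is just three of the options that -fast implies.

```shell
# Sketch only: build_and_time is a hypothetical placeholder. Replace its
# body with your real build plus a timing run of a benchmark you trust.
build_and_time() {
  echo "would build with CFLAGS='$1'"
}

base='-O2 -mcpu=G5'          # assumption: use the -mcpu value for your own CPU
candidates='-funroll-loops -fstrict-aliasing -ffast-math'

flags=$base
build_and_time "$flags"      # baseline without any extra flags
for f in $candidates; do
  flags="$flags $f"          # add one flag per step
  build_and_time "$flags"    # rebuild and measure before adding the next
done
```

If a step makes things slower or breaks the test suite, drop that flag
from the list and carry on with the rest.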
If you are really serious, use Shark (from Apple) to analyze the code
for things like pipeline stalls etc.
Sorry if I'm sounding negative, but it is my experience that applying
something as aggressive[1] as the -fast option will not work well on
something as complex as octave. As I understand it, -fast was added to
give good SPEC marks, and has surely seen little testing with code
other than the SPEC code.
HTH,
Per
PS. For interested parties I'm pasting a summary of what -fast implies
below:
================================
-fast changes the overall optimization strategy of GCC 3.3 in order to
produce the fastest possible running code for G4 and G5 architectures.
Optimizations under -fast are roughly grouped under the following
categories.
1. -fast sets the optimization level to -O3, the highest level of
optimization supported by GCC 3.3. If any other optimization level
(-O0, -O1, -O2 or -Os) is specified, it is ignored by the compiler.
2. Alignment. Assume alignments for loops, functions, branches and
structure data fields that provide fastest performance on the PowerPC.
-fast sets the following alignment-specific options:
    -falign-loops-max-skip=15
    -falign-jumps-max-skip=15
    -falign-loops=16
    -falign-jumps=16
    -falign-functions=16
    -malign-natural
3. -fast enables the -ffast-math option, which allows certain unsafe
math operations for performance gains.
4. Strict aliasing rules. -fast allows the compiler to assume the
strictest aliasing rules applicable to the language being compiled.
For C and C++, this activates optimizations based on the type of
expressions: an object of one type is assumed never to reside at the
same address as an object of a different type, unless the types are
almost the same. Furthermore, struct field references are assumed not
to alias each other as long as their direct and indirect enclosing
structure types are distinct. -fast enables the following aliasing
options:
    -fstrict-aliasing
    -frelax-aliasing
    -fgcse-mem-alias
Warning: the behavior of correct programs will not be affected by
strict aliasing, but programs that make use of nonportable type
conversions may behave in unexpected ways.
5. -fast enables various performance-related code transformations.
These include loop unrolling, transposing nested loops to improve
locality of array element access, conversion of certain initialization
loops to memset calls, and inline expansion of calls to library
functions such as floor. -fast enables the following code
transformation options:
    -funroll-loops
    -floop-transpose
    -floop-to-memset
    -finline-floor (G5 only)
Some of these transformations increase code size.
6. G5 specific instruction generation. With -fast (unless -mcpu=G4 is
specified), GCC 3.3 generates instructions which are specific to G5
and result in performance gain for G5. The following options are
assumed for G5 under -fast:
    -mcpu=G5
    -mpowerpc64
    -mpowerpc-gpopt
7. Scheduling changes. -fast option allows inter-block scheduling, and
scheduling specific to the G5 architecture. One such scheduling change
is load after a store that partially loads what was stored. The
following scheduling-related options are enabled by -fast:
    -mtune=G5 (unless -mtune=G4 is specified)
    -fsched-interblock
    -fload-after-store
    --param max-gcse-passes=3
    -fno-gcse-sm
    -fgcse-loop-depth
8. -fast enables intermodule inlining when all source files are placed
on the same command line. The following options are set by -fast and
affect such inlining:
    -funit-at-a-time
    -fcallgraph-inlining
    -fdisable-typechecking-for-spec
9. -fast sets -mdynamic-no-pic by default. This allows for generation
of non-relocatable code and is not suitable for shared libraries. This
option may be overridden by -fPIC.
Users of -fast should be aware of the following caveats:
• Because -fast enables highly aggressive optimizations, some of
which may have an effect on code size or on program behavior, thorough
testing is especially important before deploying applications compiled
with -fast.
• For maximum run-time performance you should experiment with a
variety of optimization options; no one set of flags is best for all
applications.
• In future releases of GCC, -fast may enable a different set of
optimization options. The intention behind this option is that -fast
will enable optimizations that result in the fastest code for most
applications.